In the middle of the desert you can say anything you want
t id!=123, works with everything.
For unicode strings, do "unicode string".encode('utf-8')
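A minimal sketch of the encode step, assuming Python 3 (where every str is already a unicode string); the sample text is made up:

```python
# Python 3: str is unicode, and encode() turns it into UTF-8 bytes.
text = "Grüße, 世界"
encoded = text.encode("utf-8")

# decode() reverses the step and gives back the original string.
decoded = encoded.decode("utf-8")

print(type(encoded).__name__)
print(len(text), len(encoded))
```

Non-ASCII characters take more than one byte each, so the byte length is larger than the character count.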
I looked again at the confusion matrix, after having made a copy. It’s quite interesting:
array([[29, 14, 28, 26], [38, 57, 36, 27], [52, 18, 58, 28], [18, 14, 18, 39]])
This is a simple SVM, using extremely simple features, and 2000 examples per class. The columns/rows are: ar, jp, lib, it, in that order. My first error is that Arabic and countries which are around Libya are quite similar in my world, linguistically, and we can see that they are confused quite often, in both directions. Italy and Japan do much better.
Still, I find this very promising, and definitely better than chance. And logically it makes sense. I’ll continue.
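The "better than chance" reading can be checked directly from the matrix above. A small sketch in plain Python, assuming that rows are the true classes and columns the predictions (scikit-learn's convention; the notes don't say which way round it is):

```python
# Confusion matrix copied from the notes; labels in the order given there.
labels = ["ar", "jp", "lib", "it"]
matrix = [
    [29, 14, 28, 26],
    [38, 57, 36, 27],
    [52, 18, 58, 28],
    [18, 14, 18, 39],
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / total

# Assumption: rows are the true classes, so diagonal / row sum
# is the recall of each class.
recall = {
    label: matrix[i][i] / sum(matrix[i])
    for i, label in enumerate(labels)
}

# Uniform guessing over four classes.
chance = 1 / len(labels)

print(f"accuracy {accuracy:.3f} vs. chance {chance:.3f}")
for label, value in recall.items():
    print(f"{label}: {value:.3f}")
```

If the matrix is actually transposed, the per-class numbers are precision instead of recall; the overall accuracy stays the same either way.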
The list. I’ll stick to Japan, UK, SA, Brazil, India – quite far from each other, geographically and linguistically. I leave the US alone, too mixed.
This is the picker. DublinCore format is in the identical order as Twitter wants!
d[d.co.isin(['uk','in'])]
leaves the rows where co=='uk' or co=='in'.
For multiple conditions, df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]
TODO: Why is .loc used here?
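A small sketch of both filters on a made-up frame (only the column name co comes from the notes; the column n and all values are hypothetical). On the TODO: as far as I know, .loc is not strictly needed for a plain boolean mask, since d[mask] selects the same rows; it becomes useful when picking columns or assigning in the same step.

```python
import pandas as pd

# Hypothetical toy frame; only the column name `co` comes from the notes.
d = pd.DataFrame({
    "co": ["uk", "in", "jp", "br", "uk"],
    "n": [3, 7, 5, 1, 9],
})

# Keep the rows whose country code is in the list.
subset = d[d.co.isin(["uk", "in"])]

# Several conditions: wrap each in parentheses and combine with & or |.
in_range = d.loc[(d["n"] >= 3) & (d["n"] <= 7)]

# For plain row filtering, d[mask] and d.loc[mask] select the same rows.
same = d[(d["n"] >= 3) & (d["n"] <= 7)].equals(in_range)

# .loc can additionally pick a column in the same step.
codes = d.loc[d["n"] > 4, "co"]

print(subset)
print(in_range)
print(same)
print(codes.tolist())
```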
Has a config file! This opened a new universe for me too.
The key needs to be added from the panel; adding it to the user folder as usual does not work.
Wann vs wenn: Wann has nothing to do with “if”; it’s a question word asking for a point in time. Wenn is closer to “if”, but it’s also a translation for “when”.
If we can say “at what point in time” instead of “when”, then we need to use wann.
Wann [=at what time/when] kommt der Bus?
Bis wann musst du arbeiten?
Thomas fragt Maria, wann genau sie nach Hause kommt.
On the other hand:
Ich gehe nach Hause, wenn [!= at what time! just the “when” closer to “if”] ich fertig bin.
A wann-clause ALWAYS functions as the object of the verb. If I can replace the clause with a thing, then it’s wann.
A wenn-clause answers the question “at what time?”, so we can basically replace the whole clause with “at 3 am”.
When I have finished work, I will call you and tell you when I will be at home.
When I have finished work, I will call you and tell you at what point in time I will be at home.
Wenn ich mit der Arbeit fertig bin, rufe ich dich an und sage dir, wann ich zuhause bin.
At 3 I’ll call you and tell you this thing.
$ git reset --soft HEAD~1
undoes the last commit, leaving all its changes on disk (and staged), but uncommitted.
$ git reset --hard 0ad5a7a6
returns to the given earlier commit, discarding any uncommitted changes.
Here, and it’s excellent. I should actually learn git in a normal systematic way. Additionally, what to do when your .gitignore is ignored by git@SO.
Busy person patterns, as linked on HN.
Testosterone seems to have different effects than the stereotypes say, and road/roid rage is actually caused by estrogen spikes.
This eggs inside avocado recipe is very interesting. Will try tomorrow. Also this avocado hummus recipe.
d4b 33% Sun 07 Apr 2019 04:24:36 PM CEST
d4b 33% Sun 07 Apr 2019 04:26:35 PM CEST
d4b 56% Sun 07 Apr 2019 04:28:28 PM CEST
d4b 61% Sun 07 Apr 2019 04:30:24 PM CEST
d4b 28% Sun 07 Apr 2019 04:32:21 PM CEST
d4b 44% Sun 07 Apr 2019 04:34:27 PM CEST
d4b 22% Sun 07 Apr 2019 04:36:19 PM CEST
d4b 39% Sun 07 Apr 2019 04:38:14 PM CEST
“Wherever you are, make sure you’re there.” — Dan Sullivan
nltk.download()
opens the NLTK downloader, from which everything needed can be fetched.
nltk.word_tokenize('aoethnsu')
returns the tokens. From this article (https://medium.com/@gianpaul.r/tokenization-and-parts-of-speech-pos-tagging-in-pythons-nltk-library-2d30f70af13b). For parts of speech it’s nltk.pos_tag(tokens).
The tokenizer for twitter works better for URLs (of course). Interestingly it sees URLs as NN. And - this is actually fascinating - smileys get tokenized differently!
('morning', 'NN'),
('✋', 'NN'),
('🏻', 'NNP'),
EDIT: nltk.tokenize.casual might be just like the above, but better!
EDIT: I have a column with the POS of the tweets! How do I classify it with its varying length? How can I use the particular emojis as another feature?
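One possible answer to the varying-length question, as a sketch and not something I have tested on the real data: collapse each tweet's tag sequence into relative tag frequencies over a fixed tag vocabulary, so every tweet gets a vector of the same length. The tagged tweets below are made up; they only mimic the (token, tag) shape that nltk.pos_tag returns.

```python
from collections import Counter

# Hypothetical tagged tweets in the (token, tag) shape nltk.pos_tag returns.
tagged_tweets = [
    [("good", "JJ"), ("morning", "NN"), ("✋", "NN")],
    [("coffee", "NN"), ("now", "RB")],
]

def tag_counts(tagged, vocabulary):
    """Return relative tag frequencies in a fixed vocabulary order."""
    counts = Counter(tag for _, tag in tagged)
    total = sum(counts.values()) or 1
    return [counts[tag] / total for tag in vocabulary]

# Fixed, sorted tag vocabulary so that every vector has the same length.
vocabulary = sorted({tag for tweet in tagged_tweets for _, tag in tweet})
vectors = [tag_counts(tweet, vocabulary) for tweet in tagged_tweets]

for vector in vectors:
    print(dict(zip(vocabulary, vector)))
```

The same Counter approach over the tokens themselves (restricted to emojis) would give one extra column per emoji. This throws away word order, which may or may not matter here.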
POS + individual smileys might be enough for it to generalize! TODO test
TODO: Maybe first do some much more basic feature engineering with capitalization and other features mentioned here:
Word Count of the documents – total number of words in the documents
Character Count of the documents – total number of characters in the documents
Average Word Density of the documents – average length of the words used in the documents
Punctuation Count in the Complete Essay – total number of punctuation marks in the documents
Upper Case Count in the Complete Essay – total number of upper case words in the documents
Title Word Count in the Complete Essay – total number of proper case (title) words in the documents
Frequency distribution of Part of Speech Tags: Noun Count, Verb Count, Adjective Count, Adverb Count, Pronoun Count
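The surface features from this list can be sketched in plain Python. This is my own minimal version, not the code from the linked article; the function name, the dictionary keys and the sample sentence are made up, and the part-of-speech counts are left out because they need a tagger.

```python
import string

def text_features(text):
    """Compute a few surface features of one document."""
    # Whitespace split: attached punctuation stays part of the word.
    words = text.split()
    word_count = len(words)
    return {
        "word_count": word_count,
        "char_count": len(text),
        # Mean word length; 0.0 for an empty document.
        "avg_word_length": (
            sum(len(w) for w in words) / word_count if word_count else 0.0
        ),
        "punctuation_count": sum(ch in string.punctuation for ch in text),
        "upper_case_word_count": sum(w.isupper() for w in words),
        "title_word_count": sum(w.istitle() for w in words),
    }

features = text_features("Good Morning from LONDON, see you soon!")
print(features)
```

Note that string.punctuation only covers ASCII punctuation, so emojis and typographic quotes are not counted by it.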
textminingonline.com has nice resources on the topic which would be very interesting to skim through! Additionally, flair is a very interesting library for not reinventing the wheel, even though reinventing the wheel would be the entire point of a bachelor’s thesis.
This could work as a general high-level intro into NLP? Also this.
Edit .i3/ to create the multiple scratchpads at startup and put them automatically where I want them – second answer is a good example.
450 cpm 97%
d4b 72% Fri 05 Apr 2019 07:03:22 PM CEST
d4b 50% Fri 05 Apr 2019 07:05:21 PM CEST
d4b 39% Fri 05 Apr 2019 07:07:23 PM CEST
d4b 44% Fri 05 Apr 2019 07:09:19 PM CEST
d4b 33% Fri 05 Apr 2019 07:11:17 PM CEST
d3b 79% Fri 05 Apr 2019 07:13:08 PM CEST !
d3b 71% Fri 05 Apr 2019 07:14:44 PM CEST !
d3b 86% Fri 05 Apr 2019 07:16:21 PM CEST !
d4b 44% Fri 05 Apr 2019 07:18:17 PM CEST
d4b 22% Fri 05 Apr 2019 07:20:13 PM CEST
d4b 28% Fri 05 Apr 2019 07:22:41 PM CEST
d4b 00% Fri 05 Apr 2019 07:24:46 PM CEST
I just discovered didoesdigital.com, which is absolutely excellent on all levels. I’m missing a way to categorize everything I see there.
I should/could make things-I’m-learning pages with links and checklists for things I’m doing/learning. I’m not quite sure what it should look like, but it would definitely be something Jekyll-like. I think I’m slowly going in the direction of Stephen Wolfram’s dashboard. Or at least a different vim in a different floating window that opens with another keystroke; i3 would make it easy to do that. In general I need a much better system to track the things I’m learning or reading. Polarized goes in the right direction. And I feel my links wiki will stay just that – a links wiki. Unless I make a seamless interface to it, I don’t really like it for actual knowledge management, even though it’s the absolute best I have until now.
And I must not fall into my typical error of sharpening the saw more than actually cutting trees, even though sharpening the saw is a really pleasant thing to do for me.
EDIT: Just created it here, we’ll see what happens. I can imagine a dashboard based on it, and some kind of integration for task/timewarrior. Probably something ncurses-based in Python?
This is the application - in general I find the idea really inspiring. I could imagine it on a touchscreen somewhere, or at least on a second desktop. Is it conceptually different from Nomie? Can I just add another “trickle” board?
Added ./commit.sh at the end; it is a small file with git commit, so now it gets backed up to GitHub automatically every time I deploy a new version on the server.
d4b 44% Sun 31 Mar 2019 11:42:18 AM CEST
d4b 50% Sun 31 Mar 2019 11:44:21 AM CEST
d4b 17% Sun 31 Mar 2019 11:46:18 AM CEST
d4b 6% Sun 31 Mar 2019 11:48:20 AM CEST
d4b 39% Sun 31 Mar 2019 11:50:20 AM CEST
d4b 17% Sun 31 Mar 2019 11:52:47 AM CEST
d4b 17% Sun 31 Mar 2019 11:54:49 AM CEST
d4b 67% Sun 31 Mar 2019 11:56:52 AM CEST
d4b 56% Sun 31 Mar 2019 11:59:03 AM CEST
d4b 39% Sun 31 Mar 2019 12:01:05 PM CEST
d4b 6% Sun 31 Mar 2019 12:03:29 PM CEST
d4b 44% Sun 31 Mar 2019 12:05:30 PM CEST
d4b 39% Sun 31 Mar 2019 02:52:21 PM CEST
d4b 50% Sun 31 Mar 2019 02:54:35 PM CEST
d4b 44% Sun 31 Mar 2019 02:56:44 PM CEST
d4b 44% Sun 31 Mar 2019 02:58:43 PM CEST
d4b 44% Sun 31 Mar 2019 03:00:46 PM CEST
d4b 39% Sun 31 Mar 2019 03:03:16 PM CEST
d4b 44% Sun 31 Mar 2019 03:05:19 PM CEST
d4b 39% Sun 31 Mar 2019 03:07:16 PM CEST
Tasks tagged +next are now underlined.
date -s 13:17:50 also works. It’s simpler than I remembered.
Removed the border around all windows; we’ll see how I live with it and whether I need it. In work mode it might get confused with similar windows, in play mode it shouldn’t matter. We’ll see.
d4b 33% Tue 26 Mar 2019 01:36:16 PM CET
d4b 50% Tue 26 Mar 2019 01:38:22 PM CET
d4b 50% Tue 26 Mar 2019 01:40:42 PM CET
d4b 17% Tue 26 Mar 2019 01:42:47 PM CET
d4b 61% Tue 26 Mar 2019 01:44:48 PM CET
d4b 50% Tue 26 Mar 2019 01:48:32 PM CET
d4b 28% Tue 26 Mar 2019 01:50:32 PM CET
d4b 50% Tue 26 Mar 2019 01:52:31 PM CET
d4b 22% Tue 26 Mar 2019 01:54:36 PM CET
d4b 00% Tue 26 Mar 2019 01:57:40 PM CET
d4b 50% Tue 26 Mar 2019 02:02:24 PM CET
d4b 00% Tue 26 Mar 2019 02:04:32 PM CET
455 cpm 98.3%
Anki’s manual says a lot about importing raw cards – and it’s much easier and more flexible to do this than I thought. I might drop anki-vim completely, or write something more minimalistic.
Decided to take a look again at my Bachelor’s thesis and do a nice rewrite in Python3 of the main code.
The date command can take STRINGS, which as mentioned in the man pages can be quite free-form. I moved my system clock back 1h with sudo date -s "1 hour ago". Wow.
For the first time got 100% on D3B! And in general even though the results aren’t the most important thing in D3B they do actually motivate quite a lot. Keeping records and gamification for the win!
d3b 64% Mon 25 Mar 2019 11:43:46 AM CET
d3b 100% Mon 25 Mar 2019 11:45:39 AM CET
d4b 39% Mon 25 Mar 2019 11:48:12 AM CET
d4b 33% Mon 25 Mar 2019 11:52:23 AM CET
d4b 44% Mon 25 Mar 2019 11:55:07 AM CET
d4b 50% Mon 25 Mar 2019 11:58:35 AM CET
d4b 50% Mon 25 Mar 2019 12:00:39 PM CET
keyring is a Python module to save secrets.
python -m keyring [get/set] for help.
To be able to change the backlight:
sudo gpasswd -a sh video
clight -b radeon_bl0 --day-temp=6000 --night-temp=2000
would be nice, but sadly my webcam is covered. But it might be a nice replacement for redshift, sometime.
hide_edge_borders both #<none|vertical|horizontal|both>
This tutorial and extension could separate about 30% of the pictures with the default settings. Margins (and margins to the sides of the image!) are important.
is done by putting the .scm file into /usr/share/gimp/2.0/scripts/
This tutorial is freaking awesome.
Given the number of images I was dealing with manually configuring each one was not an option. What I wanted was a service that would, given my image collection, just print me a photo album of approx 6x4 images, in chronological order, two per page, with a caption below each detailing the image file name and the date taken.
It provides a .tex album file and a Python2 file which reads the Exif data and creates a photos.tex which gets included in the main album file.
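To get a feeling for what such a generator does, here is a rough Python 3 sketch of the photos.tex step. It is not the tutorial's script: the photo list is hard-coded instead of read from Exif data, and the function names, the 6in width and the center/newpage layout are my own assumptions.

```python
from datetime import datetime

# Hypothetical input: (file name, date taken) pairs. In the tutorial the
# dates come from Exif data; here they are hard-coded for illustration.
photos = [
    ("beach.jpg", datetime(2018, 7, 3, 14, 5)),
    ("arrival.jpg", datetime(2018, 7, 1, 9, 30)),
    ("dinner.jpg", datetime(2018, 7, 1, 20, 15)),
]

def photo_block(name, taken):
    """LaTeX for one image with its file name and date as a caption."""
    # Underscores must be escaped in the caption text, not in the path.
    safe_name = name.replace("_", r"\_")
    caption = safe_name + " (" + taken.strftime("%Y-%m-%d %H:%M") + ")"
    return "\n".join([
        r"\begin{center}",
        r"\includegraphics[width=6in]{" + name + r"}\\",
        caption,
        r"\end{center}",
    ])

def build_album(items, per_page=2):
    """Sort chronologically; break the page after every per_page images."""
    ordered = sorted(items, key=lambda item: item[1])
    parts = []
    for index, (name, taken) in enumerate(ordered, start=1):
        parts.append(photo_block(name, taken))
        if index % per_page == 0 and index != len(ordered):
            parts.append(r"\newpage")
    return "\n\n".join(parts) + "\n"

album = build_album(photos)
print(album)
```

Writing the returned string to photos.tex would give a file that a main album document could pull in with \input, assuming graphicx is loaded there.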
scanimage (SANE) is “a library and a command-line tool to use scanners”.
sudo scanimage -L
to see the list of scanners, then to scan (for me also with sudo for some reason):
sudo scanimage --device "xerox_mfp:libusb:002:004" --format=png > name.png