In the middle of the desert you can say anything you want
This module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes.
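A minimal sketch of what that quote describes, using the standard-library dataclasses module. The class name Experiment and its fields are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    # The annotated fields drive the generated __init__, __repr__ and __eq__.
    name: str
    epochs: int = 3
    tags: list = field(default_factory=list)  # mutable defaults need a factory


run = Experiment("bert-ner", epochs=5)
print(run)
```

No __init__ or __repr__ was written by hand here; the decorator generates both from the field annotations.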
“Token classification” includes but is not limited to NER: Hugging Face – The AI community building the future. A really nice, more correct phrase that I’ll be using!
Installing (after tensorflow and/or pytorch):
pip install transformers
Caches by default in user folder but can be overridden:
export HF_HOME="/data/sh/experiments/bert/cache"
The “hosted inference API” on the website is really cool! dslim/bert-base-NER · Hugging Face
Example of converting conll dataset to what BERT expects: Fine Tuning BERT for NER on CoNLL 2003 dataset with TF 2.0 | by Bhuvana Kundumani | Analytics Vidhya | Medium
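As a rough idea of the first step of such a conversion (this is not the linked article's code), here is a sketch that groups CoNLL-2003-style lines into per-sentence token and tag lists. It assumes whitespace-separated columns with the word first and the NER tag last; aligning tags to BERT's subword tokens is a separate step not shown here. The function name read_conll is hypothetical.

```python
def read_conll(lines):
    """Group CoNLL-style lines into (tokens, tags) pairs, one per sentence."""
    sentences, tokens, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-DOCSTART-"):
            # Blank lines (and document markers) end the current sentence.
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        columns = line.split()
        tokens.append(columns[0])   # first column: the word
        tags.append(columns[-1])    # last column: the NER tag
    if tokens:
        sentences.append((tokens, tags))
    return sentences


sample = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
""".splitlines()

for sentence_tokens, sentence_tags in read_conll(sample):
    print(sentence_tokens, sentence_tags)
```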
The BERT model documentation shows the tokenizers etc etc etc. - BERT — transformers 4.5.0.dev0 documentation
Training and fine-tuning — transformers 4.5.0.dev0 documentation - same model can be trained/imported from TF to pytorch and back! Wow!
Documentation of a sample model: transformers/examples/research_projects/distillation at master · huggingface/transformers
Another example of fine-tuning BERT in Pytorch for NER: transformers/examples/pytorch/token-classification at master · huggingface/transformers
transformers installed from source (git/master): https://huggingface.co/transformers/installation.html#installing-from-source / pip install git+https://github.com/huggingface/transformers
The run below writes everything to /tmp/test-ner/: checkpoints, eval data. Wow.
CUDA_VISIBLE_DEVICES=1 python run_ner.py --model_name_or_path bert-base-uncased --dataset_name conll2003 --output_dir /tmp/test-ner --do_train --do_eval
Here datasets is imported: transformers/requirements.txt at master · huggingface/transformers
TODO - what is this and where can I learn more? Is this HF specific? What else is there?
It has a really nice interface for searching datasets! Filter by task, language, etc.
German NER datasets: Hugging Face – The AI community building the future.
Some German NER models, sometimes based on bert: Hugging Face – The AI community building the future.
Converting Tensorflow Checkpoints — transformers 4.5.0.dev0 documentation
Is this real?
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
transformers-cli convert --model_type bert \
--tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
--config $BERT_BASE_DIR/bert_config.json \
--pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin
Tatar von geräuchertem Forellenfilet mit Avocado - Annemarie Wildeisens KOCHEN
Cut the trout fillets into small cubes. Peel the shallot and chop it very finely. Cut each cherry tomato into 6 or 8 pieces. Put all these ingredients in a small bowl and mix carefully with the mayonnaise.
Trout + tomatoes + mayonnaise is literally the only recipe I’ve liked with mayonnaise in it
To redirect an issue to the old view, add ?oldIssueView=true.
Added this to config.py:
config.bind('<Ctrl-J>', ':open {url}?oldIssueView=true')
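The binding above appends ?oldIssueView=true verbatim, which would give a malformed URL if the page URL already carries a query string. A small sketch of a more careful version using urllib.parse, for example inside a userscript. The function name with_old_issue_view and the tracker.example.com URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_old_issue_view(url):
    """Return url with oldIssueView=true set, keeping any existing query."""
    parts = urlsplit(url)
    # Keep existing parameters, but drop a previous oldIssueView if present.
    pairs = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "oldIssueView"]
    pairs.append(("oldIssueView", "true"))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


plain = with_old_issue_view("https://tracker.example.com/browse/ABC-1")
with_query = with_old_issue_view("https://tracker.example.com/browse/ABC-1?focusedId=7")
print(plain)
print(with_query)
```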
(18:03:38/10185) sudo apt install screen
# ...
Suggested packages:
byobu | screenie | iselect
The following NEW packages will be installed:
… did I just get an advert for a competitor when installing screen? :) Since when does Ubuntu do this, and where can I read more about it?
“Meetingtourismus oder Papiergenerieren?” (“Meeting tourism or generating paper?”, heard at work)
It seems to run userscripts with the standard system Python rather than in the virtualenv qutebrowser uses: installing packages in the virtualenv didn’t work, but installing them globally did.
Moving/renaming a file/directory is easy: dvc move from to. It automatically updates the from.dvc files. Then .gitignore and the .dvc file have to be added and committed through git as usual.
This is interesting: Data Organization — documentation
In general: Best Practices for Scientific Data Management — documentation
This guide describes Axiom Data Science’s best practices for scientific data management. The intent of these practices is to improve the accessibility and usability of your data. These practices may be followed at any time during the preparation of your dataset, but are most useful when considered at the onset of project planning and implemented during data collection.
Also related: Organising your data | Research Data Management
tree -d does it.
Root of repo: git rev-parse --show-toplevel
--git-dir returns the location of the .git folder, --show-toplevel returns the absolute location of the git root, and --show-prefix returns the path of the current directory relative to that root.
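For scripts that cannot shell out to git, a rough approximation of --show-toplevel is to walk up the directory tree until a .git entry appears. This is only a sketch: it ignores GIT_DIR, bare repositories and other cases real git handles, and find_repo_root is a hypothetical name.

```python
import tempfile
from pathlib import Path


def find_repo_root(start):
    """Return the nearest ancestor of start that contains a .git entry, or None."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        # .git is a directory in a normal clone and a file in worktrees/submodules.
        if (candidate / ".git").exists():
            return candidate
    return None


# Demo on a throwaway directory tree that only imitates a repository layout.
with tempfile.TemporaryDirectory() as tmp:
    expected = Path(tmp).resolve()
    (expected / ".git").mkdir()
    nested = expected / "src" / "pkg"
    nested.mkdir(parents=True)
    found = find_repo_root(nested)
    print(found == expected)
```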
I’ll memorize the g/... syntax someday.
:g!/pattern/d
I can just look for the pattern as usual with /pattern and tweak it live, then do
:g!//d
and it will take the last used pattern.
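For reference, :g!/pattern/d deletes every line that does not match, so only matching lines survive. The same filter as a Python sketch; note it uses Python's regex syntax, not Vim's, and the name keep_matching and the sample log lines are made up.

```python
import re


def keep_matching(lines, pattern):
    """Mimic :g!/pattern/d by dropping every line that does not match pattern."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


log = ["INFO start", "ERROR disk full", "INFO done", "ERROR timeout"]
errors = keep_matching(log, r"^ERROR")
print(errors)
```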
I should try doing something more interesting with the passata di pomodoro!
Options:
In general all seem to require both tomato puree and chopped tomatoes; and olive oil + garlic + oregano/basil + (brown) sugar seems to cover 90% of cases.
die Kaffeesatzleserei - reading coffee grounds, i.e. baseless speculation, like reading tea leaves (heard at work)
I shouldn’t forget that screen -R screenname can be replaced by screen -R s if it’s the only screen with such a name. Not sure if better or worse than tab completion, likely worse because it’s surprising, but quite nice to use.
i3-msg exit does the magic.
ipset -N myset nethash # create myset
ipset add myset 27.8.0.0/13
iptables -I INPUT -m set --match-set myset src -j DROP # create temporary iptables thing
# making it persistent
ipset save > /etc/ipset.conf
# then enable ipset services
# Listing stuff
ipset -L
# Deleting set
ipset destroy myset
If you can’t destroy an ipset set because it’s being used by kernel:
iptables -L --line-numbers
returns this:
Chain INPUT (policy DROP)
num target prot opt source destination
1 DROP all -- anywhere anywhere match-set myset src
...
Then to delete number 1:
iptables -D INPUT 1
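To sanity-check which addresses a set entry like 27.8.0.0/13 actually covers before dropping traffic, the standard-library ipaddress module is enough. This sketch only mirrors the membership test and does not touch ipset or iptables; is_blocked is a hypothetical helper.

```python
import ipaddress

# Same network as added to myset above.
blocked = [ipaddress.ip_network("27.8.0.0/13")]


def is_blocked(address):
    """Return True if address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in blocked)


print(blocked[0][0], blocked[0][-1])   # first and last address of the range
print(is_blocked("27.15.255.255"))
print(is_blocked("27.16.0.0"))
```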
GitHub - mkorthof/ipset-country: Block countries using iptables + ipset + ipdeny.com can do both a whitelist and a blacklist.
Article with a very interesting graph: Becoming a Data Scientist - Curriculum via Metromap – Pragmatic Perspectives
der Tonus - heard at work in context of
Option to return objects as a list of objects (separated by a comma) · Issue #124 · stedolan/jq:
TL;DR use jq "[foo]" instead of jq "foo".
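The reason the brackets matter: without them jq prints a stream of separate JSON values, which is not one valid JSON document, while the bracketed form yields a single array. A sketch of the difference from the consumer's side; the filters in the comments and the sample values are hypothetical, and parse_json_stream is a made-up helper.

```python
import json


def parse_json_stream(text):
    """Parse whitespace-separated JSON values, like jq prints without [...]."""
    decoder = json.JSONDecoder()
    values, pos = [], 0
    while pos < len(text):
        # Skip whitespace between values; raw_decode does not accept it.
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


stream_output = '"alice"\n"bob"\n'      # shape of output from: jq '.[].name'
array_output = '["alice", "bob"]\n'     # shape of output from: jq '[.[].name]'

from_stream = parse_json_stream(stream_output)
from_array = json.loads(array_output)   # one json.loads call is enough here
print(from_stream, from_array)
```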
yunohost app info -f appname
returns A LOT of info about the app, including installation paths.
… can be located in ~/.config/qutebrowser/userscripts, not just in ~/.local ..! When I tried to run one that it couldn’t find, it helpfully output all the paths it searches, which is great and I’ll steal this. If a file is not found, the user will probably need exactly this list, especially if there are many paths.
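A sketch of the pattern worth stealing: on failure, list every location that was searched. This is not qutebrowser's code; find_script and the directories in the demo are hypothetical.

```python
from pathlib import Path


def find_script(name, search_dirs):
    """Return the first existing search_dirs/name, else fail listing every path tried."""
    tried = [Path(directory).expanduser() / name for directory in search_dirs]
    for candidate in tried:
        if candidate.is_file():
            return candidate
    listing = "\n".join(f"  {path}" for path in tried)
    raise FileNotFoundError(f"{name!r} not found. Looked in:\n{listing}")


try:
    find_script("no_such_script.py", ["/nonexistent/a", "/nonexistent/b"])
except FileNotFoundError as err:
    message = str(err)
    print(message)
```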
One of the cooler solutions I’ve seen: Managing dotfiles with GNU stow - Alex Pearce (there seems to be a canonical page [1] that I found first, but I like the other one more).
TL;DR: create a directory for the dotfiles, with each folder containing dotfiles mirroring the usual dotfiles’ locations in the system; then from inside the main dotfiles directory run stow vim bash whatever and it’ll magically put them in the right place in the home directory.
This works because
Stow assumes that the contents of the package you specify should live one directory above where the stow command is run, so having our .dotfiles directory at ~/.dotfiles means using stow to manage our dotfiles just works. [2]
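To make the mirroring idea concrete, here is a simplified imitation of what stow does for one package. It is a sketch under assumptions: it creates one absolute symlink per file, whereas real GNU Stow uses relative links and can link whole directories. The function name stow and the temporary "home" directory are for illustration only.

```python
import os
import tempfile
from pathlib import Path


def stow(package_dir, target_dir):
    """Symlink every file under package_dir to the same relative path in target_dir."""
    package_dir, target_dir = Path(package_dir), Path(target_dir)
    linked = []
    for source in sorted(package_dir.rglob("*")):
        if source.is_dir():
            continue
        link = target_dir / source.relative_to(package_dir)
        if link.exists() or link.is_symlink():
            continue                      # never overwrite something already there
        link.parent.mkdir(parents=True, exist_ok=True)
        link.symlink_to(source)
        linked.append(link)
    return linked


with tempfile.TemporaryDirectory() as home:
    home = Path(home).resolve()
    vimrc = home / ".dotfiles" / "vim" / ".vimrc"
    vimrc.parent.mkdir(parents=True)
    vimrc.write_text('" my vim config\n')
    # Similar in spirit to running `stow vim` from inside ~/.dotfiles.
    stow(home / ".dotfiles" / "vim", home)
    link_is_symlink = (home / ".vimrc").is_symlink()
    link_target = os.readlink(home / ".vimrc")
    linked_content = (home / ".vimrc").read_text()
    print(link_target)
```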
This is awesome because:
The same article’s [2] sample GitHub repo: dotfiles/neovim at master · alexpearce/dotfiles
The stow linked github repo’s dotfiles are actually fascinating: alexpearce/dotfiles: My dotfiles.
dotfiles/.gitconfig at master · alexpearce/dotfiles:
# Clone git repos with URLs like "gh:alexpearce/dotfiles"
[url "https://github.com/"]
insteadOf = "gh:"
[url "git@github.com:"]
pushInsteadOf = "gh:"
# Clone CERN GitLab repos with URLs like "gl:lhcb/Hlt"
[url "ssh://git@gitlab.cern.ch:7999/"]
insteadOf = "gl:"
Applying the above to my own configs in ~/.gitconfig.
Assuming the ssh port is 1234, ~/.gitconfig is like
[url "ssh://git@myserver:1234/"]
    insteadOf = "gh:"
and then in the per-repo settings something similar to
[remote "bitbucket"]
    url = gh:myusername/myproject.git
Cloning it is now easy:
git clone gh:myusername/myproject
Neat!
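What git does with these rules is plain prefix substitution: a URL starting with the insteadOf value has that prefix replaced by the [url "..."] base, and the longest matching prefix wins when several apply. A sketch of that rule, using the values from the config above; rewrite_url is a hypothetical name, not a git API.

```python
def rewrite_url(url, rules):
    """Replace the longest matching insteadOf prefix of url with its base."""
    best_prefix, best_base = "", None
    for base, prefix in rules:
        if url.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_base = prefix, base
    if best_base is None:
        return url                        # no rule applies, URL is used as-is
    return best_base + url[len(best_prefix):]


rules = [
    # (base from [url "<base>"], value of insteadOf)
    ("ssh://git@myserver:1234/", "gh:"),
]
rewritten = rewrite_url("gh:myusername/myproject", rules)
print(rewritten)
```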
List of supported languages and lexers · rouge-ruby/rouge Wiki
Quite a lot! Will try the generic conf for the .gitconfig above.
[1] Brandon Invergo - Using GNU Stow to manage your dotfiles.
[2] Even better description than the canonical page: Managing dotfiles with GNU stow - Alex Pearce