serhii.net
In the middle of the desert you can say anything you want
tags: ML
Locally debugging Huggingface Dataset scripts
Huggingface Hub prefers zip archives because they support streaming
Things I'll do differently next time when creating datasets
Huggingface dataset build configs
Huggingface Hub full dataset card metadata
LLM playgrounds online
Promptsource
Vaex as faster pandas alternative
Huggingface datasets can become pandas dataframes
Detecting letters with Fourier transforms
Interesting blog with explanations of ML stuff
Sparse language models are a thing
LM paper notes
NN basics and resources
HF token-classification pipeline prediction text
pytorch dataloaders and friends
Omegaconf and python configs
Creating representative test sets
Huggingface datasets set_transform
Inter-annotator agreement (IAA) metrics
Dataset files structure Huggingface recommendations
Huggingface Datasets metadata
Three libraries for explaining/inspecting/debugging/diagnosing ML
FUNSD dataset with annotated forms
Basics of NLP and Language modeling course / explorable
211122-0905 detectron Instances initialization
211110-1520 Historical document processing, dhSegment
211108-1212 nvidia-smi has a python library (bindings)
211103-1811 Handwriting text generation GAN by Amazon
211020-1410 ML starter kit resources website