serhii.net

In the middle of the desert you can say anything you want

04 Jun 2022

My custom keyboard layout with dvorak and LEDs

3412 words, ~13 min read

Intro

My keyboard setup has always been weird, and usually glued together with multiple programs. Some time ago I decided to re-do it from scratch and this led to some BIG improvements and simplifications I’m really happy about, and want to describe here.

Context: I regularly type in English, German, Russian, Ukrainian, and write a lot of Python code. I use vim, qutebrowser, tiling WMs and my workflows are very keyboard-centric.

TL;DR: This describes my current custom keyboard layout that has:

  • only two sub-layouts (latin/cyrillic)
  • the Caps Lock LED indicating the active one
  • Caps Lock acting both as Ctrl and Escape
  • things like arrow keys, backspace accessible without moving the right hand
  • Python characters moved closer to the main row

It looks like this1: kl_cut.png

and is available on Github.

(Read More)

26 May 2022

Three stories

942 words, ~3 min read

Three war stories that happened to me and that I want to write down. The first two happened when I was a Werkstudent, the third when I was working full-time.

Bachelor’s thesis

Native language identification on English tweets.

In: English-language tweets. Things like tokens, length, punctuation, and other typical features.
Out: the geographic area (US, India, Russia, …) this person lives in, as a proxy for their first language.

Decided to try using TensorFlow’s tf.data and tf.estimator for that; why not learn something new.

My very first version performed incredibly well. There had to be an error somewhere. There was.

Due to my unfamiliarity with the libs I left the lat/lon in the input data. So the network learned to predict the country name from lat/lon instead of identifying the native language from the text.
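A minimal sketch of the kind of guard that would have caught this, assuming each example is a plain dict of features. The feature names `tokens`, `length`, `lat` and `lon` are hypothetical; the post only says that lat/lon were left in the input.

```python
# Hypothetical feature names: anything that directly encodes the label
# (here, the coordinates that determine the country) must not reach the model.
LEAKY_FEATURES = {"lat", "lon"}


def drop_leaky_features(example: dict) -> dict:
    """Return a copy of the example without label-encoding features."""
    return {key: value for key, value in example.items() if key not in LEAKY_FEATURES}


example = {"tokens": ["hello", "world"], "length": 2, "lat": 51.3, "lon": 12.4}
clean = drop_leaky_features(example)
```

Suspiciously good results on the first try are a decent prompt to run a filter like this and see whether the score survives.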

Most interesting bug in my career

I was trying to train a language model and see what happens if random words in the training data are replaced by garbage tokens.

So with a certain probability a sentence was picked, and within it, with a certain probability, a token was masked or replaced by random text.

In the town where I [PREDICT_ME] born, lived a [GARBAGE], who sailed the sea.

Training LMs is hard, my dataset was HUGE, and training took days/weeks.

Sometimes, at random, training crashed on day 2, 3 or 4 with a nondescript error; sometimes it didn’t and training went on.

And I think it was TF1 and outputting the specific examples that made it fail was nontrivial, though I don’t remember the details.

Debugging it was nightmarish. And running a program that runs for a week and SOMETIMES crashes and you have to start again was very, very frustrating.

The cause:

  1. In one place in the code, I tokenized by splitting the sentence on whitespace, in another one on the single space character.
  2. In the infinite-billion-sentences-dataset, TWO sentences each contained a single double space.
  3. The dataset is shuffled, and sooner or later we got to one of these sentences. When split, we got a token with length 0:
    >>> "We all live in a yellow  submarine".split(' ')
    ['We', 'all', 'live', 'in', 'a', 'yellow', '', 'submarine']
    
  4. If the randomization picked one of these sentences and, in it, SPECIFICALLY the '' token, everything crashed.

For this to happen, the randomness gods had to give a very specific set of circumstances, low probability by themselves but likely to happen at least once during a week-long training.
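The mismatch is easy to reproduce. The sketch below assumes the two code paths used `str.split()` and `str.split(' ')`; the post does not say which exact calls were involved, and `has_empty_token` is a hypothetical helper.

```python
sentence = "We all live in a yellow  submarine"  # note the double space

# Splitting on the single space character keeps an empty token
# between the two consecutive spaces.
tokens_by_space = sentence.split(" ")

# Splitting with no argument treats any run of whitespace as one separator,
# so no empty tokens can appear.
tokens_by_whitespace = sentence.split()


def has_empty_token(tokens: list) -> bool:
    """Cheap sanity check to run on the data before a week-long training."""
    return any(token == "" for token in tokens)
```

Running a check like `has_empty_token` over the whole dataset once is far cheaper than waiting for the shuffle to find the bad sentence on day 3.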

Quickly creating a synthetic-but-human dataset

I’m really proud of this one.

The problem

Given: 10 example forms. Mostly consistent ones, containing a table with rows. Some rows contain text, some contain signatures, all by different people. The forms themselves were similar but not identical, and scan quality, pen colors, printing etc. varied.

Task: detect which rows and cells have signatures. Rows might have gaps, rows could have text but no signature. And do it really fast: both providing the proof of concept and the runtime of the network had to be quick.1

Problem: you need data to evaluate your result. … and data to train your NN, if you use a NN. You need this fast.

Attempt 1 - vanilla computer vision

I tried to do vanilla computer vision first. Derotate, get horizontal/vertical lines, from them - the cells, and see if they have more stuff (darker color) than some average value.

It failed because the signatures and text belonged to people filling hundreds of such forms per day - the signatures and text could be bigger than the row they’re on. You never knew if it was someone who wrote a shorter text, or text ‘leaking’ from the rows above/below.

Attempt 2 - ML

I knew that Detectron2 could be trained to detect handwriting/signatures on relatively few examples. But again, we need evaluation data to see how well we do. 10 documents are too few.

.. okay, but what would evaluation look like?

I don’t really need the pixel-positions of the signatures. I need just rows/cells info. And filling forms is fast.

Crowdsourcing the eval dataset.

I wrote a pseudorandom number generator that, given a number, returns which cells and which columns have to be filled.

Bob, your number is 123. This means:

  • in document 1 you fill row 2,4,7 in column 1, row 3,8,9 in column 3, and sign row 2 and 7 in column 3. Save it as 123-1.pdf
  • in document 2, …

Zip the results and email them to me as 123.zip

Then 100+ forms were printed out and distributed. Everyone got a number, N forms, and the instructions on what to fill in.

Then they sent us zips, from which we could immediately parse the ground truth. And we knew which cells contained what in each image without needing to manually label each.

The dataset was not too synthetic: we got different people using different pens, different handwriting and different signatures to fill the forms.2

That dataset then gave us a good idea of how well our network would perform in real life.
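The core trick is that a seeded generator makes the instructions reproducible, so the filename alone is enough to recover the labels. A rough sketch, not the original code: the function names, the table size, the column numbers and the number of rows per column are all assumptions made for illustration.

```python
import random


def fill_plan(participant: int, n_docs: int = 2, n_rows: int = 10,
              text_columns=(1, 3), signature_column: int = 3) -> dict:
    """Derive, from the participant number alone, what to fill in each form."""
    rng = random.Random(participant)  # same number -> same plan, every time
    plan = {}
    for doc in range(1, n_docs + 1):
        # Rows that get handwritten text, per column.
        text = {col: sorted(rng.sample(range(1, n_rows + 1), 3)) for col in text_columns}
        # Rows that get a signature.
        signed = sorted(rng.sample(range(1, n_rows + 1), 2))
        plan[f"{participant}-{doc}.pdf"] = {"text": text, "signed": {signature_column: signed}}
    return plan


def ground_truth(filename: str, **layout) -> dict:
    """Recompute the labels for a returned scan from its filename, e.g. '123-1.pdf'."""
    participant = int(filename.split("-")[0])
    return fill_plan(participant, **layout)[filename]
```

With something like this, `fill_plan(123)` produces Bob’s instructions, and `ground_truth("123-1.pdf")` produces the labels for his first scan without anyone annotating it by hand.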


  1. This meant that we had to use only one NN, because running separate ones for detecting different things would have been slow. So we had to train a checkpoint that predicts ALL THE CLASSES, ergo we couldn’t train something to detect handwriting using another dataset. (Using a NN to find signatures and using something else to find other classes would work, though.) Honestly I don’t remember that part well, but there were reasons why we couldn’t just train something to find handwriting on another dataset without generating our own. ↩︎

  2. Much better than we could have done by, say, generating ‘handwriting’ using fonts and distortions and random strings. ↩︎

04 Apr 2022

Euromaidan 3.5

3619 words, ~14 min read

That which can be destroyed by the truth should be.

(Litany of Hodgell) 1

“Almost no one is evil; almost everything is broken”.

(Litany of Jai)

sorry for stating the obvious, but still, regarding “there are no bad people, only unhappy ones / ones who are suffering / embittered ones / ones who don’t know the truth” and so on.

so:

BAD
PEOPLE
EXIST

(@lizafocht on Twitter)

As I write this, it’s 1 month 11 days 22 hours since Russia attacked Ukraine.

If you asked me two months ago about the worst period in my life, the answer would have been immediate.

The worst time in my life and the best time in my life. It’s a miracle I’m alive, but it was a formative year, horrible, beautiful, scary, hard, surreal, it left scars and created the strongest memories I have.

If you asked this question now, it’s a tie between that time and, well, now. The last month, 11 days, etc.

And about the latter one, any positive sides I can find are either exceptions to problems the war itself created (“my family and friends are all alive, only few of them got seriously physically hurt”) or just don’t pass the smell test.

Ukraine is united as never before - nice. Would I take the “Ukraine is as divided, fragmented and unstable as ever, but there is no war” option? Yes, in a heartbeat.

As I write this, the consensus on Twitter is that the Battle of Kyiv™ is over. I like this summary by Tomi T Ahonen, good storytelling and mostly factually correct.

As the danger to Kyiv diminished about 2 weeks ago, I started analyzing less and breathing more.

My brain started to work through a backlog of emotions and questions like “What does it mean now?”

This post is not about a specific video. This post is about how researching it made me realize what should’ve been clear weeks ago.

At some point you have to stop and think whether gathering more information is likely to be useful.

This post is about how I missed that point.

(Read More)

27 Mar 2022

Euromaidan-3 - part III

3344 words, ~13 min read


This post, which suddenly turned into a series of posts, is about how I saw and felt the war. I have been living in Germany for several years and was here on the night of 24.02.2022; my whole family and many friends remained in Ukraine.

I am writing for myself; all names and events are fictional and have nothing to do with reality. If you found yourself here and don’t want to be seen, write to me.


Intro / a night to remember

2022-03-27-095424_553x441_scrot.png (Art by Eugene Anatsky1)

Here will be a chronology of the most memorable night of my life. In detail, because I have a bad memory, and preserving this seems important to me.

“Щоденник | The Kyiv metro is open and the Moscow stock market is closed”

When it became clear that things were bad, I created a Telegram channel.

2022-03-25-021733_449x146_scrot.png

From the very beginning I knew that I wanted to use it as a diary and throw things I found interesting in there. And that I didn’t want to be a source of information for anyone (and especially didn’t want the responsibility that comes with that).

In those days I was reading half of the Internet, and the channel is a representative slice of that half. Something could end up in the channel because, for example, …

  • I believe it is true 2022-03-27-031127_448x166_scrot.png
  • I don’t know yet whether it is important/true, but I want to save it in case it turns out to be
    2022-03-27-031524_450x167_scrot.png
  • it is obviously untrue and/or propaganda, but it says something
    2022-03-27-031752_447x413_scrot.png
  • I find it interesting that resource X specifically decided to write exactly this
    2022-03-27-031107_450x167_scrot.png
  • it is a vivid detail or simply a message that conveys the spirit of the time
    2022-03-27-032100_440x330_scrot.png 2022-03-27-094515_436x285_scrot.png
  • it is funny
    2022-03-27-094415_647x357_scrot.png

(Read More)

26 Mar 2022

Euromaidan-3 - part two

1910 words, ~7 min read


This post, which suddenly turned into a series of posts, is about how I saw and felt the war. I have been living in Germany for several years and was here on the night of 24.02.2022; my whole family and many friends remained in Ukraine.

I am writing for myself; all names and events are fictional and have nothing to do with reality. If you found yourself here and don’t want to be seen, write to me.


This is a continuation of the first part, which has the context. In short, the posts are memories and analysis of the last month of the war. The goal is purely for me to “write it out” and somehow preserve the details, the chaos, the uncertainty and the feeling of helplessness when a war starts in my homeland while I am in Leipzig: it is clear that horrible things are happening, but nobody has had time yet to think and analyze.

Here I will write about the places on the Internet that were significant for me and that I read a lot.

Sources of information

70% of them are either propaganda or armchair analysts. Both the former and the latter were valuable or interesting to me. And they wrote about the explosions in Kyiv some forty minutes before the official sources woke up, for which I am grateful to them.

This list and the descriptions convey my memories and the sources as they were then.

Now I read slightly different things, and much less. The information war has become more serious, there is more deliberate disinformation, and the sources that helped before the start are not necessarily good at analyzing wars that have already begun.

My needs have changed too. That level of detail and noise is no longer interesting or necessary to me. (The eternal question “and why are you reading this?” no longer has as clear-cut an answer as it did before the war started.)

Twitter

Reliably with me in times of crisis. Obviously, it was present in the biggest crisis of my life too, and I got 80% of my information from there.

I post there extremely rarely myself, and right now my profile still looks like this:

2022-03-27-014255_602x891_scrot.png

(Read More)

24 Mar 2022

Euromaidan-3 - part I

2927 words, ~11 min read


This post, which suddenly turned into a series of posts, is about how I saw and felt the war. I have been living in Germany for several years and was here on the night of 24.02.2022; my whole family and many friends remained in Ukraine.

I am writing for myself; all names and events are fictional and have nothing to do with reality. If you found yourself here and don’t want to be seen, write to me.


Okay.

I don’t know why I feel such a strong need to write about this. Especially weeks later.

Back in the day I wrote two posts, in Ukrainian, during the Maidan.

They were a direct way to express a lot of strong emotions in the here and now. This post is also a way to express emotions, deeper and less acute, but no less strong. It will be more analytical and retrospective. But its main goal is definitely neither documentation nor analysis.

(Read More)

21 Jan 2022

Some things I learned at BxE

1973 words, ~7 min read

Prologue, nostalgia and oversharing

I worked at BxE for 2 years and 9 months, first part-time as Werkstudent, then “Junior researcher”1 then finally “Machine learning engineer”.

This was my first “real” non-internship full-time job, first contact with German work culture and then there was the usual “only in a startup you can get 5 years of experience in 1 year” - it was an adventure and I learned a lot.

I love the German phrase “Zwischen den Jahren”2 (lit. ‘between the years’), the time between Christmas and the New Year, for me always a time of limbo3 and especially reflection. Even more so if the ‘time between years’ coincides with the time between jobs, with my last day at BxE being the 17.12.2021 and the next one starting on 01.01.2022.

A very specific kind of emptiness, not “I’m on vacation” but “I don’t have a job, I have no commitments or duties either to my last workplace or to the upcoming one; I don’t even need to look for a job, there’s nothing I ‘should’ be doing”. Comparable to “I’ve just finished my final school exam, regardless of the results I’m as free as I’ll ever be until university starts”.

I spent the first week of that time sleeping, then had a lot of quality time with family and friends, and missed the traditional reviewing/thinking that I usually do at the end of important phases. There was no analysis and synthesis of the ‘lessons learned’, not even diary-style bullet points. I thought I’d just forget about it and live my life, but nope.

Still it kept eating me, keeps eating me, the way unfinished business and disconnected memories that desperately want to be analyzed eat you - and 21 days in, I don’t think it will shut up, so I’ll give in.

What follows is a rough list, in the order I remember things. 4

Things I learned how to do better

Soft skills

  • Presenting
    • Both “how to effectively create slides” and “how to tell a story”
    • Graphs are awesome
      • Bonus points if they are easily reusable5
    • How to pick an appropriate abstraction level when explaining stuff
    • If I attempt to convey less, more gets understood at the end.
      • Especially in low-shared-context larger meetings or sprint reviews. High information density means everyone gets lost on slide #3.
  • Mentoring / teaching / supporting
    • Mentoring a smart junior colleague was awesome and really satisfying, not sure who of us learned more
    • Still proud of “This is a series of tickets to learn $internal_tool, I’ll help you; please write the tutorials as you go”
  • Communicating results and probabilities
    • Often it’s better to not communicate intermediate results or low-probability assessments
    • Make it as hard as possible to misunderstand you, even if the listener really wants to

Automatic document processing in Real Life

  • Scientific papers and scenarios in ML tutorials live in a world where you rarely have:
    • street names containing months & dates, people named like “Christian Thomas”
    • tables in invoices showing incredible diversity and creativity and interesting design decisions
    • 50 annotation types, half of which overlap with each other
      • An ADDRESS prolly contains a STREET_NAME, and now you can’t train both together :(
    • “I had this invoice in my pocket, then my dog chewed it, yes these are traces of blood don’t ask where it comes from”
  • Matching names/address with bad OCR to rows in databases compiled from multiple sources and OCR engines
    • Whatever str., Whateverstr., Whateverstraße, Whatever Straße, WhaIeveraBe, W̵h̶a̶t̴e̴v̶e̴r̴ ̶S̵t̷r̶a̵ß̷e̶, W̴̱̩̖̎͑ḧ̵͔̜́̌ą̸͉͌t̴̠̪̊̾͠e̷̖̥͌̇͗v̷͓͖͂̒ĕ̶͚̍r̴̳͊͝ ̶̯̲͔̄S̴̻̍̈́̾t̷͇͋̾r̵̡͂̓̏a̸͍͕͑͗̔ß̴̣́ę̸̥̠̌͌̆
    • Our Lady in Whatever General Hospital, OLWGH, W.G.H., OL-W. General Hospital
    • … human creativity knows no bounds!
  • Measuring how changes in things like OCR or text flow influence performance of already trained models
    • train a network on data with bad OCR, improve OCR, your network now performs worse. Cry.
  • NER F-Score metrics are only one part of the picture!
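To make the street-name problem concrete, here is a small sketch of normalizing the “clean” spelling variants before fuzzy-matching against a list of known names. This is an illustration only, not what was used at BxE; the function names and the 0.6 cutoff are assumptions, and heavily garbled OCR output would need considerably more than this.

```python
import difflib
import re
from typing import List, Optional


def normalize_street(name: str) -> str:
    """Collapse common spelling variants of German street names."""
    s = name.casefold()               # also turns "ß" into "ss"
    s = re.sub(r"[\s.\-]+", "", s)    # drop spaces, dots and hyphens
    return re.sub(r"strasse$", "str", s)


def match_street(ocr_text: str, known: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the known street name closest to the OCR output, if any."""
    index = {normalize_street(k): k for k in known}
    hits = difflib.get_close_matches(normalize_street(ocr_text), list(index), n=1, cutoff=cutoff)
    return index[hits[0]] if hits else None
```

After normalization, “Whatever str.”, “Whateverstr.”, “Whateverstraße” and “Whatever Straße” all map to the same key, which leaves the fuzzy matcher to deal only with genuine OCR damage.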

(Read More)

21 Jan 2022

2021 in review

547 words, ~2 min read

– The same procedure as last year?
– The same procedure as every year! 1

Now playing: “And the battle goes on again, and the heart is uneasy in the chest…”

Achievement of the year: almost completely left the Google services ecosystem! + Quit my (essentially first) job and moved to the second one2.
Mood of the year: constant vague stress, pressure and responsibility; WORK HARD and STAY IN BED THE REST OF THE TIME, UNABLE TO MOVE; inability to find a balance between the need to socialize, the inability to do it online, the lack of energy/contexts to do it IRL, and a permanent background feeling of guilt for all of the above
Encounter of the year: “Uncle Serezha, how about you pick a Pokémon for your birthday and I’ll draw it as soon as I can”
Event of the year: Masha gave me a bicycle!3
Horror of the year: The bicycle got stolen :( But seriously - the fire in the4 church, during which the5 organ6 burned
Country of the year: Ukraine
City of the year: Leipzig!!!! This year with three four exclamation marks!
Word of the year: Corona-Warn / Дій вдома / тій фОтме 7
Trip of the year: the summer, autumn and especially winter trips home
Web service of the year: Fastmail + Coronavirus charts8
Color of the year: a pleasant orange-cream
Smell of the year: alcohol, slivovitz, gasoline, aloe vera; every shop has its own
News of the year: about one particular, yet another possible long-term trip
Book of the year: The Culture Map (Erin Meyer) + Permutation City
Film/series of the year: Star Trek Enterprise
Media N.O.S.: Lots and lots of podcasts! Lore, Old Gods of Appalachia, SCP Archives. I was also very impressed by the Introductory Antimemetics story series from the SCP Foundation
Song of the year: Voltaire - The Trouble with Tribbles (Song Only) - YouTube; just one snack - YouTube; Let the Sunshine In - Hair - YouTube (that specific video)
Venue of the year: the café across the park, not far from the apartment in Leipzig
Drink of the year: Tea brewed in a milk jug (or in a paper tea bag you can fill yourself). In second place, chrysanthemum “tea”
Food of the year: freshly baked buns + Lemon curd salmon recipe
Transport of the year: my own (second this year…) bicycle!

Wish for myself for 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022:
Find my sustainable zen with work, stress, sleep, caffeine, a daily rhythm or the lack of one. Find my sustainable zen with people and communicating with them. Find my sustainable zen with energy, its sources, and ways of spending it.

Goal for next year:

  • Write/create more, no matter what (posts, libraries on Github, drawings, poems), but make an effort: say, blog posts about PKM9 rather than short sketches about individual details10 (though those are better than nothing at all).
  • Learn to keep in touch with people even through a drastic change of shared context (how do we talk to each other if there’s no watercooler anymore?…). Remember the zen of Merseburg and organize things. Talk to people more in person and via video calls (.. or at least simply by phone), less text.
  • More fresh air, travel, spontaneity, lightness; don’t save my energy only to then spend it on lying in bed

And the evergreen: Keep up the sleep, sport, meditation - THIS WORKS. (x4)

(У)


  1. “Dinner for one”, of course ↩︎

  2. (+ learned to work without five screens in almost any conditions!) ↩︎

  3. I also got vaccinated ↩︎

  4. that very one ↩︎

  5. that very one ↩︎

  6. Fire in a church in Kyiv: a unique organ damaged ↩︎

  7. (у виконанні дами зі стійки реєстрації WizzAir в Берліні) ↩︎

  8. The blog Don’t Worry About the Vase gets a honorable mention as my main source for corona stuff this year ↩︎

  9. My journey in PKM, Part 1: things I tried - serhii.net ↩︎

  10. the whole DTB ↩︎

12 Jan 2022

Things learned from my father in law when building additional drawers for an IKEA wardrobe

532 words, ~2 min read

Had an IKEA wardrobe, it was good, but after a year we realized we needed more drawers, bought seven. My father-in-law came to visit and decided to help us build them.

After unpacking, we realized they all were the wrong size, width 100cm instead of 75cm. (No idea what happened there…)

He said it’s no problem to manually make them the right size. Alright, why not, might be fun, when else will I get to do something like this.

The feeling of power from changing the world around you is even stronger than when building standard IKEA stuff according to plan. Like, you’re allowed to modify furniture that you own. Strangely empowering.

Some things he did that I really liked and that are the reason for this post:

  • Need $tool of correct size for $job? I’d measure and take that piece of paper to a hardware store. He took an example plank with him to the shop to pick $tool based on the real thing.
  • I’d have blindly trusted that the drawers fit (IKEA, standardization, etc), building one to see that I got the steps right and then parallelizing the rest to do similar steps in a shared context. He built a single drawer first specifically for the purpose of testing it immediately, which is how we found out about the problem.
  • Cutting a drawer is surprisingly easier when it’s assembled! Both physically and knowing-which-part-to-cut-where, since you’re cutting three different boards in the same plane. Not always a good idea but I would have never thought of doing that to begin with. We cut the first assembled drawer without disassembling it.
  • Measuring stuff - I’d have measured a board from a “good” drawer, measured the same thing on the target, drawn a line (between two separately measured points), then cut that. He physically disassembled a ‘good’ drawer, put each board over the corresponding ones from the ‘big’ one, and traced the line over from them. Much easier if you have a lot of drawers. Even if you got the measurements in cm there’s still room for error when you transfer them to each new drawer.
  • When cutting something with a saw, if friction is a problem you can stick something in the already-cut part to increase the distance and decrease friction.1
  • Even if it was 11pm, he really wanted to clean up before going to bed, which connects really well with my usual party “we clean up before we fall asleep while we still have energy because in the morning we won’t”.

A pattern I see in some of the above is getting rid of intermediate steps and making use of natural mappings when available.2

Got a lot of insights in the style of The Art of War, and also now we have 7 new drawers. 10/10 would recommend.


  1. Probably the only bit here that easily applies only to woodworking :) ↩︎

  2. In the sense of “The Design of Everyday Things”. Reasoning about what to cut where is easier when you’re dealing with an assembled thing - like the book’s “a panel with light switches in the shape of your home, with the switches being in places where the lights they control are” ↩︎

03 Dec 2021

My journey in PKM, Part 2: my current approach

3663 words, ~14 min read

Intro

Hi! Welcome to part 2 of my take on Personal Knowledge Management (PKM).

Part 1 was about things I tried in the past and why I stopped using them, along with what they taught me about my requirements for a PKM system.
Part 2 (this post) will be a description of the things I currently use.

TL; DR

Private notes

My day-to-day activities live in two very long textfiles, one for work (work.md) and one for personal stuff. The inspiration for this was this post: My productivity app is a single .txt file.

My personal notes are written almost exclusively using Obsidian, synchronized to my Android phone. I make heavy use of the Obsidian Templater Plugin to pre-fill front matter, put notes in their correct place, add tags, etc.

Public notes

The public part of my notes lives on my website, serhii.net, which is a static website on Hugo.

My public notes are written using Obsidian and stored as markdown. They get published on my website at Diensttagebuch - serhii.net and Journal - serhii.net respectively. The Diensttagebuch also generates a Master file, which basically concatenates all the single markdown files into a long page. I use this for quick casual searching.

All of this uses Git for version control.

I have a private VPS, where I run wallabag, that I use to quickly save links from my Android phone.

My current setup

Probably for the first time I’m very very happy with my current solution: it checks all the boxes I knew I needed, and ones I had no idea I wanted or that were possible.1

Requirements from the previous post

Things I know I need from a PKM system, from the previous post:

  • Data storage:
    • The data’s main location shouldn’t be a webservice. Ideally I should control both the data and the tools needed.
    • The data should live in a future-proof format, readable even without the tools
  • It has to require little upkeep/maintenance
  • Should be flexible in the data it gets and supports:
    • Not just links, but also pictures/files etc
    • free-form text for manual summaries of the items saved
  • Taxonomy:
    • Ideally as flexible as possible, ability to add own attributes
    • Ability to use complex queries when searching
    • Quick feedback for errors or autocomplete for tags/categories
  • Usability:
    • Adding stuff to it should be really fast and friction-free. If it takes time and effort to add new info to it you won’t do it often.
    • Both loading times and number of steps should be very very small

Text files

Right as I was starting to work at my first real full-time job, I stumbled upon this link: My productivity app is a single .txt file

It describes using one long text file for organising stuff and notes, and appending to the bottom every day. I loved the idea.

My text files

… and decided to try it for work: most stuff I need to write down was company-specific, confidential and not fitting a wiki format, I was learning it chronologically, it was the perfect case.

Work work.md file

For example, the work file work.md:

  • I start the day by pasting the date and “Today: “2
  • I write my plans, copy-paste any unfinished ones from the days before
  • During the day, I append to that file. I write things like:
    • terminal commands, non-trivial errors I had, how I solved them, all ungoogleable stuff
    • Log of what I worked on, to be able to easily come back to a topic or find old files
      • For example: “TICKET-1234. Downloaded dataset to $path, starting training by running $command. Eval results saved to …”
    • Notes from meetings (under the hashtag #meeting) or important conversations with people
  • At the end of the day, I write the plans for tomorrow
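The start-of-day step is simple enough to script. A minimal sketch, assuming an ISO date line followed by a `Today:` line; the exact header format used in the real file is not shown here, and both function names are hypothetical.

```python
from datetime import date
from pathlib import Path


def day_header(day: date) -> str:
    """Build the block that opens a new day (the format is an assumption)."""
    return f"\n{day.isoformat()}\nToday:\n"


def start_day(path: Path, day: date) -> None:
    """Append the header for the given day to the end of the long text file."""
    with path.open("a", encoding="utf-8") as f:
        f.write(day_header(day))
```

Calling `start_day(Path("work.md"), date.today())` appends rather than rewrites, which keeps the file strictly chronological.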

Representative sample from work.md:

Small improvements accumulated along the way:

  • Custom VIM colorscheme that:
    • highlights lines starting with TODO, XXX, done to make them easy to parse past.
    • Date blocks and lines starting with + have a specific color, to easily parse headers or important stuff
  • Using indentation to separate logical blocks, and then vim folds with foldmethod=indent.
  • Using special hashtags/words to make search easy:
    • #meeting (formal ones), #conversation (informal), STARTED (for starting ML trainings that I’ll need to check up on later), etc.
    • Using initials for people. A line might look like this:
      Today:
      	- ...
      	- 13:00 #meeting with AB and CD about X
      ...
      #meeting with AB and CD about X:
      	Meeting minutes: http://...
      	AB wants to do Y
      		- CD: we can't, why don't we ...
      	...
      
  • Making it open in the same workspace on system startup
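The payoff of consistent markers is that a search for them stays trivial, inside or outside the editor. A sketch of such a search over the file’s text; `find_tagged_lines` is a hypothetical helper, and the sample mirrors the example above.

```python
import re


def find_tagged_lines(text: str, tag: str) -> list:
    """Return (line number, stripped line) pairs for lines containing the marker."""
    # The marker must start a word: "#meeting" matches, "x#meeting" does not.
    pattern = re.compile(rf"(?<!\S){re.escape(tag)}\b")
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


sample = "\n".join([
    "Today:",
    "\t- 13:00 #meeting with AB and CD about X",
    "#meeting with AB and CD about X:",
    "\tAB wants to do Y",
])
meetings = find_tagged_lines(sample, "#meeting")
```

The same function works for plain-word markers such as `STARTED`, since the pattern only requires the marker to stand alone as a word.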

Personal / RL file

This worked really well, so I soon created a second file for real life stuff. The trigger there was seeing my 80-year-old relative keep such a paper notebook with daily notes (the day I came was something like “Serhii came to visit; planted cabbages”).

I use it for IBANs, recording payments, gift ideas for people, logistics, list of documents for bureaucracy. Almost everything above applies to it too.

Advantages

I really love these textfiles, and their length is the main proof:

> wc -l work.md
39538 work.md
> wc -l file.md
14063 file.md
  • Work
    • Daily meetings, preparing for sprint reviews etc. is easy, I don’t have to remember what I did.
    • Solving problems is easier too: if I solve a non-trivial issue with internal or external tooling, I document it, and I can look it up next time.
    • It also works as a classic lab log, with experiments, theories, results. Some get copy-pasted into JIRA/Confluence later, but that way I can use the tools I like the most to draft them.
  • Home
    • Excellent log of stuff I don’t do often but need to refer to.
    • For example, last time I needed something from IKEA, 30% wasn’t available for delivery. Six months later I was ordering something else and quickly found the list of things not bought previously, by searching for “IKEA”.

Why does it work?

  • Low-friction:
    • Really quick to start writing (opens automatically on i3 workspace “10”, switching to that workspace is mechanical memory now (<Win-0>))
    • I love textfiles and I love vim and it’s the lowest-friction way I know of to edit text
    • Searching is instant, including regex-based
  • No-maintenance
    • Version control is automatic
    • No dependencies except a text editor
  • Under my control and future-proof
    • Text is as portable and future-proof as it gets
    • Lives on my local filesystem, I can get a text editor anywhere

What’s missing

  • Not part of its job description, but nevertheless:
    • Not good for saving stuff “for later”: no tags/categories, no way to search by filtering by them
    • No way to easily paste/view pictures etc.
  • Not easily accessible from mobile
  • Any solution to a problem I find and document stays on my hard drive. (Still better than “only in my head”, but still…)

Diensttagebuch / daily log

Live at Diensttagebuch - serhii.net.

Three things happened:

  1. I saw an /r/AskHistory thread where a Diensttagebuch was mentioned as a source. In English, I’d translate this as “Work journal/log/notebook”, “Lab notebook”. I thought it’d be a cool idea for a small blog, like the textfiles above but for public stuff.
  2. Also I kept reading about static site generators3.
  3. I also started hating Wordpress4 and was looking for alternatives.

Diensttagebuch basics

So for fun, I started writing a daily log of things I do.

At first I used Jekyll. I liked it. After it got too big, it started to take too much time to build (7-10 seconds). I heard Hugo was supposed to be fast, tried it out, pointing it at the same folder with makdowns as Jekyll (yay open formats), it worked, and took less than a second to do it! Very happy about the move, but of course it’s all still a permanent work in progress. 5

I started using it at about the same time as the textfiles. There I started documenting the things I keep having to google (how do you change the size of matplotlib plots in Jupyter?) along with the answers, and useful snippets. I also kept documenting random bits and pieces that I felt were interesting, but not worth the 20 seconds and mental effort to put into the link wiki (see part 1 of this post).

Representative day:

I wrote it in markdown from the beginning, and could easily paste syntax-highlighted code snippets, pictures, clickable links.

At a certain point (after I started using Obsidian, see below) I split day123.md posts containing multiple possibly-unrelated headers (and no tags) into separate YYMMDD-HHmm-post-title.md ones. Now I could give them separate tags, and link to them from other places! 6

To search for stuff, I created a so-called Master file. Initially it was a bash script that takes all the Days and dumps them into a single file (then also reused as page on Jekyll). I used both the page and the local file to quickly search for stuff (search whatever you want and skip through till you see what you’re looking for).
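The original was a bash script; purely as an illustration of the same idea, here is a minimal Python sketch. The function names, the `*.md` glob and the HTML-comment marker between days are assumptions for the example, not the actual script.

```python
from pathlib import Path
import re


def build_master(days_dir: Path, master_path: Path) -> int:
    """Concatenate every markdown file in days_dir into one master file."""
    # Sort by filename for a stable order; skip the master file itself
    # in case it lives in the same directory.
    files = sorted(
        p for p in days_dir.glob("*.md")
        if p.resolve() != master_path.resolve()
    )
    with master_path.open("w", encoding="utf-8") as out:
        for path in files:
            # Keep the source filename as a marker, so a hit can be traced back.
            out.write(f"\n\n<!-- {path.name} -->\n\n")
            out.write(path.read_text(encoding="utf-8"))
    return len(files)


def search_master(master_path: Path, pattern: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs matching a regex, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    lines = master_path.read_text(encoding="utf-8").splitlines()
    return [(n, line) for n, line in enumerate(lines, start=1) if regex.search(line)]
```

Calling `build_master(Path("days"), Path("master.md"))` and then `search_master(Path("master.md"), r"matplotlib")` would give roughly the "search whatever you want and skip through" workflow, just without the browser.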

Advantages

  1. I can quickly look for stuff, and immediately get the answer, usually without having to follow the link. 7
  2. For some topics, easier to read/parse than the long textfile. Syntax highlighting, post titles etc. are nice.
  3. It’s public:
    1. Easy to access myself from anywhere
    2. Easy to share with people. “Yes, I did some jq before. Here’s my short summary of it and links you can follow to go deeper: Day 860 - serhii.net”
  4. Now it’s quite a big chunk of material. The directory with the source markdown files has 18.700 lines split into 477 files. It honestly feels good, and I hope I provided some value to other people on the internet.
    1. Sometimes people find stuff mentioned there and write me about it.
    2. Once I got an email asking to add their website to my “excellent list of $topic resources”
    3. At least three times I found my own posts on Google when looking for solutions to a problem! Usually happens with evergreen stuff like Nvidia GPU driver issues

Why does it work?

Differences from the textfiles above are in bold.

  • Low-friction:
    • The window to write stuff is in my i3 scratchpad, accessible as <Win-minus>
    • I have templates/scripts to create a new file in the correct place, and pre-fill it with sensible values as soon as I give it a name
    • Searching is easy both from the browser and with grep
    • Deploying it is done through a shell script that commits and pushes it, builds the website, and scp-s it to serhii.net
  • No-maintenance
    • Version control is automatic
    • Serving:
      • A static website is trivial to serve and will stay available
    • Accessing:
      • I always have a text editor installed and can read markdown files
    • Building the static website:
      • Hugo is trivial to install, no database required to read the markdown files, it’s a beauty.
  • Under my control and future-proof
    • Text is as portable and future-proof as it gets
    • Lives on my local filesystem, I can get a text editor anywhere

What’s missing

For quite a long time, the ability to drag-n-drop pictures into them, and the ability to edit them on mobile.

Obsidian and digital gardens

My love for Obsidian knows no bounds, but I’ve only been using it heavily, as described, for around three months (as of December 2021), much less than the two previous ones (which are both more than two years old). That’s not stable and mature by my standards, but if there were any deal-breakers or friction I’d have stopped long before that.

Once I went down a rabbit hole and stumbled upon two things I always loved but had no idea were a thing, or had a name:

  • Learn In Public (post publicly about your progress, create resources you wish you had found)
  • Digital gardens (having a set of pages that get constantly updated as your knowledge grows instead of chronological posts)

I heard Obsidian being mentioned often, and decided to try it out. Fell in love instantly.

Obsidian itself

Obsidian describes itself as “a powerful knowledge base on top of a local folder of plain text Markdown files”, which covers it quite well and instantly ticks the box for “data stays locally in an open format”.

I see it as a layer over markdown files that adds support for a lot of neat things without breaking them too much (still markdown, still roughly readable in whatever you use).

Desktop app

Electron-based, “just works”, dragging and dropping screenshots and it dealing with the rest feels like magic.

Android app

The main value it gives me is easy editing from my phone. Writing markdown by hand on a touchscreen is painful, and the Obsidian editor does a really good job at it. The same goes for navigating through the pages, renaming, moving, following tags, and adding templates. It’s also wonderfully configurable, and you can add to its mobile toolbar the actions you want in the order you want.

But the two things that make Obsidian really powerful are templates and plugins.

Templates

They are snippets that manipulate text and variables to do some action like enter a date. With the help of Obsidian Templater Plugin, they can do much more.

211203-2305 New obsidian Templates + hotkeys for Garden (IT, RL) and personal notes - serhii.net has an example of one I use that gives an idea of what’s possible:

<% tp.file.move("garden/it/"+tp.date.now("YYMMDD-HHmm")+" "+tp.file.title) %>---
title: "<% tp.file.title %>"
tags:
  - "zc"
  - "zc/it"
  - "<% tp.file.cursor() %>"
fulldate: <% tp.date.now("YYYY-MM-DDTHH:mm:ssZZ") %>
date: <% tp.date.now("YYYY-MM-DD") %>
layout: post
hidden: false
draft: false
---

Works like this:

  1. You create a new file and enter its name (let’s say “New file”).
  2. Then you “Insert” that template. It:
    1. Moves that file to /garden/it/211206-1234 New file.md
    2. Inserts front matter, with title: being taken from the new filename
    3. Adds the other front matter values and tags, and puts your cursor at the beginning of the third tag.
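To make the steps concrete, here is a rough Python sketch of what the template computes. It is not Templater code: `apply_template` is a made-up name, the path is taken relative to the vault root, and `fulldate` is left out to avoid guessing at time zones.

```python
from datetime import datetime


def apply_template(title: str, now: datetime) -> tuple[str, str]:
    """Return (new path, front matter) for a note, mimicking the template."""
    # Same prefix as tp.date.now("YYMMDD-HHmm") in the template above.
    stamp = now.strftime("%y%m%d-%H%M")
    new_path = f"garden/it/{stamp} {title}.md"
    front_matter = "\n".join([
        "---",
        f'title: "{title}"',
        "tags:",
        '  - "zc"',
        '  - "zc/it"',
        '  - ""',  # the cursor lands here, ready for a third tag
        f"date: {now.strftime('%Y-%m-%d')}",
        "layout: post",
        "hidden: false",
        "draft: false",
        "---",
    ])
    return new_path, front_matter


path, front_matter = apply_template("New file", datetime(2021, 12, 6, 12, 34))
print(path)  # garden/it/211206-1234 New file.md
```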

To insert that template, you:

  • on mobile, swipe down, type “ins”, get a list of matching ones; or you could have created a button for entering that template in the mobile toolbar.
    Command palette
  • on desktop, <C-p> opens the command palette, see above. Or you could have defined a hotkey for that.
    Hotkeys

Creating that file on mobile and typing all that would have been hell. You create a template once, then it works both on desktop and the mobile app, and saves you a lot of pain.

Plugins

Obsidian has a lot of plugins, including the above-mentioned Templater. I use:

This is a nice overview of some plugins: A Few of Our Favorite Obsidian Plugins – The Sweet Setup

Obsidian supports quite advanced search.

It has standard AND/OR/NOT, regexes, and special search operators to apply these to filenames, paths, tags, lines/blocks/sections, and done or not done tasks. There’s also vantage-obsidian to visually build complex queries.

My obsidian workflow

General

I create the files, insert the templates I wrote, add text, switch to the rendered view to easily read the results.

I add a lot of #tags and #tags/with_children and get autocomplete on them both on Linux and Android - no “will I look for this link under #coding or #programming?” anymore!

Is this real?

Integrating with Hugo / Diensttagebuch

I create individual posts with the template I pasted above. I use https://github.com/khalednassar/obyde to convert them into ones Hugo can parse (mostly involves updating pictures directories and stuff). I just added the line python3.8 -m obyde -c ./Scripts/obyde_config_dtb.yaml to the deploy.sh I use to deploy the rest of the blog.

This part is still shaky and breaks down sometimes.
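For illustration only, the deploy steps mentioned in this post (commit and push, convert with obyde, build with Hugo, scp to the server) could be sketched in Python like this. The real thing is a shell script I’m not reproducing here; the order of the obyde step, the commit message and the remote target are all assumptions.

```python
import subprocess


def deploy_commands(remote: str, message: str = "Update notes") -> list[list[str]]:
    """Build the deploy steps, in order, as argument lists."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "--message", message],
        ["git", "push"],
        # Convert the Obsidian notes into files Hugo can parse
        # (the command is the one quoted above; its position here is a guess).
        ["python3.8", "-m", "obyde", "-c", "./Scripts/obyde_config_dtb.yaml"],
        ["hugo"],  # builds the site into ./public by default
        ["scp", "-r", "public", remote],
    ]


def deploy(remote: str, dry_run: bool = True) -> None:
    """Print each step; run it too unless dry_run is set."""
    for command in deploy_commands(remote):
        print(" ".join(command))
        if not dry_run:
            # check=True stops the deploy at the first failing step.
            subprocess.run(command, check=True)


# Hypothetical remote; by default this only prints the commands.
deploy("user@example.org:/var/www/site")
```

Note that with `check=True` a `git commit` with nothing to commit would abort the whole run, which is one of the many ways a pipeline like this can be shaky.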

Personal notes

This is where its Android app especially shines, and I can edit text, task lists, templates for pages etc. I can take pictures and embed them in notes, or embed PDFs.

Last time I was flying, I just created a page where I pasted the pictures of the boarding pass QR codes and linked the obsidian page with my embedded .PDF vaccination certificate. (Usually I’d have all this as separate files in a Telegram chat with myself.)

You said you wanted self-hosted open-source solutions?

That was/is my main doubt. Obsidian is closed source, and building too much of my workflow on that feels against a lot of what I believe in. Let’s analyse.

Chances for Obsidian to survive

Obsidian has a LOT of plugins, which in my world means a healthy and living community. They also have paid offerings which give them some clear ways to monetize, which again gives me hope they’ll live.

What happens if it doesn’t?

Obsidian is free for personal use, and is a downloadable app, not a web application. Which means that (in theory) I don’t depend on their servers for anything, unless I use their paid Sync or Publish options. If the company gets hit by a bomb, the local application itself should continue to work.

Data is stored in markdown, and there are multiple converters available to transform the Obsidian-specific syntax (which I don’t use often) into standard markdown, so there I’m also safe.

Still, why take the risk?

It’s just too awesome to ignore.

Advantages

Markdown, consistent and really good UX on computers and phones, ability to code templates to automate boring stuff, ability to easily sync part of the vault into a public Hugo blog? I never thought I’d see this anywhere.

Especially the Android part: being able to write/edit stuff (checklists, personal notes, etc.) from my computer and then continue from mobile when I have time has really been a game changer for me.

Why does it work?

  • Low-friction:
    • The window to write stuff is in my i3 scratchpad, accessible as <Win-minus>
    • I wrote a lot of templates to automate away the boring stuff, and they’re quick to insert both from Android and Linux.
    • Searching can be done through Obsidian
    • Deploying it doesn’t require anything special, part of the deploy.sh script
  • No-maintenance
    • The public part is totally under the control of the Hugo scripts
    • Locally - sync keeps working in the background
  • Under my control and future-proof
    • It doesn’t export notes to markdown, it stores them in markdown. Which is “as portable and future-proof as it gets”
    • I don’t depend on their services:
      • publishing is easy to do with markdown
      • sync is easy to do through git / nextcloud / syncthing, now or in the future
      • the application runs locally, if it can’t I can use anything to edit markdown.

What’s missing

Complex taxonomies for a link database, but I don’t think I need that anymore. The Semantic MediaWiki Link wiki approach described in the previous post might have been useful earlier, but feels overblown for me now.

Obsidian search options over the markdown files that make up the public part of my notes are more than enough for me. In the time between my almost-abandonment of the link wiki and discovery of Obsidian, I’ve been happily Ctrl+F-ing through the various master files or grepping locally and that was … good enough.

That area is definitely a work in progress, we’ll see.

My current solution for casual “read later” bookmarking is Wallabag.

Wallabag (Save the web, freely | wallabag: a self hostable application for saving web pages) is a self-hosted solution to bookmark links. It has a really nice Android app.

It has stars, read / not_read, tags (clickable suggestions on mobile!), a “reader mode” for links, and can create RSS feeds for unread/starred/archived articles.

It has Tagging Rules that add specific tags based on variables like title / URI / reading time / …

When I see an interesting link on mobile and want to quickly save it, I “share to Wallabag”. Then I can access them from desktop through the web interface.

Next steps?

I want to move the links from my link wiki https://serhii.net/f/ into something else, and deactivate it. That something else will probably be a Hugo website, compatible with Obsidian.

Conclusion

It’s been a while since I wrote something long on my blog, feels awesome. (By the way, I drafted this post, partly, on mobile, through Obsidian).

Initially it was supposed to be a short summary of what I’m using now, but then I went down memory lane, realized how much thought I’ve put into these topics all my life, and decided to write it all out. Glad I did.

I’m very happy with the Diensttagebuch approach for saving useful links/snippets. Obsidian is wonderful for personal stuff and adding posts to the blog, and I don’t miss the advanced querying options. Maybe I’ve just started to read stuff less, or stopped caring about saving it for the future.


  1. Beware “I found this awesome thing, am using it for two entire weeks now!” blog posts! ↩︎

  2. Would be easily automatable, but I really enjoy the start-of-day ritual of copypasting the previous date block etc. ↩︎

  3. You write files, “compile” the website, and a static .html website that’s trivial to host is generated at the end ↩︎

  4. I was also more and more unhappy with some of the design decisions Wordpress took (hate the “new” editor), and realized writing posts there is not fun anymore. But I kept having to update it, because Wordpress is known for being unsafe. ↩︎

  5. Decided to switch, did some interesting templating magic to make it closer to what I want. The CSS keeps giving me issues (ordered/unordered lists, code blocks, etc.), especially after I tried to merge the Skeleton-based home page (Skeleton: Responsive CSS Boilerplate) with the Hugo theme, but that’s on me, not Hugo. ↩︎

  6. The grouping of separate posts under the same Day is done by Hugo through a template I wrote, described in part here (note the name!): 211108-1405 Hugo create shortcode or template for Day - serhii.net.
    The final list.html template result looks like this (note that both are under the same “Day 1065” header):
    A single individual post looks like this:  ↩︎

  7. Even if the link dies ↩︎