Things that live here:

  1. Work log, where I note things I feel I'll have to Google later.
  2. Journal, very similar but about non-IT topics.
  3. Blog for rare longer-form posts (last one below).
  4. Links blog, formerly my personal wiki.

Feel free to look at what you can find here and enjoy yourself.


~17% of my country is occupied by Russia, and horrible things are happening there. You can help stop this! If you can donate - I contribute to (and fully trust) Come Back Alive and Hospitallers, but there are also other ways to help.

If you're in Germany, DHL ships parcels with humanitarian aid for free!


Latest posts from the Work log

Day 1539 / M paper bits

  • Q2 papers:
    • Halicek et al. Tumor detection of the thyroid and salivary glands using hyperspectral imaging and deep learning. Biomed Opt Express. 2020 Feb 18;11(3):1383-1400. doi: 10.1364/BOE.381257. PMID: 32206417; PMCID: PMC7075628.
      • Tumor detection of the thyroid and salivary glands using hyperspectral imaging and deep learning
      • 2.2.1 diff normalization method: 2023-03-19-112751_967x313_scrot.png
      • Nice picture of diseased vs. non-diseased spectra
      • 2.5 uses Inception-v4 CNN, but they also modified it slightly
      • Metrics: AUC but also sensitivity/specificity
      • Results 3.1 also uses a method for finding the most significant wavelengths!
        • 2023-03-19-113250_796x740_scrot.png
      • Suggested inclusion:
        • Lines 64-66, before/after:
          • Convolutional neuronal networks (CNN) were used to classify ex vivo and in vivo head and neck tumors, colorectal cancer, esophagogastric cancer and brain tumors [25, 26, 27].
          • Convolutional neuronal networks (CNN) were used to classify ex vivo and in vivo head and neck tumors, colorectal cancer, esophagogastric cancer, and brain, thyroid, and salivary tumors [25, 26, 27, XXXX].
        • 453:
          • There are several wavelengths that are significant for both architectures: 585, 605, 610, 670, 750, 875, 975 nm. In future work it would be interesting to research why these exact wavelengths have such a strong influence.
          • There are several wavelengths that are significant for both architectures: 585, 605, 610, 670, 750, 875, 975 nm. They are similar but not identical to the salient features for thyroid tumor calculated using the grad-CAM algorithm1. In future work it would be interesting to calculate the salient features using the grad-CAM algorithm and other approaches, and research why these exact wavelengths have such a strong influence.
        • If we want, we can also add something like “it would also be interesting to make three-band RGB multiplex images, which in that paper did better than hyperspectral for certain cancer classes”
    • Fabelo et al. Surgical Aid Visualization System for Glioblastoma Tumor Identification based on Deep Learning and In-Vivo Hyperspectral Images of Human Patients. Proc SPIE Int Soc Opt Eng. 2019 Feb;10951:1095110. doi: 10.1117/12.2512569. Epub 2019 Mar 8. PMID: 31447494; PMCID: PMC6708415.
      • Surgical Aid Visualization System for Glioblastoma Tumor Identification based on Deep Learning and In-Vivo Hyperspectral Images of Human Patients - PMC
      • Brain cancer
      • CNN but not Inception, though they get OK results with a DNN too
      • They have a separate class for hypervascularized tissue, i.e. veins and blood, and handle it separately. They refer to a paper on a related topic, specifically on colorectal cancer.
      • Figure 6:
        • in their program the surgeon personally sets the thresholds for the classes by hand! Because there is nothing to compare against for each new patient (as I understand it, roughly the same as having no test dataset). This is the thing you basically automated:

          Finally, since the computation of the optimal operating point cannot be performed during the surgical procedures due to the absence of a golden standard of the undergoing patient, a surgical aid visualization system was developed to this end (Figure 6). In this system, the operating surgeon is able to determine the optimal result on the density map by manually adjusting the threshold values of the tumor, normal and hypervascularized classes. These threshold values establish the minimum probability where the pixel must correspond to a certain class in the classification map generated by the 1D-DNN

      • SUGGESTED CHANGES:
        • Add a reference to it at the very end of 64-66, as another brain cancer example
        • 168:
          • The need in thresholding raises the question about choosing an optimal threshold that maximizes the evaluation metrics.
          • The need in thresholding raises the question about choosing an optimal threshold. Different methods for choosing thresholds exist, and in some cases one can even be manually selected for each individual case[XXX]. For our case, we needed a threshold that maximizes the evaluation metrics, and therefore needed an automatic approach.
    • Rajendran et al. Hyperspectral Image Classification Model Using Squeeze and Excitation Network with Deep Learning. Comput Intell Neurosci. 2022 Aug 4;2022:9430779. doi: 10.1155/2022/9430779. PMID: 35965752; PMCID: PMC9371828.
      • Hyperspectral Image Classification Model Using Squeeze and Excitation Network with Deep Learning
      • Technical, low-level paper about different methods and networks. The gist: figure out how to use deep learning on the complex HSI data structure and how to extract features from it. Supposedly works better than Inception and friends: 2023-03-19-122701_890x567_scrot.png
      • Main point: squeeze-and-excitation blocks, which emphasize key features! 2023-03-19-123933_1069x512_scrot.png
      • SUGGESTED CHANGES:
        • 77-79
          • Several approaches to improve artificial networks were considered, such as testing different pre-processing steps (e.g. normalization) [26] and architectures (e.g. CNN) [28].
          • Several approaches to improve artificial networks were considered, such as testing different pre-processing steps (e.g. normalization) [26] and architectures (e.g. CNN [28], also in combination with squeeze-and-excitation networks[XXX]).
        • 453, add at the end:
          • Lastly, squeeze-and-excitation blocks[XXX] apply varying weight ratios to emphasize such target key features and eliminate unnecessary data, and methods based on this approach could, too, provide additional context on the topic.
    • Hong et al. Monitoring the vertical distribution of HABs using hyperspectral imagery and deep learning models. Sci Total Environ. 2021 Nov 10;794:148592. doi: 10.1016/j.scitotenv.2021.148592. Epub 2021 Jun 19. PMID: 34217087.
  • R1 l. 79 “post-processing is an important step”: expand on already existing post-processing techniques
    • Relevant article: https://www.mdpi.com/2072-4292/14/20/5199
      • In this paper, we explore the effects of degraded inputs in hyperspectral image classification including the five typical degradation problems of low spatial resolution, Gaussian noise, stripe noise, fog, and shadow. Seven representative classification methods are chosen from different categories of classification methods and applied to analyze the specific influences of image degradation problems.

    • Doesn’t list salt-and-pepper noise as a type of degradation in PREprocessing; for post-processing it lists really nice ways to sort out the unclear border things.

      In postprocessing methods, the raw classification map is often calculated from a pixelwise HSI classification approach and then optimized according to the spatial dependency [26]. References [27,28] used the Markov random fields (MRF) regularizer to adjust the classification results obtained by the MLR method in dynamic and random subspaces, respectively. In order to optimize the edges of classification results, Kang et al. [29] utilized guidance images on the preliminary class-belonging probability map for edge-preserving. This group of strategies can better describe the boundary of classification objects, remove outliers, and refine classification results

    • CHANGES on line 77-80 (includes changes from the third paper above!):
      • Several approaches to improve artificial networks were considered, such as testing different pre-processing steps (e.g. normalization) [26] and architectures (e.g. CNN) [28]. Recent studies showed that post-processing is an important step in ML pipelines [29].

      • Several approaches to improve artificial networks were considered, from testing different architectures (e.g. CNN [28], also in combination with squeeze-and-excitation networks [XXX]), to testing different pre-processing (e.g. normalization) [26] or post-processing steps [29].

        In particular, postprocessing is often used to optimize a raw pixelwise classification map using various methods, e.g. guidance images for edge-preserving. These belong to a group of strategies used to better define the boundaries of classification objects, remove outliers, and refine classification results. For example, Edge Preserving Filtering (EPF)3 has been shown to improve the classification accuracy significantly in a very short time. Another approach is the use of a Markov Random Field (MRF)4, where the class of each pixel is determined based on the probability of the pixel itself, the adjacent pixels, and the solution of a minimization problem.
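To make the idea of spatial post-processing concrete, here is a minimal sketch. It is neither EPF nor MRF: it is a plain 3x3 majority vote, used only as a simple stand-in for "refine a raw pixelwise classification map using its neighbours". The function name `majority_filter` and the toy map are assumptions made for illustration.

```python
import numpy as np


def majority_filter(labels: np.ndarray, n_classes: int) -> np.ndarray:
    """Relabel each pixel with the most frequent class in its 3x3 neighbourhood."""
    h, w = labels.shape
    padded = np.pad(labels, 1, mode="edge")  # repeat border pixels at the edges
    votes = np.zeros((n_classes, h, w), dtype=int)
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            for c in range(n_classes):
                votes[c] += (window == c)
    return votes.argmax(axis=0)  # ties go to the lower class index


# Hypothetical raw map: one isolated pixel of class 1 surrounded by class 0.
raw = np.zeros((5, 5), dtype=int)
raw[2, 2] = 1
smoothed = majority_filter(raw, n_classes=2)  # the outlier pixel is voted away
```

A straight boundary between two regions survives this filter, while single-pixel outliers are removed; real edge-preserving methods are considerably more careful about thin structures.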


  1. R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization,” Proc IEEE Int Conf Comput Vis, 618–626 (2017). ↩︎

  2. Applied Sciences | Free Full-Text | Comparison of Convolutional Neural Network Architectures for Classification of Tomato Plant Diseases ↩︎

  3. 29 / Kang, X.; Li, S.; Benediktsson, J.A. Spectral–Spatial Hyperspectral Image Classification with Edge-Preserving Filtering. IEEE Trans. Geosci. Remote Sens. 2014, 52, 2666–2677. ↩︎

  4. 86 / Tarabalka, Y.; Fauvel, M.; Chanussot, J.; Benediktsson, J.A. SVM- and MRF-Based Method for Accurate Classification of Hyperspectral Images. IEEE Geosci. Remote Sens. Lett. 2010, 7, 736–740. ↩︎
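On the thresholding question from the Fabelo et al. notes above (a threshold that maximizes the evaluation metrics, chosen automatically rather than by hand), a minimal sketch could look like this. The metric here is Youden's J (sensitivity + specificity - 1), which is my assumption and not necessarily the metric used in the paper; `best_threshold` and the toy data are hypothetical too.

```python
import numpy as np


def best_threshold(probs, labels, thresholds):
    """Return (threshold, J) maximizing Youden's J over the candidate thresholds.

    Assumes both classes are present in `labels`.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels).astype(bool)
    best_t, best_j = None, -np.inf
    for t in thresholds:
        pred = probs >= t
        tp = np.sum(pred & labels)
        fn = np.sum(~pred & labels)
        tn = np.sum(~pred & ~labels)
        fp = np.sum(pred & ~labels)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if j > best_j:  # the first threshold wins on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical per-pixel tumor probabilities and ground-truth labels.
best_t, best_j = best_threshold(
    probs=[0.1, 0.3, 0.6, 0.9],
    labels=[0, 0, 1, 1],
    thresholds=[0.2, 0.5, 0.8],
)
```

This only works when labels exist, which is exactly what is missing during surgery on a new patient, hence the manual sliders in their system.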

Day 1538

My link wiki's rebirth into Hugo, final write-up

Good-bye old personal wiki, AKA Fiamma. Here are some screenshots which will soon become old and nostalgic:

2023-03-18-191925_1056x892_scrot.png 2023-03-18-191902_816x958_scrot.png 2023-03-18-191838_1328x977_scrot.png

2023-03-18-192052_1067x979_scrot.png

I’ve also archived it, hopefully won’t turn out to be a bad idea down the line (but that ship has sailed long ago…):

Will be using the Links blog from now on: https://serhii.net/links

Detecting letters with Fourier transforms

TIL from my wife in the context of checkbox detection! letters detection fourier transform – Google Suche

TL;DR you can use Fourier transforms on letters, and the results are distinguishable! Bright lines perpendicular to lines in the original letter etc.
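A minimal sketch of the "bright lines perpendicular to the original lines" effect, using a synthetic single vertical stroke instead of a real letter (the 32x32 image and the `spectrum` helper are assumptions for illustration):

```python
import numpy as np


def spectrum(img: np.ndarray) -> np.ndarray:
    """Magnitude of the 2D FFT with the zero frequency moved to the center."""
    return np.abs(np.fft.fftshift(np.fft.fft2(img)))


# Hypothetical "letter stroke": one vertical line in a 32x32 image.
img = np.zeros((32, 32))
img[:, 16] = 1.0

mag = spectrum(img)
# All energy ends up in the central row: a horizontal bright line,
# i.e. perpendicular to the vertical stroke in the image.
central_row = mag[16, :]
```

A real letter is a mix of strokes in several directions, so its spectrum is a mix of such lines, which is what makes the results usable as features.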

Day 1537

Notes after writing a paper

Based on feedback on a paper I wrote:

  • Finally learn to stop using “it’s” instead of “its” for everything, and learn possessives and stuff
  • Don’t use “won’t”, “isn’t” and similar short forms when writing a scientific paper. “Will not”, “is not” etc.
  • ‘“Numbered”’ lists with “a), b), c)” exist and can be used along my usual bullet-point-style ones

matplotlib labeling pie-charts

python - How to have actual values in matplotlib Pie Chart displayed - Stack Overflow:

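import matplotlib.pyplot as plt
import numpy

# sizes is assumed to be a numpy array with the raw value behind each wedge
def absolute_value(val):
    # val is the wedge's percentage; convert it back to the absolute value
    a  = numpy.round(val/100.*sizes.sum(), 0)
    return a

plt.pie(sizes, labels=labels, colors=colors,
        autopct=absolute_value, shadow=True)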

Can be also used to add more complex stuff inside the wedges (apparently the term for parts of the ‘pie’).

I did this:

def absolute_value(val):
    a  = int(np.round(val/100.*np.array(sizes).sum(), 0))
    res = f"{a} ({val:.2f}%)"
    return res

for this: 2023-03-17-170947_413x206_scrot.png
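Both versions above depend on a global `sizes`. A small variation that avoids this is to build the callback with a closure; this is only a sketch, and `make_autopct` is a name I made up. It deliberately does not import matplotlib, so the formatting can be checked on its own.

```python
def make_autopct(sizes):
    """Build an autopct callback showing the absolute value and the percentage."""
    total = sum(sizes)

    def autopct(pct: float) -> str:
        # pct is the wedge's share in percent, as passed in by plt.pie
        absolute = int(round(pct / 100.0 * total))
        return f"{absolute} ({pct:.2f}%)"

    return autopct


# Hypothetical data; with matplotlib this would be used as
# plt.pie(sizes, autopct=make_autopct(sizes))
sizes = [10, 30, 60]
label = make_autopct(sizes)(30.0)
```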

Day 1533

xlsxgrep for grepping inside xls files

This is neat: xlsxgrep · PyPI

Supports many grep options.

micro is a simple single-file CLI text editor

Stumbled upon zyedidia/micro: A modern and intuitive terminal-based text editor. Simple text editor that wants to be the successor of nano, CLI-based. The static .tar.gz contains an executable that can be directly run. Played with it for 30 seconds and it’s really neat.

(Need something like vim for someone who doesn’t like vim, but wants to edit files on servers in an easy way in case nano isn’t installed and no sudo rights.)

json diff with jq, also: side-by-side output

Websites

There are online resources:

CLI

SO thread1 version:

diff <(jq --sort-keys . A.json) <(jq --sort-keys . B.json)

Wrapped it into a function in my .zshrc:

jdiff() {
	diff <(jq --sort-keys . "$1") <(jq --sort-keys . "$2")
}
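For places without jq, roughly the same idea (normalize key order, then diff) can be sketched with the Python standard library. `json_diff` is a hypothetical helper, not something from the SO thread, and it takes JSON text rather than file names to keep the example self-contained.

```python
import difflib
import json


def json_diff(a_text: str, b_text: str) -> list[str]:
    """Unified diff of two JSON documents after sorting their keys."""

    def normalize(text: str) -> list[str]:
        return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()

    return list(
        difflib.unified_diff(
            normalize(a_text), normalize(b_text), "A.json", "B.json", lineterm=""
        )
    )


# Same content, different key order: no diff at all.
same = json_diff('{"a": 1, "b": 2}', '{"b": 2, "a": 1}')
# One changed value: only that line shows up as removed/added.
changed = json_diff('{"a": 1, "b": 2}', '{"b": 3, "a": 1}')
```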

Side-by-side output

vimdiff is a thing and does this by default!

Otherwise2 diff has the parameters -y, and --suppress-common-lines is useful.

This led to jdiff’s brother jdiffy:

jdiffy() {
	diff -y --suppress-common-lines <(jq --sort-keys . "$1") <(jq --sort-keys . "$2") 
}

Other

git diff --no-index lets you use git diff on files that are not inside a repo. I used it heavily in the past for some of its fancier features. Say hi to gdiff:

gdiff() {
	git diff --no-index "$1" "$2"
}

  1. diff - Using jq or alternative command line tools to compare JSON files - Stack Overflow ↩︎

  2. bash - unix diff side-to-side results? - Stack Overflow ↩︎

Day 1531 / Rancher and kubernetes, the very basics

  • Rancher

    • most interesting thing to me in the interface is workers->pods
  • Two ways to run stuff

    • via yaml
    • via kubernetes CLI / kubectl
  • Via yaml:

    • change docker image and pod name
    • you can set a command in the yaml that overrides the Docker image’s own command and keeps the pod running in an interactive-ish mode, so you can execute stuff inside the running container.
      - name: podname
        image: "docker/image"
        command:
        - /bin/sh
        - -c
        - while true; do echo $(date) >> /tmp/out; sleep 1; done
      
  • Kubernetes Workloads and Pods | Rancher Manager

    • Pods are groups of containers that share network and storage, usually it’s one container
  • Assigning Pods to Nodes | Kubernetes:

    • nodeName is a simple direct way
      apiVersion: v1
      kind: Pod
      metadata:
        name: nginx
      spec:
        containers:
        - name: nginx
          image: nginx
        nodeName: kube-01
      

Day 1530 / Things I learned at a hackathon^W onsite working session™

  • don’t create branches / merge requests until I start working on the ticket, and don’t open many at the same time either: it gets harder and rebasing is then needed
  • delete branches after they get merged to main (automatically). Sometimes I didn’t, to play it safe, but I never needed them and have them locally regardless
  • Most of my code is more complex and has more layers of abstraction than actually needed, and it gets worse with later fixes. Don’t aim for abstraction before abstraction is needed
  • When solving complex-er merge conflicts, sometimes this helps: first leave all imports, merge the rest, and then clean up the remaining imports

Day 1527 / Cleaning printer printheads

TIL - when looking how to clean printer heads - that some printers can do it automatically! Can be started both through the OS GUI or the printer itself (if it has buttons and stuff).

Wikihow (lol) as the first result in Google gave me enough to learn about automatic cleaning being a thing: How to Clean Print Heads: Clogged & Dried Up Print Heads; How to Clean a Printhead for Better Ink Efficiency < Tech Takes - HP.com Singapore

Day 1520

Python state machine

Was doing a graph-like structure to easily explain a really complex decision tree that’s not really a tree, but what I was really looking for was an existing thing: a state machine!

And it’s even an existing programming pattern: StateMachine — Python 3 Patterns, Recipes and Idioms

The book I didn’t know I needed!

Anyway, existing implementations:

I really like how feature-complete and documented transitions is - callbacks etc.
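For reference, the core of the pattern fits in a few lines. This is a hand-rolled sketch and NOT the API of transitions or any other library; the class, the states, and the events are all made up for illustration.

```python
class StateMachine:
    """Minimal table-driven state machine (illustrative only)."""

    def __init__(self, initial, transitions):
        self.state = initial
        # Maps (current_state, event) -> next_state
        self._transitions = dict(transitions)

    def trigger(self, event):
        key = (self.state, event)
        if key not in self._transitions:
            raise ValueError(f"no transition for {event!r} from state {self.state!r}")
        self.state = self._transitions[key]
        return self.state


# Hypothetical ticket workflow.
ticket = StateMachine(
    "open",
    {
        ("open", "start"): "in_progress",
        ("in_progress", "block"): "open",
        ("in_progress", "finish"): "done",
    },
)
ticket.trigger("start")
ticket.trigger("finish")
```

Compared to nested if/else, the whole "decision tree that is not really a tree" lives in one table, and invalid moves fail loudly instead of silently doing nothing.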

Python ellipsis (...)

Seen first in “Python Callable Protocols for complex Callable typing” (garden/it/230228-1835).
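A short sketch of the places where `...` tends to show up (assuming Python 3.9+ for the built-in generic `tuple[...]` syntax; the function and alias names are made up):

```python
# `...` is a literal for the built-in singleton object Ellipsis.
same_object = ... is Ellipsis


def not_implemented_yet() -> None:
    ...  # placeholder body; evaluates the Ellipsis object and does nothing


# In type hints: a tuple of ints of any length.
IntTuple = tuple[int, ...]

placeholder_result = not_implemented_yet()
```

In Protocol definitions and stub files the `...` body is the usual way of saying "only the signature matters here".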

Python Callable Protocols for complex Callable typing

If you need to add typing to a complex Callable, with, say, parameter names etc., there are Callback Protocols.

from typing import Iterable, Optional, Protocol

# NB "self" is included!
class Combiner(Protocol):
    def __call__(self, *vals: bytes, maxlen: Optional[int] = None) -> list[bytes]: ...

def batch_proc(data: Iterable[bytes], cb_results: Combiner) -> bytes:
    for item in data:
        ...  # process each item; the body is left out here

Python 3.7 needs typing_extensions, 3.8+ supports it natively.

See also: python typing signature (typing.Callable) for function with kwargs - Stack Overflow
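A runnable sketch of how such a protocol gets used. The body of `batch_proc` and the `truncate_all` callback are my own assumptions, made up to have something to call; only the `Combiner` signature follows the example above.

```python
from typing import Iterable, Optional, Protocol


class Combiner(Protocol):
    def __call__(self, *vals: bytes, maxlen: Optional[int] = None) -> list[bytes]: ...


def batch_proc(data: Iterable[bytes], cb_results: Combiner) -> bytes:
    # Hypothetical body: let the callback process all items, then join them.
    return b"".join(cb_results(*data, maxlen=3))


def truncate_all(*vals: bytes, maxlen: Optional[int] = None) -> list[bytes]:
    """Matches Combiner: same parameter names, kinds, and return type."""
    return [v[:maxlen] for v in vals]


result = batch_proc([b"hello", b"world"], truncate_all)
```

A plain `Callable[..., list[bytes]]` could not express the `*vals` and keyword `maxlen` parts, which is the whole reason for the protocol. The check is structural: `truncate_all` never mentions `Combiner`.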

git diff to find differences in file between revisions

git diff [--options] <commit> <commit> [--] [<path>...]

For example, for ‘between now and 2 commits back’:

$ git diff HEAD^^ HEAD main.c
$ git diff HEAD~2 HEAD -- main.c

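In the <revision>:<file> form shown below, paths need to be relative to the root of the repo (unless they start with ./); plain paths after -- are relative to the current directory.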

Another option (can do different files) is:

git diff <revision_1>:<file_1> <revision_2>:<file_2>

Source: git - How do I diff the same file between two different commits on the same branch? - Stack Overflow

(Bonus: the -- makes it work for files with weird names like -p, good for scripts but rarely needed in practice).

Previously: 230221-1406 Gitlab has a git graph and comparisons

Day 1513

Windows has case-insensitive filenames and using fnmatch for not-filenames fails

Adventures in cross-platform programming: I used fnmatch to basically simulate globs in a place where regexes were overkill, but not for filenames.

On Windows, paths are case-insensitive, so fnmatch() normalizes the case of both the name and the pattern and is case-insensitive there too, leading to unexpected behaviour.

fnmatchcase() is case-sensitive regardless of OS.
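A sketch that shows the difference on any OS. Since `fnmatch.fnmatch()` applies `os.path.normcase()` to both arguments, the Windows behaviour can be imitated with `ntpath.normcase()`; the helper name `windows_like_fnmatch` is made up for this example.

```python
import fnmatch
import ntpath


def windows_like_fnmatch(name: str, pattern: str) -> bool:
    """Imitate fnmatch.fnmatch() on Windows: normalize case, then match."""
    return fnmatch.fnmatchcase(ntpath.normcase(name), ntpath.normcase(pattern))


name, pattern = "README.md", "readme*"
case_sensitive = fnmatch.fnmatchcase(name, pattern)   # same result on every OS
windows_result = windows_like_fnmatch(name, pattern)  # simulated Windows behaviour
```

So for anything that is not a filename, `fnmatchcase()` is the safer default.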

pytest skipif

From How to use skip and xfail to deal with tests that cannot succeed — pytest documentation on dynamically skipping tests based on a condition:

# needs: import sys, import pytest
@pytest.mark.skipif(sys.version_info < (3, 10), reason="requires python3.10 or higher")
def test_function():
    ...  # test body goes here

Better than my previous approach of if xxx: pytest.skip("...") inside the tests themselves.

Gitlab has a git graph

TIL Gitlab has

  • a Graph a la tig / pycharm log / ..., located at “Repository -> Graph”. Really neat
  • “compare” right under it to quickly compare different branches/revisions

I should play more with the existing interfaces of things I use often

Day 1509 / Git commit empty directories

TIL you can’t.

How do I add an empty directory to a Git repository? - Stack Overflow suggests:

  • Adding a .gitkeep
    • mostly clear purpose from the name and googleable
    • it’s not an established convention
    • some people think the .git- prefix should be reserved for git-specific files
  • Adding a .placeholder - same as above; less clear, but no .git- prefix
  • Adding a README explaining everything
  • Adding a .gitignore in the directory
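If the .gitkeep route is chosen, creating the files by hand gets old quickly. A minimal sketch that drops a .gitkeep into every empty directory under a root; `add_gitkeep` is a hypothetical helper, and the temporary directory is only there to make the example self-contained.

```python
import os
import tempfile
from pathlib import Path


def add_gitkeep(root) -> list[Path]:
    """Create a .gitkeep file in every empty directory below root."""
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        is_empty = not dirnames and not filenames
        if ".git" in dirnames:
            dirnames.remove(".git")  # never descend into git's own metadata
        if is_empty:
            keep = Path(dirpath) / ".gitkeep"
            keep.touch()
            created.append(keep)
    return created


# Demo on a throwaway tree: one empty directory, one with a file in it.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "empty").mkdir()
    (Path(tmp) / "full").mkdir()
    (Path(tmp) / "full" / "file.txt").write_text("x")
    created_names = [p.relative_to(tmp).as_posix() for p in add_gitkeep(tmp)]
```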

Latest post from Blog

2022 in review

Event of the year: Masha gave me a bicycle!
Horror of the year: The bicycle got stolen :(

(From the 2021 review)1

God we had no idea how happy we were.

Now playing: Let the Sunshine In - Hair2

Achievement of the year: Got through the most stressful year of my life almost without losses and almost to the end3. Started writing in the blog again and wrote several long posts about the war and more
Mood of the year: pain, guilt, an existential moral crisis every second, but at the same time confidence and purpose
Meeting of the year: TODO
Event of the year: see “horror of the year”
Horror of the year: see “event of the year”
Country of the year: Ukraine
Anti-person of the year: Putin and 71% of his compatriots.
City of the year: Kyiv, which hurts a lot every time I think about it
Word of the year: “palianytsia” (“паляниця”)
Trip of the year: France, Denmark.
Web service of the year: Twitter. Flightradar24, liveuamap
Color of the year: dark red, or black.
Smell of the year: cats, my own and other people’s
News of the year: the liberation of Kherson!
Book of the year: “Недержавні таємниці” (“Non-State Secrets”) by Viktor Yushchenko; one and a half Stephen King novels + reread everything in a row that Peter Watts wrote (Rifters trilogy etc.).
Film/series of the year: Twin Peaks (the first season and a half)
Media N.O.S.: Bob Gymlan’s videos about Bigfoot; the “Сводки” (briefings) by the CITeam team on Michael Naki’s channel.
Song of the year: For the first time in my life nothing new comes to mind. Let it be Stefania, which I listened to twice, and Trenulețul, which I listened to about six times.
Venue of the year: the café across the park not far from the apartment in Leipzig4; Mensa Lohmannstraße.
Drink of the year: “tea with an emotional kick”: Turkish Earl Grey with sugar, lemon, ginger, echinacea, mint
Food of the year: sushi, “raw fried eggs”
Transport of the year: bicycle; and once again Bens Express Kyiv ↔ Leipzig

Wishes for myself for 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023:

Victory. To see my family and friends who are in Ukraine now, and to visit Ukraine myself. To find again the calm I had in the first two months of this year. To understand and resolve the health issues that appeared recently, and to find my zen with what caused them. As last year: to find my sustainable zen with people and with communicating with them. To learn to regulate where I direct my attention and my energy, and to learn to truly determine WHAT is important right now and worth my attention and energy.

Goal for next year:5

  • write at least two scientific papers, and in general a bit more Academia
  • finish the first two semesters of the master’s program and start writing the master’s thesis

And the evergreen one: Keep up the sleep, sport, meditation - IT WORKS. (x6)

(У)


  1. Yes, taken out of context - my blog my rules. ↩︎

  2. “The cast of Hair performs Let the Sunshine In at the Marriage Equality Rally in NYC May 17, 2009.” ↩︎

  3. Obviously this is one thousandth of how much of a hell it was for the average Ukrainian. There is definitely a lot to be grateful to God and the Universe for. But Lord. I recently made a little calendar with the chronology of this year’s hell - I’m proud that the diagnosed depression and the problems with blood pressure, skin, dizziness, and a ton of other things only started towards the end of the year. ↩︎

  4. (once again) ↩︎

  5. Goals from last year:

    • Write/create more, no matter what (posts, libraries on Github, drawings, poems), but make an effort: say, blog posts about PKM rather than short sketches about individual details (though better those than nothing at all). Done!
    • Learn to keep in touch with people even after a drastic change of shared context (how do we talk to each other if there’s no watercooler anymore?…). Remember the zen of Merseburg and organize things. Talk to people more in person and via video calls (.. or at least just by phone), less text. When the war started, there was A LOT of exactly this. But since around May everything is back to normal sadly.
    • More fresh air, travel, spontaneity, lightness; don’t save my energy only to then spend it on lying in bed. Oddly enough, this is also going much better than last year.
     ↩︎