In the middle of the desert you can say anything you want
Copied from the old Semantic Wiki. Half of the old links are gone, yet another reminder of bit rot and memento mori.
Since forever I've had some variation of this:

function lq
    command jq . $argv -C | less -r
end
jless - A Command-Line JSON Viewer is this, but much better: an intuitive CLI viewer for JSON data that works seamlessly for large JSONs (which sometimes break my lq command, especially ones with large lines).
Now:

function lq
    echo "not again, Serhii!!! Use jless/jl"
    command jless -r -m line $argv
    # command jq . $argv -C | less -r
end
- m toggles the mode between line-mode (jq-like) and the (default) data mode
- -r does relative numbers!
- -N disables any line numbers
- h opens the help; the usual vim things do what I expect
- H focuses the parent

It does yaml as well!
Long overdue, will update this page as I find better options.
- torch_dtype=torch.float16 etc. to use half the memory
- ×4? storing weights+gradients, better explanation at that link
- ×8 etc.
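A back-of-the-envelope sketch of those multipliers, under my assumptions (fp32 = 4 bytes/param, fp16 = 2; training adds gradients plus two fp32 Adam moments; activations ignored):

def estimate_gb(n_params: float, bytes_per_param: int = 4, training: bool = False) -> float:
    """Very rough GPU memory estimate: weights only, or weights + grads + Adam states."""
    per_param = bytes_per_param
    if training:
        # weights + gradients (same dtype) + two fp32 Adam moments (4 bytes each)
        per_param = 2 * bytes_per_param + 8
    return n_params * per_param / 1e9

print(estimate_gb(7e9))                     # 28.0 GB, a 7B model in fp32
print(estimate_gb(7e9, bytes_per_param=2))  # 14.0 GB, same model in fp16
print(estimate_gb(7e9, 2, training=True))   # 84.0 GB, rough training estimate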
Old langchain code for generating pairs of questions about feminitive usage; I didn't use it in my thesis but don't want to lose it:
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
from langchain.pydantic_v1 import BaseModel, Field, validator
from langchain.schema import HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema import BaseOutputParser
from langchain.output_parsers.json import SimpleJsonOutputParser
from tqdm import tqdm
# from json import loads
from typing import List
from rich import print, inspect

b = breakpoint
# https://openai.com/pricing
MODEL = "text-ada-001" # cheap, bad
MODEL = "text-davinci-003" # acceptable
MODEL = "gpt-4"
MODEL = "gpt-3.5-turbo-1106" # 'capable, cost-effective'
WOMEN_VARIANTS: str = "дівчина, моя сестра, моя жінка, колишня однокласниця, дочка маминої подруги, імена (Марія, Марія Петрівна, Кассандра, та ін.)"
COMPLETE_PROMPT: str = """Наведи будь ласка {N_PROFS} однозначні короткі дефініції цій професії або слову, так, щоб по ним було однозначно очевидно про яку саме професію йде мова.
Зміни дефініції так, щоб вони стали фразами, де мова однозначно йде про жінку. Придумай різні варіанти жінок, про яких йде мова, умовно: {WOMEN_VARIANTS}. Але придумай і свої різноманітніші приклади.
Уникай використання самого слова чи поняття у визначеннях. Уникай слів 'фахівецька' чи 'спеціалістка'.
Наприклад:
Актор: "Моя жінка виконує ролі на сцені чи екрані"
Акушерка: "Марія Петрівна допомагає при пологах"
Автор: "Я знаю дівчину, яка пише твори та книжки".
Будь творчим. Але професія, про яку іде мова, має все рівно бути однозначно зрозумілою.
"""
FORMAT_INSTRUCTIONS = """
Формат виводу - JSON. Обʼєкт виглядати таким чином:
{
"profession": "",
"description_f": ["", ..., ""]
}
В полі description_f список всіх згенерованих дефініцій для цієї професії.
Виводь тільки код JSON, без ніяких додаткових даних до чи після.
"""
INSTRUCTIONS_GENDER_CHANGE = """Я писатиму речення про професію про жінку. Зміни
речення так, щоб мова йшла про чоловіка, а не жінку, не міняючи сам опис професії.
Імʼя чи опис жінки можеш міняти як завгодно, головне щоб на виході було речення
про чоловіка. """
def get_model(model_name=None):
    model = OpenAI(model_name=model_name, temperature=0.0)
    return model
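NB: the chat models above (gpt-4, gpt-3.5-turbo-1106) would nowadays go through the already-imported ChatOpenAI instead of OpenAI; a hedged sketch (get_chat_model is my name, not from the original code):

def get_chat_model(model_name=None):
    # chat-completions counterpart of get_model above
    return ChatOpenAI(model_name=model_name, temperature=0.0)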
def run_and_parse(model, profession: str, n_profs: int | str = 3, women: str = WOMEN_VARIANTS):
    prompt = PromptTemplate(
        template="{complete_prompt}\n{format_instructions}\n Професія, яку потрібно описати: {query}\n",
        input_variables=["query"],
        partial_variables={
            "format_instructions": FORMAT_INSTRUCTIONS,
            "complete_prompt": COMPLETE_PROMPT.format(
                N_PROFS=n_profs, WOMEN_VARIANTS=women
            ),
        },
    )
    json_parser = SimpleJsonOutputParser()
    # prompt_and_model = prompt | model | json_parser
    prompt_and_model = prompt | model
    model_output = prompt_and_model.invoke({"query": profession})
    output = json_parser.parse(model_output)
    return output
def run_and_parse_gender_change(model, profession_description: str):
    prompt = PromptTemplate(
        template="{complete_prompt}\n Речення наступне: {query}\n",
        input_variables=["query"],
        partial_variables={
            # "format_instructions": FORMAT_INSTRUCTIONS,
            "complete_prompt": INSTRUCTIONS_GENDER_CHANGE
        },
    )
    # json_parser = SimpleJsonOutputParser()
    # prompt_and_model = prompt | model | json_parser
    prompt_and_model = prompt | model
    model_output = prompt_and_model.invoke({"query": profession_description})
    output = model_output
    # b()
    # output = json_parser.parse(model_output)
    return output
def generate_descriptions(model, profession: str, n_profs: int | str = 3, women: str = WOMEN_VARIANTS, do_male_version: bool = False):
    desc = run_and_parse(model=model, profession=profession, n_profs=n_profs, women=women)
    if do_male_version:
        description_male = list()
        for d in desc["description_f"]:
            changed = run_and_parse_gender_change(model=model, profession_description=d)
            description_male.append(changed)
        desc["description_m"] = description_male
    return desc
def run():
    professions_raw = """
    абстракціоністка
    автомобілістка
    авторка
    """
    # a second, intentionally unused stash of professions (a bare string literal is a no-op):
    """
    агрономка
    адвокатка
    анархіст
    англієць
    антрополог
    асистентка
    астронавт
    аптекар
    """
    # drop the empty first/last lines that splitlines() produces
    profs = [x.strip() for x in professions_raw.splitlines() if x.strip()]
    model = get_model(MODEL)
    results = list()
    for p in tqdm(profs):
        r = generate_descriptions(model=model, profession=p, n_profs=2)
        print(r)
        results.append(r)
    print(results)


if __name__ == "__main__":
    run()
Sample output:

[
    {
        'profession': 'лікар',
        'description_f': [
            'Моя сестра працює в лікарні та лікує хворих',
            'Дочка маминої подруги є лікарем та допомагає людям'
        ]
    },
    {
        'profession': 'абстракціоністка',
        'description_f': [
            'Моя сестра створює картини, які відображають абстрактні ідеї та почуття',
            'Дівчина, яку я знаю, малює абстракціоністські полотна'
        ]
    },
    {
        'profession': 'автомобілістка',
        'description_f': [
            'Моя сестра вміє водити автомобіль',
            'Дочка маминої подруги працює водієм'
        ]
    },
    {
        'profession': 'авторка',
        'description_f': [
            'Моя сестра пише книги та статті',
            'Дочка маминої подруги є відомою письменницею'
        ]
    },
    {
        'profession': 'Вчитель',
        'description_f': [
            'Моя сестра працює в школі та навчає дітей',
            'Дочка маминої подруги викладає університетські предмети'
        ]
    }
]
10.48550/arXiv.2103.05331[^1]
10.48550/arXiv.2508.09093[^2]

OK, so.
- Active learning: picking the training instances most useful to the model (e.g. because annotation is expensive, so we pick what to annotate)
- Active testing: picking the most useful subset of testing instances that approximates the score of the model on the full testing set

We can test (= label, the word I use below) only a subset of the full test set, $D_{test}^{observed} \subset D_{test}$.
We decide to label only $M$ of the $N$ test instances ($M < N$), and not all at once but one at a time, because then we can use the information from the already-labeled test instances to pick the next one to label.
The paper calls these (estimated or not) test scores the test risk, $R$.
This is the classic promise of importance sampling: move probability mass toward where the integrand is large/volatile, then down-weight appropriately.
Bottom line: active testing = (i) pick informative test points stochastically with $q$; (ii) compute a weighted Monte Carlo estimate with $v_m$ to remove selection bias; (iii) enjoy lower variance if $q$ is well-chosen.
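A minimal sketch of that estimator, assuming the simplest case of sampling indices with replacement from a proposal $q$ over the $N$ test points (the papers' actual LURE weights for sampling without replacement are more involved):

$$
R = \frac{1}{N}\sum_{n=1}^{N} L_n,
\qquad
\hat{R} = \frac{1}{M}\sum_{m=1}^{M} v_m L_{i_m},
\qquad
v_m = \frac{1}{N\, q(i_m)},\quad i_m \sim q
$$

Then $\mathbb{E}[\hat{R}] = R$ for any $q$ with full support, and the variance shrinks the closer $q(n)$ is to being proportional to $L_n$.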
<_(@kossenactivetesting2021) “Active Testing: Sample-Efficient Model Evaluation” (2021) / Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth: z / http://arxiv.org/abs/2103.05331 / 10.48550/arXiv.2103.05331
_> ↩︎ ↩︎
<_(@berradascalingactive2025) “Scaling Up Active Testing to Large Language Models” (2025) / Gabrielle Berrada, Jannik Kossen, Muhammed Razzak, Freddie Bickford Smith, Yarin Gal, Tom Rainforth: z / http://arxiv.org/abs/2508.09093 / 10.48550/arXiv.2508.09093
_> ↩︎
Below:
- or return fails on unknown -a/--args
- flags get parsed out of $argv and put into _flag_argname
- set -q = query (tests if a var is set), -l = local

argparse h/help o/output-dir= -- $argv
or return

if set -ql _flag_help
    echo "Usage: script.fish [-o/--output-dir <dir>] <input_files_or_directory>"
    echo "Example: script.fish --output-dir /path/to/output file1.png directory_with_pngs/ dir/*.png"
    exit 0
end

if set -ql _flag_output_dir
    set output_dir $_flag_output_dir
end

for arg in $argv
    # do something with file $arg
end
Scenario: a private GitLab package registry, and you want uv add etc. to work transparently.

TL;DR:
- You need a token with the read_api scope (read_api, read_repository, read_registry are enough) and Developer+ role.
- The index URL is one of:
    - https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple (token inside the URI)
    - https://gitlab.de/api/v4/projects/1111/packages/pypi/simple (auth through env variables or ~/.netrc)
    - https://gitlab.example.com/api/v4/groups/<group_id>/-/packages/pypi/simple (a group registry)
    - ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi/simple (inside CI/CD)
- Auth through UV_INDEX_PRIVATE_REGISTRY_USERNAME/PASSWORD env variables (replacing PRIVATE_REGISTRY with the name you gave the index in pyproject.toml), or through a ~/.netrc file (.netrc is used by everything curl-based).
Add the new package registry as an index under tool.uv.index:

[[tool.uv.index]]
name = "my-registry"
url = "https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple"
# authenticate = "always" # see below
# ignore-error-codes = [401]
The URL either contains the token inside it or it doesn't. (The examples below are /projects/xxx; of course a group registry works as well.)

- https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple: token inside the URI
- https://gitlab.de/api/v4/projects/1111/packages/pypi/simple: auth happening through env variables or ~/.netrc
url = "https://__token__:glpat-secret-token@gitlab.de/api/v4/projects/1111/packages/pypi/simple"
export UV_INDEX_PRIVATE_REGISTRY_USERNAME=__token__
export UV_INDEX_PRIVATE_REGISTRY_PASSWORD=glpat-secret-token
PRIVATE_REGISTRY
needs to be replaced with the name of the registry . So e.g. for the pyproject above it’s UV_INDEX_**MY_REGISTRY**_USERNAME
.
From Authentication | uv / HTTP Authentication and PyPI packages in the package registry | GitLab Docs: create a ~/.netrc:

machine gitlab.example.com
login __token__
password <personal_token>

uv will use these details; when you run uv add ... -v you'd see lines like:

DEBUG Checking netrc for credentials for https://gitlab.de/api/v4/projects/1111/packages/pypi/simple/packagename/
DEBUG Found credentials in netrc file for https://gitlab.de/api/v4/projects/1111/packages/pypi/simple/packagename/

NB: git will also use these credentials, so if the token's scope doesn't allow e.g. pushing, you won't be able to git push. Use a wider scope or a personal access token (or env variables).
When you uv add yourpackage, uv looks for packages in all registries (the pypi one is on by default). Use ignore-error-codes = [401] to make uv keep looking inside the other registries.
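For instance, a sketch reusing the index from above:

[[tool.uv.index]]
name = "my-registry"
url = "https://gitlab.de/api/v4/projects/1111/packages/pypi/simple"
# don't abort resolution on a 401 from this index, keep checking the others
ignore-error-codes = [401]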
CI/CD pipelines have to have access to the package as well when they run.
GitLab CI/CD job token | GitLab Docs:
You can use a job token to authenticate with GitLab to access another group or project’s resources (the target project). By default, the job token’s group or project must be added to the target project’s allowlist.
In the target project (the one that needs to be resolved, the one with the private registry), in Settings -> CI/CD -> Job token permissions, add the source project (the one that will access the packages during CI/CD).
You can also just add the parent group of all the projects; then you don't have to add any individual ones.
Then $CI_JOB_TOKEN can be used to access the target projects, for example through a ~/.netrc file (note the username: gitlab-ci-token, not __token__):

machine gitlab.example.com
login gitlab-ci-token
password $CI_JOB_TOKEN
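A sketch of how that might look in .gitlab-ci.yml (the image tag is an assumption; anything with uv on it works):

test:
  image: ghcr.io/astral-sh/uv:python3.12-bookworm
  script:
    # write ~/.netrc at runtime so $CI_JOB_TOKEN actually gets expanded
    - printf 'machine gitlab.example.com\nlogin gitlab-ci-token\npassword %s\n' "$CI_JOB_TOKEN" > ~/.netrc
    - uv sync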
I love firecow/gitlab-ci-local.
When running things with gitlab-ci-local, the CI_JOB_TOKEN variable is empty. You can create a .gitlab-ci-local-variables.yaml (don't forget to gitignore it!) with this variable; it'll get picked up automatically and your local CI/CD pipelines will run as well:

CI_JOB_TOKEN: glpat-secret-token
Tutorial and bird's eye view - Jujutsu docs. A Git replacement that uses git under the hood.
First encountered here: Jujutsu for busy devs | Hacker News / Jujutsu For Busy Devs | maddie, wtf?!
For anyone who’s debating whether or not jj is worth learning, I just want to highlight something. Whenever it comes up on Hacker News, there are generally two camps of people: those who haven’t given it a shot yet and those who evangelize it.
Alright, let’s try!
To use on top of an existing git repository:
jj git init --colocate .
jj git clone --colocate git@github.com:maddiemort/maddie-wtf.git
Command line completions: COMPLETE=fish jj | source
How to Firefox | Hacker News / 🦊 How to Firefox - Kaushik Gopal’s Website
- Type / and start typing for quick find (vs ⌘F). But dig this: type ' and Firefox will only match text in hyperlinks.
- URL bar search shortcuts: * for bookmarks, % for open tabs, ^ for history.
- If an obnoxious site disables right click, just hold Shift and Firefox will bypass that and show it to you. No add-ons required.
(emph mine)
Damn. DAMN.
I need to set this up in qutebrowser as well, it’s brilliant.
For later: Introduction | LLM Inference in Production