In the middle of the desert you can say anything you want
I seem to keep googling this. … and this is not final and magic and I should actually understand this on a deeper level.
Not today.
So.
Reading lines in a file:

```bash
while IFS="" read -r p || [ -n "$p" ]
do
    printf '%s\n' "$p"
done < peptides.txt
```
For outputs of a command:

```bash
while read -r p; do
    printf '%s\n' "$p"
done < <(printf 'one\ntwo\n')
```
Otherwise: easy option that I can memorize, both for command output and for lines in a file, with the caveat that it actually iterates over whitespace-separated (and glob-expanded) words rather than lines:

```bash
for word in $(cat peptides.txt); do echo "$word"; done
```

Same idea as a proper line loop; the `|| [ -n "$line" ]` avoids the bug where a last line without a trailing newline gets skipped:

```bash
cat peptides.txt | while read line || [ -n "$line" ];
do
    # do something with $line here
done
```
Same as the first `cat` option above, same drawbacks, but no use of `cat`:

```bash
while read p; do
    echo "$p"
done < peptides.txt
```
Same as above but without the drawbacks:

```bash
while IFS="" read -r p || [ -n "$p" ]
do
    printf '%s\n' "$p"
done < peptides.txt
```
This reads through file descriptor 10 instead of stdin, leaving stdin free for commands inside the loop; the 10 is arbitrary:

```bash
while read -u 10 p; do
    ...
done 10< peptides.txt
```
(All this from the same SO answer[^1].)

In general, if you're using `cat` with only one argument, you're doing something wrong (or suboptimal).
`jq -r $stuff`, instead of quoted 'correct' values like

```
"one"
"two"
"three"
```

would return

```
one
two
three
```
Wanted to rename all tasks belonging to a certain project from a certain timeframe. I use subprojects (`pro:w.one.two`) heavily and want to keep the children names.
Final command I used:

```bash
for p in $(task export "\(pro.is:w or pro:w.\) entry.after:2019-04-30 entry.before:2021-12-31" | jq ".[].project" -r | sort | uniq);
do task entry.after:2019-04-30 entry.before:2021-12-31 pro:$p mod pro:new_project_name$p;
done
```
Used `project:w` for work; now there's new work, so it makes sense to rename the previous one for cleaner separation.
To list all tasks created in certain dates (`task all` to cover tasks that aren't just `status:pending`, as by default):

```bash
task all pro:w entry.after:2019-04-30 entry.before:2021-12-31
```

1213 tasks. Wow.
Remembering when I was using sprints and renaming them at the end: `pro:w` covers both `pro:w.test` and `pro:whatever`. I was disciplined, but just in case wanted to cover all of `pro:w` and `pro:w.whatever` while excluding `pro:whatever`, so I tested this; same result:
task all "\(pro.is:w or pro:w.\) entry.after:2019-04-30 entry.before:2021-12-31"
Okay, got them. How to modify? Complexity: I need to change part of the project, so `pro:w.one` -> `pro:old_w.one`, instead of changing all tasks' project to `pro:old_w`.

- There's `prepend`[^2], but it seems to work only for descriptions.
- There's the `t mod /from/to/` syntax[^3], but I couldn't get it to work on part of the project name.
- There's regex[^4], but it works only for filters, and only if enabled.
- There's JSON export, but I don't feel like parsing JSON, feels too close to the day job :)
You can list projects like this:

```bash
# currently used
task projects

# all
task rc.list.all.projects=1 projects
```
This gives hope: if I get the list of projects, I can just iterate through them and rename each individually. Can't find this documented, but `task rc.list.all.projects=1 projects pro:w` filters the projects to ones starting with `w`.
Sadly, the report format splits the project names up by hierarchy:

```
Project Tasks
w        1107
  a         1
  aan       1
```
Can I change the character used for hierarchy, so that I'd get them as a flat list of separate project names with the dots kept in? Not exposed through config, from what I can see.
…alright, JSON export it is. It exists, and of course it accepts filters <3

```bash
task export "\(pro.is:w or pro:w.\) entry.after:2019-04-30 entry.before:2021-12-31" | wc -l
```

1215 lines, same ballpark as the number of tasks.
JSON output is an array of these objects:

```json
{
  "id": 0,
  "description": "write attn mechanism also on token features",
  "end": "20191016T143449Z",
  "entry": "20191016T120514Z",
  "est": "PT1H",
  "modified": "20200111T094548Z",
  "project": "w",
  "sprint": "2019-41",
  "status": "completed",
  "uuid": "d3f2b2ac-ec20-4d16-bd16-66b2e1e568f9",
  "urgency": 2
},
```
Okay:

```
> task export "\(pro.is:w or pro:w.\) entry.after:2019-04-30 entry.before:2021-12-31" | jq ".[].project" | uniq
"w.lm"
"w.l.p"
"w.lm"
"w.lm"
"w.l.py"
"w.lm"
"w"
```
Proud that I wrote that on the first try, as trivial as it is. Thank you ExB for teaching me to parse JSONs.

The quotes: `jq -r` returns raw output[^5], so same as above but without the quotes.
Final command to get the list of projects:

```bash
task export "\(pro.is:w or pro:w.\) entry.after:2019-04-30 entry.before:2021-12-31" | jq ".[].project" -r | sort | uniq
```

(Remembering that `uniq` only collapses adjacent duplicates, hence the `sort` first.)
And let's make it a loop, final command:

```bash
for p in $(task export "\(pro.is:w or pro:w.\) entry.after:2019-04-30 entry.before:2021-12-31" | jq ".[].project" -r | sort | uniq);
do task entry.after:2019-04-30 entry.before:2021-12-31 pro:$p mod pro:new_project_name$p;
done
```
Nice but forgotten stuff: `task summary`.

[^1]: (haha see what I did there?)
[^5]: How to remove quotes from the results? · Issue #1735 · stedolan/jq
Had `/dtb/days/day122.md`-type posts, the older ones, and `/dtb/days/1234-1234-my-title.md`-type newer posts. They both lived in the same directory on disk, `/content/dtb/days/...`. The latter were converted from Obsidian, which meant (among other things) that deleting a page in Obsidian wouldn't automatically delete the corresponding converted one in Hugo, and I couldn't just `rm -rf ..../days` before each conversion because that would delete the older `day234.md` posts.
I wanted to put them in different folders on disk in `./content/`, but keep the URL structure `serhii.net/dtb/post-name/` for both.
Solution was making all `/dtb` posts (incl. pages) use the section (`dtb`) in the permalink, in `config.yaml`:

```yaml
permalinks:
  dtb: '/:section/:filename'
```
Now they do, regardless of their location on disk. Then I moved the old posts into `./content/dtb/old_days`, and kept the new ones in `./content/dtb/days`.
Lastly, this removes all converted posts (= all `.md`s except `_index.md`) before conversion, so that no stray markdown posts are left:

```bash
find "$OLD_DAYS" -type f -name '*.md' | grep -v _index.md | xargs rm
```
Google still has the `serhii.net/dtb/days/...` pages cached, and currently they're available both from there and from `/dtb/...`. I can't find a way to redirect all of the `/dtb/days/...` URLs to `/dtb/...` except manually adding stuff to the frontmatter of each. I have scripts for that, but it's still ugly.
`.htaccess` is our friend:

```apache
RewriteRule ^d/dtb(.*)$ /dtb$1 [R=301,NC,L]
RewriteRule ^dtb/days(.*)$ /dtb$1 [R=301,NC,L]
```
This is getting more and more bloated.
Generally, I see absolutely no reason not to rewrite this mess of build scripts in Python. `obyde` is a Python package, and handling settings, file operations etc. is more intuitive to me in Python. Instead, I keep re-learning bash/zsh escape syntax every time, and I keep procrastinating on error handling for the same reason. The only non-native bits would be `rsync` and `git`, which can be handled through a subprocess.
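A minimal sketch of what that could look like, assuming the usual convert-commit-rsync flow (the paths, remote, and commit message here are all made up):

```python
import subprocess
from pathlib import Path

SITE_DIR = Path("~/serhii.net").expanduser()  # hypothetical repo location
REMOTE = "user@host:/var/www/serhii.net/"     # hypothetical rsync target

def run(*args: str) -> None:
    # check=True raises CalledProcessError on a nonzero exit code,
    # which is the error handling the bash scripts never had
    subprocess.run(args, check=True, cwd=SITE_DIR)

run("git", "add", "-A")
run("git", "commit", "-m", "publish")
# No shell involved, so no quoting/escaping issues to re-learn:
run("rsync", "-av", "--delete", "public/", REMOTE)
```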
pytest-datafiles · PyPI is nice, but returns a `py.path` instead of a `pathlib.Path`. Tried to write something to make it convert automatically:
```python
from pathlib import Path
import pytest

ASSETS_DIR = Path(__file__).parent / "assets"
PROJ_DIR = ASSETS_DIR / "project_dir"

@pytest.fixture
def pfiles(datafiles):
    # Fixture that converts pytest-datafiles' py.path into a pathlib.Path
    return Path(str(datafiles))

@pytest.mark.datafiles(PROJ_DIR)
def test_read_meta_json(pfiles):
    assert do_sth_with_file(pfiles)
```
First nontrivial fixture I've written, and maybe it's a really bad idea to do it like that. But this feels like a general use case, and someone must have had this problem before.
pytest-datafiles · PyPI allows copying files to a temporary directory, where they can then be modified etc. Really neat! Sample:
```python
from pathlib import Path
import pytest

ASSETS_DIR = Path(__file__).parent / "assets"
PROJ_DIR = ASSETS_DIR / "project_dir"

konfdir = pytest.mark.datafiles(PROJ_DIR)

@konfdir
def test_basedir_validity(datafiles):
    assert directory_is_valid(datafiles)
```
Also love this bit:

> Note about maintenance: This project is maintained and bug reports or pull requests will be addressed. There is little activity because it simply works and no changes are required.
SADLY this means that the returned path is a `py.path`, and I'm not the only one complaining about that[^1]. Pytest has newer native fixtures that use pathlib (Temporary directories and files — pytest documentation), but datafiles hasn't been moved to them.
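For comparison, a minimal sketch of the same kind of test using the native `tmp_path` fixture instead of `datafiles`; unlike `datafiles` it starts out empty, so the assets get copied in manually (the asset names here are hypothetical):

```python
import shutil
from pathlib import Path

ASSETS_DIR = Path(__file__).parent / "assets"

def test_read_meta_json_native(tmp_path: Path):
    # tmp_path is pytest's built-in fixture: a fresh, empty pathlib.Path
    proj_dir = tmp_path / "project_dir"
    shutil.copytree(ASSETS_DIR / "project_dir", proj_dir)
    # now the copy can be modified freely, datafiles-style
    assert (proj_dir / "meta.json").exists()  # hypothetical asset file
```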
A `conftest.py` file gets imported and run before all the other ones. Pytest resolves all imports at the very beginning, so I used `conftest.py` to import a package, making it the one used by the imports in the files the tests import (seeing that there's a `mypackage` already imported, subsequent `import mypackage`s are ignored).

(Can I think of this as something similar to an `__init__.py`?)
This looks really interesting! It's not about the syntax, but about the basic design philosophies, plus examples of packages that use them: What's init for me? Designing for Python package imports | Towards Data Science.

Other stuff I learned about `__init__.py`: you can drop into `pdb` physically inside an `__init__.py`, and, for example, look at the stack of whatever called it with `w`.
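Something like this, presumably (a sketch; `mypackage` is a placeholder):

```python
# mypackage/__init__.py (hypothetical sketch)
import pdb

pdb.set_trace()  # stops here the moment anything imports the package;
                 # typing `w` at the (Pdb) prompt shows who triggered the import
```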
Today, I ran this:

```bash
git commit -m "TICKETNAME Export of X generated with `name-of-some-utility`"
```

The commit message on GitLab was:

```
"TICKETNAME Export of X generated with (Starting the export of data, wait till it downloads...)"
```

Clear in retrospect: backticks do command substitution even inside double quotes, so the utility actually ran and its output ended up in the message. A fascinating way for it to break.
Do I want to get a clear picture of all the various levels of escaping, including globs, backticks, backslashes etc. happening in the shell?

- Why doesn't the `#` in `git commit -m "Ticket #1231"` result in the `1231` being commented out and a syntax error? I know it doesn't, but I wouldn't be able to predict that behaviour without this knowledge. Would single quotes change much? How would one actually comment out the rest of the line this way? (There's a quick Python illustration after this list.)
- What are the rules that decide whether a `*` gets expanded by the shell or passed to, say, `scp` as-is? Etc. etc. etc.
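Python's `shlex` follows POSIX splitting rules and is a cheap way to poke at the `#` question above; a sketch, and an approximation of bash rather than the real thing:

```python
import shlex

# Inside double quotes, # is just part of the word, not a comment:
print(shlex.split('git commit -m "Ticket #1231"'))
# -> ['git', 'commit', '-m', 'Ticket #1231']

# Unquoted, with comment handling enabled, # eats the rest of the line:
print(shlex.split('git commit -m Ticket #1231', comments=True))
# -> ['git', 'commit', '-m', 'Ticket']
```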
It's all knowable and learnable, but I was never sure whether the ROI was worth it for me. Till now, trial and error has always worked in the rare instances where I had to do something complex with bash scripts, but this is the first time it has bitten me in real life in an unexpected way.
I find this approach[^1] brilliant (and of course it works with everything split into separate functions a la my last post: 211124-1744 argparse notes):
```python
import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument(
    '-d', '--debug',
    help="Print lots of debugging statements",
    action="store_const", dest="loglevel", const=logging.DEBUG,
    default=logging.WARNING,
)
parser.add_argument(
    '-v', '--verbose',
    help="Be verbose",
    action="store_const", dest="loglevel", const=logging.INFO,
)
args = parser.parse_args()
logging.basicConfig(level=args.loglevel)
```
And TIL about `dest=`, which will make my life much easier too by outsourcing more logic to argparse.
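To make the shared `dest=` concrete, a quick sanity check against the parser above (a hypothetical invocation, not from the original post):

```python
# Both flags write into the same args.loglevel slot:
assert parser.parse_args([]).loglevel == logging.WARNING    # default from -d
assert parser.parse_args(["-v"]).loglevel == logging.INFO
assert parser.parse_args(["-d"]).loglevel == logging.DEBUG
```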