In the middle of the desert you can say anything you want
Reading “German: An Essential Grammar” by Donaldson, I found this bit: 1
English has a rule that if the time of an event that
occurred in the past is mentioned, then the imperfect must be used, but if
the time is omitted, the perfect is required, e.g.
- He returned from Hamburg yesterday.
- He has returned from Hamburg.
- He has returned from Hamburg yesterday. (not grammatical)
TIL.
zsh-specific - to detach & disown a process, there’s &!: 2
dolphin &!
Long question and answer about fahren zu/nach/in/…: Richtungen und Ziele
The Yojik Website has the FSI courses (FSI Languages Courses) and the website as I remember it.
Changed ~/.taskrc to show any active tasks, regardless of anything else, in my sprint view:
s () {task s \(project:w or \(sprint:$SPRINT \(+A or +O\)\) or +ACTIVE\) "$*"}
Standard lock command leaves both monitors on.
Reddit 3 mentioned two commands:
xset s activate
xset dpms force off
The second one worked for me!
Now I have a shiny new screen-lock keybinding (and a suspend one too, while we’re at it) in my i3 config!
bindsym $ms+n exec gnome-screensaver-command -l && xset dpms force off
bindsym $ms+Shift+n exec i3lock -i ~/s/black_lock.png -t -p win -e && systemctl suspend -i
Nvidia has a repo of all docker images it creates, one of them: Torch | NVIDIA NGC
“Das finde ich zielführender als…” (“I find that more expedient than…”) - heard at work
docker run --name frontend -p 0:80 frontend:latest 1
Port 0 gets passed to the kernel, which assigns any free port. To see which one was chosen, run docker port somecontainer.
docker run --gpus device=3 -e NVIDIA_VISIBLE_DEVICES=0 -e CUDA_VISIBLE_DEVICES=0 myservice
Here device=3 is the id of the host GPU that we want to use.
lspci | grep -i "nvidia"
-i == ‘ignore case’ is actually something I can remember.
Docker will autostart any container with a RestartPolicy of ‘always’ when the docker service initially starts. 1
I can set/unset it in Kitematic, or through the terminal:
docker update --restart=no my-container
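To double-check what policy a container currently has, a quick sketch using the same placeholder name my-container (the inspect template is standard Docker):
# Prints the current restart policy, e.g. "no" or "always"
docker inspect -f '{{ .HostConfig.RestartPolicy.Name }}' my-container
# And to turn autostart back on:
docker update --restart=always my-container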
Quoting SO: 2
apt purge --auto-remove <packagename>
purges packagename
and any packages which are rendered unnecessary by its removal, as well as any other packages which aren’t necessary.
apt autoremove --purge
purges any packages which aren’t necessary (marked as “automatically installed” and with no dependent packages).
The first form is what you’d use when manipulating individual packages; the latter is a clean-up operation across all packages.
This seems nice, TODO: Cleaning up with apt-get | Network World
LVM - Debian Wiki is nice and readable. I used this command to backup the headers:
sudo cryptsetup luksHeaderBackup /dev/nvmeXXXXX --header-backup-file headerBackupFile
… and put it somewhere not on the drive I’ll be recovering if it all goes wrong.
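For reference, the restore direction is the mirror-image command (same placeholder device and backup file as above; I haven’t needed it yet):
# Write the saved header back onto the device if the original gets damaged
sudo cryptsetup luksHeaderRestore /dev/nvmeXXXXX --header-backup-file headerBackupFile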
Aaaand the saga continues!
…since the GPU is an eGPU, apparently I do need to do it the harder way: Accelerating Machine Learning on a Linux Laptop with an External GPU | NVIDIA Developer Blog
It is, I can see it:
(17:42:42/10815)~/$ lspci | grep -i VGA
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 07)
0c:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
but if it wasn’t, I’d authorize it and check with boltctl list:
(17:43:13/10817)~/$ boltctl list
[...]
● GIGABYTE GV-N1070IXEB-8GD
├─ type: peripheral
├─ name: GV-N1070IXEB-8GD
├─ vendor: GIGABYTE
├─ uuid: # redacted
├─ status: authorized
│ ├─ domain: domain0
│ └─ authflags: none
├─ authorized: Do 29 Apr 2021 07:57:37 UTC
├─ connected: Do 29 Apr 2021 07:57:37 UTC
└─ stored: no
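If it ever shows up as not authorized, the manual route would presumably be something like this (the uuid placeholder is hypothetical, taken from the boltctl list output):
# One-off authorization of the Thunderbolt device
boltctl authorize <uuid-from-boltctl-list>
# Or enroll it, so it gets authorized automatically every time it's plugged in
boltctl enroll <uuid-from-boltctl-list>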
How to setup an eGPU on Ubuntu for TensorFlow describes other things that can go wrong:
I had to disable the following, otherwise my eGPU was not detected:
- Secure Boot
- Thunderbolt Security Level
From this point on, I follow Nvidia’s tutorial 3 unless stated otherwise.
Using quotes means the * doesn’t have to be escaped:
sudo apt-get purge "nvidia*"
This is a fuller example: 4
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove nvidia-*
Found and manually removed /etc/apt/sources.list.d/graphics-drivers-ubuntu-ppa-bionic.list, leaving the .save file in place.
As per Nvidia’s guide:
sudo apt-get update
sudo apt-get dist-upgrade
To be safe, rebooted.
The existing driver is most likely Nouveau, an open-source driver for NVIDIA GPUs. Because Nouveau doesn’t support eGPU setups, install the NVIDIA CUDA and NVIDIA drivers instead. You must also stop the kernel from loading Nouveau. 3
okay!
Found this: NVIDIA/data-science-stack: NVIDIA Data Science stack tools Read about it here: Ubuntu for machine learning with NVIDIA RAPIDS in 10 min | Ubuntu
Official by nvidia, and seems to do automatically what’s needed for supported systems. Let’s run a script from the internet that installs drivers, loads kernel modules etc.
Source is available, yay for open source: data-science-stack/data-science-stack at master · NVIDIA/data-science-stack
Ran ./data-science-stack setup-system - it uses sudo, didn’t ask for root or anything.
Seems to have installed nvidia driver version 460. Asked to reboot at the end.
Rebooted.
(18:40:30/10909)~/$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
okay. Same results I had. Confirms that my previous steps weren’t any more wrong than the script’s.
(18:41:49/10910)~/$ sudo apt list --installed | grep "\(cuda\|nvidia\)"
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
libnccl2/unknown,now 2.9.6-1+cuda11.3 amd64 [installed]
libnvidia-cfg1-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-common-460/unknown,now 460.73.01-0ubuntu1 all [installed,automatic]
libnvidia-compute-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-container-tools/bionic,now 1.4.0-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.4.0-1 amd64 [installed,automatic]
libnvidia-decode-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-encode-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-extra-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-fbc1-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-gl-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
libnvidia-ifr1-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
nvidia-compute-utils-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
nvidia-container-runtime/bionic,now 3.5.0-1 amd64 [installed,automatic]
nvidia-container-toolkit/bionic,now 1.5.0-1 amd64 [installed,automatic]
nvidia-dkms-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
nvidia-docker2/bionic,now 2.6.0-1 all [installed]
nvidia-driver-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed]
nvidia-kernel-common-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
nvidia-kernel-source-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
nvidia-prime/bionic-updates,bionic-updates,now 0.8.16~0.18.04.1 all [installed,automatic]
nvidia-settings/unknown,unknown,now 465.19.01-0ubuntu1 amd64 [installed,automatic]
nvidia-utils-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
xserver-xorg-video-nvidia-460/unknown,now 460.73.01-0ubuntu1 amd64 [installed,automatic]
Also, as usual,
(18:48:34/10919)~/$ lsmod | grep nvi
(18:48:37/10920)~/$
lspci -k shows the kernel modules:
0c:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd GP104 [GeForce GTX 1070]
Kernel modules: nvidiafb, nouveau
This output implies no nvidia driver is installed on my system. 5 …though it is.
$ nvidia-settings --version
nvidia-settings: version 465.19.01
software-properties-gtk tells me I’m using the proprietary nvidia-driver-460, not 465.
In any case, I can’t blacklist nouveau, as there are still no ubuntu kernel modules.
BUT!
(19:04:04/10946)~/$ dkms status
nvidia, 460.73.01: added
Also, inxi -Fxxxrz (found somewhere on the internet):
Graphics: Card-1: Intel UHD Graphics 620 bus-ID: 00:02.0 chip-ID: 8086:5917
Card-2: NVIDIA GP104 [GeForce GTX 1070] bus-ID: 0c:00.0 chip-ID: 10de:1b81
Display Server: x11 (X.Org 1.19.6 ) drivers: modesetting,nvidia (unloaded: fbdev,vesa,nouveau)
So it sees them as there and loaded? Does dkms somehow bypass lsmod etc.?
sudo dkms autoinstall should autoinstall all added drivers… let’s hope for the best, I guess.
(19:11:47/10958)~/$ sudo dkms autoinstall
Kernel preparation unnecessary for this kernel. Skipping...
applying patch disable_fstack-clash-protection_fcf-protection.patch...patching file Kbuild
Hunk #1 succeeded at 85 (offset 14 lines).
Building module:
cleaning build area...
unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-72-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.4.0-72-generic/build LD=/usr/bin/ld.bfd modules......(bad exit status: 2)
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-dkms-460.0.crash'
Error! Bad return status for module build on kernel: 5.4.0-72-generic (x86_64)
Consult /var/lib/dkms/nvidia/460.73.01/build/make.log for more information.
The file is long; the key parts seem to be:
scripts/Makefile.build:269: recipe for target '/var/lib/dkms/nvidia/460.73.01/build/nvidia/nv.o' failed
make[2]: *** [/var/lib/dkms/nvidia/460.73.01/build/nvidia/nv.o] Error 1
Makefile:1754: recipe for target '/var/lib/dkms/nvidia/460.73.01/build' failed
make[1]: *** [/var/lib/dkms/nvidia/460.73.01/build] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-72-generic'
Makefile:80: recipe for target 'modules' failed
make: *** [modules] Error 2
DKMSKernelVersion: 5.4.0-72-generic
Date: Fri Apr 30 18:30:45 2021
DuplicateSignature: dkms:nvidia-dkms-460:460.73.01-0ubuntu1:/var/lib/dkms/nvidia/460.73.01/build/conftest/functions.h:11:2: error: #error acpi_walk_namespace() conftest failed!
Package: nvidia-dkms-460 460.73.01-0ubuntu1
PackageVersion: 460.73.01-0ubuntu1
SourcePackage: nvidia-graphics-drivers-460
Title: nvidia-dkms-460 460.73.01-0ubuntu1: nvidia kernel module failed to build
Smells like a driver/kernel support issue?
First result when googling dkms nvidia 460 is this: Can’t get nvidia 460 module to build on Ubuntu 20.04 to support two A100s - GPU Unix Graphics / Linux - NVIDIA Developer Forums
Please check if the build symlink to the headers for dkms exists:
ls /lib/modules/$(uname -r)/build
Otherwise, create it
ln -s /usr/src/linux-headers-$(uname -r) /lib/modules/$(uname -r)/build
Didn’t have it; created it; trying again gives the same error. Deleted the previous log, full output is:
(19:19:54/10967)~/$ sudo dkms autoinstall
Kernel preparation unnecessary for this kernel. Skipping...
applying patch disable_fstack-clash-protection_fcf-protection.patch...patching file Kbuild
Hunk #1 succeeded at 85 (offset 14 lines).
Building module:
cleaning build area...
unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-72-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.4.0-72-generic/build LD=/usr/bin/ld.bfd modules.......(bad exit status: 2)
Error! Bad return status for module build on kernel: 5.4.0-72-generic (x86_64)
Consult /var/lib/dkms/nvidia/460.73.01/build/make.log for more information.
The file is full of what looks like syntax errors..?
This charming Chinese website seems to imply the gcc version is to blame: NVIDIA驱动出错:NVIDIA-SMI has failed because it couldn‘t communicate with the NVIDIA driver. Make sure t_sazass的博客-CSDN博客
(19:22:39/10974)~/$ cat /proc/version
Linux version 5.4.0-72-generic (buildd@lgw01-amd64-021) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #80~18.04.1-Ubuntu SMP Mon Apr 12 23:26:25 UTC 2021
# install gcc 8 and point the gcc/cc alternatives at it
sudo apt install gcc-8
sudo update-alternatives --config gcc
sudo update-alternatives --remove-all gcc
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 10
sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc-8 10
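Before retrying, a quick sanity check that the default compiler really changed (standard commands, nothing specific to this setup):
# Should now report gcc 8.x
gcc --version
# Shows which alternative is currently selected for gcc
update-alternatives --display gcc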
Let’s retry dkms autoinstall:
(19:26:03/10981)~/$ sudo dkms autoinstall
Kernel preparation unnecessary for this kernel. Skipping...
applying patch disable_fstack-clash-protection_fcf-protection.patch...patching file Kbuild
Hunk #1 succeeded at 85 (offset 14 lines).
Building module:
cleaning build area...
unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-72-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/5.4.0-72-generic/build LD=/usr/bin/ld.bfd modules...............
Signing module:
- /var/lib/dkms/nvidia/460.73.01/5.4.0-72-generic/x86_64/module/nvidia-modeset.ko
- /var/lib/dkms/nvidia/460.73.01/5.4.0-72-generic/x86_64/module/nvidia.ko
- /var/lib/dkms/nvidia/460.73.01/5.4.0-72-generic/x86_64/module/nvidia-uvm.ko
- /var/lib/dkms/nvidia/460.73.01/5.4.0-72-generic/x86_64/module/nvidia-drm.ko
Secure Boot not enabled on this system.
cleaning build area...
DKMS: build completed.
nvidia.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-72-generic/updates/dkms/
nvidia-modeset.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-72-generic/updates/dkms/
nvidia-drm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-72-generic/updates/dkms/
nvidia-uvm.ko:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-72-generic/updates/dkms/
depmod...
DKMS: install completed.
WOW. WOOOOOW. WOOOOOOOOOOOOOOOOOOOOOO
Without even restarting, after the first command my screen flashed and changed resolution a bit, BUT THEN IT WORKED
(19:34:17/10983)~/$ nvidia-smi
No devices were found
(19:34:20/10984)~/$ nvidia-smi
Fri Apr 30 19:34:22 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 On | 00000000:0C:00.0 Off | N/A |
| 0% 54C P0 37W / 151W | 7MiB / 8119MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
All these attempts failed because the nvidia dkms module couldn’t build: the build died with syntax errors caused by the old gcc version.
What could I have done differently? Why at no point did I see errors about the kernel module failing to build, and where should I have looked for them? And why syntax errors, instead of something checking the gcc version in use and loudly failing on a mismatch? Why is that Chinese website the only place where I found this fix?
(19:42:57/10995)~/$ lsmod | grep nvidia
nvidia_uvm 1015808 0
nvidia_drm 57344 1
nvidia_modeset 1228800 1 nvidia_drm
nvidia 34123776 17 nvidia_uvm,nvidia_modeset
drm_kms_helper 188416 2 nvidia_drm,i915
drm 491520 15 drm_kms_helper,nvidia_drm,i915
Now let’s hope this survives a restart. And that it works when the eGPU is disconnected.
Following the readme, ran both options in separate terminals:
./data-science-stack list
./data-science-stack build-container
./data-science-stack run-container
and
./data-science-stack list
./data-science-stack build-conda-env
./data-science-stack run-jupyter
The latter seems to be installing CUDA and friends directly on my computer - didn’t expect that, but I need them either way, I think; I’ll let the script handle everything since it already started. It installed conda to ~/conda/, but again, not sure what I was expecting.
Both running for 20+ minutes now
EDIT: ~/conda/ took 20 GB, filling up my drive and blocking everything; deleted it.
In the docker container with jupyterlab, tensorflow can’t access the GPU, but pytorch can.
The NVIDIA eGPU tutorial 3 continues with offloading Xorg to the GPU - do I want this? Can I use the GPU just for training, and leave Xorg running on the internal one? I probably don’t.
As I remember from the last time, X doesn’t start when the GPU is connected at boot but everything’s fine when it gets connected after starting X. When it’s connected, it seems the driver gets loaded and nvidia-smi etc works. That the system works without the eGPU attached is nice! Plug-and-play is nice too.
Installed pytorch in a virtualenv, for cuda 11.1, test snippet says cuda works!
import torch
x = torch.rand(5, 3)
print(x)                          # random tensor - torch itself works
print(torch.cuda.is_available())  # True means pytorch sees the GPU
Tensorflow:
>>> import tensorflow as tf
2021-04-30 21:36:12.984883: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
>>> tf.debugging.set_log_device_placement(True)
>>> a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
2021-04-30 21:36:23.055614: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-30 21:36:23.058062: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-04-30 21:36:23.115366: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-30 21:36:23.116510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:0c:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.721GHz coreCount: 15 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2021-04-30 21:36:23.116553: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-30 21:36:23.119974: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-04-30 21:36:23.120034: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-04-30 21:36:23.121503: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-30 21:36:23.121842: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-30 21:36:23.125037: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-30 21:36:23.125803: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-04-30 21:36:23.125980: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-04-30 21:36:23.125996: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Which libcudnn?
Tensorflow’s tutorial (GPU support | TensorFlow) does this:
Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
cuda-11-0 \
libcudnn8=8.0.4.30-1+cuda11.0 \
libcudnn8-dev=8.0.4.30-1+cuda11.0
What is the version for CUDA 11.2? cuDNN Archive | NVIDIA Developer has download links. The one for 11.2 is called “cudnn-11.2-linux-x64-v8.1.1.33.tgz”. I plug those versions in, they exist and install fine:
sudo apt-get install libcudnn8=8.1.1.33-1+cuda11.2
sudo apt-get install libcudnn8-dev=8.1.1.33-1+cuda11.2
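A quick, generic way to confirm the library is now visible to the dynamic loader (this is plain ldconfig, nothing TF-specific):
# Should list libcudnn.so.8 if the install worked
ldconfig -p | grep libcudnn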
And tensorflow now works!
2021-04-30 21:42:46.176942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7440 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:0c:00.0, compute capability: 6.1)
I can’t believe it but wow. It’s finished, it works, X didn’t die, plug-and-play works, no manual driver loading.
All in all, including all the failed attempts, took 5:30h of pure time, according to my time tracking.
The only wrinkle is that X doesn’t start when turning the computer on with the eGPU attached, but I can 100% live with that!
How to Benchmark your GPU on Linux has a fun quote:
This tool is very old, very basic and only tests a small portion of today’s OpenGL capabilities. Back in the old days, it was used to determine if the proprietary driver was installed and running properly as open-source drivers were performing awfully enough to be perfectly noticeable during this test. Nowadays, you won’t notice any difference between the two
Added this to config.py:
config.bind('<Alt-P>', 'set-cmd-text -s :open -p ')
Seen in someone’s config.py on gitlab6:
import glob
import os
for f in glob.glob(str(config.configdir / 'conf.d/*.py')):
    config.source(str(os.path.relpath(f, start=config.configdir)))
Nice examples: i3_config/settings.d at master · kiddico/i3_config · GitHub
i3 doesn’t have any kind of include directive in the config files, sadly. i3 - Source/import file from i3wm config - Stack Overflow is one option:
bindsym $mod+Shift+c exec "cat ~/.config/i3/colors ~/.config/i3/base > ~/.config/i3/config && i3-msg reload"
A keybinding to overwrite the config file and restart i3 with a command.
This looks very interesting, I shouldn’t forget to go through this: Life Hacking. His blog with personal examples: Alex Vermeer — Life-Hacking. Climbing. Striving for awesome. Coffee. — Page 2
A non-pdf description of Life Areas with questions and metrics for each.
(He’s the same guy who created the awesome How to Get Motivated: A Guide for Defeating Procrastination poster!)
And let’s remember the classic: Evidence-based advice on how to be successful in any job - 80,000 Hours
Two options I like:7
nohup cmd &
cmd & disown
I feel one of these will become part of many aliases of mine.
And a short bash function from the same place:
function dos() {
# run_disowned and silenced
run_disowned "$@" 1>/dev/null 2>/dev/null
}
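dos() calls run_disowned, which isn’t quoted here; judging by the name, it’s presumably a tiny wrapper along these lines (my reconstruction, not the original):
# Hypothetical companion function: run a command in the background and
# detach it from the shell's job table so it survives the terminal closing
function run_disowned() {
    "$@" & disown
}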
debian - What’s the right way to purge recursively with apt? - Unix & Linux Stack Exchange ↩︎
Accelerating Machine Learning on a Linux Laptop with an External GPU | NVIDIA Developer Blog ↩︎ ↩︎ ↩︎
~pvsr/dotfiles: qutebrowser/.config/qutebrowser/config.py - sourcehut git ↩︎
linux - How do I detach a process from Terminal, entirely? - Super User ↩︎
To read: PEP 8 – Style Guide for Python Code | Python.org
I should learn about the search syntax for jira tickets:
assignee = currentuser() and statusCategory != Done ORDER BY updated DESC
Following this: CUDA 10.1 installation on Ubuntu 18.04 LTS | Medium - nope, errors.
In the same github discussion about installing CUDA on ubuntu that I’ve been to twice, this bit is mentioned: 1
The very very important thing is that never install “nvidia-driver-***” driver by yourself.
Required nvidia drivers are installed while doing
sudo apt install -y cuda=10.0.130-1
sudo apt remove --autoremove nvidia-*
doesn’t work as-is in zsh! The * gets interpreted as files in the current directory. This explains my CUDA issues: everything seemed to work until I ran the above in a directory containing files with matching names that got helpfully substituted in.
sudo apt remove --autoremove nvidia-\*
is the answer (or 'nvidia-*').
Not the first time this bites me, at least the third, and all of them in the context of doing CUDA stuff.
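Besides escaping or quoting, zsh also has the noglob precommand modifier, which turns off filename generation for a single command; an alias makes that the default (a sketch, the alias is my choice):
# One-off: don't glob-expand the arguments of this command
noglob sudo apt remove --autoremove nvidia-*
# Or make apt always behave this way in zsh
alias apt='noglob apt'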
“Es funktioniert fabelhaft” (“It works fabulously”) - heard at work
apt --fix-broken install didn’t help as advertised, but removing all the broken packages together with sudo dpkg -P cuda-libraries-10-0 libnvidia-common-390 helped! After this, removing/cleaning up everything else worked.
A lot of this mentioned changes to initramfs; I really hope I’ll be able to boot up next time :(
Also - if 90% of the tutorials about how to install $thing start with “Remove any traces of installs of $thing you have” it’s a nice sign that something’s shady.
docker logs 09348209840239
Option 1: hide the floating window:
for_window [title="^Skype$" floating] move scratchpad
Option 2:
Clever idea. Although, are you talking about the little window that can be disabled in Skype’s “Settings > Calling > Show call window when Skype is in the background”?
In search, before:Tomorrow is a nice catch-all filter.
Your system installations of CUDA and cudnn won’t be used, if you install PyTorch binaries with these libs. E.g.
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
will install CUDA 10.1 and cudnn in your current conda environment. 2
Nvidia drivers are needed on host machine, but not CUDA! 3
On TF’s official CUDA install page 4, the bash listings (that are usually copy-pasted) contain the standard $ at the beginning - it’s visible, but not copy-pastable!
So, hopefully for the last time today (as the previous couple of times), I end up in the official TF tutorial 4 about installing CUDA. Armed with the knowledge above:
Snippet:
# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
sudo apt install ./libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
sudo apt-get update
# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
cuda-11-0 \
libcudnn8=8.0.4.30-1+cuda11.0 \
libcudnn8-dev=8.0.4.30-1+cuda11.0
# Reboot. Check that GPUs are visible using the command: nvidia-smi
# Install TensorRT. Requires that libcudnn8 is installed above.
sudo apt-get install -y --no-install-recommends libnvinfer7=7.1.3-1+cuda11.0 \
libnvinfer-dev=7.1.3-1+cuda11.0 \
libnvinfer-plugin7=7.1.3-1+cuda11.0
Done, no conflicts, no anything, worked better than most Medium tutorials I’ve read today.
# Reboot.
Let’s hope for the best.
UPD: no black screen, booted fine, but nvidia-smi sees no driver.
sudo apt list --installed shows all the cuda stuff and the nvidia driver as installed:
nvidia-driver-465/unknown,unknown,now 465.19.01-0ubuntu1 amd64 [installed,automatic]
More worryingly, I see mentions of cuda-10-1 and cuda-11-1 together.
I should use ps axf instead of ps aux; the former gives a nice tree representation.
Yet another place that makes it look easy: CUDA Toolkit 11.0 Download | NVIDIA Developer
newgrp docker has to be run from each cli you’ll be using docker from?.. Until you restart.
docker run -d -p 80:80 docker/getting-started
docker stop accepts the full name (distracted_perlman), but part of its container_id works too!
The COPY instruction from a Dockerfile copies the contents of the directory, but not the directory itself! 1
journalctl
Logs take space (4 GB on my box!). To see how much specifically: 2
journalctl --disk-usage
sudo journalctl --vacuum-time=3d
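To keep it from growing back, journald’s disk usage can also be capped permanently; SystemMaxUse is the standard journald.conf option for that (the 500M value is just an example):
# In /etc/systemd/journald.conf, set e.g.:
#   SystemMaxUse=500M
# then restart journald so the limit takes effect:
sudo systemctl restart systemd-journald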
New -> Terminal. (Which you can use to access your docker running jupyter-notebook)
$ docker build -t dt2test -f ./docker/Dockerfile .
passes the Dockerfile as an explicit parameter; inside it, paths are relative to the folder you run docker build in.
For docker compose:
#docker-compose.yml
version: '3.3'
services:
  yourservice:
    build:
      context: ./
      dockerfile: ./docker/yourservice/Dockerfile
A lot of other nice options at Docker: adding a file from a parent directory - Stack Overflow
This module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes.
“Token classification” includes but is not limited to NER: Hugging Face – The AI community building the future.. Really nice new correct phrase I’ll be using!
Installing (after tensorflow and/or pytorch):
pip install transformers
Caches by default in user folder but can be overridden:
export HF_HOME="/data/sh/experiments/bert/cache"
The “hosted inference API” on the website is really cool! dslim/bert-base-NER · Hugging Face
Example of converting conll dataset to what BERT expects: Fine Tuning BERT for NER on CoNLL 2003 dataset with TF 2.0 | by Bhuvana Kundumani | Analytics Vidhya | Medium
The BERT model documentation shows the tokenizers etc etc etc. - BERT — transformers 4.5.0.dev0 documentation
Training and fine-tuning — transformers 4.5.0.dev0 documentation - same model can be trained/imported from TF to pytorch and back! Wow!
Documentation of a sample model: transformers/examples/research_projects/distillation at master · huggingface/transformers
Another example of fine-tuning BERT in Pytorch for NER: transformers/examples/pytorch/token-classification at master · huggingface/transformers
transformers installed from source (git/master): https://huggingface.co/transformers/installation.html#installing-from-source / pip install git+https://github.com/huggingface/transformers
CUDA_VISIBLE_DEVICES=1 python run_ner.py --model_name_or_path bert-base-uncased --dataset_name conll2003 --output_dir /tmp/test-ner --do_train --do_eval
/tmp/test-ner/ gets the checkpoints, eval data. Wow.
Here datasets is imported: transformers/requirements.txt at master · huggingface/transformers
TODO - what is this and where can I learn more? Is this HF specific? What else is there?
It has a really nice interface for searching datasets! Filter by task, language, etc.
German NER datasets: Hugging Face – The AI community building the future.
Some German NER models, sometimes based on bert: Hugging Face – The AI community building the future.
Converting Tensorflow Checkpoints — transformers 4.5.0.dev0 documentation
Is this real?
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
transformers-cli convert --model_type bert \
--tf_checkpoint $BERT_BASE_DIR/bert_model.ckpt \
--config $BERT_BASE_DIR/bert_config.json \
--pytorch_dump_output $BERT_BASE_DIR/pytorch_model.bin
Tatar von geräuchertem Forellenfilet mit Avocado - Annemarie Wildeisens KOCHEN
Cut the trout fillets into small cubes. Peel the shallot and chop it very finely. Cut each cherry tomato into 6 or 8 pieces. Put all these ingredients into a small bowl and mix carefully with the mayonnaise.
Trout + tomatoes + mayonnaise is literally the only recipe with mayonnaise in it that I’ve liked.
To redirect an issue to the old view, add ?oldIssueView=true.
Added this to config.py:
config.bind('<Ctrl-J>', ':open {url}?oldIssueView=true')
(18:03:38/10185) sudo apt install screen
# ...
Suggested packages:
byobu | screenie | iselect
The following NEW packages will be installed:
… did I just get an advert for a competitor when installing screen? :) Since when does ubuntu do this and where can I read more about it?
“Meetingtourismus oder Papiergenerieren?” (“Meeting tourism or paper-generating?”) (heard at work)
It seems to run userscripts not in the virtualenv qutebrowser uses, but in the standard system one? Installing packages in the virtualenv didn’t work, but installing them globally did.
Moving/renaming a file/directory is easy: dvc move from to. 1 It automatically updates the from.dvc files. Then .gitignore and the .dvc file have to be added and committed through git as usual.
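A minimal sketch of the whole sequence (the file names are made up):
# Rename a DVC-tracked file; dvc rewrites the corresponding .dvc file and .gitignore
dvc move data/raw.csv data/input.csv
# Commit the updated pointer files through git as usual
git add data/input.csv.dvc data/.gitignore
git commit -m "Rename raw.csv to input.csv"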
This is interesting: Data Organization — documentation
In general: Best Practices for Scientific Data Management — documentation
This guide describes Axiom Data Science’s best practices for scientific data management. The intent of these practices is to improve the accessibility and usability of your data. These practices may be followed at any time during the preparation of your dataset, but are most useful when considered at the onset of project planning and implemented during data collection.
Also related: Organising your data | Research Data Management
tree -d does it (shows only directories).
Root of repo:
git rev-parse --show-prefix
2
--git-dir returns the location of the .git folder, and --show-toplevel returns the absolute location of the git root.
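Typical use for the latter, e.g. jumping to the repo root from anywhere inside it (plain shell):
# cd to the root of the current repository
cd "$(git rev-parse --show-toplevel)"
# print the current directory's path relative to the repo root
git rev-parse --show-prefix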