31 Jan 2024

Setup for Dockerfiles where you can look around before running

I run a command w/ ARGs as CMD inside a Dockerfile.

Howto

I’d like to be able to run docker run -e "WHAT=ever" image bash to drop into bash, look around, and maybe change the main command. For this I’d need to generate some command.sh, but I can’t do that in the Dockerfile, because Docker ARGs are available at build time but not at runtime. (And I don’t want to rely on env variables alone, because I want to cat mycommand.sh and copy-paste what would run instead of looking at the values of environment variables.)

I came up with this setup:

FROM nvidia/cuda:11.6.2-runtime-ubuntu20.04

ARG DEVICE
ARG HF_MODEL_NAME
ARG LIMIT
ARG TASKS=truthfulqa

# ....

COPY resources/entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]
CMD ["/command.sh"]

entrypoint.sh:

#!/bin/bash
# echo "I am entrypoint"
echo "python3 -m lm_eval --model hf --model_args pretrained=${HF_MODEL_NAME} --limit $LIMIT --write_out --log_samples --output_path /tmp/Output --tasks $TASKS --device $DEVICE --verbosity DEBUG --include_path /resources --show_config" > /command.sh 
echo "echo I am command.sh" >> /command.sh 
chmod +x /command.sh

if [ $# -eq 0 ]; then
    # If we have no args to the entrypoint, run the main command
    /command.sh
else
    # If we do, assume it's a program and execute it
    echo "exec-ing $@"
    exec "$@"
fi

Then, this command will run the entrypoint.sh that creates command.sh and then runs it:

docker run --rm -it -e "DEVICE=cpu" -e "HF_MODEL_NAME=TinyLlama/TinyLlama-1.1B-Chat-v1.0" -e "LIMIT=1" -e "TASKS=openbookqa-test" me/lm-eval:0.0.17

And this one runs the entrypoint that creates command.sh and then runs bash, dropping me into a shell where I can cat /command.sh etc.:

docker run --rm -it -e "DEVICE=cpu" -e "HF_MODEL_NAME=TinyLlama/TinyLlama-1.1B-Chat-v1.0" -e "LIMIT=1" -e "TASKS=openbookqa-test" me/lm-eval:0.0.17 bash

Refs

Docker ENTRYPOINT and CMD : Differences & Examples:

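ENTRYPOINT is the program that gets executed when the container starts (none is set by default; shell-form commands are wrapped in /bin/sh -c).
CMD supplies the default arguments to that program.

The usual shell-form CMD whatever at the end of Dockerfiles then means /bin/sh -c "whatever" when no ENTRYPOINT is set.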

Here we use that to our advantage to decide what to run, while guaranteeing that command.sh always gets created.

CMD can be overridden by appending to the docker run command, like docker run ... image bash above.
ENTRYPOINT can be overridden with the --entrypoint argument to docker run.
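To make the interaction concrete, here is a small shell sketch of how I understand the final command line to be assembled from an exec-form ENTRYPOINT and CMD. resolve_argv is a hypothetical helper written for illustration, not something Docker provides; it only mimics the rule, using the values from the Dockerfile above.

```shell
#!/bin/bash
# Hypothetical model of how the container's argv is built from an
# exec-form ENTRYPOINT and CMD (values copied from the Dockerfile above).
ENTRYPOINT="/entrypoint.sh"
DEFAULT_CMD="/command.sh"

resolve_argv() {
    # Arguments given after the image name replace CMD entirely;
    # with none given, the image's CMD is appended instead.
    if [ $# -eq 0 ]; then
        echo "$ENTRYPOINT $DEFAULT_CMD"
    else
        echo "$ENTRYPOINT $*"
    fi
}

resolve_argv         # models: docker run image
resolve_argv bash    # models: docker run image bash
```

One consequence, if this model is right: with CMD ["/command.sh"] in the image, a plain docker run reaches entrypoint.sh with one argument, so it is the exec "$@" branch that starts command.sh. The $# -eq 0 branch is only hit when no CMD reaches the entrypoint at all.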

Rancher/k8s pods

I often want to do something similar for a Docker image running on Rancher. For this I usually use sth like this (230311-1215 Rancher and kubernetes basics):

spec:
  containers:
    - name: project-lm-eval-container-name-2
      image: me/lm-eval:0.0.17
      command:
          - /bin/sh
          - -c
          - while true; do echo $(date) >> /tmp/out; sleep 1; done

Define a Command and Arguments for a Container | Kubernetes mentions something that can be a better way: passing arguments to the container’s command. With that, entrypoint.sh becomes:

#!/bin/bash
echo "python3 -m lm_eval --model hf --model_args pretrained=${HF_MODEL_NAME} --limit $LIMIT --write_out --log_samples --output_path /tmp/Output --tasks $TASKS --device $DEVICE --verbosity DEBUG --include_path /resources --show_config" > /command.sh 
echo "echo I am command.sh" >> /command.sh 
chmod +x /command.sh

if [ $# -eq 0 ]; then
    # If we have no args to the entrypoint, run the main command
    /command.sh
elif [ "$1" = "sleep" ]; then
    while true; do
        echo sleeping on $(date)
        sleep 10
    done
else
    # If we have any other arg, assume it's a command and execute it
    exec "$@"
fi

When it gets sleep as an argument, it’ll sleep; the rest is unchanged.

Pod

apiVersion: v1
kind: Pod
metadata:
  name: xx
  namespace: xx
spec:
  containers:
    - name: project-lm-eval-container-name-2
      image: me/lm-eval:0.0.17
      # If BE_INTERACTIVE == "sleep", ./entrypoint will be an infinite loop
      #     (if it's empty, it'll run the thing as usual)
      #     (if it's anything else, it will run that command, e.g. bash)
      command:
          - /entrypoint.sh
      args: ["$(BE_INTERACTIVE)"]
      env:
        # all of them, plus:
        - name: BE_INTERACTIVE
          valueFrom:
            configMapKeyRef:
              name: lm-eval-cmap
              key: BE_INTERACTIVE

A bit ugly, sth like RUN_MODE would be better, but now:

BE_INTERACTIVE is in a config map, becomes an env variable
If set to sleep, the pod will run the infinite loop; then I can “Execute shell” and cat /command.sh etc.!
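One thing I have not verified is what the entrypoint receives when BE_INTERACTIVE is set but empty. Assuming args then becomes a single empty string (so $# is 1 rather than 0), a defensive version of the dispatch could treat an empty first argument like no arguments. pick_mode below is a hypothetical helper that only reports the decision; it does not run anything.

```shell
#!/bin/bash
# Sketch under the assumption above: an empty first argument is
# handled the same way as no arguments at all.
pick_mode() {
    case "${1:-}" in
        "")    echo "run" ;;        # no args, or a single empty arg
        sleep) echo "sleep" ;;      # keep the container alive
        *)     echo "exec: $*" ;;   # anything else is a command to exec
    esac
}

pick_mode
pick_mode ""
pick_mode sleep
pick_mode bash -l
```

In the real entrypoint.sh the three branches would call /command.sh, run the sleep loop, and exec "$@" respectively.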

Prettier multiline

This was hard to get right with newline replacements etc., but this writes command.sh in a nice multiline format:

cat > /command.sh <<EOF
python3 -m lm_eval \\
--model hf \\
--model_args pretrained=$HF_MODEL_NAME \\
--limit $LIMIT \\
--write_out \\
--log_samples \\
--output_path /tmp/Output \\
--tasks $TASKS \\
--device $DEVICE \\
--verbosity DEBUG \\
--include_path /resources \\
--show_config
EOF

No quotes around EOF, double backslashes, and no backslashes before $ (with them, the substitution would happen when command.sh runs, not when it is created).
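A small side-by-side demo of those three rules, in case it helps. NAME is a made-up variable standing in for something like HF_MODEL_NAME, and the files go into a temporary directory rather than /.

```shell
#!/bin/bash
# NAME is a placeholder variable used only for this demo.
NAME="some-model"
out=$(mktemp -d)

# 1) Unquoted EOF: $NAME is expanded now, and \\ becomes a single backslash.
cat > "$out/expanded.sh" <<EOF
echo $NAME \\
--flag
EOF

# 2) Unquoted EOF with \$: a literal $NAME is written and only
#    expanded later, when the generated file runs.
cat > "$out/deferred.sh" <<EOF
echo \$NAME
EOF

# 3) Quoted 'EOF': nothing is expanded, the text is copied verbatim.
cat > "$out/verbatim.sh" <<'EOF'
echo $NAME \\
EOF

cat "$out/expanded.sh" "$out/deferred.sh" "$out/verbatim.sh"
```

Only the first variant gives a file with the values already filled in and valid line continuations, which is what I want for copy-pasting.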

Sleep after run

Last update on this: run_then_sleep executes the command immediately and then sleeps, so I can still connect to the container afterwards. Nice for Rancher and co., which don’t create the container immediately, so otherwise I’d have to wait for it before being able to start stuff.

#!/bin/bash
cat > /command.sh <<EOF
python3 -m lm_eval \\
--model hf \\
--model_args pretrained=$HF_MODEL_NAME \\
--limit $LIMIT \\
--write_out \\
--log_samples \\
--output_path /tmp/Output \\
--tasks $TASKS \\
--device $DEVICE \\
--verbosity DEBUG \\
--include_path /resources \\
--show_config
EOF

echo "echo I am command.sh" >> /command.sh 
chmod +x /command.sh

if [ $# -eq 0 ]; then
    # If we have no args to the entrypoint, run the main command
    /command.sh
elif [ "$1" = "sleep" ]; then
    while true; do
        echo sleeping
        sleep 10
    done
elif [ "$1" = "run_then_sleep" ]; then
    /command.sh
    while true; do
        echo sleeping after run
        sleep 100
    done
else
    # If we have any other arg, assume it's a command and execute it
    exec "$@"
fi

serhii.net