CH-RUN(1) | Charliecloud | CH-RUN(1) |
ch-run - Run a command in a Charliecloud container
$ ch-run [OPTION...] IMAGE -- CMD [ARG...]
Run command CMD in a fully unprivileged Charliecloud container using the image specified by IMAGE, which can be: (1) a path to a directory, (2) the name of an image in ch-image storage (e.g. example.com:5050/foo), or (3) if the proper support is enabled, a SquashFS archive. ch-run does not use any setuid or setcap helpers, even for mounting SquashFS images with FUSE.
If --write is given and DST does not exist, it will be created as an empty directory. However, DST must be entirely within the image itself; DST cannot enter a previous bind mount. For example, --bind /foo:/tmp/foo will fail because /tmp is shared with the host via bind-mount (unless $TMPDIR is set to something else or --private-tmp is given).
Most images already have ten directories, /mnt/[0-9], available as mount points.
Symlinks in DST are followed, and absolute links can have surprising behavior. Bind-mounting happens after namespace setup but before pivoting into the container image, so absolute links use the host root. For example, suppose the image has a symlink /foo -> /mnt. Then, --bind=/bar:/foo will bind-mount on the host’s /mnt, which is inaccessible on the host because namespaces are already set up and also inaccessible in the container because of the subsequent pivot into the image. Currently, this problem is only detected when DST needs to be created: ch-run will refuse to follow absolute symlinks in this case, to avoid directory creation surprises.
See below for details on how environment variables work in ch-run.
Note: Because ch-run is fully unprivileged, it is not possible to change UIDs and GIDs within the container (the relevant system calls fail). In particular, setuid, setgid, and setcap executables do not work. As a precaution, ch-run calls prctl(PR_SET_NO_NEW_PRIVS, 1) to disable these executables within the container. This does not reduce functionality but is a “belt and suspenders” precaution to reduce the attack surface should bugs in these system calls or elsewhere arise.
ch-run supports two different image formats.
The first is a simple directory that contains a Linux filesystem tree. This can be accomplished by:
The second is a SquashFS image archive mounted internally by ch-run, available if it’s linked with the optional libsquashfuse_ll shared library. ch-run mounts the image filesystem, services all FUSE requests, and unmounts it, all within ch-run. See --mount above to set the mount point location.
Like other FUSE implementations, Charliecloud calls the fusermount3(1) utility to mount the SquashFS filesystem. However, this executable does not need to be installed setuid root, and in fact ch-run actively suppresses its setuid bit if set (using prctl(2)).
Prior versions of Charliecloud provided wrappers for the squashfuse and squashfuse_ll SquashFS mount commands and fusermount -u unmount command. We removed these because we concluded they had minimal value-add over the standard, unwrapped commands.
WARNING:
In addition to any directories specified by the user with --bind, ch-run has standard host files and directories that are bind-mounted in as well.
The following host files and directories are bind-mounted at the same location in the container. These give access to the host’s devices and various kernel facilities. (Recall that Charliecloud provides minimal isolation and containerized processes are mostly normal unprivileged processes.) They cannot be disabled and are required; i.e., they must exist both on host and within the image.
Optional; bind-mounted only if path exists on both host and within the image, without error or warning if not.
Additional bind mounts that are done by default but can be disabled; see the options above.
By default, different ch-run invocations use different user and mount namespaces (i.e., different containers). While this has no impact on sharing most resources between invocations, there are a few important exceptions. These include:
--join is designed to address this by placing related ch-run commands (the “peer group”) in the same container. This is done by one of the peers creating the namespaces with unshare(2) and the others joining with setns(2).
To do so, we need to know the number of peers and a name for the group. These are specified by additional arguments that can (hopefully) be left at default values in most cases:
Caveats:
ch-run leaves environment variables unchanged, i.e. the host environment is passed through unaltered, except:
This section describes these features.
The default tweaks happen first, then --set-env and --unset-env in the order specified on the command line, and then CH_RUNNING. The two options can be repeated arbitrarily many times, e.g. to add/remove multiple variable sets or add only some variables in a file.
By default, ch-run makes the following environment variable changes:
Some of these distributions (e.g., Fedora 24) have also dropped /bin from the default $PATH. This is a problem when the guest OS does not have a merged /usr (e.g., Debian 8 “Jessie”). Thus, we add /bin to $PATH if it’s not already present.
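The /bin check described above can be sketched in shell as follows. This is only an illustration, not ch-run's implementation: the function name path_with_bin is hypothetical, and appending /bin at the end (rather than prepending it) is an assumption, since the text does not say where it is added.

```shell
# Hypothetical helper, not part of ch-run: print a search path that is
# guaranteed to contain /bin.
# Assumption: when /bin is missing, it is appended at the end.
path_with_bin () {
    local path="$1"
    case ":$path:" in
        *:/bin:*)
            # /bin is already an element of the path; leave it unchanged.
            printf '%s\n' "$path"
            ;;
        *)
            # Add /bin, avoiding a leading colon if the path was empty.
            printf '%s\n' "${path:+$path:}/bin"
            ;;
    esac
}

path_with_bin "/usr/local/bin:/usr/bin"
path_with_bin "/usr/bin:/bin"
```

Wrapping the path in colons before matching lets a single pattern catch /bin at the start, middle, or end without also matching /usr/bin.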
Further reading:
The purpose of --set-env is to set environment variables within the container. Values given replace any already in the environment (i.e., inherited from the host shell) or set by earlier --set-env. This flag takes an optional argument with two possible forms:
$ ch-run --set-env=FOO=bar ...
Single straight quotes around the value (', ASCII 39) are stripped, though be aware that both single and double quotes are also interpreted by the shell. For example, the following is similar to the prior example; the double quotes are removed by the shell and the single quotes are removed by ch-run:
$ ch-run --set-env="BAZ='qux'" ...
$ cat /tmp/env.txt
FOO=bar
BAZ='qux'
$ ch-run --set-env=/tmp/env.txt ...
For directory images only (because the file is read before containerizing), guest paths can be given by prepending the image path.
$ cat Dockerfile
[...]
ENV FOO=bar
ENV BAZ=qux
[...]
$ ch-image build -t foo .
$ ch-convert foo /var/tmp/foo.sqfs
$ ch-run --set-env /var/tmp/foo.sqfs -- ...
(Note the image path is interpreted correctly, not as the --set-env argument.)
At present, there is no way to use files other than /ch/environment within SquashFS images.
Environment variables are expanded for values that look like search paths, unless --env-no-expand is given prior to --set-env. In this case, the value is a sequence of zero or more possibly-empty items separated by colon (:, ASCII 58). If an item begins with dollar sign ($, ASCII 36), then the rest of the item is the name of an environment variable. If this variable is set to a non-empty value, that value is substituted for the item; otherwise (i.e., the variable is unset or the empty string), the item is deleted, including a delimiter colon. The purpose of omitting empty expansions is to avoid surprising behavior such as an empty element in $PATH meaning the current directory.
For example, to set HOSTPATH to the search path in the current shell (this is expanded by ch-run, though letting the shell do it happens to be equivalent):
$ ch-run --set-env='HOSTPATH=$PATH' ...
To prepend /opt/bin to this current search path:
$ ch-run --set-env='PATH=/opt/bin:$PATH' ...
To prepend /opt/bin to the search path set by the Dockerfile, as retrieved from guest file /ch/environment (here we really cannot let the shell expand $PATH):
$ ch-run --set-env --set-env='PATH=/opt/bin:$PATH' ...
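The expansion rule described above can be modeled with a short shell function. This is a sketch under stated assumptions, not ch-run's actual code: the name expand_value is hypothetical, variables are looked up with printenv(1), and the function applies the rule to whatever value it is given.

```shell
# Hypothetical helper, not part of ch-run: expand a colon-separated value
# following the rule in the text. Items beginning with "$" are replaced by
# the named environment variable; items whose variable is unset or empty
# are dropped together with one delimiter colon.
expand_value () {
    local rest="$1" out="" count=0 item keep
    while :; do
        item=${rest%%:*}                 # text before the next colon
        keep=yes
        case $item in
            '$'*)
                # Remainder of the item is the variable name.
                item=$(printenv "${item#?}")
                [ -n "$item" ] || keep=no
                ;;
        esac
        if [ "$keep" = yes ]; then
            # Literal items are kept even when empty.
            if [ "$count" -gt 0 ]; then out="$out:$item"; else out=$item; fi
            count=$((count + 1))
        fi
        case $rest in
            *:*) rest=${rest#*:} ;;      # move on to the next item
            *)   break ;;                # that was the last item
        esac
    done
    printf '%s\n' "$out"
}

export BAR=bar
unset UNSET
expand_value '/opt/bin:$BAR'
expand_value 'baz:$UNSET:qux'
```

With BAR set to bar and UNSET unset, the two calls print /opt/bin:bar and baz:qux. Empty literal items are preserved, so a value such as :bar:baz:: passes through unchanged.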
Examples of valid assignments, assuming that environment variable BAR is set to bar and UNSET is unset or set to the empty string:
Assignment | Name | Value |
FOO=bar | FOO | bar |
FOO=bar=baz | FOO | bar=baz |
FLAGS=-march=foo -mtune=bar | FLAGS | -march=foo -mtune=bar |
FLAGS='-march=foo -mtune=bar' | FLAGS | -march=foo -mtune=bar |
FOO=$BAR | FOO | bar |
FOO=$BAR:baz | FOO | bar:baz |
FOO= | FOO | empty string |
FOO=$UNSET | FOO | empty string |
FOO=baz:$UNSET:qux | FOO | baz:qux (not baz::qux) |
FOO=:bar:baz:: | FOO | :bar:baz:: |
FOO='' | FOO | empty string |
FOO='''' | FOO | '' (two single quotes) |
Example invalid assignments:
Assignment | Problem |
FOO bar | no equals separator |
=bar | name cannot be empty |
Example valid assignments that are probably not what you want:
Assignment | Name | Value | Problem |
FOO="bar" | FOO | "bar" | double quotes aren’t stripped |
FOO=bar # baz | FOO | bar # baz | comments not supported |
FOO=bar\tbaz | FOO | bar\tbaz | backslashes are not special |
␣FOO=bar | ␣FOO | bar | leading space in key (␣ denotes a space) |
FOO=␣bar | FOO | ␣bar | leading space in value (␣ denotes a space) |
$FOO=bar | $FOO | bar | variables not expanded in key |
FOO=$BAR baz:qux | FOO | qux | variable BAR baz not set |
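The parsing behavior shown in the tables above (split at the first equals sign, strip one pair of single quotes around the value, leave everything else alone) can be approximated as follows. This is a sketch inferred from the table rows only; the function name parse_assignment and its output format are hypothetical, and variable expansion is deliberately left out.

```shell
# Hypothetical helper, not part of ch-run: split one assignment into name
# and value the way the tables above describe.
parse_assignment () {
    local line="$1" name value
    case $line in
        *=*) ;;
        *)   echo "no equals separator" >&2; return 1 ;;
    esac
    name=${line%%=*}       # everything before the first "="
    value=${line#*=}       # everything after it, later "=" included
    if [ -z "$name" ]; then
        echo "name cannot be empty" >&2
        return 1
    fi
    case $value in
        # Strip exactly one pair of enclosing single quotes, if present.
        \'*\') value=${value#\'}; value=${value%\'} ;;
    esac
    # No trimming: spaces, double quotes, and "#" stay part of the result.
    printf 'name=[%s] value=[%s]\n' "$name" "$value"
}

parse_assignment "FLAGS='-march=foo -mtune=bar'"
parse_assignment 'FOO="bar"'
```

The first call prints name=[FLAGS] value=[-march=foo -mtune=bar]; the second keeps the double quotes and prints name=[FOO] value=["bar"].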
The purpose of --unset-env=GLOB is to remove unwanted environment variables. The argument GLOB is a glob pattern (dialect fnmatch(3) with the FNM_EXTMATCH flag where supported); all variables with matching names are removed from the environment.
WARNING:
GLOB must be a non-empty string.
Example 1: Remove the single environment variable FOO:
$ export FOO=bar
$ env | fgrep FOO
FOO=bar
$ ch-run --unset-env=FOO $CH_TEST_IMGDIR/chtest -- env | fgrep FOO
$
Example 2: Hide from a container the fact that it’s running in a Slurm allocation, by removing all variables beginning with SLURM. You might want to do this to test an MPI program with one rank and no launcher:
$ salloc -N1
$ env | egrep '^SLURM' | wc
     44      44    1092
$ ch-run $CH_TEST_IMGDIR/mpihello-openmpi -- /hello/hello
[... long error message ...]
$ ch-run --unset-env='SLURM*' $CH_TEST_IMGDIR/mpihello-openmpi -- /hello/hello
0: MPI version:
Open MPI v3.1.3, package: Open MPI root@c897a83f6f92 Distribution, ident: 3.1.3, repo rev: v3.1.3, Oct 29, 2018
0: init ok cn001.localdomain, 1 ranks, userns 4026532530
0: send/receive ok
0: finalize ok
Example 3: Clear the environment completely (remove all variables):
$ ch-run --unset-env='*' $CH_TEST_IMGDIR/chtest -- env
$
Example 4: Remove all environment variables except for those prefixed with either WANTED_ or ALSO_WANTED_:
$ export WANTED_1=yes
$ export ALSO_WANTED_2=yes
$ export NOT_WANTED_1=no
$ ch-run --unset-env='!(WANTED_*|ALSO_WANTED_*)' $CH_TEST_IMGDIR/chtest -- env
WANTED_1=yes
ALSO_WANTED_2=yes
$
Note that some programs, such as shells, set some environment variables even if started with no init files:
$ ch-run --unset-env='*' $CH_TEST_IMGDIR/debian_9ch -- bash --noprofile --norc -c env
SHLVL=1
PWD=/
_=/usr/bin/env
$
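To preview which variable names a given GLOB would leave in place, the matching can be approximated in bash, whose extglob patterns are close to fnmatch(3) with FNM_EXTMATCH. This is a rough sketch: the helper name surviving_names is hypothetical, and bash's matcher is assumed to agree with the C library's for the patterns shown, which may not hold for every pattern.

```shell
#!/bin/bash
# Bash-specific: enable extended patterns such as !(a|b).
shopt -s extglob

# Hypothetical helper, not part of ch-run: the first argument is the glob;
# the remaining arguments are variable names. Prints the names that do NOT
# match, i.e. those that would remain in the environment.
surviving_names () {
    local glob="$1" name
    shift
    for name in "$@"; do
        # The right-hand side is unquoted so it is treated as a pattern.
        [[ $name == $glob ]] || printf '%s\n' "$name"
    done
}

surviving_names '!(WANTED_*|ALSO_WANTED_*)' WANTED_1 ALSO_WANTED_2 NOT_WANTED_1
surviving_names 'SLURM*' SLURM_JOB_ID PATH HOME
```

The first call prints WANTED_1 and ALSO_WANTED_2, matching Example 4; the second prints PATH and HOME. The glob is passed in single quotes so that the shell does not interpret it first.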
Run the command echo hello inside a Charliecloud container using the unpacked image at /data/foo:
$ ch-run /data/foo -- echo hello
hello
Run an MPI job that can use CMA to communicate:
$ srun ch-run --join /data/foo -- bar
By default, ch-run logs its command line to syslog. (This can be disabled by configuring with --disable-syslog.) This includes: (1) the invoking real UID, (2) the number of command line arguments, and (3) the arguments, separated by spaces. For example:
Dec 10 18:19:08 mybox ch-run: uid=1000 args=7: ch-run -v /var/tmp/00_tiny -- echo hello "wor l}\$d"
Logging is one of the first things done during program initialization, even before command line parsing. That is, almost all command lines are logged, even if erroneous, and there is no logging of program success or failure.
Arguments are serialized with the following procedure. The purpose is to provide a human-readable reconstruction of the command line while also allowing each argument to be recovered byte-for-byte.
The verbatim command line typed in the shell cannot be recovered, because not enough information is provided to UNIX programs. For example, echo  'foo' is given to programs as a sequence of two arguments, echo and foo; the two spaces and single quotes are removed by the shell. The zero byte, ASCII NUL, cannot appear in arguments because it would terminate the string.
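The point about unrecoverable spacing and quoting can be checked with a small shell function that prints exactly what it receives. The name show_args is hypothetical and the function has nothing to do with ch-run's logging code; it only shows what any program sees.

```shell
# Hypothetical helper: print the argument count, then each argument in
# brackets so that embedded spaces are visible.
show_args () {
    local arg
    printf 'argc=%d\n' "$#"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args echo  'foo'    # extra space and quotes in the typed command
show_args echo foo       # same argument list as the line above
```

Both calls print argc=2 followed by [echo] and [foo], so the two typed forms are indistinguishable to the program receiving them.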
If there is an error during containerization, ch-run exits with a non-zero status. If the user command is started successfully, the exit status is that of the user command, with one exception: if the image is an internally mounted SquashFS filesystem and the user command is killed by a signal, the exit status is 1 regardless of the signal value.
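For comparison, the following sketch shows how a shell normally reports a command that exits on its own versus one killed by a signal (128 plus the signal number). It does not invoke ch-run; the helper name status_of is hypothetical, and sh -c stands in for a user command. Per the text above, with an internally mounted SquashFS image the signal case would be reported as 1 instead.

```shell
# Hypothetical helper: run a command and print the exit status the shell
# observed for it.
status_of () {
    "$@"
    printf '%s\n' "$?"
}

status_of sh -c 'exit 3'           # ordinary exit: status is passed through
status_of sh -c 'kill -TERM $$'    # killed by SIGTERM (15): shell sees 128+15
```

The calls print 3 and 143 respectively. A caller that relies on the 128+N convention to detect signals should therefore treat a status of 1 from a SquashFS-backed run as ambiguous.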
If Charliecloud was obtained from your Linux distribution, use your distribution’s bug reporting procedures.
Otherwise, report bugs to: https://github.com/hpc/charliecloud/issues
Full documentation at: <https://hpc.github.io/charliecloud>
2014–2022, Triad National Security, LLC and others
2023-01-29 12:36 UTC | 0.31 |