lxc.container.conf(5)
lxc.container.conf - LXC container configuration file
LXC is the well-known and heavily tested low-level Linux container runtime. It has been in active development since 2008 and has proven itself in critical production environments world-wide. Some of its core contributors are the same people who helped to implement various well-known containerization features inside the Linux kernel.
LXC's main focus is system containers. That is, containers which offer an environment as close as possible to the one you'd get from a VM but without the overhead that comes with running a separate kernel and simulating all the hardware.
This is achieved through a combination of kernel security features such as namespaces, mandatory access control and control groups.
LXC has support for unprivileged containers. Unprivileged containers are containers that are run without any privilege. This requires support for user namespaces in the kernel that the container is run on. LXC was the first runtime to support unprivileged containers after user namespaces were merged into the mainline kernel.
In essence, user namespaces isolate given sets of UIDs and GIDs. This is achieved by establishing a mapping between a range of UIDs and GIDs on the host to a different (unprivileged) range of UIDs and GIDs in the container. The kernel will translate this mapping in such a way that inside the container all UIDs and GIDs appear as you would expect from the host whereas on the host these UIDs and GIDs are in fact unprivileged. For example, a process running as UID and GID 0 inside the container might appear as UID and GID 100000 on the host. The implementation and working details can be gathered from the corresponding user namespace man page. UID and GID mappings can be defined with the lxc.idmap key.
Linux containers are defined with a simple configuration file. Each option in the configuration file has the form key = value fitting in one line. The "#" character means the line is a comment. List options, like capabilities and cgroups options, can be used with no value to clear any previously defined values of that option.
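As a minimal sketch of the syntax described above (the hostname value is hypothetical), a configuration fragment with a comment, a plain option, and a list option that is later cleared might look like:

```
# Lines starting with "#" are comments.
lxc.uts.name = demo

# List options accumulate values...
lxc.cap.drop = sys_module
lxc.cap.drop = mknod

# ...and an empty value clears everything listed so far.
lxc.cap.drop =
```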
LXC namespaces configuration keys by using single dots. This means complex configuration keys such as lxc.net.0 expose various subkeys such as lxc.net.0.type, lxc.net.0.link, lxc.net.0.ipv6.address, and others for even more fine-grained configuration.
In order to ease administration of multiple related containers, it is possible to have a container configuration file cause another file to be loaded. For instance, network configuration can be defined in one common file which is included by multiple containers. Then, if the containers are moved to another host, only one file may need to be updated.
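A sketch of such an include, assuming the key is lxc.include as in upstream LXC and using a hypothetical path for the shared file:

```
# Shared network settings kept in one file for several containers.
# The path is hypothetical.
lxc.include = /etc/lxc/common-network.conf
```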
Allows one to set the architecture for the container. For example, set a 32-bit architecture for a container running 32-bit binaries on a 64-bit host. This fixes container scripts which rely on the architecture to do some work, such as downloading packages.
Some valid options are x86, i686, x86_64, amd64.
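For instance, assuming the key is lxc.arch as in upstream LXC, a 32-bit container on a 64-bit host could be declared as:

```
lxc.arch = i686
```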
The utsname section defines the hostname to be set for the container. That means the container can set its own hostname without changing the one from the system. That makes the hostname private for the container.
Allows one to specify the signal name or number sent to the container's init process to cleanly shut down the container. Different init systems may use different signals to perform a clean shutdown sequence. This option allows the signal to be specified in kill(1) fashion, e.g. SIGPWR, SIGRTMIN+14, SIGRTMAX-10 or a plain number. The default signal is SIGPWR.
Allows one to specify the signal name or number used to reboot the container. This option allows the signal to be specified in kill(1) fashion, e.g. SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or a plain number. The default signal is SIGINT.
Allows one to specify the signal name or number used to forcibly shut down the container. This option allows the signal to be specified in kill(1) fashion, e.g. SIGKILL, SIGRTMIN+14, SIGRTMAX-10 or a plain number. The default signal is SIGKILL.
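The three signal options can be sketched as follows, assuming the keys are lxc.signal.halt, lxc.signal.reboot and lxc.signal.stop as in upstream LXC. The chosen values are purely illustrative and only show the accepted notations (name with offset, plain name, plain number):

```
# Clean shutdown: realtime signal given as an offset from SIGRTMIN.
lxc.signal.halt = SIGRTMIN+4

# Reboot: plain signal name.
lxc.signal.reboot = SIGINT

# Forced shutdown: plain number (9 is SIGKILL on Linux).
lxc.signal.stop = 9
```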
Sets the command to use as the init system for the containers.
Sets the absolute path inside the container as the working directory for the containers. LXC will switch to this directory before executing init.
Sets the UID/GID to use for the init system, and subsequent commands. Note that using a non-root UID when booting a system container will likely not work due to missing privileges. Setting the UID/GID is mostly useful when running application containers. Defaults to: UID(0), GID(0)
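A sketch for an application container, assuming the keys are lxc.init.cmd, lxc.init.cwd, lxc.init.uid and lxc.init.gid as in upstream LXC. The binary path, directory and IDs are hypothetical:

```
# Run a single application instead of a full init system.
lxc.init.cmd = /usr/local/bin/myapp

# Switch to this directory inside the container before executing init.
lxc.init.cwd = /srv/myapp

# Run the application as an unprivileged user and group.
lxc.init.uid = 1000
lxc.init.gid = 1000
```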
Core scheduling defines if the container payload is marked as being schedulable on the same core. Doing so will cause the kernel scheduler to ensure that tasks that are not in the same group never run simultaneously on a core. This can serve as an extra security measure to prevent the container payload from using cross hyper thread attacks.
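Assuming the key is lxc.sched.core as in upstream LXC, and that the host kernel supports core scheduling, a container could request its own core scheduling domain with:

```
# 1 creates a core scheduling domain for the container, 0 does not.
lxc.sched.core = 1
```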
Configure proc filesystem for the container.
lxc.proc.oom_score_adj = 10
Allows one to specify whether a container will be destroyed on shutdown.
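Assuming the key is lxc.ephemeral as in upstream LXC, a throwaway container could be marked like this:

```
# 1 destroys the container on shutdown, 0 keeps it.
lxc.ephemeral = 1
```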
The network section defines how the network is virtualized in the container. The network virtualization acts at layer two. In order to use the network virtualization, parameters must be specified to define the network interfaces of the container. Several virtual interfaces can be assigned and used in a container even if the system has only one physical network interface.
none: will cause the container to share the host's network namespace. This means the host network devices are usable in the container. It also means that if both the container and host have upstart as init, 'halt' in a container (for instance) will shut down the host. Note that unprivileged containers do not work with this setting due to an inability to mount sysfs. An unsafe workaround would be to bind mount the host's sysfs.
empty: will create only the loopback interface.
veth: a virtual ethernet pair device is created with one side assigned to the container and the other side on the host. lxc.net.[i].veth.mode specifies the mode the veth parent will use on the host. The accepted modes are bridge and router. The mode defaults to bridge if not specified. In bridge mode the host side is attached to a bridge specified by the lxc.net.[i].link option. If the bridge link is not specified, then the veth pair device will be created but not attached to any bridge. Otherwise, the bridge has to be created on the system before starting the container. lxc won't handle any configuration outside of the container. In router mode static routes are created on the host for the container's IP addresses pointing to the host side veth interface. Additionally Proxy ARP and Proxy NDP entries are added on the host side veth interface for the gateway IPs defined in the container to allow the container to reach the host. By default, lxc chooses a name for the network device belonging to the outside of the container, but if you wish to handle this name yourself, you can tell lxc to set a specific name with the lxc.net.[i].veth.pair option (except for unprivileged containers where this option is ignored for security reasons). Static routes can be added on the host pointing to the container using the lxc.net.[i].veth.ipv4.route and lxc.net.[i].veth.ipv6.route options. Several lines specify several routes. The route is in format x.y.z.t/m, e.g. 192.168.1.0/24. In bridge mode untagged VLAN membership can be set with the lxc.net.[i].veth.vlan.id option. It accepts a special value of 'none' indicating that the container port should be removed from the bridge's default untagged VLAN. The lxc.net.[i].veth.vlan.tagged.id option can be specified multiple times to set the container's bridge port membership to one or more tagged VLANs.
vlan: a vlan interface is linked with the interface specified by the lxc.net.[i].link and assigned to the container. The vlan identifier is specified with the option lxc.net.[i].vlan.id.
macvlan: a macvlan interface is linked with the interface specified by the lxc.net.[i].link and assigned to the container. lxc.net.[i].macvlan.mode specifies the mode the macvlan will use to communicate between different macvlan on the same upper device. The accepted modes are private, vepa, bridge and passthru. In private mode, the device never communicates with any other device on the same upper_dev (default). In vepa mode, the new Virtual Ethernet Port Aggregator (VEPA) mode, it assumes that the adjacent bridge returns all frames where both source and destination are local to the macvlan port, i.e. the bridge is set up as a reflective relay. Broadcast frames coming in from the upper_dev get flooded to all macvlan interfaces in VEPA mode, local frames are not delivered locally. In bridge mode, it provides the behavior of a simple bridge between different macvlan interfaces on the same port. Frames from one interface to another one get delivered directly and are not sent out externally. Broadcast frames get flooded to all other bridge ports and to the external interface, but when they come back from a reflective relay, we don't deliver them again. Since we know all the MAC addresses, the macvlan bridge mode does not require learning or STP like the bridge module does. In passthru mode, all frames received by the physical interface are forwarded to the macvlan interface. Only one macvlan interface in passthru mode is possible for one physical interface.
ipvlan: an ipvlan interface is linked with the interface specified by the lxc.net.[i].link and assigned to the container. lxc.net.[i].ipvlan.mode specifies the mode the ipvlan will use to communicate between different ipvlan on the same upper device. The accepted modes are l3, l3s and l2. It defaults to l3 mode. In l3 mode TX processing up to L3 happens on the stack instance attached to the dependent device and packets are switched to the stack instance of the parent device for the L2 processing and routing from that instance will be used before packets are queued on the outbound device. In this mode the dependent devices will not receive nor can send multicast / broadcast traffic. In l3s mode TX processing is very similar to the L3 mode except that iptables (conn-tracking) works in this mode and hence it is L3-symmetric (L3s). This will have slightly less performance but that shouldn't matter since you are choosing this mode over plain-L3 mode to make conn-tracking work. In l2 mode TX processing happens on the stack instance attached to the dependent device and packets are switched and queued to the parent device to send devices out. In this mode the dependent devices will RX/TX multicast and broadcast (if applicable) as well. lxc.net.[i].ipvlan.isolation specifies the isolation mode. The accepted isolation values are bridge, private and vepa. It defaults to bridge. In bridge isolation mode dependent devices can cross-talk among themselves apart from talking through the parent device. In private isolation mode the port is set in private mode. i.e. port won't allow cross communication between dependent devices. In vepa isolation mode the port is set in VEPA mode. i.e. port will offload switching functionality to the external entity as described in 802.1Qbg.
phys: an already existing interface specified by the lxc.net.[i].link is assigned to the container.
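The interface types above can be combined in one container. The following sketch uses only keys named in the descriptions above. The device names br0, eth0 and eth1 and the VLAN IDs are hypothetical, and the VLAN lines assume a host bridge that is set up for VLAN-aware operation:

```
# veth in bridge mode; br0 must already exist on the host.
lxc.net.0.type = veth
lxc.net.0.veth.mode = bridge
lxc.net.0.link = br0
# Untagged VLAN 10, plus tagged membership in VLANs 20 and 30.
lxc.net.0.veth.vlan.id = 10
lxc.net.0.veth.vlan.tagged.id = 20
lxc.net.0.veth.vlan.tagged.id = 30

# macvlan on eth0; bridge mode lets macvlan interfaces on the
# same upper device talk to each other directly.
lxc.net.1.type = macvlan
lxc.net.1.link = eth0
lxc.net.1.macvlan.mode = bridge

# ipvlan on eth1 in l2 mode, with cross-talk between dependent
# devices disabled.
lxc.net.2.type = ipvlan
lxc.net.2.link = eth1
lxc.net.2.ipvlan.mode = l2
lxc.net.2.ipvlan.isolation = private
```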
up: activates the interface.
In addition to the information available to all hooks, the following information is provided to the script:
Whether this information is provided in the form of environment variables or as arguments to the script depends on the value of lxc.hook.version. If set to 1 then information is provided in the form of environment variables. If set to 0 information is provided as arguments to the script.
Standard output from the script is logged at debug level. Standard error is not logged, but can be captured by the hook redirecting its standard error to standard output.
In addition to the information available to all hooks, the following information is provided to the script:
Whether this information is provided in the form of environment variables or as arguments to the script depends on the value of lxc.hook.version. If set to 1 then information is provided in the form of environment variables. If set to 0 information is provided as arguments to the script.
Standard output from the script is logged at debug level. Standard error is not logged, but can be captured by the hook redirecting its standard error to standard output.
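A sketch of network scripts that receive their information through environment variables, assuming the keys are lxc.net.[i].script.up and lxc.net.[i].script.down as in upstream LXC. The script paths are hypothetical:

```
# 1 passes hook information as environment variables,
# 0 passes it as arguments to the script.
lxc.hook.version = 1

# Scripts run from the host when the interface is created and destroyed.
lxc.net.0.script.up = /usr/local/lib/lxc/net-up.sh
lxc.net.0.script.down = /usr/local/lib/lxc/net-down.sh
```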
For stricter isolation the container can have its own private instance of the pseudo tty.
If the container is configured with a root filesystem and the inittab file is setup to use the console, you may want to specify where the output of this console goes.
This option is useful if the container is configured with a root filesystem and the inittab file is setup to launch a getty on the ttys. The option specifies the number of ttys to be available for the container. The number of gettys in the inittab file of the container should not be greater than the number of ttys specified in this option, otherwise the excess getty sessions will die and respawn indefinitely giving annoying messages on the console or in /var/log/messages.
LXC consoles are provided through Unix98 PTYs created on the host and bind-mounted over the expected devices in the container. By default, they are bind-mounted over /dev/console and /dev/ttyN. This can prevent package upgrades in the guest. Therefore you can specify a directory location (under /dev) under which LXC will create the files and bind-mount over them. These will then be symbolically linked to /dev/console and /dev/ttyN. A package upgrade can then succeed as it is able to remove and replace the symbolic links.
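The console and tty settings described above might be combined as in the following sketch. It assumes the keys are lxc.pty.max, lxc.console.logfile, lxc.tty.max and lxc.tty.dir as in upstream LXC. The log path and the numbers are hypothetical:

```
# Private pseudo tty instance for the container.
lxc.pty.max = 1024

# Write console output to a file on the host.
lxc.console.logfile = /var/log/lxc/web01-console.log

# Make four ttys available; the container should not start
# more gettys than this.
lxc.tty.max = 4

# Create the console and tty devices under /dev/lxc and symlink
# them to /dev/console and /dev/ttyN.
lxc.tty.dir = lxc
```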
By default, lxc creates a few symbolic links (fd,stdin,stdout,stderr) in the container's /dev directory but does not automatically create device node entries. This allows the container's /dev to be set up as needed in the container rootfs. If lxc.autodev is set to 1, then after mounting the container's rootfs LXC will mount a fresh tmpfs under /dev (limited to 500K by default, unless defined in lxc.autodev.tmpfs.size) and fill in a minimal set of initial devices. This is generally required when starting a container containing a "systemd" based "init" but may be optional at other times. Additional devices in the containers /dev directory may be created through the use of the lxc.hook.autodev hook.
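A minimal sketch for a container whose init expects a populated /dev. The size value is an illustrative assumption, given in bytes:

```
# Mount a fresh tmpfs on /dev and fill in a minimal set of devices.
lxc.autodev = 1

# Raise the /dev tmpfs limit above the 500K default.
lxc.autodev.tmpfs.size = 1000000
```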
The mount points section specifies the different places to be mounted. These mount points will be private to the container and won't be visible to processes running outside of the container. This is useful to mount /etc, /var or /home, for example.
NOTE - LXC will generally ensure that mount targets and relative bind-mount sources are properly confined under the container root, to avoid attacks involving over-mounting host directories and files. (Symbolic links in absolute mount sources are ignored.) However, if the container configuration first mounts a directory which is under the control of the container user, such as /home/joe, into the container at some path, and then mounts under path, then a TOCTTOU attack would be possible where the container user modifies a symbolic link under their home directory at just the right time.
proc proc proc nodev,noexec,nosuid 0 0
Will mount a proc filesystem under the container's /proc, regardless of where the root filesystem comes from. This is resilient to block device backed filesystems as well as container cloning.
Note that when mounting a filesystem from an image file or block device the third field (fs_vfstype) cannot be auto as with mount(8) but must be explicitly specified.
dev/null proc/kcore none bind,relative 0 0
Will expand dev/null to ${LXC_ROOTFS_MOUNT}/dev/null, and mount it to proc/kcore inside the container.
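The same fstab-style fields can be given directly in the container configuration through lxc.mount.entry. In this sketch the host directory is hypothetical, and the create=dir and optional mount options are assumed to behave as in upstream LXC (create the target if missing, and do not fail if the mount cannot be performed):

```
# Bind-mount a host directory at srv/shared under the container root.
# The target has no leading slash, so it is taken relative to the rootfs.
lxc.mount.entry = /srv/shared srv/shared none bind,create=dir,optional 0 0

# Mask /proc/kcore with /dev/null from inside the container rootfs.
lxc.mount.entry = dev/null proc/kcore none bind,relative 0 0
```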
If cgroup namespaces are enabled, then any cgroup auto-mounting request will be ignored, since the container can mount the filesystems itself, and automounting can confuse the container init.
Note that if automatic mounting of the cgroup filesystem is enabled, the tmpfs under /sys/fs/cgroup will always be mounted read-write (but for the :mixed and :ro cases, the individual hierarchies, /sys/fs/cgroup/$hierarchy, will be read-only). This is in order to work around a quirk in Ubuntu's mountall(8) command that will cause containers to wait for user input at boot if /sys/fs/cgroup is mounted read-only and the container can't remount it read-write due to a lack of CAP_SYS_ADMIN.
Examples:
lxc.mount.auto = proc sys cgroup
lxc.mount.auto = proc:rw sys:rw cgroup-full:rw
The root file system of the container can be different than that of the host system.
For directory or simple block-device backed containers, a pathname can be used. If the rootfs is backed by a nbd device, then nbd:file:1 specifies that file should be attached to a nbd device, and partition 1 should be mounted as the rootfs. nbd:file specifies that the nbd device itself should be mounted. overlayfs:/lower:/upper specifies that the rootfs should be an overlay with /upper being mounted read-write over a read-only mount of /lower. For overlay multiple /lower directories can be specified. loop:/file tells lxc to attach /file to a loop device and mount the loop device.
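The forms described above could be written as follows, assuming the key is lxc.rootfs.path as used in the examples at the end of this page. All paths are hypothetical, and only one form would be active for a given container:

```
# Plain directory backing.
lxc.rootfs.path = dir:/var/lib/lxc/web01/rootfs

# Alternatives, commented out:
# Attach an image file to a loop device and mount it.
# lxc.rootfs.path = loop:/srv/images/web01.img
# Read-write /upper layered over a read-only /lower.
# lxc.rootfs.path = overlayfs:/srv/base/rootfs:/var/lib/lxc/web01/delta
# Attach a file to an nbd device and mount partition 1.
# lxc.rootfs.path = nbd:/srv/images/web01.img:1
```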
The control group section contains the configuration for the different subsystems. lxc does not check the correctness of the subsystem name. This has the disadvantage of not detecting configuration errors until the container is started, but has the advantage of permitting any future subsystem.
The kernel implementation of cgroups has changed significantly over the years. Linux 4.5 added support for a new cgroup filesystem, usually referred to as "cgroup2" or the "unified hierarchy". Since then the old cgroup filesystem is usually referred to as "cgroup1" or the "legacy hierarchies". Please see the cgroups manual page for a detailed explanation of the differences between the two versions.
LXC distinguishes settings for the legacy and the unified hierarchy by using different configuration key prefixes. To alter settings for controllers in a legacy hierarchy the key prefix lxc.cgroup. must be used and in order to alter the settings for a controller in the unified hierarchy the lxc.cgroup2. key must be used. Note that LXC will ignore lxc.cgroup. settings on systems that only use the unified hierarchy. Conversely, it will ignore lxc.cgroup2. options on systems that only use legacy hierarchies.
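Because each prefix is ignored on hosts that lack the corresponding hierarchy, a configuration meant to work on both kinds of host can carry both sets of keys. The following sketch uses the kernel's own control file names for each version; the limits themselves are illustrative:

```
# Legacy hierarchies (cgroup1).
lxc.cgroup.memory.limit_in_bytes = 512M
lxc.cgroup.cpu.shares = 512

# Unified hierarchy (cgroup2).
lxc.cgroup2.memory.max = 512M
lxc.cgroup2.cpu.weight = 50
lxc.cgroup2.pids.max = 256
```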
At its core a cgroup hierarchy is a way to hierarchically organize processes. Usually a cgroup hierarchy will have one or more "controllers" enabled. A "controller" in a cgroup hierarchy is usually responsible for distributing a specific type of system resource along the hierarchy. Controllers include the "pids" controller, the "cpu" controller, the "memory" controller and others. Some controllers however do not fall into the category of distributing a system resource, instead they are often referred to as "utility" controllers. One utility controller is the device controller. Instead of distributing a system resource it allows one to manage device access.
In the legacy hierarchy the device controller was implemented like most other controllers as a set of files that could be written to. These files were named "devices.allow" and "devices.deny". The legacy device controller allowed the implementation of both "allowlists" and "denylists".
An allowlist is a device program that by default blocks access to all devices. In order to access specific devices "allow rules" for particular devices or device classes must be specified. In contrast, a denylist is a device program that by default allows access to all devices. In order to restrict access to specific devices "deny rules" for particular devices or device classes must be specified.
In the unified cgroup hierarchy the implementation of the device controller has completely changed. Instead of files to read from and write to, an eBPF program of type BPF_PROG_TYPE_CGROUP_DEVICE can be attached to a cgroup. Even though the kernel implementation has changed completely, LXC tries to allow for the same semantics to be followed in the legacy device cgroup and the unified eBPF-based device controller. The following paragraphs explain the semantics for the unified eBPF-based device controller.
As mentioned the format for specifying device rules for the unified eBPF-based device controller is the same as for the legacy cgroup device controller; only the configuration key prefix has changed. Specifically, device rules for the legacy cgroup device controller are specified via lxc.cgroup.devices.allow and lxc.cgroup.devices.deny whereas for the cgroup2 eBPF-based device controller lxc.cgroup2.devices.allow and lxc.cgroup2.devices.deny must be used.
lxc.cgroup2.devices.deny = a
will cause LXC to instruct the kernel to block access to all devices by default. To grant access to devices, allow rules must be added via the lxc.cgroup2.devices.allow key. This is referred to as an "allowlist" device program.
lxc.cgroup2.devices.allow = a
will cause LXC to instruct the kernel to allow access to all devices by default. To deny access to devices, deny rules must be added via the lxc.cgroup2.devices.deny key. This is referred to as a "denylist" device program.
For example the set of rules:
lxc.cgroup2.devices.deny = a
lxc.cgroup2.devices.allow = c *:* m
lxc.cgroup2.devices.allow = b *:* m
lxc.cgroup2.devices.allow = c 1:3 rwm
implements an allowlist device program, i.e. the kernel will block access to all devices not specifically allowed in this list. This particular program states that all character and block devices may be created but only /dev/null might be read or written.
If we instead switch to the following set of rules:
lxc.cgroup2.devices.allow = a
lxc.cgroup2.devices.deny = c *:* m
lxc.cgroup2.devices.deny = b *:* m
lxc.cgroup2.devices.deny = c 1:3 rwm
then LXC would instruct the kernel to implement a denylist, i.e. the kernel will allow access to all devices not specifically denied in this list. This particular program states that no character devices or block devices may be created and that /dev/null is not allowed to be read, written, or created.
Now consider the same program but followed by a "global rule" which determines the type of device program (allowlist or denylist) as explained above:
lxc.cgroup2.devices.allow = a
lxc.cgroup2.devices.deny = c *:* m
lxc.cgroup2.devices.deny = b *:* m
lxc.cgroup2.devices.deny = c 1:3 rwm
lxc.cgroup2.devices.allow = a
The last line will cause LXC to reset the device list without changing the type of device program.
If we specify:
lxc.cgroup2.devices.allow = a
lxc.cgroup2.devices.deny = c *:* m
lxc.cgroup2.devices.deny = b *:* m
lxc.cgroup2.devices.deny = c 1:3 rwm
lxc.cgroup2.devices.deny = a
instead, then the last line will cause LXC to reset the device list and switch from a denylist program to an allowlist program.
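Putting the rules above together, a slightly larger allowlist for the unified hierarchy might look like the following sketch. The device numbers are the standard Linux major:minor pairs for the named devices, and the selection itself is only illustrative. Comments are kept on their own lines:

```
# Start from an allowlist: block everything by default.
lxc.cgroup2.devices.deny = a

# Allow creating (mknod) any character or block device node.
lxc.cgroup2.devices.allow = c *:* m
lxc.cgroup2.devices.allow = b *:* m

# /dev/null
lxc.cgroup2.devices.allow = c 1:3 rwm
# /dev/zero
lxc.cgroup2.devices.allow = c 1:5 rwm
# /dev/random and /dev/urandom
lxc.cgroup2.devices.allow = c 1:8 rwm
lxc.cgroup2.devices.allow = c 1:9 rwm
# /dev/tty and /dev/ptmx
lxc.cgroup2.devices.allow = c 5:0 rwm
lxc.cgroup2.devices.allow = c 5:2 rwm
# /dev/pts/*
lxc.cgroup2.devices.allow = c 136:* rwm
```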
Capabilities can be dropped in the container if it is run as root.
A namespace can be cloned (lxc.namespace.clone), kept (lxc.namespace.keep) or shared (lxc.namespace.share.[namespace identifier]).
To create a new mount, net and ipc namespace set lxc.namespace.clone=mount net ipc.
To keep the network, user and ipc namespace set lxc.namespace.keep=user net ipc.
Note that sharing pid namespaces will likely not work with most init systems.
Note that if the container requests a new user namespace and the container wants to inherit the network namespace it needs to inherit the user namespace as well.
To inherit the namespace from another process set the lxc.namespace.share.[namespace identifier] to the PID of the process, e.g. lxc.namespace.share.net=42.
To inherit the namespace from another container set the lxc.namespace.share.[namespace identifier] to the name of the container, e.g. lxc.namespace.share.pid=c3.
To inherit the namespace from another container located in a different path than the standard liblxc path set the lxc.namespace.share.[namespace identifier] to the full path to the container, e.g. lxc.namespace.share.user=/opt/c3.
In order to inherit namespaces the caller needs to have sufficient privilege over the process or container.
Note that sharing pid namespaces between system containers will likely not work with most init systems.
Note that if two processes are in different user namespaces and one process wants to inherit the other's network namespace it usually needs to inherit the user namespace as well.
Note that without careful additional configuration of an LSM, sharing user+pid namespaces with a task may allow that task to escalate privileges to that of the task calling liblxc.
The soft and hard resource limits for the container can be changed. Unprivileged containers can only lower them. Resources which are not explicitly specified will be inherited.
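A sketch of resource limits, assuming the key form lxc.prlimit.[limit name] from upstream LXC, where the limit name is the RLIMIT_ name in lowercase without its prefix and the value is either soft:hard or a single value used for both. The numbers are illustrative:

```
# RLIMIT_NOFILE: soft limit 1024, hard limit 4096.
lxc.prlimit.nofile = 1024:4096

# RLIMIT_CORE: disable core dumps (soft and hard both 0).
lxc.prlimit.core = 0

# RLIMIT_NPROC: no limit; an unprivileged container can only
# lower limits, so this form needs a privileged container.
lxc.prlimit.nproc = unlimited
```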
Configure kernel parameters for the container.
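Assuming the key form lxc.sysctl.[kernel parameter name] from upstream LXC, where the name matches an entry under /proc/sys, kernel parameters could be set like this. The values are illustrative, and only namespaced sysctls can meaningfully be set per container:

```
# /proc/sys/net/ipv4/ip_forward in the container's network namespace.
lxc.sysctl.net.ipv4.ip_forward = 1

# /proc/sys/kernel/msgmax in the container's IPC namespace.
lxc.sysctl.kernel.msgmax = 65536
```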
If lxc was compiled and installed with apparmor support, and the host system has apparmor enabled, then the apparmor profile under which the container should be run can be specified in the container configuration. The default is lxc-container-default-cgns if the host kernel is cgroup namespace aware, or lxc-container-default otherwise.
lxc.apparmor.profile = unconfined
If the apparmor profile should remain unchanged (i.e. if you are nesting containers and are already confined), then use
lxc.apparmor.profile = unchanged
If you instruct LXC to generate the apparmor profile, then use
lxc.apparmor.profile = generated
If this flag is 0 (default), then the container will not be started if the kernel lacks the apparmor mount features, so that a regression after a kernel upgrade will be detected. To start the container under partial apparmor protection, set this flag to 1.
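Assuming the flag described above is lxc.apparmor.allow_incomplete, as in upstream LXC, a container that should still start on a kernel without the apparmor mount features could be configured as:

```
lxc.apparmor.profile = generated

# Accept partial apparmor protection instead of refusing to start.
lxc.apparmor.allow_incomplete = 1
```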
If lxc was compiled and installed with SELinux support, and the host system has SELinux enabled, then the SELinux context under which the container should be run can be specified in the container configuration. The default is unconfined_t, which means that lxc will not attempt to change contexts. See /usr/share/lxc/selinux/lxc.te for an example policy and more information.
lxc.selinux.context = system_u:system_r:lxc_t:s0:c22
lxc.selinux.context.keyring = system_u:system_r:lxc_t:s0:c22
The Linux Keyring facility is primarily a way for various kernel components to retain or cache security data, authentication keys, encryption keys, and other data in the kernel. By default lxc will create a new session keyring for the started application.
lxc.keyring.session = 0
A container can be started with a reduced set of available system calls by loading a seccomp profile at startup. The seccomp configuration file must begin with a version number on the first line, a policy type on the second line, followed by the configuration.
Versions 1 and 2 are currently supported. In version 1, the policy is a simple allowlist. The second line therefore must read "allowlist", with the rest of the file containing one (numeric) syscall number per line. Each syscall number is allowlisted, while every unlisted number is denylisted for use in the container.
In version 2, the policy may be denylist or allowlist, supports per-rule and per-policy default actions, and supports per-architecture system call resolution from textual names.
An example denylist policy, in which all system calls are allowed except for mknod, which will simply do nothing and return 0 (success), looks like:
2
denylist
mknod errno 0
ioctl notify
Specifying "errno" as action will cause LXC to register a seccomp filter that will cause a specific errno to be returned to the caller. The errno value can be specified after the "errno" action word.
Specifying "notify" as action will cause LXC to register a seccomp listener and retrieve a listener file descriptor from the kernel. When a syscall is made that is registered as "notify", the kernel will generate a poll event and send a message over the file descriptor. The caller can read this message and inspect the syscall, including its arguments. Based on this information the caller is expected to send back a message informing the kernel which action to take. Until that message is sent the kernel will block the calling process. The format of the messages to read and send is documented in seccomp itself.
With PR_SET_NO_NEW_PRIVS active execve() promises not to grant privileges to do anything that could not have been done without the execve() call (for example, rendering the set-user-ID and set-group-ID mode bits, and file capabilities non-functional). Once set, this bit cannot be unset. The setting of this bit is inherited by children created by fork() and clone(), and preserved across execve(). Note that PR_SET_NO_NEW_PRIVS is applied after the container has changed into its intended AppArmor profile or SELinux context.
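A sketch that loads a seccomp policy and enables PR_SET_NO_NEW_PRIVS, assuming the keys are lxc.seccomp.profile and lxc.no_new_privs as in upstream LXC. The policy path is hypothetical:

```
# Policy file in the format described above (version line,
# policy type line, then the rules).
lxc.seccomp.profile = /var/lib/lxc/web01/seccomp.policy

# 1 sets PR_SET_NO_NEW_PRIVS for the container, 0 leaves it unset.
lxc.no_new_privs = 1
```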
A container can be started in a private user namespace with user and group id mappings. For instance, you can map userid 0 in the container to userid 200000 on the host. The root user in the container will be privileged in the container, but unprivileged on the host. Normally a system container will want a range of ids, so you would map, for instance, user and group ids 0 through 20,000 in the container to the ids 200,000 through 220,000.
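The mapping from this example can be sketched with lxc.idmap, whose fields are the ID type (u or g), the first ID in the container, the first ID on the host, and the size of the range. With a range of 20000, container ID c corresponds to host ID 200000 + c, so container IDs 0 through 19999 cover host IDs 200000 through 219999:

```
lxc.idmap = u 0 200000 20000
lxc.idmap = g 0 200000 20000
```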
Container hooks are programs or scripts which can be executed at various times in a container's lifetime.
When a container hook is executed, additional information is passed along. The lxc.hook.version argument can be used to determine if the following arguments are passed as command line arguments or through environment variables. The arguments are:
The following environment variables are set:
Standard output from the hooks is logged at debug level. Standard error is not logged, but can be captured by the hook redirecting its standard error to standard output.
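A sketch of hook registration, assuming the hook keys lxc.hook.pre-start and lxc.hook.post-stop from upstream LXC. The script paths are hypothetical:

```
# Pass hook information through environment variables.
lxc.hook.version = 1

# Run on the host before the container is started.
lxc.hook.pre-start = /usr/local/lib/lxc/hooks/prepare.sh

# Run on the host after the container has stopped.
lxc.hook.post-stop = /usr/local/lib/lxc/hooks/cleanup.sh
```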
A number of environment variables are made available to the startup hooks to provide configuration information and assist in the functioning of the hooks. Not all variables are valid in all contexts. In particular, all paths are relative to the host system and, as such, not valid during the lxc.hook.start hook.
Logging can be configured on a per-container basis. By default, depending upon how the lxc package was compiled, container startup is logged only at the ERROR level, and logged to a file named after the container (with '.log' appended) either under the container path, or under /var/log/lxc.
Both the default log level and the log file can be specified in the container configuration file, overriding the default behavior. Note that the configuration file entries can in turn be overridden by the command line options to lxc-start.
Note that when a script (such as either a hook script or a network interface up or down script) is called, the script's standard output is logged at level 1, debug.
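A sketch of per-container logging, assuming the keys are lxc.log.level and lxc.log.file as in upstream LXC. Level 1 is debug as stated above, and the log path is hypothetical:

```
# Log at debug level so hook and network script output is captured.
lxc.log.level = 1

# Write the log to an explicit file instead of the default location.
lxc.log.file = /var/log/lxc/web01.log
```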
The autostart options support marking which containers should be auto-started and in what order. These options may be used by LXC tools directly or by external tooling provided by the distributions.
Each container can be part of any number of groups or no group at all. Two groups are special. One is the NULL group, i.e. the container does not belong to any group. The other group is the "onboot" group.
When the system boots with the LXC service enabled, it will first attempt to boot any containers with lxc.start.auto == 1 that are members of the "onboot" group. The startup will be in order of lxc.start.order. If an lxc.start.delay has been specified, that delay will be honored before attempting to start the next container to give the current container time to begin initialization and reduce overloading the host system. After starting the members of the "onboot" group, the LXC system will proceed to boot containers with lxc.start.auto == 1 which are not members of any group (the NULL group) and proceed as with the onboot group.
If you want to pass environment variables into the container (that is, environment variables which will be available to init and all of its descendants), you can use lxc.environment parameters to do so. Be careful that you do not pass in anything sensitive; any process in the container which doesn't have its environment scrubbed will have these variables available to it, and environment variables are always available via /proc/PID/environ.
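The boot sequence above might be configured as in this sketch, assuming group membership is set with lxc.group as in upstream LXC. The order and delay values are illustrative:

```
# Start this container automatically at boot.
lxc.start.auto = 1

# Position among auto-started containers.
lxc.start.order = 10

# Seconds to wait after starting this container before the next one.
lxc.start.delay = 5

# Member of the special "onboot" group, which is started first.
lxc.group = onboot
```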
This configuration parameter can be specified multiple times; once for each environment variable you wish to configure.
lxc.environment = APP_ENV=production
lxc.environment = SYSLOG_SERVER=192.0.2.42
It is possible to inherit host environment variables by setting the name of the variable without a "=" sign. For example:
lxc.environment = PATH
In addition to the few examples given below, you will find other example configuration files in /usr/share/doc/lxc/examples.
This configuration sets up a container to use a veth pair device with one side plugged to a bridge br0 (which has been configured before on the system by the administrator). The virtual network device visible in the container is renamed to eth0.
lxc.uts.name = myhostname
lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = br0
lxc.net.0.name = eth0
lxc.net.0.hwaddr = 4a:49:43:49:79:bf
lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
This configuration will map both user and group ids in the range 0-9999 in the container to the ids 100000-109999 on the host.
lxc.idmap = u 0 100000 10000
lxc.idmap = g 0 100000 10000
This configuration will set up several control groups for the application: cpuset.cpus restricts usage to the defined CPUs, cpu.shares prioritizes the control group, and devices.allow makes the specified devices usable.
lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 1234
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rw
lxc.cgroup.devices.allow = b 8:0 rw
This example shows a complex configuration that builds a complex network stack, uses the control groups, sets a new hostname, mounts some locations and changes the root file system.
lxc.uts.name = complex
lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = br0
lxc.net.0.hwaddr = 4a:49:43:49:79:bf
lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
lxc.net.0.ipv6.address = 2003:db8:1:0:214:5432:feab:3588
lxc.net.1.type = macvlan
lxc.net.1.flags = up
lxc.net.1.link = eth0
lxc.net.1.hwaddr = 4a:49:43:49:79:bd
lxc.net.1.ipv4.address = 10.2.3.4/24
lxc.net.1.ipv4.address = 192.168.10.125/24
lxc.net.1.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3596
lxc.net.2.type = phys
lxc.net.2.flags = up
lxc.net.2.link = random0
lxc.net.2.hwaddr = 4a:49:43:49:79:ff
lxc.net.2.ipv4.address = 10.2.3.6/24
lxc.net.2.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3297
lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 1234
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rw
lxc.cgroup.devices.allow = b 8:0 rw
lxc.mount.fstab = /etc/fstab.complex
lxc.mount.entry = /lib /root/myrootfs/lib none ro,bind 0 0
lxc.rootfs.path = dir:/mnt/rootfs.complex
lxc.rootfs.options = idmap=container
lxc.cap.drop = sys_module mknod setuid net_raw
lxc.cap.drop = mac_override
lxc(7), lxc-create(1), lxc-copy(1), lxc-destroy(1), lxc-start(1), lxc-stop(1), lxc-execute(1), lxc-console(1), lxc-monitor(1), lxc-wait(1), lxc-cgroup(1), lxc-ls(1), lxc-info(1), lxc-freeze(1), lxc-unfreeze(1), lxc-attach(1), lxc.conf(5)
Daniel Lezcano <daniel.lezcano@free.fr>
2023-11-30