Linux Containers, or LXC, is a Linux feature that allows Linux to run one or more isolated virtual systems (with their own network interfaces, process namespace, user namespace, and power state) using a single Linux kernel on a single server.


== Status ==


As of Linux kernel 3.1.5, LXC is usable for isolating your own private workloads from one another. It is not yet ready to isolate potentially malicious users from one another or the host system. For a more mature containers solution that is appropriate for hosting environments, see [[OpenVZ]].


LXC containers don't yet have their own system uptime, and they see everything that's in the host's <tt>dmesg</tt> output, among other things. But in general, the technology works.


== Basic Info ==

* Linux Containers are based on:
** Kernel namespaces for resource isolation
** CGroups for resource limitation and accounting

{{Package|app-emulation/lxc}} is the userspace tool for Linux containers.

== Control groups ==

* Control groups (cgroups) have been in the kernel since 2.6.24
** Allow aggregation of tasks and their children
** Subsystems (cpuset, memory, blkio, ...)
** accounting - to measure how many resources a group of tasks uses
** resource limiting - groups can be set to not exceed a set memory limit
** prioritization - some groups may get a larger share of CPU

** control - freezing/unfreezing of cgroups, checkpointing and restarting
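To see which of these subsystems the running kernel actually provides, <tt>/proc/cgroups</tt> can be read directly. The following is only a minimal sketch: the function name <tt>list_cgroup_subsystems</tt> is made up for this example, and it assumes the usual four-column layout of that file (subsystem name, hierarchy, number of cgroups, enabled flag).

```shell
# List the cgroup subsystems that are enabled in the running kernel.
# $1 (optional): file in /proc/cgroups format; defaults to /proc/cgroups.
list_cgroup_subsystems() {
    # Skip the "#subsys_name ..." header and keep rows whose 4th column is 1.
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}
```

Calling <tt>list_cgroup_subsystems</tt> without arguments on the host prints one subsystem name per line.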

== Configuring the Funtoo Host System ==

=== Install LXC kernel ===

Any kernel beyond 3.1.5 will probably work. Personally I prefer {{Package|sys-kernel/gentoo-sources}}, as these have support for all the namespaces without sacrificing, for example, XFS, FUSE or NFS support. These checks were introduced later, starting from kernel 3.5; this could also mean that the user namespace is not working optimally.

* User namespace (EXPERIMENTAL) depends on EXPERIMENTAL and on UIDGID_CONVERTED
** config UIDGID_CONVERTED
*** True if all of the selected software components are known to have uid_t and gid_t converted to kuid_t and kgid_t where appropriate and are otherwise safe to use with the user namespace.
** As of kernel 3.10.xx, all of the above options are safe to use with user namespaces except XFS_FS. Therefore, with kernel >= 3.10.xx, answer XFS_FS = n if you want user namespace support.
** In your kernel source directory, check init/Kconfig to find out what UIDGID_CONVERTED depends on.

==== Kernel configuration ====

These options should be enabled in your kernel to take full advantage of LXC.

Once you have lxc installed, you can check your kernel config with:

<console>
# ##i##CONFIG=/path/to/config /usr/sbin/lxc-checkconfig

</console>
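Before lxc is installed, a rough check can be done by hand with <tt>grep</tt>. The sketch below is an illustration only: the function name <tt>missing_lxc_options</tt> is hypothetical, and the option list (the namespace and cgroup symbols) is an assumption chosen for this example, not the list that <tt>lxc-checkconfig</tt> itself uses.

```shell
# Report which container-related options are missing from a kernel config.
# $1: path to a kernel .config file. Prints one missing option per line.
# The option list is an assumption for illustration, not lxc-checkconfig's own.
missing_lxc_options() {
    for opt in NAMESPACES UTS_NS IPC_NS PID_NS NET_NS USER_NS CGROUPS; do
        grep -q "^CONFIG_${opt}=y" "$1" || echo "CONFIG_${opt}"
    done
}
```

For example, <tt>missing_lxc_options /usr/src/linux/.config</tt> prints nothing when every option in the list is built in.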

=== Emerge lxc ===

<console>
# ##i##emerge app-emulation/lxc

</console>

=== Configure Networking For Container ===

Typically, one uses a bridge to allow containers to connect to the network. This is how to do it under Funtoo Linux:

# Create a bridge using the Funtoo network configuration scripts. Name the bridge something like <tt>brwan</tt> (using <tt>/etc/init.d/netif.brwan</tt>). Configure your bridge to have an IP address.
# Make your physical interface, such as <tt>eth0</tt>, an interface with no IP address (use the Funtoo <tt>interface-noip</tt> template).
# Make <tt>netif.eth0</tt> a slave of <tt>netif.brwan</tt> in <tt>/etc/conf.d/netif.brwan</tt>.
# Enable your new bridged network and make sure it is functioning properly on the host.

You will now be able to configure LXC to automatically add your container's virtual ethernet interface to the bridge when it starts, which will connect it to your network.
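The bridge steps above can be sketched as a small helper that writes the two <tt>conf.d</tt> files. This is a hedged example, not the canonical setup: the function name <tt>write_bridge_conf</tt> is made up, the addresses are placeholders, and the variable names (<tt>template</tt>, <tt>ipaddr</tt>, <tt>gateway</tt>, <tt>slaves</tt>) are assumed to match what the Funtoo network scripts expect. It writes into a directory of your choice so that the result can be reviewed before it is copied to <tt>/etc/conf.d</tt>.

```shell
# Write Funtoo netif settings for a bridge "brwan" with "eth0" as its slave.
# $1: directory to write into (use a scratch directory first, then copy the
#     results to /etc/conf.d); $2: bridge address in CIDR form; $3: gateway.
write_bridge_conf() {
    dir=$1
    # Physical interface: carries traffic for the bridge, no IP of its own.
    printf 'template="interface-noip"\n' > "$dir/netif.eth0"
    # Bridge: holds the host's IP address and enslaves eth0.
    cat > "$dir/netif.brwan" <<EOF
template="bridge"
ipaddr="$2"
gateway="$3"
slaves="netif.eth0"
EOF
}
```

A call such as <tt>write_bridge_conf /tmp/netif 10.0.1.200/24 10.0.1.1</tt> produces both files. The matching init scripts still have to exist in <tt>/etc/init.d</tt> and be added to a runlevel before the bridge comes up at boot.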

== Setting up a Funtoo Linux LXC Container ==

Here are the steps required to get Funtoo Linux running <i>inside</i> a container. The steps below show you how to set up a container using an existing Funtoo Linux OpenVZ template. It is now also possible to use [[Metro]] to build an lxc container tarball directly, which saves you manual configuration steps and provides an <tt>/etc/fstab.lxc</tt> file that you can use for your host container config. See [[Metro Recipes]] for info on how to use Metro to generate an lxc container.

=== Create and Configure Container Filesystem ===

# Start with a Funtoo LXC template, and unpack it to a directory such as <tt>/lxc/funtoo0/rootfs/</tt>.
# Create an empty <tt>/lxc/funtoo0/fstab</tt> file.
# Ensure that the <tt>c1</tt> line is uncommented (enabled) and the <tt>c2</tt> through <tt>c6</tt> lines are disabled in <tt>/lxc/funtoo0/rootfs/etc/inittab</tt>.

That's almost all you need to get the container filesystem ready to start.
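The three filesystem steps can be scripted roughly as follows. This is a sketch under assumptions: both function names are invented for this example, GNU <tt>tar</tt> and GNU <tt>sed</tt> are assumed, and the template is assumed to ship an <tt>etc/inittab</tt> whose getty lines start with <tt>c1:</tt> to <tt>c6:</tt>.

```shell
# Comment out the c2..c6 getty lines in an inittab, leaving c1 enabled.
# $1: path to the container's inittab.
disable_extra_gettys() {
    sed -i 's/^c[2-6]:/#&/' "$1"
}

# Unpack a template tarball and prepare the container directory.
# $1: template tarball, $2: container directory (e.g. /lxc/funtoo0).
prepare_container_fs() {
    mkdir -p "$2/rootfs" &&
    tar -xpf "$1" -C "$2/rootfs" &&
    : > "$2/fstab" &&
    disable_extra_gettys "$2/rootfs/etc/inittab"
}
```

Note that <tt>: > "$2/fstab"</tt> truncates an existing <tt>fstab</tt>, so run the helper only on a fresh container directory.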

=== Create Container Configuration Files ===

Create the following files:

==== <tt>/lxc/funtoo0/config</tt> ====

Also create a symlink from <tt>/lxc/funtoo0/config</tt> to <tt>/etc/lxc/funtoo0/config</tt>:

<console>
###i## mkdir /etc/lxc/funtoo0
###i## ln -s /lxc/funtoo0/config /etc/lxc/funtoo0/config

</console>


{{Fancynote| Daniel Robbins needs to update this config to be more in line with http://wiki.progress-linux.org/software/lxc/ -- this config appears to have nice, refined device node permissions and other goodies. // note by Havis to Daniel, this config is already superior.}}

lxc.tty = 6 # if you plan to use container with physical terminals (eg F1..F6)


#lxc.tty = 0 # set to 0 if you dont plan to use the container with physical terminal, also comment out in your containers /etc/inittab c1 to c6 respawns (e.g. c1:12345:respawn:/sbin/agetty 38400 tty1 linux)
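For orientation, a minimal bridged configuration might be generated as shown below. This is not the configuration this page refers to, only a hedged sketch: the function name <tt>print_lxc_config</tt> is hypothetical, every value is a placeholder, and the key names are assumed to be the <tt>lxc.network.*</tt> style keys used by the LXC releases this page was written for.

```shell
# Print a minimal LXC configuration for one bridged container.
# $1: container name, $2: container directory, $3: bridge, $4: MAC address.
# All key values are illustrative; adjust them to your setup.
print_lxc_config() {
    cat <<EOF
lxc.utsname = $1
lxc.rootfs = $2/rootfs
lxc.mount = $2/fstab
lxc.tty = 6
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = $3
lxc.network.hwaddr = $4
EOF
}
```

For example, <tt>print_lxc_config funtoo0 /lxc/funtoo0 brwan 02:00:c0:a8:01:0a > /lxc/funtoo0/config</tt> writes the file for the container used on this page.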

Above, use the following command to generate a random MAC for <tt>lxc.network.hwaddr</tt>:

<console>
###i## openssl rand -hex 6 | sed 's/\(..\)/\1:/g; s/.$//'

</console>
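One caveat with six fully random octets: if the lowest bit of the first octet happens to be set, the result is a multicast address, which cannot be assigned to an interface. A possible variant, shown here as a sketch with a made-up function name, pins the first octet to <tt>02</tt> (locally administered, unicast) and randomizes only the remaining five. It reads from <tt>/dev/urandom</tt> instead of calling <tt>openssl</tt>.

```shell
# Print a random, locally administered, unicast MAC address (02:xx:xx:xx:xx:xx).
random_mac() {
    # Five random octets from the kernel, formatted as "xx:xx:xx:xx:xx".
    rest=$(od -An -N5 -tx1 /dev/urandom | tr -d ' \n' | sed 's/\(..\)/\1:/g; s/:$//')
    printf '02:%s\n' "$rest"
}
```

Run <tt>random_mac</tt> once and paste the result into <tt>lxc.network.hwaddr</tt>.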

It is a very good idea to assign a static MAC address to your container using <tt>lxc.network.hwaddr</tt>. If you don't, LXC will auto-generate a new random MAC every time your container starts, which may confuse network equipment that expects MAC addresses to remain constant.

Occasionally a container will not start with a randomly generated MAC address. If you run into that problem, here is a little script that derives the MAC address from the container's IP address. Save the following code as <tt>/etc/lxc/hwaddr.sh</tt>, make it executable, and run it as <tt>/etc/lxc/hwaddr.sh xxx.xxx.xxx.xxx</tt>, where xxx.xxx.xxx.xxx is your container's IP address.

<tt>/etc/lxc/hwaddr.sh</tt>:

<pre>
#!/bin/bash
# Derive a locally administered MAC (02:00:...) from the container's IPv4 address.
# bash is required for the ${IP//./ } substitution; %02x pads each octet to two digits.
IP=$1
HA=$(printf "02:00:%02x:%02x:%02x:%02x" ${IP//./ })
echo "$HA"

</pre>

==== <tt>/lxc/funtoo0/fstab</tt> ====

{{fancynote| It is now preferable to have mount entries directly in the config file instead of a separate fstab:}}

By default, Linux workstations and servers have IPv4 forwarding disabled. Enable it on the host and verify the setting:

<console>
###i## echo "1" > /proc/sys/net/ipv4/ip_forward
###i## cat /proc/sys/net/ipv4/ip_forward
# 1

</console>
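Writing to <tt>/proc</tt> changes only the running kernel, so the setting is lost on reboot. The usual way to make it permanent is a <tt>net.ipv4.ip_forward = 1</tt> line in <tt>/etc/sysctl.conf</tt>. For scripts that need to check the current state, a small test like the following may help. The function name is invented for this sketch.

```shell
# Succeed (exit status 0) if IPv4 forwarding is switched on.
# $1 (optional): file holding the flag; defaults to the live /proc entry.
ipv4_forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}
```

It can be used as <tt>ipv4_forwarding_enabled || echo "forwarding is off"</tt>.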

== Initializing and Starting the Container ==

You will probably need to set the root password for the container before you can log in. You can use chroot to do this quickly:

<console>
###i## chroot /lxc/funtoo0/rootfs
(chroot) ###i## passwd
New password: XXXXXXXX
Retype new password: XXXXXXXX
passwd: password updated successfully
(chroot) ###i## exit

</console>

Now that the root password is set, run:

<console>
###i## lxc-start -n funtoo0 -d

</console>

The <tt>-d</tt> option will cause it to run in the background.

To attach to the console:

<console>
###i## lxc-console -n funtoo0

</console>

You should now be able to log in and use the container. In addition, the container should now be accessible on the network.

To attach directly to the container:

<console>
###i## lxc-attach -n funtoo0

</console>

To stop the container:

<console>
###i## lxc-stop -n funtoo0

</console>
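To confirm from a script that the container really stopped (or started), the output of <tt>lxc-info -n funtoo0</tt> can be parsed. The helper below is a sketch with a made-up name. It assumes that <tt>lxc-info</tt> prints a line of the form <tt>state: RUNNING</tt> (the capitalization of the label differs between releases, so it is matched case-insensitively).

```shell
# Extract the container state (e.g. RUNNING, STOPPED) from lxc-info output
# read on standard input. Matches "state:" case-insensitively.
parse_lxc_state() {
    awk 'tolower($1) == "state:" { print $2; exit }'
}
```

Typical use would be <tt>lxc-info -n funtoo0 | parse_lxc_state</tt>.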


Ensure that networking is working from within the container while it is running, and you're good to go!

== Starting LXC container during host boot ==

# Create a symlink in <tt>/etc/init.d/</tt> that points to <tt>/etc/init.d/lxc</tt> and is named after your container:
# <tt>ln -s /etc/init.d/lxc /etc/init.d/lxc.funtoo0</tt>
# Now you can add <tt>lxc.funtoo0</tt> to the default runlevel:
# <tt>rc-update add lxc.funtoo0 default</tt>

<console>
###i## rc
* Starting funtoo0 ... [ ok ]

</console>

== LXC Bugs/Missing Features ==

This section is devoted to documenting issues with the current implementation of LXC and its associated tools. We will be gradually expanding this section with detailed descriptions of problems, their status, and proposed solutions.

=== reboot ===

* By default, lxc does not support rebooting a container from within. The container will simply stop, and the host will not know to start it again.

=== PID namespaces ===

Process ID namespaces are functional, but the container can still see the CPU utilization of the host via the system load (i.e. in <tt>top</tt>).

=== /dev/pts newinstance ===

* Some changes may be required to the host to properly implement "newinstance" <tt>/dev/pts</tt>. See [https://bugzilla.redhat.com/show_bug.cgi?id=501718 this Red Hat bug].

=== lxc-create and lxc-destroy ===

* LXC's shell scripts are badly designed and are a sure way to destruction; avoid using lxc-create and lxc-destroy.

=== network initialization and cleanup ===

* If <tt>network.type = phys</tt> is used, the interface is renamed to the value of <tt>lxc.network.link</tt> after lxc-stop. This was supposed to be fixed in 0.7.4 but still happens in 0.7.5: http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg01760.html
* Restarting a container can fail because network resources are still tied up by the already-defunct instance: [http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg00824.html]
** Also comment out any other line starting with pf:powerfail (such as pf::powerwait:/etc/init.d/powerfail start); these are only used if you have a UPS monitoring daemon installed.
* /etc/init.d/lxc seems to have broken support for graceful shutdown (it sends the proper signal, but then also tries to kill init with lxc-stop).

=== funtoo ===

* Our udev should be updated to contain <tt>-lxc</tt> in scripts. (This has been done as of 02-Nov-2011, so it should be resolved. It is not yet fixed in our OpenVZ templates, so they need to be regenerated in a few days.)
* Our openrc should be patched to gracefully handle the case where it cannot mount tmpfs. (The work-around in our docs above is to mount tmpfs to <tt>/libexec/rc/init.d</tt> using the container-specific <tt>fstab</tt> file on the host.)
* Emerging udev within a container can fail when realdev is run if a device node (such as /dev/console) cannot be created because the container has no mknod capability. This should be fixed.

== References ==


* <tt>man 7 capabilities</tt>


* <tt>man 5 lxc.conf</tt>


== Links ==


* There are a number of additional lxc features that can be enabled via patches: [http://lxc.sourceforge.net/patches/linux/3.0.0/3.0.0-lxc1/]

Revision as of 18:52, January 28, 2015
