cgroups (control groups) is a Linux kernel feature for limiting, accounting for, and isolating the resource usage (CPU, memory, disk I/O, etc.) of groups of processes. It was merged into kernel version 2.6.24 in late 2007.

By using cgroups, system administrators gain fine-grained control over allocating, prioritizing, denying, managing, and monitoring system resources. Hardware resources can be smartly divided up among tasks and users, increasing overall efficiency [1].

Cgroups are organized hierarchically, like processes, and child cgroups inherit some of the attributes of their parents.

Among the available subsystems (resource controllers) are, for example:

net_prio — this subsystem provides a way to dynamically set the priority of network traffic per network interface.

ns — the namespace subsystem.

The easiest way to work with cgroups is to install the libcgroup package, which contains a number of cgroup-related command-line utilities and their associated man pages, as well as the cgconfig service. It is also possible to mount hierarchies and set cgroup parameters (non-persistently) using shell commands and utilities available on any system. You can then save all the changes in a config file using the cgsnapshot utility.

There are two main configuration files in /etc - cgconfig.conf and cgrules.conf.

cgconfig.conf is the configuration file used by libcgroup to define control groups, their parameters and also mount points. The file consists of mount and group sections.
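For illustration, a minimal sketch of the two section types (the controller, mount point, group name, and values here are only assumptions):

    mount {
        cpu = /cgroup/cpu;
    }

    group daemons {
        cpu {
            cpu.shares = "500";
        }
    }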

The cgrules.conf configuration file is used by libcgroup to define the control groups to which processes belong. The file contains a list of rules that assign processes owned by a given user or group to a control group in a subsystem.
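Each rule takes the form <user> <controllers> <destination>; a short sketch with assumed names:

    # user/@group    controllers    destination cgroup
    @developers      cpu,memory     devgroup/
    alice            blkio          iogroup/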

For more information on the cgroups hierarchy and subsystems please refer to [2].

Now let's get our hands dirty by implementing cgroups that limit how much I/O throughput and CPU time two processes running on the same system can use.

Scenario 1 - Limiting I/O throughput

Let's assume that we have two applications running on a server that are heavily I/O bound - app1 and app2. We would like to give more bandwidth to app1 during the day and to app2 during the night. This type of I/O throughput prioritization can be achieved by using the blkio subsystem.

In the following example I'll show how to do this by manually creating the file tree and then generating a persistent config file from it.

1. Attach the blkio subsystem to the /cgroup/blkio/ cgroup if not already attached:
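A sketch of this step, assuming the /cgroup/blkio mount point named above:

    mkdir -p /cgroup/blkio
    mount -t cgroup -o blkio none /cgroup/blkio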

Alternatively, if the applications are not yet running, or are not controlled by the daemon() function from /etc/init.d/functions, you can start them manually and add them to the cgroups at the same time by using the cgexec utility:
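A sketch, assuming two cgroups named high_io and low_io already exist under /cgroup/blkio and that the binaries live in /usr/local/bin:

    cgexec -g blkio:high_io /usr/local/bin/app1 &
    cgexec -g blkio:low_io /usr/local/bin/app2 &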

In this example, the low priority cgroup permits the low priority application - app2 - to use only about 10% of the I/O operations, whereas the high priority cgroup permits the high priority application - app1 - to use about 90% of the I/O operations.
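The proportions come from the blkio.weight parameter, which accepts values from 100 to 1000. A sketch, again assuming the high_io and low_io cgroup names:

    mkdir -p /cgroup/blkio/high_io /cgroup/blkio/low_io
    echo 1000 > /cgroup/blkio/high_io/blkio.weight
    echo 100 > /cgroup/blkio/low_io/blkio.weight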

To reverse the priorities over the day/night cycle, create a cron job that flips the values from step 4; the I/O utilization will then reflect the specified weights.
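A sketch of such /etc/crontab entries, assuming the cgroup names above and an 08:00/20:00 switchover:

    0 8 * * * root echo 1000 > /cgroup/blkio/high_io/blkio.weight; echo 100 > /cgroup/blkio/low_io/blkio.weight
    0 20 * * * root echo 100 > /cgroup/blkio/high_io/blkio.weight; echo 1000 > /cgroup/blkio/low_io/blkio.weight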

To make these changes persistent across reboots we can create a configuration file out of the file structure we created manually in the previous steps by using the cgsnapshot command:
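For example:

    cgsnapshot -s > /etc/cgconfig.conf

The -s flag silences warnings; it is worth reviewing the generated file before relying on it.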

Scenario 2 - Limiting CPU and memory usage

In the second scenario we want to divide CPU time and memory between two groups of users, group1 and group2.

1. Define the hierarchy in /etc/cgconfig.conf. When loaded, the configuration file mounts the cpu, cpuacct, and memory subsystems to a single cpu_and_mem cgroup.
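A sketch of such a configuration, reconstructed from the parameter descriptions that follow (the group2 memory.memsw value is an assumption):

    mount {
        cpu = /cgroup/cpu_and_mem;
        cpuacct = /cgroup/cpu_and_mem;
        memory = /cgroup/cpu_and_mem;
    }

    group group1 {
        cpu {
            cpu.shares = "800";
        }
        cpuacct {
            cpuacct.usage = "0";
        }
        memory {
            memory.limit_in_bytes = "4G";
            memory.memsw.limit_in_bytes = "6G";
        }
    }

    group group2 {
        cpu {
            cpu.shares = "200";
        }
        cpuacct {
            cpuacct.usage = "0";
        }
        memory {
            memory.limit_in_bytes = "2G";
            memory.memsw.limit_in_bytes = "3G";
        }
    }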

Then, it creates a hierarchy in cpu_and_mem which contains two cgroups: group1 and group2.

In each of these cgroups, custom parameters are set for each subsystem:

cpu — the cpu.shares parameter determines the share of CPU resources available to the processes in each cgroup. Setting the parameter to 800 and 200 in the group1 and group2 cgroups respectively means that processes started in these groups will split the resources with a 4:1 ratio. Note that when a single process is running, it consumes as much CPU as necessary no matter which cgroup it is placed in. The CPU limitation only comes into effect when two or more processes compete for CPU resources.

cpuacct — the cpuacct.usage="0" parameter is used to reset values stored in the cpuacct.usage and cpuacct.usage_percpu files. These files report total CPU time (in
nanoseconds) consumed by all processes in a cgroup.

memory — the memory.limit_in_bytes parameter represents the amount of memory that is made available to all processes within a certain cgroup. In our example, processes started in the group1 cgroup have 4 GB of memory available and processes in the group2 group have 2 GB of memory available. The memory.memsw.limit_in_bytes parameter specifies the total amount of memory and swap space processes may use. Should a process in the group1 cgroup hit the 4 GB memory limit, it is allowed to use another 2 GB of swap space, thus totaling the configured 6 GB.

2. Since we are dealing with user and group IDs, we can leverage the cgrulesengd daemon.

cgrulesengd is a daemon that distributes processes to control groups. When any process changes its effective UID or GID, cgrulesengd inspects the list of rules loaded from the cgrules.conf file and moves the process to the appropriate control group.

To define the rules which the cgrulesengd daemon uses to move processes to specific cgroups, configure the /etc/cgrules.conf in the following way:
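A sketch matching the description below:

    # user/@group    controllers           destination cgroup
    @group1          cpu,cpuacct,memory    group1
    @group2          cpu,cpuacct,memory    group2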

The above configuration creates rules that assign a specific system group (for example,
@group1) the resource controllers it may use (for example, cpu, cpuacct, memory) and a
cgroup (for example, group1) which contains all processes originating from that system group.
In our example, when the cgrulesengd daemon, started via the service cgred start command, detects a process started by a user that belongs to the group1 system group, that process's PID is automatically added to the /cgroup/cpu_and_mem/group1/tasks file and the process is subjected to the resource limitations set in the group1 cgroup.

3. Start the cgconfig service to create the hierarchy of cgroups and set the needed parameters in all created cgroups:
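On a sysvinit-based system, as assumed throughout this example:

    service cgconfig start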

To test whether this setup works, execute a CPU or memory intensive process and observe the
results, for example, using the top utility. To test the CPU resource management, execute the
following dd command under each user in both group1 and group2:
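A simple CPU-bound sketch; any long-running command will do:

    dd if=/dev/zero of=/dev/null

With one instance running under each group, top should show the CPU time split in roughly the 4:1 ratio configured via cpu.shares.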

Note that the 2 GB limit is for the entire group. If you have 5 processes run by the limited user(s), they will all share up to the maximum memory configured in that cgroup, regardless of how many processes or different users are involved, as long as they all belong to the same cgroup, e.g. group1.