Owner

Current status

Detailed Description

The kernel cgroups capability provides a general-purpose framework for resource management of processes. Both libvirt and systemd make use of cgroups and allow resource tunables to be set, and the libcgroup project provides some command line tools for managing cgroups. Despite this, it is difficult, if not impossible, for an admin to use these to do effective system partitioning. The guidelines on how applications should co-operate in use of the cgroups hierarchy are flawed, leading applications to set up a hierarchy that cannot be used to do system partitioning. Another part of the problem is that there is no single high-level view for the admin across all the "objects" using cgroups: the cgroups tools themselves are too low level, focusing on controllers and processes rather than concepts like "resource group", "system service", "virtual machine", etc.

This feature will seek to address the manageability problems with cgroups to facilitate system partitioning and workload isolation. The guidelines on using the cgroups hierarchy will be updated to allow a more practical hierarchy to be created by applications like systemd and libvirt out of the box. A library and corresponding command line tool will be written to provide a high-level view of objects needing resource management. This will allow an administrator to easily define resource groups with different performance characteristics and control placement of system services and virtual machines into those groups.

Follow-on work will also look at how to extend the controls to user sessions and to ad-hoc commands invoked by users, as well as to other non-cgroup resource tunables under sysfs / procfs.

Benefit to Fedora

Fedora administrators will be able to easily configure their machine to manage the performance requirements of different workloads. Administrators will have a single view of which services, virtual machines, and user sessions are present and which resource groups they are placed in. They will be able to move managed objects between resource groups and create new resource groups with specific performance characteristics. Administrators will no longer need to know the low-level details of cgroups controllers, nor the wildly differing ways to configure cgroups for system services, virtual machines, and user sessions.

Scope

PaxControlGroups will be updated to provide new guidelines on usage of the cgroups hierarchy

Systemd will be updated to follow the new guidelines (probably not required, based on currently planned changes to the guidelines)

Libvirt will be updated to follow the new guidelines (significant change to the way libvirt sets up the hierarchy for VMs)

Creation of a new library providing a single view of resource groups / systemd services / virtual machines / containers

Creation of a new command line tool exposing the library functionality via shell

How To Test

Define some scenarios that it must be possible to implement using the new tool

Ensure fairness between different users

On a multi-user system, the administrator wants to prevent a single user
from monopolizing the resources at the expense of other users. If the
users are not contending, they should be allowed to use as much resource
as they need. When they contend for a resource though, the system should
ensure fair access between users.
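
The fairness policy described above can be modelled as a weighted, work-conserving allocation. The following is only an illustrative sketch of the policy, not of any kernel scheduler or of the proposed tool: the function name `fair_share`, the user names, and the abstract "capacity" units are all assumptions made for the example. Proportional weights such as the `cpu.shares` tunable of the cgroups cpu controller are one kind of knob that behaves this way.

```python
def fair_share(capacity, demands, weights):
    """Weighted max-min fair split of `capacity` (hypothetical helper).

    demands: user -> amount of resource the user wants
    weights: user -> relative weight (larger means a bigger share)
    Users that want less than their share keep only what they need;
    the leftover is redistributed among the users still contending.
    """
    alloc = {user: 0.0 for user in demands}
    active = {user for user, want in demands.items() if want > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_weight = sum(weights[u] for u in active)
        offers = {u: remaining * weights[u] / total_weight for u in active}
        # Users whose whole demand fits inside their offer are satisfied.
        satisfied = {u for u in active if demands[u] <= offers[u] + 1e-9}
        if not satisfied:
            # Everyone is contending: split what is left by weight.
            for u in active:
                alloc[u] = offers[u]
            break
        for u in satisfied:
            alloc[u] = float(demands[u])
            remaining -= demands[u]
        active -= satisfied
    return alloc
```

With equal weights, two users who each want 80 units of a 100-unit resource end up with 50 each, while a user who wants only 10 leaves the other free to use the remaining 90.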

Ensure isolation between different users

On a multi-user system, the administrator wants to ensure that each user
has a consistent resource ceiling, regardless of what other users' workloads
are. The users will need to be capped in their usage, regardless of whether
they are contending for a resource.
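
As a rough illustration of a hard ceiling, the sketch below converts a CPU ceiling into a bandwidth quota of the kind used by the cgroups cpu controller (`cpu.cfs_quota_us` together with `cpu.cfs_period_us`). The helper names and the example values are assumptions for illustration only; the document does not specify how the new tool would express such a cap.

```python
def cfs_quota_us(cpu_ceiling, period_us=100_000):
    """Quota in microseconds per period for a ceiling given in CPUs.

    A ceiling of 1.5 means "at most one and a half CPUs' worth of time".
    Hypothetical helper; 100000 us is used here as the example period.
    """
    if cpu_ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return int(round(cpu_ceiling * period_us))


def capped_usage(demand, ceiling):
    """Usage under a hard cap: never above the ceiling, even when idle
    capacity is available elsewhere on the system."""
    return min(demand, ceiling)
```

Unlike a fairness policy, the cap applies even when no other user is contending: a user with a ceiling of 1.5 CPUs who wants 4 CPUs still gets 1.5.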

Partition the system between different users

On a multi-user system, the administrator wants to allocate a specific
portion of the resources to each user. The users will only be allowed
to use the specific resources that have been assigned to them.
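
One way to picture partitioning is assigning each user a disjoint set of CPUs, as the cgroups cpuset controller allows through its `cpuset.cpus` list format (for example `0-3`). The sketch below is a minimal illustration under that assumption; `partition_cpus` and the user names are hypothetical and not part of the proposed tool.

```python
def partition_cpus(cpu_count, portions):
    """Assign contiguous, non-overlapping CPU ranges to users.

    portions: user -> number of CPUs to dedicate to that user
    Returns user -> CPU list string such as "0-3" or "7".
    """
    if sum(portions.values()) > cpu_count:
        raise ValueError("requested more CPUs than the system has")
    result = {}
    next_cpu = 0
    for user, count in portions.items():
        if count < 1:
            raise ValueError(f"{user} must get at least one CPU")
        first, last = next_cpu, next_cpu + count - 1
        # A single CPU is written as "N", a range as "first-last".
        result[user] = str(first) if count == 1 else f"{first}-{last}"
        next_cpu = last + 1
    return result
```

For an 8-CPU machine, `partition_cpus(8, {"alice": 4, "bob": 3, "carol": 1})` yields the ranges `0-3`, `4-6`, and `7`, so no user can run on another user's CPUs.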

Prioritize response-time critical workloads over batch jobs

On a system running multiple distinct processes required by an application,
some processes are response-time critical, while other processes are offline
batch processing. The response-time critical processes must get prioritized
access to resources, at the expense of batch processes.

Protect the host OS services at the expense of virtual machines

On a system running KVM virtual machines, critical host OS services
such as sshd, libvirtd, systemd, journald, auditd must be protected
at the expense of virtual machines. A rogue KVM virtual machine must
not be able to negatively impact operation of the host OS services,
and thus negatively impact operation of other virtual machines.

Ensure the desktop remains responsive for users

On a desktop system, administrative jobs such as reindexing
man pages, prelinking, or locate database updates must not impact
responsiveness for users.

Delegation of resource allocation to users

On a virtualization host, the system can be split into
a number of resource groups. Each group can be assigned to
a company department. The department's users can then provide
resource fairness or isolation policies for virtual machines
they run.

User Experience

The default cgroups hierarchy created by libvirt will be different from previous Fedora releases. Other aspects of the OS should be mostly unchanged in an out-of-the-box configuration. Administrators who choose to use the new tools will be able to configure resource constraints with greater ease.

Dependencies

In addition to development of the new library and command and their package review, this likely requires work to be completed in both systemd and libvirt, and discussions with the relevant upstream communities.

Contingency Plan

Neither systemd nor libvirt will have any direct dependency on the new library/tool, so in the event that the development isn't complete they should not be negatively impacted. The administrator should be in no worse a position with respect to cgroups management than they were with previous Fedora releases.