Seccomp security profiles for Docker

Secure computing mode (seccomp) is a Linux kernel feature that you can use to
restrict the actions available within a container. The seccomp() system call
operates on the seccomp state of the calling process, so you can use this
feature to restrict which system calls your application is able to make.

This feature is available only if Docker has been built with seccomp and the
kernel is configured with CONFIG_SECCOMP enabled. To check if your kernel
supports seccomp:

$ grep CONFIG_SECCOMP= /boot/config-$(uname -r)
CONFIG_SECCOMP=y
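
The check above can be wrapped in a small shell function for use in scripts. This is a minimal sketch under stated assumptions: kernel_has_seccomp is a hypothetical helper name, not part of Docker, and it assumes the kernel config is readable at the given path (some distributions expose it only as /proc/config.gz, which this sketch does not handle).

```shell
# Hypothetical helper (not a Docker command): report whether a kernel
# config file enables seccomp. Defaults to the running kernel's config.
# Returns 0 if CONFIG_SECCOMP=y is present, 1 if it is not,
# and 2 if the config file cannot be read.
kernel_has_seccomp() {
    config_file="${1:-/boot/config-$(uname -r)}"
    [ -r "$config_file" ] || return 2
    grep -q '^CONFIG_SECCOMP=y$' "$config_file"
}
```

For example, kernel_has_seccomp && echo "seccomp is enabled" prints the message only when the running kernel's config contains CONFIG_SECCOMP=y.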

Note: seccomp profiles require seccomp 2.2.1, which is not available on
Ubuntu 14.04, Debian Wheezy, or Debian Jessie. To use seccomp on these
distributions, you must download the latest static Linux binaries
(rather than packages).

Pass a profile for a container

The default seccomp profile provides a sane default for running containers with
seccomp and disables around 44 system calls out of 300+. It is moderately
protective while providing wide application compatibility. The default Docker
profile can be found in the Docker source repository.

In effect, the profile is a whitelist which denies access to system calls by
default, then whitelists specific system calls. The profile works by defining a
defaultAction of SCMP_ACT_ERRNO and overriding that action only for specific
system calls. The effect of SCMP_ACT_ERRNO is to make the blocked call fail
with a permission error (EPERM, "Operation not permitted") instead of
executing. Next, the profile defines a specific list of system calls which are
fully allowed, because their action is overridden to be SCMP_ACT_ALLOW.
Finally, some rules target individual system calls such as personality,
socket, socketcall, and others, and allow only the variants of those system
calls that are made with specific arguments.
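
The three parts described above (a denying defaultAction, a list of fully allowed calls, and an argument-restricted rule) can be sketched as a tiny profile. This is an illustrative fragment only, not Docker's default profile: write_example_profile is a hypothetical helper, the handful of syscalls listed is far too small to run a real container, and the field names (names, action, args with index, value, op) are assumed to follow the layout used by recent versions of the default profile.

```shell
# Hypothetical helper: write a minimal, illustrative seccomp profile
# to the path given as the first argument.
# - defaultAction denies every syscall that is not listed.
# - The first rule fully allows a few syscalls.
# - The second rule allows personality only when argument 0 equals 0.
write_example_profile() {
    cat > "$1" <<'EOF'
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": ["read", "write", "exit_group"],
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "names": ["personality"],
      "action": "SCMP_ACT_ALLOW",
      "args": [
        { "index": 0, "value": 0, "op": "SCMP_CMP_EQ" }
      ]
    }
  ]
}
EOF
}
```

Any syscall absent from both rules, or a personality call with a different first argument, falls through to the defaultAction and is denied.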

seccomp is instrumental for running Docker containers with least privilege. It
is not recommended to change the default seccomp profile.

When you run a container, it uses the default profile unless you override it
with the --security-opt option, passing seccomp= followed by the path to a
profile file.
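
As a sketch of how a profile can be passed explicitly, the wrapper below checks that the profile file exists before handing it to docker run. The function name run_with_profile and the paths in the usage note are hypothetical; only the --security-opt seccomp=<path> form comes from the Docker CLI.

```shell
# Hypothetical wrapper: run an image under a custom seccomp profile.
# Usage: run_with_profile /path/to/profile.json IMAGE [COMMAND...]
run_with_profile() {
    profile="$1"
    shift
    # Fail early with a clear message if the profile file is missing.
    if [ ! -f "$profile" ]; then
        echo "seccomp profile not found: $profile" >&2
        return 1
    fi
    docker run --rm --security-opt seccomp="$profile" "$@"
}
```

For example, run_with_profile /path/to/seccomp/profile.json hello-world runs the hello-world image with that profile in place of the default one.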

Significant syscalls blocked by the default profile

Docker’s default seccomp profile is a whitelist which specifies the calls that
are allowed. The table below lists significant (but not all) syscalls that
are effectively blocked because they are not on the whitelist. The table includes
the reason each syscall is blocked rather than whitelisted.

| Syscall | Description |
|---------|-------------|
| acct | Accounting syscall which could let containers disable their own resource limits or process accounting. Also gated by CAP_SYS_PACCT. |
| add_key | Prevent containers from using the kernel keyring, which is not namespaced. |