chrooted ntpd in NetBSD

chrooting ntpd

As we explained in Securing Systems with
chroot, Part One, a daemon must run with an unprivileged user ID (UID) in
order to be safely chrooted. This is a problem, since many daemons need some
superuser privileges in order to operate. In some situations, superuser
privileges are only necessary during initialization, and it is possible to
switch to an unprivileged UID later. This is the case for named,
the Domain Name System (DNS) server from the Internet Software Consortium
(ISC). named needs superuser privileges in order to bind to UDP
port 53 (superuser privileges are needed on almost all Unix systems to bind to
ports lower than 1024). Once this is done, named is able to chroot
to a directory where the zone files are stored, and it can operate under an
unprivileged UID, typically the user named.
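
The sequence that follows the privileged bind can be sketched in C as shown here. This is a minimal illustration, not code taken from named: the helper name enter_jail and its parameters are hypothetical, and a real daemon would also clear its supplementary groups and report errors in more detail.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from named): lock the calling process into
 * `dir` and give up superuser privileges. The caller is expected to have
 * finished every privileged setup step, such as binding UDP port 53,
 * before calling this.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
enter_jail(const char *dir, uid_t uid, gid_t gid)
{
	if (chroot(dir) == -1)	/* only the superuser may do this */
		return -1;
	if (chdir("/") == -1)	/* do not keep a working directory outside the jail */
		return -1;
	if (setgid(gid) == -1)	/* drop the group first, while still root */
		return -1;
	if (setuid(uid) == -1)	/* then drop the user ID for good */
		return -1;
	return 0;
}
```

The order matters: chroot(2) has to come before setuid(2) because it needs the privileges that setuid(2) gives away.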

ntpd needs superuser privileges for two operations: binding to
UDP port 123 (at initialization time) and using time control system calls such
as adjtime(2)
and ntp_adjtime(2),
which are restricted to the superuser.

For the first operation, we could proceed as named does, first
binding to UDP port 123, then calling chroot(2) and setuid(2).
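The problem is the second operation: ntpd keeps adjusting the clock for as
long as it runs, so its need for privileges does not end with initialization.
To be able to chroot ntpd after initialization, we need a way to enable an
unprivileged user to control the system clock. Such a feature was introduced
in NetBSD 1.6, with the clockctl device.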

The clockctl device

On NetBSD, the system clock can be affected through four different system
calls: adjtime(2), settimeofday(2),
clock_settime(2), and ntp_adjtime(2), the last
available only if the kernel was compiled with the NTP option.

The clockctl device introduces alternative entry points to
these system calls, through a special device file typically named
/dev/clockctl. The alternative entry points are reached through ioctl(2)
system calls on the device file. ioctl(2) is a general-purpose
system call that enables the user to perform a custom action on a file object.
We will see this in more depth in the next part of this article.
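
To give a feel for the ioctl(2) calling convention, here is a small sketch that uses FIONREAD, a long-standing request that reports how many bytes are waiting on a descriptor. It is unrelated to clockctl and was chosen only because it is widely available; the helper name pending_bytes is hypothetical.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Illustration only: ask the kernel, through ioctl(2), how many bytes can
 * be read from `fd` right now. The request code (FIONREAD) selects the
 * action, and the third argument carries the data for that action.
 * Returns the number of pending bytes, or -1 on error.
 */
int
pending_bytes(int fd)
{
	int n = 0;

	if (ioctl(fd, FIONREAD, &n) == -1)
		return -1;
	return n;
}
```

A device driver such as clockctl follows the same pattern: it defines its own request codes and interprets the third argument accordingly.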

A user who has write access to /dev/clockctl can use the alternative entry
points and can therefore control the system clock. In order to
chroot ntpd, we just need to build a kernel with the
clockctl device driver and ensure that the unprivileged user under which
ntpd is running in the chroot jail has write access to
/dev/clockctl.
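
One way to verify the second condition is to try opening the device for writing as the unprivileged user. The following is a minimal sketch; the helper name can_open_for_write is hypothetical, and the path is a parameter so that the check can be tried on any file.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether the calling user can open the
 * device node at `path` (normally /dev/clockctl) for writing.
 * Returns 1 if it can, 0 otherwise.
 */
int
can_open_for_write(const char *path)
{
	int fd = open(path, O_WRONLY);

	if (fd == -1)
		return 0;
	close(fd);
	return 1;
}
```

Run under the UID that ntpd will use, can_open_for_write("/dev/clockctl") should return 1 once the device node's ownership and mode are set up correctly.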

NetBSD 1.6 makes this administrator-friendly: clockctl is enabled in
GENERIC kernels, the /dev/clockctl file is installed by default, and the
startup scripts already know about clockctl. Therefore, the system
administrator just has to add one line to /etc/rc.conf. Here are
the relevant lines from /etc/defaults/rc.conf:

# To run the ntpd(8) NTP server as an unprivileged user under a
# chroot(2) cage, uncomment the following, after ensuring that:
# - The kernel has "pseudo-device clockctl" compiled in
# - /dev/clockctl is present
#
#ntpd_chrootdir="/var/chroot/ntpd"

The next part of this article is more developer-oriented. It deals with the
implementation details of the chrooted ntpd. In the next two
sections, we will focus on the userland modifications that were required in
order to provide a chrootable ntpd, and we will discuss the
implementation details of the clockctl device driver.

Userland Modifications: libc

Our goal was to make modifications as minor as possible in the NTP daemon.
We especially did not want to introduce a new Application Programming Interface
(API). This goal was achieved at the expense of introducing some magic into
NetBSD's libc.

When a user program is built, each system call is turned into a library
call to a function in libc known as the system call stub. The
function does the actual system call, and may do some additional handling for
backward compatibility. The stubs that do more than just the system call have a
C source file associated with them. They are listed in the SRC
variable in src/lib/libc/sys/Makefile.inc.
For an example of a system call stub that does additional handling, see src/lib/libc/sys/lseek.c.

On the other hand, some system call stubs are trivial: they only do
the system call. In this case, the source file for the system call stub is
automatically generated. These are listed in the ASM variable in
src/lib/libc/sys/Makefile.inc. An autogenerated stub looks like
this:

#include "SYS.h"
RSYSCALL(chdir)

Once generated, this file is src/lib/libc/chdir.S. The curious
reader will look for the definition of the RSYSCALL macro, which
is contained in src/lib/libc/arch/powerpc/SYS.h
for PowerPC ports, for instance. The macro provides the few assembly language
instructions needed to make the system call and to set errno on error.
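
For readers who would rather not read assembly, the effect of such a stub can be approximated in C with the generic syscall(2) wrapper. This is only a rough equivalent for illustration; the name my_chdir is hypothetical, and the real stub is generated assembly, not C.

```c
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Rough C equivalent of a trivial stub such as RSYSCALL(chdir).
 * syscall(2) traps into the kernel with the given system call number;
 * on failure it stores the error code in errno and returns -1, which is
 * the job the assembly macro performs in the real stub.
 */
int
my_chdir(const char *path)
{
	return (int)syscall(SYS_chdir, path);
}
```

Calling my_chdir("/") behaves like chdir("/"): it returns 0 on success, or -1 with errno set if the directory cannot be entered.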

Before the clockctl implementation, adjtime(2),
clock_settime(2), settimeofday(2), and
ntp_adjtime(2) were implemented as simple system call stubs.
This has been changed so that the stubs check for the existence and
accessibility of /dev/clockctl.

The code is nearly identical for the four system calls. It can be found for
settimeofday(2) in src/lib/libc/sys/settimeofday.c.
It performs roughly the following checks:

Are we running with root UID? If we are, use the system call. Root has no
reason to use clockctl.

If we are not running with root UID, try to open /dev/clockctl
and use ioctl(2) on it to perform the settimeofday
operation.

This turns each call to settimeofday(2) into several system
calls: getuid(2),
open(2), and
ioctl(2). To limit this overhead, libc keeps some state so that it
can remember whether a process has already used
clockctl. This is done using the __clockctl_fd
variable. This variable is carried by libc but it behaves exactly
like a global variable for the process. Of course, each process has its own
__clockctl_fd.

__clockctl_fd describes the state of the process regarding
clockctl:

-2 means that the process never called
settimeofday(2), adjtime(2),
clock_settime(2), or ntp_adjtime(2). This is the
value at initialization time.

-1 means that the process should not use
clockctl.

Any other value is the file descriptor we got when opening
/dev/clockctl.

On the first call to one of our four system call stubs, if the UID is root,
__clockctl_fd is immediately set to -1. Otherwise, we
attempt to open and use /dev/clockctl. Should this attempt fail,
__clockctl_fd is set to -1. If it succeeds, then
__clockctl_fd keeps the file descriptor returned by
open(2). Future calls to the stub will use clockctl.

When __clockctl_fd is -1, the real system call is
always used.
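
The state handling described above can be modeled in portable C as follows. This is a sketch, not the NetBSD source: the function clockctl_select is hypothetical, the state is passed in through a pointer instead of living in libc's __clockctl_fd, and the UID and device path are parameters so that the logic can be exercised without touching the real clock. It models only the choice of path; the real stubs go on to issue the ioctl(2) or the system call.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

#define CLOCKCTL_NEVER_TRIED	(-2)	/* no clock call made yet */
#define CLOCKCTL_DO_NOT_USE	(-1)	/* always use the real system call */

/*
 * Decide which path a clock-setting stub should take.
 * `state` plays the role of __clockctl_fd and must start out as
 * CLOCKCTL_NEVER_TRIED. Returns the descriptor on which to issue the
 * ioctl(2), or -1 if the caller should use the real system call.
 */
int
clockctl_select(int *state, uid_t uid, const char *devpath)
{
	if (*state == CLOCKCTL_NEVER_TRIED) {
		if (uid == 0) {
			/* Root has no reason to go through the device. */
			*state = CLOCKCTL_DO_NOT_USE;
		} else {
			/*
			 * open(2) returns -1 on failure, which is exactly
			 * CLOCKCTL_DO_NOT_USE, so one assignment covers
			 * both the success and the failure case.
			 */
			*state = open(devpath, O_WRONLY);
		}
	}
	/* Later calls reuse the remembered decision without reopening. */
	return *state;
}
```

After the first call, the decision costs a single comparison, which is the point of keeping the state.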

We end up with an implementation where the API for ntpd and
other processes did not change. When the user process attempts to do a system
call, we intercept it at the libc level and use either
clockctl or the actual system call. This is convenient, but the drawback
is that it introduces some black magic into libc, which is not an elegant
solution. The good news is that, since we did not change anything in the API,
we can replace this black magic with anything else without disturbing user
processes. For instance, if we ever introduce capabilities in NetBSD, we can
revert to a trivial system call stub without ntpd being affected.