The capng_lock() call sets and locks the SECURE_NO_SETUID_FIXUP and
SECURE_NOROOT bits on the process. These bits stop the kernel from
granting capabilities to processes with an effective UID of 0, and to
setuid-root programs on exec. That is not what we want for the
container init process: it should be able to run setuid programs and
keep its capabilities when running as root. All that is required is to
mask a handful of dangerous capabilities out of the bounding set.
* src/lxc/lxc_container.c: Remove bogus capng_lock() call.
---
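For reference, the distinction drawn in the commit message can be
inspected at runtime. The following is a minimal sketch, not libvirt
code: the helper names are hypothetical, and it assumes a Linux system
with the kernel uapi headers and uses raw prctl() instead of libcap-ng.
It reads the securebits that capng_lock() would have set, and checks
whether CAP_SYS_BOOT (used here as one example of a masked capability)
is still in the bounding set.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <linux/capability.h>
#include <linux/securebits.h>

/* Hypothetical helper: 1 if UID 0 and setuid-root programs can still
 * be granted capabilities on exec, i.e. SECBIT_NOROOT is not set. */
int secbits_allow_root_caps(unsigned long bits)
{
    return (bits & SECBIT_NOROOT) == 0;
}

/* Hypothetical helper: 1 if the kernel still adjusts capability sets
 * when the UID switches between zero and nonzero, i.e.
 * SECBIT_NO_SETUID_FIXUP is not set. */
int secbits_allow_setuid_fixup(unsigned long bits)
{
    return (bits & SECBIT_NO_SETUID_FIXUP) == 0;
}

/* Hypothetical helper: print the state of the calling process.
 * Returns 0 on success, -1 if either prctl() query fails. */
int report_cap_state(void)
{
    int bits = prctl(PR_GET_SECUREBITS, 0, 0, 0, 0);
    if (bits < 0) {
        perror("prctl(PR_GET_SECUREBITS)");
        return -1;
    }

    /* 1 if the capability is in the bounding set, 0 if it was dropped */
    int in_bset = prctl(PR_CAPBSET_READ, CAP_SYS_BOOT, 0, 0, 0);
    if (in_bset < 0) {
        perror("prctl(PR_CAPBSET_READ)");
        return -1;
    }

    printf("root keeps caps on exec: %d\n",
           secbits_allow_root_caps((unsigned long)bits));
    printf("setuid fixup active:     %d\n",
           secbits_allow_setuid_fixup((unsigned long)bits));
    printf("CAP_SYS_BOOT in bset:    %d\n", in_bset);
    return 0;
}
```

Calling report_cap_state() from a small main() inside the container
should, with this patch applied, show both securebits checks enabled
while the masked capability is absent from the bounding set.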
src/lxc/lxc_container.c | 11 +++++------
1 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index c77d099..023d553 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -694,12 +694,11 @@ static int lxcContainerDropCapabilities(void)
return -1;
}
- /* Need to prevent them regaining any caps on exec */
- if ((ret = capng_lock()) < 0) {
- lxcError(NULL, NULL, VIR_ERR_INTERNAL_ERROR,
- _("Failed to lock capabilities: %d"), ret);
- return -1;
- }
+ /* We do not need to call capng_lock() in this case. The bounding
+ * set restriction prevents container processes from reacquiring
+ * sys_boot/module/time, etc., which is all that matters here. Once
+ * inside the container it is fine for SECURE_NOROOT /
+ * SECURE_NO_SETUID_FIXUP to be unset - they can never escape the bounding set. */
#else
VIR_WARN0(_("libcap-ng support not compiled in, unable to clear capabilities"));
--
1.6.2.5