@alex: I have a script that runs multiple simulations (in different directories), and I want to make sure that, no matter how bad the programming, each one can access only files in its own directory and cannot accidentally modify, e.g., the output of the other simulations.
– Tobias Kienzler, Jan 25 '11 at 15:10

@Tobias: I got your point. chroot would naturally fit there, but then again you're not root.
– alex, Jan 25 '11 at 16:12

I think that SELinux, AppArmor, and grsecurity might be able to do this, but I'm not sure. Then again, if those aren't available or configured by the sysadmin, you're SOL on that.
– xenoterracide♦, Jan 25 '11 at 22:31

I have been wondering about this question for years. It seems such a natural wish: without being root, to be able to run processes with some of your user's permissions discarded, i.e. to confine a process to a user-defined "jail" that gives the process even fewer rights than your user has. It's a pity that the usual Unices haven't offered this as a standard feature!
– imz -- Ivan Zakharyaschev, Mar 11 '11 at 11:23

Try asking your system administrator to make you a second user account.
– ultrasawblade, Mar 13 '11 at 14:36

Known chroot-based isolation tools:

ptrace

Another trustworthy isolation solution (besides a seccomp-based one) would be the complete syscall-interception through ptrace, as explained in the manpage for fakeroot-ng:

Unlike previous implementations, fakeroot-ng uses a technology that leaves the traced process no choice regarding whether it will use fakeroot-ng's "services" or not. Compiling a program statically, directly calling the kernel and manipulating ones own address space are all techniques that can be trivially used to bypass LD_PRELOAD based control over a process, and do not apply to fakeroot-ng. It is, theoretically, possible to mold fakeroot-ng in such a way as to have total control over the traced process.

While it is theoretically possible, it has not been done. Fakeroot-ng does assume certain "nicely behaved" assumptions about the process being traced, and a process that break those assumptions may be able to, if not totally escape then at least circumvent some of the "fake" environment imposed on it by fakeroot-ng. As such, you are strongly warned against using fakeroot-ng as a security tool. Bug reports that claim that a process can deliberatly (as opposed to inadvertly) escape fakeroot-ng's control will either be closed as "not a bug" or marked as low priority.

It is possible that this policy be rethought in the future. For the time being, however, you have been warned.

Still, as you can read above, fakeroot-ng itself is not designed for this purpose.
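To make the ptrace idea a bit more concrete, here is a minimal sketch in Python (via ctypes) of a tracer that stops a child at every syscall entry and exit. It only counts the stops; a real jail would inspect the syscall number and arguments at each stop and decide whether to let it through. The function name `count_syscall_stops` is made up for this illustration, the constants are the Linux values of `PTRACE_TRACEME` and `PTRACE_SYSCALL`, and it assumes a Linux system where ptrace is permitted (otherwise it returns `None`). It is not how fakeroot-ng is implemented.

```python
import ctypes
import os
import signal
import sys

PTRACE_TRACEME = 0    # Linux value, from <sys/ptrace.h>
PTRACE_SYSCALL = 24   # Linux value: resume until the next syscall entry/exit


def count_syscall_stops(argv):
    """Run argv under ptrace and count syscall-entry/exit stops.

    Hypothetical helper for illustration. Returns None if tracing
    could not be set up (non-Linux, ptrace forbidden, exec failed).
    """
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_long,
                            ctypes.c_void_p, ctypes.c_void_p]
    libc.ptrace.restype = ctypes.c_long

    pid = os.fork()
    if pid == 0:
        # Child: ask to be traced by the parent, then exec the program.
        try:
            if libc.ptrace(PTRACE_TRACEME, 0, None, None) == 0:
                os.execv(argv[0], argv)
        finally:
            os._exit(127)  # reached only if ptrace or exec failed

    # Parent: the first stop is the SIGTRAP raised by the successful exec.
    _, status = os.waitpid(pid, 0)
    if not os.WIFSTOPPED(status):
        return None

    stops = 0
    while os.WIFSTOPPED(status):
        sig = os.WSTOPSIG(status)
        # Swallow our own SIGTRAP stops, pass any other signal on.
        deliver = 0 if sig == signal.SIGTRAP else sig
        if libc.ptrace(PTRACE_SYSCALL, pid, None, deliver) != 0:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
            return None
        _, status = os.waitpid(pid, 0)
        if os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGTRAP:
            stops += 1  # this is where a policy check would go
    return stops


if __name__ == "__main__":
    print(count_syscall_stops(["/bin/true"]))
```

The traced program cannot opt out of these stops, which is the property the manpage contrasts with LD_PRELOAD; turning the counter into a trustworthy policy is the hard part the manpage warns about.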

(BTW, I wonder why they have chosen to use the seccomp-based approach for Chromium rather than a ptrace-based one...)

Of the tools not mentioned above, I have noted Geordi for myself, because I liked that the controlling program is written in Haskell.

Known ptrace-based isolation tools:

seccomp

One known way to achieve isolation is the seccomp sandboxing approach used in Google Chromium. But this approach supposes that you write a helper which processes some (the allowed ones) of the "intercepted" file accesses and other syscalls, and also, of course, that you make the effort to "intercept" the syscalls and redirect them to the helper. (Perhaps that would even mean replacing the intercepted syscalls in the code of the controlled process, so it doesn't sound quite simple; if you are interested, you'd better read the details rather than just my answer.)
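The reason a helper is needed at all is that the original (strict) seccomp mode leaves a process with only read(), write(), _exit() and sigreturn(); anything else, including opening a file, gets it killed. A minimal sketch of that behaviour, assuming Linux and using ctypes to call prctl() (the function name `open_under_strict_seccomp` and the default path are made up for this illustration):

```python
import ctypes
import os
import signal
import sys

PR_SET_SECCOMP = 22       # Linux value, from <linux/prctl.h>
SECCOMP_MODE_STRICT = 1   # Linux value, from <linux/seccomp.h>


def open_under_strict_seccomp(path="/etc/hostname"):
    """Fork a child, put it into strict seccomp, and let it try to open a file.

    Hypothetical helper for illustration. Returns "killed" if the kernel
    terminated the child with SIGKILL, and "unavailable" if strict seccomp
    could not be enabled (non-Linux, or already under a seccomp filter).
    """
    if not sys.platform.startswith("linux"):
        return "unavailable"
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)

    pid = os.fork()
    if pid == 0:
        # Child: enter strict mode, then attempt a forbidden syscall.
        try:
            rc = libc.prctl(PR_SET_SECCOMP,
                            ctypes.c_ulong(SECCOMP_MODE_STRICT),
                            zero, zero, zero)
            if rc != 0:
                os._exit(2)              # strict mode not available here
            os.open(path, os.O_RDONLY)   # opening a file is not permitted
        finally:
            os._exit(3)                  # not expected to be reached normally

    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL:
        return "killed"
    if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 2:
        return "unavailable"
    return "unexpected status %d" % status


if __name__ == "__main__":
    print(open_under_strict_seccomp())
```

So a sandboxed process that legitimately needs files has to ask some unsandboxed helper to open them on its behalf, which is where the complications described above come from.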

(The last item seems to be interesting if one is looking for a general seccomp-based solution outside of Chromium. There is also a blog post worth reading from the author of "seccomp-nurse": SECCOMP as a Sandboxing solution ?.)

A "flexible" seccomp possible in the future of Linux?

In 2009 there were also suggestions to patch the Linux kernel to make the seccomp mode more flexible, so that "many of the acrobatics that we currently need could be avoided". ("Acrobatics" refers to the complications of writing a helper that has to execute many possibly innocent syscalls on behalf of the jailed process, and of substituting the possibly innocent syscalls in the jailed process.) An LWN article wrote on this point:

One suggestion that came out was to add a new "mode" to seccomp. The API was designed with the idea that different applications might have different security requirements; it includes a "mode" value which specifies the restrictions that should be put in place. Only the original mode has ever been implemented, but others can certainly be added. Creating a new mode which allowed the initiating process to specify which system calls would be allowed would make the facility more useful for situations like the Chrome sandbox.

Adam Langley (also of Google) has posted a patch which does just that. The new "mode 2" implementation accepts a bitmask describing which system calls are accessible. If one of those is prctl(), then the sandboxed code can further restrict its own system calls (but it cannot restore access to system calls which have been denied). All told, it looks like a reasonable solution which could make life easier for sandbox developers.

That said, this code may never be merged because the discussion has since moved on to other possibilities.

This "flexible seccomp" would bring the possibilities of Linux closer to providing the desired feature in the OS, without the need to write helpers that complicated.
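The semantics described in the quote (a bitmask of allowed syscalls that can later be narrowed but never widened, and only as long as prctl() itself is still allowed) can be modelled in a few lines. This is only a toy model in Python to show the rule; the syscall table, the `Sandbox` class and its methods are invented for the illustration and have nothing to do with the actual patch or any kernel API.

```python
# Toy syscall table: each name gets one bit. Invented for illustration.
SYSCALLS = ("read", "write", "exit", "prctl", "open")
BIT = {name: 1 << i for i, name in enumerate(SYSCALLS)}


def mask_of(*names):
    """Build a bitmask with one bit set per named syscall."""
    mask = 0
    for name in names:
        mask |= BIT[name]
    return mask


class Sandbox:
    """Models the 'mode 2' rule: restrictions can only accumulate."""

    def __init__(self, allowed_mask):
        self.allowed = allowed_mask

    def permits(self, name):
        return bool(self.allowed & BIT[name])

    def restrict(self, new_mask):
        # Further restriction goes through prctl(), so it must be allowed.
        if not self.permits("prctl"):
            raise PermissionError("prctl is denied; the mask is frozen")
        # AND-ing means bits can be cleared but never set again.
        self.allowed &= new_mask


if __name__ == "__main__":
    sb = Sandbox(mask_of("read", "write", "prctl"))
    sb.restrict(mask_of("read", "open", "prctl"))
    # "write" was dropped; "open" was never allowed, so asking for it
    # in the new mask does not grant it.
    print(sb.permits("read"), sb.permits("write"), sb.permits("open"))
```

The point of the AND is that a compromised process gains nothing by calling prctl() again: the best it can do is lock itself down further.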

namespaces (unshare)

Isolation through namespaces (unshare-based solutions) has not been mentioned here. For example, unsharing mount points (combined with FUSE?) could perhaps be part of a working solution if you want to confine the filesystem accesses of your untrusted processes.

More on namespaces, now that their implementation has been completed (this isolation technique is also known under the name "Linux Containers", or "LXC", isn't it?..):

It's even possible to create a new user namespace, so that "a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace. This means that the process has full root privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace".
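A minimal sketch of what the quote describes, assuming a Linux kernel that allows unprivileged user namespaces (many distributions and containers disable or restrict them, in which case the function below returns `None`). The helper name `uid_inside_new_user_namespace` is made up; `CLONE_NEWUSER` is the Linux constant, and unshare() is called through ctypes. The work is done in a forked child so the calling process stays untouched.

```python
import ctypes
import os
import sys

CLONE_NEWUSER = 0x10000000   # Linux value, from <sched.h>


def uid_inside_new_user_namespace():
    """Return the UID a child sees after mapping itself to root in a new
    user namespace, or None if that is not possible on this system.

    Hypothetical helper for illustration only.
    """
    if not sys.platform.startswith("linux"):
        return None
    outer_uid = os.getuid()
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: create the namespace and map inner UID 0 to our real UID.
        os.close(read_fd)
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUSER) != 0:
                os._exit(1)
            with open("/proc/self/uid_map", "w") as uid_map:
                uid_map.write("0 %d 1" % outer_uid)
            os.write(write_fd, str(os.getuid()).encode())
            os._exit(0)
        except BaseException:
            os._exit(1)

    # Parent: read whatever the child reported, then reap it.
    os.close(write_fd)
    data = os.read(read_fd, 32)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return int(data) if data else None


if __name__ == "__main__":
    print("outside:", os.getuid(), "inside:", uid_inside_new_user_namespace())
```

Being UID 0 inside the namespace is what then makes it possible to unshare mount points and the like without real root; it does not grant any extra access to files outside.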

and special user-space programming/compiling

But well, of course, the desired "jail" guarantees are implementable by programming in user-space (without additional support for this feature from the OS; maybe that's why this feature hasn't been included in the first place in the design of OSes); with more or less complications.

The mentioned ptrace- or seccomp-based sandboxing can be seen as some variants of implementing the guarantees by writing a sandbox-helper that would control your other processes, which would be treated as "black boxes", arbitrary Unix programs.

Another approach could be to use programming techniques that can care about the effects that must be disallowed. (It must be you who writes the programs then; they are not black boxes anymore.) To mention one, using a pure programming language (which would force you to program without side-effects) like Haskell will simply make all the effects of the program explicit, so the programmer can easily make sure there will be no disallowed effects.
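Python cannot enforce purity the way Haskell's type system does, but the same discipline can at least be sketched by convention: the simulation is a function that returns its output instead of writing it, and a single small driver function is the only place that touches the filesystem, refusing any path outside the run's own directory. Everything here (`simulate`, `write_inside`, the file names) is invented for the illustration, and nothing stops badly written code from bypassing it; it only makes the allowed effect explicit and easy to audit.

```python
import tempfile
from pathlib import Path


def simulate(steps):
    """'Pure' part: computes a result and returns it as text.

    By convention it performs no I/O; Python does not enforce this.
    """
    total = sum(i * i for i in range(steps))
    return "steps=%d total=%d\n" % (steps, total)


def write_inside(workdir, relative_name, text):
    """The only function allowed to write; confines output to workdir."""
    root = Path(workdir).resolve()
    target = (root / relative_name).resolve()
    # After resolving symlinks and "..", the target must live under root.
    if root not in target.parents:
        raise PermissionError("%s is outside %s" % (target, root))
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(write_inside(workdir, "run1/result.txt", simulate(4)))
        try:
            write_inside(workdir, "../other-run/result.txt", "oops")
        except PermissionError as err:
            print("refused:", err)
```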

I guess, there are sandboxing facilities available for those programming in some other language, e.g., Java.

@tobias But does Rootless Gobolinux give a guarantee that a program written by a user won't access the outer environment?..
– imz -- Ivan Zakharyaschev, Oct 30 '13 at 9:57

Not really - I was somehow under the misconception that it would also allow one to become a "local" root user which then could simply create virtual users for such a process - though it might be possible to use its chroot, but that would probably still require real root privileges...
– Tobias Kienzler, Oct 30 '13 at 10:02

This is a fundamental limitation of the Unix permission model: only root can delegate.

You don't need to be root to run a virtual machine (not true of all VM technologies), but this is a heavyweight solution.

User-mode Linux is a relatively lightweight Linux-on-Linux virtualization solution. It's not that easy to set up; you'll need to populate a root partition (in a directory) with at least the minimum needed to boot (a few files in /etc, /sbin/init and its dependencies, a login program, a shell and utilities).