Wednesday, November 17, 2010

I'm happy to announce that my book, The Linux Programming Interface (TLPI), is now available. TLPI is a detailed guide and reference for system programming on Linux and UNIX systems, 1552 pages in length, with 115 diagrams, 88 tables, nearly 200 example programs, and over 200 exercises.

Sunday, November 7, 2010

While looking at the new prlimit() system call in Linux 2.6.36, I surveyed the various system calls that allow one process to change the operation or attributes of another (arbitrary) process. In general, these system calls require either that the caller is privileged (i.e., has some capability) or that there is a match between the credentials (user or group IDs) of the calling process and the target process.

There's a great deal of inconsistency. As at 2.6.36, here's what we have (in the following, uid means the real UID of the caller, euid means the effective UID, and suid means the saved set-user-ID; a similar convention applies for the group IDs--thus gid, egid, sgid; and a "t-" prefix means the corresponding credentials of the target process):

setpriority(), sched_setscheduler(), sched_setparam(), sched_setaffinity(): CAP_SYS_NICE || euid == t-uid || euid == t-euid. This is sane: you can make changes to another process if you have the right capability or you own the process--that is, you (i.e., here "you" means the UID currently operated via the effective UID) can change the attributes of a process that was originally created by you (euid == t-uid) or one that has assumed (via the set-user-ID mechanism) your identity (euid == t-euid). POSIX specifies that the checks for setpriority() are uid == t-euid || euid == t-euid; the Linux semantics are arguably saner (and are consistent with historical BSD behavior). POSIX specifies sched_setscheduler() and sched_setparam() but does not specify their permission-checking semantics.

prlimit(): CAP_SYS_RESOURCE || (uid == t-uid && uid == t-euid && uid == t-suid) && (gid == t-gid && gid == t-guid && gid == t-sgid). Now we start to get into strange territory. Using CAP_SYS_RESOURCE makes sense, because CAP_SYS_RESOURCE is used for the privilege checks in the setrlimit() system call. However, requiring that all of the UIDs of the target match the real UID of the caller is quite inconsistent with any of the other APIs. Adding an analogous check for the group IDs further compounds the inconsistency.

One thing to note: the behavior of most of the Linux-specific system calls (i.e., ioprio_set(), move_pages(), migrate_pages(), and prlimit()) was documented only after the implementation, which I'd argue was a contributing factor to the inconsistencies described above.