Sukadev Bhattiprolu wrote:
> === NEW CLONE() SYSTEM CALL:
>
> To support application checkpoint/restart, a task must have the same pid it
> had when it was checkpointed. When containers are nested, the tasks within
> the containers exist in multiple pid namespaces and hence have multiple pids
> to specify during restart.
>
> This patchset implements a new system call, clone2() that lets a process
> specify the pids of the child process.
>
> Patches 1 through 6 are helper patches, needed for choosing a pid for the
> child process.
>
> Patch 8 defines a prototype of the new system call. Patch 9 adds some
> documentation on the new system call, some/all of which will eventually
> go into a man page.
>

[...]

>
> Based on these requirements and constraints, we explored a couple of system
> call interfaces (in earlier versions of this patchset) and currently define
> the system call as:
>
> 	struct clone_struct {
> 		u64 flags;
> 		u64 child_stack;
> 		u32 nr_pids;
> 		u32 parent_tid;
> 		u32 child_tid;

So @parent_tid and @child_tid are pointers to userspace memory and
require 'u64' (and it won't hurt to make @reserved1 a 'u64' as well).

> 		u32 reserved1;
> 		u64 reserved2;
> 	};
>

Also, for forward/backward compatibility, explicitly state in the
documentation, and enforce in the kernel, that flags which are not
defined must not be set, and that reserved{1,2} must remain 0.
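
To make the two suggestions concrete, here is a rough userspace sketch,
not the patch itself. It assumes the struct keeps its field order with
@parent_tid, @child_tid and @reserved1 widened to 64 bits. The mask
CLONE_STRUCT_VALID_FLAGS and the helper check_clone_struct() are
hypothetical names made up for illustration, and the mask value is a
placeholder; the real set of defined flags is whatever the patchset
ends up documenting.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical: the flag bits this sketch treats as "defined". */
#define CLONE_STRUCT_VALID_FLAGS 0x00000000ffffffffULL

/* Assumed layout: same field order, pointers carried as 64-bit values. */
struct clone_struct {
	uint64_t flags;
	uint64_t child_stack;
	uint32_t nr_pids;
	uint64_t parent_tid;	/* userspace pointer, widened from u32 */
	uint64_t child_tid;	/* userspace pointer, widened from u32 */
	uint64_t reserved1;	/* must be 0 */
	uint64_t reserved2;	/* must be 0 */
};

/* The pointer-carrying fields must be able to hold a 64-bit address. */
static_assert(sizeof(((struct clone_struct *)0)->parent_tid) == 8,
	      "parent_tid must be 64 bits wide");
static_assert(sizeof(((struct clone_struct *)0)->child_tid) == 8,
	      "child_tid must be 64 bits wide");

/* Return 0 if the arguments are acceptable, -EINVAL otherwise. */
static int check_clone_struct(const struct clone_struct *cs)
{
	/* Reject any flag bit outside the defined set. */
	if (cs->flags & ~CLONE_STRUCT_VALID_FLAGS)
		return -EINVAL;

	/* Reserved fields must stay zero so they can gain meaning later. */
	if (cs->reserved1 || cs->reserved2)
		return -EINVAL;

	return 0;
}
```

With this check in place, a caller that sets an undefined flag bit or a
nonzero reserved field gets -EINVAL today, so a later kernel can assign
meaning to those bits without breaking binaries that passed garbage.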