Unix System Call fork() (Creating New Processes)

This article covers the Unix system call fork(), which creates new processes, with examples in C.

The fork() system call creates a new process from an existing one. It is called once but, on success, returns twice: once in the parent process and once in the child (the newly created process). In the child the return value is 0, while in the parent it is the PID (process ID) of the child. If the call fails, it returns -1 in the parent and no child is created.

The reason for this asymmetry is practical. A child has exactly one parent and can always obtain that parent's PID by calling getppid(), so fork() can simply return 0 in the child. A parent, on the other hand, can have many children and has no comparable call that reports their PIDs, so fork() returns the PID of the new child in the parent.

Example

The following sample program illustrates the behavior described above:

The program above calls fork() to create a child process and saves the return value in the variable 'child_pid', which holds the child's PID in the parent and 0 in the child. Because fork() returns twice, once in the parent and once in the child, both processes continue from the same point, and each prints enough information to show which process it is.

Some Useful Information

Note that the child gets a copy of the parent's memory, that is, a copy of the parent's data space, heap and stack. The parent and child do not share these regions; they share only the text segment.

The sample program uses two variables, 'local' and 'global'. The variable 'local' is on the stack, while 'global' is in the data segment. The child process increments both variables; the parent leaves them untouched.

When the program runs, the child reports the incremented values (local = 1, global = 1) while the parent reports the original values (local = 0, global = 0). This confirms the statement above for the stack and the data segment: the child and parent each work on their own copy. The same holds for the heap.

Note that current implementations do not actually copy the parent's data, stack and heap at the time of the fork() call. Instead they use a mechanism known as copy-on-write (COW): the regions are shared by parent and child, and marked read-only by the kernel, until one of them tries to write to them. Only at that first write does the kernel make a private copy, typically of just the affected page, for the process doing the write. Hence the term 'copy-on-write'.

Another point is that whether the child or the parent runs first after fork() is decided by the kernel's process scheduler and should not be relied upon. If the child and parent read from and write to a shared file or other shared resource, always use a process synchronization technique (such as a semaphore) to make sure the reads and writes occur in the intended order.

Conclusion

The fork() function is an extremely useful API for creating child processes. The original process becomes the parent, and each process created by fork() becomes its child. Every child shares only the text segment with its parent and has its own copy of the stack, heap and data segment.