Supervision questions: Operating Systems

This is the set of questions for my supervisions in Operating Systems. I will typically email you a list of question numbers before each supervision, but if not, attempt the next two or three. (More questions will appear here as I set them, so come back if you want to make an early start on some of the future work.)

Administrativa & handing in work

I expect you to make a good attempt at producing solutions to the relevant questions before each supervision. I prefer submissions by email (PDF or text format). Please submit your work 24 hours before the supervision. If you want to submit a hard copy of the work to student administration, please hand it in before 17:00 two days before the supervision (i.e. before Wednesday, 17:00 for a Friday afternoon supervision) as I will have to scan it. Remember that Student Administration is closed on weekends.

When emailing me regarding supervisions, please only use my lab address, or your email will be misfiled and may slip by unnoticed:

The mark allocation (whilst very approximate) should give you a rough idea of how you should divide your time between the questions, as well as how much credit I expect a similar question to be worth in the exam. It will also serve as a guide for me when marking the questions.

Some of these questions are asking you to write code, or run things in a shell. In his case, please submit the source code (for coding questions), and a transcript of running it (in case of both coding and shell questions), as shown in the example below:

If there is a particular part of the course you would like explaining, or questions you have about the lectures (independent of whether they are covered by the questions or not), please let me know in an email before the supervision so that I can prepare appropriately.

Supervision 1: Basics, Scheduling and Virtual Memory

Written exercises and exam questions

Give three reasons why multi-programming is a necessity in modern computers. [3 marks]

What are the disadvantages of a monolithic operating system? Why do you think this kind of system was widely used for several decades nonetheless? Given examples for monolithic operating systems, and list the features they were missing compared to modern OSes. [5 marks]

What is the difference between user mode and kernel mode? What mode do you think a device driver would run in (contrasting between different types of operating systems if necessary)? [3 marks]

Explain the concept of a kernel-based OS and a microkernel OS. Which concept do you think is more common today? [3 marks]

Imagine you have a web browser running in your kernel-based operating system, and assume we are not using DMA. Now, while you are reading a website, an email is coming in, thus generating a sequence of incoming network packets. How does the operating system handle this? Describe what happens, including details of the context switches necessary. [5 marks]

Give examples of types of applications that are likely to be CPU-bound and ones that are likely to be I/O-bound. [2 marks]

What is the difference between a non-preemptive and a pre-emptive CPU scheduler and what problem does the latter solve? [3 marks]

Briefly give a high-level description of the concept of virtual memory. Why is it necessary? [Hint: consider the address binding problem for portable binaries] [3 marks]

Practical Exercises

Download the VirtualBox program from the VirtualBox download page and install it. In this exercise, you are going to build your own Linux kernel and run the virtual machine off it. In later exercises, we will also build a custom kernel module that you can load into your kernel.

Download the "plain" virtual machine image from here, and extract the archive into the folder that holds your virtual machines. Under Linux, this is $HOME/.VirtualBox/Machines/, where $HOME is your home directory. Under Mac OS X, this is $HOME/Library/VirtualBox/Machines/. It can, in any case, be configured in the preferences screen of VirtualBox, in case you do not want the VMs in your home directory (which is reasonable!). The image you downloaded is a virtual machine with Debian Etch (5.06) installed, but nothing else. In VirtualBox, use the "Import appliance..." option in the "File" menu to import the VM.

Alternatively: Set up a new Linux 2.6 virtual machine yourself using the wizard. You want about 512 MB RAM for it, and a harddisk size of at least 10 GB (kernel compilation takes up a lot of space, albeit temporarily!). Then install Debian Etch (5.06) on it using the minimal CD image (make sure you get the right version: i386 for 32-bit machines, amd64 for 64-bit machines, even if they are Intel ones!). In most cases, however, it is going to be easier to just use the image provided above, so only do this if you are feeling brave!

Log into your virtual machine using either the console in VirtualBox, or the ssh command in a terminal from a Linux or Mac machine, or use PuTTY for Windows. If you used the image above, your virtual machine is listening for SSH connections on port 22, on IP 10.0.2.15 (depending on your VirtualBox configuration). In either case, log in as root using the password sent to you in an email.Once you have logged in, let's first find out what kernel we are running. In the VM, run the command

# uname -a

What does the output tell us about

the kernel version in use,

the architecture of the machine,

the age of the kernel?

[4 marks]

Next, we need to fetch the kernel source code and unpack it. To get it, run

# apt-get install linux-source-[kernel-version]

(Replace [kernel-version] by the kernel version you found earlier.). Now we need to unpack the source code:

# tar -xjvf linux-source-[kernel-version].tar.bz

This will end up printing a very long list of filenames as the source files are being extracted. Once the extraction is done, let's find out how big the source for the Linux kernel is. Run

# du -h --max-depth=1 linux-source-[kernel-version]

Bonus question: Using the sloccount utility (install using # apt-get install sloccount), find out how many lines of source code there are in the Linux kernel and in each of the subfolders. What is the dominant programming language used? [4 marks]

We now have the unpacked kernel source code. In order to build the kernel, we also have to install a compiler and some tools required, though. We do this as follows:

Now that we have a compiler, we can actually get started on building the kernel. The first step we need to do is to configure it though – in other words, we need to tell the kernel what modules and drivers we would like to include. To make our job easier, we copy the configuration from the default kernel that we are already running on:

Now we would like to review the configuration, and maybe change some module configurations. To do so, we use a semi-graphical frontend called "menuconfig":

# make menuconfig

Inside the configuration menu, select "General setup", navigate to the option "Local version – append to kernel release", press Enter to modify it and set it to "yourname-test1". Exit menuconfig by selecting the "Exit" button using tab and enter keys (you may have to do this several times in order to get back up to the top level). When asked if you would like save your changes, say "Yes".We are now read to build the kernel. To do so, run the following two lines:

The first line will clear any remains of previous build attempts, and the second one will run the actual kernel build and make a binary kernel image. This is going to take a very long time! Once it has finished building the kernel (hopefully without errors), it will display the time it took. What is it? [2 marks]

If the kernel built without errors, we finally need to install it on the machine. In the previous step, we built an installable kernel package, which we can now install using the dpkg tool. First, let's find out what the image is called:

# cd ..
# ls

Look for a .deb file with the string linux-image in the name. Then we install this:

# dpkg -i linux-image-[some-string].deb

This is, again, going to take a while. It will generate an initial ramdisk ("initrd"), install the kernel and update the boot loader settings.

Once the installation has finished without error, we can reboot into our new kernel. To do so, run

# reboot

and wait for the system to come up again. When you see the GRUB boot menu (usually white on blue background), it should list your new kernel (though it may not be the default option). Select it using the arrow keys and boot it by pressing the enter key. Once you have logged in, running

# uname -a

should now return your custom kernel's signature!

Congratulations, you have just compiled and installed you own custom kernel! :-)

Supervision 2: Virtual Memory cont., I/O and Access control

Written exercises and exam questions

Using a block diagram, outline three kinds of memory every process requires, and explain how dynamic memory is broken down into two different spaces itself. [3+1 marks]

Explain what a page fault is, why the happen and how they are handled. [5 marks]

When running a virtual machine (e.g. using VirtualBox) on a machine, performance degrades super-linearly after a certain point in the VM's RAM share. This can lead to counter-intuitive effects such as a VM with 1.5 GB RAM allocated running much slower than one with 512 MB. Why could this be the case? (Assume a system with a total of 2 GB of RAM) [5 marks]

Which page replacement scheme gives optimal performance? How can it be implemented in practice? Explain how the decision on whether to retain or swap out page is made under it. [6 marks]

Why do some people argue that segmentation is a better approach to virtual memory than paging? How is segmentation implemented? What is its downside and how can it be combined with paging to resolve this? [10 marks]

What are the differences between polled mode I/O and interrupt-driven I/O, and the difference between blocking, non-blocking and asynchronous I/O? What kind of I/O would you expect a harddrive to use? [6 marks]

[Optional bonus question: On a linux system (such as your VM), install the atsar utility (Debian package atsar), and run it using atsar -p 60 10 > foo.txt & command. The first number is the time between samples being taken, and the second number is the number of samples (i.e. in the example, 1 sample per minute will be taken, and a total of 10 samples). Have a look at foo.txt afterwards. How many pages are being brought in per second? (this is equivalent to the number of page faults!)] [5 marks]

Practical Exercises

This exercise assumes that you have a working version of VirtualBox and the virtual machine used for last supervision's practical exercises. In this exercise, you are going to build your own custom kernel module that you can load into your Linux kernel.

Log into your virtual machine using either the console in VirtualBox, or the ssh command in a terminal from a Linux or Mac machine, or use PuTTY for Windows. If you used the image above, your virtual machine is listening for SSH connections on port 22, on IP 10.0.2.15 (depending on your VirtualBox configuration). In either case, log in as root using the password sent to you in an email, or your own, if you've changed it.

Depending on whether you are running your own kernel that you built last time or whether you are running the Debian stock kernel, the next steps are slightly different:

For your own kernel, you first have to build the "kernel headers". These are a bunch of C header files that define the API of your kernel, so that modules can be compiled against it. To build them, go into your kernel source directory and run make-kpkg:

If you are using the stock Debian kernel, you can simply fetch the headers by running

# apt-get install linux-headers-[kernel-version]

The next step is to write the source code for our new kernel module. As you know, most of the kernel is written in the C programming language, so we will be using this. If you are not familiar with C, do not worry: You won't be expected to know the exact syntax. Just being able to transcribe the code should work just fine.

Make a directory for your new module somewhere (/root/ should be a good place) and then run the nano editor to write the code:

# mkdir testmodule
# nano helloworld.c

This is going to bring up a black-and-white editor. You do not need to to know much about nano in order to be able to use it; most key bindings are listed at the bottom of the screen. A prefix ^ means "hold Ctrl", so e.g. ^X Exit means that you should press Ctrl-X to exit.

Once you are in nano, type in the following source code (N.B.: Make sure that you transcribe it exactly as given, including all semi-colons, brackets etc. – if you don't, you will be getting weird compilation errors that you may not understand!):

What do you think the purpose of having a init and an exit function is? When do you think they might be called? [3 marks]

Exit nano by pressing Ctrl-X and then pressing Y to confirm that you would like to save the file. We are now almost ready to compile the kernel module.

However, as module compilation frequently takes a very large number of parameters that would be tedious to specifiy on the command line, kernel modules are compiled using so-called "makefiles". These are basically scripts that will automatically call the C compiler with the correct options. In our case, the makefile is quite simple. To create it (note that makefiles are almost always called Makefile, run

Note that the indentation of the make ... lines is using a tabulator rather than spaces. Makefiles are very peculiar about the correct indentation, so make sure that you get this right!Other things to note: the bit saying $(shell uname -r) will be expanded to the output of the uname -r command at compile time (try the command for yourself to find out what it does!), and $(PWD) is just a pre-defined shell variable that always points to the current directory. In other words, we are telling the compiler that the module directory is the current directory, and we also specify where it finds the kernel module headers.

Exit nano and save the makefile. Now it's time to build and test our module! To do so, simply run

# make

If no errors are reported, our module has built successfully and its binary code has been written to a file called helloworld.ko ("O" because it is an object file, and "K" because it is a special kernel object file). To have a look at the binary code, run

# hd helloworld.ko

which will produce a hex-dump of the module binary. What are the values of the first four bytes? [2 marks]

Now let's actually load the module into the running kernel. To do so, run

# insmod ./helloworld.ko

and ignore the message about the license tainting the kernel (we will fix that shortly). What is the output that the module generates? [3 marks]

We can list all currently loaded kernel modules by running the lsmod command. To check if ours is also amongst them, let's use the grep command to filter the output:

# lsmod | grep hello

What output does this produce? [2 marks]

To remove (unload) the module again, run

# rmmod ./helloworld.ko

The final step of this exercise is to get rid of the license warning when loading the module. Because Linux is an open-source operating system kernel, there are certain rules applying to the licenses kernel modules can be under. Because we have not specified a license, the kernel is suspicious of our module and complains. To fix this, open the module source file (helloworld.c) again and insert the following lines after the last #include ... line (this will put our module under the GNU General Public License):

These are a set of C preprocessor macros (don't worry about what that means, you'll learn in second year!) that the compiler will magically expand into bits of code that define the correct module parameters. Exit nano again, and compile your module by typing make. Once the module has compiled without errors, you can use the modinfo command to check that the new information is actually in there:

# modinfo helloworld.ko

What is the output from this command? [2 marks]

Try loading the module into the kernel again. This time, no warning message should come up :-)

[Bonus exercise for the keen: If you are interested in more advanced kernel module building, you could have a look at this page, which tells you how to build your own character device that you can read from at /dev/hello. However, be warned that kernel hacking very quickly gets very hairy and can suck up huge amounts of time!]

Supervision 3: UNIX and Windows NT case studies

Written Exercises and Exam Questions

What is, in the context of persistent data storage in an operating system, a directory service and why is it necessary? What is the abstraction that directory services use nowadays? [5 marks]

Describe the difference between access control lists and capabilities as protection primitives. Which of them is in use in most modern operating systems and why? [5 marks]

In one sentence each, describe the philosophies of the UNIX and Windows families of operating systems. [2 marks]

Practical Exercises

This exercise assumes that you have a working version of VirtualBox and the virtual machine used for last supervision's practical exercises, or another Linux installation that you can use. In this exercise, you are going to build your own shell for user interaction with the operating system.

Log into your virtual machine using either the console in VirtualBox, or the ssh command in a terminal from a Linux or Mac machine, or use PuTTY for Windows. If you used the image above, your virtual machine is listening for SSH connections on port 22, on IP 10.0.2.15 (depending on your VirtualBox configuration). In either case, log in as root using the password sent to you in an email, or your own, if you've changed it.

The first step is to write some very basic soruce code for our shell. We will write our shell in the C programming language. If you are not familiar with C, just follow the instructions and use Google to work things out if they are not explained.

Make a directory for your shell somewhere (/root/ should be a good place). Conventially, shells are named [author's last name initial]sh, so mine might e.g. be called ssh – except that is already the name of an existing Linux program, so I'll call it msh instead (actually, msh is also a linux program, and a shell no less, but its rarely used nowadays, as opposed to ssh). Now, after making a directory for the course code, let's run the nano editor to write the code:

# mkdir msh
# nano msh.c

Once you are in nano, type in the following source code (N.B.: Make sure that you transcribe it exactly as given apart from the blue bits, including all semi-colons, brackets etc. – if you don't, you will be getting weird compilation errors that you may not understand!):

Note that the indentation of the make ... lines is using a tabulator rather than spaces. Makefiles are very peculiar about the correct indentation, so make sure that you get this right!Other things to note: with this makefile, you can use make clean to delete any compiled outputs (e.g. to make sure they are regenerated). make all will compile object files and link them into an output binary, and just make is an alias for make all.

Now it's time to build and test our shell! To do so, simply run

# gcc -o mshmsh.c

if you did not use a makefile, and

# make

otherwise. If no errors are reported, the code has built successfully and its binary code has been written to a file called msh. Run this by typing ./msh What does it output? [2 marks]

Now that was a bit boring, so let's make this into a real shell with a prompt. To do so, open your source code file (e.g. msh.c), and change it to contain:

Close the editor, compile again and run the output binary again. What is its behaviour now? [Hint: Ctrl-D sends the EOF character.] [3 marks]

Finally, we need to put the two examples above together in order to create a shell that has both a prompt and can execute commands. To do this, we need to:

Capture the command the user is typing in.

Handle actual execution of the command.

Handle the case of an unknown command being entered.

To capture the command, we save it into a string as the user is typing it in. To do so, first add the following header inclusions after the last #include ... line:

#include <string.h>
#include <sys/wait.h>

Now, we need to set some memory aside to store the string that will hold the command while it is being entered. We do this using the malloc() standard library function. After the char c = '\0'; line, add:

char *tmp = (char *)malloc(sizeof(char) * 100);

We also need to replace the if-statement that checks if we have a newline character or something else by the following switch statement:

Now what's going on here? For each character entered, we check if it is a newline character (in which case the user has finished entering the command and wants to run it) or another character (in which case we need to store the character in the tmp string). If the command is finished, we call fork() to create a new process and use execvp to run the command in it. Note that the forked process will terminate once the command has finished running, but that the original process will continue running and display the prompt again once the command has finished. Why do you reckon we are calling the bzero() function after running the command, and what do you think it might do? What would happen if we did not call it? [3 marks]

Compile and run the finished shell. You should now be able to run commands like ls, nano and uname in it. You can even try running your shell from within your shell by typing ./msh inside it. What happens in that case? [1 mark]

Obviously, this is a very simple shell, and it is missing many features. What are the missing features that you can spot? [Hint: consider for example command history, command editing, passing arguments, handling and escaping quotes, output redirection, pipes etc.] [5 marks]

[Bonus exercise for the keen: Many of the missing features are not that hard to add. Investigate things like the readline library, the flex/bison parser generators (see example files here and here), ANSI coloured-prompts, your own shell scripting language, ...]