W4118 OPERATING SYSTEMS I

Spring 2009 -- Junfeng Yang

The written problems are to be done individually. The programming
problems are to be done in your assigned programming groups. All
homework submissions are to be made
via Courseworks. Refer
to the homework policy
section on the class web site for further details.

Written Assignment (30 pts)

Exercise numbers refer to the course textbook, Operating System
Concepts. Each problem is worth 5 points.

Write a code example in less than 20 lines of C code that
causes the lock order algorithm for deadlock detection (discussed in
class) to report a false positive.

Eraser uses per-thread locksets, i.e., the locks in a thread's
lockset are only those that the thread acquired. Of course, the
locks that are currently locked is a global property and do not
depend on which thread is running. What would happen if Eraser used
this global rather than per-thread lockset when checking for races?
Give a concrete code example in less than 20 lines of C.

Ch 5 Exercise 12

Ch 5 Exercise 15

Ch 5 Exercise 17

Describe how the time slice for a SCHED_NORMAL process can be
updated in Linux 2.6.11. Be sure to list the relevant procedures,
data structures, and variables; explain the meaning of
each.

Please complete and submit a private group
programming assignment evaluation. The evaluation should be in
a separate file called evaluation.txt that is submitted
with your individual written assignment submission.

Programming Assignment: Adding Fair Scheduling to Linux (70 pts)

Programming problems are to be done in your assigned groups using
copies of the virtual machine (VM) image already provided. For all programming problems
you will be required to submit source code, a README file documenting
your files and code, and a test run of your programs. In addition, you
should submit a cover sheet using either
homework_work.txt
or homework_nonwork.txt, depending on
whether or not the programming assignment is completely working or
not. For source code submissions, you only need to submit new source
code files that you created and kernel source code files that you
changed. You should clearly indicate your names, email addresses, and
assigned group number on your submission. Each group is required to
submit one writeup for the programming assignment.

You will build your own Linux kernel for this class. The image of
the kernel you build for this assignment should be
vmlinuz.hmwk4. Grading for this assignment will be done based on
vmlinuz.hmwk4. You will base your changes on the 2.6.11 kernel that you installed for Homework 2,
but without the modifications you made for the new systems calls
you wrote for Homework 2. In particular, submitted kernel patches should be
patched against your modified KDB-enabled kernel from Homework
2, i.e. do not include KDB code in your patch.

Description

The assignment is to add a fair share scheduler to the kernel.

The Linux scheduler repeatedly switches between all the running tasks on the system, attempting to
give a fair amount of CPU time to each task. Fair-share scheduling is a strategy that shares the CPU
among sets of processes according to the users that own those processes. For example, suppose that
Alice, Bob, and Carol belong to different users and are logged in to a machine that uses fair-share
scheduling. Suppose Alice has one runnable process, Bob has three, and Carol has six. Fair-share
scheduling views the ten runnable processes as belonging to the three users, and each user receives
one-third of the CPU cycles to allocate to the processes that it owns. So Alice's single process
would get about 33 percent of the available CPU cycles, each of Bob's three processes would get
roughly 11 percent of the available CPU cycles, and each of Carol's six processes would get about
5.5 percent of the CPU cycles.

Add a new scheduling policy to support fair-share scheduling. Call this policy UWRR, for user weighted
round-robin scheduling. UWRR should use fair-share scheduling based on each process's UNIX user
identification. At each invocation of the scheduler, we will use a hierachical scheme to choose which
task to run: first, a user is chosen, then, a task within that user's set of tasks is chosen. If we
allow each user's chosen task the same amount of time on the CPU, each user should be represented
equally.

Since we are making two choices in each scheduling decision, we need two algorithms: one to choose
the user, and one to choose one of that user's tasks. Let's start by assuming we will use a
Round-Robin scheme to decide which user to choose. We keep a queue of users, and whichever user
was chosen last time we scheduled, we choose the next user in the queue for this schedule. We then
choose a task within this user's set of tasks (more on this later), and let it run for a fixed
amount of time (the time quantum). Now every user gets an equal amount of CPU time.

Now let's say that some users are more equal than others. Imagine that we associate an integer,
which we'll call a weight, with each user, and when a user is chosen, we let the user's chosen tasks
run for W time quanta (instead of a single time quantum), where W is the user's weight. In this way
we can specify that some users get more CPU time than others, and also how much more. A user with a
weight of 3 will get 50% more CPU time than a user with weight 2. This is called proportional sharing.
More specifically, this implementation of proportional sharing is called Weighted Round-Robin.

We still haven't specified how a task is to be chosen once we choose a user. For simplicity, use
a simple round-robin (RR) scheduling algorithm to do that. The intra-user RR scheduler should use
the same default timeslice as the Linux scheduler uses for a task with the default nice value.
Otherwise, the RR scheduler should work the same as UWRR except that it schedules tasks, not users,
and there are no different task weights.

You are strongly advised to read Understanding Linux Kernel Chapter 7 Process Scheduling,
before starting the programming assignment. Most of the code of interest for this assignment is in
kernel/sched.c and include/linux/sched.h. These are probably not the
only files you will need to look at, but they're a good start. While there is a fair amount of code
in these files, a key goal of this assignment is for you to understand how to abstract the scheduler code
so that you learn in detail the parts of the scheduler that are crucial for this assignment and ignore
the parts that are not.

Implement user weights. Add two system calls, getuserweight()
and setuserweight(), with the following prototypes:

int getuserweight(int uid);

int setuserweight(int uid, int weight);

Give getuserweight() syscall number 318, and setuserweight()
number 319. getuserweight() should return the weight of
the specified user, or -1 on error. The default user weight should be
10. setuserweight() should give the specified user the
specified weight, and return 0, or -1 on error.

For either call, a uid value of -1 indicates the current
user (that is, the owner of the task that is making the system call).
Otherwise, the range of valid uids is 0-65535. A bad uid should result
in a return value of -1 and errno set to EINVAL. Only
root is allowed to call setuserweight(); a call by any
other user should return -1 and set errno to EACCES.
Also, a user's weight must be greater than zero. Any attempt to set a
bad weight is an error, and errno should be set to EINVAL.

Put your code for these two system calls at the bottom of kernel/sched.c.

Your scheduler should operate alongside the existing Linux
scheduler. Therefore, you should add a new scheduling policy, SCHED_UWRR.
The value of SCHED_UWRR
should be 4. Only tasks whose policy is set to SCHED_UWRR
(normally done via the system call sched_setscheduler())
should be considered for selection by your new scheduler.

Your scheduler and the existing scheduler should interoperate as follows:

If there are any running tasks whose policy is SCHED_RR
or SCHED_FIFO (the real-time tasks), one of them should
be chosen before any other, according to the existing Linux scheduler
rules.

If there are any running tasks whose policy is SCHED_UWRR
(your scheduler), one of them should be chosen before any SCHED_NORMAL
(the default scheduler) tasks, according to the rules outlined.

If all running tasks have a policy of SCHED_NORMAL
(or SCHED_BATCH), the default scheduler should be used to
choose a task, according to the default scheduler's rules.

In other words, there should be a strict priority
relationship where SCHED_RR and SCHED_FIFO
are first, SCHED_UWRR is second, and SCHED_NORMAL
and SCHED_BATCH are last.

Making your scheduler operate alongside the existing may
seem like an unnecessary difficulty, but it spares you the
(significant) difficulty of dealing with scheduling during
initialization, so you can get a system up and running before any of
your own code goes into action.

If a user who has no runnable SCHED_UWRR tasks
has a SCHED_UWRR task become runnable, that user should
go to the end of the round-robin queue (should be the last user to get
a turn).

The base time quantum should be 100 ms. This means that if a
user has a weight of 5, she gets 500 ms at a time with which to
run her SCHED_UWRR tasks. Note that this does not mean that every one of her SCHED_UWRR
tasks runs for 500 ms. How long a task is allowed to run depends on the
intra-user RR scheduler.

Note that you will need to change the timer interrupt
handler so that schedule gets called at the right time. See scheduler_tick()
in kernel/sched.c.

Linux uses an array data structure for keeping track of
tasks for scheduling. Tasks in this array are indexed by priority. You
may find this data structure useful for implementing UWRR. However,
your scheduler should be designed to work with any number of users and
should not assume only a fixed number of users.

Hint: when you see the need_resched
field of the currently running task get set to 1, this is a signal to
some assembly code on the return path of the interrupt handler. It
causes schedule() to get called just before returning
from the current interrupt handler, and is the correct way to make a
schedule happen from within an interrupt handler, where you can't call schedule()
directly.

Hint: the user_struct structure in include/linux/sched.h
is used for tracking users. This structure may be helpful in creating a
useful way to keep track of users.

(10 pts.) Create a simple test program to show that your scheduler works

Write a program that goes into an infinite while(1)
loop. This program would normally take up 100% of the cpu. Now
create three different users Alice, Bob, and Carol and have them each
run one or more copies of the loop program. Use the program top to monitor
the CPU usage, both before and after you make your scheduler
modifications. Show that your scheduler is both fair (i.e., if
the weights are set equally Alice, Bob, and Carol each get 1/3 of the
CPU) and can be weighted (i.e., set weights 3, 2, and 1 and show that
Alice gets three time the CPU of Carol and 50% more than Bob).
Users can be added to the system using the useradd program (e.g., useradd alice).

Create the patch file for submission

Your submission should be a kernel patch of the Linux kernel with KDB
that you created in the previous problem. A patch is used to
store changes you've made to the kernel and to submit your
changes for this problem. For example, if you've changed 50
lines of code, a patch will be about 50 lines long. This is much
easier to store, email, or move around than the full
283,000-line source tree. To create a patch, first back up your
.config file, then do a "make distclean" in your
changed source tree (you don't want object files, config files,
etc. in your patch). Then, from the directory above your source
tree (e.g. /usr/src), run

Notice that the original source tree is first, and the one
containing your modifications is second. There is a particular
"algebra" to patches. You may find it helpful to think of diff
as subtracting two source trees. The result of this subtraction
is a patch.

Additional Information

Remember, as a safety measure, you are strongly
encouraged to backup your VMware image from time to
time.

printk statements in system calls will print their
output to the console whenever they are invoked. To
request printk messages be sent to a log file, insert
the following line into the /etc/syslog.conf
file: kern.* /var/kern.log This will cause
printk messages to be written to /var/kern.log. You
can send a signal to request the syslog daemon re-read
the updated /etc/syslog.conf file, with the
command "kill -1 pid" where pid is the pid of
the syslogd process.

A lot of your problems will come from system
administration issues. If nobody in your group in
familiar with unix, you might want to pick up a book
on Unix/Linux system administration.

You can find some Linux programming tips at the
course resources
page.