AOS Design Guide

Introduction

The AOS project is a very large and complex system. Often the system is
built in an ad-hoc manner with little or no design beforehand, frequently
with the attitude that it will be cleaned up later.

This approach typically produces a system with only partial functionality,
and involves a lot of writing and re-writing of the same code. It is usually
also the case that "I'll do it later" never eventuates.

This page lists some of the design choices involved when building SOS, so
that you can understand the difficulties before they occur. You are free to
borrow or ignore as much or as little of this information as you like. This
is only a guide, and any SOS project you want to produce is fine.

Data Structures and Subsystems

While designing SOS you will need to consider a number of different
subsystems and data structures, and how they inter-relate.

Some systems you will need to consider include:

A task/thread startup protocol

I/O

File system access

Console access

A VFS-like layer

[Virtual] Memory

Page tables

Swapping

Frame management

Heap management

Thread and Process management

Namespace(s)

Drivers

Scheduling, run queues and blocking system calls

Some systems you probably don't need to implement, but still might like to
consider are:

Resource management

Access control / multi-user issues

Networking

A page-cache

Inter-process communication

In Milestone 0 you are required to pass data from an application to the OS
in the form of text data for printing. This is an example of passing
parameters and data between L4 address spaces.

There are many ways to pass data between applications, including but not
limited to the following:

UNIX-like copyin/copyout

Many short-IPC messages

Long IPC strings (NB: These may not be supported by your kernel)

fpage mapping

Memory objects

Which model you use is up to you; however, some mechanisms are more
appropriate for some systems than for others (eg. copyin/copyout will not
work well with a multi-server OS). A poorly designed IPC primitive is a common
problem and leads to an inefficient system. You must ensure it provides fast
communication for small requests (eg. simple system calls) and is also
efficient when dealing with large amounts of data (eg. file access). You must
also ensure your IPC mechanism is free of denial-of-service (DoS)
vulnerabilities.
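
As a rough illustration of the "many short messages" option, the sketch below
splits a large buffer into fixed-size chunks and reassembles it on the
receiving side. It does not use any real L4 call: struct short_msg,
MSG_PAYLOAD, MAX_TRANSFER and the three functions are hypothetical names
invented for this example, and the sizes are arbitrary. The points it tries to
show are the hard cap on transfer size and the receiver checking every field
it is handed, both of which help against denial of service.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits, chosen for illustration only. */
#define MSG_PAYLOAD   32u     /* bytes carried by one short message */
#define MAX_TRANSFER  4096u   /* hard cap per request, bounds server work */

/* One short message: a chunk index plus a small payload. */
struct short_msg {
    size_t seq;                       /* index of this chunk in the transfer */
    size_t len;                       /* valid bytes in data[] */
    unsigned char data[MSG_PAYLOAD];
};

/* Number of messages needed for len bytes, or 0 if the request is refused. */
size_t chunks_needed(size_t len)
{
    if (len == 0 || len > MAX_TRANSFER)
        return 0;
    return (len + MSG_PAYLOAD - 1) / MSG_PAYLOAD;
}

/* Sender side: fill chunk number seq from the source buffer. */
int make_chunk(const unsigned char *src, size_t len, size_t seq,
               struct short_msg *out)
{
    size_t n = chunks_needed(len);
    if (n == 0 || seq >= n)
        return -1;
    size_t off = seq * MSG_PAYLOAD;
    assert(off < len);                /* follows from seq < n */
    size_t left = len - off;
    out->seq = seq;
    out->len = left < MSG_PAYLOAD ? left : MSG_PAYLOAD;
    memcpy(out->data, src + off, out->len);
    return 0;
}

/* Receiver side: copy one chunk into dst without trusting the sender. */
int accept_chunk(unsigned char *dst, size_t dst_size,
                 const struct short_msg *m)
{
    if (m->len == 0 || m->len > MSG_PAYLOAD)
        return -1;
    if (m->seq > dst_size / MSG_PAYLOAD)
        return -1;                    /* offset would lie past the buffer */
    size_t off = m->seq * MSG_PAYLOAD;
    if (m->len > dst_size - off)
        return -1;                    /* chunk would overrun the buffer */
    memcpy(dst + off, m->data, m->len);
    return 0;
}
```

A transfer of this kind costs one message per chunk, so it suits small
requests; for file-sized data a mapping-based scheme is likely to be cheaper.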

You are free to use existing systems for interface and stub
generation, eg. the Magpie IDL generator.

Debugging

Debugging your SOS project can often be a very difficult
task. For more information on debugging in L4 have a look at this page.

L4 Server Thread Management

Regardless of what kind of OS personality you choose to write, you will
need to deal with multiple asynchronous requests to your L4 server(s). There
are a number of ways this can be achieved. Once again, this list is neither
exhaustive nor truly distinct.

Single threaded servers

Have a single server thread and use continuations to avoid blocking
and denial of service (DoS).

Single thread per 'application'

At the cost of extra resources you can remove some of the problems of
blocking threads.

Single thread per 'session'

Having more than one L4 server can create a headache for
multi-threaded solutions. Using session semantics can help to solve
this.

Worker threads per sub-system

Dedicating a thread to each subsystem can help place an upper bound
on the number of threads that need to be created (and deleted).

Stack-switching

A light-weight alternative to multi-threading is to have a separate
stack per client/application. This can be simpler than multiple threads
to manage and easier than explicit continuations. Debugging, however, can
be ... interesting.

Worker thread pool

Using a multi-threaded solution can lead to wasted resources. By
creating a pool of worker threads that can grow and shrink, resource
usage can be controlled.
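
The grow-and-shrink decision for such a pool can be kept separate from the
threading code itself. The sketch below shows only that sizing policy, not
thread creation: struct pool_policy, pool_target and pool_adjust are
hypothetical names, and the rule used (one worker per fixed number of queued
jobs, clamped between a minimum and a maximum) is just one plausible choice.

```c
#include <assert.h>

/* Hypothetical tuning values for a worker pool. */
struct pool_policy {
    unsigned min_workers;      /* never shrink below this */
    unsigned max_workers;      /* never grow above this: bounds resource use */
    unsigned jobs_per_worker;  /* backlog one worker is expected to absorb */
};

/* How many workers the pool should have for the given backlog. */
unsigned pool_target(const struct pool_policy *p, unsigned queued_jobs)
{
    assert(p->min_workers <= p->max_workers && p->jobs_per_worker > 0);
    /* Ceiling division written so that it cannot overflow. */
    unsigned want = queued_jobs / p->jobs_per_worker
                  + (queued_jobs % p->jobs_per_worker != 0);
    if (want < p->min_workers)
        want = p->min_workers;
    if (want > p->max_workers)
        want = p->max_workers;
    return want;
}

/* Positive result: workers to create. Negative: idle workers to retire. */
int pool_adjust(const struct pool_policy *p, unsigned current,
                unsigned queued_jobs)
{
    unsigned target = pool_target(p, queued_jobs);
    return (int)target - (int)current;
}
```

Because max_workers is a hard limit, a flood of requests lengthens the queue
rather than creating threads without bound.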

All the above techniques are valid for use in SOS. Each has its own
advantages that make it easier/nicer/faster/better/whatever. Each also has
its own problems with implementation and testing. Make sure your solution
is deadlock free, DoS free, thread safe, efficient and easy to debug.
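
To make the single-threaded option more concrete, here is a minimal sketch of
explicit continuations. Instead of blocking on a slow operation, the server
saves "what to do next" in a table and returns to its main loop; when the
completion event arrives, the saved function is run. All names here
(cont_save, cont_resume, struct read_req, read_finish, MAX_PENDING) are
hypothetical, and a real server would send an IPC reply where the comment
indicates.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8   /* hypothetical bound on outstanding requests */

/* A continuation: what to run, and with which saved state, once an
 * outstanding operation completes. */
typedef void (*cont_fn)(void *state, int result);

struct continuation {
    cont_fn fn;         /* NULL marks a free slot */
    void   *state;
};

static struct continuation pending[MAX_PENDING];

/* Save a continuation instead of blocking. Returns a token for the
 * completion event to carry, or -1 if the table is full. Refusing
 * here, rather than queueing without limit, bounds memory use. */
int cont_save(cont_fn fn, void *state)
{
    assert(fn != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (pending[i].fn == NULL) {
            pending[i].fn = fn;
            pending[i].state = state;
            return i;
        }
    }
    return -1;
}

/* Called from the server loop when a completion event arrives. */
int cont_resume(int token, int result)
{
    if (token < 0 || token >= MAX_PENDING || pending[token].fn == NULL)
        return -1;                  /* stale or forged token */
    struct continuation c = pending[token];
    pending[token].fn = NULL;       /* free the slot before running */
    c.fn(c.state, result);
    return 0;
}

/* Saved state and second half of a hypothetical read system call. */
struct read_req {
    int client;       /* who to reply to */
    int reply_sent;   /* set once the reply would have been sent */
    int nbytes;       /* result reported to the client */
};

void read_finish(void *state, int result)
{
    struct read_req *r = state;
    r->nbytes = result;
    r->reply_sent = 1;   /* a real server would send the reply here */
}
```

The cost of this style is that every blocking call has to be split into a
"start" half and a "finish" half, with the state in between stored by hand.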

Advanced Work

In AOS you are encouraged to do extra work that you are specifically
interested in. This can be in the form of additions to the SOS project, or
entirely different OS related projects. Some suggested SOS enhancements for
extra marks are listed below. Feel free to come up with your own ideas. If
you would like to do one of these (or your own suggestion), talk to us about
what is to be done and how marks will be awarded.

Disclaimer: None of these advanced
features is easy or trivial. You should avoid making your project
dependent on any of these as there is a good chance you will not complete
it. You should try to make sure you are at least one to two milestones ahead
of the marking schedule before you attempt one of these. The motto is
"make it work, then make it fancy".

Device Driver

Write a more complex device driver than the OSTS timer. Examples
include USB controllers, USB disk drives, etc. Marks can
be awarded for a partially functional driver.

Protocol Implementation

Implement a common network protocol within your SOS project. For
example, port an ssh daemon to SOS (or implement your own). The difficulty
of the protocol or port will dictate the bonus marks for it.

Virtualise SOS

OS virtualisation is a hot topic right now. Demonstrate two
copies of SOS running on L4, each running user-land
applications. Each copy of the OS must be segregated from
the other, and share device drivers securely.

Orthogonal Persistence

Filesystems are passé. Implement orthogonal persistence in SOS so
applications restart at the point they had reached when the system was shut
down. Fault tolerance is not necessary.

Multi-user SOS

The standard SOS project has memory protection, but no access control.
Implement ACL- or capability-based access control and provide resource
accounting. Multiple concurrent users can be provided with a trivial
telnet server.
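
For the capability-based variant, one possible shape is a per-process table of
capabilities, each naming an object and the rights held over it. The sketch
below is an assumption-laden illustration, not a prescribed SOS design: the
rights bits, struct cspace and the cap_* functions are all hypothetical.
It shows the two properties that matter most: every access is checked against
the caller's own table, and a derived capability can never carry more rights
than its parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rights bits. */
enum { RIGHT_READ = 1, RIGHT_WRITE = 2, RIGHT_GRANT = 4 };

#define MAX_CAPS 16

struct capability {
    int      in_use;
    unsigned object;   /* id of the file, frame, etc. being protected */
    unsigned rights;   /* bitmask of RIGHT_* values */
};

/* One table per process; a process names a capability by slot index. */
struct cspace {
    struct capability slot[MAX_CAPS];
};

void cspace_init(struct cspace *cs)
{
    assert(cs != NULL);
    for (int i = 0; i < MAX_CAPS; i++)
        cs->slot[i].in_use = 0;
}

/* Install a capability; returns its slot, or -1 if the table is full. */
int cap_install(struct cspace *cs, unsigned object, unsigned rights)
{
    for (int i = 0; i < MAX_CAPS; i++) {
        if (!cs->slot[i].in_use) {
            cs->slot[i].in_use = 1;
            cs->slot[i].object = object;
            cs->slot[i].rights = rights;
            return i;
        }
    }
    return -1;
}

/* Nonzero if the slot holds a capability for object with all needed rights. */
int cap_check(const struct cspace *cs, int slot, unsigned object,
              unsigned need)
{
    if (slot < 0 || slot >= MAX_CAPS || !cs->slot[slot].in_use)
        return 0;
    const struct capability *c = &cs->slot[slot];
    return c->object == object && (c->rights & need) == need;
}

/* Hand a weaker copy to another process. Requires RIGHT_GRANT on the
 * parent, and the copy may only hold a subset of the parent's rights. */
int cap_derive(const struct cspace *from, int slot, struct cspace *to,
               unsigned rights)
{
    if (slot < 0 || slot >= MAX_CAPS || !from->slot[slot].in_use)
        return -1;
    const struct capability *c = &from->slot[slot];
    if ((c->rights & RIGHT_GRANT) == 0)
        return -1;
    if ((rights & ~c->rights) != 0)
        return -1;                  /* attempt to add rights */
    return cap_install(to, c->object, rights);
}
```

The fixed table size also acts as a crude form of resource accounting, since
no process can hold more than MAX_CAPS capabilities.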

Alternate System Models

There are a number of different general system structures you can use when
building an OS personality on top of L4. A few major categories are listed
here. Be aware that this is in no way an exhaustive list, and you are
encouraged to come up with or design your own. These categories are also very
rough, and in no way well defined.

Monolithic System

Most operating systems, eg. Windows,
Linux and any BSD (Net, Free, Open, ...) use a monolithic system
model, with most, if not all, of the system implemented in a monolithic
kernel.

By placing all system components in the same address space communication
is done trivially with shared memory and function calls. Of course, this
means that all code in that address space is trusted and can bring down
the whole system.

This image shows an example of how a typical
monolithic (UNIX) system is laid out.

This model is the most commonly used for SOS as it is the easiest to
implement and debug. Within this model there is plenty of scope
for creativity and ample work for two people. Other models are presented
here so you can incorporate ideas and abstractions. Students with a keen
caffeine dependence are welcome to try them for a system model. There are
very few systems to source examples from and many have problems which are
as yet unsolved. If you choose a system model other
than monolithic, you have been warned.

Single Address Space (SAS)

Single-address-space systems are designed to make sharing between
applications easier. By placing all applications and the kernel in the
same address space (translation), pointers can be passed around while
maintaining their meaning. To preserve security, the system needs to
implement protection as an orthogonal abstraction.

While a SAS is not necessarily an entirely different model (it could be
monolithic or multi-server), it does offer some interesting design
decisions. Instead of the typical SOS filesystem you could create a
persistent system. A SAS also helps in making a distributed shared memory
system.
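
One way to picture protection as an abstraction separate from translation: in
a single address space every protection domain sees the same data at the same
address, and a separate table records which domain may touch which range. The
sketch below is only an illustration under that assumption; struct grant,
sas_allowed and the SAS_* rights are hypothetical names, and a real system
would enforce this through the page tables rather than a software lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access rights. */
enum { SAS_READ = 1, SAS_WRITE = 2 };

/* One entry of the protection table: a domain's rights over a range.
 * Addresses mean the same thing in every domain; only rights differ. */
struct grant {
    int       domain;
    uintptr_t base;
    size_t    size;
    unsigned  rights;
};

/* Nonzero if domain may perform the needed access on [addr, addr + len). */
int sas_allowed(const struct grant *grants, size_t ngrants, int domain,
                uintptr_t addr, size_t len, unsigned need)
{
    assert(grants != NULL || ngrants == 0);
    for (size_t i = 0; i < ngrants; i++) {
        const struct grant *g = &grants[i];
        if (g->domain != domain || (g->rights & need) != need)
            continue;
        /* Range check written to avoid overflow in addr + len. */
        if (addr >= g->base && len <= g->size &&
            addr - g->base <= g->size - len)
            return 1;
    }
    return 0;
}
```

With such a table, two domains can exchange a raw pointer and both can
dereference it, provided each has been granted suitable rights over the range.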

Multi-Server System

A multi-server OS is the holy grail of microkernel systems. By
decomposing the system into components (eg. VFS, file systems, memory
management, naming), each in its own address space, the system can be
constructed in a more flexible way. The layout of such a system is
depicted in this image. Multi-server OSes have a
strong tendency to move more work into application libraries rather than
the OS modules, while still preserving security.

L4 is a flexible and fast microkernel; however, there are other
designs for microkernel (and microkernel-like) systems, including L3, EROS,
Mach, Exokernel, Topsy and K42.

You may choose to pick an existing microkernel (or come up with your
own abstractions) and implement this on top of L4. You will also need to
implement elements of SOS on this system to demonstrate its usefulness.

OS161 on L4

Many students are probably familiar with OS161 used in
the introductory operating systems course. OS161
is a specific example of a (simple) monolithic kernel written for the
MIPS32 architecture.

One option for the project is to port OS161 to L4 by adding
a new 'L4' architecture. Because OS161 is already quite
complete, it will also be necessary to add extensions and
features to bring this project in line with writing SOS from
scratch.