Operating System
What is an operating system?
An operating system is a layer of software
which takes care of technical aspects of a
computer's operation.
It shields the user of the machine from the
low-level details of the machine's operation
and provides frequently needed facilities.
You can think of it as being the software
which is already installed on a machine,
before you add anything of your own.
Normally the operating system has a number
of key elements: (i) a technical layer of
software for driving the hardware of the
computer, like disk drives, the keyboard and
the screen; (ii) a filesystem which provides a
way of organizing files logically, and (iii) a
simple command language which enables
users to run their own programs and to
manipulate their files in a simple way.
Some operating systems also provide text
editors, compilers, debuggers and a variety
of other tools.
Since the operating system (OS) is in charge
of a computer, all requests to use its
resources and devices need to go through
the OS.
An OS therefore provides (iv) legal entry
points into its code for performing basic
operations like writing to devices.
Operating systems may be classified by both
how many tasks they can perform
`simultaneously' and by how many users can
be using the system `simultaneously'.
That is: single-user or multi-user, and single-task or multi-tasking. A multi-user system
must clearly be multi-tasking.
The first of these (MS/PC DOS and Windows 3.x) are single-user, single-task systems which build on a ROM-based library of basic functions called the BIOS.
These are system calls which write to the
screen or to disk etc.
Although all the operating systems can
service interrupts, and therefore simulate the
appearance of multitasking in some
situations, the older PC environments cannot
be thought of as multi-tasking systems in
any sense.
Only a single user application could be open
at any time.
Windows 95 replaced the old coroutine
approach of quasi-multitasking with a true
context-switching approach, but it remained a single-user system, without proper memory protection.
The Macintosh system 7 can be classified as
single-user quasi-multitasking. That
means that it is possible to use several user
applications simultaneously.
A window manager can simulate the
appearance of several programs running
simultaneously, but this relies on each
program obeying specific rules in order to
achieve the illusion.
The Macintosh is not a true multitasking system in the sense that, if one program crashes, the whole system crashes.
Windows is purported to be preemptive
multitasking but most program crashes also
crash the entire system.
This might be due to the lack of proper
memory protection. The claim is somewhat
confusing.
AmigaDOS is an operating system for the
Commodore Amiga computer.
It is based on the UNIX model and is a fully multitasking, single-user system.
Several programs may be actively running at any
time.
The operating system includes a window
environment which means that each independent
program has a `screen' of its own and does not
therefore have to compete for the screen with other
programs.
This has been a major limitation on multi-tasking
operating systems in the past.
MTS (the Michigan Terminal System) was the first time-sharing multi-user system.
It supports only simple single-screen terminal
based input/output and has no hierarchical
file system.
Unix is arguably the most important operating
system today, and one which we shall
frequently refer to below.
It comes in many forms, developed by
different manufacturers.
Originally designed at AT&T, UNIX split into
two camps early on: BSD (the Berkeley Software Distribution) and System V (the AT&T license).
The BSD version was developed as a
research project at the University of California, Berkeley.
Many of the networking and user-friendly
features originate from these modifications.
Unix is generally regarded as the most portable and
powerful operating system available today by
impartial judges, but NT is improving quickly.
Unix runs on everything from laptop computers to
CRAY mainframes.
It is particularly good at managing large database
applications and can run on systems with hundreds
of processors.
Most Unix types support symmetric multiprocessing and all support simultaneous logins by
multiple users.
NT is a `new' operating system from
Microsoft based on the old VAX/VMS kernel
from the Digital Equipment Corporation
(VMS's inventor moved to Microsoft) and the
Win32 API.
Initially it reinvented many existing systems,
but it is gradually being forced to adopt many
open standards from the Unix world.
It is fully multitasking, and can support multiple users (but only one at a time - simultaneous logins by different users are not possible).
It has virtual memory and multithreaded support for several processors. NT has a built-in object model and security framework which is amongst the most modern in use.
Hierarchies and black boxes
A hierarchy is a way of organizing information
using levels of detail.
The phrase high-level implies few details,
whereas low-level implies a lot of detail,
down in the guts of things.
A hierarchy usually has the form of a tree,
which branches from the highest level to the
lowest, since each high-level object is
composed of several lower-level objects.
The key to making large computer programs
and to solving difficult problems is to create a
hierarchical structure, in which large high-level problems are gradually broken up into
manageable low-level problems.
Each level works by using a series of `black
boxes' (e.g. subroutines) whose inner details
are not directly visible.
This allows us to hide details and remain
sane as the complexity builds up.
Resources and sharing
A computer is not just a box which adds
numbers together.
It has resources like the keyboard and the
screen, the disk drives and the memory.
In a multi-tasking system there may be
several programs which need to receive input
or write output simultaneously and thus the
operating system may have to share these
resources between several running
programs.
If the system has two keyboards (or
terminals) connected to it, then the OS can
allocate both to different programs.
If only a single keyboard is connected then
competing programs must wait for the
resources to become free.
Most multi-tasking systems have only a
single central processor unit and yet this is
the most precious resource a computer has.
A multi-tasking operating system must
therefore share cpu-time between programs.
That is, it must work for a time on one
program, then work a while on the next
program, and so on.
If the first program was left unfinished, it must
then return to work more on that, in a
systematic way. The way an OS decides to
share its time between different tasks is
called scheduling.
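This time-slicing idea can be sketched as a toy round-robin scheduler. The names `round_robin` and `quantum`, and the task table, are purely illustrative - a real kernel works with process structures, not a dictionary.

```python
from collections import deque

def round_robin(tasks, quantum):
    """Run each task for at most `quantum` units, cycling until all finish.

    `tasks` maps a task name to its remaining work; returns the order in
    which (task, units_run) slices were executed.
    """
    queue = deque(tasks.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        slice_ = min(quantum, remaining)
        schedule.append((name, slice_))
        if remaining > slice_:
            # Unfinished: go to the back of the queue to be resumed later.
            queue.append((name, remaining - slice_))
    return schedule

print(round_robin({"A": 3, "B": 5}, quantum=2))
# A runs 2, B runs 2, A resumes for its last 1, then B finishes: 2 + 1
```

The scheduler returns systematically to unfinished tasks, which is exactly the behaviour described above.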
Communication, protocols, data types
The exchange of information is an essential
part of computing.
Suppose computer A sends a message to
computer B reporting on the names of all the
users and how long they have been working.
To do this it sends a stream of bits across a
network.
When computer B receives a stream of bits,
it doesn't automatically know what they
mean.
It must decide if the bits represent numbers
or characters, integers or floating point
numbers, or a mixture of all of them.
These different types of data are all stored as
binary information - the only difference
between them is the way one chooses to
interpret them.
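A small illustration of this point, using Python's standard `struct` module: the same four bytes yield entirely different values depending on how one chooses to interpret them (the byte values here are an arbitrary example).

```python
import struct

raw = b"\x42\x28\x00\x00"                 # the same four bytes...
as_int = struct.unpack(">i", raw)[0]      # ...read as a big-endian integer
as_float = struct.unpack(">f", raw)[0]    # ...read as a big-endian float
as_text = raw.decode("latin-1")           # ...or as four characters

print(as_int)     # a large integer
print(as_float)   # a small floating point number
print(as_text)    # 'B', '(' and two NUL characters
```

Nothing in the bits themselves says which interpretation is the right one - that is precisely why a protocol is needed.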
The resolution to this problem is to define a
protocol.
This is a convention or agreement between
the operating systems of two machines on
what messages may contain.
The agreement may say, for instance, that
the first thirty-two bits are four integers which
give the address of the machine which sent
the message.
The next thirty-two bits are a special number
telling the OS which protocol to use in order
to interpret the data.
Communication, protocols, data
types
●
●
●
The OS can then look up this protocol and
discover that the rest of the data are
arranged according to a pattern of
<name><time><name><time>...
where the name is a string of bytes,
terminated by a zero, and the time is a four
byte integer containing the time in hours.
Computer B now knows enough to be able to
extract the information from the stream of
bits.
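A sketch of how computer B might decode such a stream, assuming (as a made-up concrete choice) big-endian four-byte times and ASCII names; the function name `parse_report` is invented for illustration.

```python
import struct

def parse_report(data: bytes):
    """Parse a <name><time><name><time>... stream: each name is a
    zero-terminated string of bytes, each time a four-byte integer (hours)."""
    records, i = [], 0
    while i < len(data):
        end = data.index(0, i)                       # find the terminating zero
        name = data[i:end].decode("ascii")
        (hours,) = struct.unpack(">I", data[end + 1:end + 5])
        records.append((name, hours))
        i = end + 5                                  # skip name, zero and time
    return records

stream = b"alice\x00" + struct.pack(">I", 3) + b"bob\x00" + struct.pack(">I", 7)
print(parse_report(stream))   # [('alice', 3), ('bob', 7)]
```

Both machines must agree on every one of these choices (byte order, field widths, termination) in advance - change any of them on one side only and the parse produces nonsense or fails.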
It is important to understand that all
computers have to agree on the way in which
the data are sent in advance.
If the wrong protocol is assumed, then a string of characters could easily be interpreted as a floating point number - but the result would be nonsense.
Similarly, if computer A had sent the
information incorrectly, computer B might not
be able to read the data and a protocol error
would arise.
More generally, a protocol is an agreed
sequence of behavior which must be
followed.
For example, when passing parameters to
functions in a computer program, there are
rules about how the parameters should be
declared and in which order they are sent.
System overhead
An operating system is itself a computer
program which must be executed.
It therefore requires its own share of a
computer's resources.
This is especially true on multitasking
systems, such as UNIX, where the OS is
running all the time alongside users'
programs.
Since user programs have to wait for the OS
to perform certain services, such as
allocating resources, they are slowed down.
The time spent by the OS servicing user
requests is called the system overhead.
On a multi-user system one would like this
overhead to be kept to a minimum, since
programs which make many requests of the
OS slow not only themselves down, but all
other programs which are queuing up for
resources.
Caching
Caching is a technique used to speed up
communication with slow devices.
Usually the CPU can read data much faster from
memory than it can from a disk or network
connection, so it would like to keep an up-to-date
copy of frequently used information in memory.
The memory area used to do this is called a cache.
You can think of the whole of the primary memory as
being a cache for the secondary memory (disk).
Sometimes caching is used more generally to mean
`keeping a local copy of data for convenience'.
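A minimal sketch of the idea: keep an in-memory copy of blocks read from a slow device. The class `CachedDisk` and its fields are invented for illustration; the dictionary stands in for both the cache and the disk.

```python
class CachedDisk:
    """Keep an in-memory copy of recently read disk blocks (a sketch;
    the `device` dictionary stands in for a slow disk)."""

    def __init__(self, device):
        self.device = device       # maps block number -> data (the "disk")
        self.cache = {}            # in-memory copies of blocks already read
        self.disk_reads = 0        # how many times we had to go to the disk

    def read_block(self, n):
        if n not in self.cache:            # cache miss: go to the slow device
            self.disk_reads += 1
            self.cache[n] = self.device[n]
        return self.cache[n]               # cache hit: served from fast memory

disk = CachedDisk({0: b"boot", 1: b"data"})
disk.read_block(0)
disk.read_block(0)                 # second read of block 0 hits the cache
disk.read_block(1)
print(disk.disk_reads)             # only 2 slow reads for 3 requests
```

A real OS cache must also handle writes and decide which blocks to evict when memory fills up; this sketch shows only the read path.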
Hardware
The CPU
Memory
Devices
Interrupts, traps, exceptions
Interrupts are hardware signals which are
sent to the CPU by the devices it is
connected to.
These signals literally interrupt the CPU from
what it is doing and demand that it spend a
few clock cycles servicing a request.
For example, interrupts may come from the
keyboard because a user pressed a key.
Then the CPU must stop what it is doing and read the keyboard, place the key value into a buffer for later reading, and return to what it was doing.
Other `events' generate interrupts: the
system clock sends interrupts at periodic
intervals, disk devices generate interrupts
when they have finished an I/O task and
interrupts can be used to allow computers to
monitor sensors and detectors.
User programs can also generate `software
interrupts' in order to handle special
situations like a `division by zero' error.
These are often called traps or exceptions on
some systems.
Interrupts are graded in levels.
Low level interrupts have a low priority,
whereas high level interrupts have a high
priority.
A high level interrupt can interrupt a low level
interrupt, so that the CPU must be able to
recover from several `layers' of interruption
and end up doing what it was originally
doing.
Resource management
In order to keep track of how the system
resources are being used, an OS must keep
tables or lists telling it what is free and what is
not.
For example, data cannot be stored neatly on
a disk. As files are deleted, holes appear
and the data become scattered randomly
over the disk surface.
Spooling
Spooling is a way of processing data serially.
Print jobs are spooled to the printer, because
they must be printed in the right order (it
would not help the user if the lines of his/her
file were liberally mixed together with parts of
someone else's file).
During a spooling operation, only one job is
performed at a time and other jobs wait in a
queue to be processed. Spooling is a form of
batch processing.
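The queueing behaviour can be sketched like this; `PrintSpooler` is a toy model invented for illustration, not a real spooler.

```python
from collections import deque

class PrintSpooler:
    """Queue print jobs and process them one at a time, in arrival order,
    so lines from different users are never interleaved (a sketch)."""

    def __init__(self):
        self.queue = deque()

    def spool(self, user, text):
        self.queue.append((user, text))    # job waits in the queue

    def run(self):
        output = []
        while self.queue:                  # one whole job at a time
            user, text = self.queue.popleft()
            output.append(f"--- job for {user} ---")
            output.extend(text.splitlines())
        return output

s = PrintSpooler()
s.spool("alice", "line 1\nline 2")
s.spool("bob", "report")
print("\n".join(s.run()))
```

Each job is printed completely before the next begins - the serial, batch-like processing the text describes.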
System calls
An important task of an operating system is
to provide black-box functions for the most
frequently needed operations, so that users
do not have to waste their time programming
very low level code which is irrelevant to their
purpose.
These ready-made functions comprise
frequently used code and are called system
calls.
For example, controlling devices requires
very careful and complex programming.
Users should not have to write code to
position the head of the disk drive at the right
place just to save a file to the disk.
This is a very basic operation which
everyone requires and thus it becomes the
responsibility of the OS. Another example is
mathematical functions or graphics
primitives.
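In Python, for instance, `os.open` and `os.write` are thin wrappers around the kernel's open and write system calls: saving a file takes a couple of calls, and all the device-level work (positioning the disk head, allocating blocks) is done by the OS. The path here is a throwaway temporary file.

```python
import os
import tempfile

# os.open / os.write / os.close map onto the corresponding system calls;
# the kernel handles every low-level detail of getting the bytes to disk.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"saved by the kernel, not by us\n")
os.close(fd)

with open(path, "rb") as f:
    print(f.read())
```

The user program never touches the disk hardware; it only asks the OS, through these entry points, to do so on its behalf.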
Filesystem
Should the filesystem distinguish between types of files, e.g. executable files, text files, scripts?
If so, how? One way is to use file extensions,
or a naming convention to identify files, like
myprog.exe, SCRIPT.BAT, file.txt.
The problem with this is that the names can
be abused by users.
Protection. If several users will be storing
files together on the same disk, should each
user's files be exclusive to him or her?
Is a mechanism required for sharing files
between several users?
A hierarchical filesystem is a good starting
point for organizing files, but it can be too
restrictive. Sometimes it is useful to have a
file appear in several places at one time.
This can be accomplished with links. A link is
not a copy of a file, but a pointer to where a
file really is.
By making links to other places in a
hierarchical filesystem, its flexibility is
increased considerably.
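On UNIX-like systems this can be tried directly: `os.symlink` creates a symbolic link, a small pointer to where the file really is (hard links, via `os.link`, are the other variant). The file names here are invented.

```python
import os
import tempfile

d = tempfile.mkdtemp()
target = os.path.join(d, "report.txt")
with open(target, "w") as f:
    f.write("the real file")

link = os.path.join(d, "shortcut.txt")
os.symlink(target, link)            # a pointer to the file, not a copy

with open(link) as f:               # reading through the link reads the target
    print(f.read())

os.remove(link)                     # removing the link leaves the file intact
print(os.path.exists(target))       # the real file is still there
```

The same file can thus appear in several places in the hierarchy while existing only once on disk.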
Single-task OS
Memory map and registers
Roughly speaking, at the hardware level a
computer consists of a CPU, memory and a
number of peripheral devices.
The CPU contains registers or `internal
variables' which control its operation.
The CPU can store information only in the
memory it can address and in the registers of
other microprocessors it is connected to.
The CPU reads machine code instructions, one
at a time, from the memory and executes them
forever without stopping.
The memory, as seen by the CPU, is a large
string of bytes starting at address zero and
increasing up to the maximum address.
Physically it is made up, like a jigsaw puzzle, of many memory chips and control chips, mapped together into the layout shown in the diagram.
Normally, because of the hardware design of the
CPU, not all of the memory is available to the
user of the machine. Some of it is required for
the operation of the CPU.
Memory map and registers - the roughly distinguished areas are:
Zero page: The first `page' of the memory is
often reserved for a special purpose. It is often
faster to write to the zero page because you
don't have to code the leading zero for the
address - special instructions for the zero page
can leave the `zero' implicit.
Stack: Every CPU needs a stack for executing
subroutines. The stack is explained in more
detail below.
Screen memory: What you see on the screen of
a computer is the image of an area of memory,
converted into colours and positions by a
hardware video-controller. The screen memory is
the area of memory needed to define the colour
of every `point' or `unit' on the screen.
Depending on what kind of visual system a
computer uses, this might be one byte per
character and it might be four bytes per pixel!
Memory mapped I/O: Hardware devices like
disks and video controllers contain smaller
microprocessors of their own. The CPU gives
them instructions by placing numbers into their
registers. To make this process simpler, these
device registers (only a few bytes per device,
perhaps) are `wired' into the main memory map,
so that writing to the device is the same as
writing to the rest of the memory.
Operating system: The operating system itself is a large program which often takes up a large part of the available memory.
Stack
A stack is a so-called last-in first-out (LIFO) data
structure.
That is to say - the last thing to be placed on top
of a stack, when making it, is the first item which
gets removed when un-making it.
Stacks are used by the CPU to store the current
position within a program before jumping to
subroutines, so that they remember where to
return to after the subroutine is finished.
Because of the nature of the stack, the CPU can
simply deposit the address of the next instruction
to be executed (after the subroutine is finished)
on top of the stack.
When the subroutine is finished, the CPU pulls
the first address it finds off the top of the stack
and jumps to that location.
Notice that the stack mechanism will continue to
work even if the subroutine itself calls another
subroutine, since the second subroutine causes
another stack frame to be saved on the top of
the stack.
When that is finished, it returns to the first
subroutine and then to the original program in
the correct order.
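The mechanism above can be modelled with an explicit stack of return addresses. The `jsr`/`rts` names echo classic CPU mnemonics (jump to subroutine / return from subroutine), and the string "addresses" are purely illustrative.

```python
stack = []   # the hardware stack: return addresses, last-in first-out
trace = []   # a record of what happened, for inspection

def jsr(return_addr, subroutine):
    """Jump to subroutine: push where to come back to, then jump."""
    stack.append(return_addr)
    subroutine()

def rts():
    """Return from subroutine: pop the most recent return address."""
    trace.append(("return to", stack.pop()))

def inner():
    trace.append("inner")
    rts()

def outer():
    trace.append("outer")
    jsr("outer+1", inner)   # nested call: a second frame goes on the stack
    rts()

jsr("main+1", outer)
print(trace)
```

Because the stack is LIFO, the nested call returns to `outer` first and only then to the original program, in exactly the order the text describes.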
Input/Output
Input arrives at the computer at unpredictable
intervals. The system must be able to detect its
arrival and respond to it.
Interrupts
Interrupts are hardware triggered signals which
cause the CPU to stop what it is doing and jump
to a special subroutine.
Interrupts normally arrive from hardware devices,
such as when the user presses a key on the
keyboard, or the disk device has fetched some
data from the disk.
They can also be generated in software by errors
like division by zero or illegal memory address.
When the CPU receives an interrupt, it saves the
contents of its registers on the hardware stack
and jumps to a special routine which will
determine the cause of the interrupt and respond
to it appropriately.
Interrupts occur at different levels. Low level
interrupts can be interrupted by high level
interrupts. Interrupt handling routines have to
work quickly, or the computer will be drowned in
the business of servicing interrupts.
There is no logical difference between what
happens during the execution of an interrupt
routine and a subroutine.
The difference is that interrupt routines are
triggered by events, whereas software
subroutines follow a prearranged plan.
An important area is the interrupt vector. This is
a region of memory reserved by the hardware for
servicing of interrupts.
Each interrupt has a number from zero to the
maximum number of interrupts supported on the
CPU; for each interrupt, the interrupt vector must
be programmed with the address of a routine
which is to be executed when the interrupt
occurs. i.e. when an interrupt occurs, the system
examines the address in the interrupt vector for
that interrupt and jumps to that location.
The routine exits when it meets an RTI (return
from interrupt) instruction.
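The dispatch idea can be sketched as a table of handlers. This is a software model only: a real interrupt vector holds machine addresses in memory reserved by the hardware, not Python functions, and the interrupt numbers here are arbitrary.

```python
interrupt_vector = {}   # interrupt number -> installed handler routine
log = []

def install(irq, handler):
    """Program the vector entry for interrupt `irq` with a routine."""
    interrupt_vector[irq] = handler

def raise_interrupt(irq):
    """When an interrupt occurs, jump to whatever routine is installed."""
    interrupt_vector[irq]()

install(0, lambda: log.append("clock tick"))
install(1, lambda: log.append("key pressed"))

raise_interrupt(1)
raise_interrupt(0)
print(log)   # ['key pressed', 'clock tick']
```

Changing a vector entry changes what runs on the next interrupt - which is precisely why, as discussed later, user programs must not be allowed to write to this region.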
Buffers
The CPU and the devices attached to it do
not work at the same speed.
Buffers are therefore needed to store
incoming or outgoing information
temporarily, while it is waiting to be picked up
by the other party.
A buffer is simply an area of memory which
works as a waiting area. It is a first-in first-out
(FIFO) data structure or queue.
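A keyboard buffer makes a concrete example of such a queue; the function names `on_key_interrupt` and `read_key` are invented for the sketch.

```python
from collections import deque

buffer = deque()   # the waiting area between device and program

def on_key_interrupt(key):
    """Producer: the interrupt handler deposits the key as it arrives."""
    buffer.append(key)

def read_key():
    """Consumer: the program picks keys up later, in arrival order (FIFO)."""
    return buffer.popleft()

for key in "hi!":
    on_key_interrupt(key)      # keys arrive faster than the program reads

print(read_key(), read_key(), read_key())   # h i !
```

The fast party never has to wait for the slow one: it just deposits data in the buffer and carries on.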
Synchronous and asynchronous I/O
To start an I/O operation, the CPU writes
appropriate values into the registers of the
device controller.
The device controller acts on the values it
finds in its registers. For example, if the
operation is to read from a disk, the device
controller fetches data from the disk and
places it in its local buffer.
It then signals the CPU by generating an
interrupt.
While the CPU is waiting for the I/O to
complete it may do one of two things.
It can do nothing or idle until the device
returns with the data (synchronous I/O), or it
can continue doing something else until the
completion interrupt arrives (asynchronous
I/O).
The second of these possibilities is clearly
much more efficient.
DMA - Direct Memory Access
Very high speed devices could place heavy
demands on the CPU for I/O servicing if they
relied on the CPU to copy data word by word.
The DMA controller is a device which copies
blocks of data at a time from one place to the
other, without the intervention of the CPU.
To use it, its registers must be loaded with
the information about what it should copy
and where it should copy to.
Once this is done, it generates an interrupt to
signal the completion of the task.
The advantage of the DMA is that it transfers
large amounts of data before generating an
interrupt.
Without it, the CPU would have to copy the
data one register-full at a time, using up
hundreds or even thousands of interrupts
and possibly bringing a halt to the machine!
Multi-tasking and multi-user OS
To make a multi-tasking OS we need, loosely speaking,
to reproduce all of the features discussed in
the last chapter for each task or process
which runs.
It is not necessary for each task to have its
own set of devices.
The basic hardware resources of the system
are shared between the tasks.
The operating system must therefore have a
`manager' which shares resources at all
times.
This manager is called the `kernel' and it
constitutes the main difference between
single and multitasking operating systems.
Users – authentication
If a system supports several users, then each
user must have his or her own place on the
system disk, where files can be stored.
Since each user's files may be private, the
file system should record the owner of each
file.
For this to be possible, all users must have a
user identity or login name and must supply a
password which prevents others from
impersonating them.
Privileges and security
On a multi-user system it is important that
one user should not be able to interfere with
another user's activities, either purposefully
or accidentally.
Certain commands and system calls are
therefore not available to normal users
directly.
The super-user is a privileged user (normally
the system operator) who has permission to
do anything, but normal users have
restrictions placed on them in the interest of
system safety.
For example: normal users should never be
able to halt the system; nor should they be
able to control the devices connected to the
computer, or write directly into memory
without making a formal request of the OS.
One of the tasks of the OS is to prevent
collisions between users.
Tasks - two-mode operation
It is crucial for the security of the system that
different tasks, working side by side, should not
be allowed to interfere with one another
(although this occasionally happens in
microcomputer operating systems, like the
Macintosh, which allow several programs to be
resident in memory simultaneously).
Protection mechanisms are needed to deal with
this problem. The way this is normally done is to
make the operating system all-powerful and
allow no user to access the system resources
without going via the OS.
To prevent users from tricking the OS, multi-user systems are based on hardware which supports two-mode operation: privileged mode for executing OS instructions and user mode for working on user programs.
When running in user mode a task has no special
privileges and must ask the OS for resources through
system calls.
When I/O or resource management is performed, the
OS takes over and switches to privileged mode. The
OS switches between these modes personally, so
provided it starts off in control of the system, it will
always remain in control.
At boot-time, the system starts in privileged
mode.
During user execution, it is switched to user
mode.
When interrupts occur, the OS takes over
and it is switched back to privileged mode.
Other names for privileged mode are monitor
mode or supervisor mode.
I/O and Memory protection
To prevent users from gaining control of devices, by
tricking the OS, a mechanism is required to prevent
them from writing to an arbitrary address in the
memory.
For example, if the user could modify the OS
program, then it would clearly be possible to gain
control of the entire system in privileged mode.
All a user would have to do would be to change the
addresses in the interrupt vector to point to a routine
of their own making.
This routine would then be executed when an
interrupt was received in privileged mode.
The solution to this problem is to let the OS define a
segment of memory for each user process and to
check, when running in user mode, every address
that the user program refers to.
If the user attempts to read or write outside this
allowed segment, a segmentation fault is generated
and control returns to the OS.
This checking is normally hard-wired into the
hardware of the computer so that it cannot be
switched off.
No checking is required in privileged mode.
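The base/limit check can be modelled in software as follows. This is only a sketch of what the hardware does on every memory reference; the addresses and names are arbitrary.

```python
class SegmentationFault(Exception):
    pass

def check_access(addr, base, limit, privileged=False):
    """Hardware-style bounds check (a sketch): in user mode every address
    must fall inside the process's segment [base, base + limit)."""
    if privileged:
        return True                 # no checking in privileged mode
    if not base <= addr < base + limit:
        raise SegmentationFault(hex(addr))   # fault: control returns to the OS
    return True

# An address inside the allowed segment is fine in user mode...
assert check_access(0x5000, base=0x4000, limit=0x2000)

# ...an address outside it triggers a segmentation fault:
try:
    check_access(0x0010, base=0x4000, limit=0x2000)
except SegmentationFault as fault:
    print("segmentation fault at", fault)
```

Because the check is wired into the hardware and skipped only in privileged mode, a user program has no way to reach the interrupt vector or the OS code.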
Time sharing
There is always the problem in a multitasking system that a user program will go
into an infinite loop, so that control never
returns to the OS and the whole system
stops.
We have to make sure that the OS always
remains in control by some method.
Here are two possibilities:
The operating system fetches each instruction from the
user program and executes it personally, never giving it
directly to the CPU.
The OS software switches between different processes by
fetching the instructions it decides to execute.
This is a kind of software emulation. This method works,
but it is extremely inefficient because the OS and the user
program are always running together.
The full speed of the CPU is not realized. This method is
often used to make simulators and debuggers.
A more common method is to switch off the OS while the
user program is executing and switch off the user process
while the OS is executing.
The switching is achieved by hardware rather than
software, as follows. When handing control to a user
program, the OS uses a hardware timer to ensure that
control will return after a certain time.
The OS loads a fixed time interval into the timer's control
registers and gives control to the user process. The timer
then counts down to zero and when it reaches zero it
generates a non-maskable interrupt, whereupon control returns to the OS.
Memory map
We can represent a multi-tasking system
schematically.
Clearly the memory map of a computer does
not look like this figure.
It looks like the figures in the previous
chapter, so the OS has to simulate this
behaviour using software.
The point of this diagram is only that it shows
the elements required by each process
executing on the system.
Each program must have a memory area to
work in and a stack to keep track of
subroutine calls and local variables.
Each program must have its own input/output
sources.
These cannot be the actual resources of the
system: instead, each program has a virtual
I/O stream.
The operating system arranges things so that the
virtual I/O looks, to the user program, as though it is
just normal I/O.
In reality, the OS controls all the I/O itself and
arranges the sharing of resources transparently. The
virtual output stream for a program might be a
window on the real screen, for instance.
The virtual printer is really a print-queue. The
keyboard is only `connected' to one task at a time,
but the OS can share this too.
For example, in a window environment, this happens
when a user clicks in a particular window.
Kernel and shells - layers of software
So far we have talked about the OS almost
as though it were a living thing.
In a multitasking, multi-user OS like UNIX
this is not a bad approximation to the truth! In
what follows we make use of UNIX
terminology and all of the examples we shall
cover later will refer to versions of the UNIX
operating system.
The part of the OS which handles all of the
details of sharing and device handling is
called the kernel or core.
The kernel is not something which can be
used directly, although its services can be
accessed through system calls.
What is needed is a user interface or command line
interface (CLI) which allows users to log onto the
machine and manipulate files, compile programs and
execute them using simple commands.
Since this is a layer of software which wraps the
kernel in more acceptable clothes, it is called a shell
around the kernel.
It is only by making layers of software, in a hierarchy,
that very complex programs can be written and
maintained.
The idea of layers and hierarchies returns again and
again.
Services: daemons
The UNIX kernel is a very large program, but it does
not perform all of the services required in an OS.
To keep the size of the kernel to a minimum, it only
deals with the sharing of resources.
Other jobs for the operating system (which we can call services) are implemented by writing programs which run alongside users' programs.
Indeed, they are just `user programs' - the only difference is that they are owned by the system. These
programs are called daemons.
Here are some examples from UNIX.
mountd: Deals with requests for `mounting' this
machine's disks on other machines - i.e.
requests to access the disk on this machine from
another machine on the network.
rlogind: Handles requests to login from remote
terminals.
keyserv: A server which stores public and private
keys. Part of a network security system.
named: Converts machine names into their
network addresses and vice versa.
Multiprocessors – parallelism
The idea of constructing computers with
more than one CPU has become more
popular recently.
On a system with several CPUs it is not just
an illusion that several tasks can be
performed simultaneously - it is a reality.
This introduces a number of complications in
OS design.
For example - how can we stop two
independent processors from altering some
memory location which they both share
simultaneously (so that neither of them can
detect the collision)?
This is a problem in process synchronization.
The solution to this problem is much simpler
in a single CPU system since no two things
ever happen truly simultaneously.
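The problem can be sketched in Python. This is an illustrative example, not a kernel mechanism: two threads update one shared counter, and a lock serializes the read-modify-write step so that no update is lost.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    """Increment the shared counter n times, one atomic step at a time."""
    global counter
    for _ in range(n):
        with lock:          # serialize the read-modify-write sequence
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 400000 - no update is lost
```

Without the lock, two threads could read the same old value of `counter` and each write back the same new value, losing one increment - exactly the undetected collision described above.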
Processes and Threads
●
Multitasking and multi-user systems need to
distinguish between the different programs
being executed by the system. This is
accomplished with the concept of a process.
Naming conventions
●
●
Before talking about process management
we shall introduce some of the names which
are in common use.
Not all operating systems or books agree on
the definitions of these names.
Naming conventions
●
Process:
–
●
This is a general term for a program which is
being executed. All work done by the CPU
contributes to the execution of processes. Each
process has a descriptive information structure
associated with it (normally held by the kernel)
called a process control block which keeps track
of how far the execution has progressed and
what resources the process holds.
Task:
–
On some systems processes are called tasks.
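The process control block described above can be pictured as a small record held by the kernel. The sketch below is illustrative only - the field names are representative, not taken from any real kernel.

```python
from dataclasses import dataclass, field

@dataclass
class ProcessControlBlock:
    """Illustrative PCB: the kernel's record tracking one process."""
    pid: int                                        # process identifier
    state: str = "New"                              # New, Ready, Running, ...
    program_counter: int = 0                        # how far execution has progressed
    registers: dict = field(default_factory=dict)   # saved CPU context
    open_files: list = field(default_factory=list)  # resources the process holds
    priority: int = 0

pcb = ProcessControlBlock(pid=42)
pcb.state = "Ready"
print(pcb.pid, pcb.state)   # 42 Ready
```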
Naming conventions
●
Job:
–
●
Some systems distinguish between batch
execution and interactive execution. Batch (or
queued) processes are often called jobs. They
are like production line processes which start, do
something and quit, without stopping to ask for
input from a user. They are non-interactive
processes.
CPU burst:
–
A period of uninterrupted CPU activity.
●
I/O burst:
–
A period of uninterrupted I/O activity.
Naming conventions
●
Thread:
–
A thread (sometimes called a lightweight process) is different from a
process or task in that a thread alone is not enough to get a whole
program executed. A thread is a kind of stripped-down process - just
one `active hand' in a program, something which the CPU is doing on
behalf of a program, but not enough to be called a complete process.
Threads remember what they have done separately, but they share the
information about what resources a program is using and what state
the program is in. A thread is only a CPU assignment. Several
threads can contribute to a single task. When this happens, the
information about one process or task is used by many threads. Each
task must have at least one thread in order to do any work.
Scheduling
●
●
On most multitasking systems, only one
process can truly be active at a time - the
system must therefore share its time
between the execution of many processes.
This sharing is called scheduling.
(Scheduling is time management.)
Scheduling
●
●
●
●
Different methods of scheduling are appropriate for
different kinds of execution.
A queue is one form of scheduling in which each
program waits its turn and is executed serially.
This is not very useful for handling multitasking, but it
is necessary for scheduling devices which cannot be
shared by nature.
An example of the latter is the printer. Each print job
has to be completed before the next one can begin,
otherwise all the print jobs would be mixed up and
interleaved resulting in nonsense.
Scheduling
●
We shall make a broad distinction between
two types of scheduling:
–
Queueing. This is appropriate for serial or batch
jobs like print spooling and requests from a
server. There are two main ways of giving priority
to the jobs in a queue. One is a first-come first-served (FCFS) basis, also referred to as first-in
first-out (FIFO); the other is to process the
shortest job first (SJF).
Scheduling
●
We shall make a broad distinction between
two types of scheduling:
–
Round-robin. This is the time-sharing approach
in which several tasks can coexist. The
scheduler gives a short time-slice to each job,
before moving on to the next job, polling each
task round and round. This way, all the tasks
advance, little by little, on a controlled basis.
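The round-robin idea can be sketched as a small simulation. This is a toy model, not a real scheduler: each entry in the queue holds a process name and its remaining CPU demand, and the scheduler hands out one time-slice at a time.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin scheduling.

    bursts: {name: remaining CPU time needed}; quantum: the time-slice.
    Returns the order in which the CPU was handed out.
    """
    queue = deque(bursts.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        schedule.append(name)                # this job gets one time-slice
        remaining -= quantum
        if remaining > 0:                    # not finished: back of the queue
            queue.append((name, remaining))
    return schedule

order = round_robin({"P1": 3, "P2": 1, "P3": 2}, quantum=1)
print(order)   # ['P1', 'P2', 'P3', 'P1', 'P3', 'P1']
```

All three tasks advance a little on every pass round the queue, rather than each waiting for its predecessor to finish completely.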
Scheduling
●
These two categories are also referred to as non-preemptive and preemptive respectively, but there is
a grey area.
–
Strictly non-preemptive: Each program continues
executing until it has finished, or until it must wait for an
event (e.g. I/O or another task). This is like Windows 95
and Macintosh System 7.
–
Strictly preemptive: The system decides how time is to be
shared between the tasks, and interrupts each process
after its time-slice whether it likes it or not. It then
executes another program for a fixed time, stops it, moves
on to the next, and so on.
Scheduling
●
These two categories are also referred to as
non-preemptive and preemptive respectively,
but there is a grey area.
–
Politely-preemptive?? The system decides how
time is to be shared, but it will not interrupt a
program if it is in a critical section. Certain
sections of a program may be so important that
they must be allowed to execute from start to
finish without being interrupted. This is like UNIX
and Windows NT.
Scheduling
●
To choose an algorithm for scheduling tasks
we have to understand what it is we are
trying to achieve, i.e. what are the criteria
for scheduling?
Scheduling
●
●
●
We want to maximize the efficiency of the machine.
i.e. we would like all the resources of the machine to
be doing useful work all of the time - i.e. not be idling
during one process, when another process could be
using them.
The key to organizing the resources is to get the CPU
time-sharing right, since this is the central `organ' in
any computer, through which almost everything must
happen.
But this cannot be achieved without also thinking
about how the I/O devices must be shared, since the
I/O devices communicate by interrupting the CPU
from what it is doing.
Scheduling
●
●
We would like as many jobs to get finished
as quickly as possible.
Interactive users get irritated if the
performance of the machine seems slow. We
would like the machine to appear fast for
interactive users - or have a fast response
time.
Scheduling
●
●
Some of these criteria cannot be met
simultaneously and we must make
compromises.
In particular, what is good for batch jobs is
often not good for interactive processes and
vice versa, as we remark under `Run levels - priority' below.
Scheduling hierarchy
●
●
Complex scheduling algorithms distinguish
between short-term and long-term
scheduling.
This helps to deal with tasks which fall into
two kinds: those which are active
continuously and must therefore be serviced
regularly, and those which sleep for long
periods.
Scheduling hierarchy
●
●
For example, in UNIX the long term
scheduler moves processes which have
been sleeping for more than a certain time
out of memory and onto disk, to make space
for those which are active.
Sleeping jobs are moved back into memory
only when they wake up (for whatever
reason). This is called swapping.
Scheduling hierarchy
●
●
The most complex systems have several
levels of scheduling and exercise different
scheduling polices for processes with
different priorities.
Jobs can even move from level to level if the
circumstances change.
Scheduling hierarchy
Run levels – priority
●
●
Rather than giving all programs equal shares
of CPU time, most systems have priorities.
Processes with higher priorities are either
serviced more often than processes with
lower priorities, or they get longer time-slices
of the CPU.
Run levels – priority
●
●
●
Priorities are not normally fixed but vary
according to the performance of the system
and the amount of CPU time a process has
already used up in the recent past.
For example, processes which have used a
lot of CPU time in the recent past often have
their priority reduced.
This tends to favour interactive processes
which wait often for I/O and makes the
response time of the system seem faster for
interactive users.
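This feedback rule can be sketched numerically. The formula and constants below are illustrative (loosely in the spirit of classic UNIX schedulers), not those of any particular kernel: a higher number means a lower scheduling priority.

```python
def recompute_priority(base, recent_cpu, nice=0):
    """Illustrative priority recalculation: recent CPU use and a
    positive `nice' value both push a process down the queue."""
    return base + recent_cpu // 2 + nice

def decay(recent_cpu):
    """Halve the recorded CPU usage each interval, so old activity
    counts less than recent activity."""
    return recent_cpu // 2

cpu_hog, interactive = 60, 4
print(recompute_priority(20, cpu_hog))      # 50: pushed to the back
print(recompute_priority(20, interactive))  # 22: stays responsive
cpu_hog = decay(cpu_hog)                    # ageing: 60 -> 30
print(recompute_priority(20, cpu_hog))      # 35: the hog slowly recovers
```

The decay step is what prevents a long-running job's priority from sinking forever, which would otherwise lead to the starvation problem discussed below.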
Run levels – priority
●
●
●
●
In addition, processes may be reduced in priority if
their total accumulated CPU usage becomes very
large.
The wisdom of this approach is arguable, since
programs which take a long time to complete tend to
be penalized.
Indeed, they take much longer to complete because
their priority is reduced. If the priority continued to be
lowered, long jobs would never get finished.
This is called process starvation and must be
avoided.
Run levels – priority
●
●
●
Scheduling algorithms have to work without
knowing how long processes will take.
Often the best judge of how demanding a
program will be is the user who started the
program. UNIX allows users to reduce the
priority of a program themselves using the
nice command. `Nice' users are supposed to
sacrifice their own self-interest for the good
of others.
Only the system manager can increase the
priority of a process.
Run levels – priority
●
●
Another possibility which is often not
considered, is that of increasing the priority of
resource-gobbling programs in order to get
them out of the way as fast as possible.
This is very difficult for an algorithm to judge,
so it must be done manually by the system
administrator.
Context switching
●
●
●
Switching from one running process to
another running process incurs a cost to the
system.
The values of all the registers must be saved
in the present state, the status of all open
files must be recorded and the present
position in the program must be recorded.
Then the contents of the MMU must be
stored for the process.
Context switching
●
●
Then all those things must be read in for the
next process, so that the state of the system
is exactly as it was when the scheduler last
interrupted the process.
This is called a context switch. Context
switching is a system overhead. It costs real
time and CPU cycles, so we don't want to
context switch too often, or a lot of time will
be wasted.
Context switching
●
The state of each process is saved to a data
structure in the kernel called a process
control block (PCB).
Interprocess communication
●
●
One of the benefits of multitasking is that
several processes can be made to cooperate
in order to achieve their ends.
To do this, they must do one of the following.
Interprocess communication
●
Communicate.
–
Interprocess communication (IPC) involves
sending information from one process to another.
–
This can be achieved using a `mailbox' system,
a socket (Berkeley) which behaves like a virtual
communications network (loopback), or through
the use of `pipes'.
–
Pipes are a system construction which enables
one process to open another process as if it
were a file for writing or reading.
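On UNIX, a pipe really is used through the ordinary file interface. A minimal sketch: the parent creates a pipe, forks a child, and the child writes a message which the parent reads back as if from a file.

```python
import os

# Create a pipe: r is the read end, w is the write end.
r, w = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: writes a message into the pipe, then exits.
    os.close(r)
    os.write(w, b"hello from the child")
    os.close(w)
    os._exit(0)
else:
    # Parent: reads what the child wrote, using an ordinary read.
    os.close(w)
    data = os.read(r, 1024)
    os.close(r)
    os.waitpid(pid, 0)
    print(data.decode())   # hello from the child
```

Closing the unused ends in each process is important in real code: a reader only sees end-of-file once every write end of the pipe has been closed.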
Interprocess communication
●
Share data
–
●
A segment of memory must be available to both
processes. (Most memory is locked to a single
process).
Waiting.
–
Some processes wait for other processes to give
a signal before continuing. This is an issue of
synchronization.
Interprocess communication
●
●
●
As soon as we open the door to co-operation
there is a problem of how to synchronize
cooperating processes. For example,
suppose two processes modify the same file.
If both processes tried to write
simultaneously the result would be a
nonsensical mixture.
We must have a way of synchronizing
processes, so that even concurrent
processes must stand in line to access
shared data serially.
Interprocess communication
●
Synchronization is a tricky problem in
multiprocessor systems, but it can be
achieved with the help of critical sections and
semaphores/ locks.
Creating processes
●
●
●
The creation of a process requires the following
steps.
Name. The name of the program which is to run as
the new process must be known.
Process ID and Process Control Block. The system
creates a new process control block, or locates an
unused block in an array. This block is used to follow
the execution of the program through its course,
keeping track of its resources and priority. Each
process control block is labelled by its PID or
process identifier.
Creating processes
●
●
●
●
Locate the program to be executed on disk and
allocate memory for the code segment in RAM.
Load the program into the code segment and
initialize the registers of the PCB with the start
address of the program and appropriate starting
values for resources.
Priority. A priority must be computed for the
process, using a default for the type of process
and any value which the user specified as a
`nice' value
Schedule the process for execution.
Process hierarchy: children and
parent processes
●
●
In a democratic system anyone can choose
to start a new process, but it is never users
who create processes - it is always other processes!
That is because anyone using the system
must already be running a shell or command
interpreter in order to be able to talk to the
system, and the command interpreter is itself
a process.
Process hierarchy: children and
parent processes
●
When a user creates a process using the
command interpreter, the new process
becomes a child of the command interpreter.
Similarly the command interpreter process
becomes the parent for the child. Processes
therefore form a hierarchy.
Process hierarchy: children and
parent processes
Process hierarchy: children and
parent processes
●
●
The processes are linked by a tree structure.
If a parent is signalled or killed, usually all its
children receive the same signal or are
destroyed with the parent.
This doesn't have to be the case--it is
possible to detach children from their
parents--but in many cases it is useful for
processes to be linked in this way.
Process hierarchy: children and
parent processes
●
●
When a child is created it may do one of two
things.
–
Duplicate the parent process.
–
Load a completely new program.
Similarly the parent may do one of two
things.
–
Continue executing alongside its children.
–
Wait for some or all of its children to finish before
proceeding.
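On UNIX these choices map directly onto the fork/exec/wait calls. A minimal sketch: the parent duplicates itself with fork and waits; the child here simply exits with a status code, though it could equally call one of the os.exec* functions to load a new program.

```python
import os

pid = os.fork()              # duplicate the calling process

if pid == 0:
    # Child: may either continue as a copy of the parent, or
    # load a completely new program with one of the os.exec* calls.
    os._exit(7)              # exit with a status the parent can collect
else:
    # Parent: wait for the child to finish before proceeding.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)
    print("child", pid, "exited with", exit_code)   # e.g. child 1234 exited with 7
```

A parent that does not want to wait can skip waitpid and continue alongside its child, which is exactly the other branch of the choice above.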
Process states
●
●
In order to know when to execute a program
and when not to execute a program, it is
convenient for the scheduler to label
programs with a `state' variable.
This is just an integer value which saves the
scheduler time in deciding what to do with a
process.
Process states
●
Broadly speaking the state of a process may
be one of the following.
–
New.
–
Ready (in line to be executed).
–
Running (active).
–
Waiting (sleeping, suspended)
–
Terminated (defunct)
Process states
●
●
When time-sharing, the scheduler only needs
to consider the processes which are in the
`ready' state.
Changes of state are made by the system
and follow the pattern in the diagram below.
Process states
The transitions between different states normally
happen on interrupts.

From state   Event                           To state
New          Accepted                        Ready
Ready        Scheduled / Dispatch            Running
Running      Need I/O                        Waiting
Running      Scheduler timeout               Ready
Running      Completion / Error / Killed     Terminated
Waiting      I/O completed or wakeup event   Ready
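Because the state is just a label with a fixed set of legal transitions, it can be modelled as a tiny state machine. The sketch below encodes the transition table from the text and refuses any move the table does not allow.

```python
# Legal state changes, following the transition table in the text.
TRANSITIONS = {
    ("New", "Ready"): "Accepted",
    ("Ready", "Running"): "Scheduled / Dispatch",
    ("Running", "Waiting"): "Need I/O",
    ("Running", "Ready"): "Scheduler timeout",
    ("Running", "Terminated"): "Completion / Error / Killed",
    ("Waiting", "Ready"): "I/O completed or wakeup event",
}

def step(state, new_state):
    """Move a process to new_state, refusing illegal transitions."""
    if (state, new_state) not in TRANSITIONS:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state

# A typical life cycle: created, scheduled, blocked on I/O, resumed, finished.
s = "New"
for nxt in ["Ready", "Running", "Waiting", "Ready", "Running", "Terminated"]:
    s = step(s, nxt)
print(s)   # Terminated
```

Note that there is no ("New", "Running") entry: a process cannot be dispatched before it has been accepted into the ready queue.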
Queue scheduling
●
●
●
The basis of all scheduling is the queue
structure.
A round-robin scheduler uses a queue but
moves cyclically through the queue at its own
speed, instead of waiting for each task in the
queue to complete.
Queue scheduling is primarily used for serial
execution.
Queue scheduling
●
There are two main types of queue.
–
First-come first-served (FCFS), also called first-in
first-out (FIFO).
–
Sorted queue, in which the elements are
regularly ordered according to some rule. The
most prevalent example of this is the shortest job
first (SJF) rule.
Queue scheduling
●
●
●
The FCFS queue is the simplest and incurs
almost no system overhead.
The SJF scheme can cost quite a lot in
system overhead, since each task in the
queue must be evaluated to determine which
is shortest.
The SJF strategy is often used for print
schedulers since it is quite inexpensive to
determine the size of a file to be printed (the
file size is usually stored in the file itself).
Queue scheduling
●
The efficiency of the two schemes is
subjective: long jobs have to wait longer if
short jobs are moved in front of them, but if
the distribution of jobs is random then we can
show that the average waiting time of any
one job is shorter in the SJF scheme,
because the greatest number of jobs will
always be executed in the shortest possible
time.
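The claim about average waiting time is easy to check with a small calculation. For serial execution, a job's waiting time is the total length of everything scheduled before it; sorting the queue by burst length gives the SJF order.

```python
def average_wait(jobs):
    """Mean waiting time when jobs run serially in the given order."""
    wait, elapsed = 0, 0
    for burst in jobs:
        wait += elapsed        # this job waited for everything before it
        elapsed += burst
    return wait / len(jobs)

jobs = [8, 1, 2, 1]                    # burst lengths, in arrival order
fcfs = average_wait(jobs)              # first-come first-served
sjf = average_wait(sorted(jobs))       # shortest job first

print(fcfs, sjf)   # 7.0 1.75
```

Moving the long job to the back cuts the average dramatically, because three short jobs no longer wait behind it - though, as the next slide points out, the long job itself waits longer.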
Queue scheduling
●
●
●
Of course this argument is rather stupid,
since it is only the system which cares about
the average waiting time per job, for its own
prestige.
Users who print only long jobs do not share
the same clinical viewpoint.
Moreover, if only short jobs arrive after one
long job, it is possible that the long job will
never get printed. This is an example of
starvation. A fairer solution is required.
Queue scheduling
●
●
●
Queue scheduling can be used for CPU
scheduling, but it is quite inefficient.
To understand why simple queue scheduling
is not desirable we can begin by looking at a
diagram which shows how the CPU and the
devices are being used when a FCFS queue
is used. We label each process P1, P2,
etc.
A blank space indicates that the CPU or I/O
devices are in an idle state (waiting for a
customer).
Queue scheduling
●
●
●
There are many blank spaces in the diagram,
where the devices and the CPU are idle. Why, for
example, couldn't the device be searching for the
I/O for P2 while the CPU was busy with P1 and
vice versa?
We can improve the picture by introducing a new
rule: every time one process needs to wait for a
device, it gets put to the back of the queue.
Now consider the following diagram, in which we
have three processes. They will always be
scheduled in order P1, P2, P3 until one or all of
them is finished.
Queue scheduling
[Diagram: interleaved CPU and device timelines for P1, P2 and P3;
`(F)' marks the point where a process finishes and `-' marks idle time.]
●
●
●
P1 starts out as before with a CPU burst.
But now when it occupies the device, P2 takes over
the CPU.
Similarly when P2 has to wait for the device to
complete its I/O, P3 gets executed, and when P3
has to wait, P1 takes over again.
Queue scheduling
●
●
Now suppose P2 finishes: P3 takes over, since it is
next in the queue, but now the device is idle, because
P2 did not need to use the device.
Also, when P1 finishes, only P3 is left, and the gaps of
idle time get bigger.
Queue scheduling
●
●
In the beginning, this second scheme looked pretty
good - both the CPU and the devices were busy most
of the time (few gaps in the diagram).
As processes finished, the efficiency got worse, but
on a real system, someone will always be starting
new processes so this might not be a problem.
Round-robin scheduling
●
●
The use of the I/O - CPU burst cycle to requeue jobs
improves the resource utilization considerably, but it
does not prevent certain jobs from hogging the CPU.
Indeed, if one process went into an infinite loop, the
whole system would stop dead. Also, it does not
provide any easy way of giving some processes
priority over others.
Round-robin scheduling
●
A better solution is to ration the CPU time, by
introducing time-slices. This means that
–
no process can hold onto the CPU forever,
–
processes which get requeued often (because they spend
a lot of time waiting for devices) come around faster, i.e.
we don't have to wait for CPU intensive processes, and
–
the length of the time-slices can be varied so as to give
priority to particular processes.
Round-robin scheduling
●
●
●
●
The time-sharing is implemented by a hardware
timer.
On each context switch, the system loads the timer
with the duration of its time-slice and hands control
over to the new process.
When the timer times-out, it interrupts the CPU which
then steps in and switches to the next process.
The basic queue is the FCFS/FIFO queue. New
processes are added to the end, as are processes
which are waiting.
Round-robin scheduling
●
●
●
●
The success or failure of round-robin (RR) scheduling
depends on the length of the time-slice or time-quantum.
If the slices are too short, the cost of context
switching becomes high in comparison to the time
spent doing useful work.
If they become too long, processes which are waiting
spend too much time doing nothing - and in the worst
case, everything reverts back to FCFS.
A rule of thumb is to make the time-slices large
enough so that only, say, twenty percent of all context
switches are due to timeouts - the remainder occur
freely because of waiting for requested I/O.
CPU quotas and accounting
●
●
●
Many multiuser systems allow restrictions to
be placed on user activity.
For example, it is possible to limit the CPU
time used by any one job.
If a job exceeds the limit, it is terminated by
the kernel. In order to make such a decision,
the kernel has to keep detailed information
about the cumulative use of resources for
each process.
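On UNIX-like systems, one mechanism for this kind of restriction is the per-process resource limit interface (setrlimit/getrlimit). A sketch using Python's resource module - the 60-second figure is an arbitrary illustration:

```python
import resource

# Read the current CPU-time limit for this process (in seconds).
soft, hard = resource.getrlimit(resource.RLIMIT_CPU)

# Choose a new soft limit no greater than the hard limit. If the
# process exceeds it, the kernel sends SIGXCPU and eventually
# terminates the process.
new_soft = 60 if hard == resource.RLIM_INFINITY else min(60, hard)
resource.setrlimit(resource.RLIMIT_CPU, (new_soft, hard))

print(resource.getrlimit(resource.RLIMIT_CPU)[0])
```

An unprivileged process may lower its soft limit freely, but only a privileged one may raise the hard limit - mirroring the accounting-and-enforcement role of the kernel described above.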
CPU quotas and accounting
●
●
This is called accounting and it can be a
considerable system overhead.
Most system administrators would prefer not
to use accounting - though unfortunately
many are driven to it by thoughtless or hostile
users.
Threads – Heavy and lightweight
processes
●
●
Threads, sometimes called lightweight
processes (LWPs), are independently
scheduled parts of a single program.
We say that a task is multithreaded if it is
composed of several independent
subprocesses which do work on common
data, and if each of those pieces could (at
least in principle) run in parallel.
Threads – Heavy and lightweight
processes
●
●
●
If we write a program which uses threads - there
is only one program, one executable file, one
task in the normal sense.
Threads simply enable us to split up that program
into logically separate pieces, and have the
pieces run independently of one another, until
they need to communicate.
In a sense, threads are a further level of object
orientation for multitasking systems. They allow
certain functions to be executed in parallel with
others.
Threads – Heavy and lightweight
processes
●
●
On a truly parallel computer (several CPUs)
we might imagine parts of a program
(different subroutines) running on quite
different processors, until they need to
communicate.
When one part of the program needs to send
data to the other part, the two independent
pieces must be synchronized, or be made to
wait for one another.
Threads – Heavy and lightweight
processes
●
●
But what is the point of this? We can always
run independent procedures in a program as
separate programs, using the process
mechanisms we have already introduced.
They could communicate using normal
interprocess communication. Why
introduce another new concept? Why do we
need threads?
Threads – Heavy and lightweight
processes
●
●
●
The point is that threads are cheaper than normal
processes, and that they can be scheduled for
execution in a user-dependent way, with less
overhead.
Threads are cheaper than a whole process
because they do not have a full set of resources
each.
Whereas the process control block for a
heavyweight process is large and costly to
context switch, the PCBs for threads are much
smaller, since each thread has only a stack and
some registers to manage.
Threads – Heavy and lightweight
processes
●
●
●
It has no open file lists or resource lists, no
accounting structures to update.
All of these resources are shared by all
threads within the process.
Threads can be assigned priorities - a higher
priority thread will get put to the front of the
queue.
Threads – Heavy and lightweight
processes
●
●
In other words, threads are processes within
processes!
Threads can only run inside a normal
process.
Threads – Heavy and lightweight
processes
Let's define heavy and lightweight processes with the
help of a table.

Object              Resources
Thread (LWP)        Stack + set of CPU registers + CPU time.
Task (HWP)          1 thread + process control block, program code, memory segment etc.
Multithreaded task  n threads + process control block, program code, memory segment etc.
Why use threads?
●
●
●
From our discussion of scheduling, we can see that
the sharing of resources could have been made
more effective if the scheduler had known exactly
what each program was going to do in advance.
Of course, the scheduling algorithm can never know
this - but the programmer who wrote the program
does know.
Using threads it is possible to organize the
execution of a program in such a way that
something is always being done, whenever the
scheduler gives the heavyweight process CPU time.
Why use threads?
●
●
Threads allow a programmer to switch
between lightweight processes when it is
best for the program. (The programmer has
control.)
A process which uses threads does not get
more CPU time than an ordinary process but the CPU time it gets is used to do work
on the threads. It is possible to write a more
efficient program by making use of threads.
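The payoff is easiest to see with waiting. In the sketch below, time.sleep stands in for waiting on a device: four such waits overlap when run in threads, instead of accumulating serially to 0.8 seconds.

```python
import threading
import time

def io_task():
    time.sleep(0.2)     # stand-in for waiting on a device

start = time.time()
workers = [threading.Thread(target=io_task) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
elapsed = time.time() - start

# The four 0.2 s waits overlap rather than running one after another.
print(round(elapsed, 1))
```

The process as a whole gets no extra CPU time; the threads simply ensure that the time it does get is not spent idle while one logical piece of the program waits.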
Why use threads?
●
●
Inside a heavyweight process, threads are
scheduled on a FCFS basis, unless the
program decides to force certain threads to
wait for other threads. If there is only one
CPU, then only one thread can be running at
a time.
Threads context switch without any need to
involve the kernel - the switching is
performed by a user level library, so time is
saved because the kernel doesn't need to
know about the threads.
Levels of threads
●
●
●
In modern operating systems, there are two
levels at which threads operate: system or
kernel threads and user level threads.
If the kernel itself is multithreaded, the
scheduler assigns CPU time on a thread
basis rather than on a process basis.
A kernel level thread behaves like a virtual
CPU, or a power point to which user processes can connect in order to get
computing power.
Levels of threads
●
●
The kernel has as many system level threads
as it has CPUs and each of these must be
shared between all of the user-threads on the
system.
In other words, the maximum number of user
level threads which can be active at any one
time is equal to the number of system level
threads, which in turn is equal to the number
of CPUs on the system.
Levels of threads
●
●
●
Since threads work "inside" a single task,
the normal process scheduler cannot
normally tell which thread to run and which
not to run - that is up to the program.
When the kernel schedules a process for
execution, it must then find out from that
process which is the next thread it must
execute.
If the program is lucky enough to have more
than one processor available, then several
threads can be scheduled at the same time.
Levels of threads
●
Some important implementations of threads
are
–
The Mach System / OSF1 (user and system
level)
–
Solaris 1 (user level)
–
Solaris 2 (user and system level)
–
OS/2 (system level only)
–
NT threads (user and system level)
–
IRIX threads
–
POSIX standardized user threads interface
Symmetric and asymmetric
multiprocessing
●
●
Threads are of obvious importance in
connection with parallel processing.
There are two approaches to scheduling on a
multiprocessor machine:
–
Asymmetric: one CPU does the work of the
system, the other CPUs service user requests.
–
Symmetric: All processors can be used by the
system and users alike. No CPU is special.
Symmetric and asymmetric
multiprocessing
●
●
The asymmetric variant is potentially more
wasteful, since it is rare that the system
requires a whole CPU just to itself.
This approach is more common on very large
machines with many processors, where the
jobs the system has to do are demanding
enough to warrant a CPU to themselves.
Problems with sharing for
processes
●
●
It is not only threads which need to be
synchronized. Suppose one user is running a
script program and editing the program
simultaneously.
The script is read in line by line. During the
execution of the script, the user adds four
lines to the beginning of the file and saves
the file.
Problems with sharing for
processes
●
●
Suddenly, when the next line of the executing
script gets read, the pointer to the next line
points to the wrong location and it reads in
the same line it already read in four lines
ago!
Everything in the program is suddenly shifted
by four lines, without the process executing
the script knowing about it.
Problems with sharing for
processes
●
We must consider programs which share
data.
–
When do we need to prevent programs from
accessing data simultaneously? If there are 100
processes which want to read from a file, this will
cause no problems because the data themselves
are not changed by a read operation. A problem
only arises if more than one of the parties wants
to modify the data.
Problems with sharing for
processes
●
We must consider programs which share
data.
–
Is it even sensible for two programs to want to
modify data simultaneously? Or is it simply a
stupid thing to do? We must be clear about
whether such collisions can be avoided, or
whether they are a necessary part of a program.
For instance, if two independent processes want
to add entries to a database, this is a reasonable
thing to do. If two unrelated processes want to
write a log of their activities to the same file, it is
probably not sensible: a better solution would be
for each process to write to its own log file.
Problems with sharing for
processes
●
We must consider programs which share
data.
–
How should we handle a collision between
processes? Should we signal an error, or try to
make the processes wait in turn? There is no
universal answer to this question - in some
cases it might be logically incorrect for two
processes to change data at the same time: if
two processes try to change one numerical value
then one of them has to win - which one? On the
other hand, if two processes try to add
something to a list, that makes sense, but we must still make sure the additions happen one at a time.
Serialization
●
●
The key idea in process synchronization is
serialization. This means that we have to go
to some pains to undo the work we have put
into making an operating system perform
several tasks in parallel. As we mentioned, in
the case of print queues, parallelism is not
always appropriate.
Synchronization is a large and difficult topic,
so we shall only undertake to describe the
problem and some of the principles involved
here.
Serialization
●
There are essentially two strategies to serializing
processes in a multitasking environment.
–
The scheduler can be disabled for a short period of time,
to prevent control being given to another process during a
critical action like modifying shared data. This method is
very inefficient on multiprocessor machines, since all other
processors have to be halted every time one wishes to
execute a critical section.
–
A protocol can be introduced which all programs sharing
data must obey. The protocol ensures that processes
have to queue up to gain access to shared data.
Processes which ignore the protocol ignore it at their own
peril (and the peril of the remainder of the system!). This
method works on multiprocessor machines also, though it
is more difficult to visualize.
Serialization
●
The responsibility of serializing important
operations falls on programmers. The OS
cannot impose any restrictions on silly
behaviour - it can only provide tools and
mechanisms to assist the solution of the
problem.
Mutexes: mutual exclusion
●
●
●
●
Another way of talking about serialization is to use
the concept of mutual exclusion.
We are interested in allowing only one process or
thread access to shared data at any given time.
To serialize access to these shared data, we have to
exclude all processes except for one.
Suppose two processes A and B are trying to
access shared data, then: if A is modifying the data,
B must be excluded from doing so; if B is modifying
the data, A must be excluded from doing so. This is
called mutual exclusion.
Mutexes: mutual exclusion
●
●
●
Mutual exclusion can be achieved by a
system of locks. A mutual exclusion lock is
colloquially called a mutex.
The idea is for each thread or process to try
to obtain locked access to shared data.
The mutex variable is shared by all parties
(e.g. a global variable). This protocol is
meant to ensure that only one process at a
time can get past the function Get_Mutex.
Mutexes: mutual exclusion
●
All other processes or threads are made to
wait at the function Get_Mutex until that one
process calls Release_Mutex to release the
lock. A method for implementing this is
discussed below. Mutexes are a central part
of multithreaded programming.
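The protocol can be sketched as follows, using the Get_Mutex and Release_Mutex names from the text. The wrappers around Python's threading.Lock are illustrative; the point is that only one thread at a time gets past Get_Mutex.

```python
import threading

mutex = threading.Lock()        # shared by all parties (a global variable)
shared_total = 0

def Get_Mutex(m):
    """Block until we hold the lock; only one caller gets past here."""
    m.acquire()

def Release_Mutex(m):
    """Give the lock up so that one waiting thread may proceed."""
    m.release()

def worker():
    global shared_total
    for _ in range(50_000):
        Get_Mutex(mutex)        # enter the critical section
        shared_total += 1       # modify the shared data safely
        Release_Mutex(mutex)    # leave the critical section

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_total)   # 100000
```

Any thread that ignores the protocol and touches shared_total without the mutex endangers every cooperating thread, which is why mutex discipline rests on the programmer.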
User synchronization: file locks
●
●
●
A simple example of a protocol solution, to
the locking problem at the user level, is the
so-called file-lock in UNIX.
When write-access is required to a file, we try
to obtain a lock by creating a lock-file with a
special name.
If another user or process has already
obtained a lock, then the file is already in
use, and we are denied permission to edit
the file.
User synchronization: file locks
●
●
If the file is free, a `lock' is placed on the file
by creating the file lock.
This indicates that the file now belongs to the
new user. When the user has finished, the
file lock is deleted, allowing others to use the
file.
User synchronization: file locks
●
●
In most cases a lock is simply a text file. If we
wanted to edit a file blurb, the lock might be
called blurb.lock and contain the user
identifier of the user currently editing the file.
If other users then try to access the file, they
find that the lock file exists and are denied
access. When the user is finished with the
file, the lock is removed.
User synchronization: file locks
●
●
The same method of locks can also be used
to prevent two instances of a program from
starting up simultaneously.
This is often used in mail programs such as
the ELM mailer in UNIX, since it would be
unwise to try to read and delete incoming
mail with two instances of the mail program
at the same time.
User synchronization: file locks
●
●
We can implement a lock very easily. Here is
an example from UNIX in which the lock file
contains the process identifier.
This is useful because if something goes
wrong and the editor crashes, the lock will
not be removed. It is then possible to see
that the process the lock referred to no
longer exists and the lock can be safely
removed.
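A minimal sketch of such a UNIX lock file in Python, using the hypothetical name blurb.lock from the text. The key point is that O_CREAT|O_EXCL creates the file atomically, so only one process can ever succeed, and storing the PID lets us detect a stale lock left by a crashed editor:

```python
import os
import errno

LOCKFILE = "blurb.lock"          # conventional name for a lock on "blurb"

def try_lock():
    """Atomically create the lock file containing our PID.
    Returns True on success, False if another process holds the lock."""
    try:
        fd = os.open(LOCKFILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False         # the file already exists: lock is taken
        raise
    os.write(fd, str(os.getpid()).encode())
    os.close(fd)
    return True

def unlock():
    os.remove(LOCKFILE)

def lock_is_stale():
    """True if the PID recorded in the lock no longer exists,
    e.g. because the editor crashed without removing the lock."""
    with open(LOCKFILE) as f:
        pid = int(f.read())
    try:
        os.kill(pid, 0)          # signal 0: existence check, sends nothing
        return False
    except ProcessLookupError:
        return True

assert try_lock()                # we obtain the lock...
assert not try_lock()            # ...and a second attempt is refused
assert not lock_is_stale()       # our own process is clearly alive
unlock()                         # others may now use the file
```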
Exclusive and non-exclusive locks
●
●
●
To control both read and write access to files,
we can use a system of exclusive and non-exclusive locks.
If a user wishes to read a file, a non-exclusive lock is used. Other users can also
get non-exclusive locks to read the file
simultaneously, but when a non-exclusive
lock is placed on a file, no user may write to
it.
To write to a file, we must get an exclusive
lock. When an exclusive lock is obtained, no
other users can read or write to the file.
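On UNIX this scheme maps directly onto flock(): LOCK_SH is the non-exclusive (reader) lock and LOCK_EX the exclusive (writer) lock. A sketch in Python, using an invented file name shared.dat; two independent opens of the same file stand in for two users:

```python
import fcntl
import os

PATH = "shared.dat"              # invented data file for the demonstration
open(PATH, "w").close()

fd_reader = os.open(PATH, os.O_RDWR)
fd_writer = os.open(PATH, os.O_RDWR)   # an independent open of the same file

def can_lock(fd, kind):
    """Try a non-blocking lock; report whether it was granted."""
    try:
        fcntl.flock(fd, kind | fcntl.LOCK_NB)
        return True
    except OSError:
        return False

# A reader takes a non-exclusive (shared) lock...
assert can_lock(fd_reader, fcntl.LOCK_SH)
# ...another reader may join it, but a writer is refused.
assert can_lock(fd_writer, fcntl.LOCK_SH)
fcntl.flock(fd_writer, fcntl.LOCK_UN)
assert not can_lock(fd_writer, fcntl.LOCK_EX)

# Once the reader lets go, the exclusive lock is granted,
# and then no lock of either kind can be taken by anyone else.
fcntl.flock(fd_reader, fcntl.LOCK_UN)
assert can_lock(fd_writer, fcntl.LOCK_EX)
assert not can_lock(fd_reader, fcntl.LOCK_SH)

os.close(fd_reader)
os.close(fd_writer)
os.remove(PATH)
```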
Flags and semaphores
●
●
Flags are similar in concept to locks. The
idea is that two cooperating processes can
synchronize their execution by sending very
simple messages to each other.
A typical behaviour is that one process
decides to stop and wait until another
process signals that it has arrived at a certain
place.
Monitors
●
●
Some languages (like Modula) have special
language class-environments for dealing with
mutual exclusion. Such an environment is
called a monitor.
A monitor is a language-device which
removes some of the pain from
synchronization. Only one process can be
`inside' a monitor at a time - users don't need
to code this themselves, they only have to
create a monitor.
Monitors
●
●
A procedure or function defined under the
umbrella of a monitor can only access those
shared memory locations declared within that
monitor and vice-versa.
Wait and signal operations can be defined
to wait for specific condition variables. A
process can thus wait until another process
sends a signal or semaphore which changes
the condition variable.
Deadlock
●
●
●
●
Waiting and synchronization is not all sweetness
and roses.
Consider the European road rule which says: on
minor roads one should always wait for traffic
coming from the right.
If four cars arrive simultaneously at a
crossroads (see figure) then, according to the
rule all of them must wait for each other and
none of them can ever move.
This situation is called deadlock. It is the stalemate of the operating system world.
Deadlock – Cause
●
Deadlock occurs when a number of
processes are waiting for an event which can
only be caused by another of the waiting
processes.
Deadlock – Cause
●
These are the essential requirements for a
deadlock:
–
Circular waiting. There must be a set of
processes P1..Pn where P1 is waiting for a
resource or signal from P2, P2 is waiting for P3
... and Pn is waiting for P1 .
–
Non-sharable resources. It is not possible to
share the resources or signals which are being
waited for. If the resource can be shared, there is
no reason to wait.
–
No preemption. The processes cannot be forced
to release the resources they hold.
Deadlock – Cause
●
There are likewise three methods for handling
deadlock situations:
–
Prevention. We can try to design a protocol which ensures
that deadlock never occurs.
–
Recovery. We can allow the system to enter a deadlock
state and then recover.
–
Ostrich method. We can pretend that deadlocks will never
occur and live happily in our ignorance. This is the method
used by most operating systems. User programs are
expected to behave properly. The system does not
interfere. This is understandable: it is very hard to make
general rules for every situation which might arise.
Deadlock – Prevention
●
●
●
●
Deadlock prevention requires a system
overhead.
The simplest possibility for avoidance of
deadlock is to introduce an extra layer of
software for requesting resources in addition
to a certain amount of accounting.
Each time a new request is made, the
system analyses the allocation of resources
before granting or refusing the resource.
The same applies for wait conditions.
Deadlock – Prevention
●
●
The problem with this approach is that, if a
process is not permitted to wait for another
process - what should it do instead?
At best the system would have to reject or
terminate programs which could enter
deadlock, returning an error condition.
Deadlock – Prevention
●
●
Another method is the following. One might
demand that all programs declare what
resources they will need in advance.
Similarly all wait conditions should be
declared. The system could then analyse
(and re-analyse each time a new process
arrives) the resource allocation and pin-point
possible problems.
Deadlock – Detection
●
●
The detection of deadlock conditions is also
a system overhead.
At regular intervals the system is required to
examine the state of all processes and
determine the interrelations between them.
Since this is quite a performance burden, it is
not surprising that most systems ignore
deadlocks and expect users to write careful
programs.
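The circular-waiting condition can be checked mechanically by looking for a cycle in the "wait-for" graph of the processes. A sketch of such a detector in Python; the dictionary representation of the graph is an assumption for illustration:

```python
def has_deadlock(waits_for):
    """Detect circular waiting: waits_for maps each process to the set
    of processes it is waiting for. A cycle in this graph is a deadlock."""
    WHITE, GREY, BLACK = 0, 1, 2           # unvisited / in progress / done
    colour = {}

    def visit(p):
        colour[p] = GREY
        for q in waits_for.get(p, ()):
            state = colour.get(q, WHITE)
            if state == GREY:
                return True                # back edge: P1 -> ... -> P1
            if state == WHITE and visit(q):
                return True
        colour[p] = BLACK
        return False

    return any(colour.get(p, WHITE) == WHITE and visit(p) for p in waits_for)

# P1 waits for P2, P2 for P3, P3 for P1: nobody can ever proceed.
assert has_deadlock({"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}})
# A chain that ends in a running process is mere waiting, not deadlock.
assert not has_deadlock({"P1": {"P2"}, "P2": {"P3"}, "P3": set()})
```

Building and traversing this graph for every process is precisely the performance burden the text describes, which is why most systems do it rarely or not at all.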
Deadlock – Recovery
●
●
To recover from a deadlock, the system must either
terminate one of the participants, and go on
terminating them until the deadlock is cured, or
repossess the resources which are causing the
deadlock from some processes until the deadlock is
cured.
The latter method is somewhat dangerous since it
can lead to incorrect program execution. Processes
usually wait for a good reason, and any interruption
of that reasoning could lead to incorrect execution.
Termination is a safer alternative.
Memory and storage
●
●
Together with the CPU, the physical memory
(RAM) is the most important resource a
computer has.
The CPU chip has instructions to manipulate
data only directly in memory, so all arithmetic
and logic operations must take place in RAM.
Physical Address space
●
Every byte in the memory has an address
which ranges from zero up to a limit which is
determined by the hardware (see below).
Physical Address space
●
Although bytes are numbered from zero upward, not
every address is necessarily wired up to a memory
chip. Some addresses may be reserved for
–
Memory mapped I/O - individual registers belonging to
other chips and hardware devices.
–
The interrupt vector - the CPU itself requires some
workspace. Usually the interrupt vector and sometimes
the processor stack occupy fixed locations.
–
The operating system itself. This takes up a fair chunk of
memory. On most microcomputers this is located in
ROM. On multiuser systems upgrades are much more
frequent and it is always loaded from disk into RAM.
Paged RAM/ROM
●
●
●
The size of the physical address space is
limited by the size of the address registers in
the CPU.
On early machines this memory was soon
exceeded and it was necessary to resort to
tricks to add more memory.
Since it was not possible to address any
more than the limit, these machines
temporarily switched out one bank of
memory with another.
Paged RAM/ROM
●
●
The new memory bank used the same
addresses as the old, but only one could be
accessed at a time. This operation is called
paging.
A special hardware paging chip was used to
switch between banks, containing a register
which could choose between N banks of
memory.
Paged RAM/ROM
●
●
Paging has obvious disadvantages - not all
memory can be used at once and the method
is seldom used nowadays since modern
CPUs can address much larger memory
spaces.
As we shall see later, multi-user systems use
paging to disk. Instead of switching between
hardware banks of memory, they copy the
old contents to disk and reuse the memory
which is already there for something else.
Address binding - coexistence in
memory
●
●
●
●
When a high level language program is
compiled, it gets converted into machine code.
In machine code there are no procedure names,
or variable names.
All references to data or program code are
made by specifying the address at which they
are to be found.
This immediately raises the question: how do we
know what the addresses will be? How do we
know where the program is going to be located
in memory?
Address binding - coexistence in
memory
●
●
●
On microcomputers, this is very
straightforward. A program is compiled to run
starting from some fixed address.
The system defines a certain range of
addresses which can be used by user
programs.
Whenever the program is loaded from disk, it
is loaded into the memory at the same
address, so that all of the addresses referred
to in the program are correct every time.
Address binding - coexistence in
memory
●
●
A problem arises if the system supports
several programs resident in memory
simultaneously.
Then it is possible that the addresses coded
into one program will already be in use by
another.
Address binding - coexistence in
memory
●
In that case there are three possible options
–
Demand that programs which can coexist be compiled to
run at different addresses. (This means that every
program which is to be able to coexist must know about
every other!)
–
Relative addressing. Machine code uses addresses
relative to the start address at which the program was
loaded. The CPU must then add the start address to every
relative address to get the true address. This incurs a
performance penalty. Also, on some microprocessors (e.g.
the MOS 6502), the relative addressing instructions available
are limited to fairly small relative ranges, due to the size of
the CPU registers.
Address binding - coexistence in
memory
●
In that case there are three possible options
–
Use address binding. Here the idea is that ``dummy''
addresses are used when code is generated. When a
program is loaded in, the true addresses are computed
relative to the start of the program and replaced before
execution begins. This requires a special program called a
loader.
Address binding - coexistence in
memory
●
●
●
●
Needless to say, it is the last of these
methods which is used in modern systems.
It introduces an important distinction between
logical and physical addresses.
A user program writes only to logical
addresses, not knowing where the program
will end up in the physical address space.
The addresses are converted to physical
addresses automatically.
Address binding - coexistence in
memory
●
Again there is a choice. When should this
conversion take place?
–
When the program is loaded into memory, once
and for all?
–
While the program is being executed?
Address binding - coexistence in
memory
●
●
●
●
Initially it would seem that the first is the better alternative,
since the second incurs a runtime overhead.
In fact the second is the more flexible option for reasons which
will become more apparent when we consider paging
to disk.
By performing the conversion at runtime, we have the
freedom to completely reorganize the use of physical
memory dynamically at any time.
This freedom is very important in a multitasking
operating system where memory has to be shared
continually.
Shared libraries
●
●
●
The concept of shared libraries lies
somewhere in the grey zone between
compiling and linking of programs and
memory binding.
The advantages of shared libraries should
be clearly apparent by the end of this section.
On windows systems, shared libraries are
called dynamically loaded libraries or dll's.
Shared libraries
●
●
●
●
●
On older systems, when you compile a program, the
linker attaches a copy of standard libraries to each
program.
Because of the nature of the linker, the whole library
has to be copied even though perhaps only one
function is required.
Thus a simple program to print ``hello'' could be
hundreds or thousands of kilobytes long!
This wastes a considerable amount of disk space,
copying the same code for every program.
When the program is loaded into memory, the whole
library is loaded too, so it is also a waste of RAM.
Shared libraries
●
The solution is to use a run-time linker, which
only loads the shared library into RAM when
one of the functions in the library is needed.
Shared libraries
●
The advantages and disadvantages of this scheme
are the following.
–
Considerable savings in disk space are made, because
the standard library code is never joined to the executable
file which is stored on disk, thus there is only one copy of
the shared library on the system.
–
A saving of RAM can also be made since the library, once
loaded into RAM can often be shared by several
programs.
–
A performance penalty is transferred from load-time to
run-time, the first time a function is accessed: the library
must be loaded from disk during the execution of the
program. In the long run, this might be outweighed by the
time it would otherwise have taken to load the library for n
programs, which now can share it. Also, the amount of
RAM needed to support the programs is now much smaller.
Runtime binding
●
●
●
Keeping physical and logical addresses
completely separate introduces a new level
of abstraction to the memory concept.
User programs know only about logical
addresses.
Logical addresses are mapped into real
physical addresses, at some location which
is completely transparent to the user, by
means of a conversion table.
Runtime binding
●
●
●
The conversion can be assisted by hardware
processors which are specially designed to
deal with address mapping.
This is much faster than a purely software
solution (since the CPU itself must do the
conversion work).
The conversion is, at any rate, performed by
the system and the user need know nothing
about it.
Runtime binding
●
●
The part of the system which performs the
conversion (be it hardware or software) is
called the memory management unit (MMU).
The conversion table of addresses is kept for
each process in its process control block
(PCB) and must be downloaded into the
MMU during context switching (this is one
reason why context switching is expensive!).
Runtime binding
●
Each logical address sent to the MMU is
checked in the following way:
–
Does the logical address belong to the process?
If not, generate an ownership error (often called
a segmentation fault)
–
Translate the logical address into a physical
address.
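These two checks can be sketched as a toy MMU in Python, assuming 4 kB pages and a per-process page table mapping page numbers to frame numbers (both the page size and the table contents are invented for illustration):

```python
PAGE_SIZE = 4096                 # assume 4 kB pages for the sketch

def translate(logical_addr, page_table):
    """The MMU's two steps: ownership check, then address translation.
    page_table maps this process's page numbers to physical frame numbers."""
    page, offset = divmod(logical_addr, PAGE_SIZE)
    if page not in page_table:                   # page not owned by process
        raise MemoryError("segmentation fault on page %d" % page)
    frame = page_table[page]
    return frame * PAGE_SIZE + offset            # physical address

# The process owns logical pages 0 and 1, mapped to frames 7 and 3.
pt = {0: 7, 1: 3}
assert translate(0, pt) == 7 * PAGE_SIZE
assert translate(PAGE_SIZE + 42, pt) == 3 * PAGE_SIZE + 42

try:
    translate(2 * PAGE_SIZE, pt)                 # page 2: ownership error
    assert False, "expected a segmentation fault"
except MemoryError:
    pass
```

Note that the offset within the page passes through unchanged; only the page number is looked up, which is the whole point of coarse-graining the memory.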
Runtime binding
●
●
●
The ownership checking is performed at the
logical level rather than the physical level
because we want to be able to use the physical
memory in the most general possible way.
If we bind physical addresses to a special user it
means that we cannot later reorganize the
physical memory and part of the point of the
exercise is lost.
On the other hand, if users are only bound to
logical addresses, we can fiddle as much as we
like with the physical memory and the user will
never know.
Runtime binding
●
One more question must be added to the
above.
–
Are the data we want to access actually in the
physical memory? As we shall see later in this
chapter, many systems (the most immediate
example of which is UNIX) allow paging to disk.
Runtime binding
●
●
●
●
The conversion of logical addresses into physical
addresses is familiar in many programming
languages and is achieved by the use of pointers.
Instead of referring to data directly, one uses a
pointer variable which holds the true address at
which the data are kept. In machine language, the
same scheme is called ``indirect addressing''.
The difference between logical addresses and
pointers is that all pointers are user objects, and thus
pointers only point from one place in logical memory
to another place in logical memory.
The mapping from logical to physical is only visible to
the designer of the system.
Runtime binding
●
●
●
How is the translation performed in practice?
To make the translation of logical to physical
addresses practical, it is necessary to coarse-grain
the memory.
If every single byte-address were
independently converted, then two 32-bit
addresses would be required for each byte-address
in the table, and the storage space
for the conversion table would be seven
times bigger than the memory of the system!
Runtime binding
●
●
To get around this problem, we have to break
up the memory into chunks of a certain size.
Then we only need to map the start address
of each block, which is much cheaper if the
blocks are big enough.
Runtime binding
●
There are two schemes for coarse graining the
memory in this way:
–
Give each process/task a fixed amount of workspace (a
fixed size vector) which is estimated to be large enough to
meet its needs. Only the base address of the workspace
and the size need to be stored i.e. the whole vector in
logical memory is mapped into a corresponding vector in
physical memory. We don't know where it lies in the
physical memory, but the mapping is one-to-one. The
disadvantage with this scheme is that either too much or
too little memory might be allocated for the tasks.
Moreover - if only a small part of the program is actually
required in practice, then a large amount of memory is wasted.
Runtime binding
●
There are two schemes for coarse graining
the memory in this way:
–
Coarse grain or ``quantize'' the memory in
smallish pieces, called pages. Each page is
chosen to have the same fixed size (generally
2-4 kB on modern systems), given by some
power of 2 (this varies from system to system).
The base address of each page is then stored in
the conversion table (the length is known, since
it is fixed). A unit of logical memory is called a
page, whereas a unit of physical memory is
called a frame. Apart from the difference in
names, pages and frames are of the same size.
Runtime binding
●
●
●
The second of these possibilities is an
attractive proposition for a number of
reasons.
By breaking up the memory into smaller
pieces, we have the possibility of
reorganizing (reusing) each piece separately.
Large programs need not be entirely in
memory if parts of them are not needed. Also, if two
programs use the same code, they can share
pages, so two logical pages map into the
same physical frame.
Segmentation – sharing
●
●
●
From the point of view of the system:
sharing, process management and efficiency,
it is highly convenient to view the memory for
different processes as being segmented.
A segment is a convenient block of logical
memory which is assigned to a process when
it is executed.
The memory given to any process is divided
up into one or more segments which then
belong to that process.
Segmentation – sharing
●
●
●
The purpose of segments is to help the
system administrate the needs of all
processes according to a simple paradigm.
Each segment of memory is administrated
separately and all of the checks on valid
addressing are made on each segment.
It is therefore convenient to use separate
segments for logically separate parts of a
program/process.
Segmentation – sharing
●
●
●
Code segment - program code
Data segment - the program stack and
dynamically allocated data.
Arrays can conveniently be placed in a
segment of their own - that way, array bound-checking will be performed automatically by
the hardware of the system.
Segmentation – sharing
●
●
●
●
The segment idea can all be built on top of
the page/frame concept above by demanding
that segments be a whole number of pages.
That way, we retain the advantages of the
page system.
Segmentation is an additional overhead
which relates to the sharing of logical
memory between processes.
The page overhead relates to the mapping of
logical to physical addresses.
Page size, fragmentation and
alignment
●
●
●
The process of allocating memory is really
only half the story of memory management.
We must also be able to de-allocate or free
memory. When memory is freed from a
segment, it leaves a hole of a certain size,
which is added to the free-list.
Eventually, the number of these holes grows
quite large and the memory is said to
become fragmented.
Page size, fragmentation and
alignment
●
●
Fragmentation can lead to wasted resources.
We would clearly like to re-use freed memory
as far as possible, but if the holes are not big
enough to fit the data we need to allocate
then this is not possible.
Page size, fragmentation and
alignment
●
●
●
●
Another technical problem which leads to
fragmentation and wastage is alignment.
Alignment is a technical problem associated
with the word-size and design of the CPU.
Certain memory objects (variables) have to
be stored starting from a particular (usually
even) address.
This is because the multiple-byte registers of
the CPU need to align their ``footprints'' to
the addresses of the memory.
Page size, fragmentation and
alignment
●
●
Or, by virtue of the word-size of the system,
the CPU regards the addresses as being
effectively multiples of the word-size.
In order to meet this requirement, memory
sometimes has to be `padded' out with empty
bytes - which are therefore wasted.
Page size, fragmentation and
alignment
●
Fragmentation occurs at two levels:
–
Internal fragmentation. This is space wasted by
malloc in trying to fit data into a segment (logical
memory).
–
External fragmentation. This is space lying
between segments in the physical memory.
(There are never holes between segments in
logical memory since we can always just
renumber the logical addresses to remove them
- they are not real anyway.)
Page size, fragmentation and
alignment
●
●
●
Note that external fragmentation is formally
eliminated by the page concept.
With pages, every object in physical memory is
always the size of a page or frame, every hole
must also be the size of a page and thus one is
guaranteed to be able to fit a page block into a
page hole.
To some extent this is a cheat though, because
the problem is only transferred from external to
internal fragmentation - but such is the nature
of definitions.
Page size, fragmentation and
alignment
●
●
●
Internal fragmentation can be minimized by
choosing a smaller page size for the system.
That means that, on average, fewer bytes will
be wasted per page.
Of course, the system overhead grows larger
as the page size is reduced, so as usual the
size of pages is a tradeoff between two
competing requirements.
Page size, fragmentation and
alignment
●
●
At the user level, it is possible to avoid the
fragmentation problem when writing
programs.
For example, if a program allocates and frees
memory objects of random sizes, it will be a
matter of chance whether or not the holes left
over can be used again.
Page size, fragmentation and
alignment
●
●
If, on the other hand, a program only
allocates memory in fixed size structures (like
C's struct and union variable types), then
every hole will be the same size as every
new object created and (as with pages) it will
always be possible to fit new data into old
holes.
This is a program design consideration.
Unions were designed for precisely this kind
of purpose.
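The point can be illustrated with a toy allocator in Python: when every object is the same size, any freed hole fits the next request exactly, so no memory is ever stranded (the block size and the address scheme are invented for the sketch):

```python
BLOCK = 16                       # invented fixed object size
free_list = []                   # holes left by freed objects
heap_top = 0                     # first never-used address

def alloc():
    """Return the address of a free fixed-size block."""
    global heap_top
    if free_list:
        return free_list.pop()   # an old hole always fits exactly
    addr = heap_top
    heap_top += BLOCK            # no hole available: grow the heap
    return addr

def free(addr):
    free_list.append(addr)       # the hole is the size of every object

a, b = alloc(), alloc()
free(a)
c = alloc()
assert c == a                    # the hole was reused, not fresh memory
assert heap_top == 2 * BLOCK     # the heap never grew past two blocks
```

This is the same trick the page concept plays at the system level: uniform sizes turn the fitting problem into a trivial one.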
Reclaiming fragmented memory
(Tetris!)
●
There are two strategies for reclaiming
fragmented memory.
–
Try to fit data into the holes that already exist.
–
Reorganize the data so that all the holes are
regrouped into one large hole.
Reclaiming fragmented memory
(Tetris!)
●
●
The second alternative clearly represents a
large system overhead and is seldom used.
The first method can be implemented in one
of three ways. Given a free-list of available
holes, one may choose a space on the basis
of
–
First fit. Choose the first hole which will do.
–
Best fit. Choose the smallest hole that will do.
–
Worst fit. Choose the largest hole (which in some
screwy sense leaves the biggest remainder - for
what it's worth).
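The three policies differ only in which hole they select from the free-list. A sketch in Python, treating the free-list as a plain list of hole sizes (a simplification for illustration; a real allocator would track addresses too):

```python
def choose_hole(holes, size, strategy="first"):
    """Select a hole from the free-list for a request of `size` units.
    holes is a list of hole sizes; returns an index, or None if none fit."""
    fitting = [(h, i) for i, h in enumerate(holes) if h >= size]
    if not fitting:
        return None
    if strategy == "first":
        return fitting[0][1]       # first hole which will do
    if strategy == "best":
        return min(fitting)[1]     # smallest hole that will do
    if strategy == "worst":
        return max(fitting)[1]     # largest hole, biggest remainder
    raise ValueError("unknown strategy: " + strategy)

holes = [20, 50, 10, 100, 30]
assert choose_hole(holes, 25, "first") == 1    # 50 is the first that fits
assert choose_hole(holes, 25, "best") == 4     # 30 wastes the least
assert choose_hole(holes, 25, "worst") == 3    # 100 leaves the most over
assert choose_hole(holes, 200) is None         # nothing fits
```

First fit can stop scanning at the first match, while best and worst fit must examine the whole free-list, which is why first fit has the lowest allocation overhead.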
Reclaiming fragmented memory
(Tetris!)
●
●
The first two are preferable, but neither works
best in all cases.
The criteria are i) minimization of
fragmentation and ii) minimization of the
allocation overhead. First fit is perhaps
preferable, since it is fastest.
Virtual Memory - Paging and
Swapping
●
●
Virtual memory is a way of making the
physical memory of a computer system
effectively larger than it really is.
Rather than using mirrors, the system does
this by determining which parts of its memory
are often sitting idle, and then makes a
command decision to empty their contents
onto a disk, thereby freeing up useful RAM.
Virtual Memory - Paging and
Swapping
●
●
As we noted earlier, it is quite seldom that
every byte of every program is in use all of
the time.
More often programs are large and contain
sections of code which are visited rarely if
ever at all by the majority of users - so if they
are not used, why keep them in RAM?
Virtual Memory - Paging and
Swapping
●
Virtual memory uses two methods to free up
RAM when needed.
–
Swapping. An entire process, including code
segment and data segments is expunged from
the system memory.
–
Paging. Only single pages are swapped out.
Virtual Memory - Paging and
Swapping
●
●
Of course, the simplest way to clear a space
in RAM is to terminate some processes, but
virtual memory is more subtle than that.
The idea is to free RAM only temporarily, with
the intention of copying the data back again
later. All of this should happen in such a way
that the users of the system do not realize that
it is happening.
Virtual Memory - Demand Paging Lazy evaluation
●
●
You might ask - if a program has a lot of
pages which do not get used, what is the
purpose of loading them in the first place and
then swapping them out?
One could simply make a rule that no page
should be brought into memory until it were
needed. Such a scheme is possible, but few
systems allow a program to run if it cannot be
loaded fully into memory on start-up.
Virtual Memory - Demand Paging Lazy evaluation
●
●
One argument against this extreme form of
paging is that it could be dangerous to start
a program which was unable to complete
because it was too large to run on the
system, under the conditions of the moment.
If it started to run and then crashed or exited,
it could compromise important data. (The
BSD UNIX system allocates sufficient space
in its swap area to swap or page out each
entire process as it begins. That way, none of
them will ever run out of swap during
execution.)
Virtual Memory - Demand Paging Lazy evaluation
●
●
●
On the other hand, if a program can be loaded in,
it is most likely safe - so if we then discover that
large parts of the program are never used, we
can page them out and never bother to page
them in again.
This is an example of what is called lazy
evaluation. A lazy pager never brings a page
back into memory until it has to, i.e. until
someone wants to use it.
This can save a considerable amount of I/O time.
Another name for this is demand paging, since it
only occurs on demand from user processes.
Virtual Memory - Demand Paging Lazy evaluation
●
●
It is now easy to see how the paging concept
goes hand in hand with the logical memory
concept: each time the system pages out a
frame of physical memory, it sets a flag in the
page table next to the logical page that was
removed.
If a process attempts to read from that page
of logical memory the system first examines
the flag to see if the page is available and, if
it is not, a page fault occurs.
Virtual Memory - Demand Paging Lazy evaluation
●
●
●
A page fault is a hardware or software
interrupt (depending on implementation)
which passes control to the operating
system.
The OS proceeds to locate the missing page
in the swap area and move it back into a free
frame of physical memory.
It then binds the addresses by updating the
paging table and, when control returns to the
waiting process, the missing page is
automatically restored, as if it had never
been gone.
Virtual Memory - Demand Paging Lazy evaluation
●
●
Notice, that the location of the physical frame
is completely irrelevant to the user process.
A frame does not have to be moved back into
the same place that it was removed from,
because the runtime binding of addresses
takes care of its relocation.
Swapping and paging algorithms
●
●
●
How does the system decide what pages or
processes to swap out?
This is another problem in scheduling.
A multitude of schemes is available. Here we
shall only consider some examples.
Swapping and paging algorithms
●
Consider the UNIX system a moment. Before
paging was introduced, the only way that
memory segments could increase their size
was to
–
Try to look for free memory at the end of the
current segment and add it to the current
segment.
–
Try to allocate a new, larger segment, copy the
data to the new segment and deallocate the old
one.
–
Swap out the process, reallocate and swap it back in again.
Swapping and paging algorithms
●
●
In this use of swap space, it is clear that a
process is swapped out while it is waiting for
a suitable hole to appear in the memory.
This might take a long time and it might be
immediate. Another case for swapping out a
job is if it has been idle (sleeping) for a long
time.
Swapping and paging algorithms
●
●
●
●
Let us now look more generally at how
paging decisions are made.
The most important aspect of paging is that
pages can still be accessed even though
they are physically in secondary storage (the
disk).
Suppose a page fault occurs and there are
no free frames into which the relevant data
can be loaded.
Then the OS must select a victim: it must
choose a frame and free it so that the new page can be loaded in.
Swapping and paging algorithms
●
●
●
This is called (obviously) page replacement.
The success or failure of virtual memory rests
on its ability to make page replacement
decisions.
Certain facts might influence these
algorithms. For instance, if a process is
receiving I/O from a device, it would be
foolish to page it out - so it would probably
be I/O-locked into RAM.
Here are some viable alternatives for page
replacement.
Swapping and paging algorithms
●
FIFO - first in first out
–
The simplest way of replacing frames is to keep
track of their age (by storing their age in the
frame table). This could either be the date, as
recorded by the system clock, or a sequential
counter.
–
When a new page fault occurs, we can load in
pages until the physical memory is full - thereafter, we have to move out pages.
–
The page which has been in memory longest is
then selected as the first to go.
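A FIFO page-replacement simulation in Python; counting the faults for a reference string makes the algorithm's behaviour measurable. The reference string used here is the classic one that also exhibits Belady's anomaly, where giving FIFO more frames produces more faults:

```python
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults under FIFO replacement for a reference string."""
    frames = deque()                 # page resident longest sits at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue                 # already resident: no fault
        faults += 1
        if len(frames) == n_frames:
            frames.popleft()         # evict the page in memory longest
        frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
assert fifo_faults(refs, 3) == 9
assert fifo_faults(refs, 4) == 10    # more frames, *more* faults (Belady)
```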
Swapping and paging algorithms
●
FIFO - first in first out
–
This algorithm has the advantage of being very
straightforward, but its performance can suffer if
a page is in heavy use for a long period of time.
–
Such a page would be selected even though it
was still in heavy use.
Swapping and paging algorithms
●
Second chance
–
A simple optimization we can add to the FIFO
algorithm is the following.
–
Suppose we keep a reference bit for each page
in the page table.
–
Every time the memory management unit
accesses a page it sets that bit to 1.
–
When a page fault occurs, the page replacement
algorithm looks at that bit and - if it is set to 1 - sets
the bit to 0 but jumps over it and looks for
another page.
Swapping and paging algorithms
●
Second chance
–
The idea is that pages which are frequently used
will have their bits set often and will therefore not
get paged out.
–
Of course, this testing incurs an overhead. In the
extreme case that all pages are in heavy use the
page algorithm must cycle through all the pages
setting their bits to zero before finding the
original page again.
–
Even then, it might not find a page to replace, if
the bit was set again while it was looking through
the others. In such a case, the paging system
simply falls back on FIFO behaviour.
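A minimal Python sketch of the second-chance (clock) scheme follows; it assumes the reference bit is also set when a page is first loaded, which is one common convention:

```python
def second_chance(reference_string, num_frames):
    """Simulate second-chance (clock) replacement; return the page-fault count."""
    frames = [None] * num_frames  # circular buffer of resident pages
    ref_bit = [0] * num_frames    # one reference bit per frame
    hand = 0                      # the clock hand
    faults = 0
    for page in reference_string:
        if page in frames:
            ref_bit[frames.index(page)] = 1  # the MMU would set this on access
            continue
        faults += 1
        # Sweep the hand, clearing set bits, until an unreferenced frame turns up.
        while ref_bit[hand] == 1:
            ref_bit[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page
        ref_bit[hand] = 1
        hand = (hand + 1) % num_frames
    return faults
```

Note the worst case described above: if every bit is set, the hand clears them all and circles back to where it started, degenerating into plain FIFO.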
Swapping and paging algorithms
●
LRU - least recently used
–
The best possible solution to paging would be to
replace the page that will not be used for the
longest period of time - but unfortunately, the
system has no way of knowing what that is.
–
A kind of compromise solution is to replace the
page which has not been used for the longest
period (see the figure below).
–
This does not require a crystal ball, but it does
require some appropriate hardware support to
make it worthwhile. As with all good ideas, it
costs the system quite a lot to implement it.
Swapping and paging algorithms
●
LRU - least recently used
–
Two possibilities for such an implementation are
the following.
●
We record the time at which each page was last
referenced. Unlike the FIFO scheme above, this means
that we have to update the time-stamp every single time
memory is referenced, instead of only each time a page
is replaced. If the copying operation takes, say, five CPU
instructions (jump to update routine, locate page table
entry, load system clock time, store system clock time,
return), this means - roughly speaking - that the system
is slowed down by a factor of around five. This is an
unacceptable loss, so unless the memory management
unit can do something fancy in hardware, this scheme is
too expensive to use.
Swapping and paging algorithms
●
LRU - least recently used
–
Two possibilities for such an implementation are the
following.
●
We keep a stack of page addresses, so that the page number of
the most recently accessed page is always on the top of the
stack. Although this sounds cheaper in principle, since the page
replacement algorithm never has to search for a replacement - it
just looks on top of the stack - it still results in a large system
overhead to maintain the stack. We must update a data structure
which requires process synchronization and therefore waiting.
Again, without special hardware, this is not economical.
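The stack idea can be sketched in Python with an ordered map, which keeps pages in recency order without an explicit search (a simulation only - real kernels avoid exactly this per-access bookkeeping):

```python
from collections import OrderedDict

def lru_replacement(reference_string, num_frames):
    """Simulate true LRU replacement; return the number of page faults."""
    frames = OrderedDict()  # insertion order doubles as recency order (oldest first)
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)  # this page is now the most recently used
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults
```

Every reference, hit or miss, touches the data structure - which is exactly the overhead the text warns about.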
In practice, many systems use something like the second-chance algorithm above. The UNIX pagedaemon uses this approach.
Thrashing
●
●
Swapping and paging can lead to quite a
large system overhead.
Compared to memory speeds, disk access is
quite slow - and, in spite of optimized disk
access for the swap area, these operations
delay the system markedly.
Thrashing
●
Consider the sequence of events which takes
place when a page fault occurs:
–
Interrupt / trap and pass control to the system
interrupt handler.
–
Save the process control block.
–
Determine cause of interrupt - a page fault.
–
Consult the MMU - is the given logical address
inside the process's segment, i.e. legal?
–
Look for a free frame in the frame table. If none
is found, free one.
Thrashing
●
Consider the sequence of events which takes
place when a page fault occurs:
–
Schedule the disk operation to copy the required
page and put the process into the waiting state.
–
Interrupt from disk signals end of waiting.
–
Update the page table and schedule the process
for running.
–
(On scheduling) restore the process control
block and resume executing the instruction that
was interrupted.
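The steps above can be caricatured as a tiny simulation; all names here are hypothetical, and a real kernel performs these steps in privileged mode with hardware support:

```python
def handle_page_fault(page, frames, max_frames, valid_pages):
    """Toy model of the page-fault sequence described above (FIFO eviction)."""
    # 1-3. Trap taken, process control block saved, cause identified as a
    #      page fault (all implicit in reaching this function).
    # 4. Legality check: does the page belong to the process's address space?
    if page not in valid_pages:
        raise MemoryError(f"segmentation violation: page {page}")
    # 5. Find a free frame; if none, free one via the replacement algorithm.
    if len(frames) == max_frames:
        frames.pop(0)  # evict the oldest resident page
    # 6-7. The disk read would be scheduled here and the process blocked
    #      until the disk interrupt signals completion.
    frames.append(page)  # 8. update the page table
    # 9. The process is rescheduled and the faulting instruction restarted.
    return frames

frames = []
for ref in [10, 20, 10, 30, 40]:
    if ref not in frames:
        frames = handle_page_fault(ref, frames, 3, valid_pages={10, 20, 30, 40})
print(frames)  # → [20, 30, 40]
```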
Thrashing
●
●
●
Such a sequence of operations could take of
the order of milliseconds under favourable
conditions (although technology is rapidly
reducing the timescale for everything).
It is possible for the system to get into a state
where there are so many processes
competing for limited resources that it spends
more time servicing page faults and
swapping in and out processes than it does
executing the processes.
This sorry state is called thrashing.
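The gap between memory and disk speeds makes the danger easy to quantify with the standard effective-access-time estimate (the 100 ns and 8 ms figures below are illustrative assumptions, not measurements):

```python
def effective_access_time(fault_rate, mem_ns=100, fault_service_ns=8_000_000):
    """Average memory access time (ns) for a given page-fault rate.

    Assumes ~100 ns for a RAM access and ~8 ms to service a fault from disk.
    """
    return (1 - fault_rate) * mem_ns + fault_rate * fault_service_ns

print(effective_access_time(0.0))   # no faults: plain memory speed
print(effective_access_time(1e-5))  # one fault per 100,000 accesses
```

Even one fault per 100,000 accesses nearly doubles the average access time here; as the fault rate climbs, the machine spends essentially all its time waiting on the disk - which is thrashing.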
Thrashing
●
●
●
●
Thrashing can occur when there are too many
active processes for the available memory.
It can be alleviated in certain cases by making the
system page at an earlier threshold of memory
usage than normal.
In most cases, the best way to recover from
thrashing is to suspend processes and forbid new
ones, to try to clear some of the others by allowing
them to execute.
The interplay between swapping and paging is
important here too, since swapping effectively
suspends jobs.