Programming Project 3: Paging

Assigned: April 7.
Due: April 28.

Revised: April 9

In this assignment, the STM has been modified to support paging. The
Java code attached here provides a simulated architecture, a simulated disk,
much of the "OS" code, and all of the data structures that you will need.
Your task is to write the page fault handler and the disk driver for the
simulated disk.

The Simulated Hardware

The STML language and the format of an STML file are unchanged from
project 1.

The simulated hardware, which you may not change, is found in the file
STM.java. The methods included here are:

physAddress(reladd,write) -- handles address references.
"reladd" is the virtual address; "write" is true if the reference is
a write to the location and false if the reference is a read.
Specifically:

physAddress returns the physical address in RAM corresponding to the given
virtual address.

It sets the M bit, R bit, and time stamp for the page in the global
page table.

It sets the variable PageFault to -1 if there is no page fault,
and to the number of the page that has faulted if there is a page fault.
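
As a rough picture of the translation step, here is a minimal sketch. It is not the code in STM.java: the class name TranslationSketch, the plain int array standing in for the page table, the fixed page size, and the -1 return value on a fault are all assumptions made for illustration. The updates to the M bit, R bit, and time stamp are left out.

```java
// Hypothetical sketch of virtual-to-physical translation; not the STM.java code.
class TranslationSketch {
    static final int PAGESIZE = 8192;   // default page size from this handout
    static int pageFault = -1;          // -1 means "no page fault"

    // frames[p] is the frame holding page p, or -1 if page p is not in RAM.
    static int physAddress(int[] frames, int reladd) {
        int page = reladd / PAGESIZE;    // which page the virtual address lies in
        int offset = reladd % PAGESIZE;  // position within that page
        if (frames[page] == -1) {
            pageFault = page;            // record the number of the faulting page
            return -1;                   // assumed: no usable address on a fault
        }
        pageFault = -1;
        return frames[page] * PAGESIZE + offset;
    }
}
```

For example, with page 0 in frame 3 and page 1 not in RAM, address 100 maps to 3*8192 + 100, while address 8197 sets pageFault to 1.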

showInst(...) displays the content of an instruction

execInst(...) executes an instruction, unless doing so generates a
page fault. It also increments the PC.

There is an STM constructor, which creates and initializes the STM.

The major data structures are

mem[] --- RAM

regs[] --- registers

pageTable[] -- The global page table. The entries of the page table
are PTE's; their definition is found in PTE.java. Each PTE contains
a frame number, a disk number, an rbit, an mbit, and a timestamp.
(The rbit is set in this code but not actually used in this assignment.)
The frame number is -1 if the page is not in RAM. The disk number
is -1 if the page is not on disk.
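
To make the -1 conventions concrete, here is a small sketch of a page table entry. It is not PTE.java: the class name PteSketch, the field names, the field types, and the two helper methods are assumptions for illustration only.

```java
// Hypothetical stand-in for a page table entry; see PTE.java for the real one.
class PteSketch {
    int frame = -1;          // frame number, -1 if the page is not in RAM
    int disk = -1;           // disk number, -1 if the page is not on disk
    boolean rbit = false;    // set on a reference (not used in this assignment)
    boolean mbit = false;    // set on a write, i.e. the page is dirty
    long timestamp = 0;      // time of the most recent reference

    boolean inRam()  { return frame != -1; }
    boolean onDisk() { return disk != -1; }
}
```

An entry with both numbers equal to -1 then describes a page that has never been touched: it is neither in RAM nor on disk.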

The Simulated Disk

The simulated disk is found in Disk.java. You are not allowed to change
this code.

The disk communicates with
the OS using an object of type DiskTask, found in DiskTask.java.
A DiskTask consists of a disk address, a memory address, a size
(number of words to transfer) and a flag "write". The flag "write"
is true if the task is to write from memory to disk; it is false if
the task is to read from disk to memory.

The disk works as follows: The OS sends the disk a series of requests,
which it maintains in a set called "Requests". At each "machine cycle"
-- more precisely, each time the method "DiskAction" is called --
if the set of pending requests is non-empty, the disk flips a coin to
determine whether to take any action. With probability 3/4, it does nothing.
With probability 1/4, it picks a pending request at random and carries it
out. The method "DiskAction" returns the DiskTask that it has executed.
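
The scheduling rule just described can be modelled as follows. This is only a sketch under stated assumptions, not the code in Disk.java: the class DiskSketch, the use of strings in place of DiskTask objects, the seed parameter, and returning null when nothing is executed are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical model of the disk's scheduling rule; not the code in Disk.java.
class DiskSketch {
    private final List<String> requests = new ArrayList<>(); // pending tasks
    private final Random rng;

    DiskSketch(long seed) { rng = new Random(seed); }

    void postRequest(String task) { requests.add(task); }

    int pending() { return requests.size(); }

    // One "machine cycle": returns the task carried out, or null if none was.
    String diskAction() {
        if (requests.isEmpty()) return null;       // nothing to do
        if (rng.nextInt(4) != 0) return null;      // probability 3/4: do nothing
        int pick = rng.nextInt(requests.size());   // probability 1/4: random task
        return requests.remove(pick);              // remove and report it
    }
}
```

The point to notice is that the order in which requests are posted has no influence on the order in which they are carried out.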

Note on random selection: My information is that the Java class
java.util.Random, starting with a specified seed, returns the same
sequence of random values in all valid implementations of Java.
You can make sure that your Java is giving the same sequence of
random numbers as my Java by compiling and running the code in
RandomTest.java and comparing it to
my output in randomnums.txt . If these
are not the same, let me know.
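
The property being relied on can be checked with a few lines of Java. This sketch is not RandomTest.java; the class name SeedCheckSketch, the seed 12345, and the range 100 are arbitrary choices for illustration, and its output is not meant to match randomnums.txt.

```java
import java.util.Random;

// Hypothetical check that a seeded java.util.Random is repeatable.
class SeedCheckSketch {
    // Returns the first n values drawn from a generator with the given seed.
    static int[] firstValues(long seed, int n) {
        Random rng = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextInt(100);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two runs of this program should print the same five numbers.
        for (int v : firstValues(12345L, 5)) {
            System.out.println(v);
        }
    }
}
```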

There are three public methods in Disk.java:

DiskAction(mem) is called every machine cycle. With probability 1/4
it picks a random pending task and executes it.

PostRequest() adds a new task to the set of pending tasks.

The constructor Disk() initializes the disk.

The command line

The command line accepts two additional flags.

"-f" indicates the number
of frames to be used. (The page size is equal to the size of RAM -- 65536 --
divided by the number of frames). The number of frames defaults to 8, with
a page size of 8192. Of course, in reality this would be absurdly small,
but it's good for testing, since page replacement starts to happen reasonably
soon.
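
The arithmetic is simple enough to state in code. The class and method names below are hypothetical; only the RAM size of 65536 and the default of 8 frames come from this handout.

```java
// Hypothetical helper showing how the page size follows from the frame count.
class PageSizeSketch {
    static final int RAM_SIZE = 65536;   // size of simulated RAM

    static int pageSize(int frames) {
        return RAM_SIZE / frames;        // RAM is divided evenly among the frames
    }
}
```

With the default of 8 frames this gives 8192; asking for 16 frames would give pages of 4096.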

"-a" indicates the page replacement algorithm to be used. "GL" is
the Global Least Recently Used algorithm; "LL" is the Local Least Recently Used
algorithm. The default is GL.

The Operating System

The "operating system" is found in file Project3.java. It is basically the same
as the operating system in the first project. (It does not include the
semaphores that you implemented for the second project.) You can make
any changes you want to this part of the code. However, it is not
_necessary_ to make any changes beyond writing the bodies of
"execPageFault" and "HandleDiskReport".

The one major change is that, since processes can block waiting for a disk
event, the CPU can become idle. The methods "contextSwitch" and
"run" have been rewritten to accommodate this possibility.
Additional minor changes include the following:

A process table entry now holds the page table for the process,
rather than the base and limit register.

execTrap has been modified so that, when process P terminates, its
pages are marked as free.

contextSwitch(block) now takes a boolean argument "block". "block"
is false if the current process does not have to block
(preemption or reading from input) and true if the current process does
have to block (page fault).

The file "Project3.java" also contains a number of data structures
and utility methods that you may find useful in doing the assignment.

InvTable is an inverted page table. For frame F, InvTable[F] is
an object of class InvTableEntry. It contains the following fields:

The page held in frame F. -1 if F is free.

The process held in frame F. -1 if F is free.

The fields "holdingPage" and "holdingProc". These are normally
-1. If process PR does a reference to page PG, and a page fault occurs,
and frame F is chosen to be used, then InvTable[F].holdingPage is
set to PG and InvTable[F].holdingProc is set to PR. (This is
done in the part of the code that _you_ write.)

freeList is a boolean vector recording which frames are free. freeList[F]
is 0 (false) if F is free and 1 (true) if F is occupied.

diskAddress(PR,PG) gives a simple method for computing the
disk address where page PG of process PR can be saved.

freeFrame() returns a free frame if there is one. If there isn't,
it returns -1.
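
For orientation, a search of this kind might look like the sketch below. It is not the freeFrame() in Project3.java: the class name, the explicit array parameter, and the choice of the lowest-numbered free frame are assumptions.

```java
// Hypothetical version of a free-frame search; the real one is in Project3.java.
class FreeFrameSketch {
    // occupied[f] is false if frame f is free and true if it is in use.
    static int freeFrame(boolean[] occupied) {
        for (int f = 0; f < occupied.length; f++) {
            if (!occupied[f]) {
                return f;      // first free frame found
            }
        }
        return -1;             // every frame is occupied
    }
}
```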

What you have to do

What you have to do is to implement demand paging in this simulated system.
You are _allowed_ to make any changes you want to the code in Project3.java.
However, it is _possible_ to do the assignment without doing anything more
than writing the bodies of the methods "execPageFault" and "HandleDiskReport",
plus, if you want, writing subroutines specific to those methods.

This should require about one or two hundred lines of code. What is
critical in this assignment, and a little tricky,
is making sure that you've taken care of all the cases, and
that you've made all the updates to the data structures that you need
to in each case.

Page Faults

In deciding what to do in execPageFault, there are two main issues:

Issue 1: The page which is faulting may be (A) a new empty data page;
or (B) an existing page located on the simulated disk.

Issue 2: The frame that is chosen to be used may be (X) empty;
(Y) occupied by a clean page (M bit = 0) or (Z) occupied by a dirty page
(M bit = 1).

Any of the 6 combinations of A,B with X,Y,Z may occur.

Let PR be the current process; PG be the page that is faulting;
F be the frame to be used; OPG be the old page in F; and OPR be the old
process holding F. In all cases, the relevant page tables and the inverted page
table have to be updated.

Case A and X. F is assigned to PG; no context switch is needed.

Case B and X. PG must be read in from disk; PR must block until
PG is read in from disk.

Case A and Y. F is assigned to PG; no context switch is needed.

Case B and Y. PG must be read in from disk; PR must block until PG
is read in from disk.

Case A and Z. F is assigned to PG. OPG must be written out to disk.
PR must block until OPG has been written out to disk.

Case B and Z. First, OPG must be written out to disk and then PG
must be read in from disk. However, since the disk simply chooses
its tasks at random, the task of reading in PG cannot be posted to the
disk until the task of writing OPG is complete. Therefore, the posting
of the task of reading in PG cannot be done by execPageFault. It
can be done by HandleDiskReport.

In all these cases, execPageFault deals with reading or writing from
disk by calling myDisk.PostRequest. execPageFault deals with blocking
process PR by calling "contextSwitch(true)".

In case Z, OPR does not have to block. However, OPR's page table
must record that OPG is no longer in memory, so that OPR will
page fault if it tries to access OPG.
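
The six cases above reduce to a few boolean decisions, which the following sketch tabulates. It is one way of organizing the logic, not a required design: the class FaultCaseSketch, the enum, and the Plan fields are invented for illustration, and the updates to the page tables and the inverted page table are not shown.

```java
// Hypothetical summary of the six page-fault cases as a decision table.
class FaultCaseSketch {
    enum FrameState { EMPTY, CLEAN, DIRTY }   // cases X, Y, Z

    static final class Plan {
        final boolean postWrite;  // post a write of OPG to the disk now
        final boolean postRead;   // post a read of PG to the disk now
        final boolean block;      // PR must block: contextSwitch(true)
        final boolean readLater;  // read of PG is posted later by HandleDiskReport

        Plan(boolean postWrite, boolean postRead, boolean block, boolean readLater) {
            this.postWrite = postWrite;
            this.postRead = postRead;
            this.block = block;
            this.readLater = readLater;
        }
    }

    // pageOnDisk is false in case A (new page) and true in case B.
    static Plan plan(boolean pageOnDisk, FrameState state) {
        boolean dirty = (state == FrameState.DIRTY);
        boolean postWrite = dirty;                  // only case Z writes OPG out
        boolean postRead = pageOnDisk && !dirty;    // B with X or Y: read at once
        boolean readLater = pageOnDisk && dirty;    // B with Z: read after the write
        boolean block = postWrite || postRead;      // any disk traffic blocks PR
        return new Plan(postWrite, postRead, block, readLater);
    }
}
```

For instance, plan(true, FrameState.DIRTY) asks for a write now, no read yet, a blocked PR, and a read to be posted once the write is reported complete.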

Page Replacement algorithm

You need to implement two variants on the "least recently used" page
replacement algorithm: global LRU and local LRU. In either case,
you go through the candidate frames and choose the one with the earliest
time stamp. It's unlikely that you will run into a tie, but not
impossible; if it happens, you can resolve it any way you want.

A frame is a candidate for replacement if

It is not already being held for some other page to be read into.
That is, "InvTable[F].holdingPage" and "InvTable[F].holdingProc" are -1.

If the algorithm is local, then the frame must hold a page for
the current process.
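
A selection loop along these lines is sketched below. It assumes that freeFrame() has already returned -1, so every frame holds some page. The class LruSketch and its Frame record are hypothetical; in the real code the owner and the holding status come from InvTable and the time stamp comes from the appropriate page table. Ties go to the lowest-numbered frame, which is just one of the permitted choices.

```java
// Hypothetical sketch of choosing a victim frame under global or local LRU.
class LruSketch {
    static final class Frame {
        final int proc;        // process whose page is in this frame
        final long timestamp;  // time stamp of the page in this frame
        final boolean held;    // true if the frame is reserved for a pending page

        Frame(int proc, long timestamp, boolean held) {
            this.proc = proc;
            this.timestamp = timestamp;
            this.held = held;
        }
    }

    // Returns the candidate frame with the earliest time stamp, or -1 if none.
    static int chooseVictim(Frame[] frames, int currentProc, boolean local) {
        int best = -1;
        for (int f = 0; f < frames.length; f++) {
            Frame fr = frames[f];
            if (fr.held) continue;                          // being held: skip
            if (local && fr.proc != currentProc) continue;  // local: own frames only
            if (best == -1 || fr.timestamp < frames[best].timestamp) {
                best = f;
            }
        }
        return best;
    }
}
```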

Note that, in the global algorithm, the timestamps for pages of
the current process are found in the global "stm.pageTable", whereas
those for pages of a non-running process PR are found in
"procTable[PR].pageTable".

HandleDiskReport

HandleDiskReport(Task) implements the OS's reaction to receiving
a report from the disk that some task has been completed. This
task report contains the disk address, the physical address, and
whether the operation was read or write, so it takes a little work
for the OS to determine what has happened. First, the frame involved
is calculated using the formula

int frame = Task.memAddress / stm.PAGESIZE;

Second, the process PR and page PG waiting for this frame are retrieved
from InvTable[frame].holdingProc and InvTable[frame].holdingPage.

Now there are three cases:

The operation was reading in page PG of process PR. PR can now
be unblocked.
If the CPU is currently idle, PR can be activated;
otherwise, it is placed on the ready queue.

The operation was writing out a page. PG is a new page,
not on disk. PR can be unblocked.
If the CPU is currently idle, PR can be activated;
otherwise, it is placed on the ready queue.

The operation was writing out a page. PG is on disk and must
now be read in. A request to read in PG must be posted to the disk.
PR remains blocked.
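
The three cases can be told apart with two tests, as in this sketch. It is an outline of the decision only, not the body of HandleDiskReport: the class DiskReportSketch, the Action enum, and the fixed page size are assumptions, and the actual unblocking, posting, and table updates are left out.

```java
// Hypothetical outline of how the three disk-report cases can be distinguished.
class DiskReportSketch {
    enum Action { UNBLOCK, POST_READ }

    static final int PAGESIZE = 8192;    // default page size from this handout

    // The frame involved, computed from the task's memory address.
    static int frameOf(int memAddress) {
        return memAddress / PAGESIZE;
    }

    // wasWrite: the completed task was a write from memory to disk.
    // pageOnDisk: the waiting page PG already has a copy on disk.
    static Action react(boolean wasWrite, boolean pageOnDisk) {
        if (!wasWrite) return Action.UNBLOCK;    // PG has been read in
        if (!pageOnDisk) return Action.UNBLOCK;  // old page written; PG is new
        return Action.POST_READ;                 // now read PG; PR stays blocked
    }
}
```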

Comments on Implementation

The assignment must be done in Java. I do not have time to translate this
code to C.

You need not reproduce the debugging output exactly, though of course this
will make it easier for you to check that you have done the assignment
correctly. You do have to generate enough debugging output with -d 2 to
establish that the code is actually using demand paging in a reasonable
way. Since the behavior of this kind of system can be quite sensitive to
small changes in sequence, it is possible that there could be correct code
that gives substantively different behavior in terms of what happens
to which pages when. If you think your code is behaving correctly but
it is doing something different from the sample outputs, consult with
me.

Error checking

You can assume that the STML code is correct,
that the command line is correctly formatted, and that the input format
is correct. That is, you don't have to do any error checking.

Partial credit and extra credit

I will give 85% credit if you get this to work for a single process
which uses more pages than there are frames in RAM. The code is not that much
shorter, but I think that it is noticeably less confusing. Note that,
if there is only one process P, then at any given time either the CPU
is running P or P is blocked. Also, of course, there is no difference
between the local and the global replacement algorithms.

I will give 10% extra credit if you integrate the semaphores from project 2
with this. A test STML file for this may be found in
primessem.stm . This file modifies primesx.stm by protecting the reading
and writing by semaphores, so that, if several instances of primessem.stm
are run together, each instance reads its inputs consecutively and produces
its outputs consecutively.

Be sure to alert the grader if you are doing either the partial credit
or the extra credit version of the assignment.

Late Submissions

Because the due date is so close to the end of the semester, I have to
be stricter about it.

Any project submitted late -- i.e. after class time on Monday, April 28 --
will be penalized 10 points out of 100.

No projects will be accepted after class time, Monday, May 5.

If you are submitting your project any time after
Wednesday, April 30, you should email it to me at davise@cs.nyu.edu, rather
than to the grader.

Test STML files

There are currently two test STML files. Both of these are variants of
the "primes.stm" file used in the first project, modified to create
frequent page faults. If I have time, I'll create
some more test files. It is unlikely that I will have time.

The file primesx.stm modifies the original code
in primes.stm by scattering the sieve of Eratosthenes around memory.
The program reads two values from input, N and D. It outputs the primes
up to N. D governs the scattering of the sieve in memory; specifically,
the Ith entry in the sieve is located at virtual address
100 + (D*I) mod 131072. D should be chosen to be an odd number.
The larger D is relative to the page size S, the more frequent the page faults.
If D is between S and 2S then each of the first (131072/S) entries in
the sieve is on a different page.
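
To see which page a given sieve entry lands on, the address formula can be evaluated directly. The sketch below reads the formula with the usual Java precedence, that is, 100 + ((D*I) mod 131072); that reading is an assumption, as are the class and method names.

```java
// Hypothetical evaluation of the sieve address formula used by primesx.stm.
class SieveAddressSketch {
    static final long SPAN = 131072;

    // Virtual address of the Ith sieve entry, assuming 100 + ((D*I) mod 131072).
    static long address(long d, long i) {
        return 100 + (d * i) % SPAN;
    }

    // Page containing the Ith sieve entry for a given page size.
    static long page(long d, long i, long pageSize) {
        return address(d, i) / pageSize;
    }
}
```

With D = 8193 and a page size of 8192, entry 0 is at address 100 on page 0 and entry 1 is at address 8293 on page 1.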

The file primesy.stm is the same as primesx.stm
except that the code is scattered over two pages (assuming a page size
of 8192) by placing 8192 0's in the input file. This generates an
example where one of the pages of code ends up being paged out.

The code needs more comments in places. I'll try to get that done soon.

Submitting the assignment

When you submit the assignment, you should send to the grader _only_ the
source code files that you have modified or created. As always, be sure
to include your name as a comment at the start of the file.