Abstract.
We present an example of analysis of average seek time for a computer disk drive. The
example is unique in the following ways: the issue involved is of practical importance;
the final version of this analysis which is done without any simplifying (unrealistic)
assumptions is still accessible to undergraduate students; the analysis involves most
basic and crucial mathematical concepts: operations on polynomials, solving systems of
linear equations, mathematical induction, probability understood as frequency of events
and organization of a mathematical proof. The paper also shows how the problem and the
solution can evolve together.

Undergraduate students who have been solving textbook problems which depend on use of a
single mathematical method will see, perhaps for the first time, how several methods can
be combined together to solve a real world problem. It is generally understood that small
but realistic research problems which touch upon a variety of issues are among most
valuable components of college courses, but finding such problems which are also
accessible to undergraduate students is a challenge to educators.

Analysis of the average seek time similar to ours would be a routine for a computer
engineer but it has not been used in computer science education. A wider use of our
example could help in rising understanding and appreciation of math among computer science
and engineering majors. This paper is addressed to students and instructors in
undergraduate courses on computer architecture, discrete math and programming courses.
Ideas presented in this paper have been successfully used at SUNY/Plattsburgh.

COMPUTER DISK ACCESS TIME

A surface of a computer disk contains a number of concentric circles on which
the information is recorded. These circles are called tracks. Every tracks is
divided into the same number of segments; these segments are called blocks or sectors.
Angular measures of all blocks are the same Although blocks on tracks near the edge of the
disk are longer than those on tracks near the center of the disk, every block stores the
same amount of information. The disk drive is equipped with an arm containing a read-write
head which moves in a straight line between the edge of the disk and its center. When
the head reaches the desired track, it stops; the disk continuously rotates underneath and
the head can read or write a block of information on the track over which it is
positioned. In response to requests from the CPU, the I/O processor and disk
controller transfer the entire block to an I/O buffer in RAM. For more
information see chapter 3 of [Folk,Zoelick].

The time needed to fulfill a request for a block transfer is called access time.
Access time can be split into the following components.

Moving the read-write head from its current position to a position over the track where
the requested block is located. The time spent on this operation is called a seek time.
Of course, the seek time varies depending on the initial position of the read-write head.

Waiting until the beginning of the requested block comes under the head. The time spent
on this is called a latency or rotational delay. Of course, depending on the
situation, the waiting time can be as short as 0 or as long as needed for one (almost)
full revolution of the disk.

Scanning the desired block. The time spent on this operation is called a transfer
time. As all blocks have the same angular measure, block transfer time is constant --
it does not depend on the position of the block, track or other circumstances.

Disk manufacturers often specify the average-case or worst-case access time for their
models. We will consider the best-case as well. Let us think what is represented by these
terms.

Best-case

Worst-case

Average-case

Seek Time

0

Time needed to move form the first to the last
track

? (This will be investigated in the rest of this
paper)

Latency

0

Time needed for one full revolution of the disk

Time needed for half of the revolution of the disk

Transfer Time

Time needed for a block to pass under the head

Time needed for a block to pass under the head

Time needed for a block to pass under the head

Total Access Time

Sum of the above

Sum of the above

Sum of the above

WHAT DOES "AVERAGE" MEAN?

It is obvious that the average-case latency is the average of the best-case latency and
worst-case latency. But what does average seek time really mean? We need to develop
understanding of this issue and we will be proceeding in stages before we can formulate
and solve a real life version of this problem. Our initial formulations of the problem
will contain simplifying (unrealistic) assumptions, but these will be removed one by one
in consecutive versions.

To get to the essence of this issue, instead of considering the time we will consider
the distance the read-write head needs to move. The distance between the tracks will be
measured in multiples of the distance between two consecutive tracks; so, the distance
between tracks i and j is |i-j| units. Assume that there are n tracks on our disk.

The case of worst (largest) distance occurs when the head needs to move from track 1 to
track n, i.e. the distance of n-1 units. The best case is when no movement is necessary,
i.e. the distance of 0 units. What is the average case?

Problem 1. Is the average distance traveled by the head equal to the average of the
best- and worst-case distances?

To visualize the distances involved consider a (toy) disk with just five tracks. The
following chart shows the distances the read-write-head must travel to move form a
position above track i to a position above track j for a all pairs <i,j>. The
distance between two consecutive tracks is used as a unit.

Distances

To track:

1

2

3

4

5

From track:

1

0

1

2

3

4

2

1

0

1

2

3

3

2

1

0

1

2

4

3

2

1

0

1

5

4

3

2

1

0

The worst-case distance is 4; the best-case distance is 0; their average is 2. The total
distance in the 25 cases listed in the table above is 40. Assuming that each of the 25
cases is equally likely, the average distance is 40/25=1.6
. This shows that the answer to the question in Problem 1 is "no".

AVERAGE SEEK DISTANCE ASSUMING UNIFORM DISTRIBUTION OF REQUESTS

Now let us try to calculate the average distance traveled by the read-write head. Let
us also specify precisely the assumptions concerning likelihood of different requests for
transfer.

Problem 2. Consider a disk with n tracks per surface. Assume that the requests to
move the read-write head from track i to track j are equally likely for all pairs
<i,j> where i,j £ n. What is the average distance AD(n)
the read-write head travels in a single move? (Assume that tracks are equally spaced and
give answer as a multiple of the distance between two consecutive tracks.)

We already answered this question for n=5. To consider the question in full generality
we need to deal with the following table.

Distances

To track:

1

2

3

...

n-1

n

From track:

1

0

1

2

...

n-2

n-1

2

1

0

1

...

n-3

n-2

3

2

1

0

...

n-4

n-3

...

...

...

...

...

...

...

n-1

n-2

n-3

n-4

...

0

1

n

n-1

n-2

n-3

...

1

0

We denote the total distance traveled by the read-write head in n^2 cases listed in the
table above by TD(n) and the average distance by AD(n).

So, TD(n) is the sum of the n^2 numbers in the table above and AD(n) = TD(n) / n^2.

But how to represent TD(n) by a concise mathematical expression? Well, let's first see
how the function TD changes when we pass from n to n+1, for instance, from 5 to 6.

Distances

To track:

1

2

3

4

5

6

From track:

1

0

1

2

3

4

5

2

1

0

1

2

3

4

3

2

1

0

1

2

3

4

3

2

1

0

1

2

5

4

3

2

1

0

1

6

5

4

3

2

1

0

Notice that TD(6) = TD(5) + 2(1+2+3+4+5).

In general, TD(n+1) = TD(n) + 2(1+2+...+n) = TD(n) + n(n+1).

(We have used the fact that 1+2+..+n = n(n+1)/2.) So, now we have a formula which
allows to reduce the problem of calculating T(n+1) to the problem of calculating T(n).
Such a formula is called a recursive formula or a recurrence. However,
notice that this formula does not define function TD uniquely until we add a base case:
TD(1) = 0. Now, the complete definition of TD is

(1)

TD(1) = 0

(2)

TD(n+1) = TD(n) + n(n+1)

We would be interested in knowing a formula which defines TD explicitly in terms of n,
without referring to TD. Finding such formulas is the subject of the theory of recurrence
equations, but as we do not know this theory, we need to manage by our own means. The
nicest functions to work with are polynomials -- perhaps TD is a polynomial -- let's
explore. What could be the degree of such a polynomial? Well, equation (2) above involves
terms of degree two, so perhaps TD can be represented by a polynomial of a degree not
bigger than three. We are going to assumethat

(3)

TD(n) = an^3 + bn^2 + cn + d

and we will try to find values of coefficients a, b, c, and d. (If this fails, we will
be back where we were.) Using the recurrences (1) and (2) above we will set a system of
linear equations in which a,b,c,d will be treated as variables. Let us express the
left-hand side of (2) using (3):

TD(n+1) = a(n+1)^3 + b(n+1)^2 + c(n+1) + d =

a(n^3 + 3n^2 + 3^n + 1) + b(n^2+2n+1) + c(n+1) + d =

an^3 + 3an^2 + 3an + a + bn^2 + 2bn + b + cn + c + d =

an^3 + (3a+b)n^2 +(3a + 2b + c)n + (a + b + c + d)

Now let us express the right-hand side of (2) using (3):

TD(n) + n(n+1) = an^3 + bn^2 + cn + d + n(n+1) =

an^3 + (b+1)n^2 + (c+1)n + d

Do you remember from your calculus class a theorem which says that if for all values of
the argument two polynomials have the same values, then the polynomials must have the same
coefficients? (Would you be able to prove this theorem?) By this theorem we obtain the
following equations.

(4i)

3a + b = b + 1

(4ii)

3a + 2b + c = c + 1

(4iii)

a + b + c + d = d

So, this is a system of 3 equations with 4 variables -- we need one more equation. The
fourth equation will be obtained from (1) and (3):

If TD(n) is a polynomial of degree not bigger than three then this
polynomial must be (n^3 - n) ¤ 3.

So, (n^3 - n) ¤ 3 is a candidate for the expression which
could represent TD(n) but there is no certainty that this expression is correct. We leave
it to the Reader to verify that the function TD(n) defined as (n^3 - n) ¤ 3 satisfies equations (1) and (2). (The Reader could try to prove
a more general and more challenging result: if there exists a polynomial p(n) of degree k
such that for every n, f(n+1) = f(n) + p(n) then f(n), treated as a function over
integers, is a polynomial of degree k+1.) Once a proof has been completed we know that

(5)

TD(n) = (n^3 - n) ¤ 3
= (n-1)n(n+1) ¤ 3

So, the average distance traveled by the read-write head on a disk with n tracks under
assumptions of Problem 2 is

This suggests that it is 3, not 2, that is the magic number of computer disk average
seek time!

ALTERNATIVE CALCULATION

The Reader could approach Problem 2 in a different way. The lower triangle in the table
with distances traveled by the read-write head on a disk with n tracks contains n-1 ones,
n-2 twos, n-3 threes, etc. So

The formula for 1 + 2 + ... + k is well known. On the other hand, hardly anybody knows
or remembers the formula for 1*1 + 2*2 + ... + k*k. If we knew this formula we could
finish the calculation of TD(n) and then obtain AD(n). The Reader can find this formula
using steps analogous to those in the previous section: assume that 1*1 + 2*2 + ... + k*k
can be represented by a polynomial of degree not bigger than 3, set a system of linear
equations and solve them to find the coefficients of a candidate polynomial, then prove by
mathematical induction that the polynomial is correct. We leave this as an exercise.

The sum of numbers in the lower triangle can be also obtained by adding the numbers in
its every row and then adding obtained sums together. As the sum of numbers in a row of
the lower triangle can be represented as a binomial coefficient, obtaining the final
result can be an exercise in manipulating binomial coefficients.

AVERAGE SEEK DISTANCE ASSUMING NON-UNIFORM DISTRIBUTION OF REQUESTS

In some situations one cannot assume (as we did in the previous sections) that all the
moves of the read-write head from track i to track j are equally likely. If the computer
is used by a single user who has just one file open, the read-write head is more likely to
stay on the same track or to move to the next track than to move anywhere else. Let us
formulate a more general version of our problem which will cover also this case.

Problem 3. Consider a disk with n tracks per surface. Assume that the probability
that the next request for data transfer involves the same track is p0. Assume that the
probability that the next request requires the head to move from track k to track k+1
(where k<n) is p1. What is the average distance AD(n,p0,p1) the read-write head travels
in a single move? (Assume that tracks are equally spaced and give answer as a multiple of
the distance between two consecutive tracks.)

DISTANCES

To track:

1

2

3

...

n-1

n

From track:

1

0

1

2

...

n-2

n-1

2

1

0

1

...

n-3

n-2

3

2

1

0

...

n-4

n-3

...

...

...

...

...

...

...

n-1

n-2

n-3

n-4

...

0

1

n

n-1

n-2

n-3

...

1

0

Requests represented on the main diagonal (zeros) occur with probability p0; requests
represented just above the main diagonal (ones) occur with probability p1; the remaining
requests occur with probability 1-p0-p1. How to handle these assumptions of probability?
Let us think about frequency of events. Let R be a number of requests for data transfers;
think that R is big. According to the assumptions of the problem among our R requests
there will be roughly

R*p0 requests for data from the current track,
R*p1 requests for data from the next track,
R*(1-p0-p1) requests for data from a yet another track.

And the average length of a move can be obtained by dividing this expression by R:

p0* 0 +
p1* 1 +
(1-p0-p1) * ( TD(n) - (n-1) ) ¤ ( n^2
- n - (n-1) )

After substituting here the expression (5) for TD(n) and some simplifications, the
Reader should obtain:

(8)

AD(n,p0,p1) = p1 + (1-p0-p1)*
(n^2 + n -3) ¤ (3(n-1))

The Reader should not only carry out calculations leading to this formula, but also
check consistency of this formula with the formula (6) for AD(n) -- Problem 1 is a
particular case of the Problem 2 with p0 = n ¤ n^2 = 1 ¤ n and p1 = (n-1) ¤ n^2, so
substituting these values in formula (8) should produce a formula equivalent to (6).

Finally, let us also derive from (8) a simplified, approximate formula which can be
used for big values of n.

(9)

AD(n,p0,p1) » (1-p0-p1)
* n ¤ 3

AVERAGE SEEK TIME -- REAL WORLD VERSION

In the real world, customers of disk manufacturers do not ask about the average
distance traveled by the read-write head, they ask about the average seek time. Transition
from the calculation of distance to the calculation of time is not immediate -- the seek
time is not proportional to the distance traveled by the head. In every move, a certain
startup time needs to elapse before the mechanism can overcome inertia and only after that
the head moves at a constant speed. With the addition of this detail we obtain a real
world version of our problem.

Problem 4. Consider a disk with n tracks per surface. Assume that the probability
that the next request for data transfer involves the same track is p0. Assume that the
probability that the next request requires the head to move from track k to track k+1
(where k<n) is p1. Assume that the time needed to move the read-write head from track i
to j (where i¹j) is t0 + |i-j|*t1
where t0 and t1 are constants, (t0 is much bigger than t1). What is the average seek time
AT(n,p0,p1,t0,t1) ?

We will continue the reasoning from the previous version of the problem. Recall that R
stands for the number of requests for data transfers. The total distance the head moves in
these R requests is R*AD(n,p0,p1). In R*(1-p0)
of these requests the head does not stay on the current track and the startup time must be
accounted for. The total time spend on head movements in these R requests can be split
into R*(1-p0) startups, which take R*(1-p0)*t0 time, and the time spent on moving at the constant speed over the
total distance: R*AD(n,p0,p1)*t1.
So, the total time used for head movements in these R requests is

The reader should notice that the previous problem was a particular case of the current
one, with t0=0 and t1=1, and substituting these values in formula (10) should give a
formula equivalent to (8).

Finally, let us also derive from (10) a simplified, approximate formula which can be
used for big values of n.

(11)

AT(n,p0,p1,t0,t1) » (1-p0)*t0 + (1-p0-p1)*t1*n
¤ 3

In this approximate formula we left the constant term (1-p0)*t0
because t0 is much bigger than t1 causing that this term can substantially affect the
resulting value if n is big but not huge. (In real computer disks n is of the order of 100
or 1000 and t0 can be 50 times bigger than t1.)

USES IN A FIRST PROGRAMMING COURSE

The problems presented in this paper can be solved by means of computational
experiments instead of mathematical calculations and they are appropriate for first
programming course. They can be presented as a series of programming exercises
corresponding to problems 2, 3 and 4 above. We will refer to them as Exercises 2', 3' and
4'. The instructor is expected to make an introduction and explain the relation of these
programming exercises to the average seek time of a computer disk, including a discussion
of our Problem 1. Exercise 2' involves nested for-loops; Exercises 3' and 4' involve, in
addition, a random number generator. Exercises 2' and 3' are introductory and can be given
during a lab session while Exercise 4' is the ultimate goal, and if there was not enough
time in the lab it can be finished by students on their own.

USES IN HIGHER-LEVEL COURSES

Solutions to problems 2, 3 and 4 presented above assume the First-Come-First-Served
(FCFS) algorithm for scheduling disk arm movements: the requests for block transfers are
served in the order they were received. There is a more efficient disk arm schedulingalgorithm, known as LOOK (a variant of SCAN or ELEVATOR algorithm): the head keeps
moving in one direction until it visits all tracks for which there are requests in the
current pool, then it changes direction and continues in the same manner, and so on. For
more information on related algorithms see chapter 13 of [Silberschatz,Galvin].

We can assume that placements of requests form a Poisson process, i.e. that the time
between two requests has an exponential distribution. Theoretical analysis of the average
seek time under the LOOK algorithm belongs to queuing theory and could be included in a
graduate course on probability. A solution through a discrete simulation is accessible to
undergraduate students -- this is a excellent problem for an advanced undergraduate course
on object-oriented programming. Such a programming simulation should report not only
average seek time but also the average waiting time for a completion of the request. For
discussion of results of similar simulations see [Teorey,Pinkerton].

Let us also mention that section 5.4.9 of [Knuth-3] and section 3.2.1 of [Wiederhold]
contain other calculations related to the problems of this paper -- this material as well
as papers listed in [Silberschatz,Galvin] can be used as basis for individual research of
students.

CONCLUSION

We presented a topic which spans issues of hardware, mathematical methods and
programming. It can be developed in a series of increasingly difficult problems. The final
version is a real world problem, without simplifying (unrealistic) assumptions, but it is
still accessible to undergraduate students. Our experience at SUNY/Plattsburgh shows that
problems with such characteristics attract special interest of students and are especially
effective in increasing understanding and appreciation of the subject.

ACKNOWLEDGEMENTS

Programming problems were prepared by Prof. S. Denenberg and used in his course at
SUNY/Plattsburgh. The author is grateful to him and to Prof. L. Fairchild for their
comments.