Due: March 27, 2002

Objective

You and a partner will capture and analyze the packets exchanged
between two computers that are using TCP to transfer a file. Because
the computers have been specially configured to randomly discard some
fraction of the TCP segments they receive, you will be able in
particular to observe TCP's response to lost segments. You will
identify interesting events within the packet trace, estimate the loss
rate, and calculate the delivered data transfer rate. Hopefully you
will have enough time to do this for more than one loss rate.

Data Acquisition

You will use either of two clusters of three computers to gather your
data. Later, if you have time after analyzing your first packet
trace, you can use the other cluster to gather a second set of
packets. (The two clusters are configured with different loss rates.)

Both clusters of machines are located in the back right-hand corner of
OHS 329, as seen entering the room from the hallway. Each machine has
a piece of masking tape on it identifying the machine's IP address and
indicating that it is for MCS-394 only. The machines in one cluster
are 10.0.0.1, 10.0.0.2, and 10.0.0.3. The machines in the other
cluster are 10.0.0.4, 10.0.0.5, and 10.0.0.6. Each cluster also has
an ethernet hub, to which the three computers in that cluster are
connected.

Within each of the clusters, the three machines are configured
identically (aside from IP address). Therefore, you can use any of
the three to send the file, either of the remaining two to receive the
file, and the third computer to do the packet trace capture. When
sending the file, you will need to specify the IP address to send it
to, so be sure to note that address. You will need to log into each
machine as root (the "super user"), with the password I divulge in
class. (Actually, in the first lab period, we are likely to have a
whole queue of lab groups using the machines, so it will make sense to
leave them logged in.) For the packet capture program to work, you
need super-user privileges on that machine. The TCP sending and
receiving could be done perfectly well as a normal user, except that I
haven't bothered to create any normal user accounts on these machines.

The order in which you give the commands is somewhat important; you
need to have the file-receiving command and the packet-capturing
command running before you do the file-sending command. (Otherwise,
you will get a "connection refused" error, if you aren't running the
receiving program, or won't capture all the packets, if you aren't
running the capturing program.)

On the machine where you want to receive the file, you will use the nc
program, also known as netcat. This is a very general program for
communicating using TCP (or UDP). You can look at the documentation
for the full story, but a suitable command line would be

nc -n -l -p 6789 >/dev/null

This will listen for a connection on port 6789 and put everything
received into the "file" /dev/null. (The special "file" /dev/null
isn't actually a file at all, but rather a bottomless pit into which
bytes can be put to discard them. If you really wanted a copy of the
sent file, you could redirect output wherever you wanted the copy.)

On the machine where you want to capture the packets, you will use the
tcpdump command. Again, this program has lots of options, which you
can read about in the documentation. All you want to do now is capture
the packets in a raw, binary form; you can do the human-readable
analysis later on a different computer. Therefore, a command like the
following will suffice:

tcpdump -w /tmp/trace1

Note the 1 on the end of the filename; my suggestion is that each time
you capture a trace, you increment this number, while keeping notes
somewhere of what circumstances each trace was collected under.

On the machine where you are sending the file, you can use nc again,
but with a different command line. You could use a command like

nc -n -w 3 10.0.0.x 6789 </usr/lib/libcrypt.a

but with the x replaced by the appropriate number (in the range 1-6)
to complete the IP address of the receiving machine. This will open a
TCP connection to port 6789 on that machine and transmit the contents
of /usr/lib/libcrypt.a. (I chose this file because it seems to be a
reasonable length. If you want more data to analyze, you could send a
longer file.) After the file is transmitted, both nc programs will
exit back to the respective shell prompts. That is your sign that it
is safe to end the tcpdump.

To stop capturing packets with tcpdump, you can type a control-C.
Then insert a DOS-formatted floppy into the capturing machine's drive
and "mount" it using the command

mount /mnt/floppy

Now you can move your captured data onto the floppy and then unmount
the floppy, using the commands

mv /tmp/trace1 /mnt/floppy
umount /mnt/floppy

Be sure to wait until the floppy drive's light goes out before you
eject the disk.

Data Analysis

Be sure to leave a copy of your trace files on the floppy, as well as
copying them into your home directory for analysis. This is because
you will need to submit your floppy along with your lab report, so
that I can easily check your work. To copy the files into your home
directory, insert the floppy in one of the normal computers, and mount
it as indicated above. Copy the file over, for example by using a
command such as

cp /mnt/floppy/trace1 .

and then unmount the floppy, again using the command listed above.
(The example cp command ends with a space and period, to specify
copying into the current directory.)

You can now run tcpdump again, with different command-line options, in
order to get a human-readable version of the packet trace. On our
normal machines, we don't have tcpdump installed in the standard
search path, so you will have to specify the pathname of tcpdump in my
MCS-394 directory. A typical command would be

~max/MCS-394/tcpdump -r trace1 >trace1.out

After doing this, you can look at trace1.out, either on the screen or
by printing it out. In principle, you don't need any more tools than
your eyes and your brain. In practice, you may want to do what the
professionals do, and use the computer to help you locate
interesting patterns in the data.

There are a variety of general purpose tools that may be helpful in
the data analysis. Each of these programs has a man page describing
it, and there is also documentation in some of the books in the lab
monitor room, such as Linux: The Textbook. You can also
ask for help. The following are just some examples, not intended to imply
what you will actually want to do. Each of these programs reads from
standard input and writes to standard output. You can read from or
write to a file by using < or >, and can send the output from
one program directly into the input of another program using |.
The first example below selects out only lines containing one or more
digits, a colon, and then again one or more digits. The next example
replaces the string "foo bar" in each line (containing it) with
"baz". The third replaces everything from colon to end of line with
nothing (i.e., deletes it). The fourth sorts lines in numerical
order, assuming they start with numbers. The fifth eliminates all
lines that occur only once, without an immediately adjacent duplicate
line.
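
As a rough sketch, the five examples described above could look like
the following. The exact patterns are a reconstruction from the
descriptions, and the input fed in by printf is made up purely for
illustration; in practice you would read from a file such as
trace1.out instead.

```shell
# 1. Keep only lines containing digits, a colon, then more digits.
printf 'seq 1:1449\nno range here\n' | grep '[0-9][0-9]*:[0-9][0-9]*'

# 2. Replace the first "foo bar" on each line containing it with "baz".
printf 'a foo bar b\nunchanged\n' | sed 's/foo bar/baz/'

# 3. Delete everything from the first colon to the end of the line.
printf '1449:2897 rest\n' | sed 's/:.*$//'

# 4. Sort lines numerically, assuming each starts with a number.
printf '10\n9\n100\n' | sort -n

# 5. Keep one copy of each line that has an immediately adjacent
#    duplicate, dropping lines that occur only once.
printf 'a\na\nb\nc\nc\n' | uniq -d
```

Because uniq only compares adjacent lines, it is usually preceded by
sort, as in "sort | uniq -d".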

Whatever tools you choose to use to help you with the analysis,
there is one anomaly that you need to ignore. As an artifact of how
the nc program is working, there will be a long delay (several
seconds) between when the last data is acknowledged and when the FIN
packets are exchanged to shut down the connection. If you were to
include that delay, the throughput would seem much worse than it
really is, through no fault of TCP's. (The nc program simply delays
closing the connection.) Therefore, only measure the elapsed time to
the last ACK of data.
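
Once you have picked out the timestamp of the first packet and of the
last ACK of data, the arithmetic can be handed to awk. The following is
only a sketch: the function names elapsed and throughput are
hypothetical helpers, the timestamps are assumed to be in tcpdump's
default hh:mm:ss.ffffff form and to fall on the same day, and the byte
count and times shown are invented.

```shell
# elapsed START END: seconds between two hh:mm:ss.ffffff timestamps.
# Assumes both timestamps fall on the same day.
elapsed() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":"); split(b, e, ":")
        printf "%.6f\n", (e[1]*3600 + e[2]*60 + e[3]) - (s[1]*3600 + s[2]*60 + s[3])
    }'
}

# throughput BYTES SECONDS: delivered rate in bytes per second.
throughput() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.0f\n", n / t }'
}

# Invented numbers, for illustration only.
secs=$(elapsed 14:05:10.250000 14:05:22.750000)
throughput 500000 "$secs"
```

With these made-up numbers the transfer takes 12.5 seconds, so the
delivered rate works out to 40000 bytes per second.
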

Here are some key items you should be looking for:

How many bytes of data were sent, not including duplicates? How
long did it take to send this data? Hence, what was the delivered
throughput?

How many timeouts occurred? How long were the pauses these
typically introduced? Were there any extra-long pauses due to a
segment timing out a second time? What portion of the total elapsed
time was spent in these timeout pauses? How high would the throughput
have been without them?

What fraction of the original data segments were retransmitted? Which
specific ones? Were any repeatedly retransmitted?

For which of the retransmissions does the evidence suggest loss of
the original data segment (or of a previous retransmission of the data
segment) as the cause? For which ones does a lost ACK seem to be the
cause? For which ones are you unable to identify a cause?

What fraction of the total data segments transmitted (both
original ones and retransmissions) are apparently being discarded?

For what fraction of the ACK segments do you have evidence of the
segment being discarded? This fraction will probably be smaller than
for data segments, not because ACK segments are less likely to be
discarded, but rather because sometimes they are discarded without any
obvious symptoms resulting. Why is this?

Assuming you have time to experiment with the other cluster of three
machines, you should be sure to compare them. How does the different
loss rate impact the delivered throughput?
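
One possible way to get started on the retransmission questions is to
pull the sequence number range out of each data segment and look for
ranges that appear more than once. This sketch assumes that your
tcpdump prints data segments in the form first:last(nbytes), as in the
invented sample lines below; check your own trace1.out before relying
on it. The function names data_ranges and retransmitted are
hypothetical helpers, not part of any tool.

```shell
# data_ranges: read tcpdump text on stdin and print the first:last
# sequence range of every segment carrying at least one byte of data.
data_ranges() {
    grep ' [0-9][0-9]*:[0-9][0-9]*([1-9][0-9]*)' |
        sed 's/^.* \([0-9][0-9]*:[0-9][0-9]*\)([0-9]*).*$/\1/'
}

# retransmitted: print each range that was sent more than once.
retransmitted() {
    data_ranges | sort | uniq -d
}

# Invented sample lines in the assumed output format.
trace='14:05:10.250000 10.0.0.1.1025 > 10.0.0.2.6789: . 1:1449(1448) ack 1 win 5840
14:05:10.251000 10.0.0.2.6789 > 10.0.0.1.1025: . ack 1449 win 5792
14:05:10.252000 10.0.0.1.1025 > 10.0.0.2.6789: . 1449:2897(1448) ack 1 win 5840
14:05:10.452000 10.0.0.1.1025 > 10.0.0.2.6789: . 1449:2897(1448) ack 1 win 5840'

printf '%s\n' "$trace" | retransmitted
```

On the sample lines this prints the single range 1449:2897. Note that
it only catches retransmissions with exactly the same boundaries as the
original segment, and it says nothing about why the retransmission
happened; that part still needs your eyes and your brain.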

Report

Be sure that your report does not assume the reader already knows what
you did. You may assume reasonable background knowledge of
networking, and should refer to external sources of information (such
as RFCs or program documentation) where appropriate. Remember to
also submit the floppy with your trace files on it, and be sure that
the floppy is labeled with your names.

Possible Extensions

There are lots of additional investigations you could do. If you want
to do a run with the artificial segment loss turned off, to see how
much higher the throughput is, I can easily arrange that for you. I
can also turn on selective acknowledgement, if you want to see what
difference (if any) that makes. Of course, to get rigorous scientific
evidence, you probably should use more than a single relatively short
file transmission.