Series Introduction

Computing operates in an almost universally networked environment,
but the technical aspects of information protection have not kept up.
As a result, the success of information security programs has
increasingly become a function of our ability to make prudent management
decisions about organizational activities. Managing Network Security
takes a management view of protection and seeks to reconcile the need
for security with the limitations of technology.

Speed Kills

I do a fair amount of watching when I run a firewall
or similar network protection device, and one of the questions I am
asked most often is how I can deal with the rate of information. To give
you a sense of this, the network in my house carries a maximum of
9Mb/s of inbound traffic and about 500Kb/s outbound. At work, we
run at more like 48Mb/s, and on some special links, even faster. When I
tell people that we watch the traffic, they find it more or less beyond
belief. After all, 48Mb/s corresponds roughly to 48 hefty books per
second, and nobody can read and understand that fast.

Of course, I am known as an avid reader, and I do
read a lot of things rather quickly, but still, they are
right. I cannot read that fast and likely never will be able to. So
the question is: how, how much, and to what extent can I watch the
traffic?

A Question of Meaning

The first thing to understand about watching traffic
is what the traffic is made up of. For example, most IP traffic is made
up of packet headers rather than packet content, and most packet
headers can be summarized pretty well by a very limited amount of
information. Here, for example, is the kind of summary line TCPdump
gives for a relatively innocuous packet (reconstructed here to match
the description below):
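
    05:43:01.87 1.2.3.4.13104 > 1.2.3.5.23: . ack 1 win 8760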

This packet is simply an acknowledgment of a packet
that originally came from 1.2.3.5 on port 23 and went to 1.2.3.4 on port
13104. It was sent at 05:43:01.87 (5:43 and 1.87 seconds). It's not
very interesting, but it occurred, so we observe. If you watch all of
these packets, you will quickly come to the conclusion that there is
very little information content to be gleaned from these headers on an
individual basis. Rather, the headers provide information on the flow
of traffic. The information is mostly useful to see how much of what is
going where, and its absence often means more than its presence. I
sometimes watch this in detail, but I usually either watch the overall
flow, select out a particular part of it for observation - for
example, a particular exchange between IP addresses - or count
sessions, the number of packets involved, and the total amount of
information communicated.
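
To make counting sessions and volumes concrete, here is a minimal
sketch in Python. It assumes the header summaries have already been
parsed into (time, source, destination, size) records; every name and
value in it is hypothetical.

    # Roll packet records up into per-address-pair flow summaries.
    from collections import defaultdict

    def summarize_flows(records):
        flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
        for _time, src, dst, nbytes in records:
            flows[(src, dst)]["packets"] += 1
            flows[(src, dst)]["bytes"] += nbytes
        return flows

    # A few records of the kind a header summary might yield (invented).
    records = [
        ("05:43:01.87", "1.2.3.4", "1.2.3.5", 40),
        ("05:43:01.91", "1.2.3.5", "1.2.3.4", 1500),
        ("05:43:02.02", "1.2.3.5", "1.2.3.4", 1500),
    ]

    for (src, dst), stats in summarize_flows(records).items():
        print(f"{src} -> {dst}: {stats['packets']} packets, {stats['bytes']} bytes")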

Another thing I sometimes do is watch the content of
the packets going by. For example, on my firewall machine, I always
track packet content and sometimes look at it to see whether anything
strange is going on. Of course, in order to know what's strange, you
have to know what's normal, and you learn that by watching the traffic
and digging into every part of it until you think you understand it.
Here's a higher-level log of session content (reconstructed here in
typical pop3 form; the user ID and password are as discussed below):
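
    +OK POP3 server ready
    USER fred
    +OK
    PASS mypassword
    +OK fred has 2 messages (3467 octets)
    STAT
    +OK 2 3467
    QUIT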

In this case, I have removed the information extraneous
to the discussion, and we see a user logging into
a post office protocol (pop3) server to check their email. Notice that
the user ID (USER fred) and password (PASS mypassword) are easily
readable. This is how most 'sniffers' collect user IDs and passwords -
and they can collect a lot of them. As an example, I seem to remember
that in the early 1990s, the CMU CERT reported one attack that grabbed
about 100,000 user IDs and passwords. Cases involving 10,000 or more
stolen user IDs, passwords, or credit card numbers are not that rare.
At this level of detail, there is a lot
to see, and a lot of it is meaningful. For example, if you watch a
single IP address in this level of detail, you can tell what sites the
user(s) on that system visit on the Web, what information their Web
browser gives to the web server on the other side, what they are
searching for with a search engine, how well they do their searches, and
so forth. If there are many users on a system, you can often tell how
many users are using the Internet and characteristics of their use.

Here's an example of part of a Web session I
recorded from a computer at a site, reconstructed below with
approximate header values. It tells me what that user was
telling remote sites when visiting them:
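
    GET / HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/3.04 [en] (X11; I; Linux 2.0.13 i486)
    Host: all.net
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*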

The web site now knows, if it cares to look, that I am
using Netscape version 3.04 in English, running under X11 on Linux
version 2.0.13 on an Intel 486 computer. It could
also find out my email address (if I included it), other places I have
visited, what kinds of files my browser automatically processes, and so
forth. One of the most interesting lines is the list of what the
browser accepts. For attackers, this indicates which attack scripts
might work and, in combination with the other information, can
form the precise profile required to attack a site. It's
good to sniff this sort of traffic from your own site to determine just
what your systems are telling potential attackers. This is also very
handy if you suspect someone is sending something they should not be
sending, or if you want to detect the transmission of encrypted files.
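
Here is another example from a different machine, again reconstructed
with approximate header values consistent with what I describe next:

    GET / HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/4.0 [en] (Win95; I)
    Host: all.net
    Referer: http://all.net/
    Accept: application/msword, application/vnd.ms-excel, image/gif, image/jpeg, */*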

This example is a bit more ominous. It indicates
that version 4.0 of Netscape is running on a Windows 95 box with
Microsoft Explorer present, and that the system will automatically run
msword and ms-excel if I send files with the proper extension.
This could be exploited by providing a Trojan horse in a spreadsheet or
Word document. It also tells me that the last site this browser visited
was all.net - always a good place to visit.

We can also watch the traffic of remote terminal
sessions. Here's a partial example, reconstructed with the traffic
details removed to leave only the text of the sessions (the names and
figures shown are stand-ins). In this case, I can tell that the user is
on a Unix system (it looks like Linux because of the device names
returned by the df command), how much space is available, the name of
the user, their directory, the name of the machine, and that they have
a very big tar file:
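
    sprite:/home/fred$ df
    Filesystem         1024-blocks    Used  Available  Capacity  Mounted on
    /dev/hda1               951356  574864     327340       64%  /
    /dev/hda2              1948632  904216     943700       49%  /home
    sprite:/home/fred$ ls -l
    total 102410
    -rw-r--r--   1 fred   users   104857600 May 12 05:41 big.tar

And here, for contrast, is the text of an encrypted session of about
the same size:

    s%~2@Rq#V7]x^(qb&Zj0 ... Lw"p+0'=|~%fM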

There are a couple of important things to note here.
First, and most obviously, the content of the second session is not
immediately readable. This is important to observe because not all
cryptography works all of the time. In some cases, you may think
information is encrypted, but until you verify it by observation, you
cannot be sure. Another important thing to note is that the sizes of
the input and output from the two sessions are actually pretty similar.
While the content may be obscured by cryptography, in this case the
source, destination, and size of the exchanged information were not
changed. Similarly, other traffic characteristics remain largely the
same. As a result, a large file transfer can clearly be differentiated
from telnet sessions and Web sessions, even when encrypted.
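
Because these shape characteristics survive encryption, even a crude
classifier over per-session totals can be useful. Here is a minimal
sketch in Python; the thresholds and figures are invented for
illustration, not measured from real traffic.

    # Toy traffic-shape classifier: even with content encrypted,
    # byte counts and durations separate interactive logins from
    # bulk transfers. All thresholds are illustrative guesses.
    def classify(bytes_in, bytes_out, seconds):
        rate = (bytes_in + bytes_out) / max(seconds, 1)
        if bytes_in < 10_000 and bytes_out < 100_000 and seconds > 60:
            return "interactive (telnet-like)"
        if rate > 100_000:
            return "bulk transfer (file-copy-like)"
        return "request/response (Web-like)"

    print(classify(2_000, 40_000, 600))      # interactive (telnet-like)
    print(classify(50_000_000, 1_000, 400))  # bulk transfer (file-copy-like)
    print(classify(300_000, 5_000, 20))      # request/response (Web-like)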

Moving On Up

This detailed information is very helpful, but in
truth it is only good for looking at details, and there are far too
many details for this to be useful on a high-bandwidth
connection. As the speed goes up, the amount of detail we can watch
goes down. The issue, then, is to figure out what is most important to
watch at any given moment and to find ways to watch it.

The collection of exchanges associated with a telnet
session, or with a visit to a Web site, can often be rolled up into a
single entry of the form shown below (reconstructed; the date is a
placeholder, while the rest matches the record discussed next):
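
    May 12 05:43:01 Web.pl[32136]: 194.151.95.22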

In this case, we see that on the specified date and
time (according to the computer we are looking at), a program called
Web.pl with process ID 32136 ran on behalf of IP address 194.151.95.22,
which appears to be a web proxy server in The Netherlands. A more
detailed log of the specific transaction is kept as well.

An entry like this has particular meaning on the particular system,
and in this case it is not a sign of abuse; some log entries, however,
do indicate attempted break-ins. For example, here is a similar-level
log produced by the Deception ToolKit - an intrusion detection and
reporting system based on deceptions - rendered here as a session
transcript with stand-in password values:
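
    login: guest
    Password: guets
    Login incorrect
    login: guest
    Password: guets
    Login incorrect
    login: guest
    Password: guest
    $ ls
    $ cat /etc/passwd
    Connection closed by foreign host.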

This log shows an attempt to telnet into a computer
and log in as the user guest. It is clear that a real human being was
doing this and not a computer, because they made a typo in the password
the first time and then made it again. They then tried a few commands,
and the program eventually kicked them out.

Hopefully, intentional attacks happen far less often
than accesses to Web services. This means, among other things, that we
can watch every incident involving such an attack far more easily than
we can watch normal user traffic. It also means that, while examining
every detail of normal user traffic is infeasible, examining the
details of these attacks is quite easy, even for a large network.
Similarly, retaining the full details of these sessions is an easy
matter.

Returning briefly to time differences, the notion of
relative time is very important in trying to understand sequences of
events. It is very common for systems to have time differences ranging
from seconds to years, and when times matter in establishing the
sequence of events, tracking system time versus actual time is
essential. In addition, changing the system time to correct it during
an investigation is particularly problematic because it makes the
historical data no longer relatable to the current time frame. Those of
us who operate systems from all around the world always have a few
systems that are running in tomorrow or yesterday and at all hours of
the day. Relative time is the only way to track these systems, and the
global standard against which relativity is measured is Greenwich Mean
Time (GMT).
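
As a small illustration of working in relative time, one can record
each system's clock offset against GMT once and map its log timestamps
onto the common timeline, rather than resetting clocks
mid-investigation. This sketch is hypothetical; the host names and
offsets are made up.

    # Map per-system log timestamps onto a common GMT timeline.
    from datetime import datetime, timedelta, timezone

    # Measured once per system: (system's log time) - (GMT at that moment).
    clock_offsets = {
        "hosta": timedelta(hours=-8, seconds=42),  # behind GMT, plus skew
        "hostb": timedelta(days=1, minutes=-3),    # running "in tomorrow"
    }

    def to_gmt(host, logged):
        # Subtract the known offset to recover the true GMT time.
        return (logged - clock_offsets[host]).replace(tzinfo=timezone.utc)

    print(to_gmt("hosta", datetime(1998, 5, 12, 5, 43, 1)).isoformat())
    # -> 1998-05-12T13:42:19+00:00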

Higher and Higher

At the next level of logging and analysis, we
typically see roll-ups of services over time. For example, a typical
corporate telephone bill or similar statement shows summaries of calls
by area code, time of day, and so forth. By putting this into a
database, a wide range of analyses can be done to detect patterns of
usage, find anomalous periods of operation, do trend analysis, and so
forth. These get rolled up into various forms depending on what the
information consumer wants. As we move farther from the details,
however, detecting specific security-related events becomes harder and
harder.

How High Do You Go?

Some might contend that events without large-scale
effects also tend to be relatively unimportant to the organization, and
that examining higher-level information is therefore more relevant. My
view is that we need to look at issues from many perspectives if we are
going to detect the full range of security events that might occur.

In examining telephone billing records, we may not
be able to detect every unauthorized phone call by looking at weekly
billing summaries, but on a call-by-call basis we probably can't tell
very much either. At the low level, you can't see the forest for the
trees, while at the top level, you can see that there is a forest but
may not know what trees are in it. A single call to Indonesia likely
means little to a major corporation, but a change in telephone bills of
$40,000 for one weekend compared to the previous weekend would seem to
be a clear indicator of toll fraud, or at least a good basis for an
investigation. A few years ago, toll frauds had exactly this
characteristic signature.
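
To make the arithmetic concrete, here is a toy week-over-week
comparison of weekend billing totals in Python; all figures and the
threshold are invented for illustration.

    # A single overseas call vanishes in weekend totals, but a
    # $40,000 jump between weekends stands out immediately.
    weekend_totals = [3_800, 4_100, 3_950, 44_200]  # dollars, invented

    for prev, curr in zip(weekend_totals, weekend_totals[1:]):
        if curr - prev > 10_000:  # illustrative threshold
            print(f"possible toll fraud: bill jumped ${curr - prev:,}")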

On the other hand, the value of information is often
highly disproportionate to volume. A very small amount of information
relative to the overall Internet traffic might contain a great deal of
value. For example, a recent Trojan Horse in a Word document sends your
private cryptographic keys to a site in Europe where they can be
exploited to forge digital signatures, read encrypted communications,
and perhaps gain unauthorized access to major financial systems. A few
hundred bytes can easily be worth millions of dollars in this context,
and the only way to detect this sort of activity is by looking at a very
detailed log of activities. It will not show up on the monthly summary
report, and by the time the financial impact appears, it may well be too
late to even determine how the information was released.

It Cuts Both Ways

While you may be more aware now than you were before
of the ways in which you can watch the world, you may rest assured
that the world has known how to watch you for a long time. If you don't
think they are watching, let me recall a few widely publicized recent
incidents:

A recent break-in at an ISP resulted in the
theft of tens of thousands of credit card numbers.

A recent Word document included a Trojan
horse that sent out the PGP keys of anyone reading the document. This one
is still active today.

Sniffers have been detected at many of the ISPs selling Internet
services - and many of their customers use the same passwords for
their other accounts.

Attacks on pop3 (email) servers are now
widespread, and sniffing passwords to these servers provides a major
point of entry.

Audit trails at hostile sites are being
used today to attack systems that try to access their files. In many
cases, within a few seconds of entering such a site, scans of your IP
address ranges are started - typically looking for known Trojan horse
entry points.

Conclusions

My view is that information at all levels is useful
to protection management as well as to technical protection, and that
logs such as these must be understood and regularly viewed and reviewed
in order to get those jobs done.

And please remember - as you are watching the world
- the world is watching you.

About The Author:

Fred Cohen is a Principal Member of Technical Staff at Sandia National
Laboratories and a Managing Director of Fred Cohen and Associates in
Livermore, California, an executive consulting and education group
specializing in information protection. He can be reached by sending
email to fred at all.net or by visiting http://all.net/