The computer technologies that have incurred the most condemnation
recently -- Napster, Gnutella, and Freenet -- are also the most
interesting from a technological standpoint. I'm not saying this to be
perverse. I have examined these systems' architecture and protocols,
and I find them to be fascinating. Freenet emerged from a bona fide,
academically solid research project, and all three systems are worth serious
attention from anyone interested in the future of the
Internet.

In writing this essay, I want to take the hype and hysteria out of
current reports about Gnutella and Freenet so the Internet community
can evaluate them on their merits. This is a largely technical
article; I address the policy debates directly in a companion article,
"The Value of Gnutella and Freenet."
I will not cover Napster here because its operation has received more
press. It's covered in
"Napster: Popular Program Raises Devilish Issues" by Erik Nilsson,
and frankly, it is less interesting and far-reaching technically than
the other two systems.

In essence, Gnutella and Freenet represent a new step in distributed
information systems. Each is a system for searching for information;
each returns information without telling you where it came from. They
are innovative in the areas of distributed information storage,
information retrieval, and network architecture. But they differ
significantly in both goals and implementation, so I'll examine them
separately from this point on.

Gnutella basics

Each piece of Gnutella software is both a server and a client in one,
because it supports bidirectional information transfer. The Gnutella
developers call the software a "servent," but since that term looks
odd I'll stick to "client." You can be a fully functional Gnutella
site by installing any of several available clients; many operating
systems are supported. Next you have
to find a few sites that are willing to communicate with you: some may
be friends, while others may be advertised Gnutella sites. People with
large computers and high bandwidth will encourage many others to
connect to them.

You will communicate directly only with the handful of sites you've
agreed to contact. Any material of interest to other sites will pass
along from one site to another in store-and-forward fashion. Does this
sound familiar, all you grizzled, old UUCP and Fidonet users out
there? The architecture is essentially the same as those unruly,
interconnected systems that succeeded in passing Net News and
e-mail around the world for decades before the Internet
became popular.

But there are some important differences. First, because Gnutella runs
over the Internet, you can connect directly with someone who's
geographically far away just as easily as with your neighbor. This
introduces robustness and makes the system virtually failsafe, as
we'll see in a minute.

Second, the protocol for obtaining information over Gnutella is a kind
of call-and-response that's more complex than simply pushing news or
e-mail. Figure 1 shows the operation of the protocol. Suppose site A
asks site B for data matching "MP3." After passing back anything that
might be of interest, site B passes the request on to its colleague at
site C -- but unlike mail or news, site B keeps a record that site
A has made the request. If site C has something matching the request,
it gives the information to site B, which remembers that it is meant
for site A and passes it through to that site.

Figure 1. How Gnutella retrieves information
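
For readers who think better in code, here is a minimal sketch of the
call-and-response routing just described. It is not taken from any
Gnutella client: the Site class, its method names, and the simple
substring match are all assumptions made for illustration, and the
sites call each other directly in memory rather than over a network.
The point is only to show how each site remembers which neighbor a
request came from so that replies can retrace the path.

```python
class Site:
    """A hypothetical Gnutella-like site (illustrative names only)."""

    def __init__(self, name, files):
        self.name = name
        self.files = files      # names of the files this site offers
        self.neighbors = []     # the handful of sites we talk to directly
        self.came_from = {}     # request id -> neighbor that handed it to us
        self.results = []       # hits delivered to us as the originator

    def search(self, request_id, text):
        """Originate a request; None marks us as the start of the path."""
        self.came_from[request_id] = None
        for neighbor in self.neighbors:
            neighbor.on_request(request_id, text, sender=self)

    def on_request(self, request_id, text, sender):
        if request_id in self.came_from:
            return              # already handled; drop the repeat
        self.came_from[request_id] = sender   # remember who asked
        # Assumed matching rule: case-insensitive substring of the name.
        hits = [f for f in self.files if text.lower() in f.lower()]
        if hits:
            sender.on_reply(request_id, hits)
        for neighbor in self.neighbors:
            if neighbor is not sender:
                neighbor.on_request(request_id, text, sender=self)

    def on_reply(self, request_id, hits):
        if request_id not in self.came_from:
            return              # a reply to a request we never saw
        upstream = self.came_from[request_id]
        if upstream is None:
            self.results.extend(hits)             # we made the request
        else:
            upstream.on_reply(request_id, hits)   # pass it back one hop
```

If sites A, B, and C are chained together in that order and A calls
search(), B answers from its own files, forwards the request to C, and
later relays C's hits back to A. Site A never talks to C directly.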

I am tempted to rush on and describe the great significance of this
simple system, but I'll pause to answer a few questions for those who are curious.

How are requests kept separate?

Each request has a unique number, generated from random numbers or
semi-randomly from something unique to the originating site like an
Ethernet MAC address. If a request goes through site C on to site D
and then to site B, site B can recognize from the identifier that it's
been seen already and quietly drop the repeat request. On the other
hand, different sites can request the same material and have their
requests satisfied because each has a unique identifier. Each site
lets requests time out, simply by placing them on a queue of a
predetermined size and letting old requests drop off the bottom as new
ones are added.
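
The bookkeeping described above can be sketched in a few lines. This
is an illustration under my own assumptions, not code from a Gnutella
client: the SeenRequests class and new_request_id function are
hypothetical names, and I use Python's uuid module to stand in for
whatever identifier scheme a real client uses. uuid4() draws on random
numbers; uuid1() would be the semi-random alternative, since it mixes
in the host's hardware address and the time.

```python
import uuid
from collections import deque


def new_request_id():
    """Return a random identifier for a new request (assumed format)."""
    return uuid.uuid4().hex


class SeenRequests:
    """Fixed-size memory of request ids; the oldest drop off as new arrive."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.order = deque()    # ids in arrival order, oldest on the left
        self.ids = set()        # the same ids, for fast lookup

    def first_time(self, request_id):
        """Record the id; return False if it is a repeat to be dropped."""
        if request_id in self.ids:
            return False
        if len(self.order) == self.capacity:
            # Queue is full: let the oldest request time out.
            self.ids.discard(self.order.popleft())
        self.order.append(request_id)
        self.ids.add(request_id)
        return True
```

A site would call first_time() on every incoming request and process
only those for which it returns True. The capacity is the
"predetermined size" of the queue; choosing it is a trade-off between
memory and the risk of forgetting a request that is still circulating.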

What form does the returned data take?

It could be an entire file of music or other requested material, but
Gnutella is not limited to shipping around files. The return could
just as well be a URL, or anything else that could be of value. Thus,
people are likely to use Gnutella for sophisticated searches, ending
up with a URL just as they would with a traditional search
engine. (More on this exciting possibility later.)

What protocol is used?

Gnutella runs over HTTP (a sign of Gnutella's simplicity). A major
advantage of using HTTP is that two sites can communicate even if one
is behind a typical organization's firewall, assuming that this
firewall allows traffic out to standard Web servers on port 80. There
is a slight difficulty if a client behind a firewall is asked to serve
up a file, but it can get by the firewall by issuing an output command
called GIV to port 80 on its correspondent. The only show-stopper
comes when a firewall screens out all Web traffic, or when both
correspondents are behind typical firewalls.

How does the system stop searching?

Like IP packets, each Gnutella request has a time-to-live, which is
normally decremented by each site until it reaches zero. A site can
also drastically reduce a time-to-live that it decides is ridiculously
high. As we will see in a moment, the time-to-live limits the reach of
each site, but that can be a benefit as well as a limitation.
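
The time-to-live rule is simple enough to show as a small sketch. The
function name next_ttl and the ceiling of 7 are my own assumptions for
illustration; each site is free to pick its own limit.

```python
MAX_TTL = 7  # assumed local ceiling; not a value from the protocol


def next_ttl(ttl, max_ttl=MAX_TTL):
    """Return the time-to-live to forward with, or None to stop here."""
    ttl = min(ttl, max_ttl)   # cut down a ridiculously high value
    ttl -= 1                  # one hop has been used up at this site
    return ttl if ttl > 0 else None
```

A request arriving with a time-to-live of 5 goes on with 4; one
arriving with 1 stops at this site; and one arriving with 200 is cut
to the local ceiling before being decremented, so it goes on with 6.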

How is a search string like "MP3" interpreted?

That is the $64,000 question, and leads us to Gnutella's greatest
contribution.
