Hundreds of systems connected to the Internet have file libraries, or
archives, accessible to the public. Much of this consists of free or
low-cost shareware programs for virtually every make of computer. If you
want a different communications program for your IBM, or feel like playing
a new game on your Amiga, you'll be able to get it from the Net.

But there are also libraries of documents. If you
want a copy of a recent U.S. Supreme Court decision, you can find it on
the Net. Copies of historical documents, from the Magna Carta to the
Declaration of Independence, are also yours for the asking, along with a
translation of a telegram from Lenin ordering the execution of
rebellious peasants. You can also find song lyrics, poems, even
summaries of every ``Lost in Space'' episode ever made. You can also find
extensive files detailing everything you could ever possibly want to know
about the Net itself. First you'll see how to get these files; then
we'll show you where they're kept.

The commonest way to get these files is through the file transfer
protocol, or ftp. As with telnet, not all systems that connect to the
Net have access to ftp. However, if your system is one of those without
ftp, you'll still be able to get many of these files through e-mail (see
section Advanced E-mail).

Starting ftp is as easy as using telnet. At your host system's command
line, type

ftp site.name

@fnindex ftp site.name

and hit enter, where ``site.name'' is the address of the ftp site you want
to reach. One major difference between telnet and ftp is that it is
considered bad form to connect to most ftp sites during their business
hours (generally 6 a.m. to 6 p.m. local time). This is because
transferring files across the network takes up considerable computing
power, which during the day is likely to be needed for whatever the
computer's main function is. There are some ftp sites that are
accessible to the public 24 hours a day, though. You'll find these noted
in the list of ftp sites.

How do you find a file you want, though?

Until a few years ago, this could be quite the pain -- there was
no master directory to tell you where a given file might be stored on
the Net. Who'd want to slog through hundreds of file libraries looking
for something?

Alan Emtage, Bill Heelan and Peter Deutsch,
students at McGill University in Montreal, asked the same question.
Unlike the weather, though, they did something about it.

They created a database system, called archie, that would
periodically call up file libraries and basically find out what they had
available.

In turn, anybody could dial into archie, type in a file name, and
see where on the Net it was available. Archie currently catalogs close to
1,000 file libraries around the world.

Today, there are three ways to ask archie to find a file for you:
through telnet, a ``client'' archie program on your own host system, or e-mail.
All three methods let you type in a full or partial file name and
will tell you where on the Net it's stored.
If you have access to telnet, you can telnet to one of the following
addresses: @host{archie.mcgill.ca}; @host{archie.sura.net};
@host{archie.unl.edu};
@host{archie.ans.net}; or @host{archie.rutgers.edu}.
If asked for a log-in name, type

archie

and hit enter.

When you connect, the key command is prog, which you use in this
form:

prog filename

followed by enter, where ``filename'' is the program or file you're
looking for. If you're unsure of a file's complete name, try typing in
part of the name. For example, PKZIP will work as well as
PKZIP201.EXE. The system does not support DOS or Unix wildcards.
If you ask archie to look for PKZIP*, it will tell you it couldn't
find anything by that name. One thing to keep in mind is that a file is
not necessarily the same as a program -- it could also be a document.
This means you can use archie to search for, say, everything online
related to the Beatles, as well as computer programs and graphics files.

A number of Net sites now have their own archie programs that
take your request for information and pass it on to the nearest archie
database -- ask your system administrator if s/he has it online. These
``client'' programs seem to provide information a lot more quickly than the
actual archie itself! If it is available, at your host system's command
line, type

archie -s filename

where filename is the program or document you're looking for, and hit
enter. The -s tells the program to ignore case in a file name and lets
you search for partial matches. You might actually want to type it this
way:

archie -s filename |more

which will stop the output every screen (handy if there are many sites
that carry the file you want). Or you could open a file on your computer
with your text-logging function.
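
To make the matching rule concrete, here is a minimal Python sketch of the kind of case-insensitive, partial-name search that -s asks for. The catalog entries and the function name are invented for illustration; this is not archie's own code, only a picture of why PKZIP finds a match and PKZIP* does not.

```python
def substring_search(catalog, term):
    """Return catalog entries whose name contains term, ignoring case."""
    needle = term.lower()
    return [name for name in catalog if needle in name.lower()]


# Hypothetical catalog entries, for illustration only.
catalog = ["PKZIP201.EXE", "pkzip-notes.txt", "zterm.hqx"]

print(substring_search(catalog, "pkzip"))   # partial name, any case
print(substring_search(catalog, "PKZIP*"))  # the asterisk is taken literally
```

The second search comes back empty because the asterisk is treated as an ordinary character rather than a wildcard.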

The third way, for people without access to either of the above, is e-mail.

Send a message to @email{archie@quiche.cs.mcgill.ca}. You can leave the
subject line blank. Inside the message, type

prog filename

where filename is the file you're looking for. You can ask archie to
look up several programs by putting their names on the same ``prog'' line,
like this:

prog file1 file2 file3

Within a few hours, archie will write back with a list of the
appropriate sites.
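
If you would rather have a script compose the request, the following Python sketch builds such a message with the standard email package. It only constructs the message and does not send it. The sender address is a placeholder and the function name is our own.

```python
from email.message import EmailMessage


def build_archie_request(sender, filenames):
    """Build (but do not send) a mail message asking archie about filenames."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "archie@quiche.cs.mcgill.ca"
    # The subject line can be left blank, so none is set here.
    msg.set_content("prog " + " ".join(filenames) + "\n")
    return msg


# The sender address is a placeholder.
request = build_archie_request("you@your.host", ["file1", "file2", "file3"])
print(request.get_content())
```

How the message actually gets mailed depends on your host system, so that step is left out.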

In all three cases, if there is a system that has your file,
you'll get a response that looks something like this:

Chances are, you will get a number of similar looking responses
for each program. The ``host'' is the system that has the file. The
``Location'' tells you which directory to look in when you connect to
that system. Ignore the funny-looking collections of r's and hyphens
for now. After them come the size of the file or directory listing
in bytes, the date it was uploaded, and the name of the file.

Now you want to get that file.

Assuming your host site does have ftp, you connect in a similar
fashion to telnet, by typing:

ftp sumex-aim.stanford.edu

(or the name of whichever site you want to reach). Hit enter. If the
connection works, you'll see this:

If nothing happens after a minute or so, hit control-C to return
to your host system's command line. But if it has worked, type

anonymous

and hit enter. You'll see a lot of references on the Net to
``anonymous ftp.'' This is how it gets its name -- you don't really have
to tell the library site what your name is. The reason is that these
sites are set up so that anybody can gain access to certain public
files, while letting people with accounts on the sites log on and
access their own personal files. Next, you'll be asked for your
password. As a password, use your e-mail address. This will then come
up:

First, ls is the ftp command for displaying a directory (you can
actually use dir as well, but if you're used to MS-DOS, this could lead
to confusion when you try to use dir on your host system, where it won't
work, so it's probably better to just remember to always use ls for a
directory while online).

The very first letter on each line tells you whether the listing is
for a directory or a file. If the first letter is a d, or an l,
it's a directory. Otherwise, it's a file.

The rest of that weird set of letters and dashes consists of ``flags''
that tell the ftp site who can look at, change or delete the file. You
can safely ignore it. You can also ignore the rest of the line until you
get to the second number, the one just before the date. This tells you
how large the file is, in bytes. If the line is for a directory, the
number gives you a rough indication of how many items are in that
directory -- a directory listing of 512 bytes is relatively small. Next
comes the date the file or directory was uploaded, followed (finally!) by
its name.
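
The same reading of a listing line can be sketched in Python. This assumes the common Unix long format with nine whitespace-separated fields (flags, link count, owner, group, size, month, day, time or year, name). Real sites vary, and the two sample lines are made up for illustration.

```python
def parse_listing_line(line):
    """Split one Unix-style long listing line into the parts described above.

    Assumes nine whitespace-separated fields: flags, link count, owner,
    group, size, month, day, time-or-year, name. Real sites vary.
    """
    fields = line.split(None, 8)
    if len(fields) < 9:
        raise ValueError("unrecognized listing line: %r" % line)
    flags, size, name = fields[0], int(fields[4]), fields[8]
    return {
        "is_directory": flags[0] in ("d", "l"),  # the rule of thumb given above
        "size_bytes": size,
        "date": " ".join(fields[5:8]),
        "name": name,
    }


# Made-up sample lines, for illustration only.
print(parse_listing_line("-rw-rw-r--  1 ftp  archive  4444 Mar  3 11:34 README.POSTING"))
print(parse_listing_line("drwxr-xr-x  2 ftp  archive   512 Oct  5  1992 info-mac"))
```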

Notice the README.POSTING file up at the top of the directory. Most
archive sites have a ``read me'' document, which usually contains some
basic information about the site, its resources and how to use them.
Let's get this file, both for the information in it and to see how to
transfer files from there to here. At the ftp> prompt, type

get README.POSTING

and hit enter. Note that ftp sites are no different from Unix sites in
general: they are case-sensitive. You'll see something like this:

And that's it! The file is now located in your home directory on your host
system, from which you can now download it to your own computer. The
simple get command is the key to transferring a file from an archive
site to your host system.

If a line starts with a d, then that is a
directory you can enter to look for more files. If it starts with a
hyphen (-), then it's a file you can get. The next item of interest is the
fifth column, which tells you how large the item is in bytes. That's
followed by the date and time it was loaded to the archive, followed,
finally, by its name. Many sites provide a README file that lists
simple instructions and available files. Some sites use files named
Index or INDEX or something similar.

If you want to download more than one file at a time (say, a series
of documents), use mget instead of get; for example:

mget *.txt

This will transfer copies of every file ending with .txt in the given
directory. Before each file is copied, you'll be asked if you're sure
you want it. Despite this, mget could still save you considerable
time -- you won't have to type in every single file name.
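
To see which names a pattern like *.txt would pick up, here is a small Python sketch using shell-style matching from the standard fnmatch module. The directory contents are hypothetical, and on a real connection the ftp site does the expanding, so treat this only as a picture of the matching rule.

```python
from fnmatch import fnmatchcase


def mget_candidates(names, pattern):
    """Return the names a pattern such as *.txt would select (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory contents.
names = ["README", "intro.txt", "faq.txt", "notes.TXT", "zterm.hqx"]
print(mget_candidates(names, "*.txt"))
```

Note that notes.TXT is left out, because the match is case-sensitive.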

There is one other command to keep in mind. If you want to get a
copy of a computer program, type

bin

and hit enter. This tells the ftp site and your host site that you are
sending a binary file, i.e., a program. Most ftp sites now use binary
format as a default, but it's a good idea to do this in case you've
connected to one of the few that doesn't.

To switch to a directory, type

cd directory-name

(substituting the name of the directory you want to access) and hit
enter. Type

ls

and hit enter to get the file listing for that particular directory.
To move back up the directory tree, type

cd ..

(note the space between the d and the first period) and hit enter. Or
you could type

cdup

and hit enter. Keep doing this until you get to the directory of
interest. Alternatively, if you already know the directory path of the
file you want (from our friend archie), after you connect, you could
simply type

get directory/subdirectory/filename
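
For readers who would rather script a transfer than type it, the same steps can be sketched with Python's standard ftplib module. This is a sketch, not a replacement for the ftp program described here. The host, e-mail address and paths in the example call are placeholders, and the two function names are our own.

```python
from ftplib import FTP


def fetch_binary(ftp, remote_path, local_path):
    """Retrieve remote_path in binary mode and save it as local_path."""
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + remote_path, out.write)


def anonymous_fetch(host, email_address, remote_path, local_path):
    """Log in anonymously, fetch one file, and close the connection."""
    with FTP(host) as ftp:
        # "anonymous" as the name, your e-mail address as the password.
        ftp.login(user="anonymous", passwd=email_address)
        fetch_binary(ftp, remote_path, local_path)


# Example call (not run here; host, address and path are placeholders):
# anonymous_fetch("site.name", "you@your.host",
#                 "directory/subdirectory/filename", "filename")
```

Because the file is fetched in binary mode, the sketch works for programs as well as documents. The local name does not have to match the remote one.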

On many sites, files meant for public consumption are in the pub
or public directory; sometimes you'll see an info directory.

Almost every site has a bin directory, which at first glance
sounds like a bin in which interesting stuff might be dumped. But it
actually stands for ``binary'' and is simply a place for the system
administrator to store the programs that run the ftp system. Lost+found
is another directory that looks interesting but actually never has
anything of public interest in it.

Earlier, you saw how to use archie. From our example, you can see
that some system administrators go a little berserk when naming files.
Fortunately, there's a way for you to rename the file as it's being
transferred. Using our archie example, you'd type

get zterm-sys7-color-icons.hqx zterm.hqx

and hit enter. Instead of having to deal constantly with a file called
zterm-sys7-color-icons.hqx, you'll now have one called, simply,
zterm.hqx.

Those last three letters bring up something else: Many program files
are compressed to save on space and transmission time. In order to
actually use them, you'll have to use an un-compress program on them first.

There is a wide variety of compression methods in use. You can tell
which method was used by the last one to three letters at the end of a
file name. Here are some of the more common ones and what you'll need to
un-compress the files they create (and these decompression programs can all
be located through archie).

@ftable @code

.txt

.TXT
By itself, this means the file is a document, rather than a
program.

.doc

.DOC
Another common suffix for documents. No
de-compression is needed unless it is followed by a compression
suffix such as .Z.

.ps

.PS
A PostScript document (in Adobe's page description language).
You can print this file on any PostScript capable printer, or use a
previewer, like the PD GhostScript.

.Z
This is a Unix compression method. To uncompress the file,
type

uncompress filename.Z

and hit enter at your host system's command prompt. If it's a
text file, you can read it online by typing

zcat file.txt.Z |more

at your host system's command line. There is a Macintosh
program called MacCompress that you can use on your machine
if you want to download the file (use archie to find where
you can get it!). There's an MS-DOS equivalent, often found
as u16.ZIP, which means it is itself compressed in the ZIP
format.

.zip

.ZIP
An MS-DOS format. Use the PKZIP package (usually found as
PKZ201.exe or something similar).

.gz
The GNU project's compression format. A variant of the PKZIP
format. Use gunzip filename.gz to uncompress.

.zoo

.ZOO
A Unix and MS-DOS format. Requires the use of a program
called zoo.

.Hqx
A Macintosh format that needs BinHex for de-compression.

.shar
A Unix format. Use unshar.

.tar
Another Unix format, often used to bundle several related
files into one big file. Use tar. Often, a ``tarred'' file
will also be compressed with the .Z method, so you first have
to use uncompress and then tar.

.TAZ
Sometimes used for compressed .tar.Z archives that are
stored on ``3 letter suffix only systems'' (aka MS-DOS).

.Sit
A Macintosh format, requires StuffIt.

.ARC
A DOS format that requires the use of ARC or ARCE.
@end ftable
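
As a quick reference, the list above can be turned into a lookup. The Python sketch below is only an illustration: the tool names come from the list, while the function name and the case-insensitive matching are our own simplifications (on Unix systems, case in a suffix can matter).

```python
# Suffix-to-tool table, taken from the list above.
TOOLS = {
    ".z": "uncompress",
    ".zip": "PKZIP",
    ".gz": "gunzip",
    ".zoo": "zoo",
    ".hqx": "BinHex",
    ".shar": "unshar",
    ".tar": "tar",
    ".taz": "uncompress, then tar",
    ".sit": "StuffIt",
    ".arc": "ARC or ARCE",
}


def tools_needed(filename):
    """List the tools to apply, outermost suffix first; [] if none is needed."""
    steps = []
    name = filename
    while True:
        dot = name.rfind(".")
        if dot <= 0:
            break
        suffix = name[dot:].lower()  # simplification: ignore case
        if suffix not in TOOLS:
            break
        steps.append(TOOLS[suffix])
        name = name[:dot]  # peel off the suffix and look again
    return steps


print(tools_needed("archive.tar.Z"))
print(tools_needed("report.txt"))
```

For a name with stacked suffixes, such as archive.tar.Z, the tools come back in the order you would run them.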

A few last words of caution: Check the size of a file before you
get it. The Net moves data at phenomenal rates of speed. But that
500,000-byte file that gets transferred to your host system in a few
seconds could take half an hour or more to download to your computer
if you're using a 2400-baud modem. Your host system may also have limits
on the number of bytes you can store online at any one time. Also,
although it is really extremely unlikely you will ever get a file
infected with a virus, if you plan to do much downloading over the Net,
you'd be wise to invest in a good anti-viral program, just in case.
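
A back-of-the-envelope estimate helps when you check a file's size. The Python sketch below treats ``2400 baud'' as 2,400 bits per second and assumes 10 bits on the line for every byte (8 data bits plus a start and a stop bit). Both are assumptions about a typical dial-up setup rather than figures from any particular modem, and the function name is our own.

```python
def download_minutes(size_bytes, bits_per_second, bits_per_byte=10):
    """Rough transfer time in minutes, ignoring protocol overhead and retries.

    bits_per_byte=10 assumes 8 data bits plus a start and a stop bit,
    a common serial-line setting; it is an assumption, not a given.
    """
    seconds = size_bytes * bits_per_byte / bits_per_second
    return seconds / 60


print(round(download_minutes(500_000, 2400)))
```

Under these ideal conditions the 500,000-byte file works out to roughly 35 minutes. Line noise and the overhead of the transfer protocol will only add to that.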

System administrators are like everybody else -- they try to make
things easier for themselves. And when you sit in front of a keyboard
all day, that can mean trying everything possible to reduce the number
of keys you actually have to hit each day.

Unfortunately, that can make it difficult for the rest of us.

Connect to many ftp sites, and one of the entries you'll often see
is a directory named bin.

You might think this is a bin where interesting things get thrown.
It's not. ``Bin'' is short for ``binary,'' i.e., the programs that make
the ftp site work, to which you won't have access anyway.

Etc is another seemingly interesting directory that turns out to be
another place to store files used by the ftp site itself. Lost+Found
directories are used by Unix systems for some routine housekeeping --
again, nothing of any real interest.

Then, once you get into the actual file libraries, you'll find that
in many cases, files will have such non-descriptive names as V1.1-AK.TXT.
The best known example is probably a set of several hundred
files known as RFCs, which provide the basic technical and
organizational information on which much of the Internet is built.
These files can be found on many ftp sites, but always in a form such as
RFC101.TXT, RFC102.TXT and so on, with no clue whatsoever
as to what information they contain.

Fortunately, almost all ftp sites have a ``Rosetta Stone'' to help
you decipher these names. Most will have a file named README (or some
variant) that gives basic information about the system. Then, most
directories will either have a similar README file or will have an index
that does give brief descriptions of each file. These are usually the
first file in a directory and often are in the form 00INDEX.TXT. Use
the ftp get command to retrieve this file. You can then scan it online or
download it to see which files you might be interested in.

Another file you will frequently see is called ls-lgR.Z. This contains
a listing of every file on the system, but without any descriptions (the
name comes from the Unix command ls -lgR, which gives you a listing of all
the files in all your directories). The .Z at the end means the file has
been compressed, which means you will have to use the Unix uncompress command
before you can read the file.

And finally, we have those system administrators who almost seem to
delight in making things difficult -- the ones who take full advantage of
Unix's ability to create absurdly long file names. On some FTP sites, you
will see file names as long as 80 characters or so, full of capital letters,
underscores and every other orthographic device that will make it almost
impossible for you to type the file name correctly when you try to get it.
Your secret weapon here is the mget command. Just type mget, a space, and
the first five or six letters of the file name, followed by an asterisk, for
example:

mget This_F*

The FTP site will ask you if you want to get the file that begins with that
name. If there are several files that start that way, you might have to
answer n a few times, but it's still easier than trying to recreate a
ludicrously long file name.

What follows is a list of some interesting ftp sites, arranged by
category. With hundreds of ftp sites now on the Net, however, this list
barely scratches the surface of what is available. Liberal use of archie
will help you find specific files.

The times listed for each site are in Eastern time and represent
the periods during which it is considered acceptable to connect.
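
Most of the windows below run overnight, so they wrap past midnight. If you want a script to decide whether now is a polite time to connect, the check can be sketched in Python like this. The function name is our own, and hours are given on a 24-hour clock in Eastern time.

```python
def in_window(hour, start, end):
    """True if hour (0-23) falls in [start, end), allowing wrap past midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


# 6 p.m. - 6 a.m. is start=18, end=6.
print(in_window(23, 18, 6), in_window(12, 18, 6))
```

The first check (11 p.m.) falls inside the window and the second (noon) does not.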

@host{pit-manager.mit.edu}
(aka @host{rtfm.mit.edu})
The pub/usenet/rec.arts.books directory has
reading lists for various authors as well as lists of recommended
bookstores in different cities. Unfortunately, this site uses incredibly
long file names -- so long they may scroll off the end of your screen if
you are using an MS-DOS computer or certain other machines. Even if you want
just one of the files, it probably makes more sense to use mget than get.
This way, you will be asked on each file whether you want to get it;
otherwise you may wind up frustrated because the system will keep telling
you the file you want doesn't exist (since you may miss the end of its
name due to the scrolling problem).
6 p.m. - 6 a.m.

@host{ftp.eff.org}
The home of the Electronic Frontier Foundation. Use cd
to get to the pub directory and then look in the EFF, SJG and CPSR
directories for documents on the EFF itself and various issues related to
the Net, ethics and the law.
Available 24 hours.

@host{pit-manager.mit.edu}
The pub/usenet/misc.consumers directory has
documents related to credit. The pub/usenet/rec.travel.air directory
will tell you how to deal with airline reservation clerks, find the best
prices on seats, etc. See under Books for a caveat in using this ftp
site.
6 p.m. - 6 a.m.

@host{lumpi.informatik.uni-dortmund.de}
If you're interested in one possible future of computation,
and also are interested in global optimization problems, evolutionary
biology and genetics,
you might want to take a look at this server. For an overview on
the field, you should get the file pub/EA/docs/hhgtec.ps.Z,
aka @fyi{The Hitch-Hiker's Guide to Evolutionary Computation}.
Available 24 hours.

@host{ftp.germany.eu.net}
Run by Germany's EUNet group and located at the
University of Dortmund, Germany's backbone site for EUNet, the
European part of the Internet.
It's the European default server for MIT's X11 windowing system
releases, and also ``mirrors'' several important sites; for example,
pub/packages/gnu mirrors the GNU project's default server.
Available 24 hours.

@host{iraun1.ira.uka.de}
Run by the computer-science department of the
University of Karlsruhe in Germany, this site offers lists of
anonymous-FTP sites both internationally
(in the anon.ftp.sites directory)
and in Germany (in anon.ftp.sites.de).
12 p.m. to 2 a.m.

@host{ftp.netcom.com}
The pub/profiles directory has lists of ftp sites.

@host{ncsuvm.cc.ncsu.edu}
The SENATE directory contains bibliographic
records of U.S. Senate hearings and documents for the past several
Congresses. Get the file README.DOS9111, which will explain the
cryptic file names.
6 p.m. - 6 a.m.

@host{nptn.org}
The General Accounting Office (GAO) is the investigative wing of
Congress. The pub/e.texts/gao.reports directory represents an experiment
by the agency to use ftp to distribute its reports.
Available 24 hours.

@host{nptn.org}
This site has a large, growing collection of text files.
In the pub/e.texts/freedom.shrine directory, you'll find copies of
important historical documents, from the Magna Carta to the Declaration
of Independence and the Emancipation Proclamation.
Available 24 hours.

@host{seq1.loc.gov}
The Library of Congress has acquired numerous
documents from the former Soviet government and has translated many of
them into English. In the pub/soviet.archive/text.english directory,
you'll find everything from telegrams from Lenin ordering the death of
peasants to Khrushchev's response to Kennedy during the Cuban missile crisis.
The README file in the pub/soviet.archive directory provides an
index to the documents.
6 p.m. - 6 a.m.

@host{info.umd.edu}
U.S. Supreme Court decisions from 1989 to the present
are stored in the info/Government/US/SupremeCt directory.
Each term has
a separate directory (for example, term1992). Get the README
and Index files to help decipher the case numbers.
6 p.m. - 6 a.m.

@host{ftp.uu.net}
Supreme Court decisions are in the court-opinions
directory. You'll want to get the index file, which tells you which file
numbers go with which file names. The decisions come in WordPerfect and
Atex format only.
Available 24 hours a day.

@host{nptn.org}
In the pub/e.texts/gutenberg/etext91 and etext92
directories, you can get copies of Aesop's Fables, works by Lewis Carroll
and other works of literature, as well as the Book of Mormon.
Available 24 hours.

@host{sumex-aim.stanford.edu}
This is the premier site for Macintosh
software. After you log in, switch to the info-mac directory, which will
bring up a long series of sub-directories of virtually every free and
shareware Mac program you could ever want.
9 p.m. - 9 a.m.

@host{ftp.uu.net}
Carries copies, or ``mirrors,'' of Macintosh
programs from the Simtel20 collection in the
systems/mac/simtel20 directory.
Available 24 hours a day.

@host{wuarchive.wustl.edu}
This carries one of the world's largest
collections of MS-DOS software. The files are actually copied, or
``mirrored'' from a computer at the U.S. Army's White Sands Missile Range
(which uses ftp software that is totally incomprehensible). It also
carries large collections of Macintosh, Windows, Atari, Amiga, Unix, OS9,
CP/M and Apple II software. Look in the mirrors and systems directories.
The gif directory contains a large number of GIF graphics images.
Accessible 24 hours.

@host{ftp.uu.net}
Carries copies, or ``mirrors,'' of MS-DOS programs from
the Simtel20 collection in the systems/msdos/simtel20 directory.
Available 24 hours a day.

@host{cs.uwp.edu}
The pub/music directory has everything from lyrics of
contemporary songs to recommended CDs of baroque music. It's a little
different - and easier to navigate - than other ftp sites. File and
directory names are on the left, while on the right, you'll find a brief
description of the file or directory, like this:

@host{pit-manager.mit.edu}
The pub/usenet/rec.pets.dogs and
pub/usenet/rec.pets.cats directories have documents on the respective
animals. See under Books for a caveat in using this ftp site.
6 p.m. - 6 a.m.

@host{wuarchive.wustl.edu}
The graphics/gif directory contains hundreds of
GIF photographic and drawing images, from cartoons to cars, space images
to pop stars. These are arranged in a long series of subdirectories.

@host{pit-manager.mit.edu}
Look in the pub/usenet/alt.sex and
pub/usenet/alt.sex.wizards directories for documents related to all
facets of sex. See under Books for a caveat in using this ftp site.
6 p.m. - 6 a.m.

@host{elbereth.rutgers.edu}
In the pub/sfl directory, you'll find plot
summaries for various science-fiction TV shows, including Star Trek (not
only the original and Next Generation shows, but the cartoon version as
well), Lost in Space, Battlestar Galactica, the Twilight Zone, the
Prisoner and Doctor Who. There are also lists of various things related
to science fiction and an online science-fiction fanzine.
6 p.m. - 6 a.m.

@host{atari.archive.umich.edu}
The shakespeare directory contains most of
the Bard's works. A number of other sites have his works as well, but
generally as one huge mega-file. This site breaks them down into various
categories (comedies, poetry, histories, etc.) so that you can download
individual plays or sonnets.

@host{ames.arc.nasa.gov}
Stores text files about space and the history of
the NASA space program in the pub/SPACE subdirectory.
In the pub/GIF
and pub/SPACE/GIF directories, you'll find astronomy- and NASA-related
GIF files, including pictures of planets, satellites and other celestial
objects.
9 p.m. - 9 a.m.

@host{goya.dit.upm.es}
This Spanish site carries an updated list of
bulletin-board systems in Spain, as well as information about European
computer networks, in the info/doc/net subdirectory, mostly in Spanish.
The BBS list is bbs.Z, which means you will have to uncompress it
to read it.
Available 24 hours.

@host{coe.montana.edu}
The pub/TV/Guides directory has histories and other
information about dozens of TV shows. Only two anonymous-ftp log-ins are
allowed at a time, so you might have to try more than once to get in.
8 p.m. - 8 a.m.

@host{ftp.cs.widener.edu}
The pub/simpsons directory has more files than
anybody could possibly need about Bart and family. The pub/strek
directory has files about the original and Next Generation shows as well
as the movies.
See also under Science Fiction.

@host{nic.stolaf.edu}
Before you take that next overseas trip, you might
want to see whether the State Department has issued any kind of advisory
for the countries on your itinerary. The advisories, which cover
everything from hurricane damage to civil war, are in the
pub/travel-advisories/advisories directory, arranged by country.
7 p.m. - 7 a.m.

@host{pit-manager.mit.edu}
This site contains all available ``frequently
asked questions'' files for Usenet newsgroups in the pub/usenet
directory. For easy access, get the index file.
See under Books for a caveat in using this ftp site.
6 p.m. - 6 a.m.

@host{vmd.cso.uiuc.edu}
No password needed. The wx directory contains GIF
weather images of North America. Files are updated hourly and take this
general form: CV100222. The first two letters tell the type of file: CV
means it is a visible-light photo taken by a weather satellite. CI
images are similar, but use infrared light. Both these are in black and
white. Files that begin with SA are color radar maps of the U.S. that
show severe weather patterns but also fronts and temperatures in major
cities. The numbers indicate the date and time (in GMT - five hours
ahead of EST) of the image: the first two numbers represent the month,
the next two the date, the last two the hour. The file WXKEY.GIF
explains the various symbols in SA files.
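
The naming scheme just described is regular enough to decode with a few lines of Python. The sketch below assumes names of exactly the form shown (two letters followed by six digits, with an optional extension). The function name is our own, and the check is deliberately strict.

```python
def parse_weather_name(name):
    """Decode a name such as CV100222 into image type, month, day and GMT hour."""
    kinds = {
        "CV": "visible-light satellite photo",
        "CI": "infrared satellite photo",
        "SA": "color radar map",
    }
    stem = name.split(".")[0]
    if len(stem) != 8 or not stem[2:].isdigit():
        raise ValueError("unexpected file name: %r" % name)
    return {
        "kind": kinds.get(stem[:2], "unknown"),
        "month": int(stem[2:4]),
        "day": int(stem[4:6]),
        "hour_gmt": int(stem[6:8]),
    }


print(parse_weather_name("CV100222"))
```

For CV100222 this gives a visible-light photo from October 2 at 22:00 GMT, which is 5 p.m. EST.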

Liberal use of archie will help you find specific files or
documents. For information on new or interesting ftp sites, try the
@news{comp.archives} newsgroup on Usenet. You can also look in the
@news{comp.misc},
@news{comp.sources.wanted} or
@news{news.answers} newsgroups on Usenet for lists of ftp
sites posted every month by Tom Czarnik and Jon Granrose.

The @news{comp.archives} newsgroup carries news of new ftp sites and
interesting new files on existing sites.

In the @news{comp.virus} newsgroup on Usenet, look for postings that list
ftp sites carrying anti-viral software for Amiga, MS-DOS, Macintosh,
Atari and other computers.

The @news{comp.sys.ibm.pc.digest} and @news{comp.sys.mac.digest} newsgroups
provide information about new MS-DOS and Macintosh programs as well as
answers to questions from users of those computers.