Conducting Web research with online rationale -- that is, consistently
entering keywords, or so-called "natural language" queries, at search
database sites to locate information on the Web -- risks missing a myriad
of potentially useful resources. The Web is not a database; nor is it "online."
It is a system more analogous in size, content, and organization to
several massive physical libraries than to an online system comprising
countless databases.

Search database sites probe data collected by the site. Sites known
as mega search sites, or mega search engines, quickly explore data
gathered by several individual search database services. They do not
develop their own databases. Neither search database services nor
mega search engines store or probe all data available on the Web.

Further, many search database sites require the submission of a URL
in order to collect data from an unknown location. Some do not follow
all links within a site. Others, called spiders or robots, in theory
regularly collect and update data from known sites, but those that
honor the convention do not access newly discovered locations whose
servers expressly prohibit it with a robots.txt file.
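
To make the robots.txt restriction concrete, here is a minimal sketch of
how a robot might decide whether it may collect a page. It uses Python's
standard urllib.robotparser module. The robots.txt content, the robot
name "ExampleBot" and the example.com URLs are illustrative assumptions,
not taken from any real site or search service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real robot would download this file
# from the root of the site it intends to visit.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_collect(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/public/index.html",
        "http://example.com/private/report.html",
    ):
        print(url, may_collect(ROBOTS_TXT, "ExampleBot", url))
```

A robot that honors these rules would never index anything under
/private/, so no query at a search database site built by that robot
would surface those pages, however well the query is phrased.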

What strategy should one employ to initiate research on the Web?
First, set aside online thinking and habits. When used with an
understanding of their individual features, strengths and limitations,
search database sites and mega engines often yield fruitful results.
It is important to note, however, that these resources do not
constitute the sole means for obtaining information on the Web.

Second, refresh traditional research skills. Imagine a large
unfamiliar physical library. What would expert researchers do to find
pertinent resources within the library? Most would consult a librarian
or the catalog. The catalog or the librarian would in turn refer them
to finding aids such as indexes, pathfinders or guides, and to
treatises, online databases, CD-ROM products, or other potentially
relevant resources within the library's collection.

By analogy, the Web is a vast library, but with notable differences
that affect research strategy. For example, no one manages the Web. No
one evaluates the quality of resources or assesses the needs of the
patrons. No one oversees weeding the collection, archiving
information, or making acquisitions. Classic foes like censorship
prove difficult to impose in this environment, but so does quality
control.

In addition, physical libraries lose resources or find them damaged,
but sources of information on the Web frequently alter content or
location without any record of the change. Often, information that
previously existed disappears without a trace, never to return to
either the same location or another one.

Recognizing the need for increased sensitivity to issues of quality
and authority, researchers should consider their degree of familiarity
with potential resources and with the subject. In other words, in the
physical world, expert researchers lacking familiarity with a
library's collection or with the subject initiate research by
consulting the catalog or another expert. Those knowing the collection
but uninstructed in specific resources relating to the topic seek
guides, indexes, pathfinders or other finding aids. Researchers
familiar with both the collection and the subject browse relevant
areas of the library in search of substantive resources.

On the Web, the first scenario -- unfamiliarity with both the
collection and the subject -- presents the greatest challenge because
finding a catalog may be an adventure in itself. No signs or maps point
to such tools as they do in a physical library. But means for finding
them exist. They include reading articles like this one, using known
catalogs to discover others, posting questions to appropriate
discussion groups, and using search services.

In closing, conducting comprehensive research on the Web requires
using a variety of tools, some of which closely resemble publication
types researchers have consulted for years. When undertaking a new
research project on the Web, consider how expert researchers might
approach the same task in a physical library. Then attack it in a
similar fashion, allowing for differences between the two environments
that affect strategy and for innovations in publication types.