At first glance, using a general search engine to locate information on the web seems easy. But getting a search engine to work with precision is another story. General search engines come packed with features that are often underutilized but can increase search precision. The features differ from engine to engine, and skilled researchers adjust their search strategy to take advantage of these differences depending on the type of results sought. This article explains the differences in some of the available features, then examines a few major search engines in light of these features.

Searching Features

Alternative/Inclusive Default

When you type two words into a search engine box without any connectors, how does the engine put them together? Will it find only those pages where both words appear, or will it find pages where either word appears? Search engines with an inclusive default treat two separately typed words as if there were an AND between the words, while search engines with an alternative default treat the same two words as if there were an OR between the words. Thus, the results for the same search typed into two different search engines can be enormously different because one is inclusive, and the other alternative.

Inclusive Default Search Engines

Google

HotBot

Lycos

Alternative Default Search Engines

AltaVista

Excite

Many search engines allow a researcher to designate alternative or inclusive searching with the connectors OR and AND. Inclusion can also be designated by using a plus sign as a word modifier:

apple OR blueberry

apple AND blueberry

+apple +blueberry
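To make the difference between the two defaults concrete, here is a minimal Python sketch. It is not any engine's actual code: the three pages and the function names are invented for illustration, and real engines work from large indexes rather than a small dictionary.

```python
# Toy collection: page name -> text. These pages are invented for illustration.
PAGES = {
    "page1": "apple pie recipe",
    "page2": "blueberry muffin recipe",
    "page3": "apple and blueberry cobbler",
}

def pages_with(word):
    """Return the set of page names whose text contains the word."""
    return {name for name, text in PAGES.items() if word in text.split()}

def inclusive_search(words):
    """Inclusive default: every word must appear (an implied AND)."""
    result = set(PAGES)
    for word in words:
        result &= pages_with(word)
    return result

def alternative_search(words):
    """Alternative default: any one word is enough (an implied OR)."""
    result = set()
    for word in words:
        result |= pages_with(word)
    return result

query = ["apple", "blueberry"]
print(sorted(inclusive_search(query)))    # ['page3']
print(sorted(alternative_search(query)))  # ['page1', 'page2', 'page3']
```

The same two words return one page under the inclusive default and all three under the alternative default, which is why identical queries can look so different from engine to engine.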

Keyword/Concept Default

Some search engines use automatic concept searching as a default. Many advanced online researchers are accustomed to keyword searching, where the exact string of characters typed in is searched. Thus, an advanced researcher who unwittingly uses a search engine with a concept searching default can become frustrated. Concept searching occurs when the engine not only searches for the exact character string, but also for word forms, and even synonyms and other words that statistically appear with the typed word.

Keyword Search Default Search Engines

AltaVista

Google

HotBot

Lycos

Concept Search Default Search Engines

AltaVista (for some searches)

Excite

Exclusion

Most search engines allow exclusion of search results that contain certain terms. On many engines, this is done by placing a minus sign or the word NOT in front of the term to be excluded. This feature should be used sparingly to avoid eliminating relevant results that might have a casual mention of the excluded term. Note that a minus sign modifies a single word, while NOT is a connector between words:

pie -apple

pie NOT apple
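The risk of over-excluding can be seen in a small Python sketch. The pages and the function name below are hypothetical, and the matching is a simple whole-word test rather than anything a real engine does.

```python
# Toy collection, invented for illustration.
PAGES = {
    "page1": "pie crust basics",
    "page2": "apple pie recipe",
    "page3": "cherry pie recipe that tastes better than apple",
}

def search_excluding(include, exclude):
    """Return pages that contain `include` but not `exclude`."""
    hits = set()
    for name, text in PAGES.items():
        words = text.split()
        if include in words and exclude not in words:
            hits.add(name)
    return hits

# Equivalent in spirit to: pie -apple
print(sorted(search_excluding("pie", "apple")))  # ['page1']
```

Here page3 is about cherry pie, yet it is dropped because it mentions apple in passing.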

Truncation

When using keyword, or exact match, searching, it can be helpful to command the search engine to locate pages where there are various forms of the word being sought. Typing the root of a word and adding a truncation symbol on the end can accomplish this. Most search engines recognize an asterisk as a truncation symbol. For example, if I wanted to find pages with various forms of the word independence, I would type independen* and the results would include pages that contain independence, independent, and independently.
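Truncation amounts to a prefix match. The following Python sketch assumes the asterisk appears only at the end of the search term; the function name and the word list are made up for illustration.

```python
def truncation_match(pattern, words):
    """Return the words that match a pattern ending in a trailing asterisk.

    Without a trailing asterisk, only exact matches are returned.
    """
    if pattern.endswith("*"):
        root = pattern[:-1]
        return [w for w in words if w.startswith(root)]
    return [w for w in words if w == pattern]

words = ["independence", "independent", "independently", "depend", "indent"]
print(truncation_match("independen*", words))
# ['independence', 'independent', 'independently']
```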

Search Restrictors

Search restrictors in web search engines are similar to search fields in Westlaw. They allow a search for terms or values contained only in certain portions of a page, rather than anywhere in the entire page. A simple example is a search restricted to a type of domain, like .com or .edu. If a domain restriction is used, the search engine seeks results only where the URL matches the designated domain type. Search restrictions are accomplished in different ways on different search engines, usually showing up in an engine’s advanced searching option. Serious researchers have long applauded HotBot’s search form, which makes restricted searching easy.

Title restrictions are often available. Use these with caution, perhaps as a first step to see what pops up. A title restriction reflects the title of the web page, designated by the web author. It may not necessarily correspond to the title of the document appearing on the page. For example, I might be looking for a copy of the Declaration of Independence. That document may appear on a web page entitled Historic Documents by the web author. If I restrict my search for “declaration of independence” to the title portion of pages, I will miss this page because it is actually called Historic Documents.
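The Historic Documents problem can be reproduced in a few lines of Python. This is only a sketch: the two pages, the `title` and `body` field names, and the `search` function are assumptions made for the example, not a description of how any engine stores pages.

```python
# Two invented pages. The first holds the document itself under a generic title.
PAGES = [
    {"title": "Historic Documents",
     "body": "Full text of the Declaration of Independence"},
    {"title": "Declaration of Independence",
     "body": "A short essay about the signers"},
]

def search(phrase, field=None):
    """Return titles of pages containing the phrase.

    With field=None the whole page is searched; otherwise only that field.
    """
    phrase = phrase.lower()
    hits = []
    for page in PAGES:
        if field is None:
            haystack = page["title"] + " " + page["body"]
        else:
            haystack = page[field]
        if phrase in haystack.lower():
            hits.append(page["title"])
    return hits

print(search("declaration of independence"))
# ['Historic Documents', 'Declaration of Independence']
print(search("declaration of independence", field="title"))
# ['Declaration of Independence']
```

The title-restricted search misses the one page that actually contains the text of the document.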

Date Searching

Searches can often be restricted by date. Additionally, dates often appear on the list of search results. However, like page titles, page dates can be somewhat misleading. The dates that are searched or reflected in results lists are the dates of the web page, and not necessarily the date of the document on the page. A search with a date restriction of July 4, 1776, will yield no results since no web pages were created or changed on that date. Thus, if I am searching for the Declaration of Independence, it won’t help me to try to place a date restriction in my search query. However, date restrictions can be useful to locate newly created or recently updated web pages, weeding out older results.
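A short Python sketch shows why. It assumes, for illustration only, that each page carries a single last-modified date; the pages, dates, and function names are invented.

```python
from datetime import date

# Invented pages with invented last-modified dates.
PAGES = [
    {"title": "Historic Documents", "modified": date(2001, 3, 15)},
    {"title": "Court News", "modified": date(2001, 6, 1)},
]

def modified_on(pages, day):
    """Titles of pages whose page date is exactly `day`."""
    return [p["title"] for p in pages if p["modified"] == day]

def modified_on_or_after(pages, cutoff):
    """Titles of pages created or changed on or after `cutoff`."""
    return [p["title"] for p in pages if p["modified"] >= cutoff]

# The date of the document is not the date of the page.
print(modified_on(PAGES, date(1776, 7, 4)))           # []
# A date restriction does help to find recently updated pages.
print(modified_on_or_after(PAGES, date(2001, 5, 1)))  # ['Court News']
```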

Phrase Searching

Most search engines recognize quotation marks around two or more terms as the designation of a phrase. Additionally, this can sometimes be accomplished by placing the Boolean connector ADJ between the terms. Thus, “apple pie” or apple ADJ pie will search for the phrase apple pie, and not search the two terms separately.

Nesting

Many search engines support the use of parentheses to nest various parts of a search query. For example, a search for apple or blueberry pie can be accomplished by nesting:

(apple OR blueberry) ADJ pie

It can also be accomplished by searching two alternative phrases:

“apple pie” OR “blueberry pie”
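That the two forms find the same pages can be checked with a small Python sketch. It assumes ADJ means "immediately followed by", which is how the connector is described above but may differ in detail from engine to engine; the pages and function names are hypothetical.

```python
# Toy collection, invented for illustration.
PAGES = {
    "page1": "grandma's apple pie",
    "page2": "blueberry pie with ice cream",
    "page3": "apple sauce and pumpkin pie",
}

def has_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively in `text`."""
    words = text.split()
    target = phrase.split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def nested(text):
    """(apple OR blueberry) ADJ pie"""
    words = text.split()
    return any(first in ("apple", "blueberry") and second == "pie"
               for first, second in zip(words, words[1:]))

def two_phrases(text):
    """"apple pie" OR "blueberry pie" """
    return has_phrase(text, "apple pie") or has_phrase(text, "blueberry pie")

print(sorted(name for name, text in PAGES.items() if nested(text)))
# ['page1', 'page2']
print(sorted(name for name, text in PAGES.items() if two_phrases(text)))
# ['page1', 'page2']
```

Note that page3 contains both apple and pie but is not returned by either form, because the two words never appear side by side.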

Search Levels

It is often useful to perform a multi-level search, first casting a wide net, then narrowing by searching only within that set of results. This feature is offered by AltaVista, Google, HotBot and Lycos.

Results Features

When comparing search engines, search language is only half the story. Search results are also important. Search engines use various mathematical formulas to match terms from the search query to web pages containing those terms. These formulas weigh several factors to present lists of results, often ranked by relevancy, or at least by relevancy as the formulas define it. Some of the factors that go into the determination of relevancy are how closely together the terms appear, how many times they appear on the page, how close to the top of the page they are, and how rare they are.
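As a rough illustration of how such factors might be combined, the Python sketch below scores pages on three of them: how often the terms appear, how near the top they first appear, and how close together they are. The formula and its weights are arbitrary assumptions made for this example, it leaves out the fourth factor entirely, and it does not reflect any real engine's formula.

```python
# Toy collection, invented for illustration.
PAGES = {
    "A": "apple pie recipe for a classic apple pie",
    "B": "a long history of the pie with one mention of an apple",
    "C": "pie crust tips",
}

def score(text, terms):
    """Toy relevancy score. Higher is better; 0.0 if any term is missing."""
    words = text.lower().split()
    positions = {t: [i for i, w in enumerate(words) if w == t] for t in terms}
    if any(not found for found in positions.values()):
        return 0.0
    frequency = sum(len(found) for found in positions.values())
    firsts = [found[0] for found in positions.values()]
    earliest = min(firsts)              # distance of first hit from the top
    spread = max(firsts) - min(firsts)  # distance between first occurrences
    return frequency + 1.0 / (1 + earliest) + 1.0 / (1 + spread)

terms = ["apple", "pie"]
ranking = sorted(PAGES, key=lambda name: score(PAGES[name], terms), reverse=True)
print(ranking)  # ['A', 'B', 'C']
```

Page A ranks first because the terms appear often, early, and side by side; page B mentions both terms but late and far apart; page C lacks one term altogether.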

Beyond pure relevancy rankings, however, many options are available to achieve a variety of results. Search engines present results quite differently, often without clearly explaining how the results are calculated or displayed. A serious researcher will seek to understand these differences and use them to her advantage.

Directory Results

Several years ago, before sophisticated portal sites were developed, there were two major ways to search for information on the web: directories and search engines. A directory is a collection of links to web sites that is classified into subject categories and subcategories.

As directories and search engines developed into overall portals, directories incorporated search engines and search engines incorporated directories. Portals have attempted to make these two entities appear seamless; however, they are two distinct finding tools. Understanding this concept allows the researcher to take more control over her searching.

Consider, for example, the classic directory, Yahoo! In a search for the Declaration of Independence, I can click through subject categories to locate it, or I can type “declaration of independence” in the search box. When I run the search, Yahoo! first searches its classified directory for subcategories entitled Declaration of Independence. If none are present, it then searches the directory for listed web sites entitled Declaration of Independence. If there are none, Yahoo! then uses the search engine Google to search for web sites that contain the phrase Declaration of Independence. Yahoo! presents the first set of results it can, even if that happens to be the third step, web page results from Google. I do not have to prompt Yahoo! to move through to the next step if the first step found nothing; it happens automatically. This is why different searches on Yahoo! may produce results pages that look quite distinct.

Besides Yahoo!, there are two other major subject directories that have linked themselves with major search engines. The Open Directory Project provides directory results to Google, HotBot and Lycos, while LookSmart provides directory results to AltaVista and Excite.

Most Popular Results

As researchers began to realize that mathematical relevancy ranking didn’t always equal researchers’ intuitive relevancy ranking, tools were developed to put a more human factor back into relevancy determinations. Search engines can now measure what the most popular sites are, given certain search terms, and list the popular sites as results options. This is the driving force behind Direct Hit, which is used at HotBot and Lycos. Google and AltaVista include popularity as a factor in their formulas to determine relevancy rankings.

Customized Results

Most search engines allow the look of the results page to be changed, especially with regard to the number of hits per page. Additionally, they may offer the option of listing only titles or sorting by date or site, rather than relevancy.

Clustered/Compressed Results

Some searches produce many individual page hits from the same overall web site, making it seem like the results all come from the same place. When a search engine uses results compression, or clustering, it shows only one page per web site, while offering an option to view the other results from that site. This feature can be found at AltaVista, Excite, Google and HotBot.
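The idea behind compression can be sketched in Python by grouping hits on the host name of each URL. The URLs below are placeholders and the function names are invented; actual engines may define a "site" differently.

```python
from urllib.parse import urlparse

def cluster_by_site(urls):
    """Group result URLs by host name, keeping the original order."""
    clusters = {}
    for url in urls:
        host = urlparse(url).netloc
        clusters.setdefault(host, []).append(url)
    return clusters

def compressed(urls):
    """One (top hit, number of hidden hits) pair per site."""
    return [(hits[0], len(hits) - 1) for hits in cluster_by_site(urls).values()]

# Placeholder results list, invented for illustration.
results = [
    "http://www.example.com/pies/apple.html",
    "http://www.example.com/pies/blueberry.html",
    "http://www.example.org/recipes.html",
    "http://www.example.com/pies/index.html",
]

for top, hidden in compressed(results):
    print(top, "(+%d more from this site)" % hidden)
# http://www.example.com/pies/apple.html (+2 more from this site)
# http://www.example.org/recipes.html (+0 more from this site)
```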

Suggested Searches

Many search engines provide suggestions for further searching based on the initial search. These suggestions can be simple, such as synonyms or alternative search terms. They can be more sophisticated, such as suggestions for searching in different, specialized databases. Ask Jeeves is built entirely around suggested searches. If I type a question into Ask Jeeves’ search box, it returns a list of suggested specialized databases that might contain the answer to that question.

For example, I asked Jeeves “Where can I find the Declaration of Independence?” Jeeves returned several suggested sources for the text of the Declaration of Independence, as well as historical background on it.

Suggested searches can also be found at AltaVista, Excite, HotBot and Lycos.

Similar Searches

If I locate a web page that is highly relevant to my research issue, I might be interested in finding more pages that are very similar. Some search engines will perform a search for other similar pages at the click of a button. I simply choose a page from my results list and ask the engine to perform a second search to find similar pages. This feature can be found at Google (Similar Pages) and AltaVista (Related Pages).

Translated Results

A few years ago, AltaVista began offering a tool to translate a given results page from one language to another. The translations aren’t the greatest, but they’re better than nothing when confronted with results in an unfamiliar language. Google and Lycos also offer translation.

Directory:

no (Help screens say yes, but I couldn’t find one instance where they appeared as search results); a separate directory can be browsed from the main page.

Popular:

not a separate list, but popularity is built into AltaVista’s relevancy formula

Clustered:

called site compression, is automatic in Basic Search and can be turned on in Advanced Search

Suggestions:

yes

Similar:

yes, called Related Pages

Translated:

yes

Other Features:

While Basic Search presents results ranked by relevancy, Advanced Search results will appear in random order unless the Sort by box is used. The Sort by box allows users to place greater weight on certain terms.
