The TIPSTER Text Program was a Defense Advanced Research Projects Agency (DARPA)-led
government effort to advance the state of the art in text processing technologies through
the cooperation of researchers and developers in Government, industry and academia. The
resulting capabilities were deployed within the intelligence community to
provide analysts with improved operational tools. Due to lack of funding, this program formally ended in the Fall of 1998.

DARPA, the Department of Defense (DoD) and the Central Intelligence Agency (CIA)
jointly funded and managed the program, in close collaboration with the National
Institute of Standards and Technology (NIST) and the Space and Naval Warfare Systems Center (SPAWAR, or SSC),
formerly NCCOSC/NRaD. A TIPSTER Advisory Board was formed in 1998 with
members representing users from other Government agencies interested in automated text
processing, such as the Department of Energy (DOE), Federal Bureau of Investigation (FBI),
Internal Revenue Service (IRS), National Science Foundation (NSF) and Treasury Department.

In its efforts to improve document processing efficiency and cost effectiveness, TIPSTER
focused on three underlying technologies:

Document Detection: the capability to locate documents containing the type of
information the user wants from either a text stream or a store of documents.

Information Extraction: the capability to locate specified information within a text.

Summarization: the capability to condense the size of a document or collection
while retaining the key ideas in the material.

These three capabilities formed the basis for nearly all other information handling tasks.

TIPSTER Phase I

During the first phase of TIPSTER research efforts (1991-1994), the participants made
major advances in creating the algorithms for document detection and information
extraction and in improving the techniques for measuring those advances, through activities
such as the Message Understanding Conferences (MUC) and the Text Retrieval
Conferences (TREC). Document Detection technologies improved Recall from roughly
30% to as high as 75% and the improvement in the processing of natural language queries
was also significant. Improvements in Information Extraction produced increases in Recall
from roughly 49% to 65% and in Precision from 55% to 59%, and dramatic gains were
made in the ability to automatically identify a wide range of items such as names (both
personal and organizational), dates, locations, times, phone numbers, etc.
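
As a reminder of what the Recall and Precision figures above measure, the following is a minimal sketch using the simplified, set-based definitions: Recall is the share of the correct items that the system found, and Precision is the share of the system's output that is correct. It is not the official MUC or TREC scoring procedure, which is more detailed. The function name and the sample data are hypothetical and chosen only for illustration.

```python
def precision_recall(system_items, reference_items):
    """Return (precision, recall) for system output scored against a reference key.

    Simplified set-based scoring: an item counts as correct only if it
    appears in both the system output and the reference key.
    """
    system = set(system_items)
    reference = set(reference_items)
    correct = len(system & reference)
    # Guard against division by zero when either set is empty.
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: organization names found in one document.
reference_key = {"DARPA", "NIST", "CIA", "SPAWAR"}  # what a human annotator marked
system_output = {"DARPA", "NIST", "FBI"}            # what the system extracted

precision, recall = precision_recall(system_output, reference_key)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the system returns three names, two of which are in the key of four, so Precision is 2/3 and Recall is 2/4. A system can raise one measure at the expense of the other, which is why both are reported.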

TIPSTER Phase II

During the second phase (April 1994-September 1996), the TIPSTER research and development
community turned its attention to the creation of a software architecture in order to
standardize the technology components, enable "plug and play" capabilities among the
various tools being developed, and permit the sharing of software among the various
participants. The architecture continued to evolve, as funding permitted, based on feedback
from the researchers, developers, and users of the existing prototype and implementation systems.

The Multilingual Entity Task (MET) developed Chinese and Japanese training collections
with over 300 documents in each language. The task was initially confined to Named Entity
extraction and the development of a variety of tools such as word boundary finders and
part-of-speech tagged Chinese lexicons and dictionaries.

Various research projects and demonstration systems in support of Document Detection
and Information Extraction were also completed.

TIPSTER Phase III

Phase III started in October 1996 and continued to build on Phase I and II achievements
with new projects in supporting research, development and evaluation areas. Summarization
was also added as a fundamental task area. See Phase III Overview.