Fortunately, thanks to the Web, we can now work together on a global scale and share our expertise.
An international network can now be set up to collect and disseminate instantly the knowledge that society needs
to reduce human disease, increase agricultural production, control destructive invasive species,
protect endangered ones, and enjoy rather than struggle with nature.
This prospectus addresses how we could work together, leverage our existing resources, and build such a network.

What would it take to study and monitor a million species around the planet, in real-time, by 2012?
How should we work together to identify species, study their distribution and abundance,
share our findings, and, in so doing, help everyone better understand, manage, and enjoy biodiversity?

In 1992, Raven and Wilson recognized the importance of completing a global biodiversity survey
and proposed a 50-year plan (Science 258: 1099-1100).
Many individuals are now working to document, understand, and manage biodiversity.
Their efforts differ in geographic scale and taxonomic breadth.
At a global scale, Species 2000 has been cataloging all named species since 1999
(http://www.sp2000.org).
The Organization for Economic Cooperation and Development authorized the Global Biodiversity Information Facility (GBIF)
in 1999. GBIF helps coordinate and put databases on-line for various taxa
(http://www.gbif.org).
BioNET International, a network started in 1993 to build capacity in taxonomy,
now has 120 member countries
(http://www.bionet-intl.org).
National efforts include Costa Rica's INBio
(http://www.inbio.ac.cr/en/default2.html),
Mexico's Conabio
(http://www.conabio.gob.mx),
and the United States Geological Survey's National Biological Information Infrastructure
(USGS-NBII http://www.nbii.gov).
Projects differ in taxonomic breadth.
The goal of All Taxa Biodiversity Inventories (ATBIs),
such as Great Smoky Mountains National Park's,
is to document all species within a specified geographic boundary.
In contrast, the U.S. National Science Foundation's Planetary Biodiversity Inventory program
focuses on all the species within defined taxonomic groups.
Yet other efforts target single species, such as a disease-causing agent or agricultural pest.

The Web is an efficient and inexpensive medium for rapidly sharing information.
There are now over 100,000 Websites that provide some information about biodiversity.
Large sites run by herbaria, museums, universities, and other entities
include Animal Diversity Web, ARKive, Ecoport, ETI, GenBank,
NatureServe, Tree of Life, and Wikipedia.
A multitude of smaller sites share information contributed by
amateur photographers, hobbyists, nature lovers, and other individuals.
Unfortunately, like the Web in general, this rapidly growing digital encyclopedia of life
is loose-knit, difficult to navigate, and lacks sufficient quality control.

Despite centuries of intense interest and much work, science knows relatively little about life on Earth.
Of the planet's estimated 5-10 million or more species, only about 1.7
million species are scientifically described and named. The biology,
ecological interactions, and environmental requirements of most species,
even the named ones, are largely unknown.
The following challenges impede assembling and using biodiversity information:

What is it? -- Most people can identify only a small fraction of the organisms they find to the species level.
Consequently, until we make accurate identification tools readily available,
they can neither contribute nor retrieve species-specific information.

Too many whats, wheres, and whens -- The magnitude of a global biodiversity survey is too great
for experts to do everything. Students and citizen scientists must learn about biodiversity and help
us study, monitor, and map target species around the globe with up-to-date information.

Retrieving information -- Information about most species is currently inaccessible to most people.
No information exists for millions of species. Much of what does exist is not on the Web but
in copyrighted material that may or may not be available from local libraries and other sources.
Even if high-quality information is on the Web,
it may be hidden within the Web's chaff to all but a few cognoscenti
who know which databases to search and can spell scientific names correctly.
If we are to gain the help that we need for our quest from the general public,
we must return knowledge to them in a form that is free and easy to find.

Quality control -- General search engines such as Google have revolutionized our lives.
However, while they help us find information, they fail to rank its quality.
Multi-source biodiversity data need to be vetted and users allowed to view original records
before they use them. Is a point on a range map based on a museum specimen collected a century ago,
on a novice citizen scientist's contribution, or on a professional expert's recent report?

The network of centers we propose here will help overcome these impediments by
providing a framework of technology, training, and support so that individuals around the world can rapidly
assemble and share high-quality information about their biodiversity.

Why are the centers needed?

The proposed centers are intended to complement other on-going biodiversity projects
such as GBIF's and the Encyclopedia of Life initiative led by the Smithsonian Institution.
The centers will provide tools and training to assemble a vast amount of biological
information from regional experts into a common framework by 2012.
They will provide scientists and other contributors with the one-on-one
technical support they need to share their expertise and data through the Web.

Based on Discover Life's experience of providing technical
support from the University of Georgia since 1998,
the proposed network of regional centers is an expedient means to transfer our
technology to potential contributing experts worldwide.
Because of the technical complexity of putting databases and identification guides on-line,
our most productive contributors to date are the scientists to whom we have provided long-term,
customized technical support, often by phone.
Our large training workshops were useful politically but they were not productive
in terms of getting guides built or information on the Web.

Discover Life relies on information provided by others.
It gives credit to the original source of each image, record, and page that
it serves from contributing databases. As such, our team of taxonomic experts serves as
search-engine editors, ensuring that the information served is accurate, up-to-date,
and from reliable sources.

IDnature Guides

These are interactive Web forms that help users identify a specimen
by checking and submitting its attributes
(see http://www.discoverlife.org/20/q).
Users check one or more states for each character that a guide presents,
skipping characters for which they lack information or are not sure.
While users exclude species that do not have their specimen's attributes,
at any time they can link to pages about the remaining possible taxa.
Eventually, if all goes well, they end up on the species page about the specimen
that they have identified. Finally, they may report to the system where they found the species.
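The elimination logic described above can be sketched in a few lines of Python. The species, characters, and states below are hypothetical examples, not drawn from an actual guide.

```python
# Minimal sketch of a guide's elimination step; the species,
# characters, and states here are hypothetical examples.

# Each species maps characters to the set of states it may show.
MATRIX = {
    "Species A": {"wing color": {"black"}, "leg count": {"6"}},
    "Species B": {"wing color": {"black", "brown"}, "leg count": {"6"}},
    "Species C": {"wing color": {"brown"}, "leg count": {"8"}},
}

def remaining(selections, matrix=MATRIX):
    """Return species consistent with every checked character state.

    `selections` maps a character to the states the user checked;
    characters the user skipped do not constrain the result, and a
    blank (unscored) cell never excludes a species.
    """
    survivors = []
    for species, scored in matrix.items():
        for character, states in selections.items():
            known = scored.get(character)
            if known is not None and not (known & states):
                break  # species shows none of the checked states
        else:
            survivors.append(species)
    return survivors
```

Checking "black" wing color, for example, excludes only Species C; skipping every character leaves all species in play.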

Guide builders have a powerful set of on-line modules to speed guide building and the creation of species pages.
These modules are Web based and are fully integrated with features of the global mapper, image center,
record manager, index and search tools described below. The software enables multiple individuals from
across the Web to contribute simultaneously to a guide and its associated species pages.
The builders' menu includes features to import species lists and other data from spreadsheets and databases;
add, rename, and drop taxa; specify rules to score attributes
for multiple taxa simultaneously; process images to illustrate states and species pages;
write text; and parse information from the Web into species pages.

Unlike other interactive identification software, which requires guide builders
to complete an entire character-state matrix before a guide works,
IDnature guides let builders leave many cells in their matrix blank and still produce a working guide.
A resolve feature helps guide builders determine when each guide has enough information
in its cells to distinguish all species from each other.
Consequently, guides with hundreds of species can be built very rapidly.
The IDnature guide to North American ants includes over 700 species and 5000 images.
It was assembled over the course of six months through the cooperation of individuals at over 10 institutions.
The guide to North American trees and shrubs resolves over 1,200 species.
Three University of Georgia undergraduates assembled and illustrated this guide over the course of six months.
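The resolve check can be sketched as a pairwise comparison over a partially filled matrix. The helper name and the toy matrix below are illustrative, not Discover Life's actual implementation.

```python
from itertools import combinations

def unresolved_pairs(matrix):
    """List pairs of taxa that no scored character can tell apart.

    Two taxa are resolvable when some character is scored for both
    and their state sets do not overlap; blank cells resolve nothing,
    so a partially filled matrix can still separate many taxa.
    """
    pairs = []
    for a, b in combinations(matrix, 2):
        shared = set(matrix[a]) & set(matrix[b])
        if not any(matrix[a][c].isdisjoint(matrix[b][c]) for c in shared):
            pairs.append((a, b))
    return pairs

# A toy three-taxon matrix: only the A/C pair is fully resolved,
# telling the builder which cells still need scoring.
toy = {
    "Taxon A": {"color": {"red"}},
    "Taxon B": {"color": {"red", "blue"}},
    "Taxon C": {"color": {"blue"}, "size": {"large"}},
}
```

Running the check on `toy` flags the A/B and B/C pairs, pointing the builder at exactly the taxa that need more data.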

Global Mapper

We developed this tool in partnership with Topozone.com
(see http://www.discoverlife.org/20/m).
It allows users to overlay specimen records from partners' databases onto maps and aerial photographs served
by Topozone.com. Users can map multiple species simultaneously. Each point that is plotted links
back to its source and more information through the record manager described below.

From servers in Massachusetts, Topozone.com makes global maps available to the mapper at 1:1,000,000 scale,
topographical maps to 1:24,000 scale for the United States,
and aerial photographs at 1 pixel per square meter resolution or better for 89% of the United States,
in total over 25 TB of data.

The Global Mapper includes a gazetteer with over 7 million georeferenced places.
Depending on a map's resolution, it automatically converts data between latitude-longitude
(decimal degrees or degree-minute-seconds) and UTM coordinates.
It uses the WGS84/NAD83 datum and can convert data from NAD27.
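The degree-minute-second conversion is straightforward arithmetic, sketched below. Datum shifts such as NAD27 to WGS84, by contrast, require a proper geodesy library and are not shown.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degree-minute-second input to signed decimal degrees.

    Southern and western hemisphere values become negative.
    """
    sign = -1.0 if hemisphere in ("S", "W") else 1.0
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)

def decimal_to_dms(value):
    """Split decimal degrees into whole degrees, minutes, and seconds."""
    magnitude = abs(value)
    d = int(magnitude)
    m = int((magnitude - d) * 60)
    s = (magnitude - d - m / 60.0) * 3600.0
    return d, m, s
```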

Image: link to a demonstration of the Global Mapper

Image Center

High-resolution images are key components of our IDnature guides
and species pages. Eventually we intend to document the life stages
and diagnostic characters of each species with numerous images. Image center software allows our partners
to contribute, manage, and process large numbers of images rapidly. Automated programs
enable each of our servers to process up to 4,000 original high-resolution images daily,
storing them at the five resolutions that we use in display and zooming.
Image titles, captions, and metadata are processed, stored, and retrieved by the record manager
described below.
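Computing the stored sizes for an image is a simple aspect-ratio calculation. The five widths below are illustrative; the prospectus does not list the actual pixel sizes Discover Life uses.

```python
# Illustrative display widths; the real stored resolutions are
# an internal detail not specified in this document.
STORED_WIDTHS = (80, 160, 320, 640, 1280)

def scaled_sizes(width, height, targets=STORED_WIDTHS):
    """Compute the (width, height) of each stored copy of an image,
    preserving aspect ratio and never upscaling the original."""
    sizes = []
    for target in targets:
        w = min(target, width)
        h = max(1, round(height * w / width))
        sizes.append((w, h))
    return sizes
```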

Users can search for and see images using the IDnature guide software, by browsing,
and by specifying unique image identifiers.
They can display images either individually, in sets, or as part of the dynamic species pages.
The software's options allow end users and participating Websites to display images at various sizes.
They can display individual images, order them in a slide show, group them as thumbnails, or list
numerous images together with their associated text, copyright statements, and metadata in a single page.

Record Manager

This tool allows users to add and retrieve data records associated
with images, collection events, and specimen determinations
(see http://www.discoverlife.org/label).
Contributors can import data from spreadsheets and databases into the record manager.
The manager uses simple file formats that speed translating and importing both flat files and relational data.
Alternatively, contributors can enter data manually using Web forms.
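A schema-free flat-file import can be sketched with Python's standard csv module. The field names in the sample are hypothetical, chosen only to show that the provider's first row defines the fields.

```python
import csv
import io

def import_flat_file(text, delimiter="\t"):
    """Read a delimited flat file whose first row names the provider's
    own fields; no fixed schema is imposed on contributors."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

# Hypothetical provider file using the provider's own field names.
sample = (
    "scientific_name\tlatitude\tlongitude\n"
    "Apis mellifera\t33.95\t-83.37\n"
)
records = import_flat_file(sample)
```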

The manager integrates data stored on partners' servers with data stored on Discover Life's.
It gives data providers maximum flexibility and does not force them to use a standard data schema.
Providers name and order the fields that they wish to share.
The manager indexes key fields, such as scientific name and geographic coordinates.
It passes geographic information to the Global Mapper,
which in turn combines data from all contributors into maps.

Every night the manager automatically indexes millions of records from contributing databases,
large and small alike. It updates the databases that it mirrors. These include
Missouri Botanical Garden's Tropicos database, which contains over 2.3 million specimen records.

The manager has an option to print labels with secure,
globally unique identifiers to track specimens with machine readable data matrix symbols.

Fast database indexes glue the above software tools together and enable them
to share data seamlessly. Contributors and system operators update these indexes as they add information.
Automated programs also update them each night, adding information from our partners' on-line databases.
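The nightly merge can be sketched as building a name-keyed index that remembers each record's source. The structure and field name below are illustrative, not the actual index format.

```python
from collections import defaultdict

def build_index(databases):
    """Merge records from many contributing sources into one
    name-keyed index.

    `databases` maps a source name to its list of records; each index
    entry keeps the source and position of a record, so search results
    can link back to the original provider.
    """
    index = defaultdict(list)
    for source, records in databases.items():
        for position, record in enumerate(records):
            index[record["name"]].append((source, position))
    return index
```

Because every entry carries its provenance, a point on a map or a search hit can always be traced to the contributing database.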

The tools that search for and display species information, for example,
draw from various sources, including the indexes used by the
IDnature guides, Global Mapper, Image Center, and Record Manager.
Consequently, individuals who build guides, provide images, and add specimen records
contribute to the search tool and species pages without explicitly doing so.

Discover Life provides Web services to share its tools and content with other Websites.
We discuss this technology below under Partners,
because it is so integral to forging the partnerships
that we need to gather and share information on a million species.

We use non-Web based programs, some of which are automated, to
manage security, data integrity, cross-site back-up of files, and load management across servers.
We also have a growing set of programs that help us find and incorporate information from other Web sites
into our databases and the pages we serve.

Discover Life's main computers are housed at the University of Georgia and Missouri Botanical Garden.
In August 2005, they served over 2.9 million pages and images.
The Website's long-term future does not depend on its current legal or physical homes.
The five-year cooperative agreement between NBII and Polistes states that
we will transfer Discover Life to a government agency or another non-profit should
Polistes be unable to maintain the Website.

We provide tools and content to other sites as Web services.
Any non-commercial Website may customize URLs and forms on their pages to use our services.
The information returned appears within their site's framework
and does not appear to come from our servers.
The navigation bars, for example, help keep users on participating sites
and do not transfer them to our servers.
While we take care to credit content providers,
our Web services can be set up to hide their role from end users.
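A partner page invokes such a service simply by composing a query URL. The endpoint and parameter names below are hypothetical; real service URLs are arranged with each partner and are not specified in this document.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE = "https://biodiversity.example.org/service"

def service_url(species, style="partner"):
    """Build a query URL a partner page could embed so the returned
    content appears within that site's own framework."""
    return BASE + "?" + urlencode({"search": species, "style": style})
```

A `style` parameter of this kind is one way the returned page could adopt the embedding site's navigation and appearance.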

In a Web emblazoned with logos, banner ads, and unwanted commercial pop-ups,
providing behind-the-scene Web services may seem foolishly altruistic.
It is not. We do it to encourage other sites to share our content widely.
We hope, in turn, to partner with them and share their content too.
Web managers and data providers will help us more if we give them content and credit, do not
siphon off their end users, and refrain from cluttering up their sites with extraneous logos
and sponsored links.

We encourage contributors to use our Web services as an efficient way to cross-link all our Websites.
Cross-links increase our combined visibility to end users.
Google and other search engines weigh the number of links to pages
in ranking what to list at the top of their results.
By cooperating and linking our sites together, we will all get more visitors than if we stand alone.

The images on the right of this page showcase our Web services.
They integrate, display, and credit information from numerous Websites and databases.
The red text above each image describes the resource.
The links below go to credit and species pages.
Our servers generate the returned pages dynamically,
querying multiple databases and Web pages for information on the fly.
If necessary, please use your browser's "Back" button to return to this page.