A collaboration between the Sanger Centre and the EBI adds critical annotation to sequence data from the Human Genome
Project

Today researchers at the Sanger Centre, a world-leading DNA sequencing centre, and the European Bioinformatics
Institute (EBI) announced that their joint bioinformatics project called Ensembl has now confirmed the location of the
sequence of more than 35,000 genes on the human genome and has identified a further 150,000 potential gene fragments.
Ensembl is an automatic tool that adds critical information to sequence information as it is submitted to genome
databases, enhancing the usefulness of this data to academia and industry. One effect will be to speed up the process
of identifying new targets for drug development. The new tool can be found at http://www.ensembl.org/.

The public Human Genome Project has already released three quarters of the human genome sequence. Ensembl aims to provide a
comprehensive analysis of this data. Ensembl data and program source code will be available for the free and
unrestricted use of biomedical researchers worldwide. Teams from Sanger and the EBI thereby hope to encourage worldwide
collaborations in "adding value" to genome databases.

Annotating this data is necessary to interpret DNA sequences and identify genes. In simple organisms such as bacteria,
most of the DNA consists of genes, but the roughly 100,000 human genes make up only about two per cent of the DNA
molecule. The function of the other 98 per cent is unknown; in some cases it appears to be "noise". Ensembl contains
automated routines which scan sequences for typical patterns found in genes and mark their positions in the molecule.
Since new sequences arrive in bits and pieces, another of Ensembl's jobs is to plot each sequence onto the "map" of
human chromosomes.
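
As a rough illustration of what "scanning sequences for typical patterns" can mean, the sketch below looks for one very simple gene-like signal: a start codon followed, in the same reading frame, by a stop codon. It is a toy example, not Ensembl's own method; the function name `find_candidate_orfs` and the `min_length` parameter are assumptions made for illustration, and it ignores the reverse strand and the introns that break up human genes.

```python
import re

# Toy gene-like pattern: a start codon (ATG), any number of whole codons,
# then the first in-frame stop codon (TAA, TAG or TGA).
ORF_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")


def find_candidate_orfs(sequence, min_length=9):
    """Return (start, end) positions of simple open reading frames.

    Positions are 0-based, with `end` exclusive, on the forward strand only.
    `min_length` (hypothetical default) filters out very short matches.
    """
    sequence = sequence.upper()
    hits = []
    for match in ORF_PATTERN.finditer(sequence):
        if match.end() - match.start() >= min_length:
            hits.append((match.start(), match.end()))
    return hits


if __name__ == "__main__":
    # ATG at position 2, then AAA, TTT and the stop codon TAG.
    print(find_candidate_orfs("CCATGAAATTTTAGGG"))
```

Real gene-finding programs combine many such signals with statistical models and comparisons against known genes and proteins; the sketch only shows the basic idea of marking where a pattern occurs in a sequence.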

A number of new features are planned for the database in the near future, including integrating information about
variant forms of genes called SNPs - many of which have been linked to genetic diseases. The SNP Consortium Ltd, a
collaborative effort to create a freely available genome wide map of genetic markers, has recently announced the
release of a total of 100,000 SNPs, 45% of which have been contributed by the Sanger Centre.

Additional Material:

Notes to editors:

1. The Sanger Centre, which receives the majority of its funding from the Wellcome Trust, is one of the world's
leading genome sequencing centres. Both the Sanger Centre and the Wellcome Trust have been at the forefront of efforts
to keep sequence data in the public domain. The Sanger Centre employs about 500 people in the purpose-built campus at
Hinxton. The Centre is a leading partner in the Human Genome Project and also contributes to international projects to
sequence the genomes of disease-causing organisms. http://www.sanger.ac.uk/

2. The Wellcome Trust is the world's largest medical research charity with an annual spend of some £600
million in the current financial year 1999/2000. The Wellcome Trust supports more than 3000 researchers at 300
locations in 30 different countries, laying the foundations for the healthcare advances of the 21st century and helping
to maintain the UK's reputation as one of the world's leading scientific nations. As well as funding major initiatives
in the public understanding of science, the Wellcome Trust is the country's leading supporter of research into the
history of medicine. http://www.wellcome.ac.uk/

3. The EBI is an Outstation of the European Molecular Biology Laboratory (EMBL); it maintains some of the world's
largest databases of DNA and protein sequence data, develops tools to help biologists use them, and is the home of
research groups who are looking for the biological significance of this data. The EBI is
also one of the world's most important centres for bioinformatics training. EMBL is a basic research institute funded
by 16 member states, including most of the EU, Switzerland and Israel. Research at EMBL is conducted by approximately
80 independent groups covering the spectrum of molecular biology. The Laboratory has five units: the main Laboratory in
Heidelberg, Outstations in Hinxton (the European Bioinformatics Institute), Grenoble (on the campus of the ILL and
ESRF), Hamburg (on the DESY site), and an external research programme in Monterotondo, Italy (sharing a campus with
EMMA and the CNRS). The Laboratory provides essential services to the European scientific community, welcomes a large
number of scientific visitors each year, and has an active international PhD programme.