Genome doesn't start with 'G'

Study of the largest and last chromosome of the human genome published

The Wellcome Trust Sanger Institute and colleagues in the UK and USA today publish the longest and final chapter in
what has been called The Book of Life - the text and study of our human genetic material. Published in Nature, the
report of the sequence of human chromosome 1 is the final chromosome analysis from the Human Genome Project.

The sequence has been used to identify more than 1000 new genes and is expected to help researchers find novel
diagnostics and treatments for many diseases. In the past year alone, genes involved in a dozen diseases, including
cancer and neurological disease, have been identified using the freely available chromosome 1 sequence and DNA
resources.

If it were typed out, chromosome 1's huge repository of genetic information would cover 60,000 pages. It is home to
more than 3000 genes and is linked to more than 350 known diseases, including conditions as varied as cancer,
Parkinson's and Alzheimer's disease, high cholesterol and porphyria (thought to have affected King George III of England).

"The sequence we have generated, like that produced by our collaborators throughout the Human
Genome Project, has driven biomedical discovery," said Dr Simon Gregory, Assistant Professor at Duke
University, who led the project while at the Sanger Institute. "This moment, the publication of the
sequence from the last and largest human chromosome, completes the story of the HGP and marks the growing wave of
biological and medical research founded on the human genome sequence".

"Chromosome 1 contains fascinating stories of chromosome biology, of our evolution, and our health,
and it's inspiring to have played a part in a programme that will have so much power to understand the essence of human
biology."

Human chromosomes are numbered from the largest (chromosome 1) to the smallest (chromosomes 22 and 21). Each is
composed of many millions of genetic letters or bases, called A, C, T and G. The first genetic letter of the chromosome 1
sequence, and hence the beginning of our genome, is "C".

The sequence of human chromosome 1 is 223,569,564 bases of genetic code - around 8% of our genome - and contains about
twice as many genes as the average chromosome. "The size of chromosome 1 means its landscape spans
extremes in gene content, with stretches of millions of bases of gene-rich oases and gene-poor deserts",
continued Dr Gregory, "as well as regions of the chromosome that are copied during early and late
phases of cell division."

But the sequence must be mined to be of benefit: for example, differences in the sequence between individuals will help
develop an understanding of diseases associated with this chromosome. Almost 4500 single-letter changes in the genetic
code (called SNPs) were identified that could lead to changes in protein activity. In addition, 90 SNPs were found that
would result in a shortened - and possibly inactive - protein. Although some 15 SNPs are already known to be
associated with protection from malaria or predisposition to porphyria, the function of the newly located SNPs is
yet to be discovered.

"A catalyst for our gene discoveries", is how Dr Brian Schutte, Associate Professor of
Pediatrics at the University of Iowa, describes the sequence of chromosome 1. "Prior to the
sequencing efforts, we managed to localize the gene for a rare human orofacial clefting disease to a region on
chromosome 1. But, we had no clue which genes lay in the region".

"Our collaboration with the Sanger led to much more rapid discovery of the gene involved and now
we, and others, have found that normal genetic variation in the same gene contributes 12% risk for the common form of
cleft lip and palate. Our experience demonstrates two important issues. Firstly, gene discoveries in rare diseases can
contribute directly to the understanding of common diseases. Secondly, sequencing efforts accelerate gene discovery of
not only rare genetic disorders, but also common diseases that place the greatest burden on our healthcare
system."

The finished sequence of chromosome 1 enabled the team to bring together chromosome-wide information associated with
genetic variation from projects such as the HapMap - a leading international study of human genetic variation. Our
chromosome pairs 'recombine' with each other, so that regions inherited from our two parents are shuffled when passed
on to our children.

Shuffling the deck tends not to disrupt genes. Most of the recombination found on chromosome 1 occurs at a few hotspots
and more than 80% of hotspots are in only 15% of the sequence. Fine scale analyses have shown that recombination tends
to be near to genes but outside the actual gene structures themselves.

Dr David Bentley, Chief Scientist at Solexa and former Head of Human Genetics at
the Wellcome Trust Sanger Institute, said "The sequence of chromosome 1, published today, is part
of an exciting and near-complete reference volume of our genome. Freely available in the public domain, researchers all
over the world are already adding new information to it, enriching the picture of what it is to be human, for the
benefit of others in the future."

Careful analysis also showed how our genome has undergone recent evolutionary selection. The team looked at
correlations between the HapMap data and the annotated chromosome 1 sequence to investigate the variation between three
human population groups with ancestry in Europe, Africa or Asia.

Genome sequence varies from person to person. New insights are being gained all the time. We now find that genetic
differences may be prevalent in one population but rare in, or absent from, another. Some of these, like the gain or
loss of large regions, have been recognized only in the past few years as a result of the Human Genome Project.

For example, as well as the fine-grain variation represented by SNPs, the team localized genes to a number of larger
'chunks' of DNA that differed between individuals. These chunks are as large as 1 million bases. Some of the regions
have been previously implicated in how we vary in our interaction with the environment around us. For example,
variations in the region around the GSTM1 gene can alter our susceptibility to cancer-causing chemicals or toxins and
influence the toxicity or efficacy of certain drugs.

Chromosome 1 is particularly susceptible to rearrangement and it is thought that disruption to genes within these
rearrangements plays a role in several cancers and in mental retardation. The high-quality sequence has already helped
researchers around the world to home in on genes that affect a range of cancers.

Rearrangements, deletions and duplications can tell us about our evolution and our diseases. More than 5% of the
chromosome is duplicated and can provide material for the evolution of new functions. In one example, the partial
duplication of a gene called NOTCH2 has resulted in a novel protein that is known to be functional in humans and has
been implicated in disease. Meanwhile, deletion of regions of chromosome 1p is found in 1/5000 to 1/10,000 live births
and may contribute to mental retardation syndromes.

"The Human Genome Project has provided us with a wealth of information about our genes and their
many variations," said Dr Mark Walport, Director of the Wellcome Trust. "It is a vital
resource for answering important questions about health and disease. We have been a committed partner in the project
since 1992 both in supporting the research and ensuring the results are freely accessible to all".

"The completion of the project, with the publication of the Chromosome 1 sequence, is a monumental
achievement that will benefit the research community for years to come and is a credit to all involved."

The human genome is essential to understanding disease, and the sequence of chromosome 1, together with the sequences
produced and analysed throughout the Human Genome Project, will continue to be a foundation to help improve human
health.

When seeking funding from the Wellcome Trust for their efforts to sequence the human genome in 1995, the Sanger
Institute management wrote: "Sequencing is not an end in itself: it is not the solution of the
genome, but merely the baseline information that allows the real aim - the biology - to proceed faster". The
chromosome 1 project stands as a reflection of that view. Genome sequence powers research to help us understand the
biology of our genome and the medical consequences of sequence variation.

Notes to Editors

Chromosome 1

The finished sequence comprises 223.6 million base-pairs (Mbp), determined to an accuracy of >99.99%. The sequence
of chromosome 1 published today includes 99.4% of the gene coding (euchromatin) regions of the chromosome amenable to
sequencing with current technologies. Gaps within the sequence (most are due to repetitive sequence) comprise about 1.3
Mbp. The total size of chromosome 1 is estimated to be 237.6 Mbp, which includes the centromere and a large non-coding
region (heterochromatin) in the centre of the chromosome.
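
The figures above can be cross-checked with a little arithmetic. The short Python sketch below is illustrative only and is not part of the published analysis. It assumes that the sequenceable (euchromatic) portion of the chromosome is simply the finished sequence plus the remaining gaps, and that whatever is left of the estimated total is the centromere and heterochromatin. Under those assumptions the finished sequence works out at roughly the quoted 99.4%. The constant and function names are hypothetical.

```python
# Figures quoted in the notes above, in millions of base pairs (Mbp).
FINISHED_MBP = 223.6  # finished sequence
GAPS_MBP = 1.3        # approximate total size of the remaining gaps
TOTAL_MBP = 237.6     # estimated total size of chromosome 1


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Assumption: sequenceable (euchromatic) portion = finished sequence + gaps.
euchromatic_mbp = FINISHED_MBP + GAPS_MBP
coverage = percent(FINISHED_MBP, euchromatic_mbp)

# Assumption: the rest of the estimated total is centromere and heterochromatin.
remainder_mbp = TOTAL_MBP - euchromatic_mbp

print(f"Finished share of sequenceable regions: {coverage:.1f}%")
print(f"Centromere and heterochromatin (approx.): {remainder_mbp:.1f} Mbp")
```

Because the gap size is itself an estimate, the result should be read as a consistency check on the rounded figures rather than an exact measurement.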

Sequencing was carried out at the Wellcome Trust Sanger Institute; the University of Washington Genome Center
contributed 13% of the sequence finishing. Analysis of the chromosome content was carried out by the Wellcome Trust
Sanger Institute.

The Human Genome Project

Throughout the Human Genome Project, sequence data have been released freely to speed biological and biomedical
research. For each of our 24 human chromosomes, a peer-reviewed report has been published: the publications describe
the attributes of the finished sequence and analysis of the gene content, variation in sequence and other features. The
sequence of chromosome 1 is the final report in this series.

The Wellcome Trust Sanger Institute

The Wellcome Trust Sanger Institute, which receives the majority of its funding from the Wellcome Trust, was founded in 1992. The Institute is responsible for the completion of the sequence of approximately one-third of the human genome as well as genomes of model organisms and more than 90 pathogen genomes. In October 2006, new funding was awarded by the Wellcome Trust to exploit the wealth of genome data now available to answer important questions about health and disease.

The Wellcome Trust and Its Founder

The Wellcome Trust is the most diverse biomedical research charity in the world, spending about £450 million every
year both in the UK and internationally to support and promote research that will improve the health of humans and
animals. The Trust was established under the will of Sir Henry Wellcome, and is funded from a private endowment,
which is managed with long-term stability and growth in mind.