Human Genome Project

Human Genome Project, international scientific effort to map all of the genes on the 23 pairs of human chromosomes and to sequence the 3.1 billion DNA base pairs that make up the chromosomes (see nucleic acid). Begun in 1990 with the goal of enabling scientists to understand the basis of genetic diseases and to gain insight into human evolution, the project was largely completed in 2000 when 85% of the human genome was decoded, and ended in 2003 with 99% decoded; detailed analyses of all the chromosome pairs were published by 2006. In the process, scientists identified genes for cystic fibrosis, neurofibromatosis, Huntington's disease, and an inherited form of breast cancer. In addition, the project decoded the genomes of the bacterium E. coli, a fruit fly, and a nematode worm (see phylum Nematoda), in order to study genetic similarities among species, and a mouse genome was also decoded.

The Human Genome Project involved laboratories in the United States, France, Great Britain, Germany, and Japan. It was financed in the United States by the National Institutes of Health and by the Department of Energy and in Great Britain by the Wellcome Trust of London. A comparable project using new DNA-sequencing machines was begun as a private industry venture in the United States in 1998, with a stated goal of completing the mapping of the genome in three years.

Early in 2001 scientists from both teams jointly announced the "completion" of the mapping of the human genome, indicating that they had identified an estimated 30,000 genes (instead of the expected 100,000), constituting just 1% of the total human DNA. Subsequent comparison of the two teams' data has indicated that, because of differences in the genes identified by the teams, there may in fact be as many as 40,000 human genes. A subsequent, more refined estimate (2004) based on additional work on the genome was that there are between 20,000 and 25,000 genes. Work continues on further refining the sequencing of the genes on the chromosomes, eliminating the remaining gaps in the genome map, and identifying the extent of variation in the human genome. In 2007 the first sequences of human individuals (James D. Watson and J. Craig Venter, who led the public and private human genome sequencing efforts, respectively) were released; Venter's genome was the first full (diploid) individual human genome. The NIH's National Center for Biotechnology Information maintains GenBank, a database of publicly available genetic sequences from the genomes of plants and animals, including some extinct species.

Bibliography

See studies by J. Sulston and G. Ferry (2003) and J. Shreeve (2004).

Human Genome Project

An organized international scientific endeavor to determine the complete structure of the human genetic material deoxyribonucleic acid (DNA) and understand its function. See Human genetics

History

The idea for the Human Genome Project (HGP) first arose in the mid-1980s. Several scientific groups met to discuss the feasibility, and various reports were published. The most influential report was prepared by the National Research Council (NRC) of the U.S. National Academy of Sciences. It proposed a detailed scientific strategy that persuaded many scientists that the project was possible. October 1, 1990, was declared the official start time for the HGP in the United States; significant funding had become available and research groups were starting their work. Major contributions to the HGP have been made by the United Kingdom, France, Japan, and Germany, with smaller contributions from many other quarters. Coordination among the countries has been informal, relying largely on scientist-to-scientist collaborations, but has proved to be very effective.

Scientific strategy

First, markers are placed on the chromosomes by genetic mapping, that is, observing how the markers are inherited in families. Second, a physical map is created from overlapping cloned pieces of the DNA. Third, the sequence of each piece is determined, and the sequences are lined up by computer until a continuous sequence along the whole chromosome is obtained. The second and third steps can be reversed or done in parallel. As the pieces are sequenced, the sequences at the overlapping ends can be used to help order the pieces. If the sequencing is done before the pieces are mapped, the process is called whole-genome shotgun sequencing. See Deoxyribonucleic acid (DNA), Gene

Because the human genome is so big (human DNA consists of about 3 billion nucleotides connected end to end in a linear array), it was necessary to break the task down into manageable chunks (see illustration).

Steps in analyzing a genome
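
The computer step described above, in which sequenced pieces are lined up by their overlapping ends, can be illustrated with a small sketch. This is a toy greedy merge written for illustration only, not the method actually used by the HGP; the function names, the `min_len` threshold, and the sample fragments are all assumptions, and it ignores sequencing errors, repeats, and reverse-complement strands.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the pair of fragments with the longest end overlap.

    Hypothetical helper: returns a list of contigs (one contig if every
    fragment could be chained together).
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps; leave the pieces as separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Made-up fragments whose ends overlap by three bases each.
contigs = greedy_assemble(["ATGCGT", "CGTACC", "ACCTGA"])
```

Greedy merging is chosen here only because it is short; on real genomes the large fraction of repeated sequence makes simple end-overlap merging ambiguous, which is one reason a physical map is useful for ordering the pieces.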

Model organisms

An important element of the overall strategy was to include the study of model organisms in the HGP. There were two reasons for this: (1) Simpler organisms provide good practice material. (2) Comparisons between model organisms and humans yield very valuable scientific information. The HGP initially adopted five model organisms to have their DNA sequenced: the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, the roundworm Caenorhabditis elegans, the fruit fly Drosophila melanogaster, and the laboratory mouse Mus musculus. The mouse genome is just as complex as the human genome, but the mouse offers the advantages that it can be bred and other experiments can be conducted that are not possible on humans.

Findings

How many genes there are is probably the most common question regarding the human genome. The first two human chromosomes to be sequenced, chromosomes 22 and 21, provided some interesting observations. Although the two chromosomes are approximately the same length, chromosome 22 has more than twice as many genes as chromosome 21. Extrapolation of the number of genes found on chromosomes 22 and 21 led to the estimate that the whole human genome contains about 36,000 genes. This was quite a surprise because previous estimates were 80,000 to 100,000 genes. Preliminary examination of the draft sequence of the entire human genome confirmed that the number of genes is much lower than previously thought. This does not necessarily mean that the human genome is less complex, because many genes can produce more than one protein by alternative splicing of their exons (protein-encoding regions of the gene) when the RNA transcript is processed, before translation into the constituents of proteins. See Chromosome, Genetic code

Another fascinating feature of the human genome sequence is the large fraction that consists of repeated sequence elements; 40% of chromosome 21 and 42% of chromosome 22 are composed of repeats. The function of any of these repeats is not yet known, but elucidating their distribution in the genome may help to reveal it.

Another statistic that is of interest is the base composition, the percent of the DNA that is made of guanine-cytosine (GC) base pairs as opposed to adenine-thymine (AT) base pairs. Chromosome 22 has a 48% GC content, whereas chromosome 21 has 41% and the average over the genome is 42%. Again, the significance of this is not yet known, but higher GC content seems to correlate with higher gene density.
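
As a rough illustration of the base-composition statistic described above, GC content can be computed by counting bases. This is a minimal sketch; the function name and the choice to skip ambiguous bases such as `N` are assumptions, not part of the source.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of `seq`."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Made-up example: four of the eight bases are G or C.
example = gc_content("GGCCAATT")
```

Applied to a whole chromosome sequence, the same count would yield figures of the kind quoted above (for example, 48% for chromosome 22).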

The type of analysis performed initially on chromosomes 21 and 22 has been extended to the entire human sequence. However, a full understanding will take decades to achieve.

Future research

With the complete sets of genes of organisms available, how genes are turned on and off and how genes interact with each other can be studied. What the different genes do and how they affect human health must also be learned. Consequently, much effort is now directed to studying the regulation of gene expression and annotating the sequence with useful biological information about function.

Another key challenge is to understand how DNA function varies with differences in the DNA sequence. Each human being has a unique DNA sequence which differs from that of any other human being by about 0.1%, regardless of ethnic origin. Yet this small difference affects characteristics such as how humans look and to what diseases they are susceptible. The differences also provide clues about the evolution of the human species and the historical migration patterns of people across the world. See Molecular biology, Nucleic acid
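
To put the 0.1% figure in perspective, the arithmetic can be sketched as follows, using the round numbers given in this entry (about 3 billion nucleotides, about 0.1% difference). The variable and function names are illustrative assumptions, and the comparison helper presumes two already-aligned sequences of equal length.

```python
GENOME_LENGTH = 3_000_000_000   # about 3 billion nucleotides (round figure from the text)
DIFFERENCE_PER_THOUSAND = 1     # about 0.1%, i.e. 1 position in 1,000

# Expected number of positions at which two people differ: about 3 million.
expected_differences = GENOME_LENGTH * DIFFERENCE_PER_THOUSAND // 1000


def percent_difference(a: str, b: str) -> float:
    """Percentage of positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


# Made-up sequences differing at one position out of ten.
toy = percent_difference("ACGTACGTAC", "ACGTACGTAA")
```

So a difference that sounds small in percentage terms still corresponds to roughly three million positions in a genome of this size.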

Human Genome Project

a multi-national project to map the human GENOME (i.e. every gene on every human chromosome). Initiated in 1990, the project aims to complete its mapping in the early decades of the present century. The traditional issues in nature and nurture – see NATURE-NURTURE DEBATE – arise in relation to the Project. There are those who expect that, from the initial identification of genes, links between specific genes and bodily functions, diseases, etc. will also be widely established, providing greatly enhanced understanding and a capacity to intervene. Others point to the limitations and dangers likely to be associated with such a reductionistic account. The likelihood is that while some matching of genes with functions will be tight (e.g. as is already clear for some genetically inherited diseases), many other areas will continue to require explanation beyond such reductionistic accounts.

Human Genome Project

A bioinformatics project that has identified the 30,000 genes in human DNA. Coordinated by the U.S. Department of Energy and the National Institutes of Health, the U.S. Human Genome Project started in 1990 and released its findings in February 2001 along with findings from a separate project by Celera Genomics Group. There are similar projects in other countries as well. The purpose is to store the three billion chemical base pairs (the DNA sequence) derived from these analyses in databases for use in biomedical research. See micro array.

The End Goal

The goal of the Human Genome Project is to determine the relationships between DNA makeup and human traits and predispositions. Although sequencing has been extremely expensive, costs are approaching US$1,000 per human genome, enabling "personalized genomics" to dramatically alter the course of medicine. See Personal Genome Project.

A Human Component Dictionary

This information is not a blueprint of the human being; rather, it is a dictionary of components. It was once believed that each gene made only one protein, but it is now believed that each gene creates numerous proteins, although this information is expected to take years to determine. Part of the U.S. government project is to study the ethical and legal impact that this information will have on society. An abundance of information can be found at www.genome.gov.

NHGRI led the NIH contribution to the International Human Genome Project -- an international research effort to determine the location of all human genes and to read the entire set of genetic instructions encoded in human DNA -- which has as its primary goal the sequencing of the human genome.

