Caltech and the Human Genome Project

PASADENA - Two of the key inventions that made possible the monumental task of sequencing the human genome came from the California Institute of Technology. Both were especially important because they sped up the sequencing of the 3 billion DNA base pairs that compose the human genome.

The first landmark invention was a method for the automated sequencing of DNA, developed by Leroy Hood, then a professor of biology at Caltech, and his colleagues Mike Hunkapiller, Tim Hunkapiller, Charles Connell, and Lloyd Smith. Before their invention, figuring out the sequence of a segment of DNA had been exceedingly difficult and laborious. Because the process was so slow and required the work of highly skilled technicians, it was clear to most scientists in the mid-1980s that it would not be possible to sequence entire genomes by manual methods.

The method devised by Hood and his colleagues changed that. They developed a novel chemistry that permitted a machine to detect DNA molecules using fluorescent light. This method revolutionized DNA sequencing, ultimately making it possible to launch the Human Genome Project. Coupled with some recent advances, the method remained at the core of the just-completed phase of sequencing the human genome.

A second key invention for the genome project was developed at Caltech by Professor Melvin Simon, chair of Caltech's biology division, and his coworker Hiroaki Shizuya. They recognized that a critical part of sequencing would be preparing large DNA segments for the process. To accomplish this, they invented "bacterial artificial chromosomes" (BACs), which permit scientists to use bacteria as micromachines to accurately replicate pieces of human DNA that are over 100,000 base pairs in length. These BACs provided the major input DNA for both the public genome project and Celera.

The Simon research group was also a major contributor to the mapping and sequencing of chromosome 22, a substantial segment of the human genome; that work was completed in 1999. The group is presently using genomic information to create an "onco-chip," which will give researchers convenient experimental access to a miniature array containing hundreds of BACs, each carrying a gene whose mutation can cause human cancer.

Caltech researchers, both current and past, have also been important in promoting the Human Genome Project itself, a project that originally met with scientific skepticism when it was born 12 years ago, particularly when the goal of a fully sequenced human genome by the year 2003 was announced.

That skepticism has long since been replaced by wholesale enthusiasm from the scientific community. David Baltimore, president of Caltech and a Nobel laureate for his work on the genes of viruses, was a highly influential supporter of the Human Genome Project at its inception. Baltimore, then a professor of biology at MIT, was one of an international cadre of farsighted biologists that also included Hood and Simon. They shared a vision of the future in which knowledge of every gene that composes the human genome would be available to any scientist in the world at the click of a computer key.

Caltech professors Norman Davidson, Barbara Wold, and Steve Koonin have helped shape this unprecedented and complex project by serving in national scientific advisory roles to the genome project in the intervening years. Baltimore also chaired the National Institutes of Health (NIH) meeting at which the Human Genome Project was launched.

Koonin, who is Caltech's provost, was chair of the JASON study of 1997, which advised the scientific community that quality standards could be relaxed so that a "rough draft" of the human genome could be made years earlier and still be of great utility. This, in fact, was the approach that prevailed.

The Human Genome Project is unique among scientific projects for having set aside, from the beginning, research support for studies of the ethical, legal, and social implications of the new knowledge of human genes that would result. In Caltech's Division of the Humanities and Social Sciences, Professor Daniel Kevles has examined these ethical issues in his book The Code of Codes: Scientific and Social Issues in the Human Genome Project, which he coedited in 1992 with Leroy Hood.

Caltech scientists are also actively engaged in the future of genomics, which is the use of the newly obtained DNA sequences to discover and understand the function of genes in normal biology and in disease and disease susceptibility. This includes devising new ways to extract and manipulate information from the human genome sequence and from recently completed genome sequences of important experimental organisms used by scientists in the laboratory, such as the fruit fly, mustard weed, and yeast.

In one new project, Caltech recently became the home site for the international genome database for a key experimental organism called C. elegans, under the direction of Caltech Professor Paul Sternberg. This tiny worm has about 19,000 different genes, many of which correspond to related genes in humans. The shared origin and functional relationships between the genes of worm and man (and fruit fly and all other animals) let scientists learn much about how human genes work by studying these small creatures in the laboratory.

The Worm Genome Database, called Wormbase, is undertaking the major task of collecting, and making computer-accessible, key information about every worm gene, its DNA sequence, and its function in the animal. This will require that new methods in automated data-mining and computing be brought together and fused with expert knowledge in biology, and then made accessible by computer to anyone interested.

Because of the relatedness of many genes and their functions among all animals, this information about the worm and its genome will be important for understanding human genes, and vice versa.

Another major genomics effort at Caltech is aimed at understanding how groups of genes work to direct development from a fertilized egg to an adult organism, and how these groups of genes change their action or fail in aging, cancer, or degenerative disease. The genomics approach to these problems involves the application of new computational methods and automated experimental technologies.

To do this, Barbara Wold, together with Mel Simon, Professor Stephen Quake from Caltech's Division of Engineering and Applied Science, and Dr. Eric Mjolsness of NASA's Jet Propulsion Laboratory, has established the L. K. Whittier/Caltech Gene Expression Center, funded by the Whittier Foundation. The new work in genomics is also fueling new interdisciplinary programs at Caltech in the computational modeling of cells and organisms.