Company Update: Blue Gene Tackles Disease and Migration

Carol Potera

IBM (www.ibm.com) has provided innovative solutions for businesses for more than 80 years. Now the company’s expertise, which helped banks and other businesses manage large volumes of information, is being applied to biological data. IBM researchers started analyzing raw DNA sequences about 14 years ago. “That was before the Human Genome Project when the quantity of data was orders of magnitude less than it is now,” says structural biologist Ajay Royyuru, Ph.D., head of the computational biology center at IBM’s Thomas J. Watson Research Center in Yorktown Heights, NY.

Back then, computer scientists and biologists fantasized about how increasing computer power could dramatically expand the spectrum of biological problems that could be studied. “There were plenty of questions in biology and few answers,” Dr. Royyuru notes.

He and his coworkers convinced company leaders that the new field of computational biology could benefit the company’s business. In 1992, Andrea Califano and Isidore Rigoutsos, Ph.D., started the Computational Biology Center. Among the center’s various projects, computer scientists and biologists worked together to design Blue Gene, one of the world’s fastest supercomputers, which performs calculations at petaFLOP speeds.

On July 1, 2005, the Brain Mind Institute at the École Polytechnique Fédérale de Lausanne in Switzerland and IBM launched the Blue Brain Project. The goal is to build a detailed, biologically accurate model of the brain using the Blue Gene supercomputer. The first phase consists of building a software replica of a column of the neocortex, the largest and most complex part of the human brain, which controls language, memory, learning, and complex thought. “The model is based on experimental data about the organization and architecture of the brain,” says Dr. Royyuru. The computer will generate a 3-D structure and recreate the high-speed electrochemical interactions of the brain.

Relatively little is known about how the brain works. In the brain, neurons self-organize into columns in the cortex that specialize in higher-order functions. By building cortical columns, the researchers hope to learn how neurons connect and fire to carry out learning and specialized functions. The brain simulations could answer questions about thought, perception, and memory processes. Scientists also hope to learn about malfunctioning microcircuits associated with neurological and psychiatric disorders such as autism, schizophrenia, and depression.

“Blue Brain is a good example of how computational biology can dramatically advance our understanding of physiology and disease,” Dr. Royyuru says.

The National Geographic Society teamed up with IBM to launch a groundbreaking research project that will trace the migratory history of humans. DNA samples collected from hundreds of thousands of people worldwide, including indigenous groups, are being analyzed on IBM computers to map how the earth was populated. When the project started in April 2005, the public was invited to participate and contribute to the database by buying a $99.95 kit for DNA cheek swabbing. “As a skeptical scientist, I wondered who would pay that much,” says Dr. Royyuru. Surprisingly, 170,000 kits have been purchased worldwide.

Tracking Ancestors

The five-year project will use genetics to fill in gaps in the story of human history. Scientists at 10 centers around the world are collecting DNA samples from indigenous populations. Scientists from IBM’s Computational Biology Center will search for new patterns and connections within the genetic data. The genetic information will reveal missing details about global human migration and provide new insights into the interconnections of the human species. The resulting public database will house one of the largest collections of human population genetic information ever assembled as a resource for geneticists, historians, and anthropologists. The genetic data, however, are not disease-oriented or clinically relevant. “The genetic markers are confined to the Y chromosome and mitochondrial DNA and give information about ancestry,” says Dr. Royyuru.

In May 2006, IBM joined with WHO, CDC, and other public health institutions worldwide to control the spread of infectious diseases through the Global Pandemic Initiative. IBM will contribute open-source software that lets communities share information about disease outbreaks and predict how diseases will spread. An epidemiological modeling framework, called Spatio-Temporal Epidemiological Modeller (STEM), will help public health experts and government planners simulate the spread of a disease based on spatial and temporal factors, such as weather conditions, bird migration routes, and travel patterns.

For instance, if the H5N1 strain of bird flu infects commercial poultry flocks, as recently occurred in England, STEM could simulate what-if scenarios, such as the impact of placing a community under quarantine or halting all air travel. “We hope that STEM will become a baseline tool,” says Dr. Royyuru.
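To give a flavor of what such a what-if comparison involves, here is a minimal sketch in Python. It is not STEM's code or interface. It assumes a textbook susceptible-infected-recovered (SIR) model with daily time steps, and every name and number in it (`simulate_sir`, `quarantine_factor`, the transmission and recovery rates, the population size) is hypothetical and chosen only for illustration. A quarantine is modeled crudely as a cut in the transmission rate from a given day onward.

```python
def simulate_sir(population, initial_infected, beta, gamma, days,
                 quarantine_day=None, quarantine_factor=1.0):
    """Run a discrete-time SIR model with one step per day.

    beta  : transmission rate per day (illustrative value, not fitted to data)
    gamma : recovery rate per day (illustrative value, not fitted to data)
    quarantine_day / quarantine_factor : from that day on, beta is multiplied
        by the factor (a hypothetical, simplified stand-in for a quarantine).
    Returns a list of (susceptible, infected, recovered), starting at day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for day in range(days):
        rate = beta
        if quarantine_day is not None and day >= quarantine_day:
            rate = beta * quarantine_factor
        # New infections cannot exceed the remaining susceptible pool.
        new_infections = min(s, rate * s * i / population)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest number of simultaneously infected individuals in a run."""
    return max(i for _, i, _ in history)


if __name__ == "__main__":
    # Scenario 1: no intervention. Scenario 2: quarantine from day 20
    # that cuts transmission to 30% of its original value.
    baseline = simulate_sir(100_000, 10, beta=0.4, gamma=0.1, days=200)
    quarantined = simulate_sir(100_000, 10, beta=0.4, gamma=0.1, days=200,
                               quarantine_day=20, quarantine_factor=0.3)
    print("Peak infected, no intervention:", round(peak_infected(baseline)))
    print("Peak infected, with quarantine:", round(peak_infected(quarantined)))
```

A framework like the one described would layer geography, weather, and travel patterns on top of this kind of core calculation; the sketch treats the whole population as a single well-mixed group.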

Project Checkmate

A related program, called Project Checkmate, teams IBM researchers with those at The Scripps Research Institute. The Blue Gene supercomputer will model and predict how the influenza virus, including the infamous 1918 strain, mutates over time. “If you know the variations before the virus mutates, you can prepare with vaccines and therapeutics ahead of time,” Dr. Royyuru says. Likely viral variants can be analyzed to determine their potency, whether they escape immune surveillance, and which drugs or vaccines counteract them. “We can do computational modeling of variants on a much larger scale than you can do in a laboratory,” Dr. Royyuru says.