September 5, 2012

University of Washington genome scientist Dr. John Stamatoyannopoulos studies the control circuitry of the human genome. Credit: Clare McLean

Researchers at the University of Washington have determined that the majority of genetic changes associated with more than 400 common diseases and clinical traits affect the genome's regulatory circuitry. These are the regions of DNA that contain instructions dictating when and where genes are switched on or off. Most of these changes affect circuits that are active during early human development, when body tissues are most vulnerable.

By creating extensive blueprints of the control circuitry, the research also exposed previously hidden connections between different diseases. These connections may explain common clinical features, as well as offer a new approach for pinpointing the specific types of cells and tissues that either cause or are most affected by a particular disease. The findings provide a major paradigm shift for understanding the genetic causes of disease, and open new avenues for development of diagnostics and treatments. The findings appear in the Sept. 5 online issue of Science.

"Genes occupy only a tiny fraction of the genome, and most efforts to map the genetic causes of disease were frustrated by signals that pointed away from genes. Now we know that these efforts were not in vain, and that the signals were in fact pointing to the genome's 'operating system'—the instructions for which are hidden in millions of locations around the genome," said Dr. John A. Stamatoyannopoulos, associate professor of genome sciences and medicine at the UW. "The findings provide a new lens through which to view the role of genetics and genome function in disease."

The human genome's control circuitry is encoded in millions of regulatory regions—short DNA sequences that are scattered throughout the 98 percent of the genome that does not specify the protein product of a gene. Specialized proteins, called regulatory factors, recognize specific DNA sequences in these regulatory regions, thereby creating switches that turn genes on and off. In many cases, these switches are located far away from the genes that they control. These distances have made it difficult to determine the relationship between specific switches and genes.

The researchers used a special molecular probe called a nuclease to detect all of the regulatory regions active in each cell type they studied. The specific nuclease they used—called DNase I—snips the genome where regulatory factors are bound to DNA. By treating cells with DNase I and analyzing the pattern of snipped DNA sequences using massively parallel sequencing technology and high-performance computers, the researchers were able to create comprehensive maps of all the regulatory DNA in many different types of cells. These maps were then analyzed with advanced software algorithms to sort through the data and expose previously hidden connections between disease-associated genetic variation and specific regulatory regions.

The regulatory mapping and analysis were conducted on 349 cell and tissue samples. These included samples from all major organs as well as 233 tissue samples from different stages of early human development. In total, nearly 4 million distinct regulatory regions were discovered, though only about 200,000 of these were 'on' in any particular cell type.

To make a connection with common diseases and clinical traits, the researchers analyzed genetic variants that had been strongly associated with diseases and traits through so-called genome-wide association studies, which compare genetic information between groups of people with or without a particular disease or trait. During the past decade, hundreds of genome-wide association studies involving hundreds of thousands of patients worldwide have been performed for over 400 diseases and traits. Nearly 95 percent of the time, these studies flagged genetic variants that were located outside the protein-coding regions of genes. Comparison of these data with the regulatory DNA blueprints yielded several key findings:

76 percent of disease-associated variants in non-gene regions are actually located within or are tightly linked to regulatory DNA. This suggests that many diseases result from changes in when, where, and how genes are turned on rather than changes to the gene itself.

88 percent of the regulatory regions that contained disease-associated DNA variants were active in early human (fetal) development. Because many of these variants are associated with common diseases that occur in adults, the finding indicates that factors influencing the genome's regulatory circuitry early in development may impact the risk of developing particular diseases later in life.

DNA changes associated with specific diseases tend to occur in the specific short DNA codes recognized by regulatory proteins involved in physiological processes related to the disease or the organs or cells affected by the disease. For example, DNA variants associated with diabetes tend to occur in the codes recognized by regulatory proteins that control various aspects of sugar metabolism and insulin secretion. Similarly, variants associated with immune system disorders, such as multiple sclerosis, asthma, or lupus, are found in specific recognition codes for proteins that regulate immune system function.

Many seemingly unrelated diseases share common regulatory circuitry, including diseases that affect the immune system, different types of cancers, and a range of neuropsychiatric disorders.
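At its core, the first finding above (the share of disease-associated variants that fall inside regulatory DNA) is an interval-overlap count. The Python sketch below illustrates that idea under loud assumptions: it is not the study's pipeline, the function names and toy coordinates are hypothetical, regions are treated as half-open intervals, and it only counts variants that sit directly inside a region, leaving out the "tightly linked" variants the researchers also considered.

```python
from bisect import bisect_right


def build_index(regions):
    """Group (chrom, start, end) half-open intervals by chromosome,
    then sort and merge overlapping ones so lookups can use bisection."""
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_regulatory_dna(index, chrom, pos):
    """Return True if pos lies inside any merged region on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last region starting at or before pos
    return i >= 0 and pos < ends[i]


def fraction_in_regulatory_dna(index, variants):
    """Fraction of (chrom, pos) variants that fall inside a region."""
    if not variants:
        return 0.0
    hits = sum(in_regulatory_dna(index, chrom, pos) for chrom, pos in variants)
    return hits / len(variants)


# Toy, made-up data (hypothetical coordinates, not from the study).
toy_regions = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
toy_variants = [("chr1", 120), ("chr1", 299), ("chr1", 300), ("chr2", 10), ("chr3", 5)]

toy_index = build_index(toy_regions)
print(f"{fraction_in_regulatory_dna(toy_index, toy_variants):.0%} of toy variants overlap regulatory DNA")
```

Merging the regions first keeps each lookup to a single binary search, which matters when the map holds millions of regions.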

The study also revealed a wealth of additional connections between genetic variants and disease that had been lurking within existing genome-wide association studies data. Viewing these data through the lens of regulatory DNA exposed thousands of variants that were highly selectively localized within regulatory DNA of disease-specific cell types. These variants had previously been ignored because the stringent selection criteria used in earlier studies did not take regulatory regions into account.

Another surprising finding was that the regulatory circuitry blueprints could be used to pinpoint cell types that play a role in specific diseases, without requiring any prior knowledge about how the disease worked. For example, genetic variants associated with Crohn's disease (a common type of inflammatory bowel disease) were found to be concentrated in the regulatory regions mapped in two specific subsets of immune cells, the same cell types that decades of prior research had linked to the development of Crohn's disease. Applying this approach systematically will enable researchers to identify cell types not previously known to play a role in a particular disease, expanding our understanding of the disease process and potentially leading to new therapies.
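One simple way to picture the cell-type approach is to ask, for each cell type, whether a disease's variants land in that cell type's active regulatory DNA more often than variants in general do. The Python sketch below ranks cell types by that ratio. It is an illustration only: the article does not describe the statistical test the researchers used, and the function names, cell-type labels, and variant IDs are all hypothetical.

```python
def fold_enrichment(disease_variants, all_variants, in_cell_type):
    """Ratio of the disease variants' overlap rate to the background rate.

    disease_variants: set of variant IDs associated with one disease
    all_variants:     set of every variant ID tested (the background)
    in_cell_type:     set of variant IDs that fall inside regulatory DNA
                      active in one cell type
    """
    if not disease_variants or not all_variants:
        return 0.0
    disease_rate = len(disease_variants & in_cell_type) / len(disease_variants)
    background_rate = len(all_variants & in_cell_type) / len(all_variants)
    if background_rate == 0:
        return 0.0  # no background overlap, so no meaningful ratio
    return disease_rate / background_rate


def rank_cell_types(disease_variants, all_variants, overlaps_by_cell_type):
    """Return (cell_type, fold_enrichment) pairs, most enriched first."""
    scores = {
        cell_type: fold_enrichment(disease_variants, all_variants, overlapping)
        for cell_type, overlapping in overlaps_by_cell_type.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up data (hypothetical IDs and labels, not from the study).
toy_all = {f"v{i}" for i in range(1, 11)}
toy_disease = {"v1", "v2", "v3", "v4"}
toy_overlaps = {
    "immune_subset_A": {"v1", "v2", "v3", "v9"},
    "hepatocyte": {"v4", "v5", "v6", "v7", "v8"},
}

for cell_type, score in rank_cell_types(toy_disease, toy_all, toy_overlaps):
    print(f"{cell_type}: {score:.2f}x background")
```

A ratio well above 1 flags a cell type as a candidate for follow-up. A real analysis would also need a significance test and correction for the many cell types compared.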
