'Junk DNA' Debunked

Researchers describe DNA studies Wednesday that could point the way to new methods to detect and treat disease. From left, Tim Hubbard of Wellcome Trust Sanger Institute, Roderic Guigo of the Centre for Genomic Regulation and Ewan Birney of the European Bioinformatics Institute.
Press Association/Associated Press

By Gautam Naik and Robert Lee Hotz

Updated Sept. 5, 2012 2:01 p.m. ET

The deepest look into the human genome so far shows it to be a richer, messier and more intriguing place than was believed just a decade ago, scientists said Wednesday.

While the findings underscore the challenges of tackling complex diseases, they also offer scientists new terrain to unearth better treatments.

The new insight is the product of Encode, or Encyclopedia of DNA Elements, a vast, multiyear project that aims to pin down the workings of the human genome in unprecedented detail.

Encode succeeded the Human Genome Project, which identified the 20,000 genes that underpin the blueprint of human biology. But scientists discovered that those 20,000 genes constituted less than 2% of the human genome. The task of Encode was to explore the remaining 98%—the so-called junk DNA—that lies between those genes and was thought to be a biological desert.

That desert, it turns out, is teeming with action. Almost 80% of the genome is biochemically active, a finding that surprised scientists.

In addition, large stretches of DNA that appeared to serve no functional purpose in fact contain about 400,000 regulators, known as enhancers, that help activate or silence genes, even though they sit far from the genes themselves.

The discovery "is like a huge set of floodlights being switched on" to illuminate the darkest reaches of the genetic code, said Ewan Birney of the European Bioinformatics Institute in the U.K., lead analysis coordinator for the Encode results.

For example, the new research helped scientists to discover that a particular type of regulatory switch, known as the GATA family of transcription factors, was associated with the risk of Crohn's disease, an inflammatory bowel condition. The data helped narrow down this link from a possible 2,000 options, said Dr. Birney.

"That's a new association, and we're saying we have about 400 of those" showing other such biological links, he added.

Scientist John Stamatoyannopoulos led a team of researchers that deciphered the intricate regulatory code that controls the human genome.
Clare McLean/UW Medicine

Kick-started in 2003 and funded largely by the U.S. National Institutes of Health, Encode has so far cost $185 million and involved more than 440 scientists. According to one scientist's estimate, if all the genomic data found in the effort were printed on a wall that is about 52½ feet high, the structure would be some 18 miles long.

The flood of scientific data is likely to keep researchers busy for a long time. More than 30 papers based on Encode were published Wednesday. Six of those, including an overview paper, appeared in the journal Nature, along with a total of 24 related papers in Genome Research and Genome Biology. Other journals, including Science, also published papers.

Researchers led by genome scientist John Stamatoyannopoulos at the University of Washington deciphered the intricate regulatory code that controls the human genome. They discovered that genetic changes linked to more than 400 common diseases all affect the genome's ability to control when, where and how genes behave—not the genes themselves.

Their analysis, reported in Science, suggests that many diseases share a common set of regulatory controls. Among these ailments are immune-system disorders, cancer and some neuropsychiatric diseases.

"There is a level of shared genetic liability," said Dr. Stamatoyannopoulos. "There are variants that may not only increase your susceptibility to one disease but to other diseases as well."

Eventually, the findings may lead to new ways of screening patients for disease, but they also offer a basic insight into the way cells process the information that makes life possible.

Using a special enzyme called DNase I as a molecular probe, Dr. Stamatoyannopoulos and his colleagues discovered nearly four million different places along the human genome where proteins called transcription factors are locked onto tiny stretches of DNA, which essentially are words written with the chemical characters of the human genetic alphabet. These transcription factors serve as switches that actively control genes or other regulatory proteins. On average, each transcription factor can affect up to 3,000 genes.

"We created a dictionary of the genome's programming language," said Dr. Stamatoyannopoulos. "We could map millions of locations where we could catch regulatory proteins in the act of reading the information in the genome."

The unexpected level of activity seen in the genomic hinterlands may also help explain what makes us human. Compared with other species, the human genome has about 30 times as much "junk DNA."

When the human genome was first sequenced, scientists were surprised that its structure—based on fewer-than-expected genes—seemed uncomplicated, said Chris Ponting, a professor of genomics at the University of Oxford who wasn't involved in the latest research.
