Lock and Key: How computer science reveals the “dark matter” of the human genome

By Rosie Haney

Feb. 11, 2013

You could say that Josh Welch's interest in DNA is in his genes.

When the Ohio University student was a teen, his father, Lonnie Welch, a professor of electrical engineering and computer science, introduced him to the relatively new field of bioinformatics. The discipline grew from the Human Genome Project, which sought to identify the sequence of human DNA.

Each human DNA sequence is "named" with a series of letters—3 billion in all. Crunching all that data created demand for specialized computer programs, which became the focus of the bioinformatics field.

While still in high school, Welch joined his dad's team that was building a program called Wordseeker, which looks for patterns in the letter sequences of DNA.
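To give a rough sense of what looking for patterns in DNA letters can mean, the sketch below counts every overlapping "word" of a fixed length in a sequence. It is only an illustration under stated assumptions: the function name `count_words`, the word length, and the toy sequence are hypothetical, and nothing here is taken from Wordseeker itself.

```python
from collections import Counter


def count_words(sequence: str, k: int) -> Counter:
    """Count every overlapping k-letter word in a DNA sequence."""
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one letter at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical toy sequence, not real genomic data.
toy_sequence = "ATGCGATGCGATTT"
word_counts = count_words(toy_sequence, 3)

# Words that occur more often than expected are candidates for a closer look.
print(word_counts.most_common(3))
```

A real tool would also have to judge whether a word's count is surprising given the makeup of the sequence; this sketch stops at raw counts.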


"I really enjoyed working with my dad. One of the biggest benefits is I got started with research much earlier than if I had worked with another professor," says Welch, who double majored in computer science and piano.

The team used the program to explore topics such as the regulation of DNA repair genes, which help protect cells from diseases such as cancer. In addition, Welch used the software to study the genes of the mustard weed to aid efforts to develop cold- and disease-resistant crops.

During his undergraduate experience, Welch also interned with the Ohio University biotechnology spin-off company Diagnostic Hybrids, where he developed software tools to streamline the collection, analysis, and storage of data produced by the company's assays for medical testing.

Welch plans to look at a specific part of the genome called long intergenic non-coding RNA (lincRNA). In the basic flow of genetic information, DNA is transcribed into RNA, which is then translated into proteins. The DNA that codes for proteins, however, accounts for only about 2 percent of our overall genome. Welch calls what remains the "dark matter" of the human genome.
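The DNA-to-RNA-to-protein flow can be sketched in a few lines of code. This is a simplified illustration, not part of Welch's work: the helper names `transcribe` and `translate` and the toy sequence are hypothetical, the input is assumed to be a coding-strand sequence read from its first letter, and the table used is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Turn a coding-strand DNA sequence into RNA by swapping T for U."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Read RNA three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# Hypothetical toy gene: start codon, one more codon, then a stop codon.
rna = transcribe("ATGGCCTAA")
print(rna, translate(rna))
```

Non-coding RNAs such as lincRNAs go through the first step but not the second: they are transcribed from DNA and never translated into protein.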

"A lot of people thought this was junk left over from evolution, but over the past 10 years, a lot of research has shown that it does do something," Welch says. "We're still trying to figure out what that does."

His goal is to write a computer program that can predict the complicated 3D shapes of lincRNAs, shapes that allow them to regulate the process of converting DNA into RNA. It's a crucial step toward understanding what lincRNAs do, he says, because their shapes dictate how they function.

Understanding how lincRNAs behave could allow scientists to target the expression of certain genes, which could be useful for agriculture, biofuels, and medicine, Welch notes. LincRNAs fold into structures that act like keys to unlock gene expression.

"If we can figure out the shape of the key," he says, "then we can design our own keys to fit the lock."

This story appears in the Autumn/Winter 2012 issue of Ohio University's Perspectives magazine.