Study finds why some human genes are more popular with biomedical researchers

Historical bias leads biomedical researchers to study certain genes over and over again

Historical bias is a key reason why biomedical researchers continue to study the same 10 percent of all human genes while ignoring many genes known to play roles in disease, according to a study publishing September 18 in the open access journal PLOS Biology, led by Thomas Stoeger and Luís Amaral of Northwestern University, and colleagues. This bias is bolstered by research funding mechanisms and social forces.

Hot and cold regions of biology. Genes (dots) are mapped according to generic chemical and biological characteristics. Blue indicates cold regions, where genes are studied less frequently than anticipated under the assumption that every gene would be studied to the same extent. Credit: Thomas Stoeger

Recent studies from other labs have reported that scientists actively study only about 2,000 of the nearly 20,000 human protein-coding genes, so the Northwestern team set out to find out why. They compiled 36 distinct resources describing various aspects of biomedical research and analyzed the resulting database for answers.

The team found that well-meaning policy interventions to promote exploratory or innovative research actually result primarily in additional work on the most established research topics: those genes first characterized in the 1980s and 1990s, before completion of the Human Genome Project. The researchers also discovered that postdoctoral fellows and Ph.D. students who focus on poorly characterized genes have a 50 percent lower chance of becoming independent researchers.

“We discovered that current research on human genes does not reflect the medical importance of the genes,” Stoeger said. “Many genes with a very strong relevance to human disease are still not studied. Instead, social forces and funding mechanisms reinforce a focus of present-day science on past research topics.”

The researchers applied a systems approach to the data (which included chemical, physical, biological, historical and experimental data) to uncover underlying patterns. Their analysis explains not only why some genes are not studied but also the extent to which an individual gene is studied, and it does so for approximately 15,000 genes.

The Human Genome Project, the identification and mapping of all human genes, completed in 2003, promised to expand the scope of scientific study beyond the small group of genes scientists had studied since the 1980s. But the Northwestern researchers found that 30 percent of all genes have never been the focus of a scientific study and that fewer than 10 percent of genes are the subject of more than 90 percent of published papers. This imbalance persists despite the increasing availability of new techniques to study and characterize genes.

“Everything was supposed to change with the Human Genome Project, but everything stayed the same,” said Amaral, the Erastus Otis Haven Professor of Chemical and Biological Engineering and a co-author of the study. “Scientists keep going to the same place, studying the exact same genes. Should we be focusing all of our attention on this small group of genes?”

With researchers focused on just 2,000 human genes, the biology encoded by the remaining 18,000 genes remains largely uncharacterized. These neglected genes, the researchers note, include an understudied breast cancer gene cluster and genes connected to lung cancer that could be at least as important as the well-studied genes.

“The bias to study the exact same human genes is very high,” Amaral said. “The entire system is fighting the very purpose of the agencies and scientific knowledge which is to broaden the set of things we study and understand. We need to make a concerted effort to incentivize the study of other genes important to human health.”

Looking forward, the Northwestern team is developing a public resource that could help identify understudied genes that have the potential to be of critical importance to specific diseases. The resource includes information on whether a gene has any extraordinary chemical property, whether it is highly active in a specific tissue and whether it has a strong link to a disease.