Graur et al. to ENCODE: Zing!

by T. Ryan Gregory, on February 22nd, 2013

I expect that we will be seeing several harsh critiques of ENCODE’s extraordinary claims about function in the human genome and the equally incredible mega-hype associated with the project. I know of at least one more that is forthcoming from a heavy-hitter in the field, but as a snarky smackdown, it will be very tough to beat the recent paper by Dan Graur and colleagues published in Genome Biology and Evolution. The paper is open-access, so go ahead and read it yourself here. Meantime, enjoy the following zingers:

ENCODE adopted a strong version of the causal role definition of function, according to which a functional element is a discrete genome segment that produces a protein or an RNA or displays a reproducible biochemical signature (for example, protein binding).
Oddly, ENCODE not only uses the wrong concept of functionality, it uses it wrongly and inconsistently.

…the ENCODE authors singled out transcription as a function, as if the passage of RNA polymerase through a DNA sequence is in some way more meaningful than other functions. But, what about DNA polymerase and DNA replication? Why make a big fuss about 74.7% of the genome that is transcribed, and yet ignore the fact that 100% of the genome takes part in a strikingly “reproducible biochemical signature”—it replicates!

Ward and Kellis (2012) confirmed that ~5% of the genome is interspecifically conserved, and by using intraspecific variation, found evidence of lineage-specific constraint suggesting that an additional 4% of the human genome is under selection (i.e., functional), bringing the total fraction of the genome that is certain to be functional to approximately 9%. The journal Science used this value to proclaim “No More Junk DNA” (Hurtley 2012), thus, in effect, rounding up 9% to 100%.

ENCODE chose to bias its results by excessively favoring sensitivity over specificity. In fact, they could have saved millions of dollars and many thousands of research hours by ignoring specificity altogether, and proclaiming a priori that 100% of the genome is functional. Not one functional element would have been missed by using this procedure.

Interestingly, ENCODE, which is otherwise quite miserly in spelling out the exact function of its “functional” elements, provides putative functions for each of its 12 histone modifications. For example, according to ENCODE, the putative function of the H4K20me1 modification is “preference for 5’ end of genes.” This is akin to asserting that the function of the White House is to occupy the lot of land at the 1600 block of Pennsylvania Avenue in Washington, D.C.

In a miraculous feat of “next generation” science, the ENCODE authors were able to determine the frequencies of nonexistent derived alleles.

… a surprisingly large number of scientists have had their knickers in a twist over “junk DNA” ever since the term was coined by Susumu Ohno (1972).

In dissecting common objections to “junk DNA,” we identified several misconceptions, chief among them (1) a lack of knowledge of the original and correct sense of the term, (2) the belief that evolution can always get rid of nonfunctional DNA, and (3) the belief that “future potential” constitutes “a function.”

We urge biologists not to be afraid of junk DNA. The only people that should be afraid are those claiming that natural processes are insufficient to explain life and that evolutionary theory should be supplemented or supplanted by an intelligent designer (e.g., Dembski 1998; Wells 2004). ENCODE’s take-home message that everything has a function implies purpose, and purpose is the only thing that evolution cannot provide. Needless to say, in light of our investigation of the ENCODE publication, it is safe to state that the news concerning the death of “junk DNA” have been greatly exaggerated.

ENCODE’s biggest scientific sin was not being satisfied with its role as data provider; it assumed the small-science role of interpreter of the data, thereby performing a kind of textual hermeneutics on a 3.5-billion-long DNA text. Unfortunately, ENCODE disregarded the rules of scientific interpretation and adopted a position common to many types of theological hermeneutics, whereby every letter in a text is assumed a priori to have a meaning.

So, what have we learned from the efforts of 442 researchers consuming 288 million dollars? According to Eric Lander, a Human Genome Project luminary, ENCODE is the “Google Maps of the human genome” (Durbin et al. 2010). We beg to differ; ENCODE is considerably worse than even Apple Maps.

We conclude that the ENCODE Consortium has, so far, failed to provide a compelling reason to abandon the prevailing understanding among evolutionary biologists according to which most of the human genome is devoid of function. The ENCODE results were predicted by one of its lead authors to necessitate the rewriting of textbooks (Pennisi 2012). We agree; many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten.

12 comments to Graur et al. to ENCODE: Zing!

The recent post <a href="http://www.create.ab.ca/encode-project-discarding-junk-dna-for-good/">ENCODE Project – Discarding ‘Junk DNA’ for Good</a> by the Creation Science Association of Alberta clearly demonstrates the damage ENCODE has caused.
What we currently don’t know is what it did to biology freshmen.

Aside from my comment on contactability and the Stephen J. Gould video, which is not particularly relevant to this site, I would like to say that I do not come from the genetics field, though I did a bit of it years ago in undergraduate molecular biology; I am more generally a student of ecology. My concern is that this argument is not serving the progress of your field well. It is important that people such as yourselves express concern when half-scientific truths are promoted in the mainstream, and your blog above clarifies things for people such as myself who have only a grasp of the subject, but the field seems locked into a battle. Is the discipline losing focus on things I would even more like clarified, such as the role of regulatory genes in animal adaptation?
Or is junk going to be the dominant theme of molecular genetics in every regard? I think that if the perpetrators of this junk about junk are left to quietly disappear, and especially if the kings of junk, the creationists, are left in their dark hole, they will be left behind. The rest of us should keep on with the pursuit of truth about how evolution by natural selection really operates at the macro and molecular levels, and what is found may in the end clarify much of the debate.
Otherwise one spends an enormous amount of time and energy dealing with junk and the merchants of it.

It is sad that the stars of the ENCODE project show such little understanding of biology in general.
And it raises profound questions about whether or not all the “big science” is really helping at all. Does the level of focus needed to make progress into the depths of the genome preclude developing a basic competence in modern evolutionary theory? It shouldn’t, the evidence from the ENCODE project notwithstanding.

I guess a lot of people who read that critical paper will wonder about its tone.
But the irreverence it shows towards these first ENCODE papers, and the associated news, seems warranted. Every high school student understands experimental protocol: you write up the methods of analysis before doing the experiment, and justify those methods as useful by appeal to prior work on the subject, showing how your experiment will allow you to understand or further develop a theory.
Imagine, back before doing the ENCODE experiment, that its scientists wrote an experimental proposal that said they would evaluate the data as they ended up doing: if I see binding, I call that functional, and if I see transcription, I say this is functional, and I am doing my analysis this way because that will maximise the value of this experiment to our current theories of junk DNA and genetics.
So I have a question: are the evaluations done in the ENCODE papers consistent with the experiments’ initial proposals?

As for textbooks, well, my textbook – “Evolutionary Bioinformatics” – which is now in its second edition, finds the balance of evidence not to support the idea of junk DNA. I suspect that a careful perusal of textbooks would provide many similar examples.

Since my above comment on the textbook treatment of “junk DNA,” I have read the paper in Genome Biology and Evolution by Graur et al. (2013). There has also appeared a paper by W. F. Doolittle in the Proceedings of the National Academy of Sciences (2013). Both papers are highly critical of the ENCODE Consortium publications, but neither addresses the issues I raised in the 1990s, which were gathered together in my textbook – “Evolutionary Bioinformatics” (2006, 2011). Since both papers refer to Professor Gregory, and Graur’s citations include this Gregory blog, I hope I may be permitted to expand briefly here on my initial comment.
The ENCODE results, taken with earlier studies showing genome-wide transcription (Ota et al. 2004 Nature Genetics 36, 40-45), are consistent with the proposals:
1. Extragenic DNA encodes “RNA antibodies,” which constitute an intracellular immune system analogous to the CRISPR system in bacteria. Cardinal features of the corresponding genomic segments are (a) that they must be variable to avoid anticipation by intracellular pathogens, and (b) that they are preferentially transcribed in response to “stress” (e.g. pathogen entry; Forsdyke, Madill & Smith 2002 Trends in Immunology 23, 575-579).
2. The primary function of introns is to accommodate stem-loop potential, thus serving an error-correcting role. In this respect, third codon positions can be seen as ‘mini-introns’ (Forsdyke 2013; Biological Theory, doi 10.1007/s13752-013-0090-6).

“Consistent with” and “evidence for” are very different things. Extensive transcription is also “consistent with” its being mere noise, or being involved in gene regulation, or any number of hypotheses. Widespread transcription has been known since the 1970s and was discussed in the very first detailed overview of “junk DNA” by Comings (1972). http://www.genomicron.evolverzone.com/2012/09/encode-2012-vs-comings-1972/