Friday, November 20, 2015

The truth about ENCODE

A few months ago I highlighted a paper by Casane et al. (2015) where they said ...

In September 2012, a batch of more than 30 articles presenting the results of the ENCODE (Encyclopaedia of DNA Elements) project was released. Many of these articles appeared in Nature and Science, the two most prestigious interdisciplinary scientific journals. Since that time, hundreds of other articles dedicated to the further analyses of the Encode data have been published. The time of hundreds of scientists and hundreds of millions of dollars were not invested in vain since this project had led to an apparent paradigm shift: contrary to the classical view, 80% of the human genome is not junk DNA, but is functional. This hypothesis has been criticized by evolutionary biologists, sometimes eagerly, and detailed refutations have been published in specialized journals with impact factors far below those that published the main contribution of the Encode project to our understanding of genome architecture. In 2014, the Encode consortium released a new batch of articles that neither suggested that 80% of the genome is functional nor commented on the disappearance of their 2012 scientific breakthrough. Unfortunately, by that time many biologists had accepted the idea that 80% of the genome is functional, or at least, that this idea is a valid alternative to the long held evolutionary genetic view that it is not. In order to understand the dynamics of the genome, it is necessary to re-examine the basics of evolutionary genetics because, not only are they well established, they also will allow us to avoid the pitfall of a panglossian interpretation of Encode. Actually, the architecture of the genome and its dynamics are the product of trade-offs between various evolutionary forces, and many structural features are not related to functional properties. In other words, evolution does not produce the best of all worlds, not even the best of all possible worlds, but only one possible world.

How did we get to this stage where the most publicized result of papers published by leading scientists in the best journals turns out to be wrong, but hardly anyone knows it?

Back in September 2012, the ENCODE Consortium was preparing to publish dozens of papers on their analysis of the human genome. Most of the results were quite boring but that doesn't mean they were useless. The leaders of the Consortium must have been worried that science journalists would not give them the publicity they craved so they came up with a strategy and a publicity campaign to promote their work.

Their leader was Ewan Birney, a scientist with valuable skills as a herder of cats but little experience in evolutionary biology and the history of the junk DNA debate.

The ENCODE Consortium decided to add up all the transcription factor binding sites (spurious or not), all the chromatin markers (whether or not they meant anything), and all the transcripts (even if they were junk). With a little judicious juggling of numbers they came up with the following summary of their results (Birney et al., 2012) ...

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

The bottom line is that these leaders knew exactly what they were doing and why. By saying they had assigned biochemical functions for 80% of the genome, they knew that this would be the headline. They knew that journalists and publicists would interpret this to mean the end of junk DNA. Most of the ENCODE leaders actually believed it.

That's exactly what happened ... aided and abetted by the ENCODE Consortium, the journals Nature and Science, and gullible science journalists all over the world. (Ryan Gregory has published a list of articles that appeared in the popular press: The ENCODE media hype machine.)

Almost immediately, knowledgeable scientists and science writers tried to expose the publicity hype. The first criticisms appeared on various science blogs, followed by a series of papers in the scientific literature. Ed Yong, an experienced science journalist, interviewed Ewan Birney and blogged about ENCODE on the first day. Yong reported the standard publicity hype that most of our genome is functional, an interpretation that was confirmed by Ewan Birney and other senior scientists. Two days later, Ed Yong started adding updates to his blog posting after reading the blogs of many scientists, including some who were well-recognized experts on genomes and evolution [ENCODE: the rough guide to the human genome].

Within a few days of publishing their results the ENCODE Consortium was coming under intense criticism from all sides. A few journalists, like John Timmer, recognized right away what the problem was ...

Yet the third sentence of the lead ENCODE paper contains an eye-catching figure that ended up being reported widely: "These data enabled us to assign biochemical functions for 80 percent of the genome." Unfortunately, the significance of that statement hinged on a much less widely reported item: the definition of "biochemical function" used by the authors.

This was more than a matter of semantics. Many press reports that resulted painted an entirely fictitious history of biology's past, along with a misleading picture of its present. As a result, the public that relied on those press reports now has a completely mistaken view of our current state of knowledge (this happens to be the exact opposite of what journalism is intended to accomplish). But you can't entirely blame the press in this case. They were egged on by the journals and university press offices that promoted the work—and, in some cases, the scientists themselves.

Nature may have begun to realize that it made a mistake in promoting the idea that most of our genome was functional. Two days after the papers appeared, Brendan Maher, a Feature Editor for Nature, tried to get the journal off the hook but only succeeded in making matters worse [see Brendan Maher Writes About the ENCODE/Junk DNA Publicity Fiasco].

Meanwhile, two private for-profit companies, Illumina and Nature, teamed up to promote the ENCODE results in a video. They even hired Tim Minchin to narrate it. This is what hype looks like ...

Soon articles began to appear in the scientific literature challenging the ENCODE Consortium's interpretation of function and explaining the difference between an effect—such as the binding of a transcription factor to a random piece of DNA—and a true biological function.

By March 2013—six months after publication of the ENCODE papers—some editors at Nature decided that they had better say something else [see Anonymous Nature Editors Respond to ENCODE Criticism]. Here's the closest thing to an apology that they have ever written ....

The debate over ENCODE’s definition of function retreads some old battles, dating back perhaps to geneticist Susumu Ohno’s coinage of the term junk DNA in the 1970s. The phrase has had a polarizing effect on the life-sciences community ever since, despite several revisions of its meaning. Indeed, many news reports and press releases describing ENCODE’s work claimed that by showing that most of the genome was ‘functional’, the project had killed the concept of junk DNA. This claim annoyed both those who thought it a premature obituary and those who considered it old news.

There is a valuable and genuine debate here. To define what, if anything, the billions of non-protein-coding base pairs in the human genome do, and how they affect cellular and system-level processes, remains an important, open and debatable question. Ironically, it is a question that the language of the current debate may detract from. As Ewan Birney, co-director of the ENCODE project, noted on his blog: “Hindsight is a cruel and wonderful thing, and probably we could have achieved the same thing without generating this unneeded, confusing discussion on what we meant and how we said it”.

Oops! The importance of junk DNA is still an "important, open and debatable question" in spite of what the video sponsored by Nature might imply.

(To this day, neither Nature nor Science has actually apologized for misleading the public about the ENCODE results. [see Science still doesn't get it])

The ENCODE Consortium leaders responded in April 2014—eighteen months after their original papers were published.

In that paper they acknowledge that there are multiple meanings of the word function and their choice of "biochemical" function may not have been the best choice ....

However, biochemical signatures are often a consequence of function, rather than causal. They are also not always deterministic evidence of function, but can occur stochastically.

This is exactly what many scientists have been telling them. Apparently they did not know this in September 2012.

They also include in their paper a section on "Case for Abundant Junk DNA." It summarizes the evidence for junk DNA, evidence that the ENCODE Consortium did not acknowledge in 2012 and certainly didn't refute.

In answer to the question, "What Fraction of the Human Genome Is Functional?" they now conclude that ENCODE hasn't answered that question and more work is needed. They now claim that the real value of ENCODE is to provide "high-resolution, highly-reproducible maps of DNA segments with biochemical signatures associated with diverse molecular functions."

We believe that this public resource is far more important than any interim estimate of the fraction of the human genome that is functional.

There you have it, straight from the horse's mouth. The ENCODE Consortium now believes that you should NOT interpret their results to mean that 80% of the genome is functional and therefore not junk DNA. There is good evidence for abundant junk DNA and the issue is still debatable.

I hope everyone pays attention and stops referring to the promotional hype saying that ENCODE has refuted junk DNA. That's not what the ENCODE Consortium leaders now say about their results.

41 comments:

Thank you for this, Larry. I was at a talk last weekend by Fazale Rana, who was of course hyping ENCODE. Summary: junk DNA was the "best argument" for evolution and now it's been decisively refuted and all the opposition to it is because evolutionists don't want to admit they're wrong. He even included a quote from PZ to back that up. I'm going to be blogging my experience of last weekend, and also reporting on it to my local CFI group, so this is a neat and timely summary.

It will be interesting, but perhaps distressing, to see what textbook authors write about junk DNA in the coming decade. After all, they will be the sources that many students will pay attention to. Will other biochemistry textbook authors be aware of the quiet retraction by the ENCODE leadership? Will the authors of elementary biology textbooks be aware of it? Will the authors of evolution textbooks be aware of it?

Come to think of it, I know several authors of major textbooks of biology or of evolution. I'd better ask them. I hope to find that they are good on the issue. If not, I'll point them here.

As a general rule, biochemistry textbook authors know very little about molecular evolution and surprisingly little about biochemistry! I know most of them so this is not just speculation.

They are very prone to stasis—writing about the same ideas and concepts that they were taught as undergraduates. Ironically, they are also very likely to fall for the hype of new discoveries as long as the paradigm being shifted isn't something that upsets their own views.

The biochemistry textbook authors are very likely to promote the ENCODE hype because it fits with their outmoded, adaptationist, view of evolution.

On the other hand, you would be hard pressed to find any textbook (other than my own!) that explains why the amino acid sequences of globins or cytochromes are so different in different species.

Remember the example of a real "paradigm" shift called chemiosmotic theory? It took decades for this to enter the biochemistry textbooks in a meaningful way and even now there are textbooks that don't fully accept it. That's because it conflicted with the way the authors were taught as undergraduates.

If I were to put more emphasis on junk DNA in my textbook it would meet with considerable resistance from biochemistry instructors in many universities. I can guarantee you that I would receive angry letters from some of them pointing out my ignorance of the latest results and quoting ENCODE.

I know this because it has happened in the past whenever I tried to introduce a new way of looking at something.

This is a conundrum. If I get too far ahead of my audience (biochemistry instructors) they won't recommend the book to their students and I won't be able to change the views of those students.

I took an introductory course to genetics over the summer using a recently published book that had a section talking about the ENCODE results. The book was Genetics: A Conceptual Approach by Benjamin Pierce. On page 600 it reads:

In a series of papers published in 2012, ENCODE concluded that at least 80% of the human genome is involved in some type of function. Many of the functional sequences consisted of sites where proteins bind and influence the expression of genes. Prior to this study, much of the genome was considered "junk DNA" with no function, but the ENCODE study has greatly altered this view and suggests that there is little nonfunctional DNA in the human genome.

The back cover reads:

Only about 1.5% of the human genome directly codes for proteins; previously, much of that remaining genome was thought to consist of nonfunctional or “junk” DNA. However, new studies reveal that three-quarters of the human genome is transcribed, and research has identified a number of noncoding RNAs that have important cellular functions.

Ironically, they are also very likely to fall for the hype of new discoveries as long as the paradigm being shifted isn't something that upsets their own views.

I think you really hit it on the head several years ago with your "deflated ego" description.

I am just thinking about so-called "paradigm changes" and about Peter Mitchell and the chemiosmotic theory, Carl Woese and the three-domain concept (which I understand you dispute), and now the ENCODE junk DNA assertion. What made the first two concepts slow to be accepted and the last one immediately accepted by many?

Well, it's the almost religious conviction that if something exists it must be functional, and that this is exactly what the doctor ordered to explain the inordinate complexity of humans (in a way the modest number of protein-coding genes does not readily do).

The PMF and ATP synthesis, and all this talk about Archaea etc... who cares? But everyone it seems (creationist and scientist alike) has a horse in the race when it comes to wanting to explain how we are so special.

Interesting... the Graur book "sample chapter" has a section on prokaryote-eukaryote origins, and the "Universal tree of life". Now I just need to get these grants and paper reviews and revisions off my desk to read something more interesting ;-)

Larry - you might be interested to know that together with Richard Dawkins I have just finished updating our book “The Ancestor’s Tale” and I have specifically added a section that deals with ‘junk’, ENCODE, and what functionality means. Happy to send it to you, if you are interested.

Larry, I've been hoping you would respond to the Facebook rants of "Evolution 2.0", apparently one Perry Marshall, who has been attacking you a lot lately. It's the usual James Shapiro-type cellular teleology-- evolution is driven by "cell-mediated changes", meaning the cell PLANS its evolution-- not creationist, just Shapiro style stupidity. He writes things like "Larry Moran is an old school Darwinist dummy, Darwinists predicted junk DNA, ENCODE killed junk DNA dead three years ago", and so on. He wrote a post like this 3 or 4 days back, so you might have to search for it. I can't put in a direct link.

In his most recent post, he goes on about "Evolution Fraud" and accuses you of saying evolution is random because you don't want to do science but just want to drink martinis when you should be working.

PREDICTION: Ten years from now, when we understand twice as much about the genome as we do now, most scientists will be reluctant to admit they ever entertained such anti-scientific superstitions. [He means junk DNA.]

Recently I had been reading Plant Intelligence by Buhlmer, where I was first introduced to Barbara McClintock, who instantly became one of my heroes. She apparently stated that each of her corn plants had a distinct personality; however, to realize this you must be in a childlike state. Make a heart connection! Nobel prize winning child like heart focused brain…

I co-sponsored Richard Gerber, author of Vibrational Medicine, in the early 90’s to speak in Toronto. He said the entire Universe is made of one element, LIGHT… and then with a twinkle in his eye said ‘just as the mystics have been saying, we are beings of light’. He then said we live in a holographic Universe or, as mystics have said for eons, an illusion. Beings of light playing a DNA self replicating existence…

Anyone who watches TV or has a basic interest in Ancient Aliens has heard how some super duper ancient ET civilization came here to seed the Planet and apparently manipulated DNA to create the current human race. Apparently our ancestors were pretty hot and the ancients could not resist. ANCIENT ALIENS 101… kinda puts a hole in religion other than as a way to build community spirit, fear and accumulate wealth.

One of my defining moments was years ago reading how Crick and a colleague postulated Earth is far too young to have developed a technology as sophisticated as DNA. Technology = DNA?? now there is a game changer… I caught myself a terrible code…

In Dolores Cannon's Convoluted Universe series she tells of a patient under hypnosis who said he saw this being millions of years advanced of humans travelling through the Universe, a perfect blend of technology and consciousness… if reincarnation exists, which no scientist in their right mind would accept any more than doctors in the mid 1800’s would wash their hands, perhaps in a few generations we will understand and accept the basic code of human life and be able to travel throughout dimensions in an instant, have all knowledge, realize we are eternal beings in a material world… know it… not some airy fairy superstitious new age fantasy… we actually know we operate DNA self replicating bodies w conscious awareness… giddy yup.. Apparently without heart centered awareness it will never happen no matter how deeply we investigate the holographic picture…

I have read where original DNA had 12 strands and was reduced to 2. The ancient Eastern religions spoke of 12 dimensions a human can experience while in a physical body. Physical, Astral, Causal, Mental, are lower limited realities… is DNA the vehicle SOUL/Consciousness/awareness/, whatever makes us unique, to experience Earth to become self realized. Currently humanity is pretty well stuck in physical and emotional [astral] reality, lower dimension of hate, greed, anger… DNA w 2 strands of information perhaps limits our perspective… If the DNA still retained the 12 strands would the coded information make it easier to access these other states of consciousness? Is this where those who created us operate? Could GOD just be other more advanced conscious beings as some suggest?

Well I just had to google 12 stranded DNA and came up with this site. Apparently we have been slowly converting to 12 stranded DNA over the last couple decades or so. I would like to think this is a mock site but since it involves a Doctor of Naturopathy, I suppose it isn't.

http://www.bibliotecapleyades.net/ciencia/ciencia_cambio03.htm#Evidence of changes in our DNA

As it turns out, Dr. Fox was arrested on fraud charges shortly after she gave that interview and was sent to jail for falsifying results on tests she never conducted, as well as injecting patients with steroids without their knowledge. These details conjure the image of a creepy mad scientist doing work on hopelessly delusional people wishing to see themselves and their progeny in a space of notable honor and privilege, when in fact all they were doing was getting scammed.

This Marshall guy seems to be yet another engineer whose motivated reasoning led him to abuse the poor analogy of DNA-is-a-Code, therefore it needed a Coder, therefore it proves Creation. Then he hides his creator in the complexity of the cell, à la Stephen Meyer. And evolution is the result of the cell modifying its own code. This looks like another example of a smart person mistakenly thinking his training in one area qualifies him to be an expert in another.

There is a recent and interesting paper in Nature that describes the whole genome assembly of Oropetium thomaeum, aka the resurrection plant, which is able to survive prolonged periods of drought and then pop back to life upon re-watering. The novelty of the paper lies in its sole use of a Single Molecule Real-Time (SMRT) sequencing platform, specifically the one on offer from Pac-Bio (http://www.nature.com/nature/journal/vaop/ncurrent/full/nature15714.html).

The work is a tour de force that demonstrates how powerful advances in sequence acquisition have become in modern biological research. Nonetheless, two aspects of the paper bug me. First, it often reads like a Pac-Bio press release. That may be warranted by the outstanding results they describe, but it made me cringe all the same. The second gripe I have is more germane to the present discussion: when touting the capacity of SMRT platforms to decode large and complex genomes, the authors make repeated reference to the direct applicability of such data to enable ENCODE-like projects in species other than humans.

It's a disturbing trend in scientific research and in the policies of funding agencies: "Sure, we spent hundreds of millions of dollars on the ENCODE project, and whether the results are right or wrong hardly makes a difference. But if you can tie your ongoing research program to that PR-driven juggernaut, you'll probably be gainfully employed for a long, long time".

Laurence A. Moran

Larry Moran is a Professor in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.

Sandwalk

The Sandwalk is the path behind the home of Charles Darwin where he used to walk every day, thinking about science. You can see the path in the woods in the upper left-hand corner of this image.

Disclaimer

Some readers of this blog may be under the impression that my personal opinions represent the official position of Canada, the Province of Ontario, the City of Toronto, the University of Toronto, the Faculty of Medicine, or the Department of Biochemistry. All of these institutions, plus every single one of my colleagues, students, friends, and relatives, want you to know that I do not speak for them. You should also know that they don't speak for me.
