Thursday, August 07, 2014

The Function Wars: Part IV

The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)

This is my fourth post on the function wars.

The first post in this series covered the various definitions of "function" [Quibbling about the meaning of the word "function"]. In the second post I tried to create a working definition of "function" and I discussed whether active transposons count as functional regions of the genome or junk [The Function Wars: Part II]. I claim that junk DNA is DNA that is nonfunctional and it can be deleted from the genome of an organism without affecting its survival, or the survival of its descendants.

In the third post I discussed a paper by Rands et al. (2014) presenting evidence that about 8% of the human genome is conserved [The Function Wars: Part III]. This is important since many workers equate sequence conservation with function. It suggests that only 8% of our genome is functional and the rest is junk. The paper is confusing and I'm still not sure what they did in spite of the fact that the lead author (Chris Rands) helped us out in the comments. I don't know what level of sequence similarity they counted as "constrained." (Was it something like 35% identity over 100 bp?)
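
To make the parenthetical question concrete, here is a minimal Python sketch of what a threshold like "35% identity over 100 bp" would measure. It is not the method used by Rands et al.; the function name `percent_identity` and the two toy sequences are hypothetical, and the sketch assumes the sequences are already aligned, with `-` marking gaps.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped columns where both sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy 10 bp alignment (hypothetical sequences, not real genomic data)
human = "ACGTACGTAC"
mouse = "ACGTTCGAAC"
print(percent_identity(human, mouse))
```

For comparison, two unrelated sequences with equal base frequencies would be expected to match at about 25% of positions by chance alone, so a cutoff near 35% would not be far above background.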

My position is that there's no simple definition of function but sequence conservation is a good proxy. It's theoretically possible to have selection for functional bulk DNA that doesn't depend on sequence but, so far, there are no believable hypotheses that make the case. It is wrong to arbitrarily DEFINE function in terms of selection (for sequence) because that rules out all bulk DNA hypotheses by fiat and that's not a good way to do science.

So, if the Rands et al. results hold up, it looks like more than 90% of our genome is junk.

Let's see how a typical science writer deals with these issues. The article I'm selecting is from Nature. It was published online yesterday (Aug. 6, 2014) (Woolston, 2014). The author is Chris Woolston, a freelance writer with a biology background. Keep in mind that it was Nature that started the modern function wars by falling hook, line, and sinker for the ENCODE publicity hype. As far as I know, the senior editors have not admitted that they, and their reviewers, were duped.

It must be very difficult to cover this story, although that hasn't prevented some science writers (e.g. Elizabeth Pennisi) from trying. (She has been spectacularly unsuccessful.)

Here's what Chris Woolston writes.

Just how much of our genome serves a purpose anyway? A recent study reignited the debate on this, particularly on social media. ...

After comparing the genomes of 12 different mammals (including humans, mice and pandas), researchers at the University of Oxford, UK, concluded that only about 8.2% of the human genome is shaped by natural selection. The rest, they argue, is non-functional. Observers noted the large difference between this estimate and a previous claim by the ENCODE (Encyclopedia of DNA Elements) Project that 80% of the genome is biochemically active. Patrik D'haeseleer, a computational biologist at Lawrence Livermore National Laboratory, California, tweeted “only between 8% and 80% of human #genome is functional. Glad we've got that sorted out.” At the heart of the issue are differing definitions of 'function'. Erick Loomis, an epigeneticist at Imperial College London, tweeted: “Maybe we should stop using 'functional' if we can't find a common definition.”

I don't know why it was necessary to quote Patrik D'haeseleer; he clearly doesn't understand the problem. Otherwise, this is a pretty good description of the issue.

I understand the frustration of people like Erick Loomis. (What the heck is an "epigeneticist"?) I imagine that most scientists are pretty tired of reading about the function wars. But avoiding the word "function" isn't going to make the problem go away. Something other than semantics is at stake. For example, we would still have to deal with the question of junk even if we studiously avoided the word "function."

Besides, as any biologist (including epigeneticists?) should know, we can't agree on the definitions of all kinds of things like "gene," "species," "evolution," "epigenetics," and "The Central Dogma of Molecular Biology," and that doesn't prevent us from talking about them.

Chris Woolston continues ...

The attempts to define genome function have been mired in controversy since ENCODE published its '80%' finding in 2012 (Nature 489, 57–74; 2012). A subsequent paper from the same consortium a few months ago also met with derision, partly because it didn't even speculate on the fraction of the genome that might have a purpose (M. Kellis et al. Proc. Natl Acad. Sci. USA 111, 6131–6138; 2014). That paper did, however, argue that evolutionary, genetic and biochemical data need to be taken into account to work out the answer.

In the latest report, the Oxford researchers responded to that call by focusing on evolutionary data. They looked for parts of the genome that showed low rates of mutation, a sign that those regions were conserved through natural selection. They classified the sequences — and only those sequences — as functional, a definition that is at odds with that used by ENCODE, which equated biochemical activity with functionality.

I think this is a pretty accurate summary of the problem.

The shifting definitions confused some readers. "I don't get this paper," tweeted John Greally, an epigeneticist at the Albert Einstein College of Medicine of Yeshiva University in New York City. "Functional=conserved, but discussion acknowledges that function can be in non-conserved sequences?" When reached for further comment, Greally says that he "gets" the paper now, but that he is "still frustrated by the way this debate is causing so much unproductive friction".

I'm with Greally (another "epigeneticist"?) except that I still don't "get" the Rands et al. paper. I don't understand what they did or how they distinguish between "constrained" and "conserved" sequences. Nevertheless, there are many papers that agree with the general conclusion: about 5-10% of the human genome is conserved.

When Greally says he is frustrated, he is not alone. I, too, regret that there have been so many papers discussing "function." The semantic debate is distracting us from the real issue. As soon as ENCODE opponents started debating the meaning of the word "function" they conceded that there IS a debate and ENCODE may be right after all.

The paper, Greally says, missed an opportunity to explore why certain sequences — especially those known as transcription factor binding sites — are under such low evolutionary pressure, even though they presumably have important biological roles. Instead, he adds, the authors emphasized the supposed discrepancy with ENCODE. "The paper appears to be in use as a bludgeon with which to hammer the ENCODE project, not necessarily by the authors, but by others," he suggests.

I wasn't aware of the fact that transcription factor binding sites are "under low evolutionary pressure." Is this true? Is the consensus binding site for human transcription factors different than the binding site for the orthologous mouse transcription factor? I didn't think there was a difference for most transcription factors.

In any case, I don't think the Rands et al. study is capable of recognizing conserved transcription factor binding sites unless they are embedded in a fairly large stretch of additional conserved sequence.

And, yes, the paper was intended as a criticism of the ENCODE publicity hype and their ridiculous claim that 80% of our genome is functional. We need more bludgeons because there are still some biologists who don't get it.

One outspoken critic of ENCODE is Dan Graur, who studies molecular evolutionary bioinformatics at the University of Houston, Texas. He publicly celebrated the new paper by tweeting: "What an amazing birthday present." In a follow-up interview, he said that the paper refutes ENCODE's claims, and added that it is "idiotic" to suggest that a part of the genome could be functional if it didn't respond to pressure from natural selection.

Hmmm ... I have suggested that parts of the genome are functional even though their sequences are not conserved by pressure from natural selection. These are spacer sequences such as those required to separate some transcription factor binding sites and intron cleavage recognition sites. I don't think this is necessarily "idiotic" but I'll have to ask Dan what he thinks of my idea. I also don't think that the various bulk DNA hypotheses are necessarily idiotic. True, there are some idiots who advocate a role for bulk DNA, but there are also some very smart people who have contributed to the debate.

ENCODE member Ross Hardison, a molecular biologist at Pennsylvania State University, called the latest paper "elegant" even though it took a different view of functionality. The Oxford group's findings don't contradict those of ENCODE, he says, because the project never estimated the proportion of the genome that would be conserved through natural selection. He added that it will probably take a combination of approaches to determine which parts of the genome we can't live without. "I expect that with more experiments and analyses, estimates of the proportion of the human genome that is functional will approach some convergence, even though they are pretty far apart now."

The Rands et al. paper directly contradicts the claim by the ENCODE consortium that 80% of our genome is functional.

The ENCODE Consortium did look at conservation and the results were reported in the summary paper (ENCODE, 2012). Their Figure 1 looks at the conservation of the elements they mapped. They conclude that a significant percentage of these elements show evidence of selection, especially in primate lineages. They speculate that these "functional" elements arose relatively recently in the human lineage. They conclude (page 71) ...

Importantly, for the first time we have sufficient statistical power to assess the impact of negative selection on primate-specific elements, and all ENCODE classes display evidence of negative selection in these unique-to-primate elements. Furthermore, even with our most conservative estimate of functional elements (8.5% of putative DNA/protein binding regions) and assuming that we have already sampled half of the elements from our transcription factor and cell-type diversity, one would estimate that at a minimum 20% (17% from protein binding and 2.9% protein coding gene exons) of the genome participates in these specific functions, with the likely figure significantly higher.

Their estimate of 20% constrained sequence is contradicted by the Rands et al. paper. Ross Hardison is going to be disappointed. There isn't going to be any "convergence" or middle ground in this debate.
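
The arithmetic behind the ENCODE figure can be reconstructed from the passage quoted above. The following is a minimal Python sketch that uses only the numbers appearing in this post (8.5%, the "sampled half" assumption, 2.9%, and the 8.2% reported for Rands et al.). The variable names are hypothetical, and treating the step from 8.5% to 17% as a simple doubling is an assumption about how the consortium arrived at it.

```python
# Numbers taken from the passages quoted in this post; names are hypothetical.
protein_binding_sampled = 8.5  # % of genome, ENCODE's "most conservative estimate"
sampled_fraction = 0.5         # ENCODE assumes half of the elements have been sampled
coding_exons = 2.9             # % of genome in protein-coding gene exons

# Assumption: the 17% figure comes from scaling 8.5% up by the sampled fraction.
protein_binding_total = protein_binding_sampled / sampled_fraction
encode_minimum = protein_binding_total + coding_exons

rands_constrained = 8.2        # % of genome, Rands et al. as reported by Woolston

print(f"ENCODE minimum: {encode_minimum:.1f}%")
print(f"Rands et al.:   {rands_constrained:.1f}%")
print(f"Ratio:          {encode_minimum / rands_constrained:.1f}x")
```

On these numbers the ENCODE minimum comes to 19.9%, which they round to 20%, roughly 2.4 times the Rands et al. estimate.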

For those of you who are truly interested, Ross Hardison has a podcast where he defends ENCODE against Dan Graur [Debating ENCODE Part II: Ross Hardison, Penn St.]. The interviewer never asks the big question, "How much of our genome is junk?", but Hardison tells us that he has never been comfortable with the idea of junk DNA.

I think that Chris Woolston did a pretty good job of explaining this controversy to the general audience of Nature readers. My only complaint is that he was a little too even-handed. The ENCODE Consortium is clearly on the losing side of the debate about junk DNA and that is the real story. There is a massive amount of evidence supporting the idea that most of our genome is junk.

7 comments:

"I, too, regret that there have been so many papers discussing "function." The semantic debate is distracting us from the real issue."

Hmmm. I think you're loving it. And I think you enjoy holding people's feet to the fire when they say stupid things, perhaps because they have been swept along with some trend or hype, or perhaps because they never really learned to be a critical scientist.

Or if you don't actually enjoy it, maybe you do it from some sense of responsibility as a teacher. Either way, good on ya.

I wasn't aware of the fact that transcription factor binding sites are "under low evolutionary pressure." Is this true? Is the consensus binding site for human transcription factors different than the binding site for the orthologous mouse transcription factor? I didn't think there was a difference for most transcription factors.

I'm surprised this wasn't known to you, Larry. It should be pretty easy to work out from the very same principles of transcription you have often written about on this blog. Isn't the main constraint operating on a transcription factor binding site that it is sufficiently large that it is unlikely to look like another piece of the genome, to prevent transcriptional interference?

It would seem to me that such a binding site would be much more tolerant of mutation, as long as it doesn't start looking too much like other binding sites or substantially erode the binding affinity of the transcription factor.
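
One way to put rough numbers on the "sufficiently large" constraint is to ask how often a specific site of a given length would turn up by chance. The following Python sketch is a back-of-envelope illustration only: it assumes a genome of about 3.2 billion bp, equal and independent base frequencies, exact matching, and both strands counted. None of these assumptions comes from the discussion above, and the function name is hypothetical.

```python
GENOME_SIZE = 3_200_000_000  # assumed haploid human genome size in bp


def expected_chance_matches(site_length, genome_size=GENOME_SIZE):
    """Expected number of exact matches to one specific sequence of this length
    in a random genome with equal base frequencies, counting both strands."""
    positions = genome_size - site_length + 1  # possible start positions per strand
    return 2 * positions / 4 ** site_length


for k in (6, 8, 10, 12, 16, 17, 20):
    print(f"{k:>2} bp site: ~{expected_chance_matches(k):,.2f} chance matches")
```

Under these assumptions a 6 bp site is expected about 1.5 million times by chance, and the expected count only drops below one at around 17 bp. Real genomes are not random sequence and real binding is not all-or-nothing, so this is a scale estimate and nothing more.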

I have even used the fact that some of these binding sites seem to drift weakly over time, to show that they are at some level incompatible with the ID argument of "common design", in arguments with creationists. The argument goes that the designer would not need to slowly change these binding sites in different species, he could have simply designed a large set of binding sites and used the same set in all his different creations. The fact that they aren't the same, but actually slowly evolve, is evidence against the idea that the designer is re-using his "designs" in different organisms. It's a falsification of the "common design" retort.

I don't have a problem with mutations that affect the binding site or its position. What I was questioning is the idea that the protein evolves rapidly so that it recognizes different consensus sequences in different species. I have difficulty believing that this happens frequently in less than one hundred million years. Does anyone have an example of homologous transcription factors from different mammals that bind to different sequences? How many examples are there?

Why should sequence conservation, or transcription, be necessary for a segment of DNA to have a function? Theoretically at least, some segments of DNA could be spacers, part of the scaffolding needed to physically construct a particular protein. They would not be transcribed and their sequences would not be conserved (only the length would matter) but they would still have a function.

Laurence A. Moran

Larry Moran is a Professor Emeritus in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.

