Monday, September 10, 2012

Science Writes Eulogy for Junk DNA

Elizabeth Pennisi is a science writer for Science, the premier American science journal. She's been writing about "dark matter" for years, focusing on how little we know about most of the human genome and ignoring all of the data that says it's mostly junk [see SCIENCE Questions: Why Do Humans Have So Few Genes?].

It doesn't take much imagination to guess what Elizabeth Pennisi was going to write when she heard about the new ENCODE data. Yep, you guessed it. She says that the ENCODE Project Writes Eulogy for Junk DNA.

When researchers first sequenced the human genome, they were astonished by how few traditional genes encoding proteins were scattered along those 3 billion DNA bases. Instead of the expected 100,000 or more genes, the initial analyses found about 35,000 and that number has since been whittled down to about 21,000. In between were megabases of “junk,” or so it seemed.

This is just a repetition of the Myth Concerning the Historical Estimates of the Number of Genes in the Human Genome. The truth is that knowledgeable scientists knew decades ago that the human genome could only have about 30,000 genes or else the mutation load would be too high. Data collected in the 70s and 80s confirmed that we could only have about 30,000 functional genes because genes were bloated by introns and our genome was full of transposons.

The idea that much of the genome was junk was also a conclusion based on genetic load arguments and on the discovery that most of our genome was littered with defective transposons (pseudogenes). The concept of junk DNA also explained the C-Value "Paradox." None of that evidence has disappeared with the publication of the ENCODE results. The "dark matter" really is junk.

Pennisi continues ...

This week, 30 research papers, including six in Nature and additional papers published by Science, sound the death knell for the idea that our DNA is mostly littered with useless bases. A decadelong project, the Encyclopedia of DNA Elements (ENCODE), has found that 80% of the human genome serves some purpose, biochemically speaking. “I don't think anyone would have anticipated even close to the amount of sequence that ENCODE has uncovered that looks like it has functional importance,” says John A. Stamatoyannopoulos, an ENCODE researcher at the University of Washington, Seattle.

Beyond defining proteins, the DNA bases highlighted by ENCODE specify landing spots for proteins that influence gene activity, strands of RNA with myriad roles, or simply places where chemical modifications serve to silence stretches of our chromosomes. These results are going “to change the way a lot of [genomics] concepts are written about and presented in textbooks,” Stamatoyannopoulos predicts.

There's nothing new in the ENCODE results. Pennisi should remember the controversy when the pilot project results were published in 2007. Many scientists pointed out, correctly, that a transcribed region is not necessarily indicative of a biological function. They also pointed out that DNA binding proteins are EXPECTED to bind to many non-functional loci, especially in a genome full of junk DNA. A binding site does not equate to biological function.

The death of junk DNA has been greatly exaggerated but it fits in nicely with a preconceived notion of mysterious dark matter and blinders that prevent you from seeing any evidence supporting junk DNA.

There's something seriously wrong when the two leading science journals openly support a radical change in our understanding of genomes based entirely on an incorrect interpretation of the data. Even when that misinterpretation is promoted by the authors of the papers, that's no reason for prominent science journalists to stop being skeptical.

This is the time for Nature and Science to ask themselves how they could have been taken in by such hype and how they are going to prevent it in the future. They should also ask themselves whether they should retract the articles they wrote on the death of junk DNA. They would do no less for the science papers they publish if they found that the results were misleading.

NickM: I wonder if the journals will at least suck it up enough to publish critical letters

Well, at least some of them did! I just posted the following comment, on Elizabeth Pennisi’s perspective, in Science (if nothing else, the comment might have the merit of abbreviating ‘junk DNA’ and ‘functional DNA’ to jDNA and fDNA, which should be useful, as I have a hunch that a lot of ink and bits will be flowing about these concepts in the coming weeks):

Multiple eulogies for junk DNA?

Before writing a eulogy for junk DNA (jDNA), we need to know more about it. So what is jDNA?

All genomic sequences that code for proteins and functional RNA, or are involved in regulating gene expression (e.g. promoter elements) are functional DNA (fDNA). However, there are many other sequences that are functional, such as those participating in DNA replication, chromosome organization, etc.

By definition, jDNA is non-functional. However, by its bare presence in the genome, jDNA gets replicated and can undergo recombination, transcription and transposition, and it can be targeted by diverse DNA binding proteins.

ENCODE has been a logical follow-up to the Human Genome Project, which found that less than 2% of our genome codes for proteins and functional RNAs. Even by including generous estimates of regulatory sequences, the fDNA has been considered just a fraction of the genome; the rest, 90% or more, remained jDNA.

ENCODE has challenged all that, by suggesting that 80% or more of the human genome is fDNA. Accordingly, most of jDNA has evaporated. Whether this interpretation of the data, which involved a change in the definition of fDNA, was a hasty decision that reflects poorly on an otherwise remarkable project remains to be seen.

Here, I want to point out that a previous eulogy for jDNA was penned more than two decades ago (1), when it was proposed that jDNA functions as a sink for the integration of proviruses, transposons and other inserting elements, thereby protecting fDNA from inactivation or alteration of its expression.

Considering that at least 50% of the human genome is composed of transposable elements, and that the rate of their transposition is very high, this protective mechanism makes evolutionary sense. The evolution of alternative protective mechanisms against insertion mutagenesis, such as specific integration sites in species that have little jDNA (e.g. bacteria), is strong evidence for this selective pressure. However, this pressure enters a new dimension in humans and other multicellular species, in which the number of integration events in somatic cells (including those by retroviruses) that would lead to cancer would be enormous without a protective mechanism. This model is fully consistent with the current data, makes evolutionary sense, and, statistically, is a fact.

Not surprisingly, I beg to differ! As a matter of fact, I think it makes so much sense that (similar to other common sense issues that are highly inconvenient, such as the Onion Test) the only way to deal with it is to pretend that it doesn’t exist, or to say: “That doesn't make any sense,” or that “it is silly” (see Birney thinks the Onion Test is silly).

Is someone keeping a file of URLs of science media reports that the notion of junk DNA is dead? That would be useful in the future when they start claiming that they never said any such thing, or when the ENCODE people start saying that they are innocent, that they did not set off this media frenzy.

I realize that Larry has been recording major reports one by one in posts here (and all praise to him for that). But some repository would be helpful when we need to make up a slide or two with collages of science media reports, when we say to an audience "Now, some of you may have heard that there isn't any 'junk DNA' ..."

Claudiu -- I have been going tediously on the record on this one in this very blog for a while, for example: here and some earlier comments on Larry's recent posts too. (That one is also about whether it is sensible to conclude from the presence of lots of junk DNA that morphological traits are also not subject to natural selection).

Ryan -- Thanks, I saw some of your fine posts but missed that. It is a great service to us all. A list of all the science writers who are going to "have egg on their face", in effect.

I note that on Carl Zimmer's "Loom" blog at Discover Magazine's site he says he "would be all over" this story if he weren't overloaded by another one. But he does give links to both sides of the argument. Which means he is one possible writer who could comment on the Emperor's New Jeans (pun intended).

Claudiu -- you will also find me shooting off my mouth on September 5 at Panda's Thumb, in a thread started by Nick Matzke's comment. Nick, Larry, Sean Eddy, and Ryan Gregory deserve most of the honors here but I am happy to chime in.

Thanks for your response, Joe. Although I have a slightly different position on junk DNA (see my comments on Larry’s post: A Tribute to Stephen Jay Gould), which is by no means homologous to ENCODE’s extravagant and unfounded position, I appreciate your evolutionary perspective, Darwinian style, on defining junk DNA as the “DNA whose variation is not constrained by natural selection.”

When ENCODE looks at the next batch of transcription factors I would suggest that they throw in yeast GAL4 and maybe a few others. If GAL4 lights up all over the genome that would suggest they're not really looking at functionality. Another idea: if there are TFs that we know have very few targets (one target would be ideal), can we see binding sites scattered evenly over every chromosome? RW

"The death of junk DNA has been greatly exaggerated but it fits in nicely with a preconceived notion of mysterious dark matter"

It also nicely fits in with a heart-warming view of the progression of science in which scientists slowly but steadily discover signal in what was thought to be noise. The reality of science is that sometimes noise is noise.

It seems most (not all) science journalists view their job as simply interpreting new results for the average joe. They figure any skepticism was already done by the peer reviewers. Their job is to make the stuff intelligible, and on deadline. One can understand this, since (unlike, say, politics) the subject matter can be pretty arcane. In cases like this, I blame the scientists and their institutional PR machines.

Most science journalists do, indeed, claim that their job is to INTERPRET science for the general public. Like all reporters, they claim that they can see through lies and distortions and report the news correctly. No science journalist will admit that all they do is paraphrase the conclusions of the paper and the thoughts of the authors.

While I put a great deal of blame on the authors, science journalists can't claim that they are doing their job correctly when they have so many recent failures to their credit.

80% of our genome may have some sort of biochemical function, although I am highly skeptical about this claim. But does this mean that 80% of mutations are NOT neutral? Creationists loved this ENCODE project result because their famous claim is this: "most mutations are harmful". If 80% of base pairs are really functionally important to us, this means that most mutations may be deleterious rather than neutral. What I want to learn is this: What percentage of base pairs is under selection? Do the ENCODE results show that most mutations are harmful? I still feel that most mutations are neutral because every human gets 50-100 mutations from parents. If most mutations were really deleterious, all of us would be genetically ill. What is your opinion? Are most mutations deleterious or effectively neutral? Somebody please answer.

All of us are genetically ill. The human genome contains so many defunct genes, no wonder we grow old and die. Try to grow an organism from a haploid genome: failure guaranteed. It is because we have genetic backups (found in diploid genomes and genetic redundancy), that we are still around.

If it's poor understanding by science reporters then why should the public and creationists EVER have confidence in science writers or researchers? If creationists opposed these conclusions we would be charged with denying science and dangerous to science research! It means there must be a greater liberality about confidence in conclusions from "science" researchers. This is why evolution and company are not settled facts just because it's written that way. These are slippery subjects relative to hard data and origin issues, and the historic demand that scientists can't be wrong is exploding before our eyes here for somebody.

There should be a methodology about conclusions in the natural sciences that would make things like this present contention less likely. We could call it the scientific methodology. This creationist insists evolutionary biology has never been put under the scrutiny of actual standards of the scientific method. This is because past and gone events and processes can't be studied in the present. It's all lines of reasoning and fossils of data points.

If it's poor understanding by science reporters then why should the public and creationists EVER have confidence in science writers or researchers?

Exactly. You should always be very skeptical about statements made by scientists and science journalists.

The problem with creationists is that they are skeptical to the point of ridiculous about science that supports evolution but credulous to the point of IDiocy about claims that challenge evolution and support their worldview.

It's called having your cake and eating it too.

If creationists opposed these conclusions we would be charged with denying science and dangerous to science research!

Isn't it strange that creationists oppose 99% of the research in biology but they fall all over themselves glorifying the ENCODE claims?

Well, it's not biology but evolutionary biology. You're right about the acceptance of researchers' ideas when they accord with one's own. They are making a mistake here. If the researchers had said otherwise they wouldn't agree with them.

IDers would probably say a case like this is about hard data: a discovery that can be repeated in any investigation. It's not an interpretation of data but discovered data. I think they see it like this.

Yet the equation of consent to their conclusions when it suits you is something to be intellectually aware of and wary of.

Larry--I see a lot of 20/20 hindsight (or I knew things were this way) in your post. The gene number canard, for example--you write, "The truth is that knowledgeable scientists knew decades ago that the human genome could only have about 30,000 genes or else the mutation load would be too high". Yet your own blog post cited said that the majority of the bets in the 90s were in the 40,000 to 50,000 range (were those all scientists without "knowledge"?), and the current number is under 25,000--half that. Yes, a few people got it right but no one at the time had "proof" of what the number was--they had arguments based on a minimal amount of data, but that's far from saying there was a consensus that there were 20-30,000 genes. You offer a revisionist history that you accuse ENCODE and the media of doing, to my eye. For the record, what is your position on whether ENCODE should have been done--much of the backlash seems to stem from those who opposed the project and would rather have seen the money go to PI grants--much like many opposed the human genome being sequenced for similar reasons. The anti-ENCODE faction lost the debate when NIH went ahead, but now is obviously a chance to replay it.

Yes. Most of them were graduate students and postdocs whose main focus was on the technology and not on trying to understand genomes. They were heavily influenced by a back-of-the-envelope calculation done by Wally Gilbert in the 1980s. However, the point is that even among this group there was not an obvious bias toward huge numbers of genes as some people would have you believe.

Look at that chart. How many of those people do you think were really "surprised" at the low number of genes initially reported? Not many, I bet.

Were you one of the people who was surprised?

... that's far from saying there was a consensus that there were 20-30,000 genes

I did not say that there was such a consensus among molecular biologists. My point is that it's wrong to imply that "everyone" thought there were at least 100,000 genes and they were "surprised" by the publication of the human genome sequence. That's revisionist history.

Laurence A. Moran

Larry Moran is a Professor Emeritus in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.

Sandwalk

The Sandwalk is the path behind the home of Charles Darwin where he used to walk every day, thinking about science. You can see the path in the woods in the upper left-hand corner of this image.

Disclaimer

Some readers of this blog may be under the impression that my personal opinions represent the official position of Canada, the Province of Ontario, the City of Toronto, the University of Toronto, the Faculty of Medicine, or the Department of Biochemistry. All of these institutions, plus every single one of my colleagues, students, friends, and relatives, want you to know that I do not speak for them. You should also know that they don't speak for me.
