Thursday, August 01, 2013

The Junk DNA Controversy: John Mattick Defends Design

The failure to recognize the implications of the non-coding DNA will go down, I think, as the biggest mistake in the history of molecular biology.

John Mattick, ABC Australia

John Mattick has just published a paper dealing with the controversy over the ENCODE results and junk DNA. As you might imagine, Mattick defends the idea that most of our genome is functional. He attempts to explain why most of the critics are wrong.

The title of the paper is "The extent of functionality in the human genome" (Mattick and Dinger, 2013). It's published in the HUGO Journal. Recall that HUGO (Human Genome Organization) gave Mattick a prestigious award for his contributions to genome research. (See The Dark Matter Rises for a discussion of these contributions.)

Mattick's paper begins by mentioning three of the papers that were critical of ENCODE results: Dan Graur's paper (Graur et al. 2013), Ford Doolittle's paper (Doolittle, 2013), and the paper by Niu and Jiang (2013).

He begins by addressing one of Dan Graur's points about conservation.

Sequence Conservation

Let's cover a bit of background before dealing with Mattick.

Scientists have fifty years of experience looking at sequences. We have repeatedly observed that some sequences are conserved while others are not. Conserved sequences are remarkably correlated with functional regions of proteins, RNAs, and the genome. By contrast, non-conserved sequences almost always correlate with nucleic acid and amino acid sequences that are not essential for function. In the case of genomic sequences (DNA) these non-conserved sequences have often been tested and, with only a few exceptions, no evidence of function has been discovered. (Promoter bashing experiments are a good example.)

In a few cases, large regions of the genome have been deleted with no apparent effect on the organism. In other cases, considerable variation within a population is observed (e.g. humans) and the absence of some stretches of DNA in some individuals does not seem to affect these individuals. Thus, deleting non-conserved DNA doesn't appear to affect fitness suggesting strongly that it is nonfunctional.

Some parts of the human genome resemble functional genes and transposons but all available evidence indicates that these regions no longer function like the genes and transposons they resemble. They appear to be pseudogenes and defective transposons (or fragments of transposons). By looking at well-identified orthologs in different species we can see that these pseudogenes have gained fixed mutations at a rate perfectly consistent with the rate of fixation of neutral alleles by random genetic drift.

Whole genome comparisons of mammalian genes also demonstrate that 90% of their genomes are not conserved and are evolving as though the nucleotide sequence was irrelevant. (Rate of fixation equals the mutation rate.) This observation is also consistent with genetic load data showing that about 90% of our genome can't be constrained by negative selection.
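The parenthetical claim that the rate of fixation equals the mutation rate follows from a short piece of arithmetic, sketched below in Python. The sketch assumes the standard textbook setup (a diploid population of size N, a per-site neutral mutation rate mu, and a fixation probability of 1/(2N) for each new neutral mutation). The function name and the numerical mutation rate are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def neutral_substitution_rate(pop_size: int, mu: Fraction) -> Fraction:
    """Expected neutral substitutions per site per generation.

    Assumes a diploid population: 2N gene copies, each mutating at rate mu,
    and each new neutral mutation fixing with probability 1/(2N).
    """
    new_mutations_per_generation = 2 * pop_size * mu
    fixation_probability = Fraction(1, 2 * pop_size)
    return new_mutations_per_generation * fixation_probability


# Illustrative mutation rate only (an assumption, not a figure from the post).
mu = Fraction(1, 10**8)

# The population size cancels out, so the answer is mu in every case.
for n in (100, 10_000, 1_000_000):
    assert neutral_substitution_rate(n, mu) == mu
```

Because the population size cancels, sequences that are free of selective constraint should diverge at the mutation rate regardless of how large the population is, which is the signature the post says is seen in about 90% of the genome.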

It is reasonable to conclude that most of the typical mammalian genome is not functional. The only other possibility is that a large percentage of these genomes is functional but the function has nothing to do with the actual sequence of DNA.

John Mattick does not like this line of reasoning. He says ...

the substantive scientific argument of Graur et al. is based primarily on the apparent lack of sequence conservation of the vast majority (~90%) of the human genome, suggesting that this indicates lack of selective constraint (and therefore function). The fundamental flaw, however, in this argument is that conservation is relative, and its estimation in the human genome is largely based on the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty (Brosius 1999), are largely non-functional. This argument also overlooks a number of other assumptions and considerations that are tacitly embedded in conservation comparisons and their interpretation (Pheasant and Mattick 2007) ...

Mattick raises five objections that, in his opinion, make the Graur et al. argument (and the data) invalid.

This is nonsense. Lack of sequence conservation implies lack of function based on decades of work. It's true that there are some parts of the genome whose function doesn't depend on sequence but the burden of proof is on those who claim that most non-conserved sequences are functional.

As a general rule, regulatory sequences consist of short conserved sequences that bind proteins. They are easy to identify and relatively easy to recognize. They are well conserved between related species. There may be a few exceptions but the general rule applies. Regulatory sequences are just as well conserved as typical amino acid codons in proteins. Mattick is wrong.

3. regulatory sequences are the main genetic substrates for the exploration of phenotypic diversity in animals.

It's true that phenotypic differences between species can often be explained by differences in otherwise conserved regulatory sequences.

4. the conclusion of lack of conservation of most of the human genome is largely based on a circular comparison with the rate of evolution of pan-mammalian ancient ‘repeats’

Mattick complains that the lack of conservation of genomic sequences is largely based on a circular argument. This is hard to understand. He says ...

... one assumes that a subset of the genome is evolving neutrally and is therefore indicative of the rate of unconstrained divergence, then finds that most of the rest of the genome is behaving similarly, which is therefore concluded to also be non-functional. If the first assumption is incorrect ... the derived conclusion of non-functionality of the rest of the genome is also incorrect.

The logic seems relatively uncontroversial. Mattick is correct. If one assumes that part of the genome is evolving neutrally then the conclusion will be invalid if the assumption is incorrect.

The problem is that we have plenty of evidence that most of the genome is evolving neutrally so it's not an assumption. It's a fact. Maybe I don't understand this argument?

5. even if ancient repeats are neutrally evolving (which we think unlikely), the extant comparison set is restricted to those whose orthology is recognizable ...

This is true. We can only determine that pseudogenes and defective transposons are evolving neutrally if we know that the DNA regions in different species are orthologous. Fortunately, we have plenty of excellent examples. These allow us to deduce the common ancestor and determine the rate of fixation of alleles in each lineage. They serve as good examples of fixation of neutral alleles by random genetic drift.

I'm not sure I understand why this is so important to Mattick.

The C-Value Paradox

Mattick correctly identifies the main argument for junk DNA based on genome size comparisons.

... the so-called ‘C-value enigma’ , which refers to the fact that some organisms (like some amoebae, onions, some arthropods, and amphibians) have much more DNA per cell than humans, but cannot possibly be more developmentally or cognitively complex, implying that eukaryotic genomes can and do carry varying amounts of unnecessary baggage.

He argues that, while this may be true, the differences are often due to polyploidy or increases in the amount of defective transposon sequences. It's not clear to me why this invalidates the conclusion that some eukaryotes can carry a lot of junk in their genomes.

He then goes on to say ...

... there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity, a relationship that proves nothing but which suggests an association that can only be falsified by downward exceptions, of which there are none known.

That "correlation" only exists in the mind of John Mattick. He mentions "downward exceptions" and says that there are none known. I don't know what he means by this. Does he mean that the minimum size of a vertebrate genome is defined by the pufferfish, with about 27,000 genes and a total genome size of 0.33 × 10^9 bp, or about 1/10 the size of the human genome?

Or does he mean the minimum size of the mammalian genome defined by the Bent-winged bat at approximately half the size of the human genome? Either way, the human genome must contain a lot of junk that isn't required to specify a complex vertebrate.

This argument doesn't make any sense.

Pervasive Transcription

We now come to the most important part of Mattick's defense of ENCODE. The question is whether pervasive transcription is a reflection of noise or whether the majority of the RNAs produced have a function. Keep in mind that most of these RNAs are complementary to defective transposon sequences and their sequence is not conserved. Also keep in mind that only a small percentage reach a concentration of at least one molecule per cell. (Mattick does NOT mention concentration.)

Mattick's main argument for function is ...

... the vast majority of the mammalian genome is differentially transcribed in precise cell-specific patterns (Mercer et al. 2008) to produce large numbers of intergenic, interlacing, antisense and intronic non-protein-coding RNAs, which show dynamic regulation in embryonal development ...

Let's think about that for a minute.

Let's assume that the human genome is littered by chance with short sequences that resemble transcription factor binding sites. This has to be true unless there is strong negative selection against anything that resembles the binding sites of any transcription factor. There's no conceivable way that this could happen so it follows logically that there will be spurious binding sites.
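A rough back-of-the-envelope calculation shows why chance matches are unavoidable. The Python sketch below assumes a genome of random sequence with equal base frequencies, which is a simplification, and counts the expected number of exact matches to a single short motif. The function name and the motif lengths are hypothetical. The 3.2 billion bp genome size is the figure commonly quoted for humans.

```python
def expected_chance_sites(genome_bp: int, motif_len: int,
                          both_strands: bool = True) -> float:
    """Expected exact matches to one motif in a random-sequence genome.

    Assumes each base is A, C, G or T with probability 1/4, independently.
    Scanning both strands double-counts palindromic motifs, so treat the
    result as an order-of-magnitude estimate.
    """
    positions = genome_bp - motif_len + 1
    p_match = 0.25 ** motif_len
    strands = 2 if both_strands else 1
    return strands * positions * p_match


HUMAN_GENOME_BP = 3_200_000_000

# Hypothetical motif lengths in the range typical of short binding sites.
for k in (6, 8, 10):
    n = expected_chance_sites(HUMAN_GENOME_BP, k)
    print(f"{k} bp motif: about {n:,.0f} chance matches")
```

Under these assumptions an 8 bp motif is expected to turn up roughly a hundred thousand times by chance alone, and that is for a single transcription factor.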

Transcription factors will bind to these spurious nonfunctional sites as long as the DNA is available for binding. In some cases the accidental binding of a transcription factor would lead to spurious, accidental, transcription.

In almost all cases, these spurious transcripts will be extremely rare—their concentration will be less than one transcript per cell. This is important since you can't have a serious discussion of this issue without considering concentration.

If our assumption is correct, there's one other feature of the spurious transcription that must be observed: the transcription will be cell specific or developmentally regulated. This is because different transcription factors are present in different cell types and at different stages of development. It's also because the accessibility of different parts of the genome varies from cell type to cell type and at different stages of development. This is the transition from "open" chromatin to a "closed" version resembling heterochromatin.

We're left with the conclusion that spurious, accidental, transcripts must be differentially expressed as a function of cell type and development. That's exactly what we observe. But Mattick uses this necessary feature as an argument for function. That makes no sense.

He claims that ...

... differential expression (including extensive alternative splicing) of RNAs is a far more accurate guide to the functional content of the human genome than logically circular assessments of sequence conservation, or lack thereof. Assertions that the observed transcription represents random noise (tacitly or explicitly justified by reference to stochastic (‘noisy’) firing of known, legitimate promoters in bacteria and yeast), is more opinion than fact and difficult to reconcile with the exquisite precision of differential cell- and tissue-specific transcription in human cells.

I don't think it's fair to say that spurious transcription is "more opinion than fact." It's a biochemical necessity as long as you understand the properties of DNA binding proteins.

Mattick has one more argument up his sleeve and it's the same argument made by Intelligent Design Creationists.

Moreover, where tested, these noncoding RNAs usually show evidence of biological function in different developmental and disease contexts, with, by our estimate, hundreds of validated cases already published and many more en route, which is a big enough subset to draw broader conclusions about the likely functionality of the rest.

There are over one million different transcripts that have been detected in human cells. In some cases there have been obvious clues that these transcripts have a function. Many of these best candidates have been investigated and it turns out that quite a few have a function.

That's not a surprise. But just because there are functional RNAs does not mean that all RNAs are functional. It does not even mean that a substantial percentage are functional. (Remember that 300 functional RNAs out of one million is 0.03%.)
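The percentage in the parenthesis can be checked directly, and the same arithmetic shows how far the validated cases are from any large share of the transcripts. This is only a sketch: the helper names are hypothetical, the 300 and one million figures come from the text above, and the 10% target is an arbitrary illustration.

```python
from fractions import Fraction


def percent_of_total(validated: int, total: int) -> Fraction:
    """Validated transcripts as an exact percentage of all detected transcripts."""
    return Fraction(validated, total) * 100


def needed_for_share(total: int, share: Fraction) -> int:
    """Number of validated transcripts needed to reach a given share of the total."""
    return int(total * share)


TOTAL_TRANSCRIPTS = 1_000_000  # "over one million" in the text
VALIDATED = 300                # the figure used in the parenthetical

print(float(percent_of_total(VALIDATED, TOTAL_TRANSCRIPTS)))   # prints 0.03
print(needed_for_share(TOTAL_TRANSCRIPTS, Fraction(1, 10)))    # prints 100000
```

Even an arbitrary 10% share would need about 100,000 validated transcripts, several hundred times the number used in the example.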

[Mattick has been] a true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.
Hugo Award Committee

A Question of Motives

Now we get to the end of the paper and the most astonishing claim. I had to read this several times before I was sure I was interpreting it correctly.

There may also be another factor motivating the Graur et al. and related articles (van Bakel et al. 2010; Scanlan 2012), which is suggested by the sources and selection of quotations used at the beginning of the article, as well as in the use of the phrase “evolution-free gospel” in its title (Graur et al. 2013): the argument of a largely non-functional genome is invoked by some evolutionary theorists in the debate against the proposition of intelligent design of life on earth, particularly with respect to the origin of humanity. In essence, the argument posits that the presence of non-protein-coding or so-called ‘junk DNA’ that comprises >90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution, and argues against intelligent design, as an intelligent designer would presumably not fill the human genetic instruction set with meaningless information (Dawkins 1986; Collins 2006). This argument is threatened in the face of growing functional indices of noncoding regions of the genome, with the latter reciprocally used in support of the notion of intelligent design and to challenge the conception that natural selection accounts for the existence of complex organisms (Behe 2003; Wells 2011).

The last two references are to Michael Behe's paper about functional pseudogenes and to Jonathan Wells' book The Myth of Junk DNA. I don't think I've ever seen a legitimate scientific paper that references that book by Jonathan Wells.

Mattick also uses IDiot terminology when he says that, "...the argument posits that the presence of non-protein-coding or so-called ‘junk DNA’ that comprises >90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution." As you should all know by now, the accumulation of junk DNA is the antithesis of "Darwinian evolution." You should also note that it's mostly IDiots who get confused about the difference between junk DNA and "non-protein-coding" DNA.

57 comments:

I've posted the following statement on the Hugo comment page accompanying Mattick's article:

Larry Moran's critique

Readers may be interested in Larry Moran's critique at his Sandwalk blog (http://sandwalk.blogspot.de/2013/08/the-junk-dna-controversy-john-mattick.html).

IMO T. Ryan Gregory's onion test cited by Mattick is not so much about the fact that onion genomes are bigger than the human genome but rather about the fact that the sizes of the smallest and the biggest onion genomes differ by a factor of 5. One would have to claim different complexities for these onion species if one believes that most sequences are functional.

It may take some time until it shows up because the following message popped up after I submitted my comment:

Your comment will be checked by a moderator, this should happen within 2 working days. You will receive an email when the comment appears on the site or if it is rejected by the moderator.

You will have to register if you want to comment over there. If you have access to one of the following sites, username and password should be the same: BioMed Central, SpringerOpen, Chemistry Central, Current Controlled Trials, Cases Database.

Anyone who cites Jonathan Wells as an authority is a charlatan, pure and simple. We can't say for sure Mattick is a religiously-motivated IDiot, and his motivations don't really matter, but his citing of Jonathan Wells AS AN AUTHORITY proves Mattick is a charlatan.

It's all ID catchphrases.

so-called ‘junk DNA’ that comprises > 90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution

Who the hell ever uses the word "Darwinian" in a scientific paper to mean neutral goddamn evolution? An IDiot, that's who. John Timmer quantified just how much SCIENTISTS JUST DON'T SAY "DARWINISM" in a review of IDiot Stephen Meyer's Explore Evolution.

Timmer writes: [In the scientific literature] "Searching for "neo-darwinism" netted 30 references; "neodarwinism" another five. Trying "neodarwinian" and "neo-darwinian" pulled out a whopping 96 references. The term appears to have no significant presence in scientific communications. In contrast, searching for "evolution" pulled out 226,476 papers, while the more specific "selective pressure" 21,553. If this book is all about science, why not use the terminology actual scientists do? Presumably, because the institute producing the book promotes the idea that evolution isn't science, but an ideology, one that their fellows have pulled a Godwin on and attempted to tie to Nazism." [Ars Technica vs. Stephen Meyer's Explore Evolution]

Mattick: "the argument posits that the presence of non-protein-coding or so-called ‘junk DNA’"

Again, only IDiots and muggle journalists say scientists believed ncDNA = junk DNA. As I have said over and over and over and over, no geneticist nor molecular biologist ever said that he himself believed that ncDNA was equal to junk DNA, or a subset thereof. This is a lie promoted by ID creationists and reporters in the muggle press who wanted to sell magazines with a fake Kuhnian "paradigm shaft" based on lying about what scientists believed 10-20 years ago, replacing their real hypothesis with something dumber and easier to disprove.

However, Mattick's profound, ocean-deep ignorance of the history of science is not the same as ignorance of science. One can be, like Mattick, very, very, very, very ignorant of the recent history of science (and I mean the 1980s-1990s, not the 1880s) while not necessarily being ignorant of science itself. But there are other indicators of this.

Mattick: "There may also be another factor motivating the Graur et al. and related articles (van Bakel et al. 2010; Scanlan 2012), which is suggested by the sources and selection of quotations used at the beginning of the article, as well as in the use of the phrase “evolution-free gospel” in its title (Graur et al. 2013): the argument of a largely non-functional genome is invoked by some evolutionary theorists in the debate against the proposition of intelligent design"

Fraudian psychoanalysis, which does not belong in a supposed journal of genomics! If Mattick wants to write papers on psychoanalysis, let him send them to Psychology Today! Fraudian psychoanalysis and proving your point by accusing people vastly smarter than you of having bad motives is a technique universally employed by creationists.

I have never, ever, ever, ever seen such psychoanalysis or armchair psychology employed in any peer-reviewed paper on genetics, molecular biology, physics, or anything scientific. It's like when Nazis like Johannes Stark blathered about "Jewish Physics". This is shameful, and HUGO journal should be ashamed. They should retract Mattick's paper, or they should rename themselves "Modern Amateur Psychoanalysis Journal."

I also find it strange that he cited Michael Behe in this particular context. I have never seen Behe explicitly speaking about junk DNA before. In fact, his more recent writings seem to suggest that he accepts the fact that the genome seems to be littered with excess DNA.

This is what he had to say:

"If DNA were exactly like a blueprint, with no wasted space, and every line and curve representing a point of building, then this mutation rate would be fatal. After all, one critical mistake is all it takes to kill (or cause the building to collapse). But in fact, DNA isn't exactly like a blueprint. Only a fraction of its sections are directly involved in creating proteins and building life. Most of it seems to be excess DNA, where mutations can occur harmlessly." (Emphasis added)

Mattick: "the substantive scientific argument of Graur et al. is based primarily on the apparent lack of sequence conservation of the vast majority (~90%) of the human genome, suggesting that this indicates lack of selective constraint (and therefore function). The fundamental flaw, however, in this argument is that conservation is relative, and its estimation in the human genome is largely based on the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty (Brosius 1999), are largely non-functional."

This is patently false and Mattick shows his profound ignorance of sequence comparison methods!

The literature on sequence conservation rates goes back decades before modern genomics, before whole genomes could be sequenced. In the 1970's and early 1980's all molecular biologists had to work with were amino acid sequences, and they had to deduce neutral substitution rates by comparisons of CODING DNA, CODING DNA, NOT NON-CODING DNA and NOT TRANSPOSONS! The Dayhoff matrix was not constructed from damn transposons! What a blunder!

Molecular biologists would laugh out loud at Mattick's fantasies-- mol. biologists have decades of experience at actually MUTATING proteins and seeing what happens, and comparing that to sequence comparisons across many species. They know from experience, NOT "circular logic" as Mattick ignorantly claims, that sequence conservation is the most reliable, simple metric (if you don't have a 3-D protein structure) indicating functional or structural constraints.

Mattick here is guilty of "pot-kettle-black", because he himself is employing circular logic in asserting that cell-type-specific or developmentally specific expression of RNA's at incredibly low levels of concentration is the best evidence of the function. What evidence does he have to back that up? Statistics? Experimentation? Observation? No, he assumes it-- it's his hypothesis, and he presents his hypothesis as if it were data proving his hypothesis. Circular logic.

Hey, if the experimental methods don't support your hypothesis, says Mattick, the experimental methods must be wrong. Pick your methods so as to guarantee you get the result you need, says Mattick. He's Mr. Circular Logic.

"... there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity, a relationship that proves nothing but which suggests an association that can only be falsified by downward exceptions, of which there are none known."

Here Mattick is referring to his infamous “Dog’s Ass Plot” that was so effectively skewered by T. Ryan Gregory. That plot of Mattick’s was based on fake data– his diagonal slopes on the bars on the graph? made up numbers– and bordered on scientific fraud. But, having made a fake plot based on fake numbers, Mattick now treats his fake data as real.

And as for his claim of "downward exceptions", "there are none known", not only is this false, the use of the word "exception" must be challenged. The word "exception" implies that there's a rule-- but where's the rule? Mattick's Dog's Ass Plot?

Now we come to the circular logic. Who the hell decided that the Dog's Ass Plot could not be falsified by UPWARD exceptions!? Mattick decided that-- Mattick announced from atop lofty Olympus that no UPWARD exceptions can falsify his Dog's Ass Plot BECAUSE HE KNOWS THERE ARE DOZENS AND DOZENS, LIKELY MANY MANY MORE, UPWARD EXCEPTIONS. Again, it's circular logic because when the experimental evidence doesn't support Mattick's hypothesis, he announces 'hey, THOSE methods don't count'!!

This is like arguing with some Apple Computer fan circa 1995: "Well yeah, Apple computers suck at gaming, but computers aren't MEANT to be used for gaming!"

And as for his claim that downward exceptions don't EXIST, here's a bunch for ya, John!!

The Drosophila genome is very reduced in comparison to other flies. Its genome can be directly compared to other flies with, presumably, more junk.

In bladderworts [Genlisea-Utricularia] there is wide variation. Utricularia gibba is now famous for its fully sequenced, tiny genome, but the beauty part is, there are other bladderworts that are closely related but with far larger or even smaller genomes, that just beg to be sequenced. Utricularia prehensilis has 4.56 times as much DNA as U. gibba, while Genlisea hispidula has 18.4 times as much as U. gibba.

Meanwhile, G. margaretae and G. aurea are even smaller than U. gibba!

I have been going about saying someone should write an NIH grant to sequence Utricularia prehensilis or Genlisea hispidula.

The sea urchin has 814 Mbp = 1/4 x Human.

The turkey has 1.1 Gbp, about 1/3 x human.

The frog Hyla nana has 1.89 pg C = 55% of human.

Let’s not forget the 100-fold variation within amphibians, from genomes much smaller (less than 1/3) than human, to Necturus lewisi with a genome 34 times bigger than human.

“An extraordinary range of C values is found in amphibians where the smallest genomes are just below 10^9 bp while the largest are almost 10^11 [100 billion basepairs, compared to 3.2 billion in humans]. It is hard to believe that this could reflect a 100-fold variation in the number of genes needed to specify different amphibians.” [Lewin, Genes II]
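The ratios quoted in this comment are easy to check. The short Python sketch below uses only figures given above (814 Mbp for the sea urchin, 1.1 Gbp for the turkey, 3.2 billion bp for humans, and the 10^9 to 10^11 bp amphibian range). The variable and function names are hypothetical.

```python
HUMAN_BP = 3.2e9  # human genome size as given in the bracketed note above

# Genome sizes in base pairs, taken from the figures quoted in this comment.
genome_sizes_bp = {
    "sea urchin": 814e6,
    "turkey": 1.1e9,
}


def fraction_of_human(size_bp: float, human_bp: float = HUMAN_BP) -> float:
    """Genome size expressed as a fraction of the human genome."""
    return size_bp / human_bp


for name, size in genome_sizes_bp.items():
    print(f"{name}: {fraction_of_human(size):.0%} of human")

# Fold range across amphibians, from just below 10^9 bp to almost 10^11 bp.
amphibian_fold_range = 1e11 / 1e9
print(f"amphibian range: about {amphibian_fold_range:.0f}-fold")
```

The sea urchin comes out at about a quarter of the human genome and the turkey at about a third, matching the fractions stated above.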

I'm going to mention some more "Downward exceptions" which according to Mattick don't exist!!

BIRDS

The black-chinned hummingbird, Archilochus alexandri, can fly and hover and has asymmetric flight feathers and an awesome sense of balance, and has 1/4 [26%] as much DNA as a human. This is about the same size as the genome of the sea urchin.

Turkey and chicken have about 1/3 as much DNA as a human.

PLACENTAL MAMMALS

The bent-winged bat, Miniopterus schreibersi, can fly and has echolocation, but has less than 1/2 [49.4%] as much DNA as a human.

The barking deer Muntiacus muntjak has half as much DNA as a human.

AMPHIBIANS

All salamanders, newts, axolotl, caecilians and waterdogs have much, much more DNA than humans, but frogs and toads vary widely.

The ornate burrowing frog, Limnodynastes ornatus, has nearly 1/4 [27%] as much DNA as a human. This is about the same size as the genome of the sea urchin.

Couch's spadefoot toad, Scaphiopus couchii, has less than 1/3 [29%] as much DNA as a human. This is about the same size as chicken or turkey.

The Jamaican laughing frog, Osteopilus brunneus, has about 1/2 [52%] as much DNA as a human. This is more than the bent-winged bat, and about as much as the barking deer.

Within the tree frog genus Hyla there is a four-fold variation, from Hyla nana which has about half [55%] as much DNA as a human, to Hyla cf. lanciformis sp.2 which has 2.12 times as much. (Hyla versicolor has even more but it is apparently tetraploid.)

It's also interesting to note that back in 2007, Mattick believed that at least 20% of the genome is functional, which is, as Ryan Gregory showed, consistent with the view of many geneticists and molecular biologists (including Comings himself). Even Dan Graur in his lecture at the SMBE a few weeks ago guesstimated that junk DNA comprises at least 65% of the human genome (nowhere near the upper limit that Mattick set for himself). So, what's wrong now? If he truly believed back in 2007 (3 years after he published his now famous article in Scientific American) that it is reasonable to say that 20% of the genome is functional, why is he acting now as though he's revolting against an orthodoxy? And why is he defending the mistakes and the media hype of the ENCODE project?

Let's dial it back a teeny bit. Mattick isn't citing Behe and Wells for any purpose other than to show that IDiots are doing what he says they're doing: using the supposed functionality of all DNA as an argument for ID, and against the argument that junk DNA shows ID wrong. He's not claiming the IDiots are correct in their arguments for ID. He hasn't offered any support for ID. It's all just part of his attempt to psychoanalyze his opponents.

As I said in a previous thread, he is not directly supporting ID, that is true. And all he is technically doing is accusing people who defend junk DNA of doing so out of (anti)religious motivations. However, this is the first article ever by a reputable scientist in a reputable journal that talks about ID so prominently and does not say anything negative about it, which on its own is problematic, and more importantly, if you are willing to make the accusation that the concept of junk DNA is maintained in the scientific community by atheistic bias, you should be prepared for the counterargument, which is that you are supporting the scientifically indefensible notion that the whole genome is functional because you yourself have some nefarious religious agenda, even if you have not yet come out of the closet with it.

Mattick and Dinger accuse some scientists of arguing against junk DNA because they are motivated by their desire to refute intelligent design. According to Mattick, their argument is that an intelligent designer would never put junk in our genome.

Mattick and Dinger then say ...

This argument is threatened in the face of growing functional indices of noncoding regions of the genome, with the latter reciprocally used in support of the notion of intelligent design and to challenge the conception that natural selection accounts for the existence of complex organisms.

Why would they say that? Why not just say Graur and others are wrong because the scientific evidence supports function and not junk? Why did they choose to ally themselves with intelligent design?

And why did they conflate noncoding DNA and junk DNA and use the term "blind Darwinian evolution"? Those are things that the IDiots do routinely. It's safe to assume that Mattick and Dinger are familiar with the debate in the blogosphere since they reference Jack Scanlan's blog post. They must know what they are doing.

I think it's fair to psychoanalyze Mattick and Dinger since they raised the issue.

Oh, and don't forget that they said this in their abstract ...

Finally, we suggest that resistance to these findings is further motivated in some quarters by the use of the dubious concept of junk DNA as evidence against intelligent design.

"Almost" in the sense of "not"? You're reading way too much into it, but I'd be willing to change my opinion if the actual paper offered something more. As it stands, this is merely an attack on the motives of his opponents, nothing more.

If you read the paper, you cannot impute nefarious motivation to him if you are to stick to intellectual honesty. But we all know that the set of people who also read the whole paper is a subset (often not very large) of those who read the abstract, so this will have a negative effect just based on this.

Larry says: "I think it's fair to psychoanalyze Mattick and Dinger since they raised the issue."

They certainly did raise the issue, and that's to their everlasting shame-- it drops them out below the bottom of the worst of the scientific literature-- but it doesn't mean we should go there.

Nor do we need to! The logic here, the factual errors, are so bad that we don't need to question their motivations! If we get to questioning their motivations, we'll be pushed off what should be the topic:

1. their use of circular logic (while accusing others of the same),

2. their pathetic armchair psychoanalysis in place of evidence,

3. their factual errors,

4. their fallacy of affirming the consequent in spite of Dan Graur's warning not to,

5. their claim that UPWARD exceptions can't disprove the Dog's Ass Plot, and their claim that DOWNWARD exceptions don't exist!!

This is so terrible, we should not be pushed off topic into speculating about their religious beliefs. Sometimes people are pig-ignorant of science for secular reasons also.

I think that most scientists are rarely ignorant, but they associate and promote certain ideas or concepts (even when they know that they are misleading), because of potential rewards, such as advancing their careers. The case of ENCODE project is a good example.

And while we're at it, I wish journal reviewers demanded from their contributors the same transparency of sentence construction that Moran gives us. Then it might be a lot clearer exactly what Mattick's reasoning consists of.

The main self-contradictory part of Mattick's argument is where he says that repetitive DNA sequences [e.g. transposons] DON'T add to the complexity of the genome, in order to wave away the C value paradox. OTOH, he also says they're functional, all DNA is functional (!) and all of it adds to "developmental complexity", but all DNA does that without adding to "genetic complexity." Go figure!

In this way he explains the supposed superior complexity of the human as compared with, say, the salamander Necturus lewisi (the Neuse River waterdog), which has 34.5 times as much DNA as a human. (I know, I know, it's ridiculous, but that's his logic.)

I can't believe he would write a paragraph like this:

Mattick: "That may be so, but the extent of such baggage in humans is unknown. However, where data is available, these upward exceptions appear to be due to polyploidy and/or varying transposon loads (of uncertain biological relevance), rather than an absolute increase in genetic complexity"

So he just admitted that HALF THE HUMAN GENOME, HALF OF IT!! does not increase genetic complexity! Yes, you can say that HALF of the human genome does not add to genetic complexity, but you are not allowed to say it does not add to function!

He does not describe his metric for genetic complexity. If he means Kolmogorov complexity, he's right-- repetitive sequences add very little to Kolmogorov or algorithmic complexity-- but while that's true, it appears to contradict his assertion that the whole genome is functional, and all of it adds to "developmental complexity."

But besides being self-contradictory, it's gobbledygook in terms of Mattick's own allegedly relevant measure of complexity, so-called "developmental complexity," for which he does not define a metric! Yet supposedly we humans are superior to all other organisms by a metric Mattick won't define! Like here:

Mattick: "Moreover, there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity"

Mattick doesn't define "developmental complexity" and he CANNOT without looking like a fool, and sucking himself into a whirlpool of self-contradiction! Consider the following contradictory facts:

All salamanders, newts, axolotls, caecilians and waterdogs have far larger genomes than human beings, SFAIK. That's the rule, not the exception. Are they more developmentally complex than humans?

Suppose Mattick were to argue that indeed, all salamanders are indeed more developmentally complex than humans. He'd then have several huge problems.

1. Axolotls don't fully develop into the adult form of salamanders-- they keep their gills-- they are certainly less "developmentally complex" than humans and other salamanders, but the axolotl Ambystoma mexicanum has 13.7 times as much DNA as a human! SFAIK, this is true of all axolotls studied!

2. Caecilians are legless like snakes, but are amphibians. They never grow legs, and are certainly less "developmentally complex" than humans, but the legless caecilian Siphonops annulatus has 4 times as much DNA as a human.

3. The two-toed amphiuma, Amphiuma means, aka "Conger eel", like an axolotl, never undergoes full development, has small vestigial legs, no eyelids, and no tongue. But it has 27.4 times as much DNA as a human. According to Mattick's logic, it is more "developmentally complex" than a human.

Mattick cannot dismiss these facts as "flukes" or exceptions to his imaginary "rule" that he faked in his Dog's Ass Plot. Again I repeat: All salamanders, newts, axolotls, caecilians and waterdogs have far larger genomes than human beings, SFAIK.

4. Another rule: All lungfishes have more DNA than humans. The African marbled lungfish Protopterus aethiopicus has 38 times more DNA than a human. Are all lungfish more "developmentally complex" than humans?

5. Another rule: Marsupials on AVERAGE have 22% more DNA than the average placental mammal. The AVERAGE for marsupials is 16% higher than human genome size. Bennett's wallaby, for one, has 60% more DNA than a human. That's not a fluke, that's the rule. Are marsupials more "developmentally complex" than humans?

Those are the rules. Those are not the exceptions.

There are also many frogs, many sharks, many crustaceans, some insects, some annelid worms, many flatworms and many plants with more DNA than humans.

IMO Mattick's use of the term "polyploidy" is wrong. I would rather use "genome duplication" instead. While polyploidy is the starting point for genome duplication, it implies that chromosome numbers remain constant after duplication events. Genome duplication, IMO, leaves room for later rearrangements (translocations, inversions, deletions, additional gene duplications) that have happened to form the chromosomes of living species, many of which are diploid despite some polyploid state of a common ancestor.

Let's do some simple back-of-the-envelope calculations of percent non-coding DNA, and see if Mattick is right about "no downward exceptions."

It's easy to look up genome sizes, but I can't find many counts of coding base pairs for all species.

Here are three where I at least know the gene counts.

The sea urchin has 814 Mbp and 23.3K genes.

The pufferfish has 330 Mbp and 27K genes.

The human has 3200 Mbp and 23K genes.

So we guess that each gene has, say, 1500 bp of coding sequence. In the ballpark.

Then we compute the percent of non-coding DNA:

((genome size in bp) - (# of genes) * 1500) / (genome size in bp)

Sea Urchin: 95.7%

Puffer fish: 87.7%

Human: 98.9%
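The arithmetic above is easy to reproduce. Here is a minimal Python sketch, assuming (as the comment does) a flat 1500 bp of coding sequence per gene. The function name `percent_noncoding` and the dictionary layout are just illustrative choices, and the genome sizes and gene counts are the rounded ones quoted above.

```python
# Rough estimate of percent non-coding DNA, assuming a flat 1500 bp
# of coding sequence per gene (a ballpark guess, not a measured value).
CODING_BP_PER_GENE = 1500


def percent_noncoding(genome_bp, gene_count, coding_bp_per_gene=CODING_BP_PER_GENE):
    """Return the percentage of the genome not covered by coding sequence."""
    coding_bp = gene_count * coding_bp_per_gene
    return 100.0 * (genome_bp - coding_bp) / genome_bp


# Genome sizes (bp) and gene counts as quoted in the comment above.
genomes = {
    "sea urchin": (814_000_000, 23_300),
    "pufferfish": (330_000_000, 27_000),
    "human": (3_200_000_000, 23_000),
}

for name, (size_bp, genes) in genomes.items():
    print(f"{name}: {percent_noncoding(size_bp, genes):.1f}%")
```

Changing `coding_bp_per_gene` shows how sensitive the estimate is to the 1500 bp guess; the ordering of the three species does not depend on it as long as the same value is used for all of them.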

Is a sea urchin more developmentally complex than a pufferfish?

Mattick says there are no "downward exceptions" to his rules. Is that true?

Within many genera there is a vast variation in genome size, but no real variation in developmental complexity.

Within onions there is a 9.8-fold variation. The difference between the largest and the smallest onion genome is 19.1 times the size of the whole human genome.

Within Necturus, the "waterdogs" [amphibians], there is a 5-fold variation over the genus. The difference between the largest and the smallest is 27.5 times the size of the whole human genome.

Within Genlisea-Utricularia [bladderworts, a flowering plant] there is a 24-fold variation in genome size.

Within Hyla, a genus of tree frogs, there is a 4-fold variation.

Within Xenopus, another frog genus, there is a 2.7-fold variation.

Within the genus Ctenomys, the tuco-tuco [a rodent] there is a 1.75-fold variation.

Among amphibians, there is at least a 100-fold variation. Among angiosperms, there is a 2000-fold variation.
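The onion and Necturus figures above give both a fold range and the largest-minus-smallest difference in units of the human genome, which is enough to back out the two endpoints: if largest = f × smallest and largest - smallest = d, then smallest = d / (f - 1). A small Python sketch of that step follows. The function name `endpoints` is hypothetical, and the inputs are the rounded numbers quoted above, so the outputs are only approximate.

```python
def endpoints(fold, diff):
    """Given largest/smallest = fold and largest - smallest = diff,
    return (smallest, largest) in the same units as diff."""
    smallest = diff / (fold - 1)
    return smallest, fold * smallest


# Units are "human genomes"; inputs are the rounded figures from the comment.
for genus, fold, diff in [("onions", 9.8, 19.1), ("Necturus", 5.0, 27.5)]:
    small, large = endpoints(fold, diff)
    print(f"{genus}: smallest ~{small:.1f}x human, largest ~{large:.1f}x human")
```

On these numbers the largest Necturus genome comes out at roughly 34 times the human genome, which is in line with the 34.5-fold figure quoted earlier in the thread for Necturus lewisi.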

Certainly everything in a genus should have the same number of genes; thus the percentage of non-coding DNA must vary enormously over these genera.

Certainly everything in a genus should have the same developmental complexity. How can Mattick say that developmental complexity scales with percent of non-coding DNA, and that there are no "downward exceptions"?
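To put rough numbers on this, here is a short Python sketch. It assumes that the amount of coding DNA is the same across a genus, so a genome that is `fold` times larger has its coding fraction divided by `fold`. The 10% starting coding fraction is a purely hypothetical value chosen for illustration, and the function name is made up; only the fold ranges come from the list above.

```python
def noncoding_percent_after_fold(coding_fraction_smallest, fold):
    """If the smallest genome has the given coding fraction and another genome
    is `fold` times larger with the same amount of coding DNA, return the
    percent non-coding DNA in the larger genome."""
    return 100.0 * (1.0 - coding_fraction_smallest / fold)


# Hypothetical starting point: 10% coding in the smallest genome of each genus.
# The fold ranges are the ones quoted in the comment above.
start = 0.10
genera = [
    ("Ctenomys", 1.75),
    ("Xenopus", 2.7),
    ("Hyla", 4.0),
    ("Necturus", 5.0),
    ("onions", 9.8),
    ("Genlisea-Utricularia", 24.0),
]

for genus, fold in genera:
    pct = noncoding_percent_after_fold(start, fold)
    print(f"{genus}: {100 * (1 - start):.1f}% -> {pct:.1f}% non-coding")
```

Whatever starting fraction is assumed, the coding fraction shrinks by the full fold range within each genus, while "developmental complexity" presumably stays put.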

Whether there are downward exceptions depends entirely on what you use for your standard. Has Mattick established a standard? If you pick as your standard the species with the smallest genome in each group, then there will be no downward exceptions. Of course that will increase upward exceptions. I'm wondering what the standard vertebrate might be. If it's fugu, perhaps there are no downward exceptions. But it appears to be, judging by the figure, a doggie. So that's a problem for him. I forget why humans aren't vertebrates and why vertebrates aren't chordates. But never mind.

Harshman asks: "Whether there are downward exceptions depends entirely on what you use for your standard. Has Mattick established a standard?"

In the Dog's Ass Plot, the y axis is percent of non-coding DNA. But within single genera, there are huge variations in genome sizes, so considering that different species within the same genus must have comparable numbers of genes, and identical "complexity", there must be huge variations in percentages of non-coding DNA, which do not track with "developmental complexity."

I don't have access to the full Wong et al. article at the moment, but my impression is that it rather states that mRNAs containing retained introns are directed towards NMD. Thus, it is likely that at least many such RNAs are non-functional. The hype about the impact of splice variants and the size of databases of alternatively spliced transcripts was dubious even before this, though.

One may wonder if Mattick and Dinger are aware of how genome size is evaluated. How can they talk about polyploidy when C-values refer to the haploid genome, i.e., the measure already corrects for ploidy? From their abstract:

We also show that polyploidy accounts for the higher than expected genome sizes in some eukaryotes, compounded by variable levels of repetitive sequences of unknown significance.

Otherwise one would have to conclude that they knowingly misrepresent the C-value paradox.

Y'all's fits validate what these researchers assert in their peer-reviewed work: that you care less about scientific research than about protecting a preferred ideology. Do some current peer-reviewed work to back your assertions, and maybe the rest of us will take you seriously...

The Mattick article contains no original research. That which is asserted without doing original research can be refuted without doing original research.

Mattick's article is the equivalent of a letter to the editor-- it's his opinion, no new research there-- but we know his logic is self-contradictory, when it isn't circular, and we know he's factually wrong.

When he says there are "no downward exceptions", that's factually wrong. It's not backed up by his original research, and we don't need to do original research to refute it. Just read or download T. Ryan Gregory's database of animal genome sizes.

The following quote is taken from the abstract of the paper that you cited:

"We have previously argued that the proportion of an animal genome that is non-protein-coding DNA (ncDNA) correlates well with its apparent biological complexity."

This claim has been shown to be factually incorrect, in this thread and in many others. And the graph that you're trying to distance Mattick from is actually an accurate reflection of the meaning of the statement quoted above.

In any case, you haven't addressed any of the arguments that Larry and others have raised against Mattick's central claims. In fact, if Mattick is an honest scientist, and I believe he is, he should admit that the many "downward exceptions" that have been provided in this thread do actually falsify his claims.

There are always exceptions in biology; such is the way of nature. In a much more refined and detailed approach than the approximations of this thread, the authors also report: "we extended our prior work to the 1,627 prokaryotic and 153 eukaryotic genomes described above and found a clear correlation between the nc/tg ratio and increasing complex taxonomic groups (p < 2.2e-16, Kruskal-Wallis test, Fig. 2A). The range of nc/tg values is considerable, with the averages for archaea and bacteria being nearly identical (two-tailed p = 0.359, Mann-Whitney U test) at 0.130 and 0.136, respectively, and extending to ~0.98 in the Metazoa. The average value for each taxa is minimally influenced by data points outside the first or third quartiles. For example, [...]" And they further state: "To further refine the association of nc/tg ratio values and organismal complexity, we investigated the 73 species with a previously defined number of cell types [35]. Examining these species revealed a positive correlation between the nc/tg ratio and organismal complexity (Fig. 2B, Spearman correlation coefficient r = 0.952, p value < 0.0001). We found that the distribution of values was well described by a modified Hill’s equation [56] (which is itself a modified logistic function, see “Discussion”), in the form y = Kx^n/(1 + Kx^n) where K = 0.15219 ± 0.02272 with a p value < 0.0001 and n = 0.99888 ± 0.06943 with a p value < 0.0001 (Fig. 2B). This distribution is consistent with patterns observed in complex information systems theory, in which the amount of encoded information approaches an asymptote defined by the maximum allowable entropy (see “Discussion”)."
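For readers who want to see the shape of the curve in that quote, here is a small Python sketch of the form y = Kx^n/(1 + Kx^n) using the fitted K and n given there. Treating x as the number of cell types and y as the nc/tg ratio is one reading of the quote, not something it states explicitly. The function name `hill` and the sample x values are illustrative only.

```python
K = 0.15219  # fitted value from the quoted passage
N = 0.99888  # fitted exponent from the quoted passage


def hill(x, k=K, n=N):
    """Modified Hill form y = K*x^n / (1 + K*x^n); y approaches 1 as x grows."""
    t = k * x ** n
    return t / (1.0 + t)


# Sample x values chosen only for illustration (assumed to be cell-type counts).
for x in (1, 10, 100, 200):
    print(f"x = {x:>3}: y = {hill(x):.3f}")
```

Because n is so close to 1, the curve is nearly y = Kx/(1 + Kx): it rises quickly at small x and flattens toward 1 at large x.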

I suspect that by "downward exceptions", what is implied is a downward trend in the nc/tg ratios during evolution.

If that's what Mattick meant by "downward exceptions" then his claims have been falsified.

This article is behind a pay wall. Noncodarnia, please copy and paste the values of nc and tg and celltype counts for the 73 species named. It might be in the Supplemental materials. Let's make our own plot and see if it matches the dog's ass plot.

We should also note that correlation does not prove causation. More complex organisms have smaller population sizes, hence more slightly deleterious mutations, hence more non-coding DNA. So this correlation does not get close to proving all ncDNA adds to biological complexity.

But we know the rules: all salamanders, newts, caecilians, axolotls etc. have much more ncDNA than humans. All lungfish have more ncDNA than humans. Marsupials on average have more DNA than placentals incl. humans. Sharks have more ncDNA than bony fish. So let's make our own plot and see if we can discern the real rules.

Noncodarnia says there are always exceptions in biology. Mattick just said there are none. Noncodarnia should therefore regard Mattick as incompetent. I doubt such blatant contradictions will trouble him.

And another question: why in this paper did Mattick only analyze ANIMALS? No plants, no fungi allowed. They've been EXPELLED.

Inconvenient facts: angiosperm genomes vary by 2000-fold.

I have two questions for you:

Is Paris japonica a downward exception?

Is Utricularia gibba?

I think you'll change the subject rather than answer. Mattick is cherry picking his data, therefore he did not pick cherries; they would be a most unsuitable data point.

Adding to what Diogenes has just said, the following is quoted from Graur's critique of the ENCODE media hype:

"Actually, evolution can only produce a genome devoid of “junk” if and only if the effective population size is huge and the deleterious effects of increasing genome size are considerable (Lynch 2007). In the vast majority of known bacterial species, these two conditions are met; selection against excess genome is extremely efficient due to enormous effective population sizes, and the fact that replication time and, hence, generation time are correlated with genome size. In humans, there seems to be no selection against excess genomic baggage. Our effective population size is pitiful and DNA replication does not correlate with genome size."

I'm afraid that the inclusion of prokaryotic genomes in the first analysis that you quoted might have skewed the results.

Shadi, you are right about his large number of prokaryotes, but I would like to know the 73 species used by Mattick to compute the correlation with cell type counts-- perhaps the subset of 73 is less dominated by prokaryotes.

I asked Noncodarnia to paste Mattick's data and I'm still waiting.

But if there's a bias toward prokaryotes, it's part of a larger problem of bias: Mattick limits himself to sequenced genomes, and so that database is highly biased against large genomes, because they're expensive to sequence.

So with this standard, he's generally examining the near smallest genomes in each group, humans excepted. This bias might be why humans stand out-- an artifact of our human interest in sequencing our own bloated genome, but not the Neuse River waterdog's.

You guys and gals may find it interesting that the Monarch Butterfly (Danaus plexippus) has a genome size of 0.29pg (picograms) while the Least-Marked Euchlaena Moth (Euchlaena irraria) has a genome size of 1.94pg.

From this page:

http://www.genomesize.com/statistics.php?stats=insects

Of course 'complexity' can be argued endlessly and a Monarch butterfly isn't exactly the same critter as a Euchlaena Moth. They each have their own attributes. The Monarch stands out with its migration and wintering behavior, while the moth may be able to hear the sonar signals of bats and avoid being eaten. I say "may" because I'm not familiar with that particular moth. Still, I'm a bit surprised that the Monarch genome is smaller than that of a Euchlaena Moth and all the other assayed leps (59 total), according to that page.

Interesting. Above I asked whether Mattick had cherry picked his data by not picking cherries. For the record, the common cherry Prunus avium has 1/10th the DNA of a human, but within the genus Prunus there is a 13-fold variation.

The largest Prunus genome is bigger than the human genome, and the difference between the largest and the smallest is 3.37 pg, almost as big as the whole human genome.

And on the topic of crickets (since crickets are all we hear when we ask the creationists or Mattick and coworkers to explain the C value paradox), the families Gryllacrididae and Gryllidae have genome sizes varying by 6-fold among them. The camel cricket has 2.7 times more DNA than human. The difference between the largest and the smallest is more than twice the size of the whole human genome.

SPARC, you're the hero of the day. Thank you very much for stepping up to the plate when Noncodarnia (who I think might be a student of Mattick's) ran off.

The cell type counts are great, but might I trouble you to copy the coding nucleotide counts and genome sizes for the 73 species that Mattick analyzes? Not all of them, just the 73 for which he computed the correlation.

Laurence A. Moran

Larry Moran is a Professor Emeritus in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.

