A Response to Paul McBride on Junk DNA

We have recently been discussing a series of critical reviews of Science and Human Origins published by PhD student Paul McBride on his blog Still Monkeys. I am going to respond now to McBride's review of chapter 4, in which he tackles Casey Luskin's handling of the subject of "junk DNA." McBride's rebuttal to chapter 4 is divided into two sections -- part 1 and part 2. First, McBride (before having read the chapter) makes ten predictions about what he will find. He then offers a response to the chapter itself.

In his review, McBride notes that he had predicted that Luskin would "conflate non-coding DNA and junk DNA, and that Luskin would exploit this erroneous conflation by pointing to known functions of non-coding DNA as evidence against junk DNA."

Of course, no one today (including the likes of Larry Moran, PZ Myers and T. Ryan Gregory) denies that at least some non-protein-coding DNA serves important functions. The term "junk DNA" was first coined in 1972 in a paper by Susumu Ohno. Although Ohno believed that the vast majority of the DNA that didn't code for proteins was "the remains of nature's experiments which failed," Ohno suggested that "these silent DNA base sequences may now be serving the useful but negative function of spacing [genes]."

McBride writes,

I'd like just once to see all these references where all these researchers are saying that if DNA does not code for a protein then it is junk.

As stated above, no credible scientist claims that all non-coding DNA is "junk." But there are plenty of scientists who claim that the majority of it is "junk." In 1980, Francis Crick and Leslie Orgel published an article in Nature titled "Selfish DNA: The Ultimate Parasite." They concluded that "there is a large amount of evidence that suggests, but does not prove, that much DNA in higher organisms is little better than junk." Although they claimed that it is not "very plausible that all this extra DNA is needed for gene control," they conceded that "some portion of it certainly must be."

As recently as 2009, in The Greatest Show on Earth, Richard Dawkins has claimed that "the greater part (95 per cent in the case of humans) of the genome might as well not be there, for all the difference it makes."

In 2010, John Avise published a book titled Inside the Human Genome: A Case for Non-Intelligent Design, in which he claimed that "noncoding repetitive sequences -- 'junk DNA' -- comprise the vast bulk (at least 50%, and probably much more) of the human genome."

Notwithstanding what Paul McBride says, ID proponents are well aware of this literature and do not, as he claims, conflate "junk DNA" and "non-coding DNA." Indeed, if he reads Chapter 2 of Jonathan Wells's The Myth of Junk DNA, he will find that Wells acknowledges that even the early pioneers of the "junk DNA" paradigm did not necessarily think that the entire non-coding genome was "junk."

In his list of predictions of the contents of Luskin's chapter on junk DNA, McBride erects a second strawman. He predicts that Luskin will "identify functions of non-coding DNA...as evidence that junk DNA doesn't exist." It seems very probable that at least some portion of our DNA is genuinely without function. But the frequent Darwinist claim that the majority of it is without function, and that this serves as evidence against design, is highly suspect. As Jonathan Wells has written,

I never claim that functions have been found for most non-protein-coding DNA, though as I stated above the list grows longer every week. It is the trend, more than the current total, that should worry any defender of junk DNA.

McBride points out that a significant portion of the mammalian genome "actually could not be evolutionarily conserved by natural selection," and he claims that this provides evidence that it is non-functional. Conservation usually implies functional constraint, but the converse does not necessarily follow: a sequence can be functional without being conserved. For example, in 2007 the ENCODE Project Consortium published a paper in Nature detailing the results of the pilot phase of the ENCODE project. The paper reported,

At the outset of the ENCODE Project, many believed that the broad collection of experimental data would nicely dovetail with the detailed evolutionary information derived from comparing multiple mammalian sequences to provide a neat "dictionary" of conserved genomic elements, each with a growing annotation about their biochemical function(s). In one sense, this was achieved; the majority of constrained bases in the ENCODE regions are now associated with at least some experimentally derived information about function. However, we have also encountered a remarkable excess of experimentally identified functional elements lacking evolutionary constraint, and these cannot be dismissed for technical reasons. This is perhaps the biggest surprise of the pilot phase of the ENCODE Project, and suggests that we take a more "neutral" view of many of the functions conferred by the genome.

Thus, it seems, while evolutionary constraint may be used to infer function, there is no reason to think that non-constraint implies non-function.

McBride then discusses Luskin's treatment of transposable elements. As he notes, transposable elements are repetitive DNA sequences that can move from one position to another within the genome, and they account for around 45% of the human genome. McBride also correctly observes that most transposons in the human genome are inactive, meaning that they have mutated so that they are no longer able to undergo transposition. Although McBride concedes that some of these transposons do serve genomic functions, he nonetheless claims that these make up only a very small fraction of the total, and that most are indeed "junk."

A relevant review paper was published last year, however, in the journal Briefings in Functional Genomics (Pandey and Mukerji, 2011). The paper was entitled "From 'JUNK' to Just Unexplored Noncoding Knowledge: the case of transcribed Alus." This review paper documents a plethora of functions for Alu elements (the most abundant type of TE in humans). These functions include "mediat[ing] non-homologous recombination leading to genome shuffling, affect[ing] nucleosome positioning/exclusion and chromatin remodeling, and alter[ing] methylation status/imprinting."

In addition, "Alu nucleosomes are proposed to serve as 'anchors' in organizing the chromatin in human cells." Alus may also "serve as TFBS [transcription factor binding sites] or enhancers/repressors and thus regulating gene expression," operating as repressors or enhancers depending on their position within the promoter region. They can also "affect transcriptional activity, produce alternative splice variants through its exonization and A-I edited transcripts." Finally, Alus within untranslated regions (UTRs) "can affect alternative transcript isoforms in a tissue-specific manner and provide binding site for miRNAs." The more we learn about transposable elements, the more we discover that they are not junk at all.

Next, McBride touches on the apparent pervasive transcription of the genome. Surprisingly, McBride cites the work of van Bakel et al., noting that "while certain techniques have detected high levels of transcription in humans, other techniques that are less error-prone have failed to do so." The problem is that the van Bakel paper rests on a fatal methodological flaw.

Van Bakel et al. use a program called "RepeatMasker," which screens out all the repetitive DNA. But given that about 50% of our genome consists of repetitive DNA, the conclusions drawn by the authors are less than convincing. In fact, the official description of RepeatMasker itself states that "On average, almost 50% of a human genomic DNA sequence currently will be masked by the program."

To make matters worse, the researchers proceed to base their results "primarily on analysis of PolyA+ enriched RNA." But we've known since 2005 that, in humans, PolyA- sequences are twice as abundant as PolyA+ transcripts. So the authors not only exclude half the genome from their research, but also completely ignore two thirds of the RNA in what remains.
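The arithmetic behind "two thirds" can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name and parameters are my own, and the final step, which multiplies the unmasked share of the genome by the PolyA+ share of the RNA, assumes that the two exclusions are independent and that a fraction of the genome can be compared directly with a fraction of the transcripts. Neither assumption comes from the papers discussed here.

```python
from fractions import Fraction


def surveyed_shares(unmasked_genome_fraction, polya_minus_to_plus_ratio):
    """Return (PolyA+ share of RNA, combined surveyed share).

    Hypothetical helper for illustration only. The combined figure
    ASSUMES the genome masking and the PolyA+ selection are independent,
    which the source does not claim.
    """
    # If PolyA- is r times as abundant as PolyA+, then PolyA+ is 1/(1+r).
    polya_plus_share = Fraction(1) / (1 + polya_minus_to_plus_ratio)
    combined = unmasked_genome_fraction * polya_plus_share
    return polya_plus_share, combined


# About half the genome survives masking; PolyA- is twice as abundant as PolyA+.
plus_share, combined = surveyed_shares(Fraction(1, 2), 2)
print(plus_share)      # 1/3 of the RNA is PolyA+, so 2/3 is set aside
print(1 - plus_share)  # 2/3
print(combined)        # 1/6 under the independence assumption
```

On those rough figures, only about one sixth of the potential signal would remain in view, although that last number is an illustration of the reasoning rather than a measured result.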

Kapranov et al. (2010) warn that "efforts to elucidate how non-coding RNAs (ncRNAs) regulate genome function will be compromised if that class of RNAs is dismissed as simply 'transcriptional noise.'" Furthermore, they note that some studies "focus only on polyA-selected RNA, a method that enriches for protein coding RNAs and at the same time discards the vast majority of RNA prior to analysis." In any case, it has been shown that even DNA that is not transcribed can be functional.

McBride also offers a rebuttal to Luskin's review of pseudogenes. He raises the now infamous human GULO (L-gulonolactone oxidase) pseudogene, whose functional counterpart in most mammals codes for one of the enzymes involved in the biosynthesis of vitamin C. As I have discussed previously, however, it is possible that this so-called pseudogene is involved in prenatal synthesis of vitamin C in humans.

McBride claims that,

Some pseudogenes get transcribed as RNA and sometimes act as regulators for genes. Such pseudogenes are not junk, and there are a couple known examples. We have about 20,000 pseudogenes, so again this is a numbers game. The majority are non-functional, and contribute to our total junk (although they total only about 1% of our genome).

Only a couple of known examples? Actually, the number is far larger than that. For a recent review, see the 2011 paper by Pink et al. ("Pseudogenes: Pseudo-functional or key regulators in health and disease?"). The abstract states,

Pseudogenes have long been labeled as "junk" DNA, failed copies of genes that arise during the evolution of genomes. However, recent results are challenging this moniker; indeed, some pseudogenes appear to harbor the potential to regulate their protein-coding cousins. Far from being silent relics, many pseudogenes are transcribed into RNA, some exhibiting a tissue-specific pattern of activation. Pseudogene transcripts can be processed into short interfering RNAs that regulate coding genes through the RNAi pathway. In another remarkable discovery, it has been shown that pseudogenes are capable of regulating tumor suppressors and oncogenes by acting as microRNA decoys. The finding that pseudogenes are often deregulated during cancer progression warrants further investigation into the true extent of pseudogene function. In this review, we describe the ways in which pseudogenes exert their effect on coding genes and explore the role of pseudogenes in the increasingly complex web of noncoding RNA that contributes to normal cellular regulation.

Finally, McBride touches on the argument for common descent from the evidence of the fusion origin of chromosome 2. He writes,

Luskin argues that this is not proof of common descent, and there may not have even been a fusion event. He argues that the telomeric DNA in human chromosome 2 is shorter than the telomeres found at the end of typical chromosomes. Again, this is designed to cast a small shadow of doubt on our common descent with other primates. No positive argument is offered for an alternative model.

In Science and Human Origins, Casey Luskin briefly mentions in passing that the evidence for a fusion origin of chromosome 2, while suggestive, is not conclusive. For example, both interstitial telomeric sequences (Farre et al., 2009) and secondary alpha satellite sequences (Baldini et al., 1993) have been identified in other instances where no fusion has occurred.

As Luskin has explained previously on ENV, however, even if chromosome 2 did indeed originate via a fusion event, common descent between humans and chimpanzees is not the only hypothesis that can account for this. On an evolutionary model, since no other primates share this fusion, the fusion event must have taken place some time after the supposed divergence of the human and chimpanzee lineages.

But suppose that our genus Homo, being separately designed with 48 chromosomes, underwent a fusion event within its own lineage, quite independently of other primates. This would result in exactly the same observations with respect to chromosome 2 and the current number of chromosomes (46). The evolutionist might protest that the banding patterns of human chromosome 2 match two of the chimpanzee autosomes -- but that only brings us back to the argument from similarity. And we already know that humans and chimpanzees have striking genetic similarity.

To conclude, while Paul McBride is to be commended for having actually read the book before criticizing its arguments (unlike certain other people we could mention), his critiques (rebutted here and in previous posts) should hardly be a cause for concern to the informed reader.