Every newborn human baby carries about 100 mutations not found in either parent. If most of our genome contained functional sequence information, this would impose an intolerable genetic load. Only a small percentage of our genome can contain important sequence information, which strongly suggests that most of our genome is junk.
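The arithmetic behind the load argument can be sketched in a few lines. The only number taken from the text is the ~100 new mutations per newborn; the candidate functional fractions below are illustrative assumptions of mine, not figures from the post.

```python
# Back-of-envelope sketch of the genetic load argument.
# Only "100 mutations per birth" comes from the text above; the
# functional fractions are illustrative assumptions.
mutations_per_birth = 100
functional_fractions = [0.01, 0.50, 0.90]

# If a fraction f of the genome is functional sequence, roughly f of
# the new mutations land in functional DNA each generation.
expected_hits = {f: mutations_per_birth * f for f in functional_fractions}

for f, hits in expected_hits.items():
    print(f"functional fraction {f:.2f}: ~{hits:.0f} functional mutations per newborn")
```

If most of the genome were functional, every child would carry dozens of new mutations in functional sequence; if only a few percent is functional, the number drops to a tolerable one or two.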

C-Value Paradox

A comparison of genomes from closely related species shows that genome size can vary by a factor of ten or more. The only reasonable explanation is that most of the DNA in the larger genomes is junk.

Modern Evolutionary Theory

Nothing in biology makes sense except in the light of population genetics. The modern understanding of evolution is perfectly consistent with the presence of large amounts of junk DNA in a genome.

Pseudogenes and broken genes are junk

More than half of our genome consists of pseudogenes, including broken transposons and bits and pieces of transposons. A few may have secondarily acquired a function but, to a first approximation, broken genes are junk.

Most of the genome is not conserved

Most of the DNA sequence in large genomes is not conserved. These sequences diverge at a rate consistent with the fixation of neutral alleles by random genetic drift. This strongly suggests that they have no function, although one can’t rule out some unknown function that doesn’t depend on sequence.

I’m also relieved to see that the anti-junk fanatics tend to gravitate towards Larry’s blog and leave the rest of us mostly alone.

Comments

I’d quibble that the only explanation for #2 is junk and more junk. In fact, I thought some plants have huge genomes because of massive replication: in effect, each chromosome got duplicated multiple times over their history. If so, the ratio of junk to functional DNA shouldn’t change much relative to the unduplicated version of the genome.

Don’t forget that vertebrates have had at least two whole genome duplications in their early history. The vast majority of the duplicated genes decayed into pseudogenes (and many must have been deleted entirely).

(This is me trying to understand; I’m not disagreeing with you experts.)

OK, that makes sense that selection doesn’t care much about redundant copies of genes and so the extra copies will degrade over time (although presumably such decay is spread out over the chromosome copies).

But how could such a thing become fixed in a species? There is an energy cost to building and maintaining those extra copies in every cell of the plant. Why aren’t the strains that developed 10x copies of their genome out-competed by cousins that didn’t duplicate theirs? [I understand that in small populations random stuff, even bad stuff, can just happen to become fixed, but I’m talking about the general case here.]

Duplication occasionally happens in animals, too. Xenopus laevis is considered allotetraploid: it underwent a duplication event some time after diverging from Xenopus tropicalis. The result is that when developmental biologists working in X. laevis try to do molecular biology in their system, they inevitably deal with pseudogenes. It made cloning X. laevis versions of developmentally significant genes a pain … and drives Xenopus grad students into mouse and zebrafish labs.

*cough, furtive looks*

Although a few guys at Berkeley (Lyle Zimmerman, Richard Harland; I think Tyrone Hayes supplied them with some tropicalis, too) tried to start using tropicalis so they could use a “real” diploid organism. Don’t know how that ended up.

But how could such a thing become fixed in a species? There is an energy cost to building and maintaining those extra copies in every cell of the plant. Why aren’t the strains which developed 10x copies of its genome out-competed by its cousins which didn’t duplicate its genome?

Because for most organisms the added cost of replicating a bigger genome is small enough that selection doesn’t notice it. The more often replication occurs, the more this cost shows up; and the bigger the population, the smaller the cost that selection can notice. Most bacteria have negligible amounts of junk DNA for just those reasons. But most eukaryotes replicate much less frequently and have much smaller populations than bacteria, so the cost isn’t big enough to be selected against. Several orders of magnitude of variation in genome size seem to be nearly neutral in evolutionary terms.
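That "too small for selection to notice" claim can be made quantitative with Kimura's diffusion approximation for the fixation probability of a new mutation. The population sizes and selection coefficient below are illustrative assumptions, not measurements:

```python
import math

def fixation_prob(s, N):
    """Kimura's fixation probability for a single new mutation (initial
    frequency 1/(2N)) with selection coefficient s in a diploid
    population of effective size N. s = 0 gives the neutral 1/(2N)."""
    if s == 0:
        return 1.0 / (2 * N)
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * N * s))

# A tiny replication cost (s = -1e-6, an assumed value) in a modest,
# eukaryote-sized population fixes almost as often as a neutral allele...
neutral = fixation_prob(0, 10_000)
weak = fixation_prob(-1e-6, 10_000)

# ...but the very same cost in a huge, bacteria-sized population is purged.
purged = fixation_prob(-1e-6, 10_000_000)

print(f"neutral: {neutral:.2e}  small N: {weak:.2e}  huge N: {purged:.2e}")
```

When |s| is much smaller than 1/(2N) the allele behaves as if neutral, which is why extra DNA can drift to fixation in small eukaryotic populations but not in enormous bacterial ones.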

@Amphiox
Somebody already mentioned the two rounds of tetraploidy in vertebrate history. Someone should also mention the third round in teleosts. The vast majority of duplicate genes from all three events have disappeared, and we recognize the events from the few remnants that proved useful, such as Hox clusters.

The reason for this is that the energy expenditure required to replicate and maintain DNA is trivial compared to the total energy budget of a typical eukaryotic cell, thanks to the possession of mitochondria. Whatever minimal cost there is to replicating the extra DNA is either too small to be visible to selection, or easily compensated by some minor benefit accrued from one of the duplicated genes.

In prokaryotes, however, we do in fact see that extra DNA has a significant energetic cost and is actively selected against. Prokaryotes lose unnecessary genes quickly (i.e., deletion mutations that remove unnecessary DNA are often strongly beneficial). This has been hypothesized as one reason why prokaryotes never evolved the kind of complexity that eukaryotes did. Prokaryotes also have very little junk DNA (and often none at all).

Bingo! That was my thought too, and I’ve asked PZ and others about it. I’m not an expert (at all!), and I haven’t yet found anyone who has actually studied the energy costs in a precise way, but I think the cost of duplicating junk DNA is not that great. Also, cells have mechanisms trying to repair damaged DNA, so that would be working at cross purposes to any simplifying process.

So all in all, it seems that just like in my house, junk tends to accumulate much faster than it gets cleared away.

Right, no function in regards to DNA sequence at first approximation (though many scientists make careers on digging deeper than first approximation). Creationist arguments aside, I think the role of junk DNA in genome evolution is an interesting controversy.

Creationist arguments aside, I think the role of junk DNA in genome evolution is an interesting controversy.

Yes, in the same way that evolution vs. intelligent design is an interesting controversy. Did you in fact read Larry’s list? Most of your genome is junk. It has no role in evolution, if by that vague phrase you mean it is somehow adaptive. If on the other hand you mean that it has something to do with evolution, that’s certainly true; genome size evolves.

I think the role of junk DNA in genome evolution is an interesting controversy.

In the sense that junk DNA contains a record of “fossil” genes, whose sequences, having no functional significance, change solely by mutation and drift at quantifiable rates, without any influence from selection complicating the picture, it can provide much insight into the course of evolution in different lineages.
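A sketch of that "fossil gene" clock idea in a few lines. Both numbers here (the per-site mutation rate and the generation time) are illustrative assumptions of mine, not figures from the thread:

```python
# Neutral sequence in each of two lineages accumulates substitutions at
# roughly the mutation rate, so pairwise divergence accrues at ~2*MU per
# generation. MU and GEN_YEARS are assumed, illustrative values.
MU = 1.25e-8    # assumed mutations per site per generation
GEN_YEARS = 25  # assumed generation time in years

def divergence_time_years(fraction_diverged):
    """Crude molecular-clock estimate from neutral sequence divergence."""
    generations = fraction_diverged / (2 * MU)
    return generations * GEN_YEARS

# e.g. 1.2% divergence in junk/pseudogene DNA between two lineages:
t = divergence_time_years(0.012)
print(f"~{t / 1e6:.0f} million years since the lineages split")
```

Because selection doesn't touch these sequences, the clock runs at a rate set by mutation and drift alone, which is exactly what makes junk DNA useful for this kind of inference.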

Another way to think about it is that junk DNA is a consequence of genome evolution, rather than it playing any role in it. As genomes expand and gain complexity, the accumulation of junk DNA is the stochastic byproduct of undirected evolutionary processes. For every duplicated gene that mutates in a manner that accrues novel function, many more duplicate only to degenerate into pseudogenes, or be deleted altogether. Thus does junk accumulate, at a rate faster than new functions.

In an intelligently designed system of course, this should not happen.

If you want me to be more specific, just ask. I’m interested in chromosome evolution as well as the evolution of new genes, particularly in this case as a result of genomic duplication and adaptive mutation…as well as the possibility of non-sequence level adaptation such as GC and isochore conservation across taxa. Go google segmental duplications or something.

I’m a programmer, so every time I read about junk DNA, my first thought is, “aaah, we have to remove the nonfunctional code!”

Question is: what effect would that have? Would it be safe to just remove junk DNA, or could there be side effects, like, some processes suddenly running too fast, causing different behaviour?
Also, would it have beneficial effects except the aforementioned saving of a little amount of energy?

No! No! all wrong! ‘Junk’ DNA is the entire coded Encyclopaedia Galactica inserted in every one of our cells by helpful aliens!
And all we have to do is to decode it.
Indeed we might just find the Answer, you know the answer to ‘Life, the Universe and Everything’.
Although it wouldn’t surprise me if by the time we do get around to reading it, those buggers in the GM industry will have mucked around with DNA so much that all we’ll be able to read will be:

As a ‘programmer’ it should be obvious that “junk DNA” is just biology’s way of doing ECC. That is, if you were going to transmit your ‘code’ over a noisy channel, throw in a lot of random ‘junk’ so the noise is less likely to hit the ‘valid’ parts of your code. The statistics are a little out of the ordinary in that the error rate is not on a bit-by-bit scale but on the whole sequence itself. So if there is going to be one error in the sequence and the sequence has 100 bits of signal and 1000 bits of “junk”, the error will most likely end up in the “junk”; but if your sequence is 100 bits of signal and 0 bits of “junk”, then the error will definitely be in the signal.

vertebrates have had at least two whole genome duplications in their early history

Yep, which is why we have four sets of Hox genes (none of them complete!) instead of one.

And the teleosts, as mentioned in comment 12, had another round, giving them about seven sets of Hox genes.

Xenopus tropicalis

More commonly Silurana tropicalis these days. Xenopus and Silurana do appear to be sister-groups, though.

No! No! all wrong! ‘Junk’ DNA is the entire coded Encyclopaedia Galactica inserted in every one of our cells by helpful aliens!

Sadly, this, too, fails the onion test.

So if there is going to be one error in the sequence

See comment 27: mutations don’t happen at a rate per sequence, they happen at a rate per number of nucleotides. This is a basic empirical fact that has been known for decades.

Also, onion test.
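A toy simulation of that point, with purely illustrative numbers: because each nucleotide mutates independently, padding a genome with junk increases the total number of mutations but leaves the expected number hitting functional sequence unchanged, so junk provides no shielding.

```python
import math
import random

random.seed(42)

MU = 1e-4           # per-nucleotide mutation rate (illustrative, not real)
FUNCTIONAL = 1_000  # functional bases in the toy genome

def poisson(lam):
    """Poisson draw via Knuth's algorithm; fine for the small means here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def mean_functional_hits(junk_bases, trials=100_000):
    """Average mutations per genome landing in the functional region."""
    genome = FUNCTIONAL + junk_bases
    total = 0
    for _ in range(trials):
        n_mutations = poisson(MU * genome)  # grows with genome size...
        for _ in range(n_mutations):
            # ...but each mutation hits functional sequence with
            # probability FUNCTIONAL/genome, and the two effects cancel.
            if random.randrange(genome) < FUNCTIONAL:
                total += 1
    return total / trials

no_junk = mean_functional_hits(0)
much_junk = mean_functional_hits(10_000)
print(f"no junk:  {no_junk:.3f} functional hits per genome")
print(f"10x junk: {much_junk:.3f} functional hits per genome")
```

Both averages hover around MU * FUNCTIONAL = 0.1, junk or no junk: mutation is per nucleotide, not per genome, so the ECC analogy fails.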

“…and the universe was created by Aaaarghhhh…”

This, on the other hand, is full of win!

Aren’t there relationships between genome size and nucleus size, and between nucleus size and cell size?

Yes. The epidermal hairs of Arabidopsis consist of a single 32-ploid cell each, for example.

Because smaller cells have more surface per volume, they allow faster metabolism. There is some degree of inverse correlation between genome/cell size and metabolic rates in sarcopterygians at least: the smallest genomes are found in saurischian dinosaurs (birds included) and bats, the most gihugrongous ones in salamanders and lungfish.

@Stevem #26. But the error rate (by which I suppose you mean mutation rate, to drop the ugly computer metaphor) isn’t per whole sequence; it’s on that bit-by-bit scale you reject.

Ah, my EE background showing, not a biologist. Thought the gene-by gene mutation rate was a possibility, so rejected it first to make the rest of the “metaphor” work. Hoist by my own petard, my error wasn’t a single part of the analogy but the whole thing itself. Let this (#26) stand as an example of the fallacy of treating DNA as simply a computer’s sourcecode.

It isn’t clear to me what any of those has to do with junk DNA specifically rather than with whole-genome effects.

As comparative genomics becomes more feasible through NGS, I don’t think function will suddenly spread across the genome, but our definition of function or junk will be somewhat revised as karyotype evolution is better understood. For example, repeat-rich sites as targets for segmental duplication, and therefore for large-scale DNA variation and evolution.

Doolittle’s ENCODE opinion piece in PNAS

“the publicity surrounding ENCODE reveals the extent to which these understandings [of biological function] have been eroded. However, theoretical expansion in other directions, reconceptualizing junk, might be advisable.”

The folks who go hog-wild into creationism are not the kind of folks who trust facts, understand statistics and probability, follow logical trains of thought, analysis, and comparison, or who will even sit down and read a bunch of logical bullet points.

Lists of facts are mostly only helpful when preaching to the choir; for the other folks, it just sails over their heads.