Intron sequences account for about 30% of the genome. Most of these sequences qualify as junk but they are littered with defective transposable elements that are already included in the calculation of junk DNA.

This conforms to my general impression but I would like to see reference(s) for the specific numbers. E.g., functional vs defective RNA viruses.

Good question! I haven't yet posted a specific description of the Human Endogenous Retroviruses (HERVs). Check out Kurth and Bannert (2010) [Int J Cancer 126:306-314] for a review of the subject.

The only functional retroviruses in the human genome are the ones belonging to the HERV-K class. All the others have multiple mutations that make them defective. There appear to be about 100 recent sites of HERV-K insertion but most of the retroviruses at these sites have mutations. Site HERV-K113 is almost certainly functional but it seems like there may only be a few more in the human genome.

That's the basis of my estimate that <1% of the endogenous retroviruses are active.

No, not really. I asked for my own education only. I know very few hard facts about viruses in genomes.

With the multitude of human viruses, I would have thought that over time most of them managed integration into a germline. At least weaker versions of them that don't mess up cells too much should then just stay there. So I expected more than just retroviruses and more than ~10% of active viruses.

I have one question - how do you get to 8.5% being essential - is this different from being functional?

To replicate this 8.5% figure I've had to exclude all amounts which are quantified as being < 1%.

If I assume that it is possible to say <1% means at most 1% (or whatever) for these figures (seems reasonable!) - then it would seem to put functional/essential elements discovered so far to be no more than about 16% - ie nearly double the figure you've given.

With your total adding up to 98% I assume you are hedging your bets, which is totally understandable given the state of the science, but the lacunas this creates are a bit of a confusion.

I'm going to quibble with your assessment of the rRNA genes and the false dichotomy between 'essential' and 'junk' in this specific case. While only 1/3 may be actively used to generate rRNA in a mitotic cell, all of the rRNA genes are 'functional' in the sense that in the germline having abundant numbers of nearly sequence-identical genes in high local concentration keeps the non-allelic meiotic recombination rate in the rDNA high, thereby preserving the homogeneity of the rDNA sequences. Trimming off the 'extra' copies of rDNA genes will reduce the rate of gene conversion in the rDNA compromising the evolutionary capacity of the rDNA to maintain functional primary sequence. The relative importance of this phenomenon for a functional human genome is unclear, but since S. cerevisiae uses a similar rDNA genomic architecture strategy, it is reasonable to hypothesize that the 'extra' copies are in fact relevant to the success of the species population, keeping in mind that evolution is a population-based phenomenon rather than a single organism phenomenon.

If I assume that it is possible to say <1% means at most 1% (or whatever) for these figures (seems reasonable!) - then it would seem to put functional/essential elements discovered so far to be no more than about 16% - ie nearly double the figure you've given.

Thanks for pointing this out. When I said <1% I meant WAY less than 1% but I can see how that might have been confusing. I've fixed all those numbers to read "<0.1%"—hope this helps.

I've also adjusted the "unknown" value to 26.5% so everything adds up to 100%.
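A rough sketch of why the reading of "<1%" matters for the total. It treats every "less than" entry as sitting exactly at its ceiling and adds those ceilings to the 8.5% quoted above. The count of seven such entries is a guess, chosen only so that the total lands near the commenter's "about 16%"; the real count comes from the list in the original post, which is not reproduced in this thread.

```python
# Sketch: how the reading of "<1%" changes the upper bound on the functional total.
QUANTIFIED_PCT = 8.5   # figure quoted in the thread
N_SMALL_ENTRIES = 7    # HYPOTHETICAL count of "<1%" entries, not taken from the post


def upper_bound_pct(quantified_pct, n_small_entries, ceiling_pct):
    """Upper bound if every 'less than' entry sits exactly at its ceiling."""
    return quantified_pct + n_small_entries * ceiling_pct


if __name__ == "__main__":
    # Compare the old "<1%" reading with the revised "<0.1%" reading.
    for ceiling in (1.0, 0.1):
        bound = upper_bound_pct(QUANTIFIED_PCT, N_SMALL_ENTRIES, ceiling)
        print(f"ceiling {ceiling}%: at most {bound:.1f}% functional")
```

Under this assumed count, the "<1%" reading allows a total near 16%, while the "<0.1%" reading keeps the total within about one percentage point of 8.5%.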

The 5S RNA genes are arranged as a single tandem cluster on chromosome 1 (1q42). The repeats are 2.2 kb but the gene itself is only 200 bp. The number of repeats varies between individuals in the range of 35-175 copies. I estimate that the average size of the cluster is 220 kb and more than half of the repeat could be deleted without any effect. (The length of the repeat varies from species to species.)
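The 5S figures above can be checked with a few lines of arithmetic. This is only a sketch: the repeat length, gene length, copy range and average cluster size are the numbers given in the comment, while the haploid genome size of 3.2 Gb is an outside assumption used to express the cluster as a share of the genome.

```python
# Sketch: arithmetic behind the 5S rRNA cluster estimate.
REPEAT_BP = 2_200           # 2.2 kb repeat unit (from the comment)
GENE_BP = 200               # 5S gene length (from the comment)
COPY_RANGE = (35, 175)      # copies per individual (from the comment)
AVG_CLUSTER_BP = 220_000    # estimated average cluster size (from the comment)
GENOME_BP = 3_200_000_000   # ASSUMPTION: approximate haploid genome size


def cluster_bp(copies):
    """Total length of a tandem cluster with the given number of repeats."""
    return copies * REPEAT_BP


def gene_fraction_of_repeat():
    """Share of each repeat unit occupied by the gene itself."""
    return GENE_BP / REPEAT_BP


def implied_average_copies():
    """Copy number implied by the estimated average cluster size."""
    return AVG_CLUSTER_BP / REPEAT_BP


def genome_percent(bp):
    """Length expressed as a percentage of the assumed genome size."""
    return 100 * bp / GENOME_BP


if __name__ == "__main__":
    low, high = (cluster_bp(c) for c in COPY_RANGE)
    print(f"cluster size range: {low} to {high} bp")
    print(f"implied average copies: {implied_average_copies():.0f}")
    print(f"gene share of repeat: {gene_fraction_of_repeat():.1%}")
    print(f"average cluster as share of genome: {genome_percent(AVG_CLUSTER_BP):.4f}%")
```

The 220 kb average corresponds to about 100 repeats, and the gene takes up roughly one eleventh of each repeat, which is consistent with the claim that more than half of the repeat is spacer.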

The 18S/5.8S/28S ribosomal RNA genes are found in five different clusters as 43 kb repeats. Most of this repeat is non-transcribed spacer and I estimate that a lot of it is non-essential DNA that could easily be deleted without any effect on the individual or the species. (The length of non-transcribed spacer varies from species to species.)

I agree with you that even the ribosomal pseudogenes may be required in the long run. That's why I ended the posting with ...

The minimum number of 45S genes in mammals is not known for certain in humans but in chickens the loss of anything more than half the average number is lethal. It seems reasonable that of the 300 or so human 45S genes only about 150 are absolutely required and the remainder are dispensable. However, concerted evolution of these genes is essential in the long run and the mechanism of concerted evolution and gene conversion requires lots of copies. Thus, all copies are necessary for the species even though only half may be required for any one individual.
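To put the 45S numbers in genome-wide terms, here is a small sketch. It assumes one 45S gene per 43 kb repeat and a haploid genome of about 3.2 Gb; neither assumption is stated in the thread, so the percentages are only indicative.

```python
# Sketch: size of the 45S rDNA arrays relative to the genome.
REPEAT_KB = 43            # repeat length (from the thread)
TOTAL_GENES = 300         # approximate number of human 45S genes (from the thread)
REQUIRED_GENES = 150      # copies assumed necessary for one individual (from the thread)
GENOME_KB = 3_200_000     # ASSUMPTION: approximate haploid genome size in kb


def array_kb(genes):
    """Total rDNA length, ASSUMING one 45S gene per 43 kb repeat."""
    return genes * REPEAT_KB


def genome_percent(kb):
    """Length expressed as a percentage of the assumed genome size."""
    return 100 * kb / GENOME_KB


if __name__ == "__main__":
    total = array_kb(TOTAL_GENES)
    required = array_kb(REQUIRED_GENES)
    print(f"all copies: {total} kb ({genome_percent(total):.2f}% of genome)")
    print(f"required copies: {required} kb ({genome_percent(required):.2f}% of genome)")
    print(f"difference: {total - required} kb ({genome_percent(total - required):.2f}% of genome)")
```

On these assumptions the whole set of repeats is well under 1% of the genome, so the disagreement about how many copies count as essential moves the overall junk estimate by only a fraction of a percent.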

Thanks for alerting me to a problem with the calculations. I've fixed the numbers.

The minimum number of 45S genes in mammals is not known for certain in humans but in chickens the loss of anything more than half the average number is lethal.

The chicken data on rDNA copy number notwithstanding, the effect of rDNA insufficiency in humans is unknown. Clearly a severe insufficiency will be lethal, but there is unlikely to be a sharp demarcation between 'enough = totally fine' and 'not enough = dead'. For this reason, the division of rDNA gene content into 'essential' and 'junk' is an oversimplification. I hypothesized that non-lethal human rDNA insufficiency would manifest in humans as Diamond-Blackfan anemia of variable clinical severity determined by the level of rDNA insufficiency, drawing parallels to the way in which DBA is known to be caused by ribosomal protein insufficiency, but unfortunately, the NIH declined to fund these studies.

In terms of the effects of the length of the non-transcribed spacer on the efficiency of rDNA gene conversion and architectural rearrangement, much will depend upon the efficiency of meiotic recombination with respect to the interplay of multifactorial considerations including unit repeat length, repeat copy number, total number of gene clusters, and sub-nuclear multi-cluster organization (such as nucleolar co-localization). This area of meiotic biology is very poorly understood.

I like your revised estimate that over half of the rDNA is essential, but the rest is going to be progressively less essential. The point at which 'less essential' slides over into 'junk' is a judgement call.

The chicken data on rDNA copy number notwithstanding, the effect of rDNA insufficiency in humans is unknown.

Right. That's why I said that every single gene is essential if we assume an average of 300 45S genes per genome. There's no junk in those genes even though some of them are pseudogenes.

I like your revised estimate that over half of the rDNA is essential, but the rest is going to be progressively less essential. The point at which 'less essential' slides over into 'junk' is a judgement call.

Of course it's a judgment call. I'm assuming that a large part of the spacer DNA is junk because there are plenty of species that have much shorter spacers.

If you have a better estimate that you prefer then please let me know. Remember we're quibbling over something like 0.1% of the genome. Do you think that's an important point? Why?

Remember we're quibbling over something like 0.1% of the genome. Do you think that's an important point? Why?

No, the important point is that for some aspects of the genome, like the rDNA, the strict binary classification of "essential" vs "junk" is inaccurate because it excludes the potential for a lot of middle ground between "can't live without it" and "of no use whatsoever". There needs to be a third category along the lines of "confers a fitness advantage under some circumstances", or whatever you'd prefer to call it.

The relevance of the lengths of the spacers in other species is hard to assess due to species-specific differences in recombination efficiency determinants, many of which are understood poorly or not at all, particularly in the meiotic context.

No, the important point is that for some aspects of the genome, like the rDNA, the strict binary classification of "essential" vs "junk" is inaccurate because it excludes the potential for a lot of middle ground between "can't live without it" and "of no use whatsoever". There needs to be a third category along the lines of "confers a fitness advantage under some circumstances", or whatever you'd prefer to call it.

That category is already covered under "essential" and "functional." I would never classify such a sequence as "junk."

There is a wealth of evidence that repetitive DNA (including retrotransposons) does play an important part in stabilizing the DNA molecule in eukaryotes (see research on the GC content). Also, it need not have an "active" function. I suspect introns, for example, mostly serve as spacer sequences to facilitate alternative splicing etc.

All the same, LTR elements do serve to modulate gene expression, even at a distance. The mechanism is still poorly understood. They are hardly defective. Why has natural selection conserved all this "junk DNA"? Why are 70-90% of plant genomes made of the stuff? To accumulate all this "garbage" would impose too high a metabolic cost on the organism.

I think you are arguing that the core information content of the genome is about 9% - agreed. But that doesn't mean the rest is useless.

I strongly believe that intergenic ncDNA helps maintain structural stability and also shields coding DNA from harmful mutations and viruses.

Belief (per se) has no currency in science. Do you have any substantial evidence for this claim?

If you're suggesting that the junk DNA hypothesis itself is an argument from ignorance, then the ignorance is yours. There's lots of supporting data; just read Dr. Moran's posts, fercrissake! If you want to dispute that evidence or offer contrary evidence, great! Please do. Otherwise, there's no reason to take you seriously.

I'm not saying that all intron sequences are junk. I'm saying that based on the variation we see within a population and between closely related species, the majority of intron sequences are dispensable junk.

I doubt that large intron sequences are needed to facilitate alternative splicing since only a small percentage of human genes exhibit biologically functional alternative splicing.

All the same, LTR elements do serve to modulate gene expression, even at a distance.

Please supply references to the scientific literature showing that a significant percentage of LTRs play a biologically relevant role in modulating gene expression. I'm not denying that there are half a dozen examples but that amount is insignificant in the grand scheme of things.

Why has natural selection conserved all this "junk DNA"?

It hasn't. Get your facts straight.

To accumulate all this "garbage" would impose too high a metabolic cost on the organism.

Have you ever heard of the C-Value Paradox or The Onion Test? The scientific evidence shows pretty conclusively that the presumed cost is insignificant for most species.

I strongly believe that intergenic ncDNA helps maintain structural stability and also shields coding DNA from harmful mutations and viruses.

Perhaps you'd like to explain how junk DNA could shield coding DNA from mutations and viruses? Here's something you might like to read before answering: Does Excess Genomic DNA Protect Against Mutation? I suspect you haven't thought about this very deeply.

Perhaps you'd like to explain how junk DNA could help maintain structural stability? Do species with much less junk DNA have unstable genomes? Some species of frog have 100X more DNA than closely related species. Is that because their genome is extremely stable?

By analyzing the genome-wide data of mRNA stability published by someone previously, we found that human intron-containing genes have more stable mRNAs than intronless genes, and the Arabidopsis thaliana genes with the most unstable mRNAs have fewer introns than other genes in the genome.

They also affect gene expression in subtle ways. You have blue eyes in part because of a mutation in an intron causing a reduction in melanin concentration in your iris:

The point about LTRs is that they can modulate gene expression - even you accept this. If many are inactive/defective right now, they could be active later on - and selection will favor those individuals who have the most preserved sequences.

The problem with you is that you are a deep thinker who is not thinking across deep time. You are just looking at the genome and seeing no current potentiality in vast swathes of it when there exists the possibility of future uses and effects.

There is no doubt that ncDNA can shield exonic regions from any harmful retroviral insertions or from illegitimate recombination. I was not referring to point mutations.

As for frogs, I am not sure - 100x sounds excessive. Salamanders also have enormous genomes. If I were a researcher I would investigate possible reasons for a relationship rather than just dissing it.

The problem with you is that you are a deep thinker who is not thinking across deep time. You are just looking at the genome and seeing no current potentiality in vast swathes of it when there exists the possibility of future uses and effects.

As a deep thinker I know a thing or two about evolution. The fact that sometime in the next million years there might be a small piece of junk DNA that evolves a new function is no reason to dismiss the evidence that most of our genome is junk. And in case you think otherwise, let me assure you that there cannot be SELECTION for potential future uses. Evolution doesn't work that way.

There is no doubt that ncDNA can shield exonic regions from any harmful retroviral insertions or from illegitimate recombination. I was not referring to point mutations.

How, exactly, does that work? The more junk DNA you have the bigger the target for retrovirus insertions that will cause no harm. Hence, those species with large genomes (i.e. us) carry a lot of retroviruses.

In species with small genomes the average insertion will be lethal and quickly weeded out of the population.

Do you think there's an adaptive reason to provide a bigger target for retroviral insertions?
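The "bigger target" argument can be sketched numerically. The model below assumes insertions land uniformly at random across the genome and that every insertion into functional DNA is harmful; both are simplifications introduced here for illustration. The 8.5% functional fraction is the figure discussed earlier in the thread, and the 90% figure for a compact genome is purely hypothetical.

```python
# Sketch: fraction of random insertions that land in junk versus functional DNA.
# ASSUMPTIONS: insertion sites are uniformly random, and any hit in functional
# DNA counts as harmful. Neither is claimed by the thread.


def harmless_fraction(functional_fraction):
    """Probability that a random insertion misses functional DNA."""
    if not 0.0 <= functional_fraction <= 1.0:
        raise ValueError("functional_fraction must be between 0 and 1")
    return 1.0 - functional_fraction


def expected_harmful(n_insertions, functional_fraction):
    """Expected number of insertions that hit functional DNA."""
    return n_insertions * functional_fraction


if __name__ == "__main__":
    genomes = {
        "large genome, 8.5% functional (figure from the thread)": 0.085,
        "compact genome, 90% functional (HYPOTHETICAL)": 0.90,
    }
    for label, f in genomes.items():
        print(f"{label}: {harmless_fraction(f):.1%} of insertions harmless, "
              f"{expected_harmful(1000, f):.0f} of 1000 harmful")
```

In this toy model most insertions into the large genome are tolerated and can persist, while most insertions into the compact genome are harmful and would be removed by selection, which matches the pattern described above without requiring any adaptive role for the extra DNA.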

As for frogs, I am not sure - 100x sounds excessive. Salamanders also have enormous genomes. If I were a researcher I would investigate possible reasons for a relationship rather than just dissing it.

Investigations have been underway for at least forty years. Don't you think it's time we started to entertain the idea that there may not be a reason why closely related species have vastly different amounts of DNA?

"And in case you think otherwise, let me assure you that there cannot be SELECTION for potential future uses."

Just to play Devil's Advocate, you can completely inactivate telomerase in mice with no appreciable phenotype until the mice have been bred for 6 or 7 generations. Why doesn't this count as selection for potential future use?

Both introns and UTRs play an important part in molecular stability. That is increasingly obvious. Their utility is subtle.

Let me assure you that there cannot be SELECTION for potential future uses. Evolution doesn't work that way.

No, you don't seem to understand how selection can have a long reach into the future - I am very disappointed. Here is an example:

If element A is useless now but turns out to be useful later on, then those individuals with the most preserved (functioning) element As in their genomes will feel its beneficial effects compared to those with defective and degenerate ones. As such, they will become more prevalent because of differential reproduction. OK?

In species with small genomes the average insertion will be lethal and quickly weeded out of the population.

That is precisely the point. Both ncDNA, and also duplicate protein-coding genes, serve as buffers against harmful effects in the *individual* organism. In fact, I can see how both the accumulation of duplicate genes (80% of eukaryotic genes are paralogs of others) and ncDNA might set off evolutionary arms races. It is a case of needing to have something only because someone else does - the actual net benefit need not exist.

Now, the reason why I mentioned "illegitimate recombination" is because uni-chromosomal prokaryotes don't recombine their DNA as eukaryotes do. Since illegal recombinatory events can cause frameshifts in coding DNA, ncDNA (especially intergenic sequences) serves as a protective shield against them.

Don't you think it's time we started to entertain the idea that there may not be a reason why closely related species have vastly different amounts of DNA?

I would be more interested in why amphibians have so much ncDNA compared to others. I know you think of this as coincidence and accident but I think that is a lazy answer. Also, 40 years ago we didn't have the genomic analysis tools we have now. It is a brave new world you want to destroy, Larry.

In fact, I can see how both the accumulation of duplicate genes (80% of eukaryotic genes are paralogs of others) and ncDNA might set off evolutionary arms races. It is a case of needing to have something only because someone else does - the actual net benefit need not exist.

What unmitigated junk, pun fully intended. I must have something with no benefit (or deleterious) because someone else does, and that leads to an "arms race"? So if my neighbor is in hock up to his eyeballs, I better accumulate massive debt as well or be left behind?

@Atheistoclast If element A is useless now but turns out to be useful later on, then those individuals with the most preserved (functioning) element As in their genomes will feel its beneficial effects compared to those with defective and degenerate ones. As such, they will become more prevalent because of differential reproduction. OK?

Not OK.

If element A is useless now, how can there be any selection for it, other than a negative selection (probably very small if any) making carriers less prevalent due to the cost of carrying a useless function?

I'm not even arguing from a genetic basis (good thing as I have no background) but from a logical basis and your statement seems to be logically incoherent.

If element A is "useless" (your word) then by definition it can not have any beneficial effects.
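The logical point can be illustrated with the standard deterministic recursion for haploid selection, in which carriers of element A have relative fitness 1 + s and non-carriers have fitness 1. This is a minimal sketch that ignores drift, mutation and recombination; the starting frequency and the values of s are arbitrary illustrative choices, not data.

```python
# Sketch: deterministic haploid selection on a single element.
# Carriers of element A have relative fitness 1 + s; non-carriers have fitness 1.
# Drift, mutation and recombination are deliberately ignored.


def next_frequency(p, s):
    """Frequency of carriers after one generation of selection."""
    return p * (1 + s) / (p * (1 + s) + (1 - p))


def frequency_after(p0, s, generations):
    """Iterate the recursion for the given number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s)
    return p


if __name__ == "__main__":
    p0 = 0.5  # illustrative starting frequency
    scenarios = [
        ("useless now (s = 0)", 0.0),
        ("slightly costly (s = -0.001)", -0.001),
        ("useful now (s = 0.01)", 0.01),
    ]
    for label, s in scenarios:
        print(f"{label}: frequency after 1000 generations = "
              f"{frequency_after(p0, s, 1000):.3f}")
```

With s = 0 the frequency does not move, and with a small cost it declines. Only a benefit that exists in the present generation (s > 0) raises the frequency; a benefit that might appear in the future does not enter the recursion at all.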

If element A is useless now, how can there be any selection for it, other than a negative selection (probably very small if any) making carriers less prevalent due to the cost of carrying a useless function?

You don't understand. Element A may have no use at present but it could do in the future - that is the nature of variation. Therefore, its preservation is beneficial in the evolutionary long-run.

If element A is "useless" (your word) then by definition it can not have any beneficial effects.

At the moment. But in the future it might. Therefore, my argument is that in the future those individuals with intact element As will be reproductively better off compared to those with degenerate element As.

For example, if some LTR retrotransposon is doing nothing at the moment in some obscure part of the genome but relocates to become a new cis-regulatory element in some gene 100 years from now, then selection will favor those individuals who possess the functional element compared to those who have lost it or have let it become defective.

Larry, unfortunately, can't even begin to understand this. The reason is that he is not a population geneticist - he is a biochemist. That's all he is, that's all he'll ever be.

What unmitigated junk, pun fully intended. I must have something with no benefit (or deleterious) because someone else does, and that leads to an "arms race"? So if my neighbor is in hock up to his eyeballs, I better accumulate massive debt as well or be left behind?

OK...perhaps an analogy would be useful. Say I invest in a backup to my hard drive and you do not. We are both given some deadlined assignment to do on our computers that we both save every night to our respective HD.

Now, one night 2 weeks down the line we both experience some physical failure and we lose the data on our hard drives. However, my backup has been storing all of my data all along.

As a result, I am at an advantage over you since you have to start all over again whereas I don't.

This is one reason why duplicate genes become accumulated in the genome. They may not offer anything beneficial per se other than it always is good to have a backup - those who don't are at a disadvantage compared to those who do.

Proud to read that centromeric DNA is considered functional, even though it is not.

Various articles have shown that the kinetochore can form on unique (non-repetitive) DNA. One chromosome in orangutan lacks alpha-satellite DNA, as does one chromosome in horses, whereas most chromosomes in chicken lack repetitive DNA at the centromeric locus altogether. In addition, neocentromere formation (centromeric relocalization on a chromosome) happens relatively frequently in humans (in terms of evolutionary time), although it is commonly associated with clinical manifestations.

At the same time, tandem repeat arrays have been observed on holocentric chromosomes (lacking a primary constriction or regional/localized centromere), which (currently) have not been shown to associate with kinetochore/centromere function.

For these reasons, I would argue that your estimate of functional DNA in our genome is too high by 2%.

Dr. Moran, I was looking at your list in your post. Where do chaperone proteins fit into the picture? Are they produced by coding DNA or non-coding DNA? Which of your categories does the DNA that produces chaperone proteins fit into?

"Intron sequences account for about 30% of the genome. Most of these sequences qualify as junk but they are littered with defective transposable elements that are already included in the calculation of junk DNA."

CORRECTION: Well it looks like Moran is not interested in discussing this further. But it is worth looking at. Moran covers the topic of introns in this thread: http://sandwalk.blogspot.com/2011/05/junk-jonathan-part-7chapter-4.html From there we get the distinct impression that Moran considers introns in protein-encoding genes to be "junk". And he appears to think that this "junk" constitutes 9.6% of the genome.

Why consider introns to be junk? Do they perform no useful function? It seems that they perform the function in the gene of separating the exons. That seems useful. The exons can be distinguished from each other by these separators. It seems Moran considers them junk because they are spliced out in the process of transcription to RNA. I wonder if that is why Moran considers them "junk". It would be great if he could explain his thinking about that.

I am talking about the introns in protein-coding genes. It appears that Moran considers those particular introns (in protein-coding genes) to be "junk". It appears that he considers those particular introns (in protein-coding genes) to be 9.6% of the entire genome.

I am hoping he can explain why he thinks that they are "junk". None of his other posts address this question.

Evolution theory is left with the odd idea that the exons in the gene are perfectly organized to provide the RNA and then the proteins for the functioning of the creature. But right beside all these exons are introns that are just "junk".

Please explain using evolution principles such as drift, mutation, selection etc how in the world that could happen.

Anonymous, I appreciate that you have taken the time to list some references. Please copy and paste from them the sentences that you think are relevant to the discussion. I used to read references people gave, only to find that my time had been wasted. I do not do that any more. You could start by just taking one of them.

Laurence A. Moran

Larry Moran is a Professor in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.
