11 July 2010

Introns. Let's think about this, people. Part IV.

So why is it that I and many other biologists hypothesize that introns are mostly non-functional?

(I'll assume that you've read the previous posts, and that you understand what I mean when I challenge claims that introns are functional elements in an information-rich genome. And to avoid confusion, I'll speak only for myself, although I surmise that only a tiny minority of biologists would agree with creationist characterizations of the human genome.)

Here are the basic data that lead me to conclude that intron sequences are mostly dispensable for biological function. I've provided links to key references, and we can go into more detail in further posts or in the comments.

1. When we compare genomes from similar types of organisms, we see that coding sequences are highly conserved, meaning that these sequences are very similar in the different organisms. But the intron sequences tend to vary considerably, both in length and in sequence. These variations do not generally correlate with organismal complexity or functional innovation. I surmise that introns can be changed significantly without affecting development or function, and I conclude that this is because most intronic sequence is of no functional value.

2. When genomes from closely related mammals are compared, introns appear to be lost frequently but almost never gained. Loss of introns occurs in a pattern that suggests a specific mechanism for the deletion, and the losses do not result in known functional deficits. I surmise that the mere existence of many introns, never mind the sequences they harbor, is dispensable for development or function in mammals. (Intron loss is common in other animals as well.)

3. When genes within the same organism are compared, those genes that are most highly expressed (the ones that are used the most) have the shortest introns. (See analyses in chicken and nematode and human.) I surmise that introns tend to be a burden on genetic machinery, such that natural selection favors shorter introns, and thus compactness, in oft-used genes. And I conclude that intron sequences are thus of little consequence, noting that the most heavily used genes are the genes with the least intron sequence.

4. When we examine animal genomes that are notable for their small size, and compare them to animal genomes of average or large size, we see that introns are dramatically reduced in size while coding sequences remain roughly constant. The variations do not correlate with organismal complexity or functional innovation. I surmise that because downsizing of genomes leads invariably to downsizing of intron space, much of the content of intron space is dispensable for development or function in animals.

5. When we examine the genetic causes for various diseases or traits, we find mutations in coding sequences, or mutations that alter splicing, or mutations that affect genetic control regions, far more often than we find mutations in introns. (Take a stroll through the OMIM database; pick your favorite human disease and try to find intron mutations.) This despite the fact that intron space is at least twenty times the size of coding space and vastly bigger than the genomic space allotted to splice codes and control sequences. I surmise that most of the changes that really matter are changes to genes and their expression; mutations in introns are almost always inconsequential.

6. Genetic experiments over the last three decades have shown that, with rare exceptions, the genetic deletion of a gene (by disruption of one or more of its coding regions) can be corrected by re-introducing the coding region alone into the organism. (Here's a recent example from a mouse experiment, chosen from scores or hundreds in the literature.) In other words, a mutant organism in which gene X has been inactivated can be "rescued" (made largely normal again) by the insertion of the coding sequence of gene X without introns. I conclude that intron sequences rarely harbor significant functional information, which is far more likely to be found in coding sequences and control regions.

7. Transposable elements – those weird genomic players that can jump around in genomes – are far more common in introns than they are in coding regions. The implication is that introns are much more tolerant of such messing around than are the coding sequences. I surmise that intron sequences are rarely relevant to biological function.
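
For readers who want to poke at points 3 and 5 themselves, here is a minimal Python sketch of the two calculations involved: measuring how much of a gene is intron versus exon, and checking whether expression level and intron length move in opposite directions across genes. Everything in it is an assumption made for illustration: the function names (`intron_lengths`, `exon_length`, `ranks`, `spearman`) are hypothetical, the coordinates and expression values are invented toy numbers rather than real annotations, and exon length is used as a rough stand-in for coding length even though real exons also include untranslated regions.

```python
import math
from typing import List, Sequence, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive; toy coordinates


def intron_lengths(exons: List[Exon]) -> List[int]:
    """Lengths of the gaps between consecutive exons of one transcript."""
    ordered = sorted(exons)
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


def exon_length(exons: List[Exon]) -> int:
    """Total exonic length (a stand-in for coding length in this sketch)."""
    return sum(end - start + 1 for start, end in exons)


def ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented toy gene: three exons separated by two introns.
toy_gene = [(1, 100), (201, 300), (1001, 1100)]
intron_total = sum(intron_lengths(toy_gene))
print("intron-to-exon ratio:", intron_total / exon_length(toy_gene))

# Invented toy table: expression level and total intron length for four genes.
toy_expression = [500, 120, 30, 5]
toy_intron_totals = [800, 2400, 9000, 40000]
print("rank correlation:", spearman(toy_expression, toy_intron_totals))
```

A negative rank correlation on real data would be the pattern described in point 3, and a large intron-to-exon ratio summed over a whole genome is the size comparison behind point 5. The toy numbers here are arranged to show the pattern and say nothing about any actual genome.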

Conclusion: intron space looks like a junkyard to me. At least in medium-sized genomes like ours, I see little evidence that introns are streamlined, efficient, functionally critical information repositories. I see them as messy collections of evolutionary debris, harboring lots of interesting functional bits but largely consisting of flotsam that is as likely to cause dysfunction as it is to lead to evolutionary opportunity.

Next and final post in the series: some ideas for experimental tests of the assertions of design theorists who posit that introns are characteristics of genomes that are "dominated by sequences rich in functional information."