Friday, April 19, 2013

Coelacanths Evolve More Slowly?

I don't have a lot of time today (I leave for Boston tomorrow) but I can't let this pass.

The complete draft genome of the African coelacanth, Latimeria chalumnae, has just been published in Nature (Amemiya et al. 2013). Coelacanths have long been regarded as "living fossils," a term that persists even though the data have been disputed ever since the first fish were identified 75 years ago. I couldn't believe what I was reading when I saw the press release from the Broad Institute in Boston [Coelacanth genome surfaces]. The author, Haley Bridger of Broad Communications, says ...

An international team of researchers has decoded the genome of a creature whose evolutionary history is both enigmatic and illuminating: the African coelacanth. A sea-cave dwelling, five-foot long fish with limb-like fins, the coelacanth was once thought to be extinct. A living coelacanth was discovered off the African coast in 1938, and since then, questions about these ancient-looking fish – popularly known as “living fossils” – have loomed large. Coelacanths today closely resemble the fossilized skeletons of their more than 300-million-year-old ancestors. Its genome confirms what many researchers had long suspected: genes in coelacanths are evolving more slowly than in other organisms.

“We found that the genes overall are evolving significantly slower than in every other fish and land vertebrate that we looked at,” said Jessica Alföldi, a research scientist at the Broad Institute and co-first author of a paper on the coelacanth genome, which appears in Nature this week. “This is the first time that we’ve had a big enough gene set to really see that.”

Researchers hypothesize that this slow rate of change may be because coelacanths simply have not needed to change: they live primarily off of the Eastern African coast (a second coelacanth species lives off the coast of Indonesia), at ocean depths where relatively little has changed over the millennia.

This can't be right, I said to myself. Let's check out the actual paper.

Unfortunately, it was right. Here's the figure and here's what the authors say in the results section of the paper.

The morphological resemblance of the modern coelacanth to its fossil ancestors has resulted in it being nicknamed ‘the living fossil.’ This invites the question of whether the genome of the coelacanth is as slowly evolving as its outward appearance suggests. Earlier work showed that a few gene families, such as Hox and protocadherins, have comparatively slower protein-coding evolution in coelacanth than in other vertebrate lineages. To address the question, we compared several features of the coelacanth genome to those of other vertebrate genomes.

Protein-coding gene evolution was examined using the phylogenomics data set described above (251 concatenated proteins) (Fig. 1). Pair-wise distances between taxa were calculated from the branch lengths of the tree using the two-cluster test proposed previously to test for equality of average substitution rates. Then, for each of the following species and species clusters (coelacanth, lungfish, chicken and mammals), we ascertained their respective mean distance to an outgroup consisting of three cartilaginous fishes (elephant shark, little skate and spotted catshark). Finally, we tested whether there was any significant difference in the distance to the outgroup of cartilaginous fish for every pair of species and species clusters, using a Z statistic. When these distances to the outgroup of cartilaginous fish were compared, we found that the coelacanth proteins that were tested were significantly more slowly evolving (0.890 substitutions per site) than the lungfish (1.05 substitutions per site), chicken (1.09 substitutions per site) and mammalian (1.21 substitutions per site) orthologues (P < 10^-6 in all cases) (Supplementary Data 5). In addition, as can be seen in Fig. 1, the substitution rate in coelacanth is approximately half that in tetrapods since the two lineages diverged. A Tajima’s relative rate test confirmed the coelacanth’s significantly slower rate of protein evolution (P < 10^-20).
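To put the quoted distances side by side, here is a small sketch that computes how each one compares with the coelacanth value. The four numbers are those quoted above; the dictionary and function names are mine. I am assuming the tree is additive and that all four lineages share the same path from their common ancestor back to the cartilaginous fish, in which case that shared stretch cancels out of the differences but not out of the ratios.

```python
# Mean distances to the cartilaginous-fish outgroup, as quoted from the paper
# (substitutions per site). Names below are illustrative, not from the paper.
DISTANCES = {
    "coelacanth": 0.890,
    "lungfish": 1.05,
    "chicken": 1.09,
    "mammals": 1.21,
}

def compare_to(reference="coelacanth", distances=DISTANCES):
    """Return the difference and ratio of each distance relative to `reference`."""
    ref = distances[reference]
    result = {}
    for name, d in distances.items():
        if name == reference:
            continue
        result[name] = {
            # Shared path to the outgroup cancels here (assumed additive tree).
            "difference": round(d - ref, 3),
            # The ratio still includes the shared path, so it understates
            # any difference confined to the lineage-specific branches.
            "ratio": round(d / ref, 3),
        }
    return result

if __name__ == "__main__":
    for name, row in compare_to().items():
        print(name, row)
```

Because the ratio of total distances is diluted by the shared path, it is not directly comparable with the "approximately half" figure, which refers only to the branches since the lineages diverged.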

The authors make it clear in the discussion that they think of molecular evolution of amino acid sequences only in terms of adaptation.

Since its discovery, the coelacanth has been referred to as a ‘living fossil’, owing to its morphological similarities to its fossil ancestors. However, questions have remained as to whether it is indeed evolving slowly, as morphological stasis does not necessarily imply genomic stasis. In this study, we have confirmed that the protein-coding genes of L. chalumnae show a decreased substitution rate compared to those of other sequenced vertebrates, even though its genome as a whole does not show evidence of low genome plasticity. The reason for this lower substitution rate is still unknown, although a static habitat and a lack of predation over evolutionary timescales could be contributing factors to a lower need for adaptation. A closer examination of gene families that show either unusually high or low levels of directional selection indicative of adaptation in the coelacanth may provide information on which selective pressures acted, and which pressures did not act, to shape this evolutionary relict.

This extraordinary claim flies in the face of everything we know about molecular evolution. Preliminary data from some of these same authors were criticized by Casane and Laurenti (2013) earlier this year. I'll quote what they said and leave it up to Sandwalk readers to draw their own conclusions.

Transposing the concept of ‘living fossil’ to the genomic level has led to the hypothesis of genetic stasis (or at least to the idea of a reduced molecular evolutionary rate) that is in sharp contrast with the principles of evolutionary genetics. Genomes change continuously under the combined effects of various mutational processes, that produce new variants, and genetic drift and selection, that eliminates or fixes them in populations. In other terms, the only possibility for genomes to replicate without change implies at least one of the two following conditions: (i) new variants do not appear (i.e. no mutations), and (ii) new variants are systematically eliminated by selection (i.e. no genetic drift and very powerful selection against new variants). Of course we can consider a less extreme case, i.e. a reduced evolutionary rate of the genome, but this still implies a lower mutation rate and/or stronger selection against new variants than observed in other species.
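The logic in that passage can be made concrete with the standard neutral-theory bookkeeping. This is my own minimal sketch, not something taken from either paper: it assumes a diploid population, that a fixed fraction of new mutations is effectively neutral, and that the rest are removed by purifying selection. The function name and all the numbers in the usage note are hypothetical.

```python
def neutral_substitution_rate(mutation_rate, population_size, neutral_fraction):
    """Substitutions per site per generation under a simple neutral model.

    mutation_rate    : mutations per site per generation
    population_size  : number of diploid individuals (N)
    neutral_fraction : fraction of new mutations that are effectively neutral
    """
    # New neutral mutations entering the population each generation (2N gene copies).
    new_neutral = 2 * population_size * mutation_rate * neutral_fraction
    # A new neutral mutation fixes by drift with probability 1 / (2N).
    fixation_probability = 1 / (2 * population_size)
    # Population size cancels: the rate is mutation_rate * neutral_fraction.
    return new_neutral * fixation_probability
```

With a hypothetical mutation rate of 1e-8 and a neutral fraction of 0.3, the function returns about 3e-9 whether the population size is 1,000 or 1,000,000. Under these assumptions the only ways to lower the rate are to lower the mutation rate or to shrink the neutral fraction, which is the point Casane and Laurenti are making.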

The coelacanth data make no sense. You should be very skeptical.

You should also wonder about the kind of people that Nature asks to review its papers. Reviewers may not be inclined to challenge the data, but they should challenge the conclusions, and they should ask the authors to address the fact that their interpretation is inconsistent with modern evolutionary theory.

One other thing, if you look through the names of the authors, you will see several people who should know better than to attach their name to a paper like this. What's going on?

[Photo Credit: This is a photo of a model of a related species Latimeria chalumnae from the Oxford University Museum. (Wikipedia)]

39 comments:

What I notice about the tree is that the length of a lineage seems closely related to the number of species sampled in that lineage, which leads me to suspect that this is all about artifacts of analysis. The way we detect evolutionary change is by comparison with other species, and the fewer species, the less our chances of detecting a particular change. This is clearly true in parsimony analyses, and I expect it also to be true (though to a somewhat lesser extent) in likelihood analyses. Nothing to see here, folks. Move along.

(Of course it's also possible that evolutionary change is correlated with speciation, as fans of punctuated equilibria might say. But that needs to be tested by a method that isn't biased toward such a conclusion.)

So are you saying the data is possibly wrong, or merely the interpretation? Because it seems to me that, as scientists, we have to always be willing to revise our theories in light of new data. And, as you point out, the authors aren't noobs in the areas of evolution OR genomics.

Just based on Larry's summary above, it seems they are basing their conclusion only on data from "protein-coding genes", and extrapolating this to the genome as a whole, an extrapolation that may not be justified. I hope I'm not misreading that.

"In this study, we have confirmed that the protein-coding genes of L. chalumnae show a decreased substitution rate compared to those of other sequenced vertebrates, even though its genome as a whole does not show evidence of low genome plasticity." They seem to imply that the substitution rate of the rest of the genome is similar to other species.

To clarify, evolution by genetic drift does occur, but it's directionless, which is what happens during stasis in punctuated equilibria. The phenotype shifts but randomly and not in any perceivable direction. It's more like a wobble. And neutral molecular evolution of course does not affect phenotypes so it should be expected in coelacanths.

I doubt if any sequences evolving neutrally can even be aligned, or recognized as homologous, between coelacanths and anything else. So all they have to compare would be sequences under fairly strong purifying selection, like most protein-coding exons and some structural RNAs.

I should mention that there are other reasons than PE for supposing a correlation between evolutionary rate and speciation rate, if this is at all what we're seeing here rather than an artifact of sampling and analysis. For example, speciosity, size, generation time, and mutation rate are all expected to be somewhat correlated.

Just want to point out that fig 3 in the paper actually quantifies (a point estimate of) the proportion of sites evolving under purifying selection (blue), neutrally (yellow) and positive selection (red) on each branch of the tree, and the strength of that selection, for the gene depicted. The analysis is based on comparing synonymous and nonsynonymous rates, so can only be done for protein coding regions. Throughout the tree, purifying selection dominates, with quite a few neutrally evolving sites and an essentially negligible nr of positively selected sites. When the authors speculate about adaptation playing a role, they are ignoring this evidence.

(Full disclosure: I'm one of the people responsible for the analysis methodology that produced this figure.)

What John Harshman says is wise. This --> "What I notice about the tree is that the length of a lineage seems closely related to the number of species sampled in that lineage, which leads me to suspect that this is all about artifacts of analysis." ...is a known phenomenon, there are articles on it, and a test for it. The name/author escapes me at the moment, maybe Pagel. It's expected to be a problem once sequences diverge enough that you could have multiple substitutions at some sites, but would miss them if you didn't have dense enough taxon sampling.

Anyhow, that said, it's not like a strict molecular clock often applies, and a cold environment, long generation time, low speciation rate, etc., might have some role in producing slower substitution. They report something like a 40% difference in substitution rate. Is that possible? Sure. Does it match the naive/silly expectation of "no evolution in a living fossil"? No, it falsifies it...

Ah, well at least they say they checked for this in the Supp. Mat., although nothing about the details is reported, despite 135 pages of Supp. Mat....

Relative rate of gene evolution

To test the rate of evolution of coelacanth relative to other species we performed two types of analyses, Tajima relative rate test [105] and Two-Cluster test [106], on the carefully curated dataset used for the phylogenomic analysis (see section "Determining the closest living fish relative of the tetrapod ancestor").

Tajima Relative Rate Test

First, we applied Tajima relative rate test (RRT) on the sequence alignments of a dataset consisting of approximately 250 genes. Each gene-set was separately aligned and sites with gaps or unknown amino acids were excluded. Each comparison included two ingroups and one outgroup. For each such triplet, we concatenated all the aligned gene-sets that included all three species and performed the Tajima RRT using in-house perl scripts. The relative rates of evolution between coelacanth and other species (lungfish, human, mouse, chicken and dog) were evaluated using each of the three chondrichthyan species as outgroup (Leucoraja erinacea, Callorhinchus milii, Scyliorhinus canicula). Tajima RRT analysis shows that coelacanth is not only evolving significantly slower than any of the tetrapod species used but also more slowly than lungfish (p < 0.05; Supplementary Dataset 6). An only slightly different picture is revealed on the respective analysis between lungfish and tetrapods. Lungfish is evolving significantly slower than human, mouse and dog, but seems to evolve as fast as the chicken. As can be seen in Figure 1, the substitution rate observed on the coelacanth lineage is approximately half that of tetrapods. Because branch lengths may be underestimated in regions of a tree that have few species, here potentially confounding the analysis of the coelacanth branch, we examined the node-density effect [107-108] in each tree of the Bayesian posterior distribution but found no evidence for this artifact.
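For anyone who wants to see what a Tajima relative rate test on a triplet looks like, here is a minimal sketch of the simplest version of the test. It is not the authors' in-house perl code, which is not shown in the supplement; the function name, the set of skipped characters, and the toy sequences in the usage note are my own assumptions. It counts sites where only one ingroup differs from the other two sequences and compares the two counts with a chi-square statistic on one degree of freedom.

```python
import math

def tajima_relative_rate(ingroup1, ingroup2, outgroup, skip="-X?"):
    """Tajima's relative rate test on three aligned sequences.

    Returns (m1, m2, chi2, p), where m1 and m2 are the numbers of sites at
    which only ingroup1 or only ingroup2 differs from the other two sequences.
    """
    if not (len(ingroup1) == len(ingroup2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    m1 = m2 = 0
    for a, b, o in zip(ingroup1, ingroup2, outgroup):
        # Exclude gaps and unknown residues, as described in the quoted text.
        if a in skip or b in skip or o in skip:
            continue
        if b == o and a != o:
            m1 += 1          # change unique to ingroup1
        elif a == o and b != o:
            m2 += 1          # change unique to ingroup2
        # Sites where all three agree or all three differ are uninformative.

    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0

    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper tail of a chi-square distribution with one degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return m1, m2, chi2, p
```

Calling `tajima_relative_rate("MKTAYIAKQR", "MRSAYLAKQ-", "MKTAYIAKQR")` on these made-up sequences counts three changes unique to the second ingroup and none unique to the first, and skips the gapped final column. Note that this version works directly on the aligned sequences, not on distances read off a tree.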

Still suspicious. According to the abstract, their relative rate tests were done with distances abstracted from the tree. So if there's an artifact, their tests would not detect it. Can you really test for a node-density effect using the posterior tree distribution? I don't actually see how.

The data is the sequences, not the analysis itself, which may or may not be adequately done. If the analysis is not properly done, then conclusions are compromised. And the only way to evaluate the analysis is the same in all science, by subjecting it to a discussion between the authors and the rest of the community.

Actually, neither the data nor the analyses look suspicious. The interpretation, on the other hand, is dubious:

Coelacanths are known (someone please correct me if this has been called into question) to have exceptionally long generation times. This means that we should expect a short branch length (if nr of substitutions per generation per site is similar, total nr of substitutions per site should be shorter), so that finding is not at all remarkable or in need of explanation.
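The arithmetic behind that expectation is simple enough to sketch. This assumes, as stated above, that the number of substitutions per site per generation is the same in both lineages; the function name and every number in the usage note are hypothetical and are not estimates for any real species.

```python
def expected_branch_length(subs_per_site_per_generation, generation_time_years, elapsed_years):
    """Expected substitutions per site accumulated over `elapsed_years`.

    Assumes a constant per-generation rate; only the number of generations
    that fit into the elapsed time differs between lineages.
    """
    generations = elapsed_years / generation_time_years
    return subs_per_site_per_generation * generations
```

For example, with a hypothetical per-generation rate of 1e-9 over 100 million years, a 5-year generation time gives 0.02 substitutions per site and a 10-year generation time gives 0.01. Doubling the generation time halves the branch length if nothing else changes.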

Exceptionally long generation times could explain the data provided that coelacanth eggs are formed after 30-odd generations and then remain dormant for one hundred years. The fish would also have to produce limited amounts of sperm during that time.

It is well established that mutation rate correlates strongly with generation time in vertebrates, and possibly also in other systems (e.g. invertebrates and plants). The reasons for the correlation are still under debate. See Thomas et al, MBE 2010, which starts with a review of some of the literature on this.

A series of recent studies on extant coelacanths has emphasised the slow rate of molecular and morphological evolution in these species. These studies were based on the assumption that a coelacanth is a ‘living fossil’ that has shown little morphological change since the Devonian, and they proposed a causal link between low molecular evolutionary rate and morphological stasis. Here, we have examined the available molecular and morphological data and show that: (i) low intra-specific molecular diversity does not imply low mutation rate, (ii) studies not showing low substitution rates in coelacanth are often neglected, (iii) the morphological stability of coelacanths is not supported by paleontological evidence. We recall that intra-species levels of molecular diversity, inter-species genome divergence rates and morphological divergence rates are under different constraints and they are not necessarily correlated. Finally, we emphasise that concepts such as ‘living fossil’, ‘basal lineage’, or ‘primitive extant species’ do not make sense from a tree-thinking perspective.


It is clear that ‘living fossil’, ‘basal lineage’, or ‘primitive extant species’ are rather misleading concepts, and that genomes keep evolving even while morphology is under stabilizing selection. Still, I thought the long-running controversy over whether evolutionary rates can differ between lineages was by now clearly decided in the affirmative.

In my own field of botany, there appears to be some evidence that shorter generation times lead to longer branches in molecular phylograms, meaning that molecular evolution is faster in short-lived and slower in long-lived organisms. I am still a bit puzzled why that would be so - I would consider it more logical that faster growth would speed up mutation rates. On the other hand, because sex increases recombination, the number of generations per time might reasonably be expected to have an impact. Interestingly, I have also seen talks at conferences showing that carnivorous plants sometimes show very long branches compared to their non-carnivorous relatives; no idea why that is.

You know, I sometimes think the same. I hate it how you cannot describe your phylogeny anymore without the reviewers throwing a bucket of red ink at the manuscript. Every phrase except "X is sister group to Y, and in Y A is the sister group of B" appears to be considered unscientific now, but it makes for very obnoxious writing.

I think that "basal" still makes sense because you always have a perspective: that of your current study group. When you study primates, the rodents are a basal branch; when you study rodents, the primates are a basal branch. Still, the problem is with the laypeople whose perspective is nearly invariably a great chain of being leading up to humans...

We can't even say "basal lineage" when a lineage is basal?? This is going too far...

You can, but it all depends... We primates are basal primatomorphs (from the colugos' point of view). Primatomorphs are basal euarchonts (with respect to tree-shrews). Euarchonta are basal Euarchontoglires (with respect to rodents and lagomorphs), etc., etc... with the ancestors of Tetrapoda ending up as basal sarcopterygians (if you ask the coelacanth).

So the term is not misleading at all if you use it or explain it right. Some lay people are still thinking of the Great Chain of Being, so TONS of scientific terms are going to be "misleading" for these people. Starting with "evolution".

I think it's misleading because it can't help but reference a great chain of being that doesn't exist. Even if you explain that there are different possible chains, it amounts to the same thing. The fact is that we are taking one fork of a basal divergence and calling it the main line of evolution and arbitrarily calling the other a side branch. If hagfish are basal vertebrates, then other vertebrates are basal hagfish, which sounds silly. The other meaning of "basal" is "less diverse than its sister group", which is also silly. Best dispensed with.

The problem remains that there is only one way of describing a tree left that is not considered misleading, and repeating the same phrase twenty times in a row in your results section looks like pretty bad writing and is seriously off-putting.

I find it hard to imagine why you would have to describe 20 nodes in a results section rather than allowing the tree to speak for itself, but there are still alternatives available. "X is the sister group of Y"; "X is outside Y"; "Y and Z form a clade that excludes X"; and so on. The perceived difficulty of combining accuracy and good writing is not an excuse.

We know that mutation rate per base is partially under genetic control, because different genes have different mutation rates per base. So I don't see that it is impossible for some lineages to evolve lower mutation rates than others. Lou Jost

Different genes are bound to have different mutation rates because that's what we should expect if the background mutation is random. Small genes will be more difficult to properly measure, and might look more biased, than larger genes, for example. Then there's purifying selection. Mutations causing harm will be selected against. This is why the apparent mutation rates inside genes are lower than in the spaces between genes or in introns. Then there's how important each part of a gene's sequence might be for function. If a lot of the gene is functionally important, we will see a much lower "mutation rate" because of stronger purifying selection ... so as you can see, before talking about genetic control for mutational rates, probability and purifying selection might explain a lot about apparent differences in mutation rates. Maybe some organisms do have different mutation rates for reasons other than random events combined with purifying selection. I have not studied that too much. Maybe some have better repair mechanisms and that translates as lower mutation rates. I have not studied that too much either.

No, different mutation rates per base aren't distributed at random across the genome. Mutation rates are properties of particular loci. These are actual mutation rates I am talking about, not a rate based on what's left after purifying selection. Those loci under no selection pressure often mutate more rapidly than loci which are under high selective pressure. LJ

Larry, so we agree that the mutation rate (error rate - repair rate) can vary with the locus. The existence of the hot spots you mention is evidence for this. There is also evidence of different mutation rates in different chromosomes of the same organism, and different mutation rates in different species. There is even variation in mutation rates between genotypes:

labs.eeb.utoronto.ca/agrawal/publications/nps_afa_pnas_2012.pdf

All of this suggests that mutation rate is under partial genetic control, and so it is certainly possible that the coelacanth has a lower mean mutation rate than most other vertebrates.

I have no opinion about whether it really does have a lower rate, though.

We have known that the error rate of DNA replication/repair is under genetic control for almost fifty years.

We also know that the error rate in most lineages has been approximately constant for hundreds of millions of years in spite of the fact that variation can arise from time to time.

It's possible that some clades have evolved an efficient replication/repair system that's twice as good as that in almost all other species but it's not the most parsimonious explanation of the data. Besides, the paper implies that the overall mutation rate in these fish isn't much different than in other species.

Note that the authors aren't invoking changes in mutation rate. They are claiming that the results support changes in the fixation probabilities in coelacanth protein-encoding genes. That's extremely unlikely, don't you agree? Would you have allowed this paper to be published with that kind of explanation and no mention of other possibilities or of the fact that this conflicts with a lot of theory and data?

Yes, if I had been a reviewer of this article I'd have asked for more explanation, and I would have made them give confidence intervals for meaningful parameters instead of (or alongside) p-values in their tests for rate differences. No more of this misleading "significantly slower" talk.

But my point was just that this effect, by itself, (especially if it were due to slightly lower mutation rate, perhaps via fewer hotspots) would not really be that earthshaking. LJ

I'm a bit late to this discussion, but I'm writing something on this for the museum where the picture on this blog post was taken (Oxford). I don't quite get their method: is the protein tree based on the amino acid sequence or the underlying ATCG code? If the latter (as I would guess from their choice of model), why is there no estimate of synonymous vs nonsynonymous evolution? This might show whether the slow rate is due to low mutation (few synonymous changes predicted) or to strong purifying selection (normal rate of accumulation of synonymous changes).

Also, is there likely to be an ascertainment bias in the genes chosen? Picking those that *can* be easily compared means you will undoubtedly choose conserved sequences. But this should equally apply to the tetrapod and coelacanth lineages, unless the filled-out tetrapod tree was used to infer a basal sequence for the tetrapods.

Laurence A. Moran

Larry Moran is a Professor in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.

