Friday, April 19, 2013

Coelacanths Evolve More Slowly?

I don't have a lot of time today (I leave for Boston tomorrow) but I can't let this pass.

The complete draft genome of the African coelacanth, Latimeria chalumnae, has just been published in Nature (Amemiya et al. 2013). Coelacanths have long been regarded as "living fossils," a term that persists even though the data have been disputed ever since the first fish were identified 75 years ago. I couldn't believe what I was reading when I saw the press release from the Broad Institute in Boston [Coelacanth genome surfaces]. The author, Haley Bridger of Broad Communications, says ...
An international team of researchers has decoded the genome of a creature whose evolutionary history is both enigmatic and illuminating: the African coelacanth. A sea-cave dwelling, five-foot long fish with limb-like fins, the coelacanth was once thought to be extinct. A living coelacanth was discovered off the African coast in 1938, and since then, questions about these ancient-looking fish – popularly known as “living fossils” – have loomed large. Coelacanths today closely resemble the fossilized skeletons of their more than 300-million-year-old ancestors. Its genome confirms what many researchers had long suspected: genes in coelacanths are evolving more slowly than in other organisms.

“We found that the genes overall are evolving significantly slower than in every other fish and land vertebrate that we looked at,” said Jessica Alföldi, a research scientist at the Broad Institute and co-first author of a paper on the coelacanth genome, which appears in Nature this week. “This is the first time that we’ve had a big enough gene set to really see that.”

Researchers hypothesize that this slow rate of change may be because coelacanths simply have not needed to change: they live primarily off of the Eastern African coast (a second coelacanth species lives off the coast of Indonesia), at ocean depths where relatively little has changed over the millennia.
This can't be right, I said to myself. Let's check out the actual paper.

Unfortunately, it was right. Here's the figure and here's what the authors say in the results section of the paper.
The morphological resemblance of the modern coelacanth to its fossil ancestors has resulted in it being nicknamed ‘the living fossil.’ This invites the question of whether the genome of the coelacanth is as slowly evolving as its outward appearance suggests. Earlier work showed that a few gene families, such as Hox and protocadherins, have comparatively slower protein-coding evolution in coelacanth than in other vertebrate lineages. To address the question, we compared several features of the coelacanth genome to those of other vertebrate genomes.

Protein-coding gene evolution was examined using the phylogenomics data set described above (251 concatenated proteins) (Fig. 1). Pair-wise distances between taxa were calculated from the branch lengths of the tree using the two-cluster test proposed previously to test for equality of average substitution rates. Then, for each of the following species and species clusters (coelacanth, lungfish, chicken and mammals), we ascertained their respective mean distance to an outgroup consisting of three cartilaginous fishes (elephant shark, little skate and spotted catshark). Finally, we tested whether there was any significant difference in the distance to the outgroup of cartilaginous fish for every pair of species and species clusters, using a Z statistic. When these distances to the outgroup of cartilaginous fish were compared, we found that the coelacanth proteins that were tested were significantly more slowly evolving (0.890 substitutions per site) than the lungfish (1.05 substitutions per site), chicken (1.09 substitutions per site) and mammalian (1.21 substitutions per site) orthologues (P < 10^−6 in all cases) (Supplementary Data 5). In addition, as can be seen in Fig. 1, the substitution rate in coelacanth is approximately half that in tetrapods since the two lineages diverged. A Tajima’s relative rate test confirmed the coelacanth’s significantly slower rate of protein evolution (P < 10^−20).
The authors make it clear in the discussion that they think of molecular evolution of amino acid sequences only in terms of adaptation.
Since its discovery, the coelacanth has been referred to as a ‘living fossil’, owing to its morphological similarities to its fossil ancestors. However, questions have remained as to whether it is indeed evolving slowly, as morphological stasis does not necessarily imply genomic stasis. In this study, we have confirmed that the protein-coding genes of L. chalumnae show a decreased substitution rate compared to those of other sequenced vertebrates, even though its genome as a whole does not show evidence of low genome plasticity. The reason for this lower substitution rate is still unknown, although a static habitat and a lack of predation over evolutionary timescales could be contributing factors to a lower need for adaptation. A closer examination of gene families that show either unusually high or low levels of directional selection indicative of adaptation in the coelacanth may provide information on which selective pressures acted, and which pressures did not act, to shape this evolutionary relict.
This extraordinary claim flies in the face of everything we know about molecular evolution. Preliminary data from some of these same authors were criticized by Casane and Laurenti (2013) earlier this year. I'll quote what they said and leave it up to Sandwalk readers to draw their own conclusions.
Transposing the concept of ‘living fossil’ to the genomic level has led to the hypothesis of genetic stasis (or at least to the idea of a reduced molecular evolutionary rate) that is in sharp contrast with the principles of evolutionary genetics. Genomes change continuously under the combined effects of various mutational processes, that produce new variants, and genetic drift and selection, that eliminates or fixes them in populations. In other terms, the only possibility for genomes to replicate without change implies at least one of the two following conditions: (i) new variants do not appear (i.e. no mutations), and (ii) new variants are systematically eliminated by selection (i.e. no genetic drift and very powerful selection against new variants). Of course we can consider a less extreme case, i.e. a reduced evolutionary rate of the genome, but this still implies a lower mutation rate and/or stronger selection against new variants than observed in other species.
The coelacanth data make no sense. You should be very skeptical.
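
The Casane and Laurenti argument can be made concrete with a few lines of arithmetic. The following is a minimal sketch, not anything taken from either paper: it uses Kimura's diffusion approximation for the fixation probability of a new mutation, and it assumes a diploid population whose effective size equals its census size. The function names and all numerical values (mutation rate, population size, selection coefficients) are hypothetical and chosen only for illustration.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's approximation for the fixation probability of one new mutation.

    Assumes a diploid population of size n (effective size taken equal to n)
    and an additive selection coefficient s. For s == 0 this reduces to the
    neutral value 1 / (2n).
    """
    if abs(s) < 1e-12:
        return 1.0 / (2 * n)
    exponent = -4.0 * n * s
    if exponent > 700:  # strongly deleterious: avoid overflow, probability ~ 0
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def substitution_rate(mu: float, s: float, n: int) -> float:
    """Substitutions per site per generation.

    New mutations arising per generation (2 * n * mu) multiplied by the
    probability that each one is eventually fixed.
    """
    return 2 * n * mu * fixation_probability(s, n)


if __name__ == "__main__":
    mu = 1e-8    # hypothetical mutation rate per site per generation
    n = 10_000   # hypothetical population size
    for s in (0.0, -1e-5, -1e-4):
        rate = substitution_rate(mu, s, n)
        print(f"s = {s:+.0e}: rate / mu = {rate / mu:.3f}")
```

For neutral variants the population size cancels and the substitution rate equals the mutation rate. In this model, the only ways to lower the rate are to lower the mutation rate or to make new variants more deleterious, which is the point of the quoted passage.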

You should also wonder about the kind of people that Nature asks to review their papers. Reviewers may not be inclined to challenge the data but they should challenge the conclusions and they should ask the authors to address the fact that their interpretation is inconsistent with modern evolutionary theory.

One other thing, if you look through the names of the authors, you will see several people who should know better than to attach their name to a paper like this. What's going on?



[Photo Credit: This is a photo of a model of a related species Latimeria chalumnae from the Oxford University Museum. (Wikipedia)]

Amemiya, C.T. et al. (2013) The African coelacanth genome provides insights into tetrapod evolution. Nature 496:311–316. [doi: 10.1038/nature12027]

Casane, D. and Laurenti, P. (2013) Why coelacanths are not ‘living fossils.’ BioEssays 35:332-338. [doi: 10.1002/bies.201200145]

39 comments :

  1. What I notice about the tree is that the length of a lineage seems closely related to the number of species sampled in that lineage, which leads me to suspect that this is all about artifacts of analysis. The way we detect evolutionary change is by comparison with other species, and the fewer species, the less our chances of detecting a particular change. This is clearly true in parsimony analyses, and I expect it also to be true (though to a somewhat lesser extent) in likelihood analyses. Nothing to see here, folks. Move along.

    (Of course it's also possible that evolutionary change is correlated with speciation, as fans of punctuated equilibria might say. But that needs to be tested by a method that isn't biased toward such a conclusion.)

  2. So are you saying the data is possibly wrong, or merely the interpretation? Because it seems to me that, as scientists, we have to always be willing to revise our theories in light of new data. And, as you point out, the authors aren't noobs in the areas of evolution OR genomics.

    Replies
    1. Just based on Larry's summary above, it seems they are basing their conclusion only on data from "protein-coding genes", and extrapolating this to the genome as a whole, an extrapolation that may not be justified. I hope I'm not misreading that.

    2. "In this study, we have confirmed that the protein-coding genes of L. chalumnae show a decreased substitution rate compared to those of other sequenced vertebrates, even though its genome as a whole does not show evidence of low genome plasticity."
      They seem to imply that the substitution rate of the rest of the genome is similar to other species.

    3. "Of course it's also possible that evolutionary change is correlated with speciation, as fans of punctuated equilibria might say"

      Yes, but that is only related to macroevolution. PE doesn't have any conflict with the concept of drift or neutral evolution.

    4. To clarify, evolution by genetic drift does occur, but it's directionless, which is what happens during stasis in punctuated equilibria. The phenotype shifts but randomly and not in any perceivable direction. It's more like a wobble. And neutral molecular evolution of course does not affect phenotypes so it should be expected in coelacanths.

    5. I doubt if any sequences evolving neutrally can even be aligned, or recognized as homologous, between coelacanths and anything else. So all they have to compare would be sequences under fairly strong purifying selection, like most protein-coding exons and some structural RNAs.

      I should mention that there are other reasons than PE for supposing a correlation between evolutionary rate and speciation rate, if this is at all what we're seeing here rather than an artifact of sampling and analysis. For example, speciosity, size, generation time, and mutation rate are all expected to be somewhat correlated.

    6. Just want to point out that fig 3 in the paper actually quantifies (a point estimate of) the proportion of sites evolving under purifying selection (blue), neutrally (yellow) and positive selection (red) on each branch of the tree, and the strength of that selection, for the gene depicted. The analysis is based on comparing synonymous and nonsynonymous rates, so can only be done for protein coding regions. Throughout the tree, purifying selection dominates, with quite a few neutrally evolving sites and an essentially negligible nr of positively selected sites. When the authors speculate about adaptation playing a role, they are ignoring this evidence.

      (Full disclosure: I'm one of the people responsible for the analysis methodology that produced this figure.)

  3. What John Harshman says is wise. This --> "What I notice about the tree is that the length of a lineage seems closely related to the number of species sampled in that lineage, which leads me to suspect that this is all about artifacts of analysis." ...is a known phenomenon, there are articles on it, and a test for it. The name/author escapes me at the moment, maybe Pagel. It's expected to be a problem once sequences diverge enough such that you could have multiple substitutions sometimes, but would miss them if you didn't have dense enough taxon sampling.
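
    To see why multiple substitutions get missed as sequences diverge, here is a minimal sketch using the Jukes-Cantor model for nucleotides. This is a deliberately simple stand-in: the paper analysed proteins with a Bayesian tree, and this is not the node-density test itself. It only shows how far the observed proportion of differing sites falls below the true number of substitutions per site on a long, unbroken branch. The function names are made up for the example.

```python
import math


def expected_p_distance(d: float) -> float:
    """Expected proportion of differing sites after d substitutions per site
    under the Jukes-Cantor model (equal rates among the four nucleotides)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p: float) -> float:
    """Jukes-Cantor correction: substitutions per site estimated from the
    observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    for d in (0.1, 0.5, 1.0, 2.0):
        p = expected_p_distance(d)
        print(f"true d = {d:.1f}  observed p = {p:.3f}  hidden = {d - p:.3f}")
```

    The hidden fraction grows with branch length. Extra taxa that break up a long branch expose some of those hidden changes, which is why sparsely sampled lineages can look shorter than they are.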

    Anyhow, that said, it's not like a strict molecular clock often applies, and a cold environment, long generation time, low speciation rate, etc., might have some role in producing slower substitution. They report something like a 40% difference in substitution rate. Is that possible? Sure. Does it match the naive/silly expectation of "no evolution in a living fossil"? No, it falsifies it...

    Replies
    1. Ah yes, here's what I was thinking of: "node-density artefact"

      Test for Punctuational Evolution and the Node-Density Artifact.

      A web interface for submitting trees to test for the node density artefact. Implementing the method described in Webster, A. J., R. J. Payne, and M. Pagel. 2003. Molecular phylogenies link rates of evolution and speciation. Science 301:478. and Venditti, C., Meade, A., Pagel, M. 2006. Detecting the node-density artifact in phylogeny reconstruction. Systematic Biology 55: 637-643

      http://www.evolution.reading.ac.uk/SoftwareMain.html

    2. Ah, well at least they say they checked for this in the Supp. Mat., although nothing about the details is reported, despite 135 pages of Supp. Mat....

      Relative rate of gene evolution

      To test the rate of evolution of coelacanth relative to other species we performed two types of analyses, Tajima relative rate test [105] and Two-Cluster test [106], on the carefully curated dataset used for the phylogenomic analysis (see section "Determining the closest living fish relative of the tetrapod ancestor").

      Tajima Relative Rate Test

      First, we applied Tajima relative rate test (RRT) on the sequence alignments of a dataset consisting of approximately 250 genes. Each gene-set was separately aligned and sites with gaps or unknown amino acids were excluded. Each comparison included two ingroups and one outgroup. For each such triplet, we concatenated all the aligned gene-sets that included all three species and performed the Tajima RRT using in-house perl scripts. The relative rates of evolution between coelacanth and other species (lungfish, human, mouse, chicken and dog) were evaluated using each of the three chondrichthyan species as outgroup (Leucoraja erinacea, Callorhinchus milii, Scyliorhinus canicula). Tajima RRT analysis shows that coelacanth is not only evolving significantly slower than any of the tetrapod species used but also more slowly than lungfish (p < 0.05; Supplementary Dataset 6). An only slightly different picture is revealed on the respective analysis between lungfish and tetrapods. Lungfish is evolving significantly slower than human, mouse and dog, but seems to evolve as fast as the chicken. As can be seen in Figure 1, the substitution rate observed on the coelacanth lineage is approximately half that of tetrapods. Because branch lengths may be underestimated in regions of a tree that have few species, here potentially confounding the analysis of the coelacanth branch, we examined the node-density effect [107, 108] in each tree of the Bayesian posterior distribution but found no evidence for this artifact.

      [...]

      107 Venditti, C., Meade, A. & Pagel, M. Detecting the node-density artifact in phylogeny reconstruction. Syst Biol 55, 637-643 (2006).

      108 Webster, A. J., Payne, R. J. & Pagel, M. Molecular phylogenies link rates of evolution and speciation. Science 301, 478, doi:10.1126/science.1083202 (2003)

    3. Still suspicious. According to the abstract, their relative rate tests were done with distances abstracted from the tree. So if there's an artifact, their tests would not detect it. Can you really test for a node-density effect using the posterior tree distribution? I don't actually see how.

  4. When the data do not fit your theory blame the data.

    Replies
    1. The data is the sequences, not the analysis itself, which may or may not be adequately done. If the analysis is not properly done, then conclusions are compromised. And the only way to evaluate the analysis is the same in all science, by subjecting it to a discussion between the authors and the rest of the community.

      The only exception is ID, which of course is right by default.

  5. "The coelacanth data make no sense. You should be very skeptical."

    Actually, neither the data nor the analyses look suspicious. The interpretation, on the other hand, is dubious:

    Coelacanths are known (someone please correct me if this has been called into question) to have exceptionally long generation times. This means that we should expect a short branch length (if nr of substitutions per generation per site is similar, total nr of substitutions per site should be shorter), so that finding is not at all remarkable or in need of explanation.

    Replies
    1. Exceptionally long generation times could explain the data provided that coelacanth eggs are formed after 30-odd generations and then remain dormant for one hundred years. The fish would also have to produce limited amounts of sperm during that time.

      Do you have a reference?

    2. It is well established that mutation rate correlates strongly with generation time in vertebrates, and possibly also in other systems (e.g. invertebrates and plants). The reasons for the correlation are still under debate. See Thomas et al, MBE 2010, which starts with a review of some of the literature on this.

    3. Also note that the reported effect size is not very large.
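
      To put numbers on the effect size, here is a minimal sketch that uses the four mean distances to the cartilaginous-fish outgroup quoted in the post. Those distances include the branch leading from the outgroup to the point where the lineages split, which all of them share, so the raw ratio understates the difference on the lineage-specific part of the tree. The shared length used below (0.57) is NOT from the paper. It is a hypothetical value picked to show how a ratio of roughly three quarters can coexist with the "approximately half" read off Fig. 1.

```python
# Mean distances to the outgroup quoted from the paper (substitutions per site).
DISTANCES = {
    "coelacanth": 0.890,
    "lungfish": 1.05,
    "chicken": 1.09,
    "mammals": 1.21,
}


def ratio_to(reference: str, other: str, shared: float = 0.0) -> float:
    """Ratio of two outgroup distances after removing a shared branch length.

    With shared == 0 this is the raw ratio of the reported distances.
    A positive shared value removes the portion of the path common to both
    lineages, leaving only the lineage-specific parts.
    """
    return (DISTANCES[reference] - shared) / (DISTANCES[other] - shared)


if __name__ == "__main__":
    for other in ("lungfish", "chicken", "mammals"):
        print(f"coelacanth / {other}: {ratio_to('coelacanth', other):.3f}")
    # HYPOTHETICAL shared branch length, for illustration only.
    shared = 0.57
    print(f"with shared = {shared}: {ratio_to('coelacanth', 'mammals', shared):.3f}")
```

      The raw coelacanth to mammal ratio is roughly three quarters. How much smaller the lineage-specific ratio is depends entirely on the shared branch length, which has to be read from the tree itself.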

  6. http://onlinelibrary.wiley.com/doi/10.1002/bies.201200145/abstract
    ==============================================
    Why coelacanths are not ‘living fossils’

    A review of molecular and morphological data
    Didier Casane, Patrick Laurenti
    Article first published online: 4 FEB 2013

    DOI: 10.1002/bies.201200145

    BioEssays
    Volume 35, Issue 4, pages 332–338, April 2013

    Keywords:

    coelacanth;evolutionary rate;Latimeria;living fossil;slow evolution;substitution rate;tree-thinking

    Abstract

    A series of recent studies on extant coelacanths has emphasised the slow rate of molecular and morphological evolution in these species. These studies were based on the assumption that a coelacanth is a ‘living fossil’ that has shown little morphological change since the Devonian, and they proposed a causal link between low molecular evolutionary rate and morphological stasis. Here, we have examined the available molecular and morphological data and show that: (i) low intra-specific molecular diversity does not imply low mutation rate, (ii) studies not showing low substitution rates in coelacanth are often neglected, (iii) the morphological stability of coelacanths is not supported by paleontological evidence. We recall that intra-species levels of molecular diversity, inter-species genome divergence rates and morphological divergence rates are under different constraints and they are not necessarily correlated. Finally, we emphasise that concepts such as ‘living fossil’, ‘basal lineage’, or ‘primitive extant species’ do not make sense from a tree-thinking perspective.

    Editor's suggested further reading in BioEssays: Tree thinking for all biology: the problem with reading phylogenies as ladders of progress
    ==============================================

  7. It is clear that ‘living fossil’, ‘basal lineage’, or ‘primitive extant species’ are rather misleading concepts, and that genomes keep evolving even while morphology is under stabilizing selection. Still, I thought the long-going controversy over whether evolutionary rates can differ between lineages was by now clearly decided in the affirmative.

    In my own field of botany, there appears to be some evidence that shorter generation times lead to longer branches in molecular phylograms, meaning that molecular evolution is faster in short-lived and slower in long-lived organisms. I am still a bit puzzled why that would be so - I would consider it more logical that faster growth would speed up mutation rates. On the other hand, because sex increases recombination, the number of generations per time might reasonably be expected to have an impact. Interestingly, I have also seen talks at conferences showing that carnivorous plants sometimes show very long branches compared to their non-carnivorous relatives; no idea why that is.

    Replies
    1. We can't even say "basal lineage" when a lineage is basal??
      This is going too far...

    2. You know, I sometimes think the same. I hate it how you cannot describe your phylogeny anymore without the reviewers throwing a bucket of red ink at the manuscript. Every phrase except "X is sister group to Y, and in Y A is the sister group of B" appears to be considered unscientific now, but it makes for very obnoxious writing.

      I think that "basal" still makes sense because you always have a perspective: that of your current study group. When you study primates, the rodents are a basal branch; when you study rodents, the primates are a basal branch. Still, the problem is with the laypeople whose perspective is nearly invariably a great chain of being leading up to humans...

    3. "We can't even say 'basal lineage' when a lineage is basal?? This is going too far..."

      You can, but it all depends... We primates are basal primatomorphs (from the colugos' point of view). Primatomorphs are basal euarchonts (with respect to tree-shrews). Euarchonta are basal Euarchontoglires (with respect to rodents and lagomorphs), etc., etc... with the ancestors of Tetrapoda ending up as basal sarcopterygians (if you ask the coelacanth).

    4. So the term is not misleading at all if you use it or explain it right. Some lay people are still thinking of the Great Chain of Being, so TONS of scientific terms are going to be "misleading" for these people. Starting with "evolution".

    5. I think it's misleading because it can't help but reference a great chain of being that doesn't exist. Even if you explain that there are different possible chains, it amounts to the same thing. The fact is that we are taking one fork of a basal divergence and calling it the main line of evolution and arbitrarily calling the other a side branch. If hagfish are basal vertebrates, then other vertebrates are basal hagfish, which sounds silly. The other meaning of "basal" is "less diverse than its sister group", which is also silly. Best dispensed with.

    6. The problem remains that there is only one way of describing a tree left that is not considered misleading, and repeating the same phrase twenty times in a row in your results section looks like pretty bad writing and is seriously off-putting.

    7. I find it hard to imagine why you would have to describe 20 nodes in a results section rather than allowing the tree to speak for itself, but there are still alternatives available. "X is the sister group of Y"; "X is outside Y"; "Y and Z form a clade that excludes X"; and so on. The perceived difficulty of combining accuracy and good writing is not an excuse.

    8. So if I understand you correctly, your results section in its entirety would read something like this: "The phylogeny can be seen in figure 2"?

    9. If. In other words, no. I merely say that it's 1) unnecessary to say everything you think you need to and 2) possible to find several ways of saying what you actually do need to.

  8. We know that mutation rate per base is partially under genetic control, because different genes have different mutation rates per base. So I don't see that it is impossible for some lineages to evolve lower mutation rates than others.
    Lou Jost

    Replies
    1. Different genes are bound to have different mutation rates because that's what we should expect if the background mutation is random. Small genes will be more difficult to properly measure, and might look more biased, than larger genes, for example. Then there's purifying selection. Mutations causing harm will be selected against. This is why the apparent mutation rates inside genes are lower than in the spaces between genes or in introns. Then there's how important each part of a gene's sequence might be for function. If a lot of the gene is functionally important, we will see a much lower "mutation rate" because of stronger purifying selection ... so as you can see, before talking about genetic control for mutational rates, probability and purifying selection might explain a lot about apparent differences in mutation rates. Maybe some organisms do have different mutation rates for reasons other than random events combined with purifying selection. I have not studied that too much. Maybe some have better repair mechanisms and that translates as lower mutation rates. I have not studied that too much either.

    2. No, different mutation rates per base aren't distributed at random across the genome. Mutation rates are properties of particular loci. These are actual mutation rates I am talking about, not a rate based on what's left after purifying selection. Those loci under no selection pressure often mutate more rapidly than loci which are under high selective pressure.
      LJ

    3. @Lou Jost

      The mutation rate is mostly determined by the error rate of DNA replication. There are a few hotspots but the error rate is pretty much the same throughout the entire genome.

    4. Larry, so we agree that the mutation rate (error rate - repair rate) can vary with the locus. The existence of the hot spots you mention is evidence for this. There is also evidence of different mutation rates in different chromosomes of the same organism, and different mutation rates in different species. There is even variation in mutation rates between genotypes:

      labs.eeb.utoronto.ca/agrawal/publications/nps_afa_pnas_2012.pdf

      All of this suggests that mutation rate is under partial genetic control, and so it is certainly possible that the coelacanth has a lower mean mutation rate than most other vertebrates.

      I have no opinion about whether it really does have a lower rate, though.

      LJ

    5. We have known that the error rate of DNA replication/repair is under genetic control for almost fifty years.

      We also know that the error rate in most lineages has been approximately constant for hundreds of millions of years in spite of the fact that variation can arise from time to time.

      It's possible that some clades have evolved an efficient replication/repair system that's twice as good as that in almost all other species but it's not the most parsimonious explanation of the data. Besides, the paper implies that the overall mutation rate in these fish isn't much different than in other species.

      Note that the authors aren't invoking changes in mutation rate. They are claiming that the results support changes in the fixation probabilities in coelacanth protein-encoding genes. That's extremely unlikely, don't you agree? Would you have allowed this paper to be published with that kind of explanation and no mention of other possibilities or of the fact that this conflicts with a lot of theory and data?

    6. Yes, if I had been a reviewer of this article I'd have asked for more explanation, and I would have made them give confidence intervals for meaningful parameters instead of (or alongside) p-values in their tests for rate differences. No more of this misleading "significantly slower" talk.

      But my point was just that this effect, by itself, (especially if it were due to slightly lower mutation rate, perhaps via fewer hotspots) would not really be that earthshaking.
      LJ

  9. Larry: where do they actually make that claim? I saw speculation, but no actual claim. And as I pointed out above, they present (presumably without realizing) results against such a claim.

    Replies
    1. (The claim that their results support changes in fixation probabilities of protein coding genes.)

  10. I'm a bit late to this discussion, but I'm writing something on this for the museum where the picture on this blog post was taken (Oxford). I don't quite get their method: is the protein tree based on the amino acid sequence or the underlying ATCG code? If the latter (as I would guess from their choice of model), why is there no estimate of synonymous vs nonsynonymous evolution? This might show whether the slow rate is due to low mutation (few synonymous changes predicted) or to strong purifying selection (normal rate of accumulation of synonymous changes).
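
    As a rough illustration of the synonymous versus nonsynonymous comparison, here is a minimal sketch that classifies codon differences between two aligned coding sequences using the standard genetic code. It is a simplification and not a dN/dS estimator: it only tallies codons that differ at exactly one position, and it does not normalise by the number of synonymous and nonsynonymous sites or correct for multiple hits, both of which a real analysis needs. The function name and the toy sequences are made up.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def count_single_step_changes(seq1: str, seq2: str):
    """Count synonymous and nonsynonymous codon differences.

    Only codon pairs that differ at exactly one nucleotide are counted.
    Codons with gaps or ambiguous bases, and codon pairs that differ at
    two or three positions, are skipped (a simplification for this sketch).
    Returns (synonymous, nonsynonymous).
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and of equal length")
    syn = nonsyn = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue
        if sum(x != y for x, y in zip(c1, c2)) != 1:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


if __name__ == "__main__":
    # Toy sequences, not real data.
    print(count_single_step_changes("ATGTTTGGA", "ATGTTCGCA"))
```

    If a slow lineage showed few changes of both kinds, a low mutation rate would be the natural suspect. A normal number of synonymous changes with few nonsynonymous ones would point to purifying selection instead.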

    Also, is there likely to be an ascertainment bias in the genes chosen? Picking those that *can* be easily compared means you will undoubtedly choose conserved sequences. But this should equally apply to the tetrapod and coelacanth lineages, unless the filled-out tetrapod tree was used to infer a basal sequence for the tetrapods.
