
Sunday, November 26, 2006

The Three Domain Hypothesis (part 3)

The scientific dispute over The Three Domain Hypothesis is based on the validity of RNA trees, the importance of protein trees that disagree with the rRNA tree, the evidence for fusions, and the frequency of Lateral Gene Transfer (LGT). But, as usual, there’s more to it than just science. The side with the best advocates has a huge advantage in fights like this.

Let's set the stage by quoting from the article by William Martin.
Thus, it seems to me that there is a schisma abrew in cell evolution, with the rRNA tree and proponents of its infallibility on the one side and other forms of evidence, proponents of LGT, or proponents of a symbiotic origin of eukaryotes on the other. The former camp is well organized behind a unified view (be it right or wrong, still a view) and is arguing that we already have the answers to microbial evolution. The latter camp is not organized into castes of recognized leadership and followers, meaning that (if we are lucky) concepts and their merits, not position or power, will determine the outcome of the battle as to what ideas might or might not be worthwhile entertaining as a working hypothesis for the purpose of further scientific endeavour.
The article by Norman Pace represents the side that already has the answers. He is a strong proponent of the Three Domain Hypothesis. These days, the main thrust of his argument is that we should all jump on the bandwagon or risk being left behind. I heard him speak in San Francisco last April and he sounded more like a preacher than a scientist. His article in Nature, “Time for a Change”, is an example of the way the Three Domain Hypothesis proponents have been arguing for 20 years.

One of the key problems in deep phylogeny is choosing the right gene. Pace argues in favor of ribosomal RNA—not a surprise since he has invested over 20 years in this molecule. Ideally, what kind of gene do we want to examine in order to determine the deepest branches in the tree of life? According to Pace there are three criteria ....
1. The gene must be universal.
2. The gene must have resisted lateral gene transfer.
3. The gene must be large enough to provide useful phylogenetic information.
Only ribosomal RNA meets all three criteria, says Pace.

There’s no question about #1. Ribosomal RNA genes are found in all species. There are very few other genes that meet this criterion. Almost all other candidates are absent in at least a few species. Ribosomal RNA satisfies #3 as well. Even the small subunit is large enough.

What about #2? Which genes have “resisted” lateral gene transfer? You can’t just declare by fiat that ribosomal RNA genes haven’t been transferred. It’s a debatable question as we’ll see later on.

I would add three other criteria.
4. The gene must be unique, or if it isn’t, paralogues must be easily recognized.
5. The gene must encode a protein because it’s much more accurate to analyze amino acid sequences than nucleic acid sequences. (And easier to align.)
6. The gene must be highly conserved in order to retain significant sequence similarity at the deepest levels.
Ribosomal RNA doesn’t do so well when we add these criteria. Most bacterial genomes have multiple copies of ribosomal RNA genes. They are usually 99% similar but there are known examples of more divergent paralogues. This is not likely to be a serious problem for deep phylogeny, but it has caused problems at the species level.

Ribosomal RNA does not encode protein. That’s a serious problem that Pace never addresses.
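
To see why the size of the alphabet matters for criterion 5, here is a minimal sketch of the chance-identity baseline. It assumes that every symbol is equally likely and that positions are independent, which is a simplification I'm adding for illustration and not something taken from Pace's article. The function name chance_identity is hypothetical.

```python
from fractions import Fraction


def chance_identity(alphabet_size: int) -> Fraction:
    """Expected fraction of identical positions between two unrelated sequences.

    Assumption (for illustration only): all symbols are equally frequent and
    every position is independent of the others.
    """
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    return Fraction(1, alphabet_size)


# Four bases for rRNA, twenty amino acids for a protein.
nucleotide_baseline = chance_identity(4)
protein_baseline = chance_identity(20)

print(f"nucleic acid baseline: {float(nucleotide_baseline):.0%}")
print(f"protein baseline:      {float(protein_baseline):.0%}")
```

Under these assumptions two unrelated nucleic acid sequences already match at about one position in four, while two unrelated proteins match at about one in twenty. Real base and amino acid frequencies are not uniform, which can only raise both baselines, so treat these numbers as rough lower bounds.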

Ribosomal RNA genes are well conserved but not as highly conserved as some others. This is why rRNA can be used to distinguish closely related species whereas the sequences of other genes are identical unless the species diverged more than 10-20 million years ago. Part of the problem with using rRNA sequences in deep phylogeny is that they are too divergent.
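
For readers who want to put a number on "conserved", here is a rough sketch of how percent identity can be computed from two sequences that have already been aligned. The function name percent_identity, the gap symbol, and the toy sequences are all made up for illustration. The choice to skip any column containing a gap is just one convention among several, and it is my assumption here.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped (one possible
    convention, assumed here for illustration).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up fragments: one mismatch in ten ungapped columns.
print(percent_identity("ACGUACGUAC", "ACGUACGAAC"))
```

A value close to the chance level for the alphabet means the sequences retain little usable signal, and the result always depends on how good the alignment was in the first place.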

Having declared that ribosomal RNA genes are the best choice, Pace then goes on to show us the “true” universal tree of life. As you can see, it is divided into three distinct clusters separated by long branches. The clades represent Bacteria, Archaea, and Eukaryotes; the Three Domains. The prokaryotes (Bacteria and Archaea) seem to associate and the eukaryotes seem to be more distantly related.

But first impressions can be misleading. Pace puts the root on the branch leading to Bacteria and not on the long branch leading to Eukaryotes. This root is based entirely on two old papers from 1989, which he references. Both of these papers have been refuted, but that’s not something you would learn from reading Pace’s article. (There are other, more recent, experiments that root the tree on the bacterial branch and these should have been used. The fact that they weren’t reflects Pace’s degree of critical thinking on this problem.)

To many of us, the large scale structure of the tree of life just doesn’t look right. The long branches leading from the trifurcation point to Bacteria and Eukaryotes smack of artifact. The branching within each of the domains looks too simple. It’s part of the reason why there’s skepticism about the rRNA tree, as we’ll see.

The rest of the article is a passionate defense of the importance of bacteria. I agree with him, for the most part, and so do lots of evolutionary biologists. Bacteria are much more important than eukaryotes! :-)

Pace contributes very little to the debate since he is not willing to entertain any doubts about the Three Domain Hypothesis. For that we have to look at some other papers.



Microbial Phylogeny and Evolution: Concepts and Controversies Jan Sapp, ed., Oxford University Press, Oxford UK (2005)

Jan Sapp The Bacterium’s Place in Nature

Norman Pace The Large-Scale Structure of the Tree of Life.

Wolfgang Ludwig and Karl-Heinz Schleifer The Molecular Phylogeny of Bacteria Based on Conserved Genes.

Carl Woese Evolving Biological Organization.

W. Ford Doolittle If the Tree of Life Fell, Would it Make a Sound?.

William Martin Woe Is the Tree of Life.

Radhey Gupta Molecular Sequences and the Early History of Life.

C. G. Kurland Paradigm Lost.

8 comments:

Rosie Redfield said...

The old idea that we need to choose "the right gene" is part of the problem. We shouldn't infer evolutionary relationships from a single gene any more than we should infer them from a single bone.

The "right gene" idea dates from the time when getting the sequence of even one gene was expensive. The question now is how much (and what kind of) sequence or non-sequence characters do we need to get a satisfactory approximation of evolutionary histories?

Maybe the best approach would be to take those organisms with whole-genome sequences and see how much information can be left out without seriously weakening the tree.

(I think Bill Martin's camp also has its share of politics and charismatic leaders.)

Unknown said...

About your proposed criterion number 5: The predicted secondary structure of rDNA could be used as a guide in the alignment process, and this will make it as accurate as a protein alignment.

Larry Moran said...

There are many assumptions in using the "predicted" secondary structure as a guide. First, you have to assume that your predictions are correct. Second, you have to assume that there has been no migration of secondary structure as double-stranded regions expand and contract over time. Third, you have to assume that secondary structure is conserved.

Even if all these assumptions hold you still have the problem of only four possible bases at each position and you still have the problem of deciding where the gaps should be placed.

If you've ever tried aligning ribosomal RNA sequences you'll find that it's much more difficult than aligning highly conserved amino acid sequences, even when you have predictions of double-stranded regions.

There's a large aligned sequence database of ribosomal RNAs so it's not that hard to fit in another related sequence. What we don't know is how accurate the original alignments were in the first place. (Some work has been done on this. If you have references, I'd appreciate hearing about them.)

Anonymous said...

Would it be possible to cite journal and year in addition to author’s name and article title? Or are all the mentioned articles chapters in the book edited by Sapp?

Larry Moran said...

All the articles are from the book. I haven't got time to do a complete review of the literature so I decided to concentrate on the articles in the book in order to highlight the controversy.

Most people don't realize that there's a fight going on.

Jonathan Badger said...

Bill Martin, as quoted by Larry "Thus, it seems to me that there is a schisma abrew in cell evolution, with the rRNA tree and proponents of its infallibility on the one side and other forms of evidence, proponents of LGT, or proponents of a symbiotic origin of eukaryotes on the other. The former camp is well organized behind a unified view (be it right or wrong, still a view) and is arguing that we already have the answers to microbial evolution."

I think Bill Martin is creating a straw man here; *nobody* on the "rRNA side" (really, the three domain side) thinks that "we already have the answers to microbial evolution".

Everybody knows that there are a lot of unanswered questions in evolution and studies of whole genome phylogeny, LGT and the possible symbiotic origin of eukaryotes are of interest to everybody in the field of microbial evolution. Much of the research in these subjects is conducted by people who support the three domains (not surprisingly, since that is the current majority view).

Larry Moran said...

Martin may be guilty of hyperbole—aren't we all from time to time?—but he makes a valid point. The proponents of the Three Domain Hypothesis have been very forceful in pushing the idea. They're currently acting as if it's a done deal and all the textbooks should be changed.

For example, Pace argues in "Time for a Change" (Nature 441:289 (2006)) that the word "prokaryote" should be abandoned now that we "know" for sure that the earliest split was not between eukaryotes and prokaryotes. He says ...

The use of the term 'prokaryote' fails to recognize that an idea about life's origins has been proved wrong.

I don't know about you but that sure sounds to me like he already knows the answer. Pace closes with some strong language ...

I believe it is critical to shake loose from the prokaryote/eukaryote concept. It is outdated, a guesswork solution to an articulation of biological diversity and an incorrect model for the course of evolution. Because it has long been used by all texts of biology, it is hard to stop using the word, prokaryote. But the next time you are inclined to do so, think what you teach your students: a wrong idea.

Do you agree? Do you think the Three Domain Hypothesis is so solidly established that all other hypotheses are just plain wrong?

In my opinion, that's not a good example of how scientists should behave when they know that respected colleagues disagree with their favorite hypothesis.

Jonathan Badger said...

Pace "Because it has long been used by all texts of biology, it is hard to stop using the word, prokaryote. But the next time you are inclined to do so, think what you teach your students: a wrong idea."

Larry: "Do you agree? Do you think the Three Domain Hypothesis is so solidly established that all other hypotheses are just plain wrong?"


I don't think that's what Pace is saying -- as I interpret it, all he is saying is that it doesn't seem plausible to maintain Haeckel's 19th century idea of Kingdom Monera where all organisms lacking nuclei were assumed to be more or less the same; we know too much about the biochemical and genetic differences between bacteria and archaea to believe it any more. And I agree.

But that doesn't mean that the Three Domain Hypothesis in the traditional sense is the One True Answer to explain these differences. It could well be that in the future people will believe that only the eukaryotic nucleus is related to archaea and that the cytoplasm arose from a bacterium or even some other form of life that we haven't discovered yet. Science gets more and more complicated as we learn more.

But just as flaws in Einstein's theories of gravity don't send physicists rushing back to Newton, possible flaws in Woese's Three Domains won't send biologists back to Haeckel.