We're interested in the last universal common ancestor of all life (LUCA). In theory, this is a species that gave rise to both Bacteria and Archaea. The general assumption is that this is a single species with a well-defined set of genes that can now be found in all, or almost all, living species.
There are some scientists who question that assumption because they see massive transfers of genes between "species" during the early history of life. This gives rise to a web of life and not a well-defined tree. [The Three Domain Hypothesis: RIP] [The Web of Life] If that model is correct, then the ancestor of all living species could be a group of species that contributed different genes to a pool of organisms that lived billions of years ago. Early Bacterial and Archaeal ancestors could have independently acquired some genes by horizontal gene transfer.
In spite of these complications, it seems reasonable to deduce which genes are ancient by looking at genes that are common to both domains (Bacteria and Archaea). A special class of such genes includes ancient paralogs: related genes that descend from an ancient gene duplication event. A classic example is the EF-Tu gene and the IF2 gene. EF-Tu is the elongation factor required during translation and IF2 is one of the initiation factors. The two genes are related because they arose from a gene duplication event.1
Both of these genes are found in Bacteria and Archaea and all their descendants so it's reasonable to assume that both genes were present in LUCA and that the gene duplication event occurred some time earlier. The implication is that LUCA and its ancestors were around for millions of years before the split that gave rise to Bacteria and Archaea.
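The logic of that inference can be sketched in a few lines of Python. This is only a toy illustration: the nested-tuple tree, the leaf labels, and the function names are all invented for the example and are not taken from any paper. It simply checks whether the deepest split separates the two paralogs and whether each paralog is found in both domains, which is the pattern that puts the duplication before LUCA.

```python
# Toy illustration only: the tree shape and the leaf names are invented.
# Each leaf is (paralog, domain). The root split separates the two paralogs.
toy_tree = (
    (("EF-Tu", "Bacteria"), ("EF-Tu", "Archaea")),
    (("IF2", "Bacteria"), ("IF2", "Archaea")),
)

def leaves(node):
    """Collect (paralog, domain) leaves from a nested-tuple tree."""
    if isinstance(node[0], str):
        return [node]
    result = []
    for child in node:
        result.extend(leaves(child))
    return result

def duplication_predates_luca(tree):
    """True if the root splits the paralogs and each side spans both domains."""
    sides = [leaves(child) for child in tree]
    paralog_sets = [{paralog for paralog, _ in side} for side in sides]
    # Each side of the root must hold exactly one paralog, and the two must differ.
    if any(len(s) != 1 for s in paralog_sets) or paralog_sets[0] == paralog_sets[1]:
        return False
    # Each paralog must be present in both Bacteria and Archaea.
    return all({domain for _, domain in side} >= {"Bacteria", "Archaea"}
               for side in sides)

print(duplication_predates_luca(toy_tree))
```

If one of the paralogs were found in only one domain, the function would return False, because the duplication could then have happened after the Bacteria/Archaea split.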
A recent (February 2026) paper by Goldman et al. looked at five different ancient paralogs including EF-Tu/IF2. They conclude that all five provide evidence of ancient gene duplication events that precede LUCA (pre-LUCA evolution).
One set of paralogs includes various aminoacyl-tRNA synthetases. I'm copying a figure from the paper showing the phylogeny of seryl-tRNA synthetase and threonyl-tRNA synthetase to illustrate an important point that's often overlooked. Recall that the current model of the tree of life is that there are only two domains (Bacteria and Archaea) and eukaryotes arose from a fusion of two species, one from WITHIN each domain, giving rise to a ring of life model.
What this means is that a typical eukaryotic gene might descend from either a Bacterial gene or an Archaeal gene. If the genes are ancient then this means that the original eukaryotic cell had two homologous genes, one from each domain, and only one of them survived. This is clearly illustrated in the seryl- and threonyl-tRNA synthetase trees. Note that in the seryl-tRNA synthetase tree it's the Archaeal version that survived in eukaryotes while in the threonyl-tRNA synthetase tree it's the Bacterial gene that survived. Nice.
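Reading that off a tree amounts to asking which domain the eukaryotic sequences nest inside. Here's a minimal sketch of the idea in Python. The two toy trees are invented shapes that only mimic the pattern described above (they are not the trees from the paper's figure), and the function names are hypothetical.

```python
# Toy trees: invented shapes, not the trees from the paper's figure.
seryl_toy = ("Bacteria", ("Bacteria", ("Archaea", ("Archaea", "Eukaryote"))))
threonyl_toy = ("Archaea", ("Archaea", ("Bacteria", ("Bacteria", "Eukaryote"))))

def leaf_labels(node):
    """List the domain labels of every leaf under this node."""
    if isinstance(node, str):
        return [node]
    return [label for child in node for label in leaf_labels(child)]

def eukaryote_source(node):
    """Return the domain of the closest non-eukaryotic relatives of the eukaryotes."""
    if isinstance(node, str):
        return None
    # Look in the children first so the smallest qualifying clade wins.
    for child in node:
        found = eukaryote_source(child)
        if found is not None:
            return found
    labels = set(leaf_labels(node))
    others = labels - {"Eukaryote"}
    if "Eukaryote" in labels and len(others) == 1:
        return others.pop()
    return None

print("seryl:", eukaryote_source(seryl_toy))
print("threonyl:", eukaryote_source(threonyl_toy))
```

With these toy inputs the seryl tree reports Archaea and the threonyl tree reports Bacteria, matching the pattern described in the text. Real trees are messier, so treat this as a picture of the reasoning and not as an analysis method.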
The evolutionary history of aminoacyl-tRNA synthetases might give some clues about which amino acids might have been used when life originated. Rather than try to explain this in my own words I'll just copy what the authors wrote in their paper.
One such line of research reconstructed the pre-LUCA ancestors of aminoacyl tRNA synthetases, and the frequencies of amino acids in their ancestral sequences were analyzed in order to investigate the late stages of genetic code evolution. These enzymes are responsible for adding the correct amino acid to its cognate tRNA, thereby establishing the genetic code. In one study,75 the composition of the common ancestor of leucyl-tRNA synthetase, isoleucyl-tRNA synthetase, and valyl-tRNA synthetase was shown to contain sites specific for leucine, isoleucine, and valine. In a second study,70 the pre-LUCA ancestor of tryptophan tRNA synthetase and tyrosine tRNA synthetase was shown to be devoid of tryptophan amino acids despite later versions of both synthetases containing tryptophan at several sites.
These results combined to illustrate important features of the later stages of genetic code evolution. The leucyl-tRNA, isoleucyl-tRNA, and valyl-tRNA synthetases appear to have evolved through subfunctionalization, as the ancestral protein sequence contains all three amino acids. This indicates that their ancestral protein did not discriminate between these three amino acids. This result suggested that the ability of early life forms to specifically incorporate these amino acids into genetically encoded proteins predates the evolution of the synthetases themselves, implying earlier, alternative aminoacylation processes. On the other hand, the reconstructed ancestral sequence of the tryptophanyl-tRNA and tyrosyl-tRNA synthetases has a statistically significant absence of tryptophan residues in its sequence, suggesting that the tryptophanyl-tRNA synthetase evolved through neofunctionalization. Taken together, these results support that the transition to a modern genetic code before the LUCA was a complex evolutionary process incorporating multiple mechanisms, including co-evolution with amino acid biosynthesis pathways.
I think this is an important paper for two reasons. It tells us a bit about what genes might have been present in LUCA but, more importantly, it strongly suggests that a lot of time passed between the first living cell (origin of life) and LUCA.
1. The EF-Tu/IF2 paralogs were not one of the two sets of genes used to root the original Three Domain Tree of Life back in 1992. Those paralogs were EF-Tu and EF-G.
Goldman, A.D., Fournier, G.P. and Kaçar, B. (2026) Universal paralogs provide a window into evolution before the last universal common ancestor. Cell Genomics. doi: [prepublication link: cell-genomics/fulltext/S2666-979X(26)00002-9]


