Sandwalk: The frequency of splicing errors reflects the balance between selection and drift

Monday, April 01, 2019

The frequency of splicing errors reflects the balance between selection and drift

Splice variants are very common in eukaryotes. We know that it's possible to detect dozens of different splice variants for each gene with multiple introns. In the past, these variants were thought to be examples of differential regulation by alternative splicing but we now know that most of them are due to splicing errors. Most of the variants have been removed from the sequence databases but many remain and they are annotated as examples of alternative splicing, which implies that they have a biological function.

I have blogged about splice variants many times, noting that alternative splicing is a very real phenomenon but it's probably restricted to just a small percentage of genes. Most of the splice variants that remain in the databases are probably due to splicing errors. They are junk RNA [The persistent myth of alternative splicing].

The ongoing controversy over the origin of splice variants is beginning to attract attention in the scientific literature although it's fair to say that most scientists are still unaware of the controversy. They continue to believe that abundant alternative splicing is a real phenomenon and they don't realize that the data is more compatible with abundant splicing errors.

Some molecular evolution labs have become interested in the controversy and have devised tests of the two possibilities. I draw your attention to a paper that was published 18 months ago.

Saudemont, B., Popa, A., Parmley, J. L., Rocher, V., Blugeon, C., Necsulea, A., Meyer, E., and Duret, L. (2017) The fitness cost of mis-splicing is the main determinant of alternative splicing patterns. Genome Biology, 18:208. [doi: 10.1186/s13059-017-1344-6]

Background
Most eukaryotic genes are subject to alternative splicing (AS), which may contribute to the production of protein variants or to the regulation of gene expression via nonsense-mediated messenger RNA (mRNA) decay (NMD). However, a fraction of splice variants might correspond to spurious transcripts and the question of the relative proportion of splicing errors to functional splice variants remains highly debated.

Results
We propose a test to quantify the fraction of AS events corresponding to errors. This test is based on the fact that the fitness cost of splicing errors increases with the number of introns in a gene and with expression level. We analyzed the transcriptome of the intron-rich eukaryote Paramecium tetraurelia. We show that in both normal and in NMD-deficient cells, AS rates strongly decrease with increasing expression level and with increasing number of introns. This relationship is observed for AS events that are detectable by NMD as well as for those that are not, which invalidates the hypothesis of a link with the regulation of gene expression. Our results show that in genes with a median expression level, 92–98% of observed splice variants correspond to errors. We observed the same patterns in human transcriptomes and we further show that AS rates correlate with the fitness cost of splicing errors.

Conclusions
These observations indicate that genes under weaker selective pressure accumulate more maladaptive substitutions and are more prone to splicing errors. Thus, to a large extent, patterns of gene expression variants simply reflect the balance between selection, mutation, and drift.

This is another example of a well-written paper that explains the controversy and the two competing explanations; namely, functional alternative splicing and splicing errors. The authors suggest a test that might help distinguish between these two possibilities.

We propose here a test to quantify the fraction of splice variants corresponding to errors, i.e. having a negative impact on the fitness of organisms. The basis of this test is that the strength of splice signals is expected to reflect a balance between selection (which favors alleles that are optimal for splicing efficiency) and mutation and random genetic drift (which can lead to the fixation of non-optimal alleles). This selection-mutation-drift equilibrium therefore predicts a higher splicing accuracy at introns where errors are more deleterious for the fitness of organisms. Hence, if [splice variants] predominantly correspond to splicing errors, one should expect a negative correlation between the rate of [splice variant] events and their cost in terms of resource allocation (metabolic cost, mobilization of cellular machineries). The noisy splicing model therefore makes several specific predictions regarding the [splice variant] rate according to whether splice variants are detectable by NMD and according to the expression level, length, and number of introns of genes.¹
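The core prediction in that passage, a negative correlation between splice variant rate and the cost of errors, can be illustrated with a rank correlation. The following is a minimal sketch only: the numbers are invented toy values, the function names are hypothetical, and Spearman's rho is used here as a generic stand-in rather than as the statistical procedure the authors actually applied.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data (NOT from the paper): expression level per gene
# and the fraction of reads at that gene showing a splice variant.
expression = [1, 5, 20, 80, 300, 1200]
variant_rate = [0.040, 0.031, 0.018, 0.011, 0.006, 0.002]

rho = spearman(expression, variant_rate)
print(f"Spearman rho = {rho:.2f}")
```

Under the noisy splicing model, rho should be clearly negative in real data. If most variants were functional, there would be no particular reason to expect the variant rate to fall as expression rises.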

They carry out their main test using genes in Paramecium tetraurelia because this organism has short introns (20-35 bp) that can be covered in single RNA-seq reads. Then they apply the same test to human genes and conclude ...

For a given error rate, errors are expected to be more costly (in terms of metabolic resources and mobilization of cellular machineries) in highly expressed genes. Hence the fitness cost of mis-splicing is expected to increase with increasing expression level. Indeed, this is precisely what we observed in humans: the strength of selection against deleterious mutations at splice sites is strongly correlated to gene expression level (Fig. 6b). Since the risk of producing erroneous transcripts increases with the number of introns, this implies that all else being equal, there should be a stronger selective pressure against mis-splicing in intron-rich genes. The mutation-selection-drift theory therefore predicts that introns from weakly expressed/intron-poor genes should accumulate more non-optimal substitutions in their splice signals and therefore should show a higher splicing error rate. The relationships that we observe between [splice variant] rate, expression level, and intron number are perfectly consistent with these predictions, both in human (Fig. 5) and in paramecia (Fig. 3).

I'm not going to argue that this is a definitive answer to the problem but I'm pleased that more and more groups are promoting the idea that splicing errors are a viable explanation of the data. I'm also pleased that more attention is being paid to the fact that slightly deleterious events can persist in the population because they are effectively invisible to selection. This counters the prevailing narrative that everything we observe must be adaptive and functional.

Note: Saudemont et al. (2017) review the literature on the rate of splicing errors and note that it can be as high as 3%. My own review of the literature suggests that an error rate of this magnitude is rare but splicing is still error-prone. I estimate that a typical splice site is only 99.9% effective and, in addition, inappropriate splice sites are activated about 0.1% of the time in a typical human gene. Saudemont et al. alerted me to a paper by Stepankiw et al. (2015) that I hadn't read before. Those authors presented evidence that 1% of all transcripts are incorrectly spliced due to errors in the spliceosome reaction.
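To see how these per-site estimates add up across a whole gene, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not in the sources: "99.9% effective" is read as a 0.1% failure chance per intron, errors at different introns are treated as independent, and the intron counts (including 8 for a "typical" human gene) are illustrative values only. The function name is hypothetical.

```python
def missplice_fraction(n_introns, per_intron_error=0.001, cryptic_site_rate=0.001):
    """Fraction of transcripts carrying at least one splicing error.

    Assumes (for illustration) that each intron fails independently with
    probability per_intron_error and that an inappropriate splice site is
    activated with probability cryptic_site_rate once per transcript.
    """
    p_all_introns_ok = (1.0 - per_intron_error) ** n_introns
    p_no_cryptic_site = 1.0 - cryptic_site_rate
    return 1.0 - p_all_introns_ok * p_no_cryptic_site


# Illustrative intron counts; 8 is used as a stand-in for a typical human gene.
for n in (1, 4, 8, 20):
    print(f"{n:>2} introns: {missplice_fraction(n):.2%} of transcripts mis-spliced")
```

With these assumptions a gene with 8 introns produces about 0.9% mis-spliced transcripts, which is in the same range as the 1% figure from Stepankiw et al. It also shows why the risk of producing an erroneous transcript climbs with intron number.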

1. The authors refer to all splice variants as examples of alternative splicing (AS). I think this is confusing since the term "alternative splicing" has been used for decades to refer to real examples of differential splicing with a biological function. I think we should reserve that term for biologically meaningful examples of splice variants as opposed to variants due to splicing errors.

Stepankiw, N., Raghavan, M., Fogarty, E. A., Grimson, A., and Pleiss, J. A. (2015) Widespread alternative and aberrant splicing revealed by lariat sequencing. Nucleic Acids Research, 43:8488-8501. [doi: 10.1093/nar/gkv763]

4 comments :

Unknown said...: Great read!

Where is this stated 3% splicing error rate stated: "Saudermont et al. (2019) review the literature on the rate of splicing errors and note that it can be as high as 3%"?

I can't find a Saudemont paper from 2019 nor can I find it on their 2017 Paramecium paper. I have seen estimates between 0.1% and 1%, so I'm curious where a 3% splicing error rate is stated.; Tuesday, April 02, 2019 1:33:00 PM
Palermo R said...: Hi profesor Moran
What do u think about this paper?

In the human genome, more than 4.5 million sequences can be readily identified as derived from transposable elements (TEs), accounting for at least 50% of its DNA content.

Long discarded as junk DNA, TEs are increasingly recognized as major motors of genome evolution.

https://www.sciencedirect.com/science/article/pii/S1934590919301110?via%3Dihub; Friday, April 19, 2019 2:22:00 PM
Dave said...: Sort of an offshoot, but i'm curious what you think about Wright's shifting balance theory of evolution, here: https://www.annualreviews.org/doi/pdf/10.1146/annurev.ge.16.120182.000245 It seems like it has a lot of potential to explain both macroevolutionary processes and population genetic level processes.; Friday, April 19, 2019 7:32:00 PM
Unknown said...: This is excellent! I'm glad to hear of this controversy, it reminds me of hidden dogmas that I probably take for granted.; Monday, May 13, 2019 11:25:00 PM
