Sandwalk: Alternative splicing and the gene concept

Tuesday, October 09, 2018

Alternative splicing and the gene concept

I just learned about a workshop scheduled for the end of this month. The topic is: Evolutionary Roles of Transposable Elements and Non-coding DNA: The Science and the Philosophy.

I'd love to attend but it's just a small workshop designed to encourage dialogue between scientists and philosophers who are interested in the topic. Here's a list of the speakers ...

Ryan Gregory: Junk DNA, genome size, and the onion test.

Stefan Linquist: Four decades debating junk DNA and the Phenotype Paradigm is (somehow) alive and well.

Chris Ponting: 92.9% of the human genome evolved neutrally.

Paul Griffiths: Both adaptation and adaptivity are relevant to diagnosing function.

Ford Doolittle: Selfish genes and selfish DNA: is there a difference?

Justin Garson: Biological functions, the liberality problem, and transposable elements.

Joyce Havstad: Evolutionary Thinking about Critique of Function Talk.

Guillaume Bourque: Impact of transposable elements on human gene regulatory networks.

Ulrich Stegmann: On parity, genetic causation and coding.

Stephen Downes: Understanding non-coding variants as disease risk alleles.

Alexander Palazzo: How nuclear retention and cytoplasmic export of RNAs reduces the deleteriousness of junk DNA.

David Haig: Pax somatica

Cedric Feschotte: Transposable elements as catalysts of genome evolution.

There's a reading list for the workshop and several of the papers are new to me [Recommended Reading]. I was particularly interested in one of the papers by Stephen M Downes, a philosopher at the University of Utah and one of the participants in the upcoming workshop.

Downes, S.M. (2004) Alternative splicing, the gene concept, and evolution. History and philosophy of the life sciences:91-104. [PDF]

The paper discusses two of my favorite topics: alternative splicing and "what is a gene?" Another philosopher who's interested in defining the biological gene is Paul Griffiths and he will also be at the meeting. I remember talking to Paul and Karola Stotz at the junk DNA meeting in London a few years ago where I tried to explain that alternative splicing may not be real. They were not convinced.

Paul and Karola have written a book about genes where they claim that recent discoveries in genomics, including abundant alternative splicing, have overthrown the standard definition of a molecular gene. Their view on the importance of alternative splicing is not substantially different from that expressed by Stephen Downes in his 2004 paper so I'll concentrate on that paper.

Downes claims that the human proteome is enormously more complex than the number of genes would suggest. He is repeating a claim that, even today, is popular in the scientific literature. That doesn't make it true: in fact, there is no scientific evidence to support such a claim and plenty to refute it [The proteome complexity myth] [How many proteins in the human proteome?]. Downes goes on to offer an explanation for this imagined disparity between the number of genes and the number of proteins: the explanation is alternative splicing.

Griffiths and Stotz make the same argument on page 69 of their book ...

Another discovery of the postgenomic era has been the discrepancy between the number of genes in a genome and the number of products derived from them. For example, the human proteome outnumbers the number of discrete protein-coding genes by at least one order of magnitude. The human genome contains in the region of 20-25,000 genes (the correct number is still not known), while predictions have given numbers as high as 1 million proteins (Mueller et al., 2007). As we will show at length in 4.4 and 4.5, this discrepancy is explained by the fact that cellular mechanisms use the same coding region to make many different products and combine resources from different coding regions to make products.

I don't believe that there's a serious discrepancy that needs explaining. The reference quoted by Griffiths and Stotz does, indeed, make the claim that there may be up to one million different proteins in human cells but it's important to understand where this estimate comes from. Here's what Mueller et al. say in their review,

The relatively low number of human genes suggests that complexity of human biology is achieved through regulation on the transcriptional, post-transcriptional and post-translational level. Alternative splicing and translation as well as post-translational modification (e.g.: phosphorylation, glycosylation and proteolytic cleavage) both contribute to a “proteomic stratification” process that produces a protein population with a diversity that is several orders of magnitude higher than that of the number of genes encoding them. Correspondingly, it has been estimated that the human proteome comprises up to 1,000,000 protein species.

It looks like the estimate of one million different proteins is partly based on the assumption that alternative splicing is a real phenomenon in which case using the estimate to support the idea of alternative splicing seems like a failure in logic. But we don't need to quibble about "estimates" because there's real data to consider (see below).

Setting aside alternative splicing, there's still a major flaw in the argument that an enormous proteome requires rethinking fundamental concepts. Most of the Mueller et al. article is devoted to post-translational modifications that have been understood for decades. If every one of the 20,000 gene products has 50 such variants then there would be one million different protein species in the proteome but, if true, this is not a "discrepancy" and it would not require any extraordinary explanation like alternative splicing. In other words, there's no mystery that needs explaining.
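The arithmetic behind the one million figure is easy to check. Here is a minimal sketch in Python that uses only the round numbers quoted above (about 20,000 genes and up to 1,000,000 protein species). The function name `variants_per_gene` and the constant names are illustrative labels, not anything taken from Mueller et al.

```python
# Back-of-the-envelope check of the "one million proteins" figure.
# Both inputs are round numbers from the discussion above, not measurements.
GENE_COUNT = 20_000                   # approximate number of human protein-coding genes
CLAIMED_PROTEIN_SPECIES = 1_000_000   # upper estimate quoted from Mueller et al. (2007)


def variants_per_gene(protein_species: int, genes: int) -> float:
    """Average number of distinct protein species each gene would have to yield."""
    return protein_species / genes


required = variants_per_gene(CLAIMED_PROTEIN_SPECIES, GENE_COUNT)
print(f"Each gene product would need about {required:.0f} variants on average.")
```

With these inputs the required average works out to 50 variants per gene product, which is the number the argument above depends on.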

However, even the idea that the average polypeptide gene product gives rise to 50 different post-translational functional variants is ridiculous. For example, it would mean that each of the enzymes of the glycolytic pathway and the citric acid cycle has, on average, 50 different variants. These enzymes have been studied for half a century and there's no evidence to support such a claim. There's no evidence that every one of the subunits of RNA polymerases has 50 different variants nor is there any evidence that the subunits of the mitochondrial electron transport complexes exist in 50 different biologically relevant variations.

So, we can dismiss one of the major rationalizations for abundant alternative splicing but that doesn't mean that alternative splicing has been disproved. For that we have to look at the direct evidence. The evidence for abundant transcript variants for each multi-exon gene is solid. The important question is whether these variants are just the result of sloppy splicing, in which case they are junk RNA, or whether they are biologically relevant RNAs with a function, in which case they are genuine examples of alternative splicing.

Several groups have used sophisticated techniques to look for the alternative splice variants and they haven't found them [How many proteins in the human proteome?]. For those who are interested in seeing the actual experimental evidence, I recommend a paper by Bhuiyan et al. (2018). They say,

In this paper we take steps to address the gap between the commonplace assumption that most genes have more than one distinct functional product and evidence-based reality.

The "evidence-based reality" is that only ~5% of curated genes produce functionally diverse isoforms. In other words, massive alternative splicing is not supported by the available evidence. Most transcript variants are junk RNA produced by splicing errors.

The gene annotators have already decided that the vast majority of transcript variants are due to splicing errors. They have been purged from the databases. A typical gene in the genome database now has only two or three potential variants and most of those have not been shown to have a function. It's quite reasonable to hypothesize that only 5% of human protein-coding genes are involved in alternative splicing to produce two or more functional protein variants.

I've covered this debate in a series of posts from last year so I won't repeat the arguments here [Are splice variants functional or noise?].¹

I believe I'm correct when I say that genuine alternative splicing is not a widespread phenomenon. I'm absolutely certain I'm correct when I say that there's no evidence supporting the claim that almost all genes are alternatively spliced and that the average gene produces ten or more different functional variants.

That's not the point I'm trying to make. My main argument with philosophers who write about the gene concept is that they are uncritically accepting outlandish claims without considering alternative explanations. It may be true that every gene produces multiple splice variants with multiple promoters and transcription termination sites in which case we may or may not need to revise our definition of a gene. However, it may also be true that those variants just represent sloppy biology and they have no biological function, in which case we don't need to upend our understanding of the molecular gene.

It's wrong for philosophers (and scientists) to just assume that one of those possibilities is correct and then use that, possibly incorrect, assumption to re-define the gene. Real philosophers (and scientists) should be absolutely sure of their facts before making such a radical proposal.

P.S. I define a gene as, "A gene is a DNA sequence that is transcribed to produce a functional product." [Debating philosophers: The molecular gene] [Philosophers talking about genes] [What Is a Gene?]. The functional product is RNA and it may be further processed to give rise to ribosomal RNA, snoRNA, or any number of other functional RNAs. It may also give rise to mRNA that's then translated to produce a protein. There are many genuine examples of alternative splicing but that doesn't affect my definition of a gene. It just means that the primary transcript (= functional product) can be subsequently processed in several different ways.

1. [Debating alternative splicing (part I)] [Debating alternative splicing (part II)] [Debating alternative splicing (Part III)] [Debating alternative splicing (Part IV)]

Bhuiyan, S.A., Ly, S., Phan, M., Huntington, B., Hogan, E., Liu, C.C., Liu, J., and Pavlidis, P. (2018) Systematic evaluation of isoform function in literature reports of alternative splicing. BMC Genomics 19:637. [doi: 10.1186/s12864-018-5013-2]

Mueller, M., Martens, L., and Apweiler, R. (2007) Annotating the human proteome: Beyond establishing a parts list. Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, 1774(2):175-191. [doi: 10.1016/j.bbapap.2006.11.011]

8 comments :

Mikkel Rumraket Rasmussen said...: It occurs to me that even for the small minority of genes that actually have more than one functional isoform, it might not even be the case that those variants are somehow there for any adaptive reason.
So it's not that the alternatively spliced variant is necessary to the organism in any way. Rather it just happens to be the case that some small minority of proteins can function even in the absence (or with excessive copies) of particular exons.; Wednesday, October 10, 2018 5:12:00 AM
S Johnson said...: " It's quite reasonable to hypothesize that only 5% of human protein-coding genes are involved in alternative splicing to produce two or more functional protein variants."

It seems to me that this is a significant issue for the selfish gene picture, which holds natural selection optimizes the genome. (Ditto epigenetics, unless methylation etc. are to be viewed as genetically determined processes that have undergone positive selection.)

In gene selectionist popularizations, "genes" are effectively the Evolution God's commandments written in DNA rather than stone. In that context, revising the optimized genome does imply a new understanding of "gene."; Wednesday, October 10, 2018 11:01:00 AM
Unknown said...: In our paper that Larry refers to (Bhuiyan et al., 2018), when we say ~5% of genes have functionally distinct splice isoforms (FDSIs), we define this by the splice isoform's necessity for the gene's overall function. To (self) quote:

"To establish the extent to which splice isoforms increase the functional repertoire of the genome, we need data on which genes have functionally distinct splice isoforms (FDSIs). Identification of genes with FDSIs requires experimental support to demonstrate the necessity of each splice isoform.[...] This idea readily extends to isoforms; if a single isoform is made absent and that isoform is necessary for the normal function of the gene, then a consequence (change in phenotype) would be expected. A gene has FDSIs if two or more isoforms meet this criterion independently (Fig. 1a)."; Wednesday, October 10, 2018 5:06:00 PM
Mikkel Rumraket Rasmussen said...: Thank you for the clarification. That implies there could be more genes with alternative splice isoforms which are not necessary for "normal" organismal function, yet nevertheless are capable of performing the function of the primary isoform.; Thursday, October 11, 2018 12:16:00 PM
John Harshman said...: What is the "phenotype paradigm"? The only reference I have found so far is in the title of a paywalled article by Ford Doolittle, and the abstract says nothing about it.; Thursday, October 11, 2018 1:07:00 PM
Larry Moran said...: Doolittle and Sapienza coined the term "phenotype paradigm" in their 1980 paper on selfish genes. They are referring to the adaptationist idea that all sequences are selected for their effect on the phenotype of the organism. They were proposing that there's a different kind of selection; namely, selection at the level of selfish genes, as in transposons. In that case, the phenotype of the organism is not relevant because selection is occurring only for the survival of the selfish gene (transposon).

We know that bacterial and eukaryotic transposons have to make multiple copies of themselves before they are inactivated by mutations. In this way, they preserve functional copies that carry the elements essential for their survival. We can think of this as a form of selection at the level of the gene that's independent of the organism.

We know that this is a rare phenomenon accounting for only a tiny percentage of a typical genome and we know that in the big picture of evolution it's just a footnote. However, philosophers of biology tend to put a great deal of emphasis on the discovery of selfish DNA and the overthrowing of the phenotype paradigm. I don't think I've ever heard of the phenotype paradigm in the scientific literature except in 1980.

I'm going to meet with Stefan Linquist next week to discuss this issue. There seems to be a big difference between how scientists view the selfish DNA papers (interesting, but not terribly important) and how philosophers see them (groundbreaking, and paradigm shifting).; Thursday, October 11, 2018 3:51:00 PM
Joe Felsenstein said...: @Larry: So there's no difference in practice between "weird exception" and "paradigm shift", at least as far as these philosophers are concerned. An interesting contrast with mathematics, where the statement "prime numbers are odd" is false, while in a field like biology it would be generally true.; Thursday, October 11, 2018 4:51:00 PM
Unknown said...: Mikkel, at least for our paper, we tried to do away with the idea of "primary" and alternative transcripts. For the limited subset of genes with functionally distinct splice isoforms (FDSIs), what makes a transcript the primary one? Primary implies some level of importance, and if two or more splice isoforms of a gene are necessary, then they're both important.

Nevertheless, you are correct - the splice isoforms of the curated genes without evidence of necessity may still be capable of a "function", though, as Larry has pointed out, there is limited evidence of that in functional genomics studies, and I agree. Perhaps "genes with functionally redundant splice isoforms" is the correct term.

As somewhat of an aside, from our curation, we could only identify 43 human and mouse genes with FDSIs, and due to this limited number of genes, we avoided making sweeping generalizations in the paper about the 43 genes with FDSIs. I would hypothesize though that the splicing of these 43 genes is evolving under negative selection because their FDSIs are likely necessary for reproductive success.

Future researchers looking to provide new evidence for functional distinctness should prioritize their research towards splice isoforms likely evolving under negative selection. Given the neutral theory and the nearly neutral theory of evolution, randomly selecting splice isoforms will be a waste of resources as most splice isoforms are likely nonfunctional.; Thursday, October 11, 2018 6:21:00 PM
