Thursday, December 18, 2014

Questions about alternative splicing

Alternative splicing is a mechanism where an intron-containing gene is transcribed and the primary transcript is spliced in two or more different ways to produce different functional RNAs. If it's a protein-coding gene then the idea is that different forms of the protein are produced in this way and each of them is functional.

It's important to emphasize that the products of alternative splicing must be functional because we know that splicing is error-prone and that misspliced, nonfunctional RNAs will be quite common. Every gene will produce a bunch of these aberrantly spliced variants but that doesn't mean that every primary transcript is alternatively spliced.

It's important to distinguish between real functional alternative splicing and junk RNAs that arise from splicing errors. One of the ways to do this is to report on the concentrations of the various transcripts but that's rarely done in papers that promote alternative splicing [see: The most important rule for publishing a paper on alternative splicing].

The importance of alternative splicing is related to the debate over the importance of pervasive transcription and junk DNA since advocates of alternative splicing are often the same people who object to junk DNA [see: Vertebrate Complexity Is Explained by the Evolution of Long-Range Interactions that Regulate Transcription?]. I call this The Deflated Ego Problem because these scientists are usually looking for a way to "explain" the complexity of humans in light of the fact that we seem to have the same number of genes as many other species.

If it's true that most human genes are alternatively spliced then let's see the evidence. That means actually demonstrating that different proteins with different functions are produced from the same gene. We've known for 35 years that this is possible but that's not the point. The point is whether all, or most, human RNAs are alternatively spliced. I've issued a simple challenge to those who use the alternative splice databases [A Challenge to Fans of Alternative Splicing]. So far, nobody has stepped up to the plate.

Some of the examples that are promoted in those databases make no sense whatsoever [Two Examples of "Alternative Splicing"] [The Frequency of Alternative Splicing].

Someone raised this issue in the comments to another post and sent me a link to a paper published in 2010. Here's the paper and part of the introduction.

Keren, H., Lev-Maor, G. and Ast, G. (2010) Alternative splicing and evolution: diversification, exon definition and function. Nature Reviews Genetics 11:345-355 [doi: 10.1038/nrg2776].
Splicing of precursor mRNA (pre-mRNA) is a crucial regulatory stage in the pathway of gene expression: introns are removed and exons are ligated to form mRNA. The inclusion of different exons in mRNA — alternative splicing (AS) — results in the generation of different isoforms from a single gene and is the basis for the discrepancy between the estimated 24,000 protein-coding genes in the human genome and the 100,000 different proteins that are postulated to be synthesized.

... Comparing species to see what has changed and what is conserved is proving valuable in addressing these issues and has recently yielded substantial progress. For example, new high-throughput sequencing technology has revealed that >90% of human genes undergo AS — a much higher percentage than anticipated. Such technological progress is providing more comprehensive studies of splicing and genomic architecture in an increasing number of species, and these studies have extended our evolutionary understanding.
I'd like you to answer two questions.
  1. Do you believe that there are about four (4) different, functional, proteins produced on average from every human protein-encoding gene?
  2. Do you believe that more than 90% of human genes produce a transcript that can be alternatively spliced, where alternative splicing is restricted to producing different functional RNAs and not just noise?
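
In case it isn't obvious where "about four" in the first question comes from, it's just the ratio of the two numbers in the quoted introduction. Here's a quick sketch of that arithmetic (the variable names are mine, and the inputs are the review's estimates, not measured values):

```python
# Figures quoted in the introduction of Keren et al. (2010).
protein_coding_genes = 24_000   # estimated protein-coding genes
postulated_proteins = 100_000   # proteins "postulated to be synthesized"

# Average number of distinct proteins each gene would have to produce.
proteins_per_gene = postulated_proteins / protein_coding_genes

print(f"{proteins_per_gene:.1f} proteins per gene on average")
```

That's an average over every protein-coding gene, so any gene that makes only one functional protein has to be balanced by others that make many more than four.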


  1. 1. Sure. If nature can produce a mechanism that allows it to read DNA backward and forwards, and repair itself, it can produce a mechanism to allow all genes to be alternatively spliced to increase the functionality of a limited set of tools. Pure design genius.

    2. Sure. Ditto the above. Faulty splicing may create noise but it's the fact that splicing even takes place at all that is awesome and intriguing. AND the likelihood that life's intelligence may possibly be able to reform broken pieces into functional wholes says jDNA is a losing proposition. One man's junk is another man's gold.

    But certain folks would rather denigrate any expressions of awe at what nature is capable of. They would rather characterize nature's awesome achievements as bumbling mis-steps that are miraculously slow-cooked into humans.

    Nature tells a different story. It's a story of carefully crafted parts that are packed with multi-functional attributes that can be repackaged, reassigned, retrained, repositioned, almost endlessly to achieve biodiversity. Now that's superior design at work.

    Since humans are now looking even more closely into the mirror, we will someday imitate what nature has already accomplished.

    But that will take a design approach, as opposed to a non-planned, non-goal oriented, non-directional approach.

    The future of Man is intelligent design.

    1. "The future of Man is intelligent design"

      This is not even true for the past. You prove that time and again with your statements.

    2. "Nature tells a different story. It's a story of carefully crafted parts that are packed with multi-functional attributes that can be repackaged, reassigned, retrained, repositioned, almost endlessly to achieve biodiversity. Now that's superior design at work."

      Cancer. "Intelligent design" at work.

    3. Interestingly, trumpeting alternative splicing is, to the extent that it exists, something of an own goal to the 'isolated function' brigade of ID's broad church. If you could create 4 or more different functional proteins from the same basic RNA for every gene, by simple shuffling or subunit omission, it rather suggests that protein space is not nearly so function-poor and unexplorable as ID-ers would have us believe.

    4. Where is your evidence Steve? Or is this just what you want to believe (motivated reasoning) because your religious beliefs are a lot less fragile if you can show that we are carefully designed?

      So where are these papers that show that most of these alternatively spliced variants are functional?

  2. Hi again Larry

    First of all thank you for helping out an aging and hopelessly out-of-date high school Biology teacher.

    I am very grateful you moved this inquiry to its own thread!

    Here is what I have been telling my students as cut and pasted from what I (until now) considered an excellent reference

    Within the last few decades, scientists have discovered that the human proteome is vastly more complex than the human genome. While it is estimated that the human genome comprises between 20,000 and 25,000 genes (1), the total number of proteins in the human proteome is estimated at over 1 million (2). These estimations demonstrate that single genes encode multiple proteins. Genomic recombination, transcription initiation at alternative promoters, differential transcription termination, and alternative splicing of the transcript are mechanisms that generate different mRNA transcripts from a single gene (3).

    The increase in complexity from the level of the genome to the proteome is further facilitated by protein post-translational modifications (PTMs). PTMs are chemical modifications that play a key role in functional proteomics, because they regulate activity, localization and interaction with other cellular molecules such as proteins, nucleic acids, lipids, and cofactors.

    The link goes on to explain how the jump from 20 000 genes (the genome) to about 100 000 transcripts (the transcriptome) is explained by alternate promoters, alternate splicing and differential mRNA editing (see diagram in link)

    My reading of this link indicates that the 100 000 transcripts are all functional which I understand you may take issue with.

    That said – I cannot wrap my head around how we get from about 20 000 potential protein-encoding genes to a proteome of over 1 000 000 functional proteins without "exon shuffling"!

    A completely different but related question:

    Surely gene duplication and exon shuffling GREATLY enhances the repertoire of protein variability available for selection along the lines of promiscuous enzymes you often mention. I also understand that even Archaea embrace this strategy of generating enhanced variability at the molecular level.

    I just have to wonder if you are not over-stating your case here…
    … either that, or I am hopelessly and naïvely confused.

    1. There have been countless papers on individual proteins and enzymes, like the enzymes of the citric acid cycle. There are labs that specialize in just one or two enzymes—or a small class of enzymes.

      It's interesting that all these studies, spread out over several decades, have not discovered that there are actually many different variants of their favorite enzymes with different internal amino acid sequences.

      I wonder why studies of individual proteins haven't revealed the same level of diversity that proteomics and genomics predict? Maybe it's only the proteins that haven't been looked at closely that have alternative splice variants?

      That would explain a lot.

    2. There are a few cases of well-attested alternative splicing. Out of the 20 or so genes I've worked with, one of them, MYC (or c-myc) has a documented alternative form (and just one of them). Of course, maybe humans are so different from birds that this generalization won't hold. Bets?

    3. Johnny Harsh,

      Are you disagreeing with Larry by any chance...? What would you do if you were right...? Can you tell me please..?
      I bet $100 you are and you can prove it... Go!!!

    4. Are you disagreeing with Larry by any chance...?

      No. Is any more evidence required that you can't read?

  3. 1: No
    2: No

    because proteomics rather points to one major transcript per gene.
    Gonzàlez-Porta M, Frankish A, Rung J, Harrow J, Brazma A. Transcriptome analysis of human tissues and cell lines reveals one dominant transcript per gene. Genome Biol. 2013 Jul 1;14(7):R70

    The second reason: I do remember how many transcripts per gene appeared on Northern blots.

    1. These findings aren't mutually exclusive. You could have tons of (potentially nonproductive) splice variants, yet still also make one major protein isoform due to differential translation/mRNA decay/protein stability.

    2. Tons of differentially spliced RNAs have been found and added to splice databases. However, the question remains whether any of these is of any relevance if it appears with a frequency >10-fold smaller than the major transcript. In addition, many of the entries in splice databases may be the results of processes interrupted through RNA isolation procedures, i.e., these may be incompletely spliced pre-mRNAs or transcripts prone to NMD. However, I must admit that the last time I looked into one of the relevant databases was about eight years ago when I was looking for a constitutively spliced intron. I actually couldn't find any due to all the alternative transcripts around. I actually looked for genes encoding proteins involved in glycolysis and the citrate cycle, proto-oncogenes and anti-oncogenes and couldn't find any.

  4. 1. Maybe
    2. No

    I would think most housekeeping genes have only one transcript but there may be genes, especially in the nervous system, that have many so it could average to 2-3. Some 'alternate' transcripts might be obviously non-functional, especially if they trigger the mRNA decay mechanisms.
    People who appeal to alternate transcripts to solve the "deflated ego" problem are out of luck when it comes to Drosophila. Drosophila has one gene, DSCam, that has more alternate transcripts than all human protein-coding loci combined. A recent paper claimed in its abstract that most of these are functional (but I don't remember how they determined that).
    (Non-functional spliced transcripts are not a new idea. In 1985 I was an undergrad at SUNY SB. I took a grad class where the professor (Paul Bingham) asked a question on his exam about how you'd test to see if an alt transcript was functional.)

    1. Maybe, but I thought the paper I saw was a lot more recent. This paper shows that the isoforms have homophilic binding but that doesn't necessarily imply function. I'd like to know what Larry thinks

  5. "So far, nobody has stepped up to the plate."

    Why wouldn't Jonathan Wells step up to the plate? He could hit a home run (as it says so on Wikipedia)

  6. My apologies for such a tardy revisiting of this topic. The fact is that I was away from my textbooks over the holidays.

    I have examined the three most adopted AP textbooks, all of which sing from the same hymnal regarding the ubiquity of alternate splicing to enhance the repertoire of the proteome.

    Sadly they do not delve in much detail, except for the Raven text which specifically subscribes to "computer-based" studies suggesting that 35-59% of human genome transcripts are subject to alternate splicing generating at least 80 000 different mRNAs in human cells.

    Raven Biology 9th Ed. Page 291

    Doing some google-whacking based on those cited numbers brings me to this reference:

    I quote:

    Several studies have analyzed alternative RNA splicing on a genome-wide basis (4, 5, 6, 7, 8) . These studies showed that 35–59% of human genes have at least one alternative splicing isoform.

    Meanwhile – I refer everybody to a resource that until now I had considered current :

    Their cited numbers are more astonishing:

    Recent predictions based on deep sequencing of cDNA fragments suggest that more than 90 percent of human pre-mRNA transcripts are alternatively spliced, many in a tissue-specific or developmental stage-specific fashion (Wang et al. 2008).

    Looking up Larry’s references is akin to playing with Russian matryoshka dolls. From what I can gather, the optimistic results cited above can be attributed to two kinds of artifacts –

    1 – BLAST computer generations that are clearly non-functional
    2 – Many if not most in vivo "junk" transcripts are similarly not functional

    Have I got this correct so far?

    In defense of my blatant naïveté, I do mention to my students that alternate splicing is only part of the story! I tell them that some estimates suggest that 5% (I reckoned approximately 50,000!!!) of the proteome comprises enzymes that alone perform more than 200 types of post-translational modifications on a far larger repertoire of proteins including an ever growing list of enzymes.

    I imagine that Larry will jump all over this, to my chagrin.

    What best counsel can I provide my colleagues in the AP Biology community regarding the error of textbook orthodoxy on this particular topic?

    Thanks in advance to one and all! Again, I remain in your debt – as do my students, but even more so!

  7. Hi, I've come to this interesting forum thread and wanted to share our latest Opinion article as I believe it would be of interest to the debate (no self-advertisement is intended!). You can find it here:

    Tress, Michael L., Federico Abascal, and Alfonso Valencia. "Alternative Splicing May Not Be the Key to Proteome Complexity." Trends in biochemical sciences (2016).

  8. hi Federico

    you quote in your paper:

    " The breadth of alternative splicing detectable at the transcript level has led to claims that alternative protein isoforms could be the key to mammalian complexity "

    Do you believe or agree with that ? If not, what do you think are the mechanisms that lead to mammalian complexity ?

    How did the information for alternative splicing emerge ? ( the splicing code ? )

    What emerged first, the splicing code, or the machinery for splicing ?

    1. Hi Otangelo (and Fede!),

      Here are the answers to your questions:

      "Do you believe or agree with that ? If not, what do you think are the mechanisms that lead to mammalian complexity ?"

      No, of course we do not agree with the claim. Nor do we believe that mammalian complexity needs any special explanation. But if we have to look for explanations for why alternative splicing is not necessary to explain cellular complexity, then there are plenty. Gene expression levels, post-translational modifications, different protein-protein interactions ...

      "How did the information for alternative splicing emerge ? ( the splicing code ? ) "

      It evolved over hundreds of millions of years.

      "What emerged first, the splicing code, or the machinery for splicing ?"

      Hmm, the words "chicken" and "egg" spring to mind here ...

  9. It is clear that almost all genes with more than one exon have the ability to generate more than one differently spliced mRNA. Some of these alternative transcripts, generally the most conserved transcripts, will go on to produce functional alternative proteins. But current evidence suggests that most will not.

    As a side note, results from colleagues working in proteomics back up your other point. We do seem to be underestimating the level of post-translational modification in vivo.