Wednesday, November 11, 2020

On the misrepresentation of facts about lncRNAs

I've been complaining for years about how opponents of junk DNA misrepresent and distort the scientific literature. The same complaints apply to the misrepresentation of data on alternative splicing and on the prevalence of noncoding genes. Sometimes the misrepresentation is subtle so you hardly notice it.

I'm going to illustrate subtle misrepresentation by quoting a recent commentary on lncRNAs that's just been published in BioEssays. The main part of the essay deals with ways of determining the function of lncRNAs with an emphasis on the structures of RNA and RNA-protein complexes. The authors don't make any specific claims about the number of functional RNAs in humans but it's clear from the context that they think this number is very large.

Graf, J. and Kretz, M. (2020) From structure to function: Route to understanding lncRNA mechanism. BioEssays [doi:]
Let's look at the first paragraph of the introduction to see what I mean by misrepresentation of facts.
With fundamental cellular functions ranging from energy metabolism to structural components, over signal transduction to being key regulators of gene expression, proteins were attributed great scientific attention, while—with a few exceptions—the RNA was contemplated as the inevitable intermediary required for protein production. However, this picture changed dramatically when high‐throughput sequencing data revealed that more than two‐thirds of the human genome are actively transcribed into RNA but only <2% actually encodes for proteins. Several classes of short non‐coding RNAs (ncRNAs) controlling basic cellular functions such as translation (transfer RNAs, ribosomal RNAs), RNA editing (small nucleolar RNAs), or splicing (small nuclear RNAs) and have been known for quite a long time. More recently, short regulatory ncRNAs (20‐30 nt in length), including microRNAs, endogenous short‐interfering RNAs, or piwi‐associated RNAs, acting as crucial regulators of gene expression were also identified. Long noncoding RNAs (lncRNAs) have lately gained widespread attention and we are only at the beginning to understand their significant roles for a multitude of cellular processes.

There's nothing in that paragraph that's false but the net effect is very misleading. The best way to illustrate this is for me to re-write the paragraph from the point of view of a skeptic. My version (below) contains no false information but the perspective is very different. I'm pretty sure that opponents of junk DNA would see my version as a misrepresentation.

Human cells contain many examples of well-characterized noncoding RNAs that we've known about for many decades. These include transfer RNAs, ribosomal RNAs, and several classes of small RNAs such as snoRNAs, snRNAs, microRNAs, siRNAs, and piwiRNAs. In addition, there are a number of unique noncoding RNAs that are found in many species such as the RNA component of RNAseP and the 7SL RNA of signal recognition particles. There are also many larger noncoding RNAs that are generally grouped together as long-noncoding RNAs or lncRNAs. The function of some of these lncRNAs is well known (examples include Xist and HOTAIR) but many of these transcripts have not been associated with any known function.

We have known for more than half a century that most of the human genome is transcribed but the extent of pervasive transcription has only become clear since the 1990s. Most of the transcripts are not conserved and they are rapidly degraded so that they are usually present at less than one copy per cell. This has led to the widespread belief that these transcripts are nonfunctional junk RNA produced by spurious transcription. However, it seems likely that some of these RNAs will prove to have a function. One way to identify these candidates is to look at the secondary and tertiary structures of long transcripts and their interaction with identifiable proteins.

I believe that the proper way to write about these transcripts is to emphasize that they are mostly junk. That doesn't preclude looking for function in a small subset where there's evidence for function but it's a very different perspective from the one usually taken in these papers. The usual approach is to emphasize the exciting new discovery of lncRNAs making it seem like they all have some mysterious function waiting to be revealed.1

1. It goes without saying that lots of research grant funding will be required to reveal the mysterious functions.


  1. Great article. I have a tangential question - what do you think about the role for alternative transcriptional start sites? Just a cellular error, or something more? One of our genes of interest seems to be doing this (based on 5' RACE), but the literature is decidedly not helpful in decoding what we're seeing.

    1. Hi Unknown!

      »I have a tangential question - what do you think about the role for alternative transcriptional start sites? Just a cellular error, or something more?«

      Certainly you know I especially like this introductory sentence:

      »Regulation of the proteome is therefore the primary output of signaling pathways that connect cell physiology to internal and external environmental cues.«

      From the perspective of the proteome this would not be an error of any kind anywhere, but a switch in a dynamic network (“It's not a bug, it's a feature”).



  2. Thanks for this. I harp on this quite a lot with my colleagues. That the genome has a low-level bit of transcriptional noise should just be obvious to anyone who works with genetic systems. Strange things happen in cells. And any transcript that you never see expressed at levels above 1/cell? No way in hell that is doing anything.

    And that doesn't even take into account that a bunch of that transcriptional noise may also be sequencing and mapping errors.

    1. One copy per cell could be important, if it is doing something like inactivating a homing endonuclease gene before it takes over the whole genome.

      Maybe that transcriptional 'noise' is also a 'feature', to prevent homing endonuclease genes from taking over the whole genome. Maybe that is what limits the fidelity of eukaryote DNA replication; 'killing off' homing endonuclease genes before they take over a species gene pool.

    2. "One copy per cell could be important, if it is doing something like inactivating a homing endonuclease gene before it takes over the whole genome."

      That seems very unlikely. With one or less than one copy per cell (meaning some cells don't even have a copy some of the time) it would be rather unlikely for that one gene copy to find and deactivate some aberrant molecule before it manages to cause damage. And in any case, there's zero evidence that noisy transcripts function by "killing off" anything at all.

      The idea that products of transcriptional noise are somehow still functional makes for a very poor null hypothesis because it's almost impossible to test in practice.

      Finding function in the genome with a null hypothesis

      Splendor and misery of adaptation, or the importance of neutral null for understanding evolution

    3. Those genes might function by damaging DNA and killing off the cells that contain active endonuclease genes, or by killing off cells with insufficiently robust DNA repair systems, or insufficiently robust DNA error correcting systems to prevent cancer.

      The 'activity' doesn't need to be very high to be effective, once per organism lifetime would be enough.

      Once per gamete lifetime in haploid gametes would also be enough. Purging deleterious genes in haploid gametes would be quite advantageous.

      Yes, these would be difficult to demonstrate.

      We know that endogenous retroviruses have been inhibited by ‘stuff’ that inactivates them. Presumably the 'mechanism(s)' that did that are 'features' and not 'bugs'.

    4. Marcus in 1983 reported that one molecule of RNA, should it find a complementary partner, would suffice to trigger an interferon response (Interferon 5, 115-180). So, say, a coronavirus mRNA "vaccine" is injected and one gets in a host cell that happens to have a complement (perhaps aka "junk RNA"). The dsRNA, courtesy of interferon, enhances MHC protein expression, while the mRNA then shakes itself loose and gets translated. The resulting foreign peptides form pMHC complexes that are displayed at the cell surface just in time to entice a passing T-cell. The rest is history!

  3. As the Australians say "good on you". Your all-too-lonely struggle is against the delusion that the cell is perfectly adapted in every respect. As you know, that image ignores the actual processes of adaptive, nonadaptive and maladaptive change. It is hard to make a dent in this widely-shared delusion, but you are trying. Applause!

    1. Hi Joe Felsenstein!

      »Your all-too-lonely struggle is against the delusion that the cell is perfectly adapted in every respect. As you know, that image ignores the actual processes of adaptive, nonadaptive and maladaptive change.«

      Of course I underline the first sentence: Adaptation is qualitative - it is sufficient to maintain the continuity of the germ line. Any organism that succeeds in this is in this respect a resultant solution to the environmental conditions. Thus the germ line is potentially immortal. With regard to the second sentence, I would like to discuss the corresponding topic using the keyword sickle cell anaemia. Or why I seem to have advantages with Type O negative to cope with SARS-CoV-2.



  4. Hi Larry Moran!

    Not so bad at all! If you would succeed in depicting the metaphor ‘function’ as a form of dynamic biochemical activity, I would wear a shirt with your portrait and laurel wreath...

    However, I am afraid that it is not understood which bias is present in the paper in question. In their paper, the authors have finally only made the analogy conclusion that if it is certain that ncRNA can be considered ‘functional’ [sic!] in principle, this could be the case with lncRNA as well. The devil is in the premises...
    Some rhetorical questions towards a theory of the gene:

    Q1: If DNA transcribes into junk RNA, is this DNA not junk DNA because it has the 'function' to transcribe junk RNA?
    Q2: What is the situation here in relation to reverse transcription?
    Q3: If junk DNA and junk RNA are conceptually associated with 'function', isn't it also possible to associate them with junk protein, junk cell, junk organism?
    Q4: If a genome is considered functional in its entirety, why is it appropriate not to consider some of its compartments as functional?



    1. 1. Doing something does not = functional.
      2. Junk RNA can be reverse transcribed, still junk.
      3. Fallacy of composition, infers that something is true of the whole from the fact that it is true of some part of the whole.
      4. Fallacy of division, infers something that is true for a whole must also be true of all or some of its parts.