Monday, September 22, 2014

What are lncRNAs?

Many genes encode proteins and many other genes specify functional RNAs that do not encode proteins. The "RNA genes" include the classic genes for ribosomal RNAs and tRNAs as well as genes for very well-studied RNAs that carry out catalytic roles in the cell. There are a myriad of small RNAs required for things like splicing and regulation. All species, both prokaryotes and eukaryotes, contain genes for a wide variety of functional RNAs.

Eukaryotes seem to have an abundance of genes for small RNAs that perform a number of specific roles in regulation etc. They also have a lot of DNA regions complementary to long noncoding RNAs or lncRNAs (also lincRNA). The definition of long noncoding RNAs seems arbitrary and ambiguous [see Long Noncoding RNA]. Some of them might even encode proteins!

As a general rule, these RNAs are longer than 200 bp and some scientists put the cutoff at 1000 bp. Simple eukaryotes, such as yeast, don't have a lot of lncRNAs but eukaryotes with large complex genomes that are full of junk DNA seem to have a lot of different lncRNAs. The DNA regions¹ that specify these lncRNAs are not conserved. This strongly suggests that many of the lncRNAs are spurious nonfunctional transcripts even though some of them have well-characterized functions [see On the function of lincRNAs].

As usual, we have a definition problem. Are "lncRNAs" just a generic class of long noncoding RNAs that include thousands of nonfunctional molecules that are nothing more than junk RNA? Or, does the term "lncRNA" refer only to the subset that has a function? If it's the latter, then we should probably be referring to "putative" lncRNAs most of the time since the vast majority have not been shown to have a function. (There are about 10,000 of these RNAs in humans.)

I don't see how you can avoid the elephant in the room whenever you talk about lncRNAs. The most important question is NOT whether some of them have a function; that was demonstrated 30 years ago. The important question is whether the majority, or even a substantial minority, have a function.

That's why I was eager to read a short review by Rinn and Guttman in a recent issue of Science (Rinn and Guttman, 2014). They describe two lncRNAs that probably play a role in organizing chromatin within the nucleus (Xist and Neat1, both from mammals). That's cool.

Then they say,
Collectively, these studies suggest that lncRNAs may shape nuclear organization by using the spatial proximity of their transcription locus as a means to target preexisting local neighborhoods. lncRNAs can in turn modify and reshape the organization of these local neighborhoods to establish new nuclear domains by interacting with various protein complexes, including chromatin regulators. Once established, a lncRNA can act to maintain these nuclear domains through active transcription and recruitment of interacting proteins to these domains. While the mechanism for how lncRNAs establish these domains is not fully understood, it is becoming increasingly clear that lncRNAs are important at all levels of nuclear organization—exploiting, driving, and maintaining nuclear compartmentalization.
It sure sounds like they are ascribing a particular function (nuclear organization) to the majority of lncRNAs. But what if 90% of all 10,000 lncRNAs have no function and what if only 100 of the remaining 1,000 functional lncRNAs are involved in nuclear organization? That would mean there are 900 functional lncRNAs that play a different role in the cell.
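The arithmetic behind that hypothetical can be spelled out in a few lines. This is only a sketch of the thought experiment: the 10,000 total, the 90% nonfunctional share, and the 100 nuclear organizers are the illustrative figures from the paragraph above, not measurements, and the variable names are invented for the example.

```python
# Hypothetical figures from the thought experiment; none are measured values.
total_lncrnas = 10_000
nonfunctional_percent = 90
nuclear_organizers = 100

# Integer arithmetic keeps the counts exact.
functional = total_lncrnas * (100 - nonfunctional_percent) // 100
other_functional = functional - nuclear_organizers

# Share of ALL lncRNAs that would be nuclear organizers under these assumptions.
share_of_all = nuclear_organizers / total_lncrnas

print(f"functional: {functional}")
print(f"functional, other roles: {other_functional}")
print(f"nuclear organizers as a share of all lncRNAs: {share_of_all:.1%}")
```

Under these assumed numbers, a statement about "lncRNAs" in general would really be a statement about one in every hundred of them.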

If that were true, you would write that last paragraph very differently. If you recognize the elephant, you might say something like this ....
Very few lncRNAs have been shown to have a function and there's a very good chance that most of them are spurious transcripts that have no function. However, a small percentage do seem to have a function. In this review we have identified some long noncoding RNAs that appear to be involved in nuclear organization. We propose to call these RNAs "noRNAs" for "nuclear organizer RNAs" on the grounds that once a function has been identified we should stop referring to them as lncRNAs.
But that doesn't sound nearly as exciting as the subtitle of the article, "Long noncoding RNAs may function as organizing factors that shape the cell nucleus" or the quotation that's prominently displayed in a box in the center of the page, "... it is becoming increasingly clear that lncRNAs are important in all levels of nuclear organization—exploiting, driving, and maintaining nuclear compartmentalization." When did science become so dedicated to hype over substance? I must have missed the memo.

1. I use "DNA regions" instead of "genes" because the definition of a gene requires that the gene product be functional. You can't call them genes unless you have demonstrated that the RNA has a function.

Rinn, J. and Guttman, M. (2014) RNA and dynamic nuclear organization. Science 345:1240-1241 [doi: 10.1126/science.1252966]


  1. Laurence A. Moran: “The most important question is NOT whether some of them have a function; that was demonstrated 30 years ago. The important question is whether the majority, or even a substantial minority, have a function.”

    I think all scientists working on lncRNAs, including Rinn and Guttman, would agree with this statement if specifically asked and required to address the issue; unfortunately, there is no system to conduct this type of ‘questioning.’

    It just happens that a few weeks ago I raised a similar issue with a paper by Sauvageau M. et al., entitled “Multiple knockout mouse models reveal lincRNAs are required for life and brain development” at:

  2. I do think when a sequence is found to have a function, it should no longer be called a "lncRNA." We need another category.

    1. Hi Barbara

      I do not follow your reasoning unless everybody agrees that "lncRNA" is a priori synonymous with "junk DNA" ...

      ... raising of course the question of whether or not we are begging the question.

      Look at Larry's excellent diagram above!

    2. I propose a simple solution. Let’s call the long noncoding RNAs (lncRNAs) that have a function, functional lncRNAs, and those that do not have a function, non-functional lncRNAs.

      For example, if we study 100 random human lncRNAs and find out that 90 are functional and 10 are non-functional, then we say that 90% of the investigated lncRNAs are functional lncRNAs and 10% are non-functional lncRNAs (or vice versa, if that's the case). In this situation, we can infer/hypothesize that this ratio between functional and nonfunctional lncRNAs is probably true for the rest of lncRNAs, and say with reasonable certitude that most of the lncRNAs are functional (

      There might be a big problem with this approach, though: if we implement this sensible (‘dull,’ if you want) way of reporting science, then what would Larry be writing about?
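The extrapolation in the comment above, from 100 sampled lncRNAs to all of them, can be given an uncertainty range. The sketch below uses a Wilson score interval for a proportion. That is a choice made for this illustration, not something the commenter proposed, and it only makes sense if the 100 lncRNAs really are a random sample and "functional" is assayed reliably. The function name `wilson_interval` and the default `z = 1.96` (about 95% confidence) are assumptions of the example.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of sampled lncRNAs judged functional (hypothetical data)
    n:         sample size
    z:         normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")

    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# The commenter's hypothetical: 90 of 100 randomly chosen lncRNAs are functional.
low, high = wilson_interval(90, 100)
print(f"estimated functional fraction: {low:.3f} to {high:.3f}")
```

For the hypothetical 90-out-of-100 result the range comes out at roughly 0.83 to 0.94, so "most are functional" would be a fair summary of such a sample. The same calculation works in the other direction if only 10 of the 100 turned out to be functional.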

    3. Hi Claudiu

      This thread brings to mind the movie Groundhog Day.

      I am reminded of remarks that you and Joe made on an earlier thread:

      I think it was me that first raised the question whether

      I always understood that retroviruses co-opted host regulatory machinery and vice versa constituting the acme in molecular host-parasite coevolution.

      Meanwhile, the different distributions of Alu and LINE1 in the genome would suggest that selection pressure may be involved. Do Alus direct methylation? Are Alus and LINE1 DNA symbionts?

      Joe’s response was sweet and to the point:

      @Tom Mueller: I was naïvely assuming that Alus and such are mostly just parasites. The occasional case where they get coopted into doing something useful does not persuade me that most of them are doing anything useful (the folks over at Uncommon Descent seem instantly persuaded of that by every single case they hear of).

      The question I raised was: if so, should we still call them "functional" and should we still call them "junk"? I'd say "yes" and "yes" but it's a matter of semantics and others here will have different opinions on that.

      Things that make you go hmmmm... I would be inclined to differ, but more on chromatin architecture later perhaps.

      Ford Doolittle himself came up with an apt metaphor,

      “…it's like the "clean fill" you see signs for along the highway. There may be a need for that much DNA but it doesn't matter what it is, as long as it doesn't contain deleterious sequences.”

      “Larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways.”

      Doolittle’s rejoinder to my mind seems to address Larry’s earlier contention:

      If having junk DNA were a clear advantage for future evolution then the genomes of all extant lineages should have lots of junk DNA and should make lots of lncRNAs.

      I repeat – Larry’s logic to my mind appears flawed. If I am in error – I would greatly appreciate correction.

    4. Tom,

      Not quite sure where you're heading, but if you are proposing a role for Alus that includes selection at organismal level for insertion at 'useful' points (rather than occasional incidental usefulness, akin to mutation), it is quite hard to explain this pattern:

      Chromosomes 16, 17 and 19 are stuffed with the things, and there are distinct bands on several others. This may be (in the general case) more related to mode of transmission, local accessibility and amplification than to a 'role' as such.

    5. Hi Allan,

      Thanks for responding but I am not sure I follow your reasoning.

      Granted, some chromosomes are stuffed with Alus … that said, no chromosome is without Alus.

      As discussed in threads long past, I am fascinated by the notion of chromosome architecture in defined nuclear domains.

      I am particularly fascinated (as mentioned earlier) by Peter Fraser’s work:

      It seems to me that lncRNAs are ideal candidates for structural scaffolding of interphase chromosomal architecture constituting yet another level of gene regulation even if this category of gene regulation may be “more elaborate” and not necessarily ubiquitous in all eukaryotes.

      Of course, if this suggestion is correct, the notion of sequence conservation becomes a red herring. I think we should rather be talking about “sequence category” conservation. For example, the lncRNA sequences in mouse doing the same job of nuclear architecture in humans should not be expected to be identical sequences even though they are doing the same job. We are after all talking about different “parasitic” sequences that have been co-opted to become “symbiotic”.

      This of course describes convergent evolution.

      If such a scenario is correct, we would predict these scenarios which together would (to my way of thinking) resolve the c-value paradox:

      1 – we would expect to see some species with small genomes and little or no lncRNA because these species happen not to employ this particular level of gene regulation. (we are witnessing neither parasitism nor symbiosis)

      2 – we should expect to see some species with larger genomes as a result of more lncRNA that still do NOT happen to employ this particular level of gene regulation. (we are witnessing parasitism)

      3 – and finally, we should expect to see some species with larger genomes as a result of more lncRNA that still DO happen to employ this particular level of gene regulation. (we are witnessing parasitism co-opted into symbiosis)

      I find nothing remarkable in this… no less remarkable than other instances of convergent evolution.

      I admit – I may be betraying naïveté in extremis and welcome correction.

    6. Ooops -

      Of course, I meant to say:

      For example, the lncRNA sequences in mouse doing the same job of nuclear architecture as the lncRNA sequences in humans should not be expected to demonstrate conserved sequences even though both categories of lncRNAs are doing the same job.

      again and as always - I welcome correction from any and all...

    7. Hi Tom,

      I agree with many of the points you made. For example, there is no doubt that numerous lncRNAs are functional along the lines you stated.

      However, based on the current data and knowledge, only a relatively small percentage of lncRNAs are implicated in these functions; the others, probably the vast majority, are likely to be the result of spurious transcription.

      That’s the point Larry, I and other people have made, repeatedly.

      See for example my comments above, and at:

      See also Larry’s statement: “The most important question is NOT whether some of them have a function; that was demonstrated 30 years ago. The important question is whether the majority, or even a substantial minority, have a function.”

    8. Hi Claudiu,

      Thank you for bringing me up to speed... I checked out your link and did some more google-whacking on my own.

      Here is a great review that addresses much of what is being discussed here and elsewhere.

      Long Noncoding RNAs: Past, Present, and Future

      My take... absence of evidence does not constitute evidence of absence unless strong evidence is provided that knocking out different classes of lncRNA proves benign.

      Even that strategy poses problems. For example, according to my wife, the parking brake in our automatic car serves no function. It is just an elaborate and superfluous add-on she herself never uses.

      Are lncRNAs any different at the cellular level?

  3. Forgive my cross-posting. I now realize I should have posted the following query here:

    OK Reality check

    Some lncRNAs have been found to participate in the regulation of such diverse activities as
    • splicing,
    • translation,
    • imprinting, and
    • transcription.
    Two examples:
    o XIST. XIST RNA, which contains thousands of nucleotides, inactivates one of the two X chromosomes in female mammals.
    o Some lncRNAs participate in bringing the enhancer and promoter regions of genes close together ("looping") to regulate gene transcription.

    I am fascinated by XIST lncRNA-mediated Barr body formation and wonder out loud whether lncRNA is in general crucial for another level of gene control often not considered in introductory textbooks… namely chromatin architecture in the nucleus.

    I realize I am rehashing – but I am going to float this balloon again with premeditation aforethought; in order to have my exuberant naiveté reined in.

    I thank any and all in advance for their patience and indulgence.

    Perhaps chromosomes have their equivalent to tertiary and quaternary structure, in large part due to lncRNA. Otherwise how does one explain constancy of karyotypes across primate lineages unless invoking positive selection? Otherwise how does one explain constancy of chromatin organization within the nucleus?

    This makes intuitive sense to me - Check out this link: