Tuesday, May 14, 2013

Scientific Authority and the Role of Small RNAs

A few weeks ago I criticized Philip Ball for an article he published in Nature [DNA: Nature Celebrates Ignorance]. Phil has responded to my comments and he has given me permission to quote from his response. I think this is going to stimulate discussion on some very interesting topics.

The role of small RNAs is one of those topics. There are four types of RNA inside cells: tRNA, ribosomal RNA (rRNA), messenger RNA (mRNA), and a broad category that I call “small RNAs.”

The small RNAs include those required for splicing and those involved in catalyzing specific reactions. Many of them play a role in regulating gene expression. These roles have been known for at least three decades, so there haven’t been any conceptual advances in the big picture for at least that long.

What’s new is an emphasis on the abundance and importance of small regulatory RNAs. Some workers believe that the human genome contains thousands of genes for small RNAs that play an important role in regulating gene expression. That’s a main theme for those interpreting the ENCODE results. Several prominent scientists have written extensively about the importance of this “new information” on the abundance of small RNAs and how it assigns function to most of our genome.

One of these prominent scientists is John Mattick who recently received an award from the Human Genome Organization for ....
The Award Reviewing Committee commented that Professor Mattick’s “work on long non-coding RNA has dramatically changed our concept of 95% of our genome”, and that he has been a “true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.”
I wrote a lengthy post explaining why this award was not justified [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research]. I don't think Mattick is correct. Here are some of the reasons why: How Much Junk in the Human Genome?, Genome Size, Complexity, and the C-Value Paradox, Basic Concepts: The Central Dogma of Molecular Biology.

Here’s the problem: who do you believe? Who are the scientific authorities on this topic?

Phil Ball writes:
To make a start, I know of course that RNA regulation is an old subject – I mentioned Cech and Altman in my 1994 book Designing the Molecular World. But it does seem that the extent and complexity of involvement of non-coding RNAs in genetics has become argued only much more recently. Certainly, I’m not aware that anyone was saying things such as “RNA is the computational engine of the system” when Cech and Altman got their Nobels. One can agree or disagree with such claims of Mattick and others, of course, but they are being prominently made – I’m not just making this up.
How are science writers supposed to know that the claims of Mattick and others are—to say the least—controversial when Mattick gets a prestigious award for "proving his hypothesis" in spite of opposition from those who want to preserve the old view of junk DNA?

Who are the authorities and how do you recognize them?[1] How do science writers even know there’s a controversy if there’s nothing in the scientific literature that refutes Mattick?

[1] This is an important part of my course on scientific controversies. The answer is that you should always be skeptical of claims made in the scientific literature, especially claims that a paradigm has been overthrown. You don't have to decide which authority is correct but you do have to be careful not to get bamboozled by hype.


  1. The proof is in the pudding, and although I find the whole concept of lncRNAs and the like pretty cool, I've yet to be convinced that they're a general rule and not an exception. I think that the stochastic expression of most small "junk" RNAs (and often at vanishingly small levels, too) argues against their playing a major role. I could be wrong, of course (and I don't personally work on small non-coding RNAs), but the idea that all sequences in the genome can be transcribed *sometimes* does not strike me as revolutionary; it's even exactly what should be expected of a transposon-riddled genome.

    I'm sure that several non-coding transcripts do indeed have a function (the role of many miRNAs seems pretty clear, and the original p21 lncRNA paper had me convinced, although a colleague of mine struggles to reproduce its results) but I'm not ready to shout "change of paradigm" just yet.

    As for whom to believe, the answer is the same as usual: trust no one. Let's look at the data. And if it's not clear enough to decide, let's wait for more data.

  2. Larry, take a look at Ed Yong's piece on the lean, almost junk-free genome of the bladderwort, just sequenced:


    Lou Jost

    1. This is particularly interesting because the plant is a specialist in nutrient-poor habitats. This might give a fitness advantage to junkless plants, though the authors say there was no sign of selection, so I am guessing you will suspect it was an accident. It would be interesting to test other carnivorous plants to see if they all have unusually small genomes....

    2. Bladderwort genome sizes range a lot between the different species, so the selection-due-to-poor-nutrients hypothesis does not appear to broadly explain the variation. There is even some indication of population-level variation in size, but I could not find the primary reference. There is some indication of an inverse correlation with mutation rate.

      Greilhuber et al. Plant Biol (Stuttg). 2006 Nov;8(6):770-7.

      Nuclear holoploid genome sizes (C-values) have been estimated to vary about 800-fold in angiosperms, with the smallest established 1C-value of 157 Mbp recorded in Arabidopsis thaliana. In the highly specialized carnivorous family Lentibulariaceae now three taxa have been found that exhibit significantly lower values: Genlisea margaretae with 63 Mbp, G. aurea with 64 Mbp, and Utricularia gibba with 88 Mbp. The smallest mitotic anaphase chromatids in G. aurea have 2.1 Mbp and are thus of bacterial size (NB: E. coli has ca. 4 Mbp). Several Utricularia species range somewhat lower than A. thaliana or are similar in genome size. The highest 1C-value known from species of Lentibulariaceae was found in Genlisea hispidula with 1510 Mbp, and results in about 24-fold variation for Genlisea and the Lentibulariaceae. Taking into account these new measurements, genome size variation in angiosperms is now almost 2000-fold. Genlisea and Utricularia are plants with terminal positions in the phylogeny of the eudicots, so that the findings are relevant for the understanding of genome miniaturization. Moreover, the Genlisea-Utricularia clade exhibits one of the highest mutational rates in several genomic regions in angiosperms, what may be linked to specialized patterns of genome evolution. Ultrasmall genomes have not been found in Pinguicula, which is the sister group of the Genlisea-Utricularia clade, and which does not show accelerated mutational rates. C-values in Pinguicula varied only 1.7-fold from 487 to 829 Mbp.


    3. Weird... I just was alerted to a small chordate genome (On the Tree of Life Blog, in the comments). Also a small genome and an elevated mutation rate.


      Primary article

  3. A related discussion on Mike Taylor's blog:
    See especially this comment by Mike:

  4. Jim, thanks for the extra information. Interesting that Pinguicula does not have an ultrasmall genome, but Genlisea does. I wonder if anyone has looked at other carnivorous plants like Nepenthes or especially Dionaea? It looks like no one has looked at Drosera (sundew) for this, but Drosera does have some weird features (no centromeres!) and some species seem to have small chromosomes.

    1. I could not find Nepenthes or Dionaea in the C-value databases. But it is a very interesting group to look into. If we hurry up, they will probably provide for a lot of good hypothesis testing, before everyone else measures the C-values and ruins it with post-hoc explanations ;-)

  5. You know an experiment somebody should do?

    Pick the bladderwort with the biggest genome-- let's say Genlisea hispidula with 1510 Mbp [about one-half the size of the human genome].

    Now do an ENCODE-type experiment with that, totting up all its biochemical "activities" as Ewan Birney would say-- find all its low-abundance RNA transcripts, everywhere a protein binds to the DNA, everywhere the DNA is chemically modified.

    Who is willing to bet that the fraction of "biochemical activity" in Genlisea hispidula will be close to, or more than, the 80% "activity" reported by ENCODE for human beings?

    And since Utricularia gibba with its 88 Mbp is so closely related to Genlisea, nature has already done for us the experiment of asking, "Hey, what would happen if you delete all that junk?"

    Think the NIH would give me a grant for that?