Sandwalk: Functional RNAs?

Friday, January 16, 2015

Functional RNAs?

One of the most important problems in biochemistry & molecular biology is the role (if any) of pervasive transcription. We've known for decades that most of the genome is transcribed at some time or other. In the case of organisms with large genomes, this means that tens of thousands of RNA molecules are produced from regions of the genome that are not (yet?) recognized as functional genes.

Do these RNAs have a function?

Most knowledgeable biochemists are aware of the fact that transcription factors and RNA polymerase can bind at many sites in the genome that have nothing to do with transcription of a normal gene. This simply has to be the case based on our knowledge of DNA binding proteins [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA].

If you have a genome containing large amounts of junk DNA then it follows, as night follows day, that there will be a great deal of spurious transcription. The RNAs produced by these accidental events will not have a biological function.

The human genome is large and most knowledgeable biochemists think that 90% of it is junk. There should be a lot of junk RNA produced in any particular cell and if you look at a large number of different tissues you are bound to find that most of the genome is transcribed. In spite of the fact that this is the expected result if you understand the biochemistry, there are those who believe that most of these RNAs have a function—and therefore most of the genome is functional.

The journal Nature Structural & Molecular Biology decided to publish a special issue called Focus on Noncoding RNAs. It came out at the same time as a paper by my colleague Alex Palazzo and his graduate student, Eliza Lee (Palazzo and Lee, 2015). The contrast is remarkable.

Let's look at the Nature Structural & Molecular Biology papers first. Keep in mind that the most important question is whether these RNAs have a function or whether they are just junk RNA produced as a result of spurious transcription. The lead editorial sets the stage [The noncoding explosion].

The long-held view that the primary role of RNA is to code for proteins has been severely undermined. This Focus explores the remarkable functional diversity of RNA in light of recent breakthroughs in noncoding-RNA biology.

In 1958, Francis Crick postulated the 'central dogma' to describe the flow of genetic information from DNA to RNA to protein (Crick F.H., Symp. Soc. Exp. Biol. 12, 138–163, 1958). Experimental evidence then established the mechanistic pathway linking genes to proteins: mRNAs act as transitory templates, tRNAs serve as adaptors between nucleotide and amino acid sequences, and the ribosome functions as the molecular machine that drives protein synthesis. This body of work cemented a canonical view of RNA as primarily a 'coding molecule'. Although tRNAs and rRNAs have obvious noncoding functions, their roles are nevertheless intimately tied to translation, thus reinforcing the notion of RNA as template and structural component to aid in protein synthesis.

The finding that RNA itself is capable of enzymatic catalysis in the 1980s jolted the community and eventually led to the 'RNA world' hypothesis, which proposes that self-replicating RNA molecules were precursors to life based on DNA, RNA and proteins. In comparison, the notion of RNA as a regulatory molecule is relatively recent, and the tremendous number, diversity and biological importance of noncoding RNAs (ncRNAs) are only beginning to be fully appreciated. In this issue, we present a special Focus on noncoding RNAs that explores the functional diversity of ncRNAs, discusses the molecular mechanisms of different RNA interference (RNAi) pathways and highlights the latest breakthroughs in ncRNA biology.

I'm sick and tired of supposedly intelligent people misrepresenting the Central Dogma and misrepresenting the history of a field in order to hype the latest discoveries. You would think that Nature publications would be particularly sensitive to this after the ENCODE disaster.

For the record, regulatory RNAs have been known for 40 years and most of the diverse small RNAs have been around for 20 years. These hardly count as "relatively recent" and it's nothing short of ridiculous to claim that they "are only beginning to be appreciated."

Don't forget that the important question is whether most of these RNAs have a biological function. In other words, SHOULD they be appreciated!

The editors have thought about this question. How do they deal with it?

Thousands of lncRNAs have been discovered to date, but their functional characterization has remained a challenge. This is partly because of a shortage of experimental techniques to explore their functions. In their Perspective, Spitale, Chang and Chu (p 29) highlight recent technological advances that will aid in the functional characterization of lncRNAs and discuss their advantages and caveats. The sheer number and the increasing pace of the discovery of new lncRNAs also present a challenge in terms of lncRNA definition and annotation. This issue is addressed in a Commentary by Rinn and Mattick (p 5), who propose considerations and best practices for identifying and annotating lncRNAs. These guidelines should assist the growing research community embarking on the mechanistic investigation of lncRNAs.

I'm not going to bother discussing either of those papers in any detail. They don't address the question at all. The first paper (Chu et al., 2015) just talks about "...technologies that have finally made it possible to directly address the where, what and how of lncRNA function..." They assume that most of the RNAs have a function that's just waiting to be nailed down.

The second paper is by John Mattick and John Rinn (Mattick and Rinn, 2015). Asking these guys to write about whether most lncRNAs have a function is like asking Michael Behe to write a critical review of irreducible complexity. Mattick and Rinn don't discuss the important question at all. They're mostly concerned with how to classify all those thousands of lncRNAs that have been discovered.

If you really want to know about function then you have to read the Palazzo and Lee paper on "Non-coding RNA: what is functional and what is junk?"

They make the same points that have been made repeatedly over the past two decades. Clearly they haven't sunk in and need to be repeated. You begin by assuming, in the absence of evidence to the contrary, that the newly discovered RNAs don't have a function. They are spurious transcripts. That's the default hypothesis. How do you determine if a given RNA has a function?

If it's present in significant amounts [see also: How to Evaluate Genome Level Transcription Papers]. We know that the vast majority of RNAs are present at less than one copy per cell. Palazzo and Lee point out that this is a good indication of lack of function although there are situations where low abundance RNA might still have a function.
How many have a known function? As of December 2014, there are only 166 lncRNAs with a validated function. (I suspect that not all of them will pan out.) That's after 20 years of looking for function among tens of thousands of putative lncRNAs. It doesn't prove anything but it surely points in one direction.
We expect functional RNA to be conserved and most of them aren't. Palazzo and Lee have a good discussion about exceptions to the rule. They are correct to point out that you can have functional transcription without sequence conservation and you can have functional RNAs that have just evolved in one lineage. However, these exceptions cannot account for the thousands of nonconserved RNAs that are supposed to have a biological function.
Cell specific transcription. It's often assumed that if an RNA is expressed in only certain tissues, or cells, that this is an indication of function. This is a bad assumption. If we are dealing with spurious transcripts then these will be produced when certain transcription factors bind nonspecifically to DNA. Since different cells have different transcription factors, it follows that they will produce different junk RNAs.
What if the RNA is localized within the cell? Palazzo and Lee point out that most of these RNAs are only found in the nucleus and that's where you expect junk RNA. Some are exported to the cytoplasm but that's not a reliable indication of function.

The question has not been answered but if you're a betting person, I'd put my money on most of these RNAs turning out to be junk.

Chu, C., Spitale, R.C., and Chang, H.Y. (2015) Technologies to probe functions and mechanisms of long noncoding RNAs. Nature Structural & Molecular Biology 22:29-35. [doi: 10.1038/nsmb.2921]

Mattick, J.S. and Rinn, J.L. (2015) Discovery and annotation of long noncoding RNAs. Nature Structural & Molecular Biology 22:5-7. [doi: 10.1038/nsmb.2942]

Palazzo, A.F. and Lee, E.S. (2015) Non-coding RNA: what is functional and what is junk? Front. Genet. 6:2. [doi: 10.3389/fgene.2015.00002]

20 comments :

John Harshman said...: I do wonder if there's any selection against spurious (should we say "ectopic"?) transcription factor binding sites and other control sequences. Is there evidence of such selection in any organism? Have there been any studies? There is at least selection against restriction sites, which are a sort of control sequence.; Friday, January 16, 2015 1:43:00 PM
Bjørn Østman said...: 'The Central Dogma: "Once information has got into a protein it can't get out again". Information here means the sequence of the amino acid residues, or other sequences related to it.'
Crick (1956) http://profiles.nlm.nih.gov/SC/B/B/F/T/_/scbbft.pdf; Friday, January 16, 2015 1:47:00 PM
Georgi Marinov said...: To the extent that they are harmful, there is.

Then what happens is of course a matter of selection coefficients and population genetic environment...; Friday, January 16, 2015 2:31:00 PM
Larry Moran said...: A typical transcription factor binding site is about 6 bp. There should be about one every 4000 bp in the human genome. That's 750,000 sites per haploid genome or 1.5 million in a typical diploid cell. It's hard to imagine how there could be significant selection against one of them such that eliminating one would confer a selective advantage.; Friday, January 16, 2015 2:33:00 PM
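[Editorial aside: the arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration under assumptions that are not stated in the comment: the four bases are treated as equally frequent and independent, matches are counted on one strand for a single fixed motif, and the haploid genome is rounded to 3 billion bp. The function name and the constant are hypothetical.]

```python
def expected_random_sites(motif_length: int, genome_bp: int) -> float:
    """Expected number of exact matches to one fixed motif on one strand,
    assuming equal, independent base frequencies (a simplification)."""
    spacing = 4 ** motif_length  # one match expected every 4^k bp
    return genome_bp / spacing

# Assumed round figure for the human haploid genome size.
HAPLOID_BP = 3_000_000_000

for k in (6, 8, 12):
    per_haploid = expected_random_sites(k, HAPLOID_BP)
    print(
        f"{k} bp motif: one per {4**k:,} bp, "
        f"about {per_haploid:,.0f} per haploid genome, "
        f"about {2 * per_haploid:,.0f} per diploid cell"
    )
```

[With a 6 bp motif the expected spacing is 4^6 = 4096 bp, which the comment rounds to 4000. The same calculation shows how quickly the count of chance matches falls as the motif gets longer, which is the point at issue later in this thread.]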
Joe Felsenstein said...: This is where I take the argument that I usually use against Larry and use it to agree with him. Whether natural selection will be effective is a function of whether 4Ns exceeds 1 in absolute value, where s is the selection coefficient. In these cases (selection favoring deletion or change of one of these useless transcription sites, for example, or selection to delete a short stretch of bases from our junk DNA) the selection coefficient s is most likely not big enough to be effective. Even though 1/(4N) may be small the value of s is most likely a lot smaller than that.; Friday, January 16, 2015 4:50:00 PM
Georgi Marinov said...: Let's clarify the terminology a bit here. Let's say we have s and it's negative. Are we only allowed to say that there is selection against the allele if |s| > 1/4N or we can also say that there is selection against it but it's overwhelmed by drift so in the end it does not do anything? This is a source of confusion (I personally tend to do the latter).; Friday, January 16, 2015 5:31:00 PM
Mikkel Rumraket Rasmussen said...: @Joe Felsenstein
"Whether natural selection will be effective is a function of whether 4Ns exceeds 1 in absolute value, where s is the selection coefficient."

Isn't time also a factor here? I'm of the understanding that you and Simon have pointed out several times that even very small s values become important on geological timescales.; Friday, January 16, 2015 7:54:00 PM
Unknown said...: I have a bookmark on page 57 of the 109 page book titled:
"Immunology and the Quest for an HIV Vaccine: A New Perspective" see:
http://www.amazon.com/Immunology-Quest-HIV-Vaccine-Perspective/dp/146850830X
because that's where I stopped about six months ago because it was so repetitive and uninformative.

Anyway the authors claimed (repeatedly and without apparent evidence) that noncoding RNA elements were part of an unappreciated molecular immune system that operates entirely within individual cells. Their thesis seemed to be that since much of the Junk was old fragments of retroviruses, disabling these transcribed segments gave their "molecular immune system" practice for the big day when they were confronted by real retrovirus elements.; Friday, January 16, 2015 8:32:00 PM
Joe Felsenstein said...: Georgi Marinov: I'd say there is selection as long as s isn't zero, but it is ineffective if |4Ns| is much less than 1, as then selection is overwhelmed by drift.

Mikkel: No, if |4Ns| is quite small, drift overwhelms selection, and by about 10-20 N generations drift has finished fixing or losing the allele. Waiting longer won't help. It is only if we have a deterministic model, with an infinite population, that waiting long enough causes selection to have an effect. In effect, in that idealized case N is infinite, so |4Ns| is too.; Friday, January 16, 2015 8:42:00 PM
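[Editorial aside: the |4Ns| threshold discussed above can be illustrated numerically. The sketch below uses Kimura's diffusion approximation for the fixation probability of a single new mutant, which none of the commenters cite explicitly, so treat it as a swapped-in textbook formula rather than anyone's stated method. It assumes a diploid population whose effective size equals its census size N, and additive selection with coefficient s. The function name and the example values of N and s are hypothetical.]

```python
import math

def fixation_probability(s: float, n: int) -> float:
    """Kimura's diffusion approximation for a new mutant at frequency 1/(2N):
    u = (1 - exp(-2s)) / (1 - exp(-4Ns)), assuming Ne = N."""
    if s == 0:
        return 1 / (2 * n)  # neutral case: fixation by drift alone
    # expm1 keeps precision when the exponents are tiny
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)

N = 10_000  # assumed population size, for illustration only
neutral = fixation_probability(0.0, N)

for s in (-1e-7, -1e-5, -1e-4, -1e-3):
    u = fixation_probability(s, N)
    print(f"s = {s:g}, 4Ns = {4 * N * s:g}, u / neutral = {u / neutral:.3g}")
```

[When |4Ns| is far below 1 the ratio stays close to 1, so the slightly deleterious allele fixes about as often as a neutral one. Once |4Ns| is well above 1 the ratio collapses toward zero, which is the regime where selection against the allele is effective.]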
Claudiu Bandea said...: Laurence A. Moran: “Most *knowledgeable* biochemists are aware of the fact that transcription factors and RNA polymerase can bind at many sites in the genome that have nothing to do with transcription of a normal gene”

I think you underestimate the understanding and knowledge of the scientists working on genomics and gene expression. Similar to the ENCODE scientists, which represent some of the finest academic institutions in the world, I think all leading scientists working on lncRNAs are highly knowledgeable.

However, in order to be competitive and at the top of their field, many of them choose to misrepresent the current knowledge in order to promote their studies and results. Ultimately, these shrewd scientists are themselves ‘victims’ of the current science enterprise, which is based on a deficient peer review system (see my note in PubMed Commons entitled: “Multiple knockout mouse models reveal that some lincRNAs might be required for life and brain development” at: http://www.ncbi.nlm.nih.gov/pubmed/24381249)

You might also want to see a note entitled “Everlasting confusion on ‘functional DNA’ and ‘junk DNA’” addressing the ENCODE project (http://www.ncbi.nlm.nih.gov/pubmed/23479647). Here is an excerpt from this note:

“After all, the ENCODE ‘function fiasco’ was not the result of misunderstanding the concept of biological function, nor was it due to scientific incompetence as suggested by others (2). On the contrary, because it conflicted with some of the project’s objectives and with its significance, there was a concerted effort not to bring this concept forward (3); indeed, as clearly shown in a recent ENCODE publication (4), at least some ENCODE members seem well aware of the scientific rationale and criteria for addressing putative biological functions for genomic DNA….

References

(2) Graur D et al., 2013. On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biol Evol., 5:578-90. Graur D, 2013.

(3) Bandea CI. 2014. Closing the gap between ‘words’ and ‘facts’ in evaluating genome biology and the ENCODE project. PubMed Commons (National Library of Medicine; Bethesda, MD). Comment on: Doolittle WF. 2013. Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci USA., 110:5294-300.

(4) Kellis M. et al., 2014. Defining functional DNA elements in the human genome. Proc Natl Acad Sci USA., 111:6131-8. Kellis M, 2014.”; Saturday, January 17, 2015 12:03:00 PM
John Harshman said...: A typical transcription factor binding site is about 6 bp.

Irrelevant. So is a typical restriction site. The argument to make is that selection against spurious transcription (or spurious suppression) is so weak that it makes no difference.; Saturday, January 17, 2015 4:02:00 PM
Georgi Marinov said...: We've been through this so many times. Why do we have to do it again?; Saturday, January 17, 2015 4:36:00 PM
Claudiu Bandea said...: @Georgi,

Before engaging in discussions about ENCODE and related projects, it would make sense that you address some of the issues raised in previous posts ( http://sandwalk.blogspot.com/2014/10/nature-criticizes-science-hyperbole-and.html#comment-form):

Laurence A. Moran; Friday, October 31, 2014 9:36:00 AM:

“Georgi, there are 30 authors on the PNAS paper from last April. How many of them do you think are prepared to stand by everything that's in that paper and how many will claim that the paper may not represent their views because they never approved the draft that was sent to PNAS?”

Georgi Marinov; Friday, October 31, 2014 2:49:00 PM:

I have said enough over the many threads on these subjects here for my position to be clear to anyone who has read my posts.

John Harshman; Friday, October 31, 2014 3:25:00 PM:

No, that's the problem. Your position isn't clear. When you say the main Nature paper was "technically correct", that sounds to everyone like a way of avoiding the controversy by pedantic legalism. Larry's trying to pin you down here, and you keep squirming. Or that's how it looks to me and, I suspect, to other readers. An unequivocal statement would be nice.

I also raised the following issues that you did not address:

(1) Is there anything wrong with ENCODE’s flagship paper in Nature? If the answer is yes, please let us know what’s wrong with it.

(2). Is there anything wrong with the presentation of ENCODE findings by Ewan Birney and other ENCODE leaders to science writers and media? If the answer is yes, please let us know what’s wrong with it; Saturday, January 17, 2015 5:42:00 PM
Georgi Marinov said...: My posts in this very same thread provide more than sufficient information to answer your questions.

Then there are things like Google Scholar that will give you even more information.; Saturday, January 17, 2015 5:56:00 PM
John Harshman said...: Georgi. I'm afraid that this too looks like weaseling to me and, I suspect again, to other readers. I also suspect that everyone would be interested in your answers to the questions, and the fact that people keep asking them should suggest to you that they think you haven't answered them. Now if you don't care that people think you're being a weasel, no problem.; Saturday, January 17, 2015 8:53:00 PM
Tom Mueller said...: re: A typical transcription factor binding site is about 6 bp.

Uhmm... actually I was under the impression that the consensus sequence for typical transcriptional binding sites were larger than restriction sites.

for example:

http://upload.wikimedia.org/wikipedia/commons/thumb/8/85/LexA_gram_positive_bacteria_sequence_logo.png/500px-LexA_gram_positive_bacteria_sequence_logo.png

Yes the consensus logo for the LexA-binding motif has six crucial nucleotides. But these are mirrored by a complementary palindromic 6 base sequence further downstream.

Classical footprint analyses in olden days before bioinformatics indicated a greater surface area of protein-DNA contact.

Meanwhile I am reminded of crucial differences between Chimpanzees and Humans that were obviously subject to significant selection pressures

Those 13 nucleotide changes in an 81 nucleotide stretch in just ONE enhancer HACSN1 ( what I thought typical for transcription factor binding sites) represents quite an anomalous mutation rate that could not be attributed to drift.

http://sandwalk.blogspot.com/2015/01/evolutionary-biochemistry-and.html?showComment=1420901851573#c5429010300897190288

OK, I have gotten off the topic of lncRNA, but I just wanted to clarify whether or not "a typical transcription factor binding site is about 6 bp."

Again, thanks to everybody for their patience and indulgence.; Sunday, January 18, 2015 6:31:00 AM
Georgi Marinov said...: LexA is a bacterial TF

There is a difference between prokaryotic and eukaryotic TFs - the prokaryotic ones have longer motifs. For which there are good evolutionary reasons.

The 6 bp comments referred to the eukaryotic ones. There are of course plenty of eukaryotic examples with longer motifs - CTCF, NRSF, etc. But most are indeed short, 6 to 8 bp; Sunday, January 18, 2015 6:45:00 AM
Tom Mueller said...: Hi Georgi – thank you for your patient assistance

I thought that Helix-turn-helix, Zinc fingers & Leucine zippers – all resulted by mixing and matching of protein subunits into different heterodimers or homodimers.

By definition that meant that while one subunit would interact with the first 6 bps, the other subunit necessarily had to interact with another upstream/downstream 6 bps that did not necessarily need to be palindromic with the first unless we were discussing homodimers.

We are still speaking of sequence identity of 12 AND NOT 6 bps.

I remain intrigued by those 13 nucleotide changes in an 81 nucleotide stretch in just ONE enhancer HACSN1 (are enhancer activator protein binding sites different perhaps?) representing quite an anomalous mutation rate that could not be attributed to drift.

My understanding was that HACSN1 was first discovered by comparing Human/Chimp genome variation and focusing on areas of unusual high variability indicating putative high selection for mutation in short regions of DNA (in the case of HACSN1, a region of DNA spanning 81 bp).

What am I missing here?; Sunday, January 18, 2015 4:52:00 PM
Georgi Marinov said...: I am not familiar with the HACSN1 case, but those changes clearly have to be not in a single TFBS but multiple ones. That's what an enhancer is - multiple TFBSs. And it pretty much has to be.

Because while it is OK to point out how many random TFBS matches exist in a genome when discussing TF binding, what is less often noted is that the vast majority of those are not detectably occupied. So if there is occupancy detected by ChIP-seq, that's something you do want to pay attention to. It does not mean that it is functional, but you need to take it seriously - clearing the chromatin barrier is a significant achievement on its own that forces one to take note of it. Unfortunately we still don't have a good answer to the question why some sites are bound and others are not. The usual explanations are pioneer factors and combinatorial occupancy, but the pioneer factors do not open all matches to their motifs either and the combinatorial occupancy concept is still quite fuzzy when it comes to the specifics.; Sunday, January 18, 2015 5:03:00 PM
Tom Mueller said...: Georgi

I am still unclear on one point:

I understand that Helix-turn-helix, Zinc fingers & Leucine zippers are part and parcel of the eukaryote TFBS story. If so, Transcription factors are dimers; if so, then in fact it is incorrect to claim that

A typical transcription factor binding site is about 6 bp ...

That would be true only for the monomer and not the functional dimer.

Of course, you raise another excellent point! Eukaryote enhancers bind a minimum of 3 Activator proteins (if I am not mistaken) and up to 8 Activator Proteins in enhancers for many genes.

This ups the ante considerably when considering the notion of fortuitous junk transcription.; Sunday, January 18, 2015 6:02:00 PM
