Monday, July 27, 2015

More confusion about the central dogma of molecular biology

I was doing some reading on lncRNAs (long non-coding RNAs) in order to find out how many of them had been assigned real biological functions. My reading was prompted by one of the latest updates to the human genome sequence; namely, assembly GRCh38.p3 from June 2015. The Ensembl website lists 14,889 lncRNA genes but I'm sure that most of these are just speculative [Ensembl Whole Genome].

The latest review by my colleagues here in the biochemistry department at the University of Toronto (Toronto, Canada) concludes that only a small fraction of these putative lncRNAs have a function (Palazzo and Lee, 2015). They point out that in the absence of evidence for function, the null hypothesis is that these RNAs are junk and the genes don't exist. That's not the view that annotators at Ensembl take.

I stumbled across a paper by Ling et al. (2015) that tries to make a case for function. I don't think their case is convincing but that's not what I want to discuss. I want to discuss their view of the Central Dogma of Molecular Biology. Here's the abstract ...
The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer.
This is getting to be a familiar refrain. I understand how modern scientists might be confused about the difference between the Watson and the Crick versions of the Central Dogma [see The Central Dogma of Molecular Biology]. Many textbooks perpetuate the myth that Crick's sequence hypothesis is actually the Central Dogma. That's bad enough but lots of researchers seem to think that their false view of the Central Dogma goes even further. They think it means that the ONLY kind of genes in your genome are those that produce mRNA and protein.

I don't understand how such a ridiculous notion could arise but it must be a common misconception, otherwise why would these authors think that non-coding RNAs are a challenge to the Central Dogma? And why would the reviewers and editors think this was okay?

I'm pretty sure that I've interpreted their meaning correctly. Here are the opening sentences of the introduction to their paper ...
The Encyclopedia of DNA Elements (ENCODE) project has revealed that at least 75% of the human genome is transcribed into RNAs, while protein-coding genes comprise only 3% of the human genome. Because of a long-held protein-centered bias, many of the genomic regions that are transcribed into non-coding RNAs (ncRNAs) had been viewed as ‘junk’ in the genome, and the associated transcription had been regarded as transcriptional ‘noise’ lacking biological meaning.
They think that the Central Dogma is a "protein-centered bias." They think the Central Dogma rules out genes that specify noncoding RNAs. (Like tRNA and ribosomal RNA?)

Later on they say ....
The protein-centered dogma had viewed genomic regions not coding for proteins as ‘junk’ DNA. We now understand that many lncRNAs are transcribed from ‘junk’ regions, and even those encompassing transposons, pseudogenes and simple repeats represent important functional regulators with biological relevance.
It's simply not true that scientists in the past viewed all noncoding DNA as junk, at least not knowledgeable scientists [What's in Your Genome?]. Furthermore, no knowledgeable scientists ever interpreted the Central Dogma of Molecular Biology to mean that the only functional genes in a genome were those that encoded proteins.

Apparently Ling, Vincent, Pichler, Fodde, Berindan-Neagoe, Slack, and Calin knew scientists who DID believe such nonsense. Maybe they even believed it themselves.

Judging by the frequency with which such statements appear in the scientific literature, I can only assume that this belief is widespread among biochemists and molecular biologists. How in the world did this happen? How many Sandwalk readers were taught that the Central Dogma rules out all genes for noncoding RNAs? Did you have such a protein-centered bias about the role of genes? Who were your teachers?

Didn't anyone teach you who won the Nobel Prize in 1989? Didn't you learn about snRNAs? What did you think RNA polymerases I and III were doing in the cell?


Ling, H., Vincent, K., Pichler, M., Fodde, R., Berindan-Neagoe, I., Slack, F.J., and Calin, G.A. (2015) Junk DNA and the long non-coding RNA twist in cancer genetics. Oncogene (published online January 26, 2015) [PDF]

Palazzo, A.F. and Lee, E.S. (2015) Non-coding RNA: what is functional and what is junk? Frontiers in Genetics 6: 2 (published online January 26, 2015) [Abstract]

52 comments :

  1. What did you think RNA polymerases II and III were doing in the cell?

    I think there's a typo here

    Replies
    1. I'm guessing you were taught properly! :-)

    2. As I've said many times, I don't have the feeling I was taught properly at any point after high school (which does not mean I haven't learned anything, just that I am deeply dissatisfied with the teaching I've encountered, partly because it was poor, partly because often there wasn't any to begin with). But this one is actually high school material anyway :)

  2. I find it very difficult to suppose that any of the authors has not heard of ribosomal and transfer RNA. So this misunderstanding can't be entirely due to lack of education or lack of knowledge that non-protein-coding, functional RNAs have been known for many years. It has to be some odd selective blindness or lack of thinking. Or perhaps the desire to make revolutionary contributions to science has temporarily paralyzed the frontal lobes.

  3. I am not surprised at the sequence hyp-central dogma confusion... I'm pretty sure several textbooks made this mistake even, and I am pretty sure I was mis-taught the concept. But I don't see how the notion of functional non-protein coding RNAs could ever be construed as a violation of either concept. Seems to me the sequence hypothesis was an explanation for where proteins came from, and not a declaration that all RNAs must therefore code for proteins. In any case, it's strange that any modern molecular biologist would make this mistake.

  4. Oh boy. How long will you hammer on this??? This is what tenure is good for: decades of scholarship allowed Larry to deeply understand a few perfectly trivial issues, leading to another decade of him mercilessly pummeling his wind mills opposition. Hooray for the free thought of the world!

    Replies
    1. Well, it might be trivial, but apparently either it has not been understood or the temptation to unnecessarily overhype new discoveries is too strong to resist.

      So some corrective is needed. But it's not going to be just one blog that does it.

    2. When a lot of people fail to understand the trivial, it is all the more important to point that out. You can't get the difficult stuff right if you fail already at the basics.

    3. @Georgi Marinov

      "Central Dogma" has only historical value (and a hype value for journalists). No correction is necessary for something that's not actually part of scientific inquiry.

    4. The authors clearly got the history wrong!

      Any cursory reading of Sidney Semour's understanding of "cistron" clearly indicates the earliest understanding of the Central Dogma did not have a protein-centered bias.

      Jacob and Monod's famous 1961 paper ambiguously mentioned an "operator locus". However, the diagram in this same paper unambiguously describes an "operator gene", "regulator gene" and "structural gene". J & M distinguished between the latter two precisely because at the time they believed that "repressor" was not a protein.

      ref p 333 following paper where J & M speculate that "repressor" is RNA

      http://www.gs.washington.edu/academics/courses/braun/55106/readings/jacob_and_monod.pdf

      All of the above was cited under the heading "Regulatory Genes", i.e. "non-protein-product" was explicit.

      Part of the problem is that "gene" as understood in classical terms of the last century has become obsolete.

    5. Here in fact, is the link to the famous 1961 J & M paper mentioned above

      J Mol Biol. 1961 Jun;3:318-56

      http://www.pasteur.fr/ip/resource/filecenter/document/01s-000046-03t/genetic-regulatory.pdf

    6. re: hype value for journalists

      No correction is necessary for something that's not actually part of scientific inquiry.

      Reality Check: The hype surrounding ENCODE requires correction or not?

    7. DK says,

      decades of scholarship allowed Larry to deeply understand a few perfectly trivial issues, leading to another decade of him mercilessly pummeling his wind mills opposition

      Ling et al. don't think it's trivial and that's the problem. Why would you mention the Central Dogma in the first sentence of your abstract and the first paragraph of your paper if you didn't think it was important?

      The purpose of my "pummeling" is to show these scientists that The Central Dogma is trivial. It's not as important as they think it is.

    8. Tages Haruspex says,

      Part of the problem is that "gene" as understood in classical terms of the last century has become obsolete.

      If by "last century" you mean the 20th century then I disagree. Here's what many of us understood to be the definition of a gene in the latter half of the last century ...

      A gene is a DNA sequence that is transcribed to produce a functional product.

      See: What Is a Gene? I still think that's the best definition for the 21st century, notwithstanding the fact that no definition is perfect. It covers both protein-coding genes and genes that specify functional noncoding RNAs.

    9. Professor Moran

      I must respectfully disagree with your optimistic accommodation of increasingly antiquated terminology, even when providing for a series of multi-operational definitions.

      Clearly there is more to "gene expression" than one solitary "DNA sequence".

      Both Peter Portin and William Gelbart have explained how our understanding of heredity has superseded the obsolete term "gene" as "a concept past its prime".

    10. On the subject of "historical record":

      The term "gene" was first coined in the last (i.e. 20th) Century

      http://www.historyofinformation.com/expanded.php?id=4289

      Wilhelm Johannsen proved himself quite the gentleman whereas Bateson was one hell of an SOB as described here

      http://www.esp.org/books/sturt/history/intro-lewis.html

    11. Of course, I meant to say "Petter Portin" with two t's

    12. @Tages Haruspex

      We aren't talking about gene expression.

      Feel free to offer a better definition of "gene."

    13. Larry
      I'm sorry to intervene like that but can you provide the latest definition of "gene"?
      Thank you. I think I have lost track of what's going on lately. If I missed something, please forgive.

    14. Professor Moran

      We aren't talking about gene expression.

      You are correct, I could have phrased that better.

      Feel free to offer a better definition of "gene."

      I think you missed my point. I do not need to provide a definition for a concept I already deemed "incoherent".

      Perhaps you could define "functional product" in light of this paper:

      http://www.nature.com/nature/journal/v480/n7376/full/nature10665.html

      A simplistic Mendelian “one gene, one trait” model violates everything we know about inheritance.

      A thought experiment: Is it possible, even in principle, to predict phenotype even if we could sequence and "characterize" some eukaryotic organism's genome?

    15. If pushed to the wall: I would compare a genome to an "operating system" and a "gene" to some sort of "subroutine" including but not limited to what you call "functional products".

      I cannot claim originality for that metaphor. Frankly it is lacking, but the best I could think of on the spur of a moment.

    16. Tages Haruspex asks,

      Perhaps you could define "functional product" in light of this paper:

      No. That paper has nothing to do with the biochemical definition of a gene. I'm saying that the functional product is a transcript.

    17. Professor Moran,

      "Biochemical" you say?

      That could be as good a definition as any other for a term that has outlived its utility, on the understanding your version of "function" is far removed from what has been generally understood to be "trait" or "phenotype".

      That is alright. The meaning of words can also evolve, why should "gene" be any different?

      http://www.mirror.co.uk/news/uk-news/words-literally-changed-meaning-through-2173079

      I am surprised at your facile dismissal of Lehner's paper. The random/statistical variation exhibited by the multiple genetic components which all contribute to a "phenotype" should vex the apologists/accommodationists on Uncommon Descent.

    18. Of course I meant to place "GENETIC" in quotes.

    19. @Tages Haruspex

      I don't understand your fixation on "trait" and "phenotype." Genes aren't the only thing that affects phenotype. We've known that for half a century.

    20. Professor Moran

      "half a century"? Much longer than that actually.

      Speaking of "half a century": I hope we are not repeating Ernst Mayr's debate with J.B.S. Haldane, when Mayr in 1959 suggested Haldane's approach to Population Genetics could be characterized as "beanbag genetics".

      What exactly do you mean by "function"?

      DNA that is transcribed into any version of RNA; that directly or indirectly, actually ends up doing something useful for the cell, even in the long term, even if non-essential?

    21. Professor Moran

      Djebali et al. :

      …the determination of genic regions is currently defined by the cumulative lengths of the isoforms and their genetic association to phenotypic characteristics, the likely continued reduction in the lengths of intergenic regions will steadily lead to the overlap of most genes previously assumed to be distinct genetic loci. This supports and is consistent with earlier observations of a highly interleaved transcribed genome, but more importantly, prompts the reconsideration of the definition of a gene. As this is a consistent characteristic of annotated genomes, we would propose that the transcript be considered as the basic atomic unit of inheritance. Concomitantly, the term gene would then denote a higher-order concept intended to capture all those transcripts (eventually divorced from their genomic locations) that contribute to a given phenotypic trait.

      I am guessing, but I think you are testing me. If we were to continue this exchange, I presume you really endorse the more subtle version of Djebali et al where "gene" comprises more than some solitary transcript.

      Myself, I have difficulties with Djebali et al's approach. The "function" of the transcript is inherent in the DNA sequence. In other words, the DNA sequence contains information, which when mutated creates a loss of function.

      Loss of function can also arise when mutating crucial DNA sequences that are not transcribed (enhancers/insulators as one example). These DNA sequences are no less hereditary than DNA sequences that are transcribed and should similarly be included in "gene" as a "higher-order concept" on the presumption that "Modern Genetics" (I doubt that term will follow phrenology into history's dustbin) still has something to do with "Hereditary".

      At this point, I am betraying my own personal prejudice on how eventual agreement on common usage will ultimately play out.

    22. Tages Haruspex asks,

      What exactly do you mean by "function"?

      DNA that is transcribed into any version of RNA; that directly or indirectly, actually ends up doing something useful for the cell, even in the long term, even if non-essential?


      That pretty much covers it. I'm not really interested in quibbling about the exact meaning of the word "function" because there's no definition of a gene that's airtight. Biology is messy but we do the best we can.

      I notice that you have not responded to my request to give us a better definition. Is that because you think the word "gene" is no longer useful in biochemistry and molecular biology?

    23. Tages Haruspex

      I am guessing, but I think you are testing me. If we were to continue this exchange, I presume you really endorse the more subtle version of Djebali et al where "gene" comprises more than some solitary transcript.

      Not a chance. The paragraph you quoted is just new-age gobbledygook.

      Some genes produce more than one functional transcript. The well-studied examples of known biologically functional alternative splicing are good examples. It's difficult to capture all these exceptions in a single concise definition but that doesn't mean we have to abandon the idea of a gene altogether.

      Have you read What Is a Gene??

    24. Professor Moran

      Regarding your assessment of the Djebali et al quotation

      Not a chance. The paragraph you quoted is just new-age gobbledygook.

      I am surprised you would say so. I cut and pasted that endorsement by Ford Doolittle directly from his paper:
      Is junk DNA bunk? A critique of ENCODE

      No differently than Djebali et al, Professor Doolittle maintains:

      Minimally, gene means more than it used to mean.

      regarding : "Have you read What Is a Gene??"

      I have and I repeat; you seem to be insisting on a common outdated presumption that gene must still somehow imply a single "locus".

      That may be true for a minority of genes that can be mapped like Morgan's and Sturtevant's "beads on a string" but the majority of "genes" (if we must persist in using that term for pragmatic considerations) belies a far more complex narrative.

      Have you read
      http://www.nature.com/nature/journal/v480/n7376/full/nature10665.html
      ?

    25. Professor Moran

      So I did answer your question.

      I notice you did not answer mine. I repeat:

      A thought experiment: Is it possible, even in principle, to predict phenotype even if we could sequence and "characterize" some eukaryotic organism's genome?

      On the presumption, of course, there are no obvious confounding variables.

    26. Yes, of course it's possible. Send me your genome sequence and I'll predict whether you are male or female. I'll also have an excellent chance of predicting the color of your eyes and your blood type.

    27. Tages Haruspex says

      you seem to be insisting on a common outdated presumption that gene must still somehow imply a single "locus".

      That may be true for a minority of genes that can be mapped like Morgan's and Sturtevant's "beads on a string" but the majority of "genes" (if we must persist in using that term for pragmatic considerations) belies a far more complex narrative.


      Really? I bet you can't name ten genes in the E. coli genome that aren't confined to a single locus. How about yeast? Can you name ten yeast genes that conflict with my preferred definition?

      How about humans? Name ten well-characterized human "genes" that aren't confined to a single locus. (You can count immunoglobulin genes as four of them.)

    28. Professor Moran

      Funny you should mention those three.

      All three exhibit non-Mendelian inheritance on occasion.

      1 - Bombay blood group
      2 - complete androgen insensitivity syndrome
      3 - HERC2 blue eyed children

      Only recently have we tentatively been able to predict from sequence evidence correct phenotypes although I am still not too sure we can yet say that with 100 % certainty.

      In any case, I already conceded there do exist a minority of "loci" that behave in Mendelian fashion.

      The majority of genetics is best understood as the random/statistical variation exhibited by a cascade of multigenic and pleiotropic components which all contribute to a "phenotype" that remain quite refractory to prediction. Single locus genetics remains the exception and not the rule.

      Please read that paper. I have no desire to recapitulate Mayr's debate with Haldane regarding "beanbag genetics".

    29. Professor Moran

      We just crossed posts.

      I am packing for a conference and will be absent for the interim. We will need to resume this discussion on a later date.

      au revoir

    30. Tages Haruspex says,

      We will need to resume this discussion on a later date.

      I don't think so. Your ideas are so confusing that it hurts my brain trying to imagine what you could possibly mean that makes any sense.

      I think you're just quibbling as in your response to my suggestion that there are genes that determine the color of your eyes or your blood type. That's just childish.

      If you really want to continue this conversation then give me some real examples of your kind of genes. You should not have any trouble finding ten in E. coli, yeast, and humans.

    31. Professor Moran

      Your challenge: I bet you can't name ten genes in the E. coli genome that aren't confined to a single locus...
      How about humans?


      You will note, I specifically restricted my conjecture to eukaryotic organisms. I find it ironic that you seem to be having difficulty finding human genes that are restricted to one locus.

      Let's focus on your first three citations, and I repeat:

      1 - two blue eyed parents can have a brown eyed child because part of the pigment making process involves two loci OCA2 and HERC2

      2 - Similarly, a Bombay Type O father and an A mother can have an AB child.

      3 - Similarly, an XY genotype can be female for a variety of complicated reasons involving more than one locus.

      All of these examples are admittedly trivial! William Bateson himself coined the phrase "epistasis" and as far back as the 1940's, George Beadle and Edward Tatum characterized gene complementation in Fungi.

      Clearly you either have not read Lehner's paper or you have failed to understand its implications wrt the random/statistical variation exhibited by the multiple genetic components which all contribute to a very unpredictable "phenotype"

      Meanwhile you have ignored my challenge regarding Professor Doolittle's expansion of the term "gene" in his seminal paper challenging ENCODE.

      I never "quibbled" but clearly we are both wasting our time.



    32. Tages: You seem to be conflating traits with genes. Is the term "polygenic" at all familiar to you? You have merely mentioned a number of polygenic traits, not genes with more than one locus.

    33. John Harshman

      I am perfectly cognizant of the distinction between multigenic aka polygenic vs. pleiotropic vs. epistatic as indicated in more than one post above.

      You are incorrect: I have in fact NOT "mentioned a number of polygenic traits" but have mentioned rather, epistasis & gene complementation which I conceded were trivial examples of traits "that aren't confined to a single locus".

      My initial disagreement with Professor Moran regards what I currently consider ambiguous if not downright incoherent usage of the word "gene", which brings me back to Sydney Brenner's higher order conception of "gene" as a computer subroutine. I can only suppose Sydney Brenner is also guilty of "new-age gobbledygook".

      Meanwhile, there is more to development than DNA sequences.

      Unfortunately my exchange with Professor Moran appears to have become "personal", so I really want to add nothing more.

      Other forums are more inviting and more engaging.

    34. Your problem is that nobody was asking about "traits that aren't confined to a single locus". The question was about genes that aren't confined to a single locus. Which shows that, as I said, you have conflated genes and traits.

      You may wish to flounce off, but all that happened here is that someone disagreed with you. Perhaps on other forums nobody does that. But it's not usually considered an insult.

    35. John Harshman

      regarding: ...you have conflated genes and traits.

      Not at all; I have merely attempted to illustrate why the current usage of the term "gene" is problematic. Professor Moran's invocation of a restrictive "biochemical" definition does not resolve the problem.

      My thought experiment remains unanswered: In fact, it is not possible in principle to predict my eye color (for example) with certainty, even when provided complete OCA2 and HERC2 DNA sequence data.

      That was the whole point of the Lehner paper that you and Professor Moran persist in ignoring.

      I hope at some time in future when you finally get around to reading the Lehner paper, you find it as exciting as I did.

      regarding: "But it's not usually considered an insult."

      You and Professor Moran have demonstrated no little condescension. You both have attempted to "put me in my place". I have a plane to catch and better occasions to occupy my time. I suggest in future you both may want to treat passers-by with more consideration.

    36. "I suggest in future you both may want to treat passers-by with more consideration."

      I suggest in the future you, when passing by, stop presenting yourself as the Holy Soothsayer of all knowledge. You reap what you sow.

    37. I do not understand why we need to redefine and jettison the term "gene", just because there is no one-to-one (or even many-to-one or one-to-many) correspondence between genes and traits. And thus I do not see the relevance of your examples. Could you explain?

      Unless of course you've already left in a huff. If that's too soon, make it a minute and a huff. Or you could leave in a taxi.

  5. A thought I just had: Perhaps a reason that so many people misconstrue Crick's Central Dogma is that it now seems so obvious as to be trivial. Of course the amino acid sequence of a protein cannot be transferred back and affect the information content of the DNA and RNA sequences that led to it. How could that happen? But at the time Crick proposed this, I surmise this was not nearly so obvious.

    Replies
    1. I don't think reverse-translation is impossible in principle - I could imagine an evolved biological mechanism that did this. But it's impossible given the mechanisms we're aware of.

    2. Not impossible, but I suspect much, much more difficult than forward translation. Consistently recognizing a single amino acid within a polypeptide chain under many different sequence contexts is a harder task than recognizing a single nucleotide (or, even easier, a set of three) within an RNA strand. Of course, if you were working with a smaller set of highly distinct amino acids, things would be easier. Still, reading and writing at the molecular scale is much easier with nucleic acids than proteins, independent of current biological mechanisms for doing so.

  6. "Because of a long-held protein-centered bias...". That sounds like straight out of any Mattick talk/paper ;-)

  7. I was taught that genes are sequences of DNA that code for proteins. That's it.
