Wednesday, July 31, 2013

The Dark Matter Rises

John Mattick is a Professor and research scientist at the Garvan Institute of Medical Research at the University of New South Wales (Australia).

John Mattick publishes lots of papers. Most of them are directed toward proving that almost all of the human genome is functional. I want to remind you of some of the things that John Mattick has said in the past so you'll be prepared to appreciate my next post [The Junk DNA Controversy: John Mattick Defends Design].

Mattick believes that the Central Dogma means DNA makes RNA makes protein. He believes that scientists in the past took this very literally and discounted the importance of RNA. According to Mattick, scientists in the past believed that genes were the only functional part of the genome and that all genes encoded proteins.

If that sounds familiar it's because there are many IDiots who make the same false claim. Like Mattick, they don't understand the Central Dogma of Molecular Biology and they don't understand the history that they are distorting.

Mattick believes that there is a correlation between the amount of noncoding DNA in a genome and the complexity of the organism. He thinks that the noncoding DNA is responsible for making tons of regulatory RNAs and for regulating expression of the genes. This belief led him to publish a famous figure (left) in Scientific American.

Mattick has many followers. So many, in fact, that the Human Genome Organization (HUGO) recently gave him an award for his contributions to the study of the human genome. Here's the citation.
The Award Reviewing Committee commented that Professor Mattick’s “work on long non-coding RNA has dramatically changed our concept of 95% of our genome”, and that he has been a “true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.”
Let's see what this "true visionary" is saying this year. The first paper is "The dark matter rises: the expanding world of regulatory RNAs" (Clark et al., 2013). Here's the abstract ...
The ability to sequence genomes and characterize their products has begun to reveal the central role for regulatory RNAs in biology, especially in complex organisms. It is now evident that the human genome contains not only protein-coding genes, but also tens of thousands of non–protein coding genes that express small and long ncRNAs (non-coding RNAs). Rapid progress in characterizing these ncRNAs has identified a diverse range of subclasses, which vary widely in size, sequence and mechanism-of-action, but share a common functional theme of regulating gene expression. ncRNAs play a crucial role in many cellular pathways, including the differentiation and development of cells and organs and, when mis-regulated, in a number of diseases. Increasing evidence suggests that these RNAs are a major area of evolutionary innovation and play an important role in determining phenotypic diversity in animals.
This is his main theme. Mattick believes that a large percentage of the human genome is devoted to making regulatory RNAs that control development. He believes that the evolution of this complex regulatory network is responsible for the creation of complex organisms like humans, which, incidentally, are the pinnacle of evolution according to the figure shown above.

The second paper I want to highlight focuses on a slightly different theme. Its title is "Understanding the regulatory and transcriptional complexity of the genome through structure" (Mercer and Mattick, 2013). In this paper he emphasizes the role of noncoding DNA in creating a complicated three-dimensional chromatin structure within the nucleus. This structure is important in regulating gene expression in complex organisms. Here's the abstract ...
An expansive functionality and complexity has been ascribed to the majority of the human genome that was unanticipated at the outset of the draft sequence and assembly a decade ago. We are now faced with the challenge of integrating and interpreting this complexity in order to achieve a coherent view of genome biology. We argue that the linear representation of the genome exacerbates this complexity and an understanding of its three-dimensional structure is central to interpreting the regulatory and transcriptional architecture of the genome. Chromatin conformation capture techniques and high-resolution microscopy have afforded an emergent global view of genome structure within the nucleus. Chromosomes fold into complex, territorialized three-dimensional domains in concert with specialized subnuclear bodies that harbor concentrations of transcription and splicing machinery. The signature of these folds is retained within the layered regulatory landscapes annotated by chromatin immunoprecipitation, and we propose that genome contacts are reflected in the organization and expression of interweaved networks of overlapping coding and noncoding transcripts. This pervasive impact of genome structure favors a preeminent role for the nucleoskeleton and RNA in regulating gene expression by organizing these folds and contacts. Accordingly, we propose that the local and global three-dimensional structure of the genome provides a consistent, integrated, and intuitive framework for interpreting and understanding the regulatory and transcriptional complexity of the human genome.
Other posts about John Mattick.

How Not to Do Science
John Mattick on the Importance of Non-coding RNA
John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research
International team cracks mammalian gene control code
Greg Laden Gets Suckered by John Mattick
How Much Junk in the Human Genome?
Genome Size, Complexity, and the C-Value Paradox


Clark, M.B., Choudhary, A., Smith, M.A., Taft, R.J. and Mattick, J.S. (2013) The dark matter rises: the expanding world of regulatory RNAs. Essays in Biochemistry 54:1-16. [doi:10.1042/bse0540001]

Mercer, T.R. and Mattick, J.S. (2013) Understanding the regulatory and transcriptional complexity of the genome through structure. Genome Research 23:1081-1088. [doi:10.1101/gr.156612.113]

23 comments:

  1. In Mattick's figure, I think you cut off the right-hand part of it: the part showing the even-higher bars for the onion, the congo eel, and the lungfish.

    1. It can't go higher than 100% :P

      In any case, it's still wildly misleading, this "Plants/fungi" bar should show a much greater range, still going above the human one and almost all the way to the bottom.

      That figure really is unbelievably stupid.

    2. Yes, you're right, the 100% limit is important there. It is a plot of percentage of the genome not coding for proteins. So if you have N bases noncoding and C bases coding it is plotting N/(N+C).

      A more intuitive measure of the amount of noncoding DNA would be N/C. That is the one which would be much higher in (say) lungfish than in humans. Plotted the way Mattick plots it, humans and lungfish both end up very close to 100%.

    3. This figure is the original DAP:

      http://www.genomicron.evolverzone.com/2007/09/dogs-ass-plots-daps/

    4. Yep, you beat me to it. It would be nice to slap a couple of onion species on that graph.

  2. Good idea. It would have also been nice to put, at the immediate right of the human bar, the largest mammalian genome size (so far): 8.40pg for Tympanoctomys barrerae, the red viscacha rat. While it isn't the pinnacle of all creation, it's clearly almost three times as complex as H. sapiens.

  3. I think Mattick is one of the best writers in the field of biology; the language is exquisite and the composition sophisticated. Moreover, his writings are usually consistent with the data. Take for example the abstract of the "The dark matter rises: the expanding world of regulatory RNAs" article. Here is my true/false evaluation:

    #1. The ability to sequence genomes and characterize their products has begun to reveal the central role for regulatory RNAs in biology, especially in complex organisms.

    True or false: True

    #2. It is now evident that the human genome contains not only protein-coding genes, but also tens of thousands of non–protein coding genes that express small and long ncRNAs (non-coding RNAs).

    True or false: True

    #3. Rapid progress in characterizing these ncRNAs has identified a diverse range of subclasses, which vary widely in size, sequence and mechanism-of-action, but share a common functional theme of regulating gene expression.

    True or false: True

    #4. ncRNAs play a crucial role in many cellular pathways, including the differentiation and development of cells and organs and, when mis-regulated, in a number of diseases.

    True or false: True

    #5. Increasing evidence suggests that these RNAs are a major area of evolutionary innovation and play an important role in determining phenotypic diversity in animals.

    True or false: True

    1. 1. False: We've known about regulatory RNAs for over 35 years.

      2. False implication: There's no evidence that the regions that are transcribed are genes by any reasonable definition of "genes."

      3. True

      4. True: Several examples of regulatory RNAs are known.

      5. False.

      Two out of five is not good.

    2. Even if those 5 points were all true, most DNA could be non-functional, in any sensible sense of 'functional'.

  4. Separating humans from vertebrates in the figure? What is the justification for that?

    1. Well what the hell is the point of separating vertebrates from chordates, or lumping all invertebrates together?

      Imagine lumping together the mantis shrimp and platyhelminthes! It's insane.

    2. Chordatocentrism rears its ugly head once again.

    3. Obviously ants and giant squid have so much in common. I'm sure there's virtually no range in the amount of ncDNA they have.

    4. Holy crap you're right! Apparently according to Mattick, humans and other vertebrates are not chordates. It's a phylogenetic revolution! That figure is completely IDiotic, which makes one wonder about the mind that created it.

    5. Well William Jennings Bryan at the Scopes Trial said humans are not mammals. How dare you devalue human life by calling my children vertebrates?

  5. Mike White's team has published a relatively new paper that might be relevant to the discussion about the definition of function in the genomic context: Finding function in the genome with a null hypothesis

    He also addressed some of Mattick's points in the most recent HUGO paper:
    Having your cake and eating it: more arguments over human genome function

    1. The study reported in the White et al. PNAS paper (http://www.ncbi.nlm.nih.gov/pubmed/23818646) is interesting and valuable. However, there is a big problem with the main conclusion of the study:

      “Our results show that the cis-regulatory potential of TF-bound DNA is determined largely by highly local sequence features and not by genomic context.”

      Although that might be indeed the case, their results *do not* show that “the cis-regulatory potential of TF-bound DNA is determined largely by highly local sequence features and not by genomic context”, simply because their sequences were assayed in *plasmid* not in *genomic* context.

      I think that drawing such a misleading conclusion, when the authors were well aware of the limitation of their study, is similar to the conclusions drawn by the ENCODE study, which, ironically, White et al. set out to evaluate.

    2. The plasmid is the point - the sequences were removed from their genomic context and placed into a permissive plasmid context.

      The prediction made by most of my colleagues was that putative transcription factor binding sites that are unbound in the genome (i.e., non-functional), would prove to be highly functional in the plasmid context. Turns out that's not true. Non-functional sites in the genome behaved like randomly generated DNA on the plasmid.

    3. Your conclusion (see the quote from your Abstract in my previous comment) refers to the cis-regulatory potential of *TF-bound DNA*, not the cis-regulatory potential of “putative transcription factor binding sites that are *unbound* in the genome” as you state in your reply.

    4. That's right, because in the paper we compare the function of TF bound DNA to unbound motifs. It's not that complicated.

    5. I still maintain that in order to *show* that “the cis-regulatory potential of TF-bound DNA is determined largely by highly local sequence features and not by genomic context” you need to place these elements at various sites in the genome. Maybe we are disagreeing about the meanings of the term *show* vs. *suggest*, which I think would have been more appropriate.

      Nevertheless, I would guess that you and your colleagues have investigated the local sequence features of the TF-*bound* and *unbound* DNA elements in the genome. What do the results show?

  6. Some range bars would be nice. Presumably that's what the sloping bar top represents. Fungi/Plants barely overlap in total genome size range, and as a composite group cover 4 orders of magnitude, yet their noncoding proportion remains a healthy 65-75%-ish throughout? http://en.wikipedia.org/wiki/File:Genome_Sizes.png

  7. A key question in the field is whether the transcripts resulting from pervasive transcription of intergenic regions are functional or the result of noisy transcription. The lincRNAs we describe are specifically regulated and contain conserved sequence, attributes inconsistent with transcriptional noise.

    http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1003569

    I thought you might be interested.

    Cheers.
