Tuesday, November 05, 2013

Stop Using the Term "Noncoding DNA": It Doesn't Mean What You Think It Means

Axel Visel is a member of the ENCODE Consortium. He is a Staff Scientist at the Lawrence Berkeley National Laboratory in Berkeley, California (USA). Axel Visel is responsible, in part, for the publicity fiasco of September 2012 where the entire ENCODE Consortium gave the impression that most of our genome is functional.

He is also the senior author on a paper I blogged about last week—the one where some journalists made a big deal about junk DNA when there was nothing in the paper about junk DNA [How to Turn a Simple Paper into a Scientific Breakthrough: Mention Junk DNA].

Dan Graur contacted him by email to see if he had any comment about this misrepresentation of his published work and he defended the journalist. Here's the email response from Axel Visel to Dan Graur.
I’m not sure about the sources of individual journalists (although I did speak to some of them), but generally speaking I think it’s a valid strategy for general media to use a provocative and widely recognized term in a title to capture the attention of their audience, as long as they set the record straight in the text.

When I talk to general audiences (or journalists) about my research, I generally explain that the function of most of the non-coding portion of the genome was initially unclear and many people thought of it as “junk DNA”, but that it has become clear by now that many parts of the non-coding genome are functional - as we know from the combined findings of comparative genomics, epigenomic studies, and functional studies (such as the mouse knockouts in our paper).

As far as I can tell, the majority of general news reports appropriately conveyed the point of this paper, i.e. that at least those non-coding sequences we looked at here are indeed not “junk”.
That reply is astonishing on many levels. First, scientists should not condone the use of provocative titles that have to be corrected in the text. Second, no knowledgeable scientist ever said that all noncoding DNA was junk. Third, it is not true that the majority of press reports conveyed the point of the paper. The point of the paper was that some more mouse enhancers have been putatively identified. They account for about 0.001% of the genome. This is a minuscule percentage of the entire genome and a tiny percentage of the known amount of functional noncoding DNA.

Dan Graur rightly complains about Axel Visel's improper use of the term "noncoding DNA" [Dear Card Carrying #ENCODE members: Please Remember That Junk DNA is Not a Synonym for Noncoding DNA] and I want to emphasize this point. It's about time we banned the use of "noncoding DNA" because it really doesn't serve any useful purpose. In most cases it's used as a strawman synonym for "junk DNA" or a synonym for DNA with no known function.

I've said it many times but it bears repeating. A small percentage (about 1.4%) of our genome encodes proteins. There are many other interesting regions in our genome including ...
  • ribosomal RNA genes
  • tRNA genes
  • genes for small RNAs (e.g. spliceosome RNAs, P1 RNA, 7SL RNA, linc RNA, etc.)
  • 5' and 3' UTRs in exons
  • centromeres
  • introns
  • telomeres
  • SARs (scaffold attachment regions)
  • origins of DNA replication
  • regulatory regions of DNA
  • transposons (SINEs, noncoding regions of LINEs, LTRs)
  • pseudogenes
  • defective transposons
These parts of noncoding DNA account for about 80% of the human genome. A lot of this noncoding DNA is functional (about 7% of the total genome [What's in Your Genome?]). None of it is mysterious in any way. We've known about it for decades. As Dan Graur says, it's a known known.

Given all this, under what circumstances does the term "noncoding DNA" mean anything significant? Here's an excerpt from Axel Visel's webpage. Does his use of "noncoding DNA" mean anything that's useful?
Research Interests

The sequence of the human genome has been known for over a decade, but well-defined functional annotations exist mainly for the small portion of the genome that encodes proteins. In sharp contrast, the 98% non-coding regions of the genome remain poorly annotated. Examples from gene-centric studies have provided strong evidence that the non-coding portion of the genome harbors distant-acting transcriptional enhancers and have shown that these regulatory sequences are critical for normal embryonic development.
Look at that list up above. Does that look like non-annotated regions to you? Transposons and defective transposons alone account for more than 50% of the annotated genome. Introns account for 40% of the annotated genome—although this is probably an overestimate. Regulatory regions have been part of annotated human genome sequences for thirty years. The discovery of a few more transcription factor binding sites in remaining parts of the genome that were not previously annotated is not exactly breaking news.


  1. The mentality exemplified by this man will destroy science.

    Visel: "it’s a valid strategy for general media to use a provocative and widely recognized term in a title to capture the attention of their audience"

    OK. Let's use provocative language: You're a publicity whore. You will tell any ridiculous lie to get publicity.

    See? I'm using a provocative term to capture the audience's attention. Just like you, publicity whore. How do you like it when it is done to you?

    We should call all these "death of junk DNA", "junk DNA = non-coding DNA", "death of central dogma" dipshits what they are: Press Release Sociopaths and publicity whores. You want provocative language to focus attention on you-- OK, you got it.

    Our attention's on you. Man-whore.

  2. What the hell subject did Visel get his Ph.D. in? And from where? How could anyone with a Ph.D. be THIS stupid: "the function of most of the non-coding portion of the genome was initially unclear and many people thought of it as “junk DNA”"

    Jesus Tapdancing Christ, that's stupid. Didn't Monod and Jacob find out about function in non-coding DNA in the frikkin 1950's!? And get a Nobel Prize for it!? And didn't the Nobel committee hand out ~6 Nobel Prizes for finding function in non-coding DNA!? And doesn't every single grad student in molecular biology need to know this, EVERY SINGLE GRAD STUDENT, in order to clone and express a gene?

    I really wanna know what school gave the man-whore a Ph.D. Bob's Correspondence School, Christian Seminary and Waffle Emporium, I bet.

    &%@#, that's one stupid whore. I've had lap dances from strippers with higher intellects. Man-whore better get breast implants, cause he'll never be able to intellectually compete against actual whores.

    1. Hei, stop insulting lap-dancers and whores, please. Their intellect is the same as everyone else's, and their services are far more important for human societies than Visel's work is for science.

    2. I share your astonishment. I think we (instructors) are partly to blame for the stupidity of the current generation of scientists who don't seem to have been taught properly. I wonder if potential post-docs and graduate students who read his webpage will be turned off? Or are they just as ignorant of the history of their field?

    3. I'm pretty sure most grad students and post-docs are quite ignorant of the history of their field. Most textbooks just dump info with little discussion (or references to classical papers) of exactly how we reached some conclusion and why. Most courses don't provide much discussion of the how and why either.

      My personal experience, as someone who graduated in Geology and came to molecular biology as a PhD student, is that I had to do all the work by myself. The courses I took so far as part of the program didn't really provide much regarding the origin of the concepts we use at all. I guess I was fortunate that as a young teenager I had a big interest in Fred Hoyle's Panspermia theory. That led me later to see the same claims by Creationists/IDists. I found the arguments curious (are these problems real?!?!) and later that led me to blogs such as Sandwalk and to Molecular Evolution as a field. But I had to do all the work by myself by searching for answers. It wasn't easy, because there aren't many Molecular Evolution textbooks that can work at an introductory level and that cover what I consider the most important issues. Most students are just happy to get their degrees and that's it. The vast majority would not be able to point out what's wrong with any of the arguments that come from ID. Most have barely heard of what the Genetic Load or C-value Paradox arguments are. Most are convinced ENCODE did what its authors claim.

      This is all unfortunate, but hardly surprising. We now have researchers with decades of experience writing articles in which Non-Coding DNA is given as synonymous with Junk DNA. It's unbelievable that any experienced researcher can make such a statement, and therefore hardly surprising that so many new students are so ill-prepared.

      I hope the new textbook on Molecular Evolution that Dan Graur is writing is written in a way that is accessible to young students and that it gets some decent levels of adoption. Graur will certainly hammer into the readers' heads what the correct definitions are and why we know what we know and how.

  3. Correction: his first name is Axel, not Alex.

    I suspected that he had something to do with spreading the misinformation about the role of 'junk DNA' in his study, because the journalists who picked it up don't seem bright enough to come up with that nonsense on their own. All doubt was removed though after I saw the corresponding article in New Scientist magazine: Your face may have been sculpted by junk DNA. The reporter (Colin Barras) doesn't seem to have any formal training in genetics or molecular biology. He's mainly a paleontologist and a geologist. This explains, at least in part, some of the appalling statements you find in that article. What interested me in that article, however, is that the quoted words of Visel almost exactly match the nonsense that we saw in the Guardian and in other news sources as well. For instance:

    "Enhancers are part of the 98 per cent of the human genome that is non-coding DNA – long thought of as 'junk DNA'," says Visel. "It's increasingly clear that important functions are embedded in this 'junk'."

    Even though science journalists and a respected science magazine like the New Scientist should bear some of the blame for spreading this nonsense, Visel is the one who should be accountable for it. He was the source of this error, and he doesn't feel as though he did anything wrong. This is really sad.

    1. Correction: his first name is Axel, not Alex.

      Thank-you. That was embarrassing.

  4. "Noncoding DNA" means what it says - and it seems like a pretty reasonable term.

    That seems to make this an article with a sensationalist title that's not corrected in the text.

    1. Actually, I disagree. The reason why it is not a very reasonable term is that its definition means "DNA that doesn't code for PROTEINS". However, DNA can still *code* for rRNA or tRNA, for example. Those are real products, and they are CODED by DNA. Therefore, "non-coding DNA" is quite an unfortunate term, and is far from just "meaning what it says", because what it says makes no reference to what is coded.

      I agree with Moran, we should just drop the term, and simply use "protein-coding DNA" and "non protein-coding DNA" instead.

    2. Sorry Pedro, but I am against using the term "code" for rRNA genes, tRNA genes, etc. Those don't code.

      non-coding means non-coding. Using the term properly should not be a problem, but it is so easy for it to be misleading that maybe we should not use it. Just in case.

    3. Yeah -- if something isn't translated, the genetic code isn't involved and so no coding is involved. RNA genes are non-coding by definition.

    4. I think the root is a grammatical confusion between "coding" in the sense of computer code, and "coding" in the sense of the 3-letter mapping between DNA and aa (the "genetic code"). It was perhaps an unfortunate choice to call this mapping "the genetic code", as it implies to the layman that all functionality comes through this route. Perhaps a less leading phrase would be "translated DNA"?

    5. """Sorry Pedro, but I am against using the term "code" for rRNA genes, tRNA genes, etc. Those don't code."""

      I totally agree with you, so it's clear my choice of wording was poor. Let me rephrase that. We know that BY DEFINITION, a gene for rRNA or tRNA is a NON-coding gene. But here is precisely where the problem stems from. It just generates confusion, and it's clear where this confusion has led. In a loose sense it is very easy to simply refer to a rRNA or tRNA gene as "coding" for structural RNA, as in "blueprint". You even see it in the literature occasionally, because it is many times used in this loose sense. I can understand that. I've used it myself many times when fast-talking (to my shame) and I do try to avoid general misuse of terms all the time. So I agree with Moran, the term generates too much confusion to be useful, especially when it leads to such atrocities as junk DNA=non-coding DNA.

    6. Oops, I just re-read what I wrote originally and I WAS misusing the term again. SHAME ON ME!

      There, further "proof" that "coding" is an eeeevvviiilllll word.

    7. I don't see the shame in saying that an RNA sequence is encoded in DNA. Ever heard the phrase "DNA codes for RNA codes for protein"? Just because the code is very simple (A=A,T=U,C=C,G=G) doesn't mean that it isn't coding.

    8. """Ever heard the phrase "DNA codes for RNA codes for protein"?"""

      No, but I heard the phrase "DNA makes RNA makes protein".

      If RNA genes "coded" for RNA it would make no sense to call them non-coding genes. That's why it would make a lot more sense to stop using the expression "protein coding genes". If they are "coding genes", then they code for protein; if they are non-coding genes, then they don't code for protein, nor do they code for anything else. The "coding" part refers to the genetic code.

  5. Ye gods, where do you always get these pretentious / unflattering / hilarious pictures of the people you are writing about? Surely nobody in their right mind puts these onto their staff websites?

    1. One would think so, but yes, they do: http://newscenter.lbl.gov/news-releases/2013/01/31/genome-wide-atlas-of-gene-enhancers-in-the-brain-on-line/

  6. I recommend the following glossary of neologisms to describe "death of Junk DNA" types:

    Pubwhore: one who will do anything with his mouth in order to get attention from the media.

    Pressiopath: aka Press Release Sociopath; one who says a quotidian thing in his published paper, but claims to have made a revolutionary breakthrough in his Press Release, and has an untroubled conscience about lying to the non-scientific public.

    Paradigm Shaft: lying about the content of a theory, and replacing it with a different, dumber theory, which one then claims to have disproven, inducing a revolutionary Kuhnian "Paradigm Shift."

    Passive Tense Pussy: one who uses the passive tense for purposes of evasion, to avoid naming unknown scientists who are accused of having expressed some dumb in an unspecified text at an unspecified time; e.g. "long dismissed as Junk DNA...".

    cf. Axel Visel: "Enhancers are part of the 98 per cent of the human genome that is non-coding DNA – long thought of as 'junk DNA'," says Visel. "It's increasingly clear that important functions are embedded in this 'junk'."

    1. Excellent, as long as we can all agree that these are not mutually exclusive categories. It's possible for some "scientists" to be guilty of all four at the same time.

    2. Paradigm Shit: almost a paradigm shift, were it not for one missing detail.

    3. I'm going to start using that. F'n brilliant.

  7. Have you seen this Perspective article in today's Science? http://www.sciencemag.org/content/342/6159/705.full

    "One of the most important discoveries in genetics in the last 10 years is that the vast majority of trait-associated DNA variations occur in regions of the genome that were once labeled as “junk DNA” because they do not code for proteins. We now know that these regions harbor genetic elements that control where, when, and to what extent specific genes are expressed to make functional RNA and protein products."

    When will we stop having to read these ignorant assertions?

    1. I just dropped a comment there stating the following regarding that quote from the Perspective:

      """I find it amazing that some molecular biologists/geneticists apparently don't
      know what "junk DNA" means. According to the definition presented in this
      perspective, genes that are transcribed to 16S/18S rRNA, tRNA, etc were
      thought of as being functionless junk just 10 years ago...

      If this is representative of the level of knowledge of college teachers
      worldwide I have to say that I fear for the present and next generations of

      Let's see if it is approved or not.

    2. @ TheOtherJim:

      We are now in 1975. Didn't you get the memo? I'm still to be born yet and I'm typing this from the future.

    3. Looks like the comment I left last Friday at Science wasn't approved...

  8. Seen this: Un-junking junk DNA?

    I don't know what to say of it. Yet again committing the same errors and fallacies. It doesn't seem that they are ignorant of the debate because the press release does specifically mention what we know about the extent of functionality in the genome using the selected effect criterion.