Sunday, July 19, 2015

The fuzzy thinking of John Parrington: pervasive transcription

Opponents of junk DNA usually emphasize the point that they were surprised when the draft human genome sequence was published in 2001. They expected about 100,000 genes but the initial results suggested fewer than 30,000 (the final number is about 25,000¹). They were surprised because they had not kept up with the literature on the subject and they had not been paying attention when the sequence of chromosome 22 was published in 1999 [see Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome].

The experts were expecting about 30,000 genes and that's what the genome sequence showed. Normally this wouldn't be such a big deal. Those who were expecting a large number of genes would just admit that they were wrong and they hadn't kept up with the literature over the past 30 years. They should have realized that discoveries in other species and advances in developmental biology had reinforced the idea that mammals only needed about the same number of genes as other multicellular organisms. Most of the differences are due to regulation. There was no good reason to expect that humans would need a huge number of extra genes.

That's not what happened. Instead, opponents of junk DNA insist that the complexity of the human genome cannot be explained by such a low number of genes. There must be some other explanation to account for the missing genes. This sets the stage for at least seven different hypotheses that might resolve The Deflated Ego Problem. One of them is the idea that the human genome contains thousands and thousands of nonconserved genes for various regulatory RNAs. These are the missing genes and they account for a lot of the "dark matter" of the genome, the sequences that were thought to be junk.

Here's how John Parrington describes it on page 91 of his book.
The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts having importance, many were found only in specific cellular compartments, indicating that they have fixed addresses where they operate. Surely there could hardly be a greater divergence from Crick's central dogma than this demonstration that RNAs were produced in far greater numbers across the genome than could be expected if they were simply intermediates between DNA and protein. Indeed, some ENCODE researchers argued that the basic unit of transcription should now be considered as the transcript. So Stamatoyannopoulos claimed that 'the project has played an important role in changing our concept of the gene.'
This passage illustrates my difficulty in coming to grips with Parrington's logic in The Deeper Genome. Just about every page contains statements that are either wrong or misleading, and when he strings them together they lead to a fundamentally flawed conclusion. In order to critique the main point, you have to correct each of the so-called "facts" that he gets wrong. This is very tedious.

I've already explained why Parrington is wrong about the Central Dogma of Molecular Biology [John Avise doesn't understand the Central Dogma of Molecular Biology]. His readers don't know that he's wrong so they think that the discovery of noncoding RNAs is a revolution in our understanding of biochemistry, a revolution led by the likes of John A. Stamatoyannopoulos in 2012.

The reference in the book to the statement by Stamatoyannopoulos is from the infamous Elizabeth Pennisi article "ENCODE Project Writes Eulogy for Junk DNA" (Pennisi, 2012). Here's what she said in that article ...
As a result of ENCODE, Gingeras and others argue that the fundamental unit of the genome and the basic unit of heredity should be the transcript—the piece of RNA decoded from DNA—and not the gene. “The project has played an important role in changing our concept of the gene,” Stamatoyannopoulos says.
I'm not sure what concept of a gene these people had before 2012. It appears that John Parrington is under the impression that genes are units that encode proteins and maybe that's what Pennisi and Stamatoyannopoulos thought as well.

If so, then perhaps the publicity surrounding ENCODE really did change their concept of a gene, but all that proves is that they were remarkably uninformed before 2012. Intelligent biochemists have known for decades that the best definition of a gene is "a DNA sequence that is transcribed to produce a functional product."² In other words, we have been defining a gene in terms of transcripts for 45 years [What Is a Gene?].

This is just another example of wrong and misleading statements that will confuse readers. If I were writing a book I would say, "The human genome sequence confirmed the predictions of the experts that there would be no more than 30,000 genes. There's nothing in the genome sequence or the ENCODE results that has any bearing on the correct understanding of the Central Dogma and there's nothing that changes the correct definition of a gene."

You can see where John Parrington's thinking is headed. Apparently, Parrington is one of those scientists who were completely unaware of the fact that genes could specify functional RNAs and completely unaware of the fact that Crick knew this back in 1970 when he tried to correct people like Parrington. Thus, Parrington and his colleagues were shocked to learn that the human genome had only 25,000 genes and many of them didn't encode proteins. Instead of realizing that his view was wrong, he thinks that the ENCODE results overthrew those old definitions and changed the way we think about genes. He tries to convince his readers that there was a revolution in 2012.

Parrington seems to be vaguely aware of the idea that most pervasive transcription is due to noise or junk RNA. However, he gives his readers no explanation of the reasoning behind such a claim. Spurious transcription is predicted because we understand the basic concept of transcription initiation. We know that promoter sequences and transcription factor binding sites are short sequences and we know that they HAVE to occur at a high frequency in large genomes just by chance. This is not just speculation. [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA]
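The "just by chance" argument can be put into rough numbers. What follows is only a back-of-the-envelope sketch, not a model of real binding: it assumes a random sequence with equal base frequencies and independent positions, a haploid genome of about 3.2 billion bp, and exact matches to a single motif. The function name and the motif lengths (6, 8 and 10 bp) are illustrative choices of mine, not figures from the literature cited above.

```python
def expected_chance_sites(genome_length, motif_length, both_strands=True):
    """Expected number of exact matches to one specific motif in a random sequence.

    Assumes all four bases are equally frequent and positions are independent.
    Real genomes violate both assumptions, so treat the result as an
    order-of-magnitude estimate only. Counting both strands double-counts
    palindromic motifs.
    """
    positions = genome_length - motif_length + 1  # possible start positions
    if positions <= 0:
        return 0.0
    strands = 2 if both_strands else 1
    # Each start position matches with probability (1/4) ** motif_length.
    return strands * positions / 4 ** motif_length


if __name__ == "__main__":
    HUMAN_GENOME_BP = 3_200_000_000  # assumed haploid genome size
    for k in (6, 8, 10):  # illustrative motif lengths, not taken from the post
        print(k, round(expected_chance_sites(HUMAN_GENOME_BP, k)))
```

Under these assumptions a 6 bp motif is expected to turn up on the order of a million times and a 10 bp motif several thousand times, which is the sense in which short binding sites have to occur at high frequency in a large genome.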

If our understanding of transcription initiation is correct then all you need is an activator transcription factor binding site near something that's compatible with a promoter sequence. Any given cell type will contain a number of such factors and they must bind to a large number of nonfunctional sites in a large genome. Many of these will cause occasional transcription, giving rise to low abundance junk RNA. (Most of the ENCODE transcripts are present at less than one copy per cell.)

Different tissues will have different transcription factors. Thus, the low abundance junk RNAs must exhibit tissue specificity if our prediction is correct. Parrington and the ENCODE workers seem to think that the cell specificity of these low abundance transcripts is evidence of function. It isn't; it's exactly what you expect of spurious transcription. Parrington and the ENCODE leaders don't understand the scientific literature on transcription initiation and transcription factor binding sites.

It takes me an entire blog post to explain the flaws in just one paragraph of Parrington's book. The whole book is like this. The only thing it has going for it is that it's better than Nessa Carey's book [Nessa Carey doesn't understand junk DNA].


1. There are about 20,000 protein-encoding genes and an unknown number of genes specifying functional RNAs. I'm estimating that there are about 5,000 but some people think there are many more.

2. No definition is perfect. My point is that defining a gene as a DNA sequence that encodes a protein is something that should have been purged from textbooks decades ago. Any biochemist who ever thought seriously enough about the definition to bring it up in a scientific paper should be embarrassed to admit that they ever believed such a ridiculous definition.

Pennisi, E. (2012) "ENCODE Project Writes Eulogy for Junk DNA." Science 337: 1159-1161. [doi:10.1126/science.337.6099.1159]

26 comments:

  1. The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts having importance

    This is just sad...

    There was an old Bulgarian movie called "Whale":

    https://en.wikipedia.org/wiki/Whale_(film)

    This is starting to have a lot of similarities to it...

    Replies
    1. Thanks Georgi, that was an interesting read. My brother's partner is Bulgarian, will have to ask him if he's familiar with it.

      It appears that a lot of junk DNA writing is indeed junk DNA writing.

    2. The Wikipedia entry is very poorly written, unfortunately. And there is no English translation of the movie I am aware of (the movie itself is on YouTube in its entirety). But the parallels are striking - the way facts and interpretations get embellished as they travel through the chain of transmission, the institutional and sociological factors that drive it, everything....

  2. The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts having importance, many were found only in specific cellular compartments, indicating that they have fixed addresses where they operate. Surely there could hardly be a greater divergence from Crick's central dogma than this demonstration that RNAs were produced in far greater numbers across the genome than could be expected if they were simply intermediates between DNA and protein.

    Wait, wha, wait-- hold the phone!! You mean to tell me that RNA transcripts are *NOT* just intermediates between DNA and protein?

    Jesus Christ in a cheerleader uniform! This completely changes the science of 1954!


    https://youtu.be/wxlhyX-4qKI


    Call Doctor Who! Or make me a Tardis. We must alert the media of 1954!

  3. Wow that's bad. You'll love this piece, from the same guy. Note that he suggests that RNA is transmitted from the brain to sperm, but the study he links doesn't say that at all. Maybe he's just generally sloppy.

    http://blog.oup.com/2015/05/the-genetics-of-consciousness/

    I would have thought Oxford University Press would do better...

  4. What do you expect from a blog post headed by a picture which displays left-handed DNA in a human head? Seemingly, the likely original further down the page contains a right-handed molecule and they just carelessly mirrored it.

  5. This passage illustrates my difficulty in coming to grips with Parrington's logic in The Deeper genome.

    Tedious is right. This is a good example of the author being "not even wrong". Functional RNAs do not violate the Central Dogma, nor even the common misinterpretation of the Central Dogma (the Sequence Hypothesis).

  6. The so called jDNA controls telomere length that serves as a molecular clock that helps controlling aging. Here is one clue as to how jDNA could have been responsible for influencing immortality.

    Replies
    1. You never get tired of displaying your ignorant stupidity in unambiguous terms.

    2. Here is one clue as to how jDNA could have been responsible for influencing immortality.

      Right, so since "junk DNA" is actually good for something, that's why we're immortal...oh, wait.

    3. "The so called jDNA controls telomere length that serves as a molecular clock that helps controlling aging. Here is one clue as to how jDNA could have been responsible for influencing immortality."

      What the hell is this complete void-skull even trying to say?

    4. What the hell is this complete void-skull even trying to say?

      He's trying to claim that junk DNA, before it was inactivated by mutations beginning after the Fall, made people immortal or at least very long-lived, e.g. Methuselah. And thus we conclude Jesus.

    5. No, I've been through it with a fine-tooth comb and it's absolutely watertight. We will remember this day.

    6. How would you know this, without assuming religious beliefs?

      As I've argued over and over, ID can only make testable predictions if it admits its religious beliefs about the purposes of God.

      Consider another ID "theory": junk DNA, at an unspecified time in the past, used to make our lives *shorter*. That was the Designer's purpose. Then mutations disabled this function, so we live longer now.

      How is this ID theory worse than your ID theory? How do they differ? Different assumptions about God's purposes.

    7. Diogenes may I ask you a question? Do you have a problem with believers in general or just those who fight science? I'm starting to understand your frustration a bit more and I'm a believer. It seems many religious people have created a battle that isn't really necessary or likely winnable.

    8. Beau asks: Do you have a problem with believers in general or just those who fight science?

      I don't think it matters, but I have a problem with believers who fight science, or who want to use taxpayer money to violate church/state separation, or those who say they've got evidence of God's existence and the evidence is a joke, or who take a cavalier/whitewash attitude toward genocides in the Bible, Confederate slavery in the US, etc.

      In short, politicized applied conservative religion I have a problem with.

      Beyond that, all of us have religious family members and we have to get along.

    9. I'm confused....

      Is Skeptical Mind's statement about junk DNA and telomeres true at least regarding aging? I think it is....

      Knockouts showed that the telomere length shortens much faster, drastically affecting the lifespan of mice.....

      I personally don't believe junk DNA used to be responsible for immortality.... as the Bible talks about the "tree of life" being a source of everlasting life whatever that tree represented.... but who knows....

  7. Speaking of only 30,000 genes, they are part of generating possibly billions or trillions or more protein isoforms and glycoforms produced from a quadrillion transcriptomes. Plenty of room for RNAs to do work in coordinating all these buzzillion isoforms. We haven't even scratched the surface with RNA-seq experiments, but papers are coming out on the role of RNAs in isoform creation and cellular differentiation. The field of RNA interactomes has just barely gotten off the ground. What may look like noise to Graur looks like interactome machinery to ChIRP-SEQuencers.

    How do biochemists actually know RNA transcripts are non-functional? How many GWAS, RNA-Seq, ChIRP-seq, Chip-Seq, etc. experiments on the quadrillion transcriptomes would be adequate to establish non-function? Many functions, such as those involved in recovery from injury, are not detected unless the back up nature of this redundancy is called upon.

    How do you distinguish spare tires from junk without all these requisite experiments? Even recently we found quadruplex DNA may have a functional role. Example: miles and miles of repeated TTAGGG that are often incorporated in nuclear-only telomeric repeat-containing RNA (TERRA), and these repeats have a biophysical function in preventing error.

    Steve Matheson kept saying many RNAs don't even leave the nuclear complex. Did it occur to him then that maybe these RNAs are used to do something that's done in the nuclear complex alone, like creating and regulating part of the manufacture of ribosomes? It's too early to tell what these RNAs do. Simply saying they have no function because people like Matheson and Graur don't perceive the function is not a basis for declaring there is no function. It's just prejudice against the idea of deep integrated functionality in biological systems. And prejudice is not experiment and observation.

    Replies
    1. "liarsfordarwin" lives up to his name in a comment where he asks,

      How do biochemists actually know RNA transcripts are non-functional?

      We don't. But since spurious, nonfunctional, transcripts are expected by everyone who understands biochemistry, the onus is on the true believers to prove that transcripts are functional.

      How many GWAS, RNA-Seq, ChIRP-seq, Chip-Seq, etc. experiments on the quadrillion transcriptomes would be adequate to establish non-function?

      None. Those techniques identify transcripts that are potentially functional but most likely are not. It's up to real biochemists to do the dirty work and figure out which ones really have a significant biological function. Most genomics labs would rather just skip the dirty work and claim that they've revolutionized biology by applying those techniques to entire genomes.

      They don't even realize how silly they look when they make unwarranted conclusions based on whole genome analyses.

      Many functions, such as those involved in recovery from injury, are not detected unless the back up nature of this redundancy is called upon.

      See what I mean?

      It's just prejudice against the idea of deep integrated functionality in biological systems. And prejudice is not experiment and observation.

      The belief in "deep integrated functionality" is not only a prejudice, it's a really stupid prejudice because it flies in the face of all we know about biochemistry and evolution.

      It's people like you who are claiming a vast amount of function based entirely on the unsubstantiated belief that if it's detectable, it must have a function. That's not experiment and observation; it's a conclusion. It's not science.

    2. Dr. Moran, I should have waited longer to respond to Liarsfordarwin. You said what I tried to say, but better.

    3. Thank you for replying, Dr. Moran.

      I didn't say junk DNA had function. I already said in another thread I agree with you there could be junk DNA and that ID theory didn't predict junk DNA is functional. I was pointing out it certainly is premature to say it doesn't.

      Evolutionary theory is too ambiguous to be used as a guide for the physiology and function of DNA. The fact that this is a disagreement between ENCODE evolutionists and Graurists is evidence of the ambiguity.

      Thanks for your response.

    4. We have gone over this many times in the past. Why do we have to do it again?

      The worst part about the fact that creationists are trying to use ENCODE to disprove junk DNA is not that ENCODE data does not do that, it is that the arguments for most of the genome being junk do not even require evolution to be true. Originally, the idea appeared because a back-of-the-envelope calculation shows that for such a genome size and such a mutation rate, if all of it is functional, we would basically not exist. That argument is just as valid under a model in which we were specially created 6000 years ago and have absolutely no phylogenetic relationship to other organisms as it is under the evolutionary model of modern science. Sure, it makes a lot more sense in the light of evolution, but so does all of biology.

  8. How do biochemists actually know RNA transcripts are non-functional? How many GWAS, RNA-Seq, ChIRP-seq, Chip-Seq, etc. experiments on the quadrillion transcriptomes would be adequate to establish non-function?

    Throwing fancy terms around without really understanding what they mean usually leads to self-embarrassment, not to successfully making your point.

  9. To me, this is a question about where the burden of proof lies. One can't prove that an RNA transcript is useless. As Liarsfordarwin demonstrates, one can always imagine some more subtle function that a transcript might perform. However, one can prove (as well as anything in science is proved - let's not quibble) that an RNA transcript does perform a function. I think that the burden of proof lies on those who think all transcripts have useful functions.

    I do think that there is good evidence that "all RNA transcripts have useful functions" should not be the default position. When very sensitive tests are run, RNA transcripts are found for most of the DNA -- but most occur at a frequency of less than one transcript per cell. It's hard to see how those RNAs could do anything useful.

    Some of the reasons to think some transcripts are non-functional are the same reasons we think some DNA is junk. Surely the default assumption about RNA transcripts formed from the viruses that make up 8% of our DNA should be that the transcripts are useless to us (unless proved otherwise, as at least one has been). If mice seem to do well with 3% of their DNA removed, surely the burden of proof lies with those who want to argue that any RNA transcripts made from that 3% of DNA are functional.

    We have a great deal to learn about the function and lack of function of RNA transcripts. Starting from the assumption that they're all functional (often in really subtle ways) would be as foolish as working from the assumption that none of them are.
