Showing posts sorted by date for query: mattick.

Friday, July 24, 2015

John Parrington discusses pseudogenes and broken genes

We are discussing Five Things You Should Know if You Want to Participate in the Junk DNA Debate and how they are described in John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the fourth of five posts.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk (this post)
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

4. Pseudogenes and broken genes are junk

Parrington discusses pseudogenes at several places in the book. For example, he mentions on page 72 that both Richard Dawkins and Ken Miller have used the existence of pseudogenes as an argument against intelligent design. But, as usual, he immediately alerts his readers to other possible explanations ...
However, using the uselessness of so much of the genome for such a purpose is also risky, for what if the so-called junk DNA turns out to have an important function, but one that hasn't yet been identified?
This is a really silly argument. We know what genes look like and we know what broken genes look like. There are about 20,000 former protein-coding pseudogenes in the human genome. Some of them arose recently following a gene duplication or insertion of a cDNA copy. Some of them are ancient and similar pseudogenes are found at the same locations in other species. They accumulate mutations at a rate consistent with neutral theory and random genetic drift. (This is a demonstrated fact.)
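(Here's the back-of-the-envelope version of that calculation. The rate and the date are round numbers of my own choosing, not measurements from any particular study.)

```python
# Under neutral theory the substitution rate at a site equals the mutation
# rate (Kimura), independent of population size, so a dead pseudogene should
# diverge at roughly the raw neutral rate. Round numbers, assumed:
mu = 1e-9   # neutral substitutions per site per year (approximate)
t = 6e6     # years since the human-chimp split (approximate)

divergence = 2 * mu * t   # changes accumulate along both lineages
print(f"expected neutral divergence: {divergence:.1%}")   # ~1.2%
# That's close to the observed human-chimp difference at pseudogenes and
# other neutral sites, which is the point of the paragraph above.
```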

It's ridiculous to suggest that a significant proportion of those pseudogenes might have an unknown important function. That doesn't rule out a few exceptions but, as a general rule, if it looks like a broken gene and acts like a broken gene, then chances are pretty high that it's a broken gene.

As usual, Parrington doesn't address the big picture. Instead he resorts to the standard ploy of junk DNA opponents by emphasizing the exceptions. He devotes more than two full pages (pages 143-144) to evidence that some pseudogenes have acquired a secondary function.
The potential pitfalls of writing off elements in the genome as useless or parasitical have been demonstrated by a recent consideration of the role of pseudogenes. ... recent studies are forcing a reappraisal of the functional role of these 'duds'.
Do you think his readers understand that even if every single broken gene acquired a new function that would still only account for less than 2% of the genome?
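Here's the arithmetic behind that 2% figure. The pseudogene count comes from above; the average length is my own rough assumption.

```python
genome_size = 3.2e9      # human genome in base pairs (approximate)
n_pseudogenes = 20_000   # former protein-coding pseudogenes (from the post)
avg_length_bp = 2_000    # assumed average pseudogene length (order of magnitude)

fraction = n_pseudogenes * avg_length_bp / genome_size
print(f"all pseudogenes combined: ~{fraction:.1%} of the genome")   # ~1.2%
```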

There's a whole chapter dedicated to "The Jumping Genes" (Chapter 8). Parrington notes that 45% of our genome is composed of transposons (page 119). What are they doing in our genome? They could just be parasites (selfish DNA), which he equates with junk. However, Parrington prefers the idea that they serve as sources of new regulatory elements and that they are important in controlling responses to environmental pressures. They are also important in evolution.

As usual, there's no discussion about what fraction of the genome is functional in this way but the reader is left with the impression that most of that 45% may not be junk or parasites.

Most Sandwalk readers know that almost all of the transposon-related sequences are bits and pieces of transposons that haven't been active for millions of years. They are pseudogenes. They look like broken transposon genes, they act like broken genes, and they evolve like broken transposons. It's safe to assume that's what they are. This is junk DNA and it makes up almost half of our genome.

John Parrington never mentions this nasty little fact. He leaves his readers with the impression that 45% of our genome consists of active transposons jumping around in our genome. I assume that this is what he believes to be true. He has not read the literature.

Chapter 9 is about epigenetics. (You knew it was coming, didn't you?) Apparently, epigenetic changes can make the genome more amenable to transposition. This opens up possible functional roles for transposons.
As we've seen, stress may enhance transposition and, intriguingly, this seems to be linked to changes in the chromatin state of the genome, which permits repressed transposons to become active. It would be very interesting if such a mechanism constituted a way for the environment to make a lasting, genetic mark. This would be in line with recent suggestions that an important mechanism of evolution is 'genome resetting'—the periodic reorganization of the genome by newly mobile DNA elements, which establishes new genetic programs in embryo development. New evidence suggests that such a mechanism may be a key route whereby new species arise, and may have played an important role in the evolution of humans from apes. This is very different from the traditional view of evolution being driven by the gradual accumulation of mutations.
It was at this point, on page 139, that I realized I was dealing with a scientist who was in way over his head.

Parrington returns to this argument several times in his book. For example, in Chapter 10 ("Code, Non-code, Garbage, and Junk") he says ....
These sequences [transposons] are assumed to be useless, and therefore their rate of mutation is taken to represent a 'neutral' reference; however, as John Mattick and his colleague Marcel Dinger of the Garvan Institute have pointed out, a flaw in such reasoning is 'the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty, are largely non-functional'. In fact, as we saw in Chapter 8, there is increasing evidence that while transposons may start off as molecular parasites, they can also play a role in the creation of new regulatory elements, non-coding RNAs, and other such important functional components of the genome. It is this that has led John Stamatoyannopoulos to conclude that 'far from being an evolutionary dustbin, transposable elements appear to be active and lively members of the genomic regulatory community, deserving of the same level of scrutiny applied to other genic or regulatory features'. In fact, the emerging role for transposition in creating new regulatory mechanisms in the genome challenges the very idea that we can divide the genome into 'useful' and 'junk' components.
Keep in mind that active transposons represent only a tiny percentage of the human genome. About 50% of the genome consists of transposon flotsam and jetsam—bits and pieces of broken transposons. It looks like junk to me.

Why do all opponents of junk DNA argue this way without putting their cards on the table? Why don't they give us numbers? How much of the genome consists of transposon sequences that have a biological function? Is it 50%, 20%, 5%?
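Since they won't give us numbers, here are some rough ones. These are approximate, commonly cited values from the genomics literature, not figures from Parrington's book.

```python
# Approximate fractions of the human genome (commonly cited literature values):
transposon_related = {
    "LINEs":            0.20,   # mostly truncated, defective copies
    "SINEs (Alu etc.)": 0.13,   # non-autonomous; almost all inactive
    "LTR elements":     0.08,   # endogenous retrovirus relics
    "DNA transposons":  0.03,   # no active copies known in humans
}
protein_coding_exons = 0.013

total = sum(transposon_related.values())
print(f"transposon-derived DNA: ~{total:.0%}")                 # ~44%
print(f"protein-coding exons:   ~{protein_coding_exons:.1%}")  # ~1.3%
# The open question: what fraction of that ~44% has a demonstrated function?
```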


John Parrington and the C-value paradox

We are discussing John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the second of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox (this post)
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation


2. C-Value paradox

Parrington addresses this issue on page 63 by describing experiments from the late 1960s showing that there was a great deal of noncoding DNA in our genome and that only a few percent of the genome was devoted to encoding proteins. He also notes that the differences in genome sizes of similar species gave rise to the possibility that most of our genome was junk. Five pages later (page 69) he reports that scientists were surprised to find only 30,000 protein-coding genes when the sequence of the human genome was published—"... the other big surprise was how little of our genomes are devoted to protein-coding sequence."

Contradictory stuff like that makes it very hard to follow his argument. On the one hand, he recognizes that scientists have known for 50 years that only 2% of our genome encodes proteins but, on the other hand, they were "surprised" to find this confirmed when the human genome sequence was published.

He spends a great deal of Chapter 4 explaining the existence of introns and claims that "over 90 per cent of our genes are alternatively spliced" (page 66). This seems to be offered as an explanation for all the excess noncoding DNA but he isn't explicit.

In spite of the fact that genome comparisons are a very important part of this debate, Parrington doesn't return to this point until Chapter 10 ("Code, Non-code, Garbage, and Junk").

We know that the C-Value Paradox isn't really a paradox because most of the excess DNA in various genomes is junk. There isn't any other explanation that makes sense of the data. I don't think Parrington appreciates the significance of this explanation.

The examples quoted in Chapter 10 are the lungfish, with a huge genome, and the pufferfish (Fugu), with a genome much smaller than ours. This requires an explanation if you are going to argue that most of the human genome is functional. Here's Parrington's explanation ...
Yet, despite having a genome only one eighth the size of ours, Fugu possesses a similar number of genes. This disparity raises questions about the wisdom of assigning functionality to the vast majority of the human genome, since, by the same token, this could imply that lungfish are far more complex than us from a genomic perspective, while the smaller amount of non-protein-coding DNA in the Fugu genome suggests the loss of such DNA is perfectly compatible with life in a multicellular organism.

Not everyone is convinced about the value of these examples, though. John Mattick, for instance, believes that organisms with a much greater amount of DNA than humans can be dismissed as exceptions because they are 'polyploid', that is, their cells have far more than the normal two copies of each gene, or their genomes contain an unusually high proportion of inactive transposons.
In other words, organisms with larger genomes seem to be perfectly happy carrying around a lot of junk DNA! What kind of an argument is that?
Mattick is also not convinced that Fugu provides a good example of a complex organism with no non-coding DNA. Instead, he points out that 89% of this pufferfish's DNA is still non-protein-coding, so the often-made claim that this is an example of a multicellular organism without such DNA is misleading.
[Mattick has been] a true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.

HUGO Award Reviewing Committee
Seriously? That's the best argument he has? He and Mattick misrepresent what scientists say about the pufferfish genome—nobody claims that the entire genome encodes proteins—then they ignore the main point; namely, why do humans need so much more DNA? Is it because we are polyploid?
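To see why this question matters, put the numbers side by side. The genome sizes below are approximate literature values, and all three species have roughly similar numbers of protein-coding genes.

```python
genome_sizes_gb = {        # haploid genome sizes, approximate
    "marbled lungfish":  130.0,
    "human":               3.2,
    "Fugu (pufferfish)":   0.4,
}
human = genome_sizes_gb["human"]
for species, size in genome_sizes_gb.items():
    print(f"{species:20s} {size:7.1f} Gb   ({size / human:6.2f} x human)")
# If most of the human genome is functional, what are the lungfish's extra
# ~127 Gb doing, given a roughly similar number of protein-coding genes?
```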

It's safe to say that John Parrington doesn't understand the C-value argument. We already know that Mattick doesn't understand it and neither does Jonathan Wells, who also wrote a book on junk DNA [John Mattick vs. Jonathan Wells]. I suppose John Parrington prefers to quote Mattick instead of Jonathan Wells—even though they use the same arguments—because Mattick has received an award from the Human Genome Organization (HUGO) for his ideas and Wells hasn't [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research].

For further proof that Parrington has not done his homework, I note that the Onion Test [The Case for Junk DNA: The onion test] isn't mentioned anywhere in his book. When people dismiss or ignore the Onion Test, it usually means they don't understand it. (For a spectacular example of such misunderstanding, see: Why the "Onion Test" Fails as an Argument for "Junk DNA").


Friday, July 03, 2015

The fuzzy thinking of John Parrington: The Central Dogma

My copy of The Deeper Genome: Why there's more to the human genome than meets the eye has arrived and I've finished reading it. It's a huge disappointment. Parrington makes no attempt to describe what's in your genome in more than general hand-waving terms. His main theme is that the genome is really complicated and so are we. Gosh, golly, gee whiz! Re-write the textbooks!

You will look in vain for any hard numbers such as the total number of genes or the amount of the genome devoted to centromeres, regulatory sequences etc. etc. [see What's in your genome?]. Instead, you will find a wishy-washy defense of ENCODE results and tributes to the views of John Mattick.

John Parrington is an Associate Professor of Cellular & Molecular Pharmacology at the University of Oxford (Oxford, UK). He works on the physiology of calcium signalling in mammals. This should make him well-qualified to write a book about biochemistry, molecular biology, and genomes. Unfortunately, his writing leaves a great deal to be desired. He seems to be part of a younger generation of scientists who were poorly trained as graduate students (he got his Ph.D. in 1992). He exhibits the same kind of fuzzy thinking as many of the ENCODE leaders.

Let me give you just one example.

Saturday, March 21, 2015

How the genome lost its junk according to John Parrington

I really hate it when publishers start to hype a book several months before we can read it, especially when the topic is controversial. In this case, it's Oxford University Press and the book is "The Deeper Genome: Why there is more to the human genome than meets the eye." The author is John Parrington.

The title of the promotion blurb is: How the Genome Lost its Junk on the Canadian version of the Oxford University Press website. It looks like this book is going to be an attack on junk DNA.

We won't know for sure until June or July when the book is published. Until then, the author and the publisher will have free rein to sell their ideas without serious opposition or push back.

Here's the prepublication hype. I'm going to buy this book and read it as soon as it becomes available. Stay tuned for a review.

Monday, February 23, 2015

Should universities defend free speech and academic freedom?

This post was prompted by a discussion I'm having with Jerry Coyne on whether he should be trying to censor university professors who teach various forms of creationism.

I very much enjoyed Jerry Coyne's stance on free speech in his latest blog post: The anti-free speech police ride again. Here's what he said,

Friday, January 16, 2015

Functional RNAs?

One of the most important problems in biochemistry & molecular biology is the role (if any) of pervasive transcription. We've known for decades that most of the genome is transcribed at some time or other. In the case of organisms with large genomes, this means that tens of thousands of RNA molecules are produced from regions of the genome that are not (yet?) recognized as functional genes.

Do these RNAs have a function?

Most knowledgeable biochemists are aware of the fact that transcription factors and RNA polymerase can bind at many sites in the genome that have nothing to do with transcription of a normal gene. This simply has to be the case based on our knowledge of DNA binding proteins [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA].

If you have a genome containing large amounts of junk DNA then it follows, as night follows day, that there will be a great deal of spurious transcription. The RNAs produced by these accidental events will not have a biological function.
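A simple calculation shows why this has to be true. Assume a transcription factor recognizes a short motif; the numbers below are round assumptions, but they're typical.

```python
genome_size = 3.2e9   # bp, human genome (approximate)
motif_length = 8      # a typical TF site has roughly 6-10 bp of specificity (assumed)

# Expected chance occurrences of one specific 8-mer, counting both strands:
expected_sites = 2 * genome_size / 4**motif_length
print(f"chance matches for one factor: ~{expected_sites:,.0f}")   # ~98,000
# Even if RNA polymerase fires at only a tiny fraction of these accidental
# sites, a genome full of junk will still produce thousands of spurious RNAs.
```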

Friday, January 17, 2014

Casey Luskin's latest take on junk DNA—is he lying or is he stupid?

Some of us have been trying to educate the IDiots for over twenty years. It can be very, very, frustrating.

The issue of junk DNA is a case in point. We've been trying to explain the facts to people like Casey Luskin. I know he's listening because he comments on Sandwalk from time to time. Surely it can't be that hard? All they have to do is acknowledge that "Darwinians" are opposed to junk DNA because they think that natural selection is very powerful and would have selected against junk DNA. All we're asking is that they refer to "evolutionary biologists" when they talk about junk DNA proponents.

We've also pointed out, ad nauseam, that no knowledgeable scientist ever said that all noncoding DNA was junk. We just want the IDiots to admit that there were some smart scientists who knew about functional noncoding DNA—like the genes for ribosomal RNAs, origins of replication, and centromeres.

On the function of lincRNAs

There's plenty of evidence that most of the DNA in mammalian genomes is junk [Five Things You Should Know if You Want to Participate in the Junk DNA Debate]. There's also plenty of evidence that as much as 10% of these genomes are functional in some way or another. This is a lot more DNA than the amount in coding regions but that shouldn't surprise anyone since we've known about functional noncoding DNA for half a century.

Lots of genes specify functional RNA molecules. The best known ones are the genes for ribosomal RNAs, tRNAs, the spliceosomal RNAs, and a variety of other catalytic RNAs. A host of small regulatory RNAs have been characterized in bacteria over the past five decades (Waters and Storz, 2009) and in the past few decades a variety of different types of small RNAs have been identified in eukaryotes (see Sharp, 2009). These include miRNAs, siRNAs, piRNAs, and others (Malone and Hannon, 2009; Carthew and Sontheimer, 2009).

Tuesday, October 29, 2013

The Khan Academy and AAMC Teach the Central Dogma of Molecular Biology in Preparation for the MCAT

Here's a presentation by Tracy Kovach, a 3rd year medical student at the University of Virginia School of Medicine. Sandwalk readers will be familiar with my view of Basic Concepts: The Central Dogma of Molecular Biology and the widespread misunderstanding of Crick's original idea. It won't be a surprise to learn that a 3rd year medical student is repeating the old DNA to RNA to protein mantra.

I suppose that's excusable, especially since that's what is likely to be tested on the MCAT. I wonder if students who take my course, or similar courses that correctly teach the Central Dogma, will be at a disadvantage on the MCAT?

The video is posted on the Khan Academy website at: Central dogma of molecular biology. What I found so astonishing about the video presentation is that Tracy Kovach spends so much time explaining how to remember "transcription" and "translation" and get them in the right order. Recall that this video is for students who are about to graduate from university and apply to medical school. I expect high school students to have mastered the terms "transcription" and "translation." I'm pretty sure that students in my undergraduate class would be insulted if I showed them this video. They would be able to describe the biochemistry of transcription and translation in considerable detail.


There are people who think that the Central Dogma is misunderstood to an even greater extent than I claim. They say that the Central Dogma is widely interpreted to mean that the only role of DNA information is to make RNA which makes protein. In other words, they fear that belief in that version of the Central Dogma rules out any other role for DNA. This is the view of John Mattick. He says that the Central Dogma has been overthrown by the discovery of genes that make functional RNA but not protein.

I wonder if students actually think that this is what the Central Dogma means? Watch the first few minutes of the video and give me your opinion. Is this what she is saying?


Saturday, August 24, 2013

John Mattick vs. Jonathan Wells

John Mattick and Jonathan Wells both believe that most of the DNA in our genome is functional. They do not believe that most of it is junk.

John Mattick and Jonathan Wells use the same arguments in defense of their position and they quote one another. Both of them misrepresent the history of the junk DNA debate and both of them use an incorrect version of the Central Dogma of Molecular Biology to make a case for the stupidity of scientists. Neither of them understands the basic biochemistry of DNA-binding proteins, leading them to misinterpret low-level transcription as functional. Jonathan Wells and John Mattick ignore much of the scientific evidence in favor of junk DNA. They don't understand the significance of the so-called "C-Value Paradox" and they don't understand genetic load. Both of them claim that junk DNA is based on ignorance.

Thursday, August 01, 2013

The Junk DNA Controversy: John Mattick Defends Design

The failure to recognize the implications of the non-coding DNA will go down I think as the biggest mistakes in the history of molecular biology.

John Mattick
abc Australia
John Mattick has just published a paper dealing with the controversy over the ENCODE results and junk DNA. As you might imagine, Mattick defends the idea that most of our genome is functional. He attempts to explain why most of the critics are wrong.

The title of the paper is "The extent of functionality in the human genome" (Mattick and Dinger, 2013). It's published in the HUGO Journal. Recall that HUGO (Human Genome Organization) gave Mattick a prestigious award for his contributions to genome research. (See The Dark Matter Rises for a discussion of these contributions.)

UPDATE: Mike White also discusses this paper at: Having your cake and eating it: more arguments over human genome function.

Mattick's paper begins by mentioning three of the papers that were critical of ENCODE results: Dan Graur's paper (Graur et al. 2013), Ford Doolittle's paper (Doolittle, 2013), and the paper by Niu and Jiang (2013).

He begins by addressing one of Dan Graur's points about conservation.

Wednesday, July 31, 2013

The Dark Matter Rises

John Mattick is a Professor and research scientist at the Garvan Institute of Medical Research at the University of New South Wales (Australia).

John Mattick publishes lots of papers. Most of them are directed toward proving that almost all of the human genome is functional. I want to remind you of some of the things that John Mattick has said in the past so you'll be prepared to appreciate my next post [The Junk DNA Controversy: John Mattick Defends Design].

Mattick believes that the Central Dogma means DNA makes RNA makes protein. He believes that scientists in the past took this very literally and discounted the importance of RNA. According to Mattick, scientists in the past believed that genes were the only functional part of the genome and that all genes encoded proteins.

If that sounds familiar it's because there are many IDiots who make the same false claim. Like Mattick, they don't understand the Central Dogma of Molecular Biology and they don't understand the history that they are distorting.

Mattick believes that there is a correlation between the amount of noncoding DNA in a genome and the complexity of the organism. He thinks that the noncoding DNA is responsible for making tons of regulatory RNAs and for regulating expression of the genes. This belief led him to publish a famous figure in Scientific American.

Mattick has many followers. So many, in fact, that the Human Genome Organization (HUGO) recently gave him an award for his contributions to the study of the human genome. Here's the citation.
The Award Reviewing Committee commented that Professor Mattick’s “work on long non-coding RNA has dramatically changed our concept of 95% of our genome”, and that he has been a “true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.”
Let's see what this "true visionary" is saying this year. The first paper is "The dark matter rises: the expanding world of regulatory RNAs" (Clark et al., 2013). Here's the abstract ...
The ability to sequence genomes and characterize their products has begun to reveal the central role for regulatory RNAs in biology, especially in complex organisms. It is now evident that the human genome contains not only protein-coding genes, but also tens of thousands of non–protein coding genes that express small and long ncRNAs (non-coding RNAs). Rapid progress in characterizing these ncRNAs has identified a diverse range of subclasses, which vary widely in size, sequence and mechanism-of-action, but share a common functional theme of regulating gene expression. ncRNAs play a crucial role in many cellular pathways, including the differentiation and development of cells and organs and, when mis-regulated, in a number of diseases. Increasing evidence suggests that these RNAs are a major area of evolutionary innovation and play an important role in determining phenotypic diversity in animals.
This is his main theme. Mattick believes that a large percentage of the human genome is devoted to making regulatory RNAs that control development. He believes that the evolution of this complex regulatory network is responsible for the creation of complex organisms like humans, which, incidentally, sit at the pinnacle of evolution according to his famous Scientific American figure.

The second paper I want to highlight focuses on a slightly different theme. Its title is "Understanding the regulatory and transcriptional complexity of the genome through structure" (Mercer and Mattick, 2013). In this paper he emphasizes the role of noncoding DNA in creating a complicated three-dimensional chromatin structure within the nucleus. This structure is important in regulating gene expression in complex organisms. Here's the abstract ...
An expansive functionality and complexity has been ascribed to the majority of the human genome that was unanticipated at the outset of the draft sequence and assembly a decade ago. We are now faced with the challenge of integrating and interpreting this complexity in order to achieve a coherent view of genome biology. We argue that the linear representation of the genome exacerbates this complexity and an understanding of its three-dimensional structure is central to interpreting the regulatory and transcriptional architecture of the genome. Chromatin conformation capture techniques and high-resolution microscopy have afforded an emergent global view of genome structure within the nucleus. Chromosomes fold into complex, territorialized three-dimensional domains in concert with specialized subnuclear bodies that harbor concentrations of transcription and splicing machinery. The signature of these folds is retained within the layered regulatory landscapes annotated by chromatin immunoprecipitation, and we propose that genome contacts are reflected in the organization and expression of interweaved networks of overlapping coding and noncoding transcripts. This pervasive impact of genome structure favors a preeminent role for the nucleoskeleton and RNA in regulating gene expression by organizing these folds and contacts. Accordingly, we propose that the local and global three-dimensional structure of the genome provides a consistent, integrated, and intuitive framework for interpreting and understanding the regulatory and transcriptional complexity of the human genome.
Other posts about John Mattick.

How Not to Do Science
John Mattick on the Importance of Non-coding RNA
John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research
International team cracks mammalian gene control code
Greg Laden Gets Suckered by John Mattick
How Much Junk in the Human Genome?
Genome Size, Complexity, and the C-Value Paradox


Clark, M.B., Choudhary, A., Smith, M.A., Taft, R.J. and Mattick, J.S. (2013) The dark matter rises: the expanding world of regulatory RNAs. Essays in Biochemistry 54:1-16. [doi:10.1042/bse0540001]

Mercer, T.R. and Mattick, J.S. (2013) Understanding the regulatory and transcriptional complexity of the genome through structure. Genome Research 23:1081-1088 [doi: 10.1101/gr.156612.113]

Sunday, July 14, 2013

How Not to Do Science

Many reputable scientists are convinced that most of our genome is junk. However, there are still a few holdouts and one of the most prominent is John Mattick. He believes that most of our genome is made up of thousands of genes for regulatory noncoding RNA. These RNAs (about 100 of them for every single protein-coding gene) are mostly involved in subtle controls of the levels of protein in human cells. (I'm not making this up. See: John Mattick on the Importance of Non-coding RNA)

It was a reasonable hypothesis at one point in time.

How do you evaluate a hypothesis in science? Well, one of the things you should always try to do is falsify your hypothesis. Let's see how that works ...
  1. The RNAs should be conserved. FALSE
  2. The RNAs should be abundant (>1 copy per cell). FALSE
  3. There should be dozens of well-studied specific examples. FALSE
  4. The hypothesis should account for variations in genome size. FALSE
  5. The hypothesis should be consistent with other data, such as that on genetic load. FALSE
  6. The hypothesis should be consistent with what we already know about the regulation of gene expression. FALSE
  7. You should be able to refute existing hypotheses, such as transcription errors. FALSE
Normally, you would abandon a hypothesis that had such a bad track record but true believers aren't about to do that. So what's next? Maybe these regulatory RNAs don't show sequence conservation but maybe their secondary structures are conserved. In other words, these RNAs originated as functional RNAs with a secondary structure but over the course of time all traces of sequence conservation have been lost and only the "conserved" secondary structure remains.1 The Mattick lab looked at the "conservation" of secondary structure as an indicator of function using the latest algorithms (Smith et al., 2013). Here's how they describe their attempts to prove their hypothesis in light of conflicting data ...
The majority of the human genome is dynamically transcribed into RNA, most of which does not code for proteins (1–4). The once common presumption that most non–protein-coding sequences are nonfunctional for the organism is being adjusted to the increasing evidence that noncoding RNAs (ncRNAs) represent a previously unappreciated layer of gene expression essential for the epigenetic regulation of differentiation and development (5–8). Yet despite an exponential accumulation of transcriptomic data and the recent dissemination of genome-wide data from the ENCODE consortium (9), limited functional data have fuelled discourse on the amount of functionally pertinent genomic sequence in higher eukaryotes (1, 10–12). What is incontrovertible, however, is that evolutionary conservation of structural components over an adequate evolutionary distance is a direct property of purifying (negative) selection and, consequently, a sufficient indicator of biological function. The majority of studies investigating the prevalence of purifying selection in mammalian genomes are predicated on measuring nucleotide substitution rates, which are then rated against a statistical threshold trained from a set of genomic loci arguably qualified as neutrally evolving (13, 14). Conversely, lack of conservation does not impute lack of function, as variation underlies natural selection. Given that the molecular function of ncRNA may at least be partially conveyed through secondary or tertiary structures, mining evolutionary data for evidence of such features promises to increase the resolution of functional genomic annotations.
Here's what they found ..
When applied to consistency-based multiple genome alignments of 35 mammals, our approach confidently identifies >4 million evolutionarily constrained RNA structures using a conservative sensitivity threshold that entails historically low false discovery rates for such analyses (5–22%). These predictions comprise 13.6% of the human genome, 88% of which fall outside any known sequence-constrained element, suggesting that a large proportion of the mammalian genome is functional.
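Taking the paper's own numbers at face value, here's what those false discovery rates mean in absolute terms.

```python
n_structures = 4_000_000          # "evolutionarily constrained RNA structures"
fdr_low, fdr_high = 0.05, 0.22    # the false discovery rates they report

print(f"expected false positives: {int(n_structures * fdr_low):,} "
      f"to {int(n_structures * fdr_high):,}")
# 200,000 to 880,000 of the 4 million predicted structures could be noise
# by the authors' own accounting.
```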
Apparently 13.6% of the human genome is a "large proportion." Taken at face value, however, the Mattick lab has now shown that the vast majority of transcribed sequences don't show any of the characteristics of functional RNA, including conservation of secondary structure. Of course, that's not the conclusion they emphasize in their paper.

Why not?

1. I can't imagine how this would happen, can you? You'd almost have to have selection AGAINST sequence conservation.

Smith, M.A., Gesell, T., Stadler, P.F. and Mattick, J.S. (2013) Widespread purifying selection on RNA structure in mammals. Nucleic Acids Research advance access July 11, 2013 [doi: 10.1093/nar/gkt596]

Friday, June 28, 2013

John Mattick on the Importance of Non-coding RNA

John Mattick is a Professor and research scientist at the Garvan Institute of Medical Research at the University of New South Wales (Australia). He received an award from the Human Genome Organization for ....
The Award Reviewing Committee commented that Professor Mattick’s “work on long non-coding RNA has dramatically changed our concept of 95% of our genome”, and that he has been a “true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.”

Tuesday, May 14, 2013

Scientific Authority and the Role of Small RNAs

A few weeks ago I criticized Philip Ball for an article he published in Nature: DNA: Nature Celebrates Ignorance. Phil has responded to my comments and he has given me permission to quote from his response. I think this is going to stimulate discussion on some very interesting topics.

The role of small RNAs is one of those topics. There are four types of RNA inside cells: tRNA, ribosomal RNA (rRNA), messenger RNA (mRNA), and a broad category that I call “small RNAs.”

The small RNAs include those required for splicing and those involved in catalyzing specific reactions. Many of them play a role in regulating gene expression. These roles have been known for at least three decades so there haven't been any conceptual advances in the big picture for at least that long.

What's new is an emphasis on the abundance and importance of small regulatory RNAs. Some workers believe that the human genome contains thousands of genes for small RNAs that play an important role in regulating gene expression. That's a main theme for those interpreting the ENCODE results. Several prominent scientists have written extensively about the importance of this "new information" on the abundance of small RNAs and how it assigns function to most of our genome.

Thursday, April 11, 2013

Educating an Intelligent Design Creationist: Rare Transcripts

I'm replying to a post by andyjones (More and more) Function, the evolution-free gospel of ENCODE. This was the fourth post in a series and I'm working my way through five issues that Intelligent Design Creationists need to understand.

Educating an Intelligent Design Creationist: Introduction
Educating an Intelligent Design Creationist: Pervasive Transcription

Andyjones says he didn't know that many of the unusual transcripts are very rare. That's a shame because it's one of the very important things you need to know in order to have an intelligent opinion about junk DNA. Here's a question from andyjones ...
The second point is interesting, but I have to ask the question: given the fact that we don’t know everything about the genome, isn’t it precisely those parts that are rarely transcribed that would give most difficulty when it comes to determining their functions?
The simple answer to your question is "yes" but that doesn't mean we don't have clues. The best explanation depends on how rare the transcripts are and on whether there's another, equally reasonable, explanation that accounts for their existence. What we can say right now is that the presence of these rare transcripts is consistent with junk DNA. We can also say that there's no reasonable functional explanation for huge numbers of transcripts that are present at less than one copy per cell. Think about that for a minute. It means that right now there are only two scientifically reasonable explanations: (1) junk DNA/RNA, and (2) we don't know if they have a function. It is scientifically incorrect to say that these transcribed regions are functional and therefore junk DNA is refuted.1
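A little arithmetic shows why abundances below one copy per cell are exactly what accidental transcription predicts. Both input numbers below are assumptions I chose to illustrate the orders of magnitude, nothing more.

```python
import math

initiations_per_day = 1.0   # one accidental transcript from a junk locus per day (assumed)
half_life_min = 30.0        # typical half-life of an unstable RNA (assumed)

# steady-state abundance = synthesis rate x mean lifetime
mean_lifetime_days = (half_life_min / math.log(2)) / (60 * 24)
copies_per_cell = initiations_per_day * mean_lifetime_days
print(f"steady-state abundance: ~{copies_per_cell:.2f} copies per cell")   # ~0.03
# Far less than one copy per cell, with no function required to explain it.
```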

Sunday, September 09, 2012

Brendan Maher Writes About the ENCODE/Junk DNA Publicity Fiasco

Brendan Maher is a Feature Editor for Nature. He wrote a lengthy article for Nature when the ENCODE data was published on Sept. 5, 2012 [ENCODE: The human encyclopaedia]. Here's part of what he said,
After an initial pilot phase, ENCODE scientists started applying their methods to the entire genome in 2007. Now that phase has come to a close, signalled by the publication of 30 papers, in Nature, Genome Research and Genome Biology. The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes.
I expect encyclopedias to be much more accurate than this.

As most people know by now, there are many of us who challenge the implication that 80% of the genome has a function (i.e., it's not junk).1 We think the Consortium was not being very scientific by publicizing such a ridiculous claim.

The main point of Maher's article was that the ENCODE results reveal a huge network of regulatory elements controlling expression of the known genes. This is the same point made by the ENCODE researchers themselves. Here's how Brendan Maher expressed it.

The real fun starts when the various data sets are layered together. Experiments looking at histone modifications, for example, reveal patterns that correspond with the borders of the DNaseI-sensitive sites. Then researchers can add data showing exactly which transcription factors bind where, and when. The vast desert regions have now been populated with hundreds of thousands of features that contribute to gene regulation. And every cell type uses different combinations and permutations of these features to generate its unique biology. This richness helps to explain how relatively few protein-coding genes can provide the biological complexity necessary to grow and run a human being.
I think that much of this hype comes from a problem I've called The Deflated Ego Problem. It arises because many scientists were disappointed to discover that humans have about the same number of genes as many other species yet we are "obviously" much more complex than a mouse or a pine tree. There are many ways of solving this "problem." One of them is to postulate that humans have a much more sophisticated network of control elements in our genome. Of course, this ignores the fact that the genomes of mice and trees are not smaller than ours.

Friday, March 16, 2012

John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research

Shame on the Human Genome Organization (HUGO). It has awarded a prestigious prize (US $10,000) to John Mattick, director of the Centre for Molecular Biology and Biotechnology at the University of Queensland in Brisbane, Australia. Here's the report from the Sydney Morning Herald.

Making something of junk earns geneticist top award

WHEN Sydney geneticist John Mattick suggested junk DNA was anything but rubbish he was challenging an assumption that had underpinned genetics for 50 years.

"The ideas I put forward 10 years ago were quite radical but I thought I was right," Professor Mattick said.

He was. And tomorrow he will become the first Australian honoured with the Chen Award for distinguished academic achievement in human genetic and genomic research, awarded by the Human Genome Organisation.

For decades after James Watson and Francis Crick discovered DNA was a double helix, scientists believed most genes were the written instructions for proteins, the building blocks of all body processes. The assumption was true for bacteria but not complex organisms like humans, said Professor Mattick, the new executive director of the Garvan Institute.

In humans, more than 95 per cent of the genome contains billions of letters that do not make proteins, called non-coding DNA. "When people bumped into all this DNA that didn't make proteins they thought it must be junk," he said. But Professor Mattick felt it was unlikely that useless material would survive hundreds of millions of years of evolution.

He found that the non-protein-coding sections of DNA had a function, to produce RNA.

"The obvious and very exciting possibility was that there is another layer of information being expressed by the genome - that the non-coding RNAs form a massive and previously unrecognised regulatory network that controls human development.''

Many scientists now believe this RNA is the basis of the brain's plasticity and learning, and may hold the secret to understanding many complex diseases.

Wednesday, August 17, 2011

Pervasive Transcription

"Pervasive transcription" refers to the idea that a large percentage of the DNA in mammalian genomes is transcribed. The idea became popular with the publication of the ENCODE results back in 2007 (Birney et al. 2007). Their results indicated that at least 93% of the human genome was transcribed at one time or another or in one tissue or another.

The result suggests that most of the genome consists of functional DNA. This pleases those who are opposed to the concept of junk DNA and it delights those who think that non-coding RNAs are going to radically change our concept of biochemistry and molecular biology. The result also pleased the creationists who were quick to point out that junk DNA is a myth [Junk & Jonathan: Part 6—Chapter 3, Most DNA Is Transcribed into RNA].


The original ENCODE paper used several different technologies to arrive at their conclusion. Different experimental protocols gave different results and there wasn't always complete overlap when it came to identifying transcribed regions of the genome. Nevertheless, the combination of results from three technologies gave the maximum value for the amount of DNA that was transcribed (93%). That's pervasive transcription.

The implication was that most of our genome is functional because it is transcribed.1 The conclusion was immediately challenged on theoretical grounds. According to our understanding of transcription, it is expected that RNA polymerase will bind accidentally at thousands of sites in the genome and the probability of initiating the occasional transcript is significant [How RNA Polymerase Binds to DNA]. Genes make up about 30% of our genome and we expect that this fraction will be frequently transcribed. The remainder is transcribed at a very low rate that's easily detectable using modern technology. That could easily be junk RNA [How to Frame a Null Hypothesis] [How to Evaluate Genome Level Transcription Papers].

There were also challenges on technical grounds; notably a widely discussed paper (van Bakel et al., 2010) from the labs of Ben Blencowe and Tim Hughes here in Toronto. That paper claimed that some of the experiments performed by the ENCODE group were prone to false positives [see Junk RNA or Imaginary RNA?]. They concluded,
We conclude that, while there are bona fide new intergenic transcripts, their number and abundance is generally low in comparison to known genes, and the genome is not as pervasively transcribed as previously reported.
The technical details of this dispute are beyond the level of this blog and, quite frankly, beyond me as well since I don't have any direct experience with these technologies. But let's not forget that aside from the dispute over the validity of the results, there is also a dispute over the interpretation.

As you might imagine, the pro-RNA, anti-junk, proponents fought back hard led by their chief, John Mattick, and Mark Gerstein (Clark et al., 2011). The focus of the counter-attack is on the validity of the results published by the Toronto group. Here's what Clark et al. (2011) conclude after their re-evaluation of the ENCODE results.
A close examination of the issues and conclusions raised by van Bakel et al. reveals the need for several corrections. First, their results are atypical and generate PR curves that are not observed with other reported tiling array data sets. Second, characterization of the transcriptomes of specific cell/tissue types using limited sampling approaches results in a limited and skewed view of the complexity of the transcriptome. Third, any estimate of the pervasiveness of transcription requires inclusion of all data sources, and less than exhaustive analyses can only provide lower bounds for transcriptional complexity. Although van Bakel et al. did not venture an estimate of the proportion of the genome expressed as primary transcripts, we agree with them that “given sufficient sequencing depth the whole genome may appear as transcripts” [2].

There is already a wide and rapidly expanding body of literature demonstrating intricate and dynamic transcript expression patterns, evolutionary conservation of promoters, transcript sequences and splice sites, and functional roles of “dark matter” transcripts [39]. In any case, the fact that their expression can be detected by independent techniques demonstrates their existence and the reality of the pervasive transcription of the genome.
The same issue of PLoS Biology contained a response from the Toronto group (van Bakel et al. 2011). They do not dispute the fact that much of the genome is transcribed since genes (exons + introns) make up a substantial portion and since cryptic (accidental) transcription is well-known. Instead, the Toronto group focuses on the abundance of transcripts from extra-genic regions and its significance.
We acknowledge that the phrase quoted by Clark et al. in our Author Summary should have read "stably transcribed", or some equivalent, rather than simply "transcribed". But this does not change the fact that we strongly disagree with the fundamental argument put forward by Clark et al., which is that the genomic area corresponding to transcripts is more important than their relative abundance. This viewpoint makes little sense to us. Given the various sources of extraneous sequence reads, both biological and laboratory-derived (see below), it is expected that with sufficient sequencing depth the entire genome would eventually be encompassed by reads. Our statement that "the genome is not as pervasively transcribed as previously reported" stems from the fact that our observations relate to the relative quantity of material detected.

Of course, some rare transcripts (and/or rare transcription) are functional, and low-level transcription may also provide a pool of material for evolutionary tinkering. But given that known mechanisms—in particular, imperfections in termination (see below)—can explain the presence of low-level random (and many non-random) transcripts, we believe the burden of proof is to show that such transcripts are indeed functional, rather than to disprove their putative functionality.
I'm with my colleagues on this one. It's not important that some part of the genome may be transcribed once every day or so. That's pretty much what you might expect from a sloppy mechanism—and let's be very clear about this, gene expression is sloppy.

You can't make grandiose claims about functionality based on such low levels of transcription. (Assuming the data turns out to be correct and there really is pervasive low-level transcription of the entire genome.)
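The "sufficient sequencing depth" point is easy to illustrate with a toy model. Suppose some small fraction of reads comes from random background transcription scattered across the genome (all numbers below are assumed); simple Poisson coverage then guarantees that deep sequencing eventually "detects transcription" almost everywhere.

```python
import math

genome_bp = 3.2e9       # human genome
read_len = 100          # bp per read (assumed)
background_frac = 0.01  # fraction of reads that are spurious background (assumed)

for total_reads in (1e8, 1e9, 1e10):
    lam = background_frac * total_reads * read_len / genome_bp  # mean coverage/base
    covered = 1 - math.exp(-lam)          # Poisson: P(base hit by >= 1 read)
    print(f"{total_reads:.0e} reads -> ~{covered:.0%} of the genome 'transcribed'")
# ~3%, ~27%, and ~96%: apparent "pervasive transcription" scales with
# sequencing depth, which is exactly the Toronto group's point.
```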

This is a genuine scientific dispute waged on two levels: (1) are the experimental results correct? and (2) is the interpretation correct? I'm delighted to see these challenges to "dark matter" hyperbole and the ridiculous notion that most of our genome is functional. For the better part of a decade, Mattick and his ilk had free rein in the scientific literature [How Much Junk in the Human Genome?] [Greg Laden Gets Suckered by John Mattick].

We need to focus on re-educating the current generation of scientists so they will understand basic principles and concepts of biochemistry. The mere presence of an occasional transcript is not evidence of functionality and the papers that made that claim should never have gotten past reviewers.


1. Not just an "implication" since in many papers that conclusion is explicitly stated.

Clark, M.B., Amaral, P.P., Schlesinger, F.J., Dinger, M.E., Taft, R.J., Rinn, J.L., Ponting, C.P., Stadler, P.F., Morris, K.V., Morillon, A., Rozowsky, J.S., Gerstein, M.B., Wahlestedt, C., Hayashizaki, Y., Carninci, P., Gingeras, T.R., and Mattick, J.S. (2011) The Reality of Pervasive Transcription. PLoS Biol 9(7): e1000625. [doi: 10.1371/journal.pbio.1000625].

Birney, E., Stamatoyannopoulos, J.A. et al. (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447:799-816. [doi:10.1038/nature05874]

van Bakel, H., Nislow, C., Blencowe, B. and Hughes, T. (2010) Most "Dark Matter" Transcripts Are Associated With Known Genes. PLoS Biology 8: e1000371 [doi:10.1371/journal.pbio.1000371]

van Bakel, H., Nislow, C., Blencowe, B.J., and Hughes, T.R. (2011) Response to "the reality of pervasive transcription". PLoS Biol 9(7): e1001102. [doi:10.1371/journal.pbio.1001102]

Don Johnson


Don Johnson has written a book that I'm probably going to have to buy (and read) if I ever hope to understand Intelligent Design Creationism.

Who is Don Johnson? Here's what it said on Uncommon Descent a few months ago [Why one scientist checked out of Darwinism].
The author worked for ten years as a Senior Research Scientist in the medical and scientific instrument field. The complexity of life came to the forefront during continued research, especially when his research group was involved with recombinant DNA during the late 1970s. … After several years as an independent consultant in laboratory automation and other computer fields, he began a 20-year career in university teaching, interrupted briefly to earn a second Ph.D. in Computer and Information Sciences from the University of Minnesota. Over time, the author began to doubt the natural explanations that had been so ingrained. It was science, and not his religion, that caused his disbelief in the explanatory powers of nature in a number of key areas including the origin and fine-tuning of mass and energy, the origin of life with its complex information content, and the increase in complexity in living organisms. This realization was not achieved easily, as he had to admit that he had been duped into believing concepts that were scientifically unfounded. The fantastic leaps of faith required to accept the natural causes in these areas demand a scientific response to the scientific-sounding concepts that in fact have no known scientific basis.
Sounds like a typical run-of-the-mill creationist. He has several of the common characteristics of Intelligent Design Creationist proponents: (1) religion, (2) a background in engineering and/or computer science, (3) no obvious expertise in evolutionary biology, (4) multiple Ph.D.s. I'm really intrigued by the fact that so many IDiots have more than one Ph.D. because I hang out with real scientists all the time and none of them have ever felt the need to be a graduate student more than once in their lives.

Why is this book interesting? Well, for one thing, there's this excerpt from Don Johnson's website [Science Integrity (sic)].
"In the absolute sense, one cannot rule out design of anything since a designer could design something to appear as if it weren’t designed. For example, one may not be able to prove an ordinary-looking rock hadn’t been designed to look as if it were the result of natural processes. The 'necessity of design,' however, is falsifiable. To do so, merely prove that known natural processes can be demonstrated (as opposed to merely speculated from unknown science) to produce: the fine-tuning empirically detectable in the Universe, life from non-life (including the information and its processing systems), the vast diversity of morphology suddenly appearing in the Cambrian era, and the increasing complexity moving up the tree of life (with the accompanying information increase and irreducibly complex systems). If those can be demonstrated with known science, the 'necessity of design' will have been falsified in line with using Occam’s Razor principles for determining the most reasonable scenarios. If the 'necessity of design' is falsified, some may continue to BELIEVE in design, but ID would no longer be appropriate as science." (p. 92)
Isn't that cool? It absolves Intelligent Design Creationism from any burden of proof since things are said to be designed unless you can prove the negative. If real scientists can't prove beyond a shadow of doubt that life came from non-life then design can't be falsified and must be true.

It doesn't matter how many times we can demonstrate that some things evolved, that still doesn't demonstrate that evolution is true. We can only do that if we fill in the most famous gaps existing in the early 21st century. That's the only way to falsify Intelligent Design Creationism. One of the ironies is that there's really no explanation to falsify other than "it has to be designed." This is quite clever. By refusing to offer an explanation of how life began, or how animal diversity arose 500 million years ago, the IDiots insulate themselves from the same criticism they level at evolutionary explanations.

I was prompted to write about Don Johnson after reading another excerpt from his book, one that particularly impressed Denyse O'Leary. She posted it on Uncommon Descent: What will be the next time and money-wasting error Darwinism leads scientists into?1
Researchers are discovering that what had been dismissed as evolution's relics are actually vital to life. What used to be considered evidence for the neo-Darwinian gene-formation mechanism can no longer be used as such evidence. In this case, neo-Darwinism has been a proven science inhibitor as it postponed serious investigation of the non-coding DNA within the genome, which was "one of the biggest mistakes in the history of molecular biology" [John Mattick, BioEssays, 2003 930-939]. This is reminiscent of the classification of 86 (later expanded to 180) human organs as "vestigial" that Robert Wiedersheim (1893) believed had "lost their original physiological significance" in that they were vestiges of evolution. Functions have since been discovered for all 180 organs that were thought to be vestigial, including the wings of flightless birds, the appendix, and the ear muscles of humans.
This is more than a little confusing since the statement is wrong about the scientific facts. But even more interesting is the implication that the presence of junk DNA and/or vestigial organs is a threat to Intelligent Design Creationism. What kind of threat? Here's how Denyse O'Leary describes it.
The explicit reason for both the junk DNA error and the vestigial organs error was the need to find evidence for Darwinism in the form of stuff in life forms that doesn’t work. Without that need, these errors would not have been made.
Setting aside the lie about these being errors, let's try and see why this is such a big deal for the IDiots.

As we saw from the first quotation, everything is assumed to be designed unless we can prove that the "big four" have a purely natural explanation. So why would the IDiots be concerned about little fish like junk DNA and vestigial organs? If a large part of our genome turns out to be junk and at least one organ turns out to be truly vestigial, does this mean Intelligent Design Creationism is falsified?

Not bloody likely. The real issue here is not whether Intelligent Design Creationism has a better explanation for the organization of the human genome. It doesn't. The real issue is that these topics can be used to discredit science and evolutionary biologists. (Hence, the title of the articles.)

As I point out in class, this is the 21st century and everyone needs to have science on their side. This includes the IDiots and the climate change deniers. They can't just take the position that they are opposed to science—even though they are. That strategy hasn't worked since Darwin.

So, what do you do when the science seems to refute your claims? You resort to the only option available, attack the science and discredit the messengers. That's why we see so many stories about evil "Darwinists" and that's why people like Denyse O'Leary pounce on any opportunity to point out errors and mistakes in the scientific literature. And if you can't find any real mistakes you can always just make them up.

Intelligent Design Creationism is not about proposing alternative explanations. It's about attacking evolution and evolutionary biologists. Don't believe me? Just look at the books and the blogs. Something like 99.9% of what's written by the IDiots is attacking evolution and science. When's the last time you ever saw anything explained by Intelligent Design Creationism?


1. Aren't you glad that Denyse O'Leary is a professional journalist? Can you imagine what her titles might look like if she didn't have professional training?