More Recent Comments

Monday, July 27, 2015

More confusion about the central dogma of molecular biology

I was doing some reading on lncRNAs (long non-coding RNAs) in order to find out how many of them had been assigned real biological functions. My reading was prompted by one of the latest updates to the human genome sequence; namely, assembly GRCh38.p3 from June 2015. The Ensembl website lists 14,889 lncRNA genes but I'm sure that most of these are just speculative [Ensembl Whole Genome].

The latest review by my colleagues here in the biochemistry department at the University of Toronto (Toronto, Canada) concludes that only a small fraction of these putative lncRNAs have a function (Palazzo and Lee, 2015). They point out that in the absence of evidence for function, the null hypothesis is that these RNAs are junk and the genes don't exist. That's not the view that annotators at Ensembl take.

I stumbled across a paper by Ling et al. (2015) that tries to make a case for function. I don't think their case is convincing but that's not what I want to discuss. I want to discuss their view of the Central Dogma of Molecular Biology. Here's the abstract ...
The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer.
This is getting to be a familiar refrain. I understand how modern scientists might be confused about the difference between the Watson and the Crick versions of the Central Dogma [see The Central Dogma of Molecular Biology]. Many textbooks perpetuate the myth that Crick's sequence hypothesis is actually the Central Dogma. That's bad enough but lots of researchers seem to think that their false view of the Central Dogma goes even further. They think it means that the ONLY kind of genes in your genome are those that produce mRNA and protein.

I don't understand how such a ridiculous notion could arise but it must be a common misconception, otherwise why would these authors think that non-coding RNAs are a challenge to the Central Dogma? And why would the reviewers and editors think this was okay?

I'm pretty sure that I've interpreted their meaning correctly. Here are the opening sentences of the introduction to their paper ...
The Encyclopedia of DNA Elements (ENCODE) project has revealed that at least 75% of the human genome is transcribed into RNAs, while protein-coding genes comprise only 3% of the human genome. Because of a long-held protein-centered bias, many of the genomic regions that are transcribed into non-coding RNAs (ncRNAs) had been viewed as ‘junk’ in the genome, and the associated transcription had been regarded as transcriptional ‘noise’ lacking biological meaning.
They think that the Central Dogma is a "protein-centered bias." They think the Central Dogma rules out genes that specify noncoding RNAs. (Like tRNA and ribosomal RNA?)

Later on they say ....
The protein-centered dogma had viewed genomic regions not coding for proteins as ‘junk’ DNA. We now understand that many lncRNAs are transcribed from ‘junk’ regions, and even those encompassing transposons, pseudogenes and simple repeats represent important functional regulators with biological relevance.
It's simply not true that scientists in the past viewed all noncoding DNA as junk, at least not knowledgeable scientists [What's in Your Genome?]. Furthermore, no knowledgeable scientists ever interpreted the Central Dogma of Molecular Biology to mean that the only functional genes in a genome were those that encoded proteins.

Apparently Ling, Vincent, Pichler, Fodde, Berindan-Neagoe, Slack, and Calin knew scientists who DID believe such nonsense. Maybe they even believed it themselves.

Judging by the frequency with which such statements appear in the scientific literature, I can only assume that this belief is widespread among biochemists and molecular biologists. How in the world did this happen? How many Sandwalk readers were taught that the Central Dogma rules out all genes for noncoding RNAs? Did you have such a protein-centered bias about the role of genes? Who were your teachers?

Didn't anyone teach you who won the Nobel Prize in 1989? Didn't you learn about snRNAs? What did you think RNA polymerases I and III were doing in the cell?


Ling, H., Vincent, K., Pichler, M., Fodde, R., Berindan-Neagoe, I., Slack, F.J., and Calin, G.A. (2015) Junk DNA and the long non-coding RNA twist in cancer genetics. Oncogene (published online January 26, 2015) [PDF]

Palazzo, A.F. and Lee, E.S. (2015) Non-coding RNA: what is functional and what is junk? Frontiers in Genetics 6: 2 (published online January 26, 2015) [Abstract]

Friday, July 24, 2015

John Parrington talks about The Deeper Genome

Here's a video from Oxford Press where you can hear John Parrington describe some of the ideas in his book: The Deeper Genome: Why there is more to the human genome than meets the eye.



John Parrington discusses genome sequence conservation

John Parrington has written a book called, The Deeper Genome: Why there is more to the human genome than meets the eye. He claims that most of our genome is functional, not junk. I'm looking at how his arguments compare with Five Things You Should Know if You Want to Participate in the Junk DNA Debate.

There's one post for each of the five issues that informed scientists need to address if they are going to write about the amount of junk in your genome. This is the last one.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved (this post)
John Parrington discusses genome sequence conservation

5. Most of the genome is not conserved

There are several places in the book where Parrington addresses the issue of sequence conservation. The most detailed discussion is on pages 92-95 where he discusses the criticisms leveled by Dan Graur against ENCODE workers. Parrington notes that 9% of the human genome is conserved and recognizes that sequence conservation is a strong argument for function. It implies that >90% of our genome is junk.

Here's how Parrington dismisses this argument ...
John Mattick and Marcel Dinger ... wrote an article for the HUGO Journal, official journal of the Human Genome Organisation, entitled "The extent of functionality in the human genome." ... In response to the accusation that the apparent lack of sequence conservation of 90 per cent of the genome means that it has no function, Mattick and Dinger argued that regulatory elements and noncoding RNAs are much more relaxed in their link between structure and function, and therefore much harder to detect by standard measures of function. This could mean that 'conservation is relative', depending on the type of genomic structure being analyzed.
In other words, a large part of our genome (~70%?) could be producing functional regulatory RNAs whose sequence is irrelevant to their biological function. Parrington then writes a full page on Mattick's idea that the genome is full of genes for regulatory RNAs.

The idea that 90% of our genome is not conserved deserves far more serious treatment. In the next chapter (Chapter 7), Parrington discusses the role of RNA in forming a "scaffold" to organize DNA in three dimensions. He notes that ...
That such RNAs, by virtue of their sequence but also their 3D shape, can bind DNA, RNA, and proteins, makes them ideal candidates for such a role.
But if the genes for these RNAs make up a significant part of the genome then that means that some of their sequences are important for function. That has genetic load implications and also implications about conservation.

If it's not a "significant" fraction of the genome then Parrington should make that clear to his readers. He knows that 90% of our genome is not conserved, even between individuals (page 142), and he should know that this is consistent with genetic load arguments. However, almost all of his main arguments against junk DNA require that the extra DNA have a sequence-specific function. Those facts are not compatible. Here's how he justifies his position ...
Those proposing a higher figure [for functional DNA] believe that conservation is an imperfect measure of function for a number of reasons. One is that since many non-coding RNAs act as 3D structures, and because regulatory DNA elements are quite flexible in their sequence constraints, their easy detection by sequence conservation methods will be much more difficult than for protein-coding regions. Using such criteria, John Mattick and colleagues have come up with much higher figures for the amount of functionality in the genome. In addition, many epigenetic mechanisms that may be central for genome function will not be detectable through a DNA sequence comparison since they are mediated by chemical modifications of the DNA and its associated proteins that do not involve changes in DNA sequence. Finally, if genomes operate as 3D entities, then this may not be easily detectable in terms of sequence conservation.
This book would have been much better if Parrington had put some numbers behind his speculations. How much of the genome is responsible for making functional non-coding RNAs and how much of that should be conserved in one way or another? How much of the genome is devoted to regulatory sequences and what kind of sequence conservation is required for functionality? How much of the genome is required for "epigenetic mechanisms" and how do they work if the DNA sequence is irrelevant?

You can't argue this way. More than 90% of our genome is not conserved—not even between individuals. If a good bit of that DNA is, nevertheless, functional, then those functions must not have anything to do with the sequence of the genome at those specific sites. Thus, regions that specify non-coding RNAs, for example, must perform their function even though all the base pairs can be mutated. Same for regulatory sequences—the actual sequence of these regulatory sequences isn't conserved according to John Parrington. This requires a bit more explanation since it flies in the face of what we know about function and regulation.

Finally, if you are going to use bulk DNA arguments to get around the conflict then tell us how much of the genome you are attributing to formation of "3D entities." Is it 90%? 70%? 50%?


John Parrington discusses pseudogenes and broken genes

We are discussing Five Things You Should Know if You Want to Participate in the Junk DNA Debate and how they are described in John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the fourth of five posts.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk (this post)
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

4. Pseudogenes and broken genes are junk

Parrington discusses pseudogenes at several places in the book. For example, he mentions on page 72 that both Richard Dawkins and Ken Miller have used the existence of pseudogenes as an argument against intelligent design. But, as usual, he immediately alerts his readers to other possible explanations ...
However, using the uselessness of so much of the genome for such a purpose is also risky, for what if the so-called junk DNA turns out to have an important function, but one that hasn't yet been identified.
This is a really silly argument. We know what genes look like and we know what broken genes look like. There are about 20,000 former protein-coding pseudogenes in the human genome. Some of them arose recently following a gene duplication or insertion of a cDNA copy. Some of them are ancient and similar pseudogenes are found at the same locations in other species. They accumulate mutations at a rate consistent with neutral theory and random genetic drift. (This is a demonstrated fact.)

It's ridiculous to suggest that a significant proportion of those pseudogenes might have an unknown important function. That doesn't rule out a few exceptions but, as a general rule, if it looks like a broken gene and acts like a broken gene, then chances are pretty high that it's a broken gene.

As usual, Parrington doesn't address the big picture. Instead he resorts to the standard ploy of junk DNA proponents by emphasizing the exceptions. He devotes more than two full pages (pages 143-144) to evidence that some pseudogenes have acquired a secondary function.
The potential pitfalls of writing off elements in the genome as useless or parasitical has been demonstrated by a recent consideration of the role of pseudogenes. ... recent studies are forcing a reappraisal of the functional role of these 'duds'.
Do you think his readers understand that even if every single broken gene acquired a new function that would still only account for less than 2% of the genome?
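The arithmetic behind that "less than 2%" figure is easy to check. Here's a back-of-envelope sketch; the average pseudogene length is an assumed, illustrative value (roughly the size of a typical mRNA-derived copy), not a measured one:

```python
# Back-of-envelope check: what fraction of the genome could
# 20,000 former protein-coding pseudogenes account for?
# avg_pseudogene_len is an assumption chosen for illustration.

genome_size = 3.2e9          # human genome size in base pairs
num_pseudogenes = 20_000     # former protein-coding pseudogenes
avg_pseudogene_len = 3_000   # assumed average length in bp

fraction = num_pseudogenes * avg_pseudogene_len / genome_size
print(f"{fraction:.1%}")     # -> 1.9%
```

Even if you double the assumed average length, pseudogenes still account for only a few percent of the genome, which is the point Parrington's readers never get to see.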

There's a whole chapter dedicated to "The Jumping Genes" (Chapter 8). Parrington notes that 45% of our genome is composed of transposons (page 119). What are they doing in our genome? They could just be parasites (selfish DNA), which he equates with junk. However, Parrington prefers the idea that they serve as sources of new regulatory elements and they are important in controlling responses to environmental pressures. They are also important in evolution.

As usual, there's no discussion about what fraction of the genome is functional in this way but the reader is left with the impression that most of that 45% may not be junk or parasites.

Most Sandwalk readers know that almost all of the transposon-related sequences are bits and pieces of transposons that haven't been active for millions of years. They are pseudogenes. They look like broken transposon genes, they act like broken genes, and they evolve like broken transposons. It's safe to assume that's what they are. This is junk DNA and it makes up almost half of our genome.

John Parrington never mentions this nasty little fact. He leaves his readers with the impression that 45% of our genome consists of active transposons jumping around in our genome. I assume that this is what he believes to be true. He has not read the literature.

Chapter 9 is about epigenetics. (You knew it was coming, didn't you?) Apparently, epigenetic changes can make the genome more amenable to transposition. This opens up possible functional roles for transposons.
As we've seen, stress may enhance transposition and, intriguingly, this seems to be linked to changes in the chromatin state of the genome, which permits repressed transposons to become active. It would be very interesting if such a mechanism constituted a way for the environment to make a lasting, genetic mark. This would be in line with recent suggestions that an important mechanism of evolution is 'genome resetting'—the periodic reorganization of the genome by newly mobile DNA elements, which establishes new genetic programs in embryo development. New evidence suggests that such a mechanism may be a key route whereby new species arise, and may have played an important role in the evolution of humans from apes. This is very different from the traditional view of evolution being driven by the gradual accumulation of mutations.
It was at this point, on page 139, that I realized I was dealing with a scientist who was in way over his head.

Parrington returns to this argument several times in his book. For example, in Chapter 10 ("Code, Non-code, Garbage, and Junk") he says ....
These sequences [transposons] are assumed to be useless, and therefore their rate of mutation is taken to represent a 'neutral' reference; however, as John Mattick and his colleague Marcel Dinger of the Garvan Institute have pointed out, a flaw in such reasoning is 'the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty, are largely non-functional'. In fact, as we saw in Chapter 8, there is increasing evidence that while transposons may start off as molecular parasites, they can also play a role in the creation of new regulatory elements, non-coding RNAs, and other such important functional components of the genome. It is this that has led John Stamatoyannopoulos to conclude that 'far from being an evolutionary dustbin, transposable elements appear to be active and lively members of the genomic regulatory community, deserving of the same level of scrutiny applied to other genic or regulatory features'. In fact, the emerging role for transposition in creating new regulatory mechanisms in the genome challenges the very idea that we can divide the genome into 'useful' and 'junk' components.
Keep in mind that active transposons represent only a tiny percentage of the human genome. About 50% of the genome consists of transposon flotsam and jetsam—bits and pieces of broken transposons. It looks like junk to me.

Why do all opponents of junk DNA argue this way without putting their cards on the table? Why don't they give us numbers? How much of the genome consists of transposon sequences that have a biological function? Is it 50%, 20%, 5%?


John Parrington and modern evolutionary theory

We are continuing our discussion of John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the third of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory (this post)
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

3. Modern evolutionary theory

You can't understand the junk DNA debate unless you've read Michael Lynch's book The Origins of Genome Architecture. That means you have to understand modern population genetics and the role of random genetic drift in the evolution of genomes. There's no evidence in Parrington's book that he has read The Origins of Genome Architecture and no evidence that he understands modern evolutionary theory. The only evolution he talks about is natural selection (Chapter 1).

Here's an example where he demonstrates adaptationist thinking and the fact that he hasn't read Lynch's book ...
At first glance, the existence of junk DNA seems to pose another problem for Crick's central dogma. If information flows in a one-way direction from DNA to RNA to protein, then there would appear to be no function for such noncoding DNA. But if 'junk DNA' really is useless, then isn't it incredibly wasteful to carry it around in our genome? After all, the reproduction of the genome that takes place during each cell division uses valuable cellular energy. And there is also the issue of packaging the approximately 3 billion base pairs of the human genome into the tiny cell nucleus. So surely natural selection would favor a situation where both genomic energy requirements and packaging needs are reduced fiftyfold?1
Nobody who understands modern evolutionary theory would ask such a question. They would have read all the published work on the issue and they would know about the limits of natural selection and why species can't necessarily get rid of junk DNA even if it seems harmful.

People like that would also understand the central dogma of molecular biology.


1. He goes on to propose a solution to this adaptationist paradox. Apparently, most of our genome consists of parasites (transposons), an idea he mistakenly attributes to Richard Dawkins' concept of The Selfish Gene. Parrington seems to have forgotten that most of the sequence of active transposons consists of protein-coding genes so it doesn't work very well as an explanation for excess noncoding DNA.

John Parrington and the C-value paradox

We are discussing John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the second of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox (this post)
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation


2. C-Value paradox

Parrington addresses this issue on page 63 by describing experiments from the late 1960s showing that there was a great deal of noncoding DNA in our genome and that only a few percent of the genome was devoted to encoding proteins. He also notes that the differences in genome sizes of similar species gave rise to the possibility that most of our genome was junk. Five pages later (page 69) he reports that scientists were surprised to find only 30,000 protein-coding genes when the sequence of the human genome was published—"... the other big surprise was how little of our genomes are devoted to protein-coding sequence."

Contradictory stuff like that makes it very hard to follow his argument. On the one hand, he recognizes that scientists have known for 50 years that only 2% of our genome encodes proteins but, on the other hand, they were "surprised" to find this confirmed when the human genome sequence was published.

He spends a great deal of Chapter 4 explaining the existence of introns and claims that "over 90 per cent of our genes are alternatively spliced" (page 66). This seems to be offered as an explanation for all the excess noncoding DNA but he isn't explicit.

In spite of the fact that genome comparisons are a very important part of this debate, Parrington doesn't return to this point until Chapter 10 ("Code, Non-code, Garbage, and Junk").

We know that the C-Value Paradox isn't really a paradox because most of the excess DNA in various genomes is junk. There isn't any other explanation that makes sense of the data. I don't think Parrington appreciates the significance of this explanation.

The examples quoted in Chapter 10 are the lungfish, with a huge genome, and the pufferfish (Fugu), with a genome much smaller than ours. This requires an explanation if you are going to argue that most of the human genome is functional. Here's Parrington's explanation ...
Yet, despite having a genome only one eighth the size of ours, Fugu possesses a similar number of genes. This disparity raises questions about the wisdom of assigning functionality to the vast majority of the human genome, since, by the same token, this could imply that lungfish are far more complex than us from a genomic perspective, while the smaller amount of non-protein-coding DNA in the Fugu genome suggests the loss of such DNA is perfectly compatible with life in a multicellular organism.

Not everyone is convinced about the value of these examples though, John Mattick, for instance, believes that organisms with a much greater amount of DNA than humans can be dismissed as exceptions because they are 'polyploid', that is, their cells have far more than the normal two copies of each gene, or their genomes contain an unusually high proportion of inactive transposons.
In other words, organisms with larger genomes seem to be perfectly happy carrying around a lot of junk DNA! What kind of an argument is that?
Mattick is also not convinced that Fugu provides a good example of a complex organism with no non-coding DNA. Instead, he points out that 89% of this pufferfish's DNA is still non-protein-coding, so the often-made claim that this is an example of a multicellular organism without such DNA is misleading.
[Mattick has been] a true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.

HUGO Award Committee
Seriously? That's the best argument he has? He and Mattick misrepresent what scientists say about the pufferfish genome—nobody claims that the entire genome encodes proteins—then they ignore the main point; namely, why do humans need so much more DNA? Is it because we are polyploid?

It's safe to say that John Parrington doesn't understand the C-value argument. We already know that Mattick doesn't understand it and neither does Jonathan Wells, who also wrote a book on junk DNA [John Mattick vs. Jonathan Wells]. I suppose John Parrington prefers to quote Mattick instead of Jonathan Wells—even though they use the same arguments—because Mattick has received an award from the Human Genome Organization (HUGO) for his ideas and Wells hasn't [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research].

For further proof that Parrington has not done his homework, I note that the Onion Test [The Case for Junk DNA: The onion test] isn't mentioned anywhere in his book. When people dismiss or ignore the Onion Test, it usually means they don't understand it. (For a spectacular example of such misunderstanding, see: Why the "Onion Test" Fails as an Argument for "Junk DNA").


Five things John Parrington should discuss if he wants to participate in the junk DNA debate

It's frustrating to see active scientists who think that most of our genome could have a biological function but who seem to be completely unaware of the evidence for junk. Most of the positive evidence for junk is decades old so there's no excuse for such ignorance.

I wrote a post in 2013 to help these scientists understand the issues: Five Things You Should Know if You Want to Participate in the Junk DNA Debate. It was based on a talk I gave at the Evolutionary Biology meeting in Chicago that year.1 Let's look at John Parrington's new book to see if he got the message [Hint: he didn't].

There's one post for each of the five issues that informed scientists need to address if they are going to write about the amount of junk in your genome.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation


1. It hasn't seemed to help very much.

John Parrington and the genetic load argument

We are discussing John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the first of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load (this post)
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation


1. Genetic load

The genetic load argument has been around for 50 years. It's why experts did not expect a huge number of genes when the genome sequence was published. It's why the sequence of most of our genome must be irrelevant from an evolutionary perspective.

This argument does not rule out bulk DNA hypotheses but it does rule out all those functions that require specific sequences in order to confer biological function. This includes the speculation that most transcripts have a function and it includes the speculation that there's a vast amount of regulatory sequence in our genome. Chapter 5 of The Deeper Genome is all about the importance of regulatory RNAs.
So, starting from a failed attempt to turn a petunia purple, the discovery of RNA interference has revealed a whole new network of gene regulation mediated by RNAs and operating in parallel to the more established one of protein regulatory factors. ... Studies have revealed that a surprising 60 per cent of miRNAs turn out to be recycled introns, with the remainder being generated from the regions between genes. Yet these were parts of the genome formerly viewed as junk. Does this mean we need a reconsideration of this question? This is an issue we will discuss in Chapter 6, in particular with regard to the ENCODE project ...
The implication here is that a substantial part of the genome is devoted to the production of regulatory RNAs. Presumably, the sequences of those RNAs are important. But this conflicts with the genetic load argument unless we're only talking about an insignificant fraction of the genome.
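The genetic load arithmetic is simple enough to sketch. The per-base mutation rate below is the commonly cited human value; the functional fractions are illustrative, and the sketch makes the simplifying (and generous) assumption that every mutation landing in sequence-sensitive functional DNA is deleterious:

```python
# Back-of-envelope genetic load sketch.
# mu is the commonly cited human germline mutation rate;
# the functional fractions are illustrative assumptions.

mu = 1.2e-8            # mutations per bp per generation
diploid_bp = 6.4e9     # diploid human genome in base pairs

new_mutations = mu * diploid_bp   # de novo mutations per zygote (~77)

for functional_fraction in (0.02, 0.10, 0.80):
    hits = new_mutations * functional_fraction
    print(f"functional fraction {functional_fraction:.0%}: "
          f"~{hits:.0f} new mutations in functional DNA per birth")
```

If only ~2% of the genome is sequence-sensitive, each newborn carries a couple of new mutations in functional DNA—a tolerable load. If 80% were sequence-sensitive, every child would carry dozens, which no species could survive. That's the conflict Parrington never confronts.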

But that's only one part of Parrington's argument against junk DNA. Here's the summary from the last Chapter ("Conclusion") ...
As we've discussed in this book, a major part of the debate about the ENCODE findings has focused on the question of what proportion of the genome is functional. Given that the two sides of this debate use quite different criteria to assess functionality it is likely that it will be some time before we have a clearer idea about who is the most correct in this debate. Yet, in framing the debate in this quantitative way, there is a danger that we might lose sight of an exciting qualitative shift that has been taking place in biology over the past decade or so. So a previous emphasis on a linear flow of information, from DNA to RNA to protein through a genetic code, is now giving way to a much more complex picture in which multiple codes are superimposed on one another. Such a viewpoint sees the gene as more than just a protein-coding unit; instead it can equally be seen as an accumulation of chemical modifications in the DNA or its associated histones, a site for non-coding RNA synthesis, or a nexus in a 3D network. Moreover, since we now know that multiple sites in the genome outside the protein-coding regions can produce RNAs, and that even many pseudo-genes are turning out to be functional, the very question of what constitutes a gene is now being challenged. Or, as Ed Weiss at the University of Pennsylvania recently put it, 'the concept of a gene is shredding.' Such is the nature of the shift that now we face the challenge of not just recognizing the true scale of this complexity, but explaining how it all comes together to make a living, functioning, human being.
I've already addressed some of the fuzzy thinking in this paragraph [The fuzzy thinking of John Parrington: The Central Dogma and The fuzzy thinking of John Parrington: pervasive transcription]. The point I want to make here is that Parrington's arguments for function in the genome require a great deal of sequence information. They all conflict with the genetic load argument.

Parrington doesn't cover the genetic load argument at all in his book. I don't know why, since it seems highly relevant. We could not survive as a species if the sequence of most of our genome was important for biological function.
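The genetic load argument is, at its core, simple arithmetic. Here's a minimal sketch of the calculation, using ballpark figures (roughly 100 new mutations per offspring per generation, and an assumed chance that a hit to functional DNA is deleterious); the exact numbers are illustrative assumptions, not values from Parrington's book.

```python
# Back-of-the-envelope genetic load calculation (illustrative numbers only).
# Assumptions: ~100 new mutations per offspring per generation (a commonly
# cited ballpark for humans), and that a mutation landing in functional,
# sequence-sensitive DNA is deleterious about 10% of the time.

MUTATIONS_PER_GENERATION = 100      # new mutations per offspring (ballpark)
P_DELETERIOUS_IF_FUNCTIONAL = 0.1   # assumed fraction of functional-DNA hits that hurt

def deleterious_per_generation(functional_fraction):
    """Expected new deleterious mutations per offspring, given the
    fraction of the genome whose sequence matters for function."""
    return (MUTATIONS_PER_GENERATION * functional_fraction
            * P_DELETERIOUS_IF_FUNCTIONAL)

# If ~10% of the genome is functional, each child carries about one new
# deleterious mutation -- a load that selection can plausibly purge.
print(deleterious_per_generation(0.10))   # -> 1.0

# If 80% were functional (the ENCODE-style claim), each child would carry
# ~8 new deleterious mutations, an unsustainable load for the species.
print(deleterious_per_generation(0.80))   # -> 8.0
```

The precise cutoff depends on the assumed mutation rate and the fraction of hits that are deleterious, but the qualitative conclusion is robust: the more of the genome you declare sequence-sensitive, the heavier the load every generation must carry.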


Thursday, July 23, 2015

The essence of modern science education

The July 16th (2015) issue of Nature has a few articles devoted to science education [An Education]. The introduction to these articles in the editorial section is worth quoting. It emphasizes two important points that I've been advocating.
  1. Evidence shows us that active learning (student-centered learning) is superior to the old memorize-and-regurgitate system with professors giving PowerPoint presentations to passive students.
  2. You must deal with student misconceptions or your efforts won't pay off.
So many people have been preaching this new way of teaching that it's truly astonishing how slowly it's being adopted. It's time to change. It's time to stop rewarding and praising professors who teach the old way and to start encouraging professors to move to the new system. Nobody says it's going to be easy.

We have professors whose main job is teaching. They should be leading the way.
One of the subjects that people love to argue about, following closely behind the ‘correct’ way to raise children, is the best way to teach them. For many, personal experience and centuries of tradition make the answer self-evident: teachers and textbooks should lay out the content to be learned, students should study and drill until they have mastered that content, and tests should be given at strategic intervals to discover how well the students have done.

And yet, decades of research into the science of learning has shown that none of these techniques is particularly effective. In university-level science courses, for example, students can indeed get good marks by passively listening to their professor’s lectures and then cramming for the exams. But the resulting knowledge tends to fade very quickly, and may do nothing to displace misconceptions that students brought with them.

Consider the common (and wrong) idea that Earth is cold in the winter because it is further from the Sun. The standard, lecture-based approach amounts to hoping that this idea can be displaced simply by getting students to memorize the correct answer, which is that seasons result from the tilt of Earth’s axis of rotation. Yet hundreds of empirical studies have shown that students will understand and retain such facts much better when they actively grapple with challenges to their ideas — say, by asking them to explain why the northern and southern hemispheres experience opposing seasons at the same time. Even if they initially come up with a wrong answer, to get there they will have had to think through what factors are important. So when they finally do hear the correct explanation, they have already built a mental scaffold that will give the answer meaning.

In this issue, prepared in collaboration with Scientific American, Nature is taking a close look at the many ways in which educators around the world are trying to implement such ‘active learning’ methods (see nature.com/stem). The potential pay-off is large — whether it is measured by the increased number of promising students who finish their degrees in science, technology, engineering and mathematics (STEM) disciplines instead of being driven out by the sheer boredom of rote memorization, or by the non-STEM students who get first-hand experience in enquiry, experimentation and reasoning on the basis of evidence.

Implementing such changes will not be easy — and many academics may question whether they are even necessary. Lecture-based education has been successful for hundreds of years, after all, and — almost by definition — today’s university instructors are the people who thrived on it.

But change is essential. The standard system also threw away far too many students who did not thrive. In an era when more of us now work with our heads, rather than our hands, the world can no longer afford to support poor learning systems that allow too few people to achieve their goals.
The old system is also wasteful because it graduates students who can't think critically and don't understand basic concepts.


Wednesday, July 22, 2015

University of Toronto Professor, teaching stream

After years of negotiation between the administration and the Faculty Association, the university has finally allowed full-time lecturers to call themselves "professors" [U of T introduces new teaching stream professorial ranks]. This brings my university into line with some other progressive universities that recognize the value of teaching.

Unfortunately, the news isn't all good. These new professors will have a qualifier attached to their titles. The new positions are: assistant professor (conditional), teaching stream; assistant professor, teaching stream; associate professor, teaching stream; and professor, teaching stream. Research and scholarly activity are an important component of these positions. The fact that the activity is in the field of pedagogy, or in the discipline in which they teach, should not make a difference.

Meanwhile, current professors will not have qualifiers such as "professor: research," or "professor: administration," or "professor: physician," or "professor: mostly teaching."

The next step is to increase the status of these new professors by making searches more rigorous and more competitive, by keeping salaries competitive with those of other professors in the university, and by insisting on high-quality research and scholarly activity in the field of pedagogy. The new professors will have to establish a national and international reputation in their field, just like other professors. They will have to publish in the pedagogical literature. They are not just lecturers. Almost all of them can do this if they are given the chance.

Some departments have to change the way they treat the new professors. The University of Toronto Faculty Association (UTFA) has published a guideline: Teaching Stream Workload. Here's the part on research and scholarly activity ....
  • In section 7.2, the WLPP offers the following definition of scholarship: “Scholarship refers to any combination of discipline-based scholarship in relation to or relevant to the field in which the faculty member teaches, the scholarship of teaching and learning, and creative/professional activities. Teaching stream faculty are entitled to reasonable time for pedagogical/professional development in determining workload.”
  • It is imperative that teaching stream faculty have enough time in their schedules, that is, enough “space” in their appointments, to allow for the “continued pedagogical/professional development” that the appointments policy (PPAA) calls for. Faculty teaching excessive numbers of courses or with excessive administrative loads will not have the time to engage in scholarly activity. Remember that UTFA fought an Association grievance to win the right for teaching stream faculty to “count” their discipline-based scholarship. That scholarship “counts” in both PTR review and review for promotion to senior lecturer.
And here's a rule that many departments disobey ...
Under 4.1, the WLPP reminds us of a Memorandum of Agreement workload protection: “faculty will not be required to teach in all three terms, nor shall they be pressured to volunteer to do so.” Any faculty member who must teach in all three terms should come to see UTFA.


Tuesday, July 21, 2015

The two mistakes of Kirk Durston

Kirk Durston thinks he's discovered a couple of mistakes made by people who debate evolution vs creationism [Microevolution versus Macroevolution: Two Mistakes].
I often observe that in discussions of evolution, both evolution skeptics and those who embrace neo-Darwinian evolution are prone to make one of two significant mistakes. Both stem from a failure to distinguish between microevolution and macroevolution.
Let's see how Durston defines these terms.

Debating Darwin's Doubt

Today is the day that John Scopes was found guilty in Dayton, Tennessee (USA) 90 years ago. The Intelligent Design Creationists have marked the day with publication of a new book called Debating Darwin's Doubt [A Scientific Controversy That Can No Longer Be Denied: Here Is Debating Darwin's Doubt].

The book was necessary because there has been so much criticism of Stephen Meyer's original book, Darwin's Doubt. David Klinghoffer has an interesting way of turning this defeat into a victory; he declares,
... the new book is important because it puts to rest a Darwinian myth, an icon of the evolution debate, namely...that there is no debate, about evolution or intelligent design!

The creationism continuum

Intelligent Design Creationists often get upset when I refer to them as creationists. They think that the word "creationist" has only one meaning; namely, a person who believes in the literal truth of Genesis in the Judeo-Christian Bible. The fact that this definition applies to many (most?) intelligent design advocates is irrelevant to them since they like to point out that many ID proponents are not biblical literalists.

There's another definition of "creationist" that's quite different and just as common throughout the world. We've been describing this other definition to ID proponents for over two decades but they refuse to listen. We've been explaining why it's quite legitimate to refer to them as Intelligent Design Creationists but there's hardly any evidence that they are paying attention. This isn't really a surprise.

Sunday, July 19, 2015

God Only Knows

God Only Knows is one of my favorite pop songs.1 It's from the Pet Sounds album by the Beach Boys (1966).

Experts have admired Brian Wilson and the Beach Boys for decades but most people have forgotten (or never knew) about their best songs. (Good Vibrations was released as a single at the same time as Pet Sounds.)

I haven't yet seen the movie about Brian Wilson (Love & Mercy).

The first video is a BBC production from 2014 paying tribute to (and featuring) Brian Wilson. The second video is from 1966.





1. I will delete any snarky comments about God and atheism.

The fuzzy thinking of John Parrington: pervasive transcription

Opponents of junk DNA usually emphasize the point that they were surprised when the draft human genome sequence was published in 2001. They expected about 100,000 genes but the initial results suggested fewer than 30,000 (the final number is about 25,000).1 They were surprised because they had not kept up with the literature on the subject and had not been paying attention when the sequence of chromosome 22 was published in 1999 [see Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome].

The experts were expecting about 30,000 genes and that's what the genome sequence showed. Normally this wouldn't be such a big deal. Those who were expecting a large number of genes would just admit that they were wrong and they hadn't kept up with the literature over the past 30 years. They should have realized that discoveries in other species and advances in developmental biology had reinforced the idea that mammals only needed about the same number of genes as other multicellular organisms. Most of the differences are due to regulation. There was no good reason to expect that humans would need a huge number of extra genes.

That's not what happened. Instead, opponents of junk DNA insist that the complexity of the human genome cannot be explained by such a low number of genes. There must be some other explanation to account for the missing genes. This sets the stage for at least seven different hypotheses that might resolve The Deflated Ego Problem. One of them is the idea that the human genome contains thousands and thousands of nonconserved genes for various regulatory RNAs. These are the missing genes and they account for a lot of the "dark matter" of the genome—sequences that were thought to be junk.

Here's how John Parrington describes it on page 91 of his book.
The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts having importance, many were found only in specific cellular compartments, indicating that they have fixed addresses where they operate. Surely there could hardly be a greater divergence from Crick's central dogma than this demonstration that RNAs were produced in far greater numbers across the genome than could be expected if they were simply intermediates between DNA and protein. Indeed, some ENCODE researchers argued that the basic unit of transcription should now be considered as the transcript. So Stamatoyannopoulos claimed that 'the project has played an important role in changing our concept of the gene.'
This passage illustrates my difficulty in coming to grips with Parrington's logic in The Deeper Genome. Just about every page contains statements that are either wrong or misleading, and when he strings them together they lead to a fundamentally flawed conclusion. In order to critique the main point, you have to correct each of the so-called "facts" that he gets wrong. This is very tedious.

I've already explained why Parrington is wrong about the Central Dogma of Molecular Biology [John Avise doesn't understand the Central Dogma of Molecular Biology]. His readers don't know that he's wrong so they think that the discovery of noncoding RNAs is a revolution in our understanding of biochemistry—a revolution led by the likes of John A. Stamatoyannopoulos in 2012.

The reference in the book to the statement by Stamatoyannopoulos is from the infamous Elizabeth Pennisi article, "ENCODE Project Writes Eulogy for Junk DNA" (Pennisi, 2012). Here's what she said in that article ...
As a result of ENCODE, Gingeras and others argue that the fundamental unit of the genome and the basic unit of heredity should be the transcript—the piece of RNA decoded from DNA—and not the gene. “The project has played an important role in changing our concept of the gene,” Stamatoyannopoulos says.
I'm not sure what concept of a gene these people had before 2012. It appears that John Parrington is under the impression that genes are units that encode proteins and maybe that's what Pennisi and Stamatoyannopoulos thought as well.

If so, then perhaps the publicity surrounding ENCODE really did change their concept of a gene, but all that proves is that they were remarkably uninformed before 2012. Intelligent biochemists have known for decades that the best definition of a gene is "a DNA sequence that is transcribed to produce a functional product."2 In other words, we have been defining a gene in terms of transcripts for 45 years [What Is a Gene?].

This is just another example of wrong and misleading statements that will confuse readers. If I were writing a book I would say, "The human genome sequence confirmed the predictions of the experts that there would be no more than 30,000 genes. There's nothing in the genome sequence or the ENCODE results that has any bearing on the correct understanding of the Central Dogma and there's nothing that changes the correct definition of a gene."

You can see where John Parrington's thinking is headed. Apparently, Parrington is one of those scientists who were completely unaware of the fact that genes could specify functional RNAs and completely unaware of the fact that Crick knew this back in 1970 when he tried to correct people like Parrington. Thus, Parrington and his colleagues were shocked to learn that the human genome had only 25,000 genes and that many of them didn't encode proteins. Instead of realizing that his view was wrong, he thinks that the ENCODE results overthrew those old definitions and changed the way we think about genes. He tries to convince his readers that there was a revolution in 2012.

Parrington seems to be vaguely aware of the idea that most pervasive transcription is due to noise or junk RNA. However, he gives his readers no explanation of the reasoning behind such a claim. Spurious transcription is predicted because we understand the basic concept of transcription initiation. We know that promoters and transcription factor binding sites are short sequences, and we know that they HAVE to occur at high frequency in large genomes just by chance. This is not just speculation. [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA]
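The claim that short binding sites must occur frequently by chance can be checked with a quick probability sketch. The numbers below are illustrative assumptions (a ~3.2-Gb haploid genome, an 8-bp recognition site, equal base frequencies), not measurements:

```python
# Rough expectation for how often one specific short transcription-factor
# binding motif appears by chance in a large genome. Assumptions: a 3.2-Gb
# haploid genome, a uniform-base-composition model, and scanning both strands.

GENOME_SIZE = 3_200_000_000   # bp, approximate human haploid genome

def expected_chance_occurrences(motif_length, genome_size=GENOME_SIZE):
    """Expected number of exact matches to one specific motif of the
    given length, counting both strands, under a uniform-base model."""
    per_site = 0.25 ** motif_length             # chance of a match at one position
    positions = genome_size - motif_length + 1  # windows on one strand
    return 2 * positions * per_site             # both strands

# An 8-bp site is expected to occur roughly 100,000 times by chance alone.
print(round(expected_chance_occurrences(8)))   # -> 97656
```

Real binding sites tolerate some mismatches, which only inflates the count, so the vast majority of places where a transcription factor can bind in a genome this size are there by chance, not by design.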

If our understanding of transcription initiation is correct, then all you need is an activator transcription factor binding site near something that's compatible with a promoter sequence. Any given cell type will contain a number of such factors and they must bind to a large number of nonfunctional sites in a large genome. Many of these will cause occasional transcription, giving rise to low-abundance junk RNA. (Most of the ENCODE transcripts are present at less than one copy per cell.)

Different tissues will have different transcription factors. Thus, the low-abundance junk RNAs must exhibit tissue specificity if our prediction is correct. Parrington and the ENCODE workers seem to think that the cell specificity of these low-abundance transcripts is evidence of function. It isn't; it's exactly what you expect of spurious transcription. Parrington and the ENCODE leaders don't understand the scientific literature on transcription initiation and transcription factor binding sites.

It takes me an entire blog post to explain the flaws in just one paragraph of Parrington's book. The whole book is like this. The only thing it has going for it is that it's better than Nessa Carey's book [Nessa Carey doesn't understand junk DNA].


1. There are about 20,000 protein-encoding genes and an unknown number of genes specifying functional RNAs. I'm estimating that there are about 5,000 but some people think there are many more.

2. No definition is perfect. My point is that defining a gene as a DNA sequence that encodes a protein is something that should have been purged from textbooks decades ago. Any biochemist who ever thought seriously enough about the definition to bring it up in a scientific paper should be embarrassed to admit that they ever believed such a ridiculous definition.

Pennisi, E. (2012) "ENCODE Project Writes Eulogy for Junk DNA." Science 337: 1159-1161. [doi:10.1126/science.337.6099.1159]