
Friday, July 31, 2015

For the King - Teaser Trailer

This is the game my son, Gordon Moran, and his friends at IronOak Games are developing. Please send him lots of money when Kickstarter is activated in September.

I'm buying a university and a professor character for the game. The professor will battle the forces of evil and superstition. Ms. Sandwalk is contributing enough for a medieval faire with lots of games where you can win prizes.

Find out more at

Thursday, July 30, 2015

The next step in genomics

The draft sequence of the human genome was published in 2001. The "finished" version was published a few years later but annotation continues.

A massive amount of data on complex genomes has been published, especially on the human genome. The next step is to decide what this data means. Here are the most important questions from my perspective.

An accommodationist defends the science of the Pope in the journal Nature

I don't think scientific journals or scientific organizations should take a position on the conflict between science and religion but that doesn't mean they should stay away from the subject altogether. The journal Nature has just (July 28, 2015) published a defense of accommodationism written by David M. Lodge [Faith and science can find common ground]. Lodge describes himself as a "Protestant ecologist embedded for 30 years in a Roman Catholic university." The Catholic University is Notre Dame [see David M. Lodge].

His main argument is that the current Pope understands the science of the environment and has spoken out in favor of protecting the environment. David Lodge thinks this represents an accommodation between science and religion.

Wednesday, July 29, 2015

Michael Lynch on modern evolutionary theory

Of the Five Things You Should Know if You Want to Participate in the Junk DNA Debate, the most difficult to explain is "Modern Evolutionary Theory." Most scientists think they understand evolution well enough to engage in the debate about junk DNA. However, sooner or later they will mention that junk DNA should have been deleted by selection if it ever existed. You can see that their worldview leads them to believe that everything in biology has an adaptive function.

It's been a few years since I posted Michael Lynch's scathing comments on panadaptationism and how it applies to understanding genomes [Michael Lynch on Adaptationism and A New View of Evolution]. You're in for a treat today.

Here's what you need to know about evolution in order to discuss junk DNA. The first quotation is from the preface to The Origins of Genome Architecture (pages xiii-xiv). The second set of quotations is from the last chapter (page 366 and pages 368-369).
Contrary to popular belief, evolution is not driven by natural selection alone. Many aspects of evolutionary change are indeed facilitated by natural selection, but all populations are influenced by nonadaptive forces of mutation, recombination, and random genetic drift. These additional forces are not simple embellishments around a primary axis of selection, but are quite the opposite—they dictate what natural selection can and cannot do. Although this basic principle has been known for a long time, it is quite remarkable that most biologists continue to interpret nearly every aspect of biodiversity as an outcome of adaptive processes. This blind acceptance of natural selection as the only force relevant to evolution has led to a lot of sloppy thinking, and is probably the primary reason why evolution is viewed as a soft science by much of society.

A central point to be explained in this book is that most aspects of evolution at the genome level cannot be fully explained in adaptive terms, and moreover, that many features could not have emerged without a near-complete disengagement of the power of natural selection. This contention is supported by a wide array of comparative data, as well as by well-established principles of population genetics. However, even if such support did not exist, there is an important reason for pursuing nonadaptive (neutral) models of evolution. If one wants to confidently invoke a specific adaptive scenario to explain an observed pattern of comparative data, then an ability to reject a hypothesis based entirely on the nonadaptive forces of evolution is critical.

The blind worship of natural selection is not evolutionary biology. It is arguably not even science.

Michael Lynch
Despite the tremendous theoretical and physical resources now available, the field of evolutionary biology continues to be widely perceived as a soft science. Here I am referring not to the problems associated with those pushing the view that life was created by an intelligent designer, but to a more significant internal issue: a subset of academics who consider themselves strong advocates of evolution but who see no compelling reason to probe the substantial knowledge base of the field. Although this is a heavy charge, it is easy to document. For example, in his 2001 presidential address to the Society for the Study of Evolution, Nick Barton presented a survey that demonstrated that about half of the recent literature devoted to evolutionary issues is far removed from mainstream evolutionary biology.

With the possible exception of behavior, evolutionary biology is treated unlike any other science. Philosophers, sociologists, and ethicists expound on the central role of evolutionary theory in understanding our place in the world. Physicists excited about biocomplexity and computer scientists enamored with genetic algorithms promise a bold new understanding of evolution, and similar claims are made in the emerging field of evolutionary psychology (and its derivatives in political science, economics, and even the humanities). Numerous popularizers of evolution, some with careers focused on defending the teaching of evolution in public schools, are entirely satisfied that a blind adherence to the Darwinian concept of natural selection is a license for such activities. A commonality among all these groups is the near-absence of an appreciation of the most fundamental principles of evolution. Unfortunately, this list extends deep within the life sciences.


... the uncritical acceptance of natural selection as an explanatory force for all aspects of biodiversity (without any direct evidence) is not much different than invoking an intelligent designer (without any direct evidence). True, we have actually seen natural selection in action in a number of well-documented cases of phenotypic evolution (Endler 1986; Kingsolver et al. 2001), but it is a leap to assume that selection accounts for all evolutionary change, particularly at the molecular and cellular levels. The blind worship of natural selection is not evolutionary biology. It is arguably not even science. Natural selection is just one of several evolutionary mechanisms, and the failure to realize this is probably the most significant impediment to a fruitful integration of evolutionary theory with molecular, cellular, and developmental biology.

It should be emphasized here that the sins of panselectionism are by no means restricted to developmental biology, but simply follow the tradition embraced by many areas of evolutionary biology itself, including paleontology and evolutionary ecology (as cogently articulated by Gould and Lewontin in 1979). The vast majority of evolutionary biologists studying morphological, physiological, and/or behavioral traits almost always interpret the results in terms of adaptive mechanisms, and they are so convinced of the validity of this approach that virtually no attention is given to the null hypothesis of neutral evolution, despite the availability of methods to do so (Lande 1976; Lynch and Hill 1986; Lynch 1994). For example, in a substantial series of books addressed to the general public, Dawkins (e.g., 1976, 1986, 1996, 2004) has deftly explained a bewildering array of observations in terms of hypothetical selection scenarios. Dawkins's effort to spread the gospel of the awesome power of natural selection has been quite successful, but it has come at the expense of reference to any other mechanisms, and because more people have probably read Dawkins than Darwin, his words have in some ways been profoundly misleading. To his credit, Gould, who is also widely read by the general public, frequently railed against adaptive storytelling, but it can be difficult to understand what alternative mechanisms of evolution Gould had in mind.

Tuesday, July 28, 2015

I never expected this!

David Klinghoffer writes at Evolution News & Views (sic): In The New Yorker, Tom Wolfe Compares Persecution of Intelligent Design Advocates to the "Spanish Inquisition".
Interviewed by The New Yorker earlier this year, the great novelist and journalist Tom Wolfe acknowledged that he's writing a book about evolution -- actually, "a history of the theory of evolution from the nineteenth century to the present." No indication of what his overall thesis might be, but he "invokes the Spanish Inquisition when discussing how academics have cast out proponents of intelligent design for 'not believing in evolution the right way.'"

On the total length of all DNA molecules on the planet

If you were to line up all the DNA molecules from all the individuals in all the species on Earth, how long would it be? This is a kind of "Fermi question" or "Fermi problem." You should be able to estimate an answer based on what you know and reasonable assumptions.

Michael Lynch has a crude estimate in his book The Origins of Genome Architecture. Without reading the book, can you come up with an estimate of your own? Is it larger than the circumference of the Earth? Larger than the distance to Pluto? Longer than the distance to the nearest star (other than the sun) or the center of the galaxy? Would the string of DNA molecules stretch to the nearest large galaxy (Andromeda)? Or, would it be even longer than that?

In case you've forgotten everything you once knew about the structure of DNA, here's a brief refresher: The Three-Dimensional Structure of DNA.

You may assume that all of the DNA molecules are in the standard B-form with the dimensions shown in the figure.

I will not accept any answers in archaic measurement units like leagues, miles, yojana, or cubits.
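One way to set up such an estimate is to multiply a number of cells by a genome size and by the rise per base pair of B-DNA. The Python sketch below does only that. The cell count (about 10^30 prokaryotic cells) and the genome size (3 million base pairs, one genome per cell) are assumed inputs chosen for illustration, not Lynch's figures, and eukaryotic and viral DNA are ignored. The 0.34 nm rise per base pair is the standard B-form value, and the reference distances are rounded.

```python
# Fermi estimate of the total length of DNA on Earth.
# ASSUMPTIONS (illustrative only, not taken from Lynch's book):
#   - about 1e30 prokaryotic cells on the planet
#   - about 3e6 base pairs per genome, one genome per cell
#   - eukaryotic and viral DNA are ignored

RISE_PER_BP_M = 0.34e-9   # B-DNA: 0.34 nm per base pair
LIGHT_YEAR_M = 9.461e15   # metres in one light-year

ASSUMED_CELLS = 1e30
ASSUMED_BP_PER_CELL = 3e6


def total_dna_length_m(cells, bp_per_cell):
    """Length in metres of all genomes laid end to end."""
    return cells * bp_per_cell * RISE_PER_BP_M


# Rounded reference distances in metres.
LANDMARKS_M = {
    "circumference of the Earth": 4.0e7,
    "distance from the Sun to Pluto": 5.9e12,
    "distance to the nearest star": 4.24 * LIGHT_YEAR_M,
    "distance to the centre of the galaxy": 2.6e4 * LIGHT_YEAR_M,
    "distance to Andromeda": 2.5e6 * LIGHT_YEAR_M,
}

length_m = total_dna_length_m(ASSUMED_CELLS, ASSUMED_BP_PER_CELL)
length_ly = length_m / LIGHT_YEAR_M
print(f"estimated length: {length_m:.1e} m ({length_ly:.1e} light-years)")
for name, distance_m in LANDMARKS_M.items():
    verdict = "longer" if length_m > distance_m else "shorter"
    print(f"{verdict} than the {name}")
```

Changing either assumed input by a factor of ten changes the result by the same factor, so only the order of magnitude should be taken seriously.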

Readings from Trends in Biochemical Sciences on the Central Dogma

I'm re-reading The Inside Story edited by Jan Witkowski, the former editor-in-chief of Trends in Biochemical Sciences (TIBS). The book is a collection of essays that appeared in the journal. The collection centers around "the theme of the Central Dogma of molecular biology." Here's how Jan Witkowski describes the collection in the preface (page xii)...
When I came to look more closely, it was clear that the area the articles covered most comprehensively, where the most interesting selection could be made, was the Central Dogma, that is DNA, RNA, and protein synthesis. And the number of relevant articles was just right for the size of book we had in mind.
This explains the subtitle of the book, "DNA to RNA to Protein."

This is not going to be another complaint about misinterpretations of the Central Dogma. Quite the contrary, as we shall see.

The Foreword was written by Tim Hunt, who was the editor-in-chief from 1992 to 2000. He refers to "The General Idea."
"Jim, you might say, had it first. DNA makes RNA makes protein. That became the general idea." Thus did Francis Crick explain to Horace Judson years later, long after he had written with such clarity and force on the subject of protein synthesis in the 1958 Symposium on "The Biological Replication of Macromolecules" [see Crick, 1959]. This article is celebrated for its prediction of the existence of tRNA (although by the time the article appeared in print, tRNA had been discovered), but it is chiefly worth reading and rereading, even today, for its enunciation of the two principles that together constitute the "General Idea." The first principle is the Sequence Hypothesis; the idea that the sequence of amino acids in proteins is specified by the sequence of bases in DNA and RNA. The second principle is the famous "Central Dogma"; not DNA makes RNA makes Protein, but the assertion that "Once information has passed into protein it cannot get out again." It isn't completely clear why one is a hypothesis and the other a dogma and the two together an idea. The Dogma stuck in some throats, mainly because it was called a dogma, with heavy religious overtones.
I quote Tim Hunt to show that there are some knowledgeable scientists who understand the Central Dogma [see The Central Dogma of Molecular Biology].

Hunt continues ...
Crick explains that calling it a dogma was a misunderstanding on his part: he thought the word stood for "an idea for which there was no reasonable evidence," blaming his "curious religious upbringing" for the error. But it probably wasn't that much of a mistake after all, for the Oxford Dictionary allows dogma to mean simply a principle, although the alternative "Arrogant declaration of opinion" is probably how most people who were not molecular biologists took it, considering its never modest author. That is probably how they were meant to take it, too. It was the most important article of faith among the circle of biologists centered on Watson and Crick and remained so for quite a long time until the mechanism of protein synthesis became clear. Crick said that if you did not subscribe to the sequence hypothesis and the central dogma "you generally ended up in the wilderness," although he did not offer alternative scenarios for public consumption, even though they probably played an important part in convincing him of the dogmatic status of the General Idea's second component.
This is the concept that I "grew up" with as a graduate student in the late 1960s. We saw the "General Idea" as an important concept and a way of understanding the data that was coming out of many labs working on DNA replication, transcription, and protein synthesis. We knew, especially after 1970 (Crick, 1970), that RNA could be used as a template to make DNA and that there were many types of RNA other than messenger RNA. We also knew that Francis Crick was a very smart man and it was unwise to disagree with him because he was usually right about big ideas.

Fig. 1. Information flow and the sequence hypothesis. These diagrams of potential information flow were used by Crick (1958) to illustrate all possible transfers of information (left) and those that are permitted (right). The sequence hypothesis refers to the idea that information encoded in the sequence of nucleotides specifies the sequence of amino acids in the protein.
At some point in the last 40 years the "General Idea" has been subverted in two ways.
  1. The Sequence Hypothesis has come to be interpreted as the Central Dogma. This is mostly due to Jim Watson who propagated this misinterpretation in his Molecular Biology of the Gene textbook.
  2. The Central Dogma is taken to mean that the ONLY important information in the genome is that which encodes proteins. It's assumed, incorrectly, that Crick meant to say that the role of all genes is to encode proteins.
One of the essays in The Inside Story is "Forty Years under the Central Dogma," published in 1998. The authors are Denis Thieffry and Sahotra Sarkar (Thieffry and Sarkar, 1998).

Here's how they explain some of the confusion about the Central Dogma ...
The most obvious interpretation of Crick’s original (1958) formulation of the Central Dogma is in negative terms. The Central Dogma only forbids a few types of information transfer, namely, from proteins to proteins and from proteins to nucleic acids. However, after its rapid adoption by most of the biologists interested in protein synthesis, it was most often interpreted or reformulated in a more restrictive way, constricting the flow of information from DNA to RNA and from RNA to protein (Fig. 1).

Figure 1. The Central Dogma as envisioned by Watson in 1965. ‘We should first look at the evidence that DNA itself is not the direct template that orders amino acid sequences. Instead, the genetic information of DNA is transferred to another class of molecules, which then serve as the protein templates. These intermediate templates are molecules of ribonucleic acid (RNA)...Their relation to DNA and protein is usually summarised by the formula (often called the central dogma).’

According to Watson’s autobiography, he had already derived this ‘formula’ (Fig. 1) in 1952. In fact, such schemes were commonly entertained during the early 1950s, at least among the biologists interested in protein synthesis. ... Much more restrictive than Crick’s original statement, Watson’s formula was immediately confronted with a series of possible exceptions, some of which are mentioned below. Crick, meanwhile, remained rather cautious in his interpretation of the Central Dogma. On several occasions, he felt it necessary to come back to his original idea and explicate what he thought to be its correct interpretation. For example, in 1970, Crick devoted a paper specifically to the Central Dogma, including a diagram reportedly conceived (but not published) in 1958. [see the figure at the top of this page]
The authors recognize several challenges to the Central Dogma, at least to the version preferred by Watson. There were two discoveries in the 1960s that seemed to threaten the Central Dogma. The first was the discovery that the genetic material of some viruses (e.g. TMV) was RNA, not DNA. The second was the discovery that RNA could be copied into DNA by reverse transcriptase. This was not a problem for Crick ....
These findings prompted Crick to write his 1970 piece for Nature, in which he explicitly showed how the new facts fitted into his scheme.
It's difficult to evaluate the importance of the Central Dogma in the 21st century because so many scientists don't understand it. The incorrect version seems to mostly serve as a whipping boy to promote "new" ideas that overthrow the strawman version of the Central Dogma.

Back in 1998, the authors of this article asked Crick what he thought of the Central Dogma ...
In a recent answer to a question addressing the relevance of these challenges, Crick stated that he still believes in the value of the Central Dogma today (F.H.C. Crick, pers. commun.). However, he also acknowledges the existence of various exceptions, most of which he regards as minor. For him, the most significant exception is RNA editing. Still, according to Crick, simplifications of the Central Dogma in terms such as ‘DNA makes RNA and RNA makes protein’ were clearly inadequate from the beginning.

Crick, F.H.C. (1958) On protein synthesis. Symp. Soc. Exp. Biol. XII:138-163. [PDF]

Crick, F. (1970) Central Dogma of Molecular Biology. Nature 227, 561-563. [PDF file]

Thieffry, D. and Sarkar, S. (1998) "Forty years under the central dogma." Trends in Biochemical Sciences 23:312–316. [doi: 10.1016/S0968-0004(98)01244-4]

Monday, July 27, 2015

More confusion about the central dogma of molecular biology

I was doing some reading on lncRNAs (long non-coding RNAs) in order to find out how many of them had been assigned real biological functions. My reading was prompted by one of the latest updates to the human genome sequence; namely, assembly GRCh38.p3 from June 2015. The Ensembl website lists 14,889 lncRNA genes but I'm sure that most of these are just speculative [Ensembl Whole Genome].

The latest review by my colleagues here in the biochemistry department at the University of Toronto (Toronto, Canada) concludes that only a small fraction of these putative lncRNAs have a function (Palazzo and Lee, 2015). They point out that in the absence of evidence for function, the null hypothesis is that these RNAs are junk and the genes don't exist. That's not the view that annotators at Ensembl take.

I stumbled across a paper by Ling et al. (2015) that tries to make a case for function. I don't think their case is convincing but that's not what I want to discuss. I want to discuss their view of the Central Dogma of Molecular Biology. Here's the abstract ...
The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer.
This is getting to be a familiar refrain. I understand how modern scientists might be confused about the difference between the Watson and the Crick versions of the Central Dogma [see The Central Dogma of Molecular Biology]. Many textbooks perpetuate the myth that Crick's sequence hypothesis is actually the Central Dogma. That's bad enough but lots of researchers seem to think that their false view of the Central Dogma goes even further. They think it means that the ONLY kind of genes in your genome are those that produce mRNA and protein.

I don't understand how such a ridiculous notion could arise but it must be a common misconception, otherwise why would these authors think that non-coding RNAs are a challenge to the Central Dogma? And why would the reviewers and editors think this was okay?

I'm pretty sure that I've interpreted their meaning correctly. Here are the opening sentences of the introduction to their paper ...
The Encyclopedia of DNA Elements (ENCODE) project has revealed that at least 75% of the human genome is transcribed into RNAs, while protein-coding genes comprise only 3% of the human genome. Because of a long-held protein-centered bias, many of the genomic regions that are transcribed into non-coding RNAs (ncRNAs) had been viewed as ‘junk’ in the genome, and the associated transcription had been regarded as transcriptional ‘noise’ lacking biological meaning.
They think that the Central Dogma is a "protein-centered bias." They think the Central Dogma rules out genes that specify noncoding RNAs. (Like tRNA and ribosomal RNA?)

Later on they say ....
The protein-centered dogma had viewed genomic regions not coding for proteins as ‘junk’ DNA. We now understand that many lncRNAs are transcribed from ‘junk’ regions, and even those encompassing transposons, pseudogenes and simple repeats represent important functional regulators with biological relevance.
It's simply not true that scientists in the past viewed all noncoding DNA as junk, at least not knowledgeable scientists [What's in Your Genome?]. Furthermore, no knowledgeable scientists ever interpreted the Central Dogma of Molecular Biology to mean that the only functional genes in a genome were those that encoded proteins.

Apparently Ling, Vincent, Pichler, Fodde, Berindan-Neagoe, Slack, and Calin knew scientists who DID believe such nonsense. Maybe they even believed it themselves.

Judging by the frequency with which such statements appear in the scientific literature, I can only assume that this belief is widespread among biochemists and molecular biologists. How in the world did this happen? How many Sandwalk readers were taught that the Central Dogma rules out all genes for noncoding RNAs? Did you have such a protein-centered bias about the role of genes? Who were your teachers?

Didn't anyone teach you who won the Nobel Prize in 1989? Didn't you learn about snRNAs? What did you think RNA polymerases I and III were doing in the cell?

Ling, H., Vincent, K., Pichler, M., Fodde, R., Berindan-Neagoe, I., Slack, F.J., and Calin, G.A. (2015) Junk DNA and the long non-coding RNA twist in cancer genetics. Oncogene (published online January 26, 2015) [PDF]

Palazzo, A.F. and Lee, E.S. (2015) Non-coding RNA: what is functional and what is junk? Frontiers in Genetics 6: 2 (published online January 26, 2015) [Abstract]

Friday, July 24, 2015

John Parrington talks about The Deeper Genome

Here's a video from Oxford Press where you can hear John Parrington describe some of the ideas in his book: The Deeper Genome: Why there is more to the human genome than meets the eye.

John Parrington discusses genome sequence conservation

John Parrington has written a book called The Deeper Genome: Why there is more to the human genome than meets the eye. He claims that most of our genome is functional, not junk. I'm looking at how his arguments compare with Five Things You Should Know if You Want to Participate in the Junk DNA Debate.

There's one post for each of the five issues that informed scientists need to address if they are going to write about the amount of junk in your genome. This is the last one.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved (this post)
John Parrington discusses genome sequence conservation

5. Most of the genome is not conserved

There are several places in the book where Parrington addresses the issue of sequence conservation. The most detailed discussion is on pages 92-95 where he discusses the criticisms leveled by Dan Graur against ENCODE workers. Parrington notes that 9% of the human genome is conserved and recognizes that this is a strong argument for function. It implies that >90% of our genome is junk.

Here's how Parrington dismisses this argument ...
John Mattick and Marcel Dinger ... wrote an article for the HUGO Journal, official journal of the Human Genome Organisation, entitled "The extent of functionality in the human genome." ... In response to the accusation that the apparent lack of sequence conservation of 90 per cent of the genome means that it has no function, Mattick and Dinger argued that regulatory elements and noncoding RNAs are much more relaxed in their link between structure and function, and therefore much harder to detect by standard measures of function. This could mean that 'conservation is relative', depending on the type of genomic structure being analyzed.
In other words, a large part of our genome (~70%?) could be producing functional regulatory RNAs whose sequence is irrelevant to their biological function. Parrington then writes a full page on Mattick's idea that the genome is full of genes for regulatory RNAs.

The idea that 90% of our genome is not conserved deserves far more serious treatment. In the next chapter (Chapter 7), Parrington discusses the role of RNA in forming a "scaffold" to organize DNA in three dimensions. He notes that ...
That such RNAs, by virtue of their sequence but also their 3D shape, can bind DNA, RNA, and proteins, makes them ideal candidates for such a role.
But if the genes for these RNAs make up a significant part of the genome then that means that some of their sequences are important for function. That has genetic load implications and also implications about conservation.

If it's not a "significant" fraction of the genome then Parrington should make that clear to his readers. He knows that 90% of our genome is not conserved, even between individuals (page 142), and he should know that this is consistent with genetic load arguments. However, almost all of his main arguments against junk DNA require that the extra DNA have a sequence-specific function. Those facts are not compatible. Here's how he justifies his position ...
Those proposing a higher figure [for functional DNA] believe that conservation is an imperfect measure of function for a number of reasons. One is that since many non-coding RNAs act as 3D structures, and because regulatory DNA elements are quite flexible in their sequence constraints, their easy detection by sequence conservation methods will be much more difficult than for protein-coding regions. Using such criteria, John Mattick and colleagues have come up with much higher figures for the amount of functionality in the genome. In addition, many epigenetic mechanisms that may be central for genome function will not be detectable through a DNA sequence comparison since they are mediated by chemical modifications of the DNA and its associated proteins that do not involve changes in DNA sequence. Finally, if genomes operate as 3D entities, then this may not be easily detectable in terms of sequence conservation.
This book would have been much better if Parrington had put some numbers behind his speculations. How much of the genome is responsible for making functional non-coding RNAs and how much of that should be conserved in one way or another? How much of the genome is devoted to regulatory sequences and what kind of sequence conservation is required for functionality? How much of the genome is required for "epigenetic mechanisms" and how do they work if the DNA sequence is irrelevant?

You can't argue this way. More than 90% of our genome is not conserved, not even between individuals. If a good bit of that DNA is, nevertheless, functional, then those functions must not have anything to do with the sequence of the genome at those specific sites. Thus, regions that specify non-coding RNAs, for example, must perform their function even though all the base pairs can be mutated. The same goes for regulatory sequences: according to John Parrington, their actual sequence isn't conserved. This requires a bit more explanation since it flies in the face of what we know about function and regulation.

Finally, if you are going to use bulk DNA arguments to get around the conflict then tell us how much of the genome you are attributing to formation of "3D entities." Is it 90%? 70%? 50%?

John Parrington discusses pseudogenes and broken genes

We are discussing Five Things You Should Know if You Want to Participate in the Junk DNA Debate and how they are described in John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the fourth of five posts.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk (this post)
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

4. Pseudogenes and broken genes are junk

Parrington discusses pseudogenes at several places in the book. For example, he mentions on page 72 that both Richard Dawkins and Ken Miller have used the existence of pseudogenes as an argument against intelligent design. But, as usual, he immediately alerts his readers to other possible explanations ...
However, using the uselessness of so much of the genome for such a purpose is also risky, for what if the so-called junk DNA turns out to have an important function, but one that hasn't yet been identified.
This is a really silly argument. We know what genes look like and we know what broken genes look like. There are about 20,000 former protein-coding pseudogenes in the human genome. Some of them arose recently following a gene duplication or insertion of a cDNA copy. Some of them are ancient and similar pseudogenes are found at the same locations in other species. They accumulate mutations at a rate consistent with neutral theory and random genetic drift. (This is a demonstrated fact.)

It's ridiculous to suggest that a significant proportion of those pseudogenes might have an unknown important function. That doesn't rule out a few exceptions but, as a general rule, if it looks like a broken gene and acts like a broken gene, then chances are pretty high that it's a broken gene.

As usual, Parrington doesn't address the big picture. Instead he resorts to the standard ploy of junk DNA opponents by emphasizing the exceptions. He devotes more than two full pages (pages 143-144) to evidence that some pseudogenes have acquired a secondary function.
The potential pitfalls of writing off elements in the genome as useless or parasitical has been demonstrated by a recent consideration of the role of pseudogenes. ... recent studies are forcing a reappraisal of the functional role of these 'duds.'
Do you think his readers understand that even if every single broken gene acquired a new function that would still only account for less than 2% of the genome?
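The arithmetic behind that upper bound can be sketched in a few lines. The pseudogene count is the figure quoted above and the genome size is the usual approximation of 3 billion base pairs. The mean pseudogene length is a hypothetical value chosen only for illustration, not a measured average.

```python
GENOME_SIZE_BP = 3_000_000_000   # approximate size of the human genome
PSEUDOGENE_COUNT = 20_000        # figure quoted in the post
ASSUMED_MEAN_LENGTH_BP = 2_000   # hypothetical average length, for illustration only


def genome_fraction(count, mean_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Return the fraction of the genome covered by `count` elements."""
    return count * mean_length_bp / genome_size_bp


fraction = genome_fraction(PSEUDOGENE_COUNT, ASSUMED_MEAN_LENGTH_BP)
print(f"{fraction:.2%} of the genome")  # prints "1.33% of the genome"
```

Under these assumptions the average pseudogene would have to be longer than 3,000 bp before the total reached 2% of the genome.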

There's a whole chapter dedicated to "The Jumping Genes" (Chapter 8). Parrington notes that 45% of our genome is composed of transposons (page 119). What are they doing in our genome? They could just be parasites (selfish DNA), which he equates with junk. However, Parrington prefers the idea that they serve as sources of new regulatory elements and they are important in controlling responses to environmental pressures. They are also important in evolution.

As usual, there's no discussion about what fraction of the genome is functional in this way but the reader is left with the impression that most of that 45% may not be junk or parasites.

Most Sandwalk readers know that almost all of the transposon-related sequences are bits and pieces of transposons that haven't been active for millions of years. They are pseudogenes. They look like broken transposon genes, they act like broken genes, and they evolve like broken transposons. It's safe to assume that's what they are. This is junk DNA and it makes up almost half of our genome.

John Parrington never mentions this nasty little fact. He leaves his readers with the impression that 45% of our genome consists of active transposons jumping around in our genome. I assume that this is what he believes to be true. He has not read the literature.

Chapter 9 is about epigenetics. (You knew it was coming, didn't you?) Apparently, epigenetic changes can make the genome more amenable to transposition. This opens up possible functional roles for transposons.
As we've seen, stress may enhance transposition and, intriguingly, this seems to be linked to changes in the chromatin state of the genome, which permits repressed transposons to become active. It would be very interesting if such a mechanism constituted a way for the environment to make a lasting, genetic mark. This would be in line with recent suggestions that an important mechanism of evolution is 'genome resetting'—the periodic reorganization of the genome by newly mobile DNA elements, which establishes new genetic programs in embryo development. New evidence suggests that such a mechanism may be a key route whereby new species arise, and may have played an important role in the evolution of humans from apes. This is very different from the traditional view of evolution being driven by the gradual accumulation of mutations.
It was at this point, on page 139, that I realized I was dealing with a scientist who was in way over his head.

Parrington returns to this argument several times in his book. For example, in Chapter 10 ("Code, Non-code, Garbage, and Junk") he says ....
These sequences [transposons] are assumed to be useless, and therefore their rate of mutation is taken to represent a 'neutral' reference; however, as John Mattick and his colleague Marcel Dinger of the Garvan Institute have pointed out, a flaw in such reasoning is 'the questionable proposition that transposable elements, which provide the major source of evolutionary plasticity and novelty, are largely non-functional.' In fact, as we saw in Chapter 8, there is increasing evidence that while transposons may start off as molecular parasites, they can also play a role in the creation of new regulatory elements, non-coding RNAs, and other such important functional components of the genome. It is this that has led John Stamatoyannopoulos to conclude that 'far from being an evolutionary dustbin, transposable elements appear to be active and lively members of the genomic regulatory community, deserving of the same level of scrutiny applied to other genic or regulatory features.' In fact, the emerging role for transposition in creating new regulatory mechanisms in the genome challenges the very idea that we can divide the genome into 'useful' and 'junk' components.
Keep in mind that active transposons represent only a tiny percentage of the human genome. About 50% of the genome consists of transposon flotsam and jetsam—bits and pieces of broken transposons. It looks like junk to me.

Why do all opponents of junk DNA argue this way without putting their cards on the table? Why don't they give us numbers? How much of the genome consists of transposon sequences that have a biological function? Is it 50%, 20%, 5%?

John Parrington and modern evolutionary theory

We are continuing our discussion of John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the third of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory (this post)
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

3. Modern evolutionary theory

You can't understand the junk DNA debate unless you've read Michael Lynch's book The Origins of Genome Architecture. That means you have to understand modern population genetics and the role of random genetic drift in the evolution of genomes. There's no evidence in Parrington's book that he has read The Origins of Genome Architecture and no evidence that he understands modern evolutionary theory. The only evolution he talks about is natural selection (Chapter 1).

Here's an example where he demonstrates adaptationist thinking and the fact that he hasn't read Lynch's book ...
At first glance, the existence of junk DNA seems to pose another problem for Crick's central dogma. If information flows in a one-way direction from DNA to RNA to protein, then there would appear to be no function for such noncoding DNA. But if 'junk DNA' really is useless, then isn't it incredibly wasteful to carry it around in our genome? After all, the reproduction of the genome that takes place during each cell division uses valuable cellular energy. And there is also the issue of packaging the approximately 3 billion base pairs of the human genome into the tiny cell nucleus. So surely natural selection would favor a situation where both genomic energy requirements and packaging needs are reduced fiftyfold?1
Nobody who understands modern evolutionary theory would ask such a question. They would have read all the published work on the issue and they would know about the limits of natural selection and why species can't necessarily get rid of junk DNA even if it seems harmful.

People like that would also understand the central dogma of molecular biology.

1. He goes on to propose a solution to this adaptationist paradox. Apparently, most of our genome consists of parasites (transposons), an idea he mistakenly attributes to Richard Dawkins' concept of The Selfish Gene. Parrington seems to have forgotten that most of the sequence of active transposons consists of protein-coding genes so it doesn't work very well as an explanation for excess noncoding DNA.

John Parrington and the C-value paradox

We are discussing John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the second of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox (this post)
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

2. C-Value paradox

Parrington addresses this issue on page 63 by describing experiments from the late 1960s showing that there was a great deal of noncoding DNA in our genome and that only a few percent of the genome was devoted to encoding proteins. He also notes that the differences in genome sizes of similar species gave rise to the possibility that most of our genome was junk. Five pages later (page 69) he reports that scientists were surprised to find only 30,000 protein-coding genes when the sequence of the human genome was published—"... the other big surprise was how little of our genomes are devoted to protein-coding sequence."

Contradictory stuff like that makes it very hard to follow his argument. On the one hand, he recognizes that scientists have known for 50 years that only 2% of our genome encodes proteins but, on the other hand, they were "surprised" to find this confirmed when the human genome sequence was published.

He spends a great deal of Chapter 4 explaining the existence of introns and claims that "over 90 per cent of our genes are alternatively spliced" (page 66). This seems to be offered as an explanation for all the excess noncoding DNA but he isn't explicit.

In spite of the fact that genome comparisons are a very important part of this debate, Parrington doesn't return to this point until Chapter 10 ("Code, Non-code, Garbage, and Junk").

We know that the C-Value Paradox isn't really a paradox because most of the excess DNA in various genomes is junk. There isn't any other explanation that makes sense of the data. I don't think Parrington appreciates the significance of this explanation.

The examples quoted in Chapter 10 are the lungfish, with a huge genome, and the pufferfish (Fugu), with a genome much smaller than ours. This requires an explanation if you are going to argue that most of the human genome is functional. Here's Parrington's explanation ...
Yet, despite having a genome only one eighth the size of ours, Fugu possesses a similar number of genes. This disparity raises questions about the wisdom of assigning functionality to the vast majority of the human genome, since, by the same token, this could imply that lungfish are far more complex than us from a genomic perspective, while the smaller amount of non-protein-coding DNA in the Fugu genome suggests the loss of such DNA is perfectly compatible with life in a multicellular organism.

Not everyone is convinced about the value of these examples though, John Mattick, for instance, believes that organisms with a much greater amount of DNA than humans can be dismissed as exceptions because they are 'polyploid', that is, their cells have far more than the normal two copies of each gene, or their genomes contain an unusually high proportion of inactive transposons.
In other words, organisms with larger genomes seem to be perfectly happy carrying around a lot of junk DNA! What kind of an argument is that?
Mattick is also not convinced that Fugu provides a good example of a complex organism with no non-coding DNA. Instead, he points out that 89% of this pufferfish's DNA is still non-protein-coding, so the often-made claim that this is an example of a multicellular organism without such DNA is misleading.
[Mattick has been] a true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.

Hugo Award Committee
Seriously? That's the best argument he has? He and Mattick misrepresent what scientists say about the pufferfish genome—nobody claims that the entire genome encodes proteins—then they ignore the main point; namely, why do humans need so much more DNA? Is it because we are polyploid?
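A back-of-the-envelope comparison makes the point concrete. This is only a sketch: it uses the round numbers quoted in these posts (a human genome of roughly 3 billion base pairs with about 2% coding sequence, and a Fugu genome one eighth that size with 89% non-protein-coding DNA), and the function and variable names are illustrative.

```python
HUMAN_GENOME_BP = 3_000_000_000        # approximate, as quoted from Parrington
HUMAN_CODING_FRACTION = 0.02           # "only 2% of our genome encodes proteins"
FUGU_GENOME_BP = HUMAN_GENOME_BP / 8   # "one eighth the size of ours"
FUGU_CODING_FRACTION = 1 - 0.89        # Mattick: 89% is non-protein-coding


def split_genome(size_bp, coding_fraction):
    """Return (coding_bp, noncoding_bp) for a genome of the given size."""
    coding = size_bp * coding_fraction
    return coding, size_bp - coding


human_coding, human_noncoding = split_genome(HUMAN_GENOME_BP, HUMAN_CODING_FRACTION)
fugu_coding, fugu_noncoding = split_genome(FUGU_GENOME_BP, FUGU_CODING_FRACTION)

for name, coding, noncoding in (
    ("Human", human_coding, human_noncoding),
    ("Fugu", fugu_coding, fugu_noncoding),
):
    print(f"{name}: {coding / 1e6:.0f} Mb coding, {noncoding / 1e6:.0f} Mb non-coding")

print(f"Non-coding ratio (human / Fugu): {human_noncoding / fugu_noncoding:.1f}")
```

With these round numbers the two species carry a similar amount of coding sequence (tens of megabases each), while humans carry close to nine times as much non-coding DNA. That difference is what the C-value argument asks opponents of junk DNA to explain.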

It's safe to say that John Parrington doesn't understand the C-value argument. We already know that Mattick doesn't understand it and neither does Jonathan Wells, who also wrote a book on junk DNA [John Mattick vs. Jonathan Wells]. I suppose John Parrington prefers to quote Mattick instead of Jonathan Wells—even though they use the same arguments—because Mattick has received an award from the Human Genome Organization (HUGO) for his ideas and Wells hasn't [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research].

For further proof that Parrington has not done his homework, I note that the Onion Test [The Case for Junk DNA: The onion test] isn't mentioned anywhere in his book. When people dismiss or ignore the Onion Test, it usually means they don't understand it. (For a spectacular example of such misunderstanding, see: Why the "Onion Test" Fails as an Argument for "Junk DNA").

Five things John Parrington should discuss if he wants to participate in the junk DNA debate

It's frustrating to see active scientists who think that most of our genome could have a biological function but who seem to be completely unaware of the evidence for junk. Most of the positive evidence for junk is decades old so there's no excuse for such ignorance.

I wrote a post in 2013 to help these scientists understand the issues: Five Things You Should Know if You Want to Participate in the Junk DNA Debate. It was based on a talk I gave at the Evolutionary Biology meeting in Chicago that year.1 Let's look at John Parrington's new book to see if he got the message [Hint: he didn't].

There's one post for each of the five issues that informed scientists need to address if they are going to write about the amount of junk in your genome.

1. Genetic load
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

1. It hasn't seemed to help very much.

John Parrington and the genetic load argument

We are discussing John Parrington's book The Deeper Genome: Why there is more to the human genome than meets the eye. This is the first of five posts on: Five Things You Should Know if You Want to Participate in the Junk DNA Debate

1. Genetic load (this post)
John Parrington and the genetic load argument
2. C-Value paradox
John Parrington and the c-value paradox
3. Modern evolutionary theory
John Parrington and modern evolutionary theory
4. Pseudogenes and broken genes are junk
John Parrington discusses pseudogenes and broken genes
5. Most of the genome is not conserved
John Parrington discusses genome sequence conservation

1. Genetic load

The genetic load argument has been around for 50 years. It's why experts did not expect a huge number of genes when the genome sequence was published. It's why the sequence of most of our genome must be irrelevant from an evolutionary perspective.

This argument does not rule out bulk DNA hypotheses but it does rule out all those functions that require specific sequences in order to confer biological function. This includes the speculation that most transcripts have a function and it includes the speculation that there's a vast amount of regulatory sequence in our genome. Chapter 5 of The Deeper Genome is all about the importance of regulatory RNAs.
So, starting from a failed attempt to turn a petunia purple, the discovery of RNA interference has revealed a whole new network of gene regulation mediated by RNAs and operating in parallel to the more established one of protein regulatory factors. ... Studies have revealed that a surprising 60 per cent of miRNAs turn out to be recycled introns, with the remainder being generated from the regions between genes. Yet these were parts of the genome formerly viewed as junk. Does this mean we need a reconsideration of this question? This is an issue we will discuss in Chapter 6, in particular with regard to the ENCODE project ...
The implication here is that a substantial part of the genome is devoted to the production of regulatory RNAs. Presumably, the sequences of those RNAs are important. But this conflicts with the genetic load argument unless we're only talking about an insignificant fraction of the genome.

But that's only one part of Parrington's argument against junk DNA. Here's the summary from the last Chapter ("Conclusion") ...
As we've discussed in this book, a major part of the debate about the ENCODE findings has focused on the question of what proportion of the genome is functional. Given that the two sides of this debate use quite different criteria to assess functionality it is likely that it will be some time before we have a clearer idea about who is the most correct in this debate. Yet, in framing the debate in this quantitative way, there is a danger that we might lose sight of an exciting qualitative shift that has been taking place in biology over the past decade or so. So a previous emphasis on a linear flow of information, from DNA to RNA to protein through a genetic code, is now giving way to a much more complex picture in which multiple codes are superimposed on one another. Such a viewpoint sees the gene as more than just a protein-coding unit; instead it can equally be seen as an accumulation of chemical modifications in the DNA or its associated histones, a site for non-coding RNA synthesis, or a nexus in a 3D network. Moreover, since we now know that multiple sites in the genome outside the protein-coding regions can produce RNAs, and that even many pseudo-genes are turning out to be functional, the very question of what constitutes a gene is now being challenged. Or, as Ed Weiss at the University of Pennsylvania recently put it, 'the concept of a gene is shredding.' Such is the nature of the shift that now we face the challenge of not just recognizing the true scale of this complexity, but explaining how it all comes together to make a living, functioning, human being.
I've already addressed some of the fuzzy thinking in this paragraph [The fuzzy thinking of John Parrington: The Central Dogma and The fuzzy thinking of John Parrington: pervasive transcription]. The point I want to make here is that Parrington's arguments for function in the genome require a great deal of sequence information. They all conflict with the genetic load argument.

Parrington doesn't cover the genetic load argument at all in his book. I don't know why since it seems very relevant. We could not survive as a species if the sequence of most of our genome was important for biological function.
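For readers who have not seen the argument worked through, here is a minimal sketch of its standard textbook form. None of the numbers below come from Parrington's book or from this post. The number of new mutations per generation and the share of mutations in sequence-dependent DNA that are deleterious are assumed, illustrative values, and the simple exponential relationship between deleterious mutation rate and mean fitness is itself a modelling assumption.

```python
import math

ASSUMED_NEW_MUTATIONS = 100       # hypothetical: new mutations per individual per generation
ASSUMED_DELETERIOUS_SHARE = 0.5   # hypothetical: share of hits in functional DNA that are harmful


def deleterious_per_generation(new_mutations, functional_fraction, deleterious_share):
    """Expected number of new deleterious mutations per individual (U)."""
    return new_mutations * functional_fraction * deleterious_share


def mean_relative_fitness(u):
    """Mean fitness relative to a mutation-free individual, assuming
    independent (multiplicative) effects: exp(-U)."""
    return math.exp(-u)


for functional_fraction in (0.05, 0.10, 0.50, 0.90):
    u = deleterious_per_generation(
        ASSUMED_NEW_MUTATIONS, functional_fraction, ASSUMED_DELETERIOUS_SHARE
    )
    fitness = mean_relative_fitness(u)
    print(f"functional fraction {functional_fraction:.0%}: U = {u:.1f}, "
          f"mean relative fitness = {fitness:.3g}")
```

Whatever values are plugged in, the shape of the result is the same: as the sequence-dependent fraction of the genome rises, the expected number of harmful new mutations per generation rises with it and the mean fitness under this model falls off exponentially. That is why the argument constrains sequence-dependent function but says nothing against bulk DNA hypotheses.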

Thursday, July 23, 2015

The essence of modern science education

The July 16th (2015) issue of Nature has a few articles devoted to science education [An Education]. The introduction to these articles in the editorial section is worth quoting. It emphasizes two important points that I've been advocating.
  1. Evidence shows us that active learning (student-centered learning) is superior to the old memorize-and-regurgitate system with professors giving PowerPoint presentations to passive students.
  2. You must deal with student misconceptions or your efforts won't pay off.
So many people have been preaching this new way of teaching that it's truly astonishing that it's not being adopted. It's time to change. It's time to stop rewarding and praising professors who teach the old way and time to start encouraging professors to move to the new system. Nobody says it's going to be easy.

We have professors whose main job is teaching. They should be leading the way.
One of the subjects that people love to argue about, following closely behind the ‘correct’ way to raise children, is the best way to teach them. For many, personal experience and centuries of tradition make the answer self-evident: teachers and textbooks should lay out the content to be learned, students should study and drill until they have mastered that content, and tests should be given at strategic intervals to discover how well the students have done.

And yet, decades of research into the science of learning has shown that none of these techniques is particularly effective. In university-level science courses, for example, students can indeed get good marks by passively listening to their professor’s lectures and then cramming for the exams. But the resulting knowledge tends to fade very quickly, and may do nothing to displace misconceptions that students brought with them.

Consider the common (and wrong) idea that Earth is cold in the winter because it is further from the Sun. The standard, lecture-based approach amounts to hoping that this idea can be displaced simply by getting students to memorize the correct answer, which is that seasons result from the tilt of Earth’s axis of rotation. Yet hundreds of empirical studies have shown that students will understand and retain such facts much better when they actively grapple with challenges to their ideas — say, by asking them to explain why the northern and southern hemispheres experience opposing seasons at the same time. Even if they initially come up with a wrong answer, to get there they will have had to think through what factors are important. So when they finally do hear the correct explanation, they have already built a mental scaffold that will give the answer meaning.

In this issue, prepared in collaboration with Scientific American, Nature is taking a close look at the many ways in which educators around the world are trying to implement such ‘active learning’ methods (see The potential pay-off is large — whether it is measured by the increased number of promising students who finish their degrees in science, technology, engineering and mathematics (STEM) disciplines instead of being driven out by the sheer boredom of rote memorization, or by the non-STEM students who get first-hand experience in enquiry, experimentation and reasoning on the basis of evidence.

Implementing such changes will not be easy — and many academics may question whether they are even necessary. Lecture-based education has been successful for hundreds of years, after all, and — almost by definition — today’s university instructors are the people who thrived on it.

But change is essential. The standard system also threw away far too many students who did not thrive. In an era when more of us now work with our heads, rather than our hands, the world can no longer afford to support poor learning systems that allow too few people to achieve their goals.
The old system is also wasteful because it graduates students who can't think critically and don't understand basic concepts.

Wednesday, July 22, 2015

University of Toronto Professor, teaching stream

After years of negotiation between the administration and the Faculty Association, the university has finally allowed full-time lecturers to call themselves "professors" [U of T introduces new teaching stream professorial ranks]. This brings my university into line with some other progressive universities that recognize the value of teaching.

Unfortunately, the news isn't all good. These new professors will have a qualifier attached to their titles. The new positions are: assistant professor (conditional), teaching stream; assistant professor, teaching stream; associate professor, teaching stream; and professor, teaching stream. Research and scholarly activity is an important component of these positions. The fact that the activity is in the field of pedagogy or the discipline in which they teach should not make a difference.

Meanwhile, current professors will not have qualifiers such as "professor: research," or "professor: administration," or "professor: physician," or "professor: mostly teaching."

The next step is to increase the status of these new professors by making searches more rigorous and more competitive, by keeping the salaries competitive with other professors in the university, and by insisting on high quality research and scholarly activity in the field of pedagogy. The new professors will have to establish a national and international reputation in their field just like other professors. They will have to publish in the pedagogical literature. They are not just lecturers. Almost all of them can do this if they are given the chance.

Some departments have to change the way they treat the new professors. The University of Toronto Faculty Association (UTFA) has published a guideline: Teaching Stream Workload. Here's the part on research and scholarly activity ....
  • In section 7.2, the WLPP offers the following definition of scholarship: “Scholarship refers to any combination of discipline-based scholarship in relation to or relevant to the field in which the faculty member teaches, the scholarship of teaching and learning, and creative/professional activities. Teaching stream faculty are entitled to reasonable time for pedagogical/professional development in determining workload.”
  • It is imperative that teaching stream faculty have enough time in their schedules, that is, enough “space” in their appointments, to allow for the “continued pedagogical/professional development” that the appointments policy (PPAA) calls for. Faculty teaching excessive numbers of courses or with excessive administrative loads will not have the time to engage in scholarly activity. Remember that UTFA fought an Association grievance to win the right for teaching stream faculty to “count” their discipline-based scholarship. That scholarship “counts” in both PTR review and review for promotion to senior lecturer.
And here's a rule that many departments disobey ...
Under 4.1, the WLPP reminds us of a Memorandum of Agreement workload protection: “faculty will not be required to teach in all three terms, nor shall they be pressured to volunteer to do so.” Any faculty member who must teach in all three terms should come to see UTFA.

Tuesday, July 21, 2015

The two mistakes of Kirk Durston

Kirk Durston thinks he's discovered a couple of mistakes made by people who debate evolution vs creationism [Microevolution versus Macroevolution: Two Mistakes].
I often observe that in discussions of evolution, both evolution skeptics and those who embrace neo-Darwinian evolution are prone to make one of two significant mistakes. Both stem from a failure to distinguish between microevolution and macroevolution.
Let's see how Durston defines these terms.

Debating Darwin's Doubt

Today is the day that John Scopes was found guilty in Dayton, Tennessee (USA) 90 years ago. The Intelligent Design Creationists have marked the day with publication of a new book called Debating Darwin's Doubt [A Scientific Controversy That Can No Longer Be Denied: Here Is Debating Darwin's Doubt].

The book was necessary because there has been so much criticism of Stephen Meyer's original book, Darwin's Doubt. David Klinghoffer has an interesting way of turning this defeat into a victory because he declares,
... the new book is important because it puts to rest a Darwinian myth, an icon of the evolution debate, namely...that there is no debate, about evolution or intelligent design!

The creationism continuum

Intelligent Design Creationists often get upset when I refer to them as creationists. They think that the word "creationist" has only one meaning; namely, a person who believes in the literal truth of Genesis in the Judeo-Christian Bible. The fact that this definition applies to many (most?) intelligent design advocates is irrelevant to them since they like to point out that many ID proponents are not biblical literalists.

There's another definition of "creationist" that's quite different and just as common throughout the world. We've been describing this other definition to ID proponents for over two decades but they refuse to listen. We've been explaining why it's quite legitimate to refer to them as Intelligent Design Creationists but there's hardly any evidence that they are paying attention. This isn't really a surprise.

Sunday, July 19, 2015

God Only Knows

God Only Knows is one of my favorite pop songs.1 It's from the Pet Sounds album by the Beach Boys (1966).

Experts have admired Brian Wilson and the Beach Boys for decades but most people have forgotten (or never knew) about their best songs. (Good Vibrations was released as a single at the same time as Pet Sounds.)

I haven't yet seen the movie about Brian Wilson (Love & Mercy).

The first video is a BBC production from 2014 paying tribute to (and featuring) Brian Wilson. The second video is from 1966.

1. I will delete any snarky comments about God and atheism.

The fuzzy thinking of John Parrington: pervasive transcription

Opponents of junk DNA usually emphasize the point that they were surprised when the draft human genome sequence was published in 2001. They expected about 100,000 genes but the initial results suggested fewer than 30,000 (the final number is about 25,000).1 They were surprised because they had not kept up with the literature on the subject and they had not been paying attention when the sequence of chromosome 22 was published in 1999 [see Facts and Myths Concerning the Historical Estimates of the Number of Genes in the Human Genome].

The experts were expecting about 30,000 genes and that's what the genome sequence showed. Normally this wouldn't be such a big deal. Those who were expecting a large number of genes would just admit that they were wrong and they hadn't kept up with the literature over the past 30 years. They should have realized that discoveries in other species and advances in developmental biology had reinforced the idea that mammals only needed about the same number of genes as other multicellular organisms. Most of the differences are due to regulation. There was no good reason to expect that humans would need a huge number of extra genes.

That's not what happened. Instead, opponents of junk DNA insist that the complexity of the human genome cannot be explained by such a low number of genes. There must be some other explanation to account for the missing genes. This sets the stage for at least seven different hypotheses that might resolve The Deflated Ego Problem. One of them is the idea that the human genome contains thousands and thousands of nonconserved genes for various regulatory RNAs. These are the missing genes and they account for a lot of the "dark matter" of the genome, the sequences that were thought to be junk.

Here's how John Parrington describes it on page 91 of his book.
The study [ENCODE] also found that 80 per cent of the genome was generating RNA transcripts having importance, many were found only in specific cellular compartments, indicating that they have fixed addresses where they operate. Surely there could hardly be a greater divergence from Crick's central dogma than this demonstration that RNAs were produced in far greater numbers across the genome than could be expected if they were simply intermediates between DNA and protein. Indeed, some ENCODE researchers argued that the basic unit of transcription should now be considered as the transcript. So Stamatoyannopoulos claimed that 'the project has played an important role in changing our concept of the gene.'
This passage illustrates my difficulty in coming to grips with Parrington's logic in The Deeper Genome. Just about every page contains statements that are either wrong or misleading and when he strings them together they lead to a fundamentally flawed conclusion. In order to critique the main point, you have to correct each of the so-called "facts" that he gets wrong. This is very tedious.

I've already explained why Parrington is wrong about the Central Dogma of Molecular Biology [John Avise doesn't understand the Central Dogma of Molecular Biology]. His readers don't know that he's wrong so they think that the discovery of noncoding RNAs is a revolution in our understanding of biochemistry, a revolution led by the likes of John A. Stamatoyannopoulos in 2012.

The reference in the book to the statement by Stamatoyannopoulos is from the infamous Elizabeth Pennisi article, "ENCODE Project Writes Eulogy for Junk DNA" (Pennisi, 2012). Here's what she said in that article ...
As a result of ENCODE, Gingeras and others argue that the fundamental unit of the genome and the basic unit of heredity should be the transcript—the piece of RNA decoded from DNA—and not the gene. “The project has played an important role in changing our concept of the gene,” Stamatoyannopoulos says.
I'm not sure what concept of a gene these people had before 2012. It appears that John Parrington is under the impression that genes are units that encode proteins and maybe that's what Pennisi and Stamatoyannopoulos thought as well.

If so, then perhaps the publicity surrounding ENCODE really did change their concept of a gene but all that proves is that they were remarkably uninformed before 2012. Intelligent biochemists have known for decades that the best definition of a gene is "a DNA sequence that is transcribed to produce a functional product."2 In other words, we have been defining a gene in terms of transcripts for 45 years [What Is a Gene?].

This is just another example of wrong and misleading statements that will confuse readers. If I were writing a book I would say, "The human genome sequence confirmed the predictions of the experts that there would be no more than 30,000 genes. There's nothing in the genome sequence or the ENCODE results that has any bearing on the correct understanding of the Central Dogma and there's nothing that changes the correct definition of a gene."

You can see where John Parrington's thinking is headed. Apparently, Parrington is one of those scientists who were completely unaware of the fact that genes could specify functional RNAs and completely unaware of the fact that Crick knew this back in 1970 when he tried to correct people like Parrington. Thus, Parrington and his colleagues were shocked to learn that the human genome had only 25,000 genes and many of them didn't encode proteins. Instead of realizing that his view was wrong, he thinks that the ENCODE results overthrew those old definitions and changed the way we think about genes. He tries to convince his readers that there was a revolution in 2012.

Parrington seems to be vaguely aware of the idea that most pervasive transcription is due to noise or junk RNA. However, he gives his readers no explanation of the reasoning behind such a claim. Spurious transcription is predicted because we understand the basic concept of transcription initiation. We know that promoter sequences and transcription factor binding sites are short sequences and we know that they HAVE to occur at a high frequency in large genomes just by chance. This is not just speculation. [see The "duon" delusion and why transcription factors MUST bind non-functionally to exon sequences and How RNA Polymerase Binds to DNA]
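The "just by chance" part of this argument is easy to check with a back-of-the-envelope calculation. Here's a minimal sketch, not anything from Parrington or ENCODE. It assumes a genome of random sequence with equal base frequencies, a haploid size of about 3.2 billion bp, and one fixed motif that has to match exactly. The function name and the motif lengths (6, 8, and 10 bp) are my own illustrative choices. Real genomes aren't random and real binding sites are degenerate, so this only gives a rough order of magnitude.

```python
# Rough estimate of how often a short sequence motif occurs by chance.
# Assumptions (illustrative, not from the original post): random sequence
# with equal base frequencies, exact matching to one fixed motif, and a
# haploid genome of about 3.2 billion base pairs.

HUMAN_GENOME_BP = 3_200_000_000  # approximate haploid genome size


def expected_chance_sites(genome_size_bp, motif_length_bp, both_strands=True):
    """Expected number of exact matches to one fixed motif in random DNA."""
    # Number of windows of this length along one strand.
    positions = genome_size_bp - motif_length_bp + 1
    # A site can be read from either strand (this double-counts
    # palindromic motifs, which is ignored here for simplicity).
    strands = 2 if both_strands else 1
    # Each window matches with probability (1/4) ** motif_length_bp.
    return strands * positions / 4 ** motif_length_bp


for length in (6, 8, 10):
    sites = expected_chance_sites(HUMAN_GENOME_BP, length)
    print(f"{length} bp motif: about {sites:,.0f} chance occurrences")
```

Under these assumptions even a 10 bp motif is expected to turn up thousands of times in a genome this size, and shorter motifs turn up far more often. Allowing mismatches, as real transcription factors do, only increases the count.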

If our understanding of transcription initiation is correct then all you need is an activator transcription factor binding site near something that's compatible with a promoter sequence. Any given cell type will contain a number of such factors and they must bind to a large number of nonfunctional sites in a large genome. Many of these will cause occasional transcription giving rise to low abundance junk RNA. (Most of the ENCODE transcripts are present at less than one copy per cell.)

Different tissues will have different transcription factors. Thus, the low abundance junk RNAs must exhibit tissue specificity if our prediction is correct. Parrington and the ENCODE workers seem to think that the cell specificity of these low abundance transcripts is evidence of function. It isn't; it's exactly what you expect of spurious transcription. Parrington and the ENCODE leaders don't understand the scientific literature on transcription initiation and transcription factor binding sites.

It takes me an entire blog post to explain the flaws in just one paragraph of Parrington's book. The whole book is like this. The only thing it has going for it is that it's better than Nessa Carey's book [Nessa Carey doesn't understand junk DNA].

1. There are about 20,000 protein-encoding genes and an unknown number of genes specifying functional RNAs. I'm estimating that there are about 5,000 but some people think there are many more.

2. No definition is perfect. My point is that defining a gene as a DNA sequence that encodes a protein is something that should have been purged from textbooks decades ago. Any biochemist who ever thought seriously enough about the definition to bring it up in a scientific paper should be embarrassed to admit that they ever believed such a ridiculous definition.

Pennisi, E. (2012) "ENCODE Project Writes Eulogy for Junk DNA." Science 337: 1159-1161. [doi:10.1126/science.337.6099.1159]