

Sunday, March 03, 2024

Nils Walter disputes junk DNA: (5) What does the number of transcripts per cell tell us about function?

I'm discussing a recent paper published by Nils Walter (Walter, 2024). He is arguing against junk DNA by claiming that the human genome contains large numbers of non-coding genes.

This is the fifth post in the series. The first one outlines the issues that led to the current paper and the second one describes Walter's view of a paradigm shift. The third post describes the differing views on how to define key terms such as 'gene' and 'function.' The fourth post makes the case that differing views on junk DNA are mainly due to philosophical disagreements.

-Nils Walter disputes junk DNA: (1) The surprise

-Nils Walter disputes junk DNA: (2) The paradigm shaft

-Nils Walter disputes junk DNA: (3) Defining 'gene' and 'function'

-Nils Walter disputes junk DNA: (4) Different views of non-functional transcripts

Transcripts vs junk DNA

The most important issue, according to Nils Walter, is whether the human genome contains huge numbers of genes for lncRNAs and other types of regulatory RNAs. He doesn't give us any indication of how many of these potential genes he thinks exist or what percentage of the genome they cover. This is important since he's arguing against junk DNA but we don't know how much junk he's willing to accept.

There are several hundred thousand transcripts in the RNA databases. Most of them are identified as lncRNAs because they are longer than 200 bp. Let's assume, for the sake of argument, that 200,000 of these transcripts have a biologically relevant function and therefore there are 200,000 non-coding genes. A typical size might be 1000 bp, so these genes would take up about 6.5% of the genome. That's about 10 times the number of protein-coding genes and more than 6 times the amount of coding DNA.

That's not going to make much of a difference in the junk DNA debate since proponents of junk DNA argue that 90% of the genome is junk and 10% is functional. All of those non-coding genes can be accommodated within the 10%.

The ENCODE researchers made a big deal out of pervasive transcription back in 2007 and again in 2012. We can quibble about the exact numbers but let's say that 80% of the human genome is transcribed. We know that protein-coding genes occupy at least 40% of the genome, so much of this pervasive transcription is introns. If all of the presumptive regulatory genes are located in the remaining 40% (i.e. none in introns), and the average size is 1000 bp, then this works out to about 1.24 million non-coding genes. Is this reasonable? Is this what Nils Walter is proposing?
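These numbers are easy to check. Here's a minimal back-of-the-envelope sketch in Python; the genome size (~3.1 Gb) and the 1000 bp average gene length are assumptions for the purpose of illustration, not measurements.

```python
# Back-of-the-envelope check of the numbers above.
# Assumptions (not measurements): haploid genome ~3.1 Gb, average non-coding gene ~1,000 bp.
GENOME_BP = 3.1e9

# Scenario 1: 200,000 functional lncRNA genes averaging 1,000 bp each.
lnc_bp = 200_000 * 1_000
print(f"200,000 lncRNA genes -> {lnc_bp / GENOME_BP:.1%} of the genome")  # ~6.5%

# Scenario 2: pervasive transcription. Suppose ~80% of the genome is transcribed,
# protein-coding genes (mostly introns) cover ~40%, and the remaining ~40%
# consists entirely of 1,000 bp non-coding genes.
remaining_bp = 0.40 * GENOME_BP
print(f"implied non-coding genes: {remaining_bp / 1_000:,.0f}")           # ~1,240,000
```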

I think there's some confusion about the difference between large numbers of functional transcripts and the bigger picture of how much total junk DNA there is in the human genome. I wish the opponents of junk DNA would commit to how much of the genome they think is functional and what evidence they have to support that position.

But they don't. So instead we're stuck with debates about how to decide whether some transcripts are functional or junk.

What does transcript concentration tell us about function?

If most detectable transcripts are due to spurious transcription of junk DNA then you would expect these transcripts to be present at very low levels. This turns out to be true as Nils Walter admits. He notes that "fewer than 1000 lncRNAs are present at greater than one copy per cell."

This is a problem for those who advocate that many of these low abundance transcripts must be functional. We are familiar with several of the ad hoc hypotheses that have been advanced to get around this problem. John Mattick has been promoting them for years [John Mattick's new paradigm shaft].

Walter advances two of these excuses. First, he says that a critical RNA may be present at an average of one molecule per cell but it might be abundant in just one specialized cell in the tissue. Furthermore, its expression might be transient so it can only be detected at certain times during development and we might not have assayed cells at the right time. I assume he's advocating that there might be a short burst of a large number of these extremely specialized regulatory RNAs in these special cells.

As far as I know, there aren't many examples of such specialized gene expression. You would need at least 100,000 examples in order to make a viable case for function.

His second argument is that many regulatory RNAs are restricted to the nucleus, where they only need to bind to one regulatory sequence to carry out their function. This ignores the laws of mass action that govern such interactions. If you apply the same reasoning to proteins, then you would only need one lac repressor protein to shut down the lac operon in E. coli, but we've known for 50 years that this doesn't work, even though the lac repressor's association constant shows that it is one of the tightest-binding proteins known [DNA Binding Proteins]. This is covered in my biochemistry textbook on pages 650-651.1

If you apply the same reasoning to mammalian regulatory proteins then it turns out that you need 10,000 transcription factor molecules per nucleus in order to ensure that a few specific sites are occupied. That's not only because of the chemistry of binary interactions but also because the human genome is full of spurious sites that resemble the target regulatory sequence [The Specificity of DNA Binding Proteins]. I cover this in my book in Chapter 8: "Noncoding Genes and Junk RNA" in the section titled "On the important properties of DNA-binding proteins" (pp. 200-204). I use the estrogen receptor as an example based on calculations that were done in the mid-1970s. The same principles apply to regulatory RNAs.
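To see why abundance matters, here's an illustrative equilibrium (mass action) estimate of how often a single target site would be occupied. The nuclear volume and dissociation constant below are assumed round numbers, not values from Walter's paper or from my book, and the sketch ignores competition from the millions of spurious near-consensus sites, which makes the real requirement even higher.

```python
# Illustrative mass-action sketch (assumed numbers, not data from any source).
AVOGADRO = 6.022e23
NUCLEAR_VOLUME_L = 5e-13   # ~0.5 picolitres (assumption)
KD_SPECIFIC_M = 1e-9       # 1 nM dissociation constant for the specific site (assumption)

def site_occupancy(n_molecules: int, kd: float = KD_SPECIFIC_M) -> float:
    """Fraction of time one specific site is bound (simple Langmuir isotherm)."""
    conc = n_molecules / (AVOGADRO * NUCLEAR_VOLUME_L)   # molar concentration in the nucleus
    return conc / (conc + kd)

for n in (1, 100, 10_000):
    print(f"{n:>6} molecules per nucleus -> site occupied {site_occupancy(n):.0%} of the time")
# ~0% for a single molecule, ~25% for 100, ~97% for 10,000
```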

This is a disagreement based entirely on biochemistry and molecular biology. There aren't enough examples (evidence) to make the first argument convincing and the second argument makes no sense in light of what we know about the interactions between molecules inside of the cell (or nucleus).

Note: I can almost excuse the fact that Nils Walter ignores my book on junk DNA, my biochemistry textbook, and my blog posts, but I can't excuse the fact that his main arguments have been challenged repeatedly in the scientific literature. A good scientist should go out of their way to seek out objections to their views and address them directly.


1. In addition to the thermodynamic (equilibrium) problem, there's a kinetic problem. DNA binding proteins can find their binding sites relatively quickly by one dimensional diffusion—an option that's not readily available to regulatory RNAs [Slip Slidin' Along - How DNA Binding Proteins Find Their Target].

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

Thursday, July 28, 2016

False history and the number of genes: 2016

There's an article about junk DNA in the latest issue of New Scientist. The title is: You are junk: Why it’s not your genes that make you human. The author is Colin Barras, a science writer from Michigan with a Ph.D. in paleontology.

He begins with .....
IT WAS a discovery that threatened to overturn everything we thought about what makes us human. At the dawn of the new millennium, two rival teams were vying to be the first to sequence the human genome. Their findings, published in February 2001, made headlines around the world. Back-of-the-envelope calculations had suggested that to account for the sheer complexity of human biology, our genome should contain roughly 100,000 genes. The estimate was wildly off. Both groups put the actual figure at around 30,000. We now think it is even fewer – just 20,000 or so.

"It was a massive shock," says geneticist John Mattick. "That number is tiny. It’s effectively the same as a microscopic worm that has just 1000 cells."

Saturday, April 07, 2018

Required reading for the junk DNA debate

This is a list of scientific papers on junk DNA that you need to read (and understand) in order to participate in the junk DNA debate. It's not a comprehensive list because it's mostly papers that defend junk DNA and refute arguments for massive amounts of function. The only exception is the paper by Mattick and Dinger (2013).1 It's the only anti-junk paper that attempts to deal with the main evidence for junk DNA. If you know of any other papers that make a good case against junk DNA then I'd be happy to include them in the list.

If you come across a publication that argues against junk DNA, then you should immediately check the reference list. If you do not see some of these references in the list, then don't bother reading the paper because you know the author is not knowledgeable about the subject.

Brenner, S. (1998) Refuge of spandrels. Current Biology, 8:R669-R669. [PDF]

Brunet, T.D., and Doolittle, W.F. (2014) Getting “function” right. Proceedings of the National Academy of Sciences, 111:E3365-E3365. [doi: 10.1073/pnas.1409762111]

Casane, D., Fumey, J., et Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. Med. Sci. (Paris) 31: 680-686. [doi: 10.1051/medsci/20153106023] [The apophenia of ENCODE or Pangloss looks at the human genome]

Cavalier-Smith, T. (1978) Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox. Journal of Cell Science, 34(1), 247-278. [PDF]

Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) published online March 11, 2013. [PubMed] [doi: 10.1073/pnas.1221376110]

Doolittle, W.F., Brunet, T.D., Linquist, S., and Gregory, T.R. (2014) Distinguishing between “function” and “effect” in genome biology. Genome biology and evolution 6, 1234-1237. [doi: 10.1093/gbe/evu098]

Doolittle, W.F., and Brunet, T.D. (2017) On causal roles and selected effects: our genome is mostly junk. BMC biology, 15:116. [doi: 10.1186/s12915-017-0460-9]

Eddy, S.R. (2012) The C-value paradox, junk DNA and ENCODE. Current Biology, 22:R898. [doi: 10.1016/j.cub.2012.10.002]

Eddy, S.R. (2013) The ENCODE project: missteps overshadowing a success. Current Biology, 23:R259-R261. [doi: 10.1016/j.cub.2013.03.023]

Graur, D. (2017) Rubbish DNA: The functionless fraction of the human genome. In Evolution of the Human Genome I (pp. 19-60). Springer. [doi: 10.1007/978-4-431-56603-8_2 (book)] [PDF]

Graur, D. (2017) An upper limit on the functional fraction of the human genome. Genome Biology and Evolution, 9:1880-1885. [doi: 10.1093/gbe/evx121]

Graur, D., Zheng, Y., Price, N., Azevedo, R. B., Zufall, R. A., and Elhaik, E. (2013) On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution published online: February 20, 2013 [doi: 10.1093/gbe/evt028]

Graur, D., Zheng, Y., and Azevedo, R.B. (2015) An evolutionary classification of genomic function. Genome Biology and Evolution, 7:642-645. [doi: 10.1093/gbe/evv021]

Gregory, T. R. (2005) Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics, 6:699-708. [doi: 10.1038/nrg1674]

Haerty, W., and Ponting, C.P. (2014) No Gene in the Genome Makes Sense Except in the Light of Evolution. Annual review of genomics and human genetics, 15:71-92. [doi:10.1146/annurev-genom-090413-025621]

Hurst, L.D. (2013) Open questions: A logic (or lack thereof) of genome organization. BMC biology, 11:58. [doi:10.1186/1741-7007-11-58]

Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G. E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]

Mattick, J. S., and Dinger, M. E. (2013) The extent of functionality in the human genome. The HUGO Journal, 7:2. [doi: 10.1186/1877-6566-7-2]

Morange, M. (2014) Genome as a Multipurpose Structure Built by Evolution. Perspectives in biology and medicine, 57:162-171. [doi: 10.1353/pbm.2014.000]

Niu, D. K., and Jiang, L. (2012) Can ENCODE tell us how much junk DNA we carry in our genome? Biochemical and biophysical research communications 430:1340-1343. [doi: 10.1016/j.bbrc.2012.12.074]

Ohno, S. (1972) An argument for the genetic simplicity of man and other mammals. Journal of Human Evolution, 1:651-662. [doi: 10.1016/0047-2484(72)90011-5]

Ohno, S. (1972) So much "junk" in our genome. In H. H. Smith (Ed.), Evolution of genetic systems (Vol. 23, pp. 366-370): Brookhaven symposia in biology.

Palazzo, A.F., and Gregory, T.R. (2014) The Case for Junk DNA. PLoS Genetics, 10:e1004351. [doi: 10.1371/journal.pgen.1004351]

Rands, C. M., Meader, S., Ponting, C. P., and Lunter, G. (2014) 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. PLOS Genetics, 10:e1004525. [doi: 10.1371/journal.pgen.1004525]

Thomas Jr, C.A. (1971) The genetic organization of chromosomes. Annual review of genetics, 5:237-256. [doi: 10.1146/annurev.ge.05.120171.001321]


1. The paper by Kellis et al. (2014) is ambiguous. It's clear that most of the ENCODE authors are still opposed to junk DNA even though the paper is mostly a retraction of their original claim that 80% of the genome is functional.

Saturday, October 13, 2018

The great junk DNA debate


I've been talking to philosophers lately about the true state of the junk DNA controversy. I imagine what it would be like to stage a great debate on the topic. It's easy to come up with names for the pro-junk side: Dan Graur, Ford Doolittle, Sean Eddy, Ryan Gregory, etc. It's hard to think of any experts who could defend the idea that most of our genome is functional. The only scientist I can think of who would accept such a challenge is John Mattick but let's imagine that he could find three others to join him in the great debate.

I claim that the debate would be a rout for the pro-junk side. The data and the theories are all on the side of those who would argue that 90% of our genome is junk. I don't think the functionalists could possibly defend the idea that most of our genome is functional. What do you think?

Assuming that I'm right, why is it that the average scientist doesn't know this? Why do they still believe there's a good case for function when none of the arguments stand up to close scrutiny? And why are philosophers not conveying the true state of the controversy to their readers? I'm told that an anti-junk philosopher like Evelyn Fox Keller is held in high regard even though her arguments are easy to refute [When philosophers talk about genomes]. I'm told that John Mattick is highly respected in philosophy circles even though knowledgeable scientists have little use for his writings.

Can readers help me identify papers by philosophers of science that come down on the side of junk DNA and conclude that experts like Graur, Doolittle, and others are almost certainly correct?


Image Credit: The cartoon is by Tom Gauld and it was published online at The New York Times Magazine website. I hope they will consider it fair use on an educational blog. See: Junk DNA comments in the New York Times Magazine.

Wednesday, July 08, 2009

Junk DNA and the Scientific Literature

 
A discussion about junk DNA has broken out in the comments to Monday's Molecule #128: Winners.

Charlie Wagner, an old talk.origins fan, wonders why junk DNA advocates are still around given that there have been several recent papers questioning the idea that most of our genome is junk.

Charlie asks ...
So why are Larry and many others still clinging to the myth of "junk DNA"? Do they not read the literature?
Of course we read the literature, Charlie, but unlike you we read all of the literature. You can't just pick out the papers that support your position and assume that the question has been settled.

The skill in reading the scientific literature is to put things into perspective and maintain a certain degree of skepticism. It's just not true that everything published in scientific journals is correct. An important part of science is challenging the consensus and many scientists try to make their reputation by coming up with interpretations that break new ground. The success of science depends on the few that are correct but let's not forget that most of them turn out to be wrong.

THEME

Genomes & Junk DNA
The trick is to recognize the new ideas that may be on to something and ignore those that aren't. This isn't easy but experienced scientists have a pretty good track record. Inexperienced scientists may not be able to distinguish between legitimate challenges to dogma and ones that are frivolous. The problem is even more severe for non-scientists and journalists. They are much more likely to be sucked in by the claims in the latest paper—especially if it's published in a high profile journal.

Lots of scientists don't like the idea of junk DNA because it doesn't fit into their view of how evolution works. They gleefully announce the demise of junk DNA whenever another little bit of noncoding DNA is discovered to have a function. They also attach undue significance to recent studies showing that a large part of mammalian genomes is transcribed at one time or another, in spite of the fact that this phenomenon has been known for decades and is perfectly consistent with what we know about spurious transcription.

I've addressed many of the specific papers in previous postings. You can review my previous postings by clicking on the Theme Box URL. The bottom line is "don't trust everything you read in the recent scientific literature."

Another good rule of thumb is never trust any paper that doesn't give you a fair and accurate summary of the "dogma" they are opposing. When you challenge the concept of junk DNA, for example, it's not good enough to just present a piece of new evidence that may not fit the current "dogma." You also have to deal with all the evidence that was used to create the consensus view in the first place and show how it can be better explained by your new model. A good place to start is The Onion Test.


The figure is from Mattick (2004), an excellent example of what I'm talking about. This is a paper attacking the current consensus on junk DNA, but in doing so it uses a figure that reveals an astonishing lack of understanding of genomes. This makes everything else in the paper suspect. The figure was chosen by Ryan Gregory as the classic example of a Dog's Ass Plot.

Mattick, J.S. (2004) The hidden genetic program of complex organisms. Sci Am. 291:60-67.

Monday, September 11, 2017

What's in Your Genome?: Chapter 4: Pervasive Transcription (revised)

I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes ]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?].

Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs. I've finally got a respectable draft of this chapter. This is an updated summary—the first version is at: What's in Your Genome? Chapter 4: Pervasive Transcription.

Wednesday, August 17, 2011

Don Johnson


Don Johnson has written a book that I'm probably going to have to buy (and read) if I ever hope to understand Intelligent Design Creationism.

Who is Don Johnson? Here's what it said on Uncommon Descent a few months ago [Why one scientist checked out of Darwinism].
The author worked for ten years as a Senior Research Scientist in the medical and scientific instrument field. The complexity of life came to the forefront during continued research, especially when his research group was involved with recombinant DNA during the late 1970′s. … After several years as an independent consultant in laboratory automation and other computer fields, he began a 20-year career in university teaching, interrupted briefly to earn a second Ph.D. in Computer and information Sciences from the University of Minnesota. Over time, the author began to doubt the natural explanations that had been so ingrained. It was science, and not his religion, that caused his disbelief in the explanatory powers of nature in a number of key areas including the origin and fine-tuning of mass and energy, the origin of life with its complex information content, and the increase in complexity in living organisms. This realization was not achieved easily, as he had to admit that he had been duped into believing concepts that were scientifically unfounded. The fantastic leaps of faith required to accept the natural causes in these areas demand a scientific response to the scientific-sounding concepts that in fact have no known scientific basis.”
Sounds like a typical run-of-the-mill creationist. He has several of the common characteristics of Intelligent Design Creationist proponents: (1) religion, (2) a background in engineering and/or computer science, (3) no obvious expertise in evolutionary biology, (4) multiple Ph.D.s. I'm really intrigued by the fact that so many IDiots have more than one Ph.D. because I hang out with real scientists all the time and none of them have ever felt the need to be a graduate student more than once in their lives.

Why is this book interesting? Well, for one thing, there's this excerpt from Don Johnson's website [Science Integrity (sic)].
"In the absolute sense, one cannot rule out design of anything since a designer could design something to appear as if it weren’t designed. For example, one may not be able to prove an ordinary-looking rock hadn’t been designed to look as if it were the result of natural processes. The 'necessity of design,' however, is falsifiable. To do so, merely prove that known natural processes can be demonstrated (as opposed to merely speculated from unknown science) to produce: the fine-tuning empirically detectable in the Universe, life from non-life (including the information and its processing systems), the vast diversity of morphology suddenly appearing in the Cambrian era, and the increasing complexity moving up the tree of life (with the accompanying information increase and irreducibly complex systems). If those can be demonstrated with known science, the 'necessity of design' will have been falsified in line with using Occam’s Razor principles for determining the most reasonable scenarios. If the 'necessity of design' is falsified, some may continue to BELIEVE in design, but ID would no longer be appropriate as science." (p. 92)
Isn't that cool? It absolves Intelligent Design Creationism from any burden of proof since things are said to be designed unless you can prove the negative. If real scientists can't prove beyond a shadow of doubt that life came from non-life then design can't be falsified and must be true.

It doesn't matter how many times we can demonstrate that some things evolved, that still doesn't demonstrate that evolution is true. We can only do that if we fill in the most famous gaps existing in the early 21st century. That's the only way to falsify Intelligent Design Creationism. One of the ironies is that there's really no explanation to falsify other than "it has to be designed." This is quite clever. By refusing to offer an explanation of how life began, or how animal diversity arose 500 million years ago, the IDiots insulate themselves from the same criticism they level at evolutionary explanations.

I was prompted to write about Don Johnson after reading another excerpt from his book, one that particularly impressed Denyse O'Leary. She posted it on Uncommon Descent: What will be the next time and money-wasting error Darwinism leads scientists into?1
Researchers are discovering that what had been dismissed as evolution’s relics are actually vital to life. What used to be considered evidence for neo-Darwinism gene-formation mechanism can no longer be use as such evidence. In this case, neo-Darwinism has been a proven science inhibitor as it postponed serious investigation of the non-coding DNA within the genome, which was “one of the biggest mistakes in the history of molecular biology” [John Mattick, BioEssays, 2003 930-939].” This is reminiscent of the classification of 86 (later expanded to 180) human organs as “vestigial” that Robert Wiedersheim (1893) believed “lost their original physiological significance.” in that they were vestiges of evolution. Functions have since been discovered for all 180 organs that were thought to be vestigial, including the wings of flightless birds, the appendix, and the ear muscles of humans.”
This is more than a little confusing since the statement is wrong about the scientific facts. But even more interesting is the implication that the presence of junk DNA and/or vestigial organs is a threat to Intelligent Design Creationism. What kind of threat? Here's how Denyse O'Leary describes it.
The explicit reason for both the junk DNA error and the vestigial organs error was the need to find evidence for Darwinism in the form of stuff in life forms that doesn’t work. Without that need, these errors would not have been made.
Setting aside the lie about these being errors, let's try and see why this is such a big deal for the IDiots.

As we saw from the first quotation, everything is assumed to be designed unless we can prove that the "big four" have a purely natural explanation. So why would the IDiots be concerned about some little fish like junk DNA and vestigial organs? If a large part of our genome turns out to be junk and at least one organ turns out to be truly vestigial, does this mean Intelligent Design Creationism is falsified?

Not bloody likely. The real issue here is not whether Intelligent Design Creationism has a better explanation for the organization of the human genome. It doesn't. The real issue is that these topics can be used to discredit science and evolutionary biologists. (Hence, the title of the articles.)

As I point out in class, this is the 21st century and everyone needs to have science on their side. This includes the IDiots and the climate change deniers. They can't just take the position that they are opposed to science—even though they are. That strategy hasn't worked since Darwin.

So, what do you do when the science seems to refute your claims? You resort to the only option available, attack the science and discredit the messengers. That's why we see so many stories about evil "Darwinists" and that's why people like Denyse O'Leary pounce on any opportunity to point out errors and mistakes in the scientific literature. And if you can't find any real mistakes you can always just make them up.

Intelligent Design Creationism is not about proposing alternative explanations. It's about attacking evolution and evolutionary biologists. Don't believe me? Just look at the books and the blogs. Something like 99.9% of what's written by the IDiots is attacking evolution and science. When's the last time you ever saw anything explained by Intelligent Design Creationism?


1. Aren't you glad that Denyse O'Leary is a professional journalist? Can you imagine what her titles might look like if she didn't have professional training?

Tuesday, October 29, 2013

The Khan Academy and AAMC Teach the Central Dogma of Molecular Biology in Preparation for the MCAT

Here's a presentation by Tracy Kovach, a 3rd year medical student at the University of Virginia School of Medicine. Sandwalk readers will be familiar with my view of Basic Concepts: The Central Dogma of Molecular Biology and the widespread misunderstanding of Crick's original idea. It won't be a surprise to learn that a 3rd year medical student is repeating the old DNA to RNA to protein mantra.

I suppose that's excusable, especially since that's what is likely to be tested on the MCAT. I wonder if students who take my course, or similar courses that correctly teach the Central Dogma, will be at a disadvantage on the MCAT?

The video is posted on the Khan Academy website at: Central dogma of molecular biology. What I found so astonishing about the video presentation is that Tracy Kovach spends so much time explaining how to remember "transcription" and "translation" and get them in the right order. Recall that this video is for students who are about to graduate from university and apply to medical school. I expect high school students to have mastered the terms "transcription" and "translation." I'm pretty sure that students in my undergraduate class would be insulted if I showed them this video. They would be able to describe the biochemistry of transcription and translation in considerable detail.


There are people who think that the Central Dogma is misunderstood to an even greater extent than I claim. They say that the Central Dogma is widely interpreted to mean that the only role of DNA information is to make RNA which makes protein. In other words, they fear that belief in that version of the Central Dogma rules out any other role for DNA. This is the view of John Mattick. He says that the Central Dogma has been overthrown by the discovery of genes that make functional RNA but not protein.

I wonder if students actually think that this is what the Central Dogma means? Watch the first few minutes of the video and give me your opinion. Is this what she is saying?


Monday, February 23, 2015

Should universities defend free speech and academic freedom?

This post was prompted by a discussion I'm having with Jerry Coyne on whether he should be trying to censor university professors who teach various forms of creationism.

I very much enjoyed Jerry Coyne's stance on free speech in his latest blog website post: The anti-free speech police ride again. Here's what he said,

Thursday, July 28, 2016

You are junk

There's an article about junk DNA in the latest issue of New Scientist (July 27, 2016) [You are junk: Why it’s not your genes that make you human]. I've already discussed the false meme at the beginning of the article [False history and the number of genes: 2016]. Now it's time to look at the main argument.

The subtitle is ...
Genes make proteins make us – that was the received wisdom. But from big brains to opposable thumbs, some of our signature traits could come from elsewhere.
You can see where this is going. You start with a false paradigm, "Genes make proteins make us," then proceed to refute it. This is called "paradigm shafting."1

Wednesday, March 13, 2024

Nils Walter disputes junk DNA: (7) Conservation of transcribed DNA

I'm discussing a recent paper published by Nils Walter (Walter, 2024). He is arguing against junk DNA by claiming that the human genome contains large numbers of non-coding genes.

This is the seventh post in the series. The first one outlines the issues that led to the current paper and the second one describes Walter's view of a paradigm shift/shaft. The third post describes the differing views on how to define key terms such as 'gene' and 'function.' In the fourth post I discuss his claim that differing opinions on junk DNA are mainly due to philosophical disagreements. The fifth and sixth posts address specific arguments in the junk DNA debate.


Sequence conservation

If you don't know what a transcript is doing then how are you going to know whether it's a spurious transcript or one with an unknown function? One of the best ways is to check and see whether the DNA sequence is conserved. There's a powerful correlation between sequence conservation and function: as a general rule, functional sequences are conserved and non-conserved sequences can be deleted without consequence.

There might be an exception to the conservation criterion in the case of de novo genes. They arose relatively recently so there's no history of conservation. That's why purifying selection is a better criterion. Now that we have the sequences of thousands of human genomes, we can check to see whether a given stretch of DNA is constrained by selection or whether it accumulates mutations at the rate we expect if its sequence were irrelevant junk DNA (the neutral rate). The results show that less than 10% of our genome is being preserved by purifying selection. This is consistent with all the other arguments that 90% of our genome is junk and inconsistent with arguments that most of our genome is functional.
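In practice, the constraint test boils down to comparing the observed variation in a region with the variation expected at the neutral rate. Here's a minimal sketch of that logic; the neutral rate and variant counts are illustrative assumptions, not data from any study.

```python
# Minimal sketch of a purifying selection (constraint) test with made-up numbers.
NEUTRAL_RATE = 1.0e-3   # assumed variants per bp expected in neutrally evolving DNA

def constraint_ratio(observed_variants: int, region_bp: int,
                     neutral_rate: float = NEUTRAL_RATE) -> float:
    """Observed / expected variation; values well below 1 suggest purifying selection."""
    expected = neutral_rate * region_bp
    return observed_variants / expected

print(constraint_ratio(150, 1_000_000))   # 0.15 -> strongly constrained, likely functional
print(constraint_ratio(980, 1_000_000))   # 0.98 -> evolving at the neutral rate, like junk DNA
```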

This sounds like a problem for the anti-junk crowd. Let's see how it's addressed in Nils Walter's article in BioEssays.

There are several hand-waving objections to using conservation as an indication of function and Walter uses them all plus one unique argument that we'll get to shortly. Let's deal with some of the "facts" that he discusses in his defense of function. He seems to agree that much of the genome is not conserved even though it's transcribed. In spite of this, he says,

"... the estimates of the fraction of the human genome that carries function is still being upward corrected, with the best estimate of confirmed ncRNAs now having surpassed protein-coding genes,[12] although so far only 10%–40% of these ncRNAs have been shown to have a function in, for example, cell morphology and proliferation, under at least one set of defined conditions."

This is typical of the rhetoric in his discussion of sequence conservation. He seems to be saying that there are more than 20,000 "confirmed" non-coding genes but only 10%-40% of them have been shown to have a function! That doesn't make any sense since the whole point of this debate is how to identify function.

Here's another bunch of arguments that Walter advances to demonstrate that a given sequence could be functional but not conserved. I'm going to quote the entire thing to give you a good sense of Walter's opinion.

A second limitation of a sequence-based conservation analysis of function is illustrated by recent insights from the functional probing of riboswitches. RNA structure, and hence dynamics and function, is generally established co-transcriptionally, as evident from, for example, bacterial ncRNAs including riboswitches and ribosomal RNAs, as well as the co-transcriptional alternative splicing of eukaryotic pre-mRNAs, responsible for the important, vast diversification of the human proteome across ∼200 cell types by excision of varying ncRNA introns. In the latter case, it is becoming increasingly clear that splicing regulation involves multiple layers synergistically controlled by the splicing machinery, transcription process, and chromatin structure. In the case of riboswitches, the interactions of the ncRNA with its multiple protein effectors functionally engage essentially all of its nucleotides, sequence-conserved or not, including those responsible for affecting specific distances between other functional elements. Consequently, the expression platform—equally important for the gene regulatory function as the conserved aptamer domain—tends to be far less conserved, because it interacts with the idiosyncratic gene expression machinery of the bacterium. Consequently, taking a riboswitch out of this native environment into a different cell type for synthetic biology purposes has been notoriously challenging. These examples of a holistic functioning of ncRNAs in their species-specific cellular context lay bare the limited power of pure sequence conservation in predicting all functionally relevant nucleotides.

I don't know much about riboswitches so I can't comment on that. As for alternative splicing, I assume he's suggesting that much of the DNA sequence for large introns is required for alternative splicing. That's just not correct. You can have effective alternative splicing with small introns. The only essential parts of intron sequences are the splice sites and a minimum amount of spacer.

Part of what he's getting at is the fact that you can have a functional transcript where the actual nucleotide sequence doesn't matter so it won't look conserved. That's correct. There are such sequences. For example, there seem to be some examples of enhancer RNAs, which are transcripts in the regulatory region of a gene where it's the act of transcription that's important (to maintain an open chromatin conformation, for example) and not the transcript itself. Similarly, not all intron sequences are junk because some spacer sequence is required to maintain a minimum distance between splice sites. All this is covered in Chapter 8 of my book ("Noncoding Genes and Junk RNA").

Are these examples enough to toss out the idea of sequence conservation as a proxy for function and assume that there are tens of thousands of such non-conserved genes in the human genome? I think not. The null hypothesis still holds. If you don't have any evidence of function then the transcript doesn't have a function—you may find a function at some time in the future but right now it doesn't have one. Some of the evidence for function could be sequence conservation but the absence of conservation is not an argument for function. If conservation doesn't work then you have to come up with some other evidence.

It's worth mentioning that, in the broadest sense, purifying selection isn't confined to nucleotide sequence. It can also take into account deletions and insertions. If a given region of the genome is deficient in random insertions and deletions then that's an indication of function in spite of the fact that the nucleotide sequence isn't maintained by purifying selection. The maintenance definition of function isn't restricted to sequence—it also covers bulk DNA and spacer DNA.

(This is a good time to bring up a related point. The absence of conservation (size or sequence) is not evidence of junk. Just because a given stretch of DNA isn't maintained by purifying selection does not prove that it is junk DNA. The evidence for a genome full of junk DNA comes from different sources and that evidence doesn't apply to every little bit of DNA taken individually. On the other hand, the maintenance function argument is about demonstrating whether a particular region has a function or not and it's about the proper null hypothesis when there's no evidence of function. The burden of proof is on those who claim that a transcript is functional.)

This brings us to the main point of Walter's objection to sequence conservation as an indication of function. You can see hints of it in the previous quotation where he talks about "holistic functioning of ncRNAs in their species-specific cellular context," but there's more ...

Some evolutionary biologists and philosophers have suggested that sequence conservation among genomes should be the primary, or perhaps only, criterion to identify functional genetic elements. This line of thinking is based on 50 years of success defining housekeeping and other genes (mostly coding for proteins) based on their sequence conservation. It does not, however, fully acknowledge that evolution does not actually select for sequence conservation. Instead, nature selects for the structure, dynamics and function of a gene, and its transcription and (if protein coding) translation products; as well as for the inertia of the same in pathways in which they are not involved. All that, while residing in the crowded environment of a cell far from equilibrium that is driven primarily by the relative kinetics of all possible interactions. Given the complexity and time dependence of the cellular environment and its environmental exposures, it is currently impossible to fully understand the emergent properties of life based on simple cause-and-effect reasoning.

The way I see it, his most important argument is that life is very complicated and we don't currently understand all of its emergent properties. This means that he is looking for ways to explain the complexity that he expects to be there. The possibility that there might be several hundred thousand regulatory RNAs seems to fulfil this need so they must exist. According to Nils Walter, the fact that we haven't (yet) proven that they exist is just a temporary lull on the way to rigorous proof.

This seems to be a common theme among those scientists who share this viewpoint. We can see it in John Mattick's writings as well. It's as though the logic of having a genome full of regulatory RNA genes is so powerful that it doesn't require strong supporting evidence and can't be challenged by contradictory evidence. The argument seems somewhat mystical to me. Its proponents are making the a priori assumption that humans just have to be a lot more complicated than what "reductionist" science is indicating and all they have to do is discover what that extra layer of complexity is all about. According to this view, the idea that our genome is full of junk must be wrong because it seems to preclude the possibility that our genome could explain what it's like to be human.


Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

Thursday, May 20, 2010

Junk RNA or Imaginary RNA?

RNA is very popular these days. It seems as though new varieties of RNA are being discovered just about every month. There have been breathless reports claiming that almost all of our genome is transcribed and most of this RNA has to be functional even though we don't yet know what the function is. The fervor with which some people advocate a paradigm shift in thinking about RNA approaches that of a cult follower [see Greg Laden Gets Suckered by John Mattick].

We've known for decades that there are many types of RNA besides messenger RNA (mRNA encodes proteins). Besides the standard ribosomal RNAs and transfer RNAs (tRNAs), there are a variety of small RNAs required for splicing and many other functions. There's no doubt that some of the new discoveries are important as well. This is especially true of small regulatory RNAs.

However, the idea that a huge proportion of our genome could be devoted to synthesizing functional RNAs does not fit with the data showing that most of our genome is junk [see Shoddy But Not "Junk"?]. That hasn't stopped RNA cultists from promoting experiments leading to the conclusion that almost all of our genome is transcribed.

Late to the Party

Several people have already written about this paper including Carl Zimmer and PZ Myers. There are also summaries in Nature News and PLoS Biology.
That may change. A paper just published in PLoS Biology shows that the earlier work was prone to artifacts. Some of those RNAs may not even be there and others are present in tiny amounts.

The work was done by Harm van Bakel in Tim Hughes' lab, right here in Toronto. It's only a few floors, and a bridge, from where I'm sitting right now. The title of their paper tries to put a positive spin on the results: "Most 'Dark Matter' Transcripts Are Associated With Known Genes" [van Bakel et al. (2010)]. Nobody's buying that spin. They all recognize that the important result is not that non-coding RNAs are mostly associated with genes but the fact that they are not found in the rest of the genome. In other words, most of our genome is not transcribed in spite of what was said in earlier papers.

Van Bakel compared two different types of analysis. The first, called "tiling arrays," is a technique where bulk RNA (cDNA, actually) is hybridized to a series of probes on a microchip. The probes are short pieces of DNA corresponding to genomic sequences spaced every few thousand base pairs along each chromosome. When some RNA fragment hybridizes to one of these probes you score that as a "hit." The earlier experiments used this technique and the results indicated that almost every probe could hybridize an RNA fragment. Thus, as you scanned the chip you saw that almost every spot recorded a "hit." The conclusion is that almost all of the genome is transcribed even though only 2% corresponds to known genes.

The second type of analysis is called RNA-Seq and it relies on direct sequencing of RNA fragments. Basically, you copy the RNA into DNA, selecting for small 200 bp fragments. Using new sequencing technology, you then determine the sequence of one (single end) or both ends (paired end) of this cDNA. You may only get 30 bp of good sequence information but that's sufficient to place the transcript on the known genome sequence. By collecting millions of sequence reads, you can determine what parts of the genome are transcribed and you can also determine the frequency of transcription. The technique is much more quantitative than tiling experiments.
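The logic of the RNA-Seq analysis is straightforward: map each read to the genome and ask whether it falls inside an annotated gene or in an intergenic region. The toy sketch below illustrates the idea with made-up coordinates and read positions; it is not the van Bakel et al. pipeline.

```python
# Toy version of the read-assignment logic described above (hypothetical coordinates).
annotated_genes = [(10_000, 25_000), (60_000, 90_000)]   # (start, end) pairs, made up

def classify(read_start: int) -> str:
    """Assign a mapped read to 'genic' or 'intergenic' based on the annotation."""
    return "genic" if any(s <= read_start < e for s, e in annotated_genes) else "intergenic"

mapped_reads = [12_500, 61_000, 88_000, 150_000, 14_000]  # hypothetical mapped positions
counts = {"genic": 0, "intergenic": 0}
for r in mapped_reads:
    counts[classify(r)] += 1

print(counts)   # {'genic': 4, 'intergenic': 1} -- most transcription maps to known genes
```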

van Bakel et al. show that using RNA-Seq they detect very little transcription from the regions between genes. On the other hand, using tiling arrays they detect much more transcription from these regions. They conclude that the tiling arrays are producing spurious results—possibly due to cross-hybridization or possibly due to detection of very low abundance transcripts. In other words, the conclusion that most of our genome is transcribed may be an artifact of the method.

The parts of the genome that are presumed to be transcribed but for which there is no function are called "dark matter." Here's the important finding in the authors' own words.
To investigate the extent and nature of transcriptional dark matter, we have analyzed a diverse set of human and mouse tissues and cell lines using tiling microarrays and RNA-Seq. A meta-analysis of single- and paired-end read RNA-Seq data reveals that the proportion of transcripts originating from intergenic and intronic regions is much lower than identified by whole-genome tiling arrays, which appear to suffer from high false-positive rates for transcripts expressed at low levels.
Many of us dismissed the earlier results as transcriptional noise or "junk RNA." We thought that much of the genome could be transcribed at a very low level but this was mostly due to accidental transcription from spurious promoters. This low level of "accidental" transcription is perfectly consistent with what we know about RNA polymerase and DNA binding proteins [What is a gene, post-ENCODE?, How RNA Polymerase Binds to DNA]. Although we might have suspected that some of the "transcription" was a true artifact, it was difficult to see how the papers could have failed to consider such a possibility. They had been through peer review and the reviewers seemed to be satisfied with the data and the interpretation.

That's gonna change. I suspect that from now on everybody is going to ignore the tiling array experiments and pretend they don't exist. Not only that, but in light of recent results, I suspect more and more scientists will announce that they never believed the earlier results in the first place. Too bad they never said that in print.


van Bakel, H., Nislow, C., Blencowe, B. and Hughes, T. (2010) Most "Dark Matter" Transcripts Are Associated With Known Genes. PLoS Biology 8: e1000371 [doi:10.1371/journal.pbio.1000371]

Friday, December 13, 2019

The "standard" view of junk DNA is completely wrong

I was browsing the table of contents of the latest issue of Cell and I came across this ....
For decades, the miniscule protein-coding portion of the genome was the primary focus of medical research. The sequencing of the human genome showed that only ∼2% of our genes ultimately code for proteins, and many in the scientific community believed that the remaining 98% was simply non-functional “junk” (Mattick and Makunin, 2006; Slack, 2006). However, the ENCODE project revealed that the non-protein coding portion of the genome is copied into thousands of RNA molecules (Djebali et al., 2012; Gerstein et al., 2012) that not only regulate fundamental biological processes such as growth, development, and organ function, but also appear to play a critical role in the whole spectrum of human disease, notably cancer (for recent reviews, see Adams et al., 2017; Deveson et al., 2017; Rupaimoole and Slack, 2017).

Slack, F.J. and Chinnaiyan, A.M. (2019) The Role of Non-coding RNAs in Oncology. Cell 179:1033-1055 [doi: 10.1016/j.cell.2019.10.017]
Cell is a high-impact, refereed journal so we can safely assume that this paper was reviewed by reputable scientists. This means that the view expressed in the paragraph above did not raise any alarm bells when the paper was reviewed. The authors clearly believe that what they are saying is true and so do many other reputable scientists. This seems to be the "standard" view of junk DNA among scientists who do not understand the facts or the debate surrounding junk DNA and pervasive transcription.

Here are some of the obvious errors in the statement.
  1. The sequencing of the human genome did NOT show that only ~2% of our genome consisted of coding region. That fact was known almost 50 years ago and the human genome sequence merely confirmed it.
  2. No knowledgeable scientist ever thought that the remaining 98% of the genome was junk—not in 1970 and not in any of the past fifty years.
  3. The ENCODE project revealed that much of our genome is transcribed at some time or another but it is almost certainly true that the vast majority of these low-abundance, non-conserved, transcripts are junk RNA produced by accidental transcription.
  4. The existence of noncoding RNAs such as ribosomal RNA and tRNA was known in the 1960s, long before ENCODE. The existence of snoRNAs, snRNAs, regulatory RNAs, and various catalytic RNAs was known in the 1980s, long before ENCODE. Other RNAs such as miRNAs, piRNAs, and siRNAs were well known in the 1990s, long before ENCODE.
How did this false view of our genome become so widespread? It's partially because of the now highly discredited ENCODE publicity campaign orchestrated by Nature and Science but that doesn't explain everything. The truth is out there in peer-reviewed scientific publications but scientists aren't reading those papers. They don't even realize that their standard view has been seriously challenged. Why?


Friday, July 03, 2015

The fuzzy thinking of John Parrington: The Central Dogma

My copy of The Deeper Genome: Why there's more to the human genome than meets the eye has arrived and I've finished reading it. It's a huge disappointment. Parrington makes no attempt to describe what's in your genome in more than general hand-waving terms. His main theme is that the genome is really complicated and so are we. Gosh, golly, gee whiz! Re-write the textbooks!

You will look in vain for any hard numbers such as the total number of genes or the amount of the genome devoted to centromeres, regulatory sequences etc. etc. [see What's in your genome?]. Instead, you will find a wishy-washy defense of ENCODE results and tributes to the views of John Mattick.

John Parrington is an Associate Professor of Cellular & Molecular Pharmacology at the University of Oxford (Oxford, UK). He works on the physiology of calcium signalling in mammals. This should make him well-qualified to write a book about biochemistry, molecular biology, and genomes. Unfortunately, his writing leaves a great deal to be desired. He seems to be part of a younger generation of scientists who were poorly trained as graduate students (he got his Ph.D. in 1992). He exhibits the same kind of fuzzy thinking as many of the ENCODE leaders.

Let me give you just one example.

Tuesday, February 27, 2024

Nils Walter disputes junk DNA: (1) The surprise

Nils Walter attempts to present the case for a functional genome by reconciling opposing viewpoints. I address his criticisms of the junk DNA position and discuss his arguments in favor of large numbers of functional non-coding RNAs.

Nils Walter is Francis S. Collins Collegiate Professor of Chemistry, Biophysics, and Biological Chemistry at the University of Michigan in Ann Arbor (Michigan, USA). He works on human RNAs and claims that, "Over 75% of our genome encodes non-protein coding RNA molecules, compared with only <2% that encodes proteins." He recently published an article explaining why he opposes junk DNA.

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

The human genome project's lasting legacies are the emerging insights into human physiology and disease, and the ascendance of biology as the dominant science of the 21st century. Sequencing revealed that >90% of the human genome is not coding for proteins, as originally thought, but rather is overwhelmingly transcribed into non-protein coding, or non-coding, RNAs (ncRNAs). This discovery initially led to the hypothesis that most genomic DNA is “junk”, a term still championed by some geneticists and evolutionary biologists. In contrast, molecular biologists and biochemists studying the vast number of transcripts produced from most of this genome “junk” often surmise that these ncRNAs have biological significance. What gives? This essay contrasts the two opposing, extant viewpoints, aiming to explain their basis, which arise from distinct reference frames of the underlying scientific disciplines. Finally, it aims to reconcile these divergent mindsets in hopes of stimulating synergy between scientific fields.

Sunday, September 09, 2012

Brendan Maher Writes About the ENCODE/Junk DNA Publicity Fiasco

Brendan Maher is a Feature Editor for Nature. He wrote a lengthy article for Nature when the ENCODE data was published on Sept. 5, 2012 [ENCODE: The human encyclopaedia]. Here's part of what he said,
After an initial pilot phase, ENCODE scientists started applying their methods to the entire genome in 2007. Now that phase has come to a close, signalled by the publication of 30 papers, in Nature, Genome Research and Genome Biology. The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes.
I expect encyclopedias to be much more accurate than this.

As most people know by now, there are many of us who challenge the implication that 80% of the genome has a function (i.e., it's not junk).1 We think the Consortium was not being very scientific by publicizing such a ridiculous claim.

The main point of Maher's article was that the ENCODE results reveal a huge network of regulatory elements controlling expression of the known genes. This is the same point made by the ENCODE researchers themselves. Here's how Brendan Maher expressed it.

The real fun starts when the various data sets are layered together. Experiments looking at histone modifications, for example, reveal patterns that correspond with the borders of the DNaseI-sensitive sites. Then researchers can add data showing exactly which transcription factors bind where, and when. The vast desert regions have now been populated with hundreds of thousands of features that contribute to gene regulation. And every cell type uses different combinations and permutations of these features to generate its unique biology. This richness helps to explain how relatively few protein-coding genes can provide the biological complexity necessary to grow and run a human being.
I think that much of this hype comes from a problem I've called The Deflated Ego Problem. It arises because many scientists were disappointed to discover that humans have about the same number of genes as many other species yet we are "obviously" much more complex than a mouse or a pine tree. There are many ways of solving this "problem." One of them is to postulate that humans have a much more sophisticated network of control elements in our genome. Of course, this ignores the fact that the genomes of mice and trees are not smaller than ours.

Tuesday, March 13, 2018

Making Sense of Genes by Kostas Kampourakis

Kostas Kampourakis is a specialist in science education at the University of Geneva, Geneva (Switzerland). Most of his book is an argument against genetic determinism in the style of Richard Lewontin. You should read this book if you are interested in that argument. The best way to describe the main thesis is to quote from the last chapter.

Here is the take-home message of this book: Genes were initially conceived as immaterial factors with heuristic values for research, but along the way they acquired a parallel identity as DNA segments. The two identities never converged completely, and therefore the best we can do so far is to think of genes as DNA segments that encode functional products. There are neither 'genes for' characters nor 'genes for' diseases. Genes do nothing on their own, but are important resources for our self-regulated organism. If we insist in asking what genes do, we can accept that they are implicated in the development of characters and disease, and that they account for variation in characters in particular populations. Beyond that, we should remember that genes are part of an interactive genome that we have just begun to understand, the study of which has various limitations. Genes are not our essences, they do not determine who we are, and they are not the explanation of who we are and what we do. Therefore we are not the prisoners of any genetic fate. This is what the present book has aimed to explain.

Saturday, March 21, 2015

How the genome lost its junk according to John Parrington

I really hate it when publishers start to hype a book several months before we can read it, especially when the topic is controversial. In this case, it's Oxford University Press and the book is "The Deeper Genome: Why there is more to the human genome than meets the eye." The author is John Parrington.

The title of the promotion blurb is: How the Genome Lost its Junk on the Canadian version of the Oxford University Press website. It looks like this book is going to be an attack on junk DNA.

We won't know for sure until June or July when the book is published. Until then, the author and the publisher will have free rein to sell their ideas without serious opposition or pushback.

Here's the prepublication hype. I'm going to buy this book and read it as soon as it becomes available. Stay tuned for a review.

Wednesday, March 09, 2016

A 2004 kerfuffle over pervasive transcription in the mouse genome

The first drafts of the human genome sequence were published in 2001. There was still work to do on "finishing" the sequence but a lot of the International Human Genome Project (IHGP) team shifted to work on the mouse genome. The FANTOM Consortium and the RIKEN Genome Exploration Groups (I and II) published an analysis of mouse transcripts in December 2002.
Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H. et al. (2002) Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature, 420:563-573. [doi: 10.1038/nature01266]

Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 ‘transcriptional units’, contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense–antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.

Thursday, December 22, 2022

Junk DNA, TED talks, and the function of lncRNAs

Most of our genome is transcribed but so far only a small number of these transcripts have a well-established biological function.

The fact that most of our genome is transcribed has been known for 50 years but that fact only became widely known with the publication of ENCODE's preliminary results in 2007 (ENCODE, 2007). The ENCODE scientists referred to this as "pervasive transcription" and this label has stuck.

By the end of the 1970s we knew that much of this transcription was due to introns. The latest data shows that protein-coding genes and known noncoding genes occupy about 45% of the genome and most of that is intron sequences that are mostly junk. That leaves 30-40% of the genome that is transcribed at some point, producing something like one million transcripts of unknown function.