
Showing posts sorted by relevance for query "junk dna".

Wednesday, February 14, 2024

Copilot answers the question, "What is junk DNA?"

The Microsoft browser (Edge) has a built in function called Copilot. It's an AI assistant based on ChatGPT-4.

I decided to test it by asking "What is junk DNA?" and here's the answer it gave me.

Sunday, March 03, 2024

Nils Walter disputes junk DNA: (5) What does the number of transcripts per cell tell us about function?

I'm discussing a recent paper published by Nils Walter (Walter, 2024). He is arguing against junk DNA by claiming that the human genome contains large numbers of non-coding genes.

This is the fifth post in the series. The first one outlines the issues that led to the current paper and the second one describes Walter's view of a paradigm shift. The third post describes the differing views on how to define key terms such as 'gene' and 'function.' The fourth post makes the case that differing views on junk DNA are mainly due to philosophical disagreements.

-Nils Walter disputes junk DNA: (1) The surprise

-Nils Walter disputes junk DNA: (2) The paradigm shaft

-Nils Walter disputes junk DNA: (3) Defining 'gene' and 'function'

-Nils Walter disputes junk DNA: (4) Different views of non-functional transcripts

Transcripts vs junk DNA

The most important issue, according to Nils Walter, is whether the human genome contains huge numbers of genes for lncRNAs and other types of regulatory RNAs. He doesn't give us any indication of how many of these potential genes he thinks exist or what percentage of the genome they cover. This is important since he's arguing against junk DNA but we don't know how much junk he's willing to accept.

There are several hundred thousand transcripts in the RNA databases. Most of them are identified as lncRNAs because they are bigger than 200 bp. Let's assume, for the sake of argument, that 200,000 of these transcripts have a biologically relevant function and therefore there are 200,000 non-coding genes. A typical size might be 1000 bp so these genes would take up about 6.5% of the genome. That's about 10 times the number of protein-coding genes and more than 6 times the amount of coding DNA.
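The arithmetic behind these figures can be checked with a short sketch. The genome size (3.1 Gb), the count of about 20,000 protein-coding genes, and the 1% coding fraction are assumptions chosen here because they reproduce the numbers in the paragraph above; the paragraph itself doesn't state them.

```python
# Back-of-the-envelope check of the non-coding gene estimate.
# GENOME_BP, PROTEIN_CODING_GENES and CODING_FRACTION are assumed values.
GENOME_BP = 3.1e9
PROTEIN_CODING_GENES = 20_000
CODING_FRACTION = 0.01


def genome_fraction(n_genes: int, avg_gene_bp: int, genome_bp: float = GENOME_BP) -> float:
    """Fraction of the genome occupied by n_genes of a given average length."""
    return n_genes * avg_gene_bp / genome_bp


noncoding_genes = 200_000
fraction = genome_fraction(noncoding_genes, 1_000)
gene_ratio = noncoding_genes / PROTEIN_CODING_GENES
coding_ratio = fraction / CODING_FRACTION

print(f"genome fraction: {fraction:.1%}")
print(f"ratio to protein-coding gene count: {gene_ratio:.0f}x")
print(f"ratio to coding DNA: {coding_ratio:.1f}x")
```

Under these assumptions the fraction comes out near 6.5%, which is still well inside a 10% functional budget.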

That's not going to make much of a difference in the junk DNA debate since proponents of junk DNA argue that 90% of the genome is junk and 10% is functional. All of those non-coding genes can be accommodated within the 10%.

The ENCODE researchers made a big deal out of pervasive transcription back in 2007 and again in 2012. We can quibble about the exact numbers but let's say that 80% of the human genome is transcribed. We know that protein-coding genes occupy at least 40% of the genome so much of this pervasive transcription is introns. If all of the presumptive regulatory genes are located in the remaining 40% (i.e. none in introns), and the average size is 1000 bp, then this could be about 1.24 million non-coding genes. Is this reasonable? Is this what Nils Walter is proposing?
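Here's a minimal sketch of that upper-bound calculation. The 3.1 Gb genome size is an assumption (it isn't given above) picked because it yields the 1.24 million figure; the function name is just for illustration.

```python
# Upper bound on non-coding genes if every transcribed base outside
# protein-coding genes belonged to a 1000 bp non-coding gene.
GENOME_BP = 3.1e9  # assumed genome size, not stated in the post


def max_noncoding_genes(transcribed: float, protein_coding: float,
                        avg_gene_bp: int, genome_bp: float = GENOME_BP) -> float:
    """Number of genes of avg_gene_bp that fit in the transcribed,
    non-protein-coding part of the genome (fractions are 0..1)."""
    remaining_bp = (transcribed - protein_coding) * genome_bp
    return remaining_bp / avg_gene_bp


upper_bound = max_noncoding_genes(0.80, 0.40, 1_000)
print(f"{upper_bound / 1e6:.2f} million non-coding genes")
```

Changing the transcribed fraction or the average gene size shows how sensitive the count is to those two guesses.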

I think there's some confusion about the difference between large numbers of functional transcripts and the bigger picture of how much total junk DNA there is in the human genome. I wish the opponents of junk DNA would commit to how much of the genome they think is functional and what evidence they have to support that position.

But they don't. So instead we're stuck with debates about how to decide whether some transcripts are functional or junk.

What does transcript concentration tell us about function?

If most detectable transcripts are due to spurious transcription of junk DNA then you would expect these transcripts to be present at very low levels. This turns out to be true as Nils Walter admits. He notes that "fewer than 1000 lncRNAs are present at greater than one copy per cell."

This is a problem for those who advocate that many of these low abundance transcripts must be functional. We are familiar with several of the ad hoc hypotheses that have been advanced to get around this problem. John Mattick has been promoting them for years [John Mattick's new paradigm shaft].

Walter advances two of these excuses. First, he says that a critical RNA may be present at an average of one molecule per cell but it might be abundant in just one specialized cell in the tissue. Furthermore, their expression might be transient so they can only be detected at certain times during development and we might not have assayed cells at the right time. I assume he's advocating that there might be a short burst of a large number of these extremely specialized regulatory RNAs in these special cells.

As far as I know, there aren't many examples of such specialized gene expression. You would need at least 100,000 examples in order to make a viable case for function.

His second argument is that many regulatory RNAs are restricted to the nucleus where they only need to bind to one regulatory sequence to carry out their function. This ignores the mass action laws that govern such interactions. If you apply the same reasoning to proteins then you would only need one lac repressor protein to shut down the lac operon in E. coli. We've known for 50 years that this doesn't work, in spite of the fact that the lac repressor association constant shows that it is one of the tightest binding proteins known [DNA Binding Proteins]. This is covered in my biochemistry textbook on pages 650-651.1

If you apply the same reasoning to mammalian regulatory proteins then it turns out that you need 10,000 transcription factor molecules per nucleus in order to ensure that a few specific sites are occupied. That's not only because of the chemistry of binary interactions but also because the human genome is full of spurious sites that resemble the target regulatory sequence [The Specificity of DNA Binding Proteins]. I cover this in my book in Chapter 8: "Noncoding Genes and Junk RNA" in the section titled "On the important properties of DNA-binding proteins" (pp. 200-204). I use the estrogen receptor as an example based on calculations that were done in the mid-1970s. The same principles apply to regulatory RNAs.
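The mass action point can be illustrated with a very simplified equilibrium model, in which the fraction of time a single site is occupied is [L] / ([L] + Kd). This is only a sketch: the nuclear volume and the dissociation constant below are illustrative assumptions, not values from the textbook or the estrogen receptor calculation, and the model ignores both depletion of free molecules and competition from spurious sites.

```python
# Simple equilibrium (mass action) occupancy of a single binding site.
# All numbers are illustrative assumptions, not measured values.
AVOGADRO = 6.02214076e23      # molecules per mole
NUCLEUS_VOLUME_L = 5e-13      # assumed nuclear volume (~500 femtolitres)
KD_MOLAR = 1e-9               # assumed dissociation constant (1 nM)


def molar_concentration(n_molecules: int, volume_l: float = NUCLEUS_VOLUME_L) -> float:
    """Convert a molecule count in the nucleus to a molar concentration."""
    return n_molecules / (AVOGADRO * volume_l)


def site_occupancy(n_molecules: int, kd: float = KD_MOLAR) -> float:
    """Fraction of time the site is bound: [L] / ([L] + Kd).
    Ignores depletion of free molecules and competing spurious sites."""
    conc = molar_concentration(n_molecules)
    return conc / (conc + kd)


for n in (1, 100, 10_000):
    print(f"{n:>6} molecules: occupancy {site_occupancy(n):.3f}")
```

With these assumed values a single molecule keeps the site occupied well under 1% of the time, while 10,000 molecules push occupancy above 95%. Competing spurious sites, which this sketch leaves out, would pull occupancy down further.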

This is a disagreement based entirely on biochemistry and molecular biology. There aren't enough examples (evidence) to make the first argument convincing and the second argument makes no sense in light of what we know about the interactions between molecules inside of the cell (or nucleus).

Note: I can almost excuse the fact that Nils Walter ignores my book on junk DNA, my biochemistry textbook, and my blog posts, but I can't excuse the fact that his main arguments have been challenged repeatedly in the scientific literature. A good scientist should go out of their way to seek out objections to their views and address them directly.


1. In addition to the thermodynamic (equilibrium) problem, there's a kinetic problem. DNA binding proteins can find their binding sites relatively quickly by one dimensional diffusion—an option that's not readily available to regulatory RNAs [Slip Slidin' Along - How DNA Binding Proteins Find Their Target].

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

Wednesday, June 25, 2014

The Function Wars: Part I

This is Part I of the "Function Wars" posts. The second one is on The ENCODE legacy.1

Quibbling about the meaning of the word "function"

The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.
Stephen Jay Gould (1982)
The ENCODE Consortium tried to redefine the word “function” to include any biological activity that they could detect using their genome-wide assays. This was not helpful since it included a huge number of sites and sequences that result from spurious (nonfunctional) binding of transcription factors or accidental transcription of random DNA sequences to make junk RNA [see What did the ENCODE Consortium say in 2012?].

I believe that this strange way of redefining biological function was a deliberate attempt to discredit junk DNA. It was quite successful since much of the popular press interpreted the ENCODE results as refuting or disproving junk DNA. I believe that the leaders of the ENCODE Consortium knew what they were doing when they decided to hype their results by announcing that 80% of the human genome is functional [see The Story of You: Encode and the human genome – video, Science Writes Eulogy for Junk DNA].

The ENCODE Project, today, announces that most of what was previously considered as 'junk DNA' in the human genome is actually functional. The ENCODE Project has found that 80 per cent of the human genome sequence is linked to biological function.

[Google Earth of Biomedical Research]

Monday, March 04, 2024

Nils Walter disputes junk DNA: (6) The C-value paradox

I'm discussing a recent paper published by Nils Walter (Walter, 2024). He is arguing against junk DNA by claiming that the human genome contains large numbers of non-coding genes.

This is the sixth post in the series. The first one outlines the issues that led to the current paper and the second one describes Walter's view of a paradigm shift/shaft. The third post describes the differing views on how to define key terms such as 'gene' and 'function.' In the fourth post I discuss his claim that differing opinions on junk DNA are mainly due to philosophical disagreements.

Sunday, January 01, 2023

The function wars are over

In order to have a productive discussion about junk DNA we needed to agree on how to define "function" and "junk." Disagreements over the definitions spawned the Function Wars that became intense over the past decade. That war is over and now it's time to move beyond nitpicking about terminology.

The idea that most of the human genome is composed of junk DNA arose gradually in the late 1960s and early 1970s. The concept was based on a lot of evidence dating back to the 1940s and it gained support with the discovery of massive amounts of repetitive DNA.

Various classes of functional DNA were known back then including: regulatory sequences, protein-coding genes, noncoding genes, centromeres, and origins of replication. Other categories have been added since then but the total amount of functional DNA was not thought to be more than 10% of the genome. This was confirmed with the publication of the human genome sequence.

From the very beginning, the distinction between functional DNA and junk DNA was based on evolutionary principles. Functional DNA was the product of natural selection and junk DNA was not constrained by selection. The genetic load argument was a key feature of Susumu Ohno's conclusion that 90% of our genome is junk (Ohno, 1972a; Ohno, 1972b).

Friday, April 01, 2011

Junk & Jonathan: Part 2— What Did Biologists Really Say About Junk DNA?

This is the second in a series of postings about a new book by Jonathan Wells: The Myth of Junk DNA. The book is published by Discovery Institute Press and it should go on sale on May 31 2011. I'm responding to an interview with Jonathan Wells on Uncommon Descent [Jonathan Wells on his book, The Myth of Junk DNA – yes, it is a Darwinist myth and he nails it as such].

Denyse O'Leary asks, "Interestingly, in the “nail dump is Ming vase” story, no one insists that nobody ever thought it was just another piece of junk. They almost always say, “Yes, we thought so but had no idea …” So what’s behind the failure to admit an error in this case?" It's hard to figure out what she means but I think she's wondering why biologists don't just admit they were wrong about junk DNA. Jonathan Wells interprets the question differently.
Some people revise history by claiming that no mainstream biologists ever regarded non-protein-coding DNA as “junk.”

This claim is easily disproved: Francis Crick and Leslie Orgel published an article in Nature in 1980 (284: 604-607) arguing that such DNA “is little better than junk,” and “it would be folly in such cases to hunt obsessively” for functions in it. Since then, Brown University biologist Kenneth R. Miller, Oxford University biologist Richard Dawkins, University of Chicago biologist Jerry A. Coyne, and University of California–Irvine biologist John C. Avise have all argued that most of our DNA is junk, and that this provides evidence for Darwinian evolution and against intelligent design. National Institutes of Health director Francis Collins argued similarly in his widely read 2006 book The Language of God.

It is true that some biologists (such as Thomas Cavalier-Smith and Gabriel Dover) have long been skeptical of “junk DNA” claims, but probably a majority of biologists since 1980 have gone along with the myth. The revisionists are misinformed (or misinforming).
It's in the best interests of the IDiots to promote the idea that all "Darwinists" believed in the "myth" of junk DNA and that it wasn't until the predictions of the IDiots were confirmed (not) that the biologists changed their minds.

The truth is somewhat different. Wells says, "Some people revise history by claiming that no mainstream biologists ever regarded non-protein-coding DNA as “junk.”" The truth is that the mainstream biologist community never, ever claimed that all non-coding DNA was junk. Most of them didn't even believe that a majority of our genome was junk.

The issue has come up many times over the past few years on blogs and newsgroups. The last time I took a poll was a few years ago and here are the results.


As you can see, there's a wide range of opinion among people who read Sandwalk. I think this is a pretty good reflection of the opinions of most biologists.

In responding to the question, Wells makes one serious error when he claims that biologists promoted junk DNA because it "provides evidence for Darwinian evolution." It does nothing of the sort. In fact, it goes against any prediction of Darwinian evolution by natural selection. So many biologists resisted the concept of (huge amounts of) junk DNA precisely because of this conflict.

Wells also says that junk DNA was promoted by some biologists because it "provides evidence ... against intelligent design." This is partly true, especially when the arguments center on conserved pseudogenes. That part of junk DNA (pseudogenes) is accepted by almost all biologists but it's only a tiny part of our genome. There is no evidence to suggest that pseudogenes are anything but junk and all the evidence indicates that we have thousands of them in our genome. (If they have a function then they aren't pseudogenes.)

Many mainstream biologists have supported the idea that a majority of our genome is junk. There's no denying that. I agree with them. None of them are changing their minds in spite of what Jonathan Wells is telling you. What Wells is doing is picking sides in a genuine scientific dispute. He could have done this 30 years ago and the result would have been the same. The genuine scientific controversy is not about to be resolved and there's no new evidence that seals the case one way or the other.

In my opinion, our genome is almost 90% junk DNA and that's the view that's going to win in the end.


Tuesday, February 10, 2015

Nessa Carey and New Scientist don't understand the junk DNA debate

There's a new book on junk DNA due to be published at the end of March. It's called Junk DNA: A Journey through the Dark Matter of the Genome. The author is someone named Nessa Carey. Here's her bio ....
Nessa Carey has a virology PhD from the University of Edinburgh and is a former Senior Lecturer in Molecular Biology at Imperial College, London. She worked in the biotech and pharmaceutical industry for thirteen years and is now International Director for the UK's leading organisation for technology transfer professionals. She lives in Norfolk and is a Visiting Professor at Imperial College.
Pretty impressive.

Here's how she describes her view of the human genome.

Tuesday, August 07, 2012

Note to David Klinghoffer, When You find Yourself in a Hole, Stop Digging

Some of you might recall the recent Chromosome 2 kerfuffle. It started when Carl Zimmer asked David Klinghoffer a simple question. Zimmer asked him to describe the evidence to support his claim that the fusion site didn't look like it should if two primitive ape chromosomes had fused to produce human chromosome 2.

Rather than simply answer the question, the IDiots circled the wagons then went into attack mode. Eventually, after a lot of pressure, they got around to answering the question; apparently there is no evidence to support their claim [And Finally the Hounding Duck Can Rest].

Of course by then they were so deep in their hole that the sun don't shine.

Sunday, November 13, 2011

Jonathan Wells Talks About Genetic Load

Most people don't understand the positive evidence for junk DNA—this includes most scientists. Paulmc tried to convince the readers on Uncommon Descent that they had been misinformed about junk DNA. The fact that our genome has huge amounts of junk DNA is not just an argument from ignorance—an argument that most IDiots are familiar with—because there are several good reasons for concluding that most DNA has to be junk.

Wells addressed those arguments in: Jonathan Wells on Darwinism, Science, and Junk DNA.

Monday, May 02, 2016

The Encyclopedia of Evolutionary Biology revisits junk DNA

The Encyclopedia of Evolutionary Biology is a four volume set of articles by leading evolutionary biologists. An online version is available at ScienceDirect. Many universities will have free access.

I was interested in what they had to say about junk DNA and the evolution of large complex genomes. The only article that directly addressed the topic was "Noncoding DNA Evolution: Junk DNA Revisited" by Michael Z. Ludwig of the Department of Ecology and Evolution at the University of Chicago. Ludwig is a Research Associate (Assistant Professor) who works with Martin Kreitman on "Developmental regulation of gene expression and the genetic basis for evolution of regulatory DNA."

As you could guess from the title of the article, Michael Ludwig divides the genome into two fractions: protein-coding genes and noncoding DNA. The fact that organismal complexity doesn't correlate with the number of genes (protein-coding) is a problem that requires an explanation, according to Ludwig. He assumes that the term "junk DNA" was used in the past to account for our lack of knowledge about noncoding DNA.
Eukaryotic genomes mostly consist of DNA that is not translated into protein sequence. However, noncoding DNA (ncDNA) has been little studied relative to proteins. The lack of knowledge about its functional significance has led to hypotheses that much nongenic DNA is useless "junk" (Ohno, 1972) or that it exists only to replicate itself (Doolittle and Sapienza, 1980; Orgel and Crick, 1980).
Ludwig says that we now know some of the functions of non-coding DNA and one of them is regulation of gene expression.
These regulatory sequences are distributed among selfish transposons and middle or short repetitive DNAs. The genome is an extremely complex machine; functionally as well as structurally it is generally not possible to disentangle the regulatory function from the junk selfish activity. The idea of junk DNA needs to be revisited.
Of course we all know about regulatory sequences. We've known about this function of non-coding DNA for half a century. The question that interests us is not whether non-coding DNA has a function but whether a large proportion of noncoding DNA is junk.

Ludwig seems to be arguing that a significant fraction of the mammalian genome is devoted to regulation. He doesn't ever specify what this fraction is but apparently it's large enough to "revisit" junk DNA.

The biggest obstacle to his thesis is the fact that only 8% of the human genome is conserved (Rands et al., 2014). Ludwig says that 1% of the genome is coding DNA and 7% "has a functional regulatory gene expression role" according to the Rands et al. study. This is somewhat misleading since Rands et al. specifically mention that not all of this conserved DNA will be regulatory.

All of this is consistent with a definition of function specifying that it must be under negative selection (i.e. conserved). It leads to the conclusion that about 90% of the human genome is junk. That doesn't require a re-evaluation of junk.

In order to "revisit" junk DNA, the proponents of the "complex machine" view of evolution must come up with plausible reasons why lack of sequence conservation does not rule out function. Ludwig offers up the standard rationales ...
  1. Some ultra-conserved sequences don't seem to have a function and this "shows that the extent of sequence conservation is not a good predictor of the functional importance of a sequence."
  2. The amount of conserved sequence depends on the alignment and alignment is difficult.
  3. About 40%-70% of the noncoding DNA in Drosophila melanogaster is under functional constraint within the species but not between D. melanogaster and D. simulans. Therefore, some large fraction of functional regulatory sequences might only be conserved in the human lineage and it won't show up in comparisons between species. (Does this explain onions?)
The idea here is that there is rapid turnover of functional DNA binding sites required for regulation but the overall fraction of DNA devoted to regulation remains large. This explains why there doesn't seem to be a correlation between the amount of conserved DNA and the amount that can possibly be devoted to regulating gene expression. The argument implies that much more than 7% of the genome is required for regulation. The amount has to be >50% or so in order to justify overthrowing the concept of junk DNA.

That's a ridiculous number, but so is 7%. Imagine that "only" 7% of the genome is functionally involved in regulating expression of the protein-coding genes. That's 224 million base pairs of DNA or approximately 10 thousand base pairs of cis-regulatory elements (CREs) for every protein-coding gene.
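A quick sketch makes the per-gene figure concrete. The 3.2 Gb genome size and the count of 20,000 protein-coding genes are assumptions; they're the values that reproduce the 224 million bp total and the roughly 10 kb per gene quoted above.

```python
# Regulatory DNA per protein-coding gene if 7% of the genome were regulatory.
GENOME_BP = 3_200_000_000        # assumed genome size that yields 224 Mb at 7%
PROTEIN_CODING_GENES = 20_000    # assumed gene count

regulatory_bp = GENOME_BP * 7 // 100
bp_per_gene = regulatory_bp / PROTEIN_CODING_GENES

print(f"regulatory DNA: {regulatory_bp / 1e6:.0f} million bp")
print(f"per protein-coding gene: {bp_per_gene:,.0f} bp")
```

With these inputs each protein-coding gene would need a bit more than 11,000 bp of cis-regulatory sequence, which is the "approximately 10 thousand" in the text.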

There is no evidence whatsoever that even this amount (7%) of DNA is required for regulation but Ludwig would like to think that the actual amount is much greater. The lack of conservation is dismissed by assuming rapid turnover while conserving function and/or stabilizing selection on polymorphic sequences.

The problem here is that Ludwig is constructing a just-so evolutionary story to explain something that doesn't require an explanation. If there's no evidence that a large fraction of the genome is required for regulation then there's no problem that needs explaining. Ludwig does not tell us why he believes that most of our genome is required for regulation. Maybe it's because of ENCODE?

Since this is published in the Encyclopedia of Evolutionary Biology, I assume that this sort of evolutionary argument resonates with many evolutionary biologists. That's sad.


Rands, C. M., Meader, S., Ponting, C. P., and Lunter, G. (2014) 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. PLoS Genetics, 10(7), e1004525. [doi: 10.1371/journal.pgen.1004525]

Monday, September 21, 2009

More Junk DNA Fallacies

 
BiOpinionated is a blog written by a molecular biologist named Nils Reinton. He tries to see every side of an argument but there are times when this attempt goes astray.

The "debate" over junk DNA is an example. Here's how Nils responded to claims by Ryan Gregory and me that most of our genome is junk [How to have your cake, eat it, and then complain].
First: State that most of our genome is junk.

Second: When more and more promoters, enhancers, repressors and other regulatory elements are discovered, claim that this of course was not included in the definition of “most of the genome”. The perfect excuse because it means you’ll never be wrong.

Last: Complain when the press does not understand that “most of our DNA” actually meant “much of our DNA , but with a lot of exceptions” and that science reporters don’t intuitively know which exceptions these are.

Post written using the zpen in dire agony over extremely poor science communication from the same persons who most eagerly criticize science communication from others.
[see the original article for links - LAM]
Oh dear. There's so much wrong with the logic of this posting that I hardly know where to begin.

Nils is mostly upset about a recent posting on Genomicron: The Junk DNA myth strikes again (next up: media hype). This isn't very complicated so let me give you the short version.

Most of our genome is junk. That does not mean that all of our genome is junk and it certainly never meant (among intelligent scientists) that all of our non-coding DNA is junk. Here's a short list of non-coding DNA that is absolutely essential in our genome: all genes that produce functional RNAs instead of proteins; all regulatory sequences including enhancers; sequences that control splicing and other RNA processing events such as capping and polyadenylation; some 5′-leaders and 3′-tails of mRNA; chromatin domain markers (regulatory); scaffold attachment sites (SARs); some recombination hotspots; origins of replication; centromeres; telomeres.

Ryan was complaining about a paper that's about to be published in Molecular Biology and Evolution. The authors say this in their abstract.
Protein-coding sequences make up only about 1% of the mammalian genome. Much of the remaining 99% has been long assumed to be junk DNA, with little or no functional significance.
I agree with Ryan Gregory that this is extremely misleading. It implies that there are legitimate scientists who think that all non-coding DNA is junk. It would be far better to say something like this ...
Genes that encode proteins, and other genes, make up only a few percent of our genome. If you add in all of the other DNA sequences that are known to be essential you still can only account for no more than 5% of our genome. Most of the rest is thought to be junk DNA with no biological function. There are no respectable scientists who think that none of it will ever be shown to have a function but the general consensus among the defenders of junk DNA is that the vast majority of these DNA sequences, consisting mostly of defective transposons and pseudogenes, will turn out to have no function.
The authors of the paper go on to present evidence that about 5.4% of non-coding DNA has a function.

Big deal. That's not much more than what the textbooks have been saying for several decades.

Nils, there's an interesting debate going on about the amount of junk DNA in our genome. You're welcome to participate but please make sure you understand the issue and, please, don't spread false information. When we say that most of our genome is junk that does not mean that some of what we now consider to be junk DNA won't turn out to have a function. We're not that stupid—please don't imply that we are.

What we're saying is that the vast majority of DNA sequence in our genome is junk. I think the amount of junk is going to be >90%. That still leaves room for discovering a function for about twice as much DNA as we already know to be functional.

Get back to me when someone publishes solid evidence that more than 10% of our genome is essential.


Thursday, April 05, 2018

Subhash Lakhotia: The concept of 'junk DNA' becomes junk

Continuing my survey of recent papers on junk DNA, I stumbled upon a review by Subhash Lakhotia that has recently been accepted in The Proceedings of the Indian National Science Academy (Lakhotia, 2018). It illustrates the extent of the publicity campaign mounted by ENCODE and opponents of junk DNA. In the title of this post, I paraphrased a sentence from the abstract that summarizes the point of the paper; namely, that the 'recent' discovery of noncoding RNAs refutes the concept of junk DNA.

Lakhotia claims to have written a review of the history of junk DNA but, in fact, his review perpetuates a false history. He repeats a version of history made popular by John Mattick. It goes like this. Old-fashioned scientists were seduced by Crick's central dogma into thinking that the only important part of the genome was the part encoding proteins. They ignored genes for noncoding RNAs because they didn't fit into their 'dogma.' They assumed that most of the noncoding part of the genome was junk. However, recent new discoveries of huge numbers of noncoding RNAs reveal that those scientists were very stupid. We now know that the genome is chock full of noncoding RNA genes and the concept of junk DNA has been refuted.

Wednesday, October 12, 2011

A Twofer

A few weeks ago David Klinghoffer criticized science bloggers for only going after "extremely marginal and daffy creationists." He challenged us to take on the "real scientists" like Jonathan M. [A Reason to Doubt the IDiots]

Today you're in for a treat, dear readers, 'cause I'm going to respond to a daffy creationist who happens to be Jonathan M. It's a twofer!

Tuesday, February 27, 2024

Nils Walter disputes junk DNA: (1) The surprise

Nils Walter attempts to present the case for a functional genome by reconciling opposing viewpoints. I address his criticisms of the junk DNA position and discuss his arguments in favor of large numbers of functional non-coding RNAs.

Nils Walter is Francis S. Collins Collegiate Professor of Chemistry, Biophysics, and Biological Chemistry at the University of Michigan in Ann Arbor (Michigan, USA). He works on human RNAs and claims that, "Over 75% of our genome encodes non-protein coding RNA molecules, compared with only <2% that encodes proteins." He recently published an article explaining why he opposes junk DNA.

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

The human genome project's lasting legacies are the emerging insights into human physiology and disease, and the ascendance of biology as the dominant science of the 21st century. Sequencing revealed that >90% of the human genome is not coding for proteins, as originally thought, but rather is overwhelmingly transcribed into non-protein coding, or non-coding, RNAs (ncRNAs). This discovery initially led to the hypothesis that most genomic DNA is “junk”, a term still championed by some geneticists and evolutionary biologists. In contrast, molecular biologists and biochemists studying the vast number of transcripts produced from most of this genome “junk” often surmise that these ncRNAs have biological significance. What gives? This essay contrasts the two opposing, extant viewpoints, aiming to explain their basis, which arise from distinct reference frames of the underlying scientific disciplines. Finally, it aims to reconcile these divergent mindsets in hopes of stimulating synergy between scientific fields.

Saturday, April 07, 2018

Required reading for the junk DNA debate

This is a list of scientific papers on junk DNA that you need to read (and understand) in order to participate in the junk DNA debate. It's not a comprehensive list because it's mostly papers that defend junk DNA and refute arguments for massive amounts of function. The only exception is the paper by Mattick and Dinger (2013).1 It's the only anti-junk paper that attempts to deal with the main evidence for junk DNA. If you know of any other papers that make a good case against junk DNA then I'd be happy to include them in the list.

If you come across a publication that argues against junk DNA, then you should immediately check the reference list. If you do not see some of these references in the list, then don't bother reading the paper because you know the author is not knowledgeable about the subject.

Brenner, S. (1998) Refuge of spandrels. Current Biology, 8:R669-R669. [PDF]

Brunet, T.D., and Doolittle, W.F. (2014) Getting “function” right. Proceedings of the National Academy of Sciences, 111:E3365-E3365. [doi: 10.1073/pnas.1409762111]

Casane, D., Fumey, J., et Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. Med. Sci. (Paris) 31: 680-686. [doi: 10.1051/medsci/20153106023] [The apophenia of ENCODE or Pangloss looks at the human genome]

Cavalier-Smith, T. (1978) Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox. Journal of Cell Science, 34(1), 247-278. [PDF]

Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) published online March 11, 2013. [PubMed] [doi: 10.1073/pnas.1221376110]

Doolittle, W.F., Brunet, T.D., Linquist, S., and Gregory, T.R. (2014) Distinguishing between “function” and “effect” in genome biology. Genome biology and evolution 6, 1234-1237. [doi: 10.1093/gbe/evu098]

Doolittle, W.F., and Brunet, T.D. (2017) On causal roles and selected effects: our genome is mostly junk. BMC biology, 15:116. [doi: 10.1186/s12915-017-0460-9]

Eddy, S.R. (2012) The C-value paradox, junk DNA and ENCODE. Current Biology, 22:R898. [doi: 10.1016/j.cub.2012.10.002]

Eddy, S.R. (2013) The ENCODE project: missteps overshadowing a success. Current Biology, 23:R259-R261. [doi: 10.1016/j.cub.2013.03.023]

Graur, D. (2017) Rubbish DNA: The functionless fraction of the human genome. In Evolution of the Human Genome I (pp. 19-60): Springer. [doi: 10.1007/978-4-431-56603-8_2 (book)] [PDF]

Graur, D. (2017) An upper limit on the functional fraction of the human genome. Genome Biology and Evolution, 9:1880-1885. [doi: 10.1093/gbe/evx121]

Graur, D., Zheng, Y., Price, N., Azevedo, R. B., Zufall, R. A., and Elhaik, E. (2013) On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution published online: February 20, 2013 [doi: 10.1093/gbe/evt028]

Graur, D., Zheng, Y., and Azevedo, R.B. (2015) An evolutionary classification of genomic function. Genome Biology and Evolution, 7:642-645. [doi: 10.1093/gbe/evv021]

Gregory, T. R. (2005) Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics, 6:699-708. [doi: 10.1038/nrg1674]

Haerty, W., and Ponting, C.P. (2014) No Gene in the Genome Makes Sense Except in the Light of Evolution. Annual review of genomics and human genetics, 15:71-92. [doi: 10.1146/annurev-genom-090413-025621]

Hurst, L.D. (2013) Open questions: A logic (or lack thereof) of genome organization. BMC biology, 11:58. [doi: 10.1186/1741-7007-11-58]

Kellis, M., Wold, B., Snyder, M.P., Bernstein, B.E., Kundaje, A., Marinov, G.K., Ward, L.D., Birney, E., Crawford, G. E., and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) 111:6131-6138. [doi: 10.1073/pnas.1318948111]

Mattick, J. S., and Dinger, M. E. (2013) The extent of functionality in the human genome. The HUGO Journal, 7:2. [doi: 10.1186/1877-6566-7-2]

Morange, M. (2014) Genome as a Multipurpose Structure Built by Evolution. Perspectives in biology and medicine, 57:162-171. [doi: 10.1353/pbm.2014.000]

Niu, D. K., and Jiang, L. (2012) Can ENCODE tell us how much junk DNA we carry in our genome?. Biochemical and biophysical research communications 430:1340-1343. [doi: 10.1016/j.bbrc.2012.12.074]

Ohno, S. (1972) An argument for the genetic simplicity of man and other mammals. Journal of Human Evolution, 1:651-662. [doi: 10.1016/0047-2484(72)90011-5]

Ohno, S. (1972) So much "junk" in our genome. In H. H. Smith (Ed.), Evolution of genetic systems (Vol. 23, pp. 366-370): Brookhaven symposia in biology.

Palazzo, A.F., and Gregory, T.R. (2014) The Case for Junk DNA. PLoS Genetics, 10:e1004351. [doi: 10.1371/journal.pgen.1004351]

Rands, C. M., Meader, S., Ponting, C. P., and Lunter, G. (2014) 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. PLOS Genetics, 10:e1004525. [doi: 10.1371/journal.pgen.1004525]

Thomas Jr, C.A. (1971) The genetic organization of chromosomes. Annual review of genetics, 5:237-256. [doi: annurev.ge.05.120171.001321]


1. The paper by Kellis et al. (2014) is ambiguous. It's clear that most of the ENCODE authors are still opposed to junk DNA even though the paper is mostly a retraction of their original claim that 80% of the genome is functional.

Wednesday, November 25, 2015

Selfish genes and transposons

Back in 1980, the idea that large fractions of animal and plant genomes could be junk was quite controversial. Although the idea was consistent with the latest developments in population genetics, most scientists were unaware of these developments. They were looking for adaptive ways of explaining all the excess DNA in these genomes.

Some scientists were experts in modern evolutionary theory but still wanted to explain "junk DNA." Doolittle & Sapienza, and Orgel & Crick, published back-to-back papers in the April 17, 1980 issue of Nature. They explained junk DNA by claiming that most of it was due to the presence of "selfish" transposons that were being selected and preserved because they benefited their own replication and transmission to future generations. They have no effect on the fitness of the organism they inhabit. This is natural selection at a different level.

This prompted some responses in later editions of the journal and then responses to the responses.

Here's the complete series ...

Saturday, May 14, 2022

Editing the Wikipedia article on non-coding DNA

I decided to edit the Wikipedia article on non-coding DNA by adding new sections on "Noncoding genes," "Promoters and regulatory sequences," "Centromeres," and "Origins of replication." That didn't go over very well with the Wikipedia police so they deleted the sections on "Noncoding genes" and "Origins of replication." (I'm trying to restore them so you may see them come back when you check the link.)

I also decided to re-write the introduction to make it more accurate but my version has been deleted three times in favor of the original version you see now on the website. I have been threatened with being reported to Wikipedia for disruptive edits.

The introduction has been restored to the version that talks about the ENCODE project and references Nessa Carey's book. I tried to move that paragraph to the section on the ENCODE project and I deleted the reference to Carey's book on the grounds that it is not scientifically accurate [see Nessa Carey doesn't understand junk DNA]. The Wikipedia police have restored the original version three times without explaining why they think we should mention the ENCODE results in the introduction to an article on non-coding DNA and without explaining why Nessa Carey's book needs to be referenced.

The group that's objecting includes Ramos1990, Qzd, and Trappist the monk. (I am Genome42.) They seem to be part of a group that is opposed to junk DNA and resists the creation of a separate article for junk DNA. They want junk DNA to be part of the article on non-coding DNA for reasons that they don't/won't explain.

The main problem is the confusion between "noncoding DNA" and "junk DNA." Some parts of the article are reasonably balanced but other parts imply that any function found in noncoding DNA is a blow against junk DNA. The best way to solve this problem is to have two separate articles: one on noncoding DNA and its functions and another on junk DNA. There has been a lot of resistance to this among the current editors and I can only assume that this is because they don't see the distinction. I tried to explain it in the discussion thread on splitting by pointing out that we don't talk about non-regulatory DNA, non-centromeric DNA, non-telomeric DNA, or non-origin DNA and there's no confusion about the distinction between these parts of the genome and junk DNA. So why do we single out noncoding DNA and get confused?

It looks like it's going to be a challenge to fix the current Wikipedia page(s) and even more of a challenge to get a separate entry for junk DNA.

Here is the warning that I have received from Ramos1990.

Your recent editing history shows that you are currently engaged in an edit war; that means that you are repeatedly changing content back to how you think it should be, when you have seen that other editors disagree. To resolve the content dispute, please do not revert or change the edits of others when you are reverted. Instead of reverting, please use the talk page to work toward making a version that represents consensus among editors. The best practice at this stage is to discuss, not edit-war. See the bold, revert, discuss cycle for how this is done. If discussions reach an impasse, you can then post a request for help at a relevant noticeboard or seek dispute resolution. In some cases, you may wish to request temporary page protection.

Being involved in an edit war can result in you being blocked from editing—especially if you violate the three-revert rule, which states that an editor must not perform more than three reverts on a single page within a 24-hour period. Undoing another editor's work—whether in whole or in part, whether involving the same or different material each time—counts as a revert. Also keep in mind that while violating the three-revert rule often leads to a block, you can still be blocked for edit warring—even if you do not violate the three-revert rule—should your behavior indicate that you intend to continue reverting repeatedly.

I guess that's very clear. You can't correct content to the way you think it should be as long as other editors disagree. I explained the reason for all my changes in the "history" but none of the other editors have bothered to explain why they reverted to the old version. Strange.


Wednesday, March 27, 2013

ENCODE, Junk DNA, and Intelligent Design Creationism

andyjones has replied to my earlier posting on ENCODE and junk DNA. You can read his response at: (More) Function, the evolution-free gospel of ENCODE. Here's part of what he says ...

Larry Moran has sort-of replied to my previous blogpost but disappoints with only one substantive point. And even that one point is wrong: ID is not committed to the idea that individual genomes be well-designed; that is just an expectation some of us derive based on belief in a designer which is established on other evidence. ID would still be true if only globular proteins were designed (lookup Axe), or even if only the flagellum was designed (lookup Behe), or even if only the first life form was designed (lookup Meyer – and please read their actual work, not cheap reviews, because reviewers often dont pick up on the salient points – more below). I just say this lest readers get the impression that this is ID’s strongest point, or in any sense a weak point. It is neither.

It's true that there are some IDiots who are distancing themselves from a commitment to junk DNA. There are probably some who claim that they could live with the fact that 90% of our DNA is junk.

But let's not forget that Jonathan Wells is a prominent IDiot and he wrote a book on The Myth of Junk DNA. It sounded very much like Intelligent Design Creationism is staking its reputation on finding function for most of our genome.

Monday, October 21, 2013

Jukes to Crick on Junk DNA

Dan Graur discovered that the term "junk DNA" was commonly used in the 1960s, long before Susumu Ohno used "junk" in the title of his 1972 paper. This makes a lot of sense. Apparently the term was quite commonly used in Cambridge by people like Francis Crick and Sydney Brenner. (Perhaps you've heard of them?)

Graur found a 1963 paper that refers to "junk" DNA. This is the earliest known reference to junk in the scientific literature. Read about his sleuthing at: The Origin of Junk DNA: A Historical Whodunnit.

Meanwhile, a person named "ShadiZl" commented on one of my posts and pointed me to a letter from Thomas Jukes to Francis Crick in 1979. Jukes, you might recall, was no Darwinian. He was a proponent of Neutral Theory and random genetic drift. The letter is archived on the National Library of Medicine (USA) site under a section devoted to The Francis Crick Papers: Letter from Thomas H. Jukes to Francis Crick.

The letter is interesting because it reveals how casually the "insiders" talked about junk DNA and about the adaptationist misconception even as far back as 1979. This was when Gould and Lewontin published the "spandrels" paper. It also reveals how misguided the creationists are when it comes to the history of junk DNA. They still think that it was "Darwinists" who "predicted" junk DNA based on their view of natural selection. (Do not read this letter if you are irony-deficient. It will only confuse you.)
December 20, 1979

Dear Francis:

I am sure that you realize how frightfully angry a lot of people will be if you say that much of the DNA is junk. The geneticists will be angry because they think that DNA is sacred. The Darwinian evolutionists will be outraged because they believe every change in DNA that is accepted in evolution is necessarily an adaptive change. To suggest anything else is an insult to the sacred memory of Darwin.

This attitude is so pervasive that if no reason can be found for an evolutionary change, it is necessary to invent one. Kimura points out that one author attributed the pink color of flamingos to protective coloration against the setting sun. This type of thinking carries over into people who sequence mRNA. They claim that differences between rabbit and human globin mRNAs are because each species has its own requirements for secondary structure.

Various people have tried to think up possible functions for the regions of DNA that do not code for anything as far as is known. Roy Britten says that such DNA has a regulatory function.

Actually, the scheme proposed by Britten about ten years ago was that occasionally events of saltatory duplication, took place, so that a great many copies of a short piece of DNA were made. As time went by, the composition of a family of identical copies became changed by drift, until the copies no longer closely resemble each other. Figure 55 of the article by Britten shows a diagram of a sort of "junk DNA generating system". I note that he says on page 105 "the rate of increase in DNA content per cell resulting from saltatory replication alone may prove to be embarrassingly large and a mechanism for the loss of DNA may have to be invoked". I gather that you agree with this.

I quoted you on drift in DNA in a talk that I gave at the symposium for Emil Smith (see enclosure). Your concept of "junk DNA" presumably includes this idea. I shall look forward to hearing more about it, and I have been asked by Die Naturwissenschaften to write an article on silent changes, so I hope I can include mention of your new manuscript when I start to write mine.

With best regards,


Thomas H. Jukes



Saturday, May 04, 2024

Casey Luskin posts misleading quotes about junk DNA

On Thursday May 2, 2024, Casey Luskin and Dan Stern Cardinale debated junk DNA on the YouTube channel "The NonSequitor Show." David Klinghoffer thinks that this debate went very well for the ID side [Debate: Casey Luskin Versus Rutgers Biologist Dan Cardinale, Thursday, May 2]. I agree with Klinghoffer; Luskin did an excellent job of promoting his case because many of his statements and claims were not challenged effectively.

I'll be putting up a separate post on the debate but for now I'd like to address an article by Casey Luskin that he posted before the debate as preparation for what he was going to say. The article consists of a bunch of quotes from prominent scientists about junk DNA [“Junk DNA” from Three Perspectives: Some Key Quotes]. Here are the three perspectives, according to Luskin.

Category 1: Quotes from evolutionists claiming (or repeating the widespread belief) that non-coding DNA is “junk” and has no function.

Some of the quotes represent the actual position of junk DNA proponents but Luskin has also picked out stupid quotes from scientists who think, incorrectly, that all non-coding DNA is junk. This is deliberate as we will see below.

Category 2: Early quotes from intelligent design theorists predicting function for non-coding “junk” DNA.

Luskin builds the case for function in non-coding DNA by quoting religious scientists who "predict" that there will be functional DNA in non-coding regions of the genome. This is disingenuous at best because Luskin knows full well that from the very beginning of the scientific debate we knew about functional non-coding DNA. It was never the case that all non-coding DNA was assumed to be junk.

Category 3: Quotes from mainstream scientific sources saying that we’ve experienced a shift in our thinking that junk DNA actually has function.

Many of these quotes are from scientists announcing that some non-coding DNA has a function. They support Luskin's false claim that all non-coding DNA was thought to be junk and the discovery of functional regions of non-coding DNA has resulted in a "paradigm shift" in our view of the human genome.

Casey Luskin should not have been allowed to get away with equating junk DNA and non-coding DNA in the debate. He should have been challenged to retract that false claim at the very beginning of the debate and called out whenever he used the term "non-coding DNA" during the debate.