A recent paper on characterizing endogenous retrovirus sequences has attracted some attention because of a press release from Kyoto University that focused on refuting junk DNA. But it turns out that there's no mention of junk DNA in the published paper.
Let's start with a little background. Retroviruses are RNA viruses that go through a stage where their RNA genomes are copied into DNA by reverse transcriptase. The virus may integrate into the host genome and be carried along for many generations producing low levels of virus particles [Retrotransposons/Endogenous Retroviruses]. The integrated copies are called endogenous retroviruses (ERVs).
Our genome contains about 31 different families of ERVs that have integrated over millions of years. Most of the original virus genomes have acquired mutations, including insertions and deletions, and they are no longer active. These sequences account for about 8% of our genome.
This is a podcast in French on the topic of junk DNA. The moderator is Thomas C. Durand of La Tronche en Biais, a YouTube channel that focuses on critical thinking. Durand interviews two scientists from l’Université Paris Cité (City University of Paris), Didier Casane and Patrick Laurenti.
It's a two-hour video that discusses all the relevant topics on the human genome and junk DNA. The most exciting part for me comes at the 56-minute mark when the moderator asks Casane and Laurenti to recommend a book on the subject (see screenshot on right). Patrick Laurenti suggests that my book should be translated into French but I don't think that's going to happen.
The John Templeton Foundation supports "interdisciplinary research and catalyze conversations that enable people to pursue lives of meaning and purpose." Many of these projects have religious themes or religious implications. The foundation is well-known for its support of projects that promote the compatibility of science and religion. You can see a list of recent grants here.
Templeton recently awarded a grant of $607,686 (US) to study the role of transposons in the human genome. The project leader is Stefan Linquist, a philosopher from the University of Guelph (Guelph, Ontario, Canada). Stefan has published a number of papers on junk DNA and he promotes the definition of functional DNA as DNA that is subject to purifying selection [The function wars are over]. Other members of the team include Ryan Gregory and Ford Doolittle who are prominent supporters of junk DNA.
A few months ago (June, 2024) I commented on an article by Tom Cech in The New York Times. [Tom Cech writes about the "dark matter" of the genome] In that article he expressed the view that 75% of the human genome consists of "dark matter" that is copied into RNAs of unknown function. He believes that many of these mysterious RNAs will turn out to have exciting functions.
I suspected that Cech is opposed to junk DNA and that suspicion is confirmed in his new book The Catalyst.
Every now and then I check Google to see if there's any news about junk DNA. I use "junk DNA" as my search query.
The first thing I see at the top of the results page is a summary of the topic created by Google's Generative AI, which it claims is experimental. The AI summary is different every time you start a new search but all of the responses are similar in that they criticize the idea of junk DNA. Here's an example from today,
The idea that most of the human genome is junk originated more than 50 years ago. Since then, evidence in support of this concept has steadily accumulated but it has been strongly resisted by most biochemists and molecular biologists. Opposition is even stronger among scientists in other fields and in the general public thanks to a steady stream of anti-junk articles in the popular press.
Much of this opposition to junk DNA stems from a massive publicity campaign launched by ENCODE researchers and the leading science journals back in 2012.
It's likely that most of the controversy over junk DNA is related to differing views on evolution and the power of natural selection. Most people think that natural selection is very powerful so that modern species must be extremely well-adapted to their present environment. They tend to believe that complexity is simply a reflection of sophisticated fine-tuning and this must apply to the human genome. According to this view, the presence of huge amounts of DNA with an unknown function is just a temporary situation and in the next few years most of this 'dark matter' will turn out to have a function. It has to have a function otherwise natural selection would have eliminated it.
The Center for Science and Culture (sic) and the Discovery Institute (sic) have published another propaganda video on junk DNA. The emphasis is on their claim that ID predicted a functional genome and that prediction turned out to be correct! The difference between this video and previous attempts to rationalize their failures is that I now get a personal mention and a caricature in this latest video.
I think I understand the problem. The ID creationists are getting worried about junk DNA as they realize that more and more scientists are beginning to understand the real problems with the ENCODE data and previous claims of function. This is why they are attempting to rebut the science behind junk DNA. But the real problem is that they simply don't understand the science as you can see in the video.
Once again, we are faced with a question about whether Intelligent Design Creationists are stupid or lying (or both).
Current Opinion in Plant Biology has a special edition devoted to Genome studies and molecular genetics 2024. The only paper (so far) that discusses plant genomes is one devoted to RNAs. Here's the abstract ...
Anyatama, A., Datta, T., Dwivedi, S. and Trivedi, P.K. (2024) Transcriptional junk: Waste or a key regulator in diverse biological processes? Current Opinion in Plant Biology 82:102639. [doi: 10.1016/j.pbi.2024.102639]
Plant genomes, through their evolutionary journey, have developed a complex composition that includes not only protein-coding sequences but also a significant amount of non-coding DNA, repetitive sequences, and transposable elements, traditionally labeled as “junk DNA”. RNA molecules from these regions, labeled as “transcriptional junk,” include non-coding RNAs, alternatively spliced transcripts, untranslated regions (UTRs), and short open reading frames (sORFs). However, recent research shows that this genetic material plays crucial roles in gene regulation, affecting plant growth, development, hormonal balance, and responses to stresses. Additionally, some of these regulatory regions encode small proteins, such as miRNA-encoded peptides (miPEPs) and microProteins (miPs), which interact with DNA or nuclear proteins, leading to chromatin remodeling and modulation of gene expression. This review aims to consolidate our understanding of the diverse roles that these so-called “transcriptional junk” regions play in regulating various physiological processes in plants.
Lungfish are our closest living fish cousins. All living terrestrial vertebrates (e.g. amphibians, mammals, reptiles) descend from a common ancestor shared with lungfish. The split occurred about 400 million years ago (400 Ma) (Devonian) when there were 70-100 different lungfish species.
This relationship (lungfish-tetrapods) was firmly established recently by comparing the genome of the Australian lungfish (Neoceratodus forsteri) with that of tetrapods (Meyer et al., 2021). The other possibility had been coelacanth-tetrapods. Coelacanths and lungfish are related; they form the class Sarcopterygii (lobe-finned fish).
I just learned that John Mattick gave a seminar this morning at the Department of Cell & Systems Biology at the University of Toronto. Unfortunately, I was unable to attend.
Most Sandwalk readers will recognize Mattick as one of the few remaining vocal opponents of junk DNA. He is probably best known for his dog-ass plot but this is only one of the ways he misrepresents science.
Scite Assistant is billed as "your AI research partner" and as "ChatGPT for researchers." It's supposed to draw on peer-reviewed published scientific papers for its information and it will give you an answer with genuine citations.
That sounds like a good idea until you realize that the scientific literature is full of misinformation and conflicting information. What we need is an AI assistant that can help us sort through the misinformation and give us a genuine well-informed answer on controversial issues.
Let's pick the question of junk DNA as a completely random (!) example of such an issue. The scientific literature is full of false information about the origin of the term "junk DNA" and what it was originally intended to describe. It's also full of false information about recent results and how they pertain to junk DNA.
Zach Hancock is a postdoc in ecology & evolutionary biology at the University of Michigan. He has a YouTube channel with several thousand subscribers. You might recall that he interviewed me last year when my book came out [Zach Hancock interviews me on his YouTube channel].
He has just posted a new video on junk DNA in which he tries to correct all the falsehoods and misinformation on junk DNA, especially those promoted by creationists. It's well worth watching.
Tom Cech won a Nobel Prize for discovering one example of a catalytic RNA. He recently published an article in the New York Times extolling the virtues of RNA and non-coding genes [The Long-Overlooked Molecule That Will Define a Generation of Science]. There's a fair amount of hype in the article but the main point is quite valid—over the past fifty years we have learned about dozens of important non-coding RNAs that we didn't know about at the beginning of molecular biology [see: Non-coding RNA, Non-coding DNA].
The main issue in this field concerns the number of non-coding genes in the human genome. I cover the available data in my book and conclude that there are fewer than 1000 (p.214). Those scientists who promote the importance of RNA (e.g. Tom Cech) would like you to believe that there are many more non-coding genes; indeed, most of those scientists believe that there are more non-coding genes than coding genes (i.e. > 20,000). They rarely present evidence for such a claim beyond noting that much of our genome is transcribed.
Tom Cech is wise enough to avoid publishing an estimate of the number of non-coding genes but his bias is evident in the following paragraph from near the end of his article.
Although most scientists now agree on RNA's bright promise, we are still only beginning to unlock its potential. Consider, for instance, that some 75 percent of the human genome consists of dark matter that is copied into RNAs of unknown function. While some researchers have dismissed this dark matter as junk or noise, I expect it will be the source of even more exciting breakthroughs.
Let's dissect this to see where the bias lies. The first thing you note is the use of the term "dark matter" to make it sound like there's a lot of mysterious DNA in our genome. This is not true. We know a heck of a lot about our genome, including the fact that it's full of junk DNA. Only 10% of the genome is under purifying selection and assumed to be functional. The rest is full of introns, pseudogenes, and various classes of repetitive sequences made up mostly of degraded transposons and viruses. The entire genome has been sequenced—there's not much mystery there. I don't know why anyone refers to this as "dark matter" unless they have a hidden agenda.
The second thing you notice is the statement that 75% of the genome is transcribed at some time or another and, according to Tom Cech, these transcripts have an unknown function. That's strange since protein-coding genes take up roughly 40% of our genome and we know a great deal about coding DNA, UTRs, and introns. If you add in the known examples of non-coding genes, this accounts for an additional 2-3% of the genome.1
Almost all the rest of the transcripts come from non-conserved DNA and those transcripts are present at less than one copy per cell. As the ENCODE researchers noted in 2014, they are likely to be junk RNA resulting from spurious transcription. I'd say we know a great deal about the fraction of the genome that's transcribed and there's not much indication that it's hiding a plethora of undiscovered functional RNAs.
1. In my book I make a generous estimate of 5,000 non-coding genes in order to avoid quibbling over a smaller number and in order to demonstrate that even with such an obvious over-estimate the genome is still 90% junk.
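The accounting of the transcribed fraction can be laid out as simple arithmetic. What follows is only a rough sketch using the percentages quoted above; it assumes (my simplification, for illustration) that all of the known genes fall inside the 75% that is transcribed, and the function name is made up for this example.

```python
# Percentages of the genome quoted in the post.
TRANSCRIBED = 75.0            # fraction said to be copied into RNA
PROTEIN_CODING_GENES = 40.0   # coding DNA, UTRs, and introns
NONCODING_GENES = (2.0, 3.0)  # known non-coding genes (low, high)


def transcribed_outside_known_genes(transcribed, protein_coding, noncoding):
    """Percent of the genome that is transcribed but not part of a known gene.

    Assumes every known gene lies within the transcribed fraction.
    """
    return transcribed - protein_coding - noncoding


# Subtracting the high non-coding estimate gives the low end of the remainder.
low = transcribed_outside_known_genes(
    TRANSCRIBED, PROTEIN_CODING_GENES, NONCODING_GENES[1]
)
high = transcribed_outside_known_genes(
    TRANSCRIBED, PROTEIN_CODING_GENES, NONCODING_GENES[0]
)

print(f"Transcribed outside known genes: {low:.0f}-{high:.0f}% of the genome")
```

On these numbers, only about a third of the genome is left for transcripts of unknown origin, and that is the portion that comes mostly from non-conserved DNA.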
Here's a link to the junk DNA debate between Dan Stern Cardinale and Casey Luskin. The debate took place on May 2, 2024.
I mentioned in a previous post that Luskin should have been called out on his repeated attempts to equate junk DNA with non-coding DNA. This allowed him to portray all non-coding functions as evidence against junk DNA. [Casey Luskin posts misleading quotes about junk DNA].
There are several other things that I would have done differently. I would have made it clear that 10% of the genome is functional and we don't know the function of some of that fraction. Thus, all newly discovered functional regions could still fit into the 10% and 90% of the genome is still junk. Every time Casey mentioned a new function he should have been challenged to specify exactly what percentage of the genome he was referring to. (Dan tried to do this but he was too nice, and let Casey off the hook.)
The idea here is to make it clear to viewers that recent discoveries of functional regions do not affect the idea that most of our genome is junk.
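The challenge to "specify exactly what percentage" is easy to make concrete. Here is a minimal sketch of that calculation. The genome size of roughly 3.1 billion base pairs is an assumption I am adding, and the claim being tested (100,000 elements of 500 bp each) is entirely hypothetical and not taken from the debate. The function names are invented for this example.

```python
GENOME_SIZE_BP = 3_100_000_000  # assumption: approximate human genome size
FUNCTIONAL_FRACTION = 0.10      # from the post: about 10% is functional


def percent_of_genome(total_bp, genome_size_bp=GENOME_SIZE_BP):
    """Express a number of base pairs as a percentage of the genome."""
    return 100.0 * total_bp / genome_size_bp


def fits_in_functional_fraction(total_bp, genome_size_bp=GENOME_SIZE_BP):
    """True if the claimed functional DNA is no larger than the 10% fraction."""
    return total_bp <= FUNCTIONAL_FRACTION * genome_size_bp


# Hypothetical claim: 100,000 newly described elements of 500 bp each.
claim_bp = 100_000 * 500

print(f"Claim covers {percent_of_genome(claim_bp):.2f}% of the genome")
print("Fits within the functional 10%:", fits_in_functional_fraction(claim_bp))
```

This is deliberately crude because it ignores how much of the 10% is already assigned to known functions, but it shows why even a large-sounding list of new functional elements need not change the 90% figure.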
I would also attempt to get Casey to admit that there's a scientific controversy over junk DNA so there are many papers defending junk DNA and criticizing the arguments of junk DNA opponents. For every quotation from a scientist who opposes junk, there's an equally significant quotation from one who supports junk. Why does Casey only quote scientists who agree with him? Is this cherry-picking? Is selectively rattling off quotations and references from people who agree with you a reasonable way to have a serious scientific debate?
I think the arguments over transcripts should begin with presenting all the scientific evidence that spurious transcripts exist - for example, random DNA sequences inserted into a cell nucleus are transcribed and spurious transcription is easily documented in well-studied organisms such as bacteria and yeast. The characteristics of spurious transcription are that the transcripts are present in very small amounts, that they are rapidly degraded, that they come from regions of the genome that are not under purifying selection, and that they are cell/tissue specific. So what is the most reasonable explanation when you look at such transcripts?
Casey Luskin's attempt to avoid the best explanation (spurious transcription) is a classic example of an ad hoc rescue and it might have been useful to point this out to viewers.
Regulation is not new. There was serious discussion and debate over the amount of the genome devoted to regulation back in the late 1960s when the concept of junk DNA was first proposed. Casey should have been challenged to state what percentage of the genome is devoted to regulation and if he comes up with an unreasonable number he should have to give examples of many well-studied genes that have been shown to have that level of regulation. (Hint: There aren't any.) All of the detailed work on the regulation of dozens of specific human genes has shown that you don't need more than a few transcription factor binding sites to control expression. Is there any reason to suppose that the other genes require ten or a hundred times more regulatory sequences to control expression?
What is the trend line? Ever since the ENCODE publicity disaster of 2012 there has been a flood of papers defending junk DNA. The data supporting junk DNA is now stronger than it has ever been because we now know from hundreds of thousands of human genome sequences that only about 10% is under purifying selection. There have also been a lot of papers fleshing out the 10% of the genome that's functional. There have only been a handful of papers published in the past ten years that seriously attempt to present evidence that most of our genome is functional. I would have challenged Casey to come up with a single scientific publication in the past ten years claiming, with supporting data, that most of the genome is functional.
On Thursday May 2, 2024, Casey Luskin and Dan Stern Cardinale debated junk DNA on the YouTube channel "The NonSequitor Show." David Klinghoffer thinks that this debate went very well for the ID side [Debate: Casey Luskin Versus Rutgers Biologist Dan Cardinale, Thursday, May 2]. I agree with Klinghoffer; Luskin did an excellent job of promoting his case because many of his statements and claims were not challenged effectively.
I'll be putting up a separate post on the debate but for now I'd like to address an article by Casey Luskin that he posted before the debate as preparation for what he was going to say. The article consists of a bunch of quotes from prominent scientists about junk DNA [“Junk DNA” from Three Perspectives: Some Key Quotes]. Here are the three perspectives, according to Luskin.
Category 1: Quotes from evolutionists claiming (or repeating the widespread belief) that non-coding DNA is “junk” and has no function.
Some of the quotes represent the actual position of junk DNA proponents but Luskin has also picked out stupid quotes from scientists who think, incorrectly, that all non-coding DNA is junk. This is deliberate as we will see below.
Category 2: Early quotes from intelligent design theorists predicting function for non-coding “junk” DNA.
Luskin builds the case for function in non-coding DNA by quoting religious scientists who "predict" that there will be functional DNA in non-coding regions of the genome. This is disingenuous at best because Luskin knows full well that from the very beginning of the scientific debate we knew about functional non-coding DNA. It was never the case that all non-coding DNA was assumed to be junk.
Category 3: Quotes from mainstream scientific sources saying that we’ve experienced a shift in our thinking that junk DNA actually has function.
Many of these quotes are from scientists announcing that some non-coding DNA has a function. They support Luskin's false claim that all non-coding DNA was thought to be junk and the discovery of functional regions of non-coding DNA has resulted in a "paradigm shift" in our view of the human genome.
Casey Luskin should not have been allowed to get away with equating junk DNA and non-coding DNA in the debate. He should have been challenged to retract that false claim at the very beginning of the debate and called out whenever he used the term "non-coding DNA" during the debate.
Intelligent Design Creationists are heavily invested in refuting junk DNA because it casts doubt on their model of an intelligently designed human. Over the years they have advanced all kinds of arguments against junk DNA and some ID supporters actually address the real scientific issues (e.g. Jonathan Wells). However, most Intelligent Design Creationists are as ignorant about the scientific dispute over junk DNA as they are about evolution and lots of other science issues that conflict with their underlying religious beliefs.
A few days ago (March 26, 2024), the Discovery Institute's Center for Science and Culture published a short video on "The MYTH of Junk DNA" where they ignored most of the science and appealed to the majority of creationists who don't care about the truth. We have enough data to conclude that the Discovery Institute isn't just ignorant of the real science but is actually lying in this video. We know this because there are prominent Senior Fellows of the Center for Science and Culture who know that the material in this video is wrong and/or misleading.
Should universities remove online courses that contain incorrect or misleading information?
There are lots of scientific controversies where different scientists have conflicting views. Eventually these controversies will be solved by normal scientific means involving evidence and logic but for the time being there isn't enough data to settle a genuine scientific controversy. Many of us are interested in these controversies and some of us have chosen to invest time and effort into defending one side or the other.
But there's a dark side of science that infects these debates—false or misleading information used to support one side of a legitimate controversy. To give just one example, I'm frustrated at the constant reference to junk DNA being defined as non-coding DNA. Many scientists believe that this was the way junk DNA was defined by its earliest proponents and then they go on to say that the recent discovery of functional non-coding DNA refutes junk.
I don't know where this idea came from because there's nothing in the scientific literature from 50 years ago to support such a ridiculous claim. It must be coming from somewhere since the idea is so widespread.
Where does misinformation come from and how is it spread?
Paul Nelson is a Senior Fellow of the Discovery Institute—the most important source of intelligent design propaganda. Paul and I have been disagreeing about science for many years. He is prone to interpret anything he finds in the scientific literature as support for the idea that scientists have misunderstood their subject matter and failed to recognize that science supports intelligent design. My goal has always been to try and explain the actual science and why his interpretations are misguided. I have not been very successful.
The photo was taken in London (UK) in 2016 at a meeting on evolution. It looks like I'm holding my breath because I'm beside a creationist but I assure you that's not what was happening. We actually get along quite well in spite of the fact that he's wrong about everything. :-)