
Tuesday, February 27, 2024

Nils Walter disputes junk DNA: (2) The paradigm shaft

I'm discussing a recent paper published by Nils Walter (Walter, 2024). He is trying to explain the conflict between proponents of junk DNA and their opponents. His main focus is building a case for large numbers of non-coding genes.

This is the second post in the series. The first one outlines the issues that led to the current paper.

Nils Walter disputes junk DNA: (1) The surprise

Walter begins his defense of function by outlining a "paradigm shift" that's illustrated in Figure 1.

FIGURE 1: Assessment of the information content of the human genome ∼20 years before (left)[110] and after (right)[111] the Human Genome Project was preliminarily completed, drawn roughly to scale.[9] This significant progress can be described per Thomas Kuhn as a “paradigm shift” flanked by extended periods of “normal science”, during which investigations are designed and results interpreted within the dominant conceptual frameworks of the sub-disciplines.[9] Others have characterized this leap in assigning newly discovered ncRNAs at least a rudimentary (elemental) biochemical activity and thus function as excessively optimistic, or Panglossian, since it partially extrapolates from the known to the unknown.[75] Adapted from Ref. [9].

Reference #9 is a paper by John Mattick promoting a "Kuhnian revolution" in molecular biology. I've already discussed that paper as an example of a paradigm shaft, which is defined as a strawman "paradigm" set up to make your work look revolutionary [John Mattick's new paradigm shaft]. Here's the figure from the Mattick paper.

The Walter figure is another example of a paradigm shaft, not to be confused with a real paradigm shift.¹ Both pie charts misrepresent the amount of functional DNA since they don't show regulatory sequences, centromeres, telomeres, origins of replication, and SARs (scaffold attachment regions). Together, these account for more functional DNA than the functional regions of protein-coding genes and non-coding genes. We didn't know the exact amounts in 1980 but we sure knew they existed. I cover this in Chapter 5 of my book: "The Big Picture."

The 1980 view also implies, incorrectly, that we knew nothing about the non-functional component of the genome when, in fact, we knew by then that half of our genome was composed of transposon and viral sequences that were likely to be inactive, degenerate fragments of once active elements. (John Mattick's figure is better.)

The 2020 view implies that most intron sequences are functional since introns make up more than 40% of our genome but only about 3% of the pie chart. As far as I know, there's no evidence to support that claim. About 80% of the pie chart is devoted to transcripts identified as either small ncRNAs or lncRNAs. The implication is that the discovery of these RNAs represents a paradigm shift in our understanding of the genome.

The alternative explanation is that we've known since the late 1960s that most of the human genome is transcribed and that these transcripts (most of which turned out to be introns) are junk RNA that is confined to the nucleus and rapidly degraded. Advances in technology have enabled us to detect many examples of spurious transcripts that are present transiently at low levels in certain cells. I cover this in Chapter 8 of my book: "Noncoding Genes and Junk RNA."

The whole point of Nils Walter's paper is to defend the idea that most of these transcripts are functional and the alternative explanation is wrong. He's trying to present a balanced view of the controversy so he's well aware of the fact that some of us interpret the red part of the pie chart as spurious transcripts (junk RNA). If he's wrong, and I am right, then there's no paradigm shift.

You don't get to shift the paradigm all on your own, even if John Mattick is on your side. A true paradigm shift requires that the entire community of scientists changes their perspective and that hasn't happened.

In the next few posts we'll see whether Nils Walter can make a strong case that all those lncRNAs are functional. They cover about two-thirds of the genome in the pie chart. If we assume that the average length of these long transcripts is 2000 bp then this represents one million transcripts and potentially one million non-coding genes.
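The "one million" figure is easy to check with a back-of-envelope calculation. The sketch below is only that. The genome size of about 3.1 billion bp is my assumption (it isn't stated above), and it treats the transcripts as non-overlapping, which real transcripts are not.

```python
# Back-of-envelope check of the "one million transcripts" estimate.
# ASSUMPTION: haploid human genome of ~3.1 billion bp (not stated in the post).
GENOME_SIZE_BP = 3.1e9
# From the post: lncRNAs cover about two-thirds of the genome in the pie chart.
LNCRNA_FRACTION = 2 / 3
# From the post: assumed average length of a long transcript.
AVG_TRANSCRIPT_BP = 2000


def implied_transcript_count(genome_bp, fraction, avg_len_bp):
    """Number of non-overlapping transcripts needed to cover `fraction` of the genome."""
    return genome_bp * fraction / avg_len_bp


count = implied_transcript_count(GENOME_SIZE_BP, LNCRNA_FRACTION, AVG_TRANSCRIPT_BP)
print(f"Implied number of transcripts: {count:,.0f}")
```

With these inputs the result comes out at roughly one million. A shorter or longer assumed average length scales the count inversely.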

1. The term "paradigm shaft" was coined by reader Diogenes in a comment on this blog from many years ago.

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]


Larry Moran said...

@Mehrshad: The more junk DNA, the greater the chances of making spurious transcripts (junk RNA). The presence of junk DNA also increases the chance of harmful mutations. This is why adding junk DNA is slightly deleterious and why it is selected against in species with large populations where natural selection is powerful.

SPARC said...

Shouldn't it be "even if John Mattick is on your side" rather than "even if John Mattick is on our side"?

John Harshman said...

One good thing Mattick did in that figure: he explicitly tried to represent the overlap between introns and other classes of non-coding sequences. What percentage of intron sequences consist of repetitive elements, transposons, etc.? If you can divide all those classes into intronic and non-intronic sets it could be informative.

Larry Moran said...

@John Harshman: I do this in most of my blog posts and it's in my book. There's no hard data that I could find so I just assumed that about 45% of all transposon-related sequences are found in introns. You don't want to count sequences twice when you are adding up the junk DNA.

This gives 43% introns (mostly junk) and 30% transposon-related sequences outside of the introns.
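Here is a rough sketch of that bookkeeping, taking the 45% figure as the assumption it is. The total transposon share is not stated above. The code backs it out from the 30% outside introns, so treat that derived number (and the variable names) as illustrative only.

```python
# Partition the genome into non-overlapping slices so nothing is counted twice.
INTRON_SHARE = 43.0            # % of genome in introns (from the comment)
TE_OUTSIDE_INTRONS = 30.0      # % of genome: transposon-related, outside introns
TE_FRACTION_IN_INTRONS = 0.45  # ASSUMED fraction of transposon sequence inside introns


def partition(intron_share, te_outside, te_fraction_in_introns):
    """Return three non-overlapping slices (in % of genome) plus derived totals."""
    # If 55% of transposon sequence lies outside introns and that is 30% of
    # the genome, the implied total transposon share follows directly.
    te_total = te_outside / (1.0 - te_fraction_in_introns)
    te_in_introns = te_total * te_fraction_in_introns
    intron_only = intron_share - te_in_introns
    return {
        "intron_only": intron_only,
        "te_in_introns": te_in_introns,
        "te_outside_introns": te_outside,
        "te_total": te_total,
        "union": intron_only + te_in_introns + te_outside,
    }


slices = partition(INTRON_SHARE, TE_OUTSIDE_INTRONS, TE_FRACTION_IN_INTRONS)
for name, value in slices.items():
    print(f"{name}: {value:.1f}%")
```

The union is just 43% + 30%, since the transposons inside introns are already included in the intron slice.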

John Harshman said...

The simple point I'm trying to make is that you can, like Mattick, have a pie piece in one color representing intron sequences that aren't transposons, one for transposons that aren't in introns, and an intermediate color for transposons in introns, the intersection. Solves the problem.

John Harshman said...

Larry: Wow. I just read what he has to say about the C-value paradox. Words fail.

SPARC said...

It's frustrating and, in my opinion, embarrassing for Nils Walter that it took students fumbling with a Wikipedia page to make someone who likely sees himself as an educated, knowledgeable scientist aware of a years-long controversy about his central views, and more so that he still doesn't get it.

John Harshman said...

Walter's excuse for not establishing function through tests of...function... is that it wouldn't be a high-throughput operation and so could not test everything. So apparently he's not familiar with the concept of random sampling.

Larry Moran said...

@John Harshman: Here's how I tried to show the overlap in a pie chart back in 2018. The percentages are not correct but you get the idea.

What's In Your Genome? - The Pie Chart