Tuesday, February 27, 2024

Nils Walter disputes junk DNA: (2) The paradigm shaft

I'm discussing a recent paper published by Nils Walter (Walter, 2024). He is trying to explain the conflict between proponents of junk DNA and their opponents. His main focus is building a case for large numbers of non-coding genes.

This is the second post in the series. The first one outlines the issues that led to the current paper.

Nils Walter disputes junk DNA: (1) The surprise

Walter begins his defense of function by outlining a "paradigm shift" that's illustrated in Figure 1.

FIGURE 1: Assessment of the information content of the human genome ∼20 years before (left)[110] and after (right)[111] the Human Genome Project was preliminarily completed, drawn roughly to scale.[9] This significant progress can be described per Thomas Kuhn as a “paradigm shift” flanked by extended periods of “normal science”, during which investigations are designed and results interpreted within the dominant conceptual frameworks of the sub-disciplines.[9] Others have characterized this leap in assigning newly discovered ncRNAs at least a rudimentary (elemental) biochemical activity and thus function as excessively optimistic, or Panglossian, since it partially extrapolates from the known to the unknown.[75] Adapted from Ref. [9].

Reference #9 is a paper by John Mattick promoting a "Kuhnian revolution" in molecular biology. I've already discussed that paper as an example of a paradigm shaft, which is defined as a strawman "paradigm" set up to make your work look revolutionary [John Mattick's new paradigm shaft]. Here's the figure from the Mattick paper.

The Walter figure is another example of a paradigm shaft, not to be confused with a real paradigm shift.1 Both pie charts misrepresent the amount of functional DNA since they don't show regulatory sequences, centromeres, telomeres, origins of replication, and SARs (scaffold attachment regions). Together, these account for more functional DNA than the functional regions of protein-coding genes and non-coding genes. We didn't know the exact amounts in 1980 but we sure knew they existed. I cover this in Chapter 5 of my book: "The Big Picture."

The 1980 view also implies, incorrectly, that we knew nothing about the non-functional component of the genome when, in fact, we knew by then that half of our genome was composed of transposon and viral sequences that were likely to be inactive, degenerate fragments of once active elements. (John Mattick's figure is better.)

The 2020 view implies that most intron sequences are functional since introns make up more than 40% of our genome but only about 3% of the pie chart. As far as I know, there's no evidence to support that claim. About 80% of the pie chart is devoted to transcripts identified as either small ncRNAs or lncRNAs. The implication is that the discovery of these RNAs represents a paradigm shift in our understanding of the genome.

The alternative explanation is that we've known since the late 1960s that most of the human genome is transcribed and that these transcripts, most of which turned out to be introns, are junk RNA that is confined to the nucleus and rapidly degraded. Advances in technology have enabled us to detect many examples of spurious transcripts that are present transiently at low levels in certain cells. I cover this in Chapter 8 of my book: "Noncoding Genes and Junk RNA."

The whole point of Nils Walter's paper is to defend the idea that most of these transcripts are functional and the alternative explanation is wrong. He's trying to present a balanced view of the controversy so he's well aware of the fact that some of us interpret the red part of the pie chart as spurious transcripts (junk RNA). If he's wrong, and I am right, then there's no paradigm shift.

You don't get to shift the paradigm all on your own, even if John Mattick is on your side. A true paradigm shift requires that the entire community of scientists changes their perspective and that hasn't happened.

In the next few posts we'll see whether Nils Walter can make a strong case that all those lncRNAs are functional. They cover about two-thirds of the genome in the pie chart. If we assume that the average length of these long transcripts is 2000 bp then this represents one million transcripts and potentially one million non-coding genes.
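As a rough check on that arithmetic, here is a minimal Python sketch. The genome size of about 3.1 billion bp is an assumption (it isn't stated above), and the calculation treats the transcripts as non-overlapping, so it's a back-of-the-envelope estimate and not a real annotation count.

```python
# Back-of-the-envelope check of the "one million transcripts" estimate.
# ASSUMPTION (not from the post): haploid human genome size of ~3.1e9 bp.
GENOME_SIZE_BP = 3.1e9
LNCRNA_FRACTION = 2 / 3      # "about two-thirds of the genome" in the pie chart
AVG_TRANSCRIPT_BP = 2000     # average lncRNA length assumed in the post


def estimated_transcripts(genome_bp, fraction, avg_len_bp):
    """Non-overlapping transcripts needed to cover `fraction` of the genome."""
    return genome_bp * fraction / avg_len_bp


n = estimated_transcripts(GENOME_SIZE_BP, LNCRNA_FRACTION, AVG_TRANSCRIPT_BP)
print(f"about {n:,.0f} transcripts")
```

With these inputs the result comes out at roughly one million. Changing the assumed average length changes the count proportionally: halve the length and the number of transcripts doubles.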


1. The term "paradigm shaft" was coined by reader Diogenes in a comment on this blog from many years ago.

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

Nils Walter disputes junk DNA: (1) The surprise

Nils Walter attempts to present the case for a functional genome by reconciling opposing viewpoints. I address his criticisms of the junk DNA position and discuss his arguments in favor of large numbers of functional non-coding RNAs.

Nils Walter is Francis S. Collins Collegiate Professor of Chemistry, Biophysics, and Biological Chemistry at the University of Michigan in Ann Arbor (Michigan, USA). He works on human RNAs and claims that, "Over 75% of our genome encodes non-protein coding RNA molecules, compared with only <2% that encodes proteins." He recently published an article explaining why he opposes junk DNA.

Walter, N.G. (2024) Are non‐protein coding RNAs junk or treasure? An attempt to explain and reconcile opposing viewpoints of whether the human genome is mostly transcribed into non‐functional or functional RNAs. BioEssays:2300201. [doi: 10.1002/bies.202300201]

The human genome project's lasting legacies are the emerging insights into human physiology and disease, and the ascendance of biology as the dominant science of the 21st century. Sequencing revealed that >90% of the human genome is not coding for proteins, as originally thought, but rather is overwhelmingly transcribed into non-protein coding, or non-coding, RNAs (ncRNAs). This discovery initially led to the hypothesis that most genomic DNA is “junk”, a term still championed by some geneticists and evolutionary biologists. In contrast, molecular biologists and biochemists studying the vast number of transcripts produced from most of this genome “junk” often surmise that these ncRNAs have biological significance. What gives? This essay contrasts the two opposing, extant viewpoints, aiming to explain their basis, which arise from distinct reference frames of the underlying scientific disciplines. Finally, it aims to reconcile these divergent mindsets in hopes of stimulating synergy between scientific fields.

Saturday, February 17, 2024

How to end the war in Ukraine according to a Canadian Conservative "diplomat"

In my opinion, the war in Ukraine is much more complicated than most people realize. We are constantly bombarded with propaganda from all sides and it inhibits rational thinking. One of the few reliable facts is that Vladimir Putin is a very smart bad person.

Lots of people think they have the answer to ending the war in Ukraine. One of the latest pundits is Chris Alexander who has published his thoughts in the Feb. 16, 2024 edition of Canada's Globe and Mail: Ukraine is paying the price for our nonchalance toward Russia’s leadership. Alexander spent years in Canada's Foreign Service, including many years in Moscow and a stint as Canada's ambassador to Afghanistan. In 2011 he was elected to Parliament as a Conservative MP and served as Minister of Citizenship and Immigration in Stephen Harper's cabinet. His reputation as a politician was very different than his previous, mostly admirable, reputation as a diplomat. Here's an excerpt from his Wikipedia article.

Wednesday, February 14, 2024

Copilot answers the question, "What is junk DNA?"

The Microsoft browser (Edge) has a built-in function called Copilot. It's an AI assistant based on GPT-4.

I decided to test it by asking "What is junk DNA?" and here's the answer it gave me.

Sunday, February 11, 2024

Older but wiser?

With age comes wisdom, but sometimes age comes alone.

Oscar Wilde

Like many baby boomers, I sometimes forget people's names and other important bits of information. Sometimes I can't find a word that's been in my vocabulary for decades. These lapses are often temporary but very annoying. It's a sign of age. (I am 77 years old.)

We often make fun of these incidents and console ourselves with the knowledge that we may be old but we are much wiser than we were in our younger days. We have years and years of experience behind us and over the years we've learned a thing or two that we never understood when we were listening to the Beatles on the radio. We've lived through the Cuban Missile crisis, the war in Viet Nam, the assassinations of two Kennedys and Martin Luther King, and a host of cultural changes. We've lived in several different countries and we've raised children. All of these experiences have made us wiser, or so we think.

Friday, February 09, 2024

Open and closed chromatin domains (and epigenetics)

Gene expression in eukaryotes is influenced by the state of chromatin. Tightly packed nucleosomes inhibit the binding of transcription factors and RNA polymerase so that genes in these regions are "repressed." From time to time these regions loosen up a bit allowing access to transcription complexes and subsequent transcription.

The tightly packed regions are known as closed domains and the accessible regions are open domains. Some authors add an intermediate domain called a permissive domain. This model of eukaryotic gene expression has been around for 50 years and the important mechanisms controlling the switch were worked out in the 1980s. I found a recent review that covers this issue in the context of epigenetics and the image below comes from that paper (Klemm et al., 2019).

Wednesday, February 07, 2024

Philip Ball's new book: "How Life Works"

Philip Ball has just published a new book "How Life Works." The subtitle is "A User’s Guide to the New Biology" and that should tell you all you need to know. This is going to be a book about how human genomics has changed everything.

Monday, January 29, 2024

"People also ask" about junk DNA

I'm interested in the spread of science misinformation on the internet. The misinformation about the human genome is a good example that illustrates the problem. There are many other examples but I happen to know a lot about this particular one.

Anyone trying to find out about junk DNA will find it impossible to get a correct answer by searching the internet. The correct answer is that the amount of junk DNA in the human genome is controversial: some scientists think that most of our genome is functional while others think that as much as 90% is junk. The scientific evidence strongly favors the junk side of the controversy and that's very well explained in the Wikipedia articles on Junk DNA and Non-coding DNA.

Wednesday, January 10, 2024

Benjamin Lewin's new book and his view of the human genome

I was a big fan of Benjamin Lewin. Back in the 1970s he published the first volumes of what was to become Genes, the authoritative textbook of molecular biology. I admired his ability to understand the latest experiments and put the results in the appropriate context.

Later on, when he founded the journal Cell, his editorials and other writings were always insightful. His editorial judgement was impeccable—he always published the very best papers in molecular biology.1

Saturday, January 06, 2024

Why do Intelligent Design Creationists lie about junk DNA?

A recent post on Evolution News (sic) promotes a new podcast: Casey Luskin on Junk DNA’s “Kuhnian Paradigm Shift”. You can listen to the podcast here but most Sandwalk readers won't bother because they've heard it all before. [see Paradigm shifting.]

Luskin repeats the now familiar refrain of claiming that scientists used to think that all non-coding DNA was junk. Then he goes on to list recent discoveries showing that some of this non-coding DNA is functional. The truth is that no knowledgeable scientist ever claimed that all non-coding DNA was junk. The original idea of junk DNA was based on evidence that only 10% of the genome is functional and these scientists knew that coding regions occupied only a few percent. Thus, right from the beginning, the experts on genome evolution knew about all sorts of functional non-coding DNA such as regulatory sequences, non-coding genes, and other things.

Saturday, December 16, 2023

Kat Arney interviews me on her podcast

I had a long chat with Kat Arney a few weeks ago and she has now taken the best parts of that conversation and put them in her latest Genetics Society podcast: Genes, junk and the 'dark genome'. My comments are in the last twelve minutes. At the end, Kat asks me "Is there like one thing you would really want a student or researcher, working in genetics today to really understand about the human genome?"

Kat was kind enough to write a blurb for my book last year where she said,

What's in Your Genome? is a thought-provoking and pugnacious book that will make you wonder afresh at the molecular intricacies of life. When it comes to our genomes, we humans are nothing special—Moran makes a convincing argument that the vast majority of our sloppy human genome is not mysterious genetic treasures but boring junk.

In this podcast, she combines my thoughts on the human genome with those of two people who don't agree with the idea that the human genome is full of junk. Here's a brief summary of their positions.

Naomi Allen is Chief Scientist at UK Biobank, a consortium that's sequencing the genomes of UK citizens. So far, they've published data on 500,000 genome sequences. I wrote about one of their more significant findings last year (August, 2022) where they reported on the fraction of the human genome that was under purifying selection. This is an excellent proxy for functional DNA and the results are in line with (my) expectations: less than 10% of the genome is conserved and most of it is in the non-coding fraction [Identifying functional DNA (and junk) by purifying selection].

It's too bad that Kat's interview with Naomi Allen doesn't mention that important result, especially since the podcast is about junk DNA. Here's how Naomi Allen begins her part of the interview.

Whole genome sequencing enables researchers to look at all of the genetic variation across the entire genome. So not just in the 2% of the genome that encodes for proteins, but all of the genetic variation, much of which was previously considered "junk DNA" precisely because we didn't know what it did.

This is disappointing for two important reasons. First, surely in 2023 we've gone beyond the tired myth that all of the information in the human genome was concentrated in coding DNA? Second, no knowledgeable scientist ever said that all non-coding DNA was junk DNA and the idea of junk DNA was not based on ignorance so surely it's time to stop repeating that myth as well.

The rest of that interview focuses on how mapping genetic variation could contribute to our understanding of health and disease. I would have loved to ask how UK Biobank proposes to do this if most of the variation is in junk DNA and also ask whether mutations in junk DNA can contribute to genetic disease. (They can.)

Danuta Jeziorska is the CEO of Nucleome Therapeutics, a company that's described as "spun out of Oxford University with a new set of technologies for exploring the dark genome." Kat asks her about the dark genome and here's her response.

So if you think about it, we have 22,000 genes in our genome, and we can compare that to having 22,000 ingredients in the fridge. We use the same set of ingredients to create different meals, just like how we have the same DNA within each cell, but then we have hundreds of different cell types. So this dark genome determines the combination of ingredients of the genes that you take and at which level you use them, to produce the different cell types that build our body. And you can just imagine that if you make a mistake in that - so let's say that you add the wrong ingredients in the wrong meal, you can mess up the meal. And in this same way you can mess up the cell type. So if you, for example, if you don't produce enough of haemoglobin to transport oxygen around the body, you will end up with a genetic form of anaemia or if you turn on a gene that's not supposed to be turned on, like an oncogene, you may end up having cancer.

So the dark genome is now very well understood as the mechanism that is causing diseases.

This is a slightly different definition of the dark genome than those I discussed in a recent post [What is the "dark matter of the genome"?]. In that post I suggested that most scientists were referring to all of the functions in non-coding DNA but Danuta Jeziorska seems to be restricting her use of "dark genome" to just regulatory sequences. In the rest of the interview she goes on to describe various types of regulatory sequences, with an emphasis on 3D structure, and to explain that many common genetic diseases are caused by mutations in regulatory sequences. Her company is using machine learning to find the functional elements in the dark genome and which variants are associated with disease. They are also investing in drug discovery.


What is the "dark matter of the genome"?

The phrase "dark matter of the genome" is used by scientists who are skeptical of junk DNA so they want to convey the impression that most of the genome consists of important DNA whose function is just waiting to be discovered. Not surprisingly, the term is often used by researchers who are looking for funding and investors to support their efforts to use the latest technology to discover this mysterious function that has eluded other scientists for over 50 years.

The term "dark matter" is often applied to the human genome but what does it mean? We get a clue from a BBC article published by David Cox last April: The mystery of the human genome's dark matter. He begins the article by saying,

Twenty years ago, an enormous scientific effort revealed that the human genome contains 20,000 protein-coding genes, but they account for just 2% of our DNA. The rest was written off as junk – but we are now realising it has a crucial role to play.

Friday, December 08, 2023

What really happened between Rosalind Franklin, James Watson, and Francis Crick?

That's part of the title of a podcast by Kat Arney, who interviews Matthew Cobb [Double helix double crossing? What really happened between Rosalind Franklin, James Watson and Francis Crick?].

Matthew Cobb is one of the world's leading experts on the history of molecular biology.

The way it’s usually told, Franklin was effectively ripped off and belittled by the Cambridge team, especially Watson, and has only recently been restored to her rightful place as one of the key discoverers of the double helix. It’s a dramatic narrative, with heroes, villains and a grand prize. But, as I found out when I sat down for a chat with Matthew Cobb, science author and Professor of Zoology at the University of Manchester, the real story is a lot more nuanced.

Photo 51 did not belong to Rosalind Franklin and it had (almost) nothing to do with solving the structure of DNA. Franklin and Wilkins would never have gotten the structure on their own. Crick and Watson did not "steal" any data. Whether they behaved ethically is debatable.


Sunday, November 26, 2023

ChatGPT gets two-thirds of science textbook questions wrong: time to bring it into the classroom!

The November 16th issue of Nature has an article about ChatGPT: ChatGPT has entered the classroom: how LLMs could transform education. It reports that the latest version (GPT-4) can answer only one third of questions correctly in physical chemistry, physics, and calculus. Nevertheless, the article promotes the idea that ChatGPT should be brought into the classroom!

An editorial in the same issue explains Why teachers should explore ChatGPT’s potential — despite the risks.

Many students now use AI chatbots to help with their assignments. Educators need to study how to include these tools in teaching and learning — and minimize pitfalls.

I don't get it. It seems to me that the problems with ChatGPT far outweigh the advantages and the best approach for now is to warn students that using AI tools may be terribly misleading and could lead to them failing a course if they trust the output. That doesn't mean that there's no potential for improvement in the future but this can only happen if the sources of information used by these tools were to become much more reliable. No improvements in the algorithms are going to help with that.