
Saturday, February 15, 2025

Junk DNA is gradually making its way into mainstream textbooks

The idea that most of the human genome is junk originated more than 50 years ago. Since then, evidence in support of this concept has steadily accumulated but it has been strongly resisted by most biochemists and molecular biologists. Opposition is even stronger among scientists in other fields and in the general public thanks to a steady stream of anti-junk articles in the popular press.

Much of this opposition to junk DNA stems from a massive publicity campaign launched by ENCODE researchers and the leading science journals back in 2012.

It's likely that most of the controversy over junk DNA is related to differing views on evolution and the power of natural selection. Most people think that natural selection is very powerful so that modern species must be extremely well-adapted to their present environment. They tend to believe that complexity is simply a reflection of sophisticated fine-tuning and this must apply to the human genome. According to this view, the presence of huge amounts of DNA with an unknown function is just a temporary situation and in the next few years most of this 'dark matter' will turn out to have a function. It has to have a function otherwise natural selection would have eliminated it.

Most experts in molecular evolution reject this view. They are quite comfortable with the concept of a messy genome full of junk DNA because that fits with their view of evolution—a view that was shaped by the development of neutral theory and nearly-neutral theory in the late 1960s and early 1970s. This view—the messy genome view—has not been covered in most biochemistry and molecular biology textbooks and this partially explains why recent generations of scientists are so committed to the sophisticated fine-tuning view of genomes.

I'm pleased to report that this may be changing. Many teachers and textbook writers are beginning to realize that the evidence for junk DNA and a messy genome is too overwhelming to ignore so they have to address it in their books. Hopefully, this will soon make its way into the classroom.

The latest edition of Molecular Biology of the Cell is a good example of this change. The authors of the 7th (2022) edition are: Bruce Alberts, Rebecca Heald, Alexander Johnson, David Morgan, Martin Raff, Keith Roberts, and Peter Walter. The various authors have always expressed support for the idea that only 10% of the human genome is functional but in earlier editions they avoided the words "junk DNA." They now realize that they need to be more forceful in supporting good genome science so this is what they say in the latest edition.

The power of this [sequence alignment] can be increased by including in such comparisons the genomes of large numbers of species whose genomes have been sequenced such as rat, chicken, fish, dog, and chimpanzee, as well as mouse and human. By revealing in this way the results of a very long natural "experiment," lasting for hundreds of millions of years, such comparative DNA sequencing studies have highlighted some of the most interesting regions in our genome. The comparisons reveal that about 4.5% of the human genome consists of multi-species conserved sequence. To our great surprise only about one fourth of these sequences code for proteins. Most of the remaining conserved sequences consist of DNA that is thought to contain clusters of protein-binding sites involved in gene regulation, while others produce RNA molecules that are not translated into protein but are important for other reasons.

When the DNA sequences of hundreds of thousands of individual humans are compared, an additional 5% of our genome shows a reduced variation in the human population, which implies that sequences in this 5% of the genome are also important. Taken together, these analyses suggest that only about 10% of the human genome contains nucleotide sequences that truly matter.

The important question of how much of the DNA sequence of the human genome is functionally relevant was briefly confused by a set of high-profile publications that appeared in 2012 from a large, federally funded US genome project named ENCODE. These publications, which reported the results of a massive survey using sensitive assays that can detect the presence of RNA molecules in cells at extremely low levels, reported that 76% of the total DNA sequence in human cells is transcribed to produce RNA molecules. Even though many of these transcripts were found at levels of less than a single RNA molecule per cell, the ENCODE scientists used such data to assert that most of human DNA is functional, with very little "junk." This claim received widespread publicity, along with their belief that our genome contains tens of thousands of previously undetected genes that produce RNA molecules that do not code for protein.

As previously stated, there is strong scientific consensus that most of the human genome consists of DNA whose nucleotide sequence is not relevant to biological function—being the so-called junk. This conclusion rests on the finding that natural selection fails to preserve the sequences in the face of the inevitable random changes to genomes that occur over time, as can be seen both when different species are compared and from detailed analyses of human variation. The fact that these DNA sequences nevertheless produce an occasional RNA molecule can be explained by the occurrence of background "noise" in gene expression. Although gene expression is very accurate, it is not perfect, and biochemical errors occasionally occur. Such errors are to be expected, and so long as they are kept at a low level, they are thought to have little or no consequence for the cell.

It's not perfect but it's a good start.


Anonymous said...

In an age of bad news this is a very welcome glimmer of hope.

Mark Sturtevant said...

Larry, I think that this entry into this essential book on cell biology had something to do with your persistent drum beating. Now that this has been written in this particular book, it probably will spread to others. [thumbs up]

Joe Felsenstein said...

A good first step. Larry can take a lot of the credit, and not just for writing his book.

Larry Moran said...

@Mark Sturtevant: Email me for more information.

Anonymous said...

I was a reviewer for that edition and among other things I recommended they consider the possibility that a lot of the annotated alternate transcripts are nonfunctional.

Anonymous said...

Whoops. I meant to say the latest edition of Lodish. Lodish isn’t as good on nonfunctional DNA etc. as Alberts.

Larry Moran said...

@Anonymous: I don't have a copy of the latest edition of "Molecular Cell Biology" by Lodish et al. I think it's the 9th edition (2021), right?

Could you please post what they have to say about junk DNA?

Lamarck said...

Hi Larry,

here you go, here it is:

“Well before the entire human genome was sequenced, it was apparent that only about 10 percent of human DNA consists of protein-coding genes, and for many years the remaining 90 percent was considered “junk DNA”! In recent years, we’ve learned that much of the so-called junk DNA is actually copied into thousands of RNA molecules that, though they do not encode proteins, serve equally important purposes in the cell (see Chapter 9). At present, however, we know the function of only a very few of these abundant noncoding RNAs.” [Lodish et al. (2021): Molecular Cell Biology (9th Ed.)]

“It was a surprise to many researchers when genomic sequencing revealed that large portions of the genomes of metazoans and plants do not encode mRNAs or any other RNAs required by the organism. Remarkably, about 98.5 percent of human chromosomal DNA is noncoding DNA! The noncoding DNA includes transcription regulatory sequences recognized and bound by proteins that regulate transcription of genes within tens to hundreds of kb away in the linear DNA sequence. However, the vast majority of noncoding DNA in multicellular organisms includes many regions that do not seem to be involved directly in gene control or DNA replication. A large fraction of noncoding DNA in the genomes of individual organisms (∼50 percent for humans) includes many regions that are similar but not identical in sequence to one another. There is enough variation within this repetitive DNA among individuals that every person can be distinguished by a unique DNA fingerprint based on these sequence variations. Moreover, the location in the genome of some repetitive DNA sequences varies among different individuals of the same species. At one time, all noncoding DNA was collectively termed junk DNA and was considered to serve no purpose. We now understand the evolutionary basis of this noncoding DNA and its variation in location among individuals. Cellular genomes harbor transposable (mobile) DNA elements that can copy themselves and move to new locations throughout the genome. Although most transposable DNA elements seem to have little function in the life cycle of an individual organism, over evolutionary time they have helped to shape our genomes and contributed to the rapid evolution of multicellular organisms.” [Lodish et al. (2021): Molecular Cell Biology (9th Ed.), p. 278]

Interesting prose style, maybe someone would like to make a song out of it.



Larry Moran said...

@Lamarck: Thanks. That's pretty disgusting IMHO. I think Lodish et al. need better reviewers! :-)

Mikkel Rumraket Rasmussen said...

"Isn't as good" is an amazing understatement. Given the quote supplied below by Lamarck it's clear Lodish et al. is an abysmally bad molecular biology textbook.

Mikkel Rumraket Rasmussen said...

Holy houndeye that is bad. It begins with the mistake of thinking the 10% functional DNA of the human genome is all protein coding (it's more like 2%), and then it's just downhill from an already low point.

Anonymous said...

If I recall correctly, I mentioned several compelling bits of evidence for non-functional DNA and I included 3 references, 2 of which I got from Larry on this site. I would bet the next edition will be substantially better.

Anonymous said...

Lynch and Marinov (2015) "The bioenergetic costs of a gene" was pretty convincing in my eyes. It shows us why the accumulation of junk in the genomes of many eukaryotes is unavoidable. Through the years I had often wondered what it would take to calculate the burden of junk DNA, and it was a pleasant surprise to read a paper where that question was answered so convincingly.

Donald Forsdyke said...

If only it were that easy.

Consider that bioenergetically unburdensome segment of DNA passing through the generations accompanied by many similar "junk" companions, each of which was, from time-to-time, unproductively transcribed into RNA. Then, one generation, a novel intracellular pathogen appears! It so happens that the RNA corresponding to one of the "junk" fellow-travellers happens to complement a significant part of the pathogen's RNA transcript.

As my ex-colleagues showed in Cambridge in the early 1970s (Tim Hunt, Mike Mathews, etc.), the resulting double-stranded segment would trigger alarms (e.g., blocking protein synthesis). Thus, that particular junk item had acquired a survival value. It is retained longer than its junk companions.

Do the sums on this. Over the generations a ceiling would be reached regarding an acceptable total amount of "junk." Thus, eventually all the junk becomes meaningful regarding natural selection. "Junk" no more!

Neil Taylor said...

Prof Moran, Dan Graur and many others have pointed out that the it-may-become-useful-in-the-future argument is very weak. The segment is far more likely to mutate into something that reduces fitness than into something that increases it.

Larry Moran said...

@Neil Taylor: Graur and those many other experts in molecular evolution are correct. It's rather silly to assume that a species might deliberately retain a substantial amount of junk DNA in their genomes on the off chance that a million years from now it might prove (very slightly) useful.

Donald Forsdyke said...

I do not know about you, Larry, but in one year, the resources of my genome are called upon multiple times to deal with infections from foreign pathogens. Your junk DNAs may just be sitting in the wings, but I suspect mine are happy to lend a hand on stage.