Today I did a routine search for "junk DNA" "2025" to see if misinformation is still dominating the web. It is, but that's not the most surprising thing I discovered. Here's what Google AI told me at the top of the search page.
In 2025, "junk DNA" is no longer considered junk, as new studies show it plays vital roles in gene regulation and development. Research from 2025 indicates that these sequences, many of which come from ancient viruses, can act as "genetic switches" that influence how genes are turned on or off and how cells respond to their environment. This has led to potential breakthroughs in regenerative medicine and cancer treatment by providing new therapeutic targets.
This video explains how what was once considered junk DNA has been found to contain thousands of new genes:
The video is by Robert Carter, who has a Ph.D. in molecular biology. His site is called Biblical Genetics. He also posts on creation.com.
Carter sounds like he knows what he's talking about but he's just parroting all the misinformation that permeates the scientific literature. The main message of this video is that scientists were shocked to discover that the human genome only had 20,000 protein coding genes but we now know (no, we don't) that each gene makes many different proteins and that accounts for the "missing" complexity that all the experts had expected.1
We also "know" (no, we don't) that scientists have discovered tens of thousands of new protein-coding genes that make small proteins. He references a Science article by Elizabeth Pennisi, who has been spreading misinformation about the human genome for more than 25 years.
It's not surprising that Robert Carter wants to discredit the idea of junk DNA. What's surprising is that Google AI is directing readers to a creationist video.
1. The knowledgeable experts predicted that the human genome would have fewer than 30,000 genes and that's exactly what was found when the human genome sequence was published.
41 comments :
Not that it would be any better to cite one of Nature's or ENCODE's absurd propaganda videos in this respect. They contain the same stupid misinformation.
It's not surprising at all that AI references a creationist video. Unfortunate, yes, but not surprising.
The answers of the Google search AI are based on the first page or so of ordinary search results. The AI (Gemini) is more skeptical when you ask it something directly.
(shudders)
Are they not confusing the "no junk" fallacy with the "alternative splicing" fallacy? If we hypothetically found lots of functional isoforms of known proteins that wouldn't reduce the amount of junk DNA by one bit.
Sorry, I seem to default to "anonymous" now; that's new.
Pervasive functional translation of non canonical human open reading frames
Jin Chen https://orcid.org/0000-0002-6634-4397, Andreas-David Brunner
It's time to pay more attention to non-canonical ORFs, which have been discovered to have important roles in immunity and development. They're more than noisy transcripts.
@Anonymous: Please describe the evidence that a substantial number of these ORFs encode a biologically functional protein/peptide. Some of that evidence might include showing that they are conserved or that they are subject to purifying selection.
Please estimate the total amount of functional DNA that these ORFs would add to the genome if every single one of them were functional. I'm guessing that it's about 0.1% or less.
The paper I cited investigated just a hundred or so non-canonical ORFs. By the way, I assume your guess about the proportion of junk ones is based purely on sequence conservation. I'm afraid that detecting functionally important sequence through sequence conservation does not have a good reputation, since many non-conserved regions of genes have been found to be quite essential and vice versa.
Latest, most direct test of the neutral theory soundly disproves it again.
José Aguilar-Rodríguez
@jaguilarrod
One of the most exciting projects of my career, years in the making. Using high-throughput precision genome editing, we mapped the fitness effects of thousands of natural variants—challenging the idea that common variants are inconsequential.
https://www.biorxiv.org/content/10.1101/2024.10.30.621178v2 Massively parallel interrogation of the fitness of natural variants in ancient signaling pathways reveals pervasive local adaptation
Note that these are amino acid changes in proteins. Nothing to do with junk DNA. Also note that they found most variants to be neutral. Direct quote from the abstract: "We address this question by mapping the fitness effects of over 9,000 natural variants in the Ras/PKA and TOR/Sch9 pathways—key regulators of cell proliferation in eukaryotes—across four conditions in Saccharomyces cerevisiae. While most variants are neutral in our assay, ∼3,500 exhibited significant fitness effects."
The Kimura neutral theory of 1968 was inspired by protein-sequence-alignment-based genetic distances among species (DNA sequence came much later), such as the genetic equidistance phenomenon (GEP). The GEP was surprising and inspired the molecular clock hypothesis, which directly inspired the neutral theory. NT considers most observed variants or amino acid substitutions as neutral. When applied in population genetics, NT treats most coding and non-coding DNA variants as neutral. So, NT considers both coding and non-coding variants as neutral, regardless of whether between species or within species.
The study identified a substantial proportion of the examined genetic variants as non-neutral, contrary to predictions from the neutral theory (NT). Why not all or most variants? This is likely attributable to the limited scope of their functional assays, which evaluated only four specific conditions. In nature, organisms face far more diverse and stringent challenges—environmental stresses, pathogens, metabolic demands, and ecological interactions—that extend well beyond these four criteria. Consequently, more comprehensive and exhaustive testing would be expected to reveal functionality in the majority of variants.
@Anonymous says, "The paper I cited investigated just a hundred or so non-canonical ORFs."
The paper identified some transcripts that are bound to ribosomes. They looked at the ones producing a detectable peptide. They provided some evidence that these may be functional. The total number is 230 uORFs and 91 from lncRNAs.
"By the way, I assume your guess about the proportion of junk ones is based purely on sequence conservation."
The average size of these micropeptides is less than 100 aa, so even if we assume they are all derived from real genes (not junk), this only amounts to a tiny fraction of functional DNA (<0.1%). I'm anxiously awaiting confirmation of the data implying function.
"I'm afraid that detecting functionally important sequence through sequence conservation does not have a good reputation, since many non-conserved regions of genes have been found to be quite essential and vice versa."
I'm aware of spacer DNA sequences that show conservation of size but not nucleotide sequence. What percentage of the genome do you think is functional but shows no evidence of conservation or purifying selection? The most recent data on the amount under purifying selection is about 10% and that's similar to the amount that is conserved.
If you are going to challenge the concept of 90% junk then you better have a good case for arguing that a substantial amount of intron sequences and transposon debris actually has a function in spite of the fact that it is evolving neutrally.
I look forward to hearing you make that case.
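The "<0.1%" bound in the reply above can be checked with a back-of-envelope calculation. The sketch below is only illustrative: it takes the 230 uORFs and 91 lncRNA ORFs mentioned in the reply, treats 100 aa as an upper bound on peptide length, and assumes a round haploid genome size of about 3.1 billion base pairs (that figure is an assumption added here, not something stated in the thread).

```python
# Back-of-envelope estimate of how much of the genome these ORFs could occupy.
# GENOME_BP is an assumed round figure for the haploid human genome.
GENOME_BP = 3_100_000_000
N_UORFS = 230            # uORFs with a detectable peptide (from the reply above)
N_LNCRNA_ORFS = 91       # ORFs from lncRNAs (from the reply above)
MAX_PEPTIDE_AA = 100     # upper bound on micropeptide length
BP_PER_CODON = 3


def coding_fraction(n_orfs: int, max_aa: int, genome_bp: int) -> float:
    """Upper bound on the fraction of the genome taken up by n_orfs coding regions."""
    coding_bp = n_orfs * max_aa * BP_PER_CODON
    return coding_bp / genome_bp


def orfs_needed(target_fraction: float, max_aa: int, genome_bp: int) -> float:
    """Number of ORFs of length max_aa needed to reach target_fraction of the genome."""
    return target_fraction * genome_bp / (max_aa * BP_PER_CODON)


total_orfs = N_UORFS + N_LNCRNA_ORFS
fraction = coding_fraction(total_orfs, MAX_PEPTIDE_AA, GENOME_BP)
needed = orfs_needed(0.001, MAX_PEPTIDE_AA, GENOME_BP)

print(f"{total_orfs} ORFs cover at most {fraction:.4%} of the genome")
print(f"about {needed:,.0f} such ORFs would be needed to reach 0.1%")
```

Under these assumptions the 321 ORFs account for roughly 0.003% of the genome, and it would take on the order of ten thousand such ORFs, every one of them functional, to reach 0.1%.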
I confidently assume that our genome is entirely functional because of the ever-increasing data in research articles discovering new functions for transposable elements, introns, pseudogenes, and even repetitive satellites. (No matter how many articles I cite, you always keep saying that these are only the minority ones.)
The study identified a substantial proportion of the examined genetic variants as non-neutral, contrary to predictions from the neutral theory (NT).
Note that you have confused the prediction of the neutral theory (most variation neutral) with a strawman (all variation neutral), and have gone from the previous universe of study (coding and non-coding) to a highly restricted one (coding, non-silent). Note that even in the study you cite, 2/3 of the variants are evolving neutrally.
The study concludes “What is surprising and inconsistent with nearly neutral theory is that many very large-effect polymorphisms are present at high frequencies in the population. These results provide strong support in favor of the balance theory of natural variation.” https://www.biorxiv.org/content/10.1101/2024.10.30.621178v2
The so-called balance theory of natural variation is in fact encompassed by the Maximum Genetic Diversity (MGD) theory, which posits that genetic diversity (or distance) is maintained at an equilibrium corresponding to a maximum or optimal level. In principle, genetic diversity must either have such a maximum equilibrium or not. Empirical evidence that refutes the latter view — as assumed by the neutral theory — necessarily supports the former, thereby vindicating the MGD framework.
Approximately 44% (1,612/3,629) of non-synonymous variants exhibit significant non-neutral effects, compared to ~33% (1,134/3,473) of synonymous variants and ~37% (866/2,349) of noncoding variants. These findings reveal only modest quantitative differences in non-neutrality across variant types, rather than the qualitative distinctions predicted by the neutral theory, which assumes most variants are neutral and unaffected by selection, especially for the syn and noncoding variants.
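The percentages quoted above can be reproduced from the counts given. A minimal Python sketch, taking the counts in this comment at face value (they have not been re-checked against the preprint here), also gives the pooled share across all three variant classes:

```python
# Counts as quoted in the comment above: (variants with significant effects, variants tested).
counts = {
    "non-synonymous": (1612, 3629),
    "synonymous": (1134, 3473),
    "noncoding": (866, 2349),
}

# Share of variants in each class that showed a significant fitness effect.
shares = {name: sig / total for name, (sig, total) in counts.items()}

# Pooled share across all classes.
total_sig = sum(sig for sig, _ in counts.values())
total_tested = sum(total for _, total in counts.values())
overall = total_sig / total_tested

for name, share in shares.items():
    print(f"{name}: {share:.1%} significant")
print(f"all variants: {total_sig}/{total_tested} = {overall:.1%} significant")
```

By these counts, about 38% of all assayed variants show a significant effect, so the remainder, a majority, do not.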
In short, most of the so-called evidence for the neutral theory is based on unproven or unrealistic assumptions, and none of it includes empirical test data. All empirical tests have invalidated it and vindicated its only possible rival.
Gnomon, why do you spam Larry's blog with articles you either haven't read or didn't properly understand, and why do you just ignore rebuttals by John Harshman and others? Start engaging honestly with your critics instead of ignoring them and spamming more.
I understand the issues our host has with Wikipedia; I am curious about his opinion of the Grokipedia article on junk DNA:
https://grokipedia.com/page/Junk_DNA
Well, the first falsehood is in the first sentence. Not off to a good start. "Junk DNA (non-functional DNA) is a DNA sequence that has no known biological function."
And the references used to support that sentence are good publications, but they don't support the claim made in that sentence. Perhaps having AI write your encyclopedia isn't a great idea.
Perhaps Grokipedia will do better.
The Grokipedia article on junk DNA is a copy of the Wikipedia article. I think the Wikipedia article is pretty good but that's because I wrote most of it.
I cannot imagine how Grokipedia could be any better than Wikipedia on science articles. A quick scan of "edits" on the "Junk DNA," "Alternative Splicing," "Non-coding DNA," and "Central Dogma" articles shows the same old complaints that we've been dealing with on Wikipedia for the past 20 years.
We continue to improve Wikipedia articles. I don't know how Grokipedia is going to deal with that. If it just copies the Wikipedia article then it's of no use for scientific information.
Wait, that first line, with the bad definition of junk DNA, comes straight from the Wikipedia page? You should definitely edit that one.
What is the best AI for understanding junk DNA? Or should we be sceptical of them all?
The best "AI" for understanding junk DNA is the one that displays Larry's junk DNA Wikipedia page.
Scientists Said 98% of Your DNA Was Useless (They Were Wrong)
https://www.youtube.com/watch?v=YjEG6gQnmK4
@gert korthof: That video is full of lies and misconceptions. Unfortunately, it's very difficult to counter that much garbage without getting into details about function and actual evidence.
The most egregious lie is the implication that we knew nothing about functional regulatory sequences back in 1972 when Ohno predicted that 9% of non-coding DNA would be functional and included regulatory sequences as part of those functional elements.
@John Harshman: Please explain why you disagree with the definition of junk DNA on the Wikipedia page.
Could we consider the junk DNA to be like a workshop where new genes and changes to genes are developed?
Hi Doug!
AI:
Yes—there are solid references showing that gene copies can move into non‑coding regions, mutate there, and sometimes evolve into new or replacement genes. This is well‑documented in the literature on gene duplication, retrotransposition, and de novo gene birth.
📚 Key References
Gene duplication and divergence Biology LibreTexts – “Gene Duplications and Divergence” Explains how duplicated genes can diverge in function, with one copy free to accumulate mutations while the other maintains the original role. 👉 Read here
Processed pseudogenes via retrotransposition Genome Biology & Evolution – “Processed pseudogenes: from junk DNA to regulatory elements” Shows how mRNA copies of genes can be reverse‑transcribed and inserted into non‑coding regions, where they sometimes acquire new functions. 👉 Article link
Segmental duplications in complex regions Oxford Academic – “GenomeDecoder: inferring segmental duplications in highly repetitive genomic regions” Discusses how duplications often land in non‑coding or repetitive DNA, creating raw material for new genes. 👉 Read here
De novo gene birth from non‑coding DNA Quanta Magazine – “Where Do New Genes Come From?” Reviews evidence that entirely new genes can emerge from previously non‑coding regions, showing the genome’s workshop‑like creativity. 👉 Read here
Globin gene family as a case study Nature Reviews Genetics – “Evolution of the globin gene family” Documents how duplications, pseudogenes, and replacements shaped the hemoglobin family, with old copies degrading while new ones took over. 👉 Summary link
🧬 Why These References Matter
Mechanism → They show how gene copies can move into “safe” non‑coding regions.
Sandbox effect → Once there, mutations accumulate without harming the organism.
Replacement → If the modified copy gains a useful function, it can replace the original, which may then degrade into a pseudogene.
The definition is "a DNA sequence that has no known biological function". But there are plenty of conserved sequences whose function is unknown yet are not junk. The proper definition would refer not to "no known biological function" but to "no biological function". Of course whether a sequence is actually junk would always be provisional, as with everything in science. The default assumption would be that all non-conserved sequences are junk.
It seems logical that some of the DNA is the workshop (workspace) where changes can be made safely before they are moved into the coding section.
It would be logical, if evolution worked by teleology. The real evolutionary process, not so much.
@Light
No, you should just consider it junk DNA. An evolutionary remnant that can still undergo mutations and in which new genes can evolve (and be retained by selection if they are adaptive) and also disappear again if they are not adaptive and further mutations accumulate.
@Light
That doesn't make sense. Mutations ("changes") are not moved from junk DNA into coding regions. Mutations occur regardless of whether the DNA is junk or not, in all regions of the genome. It's just that in nonfunctional (junk) regions they are less often selected against.
Besides, there's no difference between a "coding region" and any other part of the genome except that a protein-coding gene is there. If a new gene evolves in a non-coding region, presto, it's now a coding region.
Thanks Rasmussen for the clarification.
So material that was considered to be junk is now seen to be functional. Is that right?
Not to my knowledge. Some small fraction of material that belongs to a few classes of sequences expected to be junk (e.g. ERVs, lncRNAs, etc.) has been found to be functional. That would come as no surprise, since any random sequence can gain a function, though it's rare in any particular case.
Light, a section of DNA that previously wasn't functional is mutated in some way, and the result is a functional sequence.
This isn't about static sequences but either neutrally mutating ones or ones under positive selection.