The main issue in this field concerns the number of non-coding genes in the human genome. I cover the available data in my book and conclude that there are fewer than 1000 (p.214). Those scientists who promote the importance of RNA (e.g. Tom Cech) would like you to believe that there are many more non-coding genes; indeed, most of those scientists believe that there are more non-coding genes than coding genes (i.e. > 20,000). They rarely present evidence for such a claim beyond noting that much of our genome is transcribed.
Tom Cech is wise enough to avoid publishing an estimate of the number of non-coding genes but his bias is evident in the following paragraph from near the end of his article.
"Although most scientists now agree on RNA's bright promise, we are still only beginning to unlock its potential. Consider, for instance, that some 75 percent of the human genome consists of dark matter that is copied into RNAs of unknown function. While some researchers have dismissed this dark matter as junk or noise, I expect it will be the source of even more exciting breakthroughs."
Let's dissect this to see where the bias lies. The first thing you note is the use of the term "dark matter" to make it sound like there's a lot of mysterious DNA in our genome. This is not true. We know a heck of a lot about our genome, including the fact that it's full of junk DNA. Only 10% of the genome is under purifying selection and assumed to be functional. The rest is full of introns, pseudogenes, and various classes of repetitive sequences made up mostly of degraded transposons and viruses. The entire genome has been sequenced—there's not much mystery there. I don't know why anyone refers to this as "dark matter" unless they have a hidden agenda.
The second thing you notice is the statement that 75% of the genome is transcribed at some time or another and, according to Tom Cech, these transcripts have an unknown function. That's strange since protein-coding genes take up roughly 40% of our genome and we know a great deal about coding DNA, UTRs, and introns. If you add in the known examples of non-coding genes, this accounts for an additional 2-3% of the genome. [1]
Almost all the rest of the transcripts come from non-conserved DNA and those transcripts are present at less than one copy per cell. As the ENCODE researchers noted in 2014, they are likely to be junk RNA resulting from spurious transcription. I'd say we know a great deal about the fraction of the genome that's transcribed and there's not much indication that it's hiding a plethora of undiscovered functional RNAs.
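To put rough numbers on that, here is a minimal Python sketch of the arithmetic, using only the percentages quoted above. The function and variable names are made up for illustration, and the sketch assumes that the 40% and 2-3% figures both fall entirely within the 75% of the genome that is transcribed.

```python
# Back-of-the-envelope budget for the transcribed fraction of the genome.
# All values are percentages of the whole genome, taken from the text above.
TRANSCRIBED = 75            # fraction said to be copied into RNA
PROTEIN_CODING_GENES = 40   # protein-coding genes (assumed to include UTRs and introns)
KNOWN_NONCODING = (2, 3)    # known non-coding genes, low and high figure


def unexplained_transcription(transcribed, coding, noncoding_range):
    """Return (low, high) percent of the genome that is transcribed
    but not accounted for by known genes."""
    low_nc, high_nc = noncoding_range
    # More known non-coding genes means less unexplained transcription.
    return (transcribed - coding - high_nc, transcribed - coding - low_nc)


low, high = unexplained_transcription(TRANSCRIBED, PROTEIN_CODING_GENES, KNOWN_NONCODING)
print(f"Transcribed but not in known genes: {low}-{high}% of the genome")
```

On these figures, roughly a third of the genome is transcribed without belonging to a known gene, and that is the fraction the argument about junk RNA is really about.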
Photo credit: University of Colorado, Boulder.
1. In my book I make a generous estimate of 5,000 non-coding genes in order to avoid quibbling over a smaller number and in order to demonstrate that even with such an obvious over-estimate the genome is still 90% junk.
9 comments :
I'm currently reading "Why We Die" by Venki Ramakrishnan, and he also shows this bias: "Only about 2 percent of our DNA actually codes for the proteins that carry out much of life’s functions. The rest consists of what biologists once dismissed as “junk DNA”; they now increasingly think it is important, but don’t fully understand how or why."
This is quite disappointing in an otherwise quite well-written book.
It is a bizarre situation. A great majority of molecular biologists and genomicists think that junk DNA is just a mistake. So it's inevitable that popular science will often reflect their views. How to get across that people working on molecular evolution and population genetics are almost 100% opposed to this view? But they are the ones who actually understand the strength of processes putting junk into the genome and the weakness of the natural selection that removes it.
@Joe Felsenstein
Yeah and I don't know what to really do about it. Most pop-sci writers are just hopelessly out of their depth here. Science-writing in the popular press has too many bad incentives driving it.
First of all the writers lack actual understanding of the subject matter, so to them at best it just looks like a debate between experts that are equally qualified to talk about it. They don't know who are actual authorities on the specific questions. And they rarely have the actual time to dig into the material and try to understand it themselves.
But probably worse is all the dramatization, storytelling, and sensationalization that drives it all because they need to write grandiose headlines to grab attention.
So we get endless stories about grand paradigm shifts, textbooks having to be rewritten, ancient mysteries being solved, genomic "dark matter" being illuminated, the oppressive upholders of orthodoxy from the past lacked vision and imagination, finally the new generation is overturning decades-old and outdated ideas bla bla bla. The junk-DNA controversy has so many ways a writer can choose a narrative or perspective on the subject.
@Joe Felsenstein
I think one way to get to the heart of this problem is to get these researchers to read outside of the confines of the mammalian (particularly human) genome and to consider the rest of life. Why are the largest known genomes found in single-celled dinoflagellates? Why does a paramecium have a 400n genome? etc. If you only consider one type of organism, your biases reflect that.
@Lorax
They will not read anything on the topic, because they have ignored, and still ignore, every relevant article that criticized ENCODE, and even criticism of their very own published work. E.g. John Mattick.
I am afraid the only things left are PR measures rather than scientific publications. An open letter with as many signatures as possible may help, but I guess it would take several hundred signatures and I am not sure whom to address. Maybe the institutions that finance science, which don’t want to be humiliated for spending money on delusion, rather than the journals, which have earned tons of money by publishing articles that deny junk DNA.
Professor Cech relates in the NYT that he began his work "In the early 1980s when ... most of the promise of RNA was still unimagined." His research group discovered "that the RNA could cut and join biochemical bonds all by itself — the sort of activity that had been thought to be the sole purview of protein enzymes."
When introducing this, he had proclaimed that "the first half of the 20th century was dominated by breakthroughs in physics." This prompted one commentator (Arthur Lundquist, New York) to write: "I don't know about dominated, but as far as the effect on human lives goes, I suspect that the creators of antibiotics might have something to say about that."
To this I added: "Yes, penicillin was discovered at St. Mary's Hospital Medical School, in the UK." Indeed, "in the early 1960s Asher Korner's Cambridge laboratory that was studying protein synthesis recruited a Mary's graduate [myself] who added cellular immunology" to the laboratory portfolio. By the early 1970s I had departed to Canada, but in the interim the laboratory had moved to immunological aspects of protein synthesis.
This led to the discovery of the role of double-stranded RNA (dsRNA) in detecting foreign viruses and hence activating a protein kinase (PKR). This would prevent viral replication by inhibiting host protein synthesis and activating various immune defences (interferon). Any RNA, be it encoding or "junk," could be preserved by natural selection if it happened to have a ssRNA sequence that complemented the ssRNA of a pathogen, so that dsRNA would be generated.
Thus, a decade before the Cech laboratory began its great work, the "promise of RNA" was far from being "unimagined."
The software somehow declared me as "anonymous". I am Donald Forsdyke, Queen's University, Canada.
Grrr, I may have to end up reading that bly book ;-)