
Showing posts with label Junk RNA. Show all posts

Tuesday, December 31, 2019

Are introns mostly junk?

There are many reasons for thinking that introns are mostly junk DNA.
  1. The size and sequence of introns in related species are not conserved and almost all of the sequences are evolving at the rate expected for neutral substitutions and fixation by drift.
  2. Many species have lost introns or reduced their lengths drastically, suggesting that the presence of large introns can be detrimental in some cases (probably in large populations).
  3. After decades of searching, there are very few cases where introns and/or parts of introns have been shown to be essential.
  4. Researchers routinely construct intronless versions of eukaryotic genes and they function normally when re-inserted into the genome.
  5. Intron sequences are often littered with transposon and viral sequences that have inserted into the intron and this is not consistent with the idea that intron sequences are important.
  6. About 98% of the introns in modern yeast (Saccharomyces cerevisiae) have been eliminated during evolution from a common ancestor that probably had about 18,000 introns [Yeast loses its introns]. This suggests that there was no selective pressure to retain those introns over the past 100 million years.
  7. About 245/295 of the remaining introns in yeast have been artificially removed by researchers who are constructing an artificial yeast genome, suggesting that over 80% of the introns that survived evolutionary loss are also junk [Yeast loses its introns].
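
The percentages in points 6 and 7 can be checked with a short sketch. The three counts are the ones quoted in the list above; the variable names are only illustrative.

```python
# Figures quoted in points 6 and 7 above.
ancestral_introns = 18_000   # estimated introns in the common ancestor
remaining_introns = 295      # introns left in modern S. cerevisiae
removed_in_lab = 245         # introns removed while building the artificial genome

# Fraction of the ancestral introns lost during evolution.
fraction_lost = 1 - remaining_introns / ancestral_introns

# Fraction of the surviving introns that have been removed artificially.
fraction_removed = removed_in_lab / remaining_introns

print(f"Lost during evolution: {fraction_lost:.1%}")
print(f"Surviving introns removed in the lab: {fraction_removed:.1%}")
```

With these inputs the first figure comes out a little above 98% and the second a little above 83%, consistent with "about 98%" and "over 80%".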

Friday, December 13, 2019

The "standard" view of junk DNA is completely wrong

I was browsing the table of contents of the latest issue of Cell and I came across this ....
For decades, the miniscule protein-coding portion of the genome was the primary focus of medical research. The sequencing of the human genome showed that only ∼2% of our genes ultimately code for proteins, and many in the scientific community believed that the remaining 98% was simply non-functional “junk” (Mattick and Makunin, 2006; Slack, 2006). However, the ENCODE project revealed that the non-protein coding portion of the genome is copied into thousands of RNA molecules (Djebali et al., 2012; Gerstein et al., 2012) that not only regulate fundamental biological processes such as growth, development, and organ function, but also appear to play a critical role in the whole spectrum of human disease, notably cancer (for recent reviews, see Adams et al., 2017; Deveson et al., 2017; Rupaimoole and Slack, 2017).

Slack, F.J. and Chinnaiyan, A.M. (2019) The Role of Non-coding RNAs in Oncology. Cell 179:1033-1055 [doi: 10.1016/j.cell.2019.10.017]
Cell is a high-impact, refereed journal so we can safely assume that this paper was reviewed by reputable scientists. This means that the view expressed in the paragraph above did not raise any alarm bells when the paper was reviewed. The authors clearly believe that what they are saying is true and so do many other reputable scientists. This seems to be the "standard" view of junk DNA among scientists who do not understand the facts or the debate surrounding junk DNA and pervasive transcription.

Here are some of the obvious errors in the statement.
  1. The sequencing of the human genome did NOT show that only ~2% of our genome consisted of coding region. That fact was known almost 50 years ago and the human genome sequence merely confirmed it.
  2. No knowledgeable scientist ever thought that the remaining 98% of the genome was junk—not in 1970 and not in any of the past fifty years.
  3. The ENCODE project revealed that much of our genome is transcribed at some time or another but it is almost certainly true that the vast majority of these low-abundance, non-conserved, transcripts are junk RNA produced by accidental transcription.
  4. The existence of noncoding RNAs such as ribosomal RNA and tRNA was known in the 1960s, long before ENCODE. The existence of snoRNAs, snRNAs, regulatory RNAs, and various catalytic RNAs was known in the 1980s, long before ENCODE. Other RNAs such as miRNAs, piRNAs, and siRNAs were well known in the 1990s, long before ENCODE.
How did this false view of our genome become so widespread? It's partially because of the now highly discredited ENCODE publicity campaign orchestrated by Nature and Science but that doesn't explain everything. The truth is out there in peer-reviewed scientific publications but scientists aren't reading those papers. They don't even realize that their standard view has been seriously challenged. Why?


Monday, April 01, 2019

The frequency of splicing errors reflects the balance between selection and drift

Splice variants are very common in eukaryotes. We know that it's possible to detect dozens of different splice variants for each gene with multiple introns. In the past, these variants were thought to be examples of differential regulation by alternative splicing but we now know that most of them are due to splicing errors. Most of the variants have been removed from the sequence databases but many remain and they are annotated as examples of alternative splicing, which implies that they have a biological function.

I have blogged about splice variants many times, noting that alternative splicing is a very real phenomenon but it's probably restricted to just a small percentage of genes. Most of the splice variants that remain in the databases are probably due to splicing errors. They are junk RNA [The persistent myth of alternative splicing].

The ongoing controversy over the origin of splice variants is beginning to attract attention in the scientific literature although it's fair to say that most scientists are still unaware of the controversy. They continue to believe that abundant alternative splicing is a real phenomenon and they don't realize that the data is more compatible with abundant splicing errors.

Some molecular evolution labs have become interested in the controversy and have devised tests of the two possibilities. I draw your attention to a paper that was published 18 months ago.

Friday, March 29, 2019

Are multiple transcription start sites functional or mistakes?

If you look in the various databases you'll see that most human genes have multiple transcription start sites. The evidence for the existence of these variants is solid—they exist—but it's not clear whether the minor start sites are truly functional or whether they are just due to mistakes in transcription initiation. They are included in the databases because annotators are unable to distinguish between these possibilities.

Let's look at the entry for the human triosephosphate isomerase gene (TPI1; Gene ID 7167).


The correct mRNA is NM_0003655, third from the top. (Trust me on this!). The three other variants have different transcription start sites: two of them are upstream and one is downstream of the major site. Are these variants functional or are they simply transcription initiation errors? This is the same problem that we dealt with when we looked at splice variants. In that case I concluded that most splice variants are due to splicing errors and true alternative splicing is rare.

Monday, February 04, 2019

What is the dominant view of junk DNA?

I think that about 90% of our genome is junk and I know lots of other scientists who feel the same way. I'm pretty sure that this view is not shared by the majority of scientists but I don't know whether they are convinced that most of our genome is functional or whether they just think the question is unanswerable at the present time. I suspect that the latter view is more common but I'd like to hear your opinion.

Sunday, January 27, 2019

Yeast loses its introns

Baker’s yeast (Saccharomyces cerevisiae) is one of the best studied eukaryotes. Its genome is just slightly larger than the largest bacterial genome and it was the first eukaryotic genome to be sequenced (Mewes et al., 1997). It has about 7,000 genes in total and 6,604 of these genes are protein-coding genes but only 280 of these genes contain introns.1 The rest have lost their introns over the course of several hundred million years of evolution (Hooks et al., 2014).

We know that introns have been lost in yeast because the genes of related species have lots of introns. The common ancestor of all fungi undoubtedly had genes with multiple introns because the available evidence indicates that introns invaded eukaryotic genes very early in the evolution of eukaryotes. The fact that most introns have been purged from the yeast genome suggests that introns are not essential for gene function. In other words, introns are mostly junk.2

Tuesday, October 16, 2018

John Mattick's latest attack on junk DNA

John Mattick is the most prominent defender of the idea that the human genome is full of functional sequences. In fact, he is just about the only scientist of any prominence who's on that side of the debate. His main "evidence" is the fact that genomes are pervasively transcribed and that most of the transcripts are functional. Let's look at his latest review paper to see how well this argument stands up to close scrutiny (Mattick, 2018).1

As you read this post, keep in mind that in 2012 John Mattick was awarded a prize by the Human Genome Organization for proving his hypothesis [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research].
The Award Reviewing Committee commented that Professor Mattick’s “work on long non-coding RNA has dramatically changed our concept of 95% of our genome”, and that he has been a “true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.”
Mattick follows his usual format by giving us his version of history. He has argued for the past 15 years that the scientific community has been reluctant to accept the evidence of massive amounts of regulatory RNA genes because it conflicts with the standard paradigm of the supremacy of proteins. In the past he has claimed that this paradigm is based on the Central Dogma which states, according to him, that the only real function of DNA is to make proteins [How Much Junk in the Human Genome?]. As we shall see, he hasn't abandoned that argument but at least he no longer refers to the Central Dogma for support.

Saturday, October 13, 2018

The great junk DNA debate


I've been talking to philosophers lately about the true state of the junk DNA controversy. I imagine what it would be like to stage a great debate on the topic. It's easy to come up with names for the pro-junk side: Dan Graur, Ford Doolittle, Sean Eddy, Ryan Gregory, etc. It's hard to think of any experts who could defend the idea that most of our genome is functional. The only scientist I can think of who would accept such a challenge is John Mattick but let's imagine that he could find three others to join him in the great debate.

I claim that the debate would be a rout in favor of the pro-junk side. The data and the theories are all on the side of those who would argue that 90% of our genome is junk. I don't think the functionalists could possibly defend the idea that most of our genome is functional. What do you think?

Assuming that I'm right, why is it that the average scientist doesn't know this? Why do they still believe there's a good case for function when none of the arguments stand up to close scrutiny? And why are philosophers not conveying the true state of the controversy to their readers? I'm told that anti-junk philosophers like Evelyn Fox Keller are held in high regard even though her arguments are easy to refute [When philosophers talk about genomes]. I'm told that John Mattick is highly respected in philosophy circles even though knowledgeable scientists have little use for his writings.

Can readers help me identify papers by philosophers of science that come down on the side of junk DNA and conclude that experts like Graur, Doolittle, etc are almost certainly correct?


Image Credit: The cartoon is by Tom Gauld and it was published online at The New York Times Magazine website. I hope they will consider it fair use on an educational blog. See: Junk DNA comments in the New York Times Magazine.

Tuesday, October 09, 2018

Alternative splicing and the gene concept

I just learned about a workshop scheduled for the end of this month. The topic is: Evolutionary Roles of Transposable Elements and Non-coding DNA: The Science and the Philosophy.

I'd love to attend but it's just a small workshop designed to encourage dialogue between scientists and philosophers who are interested in the topic. Here's a list of the speakers ...
  • Ryan Gregory: Junk DNA, genome size, and the onion test.
  • Stefan Linquist: Four decades debating junk DNA and the Phenotype Paradigm is (somehow) alive and well.
  • Chris Ponting: 92.9% of the human genome evolved neutrally.
  • Paul Griffiths: Both adaptation and adaptivity are relevant to diagnosing function.
  • Ford Doolittle: Selfish genes and selfish DNA: is there a difference?
  • Justin Garson: Biological functions, the liberality problem, and transposable elements.
  • Joyce Havstad: Evolutionary Thinking about Critique of Function Talk.
  • Guillaume Bourque: Impact of transposable elements on human gene regulatory networks.
  • Ulrich Stegmann: On parity, genetic causation and coding.
  • Steven Downes: Understanding non-coding variants as disease risk alleles.
  • Alexander Palazzo: How nuclear retention and cytoplasmic export of RNAs reduces the deleteriousness of junk DNA.
  • David Haig: Pax somatica
  • Cedric Feschotte: Transposable elements as catalysts of genome evolution.

Wednesday, February 28, 2018

Junk DNA and selfish DNA

Selfish DNA is a term that became popular with the publication of a series of papers in Nature in 1980. The authors were referring to viruses and transposons that insert themselves into a genome where they exist solely for the purposes of propagating themselves. These selfish DNA sequences are often thought, incorrectly, to be the same as the Selfish Genes of Richard Dawkins1 [Selfish genes and transposons]. In fact, "selfish genes" refers to the idea that some DNAs enhance fitness and the frequency of these genes will increase in a population through their effect on the vehicle that carries them. It's an adaptationist view of evolution. The selfish DNA of transposons and viruses is quite different. These sequences only propagate themselves—the fitness of the organism is largely irrelevant. These elements do not contribute directly to the adaptive evolution of the species.

Transposons and integrated viruses are subjected to mutation just like the rest of the genome. Deleterious mutations cannot be purged by natural selection because inactivating a transposon has no effect on the fitness of the organism.2 As a result, large genomes are littered with defective transposons and bits and pieces of dead transposons. This is not selfish DNA by any definition. It is junk DNA [What's in Your Genome?].

Tuesday, February 06, 2018

How many lncRNAs are functional?

There's solid evidence that 90% of your genome is junk. Most of it is transcribed at some time but the transcripts are transient and usually confined to the nucleus. They are junk RNA [Functional RNAs?]. This is the view held by many experts but you wouldn't know that from reading the scientific literature and the popular press. The opposition to junk DNA gets much more attention in both venues.

There are prominent voices expressing the view that most of the genome is devoted to producing functional RNAs required for regulating gene expression [John Mattick still claims that most lncRNAs are functional]. Most of these RNAs are long noncoding RNAs known as lncRNAs. Although most of them fail all reasonable criteria for function there are still those who maintain that tens of thousands of them are functional [How many lncRNAs are functional: can sequence comparisons tell us the answer?].

Wednesday, August 30, 2017

Experts meet to discuss non-coding RNAs - fail to answer the important question

The human genome is pervasively transcribed. More than 80% of the genome is complementary to transcripts that have been detected in some tissue or cell type. The important question is whether most of these transcripts have a biological function. How many genes are there that produce functional non-coding RNA?

There's a reason why this question is important. It's because we have every reason to believe that spurious transcription is common in large genomes like ours. Spurious, or accidental, transcription occurs when the transcription initiation complex binds nonspecifically to sites in the genome that are not real promoters. Spurious transcription also occurs when the initiation complex (RNA polymerase plus factors) fires in the wrong direction from real promoters. Binding and inappropriate transcription are aided by the binding of transcription factors to nonpromoter regions of the genome, a well-known feature of all DNA binding proteins [see Are most transcription factor binding sites functional?].

Tuesday, June 27, 2017

Debating alternative splicing (Part IV)

In Debating alternative splicing (Part III) I discussed a review published in the February 2017 issue of Trends in Biochemical Sciences. The review examined the data on detecting predicted protein isoforms and concluded that there was little evidence they existed.

My colleague at the University of Toronto, Ben Blencowe, is a forceful proponent of massive alternative splicing. He responded in a letter published in the June 2017 issue of Trends in Biochemical Sciences (Blencowe, 2017). It's worth looking at his letter in order to understand the position of alternative splicing proponents. He begins by saying,
It is estimated that approximately 95% of multiexonic human genes give rise to transcripts containing more than 100 000 distinct AS events [3,4]. The majority of these AS events display tissue-dependent variation and 10–30% are subject to pronounced cell, tissue, or condition-specific regulation [4].

Monday, June 26, 2017

Debating alternative splicing (Part III)

Proponents of massive alternative splicing argue that most human genes produce many different protein isoforms. According to these scientists, this means that humans can make about 100,000 different proteins from only ~20,000 protein-coding genes. They tend to believe humans are considerably more complex than other animals even though we have about the same number of genes. They think alternative splicing accounts for this complexity [see The Deflated Ego Problem].

Opponents (I am one) argue that most splice variants are due to splicing errors and most of those predicted protein isoforms don't exist. (We also argue that the differences between humans and other animals can be adequately explained by differential regulation of 20,000 protein-coding genes.) The controversy can only be resolved when proponents of massive alternative splicing provide evidence to support their claim that there are 100,000 functional proteins.

Thursday, June 22, 2017

Jonathan Wells talks about junk DNA

Watch this video. It dates from this year. Almost everything Wells says is either false or misleading. Why? Is he incapable of learning about genomes, junk DNA, and evolutionary theory?



Wednesday, June 21, 2017

John Mattick still claims that most lncRNAs are functional

Most of the human genome is transcribed at some time or another in some tissue or another. The phenomenon is now known as pervasive transcription. Scientists have known about it for almost half a century.

At first the phenomenon seemed really puzzling since it was known that coding regions accounted for less than 1% of the genome and genetic load arguments suggested that only a small percentage of the genome could be functional. It was also known that more than half the genome consists of repetitive sequences that we now know are bits and pieces of defective transposons. It seemed unlikely back then that transcripts of defective transposons could be functional.

Part of the problem was solved with the discovery of RNA processing, especially splicing. It soon became apparent (by the early 1980s) that a typical protein coding gene was stretched out over 37,000 bp of which only 1300 bp were coding region. The rest was introns and intron sequences appeared to be mostly junk.
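
To put those two numbers in proportion, here is a minimal sketch that uses only the figures quoted above (37,000 bp per typical gene, 1,300 bp of coding region). The variable names are illustrative.

```python
# Figures quoted in the paragraph above.
gene_length_bp = 37_000   # typical protein-coding gene, including introns
coding_bp = 1_300         # coding region within that gene

coding_fraction = coding_bp / gene_length_bp
noncoding_fraction = 1 - coding_fraction

print(f"Coding region: {coding_fraction:.1%} of the gene")
print(f"Everything else (mostly introns): {noncoding_fraction:.1%} of the gene")
```

On these figures only about 3.5% of a typical gene is coding sequence, which is why the primary transcript is so much larger than the mature mRNA.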

Monday, February 13, 2017

Dan Graur explains junk DNA

If you want to be a serious participant in the debate over junk DNA then you should watch this video. Dan Graur presents the standard arguments for junk DNA—most of which have been around for decades. He also destroys the main arguments against junk DNA. You are entitled to choose sides in this debate but you are not entitled to pose as an authority unless you know the best arguments from BOTH sides. It is not sufficient to just quote evidence for function as support for your bias. You must also refute the evidence for junk. You have to show why it is wrong or misleading.





Hat Tip: PZ Myers

Sunday, February 12, 2017

ENCODE workshop discusses function in 2015

A reader directed me to a 2015 ENCODE workshop with online videos of all the presentations [From Genome Function to Biomedical Insight: ENCODE and Beyond]. The workshop was sponsored by the National Human Genome Research Institute in Bethesda, Md (USA). The purpose of the workshop was ...

  1. Discuss the scientific questions and opportunities for better understanding genome function and applying that knowledge to basic biological questions and disease studies through large-scale genomics studies.
  2. Consider options for future NHGRI projects that would address these questions and opportunities.
The main controversy concerning the human genome is how much of it is junk DNA with no function. Since the purpose of ENCODE is to understand genome function, I expected a lively discussion about how to distinguish between functional elements and spurious nonfunctional elements.

Thursday, January 19, 2017

The pervasive transcription controversy: 2002

I'm working on a chapter about pervasive transcription and how it relates to the junk DNA debate. I found a short review in Nature from 2002 so I decided to see how much progress we've made in the past 15 years.

Most of our genome is transcribed at some time or another in some tissue. That's a fact we've known about since the late 1960s (King and Jukes, 1969). We didn't know it back then, but it turns out that a lot of that transcription is introns. In fact, the observation of abundant transcription led to the discovery of introns. We have about 20,000 protein-coding genes and the average gene is 37.2 kb in length. Thus, the total amount of the genome devoted to these genes is about 23%. That's the amount that's transcribed to produce primary transcripts and mRNA. There are about 5000 noncoding genes that contribute another 2% so genes occupy about 25% of our genome.
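
The 23% and 25% figures follow from simple arithmetic, sketched below. The gene count, average gene length, and 2% contribution from noncoding genes come from the paragraph above. The total genome size of 3.2 Gb is an assumption added here for illustration, since the post does not state the value it used.

```python
# Figures from the paragraph above.
protein_coding_genes = 20_000
average_gene_length_bp = 37_200      # 37.2 kb
noncoding_gene_fraction = 0.02       # "another 2%" from ~5000 noncoding genes

# ASSUMPTION: haploid human genome size, not stated in the post.
genome_size_bp = 3.2e9

# Total sequence occupied by protein-coding genes, introns included.
genic_bp = protein_coding_genes * average_gene_length_bp
genic_fraction = genic_bp / genome_size_bp

# Add the noncoding genes to get the share of the genome occupied by genes.
total_gene_fraction = genic_fraction + noncoding_gene_fraction

print(f"Protein-coding genes: {genic_bp / 1e6:.0f} Mb, {genic_fraction:.0%} of the genome")
print(f"All genes: {total_gene_fraction:.0%} of the genome")
```

Under that assumed genome size the protein-coding genes cover 744 Mb, or roughly 23% of the genome, and all genes together about 25%. A slightly different genome size would shift the result by a percentage point or so.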

Friday, December 09, 2016

Using conservation to determine whether splice variants are functional

We've been having a discussion about function and how to recognize it. This is important when it comes to determining how much junk is in our genome [see Restarting the function wars (The Function Wars Part V)]. There doesn't seem to be any consensus on how to define "function" although there's general agreement on using sequence conservation as a first step. If some sequence under investigation is conserved in other species then that's a good sign that it's under negative selection and has a biological function. What if it's not conserved? Does that rule out function? The correct answer is "no" because one can always come up with explanations/excuses for such an observation. We discussed the example of de novo genes, which, by definition, are not conserved.

Let's look at another example: splice variants. Splice variants are different forms of RNA produced from the same gene. If they are biologically relevant then they will produce different forms of the protein (for protein-coding genes). This is an example of alternative splicing if, and only if, relevance has been proven.