This is part 7 of my review of
The Myth of Junk DNA. For a list of other postings on this topic see the link to
Genomes & Junk DNA in the "theme box" below or in the sidebar under "Themes."
The title of Chapter 4 is
Introns and the Splicing Code. It opens with a brief description of eukaryotic genes and alternative splicing. Here's a better description of splicing for those who want a quick refresher:
RNA Splicing: Introns and Exons. Alternative splicing occurs when a transcript can be spliced in at least two different ways to produce two distinct mRNAs. Each of them will make a different, but related, protein. The process has been known for thirty years and the mechanism is well understood. It's described very well in a Wikipedia article:
Alternative Splicing.
Here's some important background information from
Junk in Your Genome: Protein-Encoding Genes.
The minimum size of a eukaryotic intron is less than 50 bp. For a typical mammalian intron, the essential sequences in the introns are: the 5′ splice site (~10 bp); the 3′ splice site (~30 bp); the branch site (~10 bp); and enough additional RNA to form a loop (~30 bp). This gives a total of 80 bp of essential sequence per intron or 20,500 × 7.2 × 80 = 11.8 Mb. Thus, 0.37% of the genome is essential because it contains sequences for processing RNA.
In other words, assuming that introns aren't all junk, we can estimate how much of the intron sequence is essential for its function by taking into account the known regulatory sequences and the amount needed to form a loop.
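The arithmetic behind that estimate can be checked with a short sketch. It assumes that 20,500 is the number of protein-encoding genes and 7.2 the average number of introns per gene, and that the haploid genome is roughly 3,200 Mb; the genome size is not stated here and is only the value implied by the 0.37% figure. The names below are illustrative.

```python
# Figures quoted in the estimate above.
GENES = 20_500            # assumed to be the number of protein-encoding genes
INTRONS_PER_GENE = 7.2    # assumed to be the average number of introns per gene
ESSENTIAL_BP = {
    "5' splice site": 10,
    "3' splice site": 30,
    "branch site": 10,
    "loop": 30,
}

# Assumption (not stated in the text): haploid genome of roughly 3,200 Mb.
GENOME_BP = 3_200_000_000


def essential_intron_bp(genes=GENES, introns_per_gene=INTRONS_PER_GENE):
    """Total essential intron sequence, in bp, across all genes."""
    per_intron = sum(ESSENTIAL_BP.values())
    return genes * introns_per_gene * per_intron


total_bp = essential_intron_bp()
percent_of_genome = 100 * total_bp / GENOME_BP

print(f"essential bp per intron: {sum(ESSENTIAL_BP.values())}")
print(f"total essential intron sequence: {total_bp / 1e6:.1f} Mb")
print(f"share of genome: {percent_of_genome:.2f}%")
```

Under those assumptions the sketch reproduces the 80 bp, 11.8 Mb, and 0.37% figures. Changing any per-intron value shows how little the final percentage moves.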
The rest of an intron sequence may be junk. If it is, then we would expect to see two things.
- Considerable variation in intron size from species to species.
- Frequent examples of transposons, endogenous retroviruses, and even other genes inserting into introns.
This is exactly what we see [
Junk in Your Genome: Intron Size and Distribution]. There's no indication that intron sequences are conserved or essential.
Jonathan Wells explains that alternative splicing is important in some genes. He is correct. He then explains that there are sequences in introns that regulate alternative splicing. He's correct about that as well. We've been writing this up in the textbooks and teaching it in introductory biochemistry courses since early in the 1980s. The classic example is the determination of sex in
Drosophila—it's largely controlled by alternative splicing and we know a great deal about which proteins bind to which sequences in the introns to promote or repress a given splice site [
Sex in the fruit fly Drosophila melanogaster].
Nothing new here. We know about binding sites and we know that most of them are 10 bp or less. Their presence makes no significant difference in our calculations of junk DNA. I get the distinct impression that Wells and the other IDiots don't really understand splicing and alternative splicing.
Here's a series of blog posts I did last year when Richard Sternberg tried to pretend that he knew something about molecular biology and alternative splicing. Later on, Jonathan Wells weighs in to try to help his friend but ends up showing that he, too, is in way over his head.
- Creationists, Introns, and Fairly Tales
- IDiots Do Arithmetic a Second Time - Same Result
- Jonathan Wells Weighs in on Alternative Splicing

Having "proven" that something like 0.03% of our genome may not be junk, Wells then goes on to describe other sequences that are found in introns. Some of these are regulatory sequences or enhancers. These aren't common, but they do exist. They're usually located in the 5′ intron and they are often associated with alternative transcription start sites. The total amount of non-junk DNA due to regulatory sequences has already been taken into account in my calculations (
Junk in Your Genome: Protein-Encoding Genes) and it doesn't matter whether these regulatory sequences are intergenic or included within an intron.
Theme: Genomes & Junk DNA

Wells also notes that many genes for small RNAs are located within introns. These include some of the genes for the splicing machinery, tRNA genes, snoRNA genes etc. He doesn't mention that introns are also loaded with
Alu sequences and other transposable elements (mostly defective). The presence of these insertions shows us that cells don't discriminate between intron sequences that make up 25% of the genome and the remaining 65% that's mostly junk. They are all targets for inserting small genes and transposons. No surprises here.
Finally, on the last page of Chapter 4, Wells devotes two paragraphs to a genuine scientific argument. The idea is that long introns might be necessary to delay transcription. This idea has been around for a long time. It was originally proposed over 25 years ago as an explanation for the long introns found in
Drosophila HOX genes, especially
Ubx.
If a gene has several long introns it can stretch out over 100 kb (100,000 bp). The typical RNA polymerase II elongation complex transcribes at a rate of 50 bp per second so it will take more than 30 minutes to transcribe these long genes. The idea is that the presence of long introns delays appearance of the regulatory proteins during development. This seems unlikely because there are many other, more efficient, ways of regulating gene expression. As a matter of fact, the argument can be easily turned upside down.
Genes that need to be transcribed quickly have very short introns or none at all. The heat shock inducible genes, for example, don't even have introns. These genes need to be expressed rapidly when a cell encounters stressful conditions. Their non-inducible homologues all have respectable introns so it looks like there has been selection for losing introns in these genes.
Similarly, there are often testis-specific genes that lack introns. The supposition is that these variant family members have lost introns so they can be quickly transcribed during spermatogenesis. The globin genes have relatively small introns and they are also expressed at a high rate in erythroblasts.
Genes that are infrequently transcribed tend to accumulate large introns. This includes most developmentally regulated transcription factors that only need to produce a small number of proteins at a specific time in the life of the organism. These observations are consistent with the idea that excess junk in intron sequences is removed when necessary. It's actually evidence that those sequences are junk.
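The timing argument rests on simple arithmetic, sketched here for anyone who wants to check it. The 100 kb gene length and the 50 bp per second elongation rate come from the text above. The 2 kb intronless gene is a hypothetical value chosen only for contrast, and the function name is illustrative. The sketch assumes a constant elongation rate and ignores initiation, pausing, and processing.

```python
def transcription_minutes(gene_length_bp, rate_bp_per_s=50):
    """Minutes for one polymerase to traverse a gene at a constant rate."""
    return gene_length_bp / rate_bp_per_s / 60


# Gene with several long introns, as described in the text (100 kb).
long_gene = transcription_minutes(100_000)

# Hypothetical intronless gene of 2 kb, chosen only for comparison.
short_gene = transcription_minutes(2_000)

print(f"100 kb gene: {long_gene:.1f} minutes")
print(f"2 kb gene:   {short_gene * 60:.0f} seconds")
```

At the stated rate the 100 kb gene takes a little over 33 minutes, consistent with the "more than 30 minutes" figure, while the hypothetical short gene is finished in well under a minute.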
So far we've covered the evidence of probable function in Chapter 3 and seen that Wells does not critically examine the data on pervasive transcription but simply assumes it is correct. He then makes the unsubstantiated claim that evidence of transcription is evidence of function. He's wrong about the claim that most of our genome is transcribed and he's wrong to assume that all transcripts are functional. Nothing in that chapter supported his claim that junk DNA is a myth.
In this chapter we see the first evidence for specific functions of noncoding DNA. The presence of regulatory sequences in introns has been well known for decades and it has no impact on the estimates of junk DNA. The idea that big introns might be adaptive regardless of sequence is possible but not reasonable. In fact, the evidence suggests strongly that big introns full of junk DNA can be detrimental in some cases. Nothing in Chapter 4 provides convincing evidence that junk DNA is a myth.
What about pseudogenes? Are they a myth? That's covered in Chapter 5.
A note about references

The IDiots are promoting this book by bragging about multiple references that challenge the concept of junk DNA [
Jonathan Wells offers over 600 references to recent peer-reviewed literature]. Chapters 1 and 2 were introductions to the problem. They had a total of 51 references. Chapter 3 had 62 references but, as we have seen, they don't add up to a convincing case. There were plenty of references that should have been included if a scientific case was going to be made. Chapter 4 has 63 references but only three of them address a substantive argument against junk DNA in introns. All three make the same point; namely that long introns delay transcription.
That's a total of 176 references so far with nothing much to show for them. There are 432 references in the rest of the book. There are 26 references to known IDiots including 8 references to the work of Jonathan Wells.