Theme: Genomes & Junk DNA
Many reputable scientists are convinced that most of our genome is junk. However, there are still a few holdouts and one of the most prominent is John Mattick. He believes that most of our genome is made up of thousands of genes for regulatory noncoding RNA. These RNAs (about 100 of them for every single protein-coding gene) are mostly involved in subtle controls of the levels of protein in human cells. (I'm not making this up. See: John Mattick on the Importance of Non-coding RNA)
It was a reasonable hypothesis at one point in time.
How do you evaluate a hypothesis in science? Well, one of the things you should always try to do is falsify your hypothesis. Let's see how that works ...
- The RNAs should be conserved. FALSE
- The RNAs should be abundant (>1 copy per cell). FALSE
- There should be dozens of well-studied specific examples. FALSE
- The hypothesis should account for variations in genome size. FALSE
- The hypothesis should be consistent with other data, such as that on genetic load. FALSE
- The hypothesis should be consistent with what we already know about the regulation of gene expression. FALSE
- You should be able to refute existing hypotheses, such as transcription errors. FALSE
Normally, you would abandon a hypothesis that had such a bad track record but true believers aren't about to do that. So what's next? Maybe these regulatory RNAs don't show sequence conservation but maybe their secondary structures are conserved. In other words, these RNAs originated as functional RNAs with a secondary structure but over the course of time all traces of sequence conservation have been lost and only the "conserved" secondary structure remains.[1]
The Mattick lab looked at the "conservation" of secondary structure as an indicator of function using the latest algorithms (Smith et al., 2013). Here's how they describe their attempts to prove their hypothesis in light of conflicting data ...
The majority of the human genome is dynamically transcribed into RNA, most of which does not code for proteins (1–4). The once common presumption that most non–protein-coding sequences are nonfunctional for the organism is being adjusted to the increasing evidence that noncoding RNAs (ncRNAs) represent a previously unappreciated layer of gene expression essential for the epigenetic regulation of differentiation and development (5–8). Yet despite an exponential accumulation of transcriptomic data and the recent dissemination of genome-wide data from the ENCODE consortium (9), limited functional data have fuelled discourse on the amount of functionally pertinent genomic sequence in higher eukaryotes (1, 10–12). What is incontrovertible, however, is that evolutionary conservation of structural components over an adequate evolutionary distance is a direct property of purifying (negative) selection and, consequently, a sufficient indicator of biological function. The majority of studies investigating the prevalence of purifying selection in mammalian genomes are predicated on measuring nucleotide substitution rates, which are then rated against a statistical threshold trained from a set of genomic loci arguably qualified as neutrally evolving (13, 14).
Conversely, lack of conservation does not impute lack of function, as variation underlies natural selection. Given that the molecular function of ncRNA may at least be partially conveyed through secondary or tertiary structures, mining evolutionary data for evidence of such features promises to increase the resolution of functional genomic annotations.
Here's what they found ...
When applied to consistency-based multiple genome alignments of 35 mammals, our approach confidently identifies >4 million evolutionarily constrained RNA structures using a conservative sensitivity threshold that entails historically low false discovery rates for such analyses (5–22%). These predictions comprise 13.6% of the human genome, 88% of which fall outside any known sequence-constrained element, suggesting that a large proportion of the mammalian genome is functional.
Apparently 13.6% of the human genome is a "large proportion." Taken at face value, however, the Mattick lab has now shown that the vast majority of transcribed sequences don't show any of the characteristics of functional RNA, including conservation of secondary structure. Of course, that's not the conclusion they emphasize in their paper. Why not?
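To put those numbers side by side, here is a minimal Python sketch of the arithmetic. It uses only the two percentages quoted from the paper (13.6% and 88%). The function name and the dictionary keys are my own, chosen for illustration, and nothing here reproduces the paper's actual analysis.

```python
def partition_genome(predicted_pct: float, novel_share_pct: float) -> dict:
    """Split the genome (as percentages) into three bins.

    predicted_pct   -- percent of the genome covered by predicted conserved
                       RNA structures (13.6 in the quoted abstract)
    novel_share_pct -- percent of those predictions that fall outside known
                       sequence-constrained elements (88 in the quoted abstract)
    """
    if not (0.0 <= predicted_pct <= 100.0 and 0.0 <= novel_share_pct <= 100.0):
        raise ValueError("percentages must lie between 0 and 100")

    novel = predicted_pct * novel_share_pct / 100.0   # structure only, no sequence constraint
    overlapping = predicted_pct - novel               # also inside known constrained elements
    no_structure = 100.0 - predicted_pct              # no predicted conserved structure at all

    return {
        "structure_outside_known_elements": novel,
        "structure_inside_known_elements": overlapping,
        "no_predicted_structure": no_structure,
    }


if __name__ == "__main__":
    for label, pct in partition_genome(13.6, 88.0).items():
        print(f"{label}: {pct:.2f}% of the genome")
```

Taking the paper's own figures at face value, roughly 12% of the genome is newly predicted to carry conserved structure, and 86.4% of the genome has no predicted conserved structure at all.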
1. I can't imagine how this would happen, can you? You'd almost have to have selection AGAINST sequence conservation.
Smith, M.A., Gesell, T., Stadler, P.F. and Mattick, J.S. (2013) Widespread purifying selection on RNA structure in mammals. Nucleic Acids Research advance access July 11, 2013 [doi: 10.1093/nar/gkt596]
How many of you recognize this place? Note that the line-up is not (quite) out the door. How cool is that!?
Last week's molecule was the fatty acid synthase complex [Monday's Molecule #207]. The winner was Matt McFarlane. He should contact me by email to collect his winnings.
Today's molecule was pictured on a US stamp issued in April 2008. Can you identify the molecule? ... Be precise, there's only one correct answer and it may not be the one you think.
Email your answers to me at: Monday's Molecule #207. I'll hold off posting your answers for 24 hours. The first one with the correct answer wins. I will only post mostly correct answers to avoid embarrassment. The winner will be treated to a free lunch.
There could be two winners. If the first correct answer isn't from an undergraduate student then I'll select a second winner from those undergraduates who post the correct answer. You will need to identify yourself as an undergraduate in order to win. (Put "undergraduate" at the bottom of your email message.)
Atheists do not believe in any gods. An atheist does not claim to have proof that gods don't exist, although they do claim that most of the evidence for god(s) is wrong.
I really like the way Hemant Mehta explains this on his blog Friendly Atheist.
The House of Commons in Canada has passed a bill declaring that Canada will celebrate Pope John Paul II Day on April 2nd every year [An Act to establish Pope John Paul II Day]. It was a private member's bill introduced by Conservative Wladyslaw Lizon (Mississauga East—Cooksville). Lizon is Polish, which partly explains his admiration for the former Pope. Read what Veronica Abbass has to say about this: Tommy Douglas versus Karol Wojtyła.
Meanwhile, in Ontario, a similar bill was passed by the provincial legislature [see Blindsided on Canadian Atheist].
It's tempting to dismiss both these bills as trivial. After all, nobody really expects either government to make a big fuss about it next April 2nd. It's also tempting to make excuses by recognizing that few MPs or MPPs could risk speaking out against them.
I don't think we should settle for that. The facts are revolting. Canada has set aside a special day for a foreign despot whose religious and moral views are despised by a large number of Canadians, and rejected by most Catholics. How in the world could that happen in the 21st century?
Here are five things you should know if you want to engage in a legitimate scientific discussion about the amount of junk DNA in a genome.
- Genetic Load
Every newborn human baby has about 100 mutations not found in either parent. If most of our genome contained functional sequence information, then this would be an intolerable genetic load. Only a small percentage of our genome can contain important sequence information, which strongly suggests that most of our genome is junk.
- C-Value Paradox
A comparison of genomes from closely related species shows that genome size can vary by a factor of ten or more. The only reasonable explanation is that most of the DNA in the larger genomes is junk.
- Modern Evolutionary Theory
Nothing in biology makes sense except in the light of population genetics. The modern understanding of evolution is perfectly consistent with the presence of large amounts of junk DNA in a genome.
- Pseudogenes and broken genes are junk
More than half of our genome consists of pseudogenes, including broken transposons and bits and pieces of transposons. A few may have secondarily acquired a function but, to a first approximation, broken genes are junk.
- Most of the genome is not conserved
Most of the DNA sequence in large genomes is not conserved. These sequences diverge at a rate consistent with fixation of neutral alleles by random genetic drift. This strongly suggests that most of this DNA does not have a function, although one can't rule out some unknown function that doesn't depend on sequence.
If you want to argue against junk DNA then you need to refute or rationalize all five of these observations.
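The genetic load point is easy to see with a little arithmetic. Here is a minimal Python sketch, assuming (my simplification, not a claim from the literature) that new mutations land uniformly at random across the genome. The figure of 100 new mutations per newborn comes from the text above; the functional fractions tried below are illustrative, and the function name is hypothetical.

```python
NEW_MUTATIONS_PER_BIRTH = 100  # figure quoted in the Genetic Load item above


def expected_functional_hits(new_mutations: float, functional_fraction: float) -> float:
    """Expected number of new mutations that land in functional sequence.

    Assumes mutations are spread uniformly at random over the genome, so the
    expected count is simply (total mutations) x (functional fraction).
    """
    if not 0.0 <= functional_fraction <= 1.0:
        raise ValueError("functional_fraction must be between 0 and 1")
    return new_mutations * functional_fraction


if __name__ == "__main__":
    # Illustrative fractions only: 10% matches the "~90% junk" view,
    # the larger values stand in for "most of the genome is functional".
    for fraction in (0.10, 0.50, 0.80):
        hits = expected_functional_hits(NEW_MUTATIONS_PER_BIRTH, fraction)
        print(f"{fraction:.0%} functional -> about {hits:.0f} new mutations "
              f"in functional sequence per newborn")
```

The sketch deliberately stops at counting hits. How many of those hits would actually be harmful is a separate question that it does not try to answer, but the larger the functional fraction, the more new mutations every generation has to be purged by selection.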
The debate over the amount of junk in our genome is a genuine scientific debate. There are legitimate scientific points of view on both sides although the weight of evidence and logic is tilting heavily in favor of junk DNA. It looks more and more like most (~90%) of our genome is junk.
The problem with the debate is that the scientific literature is full of papers attacking junk DNA while there are very few papers promoting it. This is partly because there haven't been any new discoveries in favor of junk DNA. On the other hand, there have been quite a few discoveries showing that some small part of the genome that was thought to be junk might have a function. Even though these discoveries make an insignificant contribution to the big picture, they are often blown up out of all proportion and promoted as an end to junk DNA.
A recent paper in PLoS Genetics illustrates the problem.
Last week's molecule was N-acetylmuramic acid (MurNAc), one of the components of the polysaccharide in bacterial cell walls [Monday's Molecule #206]. The winner was Michael Florea. He should contact me by email to collect his winnings.
Today's (Tuesday's) molecule is a new addition to biochemistry textbooks because its structure was only solved a few years ago. There are plenty of hints in the figure. You have to identify the molecule AND each of the seven activities that are labelled. Bonus points for the PDB identification number and the species.
Email your answers to me at: Monday's Molecule #207. I'll hold off posting your answers for 24 hours. The first one with the correct answer wins. I will only post mostly correct answers to avoid embarrassment. The winner will be treated to a free lunch.
There could be two winners. If the first correct answer isn't from an undergraduate student then I'll select a second winner from those undergraduates who post the correct answer. You will need to identify yourself as an undergraduate in order to win. (Put "undergraduate" at the bottom of your email message.)
It's Ray Comfort and he's going to destroy Richard Dawkins. Bet you can hardly wait to see this movie. PZ Myers is in it. Read what he has to say at: Lie harder, little man.
I am one of those scientists who think that the discipline of "philosophy of science" is catering to some pretty stupid philosophers. Dan Graur found one of them. His name is Max Andrews and he's a graduate student in philosophy at the University of Edinburgh, Scotland ["I've Got a Little List" & "Let the Punishment Fit the Crime"].
You can read Max Andrews' blog posting at: Junk DNA Isn’t Junk. Be careful, you might find it very difficult to see the connection between this philosophy student's view of biology and anything you might recognize as real science.
It goes without saying that Max Andrews gets the Central Dogma wrong—many scientists make the same mistake. But here's a taste of what else he gets wrong.
The argument from junk DNA suggests that a designer would be maximally efficient in his use of information. There appears to be some information that does not execute or have any meaningful coding. Darwinism takes this issue and uses it as the result of the prediction that there would be left over information not being used due to natural selection and random mutation. However, it doesn’t appear that all junk DNA is actually junk.
Genome organization is patterned to be maximally informative. The overlapping codes observed are known to be evolutionarily costly, because random mutations will likely have a deteriorating effect, not an instructing role. So the complex specified information entailed by any genomic region is orders of magnitude higher than previously suspected by, say, Dembski. Any seemingly random aspect of chromosome sequence arrangement is not. A case in point involves endogenous retroviruses (ERV’s). This implies that the taxonomically-specific formatting, indexing, punctuation, etc., of genomes were precisely written. Morphogenetic information is not reducible to the genotype—though it is strongly dependent upon it. Therefore, changes in DNA do not equal changes in the information that structures the body plan.
I wonder who his supervisor is? Maybe Dan or I could be external reviewer on his Ph.D. oral?
Dan Graur is fed up with journalists who don't know the difference between the "genetic code" and the sequence of a genome. He's not alone. But, unlike the rest of us, Dan has a solution. It may be a little difficult to enforce ...
See: An Artistic Inspiration for Putting an End to the Misuse of the Term “Genetic Code”.
Quite a few people think that there's going to be a serious debate about junk DNA at the SMBE meeting in Chicago next week. One of the sessions has a provocative title, "WHERE DID 'JUNK' GO?", but if you look at actual session titles it doesn't look like there's going to be much of a debate.
It's true that the session organizer, Wojciech Makalowski, advertised the session as a discussion about junk DNA ....
Nick Matzke is going to the SMBE (Society for Molecular Biology and Evolution) meeting in Chicago next week. He's created a T-shirt for supporters of junk DNA [KEEP CALM and ASK ABOUT ONIONS].
John Mattick is a Professor and research scientist at the Garvan Institute of Medical Research at the University of New South Wales (Australia). He received an award from the Human Genome Organization for ....
The Award Reviewing Committee commented that Professor Mattick’s “work on long non-coding RNA has dramatically changed our concept of 95% of our genome”, and that he has been a “true visionary in his field; he has demonstrated an extraordinary degree of perseverance and ingenuity in gradually proving his hypothesis over the course of 18 years.”