There are several ways to report a mutation rate. You can state it as the number of mutations per base pair per year, in which case a typical mutation rate for humans is about 5 × 10⁻¹⁰. Or you can express it as the number of mutations per base pair per generation (~1.5 × 10⁻⁸).
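Those two figures are mutually consistent. A minimal sketch of the conversion, assuming the round numbers used in this post (a diploid genome of ~6.4 × 10⁹ bp and a ~30-year human generation time):

```python
# Sketch: converting between the three ways of expressing a mutation rate.
# All inputs are the approximate values quoted in the post.
RATE_PER_BP_PER_GEN = 1.5e-8  # mutations per bp per generation
GENOME_BP = 6.4e9             # diploid human genome (assumption)
GENERATION_YEARS = 30         # assumed average generation time

per_generation = RATE_PER_BP_PER_GEN * GENOME_BP          # ~100 mutations
per_bp_per_year = RATE_PER_BP_PER_GEN / GENERATION_YEARS  # ~5e-10

print(round(per_generation))   # roughly 100 mutations per generation
print(per_bp_per_year)         # roughly 5e-10 per bp per year
```

The product comes out near 100 mutations per generation, matching the figure quoted below.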
You can use the number of mutations per generation or per year if you are only discussing one species. In humans, for example, you can describe the mutation rate as 100 mutations per generation and just assume that everyone knows the number of base pairs (6.4 × 10⁹).

Friday, November 17, 2017
Wednesday, November 08, 2017
How much mitochondrial DNA in your genome?
Most mitochondrial genes have been transferred from the ancestral mitochondrial genome to the nuclear genome over the course of 1-2 billion years of evolution. They are no longer present in mitochondria but they are easily recognized because they resemble α-proteobacterial sequences more than the other nuclear genes [see Endosymbiotic Theory].
This process of incorporating mitochondrial DNA into the nuclear genome continues to this day. The latest human reference genome has about 600 examples of nuclear sequences of mitochondrial origin (= numts). Some of them are quite recent while others date back almost 70 million years—the limit of resolution for junk DNA [see Mitochondria are invading your genome!].

Tuesday, November 07, 2017
Lateral gene transfer in eukaryotes - where's the evidence?
Lateral gene transfer (LGT), or horizontal gene transfer (HGT), is widespread in bacteria. It leads to the creation of pangenomes for many bacterial species where different subpopulations contain different subsets of genes that have been incorporated from other species. It also leads to confusing phylogenetic trees such that the history of bacterial evolution looks more like a web of life than a tree [The Web of Life].
Bacterial-like genes are also found in eukaryotes. Many of them are related to genes found in the ancestors of modern mitochondria and chloroplasts and their presence is easily explained by transfer from the organelle to the nucleus. Eukaryotic genomes also contain examples of transposons that have been acquired from bacteria. That's also easy to understand because we know how transposons jump between species.

Contaminated genome sequences
The authors of the original draft of the human genome sequence claimed that hundreds of genes had been acquired from bacteria by lateral gene transfer (LGT) (Lander et al., 2001). This claim was abandoned when the "finished" sequence was published a few years later (International Human Genome Consortium, 2004) because others had shown that the data was easily explained by differential gene loss in other lineages or by bacterial contamination in the draft sequence (see Salzberg, 2017).
Thursday, November 02, 2017
Parental age and the human mutation rate
Mutation
-definition
-mutation types
-mutation rates
-phylogeny
-controversies
Mutations are mostly due to errors in DNA replication. We have a pretty good idea of the accuracy of DNA replication—the overall error rate is about 10⁻¹⁰ per bp. There are about 30 cell divisions in females between zygote and formation of all egg cells. In males, there are about 400 mitotic cell divisions between zygote and formation of sperm cells (Ohno, 2019). Using these average values, we can calculate the number of mutations per generation. It works out to about 130 mutations per generation [Estimating the Human Mutation Rate: Biochemical Method].
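The arithmetic behind the biochemical estimate can be sketched as follows, using the rough values given above (error rate, germ-line division counts, and a haploid genome of ~3.2 × 10⁹ bp, since each parent transmits one haploid genome):

```python
# Biochemical estimate: replication errors accumulated over germ-line divisions.
# All numbers are the approximate values assumed in the post.
ERROR_RATE = 1e-10     # errors per bp per cell division
HAPLOID_BP = 3.2e9     # one transmitted haploid genome (assumption)
FEMALE_DIVISIONS = 30
MALE_DIVISIONS = 400

maternal = ERROR_RATE * HAPLOID_BP * FEMALE_DIVISIONS  # ~10 mutations
paternal = ERROR_RATE * HAPLOID_BP * MALE_DIVISIONS    # ~128 mutations
total = maternal + paternal
print(round(total))  # roughly 130 mutations per generation
```

Most of the total comes from the paternal line, which is why paternal age matters so much (see below).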
This value is similar to the estimate from comparing the sequences of different species (e.g. human and chimpanzee) based on the number of differences and the estimated time of divergence. This assumes that most of the genome is evolving at the rate expected for fixation of neutral alleles. This phylogenetic method gives a value of about 112 mutations per generation [Estimating the Human Mutation Rate: Phylogenetic Method].

The third way of measuring the mutation rate is to directly compare the genome sequences of a child and both parents (trios). After making corrections for false positives and false negatives, this method yields values of 60-100 mutations per generation depending on how the data is manipulated [Estimating the Human Mutation Rate: Direct Method]. The lower values from the direct method call into question the dates of the split between the various great ape lineages. This controversy has not been resolved [Human mutation rates] [Human mutation rates - what's the right number?].
It's clear that males contribute more to evolution than females. There's about a ten-fold difference in the number of cell divisions in the male line compared to the female line; therefore, we expect there to be about ten times more mutations inherited from fathers. This difference should depend on the age of the father since the older the father the more cell divisions required to produce sperm.
This effect has been demonstrated in many publications. A maternal age effect has also been postulated but that's been more difficult to prove. The latest study of Icelandic trios helps to nail down the exact effect (Jónsson et al., 2017).
The authors examined 1,548 trios consisting of parents and at least one offspring. They analyzed 2,682 Mb of genome sequence (84% of the total genome) and discovered an average of 70 mutation events per child.1 This gives an overall mutation rate of 83 mutations per generation with an average generation time of 30 years. This is consistent with previous results.
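A quick check of how the 70 observed events scale up to the whole genome, assuming (as the study does) that 84% of the genome was analyzed:

```python
# Scaling the observed de novo mutation count to the whole genome.
OBSERVED_EVENTS = 70      # average mutation events per child (Jónsson et al.)
FRACTION_ANALYZED = 0.84  # 2,682 Mb out of the whole genome

whole_genome = OBSERVED_EVENTS / FRACTION_ANALYZED
print(round(whole_genome))  # 83 mutations per generation
```

This reproduces the 83 mutations per generation quoted above.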
Jónsson et al. looked at 225 cases of three generation data in order to make sure that the mutations were germline mutations and not somatic cell mutations. They plotted the numbers of mutations against the age of the father and mother to produce the following graph from Figure 1 of their paper.
Look at parents who are 30 years old. At this age, females contribute about 10 mutations and males contribute about 50. This is only a five-fold difference—much less than we expect from the number of cell divisions. This suggests that the initial estimates of 400 cell divisions in males might be too high.
An age effect on mutations from the father is quite apparent and expected. A maternal age effect has previously been hypothesized but this is the first solid data that shows such an effect. The authors speculate that oocytes accumulate mutations with age, particularly mutations due to strand breakage.
Of these, 93% were single nucleotide changes and 7% were small deletions or insertions.
Jónsson, H., Sulem, P., Kehr, B., Kristmundsdottir, S., Zink, F., Hjartarson, E., Hardarson, M.T., Hjorleifsson, K.E., Eggertsson, H.P., and Gudjonsson, S.A. (2017) Parental influence on human germline de novo mutations in 1,548 trios from Iceland. Nature, 549:519-522. [doi: 10.1038/nature24018]
Ohno, M. (2019) Spontaneous de novo germline mutations in humans and mice: rates, spectra, causes and consequences. Genes & Genetic Systems, 94:13-22. [doi: 10.1266/ggs.18-00015]
Tuesday, October 31, 2017
The history of DNA sequencing
Pyrosequencing was developed in the mid-1990s and by the year 2000 massively parallel sequencing using this technique was becoming quite common. This "NextGen" sequencing technique was behind the massive explosion in sequences in the early part of the 21st century.2
Even newer techniques are available today and there's a debate about whether they should be called Third Generation Sequencing (Heather and Chain, 2015).
Nature has published a nice review of the history of DNA sequencing (Shendure et al., 2017). I recommend it to anyone who's interested in the subject. The figure above is taken from that article.
1. Many labs were using the technology in 1976 before the papers were published.
2. New software and enhanced computer power played an important, and underappreciated, role.
Heather, J.M., and Chain, B. (2015) The sequence of sequencers: The history of sequencing DNA. Genomics, 107:1-8. [doi: 10.1016/j.ygeno.2015.11.003]
Maxam, A.M., and Gilbert, W. (1980) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods in Enzymology, 65:499-560. [doi: 10.1016/S0076-6879(80)65059-9]
Sanger, F., Nicklen, S., and Coulson, A.R. (1977) DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, 74:5463-5467. [PDF]
Shendure, J., Balasubramanian, S., Church, G.M., Gilbert, W., Rogers, J., Schloss, J.A., and Waterston, R.H. (2017) DNA sequencing at 40: past, present and future. Nature, 550:345-353. [doi: 10.1038/nature24286]
Escape from X chromosome inactivation
Mammals have two sex chromosomes: X and Y. Males have one X chromosome and one Y chromosome and females have two X chromosomes. Since females have two copies of each X chromosome gene, you might expect them to make twice as much gene product as males of the same species. In fact, males and females often make about the same amount of gene product because one of the female X chromosomes is inactivated by a mechanism that causes extensive chromatin condensation.
The mechanism is known as X chromosome inactivation. The phenomenon was originally discovered by Mary Lyon (1925-2014) [see Calico Cats].

Saturday, October 28, 2017
Creationists questioning pseudogenes: the GULO pseudogene
This is the second post discussing creationist1 papers on pseudogenes. The first post addressed a paper by Jeffrey Tomkins on the β-globin pseudogene [Creationists questioning pseudogenes: the beta-globin pseudogene]. This post covers another paper by Tomkins claiming that the GULO pseudogenes in various primate species are not derived from a common ancestor but instead have been deactivated independently in each lineage.
The Tomkins article was published in 2014 in Answers Research Journal, a publication that describes itself like this:

ARJ is a professional, peer-reviewed technical journal for the publication of interdisciplinary scientific and other relevant research from the perspective of the recent Creation and the global Flood within a biblical framework.
Saturday, October 14, 2017
Creationists questioning pseudogenes: the beta-globin pseudogene
Jonathan Kane recently (Oct. 6, 2017) posted an article on The Panda's Thumb where he claimed that Young Earth Creationists often don't get enough credit for raising serious issues about evolution [Five principles for arguing against creationism].
He mentioned some articles about pseudogenes as prime examples. I asked him for references and he responded with two articles by Jeffrey Tomkins that were published on the Answers in Genesis website. The first was on the β-globin pseudogene and the second was on the GULO pseudogene. Both articles claim that these DNA sequences aren't really pseudogenes because they have functions.
I'll deal with the β-globin pseudogene in this post and the GULO pseudogene in a subsequent post.

Wednesday, October 11, 2017
Historical evolution is determined by chance events
Wednesday, September 13, 2017
Sequencing human diploid genomes
Monday, September 11, 2017
What's in Your Genome?: Chapter 4: Pervasive Transcription (revised)
I'm working (slowly) on a book called What's in Your Genome?: 90% of your genome is junk! The first chapter is an introduction to genomes and DNA [What's in Your Genome? Chapter 1: Introducing Genomes]. Chapter 2 is an overview of the human genome. It's a summary of known functional sequences and known junk DNA [What's in Your Genome? Chapter 2: The Big Picture]. Chapter 3 defines "genes" and describes protein-coding genes and alternative splicing [What's in Your Genome? Chapter 3: What Is a Gene?].
Chapter 4 is all about pervasive transcription and genes for functional noncoding RNAs. I've finally got a respectable draft of this chapter. This is an updated summary—the first version is at: What's in Your Genome? Chapter 4: Pervasive Transcription.

Saturday, September 09, 2017
Cold Spring Harbor tells us about the "dark matter" of the genome (Part I)
This is a podcast from Cold Spring Harbor [Dark Matter of the Genome, Pt. 1 (Base Pairs Episode 8)]. The authors try to convince us that most of the genome is mysterious "dark matter," not junk. The main theme is that the genome contains transposons that could play an important role in evolution and disease.
Saturday, September 02, 2017
Wednesday, August 30, 2017
Experts meet to discuss non-coding RNAs - fail to answer the important question
There's a reason why this question is important. It's because we have every reason to believe that spurious transcription is common in large genomes like ours. Spurious, or accidental, transcription occurs when the transcription initiation complex binds nonspecifically to sites in the genome that are not real promoters. Spurious transcription also occurs when the initiation complex (RNA polymerase plus factors) fires in the wrong direction from real promoters. Binding and inappropriate transcription are aided by the binding of transcription factors to nonpromoter regions of the genome—a well-known feature of all DNA binding proteins [see Are most transcription factor binding sites functional?].
Sunday, August 27, 2017
The Extended Evolutionary Synthesis - papers from the Royal Society meeting
The meeting was a huge disappointment [Kevin Laland's new view of evolution]. It was dominated by talks that were so abstract and obtuse that it was difficult to mount any serious discussion. The one thing that was crystal clear is that almost all of the speakers had an old-fashioned view of the current status of evolutionary theory. Thus, they were for the most part arguing against a strawman version of evolutionary theory.
The Royal Society has now published the papers that were presented at the meeting [Theme issue ‘New trends in evolutionary biology: biological, philosophical and social science perspectives’ organized by Denis Noble, Nancy Cartwright, Patrick Bateson, John Dupré and Kevin Laland]. I'll list the Table of Contents below.
Most of these papers are locked behind a paywall and that's a good thing because you won't be tempted to read them. The overall quality is atrocious—the Royal Society should be embarrassed to publish them.1 The only good thing about the meeting was that I got to meet a few friends and acquaintances who were supporters of evolution. There was also a sizable contingent of Intelligent Design Creationists at the meeting and I enjoyed talking to them as well2 [see Intelligent Design Creationists reveal their top story of 2016].
Friday, August 25, 2017
Niles Eldredge explains punctuated equilibria
Punctuated equilibria describe a pattern in which speciation events take place relatively quickly and are followed by much longer periods of stasis (no change). Niles Eldredge explains how the theory is derived from his studies of thousands of trilobite fossils.
Niles Eldredge explains hierarchy theory
How much of the human genome is devoted to regulation?
One of the common rationalizations is to speculate that while humans may have "only" 25,000 genes they are regulated and controlled in a much more sophisticated manner than the genes in other species. It's this extra level of control that makes humans special. Such speculations have been around for almost fifty years but they have gained in popularity since publication of the human genome sequence.
In some cases, the extra level of regulation is thought to be due to abundant regulatory RNAs. This means there must be tens of thousands of extra genes expressing these regulatory RNAs. John Mattick is the most vocal proponent of this idea and he won an award from the Human Genome Organization for "proving" that his speculation is correct! [John Mattick Wins Chen Award for Distinguished Academic Achievement in Human Genetic and Genomic Research]. Knowledgeable scientists know that Mattick is probably wrong. They believe that most of those transcripts are junk RNAs produced by accidental transcription at very low levels from non-conserved sequences.
Monday, August 07, 2017
A philosopher defends agnosticism
Paul Draper is a philosopher at Purdue University (West Lafayette, Indiana, USA). He has just (Aug. 2, 2017) posted an article on Atheism and Agnosticism on the Stanford Encyclopedia of Philosophy website.
Many philosophers use a different definition of atheism than many atheists. Philosophers tend to define atheism as the proposition that god(s) do not exist. Many atheists (I am one) define atheism as the lack of belief in god(s). The distinction is important but for now I want to discuss Draper's defense of agnosticism.

Keep in mind that Draper defines atheism as "god(s) don't exist." He argues, convincingly, that this proposition cannot be proven. He also argues that theism—the proposition that god(s) exist—cannot be proven either. Therefore, the only defensible position for a philosopher like him is agnosticism.
Friday, August 04, 2017
To toss or not to toss?
Some stuff is easy to toss out and some stuff is easy to keep. It's the other stuff that causes a problem. Here's an example ....
These are the manuals that came with my very first PC back in 1981. I know I'll never use them but I'm kinda attached to them. Are they antiques yet?
Thursday, July 27, 2017
talk.origins evolves
So talk.origins evolves and the server is moving elsewhere. Goodbye Darwin.
Friday, July 14, 2017
Bastille Day 2017
Ms. Sandwalk and I visited the site of the Bastille (Place de la Bastille) when we were in Paris in 2008. There's nothing left of the former castle but the site still resonates with meaning and history.
One of my wife's ancestors is William Playfair, the inventor of pie charts and bar graphs [Bar Graphs, Pie Charts, and Darwin]. His work attracted the attention of the French King so he moved to Paris in 1787 to set up an engineering business. He is said to have participated in the storming of the Bastille but he has a history of exaggeration and untruths so it's more likely that he just witnessed the event. He definitely lived nearby and was in Paris on the day in question. (His son, my wife's ancestor, was born in Paris in 1790.)
In honor of the French national day I invite you to sing the French national anthem, La Marseillaise. An English translation is provided so you can see that La Marseillaise is truly a revolutionary call to arms. (A much better translation can be found here.)1
1. I wonder if President Trump sang La Marseillaise while he was at the ceremonies today?
Check out Uncertain Principles for another version of La Marseillaise—this is the famous scene in Casablanca.
Reposted and modified from 2016.
Revisiting the genetic load argument with Dan Graur
The genetic load argument is one of the oldest arguments for junk DNA and it's one of the most powerful arguments that most of our genome must be junk. The concept dates back to J.B.S. Haldane in the late 1930s but the modern argument traditionally begins with Hermann Muller's classic paper from 1950. It has been extended and refined by him and many others since then (Muller, 1950; Muller, 1966).
Thursday, July 06, 2017
Scientists say "sloppy science" more serious than fraud
An article on Nature: INDEX reports on a recent survey of scientists: Cutting corners a bigger problem than research fraud. The subtitle says it all: Scientists are more concerned about the impact of sloppy science than outright scientific fraud.
The survey was published on BioMed Central.

Tuesday, July 04, 2017
Another contribution of philosophy: Bernard Lonergan
The challenge is to provide recent (past two decades) examples from philosophy that have led to increased knowledge and understanding of the natural world. Here's what Jonathan Bernier offered.
But to use just one example of advances in philosophical understanding, UofT (specifically Regis College) houses the Lonergan Research Institute, which houses Bernard Lonergan's archives and publishes his collected works. Probably his most significant work is a seven-hundred-page tome called Insight, the first edition of which was published in 1957. It is IMHO the single best account of how humans come to know anything that has ever been written. The tremendous fruits that it has wrought cannot be summarized in a FB comment. Instead, I'd suggest that you walk over and see the friendly people at the LRI. No doubt they could help answer some of your questions.

Here's a Wikipedia link to Bernard Lonergan. He was a Canadian Jesuit priest who died in 1984. Regis College is the Jesuit College associated with the University of Toronto.
Is Jonathan Bernier correct? Is it true that Lonergan's works will eventually change the way we understand learning?
Note: In my response to Bernier on Facebook I said, "I guess I'll just have to take your word for it. I'm not about to walk over to Regis College and consult a bunch of Jesuit priests about the nature of reality." Was I being too harsh? Is this really an example of a significant contribution of philosophy? Is it possible that a philosopher could be very wrong about the existence of supernatural beings but still make a contribution to the nature of knowledge and understanding?
1. Jonathan Bernier tells me on Facebook that he is not a philosopher and never claimed to be a philosopher.
Monday, July 03, 2017
Contributions of philosophy
Philosophers, historians, and sociologists of science such as Thomas Kuhn, Paul Feyerabend, Bruno Latour, Bas van Fraassen, and Ian Hacking have changed the way that we see the purpose of science in everyday life, as well as proper scientific conduct. Kuhn's concept of a paradigm shift is now so commonplace as to be cliche. Meanwhile, areas like philosophy of physics and especially philosophy of biology are sites of active engagement between philosophers and scientists about the interpretation of scientific results.
Sunday, July 02, 2017
Confusion about the number of genes
[According to Ensembl 86] the human genome encodes 58,037 genes, of which approximately one-third are protein-coding (19,950), and yields 198,093 transcripts. By comparison, the mouse genome encodes 48,709 genes, of which half are protein-coding (22,018 genes), and yields 118,925 transcripts overall.

The very latest Ensembl estimates (April 2017) for Homo sapiens and Mus musculus are similar. The difference in gene numbers between mouse and human is not significant according to the authors ...
The discrepancy in total number of annotated genes between the two species is unlikely to reflect differences in underlying biology, and can be attributed to the less advanced state of the mouse annotation.

This is correct but it doesn't explain the other numbers. There's general agreement on the number of protein-coding genes in mammals. They all have about 20,000 genes. There is no agreement on the number of genes for functional noncoding RNAs. In its latest build, Ensembl says there are 14,727 lncRNA genes, 5,362 genes for small noncoding RNAs, and 2,222 other genes for noncoding RNAs. The total number of non-protein-coding genes is 22,311.
There is no solid evidence to support this claim. It's true there are many transcripts resembling functional noncoding RNAs but claiming these identify true genes requires evidence that they have a biological function. It would be okay to call them "potential" genes or "possible" genes but the annotators are going beyond the data when they decide that these are actually genes.
Breschi et al. mention the number of transcripts. I don't know what method Ensembl uses to identify a functional transcript. Are these splice variants of protein-coding genes?
The rest of the review discusses the similarities between human and mouse genes. They point out, correctly, that about 16,000 protein-coding genes are orthologous. With respect to lncRNAs they discuss all the problems in comparing human and mouse lncRNA and conclude that "... the current catalogues of orthologous lncRNAs are still highly incomplete and inaccurate." There are several studies suggesting that only 1,000-2,000 lncRNAs are orthologous. Unfortunately, there's very little overlap between the two most comprehensive studies (189 lncRNAs in common).
There are two obvious possibilities. First, it's possible that these RNAs are just due to transcriptional noise and that's why the ones in the mouse and human genomes are different. Second, all these RNAs are functional but the genes have arisen separately in the two lineages. This means that about 10,000 genes for biologically functional lncRNAs have arisen in each of the genomes over the past 100 million years.
Breschi et al. don't discuss the first possibility.
Breschi, A., Gingeras, T.R., and Guigó, R. (2017) Comparative transcriptomics in human and mouse. Nature Reviews Genetics [doi: 10.1038/nrg.2017.19]
Genome size confusion
Breschi, A., Gingeras, T.R., and Guigó, R. (2017) Comparative transcriptomics in human and mouse. Nature Reviews Genetics [doi: 10.1038/nrg.2017.19]

I was confused by the comments made by the authors when they started comparing the human and mouse genomes. They said,
Cross-species comparisons of genomes, transcriptomes and gene regulation are now feasible at unprecedented resolution and throughput, enabling the comparison of human and mouse biology at the molecular level. Insights have been gained into the degree of conservation between human and mouse at the level of not only gene expression but also epigenetics and inter-individual variation. However, a number of limitations exist, including incomplete transcriptome characterization and difficulties in identifying orthologous phenotypes and cell types, which are beginning to be addressed by emerging technologies. Ultimately, these comparisons will help to identify the conditions under which the mouse is a suitable model of human physiology and disease, and optimize the use of animal models.
The most recent genome assemblies (GRC38) include 3.1 Gb and 2.7 Gb for human and mouse respectively, with the mouse genome being 12% smaller than the human one.

I think this statement is misleading. The size of the human genome isn't known with precision but the best estimate is 3.2 Gb [How Big Is the Human Genome?]. The current "golden path length" according to Ensembl is 3,096,649,726 bp. [Human assembly and gene annotation]. It's not at all clear what this means and I've found it almost impossible to find out; however, I think it approximates the total amount of sequenced DNA in the latest assembly plus an estimate of the size of some of the gaps.
The golden path length for the mouse genome is 2,730,871,774 bp. [Mouse assembly and gene annotation]. As is the case with the human genome, this is NOT the genome size. Not as much mouse DNA sequence has been assembled into a contiguous and accurate assembly as is the case with humans. The total mouse sequence is at about the same stage the human genome assembly was a few years ago.
If you look at the mouse genome assembly data you see that 2,807,715,301 bp have been sequenced and there are 79,356,856 bp in gaps. That's 2.89 Gb, which doesn't match the golden path length and doesn't match the past estimates of the mouse genome size.
We don't know the exact size of the mouse genome. It's likely to be similar to that of the human genome but it could be a bit larger or a bit smaller. The point is that it's confusing to say that the mouse genome is 12% smaller than the human one. What the authors could have said is that less of the mouse genome has been sequenced and assembled into accurate contigs.
If you go to the NCBI site for Homo sapiens you'll see that the size of the genome is 3.24 Gb. The comparable size for Mus musculus is 2.81 Gb. That's about 13% smaller than the human genome size. How accurate is that?
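The assembly arithmetic is easy to check against the numbers quoted above; note that the sequenced-plus-gaps total and the ratio of the two NCBI sizes both come out slightly different from the figures one might expect:

```python
# Mouse assembly: sequenced bases plus the gap estimate.
mouse_sequenced = 2_807_715_301
mouse_gaps = 79_356_856
assembly_total = mouse_sequenced + mouse_gaps
print(assembly_total)  # 2887072157, i.e. about 2.89 Gb

# NCBI genome sizes for the two species.
human_ncbi = 3.24e9
mouse_ncbi = 2.81e9
percent_smaller = (1 - mouse_ncbi / human_ncbi) * 100
print(round(percent_smaller))  # about 13% smaller
```

The 2.81 Gb NCBI figure for mouse works out to roughly 13% smaller than the 3.24 Gb human figure, which illustrates how sensitive these percentage claims are to which numbers you start from.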
There's a problem here. With all this sequence information, and all kinds of other data, it's impossible to get an accurate scientific estimate of the total genome sizes.
[Image Credit: Wikipedia: Creative Commons Attribution 2.0 Generic license]
Tuesday, June 27, 2017
Debating alternative splicing (Part IV)
In Debating alternative splicing (Part III) I discussed a review published in the February 2017 issue of Trends in Biochemical Sciences. The review examined the data on detecting predicted protein isoforms and concluded that there was little evidence they existed.
My colleague at the University of Toronto, Ben Blencowe, is a forceful proponent of massive alternative splicing. He responded in a letter published in the June 2017 issue of Trends in Biochemical Sciences (Blencowe, 2017). It's worth looking at his letter in order to understand the position of alternative splicing proponents. He begins by saying,

It is estimated that approximately 95% of multiexonic human genes give rise to transcripts containing more than 100 000 distinct AS events [3,4]. The majority of these AS events display tissue-dependent variation and 10–30% are subject to pronounced cell, tissue, or condition-specific regulation [4].
Monday, June 26, 2017
Debating alternative splicing (Part III)
Opponents (I am one) argue that most splice variants are due to splicing errors and most of those predicted protein isoforms don't exist. (We also argue that the differences between humans and other animals can be adequately explained by differential regulation of 20,000 protein-coding genes.) The controversy can only be resolved when proponents of massive alternative splicing provide evidence to support their claim that there are 100,000 functional proteins.
Saturday, June 24, 2017
Debating alternative splicing (part II)
If you add up all the known genes, they cover about 30% of the genome sequence. Most of this (>90%) is intron sequence and introns are mostly junk. The standard mammalian gene is transcribed to produce a precursor RNA that is subsequently processed by splicing out introns to produce a mature RNA. If it's a messenger RNA (mRNA) then it will be translated to produce a protein (technically, a polypeptide). So far, the vast majority of protein-coding genes produce a single protein but there are some classic cases of alternative splicing where a given gene produces several different protein isoforms, each of which has a specific function.
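Treating the quoted figures as exact (30% gene coverage, with ~90% of that intron sequence; the post says ">90%", so the exon figure is really an upper bound), the implied exon fraction of the genome is small:

```python
# Back-of-the-envelope: how much of the genome is exon sequence,
# given the fractions quoted above.
gene_fraction = 0.30   # fraction of the genome covered by known genes
intron_share = 0.90    # ">90%" of gene sequence is intron

intron_fraction = gene_fraction * intron_share   # ~27% of the genome
exon_fraction = gene_fraction - intron_fraction  # ~3% of the genome
print(round(exon_fraction * 100))  # about 3 percent
```

In other words, exons account for only a few percent of the genome even though genes cover almost a third of it.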
Friday, June 23, 2017
Debating alternative splicing (part I)
I recently had a chance to talk science with my old friend and colleague Jack Greenblatt. He has recently teamed up with some of my other colleagues at the University of Toronto to publish a paper on alternative splicing in mouse cells. Over the years I have had numerous discussions with these colleagues since they are proponents of massive alternative splicing in mammals. I think most splice variants are due to splicing errors.
There's always a problem with terminology whenever we get involved in this debate. My position is that it's easy to detect splice variants but they should be called "splice variants" until it has been firmly established that the variants have a biological function. This is not a distinction that's acceptable to proponents of massive alternative splicing. They use the term "alternative splicing" to refer to any set of processing variants regardless of whether they are splicing errors or real examples of regulation. This sometimes makes it difficult to have a discussion.

In fact, most of my colleagues seem reluctant to admit that some splice variants could be due to meaningless errors in splicing. Thus, they can't be pinned down when I ask them what percentage of variants are genuine examples of alternative splicing and what percentage are splicing mistakes. I usually ask them to pick out a specific gene, show me all the splice variants that have been detected, and explain which ones are functional and which ones aren't. I have a standing challenge to do this with any one of three sets of genes [A Challenge to Fans of Alternative Splicing].
- Human genes for the enzymes of glycolysis
- Human genes for the subunits of RNA polymerase with an emphasis on the large conserved subunits
- Human genes for ribosomal proteins
Thursday, June 22, 2017
Are most transcription factor binding sites functional?
The ongoing debate over junk DNA often revolves around data collected by ENCODE and others. The idea that most of our genome is transcribed (pervasive transcription) seems to indicate that genes occupy most of the genome. The opposing view is that most of these transcripts are accidental products of spurious transcription. We see the same opposing views when it comes to transcription factor binding sites. ENCODE and their supporters have mapped millions of binding sites throughout the genome and they believe this represents abundant and exquisite regulation. The opposing view is that most of these binding sites are spurious and non-functional.
The messy view is supported by many studies on the biophysical properties of transcription factor binding. These studies show that any DNA-binding protein has a low affinity for random-sequence DNA. They will also bind with much higher affinity to sequences that resemble, but do not precisely match, the specific binding site [How RNA Polymerase Binds to DNA; DNA Binding Proteins]. If you take a species with a large genome, like us, then a typical 6 bp binding site will be present, by chance alone, at about 800,000 sites. Not all of those sites will be bound by the transcription factor in vivo because some of the DNA will be tightly wrapped up in dense chromatin domains. Nevertheless, an appreciable percentage of the genome will be available for binding, so typical ENCODE assays detect thousands of binding sites for each transcription factor.

This information appears in all the best textbooks and it used to be a standard part of undergraduate courses in molecular biology and biochemistry. As far as I can tell, the current generation of new biochemistry researchers wasn't taught this information.
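The 800,000 figure follows from simple probability. A minimal sketch, assuming a haploid genome of ~3.2 × 10⁹ bp and a random-sequence model (real genomes are not random, so this is an order-of-magnitude estimate only):

```python
# Expected number of chance occurrences of a specific 6 bp
# binding site in a random-sequence genome.

genome_size = 3.2e9   # haploid human genome, base pairs (assumption)
site_length = 6       # typical binding-site length, bp

# A specific 6 bp sequence occurs with probability (1/4)**6
# at each position along the genome.
expected_sites = genome_size / 4 ** site_length

print(f"{expected_sites:,.0f}")  # 781,250 -- i.e. roughly 800,000
```

Longer recognition sequences shrink this number fourfold per added base pair, which is why short binding motifs are so abundant by chance alone.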
Jonathan Wells talks about junk DNA
Watch this video. It dates from this year. Almost everything Wells says is either false or misleading. Why? Is he incapable of learning about genomes, junk DNA, and evolutionary theory?
Some of my former students
Wednesday, June 21, 2017
John Mattick still claims that most lncRNAs are functional
Most of the human genome is transcribed at some time or another in some tissue or another. The phenomenon is now known as pervasive transcription. Scientists have known about it for almost half a century.
At first the phenomenon seemed really puzzling since it was known that coding regions accounted for less than 1% of the genome and genetic load arguments suggested that only a small percentage of the genome could be functional. It was also known that more than half the genome consists of repetitive sequences that we now know are bits and pieces of defective transposons. It seemed unlikely back then that transcripts of defective transposons could be functional.

Part of the problem was solved with the discovery of RNA processing, especially splicing. It soon became apparent (by the early 1980s) that a typical protein-coding gene was stretched out over 37,000 bp of which only 1300 bp were coding region. The rest was introns, and intron sequences appeared to be mostly junk.
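Those gene-size figures make the point numerically. A quick check, using only the approximate numbers quoted in the text:

```python
# Typical protein-coding gene figures cited in the text:
gene_length = 37_000    # bp, full transcribed region
coding_length = 1_300   # bp, protein-coding sequence

coding_fraction = coding_length / gene_length
print(f"coding: {coding_fraction:.1%} of the transcript")  # 3.5%
```

So more than 96% of a typical primary transcript is intron sequence, which is why pervasive transcription, by itself, says little about how much of the genome is functional.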
Tuesday, June 20, 2017
On the evolution of duplicated genes: subfunctionalization vs neofunctionalization
New genes can arise by gene duplication. These events are quite common on an evolutionary time scale. In the current human population, for example, there are about 100 examples of polymorphic gene duplications. These are cases where some of us have two copies of a gene while others have only one copy (Zarrei et al., 2015). Humans have gained about 700 new genes by duplication and fixation since we diverged from chimpanzees (Demuth et al., 2006). The average rate of duplication in eukaryotes is about 0.01 events per gene per million years and the half-life of a duplicated gene is about 4 million years (Lynch and Conery, 2003).
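Those rate figures imply that most duplicates disappear quickly. A minimal sketch, treating duplicate loss as simple exponential decay (my modeling assumption, not part of the cited papers):

```python
# Approximate figures from Lynch and Conery (2003):
# duplicate half-life of ~4 million years (My).
half_life_my = 4.0

def surviving_fraction(age_my):
    """Fraction of duplicates still intact after age_my million years,
    assuming simple exponential decay."""
    return 0.5 ** (age_my / half_life_my)

# Duplicates that arose ~6 My ago (roughly the human-chimp divergence):
print(f"{surviving_fraction(6):.2f}")   # 0.35
```

Under this simple model, only about a third of duplicates born at the human-chimp split would still be intact today, consistent with the claim that the typical fate of a duplicated gene is to "die."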
The typical fate of these duplicated genes is to "die" by mutation or deletion. There are five possible fates [see Birth and death of genes in a hybrid frog genome]:
- One of the genes will "die" by acquiring fatal mutations. It becomes a pseudogene.
- One of the genes will die by deletion.
- Both genes will survive because having extra gene product (e.g. protein) will be beneficial (gene dosage).
- One of the genes acquires a new beneficial mutation that creates a new function and at the same time causes loss of the old function (neofunctionalization). Now both genes are retained by positive selection and the complexity of the genome has increased.
- Both genes acquire mutations that diminish function so the genome now needs two copies of the gene in order to survive (subfunctionalization).
Monday, June 19, 2017
Austin Hughes and Neutral Theory
Chase Nelson has written a nice summary of Hughes' work at: Austin L. Hughes: The Neutral Theory of Evolution. It's worth reading the first few pages if you aren't clear on the concept. Here's an excerpt ...
When the technology enabling the study of molecular polymorphisms—variations in the sequences of genes and proteins—first arose, a great deal more variability was discovered in natural populations than most evolutionary biologists had expected under natural selection. The neutral theory made the bold claim that these polymorphisms become prevalent through chance alone. It sees polymorphism and long-term evolutionary change as two aspects of the same phenomenon: random changes in the frequencies of alleles. While the neutral theory does not deny that natural selection may be important in adaptive evolutionary change, it does claim that natural selection accounts for a very small fraction of genetic evolution.

I don't think there's any doubt that this claim is correct as long as you stick to the proper definition of evolution. The vast majority of fixations of alleles are likely due to random genetic drift and not natural selection.
A dramatic consequence now follows. Most evolutionary change at the genetic level is not adaptive.
It is difficult to imagine random changes accomplishing so much. But random genetic drift is now widely recognized as one of the most important mechanisms of evolution.
If you don't understand this then you don't understand evolution.
The only quibble I have with the essay is the reference to "Neutral Theory of Evolution" as the antithesis of "Darwinian Evolution" or evolution by natural selection. I think "Neutral Theory" should be restricted to the idea that many alleles are neutral or nearly neutral. These alleles can change in frequency in a population by random genetic drift. The key anti-Darwinian idea combines that fact with two other important facts:
- New beneficial alleles can be lost by drift before they ever become fixed. In fact, this is the fate of most new beneficial alleles. It's part of the drift-barrier hypothesis.
- Detrimental alleles can occasionally become fixed in a population due to drift.
Originally proposed by Motoo Kimura, Jack King, and Thomas Jukes, the neutral theory of molecular evolution is inherently non-Darwinian. Darwinism asserts that natural selection is the driving force of evolutionary change. It is the claim of the neutral theory, on the other hand, that the majority of evolutionary change is due to chance.

I would just add that it's Neutral Theory PLUS the other effects of random genetic drift that make evolution much more random than most people believe.
Austin Hughes was a skeptic and a creative thinker who often disagreed with the prevailing dogma in the field of evolutionary biology. He was also very religious, a fact I find very puzzling.
His scientific views were often correct, in my opinion.
In 2013, the ENCODE (Encyclopedia of DNA Elements) Project published results suggesting that eighty per cent of the human genome serves some function. This was considered a rebuttal to the widely held view that a large part of the genome was junk, debris collected over the course of evolution. Hughes sided with his friend Dan Graur in rejecting this point of view. Their argument was simple. Only ten per cent of the human genome shows signs of purifying selection, as opposed to neutrality.
Saturday, June 17, 2017
I coulda been an astronomer
In spite of this promising beginning, I decided to go into biology because it was harder and more interesting.
Tuesday, June 06, 2017
June 6, 1944
For baby boomers it means a day of special significance for our parents. In my case, it was my father who took part in the invasions. That's him on the right as he looked in 1944. He was an RAF pilot flying rocket-firing Typhoons in close support of the ground troops. During the initial days his missions were limited to quick strikes and reconnaissance since Normandy was at the limit of their range from southern England. During the second week of the invasion (June 14th) his squadron landed in Crepon, Normandy, and things became very hectic from then on, with several close-support missions every day.
Stephen Meyer "predicts" there's no junk DNA
Wednesday, May 31, 2017
Tuesday, May 30, 2017
We are scientists
You can tell we are scientists because we're all wearing lab coats.
Left to right: David Isenman, Larry Moran, Marc Perry, Kim Ellison, Trevor Moraes, Mike Ellison.
The photo was taken in the biochemistry department labs at the University of Toronto (Toronto, Canada).
Three generations of scientists
Bottom row, left to right.
Marc Perry: Bioinformatics researcher and former graduate student in my lab.
Mike Ellison: Professor, University of Alberta (Alberta, Canada) and former graduate student in the lab of my colleague David Pulleyblank.
Trevor Moraes: Professor in my department at the University of Toronto and former graduate student with Mike Ellison.
Kim (Bird) Ellison: Professor at the University of Alberta, former undergraduate student in my lab (where she met Mike Ellison), Ph.D. at MIT.
Saturday, May 20, 2017
Denis Noble writes about junk DNA
I have read Dance to the Tune of Life. It's a very confusing book for several reasons. Denis Noble has a very different perspective on evolution and what evolutionary theory needs to accomplish. He thinks that life is characterized by something he calls "Biological Relativity." I don't disagree. He also thinks that evolutionary theory needs to incorporate everything that has ever happened in the history of life. That's where we part company.
I'm working slowly on a book about genomes and junk DNA so I was anxious to see how Noble deals with that subject. I tend to judge the quality of books and articles by the way they interpret the controversy over junk DNA. Here's the first mention of junk DNA from page 89. He begins by saying that it's difficult to explain development and the diversity of tissues in multicellular organisms. He continues with,

Thursday, May 18, 2017
Jonathan Wells illustrates zombie science by revisiting junk DNA
Jonathan Wells has written a new book (2017) called Zombie Science: More Icons of Evolution. He revisits his famous Icons of Evolution from 2000 and tries to show that nothing has changed in 17 years.
I wrote a book in 2000 about ten images, ten "icons of evolution," that did not fit the evidence and were empirically dead. They should have been buried, but they are still with us, haunting our science classrooms and stalking our children. They are part of what I call zombie science.

I won't bore you with the details. The icons fall into two categories: (1) those that were meaningless and/or trivial in 2000 and remain so today, and (2) those that Wells misunderstood in 2000 and are still misunderstood by creationists today.
Tuesday, May 16, 2017
"The Perils of Public Outreach"
Julia Shaw is a forensic psychologist. She is currently a senior lecturer in criminology at the London South Bank University (London, UK). Shaw is concerned that we are creating a culture where public outreach is being unfairly attacked. Read her Scientific American post at: The Perils of Public Outreach.
Shaw's point is rather interesting. She believes that scientists who participate in public outreach are being unfairly criticized. Let's look closely at her argument.

What scientists write in academic publications is generally intended for a scientific community, full of nuance and precise language. Instead, what scientists say and write in public forums is intended for lay audiences, almost invariably losing nuance but gaining impact and social relevance. This makes statements made in public forums particularly ripe for attack.
Wednesday, May 10, 2017
Debating philosophers: Pierrick Bourrat responds to my criticism of his paper
I recently criticized a paper by Lu and Bourrat on the extended evolutionary synthesis [Debating philosophers: The Lu and Bourrat paper]. Pierrick Bourrat responds in this guest post.
Research Fellow, Department of Philosophy
Macquarie University
Sydney, Australia
Both Qiaoying Lu and I are grateful to Professor Moran for the copious attention he has bestowed on our paper. We are early career researchers and didn’t expect our paper to receive so much attention from a senior academic in a public forum. Moran claims that our work is out of touch with science (and more generally works in philosophy of biology), that the paper is weakly argued and that some of what we write is false. But in the end, he puts forward a similar position to ours.
Saturday, May 06, 2017
Debating philosophers: Epigenetics
Qiaoying Lu and Pierrick Bourrat are philosophers in Australia.1 Their research interests include evolutionary theory and they have taken an interest in the current debate over extending evolutionary theory. That debate has been promoted by a small group of scientists who, by and large, are not experts in evolution. They claim that current evolutionary theory—which they define incorrectly as the 1960s version of the Modern Synthesis—needs to be overthrown or extended by including things like epigenetics, niche construction, developmental biology, and plasticity [New Trends in Evolutionary Biology: The Program].
Lu and Bourrat have focused on epigenetics in their recent paper [Debating philosophers: The Lu and Bourrat paper]. They hope to reach an accommodation by re-defining the evolutionary gene as: "any physical structure that causes a heritable variation." Then they go on to say that, "we define the phenotype of an evolutionary gene as everything that the gene makes a difference to when compared to another gene."

By doing this, they claim that epigenetic changes (e.g. transient methylation) fall within the new definition. Therefore, epigenetics doesn't really represent a challenge to evolutionary theory. They explain it like this ....
Thursday, May 04, 2017
Debating philosophers: The molecular gene
This is my fifth post on the Lu and Bourrat paper [Debating philosophers: The Lu and Bourrat paper]. The authors are attempting to justify the inclusion of epigenetics into current evolutionary theory by re-defining the concept of "gene," specifically the evolutionary gene concept. So far, I've discussed their understanding of current evolutionary theory and why I think it is flawed [Debating philosophers: The Modern Synthesis]. I described their view of "genes" and pointed out the confusion between "genes" and "alleles" and why I think "alleles" is the better term [Debating philosophers: The difference between genes and alleles]. In my last post I discussed their definition of the evolutionary gene and why it is too adaptationist to serve a useful function [Debating philosophers: The evolutionary gene].
Wednesday, May 03, 2017
Debating philosophers: The evolutionary gene
This is the fourth post on the Lu and Bourrat paper [Debating philosophers: The Lu and Bourrat paper]. The philosophers are attempting to redefine the word "gene" in order to make epigenetics compatible with current evolutionary theory.
I define a gene in the following way: "A gene is a DNA sequence that is transcribed to produce a functional product" [What Is a Gene?]. This is a biochemical/molecular definition and it's not the same as the definition used in traditional evolution.

Lu and Bourrat discuss the history of the evolutionary gene and conclude,
Debating philosophers: The difference between genes and alleles
This is my third post on the Lu and Bourrat (2017) paper [Debating philosophers: The Lu and Bourrat paper]. Part of their argument is to establish that modern evolutionary theory is a gene-centric theory. They need to make this connection because they are about to re-define the word "gene" in order to accommodate epigenetics.
In my last post I referred to their defense of the Modern Synthesis and quoted them as saying that the major tenets of the Modern Synthesis (MS) are still the basis of modern evolutionary theory. They go on to say,