Sandwalk: Estimating the Human Mutation Rate: Direct Method

Friday, March 22, 2013

Estimating the Human Mutation Rate: Direct Method

This is the fourth in a series of posts on human mutation rates and their implication(s). The first three were ...

What Is a Mutation?
Estimating the Human Mutation Rate: Biochemical Method
Estimating the Human Mutation Rate: Phylogenetic Method

There are basically three ways to estimate the mutation rate in the human lineage. I refer to them as the Biochemical Method, the Phylogenetic Method, and the Direct Method.

The Biochemical Method is based on our knowledge of biochemistry and DNA replication as well as estimates of the number of cell divisions between zygote and egg. It gives a value of 130 mutations per generation. The Phylogenetic Method depends on the fact that most mutations are neutral and that the rate of fixation of alleles is equal to the mutation rate. It also relies on a correct phylogeny. The Phylogenetic Method gives values between 112 and 160 mutations per generation. These two methods are pretty much in agreement.

The Direct Method involves sequencing the entire genomes of related individuals (e.g. mother, father, child) and simply counting the new mutations in the offspring. You might think that the Direct Method gives a definitive result that doesn't rely on any assumptions, so it should yield the most accurate result and make the other two methods irrelevant.

This would be true if the Direct Method were as easy as it sounds, but things are more complicated.

The first paper to be published was by Xue et al. (2009). They looked at the sequences of Y chromosomes from two men separated by 13 generations (6 generations in one lineage and 7 in the other). The Y chromosomes differed by four mutations in 10.15 × 10⁶ bp.¹ These are neutral mutations and the rate works out to 3.0 × 10^-8 mutations per base pair per generation.

If we assume an average of 400 cell divisions per generation (male lineage) then this gives a mutation rate of 0.75 × 10^-10 mutations per bp per replication. This isn't far from the value of 1.0 × 10^-10 that we used in the Biochemical Method.
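The arithmetic behind these two rates can be sketched as follows. This is only an illustration using the figures quoted above; the function and variable names are made up for the example. Note that the unrounded per-generation value gives roughly 0.76 × 10^-10 per replication, while the 0.75 × 10^-10 in the text comes from dividing the rounded 3.0 × 10^-8 by 400.

```python
# Figures quoted in the post from Xue et al. (2009).
observed_mutations = 4
compared_bp = 10.15e6            # Y-chromosome base pairs compared
generations = 13                 # 6 + 7 generations separating the two men
divisions_per_generation = 400   # assumed average for the male lineage


def rate_per_bp_per_generation(mutations, bp, gens):
    """Mutations per base pair per generation."""
    return mutations / (bp * gens)


def rate_per_bp_per_replication(per_generation_rate, divisions):
    """Spread the per-generation rate over the cell divisions in a generation."""
    return per_generation_rate / divisions


per_gen = rate_per_bp_per_generation(observed_mutations, compared_bp, generations)
per_rep = rate_per_bp_per_replication(per_gen, divisions_per_generation)

print(f"{per_gen:.2e} mutations per bp per generation")
print(f"{per_rep:.2e} mutations per bp per replication")
```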

If we apply this mutation rate to the entire genome then there will be 96 mutations in each sperm cell and 7 in each egg cell for a total of ...

103 mutations per generation
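Here is a rough sketch of how the 96 + 7 = 103 figure can be reproduced. The haploid genome size (3.2 × 10^9 bp) and the number of cell divisions in the female lineage (30) are not stated in this post; they are assumptions chosen because they reproduce the quoted numbers. The names are illustrative only.

```python
RATE_PER_BP_PER_REPLICATION = 0.75e-10  # from the Y-chromosome estimate above
HAPLOID_GENOME_BP = 3.2e9               # assumption: haploid genome size
SPERM_DIVISIONS = 400                   # cell divisions per generation, male lineage
EGG_DIVISIONS = 30                      # assumption: divisions in the female lineage


def mutations_per_gamete(rate, genome_bp, divisions):
    """Expected new mutations carried by one gamete."""
    return rate * genome_bp * divisions


sperm = mutations_per_gamete(RATE_PER_BP_PER_REPLICATION, HAPLOID_GENOME_BP, SPERM_DIVISIONS)
egg = mutations_per_gamete(RATE_PER_BP_PER_REPLICATION, HAPLOID_GENOME_BP, EGG_DIVISIONS)
total = sperm + egg

print(f"sperm: {sperm:.0f}, egg: {egg:.0f}, total: {total:.0f} mutations per generation")
```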

The problems with this calculation have to do with deciding how many real mutations there are. In this particular experiment, the Y chromosomes were extracted from cells in culture. The authors actually found 23 differences between the two Y chromosomes but only 12 of these were confirmed by resequencing. Of these, only four were confirmed by sequencing DNA directly from the donors. (Eight mutations occurred during growth of the cell lines.) The authors are confident that they have not missed any mutations and I suspect that the number of false negatives is, in fact, close to zero.

This value (103 mutations per generation) is on the low end of the values calculated previously but the error bars are large because so few mutations were counted.
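To get a feel for how wide those error bars might be, here is a minimal sketch that treats the four observed mutations as a Poisson count and computes an exact 95% interval by bisection. This is my illustration, not a calculation from Xue et al.; it covers sampling error only and ignores uncertainty in the number of cell divisions or in mutation calling.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def solve_mu(k, p, upper_limit=100.0):
    """Find mu with poisson_cdf(k, mu) == p (the CDF decreases as mu grows)."""
    lo, hi = 0.0, upper_limit
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > p:
            lo = mid   # CDF still too high, so mu is too small
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_interval(count, alpha=0.05):
    """Exact (Garwood) confidence interval for a Poisson mean."""
    lower = 0.0 if count == 0 else solve_mu(count - 1, 1 - alpha / 2)
    upper = solve_mu(count, alpha / 2)
    return lower, upper


observed = 4
point_estimate = 103  # mutations per generation, from the text
lower, upper = exact_poisson_interval(observed)

# Scale the genome-wide estimate by the same factor as the count.
low_estimate = point_estimate * lower / observed
high_estimate = point_estimate * upper / observed
print(f"95% interval on the count: {lower:.2f} to {upper:.2f}")
print(f"implied range: {low_estimate:.0f} to {high_estimate:.0f} mutations per generation")
```

Under these assumptions the interval spans roughly a factor of ten, so a count of four mutations is compatible with both the lower and the higher estimates discussed in this series.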

Three other papers have appeared recently.²

1. Roach et al. (2010) sequenced genomes from a family of four (mother, father, two children). They found 33,937 potential mutations but confirmed only 28 mutations in the two children. After making some adjustments for false negatives they estimate that the total average number of mutations per diploid genome per generation was ...

70 mutations per generation

This is about half the value estimated by the Biochemical and Phylogenetic Methods. It's not clear to me how they estimated the true number of mutations. What is clear is that it is not easy to count mutations when dealing with sloppy sequences.

2. Conrad et al. (2011) looked at two sets of parents and offspring (trios). They used cell lines so they had to distinguish between germline mutations and somatic cell mutations. One of the offspring had 49 mutations and the other had 35 mutations. There were 1,586 somatic cell mutations that had to be eliminated. After correcting for false negatives, they estimate 60 mutations in one child and 45 mutations in the other. Since only 2.555 Gb were analyzed, this works out to ...

75 mutations per generation
56 mutations per generation

These values are lower than what we expected from previous studies. The authors determined that 92% of the mutations in one offspring were from the father but only 36% of the mutations in the other trio were from the father. This is not reasonable and neither is the discrepancy in total mutations between the two different offspring. It suggests that there are a lot of errors in this study.

3. The most comprehensive study so far is from Kong et al. (2012). These authors looked at 78 Icelandic families whose genealogies were well known. They sequenced the genomes of 219 distinct individuals and found an average of 63.2 mutations in each child. Since they only looked at 2.63 Gb, this translates to ...

77 mutations per generation

Individual values vary over a wide range. The lowest score reported is 58 and the highest is 129. This study suffers from the same problems as the other two direct sequencing experiments; namely, that it's difficult to decide which of the differences are real mutations and which ones are artifacts. The authors claim that their false negative rate is only 2%.
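The coverage corrections applied to the Conrad et al. and Kong et al. counts above can be sketched as a simple proportional scaling. The 3.2 Gb target is an assumption on my part; it is the value that reproduces the 75, 56, and 77 figures quoted in the text. The function name is illustrative.

```python
FULL_GENOME_BP = 3.2e9  # assumption: genome size used for the extrapolation


def scale_to_whole_genome(count, surveyed_bp, genome_bp=FULL_GENOME_BP):
    """Extrapolate a mutation count from the surveyed region to the whole genome."""
    return count * genome_bp / surveyed_bp


# (mutations after false-negative correction, base pairs analyzed)
studies = {
    "Conrad et al. (2011), trio 1": (60, 2.555e9),
    "Conrad et al. (2011), trio 2": (45, 2.555e9),
    "Kong et al. (2012), average": (63.2, 2.63e9),
}

for name, (count, surveyed_bp) in studies.items():
    scaled = scale_to_whole_genome(count, surveyed_bp)
    print(f"{name}: {scaled:.0f} mutations per generation")
```

This scaling assumes the unsurveyed regions mutate at the same rate as the surveyed ones, which may not hold for repeats and other poorly covered sequence.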

The whole genome sequencing papers have been widely reported as giving a result that is half the mutation rate we estimated previously. This is a problem because the mutation rate is used in many calculations. We'll discuss the implications in later posts.

1. The Y chromosome is 24 Mb but they couldn't analyze regions of repeats and some other regions weren't well covered.

2. Please let me know if I missed any papers.

Conrad, D.F., et al. (2011) Variation in genome-wide mutation rates within and between human families. Nature Genetics 43:712-715. [doi: 10.1038/ng.862]

Kong, A., et al. (2012) Rate of de novo mutations and the importance of father's age to disease risk. Nature 488:471-475. [doi: 10.1038/nature11396]

Roach, J.C., et al. (2010) Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328:636-639. [doi: 10.1126/science.1186802]

Xue, Y., et al. (2009) Human Y chromosome base-substitution mutation rate measured by direct sequencing in a deep-rooting pedigree. Current Biology 19:1453-1457. [doi: 10.1016/j.cub.2009.07.032]

21 comments :

TheOtherJim said...: There were also estimates in one of the 1000 genomes project papers (Nature 467, 1061–1073, October 2010). 49 and 35 detected SNPs, giving 1.2 × 10^-8 and 1.0 × 10^-8 muts/bp/gen.; Friday, March 22, 2013 5:09:00 PM
Konrad said...: Larry, I thought you banned this troll. He is polluting an interesting post with comments that just consist of negative tone and almost no content.

But let's take the phrase "when genetic mutations go against evolution" as a starting point. This sounds like a non sequitur, but I think the intended meaning was "reversions" (i.e. when a 2nd mutation at the same site reverts the nucleotide back to its original state) - which would make it a valid question (despite the extremely rude phrasing) that can actually contribute to the discussion:

Reversions are a well understood phenomenon and are for instance important in HIV evolution, where the mutation rate is high and selection is strong. In the studies Larry cited, any reversions that occur will be missed, resulting in underestimation of rate, but the question is whether reversions are common enough to affect the numerical estimates. From the biochemical approach we know that they cannot be _very_ common, so the effect will be minimal (i.e. will not affect the estimates at the level of precision given) unless there is a mechanism causing the rates to be hugely elevated in specific regions. Regions in which the mutation rate is elevated are called mutational hotspots and quantitative estimates of this effect are well established. Perhaps someone has the numbers handy; I would be _very_ surprised if mutational hotspots are strong enough to cause enough reversions to alter these estimates.

So nice try, John, but the answer is that reversions are not something that any of these researchers will have failed to think about, or that creates a problem for these analyses. Now if only you'd learn to phrase your questions a little more respectfully you might get serious answers from those of us who investigate these things for a living more often. But of course I understand that serious answers are the last thing you are after.; Friday, March 22, 2013 6:37:00 PM
Konrad said...: Ah, I see the troll post I was responding to has disappeared in the meantime. My post above is an answer to a potentially legitimate question about reversions. I thought I'd post it in case other readers are wondering about that.; Friday, March 22, 2013 6:41:00 PM
TheOtherJim said...: The comments are apparently hand-deleted. So his stuff persists for a bit.; Friday, March 22, 2013 6:44:00 PM
The whole truth said...: A new article that you all might find interesting:

http://www.sciencedaily.com/releases/2013/03/130322114856.htm; Saturday, March 23, 2013 3:36:00 AM
AllanMiller said...: There is a danger of circularity in feeding mutation rates from some of these methods into other calculations (which is why it is useful that there is more than one). The phylogenetic method, for example, uses a fossil-based time for divergence, but if the mutation rate from that is then used to revise the time of the Homo-Pan split ...; Saturday, March 23, 2013 6:08:00 AM
Larry Moran said...: I'm not going to discuss mutation rates in mitochondria. It's clear that they are unreliable and they've become irrelevant.; Saturday, March 23, 2013 7:20:00 AM
DK said...: it's difficult to decide which of the differences are real mutations and which ones are artifacts

Not a real issue. Resequencing 100 or so sites by a different method is easy and completely eliminates the problem.; Saturday, March 23, 2013 10:18:00 AM
Larry Moran said...: In a typical experiment there are over one million potential differences. The most obvious sequencing errors can be eliminated if you have extensive coverage (6 X). But this isn't always the case with short reads.

One is left with about 40,000 good candidates. I'm glad you think it's easy to resequence all those sites using a different method.

Of course this doesn't help at all with false negatives.; Saturday, March 23, 2013 11:35:00 AM
Joe Felsenstein said...: Thanks, Larry, for this useful series of posts. I assume that in the phylogenetic method the authors are allowing for coalescent effects which will lead to a divergence of the gene copies that is greater than the time back to the fork on the species trees. If they do not do that they will get too high a mutation rate.

Another interesting issue is whether human mutation rates in recent years are higher than they have been in the longer term. The estimated mutation rates would seem to imply too high a mutational load (and creationists have noted this and are crowing that this shows that humans are deteriorating rapidly and could not have been around longer than, oh say, 6008 years). An elevated mutation rate could be due to being in an industrial society, or perhaps even just to having more of our reproduction done by older males than used to be the case. Comparison of mutation rates on branches of the phylogeny that do not include humans would be interesting as a check on whether human mutation rates are elevated over their (pre-)historic levels.; Saturday, March 23, 2013 11:39:00 AM
AllanMiller said...: Yet - taken at face value - the direct measures, necessarily recent, give a lower rate than the 'overall' methods that take the full divergent period.; Saturday, March 23, 2013 3:51:00 PM
Joe Felsenstein said...: You're right. Still, all these estimates are somewhat too high for the mutational load implied.; Saturday, March 23, 2013 5:02:00 PM
DK said...: One is left with about 40,000 good candidates

Can you elaborate? I admit to not being an expert but in no publication or personal conversation have I come across a figure this high. Seriously? "Good candidates"? Good coverage still gets you ~40,000??? Based on what is the number reduced to the typical ~100?; Saturday, March 23, 2013 9:46:00 PM
AllanMiller said...: I don't know the derivation in detail, but 'harmful' mutations of the order of 2 per individual are often quoted, which gives unreasonable figures for the proportion of the population that should fail to reproduce, and the numbers of offspring viable females would need to produce to offset those losses. However, it's not clear why as many directly harmful mutations should be thought to get through the filter of gametogenesis and early post-fertilisation expression. The proportion of harmful genes expressed late enough to allow the individual to exist as a counted non-reproducer in the population must be small?; Sunday, March 24, 2013 5:56:00 AM
TheOtherJim said...: I think that most of the "harmful" mutations are often recessives. So I'm not sure that these estimates are too decoupled from what we observe. One study found 31% of all pregnancies spontaneously terminated (22% occurring before clinical confirmation of pregnancy). 95% of the couples in the study went on to have a child within 2 years, so it was not an effect specific to low-fertility couples.

Another study claims that ~60% are due to chromosomal abnormalities, leaving 40% as undiagnosed. There is a lot of room where deleterious mutations could be involved.

www.ncbi.nlm.nih.gov/pubmed/3393170
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1013610/; Monday, March 25, 2013 9:40:00 AM
caynazzo said...: Larry is missing an important study "Estimating the human mutation rate using autozygosity in a founder population (2012)," which I'm surprised Joe Felsenstein didn't mention because it's from his department.

I would say this paper is more robust because it didn't just look at the bottom of the tree. I think there were regions of 100Mb autozygosity in the Hutterites studied but still some uncertainty as to what constitutes the common ancestor with respect to these regions.; Tuesday, March 26, 2013 12:04:00 PM
AllanMiller said...: I think that most of the "harmful" mutations are often recessives

Perhaps - recessives are certainly capable of getting through the filter; I don't think there is a mechanism constraining new mutations to be recessive?

But I think the assumption may nonetheless be a little high. On a rate of 130 per individual, 1/75th of mutations being adjudged 'harmful' would indicate that 1.33% of bases cannot tolerate a SNP without harm, or, spreading the risk out (and assuming 96% junk) that 1/3rd of all SNPs in non-junk are harmful.; Tuesday, March 26, 2013 1:03:00 PM
Andrew.Erickson said...: We had a discussion about evolution at my work lunch the other day. An evolution "skeptic" made a claim that there are never mutations in humans. A quick Google search on my phone brought up this article so I was able to rebut him.
Great read, thanks!; Tuesday, April 23, 2013 12:47:00 PM
Ryan W. said...: 1. Wouldn't recessives be far more common, though, since if you still have one working copy for a gene....

2. Isn't a big portion of that 'junk' structural, meaning that mutations to it could be deleterious?; Friday, July 26, 2013 12:54:00 AM
MattDehn said...: Wow! This material relates directly to my research paper about Genetic Diversity among Living Humans! The most interesting thing about your review Dr. Moran is citing Kong's 2012 study and the age dependence of the mutation rate. Very riveting and state of the art work.; Monday, March 03, 2014 10:19:00 PM
Unknown said...: What I especially appreciated was the common language approach that allowed me, an armchair expert (couch potato!) to understand your ideas with only limited trips to the Google define! Thank you very much, Larry.; Monday, January 11, 2016 10:44:00 AM
