Friday, March 19, 2010

Direct Measurement of Human Mutation Rate

Given what we know about errors in DNA replication, we estimate that every human infant carries 130 new mutations [Mutation Rates].

This number corresponds to 75 mutations per haploid genome per generation, or a mutation rate of 2.3 × 10⁻⁸ per base pair in the haploid genome (3.2 × 10⁹ bp) per generation. This value is consistent with a variety of experimental measurements, notably the rate in Y chromosomes [Human Y Chromosome Mutation Rates].

A recent paper in Science attempted a direct measurement of the mutation rate by comparing the complete genome sequences of two offspring and their parents. The authors estimate that each offspring had 70 new mutations (instead of the predicted 130), for a mutation rate of 1.1 × 10⁻⁸ per base pair in the haploid genome per generation (Roach et al. 2010).
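As a rough check on the arithmetic, here is a minimal Python sketch that converts the per-genome counts quoted above into per-base-pair rates. The function and variable names are my own, and the only inputs are numbers already given in this post: 75 mutations per haploid genome, 70 mutations per diploid offspring, and a haploid genome of 3.2 × 10⁹ bp.

```python
# Illustrative arithmetic only; names are hypothetical, inputs come from the post.
HAPLOID_GENOME_BP = 3.2e9  # haploid genome size used in the post


def haploid_count(mutations_per_diploid_offspring):
    """Each offspring inherits two haploid genomes, so halve the diploid count."""
    return mutations_per_diploid_offspring / 2


def per_bp_rate(mutations_per_haploid_genome, genome_bp=HAPLOID_GENOME_BP):
    """Mutations per base pair per generation."""
    return mutations_per_haploid_genome / genome_bp


# Earlier estimate: 75 mutations per haploid genome, as stated in the post.
predicted = per_bp_rate(75)

# Direct measurement: 70 new mutations per diploid offspring.
measured = per_bp_rate(haploid_count(70))

print(f"predicted rate: {predicted:.1e} per bp per generation")  # 2.3e-08
print(f"measured rate:  {measured:.1e} per bp per generation")   # 1.1e-08
print(f"measured / predicted: {measured / predicted:.2f}")       # 0.47
```

The ratio of about 0.47 is where the "only half" comparison comes from.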

This value is only half of the rate expected from previous results, and John Hawks is pretty concerned about that: "A low human mutation rate may throw everything out of whack." If it's true, then the time of divergence of humans and chimps would have to be set at 9 Myr, and a lot of studies of recent human evolution could be off by a factor of 2.

I don't think there's any cause for concern because the measured rate in these sequenced genomes is not nearly as reliable as you might think. Most people imagine that it's merely a question of adding up all the differences between the genomes of the two offspring and their parents. Problem is, that number came to 49,720 potential mutations and that's certainly wrong. What to do?

The authors checked their data and did a bunch of re-sequencing to confirm the differences. They ended up with 28 confirmed differences in the two genomes. That's too low, so they went back over the data to see if they could find false negatives. After some careful analysis they were able to estimate the false negative rate and adjust the final tally to 70 new mutations in each diploid offspring. (Recall that the calculation based on known DNA replication error rates was 130 mutations per diploid genome per generation.)
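To get a feel for how large that adjustment is, here is a back-of-the-envelope Python sketch. It is not the authors' procedure. It assumes the 28 confirmed differences are the total across both offspring (which is how the paragraph above reads), and it lumps every reason a real mutation could be missed into a single hypothetical "detection fraction". The function names are my own.

```python
# Back-of-the-envelope sketch; NOT the method used by Roach et al.
# Assumption: the 28 confirmed mutations are the total across both offspring.


def implied_detection_fraction(confirmed_total, n_offspring, estimated_per_offspring):
    """Fraction of the estimated mutations that was actually confirmed."""
    confirmed_per_offspring = confirmed_total / n_offspring
    return confirmed_per_offspring / estimated_per_offspring


def correct_for_false_negatives(confirmed, detection_fraction):
    """Scale a confirmed count up by the fraction of true mutations detected."""
    if not 0 < detection_fraction <= 1:
        raise ValueError("detection_fraction must be in (0, 1]")
    return confirmed / detection_fraction


# 28 confirmed in two offspring, adjusted to 70 per offspring (figures from the post).
fraction = implied_detection_fraction(28, 2, 70)
print(f"implied detection fraction: {fraction:.2f}")  # 0.20

# Running the correction forward recovers the adjusted tally.
print(f"corrected count: {correct_for_false_negatives(14, fraction):.0f}")  # 70
```

Under those assumptions only about one in five of the estimated mutations was directly confirmed, so the final number leans heavily on the false negative estimate.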

This is how they arrived at their final value for the mutation rate. Given the possible sources of error in the genome sequence data, I don't think we should get too excited about this number. After all, it's only a bit lower than previous estimates. We should be celebrating the remarkable consistency of the data and not the variability.

John, you don't need to re-write your grant—at least not for that reason.


Roach, J.C., Glusman, G., Smit, A.F.A., Huff, C.D., Hubley, R., Shannon, P.T., Rowen, L., Pant, K.P., Goodman, N., Bamshad, M., Shendure, J., Drmanac, R., Jorde, L.B., Hood, L., and Galas, D.J. (2010) Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing. Science (Published Online March 10, 2010) [doi: 10.1126/science.1186802]

7 comments:

  1. Ah, if only I had a grant to rewrite.

    Thanks, Larry!

  2. Worrying about the mutation rate is academic at this point. Whole-genome human sequences are going to be flying out the door with trivial rapidity, and the empirical error rate will be known to high precision in months probably, rather than years. Whole-genome sequencing of chimp families won't be far behind. Estimates based on the biochemical properties of the polymerases will be rendered irrelevant at that time.

  3. Not to pick nits, but shouldn't that be "2.3 x 10^-8 per base pair per generation?"

  4. gatzal says,

    Not to pick nits, but shouldn't that be "2.3 x 10^-8 per base pair per generation?"

    I fixed it. Is it clear now?

  5. Whole-genome human sequences are going to be flying out the door with trivial rapidity, and the empirical error rate will be known to high precision in months probably

    I'm not so convinced. I remember seeing a presentation by the head of an Illumina re-sequencing effort. They needed 32x coverage of the human genome to detect 98% of the known heterozygous sites.

  6. History has demonstrated repeatedly that it's foolish to bet against technology this way when the only barrier left to overcome is "faster and cheaper".

  7. "Faster and cheaper" is not the only barrier. Miscall rates are very high in the 4 current next-gen platforms. In the case of accurately finding all SNPs in a genome, this can still be problematic, as in the example I mentioned.

    Hopefully Next-Gen version 2.0 will be less error-prone ;-)
