Thursday, July 25, 2013

Every non-lethal genome position is variable in the human population

Melissa Wilson Sayres blogs at mathbionerd and Panda's Thumb. A recent post on Panda's Thumb addresses a tweet from Daniel Wegmann in which he said "Every non-lethal genome position is variable in the human population."

She asks "Is this true?" and proceeds to show that it is [How many mutations?]. She assumes that the human mutation rate is 1.2 × 10^-8 per site per generation. Multiply this by the 7.16 billion people on the planet and you get an average of 86 new mutations at every single base pair in the human genome [1].
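
The arithmetic is easy to check. Here is a minimal Python sketch that uses only the two figures quoted above. The Poisson step at the end is an added assumption, not part of the original post: it treats new mutations at a site as rare, independent events in order to estimate how likely it is that a given site receives no new mutation at all in one generation. The variable names are illustrative.

```python
import math

# Figures quoted in the post
MUTATION_RATE = 1.2e-8   # new mutations per site per generation
POPULATION = 7.16e9      # people alive today

# Expected number of new mutations at any single site,
# summed over everyone in the current generation
expected_per_site = MUTATION_RATE * POPULATION

# Assumption (not in the post): mutations at a site are rare, independent
# events, so the count is roughly Poisson and P(zero) = exp(-mean)
p_site_untouched = math.exp(-expected_per_site)

print(f"Expected new mutations per site: {expected_per_site:.0f}")
print(f"P(no new mutation at a site):    {p_site_untouched:.1e}")
```

Under these assumptions the expected count comes out at about 86, and the chance that a particular site escapes mutation entirely in one generation is vanishingly small (on the order of 10^-38).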

Many of these mutations will be deleterious and they will be quickly eliminated from the population if they are lethal or cause severe problems. Some moderately and slightly deleterious mutations will be present in the population because they haven't yet been eliminated by negative selection. (Some will have no effect if they are present in only one copy of your diploid genome.)

To a first approximation, the statement is pretty accurate. If it's true that most of our genome is junk, then the nucleotide sequence is not important [2]. As we sequence more and more genomes we should see heterogeneity at 90% of the base pairs in the genome. We haven't reached this sort of coverage but all available evidence is consistent with the idea that most positions can be variable.


1. I prefer a larger mutation rate of 100 new mutations per generation, for a total of 112 mutations at every site.

2. This doesn't rule out functions that are not sequence-specific. Such functions are known to exist but there are no reasonable hypotheses that justify such functions for most of the genome.
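
The figure in footnote 1 can be reproduced with a similar sketch. Two things here are assumptions rather than statements from the post: that "100 new mutations per generation" means 100 per person, and that the genome size is 6.4 billion base pairs (diploid), a value chosen because it reproduces the 112 quoted in the footnote.

```python
# Footnote 1, read as 100 new mutations per person per generation (assumption)
NEW_MUTATIONS_PER_PERSON = 100
POPULATION = 7.16e9            # figure quoted in the post
DIPLOID_GENOME_SIZE = 6.4e9    # base pairs; assumed, not stated in the post

# Total new mutations in the current generation
total_new_mutations = NEW_MUTATIONS_PER_PERSON * POPULATION

# Spread evenly over every site in the genome
mutations_per_site = total_new_mutations / DIPLOID_GENOME_SIZE

print(f"Total new mutations this generation: {total_new_mutations:.2e}")
print(f"Average new mutations per site:      {mutations_per_site:.0f}")
```

With a slightly different genome size the per-site average shifts a little, but it stays above 100 for any reasonable value.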

13 comments:

  1. I suppose that in a genome of 6 billion bases "many" can still be a very small percentage, but "many of those mutations will be deleterious" still sounds a bit odd.

    1. Good point. There are about 6 × 10^11 new mutations every generation and only 1%, at most, are deleterious.

      That's 6 billion deleterious mutations per generation. I probably should have said "very few." :-)

  2. The calculation Sayres made was only of mutations that occurred in the present generation. If you consider those that occurred in previous generations and were passed on and are segregating in our population, the number is of course higher.

    It gets complicated, because it involves not only the coalescent tree of ancestry of the copies in the present generation, it also involves the way the coalescent process breaks down as an approximation when the sample taken is the whole population.

    1. Yeah, I know. It doesn't change the conclusion that every non-lethal site is mutated, but it underestimates the number of segregating mutations in the population.

      Do you know how to calculate the real number? It's tricky because you don't know the effective population sizes of semi-isolated human populations. I'm sure that if anyone can do it, it's you. Take your time and get back to us in a few months! :-)

    2. You're of course offering to fund my effort ...

    3. I'll chip in $100. That should pretty much cover it, eh?

    4. I'll put in $5 for chalk, that's about all you need right?

    5. Berlinski spanked your a.s big time, so if I were you, I would take some math courses this summer or I would go to the atheistic confession. I'm afraid however, that people like you lie to themselves so much, they can no longer distinguish what it is; what is the truth and a lie.

    6. I'll chip in a reference that might provide some of the answer: “Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants.” (http://www.ncbi.nlm.nih.gov/pubmed/2320168)

  3. Thanks for reposting.

    I look forward to Joe's calculation.

    I just like how cleanly this very simple calculation shows that we expect, on average, every site to vary in a hundred or more individuals across the whole human population.

    We can take into account mutation rate variation across the chromosomes, context-dependent substitutions, inherited mutations from previous generations, and a host of other complications. This is why I will, hopefully, be employed for a few more years. :)

  4. ??

    If the rate per site is 1.2*10^-8, the probability for any site to remain unmutated even in 7.16 billion trials is (1-(1.2*10^-8))^7.16 billion. Using my cutting-edge computational software (Excel) I get that 1 in 390 such sites remains unmutated on average.

    When the population was 6 billion, it was 1 in 148 - the same as the chance of being an unmutated individual. (Disclaimer: IANAM).

  5. For mutations to specific bases divide N by 3? Still 33 is rather impressive.
