Friday, September 23, 2016

A philosopher's view of random genetic drift

Random genetic drift is a process that alters allele frequencies within a population. The change is due to "random" events. It differs from natural selection, where the change is due to selection for alleles that confer a reproductive advantage on the individuals carrying them. Here's one description:

If a population is finite in size (as all populations are) and if a given pair of parents have only a small number of offspring, then even in the absence of all selective forces, the frequency of a gene will not be exactly reproduced in the next generation because of sampling error. If in a population of 1000 individuals the frequency of "a" is 0.5 in one generation, then it may by chance be 0.493 or 0.505 in the next generation because of the chance production of a few more or less progeny of each genotype. In the second generation, there is another sampling error based on the new gene frequency, so the frequency of "a" may go from 0.505 to 0.501 or back to 0.498. This process of random fluctuation continues generation after generation, with no force pushing the frequency back to its initial state because the population has no "genetic memory" of its state many generations ago. Each generation is an independent event. The final result of this random change in allele frequency is that the population eventually drifts to p=1 or p=0. After this point, no further change is possible; the population has become homozygous. A different population, isolated from the first, also undergoes this random genetic drift, but it may become homozygous for allele "A", whereas the first population has become homozygous for allele "a". As time goes on, isolated populations diverge from each other, each losing heterozygosity. The variation originally present within populations now appears as variation between populations.

Suzuki, D.T., Griffiths, A.J.F., Miller, J.H. and Lewontin, R.C.
in An Introduction to Genetic Analysis 4th ed. W.H. Freeman (1989 p.704)

When you look at the number of alleles that have become fixed in a population, it seems clear that random genetic drift is the dominant mechanism of evolution. However, it does not cause adaptation and many biologists think that adaptation is the only important mechanism of evolution.
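
For readers who want to watch this happen, here is a minimal Python sketch of the sampling process described in the textbook passage above. It assumes a neutral Wright-Fisher model in which all 2N gene copies of a diploid population are resampled every generation; the function name `drift_trajectory` and its parameters are illustrative choices of mine, not anything taken from the textbook.

```python
import random


def drift_trajectory(n_individuals, p0=0.5, max_generations=100_000, seed=None):
    """Follow the frequency of one allele under neutral drift (assumed Wright-Fisher model)."""
    rng = random.Random(seed)
    copies = 2 * n_individuals  # diploid: two gene copies per individual
    p = p0
    trajectory = [p]
    for _ in range(max_generations):
        if p in (0.0, 1.0):
            break  # allele lost or fixed; no further change is possible
        # Each gene copy in the next generation is drawn independently with
        # probability p. This is the "sampling error" step; no selection is applied.
        count = sum(rng.random() < p for _ in range(copies))
        p = count / copies
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    # Different seeds stand in for different isolated populations.
    for seed in range(5):
        traj = drift_trajectory(50, seed=seed)
        print(f"population {seed}: {len(traj) - 1} generations, final p = {traj[-1]}")
```

Different seeds play the role of the isolated populations in the quote: in practice each run ends at p=0 or p=1, and which one it reaches is a matter of chance. Small populations get there quickly; with 1000 individuals it typically takes thousands of generations.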

It seems that philosophers have recently become interested in drift. As far as I can see, most philosophers who write about evolution are completely ignorant of random genetic drift. It's refreshing to see that there's an entry on drift in the Stanford Encyclopedia of Philosophy [Genetic Drift].

The article is very philosophical and difficult to read. Here's an excerpt.
As will be discussed further below, much of the twentieth century was marked by debates among biologists about the relative importance of drift and selection in evolution. Were those debates at least in part the result of conceptual unclarity? Millstein (2002) argues that we need not accept this inadvertent consequence of Beatty’s argument, and that selection can, in fact, be distinguished from drift. In order to do this, three extensions should be made to Beatty’s account. First, similar to Hodge (1987), Millstein suggests that a proper distinction between drift and selection relies on causation, specifically, that drift processes are indiscriminate sampling processes in which any heritable physical differences between entities (organisms, gametes, etc.) are causally irrelevant to differences in reproductive success, whereas natural selection processes are discriminate sampling processes in which any heritable physical differences between entities (organisms, gametes, etc.) are causally relevant to differences in reproductive success. These more precise characterizations of “indiscriminate sampling” and “discriminate sampling” are intended to replace the metaphorical “sampling” talk, retaining the term “sampling” as a useful shorthand only. Second, we should be careful to distinguish the process of drift from the outcomes that drift produces, and the process of selection from the outcomes that selection produces. (Of course, the importance of distinguishing process from outcome is not a novel insight; what is novel here is its application to the problem of distinguishing drift from selection. The distinction has sometimes been rendered as “process vs. product” rather than “process vs. outcome” in the philosophical literature, but given the teleological and other misleading connotations of “product”, the term “outcome” is preferable and “product” should be avoided). 
Third, we should characterize drift and selection as processes rather than outcomes (as in the first of the three points). If we do these three things, then drift and selection are conceptually distinct and the problem Beatty raises is dissolved; discriminate sampling processes where unlikely outcomes obtain are still selection processes. On this view, it is further acknowledged that it is possible for drift and selection to produce the same outcomes, which helps explain the persistence of biologists’ debates over the relative importance of drift and selection without making them seem trivial (see Millstein 2002 for additional discussion of Beatty’s arguments).
After reading this lengthy article, it's not clear to me what philosophers can contribute to our understanding of random genetic drift.


  1. Hi Larry,

    Although I acknowledge that random genetic drift is an important mechanism in evolution, it just seems to me that you uncritically keep giving it too much importance.

    For instance, see Tenaillon et al. (2016) for major recent support for the selectionist view based on massive genome data. Specifically, they state at the end:

    "Our experimental results thus support a selectionist view of molecular evolution, complementing indirect evidence based on comparative genomics in bacteria, Drosophila and humans [45–47]." Tenaillon, O., J. E. Barrick, N. Ribeck, D. E. Deatherage, J. L. Blanchard, A. Dasgupta, G. C. Wu, S. Wielgoss, S. Cruveiller, C. Médigue, D. Schneider, R. E. Lenski. 2016. Tempo and mode of genome evolution in a 50,000-generation experiment. Nature 536:165-170.

    1. Sergio,

      Larry puts too much emphasis on RGD because the common Darwinian approach to evolutionary theory is shit. Larry senses it. This doesn't mean that Larry can prove that RGD is a minor or major part of evolutionary theory. Larry just extends it into the new realms of doubt and not proof. It's smart until someone decodes it...I can and I will...

  2. They can't. If they could, they'd be biologists.

    1. They can't. If they could, they'd be biologists

      What makes a computer scientist any different from a philosopher?

      Does criticizing ID make one an authority on the theme without mathematically proving that say...evolution is possible in time available?

    2. "..mathematically proving that say...evolution is possible in time available?"

      That's already been done.

    3. What makes a Dunning-Krugerite blog troll an expert on evolution or ID?

    4. Mikkel,

      Almost anything is possible if you preset computers with Darwinian bias. Just because you want it to be true doesn't necessarily make it so...

    5. Manning

      I can make a pretty good argument that Darwinian anti ID propaganda is bs.
      I can also make more than one pretty good argument that for evolution to fit a scientific theory there are just too many dead ends on the way. I like speculations but scientific theory has to have some evidence and not faith based bs.

    6. "Almost anything is possible if you preset computers with Darwinian bias. "

      That's cute. First you ask for proof, then when you get what you asked for you claim it's due to "Darwinian bias".

      Prove that there is such a flaw in their research.

    7. Cruglers says:
      "I can make a pretty good argument that Darwinian anti ID propaganda is bs. "
      Well, go ahead. But 'pretty good' ain't gonna hack it, you'll need convincing evidence.

    8. Cruglers says:
      "Almost anything is possible if you preset computers with Darwinian bias."

      How very interesting that you say this, because you are right. If you let computers use biological, aka Darwinian, selection rules, the computer all of a sudden can solve stuff human engineers can't. It's called an evolutionary algorithm.

  3. I've been reading Stephen Gould's The Structure of Evolutionary Theory and am narrowing down to the last two chapters on PE. I don't know if he says it's settled that random genetic drift is the main thing in evolution.
    It certainly would echo his insistence that Darwin was wrong about gradualism. It's surprising to me how much that is rejected, since it seems to be the popular explanation for evolution. Poor Darwin.

    Biologists are a different species from evolutionary biologists.
    A biologist studies blood and guts in the here and now and need not agree or care about the origin of same.
    Only a small number of biologists become evolutionary biologists and/or get paid for it.
    Creationism is not facing that much of an army.

  4. I read about half of that entry! At that point, I thought even if the distinctions being made were important, I could not care.

  5. As far as I can see, Beatty makes a correct distinction, and it is the one that population geneticists have in mind when they describe effects as due to genetic drift or not. The prose in the Encyclopedia entry makes somewhat "heavy weather" of this. I don't think that Beatty's formulation is new to population geneticists. These are very much the same issues that arise when a physicist describes change in a distribution of atoms suspended in a lighter liquid as due to Brownian motion as opposed to due to gravity.

  6. Joe, Larry, I have to admit, I had no idea philosophers (or at least one philosopher) were so confused about genetic drift. I'm not sure that this means that philosophy (if not "philosophers") has nothing to contribute, however. A clear (and concise) description of cause and effect in this context, and the relationships among models, mathematical/statistical calculation of consequences, and biological understanding, is relevant. Biologists do tend to give these issues short shrift sometimes because they are usually focused on the important details, but I find it hard to believe that such a thing doesn't already exist. I don't think 'philosophical' has to mean 'difficult to read' (correlation is not causation).

    1. David -- I'm a little puzzled by why this wasn't already settled by considering concepts from physics and physical chemistry where we have thermal noise making molecules jump around in potential wells. I would have thought the issues were basically the same. Maybe the philosophers of physics and chemistry haven't talked to the philosophers of biology.

    2. Some of the distinctions made in this article are worthwhile, and require some thought. You can use the same math to describe both selection and drift, hence, mathematically they can both produce similar outcomes. Thus what is needed to distinguish one from the other is the idea of causation, which is not something that is always worked into the mathematics.

      But if the article above is referring to John Beatty, he's a very sharp philosopher who certainly understands the basic textbook characterization of drift and selection.

    3. @Rich: A unified body of math covers both selection and drift, but I would think that the outcomes of the two differ at least probabilistically. I am confused by the implication of your comment that they cannot be distinguished even probabilistically. If we have a population of 1,000,000 fruit flies and see change of an allele frequency from 0.5 to 0.55 and then to 0.60 in three generations, that's very unlikely to be the result of drift -- it could be, but it would be a very unlikely outcome.

    4. @Joe Felsenstein

      Well, I remember a discussion on this blog where it was argued that natural selection does not necessarily produce a change in allele frequency, for example without any heritable variation present. I think a distinction between process and outcome is something biologists tend to be very sloppy about indeed.

    5. AND I also remember Simon Gunkel arguing on this blog that selection and drift cannot be disentangled as a process either. e.g. here

    6. @Corneel: Simon Gunkel's argument is that particular changes cannot be dissected into selection parts and parts due to random genetic drift. The distribution of outcomes is affected by both selection and drift, and if we remove the selection part, that distribution will change. So there is then statistical information enabling us to infer whether or not the selection part is present. How much information depends on the exact situation. But it is not true that there is no information enabling us to infer whether selection is present. I doubt that Simon Gunkel would disagree with me on this.

    7. @ Joe Felsenstein
      Yes, I think he would agree that selection can be detected but I believe he took issue with viewing selection and drift as conceptually different processes.

    8. @Joe...I see what you're saying, and would generally agree, but (I think!) there is nothing in the actual math that includes causation. Instead, in your case, we see a pattern and causally assign this result to selection, not drift. In my comment I was thinking of the case where we conceptualize selection as a non-zero covariance between fitness and phenotype (as in the Price equation). But such a covariance could also occur just via drift, due to random reproduction. Hence, when using the Price equation, we would need to decide if a non-zero covariance was causally due to differential fitness or non-causally due to random reproduction. The causation is assigned after the fact and not embodied in the actual math (even though the fitness terms are included in the math).

      I guess I am convinced by Sean Rice's argument that he made in Chapter 6 of his book. [And I'm also certainly biased since he was one of my teachers in grad school].
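
To put a rough number on the fruit fly example earlier in this thread, here is a back-of-the-envelope Python sketch. It assumes a neutral Wright-Fisher model for a diploid population whose effective size equals the census size of 1,000,000, so that one generation of drift has variance p(1-p)/(2N). The helper name `drift_z_score` is hypothetical, and the normal-style "standard deviations" reading is only an approximation.

```python
import math


def drift_z_score(p_before, p_after, n_individuals):
    """Size of a one-generation change, in units of the drift standard deviation.

    Assumes neutral Wright-Fisher sampling of 2N gene copies (diploid),
    with effective population size equal to n_individuals.
    """
    copies = 2 * n_individuals
    drift_sd = math.sqrt(p_before * (1.0 - p_before) / copies)
    return (p_after - p_before) / drift_sd


if __name__ == "__main__":
    n = 1_000_000
    for before, after in [(0.50, 0.55), (0.55, 0.60)]:
        z = drift_z_score(before, after, n)
        print(f"{before} -> {after}: about {z:.0f} drift standard deviations")
```

Under these assumptions each step is more than a hundred standard deviations away from what drift alone would be expected to produce, which is the probabilistic sense in which the two kinds of outcome can be told apart. In a population of a few dozen flies, the same change would be unremarkable.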

  7. Is it not the case that mutation is always random, that RGD is normal and accounts for all variation in the genotype? Evolution, in any useful, tangible or practical sense implies phenotypic change which alone is the subject and thus in turn the consequence of selection. The rest is, well, just philosophy. But what would I know?

  8. @rich and @Corneel: Consider the same case, with individuals of the same genotype having the same fitness but different genotypes having different fitnesses, and now make one change: the population is made infinite. Then there is no genetic drift, but the genotype frequencies do change. So selection is still present, but genetic drift is not present. That makes them conceptually distinct. And the gene frequency change observed in that case is causally related to the differences in fitness between the genotypes.

    1. I certainly agree in the case of infinite population size. I'm more thinking of something we measured out in nature, like a selection gradient, then having to make the call as to whether the pattern is due to random reproduction or fitness differences.

      And to be clear, I'm not even trying to be argumentative or dogmatic, just pointing out that *sometimes* the pattern produced by selection versus drift is difficult to disentangle.

    2. I'm moving some things I had typed above here, hoping it doesn't get too confusing.

      @Rich Lawler: But such a covariance could also occur just via drift, due to random reproduction.
      Not if fitness is (as it should be) defined as the expected number of offspring (or half the expected number of offspring in the case of sexual reproduction).

      To (hopefully) clarify my position, there are 3 arguments to not treat selection and drift as fundamentally different. One of them is historic, one is didactic and the final one is based on the notion of parsimony.

      The historic argument is that in their joint presentation both Darwin and Wallace define selection as a sampling process, noting that fitness is given by an expected number as above and that in finite populations sampling error has relevant effects. This is also the view presented in the Origin. The earlier publications in the modern synthesis also did not draw a distinction, but it is worth noting that selection was always understood first as the resampling of a finite population. As simpler infinite population models were produced, the term drift was introduced as a descriptor for the error of the simplified models. Only later - starting AFAIK with Dobzhansky - was drift reified. This is an amathematical view of mathematical models. If you look at the Fisher-Wright model, you have a rather elegant description of evolution. If one were to write it in a form that contained drift and selection as additive components, it would lose most of that elegance.

      The didactic argument is that the idea that selection and drift are conceptually different encourages students to think of them as additive effects. If we introduce them to these with the special cases N->inf and s=0 they will tend to superimpose them for other cases. It's worth noting that the s=0 case leads to additional confusion as students (but not only students!) confuse drift and neutrality. Reading the Stanford entry, that appears to be precisely what gives philosophers pause, with some arguing that drift is to be defined as strict neutrality and others arguing that it occurs when s!=0 as well.

      The final argument is parsimony. As I've already noted, we do have very elegant descriptions of evolution in the form of models that are drift+selection. From Fisher-Wright, through Moran to Kimura. All of these get less elegant if one is to split them into drift and selection slices. And it's also worth noting that we do not do this for other things. We could for instance easily split selection into an allelic selection component and a dominance-related error term. Another way to put this is the following example. Let's say we have a population of 5 cats (Alvin, Bertha, Clodwig, Dustin and Erwin). There is a probability distribution for which of these 5 cats is the next to die, say p(A)=.06, p(B)=.12, p(C)=.37, p(D)=.41 and p(E)=.04. On the death of a cat we get this:
      Through selection .06 of Alvin, .12 of Bertha, .37 of Clodwig, .41 of Dustin and .04 of Erwin died.
      Through drift .96 of Erwin, -.06 of Alvin, -.12 of Bertha, -.37 of Clodwig and -.41 of Dustin died.
      We could also just say "Erwin is dead". Now, I like maths. I like them a lot. But anybody who prefers a description that includes the death of negative real values of cats to one that says that from time to time a cat dies should at least have to have a very strong argument on how this is helpful.

    3. @ Joe Felsenstein

      Since Simon himself joined in, I will leave it to him to clarify his position. He is much better at it.

      To make my position clear: In the OP Larry was somewhat dismissive of the contribution of philosophers to our understanding of drift and selection, and I got the impression that you supported his view. I just wanted to point out that there have been quite basic discussions about selection and drift here on Sandwalk that could have benefited from the distinctions made in the article.

    4. @Corneel: I think that the philosopher made somewhat heavy weather of the distinctions, but was basically correct.

      I do not think, as @Simon seems to, that making a distinction between selection and drift means that they have to act additively, and that we have to be able to calculate for any gene frequency change how much of it was due to the one and how much due to the other. Actually the Kolmogorov Forward Equation and the Kolmogorov Backward Equation for gene frequencies have two terms on their right-hand side. The first has all the deterministic forces in it, the second the genetic drift. As the population size rises, the second term is reduced toward zero. The forces of drift and selection can be distinguished, in the sense that it is not a hopeless task to detect selection.

    5. @ Joe Felsenstein

      I fully agree, but I am also sympathetic to Simon's argument that separating selection from genetic drift is like separating the bias from a biased coin toss. You can do it, for example by assuming infinite populations, but it does feel artificial and it doesn't really occur anywhere in nature.

      Now if you could show that the coin you are using has a centre of gravity that is off-centre, then you know what is causing your bias. But this is something you never will learn from your model (I believe this was Rich's point). So nothing is gained from treating selection and drift as separate "forces".

    6. @Joe: I do not think that they would have to act additively in the sense that you could superimpose the neutral case and an infinite population model. But I argue that by making the distinction we give students the impression that they do. I'm pretty sure if you polled professional biologists, even evolutionary biologists, if their research was not primarily concerned with mathematical modeling, a majority would get that wrong. As I said above, this is a didactic point - thinking of drift and selection as separate processes or forces is more likely to hinder somebody's understanding than to help it.

      Now, regarding the KFE and KBE I think the first thing to note is that the drift term explicitly depends on s. Sure it goes to 0 as N->inf, but the effect of differential fitness which s records, bleeds into your drift term in all other cases. The same holds in other models - your drift term always depends on s. It's worth noting that in each case we are dealing with random processes and thus have sequences of random variables. We can of course (and it's easy to exclude RVs for which this doesn't work from consideration, since gene frequencies are not negative and have an upper bound at 1) rewrite these RVs as X=(X-E(X))+E(X). But that alone is no reason to call E(X) and (X-E(X)) different processes. We could say that a fair die with 6 sides has a probability of 1/6 for each number in {1,2,3,4,5,6}. We could also claim that a fair die is subject to two processes, one which always gives 3.5 and one which produces a number from {-2.5,-1.5,-.5,.5,1.5,2.5} with probability 1/6 each. Does this aid our understanding of die rolling?

    7. Simon - Sorry about the elementary level of the question, but I'm a layperson.

      - Do you think it is helpful conceptually to distinguish sampling of the population due to a less than infinite effective population size from sampling due to factors that could be lumped under the heading of "selection" (e.g., predation and avoidance of it, access to food sources, etc.)?

    8. I don't think that's helpful. I mean, we can distinguish between the neutral case, where s~0 and the case when s is significantly different from 0. But there simply isn't a way to remove the effect of selection from some other type of sampling. The normal way of doing this is just what Joe discusses above, it splits a random variable into an expected value and a centered term (in the diffusion approximation that centered term then undergoes a CLT approximation, which you can't perform on a non-centered RV, so there is a reason for splitting it up as you go from something like Fisher-Wright to diffusion, but I don't think we should read that much into a mathematical operation).
      I think that pretty much any discussion that contrasts selection and drift would be much clearer if we were talking about the magnitude of s instead.
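
The split into an expected value and a centered term that this thread keeps returning to can be written out for one generation of a deliberately simple model. The Python sketch below assumes a haploid Wright-Fisher population in which the focal allele has relative fitness 1+s and the alternative has fitness 1; it is not the Kolmogorov equations themselves, and the function names are illustrative only.

```python
def expected_next_frequency(p, s):
    """Deterministic part: expected frequency after selection (assumed haploid model)."""
    mean_fitness = p * (1.0 + s) + (1.0 - p)
    return p * (1.0 + s) / mean_fitness


def sampling_variance(p, s, n_copies):
    """Variance of the centered part: binomial sampling around the expected frequency."""
    p_expected = expected_next_frequency(p, s)
    return p_expected * (1.0 - p_expected) / n_copies


if __name__ == "__main__":
    p = 0.5
    for s in (0.0, 0.1, 1.0):
        for n_copies in (100, 10_000, 1_000_000):
            mean = expected_next_frequency(p, s)
            var = sampling_variance(p, s, n_copies)
            print(f"s={s}, copies={n_copies}: E[p']={mean:.4f}, Var[p']={var:.2e}")
```

In this toy model both points made above are visible at once. The variance of the centered term shrinks toward zero as the number of gene copies grows, so the two pieces can be told apart in the limit; but for any finite population that variance is computed from the post-selection frequency, so it also depends on s.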

  9. Larry wrote: "However, it does not cause adaptation and many biologists think that adaptation is the only important mechanism of evolution."

    I suspect that he meant to write something like "many biologists think that adaptation is the only important *consequence* of evolution."

  10. Off topic: "Canadian medical journals hijacked for junk science"

    This should interest you, Larry, and maybe a few more. There is going to be more "junk science" out there, as if there wasn't enough already...