Friday, December 19, 2014

How to think about evolution

New Scientist published a short article, "How to think about... Evolution." It was written by Michael Le Page, who contacted me a few months ago.

I think it's better than most such articles but I may be a little biased. Here's an excerpt.
What's more surprising is that even mutations that don't increase fitness can spread through a population as a result of random genetic drift. And most mutations have little, if any, effect on fitness. They may not affect an animal's body or behaviour at all, or do so in an insignificant way such as slightly altering the shape of the face. In fact, the vast majority of genetic changes in populations – and perhaps many of the physical ones, too – may be due to drift rather than natural selection. "Do not assume that something is an adaptation until you have evidence," says biologist Larry Moran at the University of Toronto, Canada.

So it is wrong to think of evolution only in terms of natural selection; change due to genetic drift counts too. Moran's minimal definition does not specify any particular cause: "Evolution is a process that results in heritable changes in a population spread over many generations."


102 comments :

Joe Felsenstein said...

Hoo boy ... here we go again! I've said this a million times here but I will say it again:

How small does a selection coefficient have to be to have so little effect (in the face of genetic drift) that it has essentially no effect on the probability of the mutant getting fixed? Note that this is a different question from how small its effect has to be so that we can scarcely detect it in the lab.

In a population of size (say) 1,000,000 individuals, the selection coefficient must be smaller than 1/4,000,000. In a population of effective size N, it must be less than 1/(4N). (That is, smaller in absolute value of s).

That's an awful lot smaller than the value of the selection coefficient which the researcher can barely detect. That value would perhaps be somewhere between 0.01 and 0.001.

And no, 1 million individuals is not such a big number for many species. If you've ever been chased by a cloud of mosquitoes, it won't seem too exaggerated.

So just because you can't detect any difference does not mean that nature can't. Nature can do a much bigger experiment, for much longer, than we can.

I will be happy to show my calculations here if needed.

Konrad said...

I'm not sure what point you're trying to make. Certainly there are a great many mutations that get fixed in the presence of a real selective effect that is too small to detect. Are you saying that, among these cases, those with a selective effect favouring change dominate those with a selective effect opposing change? Are you saying that these cases (or at least those with a small effect favouring change) should be ascribed only to positive selection and not (at least partly) to genetic drift?

Robert Byers said...

What is the PROCESS??
Its so aggressive here that drift is mostly important, and selection on that, and mutations unimportant.
this is commonly said in evo bio these days?
Without the mutation nothing happens in the complicated biological systems.
if everything came from drift or nothing oNE would never know.
How could you tell?
Evolutionism must live or die in the conviction of mutationism.
endless chances for selection to favor population success in nature.
This drift stuff seems a retreat.
is there any evidence drift did produce the gene change in biology?

Joe Felsenstein said...

Mr. Expert on Evolution:

It's called "random genetic drift". It's been studied since the 1920s. It is random changes of gene frequencies owing to (1) random deaths, (2) random births, and (3) random segregation of genes in Mendelian genetics. Ever heard of it? (I guess not).
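For readers who want to see drift in action, here is a minimal sketch, assuming the idealized Wright-Fisher sampling model (a standard textbook simplification that lumps random births, deaths and segregation into one random draw per gene copy; the comment above does not name a model). The function name and parameters are illustrative only.

```python
import random

def wright_fisher_neutral(n_copies, start_copies, seed=1):
    """Follow one neutral allele until it is lost (0.0) or fixed (1.0).

    Each generation, every one of the n_copies gene copies draws its
    parent copy at random from the previous generation. There is no
    selection: frequency changes come from sampling alone.
    """
    rng = random.Random(seed)
    count = start_copies
    trajectory = [count / n_copies]
    while 0 < count < n_copies:
        freq = count / n_copies
        # Binomial sampling of the next generation, one gene copy at a time.
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        trajectory.append(count / n_copies)
    return trajectory

traj = wright_fisher_neutral(n_copies=100, start_copies=50)
print(f"absorbed after {len(traj) - 1} generations at frequency {traj[-1]}")
```

Changing the seed changes both the time to absorption and whether the allele is lost or fixed, which is the point: with no fitness difference at all, frequencies still wander until one allele takes over.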

Please go read an evolutionary biology textbook -- any of them.

Joe Felsenstein said...

Michael Le Page is quoted above as saying "And most mutations have little, if any, effect on fitness. They may not affect an animal's body or behaviour at all, or do so in an insignificant way such as slightly altering the shape of the face."

Insignificant to who? Our casual observation? An experimenter's ability to detect fitness differences? Or the effect on fixation probability in nature, over a great many generations?

You say "too small to detect". By you? By the experimenter? Or by nature?

The point I am making is that natural selection in a population can have a substantial effect on an allele's probability of fixation when phenotypic differences may seem to us (or to the experimenter) "insignificant". Loose talk about "insignificant" or "too small to detect" is useless here.

Genetic drift is present always. But the question is, is selection biasing the probability of fixation in substantially?

Here is a simple table for a case with N = 100,000, an initial gene frequency of 0.10, and various selection coefficients. It shows the probability of fixation. With no natural selection (s = 0) that should be 0.10:

s             Prob(Fixation)

0.000000      0.100000
0.000001      0.118935
0.000002      0.139618
0.000005      0.209641
0.00001       0.335831
0.00002       0.550856
0.00005       0.864665
0.0001        0.981684
0.0002        0.999665
0.0003        0.999994
> 0.0004      1.0

Note that fixation probability is roughly doubled by natural selection with s as small as 0.00005. In other words, a fitness that is changed by 5/1000 of 1%. "Insignificant"? It is having a substantial effect.
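A short sketch of how a table like this can be reproduced, assuming the standard form of Kimura's diffusion approximation for a diploid population with genic selection, u(p) = (1 - exp(-4Nsp)) / (1 - exp(-4Ns)). The function name is hypothetical, and the formula is the textbook one rather than anything spelled out in the comment itself.

```python
import math

def fixation_probability(s, N, p):
    """Probability that an allele at initial frequency p is eventually fixed.

    Assumes Kimura's diffusion approximation: diploid population of
    effective size N, genic selection with coefficient s.
    """
    if s == 0:
        return p  # neutral limit: fixation probability equals the frequency
    # expm1(x) = e^x - 1 keeps precision when 4Ns is very small.
    return math.expm1(-4 * N * s * p) / math.expm1(-4 * N * s)

N, p = 100_000, 0.10
for s in (0.0, 1e-6, 2e-6, 5e-6, 1e-5, 2e-5, 5e-5, 1e-4, 2e-4, 3e-4):
    print(f"{s:<10.6f} {fixation_probability(s, N, p):.6f}")
```

Under these assumptions the printed values should agree with the table to the digits shown. The same function accepts negative s for alleles that selection opposes.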

Joe Felsenstein said...

typo in the above: should read "biasing the probability of fixation substantially"

Joe Felsenstein said...

Another typo: "with s as small as 0.000005. In other words, a fitness that is changed by 5/10000 of 1%".

Athel Cornish-Bowden said...

These are very impressive numbers, Joe. I don't think I was conscious before that such small selection coefficients had so much effect on the probability of fixation. Have you published the calculations somewhere?

Larry Moran said...

@Joe

Your numbers are interesting but they are based on a panmictic population of 100,000 where the frequency of the allele has already reached 10%. Right?

For new mutations, the probability of fixation can be approximated by 2s. Right? For an allele with s = 0.001, it will be lost 99.99% of the time before it is fixed.

If you take 10,000 new mutations with such low selection coefficients then only one of them (on average) will become fixed in the population. Have I got that approximately right?

Do you think that's sufficiently different from chance to make a fuss?

S Johnson said...

Looking at the table I don't see any estimates of the number of generations for fixation to complete. I understand the selection coefficient is assumed to be constant. Skipping over any density dependence on the size of s, the smaller the s, the more generations are needed to fix the trait, which raises the question: is it reasonable to assume a constant s?

Nor do I see comparisons of projections for different population sizes or gene frequencies. The table is specified to be for N=100 000 with a gene frequency of 0.10. In thinking about the power of natural selection, at least as exemplified by small s, it's not at all clear what a small s means for a novel trait, when the gene frequency equals 1/N. The table presented appears to apply to a population moving into a new habitat.

Lastly, there aren't any comparative figures for random genetic drift. (Maybe you would call this the null hypothesis, although that seems to be out of fashion nowadays.)

Perhaps these seem like stupid quibbles. But I hope at least they give some insight as to how anyone could be so foolish as to be unswayed by your figures, yet without questioning your expertise in population genetics.

Piotr Gąsiorowski said...

Shouldn't one also take into the expected time to fixation (expressed as the number of generations)? If the probability of fixation is close to 1 but the average number of generations required is roughly proportional to N_e, then for really large effective populations the time to fixation will be longer than any realistic period of environmental stasis (and constant s), unless we are talking about organisms with extremely short generation periods and extremely large effective populations.

Piotr Gąsiorowski said...

Oops, S Johnson beat me to it :)

S Johnson said...

Oops, Larry Moran beat me to the point in my second paragraph, except much more incisively.

Piotr Gąsiorowski said...

Shouldn't one also take into...

... account... (unaccountably missing)

William Spearshake said...

Correct me if I am wrong but it appears to me that Larry and Joe are just approaching the same "truth" (and not in the IDist misuse of that term) from different directions. Larry assumes that a mutation, gene, trait, that is fixed in a population is the result of random drift unless it can be demonstrated otherwise. Joe starts from the premise that the same fixed trait is the result of selection unless it can be otherwise demonstrated.

Although they have different starting points, they will ultimately come to the same conclusion. In any type of research, there are many approaches that can be used to attain the same goal.

S Johnson said...

PS Joe Felsenstein's first post, especially the second paragraph, approaches the question of the probability of random drift fixing a trait from a different angle. He gives a figure of 1/4N for the minimal size of a selection coefficient that has no effect. But here there is nothing that addresses either Larry's or Piotr's points. So I can't see that this serves as a comparative estimate for the probability of fixation by genetic drift. But, as near as I can tell, in a founder population, 1/4N might be a rather large figure!

Unknown said...

They aren't. Like everybody else Joe contrasts fixation probabilities under selection to the null hypothesis of neutrality. Larry is simply less informed about how one does that and arguably has some misconceptions about what selection is...

AllanMiller said...

S Johnson: "it's not at all clear what a small s means for a novel trait, when the gene frequency equals 1/N."

It means the same whatever the current frequency (assuming constant s). Provided s is above the 'effectively neutral bound' for a given population size, and constant, it will fix with greater probability than another allele that is below it, and be an adaptation. That differential increases exponentially with linear increases in s, although obviously it can always be stood on.

Unknown said...

It's basically what you obtain from Kimura 1963. It's nothing new. What should be noted though is that selection coefficients in the OOM Joe gives are pretty rare. If you take the values for 2Ns from Nielsen & Yang (2003), their best fit was a normal distribution with mean -1.72 and SD 0.72. Given that estimates for effective population size are about 10000 in this case, Joe's highest value would be 8. Only 7*10^-42 of novel mutations would reach that level in this case. Note that s=0.00005 is reached by ~10^-7 of novel mutants, and thus actually occurs regularly...
The thing here is that "such small selection coefficients" aren't really all that small.

Anonymous said...

Wow. I had no idea such very low selection coefficients could influence the rate at which genes are fixed.

I thought about relevance to the real world as I know it. Some plants certainly do have very large, panmictic populations (or did, before humans fragmented their populations). Such huge populations aren't as frequent as they seem to be because so many highly successful weeds are at least partially self-pollinating or apomictic, but still, these calculations are relevant to real world situations.

On the other hand, many (a majority?) of plants have more or less fragmented populations with varying, often low, migration rates between them (sometimes very close to zero even if propagules travel among them because of the low, low rate of seedling establishment in mature populations of long-lived perennial plants). Lately I've been working with plants where the species' entire population doesn't exceed 10,000, and sometimes doesn't reach 100. With the smaller populations (realistically often in double digits as far as genetic individuals are concerned) and the repeated founder effects as new suitable habitats open up, it seems to me that in the evolution of many (probably most) plant species, chance effects would swamp out effects of these very low selection coefficients.

So, in my opinion, Joe must be right here, but Larry's contention that we should assume traits that are fixed in a population or species are the result of genetic drift until proved otherwise is true, also.

(Although it's not relevant to the genetics of the situation, of course, one reason Larry's right is that at the moment we're mostly erring on the side of assuming the traits must be adaptive.)

Unknown said...

"That's an awful lot smaller than the value of the selection coefficient which the researcher can barely detect. That value would perhaps be somewhere between 0.01 and 0.001."

I disagree strongly here. We can detect selection coefficients far lower than that using phylogenetic approaches.

"For new mutations, the probability of fixation can be approximated by 2s. Right?"

Depends on the value for s. The approximation roughly holds if 1/N<s<.25.

"If you take 10,000 new mutations with such low selection coefficients then only one of them (on average) will become fixed in the population. Have I got that approximately right?"

The selection coefficient you give (0.001) would yield 20 fixations in this case.

"Do you think that's sufficiently different from chance to make a fuss?"

Be precise Larry. Is it sufficiently different from neutrality to make a difference? For a population size of 10000 a neutral mutation has a probability of fixation of 1/20000. Given 10000 mutations, we expect 0.5 fixations. The case you give has a 40-fold increase in fixation probability. That's a lot.
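The arithmetic in this exchange can be checked in a few lines. This is only a sketch under the approximations quoted in the thread: 1/(2N) for a new neutral mutation and 2s for a new advantageous one. The function name is made up for illustration.

```python
def expected_fixations(n_mutations, N, s):
    """Expected number of new single-copy mutations that reach fixation.

    Assumes 1/(2N) for a neutral mutation and the 2s approximation for an
    advantageous one; not meant for negative s or for s outside the range
    where 2s is a reasonable approximation.
    """
    if s < 0:
        raise ValueError("the 2s approximation only applies to s >= 0")
    p_fix = 1 / (2 * N) if s == 0 else 2 * s
    return n_mutations * p_fix

N = 10_000
neutral = expected_fixations(10_000, N, 0.0)     # about 0.5
selected = expected_fixations(10_000, N, 0.001)  # about 20
print(neutral, selected, selected / neutral)     # ratio of about 40, i.e. 4Ns
```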

Ultimately it always boils down to this: Drift and selection are not different processes. Treating them as such is a bad idea. "[T]he vast majority of genetic changes in populations – and perhaps many of the physical ones, too – may be due to drift rather than natural selection" is what I would call a very bad way of thinking about evolution. The real question is whether an allele is neutral or not, not how much it is affected by drift and selection, because these are not separate processes. You can't assign them a degree of importance. I've got one question for anybody who thinks they can break up population resampling this way: An allele has a frequency of 0.1 and a selection coefficient of 0.00001 in a population of 10000. Which statement is true:
a) It is more likely for the allele frequency to increase by drift than to decrease by drift.
b) It is more likely for the allele frequency to decrease by drift than to increase by drift
c) Both an increase and a decrease due to drift are equally likely

Joe Felsenstein said...

There have been a lot of good points made here. Let me just comment a bit:

1. Yes, my calculations were not novel ones, but used Kimura's well-known diffusion approximation formula. As Simon will be aware, it's a very good approximation to more exact models. Simon Gunkel cited it as 1963. It is actually 1962, so he was off by 0.05%. ;-)

2. The calculations also work for negative selection, which will be much more common than positive selection. For example here are the corresponding numbers for negative values of s with initial gene frequencies of 0.1:

s             Prob(Fixation)
-0.000001     0.082978
-0.000002     0.067959
-0.000005     0.034653
-0.00001      0.009176151
-0.00002      0.0004112611

So natural selection against an allele is effective for quite small selection coefficients.

3. I agree with Gunkel that phylogenetic methods can detect small selection coefficients. That is because they measure the long-term effectiveness of selection. In-lab experimental selection methods are much less able to do that, and I stand by my guess that the lower limit of the absolute value of s that could be detected by them is in the vicinity of 0.001. Visual assessment that "it looks like an insignificant effect to me" is even less effective.

4. We could debate how typical effective population sizes of 10,000 are. I will just say that surely deer mice, fruit flies, raccoons, foxes and lots of other organisms have much bigger effective population sizes than that, and one can't just ignore that when discussing the effectiveness of small selection coefficients.

5. As for mutants that occur as single copies, this has been dealt with well by Simon. In my classes I take the Haldane 1927 approximation (for favorable single mutants) of 2s, and compare single mutants that are neutral to those that are favorable. The probability of fixation of the neutral ones is 1/(2N). For an advantageous one, 2s. The ratio of the latter to the former is 4Ns. So once again the rule is that if s > 1/(4N), selection can have a substantial effect, and if 4Ns is large, a very substantial effect.

6. Finally, of course most mutations are in "junk DNA" and are very likely to be neutral. But once we are talking about mutations affecting coding sequence or gene regulation, all bets are off. And if they have a visible phenotype, they are even less likely to be neutral.
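The 4Ns rule of thumb in point 5 can be compared against the diffusion formula itself. The sketch below assumes the standard diffusion approximation evaluated at an initial frequency of 1/(2N), so that 4Nsp becomes 2s; the function name and the choice of N are illustrative assumptions, not taken from the comment.

```python
import math

def p_fix_new_mutant(s, N):
    """Diffusion approximation for one new copy (initial frequency 1/(2N))."""
    if s == 0:
        return 1 / (2 * N)  # neutral case
    return math.expm1(-2 * s) / math.expm1(-4 * N * s)

N = 100_000  # illustrative effective population size
neutral = p_fix_new_mutant(0.0, N)
for s in (-1e-5, -1 / (4 * N), 0.0, 1 / (4 * N), 1e-5, 1e-4, 1e-3):
    p = p_fix_new_mutant(s, N)
    print(f"s={s:+.2e}  P(fix)={p:.3e}  "
          f"relative to neutral={p / neutral:.3f}  4Ns={4 * N * s:+.1f}")
```

For large positive 4Ns the ratio to the neutral case approaches 4Ns, matching the 2s versus 1/(2N) comparison, while for negative s the ratio falls below 1 and shrinks quickly as |4Ns| grows.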

Joe Felsenstein said...

I should just add, to Barbara's comments about fragmented populations, that one should be careful not to let small local populations give one the mistaken impression that the population size N for the fixation calculation should be small. If there are many small populations in a species, it turns out that the total species population size is the relevant one. This case has been worked on theoretically and that turns out to be essentially the message.

(I am not accusing Barbara of being mistaken on this point).

Divalent said...

"Ultimately it always boils down to this: Drift and selection are not different processes. Treating them as such is a bad idea."

I think this is much better way to view it. Much like in the case of electrochemical diffusion, where the electrical field provides a directional bias to the random diffusion of ions, so "fitness" provides a directional bias in the random fluctuations in allele frequency. But in both cases the "motion" is mostly due to random influences.

BTW, for the case of a particular mutation newly arising in one individual, the fitness benefit has to be nearly 50% for there to be a greater than 50% chance that it will become fixed in a model population. If the fitness advantage is less than that, then it is more likely to get lost.

Joe Felsenstein said...

And one more. The times it takes for selectively advantageous mutations to fix is less than for neutral mutations. But in any case that time is mostly irrelevant to our discussion. Because for each mutation that just occurred and will take 100,000 generations to get to fixation, there ought to be one that occurred 100,000 generations ago that is just reaching fixation.

Anonymous said...

Joe, don't worry. I know I could be mistaken. I used to work with population genetics a bit -- can it be 15 years ago? -- but I felt that I was at the edge of my knowledge then, and I know less about it now.

When you write, "If there are many small populations in a species, it turns out that the total species population size is the relevant one," does the migration rate matter? With a lot of wetland plants, the migration rate is high, thanks to waterfowl. For cliff-dwelling Sedum, the subtle differences from one ridge to the next suggest it's very, very low. For species with a clearly relict distribution, it's zero and will remain so until climate or land management changes. It seems to me that in the latter cases the population of the entire species might make little difference (but see paragraph above).

Anonymous said...

"Evolution is a process that results in heritable changes in a population spread over many generations."...

I don't think I know even one creationist... with the exception of Witon and his crew of morons... who disagree with this statement...

So... where is the problem then if most creationists would most likely agree with Larry's definition of evolution...?

Joe Felsenstein said...

If migration rates are low, it can take quite a while for a favorable mutant to spread through the species. But if there are N total plants in the species, and each is diploid, there are 2N copies of the gene that can have a favorable mutation occur, and each still has about 2s probability that it does not get lost early on if s > 1/(2N). If a favorable mutant takes over a local population, then there are enough copies around that it will probably ultimately take over the whole species, though that may take a while.

Anonymous said...

Such beautiful calculations by so many accomplished evolutionists... One can't resist but to admire their precise mutation fixation rate and so on...
One can't also resist to wonder... if their calculations are so precise and so accurate... why can't they use their calculations in process of finally proving that macro-evolution is a fact...? Why...? If the math is so easy... why there is not even ONE=1 lab experiment proving macro-evolution..?

I guess our boys may be good at math but they may not be as good as dumb lack or, as Unknown would put it ..."not as good as (stupid) errors...

One can also wonder how it is possible that population genetics can be called a science... If dumb lack and genetic errors outsmart "intelligent scientists" like Joe Franks with their perfect predictions of genetic mutation rate... What should we really call this "science"...? I mean... really...

Larry Moran said...

Most people think that all evolution is due to natural selection. In order to counter that view it is useful to point out that many mutations (alleles) are effectively neutral but they may sometimes be fixed by random genetic drift. It's also useful to point out that new advantageous alleles aren't always fixed. In most cases it's the other, deleterious, allele that is preserved in the population. Natural selection can't explain that.

I understand why fans of adaptation want to treat random genetic drift as just a variant of selection but that's not very helpful. It's pretty hard to describe the accidental fixation of a deleterious allele without emphasizing the contrast between natural selection and random genetic drift.

It's also pretty hard to explain molecular clocks if you insist on treating drift and selection as the same process.

All the modern textbooks have it right. Natural selection and random genetic drift are two different mechanisms of evolution. Advantageous alleles are fixed by selection to give adaptations. Neutral and deleterious alleles are fixed by drift to give neutral and maladaptive phenotypes.

Larry Moran said...

Thanks for clearing that up Simon.

Are you absolutely certain that you understand the massive amounts of evidence for Neutral Theory and the importance of random genetic drift?

Larry Moran said...

Joe,

When you say that "all bets are off" when it comes to mutations in coding regions, what do you mean? If you line up the amino acid sequences of a protein from different species you will see lots and lots of differences. If you construct a phylogenetic tree from those sequence comparisons you will usually observe a relatively constant rate of fixation over long periods of time (an approximate molecular clock). This strongly suggests that the alleles are effectively neutral in the evolving populations.

Furthermore, everything we know about the biochemistry of those proteins indicates that most of the substitutions are unlikely to affect the function of the protein.

Are you arguing that most of those changes are adaptive?

Unknown said...

Funny, when I had this discussion with Dawkins (shortly after the release of "the greatest show on earth") he accused me of treating selection just as a special case of drift...

Since you mention molecular clocks, I'll note that that is in fact my field. I'm a graduate student and my PhD is on molecular clock dating and fossil calibrations. Hence I can pretty easily explain how you get a clock: You take Kimura's probability of fixation for a novel allele (say, eq. 13 for a rather general case). You then look at 2Nµu in the special case of s=0 and find that all the terms cancel out, apart from the mutation rate. That's not hard at all. Note that Kimura's formula is for selection and drift. In fact you rarely see any formula for drift at all, apart from the neutral case. But the drift term is not the same in other cases (one more strike against the separate processes view: there are effects of differential fitness between genotypes that end up in the drift part if you divide things up this way). This isn't just the case for the diffusion approximation, it holds for the Fisher-Wright process and the Moran process as well. All of the fundamental ways in which we can model populations.
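The cancellation described here is easy to see numerically. The sketch below assumes that 2Nµ new mutant copies arise per generation at a site and that each fixes with the diffusion-approximation probability u for a single new copy; the mutation rate, the value of s and the function name are all hypothetical choices for illustration.

```python
import math

def substitution_rate(mu, N, s=0.0):
    """Long-run substitutions per generation: 2N*mu new mutants, each fixing with probability u."""
    if s == 0:
        u = 1 / (2 * N)  # neutral fixation probability of a single new copy
    else:
        u = math.expm1(-2 * s) / math.expm1(-4 * N * s)
    return 2 * N * mu * u

mu = 1e-8  # hypothetical per-site mutation rate per generation
for N in (1_000, 10_000, 1_000_000):
    print(N, substitution_rate(mu, N), substitution_rate(mu, N, s=-1e-5))
```

In the neutral column the population size drops out and the rate equals µ for every N, which is the clock. In the column with a small negative s, the rate depends strongly on N under these assumptions.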

Once you have this complete description it's utterly pointless to divide it up into a selection and drift part. If you have an unfair coin that comes up heads 60% of the time, then you have a very simple model to describe this - the Bernoulli distribution with p=0.6. You can easily infer a distribution for the number of heads after n trials (Binomial), you can infer a frequency of heads after infinitely many trials through the LLN, you can give a very accurate approximation for the distribution around that value using the CLT. But essentially after saying that you have a random variable X which is Bernoulli (0.6) distributed you are done (say with heads=1, tails=0). Now, you can go and write
X=(X-E(X))+E(X)
Which is obviously true (you are adding and subtracting the number E(X))
E(X)=0.6
and
X-E(X)=0.4 with a probability of 0.6 and X-E(X)=-0.6 with a probability of 0.4
Hence there are two different processes at work for an unfair coin toss. One that always comes out with 0.6 Heads and 0.4 Tails. And one that comes out with 0.4 Heads and -0.4 Tails with a probability of 0.6, but -0.6 Heads and 0.6 Tails with a probability of 0.4. If you think this is a more accurate description of the coin toss, I can't help you...
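The decomposition in this comment can be written out exactly with rational arithmetic. This is just a sketch of the bookkeeping for the Bernoulli(0.6) coin with heads = 1 and tails = 0; the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(6, 10)          # P(heads) for the unfair coin
outcomes = {1: p, 0: 1 - p}  # heads = 1, tails = 0

# The "deterministic part": E(X)
mean = sum(x * prob for x, prob in outcomes.items())

# The "random part": X - E(X), with the same probabilities as X
deviation = {x - mean: prob for x, prob in outcomes.items()}

print("E(X) =", mean)
for value, prob in deviation.items():
    print(f"X - E(X) = {value} with probability {prob}")
print("E(X - E(X)) =", sum(v * pr for v, pr in deviation.items()))
```

The deviation term averages to zero by construction, and its two possible values are fixed entirely by p, so the split adds no information beyond saying that X is Bernoulli(0.6).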

Your last paragraph goes a step further. It now claims that all Heads are due to the first process and all Tails are due to the second. That makes even less sense. I think your post is excellent in showcasing just how misleading it can be to treat these as individual processes rather than as a computational trick (which has outlived its usefulness, we're not in log-table land anymore).

But again, I posed a question above which should be easy to answer if selection and drift really are different processes. We do have a pretty deep understanding of population genetics and a large amount of data. If drift and selection are both going on, it should be a cakewalk to answer the question: Given these parameters, is allele frequency more likely to increase due to drift than to decrease, more likely to decrease than increase or are the two probabilities equal. I am asking specifically for the effect of drift (i.e. not the change due to selection and drift combined).

Think about the question for a couple of minutes. It's a question that should be simple under your view, but is quite the opposite.

Larry Moran said...

Simon,

I don't understand your point, or your question.

Do you agree or disagree with the following statement.

"The rate of fixation of most alleles in a population approximates the mutation rate, strongly suggesting that most alleles are not fixed by natural selection."

Donald Forsdyke said...

And you might add to your reading list a leading US biohistorian's take: "The 'Random Genetic Drift' Fallacy," which traces the story from the early work of John T. Gulick in the 1880s (William B. Provine, 2014).

Joe Felsenstein said...

No, I am arguing that we have a hard time telling the difference between pure neutrality and:

1. Patterns in which selection events themselves occur at random times. Gillespie and Langley argued this in many papers in the 1980s and 1990s. They also pointed out cases where substitutions clustered more (along one lineage, in time) than would be expected by neutrality.

2. Patterns in which mildly deleterious mutations (as Tomoko Ohta has explored) gradually make the structure of a protein, or structure of an RNA, deteriorate. That leaves more room for advantageous mutations to occur, restoring the structure. The result is a pattern rather like neutrality, but not neutral. Processes like this will cause many adapted structures to exist in a slightly-dysfunctional state. How dysfunctional depends on how small the population is, as bigger populations allow smaller selection coefficients to be effective.

Michael Lynch has proposed patterns in his important book The Origins of Genome Architecture that are consistent with #2 as well as with pure neutrality. We have limited amounts of data for any one group and our tools for detection of selection are still crude. When one sees a site that has few mutations in divergence of some related species there should not be a lot of power to tell how much the process diverges from neutrality.

I am not saying I know that most molecular change is not neutral. Just that it's hard to tell. That actually has one fortunate effect -- it means we can get an approximate fit to the process by using neutral assumptions, and that this will be good enough to use to go ahead and infer phylogenies.

Unknown said...

I disagree with the statement. The mean rate of fixation in populations is lower than measured mutation rates, usually by about 1 OOM, strongly suggesting that novel mutations are mostly detrimental and that negative selection has a highly relevant effect. Now, if you look at what novel alleles get fixed (compared to the novel alleles that arise), you get a high percentage of nearly neutral alleles in that subset (about 85% of novel mutations are significantly affected by negative selection, but roughly 90% of the fixed novel mutations are in the near neutral range).
Now for a few genes we get away with simply postulating that some positions are under strong negative selection and have an effective substitution rate of 0, while others are strictly neutral. The substitution rate there becomes pµ, where p is the fraction of preserved positions. You then perform a likelihood ratio test (Felsenstein 1988) to check whether you can reject the strict clock and if you can't reject it you use it. But usually it's not that simple (apart from selection another confounding factor is that mutation rates are not constant throughout the tree) and there are a few relaxed clock methods available. I'm waiting for an alignment from transcriptomic data using only synonymous sites. That's a good candidate for not rejecting the strict clock, because synonymous sites are likely to be near neutral, but we still have to check.

"Most alleles are not fixed by natural selection" is a non sequitur. It's not even wrong. How would you distinguish an allele "fixed by selection" from an allele "fixed by drift"? What we can actually figure out is what the selection coefficient for a particular site is (with a particular degree of accuracy and with different approaches yielding different powers in rejecting the null hypothesis of neutrality).

To repeat my question in somewhat different terms. You state that selection and drift are different processes. You can write:
Change of allele frequency=Change of allele frequency due to selection+Change of allele frequency due to drift+Change of allele frequency due to mutation+...
Now my question concerns the "Change of allele frequency due to drift" term. It is a random variable. It has an expected value of 0 (by definition). What I want to know is what the probabilities of it being positive and negative are given a population size of 10000, s=0.00001 and a frequency of 1000. Is the probability of being positive greater than, smaller than, or equal to that of being negative?
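
One rough way to put numbers on that question is the sketch below. It assumes a haploid Wright-Fisher population (the comment does not name a model) and reads "a frequency of 1000" as 1000 copies among N = 10000. Selection shifts the expected frequency, and the drift term is taken to be the binomial sampling deviation from that expectation. The function names are made up for illustration.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def drift_sign_probabilities(n, copies, s):
    """Return (P(drift term < 0), P(drift term > 0)) for one generation.

    Assumed model: haploid Wright-Fisher. Selection moves the frequency
    from x to x' = x(1+s)/(1+x*s); the drift term is the deviation of the
    sampled count in the next generation from its mean n*x'.
    """
    x = copies / n
    x_sel = x * (1 + s) / (1 + x * s)
    mean = n * x_sel
    p_neg = sum(math.exp(log_binom_pmf(k, n, x_sel))
                for k in range(0, n + 1) if k < mean)
    p_pos = sum(math.exp(log_binom_pmf(k, n, x_sel))
                for k in range(0, n + 1) if k > mean)
    return p_neg, p_pos

p_neg, p_pos = drift_sign_probabilities(n=10000, copies=1000, s=0.00001)
print(f"P(drift < 0) = {p_neg:.4f}, P(drift > 0) = {p_pos:.4f}")
```

Under these assumptions the two probabilities need not be equal even though the expected value of the drift term is 0, because a binomial distribution at a frequency near 0.1 is skewed and its mean and median do not have to coincide.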

Unknown said...

I think the time to fixation IS relevant, because many ecological conditions (and hence s) change slowly over time. For a tree, 100,000 generations means passing through several ice ages and warm periods. Selection for many attributes will reverse multiple times over that period. If s is doing a nearly random walk in time, then you get results that look almost neutral.
Lou Jost

Konrad said...

Simon: you are correct, of course. Saying that selection is a separate process from drift is like saying that the bias of a (proverbial) coin is a "separate process" from the tossing of the coin. This is obvious the moment one writes down an equation, and Larry's claim (or implication) that textbooks contradict this is not helpful.

Nonetheless, it is worth thinking in terms of the way biologists use these terms when speaking less formally (as Larry does). If you think of drift as referring to the case where the selection coefficient is close to 0 (or in some contexts also the case where it is negative, unfortunately it's an informal usage and its interpretation varies by context) the usage makes sense. So when Larry says there are two mechanisms, what he really means is that (for a small positive value of epsilon) you can have s>epsilon (in which case he refers to the mechanism as selection) or s<epsilon (in which case he refers to the mechanism as drift). Of course these are just different regimes of the same mechanism, but that's not the way the word "mechanism" has historically been used in this context.

Other than differences in usage of language, I don't think the two of you are actually in disagreement.

Mikkel Rumraket Rasmussen said...

"One can't also resist to wonder... if their calculations are so precise and so accurate... why can't they use their calculations in process of finally proving that macro-evolution is a fact...? "

What you meant to ask is - why can't they do it to your personal satisfaction? That's a very different question, because the one you asked is answered merely by "they can".

S Johnson said...

Allan Miller

I appreciate the response. I'm sorry if I was confusing about my concern about gene frequency. My difficulty is, if the gene frequency is a measure of sample size used in calculating the probability s, how can one have a precise measure of probability with such a small sample size, even using other methods to measure it?

With a large gene frequency, you are in effect getting a large sample size to calculate probability s, so it is more plausible that you can be precise enough to talk of a small s. I don't know exactly how significant figures are determined (to avoid a Texas sharpshooter fallacy), but it seems to me that to be sure of the smaller s values, one needs to examine the evolution of changes in gene frequency, that is, time to fixation. The greater the magnitude, the more rapid the fixation as a rule, while drift is slower.

As for the differences or non-differences between positive and negative selection versus drift? It seems to be agreed that time of fixation is different, meaning distribution in time is different. If you could plot the number of alleles versus time of fixation, there should be a bimodal distribution I should think. The closer the drift mode is to the mutation rate, the more important drift. The further away, the less important. I can't think of a way to distinguish drift and selection if you ignore the temporal dimension but nor can I see how you justify ignoring it. And this is especially true when seeking causes, for those who are still interested in such things.

Lastly, it seems there might be another approach to evaluating the power of even tiny selective advantages. Random genetic drift is not s=0, but some small negative value, because it leads to junk DNA, which is a waste of resources, and therefore must be negative. Wouldn't calculating a genome-wide s value for junk DNA across different species give us a quantitative estimate of the power of very small negative selection forces?

Anonymous said...

Simon, You and Larry are disagreeing on two levels. You and he will have to duke it out over the numbers – I’m clueless. I think your bigger problem is one of the words used to communicate.

As far as the process of allele frequency change, yes, you’re right, it’s all one process with varying selection coefficients. However, the results and our ability to think easily about them differ importantly depending on the value of that selection coefficient.

It makes intuitive sense that beneficial alleles will increase and harmful ones will decrease (natural selection). This is relatively easy to understand, relatively predictable, and leads to better adaptation. It’s strongly influenced by the environment. If we didn’t already have a term for this process, we’d make one up now so we can talk about it.

It’s a little harder to understand that alleles with selection coefficients at or near zero can become fixed. It’s mind boggling (though obviously true) that beneficial genes can be lost and harmful genes can become fixed. And further mind boggling that this neutral to mildly harmful change occasionally can lead to better adaptation. These processes are relatively unpredictable and relatively independent of the environment. We need a name for all this, so we can talk about it and contrast it with that process of adaptation by natural selection. We call it genetic drift. It’s true that genetic drift just means “everything else, except natural selection” and so isn’t a great term in theory, but it’s a necessary term in practice.

For effective teaching, I can see either starting with natural selection and drift and working up to the unity of the underlying equations, or starting with the equations and reaching the special case of selection as well as the “everything else” of drift. In a general biology context I’d use the first approach, in part because too many student minds shut down at the sight of even the simplest equations. However, either should work if the teaching is clear and starts close to where the students are.

Anonymous said...

Thanks.

AllanMiller said...

S Johnson,

My difficulty is, if the gene frequency is a measure of sample size used in calculating the probability s, how can one have a precise measure of probability with such a small sample size, even using other methods to measure it?

I think the source of confusion is that I am looking at s as an attribute of the allele, whereas you are looking to measure it. Of course when an allele is present in single copy there is but a tiny sample, and random effects will likely swamp any selective effect in that one life. Hard to tell anyway. But that allele still has a selection coefficient, in this assumption of constant s. It doesn't change as we gain data, we just become more certain what it is! If we took two populations, in A our allele was nearly fixed and in B it was absent, we might have a good handle on what s 'really' is from measurement in A, then we take a single instance and add it to B, we don't end up back at square one. As far as measurement is concerned, 1 instance is useless - but then, for these kinds of selection coefficient, many thousands of instances could be useless. At a certain level it becomes a purely theoretical construct, immeasurably small but still, by mathematical analysis, distinct from neutrality.

If mutations were producing a neat distribution of 33% neutral alleles, 33% positive and 33% negative, with only a tiny variance in s, we would still get significantly more adaptation than fixation of detriment or neutrality, because the improvement in chances for positive s increases exponentially (s being, effectively, an exponent).

Anonymous said...


This is so sad... So many supposedly intelligent people believing in this [shit]...
I feel like crying... I just thank God [sic] for giving me insight into this [shit]....coz most people around me don't seem to see this [shit] as a problem... They seem to ignore the reality for some reason...?

It's funny, Quest, because that is very much the way I feel about your religious delusions.
The big difference seems to be that I can explain why I believe what I do, and you cannot.

TheOtherJim said...

Thank-you all. A very interesting thread, on topic, by people with knowledge in the area. I've missed the exchanges like these, as of late.

Robert Byers said...

Studied but it seems conclusions are newly being said that evolution is from drift mostly or only.
Either there is a difference of opinion here between evolutionist posters or a confusion that there is by other posters, also evolutionists.
A creationist can be excused to be surprised about the thread.

Not only do the three points deal with very minor things in nature but what could be the evidence they ever did bring evolutionary change? How would one know? If not know then no evidence for this having occurred to the needed extent to justify drift as a great or common agent for evolution.
It seems a retreat from mutationism as the agent of evolution.
I think ID/YEC thinkers would find this thread interesting.

AllanMiller said...

Lou Jost,

I think this is an important point. Pop-genetic formulae relate to idealised behaviour in simplified populations, often with constant assumptions such as a static distribution denoted by s, constant Ne and so on. The real world is clearly not like that. Other analyses import some messiness, but the reality is still far too complex. Nonetheless, the formulae provide a guide to the constraints on behaviour of populations under sampling, often with surprising power for small variations. Another reason, I think, to agree with Simon Gunkel that the processes of drift and selection are not separable, save where one is entirely absent (eg s=0 over the long term).

On another forum, I spent some time trying to pursue an argument with Simon justifying their separation, but it didn't really work, even in thought-experiment terms, so I humbly concede the point. Drift is a consequence of stochasticity regardless of s's current value, itself varying stochastically.

Arlin said...

Simon, you are right in many ways but this does not answer Larry's concerns. Larry is trying to make statements about causation. Simon is pointing out that, if one thinks about it carefully, these statements are often problematic from the perspective of the standard stochastic population genetics (altho there is a bit of exaggeration here, e.g., when Kimura argued for the neutral theory, he was specifically talking about the fraction of *fixations* attributable to drift, and certainly there are conditions under which attributing X% of fixations to drift *is* a logically possible attribution).

So far, Larry hasn't succeeded in making causal attributions that we can reconcile with this model. However, Simon is avoiding them altogether. Simon, you challenged us to answer questions based on the selection-drift distinction that, in your view, cannot be answered. OK, fine. The real question is not what can't be said, but what *can be said*. If your theory allows us to say nothing about the roles of different causes in evolution, then it isn't a very useful theory of evolution, is it?

I'm not suggesting that the situation is hopeless. Elliott Sober faces an analogous issue in regard to nature and nurture in Ch. 10 "Apportioning Causal Responsibility" of his book _From a Biological Point of View_. The question of whether nature or nurture caused a particular mouse's tail to be 2.5 inches long is not an answerable question, nor is the question of how much nature, and how much nurture, caused this particular outcome. However, there are other categories of questions in which we can apportion causal responsibility intelligibly.

S Johnson said...

Don't the models behaviorally distinguish large magnitude s by larger more persistent changes in population size, with low/zero magnitude s by smaller oscillations? This seems much more straightforward an effect to model than the changes in the magnitude of s as gene frequency and population size changes. At any rate, the models can't habitually leave N constant. What's the standard way of dealing with changes in N?

Joe Felsenstein said...

What the population size does as fitness changes up (or down) depends on the strength of population size regulation. In species that have strong regulation the population sizes would not change much. Generally we are discussing relative fitness rather than absolute fitness in these discussions. We make no distinction in them between fitnesses of 1 : 1+s versus 0.95 : 0.95(1+s), for example.
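
The point about relative fitness can be illustrated with a minimal sketch, assuming a deterministic haploid selection step (a simplification chosen here for brevity, not something specified in the comment). The helper `next_frequency` and the values of `s` and `p` are hypothetical.

```python
def next_frequency(p, w_mutant, w_wildtype):
    """One deterministic haploid selection step: p' = p * w_mutant / mean fitness."""
    mean_w = p * w_mutant + (1 - p) * w_wildtype
    return p * w_mutant / mean_w

s, p = 0.01, 0.3

# Fitnesses 1 : 1+s
freq_a = next_frequency(p, 1 + s, 1.0)
# Fitnesses 0.95 : 0.95(1+s), i.e. the same ratio at a lower absolute level
freq_b = next_frequency(p, 0.95 * (1 + s), 0.95)

print(freq_a, freq_b)
```

The common factor 0.95 cancels between numerator and mean fitness, so both fitness schemes give the same next-generation frequency up to floating-point rounding.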

Arlin said...

Allan, whether or not one can identify separate "forces" of selection and drift is a separate question from whether one can make *any statements at all* based on this sort of distinction. I agree that the idea of selection and drift as separate "forces" is misleading, not just because they are not separate ontologically. It is also because drift does not act like a force. Forces have magnitude and direction, but drift (at the level of a single population) has only magnitude.

However, the neutral theory was and is about the fraction of fixations that are attributable to drift. If this theory is sensible, then it is sensible to talk about a fraction X of changes (e.g., nucleotide or protein changes that accumulate over time) being attributable to fixation by drift. What we might mean by this is that the fixations we attribute to drift did not happen by virtue of the allele's effects on fitness, but rather due to the cumulative effects of sampling error. I consider this a sensible kind of statement to make, at least under some conditions.

AllanMiller said...

Arlin,

Forces have magnitude and direction, but drift (at the level of a single population) has only magnitude.

Yes, I like that, though do wonder how far we might take the analogy with physics (granted that diffusion equations find their place).

I agree with the point about the neutral case, although this is the point I made in my original comment, that they can be separated where one is absent (s=0). Going from s=0 to s=0.000001, selection starts to come into play but drift does not of course immediately disappear. There is essentially a continuum of possible values of s, of which zero is but one. And, being a random variable, an s=0 allele can have positive and negative effects in individual lives. But yes, we might for practical purposes wish to create a discrete bin into which all s values at or below some threshold (and on into the negative) could be declared 'fixed by drift'. I'd note, though, that fixation of s<0 alleles by drift must fractionally increase the mean s value of new mutations that are being produced, while repeated runs, even with s very near 0, must bias in favour of the ones where s>0. In terms of ongoing evolution, adaptation has legs. But of course these effects can only bias the fixation of the mutations actually produced, and those proportions form the primary input to the process.

Arlin said...

The "neutral" case (aka "strictly neutral") for Kimura, for Joe above, and for most population geneticists, means roughly the case in which abs(s) < 1/(4N), where N is the effective pop size. Ohta's "nearly neutral" theory is more complicated and relies importantly on negative values of s with a magnitude > 1/(4N). In evolutionary discourse, there is also a somewhat confusing minority tradition in which critics of the neutral theory define the neutral case as s = 0. This provides the rhetorical advantage of being able to rule out "neutrality" _a priori_ as a useless idealization, and to claim that Kimura was wrong.

I'm talking about the case in which fitness effects are very small, not the purely hypothetical case of s = 0.

Consider the beneficial case. If an allele is beneficial, the chance of fixation (for a new mutation) in a somewhat idealized stochastic population is about 2s. It stands to reason that, as we reduce s, there is some point at which the chance of fixation due to selection becomes negligible relative to the chance of fixation due to drift, which is 1/(2N) for diploids. We can express this mathematically as 2s << 1/(2N), which is the same as s << 1/(4N).
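
The two limits in this argument can be checked numerically with a short sketch. It uses Kimura's diffusion approximation for a new mutation, P_fix = (1 - e^(-2s)) / (1 - e^(-4Ns)), under the assumptions of a diploid population, additive selection, and an effective size equal to the census size N. The function name and the value N = 10,000 are chosen only for illustration.

```python
import math

def fixation_probability(s, n):
    """Kimura's diffusion approximation for a single new mutation.

    Assumes a diploid population of size n with effective size equal to n,
    additive selection, and an initial frequency of 1/(2n).
    """
    if s == 0:
        return 1.0 / (2 * n)
    # expm1 keeps precision when s or 4*n*s is tiny
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)

n = 10_000
for s in (1e-8, 1e-6, 1 / (4 * n), 1e-3, 1e-2):
    p = fixation_probability(s, n)
    print(f"s={s:.2e}  4Ns={4 * n * s:8.4f}  P_fix={p:.3e}  "
          f"P_fix/(1/2N)={p * 2 * n:8.3f}  P_fix/(2s)={p / (2 * s):8.3f}")
```

For s far below 1/(4N) the ratio to the neutral value 1/(2N) stays close to 1, and for s far above 1/(4N) the ratio to 2s approaches 1, with a smooth transition in between rather than a sharp boundary.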

This actually is not how Kimura defined neutrality, but it is how I think of it. That is, I think of neutrality as the point at which the chance of fixation by drift overwhelms the chance of fixation by selection.

So, consider 100 mutations that achieved fixation, and which have selection coefficients that are positive but < 1/(4N). I am saying that there is some legitimate sense in which we can describe causation by saying that the fixation of these alleles (collectively) was overwhelmingly due to drift rather than selection.

ealloc said...

Re: Whether Selection and drift are "forces"

Those discussing may be interested in the paper "The application of statistical physics to evolutionary biology", PNAS 2005.

This paper makes clear that the analogy is not to *newtonian* forces, but to *thermodynamic* driving forces. In fact, under certain limits there is a one-to-one correspondence between thermodynamics and population genetics. It can be shown that the distribution of phenotypes follows a Boltzmann distribution, that log(fitness) is exactly like a thermodynamic potential, population size is inverse temperature, and the number of genotypes giving a phenotype is like an entropy, so that drift is like entropy and "pushes" the population towards more degenerate phenotypes.

In coarser terms, selection plays the role of a potential and drift plays the role of entropy, and equilibrium is determined by the balance between their opposing effects. Lower temperature (high population) reduces the effect of entropy, leading the system to lower energy (higher fitness) equilibrium.

It is hard to separate the effects of selection and drift for the same reason that it is hard to separate the effects of energy and entropy.

AllanMiller said...

Arlin,

Yes, effective neutrality would be an example of a 'binning' dichotomy such as I alluded to, although we include s alongside Ne in that case at our boundary.

I do see something of a difficulty though, trying to attach your argument to (slightly more) real populations. Suppose we have an allele of s=0.00000025. That would be effectively neutral in a population of 100,000, but not in a population of 1,000,000. Say the 1000000 are divided into 10 subpopulations. Our allele can only fix in the first by drift, and the second, and the third ... but when we pull back, and see it fixed in all - lo! it has been fixed by selection! s=1/4N.

Both are true. No individual deme is populous enough to make the selection coefficient effective, but the Large-Number result of 10 subpopulation trials is that selection has had a chance to assert itself. It's not just cumulative Drift - or, if it is, it is indistinguishable from Selection.
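
The numbers in this example can be put side by side with a small sketch. It assumes the same diffusion approximation for fixation probability discussed earlier in the thread (diploid, effective size equal to N) and compares one deme of 100,000 with the pooled population of 1,000,000. It does not model migration or population structure, so it only shows how the strength of selection relative to drift changes with N. The function name is hypothetical.

```python
import math

def relative_fixation(s, n):
    """Fixation probability divided by the neutral value 1/(2n).

    Diffusion approximation for a new mutation in a diploid population
    whose effective size is assumed equal to n.
    """
    p_fix = math.expm1(-2 * s) / math.expm1(-4 * n * s)
    return p_fix * 2 * n

s = 0.00000025
for label, n in (("one deme", 100_000), ("pooled", 1_000_000)):
    print(f"{label:9s} N={n:>9,d}  4Ns={4 * n * s:.2f}  "
          f"P_fix relative to neutral = {relative_fixation(s, n):.3f}")
```

With 4Ns = 0.1 the allele fixes only slightly more often than a neutral one, while at 4Ns = 1 the advantage over neutrality is clearly larger, which is the contrast the example relies on.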

And I'd expect a similar result from 100 alleles that each, individually, had s below 1/4N. The population would have concentrated more of the alleles at the higher end of the range, despite an individual assessment of neutrality (I reckon!).

The more trials undergone, the more surely the bias in favour of positive s must rise above the background noise.

Arlin said...

That is a very fascinating point, and warrants more discussion, but I have to warn that the classical "forces" theory of population-genetics causation taught in textbooks, and the arguments of Sella & Hirsh (see also Vogel & Zuckerkandl, 1971) are working at two completely different levels that are in some ways contradictory.

For the architects of the Modern Synthesis, evolution was emphatically *not* a Markov chain of separate mutation-fixation events, which is the way Sella & Hirsh envision the process. That view is the "mutationist" view, the antithesis against which the Modern Synthesis took shape. The mutationist view re-emerged in the molecular era, and over the past few decades, population geneticists have reached a sub-conscious consensus to forget that this distinction ever existed.

The classical view of population genetics was that we could view evolution as a process taking place in the *interior* of an allele-frequency space, in which "evolution" consists entirely in shifting the frequencies of alleles already present in a population, i.e., shifts in non-zero values. Evolution was conceived as a process in which the environment changes, then the population responds by shifting allele frequencies to a new multi-locus optimum. This process was argued to be an inherently interactive, multi-locus process, due to epistasis. The idea of evolution by individual mutations was rejected. Evolution, in this view, is not a single-locus process, and selection never waits for a new mutation because there is always abundant variation in the "gene pool."

In this context, "forces" were conceived as mass-action processes that affect allele frequencies. The common language of causation was that a force, whether it is mutation, drift, migration, or selection, can displace a frequency from f to f + d. By analogy, a Newtonian force can displace the momentum of a particle. There is some extent to which forces can be combined. Dobzhansky and others also compared population-genetics forces to the forces of statistical physics, but they did this with a particular rhetorical view in mind. They used it to argue that individual events are like the movements of individual atoms, which are individually unimportant, and that causation arises at a higher level.

The notion of the *interior* of the allele-frequency-space is vital because if we start thinking about jumping off of an axis from a value of 0 to a non-zero value, then the "forces" are no longer equal. Selection cannot shift an allele frequency from 0 to 1/N. Only mutation can do that.

This classical "forces" view falls apart specifically when we model evolution as a Markov chain of mutation-fixation events. Mutation here plays the role of a point process that introduces new alleles, and not a mass action force that shifts frequencies from f to f + d. When this new behavior is generalized over an ensemble of chains, a higher-level analogy with statistical physics emerges a la Sella and Hirsh. Mutation once again becomes a mass-action process, but it is not the same thing as the original "mutation pressure". It is now a pressure of origin events.

Arlin said...

Sorry, one of my points above wasn't very clear. Dobzhansky, et al sometimes invoked physics with a particular view in mind. They were thinking of the gas laws, and their argument was that individual movements of atoms are not important, but causation arises at a higher level. This higher level for them was "the population". They insisted that evolutionary causation takes place "at the population level". One of the things they meant by this, and stated explicitly, was that individual events of mutation are unimportant.

Arlin said...

Allan, I understand the point that you are making, though I don't see it as a major problem. As another example, we could have a situation in which there are a gazillion perfectly neutral (s=0) mutations from one synonymous codon to another, and a different set of a gazillion slightly beneficial mutations from one codon to a synonymous codon, each with a benefit << 1/(4N). Each mutation is "neutral" but if we look at a large enough number of them, we'll see a bias toward the beneficial set, and this will *not* be due to differential fixation by drift.

But I think this is just a matter of the way that s << 1/(4N) was defined to address the attribution of causation in the case of a single fixation. If we are considering the behavior of a large collection of events, this has to be adjusted accordingly to reflect the aggregate influence of selection over all the relevant values.

This is implicit if you look at neutrality in the way that I was suggesting, as a comparison 2s with 1/(2N).

Joe Felsenstein said...

In response to Allan's raising of the issue of subdivided populations, let me give some citations that I know Arlin is aware of: the classic papers are by Ed Pollak (J. Applied Probability, 1966) and Takeo Maruyama (Genetical Research, 1970). There is interesting followup work by Michael Whitlock (Genetics, 2003) and a fairly extensive review there and by Patwa and Wahl (Journal of the Royal Society Interface, 2008).

The classic Pollak-Maruyama result is that the probability of fixation is the same as if you just pooled all the populations into one big one. The followup work points out many cases where the symmetry assumptions of those papers are violated and you don't quite get that result.

Anonymous said...

I think Whitlock 2003 has fulfilled my desire to read population genetics for the whole next year. Or two. My takeaway understanding: Alleles should disappear or go to fixation in fragmented populations as they do in panmictic populations, but it takes longer (though it might go faster in a few cases). The number of generations to fixation (when the selection coefficient is low) is around 8,000 (sometimes less) to 30,000 or 40,000, sometimes over 100,000, depending on several variables.

My related thoughts: In an annual plant, we can approximate the number of years as equal to the number of generations (ignoring the seed bank issue). In perennial plants, however, the number of years is at least twice the number of generations and often much higher. (And we get into the complication of genetics in overlapping generations.) 16,000 years is the entire time humans have been in the New World (or 2/3 of the time we've been here; controversy). 100,000 years started well back in the time of glaciers. Critical values such as effective population size, selection pressure, and migration rates among subpopulations have varied over this time. Not to mention the different selection pressures in different simultaneous populations.

Therefore, the grand march of allele frequencies to fixation or zero is so overlaid with unpredictable (or extremely difficult to predict) events that the difference from random (s = 0) processes is usually not measurable. It's not really surprising that molecular clocks work, summing allele frequencies over long periods of time and variable population structures.

Unknown said...

I did start a long reply and my computer crashed and so I have to give it another go...

@Konrad: We do have perfectly good terms for the case when s=0 (neutrality) or s~0 (near neutrality). I think that confusing drift with neutrality is one of the most common misconceptions about evolution people have and it is one that is exploited by IDiots like Sanford (whose entire career is about conflating drift and neutrality, the beneficialness of an allele with its adaptiveness and absolute with relative fitness. That's 3 pairs of terms that are often dealt with in ambiguous terms in the popular literature and that's what Sanford abuses). I think we should be careful in making this distinction and we should be even more careful when talking to a non-expert audience. New Scientist definitely falls into that category and to some extent this blog does too. And of course it holds when talking to students. If you hold the view that "The top three criteria for effective teaching are; accuracy, accuracy, and accuracy." you should not conflate terms. I'm also not sure that this is what Larry is doing. He usually does use neutral when he talks about neutral alleles.

@Barbara:
"It makes intuitive sense that beneficial alleles will increase and harmful ones will decrease (natural selection). This is relatively easy to understand, relatively predictable, and leads to better adaptation."
This has one of the issues mentioned above in it. An allele is beneficial if s>0, which is equivalent to stating that its presence is correlated with increased fitness. An allele is adaptive if it causes higher fitness. The two terms are not equivalent. We can think of this as two causal directions: whether an allele is beneficial or not records how the fitness of organisms influences the ultimate fate of the allele, while its adaptiveness is about how the allele influences the organism's fitness. You can make some idealizations about populations that make them equivalent and I wouldn't be opposed to looking at adaptiveness as "what the selection coefficient would be in an idealized population".

"These processes are relatively unpredictable and relatively independent of the environment."

But they aren't. One of the main issues with the "two processes" view, is just how much of the effects of differences in fitness actually end up in the drift part. The drift part is not independent of s (it is uncorrelated by definition though).

"[T]oo many student minds shut down at the sight of even the simplest equations."

I don't think this is a good reason not to introduce the maths. Students don't assume they will get by without maths in a physics course. They don't assume they will get by without maths in chemistry. Why people think they won't need these tools in biology is beyond me (then again I wanted to apply statistical methods to fossil data since I was 13 and always was aware that that would require learning the statistical methods). In the end I think you can go with a simple model and use it. There's a version of the Moran process that produces the correct result for the probability of fixation for haploid organisms in a few easy steps (if you remove all steps in which the standard Moran model does not alter allele frequency you get a random walk, with stepsize 1, absorbing boundaries at 0 and N, and a probability of 1/(1+e^-s) for an increase and 1/(1+e^s) for a decrease, and it's easy to work out the probability from there).
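
The "few easy steps" can be sketched as follows. For a random walk with step size 1, absorbing boundaries at 0 and N, and up/down probabilities 1/(1+e^-s) and 1/(1+e^s), the ratio down/up equals e^-s, and the standard gambler's-ruin argument gives a fixation probability from i copies of (1 - e^(-s*i)) / (1 - e^(-s*N)). The code below is an illustrative check, with function names and parameter values chosen here rather than taken from the comment; it solves the recurrence directly and compares it with that closed form.

```python
import math

def fixation_closed_form(i, n, s):
    """Gambler's-ruin fixation probability starting from i copies out of n."""
    if s == 0:
        return i / n
    return math.expm1(-s * i) / math.expm1(-s * n)

def fixation_by_iteration(n, s):
    """Solve h(i) = up*h(i+1) + down*h(i-1) with h(0) = 0 and h(n) = 1.

    Returns the list [h(0), h(1), ..., h(n)].
    """
    up = 1 / (1 + math.exp(-s))
    down = 1 / (1 + math.exp(s))
    # The increments d(i) = h(i+1) - h(i) satisfy d(i) = (down/up) * d(i-1),
    # so they form a geometric sequence that is normalised to sum to 1.
    ratio = down / up
    diffs = [ratio ** i for i in range(n)]
    total = sum(diffs)
    h = [0.0]
    for d in diffs:
        h.append(h[-1] + d / total)
    return h

n, s = 1000, 0.01
h = fixation_by_iteration(n, s)
print(h[1], fixation_closed_form(1, n, s))
```

The two printed values agree to within rounding, which confirms that the closed form is the solution of this particular walk. It does not by itself validate the reduction from the full Moran model to the walk.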

Unknown said...

@ S Johnson: "If you could plot the number of alleles versus time of fixation, there should be a bimodal distribution I should think."
Kimura and Ohta (1968) give a formula for the mean time to fixation in the case of allelic selection. This is a monotonically decreasing function as s increases (and continuous), which means that the time to fixation would only be bimodally distributed if selection coefficients were. However Nielsen and Yang (2003) got a normal distribution for the data sets they tested. Earlier approaches used gamma distributions which are also unimodal.

@Arlin: "certainly there are conditions under which attributing X% of fixations to drift *is* a logically possible attribution"

I don't think so. I think a lot of the selectionist/neutralist debate was badly phrased, because it carried the drift/selection distinction as a legacy from the second wave of the MS. Borges (2005) argues that Dobzhansky and Mayr did not fully understand the mathematics of Fisher, Haldane and Wright and that this led to a verbalization of the theory. I would add that it led to the reification of selection and drift as processes.

"If your theory allows us to say nothing about the roles of different causes in evolution, then it isn't a very useful theory of evolution, is it?"

I have two comments on this:
a) A theory has to be a couple of things to be useful, mainly it has to be testable and predictive. It does not have to offer a causal explanation (in fact it can not offer such an explanation, because we can not distinguish between two theories that make precisely the same predictions empirically and we can note that multiple causal interpretations exist for quantum mechanics, but even for classical mechanics this holds true - the traditional version established by Newton explains changes in motion by Forces. The commonly used Lagrange formalism instead uses the least action principle and does not use Forces anywhere. It is mathematically equivalent however. Newtonian mechanics is patently useful, but it is causally agnostic).
b) I do not think that drift and selection are different causes of evolution. Hence I do not consider it a weakness that I can not say anything about their different roles.

Anonymous said...

I meant: Alleles should disappear or go to fixation in SPECIES WITH fragmented populations as they do in panmictic SPECIES

AllanMiller said...

Arlin,

If we are considering the behavior of a large collection of events, this has to be adjusted accordingly to reflect the aggregate influence of selection over all the relevant values.

This is implicit if you look at neutrality in the way that I was suggesting, as a comparison of 2s with 1/(2N).


Yes, but my point would be that the 'aggregation of events' takes place at levels other than that of bulking subpopulations into larger ones. Multiple alleles are being fixed simultaneously, in 'typical' sexual populations, and a succession of such alleles is being poured in. All of them may be below the 'neutral' threshold, yet the population may still adapt.

I don't see this as important as such, merely supportive of the argument that it can be paradoxical to argue for separation of drift and selection. There is a single sampling process at work, with varying degrees of bias.

Unknown said...

@Allan: I would see this as important, because it allows us to resolve distributions of selection coefficients even within the near neutral range using phylogenetic methods.

To add another good reason to treat drift and selection as one process, consider the following:
A gene has 4 alleles A1, A2, B1 and B2. A1 and A2 differ in a synonymous site and code for the same protein, the same holds for B1 and B2, but the As differ from the Bs in the AA sequence and lead to different proteins. Assuming allelic selection and initial frequencies of 0.25 for all 4 we find that a value of s for A1 of 0.00001 corresponds to a value of s of 0.000015 for the protein coded for by the As. For a population size of 20000, all 4 nucleotide sequences have abs(s)<1/(4N), but both proteins have abs(s)>1/(4N).
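A quick arithmetic check of this example, as a minimal sketch: it only compares the two quoted values of s with the 1/(4N) rule of thumb from earlier in the thread, and does not re-derive the protein-level coefficient. The variable names are illustrative.

```python
# Compare the quoted selection coefficients with the 1/(4N) rule of thumb.
N = 20_000
threshold = 1 / (4 * N)      # 1/(4N) for the population size in the example

s_nucleotide = 0.00001       # s quoted for the A1 nucleotide sequence
s_protein = 0.000015         # s quoted for the protein encoded by A1 and A2

for label, s in [("nucleotide allele", s_nucleotide), ("protein", s_protein)]:
    side = "below" if abs(s) < threshold else "above"
    print(f"{label}: |s| = {s:.1e} is {side} 1/(4N) = {threshold:.2e}")
```

The same variant therefore falls on different sides of the "effectively neutral" line depending on whether it is counted as a nucleotide sequence or as a protein.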

Anonymous said...

Simon, I've given some thought to your statement that "An allele is beneficial if s>0, which is equivalent to stating that its presence is correlated with increased fitness" and "An allele is adaptive if it causes higher fitness" would not be equivalent. An allele that increases adaptation is, by definition, beneficial and increases fitness and has s>0, and a beneficial gene is beneficial because it increases adaptation.

So, is your statement that these are not equivalent a "correlation is not causation" thing? I suppose that could be an issue here. There should be some alleles that have s measured as >0 because they're increasing with increased fitness but not because they cause an increase in adaptation, and not because they would have s>0 on their own, but because they are closely linked with some beneficial allele. Is this what you're after?

As to math and biologists, when I wrote "too many student minds shut down at the sight of even the simplest equations," I was attempting a bit of humor, though this math fear is a real problem, too. Much though I love teaching biology majors, I teach mostly mixed or non-major classes. I find that starting with words first seems to work better. Then we cover the math as needed, often at a very basic level.

Your emphasis that what we call natural selection and genetic drift are parts of the same process is important and I've learned a lot from it. It will impact my teaching. However, I will still use the terms selection and drift because I think they help communicate about some important differences between these aspects of the one process. (Good, fruitful approximations can be good places to start.)

Larry Moran said...

Simon Gunkel says,

I think that confusing drift with neutrality is one of the most common misconceptions about evolution people have ...

Hmmm ... the most common misconception is that people have never even heard of random genetic drift. However, I agree with you that among those who are somewhat knowledgeable about evolution, there's a lot of confusion about the difference between random genetic drift and Neutral Theory. Richard Dawkins, for example, hardly ever talks about random genetic drift—he uses "neutral evolution" instead.

I like to illustrate the distinction by pointing out that the vast majority of beneficial alleles (s > 0 in a reasonable population size) are lost before they become fixed. Since this change in allele frequencies (allele extinction) doesn't occur by natural selection, there must be something else going on. That something is called random genetic drift in all evolution textbooks.

Then I point to examples of fixation of deleterious alleles and emphasize that these are not neutral alleles and they are not fixed by natural selection.

Simon, it's unclear to me how you would explain these examples to the average student if you don't think there's a real distinction between natural selection and random genetic drift. Can you give me your explanation?

I'm also not sure that this is what Larry is doing. He usually does use neutral when he talks about neutral alleles.

Right. I've written about this many times.

Unknown said...

"So, is your statement that these are not equivalent a "correlation is not causation" thing? I suppose that could be an issue here. There should be some alleles that have s measured as >0 because they're increasing with increased fitness but not because they cause an increase in adaptation, and not because they would have s>0 on their own, but because they are closely linked with some beneficial allele. Is this what you're after?"

Yes, although linkage disequilibrium is only one possible reason for s>0 for an allele that is non-adaptive (though it's likely that it's responsible for most cases). Alternatives include multi-gene scenarios, where the allele in question is not linked to any of several adaptive alleles, but its presence is correlated with the presence of several of them (basically the difference between mutual independence and pairwise independence).

The "maths and biologists" thing is something I think about a lot. I had a paper under review recently and there was a typo in a formula (in an earlier draft I had (1-p) with p giving a frequency and changed it to (N-n)/N with N the sample size and n the number of individuals which counted towards the frequency. In one of the formulae, this ended up as (1-n)/N). Neither the editor nor the reviewers commented (in fact there were no comments on the entire section dealing with the statistical methods) and I caught it when I got the proofs and checked whether the maths had survived formatting. It's a rather obvious typo if you read through that section and digest the formulae. So I think that section might have been skipped by the entire chain of reviewers. It's not something that stops with the students.

Unknown said...

I think the main issue is that you start out with an idea about selection that isn't right. That's the crux of the matter really, because the drift/selection distinction arose from a shift in the meaning of selection during the 2nd phase of the modern synthesis. From Darwin up to and including the first phase of the MS, selection was a stochastic resampling process. The Fisher-Wright model was supposed to be a model of selection (and it does accurately model the neutral case and gives correct probabilities for the fixation of detrimental alleles), the same holds for the Moran model and the diffusion approximation as well. These describe a population resampling process. But these models, as far as they were available, were computationally unwieldy in the early 20th century. For this reason a range of approximations were derived, from Haldane's 2s which you like to use, to simple deterministic models in which allele frequency in time follows a logistic curve (it should be noted that under this model, fixation never occurs!). Drift here is basically an error term - the difference between the approximation and the full model. And arguments between Fisher and Wright on the role of drift are basically a debate on how well the approximations match the biological reality, not on how important two different processes are.
In the 2nd wave this was reified - Selection was not approximated by the approximate models, selection WAS the approximate models. And Drift became a real thing as well. It's a conceptual shift that doesn't really help things.

So I would start out with a simple model of population resampling - my personal choice would be a simplified Moran model (birth/death pairs, ignoring the ones that do not change allele frequency). You then end up with a model that is basically a biased coin. In the neutral case it is a fair coin, the probability of increasing the number of A alleles by 1 and reducing that of B alleles by 1 is (1+e^-s)^-1, which of course is 0.5 when s=0. That's where I'd like students to run a couple of simulations (the model is simple enough that an R script is doable in 10 minutes or so).
Then I'd derive the probability of fixation from this. At that point one can introduce the approximations (deriving 2s could be a homework assignment) and at that point one can mention that the difference between the approximation and the full model is sometimes referred to as drift. And emphatically that it's not a different process.
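The biased-coin model described in this comment can be sketched in a few lines. This is a minimal Python sketch rather than the R script mentioned, and all function names are hypothetical. It assumes a haploid population of n copies in which every frequency-changing event moves allele A up by one copy with probability 1/(1+e^-s) and down by one otherwise. Under that assumption the standard gambler's-ruin result gives the exact fixation probability from i copies as (1 - e^(-s i)) / (1 - e^(-s n)), which reduces to i/n when s = 0.

```python
import math
import random

def step_up_probability(s):
    """Probability that a frequency-changing event adds one copy of A."""
    return 1.0 / (1.0 + math.exp(-s))

def exact_fixation_probability(s, i, n):
    """Gambler's-ruin fixation probability starting from i of n copies."""
    if s == 0:
        return i / n
    return math.expm1(-s * i) / math.expm1(-s * n)

def simulate_fixation(s, i, n, replicates, rng):
    """Fraction of simulated walks that reach n copies before reaching 0."""
    p_up = step_up_probability(s)
    fixed = 0
    for _ in range(replicates):
        copies = i
        while 0 < copies < n:
            copies += 1 if rng.random() < p_up else -1
        fixed += copies == n
    return fixed / replicates

rng = random.Random(1)  # fixed seed so runs are repeatable
for s in (0.0, 0.05, 0.2):
    simulated = simulate_fixation(s, i=5, n=50, replicates=2000, rng=rng)
    exact = exact_fixation_probability(s, 5, 50)
    print(f"s = {s}: simulated {simulated:.3f}, exact {exact:.3f}")
```

Setting s = 0 gives the fair coin, and the only thing that changes as s grows is the bias of the same coin.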

Larry Moran said...

I see.

You are arguing about detailed models of selection.

... at that point one can mention that the difference between the approximation and the full model is sometimes referred to as drift. And emphatically that it's not a different process.

That's emphatically not very useful in the real world. It would be like describing the differences between the human and chimp genomes (mostly in junk DNA) as due to difference between a full model of selection and an approximation.

I think the main issue is that you start out with an idea about selection that isn't right.

What you mean to say is that anyone who doesn't see real biology as corresponding to your favorite mathematical model of selection isn't thinking about selection properly.

AllanMiller said...

I don't think it's just down to the congruence between 'real biology' and a model. Real biology is a sampling process. Even with no bias, alleles become fixed - evidently, that can only be ascribed to that which is generally agreed to be 'drift'. That process is purely down to compounded sample error. Adding a tiny bias to the sampling process turns down this drift only marginally. Turning the bias up naturally turns drift down. One can make causal statements about the implementation of that bias - there is clearly a reason why s<>0. But I think the neutral case is a good baseline, and somewhat counter-intuitive. Selection is a variable layer upon the baseline process, and drift is diminished by it. They are different things certainly in as much as we can talk about them - we can talk about bias, and we can talk about sample error. But I think that the pedagogical point is a useful one, that starting with the neutral case and building selection upon it is a good way to go. And it is clearly the case that there is one process in operation, with two components.

Unknown said...

"That's emphatically not very useful in the real world. It would be like describing the differences between the human and chimp genomes (mostly in junk DNA) as due to difference between a full model of selection and an approximation."

But it would be correct. Under Haldane's 2s approximation the substitution rate for junk DNA is 0. Quite obviously the approximation is wrong. It's worth noting that empirical studies of the distribution of selection coefficients, which allow us to make statements about how values for s are distributed among novel alleles and fixed alleles, are making use of complete models, either using Kimura's diffusion model or the Fisher-Wright process in conjunction with a matrix model of substitution (from the simple JC, through the Fs to UTR).

"What you mean to say is that anyone who doesn't see real biology as corresponding to your favorite mathematical model of selection isn't thinking about selection properly."

I don't have *a* favorite model of selection. I've given 3 models (Fisher-Wright, Moran and diffusion) and they all agree on key features of how populations behave under resampling. They agree on probabilities of fixation, time to fixation, time to loss... Since there seem to be no takers on my question: They do disagree on the answer (under Fisher-Wright drift is more likely to decrease allele frequency, under Moran drift is more likely to increase allele frequency and under Kimura both increase and decrease have the same probability).

Of the 3 I would prefer starting a course with the Moran process, because it has the simplest mathematics.

Do you have an alternative model?

Petrushka said...

As an untutored outsider, I find this debate fascinating.

Joe Felsenstein said...

Maybe I just don't "get" what the argument is here. Genetic drift is always present. Selection may or may not be, if s can be 0.

Whether selection is "important" depends on the value of 4 Ne s, where Ne is the effective population size. For the Moran model Ne is (close to) half of the actual population size, so it undergoes genetic drift twice as fast as a Fisher-Wright model with population size N. Haldane's approximation is the limiting case when we have one copy of the mutant allele and N is very large.

The value of 4Nes = 1 is of course not magic, it is just that above it selection has a more and more noticeable effect and below it selection rapidly becomes unnoticeable.
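To see how gradual the transition around 4Nes = 1 is, here is a minimal sketch using Kimura's diffusion formula for the fixation probability of an additive allele, u(p) = (1 - e^(-4 Ne s p)) / (1 - e^(-4 Ne s)). The specific choices are assumptions for illustration: a diploid population with Ne equal to the census size N, a single new copy (p = 1/(2N)), and the function name.

```python
import math

def kimura_fixation_probability(s, n_e, p):
    """Diffusion approximation u(p) for an allele with heterozygote advantage s."""
    if s == 0:
        return p
    return math.expm1(-4.0 * n_e * s * p) / math.expm1(-4.0 * n_e * s)

n = 10_000            # assumed population size, with Ne = N
p0 = 1 / (2 * n)      # one new mutant copy in a diploid population

for four_ns in (0.01, 0.1, 1.0, 10.0, 100.0):
    s = four_ns / (4 * n)
    u = kimura_fixation_probability(s, n, p0)
    print(f"4Ns = {four_ns:>6}: u/neutral = {u / p0:8.3f}   u/(2s) = {u / (2 * s):8.3f}")
```

For small 4Ns the ratio to the neutral value 1/(2N) stays close to 1, and for large 4Ns the ratio to Haldane's 2s approaches 1, with a smooth crossover in between rather than a sharp threshold.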

I probably don't understand what Simon's question was. Like thermal noise in physics, genetic drift is always present. Like gravity in physics, selection may or may not be important. A cloud of fine rock dust in water will settle if it is left alone, but end up with not quite all of them at the bottom. They will spread out in a distribution, most of them near the bottom, as a result of Brownian Motion. How close to the bottom they are depends on some parameter that involves the size of the dust grains and the water temperature. 4Ns plays a similar role here. (The analogy is not exact because there is no analogue to fixation in the rock dust case).

So what are you arguing about?

Joe Felsenstein said...

OK, I read a bit upthread and see Simon asking this:

An allele has a frequency of 0.1 and a selection coefficient of 0.00001 in a population of 10000. Which statement is true:
a) It is more likely for the allele frequency to increase by drift than to decrease by drift.
b) It is more likely for the allele frequency to decrease by drift than to increase by drift
c) Both an increase and a decrease due to drift are equally likely


As far as I can see (a) Haldane's approximation does not apply (initial gene frequency is 0.1 so there is more than one initial copy, but maybe he wants us to assign probability of fixation 2s separately to each copy). Under both WF and Moran models gene frequency is a tiny bit more likely to increase, both in the next generation and ultimately. WF and Moran models differ a bit in the ultimate fixation probabilities owing to their difference in effective population size.


AllanMiller said...

Joe - I had assumed that 'by drift' was a key point. Such an allele is more likely to increase (despite being effectively neutral in a single run), but this can only be because of selection.

Joe Felsenstein said...

Yes, thanks for pointing that out. We have to somehow get rid of selection to answer Simon's question. Yet he specified that it was (weakly) present.

I guess I'm confused and don't see what the question is in this part of the thread.

Piotr Gąsiorowski said...

Me too. I've learnt a lot from just following it. Thanks to all involved.

judmarc said...

Maybe for Simon, saying selection is weakly present is the same as saying there's a relatively large role for what others would call drift (as he seems to be arguing for no hard division between them but a continuum)?

Anonymous said...

Rum,

Don't be coy... You are talking to me-Quest...and not some other people Dino-paranoiac accuses to be my puppets...

Unknown said...

"Yes, thanks for pointing that out. We have to somehow get rid of selection to answer Simon's question. Yet he specified that it was (weakly) present.

I guess I'm confused and don't see what the question is in this part of the thread."

Well, if drift and selection are separate processes, then it should be possible to isolate them. The way I've seen that argument made explicitly (in Okasha 2009 for instance) was to partition the change in allele frequency predicted by selection+drift models into a component from the model sans drift and the remainder. For each of the models the change in allele frequency in a single step is given by a random variable. In FW it's binomially distributed, in Moran it's Bernoulli distributed and in diffusion it's normally distributed. The drift-free models replace these random variables with their expected values. In FW the change for the next generation is right-skewed and the skewness is not affected by altering the mean of the RV. Under Moran it is left-skewed and again the change of the mean does not affect this. Under Kimura the distribution is symmetrical and remains this way.
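For the Fisher-Wright case, the single-generation distribution can be inspected directly. The following is a minimal sketch under stated assumptions: a haploid population of n copies, fitness weight e^s for the focal allele, and a hypothetical function name. It sums the binomial probabilities for ending up below, at, or above the current copy number, using the numbers from the question posed upthread (frequency 0.1, s = 0.00001, population 10000). It prints the values without asserting which answer they support.

```python
import math

def wf_next_generation_probabilities(p, s, n):
    """Return (P(decrease), P(no change), P(increase)) in copy number after
    one Wright-Fisher generation of a haploid population with n copies."""
    k0 = round(p * n)                                    # current copy number
    q = p * math.exp(s) / (p * math.exp(s) + 1.0 - p)    # sampling probability
    log_q, log_not_q = math.log(q), math.log1p(-q)
    decrease = same = increase = 0.0
    for k in range(n + 1):
        # log of the binomial probability of drawing k copies out of n
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * log_q + (n - k) * log_not_q)
        mass = math.exp(log_pmf)
        if k < k0:
            decrease += mass
        elif k == k0:
            same += mass
        else:
            increase += mass
    return decrease, same, increase

for s in (0.0, 0.00001):
    down, same, up = wf_next_generation_probabilities(0.1, s, 10_000)
    print(f"s = {s}: P(decrease) = {down:.5f}, P(same) = {same:.5f}, P(increase) = {up:.5f}")
```

Running it with and without the small s shows how much of any asymmetry comes from the shape of the binomial and how much from the shift in its mean.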

@judmarc: I'm not even arguing a continuum. I'm arguing that there's a single process in which allele frequencies change due to organisms being born or dying. This process is stochastic. We can sometimes get away with an approximation using the law of large numbers. But treating that approximation as a thing in itself is bogus. The population genetics models tell us something useful about how populations evolve. Saying that FW is a model of selection and drift does not tell you anything useful at all. What you want to know are things like "when can I approximate things using s=0?" or "when can I approximate fixation probabilities with 2s". Nothing of substance comes from debating drift vs. selection (the substantive part of these discussions is entirely about distributions of values for s).

Joe Felsenstein said...

OK, I see. When I teach students population genetics, I eliminate selection by making all fitnesses equal. I eliminate genetic drift by making the population size infinite. That lets us see what the effects of either along are.

We can do this with theory, or with computer simulation. I use my own lab's genetic simulation program PopG, which simulates a number of populations of size N at a single locus. It also shows the result, in one population, of having infinite population size.

Emphasizing the differences among mathematical models is a good thing unless was overdoes it and thinks that they therefore don't tell us anything.

When Simon says that

This process is stochastic. We can sometimes get away with an approximation using the law of large numbers. But treating that approximation as a thing in itself is bogus.

I'm not sure whether he and I are disagreeing, whether in his view examining what happens in the absence of selection and in the absence of genetic drift constitutes "treating the approximation as a thing in itself". That phrase sounds like a discussion of angels dancing on the heads of pins. The issue of whether or not the process is "a thing in itself" strikes me as unimportant as long as we can discuss the effect of selection and the effect of genetic drift separately, then combine them.

It's necessary for student understanding. If Simon teaches population genetics, he will find this. I am surprised to hear of him "start[ing] a course with the Moran Model". If he has found that students of biology can handle this, I am happy for him. Personally, I start with the Hardy-Weinberg proportions. Here are some relevant links to my teaching:

1. Genome 453, an upper-undergraduate course in Evolutionary Genetics. Note that you can download the projection PDFs and listen to audio recordings of my lectures. Here is the site for the most recent time I gave the course (2013).

2. Genome 562, a graduate course in theoretical population genetics. Also has audio recordings. Some material was presented with projections but mostly the format is chalk-talk. The equations are found in my free, downloadable book "Theoretical Evolutionary Genetics". The 2013 course web site is available -- I will start teaching it again on January 5.

Joe Felsenstein said...

It's early morning here so some typos occurred:

"the effects of either along are" should have been "the effects of either alone are"

"unless was overdoes it" should be "unless one overdoes it"

S Johnson said...

I had myself convinced that the topic of the thread was about how meaningful very small selective advantages are. I was under the impression that one could always calculate them, but the difficulty was that the calculation might be tautological, mathematical artifacts that assumed all differences in gene frequencies are due to natural selection. And I also thought the topic mattered because panselectionism is a problem for evolutionary theory (an opinion I formed after reading some evolutionary psychology.)

I have seen here that one can't use differences in fixation times, since s can be assumed to be constant for any period of time. I don't get that to be honest. Actually, I don't even understand how s can be assumed to be constant...Are all traits just as advantageous to an organism when the vast majority of a population shares it as it was when only a few did?

Nor can one use population changes to sort the effects of drift, since s measures relative, not absolute advantage, although I don't quite understand that either. And apparently population regulation is something independent anyhow, which I didn't know. I'm still embarrassed that I was caught forgetting that genetic drift is rapid in founder effects.

I get the impression that somehow this is all to be understood, in many fundamental respects, as sampling, a process in which there is no separate existence of causal forces like natural selection or drift. But if it's a matter of sampling, then natural selection is signal and drift is noise. Saying there's no distinguishing them sounds completely cracked but I can't figure out how I'm misunderstanding.

And I don't understand why the implications from biochemistry that mutations tend to be neutral or nearly neutral are irrelevant. I don't even understand why the rough similarity between rate of fixation of most alleles to the mutation rate doesn't suggest that most are fixed by drift. Nor do I understand how the relative constancy with which molecular clocks keep time for a species/genus can correlate with the efficiency of genetic repair mechanisms...unless that's a false "fact" that's confusing me.

In short, although everyone else is sure that Larry Moran is wrong, I don't understand why. And I'm even more confused now than when I started. I suppose the moral of the story is that I should give up science.

Unknown said...

"When I teach students population genetics, I eliminate selection by making all fitnesses equal. I eliminate genetic drift by making the population size infinite. That lets us see what the effects of either alone are."

That's the issue though. You discuss drift in the neutral case - and that's something that leads to at least some students confusing drift and neutrality. In addition, something that students will most likely assume is that the principle of superposition holds, i.e. that these are additive. That's not true, as you know. Adding the expected change per generation under an infinite population model to the neutral case in WF gives you an equation that is not equal to the WF model with selection.
To make that explicit, let's consider an allele in a haploid population with discrete generations:
The allele frequency in the next generation of an infinite population is
f'=pe^s/(pe^s+1-p)
In the neutral case using WF it is binomially distributed
f'~Bin(p,N)
Add these and you get: f'=pe^s/(pe^s+1-p)+X; X~Bin(p,N)
However the complete WF model for this case is
f'~Bin(pe^s/(pe^s+1-p),N)
It should be noted that the additive model assigns positive probabilities to allele frequencies greater than 1 when s>0 and to allele frequencies <0 when s<0. Yet, the additive model is what talking about selection and drift as separate processes, or as forces implies.
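The mismatch between the two models can be made concrete with a minimal sketch. It rests on one reading of the additive model that is an assumption here: the next-generation frequency is the neutral binomial sample, expressed as a frequency, shifted by the expected change from the infinite-population formula. The function name and dictionary keys are hypothetical.

```python
import math

def compare_additive_and_full_wf(p, s, n):
    """Moments and support of the additive model versus the full haploid WF model."""
    p_sel = p * math.exp(s) / (p * math.exp(s) + 1.0 - p)  # infinite-population f'
    delta = p_sel - p                                       # expected change from selection
    return {
        # additive model: f' = delta + X/n with X ~ Binomial(n, p)
        "mean_additive": p + delta,
        "var_additive": p * (1.0 - p) / n,
        "min_additive": 0.0 + delta,   # neutral sample contains no copies of the allele
        "max_additive": 1.0 + delta,   # neutral sample contains only the allele
        # full model: f' = X/n with X ~ Binomial(n, p_sel)
        "mean_full": p_sel,
        "var_full": p_sel * (1.0 - p_sel) / n,
        "min_full": 0.0,
        "max_full": 1.0,
    }

for s in (0.5, -0.5):
    result = compare_additive_and_full_wf(p=0.3, s=s, n=100)
    print(f"s = {s}")
    for key, value in result.items():
        print(f"  {key}: {value:.5f}")
```

Under this reading the two models agree in their means but not in their variances, and the additive version can leave the interval from 0 to 1, above 1 when s > 0 and below 0 when s < 0. The large s values are chosen only to make the difference visible.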

"The issue of whether or not the process is "a thing in itself" strikes me as unimportant as long as we can discuss the effect of selection and the effect of genetic drift separately, then combine them."

But that's not how we do it, is it? If we worked this way, then we would actually use some formula for drift in the case of weak selection. But we don't. We have law-of-large-numbers approximations, we have the neutral case and we have models that incorporate drift and selection. We do not get to these models by combining an infinite population model with a neutral model. We get to them by starting out with treating the reproductive success of individuals as random variables and then going from there to populations.

I'm not teaching population genetics. I currently don't do any teaching. I did recently pass on your book to a colleague who wants to brush up his population genetics. I would not start such a course with the Moran model and HWE is definitely something I'd cover earlier. But the simplified version is easier IMO than most alternative ways to get into models of change in populations.

Joe Felsenstein said...

You are right that the processes of genetic drift and of selection are not additive. However, when 4Ns is small they do (approximately) add. Considering that can help students' intuition as to what is going on.

Tom Mueller said...

Thanks all around to everybody for a most illuminating thread!

OK, that said - it appears that I may be confused and way over my depth…

L.M. Most people think that all evolution is due to natural selection

I humbly repeat myself – Larry, I earnestly believe you are over-stating your case; that said, your assumed role as Socratic gadfly remains a welcome reminder and reality check.

However, every introductory textbook I know deals with much of the above when discussing Hardy Weinberg, random genetic drift and Neutral Theory. Bottle-necks and founder effects are even cited to explain the fixation of deleterious alleles in every introductory text I know. We have discussed this on several occasions.

L.M. : I understand why fans of adaptation want to treat random genetic drift as just a variant of selection but that's not very helpful.

I am not exactly clear what Larry means by fans of adaptation and I strongly suspect Larry may be setting up a Straw man argument.

Please refer to
http://sandwalk.blogspot.com/2014/10/nature-criticizes-science-hyperbole-and.html?showComment=1417808999992#c4727783262835368592

This is probably the part I do not get:

Granted – much of the genome is “junk” and a “great deal” of morphological variation may indeed be neutral… I still have the nagging intuition that much evolutionary “shaping” (as Larry phrases it) is not all so "accidental" when considering the multifarious examples of convergent evolution that first swayed Darwin during his voyage on the Beagle.

http://tinyurl.com/nlllsf5

Clearly, Larry is not suggesting that Natural Selection accounts for only 1% of the convergent evolution story of those different anteaters.

I hope I got that right… so what exactly are we debating here? That none of the anteaters are perfectly adapted from an intelligent design POV? Uhmmm, OK, that’s too easy, there must be more…

How about that every one of those anteaters possess a lot of uninteresting junk DNA? Uhmm… still too easy, there still must be more happening on this thread that I am not clear about.

Albert Einstein once claimed: “If you can't explain it to a six year old, you don't understand it yourself.”

Clearly, I do not understand this all very well myself! What exactly is the distilled take-home-message I should be imparting to my students?

Tom Mueller said...

@ Joe
How small does a selection coefficient have to be to have so little effect (in the face of genetic drift) that they have essentially no effect on the probability of the mutant getting fixed?...

In a population of effective size N, it [s] must be less than 1/(4N).


Damn – that was a shocker! Not to mention the other gems that followed.

Joe – I have cited you & Larry more than once on the AP teachers’ forum. I just wanted to pass along grateful regards to both of you. (..not to mention everybody else present who elaborated on Joe's overture to this thread)

I echo Barbara's comments above - Population Genetics has come a long way since I last studied it in university. This summer, I will sit down and bring myself up to date.

Anonymous said...

Dino-gene,

We're even then...

S Johnson said...

"However, every introductory textbook I know deals with much of the above when discussing Hardy Weinberg, random genetic drift and Neutral Theory. Bottle-necks and founder effects are even cited to explain the fixation of deleterious alleles in every introductory text I know."

This is an excellent example of how I find this discussion so confusing. I would have sworn that neutral theory addressed evolution as change in gene frequencies, whereas drift addressed evolution as variation in phenotype. And that discussing fixation of deleterious alleles as due to bottle-necks and founder effects means that alleles and/or traits are selected all other times, a pan-selectionism that contradicts the notion of drift. Drift is why not even natural selection can make a population stay the same forever, why eventually speciation must occur...which, looking at the history of life on earth, does seem to me to be the case.

"Granted – much of the genome is “junk” and a “great deal” of morphological variation may indeed be neutral… I still have the nagging intuition that much evolutionary 'shaping' (as Larry phrases it) is not all so 'accidental' when considering the multifarious examples of convergent evolution that first swayed Darwin during his voyage on the Beagle."

I could have sworn if we tackled the issues from this perspective, the power attributed to natural selection would have remolded all species to their changing environment, leaving a fossil record of continuous change, without extinction, save for exotic catastrophes like comets or supervolcanoes or whatnot. I suppose the population biologists have refuted the paleontologists who believe that natural selection tends usually to maintain species shape, within some degree of variation, especially neutral variation on the genetic level, more or less through the course of a given species' existence, until it goes extinct and novel species appear with geological rapidity.


Larry Moran said...

A creationist can be excused to be surprised about the thread.

Only on the grounds that just about everything scientific is a surprise to them.

AllanMiller said...

S Johnson

This is an excellent example of how I find this discussion so confusing. I would have sworn that neutral theory addressed evolution as change in gene frequencies, whereas drift addressed evolution as variation in phenotype.

No, that's not really it. Variation in phenotype depends upon the heritability of the genotype, and upon environmental influence, but this has nothing to do with drift per se.

I think the best way to look at evolution is as a sampling process. The new generation 'samples' the alleles of the old. All samples imperfectly represent the distributions of the wider population from which they are drawn. Therefore some alleles will be over-represented and some under-represented, compared to the parental population. Repeat sampling compounds the distortions, and the inevitable result is that a single allele at a locus becomes the last standing. That's neutral drift - it is compounded sample error.

Sampling processes can be biased. That's where selection comes in. A systematic bias in favour of or against one allele can improve its chances in this sampling lottery. This distortion is represented by s. It's not 'really' constant - across a range, or in time, things are never so neatly proportioned as to give a neat mean value for the increase or decrease in offspring number associated with an allele. But it is still instructive to consider constancy. A tiny change in s from zero to a positive number means that, periodically, the allele benefits its possessors. They gain more offspring than non-bearers. But drift is still acting large - sample distortions are still the dominant force for tiny s. Turn s up, you turn drift down.
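
The sampling picture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a haploid Wright-Fisher population of constant size n, a constant s, and relative fitness 1 + s for the favoured allele. The function names `next_generation` and `fixation_fraction` are made up for this sketch and do not come from anyone in the thread.

```python
import random

def next_generation(count, n, s, rng):
    """One Wright-Fisher generation for a haploid population of size n.

    count is the number of copies of allele A; s is its selection
    coefficient (assumed relative fitness 1 + s versus 1 for the other allele).
    """
    p = count / n
    # Selection is a bias in the sampling probability; s = 0 leaves it at p.
    p_biased = p * (1 + s) / (p * (1 + s) + (1 - p))
    # Drift is the sampling itself: n independent draws from the parents.
    return sum(1 for _ in range(n) if rng.random() < p_biased)

def fixation_fraction(n, s, replicates, seed=0):
    """Fraction of replicates in which one new copy of A reaches fixation."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        count = 1
        # Repeat the sampling until the allele is lost (0) or fixed (n).
        while 0 < count < n:
            count = next_generation(count, n, s, rng)
        if count == n:
            fixed += 1
    return fixed / replicates

if __name__ == "__main__":
    for s in (0.0, 0.01, 0.05, 0.2):
        print(s, fixation_fraction(n=20, s=s, replicates=2000))
```

With s = 0 the fixed fraction should sit near 1/n, which is pure compounded sample error. Raising s shifts the odds, but in a population this small most new copies are still lost by chance.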

Drift is why not even natural selection can make a population stay the same forever, why eventually speciation must occur...which, looking at the history of life on earth, does seem to me to be the case.

I wouldn't tie drift too closely to speciation per se. Mutation, recombination, drift and selection together are why anagenesis must occur - change in lineage. You get bifurcation (speciation) if there is a barrier to efficient genetic mixing, with independent anagenesis in the two lines due to a lack of cross-talk. Of course drift must be responsible for a good deal of the change. But once you have a barrier to mixing, it doesn't make much sense to talk of the populations being separated by either selection OR by drift. Those processes are taking place within each separate population. Between the populations, especially if they do not compete ecologically, there is simply no sense in which s values in one have any relevance to the other. There is just divergent change.

I could have sworn [...] the power attributed to natural selection would have remolded all species to their changing environment, leaving a fossil record of continuous change, without extinction, save for exotic catastrophes like comets or supervolcanoes or whatnot.

I don't really see why. If you have a bifurcation, with anagenesis in both lines, you get a gradual whittling away of alleles common to both lineages - that is, the ancestral genome. You don't need grand-scale extinction of the middle; it just ... fades away.

AllanMiller said...

And as a further point, it's not selection acting alone that has the power - it is the combination of selection and drift. Drift can shake alleles out of local optima. But this does not have to go to fixation - it does not have to have much palaeontological persistence to do work. It simply creates a subpopulation which can hit upon 'solutions' inaccessible to an ever-ascending view of the adaptive landscape.

Unknown said...

"And I also thought the topic mattered because panselectionism is a problem for evolutionary theory (an opinion I formed after reading some evolutionary psychology.)"

EP has bigger issues than drift. From the outset the discipline excluded phylogenetic thinking, and even the most hardcore panselectionists among biologists don't do that. In Cosmides and Tooby (1997), which is pretty much the manifesto of EP, they contrast phylogenetic approaches with adaptationist approaches and declare EP to be solely concerned with the latter.

5 years later, one of the premier journals in the field published Barash's review of Gould's "Structure" (Barash, 2002). In the review Barash chides Gould for inventing terms and picks out "autapomorphy", which of course came from Hennig and is standard fare in phylogenetics. But it shows a discipline shockingly unaware of the larger picture. The primer at least acknowledges that phylogenetics is useful; it is simply excluded from the discipline.

The final apex was reached on the website of A. Hejj, professor of EP in Munich (http://hejj.de/Dt/Evolpsy.htm), which states:
"Spricht ein Evolutionspsychologe von „evolvierten Verhaltenstendenzen“, also von solchen, die sich als Anpassung an die Umweltbedingungen der frühen Vorfahren aller Menschen entwickelt haben, so muss er keineswegs Anhänger des Darwinismus sein, also daran glauben, dass der Mensch „vom Affen“ abstamme. Angenommen wird lediglich, dass heute wirksame Verhaltenstendenzen durchaus eine lange Entstehungsgeschichte aufzuweisen haben."

This translates to
"When an evolutionary psychologist speaks of "evolved behavioural tendencies", i.e. those that developed as adaptations to the living conditions of the early ancestors of all humans, he does not have to be a supporter of Darwinism, i.e. believe that humans are descended "from apes". It's only assumed that behavioural tendencies active today have a long developmental history."

Because it actually is a slippery slope from "we don't consider common descent" through "we don't know how common descent works" to "we deny common descent"...

Unknown said...

"Actually, I don't even understand how s can be assumed to be constant..."
Simplified models. But you can relax this assumption and get a model where s is not constant. Kimura did work out some special cases, including the "mean neutral" case where E(s)=0 and found that this case is approximately neutral.

Unknown said...

Well yes, approximately they do. But I think it's relevant for students to know when they are using an approximation and when they are not, and they need to consider when an approximation is adequate and when it isn't. For instance, Larry uses the 2s approximation without checking whether it's appropriate and sometimes comes up with answers that are wrong because of this. I certainly agree that approximations can help to build students' intuition, but if they don't keep in mind that they don't apply in all situations, they can also mislead their intuition.

The same issue comes up in most fields. There are quite a few cases where people use normal approximations without checking whether that's a good idea for instance and these come up in a lot of places. And not checking whether a law of large numbers actually applies was a big issue in the financial crash of 2008.
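
As a rough check on when the 2s approximation mentioned above holds, here is a small sketch comparing it with Kimura's diffusion formula for the fixation probability of a single new mutant. It assumes a diploid population whose effective size equals its census size n and additive fitness effects. The function name `kimura_fixation` and the chosen values of n and s are only illustrative.

```python
import math

def kimura_fixation(s, n):
    """Diffusion approximation for the fixation probability of one new
    mutant copy (initial frequency 1/(2n)) with selection coefficient s.

    Assumes a diploid population with effective size equal to n.
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral limit
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)

if __name__ == "__main__":
    n = 1000
    print("s, Kimura, 2s, 1/(2N)")
    for s in (1e-6, 1 / (4 * n), 1e-3, 1e-2, 1e-1):
        print(s, kimura_fixation(s, n), 2 * s, 1 / (2 * n))
```

Under these assumptions, 2s tracks the diffusion result only when s is small and 4Ns is well above 1. For s far below 1/(4N) the result approaches the neutral value 1/(2N), and for large s the quantity 2s overshoots.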

Arlin said...

I just want to point out that philosophers have been debating related issues for about 10 years. Massimo Pigliucci has a readable piece on this topic that has pointers to the philosophical literature (http://rationallyspeaking.blogspot.com/2012/12/the-philosophy-of-genetic-drift.html).

The reason for this debate begins with the way that proponents of the Modern Synthesis established the belief that evolutionary biology is respectable and scientific and awesome because it has a rigorous foundation in a theory of population-genetic "forces". The forces act in the arena of a population, and produce microevolutionary behavior, which leads to macroevolutionary behavior. This is the way that textbooks describe the theory, and it has been absorbed by philosophers (e.g., http://plato.stanford.edu/entries/evolutionary-genetics/).

So, of course the philosophers get interested if it appears to be the case that the theory of forces presented in every textbook of evolution is flawed or problematic.

Simon, I like the way you refer to the idea of additivity. If the "forces" of population genetics were like newtonian forces, we could combine them according to simple rules of composition and resolution, in the way we can combine two vectors of force to get a resultant vector.

A classic example would be the deterministic mutation-selection balance for a deleterious allele, in which we have a force of mutation pushing the frequency up, and a force of selection pushing the frequency down, resulting in a simple equilibrium.
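
For readers who want to see that equilibrium numerically, here is a minimal sketch. It assumes the simplest possible version: a haploid population, a deleterious allele with fitness 1 - s, one-way mutation toward it at rate mu, and no drift. None of these details are specified in the comment, and the function names `step` and `equilibrium` are invented for the sketch.

```python
def step(q, s, mu):
    """One deterministic generation for the deleterious allele frequency q."""
    # Selection pushes the frequency down (allele fitness 1 - s versus 1).
    q_sel = q * (1 - s) / (1 - s * q)
    # Mutation pushes it back up (rate mu from the normal allele).
    return q_sel + mu * (1 - q_sel)

def equilibrium(s, mu, q0=0.5, generations=20000):
    """Iterate the recurrence from q0 and return the final frequency."""
    q = q0
    for _ in range(generations):
        q = step(q, s, mu)
    return q

if __name__ == "__main__":
    s, mu = 0.01, 1e-5
    print(equilibrium(s, mu), mu / s)
```

In this particular model the two pushes cancel at q = mu/s, which is the "simple equilibrium" of the forces picture. A diploid recessive allele would give a different expression.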

But I have argued elsewhere that this conception contributed importantly to a mistaken understanding of the interaction of mutation and selection as opposing forces in a zero-sum battle. The "forces" conception is designed for thinking about evolution as a process that occurs in the interior of an allele-frequency space, and it disintegrates when we attempt to apply it to (for instance) the mutation-limited case, in which mutation is not acting as a force but as a point process that introduces novelty. There was an era when classical population geneticists (e.g., Lewontin & Kojima, 1960) could claim to be modeling the dynamics of evolution without including any term for mutation, because the shifting of allele frequencies in the interior of the allele-frequency space (aka "evolution" in the Modern Synthesis) is not much influenced by mutation rates given their smallness, so that ignoring the mutation term has little impact on outcomes.

S Johnson said...

Thanks to Allan Miller for the courtesy in his effort to explain.

"Variation in phenotype depends upon the heritability of the genotype, and upon environmental influence, but this has nothing to do with drift per se."

When drift fixes an allele, I concluded the variation in the population phenotype decreases. But I thought when neutral evolution changes the DNA there is no observable change in the phenotype (which is what "neutral" means in this context). I concluded that the two addressed different kinds of evolution, a term which can be used to mean 1) change in gene frequencies in a population, 2) change in phenotype over generations, or 3) multiplication of species.

"You get bifurcation (speciation) if there is a barrier to efficient genetic mixing, with independent anagenesis in the two lines due to a lack of cross-talk."

I didn't think barriers are required for speciation, unless time is regarded as a "barrier."
If one could somehow attempt to breed a modern and ancient horseshoe crab, I would say there is a much greater probability of failure. Closed ring species may be rare but they are just the most striking example of how drift changes populations. Natural selection may change populations of course when the "barrier" doesn't change separate breeding populations but persistently different selection pressures. But it seemed to me that was not necessarily the case, but drift always happens.

"If you have a bifurcation, with anagenesis in both lines, you get a gradual whittling away of alleles common to both lineages - that is, the ancestral genome. You don't need grand-scale extinction of the middle; it just ... fades away."

I know you're trying to correct my misunderstandings, which is very confusing here, because I said this. I thought the absence of extinction was precisely the issue with attributing such extreme powers to natural selection. I thought the fossil record was not comprised of smoothly continuous lineages, even after making due allowances for its inevitable imperfections, and that there were in fact geologically abrupt extinctions and novelties.

Further, the discussion here has by majority opinion agreed on a selection coefficient of 1/4N as the threshold below which a mutation is effectively neutral. If barriers are so important in speciation, then this means the founder effect, not natural selection, is tremendously important in speciation. But if the effective population is large for the species' normal habitat, perhaps in the millions or even more, then the selection coefficient is correspondingly tiny, yet still effective, proving the incredible power of natural selection...and we have also been informed that the time for fixation and effects on absolute fitness (and thus population size) are irrelevant to the issue, too. This seems to be mathematical proof of a panselectionist perspective, yet I cannot figure out how this is compatible with other facts of nature, such as the existence of junk DNA; the existence of neutral mutations, such as in enzymes; any correlation between mutation rate and rate of fixation of alleles; the existence of vestigial organs; wide variations in species phenotype; extinction; common geologically rapid appearance of new species in the fossil record.

As much as I appreciate your efforts, I'm still lost. As near as I can tell, either I am confused by incorrect information as to the true state of affairs. Or I am misled by impure thoughts such as reality and causes which are irrelevant to real science. Or possibly the minority view is correct in this instance.

No doubt the last possibility should be dismissed, and myself along with it. Farewell.





S Johnson said...

PS Of course I forgot to correct a typo about neutral evolution which makes "no observable change in the reproduction of the phenotype." My apologies.

Anonymous said...

S Johnson, You've covered a lot of issues here. I'm glad to see you're confused -- nice to know I have company. And if we throw out all of us who are confused, this blog will have few readers left. I’m going to write about some things I think I know from this, so someone else can come along and enlighten both of us.

For one thing, I share the thought that some of this is an intense discussion of theoretical possibilities that may not apply to many real populations. (Example: Time of fixation may not matter in theory, but it can have a significant effect on the likelihood of fixation in real populations, given their swings in size and the prevalence of environmental change.)

It seems to me that neutral evolution is change in frequencies of alleles that are not selected for or against. Their selection coefficient s = 0. Some of these allele changes have no effect on the phenotype – changes in junk DNA, say, or synonymous changes that code for the same amino acid. In other cases, the phenotype does change but that doesn’t matter. For example, in a group of plants I’m working with, flowers can be white, pale yellow, or bright yellow and they are all visited by the same pollinators and they all set lots of seeds. Alleles for these flower colors may all have s = 0, compared to one another.

Genetic drift is a process of random genetic change. The conceptually “pure” case is the neutral case where s = 0, but random changes (drift) happen all the time. No matter how much “better” one allele is than another, as long as there are at least 2 alleles for a locus (gene) in the population, some part of the changes in their frequencies is due to chance, or to stochastic processes we can’t practically distinguish from chance.

Natural selection happens when some interaction with the environment (or within the organism) causes one allele to be associated with (or cause) an increase or decrease in reproductive success, compared to the other allele(s) at that gene. Much of the discussion has been Simon’s explaining that selection and drift are really the same process. What we call selection is just drift with a bias, like rolling loaded dice. True and important, but then he wants to jettison the term “selection” entirely, because of historical misuse (or miscalculation), and I just can’t go there. (The whole “resampling” theme is involved in explaining this, but I think “resampling” is a good term once you get it, but just confusing until you do.)

Another theme has been, do really small selection coefficients matter, and under what circumstances? Conclusion: yes, if the population is big enough and time is long enough, even if the population is fragmented, but lots of things can make effects of small selection coefficients hard to distinguish from neutrality.

A lot of time has been spent on what I would call purely nomenclatural issues – people disagreeing at length about what things should be called, and/or misunderstanding what other people have said. (I suspect that some of the (hopefully unintentional) insults fall in that latter category.)

You seem to want to know how these processes of allele change relate to “bigger picture” issues of speciation, extinction, and evolutionary change. These are important issues that books can be and have been written about, but I’ll tackle one in my next too-long reply. This all helps me solidify what I think I know.

Anonymous said...

S Johnson, Speciation is one of the things I think I know something about. There are (at least) four different overall patterns of speciation. The proportion of selection and drift involved differs, but they are both involved in all these things. (Simon would say, “Duh! They’re the same thing!”)

1. Anagenesis -- One population exists for a long period of time, but doesn't divide. Drift and/or selection happen. After a few million years, the members may be so different that they couldn't breed with those from millions of years ago (if the necessary time travel were available), so we can reasonably call the descendants a different species from the ancestor, but there's no clear boundary between the species.

Anagenesis can happen to rare or abundant species, but small populations tend to become extinct, so it seems to me it usually happens in big populations. Therefore, we can make predictions. If the population is big, selection may keep the population healthy and relatively unchanged, because selection can be effective, changes in a well-adapted population would be likely harmful, and random drift isn’t likely to bring harmful alleles to fixation, not to mention because the population may migrate to follow its habitat as the environment changes. However, large population size allows even slightly beneficial new alleles to increase, so even the large population will change slowly.

2. A divided anagenesis can happen. ("Bifurcation with anagenesis.") For example: When glaciers were extensive, a belt of forested habitat extended across southern North America. As glaciers retreated, forests moved north and thinned out in the middle (now the Great Plains). Numerous forest organisms were divided into east and west populations that became different. Some are now treated as species (among birds, Wood-Pewees, Screech-Owls, Indigo/Lazuli Buntings, etc.), and some are not (among birds, Dark-eyed Juncos, Flickers, Northern Oriole). No founder effects, but drift and selection happened. The separated populations become different because environments are different (causing different selection), random processes yield different results, and different mutations happen.

3. My favorite cases involve small populations and often founder effects. A small population crosses a barrier and becomes isolated, or a big widespread population retreats, leaving little relict populations here and there. Often the populations remain small for a long time. Fixation of slightly beneficial alleles is far from certain, and genetic drift gets the credit/blame for most of the changes. The most common results are re-absorption of the small population into the parental species or extinction. However, odd, unexpected results can happen. A new species with surprising new adaptations may develop and become common, sometimes replacing the parental species. Because somewhat differentiated, isolated populations are common in some plant species, I think this process happens a lot.

This third process is basic to Eldredge and Gould’s theory of punctuated equilibrium, which you allude to, in which abundant, widespread species are replaced by related species rather than changing from within (anagenesis). Their theory has been much battered since it came out, but I thought “Oh, wow, I wish I’d thought of that” then, and it still seems useful.

People disagree about the relative importance of this process and anagenesis in life’s history. Actually, you can reasonably argue that slow anagenesis in big populations and relatively fast change in small populations are all the same process, just with different population sizes – and you’re right! Shades of Simon Gunkle and selection vs. drift!

4. Speciation by polyploidy, often with hybridization. This is the only pattern that doesn’t require some spatial or temporal barrier between the diverging populations because the polyploidy itself can be a barrier.