One of those contributions is a paper by Michael Behe and David Snoke published eleven years ago in Protein Science (Behe and Snoke, 2004). I described the result in a previous post: Waiting for multiple mutations: Intelligent Design Creationism v. population genetics.
If Behe & Snoke are correct then modern evolutionary theory cannot explain the formation of new functions that require multiple mutations.
Casey Luskin is aware of the fact that this result has not been widely accepted. He mentions one specific criticism:
In 2008, Behe and Snoke's would-be critics tried to refute them in the journal Genetics, but found that to obtain only two specific mutations via Darwinian evolution "for humans with a much smaller effective population size, this type of change would take > 100 million years." The critics admitted this was "very unlikely to occur on a reasonable timescale."
He's referring to a paper by Durrett and Schmidt (2008). Those authors examined the situation where one transcription factor binding site is disrupted by mutation and another one nearby is created by mutation. The event requires two prespecified coordinated mutations.
Durrett and Schmidt show that in Drosophila such an event can occur on a timescale of about 4,300 years. In humans with an effective population size of 10,000 the event would take about 160 million years. The authors point out that such an event (two prespecified mutations) is very unlikely. They also point out that, by their estimate, the event is 5 million times more likely than Michael Behe claimed in The Edge of Evolution.
This study is not very relevant to the evolution of new functions as described in Behe and Snoke (2004).
Lynch wrote the editors when the Behe & Snoke paper appeared asking if they would be receptive to a rebuttal in the form of a new research paper. The editors agreed, subject to peer review [see Editorial and position papers].
Lynch begins with:
In a recent paper in this journal, Behe and Snoke (2004) questioned whether the evolution of protein functions dependent on multiple amino acid residues can be explained in terms of Darwinian processes. Although an alternative mechanism for protein evolution was not provided, the authors are leading proponents of the idea that some sort of external force, unknown to today's scientists, is necessary to explain the complexities of the natural world (Behe 1996; Snoke 2003). The following is a formal evaluation of their assertion that point-mutation processes are incapable of promoting the evolution of complex adaptations associated with protein sequences. It will be shown that the contrarian interpretations of Behe and Snoke are entirely an artifact of incorrect biological assumptions and unjustified mathematical oversimplification.
He continues by pointing out that, contrary to their claims, Behe & Snoke were not modeling "Darwinian processes" because they were examining intermediates that were neutral with respect to selection and created a nonfunctional product. (The initial "compatible" mutations inactivated one copy of a duplicated gene pair.) Lynch concludes that, "... there is no logical basis to the authors' claim that observations from a non-Darwinian model provide a test of the feasibility of Darwinian processes."
This is standard criticism of ID proponents. It will be familiar to most Sandwalk readers. It proves that even when Intelligent Design Creationists should know better—like publishing a paper about population genetics—they still get confused by their own rhetoric.
This gives rise to seven different alleles and allele combinations that form the initial features of his model. He also assumes that there can be more than one way to change the function via a combination of such different mutations.
The vertical lines in the figure indicate "compatible" mutations that, in combination, will change the function. As we see in allele #2, some of them can occur before the gene duplication because they are not disruptive. This is different than the Behe & Snoke assumption.
Lynch retains several other assumptions in the Behe & Snoke paper:
To simplify the presentation as much as possible, the focus here is on a nonrecombining haploid genome (as assumed by Behe and Snoke), with the origin of a new adaptive function involving a two-residue interaction, for example, the disulfide bond between two cysteines. As in Behe and Snoke (2004), this adaptation is assumed to be acquired at the expense of an essential function of the ancestral protein, so that the new function can only be permanently established via gene duplication, with one of the copies maintaining the original function. The Behe-Snoke assumption that a selective advantage only results after both participating residues are in place is also adhered to.
Lynch quotes several papers to justify his assumption that the initial mutations are neutral, not disruptive. He concludes, correctly, that "... most proteins in all organisms harbor tens to hundreds of amino acid sites available for evolutionary modification prior to gene duplication."
A new variable, n, is introduced. This is the number of codons that can potentially mutate to produce a new function. In Behe & Snoke this number was 1. If three mutations were required then there were only three possible codons that could mutate. In the Lynch model this number could be 10 or even 50.
Here's how Michael Lynch describes the process:
Successful establishment of the new function (neofunctionalization of one of the copies) requires the founding pair of linked gene duplicates to (1) initially attain a high frequency; (2) acquire the mutations essential to the expression of the new function (allelic types 6 or 7) while en route or subsequent to fixation; and (3) be preserved by positive selection subsequent to the origin of the new function. All three processes occur in parallel with a background production of null alleles. The two central issues to be resolved are then: (1) How frequently will a duplication event lead to neofunctionalization; and (2) How long will this take? Answers to these questions can be acquired by recursively following the population through the sequential steps of mutation, selection, and random sampling.
The model includes parameters for the gene duplication event as well as the standard variables such as mutation rate and population size. The mutation rate for amino acid substitution is 10^-8, the same as in Behe & Snoke. The mutation rate for production of a null allele (deleterious mutation rate) is 10^-6 per gene per generation.
Simulations were run using the requirement that two mutations are required for neofunctionalization (formation of a new gene). As you might expect, the probability is very sensitive to the total number of codons that can potentially be mutated. If there are 50 such sites (n=50), then fixation is almost assured for a population size of 10,000 individuals. If there are only two such sites, then you need a population size of 100 million to reach a reasonable probability of success.
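To get a feel for what "recursively following the population through the sequential steps of mutation, selection, and random sampling" means, here is a minimal Python sketch. It is not Lynch's model: it drops gene duplication and null alleles entirely, lumps the population into three classes, and assumes the first (neutral) step can happen at any of n_sites candidate sites while the second step needs one specific partner site. The function name and all parameter names are mine, invented for illustration.

```python
import random


def generations_to_fixation(N, mu, n_sites, s, max_gens, seed=0):
    """Toy haploid Wright-Fisher model of a two-step adaptation.

    Classes: 0 = ancestral, 1 = one neutral intermediate change,
    2 = both changes present (relative fitness 1 + s).
    Returns the generation at which class 2 fixes, or None if it
    has not fixed within max_gens generations.
    """
    rng = random.Random(seed)
    p_first = min(1.0, n_sites * mu)  # assumption: any of n_sites sites will do
    p_second = mu                     # assumption: one specific partner site
    counts = [N, 0, 0]
    for gen in range(1, max_gens + 1):
        f0, f1, f2 = (c / N for c in counts)
        # Mutation: 0 -> 1 and 1 -> 2, no back mutation.
        m0 = f0 * (1 - p_first)
        m1 = f0 * p_first + f1 * (1 - p_second)
        m2 = f2 + f1 * p_second
        # Selection: only the completed two-residue feature is favoured.
        weights = [m0, m1, m2 * (1 + s)]
        # Random sampling (drift): draw the next generation of N individuals.
        draws = rng.choices((0, 1, 2), weights=weights, k=N)
        counts = [draws.count(k) for k in (0, 1, 2)]
        if counts[2] == N:
            return gen
    return None


# Deliberately inflated mutation rate so the toy runs in a moment.
for n in (2, 50):
    result = generations_to_fixation(N=100, mu=1e-4, n_sites=n, s=0.05,
                                     max_gens=2000, seed=1)
    print(f"n_sites={n}: fixed at generation {result}")
```

The mutation rate in the demo loop is far higher than the 10^-8 used in the papers, purely so the toy finishes quickly. Averaging over many seeds is the way to see how strongly the outcome depends on n_sites; a single run tells you very little.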
Behe & Snoke claimed that with two mutations you need a population size of 10^12 in order to fix the new allele in one million generations. The Lynch simulation shows that the new allele can be fixed in one million generations with a population size of only 10^6 provided there are 50 potential mutable sites (n=50). That's a difference of six orders of magnitude.
Even with n=2 (as in Behe & Snoke) a population of one million could lead to fixation in 10^8 generations in the Lynch simulation whereas it would take 100 times as long according to Behe & Snoke. Recall that most of these simulations apply to single-cell, haploid organisms such as bacteria. In those populations the generation times can be measured in days. There might be 100 generations per year so 10^8 generations could be only one million years, a short time in the history of life.
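The arithmetic behind those two comparisons is easy to check. The snippet below only uses the figures quoted in this post (100 generations per year is my rough guess, not a measured value), and the helper name is made up for illustration.

```python
def generations_to_years(generations, generations_per_year):
    """Convert a count of generations to years at a fixed generation rate."""
    return generations / generations_per_year


GENS_PER_YEAR = 100                   # rough figure used in the post
lynch_gens = 10**8                    # Lynch, n = 2, population of one million
behe_snoke_gens = 100 * lynch_gens    # "100 times as long"

print(generations_to_years(lynch_gens, GENS_PER_YEAR))       # 1000000.0
print(generations_to_years(behe_snoke_gens, GENS_PER_YEAR))  # 100000000.0

# Population sizes needed for fixation in one million generations:
# Behe & Snoke's 10^12 versus Lynch's 10^6 (with n = 50).
print(10**12 // 10**6)  # 1000000, i.e. six orders of magnitude
```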
There are three important differences between the two approaches. First, as already noted, Lynch assumes that the initial mutations are neutral. Second, Lynch assumes that there are multiple pathways to the formation of a new functional protein.
The third difference is more subtle. Behe & Snoke failed to take into account the fact that a linked pair of functional genes has a selective advantage because it now takes two null mutations to silence both copies. This effect doesn't make much difference in small populations but in large populations it makes fixation of the new allele more probable.
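A back-of-the-envelope calculation shows why needing two null mutations matters. This is only a sketch for a single lineage: it assumes the two copies are hit independently at the null rate of 10^-6 per gene per generation quoted above, and it ignores selection, drift and everything else in Lynch's population-level model. The function names are mine.

```python
def prob_copy_silenced(u, t):
    """Chance that one gene copy has taken at least one null hit in t generations."""
    return 1.0 - (1.0 - u) ** t


def prob_both_silenced(u, t):
    """Chance that both copies of a linked pair are silenced (independent hits)."""
    return prob_copy_silenced(u, t) ** 2


u = 1e-6   # null mutation rate per gene per generation, from the post
t = 1000   # arbitrary number of generations, chosen for illustration
print(prob_copy_silenced(u, t))
print(prob_both_silenced(u, t))
```

Under these assumptions a single copy is silenced with a probability of roughly one in a thousand over 1,000 generations, while losing both copies is about a thousand times less likely again.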
According to Lynch, there are several reasons for thinking that his estimates of time to fixation are too high. You should read the paper to see what they are. (They're a little too complicated for the average Sandwalk reader, e.g. me.)
The bottom line is:
In summary, the conclusions derived from the current study are based on a model that is quite restrictive with respect to the requirements for the establishment of new protein functions, and this very likely has led to order-of-magnitude underestimates of the rate of origin of new gene functions following duplication. Yet, the probabilities of neofunctionalization reported here are already much greater than those suggested by Behe and Snoke. Thus, it is clear that conventional population-genetic principles embedded within a Darwinian framework of descent with modification are fully adequate to explain the origin of complex protein functions.
Casey Luskin may be unaware of this criticism but Michael Behe was not.
Our paper (Behe and Snoke 2004) contains one simple result. When reasonable parameters are used with our model to estimate actual time scales or population sizes for the evolution of multi-residue (MR) protein features, they are unrealistically large. This implies that the model we chose, which is restricted to point mutations and assumes intermediate states to be deleterious, isn't a plausible evolutionary pathway. One must therefore look about for a new model. We did not rule out such a possibility; in our original article, we explicitly stated, “we should look to more complicated pathways, perhaps involving insertion, deletion, recombination, selection of intermediate states, or other mechanisms, to account for most MR protein features.”
In other words, Behe and Snoke deliberately chose a model that leads to unrealistic results but they are perfectly willing to accept other valid evolutionary models that seem much more reasonable.
In his Editorial (this issue), Professor Hermodson reports that comments sent to him assume a consensus, “Thus, intermediate states must also be assumed to be selected.” Some significant previous work does not make this assumption (Kimura 1985; Ohta 1989), but our paper supports such a consensus. This is a strong requirement—that not only the end products, but steps along the way to a multi-residue function, must be either selected or at least neutral. Michael Lynch makes a similar assumption. Our model posited necessary intermediate mutations to be deleterious in the unduplicated gene; Lynch's model assumes them to be neutral: “all 20 amino acids are equally substitutable in the intermediate neutral state” (Lynch 2005, this issue). All of his objections to our work stem from this difference.
Recall that Casey Luskin touts this paper as one of the few results from ID proponents that produce a genuine "scientific discovery." He notes that there was criticism but fails to mention the Lynch paper or the concession published by Behe and Snoke.
Isn't that strange?
Behe, M.J., and Snoke, D.W. (2004) Simulating evolution by gene duplication of protein features that require multiple amino acid residues. Protein Science, 13:2651-2664. [doi: 10.1110/ps.04802904]
Behe, M.J., and Snoke, D.W. (2005) A response to Michael Lynch. Protein Science, 14:2226-2227. [doi: 10.1110/ps.051674105]
Durrett, R., and Schmidt, D. (2008) Waiting for two mutations: with applications to regulatory sequence evolution and the limits of Darwinian evolution. Genetics, 180:1501-1509. [doi: 10.1534/genetics.107.082610]
Lynch, M. (2005) Simple evolutionary pathways to complex proteins. Protein Science, 14:2217-2225. [doi: 10.1110/ps.041171805]