Sunday, November 29, 2015

Motoo Kimura calculates a biochemical mutation rate in 1968

I recently had occasion to re-read a paper by Motoo Kimura from 1968 (Kimura, 1968). I noticed, for the first time, that he estimates a mutation rate based on his understanding of the error rate of DNA replication. He also makes a comment about creationists.

Remember, this was in 1968 and we didn't know as much then as we do now. Kimura took note of the fact that evolutionary trees based on comparing amino acid sequences gave rates of amino acid substitutions that seemed far too high. His conclusion is in the abstract.
Calculating the rate of evolution in terms of nucleotide substitutions seems to give a value so high that many of the mutations involved must be neutral ones.
This is the beginning of Neutral Theory and we now know that his conclusion is correct.

Kimura wanted to know if there was any independent supporting evidence for the mutation rates estimated from amino acid substitutions, so he looked to the error rate of DNA replication.

In the first edition of The Molecular Biology of the Gene Jim Watson had estimated the overall error rate for DNA replication at 10⁻⁸ to 10⁻⁹ per nucleotide. This was based partly on Seymour Benzer's saturation mapping of the rII locus in bacteriophage T4 [see Wikipedia: T4 rII system]. Kimura figured that there were about 50 cell divisions in the lineage from gamete to fertilized egg so he estimated that if the human genome was 4 × 10⁹ bp then there would be somewhere between 200 and 2000 mutations per generation in humans.

Watson's error rate was too high. We now know that it's closer to 10⁻¹⁰. Kimura's guesstimate of 50 cell divisions is too low because we now know that there are about 400 cell divisions during production of sperm and 30 cell divisions in the production of egg cells, for an average of 215 [Estimating the Human Mutation Rate: Biochemical Method]. Kimura also over-estimated the size of the human genome; in this case he should have known the correct value (3.2 × 10⁹).

Nevertheless, he ended up with a value of 200 for the lower bound based on the lowest error rate and this isn't too far off the current estimate (130 mutations per generation). Not bad for the 60s!
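The arithmetic behind these numbers is easy to reproduce. What follows is only a rough sketch: the function name is made up for illustration, and the "modern" line assumes that the sperm and egg contributions are simply added (400 + 30 replications), which is one reading of the biochemical method and not a calculation that Kimura made.

```python
def mutations_per_generation(genome_bp, error_rate, replications):
    """Expected new mutations = genome size x error rate per bp per replication x replications."""
    return genome_bp * error_rate * replications

# Kimura's 1968 inputs as described above: 4 x 10^9 bp, 50 cell divisions,
# and Watson's error rate of 10^-9 (lower bound) to 10^-8 (upper bound).
kimura_low = mutations_per_generation(4e9, 1e-9, 50)
kimura_high = mutations_per_generation(4e9, 1e-8, 50)

# Current inputs quoted above: 3.2 x 10^9 bp and an error rate near 10^-10.
# ASSUMPTION: the sperm (400) and egg (30) lineages are summed.
modern = mutations_per_generation(3.2e9, 1e-10, 400 + 30)

print(f"Kimura (1968): {kimura_low:.0f} to {kimura_high:.0f} mutations per generation")
print(f"Modern inputs: about {modern:.0f} mutations per generation")
```

With Kimura's inputs this gives the 200 to 2000 range quoted above. With the newer inputs, under the stated assumption, it gives a value in the same neighbourhood as the current estimate.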

....... musical interlude ..... here's a song from 1968 to get you in the mood ......


Now back to our regularly scheduled programming ....

Kimura knew what his calculation meant.
Finally, if my chief conclusion is correct, and if the neutral or nearly neutral mutation is being produced in each generation at a much higher rate than has been considered before, then we must recognize the great importance of random genetic drift due to finite population number in forming the genetic structure of biological populations. The significance of random genetic drift has been deprecated during the past decade. This attitude has been influenced by the opinion that almost no mutations are neutral, and also that the large number of individuals forming a species is usually so large that random sampling of gametes should be negligible in determining the course of evolution, except possibly through the "founder principle." [reference to Ernst Mayr].
Now, Kimura obviously doesn't think much of those people who deny neutral mutations and the importance of random genetic drift so he makes an analogy.
To emphasize the founder principle but deny the importance of random genetic drift due to finite population number is, in my opinion, rather similar to assuming a great flood to explain the formation of deep valleys but rejecting a gradual but long lasting process of erosion by water as insufficient to produce such a result.
There are still a lot of scientists acting like flood geologists.

Here's a photo for all you evolution fans. It shows James ("Jim") Crow, R.A. Fisher, and Motoo Kimura in Crow's lab at the University of Wisconsin in 1961.¹



Image Credit: the cartoon is from: Robins depot

1. I was going to post a good song from 1961 but there weren't any ... except maybe 'Hats off to Larry' #68 by Del Shannon.

Kimura, M. (1968) Evolutionary rate at the molecular level. Nature, 217:624-626. [PDF]

93 comments:

  1. A small correction: Kimura's given name is romanised as Motoo or Motō (the first "o" is short). Japanese mōtō (with two long vowels) means 'not in the least'.

    Replies
    1. Americans who knew Kimura called him "Moe-Tow" (or something like that) if they were trying to be linguistically correct, but usually called him "Moe-toe" when they weren't trying to be correct. Even Jim Crow did that.

      But nobody who knew him called him "Moe-Tew", as they knew that the "oo" was not pronounced "ew".

  2. The Beatles video says "not available in your country" but this one from 1961 applies well to those who make such bold predictions:

    The Shirelles - Will you still love me tomorrow
    https://www.youtube.com/watch?v=cnPlJxet_ac

  3. In 1961 I was an undergraduate student at the University of Wisconsin, and Jim Crow was allowing me to come and hang out at his lab. Kimura was visiting from Japan. The day of that photo was a day I didn't come in to the lab. The next day I came in, and everyone said "Fisher was here". He had come through Madison to visit his daughter Joan Fisher Box, who was later his biographer.

    Some days later everyone in the lab was showing off copies of photos that had been taken with Fisher. So I never got to meet Fisher, and don't have a photo with him.

    Replies
    1. Hi Joe,

      I know that you interacted in the past with people like Bruce Walsh, John Gillespie, Lee Altenberg, Stevan Arnold and Russ Lande. They, just like you, are undoubtedly more theoretically-inclined than your average biologist. So I was wondering: did they acquire these mathematical skills in graduate school, or were they always “good with numbers”?

      I feel that mathematically-naïve people in my area of interest (evo-devo) try to run discussions on things like developmental drive, developmental constraints, evolvability and canalization, but they usually end up rediscovering well established concepts in evolutionary quantitative genetics, such as G and M matrices. So do you think that experimentally-oriented grad students in, say, developmental biology could develop math skills from scratch in order to fully understand papers dealing with these issues from the perspective of theoretical population genetics?

    2. Bogi, Bruce did his Ph.D. degree under my supervision (but was very independent in his thinking). Except for John Gillespie, I have been in recent touch with all the others. Steve does some theory but is also an empirical field biologist.

      But it happens that I never asked any of them how they got into doing theory, or when. I guess I would suggest taking courses, but also just trying to use theory, whether or not you can publish the result. If you see a theoretical result, before you look at how it was done, see whether you can rederive it yourself. And when you get a great new idea, before sending off that paper, do some checking in the literature.

  4. Kimura : "Calculating the rate of evolution in terms of nucleotide substitutions seems to give a value so high that many of the mutations involved must be neutral ones."

    Larry: "This is the beginning of Neutral Theory and we now know that his conclusion is correct."

    I am sorry to say that Kimura made a mistake in his calculation of mutation rate. To use the equation: Rate = D/2T, D (distance) must satisfy the precondition for using the formula, which is that it must not be a saturated, maximum, or plateau distance. And yet, this issue regarding maximum distance was never considered by Kimura or anyone else for that matter. The distance was simply assumed to be linear. That assumption was understandable as the distance does appear to scale with evolutionary time. However, we now know that even maximum distance also scales with evolutionary time. Several papers of ours since 2008 have established that the distances measured by most proteins over evolutionary time scales are maximum distances. The way to distinguish maximum from linear distance is quite simple as we have shown in a 2010 paper on the overlap feature of the genetic equidistance result.

    Huang, S. (2010) The overlap feature of the genetic equidistance result, a fundamental biological phenomenon overlooked for nearly half of a century. Biological Theory, 5: 40-52.

    We have recently published two preprints on the topic.

    Luo, D., and Huang, S. (2015) The genetic equidistance phenomenon at the proteomic level. bioRxiv, doi: http://dx.doi.org/10.1101/031914

    Abstract
    The field of molecular evolution started with the alignment of a few protein sequences in the early 1960s. Among the first results found, the genetic equidistance result has turned out to be the most unexpected. It directly inspired the ad hoc universal molecular clock hypothesis that in turn inspired the neutral theory. Unfortunately, however, what is only a maximum distance phenomenon was mistakenly transformed into a mutation rate phenomenon and became known as such. Previous work studied a small set of selected proteins. We have performed proteome wide studies of 7 different sets of proteomes involving a total of 15 species. All 7 sets showed that within each set of 3 species the least complex species is approximately equidistant in average proteome wide identity to the two more complex ones. Thus, the genetic equidistance result is a universal phenomenon of maximum distance. There is a reality of constant albeit stepwise or discontinuous increase in complexity during evolution, the rate of which is what the original molecular clock hypothesis is really about. These results provide additional lines of evidence for the recently proposed maximum genetic diversity (MGD) hypothesis.

    Replies
    1. OK, here's a basic issue with what you keep claiming:
      "Rate = D/2T" has never been the theory.
      Let's look at S as a measure of similarity, rather than D=1-S.
      then we find that
      dS/dt=-2rS
      for a model without back mutations, i.e. assuming that any position can take infinitely many states. This solves as S(t) = S_0*e^(-2r(t-t_0)), given S_0 as the similarity at time t_0.
      Now to change this to a more realistic model using JC gives us
      dS/dt = -1.5rS + 0.5r(1-S) = -2rS + 0.5r
      this solves to
      S(t) = (S_0 - 0.25)*e^(-2r(t-t_0)) + 0.25
      for S~1, you can do a linear approximation that isn't bad. But it's not as if saturation wasn't known at the time...
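The saturation behaviour of this solution is easy to see numerically. Below is a minimal sketch, assuming S_0 = 1 at t_0 = 0 and an arbitrary illustrative rate r; the function names and the value of r are hypothetical and not taken from any paper discussed here. It compares the uncorrected estimate D/2T (with D = 1 - S) against the rate obtained by inverting the JC-style solution.

```python
import math

def similarity_jc(t, r, s0=1.0, t0=0.0):
    """S(t) = (S_0 - 0.25) * exp(-2r(t - t_0)) + 0.25, the solution given above."""
    return (s0 - 0.25) * math.exp(-2.0 * r * (t - t0)) + 0.25

def naive_rate(s, t):
    """Uncorrected estimate D / 2T with D = 1 - S (no allowance for multiple hits)."""
    return (1.0 - s) / (2.0 * t)

def corrected_rate(s, t, s0=1.0):
    """Invert S(t) for r; only defined while S is still above the 0.25 plateau."""
    return -math.log((s - 0.25) / (s0 - 0.25)) / (2.0 * t)

r_true = 1e-9  # hypothetical rate per site per year, chosen only for illustration

for t in (1e7, 1e8, 1e9, 5e9):
    s = similarity_jc(t, r_true)
    print(f"t={t:.0e}  S={s:.4f}  D/2T={naive_rate(s, t):.3e}  corrected r={corrected_rate(s, t):.3e}")
```

In this parameterization the uncorrected estimate starts near 0.75r for short times and keeps shrinking as S approaches 0.25, while the inversion recovers r as long as S is measurably above the plateau.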

    2. Simon,

      It seems to me that instead of JC you should more appropriately have applied the Kimura 2-parameter correction, because it would be harder for Gnomon to claim that Kimura didn't know about it.

    3. Yea, but I wanted to keep it 60s. And MCMCtree as well as multidivtime (but not BEAST and dpp-div) move the model to the branch length estimation, leaving the final rate estimate in the same form as JC - i.e. they rewrite the matrices as r*A, where A is a normalized matrix and r is a scalar multiplier. The estimation of rates then only estimates r for each branch.

    4. I looked at the abstract. He seems merely to have discovered the phenomenon of multiple hits and that they increase with increasing distance, without having any idea that this was already well known.

    5. I only commented on the bit he wrote here. Looking at the preprint he shows results that make perfect sense when viewed from a phylogenetic perspective. It's just a matter of misinterpreting a tree as a ladder to get to his BS and the result he is proclaiming would crumble as soon as he included - say - a couple more lissamphibians. Or birds. Or echinoderms. Or a couple of insects.
      It's worth noting that in the end they start talking about prime numbers and state that the number of primes smaller than x is ~x. But the prime number theorem states that it's ~x/ln(x). That they speculate on whether the result (that the distances between a species and two other species belonging to a clade that the first one does not belong to are equal) is due to the same thing that makes prime numbers increase ~x would be silly even if they got the prime numbers right. But that they manage to get the prime number theorem wrong is just icing on the cake...

    6. What is the agenda for the straw-man ~x, when it is Li(N) in the preprint ?

    7. "The cumulative increase in prime numbers along the progression in natural numbers is well known to follow a nearly constant rate."
      (emphasis mine).

    8. You should also include 'nearly'. Li(N) is a much more precise prime number counting function than x/ln(x), although it still has error margins.

    9. That's beside the point. You claim that there's a nearly constant rate, and no function that counts primes, including pi(x), has anywhere near a constant rate.
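This can be checked with a quick count. The sketch below uses a plain sieve of Eratosthenes (the function names are made up for illustration) to tabulate pi(x), the prime number theorem approximation x/ln(x), and the average density pi(x)/x at a few powers of ten. A constant rate would show up as a constant pi(x)/x.

```python
import math

def prime_sieve(limit):
    """Return a list where is_prime[n] is True exactly when n is prime (0 <= n <= limit)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Every multiple of n from n*n upward is composite.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return is_prime

def pi_table(checkpoints):
    """Map each checkpoint x to pi(x), the number of primes <= x."""
    limit = max(checkpoints)
    is_prime = prime_sieve(limit)
    wanted = set(checkpoints)
    table, count = {}, 0
    for n in range(limit + 1):
        if is_prime[n]:
            count += 1
        if n in wanted:
            table[n] = count
    return table

checkpoints = [10**3, 10**4, 10**5, 10**6]
table = pi_table(checkpoints)
for x in checkpoints:
    print(f"x={x:>8}  pi(x)={table[x]:>6}  x/ln(x)={x / math.log(x):>9.1f}  pi(x)/x={table[x] / x:.4f}")
```

The density column falls from roughly 0.17 at 10^3 to roughly 0.08 at 10^6, so the count of primes grows more and more slowly rather than at a nearly constant rate.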

    10. The mathematician Don Zagier once said in a 1975 lecture: “There are two facts about the distribution of prime numbers of which I hope to convince you so overwhelmingly that they will be permanently engraved in your hearts. The first is that, despite their simple definition and role as the building blocks of the natural numbers, the prime numbers grow like weeds among the natural numbers, seeming to obey no other law than that of chance, and nobody can predict where the next one will sprout. The second fact is even more astonishing, for it states just the opposite: that the prime numbers exhibit stunning regularity, that there are laws governing their behavior, and that they obey these laws with almost military precision.”

      We can say nearly the same on the appearance of more and more complex species during macroevolution (bacteria gradually evolve into human). There are two facts about them. The first is that they grow like weeds among the species created by micro-evolutionary mechanisms (e.g. bacteria to bacteria or monkey to monkey via random drift or Nei’s niche filling theory or natural selection), seeming to obey no other law than that of chance, and nobody can predict where the next one will sprout. The second fact is even more astonishing, for it states just the opposite: that the appearance of new species of higher complexity exhibits stunning regularity, that there are laws governing their behavior, and that they obey these laws with almost military precision.

    11. The deepest question humans can ask cannot be deeper than the Riemann hypothesis on the dual facts of prime numbers, the combination of both random and lawful. Is it a law that created everything but could not inherently get rid of seeming randomness? Or is it random force that happens by chance to have created everything with a seemingly lawful pattern?

      In an interview, the mathematician David Hilbert explained that he believed the Riemann Hypothesis to be the most important problem 'not only in mathematics but absolutely the most important'. The Fields medalist Enrico Bombieri said in an interview: “The Riemann Hypothesis is not just a problem. It is the problem. It is the most important problem in pure mathematics. It’s an indication of something extremely deep and fundamental that we cannot grasp.”

    12. " It’s an indication of something extremely deep and fundamental that we cannot grasp."

      God of the gaps in mathematics...

      Furthermore, did 5 and 7 mate, exchange genetic material and produce 11 as offspring?
      Why are you trying to insert your god into this gap, the Riemann hypothesis, and at the same time use a very bad analogue to map the Riemann hypothesis on something which clearly has nothing to do with mathematics, i.e. biology?

    13. " The Fields medalist Enrico Bombieri said in an interview: "

      And could you link the interview in which he said this? Google for some weird reason doesn't come up with a hit where this citation is mentioned. Or would it be the case that Google doesn't have the hit, because (shock/ horror) you quote mined it?

    14. The quote appeared in two books: The Music of the Primes by du Sautoy, and A Beautiful Mind (on John Nash) by Nasar.

    15. Did it appear in something actually written by Enrico Bombieri?

    16. I mean, not the quote but any equivalent sentiment. In what Bombieri has written about the Riemann hypothesis there is no mention of a mystery we can't grasp, though he does call it "probably the most important problem in pure mathematics" (p. 1 of the linked essay):

      http://empslocal.ex.ac.uk/people/staff/mrwatkin/zeta/riemann.pdf

    17. The book "A Beautiful Mind" does quote Bombieri as saying what gnomon has in quotes.

      https://books.google.com/books?id=uNPOmXAj1ScC&pg=PA64-IA141&lpg=PA64-IA141&dq=Enrico+Bombieri+riemann+deep+fundamental&source=bl&ots=FhSGPBBa24&sig=brBDkQUK2z0ErFRCwmz2EgxkvJk&hl=en&sa=X&ved=0ahUKEwi34oerybrJAhUFYiYKHXiSCJMQ6AEIJjAB#v=onepage&q=Enrico%20Bombieri%20riemann%20deep%20fundamental&f=false

      That's nice. Trying to relate this to some top-down planning of the universe is of course classic crackpot stuff.

    18. Yes, I saw the link to the book too, from a paper in an open source/ non peer reviewed library. And I had exactly the same questions Piotr has: when and where did he, Bombieri, say this?
      Pointing to a non-science book (yes the book is about a scientist, but it's not science but rather a drama) as a source, is like doing a Trump. You can claim the tweets were there, but if there's no first hand proof of it happening, it most likely never did happen.

    19. I suppose one could devise a population of counting systems and test to see if this behavior is typical or anomalous.

    20. I was mistaken about the Bombieri quote being in the du Sautoy book. The Hilbert quote is.

      But regardless of who said what, that the Riemann hypothesis is THE problem is beyond any doubt in my opinion for anyone who has understood its general implications. Here is my way to demonstrate it, which I believe is convincing. I would like to know if one can ask anything deeper about Nature than the problem I ask here in abstract forms that was inspired by the Riemann hypothesis. For something both lawful and random or unpredictable, be it about species or prime numbers in terms of details, is it caused or determined by a law or by a flip of dice? The question is extremely hard because either answer would have nearly impossible difficulties to overcome. If by a law, it would easily take care of the lawfulness or regularity aspect. But how can and must a law allow uncertainty, randomness, or non-predictability, which is the antithesis of lawfulness?

      If by chance, it would easily take care of the random or non-predictability aspect. But how can chance produce extreme regularity that has essentially and practically zero probability of chance occurrence? Furthermore, how can regularity produced by chance last a very very long time (all the way to infinity for the prime numbers) all the while defying the destructive power of chance? A priori, chance is way more destructive to order than constructive. The equilibrium state or the most stable state of a system is maximum entropy according to Boltzmann. In other words, the most disordered state is the most natural state of all. Chance is the main source of disorder and would quickly destroy any order it happens to produce, thus causing a long-lasting stable state of maximum entropy. Soon after one drops blue ink into a glass of water, one may see a pattern created by chance that some may call beautiful or abstract art. But alas, that is not to last and chance would soon destroy the pattern and create a stable and equilibrium state of plain bluish water with no regularity pattern of any kind.

    21. I'm reminded of Bombieri's (authentic, not apocryphal!) description of how the Riemann Hypothesis was solved. It's worth quoting in full:

      Dear Doron,

      There are fantastic developments to Alain Connes's lecture
      at IAS last Wednesday. Connes gave an account of how to obtain
      a trace formula involving zeroes of L-functions only on
      the critical line, and the hope was that one could obtain also
      Weil's explicit formula in the same context; this would solve
      the Riemann hypothesis for all L-functions at one stroke. Thus there
      cannot be even a single zeroe(1) off the critical line!

      Well, a young physicist at the lecture saw in a flash that
      one could set the whole thing in a combinatorial setting
      using supersymmetric fermionic-bosonic systems (the physics
      corresponds to a near absolute zero ensemble of a mixture
      of anyons and morons with opposite spins) and, using the
      C-based meta-language MISPAR, after six days of uninterrupted
      work, computed the logdet of the resolvent Laplacian,
      removed the infinities using renormalization, and, lo
      and behold, he got the required positivity of Weil's explicit
      formula! Wow!

      Regards also from Paula Cohen.
      Please give this the highest diffusion. Best,

      Enrico


      (1) This is the correct spelling, according to vicepresident
      Dan Quayle.


      (archived here)

    22. The Riemann Hypothesis year of publication is interesting.

    23. Since I have the book A Beautiful Mind handy on my bookshelf, I just checked the source of the Bombieri quote. The author did give a reference for it. It is a specific date in 1995 when the author personally conducted the interview with Bombieri. So, there is no more authentic source for the quote than the book itself.

  5. Huang, S. (2015) New thoughts on an old riddle: what determines genetic diversity within and between species. arXiv:1510.05918

    Abstract
    The question of what determines genetic diversity both between and within species has long remained unsolved by the modern evolutionary theory (MET). However, it has not deterred researchers from producing interpretations of genetic diversity by using MET. We here examine the two key experimental observations of genetic diversity made in the 1960s, one between species and the other within a population of a species, that directly contributed to the development of MET. The interpretations of these observations as well as the assumptions by MET are widely known to be inadequate. We review the recent progress of an alternative framework, the maximum genetic diversity (MGD) hypothesis, that uses axioms and natural selection to explain the vast majority of genetic diversity as being at optimum equilibrium that is largely determined by organismal complexity. The MGD hypothesis fully absorbs the proven virtues of MET and considers its assumptions relevant only to a much more limited scope. This new synthesis has accounted for the much overlooked phenomenon of progression towards higher complexity, and more importantly, been instrumental in directing productive research into both evolutionary and biomedical problems.




    Replies
    1. It seems that your concept of MET is totally different than that of Dr. Moran's?

      Does it mean that you also don't support the neutral theory as part of the modern evolutionary theory as Dr. Moran has been supporting for a while now?

      What's your view of random genetic drift as one of the main mechanisms of evolution promoted by Dr. Moran?

    2. Below is how leading experts view the MET, some of whom actually directly created it. I agree with them.

      Ohta and Gillespie: "..we have yet to find a mechanistic theory of molecular evolution that can readily account for all of the phenomenology. ..we would like to call attention to a looming crisis as theoretical investigations lag behind the phenomenology."

      Ohta and Gillespie said in 1996: "all current theoretical models suffer either from assumptions that are not quite realistic or from an inability to account readily for all phenomena."

      Lewontin said of the neutral school: “…we are required to believe that higher organisms including man, mouse, Drosophila and the horseshoe crab all have population sizes within a factor of 4 of each other. …The patent absurdity of such a proposition is strong evidence against the neutralist explanation of observed heterozygosity.”

      Lewontin has observed in 1974 regarding the theory of population genetics: “How can such a rich theoretical structure as population genetics fail so completely to cope with the body of fact?"

    3. While I'm sure that many Sandwalk readers are already familiar with the Ohta and Gillespie paper that gnomon unceremoniously ripped most of those quotes from, here it is for anybody who might want to read it:

      http://www.weizmann.ac.il/complex/tlusty/courses/landmark/OhtaGillespie1995.pdf

      It's actually quite an interesting read.

    4. Gnomon,

      I know you are not a creationist. I also suspect you are not an ID proponent.
      So, who are you? A Third Way believer? Don't misunderstand me. I'm all ears to all REASONABLE and not so reasonable ideas. That's why I've ended up following this blog, as have many, many others.

    5. Has it occurred to you that the real mechanism of evolution has yet to be widely known? So, it should come as no surprise that you could not identify me with any widely known school of ideas or doctrines. If anything, I am closer to the 3000 year old Chinese book I Ching (The Law of Changes) than anything else. It is the most influential text in the history of Chinese civilization. At the heart of it, it views all essential beings in Nature to be at their most natural states or optimum equilibrium states. My Maximum Genetic Diversity hypothesis also says the same. For example, it considers genetic polymorphisms observed today for most species and certainly for humans to be at optimum equilibrium level. A priori, equilibrium level should be positively selected to be quickly reached because anything less would be deleterious for the species. Of course, anything more would also be deleterious. Cancer and Parkinson’s disease patients have higher levels of polymorphisms than normal matched controls as we have shown in several recent papers.

      Our idea could not be more different from Kimura’s random drift. Under his model, genetic polymorphisms are simply “a transient phase of molecular evolution” (Kimura, 1983). If you accept the infinite sites assumption key to Kimura’s neutral theory, you would not have expected equilibrium to be possible any time during evolution, no matter how long evolution has been going on or will continue. Unfortunately, that assumption is a priori invalid and there is no evidence for any of its deductions regarding events on evolutionary time scales. Over short time scales long before reaching equilibrium, however, it is an a priori valid approximation.

      Kimura M (1983) The neutral theory of molecular evolution. Cambridge: Cambridge University Press.

    6. If anything, I am closer to the 3000 year old Chinese book I Ching (The Law of Changes) than anything else.

      Thank you, this explains your obsession with numerology. So it's some sort of New Confucian woo as opposed to the more typical woo we get from ID creationists.

      Here are a couple of quotes from your online paper:

      Human has the lowest genetic diversity among all species is because of its high complexity rather than time.

      Can you quote the studies that have established this fact? (I mean the first half of your sentence; I don't think I understand the rest.)

      Simple organisms such as bacteria are expected to have much greater MGD than primates. But the observed MGD of a specific type of bacteria species may not be that much greater than a specific type of monkey, which may seem inconsistent with the MGD hypothesis. However, if one looks at within clade between species diversity, one observe the expected results. The between species diversity within the bacteria clade is much greater than that within the primate clade. The bacteria kind has much greater MGD, which is one reason for the huge amount of diversification in bacteria species.

      I don't think you know what a clade is, and you don't seem to realise that "bacteria species" and "monkey species" have different definitions and can't be compared directly. But those fundamental problems aside -- are you seriously arguing that "the bacteria clade" and "the protozoa kind" (your term) are genetically more diverse than "the primate clade" because the primates are more complex?

      And the last question. You "thank Donald Forsdyke for critical reading of the manuscript". Did he suggest any... uh... improvements, and did you follow any of his suggestions?

    7. While one may say I may be influenced by my ancestry and hence less objective in my views, it may not be that simple. I have a much stronger than average genetic predisposition, if you will, to be a religious person in the Western tradition, as the father of my maternal grandmother was a well off pastor in Wuhan in Central South China. His four daughters all had modern education and were much influenced by Western ideas. For example, they all had natural feet at a time when the vast majority of Chinese girls followed the ancient tradition of footbinding. My mother went through ~10 years of religious schools in Wuhan prior to going to medical colleges. So, I am a bit surprised myself that I have never had any drive to be associated with any organized religion while living in the US for the past 31 years.

    8. Leffler EM, Bullaughey K, Matute DR, Meyer WK, Segurel L, et al. (2012) Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol 10: e1001388.

      Read Figure 1 of this paper. Other than a few species like Lynx lynx that have unnaturally low diversity for good reasons like inbreeding or recent near extinction events or natural selection or small population size etc., humans have lower genetic diversity than all, despite having a much larger population size than most.

    9. So the low diversity of, say, wolverines, is due to a recent bottleneck, and that of the potato blight oomycete, or the nematode Caenorhabditis briggsae is due to... hmmm... well, something or other anyway, but the low diversity of humans is due entirely to our exceptional complexity.

      I'm waiting for more revelations. For example, the northern treeshrew is genetically much more diverse than some water fleas, and the diversity of different Caenorhabditis or Drosophila species is all over the scale. This must be due to "a good reason".

    10. From the legend of Figure 1: "Each estimate represents the mean of at least three loci and in most cases is based on only non-coding or synonymous sites."

      Their results are supposed to be more random and chaotic than it should be given that they only examined 3 loci in most cases and in a non-consistent way. What they should be presenting is whole genome comparisons for all species as is done for humans. And furthermore, they should sequence hundreds of individuals for each species to have a good estimate of the average. And the paper did not say how many individuals they examined in each case. Finally, syn sites have much lower diversity than intro-genic sites and introns which make up the vast majority of the human genome. So, in reality, most species except human should have much higher diversity than what appear in the paper when the proper whole genome sequencing experiments are done.

    11. Finally, syn sites have much lower diversity than intro-genic sites and introns which make up the vast majority of the human genome.

      I beg your pardon? Do you use a kind of random sentence generator to compose your comments? What percentage of the human genome consists of "intro-genic sites and introns"?

    12. I meant to say anything that is not protein-coding. So, about 98%.

    13. Right, so the paper you cited in support you are now saying is garbage?

    14. Except for the cherry selected by Gnomon. Gnomon, can you quote a reliable comparative study of genetic diversity which shows that human diversity is lower than in any other species? That was your original claim.

      And what about my other questions? Do you think it makes sense, methodologically, to compare "the primate clade" with "the bacteria clade"?

    15. Gnomon,

      It all depends on what one means by evolution.

      Even most creationists accept some type of evolution, i.e. changes within kinds or species over time.

      For example, Darwin's finches in the Galapagos Islands: the ones with the smaller beak and the ones with the larger beak differ by a 2-base-pair change in their genome. This mechanism is known, at least in part, as some claim that epigenetics is also part of the mechanism.

    16. George Simpson: “If the two (micro and macro) proved to be basically different, the innumerable studies on micro-evolution would become relatively unimportant and would have minor value in the study of evolution as a whole.”

      Let’s ponder this. Is the creation of the first individual or founding member of a fundamentally new creed, new species, new paradigm (like a prime number or an Einstein or Darwin) the same as the creation of the followers or descendants of the first member? Are numbers created as multiples of a prime number created the same way as a prime number? All indications, including a priori reasoning, are that they are not.

      The creation or origin of the first cell-based life is well acknowledged by all to be a mystery not yet accounted for by any theory. Well, as a matter of fact, things are much worse than that in my opinion. It is a much more general and widespread mystery. It is fundamentally the same mystery as the one behind the first individual of any new species of higher complexity level. A common theme for all these “firsts” is the reduction in entropy or disorder in the building blocks of the cell. Macroevolution from bacteria to human involves a near constant albeit stepwise or discontinuous suppression of entropy or disorder or randomness in the system of the first individual of any new species or new form of matter of higher complexity/order. The first cell or life is obviously more complex and orderly than nonliving matter. For the descendants of the first, life is just a matter of quickly reaching maximum or optimum entropy or maximum genetic diversity allowed within the kind of system as first established in the first individual.

    17. I should say: "Macroevolution from non-living matter to the first life and all the way to human involves a near constant albeit stepwise or discontinuous suppression of entropy or disorder or randomness in the system of the first individual of any new species or new form of matter of higher complexity/order."

    18. Judmarc,
      That paper had done all the wrong things so as to have the best shot at putting human diversity at a higher level than most species, and yet it still allowed humans to emerge at the top rank of low diversity. That just tells you how robust the extremely low genetic diversity character of humans is. So, I have no problem with citing it to back up my claim.

    19. Anyone who can type "the first individual of any new species" with a straight face knows very little about evolution.

    20. Yes, if evolution is simply population genetics, then there may be no such thing as a first individual. But is it? Let's hear what Lewontin has to say: "How can such a rich theoretical structure as population genetics fail so completely to cope with the body of fact?"

    21. John Harshman has noted the lack of sense in the last part of the sentence; the middle ain't any better:

      near constant albeit stepwise or discontinuous suppression of entropy or disorder or randomness

      What is the evidence that any sort of planning of separate creative acts is necessary for evolution? All the creativity is *unplanned* - mutation, the contingencies that create space for new forms of life (such as the various causes of the great extinctions) - all the essence of disorder and random, unplanned change. Order comes from the *non-creative* force of selection, though there's nothing about selection that needs planning either.

      Nor do species emerge as individuals - we're always talking about populations.

      Think of languages - did the change from Shakespeare's English to today's come from some central planning office? Was there a lonely individual "first modern English speaker," babbling about computers in the 1800s with no one to understand him or her? These things occur through the accumulation of random errors, and in populations as opposed to individuals. Evolution of new species and new languages is the exact opposite of holding back the forces of randomness.

      Sorry, your theory makes no sense and has no evidence. Try again, or better don't - there's already a very good theory that doesn't need your help.

    22. The above comment just proved my thesis that the followers of a paradigm are qualitatively different in a worse way than the first founding individual(s) of the paradigm. One such individual in this case, Lewontin, the winner this year of a Nobel-equivalent prize in evolution, had this to say about the "already a very good theory" he helped to create: "How can such a rich theoretical structure as population genetics fail so completely to cope with the body of fact?" Ohta, his co-winner and the founder of the nearly neutral theory of population genetics, said basically the same: "...we have yet to find a mechanistic theory of molecular evolution that can readily account for all of the phenomenology. ...we would like to call attention to a looming crisis as theoretical investigations lag behind the phenomenology."

      Wishful thinking and cherry-picking are the common fallacies or characters for the followers of a creed. Or else they would not be followers. And all creeds have imperfections and thus need those followers to stay alive who are born with a stronger than average ability to overlook, to cherry-pick, and to wishful-think. They are constructive in terms of keeping the creed alive for as long as it could before it is overtaken by a new and better one. Thomas Kuhn is right.

    23. Gnomon,

      your behavior is that of a typical creationist: you cherry-pick quotes from papers you clearly do not understand, hoping these quotes will prove your position. And when faced with evidence that your pet theory isn't what you hope it is, your fallback position is the classic 'my religion is better than yours' rant.

      In short, you mirror your own lack of knowledge of biology to be the current standing of biology, and you mirror your own behavior (cherry picking) onto people who have shown you why you are wrong. You're trying and utterly failing to invoke the Galileo gambit.

    24. Okay, maybe you understand those quotes better. So, to you, what do those quotes mean? Could they possibly mean (you can spin it in whatever way you want) that the theory was very good? Or do they mean nothing?

    25. Wishful thinking and cherry-picking are the common fallacies or characters for the followers of a creed.


      Oh dear, there goes another irony meter.

    26. or do they mean nothing?

      They mean exactly nothing when you lift them out of their context.

    27. do you have evidence that those quotes are not an accurate summary of at least one major point of what they wrote in their papers? if you don't and you still believe that they are not, that is wishful thinking, is it not?

      BTW, the Kimura quote that Moran put up here mean exactly nothing?

    28. I see that you are an associate professor of language on Indo-European. That is interesting as I have an interest in this. Is there evidence for a common ancestry of Indo-European and Sino-Tibetan? Some say yes based on many shared or similar pronunciations.

    29. Re: Indo-European and Sino-Tibetan,

      There is no convincing evidence at present. "Shared or similar pronunciations" have no probative value unless they are supported by enough systematic correspondences to rule out accidental similarity, convergence and borrowing.

      There is also another problem: there's no secure Proto-Sino-Tibetan reconstruction as yet, and the traditional division of ST into Sinitic and Tibeto-Burman is being questioned. It has recently been proposed that ST and Austronesian are related, and even that Chinese and the Austronesian family are more closely related to each other than either is to Tibeto-Burman. These relationships need to be sorted out first.

    30. "do you have evidence that those quotes are not an accurate summary of at least one major point of what they wrote in their papers? "

      Take the Simpson quote, for example: in the paper, after the quote, he goes into an explanation of it. And this changes the context quite dramatically.
      The Bombieri quote you took from a novel.

    31. BTW, the Kimura quote that Moran put up here mean exactly nothing?

      From what I've seen so far, whenever our host quotes someone, the quotation accompanies a more general discussion of the author's views. If you use an isolated quotation to support something you are arguing but which is not compatible with the author's position visible in the wider context, you are guilty of quote-mining.

    32. It is very interesting to learn that the ST grouping may not be as real as it was once thought and that there may be a Chinese-Austronesian grouping. Thank you. Maybe Y and mtDNA analysis can help. Chinese Y chr O and European R shared a most recent common ancestor, and so the sharing of some words, while not enough to be independent evidence, is at least consistent with a common ancestor in language. If so, there is the next controversial topic of who gave rise to who. It seems that there are supporters and evidence on both sides, right?

    33. "which is not compatible with the author's position visible in the wider context"

      Political correctness is poisoning everything, including science.

    34. Lewontin was far more outright as a young guy in 1974 when he wrote his famous book than he is now regarding his view on the weakness of the popgen theory, even though the theory has changed little in the past several decades.

    35. Against the ST grouping, Y chr D type is very common in Tibetans but rare in Chinese, and D groups with E common in Africans and Middle East people and some Europeans.

    36. Ed,

      That was an autobiography, not a novel. Wishful thinking? Besides, we have not heard any complaints from Bombieri. And a common-sense reading of these quotes is enough for anyone to understand their meanings even without context. Or else these authors would be too sloppy with their words, and they are not.

    37. One can invite the whole 6 billion people on Earth to quote mine a sentence from any evolutionary science expert to the effect that the evolutionary theory has no weakness whatsoever, and I promise that no one would be able to find one.

    38. Political correctness is poisoning everything, including science.

      Your interpretation of Piotr's point is so invalid it spawned a non sequitur.

    39. Lewontin was far more outright as a young guy in 1974 when he wrote his famous book than he is now regarding his view on the weakness of the popgen theory

      You mean back when he was one of the leading supporters and explicators of the nearly neutral theory you are now trying to quote-mine him as opposing?

    40. One can invite the whole 6 billion people on Earth to quote mine a sentence from any evolutionary science expert to the effect that the evolutionary theory has no weakness whatsoever, and I promise that no one would be able to find one.

      I haven't seen all the evolutionary biology departments closing down their research labs, have you? Guess that means all the answers aren't known yet, and people are still disagreeing and questioning. That's the way science works, when the disagreement and questions are based on a curiosity about real answers, rather than an attempt to validate some crackpot notion in the face of all evidence and logic to the contrary.

    41. Against the ST grouping, Y chr D type is very common in Tibetans but rare in Chinese, and D groups with E common in Africans and Middle East people and some Europeans.

      This would be relevant if language were transmitted biologically.

    42. Shi Huang (gnomon) writes,

      One can invite the whole 6 billion people on Earth to quote mine a sentence from any evolutionary science expert to the effect that the evolutionary theory has no weakness whatsoever, and I promise that no one would be able to find one.

      Perhaps not. But here's a quote from Michael Lynch who has little use for people like you (my emphasis) ...

      Because it deals with observations on historical outcomes, often in the face of incomplete information on intermediate steps (especially at the molecular level), the field of evolution attracts significantly more speculation than the average area of science. Nevertheless, the substantial body of well-tested theory established over the past century lays the groundwork for understanding the pathways that are open to evolutionary exploration in various population genetic contexts, providing guidance as to the likely reality of alternative evolutionary hypotheses. Because the nonadaptive forces of mutation, recombination, and random genetic drift are now readily estimated in multiple species using molecular data, there is no longer any justification for rejecting the utility of population genetic theory based on its quantitative unreliability.

      Michael Lynch in The Origins of Genome Architecture, p. 389

    43. "This would be relevant if language were transmitted biologically."

      It almost surely is based on a priori reasoning as well as modern experience, isn't it?

    44. Michael Lynch is famous for saying, and for having it highlighted on his website: "nothing in biology makes sense except in the light of population genetics." I will make sure that he will regret for saying it and soon. And given the riddle paper (Leffler et al. 2012) I cited above, it is astonishing that he could be plain blind to so many riddles to which his beloved popgen theory is powerless.

    45. What do you mean? There's some correlation between biological and cultural inheritance. Most modern Irish people had mostly Celtic-speaking ancestors, and most of them have English (a Germanic language) as their mother tongue. The Indo-Aryan languages (derived from Indo-European) are the largest linguistic grouping in the Indian subcontinent, but their speakers have biologically more in common with their Dravidian or Austroasiatic-speaking neighbours than with Indo-European-speaking Swedes, Italians or Russians. The same goes for other widely spoken families (whose spread has involved migration, intermarriage and language shifts).

    46. Let's face it: Gnomon used a quote from Lewin in an attempt to support some kind of unspecified saltation, in which a new species is formed in one step from a single mutation in a single individual. That's quote-mining writ large.

    47. I will make sure that he will regret for saying it and soon.

      Only if you have his email address.

      Look, you've got a one-note focus on an interesting unsolved problem in the population genetics subtopic of evolutionary theory: genetic diversity vs. population size. And you've got quotes from people, or citations to papers, calling attention to this as an interesting problem. They don't, however, go on to say this calls all of evolutionary theory or even population genetics into question. And they don't, on the basis of exactly no evidence or logic, go on to say this must necessarily mean the old BS about special creation must be true. Only you have done that. Have you ever thought there might be a reason you're alone in this, and that it might not be superiority of intellect?

    48. Piotr,
      Which is the more authentic Dravidian language less influenced by Indo-European, Dravidian Tribal or Dravidian Upper caste?

      Regarding the inconsistency between language and genetics, just keep in mind the fact that the popgen theory has yet to make sense of the riddle of genetic diversity.

    49. Oops, I meant to say Lower caste. I guess the Dravidian Upper caste would be more influenced by Indo-European than the Lower and the Tribal.

    50. Lynch as quoted by Moran: “Because it deals with observations on historical outcomes, often in the face of incomplete information on intermediate steps (especially at the molecular level), the field of evolution attracts significantly more speculation than the average area of science.”

      But that is not justification for a priori invalid and completely wild or anything goes kind of speculations or assumptions that are widespread like weeds in the evolutionary science, as a priori valid ideas do not depend on real world experience and are independent of “historical outcomes” regardless whether one has or has not complete information about them.

      Real hard-core science and mathematics (which happen to describe this world better than anything else) are all based on a priori valid speculations or axioms.

    51. But that is not justification for a priori invalid and completely wild or anything goes kind of speculations or assumptions

      So why are you still writing?

    52. Re: "Pure" Dravidian.

      First of all, Dravidian is not a language but a language family with about 30 members (and ca. 80 major dialects). All of them have absorbed some Indo-Aryan influence, but not to the same degree. You can see that if you concentrate on any particular part of the language system. For example, some Dravidian languages have retained the inherited Dravidian numerals; others have kept 1, 2, and 3, but replaced 4-10 with Indo-Aryan numerals; a few also have an Indo-Aryan word for 3. I'm not a specialist in Dravidian, but from what I have read, influence diffusing from the vernacular Indo-Aryan languages of the "Middle IA" period (the so-called Prakrits) affected mainly the "low caste" (popular, colloquial) varieties of the Dravidian languages. I suppose the most prestigious, Sanskrit-like varieties of Indo-Aryan had more impact on their "high" (formal, literary) styles.

      The known Dravidian languages probably diverged from a common ancestor less than three thousand years ago. Throughout their reconstructible history they have been in contact with Indo-Aryan, so quite naturally there has been a lot of reciprocal borrowing and grammatical influence.

    53. I see. About ST, if Tibetan does not group with Sino, any thought which one it may group with?

    54. “Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity,”

      Taken from the abstract of a recent paper on whole genome sequencing analysis of the primitive lancelet, which showed much greater SNP diversity than humans (4.5% vs 0.1%).

      http://www.nature.com/ncomms/2014/141219/ncomms6896/abs/ncomms6896.html#affil-auth

    55. Bacteria typically have 30% within-species nucleotide diversity (difference per site). The new definition in the genome era for a bacterial species is that individuals within a species have less than 30% genome-wide difference. If more than 30%, the two individuals would represent different species.

    56. I see. About ST, if Tibetan does not group with Sino, any thought which one it may group with?

      Again, Sino-Tibetan studies are not my speciality, but the current consensus seems to be that the traditional Sinitic vs. Tibeto-Burman classification can't be correct. If Sino-Tibetan is a valid grouping in the first place, Tibeto-Burman is paraphyletic with respect to Sinitic (or, if Laurent Sagart is right, with respect to a larger clade consisting of Sinitic plus Austronesian, and containing Tai-Kadai as one of the branches of Austronesian).

      Tibeto-Burman is a huge group, with ca. 400 languages subgrouped into about 40 established "branches". Most of them are underresearched, so much fieldwork needs to be done before we gather enough evidence to sort out their family relationships.

  6. Hey Larry,

    Did you see that paper published in July in Nature, where the authors found a positive correlation between genome-wide level of heterozygosity and mutation rates in individuals by doing parent-progeny sequencing? Michael Lynch had a nice research highlight in the same issue summarizing implications of their work.

    You probably know this already, but in case you didn't, I thought you would be interested in this.

    Paper:
    http://www.nature.com/nature/journal/v523/n7561/full/nature14649.html

    Lynch's News and Views:
    http://www.nature.com/nature/journal/v523/n7561/full/nature14634.html

    Replies
    1. Lynch is very interested in the variation in mutation rates between and within species. The numbers he quotes reflect mutation rates per generation so it's not surprising that humans, for example, have much higher overall mutation rates than single-cell organisms. It's also not surprising that some viruses have much higher mutation rates than cellular organisms.

      However, his data go beyond that to suggest that there is variation in the error rates of DNA replication and/or repair. The differences are not huge. In this particular case, he reviews a paper where there is a difference of 3.5-fold between mutation rates in inbred (homozygous) strains and outbred (heterozygous) strains.

      In other words, if the average mutation rate is x then extreme examples of inbreeding and outbreeding can result in observed mutation rates of about 0.5x and 1.5x (for a difference of 3-fold). There are many possible explanations.

      This may be interesting but I'm more concerned with getting people to understand the big picture. Let's leave the details to Lynch and others.
