Friday, April 17, 2015

Does natural selection constrain neutral diversity?

Razib Khan is an adaptationist and he's discovered a paper that gets him very excited: Selectionism Strikes Back!.

Here's the paper and the abstract.
Corbett-Detig, R.B., Hartl, D.L., Sackton, T.B. (2015) Natural Selection Constrains Neutral Diversity across A Wide Range of Species. PLoS Biology Published: April 10, 2015 doi: 10.1371/journal.pbio.1002112

The neutral theory of molecular evolution predicts that the amount of neutral polymorphisms within a species will increase proportionally with the census population size (Nc). However, this prediction has not been borne out in practice: while the range of Nc spans many orders of magnitude, levels of genetic diversity within species fall in a comparatively narrow range. Although theoretical arguments have invoked the increased efficacy of natural selection in larger populations to explain this discrepancy, few direct empirical tests of this hypothesis have been conducted. In this work, we provide a direct test of this hypothesis using population genomic data from a wide range of taxonomically diverse species. To do this, we relied on the fact that the impact of natural selection on linked neutral diversity depends on the local recombinational environment. In regions of relatively low recombination, selected variants affect more neutral sites through linkage, and the resulting correlation between recombination and polymorphism allows a quantitative assessment of the magnitude of the impact of selection on linked neutral diversity. By comparing whole genome polymorphism data and genetic maps using a coalescent modeling framework, we estimate the degree to which natural selection reduces linked neutral diversity for 40 species of obligately sexual eukaryotes. We then show that the magnitude of the impact of natural selection is positively correlated with Nc, based on body size and species range as proxies for census population size. These results demonstrate that natural selection removes more variation at linked neutral sites in species with large Nc than those with small Nc and provides direct empirical evidence that natural selection constrains levels of neutral genetic diversity across many species. This implies that natural selection may provide an explanation for this longstanding paradox of population genetics.
It is impossible for someone like me to evaluate this paper. Can someone take a look to see if it's valid?

How many selective sweeps must there be every 50,000 years in order to remove substantial amounts of neutral diversity from junk DNA?


117 comments :

Unknown said...

I've been waiting for the post on this one since I saw it and you fail to disappoint by posting this. My initial reaction is preserved in a mail I sent to a colleague after reading it:
"Interesting indeed, although not really surprising.
Basically if the distribution of s-values is roughly constant, then we would expect S=2Ns values to become more extreme, i.e. for |S| to go up. In this case neutral variation should decrease (although it should be noted that neutrality is used for cases where the gene effect on fitness is 0 and also where the fitness effect on genes is 0 and these are only the same when there is no linkage. Neutral variation in the gene->fitness sense should decrease, but neutral variation in the fitness->gene sense shouldn't).
A neat result is that this affects local substitution rates, but global rates are unaffected, i.e. this effect can disturb clocklike divergence for single genes, but it's unlikely to affect the clocklike divergence for larger regions."

As for Khan's comment: Meh. As noted above, this does not have an effect that would make the neutral null invalid, and clocklike behaviour is still expected for long time intervals and large genomic datasets.

gnomon said...

check out this paper to see experimental evidence for no junk
http://www.sklmg.edu.cn/Public/Uploads/attached/file/20140830/20140830063859_64243.pdf

Yuan, D., Zhu, Z., Tan, X., Liang, J., Zeng, C., Zhang, J., Chen, J., Ma, L., Dogan, A., Brockmann, G., Goldmann, G., Medina,E., Rice, A.D., Moyer, R.W., Man, X., Yi, K., Li, Y., Lu, Q., Huang, Y. and Huang, S. (2014) Scoring the collective effects of SNPs: association of minor alleles with complex traits in model organisms. Sci China Life Sci. 57:876-888.

Abstract:
It has long been assumed that most parts of a genome and most genetic variations or SNPs are non-functional with regard to reproductive fitness. However, the collective effects of SNPs have yet to be examined by experimental science. We here developed a novel approach to examine the relationship between traits and the total amount of SNPs in panels of genetic reference populations. We identified the minor alleles (MAs) in each panel and the MA content (MAC) that each inbred strain carried for a set of SNPs with genotypes determined in these panels. MAC was nearly linearly linked to quantitative variations in numerous traits in model organisms, including life span, tumor susceptibility, learning and memory, sensitivity to alcohol and anti-psychotic drugs, and two correlated traits poor reproductive fitness and strong immunity. These results suggest that the collective effects of SNPs are functional and do affect reproductive fitness.

gnomon said...

also this paper in press with more evidence for the same point:

http://www.sciencedirect.com/science/article/pii/S0888754315000725

Zhu, Z., Man, X., Huang, Y., Xia, M., Yuan, D., and Huang, S. (2015) Collective effects of SNPs on transgenerational inheritance in Caenorhabditis elegans and budding yeast. Genomics, in press

Abstract
We studied the collective effects of single nucleotide polymorphisms (SNPs) on transgenerational inheritance in C. elegans recombinant inbred advanced intercross lines (RIAILs) and yeast segregants. We divided the RIAILs and segregants into two groups of high and low minor allele content (MAC). RIAILs with higher MAC needed less generations of benzaldehyde training to gain a stable olfactory imprint and showed a greater change from normal after benzaldehyde training. Yeast segregants with higher MAC showed a more dramatic shortening of the lag phase length after ethanol exposure. The short lag phase as acquired by ethanol training was more dramatically lost after recovery in ethanol free medium for the high MAC group. We also found a preferential association between MAC and traits linked with higher number of additive QTLs. These results suggest a role for the collective effects of SNPs in transgenerational inheritance, and may help explain human variations in disease susceptibility.

John Harshman said...

Haven't looked at the paper yet, but doesn't it concern only sites linked to selected sites? That should still be a fairly small proportion of the genome, and only if there's been a selective sweep within the not too distant past. Are they saying that most of the average genome is linked to a locus that undergoes frequent selective sweeps?

gnomon said...

Neutral is only an assumption, and even worse a counter-intuitive one. Why would anyone take it seriously!! It never really worked in explaining nature, as this mainstream paper made it clear:

“Revisiting an old riddle: what determines genetic diversity levels within species?”

-- Leffler et al., 2012, PLoS Biology

SPARC said...

If you authored the paper why don't you just say it?

SPARC said...

see above

W. Benson said...

The authors are saying that enough of the genome is affected by selective sweeps to put a ceiling on standing neutral variation in species that are numerically abundant. Since hitchhiking by neutral variation will speed up fixation, the loss of variation caused by sweeps may be compensated. The two effects, loss of variability and faster fixation, may cancel out such that the rate of neutral evolution is little affected. There will be work for theoretical population geneticists.
The article doesn’t explain very well why big populations should have faster adaptive evolution. They seem to imply that, in big populations, mutations with small favorable effects will be less affected by drift and tend to evolve in a more deterministic manner. I would propose (and it may be mentioned in the paper) that if adaptive evolution is as mutation-limited as the paper seems to show, a large population will have more adaptive mutations and more repetition of adaptive mutations than a small one. As a consequence, adaptive evolution, measured by the frequency of new adaptive mutations, will go faster in populations that are large. This was a major finding of Darwin in the Origin of Species: species that are widespread and abundant evolve to be exceptionally variable.
(typos corrected)

Joe Felsenstein said...

By the way, Larry, the link to Razib Khan's post lacks a colon after http and does not work as is.

"Linked", yes. For a selective sweep to be effective in reducing genetic variability at nearby loci the selection coefficient s of the favorable allele should exceed the recombination fraction between the two sites. With a recombination fraction of 0.001, the distance (in humans) is about 100,000 nucleotides. That is far enough to find other protein coding loci. And the effect of a sweep lasts quite a while. It takes a time of roughly 1/u generations for the neutral variability to build back up, where u is the neutral mutation rate per site.

A few additional points:

(1) The Corbett-Detig paper does not argue that most sites in the genome are subject primarily to neutral variation. It is talking about whether or not they are near enough to selective sites to be affected by sweeps.

(2) The sweeps will not affect the rate of neutral substitution. They clean out variability, but randomly fix some of it. The number of neutral changes in the ancestry of any one copy of a site is in fact unaffected. We can argue about this if people want.

(3) The Corbett-Detig paper does cite work by the Charlesworths on the effect of deleterious mutations that create "background selection" that reduces variability at nearby neutral sites by a Muller's Ratchet effect. That is another effect that should be taken into account in addition to the effects of selective sweeps.
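
The linkage arithmetic at the start of this comment can be sketched in a few lines. This is only a back-of-the-envelope illustration: the per-base recombination rate and the neutral mutation rate below are assumed round numbers (roughly 1 cM per Mb and 1e-8 per site per generation), not values taken from the paper, and the function names are made up for the example.

```python
# Assumed round numbers, for illustration only.
RECOMB_PER_BP = 1e-8        # assumed recombination fraction per base pair per generation (~1 cM/Mb)
NEUTRAL_MU_PER_SITE = 1e-8  # assumed neutral mutation rate per site per generation


def sweep_reach_bp(recomb_fraction, recomb_per_bp=RECOMB_PER_BP):
    """Physical distance (bp) at which the recombination fraction reaches the given value.

    Assumes recombination accumulates linearly with distance, which is only
    reasonable for small recombination fractions.
    """
    return recomb_fraction / recomb_per_bp


def recovery_generations(mu=NEUTRAL_MU_PER_SITE):
    """Rough time (generations) for neutral variability to build back up: about 1/u."""
    return 1.0 / mu


reach = sweep_reach_bp(0.001)
recovery = recovery_generations()
print(f"A sweep with s > 0.001 reaches roughly {reach:,.0f} bp")
print(f"Neutral variability recovers on the order of {recovery:,.0f} generations")
```

Under these assumptions a recombination fraction of 0.001 corresponds to about 100,000 nucleotides, matching the figure quoted above; a different recombination rate would scale the distance proportionally.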

Joe Felsenstein said...

Typo in the above: In point (1) the Corbett-Detig paper does not reject the conclusion that most sites in the genome are subject primarily to neutral variation.

(Ouch, that's a bad mistake for me to have made).

Joe Felsenstein said...

The Leffler et al. paper continually discusses the effect of selective sweeps on nearby neutral sites. It does not at all reject neutrality.

"gnomon" is, as noted above, Shi Huang. To put Dr. Huang's dramatic statements into the proper perspective, note that one of his results has been acclaimed as "The First Axiom of Biology".

Acclaimed, that is, by himself.

gnomon said...

A few questions for you, Dr. Felsenstein, 1. Do you think there is a genetic equidistance phenomenon in nature as first reported by Margoliash in 1963? 2. If so, is the molecular clock the right explanation for it? or is the molecular clock a real phenomenon in Nature 3. Is the molecular clock the best evidence for the Neutral Theory as Kimura said?

gnomon said...

In the old days, people routinely claimed their stuff to be axioms, including Spinoza, Newton etc. It is just common practice to call something an axiom that is intuitive or self evident by logic alone. Nothing grandiose needs to be implied. But unfortunately, opponents want to use that implication to discredit whoever dares to call their stuff axioms. They avoid the more difficult task, which is to take on the content of an axiom head on.

gnomon said...

The Leffler et al. paper still did not solve the old riddle of what determines genetic diversity, don't you agree, Dr. Felsenstein? This means at a minimum that neither Darwin's nor Kimura's theory (or the two combined) is a complete account of nature, if not fundamentally incorrect. The neutral theory never really worked in explaining the full reality of genetic diversity (worked to some extent but not completely). So then, why insist on using the neutral concept to explain everything? If the genome were to be found to be nearly all functional, it would be completely expected as that would easily explain why the neutral theory has not solved the old riddle of genetic diversity. I hope you agree with this axiom: without understanding the first and most astonishing result in molecular evolution, the genetic equidistance result of Margoliash, one cannot understand evolution. The field used and still uses the molecular clock and in turn the neutral theory to explain the equidistance, which is completely embarrassing and so much so that it never dared to put it into textbooks.

gnomon said...

Axiom - Wikipedia, the free encyclopedia
en.wikipedia.org/wiki/Axiom
Wikipedia
An axiom or postulate is a premise or starting point of reasoning. As classically conceived, an axiom is a premise so evident as to be accepted as true without controversy. The word comes from the Greek axíōma (ἀξίωμα) 'that which is thought worthy or fit' or 'that which commends itself as evident.'

gnomon said...

In all these debate on the Neutral assumption or any mainstream paradigm ill fated for oblivion, one must not forget Max Planck: "A scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die and a new generation grows up that is familiar with it."

Joe Felsenstein said...

More succinctly, he also said "Science advances one funeral at a time".

Yes, but the hard part is to know in advance which view is going to disappear by having its adherents die off.

And yes, I had wondered about that too -- why had you taken a conclusion that you had come to, and taken it as an axiom? Very strange.

gnomon said...

One sentence is enough to state my view: genetic distance as measured by sequence identity has a maximum limit. It is self evident, in my opinion. If you disagree, please share your reasons.

Joe Felsenstein said...

That is certainly true. With N sites in the sequence, the maximum limit on genetic distance, as measured by sequence (non-)identity, is N sites different.

Unknown said...

It's worth noting here that the practical limit is lower than that, simply as an artifact of alignment. But even more relevantly, we generally do not assume that the number of substitutions is equal to the number of observed differences, because a match can come about in different ways (no substitution in either lineage, same substitution in two lineages, back-substitution) and so can a mismatch. All current molecular clock implementations use this as far as I know. Nobody has operated with the idea that the number of differences between two sequences is going up as a linear function of time for ages.
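
To make the multiple-hits point concrete, here is a minimal sketch using the Jukes-Cantor (JC69) model for nucleotides. JC69 is the simplest substitution model and is swapped in here purely for illustration; the tools used in practice fit richer models. The function names are hypothetical.

```python
import math


def expected_p_distance(d):
    """Expected proportion of differing sites under JC69.

    d is the true number of substitutions per site separating two sequences.
    The observed proportion approaches 0.75 as d grows, because repeated,
    parallel and back substitutions hide earlier changes.
    """
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jukes_cantor_distance(p):
    """Estimate substitutions per site from the observed proportion of differing sites p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the JC69 correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for d in (0.05, 0.5, 1.0, 3.0):
    p = expected_p_distance(d)
    print(f"true d = {d:4.2f}  observed p = {p:.3f}  corrected d = {jukes_cantor_distance(p):.3f}")
```

The observed difference always lags behind the true number of substitutions, and the gap widens with divergence, which is why observed differences are not treated as a linear function of time.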

gnomon said...

Absolutely amazing you said this. I would be shocked if you don't regret to ever said this. So you are saying that protein sequence identity could go from 100% in the beginning when two lineages just split to 99%, 98%, 97%, ....... all the way to 0% given enough time (or more precisely 10-15% because that is the identity for two non-related proteins). That means to me no limit!!! But what I observe and what is demanded by common sense is that a functional protein cannot freely change whatever residue it has, for example, cytochrome C in fungi are not and cannot be more than 50% different. Anything more than that would destroy the function of cytochrome C in fungi and in turn the individual that carries those extra mutations. There is a thing called functional constraint on sequence variation. Anyone who has done mutagenesis experiments can attest to that. If what you said is true, we could easily imagine two non related sequences to encode the same function. We don't just stop there. Two can become 4, and 4 to 8, and so on it goes all the way to infinity. So, one day we would see an infinite number of non related sequences all encoding the same function. If that is not absurd I don't know what is. So, by this way of logic (reductio ad absurdum), it is easily proven that a maximum limit on genetic distance imposed by function is a common sense or axiom. It shows that your idea of no limit or a maximum limit of 100% difference is false. To experimentally prove you wrong, all one has to do is to mutate one conserved residue in cytochrome C and observe its deleterious effects on an organism, which obviously has been done many times in the past. What you said could only be true for a protein with no function what so ever and existing only in a test tube.

gnomon said...

Simon said: "Nobody has operated with the idea that the number of differences between two sequences is going up as a linear function of time for ages." Yes, they may not treat it as linear but still treat it as going up with time. They don't have my concept of plateau when distance do not increase at all with time. Most distances we observe today are in the plateau phase and have been like that long before present.

Unknown said...

Yes, they may not treat it as linear but still treat it as going up with time.
Not how the current generation of computational tools work. The 3 things used today are MCMCtree (which implements multidivtime, so I don't list that separately), BEAST and dpp-div (including variants like fdpp-div). In all of these the number of differences is modeled as a markovian process, which does include the probability that differences are going down through reversals and convergences.

They don't have my concept of plateau when distance do not increase at all with time.

They may not have your concept of "plateau", but there's literature discussing "saturation" stretching back decades. There's also literature on how we can test for the effects on our divergence time estimates and how to minimize them (the key is using multiple partitions and checking whether dates converge).

Anonymous said...

@gnomon You say:
"What you said could only be true for a protein with no function what so ever and existing only in a test tube."

1. I don't think that Simon is specifically talking about protein coding sequences,

2. Don't neutral alleles by definition have no "function"?

3. How can you be sure that there are no peptides/proteins produced in organisms with no "function" ?

I have the feeling (If I'm not mistaken) that you are (mostly) talking about protein-coding-sequences, where (usually) a selection-pressure constrains the variety possible, whereas everybody else is talking about (nearly) neutral alleles which are under no (direct) selection pressure.

What is your opinion on (f.e.) most of the Alu repeats ? Do they have a function?

Cheers,
Michael

gnomon said...

Simon,
Saturation is not the same as plateau. Yes, saturation can be corrected but plateau cannot. Saturation assumes no plateau. The best evidence for plateau is the first result in molecular evolution by Margoliash, Zuckerkandl and Pauling, the genetic equidistance result. The overlap feature of that result (check my paper in 2010 to learn more about it), where the same position is found mutated God knows how many times in different species (against probability theory) while other positions never mutated even once, has never been explained or even recognized as unusual until my maximum limit idea came along in 2008.

Don't forget the whole theoretical basis for the field, molecular clock and neutral theory, were inspired by protein data not nucleotide data. and regardless, we have a reality as shown by protein alignments and we need a real explanation. The clock and in turn the neutral theory is a false one for that matter.

With regard to Alu and other questions, all I need to say is that my idea of mostly functional genome works in explaining what has been observed, including best of all, the genetic equidistance result. It also directs real experiments on medical problems as our recent papers did. The neutral theory did not do any of those. Therefore the neutral idea is already falsified, while mine not yet. And to aspire to be a hard core scientist, I only need one contradiction to be proven false.

Unknown said...

In this case your plateau is even less of a novelty. Kimura's strict neutral model (1968) assumes that there are two classes of sites: ones for which s=0 and ones for which s=-infty. This already assumes that there is a maximum number of divergent sites, because the ones for which s=-infty are absolutely conserved. There are weaker statements of this - mainly because functional constraints can be weakened by things like gene duplication, where one copy can start to change, because the other one takes over the function.
So your novelty isn't something we've known since the 80s, it's something we've known since the 60s.

Joe Felsenstein said...

"gnomon" (Shi Huang) said:

One sentence is enough to state my view: genetic distance as measured by sequence identity has a maximum limit. It is self evident, in my opinion. If you disagree, please share your reasons.

I agreed that there was an upper limit, namely 100% difference. But gnomon is unhappy with this, saying that

it is easily proven that a maximum limit on genetic distance imposed by function is a common sense or axiom.

That's funny, the "one sentence" has changed by having "imposed by function" added to it. I guess that the particular sentence used was not enough to state gnomon's view.

gnomon said...

In that case, what is the reason for the plateau of Kimura as interpreted by you? Do different species have different plateau? A bacteria species such as E. coli has as high as 30% maximum difference in genome wide sequence among different individuals (just learned from microbiome seq community that the definition for a bacteria species is that different individuals of a species should not be more than 30% different in genome seq.). In contrast, two different species of mammals such as mouse and human are only 10% different in genome sequence. Or, human has only 0.1% maximum difference between two different individuals. Why bacteria can tolerate so much genetic variations? Why simple organisms can tolerate so much higher plateau? Is the idea that simple organisms can tolerate more random variations and hence higher maximum/plateau distance already obvious in Kimura's theory? Do organismal complexity play any role in constraining random variations in his theory? If so, my idea would indeed be nothing novel. But then, why the field still uses and has always used the clock and neutral theory to explain the genetic equidistance result, which treats all observed distances as still increasing with time? (The correct way is to use my plateau idea as I did in 2008). Why the field still considers nearly 90% of human genome to be neutral/junk, which would imply that two individual humans could be as much as 90% different rather than the observed 0.1%? Bottle neck or short time are all ad hoc speculations invoked to explain away such low diversity. But we do have real evidence that ~0.1% is the plateau. One such data is that anything bigger could only be found in patient populations with complex diseases such as Parkinson's disease, schizophrenia, and cancer (Yuan et al, 2012).

Yuan, D., Zhu, Z., Tan, X., Liang, J., Zeng, C., Zhang, J., Chen, J., Ma, L., Dogan, A., Brockmann, G., Goldmann, G., Medina,E., Rice, A.D., Moyer, R.W., Man, X., Yi, K., Li, Y., Lu, Q., Huang, Y., Wang, D., Yu, J., Guo, H., Xia, K., and Huang, S. (2012) Minor alleles of common SNPs quantitatively affect traits/diseases and are under both positive and negative selection. arXiv:1209.2911

gnomon said...

Joe Felsenstein said:"I agreed that there was an upper limit, namely 100% difference."

I would have to say that you are one of a kind and your answer did surprise me. In all my past 8 years of discussing the maximum distance idea, no one has taken maximum limit to mean what you have done. Somebody is lacking common sense here. And the field of evolution is not known to have any common senses. And what is known is that a "strange inversion of reasoning" is acclaimed to be the best idea mankind has ever had.

Unknown said...

In that case, what is the reason for the plateau of Kimura as interpreted by you?

Kimura's strictly neutral model had two classes of mutations: neutral mutations and lethal mutations. Some percentage of the genome would not change at all, the remainder of the genome would change at the neutral rate. There's not a lot of interpretation on my part...
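
A toy version of that two-class picture shows where a ceiling on observed divergence comes from. The sketch assumes nucleotide sites, a Jukes-Cantor process at the freely changing sites, and a hypothetical `neutral_fraction` parameter; none of these specifics come from Kimura's paper or from this thread.

```python
import math


def observed_divergence(d_neutral, neutral_fraction):
    """Expected proportion of differing sites across the whole sequence.

    d_neutral: substitutions per site accumulated at the neutral sites.
    neutral_fraction: share of sites free to change; the rest never change.
    """
    if not 0.0 <= neutral_fraction <= 1.0:
        raise ValueError("neutral_fraction must lie between 0 and 1")
    # Only the neutral class contributes differences, and it saturates at 3/4.
    return neutral_fraction * 0.75 * (1.0 - math.exp(-4.0 * d_neutral / 3.0))


def plateau(neutral_fraction):
    """Ceiling on observed divergence as time goes to infinity."""
    return 0.75 * neutral_fraction


for f in (0.2, 0.5, 1.0):
    late = observed_divergence(50.0, f)
    print(f"neutral fraction {f:.1f}: divergence after long time = {late:.3f}, plateau = {plateau(f):.3f}")
```

In this sketch the ceiling depends only on the fraction of sites that are free to change, so sequences with more constrained sites level off at a lower observed divergence.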

A bacteria species such as E. coli has as high as 30% maximum difference in genome wide sequence among different individuals (just learned from microbiome seq community that the definition for a bacteria species is that different individuals of a species should not be more than 30% different in genome seq.).

Species definition alert! That definition is a purely phenetic one and quite a few people (me included) would reject it as a definition for species. The BSC is still the most commonly used definition and if you think that you have said anything meaningful when you point out that there are differences between the properties of species defined by the BSC and species defined by some other (in this case obviously arbitrary) definition, then you should think again.

Joe Felsenstein said...

gnomon: How kind! My colleagues in evolutionary biology join me in thanking you.

Mikkel Rumraket Rasmussen said...

"Absolutely amazing you said this. I would be shocked if you don't regret to ever said this. So you are saying that protein sequence identity could go from 100% in the beginning when two lineages just split to 99%, 98%, 97%, ....... all the way to 0% given enough time (or more precisely 10-15% because that is the identity for two non-related proteins). That means to me no limit!!!"

Yes. Why the heck not? Here's the problem with your functional constraint: There can be more than one function available in sequence space. Suppose you change one of your duplicates one amino acid at time. The sequence identity drops as you note, 99-98-97 etc. etc. Sooner or later, the protein might stop having the original function, but it might have taken on an entirely new function in the mean time, due to the change. It is even possible that a protein can totally lose all sequence identity (0%), yet still have maintained roughly a similar fold structure (because of similar but not identical amino acids). It could have maintained the amino acid motif (polar vs non-polar) for example.

Why not? The only problem I see with this is that it makes it difficult to determine whether two structurally similar proteins with little to no sequence similarity evolved due to descent from a common ancestor or due to convergent evolution. But I don't see why it would be impossible for a protein to change so much over 3 billion years of evolution you're left with 0 sequence similarity, since we know proteins can go through multiple functions from one to the other through gradual loss and eventual replacement. Such cases might be relatively rare, but why would they not happen at all?

John Harshman said...

Joe, can you explain what gnomon is on about here?

Joe Felsenstein said...

He (Shi Huang) thinks he has refuted neutrality by observing a maximum limit to "diversity", which I now realize means difference between individuals within a species. (Of course, that would depend on the fraction of the genome under selective constraint and, for the non-constrained sites in the genome, the depth of the coalescents there. That in turn depends on effective population sizes and mutation rates.) He thinks that this correlates with "organismal complexity". He thinks that this is well-established, and so important that it should be called the First Axiom of Biology.

(Biology went along for hundreds of years without a First Axiom).

He thinks that his work is what we really need to discuss here.

Here is his blog with much more.

He has never been accused of excessive modesty.

gnomon said...

Joe Felsenstein,
Overlooking inconvenient facts is not the way to do hard science. Let us not get distracted here and focus on the real issue. What results inspired the UNIVERSAL molecular clock? Is it not true that Kimura proposed the neutral theory to explain the now defunct UNIVERSAL molecular clock? One really doesn't need anything else to disprove the neutral theory. All he needs to observe is that the UNIVERSAL molecular clock is not real and widely acknowledged to be false and hence any theory supported by it and invoked by it, which of course is the neutral theory, would fall apart by default. He would also observe that the original result that provoked the clock idea has now no real theory behind it (if you disregard my work) and yet the field could care less and still goes on with a theory that was invented by mistake. (I am not saying the neutral theory is not right in some limited situations. But as an explanation for the majority of the genome and for the fake molecular clock and for the real equidistance phenomenon, it is the exact opposite of what is true.)

For a universal molecular clock interpretation of the first result in molecular evolution, check this
http://www.antievolution.org/people/wre/evc/argresp/sequence.html

Joe Felsenstein, please tell us whether we should or should not believe in the molecular clock interpretation of the genetic equidistance result of Margoliash. And also, whether we should or should not believe in the neutral theory if there is no universal molecular clock and if the equidistance result has no answer in the neutral theory. One of course has to keep this in mind that you are not neutral in this debate as your lifelong work is premised on the neutral theory. But unfortunately truth and time do not care.

Mikkel Rumraket Rasmussen said...

I do have to ask, who says neutral theory is "an explanation for the majority of the genome" ?

What is "an explanation for the genome" anyway? What question is being asked? Why is the genome so big? Why does it look the way it does? What elements does it contain? It isn't clear what question neutral theory is being accused of failing to answer. Can you be more precise?

John Harshman said...

So gnomon seems incoherent and unable to explain even his own ideas. He doesn't like neutral theory, but I see no sign that he's proposing that the entire genome is under selection as an alternative. He seems to be hinting at something else, whatever that might be. Nor does he seem to know whether he's talking about within-species diversity or between-species distances.

Larry Moran said...

Here's a brief description of the molecular clock that was based on the first sequence comparisons in the 1960s: The Modern Molecular Clock.

The existence of an approximate molecular clock is not in doubt in spite of what Shi Huang (gnomon) is saying. He is speaking nonsense.

The explanation for an approximate molecular clock is that most of the fixed alleles are neutral and the rate of fixation is equal to the mutation rate. Since the mutation rate is approximately constant in different lineages, this gives rise to a relatively constant (stochastic) rate of fixation in each branch of the phylogenetic tree.
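
The claim that the rate of fixation equals the mutation rate follows from a short calculation, sketched here for a diploid population of size N. The function name and the example numbers are illustrative assumptions; exact fractions are used so that the cancellation of N is visible rather than hidden by rounding.

```python
from fractions import Fraction


def neutral_substitution_rate(pop_size, mu):
    """Neutral substitutions per generation at a locus in a diploid population.

    pop_size: number of diploid individuals N (so 2N gene copies).
    mu: neutral mutation rate per gene copy per generation.
    """
    new_mutations_per_generation = 2 * pop_size * mu
    # A new neutral mutation starts at frequency 1/(2N), which is also its
    # probability of eventually drifting to fixation.
    fixation_probability = Fraction(1, 2 * pop_size)
    return new_mutations_per_generation * fixation_probability


mu = Fraction(1, 10**8)
for n in (100, 10_000, 1_000_000):
    print(f"N = {n:>9,}: substitution rate = {neutral_substitution_rate(n, mu)}")
```

Population size drops out: large populations produce more neutral mutations, but each one is proportionally less likely to fix, so the substitution rate stays at the mutation rate.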

This explanation is consistent with everything we know about population genetics. The idea that the alleles (amino acid substitutions) are neutral fits with everything we know about protein structure and evolution. You would have to be crazy to reject all of that.

John Harshman said...

I think you also have to add some kind of covariation hypothesis to make this work for protein evolution. Most substitutions are not neutral at any given time, which is why protein evolution is slower than the neutral rate. But which substitutions, and at which sites, are neutral changes over time because of environment and, mostly, substitutions at other sites. It's time-variant, constrained neutral evolution.

Mong H Tan, PhD said...

LAM: "The existence of an approximate molecular clock is not in doubt in spite of what Shi Huang (gnomon) is saying. He is speaking nonsense."

On the contrary, I think you both are talking past each other: ie, the Molecular Clock (or Neutral) theory vs the Maximum Genetic Diversity (MGD) hypothesis!?

While both theories/hypotheses are theoretically and scientifically sound -- especially from the mid-20th-to-21st-century biomolecular points of view -- both the Neutral and the MGD theories may not be scientifically or deductively used to infer or prove the "evolutionary theory" of species by Natural Selection as first globally observed and speculated by Charles Darwin in the years 1831-1858! -- Despite the rhetorical claims by the Natural Selectionist or Neo-Darwinists since the late-19th-to-mid-20th century, I have since several years ago proclaimed that the classical Darwinism (since 1859) is a philoscientific observation-analysis of sort, that may be classified in the 20th-century Natural Phenomenology: a philoscientific observational theory that may not be empirically proven: thus the Macroevolution vs the Microevolution, forever!?

As I read through the research projects as proposed in the MGD hypothesis by SH, I thought that the project #3 -- when its complete data are obtained -- shall answer my proclamation above!?

Best wishes, MHT.

Joe Felsenstein said...

The molecular clock is a useful and fruitful approximation. As Larry implies, the rate of substitution in lineages is approximately constant, but this varies as one moves farther and farther away on the tree. It ought to -- the biology of the species changes and so the mutation rates and generation times change, and changes in population size expand or contract the fraction of mutations that are effectively neutral.

So no, there is not a single universal molecular clock. But almost all population genetics inference within species is based on having a clock.

gnomon said...

Joe Felsenstein:
I carefully selected this quote from one of my papers for my profile on the Third Way of Evolution, which I was just invited to join along with a few other scholars.
"The Maximum Genetic Diversity hypothesis thus includes the proven virtues of the modern evolution theory, consisting of Darwin’s theory and the neutral theory, as a component relevant only to microevolution over short time scales before reaching maximum genetic distance/diversity."

(Huang, S. 2010. The Overlap Feature of the Genetic Equidistance Result—A Fundamental Biological Phenomenon Overlooked for Nearly Half of a Century. Biological Theory, 5: 40-52.)

In this paper, you will find why the molecular clock/neutral theory is both true and false. True for the linear phase and false for the plateau phase. The linear and plateau phases can be easily distinguished by the overlap feature when doing protein alignments of 3 species. If one sees a lot of positions where only one species is mutant while the other two share the same residue, things are in linear phase. If, on the other hand, one sees a lot of positions where all 3 species have each got a different residue at the same position, which means two or more independent mutations have occurred at the same position, things are in plateau phase. A large number of such overlapped mutant positions (12 out of 102 positions for cytochrome C when aligning human, Drosophila, and yeast homologs) is against probability theory, and hence molecular clock and neutral theory, while a small number is consistent. When aligning 3 yeast species, one observes only 2 out of 102 positions to be overlaps, consistent with probability theory, and hence molecular clock and neutral theory.
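The column counting described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the method used in the cited paper: the function name `classify_columns` is hypothetical, the three toy sequences are made up (they are not real cytochrome c), and the sketch assumes the sequences are already aligned with no gaps.

```python
def classify_columns(seq_a, seq_b, seq_c):
    """Count columns of a gap-free three-way alignment by residue pattern."""
    if not (len(seq_a) == len(seq_b) == len(seq_c)):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"identical": 0, "one_differs": 0, "all_differ": 0}
    for a, b, c in zip(seq_a, seq_b, seq_c):
        distinct = len({a, b, c})  # number of different residues in this column
        if distinct == 1:
            counts["identical"] += 1
        elif distinct == 2:
            counts["one_differs"] += 1  # one species differs, two share a residue
        else:
            counts["all_differ"] += 1   # the "overlap" positions in the comment
    return counts


# Made-up toy sequences, for illustration only.
toy = classify_columns("ACDEFG", "ACDEYG", "ACKELG")
print(toy)
```

Whether a given number of "all_differ" columns is surprising depends on the substitution model one assumes, which this sketch does not address.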

So, Joe Felsenstein, you are absolutely right to say: "So no, there is not a single universal molecular clock. But almost all population genetics inference within species is based on having a clock." But unfortunately you did not realize or don't want to admit that the original clock idea was a universal clock invoked to explain an equidistance phenomenon that is really a reflection of the plateau phase of evolution. And, you did not realize or don't want to admit that the neutral theory was meant to explain a single universal molecular clock of either hemoglobin or cytochrome c. If, as you admitted, there is not a single universal molecular clock, what are you going to say about the equidistance result of Margoliash, which was explained by Margoliash precisely by imagining a single universal molecular clock? Is it not plainly obvious that the equidistance now has no real explanation if there is not a universal clock and if you disregard my MGD hypothesis? How can the field possibly avoid doing most things wrong when the theory for the field was invented by mistake to explain a fake reality of a universal clock? How can the field possibly avoid doing most things wrong when the equidistance phenomenon, ‘one of the most astonishing findings of modern science’ as rightly said by Mike Denton, remains unexplained by the neutral theory or any other theories of the field?

The correct view on the neutral theory is what my quote above shows. The question now is what fraction of a genome is in the plateau phase today after 3 billion years of evolution. The answer is, after our careful analysis, nearly 100% for humans. We still can pick out a fraction of the genome that is in linear phase, which we are using to date human history or origin (ms in preparation). Indeed, if most things are not in plateau phase after such a long time nearly incomprehensible by humans, one would seriously wonder why.

Anonymous said...

gnomon,

If you'd decided to be part of a propagandist group such as "the third way," then your views must be quite flawed.

It seems like your position consists of misunderstanding a few terms and definitions, then "fixing" something that happens to be obvious to the rest of us, and making it all up into something much more meaningful than it really is. I suspect that you're nothing but self-promoting hot air.

Larry Moran said...

I have no idea what Shi Huang is on about but the fact that he quotes Michael Denton is interesting.

Piotr Gąsiorowski said...

This particular Denton quote (from Evolution: A Theory in Crisis, p. 277-278) is popular with creationists of every hue. It expresses Denton's naive amazement that although eukaryotes may differ greatly from one another, they all seem to be equidistant from prokaryotes -- something allegedly at odds with evolutionary theory. Denton evidently attacks a man of straw -- the idea of linear progress in the living world, and the notion that, say, modern fish are less evolved than mammals and should therefore be "more closely related" to bacteria and look "transitional". I'm not at all surprised that Shi Huang quotes this stuff. It shows at once where he comes from.

gnomon said...

“The most remarkable result in molecular evolution, the approximately constant evolutionary rate of homologous proteins.”

- Leigh Van Valen (of Red Queen fame). Molecular evolution as predicted by natural selection. J. Mol. Evol., 1974, 3: 89–101

Larry Moran,
So Leigh van Valen said basically the same thing as Mike Denton except that he is talking about the mistaken interpretation of the real most remarkable result behind the universal molecular clock, the genetic equidistance result. Also, Denton has published as recently as 2001 in Nature, long after his 1986 book. Hope that is also interesting to you.

Michael Denton and Craig Marshall, (2001) Laws of form revisited. Nature, 2001, 410:417

Piotr Gąsiorowski said...

P.S. I'm not sure if the page reference is correct for Denton 1985. Anyway, it's from section 12.2 ("The molecular equidistance of all eucaryotic organisms from bacteria"), pp. 280-281 in the 2002 edition.

gnomon said...

from wiki:
The genetic equidistance phenomenon was first noted in 1963 by Emanuel Margoliash, who wrote: "It appears that the number of residue differences between cytochrome C of any two species is mostly conditioned by the time elapsed since the lines of evolution leading to these two species originally diverged. If this is correct, the cytochrome c of all mammals should be equally different from the cytochrome c of all birds. Since fish diverges from the main stem of vertebrate evolution earlier than either birds or mammals, the cytochrome c of both mammals and birds should be equally different from the cytochrome c of fish. Similarly, all vertebrate cytochrome c should be equally different from the yeast protein."[2] For example, the difference between the cytochrome C of a carp and a frog, turtle, chicken, rabbit, and horse is a very constant 13% to 14%. Similarly, the difference between the cytochrome C of a bacterium and yeast, wheat, moth, tuna, pigeon, and horse ranges from 64% to 69%. Together with the work of Emile Zuckerkandl and Linus Pauling, the genetic equidistance result directly led to the formal postulation of the molecular clock hypothesis in the early 1960s.[3] Genetic equidistance has often been used to infer equal time of separation of different sister species from an outgroup.[4][5]

Piotr Gąsiorowski said...

Sure, but Denton confused outgroups with "intermediate forms" in his book.

Larry Moran said...

@Piotr Gąsiorowski

We need to be really careful about interpreting Denton. One could argue that he was confused in his first book Evolution: A Theory in Crisis, but his second book (Nature's Destiny) is much more relevant [see Michael Denton and Molecular Clocks].

Personally, I think we have misinterpreted Denton's first book but most people think he changed his mind. If so, it's important to refer to his more recent views and not the old ones.

Larry Moran said...

Denton knows a lot more about evolution than most creationists. Here's what he thinks of the molecular clock.

These twin discoveries—that the mutation rate equals the evolutionary substitution rate, and that the rate of change in many genes is regulated by a clock which seems to tick simultaneously in all branches of the tree of life—may represent the first evidence, albeit indirect, that the mutational processes that are changing the DNA sequences of living things over time are indeed directed by some as yet unknown mechanism, or more likely mechanisms. Of course, these discoveries do not prove directed evolution, but it is far easier to imagine them as the outcome of some sort of direction than the outcome of purely random processes. (Nature's Destiny p. 292)

He agrees with Joe Felsenstein that the fact that the rates are pretty constant is surprising given that mutation rates and generation times should vary considerably in different lineages.

I don't think mutation rates vary by very much and the generation times aren't nearly as different as most people believe. The fact that similar numbers of changes occur in the lineage leading to E. coli and humans is consistent with evolution.

Larry Moran said...

photosynthesis says,

If you'd decided to be part of a propagandist group such as "the third way," then your views must be quite flawed.

Kooks tend to find each other. They become more powerful when they form a herd.

Tom Mueller said...

Please correct me if I am wrong, but I must be missing something.

Neutral Theory does not deny the existence or importance of selection but rather questions the relative importance of selection vs. random drift as THE major driving force of evolution. I hope I have not over-simplified this all.

OK so far? … and the champions of Neutral Theory would therefore have NO problem with data suggesting that certain categories of lineages demonstrate enhanced importance of selection vs. other lineages (the majority presumably) that exhibit the contrary.

I hope I have got this correct so far.

It seems to me that large populations are more likely to demonstrate the “Allee Effect” than small populations.

But (and this is the important bit)

An Allee effect is by definition a positive association between absolute average individual fitness and population size over some finite interval.

http://www.nature.com/scitable/knowledge/library/allee-effects-19699394

So again, please correct me if I am wrong… but I may be missing something.

The Champions of Neutral Theory should have no problem with certain populations demonstrating enhanced selection compared to others if in fact they were exhibiting a positive Allee effect, a positive effect that by definition occurs in some but not all larger populations (ergo a trend).

I hope I am not hopelessly confused again.

judmarc said...

Why should mutation rates vary considerably in different lineages?

gnomon said...

As I said above, the clock and the infinite sites assumption of the neutral theory can explain linear equidistance just fine where every mutation hits a new site. But precisely because of that virtue, the same theory cannot explain a vastly different phenomenon, the maximum equidistance where many mutations hit the same site. Two different phenomena require two different ideas, one for the linear and one for the plateau. And unfortunately for the clock/neutral theory, the vast majority of equidistance we observe today are not linear and can be easily shown to be so unless one insists on pretending to be dumb and blind.

Nature is not as simple as some naive minds would like to imagine. If one insists on ignoring the more brain-demanding parts of nature, he is just wasting his life to pursue a career in science.

Tom Mueller said...

@ judmarc

Re:
“Why should mutation rates vary considerably in different lineages?”

I am not really clear on why, but is it not certain that they do?

Can some genomes evolve more slowly than others?
http://sandwalk.blogspot.ca/2014/01/can-some-genomes-evolve-more-slowly.html

judmarc said...

From the cited post:

It looks like the mutation rate is relatively constant in all lineages (bacteria, protozoa, plants, animals, etc.). This isn't a big shock since the vast majority of mutations are due to errors in DNA replication and the fundamental biochemistry of DNA replication and repair are similar in all species.

That's the overview, which is what I was thinking when I asked the question about variation. Further in, there's a little more detail about slightly anomalous results from the elephant shark:

It looks like there are about 10% fewer changes that have been fixed in the elephant shark genome compared to many other vertebrates. This doesn't seem like a big number to me since we're looking at stochastic changes over hundreds of millions of years. I prefer to see this glass as half full—there is an approximate molecular clock. I also remain a bit skeptical of the results since there are many potential sources of error.

But what if the data actually reflects a true slowing down of evolution in elephant sharks? What does this mean?

...The simplest explanation is that the biochemical mutation rate in elephant sharks is lower than in other species. In other words, DNA replication is more accurate in sharks or repair is more efficient. While we can't rule this out, it doesn't seem very likely.

Perhaps the explanation is much more complicated. Michael Lynch has some good arguments for a correlation between genome size and overall per nucleotide mutation rate per generation. Species with larger genomes tend to have larger mutation rates (Lynch, 2007, 2010). Note that the elephant shark genome is only about 1 Gb whereas mammalian genomes are about 3 Gb in size.


So Denton I suppose is doing what I call the "Thermos fallacy" again. There's a perfectly simple biochemical explanation as to why mutation rates shouldn't vary terribly much, but when he finds this to be the case Denton figures it's evidence of a directed process. It's like the guys in the old Thermos joke: "You put something hot in, it stays hot. You put something cold in, it stays cold." "Yeah, so what?" "So how does it know?"

Tom Mueller said...

@ judmarc...

THANK YOU!!!!

I really will need to sit down and reread and rethink what you wrote.

ITMT... things that make you go hmmmm!

Species with larger genomes tend to have larger mutation rates (Lynch, 2007, 2010).

On the subject of lineage specificity; have we closed one Pandora's Box only to open another?

I am thinking of selection at a species level as per former conversations with John Harshman.

Piotr Gąsiorowski said...

LM: Personally, I think we have misinterpreted Denton's first book but most people think he changed his mind. If so, it's important to refer to his more recent views and not the old ones.

My impression is that he did make a blunder in the 1985 book (and the blunder remained uncorrected in the later edition) but realised his error later and was careful not to repeat it. It's human to err and I wouldn't have mentioned it again, but the passage quoted by "gnomon" reflects Denton's old view, and it's unfortunately that old passage that is circulated by creationists to this day as "evidence against evolution", from the Discovery Institute to Harun Yahya's brochures.

Jmac said...

Gnomon,

I know you are not quest/Whiten. So, who are you questioning? And why?

gnomon said...

Ominous news for the neutral theory nearly every week now: Nature paper yesterday found endogenous retrovirus (ERV) to be functional.
http://www.nature.com/nature/journal/vaop/ncurrent/full/nature14308.html#ref1

Human endogenous retrovirus (HERV) proviruses comprise a significant part of the human genome, with approximately 98,000 ERV elements and fragments making up nearly 8%. One family, termed HERV-K (HML2), makes up less than 1% of HERV elements but is one of the most studied.

The paper found HERV-K to be fully functional. By inference via good common sense, the whole ERV class should also be functional, which just needs time and effort to be found out. This inference for the ERV kind of sequence is exactly like the way we consider the protein kind to be all functional: the functions of probably ~80% of human proteins remain unknown, but no one doubts that they have a function because we do know some proteins have functions. So, if one type of ERV has functions, which happens to be the most studied, should it not be the null hypothesis that all ERVs have functions?

The popgen and molecular evolution field today, mostly made up of people who rarely do any bench work on DNA functions, still considers ~90% of the human genome to be neutral junk. But how interesting and dramatic, a big chunk of this junk was turned into gold overnight by one paper!! More interesting and dramatic findings of the same kind are sure to come over and over again within the next two years until all popgen researchers abandon their neutral bandwagon and join their bench colleagues, who have nearly all been on the functional train for a long time.

Abstract of the paper:

Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells
• Edward J. Grow, et al

Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections, and comprise nearly 8% of the human genome. The most recently acquired human ERV is HERVK(HML-2), which repeatedly infected the primate lineage both before and after the divergence of the human and chimpanzee common ancestor. Unlike most other human ERVs, HERVK retained multiple copies of intact open reading frames encoding retroviral proteins. However, HERVK is transcriptionally silenced by the host, with the exception of in certain pathological contexts such as germ-cell tumours, melanoma or human immunodeficiency virus (HIV) infection. Here we demonstrate that DNA hypomethylation at long terminal repeat elements representing the most recent genomic integrations, together with transactivation by OCT4 (also known as POU5F1), synergistically facilitate HERVK expression. Consequently, HERVK is transcribed during normal human embryogenesis, beginning with embryonic genome activation at the eight-cell stage, continuing through the emergence of epiblast cells in preimplantation blastocysts, and ceasing during human embryonic stem cell derivation from blastocyst outgrowths. Remarkably, we detected HERVK viral-like particles and Gag proteins in human blastocysts, indicating that early human development proceeds in the presence of retroviral products. We further show that overexpression of one such product, the HERVK accessory protein Rec, in a pluripotent cell line is sufficient to increase IFITM1 levels on the cell surface and inhibit viral infection, suggesting at least one mechanism through which HERVK can induce viral restriction pathways in early embryonic cells. Moreover, Rec directly binds a subset of cellular RNAs and modulates their ribosome occupancy, indicating that complex interactions between retroviral proteins and host factors can fine-tune pathways of early human development.

SRM said...

So, a) HERVK is the most recently acquired ERV, b) makes up 0.08% of the genome, c) was already known that genes were expressed in certain contexts, and d) from abstract: "Unlike most other human ERVs, HERVK retained multiple copies of intact open reading frames encoding retroviral proteins".

Are you sure this is as dramatic as you think, Gnomon?

gnomon said...

Of course it is. a) More anciently acquired ERVs may play a role in functions important also for other non-human lineages. b) 0.08% is a lot of sequence, considering the plateau distance between any two people is merely 0.1%. c) 80% of the genome is expressed, which surely covers a lot of supposed junk. d) Non-coding doesn't equate to non-functional unless you are still in the 70s of the last century.

gnomon said...

If you had my experience in publishing papers on the collective effects of common SNPs, you would know how dramatic it is. Our paper found that common SNPs representing about 0.1% of the human genome are all functional at least in a collective way. Yet nearly all the editors and reviewers of the numerous journals we submitted to hated this message, believing most of them to be neutral junks. So if they now are told that 0.08% of the genome, which they had always treated as neutral, are functional, just imagine what their facial expression would be like.

Unknown said...

Larry: I don't think mutation rates vary by very much and the generation times aren't nearly as different as most people believe.

Which mutation rates? Conventionally we would either give mutation rates in terms of BP/generation or BP/Myr. The two are obviously related through generation times and can't both be constant. For a lot of population genetics, we'd be looking at BP/generation; for molecular clock dating we'd need BP/Myr.
I think you are overshooting here and I'd urge you to look at molecular clock implementations in use. Relaxed clocks are the norm and we do see improvements in age estimates from multiple fossil calibrations. If rate heterogeneity could be safely ignored that would not be the case.
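To make the point about units concrete, here is a minimal sketch of the conversion between a per-generation rate and a per-Myr rate. The function name and the numbers (a rate of 1e-8 per site per generation, generation times of 1 and 20 years) are hypothetical values chosen for illustration, not measurements from any species.

```python
def per_generation_to_per_myr(rate_per_generation, generation_time_years):
    """Convert a per-site, per-generation rate into a per-site, per-Myr rate."""
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    generations_per_myr = 1_000_000 / generation_time_years
    return rate_per_generation * generations_per_myr


# Hypothetical example: the same per-generation rate in two lineages
# whose generation times differ 20-fold.
same_rate = 1e-8
short_lived = per_generation_to_per_myr(same_rate, generation_time_years=1)
long_lived = per_generation_to_per_myr(same_rate, generation_time_years=20)
print(short_lived / long_lived)  # ratio of the two per-Myr rates
```

With an identical per-generation rate, the per-Myr rates differ by the ratio of the generation times, so the two quantities cannot both be constant across lineages whose generation times differ.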

Donald Forsdyke said...

George Romanes, like Shi Huang,
Who calls himself Gmonon,
Was sad there were no praises sang,
For his favorite axiom.
Advanced in distant eighteen eighties,
But admired not by his maties.

So “Intuitive and self-evident” thought
Darwin’s young associate.
Yet, fruitless battle long he fought,
For “collective variation” postulate.
This today sounds not absurd,
Though “collective mutation” is the word.

But Thomas Huxley and Thiselt’n-Dyer,
On this issue ‘came quite testy,
Romanes’ axiom did not admire,
Deplored his lack of modesty.
Circled wagons round this crank,
‘Til they died off as in dictum Plank.

Yes, one by one they all died off,
Grim reaper’s selective sweep.
Romanes had won no single prof,
Ideas now fade in dusty heap.
Ignored by modern biometricians
All intent on neutral missions.

Said, for us to do our sums,
Need mutation sans adaptation.
And we can cite our neutral chums
Befuddle rest with long equations.
Then along came proud Gmonon
With reductio ad absurdum.

He knoweth not Akiyoshi Wada,
Nor Grantham’s Genome Hypothesis,
Yet casting doubt on Kimura,
To our ears his words are bliss.
And if you endure not poetry
See Notes and Rec R. S’ciety!

Forsdyke DR (2010) Notes & Records of the Royal Society 64:139-154.
http://rsnr.royalsocietypublishing.org/content/early/2009/10/27/rsnr.2009.0045.full.pdf+html

Greg Laden said...

It is great to see someone sticking up for selection, because it is way more interesting than neutral process. But, I have two questions about the paper, pertaining to the species selected. Maybe three.

There are a lot of domestic species (or quasi domestic) in the data base, and thus, species with strong artificial selection.

My gut feeling is that there is a disproportionate number of species that tend to have larger than average changes in population size (boom/bust). This would mean that larger population size would be associated with relatively low diversity (initially) because of founder effects.

Third (maybe), I'm not sure if mixing entirely different reproductive patterns together is wise (e.g., looking at bees alongside bighorn sheep). Seems like you could get stung (or butted) that way.

Larry Moran said...

It is great to see someone sticking up for selection, because it is way more interesting than neutral process.

What you meant to say was that YOU are far more interested in adaptation than some of the rest of us. I'm mostly interested in molecular evolution so random genetic drift and nearly neutral alleles are extremely interesting to me. I gather you aren't interested in junk DNA?

Larry Moran said...

I meant mutations per replication per nucleotide. The mutation rate per generation depends on the number of replications per generation. So, yes, it's true that the number of replications per generation varies considerably, from one to several hundred. The amazing result is that the number of fixations in diverse lineages (e.g. bacteria and mammals) is still pretty close, as Margoliash observed 50 years ago.

Jmac said...

Greg,

Larry no likes selection even if it is natural. As you can tell he is in love with the G-drift. There is a reason for it. Larry is not stupid and realized that natural selection can't account for the real evolution we all keep asking for the evidence for. I'm not going to comment on the latter part of my post. I'm sorry.

John Harshman said...

I think there's a lot of talking at cross purposes here because some people are talking about DNA evolution and others are talking about protein sequence evolution. Please try to make it clear which you mean at any given time. It's pretty clear that protein sequences are not evolving neutrally; if they were, 2nd position substitutions would be as common as silent 3rd position substitutions. I'd say that most substitutions in protein sequences are neutral, but only a few possible substitutions are neutral at any given time.

Tom Mueller said...

Hi Larry

Above I suggested the following by way of contradicting gnomon :

The Champions of Neutral Theory should have no problem with certain populations demonstrating enhanced selection compared to others if in fact they were exhibiting a positive Allee effect, a positive effect that by definition occurs in some but not all larger populations (ergo a trend).

I take it by your lack of response that I must have been off target. Could you please explain? Is the Allee effect insignificant?... or irrelevant to this discussion.

Thanks in advance for your patience and your indulgence.

Claudiu Bandea said...

Greg Laden: “It is great to see someone sticking up for selection, because it is way more interesting than neutral process.”

If you are searching for people “sticking up for selection” look no farther than to the ‘fathers’ of Neutral Theory, Motoo Kimura and Jack King. Just for the record, I’m not the first to state this. Here is an excerpt from Masatoshi Nei’s book “Mutation-Driven Evolution”:

“…many evolutionists including Motoo Kimura and Jack King believed that phenotypic evolution is caused primarily by natural selection”

gnomon said...

Donald Forsdyke:
Nice poem and good info. Thank you very much for sharing. But is the Gmonon here purposely misspelled for poetic effects or simply a typo?

peer said...

SNPs?

Check out the 1000GP literature and you will understand that huge INDELs are causing the variation. People differ by up to 12% of genomic content.

peer said...

Gnomon, can you send me the link to the paper you refer to here:

"If you had my experience in publishing papers on the collective effects of common SNPs, you would know how dramatic it is. Our paper found that common SNPs representing about 0.1% of the human genome are all functional at least in a collective way. Yet nearly all the editors and reviewers of the numerous journals we submitted to hated this message, believing most of them to be neutral junks. So if they now are told that 0.08% of the genome, which they had always treated as neutral, are functional, just imagine what their facial expression would be like."

peer said...

That variation is mainly due to huge INDEL mutations is the message of the 1000GP. They are facilitated by VIGEs, short for variation-inducing genetic elements, which also include the ERVs, transposons, TEs and all their derivatives. Check out these papers:

http://creation.com/genetic-redundancy
http://creation.com/baranomes-and-the-design-of-life
http://creation.com/vige-introduction
http://creation.com/vige-function

peer said...

The 1000GP literature shows that a huge part of the genome (12% between randomly picked humans) varies between populations, and genes associated with INDEL regions can simply be lost from the genome. Redundancy all over the genome. Frontloading proved.

peer said...

To prove my point, read this...

http://creation.com/images/pdfs/tj/j27_3/j27_3_105-112.pdf

Donald Forsdyke said...

Sorry Gnomon (and also Planck) for the error. I am particularly impressed with Figure 6 and the penultimate sentence in Yuan et al. (2014) that suggests that "Genome compositional constraints may play a role."

As GC% values diverge the possibility of speciation (by mechanisms outlined in 1996) increases and consequently the probability that advantageous complex traits (Y) will be preserved (not lost by blending) increases.

Similarly, the intracellular "antibody RNA" repertoire (largely transcribed under stressful circumstances from non-genic DNA) will become more diverse, so that a pathogen will be less able to "anticipate" what to expect in its next host.

gnomon said...

http://arxiv.org/abs/1209.2911

Yuan, D., Zhu, Z., Tan, X., Liang, J., Zeng, C., Zhang, J., Chen, J., Ma, L., Dogan, A., Brockmann, G., Goldmann, G., Medina,E., Rice, A.D., Moyer, R.W., Man, X., Yi, K., Li, Y., Lu, Q., Huang, Y., Wang, D., Yu, J., Guo, H., Xia, K., and Huang, S. (2012) Minor alleles of common SNPs quantitatively affect traits/diseases and are under both positive and negative selection. arXiv:1209.2911

gnomon said...

http://www.sklmg.edu.cn/Public/Uploads/attached/file/20140830/20140830063859_64243.pdf

Yuan, D., Zhu, Z., Tan, X., Liang, J., Zeng, C., Zhang, J., Chen, J., Ma, L., Dogan, A., Brockmann, G., Goldmann, G., Medina,E., Rice, A.D., Moyer, R.W., Man, X., Yi, K., Li, Y., Lu, Q., Huang, Y. and Huang, S. (2014) Scoring the collective effects of SNPs: association of minor alleles with complex traits in model organisms. Sci China Life Sci. 57:876-888.

peer said...

What real evolution?

peer said...

Gnomon,

I have now read most of your publications...you are a real threat to the establishment.

What a brilliant analysis!

Tom Mueller said...

Hello everybody

FYI - Carl Zimmer has jumped on the HERV-bandwagon

http://www.nytimes.com/2015/04/23/science/ancient-viruses-once-foes-may-now-serve-as-friends.html?smid=pl-share&_r=0

which if you remember segues from an earlier version

http://blogs.discovermagazine.com/loom/2012/06/14/we-are-viral-from-the-beginning/

Raising the vexing jDNA question (or would that be begging the question) in the public forum of the hoi polloi

I offer this merely as an FYI in passing.

John Harshman said...

"the hoi polloi" is similar to "the La Brea tar pits".

As for the Zimmer post, what do you think that has to do with junk DNA?

Tom Mueller said...

Hi John

as always - I appreciate your patience and your indulgence.

To answer your question, here is Carl Zimmer in his own words:

Evolution is an endlessly creative process, and it can turn what seems utterly useless into something valuable. All the viral debris scattered in our genomes turns out to be just so much raw material for new adaptations. From time to time, our ancestors harnessed virus DNA and used it for our own purposes.

How did the domestication of this viral DNA help give rise to placental mammals 100 million years ago? Who knows? Why are viruses so intimately involved in so many parts of pregnancy? Awesome question. A very, very good question. Um, do we have any other questions?


Hmmm... I can hear Claudiu chafing at the reins already.

ITMT - some of this "domestication" of viral DNA may still have bulk and even (dare I whisper) epigenetic effects.

I dunno - I am still slogging through a whole whack of reading because of you. (that was a compliment btw)

I remain unconvinced of your criticism of higher levels of evolution, at the species level say. That said, I will need to get back to you on that since I still am a long ways from wrapping my head around that particular conundrum.

best

Tom Mueller said...

John,

I would appreciate any reaction you may care to offer regarding my suggestions about the Allee Effect above.

Thanks in advance for even considering my petition.

John Harshman said...

Sorry, but I have to repeat my question: what do you think this has to do with junk DNA? It should come as no surprise that junk can in some cases turn out to be adaptive. That's been done to death. So?

John Harshman said...

I am unclear on what your suggestions are and how they relate to the topic.

Tom Mueller said...

Hi John

In some (not all) populations, increased population density can have a positive/synergistic effect on population growth.

In such populations, in fact undercrowding, not competition, limits population growth; the corollary being a positive association between absolute average individual fitness and population size over some finite interval.
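To make the definition above concrete, here is a minimal sketch comparing per-capita growth under plain logistic growth with a textbook strong Allee model. The functional form and the parameter values (`r`, `k`, `a`) are illustrative assumptions, not anything taken from the paper or the linked article.

```python
def per_capita_growth_logistic(n, r=1.0, k=1000.0):
    """Per-capita growth under logistic growth: highest at the lowest density."""
    return r * (1.0 - n / k)


def per_capita_growth_allee(n, r=1.0, k=1000.0, a=100.0):
    """Per-capita growth with a strong Allee effect.

    `a` is a hypothetical Allee threshold: below it growth is negative
    (undercrowding), and between `a` and roughly (a + k) / 2 the
    per-capita rate rises with population size.
    """
    return r * (n / a - 1.0) * (1.0 - n / k)


for n in (50, 100, 150, 300, 550, 900):
    print(n,
          round(per_capita_growth_logistic(n), 3),
          round(per_capita_growth_allee(n), 3))
```

In this toy model the positive association between individual fitness and population size holds only over a finite interval of densities, which matches the wording of the definition.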

Check out

http://www.nature.com/scitable/knowledge/library/allee-effects-19699394

Now consider that not all species demonstrate an Allee effect, but some (many?) do, and those that do then have large population sizes when exhibiting this Allee effect.

Now presume that such populations that exhibit an Allee effect do so for genetic reasons (i.e., behavior having a genetic component), meaning the Allee Effect itself is a trait subject to selection in many but not all species.

See where I am going with this?

Then, to my way of thinking, it should come as no surprise that natural selection is positively correlated with large population size or Nc, some of the time but not all the time (forgive my lack of polished finesse in parsing my words).

But isn’t that what the original paper was on about? A general trend that could easily demonstrate bias due to the Allee effect?

So as I mentioned above, the Champions of Neutral Theory should have no problem with larger populations (as a whole) demonstrating unexpected enhanced selection if in fact some subset thereof were exhibiting a positive Allee effect.

To my understanding, this does not constitute a general contradiction of Neutral Theory.

John Harshman said...

See where I am going with this?

No. Might I advise you to spend more time composing your thoughts before you type? The Allee effect doesn't predict large population sizes in the abstract, or larger population sizes than other species. It only predicts that there will be an optimal population density at which average individual fitness is greatest, while without it individual fitness is greatest at the lowest population sizes. And I still don't see how that relates to any constraint on neutral diversity. At higher population sizes, purifying selection is more effective and nearly neutral diversity should be reduced; what more is needed?
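The point about purifying selection being more effective at higher population sizes can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This is only a sketch under stated assumptions: a diploid population, effective size equal to census size, genic selection, and a hypothetical selection coefficient of -1e-4.

```python
import math


def fixation_probability(s, n):
    """Kimura's approximation for a new mutation (initial frequency 1/(2n)).

    Assumes a diploid population with effective size equal to n.
    """
    if s == 0:
        return 1.0 / (2 * n)  # strictly neutral case
    try:
        return math.expm1(-2 * s) / math.expm1(-4 * n * s)
    except OverflowError:
        return 0.0  # strongly deleterious in a very large population


def relative_to_neutral(s, n):
    """Fixation probability divided by the neutral value 1/(2n)."""
    return fixation_probability(s, n) * 2 * n


s = -1e-4  # hypothetical slightly deleterious mutation
for n in (100, 10_000, 100_000):
    print(n, relative_to_neutral(s, n))
```

With these numbers the mutation fixes almost as often as a neutral one when n is 100, but essentially never when n is 100,000. That is the sense in which the same variant is "nearly neutral" in a small population and effectively selected against in a large one.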

Tom Mueller said...

I am going to beg deferral of that question until I am up to speed on the writings of Gregory et al.

I would like to revisit so-called higher orders of selection as advocated by Gould et al, such as at the species level, but not just now.

I am not ready just now.

Tom Mueller said...

First things first.

John, you say:

At higher population sizes, purifying selection is more effective and nearly neutral diversity should be reduced; what more is needed?

But is that not the whole point? What you just described is not supposed to be happening if (and I quote Razib Khan) "the extremism of some of the anti-selectionists"[sic] were correct.

I cite directly from Larry’s cited link: Selectionism Strikes Back!

http://www.unz.com/gnxp/selectionism-strikes-back/?utm_source=rss&utm_medium=rss&utm_cam

Assuming the neutral theory of molecular evolution you’d expect that you’d have more genetic diversity in species with larger population sizes,…

…not less.

So what am I missing?

Tom Mueller said...

Hi again John

You say:

The Allee effect doesn't predict large population sizes in the abstract, or larger population sizes than other species.

I agree, unless of course "species range" is employed as a "proxy" for Nc as was done in this paper.

John Harshman said...

You are missing the difference between "neutral" and "nearly neutral". Also, the way you quantify diversity may influence what you find. What do *you* mean by "diversity"?

John Harshman said...

What paper? Why would you use species range as a proxy for population? And even more, why would you use population as a proxy for population density?

Tom Mueller said...

Hi John

First a prequel

I only recently became aware of the Allee effect when attempting to summarize the regulation of complex Biological systems from the level of Gene Regulation (example lambda bacteriophage lysogeny) to Ecosystems (the Allee Effect) and was struck by the fact that positive feedback is egregiously misrepresented in standard introductory textbooks.

"Positive Feedback" plays a very important role in maintaining "set points", often as not before quick transitions to some new "commitment step", as presciently elucidated by Jacob and Monod as far back as 1961. Positive feedback does not always accelerate to some "terminal event".

Ergo, my ongoing interest in Allee effects. All such considerations become VERY complex when actually attempting to determine the ubiquity of Allee Effects. When reading Larry's post, I immediately wondered out loud whether the authors had unknowingly stumbled across an indirect measure of Allee. Mind you my maths skills are woefully inadequate to take that line of conjecture further on my own.

Tom Mueller said...

Hi again John

Your question:

What paper?

I cite below the relevant bit from Razib Khan's

http://www.unz.com/gnxp/selectionism-strikes-back/?utm_source=rss&utm_medium=rss&utm_cam

Why would you use species range as a proxy for population?

That relevant bit would be

As the figure above shows there is a correlation between the power of selection on the genome and inferred effective population size. I say inferred because they had to use species range and size as proxies. Obviously this isn’t perfect, but I suspect that the utilization of these proxy variables only diminishes the correlation. The authors admit that there is a lot of work to be done, but this is just the first step.

Your question:

why would you use population as a proxy for population density?

Any species exhibiting Allee over a large range, i.e., one that simultaneously exhibited a large population density AND a large range, would by definition exhibit a large Nc.

As I mentioned above, I do not think this should prove a problem for the champions of Neutral Theory.

Tom Mueller said...

Hi again John… and again thank you for your patience and your indulgence

Regarding your question

You are missing the difference between "neutral" and "nearly neutral".

I admit I am climbing a steep Ebbinghaus curve here. I did refer to Razib Khan’s citation of

http://en.wikipedia.org/wiki/Neutral_theory_of_molecular_evolution

… but that is no guarantee I have indeed mastered the concept.

Also, the way you quantify diversity may influence what you find. What do *you* mean by "diversity"?

Agreed - refer to my earlier posts (down below) regarding “proxy” variables.

By “diversity” I was deferring to the original definition in the first sentence of the original PLoS paper

The neutral theory of molecular evolution predicts that the amount of neutral polymorphisms within a species will increase proportionally with the census population size (Nc).

http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002112
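For reference, the proportional scaling in that quoted sentence corresponds to the standard neutral expectation for nucleotide diversity at mutation-drift equilibrium, pi = 4 * Ne * mu for a diploid population. A minimal sketch, assuming Ne tracks census size and using a hypothetical per-site mutation rate:

```python
import math


def expected_neutral_diversity(ne, mu):
    """Expected pairwise nucleotide diversity under strict neutrality.

    Standard diploid result at mutation-drift equilibrium: pi = 4 * Ne * mu.
    """
    return 4.0 * ne * mu


MU = 1e-8  # hypothetical per-site, per-generation mutation rate

for ne in (1e4, 1e5, 1e6):
    print(int(ne), expected_neutral_diversity(ne, MU))
```

A tenfold larger population gives tenfold higher expected diversity under these assumptions, which is the prediction that observed diversity levels fail to match.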

FTR – I actually have spent quite some time composing my thoughts before typing! That said, I may clearly be over my head and out of my league and I thank you for your efforts to aid my understanding of this interesting question.

Best and warmest regards

John Harshman said...

It should be obvious that the correlation between species range and size (body size?) and population size is fairly weak, and the correlation between population size and population density is even weaker. And I still don't see what the Allee effect could have to do with any of this.

Anonymous said...

@Peer Terborg
I don't agree with Gnomon and neither do you (I suspect you missed that point).
Shi Huang's ideas certainly are at odds with your "frontloading ideas"; he is an evolutionist.
If you think that I am wrong, I'd like you to elaborate on why you think that the "brilliant analysis" of an evolutionist can be compatible with your point of view.
There is a lot at stake! You could disprove the first axiom of creationism research!

First Axiom of Creationism-Research:
Shit attracts flies.

Tom Mueller said...

Hi John

Let’s go back to Larry’s original question.

Razib Khan is an adaptationist and he's discovered a paper that gets him very excited: Selectionism Strikes Back!...

...It is impossible for someone like me to evaluate this paper. Can someone take a look to see if it's valid?


You will note above that I cited Razib Khan and was merely taking Khan at his word:

As the figure above shows there is a correlation between the power of selection on the genome and inferred effective population size. I say inferred because they had to use species range and size as proxies. Obviously this isn’t perfect, but I suspect that the utilization of these proxy variables only diminishes the correlation. The authors admit that there is a lot of work to be done, but this is just the first step.

Now I cite your reaction to all this:

It should be obvious that the correlation between species range and size (body size?) and population size is fairly weak, and the correlation between population size and population density is even weaker.

Thank you for helping me out and answering Larry's question above.

So I guess that would mean the original PLoS paper does not pose a threat to what Khan deems "the extremism of some of the anti-selectionists"

Moving on, you ask:

And I still don't see what the Allee effect could have to do with any of this.

Let’s stop talking in abstractions and cite specific examples such as the evolution of shoaling and schooling in Fish populations which would be a paradigm of the Allee effect.

There must be an adaptive and polygenetic component to swarming behavior. I wanted you to agree that left to their own devices, shoaling and schooling fish (for example) will demonstrate a high Nc over time if they have a large range, and all as an adaptive advantage of the Allee effect.

TM: Any species exhibiting Allee over a large range, i.e., one that simultaneously exhibited a large population density AND a large range, would by definition exhibit a large Nc.

For the sake of argument, let’s just for the moment assume that that contention is in fact correct, then Allee does indeed have a lot to do with any of this.

According to Neutral Theory, Razib Khan poses a problem:

…natural selection removes more variation at linked neutral sites in species with large Nc than those with small Nc and provides direct empirical evidence that natural selection constrains levels of neutral genetic diversity across many species.

This should not be happening.

There are a number of trivial reasons why the Allee effect would reduce variation such as the "oddity effect".

I cite from Wikipedia:

The "oddity effect" posits that any shoal member that stands out in appearance will be preferentially targeted by predators. This may explain why fish prefer to shoal with individuals that resemble themselves. The oddity effect would thus tend to homogenize shoals.

http://en.wikipedia.org/wiki/Shoaling_and_schooling#How_fish_school

In other words, the “oddity effect” would confound Neutral Theory expectations for apparently trivial reasons that only apply to those prey species demonstrating an Allee effect.

And not just the “oddity effect”… there are other consequences, but that’s enough for now.

Tom Mueller said...

just as a postscript:

Re: “body size”

Presuming identical Biomass – shoals of smaller fish by definition would demonstrate higher Nc than shoals of larger fish.

Tom Mueller said...

Of course I meant to say

Presuming identical Biomass – smaller species of fish that exhibit shoaling over a large range by definition would demonstrate higher Nc than larger species of fish under identical circumstances

Paul McBride said...

To continue the conversation, part of my work deals with correlations between range size and population size. Before talking about correlations between range/population/density, it is important to define what is meant by range size. Many of the widely available estimates of range size for vertebrate species (e.g., through IUCN) are estimates for the extent of occurrence (EOO), which is the smallest convex polygon drawn around all of the observation points made for that species. Provided the whole range is encapsulated, this approach overestimates the actual range of the species, because almost certainly some habitat interpolated between points will not be occupied, or will have only a low density from accidentals. On the other hand, area of occupancy (AOO) is based only on areas where direct observations have been made, with minimal interpolation based on habitat availability. Because this is far more intensive, there are few range-wide estimates of AOO for species. However, AOO is more ecologically meaningful, and it has a strong and positive correlation with population density--the larger the range as estimated by AOO, the higher the density. This correlation is considered strong enough that it alone is widely used to make conservation decisions (i.e. threat categories) about population size.
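The difference between the two range measures can be sketched in a few lines. This is only an illustration: the coordinates are made-up planar points in kilometres, EOO is computed as the area of the convex hull, and AOO is approximated by counting occupied grid cells, which is an assumption about how occupancy is tallied rather than the method used in the paper.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def extent_of_occurrence(points):
    """EOO: area of the smallest convex polygon around all observations."""
    return polygon_area(convex_hull(points))


def area_of_occupancy(points, cell_km=2.0):
    """AOO sketch: total area of grid cells containing at least one observation."""
    cells = {(math.floor(x / cell_km), math.floor(y / cell_km)) for x, y in points}
    return len(cells) * cell_km * cell_km


# Hypothetical observation points (km): four corners plus one interior record.
observations = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5)]
print(extent_of_occurrence(observations))  # 100.0
print(area_of_occupancy(observations))     # 20.0
```

Here five records spread over a 10 km square give an EOO of 100 square km but an AOO of only 20 square km, which shows how EOO can overstate the area actually occupied.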

EOO does not have a clear correlation with population density. Therefore, independent estimates of density are needed to compare population sizes of different species using only range. Using direct population estimates for well-studied groups (e.g. birds) it is possible to show that population size and EOO correlate quite well after correcting for body size, but in my experience this works well only for reasonably similar species (e.g. within families of birds).

The authors estimated their own EOOs using the GBIF database. For bird and mammal species, separate estimates have already been made, which allows for some comparisons. In some cases there are two- or three-fold differences between theirs and the previous estimates. So, before even trying to translate range size into population size, it needs to be acknowledged that range size estimates themselves are only rough estimates of true range. Further, I would be very surprised if the breadth of species used in the discussed PLOS Biology study could possibly lend itself to accurate Nc estimates based on EOO and body size, even if their EOO estimates were extremely good estimates of the true ranges. An extra hurdle: to find the correlations that they did, their estimates for population size would have to be representative of long-term population size, as it takes many generations for mutation-drift equilibrium to be reached in large populations.

With all of this criticism said, the correlation they detect between range/body size and the "effect of selection" on genetic diversity is fascinating, and in line with expectations if they had reasonable population size estimates. Perhaps they were lucky; perhaps the estimates actually work out quite well for some reason across deep divergences like this, maybe because of the scale of the comparisons (I don't know--I work with much closer relationships than this); or perhaps there is an unmeasured explanatory factor involved.

Tom Mueller said...

Hi Paul

Thank you - very interesting!

So if I understand you correctly, these intriguing findings suggest the correlation between species range and size and Nc would in fact appear to be strong. Meanwhile, natural selection constrains levels of neutral genetic diversity more so in species with large Nc than small Nc.

Furthermore, by definition, an Allee effect is a positive association between absolute average individual fitness and population size over some finite interval.

Is it possible that the authors may in fact have inadvertently stumbled across an indirect measure of the Allee Effect which has hitherto proven elusive and difficult to do? ... or is that a bridge too far?

Paul McBride said...

Hi Tom,

They have certainly found evidence for a reasonably strong correlation between range/body size and the strength of linked selection. I have given a number of caveats as to why I wouldn't expect the relationship between range and population size to be generally so strong across such taxa, but there is also no reason why the relationship wouldn't be qualitatively right in most cases (which is what would be important for a discussion of Allee effects).

Yes, I think that this could possibly be interpreted as an example of an Allee effect. If positive selection is mutation limited (i.e. there is a waiting time for the arrival of beneficial mutations) then larger populations should produce a greater number of positive mutations. Larger populations should probably see overall fewer fixations of slightly deleterious alleles (i.e. stronger background selection). Both of these factors should increase average fitness. Because selective sweeps from positive mutations, and background selection against negative ones both reduce Ne, these effects limit neutral diversity. Based on the authors' logic, the very fact that neutral diversity appears to be bounded, rather than scaling freely with Nc, is an Allee effect because they attribute it to two selective effects that at least slow degradation of population fitness and might increase it.
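The mutation-limitation argument can be put in rough numbers. The sketch below assumes a diploid population in which 2 * N * mu_b new beneficial mutations arise per generation and each fixes with probability of about 2 * s (Haldane's approximation for small s). The values of `mu_b` and `s` are hypothetical.

```python
import math


def beneficial_mutations_per_generation(n, mu_b):
    """New beneficial mutations entering a diploid population each generation."""
    return 2.0 * n * mu_b


def expected_wait_for_sweep(n, mu_b, s):
    """Expected generations until a beneficial mutation destined to fix arises.

    Uses Haldane's approximation that a new beneficial mutation fixes with
    probability of roughly 2 * s, so the success rate is 4 * n * mu_b * s.
    """
    return 1.0 / (beneficial_mutations_per_generation(n, mu_b) * 2.0 * s)


MU_B = 1e-9  # hypothetical beneficial mutation rate at a locus
S = 0.01     # hypothetical selective advantage

for n in (1e4, 1e5, 1e6):
    print(int(n), expected_wait_for_sweep(n, MU_B, S))
```

Under these assumptions the waiting time shrinks in direct proportion to population size, so a larger population experiences sweeps more often, and with them more frequent reductions in linked neutral diversity.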

It is worth pointing out that there are other quite plausible factors that could prevent the scaling of genetic diversity with Nc (e.g., see Maruyama and Kimura, 1980). It is also worth noting that while the authors have used actual genetic diversity, their estimation of linked selection is only based around the potential for selective sweeps and background selection given the density of functional sites. Finally, and this is particularly important for the interpretation of an Allee effect, the authors have not demonstrated any correlation with fitness.

Tom Mueller said...

Hi Paul

I am in awe. Thank you.

I am intrigued with your suggestion:

If positive selection is mutation limited (i.e. there is a waiting time for the arrival of beneficial mutations) then larger populations should produce a greater number of positive mutations.

I hope I understand you correctly. I now wonder out loud whether that statement would have significant implications when that logic is extrapolated (along the lines first suggested by Gould) to any long-term potential positive bulk effects of so-called junk DNA (as I hinted above).

But now I am in really over my head, and I need to warn you. That subject is a particularly sensitive one on this forum and in the past has provoked acrimony.

Thanks again for providing such a detailed answer to Larry’s original question above.

Best and grateful regards

Tom Mueller said...

Hi again Paul,

Remembering that I am an aging high school teacher out of my depth here; I was hoping you could elaborate a little on your very last sentence:

Finally, and this is particularly important for the interpretation of an Allee effect, the authors have not demonstrated any correlation with fitness.

I am not certain I exactly understand your point.

Perhaps it does not matter, given genes can be fixed at extremely low (presumably undetectable) selection coefficients as explained here:
http://sandwalk.blogspot.ca/2014/12/how-to-think-about-evolution.html

Paul McBride said...

Tom: I now wonder out loud whether that statement would have significant implications when that logic is extrapolated (along the lines first suggested by Gould) to any long-term potential positive bulk effects of so-called junk DNA (as I hinted above).

Unless I am misunderstanding exactly where you are going with that--and please explain further if I am--I don't see how. I was referring to the production of positive mutations under the assumption of mutation-limited selection. Incidentally, there is some evidence that this is a fair assumption in many natural populations, although Ohta (1972) also provided some theoretical reasons why it might not always be the case. Either way, the rate of production of beneficial mutations is a distinctly population-genetic phenomenon--the production of new mutations is increased because of great Nc. If there is a connection to bulk effects of junk DNA and species selection I am not sure what it is.

I am not certain I exactly understand your point.
My point is that one of several caveats in interpreting the Corbett-Detig et al. paper explicitly in terms of Allee effects is that there is only an inference of an effect on mean population fitness based on population-genetic theory--there is not a demonstration that such an effect occurs in practice. Combined with the other caveats (i.e. that selection is not definitively the cause of the bounded genetic diversity), I am just pointing out for balance what some of the assumptions are that would go into interpreting their result as a genetic Allee effect. Hope that helps.

Perhaps it does not matter, given genes can be fixed at extremely low (presumably undetectable) selection coefficients
Perhaps, although if we can't know either way, then we need to be cautious in our interpretations.

Tom Mueller said...

Hi Paul

To answer your first question, I was responding to your statement:
“If positive selection is mutation limited (i.e. there is a waiting time for the arrival of beneficial mutations) then larger populations should produce a greater number of positive mutations.”

I immediately wondered if we could possibly be opening the door to a repertoire of exaptations, some subset of which could provide delayed positive selection at a higher hierarchical level, i.e., the species level.

As mentioned elsewhere, my favorite candidate for one such exaptation that POSSIBLY could prove to provide positive selection at the higher hierarchical level of “species selection” happens to be bulk DNA aka junk DNA. The inspiration for this hunch (and I insist it is merely a hunch) would be Peter Fraser’s work on the non-random orientation of the X chromosome where the folding of Chromatin and exposure of different active regions to the “periphery” is dependent on the differentiation status of the cell type which fascinates me. That means, some of the chromatin must serve some nondescript “function” as bits and pieces are either exposed or sequestered.
http://www.bbsrc.ac.uk/news/health/2013/130925-pr-x-shape-chromosome-structure/

But as I mentioned above, I am speaking more from intuition than conviction, I really have a lot more reading to do on the subject before wading in on that topic again. I really should not have gone there.

You also stated earlier that:

“Finally, and this is particularly important for the interpretation of an Allee effect, the authors have not demonstrated any correlation with fitness.”

I thank you for your later clarification:

“…interpreting the Corbett-Detig et al. paper explicitly in terms of Allee effects is that there is only an inference of an effect on mean population fitness based on population-genetic theory--there is not a demonstration that such an effect occurs in practice.”

I now understand that you are reiterating Razib Khan’s

“…there is a correlation between the power of selection on the genome and inferred effective population size. I say inferred because they had to use species range and size as proxies.”

I find Razib Khan’s characterization of ”the extremism of some of the anti-selectionists” somewhat amusing. I take it you and Razib Khan are in agreement that these results are “intriguing” but do not (yet?) constitute proof positive in settling the debate.

In other words, we can conclude that the “anti-selection extremists” may have lost a significant battle (perhaps maybe, that is) but have not yet lost the war.

Thank you for helping me out with possible implications to the Allee Effect. I learned a lot on this thread and I remain in your debt!

Paul McBride said...

Tom: I now understand that you are reiterating Razib Khan’s

“…there is a correlation between the power of selection on the genome and inferred effective population size. I say inferred because they had to use species range and size as proxies.”


Actually, no, I was pointing out an additional assumption that applies if you want to demonstrate (rather than infer) a genetic Allee effect: that you would need to measure fitness. Increased average fitness seems to be a reasonable assumption if positive and negative selection cause the result in Corbett-Detig et al., but as I pointed out there are other possible explanations as well, so an assumption of increased fitness alone is not totally convincing. As you have pointed out, those differences might not be measurable, but this only serves to limit the strength of any conclusions we might have.

Tom: I find Razib Khan’s characterization of ”the extremism of some of the anti-selectionists” somewhat amusing. I take it you and Razib Khan are in agreement that these results are “intriguing” but do not (yet?) constitute proof positive in settling the debate.

I agree with that, yes.

In other words, we can conclude that the “anti-selection extremists” may have lost a significant battle (perhaps maybe, that is) but have not yet lost the war

If we define this "anti-selection extremism" to encompass the POV that selection is sufficiently rare that it does not influence genetic diversity, then I would say yes, this paper is a blow to that world view. However, I'd point out that this paper is one in a long line all suggesting the same thing--that genetic diversity has upper bounds that do not reflect the levels predicted at mutation-drift equilibrium for given census population sizes. What this paper does is make a reasonable case for positive and negative selection being a major cause of this boundedness, rather than neutralist perspectives that might emphasise limits to Ne through population factors and periodic bottlenecking.

Tom Mueller said...

I remain in your debt. Thank you!

Anonymous said...

Hello gnomon. I just read the paper you linked above after having bookmarked this thread several months ago. Sorry I'm late to the party.

I'm very interested in the amount of strict nucleotide-specific functional DNA in the human genome. In the paper your team writes:

> "For calculating MAC [minor allele content], the number of informative SNPs used for each panel ranged from ~120 to ~151000. Since the SNPs used here are largely selected in a non-biased way, the number of SNPs used should not significantly affect the calculation of MAC. Indeed as shown for the BXD mouse panel, MAC calculated from ~51000 SNPs were highly similar to those calculated using two different non-overlapping sets of 1000 SNPs randomly selected from the ~51000"

Suppose only 1 in 20 SNPs affects function, indicating that something like 5% of the genome is nucleotide-specific functional. If this is true, wouldn't you still have gotten the same result, since a set of 1000 SNPs would still on average contain 50 function-altering SNPs? Apologies if I'm missing something obvious here.
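The arithmetic behind that question can be checked with a small simulation. This is only a sketch of the sampling argument: the 5% functional fraction is the commenter's hypothetical, the pool size of 51000 is taken from the quoted passage, and nothing here reproduces how MAC was actually computed in the paper.

```python
import random
import statistics


def expected_functional(sample_size, functional_fraction):
    """Expected number of function-altering SNPs in a random subset."""
    return sample_size * functional_fraction


def simulate_functional_counts(total_snps=51000, functional_fraction=0.05,
                               sample_size=1000, replicates=200, seed=0):
    """Draw random SNP subsets and count how many 'functional' SNPs each holds.

    SNPs with index below n_functional are arbitrarily labelled functional.
    """
    rng = random.Random(seed)
    n_functional = round(total_snps * functional_fraction)
    counts = []
    for _ in range(replicates):
        subset = rng.sample(range(total_snps), sample_size)
        counts.append(sum(1 for idx in subset if idx < n_functional))
    return counts


counts = simulate_functional_counts()
print(expected_functional(1000, 0.05))
print(statistics.mean(counts), min(counts), max(counts))
```

Random subsets of 1000 SNPs do carry close to 50 "functional" SNPs on average, so similarity between subsets cannot by itself distinguish a 5% functional fraction from a much larger one.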

Gnomon said...

Sorry, just noticed your great comments. Our results show that patient populations have greater genome-wide variation, or MAC, than matched normal populations. Our latest paper on this is about lung cancer. If what you said were true, that only a few deleterious SNPs are the cause of disease rather than the collective effect of all SNPs, you would not expect to see our finding. Yuan et al. (2014) addressed this: "Do the results here mean an additive effect of large numbers of MAs in MAC action and hence non-neutrality of most MAs? Many major effect risk alleles of diseases are known to be minor alleles [18], which may plausibly imply that the effect of MAC may be mediated by a few known major effect risk alleles rather than large numbers of minor effect MAs. But this may not necessarily be the case. The effect of MAC was in fact abolished or weakened by major effect MAs such as kras2 mutation in lung cancer or npr-1 mutation in brood size as found here. Furthermore, MAC preferentially affects traits with larger number of known additive QTLs [34]. Obviously, the more the number of QTLs involved in a trait, the less the individual effect of each QTL on the trait. Thus, MAC-linked traits are expected to have more additive minor effect SNPs as risk alleles than those not linked to MAC."

ref.
Lei et al., Collective effects of common SNPs and risk prediction in lung cancer. Heredity 121, pages 537–547 (2018) https://www.nature.com/articles/s41437-018-0063-4

Yuan et al, Scoring the collective effects of SNPs: association of minor alleles with complex traits in model organisms Science China Life Sciences 57, pp 876–888

Gnomon said...

One more reason to mention here. Our finding of higher MAC, or genetic diversity, in patient populations relative to normal controls is unexpected under the presently popular neutral paradigm. Why? Because different racial groups are well known to have different levels of genetic diversity. Neutral theory explains this as reflecting different evolutionary times and would predict that, given enough time, whites would reach the same high level of genetic diversity as blacks. Our finding says no.