Friday, September 07, 2012

THIS Is What Michael Eisen Is Thinking!!!

Here's an excellent example of what's wrong with the way the ENCODE Consortium is interpreting their data. Congratulations to Michael Eisen! I wish I had said this: A neutral theory of molecular function.1

Read the whole thing very carefully and heed the lesson. Here's an excerpt:
I think a lot about Kimura, the neutral theory, and the salutary effects of clear null models every time I get involved in discussions about the function, or lack thereof, of biochemical events observed in genomics experiments, such as those triggered this week by publications from the ENCODE project.

It is easy to see the parallels between the way people talk about transcribed RNAs, protein-DNA interactions, DNase hypersensitive regions and what not, and the way people talked about sequence changes PK (pre Kimura). While many of the people carrying out RNA-seq, ChIP-seq, CLIP-seq, etc… have been indoctrinated with Kimura at some point in their careers, most seem unable to apply his lesson to their own work. The result is a field suffused with implicit or explicit thinking along the following lines:
I observed A bind to B. A would only have evolved to bind to B if it were doing something useful. Therefore the binding of A to B is “functional”.
One can understand the temptation to think this way. In the textbook view of molecular biology, everything is highly regulated. Genes are transcribed with a purpose. Transcription factors bind to DNA when they are regulating something. Kinases phosphorylate targets to alter their activity or sub-cellular location. And so on. Although there have always been lots of reasons to dismiss this way of thinking, until about a decade ago, this is what the scientific literature looked like. In the day where papers described single genes and single interactions, who would bother to publish a paper about a non-functional interaction they observed?

But experimental genomics blew this world of Mayberry molecular biology wide open. For example, when Mark Biggin and I started to do ChIP-chip experiments in Drosophila embryos, we found that factors were binding not just to their dozen or so known targets, but to thousands, and in some cases tens of thousands, of places across the genome. Having studied my Kimura, I just assumed that the vast majority of these interactions had evolved by chance – a natural, essential, consequence of the neutral fixation of nucleotide changes that happened to create transcription factor binding sites. And so I was shocked that almost everyone I talked to about this data assumed that every one of these binding events was doing something – we just hadn’t figured out what yet.


Rather than assuming – as so many of the ENCODE researchers apparently do – that the millions (or is it billions?) of molecular events they observe are a treasure trove of functional elements waiting to be understood, they should approach each and every one of them with Kimurian skepticism. We should never accept the existence of a molecule or the observation that it interacts with something as prima facie evidence that it is important. Rather we should assume that all such interactions are non-functional until proven otherwise, and develop better, compelling, ways to reject this null hypothesis.
Read the comments, especially the one from former colleague Chris Hogue on how to interpret phosphorylation of proteins and signal transduction. That's not going to be popular in my department!

I just have one small quibble with Michael's post. Not all textbooks describe the cell as if it were a finely tuned Swiss watch and not all textbooks take an adaptationist approach to evolution. Mine doesn't.

1. As a result of this post I've now relegated Jonathan Eisen to "brother of Michael Eisen" rather than the other way around. Sorry, Jonathan.


  1. Oh noooooooooooooooooooo (re last sentence)

  2. from the article
    "Rather we should assume that all such interactions are non-functional until proven otherwise, and develop better, compelling, ways to reject this null hypothesis."
    Applause, applause!

  3. Sorry, I just can't take him seriously. It's that lizard ... ;0)

  4. One remark: in this 'wholesale biology' era people tend to forget that ChIP is flooded with false-positive results. Meaning that the probability that the supposed interaction between a protein and DNA is due to an artefact of the method is quite high.

  5. George Williams said essentially the same thing a half century ago: "Adaptation is an onerous concept and one that should only be turned to as a last resort." Steve Gould and Dick Lewontin made the same point even more forcefully two decades later, and yet the widespread opinion persists that all characteristics of living organisms should be assumed to be evolutionary adaptations until proven otherwise. This is exactly the wrong way to do evolutionary biology, but will probably persist as long as the "Platonic" view of ideal forms has persisted in western culture.

    1. That kind of problem goes back to the beginning, as A. R. Wallace noted in 1866.

      One of the aspects of "adaptation" as an onerous concept is that, in evolutionary biology, it's a single word used to describe an enormous number of very different actual outcomes of very different events, physical characters and behaviors all in relation to extremely varied external and internal factors. Perhaps it means too many things to remain a coherent concept avoiding extreme ambiguity. It can, though, remain a habit of thought, which also leads to trouble.

      As Wallace noted, you can say the same thing of "natural selection", though, these days, substituting "survival of the fittest", his proposal, wouldn't work any better.

  6. The lizard has a huge tumor growing from its belly. The tumor looks a bit human - a case of adaptive mimicry?

  7. Let me take a different stance than Larry and Michael Eisen:

    1. The presence of lots of adaptation of organisms to their environments is prima facie evidence that this is caused by natural selection, as no other evolutionary force has the property of favoring adaptation. However we cannot draw from this the conclusion that any one particular trait is favored by natural selection (see spandrels, etc.). But ...

    2. Morphological and behavioral traits typically show variation sufficiently large that it cannot be selectively neutral. (Larry often neglects this point when discussing such traits). In a population of size N a selection coefficient has to be as small as ± 1/(4N) to be effectively neutral. That is very small. So ...

    3. The mere observation that we cannot see fitness differences between phenotypes in the lab does not allow us to conclude that these phenotypic variations are selectively neutral. So, am I about to conclude that all DNA variations cannot be neutral?

    4. Not at all. Those that cause no known changes of phenotype are candidates for neutrality. And there are three lines of evidence that much, even most, DNA has its base substitutions effectively neutral. They are, briefly (1) the excessive mutational load that would result if they weren't neutral, (2) the occurrence of large families of parasitic elements such as transposons, LINEs, SINEs, and short tandem repeats, and (3) the ridiculously large variations of DNA content among organisms. These have all been mentioned here and at Panda's Thumb by Nick, Larry, and others. Kudos (κῦδος) to all of them for standing firm in the face of a media frenzy.

    Junk DNA, by which we should mean DNA whose variation is not constrained by natural selection, is likely to be most of our genome, despite the ENCODE designation of much of it as "functional". But we should not be too quick to toss out natural selection as a significant force in the evolution of the rest.

    1. "But we should not be too quick to toss out natural selection as a significant force in the evolution of the rest."

      Just to be clear, I'm certainly not tossing out natural selection as a mechanism that affects 10% of the genome. Nor am I tossing out random genetic drift as a significant force in the entire genome, including the functional part.

      BTW, Joe, when you say, "Morphological and behavioral traits typically show variation sufficiently large that it cannot be selectively neutral. (Larry often neglects this point when discussing such traits)" I'm not sure what you mean. Can you point me to some examples suggesting that this is "typical"?

    2. Sure. Many size-related morphological traits show around 10% coefficient of variation (their standard deviation is 10% of their mean). They also show heritabilities of about 0.3, so the additive genetic standard deviation is the square root of 0.3, about 0.55, times the phenotypic standard deviation. One additive genetic standard deviation would therefore be about 5% of the mean.

      If these traits are subject to optimizing selection, so that they are sitting on a local peak of fitness and fitness falls away from there in both directions, and if population sizes are (say) N = 1,000,000, it is implausible that fitness would remain within 1/(4,000,000) of its peak value as the phenotype changes by 5% of its mean value.

      Basically any morphological change big enough for us to notice must result in a fitness difference bigger than that.
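
The arithmetic in this last exchange can be checked with a few lines of Python. This is only a rough sketch: the function and variable names are made up for illustration, and the inputs (a 10% coefficient of variation, a heritability of 0.3, N = 1,000,000, and the 1/(4N) cutoff for effective neutrality) are simply the figures quoted in the comments above, not new data.

```python
import math


def additive_sd_fraction(cv, heritability):
    """Additive genetic standard deviation as a fraction of the trait mean.

    Heritability is the additive share of the phenotypic variance, so the
    additive standard deviation is sqrt(heritability) times the phenotypic one.
    """
    return cv * math.sqrt(heritability)


def neutral_threshold(pop_size):
    """Largest |s| treated as effectively neutral, using the 1/(4N) rule above."""
    return 1.0 / (4.0 * pop_size)


cv = 0.10        # phenotypic SD is 10% of the mean (figure from the comment)
h2 = 0.3         # heritability (figure from the comment)
N = 1_000_000    # population size (figure from the comment)

sd_a = additive_sd_fraction(cv, h2)
s_max = neutral_threshold(N)

print(f"sqrt(h2)               = {math.sqrt(h2):.3f}")
print(f"additive SD / mean     = {sd_a:.3%}")
print(f"neutral cutoff 1/(4N)  = {s_max:.2e}")
```

The point of putting the two numbers side by side is the one made in the comment: a shift of roughly 5% of the mean would have to change fitness by less than about one part in four million to count as effectively neutral.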