Thursday, September 06, 2012

What in the World Is Michael Eisen Talking About?

I've been trying to keep up with the ENCODE PR fiasco, so I immediately clicked on a link to Michael Eisen's blog with the provocative title it is NOT junk. The article is: This 100,000 word post on the ENCODE media bonanza will cure cancer.

Michael Eisen is an evolutionary biologist at the University of California at Berkeley. He's best known, to me, as the brother of Jonathan Eisen.

Michael, like me and hundreds of other scientists, is upset by the ENCODE press releases. One of them is: Fast forward for biomedical research: ENCODE scraps the junk.
The hundreds of researchers working on the ENCODE project have revealed that much of what has been called 'junk DNA' in the human genome is actually a massive control panel with millions of switches regulating the activity of our genes. Without these switches, genes would not work – and mutations in these regions might lead to human disease. The new information delivered by ENCODE is so comprehensive and complex that it has given rise to a new publishing model in which electronic documents and datasets are interconnected.
Here's the interesting thing. Many of us are upset about the press releases and the PR because we don't think the ENCODE data disproves junk DNA. Michael Eisen's perspective is entirely different. He's upset because, according to him, junk DNA was discredited years ago.
The problems start before the first line ends. As the authors undoubtedly know, nobody actually thinks that non-coding DNA is ‘junk’ any more. It’s an idea that pretty much only appears in the popular press, and then only when someone announces that they have debunked it. Which is fairly often. And has been for at least the past decade. So it is more than just intellectually lazy to start the story of ENCODE this way. It is dishonest – nobody can credibly claim this to be a finding of ENCODE. Indeed it was a clear sense of the importance of non-coding DNA that led to the ENCODE project in the first place. And yet, each of the dozens of news stories I read on this topic parroted this absurd talking point – falsely crediting ENCODE with overturning an idea that didn’t need to be overturned.
Eisen is wrong. Junk DNA is alive and well. In fact, almost 90% of our genome is junk.

This is what makes science so much fun.


T Ryan Gregory said...

I read it the same way as you. But he's clarified what he meant on my blog:

I was not saying that everybody knows that 100% of the genome is functional! I was saying that nobody thinks that 100% of non-coding DNA is non-functional. My point was that it’s dishonest to pretend like they’re the first people to debunk the junk DNA meme.

I completely agree with this: “a significant percentage of the non-coding DNA in the human genome is functional in the sense of being biologically meaningful, but most of it probably is not.”

Larry Moran said...

Clear as mud.

What he's saying is that the ENCODE project is the 2nd or nth people to debunk the junk DNA meme.

Then in the next paragraph he says that most non-coding DNA is not functional.

Is he arguing against the mythical concept that "all noncoding DNA is junk"? That seems strange because the press release didn't say that. Maybe I'll ask him about the title of his blog.

Michael Eisen said...

I don't think it's inconsistent at all. First, here is what I believe - a large fraction of non-coding DNA is functional (duh - this is what I work on), and a large fraction of non-coding DNA is non-functional by any reasonable definition of the term (affecting phenotype, subject to purifying selection, etc...).

What the ENCODE press releases were saying is "Hey. Everyone thinks that ALL non-coding DNA is junk, but we've DISCOVERED that this isn't true". This is flat out dishonest because a) nobody who pays the slightest bit of attention believes that all non-coding DNA is non-functional, and nobody went into the ENCODE project thinking that, and b) there have literally been thousands of papers over the past 15 years that have argued and then proven that this isn't the case. So to claim this as a discovery borders on mendacity.

I also take exception to the actual claims from the papers/press releases that they've shown that 80% of the genome is functional. They haven't. I can disagree with their numerology but still believe that a lot of non-coding DNA is non-functional.

konrad said...

When you say "a large fraction", I'm guessing you mean something like "at least 10%". Can you clarify? (You clearly do _not_ mean "at least 50%", but I can only deduce that because you applied the phrase to two quantities that have to sum to 1 - in a different context, it would not be clear.)

Larry Moran said...

First, here is what I believe - a large fraction of non-coding DNA is functional ...

Why do you even use the term "non-coding DNA" in this context? It serves no useful purpose.

We've known for forty years that a certain percentage of our genome is functional and that it consists of a variety of different sequences, including regions that encode protein. A large part of the remainder could be junk.

I think that close to 90% of our genome is junk [What's in Your Genome?]. What's your number? Do you have a similar table?

Devin said...

I'm curious what fraction of the genome can be functional and maintained by selection. We're getting 50-odd mutations each generation... How much of the genome can be functional given, say, a limit of 10% of offspring dying due to mutational burden? I recall Haldane writing about this but I forget everything.
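
For what it's worth, here is a rough sketch of the arithmetic behind that question. It is not a real estimate. It assumes the classical mutation-load approximation (deleterious mutations per offspring are Poisson distributed and act multiplicatively, so the load is about 1 - exp(-U)), and it assumes mutations land uniformly across the genome. The function names and the `deleterious_share` parameter are hypothetical, introduced only for illustration.

```python
import math


def mutational_load(deleterious_per_offspring):
    """Load under the assumed model: 1 - exp(-U), with U the mean number
    of new deleterious mutations per offspring."""
    return 1.0 - math.exp(-deleterious_per_offspring)


def max_functional_fraction(mutations_per_generation, max_load,
                            deleterious_share=1.0):
    """Largest functional fraction of the genome compatible with max_load.

    deleterious_share is the (assumed) share of mutations hitting
    functional DNA that are actually harmful. With uniform mutation,
    U = mutations_per_generation * fraction * deleterious_share.
    """
    max_u = -math.log(1.0 - max_load)  # invert load = 1 - exp(-U)
    return max_u / (mutations_per_generation * deleterious_share)


if __name__ == "__main__":
    # 50 new mutations per generation and a 10% load limit, as in the
    # question; the two deleterious_share values are illustrative guesses.
    for share in (1.0, 0.1):
        fraction = max_functional_fraction(50, 0.10, share)
        print(f"deleterious share {share}: functional fraction <= {fraction:.2%}")
```

Under these deliberately strict assumptions, the tolerable functional fraction comes out at a fraction of a percent when every hit to functional DNA is harmful, and at a couple of percent when only one hit in ten is. The answer is very sensitive to the load limit and to how many mutations in functional DNA are really deleterious, so treat the output as an illustration of the argument rather than a measurement.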

Michael Eisen said...

I agree - non-coding DNA is a lazy shorthand.

I can give you what I think is the likely answer - but I want to preface it by saying that this is based on some amount of data and a large amount of intuition - so I am not putting this out as some kind of firm data-based number.

But it seems most likely to me that only around 10-15% of the genome would pass a strict definition of functional, in that it contributes meaningfully to fitness and/or phenotype. So we're more or less in the same ballpark.

Of course, as I'm sure you'd agree, this doesn't mean the remainder is inert. Indeed, I think it's likely that ANY alteration of any individual genome, including all ~18b single base pair substitutions, would have some measurable effect with some assay in some condition. This is precisely why I think the 80% number trotted out by ENCODE is meaningless - more a measure of the limitations of their assays and scope than anything. The problem is that there is absolutely no reason to believe that every biochemical event is "functional" in any meaningful sense of the word - indeed our first supposition should be that, until proven otherwise, they are NON-functional.

Michael Eisen said...

I am a biologist. What could possibly look MORE professional than walking around with a lizard on my head?

Michael M said...

Hmmmm....It seems that if that were really the case Larry would spend more time on content rather than form.

john harshman said...

Let's not be oversensitive here. Larry had to pick a photo. Who, when given a choice of a picture with a lizard on the head and one without, would fail to pick the one with the lizard? Sure, he misunderstood what you were saying. But a lot of people did. You weren't expressing yourself very clearly. And Larry has a tendency to knee-jerk responses in cases like this.

NickM said...

I will point out I apprehended what Mike meant:

...but then, I've seen his talks comparing the reduced Drosophila genome to the flabby genomes of other flies, which have the same exons in the same order but with 10 times more noncoding spacer sequence between them. In the talk he made 2 excellent points: (1) promoter regions appear to "drift" because they can originate or be lost through just a few lucky mutations, and (2) the reason to sequence things with big flabby genomes, and to make sure you sequence the "junk", is that, by comparing close relatives and measuring the conservation, you can see very clearly which sections of the genome are under purifying selection and which are drifting randomly -- whereas in Drosophila most of it is under purifying selection and you have no "contrast" in your conservation plots.

(He also made the point that this was a good argument about why to spend money to sequence some of the bloated mondo-genomes of salamanders.)

That, and I read his twitter feed religiously...

Larry Moran said...

Michael, pay no attention to andyboerger. He's completely incapable of recognizing humor.

I thought the photo was cool. I wish I had one like that of me.

Larry Moran said...

I can't figure out what you're saying.

First you say that only 10-15% is "functional" using a strict definition, then you say any substitution in the remaining 85-90% would have some measurable effect. If that's true then how do you explain the genetic load argument and how do you explain the evidence that all those alleles in junk DNA are behaving as though they were neutral alleles segregating by random genetic drift? Do you think everyone is wrong about that?

Why did you name your blog "it is NOT junk"? Do you think that any part of the genome is junk?

Michael Eisen said...

I don't understand where you're confused.

I am not trying to argue against any of the observations of genetic load/neutrality. Indeed I believe them very strongly. Rather I am trying to offer up a reconciliation between the observations of ENCODE and this reality - something that they largely ignore.

They seem to be operating under the general assumption that if you can measure it, it matters. I think this is absurd. My point was that if I change some random base in the genome, something is likely to happen. If it's in a transcribed region, then maybe the total number of transcripts made might increase/decrease slightly. Or the transcript might become marginally more/less stable. Or one of the thousands of RNA binding proteins in the genome will now bind/not bind to it. Or, if the sequence is not transcribed, then some transcription factor will bind more/less strongly to it. Something will happen, and if you look hard enough you will be able to measure it. This is what ENCODE is doing.

However, the fact that you can measure a biochemical effect does not mean it will affect fitness. Only a small fraction of the measurable biochemical events in a cell/organism will have a significant enough effect to be seen by selection.

Michael Eisen said...

I was not in any way offended by the choice of photo. It is one of the few of me that I actually like.

