Sandwalk: How does Nature deal with the ENCODE publicity hype that it created?

Friday, May 09, 2014

How does Nature deal with the ENCODE publicity hype that it created?

Let's briefly review what happened in September 2012 when the ENCODE Consortium published their results (mostly in Nature).

Here's the abstract of the original paper published in Nature in September 2012 (Birney et al. 2012). Manolis Kellis (see below) is listed as a principal investigator and member of the steering committee.

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Most people reading this picked up on the idea that 80% of the genome had a function.

Here's a video produced by Nature. It features Senior Editor Magdalena Skipper and ENCODE Consortium PR frontman Ewan Birney. Pay attention to what Magdalena Skipper says at 2:24 and decide for yourself whether she thinks most of our genome is junk. (I assume that Ewan Birney approved this video.) It looks like hype to me.

Here's an article published on Sept. 5, 2012 by Nature writer Brendan Maher: ENCODE: The human encyclopaedia.

After an initial pilot phase, ENCODE scientists started applying their methods to the entire genome in 2007. Now that phase has come to a close, signalled by the publication of 30 papers, in Nature, Genome Research and Genome Biology. The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes (see page 57)1. But the job is far from done, says Birney, a computational biologist at the European Molecular Biology Laboratory’s European Bioinformatics Institute in Hinxton, UK, who coordinated the data analysis for ENCODE. He says that some of the mapping efforts are about halfway to completion, and that deeper characterization of everything the genome is doing is probably only 10% finished. A third phase, now getting under way, will fill out the human instruction manual and provide much more detail.

It's hard to read that any other way than saying that 80% of our genome has a function and very little is junk. We now know that's wrong and Nature was colluding with the ENCODE Consortium to hype the results.

Speaking of hype. Ryan Gregory collected a bunch of articles on the ENCODE results from September 2012. Almost all of the focus is on the idea that junk DNA has been debunked. Many of them contain quotations from ENCODE Consortium leaders reinforcing that claim. [The ENCODE media hype machine]. My favorites are Elizabeth Pennisi's article in Science where she declares the end of junk DNA [ENCODE Project Writes Eulogy for Junk DNA] and where she devotes a special feature to Ewan Birney [Genomics' Big Talker].

Here's how Ed Yong reported on the ENCODE papers back in September 2012. Ed Yong is an excellent science writer. It's not likely that he would misquote or misrepresent the views of Ewan Birney and Tom Gingeras.

According to ENCODE’s analysis, 80 percent of the genome has a “biochemical function”. More on exactly what this means later, but the key point is: It’s not “junk”. Scientists have long recognised that some non-coding DNA probably has a function, and many solid examples have recently come to light. But, many maintained that much of these sequences were, indeed, junk. ENCODE says otherwise. “Almost every nucleotide is associated with a function of some sort or another, and we now know where they are, what binds to them, what their associations are, and more,” says Tom Gingeras, one of the study’s many senior scientists.

And what’s in the remaining 20 percent? Possibly not junk either, according to Ewan Birney, the project’s Lead Analysis Coordinator and self-described “cat-herder-in-chief”. He explains that ENCODE only (!) looked at 147 types of cells, and the human body has a few thousand. A given part of the genome might control a gene in one cell type, but not others. If every cell is included, functions may emerge for the phantom proportion. “It’s likely that 80 percent will go to 100 percent,” says Birney. “We don’t really have any large chunks of redundant DNA. This metaphor of junk isn’t that useful.”

I suppose it's possible that all these journalists misunderstood what the ENCODE Consortium leaders were saying about function and junk. On the other hand, I suppose it's also possible that most of the journalists got it right. One thing is very clear. Nature blew it.

Now we've got a peculiar situation. With the publication of their latest paper (Kellis et al., 2014) the ENCODE Consortium is pretending that they didn't mean it after all. It's all a big misunderstanding.

An anonymous writer at Nature picks up on the story [ENCODE debate revived online]. Here's how he/she describes the current situation ...

In the social-media age, scientific disagreements can quickly become public — and vitriolic. A report from the ENCODE (Encyclopedia of DNA Elements) Project consortium proposes a framework for quantifying the functional parts of the human genome. It follows a controversial 2012 Nature paper by the same group that concluded that 80% of the genome is biochemically functional (Nature 489, 57–74; 2012). Dan Graur, who studies molecular evolutionary bioinformatics at the University of Houston in Texas and is a vocal ENCODE critic, weighed in on this latest report. ENCODE's “stupid claims” from 2012 have finally come back to “bite them in the proverbial junk”, Graur wrote on his blog. The targets noticed. “Some people seek attention through hyperbole and mockery,” says the report's first author Manolis Kellis, a computer scientist at the Massachusetts Institute of Technology in Cambridge. “We should stay focused on the issues.”

Did Manolis Kellis, lead author on the latest paper and contributing author on the original hype paper, actually criticize someone for "hyperbole"?

Kellis says that ENCODE isn't backing away from anything. The 80% claim, he says, was misunderstood and misreported. Roughly that proportion of the genome might be biochemically active, he explains, but some of that activity is undoubtedly meaningless, leaving unanswered the question of how much of it is really 'functional'. Kellis also argues that focusing on the portion of the genome that is shaped by natural selection can be misleading. For example, he says, genes that cause Alzheimer's disease or other late-in-life disorders may be largely immune to evolutionary pressure, but they are still definitely functional.

If the ENCODE Consortium leaders really meant something different than what was being reported in the media then they should have spoken up loud and clear in September 2012. They should have disavowed all the quotations that were attributed to them and they should have made it very clear that their results did not mean the end of junk DNA.

But I don't believe for a second that the 80% claim was misunderstood and misreported. I believe that most Consortium leaders really believed that there was almost no junk in our genome. I think most of them still believe this.

But there's another issue. No matter how you look at it, Nature was wrong. Either they were wrong because most of our genome is junk (as I believe) or they were wrong because they misrepresented the ENCODE results (as Kellis claims).

I wonder when we can expect an apology and a retraction from Nature? Or Science?

(Not holding my breath ....)

[Hat Tip: Dan Graur: Misunderstanding & Misreporting? Perjury? An NP_Complete Problem?]

Birney et al. (The ENCODE Consortium) (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. [doi: 10.1038/nature11247]

Kellis, M. et al. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) April 24, 2014 published online [doi: 10.1073/pnas.1318948111]

81 comments:

Matt G, Saturday, May 10, 2014 8:32:00 AM
It is very discouraging when the two most prominent science journals play games like this. Newspapers like the New York Times and the Washington Post have been slipping for years, but I naively expected science journals to maintain their integrity.

Humans have thousands of cell types? I thought the number was around 200.

Piotr Gąsiorowski, Saturday, May 10, 2014 3:30:00 PM
411, according to this source

Robert Byers, Sunday, May 11, 2014 8:25:00 PM
The thing about this is the equation for error.
If Nature was wrong, or its critics, then why couldn't there be more error??
How could Nature be wrong if they chose scientists?? How could THOSE scientists be wrong?
Science is about methodology to eliminate error BEFORE conclusions are strongly stated.
I wonder if they could be wrong as rain about other things?? Hmmm.

Unknown, Sunday, May 11, 2014 10:52:00 PM
Robert, scientists are as prone to error as anybody else. Peer review is aimed at catching these errors, but it is far from a perfect system. But where science is strong is that it is constantly being tested.

Yes, wrong theories often sneak in. But, eventually, the evidence against them builds up to the point where they must be discarded.

Fil Salustri, Monday, May 12, 2014 1:03:00 AM
It sounds to me like all the actors in the ENCODE affair may be confusing "behaviour" with "function." I come at this from outside the discipline, but even in my area (engineering design) there is often confusion between the two concepts. I wonder if this helps explain any of it.

Michael A. Phillips, Monday, May 12, 2014 8:13:00 AM
I wonder if we should re-consider the usefulness of the term 'function'. It has always bothered me because it implies agency and we simply have no proof of that. Wouldn't it be better to talk in terms of the 'properties' of an enzyme or regulatory sequence rather than its function if we don't want to imply 'purpose', which the term function clearly does?

'Behavior', as mentioned by Filippo above, also seems like a better way of describing the properties of complex molecules in biology.

Tom Mueller, Monday, May 12, 2014 8:54:00 AM
oops - so sorry for the double posting... I had meant to post here.

I suggest we stop using the expression ‘junk DNA’. My study is not filled with ‘junk’ as understood in the sense of trash or garbage (my wife may disagree). Consider such DNA as collateral miscellany, bric-a-brac useful in its own way when the occasion arises.

Other teachers may have more Spartan study rooms than I; and they manage to get their work done their own way; meanwhile I get my work done, my way.

I am always fascinated by human-chimp karyotype similarities. Why should karyotypes remain so similar after 7 million years? If all that non-coding DNA is conserved, it just stands to reason it must have some importance!

The suggestion that "excess" DNA (lungfish have 40x as much compared to humans, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by so-called "junk" supporters such as Ford Doolittle.

Ford Doolittle came up with an apt metaphor,

“…it's like the "clean fill" you see signs for along the highway. There may be a need for that much DNA but it doesn't matter what it is, as long as it doesn't contain deleterious sequences.”

Larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways.

When considering the c-value paradox, redundancy of function does not necessarily translate into lack of function!

Again, even Ford Doolittle said as much in his PNAS paper.

Tom Mueller, Monday, May 12, 2014 3:42:00 PM
@ Allan @ judmarc

Again, allow me to play devil’s advocate on the understanding that my impetuous naiveté in extremis will probably require smack-down.

Is it possible to compare the question
“How much of the genome is functional?”

to an analogous question …

“How many amino acids in a protein are functional?”

The second question to me appears quite absurd. For example, there is more to an enzyme than its active and allosteric sites. In a way, one can consider active sites and allosteric sites as operating in an amino-acid milieu provided by the rest of the protein. Deletions and substitutions are very proscribed.

What about chromosomes? Do they too have tertiary/quaternary structure not dissimilar to proteins?

Is there more to a chromosome than its functional transcripts together with their cis-acting regulatory elements? In other words, higher order chromosome architecture is essential to understanding gene control.

I am thinking along these lines:

http://www.nature.com/nature/journal/v502/n7469/full/nature12593.html
http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

To repeat how I understand Ford Doolittle: Larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily? [redundantly]) elaborate ways.

OK – I thank one and all in advance for your patience and indulgence.

Greenie, Monday, May 12, 2014 7:08:00 PM
Why should I believe you given that Ewan Birney got elected to the Royal Society last week for his ENCODE work? The press release suggests he is equivalent to Darwin (or by extension anyone arguing against ENCODE should be compared to the religious establishments of the era).

"Those elected to the Royal Society over the years include Isaac Newton, Charles Darwin, Dorothy Hodgkin, Tim Berners-Lee, John Sulston, Janet Thornton and Paul Nurse.

[snip]

In terms of data integration, Ewan has led the analysis in many genomic consortia, in particular ENCODE, leading the integration of many genomic assays; for example making robust predictions of enhancers, promoters, and their integration with disease associated regions. He also co-developed many widely used bioinformatics resources."

http://www.ebi.ac.uk/about/news/press-releases/ewan-birney-FRS

Tom Mueller, Tuesday, May 13, 2014 3:25:00 PM
Hi John

I may require some more smack-down of exuberant naiveté. As I mentioned above

Is there more to a chromosome than its functional transcripts together with their cis-acting regulatory elements? In other words, is not higher order chromosome architecture essential to understanding gene control?

I am thinking along these lines:

http://www.nature.com/nature/journal/v502/n7469/full/nature12593.html

http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

Anticipating the c-value paradox rebuttal, may I also repeat what I asked above:

The suggestion that "excess" DNA (lungfish have 40x as much compared to humans, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by so-called "junk" supporters such as Ford Doolittle.

In other words: When considering the c-value paradox, redundancy of function does not necessarily translate into lack of function... or does it!?

To continue with a totally naive comparison, perhaps chromosomes have their equivalent to tertiary and quaternary structure. Otherwise how does one explain constancy of karyotypes across primate lineages unless invoking positive selection?

Similarly, how else does one explain constancy of X chromosome architecture as indicated by its invariable 3D orientation in the nucleus?

Tom Mueller, Sunday, May 25, 2014 1:45:00 PM
Hi John – Hi Allan – Hi Georgi

You all are conjuring happy memories of a graduate seminar I attended decades ago! I remain in your debt and thank you.

I have always suggested to my students that perhaps chromosomes do indeed have their equivalent to tertiary and quaternary structure. Otherwise how does one explain constancy of karyotypes across primate lineages unless invoking positive selection?

I then ask my students to check out this link:
http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

FTR – unless I am missing something, even the champions of “JUNK-DNA” do not disagree!

Here is Ford Doolittle in a response to one of my naïve questions:

That "excess" DNA (that 40x as much that lungfish have compared to us, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by us "junk" supporters.

To my mind, it's like the "clean fill" you see signs for along the highway. There may be a need for that much DNA but it doesn't matter what it is, as long as it doesn't contain deleterious sequences. You are also suggesting, I think, that larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways. I agree, as spelled out in the [PNAS] paper. Ford Doolittle

ITMT, I remind you that the most recent common ancestor of the Hominidae lived roughly 14 million years ago! That is a remarkably long time to maintain the integrity of karyotype banding patterns for what you both claim to be functionless junk given the frequency of genomic rearrangement in eukaryotes. For example, many perfectly healthy populations of house mice can be distinguished from other house mice by fused chromosomes.

I suspect a strong selection pressure for the maintenance of karyotypes in Hominidae. Georgi’s answer perturbed me greatly: “[these] structural and spacing functions are not biochemically visible to the assays used by ENCODE”

Oh dear! I had naively presumed otherwise.

The problem is how to measure this positive selection if it in fact exists.

Anonymous, Friday, May 30, 2014 8:56:00 PM
Georgi Marinov,

You may have misunderstood the message from Larry... It was a veiled threat.... I'm sure you are very comfortable with that..... After all you are one of the very few who can "see" some sunlight.... Let's just hope it is enough....