Sandwalk: The 10th anniversary of the ENCODE publicity campaign fiasco

Monday, September 05, 2022

The 10th anniversary of the ENCODE publicity campaign fiasco

On Sept. 5, 2012 ENCODE researchers, in collaboration with the science journal Nature, launched a massive publicity campaign to convince the world that junk DNA was dead. We are still dealing with the fallout from that disaster.

The Encyclopedia of DNA Elements (ENCODE) was originally set up to discover all of the functional elements in the human genome. They carried out a massive number of experiments involving a huge group of researchers from many different countries. The results of this work were published in a series of papers in the September 6th, 2012 issue of Nature. (The papers appeared on Sept. 5th.)

Most of the papers are quite technical, and several of them are almost unreadable, so the consortium leaders published a summary article in order to explain the results to the average reader (Birney et al., 2012). The summary article contained these sentences in the abstract [my emphasis, LAM].

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions.

The idea that 80% of the genome is functional was taken to mean that there's almost no junk DNA in our genome, and this was the main theme of the publicity campaign launched by Nature and promoted in videos, press releases, and guest editorials [What did the ENCODE Consortium say in 2012?]. The death of junk DNA was announced on the same day in newspapers and science websites all over the world. (Science journalists were provided with advance notice under embargo until Sept. 5th.)

There was immediate criticism on blogs and other websites and this prompted an explanation from a senior Nature editor, Brendan Maher [Brendan Maher Writes About the ENCODE/Junk DNA Publicity Fiasco]. Here's how he explained the reason for the publicity campaign on the very next day (Sept. 6).

ENCODE was conceived of and practised as a resource-building exercise. In general, such projects have a huge potential impact on the scientific community, but they don’t get much attention in the media. The journal editors and authors at ENCODE collaborated over many months to make the biggest splash possible and capture the attention of not only the research community but also of the public at large. Similar efforts went into the coordinated publication of the first drafts of the human genome, another resource-building project, more than a decade ago. Although complaints and quibbles will probably linger for some time, the real test is whether scientists will use the data and prove ENCODE’s worth.

The important point here is that the publicity campaign was deliberate. It was planned over several months in order to make "the biggest splash possible." Apparently the "real test" will be whether researchers use the data, but another, fairly important, "real test" is whether they will agree with the conclusions reached by ENCODE researchers.

As the nature of the fiasco became known in the scientific community, there was some attempt to make excuses by claiming that the ENCODE researchers and Nature editors were misunderstood. The revisionist story is that they were using a very particular definition of function and they didn't mean to imply that this refuted junk DNA [How does Nature deal with the ENCODE publicity hype that it created?]. That's nonsense and everybody knows it. There's abundant evidence that the ENCODE researchers really did mean to sound the death knell for junk DNA and Nature supported them [The truth about ENCODE]. One of the best examples is a press release from the Sanger Institute (UK) on Sept. 5, 2012.

The ENCODE Project, today, announces that most of what was previously considered as ‘junk DNA’ in the human genome is actually functional. The ENCODE Project has found that 80 per cent of the human genome sequence is linked to biological function.

One of the best analyses of the ENCODE publicity campaign fiasco is a paper by Casane et al. (2015). Unfortunately, it is written in French but the English abstract tells you all you need to know [The apophenia of ENCODE or Pangloss looks at the human genome].

In September 2012, a batch of more than 30 articles presenting the results of the ENCODE (Encyclopaedia of DNA Elements) project was released. Many of these articles appeared in Nature and Science, the two most prestigious interdisciplinary scientific journals. Since that time, hundreds of other articles dedicated to the further analyses of the Encode data have been published. The time of hundreds of scientists and hundreds of millions of dollars were not invested in vain since this project had led to an apparent paradigm shift: contrary to the classical view, 80% of the human genome is not junk DNA, but is functional. This hypothesis has been criticized by evolutionary biologists, sometimes eagerly, and detailed refutations have been published in specialized journals with impact factors far below those that published the main contribution of the Encode project to our understanding of genome architecture. In 2014, the Encode consortium released a new batch of articles that neither suggested that 80% of the genome is functional nor commented on the disappearance of their 2012 scientific breakthrough. Unfortunately, by that time many biologists had accepted the idea that 80% of the genome is functional, or at least, that this idea is a valid alternative to the long held evolutionary genetic view that it is not. In order to understand the dynamics of the genome, it is necessary to re-examine the basics of evolutionary genetics because, not only are they well established, they also will allow us to avoid the pitfall of a panglossian interpretation of Encode. Actually, the architecture of the genome and its dynamics are the product of trade-offs between various evolutionary forces, and many structural features are not related to functional properties. In other words, evolution does not produce the best of all worlds, not even the best of all possible worlds, but only one possible world.

It's now been ten years since publication of the original paper and ENCODE has never again mentioned their 80% functional claim or claimed that most of the genome is functional.

We still have to deal with fallout from the huge success of a massive publicity campaign that spread false information about junk DNA. Ten years later, the majority of scientists and most of the general public still believe that ENCODE refuted junk DNA. It tells us that propaganda is much more effective than scientific evidence.

Birney et al. (The ENCODE Consortium) (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. [doi: 10.1038/nature11247]

Casane, D., Fumey, J., et Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. Med. Sci. (Paris) 31: 680-686. [doi: 10.1051/medsci/20153106023]

12 comments :

Georgi Marinov said...: Yeah, we knew it was going to happen like that 10 years ago, and here we are.

It's the usual pattern -- the first thing that people hear is what most of them remember as ground immutable truth, especially if it is designed to pander to their biases, then nobody pays attention to the corrections that come after that, and thus the damage is done and becomes very hard to undo.; Monday, September 05, 2022 8:46:00 AM
Larry Moran said...: @Georgi

Many editors on Wikipedia see the Kellis et al. paper as a defense of the ENCODE position. Is that what you thought when you were writing your part of the paper?

Did the people in your lab at the time think that they were disproving junk DNA? Did you ever talk about it in group meetings?

Where are you now?; Monday, September 05, 2022 10:04:00 AM
Georgi Marinov said...: As I said, once something has been given a very public platform and lay people see it for the first time, no amount of damage control can help fix things in the short term.

We had a lot of those discussions here at the time if you recall.

The 80% claim was never discussed by anyone prior to it appearing on September 5th 2012, it just appeared in the 2012 paper, and even then nobody in the know would have paid much attention to it (because they know what stood behind that number) if it wasn't for the media publicity (which again is something that came out of the blue for most).

I certainly didn't wake up on that day ten years ago with the expectation that what happened would happen.; Monday, September 05, 2022 1:11:00 PM
Larry Moran said...: @Georgi

Yes, but the Kellis et al paper was eighteen months later and there had been plenty of time to recognize what had happened in September 2012.

The ENCODE leaders could have said in that paper that they disowned the 80% function claim and did not dispute the existence of lots of junk DNA. They didn't say that and I think it's because most of them are very much opposed to junk DNA.

Do you agree?; Monday, September 05, 2022 1:31:00 PM
Joe Felsenstein said...: The state of acceptance of the very-little-junk-DNA delusion is illustrated by an article in today's New York Times section ScienceTimes. A report by Oliver Whang is called "Cracking the Case of the Giant Fern Genome" and starts:
"Humans, like many complex organisms, have large genomes, which contain the codes for our lives. Want to explain your dark hair, thin bones, and existential dread? Look to your 46 chromosomes and three billion nucleotide base pairs. But those numbers are nothing compared with the genomes of another organism, which contains twice as many base pairs and three times as many chromosomes."
He goes on to reveal that this is a Flying Spider Monkey Tree Fern, found in Southeast Asia. (He ignores even more massive genomes in lungfish).
Then he continues:
"What accounts for, or requires, so much DNA is what Fay-Wei Li, a botanist at the Boyce Thompson Institute calls 'the biggest question in fern genomics'"
The rest of the article does not address the question but concerns other ferns that have been found to have similar sized genomes.; Tuesday, September 13, 2022 2:41:00 PM
Larry Moran said...: @Joe

The sequences of two other fern genomes were published last week. The Ceratopteris richardii genome is also quite large (9.6 Gb, 7.5 Gb was sequenced) and 85% of it consists of repetitive DNA. Transposon-related sequences account for 75% of the genome. A total of 37,000 protein-coding genes were identified.

There's evidence of two polyploidy events (whole genome duplications, WGD). One was only 60 My ago and the other was 300 My ago. There's no mystery about the large genome. It's due to the whole genome duplications and expansion of repetitive DNA. I assume that a very large percentage of the genome is junk DNA but the authors don't mention junk DNA for some strange reason.

https://doi.org/10.1038/s41477-022-01226-7

The other paper reports the sequence of the maidenhair fern genome. The complete genome is 5.0 Gb of which 4.8 Gb was sequenced. 85% of the genome is repetitive DNA and most of this is transposon-related.

There's no evidence of a recent WGD (<300 My) in the maidenhair fern genome. The authors report 31,000 protein-coding genes and 9,000 noncoding genes.

In my opinion, most of the genome is junk DNA due to expansion of repetitive sequences so there's no great mystery about why the genome is larger than the human genome. But if the numbers of protein-coding genes are accurate, then some of the increase in genome size among ferns compared to mammals is due to more genes with large introns. Thus, part of the expansion includes insertion of junk repetitive DNA elements into introns. For some strange reason, these authors also avoid using the term "junk DNA" and avoid any mention of the possibility that much of the genome could be nonfunctional.

https://doi.org/10.1038/s41477-022-01222-x

The bottom line is that there isn't anything to see in the three fern genome sequences to alter the view that genome expansion is due to polyploidy events and transposon-related sequences giving rise to lots of junk DNA.

There have been at least a dozen popular science articles about this work but none of them have mentioned junk DNA. Almost all of them treat large genomes as an important mystery that's puzzling scientists.; Wednesday, September 14, 2022 1:19:00 PM
Joe Felsenstein said...: Just wait till they discover lungfish genomes ...; Wednesday, September 14, 2022 2:20:00 PM
Graham Jones said...: I think ferns can beat lungfish.

"Estimates of fern genome sizes range from 0.77 pg for Azolla microphylla (heterosporous leptosporangiate) to 65.55 pg for Ophioglossum reticulatum and 72.68 pg for Psilotum nudum (two eusporangiate ferns; Bennett and Leitch 2001; Obermayer et al. 2002)."
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4607520/

Ophioglossum reticulatum has very high ploidy with 1440 chromosomes.

Possibly the really important question in fern genomics is "How can we get a grant to sequence the really big ones?"; Wednesday, September 14, 2022 4:33:00 PM
Joe Felsenstein said...: I think Paris japonica, an alpine flower in Japan, beats them all.; Wednesday, September 14, 2022 5:20:00 PM
John Harshman said...: It's as if nobody has ever heard of the c-value problem and everyone needs to discover it anew, without knowledge of the literature about it. Or without knowledge of onions, fugu, genome size databases, etc.; Thursday, September 15, 2022 12:45:00 PM
Larry Moran said...: @John Harshman

The significance of a model can be judged by its explanatory power, which is another way of saying that a good model explains observations and data much better than bad models. The junk DNA view of genomes explains the range of genome sizes much better than any model suggesting that most of the human genome is functional.

This is why the Onion Test is so important in these discussions.

The bigger problem is that a large number of junk DNA skeptics are really bad at seeing the "big picture" and putting their speculations into the broader context of all of life and all of the data. I think this is a failure of critical thinking.; Friday, September 16, 2022 10:50:00 AM
Stewart said...: @Larry

I wouldn't appeal in general to whole genome duplication to account for large genomes in plants - while it accounts for the large genomes of neopolyploids such as Paris japonica relative to their diploid congeners, plants have gone through repeated cycles of polyploidisation and diploidisation, and any correlation with the number of rounds of ancient polyploidisation is weak: Arabidopsis thaliana was the first plant genome sequenced partly due to its small genome (it was also a model organism for the study of plant development) but still shows evidence of 2 or 3 rounds of whole genome duplication.

Gossypium (cotton) is a relatively young genus, but the random walk in genome sizes has been such that some diploid Australian cotton species have larger genomes than the tetraploid New World/Pacific species; this is not so much because the tetraploids (subgenus Karpas) have lost duplicated genes as that the Australian species (subgenus Sturtia) had acquired genomes twice the size of those of the Afro-Asian (subgenus Gossypium) and American (subgenus Houzingenia) lineages.

My layman's explanation for why plants in general have more genes than vertebrates (which have also gone through 3 rounds of whole genome duplication) is that plants have to synthesise all their own biomolecules - both in the essential bits of the metabolism, and in the secondary metabolites they produce to deter herbivores.; Monday, October 03, 2022 1:11:00 PM
