
Friday, May 09, 2014

How does Nature deal with the ENCODE publicity hype that it created?

Let's briefly review what happened in September 2012 when the ENCODE Consortium published their results (mostly in Nature).

Here's the abstract of the original paper published in Nature in September 2012 (Birney et al. 2012). Manolis Kellis (see below) is listed as a principal investigator and member of the steering committee.
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.
Most people reading this picked up on the idea that 80% of the genome had a function.

Here's a video produced by Nature. It features Senior Editor Magdalena Skipper and ENCODE Consortium PR frontman Ewan Birney. Pay attention to what Magdalena Skipper says at 2:24 and decide for yourself whether she thinks most of our genome is junk. (I assume that Ewan Birney approved this video.) It looks like hype to me.


Here's an article published on Sept. 5, 2012 by Nature writer Brendan Maher: ENCODE: The human encyclopaedia.
After an initial pilot phase, ENCODE scientists started applying their methods to the entire genome in 2007. Now that phase has come to a close, signalled by the publication of 30 papers, in Nature, Genome Research and Genome Biology. The consortium has assigned some sort of function to roughly 80% of the genome, including more than 70,000 ‘promoter’ regions — the sites, just upstream of genes, where proteins bind to control gene expression — and nearly 400,000 ‘enhancer’ regions that regulate expression of distant genes (see page 57)1. But the job is far from done, says Birney, a computational biologist at the European Molecular Biology Laboratory’s European Bioinformatics Institute in Hinxton, UK, who coordinated the data analysis for ENCODE. He says that some of the mapping efforts are about halfway to completion, and that deeper characterization of everything the genome is doing is probably only 10% finished. A third phase, now getting under way, will fill out the human instruction manual and provide much more detail.
It's hard to read that any other way than saying that 80% of our genome has a function and very little is junk. We now know that's wrong and Nature was colluding with the ENCODE Consortium to hype the results.

Speaking of hype. Ryan Gregory collected a bunch of articles on the ENCODE results from September 2012. Almost all of the focus is on the idea that junk DNA has been debunked. Many of them contain quotations from ENCODE Consortium leaders reinforcing that claim. [The ENCODE media hype machine]. My favorites are Elizabeth Pennisi's article in Science where she declares the end of junk DNA [ENCODE Project Writes Eulogy for Junk DNA] and where she devotes a special feature to Ewan Birney [Genomics' Big Talker].

Here's how Ed Yong reported on the ENCODE papers back in September 2012. Ed Yong is an excellent science writer. It's not likely that he would misquote or misrepresent the views of Ewan Birney and Tom Gingeras.
According to ENCODE’s analysis, 80 percent of the genome has a “biochemical function”. More on exactly what this means later, but the key point is: It’s not “junk”. Scientists have long recognised that some non-coding DNA probably has a function, and many solid examples have recently come to light. But, many maintained that much of these sequences were, indeed, junk. ENCODE says otherwise. “Almost every nucleotide is associated with a function of some sort or another, and we now know where they are, what binds to them, what their associations are, and more,” says Tom Gingeras, one of the study’s many senior scientists.

And what’s in the remaining 20 percent? Possibly not junk either, according to Ewan Birney, the project’s Lead Analysis Coordinator and self-described “cat-herder-in-chief”. He explains that ENCODE only (!) looked at 147 types of cells, and the human body has a few thousand. A given part of the genome might control a gene in one cell type, but not others. If every cell is included, functions may emerge for the phantom proportion. “It’s likely that 80 percent will go to 100 percent,” says Birney. “We don’t really have any large chunks of redundant DNA. This metaphor of junk isn’t that useful.”
I suppose it's possible that all these journalists misunderstood what the ENCODE Consortium leaders were saying about function and junk. On the other hand, I suppose it's also possible that most of the journalists got it right. One thing is very clear. Nature blew it.

Now we've got a peculiar situation. With the publication of their latest paper (Kellis et al., 2014) the ENCODE Consortium is pretending that they didn't mean it after all. It's all a big misunderstanding.

An anonymous writer at Nature picks up on the story [ENCODE debate revived online]. Here's how he/she describes the current situation ...
In the social-media age, scientific disagreements can quickly become public — and vitriolic. A report from the ENCODE (Encyclopedia of DNA Elements) Project consortium proposes a framework for quantifying the functional parts of the human genome. It follows a controversial 2012 Nature paper by the same group that concluded that 80% of the genome is biochemically functional (Nature 489, 57–74; 2012). Dan Graur, who studies molecular evolutionary bioinformatics at the University of Houston in Texas and is a vocal ENCODE critic, weighed in on this latest report. ENCODE's “stupid claims” from 2012 have finally come to back to “bite them in the proverbial junk”, Graur wrote on his blog. The targets noticed. “Some people seek attention through hyperbole and mockery,” says the report's first author Manolis Kellis, a computer scientist at the Massachusetts Institute of Technology in Cambridge. “We should stay focused on the issues.”
Did Manolis Kellis, lead author on the latest paper and contributing author on the original hype paper, actually criticize someone for "hyperbole"?
Kellis says that ENCODE isn't backing away from anything. The 80% claim, he says, was misunderstood and misreported. Roughly that proportion of the genome might be biochemically active, he explains, but some of that activity is undoubtedly meaningless, leaving unanswered the question of how much of it is really 'functional'. Kellis also argues that focusing on the portion of the genome that is shaped by natural selection can be misleading. For example, he says, genes that cause Alzheimer's disease or other late-in-life disorders may be largely immune to evolutionary pressure, but they are still definitely functional.
If the ENCODE Consortium leaders really meant something different than what was being reported in the media then they should have spoken up loud and clear in September 2012. They should have disavowed all the quotations that were attributed to them and they should have made it very clear that their results did not mean the end of junk DNA.

But I don't believe for a second that the 80% claim was misunderstood and misreported. I believe that most Consortium leaders really believed that there was almost no junk in our genome. I think most of them still believe this.

But there's another issue. No matter how you look at it, Nature was wrong. Either they were wrong because most of our genome is junk (as I believe) or they were wrong because they misrepresented the ENCODE results (as Kellis claims).

I wonder when we can expect an apology and a retraction from Nature? Or Science?

(Not holding my breath ....)


[Hat Tip: Dan Graur: Misunderstanding & Misreporting? Perjury? An NP_Complete Problem?]

Birney et al. (The ENCODE Consortium) (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. [doi: 10.1038/nature11247]

Kellis, M. et al. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) April 24, 2014 published online [doi: 10.1073/pnas.1318948111]

81 comments :

Matt G said...

It is very discouraging when the two most prominent science journals play games like this. Newspapers like the New York Times and the Washington Post have been slipping for years, but I naively expected science journals to maintain their integrity.

Humans have thousands of cell types? I thought the number was around 200.

Piotr Gąsiorowski said...

411, according to this source

Robert Byers said...

The thing about this is the equation for error.
If Nature was wrong, or its critics, then why couldn't there be more error??
How could Nature be wrong if they chose scientists?? How could THOSE scientists be wrong?
Science is about methodology to eliminate error BEFORE conclusions are strongly stated.
I wonder if they could be wrong as rain about other things?? Hmmm.

Unknown said...

Robert, scientists are as prone to error as anybody else. Peer review is aimed at catching these errors, but it is far from a perfect system. But where science is strong is that it is constantly being tested.

Yes, wrong theories often sneak in. But, eventually, the evidence against them builds up to the point where they must be discarded.

Fil Salustri said...

It sounds to me like all the actors in the ENCODE affair may be confusing "behaviour" with "function." I come at this from outside the discipline, but even in my area (engineering design) there is often confusion between the two concepts. I wonder if this helps explain any of it.

Michael A. Phillips said...

I wonder if we should re-consider the usefulness of the term 'function'. It has always bothered me because it implies agency and we simply have no proof of that. Wouldn't it be better to talk in terms of the 'properties' of an enzyme or regulatory sequence rather than its function if we don't want to imply 'purpose', which the term function clearly does?

'Behavior', as mentioned by Filippo above, also seems like a better way of describing the properties of complex molecules in biology.

Tom Mueller said...

oops - so sorry for the double posting... I had meant to post here.

I suggest we stop using the expression ‘junk DNA’. My study is not filled with ‘junk’ as understood in the sense of trash or garbage (my wife may disagree). Consider such DNA as collateral miscellany, bric-a-brac useful in its own way when the occasion arises.

Other teachers may have more Spartan study rooms than I; and they manage to get their work done their own way; meanwhile I get my work done, my way.

I am always fascinated by human-chimp karyotype similarities. Why should karyotypes remain so similar after 7 million years? If all that non-coding DNA is conserved, it just stands to reason it must have some importance!

The suggestion that "excess" DNA (lungfish have 40x as much compared to humans, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by so-called "junk" supporters such as Ford Doolittle.

Ford Doolittle came up with an apt metaphor,

“…it's like the "clean fill" you see signs for along the highway. There may be a need for that much DNA but it doesn't matter what it is, as long as it doesn't contain deleterious sequences.”

Larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways.

When considering the c-value paradox, redundancy of function does not necessarily translate into lack of function!

Again, even Ford Doolittle said as much in his PNAS paper.

Diogenes said...

Robert, creationists said ENCODE proved there was no junk. If creationists were wrong about that, couldn't they be wrong about other things?

Piotr Gąsiorowski said...

They also claimed ENCODE results confirmed their predictions:

As just one example of a successful ID-based prediction:

Non-functionality of “junk DNA” was predicted by Susumu Ohno (1972), Richard Dawkins (1976), Crick and Orgel (1980), Pagel and Johnstone (1992), and Ken Miller (1994), based on evolutionary presuppositions.

By contrast, predictions of functionality of “junk DNA” were made based on teleological bases by Michael Denton (1986, 1998), Michael Behe (1996), John West (1998), William Dembski (1998), Richard Hirsch (2000), and Jonathan Wells (2004).

These Intelligent Design predictions are being confirmed. e.g., ENCODE’s June 2007 results show substantial functionality across the genome in such “junk” DNA regions, including pseudogenes.


Now, I suppose, they will have to pretend they never made such predictions.

Piotr Gąsiorowski said...

Oops, I forgot the link.

Larry Moran said...

@Piotr Gąsiorowski,

It is simply not true to say that Richard Dawkins predicted massive amounts of junk DNA in his book The Selfish Gene (1976). It's also not correct to say that Crick and Orgel (1980) predicted that our genome was full of junk. Both of these references are about "selfish DNA," which is a way of explaining the presence of transposons in our genome. Transposons are FUNCTIONAL elements. They are not junk. These are actually anti-junk arguments.

Richard Dawkins was very skeptical of junk DNA so it's a blatant misrepresentation of his view to say that he predicted junk DNA.

The Pagel & Johnstone (1992) paper is just one of dozens of papers that make the case for junk DNA based on C-value variations.

Ken Miller argues that the conservation of pseudogenes in humans and chimpanzees is evidence of common ancestry. Many others make this argument as well but it has very little to do with the debate over the quantity of junk DNA in our genome. The argument would be just as valid if the only junk in our genomes was the 1% attributed to pseudogenes.

Miller did not "predict" junk DNA. Almost everyone agrees that pseudogenes are junk.

You have to give the IDiots credit for one thing. They have an almost perfect record of misunderstanding the scientific literature and failing to understand the arguments of their opponents. It's pretty amazing, actually. It almost looks deliberate but it's much kinder to attribute it to stupidity.

judmarc said...

Larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways.

Well, you could attribute substantial genome size differences among various members of the allium family to closely related species going off and regulating function in ways vastly different vastly different from each other; or you could attribute it to simple contingency (e.g., copying error doubling the amount of some subpart of the "junk" in the genome). Which seems more likely to you?

judmarc said...

Ah, I see there's a "copy error" in my own comment above! :)

Joe Felsenstein said...

Dawkins is out of the British school of evolutionary biologists, who tended to be rather panselectionist, and were slow to accept the ubiquity of neutral mutation. (However John Maynard Smith and Brian Charlesworth accepted it rather early).

Creationists and ID advocates rewrite history when they say that evolutionary biologists wanted to see junk DNA as most of our genome. The opposite was true -- although population geneticists early on worried about the "c value paradox" and mutational load, they were actually a bit reluctant to accept that so much of our genome was neutral. But they did ultimately accept it, and then noticed that similarities in neutral sequences were a good argument for common ancestry. Creationists and ID types make a conspiracy theory that evolutionary biologists foisted the notion of junk DNA on us purely to make the argument for common descent.

The issue of whether we call active transposons "junk" is semantic, but very interesting. They create dead transposons that are mostly neutral, but don't even dead transposons have an effect by binding polymerase from the live ones? From the point of view of adaptation, they are "junk" in the sense of being deleterious. Should we only call them "junk" if they have no fitness effect?

AllanMiller said...

Which is more fascinating, karyotype preservation in the human/chimp clade or the wide variation in - say - some deer? http://mbe.oxfordjournals.org/content/17/9/1326.full

Of course, alignments of these chromosomes would probably reveal deep and preserved commonality. But given a reasonable assumption of mutation rate, how much difference would you expect anyway after 5-7 million years? The bulk of the genome doesn't look to be under positive selection, even for that bulk itself. Mechanisms for deletion of surplus bases are few, and selection for such loss weak at best.
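
The "how much difference would you expect" question can be put into rough numbers. What follows is only a back-of-the-envelope sketch, assuming neutral sites, where the expected substitution rate equals the mutation rate, and assuming both lineages accumulate changes independently since the split. The function name and the two per-year rates are illustrative assumptions, not figures from this thread.

```python
def expected_neutral_divergence(mu_per_site_per_year, years_since_split):
    """Expected fraction of neutral sites that differ between two lineages.

    Both lineages accumulate substitutions independently after the split,
    hence the factor of 2. Multiple hits at the same site are ignored,
    which is a reasonable approximation while the result stays small.
    """
    return 2.0 * mu_per_site_per_year * years_since_split


# Assumed, illustrative mutation rates (per site per year).
for mu in (0.5e-9, 1.0e-9):
    for years in (5e6, 7e6):
        d = expected_neutral_divergence(mu, years)
        print(f"mu={mu:.1e}/yr, t={years:.0e} yr: {100 * d:.2f}% of sites differ")
```

Under these assumed rates the expected divergence comes out on the order of one percent, so a high degree of similarity between the two genomes is what neutral expectations alone would give.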

AllanMiller said...

I still prefer Ohno's original definition - cannot suffer a deleterious mutation. Deleterious from the point of view of the broader genome, that would be, and therefore transposons, live or dead, are junk IMO.

Tom Mueller said...

@ Joe
You raise a very interesting point. I was under the impression that the prevalence of transposable elements in our genome was indeed functionally significant.

I always understood that retroviruses co-opted host regulatory machinery and vice versa constituting the acme in molecular host-parasite coevolution.

http://www.nature.com/nature/journal/v487/n7405/full/nature11244.html
http://www.sciencedaily.com/releases/2007/11/071114121359.htm

Meanwhile, the different distributions of Alu and LINE1 in the genome would suggest that selection pressure may be involved. Do Alus direct methylation? Are Alus and Line1 DNA symbionts?

Larry Moran said...

I prefer to call active transposons functional because they are examples of selfish DNA. I realize that their functionality doesn't affect the genome of the organism but it's still "function" in my book. The same reasoning applies to integrated DNA viruses and retroviruses as long as they are still capable of making complete virus particles.

Fortunately, this doesn't affect the debate very much because the vast majority of transposon sequences are nonfunctional bits and pieces of formerly active transposons.

This might be a remnant of my days hanging out with the phage group back in the 60s and 70s. I can't imagine anyone referring to a λ prophage as junk DNA. You would be laughed at.

Tom Mueller said...

@ Allan @ judmarc

Again, allow me to play devil’s advocate on the understanding that my impetuous naiveté in extremis will probably require smack-down.

Is it possible to compare the question
“How much of the genome is functional?”

to an analogous question …

“How many amino acids in a protein are functional?”

The second question to me appears quite absurd. For example, there is more to an enzyme than its active and allosteric sites. In a way, one can consider active sites and allosteric sites as operating in an amino-acid milieu provided by the rest of the protein. Deletions and substitutions are very proscribed.

What about chromosomes? Do they too have tertiary/quaternary structure not dissimilar to proteins?

Is there more to a chromosome than its functional transcripts together with their cis-acting regulatory elements? In other words, higher order chromosome architecture is essential to understanding gene control.

I am thinking along these lines:

http://www.nature.com/nature/journal/v502/n7469/full/nature12593.html
http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

To repeat how I understand Ford Doolittle: Larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily? [redundantly]) elaborate ways.

OK – I thank one and all in advance for your patience and indulgence.

Tom Mueller said...

@Larry

Excuse my lack of clarity. My point was that viruses have co-opted host regulatory machinery and vice versa. That constitutes the acme in molecular host-parasite coevolution especially in primates: to the point that Alu and Sine1 exhibit significant purifying selection.

Forget parasitism. I should have said: Alu and Sine1 are no longer just "selfish genes". We are in fact witnessing molecular mutualism and no longer even molecular commensalism.

Re your remark about λ

Bravo!!!

Americans have every right to be proud of their great accomplishments such as the Manhattan Project and the race to the Moon but in my opinion these achievements pale in significance to the accomplishments of the Phage Group!!!

Youngsters nowadays do not appreciate that! ... explaining yet one more reason why I appreciate your blog so much!

Claudiu Bandea said...

Tom Mueller: Are Alus and Line1 DNA symbionts?

Tom,

First, I want to say that I enjoyed reading your comments here at Sandwalk; they are refreshing. It must be that ‘high school environment’ that encourages common sense and curiosity…

Interestingly, a few days ago at Dan Graur’s blog (Judge Starling), I labeled ‘junk DNA’ as ‘symbiotic DNA' (sDNA) (http://judgestarling.tumblr.com/post/85095159186/functional-pseudogenes-people-like-function-hate-to#disqus_thread)

And, I did that for a good reason; please see the following material:

http://biorxiv.org/content/early/2013/11/18/000588

http://www.ncbi.nlm.nih.gov/pubmed/23479647#cm23479647_1429

Joe Felsenstein said...

@Tom Mueller: I was naïvely assuming that Alus and such are mostly just parasites. The occasional case where they get coopted into doing something useful does not persuade me that most of them are doing anything useful (the folks over at Uncommon Descent seem instantly persuaded of that by every single case they hear of).

The question I raised was: if so, should we still call them "functional" and should be still call them "junk"? I'd say "yes" and "yes" but it's a matter of semantics and others here will have different opinions on that.

Greenie said...

Why should I believe you given that Ewan Birney got elected to the Royal Society last week for his ENCODE work? The press release suggests he is equivalent to Darwin (or by extension anyone arguing against ENCODE should be compared to the religious establishments of the era).

"Those elected to the Royal Society over the years include Isaac Newton, Charles Darwin, Dorothy Hodgkin, Tim Berners-Lee, John Sulston, Janet Thornton and Paul Nurse.

[snip]

In terms of data integration, Ewan has led the analysis in many genomic consortia, in particular ENCODE, leading the integration of many genomic assays; for example making robust predictions of enhancers, promoters, and their integration with disease associated regions. He also co-developed many widely used bioinformatics resources."

http://www.ebi.ac.uk/about/news/press-releases/ewan-birney-FRS

Joe Felsenstein said...

It feels too much like a reward for killing off junk DNA. Hmm ....

Joe Felsenstein said...

Typo: ... and should we still ...

Robert Byers said...

NO. I disagree. Scientists can't be wrong like everyone else. tHey claim to be using the scientific method BEFORE hard and fast conclusions are drawn like everyone else does and fairly.
NO you are wrong.
Science makes its claims for accuracy based on methodology. Not jUST what they think at this moment in this issue.
Thats why they tell creationists evolution is scientific fact. As opposed to other facts in society that are well supported themselves.
Therefore this matter hits the nerve about how these subjects are not actually having applied scientific investigation. It isn't easily done in subjects like this.
So error is as likely as accuracy. So these mags are the ones who vote up or down truth. this because the evidence is so flimsy.
Evolution was voted up but soon will be voted down.
tHere is no biological scientific evidence for evolution or it easily would convince the public.
Only the public who trusts evolutionists and the system behind them believe in evolution for that reason.
Actual evidence for evolution is almost irrelevant in the issue.

Diogenes said...

"error is as likely as accuracy" is typical bullshit from Byers. He thinks the Earth being flat is as likely as it being round.

The creationists and IDers lied about junk DNA, Robert, and now they're exposed. You're trying to flip this-- IDers having been caught lying again about easily verifiable facts-- into "This proves the creationists were right."

No. We caught them lying. Again. The IDers at the Discovery Institute said ENCODE assayed & identified functions in almost all DNA. They lied. You can't flip it into "Their lies prove they were telling the truth." $%&# you for even trying.

IDers lied. They were caught lying. Most scientists give no thought to ID hoaxsters, but the few who do feel nothing but contempt for the Dishonesty Institute. They will never win, they will never be majority nor a plurality, they will never be taken seriously, there will never be a controversy.

IDers lied. They got caught lying. @#$% you for trying to flip that into a win for the perps.

Diogenes said...

Greenie, Ewan Birney himself has admitted in multiple venues that most of the human genome (perhaps 80%) is NON-functional, and he admitted ENCODE didn't disprove the junk DNA hypothesis.

Who will you believe-- Ewan Birney, who admits ENCODE never refuted junk DNA, or the lying Intelligent Design perps of the Discovery Institute, who say it did?

Greenie said...

Really? The abstract of Birney's Nature paper states -

"These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions."

Also, I see Birney right there in the video posted above. Given that Skipper is only an editor articulating the observations of the ENCODE paper, I would say that the entire video is based on Birney's paper and not Skipper's scientific observations.

Georgi Marinov said...

It's in general a good idea to read more of a paper than the abstract before talking about its conclusions. In that case, what is meant by "biochemical function" is defined on the first page:

Operationally, we define a functional element as a discrete genome segment that encodes a defined product (for example, protein or non-coding RNA) or displays a reproducible biochemical signature (for example, protein binding, or a specific chromatin structure)


Keeping that definition (which, BTW, makes everything that is said in the paper correct, and precludes any possibility of retraction; what was said by who outside of it is a different subject) in mind, the interpretation of those words becomes much different. As should have been clarified by the PNAS paper - did you read that one?

John Harshman said...

Most of the fault still belongs to Birney and his co-authors, who defined biochemical function in such a way that junk DNA and random DNA sequences are functional. If you call a tail a leg, then a dog does have five legs, but it would make a very bad paper to announce that your research shows that dogs have five legs.

Unknown said...

"I disagree. Scientists can't be wrong like everyone else. tHey claim to be using the scientific method BEFORE hard and fast conclusions are drawn like everyone else does and fairly."

While I'm aware of the futility of trying to explain something to Byers, in case an innocent bystander finds that persuasive:
The scientific method produces tested hypotheses. When something gets published, we expect the hypothesis to have been tested, usually with the demand that there's significance at the p=.05 level. Now p=.05 means that, if there were no real effect, there would still be a 1 in 20 chance of getting results this extreme by statistical fluke. And in fact publication bias might make matters worse - there are hypotheses that get tested and fail and then don't get published, but if somebody else tests the same hypothesis their results might be significant.
At the point of publication the hypothesis is stated clearly and strongly. Which makes it easier to falsify it than a statement that is fuzzy and weak.
Later research tests these again. That's how we find out whether the hypothesis is wrong. Publication is just the second step - you come up with a hypothesis, then put in the legwork trying to disprove it until it meets the .05 standard. And then you tell the scientific community: I've failed to reject this, maybe you can do it. And then the key part of the scientific method kicks in: Other people trying to bring your hypothesis down, using additional methods, generating more data...
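
The point about the .05 standard and retesting can be illustrated with a small simulation. This is only a sketch under stated assumptions: every simulated experiment has no real effect, each one is analysed with a simple two-sided z-test with known standard deviation, and the function names, sample sizes, and seed are made up for illustration.

```python
import math
import random


def two_sided_p_value(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: the true mean is 0 (sigma known)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n) / sigma
    # Standard normal CDF via erf: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)


def false_positive_rate(n_experiments=20000, n_per_experiment=30,
                        alpha=0.05, seed=1):
    """Fraction of no-effect experiments that still reach p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Data drawn with true mean 0, so any "significant" result is a fluke.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_experiment)]
        if two_sided_p_value(sample) < alpha:
            hits += 1
    return hits / n_experiments


def chance_of_some_fluke(n_labs, alpha=0.05):
    """Probability that at least one of n_labs independent tests of a
    no-effect hypothesis comes out significant at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_labs
```

Calling `false_positive_rate()` should give a value close to 0.05, and `chance_of_some_fluke(5)` shows how the odds of at least one spurious "significant" result grow when several labs test the same dead hypothesis and only the hit gets published. That is why the later retesting matters more than the first publication.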

Of course creationists like to think it's done as soon as it's published because then they can wave around the papers on biochemistry they snuck past peer review in journals of electrical engineering or whatever. It's like declaring potty training complete, when you manage to run into the bathroom before peeing in your pants.

Tom Mueller said...

Hi John

I may require some more smack-down of exuberant naiveté. As I mentioned above

Is there more to a chromosome than its functional transcripts together with their cis-acting regulatory elements? In other words, is not higher order chromosome architecture essential to understanding gene control?

I am thinking along these lines:

http://www.nature.com/nature/journal/v502/n7469/full/nature12593.html

http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

Anticipating the c-paradox rebuttal, may I also repeat what I asked above

The suggestion that "excess" DNA (lungfish have 40x as much compared to humans, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by so-called "junk" supporters such as Ford Doolittle.

In other words: When considering the c-value paradox, redundancy of function does not necessarily translate into lack of function... or does it!?

To continue with a totally naive comparison, perhaps chromosomes have their equivalent to tertiary and quaternary structure. Otherwise how does one explain constancy of karyotypes across primate lineages unless invoking positive selection?

Similarly, how else does one explain constancy of X chromosome architecture as indicated by its invariable 3D orientation in the nucleus?

Tom Mueller said...

@ Claudiu Bandea

Thank you for your kind words!

Family beckons - I will respond in detail to your posted links when I have a chance to read them!

best regards

Georgi Marinov said...

Structural function is very much not what the debate is about - structural and spacing functions are not biochemically visible to the assays used by ENCODE. The claims made were of a different nature.

Tom Mueller said...

@ Joe

I need to pursue this line of inquiry further. I thank you for your patience and your indulgence.

My understanding was that most transposable elements have indeed been silenced. But those that have not been silenced are in fact symbiotic.

ITMT - much of what passes for regulatory sequences in primates owes its existence to ancestral retroviruses.

I guess what should be asked - how much of what we are talking about is under positive selection?

Diogenes said...

Greenie, I'm rather intrigued by your use of the word "Really?" followed by an extract from the abstract of the lead ENCODE paper, which was dissected at enormous length on this blog two years ago, and which all of us here know by heart.

Your use of the phrase "Really?", and then quoting a document which all of us here practically know by heart-- makes me think that you're attempting to falsely create the impression that you know more about the issue than we do. You're the expert. Uh huh.

Greenie, before I dump a pile of Birney quotes on you, I'm going to ask you a direct question: Has Ewan Birney ever stated that the fraction of the human genome that's functional (with "function" here meaning the definition relevant to natural selection/Junk DNA hypothesis) is between 10% and 20%?

It's a simple yes or no question, Greenie. You could also answer "I don't know" but I doubt you will ever say "I don't know", because I think you're trying to dishonestly portray yourself as knowing more about the subject than we do.

I'm asking this question to determine how far you'll go down the rabbit hole of dishonesty. (Note that on this blog we often have internal controversies about whether ID proponents are deliberately lying or just plain stupid. Yes, I'm asking Greenie this question as a little experiment to gather evidence for my side of the "lying or stupid" controversy. Pay attention, John Harshman.)

It's a simple yes or no question, Greenie. Has Ewan Birney ever stated that the fraction of the human genome that's functional (with "function" here meaning the definition relevant to natural selection/Junk DNA hypothesis) is between 10% and 20%?

Answer the question, and then I'll dump a pile of Ewan Birney quotes on you.

John Harshman said...

There are simple reasons for conservation (such as it is) of karyotype: too much difference in chromosome structure makes meiosis difficult in heterozygotes or may cause gametes to have too many or no copies of some essential gene. I see no reason to invoke any hypothetical tertiary or quaternary structure.

I am not acquainted with the invariable 3D orientation of the X chromosome, and so can't comment on it.

I do have trouble believing in radical transformation of the basics of gene regulation resulting from 320-fold differences in DNA content (that's fugu vs. lungfish) with no apparent change in the gene regulation we know about and no huge transformation of morphology.

Besides which, there are many other reasons to believe in junk DNA aside from the onion test/c-value paradox. Lack of apparent purifying selection, for example.

Robert Byers said...

What?? You confuse a simple matter.
Science is a methodology or it's just other things known.
Evolution claims to be a theory of science. Not just hypothesis in process.
Therefore it's on the merits of science that it must prove it's a theory.
I say it has failed to do this. It's a flaw to say evolutionary biology is a theory of science. It's just a hunch with unrelated data claiming to be supportive.

If science is real then it must be about methodology before conclusions are drawn.
Otherwise it's just ordinary weighing data as everyone does.
I say there is no science and IT IS just people figuring things out. Just more careful.
So I accept its claims to being a better methodology but it must prove it's done this better methodology.
Evolution fails.
That's why I always hit them with the GIVE ME your top three biological scientific evidences for the great claims of evolution.
They never fail me that they fail.

christine janis said...

"I say there is no science and IT IS just people figuring things out. "

Congratulations, Robert. You finally understand what science is about.

Greenie said...

"Has Ewan Birney ever stated that the fraction of the human genome that's functional (with "function" here meaning the definition relevant to natural selection/Junk DNA hypothesis) is between 10% and 20%?"

Can you qualify your 'ever'? I do not follow Birney 24/7, but in science formal publication is all that counts and I have not seen Birney make any such statement in any post-ENCODE paper. 80% functionality of human genome is what stands until ENCODE formally retracts their claim.

Not only that, Birney became a member of Royal Society AFTER Dan Graur, Doolittle, Moran etc. raised hell about Birney's claim of murdering junk DNA. That means the society led by Sir Paul Nurse heard all negative opinions and decided to accept Birney's argument of 80-100% (biochemical) functionality of human genome.

Here is the quote of Ewan Birney, FRS from the main blog post -

'If every cell is included, functions may emerge for the phantom proportion. “It’s likely that 80 percent will go to 100 percent,” says Birney. “We don’t really have any large chunks of redundant DNA. This metaphor of junk isn’t that useful.”'

That is an amazing discovery, for which he will become Sir Birney sooner or later.

Diogenes said...

It's a simple yes or no question, Greenie. Has Ewan Birney ever stated that the fraction of the human genome that's functional (with "function" here meaning the definition relevant to natural selection/Junk DNA hypothesis) is between 10% and 20%?

I want you all to pay attention. All you who say creationists aren't deliberately, consciously, knowingly lying. All you who say they just deceived themselves.

Pay attention to how calculated their insinuations and evasions are. They know they're lying.

ANSWER THE QUESTION.

AllanMiller said...

how does one explain constancy of karyotypes across primate lineages unless invoking positive selection?

Tom - what John said. Karyotype variants are filtered by difficulties in meiosis. A rare variant is more likely than not to be selectively disadvantageous. If it becomes common (as, through drift, it might), the disadvantage dissipates, but this provides a drag on variation.

Further, there aren't that many ways of getting rid of excess, and deletions are more likely to remove vital sequence, again tending to preserve the status quo or an insertion over such deletions.

And finally, just how heavy is the burden of excess DNA? It tends to be more assumed than demonstrated that junky genomes will be costly, but when you analyse such a cost, you must do so against the current population. It is possible (though not necessarily so) that having 80% junk makes one less fit than a variant that has none, but that's not the contest. It's one that pits genomes that vary by fractions of a percent against each other.

Lynch invokes population size as a key factor - the smaller populations of eukaryotes are less effective at dealing with the small selection coefficients involved. But I am skeptical of the breadth of application even of this principle. There appears to be a correlation with Ne within similar types, but I think it is over-extended across species with very different mechanistic constraints. I think those constraints themselves provide the main reason for differences across kingdoms.

No-one has properly assessed what the selection coefficients associated with surplus DNA actually are.

Mikkel Rumraket Rasmussen said...

Can you qualify your 'ever'? I do not follow Birney 24/7, but in science formal publication is all that counts and I have not seen Birney make any such statement in any post-ENCODE paper. 80% functionality of human genome is what stands until ENCODE formally retracts their claim.

They HAVE formally retracted their initial claim in their most recent paper, by explicitly conceding there is serious debate about the sensibility of their initial definition of function as "biochemical activity".

I'm with Diogenes here, are you a LIAR for doctrine or just ignorant?

Larry Moran said...

Georgi Marinov says,

Keeping that definition (which, BTW, makes everything that is said in the paper correct, and precludes any possibility of retraction; what was said by who outside of it is a different subject) in mind, the interpretation of those words becomes much different.

I understand why some members of the ENCODE Consortium want to blame everything on the media and claim that they never meant to announce the death of junk DNA. However, I'm not going to let them get away with hyping this revisionist history. I was going to quote you passages from the Birney et al. paper but I've decided to make it into a separate post.

Greenie said...

"I want you all to pay attention. All you who say creationists aren't deliberately, consciously, knowingly lying. All you who say they just deceived themselves.

Pay attention to how calculated their insinuations and evasions are. They know they're lying.

ANSWER THE QUESTION."

You are so used to arguing against creationists and other ID hacks that you lost touch with the scientific world.

Facts.

1. I already answered your question in previous comment. Learn to read.

2. Ewan Birney got nominated to FRS two weeks back. 'Fellow of Royal Society' is not a blog, mailing list or comment section, but UK's topmost scientific society. His nomination means UK's leading biologists vetted his case (despite Moran calling Birney ENCODE's PR frontman), and found his ENCODE discovery to be equivalent to those from the greatest British scientists. The press release compared him with Newton, Darwin, etc.

3. It also means leading biologists like Paul Nurse do not give a f*** about sandwalk blog, Graur or Doolittle. In another few years, anyone claiming that 80% of human genome is NOT functional will be called ENCODE-deniers.

You can bitch here all day about how ENCODE is wrong and so on, but the FRS decision clearly shows that you did not make an effective case against anything.


Mikkel Rumraket Rasmussen said -

"They HAVE formally retracted their initial claim in their most recent paper, by explicitly conceding there is serious debate about the sensibility of their initial definition of function as "biochemical activity""

You are imagining things. A convoluted paper 'neither accepting nor denying any wrongdoing' is far from a formal retraction except in your imagination. Even the Kellis quote in OP's blog post says "ENCODE isn't backing away from anything" and Kellis is the first author of the new paper. Mattick published a paper in 2014 claiming the entire genome to be full of functional RNA. Here is what Mattick's website says as of today -

http://www.garvan.org.au/research/neuroscience/rna-biology-and-plasticity/johmat

"Over the past 20 years he has pioneered a new view of the genetic programming of humans and other complex organisms, by showing that the majority of the genome, previously considered ‘junk’, actually specifies a dynamic network of regulatory RNAs that guide differentiation and development. He has published over 250 research articles and his work has received coverage in Nature, Science, Scientific American, New Scientist and the New York Times, among others."

Diogenes said...

Greenie says: "I already answered your question in previous comment. Learn to read."

No you have not. Here is what you wrote: "I do not follow Birney 24/7"

I did not ask you if you followed Birney 24/7. Here, again, is the question I asked: "It's a simple yes or no question, Greenie. Has Ewan Birney ever stated that the fraction of the human genome that's functional (with "function" here meaning the definition relevant to natural selection/Junk DNA hypothesis) is between 10% and 20%?"

You wrote things like

"Can you qualify your 'ever'?"

This is not "yes", "no", nor "I don't know." My definition of "ever" is from the Big Bang until May 14, 2014, 4PM EST. Why are you asking me to qualify "ever"? Preparing to evade?

"I do not follow Birney 24/7"

I did not ask you if you followed Birney 24/7. This is not "yes", "no", nor "I don't know."

"but in science formal publication is all that counts"

This is not "yes", "no", nor "I don't know." I didn't ask you what is "all that counts" in science. Why can't you answer the question?

But as an aside, here's another question to ask you: of Ewan Birney's "formal publications", which "formal publication" said it had refuted the Junk DNA hypothesis-- or described, or even mentioned the Junk DNA hypothesis?

Ooops. You see, Greenie, what really counts to you are not peer-reviewed papers, but press releases-- because Greenie, the ENCODE consortium made no mention of Junk DNA in their peer-reviewed papers of 2012, instead only mentioning it in press releases.

Press releases, Greenie-- not peer-reviewed papers. But Greenie says "formal publications" are all that counts.

Then why did Ewan Birney never, not once, describe the Junk DNA hypothesis and present evidence against it in ANY of his peer-reviewed papers?

Greenie quotes the following: "He [Birney] has published over 250 research articles"

Wow! 250 research articles-- and not one of them, NOT ONE, describes the Junk DNA hypothesis and presents evidence against it! Why is that, ya think?

So now, to keep a running total, there are two, two questions for Greenie to answer:

1. Has Ewan Birney ever stated that the fraction of the human genome that's functional (with "function" here meaning the definition relevant to natural selection/Junk DNA hypothesis) is between 10% and 20%?

2. In which of Ewan Birney's 250 publications did he mention the Junk DNA hypothesis and present evidence against it, in the publication, not in the press release? Remember, Greenie says only "formal publications" count, so press releases are verboten!

Diogenes said...

I would like to point out the following irony.

Greenie, above, is attempting "argument from authority". In Greenie's first comment here, Greenie presented the evasive, backpedaling, self-contradicting Ewan Birney as an infallible authority whom we may not question, even when he contradicts himself, saying one thing in the abstract, another thing in the press release, a third thing on blogs and media interviews. Of course I destroyed that argument by pointing out that Ewan Birney has repeatedly stated the fraction of the genome that's functional is really between 10% and 20%. If Birney is the infallible authority, then ENCODE never disproved Junk DNA.

But our IDcreationist friend, Robert Byers, by contrast, constantly tells us that we may not invoke argument from authority. Scientists don't exist, Byers tells us, and science does not exist, he tells real scientists over and over again; it's all just "lines of reasoning."

Byers: "“scientists” are just people who did it as kids in their late teens and early twenties and got a degree saying they know this or that. ...I have never found any subject in science to be difficult for me or anyone to quickly understand. ...Nope. you can’t cry authority." [Robert Byers, 2013, Panda's Thumb]

"You can’t cry authority" Byers sternly warns real scientists, when he himself has never entered a laboratory in his life. Yet Byers himself constantly cries "authority" when he quotes the Bible. And his favorite way of crying authority is to claim Christians are smarter, the true intellectuals, therefore they must be right about Noah's Flood:

"This [IDcreationist] book was read/bought by the educated public. Only they would be interested in such subjects. This is a reason evolutionism is losing public support. They are not persuasive to the educated people in North America. The most intelligent people in mans history." [Robert Byers, 2013, Panda's Thumb]

So, "crying authority" is OK only when you're making up fictional authority you don't have. "Crying authority" is OK when you do it, not OK when we do it.

IDcreationists just change their standards ad hoc, on the fly, to support their predetermined answer. Their own standards always hoist them on their own petards-- they get stuck in self-contradictions, which we rub in their faces-- then, without blinking an eye, they change to a different standard of infallible, unchanging truth. IDcreationists will change their standards of infallible, unchanging truth 50 times a day without batting an eye.

Georgi Marinov said...

Laurence A. Moran, Wednesday, May 14, 2014 9:59:00 AM
Georgi Marinov says,

Keeping that definition (which, BTW, makes everything that is said in the paper correct, and precludes any possibility of retraction; what was said by who outside of it is a different subject) in mind, the interpretation of those words becomes much different.

I understand why some members of the ENCODE Consortium want to blame everything on the media and claim that they never meant to announce the death of junk DNA. However, I'm not going to let them get away with hyping this revisionist history. I was going to quote you passages from the Birney et al. paper but I've decided to make it into a separate post.


I am by no means trying to revise history; who said what outside of the paper is publicly available information. However, in the OP you said the paper should be retracted, which is wrong, because as a self-contained entity, there is absolutely nothing wrong with the paper. When has a paper been retracted because of the commentaries and press releases written about it? First, that makes no sense, and second, if we were to follow such criteria, we should retract three quarters of the papers for which a press release has been written (because rare is the press release that does not vastly overhype the results in the paper it is reporting on).

Diogenes said...

I get Georgi's point, but

"if we were to follow such criteria, we should retract three quarters of the papers for which a press release has been written (because rare is the press release that does not vastly overhype the results in the paper it is reporting on)"

Uh... so you're saying the problem is so widespread, we can do nothing?

Shouldn't there be consequences for Press Release sociopaths?

Larry Moran said...

Georgi Marinov says,

However, in the OP you said the paper should be retracted, which is wrong, because as a self-contained entity, there is absolutely nothing wrong with the paper. When has a paper been retracted because of the commentaries and press releases written about it?

I didn't mean that the paper should be retracted. I meant that Nature (or Science) should retract its incorrect description of what the authors claim is in the paper. What else can they do now that the authors have come right out and said that the Nature editors got it wrong?

As for the paper itself, the authors should issue a "correction" or a "Corrigendum" pointing out that they did not really mean to say that 80% of the genome has a function or that all DNA binding sites are functional. They should also make it clear that they don't dispute the idea that most of our genome is junk in spite of what the paper seemed to be saying.

Those kinds of corrections appear in every issue of Nature when authors realize that they have made a mistake.

Tom Mueller said...

Hi John – Hi Allan – Hi Georgi

You all are conjuring happy memories of a graduate seminar I attended decades ago! I remain in your debt and thank you.

I have always suggested to my students that perhaps chromosomes do indeed have their equivalent to tertiary and quaternary structure. Otherwise how does one explain constancy of karyotypes across primate lineages unless invoking positive selection?

I then ask my students to check out this link:
http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

FTR – unless I am missing something, even the champions of “JUNK-DNA” do not disagree!

Here is Ford Doolittle in a response to one of my naïve questions:

That "excess" DNA (that 40x as much that lungfish have compared to us, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by us "junk" supporters.

To my mind, it's like the "clean fill" you see signs for along the highway. There may be a need for that much DNA but it doesn't matter what it is, as long as it doesn't contain deleterious sequences. You are also suggesting, I think, that larger-genomed organisms might be regulating the same amount of function as smaller-genomed organisms, just in more (and possibly unnecessarily?) elaborate ways. I agree, as spelled out in the [PNAS] paper.
Ford Doolittle

ITMT, I remind you that the most recent common ancestor of the Hominidae lived roughly 14 million years ago! That is a remarkably long time to maintain the integrity of karyotype banding patterns for what you both claim to be functionless junk given the frequency of genomic rearrangement in eukaryotes. For example, many perfectly healthy populations of house mice can be distinguished from other house mice by fused chromosomes.

I suspect a strong selection pressure for the maintenance of karyotypes in Hominidae. Georgi’s answer perturbed me greatly: “[these] structural and spacing functions are not biochemically visible to the assays used by ENCODE”

Oh dear! I had naively presumed otherwise.

The problem is how to measure this positive selection if it in fact exists.

Tom Mueller said...

With hat in hand and head bowed - please permit me a hopeful petition;

Any chance we could discuss these two papers on this forum?

http://www.nature.com/news/jelly-genome-mystery-1.15264

http://www.sciencemag.org/content/342/6164/1242592

Did the first split in the animal tree of life occur between the ctenophore lineage and all other subsequent animal phyla, or alternatively between the Porifera lineage (or at least one of them) and all other subsequent animal phyla?

In other words, do modern sponges represent a more recent branch in the animal tree of life while modern ctenophores represent a more ancient branch?
… on the understanding that this question is not at all equivalent to asking whether the last common ancestor to eumetazoa looked anything like a ctenophore.

One nagging question persistently bothers me: I wonder out loud whether lateral/horizontal gene transfer confounds this analysis greatly. Wouldn’t the gut of some dying primitive eumetazoan be the perfect reaction chamber for DNA transformation?

I imagine two different species captured inside a third species' gut. I also imagine genes being transferred from some partially digested dead organism to the germline of another living organism before it managed eventual escape from the gut. Maybe some accidental release of sperm or eggs in the gut would be sufficient for such transformation from one species to another. Maybe the germline of a primitive predator was subject to this kind of transformation.

Apologies in advance... my Ritalin levels are low and my imagination is in hyperdrive.

Tom Mueller said...

oops - let me rephrase that

I should have said;

...this question is not at all equivalent to asking whether the last common ancestor to eumetazoa looked anything like a modern ctenophore.

AllanMiller said...

Tom,

Hmmm. Are you aware of this paper: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1461872/ ?

It appears that karyotype distributions correlate strongly with 'mostly-acrocentric' or 'mostly-metacentric' in a genus, with the middle ground rarely occupied. This seems to be due to drive in female meiosis, with the current polarity favouring one arrangement over the other as they 'try' and evade the polar body fate. This polarity periodically reverses, and is probably fixed by drift, but then favours alternative centromere placement. This drive is clearly strong enough to override any diminution of fitness due to misalignments and aneuploidies and any other functional brake provided by higher-dimensional chromosome structure. But there is no rule that polarity has to reverse in any given time period, nor in any particular genus. It appears pretty labile in rodents, and in those deer.

There seems to be some implication of human exceptionalism in your argument - a hypothesised function with a selection coefficient sufficient to prevent karyotype rearrangement in our neck of the woods, while the rest of Mammalia scuttle from side-to-side of the boat unimpeded by this, with only aneuploidies to oppose female drive by fitness effects. I'm skeptical.

Tom Mueller said...

Hi Allan

Thank you for that reference. I always suspected that alignment and subsequent segregation of trivalents at Metaphase I in Meiosis was not entirely random. I will need to incorporate that into my worksheet that I provide my high school students.

Parenthetically: the fact that both mules and hinnies can be male or female begs many “epigenetic” questions. The identity of the fertilized ovum determines the identity of the hybrid.

Advocatus diaboli ON

Let’s talk about heterochromatin “function”.

Heterochromatin is employed as a platform for the recruitment of effectors across extended domains along chromosomes including but not restricted to silencing and anti-silencing factors. It gets better: Heterochromatin (both facultative and constitutive) also regulates cell-type specific spreading of protein complexes along chromosomes that ultimately controls transcription, chromosome segregation and long-range chromatin interactions.

Sounds pretty “functional” to me.

Now of course – The exact sequences of “functional” murine heterochromatin would not be identical to the human equivalent making assay and identification difficult. That said – I am betting that strong positive selection exists for the maintenance of karyotype commonalities between primates.

So I wonder out loud: is it possible that ENCODE was right for all the wrong reasons?

http://www.nature.com/nrg/journal/v8/n1/execsumm/nrg2008.html

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC139905/

Advocatus diaboli OFF

Thanking everyone in advance for their patience and indulgence.

Tom Mueller said...

Re: Parenthetically: the fact that both mules and hinnies can be male or female begs many “epigenetic” questions. The identity of the fertilized ovum determines the identity of the hybrid.

I hope I did not just "sound stupid"


I employed "begging the question" transitively as opposed to intransitively, i.e. I was taking issue with many fallacious textbook handwaving explanations of "epigenetics" that were, in fact, guilty of "begging the question" as elucidated by Mark Ptashne.

Perhaps my "transitive" usage is also incorrect.

Claudiu Bandea said...

Tom Mueller says: Georgi’s answer perturbed me greatly: “[these] structural and spacing functions are not biochemically visible to the assays used by ENCODE”

Those kinds of statements are indeed perturbing, particularly when presented in a puerile fashion by our most esteemed scientists. See, for example, Ford Doolittle (http://www.ncbi.nlm.nih.gov/pubmed/23479647):

“junk advocates have to date generally considered that even DNA fulfilling bulk structural roles remains, in terms of encoded information, just junk”

BTW, what is the original source for the quote you attribute to Doolittle: "That 'excess' DNA (that 40x as much that lungfish have compared to us, for instance) might play a structural role or determine other cellular parameters under selection has long been accepted, even by us 'junk' supporters…"?

Georgi Marinov said...

I am not sure what the argument is about here.

ChIP/RNA/DNase-seq do not assess putative spacing and structural functions. Fact.

When people took ENCODE's data and claimed they have disproved junk DNA with it, it was the biochemical activity seen in those assays that formed the basis of the argument. It was not spacing and structural functions. Fact.

DNA that has spacing and structural roles (and I remain totally unconvinced these things are as important as claimed) does not function at the sequence level so it's free to evolve neutrally and basically it carries no information. And it's sequence-level information that people who claim that there is no junk DNA usually have in mind. I don't see what's so controversial about that. BTW, I am not even sure what exactly is meant by "structural function" - centromeres, telomeres, origins of replication, etc. seem to fit the term to me, but somehow I am left with the impression that something else is meant when the term is used, but it is rarely defined clearly.

Larry Moran said...

There are four bulk DNA hypotheses that I know of. They are ....

1. The skeletal DNA hypothesis (more DNA = larger nucleus and more nuclear pores) (Cavalier-Smith)
2. Spacers and loops (Zuckerkandl)
3. Mutation protection (various authors)
4. Teleological hypotheses (excess DNA is necessary for the evolution of new genes and new regulatory functions)

Some DNA is required as "spacers" as I've discussed several times on this blog. The sequence of that DNA doesn't matter but it is NOT JUNK. It has an essential function. (The minimal size of introns is a good example.)

If any of the other bulk DNA hypotheses turn out to be true then that DNA isn't junk either.

Georgi Marinov said...

The minimal size of introns is not what I had in mind - in mammalian genomes we are talking about such large distances between enhancers and promoters and intron lengths, that it is difficult to imagine how there could be serious constraints keeping them within a tight range.

Also, the minimal length of introns is less than 20 bp (in nucleomorph genomes), which is a totally irrelevant number in the context of this debate.

Tom Mueller said...

Hi Claudiu

The quote comes from Professor Ford Doolittle’s kind & patient email response a while ago to a lengthy and naïve query I posed regarding the c-value paradox posed by the recently published Bladderwort sequence data.

Here is the relevant portion of what I wanted to pass along to my students. I asked Prof. Doolittle to vet it before I caused any damage:

OK - here follow my speculative attempts to make sense of all this information:

First, let us stop using the expression “junk DNA”. My study is not filled with “junk” as understood in the sense of trash or garbage. (My wife may disagree.) Consider such DNA as collateral miscellany, bric-a-brac useful in its own way when the occasion arises. Other teachers may have more Spartan study rooms than I; and they manage to get their work done their own way; meanwhile I get my work done, my way.

Birds are different than most vertebrates – they do not possess much “bric-a-brac” DNA. Avian gene regulation is somewhat less embellished or elaborate than in other vertebrates.

Bladderworts are also different! They could be described as the avians of the botanical world, from a “bric-a-brac DNA” and gene regulation point of view.

That said, there are organisms that make enthusiastic employ of “bric-a-brac DNA” for gene regulation… Vive La Différence!

Here is yet another recent publication that merits attention

http://phys.org/news/2013-09-x-shape-true-picture-chromosome-imaging.html

(Factoid alert – the X-Chromosome got its name from pioneer cytologists who had not yet determined it was a chromosome. It was identified as the X-particle along the same lines as “X” in algebra and the “X-Files” on TV.
Now, forgive the rambling digressions of an elder Biologist.)

Back to that picture: Consider how we are discussing the defined and decidedly non-random orientation of a chromosome during interphase.

That is mind-boggling. And exactly how does the X chromosome achieve this remarkable “architecture”? I am suggesting it must have something to do with all that bric-a-brac-DNA. Clearly this architecture is there for a reason and must be important – something to do with the function of DNA - i.e. something to do with gene expression.

If correct, could we not predict that Birds and Bladderworts have floppier DNA architecture and avail themselves of many, but not all gene-regulatory mechanisms found in Chimps and Humans?...

Maybe ENCODE is on to something.


Tom Mueller said...

Prof. Doolittle’s reply:

I attach my PNAS article on this, which is pretty much what the UNBSJ talk covered. I am perfectly willing to re-define "junk". The problem with ENCODE is that they used one operational definition of "function" to refute a claim (that most DNA is junk) which is grounded in a quite different (etiological) definition. One can define "function" however one wants, but this sort of conflation (semi-deliberate on their part) is disingenuous.


My follow-up:


Thank you! I will definitely read your paper over carefully.


ITMT – please permit me a fast query: in your opinion, would my fast and dirty rendition be sufficiently error-free to present “as-is” to my students?


For example, would I perhaps be stretching a point too far when I claim interphase chromosome architecture is “functional” ?

… or

…is my suggestion perfectly acceptable from an ENCODE POV? I guess I am asking whether my redefinition of “functionality” is not similarly “disingenuous” along the lines you criticize in your PNAS paper.


Thank you again for the reference. I will read it tonight.


Thank you in advance, for your patience and your indulgence.


Prof. Doolittle’s response, including the quote in question:

Well no, not disingenuous, but perhaps not new. That "excess" DNA
(that 40x as much that lungfish have compared to us, for instance)
might play a structural role or determine other cellular parameters
under selection has long been accepted, even by us "junk" supporters.
To my mind, it's like the "clean fill" you see signs for along the
highway. There may be a need for that much DNA but it doesn't matter
what it is, as long as it doesn't contain deleterious sequences. You
are also suggesting, I think, that larger-genomed organisms might be
regulating the same amount of function as smaller-genomed organisms,
just in more (and possibly unnecessarily?) elaborate ways. I agree, as
spelled out in the paper.


Tom Mueller said...

Bottom Line – As far as I can make out, I need to agree with Larry and Georgi that ENCODE was indeed guilty of being “disingenuous” and that Doolittle’s critique was right on target.

At a minimum, if ENCODE did not have the data in hand, they had no business speaking out of turn...

In any case, I probably betrayed further naïveté by posing a question above regarding recent data on heterochromatin. I sure would appreciate being set straight. (... or would that be 'strait'?) ;-)

Larry Moran said...

@Georgi Marinov

I didn't mean that the minimal size of introns was significant. What I meant was that I acknowledge that there can be some functional DNA (not junk) whose sequence is not conserved. Some small bits of sequence between regulatory sites also fall into this category.

This is important because it complicates the debate over the definitions of "function" and "junk." I don't think that the amount of functional bulk DNA is significant but it means the proponents of junk DNA can't rely on sequence conservation as the ultimate defining feature of functional DNA.

As usual, biology is far too messy to allow simple universal definitions of anything. :-)

None of the bulk DNA hypotheses make any sense when we're looking at the big picture. Our genome is 90% junk.

Georgi Marinov said...

There is no sharp distinction between "function" and "junk" - no argument there, I in fact agree wholeheartedly. And it's not just the spacers within regulatory sites, and other sequences like that - the importance of regulatory sites is distributed on a continuum from absolutely vital to those that are in the process of drifting into or out of existence, and the same likely applies to lncRNAs, etc. This is why it is pointless to debate how much exactly of the genome is functional - whether it is 10% or 11%, it does not really matter, and I would argue that the difference between 5% and 10% is not very significant either. The questions are:

1) Whether most of the genome is junk (because that is vitally important to how we think about it and ultimately, ourselves)
2) The precise identity of the functional elements (because that's what matters in practice)

BTW, that's what was meant in the PNAS paper when it was said that the exact estimate of how much of the genome is functional is not important. But it ended up being widely ridiculed as if it implied that the main result (which was not the main result at all) of the 2012 ENCODE paper was now all of a sudden not important...

Larry Moran said...

Georgi Marinov said,

BTW, that's what was meant in the PNAS paper when it was said that the exact estimate of how much of the genome is functional is not important. But it ended up being widely ridiculed as if it implied that the main result (which was not the main result at all) of the 2012 ENCODE paper was now all of a sudden not important.

The original ENCODE Consortium paper of 2012 focused on defining the functional parts of the human genome. The authors explicitly said that their goal was to "delineate all functional elements in the human genome."

The only number in the abstract was in the statement that "These data enabled us to assign biochemical functions for 80% of the genome." The second paragraph of the article describes the definition of "function" that justifies the goal of the project and allowed them to conclude that most of the genome is functional.

You are engaging in revisionism if you now claim that the main goal, and the main conclusion, of the 2012 summary paper was NOT to promote the idea that 80% of our genome is functional. That's exactly how the vast majority of science writers read it and that includes commentaries in Nature and Science. At the time, there were no ENCODE leaders that I know of who spoke out against these interpretations. That's because the press reports accurately reflected what they really meant to say.

The PNAS paper (Kellis et al. 2014) makes two major concessions. The ENCODE leaders now say that there may be other legitimate ways of defining biological function and that much of what they described as functional in 2012 may, in fact, be spurious or artifact.

They now concede that there are legitimate reasons to conclude that most of our genome is junk. There's nothing in the PNAS paper that wasn't known in 2012 except for the criticism that ENCODE was subjected to.

The ENCODE leaders now say in the 2014 PNAS paper ...

The major contribution of ENCODE to date has been high-resolution, highly-reproducible maps of DNA segments with biochemical signatures associated with diverse molecular functions. We believe that this public resource is far more important than any interim estimate of the fraction of the human genome that is functional.

Georgi, you are perfectly correct when you say that the ENCODE Consortium (including you) were ridiculed for making that statement. It's the exact opposite of what the original ENCODE publicity campaign was all about. Are you seriously trying to tell me that back in September 2012 the ENCODE Consortium was really trying to hype the fact that they were publishing lots of data and not that they had discovered functions for 80% of the genome?

The ridicule was/is deserved.

Georgi Marinov said...

I am not trying to revise anything, I don't understand why you have to say that.

The 80% thing was tacked onto the set of papers published around that time and eventually became the highlight that was talked about in the press and in all the responses that have been published since then.

But it was very much not what the consortium spent 5 years working on, in fact there was almost no discussion of it internally. That's what I was referring to in that post.

You are absolutely correct that for someone who is not intimately familiar with the field, reading the paper and the press releases would have left the impression that this was the main thing. But it really wasn't.

Claudiu Bandea said...

Tom,

Thanks for the disclosure.

Claudiu Bandea said...

The problem with ENCODE was that the scientists working on this project were not familiar with the scientific literature on genome evolution, or they chose to ignore it; this also seems to be the case with many people addressing the ENCODE fiasco.

An apparent exception to this scholarly blunder is the recent treatise by Ford Doolittle: “Is junk DNA bunk? A critique of ENCODE” (http://www.ncbi.nlm.nih.gov/pubmed/23479647). I’m quoting below a passage from this paper, but it would make sense for those interested in genome evolution and the C-value enigma to read the entire paper, as well as the associated literature. [Parenthetically, as I pointed out in a PubMed Commons post on Doolittle’s paper (http://www.ncbi.nlm.nih.gov/pubmed/23479647#cm23479647_1429), when evaluating the biological function of genomic informational DNA (iDNA), things are relatively straightforward; however, evaluating potential biological functions of the non-informational DNA (niDNA) is a much more difficult issue, to the extent that even Doolittle’s narrative gets entangled in confusion and nonsensical statements.]

“Of course, DNA inevitably does have a basic structural role to play, unlinked to specific biochemical activities or the encoding of information relevant to genes and their expression. Centromeres and telomeres exemplify noncoding chromosomal components with specific functions. More generally, DNA as a macromolecule bulks up and gives shape to chromosomes and thus, as many studies show, determines important nuclear and cellular parameters such as division time and size, themselves coupled to organismal development (11–13, 17). The “selfish DNA” scenarios of 1980 (20–22), in which C-value represents only the outcome of conflicts between upward pressure from reproductively competing TEs and downward-directed energetic restraints, have thus, in subsequent decades, yielded to more nuanced understandings. Cavalier-Smith (13, 20) called DNA’s structural and cell biological roles “nucleoskeletal,” considering C-value to be optimized by organism-level natural selection (13, 20). Gregory, now the principal C-value theorist, embraces a more “pluralistic, hierarchical approach” to what he calls “nucleotypic” function (11, 12, 17). A balance between organism-level selection on nuclear structure and cell size, cell division times and developmental rate, selfish genome-level selection favoring replicative expansion, and (as discussed below) supraorganismal (clade-level) selective processes—as well as drift—must all be taken into account.” (emphasis added).

Tom Mueller said...

Claudiu, OK – this is where I get confused…

In a post above, I asked the following:


I always understood that retroviruses co-opted host regulatory machinery and vice versa constituting the acme in molecular host-parasite coevolution.

http://www.nature.com/nature/journal/v487/n7405/full/nature11244.html
http://www.sciencedaily.com/releases/2007/11/071114121359.htm

Meanwhile, the different distributions of Alu and LINE1 in the genome would suggest that selection pressure may be involved. Do Alus direct methylation? Are Alus and Line1 DNA symbionts?


Claudiu, you agreed – and elaborated even further above. Let's see if I managed to capture your drift...

I am now scratching my head at the continuing exchange and Claudiu's rebuttal. We are no longer talking about whether selfish DNA is functional, but rather whether the repetitive sequences of ancestral retroviral symbiotic bulk DNA have been co-opted for global gene regulation and cell differentiation, and whether this new state of affairs merits the designation of “functional DNA”.

I am still attempting to wrap my head around exactly what is meant by “bulk” DNA and how to assess positive selection.

“…repeated DNAs display very high frequencies of sequence changes during evolution that become homogenized across genomes. These observations suggest the presence of mechanisms that balance interactions and exchange of information between heterochromatic sequences with the need to avoid negative consequences to genome stability.” link

Now that I find interesting… If I understand all this correctly, positive selection for conservation of sequence “type” is not equivalent to conservation of the precise original sequence; but that said, positive selection is still very real and very real for real important reasons.

If so, I have addressed Allan's rebuttal.

If so, then I also understand where Claudiu is coming from, together with Claudiu's criticism of Doolittle’s PNAS rebuttal.

Larry came up with a great list:

1. The skeletal DNA hypothesis (more DNA = larger nucleus and more nuclear pores) (Cavalier-Smith)
2. Spacers and loops (Zuckerkandl)
3. Mutation protection (various authors)
4. Teleological hypotheses (excess DNA is necessary for the evolution of new genes and new regulatory functions)


I wonder out loud if this list is incomplete. So I will repeat myself:

Let’s talk about heterochromatin “function” along epigenetic lines. I paraphrased a review above along these lines:

Heterochromatin is employed as a platform for the recruitment of effectors across extended domains along chromosomes including but not restricted to silencing and anti-silencing factors. It gets better: Heterochromatin (both facultative and constitutive) also regulates cell-type specific spreading of protein complexes along chromosomes that ultimately controls transcription, chromosome segregation and long-range chromatin interactions.

Sounds pretty “FUNCTIONAL” to me.

Now of course – the exact sequences of “functional” murine heterochromatin would not be identical to the human equivalent, making assay and identification difficult.

That said – I am betting that strong positive selection exists for the maintenance of karyotype commonalities between primates.

So I wonder out loud: is it possible that ENCODE was right for all the wrong reasons?


It would appear that ancient retrovirus sequences are the sine qua non of cell differentiation and global gene regulation (at least in primates) and do constitute functionality along lines originally espoused by ENCODE. I humbly suggest that 18–12 million years is a very long time to maintain karyotype commonalities in apes.

Of course there still remains the entirely separate question of whether ENCODE had the data in hand to justify such lines of hypothesis... a completely separate question, altogether. Being guilty of hubris is not tantamount to being guilty of falsehood.

Anonymous said...

Georgi Marinov,

You may have misunderstood the message from Larry... It was a veiled threat.... I'm sure you are very comfortable with that..... After all you are one of the very few who can "see" some sunlight.... Let's just hope it is enough....

Larry Moran said...

@Georgi Marinov

Are you saying that over a period of five or six years the majority of members of the ENCODE Consortium were just interested in data collection and storage and didn't think much about the implications or whether they were actually cataloguing sites that had biological significance?

Are you suggesting that during group meetings nobody wondered whether the pervasive transcription they were recording was real or just spurious transcripts, as many had already suggested in the published literature? Are you telling us that nobody in those labs raised any questions about nonspecific binding of transcription factors as described in the textbooks? Is it true that none of the PIs, postdocs, or graduate students gave journal club presentations on the junk DNA controversy and how it impacted the work they were doing on the characterization of the human genome?

That's a pretty damning accusation. It strongly suggests that those labs were just composed of a bunch of highly trained technicians who never thought about the results they were churning out and what they might mean. It also suggests that they didn't read the press reports published in 2008 when ENCODE finished (and published) the pilot project. It suggests that members of ENCODE labs were completely ignorant of, and uninterested in, the criticisms that followed publication of those papers.

Is it really true that the people in your lab never talked about whether the transcription factor binding sites they were analyzing were really regulatory sites or artifacts? Did you never discuss ways of identifying functional sites from nonfunctional sites or was the goal just to publish the locations of all the sites and let someone else try and figure out which ones were real?

Claudiu Bandea said...

Hi Tom,

I posted the following comment at Larry’s post “What did the ENCODE Consortium say in 2012?”, but it might be more appropriate here:

When thinking about the evolution of genome size and about the C-value enigma, it is critical to realize that genomic DNA can play informational (iDNA) functions, which are based on sequence specificity, or it can have non-informational (niDNA) functions, which are independent of the nucleotide sequence.

We have known for half a century or so (and for good reasons, such as sequence conservation and mutational load) that in organisms with high C-value, such as humans, only a low percentage of the genomic DNA can be iDNA. We have also known for a very long time that most of the genomic DNA in species with high C-value consists of retroviral and transposable element sequences or their remnants, and that *some* (a few percent at most) of these sequences have been co-opted as iDNA or niDNA.

Based on these facts and rationales, the primary question addressed by the scholars in the field was whether the bulk of the genome (90% or more in the human genome), which consists primarily of viral and transposable elements, was simply parasitic or ‘junk DNA’ (jDNA), or it was functional niDNA.

As described by Doolittle in his PNAS paper (http://www.ncbi.nlm.nih.gov/pubmed/23479647), the two prevalent hypotheses advanced by the scholars in this field regarding potential non-informational functions for the so-called jDNA were the ‘nucleo-skeletal’ (Cavalier-Smith) and ‘nucleotypic’ (Gregory) hypotheses:

The “selfish DNA” scenarios of 1980 (20–22), in which C-value represents only the outcome of conflicts between upward pressure from reproductively competing TEs and downward-directed energetic restraints, have thus, in subsequent decades, yielded to more nuanced understandings. Cavalier-Smith (13, 20) called DNA’s structural and cell biological roles “nucleoskeletal,” considering C-value to be optimized by organism-level natural selection (13, 20). Gregory, now the principal C-value theorist, embraces a more “pluralistic, hierarchical approach” to what he calls “nucleotypic” function (11, 12, 17).

In the material I mentioned to you before, I proposed that the ‘nucleo-skeletal’ and ‘nucleotypic’ hypotheses cannot explain the C-value enigma (i.e. they do not pass the ‘onion test’), and I discussed an old hypothesis that explains the evolution of genome size and the C-value enigma:

http://www.ncbi.nlm.nih.gov/pubmed/23479647#cm23479647_1429
http://biorxiv.org/content/early/2013/11/18/000588

Claudiu Bandea said...

This is to expand on Larry’s series of questions to Georgi:

It is hard to believe that the ENCODE scientists, many of whom occupy highly regarded positions in some of the finest academic institutions in the world, were not familiar with the C-value paradox, which has been one of the most fundamental concepts in genome biology for decades.

How is it possible to design and conduct a huge project on the human genome, without having this fundamental concept at the center of it?

Georgi, my understanding is that you just finished graduate school. Have you learned about the significance of the C-value paradox in any of your classes?

Georgi Marinov said...

You don't really take classes in grad school so your question is irrelevant.

Laurence A. Moran, Saturday, May 31, 2014 8:25:00 AM

Are you saying that over a period of five or six years the majority of members of the ENCODE Consortium were just interested in data collection and storage and didn't think much about the implications or whether they were actually cataloguing sites that had biological significance?


This is a gross misrepresentation of what the consortium was doing. I can't even understand how you could say such a thing -- the consortium did an enormous amount of work developing experimental and computational methods for generating and analyzing genomic data (you know, the work that is contained in those hundreds of papers that came out of it other than the Nature one and that nobody ever discusses). To dismiss all that work (much of which involved truly serious intellectual effort) as "mere data collection and storage" is just such a base thing to say... I know you know better than that. The junk DNA debate is almost completely a separate thing.

Larry Moran said...

@Georgi Marinov

Let me make sure I understand what you are saying.

Are you saying that it's true that the members of the ENCODE Consortium weren't very interested in making sense of their results and trying to understand the biological functions of the genome?

I'm not dismissing the data or the technical effort that went into collecting it. What astonishes me is the claim that the analysis of that data proceeded in the absence of any knowledge about junk DNA or the need to distinguish between spurious, low abundance, junk RNA transcripts and real transcripts or between functional binding sites and accidental ones.

You make it sound like nobody thought about the biology until the last minute and then put it all into one paper that they screwed up. You seem to be saying that it was a "serious intellectual effort" to analyze all that data without ever knowing about junk DNA and the evidence that 90% of the human genome has no biological function.

If that's what you're saying then it IS a base thing to say. You should be congratulated for the work in the Kellis et al. (2014) paper but that's the kind of thinking and data analysis that should have been seen in all those hundreds of papers that were published in 2012.

Georgi Marinov said...

You are setting a certain bait there that I am not going to bite on.

It is entirely possible to write hundreds of very high quality functional genomics papers without touching on the subject of junk DNA. It is a large field with a lot of questions to be answered. Junk DNA is relevant only to some of them.

What astonishes me is the claim that the analysis of that data proceeded in the absence of any knowledge about junk DNA or the need to distinguish between spurious, low abundance, junk RNA transcripts and real transcripts or between functional binding sites and accidental ones.

This does not accurately describe the reality of the situation.

You make it sound like nobody thought about the biology until the last minute and then put it all into one paper that they screwed up.

This doesn't either.

I will end the discussion here.

John Harshman said...

Georgi,

A trivial point: did you really not take any classes in grad school? I took quite a few myself, and I always thought that was the norm. The catalogs of every university I'm familiar with include quite a few graduate-level courses. Who takes them?

Claudiu Bandea said...

@Georgi,

Let me rephrase the questions:

1. Considering that the C-value paradox (also referred to as C-value enigma) has been one of the most fundamental concepts in genome biology for decades, how is it possible to design and conduct a huge project on the human genome, such as ENCODE, without having this fundamental concept at the center of it?

2. Apparently, you just finished graduate school, focusing I presume on studying genome biology; in your studies on genome biology and evolution have you learned about the C-value paradox? More specifically, have you studied the articles written by the scholars in the field such as, for example, those written by our host Larry Moran or Ryan Gregory?

@Larry Moran (“You should be congratulated for the work in the Kellis et al. (2014) paper”)

Apparently, you haven't read the evaluation of the PNAS paper by Kellis et al. at Lior Pachter’s blog “Bits of DNA”:

http://liorpachter.wordpress.com/2014/04/30/estimating-number-of-transcripts-from-rna-seq-measurements-and-why-i-believe-in-paywall/

Georgi Marinov said...

Georgi,

A trivial point: did you really not take any classes in grad school? I took quite a few myself, and I always thought that was the norm. The catalogs of every university I'm familiar with include quite a few graduate-level courses. Who takes them?


You do take classes, but in biology it's just a few (we were required to take a total of 3 real classes; I think I took more graduate-level classes as an undergrad), and grading is lax. I definitely wish there were more classes but to be fair, it was made very clear to us from the very beginning that how much we learn is almost entirely up to us.

It surely varies from program to program but it is invariably the case that in math and physics they take a lot more classes in grad school. But that's because they generally come completely unprepared for research as the gap between what can be reasonably expected to be taught in four years of undergrad and where the cutting-edge research is is so vast. That's not the case in biology - you don't need to take a dozen classes of very advanced math to start pipetting on the bench, so you tend to start pipetting almost immediately. It is true, of course, that there is no way any program can teach you everything you need to know. But there are things that are absolutely necessary and they just don't even exist -- I am about to officially get my PhD in two weeks and in neither my undergraduate nor my graduate institution did I even have the option to take a serious evolution class...