Wednesday, May 14, 2014

What did the ENCODE Consortium say in 2012?

When the ENCODE Consortium published their results in September 2012, the popular press immediately seized upon the idea that most of our genome was functional and the concept of junk DNA was debunked. The "media" in this case includes writers at prestigious journals like Science and Nature and well-known science writers in other respected publications and blogs.

In most cases, those articles contained interviews with ENCODE leaders and direct quotes about the presence of large amounts of functional DNA in the human genome.

The second wave of the ENCODE publicity campaign is trying to claim that this was all a misunderstanding. According to this revisionist view of recent history, the actual ENCODE papers never said that most of our genome had to be functional and never implied that junk DNA was dead. It was the media that misinterpreted the papers. Don't blame the scientists.

You can see an example of this version of history in the comments to "How does Nature deal with the ENCODE publicity hype that it created?", where some people are arguing that the ENCODE summary paper has been misrepresented.

Let's look at the summary paper published in Nature on September 6, 2012 in the issue devoted largely to ENCODE papers. The lead author was Ewan Birney, by his own admission, and the paper is usually referenced as Birney et al. (2012).

The scientific summary was accompanied by two articles and one video produced by Nature editors. In the first article, Brendan Maher (Maher, 2012) attempts to explain the purpose of the ENCODE project. He says,
ENCODE was designed to pick up where the Human Genome Project left off. Although that massive effort revealed the blueprint of human biology, it quickly became clear that the instruction manual for reading the blueprint was sketchy at best. Researchers could identify in its 3 billion letters many of the regions that code for proteins, but those make up little more than 1% of the genome, contained in around 20,000 genes — a few familiar objects in an otherwise stark and unrecognizable landscape. Many biologists suspected that the information responsible for the wondrous complexity of humans lay somewhere in the ‘deserts’ between the genes. ENCODE, which started in 2003, is a massive data-collection effort designed to populate this terrain. The aim is to catalogue the ‘functional’ DNA sequences that lurk there, learn when and in which cells they are active and trace their effects on how the genome is packaged, regulated and read.
Presumably this description reflects the view of the ENCODE Consortium, which worked closely with Nature in coordinating the publicity campaign that accompanied the publications. You can't blame anyone for assuming that the goal of ENCODE is to look for function in junk DNA.

Maher goes on to describe the publication of the pilot project in 2007 when 1% of the genome was analyzed.
The pilot projects transformed biologists’ view of the genome. Even though only a small amount of DNA manufactures protein-coding messenger RNA, for example, the researchers found that much of the genome is ‘transcribed’ into non-coding RNA molecules, some of which are now known to be important regulators of gene expression. And although many geneticists had thought that the functional elements would be those that are most conserved across species, they actually found that many important regulatory sequences have evolved rapidly.
The results of the pilot project did NOT "transform biologists’ view of the genome" if you are referring to knowledgeable biologists. Those results came under heavy criticism for exactly the same reasons the more recent publications have been criticized. Nobody who believed that most of our genome is junk was swayed by the results of the pilot project because the interpretation of those results was flawed by illogical thinking and overinterpretation of data. Just like what happened five years later.

With reference to the results appearing in the Sept. 6, 2012 issue of Nature, Maher says,
The real fun starts when the various data sets are layered together. Experiments looking at histone modifications, for example, reveal patterns that correspond with the borders of the DNaseI-sensitive sites. Then researchers can add data showing exactly which transcription factors bind where, and when. The vast desert regions have now been populated with hundreds of thousands of features that contribute to gene regulation. And every cell type uses different combinations and permutations of these features to generate its unique biology. This richness helps to explain how relatively few protein-coding genes can provide the biological complexity necessary to grow and run a human being. ENCODE “is much more than the sum of the parts”, says Manolis Kellis, a computational genomicist at the Massachusetts Institute of Technology in Cambridge, who led some of the data-analysis efforts.
Now this sounds to me like a genome teeming with function and very little junk DNA but maybe that's not what Brendan Maher and the ENCODE Consortium really meant, right? (BTW, many of us think that our understanding of development in organisms like fruit flies shows that a relatively small number of transcription factors working on a conserved set of genes can explain complexity. Apparently the ENCODE Consortium leaders think that something more is needed to explain humans.)

The other editorial paper is by Skipper et al. (2012). The lead author is senior editor Magdalena Skipper who can be seen in this video with Ewan Birney announcing that 80% of the genome has a function, meaning that it is not junk.

The Skipper et al. paper introduces us to five researchers who were invited to participate in the publicity campaign by sharing "their views on what the results mean to them and their work" (Ecker et al., 2012). The first of these experts is Joseph Ecker who says,
One of the more remarkable findings described in the consortium's 'entrée' paper (page 57) is that 80% of the genome contains elements linked to biochemical functions, dispatching the widely held view that the human genome is mostly 'junk DNA'. The authors report that the space between genes is filled with enhancers (regulatory DNA elements), promoters (the sites at which DNA's transcription into RNA is initiated) and numerous previously overlooked regions that encode RNA transcripts that are not translated into proteins but might have regulatory roles.
Now, none of these editors and experts are ENCODE Consortium authors so it's quite possible that they have all misinterpreted the 'entrée' paper. This is the view now being suggested by the Consortium leaders (Kellis et al., 2014). They now argue that Genetic, Evolutionary, and Biochemical descriptions of "function" are all reasonable approaches to understanding junk DNA and genome composition. They now claim that just having the data available is their most important contribution and not claims about how much of the genome is functional.
In contrast to evolutionary and genetic evidence, biochemical data offer clues about both the molecular function served by underlying DNA elements and the cell types in which they act, thus providing a launching point to study differentiation and development, cellular circuitry, and human disease. The major contribution of ENCODE to date has been high-resolution, highly-reproducible maps of DNA segments with biochemical signatures associated with diverse molecular functions. We believe that this public resource is far more important than any interim estimate of the fraction of the human genome that is functional.
The implication is that they really didn't mean to say that 80% of our genome is functional. It was all a misunderstanding. They are now saying that the goal of the Consortium wasn't to discover function at all but merely to provide maps of places that might be functional, depending on your definition.

With that introduction, let's look at what Birney et al. actually said in their summary paper back in September 2012. You can judge for yourselves whether their statements were misinterpreted.

It's hard to avoid the impression that Birney et al. were attributing some sort of function to every single protein binding site and every single transcribed sequence. The abstract to their paper says,
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.
The revisionist view of this history would have us believe that their claim of "new insights" is only tentative depending on whether "biochemical functions" really means anything.

The authors define what they mean by "function" in the introduction.
The Encyclopedia of DNA Elements (ENCODE) project aims to delineate all functional elements encoded in the human genome. Operationally, we define a functional element as a discrete genome segment that encodes a defined product (for example, protein or non-coding RNA) or displays a reproducible biochemical signature (for example, protein binding, or a specific chromatin structure). Comparative genomic studies suggest that 3–8% of bases are under purifying (negative) selection and therefore may be functional, although other analyses have suggested much higher estimates. In a pilot phase covering 1% of the genome, the ENCODE project annotated 60% of mammalian evolutionarily constrained bases, but also identified many additional putative functional elements without evidence of constraint. The advent of more powerful DNA sequencing technologies now enables whole-genome and more precise analyses with a broad repertoire of functional assays.
Now, I don't know about the rest of you, but that sounds pretty clear to me. The authors really mean it when they say that 80% of our genome is functional.
Here we describe the production and initial analysis of 1,640 data sets designed to annotate functional elements in the entire human genome. We integrate results from diverse experiments within cell types, related experiments involving 147 different cell types, and all ENCODE data with other resources, such as candidate regions from genome-wide association studies (GWAS) and evolutionarily constrained regions. Together, these efforts reveal important features about the organization and function of the human genome, summarized below.

• The vast majority (80.4%) of the human genome participates in at least one biochemical RNA- and/or chromatin-associated event in at least one cell type. Much of the genome lies close to a regulatory event: 95% of the genome lies within 8 kilobases (kb) of a DNA–protein interaction (as assayed by bound ChIP-seq motifs or DNase I footprints), and 99% is within 1.7 kb of at least one of the biochemical events measured by ENCODE.
I don't blame media types and other experts for reading this as a claim that junk DNA is only a small percentage of the genome.

Many of us recognized right away that this was ridiculous so we looked in the paper to see where they discussed nonfunctional binding and aberrant transcripts and referenced the papers that raised these issues. I couldn't find any such discussion. Did I miss it? Can anyone give me the page numbers and the reference numbers where they point out the possible limitations of their definition of function?

What we find instead is a description of the assays for function and a summary (page 60) that says ...
Accounting for all these elements, a surprisingly large amount of the human genome, 80.4%, is covered by at least one ENCODE-identified element (detailed in Supplementary Table 1, section Q). The broadest element class represents the different RNA types, covering 62% of the genome (although the majority is inside of introns or near genes). Regions highly enriched for histone modifications form the next largest class (56.1%). Excluding RNA elements and broad histone elements, 44.2% of the genome is covered. Smaller proportions of the genome are occupied by regions of open chromatin (15.2%) or sites of transcription factor binding (8.1%), with 19.4% covered by at least one DHS or transcription factor ChIP-seq peak across all cell lines. Using our most conservative assessment, 8.5% of bases are covered by either a transcription-factor-binding-site motif (4.6%) or a DHS footprint (5.7%). This, however, is still about 4.5-fold higher than the amount of protein-coding exons, and about twofold higher than the estimated amount of pan-mammalian constraint.

Given that the ENCODE project did not assay all cell types, or all transcription factors, and in particular has sampled few specialized or developmentally restricted cell lineages, these proportions must be underestimates of the total amount of functional bases. ... These estimates represent a lower bound, but reinforce the observation that there is more non-coding functional DNA than either coding sequence or mammalian evolutionarily constrained bases.
What we also find is a criticism of conservation (Evolutionary version of function) because it can't identify human specific functions that have arisen recently. They reference experiments that support such a conclusion and say, "This indicates that an appreciable proportion of the unconstrained elements are lineage-specific elements required for organismal function, consistent with long-standing views of recent evolution and the remainder are probably ‘neutral’ elements that are not currently under selection but may still affect cellular or larger scale phenotypes without an effect on fitness."

Such statements reinforce the idea that the ENCODE authors look to biochemical function as the definitive definition of function.

The "Concluding Remarks" say,
The unprecedented number of functional elements identified in this study provides a valuable resource to the scientific community as well as significantly enhances our understanding of the human genome....

The large spread of coverage—from our highest resolution, most conservative set of bases implicated in GENCODE protein-coding gene exons (2.9%) or specific protein DNA binding (8.5%) to the broadest, most general set of marks covering the genome (approximately 80%), with many gradations in between—represents a spectrum of elements with different functional properties discovered by ENCODE.
I wonder what "unprecedented number of functional elements" means? I wonder how we "significantly enhance our understanding of the human genome" if most of those biochemical functional elements are artifacts? Did the authors really mean to say that maybe only 10% of our genome is functional and this does nothing to enhance our understanding of the human genome? I don't think so.

The authors then go on to reiterate that all of these estimates of functional elements are likely to be underestimates.

The press picked up on these statements and reported that most of our genome was functional, not junk. As I stated above, those press releases often quoted leaders of the ENCODE Consortium and we know for a fact that many of them actually believed they had debunked junk. In fact, many of them still do believe that most of our genome is functional in spite of the Kellis et al. paper.

Does anyone honestly believe that this whole publicity hype campaign orchestrated by Ewan Birney and Nature with the active collaboration of other ENCODE leaders was all a big misunderstanding? Are some ENCODE Consortium members honestly trying to tell us that the paper was totally misinterpreted through no fault of the authors and they were really saying something very different than that most of the human genome has a function?

'Cause if that's what they're saying then how come the ENCODE leaders didn't speak up at the time and distance themselves from the misleading stories in the press?

How come they didn't point to the sentences on page 71 of their paper and claim that they didn't really mean to say ...
Importantly, for the first time we have sufficient statistical power to assess the impact of negative selection on primate-specific elements, and all ENCODE classes display evidence of negative selection in these unique-to-primate elements. Furthermore, even with our most conservative estimate of functional elements (8.5% of putative DNA/protein binding regions) and assuming that we have already sampled half of the elements from our transcription factor and cell-type diversity, one would estimate that at a minimum 20% (17% from protein binding and 2.9% protein coding gene exons) of the genome participates in these specific functions, with the likely figure significantly higher.
What they are saying is that they truly believe that every single transcription factor binding site has a function.

That's just nonsense.

Birney, E. et al. (The ENCODE Consortium) (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74. [doi: 10.1038/nature11247]

Ecker, J.R., Bickmore, W.A., Barroso, I., Pritchard, J.K., Gilad, Y., and Segal, E. (2012) Genomics: ENCODE explained. Nature 489:52–55. [doi: 10.1038/489052a]

Kellis, M. et al. (2014) Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. (USA) April 24, 2014 published online [doi: 10.1073/pnas.1318948111]

Skipper, M., Dhand, R., and Campbell, P. (2012) Presenting ENCODE. Nature 489:45. [doi: 10.1038/489045a]


  1. The lead author is senior editor Magdalena Skipper who can be seen in this video with Ewan Birney announcing that 805 of the genome has a function, meaning that it is not junk.

    I think you meant to write 80%, not 805.

  2. "The lead author was Ewan Birney"

    Time to refer to him as Ewan Birney, FRS. Maybe then the Royal Society will take notice that they have been promoting junk science.

    1. Yeah, that's worse. At least Birney is capable of shame. He's like a little boy standing next to a broken cookie jar, making excuses. He knows he did wrong.

      But Mattick!

      Look Larry, you have to write a book and kick ass. Take a sabbatical or something and write the book. Run it past Georgi and Joe first.

    2. Is it a tendency perhaps? When Fred Hoyle left Cambridge and started turning maverick, he got knighted the very next year.

    3. From Larry's blog post on Mattick (see the link above). Quotation:

      Others are less convinced. Ewan Birney of the European Bioinformatics Institute in Cambridge, UK, has bet Mattick that of the processed RNAs yet to be assigned a function - representing 14 per cent of the entire genome - less than 20 per cent will turn out to be useful. "I'll get a case of vintage champagne if I win," Birney says [in 2007]

      Does anyone know if he got that case? The champagne could have been doped with something. You never know what lengths those rogue scientists will go to.

    4. As I pointed out in another Sandwalk thread based on his 2012 Tweet on the onion test which said "@leonidkruglyak (re:onions etc); polyploidy and letting your repeats go crazy" I think it is quite unlikely that Birney wasn't convinced of the 80% figure back then. Or should this be one of the Humour doesn't work on Twitter cases?

    5. I'm puzzled by the parenthetic comment on fugu: "(note vertebrate, not mammal, but still, vertebrate are *very* complex)". It sounds like something inspired by the Dog's Ass Plot.

    6. Piotr - Fred Hoyle's knighthood was just a coincidence. British scientists who chaired a Research Council committee or later became Director General of one of the Research Councils which resulted from the breakup of the SRC, automatically received a knighthood for their services (or for women being made a Dame like the research supervisor of my master's research project).

      However, it did raise some controversy at the time. Some were in favour, recognizing it as a poor substitute for the Nobel prize he should have won for his work on stellar nucleosynthesis. Others objected because he had turned into a nut, while others objected because he had resigned from his position at the SRC and was not entitled to the automatic knighthood. This was in an argument about the siting of a new telescope. Now astronomers recognize he was right on this, but it was controversial at the time.

      However, it was essentially a formal bureaucratic decision and not designed, as far as one can tell, as a reward for his emerging raving looney scientific opinions.

    7. Where'd Greenie go? She was telling us we are not allowed to prove Ewan Birney, FRS's 2012 claims wrong with actual facts, because he's a FRS and we Americans & Canadians are not eligible. Somebody copies in a quote from 2007 in which Ewan Birney, FRS, bet a case of champagne that OUR position was right & his 2012 claims would be false. Remember Greenie, you're not allowed to disagree with a FRS and Ewan Birney, FRS, said we're right.

  3. One thing I try to teach any scientists who read my comments, is that arguing with dishonest people is not like a normal scientific argument. Dishonest people employ tricks not normally involved in real scientific controversies. These include:

    1. Equivocation: switching between definitions between major and minor premise. e.g. "80% of the genome has a biological function, by 'function' we mean a molecule interacts with it, therefore there is no Junk DNA"

    2. Evasion, changing the subject when you're caught in a contradiction or falsification. Gish Gallop.

    3. Insinuating the "facts" support your thesis, when the actual facts are the opposite, but you don't want to say something clear and explicit, because that would be falsifiable and you could be proven to be lying. But you insinuate the fake "facts" you need to support your thesis without stating them directly.

    So you have to deconstruct dishonest opponents. In this case, let's deconstruct the language used by Kellis et al. 2014:

    "The major contribution of ENCODE to date has been high-resolution, highly-reproducible maps of DNA segments with biochemical signatures associated with diverse molecular functions. We believe that this public resource is far more important than any interim estimate of the fraction of the human genome that is functional."

    Here they use insinuation rather than explicitly making a claim that they need to be true, because if they said it explicitly we could prove they're lying.

    Their previous claims were:

    1. 80% of the genome has biological function (so says the peer-reviewed publication),

    2. this definition of function is the proper definition to disprove Junk DNA (so says the press release),

    3. This disproof of Junk DNA was hugely important and

    4. It was the whole purpose of ENCODE, they said in 2012.

    But Kellis et al. 2014, with their "unimportant interim estimate" garbage, insinuate that they never, ever made the above four claims. Instead they insinuate that they really said:

    1. 80% of the genome has biochemical activity, they never said it was function,

    2. they never said this disproves Junk DNA,

    3. A disproof of Junk DNA is not important, what matters is a big list of TF binding sites,

    4. Disproving Junk DNA was never the whole purpose of ENCODE, they say in 2014.

    "Far more important than any interim estimate"! my ass.

  4. From the origins struggle view.
    if the media is so wrong in things they are paid and degree-ed not to be wrong about THEN creationists can say to the public that the evidence for evolution is NOT well researched by the media that communicates with the public.
    Leaving evolutionism in the hands of very small numbers of people now and in the past.
    That's why it's lingered and that's why it's never been criticized in a scholarly way by science reporters.
    Just making a common creationist complaint here.

    1. Byers, the IDcreationists lied when they said ENCODE had assayed & identified functions in most of the genome. Since they lied about this, IDcreationists cannot have any credibility.

      To repair your freako logic:

      if the creationists are so wrong in things they are paid and degree-ed not to be wrong about THEN scientists should say to the public that the evidence for creation is NOT well researched by the creationists that communicate with the public.

  5. I just wonder why Ewan Birney defended the 80% claims in 2012 when he bet in 2007 that the number is much lower.

    BTW, publishing some halfhearted non-retraction to obscure their 2012 mistakes doesn't really help, and it makes Nature look hypocritical, especially when they publish a new Mattick paper that puts forward the very same 80% number, although worded more carefully (actually, Mattick omits the term junk in the paper). He says

    "ENCODE reports that ~80% of the genome is transcribing ncRNAs"


    "The initial findings were confirmed in 2005 (REFS 156–159) and extended by the Encyclopedia of DNA Elements (ENCODE) project, all of which showed that the vast majority (at least 80%) of the human and mouse genomes are differentially transcribed in one context or another; other studies also reported similar findings in all organisms examined. Indeed, it seems that most intergenic and, by definition, intronic sequences are differentially transcribed, and that the extent of the transcriptome therefore expands with developmental complexity"

    He just omitted the dog's ass plot this time.

  6. Robert, have you ever seen a science paper or science book? Did you notice the references? Your problem is the lack of creationism worth the paper it is printed on. Do you know any?

  7. Robert, at the left of this blog you'll find many links that should be of great interest to you because of your keen interest in evolution - albeit from a YEC viewpoint, i.e. the viewpoint that evolution = atheism = nonsense.

    Anyway, you'll find a list of Nobel laureates below the label Themes (and I think it is safe to say that very close to 99.999% are 'evolutionists'). That's just Nobel prize winners; how many more do you think there are?

    What does that list tell us?

    That there is not any shortage of evolutionists in science at a level no creationist ever could aspire to.

    Where do you put yourself on the ladder of knowledge, understanding and intellectual achievement?

    1. Depends on the ladder's direction!!
      Unless these nobel folks study evolution as their job WHO CARES what they think about evolution UNLESS they prove its a private study done well?
      In fact my impression of nobel winners is they are obscure because what they get prizes for doesn't really advance things. They just need winners every year.
      Only a few do actual important accomplishments for the ages.
      I wish them well in their work but mostly they are not remembered.
      Unlike creationists who are taking on and winning a revolution in conclusions on origins.
      The ID leaders and YEC are the real thinkers and movers in ideas in our times on these subjects.
      Not cell watchers or chemists or string theory dreamers.
      They are not famous for a reason. Their stuff is just the next step in a simple knowledge progression.
      Creationism is a leap and a stomp on science conclusions.
      More thoughtful and more resisted because it's a paradigm change threat.

    2. Unless [people] study evolution as their job WHO CARES what they think about evolution UNLESS they prove its a private study done well?

      Tee hee. So 'ID leaders and creationists' are an example of 'private study done well'? Sproing! Another irony meter bites the dust.

    3. Byers said:

      "Unless these nobel folks study evolution as their job WHO CARES what they think about evolution UNLESS they prove its a private study done well?"

      Robert, so that I'm fully aware of what you mean by a "private study", will you explain what you mean by that? Will you also tell me the names of some creationists (at least 5) who are privately studying origins and evolution without using any of the scientific methods, tools, studies, observations, data, inferences, hypotheses, theories that you and other creationists claim are wrong, useless, and non-scientific?

      Tell me who is doing "private study" using only creationist (and especially only YEC) methods, tools, etc., etc., etc., and exactly how they go about it.

      What about non-christian creationists, including non-christian YECs? Are they doing "private study" too, using the same methods, tools, etc., etc., etc., as christian creationists, and are they correct in their conclusions?

      Why should anyone with a clue care what you and other creationists think about evolution or anything else since you conveniently deny the parts of science and reality that oppose your non-scientific, religious beliefs that are based on impossible fairy tales? YOU, in fact, even claim that science doesn't exist, yet you take advantage of many things that science makes possible and available. Tell me, Robert, why didn't the biblical character jesus have an Ipad? Since jesus is god and god is all knowing and all powerful, was he too stingy to give himself an Ipad? And what about the internet? Wouldn't jesus/god/holy ghost have had a much easier and more effective opportunity of reaching and ministering to the people of the world if he had created computers and the internet for everyone from the 'beginning'? What's with the stone tablets when 'God' could easily have put the ten commandments on a blog?

      Do you believe that a serpent talked, as claimed in the bible? Do you believe that a man lived inside a fish for days and survived, as claimed in the bible? Do you believe that babies should be dashed against rocks, that people should suffer and even burn in Hell for eternity for the alleged sins of the biblical characters Adam and Eve, that people should be stoned to death for disobeying the alleged commands of an imaginary, so-called god, and that that so-called god is perfect, loving, and merciful?

      Do you believe that your so-called god is all powerful? If so, why hasn't your so-called god ever caused/enabled a person to regrow a severed limb? A lowly crab can regrow a severed limb, some lizards can regrow a severed tail. If humans are so 'special' and superior, why can't humans regrow severed limbs? Shouldn't humans be able to do anything and everything that any other life form can do plus much, much more? You often claim that humans got the best body from your so-called god, but our bodies are grossly inferior to the bodies of many animals and plants. How do you explain that if humans are special and superior? Surely many people are staunch christians and have prayed over and over again for a new arm or leg for themselves or others but for some reason your loving, merciful, perfect, all powerful, so-called god never answers those prayers in a positive way. And there are all of the people and animals that suffer with diseases, disfigurements, disabilities, injuries, horrible pain, etc. that your loving, so-called god should be able to fix in an instant. Doesn't anyone or anything deserve 'God's' love and mercy? What have animals and plants done to deserve being punished for Adam and Eve's alleged 'sin' against 'God'?

      And one last question, for now: WHY do you believe in, worship and promote an imaginary, genocidal, ecocidal, tyrannical, petty, destructive, hateful, narcissistic, sadistic MONSTER?

  8. Why don't they just put it all to rest and define function in the genome as any sequence that is replicated (and/or is subject to telomerase activity)? In one fell swoop, they would be able to claim that 100% of the genome is functional.

  9. Right now I'm reading a review article about transcription in Cell. Two of the authors are Michael Levine and Robert Tjian. They discuss the ENCODE results favorably and say, referring to TFs specifically (I think): "This amounts to a remarkable fraction of our genome - 25% and probably more - devoted to regulatory information..."

  10. Robert says: YEC are the real thinkers and movers in ideas in our times on these subjects.
    Please name some, with reference to their real thoughts, writings and 'moves', excluding your own tripe.

    1. All of them. Those who apply themselves to YEC subjects and some ID folks are the advanced thinkers on these subjects.
      That's why there is a revolution going on.
      The opposition is just responding to it but by its response raises the stakes.
      This is either a movement that will overthrow in a big way many conclusions in these subjects or a movement that will uniquely crumble and be a proverb and a story for decades to come.
      Right or wrong, it's YEC and ID who are the great revolutionaries in our times.
      We are the thinkers about big things here. Right or wrong.

  11. Claudiu OK – this is where I get confused…

    In a previous post on that thread, I asked the following:

    I always understood that retroviruses co-opted host regulatory machinery and vice versa constituting the acme in molecular host-parasite coevolution.

    Meanwhile, the different distributions of Alu and LINE1 in the genome would suggest that selection pressure may be involved. Do Alus direct methylation? Are Alus and Line1 DNA symbionts?

    Claudiu, you agreed – and elaborated even further above. Let's see if I managed to capture your drift...

  12. @ Claudiu (con’t)

    I am now scratching my head at the continuing exchange and your rebuttal. We are no longer talking about whether selfish DNA is functional, but rather whether the repetitive sequences of ancestral retroviral symbiotic bulk DNA have been co-opted for global gene regulation and cell differentiation, and whether this new state of affairs merits the designation of “functional DNA”.

    I am still attempting to wrap my head around exactly what is meant by “bulk” DNA and how to assess positive selection.

    “…repeated DNAs display very high frequencies of sequence changes during evolution that become homogenized across genomes. These observations suggest the presence of mechanisms that balance interactions and exchange of information between heterochromatic sequences with the need to avoid negative consequences to genome stability.” link

    Now that I find interesting… If I understand all this correctly, positive selection for conservation of sequence “type” is not equivalent to conservation of the original sequence; but that said, positive selection is still very real and very real for real important reasons.

    If so, I have addressed Allan's rebuttal.

    If so, then I also understand where Claudiu is coming from together with Claudiu's criticism of Doolittle’s PNAS rebuttal.

    Larry came up with a great list:

    1. The skeletal DNA hypothesis (more DNA = larger nucleus and more nuclear pores) (Cavalier-Smith)
    2. Spacers and loops (Zuckerkandl)
    3. Mutation protection (various authors)
    4. Teleological hypotheses (excess DNA is necessary for the evolution of new genes and new regulatory functions)

    I wonder out loud if this list is incomplete. So I will repeat myself:

    Let’s talk about heterochromatin “function” along epigenetic lines. I paraphrased a review above along these lines:

    Heterochromatin is employed as a platform for the recruitment of effectors across extended domains along chromosomes including but not restricted to silencing and anti-silencing factors. It gets better: Heterochromatin (both facultative and constitutive) also regulates cell-type specific spreading of protein complexes along chromosomes that ultimately controls transcription, chromosome segregation and long-range chromatin interactions.

    Sounds pretty “FUNCTIONAL” to me.

    Now of course – The exact sequences of “functional” murine heterochromatin would not be identical to the human equivalent making assay and identification difficult.

    That said – I am betting that strong positive selection exists for the maintenance of karyotype commonalities between primates (for example).

    So I wonder out loud: is it possible that ENCODE was right for all the wrong reasons?

    It would appear that ancient retrovirus sequences are the sine qua non of cell differentiation and global gene regulation (at least in primates) and do constitute functionality along lines originally espoused by ENCODE. I humbly suggest that 18–12 million years is a very long time to maintain karyotype commonalities in apes.

    Of course there still remains the entirely separate question of whether ENCODE had the data in hand to justify such lines of hypothesis... a completely separate question, altogether.

    Just the same, being guilty of hubris is not tantamount to being guilty of falsehood.

    Furthermore, even if some organisms and perhaps even some lineages (Ecdysozoa perhaps) do not manifest such “functionality”, that does not translate a priori to the conclusion that such functionality is never important.

    OK – somebody please help me out… what am I missing?

    1. When thinking about the evolution of genome size and C-value enigma, it is critical to realize that genomic DNA can play informational (iDNA) roles or functions, which are based on sequence specificity, or it can have non-informational (niDNA) functions, which are independent of the nucleotide sequence.

      We have known for half a century or so (and for good reasons, such as sequence variation and mutational load) that in organisms with high C-value, such as humans, only a low percentage of the genomic DNA can be iDNA. We have also known for a very long time that most of the genomic DNA in species with high C-value consists of retroviral and transposable element sequences or their products or remnants, and that *some* (a few percent at most) of these sequences have been co-opted as iDNA or niDNA.

      Based on these facts and rationale, the primary question addressed by the scholars in the field was whether the bulk of the genome (90% or more in the human genome), which consists primarily of viral and transposable elements, was simply parasitic or ‘junk DNA’ (jDNA), or whether it was functional niDNA.

      As described by Doolittle in his PNAS paper, the two prevalent hypotheses advanced by the scholars in this field regarding potential non-informational functions for the so-called jDNA were the ‘nucleo-skeletal’ (Cavalier-Smith) and ‘nucleotypic’ (Gregory) hypotheses:

      The “selfish DNA” scenarios of 1980 (20–22), in which C-value represents only the outcome of conflicts between upward pressure from reproductively competing TEs and downward-directed energetic restraints, have thus, in subsequent decades, yielded to more nuanced understandings. Cavalier-Smith (13, 20) called DNA’s structural and cell biological roles “nucleoskeletal,” considering C-value to be optimized by organism-level natural selection (13, 20). Gregory, now the principal C-value theorist, embraces a more “pluralistic, hierarchical approach” to what he calls “nucleotypic” function (11, 12, 17).

      In the material I suggest you might want to read, I proposed that the ‘nucleo-skeletal’ and ‘nucleotypic’ hypotheses cannot explain the C-value enigma (i.e. do not pass the ‘onion test’), and I discussed an old hypothesis that explains the evolution of genome size and the C-value enigma: