Tuesday, June 25, 2013

"Reasons to Believe" in ENCODE

Fazale "Fuz" Rana is a biochemist at "Reasons to Believe." He and his colleagues are Christian apologists who try to make their faith compatible with science. Fuz was very excited about the ENCODE results when they were first published [One of the Most Significant Days in the History of Biochemistry]. That's because Christians of his ilk were very unhappy about junk DNA and the ENCODE Consortium showed that all of our genome is functional.1

Fuz is aware of the fact that some people are skeptical about the ENCODE results. He wrote a series of posts defending ENCODE.
  1. Do ENCODE Skeptics Protest Too Much? Part 1 (of 3)
  2. Do ENCODE Skeptics Protest Too Much? Part 2 (of 3)
  3. Do ENCODE Skeptics Protest Too Much? Part 3 (of 3)
The first post is merely a list of the objections many of us raised.

The second post is a lengthy discussion of the meaning of "function." Rana is happy with the ENCODE definition of "function" because it's a "causal definition" based on things like transcription, the binding of transcription factors, DNA methylation, etc. He says,
The implied assumption is that if a sequence is involved any of these processes—all of which play well-established roles in gene regulation—then the sequences must have functional utility.
He doesn't discuss whether transcription of known pseudogenes means that the pseudogene has a function or whether the "function" of a random accidental binding site has any biological significance.

The third post is the one that's most interesting because it highlights some basic flaws in fact and logic. These flaws are not confined to creationists; many scientists make the same errors.

Let's look at Fuz Rana's third post from May 13, 2013. He discusses four significant objections to the ENCODE results.

1. Logical Errors in Assigning Function

According to Fuz Rana, the skeptics accuse the ENCODE Consortium of faulty logic when it comes to assigning functional parts of the genome. If part of the genome is expressed to produce a functional product (RNA or protein) then it has to be transcribed. In terms of logic, if the DNA is functional (F) then it is transcribed (T). In other words, if F, then T.

The ENCODE logic goes like this ....

If F, then T
T is observed
Therefore, F

This is faulty logic since the premise doesn't exclude the fact that T (transcription) could occur in the absence of F (function). Fuz Rana is specifically addressing an objection from Dan Graur ...
Graur’s team argues this conclusion is invalid. For example, it is possible that the transcription factor could bind randomly to DNA sequences that do not serve as promoters or enhancers, or in any other functional role. In other words, to conclude that DNA sequences are functional if they bind transcription factors is to affirm the consequent. Graur’s group believes the ENCODE Project committed this logical error for all the assays they performed when assigning function to DNA sequences.
To my mind, Graur's objection is a devastating criticism of the ENCODE results, especially since we know for a fact that transcription factors bind to nonfunctional DNA.
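
The invalidity of the argument form above (if F, then T; T is observed; therefore F) can be checked by brute force over the four possible truth assignments. This is only a minimal sketch of the propositional logic; the function names are made up for illustration, and it says nothing about how often transcription actually occurs without function.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p, then q'."""
    return (not p) or q


def affirming_consequent_counterexamples():
    """Return (F, T) assignments where 'if F then T' and 'T' both hold but F is false."""
    found = []
    for f, t in product([False, True], repeat=2):
        premises_hold = implies(f, t) and t
        if premises_hold and not f:
            found.append((f, t))
    return found


def modus_ponens_counterexamples():
    """Return (F, T) assignments where 'if F then T' and 'F' both hold but T is false."""
    return [
        (f, t)
        for f, t in product([False, True], repeat=2)
        if implies(f, t) and f and not t
    ]


if __name__ == "__main__":
    # One counterexample exists: F false (no function), T true (transcribed).
    print(affirming_consequent_counterexamples())
    # The valid form (if F then T; F; therefore T) has no counterexample.
    print(modus_ponens_counterexamples())
```

The single counterexample is exactly the case the skeptics care about: DNA that is transcribed (or bound) but not functional.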

But to Fuz Rana it's all a question of philosophy.
In my opinion this concern is not nearly as problematic as Graur’s team makes it out to be. They conflate deductive reasoning with inductive; yet scientific investigations rely on induction, not deduction. While Graur’s group rightly points out that affirming the consequent is a logical fallacy when engaged in deductive reasoning, this error doesn’t apply to inductive reasoning. Induction produces conclusions that are probabilistic though not certain.

Let’s return to the example of transcription factor binding to DNA. As already noted, if a DNA sequence serves as an enhancer or promoter, it will bind a specific set of transcription factors. If a scientist then observes transcription factors binding to DNA, it is reasonable to conclude that these binding sites play a role in regulating gene expression. Though not certain, this conclusion is probabilistic. Despite the uncertainty associated with it, the conclusion is still reasonable because a vast body of data demonstrates that transcription factors bind to specific DNA sequences that regulate gene expression. Yes, another explanation for why these transcription factors bind to DNA may exist. Confirmatory experiments can reduce this uncertainty.

The key point is this: there is nothing wrong with concluding, when using inductive reasoning, that sequences that bind transcription factors are functional. By extension, there is nothing wrong with the reasoning the ENCODE Project employed to assign function to sequences in the human genome. The ENCODE Project did not affirm the consequent because they were making use of induction (as do all scientists), not deduction.
It is NOT reasonable to conclude that whenever a transcription factor binds to DNA that site must be a genuine promoter or enhancer. No rational scientist should make that mistake no matter what form of reasoning they employ.

But, to be fair, Rana is in good company since that's exactly what the ENCODE researchers concluded. They assigned "function" to every site that bound a transcription factor and they assumed that every bit of DNA that was transcribed has a function.
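
Even if you grant that the reasoning is inductive, the strength of the conclusion depends on how often binding occurs at nonfunctional sites. Here is a rough sketch using Bayes' rule. Every number in it is a hypothetical placeholder chosen only to show the arithmetic; none of them is a measurement from ENCODE or from anyone else.

```python
def posterior_functional(prior, p_bind_given_functional, p_bind_given_nonfunctional):
    """P(functional | binding observed) by Bayes' rule.

    prior: assumed fraction of candidate sites that are functional.
    p_bind_given_functional: assumed chance that a functional site shows binding.
    p_bind_given_nonfunctional: assumed chance that a nonfunctional site shows binding.
    """
    joint_functional = prior * p_bind_given_functional
    joint_nonfunctional = (1.0 - prior) * p_bind_given_nonfunctional
    total = joint_functional + joint_nonfunctional
    if total == 0:
        raise ValueError("binding is impossible under these assumptions")
    return joint_functional / total


if __name__ == "__main__":
    # Hypothetical inputs: 10% of sites functional, binding seen at 90% of
    # functional sites and at 30% of nonfunctional sites.
    print(posterior_functional(0.1, 0.9, 0.3))
```

With those made-up inputs, most observed binding sites would be nonfunctional even though binding is three times more likely at a functional site. Induction gives you a probability, and the probability can be low.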

2. Assigning Function to an Entire Class of Sequence Elements Based on a Few Members

It seems bizarre, but real scientists often discover that one or two pseudogenes or transposons have a function and then leap to the conclusion that all pseudogenes and transposons must be functional.

According to Fuz Rana, it is perfectly legitimate to conclude that all pseudogenes and transposons have a function even though only a few have been identified as functional.
In other words, just because researchers identify function for, say, duplicated pseudogenes doesn’t mean all pseudogenes possess function. However, it would be natural in other scientific investigations to assume that if a particular property has been identified for a representative sample, then the entire system possesses that property. Yet when reviewing the ENCODE Project, some evolutionary biologists are eschewing this common practice. Many of the human DNA sequences to which ENCODE assigned function reside within regions of the genome long thought to be junk DNA. It seems Doolittle and others are reluctant to conclude that a part represents the whole due to a pre-commitment to the belief that junk DNA arises via evolutionary processes.
Most pseudogenes and defective transposons have all the hallmarks of broken genes. Therefore, it's logical to assume that all pseudogenes and defective transposons are non-functional. However, we have now discovered a few exceptions where a pseudogene or a defective transposon has acquired a function. The exceptions prove the rule. What Fuz Rana says makes no sense. You don't assume that all broken genes must have a function just because you discover a few exceptions.

But there's more to his argument. He claims that most of these sequences have been shown to have a function.
The results of the ENCODE Project challenge this evolutionary perspective. The ENCODE team performed a large number of assays, systematically surveying the human genome and cataloging the functional sequences. They didn’t simply identify a few examples of function in particular members of a junk DNA category and then conclude the whole class must be functional. Instead, they identified, one by one, members of a sequence elements group that displayed function.

So, in effect, Doolittle’s complaint holds no weight. It also fails to take into account other work—such as research involving pseudogenes—that not only identifies function for individual members of this junk DNA class, but also presents an elegant framework to explain the function of all members of the category. (Go here and here to read about the competitive endogenous RNA hypothesis as a comprehensive model for pseudogene function.) This type of advance coheres nicely with the catalogue of functional elements ENCODE identified.
This is nonsense. ENCODE did not show that most pseudogenes and defective transposons have a function. At most, they showed that some of them still have promoters and enhancers that bind transcription factors. But that's exactly what we expect of broken genes.

3. Conflating Biochemical Activity with Function

We've been over this ground before. Many of us think that the mere existence of transcripts and DNA binding sites does not indicate function. The biggest mistake made by the ENCODE Consortium was to assume the opposite.

Here's what Fuz Rana says about that ...
To me, this criticism of ENCODE seems motivated by a strong commitment to the evolutionary paradigm. In other words, the experimentally generated ENCODE results don’t square with the expectations of the theory of biological evolution; therefore, the ENCODE results must be wrong. This is an example of theory-dependent reasoning, in which the theoretical framework holds more sway than the actual experimental and observational results. ENCODE skeptics’ commitment to the evolutionary paradigm is so strong it appears that they unwittingly abandoned one of science’s central practices: experimental results dictate a theory’s validity, not the other way around.

These criticisms ignore two important points: (1) biochemical noise costs energy; and (2) random interactions among genome components would be highly deleterious to the organism.
In other words, it's Fuz Rana who is committed to an evolutionary paradigm; namely, adaptationism. He doesn't understand evolution.

I don't know of any theory of biological evolution that predicted junk DNA. This is some kind of fairy tale made up by creationists.
While it’s true that biochemical activity doesn’t necessarily equate to function, the ENCODE researchers appear to have gone to great efforts to ensure that they measured activity with biological meaning. The idea that activities associated with the genome—such as the transcription of the genome, methylation of DNA, modification of histones, binding of transcription factors, and others—are mostly noise borders on the ridiculous because it ignores well-established principles of biochemical operations.
Fuz Rana doesn't understand biochemistry. He should read a textbook in order to learn about "well-established principles of biochemical operations."

4. Squaring with the C-Value Paradox

Most opponents of junk DNA ignore the evidence upon which the concept was founded. I'm referring to the genetic load argument and the C-Value Paradox. The C-Value Paradox refers to the vast range of genome sizes in otherwise similar species. These observations strongly suggest that genome size is unrelated to the number of functional elements in the genome. If two closely related species differ by a factor of two in genome size then most of the DNA in the one with the larger genome must be irrelevant. No other explanation makes sense.

Fuz Rana thinks that most of our genome is functional. That's a problem when we consider less complex organisms that have larger genomes. He has an explanation. In those species, the "extra" DNA performs an additional function that our genome doesn't need. Here's how he describes it ...
In light of the C-value paradox, the ENCODE results would mean that less sophisticated organisms with larger genomes (compared to humans) must also possess more functional elements. But such a scenario makes no sense—at least from an evolutionary perspective. Yet it is possible to account for the larger genomes in organisms less complex than humans. It may be that the excess DNA plays a role other than coding for proteins and regulating gene expression. A number of studies, for example, indicate that DNA dictates the size of the cell nucleus.
So it appears that our genome is full of functional elements that ENCODE can detect but larger genomes just contain "stuffer" DNA. I wonder how he explains complex species, like the pufferfish, that have much smaller genomes?

Fuz Rana concludes that ENCODE skeptics are biased by adherence to some sort of false evolutionary paradigm. He and his colleagues, on the other hand, are much more open-minded so they can follow the evidence wherever it may lead!!!
Despite these latest criticisms, I see no real scientific reason to dismiss the ENCODE Project’s results. Careful consideration reveals that the objections have more to do with philosophy than science. The ENCODE skeptics seem to feel that the ENCODE results must be wrong because they don’t line up with key concepts of the evolutionary paradigm. The ENCODE skeptics even depart from standard scientific practices to maintain their commitment to evolution in the face of the ENCODE discoveries.

The ENCODE Project’s conclusions—namely that at least 80 percent of the human genome is comprised of functional DNA sequences—remain valid evidence for elegant design, befitting the work of a Creator, in the human genome and, by extension, the genomes of other organisms.

1. I'm not sure where in the Bible it says that every bit of the human genome has to be functional. It's been quite a while since I read it. Can anyone supply chapter and verse?


  1. that at least 80 percent of the human genome is comprised of functional DNA sequences—remain valid evidence for elegant design, befitting the work of a Creator, in the human genome and, by extension, the genomes of other organisms.

    I wonder what the cutoff point is ?

    80% seems to me as if the creator just isn't trying hard enough.

    Or perhaps the creator delegated the task to a committee and they fucked up, spent more time deciding on the shape of the table and less on actual design.

    And wouldn't a human genome with 0% functional DNA sequences be really, really good evidence for a creator ?

    Perhaps our local IDiots and creotards can weigh in and illuminate me.


  2. I wonder if there's another way to persuade reasonable people that the genome can't be more than 80% functional. The steady state amount of junk should be determined by the rate junk is created minus the rate junk is removed (assuming no selection for or against). Such a calculation would be extremely imprecise; nevertheless, if someone claims the genome has 95% functionality, that should require some combination of very low production and very high removal... all of which could be shown to be false.

  3. The steady state amount of junk is not the rate at which junk is created minus the rate at which it is removed. That is the rate at which it increases. Once it has reached its equilibrium level, those two numbers should be equal.

    The steady state amount would be the number of bases of junk added per generation, divided by the fraction of junk bases that are deleted each generation. Thus if 10000 bases of junk are added each generation, and each of the existing junk bases has probability 1/1000 of being removed each generation, then the equilibrium amount of junk would be 10000/(1/1000) = 10,000,000 bases.

    At that amount the addition (10000) is exactly balanced by the number of bases removed, which is 10000000/1000.
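
The arithmetic in this comment can be checked numerically. The following is a minimal sketch, assuming a simple deterministic model in which a fixed number of junk bases is added each generation and a fixed fraction of the existing junk is removed; the function names are invented for illustration, and the rates are the comment's own example values, not estimates for any real genome.

```python
def equilibrium_junk(added_per_generation, removal_fraction):
    """Steady-state junk: bases added per generation / fraction removed per generation."""
    return added_per_generation / removal_fraction


def simulate_junk(added_per_generation, removal_fraction, generations, start=0.0):
    """Iterate the model one generation at a time and return the final junk amount."""
    junk = start
    for _ in range(generations):
        # Remove a fixed fraction of existing junk, then add the new junk.
        junk = junk - junk * removal_fraction + added_per_generation
    return junk


if __name__ == "__main__":
    added, removal = 10000, 1 / 1000
    # Closed-form equilibrium: about 10,000,000 bases.
    print(equilibrium_junk(added, removal))
    # Starting from no junk, the simulation approaches the same value.
    print(simulate_junk(added, removal, 20000))
```

In this model the equilibrium scales inversely with the removal fraction, so changing the removal probability from 1/1000 to 1/5000 multiplies the steady-state amount of junk by five.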

  4. "If two closely related species differ by a factor of two in genome size then most of the DNA in the one with the larger genome must be irrelevant. No other explanation makes sense."

    With this opinion are you taking into consideration how variable genome size in closely related species could provide novel insights into species histories? For instance, how do transposable elements contribute to genome evolution?

    1. With this opinion are you taking into consideration how variable genome size in closely related species could provide novel insights into species histories?


      For instance, how do transposable elements contribute to genome evolution?

      Not very much. About 50% of our genome consists of DNA that looks like broken transposons (pseudogenes) and bits & pieces of transposons. In other words, it looks like junk. The default explanation is that it is, indeed, just what it looks like.

  5. I was thinking beyond our own species. For instance, some closely related salamanders have genomes that vary between 15 and 75Gb, most of which is uncharacterized transposable elements. What's a good explanation for a 5x variation in nuclear DNA? Is 75Gb an upper limit or could it go higher?

    1. What's a good explanation for a 5x variation in nuclear DNA?

      It's junk.

    2. Ummm... Joe Felsenstein laid it out above. Relatively minor tweaks in the per-generation addition and deletion rates would lead to radically different equilibrium values.

    3. If giant genomes, depending of course on what they comprise (which we don't know for many species), contribute additively to reproductive isolation and all sorts of morphological features, from cell size to development and metabolism, where do you draw the line at junk?

    4. @caynazzo,

      If it has a function, it ain't junk. The line is pretty clear.

  6. Karyotyping was done in salamanders back in the '90s. No polyploidy.