The ENCODE legacy
I addressed the meaning of "function" in Part I. It is apparent that philosophers and scientists are a long way from agreeing on an acceptable definition. There has been a mini-explosion of papers on this topic in the past few years, stimulated by the ENCODE Consortium publicity campaign where the ENCODE leaders clearly picked a silly definition of "function" in order to attract attention.
Unfortunately, the responses to this mistake have not clarified the issue at all. Indeed, some philosophers have even defended the ENCODE Consortium definition (Germain et al., 2014). Some have opposed the ENCODE definition but come under attack from other scientists and philosophers for using the wrong definition (see Elliott et al., 2014). The net effect has been to lend credence to the ENCODE Consortium’s definition, if only because it becomes one of many viable alternatives.
Ford Doolittle anticipated this debate last year (Doolittle, 2013) when he wrote,
In the end, of course, there is no experimentally ascertainable truth of these definitional matters other than the truth that many of the most heated arguments in biology are not about facts at all but rather about the words that we use to describe what we think the facts might be. However, that the debate is in the end about the meaning of words does not mean that there are not crucial differences in our understanding of the evolutionary process hidden beneath the rhetoric.
My position on the ENCODE publicity campaign is that they did not offer us a definition of function at all. Yes, they used the word "function" but they were completely wrong to use that word. What they were describing is sites that had some property, or exhibited some phenomenon. These might eventually turn out to be functional but all they were describing is data, not conclusions.
Doolittle et al. (2014) refer to these phenomena or properties as "effects" and the paper is devoted to Distinguishing between "Function" and "Effect" in Genome Biology. The ENCODE Consortium was referring to "effects" and not functions. That should be the end of the story, in my opinion, but, unfortunately, there's some confusion among philosophers over the causal-role definition of function and it may encompass the "effects" that the ENCODE Consortium is talking about.
A working definition of function
I don’t think there’s much point in quibbling about the exact meaning of the word "function" and I don’t think that philosophers are going to make a substantive contribution to the debate other than by pointing out that no definition is entirely satisfactory. However, we do need some sort of definition even if it’s only a "working definition" with recognizable flaws and exceptions.
Personally, I think of a given stretch of DNA as having a "function" if deleting it from the genome affects the survivability of the organism or its progeny. Joe Felsenstein points out in the comments to Part I that this has to be seen in a long-term evolutionary context and not just the immediate survival of the organism and the next generation. That’s certainly the sense in which I think about "function" even if it’s not captured in the working definition. There are lots of other exceptions and quibbles about my preferred working definition. If anyone can offer something better then it can be changed.
There have been four recent papers on the Function Wars (Kellis et al., 2014; Germain et al., 2014; Elliott et al., 2014; Doolittle et al., 2014). None of them have proposed a definition of "function" that we can examine. The closest was Doolittle et al., who propose that function should be tied to selected history (selected-effect or SE functionality). Doolittle offered a more precise definition in a paper he wrote on his own last year (Doolittle, 2013).
... the functions of a trait or feature are all and only those effects of its presence for which it was under positive natural selection in the (recent) past and for which it is under (at least) purifying selection now.
This is a selected-effect definition of function. It's a pretty good definition but it's an historical and not an empirical definition. That distinction isn't terribly important because no definition is rigorous enough to withstand close scrutiny but I note, for the record, that newly evolved functional genes would be excluded by Doolittle's strict definition.
There are two ways to tie function to natural selection: with sequence conservation or without it. It’s possible to have selection for bulk DNA or spacers without regard to sequence and any working definition should make clear which selected-effect function is being proposed. Doolittle's definition doesn't rule out selection in the absence of sequence conservation.
The other problem is that there appear to be legitimate examples of sequence conservation that are NOT tied to function in the sense of the junk DNA debate so any definition must come with qualifiers so that readers don’t assume that it’s exclusive. In other words, the definition has to be a pragmatic definition and not a philosophically rigorous one.
Apparently, the selected-effect definition offered by Doolittle differs substantively from the working definition I've been using. Here's what he says about that in the 2013 paper ...
Another way to attribute function is through experimental ablation: whatever organism-level effect E does not occur after deleting or blocking the expression of a region R of DNA is taken to be the latter’s function. This attribution is close to the everyday understanding of function, as in the function of the carburetor is to oxygenate gasoline. The approach embodies what philosophers would call a causal role (CR) definition of function and supposedly eschews evolutionary or historical justifications. Much biological research into function is done this way, but I think that most biologists consider that experimental ablation indirectly points to SE. They believe that effect E could, under suitable conditions, be shown to have contributed to the past fitness of organisms and most importantly, that R exists as it does because of E.
That sounds okay to me. I don't really care if it's called a causal-role definition or a selected-effect definition as long as it works.
Is junk equivalent to nonfunction?
What about the definition of "junk"? The way I see it, DNA is either functional or it is junk so defining "function" is equivalent to defining "junk." Others see it differently, I think. For them, there’s either a third, unspecified, alternative, or a continuum with a fuzzy boundary. We can discuss this.
The clearest statement that I could find offering a contrary opinion comes from Ryan Gregory in his book on The Evolution of the Genome (Gregory, 2005). He says ...
Not only is ‘junk DNA’ an inappropriate moniker for noncoding DNA in general because of the minority status of pseudogenes within genome sequences, but it also has the unfortunate consequence of instilling a strong a priori assumption of nonfunction. As Zuckerkandl and Hennig (1995) pointed out, ‘given a sufficient lack of comprehension, anything (and that includes a quartet of Mozart) can be declared to be ‘junk.’ Indeed, it is becoming increasingly clear that some noncoding sequences play important regulatory or structural roles.
That statement needs a bit of unpacking in order to get at the true meaning. First, it was written when Gregory was attempting to restrict the definition of "junk DNA" to pseudogenes but that’s no longer his position so we can ignore that part. Second, it criticizes the idea that "junk" is equivalent to "nonfunction" but this is presumably on the grounds that what is called junk might actually have a function. Third, it invokes the idea that "junk" is being used as a synonym for "lack of comprehension" and this is inappropriate.
My position is that the term "junk DNA" is, indeed, a synonym for "nonfunctional DNA." That’s pretty much the working definition I prefer. Like most definitions, it is a form of a priori assumption. That’s not a weakness, it’s a strength. Furthermore, I reject the idea that I, and others, are using a working definition of "junk DNA" as a reflection of our ignorance of the field.
Before moving on to a more specific example, let’s look at the criticism raised by Zuckerkandl and Hennig (1995) in the paper quoted by Ryan above. They say in their opening sentences ...
Given a sufficient lack of comprehension, anything (and that includes a quartet of Mozart) can be declared to be junk. The junk DNA concept has exercized such a hold over a large part of the community of molecular biologists that it appears worth while to reiterate a point made five years ago: heterochromatin is, in fact, a collector’s item.
They go on to describe some functions of heterochromatic regions of the genome. But if these regions really are functional then they are, by definition, not junk. The question before us is whether a large proportion of complex eukaryotic genomes are nothing but junk DNA and the evidence for that is very solid. If, from time to time, some new functions are discovered in that part of the genome, that does not mean that the entire concept has been overthrown and there is no such thing as "junk DNA."
Zuckerkandl believes that most of the genome has a function so he’s not a big fan of junk DNA. But he doesn’t make his case by implying that all junk proponents are ignorant (lack comprehension) and comparing us to someone who would think that a Mozart quartet is junk. This is not a semantic argument over inappropriate meanings of the word "junk." It’s a scientific dispute and the case for, and against, junk DNA has to be resolved by data.
Although the most recent papers on the Function Wars don't offer a definition of "junk," we do have a definition from Ford Doolittle's 2013 paper. He says,
... junk DNA—here specifically understood as DNA that does not encode information promoting the survival and reproduction of the organisms that bear it— ...
That looks like the opposite of his definition of function. I think this is the consensus view these days: junk DNA is DNA that has no function.
Are active transposons junk, or not?
Let’s look at a specific example to see how we can define "junk." The Elliott et al. (2014) paper gives us a nice example to debate. The authors address the question of "selfish DNA" (transposons). It relates to whether the papers by Doolittle and Sapienza (1980) and Orgel and Crick (1980) were arguments FOR junk DNA or AGAINST it. If our genome were actually full of active transposons that were acting selfishly as parasites, then surely this is a "function" and it would be wrong to say that our genome was full of junk. In that sense, the 1980 papers can be seen as arguments AGAINST the idea of junk DNA.2
Fortunately, that’s not the case. We now know that at least half of our genome consists of defective transposons (pseudogenes) and fragments of transposons. That’s junk by any definition unless it can be shown to have a secondary function. I think this is what Elliott et al. would say but, unfortunately, they didn’t say it, so I’m not sure. Thus, it turns out that transposons can be used to explain the origins of junk DNA because the fate of most transposons is death, turning them into pseudogenes.
Active transposons make up only a tiny fraction (<0.1%) of the genome so deciding whether they are junk or functional—or something else—isn’t going to affect the big picture. What it does is help to clarify the discussion. That was one of the goals of the Elliott et al. paper. I don’t think they succeeded. Here’s how they describe the problem ....
As described in this article, ascribing functions to specific components of the genome is uniquely challenging when the sequences involved are transposable elements. Their capacity for autonomous replication creates several major complications that confound the use of functional assessments typically implemented in studies of genes or regulatory regions.
Elliott et al. propose to (partially) solve this problem by distinguishing between different levels of function. This is the same approach used in the Doolittle et al. (2014) paper and it’s described much better there so I’ll quote Doolittle et al. (Stefan Linquist and Ryan Gregory are authors on both papers).
... the trait or its effects could indeed be a product of natural selection, but at a level of organization lower (intragenomic) or higher (population or species) than the usual level of evolutionary explanation, namely organisms and their fitness-determining genes. No one would consider the induction and replication of prophages to be the evolutionary "function" of bacterial cells; instead, it is well understood that there is selection at the level of the viruses themselves as well as among their bacterial hosts, so this would be a function of the prophages, not their hosts. Likewise, it would be odd to consider the harboring of nonviral retroelements to be a function of the human genome. These and other transposable elements are indeed products of selection, but at the intragenomic level rather than the organismal level, at least initially. Similarly, the wide prevalence (though probably not the origin) of sexual reproduction might best be explained by reference to selection above the organism level (i.e., among lineages). At every level at which selection might be said to operate, we imagine that the CR/SE distinction can be applied. Strictly speaking, some traits that are nonfunctional at the organism level might possess intragenomic or supra-organismal selected effects. Since the usual focus of functional discourse is on organisms, features selected positively or negatively at higher or lower levels but neutral (or negative) for organisms are considered to have only causal role functions for the purposes of figure 1.
The genome can be divided into three components:
- Nonfunctional DNA by any definition: This is presumably equivalent to junk.
- True functional DNA: These are regions of DNA that have a function at the organism level and they would count as functional by any reasonable definition.
- Functional DNA at some other level: This would include transposons and other forms of selfish or parasite DNA but it also might include "higher level" functions that are only manifest at the population level.
If active transposons are put in the junk DNA category then these would be examples of DNA with “functions” at some level but junk at another level. Conceptually that’s not much different than the ENCODE proposal (I think). They would also be examples of genes that encode functional enzymes but are still “junk.” In other words, junk coding DNA. I’m not comfortable with that.
T. Ryan Gregory: Sorry, Larry -- you know I appreciate your blog posts, but it's clear that you didn't understand the paper(s) you criticize. As to your question: most active TEs are probably non-functional at the organism level. Some may be functional, and some may simply have beneficial side effects for the organism. This is all laid out in detail in the article.
Laurence A. Moran: That's exactly how I interpreted your paper. The only part that was missing was the part where you said that active transposon sequences are junk DNA. Is that what you believe?
T. Ryan Gregory: I just answered that question. Help me to understand what you're not following in my response or the paper.
Laurence A. Moran: Are the words "non-functional" and "junk" synonyms? If so, then you answered my question. You believe that active transposons have a function at one level, but they are junk at another level. I disagree with you and that's what I wrote in my "muddled" post. I think it's extraordinarily muddled to say that a sequence that is transcribed to produce transposase or reverse transcriptase is "nonfunctional" or "junk" at any level. But the main point of my post is that it's muddled and unproductive to even have this metadiscussion where we quibble about the meaning of the word "function."
T. Ryan Gregory: As I said, I really don't think you understood the papers or the arguments. Or perhaps you truly are not interested in dealing with the implications of different concepts of "function" (a bait and switch around which was how ENCODE made their hype campaign), and (apparently) you don't see multi-level selection as relevant, then we're really far apart on this point. In either scenario, it feels like it would be not a useful endeavour to go back and forth on our respective blogs. I've published what I think about the topics already.
If active transposons are neither junk nor functional (at the organismal level) then you can’t define “junk” as just DNA that doesn’t have a function because there would now be a third category of DNA that doesn’t have a function at one level but isn’t junk at that level. In this case, the third category would be “selfish DNA.” I assume there are additional categories such as the population-level category.
I prefer to avoid the discussion about different kinds of function and just say that active transposons have a function and, therefore, they are not junk. When describing the reasons why certain parts of the genome exist, it doesn’t really matter to me whether they were selected for selfish reasons or for survival of the species. The problem is that this conflicts with my working definition of function since these active transposons could be deleted without harming the organism. Transposons also appear to be excluded by Doolittle's selected-effect definition but I'm not certain about that.
I don’t see a way out of this conundrum and I don’t think the paper by Elliott et al. was very helpful in this regard. As a matter of fact, I don't believe that it's possible to resolve these Function Wars by publishing more papers on the meaning of "function" or "junk."
Can anyone help? Do you have a philosophically sound definition of "function" or "junk" that will "clarify" the discussion? Do you think that junk DNA is any stretch of DNA that doesn't have a function?
1. I thank Alex Palazzo for coming up with the term "function wars."
2. I recently re-read the entire collection of Nature papers from 1980 (Doolittle and Sapienza, 1980; Orgel and Crick, 1980; Cavalier-Smith, 1980; Dover, 1980; Dover and Doolittle, 1980; Orgel, Crick and Sapienza, 1980; Jain, 1980). It's amazing how much these papers still remain relevant today. In fact, I venture the opinion that none of the recent Function Wars papers adds anything substantive to the debate from 34 years ago!
Cavalier-Smith, T. (1980) How selfish is DNA? Nature 285, 617-618. [doi: 10.1038/285617a0]
Doolittle, W. F. and Sapienza, C. (1980) Selfish genes, the phenotype paradigm and genome evolution. Nature 284, 601-603.
Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) published online March 11, 2013. [doi: 10.1073/pnas.1221376110]
Doolittle, W.F., Brunet, T.D., Linquist, S., and Gregory, T.R. (2014) Distinguishing between “function” and “effect” in genome biology. Genome Biology and Evolution 6, 1234-1237. [doi: 10.1093/gbe/evu098]
Dover, G. (1980) Ignorant DNA? Nature 285, 618-619.
Dover, G., and Doolittle, W.F. (1980) Modes of genome evolution. Nature 288, 646-647.
Elliott, T. A., Linquist, S. and Gregory, T. R. (2014) Conceptual and empirical challenges of ascribing functions to transposable elements. The American Naturalist 184, 14-24. [doi: 10.1086/676588]
Germain, P.-L., Ratti, E. and Boem, F. (2014) Junk or functional DNA? ENCODE and the function controversy. Biology & Philosophy, 1-25. (published online March 21, 2014) [doi: 10.1007/s10539-014-9441-3]
Gregory, T. R. (2005) Genome Size Evolution in Animals. In The Evolution of the Genome (Gregory, T. R., ed.), pp. 3-87, Elsevier Academic Press, New York, Oxford etc.
Jain, H.K. (1980) Incidental DNA. Nature 288, 647-648.
Kellis, M., Wold, B., Snyder, M. P., Bernstein, B. E., Kundaje, A., Marinov, G. K., Ward, L. D., Birney, E., Crawford, G. E. and Dekker, J. (2014) Defining functional DNA elements in the human genome. Proceedings of the National Academy of Sciences 111, 6131-6138. [doi: 10.1073/pnas.1318948111]
Orgel, L. E. and Crick, F. H. (1980) Selfish DNA: the ultimate parasite. Nature 284, 604-607. [doi: 10.1038/284604a0]
Orgel, L.E., Crick, F.H.C., and Sapienza, C. (1980) Selfish DNA. Nature 288, 645-646.
Zuckerkandl, E., and Hennig, W. (1995) Tracking heterochromatin. Chromosoma 104, 75-83.