Monday, April 06, 2020

The Function Wars Part VII: Function monism vs function pluralism

This post is mostly about a recent paper published in Studies in History and Philosophy of Biol & Biomed Sci where two philosophers present their view of the function wars. They argue that the best definition of function is a weak etiological account (monism) and pluralistic accounts that include causal role (CR) definitions are mostly invalid. Weak etiological monism is the idea that sequence conservation is the best indication of function but that doesn't necessarily imply that the trait arose by natural selection (adaptation); it could have arisen by neutral processes such as constructive neutral evolution.

The paper makes several dubious claims about ENCODE that I want to discuss but first we need a little background.


The ENCODE publicity campaign created a lot of controversy in 2012 because ENCODE researchers claimed that 80% of the human genome is functional. That claim conflicted with all the evidence that had accumulated up to that point in time. Based on their definition of function, the leading ENCODE researchers announced the death of junk DNA and this position was adopted by leading science writers and leading journals such as Nature and Science.

Let's be very clear about one thing. This was a SCIENTIFIC conflict over how to interpret data and evidence. The ENCODE researchers simply ignored a ton of evidence demonstrating that most of our genome is junk. Instead, they focused on the well-known facts that much of the genome is transcribed and that the genome is full of transcription factor binding sites. Neither of these facts was new and both of them had simple explanations: (1) most of the transcripts are spurious transcripts that have nothing to do with function, and (2) random non-functional transcription factor binding sites are expected from our knowledge of DNA binding proteins. The ENCODE researchers ignored these explanations and attributed function to all transcripts and all transcription factor binding sites. That's why they announced that 80% of the genome is functional.

Here's a reminder of what ENCODE actually said in 2012 (The ENCODE Project Consortium, 2012). The lead author is Ewan Birney, the ENCODE Consortium leader/spokesperson.
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions.
As I said above, the real controversy is about the science and not about philosophical debates over the meaning of function. In 2012 it was ridiculous to dismiss all of the evidence for junk DNA and focus on transcripts and binding sites that knowledgeable scientists knew were spurious. It was ridiculous to claim that 80% of the human genome was functional. [see The truth about ENCODE and What did the ENCODE Consortium say in 2012?]

Evolution is at the heart of the controversy and that's why the ENCODE researchers were vehemently opposed by experts in molecular evolution. This strong opposition from knowledgeable experts is why the ENCODE leaders partially retracted their claim in 2014 (Kellis et al., 2014). As some scientists have pointed out, the ENCODE supporters have an adaptationist (panglossian) view of evolution that's out of touch with modern views of evolution at the molecular level. Here's a nice summary in a paper by Casane et al. (2015).
In September 2012, a batch of more than 30 articles presenting the results of the ENCODE (Encyclopaedia of DNA Elements) project was released. Many of these articles appeared in Nature and Science, the two most prestigious interdisciplinary scientific journals. Since that time, hundreds of other articles dedicated to the further analyses of the Encode data have been published. The time of hundreds of scientists and hundreds of millions of dollars were not invested in vain since this project had led to an apparent paradigm shift: contrary to the classical view, 80% of the human genome is not junk DNA, but is functional. This hypothesis has been criticized by evolutionary biologists, sometimes eagerly, and detailed refutations have been published in specialized journals with impact factors far below those that published the main contribution of the Encode project to our understanding of genome architecture. In 2014, the Encode consortium released a new batch of articles that neither suggested that 80% of the genome is functional nor commented on the disappearance of their 2012 scientific breakthrough. Unfortunately, by that time many biologists had accepted the idea that 80% of the genome is functional, or at least, that this idea is a valid alternative to the long held evolutionary genetic view that it is not. In order to understand the dynamics of the genome, it is necessary to re-examine the basics of evolutionary genetics because, not only are they well established, they also will allow us to avoid the pitfall of a panglossian interpretation of Encode. Actually, the architecture of the genome and its dynamics are the product of trade-offs between various evolutionary forces, and many structural features are not related to functional properties. In other words, evolution does not produce the best of all worlds, not even the best of all possible worlds, but only one possible world.
There are many other scientists who have made the same points about ENCODE. I especially want to recommend Ford Doolittle's critique [Ford Doolittle's Critique of ENCODE].

On the meaning of the word "function"

The Sanger Institute (Cambridge, UK) was an important player in the ENCODE Consortium. It put out a press release on the day the papers were published [Google Earth of Biomedical Research]. The opening paragraph is ...

The ENCODE Project, today, announces that most of what was previously considered as 'junk DNA' in the human genome is actually functional. The ENCODE Project has found that 80 per cent of the human genome sequence is linked to biological function.

Many of us believe that 90% of our genome is junk and only 10% is functional so clearly there's a disagreement over the meaning of the word "function." This debate has spawned the Function Wars.

Function warriors focus their attention on two definitions of function that have long been discussed by philosophers. The causal-role (CR) definition depends on identifying a role for a particular sequence; for example, a sequence that binds a transcription factor or a DNA sequence that's transcribed. The mere existence of an identifiable role for such a sequence is evidence of CR function. This is clearly nonsense because it ignores the fact that transcription factors can bind randomly to any DNA sequence that resembles a functional binding site and many DNA sequences are transcribed fortuitously by RNA polymerase from time to time creating a background noise of junk RNA transcripts.
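The point about random binding sites can be made concrete with a back-of-the-envelope calculation. The sketch below is only an illustration under loudly simplified assumptions that are not from the ENCODE data: bases are treated as independent and equally frequent, the transcription factor is assumed to recognize one exact motif, and the motif lengths (6, 8, 10) are hypothetical. The function name `expected_chance_sites` is made up for this example.

```python
def expected_chance_sites(genome_length: int, motif_length: int,
                          both_strands: bool = True) -> float:
    """Expected number of exact matches to one fixed motif in random DNA.

    Assumes independent, equally frequent bases (probability 1/4 each),
    which real genomes do not satisfy; this is an order-of-magnitude guide.
    """
    positions = genome_length - motif_length + 1   # possible start sites
    per_position = 0.25 ** motif_length            # chance of a match at one site
    strands = 2 if both_strands else 1             # a motif can sit on either strand
    return positions * per_position * strands


GENOME_LENGTH = 3_000_000_000  # "nearly three billion bases"

# Hypothetical motif lengths, chosen only for illustration.
for motif_length in (6, 8, 10):
    count = expected_chance_sites(GENOME_LENGTH, motif_length)
    print(f"{motif_length}-base motif: about {count:,.0f} chance matches")
```

Under these assumptions even a 10-base motif is expected to turn up thousands of times by chance alone, so simply detecting a binding event says little about whether the site does anything. Counting both strands double-counts palindromic motifs, and real transcription factors tolerate mismatches, which would push the numbers higher.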

The CR definition of function is pretty much useless as the only meaningful definition of function in biology, although identifying a causal role can be the first step in determining whether a sequence is actually functional. This is why Doolittle et al. (2014) recommend that scientists and philosophers stop talking about CR "function" and, instead, refer to CR "effects" (

The selected effect (SE) function is the other definition. To understand it, let's assume that all you have is sequence information and you are interested in determining how much of the genome is functional. One of the best ways of doing this is to look at which sequences are "conserved." By this I mean sequences that change more slowly than expected given the known rate of mutation.

The sequence must be conserved in closely related species and also within the population. If it's conserved then this is powerful evidence that it's under negative selection and that is the best evidence we have that a sequence currently carries out a biological function in the species.
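As a toy illustration of what "changing more slowly than expected" means, the sketch below compares the fraction of differing positions in two aligned sequences with an assumed neutral divergence. Everything in it is hypothetical: the sequences, the neutral divergence of 0.30, the cutoff ratio, and the function names. Real analyses use many species and explicit phylogenetic models rather than a single pairwise comparison.

```python
def divergence(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions that differ between two sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def looks_conserved(seq_a: str, seq_b: str, neutral_divergence: float,
                    cutoff_ratio: float = 0.5) -> bool:
    """True if observed divergence is well below the neutral expectation.

    cutoff_ratio is an arbitrary illustrative threshold, not a standard value.
    """
    return divergence(seq_a, seq_b) < cutoff_ratio * neutral_divergence


# Made-up aligned sequences from two hypothetical species (1 difference in 20).
species_a = "ACGTACGTACGTACGTACGT"
species_b = "ACGTACGTACGAACGTACGT"

print(divergence(species_a, species_b))
print(looks_conserved(species_a, species_b, neutral_divergence=0.30))
```

A region that diverges much less than neighbouring neutral sequence is a candidate for being under negative selection; the comparison within a population follows the same logic with polymorphism data instead of between-species differences.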

The latest evidence on conservation in the human genome indicates that about 8% is under negative selection. This is consistent with decades of work on genetic load showing that species could not survive if more than (roughly) 10% of the genome had to be conserved.
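The load argument comes down to simple arithmetic: the larger the functional fraction, the more new mutations land in DNA where they can do harm. The sketch below is a crude illustration, not a population-genetics model. The figure of 100 new mutations per individual per generation is an assumed round number that does not appear in this post, and the function name is hypothetical; the 8%, 10%, and 80% fractions are the ones discussed in the text.

```python
def mutations_in_functional_dna(new_mutations: float,
                                functional_fraction: float) -> float:
    """Expected number of new mutations falling in functional DNA.

    Assumes mutations are spread evenly across the genome.
    """
    if not 0.0 <= functional_fraction <= 1.0:
        raise ValueError("functional_fraction must be between 0 and 1")
    return new_mutations * functional_fraction


NEW_MUTATIONS_PER_GENERATION = 100  # assumed round number, for illustration only

for label, fraction in (("8% conserved", 0.08),
                        ("10% functional", 0.10),
                        ("80% functional (ENCODE 2012)", 0.80)):
    hits = mutations_in_functional_dna(NEW_MUTATIONS_PER_GENERATION, fraction)
    print(f"{label}: about {hits:.0f} new mutations in functional DNA per generation")
```

Not every mutation in functional DNA is deleterious, so these are upper bounds on the harmful ones, but the tenfold gap between the 8% and 80% scenarios is the heart of the load argument.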

But sequence information alone will not tell you the actual function of a particular sequence. For that you need biochemists and molecular biologists who look at the roles that sequences play in the life of the cell/organism. Decades of work have demonstrated a strong correlation between biological function and conservation so that we can be extremely confident that conservation really does indicate function. This is how we know the genes, regulatory sequences, origins of replication, centromeres, and telomeres are functional parts of the genome.

The ENCODE Consortium relied exclusively on CR function to conclude that 80% of the genome is functional. They rejected the SE definition (see Stamatoyannopoulos, 2012). All knowledgeable scientists now agree that ENCODE was wrong and that sequence conservation is the best evidence of function.

What did ENCODE researchers really think?

In a paper published last December (2019), Brzović and Šustar claim that ENCODE used a very broad definition of function that confused actual function with evidence that a particular sequence is likely to have function. They reference Germain et al. (2014)1 as another example of philosophers who support this interpretation of the ENCODE claims.
What the proponents of ENCODE call function is, according to Germain et al. (2014), merely something that is likely to have a function. In this reading, ENCODE's biochemical function refers to activities that can be taken as evidence of potential biological functions.
This is function pluralism because it invokes both SE and CR definitions. Brzović and Šustar say that there are two versions of function pluralism: methodological pluralism and theoretical pluralism. Methodological pluralism invokes both causal role data and conservation data in an effort to demonstrate true function without making a commitment as to whether CR and/or SE definitions of function are accurate. Theoretical pluralism, on the other hand, is the view that both causal roles (CR) and conservation (SE) can prove real biological function.

The ENCODE leaders were justifiably invoking methodological pluralism in an attempt to discover function, according to Brzović and Šustar. They weren't arguing that causal roles on their own could demonstrate function.

I suppose one could look at the ENCODE leaders' "retraction" paper as support for this view (see Kellis et al. 2014) but in my opinion it is misguided—it is revisionist history in the worst sense of the word. I think the publicity campaign in 2012 showed unequivocally that ENCODE leaders really thought they had discovered true function putting to rest the idea that most of our genome is junk. They were not just advancing a hypothesis about whether 80% of the genome might possibly have a function based on their data—they actually claimed that it did have a function. They made no effort to question or correct any of the press reports saying that 80% of the human genome is functional, not junk. The fact that they backed off this claim under pressure from knowledgeable scientists doesn't mean they were misinterpreted in 2012.

"Weak etiological monism as a way out of the controversy"

The best way to resolve the controversy over how much of our genome is functional is to use scientific evidence to resolve ambiguous claims of function. Are all conserved sequences currently functional? The answer is "no" because we have examples. Are there non-conserved regions of the genome that are functional? The answer is "yes" because we have examples. The SE definition is not sufficient to resolve the controversy [see The Function Wars Part VI: The problem with selected effect function].

Are there lots of causal-role sequences that aren't functional? The answer is "yes" because we have examples. Is 90% of our genome junk? The answer is probably "yes" because that's what the cumulative data shows. Biology is messy and no strict definition of the word "function" is going to cover every possibility.

Brzović and Šustar think they have come up with a philosophical way of resolving the controversy and they describe it in the last section of their paper. It covers four pages under the heading, "Weak etiological monism as a way out of the controversy." They say that etiological accounts are those that define function in terms of a trait's evolutionary history. The strong version, which they say is the standard SE version, defines function solely in terms of whether a particular trait has arisen by adaptation (whether it was selected for in the past). They quote Ford Doolittle as a proponent of strong SE monism because he said,
... the functions of a trait or feature are all and only those effects of its presence for which it was under positive natural selection in the (recent) past and for which it is under (at least) purifying selection now. (Doolittle, 2013)
The world is not inhabited exclusively by fools and when a subject arouses intense interest and debate, as this one has, something other than semantics is usually at stake.

Stephen Jay Gould (1982)
Weak etiological monism recognizes that a trait may have arisen by constructive neutral evolution but it is now maintained by natural selection (purifying selection).2 Thus, in the weak version, it is not necessary that a functional trait have arisen by positive selection (adaptation), only that it is currently conserved by purifying selection. Brzović and Šustar recognize that Doolittle is probably a proponent of weak monism in spite of the definition he gave in his 2013 paper. They are correct; Doolittle has long been a proponent of constructive neutral evolution so he's quite familiar with the idea that a currently functional trait may have arisen by means other than adaptation.

I'd like to emphasize that no matter what you call it, the best evidence for function is whether a given stretch of DNA is currently being conserved by purifying selection and this definition is entirely based on scientific data. It's true that many scientists talk about SE definition as a historical definition and it's true that they often refer to selected effects as those that have arisen by adaptation (i.e. strong etiological monism). However, that's just sloppy writing because most proponents of SE function are well aware of traits that could have arisen by non-adaptive processes but are nevertheless currently under negative selection.

In summary, the essence of the Brzović and Šustar paper is that ENCODE may have only been proposing a possible function for 80% of the genome and that somewhat justifies their reliance on causal-role effects. Brzović and Šustar then propose that the SE definition of function should be mostly restricted to sequences currently under negative selection and not just to DNA that has arisen by adaptation; it should encompass traits that have arisen by neutral processes.

Function Wars
(My personal view of the meaning of function is described at the end of Part V.)

1. See The Function Wars: Part I for a discussion of the Germain et al. paper.

2. They refer to the evolution of the spliceosome as an example of constructive neutral evolution [see Constructive Neutral Evolution].

Brzović, Z., and Šustar, P. (2020) Postgenomics function monism. Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences, 101243. [doi: 10.1016/j.shpsc.2019.101243]

Casane, D., Fumey, J., and Laurenti, P. (2015) L’apophénie d’ENCODE ou Pangloss examine le génome humain. médecine/sciences, 31:680-686. [doi: 10.1051/medsci/20153106023]

Doolittle, W. F. (2013) Is junk DNA bunk? A critique of ENCODE. Proceedings of the National Academy of Sciences 110:5294-5300. [doi: 10.1073/pnas.1221376110]

Germain, P.-L., Ratti, E., and Boem, F. (2014) Junk or functional DNA? ENCODE and the function controversy. Biology & Philosophy 29:807-821. [doi: 10.1007/s10539-014-9441-3]

Kellis, M., Wold, B., Snyder, M. P., Bernstein, B. E., Kundaje, A., Marinov, G. K., Ward, L. D., Birney, E., Crawford, G. E., Dekker, J., Dunham, I., Elnitski, L., Farnham, E. A., Gerstein, M., Giddings, M. C., Gilbert, D. M., Gingeras, T. R., Green, E. D., Guigo, R., Hubbard, T., Kent, J., Lieb, J. D., Myers, R. M., Pazin, M. J., Ren, B., Stamatoyannopoulos, J. A., Weng, Z., White, K. P., and Hardison, R. C. (2014) Defining functional DNA elements in the human genome. Proceedings of the National Academy of Sciences, 111:6131-6138. [doi: 10.1073/pnas.1318948111]

The ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature, 489:57-74. [doi: 10.1038/nature11247]


apalazzo said...

Maybe the best response to this paper is the one published just this past week by Stefan Linquist, Ford Doolittle, and me.

Here we point out that many individuals assume that a SE functional trait must have arisen by positive selection and we point out that this is false. Traits can be currently under purifying selection without having ever arisen by positive selection.

Larry Moran said...

Be patient, I'll get to your paper in a few days. I wanted to finish my post on the Brzović and Šustar paper before addressing your paper because their paper was published four months ago and they make the same point that you make. You didn't reference the Brzović and Šustar paper in your recent paper.

Here's what Brzović and Šustar say in their December 2019 paper ....

"The main differences in function attributions is that our weak etiological account allows a wider range of genomic elements to be considered functional than is the case with the SE account. Most notably, the cases where the traits have arisen through a neutral process and not through selection. Such cases are invoked by proponents of constructive neutral evolution (see Graym 2012; Stoltzfus, 1999). One example is the eukaryotic spliceosome ..."