Sandwalk: On the Meaning of the Word "Function"

Friday, March 15, 2013

A lot of the debate over ENCODE's publicity campaign concerns the meaning of the word "function." In the summary article published in Nature last September the authors said, "These data enabled us to assign biochemical functions for 80% of the genome ...." (The ENCODE Project Consortium, 2012).

Here's how they describe function.

Operationally, we define a functional element as a discrete genome segment that encodes a defined product (for example, protein or non-coding RNA) or displays a reproducible biochemical signature (for example, protein binding, or a specific chromatin structure).

What, exactly, do the ENCODE scientists mean? Do they think that junk DNA might contain "functional elements"? If so, that doesn't make a lot of sense, does it?

Ewan Birney tried to address this definitional morass on his blog [ENCODE: My own thoughts] where he says ....

It’s clear that 80% of the genome has a specific biochemical activity – whatever that might be. This question hinges on the word “functional” so let’s try to tackle this first. Like many English language words, “functional” is a very useful but context-dependent word. Does a “functional element” in the genome mean something that changes a biochemical property of the cell (i.e., if the sequence was not here, the biochemistry would be different) or is it something that changes a phenotypically observable trait that affects the whole organism? At their limits (considering all the biochemical activities being a phenotype), these two definitions merge. Having spent a long time thinking about and discussing this, not a single definition of “functional” works for all conversations. We have to be precise about the context. Pragmatically, in ENCODE we define our criteria as “specific biochemical activity” – for example, an assay that identifies a series of bases. This is not the entire genome (so, for example, things like “having a phosphodiester bond” would not qualify). We then subset this into different classes of assay; in decreasing order of coverage these are: RNA, “broad” histone modifications, “narrow” histone modifications, DNaseI hypersensitive sites, Transcription Factor ChIP-seq peaks, DNaseI Footprints, Transcription Factor bound motifs, and finally Exons.

That's about as clear as mud.

We all know what the problem is. It's whether all binding sites have a biological function or whether many of them are just noise arising as a property of DNA binding proteins. It's whether all transcripts have a biological function or whether many of those detected by ENCODE are just spurious transcripts or junk RNA. These questions were debated extensively when the ENCODE pilot project was published in 2007. Every ENCODE scientist should know about this problem so you might expect that they would take steps to distinguish between real biological function and nonfunctional noise.

Their definition of "function" is not helpful. In fact, it seems deliberately designed to obfuscate.

Let's see how other scientists interpret the ENCODE results. In a News & Views article published in Nature last September, Joseph R. Ecker (Salk Institute scientist) said ...

One of the more remarkable findings described in the consortium's 'entré' paper is that 80% of the genome contains elements linked to biochemical function, dispatching the widely held view that the human genome is mostly 'junk DNA.'

That makes at least one genomics worker who thinks that "biochemical function" and junk DNA are mutually exclusive.

Recently a representative of GENCODE responded to Dan Graur's criticism [On the annotation of functionality in GENCODE (or: our continuing efforts to understand how a television set works)]. This person (JM) says ...

Q1: Does GENCODE believe that 80% of the genome is functional?

As noted, we will only discuss here the portion of the genome that is transcribed. According to the main ENCODE paper, while 80% of the genome appears to have some biological activity, only “62% of genomic bases are reproducibly represented in sequenced long (>200 nucleotides) RNA molecules or GENCODE exons”. In fact, only 5.5% of this transcription overlaps with GENCODE exons. So we have two things here: existing GENCODE models largely based on mRNA / EST evidence, and novel transcripts inferred from RNAseq data. The suggestion, then, is that there is extensive transcription occurring outside of currently annotated GENCODE exons.

There's another scientist who thinks that 80% of the genome has some biological activity in spite of the fact that the ENCODE paper says it has "biochemical function." I don't think "biological activity" is compatible with "junk DNA," but who knows what they think?

Since this person is part of the ENCODE team, we can assume that at least some of the scientists on the team are confused.

The Sanger Institute (Cambridge, UK) was an important player in the ENCODE Consortium. It put out a press release on the day the papers were published [Google Earth of Biomedical Research]. The opening paragraph is ...

The ENCODE Project, today, announces that most of what was previously considered as 'junk DNA' in the human genome is actually functional. The ENCODE Project has found that 80 per cent of the human genome sequence is linked to biological function.

It looks like the Sanger Institute equates "biochemical function" and "biological function" and it looks like neither one is compatible with junk DNA.

I think the ENCODE leaders, including Ewan Birney, knew exactly what they were doing when they defined function. They meant "biological function" even though they equivocated by saying "biochemical function." And they meant for this to be interpreted as "not junk" even though they are attempting to backtrack in the face of criticism.

Function Wars
(My personal view of the meaning of function is described at the end of Part V.)

The ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57-74. (E. Birney, corresponding author)

25 comments :

NickM said...: Good post! But:

"I don't think "biological activity" is compatible with "junk DNA," but who knows what they think?"

<-- Mere "activity" certainly is compatible with "junk DNA", did you mean to say that or is this a typo?; Friday, March 15, 2013 3:37:00 PM
John Harshman said...: I believe Larry is distinguishing biological activity from biochemical activity, in which biological activity is limited to activity that has some effect on the biology of the cell or organism. Noise would be expected generally to have no such effect.; Friday, March 15, 2013 4:06:00 PM
Markk said...: I guess what I would like to ask would be something like this:

What part of the DNA sequence that ENCODE calls functional would cause any change to an active cell, other than responding to the assays differently, if it was changed to a different sequence with similar physical properties? Would the cell function differently? Say for example a stretch that codes for "RNA" as said above, suppose the RNA could even be identified as some small piece of a gene? Suppose a different variety of that gene was coded for instead. Would there be any change to the activity of the cell except for this expression?

Obviously even non-functional DNA physically takes up space and so causes some effects on a cell, but evidently the ENCODE Team drew the line there.

If their assays, or something similar, are the only way to tell apart cells carrying different varieties of the "Functional" stretches, and there is no other expressed effect, then I would call their usage misleading and worth calling out.; Friday, March 15, 2013 4:13:00 PM
Diogenes said...: I agree with Nick. Recall that in David Comings' book from 1972, he knew that at least 25% of the mouse genome was transcribed, much more than the coding regions, and Comings explicitly stated in 1972 that much Junk DNA will be transcribed.

This means that transcription is compatible with Junk, and always was. Anyone who says otherwise is an idiot, possibly small i.

If transcription = activity, then activity is compatible with Junk.; Friday, March 15, 2013 4:14:00 PM
khms said...: As I see it, if they define function as biochemical activity, and transcription counts, then the answer has to be all of it, because then mitosis has to count.

If, to avoid this, mitosis shouldn't count, then neither should transcription.

Having it both ways is dishonest.; Friday, March 15, 2013 6:02:00 PM
John Harshman said...: Diogenes: So far all we're arguing about is what "biological activity" means. Larry appears to define it to exclude non-functional transcription. This seems reasonable to me. Calm down and actually read what people say, rather than interpreting every comment as an assault on the reality of junk DNA.; Friday, March 15, 2013 6:09:00 PM
Diogenes said...: John: Calm down and actually read what people say

Your comment was published while I was composing mine, so it's not like I skipped over it.

In Larry's post he mentions
"biochemical function"
"biological function"
"biochemical activity"
"biological activity"

This will get confusing very fast.

Junk DNA may be transcribed into RNA, as Comings defined it.

Biochemical activity is a superset of DNA transcribed into RNA, as Birney defines it.

Thus Junk DNA overlaps biochemical activity, neither subset nor superset. Venn diagram.

Is biological activity non-compatible with Junk?

Is biological activity also a subset of biochemical activity?

Then the Junk that is biochemically active and the DNA that is biologically active are two non-overlapping subsets of biochemical activity.

What could be simpler?; Friday, March 15, 2013 11:13:00 PM
John Harshman said...: Sorry, thought you were responding to me. I hope "What could be simpler?" was meant ironically, but I can't tell for sure. But yeah, I think Larry was using "biologically active" to refer to functional DNA.; Friday, March 15, 2013 11:55:00 PM
SPARC said...: Is there any information about the number of transcripts derived from a single junk sequence per cell compared to the number of transcripts of a lowly expressed, well-defined gene?; Saturday, March 16, 2013 6:23:00 AM
Joe Felsenstein said...: Most scientists, hearing phrases like "biochemical activity", "biological function", etc. conclude one thing -- that the site is doing something that makes a difference to the fitness of the organism, and hence is not "junk". The fundamental fact is that the way ENCODE publicized their work (as excellent as most of the work was) has persuaded the popular science press, and also the public, and also most other scientists, that "junk DNA" was a mistaken notion, that essentially all of the genome is doing something that matters to the fitness of the organism.

It will take probably about 10 years for people to realize that the notion of junk DNA was not a delusion. If the ENCODE consortium wants to help in that process, it can. But so far all I have seen from their publicity machine is elaborate circumlocution.; Saturday, March 16, 2013 9:44:00 AM
Claudiu Bandea said...: Joe Felsenstein says: “It will take probably about 10 years for people to realize that the notion of junk DNA was not a delusion”

I think most scientists, including many members of the ENCODE project, have realized already that the data produced by this project, as valuable as it might be, do not indicate that 80% of the human genome is functional. However, this does not mean that the notion of junk DNA (jDNA) is not a delusion.

As I discussed at length here and elsewhere (e.g., see http://comments.sciencemag.org/content/10.1126/science.337.6099.1159), the so-called jDNA provides a defense mechanism against insertional mutagenesis and, therefore, it’s functional. This is an indisputable, statistical fact and, therefore, the concept of jDNA is indeed a delusion.

There is no way, Joe, that you will dispute the fact that jDNA provides a defense mechanism against insertional mutagenesis. Apparently, betting is popular here at Sandwalk, so I’ll bet that your answer, Joe, will confirm my assertion; obviously, choosing not to respond to this challenge will confirm my claim.; Sunday, March 17, 2013 10:57:00 AM
Joe Felsenstein said...: I pointed out 6 months ago, in an earlier discussion on Sandwalk, what is wrong with your argument. Your response did not deal with the issue I raised -- that natural selection to retain a piece of junk DNA would usually be extremely weak. So I will just point you there again. Obviously failing once again to deal with my counterargument will confirm my claim! (Pointing out natural cases with little junk DNA such as hummingbirds and pufferfish does not deal with my argument).; Sunday, March 17, 2013 11:52:00 AM
Georgi Marinov said...: That's not a sequence-specific function, and given the vastness of the genome, it's not really a function at all because even you can delete many megabases of junk DNA and the genome will still be big and protected against insertational mutagenesis.; Sunday, March 17, 2013 11:54:00 AM
Claudiu Bandea said...: Joe Felsenstein: “Your response did not deal with the issue I raised -- that natural selection to retain a piece of junk DNA would usually have extremely weak selection”

The reason I didn’t specifically address your response was that, just like your response here, your previous comment allowed for some “quite small” or “extremely weak selection” on retaining “a piece of junk DNA.” I completely agree with that: indeed, the selection acting on each piece of jDNA is extremely small.

However, the selection to retain a large population of jDNA pieces as a defense mechanism against insertional mutagenesis is extremely high. Take for example our immune system which is made up of hundreds of different components, including dozens of antimicrobial peptides. The selection force to retain a specific antimicrobial peptide is probably very small, but that for retaining a large population of antimicrobial peptides is very high. You are an expert in ‘quantitative biology’ and I think you and other experts in this field would be able to confirm or refute my argument analytically.

Although we both look at biological phenomena primarily from a selection perspective, let’s consider that jDNA evolved purely by genetic drift, in the absence of any selection for a particular benefit to the host. My question to you is: does jDNA serve currently as a protective mechanism against insertional mutagenesis by endogenous and exogenous viral elements?

I think you agree with me that the answer to this question is clearly: YES!

Georgi Marinov: “That's not a sequence-specific function, and given the vastness of the genome, it's not really a function at all because even you can delete many megabases of junk DNA and the genome will still be big and protected against insertational mutagenesis”

Apparently, you are not familiar with the notion that in addition to its informational role, DNA can have other functions (please see http://comments.sciencemag.org/content/10.1126/science.337.6099.1159, or the recent PNAS paper by W. Ford Doolittle).

Regarding your comment on the phenotypic effects of deleting jDNA, see my answer to Joe above.; Sunday, March 17, 2013 1:37:00 PM
Joe Felsenstein said...: [Claudiu Bandea]: The reason I didn’t specifically address your response was that, just like your response here, your previous comment allowed for some “quite small” or “extremely weak selection” on retaining “a piece of junk DNA.” I completely agree with that: indeed, the selection acting on each piece of jDNA is extremely small.

Good to see that you agree with me on that. That's progress.

[Claudiu Bandea]: However, the selection to retain a large population of jDNA pieces as a defense mechanism against insertional mutagenesis is extremely high.

Yes, it could be moderate-to-high. That might explain selection for a mechanism that brings a lot of junk DNA into existence at one go. But of course if the junk DNA was itself copies of transposons, and they were active, that would also be bringing into existence more insertional mutagenesis, as well as protecting against some of it.

On the other hand if the junk DNA came and went in small pieces, then my argument would apply -- the strength of natural selection on those changes will be very weak. Much weaker than the strength of selection for an individual antimicrobial peptide, obviously. There may be hundreds of antimicrobial peptides but the number of pieces (of similar length) of junk DNA is millions.

You can't just wave your hand and say that junk DNA is, as a whole, good for you, and think that you have explained its presence. As far as I can see you're not accounting for its dynamics.; Sunday, March 17, 2013 3:17:00 PM
Claudiu Bandea said...: Joe, I think I have reasonably addressed the broad dynamics of jDNA origin and retention in the model I proposed 2 decades ago. Of course, the model requires additional development and experimentation, but that's true of any model.

However, the question that I'm trying to get a straight answer from you is this: does jDNA serve as a protective mechanism against insertional mutagenesis or not? My answer is yes. What is yours?; Sunday, March 17, 2013 5:17:00 PM
Georgi Marinov said...: I'm curious what your theory says about introns - were they selected for because they provided defense against insertational mutagenesis, and if yes, are you seriously trying to say that when they were originally inserted for the first time, the benefits of having more DNA around to protect against insertions of new TEs outweighed the fitness cost of their insertion, which is the very thing your theory is saying they exist to defend against...; Sunday, March 17, 2013 5:56:00 PM
Joe Felsenstein said...: [Claudiu Bandea]: However, the question that I'm trying to get a straight answer from you is this: does jDNA serve as a protective mechanism against insertional mutagenesis or not? My answer is yes. What is yours?

Yes if it does not itself bring in a higher rate of insertional mutagenesis. And ... saying that it "serves as" a protective mechanism is ambiguous. It implies that it was evolved for that reason. I'm not sure I'd accept "served as", I'd say that the presence of jDNA makes for fewer negative effects of insertional mutagenesis.

But even if we were to accept "serves as", that is not enough to say that this "service" is why it is there. For that you need a full quantitative treatment. And since you are the one pushing the idea, that is up to you.; Sunday, March 17, 2013 6:26:00 PM
Claudiu Bandea said...: Georgi Marinov: “I'm curious what your theory says about introns - were they selected for because they provided defense against insertational mutagenesis…”

Yes, that’s what the model predicts. Here is a quote in which I discuss this paradigm:

“...gene splicing evolved to allow for the presence of ncDNA within transcribed regions, which are preferred targets for the integration of viral genomes (65-68). It is well known that much of the ncDNA is composed of remnants of viral sequences, and that the eukaryal introns resemble the group II self-splicing introns, which in turn resemble retroviral elements. Likely, these elements are evolutionary related (51;57) and it is highly probable that the spliceosomal machinery originated from symbiotic endogenous viral species that coevolved with their host to protect the coding regions from insertional damage (more on the selective forces leading to the evolutionary origin of introns and spliceosomal machinery in the next section).”; Sunday, March 17, 2013 8:37:00 PM
Georgi Marinov said...: That makes no sense.

If they are so beneficial, why are spliceosomal introns absent from prokaryotes and why were so many of them lost in unicellular eukaryotes?

People have done detailed analyses of the positions of introns across all eukaryotes - most of them seem to have been present in the last common eukaryotic ancestor, which means those small eukaryotes with compressed genomes lost a lot of the introns they originally had. Why did they do that?

Spliceosomal introns indeed evolved from group II introns - but it wasn't because this was beneficial, it was because it was either that or extinction, due to the deleterious effect of those same group II introns.; Sunday, March 17, 2013 8:48:00 PM
Claudiu Bandea said...: Joe Felsenstein: Yes, if…

I appreciate your YES, even with an ‘if’ attached to it.

Counting Georgi (quote: “even you can delete many megabases of junk DNA and the genome will still be big and protected against insertational mutagenesis”) there are 2 people here at Sandwalk accepting this idea. Now, that’s progress!

As critical as the protective role of jDNA might be in the germline, the protection against insertional mutagenesis in the somatic cells might be much more critical. In humans, for example, given the enormous number of somatic cells and their high turnover rate during our reproductive span, the number of insertion events that would potentially lead to cancer in the absence of protective mechanisms would be evolutionarily drowning. Think, for example, about the number of insertions in the somatic cells caused by an exogenous retrovirus, such as HIV.

Joe, maybe you and your colleagues in the field of quantitative biology and mathematical modeling can help develop this model. What do you think?; Sunday, March 17, 2013 8:49:00 PM
Georgi Marinov said...: That does not answer my objection at all - I said that you can delete a lot of dead TE genomic junk and the genome will still be huge and protected against insertional mutagenesis in your theory. Somatic or germline, it does not matter. The selective coefficient will be tiny at best.

It also does not pass the onion test.

Why do closely related species with very similar lifestyles and life cycles need different amounts of junk DNA?; Sunday, March 17, 2013 8:55:00 PM
Claudiu Bandea said...: Georgi Marinov: Spliceosomal introns indeed evolved from group II introns - but it wasn't because this was beneficial, it was because it was either that or extinction, due to the deleterious effect of those same group II introns.

Maybe I misunderstand, but I think you are confirming my model by saying that without the evolution of spliceosomal machinery as a protective mechanism against inserting group II and other types of viral insertional elements, the eukaryal hosts would have gone extinct.

Species with compact or compressed genomes, such as Bacteria and Archaea, evolved other protective mechanisms against insertional mutagenesis, such as specific integration sites. The evolution of these protective mechanisms in organisms that have strong constraints on genome size is strong testimony to the extraordinary selective pressure imposed by inserting elements on their host.; Sunday, March 17, 2013 9:20:00 PM
Claudiu Bandea said...: Georgi, maybe I misunderstood you. So here is a straight question, which I think deserves a straight answer: does jDNA serve as a protective mechanism against insertional mutagenesis or not?

My answer is yes. What is yours?; Sunday, March 17, 2013 9:31:00 PM
Joe Felsenstein said...: [Claudiu Bandea]: Joe, maybe you and your colleagues in the field of quantitative biology and mathematical modeling can help develop this model. What do you think?

I am busy with other matters. Since you think that you have a selective mechanism that explains the presence of junk DNA, I think the onus is on you to recruit someone to see if they can work out why. And do that before you continue posting assertions that you have an explanation.; Sunday, March 17, 2013 9:47:00 PM
