Sandwalk: Are most transcription factor binding sites functional?

Thursday, June 22, 2017

Are most transcription factor binding sites functional?

The ongoing debate over junk DNA often revolves around data collected by ENCODE and others. The idea that most of our genome is transcribed (pervasive transcription) seems to indicate that genes occupy most of the genome. The opposing view is that most of these transcripts are accidental products of spurious transcription. We see the same opposing views when it comes to transcription factor binding sites. ENCODE and their supporters have mapped millions of binding sites throughout the genome and they believe this represents abundant and exquisite regulation. The opposing view is that most of these binding sites are spurious and non-functional.

The messy view is supported by many studies on the biophysical properties of transcription factor binding. These studies show that any DNA binding protein has a low affinity for random sequence DNA. Such proteins will also bind with much higher affinity to sequences that resemble, but do not precisely match, the specific binding site [How RNA Polymerase Binds to DNA; DNA Binding Proteins]. If you take a species with a large genome, like us, then a typical DNA protein binding site of 6 bp will be present, by chance alone, at about 800,000 sites. Not all of those sites will be bound by the transcription factor in vivo because some of the DNA will be tightly wrapped up in dense chromatin domains. Nevertheless, an appreciable percentage of the genome will be available for binding so that typical ENCODE assays detect thousands of binding sites for each transcription factor.
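The 800,000 figure can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration: it assumes a genome of about 3.2 billion bp, equal frequencies of the four bases, and a match on one strand only. The function name and parameters are hypothetical and are not taken from any of the studies mentioned.

```python
def expected_chance_sites(genome_bp, site_len, both_strands=False):
    """Expected number of exact matches to one specific sequence of
    length site_len in random DNA, assuming each base has probability 1/4."""
    p_match = 0.25 ** site_len            # chance that a given position matches
    positions = genome_bp - site_len + 1  # number of possible start positions
    expected = positions * p_match
    # A non-palindromic site can also match on the opposite strand.
    return 2 * expected if both_strands else expected


GENOME_BP = 3_200_000_000  # assumed approximate human genome size

print(round(expected_chance_sites(GENOME_BP, 6)))  # 781250
```

Under these assumptions a 6 bp site turns up about 780,000 times by chance, which is where the round number of 800,000 comes from. Counting both strands for a non-palindromic site would double the estimate.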

This information appears in all the best textbooks and it used to be a standard part of undergraduate courses in molecular biology and biochemistry. As far as I can tell, the current generation of new biochemistry researchers wasn't taught this information.

In light of available knowledge of the properties of DNA binding proteins, it makes sense to assume that most of these sites have nothing to do with regulating transcription. They could easily be sitting on junk DNA. That's not what ENCODE researchers conclude.

It seems to me that the onus is on those claiming that a transcription factor binding site is functional. In the absence of evidence for function we should assume that it's just spurious binding, especially since this is the predicted result based on 50 years of research on DNA binding proteins.¹

Some people have been concerned enough about the controversy to develop global tests for possible function. A recent paper by one of these groups caught my eye ...

Cusanovich, D.A., Pavlovic, B., Pritchard, J.K., and Gilad, Y. (2014) The functional consequences of variation in transcription factor binding. PLoS Genet, 10(3), e1004226. [doi: 10.1371/journal.pgen.1004226]

ABSTRACT: One goal of human genetics is to understand how the information for precise and dynamic gene expression programs is encoded in the genome. The interactions of transcription factors (TFs) with DNA regulatory elements clearly play an important role in determining gene expression outputs, yet the regulatory logic underlying functional transcription factor binding is poorly understood. Many studies have focused on characterizing the genomic locations of TF binding, yet it is unclear to what extent TF binding at any specific locus has functional consequences with respect to gene expression output. To evaluate the context of functional TF binding we knocked down 59 TFs and chromatin modifiers in one HapMap lymphoblastoid cell line. We then identified genes whose expression was affected by the knockdowns. We intersected the gene expression data with transcription factor binding data (based on ChIP-seq and DNase-seq) within 10 kb of the transcription start sites of expressed genes. This combination of data allowed us to infer functional TF binding. Using this approach, we found that only a small subset of genes bound by a factor were differentially expressed following the knockdown of that factor, suggesting that most interactions between TF and chromatin do not result in measurable changes in gene expression levels of putative target genes. We found that functional TF binding is enriched in regulatory elements that harbor a large number of TF binding sites, at sites with predicted higher binding affinity, and at sites that are enriched in genomic regions annotated as “active enhancers.”

Author Summary: An important question in genomics is to understand how a class of proteins called “transcription factors” controls the expression level of other genes in the genome in a cell-type-specific manner – a process that is essential to human development. One major approach to this problem is to study where these transcription factors bind in the genome, but this does not tell us about the effect of that binding on gene expression levels and it is generally accepted that much of the binding does not strongly influence gene expression. To address this issue, we artificially reduced the concentration of 59 different transcription factors in the cell and then examined which genes were impacted by the reduced transcription factor level. Our results implicate some attributes that might influence what binding is functional, but they also suggest that a simple model of functional vs. non-functional binding may not suffice.

The authors clearly understand the controversy and they clearly understand that spurious binding is a problem.

What they did was to construct cell lines where production of a given transcription factor was reduced (knockdown). They looked at expression of about 8,000 genes to see which ones, if any, were altered by reducing expression of the transcription factor (TF). The idea is to see whether all TF binding sites are affecting expression of nearby genes or whether only a subset of TF binding sites are actually affecting transcription. The result is that "the regulation of the vast majority of target genes is not affected by perturbations to the expression levels of the TFs." In other words, most transcription factor binding sites don't seem to play a role in regulating expression of nearby genes. This is exactly what is predicted by the known properties of DNA binding proteins and it conflicts with the claims of ENCODE researchers who believe that most TF binding sites are functional.
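The logic of the test described above can be sketched in a few lines. This is not the authors' pipeline, just a toy illustration of the intersection step: take the genes with a binding site for the factor near the transcription start site, take the genes whose expression changed after the knockdown, and ask what fraction of the bound genes are in both sets. The gene names and the function name are made up.

```python
def functional_binding_fraction(bound_genes, changed_genes):
    """Fraction of genes bound by a factor whose expression also
    changed when that factor was knocked down."""
    bound = set(bound_genes)
    if not bound:
        return 0.0
    return len(bound & set(changed_genes)) / len(bound)


# Toy data (hypothetical): 10 bound genes, 3 genes changed after knockdown,
# and only one of the changed genes is also bound.
bound = [f"gene{i}" for i in range(10)]
changed = ["gene2", "geneX", "geneY"]

print(functional_binding_fraction(bound, changed))  # 0.1
```

A small fraction here corresponds to the paper's conclusion that most bound genes show no measurable change in expression when the factor is depleted. The real analysis also has to deal with statistical thresholds for calling a gene differentially expressed and with the caveats listed in the paper's Discussion.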

What makes this paper a cut above the standard publications is the extensive, and critical, discussion of their findings in a lengthy Discussion section. They list several caveats that could challenge their conclusion. It's well worth reading.

ASIDE: I'm a big fan of teaching fundamental principles and concepts. That's why I spent some time on the general properties of DNA binding proteins in my molecular biology courses. I tried to explain these general properties using well-studied examples where the kinetics of binding were known and the equilibrium binding constants had been determined for specific and non-specific binding. They were usually examples from E. coli.

My colleagues and I also taught general concepts of regulation including the well-known fact that many bacterial transcription factors could function as both repressors and activators depending on the circumstances. There were some excellent examples we could use to illustrate this important concept. I incorporated some of them in one of my textbooks from 1994.

We have seen that CRP-cAMP can be both an activator and a repressor, depending on which gene is being controlled. It functions as an activator when its binding site is just upstream of the promoter, but it functions as a repressor when the binding site overlaps the promoter and CRP-cAMP competes with RNA polymerase in binding DNA. There are many similar examples of regulatory proteins that can be both repressors and activators; one well-studied protein is AraC, which regulates genes involved in utilization of arabinose. The regulation of arabinose operons is complex; by binding to different sites on DNA, AraC functions as either a repressor (in the absence of arabinose) or an activator (when arabinose is available).

Finally, MerR is a simpler example of a regulatory protein that is both a repressor and an activator. The protein is required for the regulation of the mer operon, whose genes encode proteins that chelate mercury ions. MerR represses transcription of the mer operon by binding near the promoter. In the presence of mercury a MerR-Hg++ complex forms, and this complex acts directly as an activator at the same promoter.

I don't think these concepts are taught to undergraduates any more. I think most undergraduate courses have eliminated almost all references to non-eukaryotic systems. What this means is that the fundamental concepts that were developed over several decades of work in simple systems are being ignored in undergraduate and graduate courses.

This point was brought home to me while reading the Cusanovich et al. paper. I came across the following statement.

In addition to considering the distinguishing characteristics of functional binding, we also examined the direction of effect that perturbing a transcription factor had on the expression level of its direct targets. We specifically addressed whether knocking down a particular factor tended to drive expression of its putatively direct (namely, bound) targets up or down, which can be used to infer that the factor represses or activates the target, respectively. Transcription factors have traditionally been thought of primarily as activators, and previous work from our group is consistent with that notion. Surprisingly, the most straightforward inference from the present study is that many of the factors function as repressors at least as often as they function as activators.

It's true that most transcription factors in eukaryotes function mostly as activators but the result wouldn't have been a surprise to the authors if they had been taught correctly as undergraduates and graduate students.
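The direction-of-effect inference quoted above is simple to state in code. The sketch below is an assumption-laden illustration, not the method used in the paper: it takes a log2 fold change for each bound target after knockdown and calls the factor a repressor of that target if expression went up and an activator if it went down. The cutoff of 0.5 and all the names are hypothetical.

```python
from collections import Counter


def infer_role(log2_fc, min_abs_change=0.5):
    """Infer the factor's role at one bound target from the change in
    target expression after the factor is knocked down."""
    if log2_fc >= min_abs_change:
        return "repressor"   # target went up when the factor was removed
    if log2_fc <= -min_abs_change:
        return "activator"   # target went down when the factor was removed
    return "no call"


def summarize_roles(log2_fc_by_target, min_abs_change=0.5):
    """Count how often the factor looks like a repressor or an activator."""
    return Counter(infer_role(v, min_abs_change)
                   for v in log2_fc_by_target.values())


# Toy data (hypothetical): log2 fold changes of bound targets after knockdown.
toy = {"geneA": 1.2, "geneB": -0.9, "geneC": 0.1, "geneD": 0.8}
roles = summarize_roles(toy)
print(roles["repressor"], roles["activator"], roles["no call"])  # 2 1 1
```

In this toy case the same factor looks like a repressor at two targets and an activator at one, which is the kind of result that shouldn't surprise anyone who knows about CRP-cAMP, AraC, and MerR.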

I think it's a bad idea that we are ignoring so much of the important work on phage and bacteria from the 1960s and 1970s. I recently asked my students if they knew anything about bacteriophage lambda and was mostly met with blank stares. I didn't dare ask them if they could explain the genetic switch.

Mark Ptashne would not be amused.

1. I started working on DNA binding proteins for my Ph.D. thesis in 1968. That's only 49 years ago. Others had been working on the problem before me. :-)

5 comments :

Anonymous said...: ***It's true that most transcription factors in eukaryotes function mostly as activators but the result wouldn't have been a surprise to the authors if they had been taught correctly as undergraduates and graduate students.***

If I recall correctly repression would have to be by a completely different mechanism in eukaryotes because there is no steric inhibition of Pol2. It would have to be either by recruitment of inhibitory factors when present or even more indirectly, by activating a repressor gene.
Seems to me the extreme functionalists could claim a roundabout function for most TF binding sites. Presumably the expression level for any particular TF is tuned to account for non-functional sites. So if one could delete all the 'non-functional' sites it would drastically increase the effective concentration of that TF and throw off gene expression; Thursday, June 22, 2017 4:11:00 PM
The Lorax said...: ***It's true that most transcription factors in eukaryotes function mostly as activators but the result wouldn't have been a surprise to the authors if they had been taught correctly as undergraduates and graduate students.***

Is there evidence this is universally true across eukarya or is this extrapolation from a couple of mammals? I know of numerous repressors in yeast for example.; Friday, June 23, 2017 2:47:00 PM
Gary Gaulin said...: I think that Larry is going to love this one too:

https://phys.org/news/2017-06-newly-small-rna-fragments-defend.html; Thursday, June 29, 2017 4:17:00 PM
CrocodileChuck said...: ".since this is the predicted result based on 50 decades of research on DNA binding proteins". [SNIP]

50 decades?; Thursday, August 31, 2017 5:31:00 PM
Larry Moran said...: Thanks. Fixed.; Friday, September 01, 2017 10:35:00 AM
