Tuesday, February 12, 2008

Repression of the lac Operon

There are many lessons to be learned from understanding the regulation of transcription of a well-studied system like the E. coli lac operon. Some of those lessons have consequences when we think about the problems of having large eukaryotic genomes. Read the description below and the implications that follow.

From Horton et al. (2006) p. 666

lac repressor binds simultaneously to two sites near the promoter of the lac operon. Repressor-binding sites are called operators. One operator (O1) is adjacent to the promoter, and the other (O2) is within the coding region of lacZ. When bound to both operators, the repressor causes the DNA to form a stable loop that can be seen in electron micrographs of the complex formed between lac repressor and DNA (bottom figure). The interaction of lac repressor with the operator sequences may block transcription by preventing the binding of RNA polymerase to the lac promoter. However, it is now known that, in some cases, both lac repressor and RNA polymerase can bind to the promoter at the same time. Thus, the repressor may also block transcription initiation by preventing formation of the open complex and promoter clearance. A schematic diagram of lac repressor bound to DNA in the presence of RNA polymerase is shown in the figure on the right. [See Monday's Molecule #61 for another view.] The diagram illustrates the relationship between the operators and the promoter and the DNA loop that forms when the repressor binds to DNA.

The repressor locates an operator by binding nonspecifically to DNA and searching in one dimension. (Recall from Section 21.3C that RNA polymerase also uses this kind of searching mechanism.) The equilibrium association constant for the binding of lac repressor to O1 in vitro is very high. As a result, the repressor blocks transcription very effectively. (lac repressor binds to the O2 site with lower affinity.) A bacterial cell contains only about 10 molecules of lac repressor, but the repressor searches for and finds an operator so rapidly that when a repressor dissociates spontaneously from the operator, another occupies the site within a very short time. However, during this brief interval, one transcript of the operon can be made since RNA polymerase is poised at the promoter. This low level of transcription, called escape synthesis, ensures that small amounts of lactose permease and β-galactosidase are present in the cell.

In the absence of lactose, lac repressor blocks expression of the lac operon, but when β-galactosides are available as potential carbon sources, the genes are transcribed. Several β-galactosides can act as inducers. If lactose is the available carbon source, the inducer is allolactose, which is produced from lactose by the action of β-galactosidase (Figure 21.18). Allolactose binds tightly to lac repressor and causes a conformational change that reduces the affinity of the repressor for the operators. [see Regulation of Transcription] In the presence of the inducer, lac repressor dissociates from the DNA, allowing RNA polymerase to initiate transcription. (Note that because of escape synthesis, lactose can be taken up and converted to allolactose even when the genes are repressed.)

Electron micrographs of DNA loops. These loops were formed by mixing lac repressor with a fragment of DNA bearing two synthetic lac repressor–binding sites. One binding site is located at one end of the DNA fragment, and the other is 535 bp away. DNA loops 535 bp in length form when the tetrameric repressor binds simultaneously to the two sites.

The strength of binding between a protein and a ligand is measured by an equilibrium binding constant (K_B). In the case of lac repressor binding to its specific strong binding site (O1), K_B = 10¹³ M⁻¹. This is very high; in fact, it is one of the tightest DNA bindings known in biology. What this means is that lac repressor will sit on the operon and repress transcription for at least 20 minutes under normal conditions.

However, the repressor will eventually fall off (dissociation rate constant k₋₁ = 6 × 10⁻⁴ s⁻¹) and, as described above, the operon will be transcribed once (escape synthesis). A new repressor molecule finds the operator sequences very quickly because lac repressor binds non-specifically to DNA (K_B = 4 × 10⁴) and slides along the DNA searching for the operator in a process called one-dimensional diffusion (association rate constant k₁ = 10¹⁰ M⁻¹ s⁻¹). Even though the lac repressor only remains bound non-specifically for a few seconds, it is able to search about 2000 bp looking for a specific binding site.
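
As a rough check on these numbers, the two rate constants quoted above can be turned into a residence time and an equilibrium constant. The sketch below assumes simple one-step binding, so that the equilibrium constant is the ratio of the association and dissociation rate constants and the bound lifetime is exponentially distributed. The function names are made up for illustration.

```python
import math

# Rate constants quoted in the post for lac repressor at the O1 operator.
K_OFF = 6e-4   # dissociation rate constant, s^-1
K_ON = 1e10    # association rate constant, M^-1 s^-1


def mean_residence_time(k_off):
    """Mean time (s) a repressor stays bound, assuming first-order dissociation."""
    return 1.0 / k_off


def half_life(k_off):
    """Time (s) after which half of the bound complexes have dissociated."""
    return math.log(2) / k_off


def equilibrium_constant(k_on, k_off):
    """Equilibrium binding constant (M^-1), assuming one-step binding."""
    return k_on / k_off


if __name__ == "__main__":
    print("mean residence time (min):", mean_residence_time(K_OFF) / 60)
    print("half-life (min):", half_life(K_OFF) / 60)
    print("k_on / k_off (M^-1):", equilibrium_constant(K_ON, K_OFF))
```

With the quoted values the mean bound time comes out near 28 minutes and the half-life near 19 minutes, and the ratio of the rate constants is about 1.7 × 10¹³ M⁻¹, the same order of magnitude as the binding constant given for O1.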

Given the huge difference between the specific and non-specific binding constants, the cell only needs about ten molecules of lac repressor to ensure that the operator sequences are bound almost all of the time. At any given time nine of these molecules will be bound to random pieces of DNA in the genome and the other one will be bound to the lac operon.
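
The argument in this paragraph can be sketched as a small equilibrium calculation. The two binding constants and the ten copies per cell come from the text; the cell volume (about one femtolitre) and genome size (about 4.6 million base pairs) are my assumptions for a typical E. coli cell, and I am also assuming the non-specific constant is in M⁻¹ per base pair. The model is deliberately crude: every base pair counts as a non-specific site, and the one repressor tied up at the operator is ignored when working out the free pool.

```python
AVOGADRO = 6.022e23

# Values quoted in the post.
K_SPECIFIC = 1e13         # M^-1, repressor binding to O1
K_NONSPECIFIC = 4e4       # M^-1, assumed to be per base pair of DNA
REPRESSORS_PER_CELL = 10

# Assumptions (not from the post): typical E. coli cell volume and genome size.
CELL_VOLUME_L = 1e-15     # about 1 femtolitre
GENOME_BP = 4.6e6


def molar(copies, volume_l):
    """Convert a copy number per cell into a molar concentration."""
    return copies / (AVOGADRO * volume_l)


def operator_occupancy(n_repressors, genome_bp, volume_l, k_s, k_ns):
    """Fraction of time the operator is occupied in a simple competition model.

    Non-specific DNA soaks up most of the repressor; whatever remains free
    sets the occupancy of the single specific site.
    """
    total = molar(n_repressors, volume_l)
    nonspecific_sites = molar(genome_bp, volume_l)
    free = total / (1.0 + k_ns * nonspecific_sites)
    return k_s * free / (1.0 + k_s * free)


if __name__ == "__main__":
    occ = operator_occupancy(REPRESSORS_PER_CELL, GENOME_BP, CELL_VOLUME_L,
                             K_SPECIFIC, K_NONSPECIFIC)
    print("operator occupancy:", occ)
```

Under these assumptions the operator is occupied more than 99% of the time with only ten repressors, and increasing genome_bp while holding everything else fixed lowers the computed occupancy.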

Similar repressors and activators work in eukaryotic cells to regulate transcription. But in eukaryotic cells we have a much bigger problem. First, there are very few regulatory proteins that have as strong a specific binding constant as lac repressor. Second, there is much more DNA in a eukaryotic cell. The consequences of having a large genome are: (a) it takes these DNA binding proteins much longer to find their specific binding site, and (b) at any one time, many more of the regulatory proteins are soaked up in non-specific binding to DNA. In eukaryotic cells with an abundance of junk DNA a typical regulatory protein has to be present at about 20,000 copies per cell in order to have a decent chance of binding to its specific regulatory site for a significant length of time. (Recall that only ten molecules of lac repressor are needed in E. coli.)

Given the properties of DNA binding that we have discovered and characterized in bacteria and bacteriophage, we can calculate that escape synthesis in eukaryotic cells is likely to be much more of a problem than in bacterial cells. Furthermore, accidental transcription of random bits of DNA is almost certainly going to be common in a cell with a large bloated genome. This is because RNA polymerase also binds non-specifically to DNA and also because the larger the genome, the more likely you are to encounter promoter and regulatory sequences that just by chance happen to be close matches to real functional sequences. This is a very important concept and one that is not widely appreciated. Based on our knowledge of basic biochemistry we expect that there will be random, infrequent transcription of a large percentage of the genome. These transcripts are merely a consequence of the properties of DNA binding proteins and they have no biological significance.
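
The point about chance matches can be illustrated with a back-of-the-envelope count. The sketch below assumes a random genome with equal base frequencies and counts both strands. The 10 bp site length and the genome sizes (about 4.6 million bp for E. coli, about 3.2 billion bp for a haploid human genome) are assumptions chosen for illustration, not numbers from the text, and real genomes are not random sequence.

```python
from math import comb


def expected_chance_sites(genome_bp, site_len, max_mismatches=0):
    """Expected number of chance matches to a site in random DNA.

    Assumes equal base frequencies and independent positions, and counts
    both strands (hence the factor of 2).
    """
    # Number of distinct sequences within max_mismatches of the site.
    matching = sum(comb(site_len, k) * 3 ** k
                   for k in range(max_mismatches + 1))
    p_match = matching / 4 ** site_len
    return 2 * genome_bp * p_match


if __name__ == "__main__":
    for name, size in [("E. coli", 4.6e6), ("human", 3.2e9)]:
        exact = expected_chance_sites(size, 10)
        near = expected_chance_sites(size, 10, max_mismatches=1)
        print(name, "exact:", exact, "one mismatch allowed:", near)
```

For a hypothetical 10 bp site this gives roughly 9 exact chance matches in a bacterial-sized genome and roughly 6,000 in a human-sized one, and allowing a single mismatch multiplies both by 31.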

Some of these problems in eukaryotes are mitigated by a separate level of regulation at the level of chromatin structure. Large regions of the chromosome can be masked from DNA binding proteins by formation of a tight heterochromatic complex of nucleosomes and DNA. Less compact complexes are formed in non-active regions of the genome where the DNA is less accessible but not invisible. When genes in a region are transcribed, the chromatin opens out into an open complex where the DNA is easily accessible to regulatory proteins. This solves some of the problems discussed above but it is only a partial solution. We know for a fact that the concentrations of regulatory proteins are high (20,000 copies) and a growing amount of evidence points to frequent accidental transcription.

©Laurence A. Moran and Pearson Prentice Hall

Horton, H.R., Moran, L.A., Scrimgeour, K.G., Perry, M.D. and Rawn, J.D. (2006) Principles of Biochemistry. Pearson/Prentice Hall, Upper Saddle River N.J. (USA)


Anonymous said...

These transcripts are merely a consequence of the properties of DNA binding proteins and they have no biological significance.

But of course, you have no way of knowing this.

Larry Moran said...

anonymous says,

But of course, you have no way of knowing this.

Of course I do. That's the whole point. I may not be able to identify whether a particular transcript is functional or not but a thorough understanding of biochemistry theory enables me to say with confidence that most rare transcripts will not have a biological function.

Unfortunately, there are too many so-called scientists out there who don't understand the basic concepts so it's easy for them to get confused.

I don't think we're doing a good job of teaching the principles and concepts.

Anonymous said...

Confidence is not a very convincing substitute for actual data. You know how science works... if you want to make a statement of fact, show the data, or give the reference.

Papers like this one:

Wu JQ, Du J, Rozowsky J, Zhang Z, Urban AE, Euskirchen G, Weissman S, Gerstein M, Snyder M.
Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome.
Genome Biol. 2008 Jan 3;9(1):R3

make me considerably less sure of your conclusions. Speaking for myself only, I think the acquisition of more data is needed on this topic before the issue can be considered settled, although clearly your mileage may vary.

Anonymous said...

I've just read that Genome Biology paper and it's entirely consistent with Larry's points.

The key point, surely, is that knowing what we do about the biochemistry of transcription, the assumption should be that a transcript has no function unless proven otherwise. Anon, what's your justification for assuming the opposite?

Anonymous said...

...the assumption should be that a transcript has no function unless proven otherwise.

Papers like that of Wu et al. mean you don't need to make this assumption -- you can look at the actual data. A more meaningful question is, what fraction of these previously uncharacterized transcripts have some kind of biological function? It would appear from this work that 10% is a reasonable, conservative answer, but again, this is an experimentally testable figure. The ENCODE project has clearly provided the motivation to actually do these kinds of studies. This value is going to be known with a lot more certainty in just the immediate future.

Larry Moran said...

anonymous says,

This value is going to be known with a lot more certainty in just the immediate future.

Dream on. If history is any judge, what we're going to see is exaggerated claims about enormous amounts of functional RNA. It will probably take about ten years for scientists to realize that those claims are bogus and make no sense in terms of basic biochemistry and fundamentals of evolution.

I fear we are entering the dark ages of science where wishful thinking and superstition trump serious intellectual activity.

These days, critical thought seems to be something that one avoids at all costs.

Anonymous said...

I have a question. I am a junior in high school and I am hoping someone can answer this for me. We were talking about the lac operon in my AP Biology class the other day and I learned that it is in the E. coli genome, not in the human genome, so how does lactose intolerance work? Couldn't you, in theory, just drink buttermilk, which contains E. coli, to get new E. coli in your gut? Because the chance of every strain of E. coli a single person comes in contact with being defective is practically zero. It can't be your body rejecting the E. coli because then you would have diarrhea, because E. coli controls our water absorption. It can't be your body attacking the lactase (which is an enzyme, and enzymes are proteins) because that is an allergy and would express itself in anaphylaxis (hives, swelling, vomiting etc.). So, what exactly is it? Does it have to do with the production of allolactose (which binds with the repressor to make the repressor inactive and start the production of lactase)?

Anonymous said...

Very good explanation. My background is synthetic chemistry but I clearly understand this topic.