The amount of a given protein in Escherichia coli depends on a number of factors such as the amount of mRNA and the rate of translation. The standard model of regulation is based on decades of study of individual genes and it reveals that the amount of protein is mostly dependent on the amount of mRNA that was translated. This, in turn, indicates that most regulation occurs at the level of transcription initiation.
It's now possible to look simultaneously at the characteristics of large numbers of protein-coding genes to see whether this generality holds. That's what Balakrishnan et al. (2022) reported in a Science paper a few years ago. They looked at the characteristics of 1900 protein-coding genes in E. coli to see how protein concentration was regulated.
The main result is that for most genes there's a strong correlation between protein and mRNA. This suggests that the standard model is correct and regulation occurs primarily at the level of transcription. To confirm this, the authors developed a model of gene expression that included a number of parameters.
Balakrishnan, R., Mori, M., Segota, I., Zhang, Z., Aebersold, R., Ludwig, C. and Hwa, T. (2022) Principles of gene regulation quantitatively connect DNA to RNA and proteins in bacteria. Science 378:eabk2066. [doi: 10.1126/science.abk2066]
INTRODUCTION The intracellular concentration of a protein depends on the rates of several processes, including transcription, translation, and the degradation and/or dilution of messenger RNAs (mRNAs) and proteins. These rates can be vastly different for different genes and across different growth conditions because of gene-specific regulation. At the systems level, protein concentrations are further affected by the availability of shared gene expression machineries—e.g., RNA polymerases and ribosomes—and are constrained by the approximately invariant cellular mass density. Even in one of the best-characterized model organisms, Escherichia coli, it is unclear how the gene-specific and systems-level effects work together toward setting the cellular proteome. This knowledge gap has not only hindered our efforts in building a predictive framework of gene expression but has also limited our abilities in guiding the rational design of gene circuits.
RATIONALE We undertook a quantitative, genome-scale study, combining experimental and theoretical approaches, to tease apart the contribution of the specific and global effects on cellular protein concentrations in exponentially growing E. coli cells across a variety of growth conditions. We complemented genome-scale proteomic and transcriptomic data with biochemical measurements of total absolute mRNA abundances and synthesis rates. We compared these measurements to gene dosage and the concentrations of ribosomes and RNA polymerases to quantitatively characterize the activity of the gene expression machinery across conditions. This comprehensive dataset allowed us to analyze, in quantitative detail, the interplay between the activity of gene expression machinery, the activity of individual promoters, and the resulting protein concentrations.
RESULTS We compiled a comprehensive atlas of the determinants of gene expression across conditions—from the concentrations of genes, mRNAs, and proteins to the rates of transcriptional and translational initiation and mRNA degradation for thousands of genes. We were able to determine the on rate of each promoter, a quantity capturing the overall effect of transcriptional regulation that has been elusive through most existing gene expression studies. Unexpectedly, we found that for most genes, the cytosolic protein concentrations were primarily determined by the innate magnitude of their promoter on rates, which spanned more than three orders of magnitude. Changes in protein concentrations resulting from changes in growth conditions were typically much smaller—well within one order of magnitude—and were mostly exerted through changes in transcription initiation.
E. coli’s strategy to implement gene regulation can be summarized by two design principles. First, protein concentrations are predominantly set transcriptionally, with relatively invariant posttranscriptional characteristics (translation efficiencies and degradation rates) for most mRNAs and growth conditions. Second, the overall fluxes of transcription and translation are tightly coordinated: The average density of five ribosomes per kilobase is nearly invariant across mRNA species and across growth conditions, even though the mRNA and ribosome abundances can each vary substantially. We find this coordination to be implemented through the anti-sigma factor Rsd, which modulates the availability of RNA polymerases for transcription across different growth conditions. These two principles lead to a quantitative formulation of the central dogma of bacterial gene expression, connecting mRNA and protein concentrations to the regulatory activities of the corresponding promoters.
CONCLUSION These quantitative relationships reveal the unexpectedly simple strategies used by E. coli to attain desired protein concentrations despite the complexity of global physiological constraints: Individual protein concentrations are primarily set by gene-specific transcriptional regulation, with global transcriptional regulation set to cancel the strong growth rate dependence of protein synthesis. These relations provide the basis for understanding the behavior of more complex genetic circuits in different conditions and for the inverse problem of deducing regulatory activities given the observed mRNA and protein levels.
The parameters in the authors' model include:
- Transcription: average transcription rate; average length of protein-coding genes and operons; rate of transcription initiation (promoter on rate)
- mRNA stability (degradation rate) under different growth conditions
- Protein synthesis: number of ribosomes under different growth conditions; rate of translation elongation; rate of translation initiation and average spacing of ribosomes; average length of polypeptides
- Protein stability: average lifetime of a typical protein under various growth conditions (it was difficult to account for excreted proteins but they were only a small fraction of the total)
- Cell size: the size of a typical E. coli cell changes under different growth conditions
- Number of genes: Most protein-coding genes are present in a single copy during most of a cell cycle but there are a few examples of multiple copies
Keep in mind that the analyses were done using growing cultures of bacteria under different growth conditions. This means that new proteins must be synthesized every time a cell divides. Most of the parameters show a normal distribution around an average score. The range can be several orders of magnitude in some cases but the fact that the distribution is normal means that the averages can be used in the model.
Some of the interesting parameters are translation initiation rate (about 0.2 per second); ribosome spacing (200 nt); mRNA lifetime before degradation (~1 min); transcription elongation rate (~50 nt/sec). None of these are particularly surprising. They are close to the values that have been reported in the textbooks over the past 40 years.
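These numbers can be checked against each other with a little arithmetic. The sketch below is a back-of-the-envelope illustration, not the authors' model. It assumes the "0.2" figure means about 0.2 initiations per second per mRNA and the "~1 min" figure is an average mRNA lifetime. The function names, the example promoter on rates, and the 60-minute doubling time are all hypothetical, and the steady-state formula is the usual textbook simplification in which proteins are stable and are lost only by dilution as cells grow and divide. Under those assumptions, a 200 nt spacing is the same thing as five ribosomes per kilobase, it implies ribosomes moving at roughly 40 nt/sec, and each mRNA yields on the order of a dozen proteins before it is degraded.

```python
import math


def ribosomes_per_kb(spacing_nt: float) -> float:
    """Ribosome density implied by an average spacing between ribosomes."""
    return 1000.0 / spacing_nt


def implied_elongation_rate(initiation_per_s: float, spacing_nt: float) -> float:
    """Elongation speed (nt/s) that keeps ribosomes `spacing_nt` apart when a
    new ribosome starts `initiation_per_s` times per second on each mRNA."""
    return initiation_per_s * spacing_nt


def proteins_per_mrna(initiation_per_s: float, mrna_lifetime_s: float) -> float:
    """Average number of proteins made from one mRNA over its lifetime."""
    return initiation_per_s * mrna_lifetime_s


def steady_state_protein(promoter_on_rate_per_s: float,
                         gene_copies: float,
                         mrna_lifetime_s: float,
                         initiation_per_s: float,
                         doubling_time_s: float) -> float:
    """Protein copies per cell at steady state (textbook simplification).

    Assumes stable proteins that are lost only by growth dilution; this is
    an illustrative assumption, not the model from the paper.
    """
    mrna_copies = promoter_on_rate_per_s * gene_copies * mrna_lifetime_s
    dilution_rate = math.log(2) / doubling_time_s
    return mrna_copies * initiation_per_s / dilution_rate


if __name__ == "__main__":
    initiation = 0.2      # per second per mRNA (assumed reading of "0.2")
    spacing = 200.0       # nt between ribosomes
    mrna_lifetime = 60.0  # seconds (assumed reading of "~1 min")

    print(f"ribosomes per kb: {ribosomes_per_kb(spacing):.1f}")
    print(f"implied elongation: {implied_elongation_rate(initiation, spacing):.0f} nt/s")
    print(f"proteins per mRNA: {proteins_per_mrna(initiation, mrna_lifetime):.0f}")

    # Hypothetical promoter on rates spanning two orders of magnitude,
    # with a hypothetical 60-minute doubling time and one gene copy.
    for on_rate in (0.001, 0.01, 0.1):
        copies = steady_state_protein(on_rate, 1, mrna_lifetime, initiation, 3600.0)
        print(f"on rate {on_rate}/s -> about {copies:.0f} proteins per cell")
```

With everything else held fixed, the protein level in this sketch is directly proportional to the promoter on rate, which is why a wide spread in on rates translates into an equally wide spread in protein concentrations.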
The most interesting parameter is the promoter on rate. This varies considerably depending on the gene and the growth conditions. As expected, genes for proteins that are required in large quantities are transcribed at higher rates than genes for proteins that are much less abundant.
The authors created 12 sets of parameters based on protein concentrations and plotted the average parameter vs protein concentration. The results show a remarkable correlation between protein concentrations and promoter on rate (red) but a much lower correlation with translation initiation rate (yellow) and a negative correlation with mRNA degradation rate (blue). (Green is the number of genes.)
What this shows is that most regulation in bacteria is at the level of transcription initiation. This is not a surprise but it's nice to have some solid data for a large number of genes.
We don't think the results are as clear-cut in multicellular eukaryotes, where other factors may play a greater role. There's lots of speculation about regulation at the level of mRNA stability, the possible involvement of regulatory RNAs, splicing, and protein turnover. Some of those speculations may turn out to be significant on a large scale but many of them might only apply to a small number of genes. In any case, it's worth keeping in mind that regulation at the level of transcription initiation was almost certainly the dominant mechanism for most of life's history and it's the default (null) explanation of regulation in all species.


