Thursday, February 08, 2007

How Proteins Fold

The protein shown here is pyruvate kinase, one of the key enzymes in metabolism. This particular example comes from the common domestic cat (Felis domesticus).

Cartoons such as this one are intended to show how the backbone chain of amino acids is folded to produce the final three-dimensional structure of a protein. In this case the polypeptide chain is represented by a blue ribbon. There are spiral sections representing regions of secondary structure called α-helices and flattened sections called β-strands. The β-strand regions are often twisted.

This particular protein adopts a structure with three distinct parts called domains. As a general rule, each domain has a well-defined shape with a characteristic pattern of strands and helices. The pattern is called a fold and it is thought that there are only 1000 or so different folds in the protein universe. Different folds can be combined to make up all known proteins.

(For those who might be interested, the three domain folds in this protein are TIM beta/alpha barrel, PK beta-barrel, and the PK C-terminal domain.)

When proteins are first synthesized you can think of them as a long extended chain of amino acids with no particular secondary or tertiary structure. We refer to such unordered macromolecules as random coils. Within seconds, this random coil spontaneously folds itself into a highly ordered three-dimensional structure such that every single molecule of a given protein has the exact same shape. For example, every molecule of pyruvate kinase looks exactly like the one shown here.

The rapidity of this folding reaction tells us something about the mechanism of protein folding. We know that folding is rapid and spontaneous because proteins can be purified then unfolded by treating them with certain chemicals that cause them to become denatured or unfolded. These denatured proteins can then be allowed to re-fold when the chemicals are removed.

Cyrus Levinthal did some back-of-the-envelope calculations on the rate of protein folding. He assumed that a protein could randomly try all possible three-dimensional conformations until it found the correct one. Under those conditions it would take 10^87 seconds to fold a protein of 100 amino acid residues. This is quite a bit longer than the age of the universe (6 x 10^17 seconds).
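The arithmetic behind an estimate like this can be sketched in a few lines. The post does not say which inputs were used; the values below (10 conformations per residue, 10^13 conformations sampled per second) are assumptions, chosen because they reproduce a figure of 10^87 seconds for 100 residues. The function name is made up for this illustration.

```python
import math

def levinthal_log10_seconds(n_residues,
                            conformations_per_residue=10,   # assumed, not from the post
                            samples_per_second=1e13):       # assumed, not from the post
    """Return log10 of the time in seconds needed to try every conformation once."""
    # Total conformations = conformations_per_residue ** n_residues.
    # Working in log10 keeps the numbers readable.
    log10_conformations = n_residues * math.log10(conformations_per_residue)
    return log10_conformations - math.log10(samples_per_second)

AGE_OF_UNIVERSE_S = 6e17  # the value quoted in the post

log10_t = levinthal_log10_seconds(100)
print(f"exhaustive search: about 10^{log10_t:.0f} seconds")
print(f"age of universe:   about 10^{math.log10(AGE_OF_UNIVERSE_S):.1f} seconds")
```

Changing the assumed inputs changes the exponent but not the conclusion. Even with only 3 conformations per residue the search would take on the order of 10^34 seconds.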

Obviously, there's something wrong with the assumptions behind what came to be known as the Levinthal Paradox. As a matter of fact, the paradox was never really a paradox since the whole point of the calculation was to show that proteins did not fold by randomly searching through the conceptual universe of all possible shapes.

The final structure of a protein minimizes the energy of the random coil by burying hydrophobic amino acids in the interior of the molecule. Hydrophobic (water fearing) amino acids are those that don't like to be exposed to water. Just as scattered oil droplets in your salad dressing will eventually coalesce to form a layer of oil over a layer of vinegar and water, so too will hydrophobic amino acids come together to form an "oily" globule in the middle of the protein. Water is excluded from this "molten globule" and this makes folding an entropically driven spontaneous reaction.
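The thermodynamic claim in this paragraph can be written with the standard Gibbs free energy relation. This is a textbook summary, not a measured result for pyruvate kinase, and splitting the entropy term into a chain part and a water part is a labeling chosen here for illustration.

```latex
\Delta G_{\text{fold}} = \Delta H_{\text{fold}} - T\,\Delta S_{\text{total}},
\qquad
\Delta S_{\text{total}} = \Delta S_{\text{chain}} + \Delta S_{\text{water}}
```

Folding is spontaneous when \Delta G_{\text{fold}} < 0. The chain loses conformational entropy as it folds (\Delta S_{\text{chain}} < 0), but the water released from around the hydrophobic side chains gains entropy (\Delta S_{\text{water}} > 0). "Entropically driven" means the second term outweighs the first.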

You can visualize the process by picturing a field of all possible energy levels of the random coil. The one representing the properly folded protein is the deepest well on the energy surface. The bottom of the well is the lowest energy level for the protein and this represents the stable three-dimensional structure. Protein folding, then, is like finding the well and falling down into it.

As mentioned above, the search for the lowest energy well is not a random search of all possible shapes. That would take far too long. Instead, folding proceeds in a cooperative stepwise manner with small regions of secondary structure forming first.

The most striking regions of secondary structure are the short α-helices. Certain stretches of amino acid residues will rapidly form α-helical regions involving local bonding between amino acids. These form extremely rapidly since the amino acids are already in close contact. Furthermore, the formation of these local secondary structures takes place simultaneously in many different parts of the random coil.

The helix and strand regions represent the minimal energy conformations of the local parts of the protein. Subsequent folding proceeds by forming the helices and strands into the appropriate three-dimensional folds that are characteristic of each domain. The possibilities here are much fewer than the total of all possible conformations because you are now combining blocks of amino acids that have already adopted some structure.
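A toy count shows why combining pre-formed blocks shrinks the search so much. This is only a counting sketch under made-up assumptions, not a physical model of folding: the block size, the number of states per residue, and the number of ways each block can be packed are all hypothetical values picked for illustration.

```python
import math

def log10_exhaustive(n_residues, states_per_residue):
    """log10 of the conformations when every residue varies independently."""
    return n_residues * math.log10(states_per_residue)

def log10_hierarchical(n_residues, states_per_residue,
                       block_size, arrangements_per_block):
    """log10 of the conformations tried in a two-stage search.

    Stage 1: each block of residues searches only its own local states.
    Stage 2: the already-formed blocks are packed together, each block
             having a small number of possible arrangements.
    Blocks are counted one after another, which overstates the cost if
    they actually form at the same time.
    """
    n_blocks = n_residues // block_size
    local_search = n_blocks * states_per_residue ** block_size
    packing_search = arrangements_per_block ** n_blocks
    return math.log10(local_search + packing_search)

# Hypothetical numbers: 100 residues, 10 states each,
# blocks of 10 residues, 10 packing arrangements per block.
print(f"exhaustive:   10^{log10_exhaustive(100, 10):.0f} conformations")
print(f"hierarchical: 10^{log10_hierarchical(100, 10, 10, 10):.0f} conformations")
```

With these inputs the exhaustive count is 10^100 while the staged count is about 10^11. The exact numbers mean nothing, but the gap of dozens of orders of magnitude is the point of the paragraph above.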

The figure below shows some hypothetical examples of folding pathways. Very few folding pathways have been worked out in detail but the basic principles are well understood. The biggest unsolved problem is predicting the three-dimensional structure of a protein from its amino acid sequence. This involves finding the predicted lowest energy level and that's turning out to be a tough problem indeed.



8 comments:

  1. This post got me to wonder how robust protein folding is. Proteins can fold wrongly as prions show, so I guess proteases hack up those proteins that do (but prions are resistant).

    But if proteins can unfold and refold it is a robust process anyway. Though perhaps rather inefficiently fabricated proteins (often stuck in the wrong shape) are somehow selected against. So I guess my question becomes whether robust folding is a selection criterion, or whether it is a natural property of long amino acid chains?

    (The latter wouldn't surprise me though if it is analogous to what they say of evolution in a multidimensional fitness space, these supposedly multidimensional conformation search spaces could almost always have a path towards lower energy.)

  2. Another great post. I just have one comment. The hypothesis that proteins fold as they come off the ribosome is somewhat strengthened by the recent data showing that "silent" mutations appear to have an impact on protein structure (there are some good posts on this here and here). The idea being that codon biases result in silent mutations which modulate protein transcription rates, therefore changing the protein folding rates.

  3. When proteins are first synthesized you can think of them as a long extended chain of amino acids with no particular secondary or tertiary structure. We refer to such unordered macromolecules as random coils. Within seconds, this random coil spontaneously folds itself into a highly ordered three-dimensional structure such that every single molecule of a given protein has the exact same shape.

    But since proteins are synthesized linearly, surely the early part of the chain could start folding right away, rather than wait until the entire chain is fabricated. Therefore this state in which the entire chain is freshly-fabricated and random is only hypothetical.

    You can visualize the process by picturing a field of all possible energy levels of the random coil. The one representing the properly folded protein is the deepest well on the energy surface. The bottom of the well is the lowest energy level for the protein and this represents the stable three-dimensional structure. Protein folding, then, is like finding the well and falling down into it.

    But since, as you have already stated, the entire folding space is not searched, then the folding could be path-dependent. With that conceded, it would be difficult to maintain the proposal that the stable folded structure is the global minimum.

    I await the follow-up article in which chaperonins are discussed.

  4. Dave says,

    The hypothesis that proteins fold as they come off the ribosome is somewhat strengthened by the recent data showing that "silent" mutations appear to have an impact on protein structure (there are some good posts on this here and here). The idea being that codon biases result in silent mutations which modulate protein transcription rates, therefore changing the protein folding rates.

    We know that proteins fold as they are being synthesized. The effect might be to speed up folding or prevent trapping in local energy wells but it won't affect the final shape.

    The idea that unusual codons might alter folding rates doesn't make a lot of sense. We've known about codon bias for 25 years but the common explanation is that it affects the rate of translation—not the rate of protein folding. That makes more sense to me so I'll reserve judgement on the latest speculations.

    One thing we do know is that some proteins need to be protected against improper folding while being synthesized. That's one reason for having chaperones associated with the translation machinery.

  5. mustafa mond, fcd says,

    Therefore this state in which the entire chain is freshly-fabricated and random is only hypothetical.

    It's "hypothetical" in the sense that an unfolded complete polypeptide doesn't exist during normal protein synthesis. However, proteins do become unfolded under some conditions, for example, when cells are heated above a certain temperature. This is the condition known as heat shock and it occurs quite often. The proteins inside the cell have to be refolded properly and that often requires the assistance of heat shock proteins, which turned out to be chaperones.

    Lots of proteins have been studied in vitro. That's why we know they can refold spontaneously from the denatured state. That part isn't "hypothetical." It shows us that proper folding isn't always linked to translation.

    But since, as you have already stated, the entire folding space is not searched, then the folding could be path-dependent. With that conceded, it would be difficult to maintain the proposal that the stable folded structure is the global minimum.

    Not at all. The global minimum includes the local regions of secondary structure that form first. All theoretical calculations show this.

    I await the follow-up article in which chaperonins are discussed.

    It's coming. You'll learn the difference between chaperones and chaperonins.

  6. "We know that proteins fold as they are being synthesized. The effect might be to speed up folding or prevent trapping in local energy wells but it won't affect the final shape."

    Actually, the Wikipedia article (sorry about the layman choice of background material) hints at this, since it says that some proteins may take hours to fold. (Perhaps under in vitro circumstances, but that would not matter here.)

  7. Protein folding into a correct conformation is an entropically disallowed phenomenon, so why can a protein fold spontaneously?

    Replies
    1. Your statement is incorrect. The RNase A enzymes are a well-studied example of proteins that do spontaneously fold correctly. For the bigger proteins, enter the chaperones...
