Thursday, March 13, 2008

Levels of Protein Structure

There are four levels of protein structure. The primary structure refers to the sequence of amino acid residues in the polypeptide chain written left-to-right from the N-terminus to the C-terminus.

Secondary structures are ordered structures formed by internal hydrogen bonding between amino acid residues. The common secondary structures are the α helix, the β strand, and various loops and turns. The β sheet is often counted as secondary structure although, strictly speaking, it is a motif (see below).

The tertiary structure of a polypeptide is the three-dimensional conformation. Typical proteins contain α helices, β strands, and turns, although there are some proteins that only have α helices and turns, and others that have only β sheets and turns. In many cases, the final structure consists of distinct, independently folded regions called domains.

An example of a protein with multiple domains is shown on the left. This protein is the enzyme pyruvate kinase from cat (Felis domesticus). There are three separate domains, indicated by the square brackets on the side. Note that each of the domains is connected to another by a short stretch of unordered polypeptide chain.

In some cases, a particular domain is shared by several proteins, suggesting that different proteins can be formed by combining various domains that evolved separately. In other cases, similar domain structures might arise independently by convergent evolution.

Quaternary structure applies only to proteins that are composed of more than one polypeptide chain. Each of the polypeptides is called a subunit. The subunits might be identical, as in the example shown above, or they might be very different, as in my favorite enzyme, ubiquinone:cytochrome c oxidoreductase (complex III).
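
For readers who think in data structures, the four levels can be sketched as nested records. This is only an illustration: the class names, fields, and the short sequence are made up for the example, and the three-dimensional coordinates that actually define tertiary structure are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    kind: str   # e.g. "alpha helix", "beta strand", "turn"
    start: int  # 1-based residue number, inclusive
    end: int    # 1-based residue number, inclusive


@dataclass
class Chain:
    # Primary structure: residues written from the N-terminus to the C-terminus.
    sequence: str
    # Secondary structure: ordered segments along the chain.
    secondary: list = field(default_factory=list)
    # Domains: (start, end) residue ranges that fold independently.
    domains: list = field(default_factory=list)


@dataclass
class Protein:
    # Each polypeptide chain is one subunit.
    subunits: list

    def has_quaternary_structure(self) -> bool:
        # Quaternary structure applies only when there is more than one chain.
        return len(self.subunits) > 1


# Hypothetical ten-residue chain, invented for illustration only.
chain = Chain(
    sequence="MKTAYIAKQR",
    secondary=[Segment("alpha helix", 2, 8)],
    domains=[(1, 10)],
)
monomer = Protein(subunits=[chain])
dimer = Protein(subunits=[chain, chain])  # two identical subunits

print(monomer.has_quaternary_structure())
print(dimer.has_quaternary_structure())
```

The point of the nesting is that primary, secondary, and tertiary structure are properties of a single chain, while quaternary structure is a property of the assembly.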

There are certain motifs that occur over and over again in different proteins. The helix-loop-helix motif, for example, consists of two α helices joined by a reverse turn. The Greek key motif consists of four antiparallel β strands in a β sheet where the order of the strands along the polypeptide chain is 4, 1, 2, 3. The β sandwich is two layers of β sheet [see β Strands and β Sheets].
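
The strand numbering for the Greek key can be made concrete with a small sketch. It assumes the sheet is read from one edge to the other as 4, 1, 2, 3, where each number is the strand's position along the polypeptide chain. The function name and the list representation are illustrative only.

```python
def sheet_neighbors(spatial_order):
    """For each pair of strands that sit side by side in the sheet,
    report whether they are also consecutive along the chain."""
    pairs = []
    for a, b in zip(spatial_order, spatial_order[1:]):
        pairs.append((a, b, abs(a - b) == 1))
    return pairs


greek_key = [4, 1, 2, 3]  # strand order across the sheet

for a, b, consecutive in sheet_neighbors(greek_key):
    if consecutive:
        note = "consecutive in the chain"
    else:
        note = "not consecutive in the chain"
    print(f"strands {a} and {b} are neighbors in the sheet: {note}")
```

Strands 1 and 2, and strands 2 and 3, are neighbors both in the chain and in the sheet. Strand 4 follows strand 3 in the chain but lies on the opposite edge of the sheet next to strand 1, so the connection from strand 3 to strand 4 has to cross over the end of the sheet.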

The vast majority of motifs do not have a common evolutionary origin in spite of many claims to the contrary. They arise independently and converge on a common stable structure. The fact that these same motifs occur in hundreds of different proteins indicates that there are a limited number of possible folds in the universe of protein structures. The original primitive protein may have been relatively unstructured, but over time there would have been selection for more and more stable structures. This selection would favor the common motifs.

Larger motifs are often called domain folds because they make up the core of a domain. The parallel twisted sheet is found in many domains that have no obvious relationship other than the fact that they share this very stable core structure. The β barrel structure is found in many membrane proteins. There are dozens of enzymes that have adapted to an α/β barrel. These enzymes are not evolutionarily related. (The β helix is much less common.)

[Figure Credit: The figures are from Horton et al. (2006)]

Horton, H.R., Moran, L.A., Scrimgeour, K.G., Perry, M.D. and Rawn, J.D. (2006) Principles of Biochemistry. Pearson/Prentice Hall, Upper Saddle River N.J. (USA)


  1. Three things: (1) The β-helix is my favorite structural motif ever. (2) Kudos for using "unordered" instead of the popular but conceptually sloppy "unstructured". (3) The greek key structure you show is not compatible with the numbering you use; the strand orientations indicate the numbering 4 1 2 3.

  2. mwc says,

    The greek key structure you show is not compatible with the numbering you use; the strand orientations indicate the numbering 4 1 2 3.

    Thanks. I fixed it to 4, 1, 2, 3. The numbers indicate the order of the strands in the polypeptide chain and the order indicates the position (from right to left) in the β sheet.

  3. "cat (Felix domesticus)"

    It's a trivial point, but I'm pretty sure this should be Felis cattus domestica Linnaeus 1758. Otherwise, this series on protein structure has been wonderfully informative.

  4. anonymous says,

    It's a trivial point, but I'm pretty sure this should be Felis cattus domestica Linnaeus 1758.

    There are three acceptable species names.

    Felis catus
    Felis silvestris catus
    Felis domesticus

    NCBI Taxonomy

  5. If anything was going to persuade me that proteins were intelligently designed, it would be the 11-stranded beta-barrel of GFP.

  6. Question: additions of other molecules to the protein, e.g., carbohydrate tails, heme groups, etc. - can these be also subsumed under the heading "quaternary structure" or is that a Big No-No and these are to be kept separate?