Each protein has a characteristic shape associated with its function. When we discuss the evolution of proteins, we like to divide the residues into three categories as shown below for the structure of myoglobin from sperm whale (Physeter catodon) [PDB 1A6M].
Myoglobin is a small protein with a bound heme group (shown as a space-filling molecule). It binds and stores oxygen in muscle tissue. The oxygen molecule binds to the active site of the protein near one side of the heme group. There are specific amino acid residues at the active site that are absolutely required for binding oxygen. As you might expect, these amino acids are highly conserved: they will be found at that position in myoglobin from humans or any other species.
The second category of amino acid residues makes up the hydrophobic interior of the protein. Myoglobin is an all-α-helical protein and several of the helices group together to form a helix bundle. The interior of that bundle consists largely of hydrophobic amino acid residues. This is what stabilizes the three-dimensional structure and causes the polypeptide chain to spontaneously fold after it is synthesized [1].
The third category of residues is the surface residues. These are usually hydrophilic residues that interact with the surrounding water. The surface residues don't make as much of a contribution to the overall three-dimensional structure so their exact composition can be quite variable.
The class of proteins to which myoglobin belongs is called "globins." There are two other globins that you are probably familiar with: α-globin and β-globin are the two polypeptides that come together to form an α₂β₂ hemoglobin tetramer.
The three proteins (myoglobin, α-globin, and β-globin) descended by gene duplication from a common ancestral globin several hundred million years ago. Today their amino acid sequences are quite different due to the accumulation of random mutations and fixation by random genetic drift. In spite of the differences in primary structure, the three-dimensional structures of the three proteins are very similar. This can easily be shown by superimposing the three structures as shown in the figure (myoglobin=green, α-globin=blue, β-globin=purple).
Most people don't appreciate the amount of variation that underlies this conserved three-dimensional structure. It's worth taking a look at a bunch of aligned globin sequences from different species to see exactly which amino acids are highly conserved and which positions can tolerate almost any amino acid.
Let's go to the Pfam (protein family) database at the Sanger Institute in Cambridge (UK). The entry for the globin family is Globin PF00042. Click on "Alignments" in the left sidebar. This link takes you to the alignment page where you can create an alignment of all the known globin sequences. Choose 75 seeds (default) in the first table and select "Pfam viewer" from the pull-down menu under "Viewer." Click "View" to see the alignments.
Highly conserved amino acid residues are highlighted by vertical shading in the Pfam view. The first thing you should notice is that there are very few amino acids that are invariant. The conserved residue on the left (blue) is tryptophan (W). It's present in most of the globins from different species but not all. Look at the other positions and note that in most cases a variety of different amino acid residues can be substituted. Sometimes only hydrophobic residues (blue) can be found at a particular site and sometimes there are other restricted choices. Lots of insertions and deletions (dots) can be tolerated without major disruption to the overall three-dimensional structure.
Data like these reveal that amino acid residues in the active site are usually conserved, residues in the hydrophobic core are moderately conserved, and residues on the surface are hardly conserved at all.
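If you would rather measure conservation than judge it by eye, the idea can be sketched in a few lines of Python. This is only a minimal sketch: the four short sequences are a made-up toy alignment, not real globin sequences, and the function name column_stats is hypothetical. For each column it reports the frequency of the most common residue and the Shannon entropy, ignoring gaps. A fully conserved column has a frequency of 1.0 and an entropy of 0 bits; a column that tolerates almost anything has a low frequency and a high entropy.

```python
from collections import Counter
import math

# Hypothetical toy alignment for illustration only (not real globin sequences).
# Each string is one aligned sequence; '.' marks a gap, as in the Pfam view.
TOY_ALIGNMENT = [
    "WHKLVAE",
    "WHRIVSD",
    "WHKLIT.",
    "FHELVGK",
]

def column_stats(alignment):
    """Per column: (fraction held by the most common residue, Shannon entropy in bits).

    Gap characters ('.' and '-') are ignored. A column made up only of gaps
    is reported as (0.0, 0.0).
    """
    stats = []
    for column in zip(*alignment):
        residues = [r for r in column if r not in ".-"]
        if not residues:
            stats.append((0.0, 0.0))
            continue
        counts = Counter(residues)
        total = len(residues)
        top_fraction = max(counts.values()) / total
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        stats.append((top_fraction, entropy))
    return stats

for position, (top_fraction, entropy) in enumerate(column_stats(TOY_ALIGNMENT), start=1):
    print(f"column {position}: top residue fraction {top_fraction:.2f}, "
          f"entropy {entropy:.2f} bits")
```

In the toy data the second column (all H) stands in for an invariant active-site residue, while the sixth column (A, S, T, G) behaves like a variable surface position. To try this on real data you would replace TOY_ALIGNMENT with sequences from a downloaded alignment.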
The important point is that there are literally billions of different proteins that have the same shape as globins and still function as carriers of oxygen. Opponents of evolution often take a single globin from a single species and calculate the probability that such a structure will form. They assume that only one out of twenty amino acids can be found at each position, so the resulting probability (e.g., 1 in 20^200 for a chain of 200 residues) is vanishingly small. Thus, they conclude, such a protein could never form by chance. They don't seem to appreciate the fact that we already know of billions of different proteins that can function as globins.
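The arithmetic behind that argument is easy to check. The sketch below rests on loudly stated assumptions: a chain length of 153 residues (the length of sperm whale myoglobin), a purely hypothetical average of three acceptable residues per position, and independence between positions, which real proteins do not obey. None of these numbers is a measurement. The point is only to show how quickly the count of acceptable sequences grows once more than one residue is allowed at each site. Logarithms are used because the raw numbers are unwieldy.

```python
import math

def log10_sequence_count(choices_per_position, length):
    """log10 of the number of sequences when each position independently
    allows `choices_per_position` different residues."""
    return length * math.log10(choices_per_position)

LENGTH = 153              # residues in sperm whale myoglobin
TOLERATED_PER_SITE = 3    # hypothetical average, for illustration only

all_possible = log10_sequence_count(20, LENGTH)
one_exact = log10_sequence_count(1, LENGTH)              # the "single target" assumption
acceptable = log10_sequence_count(TOLERATED_PER_SITE, LENGTH)

print(f"all possible sequences:         about 10^{all_possible:.0f}")
print(f"single exact sequence assumed:  10^{one_exact:.0f}")
print(f"acceptable sequences (toy):     about 10^{acceptable:.0f}")
print(f"fraction acceptable (toy):      about 10^{acceptable - all_possible:.0f}")
```

Under the single-target assumption there is exactly one acceptable sequence. Allowing even a few residues per site raises that count by many orders of magnitude, which is why the single-sequence calculation tells you little about how many sequences can fold and function as a globin.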
There are many other examples of this observation. The four structures below show the conformation of the cytochrome c polypeptide chain from tuna, rice, yeast, and a bacterium. The amino acid sequences have diverged considerably from their common ancestor of 3 billion years ago but the structures are very similar.
We conclude that the amino acid sequence of a polypeptide determines how it will fold in three-dimensional space but there are billions of different amino acid sequences that will adopt the same structure.
Finally, let's look at a more complicated example. The enzymes lactate dehydrogenase (below left) and malate dehydrogenase (below right) share a common ancestor even though they are different enzymes. This is a case where substitutions of amino acid residues in the active site gave rise to a new activity. Today the amino acid sequence similarity is barely above the threshold for defining homology but the structures are still very similar.
1. Other factors that contribute are bound ligands, such as heme groups, and interactions with other proteins, as in multimeric proteins with different subunits.