Lecture 14


Proteins III

Last Friday we saw that primary structure is responsible for activity. However, a linear (extended) chain of amino acids will not have much biological activity. Activity comes about due to the way in which the polypeptide folds into a defined three-dimensional (3D) structure. This 3D structure or conformation, as we mentioned before, is in great part determined by the amino acid sequence of our proteins, i.e., the primary structure.

Today we will start discussing the different elements of protein secondary structure, how it is that they form, and the forces that stabilize them.

Protein Secondary Structure

As we mentioned last time, the secondary structure of a protein refers to the local conformation of relatively short sections of residues of the polypeptide. Before we start getting into protein structure we need to get some basic knowledge of the conformational behavior of the bonds composing peptides, the conformational characteristics of peptide bonds and residues, and the terminology employed to talk about them.

1) Backbone atoms. We will constantly refer to the atoms that define the direction and shape of the polypeptide as the backbone atoms. These are all the polypeptide chain atoms except for the atoms in the residues' side chains. In each residue, the backbone atoms will be the carbonyl carbon (C or C') and its oxygen (O), the α-carbon (Cα), the α-hydrogen (Hα), the nitrogen of the peptide bond (N), and the amide hydrogen (HN):

2) The peptide bond. We have been saying this for a while now: the amide bond formed between the carboxylate and amine of two amino acid residues has partial double bond character. This is because we can draw two resonance structures:

The C-N distance in the amide bond (obtained from X-ray measurements) is 1.32 Å. This is shorter than the single bond C-N distance (1.45 Å) and longer than the double bond C=N distance (about 1.27 Å). This partial double bond character means that the substituents around that bond (the things attached to that bond) are all in the same plane. In other words, the atoms forming the peptide group are coplanar. This has huge implications for the geometry of the polypeptide backbone.

The dihedral angle associated with the peptide bond, called the omega angle (ω), can essentially adopt only two values, 0 or 180 degrees. It is the dihedral defined by Cα(i-1), C'(i-1), N(i), and Cα(i), where 'i' refers to the residue number in the polypeptide chain. As we will see, 99% of the time ω will be 180 degrees, or trans, in order to minimize steric clashes between the substituents on the peptide bond:

One common way of increasing the percentage of peptide residues with ω = 0 is to synthetically methylate the amide nitrogen (NMe), because this removes the advantage of having ω = 180 over ω = 0. Considering this, for which amino acids will it be more likely to have ω = 0 (a cis peptide bond)?
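
If you want to sort measured omega values into cis and trans on a computer, a minimal sketch could look like the one below. The function name and the 30 degree tolerance are assumptions made for illustration only; they are not a standard.

```python
def classify_omega(omega_deg, tolerance=30.0):
    """Label a peptide bond as 'cis', 'trans', or 'distorted' from its omega angle.

    The tolerance (in degrees) is an illustrative assumption, not a standard.
    """
    # Bring the angle into the -180 to 180 degree range first.
    w = (omega_deg + 180.0) % 360.0 - 180.0
    if abs(w) <= tolerance:
        return "cis"        # omega close to 0 degrees
    if 180.0 - abs(w) <= tolerance:
        return "trans"      # omega close to +180 or -180 degrees
    return "distorted"


print(classify_omega(178.0))   # trans
print(classify_omega(-4.0))    # cis
```

Note that 180 and -180 degrees describe the same trans geometry, which is why the check uses the absolute value of the angle.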

3) If we consider the other atoms of the backbone, we have two more dihedral angles that define the shape of the backbone. One involves the C'(i-1), N(i), Cα(i), and C'(i) atoms, and is called the phi angle (φ), and the other one involves N(i), Cα(i), C'(i), and N(i+1), and is called the psi angle (ψ).
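
For those who like to see how such an angle is obtained from atomic coordinates, here is a minimal sketch of a signed dihedral calculation using only the Python standard library. The function name and the toy coordinates are made up for illustration; for phi you would pass the positions of C'(i-1), N(i), Ca(i), and C'(i), and for psi those of N(i), Ca(i), C'(i), and N(i+1).

```python
import math


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees (-180 to 180) defined by four points.

    0 degrees corresponds to cis (p1 and p4 eclipsed) and 180 degrees to trans.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    # Three consecutive bond vectors along the chain of four atoms.
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    # Normals to the planes (p1, p2, p3) and (p2, p3, p4).
    n1, n2 = cross(b1, b2), cross(b2, b3)
    # atan2 returns the angle between the two planes with the correct sign.
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (not taken from a real protein): a 90 degree twist.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using atan2 instead of acos is a design choice: it gives the sign of the angle directly and behaves well near 0 and 180 degrees.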

4) What about the side chains? The dihedral angles of the side chains are also important to the conformation of the protein, particularly those belonging to groups which participate in the binding of ligands or form part of the catalytic site of the protein. These angles are called the chi angles (χ), and they are numbered depending on which atoms of the side chain form them. For lysine, for example, we will have a dihedral angle formed by N, Cα, Cβ, and Cγ, which we call χ1, and another one formed by Cα, Cβ, Cγ, and Cδ, which we call χ2. Although they are extremely important to the activity of the protein, we will not emphasize them as much as ω, φ and ψ, because these three define the conformation of the peptide backbone.

One thing that will probably look different from normal geometry is the way we define dihedral angles. Normally, we use a 0 to 360 degree range to represent a dihedral angle. Not in chemistry/biochemistry. We use 0 to 180 and -180 to 0 degree ranges to represent them. This is useful (?) because it gives us two positive and two negative quadrants:
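
Going from one convention to the other is simple arithmetic. The helper below is a small sketch (the name wrap_angle is made up) that maps any angle in degrees onto the -180 to 180 range used for dihedrals:

```python
def wrap_angle(deg):
    """Map an angle in degrees onto the interval (-180, 180]."""
    wrapped = (deg + 180.0) % 360.0 - 180.0
    # -180 and 180 are the same direction; report it as +180.
    return 180.0 if wrapped == -180.0 else wrapped


print(wrap_angle(300.0))   # -60.0
print(wrap_angle(270.0))   # -90.0
print(wrap_angle(120.0))   # 120.0
```

So a dihedral of 300 degrees in the 0 to 360 convention is the same thing as -60 degrees in ours.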

Ramachandran plots

OK. So we have three dihedral angles per amino acid residue defining the conformation of the polypeptide backbone; one of them is about a partial double bond and the other two are about single bonds. What conformations can these dihedrals adopt? We saw that most of the time ω will be 180 degrees, but what about φ and ψ? In principle, since they are rotations about single bonds, they can adopt virtually any value. However, due to steric hindrance between the substituents around them, these angles will only adopt (or populate) certain ranges of values. In order to see which <φ,ψ> combinations are allowed (or preferred) by different amino acid residues in a polypeptide, we can calculate the energies of conformations adopting different <φ,ψ> angle pairs and compare them. The result is a 3D plot of energy versus the φ and ψ dihedral angles, called a Ramachandran plot. Since looking in 3D is kind of tricky, we usually represent it as a two-dimensional (2D) contour plot:

In this plot, combinations of φ and ψ angles that fall in the green regions are more stable (favoured) than those that fall in the yellow or the pink. In any event, falling in any of the coloured regions is a lot more favoured than falling in any of the white regions. As we see, only a limited conformational space is available to the peptide backbone. By the way, the blue dots are the actual <φ,ψ> pairs for all the amino acids in the protein MCP-1, a human chemokine involved in the inflammatory response. You see that real proteins do fall almost exclusively in these regions of the Ramachandran plot.

One last thing. Every L-amino acid will have its own Ramachandran plot. The one plotted above is for L-alanine, and most amino acids with small side chains will have plots virtually identical to the one presented. Glycine, being small and lacking a real side chain, will have much larger regions of allowed <φ,ψ> pairs. Proline, whose side chain is tied back onto the backbone nitrogen in a ring, will have a much more restricted Ramachandran plot.

Elements of secondary structure

So what is so special about the favoured regions of the Ramachandran plot over other regions? Apart from minimizing steric clashes, we will see that these regions of dihedral angles correspond to regular secondary structures in which non-bonded interactions (mainly hydrogen bonds) are maximized, therefore stabilizing the conformation of the polypeptide backbone.

1) The first region we will review (it is also the most fun to review) is the one in which the φ angle lies near -60 degrees and the ψ angle around -50 degrees. This region of <φ,ψ> conformational space corresponds to the famous right-handed α helix, which is perhaps the most characteristic element of secondary structure and, all other things being equal, also the most stable regular secondary structure element.

In an α helix segment, the φ angles are ~ -60 degrees and the ψ angles ~ -50 degrees for all residues. This makes the polypeptide backbone curl up around an imaginary axis into something like a wine bottle opener (a corkscrew). Each full turn of the α helix has a height of 5.4 Å and involves 3.6 amino acid residues. This height is called the pitch of the helix, meaning the height we climb on each full turn. The twist of the helix is to the right, hence the 'right-handed' α helix.
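
From the two numbers above (5.4 Å per turn and 3.6 residues per turn) we can work out how much the helix climbs and rotates per residue. The short sketch below just does that arithmetic; the variable and function names are made up, and the length estimate is a rough approximation that ignores end effects.

```python
PITCH_ANGSTROM = 5.4       # height climbed per full turn (from the text)
RESIDUES_PER_TURN = 3.6    # residues per full turn (from the text)

# Rise along the helix axis contributed by each residue: about 1.5 Angstrom.
rise_per_residue = PITCH_ANGSTROM / RESIDUES_PER_TURN

# Rotation around the helix axis contributed by each residue: about 100 degrees.
rotation_per_residue = 360.0 / RESIDUES_PER_TURN


def approximate_helix_length(n_residues):
    """Rough axial length (in Angstrom) of a helix with n_residues residues."""
    return n_residues * rise_per_residue


print(round(rise_per_residue, 2), "Angstrom per residue")
print(round(rotation_per_residue, 1), "degrees per residue")
print(round(approximate_helix_length(10), 1), "Angstrom for 10 residues")
```

So each residue moves us about 1.5 Å along the axis and about 100 degrees around it.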

Why is this structural motif so stable? If you look at the disposition of the backbone atoms in the α helix, you see that the oxygen of the amide carbonyl of a certain residue (i) is at an optimal distance to hydrogen bond to the amide hydrogen of residue (i+4), i.e., four residues further along the chain. This means that except for the first four amide hydrogens at the N-terminus of the helix (and the last four carbonyls at the C-terminus), all backbone amide groups will be participating in h-bonds. If you remember, the energy involved in each h-bond was ~ 2 to 5 kcal/mol, so for a helical segment comprising 10 amino acids we will have 6 of these hydrogen bonds, and a stabilization of approximately 12 to 30 kcal/mol.
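
The bookkeeping behind this kind of estimate can be sketched as follows. The spacing between the two residues that share an h-bond is left as a parameter; the default of 4 is the spacing usually quoted for the α helix and is an assumption of this sketch, as are the function names. The 2 to 5 kcal/mol per h-bond is the range used in the text.

```python
def count_helix_hbonds(n_residues, offset=4):
    """Number of backbone h-bonds in an isolated helix of n_residues residues.

    Each bond links residue i to residue i + offset, so the last `offset`
    residues have no partner further along the chain.
    """
    return max(0, n_residues - offset)


def stabilization_range(n_residues, offset=4, per_bond=(2.0, 5.0)):
    """Lower and upper estimate (kcal/mol) of the total h-bond stabilization."""
    n_bonds = count_helix_hbonds(n_residues, offset)
    return (n_bonds * per_bond[0], n_bonds * per_bond[1])


print(count_helix_hbonds(10))     # 6
print(stabilization_range(10))    # (12.0, 30.0)
```

This is only a back-of-the-envelope estimate: it ignores competition with water and any h-bonds made by residues flanking the helix.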

Another property of the α helix is its polarity. Due to their orientation, all amide carbonyls will be pointing in the same direction: the carbonyl oxygen points towards the C-terminus of the helix, and the carbonyl carbon towards the N-terminus. If you remember, each carbonyl in an amide generates a little dipole, with a δ+ on the carbon and a δ- on the oxygen. Now we have a bunch of those, all ordered in the same way: the effects of all the dipoles add up, in the same way that the voltages of batteries add up when we put them in series. The net result is that the α helix will have a net dipole moment with the negative end at the C-terminus and the positive end at the N-terminus.

Another consequence of the geometry of the α helix is the position of the side chains. They will all poke outwards, away from the helix axis. As we will see next, this has an important effect on the nature and charge of the side chains we can have in an α helix.

What type of amino acids are more prone to form α helices? Although there are statistical analyses of the occurrence of certain amino acids in helices, this is by no means clear-cut. What we can more or less determine is which amino acids cannot be in an α helix. Most of the time we will have to look at sections of sequence in order to determine this:

- For example, proline, which has a highly constrained backbone, cannot adopt the standard values of φ and ψ of the normal α helix. Therefore, proline is a 'helix breaker': it will not allow for the really tight turn we see in the standard α helix due to its limited conformational flexibility. Furthermore, since the amide nitrogen in proline is doubly substituted (it forms a tertiary amide), it has no hydrogen to lend in hydrogen bonding, destabilizing the α helix even more.

- Glycine, which by now we are seeing is a bad example of a simple amino acid and is sort of a jack-of-all-trades, will be happy to be anywhere in an α helix. Why?

- If we have several bulky side chains contiguous to one another, such as -Phe-Phe-Phe-, there will be large steric clashes between the side chain atoms (remember, they can only point outwards from the center of the helix). This will have a destabilizing effect on the whole α helix.

- The same goes for amino acids which have the same charge at a certain pH, such as Glu and Asp, or Arg and Lys. A segment of contiguous negatively (or positively) charged residues will destabilize the α helix due to unfavourable electrostatic interactions.

- On the other hand, having, for example, two phenylalanine residues spaced 3 or 4 residues apart (-Phe-Xaa-Xaa-Phe-) will stabilize the helix, because the aromatic ring of one residue will sit on top of the other, and there will be favourable π-π stacking interactions. Remember that the helix has 3.6 residues per turn, which means that these two phenylalanines will be almost right on top of each other, but separated enough so as not to bump into each other. The same thing goes for an arrangement such as -Glu-Xaa-Xaa-Arg-. Why?

- Finally, due to its polarity, the α helix will be stabilized by negatively charged residues at its N-terminal end and positively charged residues at its C-terminal end.
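
The spacing arguments in the list above can be checked with a little 'helical wheel' arithmetic: with 3.6 residues per turn, each residue is rotated about 100 degrees around the helix axis relative to the previous one. The sketch below (function names are made up, and the ideal 100 degree step is an assumption) computes how far apart two residues are around the axis when we look down the helix:

```python
def wheel_angle(position, degrees_per_residue=100.0):
    """Angular position (0 to 360 degrees) of a residue on an ideal helical wheel."""
    return (position * degrees_per_residue) % 360.0


def angular_separation(i, j, degrees_per_residue=100.0):
    """Smallest angle (degrees) between residues i and j viewed down the helix axis."""
    diff = abs(wheel_angle(i, degrees_per_residue)
               - wheel_angle(j, degrees_per_residue)) % 360.0
    return min(diff, 360.0 - diff)


for spacing in (1, 2, 3, 4):
    print(spacing, angular_separation(0, spacing))
# 1 100.0
# 2 160.0
# 3 60.0
# 4 40.0
```

Residues 3 or 4 positions apart end up on the same face of the helix, one turn above the other, while direct neighbours point in quite different directions.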

As you see, there is also a region of allowed <φ,ψ> conformational space for a left-handed α helix. The angles correspond to φ = 60 and ψ = 50. In such a helix, the side chains will be poking inwards, which means that there will be a lot of steric clashes if we had relatively bulky amino acids. Therefore, this type of helix is not commonly found, except when we have glycine residues.

Even if not forming part of a large α helix, the <φ,ψ> angles for a certain residue in a protein can be in the α helix region of the Ramachandran plot. In those cases, we do not say that the residue forms a 1-residue α helix, but that it is in the conformational space belonging to the right-handed α helical conformation, or αR conformation. You may find this kind of notation in structural biochemistry papers.

2) The second region in which we find stable <φ,ψ> combinations (a big number of 'hits' for MCP-1 as shown above) is around φ = -120 and ψ = 140. These angles correspond to a partially extended polypeptide backbone, known as the β strand. In a β strand, unfavourable steric interactions between adjacent residues of the polypeptide are minimized by the zig-zagging of the polypeptide backbone. As opposed to α helices, there are really not that many h-bonds in an isolated β strand to stabilize it. However, several β strands normally come together to form large layered structures known as β pleated sheets, or β sheets for short.

In β sheets the amide hydrogens of one β strand hydrogen bond with the amide carbonyl oxygens of the other, and vice versa. Two or more β strands can come together in a β sheet. The ones in the middle of the sheet will share h-bonds with two other strands, stabilizing the whole structural motif.

Depending on the direction of the different β strands we will have two varieties of β sheets. In one, the β strands are oriented in such a way that their N-termini are on the same side and their C-termini are on the same side. In other words, they run in the same direction. These are called parallel β sheets.

On the other hand, if the strands forming the β sheet run in opposite directions (one goes from N-terminus to C-terminus, the next from C-terminus to N-terminus), we will have an anti-parallel β sheet.

Due to the disposition of the hydrogen bonding donors and acceptors, the h-bonds in a parallel β sheet are less stable than those in an anti-parallel β sheet. We can see that the N-H···O angle in the parallel β sheet is further away from the ideal 180 degrees (linear) than in the anti-parallel β sheet.
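
To put a number on how linear an h-bond is, we can measure the angle at the hydrogen between the N-H bond and the H···O contact. The sketch below does this from coordinates; the function name and the coordinates are made up for illustration and do not come from a real β sheet.

```python
import math


def angle_at(center, a, b):
    """Angle a-center-b in degrees, e.g. N-H...O with the hydrogen at the center."""
    va = [x - c for x, c in zip(a, center)]
    vb = [x - c for x, c in zip(b, center)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    # Clamp to [-1, 1] to protect acos against rounding errors.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


# Made-up coordinates (Angstrom): N, H, and two possible acceptor oxygens.
n_atom = (0.0, 0.0, 0.0)
h_atom = (1.0, 0.0, 0.0)
o_linear = (3.0, 0.0, 0.0)   # straight ahead of the N-H bond
o_bent = (2.7, 1.0, 0.0)     # off to one side

print(round(angle_at(h_atom, n_atom, o_linear), 1))   # 180.0
print(round(angle_at(h_atom, n_atom, o_bent), 1))
```

The closer this angle is to 180 degrees, the better the geometry of the h-bond.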

As was the case for α helices, we may find an isolated residue in a polypeptide chain for which the <φ,ψ> angles correspond to a β strand conformation. Instead of saying that the residue forms a 1-residue β strand, we say that it lies in a C5 conformation. Again, those reading structural biochemistry papers will see this often.

3) The last repetitive conformational motif we will discuss is one in which φ = -70 and ψ = 160. This corresponds to a structural element that is not as extended as the β strand, but not as curled up as the α helix. It is known as the collagen helix or polyproline II helix. A collagen helix is left-handed (remember that the α helix was right-handed), and has only three residues per turn. It is therefore more open than an α helix.

Due to their conformational characteristics, prolines are well suited to participate in collagen helices (hence the name polyproline helices). In collagen, the segment -Gly-Xaa-Pro- or -Gly-Xaa-(HO)Pro- is found as a repeating unit. Here Xaa is any amino acid, and (HO)Pro is hydroxyproline.

In collagen, three of these single-stranded helices coil up together to form a triple helix, which provides the tensile strength characteristic of cartilage, tendons, etc.
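
We have now seen four reference points on the Ramachandran plot. As a rough illustration, the sketch below labels a phi/psi pair by whichever of these reference points is closest. The angle values are the ones quoted in these notes; the nearest-point rule and all the names are assumptions made for illustration, and this is much cruder than the way structures are really analyzed (which also looks at hydrogen bonding patterns).

```python
import math

# Reference (phi, psi) values in degrees, as quoted in the text.
REFERENCE_CONFORMATIONS = {
    "right-handed alpha helix": (-60.0, -50.0),
    "left-handed alpha helix": (60.0, 50.0),
    "beta strand": (-120.0, 140.0),
    "collagen (polyproline II) helix": (-70.0, 160.0),
}


def periodic_difference(a, b):
    """Smallest difference between two angles in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def nearest_conformation(phi, psi):
    """Name of the reference conformation closest to the given (phi, psi) pair."""
    def distance(name):
        ref_phi, ref_psi = REFERENCE_CONFORMATIONS[name]
        return math.hypot(periodic_difference(phi, ref_phi),
                          periodic_difference(psi, ref_psi))
    return min(REFERENCE_CONFORMATIONS, key=distance)


print(nearest_conformation(-57.0, -47.0))    # right-handed alpha helix
print(nearest_conformation(-130.0, 150.0))   # beta strand
```

The periodic difference matters because angles wrap around: psi = 170 and psi = -170 are only 20 degrees apart, not 340.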

4) The last conformational features we will discuss are polypeptide turns. As the name implies, polypeptide turns or hairpins introduce a sharp change in the direction of the polypeptide backbone, usually reversing its direction.

The most abundant turn type is the β turn, which involves four residues of a polypeptide chain. The β turn is stabilized by an h-bond between the amide carbonyl oxygen of residue (i) and the amide hydrogen of residue (i+3). Depending on the values of the <φ,ψ> angles of residues (i+1) and (i+2), we will have up to 8 different types of β turns. The most common ones are type I and type II. In a type I β turn, we have φ = -60 and ψ = -30 for residue (i+1), and φ = -90 and ψ = 0 for residue (i+2). In a type II β turn, we have φ = -60 and ψ = 120 for residue (i+1), and φ = 80 and ψ = 0 for residue (i+2).
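
The two most common turn types can be told apart from the angles of the two central residues. The sketch below compares measured phi/psi values of residues (i+1) and (i+2) against the ideal values given above. The 30 degree tolerance and the function names are assumptions made for illustration, and only types I and II are covered.

```python
# Ideal (phi, psi) values in degrees for residues (i+1) and (i+2), from the text.
IDEAL_BETA_TURNS = {
    "I": ((-60.0, -30.0), (-90.0, 0.0)),
    "II": ((-60.0, 120.0), (80.0, 0.0)),
}


def periodic_difference(a, b):
    """Smallest difference between two angles in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def classify_beta_turn(phi1, psi1, phi2, psi2, tolerance=30.0):
    """Return 'I', 'II', or None for the angles of residues (i+1) and (i+2).

    The tolerance (degrees) is an illustrative assumption, not a standard.
    """
    observed = (phi1, psi1, phi2, psi2)
    for turn_type, (res1, res2) in IDEAL_BETA_TURNS.items():
        ideal = res1 + res2   # concatenate into (phi1, psi1, phi2, psi2)
        if all(periodic_difference(o, i) <= tolerance
               for o, i in zip(observed, ideal)):
            return turn_type
    return None


print(classify_beta_turn(-65.0, -25.0, -95.0, 5.0))    # I
print(classify_beta_turn(-65.0, 125.0, 85.0, -5.0))    # II
```

A return value of None only means the angles do not match type I or type II within the chosen tolerance; the segment may still be one of the other turn types.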

Another type of turn is the γ turn. γ turns involve only 3 residues of the polypeptide, and are stabilized by an h-bond between the amide carbonyl oxygen of residue (i) and the amide hydrogen of residue (i+2). Since they involve fewer residues, they are a lot tighter than β turns. They are also a lot more constrained and less favoured energetically, due to the less than optimal geometry of the h-bond. Therefore, they are a lot less common.

Turns are not only important as conformational motifs that aid in the formation of other structural motifs, such as β sheets. They have also been found to play a very important role in protein-protein and peptide-protein recognition. Turns are usually found on the surface of proteins, making the chain reverse away from the solvent. A turn motif can therefore expose four residues of the polypeptide chain to the 'world', and with them a 'message'. The bioactive conformation of many hormones and neuropeptides (enkephalin, for example) is believed to be a β turn.

5) Finally, we have to mention a badly named structural motif. What about everything else that does not fall into α helix, β sheet, collagen helix, or turns? Polypeptide segments that don't fall into any of these categories are usually referred to as having a random coil conformation. Despite the name, there is nothing random about them. Most of the time, the randomness comes from the fact that their structure is ill defined by the physical techniques used to study them.

A random coil section of a polypeptide can actually be forming an α helix for a short period of time, then shifting to a collagen helix, then to a β strand, etc., etc. However, the change in conformation is usually too fast for us to see it with the techniques available (X-ray and NMR), and we end up seeing a fuzzy (random) average.

Most of the time, the C-termini and N-termini of a polypeptide chain will appear as random coils in X-ray and NMR derived structures. Also, polypeptide loops connecting different β strands and α helices will appear as random coils. Additionally, we will see regions of the polypeptide that poke out from a protein into solution as random coils.

If it is finished, you may want to check out all the conformational features of polypeptide chains we have discussed here in my very lame Chime page. It will hopefully help you see all these conformational motifs in 3D. You will need the CHIME plug-in installed on your computer.

Next class we will continue looking at protein 3D structure.


Prepared by Guillermo Moyna, 1999.