Proteins V - Protein structure determination
All this mumbo-jumbo was extremely interesting, but how the heck do we find the 3D structure of a protein? Several techniques can be used. Most of them, like circular dichroism and fluorescence, can tell us about the types and percentages of the different elements of secondary structure present in a polypeptide, but nothing about their 3D conformation or tertiary structure. X-ray crystallography and nuclear magnetic resonance (NMR), on the other hand, can actually give us a 3D picture at atomic resolution of the conformation (or ensemble of conformations) of a protein. We will try to understand how these two techniques work at a level low enough that nobody is going to be helplessly lost.
X-ray crystallography
X-ray crystallography was the first technique used to study the 3D structure of proteins, dating back to the 1930s. The first 3D structure of a protein, myoglobin, was determined by X-ray crystallography in the late 1950s by John Kendrew.
In order to understand the basics of how X-ray crystallography works, we can make an analogy with the way a snapshot works: Light from the Sun or a lamp shines on the object we want to take a picture of. The light reflected from the surface of the object gets through the lenses of the camera and activates the photographic film. We have to remember that while the shining light is polychromatic (many colors), the wavelength (color) of the reflected light will depend on the surface reflecting it: Red stuff will reflect redder light, blue stuff will reflect bluer light, etc. The reflected light also has an energy associated with it, which is what is responsible for activating the photographic film.
With this in mind, we could in principle take pictures of smaller and smaller objects. The problem is that we reach a point at which the objects we are trying to photograph using normal light, that is, light with wavelengths in the visible range (around 400 to 700 nm), are smaller than the wavelength of the light we are shining on them. At that point they become effectively invisible: the light waves are too coarse to pick out their details, and we get no usable image from the light they send back.
The solution is to use light sources with shorter and shorter wavelengths. If we want to 'see' atoms (or more accurately, the electron clouds buzzing around them), we have to use very short wavelengths. The wavelength has to be comparable to the size of the object we want to see, which in the case of atoms or bonds is around 0.7 to 1.5 Å (0.07 to 0.15 nm). These wavelengths correspond to X-rays.
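To put some numbers on this, here is a small Python sketch (not part of the original notes) that compares visible light and X-rays against a typical bond length. The 550 nm 'green' wavelength and the 1.5 Å bond length are illustrative choices, and the photon energy comes from the standard relation E = hc/λ.

```python
# Physical constants (exact SI values).
PLANCK = 6.62607015e-34          # J s
LIGHT_SPEED = 2.99792458e8       # m/s
ELECTRON_VOLT = 1.602176634e-19  # J

BOND_LENGTH = 1.5e-10  # m, illustrative bond length (an assumption for this sketch)


def photon_energy_ev(wavelength_m):
    """Photon energy in electron volts, from E = h*c/lambda."""
    return PLANCK * LIGHT_SPEED / wavelength_m / ELECTRON_VOLT


# Illustrative wavelengths: green visible light and a 1 angstrom X-ray.
for name, wavelength in [("green light", 550e-9), ("X-ray", 1.0e-10)]:
    ratio = wavelength / BOND_LENGTH
    energy = photon_energy_ev(wavelength)
    print(f"{name}: wavelength = {ratio:.1f} bond lengths, photon energy = {energy:.1f} eV")
```

The visible wavelength comes out thousands of times longer than the bond, while the X-ray wavelength is of the same order as the bond itself.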
Now, the problem is that we cannot simply put a camera in front of the molecule and see the reflected light coming from it. What we do is put the film on the other side of the molecule, and see how the X-ray beam passes through it. We record the diffraction of the X-rays by the molecule, or in other words, we record how the molecule affects the angle and intensity of our X-ray beam. The X-rays coming through make a pattern on the film which depends on how the different atoms in the molecule affected their angles and intensities. By analyzing the pattern we can reconstruct the structure of our molecule.
It is sort of a forensic science in a way: A forensic investigator can tell the trajectory of a bullet through a room (and whoever was in its path) by analyzing the angle at which the bullet hits a wall. He can also determine how fast the bullet was traveling by analyzing how deep it got stuck into the wall. By looking at how mangled the bullet gets, he can also tell about the strength of the matter it passed through. With all this data, he can tell from where the bullet was fired, the possible composition of the objects it went through, and even if the objects were moving or not. He can also say if the bullet hit a bone on a victim. If we get even darker in our analogy, he can analyze blood spatter patterns and assess if the victim was falling, standing, kneeling, etc.
In X-ray crystallography we do something similar. We look at the angles at which the X-ray beam was diffracted, how much intensity it lost, etc., and from this we can make a picture of the electron clouds it must have bumped against while passing through the molecule.

The problem with this explanation is that we used one molecule. As you all know, we cannot hold a molecule on a stick and shine X-ray beams on it. We have to use a sizeable amount of sample. Now comes a problem: If our sample is amorphous, like the powder in an aspirin tablet, each molecule in it will have a completely random orientation, and each will be at a different angle with respect to the X-ray beam. This makes the diffraction pattern almost completely useless for the purpose of determining our structure, because the picture we get on the photographic film has no information on how the molecules were arranged with respect to one another: An atom in a particular molecule in our sample could have been pointing up, giving a certain diffraction pattern, while the same atom in another molecule of the sample could be pointing sideways, or down, giving a completely different pattern in the diffraction map.
How do we solve this? We need a sample with an ordered orientation of molecules in it, and we can find this in a crystal. In a crystal all molecules are oriented in a particular way with respect to one another. Even if we have several orientations, these will be regular and repeating. We will have a relatively small number of orientations, and the different orientations will have some sort of ordering with respect to one another, or symmetry.
Now, although solvable, having different orientations (even if regular) makes the analysis of the diffraction patterns more complicated. On top of that, remember that the molecule is a protein, with thousands of atoms. Things get really hairy. Obviously, the analysis of the diffraction pattern in this case has to be done in a big honking computer. Furthermore, we have a collection of diffraction angles and intensities, and we have to transform them into pictures of electron clouds around the atoms in the protein, or electron density maps. This transformation between diffraction angles and intensities and 3D electron densities is done with a mathematical algorithm called the Fourier Transform. What we get at the very end is stuff that looks like this:

We are not done yet. The final step in the X-ray determination of a structure is trying to put the protein in a conformation that matches the electron density map, and therefore the diffraction pattern. This process is known as refinement, and again, is done by computer, using molecular modeling tools.
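To make the Fourier transform step a bit less mysterious, here is a minimal one-dimensional toy in Python. It is only a sketch under simplifying assumptions: the 'unit cell' is a line of 16 grid points, the two 'atoms' are single spikes of made-up height, and the function names are ours. Real crystallographic software works in 3D with measured data.

```python
import cmath
import math


def structure_factors(density):
    """Forward transform: F(h) = sum over x of rho(x) * exp(2*pi*i*h*x/N)."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def electron_density(factors):
    """Fourier synthesis: rho(x) = (1/N) * sum over h of F(h) * exp(-2*pi*i*h*x/N)."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * math.pi * h * x / n) for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]


# Toy 1D unit cell: 16 grid points, two spikes standing in for atoms.
true_density = [0.0] * 16
true_density[3] = 8.0    # heavier "atom" (made-up height)
true_density[10] = 6.0   # lighter "atom" (made-up height)

factors = structure_factors(true_density)   # complex numbers: amplitude and phase
recovered = electron_density(factors)       # back to a density map

for x, (original, rebuilt) in enumerate(zip(true_density, recovered)):
    print(f"x={x:2d}  original={original:4.1f}  rebuilt={rebuilt:4.1f}")
```

Each structure factor is a complex number, with an amplitude and a phase. When both are available, the inverse transform gives back the density we started from.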
X-ray is not all that peachy. There are many problems that stand in the way of getting a nice, well resolved, 3D structure of a protein:
- First, we need a crystal. As you probably know, crystallizing a small organic molecule can sometimes be painful. Imagine trying to crystallize a molecule with 10,000 atoms! Protein crystallization is more an art than a science, and requires lots and lots and lots of patience. Furthermore, there are proteins that don't crystallize, period.
- Second, when analyzing the diffraction pattern, we need to know the angles (or phases) at which the rays came out of the crystal. These are sometimes very hard to find, making the determination of the electron density map nightmarish.
- Third, although close, the conformation of the protein in a crystal is not exactly the conformation of the protein in solution. In a crystal we don't have water surrounding the protein, but other protein molecules. The polarity of the medium is different from that of the solution. In a crystal, we also have crystal packing forces (the forces between different molecules butting against each other) which will distort the conformation of the polypeptide with respect to its conformation in solution.
- Finally, the molecules in the crystal are in a pretty rigid environment. This will limit the movement that the protein may have in solution, which may be crucial for activity. Miss this, and you may miss how the protein works.
Many of these problems are solved by nuclear magnetic resonance (NMR), a technique that we will briefly discuss below. In any event, X-ray crystallography is an amazing technique, and the 3D structures of ~ 8000 proteins have been solved in this way. Look in the Protein Data Bank, which allows you to download and check out all the proteins that have been crystallized.
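The phase problem from the list above can be illustrated with a toy calculation. The film only records intensities, which give us the amplitudes of the diffracted waves but not their phases. The sketch below (a made-up 1D example with hypothetical names, not a real phasing method) shows what happens if we take the correct amplitudes and simply guess that all the phases are zero.

```python
import cmath
import math


def fourier_sum(values, sign):
    """Plain discrete Fourier sum; sign=+1 goes density -> waves, sign=-1 goes back."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * j * k / n) for j, v in enumerate(values))
        for k in range(n)
    ]


# Toy 1D density: two spikes standing in for atoms (made-up heights).
true_density = [0.0] * 16
true_density[3] = 8.0
true_density[10] = 6.0

factors = fourier_sum(true_density, +1)

# What the film records: intensities only. The phases are lost here.
intensities = [abs(f) ** 2 for f in factors]
amplitudes = [math.sqrt(i) for i in intensities]

# Naive guess: correct amplitudes, but every phase set to zero.
n = len(amplitudes)
guessed_density = [(v / n).real for v in fourier_sum(amplitudes, -1)]

true_peak = true_density.index(max(true_density))
guessed_peak = guessed_density.index(max(guessed_density))
print("tallest peak in the true density at x =", true_peak)
print("tallest peak in the zero-phase map at x =", guessed_peak)
```

With the wrong phases the map piles up at the origin instead of at the real atom positions, which is why so much effort goes into finding the phases.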
NMR spectroscopy
NMR is what I do, so sorry if I numb you with it. Let's see how much you remember about this. NMR is based on the absorption of electromagnetic radiation by the nuclei of atoms with spin number (I) different from zero. In biological macromolecules, the naturally abundant nuclei we normally use are 1H and 31P. Using protein engineering and chemical synthesis, we can get molecules with 1H, 13C, 15N, and 31P isotopes, which all have I of 1/2. In the most basic description of NMR, we can consider atoms with I of 1/2 as tiny bar magnets:

When we put them into a large magnetic field, these tiny bar magnets will tend to align with the large external magnetic field. However, quantum mechanics tells us that not all the atomic magnets will align in the same way, as larger macroscopic magnets would do. Instead, they will distribute between nuclei aligned with and against the magnetic field, orientations which we call states. Nuclei with I = 1/2 will have two states:

Obviously, nuclei aligned with the external magnetic field will have less energy than those aligned against it. The distribution between the two states, called α and β, is determined by the difference in their energies and the temperature. The energy difference depends on the type of nucleus and is proportional to the strength of the external magnetic field:

As we know from quantum mechanics, the difference in energy can also be related to a particular frequency, ν. Most importantly, for the same type of nucleus, say 1H, the energy also depends on the environment around the atom. Each nucleus in a molecule will be in a different environment, and therefore they will have different energies associated with the transition between the α and β states.
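For a feeling of the numbers involved, the sketch below uses the standard relations ΔE = hν, ν = (γ/2π)·B0, and the Boltzmann distribution. The 11.74 T field, the 298 K temperature, and the rounded gyromagnetic ratios are values we picked for illustration, not values from these notes.

```python
import math

PLANCK = 6.62607015e-34    # J s
BOLTZMANN = 1.380649e-23   # J/K

# Gyromagnetic ratios divided by 2*pi, in MHz per tesla (rounded literature values).
GAMMA_BAR_MHZ_PER_T = {"1H": 42.577, "13C": 10.708}

FIELD_TESLA = 11.74    # assumed magnet strength for this example
TEMPERATURE_K = 298.0  # assumed sample temperature


def resonance_frequency_mhz(nucleus, field_tesla):
    """Frequency of the alpha <-> beta transition: nu = (gamma / 2*pi) * B0."""
    return GAMMA_BAR_MHZ_PER_T[nucleus] * field_tesla


def beta_to_alpha_ratio(frequency_mhz, temperature_k):
    """Boltzmann population ratio N_beta / N_alpha = exp(-h*nu / (k*T))."""
    delta_e = PLANCK * frequency_mhz * 1e6
    return math.exp(-delta_e / (BOLTZMANN * temperature_k))


for nucleus in GAMMA_BAR_MHZ_PER_T:
    nu = resonance_frequency_mhz(nucleus, FIELD_TESLA)
    ratio = beta_to_alpha_ratio(nu, TEMPERATURE_K)
    print(f"{nucleus}: nu = {nu:.1f} MHz, N_beta/N_alpha = {ratio:.6f}")
```

The population ratio comes out extremely close to 1, so only a tiny excess of nuclei sits in the lower energy state. This is one reason why NMR needs a fair amount of sample.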
If we then irradiate our sample with radiation of different frequencies, different nuclei will be excited at different frequencies, depending on their chemical environment. This is the famous chemical shift that you know from organic chemistry. Making a very long story short, when we take the 1H-NMR spectrum of a protein, we see something like this:

Each different peak corresponds to a different proton in the protein. The first thing we have to do when analyzing a protein by NMR is assign each peak to one of the hydrogens in the protein, which is a time-consuming task. However, since there are regular patterns, computers can usually be used, and a big chunk of the work is commonly automated.
Now, this is all very nice, but how do we get a structure out of it? One thing we have forgotten to mention is the interactions between two nuclei in a molecule. The behaviour of a certain nucleus (A) in the molecule is perturbed if there is another nucleus (B) nearby. What is usually perturbed is the relaxation of the nucleus, i.e., the way the nucleus releases the energy we gave to it. If A has a buddy (B) nearby, it can pass some of the energy to it to relax. Therefore, nucleus B will have more energy, and its signal will be enhanced. Now, if we saturate a certain nucleus (A in this case), the intensities of nuclei close in space (B and others) will be affected. This phenomenon is called the Nuclear Overhauser Effect (NOE):

The NOE is (very roughly) proportional to the inverse sixth power of the distance between the two nuclei participating in it (1/r^6). Therefore, from measurements of NOE enhancements we can calculate approximate distances between nuclei in a molecule. This is what we need in order to compute a 3D structure of a molecule.
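One common way to turn the 1/r^6 dependence into a number is to compare each NOE against a pair of protons whose distance is already known, so that r = r_ref · (NOE_ref / NOE)^(1/6). The sketch below assumes this simple two-spin picture. The 1.8 Å reference distance (roughly a geminal CH2 pair) and all the intensities are made-up illustrative values, and `noe_distance` is our own helper name.

```python
def noe_distance(noe, ref_noe, ref_distance):
    """Approximate distance from r = r_ref * (NOE_ref / NOE)**(1/6)."""
    if noe <= 0 or ref_noe <= 0:
        raise ValueError("NOE intensities must be positive")
    return ref_distance * (ref_noe / noe) ** (1.0 / 6.0)


REF_DISTANCE = 1.8  # angstrom, assumed known distance of the reference proton pair
REF_NOE = 100.0     # intensity of the reference pair, arbitrary units

# Hypothetical NOE intensities, in the same arbitrary units as the reference.
for label, noe in [("strong", 50.0), ("medium", 10.0), ("weak", 1.5625)]:
    distance = noe_distance(noe, REF_NOE, REF_DISTANCE)
    print(f"{label} NOE ({noe}): roughly {distance:.2f} angstrom")
```

Because of the sixth root, a 64-fold drop in intensity only doubles the distance. This is why NOEs are usually seen only between protons that are a few Å apart, and why the distances we get are approximate.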
In the case described above we did it for a single proton. If we think that we have ~ 1,000 or more protons in a protein, doing it one by one would take forever. Fortunately, 'NMR spectroscopists do it in many dimensions'. Using a two-dimensional (2D) technique called NOESY, we can measure all the interactions (and enhancements) at the same time. What we get are things that look like this:

In this spectrum, what we have is correlations between different nuclei in the molecule. Each correlation has an intensity that is, as we mentioned above, proportional to 1/r^6, where r is the distance between the two nuclei giving rise to the correlation. Now, after we compute all these distances, we feed them to a computer molecular modeling program that uses them as constraints in the generation of possible structures in agreement with the NMR data. The program also uses other constraints, such as the covalent structure of the molecule and van der Waals interactions, to create the structures. What we get at the end is an ensemble of structures that represent the structure in solution of our protein. Since there are many structures (usually 20), we usually omit side chains and protons from our pictures:

By the way, this is MCP-1, the protein I have been showing in many of the examples. You can clearly see how NMR can tell the flexibility in solution of the ends of the polypeptide chain (they appear fuzzy in the picture). If you are really up to it, you may want to dive into my NMR spectroscopy course.
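The idea of using NOE distances as constraints in the structure calculation can be sketched in a few lines. The toy below is not how any particular modeling program works: the four-point 'chain', the restraint list, and the penalty function are all hypothetical. It only shows how a candidate conformation can be scored against a set of upper distance limits.

```python
import math

# Hypothetical NOE restraints: (atom i, atom j, upper distance limit in angstrom).
RESTRAINTS = [(0, 3, 5.0), (1, 3, 6.0)]


def violation_energy(coords, restraints):
    """Sum of squared amounts by which each distance exceeds its upper limit."""
    total = 0.0
    for i, j, upper in restraints:
        distance = math.dist(coords[i], coords[j])
        excess = max(0.0, distance - upper)
        total += excess ** 2
    return total


# Two made-up conformations of a four-point chain with 3.8 angstrom steps.
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

for name, coords in [("extended", extended), ("folded", folded)]:
    print(f"{name}: violation energy = {violation_energy(coords, RESTRAINTS):.2f}")
```

A real program would add terms for bond lengths, angles, and van der Waals contacts, and would search for many conformations with low penalty. Those conformations make up the ensemble.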
NMR has several advantages over X-ray crystallography:
- First, we don't need crystals, we just need the pure protein in solution. This not only allows us to investigate proteins that don't normally crystallize, but it also allows us to study the protein under near physiological conditions, i.e., in conditions that approach those in which the protein is usually found in the cell.
- Second, since the protein is in solution, we can actually study the movements of different sections of the polypeptide chain, or the dynamics of the polypeptide chain.
However, the larger the protein, the more 1H signals we will have, and the more complicated it will be to assign all the signals to the different hydrogens. Although there are several techniques that nowadays allow us to study relatively large proteins, the maximum size of protein whose structure can be determined almost routinely with NMR is ~ 30 kDa.
In most cases, X-ray, NMR, and molecular modeling simulations are used together to study the structure and function of proteins. Next time we will start analyzing how the 3D structure of proteins affects their biological function, using myoglobin and hemoglobin (two oxygen carrier proteins) as examples.
Prepared by Guillermo Moyna, 1999.