THE PROTEIN PUNDIT: 2010-07-18

To solve the structure of a protein molecule by X-ray crystallography, both the amplitude and the phase of each reflection, which together make up its structure factor (Fp), must be determined. Unfortunately, only the intensity (and from it the amplitude) can be measured directly in a crystallographic experiment; the phase must be derived indirectly. Typically the phase information is acquired in a "divide and conquer" approach: the substructure of "anomalous atoms" within the protein molecule is deciphered first, and this leads to estimates of the protein reflection phases.

In the standard approach today, the "anomalous atom" selenium replaces the sulfur atom in methionine, generating selenomethionine. The selenomethionine is incorporated into the protein by normal protein translation, and the modified protein is crystallized. At a specific X-ray wavelength, the "anomalous atoms" absorb X-rays and re-emit them with a delay, so the reflections from the modified protein differ from those of the unmodified protein. These differences provide the information needed to elucidate the "anomalous atom" substructure.

The traditional approach for solving the "anomalous atom" substructure is to calculate a Fourier transform using only the anomalous intensities, that is, the squares of the anomalous amplitudes. The amplitudes (essentially the square roots of the measured intensities) come from the simple equation fa = fpa - fp, where the amplitudes measured from the protein atoms alone (fp) are subtracted from those measured from the protein plus anomalous atoms (fpa). This produces a "difference Patterson map" containing specifically the interatomic vectors between all "anomalous atoms". Due to crystallographic symmetry, the vectors between symmetry-related copies of the same atom fall on a particular section of the Patterson map known as the Harker section.
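
The Patterson idea can be illustrated with a one-dimensional toy calculation. The sketch below is not a real phasing program: the two atom positions, the resolution limit `h_max`, and the function names are all made up for illustration, and the coefficients are computed directly from the toy atoms, whereas in a real experiment they would be approximated by the squared difference amplitudes (fpa - fp). It only shows that a Fourier sum over squared amplitudes, with no phases, peaks at the interatomic vector.

```python
import cmath
import math

def structure_factors(positions, h_max):
    """Structure factors F(h) of equal point atoms at fractional positions (1D toy cell)."""
    return {
        h: sum(cmath.exp(2j * math.pi * h * x) for x in positions)
        for h in range(-h_max, h_max + 1)
    }

def patterson(coeffs, u):
    """Patterson function P(u): a Fourier sum over squared amplitudes, no phases needed."""
    return sum(abs(f) ** 2 * math.cos(2 * math.pi * h * u) for h, f in coeffs.items())

# Hypothetical substructure: two "anomalous atoms" in a 1D unit cell.
anomalous_atoms = [0.10, 0.35]
fa_coeffs = structure_factors(anomalous_atoms, h_max=20)

# Scan half the cell (the map is centrosymmetric), skipping the origin peak near u = 0.
grid = [i / 100 for i in range(5, 51)]
peak_u = max(grid, key=lambda u: patterson(fa_coeffs, u))

# The strongest non-origin peak sits at the interatomic vector 0.35 - 0.10 = 0.25.
print(f"strongest non-origin Patterson peak at u = {peak_u:.2f}")
```

The peak gives only the separation between the two atoms, not their absolute positions; that is why the symmetry-generated Harker vectors matter.
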
Patterson maps use the coordinate system U, V, W, calculated from the simple relationships U = X1 - X2, V = Y1 - Y2, and W = Z1 - Z2, where, for example, X1 is the x coordinate of an atom and X2 is the x coordinate of its symmetry-related copy, with the same relationships for Y and Z. Because of crystallographic symmetry, these simple equations produce the Harker sections and, from them, the "anomalous atom" coordinates. For example, a 2-1 screw axis along the Y axis maps an atom at (X, Y, Z) to (-X, Y + 1/2, -Z). The relationship V = Y - (Y + 1/2) = -1/2, which is equivalent to 1/2 after a unit-cell translation, therefore produces the Harker section V = 1/2. Peaks on the V = 1/2 Harker section have the U coordinate U = X - (-X) = 2X, because that same screw axis converts X to -X. Measuring the U position of a peak on this Harker section and dividing it by 2 therefore gives the X coordinate of the "anomalous atom"; in the same way, W = 2Z gives Z. Once the coordinates X, Y, and Z have been found for every "anomalous atom", the positions are converted to the anomalous structure factor (Fa).

From this information one can calculate the estimated protein structure factors (Fp) by using the simple equation Fp = Fpa - Fa in combination with a Harker circle diagram. The initial estimate carries significant phase ambiguity, as there are two possible solutions to this equation (Fig. 1 shows the two possible solutions for the protein phase, highlighted by red asterisks, on the Harker circles). However, this phase ambiguity is resolved by "real space" modification of the protein molecule's electron density map. The density modification procedure is based on the immutable physical characteristics of protein molecules. Finally, the protein structure factors (Fp) are plugged into the Fourier transform, producing the protein molecule's initial electron density map. Subsequently, a model of the protein molecule is built into this initial electron density map and improved by iterative cycles of model building and refinement.
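
Two of the calculations described here, reading a coordinate off the Harker section of a 2-1 screw axis and finding the two protein phases allowed by the Harker circles, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the atom position, the structure factor values, and all function names are hypothetical, coordinates are fractional, and the phase step simply applies the law of cosines to Fpa = Fp + Fa, which is the algebraic form of intersecting the two circles.

```python
import cmath
import math

def screw_21_along_y(x, y, z):
    """Symmetry mate produced by a 2-1 screw axis along Y (fractional coordinates)."""
    return ((-x) % 1.0, (y + 0.5) % 1.0, (-z) % 1.0)

def harker_vector(x, y, z):
    """Patterson vector (U, V, W) between an atom and its screw-axis mate."""
    x2, y2, z2 = screw_21_along_y(x, y, z)
    return ((x - x2) % 1.0, (y - y2) % 1.0, (z - z2) % 1.0)

def protein_phase_candidates(fp_amp, fpa_amp, fa):
    """Two phases (radians) for Fp consistent with |Fp|, |Fpa| and a known complex Fa."""
    fa_amp, fa_phase = abs(fa), cmath.phase(fa)
    cos_term = (fpa_amp**2 - fp_amp**2 - fa_amp**2) / (2 * fp_amp * fa_amp)
    cos_term = max(-1.0, min(1.0, cos_term))  # guard against rounding error
    delta = math.acos(cos_term)
    return fa_phase + delta, fa_phase - delta

# Hypothetical anomalous atom position.
atom = (0.12, 0.30, 0.20)
u, v, w = harker_vector(*atom)
x_from_u = u / 2  # x + 1/2 fits the same peak equally well
print(f"Harker vector: U={u:.2f} V={v:.2f} W={w:.2f}; recovered x = {x_from_u:.2f}")

# Hypothetical reflection: the true Fp is used only to simulate the measured |Fpa|.
fp_true = cmath.rect(10.0, math.radians(60.0))
fa = cmath.rect(3.0, math.radians(20.0))
fpa_amp = abs(fp_true + fa)
candidates = protein_phase_candidates(abs(fp_true), fpa_amp, fa)
print("candidate protein phases (degrees):",
      [round(math.degrees(p), 1) for p in candidates])
```

In this made-up case one candidate is the true phase of 60 degrees and the other is its mirror image about the Fa phase (-20 degrees); the amplitudes alone cannot tell them apart. Note also that the V = 1/2 section carries no information about y, since the y terms cancel in the subtraction.
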
Protein model alterations continue until the structure factor amplitudes calculated from the model (Fc) converge with the structure factor amplitudes observed in the diffraction experiment (Fo). Once the refinement is done, the real excitement begins: the crystallographer interprets the complex structural features that influence the protein molecule's function.
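
The post does not say how the agreement between Fc and Fo is measured. One conventional measure is the crystallographic R-factor, the summed absolute difference between observed and calculated amplitudes divided by the summed observed amplitudes. The sketch below assumes that measure, and the amplitude values are invented purely for illustration.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor: sum(|Fo - Fc|) / sum(Fo) over matched reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have the same length")
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Hypothetical amplitudes for four reflections (not real data).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 58.0, 44.0]

r = r_factor(f_obs, f_calc)
print(f"R = {r:.3f}")  # lower values mean the model reproduces the data better
```

A value that stops decreasing between refinement cycles is one practical sign that the model has converged.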

Monday, July 19, 2010

Solving the phase problem.