
X-ray crystallography: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 03:26, 23 May 2007

Structure determination by X-ray crystallography.

X-ray crystallography, also known as single-crystal X-ray diffraction, is the oldest and most common crystallographic method for determining the structure of molecules. In general, it yields the position of most atoms to within a few tenths of an Ångström (which is 10⁻¹⁰ m, one tenth of one billionth of a meter), a resolution matched by few other methods. X-ray crystallography is used in chemistry and biochemistry to determine the structures of inorganic compounds, DNA, RNA and proteins. Such crystal structures have many scientific and medical applications; for example, they can shed light on the molecular events of enzyme catalysis or serve as the basis of structure-based drug design.

The technique of X-ray crystallography has three basic steps. The first and generally most difficult step is to produce an adequate crystal of the molecule(s) under study. The crystal must be sufficiently large, pure in composition and regular in structure, with no large internal imperfections such as cracks. In the second step, the crystal is placed in an intense beam of X-rays of a single wavelength, producing a series of spots called reflections. As the crystal is gradually rotated, previous reflections disappear and new ones appear; the intensity of every spot is recorded meticulously at every orientation of the crystal. Multiple data sets may have to be collected, with each covering a full rotation of the crystal and containing tens of thousands of reflection intensities. In the third step, these data are combined computationally with prior chemical information about the molecular structure to produce the atomic resolution model. Such models are often stored in large public databases.

Larger molecules such as proteins are much harder to crystallize than small molecules, primarily since they are difficult to purify and can unfold easily, restricting the solution conditions used for crystallization. Having many more atoms, such larger molecules also require much more information to determine their atomic positions accurately. Therefore, one generally distinguishes small-molecule crystallography, which typically involves molecular structures of fewer than 100 atoms, from macromolecular crystallography, which can involve tens of thousands of atoms. Crystallography has proven possible even for viruses with hundreds of thousands of atoms. However, larger molecules generally provide a less well-resolved picture of the atomic positions.

The term "X-ray crystallography" is sometimes applied to X-ray scattering techniques on polycrystalline materials, such as powder diffraction. More generally, X-ray crystallography belongs to a large family of scattering techniques, such as fiber diffraction and inelastic scattering.

Definition

An X-ray diffraction image for the protein myoglobin. The spots (reflections) are clearly visible.

X-ray crystallography involves the scattering of X-rays of a single wavelength from a single, pure crystal.[1] This scattering produces a diffraction pattern, a set of intense spots (also called reflections) on a screen behind it. The spots can be related to the density of electrons in the crystal through a mathematical operation called a Fourier transform. This operation gives the frequency components of any signal, just as the human ear can discern different musical notes in a chord. Each spot corresponds to a particular spatial variation of the electron density along a particular direction within the crystal. By combining these independent fluctuations, the electron distribution can be reconstructed, just as a chord can be played on the piano once its notes are known.[1]

The reflections vary in intensity, and by gradually rotating the crystal and recording the intensities of the spots, one may determine the magnitude of the Fourier transform of the density of electrons within the crystal. By using data on related molecules, or by recording several sets of data with specific changes in the scattering, the phases corresponding to these magnitudes may be computed. Combining the phases and magnitudes yields the full Fourier transform of the electron density, which may be inverted to obtain the electron density in terms of position within the crystal. The known chemical structure of the molecule(s) composing the crystal allows the electron density to be converted into a model of the position of every atom of the molecule(s) within the crystal.
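The role of magnitudes and phases can be illustrated with a small one-dimensional sketch. The eight-point `density` list is a made-up toy "unit cell" with two atoms, and the helpers `dft` and `inverse_dft` are hypothetical names written for this illustration, not part of any crystallographic package. Under those assumptions, the sketch suggests that magnitudes combined with the correct phases return the density, whereas the same magnitudes with all phases set to zero do not.

```python
import cmath
import math


def dft(values):
    """Discrete Fourier transform of a sequence (toy stand-in for the diffraction step)."""
    n = len(values)
    return [
        sum(values[x] * cmath.exp(-2j * math.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse discrete Fourier transform (toy stand-in for computing a density map)."""
    n = len(coeffs)
    return [
        sum(coeffs[h] * cmath.exp(2j * math.pi * h * x / n) for h in range(n)) / n
        for x in range(n)
    ]


# Assumed toy density: a heavy "atom" at position 2 and a light one at position 6.
density = [0, 0, 5, 0, 0, 0, 2, 0]

coeffs = dft(density)
magnitudes = [abs(c) for c in coeffs]      # what the spot intensities provide
phases = [cmath.phase(c) for c in coeffs]  # what must be obtained separately

# Magnitudes plus correct phases: the density is recovered.
recombined = [m * cmath.exp(1j * p) for m, p in zip(magnitudes, phases)]
recovered = [round(v.real, 6) for v in inverse_dft(recombined)]

# Same magnitudes with every phase set to zero: a different, symmetric map.
phaseless = [round(v.real, 6) for v in inverse_dft([complex(m, 0.0) for m in magnitudes])]

print(recovered)
print(phaseless)
```

In a real experiment only the magnitudes are measured, which is why the phases have to come from related structures or from additional data sets.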

History

X-ray diffraction by a crystal was first observed in 1912 by Max von Laue and his co-workers, using a crystal of copper sulfate. Its potential for determining the structure of molecules, then only known vaguely from chemical and hydrodynamic experiments, was realized immediately. Early pioneers included William Bragg and John Desmond Bernal.

A representation of the 3D structure of myoglobin, showing colored alpha helices. Myoglobin was the first protein to have its structure solved by X-ray crystallography, by Sir John Cowdery Kendrew and co-workers in 1958; Kendrew shared the 1962 Nobel Prize in Chemistry with Max Perutz for work on protein structures.

Many complicated inorganic and organometallic systems, such as fullerenes and metalloporphyrins, have been analyzed using single-crystal methods. Single-crystal diffraction is also used in the pharmaceutical industry, due to recent problems with polymorphs. The major limitation on the quality of single-crystal data is crystal quality; these crystals are usually obtained by recrystallization.

The first solved protein crystal structure was that of sperm whale myoglobin, determined by Sir John Cowdery Kendrew and co-workers in 1958; Kendrew and Max Perutz shared the 1962 Nobel Prize in Chemistry for their work on protein structures.[2] Today X-ray crystallography is used by pharmaceutical companies to determine specifically how drug lead compounds interact with their protein targets.[3] Biological X-ray crystallography is, to date, the most prolific discipline within the area of structural biology; out of the ~42000 protein structures solved, X-ray crystallography is responsible for ~36000. NMR spectroscopy has contributed almost 6000 and electron microscopy just over 140. Other biophysical methods, such as IR spectroscopy and powder diffraction, make up the remaining structures, according to the Protein Data Bank (PDB).[4]

The number of protein structures that have been determined is growing rapidly, and current high-throughput methods have allowed the development of the new field of structural genomics. These projects aim to determine over 10,000 protein structures over the next few years, giving a vast new resource to the structural biology field.[5] However, a major hindrance to these efforts is the difficulty in crystallizing membrane proteins, which make up a large proportion of proteins of pharmacological importance, such as ion channels and receptors.[6][7]

Scattering technology

Elastic vs. inelastic scattering

X-ray crystallography is a form of elastic scattering; the outgoing X-rays have the same energy as the incoming X-rays, only with altered direction. Since the energy of a photon is inversely proportional to its wavelength, elastic scattering means that the outgoing photons have the same wavelength as the incoming photons. By contrast, inelastic scattering occurs when energy is transferred from the incoming X-ray to the crystal, e.g., by exciting an inner-shell electron to a higher energy level. Such inelastic scattering changes the wavelength of the outgoing beam, making it longer and less energetic. Inelastic scattering is useful for probing such excitations of matter, but is not as useful in determining the distribution of scatterers within the matter, which is the goal of X-ray crystallography.
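The inverse relation between photon energy and wavelength, E = hc/λ, can be checked numerically. The sketch below is illustrative only: the function name `photon_energy_kev` is hypothetical, the wavelength of 1.5418 Å is a commonly quoted value for the copper Kα line and is an assumption not taken from this article, and the 1% wavelength increase is an arbitrary stand-in for an inelastic event.

```python
PLANCK_EV_S = 4.135667696e-15    # Planck constant in eV*s
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light in m/s


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in Angstroms (E = h*c / wavelength)."""
    wavelength_m = wavelength_angstrom * 1e-10
    return PLANCK_EV_S * LIGHT_SPEED_M_S / wavelength_m / 1000.0


CU_K_ALPHA_ANGSTROM = 1.5418  # assumed value for the copper K-alpha line

# Elastic scattering: same wavelength out as in, hence the same energy.
energy_in = photon_energy_kev(CU_K_ALPHA_ANGSTROM)

# Inelastic scattering (arbitrary 1% longer wavelength): the photon has lost energy.
energy_out_inelastic = photon_energy_kev(CU_K_ALPHA_ANGSTROM * 1.01)

print(f"incoming photon: {energy_in:.3f} keV")
print(f"after inelastic scattering: {energy_out_inelastic:.3f} keV")
```

A wavelength of about 1.5 Å thus corresponds to a photon energy of roughly 8 keV, and any increase in wavelength shows up directly as an energy loss.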

The wavelength of an X-ray is roughly 1 Å (0.1 nm = 10⁻¹⁰ m), which is on the scale of a single atom. Longer wavelength photons (such as ultraviolet radiation) would not have sufficient resolution to determine the atomic positions. At the other extreme, shorter wavelength photons such as gamma rays are difficult to produce in large numbers, difficult to focus, and interact too strongly with matter, producing particle-antiparticle pairs. Therefore, X-rays are the "sweet spot" for wavelength when determining the structure of molecules using electromagnetic radiation.

Other types of X-ray scattering

X-ray crystallography involves the scattering of X-rays from a single crystal. There are other forms of elastic X-ray scattering, such as powder diffraction, SAXS and various forms of X-ray fiber diffraction, which was used by Rosalind Franklin in determining the double-helix structure of DNA. All of these scattering methods generally use monochromatic X-rays, that is, X-rays restricted to a single wavelength with minor deviations. In general, single-crystal X-ray diffraction produces isolated spots ("reflections"), while the other methods produce smooth, continuous scattering. However, none of these other techniques offers as much structural information as single-crystal X-ray diffraction.

A typical protein crystal produces 20-30 thousand reflections, each of which represents an independent piece of data about the structure. Full sets of reflections are typically collected under different conditions, either by scattering with multiple wavelengths of X-rays or with small metallic additives that help in solving the structure. These hundreds of thousands of data are assembled by powerful computers into the atomic-resolution model of the electron density.

No other structural technique offers so many independent data on the structure; for example, protein NMR typically collects a hundred-fold fewer data. Thanks to its enormous amount of independent data, X-ray crystallography is the best technique for determining a static atomic-resolution structure of any molecule. However, X-ray crystallography requires a crystal, and obtaining a crystal of sufficient quality is generally the key stumbling block to determining the structure. Some crystallographers have devoted over ten years of their lives to obtaining a single crystal of an important protein, e.g., the viral protein gp120 used by HIV to enter human cells.

Electron and neutron diffraction

As derived below, elastic scattering can be represented as a Fourier transform of the density of the objects doing the scattering (the "scatterers"), as long as the scattering is weak; the scattered beams should be much less intense than the incoming beam. When the scattering is weak, the scattered beams do not produce re-scattered beams of their own, at least not with any significant amplitude. Re-scattered waves are called "secondary scattering". This is not a problem for X-ray diffraction, since the X-rays interact relatively weakly with the electrons. However, this may be a problem for diffraction with other types of beams, such as electron diffraction. There, the interaction between the electrons in the incoming beam and in the crystal is a million-fold stronger, resulting in significant secondary scattering. The solution is to use very thin samples; the primary scattered electron beams leave the sample before they have a chance to undergo secondary scattering. A promising direction is the electron diffraction of isolated macromolecular assemblies, such as viral capsids and molecular machines, which may be carried out with a cryo-electron microscope.

Neutron diffraction is an excellent method for structure determination, although it has been difficult to obtain intense, monochromatic beams of neutrons in sufficient quantities. Traditionally, nuclear reactors have been used, although the new Spallation Neutron Source holds much promise in the near future. Being uncharged, neutrons scatter much more readily from the atomic nuclei than from the electrons. Therefore, neutron scattering is very useful for observing the positions of light atoms with few electrons, especially hydrogen, which is essentially invisible in the X-ray diffraction of larger molecules. Neutron scattering also has the remarkable property that the solvent can be made invisible by adjusting the ratio of normal water, H₂O, and heavy water, D₂O.

Advantages of a crystal

A crystalline sample is by definition periodic; a crystal is composed of a unit cell repeated over and over in three independent directions. Such periodic systems have a Fourier transform that is concentrated at periodically repeating points in reciprocal space known as Bragg peaks; the Bragg peaks correspond to the reflection spots observed in the diffraction image. Since the amplitude at these reflections grows linearly with the number N of scatterers, the observed intensity of these spots should grow quadratically, like N². In other words, using a crystal concentrates the weak scattering of the individual unit cells into a much more powerful, coherent reflection that can be observed above the noise. This is an example of constructive interference.
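The quadratic growth of the peak intensity can be sketched by adding up the waves scattered by a row of identical unit cells. This is a minimal one-dimensional model with assumed unit scattering amplitude per cell; the function name `scattered_intensity` and the off-peak phase step of 0.37 × 2π are illustrative choices, not values from the article.

```python
import cmath
import math


def scattered_intensity(n_cells, phase_step):
    """Intensity of the summed wave from n_cells identical scatterers.

    Each successive cell contributes a unit-amplitude wave shifted by
    phase_step radians relative to its neighbour.
    """
    amplitude = sum(cmath.exp(1j * phase_step * k) for k in range(n_cells))
    return abs(amplitude) ** 2


# At a Bragg peak the phase step is a whole multiple of 2*pi, so all waves add in step.
on_peak_10 = scattered_intensity(10, 2 * math.pi)   # approximately 10**2 = 100
on_peak_20 = scattered_intensity(20, 2 * math.pi)   # approximately 20**2 = 400

# Away from a peak the contributions largely cancel.
off_peak_20 = scattered_intensity(20, 2 * math.pi * 0.37)

print(on_peak_10, on_peak_20, off_peak_20)
```

Doubling the number of cells quadruples the on-peak intensity, while the off-peak intensity stays of order one regardless of the number of cells.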

In a non-crystalline sample, molecules within that sample would be in random orientations and therefore would have a continuous Fourier spectrum that spreads its amplitude more uniformly and with a much reduced intensity, as is observed in SAXS. More importantly, the orientational information is lost. In the crystal, the molecules are all held at precisely the same orientation within the crystal, whereas in a liquid, powder or amorphous state, the observed signal is averaged over all possible orientations of the molecules. It has proven nearly impossible to obtain atomic-resolution structures from such rotationally averaged scattering data. An intermediate case is fiber diffraction in which the subunits are arranged periodically in at least one dimension, if not in three dimensions.

Methods

Crystallization and data collection

Crystallization

A protein crystal seen under a microscope. Most crystals used in X-ray crystallography are less than a millimeter across.

Crystallography requires a crystal, and growing a diffraction-quality crystal is almost always the chief barrier to solving the atomic-resolution structure of the molecule(s).[8] The process of growing crystals is poorly understood, and is often described as an art rather than a science.[9]

The basic idea is to lower the solubility of the molecules very gradually; if done too quickly or in a random way, the molecules will simply precipitate, forming a dust on the bottom of the container. Crystal growth is typically marked by two stages: nucleation of a microscopic crystallite (possibly having only 100 molecules), followed by growth.[10] The solution conditions that favor nucleation are not always the same conditions that favor its subsequent growth. The goal is to identify solution conditions that favor the development of a single, large crystal, since larger crystals offer improved resolution of the molecule. Consequently, the solution conditions should disfavor nucleation but favor growth, so that only one crystal forms per droplet. If nucleation is favored too much, a shower of small crystallites will form in the droplet, rather than one large crystal; if favored too little, no crystal will form whatsoever.

It is extremely difficult to predict the proper conditions for nucleation or growth of well-ordered crystals.[11] Therefore, in practice, favorable conditions are identified by screening; a very large batch of the molecules is prepared, and then a huge variety of crystallization solutions are tested.[12] It is not uncommon to try out hundreds, even thousands, of solution conditions before finding one that works. The solution conditions differ in how they lower the solubility of the molecule; some vary in pH, some contain salts of the Hofmeister series or chemicals that lower the dielectric constant of the solution, and still others contain large polymers such as polyethylene glycol that drive the molecule out of solution by entropic effects. Some solution conditions combine such additives as well. It is also common to try several temperatures for encouraging crystallization, or to gradually lower the temperature so that the solution becomes supersaturated. These methods require large amounts of the target molecule, as they use high concentration of the molecule(s) to be crystallized, e.g., up to 100 mg/ml of protein, but more typically between 10-20 mg/ml.

Some factors are known to inhibit crystallization. The growing crystals should be held at a constant temperature and not subjected to any shocks or vibrations. Impurities in the molecules to be crystallized or in the crystallization solutions are often inimical to crystallization. This requirement for high purity is a particular burden for protein crystallographers, since purifying large amounts of a single protein (typically, 50-100 mg) for screening is often difficult, particularly for eukaryotic proteins.[13] Conformational flexibility of the crystallizing molecules — such as flexible loops or even disordered, unfolded domains of a protein — inhibits crystallization by entropic effects.[8] At the other extreme, molecules that spontaneously self-associate — such as proteins that assemble into fibrils — do not crystallize well in general.

Twinning can occur when a unit cell can pack equally favorably in multiple orientations; this is a problem, since single crystals are generally required in crystallography.[14] Although recent advances in computational methods have begun to allow the structures of twinned crystals to be solved, it is still very difficult. Twinning often occurs when the unit cell has a net electric dipole moment (ferroelectric crystals), because of a subtle interaction between the dipoles of the bulk crystal and the polarization charges expressed on the crystal's surface.

Crystallization methods

Crystallization of small molecules has traditionally followed three methods:

  • Diffusion gradient (solubility or temperature)
  • Concentration through evaporation
  • Sublimation (not recommended due to low-quality crystals)

Even though small molecules are easier to crystallize than macromolecules, there are many compounds reported that have failed to give diffraction quality crystals.

Crystallization of macromolecules is not trivial. Traditional methods of crystallizing inorganic molecules have been modified to be gentle enough for proteins, which are sensitive to temperature and high concentrations of organic solvents. Many methods exist to crystallize proteins, but the two most successful methods are the microbatch and vapor diffusion techniques.[15] Concentrated solutions of the protein are mixed with various solutions, which typically consist of:

  • a buffer to control the pH of the experiment
  • a precipitating agent, to induce supersaturation (typically polyethylene glycols, salts such as ammonium sulphate, or organic alcohols).
  • other salts or additives, such as detergents or molecule co-factors

In either microbatch or vapor diffusion, the protein solutions are allowed to concentrate over time. The most common modern methods of crystallization use vapor diffusion, such as the hanging drop and sitting drop methods. In the hanging drop method, a small droplet of concentrated protein- and precipitant-containing solution is applied to a glass coverslip that is then inverted. The droplet is suspended above a larger reservoir of a similar solution lacking protein but containing a higher concentration of precipitant. In some cases, a thin coating of oil is added to the droplet to slow the vapor diffusion. A closed environment is formed containing the suspended droplet and reservoir. Over time, the droplet containing protein equilibrates with the larger reservoir beneath it as water evaporates from the droplet and transfers to the reservoir, effectively increasing the precipitant concentration in the protein droplet. In solutions of a favorable composition, the protein becomes supersaturated and crystal nuclei form, leading to crystal growth. This is the optimal outcome. Otherwise (and more typically), the protein precipitates out of solution as a useless amorphous mass. Protein crystallographers typically screen hundreds or thousands of conditions before finding one that leads to a crystal of suitable quality. As a rule of thumb, some useful structural detail can be gained from a crystal that diffracts to a resolution better than 4 Ångströms (400 pm).
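The concentrating effect of vapor diffusion can be estimated with simple bookkeeping. The sketch below rests on idealized assumptions that are not stated in the article: only water leaves the droplet, the reservoir is so large that its concentration does not change, and equilibrium is reached when the precipitant concentration in the droplet matches that of the reservoir. The function name `equilibrate_drop` and the example volumes and concentrations are hypothetical.

```python
def equilibrate_drop(drop_volume_ul, protein_mg_ml, drop_precipitant, reservoir_precipitant):
    """Idealized end state of a vapor-diffusion droplet.

    Assumes that only water leaves the droplet and that the droplet stops
    shrinking once its precipitant concentration equals the reservoir's.
    Returns (final volume in uL, final protein mg/ml, final precipitant concentration).
    """
    if not 0 < drop_precipitant <= reservoir_precipitant:
        raise ValueError("droplet must start at or below the reservoir concentration")
    shrink_factor = drop_precipitant / reservoir_precipitant
    final_volume = drop_volume_ul * shrink_factor
    final_protein = protein_mg_ml / shrink_factor
    return final_volume, final_protein, reservoir_precipitant


# Hypothetical setup: 1 uL of 20 mg/ml protein mixed with 1 uL of reservoir solution
# (20% precipitant) gives a 2 uL droplet at 10 mg/ml protein and 10% precipitant.
final_state = equilibrate_drop(
    drop_volume_ul=2.0,
    protein_mg_ml=10.0,
    drop_precipitant=10.0,
    reservoir_precipitant=20.0,
)
print(final_state)  # (1.0, 20.0, 20.0)
```

Under these assumptions a droplet mixed one-to-one with reservoir solution loses half its volume, so both the protein and the precipitant end up at twice their starting concentrations in the droplet.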

There are many methods of crystallization, such as batch methods (mix the two solutions directly), simple dialysis (proteins can be un-"salted in"), concentration dialysis (proteins crystallize as their concentration is increased gradually), liquid diffusion, and the classic method of evaporation.

Many biomolecules of interest still have not been successfully crystallized. Imperfections in the crystal structure, caused by impurities, sample contamination, or multiple stable conformations of the subject protein can prevent the acquisition of atomic resolution images. Convection caused by temperature variations within the forming crystal can also cause imperfections, and one of the proposed scientific applications of the International Space Station is the growth of crystals, because convection is reduced in the free fall environment of an orbiting spacecraft.[16]

Oftentimes, one observes a crystal growing that is unfortunately not composed of the molecule of interest. For example, the salts of the crystallization solution used to crystallize a protein may themselves form a crystal. Such crystals can be distinguished with dyes or by checking the stiffness of the crystal. Protein crystals are pliant and absorb small dyes, being typically 50% water in their makeup. By contrast, salt crystals do not absorb dyes and are much more rigid.

Mounting the crystal

Once they are full-grown, the crystals are mounted so that they may be held in the X-ray beam and rotated. There are several methods of mounting. One old-fashioned way was to load the crystal into a glass capillary with some of the crystallization solution (the mother liquor). A more modern approach is to scoop the crystal up in a tiny loop, made of nylon or plastic and attached to a solid rod, that is then flash-frozen with liquid nitrogen.[17] This freezing reduces the radiation damage of the impinging X-rays, as well as the noise in the Bragg peaks due to thermal motion (the Debye-Waller effect). However, untreated crystals will usually crack if flash-frozen; therefore, they must be pre-soaked in a cryoprotectant solution prior to freezing.[18] Unfortunately, this pre-soak may itself cause the crystal to crack, ruining it for crystallography. Generally, a successful cryo-condition is obtained through trial and error.

The capillary or loop is mounted on a goniometer, which allows it to be positioned accurately within the X-ray beam and rotated. Since both the crystal and the beam are so small in general, the crystal must be centered within the beam to within roughly 25 microns accuracy, which is aided by a camera focused on the crystal. The most common type of goniometer is the "kappa goniometer", which offers three angles of rotation: the ω angle, which rotates about an axis roughly perpendicular to the beam; the κ angle, about an axis at roughly 50° to the ω axis; and, finally, the φ angle about the loop/capillary axis. When the κ angle is zero, the ω and φ axes are aligned. The κ rotation allows for convenient mounting of the crystal. The oscillations carried out during data collection (mentioned below) are carried out about the ω axis only. An older type of goniometer is the four-circle goniometer, and its relatives such as the six-circle goniometer.
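The three stacked rotations of a kappa goniometer can be sketched as a product of rotation matrices. The coordinate frame below is an assumption made for illustration (ω axis along z, κ axis tilted by 50° from it in the x-z plane, φ axis along z at the zero position), as are the names `rotation_matrix`, `matmul` and `crystal_orientation`; real instruments define their own conventions.

```python
import math


def rotation_matrix(axis, angle_deg):
    """3x3 matrix for a rotation of angle_deg about the given axis (Rodrigues' formula)."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    t = math.radians(angle_deg)
    c, s, v = math.cos(t), math.sin(t), 1.0 - math.cos(t)
    return [
        [c + x * x * v, x * y * v - z * s, x * z * v + y * s],
        [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
        [z * x * v - y * s, z * y * v + x * s, c + z * z * v],
    ]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


KAPPA_TILT_DEG = 50.0  # angle between the kappa and omega axes, as quoted in the text
OMEGA_AXIS = (0.0, 0.0, 1.0)  # assumed frame: omega along z
KAPPA_AXIS = (
    math.sin(math.radians(KAPPA_TILT_DEG)),
    0.0,
    math.cos(math.radians(KAPPA_TILT_DEG)),
)
PHI_AXIS = (0.0, 0.0, 1.0)  # coincides with omega when kappa is zero


def crystal_orientation(omega_deg, kappa_deg, phi_deg):
    """Overall rotation of the crystal: omega carries kappa, which carries phi."""
    inner = matmul(rotation_matrix(KAPPA_AXIS, kappa_deg), rotation_matrix(PHI_AXIS, phi_deg))
    return matmul(rotation_matrix(OMEGA_AXIS, omega_deg), inner)


# With kappa at zero, omega and phi act about the same axis and simply add.
aligned = crystal_orientation(30.0, 0.0, 20.0)
single_turn = rotation_matrix(OMEGA_AXIS, 50.0)
```

In this model, setting κ to zero makes `aligned` equal to a single 50° turn about the ω axis, while a non-zero κ tips the φ axis away from ω and gives access to other crystal orientations.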

X-ray sources

Crystals are then mounted on a diffractometer coupled with an X-ray source, a machine that emits a beam of monochromatic X-rays. The brightest and most useful X-ray sources are synchrotrons; their much higher luminosity allows for better resolution. They also make it convenient to tune the wavelength of the radiation, which is useful for Multi-wavelength Anomalous Dispersion (MAD) phasing, described below. Synchrotrons are generally national facilities, each with several dedicated beamlines where data is collected around the clock, seven days a week. Crystallographers apply for a slot of time, which they must use whenever it is granted, even at 3am on a national holiday. Crystallographers will sometimes stay awake for days, collecting data continuously until their allotted time runs out.

A diffractometer

The smaller X-ray sources found in laboratories are based on the following mechanism. Electrons are boiled off a cathode and accelerated through a strong electric potential of roughly 50 kV; having reached a high speed, the electrons collide with a metal plate, emitting bremsstrahlung and some strong spectral lines corresponding to the excitation of inner-shell electrons of the metal. The most common metal used is copper, which can be kept cool easily, due to its high thermal conductivity, and which produces strong Kα and Kβ lines. The Kβ line is sometimes suppressed with a thin layer (0.0005 in. thick) of nickel foil. The simplest and cheapest variety of sealed X-ray tube has a stationary anode and produces circa 2 kW of X-ray radiation. The more expensive variety has a rotating-anode type source that produces circa 14 kW of X-ray radiation.

Regardless of how they are produced, the X-rays must be filtered to a single wavelength (made monochromatic) and collimated to a single direction. Filtering to a single wavelength not only simplifies the data analysis but also removes radiation that degrades the crystal without contributing to its diffraction pattern. Filtering is usually done with a single-crystal monochromator. Collimation is done either with a collimator (basically, a long tube) or with a clever arrangement of gently curved mirrors. Mirrors are preferred for small crystals (under 0.3 mm) or for crystals with large unit cells (over 150 Å).

Recording the reflections

When a crystal is mounted and exposed to an intense beam of X-rays, it scatters the X-rays into a pattern of spots or reflections that can be observed on a screen behind the crystal. A similar pattern may be seen by shining a laser pointer at a compact disc. The relative intensities of these spots provide the information to determine the arrangement of molecules within the crystal in atomic detail. The intensities of these reflections may be recorded with photographic film, an area detector or with a charge-coupled device (CCD) image sensor. The peaks at small angles correspond to low-resolution data, whereas those at high angles represent high-resolution data; thus, an upper limit on the eventual resolution of the structure can be determined from the first few images. Some measures of diffraction quality can be determined at this point, such as the mosaicity of the crystal and its overall disorder, as observed in the peak widths. Some pathologies can be quickly diagnosed as well, such as twinning or a prominent ice ring.
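The link between scattering angle and resolution can be made concrete with Bragg's law, d = λ / (2 sin θ), where 2θ is the angle between the incoming beam and the scattered beam. Bragg's law is not stated in the passage above and is brought in here as an outside assumption, as is the default copper Kα wavelength of 1.5418 Å; the function name `resolution_angstrom` is hypothetical.

```python
import math


def resolution_angstrom(two_theta_deg, wavelength_angstrom=1.5418):
    """Lattice spacing d probed by a reflection at scattering angle 2*theta (Bragg's law)."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength_angstrom / (2.0 * math.sin(theta))


# A spot close to the direct beam carries coarse (low-resolution) information.
low_angle_spacing = resolution_angstrom(5.0)

# A spot far from the direct beam carries fine (high-resolution) information.
high_angle_spacing = resolution_angstrom(45.0)

print(f"2-theta = 5 deg:  d = {low_angle_spacing:.2f} Angstrom")
print(f"2-theta = 45 deg: d = {high_angle_spacing:.2f} Angstrom")
```

With these assumed values, a spot at 5° corresponds to a spacing of about 17.7 Å, while a spot at 45° corresponds to about 2.0 Å, so the outermost visible spots on the first images give a quick estimate of the best resolution the crystal can deliver.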

One image of spots is insufficient to reconstruct the whole crystal; it represents only a small slice of the full Fourier transform. To collect all the necessary information, the crystal must be rotated step-by-step through 180°, with an image recorded at every step; in many cases, a smaller angle such as 90° or 45° may be recorded, owing to a higher symmetry of the crystal. Moreover, the axis of the rotation should generally be changed at least once, to avoid developing a "blind spot" in reciprocal space close to the rotation axis. It is customary to rock the crystal slightly (by 0.5-2°) during each exposure to catch a broader region of reciprocal space.
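The size of a data set follows directly from the rotation range and the step per image. The arithmetic is sketched below under the assumption that each image covers one oscillation step with no overlap between consecutive images; the function name `frames_needed` is hypothetical.

```python
import math


def frames_needed(total_rotation_deg, oscillation_deg):
    """Number of images required to cover total_rotation_deg in steps of oscillation_deg."""
    if total_rotation_deg <= 0 or oscillation_deg <= 0:
        raise ValueError("angles must be positive")
    return math.ceil(total_rotation_deg / oscillation_deg)


full_sweep = frames_needed(180.0, 0.5)     # 360 images for a low-symmetry crystal
high_symmetry = frames_needed(90.0, 0.5)   # 180 images when symmetry halves the range
coarse_steps = frames_needed(45.0, 2.0)    # 23 images with wide 2-degree steps

print(full_sweep, high_symmetry, coarse_steps)
```

Higher crystal symmetry or wider oscillation steps reduce the number of images, at the cost of more reflections overlapping on each image when the steps are wide.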

Multiple data sets may be necessary for certain phasing methods. For example, MAD phasing requires that the scattering be recorded at three or more wavelengths of the incoming X-ray radiation (usually four, for redundancy). A single crystal may degrade too much during the collection of one data set, owing to radiation damage; in such cases, data sets on multiple crystals must be taken.[19]

Data analysis

Crystal symmetry, unit cell, and image scaling

In order to process the data, a crystallographer must first index the reflections within the multiple images recorded. This means identifying the dimensions of the unit cell and which image peak corresponds to which position in reciprocal space. A byproduct of indexing is to determine the symmetry of the crystal, i.e., its space group. Some space groups can be eliminated from the beginning, since they require symmetries known to be absent in the molecule itself. For example, space groups that contain mirror (reflection) symmetries cannot occur for chiral molecules; thus, only 65 of the 230 possible space groups are allowed for protein molecules, which are almost always chiral. Indexing is generally accomplished using an autoindexing routine.[20] Once the symmetry has been assigned, the data is integrated. This converts the hundreds of images containing the thousands of reflections into a single file, consisting of (at the very least) records of the Miller index of each reflection, and an intensity for each reflection.

A full data set may consist of hundreds of separate images taken at different orientations of the crystal. The first step is to merge and scale these various images, that is, to identify which peaks appear in two or more images (merging) and to scale the relative images so that they have a consistent intensity scale. This is important, since the relative intensities of the peaks are the key information from which the structure is determined. Because of the way crystallographic data are collected and the often high symmetry of crystalline materials, many symmetry-equivalent reflections are recorded multiple times. This allows a merging (symmetry-related) R-factor to be calculated, which gives a score for assessing the quality of the data.
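As a rough sketch of the merging R-factor, the code below uses the common definition in which the summed absolute deviations of repeated measurements from their mean are divided by the summed intensities. The function name `r_merge` and the toy intensities are invented for illustration, and the input is assumed to be already scaled and mapped onto symmetry-unique Miller indices.

```python
from collections import defaultdict

def r_merge(observations):
    """Merging R-factor for (unique_hkl, intensity) pairs.

    Assumes the intensities are already on a common scale and that each
    Miller index has been reduced to its symmetry-unique representative.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            continue  # a single measurement says nothing about agreement
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(value - mean) for value in intensities)
        denominator += sum(intensities)

    if denominator == 0:
        raise ValueError("no repeated measurements to compare")
    return numerator / denominator

# Toy data: two reflections measured repeatedly, one measured only once.
toy_observations = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 1, 1), 50.0), ((0, 1, 1), 46.0), ((0, 1, 1), 54.0),
    ((2, 1, 0), 10.0),
]
print(r_merge(toy_observations))
```

A lower value indicates that repeated measurements of equivalent reflections agree more closely.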

Initial phasing

The data collected from a diffraction experiment is a reciprocal space representation of the crystal lattice. The position of each diffraction 'spot' is governed by the size and shape of the unit cell, and the inherent symmetry within the crystal. The intensity of each diffraction 'spot' is recorded, and is proportional to the square of the structure factor amplitude. The structure factor is a complex number containing information relating to both the amplitude and phase of a wave. In order to obtain an interpretable electron density map, from which a crystallographer can build a starting model of the molecule, phase estimates must be obtained. This is known as the phase problem, and can be solved in a variety of ways:

  • Molecular replacement - if a structure exists of a related protein, it can be used as a search model in molecular replacement to determine the orientation and position of the molecules within the unit cell. The phases obtained this way can be used to generate electron density maps.[21]
  • Anomalous scattering (MAD or SAD phasing) - the X-ray wavelength may be scanned past an absorption edge of an atom, which changes the scattering in a known way. By recording full sets of reflections at three different wavelengths (far below, far above and in the middle of the absorption edge) one can solve for the substructure of the anomalously diffracting atoms and thence the structure of the whole molecule. This method has become the standard method for protein crystallography of novel folds. The Se absorption edge of selenomethionine is used; Se-Met is introduced using methionine auxotrophs.[22]
  • Heavy atom methods - if electron-dense metal atoms can be introduced into the crystal, direct methods or Patterson-space methods can be used to determine their location and to obtain initial phases. A common method in protein crystallography is to attach mercuric compounds covalently to cysteine residues, although a wide variety of metal compounds have been developed for this purpose. As in MAD phasing, the changes in the scattering amplitudes can be interpreted to yield the phases. Although this is the original method by which protein crystal structures were solved, it has largely been superseded by MAD phasing with selenomethionine.[21]
  • Ab initio phasing - if high resolution data exists (better than 1.4 Å, i.e., 140 pm), direct methods can be used to obtain phase information.[23] Hitherto, this has worked only with extremely small proteins.[24]

Model building and phase refinement

Having obtained initial phases, an initial model can be built. For proteins, this is called "chain tracing" and is usually done by first making a poly-alanine chain through the electron density, using a "baton" method. At first, only short segments of the chain can be recognized and correlated with the amino-acid sequence. Large aromatic residues are frequently observable as a large blob, while the carbonyl oxygen of the main chain can be observed as a regularly recurring "bump" in the tube of electron density. Secondary structure elements can usually be recognized at this stage, especially helices. The "N-to-C" direction of these helices can often be inferred from the direction of their sidechains; the CA-CB vector of each sidechain generally points "downward" towards the N-terminus of the helix, giving a "Christmas-tree" effect. The boundary between the molecule and the solvent can sometimes be detected and used to improve the phases (solvent flattening).

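Given an initial model of some atomic positions, these positions and their respective Debye-Waller factors (accounting for the thermal motion of the atom) can be refined to fit the observed diffraction data, ideally yielding a better set of phases. A new model can then be fit to the new electron density map and a further round of refinement is carried out. This continues until the correlation between the diffraction data and the model is maximized. The agreement is measured by an R-factor defined as

<math>R = \frac{\sum_{\text{all reflections}} \left| \, |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \, \right|}{\sum_{\text{all reflections}} |F_{\mathrm{obs}}|}</math>

where F<sub>obs</sub> and F<sub>calc</sub> are the observed and calculated structure factors.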

A similar quality criterion is Rfree, which is calculated from a subset (~10%) of reflections that were not included in the structure refinement. Both R factors depend on the resolution of the data. As a rule of thumb, Rfree should be approximately the resolution in Ångströms divided by 10; thus, a data-set with 2 Å resolution should yield a final Rfree of roughly 0.2. The stereochemistry, hydrogen bonding and distribution of peptide bond angles are other sensitive measures of the model quality for proteins. Phase bias is a serious problem in such iterative model building. Omit maps are a common technique used to check for this.
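A minimal sketch of these two quality measures is given below. It assumes the conventional R-factor (summed absolute differences between observed and calculated structure factor amplitudes, divided by the summed observed amplitudes). The function names, the random selection of the free set and the toy amplitudes are illustrative choices, not a description of any particular refinement program.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional R-factor between observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def split_work_and_free(n_reflections, free_fraction=0.1, seed=0):
    """Randomly set aside a fraction of reflection indices as the free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(n_reflections * free_fraction))
    return sorted(indices[n_free:]), sorted(indices[:n_free])

def rule_of_thumb_r_free(resolution_angstrom):
    """Rule of thumb from the text: resolution in angstroms divided by 10."""
    return resolution_angstrom / 10.0

# Toy amplitudes, invented for illustration.
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [12.0, 18.0, 33.0, 37.0]
print(r_factor(f_obs, f_calc))

work, free = split_work_and_free(100)
print(len(work), len(free), rule_of_thumb_r_free(2.0))
```

In an actual refinement only the working set would be used to adjust the model, and the R-factor evaluated over the free set would give Rfree.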

It may not be possible to observe every atom of the crystallized molecule. In some cases, there is too much residual disorder in those atoms, which renders them invisible. Weakly scattering atoms such as hydrogen are routinely invisible. At the other extreme, a single atom may appear multiple times in an electron density map, e.g., if a protein sidechain has multiple allowed conformations. In still other cases, the crystallographer may detect that the covalent structure deduced for the molecule was incorrect, or changed. For example, proteins may be cleaved or undergo posttranslational modifications that were not detected prior to the crystallization.

Deposition of the structure

Once the model of a molecule's structure has been finalized, it is often deposited in a crystallographic database such as the Protein Data Bank (for protein structures) or the Cambridge Structural Database (for small molecules). Many structures obtained in private commercial ventures to crystallize medicinally relevant proteins are not deposited in public crystallographic databases.

Diffraction theory

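The main goal of X-ray crystallography is to determine the density of electrons f(r) throughout the crystal. To do this, X-ray scattering is used to collect data about its Fourier transform F(q), which is then inverted mathematically to obtain the density defined in real space, using the formula

<math>f(\mathbf{r}) = \int \frac{d\mathbf{q}}{(2\pi)^{3}} \, F(\mathbf{q}) \, e^{i \mathbf{q} \cdot \mathbf{r}}</math>

The corresponding formula for a Fourier transform is

<math>F(\mathbf{q}) = \int d\mathbf{r} \, f(\mathbf{r}) \, e^{-i \mathbf{q} \cdot \mathbf{r}}</math>

which will be used below. The Fourier transform F(q) is generally a complex number, and therefore has a magnitude |F(q)| and a phase φ(q) related by the equation

<math>F(\mathbf{q}) = \left| F(\mathbf{q}) \right| e^{i \phi(\mathbf{q})}</math>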

The intensities of the reflections observed in X-ray diffraction give us the magnitudes |F(q)| but not the phases φ(q). To obtain the phases, full sets of reflections are collected with known alterations to the scattering, either by modulating the wavelength past a certain absorption edge or by adding strongly scattering (i.e., electron-dense) metal atoms such as mercury. Combining the magnitudes and phases yields the full Fourier transform F(q), which may be inverted to obtain the electron density f(r).
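The role of the phases can be illustrated with a small numerical sketch. It is only a toy model: a one-dimensional discrete Fourier transform stands in for the three-dimensional transform, the density values are invented, and the helper names `dft` and `idft` are hypothetical.

```python
import cmath

def dft(f):
    """Discrete analogue of F(q) = sum over x of f(x) exp(-i q x)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * q * x / n) for x in range(n))
            for q in range(n)]

def idft(F):
    """Inverse transform, the discrete analogue of the inversion formula."""
    n = len(F)
    return [sum(F[q] * cmath.exp(2j * cmath.pi * q * x / n) for q in range(n)) / n
            for x in range(n)]

# Toy one-dimensional "electron density" with two point-like atoms.
density = [0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]

F = dft(density)
magnitudes = [abs(value) for value in F]      # what diffraction provides
phases = [cmath.phase(value) for value in F]  # what diffraction does not

# Magnitudes combined with the correct phases reproduce the density.
with_phases = idft([cmath.rect(m, p) for m, p in zip(magnitudes, phases)])

# The same magnitudes with every phase set to zero do not.
without_phases = idft([complex(m, 0.0) for m in magnitudes])

print([round(value.real, 3) for value in with_phases])
print([round(value.real, 3) for value in without_phases])
```

The second reconstruction uses exactly the measured magnitudes, yet it no longer resembles the original density, which is why phase information has to be obtained separately.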

Scattering as a Fourier transform

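The incoming X-ray beam has a polarization and should be represented as a vector wave; however, for simplicity, let it be represented here as a scalar wave. We also ignore the complication of the time dependence of the wave and just focus on the wave's spatial dependence. Plane waves can be represented by a wave vector k_in, and so the strength of the incoming wave at time t=0 is given by

<math>A e^{i \mathbf{k}_{\mathrm{in}} \cdot \mathbf{r}}</math>

At position r within the sample, let there be a density of scatterers f(r); these scatterers should produce a scattered spherical wave of amplitude proportional to the local amplitude of the incoming wave times the number of scatterers in a small volume dV about r

<math>\text{amplitude of scattered wave} = A e^{i \mathbf{k}_{\mathrm{in}} \cdot \mathbf{r}} \, S f(\mathbf{r}) \, dV</math>

where S is the proportionality constant.

Consider the fraction of scattered waves that leave with an outgoing wave vector k_out and strike the screen at r_screen. Since no energy is lost (elastic, not inelastic scattering), the wavelengths are the same, as are the magnitudes of the wave vectors, |k_in| = |k_out|. From the time that the photon is scattered at r until it is absorbed at r_screen, the photon undergoes a change in phase

<math>e^{i \mathbf{k}_{\mathrm{out}} \cdot \left( \mathbf{r}_{\mathrm{screen}} - \mathbf{r} \right)}</math>

The net radiation arriving at r_screen is the sum of all the scattered waves throughout the crystal

<math>A S \int d\mathbf{r} \, f(\mathbf{r}) \, e^{i \mathbf{k}_{\mathrm{in}} \cdot \mathbf{r}} \, e^{i \mathbf{k}_{\mathrm{out}} \cdot \left( \mathbf{r}_{\mathrm{screen}} - \mathbf{r} \right)}</math>

which may be written as a Fourier transform

<math>A S e^{i \mathbf{k}_{\mathrm{out}} \cdot \mathbf{r}_{\mathrm{screen}}} \int d\mathbf{r} \, f(\mathbf{r}) \, e^{-i \mathbf{q} \cdot \mathbf{r}} = A S e^{i \mathbf{k}_{\mathrm{out}} \cdot \mathbf{r}_{\mathrm{screen}}} F(\mathbf{q})</math>

where q = k_out - k_in. The measured intensity of the reflection will be the squared magnitude of this amplitude

<math>A^{2} S^{2} \left| F(\mathbf{q}) \right|^{2}</math>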
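The step from a sum of scattered waves to a Fourier transform can be checked numerically for a handful of point scatterers. The sketch below assumes the phase conventions exp(i k_in · r) for the incoming wave and exp(i k_out · (r_screen - r)) for the path to the screen. The atom weights, positions and wave vectors are invented for illustration, and the function names are hypothetical.

```python
import cmath

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def summed_scattered_waves(atoms, k_in, k_out, r_screen, A=1.0, S=1.0):
    """Add up the wave scattered by every point scatterer, one by one."""
    total = 0j
    for weight, r in atoms:
        incoming = A * cmath.exp(1j * dot(k_in, r))
        path = tuple(s - x for s, x in zip(r_screen, r))
        total += S * weight * incoming * cmath.exp(1j * dot(k_out, path))
    return total

def structure_factor(atoms, q):
    """F(q) for point scatterers: sum of weight * exp(-i q . r)."""
    return sum(weight * cmath.exp(-1j * dot(q, r)) for weight, r in atoms)

# Invented scatterers as (weight, position) pairs, in arbitrary units.
atoms = [(6.0, (0.0, 0.0, 0.0)), (8.0, (1.2, 0.0, 0.0)), (1.0, (0.3, 0.9, 0.4))]
k_in = (0.0, 0.0, 4.0)
k_out = (2.4, 0.0, 3.2)  # same length as k_in, as elastic scattering requires
r_screen = (60.0, 0.0, 80.0)

q = tuple(o - i for o, i in zip(k_out, k_in))
direct = summed_scattered_waves(atoms, k_in, k_out, r_screen)
via_fourier = cmath.exp(1j * dot(k_out, r_screen)) * structure_factor(atoms, q)

intensity = abs(direct) ** 2
print(abs(direct - via_fourier), intensity)
```

The two amplitudes agree to within rounding error, and the screen-dependent phase factor drops out of the intensity, leaving only the squared magnitude of F(q).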

The electron density f(r) is a real function, which imposes a constraint on its Fourier transform. Specifically, the Fourier transform of a negative frequency must have the same magnitude as the corresponding positive frequency, but opposite phase

<math>F(-\mathbf{q}) = F^{*}(\mathbf{q})</math>

In other words, the Fourier transforms of the negative and positive frequency vectors are complex conjugates of one another; in X-ray crystallography, these corresponding reflections are called Friedel mates. This allows one to measure the full Fourier transform from only half the reciprocal space, e.g., by a 180° rotation (see next section). In symmetric crystals, other reflections may have the same intensity (Bijvoet mates); in such cases, one can measure even less of the reciprocal space, e.g., 90°.
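The conjugate relationship between Friedel mates can be verified for point scatterers, as in the sketch below. The weights and positions are invented. The second data set uses a complex weight, which is not a physical model and serves only to show that the relationship depends on the density being real.

```python
import cmath

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def structure_factor(atoms, q):
    """F(q) for point scatterers: sum of weight * exp(-i q . r)."""
    return sum(weight * cmath.exp(-1j * dot(q, r)) for weight, r in atoms)

q = (1.0, 0.0, 0.0)
minus_q = tuple(-component for component in q)

# Real weights: the Friedel mates are complex conjugates of each other.
real_atoms = [(6.0, (0.0, 0.0, 0.0)), (8.0, (1.2, 0.0, 0.0)), (1.0, (0.3, 0.9, 0.4))]
f_plus = structure_factor(real_atoms, q)
f_minus = structure_factor(real_atoms, minus_q)
print(abs(f_minus - f_plus.conjugate()))

# One complex weight (illustrative only): the magnitudes now differ.
complex_atoms = [(1.0, (0.0, 0.0, 0.0)), (1.0 + 0.5j, (1.0, 0.0, 0.0))]
g_plus = structure_factor(complex_atoms, q)
g_minus = structure_factor(complex_atoms, minus_q)
print(abs(g_plus), abs(g_minus))
```

For the real weights, the two reflections therefore have equal intensities, so measuring one of them suffices.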

Ewald's sphere

Each X-ray diffraction image represents only a spherical slice of reciprocal space, as may be seen by the Ewald sphere construction. Both k_out and k_in have the same length, due to the elastic scattering, since the wavelength has not changed. Therefore, they may be represented as two radial vectors in a sphere in reciprocal space, which shows the values of q that are sampled in a given diffraction image. Since there is a slight spread in the wavelengths of the incoming X-ray beam, the values of |F(q)| can be measured only for q vectors located between the two spheres corresponding to those radii. Therefore, to obtain a full set of Fourier transform data, it is necessary to rotate the crystal through a full 180°, or sometimes less if sufficient symmetry is present. A full 360° rotation is not needed because of a symmetry intrinsic to the Fourier transforms of real functions (such as the electron density). In practice, the crystal is rocked by a small amount (0.25-1°) to incorporate reflections near the boundaries of the spherical Ewald shells.
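The Ewald construction can be turned into a simple test of whether a given q is sampled in one image. The sketch below assumes the convention |k| = 2π/λ and a unit vector for the beam direction. It follows from requiring |k_in + q| = |k_in|, which gives |k| = -|q|² / (2 n · q), where n is the beam direction. The function names and numerical values are invented for illustration.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def exciting_wavenumber(q, beam_direction):
    """Wavenumber whose Ewald sphere passes through q, or None if none does.

    beam_direction must be a unit vector along the incoming beam.
    """
    n_dot_q = dot(beam_direction, q)
    if n_dot_q >= 0:
        return None  # no elastic scattering geometry reaches this q
    return -dot(q, q) / (2.0 * n_dot_q)

def is_recorded(q, beam_direction, k_min, k_max):
    """True if q lies between the Ewald spheres for k_min and k_max."""
    k = exciting_wavenumber(q, beam_direction)
    return k is not None and k_min <= k <= k_max

beam = (0.0, 0.0, 1.0)

# q = k_out - k_in for k_in = (0, 0, 4) and k_out = (2.4, 0, 3.2).
q_on_sphere = (2.4, 0.0, -0.8)
print(exciting_wavenumber(q_on_sphere, beam))
print(is_recorded(q_on_sphere, beam, 3.99, 4.01))

# A nearby point that the narrow band of wavelengths does not reach.
print(is_recorded((2.4, 0.0, -0.5), beam, 3.99, 4.01))
```

Rotating the crystal rotates the q vectors relative to the beam, which brings further points of reciprocal space into the thin shell between the two spheres.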

Patterson function

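A well-known result of Fourier transforms is the autocorrelation theorem, which states that the autocorrelation c(r) of a function f(r)

<math>c(\mathbf{r}) = \int d\mathbf{x} \, f(\mathbf{x}) \, f(\mathbf{x} + \mathbf{r}) = \int \frac{d\mathbf{q}}{(2\pi)^{3}} \, C(\mathbf{q}) \, e^{i \mathbf{q} \cdot \mathbf{r}}</math>

has a Fourier transform C(q) that is the squared magnitude of F(q)

<math>C(\mathbf{q}) = \left| F(\mathbf{q}) \right|^{2}</math>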

Therefore, the autocorrelation function c(r) of the electron density (also known as the Patterson function) can be computed directly from the reflection intensities, without computing the phases. In principle, this could be used to determine the crystal structure directly; however, it is difficult to realize in practice. The autocorrelation function corresponds to the distribution of vectors between atoms in the crystal; thus, a crystal of N atoms in its unit cell may have N(N-1) peaks in its Patterson function. Given the inevitable errors in measuring the intensities, and the mathematical difficulties of reconstructing atomic positions from the interatomic vectors, this technique is rarely used to solve structures, except for the simplest crystals.
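A one-dimensional toy example shows the Patterson function being obtained from intensities alone. It assumes a periodic density sampled on 16 grid points with three point-like atoms. The positions, weights and helper names are invented for illustration.

```python
import cmath

def dft(f):
    """Discrete analogue of F(q) = sum over x of f(x) exp(-i q x)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * q * x / n) for x in range(n))
            for q in range(n)]

def idft(F):
    """Inverse discrete Fourier transform."""
    n = len(F)
    return [sum(F[q] * cmath.exp(2j * cmath.pi * q * x / n) for q in range(n)) / n
            for x in range(n)]

n = 16
density = [0.0] * n
density[0], density[3], density[7] = 1.0, 2.0, 3.0  # N = 3 atoms

# What the experiment measures: intensities |F(q)|^2, with no phases.
intensities = [abs(value) ** 2 for value in dft(density)]

# Patterson function from the intensities alone.
patterson = [value.real for value in idft(intensities)]

# The same function computed directly as an autocorrelation, for comparison.
direct = [sum(density[x] * density[(x + r) % n] for x in range(n))
          for r in range(n)]

# Non-origin peaks sit at the interatomic vectors (modulo the cell length).
peaks = [r for r in range(1, n) if patterson[r] > 1e-6]
print(peaks, len(peaks))
```

With three atoms whose interatomic vectors are all distinct, the number of non-origin peaks equals N(N-1) = 6. Recovering the atomic positions from these vectors is the difficult step that the Patterson function itself does not solve.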

See also

Bibliography

  • International Tables for Crystallography. Brief Teaching Edition of Volume A, Space-group Symmetry (4th revised and enlarged edition, ed. Theo Hahn ed.). Dordrecht: Kluwer Academic. 1996. ISBN 0-7923-4252-6.
  • Macromolecular Crystallography, Part A (Methods in Enzymology, v. 276) (edited by CW Carter, Jr. and RM Sweet ed.). San Diego: Academic Press. 1997. ISBN 0-12-182177-3.
  • Macromolecular Crystallography, Part B (Methods in Enzymology, v. 277) (edited by CW Carter, Jr. and RM Sweet ed.). San Diego: Academic Press. 1997. ISBN 0-12-182178-1.
  • Crystallization of Nucleic Acids and Proteins: A Practical Approach (2nd edition, edited by A. Ducruix and R. Giegé ed.). Oxford: Oxford University Press. 1999. ISBN 0-19-963678-8.
  • Blow, D (2002). Outline of Crystallography for Biologists. Oxford: Oxford University Press. ISBN 0-19-851051-9.
  • Drenth, J (1999). Principles of Protein X-Ray Crystallography. New York: Springer-Verlag. ISBN 0-387-98587-5.
  • Giacovazzo, C (1992). Fundamentals of Crystallography. Oxford: Oxford University Press. ISBN 0-19-855578-4.
  • Glusker, JP (1994). Crystal Structure Analysis for Chemists and Biologists. New York: VCH Publishers. ISBN 0-471-18543-4.
  • McPherson, A (1999). Crystallization of Biological Macromolecules. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press. ISBN 0-87969-617-6.
  • McPherson, A (2003). Introduction to Macromolecular Crystallography. John Wiley & Sons. ISBN 0-471-25122-4.
  • McRee, DE (1993). Practical Protein Crystallography. San Diego: Academic Press. ISBN 0-12-486050-8.
  • Rhodes, G (2000). Crystallography Made Crystal Clear. San Diego: Academic Press. ISBN 0-12-587072-8.
  • Zachariasen, WH (1945). Theory of X-ray Diffraction in Crystals. New York: Dover Publications. LCCN 67-0 – 0.

References

  1. ^ a b Staff. "Introduction to X-ray Diffraction". Materials Research Laboratory, University of California, Santa Barbara. Retrieved 2007-05-03.
  2. ^ Kendrew J, Bodo G, Dintzis H, Parrish R, Wyckoff H, Phillips D (1958). "A three-dimensional model of the myoglobin molecule obtained by x-ray analysis". Nature. 181 (4610): 662–6. PMID 13517261.
  3. ^ Scapin G (2006). "Structural biology and drug discovery". Curr. Pharm. Des. 12 (17): 2087–97. PMID 16796557.
  4. ^ "PDB Statistics". RCSB Protein Data Bank. Retrieved 2007-05-03.
  5. ^ Burley SK, Bonanno JB (2002). "Structuring the universe of proteins". Annual review of genomics and human genetics. 3: 243–62. PMID 12194989.
  6. ^ Lundstrom K (2006). "Structural genomics for membrane proteins". Cell. Mol. Life Sci. 63 (22): 2597–607. PMID 17013556.
  7. ^ Lundstrom K (2004). "Structural genomics on membrane proteins: mini review". Comb. Chem. High Throughput Screen. 7 (5): 431–9. PMID 15320710.
  8. ^ a b Geerlof A, Brown J, Coutard B, Egloff MP, Enguita FJ, Fogg MJ, Gilbert RJ, Groves MR, Haouz A, Nettleship JE, Nordlund P, Owens RJ, Ruff M, Sainsbury S, Svergun DI, Wilmanns M (2006). "The impact of protein characterization in structural proteomics". Acta Crystallogr. D Biol. Crystallogr. 62 (Pt 10): 1125–36. PMID 17001090.
  9. ^ Durbin SD, Feher G (1996). "Protein crystallization". Annual review of physical chemistry. 47: 171–204. PMID 8983237.
  10. ^ Chernov AA (2003). "Protein crystals and their growth". J. Struct. Biol. 142 (1): 3–21. PMID 12718915.
  11. ^ Rupp B, Wang J (2004). "Predictive models for protein crystallization". Methods. 34 (3): 390–407. PMID 15325656.
  12. ^ Chayen NE (2005). "Methods for separating nucleation and growth in protein crystallization". Prog. Biophys. Mol. Biol. 88 (3): 329–37. PMID 15652248.
  13. ^ Dieckman L, Gu M, Stols L, Donnelly MI, Collart FR (2002). "High throughput methods for gene cloning and expression". Protein Expr. Purif. 25 (1): 1–7. PMID 12071692.
  14. ^ Yeates TO, Fam BC (1999). "Protein crystals and their evil twins". Structure. 7 (2): R25-9. PMID 10368291.
  15. ^ Chayen NE (1998). "Comparative studies of protein crystallization by vapor-diffusion and microbatch techniques". Acta Crystallogr. D Biol. Crystallogr. 54 (Pt 1): 8–15. PMID 9761813.
  16. ^ Lorber B (2002). "The crystallization of biological macromolecules under microgravity: a way to more accurate three-dimensional structures?". Biochim. Biophys. Acta. 1599 (1–2): 1–8. PMID 12479400.
  17. ^ Jeruzalmi D (2006). "First analysis of macromolecular crystals: biochemistry and x-ray diffraction". Methods Mol. Biol. 364: 43–62. PMID 17172760.
  18. ^ Helliwell JR (2005). "Protein crystal perfection and its application". Acta Crystallogr. D Biol. Crystallogr. 61 (Pt 6): 793–8. PMID 15930642.
  19. ^ Ravelli RB, Garman EF (2006). "Radiation damage in macromolecular cryocrystallography". Curr. Opin. Struct. Biol. 16 (5): 624–9. PMID 16938450.
  20. ^ Powell HR (1999). "The Rossmann Fourier autoindexing algorithm in MOSFLM". Acta Crystallogr. D Biol. Crystallogr. 55 (Pt 10): 1690–95. PMID 10531518.
  21. ^ a b Taylor G (2003). "The phase problem". Acta Crystallogr. D Biol. Crystallogr. 59 (Pt 11): 1881–90. PMID 14573942.
  22. ^ Ealick SE (2000). "Advances in multiple wavelength anomalous diffraction crystallography". Current opinion in chemical biology. 4 (5): 495–9. PMID 11006535.
  23. ^ Hauptman H (1997). "Phasing methods for protein crystallography". Curr. Opin. Struct. Biol. 7 (5): 672–80. PMID 9345626.
  24. ^ Usón I, Sheldrick GM (1999). "Advances in direct methods for protein crystallography". Curr. Opin. Struct. Biol. 9 (5): 643–8. PMID 10508770.

External links