Questions and Answers for Sections 4, 5 and 6.
Questions for Section 4
1. A protein crystal has cell dimensions of a = 58.4 Å, b = 58.4 Å, c = 58.4 Å, α = 90.0°, β = 90.0°, γ = 90.0°. What possible crystal systems could it belong to?
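The cell dimensions are consistent with the cubic crystal system, which requires a minimum point group symmetry of 23, i.e. four 3-fold rotation axes, and lattice constants that meet the following conditions:
a = b = c and α = β = γ = 90°
where a, b, c and α, β, γ are the lattice constants. Strictly, these metric conditions are necessary but not sufficient: a lower-symmetry crystal (for example tetragonal or orthorhombic) can have equal axes by coincidence, so the assignment has to be confirmed from the symmetry of the diffraction pattern.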
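The metric test on the lattice constants can be sketched in Python. This is only an illustration: the function name, the tolerance and the conventional axis settings (unique axis b for monoclinic, a = b for tetragonal) are assumptions, and the check says nothing about the true symmetry of the crystal contents.

```python
def compatible_crystal_systems(a, b, c, alpha, beta, gamma, tol=0.05):
    """List crystal systems whose lattice-metric conditions the cell satisfies.

    Lengths in angstroms, angles in degrees. `tol` is an illustrative
    absolute tolerance, not a standard value.
    """
    def eq(x, y):
        return abs(x - y) <= tol

    def right(angle):
        return eq(angle, 90.0)

    systems = ["triclinic"]                      # no metric restrictions
    if right(alpha) and right(gamma):
        systems.append("monoclinic")             # unique axis b setting
    if right(alpha) and right(beta) and right(gamma):
        systems.append("orthorhombic")
        if eq(a, b):
            systems.append("tetragonal")
            if eq(b, c):
                systems.append("cubic")
    if eq(a, b) and right(alpha) and right(beta) and eq(gamma, 120.0):
        systems.append("hexagonal/trigonal")     # hexagonal axes
    if eq(a, b) and eq(b, c) and eq(alpha, beta) and eq(beta, gamma):
        systems.append("rhombohedral")           # rhombohedral axes
    return systems


# Cell from the question: the highest-symmetry match is cubic.
print(compatible_crystal_systems(58.4, 58.4, 58.4, 90.0, 90.0, 90.0))
```

For the cell in the question the list ends up containing cubic together with the lower-symmetry systems whose conditions are a subset of the cubic ones; only the hexagonal setting, which needs γ = 120°, is excluded.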
2. What is the difference between the unit cell and the asymmetric unit of a crystal? How might these relate to the biological molecule?
The unit cell is defined as the basic building block of a crystal and is a repeating unit which forms the 3D structure of the crystal. It is an imaginary mathematical device which simplifies calculations and allows us to define a crystal uniquely. The unit cell is formed by the three lengths a, b, c and the three angles α, β and γ which are collectively called the lattice constants.
The unit cell is in turn built from the asymmetric unit, the smallest unique part of the unit cell, which has no internal crystallographic symmetry. Applying the symmetry operators of the space group to the asymmetric unit generates the complete unit cell. The unit cell volume, the number of asymmetric units per cell (given by the symmetry operators) and the molecular weight of the protein are used to estimate the number of molecules in the asymmetric unit.
The asymmetric unit does not necessarily represent the biological molecule. Depending on the crystal, the biological unit may consist of a single asymmetric unit, multiple copies of it related by the space group symmetry operations (rotation, translation and screw), or just a portion of it.
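One common way to estimate the number of molecules in the asymmetric unit is the Matthews coefficient, Vm = V / (Z × n × MW). The text above does not name it, so the sketch below is an illustration rather than the method intended by the question. The values Z = 12 and MW = 6000 Da are hypothetical, and the solvent estimate uses the usual approximation 1 - 1.23/Vm.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms (general triclinic formula)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, z, mw_da, n_mol=1):
    """Vm in cubic angstroms per dalton.

    z     : number of asymmetric units per unit cell (from the space group)
    mw_da : molecular weight of one molecule in daltons
    n_mol : trial number of molecules per asymmetric unit
    """
    return volume / (z * n_mol * mw_da)


def solvent_fraction(vm):
    """Approximate solvent content for a typical protein density."""
    return 1.0 - 1.23 / vm


volume = cell_volume(58.4, 58.4, 58.4, 90.0, 90.0, 90.0)
# Hypothetical inputs: Z = 12 and a 6 kDa protein, chosen only for illustration.
for n in (1, 2, 3):
    vm = matthews_coefficient(volume, z=12, mw_da=6000.0, n_mol=n)
    print(n, round(vm, 2), round(solvent_fraction(vm), 2))
```

Trial values of n that give Vm within the range usually seen for protein crystals (roughly 1.7 to 3.5 Å³/Da) are the plausible ones.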
The relationship between the asymmetric unit and unit cell is illustrated below in 2D.
The asymmetric unit is represented by the green arrow, which is copied and rotated by 180° about the 2-fold rotation axis (black oval) to produce a second copy. The two copies make up the unit cell, which, when repeated by translation in two directions, makes up the crystal. (From Protein Data Bank)
3. Which elements are particularly suitable for anomalous scattering in protein crystallography and how are they incorporated into crystals?
Selenium is the element preferred for anomalous scattering in protein crystallography. It is introduced as selenomethionine (Se-Met), an analogue of methionine (Met) in which the sulphur (S) atom is replaced with a selenium (Se) atom. Se-Met is suitable for this role because it is isosteric with Met and does not disturb the structure of the protein. In fact, studies show that Se-Met substitution increases stability in some proteins (Gassner et al, 1999). Se-Met is also easily incorporated into proteins expressed in prokaryotes, which can then be crystallized for anomalous diffraction studies.
i. Se-Met can be incorporated using a methionine auxotroph (Met-) as the expression host, grown in media containing all the individual amino acids except Met, which is replaced with Se-Met. The host incorporates the Se-Met into the recombinant protein, which can then be purified and crystallized.
ii. The second method uses a Met prototrophic bacterial strain in which Met biosynthesis is suppressed by feedback inhibition. Thr and Lys can be used to block the phosphorylation of Asp, an essential step in Met biosynthesis (Madduri et al).
iii. Se-Met can also be incorporated into proteins in vitro by cell-free synthesis with dialysis. This method has achieved up to 95% incorporation of Se-Met, and the resulting crystals had the same lattice constants as crystals of protein synthesized in vivo (Kigawa et al, 2001).
Questions for Section 5
1. Explain the principles of how the lac repressor is used in bacterial expression vectors.
The lac repressor is a protein encoded by the lacI gene, one of three regulatory elements that control the lac operon. These are lacI, lacP and lacO: lacI is the repressor gene, lacP is the promoter, where RNA polymerase binds, and lacO is the operator, where the repressor binds. The structural genes of the lac operon are lacZ, lacY and lacA.
The lac repressor binds to the operator, thereby preventing RNA polymerase from transcribing the polycistronic mRNA of the structural genes. This property is exploited in recombinant protein expression, in which the structural genes are replaced with the gene coding for the desired recombinant protein. The lac repressor then acts as a switch, allowing expression to be started at an appropriate point in the growth phase of the host.
Expression is switched on with an inducer, such as lactose or a lactose analogue (e.g. IPTG), which binds to the repressor protein and prevents it from binding to the DNA of the lac operator region. With the lac operator free of the repressor, RNA polymerase can transcribe the cloned gene, which is then translated into the desired recombinant protein.
2. What steps can be taken to ensure a high level of incorporation of selenomethionine into a protein?
Since some E. coli strains can synthesize Met de novo, culturing a Met auxotrophic expression host in media containing Se-Met in place of Met ensures that Se-Met is incorporated into proteins. In other experiments, Met and Se-Met are both present in the ratio 1:9 for optimal incorporation of Se-Met.
The concentration of Se-Met in the culture medium is known to affect its incorporation into proteins in the expression host. A higher concentration of the Met analogue in the medium results in better incorporation, but may lower protein yields (Sreenath et al, 2005).
When Met prototrophs are used as expression hosts, Met biosynthesis has to be blocked using amino acids such as Lys, Ile and Thr, which inhibit the aspartokinases needed in the methionine biosynthesis pathway. In the Met prototroph, Met is again replaced with Se-Met, although a small amount (5-10%) of Met is added to stimulate growth of the host cells out of the stationary phase.
3. What variables can be adjusted to improve the yields of soluble protein expressed?
Protein yields in recombinant expression depend on several factors, most importantly the design of the recombinant gene and its accompanying repressor and promoter sequences and transcription and translation initiation signals. In designing the recombinant gene, the stability of the mRNA transcribed from it should be considered, together with the potential presence of secondary structures such as hairpins, which interfere with translation; if present, these should be removed to ensure efficient transcription and translation of the gene.
By running trial cultures in which these variables are changed and screening them for protein yield, the optimal construct can be selected. The choice of vector and host is another variable. E. coli strains are usually the host of choice because of the many tools available for protein expression in them; they are also fast and cost-effective to culture. Codon usage in prokaryotes differs from that of eukaryotes, so a eukaryotic gene may need codon optimization for efficient translation in a bacterial host.
Another variable is the composition of the culture medium, together with induction parameters such as temperature, duration of induction and aeration. Some cultures require supplementary salts and co-factors for optimal growth of the expression host and improved yields. It has also been suggested that reducing the induction level may improve yields of recombinant proteins, while auto-induction protocols can take advantage of auxotrophic host cells to ensure optimal protein yields.
The purification methods used to isolate the recombinant proteins also affect protein yields. Isolation methods used include batch and column elution, using an affinity tag on the protein, followed by size exclusion chromatography and finally removal of the affinity tag.
In many expression hosts, overexpression of the recombinant protein often leads to the production of insoluble inclusion bodies. These are isolated by centrifugation and solubilized with denaturants, and the protein is finally renatured by dialysis or dilution so that it can refold correctly. Matrix-bound renaturation has become a common method for the proper refolding of recombinant proteins. Other methods are high hydrostatic pressure, an industrial process used for simultaneous solubilization and renaturation, and research methods using a water–sodium bis-2-ethylhexyl sulfosuccinate–isooctane reverse micellar system.
Question for Section 6
1. Unlike protein crystallography, phase information is not lost during a TEM experiment. Describe briefly why this is the case. (4 marks). Nonetheless converting TEM data into a structure is not a straightforward process. Discuss the problems that are typically encountered with TEM data, and how these are resolved during the data processing to generate usable class averages (14 marks).
a) X-ray diffraction uses X-rays to analyze 3D crystals in order to determine structure at the atomic level. The diffraction pattern consists of spots produced when the molecules within the crystal scatter X-rays. The intensities of the spots are measured, and each intensity is proportional to the square of the amplitude of a structure factor; the electron density is the Fourier transform (FT) of the structure factors. Structure factors describe waves and therefore have both amplitudes and phases. The amplitudes can be obtained directly from the measured intensities, but the phases are lost and have to be recovered by additional methods such as isomorphous replacement or molecular replacement.
Electron microscopy uses electrons much as X-ray diffraction uses X-rays, to determine the 3D structure of 2D specimens, but electrons can be focused by lenses. Because of this, the microscope forms a magnified image of the sample in the image plane, and both amplitudes and phases can be calculated from the FT of that image, which eliminates the 'phase problem' encountered in X-ray diffraction. Electron microscopy can therefore use data from the image plane as well as the diffraction plane, whereas in X-ray diffraction data are obtained from the diffraction plane only.
The diffraction plane and image plane are indicated to show how diffraction spots and a magnified image of the specimen are used to collect data for 3D reconstruction of the specimen.
In X-ray diffraction, by contrast, data can only be obtained from the diffraction plane, which lacks phase information. Additional experiments are needed to obtain the phases before the FT can be calculated and the 3D structure determined.
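The difference between the two planes can be illustrated with a toy calculation, assuming NumPy and a random 2D array standing in for the specimen. It is a sketch of the principle only, not a model of a real experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
density = rng.random((16, 16))      # toy 2D "specimen" (hypothetical data)

F = np.fft.fft2(density)            # structure factors are complex numbers
amplitudes = np.abs(F)
phases = np.angle(F)

# Diffraction plane: only intensities |F|^2 are recorded, so phases are lost.
intensities = amplitudes ** 2

# Image plane: amplitudes and phases are both available from the FT of the
# image, and the inverse FT returns the original density.
with_phases = np.fft.ifft2(amplitudes * np.exp(1j * phases)).real

# Amplitudes alone (all phases set to zero) do not reproduce the density.
without_phases = np.fft.ifft2(np.sqrt(intensities)).real

print(np.allclose(with_phases, density))     # True
print(np.allclose(without_phases, density))  # False
```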
b) Because biological samples are sensitive to radiation damage by the electron beam, data have to be collected at low electron doses. This results in images with a poor signal-to-noise ratio (SNR), which has to be compensated for. Image processing, alignment, 3D reconstruction and signal recovery procedures are therefore used to improve the quality of the images. Computationally, images of equivalent particles, recorded in varying orientations, are grouped by orientation and averaged to increase the SNR. The more similar images that are averaged together, the better the quality of the average, so it is necessary to collect as many images as possible for each orientation; this in turn improves the result obtained when the distinct orientations are combined.
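The gain from averaging can be sketched numerically. The example below assumes NumPy, perfectly aligned copies of a hypothetical square "particle" and independent Gaussian noise, in which case the residual noise falls roughly as the square root of the number of images averaged.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical noise-free particle image: a bright square on a dark background.
signal = np.zeros((32, 32))
signal[12:20, 12:20] = 1.0
sigma = 5.0                          # noise much stronger than the signal


def noisy_copy():
    """One simulated low-dose image of the same particle, already aligned."""
    return signal + rng.normal(0.0, sigma, signal.shape)


def residual_noise(n):
    """Standard deviation of the noise left after averaging n images."""
    average = np.mean([noisy_copy() for _ in range(n)], axis=0)
    return np.std(average - signal)


for n in (1, 16, 256):
    # Compare the measured residual with the expected sigma / sqrt(n).
    print(n, round(residual_noise(n), 3), round(sigma / np.sqrt(n), 3))
```

Real particle images are not pre-aligned, which is why alignment and classification have to come before averaging.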
Noise can also be filtered out to improve the image. This is done by attenuating the high spatial frequencies in the Fourier transform (FT) of the image: the FT is low-pass filtered to a chosen resolution using a circular mask that removes the frequencies beyond it. However, filtering sacrifices the high-resolution fine detail of the image, i.e. an image low-pass filtered at 30 Å will have less detail than one low-pass filtered at 10 Å.
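A minimal sketch of such a filter, assuming NumPy, a hard-edged circular mask and a hypothetical pixel size of 2 Å, is given below. Practical software usually uses a soft-edged mask to avoid ringing artefacts.

```python
import numpy as np


def low_pass(image, pixel_size, resolution):
    """Remove Fourier components finer than `resolution`.

    pixel_size and resolution are in the same units (e.g. angstroms).
    """
    ny, nx = image.shape
    fy = np.fft.fftfreq(ny, d=pixel_size)        # spatial frequencies, 1/unit
    fx = np.fft.fftfreq(nx, d=pixel_size)
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = radius <= 1.0 / resolution            # circular mask in Fourier space
    return np.fft.ifft2(np.fft.fft2(image) * mask).real


rng = np.random.default_rng(3)
image = rng.random((64, 64))                     # stand-in for a noisy micrograph

filtered_30 = low_pass(image, pixel_size=2.0, resolution=30.0)
filtered_10 = low_pass(image, pixel_size=2.0, resolution=10.0)

# The 30 A filter removes more of the image variation than the 10 A filter.
print(np.var(image), np.var(filtered_10), np.var(filtered_30))
```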
Because proteins have a low density and biological samples consist mainly of light atoms, they are low-contrast objects. The data are therefore collected out of focus to increase image contrast. Defocusing is a compromise between contrast and resolution: it leads to loss of information at the spatial frequencies around the zero crossings of the contrast transfer function (CTF), inversion of phases in some regions of reciprocal space, and a decrease of the Fourier amplitudes in the high spatial frequency region. This is compensated for by recording images at several complementary defocus settings and correcting for the CTF. The images show different levels of detail depending on the defocus used; the correct amplitudes are restored during image processing, and because the zero crossings fall at different frequencies for different defocus settings, combining the images ensures that no information is lost.
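Why complementary defocus settings help can be seen from a simplified CTF. The sketch below assumes NumPy and a pure phase-contrast model, CTF(k) = -sin(πλΔz k² - (π/2)Cs λ³ k⁴), with no amplitude contrast or envelope function. Sign conventions differ between programs. The wavelength (about 0.0197 Å, roughly that of 300 kV electrons) and the two defocus values are illustrative assumptions.

```python
import numpy as np


def ctf(k, defocus, wavelength=0.0197, cs=0.0):
    """Simplified phase-contrast CTF.

    k          : spatial frequency in 1/angstrom
    defocus    : underfocus in angstroms (positive)
    wavelength : electron wavelength in angstroms
    cs         : spherical aberration in angstroms (0 to ignore it)
    """
    chi = (np.pi * wavelength * defocus * k**2
           - 0.5 * np.pi * cs * wavelength**3 * k**4)
    return -np.sin(chi)


def first_zero(defocus, wavelength=0.0197):
    """Spatial frequency of the first CTF zero when cs = 0."""
    return 1.0 / np.sqrt(wavelength * defocus)


# Two hypothetical defocus settings: 1.0 and 1.5 micrometres.
k0 = first_zero(10000.0)
print("first zero at 1.0 um defocus (angstrom):", 1.0 / k0)
print("CTF there at 1.0 um:", ctf(k0, 10000.0))
print("CTF there at 1.5 um:", ctf(k0, 15000.0))

# Phase inversion can be undone by "phase flipping": multiply the FT of the
# image by the sign of the CTF at each frequency.
k = np.linspace(0.0, 0.2, 2001)
flipped = ctf(k, 10000.0) * np.sign(ctf(k, 10000.0))
```

The frequency that is lost at one defocus is transferred strongly at the other, which is the reason for combining several settings.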
The images recorded at the various orientations, after the CTF correction and filtering described above, are finally sorted and selected for averaging. The individual particle images are 2D; particles are selected from the micrographs, taking care to avoid those that are too close to each other or overlapping. The ultimate goal is to reconstruct a 3D structure from the 2D images at varying orientations. Once sorted and selected, the images are aligned and classified according to their unique views, and the members of each class are averaged to give the class averages.
In EM tomography, there is the challenge of missing data (the missing wedge) when a single tilt axis is used, because the range of tilt angles is limited. To reduce this, two orthogonal tilt axes can be used and the data aligned. The FTs of the two tomograms obtained from the two tilt series are selectively combined to make a single tomogram in which much of the missing wedge is filled in, and which shows good resolution at any orientation in the plane and through the depth of the specimen being analyzed.
Each unique class average is then used to build a 3D reconstruction of the molecule from the different views of the 2D images. In TEM, each image is a projection through the specimen, so information about its interior is retained, which is essential for the 3D reconstruction. The FT of each class average is calculated; each of these 2D FTs is a central section of the 3D FT of the object. Once the sections have been assembled into the 3D FT, the 3D density map is obtained by the inverse FT.
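The relationship between projections and central sections can be checked numerically. The sketch assumes NumPy and a small random array as a stand-in for the 3D density, and uses a projection along one grid axis so that no interpolation is needed.

```python
import numpy as np

rng = np.random.default_rng(2)
volume = rng.random((8, 8, 8))            # toy 3D density, axes ordered (z, y, x)

# A TEM image is a projection: the density summed along the beam direction.
projection = volume.sum(axis=0)
ft_projection = np.fft.fft2(projection)

# Central section of the 3D FT perpendicular to the beam (the k_z = 0 plane).
ft_volume = np.fft.fftn(volume)
central_section = ft_volume[0, :, :]

print(np.allclose(ft_projection, central_section))   # True
```

Projections in other directions fill other central sections, and with enough views the 3D FT is sampled well enough for the inverse FT to give the density map.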
Lastly, a further complication in TEM is that the diffraction pattern (FT) of helical specimens consists of layer lines. This is because a helix is periodic in one dimension only, along its axis. The layer-line pattern provides the pitch, radius and subunit spacing, leaving only the helical symmetry to be determined; with this information, the 3D structure can be calculated. The advantage of helical structures is that a single image contains views of the subunit in many different orientations, providing more information than an image of a 2D specimen would. Data collection procedures are the same as for 2D specimens, except that bent helical fibres should be computationally straightened to give the correct diffraction pattern.