![]()
|
SFALL (CCP4: Supported Program)NAMEsfall - Structure factor calculation and X-ray refinement using forward and reverse FFTSYNOPSISsfall [ XYZIN foo.brk ] [ MAPIN foo.map ] [ HKLIN foo.mtz ] [ MAPOUT foo_out.map ] [ HKLOUT foo_out.mtz ] [ XYZOUT foo_out.brk ] [ GRADMAT foo.sfmat ][Keyworded input] CONTENTS
DESCRIPTIONThis program calculates structure factors and X-ray gradients for refinement using inverse and forward fast Fourier techniques.If coordinates are input, it builds an `atom map' on a suitable grid over the asymmetric unit of the spacegroup. Atomic density is distributed as a Gaussian centred on the atom position. The value of the Gaussian depends on the distance from the atom centre, the atom type, and temperature factor. Its value is calculated at each grid point within a given radius for each atom, and the values are summed to give an `atom map'. This `map' can then be used for an inverse fast Fourier transform, which gives the structure factor. It is important that the grid is fine enough to sample the atomic Gaussians accurately - the accuracy of the structure factor estimates depends on this. I recommend using a grid which gives a sampling of ~0.5Å. X-ray gradients are estimated by `differential synthesis'. A difference map is calculated, and this is convoluted with the atomic density. By default, SFALL scales Fobs to Fcalc. This then lets you do maps on (or close to) an absolute scale if the map comes from coordinates. If you are back-transforming a map, e.g. in solvent-flattening or averaging, then the map isn't on an absolute scale anyway, and so this scaling is not particularly sensible (although not usually harmful). Note that in iterative procedures, the Fobs you bring in to SFALL on each cycle should be the original Fobs, not the one carried through from the previous cycle (which may have been scaled, selected, or otherwise corrupted). In this case, it is probably more sensible to use the NOSCALE option in SFALL, then either (i) use SIGMAA to generate Partial or Combined Fourier coefficients for the next map (if you want to weight them); or (ii) use RSTATS to scale Fcalc to Fobs. For a description of the original method see [1]. A modification in calculating the gradients suggested by Alain Lifchitz is used. A list of common problems (with possible solutions) can now be found below. 1) Generate atom mapInput: i) XYZIN
Output: i) MAPOUT
2) Calculate structure factorsBy default a scale factor is calculated between Fobs and Fc and is applied to the output Fobs. If this is not wanted then the keyword NOSCALE must be used. Input: i) either coordinates or a map (compulsory)
Output: i) HKLOUT containing: H K L ... FC PHIC (WANGWT)
ii) MAPOUT - The atom `map' (optional).
3) Generate X-ray gradients for refinementThese are used for minimisation of X-ray differences between FP and FC. This calculation uses the forward FFT. Input: i) XYZIN
ii) HKLIN
Output: i) XYZOUT
ii) GRADMAT
Further details of the specification of input and output files is given in section INPUT AND OUTPUT FILES. See description of keyword MODE and examples for uses for the various options. Space GroupsSFALL has special subroutines to handle the following spacegroups:P1 P21(4) P21212(1018 - alt. origin) P212121(19) P4122/P4322(91/95) P41212/P43212 (92/96) P31/P32(144/145) P3/R3(143/146) P3121/P3221(152/154) P61/P65 (169/170)It is possible to use the P1 version (or any suitable lower symmetry) for the calculations. In particular, any non-primitive spacegroup can be run in the appropriate primitive one. See keyword SFSGROUP for details. BUT BEWARE: To use SFSG P1 you must have your reflections sorted with l>=0. This is NOT the default for P3121/P3221 or P6i22. You would be sensible to run CAD with the keyword: `OUTLIM SPACEGROUP P1' before calculating structure factors. The crystal symmetry will be used to generate atom positions from a unique molecule. See NOTES ON HKLIN for special requirements for the HKLIN file when using a lower symmetry version. KEYWORDED INPUTAll input is keyworded. Only the first 4 characters of a keyword are significant. The keyworded records can be in any order.Anything on a line after an ! or # is ignored and lines can be continued by using a minus sign. The possible keywords are:
DESCRIPTION OF KEYWORDSMODE [ ATMMAP ... | SFCALC ... | SFREF ... ]This keyword controls the course of the calculation.
SFSGROUP <sfsgroup><sfsgroup> is the spacegroup number or name of the spacegroup used for the calculation. You don't normally need to specify this as a sensible default will be used (the highest symmetry one the program can use which is compatible with your data) as printed in the output.N.B. - This IS NOT NECESSARILY the spacegroup of your structure. E.g., you may well be calculating structure factors in P1 for a structure with spacegroup F43212. The program will check whether the requested <sfsgroup> is compatible with your data. The possibilities for <sfsgroup> are: P1, P21 (4), P21212 (1018 - alternative origin), P212121 (19), P4122/P4322 (91/95), P41212/P43212 (92/96), P31/P32 (144/145), P3/R3 (143/146), P3121/P3221 (152/154), P61/P65 (169/170). Any spacegroup can be run in an appropriate one whose symmetry operators are a subset of the one of interest (P1 can always be used as long as the hkl data are sorted with l>=0). By judiciously shifting origins many spacegroups become supergroups. $CLIBD/symop.lib contains the modified symmetry operators for some spacegroups. You may move the origin of your coordinates with PDBSET. For example:
LABIN <program label>=<file label> ...Note: This is essential for all refinement and if HKLIN is assigned for a structure factor calculation.The following labels can be assigned. H K L FP SIGFP FREE FPART PHIP PHIP F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16 F17 F18 F19Assignments for FP and SIGFP are always required. If the free R flag is assigned to FREE, a subset of reflections can be excluded from the calculation or refinement. See the keyword FREERFLAG. If you wish to add a known FPART to the structure factors, assign FPART and PHIP. This is useful if you want to calculate hydrogen contributions, add in an anisotropic heavy atom, add in a bulk solvent correction, which has been calculated somewhere else. The FPART can be scaled relative to the FC by inputting a SCALE. If you wish to copy some but not all the other columns from your input to your output file, you may assign F0 F1 ... F19. These will be transcribed to the output file. If you want to copy everything over, use ALLIN on the LABOUT line. LABOUT <program label>=<file label> ... [ ALLIN ]<program labels> can be given for FC PHIC (and WANGWT, if required).Any column assigned on LABIN will be copied to the output file. You can change that file label with LABOUT if you absolutely have to, but it is unwise! ALLIN will copy all columns in the input file to the output file. OPTIONAL KEYWORDS:AVMODE [ RADIUS <radius> ] [ WEIGHT <modewt> ]When you are preparing the 'averaged map' for generating the Wang envelope, this sets the radius of the map sphere and the type of weighting.Subsidiary keywords:
BADD <badd>A parameter added to all atomic B values. It will smear the Gaussians used to generate the atomic density over a larger volume. L.Ten Eyck suggests this should allow a coarser sampling grid. G. Bricogne is sceptical and certainly it introduces serious errors into the B12 coenzyme structure factors at 1Å resolution (ref Hugh Savage). EJD shares the scepticism. The default value (0.0) gives reasonable results for most problems.BINS <nbins><nbins> (default 50) is the number of bins for the analysis of R-factor v. resolution. Ranges of 4sin**2/Lambda**2 are of width 1.0/<nbins>. For high resolution data this needs to be reset.BRESET <bmin>B-factors less than <bmin> (default bmin=5.0) are set equal to <bmin>. For high resolution data this needs to be reset.CELL <a> <b> <c> [ <alpha> <beta> <gamma> ]Input cell parameters explicitly. This is checked against the cell read from the MTZ file and that read from the PDB file. The program will continue with a warning if there is a small inconsistency and stop if there is a serious one. <alpha> <beta> <gamma> default to 90 degrees.CHAIN <chnlab> <iresn> <iresc>Only needed for MODE ATMMAP RESMOD if your coordinate file has more than one chain. You will need one CHAIN card for each chain.
EDEN <escale> <cmp> <cmn>Done for Cecil Tate. The input map is scaled by:xscal=escale/nx*ny*nz and truncated. if density(x,y,z)<0 density(x,y,z)=xscal*cmn*tanh(density(x,y,z)/cmn)) if density(x,y,z)>0 density(x,y,z)=xscal*cmp*tanh(density(x,y,z)/cmp)) ENDThis ends input.FORMFACTOR [ NEUTRON | NGAUSS <ngauss> ] <atnam1> <atnam2> ...See the NOTES ON FORMFACTORS.SFALL has hardcoded X-ray and neutron formfactors available for atom types H C N O S. (Default CuKa X-ray formfactors.) Other atom types are tabulated in $CLIBD/atomsf.lib (X-ray) and $CLIBD/atomsf_neutron.lib (neutron). If your atom identifier is not recognised (i.e. it is not C, N, O, or S) the program will read atomsf.lib and take the FIRST table which matches your atom ID. This may not be satisfactory if your atom type is Fe+3, say, and you may wish to specify the formfactor yourself. Subsidiary keywords:
or
Example: FORM NEUTRON P Co+3 D FORM NGAUSS 5 Fe+3 FREERFLAG <freeflag>Reflections where the value of the column FREE, taken from the MTZ file, equals <freeflag> (default 0) are omitted from the refinement or FC calculation and their R-factor is monitored separately. If you do not wish a free-R set to be used don't assign FREE.GRID <nx> <ny> <nz><nx>, <ny>, <nz> are integers giving the number of divisions along each whole unit cell edge. The atomic density is sampled on this grid, so if it is too coarse you will introduce errors into the structure factor calculation. These default to values near to celledge*2, which defines a grid spacing of 0.5Å. Different spacegroups have special requirements for factorisation. No prime factors higher than 19 are permitted (The `atom map' generated on this grid has the asymmetric unit defined for this spacegroup. See table below.).The following general restrictions must be observed: <nx> >= 2 * HMAX + 1 <ny> >= 2 * KMAX + 1 <nz> >= 2 * LMAX + 1In addition there are further space group dependent conditions as follows (where `n' is an integer, i.e. <nz> must be even, for instance): <nx> <ny> <nz> P1 - 2n 2n P21 2n 4n 2n P21212 4n 4n 4n P21212a '' '' '' P212121 '' '' '' P4122/P4322 4n 4n 8n P41212/P43212 '' '' '' P31/P32 6n 6n 6n P3/R3 '' '' '' P3121/p3221 '' '' '' P61/P65 6n 6n 12n H2OBG <h2obg>Bulk water scattering to fill in all empty regions. <h2obg> (default 0.0) is the value which will fill the SOLVMAP. ICOEFL uses this.NOSCALEScale between FP and FC is calculated but not applied to output FP and SIGFP.OMIT <th>Reflections are not included in the refinement if mod(Fo/Fc)><th> or mod(Fc/Fo)><th>. This provides a means of excluding data with bad agreement from the refinement and it should be used with care (Default: 1000).RATIO <ratio>Large xyz and B shifts are truncated to <ratio>*rms shift. This is only used for UNRESTRAINED refinement.At the beginning of a refinement, the value of ratio should be kept fairly low (1.5 - 2.0) as shifts tend to be large when the errors in the parameters are large. When most atoms are well refined, then the ratio may be increased to 3 or 4, to allow for the refinement of these atoms which are poorly placed. RESOLUTION <dmin> [ <dmax> ]<dmin>, <dmax> (default 1, 1000) are resolution limits in Angstroms (or in 4*sin**2/l**2 if both are <1.0. They can be given in either order. If only one value is given, it is assumed to be the high resolution cutoff. The SFCALC option outputs FC for all reflections to the outer limit (You usually calculate SFs to make maps and it is best to calculate all SFs and then limit resolution etc. in FFT).Default values are overwritten from HKLIN to include all reflections. RMSSHIFT [XYZ] <rmsxyz> [B] <rmsb>The shifts are scaled to limit the rms XYZ shift and B shift to be equal to the set value, or to the calculated one, whichever is smaller. Only useful during unrestrained refinement. Equivalent to setting STEPSIZE. If RMSSHIFT is given STEPSIZE will be ignored. Default values: 0.0 0.0.RSCB <dsmin> <dsmax>in Angstroms (either order) or 4s**2/lambda**2. Default values: 4.5Å and 1.0Å.If appropriate the program refines the value of the scale and overall B value to fit <Fo**2> to <Fc**2> using reflections within this resolution range. The default resolution range is chosen to use only those reflections where the <Fo**2> distribution fit Wilson statistics reasonably well. A standard protein distribution shows a sharp 5Å dip, which can distort the scale factors. <Fc**2> is often unrealistically high for low resolution data, until the solvent is adequately modelled. The lack of `solvent contrast' causes this. Users are often worried because they get lower R-Factors if this scale is fitted for all data. This does not necessarily mean such a scale is more correct! The default HKLOUT will have:
If NOSCALE is specified HKLOUT will have:
After the end of each structure factor calculation a new scale factor is calculated from the following expression, where SCNEW is the new scale factor and SCALE is the previous scale factor. Sum {(SCALE * Fcalc) * (Fobs)} SCNEW = SCALE * -------------------------------------- Sum {(SCALE * Fcalc) * (SCALE * Fcalc)} SCALE [FPART] <scale> <bov>These parameters are applied to the Fpart as <scale>*exp(-<bov>*ssq/ll) (Defaults: <scale>=1, <bov>=0).
Some structure factor outputs (notably SHELX) include an Fcalc scaled
as 10*absolute.
SFALL calculates FC on the absolute scale. If you wanted to combine these
values you would need
SIGMA <sigma>Data with Fobs < sigma * sd(Fobs) will not be included in refinement calculations. This cutoff has no effect on structure factor calculations. Do not be alarmed when your R-Factor appears to be higher for them than when you are refining; you will have more reflections in the structure factor sum. Default value: -1.0, i.e. no cutoff.STEPSIZE <szo> <szb> <szfrc>This is only used during UNRESTRAINED refinement. XYZ shifts are multiplied by <szo>, B shifts are multiplied by <szb>. <szfrac> is the fractional change in step size above which a second calculation is done. NOT used if RMSSHIFT is given. Default values: 1.0 1.0 0.3.The program tries to estimate these using an algorithm of R. Agarwal's. See NOTES ON STEPSIZE. If you are doing unrestrained refinement, structure factor calculation will be repeated if abs(initial step - optimum step)/(initial step). SYMMETRY <sg>(Defaults to the symmetry of the SFSGROUP). <sg> is the spacegroup number or name. If HKLIN is assigned the symmetry will be picked up from the MTZ file header and should NOT be specified with this keyword. If HKLIN is not assigned you may need to give this. See examples below.TITLE <title>Title (up to 80 characters) used on the printer output. The title information after the keyword is used as a heading for map sections and also as title information for the output reflection data file if present.TRUNCATE <rhmin> <rhmax>The input map is truncated - all values <<rhmin> are set to <rhmin>, and all values ><rhmax> set to <rhmax>. This should be used for the solvent flattening procedure.VDWR <dlim><dlim> specifies the maximum radius for which an atom is considered to contribute to the electron density while building the `atom map'. Default: 2.5.The speed and accuracy of the calculation depend to some extent on the values chosen. For 1.5Å data a value of 2.5 is sufficient (a maximum value of 3.0 is allowed). For neutron data, or for atoms with low B values, it could safely be reset to 1Å. VERBOSELots and lots of output - useful if debugging. Please note that this may produce a message telling you that atom names have been set to ZZZ or residue names to DUM if you have non-standard names. This is not important as the atomic number is determined from the original atom type. See Notes on Formfactors and Notes on Atom Names below.WEIGHT <w>The weight (default 0.0) applied to each structure factor is given by(2sintheta/lambda)**<w> (See [1]). At the beginning of the refinement use <w>=-1.5 or -1.0 to put most weight on the low angle data. As the refinement proceeds increase <w> to 0.0 (But actually we never use it...). MEMORY ALLOCATIONThe program allocates a large work array, whose size may be changed from the default by setting the logical name MEMSIZE to an appropriate integer value. The current value is printed in the output (look for `Memory allocation'). There are other dimensions which can currently only be changed by recompilation and the error messages are currently not very clear about changing the MEMSIZE.INPUT AND OUTPUT FILESThe following input and output files are used by the program (in addition to some scratch files which are deleted at the end of program execution):Input Files:
Output Files:
The files used in a single run of the program depend on the options requested via the data control keywords. NOTES ON FORMFACTORSThe form factors are read from the file assigned to ATOMSF. This defaults to $CLIBD/atomsf.lib. Coefficients for analytical approximation to the scattering factors for atom identifier at a given resolution sintheta/lambda (=S) are given as:
a1*exp(-b1*s*s) + a2*exp(-b2*s*s) - 2 Gaussian approximation (Agarwal, 1978) or as: a1*exp(-b1*s*s) + a2*exp(-b2*s*s) + a3*exp(-b3*s*s) + a4*exp(-b4*s*s) + c - 5 Gaussian approximationSee [2] pp. 99-101 (Table 2.2B) or [5] p. 500ff, table 6.1.1.4 for values a1-a4, b1-b4, c for X-rays. For neutron scattering see [2] pp. 99-101 (Table 2.2B). The atom name can be used to assign a formfactor to it.
For the common atom types the program includes tables for form factors for the required value of NGAUSS. They are C N O H and S. All others must be read from the library files atomsf.lib or atomsf_neutron.lib which are automatically assigned. These contain tables of formfactors for every known atom (Kim Henrick monument). atomsf.lib contains the five Gaussian form for X-ray data and the two Gaussian form for atoms H C N O and S. Unless you are working at very high resolution (>1.5Å) the two Gaussian approximations are adequate and speed up the calculations somewhat. atomsf_neutron.lib has the constant value needed for neutron scattering. NOTES ON XYZIN, ATOM NAMES ETC.Coordinates are assigned to XYZIN in Brookhaven format.The CRYST1 card (which gives the cell dimensions) and the SCALEi cards (which give the required transform for turning the coordinates into fractional ones) should be included in XYZIN. Cell parameters from the HKLIN file or CELL keyword (if either are present) will be used in the absence of CRYST1 and SCALEi cards, but otherwise the results of omitting the cards are somewhat unpredictable - in some cases the program may fail, in others it may proceed but using random or inappropriate values for the cell parameters. If an atom has occupancy set =0.0 it will be copied to the output file if appropriate, but not used in the calculation. This means it is easy to exclude suspect atoms from any calculation without deleting them from the file. Example of part of XYZIN: REMARK 2ZN refined coordinates CRYST1 82.500 82.500 34.000 90.00 90.00 120.0 1 R3 SCALE1 0.01212 0.00700 0.00000 0.00000 SCALE2 0.00000 0.01400 0.00000 0.00000 SCALE3 0.00000 0.00000 0.02941 0.00000 ATOM 1 N GLY A 201 1 -8.863 16.944 14.289 1.00 21.88 ................................. ATOM 826 CA ALA D 130 4 -5.022 21.709 -11.876 1.00 15.58 ATOM 830 CA WAT A 249 1 -0.002 -0.004 7.891 0.33 10.40 ATOM 831 ZN2 WAT A 249 1 0.000 0.000 -8.039 0.33 11.00 ATOM 832 OW1 WAT A 249 1 0.000 0.000 -8.039 0.00 11.00 ................................. Atom 1 is Nitrogen, 826 is Carbon, 830 is Calcium, 831 is Zinc and 832 is Oxygen. NOTES ON HKLINList of H K L .. FP SIGFP ( FPART PHIP F0 .. F19 )An MTZ file with column labels corresponding to H K L .. FP SIGFP [FREE FPART PHIP F0 .. F19]. The header will contain the cell dimensions and spacegroup of the structure (if you want to change these, use MTZUTILS). For SFCALC this file MUST BE SORTED on h k l (i.e. so h is the slowest varying index, then k and l is the fastest), and be within the asymmetric unit expected by the program; see below for expected limits (CAD will reorder your file if necessary). It is sensible to do this if your data has not been processed by CCP4 programs. XDS, MADNESS etc. sometimes have different default asymmetric units to the CCP4 suite. # an example of sorting native data and checking for correct # asymmetric unit cad HKLIN1 $CEXAM/toxd/toxd HKLOUT toxd_sorted <<EOF TITLE Toxd data sorted - default symmetry P212121 LABIN FILE 1 E1=FTOXD3 E2=SIGFTOXD3 CTYPE FILE 1 E1=F E2=Q EOF For SFREF this file MUST BE SORTED on h k l (i.e. so h is the slowest varying index, then k and l is the fastest), and be extended if necessary to cover the whole asymmetric unit expected by the program for the SF spacegroup; see below for expected limits. If you are working at a lower symmetry than optimal the data must have been extended and sorted (CAD will do this too). There is a reason - sorting data is quite slow, and you will probably use this file many times during the course of a refinement. # an example of extending and sorting native data to P1 +-h,+-k,+l cad HKLIN1 $CEXAM/toxd/toxd HKLOUT toxd_p1 << EOF OUTLIM SPACEGROUP P1 TITLE Toxd data extended to P1 cell LABIN FILE 1 E1=FTOXD3 E2=SIGFTOXD3 CTYPE FILE 1 E1=F E2=Q EOF The agreement factor analysis is carried out against the values of FP. Sigma values are assumed to be in column SIGFP. IF FPART and PHIP have been defined, the FC vector equals the FPART vector plus the FC contribution calculated from the input. FPART can be scaled by entering a suitable value on the SCALE input. Other columns in the input file can be copied to the output file by assigning them to F0= F1= ... Or all columns can be copied over if the ALLIN subkeyword is given in LABOUT. Limits for Indices for the various space groups (these are the same as those used in the data reduction programs and FFT): P1 h k l : l >= 0 h k 0 : h >= 0 0 k 0 : k >= 0 P21 h k l : k >= 0, l >= 0 h k 0 : h >= 0 P21212 h k l : h >= 0, k >= 0, l >= 0 P212121 h k l : h >= 0, k >= 0, l >= 0 P41212 h k l : h >= 0, k >= 0, l >= 0, h >= k (+ other tetragonal) P31/P32 h k l : h >= 0, k > 0 0 0 l : l > 0 P3/R3 h k l : h >= 0, k > 0 0 0 l : l > 0 P3121/P3221 h k l : h >= 0, k >= 0, k <= h (all l) h h l : l >= 0 P61/P65 h k l : h >= 0, k >= 0, k <= h (all l) h h l : l >= 0 NOTES ON MAPINA standard MAP file. The header will contain the cell dimensions and spacegroup of the structure.If you wish to generate structure factors from a input map it is essential that the map is in an appropriate format for the SFALL spacegroup you request. Common errors are: 1) using a different grid for the FFT calculation and the SFALL calculation. FATAL! 2) Changing the axis order in the FFT calculation - both have the same defaults for the same spacegroup! FATAL! 3) Choosing a different asymmetric unit for the FFT and the SFALL calculation. You must make sure your map has the one which covers the SFALL requirements. There is no flexibility here. Limits for axes for the various space groups (these are the same as those used as defaults in FFT):
In space group P1, P21 and P21212a, 'b' is taken as the unique axis. X1 X2 X3 Range of X3 Axis order P1 Z X Y 0 to Y Z X Y P21 Z X Y 0 to Y/2-1 Z X Y P21212a Z X Y 0 to Y/4 Z X Y P212121 X Y Z 0 to Z/4 Y X Z P4122 X Y Z 0 to Z/8 Y X Z P41212 X Y Z 0 to Z/8 Y X Z P4322 X Y Z 0 to Z/8 Y X Z P43212 X Y Z 0 to Z/8 Y X Z P31 X Y Z 0 to Z/3-1 Y X Z P32 X Y Z 0 to Z/3-1 Y X Z P3 X Y Z 0 to Z-1 Y X Z R3 X Y Z 0 to Z/3-1 Y X Z P3121 X Y Z 0 to Z/6 Y X Z P3221 X Y Z 0 to Z/6 Y X Z P61 X Y Z 0 to Z/6-1 Y X Z P65 X Y Z 0 to Z/6-1 Y X Z OUTPUT from Structure Factor CalculationNote: The logfile has been flagged with XLOGGRAPH symbols to allow selected output to be displayed on an Xterminal.
NOTES ON XYZOUTCoordinates in Brookhaven format with the XYZ or B shifts applied - output of unrestrained refinement.The proportion of the shift to be applied can be modified (see the description of keyword STEPSIZE). If an input atom has occupancy of 0.0 it will be copied to the output file. NOTES ON HKLOUTThe output file contains FC and PHIC only for those reflections which exist in the HKLIN file, even if the structure factor calculation has covered more of reciprocal space. The list will include ALL reflections in HKLIN within the outer resolution limit. There is no inner resolution cutoff, or sigma cutoff. Hence there may well be more reflections included than during refinement and your R-factor may well be higher than the one output during refinement. All columns specified in LABIN will be written to HKLOUT. If ALLIN is specified then all input columns will be in HKLOUT.If no HKLIN has been assigned the program outputs a list of H K L FC PHIC for all possible reflections within the reciprocal asymmetric unit for the SFSGROUP that are within the requested resolution limits. FC and PHIC can be modified for use in solvent flattening (see keyword AVMODE). NOTES ON MAPOUTSeveral types of maps can be output - see keyword MODE for how to produce each one.
OUTPUT from the Gradients Calculation - NOTES ON GRADMATA direct access file of gradient information is output ready for input to PROLSQ (no longer available in the CCP4 Suite). The following items are output in this section:
OUTPUT from the Calculation of the Optimum ShiftsThis section gives PII (minimisation function) values, relative slopes, calculated optimum step sizes and, when determined, the actual step size used in applying the shifts.NOTES ON STEPSIZER. Agarwal attempted to estimate the best step size to use during UNRESTRAINED refinement. A displacement of the form alphaDelta(u), which minimises the minimisation function of the least squares P(u + alphaDelta(u)), will be the optimum where alpha is the optimum step size. As P as a function of alpha approximates to a quadratic function, it is possible to calculate the optimum step size, alphaopt, from the values of P(alpha) at alpha = 0 and alpha = alpha1 and from the slope of P(alpha) at alpha = 0. The initial step size alpha1 used in the calculation is chosen as 1.0 or the value input in the control data. If the fractional change in step size is greater than SZFRC (by default 0.3) then a second calculation is done with alpha1 taken as the optimum step size just calculated.COMMON PROBLEMS
EXAMPLESCalculate map from atom coordinatessfall XYZIN /f/scratch/swift/pa_model1_2.pdb XYZOUT $SCRATCH/junk.pdb MAPOUT $SCRATCH/junk.map << END-sfall TITL Phasing on initial PA structure (refined twice) GRID 52 64 76 MODE ATMMAP RESO 30 2.1 BINS 60 RSCB 5.0 2.1 SFSG 1 END END-sfall Read in coordinates to calculate structure factorssfall HKLIN $SCRATCH/pt2_pt4_shhg2_os2_nat.mtz HKLOUT $SCRATCH/pa_model1_2_phase.mtz XYZIN /f/scratch/swift/pa_model1_2.pdb << END-sfall TITL Phasing on initial PA structure (refined twice) GRID 52 64 76 MODE SFCALC XYZIN HKLIN RESO 30 2.1 BINS 60 RSCB 5.0 2.1 SFSG 1 LABI FP=F_V4 SIGFP=SIGF_V4 FREE=FreeRflag LABO FC=FC PHIC=AC END END-sfall Read in map to calculate structure factorssfall HKLIN $SCRATCH/pt2_pt4_shhg2_os2_nat.mtz HKLOUT $SCRATCH/pa_model1_2_phase.mtz MAPIN $SCRATCH/junk.map << END-sfall TITL Phasing on initial PA structure (refined twice) GRID 52 64 76 MODE SFCALC MAPIN HKLIN RESO 30 2.1 BINS 60 RSCB 8.0 2.1 SFSG 1 LABI FP=F_V4 SIGFP=SIGF_V4 LABO FC=FC PHIC=AC END END-sfall Iteration_cycles to do the back-transformation and phase combination cycles in solvent flattening procedure# - back-transform map with sfall to obtain Fcs and phases, scale # and calculate R-factor # - combine the phases with sigmaa # - run fft to calculate a new map with Fobs and combined phases as # input to flatmap # sfall HKLIN ../mtz/fvb18_sigmaa.mtz HKLOUT fvb_flat_14.mtz MAPIN fvb_flattened_14.map << EOF-sfall titl back transformation for cyc 14 mode SFCALC MAPIN HKLIN grid 104 128 160 reso 200.0 3.0 BINS 40 rscb 10.0 3.0 sfsg 19 AVMODE RADIUS 8 badd 0 FORM NGAUSS 5 C H N O S LABI FP=fvb_Fnp SIGFP=fvb_SIGFnp - F0=PHIB_I3 F1=FOM_I3 - F2=A F3=B F4=C F5=D LABO FC=FCWANG PHIC=PHICWANG WANGWT=WC8 END EOF-sfall Unrestrained refinementsfall HKLIN ../data/taka_fx_trunc.mtz XYZIN ../data/takaxp_model8_b.pdb XYZOUT $SCRATCH/junk.pdb << END-sfall TITL TAKAXP_MODEL8_9 SFS GRID 104 136 264 MODE SFREF XYZ UNRESTRAINED FREE 3.0 RESO 20 2.1 60 RSCB 8.0 2.1 SFSG 19 FORM NGAU 2 Ca Cr Fe+2 P LABI FP=FP SIG=SIGFP FREE=FreeRflag END-sfall Prolsq refinement cycles#!/bin/csh -f # # REFINE consists of sfall, protin and prolsq # set LastCycle=146 set Number_of_Cycles=8 set CycleCounter=0 # Begin: # @ CurrentCycle=$LastCycle + 1 # # ========== SFALL ========== # set DATE=`date` echo ' *************************** '$CurrentCycle' **************' echo ' ' echo ' Running SFALL for Cycle Number :' $CurrentCycle' on '$DATE echo ' ' echo ' *************************** '$CurrentCycle' **************' # sfall HKLIN s91a.mtz XYZIN s91a_cyc${LastCycle}.brk GRADMAT s91a_GRADMAT${CurrentCycle}.tmp << END-sfall TITL Barnase s91a Mutant MODE SFREF XYZB RESTRAINED ! for prolsq GRID 90 90 120 !div CELL by these should give .=. 0.7 A BINS 60 RESO 10 2.2 !last no. is nstep RATI 3 !truncate xyz and B shrifts to RATI*RMS BADD 0 !smear the gaussians to gen atomic density [10.0] RSCB 4 1.5 !limits for refining scale and B, same as RESO ? SFSG 145 ! fft space group LABIN FP=S91A_F SIGFP=S91A_SIGF FREE=FreeRflag END END-sfall # /bin/rm ${SCRATCH}s91a_junk.brk ${SCRATCH}sfall* # # ========== PROTIN ========== # set DATE=`date` echo ' *************************** '$CurrentCycle' **************' echo ' ' echo ' Running PROTIN for Cycle Number :' $CurrentCycle' on '$DATE echo ' Running PROTIN for Cycle Number :' $CurrentCycle' on '$DATE echo ' ' echo ' *************************** '$CurrentCycle' **************' # protin XYZIN s91a_cyc${LastCycle}.brk PROTOUT s91a_protout.out PROTCOUNTS s91a_protcounts.out DICTPROTN ${LIBD}protin.dict << END-protin TITLE Barnase Ser91 -> Ala SYMMETRY 145 ! CHNNAME ID A CHNTYP 1 ROFFSET 0 CHNNAME ID B CHNTYP 2 ROFFSET 0 CHNNAME ID C CHNTYP 3 ROFFSET 0 chnname id E chntyp 4 roffset 0 ! CHNTYP 1 NTER 3 VAL 3 CTER 110 ARG 2 CHNTYP 2 NTER 3 VAL 3 CTER 110 ARG 2 CHNTYP 3 NTER 2 GLN 3 CTER 110 ARG 2 chntyp 4 wat ! LIST FEW ![few]/some (for >20A)/all PEPP 5 !no. of atoms restrained to be in a plane [5] VDWR 1 CPR 8 3.4 END END-protin # # ========== PROLSQ ========== # set DATE=`date` echo ' *************************** '$CurrentCycle' **************' echo ' ' echo ' Running PROLSQ for Cycle Number :' $CurrentCycle' on '$DATE echo ' ' echo ' *************************** '$CurrentCycle' **************' # prolsq XYZIN s91a_cyc${LastCycle}.brk PROTOUT s91a_protout.out PROTCOUNTS s91a_protcounts.out GRADMAT s91a_GRADMAT${CurrentCycle}.tmp XYZOUT s91a_cyc${CurrentCycle}.brk << END-prol TITLE Barnase s91a Mutant CGCYCLES 20 100 1 !max no. cyc for conjugate gradient soln [50] ORIGIN 0 0 0 !origin constraint flags RESOLUTION 10 2.2 !select Fobs OUTPUT XYZ !o/p XYZ(.brk)/SF(.mtz)/SHIFTS(shifts file) RTEST -1 1 1 1 1 1 1 1 !-1 -> +1 when R factor .=25% NOUPDATE !not update atomic coord in PROTIN o/p file MATRIX .30 .85 .85 !1st param adjusts SF/geom contribution !start=.08;.12;.2;.25;.3;.35; final=.08 !2nd&3rd:shift factors for position and B BATOM !(+) when refining indiv B ! ! ***** Restraints ***** DISTANCE 1 0.02 0.04 0.05 0.01 0.1 PLANE 1 0.02 CHIRAL 1 0.12 TEMPERATURE 1.0 1.0 1.5 1.5 2.0 0.1 !start !TEMPERATURE 1.0 1.5 2.0 2.0 2.5 3.0 !when R factor <25% HOLD 0.1 1.0 0.1 !TORSION 1 20 20 30 50 !when R factor >25% TORSION 1 10 10 15 45 !when R factor <25% VANDERWAAL 1 0.2 -0.3 0.0 -0.2 0.0 ! ! ***** Monitor ***** MONITOR DISTAN 3. VANDERWAAL .5 TORSION 3 MONITOR PLANE 3 CHIRAL 3 !for R factor<25% END END-prol # Cleanup: # /bin/rm s91a_protout.out s91a_protcounts.out s91a_GRADMAT${CurrentCycle}.tmp # @ CycleCounter=$CycleCounter + 1 @ LastCycle=$LastCycle + 1 # # Test to see whether Number_of_cycles done # if ($CycleCounter<$Number_of_Cycles) goto Begin # Simple Runnable ExamplesAUTHORSEleanor Dodson and Ted Baker, based on a program by Ramesh Agarwal and Neil Isaacs. Documented for the Daresbury Laboratory protein crystallography system by John Campbell.PROGRAM FUNCTIONThe program is based on a least-squares refinement technique using fast Fourier transform (FFT) methods as devised by Ramesh Agarwal. His paper [1] on the technique should be studied to get a detailed understanding of the method and only a broad description of the method is given in this section.The program may be used for
Modelling of the Electron Density and Calculation of Structure FactorsThe contribution of an atom to a structure factor is the product of an atomic form factor function and a Gaussian temperature factor function. If the form factor function is approximated by a sum of Gaussian functions, then this contribution may be represented by the sum of Gaussian terms and so may the modelled electron density. The amount of computation required in setting up the electron density map is proportional to the number of Gaussian terms used in approximating the form factor. The use of a two term Gaussian is probably adequate in most cases where the resolution of the data does not extend beyond 1.5Å. The program does, however, allow for the input of other numbers of terms. Non default atoms defined in FORMFACTOR are usually represented as five Gaussians since heavy atom scattering factors are more complex.Tables of coefficients for a five term Gaussian approximation to the atomic form factors are given in the International Tables for X-ray Crystallography Vol.IV [2]. Some values for two and one term Gaussian approximations are given in the Agarwal paper. The radius of the atoms (beyond which the electron density is considered to be negligible) is also required in the calculation. The value for this radius must be chosen to balance the requirements of accuracy (large radius) and speed (small radius). The tables below indicate the radii required for carbon atoms, for different temperature factors (Bm), to give a given fractional loss of electron density (DeltaZm/Zm). DeltaZm/Zm 0.02 0.01 0.005 Two term Gaussian Bm 5 1.89 2.06 2.22 10 2.02 2.21 2.38 20 2.26 2.47 2.68 More values are given in the Agarwal paper. The electron density map is built up using the contributions from each atom within, or near (within the atom radius) to, the part of the map being calculated. A three dimensional FFT of the electron density map is used to calculate the structure factors. In this calculation the space group symmetries are used whenever possible. If the sampling grid is too coarse, significant errors may occur in the calculation of the structure factors. These errors may be reduced [3] either by using a finer grid for the calculation or by adding an artificial extra temperature factor to all the atoms when modelling the density (not recommended). This factor has also to be taken into account when the calculated structure factors are compared with the observed structure factors. Calculation of the GradientsThe gradients are calculated by convoluting a modified difference density function with the electron densities of the individual atoms being refined. The density function is calculated by calculating the Fourier transform (again using the FFT) of a difference function based on the weighted values of the errors in the structure factors. A three dimensional difference FFT is calculated and convoluted with the derivatives of the atomic density for x y and z, or B. The convolution is carried out by summing, over all the grid points enclosed by the atom (radius defined as described above), the product of the electron density of the atom and the value of the modified difference density function. Thus, the calculation of the gradients is similar to the calculation of the structure factors with the steps being reversed.The Normal MatrixOnly the diagonal terms (interacting between parameters of the same atom) are calculated for the normal matrix. The method described by Agarwal is used but is generalised to allow for a variable number of terms in the Gaussian approximations to the atomic form factors. Also the function A'xx(bi) is tabulated for each value of bi from 1 to 150 and the nearest values, rather than interpolated values, are selected from the table when forming the diagonal terms of the matrix.A set of terms, H, is only calculated for one parameter as the other diagonal terms may be calculated to a good degree of accuracy from this initial set. An inverse matrix is set up for calculating the orthogonalised contributions of the gradient. Calculation of the ShiftsThe shifts, calculated as described above, may not in fact be the optimum ones to use due to the approximate nature and the non-linearity of the method used. In practice, it is unwise to apply large shifts as these may give rise to instability in the refinement or to `oscillating atoms' and thus provision is made to truncate the larger shifts. The program allows for the input of a ratio, RATIO, which may be used to truncate the maximum value of a shift toRATIO * r.m.s. value of the shiftsAt the beginning of a refinement, the value of RATIO should be kept fairly low (1.5-2.0) as shifts tend to be large when the errors in the parameters are large. When most atoms are well refined, then the ratio may be increased to 3 or 4, to allow for the refinement of these atoms which are poorly placed. Even these truncated shifts may not be the optimum shifts to be applied. A displacement of the form alphaDelta(u), which minimises the minimisation function of the least squares P(u + alphaDelta(u)), will be the optimum where alpha is the optimum step size. As P as a function of alpha approximates to a quadratic function, it is possible to calculate the optimum step size, alphaopt, from the values of P(alpha) at alpha = 0 and alpha = alpha1 and from the slope of P(alpha) at alpha = 0. The initial step size alpha1 used in the calculation is chosen as 1.0 or the value input in the control data. If the fractional change in step size is greater than SZFRC (by default 0.3) then a second calculation is done with alpha1 taken as the optimum step size just calculated. REFERENCES
SEE ALSOcad, freerflag, fft, icoefl, mtzutils, overlapmap, pdbset, prolsq (no longer available in the CCP4 Suite), protin, sigmaa, watersearch (1), xloggraph, X-PLOR manual |