
# SIGMAA (CCP4: Supported Program)

## NAME

sigmaa - Improved Fourier coefficients using calculated phases

## SYNOPSIS

sigmaa HKLIN foo_in.mtz [ HKLOUT foo_out.mtz ]
[Keyworded input]

## DESCRIPTION

The program SIGMAA (Read, 1986) can be used to combine a set of calculated phases with a set of previously determined phases for which the phase probability profiles are held in the form of Hendrickson-Lattman coefficients.

It calculates weighted Fourier coefficients either from the calculated phase from a (partial) model structure, or by combining phase probabilities from isomorphous phases with those from one or more (partial) structures.

WARNING: SIGMAA has been converted to work with Missing Number Flags (MNFs). As in FFT (see also the documentation on Missing Number Flags), a missing Fo is replaced by DFc for the FWT map coefficient. When phases are combined for missing data, the phase probability is assumed to be uniform. This procedure may not be optimal; a version from Randy Read may become available in a subsequent release.

There are 3 main options:

1. PARTIAL
use partial structure information, writing out a weight and coefficients for maps in columns as follows:
WCMB
A weight (analogous to the 'Sim weight') to estimate the reliability of AlphaCalc.
DELFWT (m|Fo| - D|Fc|) exp(i AlphaCalc)
For a difference map, FFT input: F1=DELFWT PHI=PHIC
FWT (2m|Fo| - D|Fc|) exp(i AlphaCalc)
Analogous to 2Fo-Fc map, FFT input: F1=FWT PHI=PHIC

where Fo, Fc are observed and calculated structure factors. Note that for centric terms, the (2m|Fo|-D|Fc|) coefficients are replaced by m|Fo|; these coefficients reduce/remove model bias.

2. COMBINE PART
combine isomorphous phases (preferably input as Hendrickson-Lattman coefficients ABCD) with calculated phases from up to 3 sources; output the combined phase and weight (PHCMB, WCMB) and coefficients which minimise model bias (again labelled FWT, PHFWT and DELFWT, PHDELFWT).
3. COMBINE MIR2
combine two sets of experimental phases, with or without Hendrickson-Lattman coefficients. This can only be done pair-wise; it might be argued that all the data should instead be used together in calculating the phase.
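The PARTIAL coefficients listed above can be sketched for a single reflection as follows. This is a minimal illustration, not the program's code: the function name, the use of `None` for a missing Fo, and the choice to zero the difference term when Fo is missing are assumptions made here, and m and D are taken as already known.

```python
import cmath
import math


def partial_map_coefficients(fo, fc, phic_deg, m, d, centric=False):
    """Return (FWT, DELFWT) as complex numbers for one reflection.

    fo, fc   : observed and calculated amplitudes |Fo|, |Fc|
    phic_deg : calculated phase AlphaCalc in degrees
    m, d     : figure of merit and D factor, assumed already estimated
    fo=None  : hypothetical convention for a missing observation
    """
    phase = cmath.exp(1j * math.radians(phic_deg))
    if fo is None:
        # Missing Fo: FWT falls back to D|Fc|. Setting the difference
        # term to zero is an assumption of this sketch.
        return d * fc * phase, 0j
    delfwt = (m * fo - d * fc) * phase
    if centric:
        # Centric terms: (2m|Fo| - D|Fc|) is replaced by m|Fo|.
        fwt = m * fo * phase
    else:
        fwt = (2.0 * m * fo - d * fc) * phase
    return fwt, delfwt
```

With |Fo| = 100, |Fc| = 80, m = 0.9 and D = 0.8, the acentric amplitudes are 2(0.9)(100) - (0.8)(80) for FWT and (0.9)(100) - (0.8)(80) for DELFWT.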

The program first calculates, iteratively in resolution bins, the value of SigmaA as defined by Srinivasan, 1966; and then for each reflection, the figure of merit m and the estimate of the error in the partial structure from coordinate errors D (Luzzati, 1952). There is an option to scale these to modify the weight assigned to the partial structure information, or to read in values of SigmaA derived previously.

If EPS is the multiplicity for the reflection zone (Rogers, 1965),

```
SigmaA = D*sqrt(sigmaP/sigmaN)
Eo     = Fo/sqrt(EPS*sigmaN)
Ec     = Fc/sqrt(EPS*sigmaP)
where sigmaN = <Fo**2/EPS> and sigmaP = <Fc**2/EPS>.
```
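A minimal sketch of this normalisation for the reflections in one resolution bin is given below. It assumes that Ec is normalised by sigmaP, so that both sets of E-values have a mean square of 1 within the bin; the function name and the plain-list inputs are illustrative only.

```python
import math


def normalise_bin(fo, fc, eps):
    """Return (Eo, Ec, sigmaN, sigmaP) for the reflections of one bin.

    fo, fc : lists of observed and calculated amplitudes
    eps    : list of multiplicity factors EPS for the same reflections
    """
    n = len(fo)
    # Bin averages <Fo**2/EPS> and <Fc**2/EPS>.
    sigma_n = sum(f * f / e for f, e in zip(fo, eps)) / n
    sigma_p = sum(f * f / e for f, e in zip(fc, eps)) / n
    eo = [f / math.sqrt(e * sigma_n) for f, e in zip(fo, eps)]
    ec = [f / math.sqrt(e * sigma_p) for f, e in zip(fc, eps)]
    return eo, ec, sigma_n, sigma_p
```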

The figure of merit m = <cos(AlphaTrue - AlphaCalc)> is calculated from Eo, Ec and SigmaA, while the map coefficients arise from the approximation that

```
m Eo exp(iAlphaCalc) = 0.5 Eo exp(iAlphaTrue)
                     + 0.5 SigmaA Ec exp(iAlphaCalc)
```
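The expression for m is not written out in this document. The sketch below assumes the form given by Read (1986): for acentric reflections the phase error follows a distribution proportional to exp(X cos(delta)) with X = 2 SigmaA Eo Ec / (1 - SigmaA**2), and for centric reflections m = tanh(X/2). The mean cosine is evaluated by direct numerical integration rather than with Bessel functions, purely to keep the example dependency-free.

```python
import math


def figure_of_merit(eo, ec, sigma_a, centric=False, steps=720):
    """Estimate m = <cos(AlphaTrue - AlphaCalc)> for one reflection.

    Assumes 0 <= sigma_a < 1 and the Read (1986) form of X (an
    assumption of this sketch, not stated in the text above).
    """
    x = 2.0 * sigma_a * eo * ec / (1.0 - sigma_a ** 2)
    if centric:
        return math.tanh(x / 2.0)
    num = 0.0
    den = 0.0
    for k in range(steps):
        delta = 2.0 * math.pi * k / steps
        # Subtracting 1 in the exponent avoids overflow for large X.
        weight = math.exp(x * (math.cos(delta) - 1.0))
        num += weight * math.cos(delta)
        den += weight
    return num / den
```

With SigmaA = 0 the distribution is flat and m is zero; as SigmaA approaches 1 for strong reflections, m approaches 1.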

If coordinate errors are assumed to be normally distributed,

```
ln SigmaA = intercept - slope * (sintheta/lambda)**2
where intercept = 0.5 * ln(sigmaP/sigmaN)
and       slope = (8/3) * pi**2 * (mean square coordinate error)
```
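A straight-line fit of this kind can be sketched as below. The conversion from slope to rms error assumes isotropic, normally distributed coordinate errors, for which slope = (8/3) pi**2 times the mean-square error (equivalently pi**3 times the square of the mean absolute error). The function name and inputs are illustrative and are not taken from the program.

```python
import math


def fit_coordinate_error(s2, sigma_a):
    """Least-squares line through ln(SigmaA) against (sin(theta)/lambda)**2.

    s2      : bin centres in (sin(theta)/lambda)**2
    sigma_a : SigmaA estimate for each bin (all values must be > 0)
    Returns (intercept, slope, rms_error) with the sign convention
    ln SigmaA = intercept - slope * s2.
    """
    y = [math.log(v) for v in sigma_a]
    n = len(s2)
    mean_x = sum(s2) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in s2)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(s2, y))
    gradient = sxy / sxx
    slope = -gradient
    intercept = mean_y - gradient * mean_x
    # Assumed conversion: slope = (8/3) * pi**2 * <dr**2>.
    mean_square = slope / ((8.0 / 3.0) * math.pi ** 2)
    rms = math.sqrt(mean_square) if mean_square > 0 else float("nan")
    return intercept, slope, rms
```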

## KEYWORDED INPUT

The various data control lines are identified by keywords. Only the first 4 characters need be given. Those available are:

COMBINE, END, ERROR, LABIN, LABOUT, PARTIAL, RANGES, RESOLUTION, SIGMAA, SYMMETRY, TITLE

### COMBINE [ PART <nps> ] [ DAMP <d1> [ <d2> [ <d3> ]]] [ RESOLUTION <Rmin> <Rmax> ]

[Required for option 2.]

Use this option to combine experimental phase information from isomorphous replacement (columns PHIBP, WP, HLA, HLB, HLC, HLD from the input data file) with that from (partial) model structures. This option produces an output data file assigned to HKLOUT.

PART <nps>
<nps> is the number of (partial) model structures, default: 1, maximum allowed: 3.
DAMP <d1> <d2> <d3>
<d1> <d2> <d3> (default 1.0) are factors by which the SigmaA values generated for the partial structures are multiplied. Once the R-factor between Fobs and Fcalc falls below about 30%, the SigmaA weights become close to 1, so the MIR information contributes very little to the combined phase. Giving values of di < 1.0 may be helpful. See keyword SIGMAA for Randy Read's preferred solution.
RESOLUTION <Rmin> <Rmax>
If resolution limits <Rmin>, <Rmax> are given here, phase combination is only done within this resolution shell: typically this would be used to include experimental phases only for high resolution data during a phase extension process. In this case, a low resolution limit would be set, allowing lower resolution data which has already been phased in previous cycles to diverge from the (incorrect) experimental phases according to phase information from averaging or density modification.

### COMBINE [MIR2] [ RESOLUTION <Rmin> <Rmax> ]

Merge together two sets of MIR phases. RESOLUTION is the same as above.

### ERROR

If this command is present, a straight line is fitted to the plot of ln (SigmaA) against resolution in order to estimate the rms coordinate error.

### LABIN <program label>=<file label> .....

Input column assignments. Program labels for the various options are:

PARTIAL
FP SIGFP FC PHIC
COMBINE PART ...
FP SIGFP PHIBP WP [HLA HLB HLC HLD], with FC PHIC or FC1 PHIC1 FC2 PHIC2 [FC3 PHIC3]
COMBINE MIR2 ...
FP SIGFP PHIBP WP [HLA HLB HLC HLD], with PHIB2 W2 [HLA2 HLB2 HLC2 HLD2]

### LABOUT <program label>=<file label> .....

Output column assignments. Program labels for the options producing output data are:

PARTIAL
DELFWT FWT WCMB
COMBINE MIR2
HLAC HLBC HLCC HLDC WCMB PHCMB
COMBINE PART ...
DELFWT PHDELFWT FWT PHFWT WCMB PHCMB

For details of these, see INPUT AND OUTPUT FILES.

### PARTIAL [ DAMP <d1> ]

Produce weighted map coefficients from a partial structure. This is the default option. It produces an output .mtz data file.

DAMP <d1>
<d1> is the damping factor for the SigmaA weights (default 1.0).

### RANGES [ <nbin> <mon> ]

Set the number of resolution bins <nbin> and the reflection monitoring interval <mon>. Defaults: 20 1000; maximum <nbin> allowed: 50.

<nbin> is the number of resolution bins (of equal width in [sin(theta)/lambda]**2) into which the partial structure data are divided for normalization and sigmaA estimation. It is IMPORTANT that each resolution range contains sufficient reflections. It is best to use as large a value of <nbin> as possible, as long as the estimates of sigmaA vary smoothly with resolution. If they do not, <nbin> should be reduced until sigmaA does vary smoothly. A good first guess is the number of reflections divided by 1000. If sigmaA refinement converges to zero in one or more of the ranges (which happens sometimes when the correct value is low), this can usually be circumvented by decreasing <nbin>.

Information about every <mon>-th reflection will be written to the log file.
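The binning rule and the first guess for <nbin> can be sketched as below. This is an illustration under stated assumptions: bins span the observed range of (sin(theta)/lambda)**2, and the cap of 50 mirrors the stated maximum. The function names are hypothetical.

```python
def guess_nbin(nrefl, per_bin=1000, max_bins=50):
    """First guess: number of reflections divided by 1000, kept in 1..50."""
    return max(1, min(max_bins, nrefl // per_bin))


def assign_bins(s2, nbin):
    """Assign each reflection to one of nbin bins of equal width.

    s2 : list of (sin(theta)/lambda)**2 values, one per reflection
    Returns a list of zero-based bin indices.
    """
    lo, hi = min(s2), max(s2)
    width = (hi - lo) / nbin
    bins = []
    for value in s2:
        index = int((value - lo) / width) if width > 0 else 0
        # The highest-resolution reflection falls into the last bin.
        bins.append(min(index, nbin - 1))
    return bins
```

Counting the members of each bin after assignment is a quick way to check that every range contains enough reflections before trusting the sigmaA estimates.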

### RESOLUTION [ <rmin> ] <rmax>

Low and high resolution limits, in either order, or the upper limit <rmax> if only one value is specified. The limits are in Angstroms or, if both are <1.0, in units of 4(sintheta/lambda)**2. By default, all the data in the file are used.
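The unit rule can be made concrete with a small conversion sketch, using the identity 4(sintheta/lambda)**2 = 1/d**2. The function name is hypothetical, and treating a single value as the high-resolution cutoff is this sketch's reading of "upper limit", not a statement about the program's internals.

```python
import math


def resolution_limits_in_angstroms(a, b=None):
    """Return (low_res, high_res) in Angstroms.

    Values are taken as Angstroms unless all of them are < 1.0, in which
    case they are read as 4(sin(theta)/lambda)**2 = 1/d**2.
    """
    values = [a] if b is None else [a, b]
    if all(v < 1.0 for v in values):
        values = [math.inf if v == 0.0 else 1.0 / math.sqrt(v) for v in values]
    high = min(values)  # smallest d-spacing is the high-resolution limit
    low = max(values) if len(values) == 2 else math.inf
    return low, high
```

For example, `RESOLUTION 0.0 0.25` corresponds to all data out to 2.0 Angstroms, while `RESOLUTION 100.0 2.6` and `RESOLUTION 2.6 100.0` describe the same shell.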

### SIGMAA <nps> <nbin>

Input SigmaA values from another source. Normally these values will be calculated in the program so this keyword is unnecessary. However if the agreement between Fobs and Fcalc becomes very good - for example if the Rfactor is <25% - then the calculated SIGMAA values weight up the PHIcalc at the expense of the experimental phases. This may not be desirable and you may need either to invoke the DAMP keyword or retain an early estimate of sigmaA.

<nps>
number of partial structures.
<nbin>
number of bins, followed by <nbin> lines of the form:
```
SigmaA(1,1)    [ SigmaA(2,1)    ... [SigmaA(nps,1)]]
SigmaA(1,2)    [ SigmaA(2,2)    ... [SigmaA(nps,2)]]
..........
SigmaA(1,nbin) [ SigmaA(2,nbin) ... [SigmaA(nps,nbin)]]
```

### SYMMETRY <name> | <number> | <operators>

Spacegroup number or name or operators in International Tables format. By default, symmetry information is read from the input file header.

### TITLE <title>

A title written to the log file and in the header of the output MTZ data file (if produced).

### END

End of input.

## INPUT AND OUTPUT FILES

### Input reflection data file

This is an MTZ file assigned to logical name HKLIN. The following column assignments are required (those which are optional are enclosed in square brackets):

1. PARTIAL option:
H K L FP SIGFP FC PHIC
with
FP, SIGFP
native amplitude and standard deviation
FC, PHIC
calculated amplitude and phase (degrees)
2. COMBINE option:
1. Combination of two sets of MIR phases:
H K L FP SIGFP PHIBP WP [HLA HLB HLC HLD]
PHIB2 W2 [HLA2 HLB2 HLC2 HLD2]
with
FP, SIGFP
native amplitude and standard deviation
PHIBP
isomorphous centroid phase (degrees)
WP
figure of merit
HLA...HLD
Hendrickson-Lattman probability coefficients corresponding to isomorphous phase. If these are absent, a unimodal probability distribution will be set up around PHIBP.
PHIB2
isomorphous centroid phase for second set
W2
figure of merit for second set
HLA2..HLD2
Hendrickson-Lattman probability coefficients for second set. If these are absent, a unimodal probability distribution will be set up around PHIB2.
2. Combination of one set of MIR phases with PARTIAL information:
H K L FP SIGFP PHIBP WP [HLA HLB HLC HLD]
plus FC PHIC
or FC1 PHIC1 FC2 PHIC2 [FC3 PHIC3]
with
FP, SIGFP
native amplitude and standard deviation
PHIBP
isomorphous centroid phase (degrees)
WP
figure of merit
HLA...HLD
Hendrickson-Lattman probability coefficients corresponding to isomorphous phase. If these are absent, a unimodal probability distribution will be set up around PHIBP.
FC, PHIC
calculated amplitude and phase (degrees) for one partial structure
FC1, PHIC1
calculated amplitude and phase (degrees) for first partial structure when nps > 1
FC2, PHIC2
calculated amplitude and phase (degrees) for second partial structure when nps >= 2
FC3, PHIC3
calculated amplitude and phase (degrees) for third partial structure when nps = 3

### Output reflection data file

This is an MTZ file assigned to logical name HKLOUT. The file will contain all the columns from the input file with extra columns appended, the number depending on which option was used. The default labels of these columns are given below; these may be changed with the LABOUT command.

1. PARTIAL option:
The new columns are: WCMB DELFWT FWT, with
WCMB
figure of merit m of calculated phase (Sim weight)
DELFWT
Fourier amplitude for 'difference' map (mFo-DFc)
FWT
Fourier amplitude for '2Fo-Fc' map (2mFo-DFc)

These terms may be positive or negative.

The phases used for these maps will always be PHIC.

2. COMBINE option:
The new columns are: PHCMB WCMB FWT PHFWT DELFWT PHDELFWT, with
PHCMB
combined phase angle (degrees)
WCMB
combined figure of merit
FWT
Fourier amplitude for '2mFo-DFc' map
PHFWT
Combined phase for this term
DELFWT
Fourier amplitude for 'mFo-DFc' map
PHDELFWT
Combined phase for this term

## EXAMPLES

### Difference and 2Fo-Fc maps from calculated phases

```
sigmaa HKLIN hktmpico.mtz HKLOUT hksigmaa1.mtz << END-sigmaa
TITLE   SIGMAA m*Fo-Fc map  pfk B.st. BP2.. PROLSQ cycle<1>..
RESOLUTION 100.0 2.6 ! Resolution limits in Angstroms
RANGES   30 5000     ! Number of bins for analysis v. resolution
! Monitor every 5000th reflection
PARTIAL              ! Option for difference map coefficients
ERROR                ! Use sigmaA v resolution for coordinate error
LABIN FP=FO SIGFP=SIGFO FC=FC PHIC=PHIC
END
END-sigmaa
```

Note: This example uses the default output file labels. To calculate the 'difference' map, use DELFWT in FFT. To calculate the '2Fo-Fc' map, use FWT.

### Phase combination

```
sigmaa \
HKLIN ../data/sp400_monster2.mtz \
HKLOUT ../data/sp400_phase_comb.mtz \
<< END-sigmaa
TITLE   TRYIT
RANGES  10 1000      ! Number of analysis bins, monitor interval
RESOLUTION 0.0 0.25  ! Resolution limits in 4(sintheta/lambda)**2
ERROR                ! Use sigmaA v resolution for coordinate error
COMBINE PART 1       ! Combine isomorphous + 1 partial model
LABIN   FP=F(Mer) SIGFP=SIGF(Mer) PHIBP=PHIBEST WP=FOM -
HLA=A HLB=B HLC=C HLD=D -
FC=FC PHIC=AC
END
END-sigmaa
```

## NOTES

### Phase combination

The phase combination method used in sigmaa depends on the Hendrickson and Lattman (1970) formulation of the phase probability profile for a phase Alpha:

```
P(Alpha) = exp(A cosAlpha + B sinAlpha + C cos2Alpha + D sin2Alpha)
```

A, B, C, D are known as the phase coefficients. Phase information from different sources can be combined by a simple addition of the phase coefficients from each determination. The application of a weighting scheme proposed by Sim (1959) allows for the inclusion of phase information determined from a partial structure.

The principles of the method and details of the original phase combination program are described by Bricogne (1976).
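Combination by addition of phase coefficients can be sketched as follows. The centroid phase and figure of merit are obtained here by numerically integrating the Hendrickson-Lattman profile on a regular grid; the function names and grid size are choices made for this illustration, not the program's implementation.

```python
import math


def combine_hl(*sets):
    """Sum Hendrickson-Lattman coefficients (A, B, C, D) from several sources."""
    return tuple(sum(column) for column in zip(*sets))


def centroid_phase_and_fom(hl, steps=720):
    """Return (centroid phase in degrees, figure of merit) for one profile."""
    a, b, c, d = hl
    angles = [2.0 * math.pi * k / steps for k in range(steps)]
    exponents = [
        a * math.cos(t) + b * math.sin(t) + c * math.cos(2 * t) + d * math.sin(2 * t)
        for t in angles
    ]
    shift = max(exponents)  # keeps exp() within range for sharp profiles
    weights = [math.exp(e - shift) for e in exponents]
    norm = sum(weights)
    x = sum(w * math.cos(t) for w, t in zip(weights, angles)) / norm
    y = sum(w * math.sin(t) for w, t in zip(weights, angles)) / norm
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)
```

Adding two identical unimodal profiles leaves the centroid phase unchanged but raises the figure of merit, which is the expected effect of two independent sources agreeing.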

### Model bias on combination

It is assumed that the coefficients giving least bias vary as a linear function of partial structure influence. The variation of information is the parameter used to measure the contribution of each partial structure to the combined phase probability profile; and this is normalised to give partial structure weights w. These are tabulated as a function of resolution in the log file. If there are p partial structures, the modified map coefficients are given by

```
[2mFo - sum_over_p(wDFc)] / [2 - sum_over_p(w)]
```
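The behaviour of this expression at its limits is easy to check with a small sketch. The function name is hypothetical, and the inputs (the combined m, |Fo|, the normalised weights w and the D|Fc| values per partial structure) are assumed to be already available.

```python
def combined_map_amplitude(m, fo, weights, d_fc):
    """Evaluate [2mFo - sum(w * DFc)] / [2 - sum(w)] for one reflection.

    weights : normalised partial structure weights w, one per structure
    d_fc    : D*|Fc| for each partial structure, in the same order
    """
    total_weight = sum(weights)
    numerator = 2.0 * m * fo - sum(w * dfc for w, dfc in zip(weights, d_fc))
    return numerator / (2.0 - total_weight)
```

With a single partial structure and w = 1 the result reduces to 2mFo - DFc, and with w = 0 (no partial structure influence) it reduces to mFo.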

## REFERENCES

1. Read, R.J.: Acta Cryst. A42 (1986) 140-149.
2. Srinivasan, R.: Acta Cryst. 20 (1966) 143-144.
3. Hauptman, H.: Acta Cryst. A38 (1982) 289-294.
4. Luzzati, V.: Acta Cryst. 6 (1953) 142-152.
5. Rogers, D. in Computing Methods in Crystallography (Rollett, J.S., ed.) (1965) pp. 126-127, Pergamon Press.
6. Hendrickson, W.A. & Lattman, E.E.: Acta Cryst. B26 (1970) 136-143.
7. Bricogne, G.: Acta Cryst. A32 (1976) 832-847.
8. Sim, G.A.: Acta Cryst. 12 (1959) 813-815; 13 (1960) 511-512.
9. Read, R.J.: Acta Cryst. A46 (1990) 140-149.
10. Read, R.J.: Acta Cryst. A46 (1990) 900-912.
11. Vellieux, F.M.D., Livnah, O., Dym, O., Read, R.J. & Sussman, J.L., manuscript in preparation.