SORTWATER (CCP4: Supported Program)


sortwater - sort waters by the protein chain to which they "belong"


sortwater xyzin input.brk xyzout output.brk
[Keyworded input]


This is a program to sort waters by the protein chain to which they "belong", in the case of a protein with several equivalent subunits related by non-crystallographic symmetry (and possibly crystallographic symmetry as well). The program reads a coordinate file (Brookhaven format) containing several protein chains with different chain identifier characters, and water molecules. Each water is allocated to the protein chain which has the nearest [non-carbon] atom, using crystallographic symmetry if necessary, consistent with non-crystallographic symmetry. Waters may be reallocated to different protein chains so that two waters related by non-crystallographic symmetry are not in the same chain. The waters are written out to a file (XYZOUT) in the same format as the input file. Water atoms related by non-crystallographic symmetry will be given the same residue number, but different chain identifiers. Duplicate waters (after applying symmetry operators) are removed.

Note that there will always be ambiguities in waters close to subunit interfaces, so the program may make mistakes.
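The allocation rule described above can be illustrated with a minimal Python sketch. It is not the program's actual code: it ignores crystallographic and non-crystallographic symmetry entirely, and the function name `nearest_chain` and the tuple layout of the atom list are assumptions made for illustration. The 6.0 cutoff mirrors the documented default of the DISTANCE keyword.

```python
import math

def nearest_chain(water, atoms, max_dist=6.0):
    """Return the chain ID of the closest non-carbon atom, or None.

    water -- (x, y, z) of the water oxygen
    atoms -- iterable of (chain_id, element, (x, y, z)) for non-water atoms
    max_dist -- waters further than this from every atom stay unallocated
    """
    best_chain, best_dist = None, max_dist
    for chain_id, element, xyz in atoms:
        if element == "C":  # skip carbons, as with "CARBON No"
            continue
        d = math.dist(water, xyz)
        if d <= best_dist:
            best_chain, best_dist = chain_id, d
    return best_chain

# Hypothetical coordinates, chosen only to exercise the function.
atoms = [
    ("A", "N", (0.0, 0.0, 0.0)),
    ("A", "C", (1.0, 0.0, 0.0)),
    ("B", "O", (10.0, 0.0, 0.0)),
]
print(nearest_chain((3.0, 0.0, 0.0), atoms))   # closest non-carbon atom is in chain A
print(nearest_chain((8.0, 0.0, 0.0), atoms))   # closest non-carbon atom is in chain B
print(nearest_chain((5.0, 20.0, 0.0), atoms))  # beyond the cutoff from every atom
```

A water that is nearly equidistant from atoms in two chains shows why interface waters are ambiguous: a small coordinate shift changes the answer.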


The allowed keywords are:

CHAINS <protein_chain_names>

Define all chain IDs of "protein" (i.e. non-water) chains.

WCHAINS <water_chain_names>

Define chain names for water chains to correspond to "protein" chains in output file (irrespective of input water chain names). There must be the same number of water chains defined as "protein" chains, but the same water chain may be assigned to more than one protein chain, provided that they are not related by non-crystallographic symmetry.

SYMMETRY <space_group_name>|<space_group_number>|symmetry

Define crystallographic symmetry.

WATER <water_residue_name> [<water_atom_name>]

Residue name for waters [default HOH], and atom name [default O].

CARBON ["Yes"|"No"]

No: store only non-carbon non-water atoms for contact checking (.true.) [default]
Yes: store all atoms (.false.).

DISTANCE [<maximum_similarity_distance>] [<maximum_distance_from_protein>]

Maximum distance between putative NCS-related waters for them to be accepted as equivalent [default 2.0].
Maximum distance from a non-water atom for a water to be accepted as belonging to that chain [default 6.0].
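As a rough illustration of the first distance, the sketch below maps a water through an NCS operator and looks for a partner water within the similarity cutoff. This is an assumed reading of the test, not the program's source: the function names, the operator representation (3x3 rotation as nested lists plus a translation), and the coordinates are all hypothetical.

```python
import math

def apply_operator(rot, trans, xyz):
    """Apply x' = R x + t to a single point."""
    return tuple(
        sum(rot[i][j] * xyz[j] for j in range(3)) + trans[i]
        for i in range(3)
    )

def ncs_partner(water, rot, trans, candidates, max_similarity=2.0):
    """Return the index of the candidate water closest to the transformed
    water, or None if no candidate lies within max_similarity."""
    image = apply_operator(rot, trans, water)
    best_index, best_dist = None, max_similarity
    for index, other in enumerate(candidates):
        d = math.dist(image, other)
        if d <= best_dist:
            best_index, best_dist = index, d
    return best_index

# Hypothetical operator: 180 degree rotation about z plus a shift along x.
rot = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
trans = (10, 0, 0)
water_a = (2.0, 1.0, 3.0)          # maps to (8.0, -1.0, 3.0)
waters_b = [(8.5, -1.0, 3.0), (0.0, 0.0, 0.0)]
print(ncs_partner(water_a, rot, trans, waters_b))
```

A smaller cutoff rejects more true pairs when the NCS is imperfect; a larger one risks pairing unrelated waters.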

NCS <Chain1> <Chain2> [ "ODB" <O_operator_filename> | MATRIX <r11> <r12> <r13> <r21> <r22> <r23> <r31> <r32> <r33> <t1> <t2> <t3> | IDENTITY ] [ SAME <Chain3> <Chain4> ]

Define NCS operator to transform chain with ID "Chain1" to "Chain2".

Operators may be given as the filename of an O data block, or as 12 numbers following the keyword MATRIX (note the ODB file contains the transposed matrix).
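The transposition note matters when copying numbers by hand from an O data block onto a MATRIX line. The sketch below shows only the reordering. It assumes the twelve numbers have already been read from the file, with the nine rotation values first (in file order) and the three translation values last; that layout and the function name `odb_to_matrix_order` are assumptions for illustration, so check them against an actual file.

```python
def odb_to_matrix_order(values):
    """Reorder 12 numbers from the assumed O data block order into
    the r11 r12 r13 r21 ... r33 t1 t2 t3 order of the MATRIX keyword.

    The rotation part is transposed; the translation is left unchanged.
    """
    if len(values) != 12:
        raise ValueError("expected 9 rotation and 3 translation values")
    stored = [values[0:3], values[3:6], values[6:9]]  # rows as stored in the file
    transposed = [[stored[j][i] for j in range(3)] for i in range(3)]
    flat_rotation = [x for row in transposed for x in row]
    return flat_rotation + list(values[9:12])

# Placeholder numbers 1..12 make the reordering easy to follow.
print(odb_to_matrix_order([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]))
```

For a rotation matrix the transpose is also the inverse, so getting this step wrong yields the operator for the opposite direction rather than an obviously invalid matrix.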

The keyword SAME defines the transformation from "Chain3" to "Chain4" as being the same as that for "Chain1" to "Chain2". This may be put at the end of a line defining an operator.

Implied operators will be generated automatically (e.g. B->A from A->B, and A->C from A->B & B->C).
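The arithmetic behind implied operators can be sketched as follows, treating each operator as a rotation R and translation t acting as x' = R x + t. This is standard rigid-body algebra rather than a transcription of the program; the function names and example operators are hypothetical, and the inverse assumes R is a pure rotation (so its inverse is its transpose).

```python
def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def apply(op, xyz):
    rot, trans = op
    return [r + t for r, t in zip(matvec(rot, xyz), trans)]

def invert(op):
    """B->A from A->B: x = R^T x' - R^T t (R assumed orthogonal)."""
    rot, trans = op
    rot_inv = transpose(rot)
    return rot_inv, [-x for x in matvec(rot_inv, trans)]

def compose(first, second):
    """A->C from A->B (first) and B->C (second): R = R2 R1, t = R2 t1 + t2."""
    r1, t1 = first
    r2, t2 = second
    return matmul(r2, r1), [a + b for a, b in zip(matvec(r2, t1), t2)]

# Hypothetical operators: 90 degree rotation about z with a shift, then a pure shift.
a_to_b = ([[0, -1, 0], [1, 0, 0], [0, 0, 1]], [1, 2, 3])
b_to_c = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0, 0, 5])

point_a = [1, 0, 0]
point_b = apply(a_to_b, point_a)
print(point_b)                                    # image of the point in chain B's frame
print(apply(invert(a_to_b), point_b))             # back to the starting point
print(apply(compose(a_to_b, b_to_c), point_a))    # same as applying A->B, then B->C
```

Because of this, giving operators from one reference chain to each of the others is sufficient to relate every pair of chains.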


VERBOSE

Set verbose printing flag.


A very simple runnable Unix example script can be found in $CEXAM/unix/runnable/. A more involved, non-runnable Unix example script is in $CEXAM/unix/non-runnable/.


Phil Evans, MRC LMB, January 1995