User Guide
This guide is intended to be used as a reference manual. You may also want to follow the simple steps described in the tutorials, which give usage examples of the most important utilities. More documentation is available on the Methodology page and in the published articles.
Content:

colacor - Cross-Correlation Calculation and Refinement of Manual Docking
colores - FFT-Accelerated 6D Exhaustive Search
eul2pdb - Graphical Representation of Euler Angles
map2map - Format Conversion
pdb2sax - Create a Simulated SAXS Bead Model from a PDB
pdb2vol - Create a Volumetric Map from a PDB
pdbsymm - Symmetry Builder
qdock - Rigid-Body Docking of High- and Low-Resolution Data
qpdb - Vector Quantization of a PDB
qrange - Automatic Vector Quantization and Rigid-Body Docking
qvol - Vector Quantization of Volumetric Map
vol2pdb - Create a PDB from a Volumetric Map
volcube - Creating Isocontour Surfaces
voldiff - Discrepancy / Difference Mapping
voledit - Inspecting 2D Cross Sections of Density Values
volhist - Inspecting and Shifting the Voxel Histogram
volvoxl - Grid Interpolation
Header File and Library Routines

colacor - Cross-Correlation Calculation and Refinement of Manual Docking

Purpose:

colacor = Combined Off-LAttice COrrelation and Refinement. Since colores centers maps and structures automatically, a special tool is needed to compute the cross-correlation coefficient between a density map and a (resolution-lowered) atomic structure at a given, user-defined geometry. As an additional option, colacor also performs a single run of off-lattice Powell optimization that refines a manually docked (rigid-body) fit to the nearest maximum of the cross-correlation.

Basic usage (at shell prompt):
 

./colacor <Situs density map> <PDB structure> -res <number> -cutoff <number>

The basic input parameters are:

<Situs density map> Low resolution map in Situs format.  To convert your EM map to Situs format use the map2map utility.

<PDB structure> Atomic structure in PDB format.

-res <value> Estimated resolution of the density map in Å.  [default -res 15.0]

-cutoff  <value> Density map threshold value. All density levels below this value are set to zero. You can use volhist to rescale the density levels or to shift the background peak in the voxel histogram to the origin.  [default -cutoff 0.0]

-corr   <number>  Two options are implemented:

-corr 0 Standard linear cross-correlation.  The scalar product between the densities of the low resolution map and the low-pass filtered atomic structure. For resolution values > 10 Å (i.e. lower resolution) this criterion is less discriminative. Note that this option needs to be turned on explicitly.

-corr 1 [default] A Laplacian filter is applied by default to maximize the fitting contrast. This is the recommended docking criterion for low resolution docking (up to 25Å resolution). To provide for a more robust algorithm when dealing with cropped or thresholded experimental data we implemented a mask that filters out hard artificial surfaces.  Due to the masking and filtering expect overall smaller correlation values compared to -corr 0.
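
As a rough illustration of the -corr 0 criterion, the following minimal sketch thresholds a map at a cutoff and computes a normalized scalar product of two flattened density arrays. The function names, the normalization by the vector norms, and the sample values are assumptions made for this example; the Laplacian filtering and masking of -corr 1 are not reproduced here.

```python
import math

def threshold(density, cutoff):
    """Set all density values below the cutoff to zero (as described for -cutoff)."""
    return [v if v >= cutoff else 0.0 for v in density]

def correlation(map_a, map_b):
    """Normalized scalar product of two equally sized, flattened density maps.

    Assumption: normalization by the Euclidean norms of both maps.
    """
    if len(map_a) != len(map_b):
        raise ValueError("maps must have the same number of voxels")
    dot = sum(a * b for a, b in zip(map_a, map_b))
    norm = math.sqrt(sum(a * a for a in map_a) * sum(b * b for b in map_b))
    return dot / norm if norm > 0.0 else 0.0

# Hypothetical one-dimensional "maps" used only for illustration.
em_map = threshold([0.1, 0.9, 2.0, 1.1, 0.2], cutoff=0.5)
model = [0.0, 1.0, 2.0, 1.0, 0.0]
cc = correlation(em_map, model)
```

Identical maps give a coefficient of 1 in this sketch, and any density not matched by the model lowers the value.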
More advanced options (at shell prompt):

-ani <value>   Defines the resolution anisotropy factor (z direction vs. x,y plane)  [default: -ani 1.0]. Allows one to set a different resolution in the z direction vs. the x,y plane. E.g. "-ani 1.5 -res 20" specifies a 30 Å resolution in the z direction, and a 20 Å resolution in x,y. This is useful for researchers dealing with membrane protein or tomography reconstructions that have a reduced resolution in the z direction.
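
The arithmetic behind the -ani example can be written out as a short sketch. The function name is hypothetical; the only relation used is the one stated above, namely that the z resolution is the anisotropy factor times the x,y resolution.

```python
def anisotropic_resolution(res, ani=1.0):
    """Return (xy_resolution, z_resolution) in Angstrom for given -res and -ani values."""
    return res, ani * res

# The example from the text: "-ani 1.5 -res 20"
xy_res, z_res = anisotropic_resolution(res=20.0, ani=1.5)
```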

-nopowell This flag skips the Powell optimization. Only the cross-correlation coefficient is computed. By default the Powell optimization is turned on.

-pwti <value number> Powell tolerance and max number of iterations of the Powell algorithm. These two parameters control the convergence of the optimization. By default the tolerance is set to 1e-6 and the max iterations are limited to 25.

-pwdr <value value> Initial gradient of the translational and rotational search in the Powell optimization. By default the initial rotational gradient is set to 3.5 degrees. The rotational gradient cannot be larger than 10 degrees. If a larger value is chosen, that value is ignored and the gradient is set to 10 degrees. The translational initial gradient is set to 25% of the voxel spacing. To use the default value only for the rotational or only for the translational gradient, choose a negative number for the parameter that must be left at default, and your chosen value for the other.

-pwcorr <number> This option sets the Powell correlation algorithm. By default, the fastest algorithm which reproduces the standard cross correlation coefficient to within the Powell tolerance is determined at runtime.

-pwcorr 0                     Determined at runtime [default]
-pwcorr 1                     Standard three-step code
-pwcorr 2                     Three-step code with mask applied
-pwcorr 3                     One-step code for small probe structures

Input at program prompt:
None.

Output:
(Shell window:) The cross-correlation coefficient and other useful information about the structure and map used. Depending on the overlap of structure and map, a good fit with the -corr 1 setting (default Laplacian filter) may have values up to 0.5, and with -corr 0 (standard correlation) values up to 0.9. These values are smaller if the structure does not account for the entire map density.

colores - FFT-Accelerated 6D Exhaustive Search

Purpose:

colores = COrrelation based LOw RESolution docking. A general purpose, multi-processor capable rigid-body search tool, suitable for situations where not all density is accounted for by the atomic structure. The translational search is FFT-accelerated and the program supports the use of a Laplacian-style filter (can be turned off) that drastically increases the fitting contrast at medium to low resolution (for more info see the corresponding methods page). This tool performs a full exhaustive search in 6D search space (3 translational and 3 rotational degrees of freedom) within a couple of minutes of compute time, depending on system size and numbers of processors used.

Basic usage (at shell prompt):
 

./colores <Situs density map> <PDB structure> -res <number> -cutoff <number> -deg <number> -nprocs <number>

The basic input parameters are:

<Situs density map> Low resolution map in Situs format.  To convert your EM map to Situs format use the map2map utility.

<PDB structure> Atomic structure in PDB format.

-nprocs <number> This option sets the number of processors used for the on-lattice 6D search and the off-lattice Powell optimization. Colores supports systems with multiple, dual-core and/or hyperthreaded processors. For example, on a dual processor Intel Xeon machine with Hyperthreading, '-nprocs 4' produces the shortest runtimes (two processors times two threads). The same holds for a dual-core Pentium4 with Hyperthreading. On the other hand, '-nprocs 2' is the optimal value for a dual processor AMD Opteron system. Note that on some architectures the threads may not show in the UNIX "top" window. If in doubt, do your own time comparison. [default -nprocs 1]

-res <value> Estimated resolution of your density map in Å.  [default -res 15.0]

-cutoff  <value> Density map cutoff value. All density levels below this value are set to zero. You can use volhist to rescale the density levels or to shift the background peak in the voxel histogram to the origin.  [default -cutoff 0.0]

-deg  <value> Angular sampling of the rotational search space in degrees. For typical electron microscopy maps the angular step size should be between 6° and 20°.  [default -deg 15.0]

-corr   <number>  This option controls the fitting criterion. Two options are implemented:

-corr 0 Standard linear cross-correlation.  The scalar product between the densities of the low resolution map and the low-pass filtered atomic structure. For resolution values > 10 Å (i.e. lower resolution) this criterion is less discriminative. Note that this option needs to be turned on explicitly.

-corr 1 [default] A Laplacian filter is applied by default to maximize the fitting contrast. This is the recommended docking criterion for low resolution docking (up to 25 Å resolution). To provide for a more robust algorithm when dealing with cropped or thresholded experimental data we implemented a mask that filters out hard artificial surfaces.  Due to the masking and filtering expect overall smaller correlation values compared to -corr 0.

More advanced options (at shell prompt):

-ani <value>   Defines the resolution anisotropy factor (z direction vs. x,y plane)  [default: -ani 1.0]. Allows one to set a different resolution in the z direction vs. the x,y plane. E.g. "-ani 1.5 -res 20" specifies a 30 Å resolution in the z direction, and a 20 Å resolution in x,y. This is useful for researchers dealing with membrane protein or tomography reconstructions that have a reduced resolution in the z direction.

-erang <value value value value value value>   Defines the rotational space limits according to the range of the Euler angles (psi, theta, phi). By default the entire rotational space is considered    [default: -erang 0 360 0 180 0 360]. Note that the Euler angle range is not limited to these standard intervals, so you can specify also negative values (within certain limits), but in any event any colores output angles are remapped to the standard intervals. For example, if you want to perform a fine search of 2° angular sampling in only one of the Euler angles, these are the options:

./colores  em.sit atoms.pdb -res 9.0 -cutoff 1.0  -deg 2 -erang 0 360 0 0 0 0

-euler <number>  There are three ways to generate an exhaustive list of Euler angles that covers (nearly) uniformly the rotational space for a given angular sampling (option -deg).  The proportional method yields very even results and also performs well for smaller intervals specified via '-erang'. The pole sparsing method is widely used but yields slightly less uniform distributions. The spiral method also produces a less uniform distribution, but for medium to low resolution docking it is quite reasonable. The Euler angles are saved to a file col_eulers.dat, and you can edit this file and reload it using the -euler 3 option. This way, you can also load any manually generated Euler angle files. 

-euler 0                     Proportional method [default]
-euler 1                     Pole sparsing method
-euler 2                     Spiral method
-euler 3  <filename>   Input file
Here is an example generating the Euler angle distribution with 8° angular sampling using the spiral method: 
 
./colores  em.sit  atoms.pdb -res 9.0 -cutoff 1.0  -deg 8 -euler 2

-peak <number> This option sets the peak search algorithm. By default, an advanced filtering-based approach is used. The older, sorting-based algorithm is available as an option.

-peak 0                     Original peak search via sort
-peak 1                     New peak search via filtering [default]

-explor <number> Controls the number of the best fits found in the 6D on-lattice search to be subsequently refined by Powell optimization.  By default an off-lattice optimization is applied to the first 10 best on-lattice docking results. [default -explor 10]

-sizef <value> To prevent the atomic structure from being placed outside the target map during rotation, the low resolution map is enlarged by a margin of width sizef times the maximum radius of the atomic structure.  Ideally, sizef should be set to 1.0, but in practice a value as low as 0.2 seems to work fine and saves compute time. [default -sizef 0.2]

-nopowell This flag skips the Powell optimization and only the on-lattice search is performed. By default the Powell optimization is turned on.

-pwti <value number> Powell tolerance and max number of iterations of the Powell algorithm. These two parameters control the convergence of the optimization. By default, the tolerance is set to 1e-6 and the max iterations are limited to 25.

-pwdr <value value> Initial gradient of the translational and rotational search in the Powell optimization. By default the initial rotational gradient is set to 25% of the angular sampling (but not larger than 10°), and the translational initial gradient is set to 25% of the voxel spacing. To use the default value only for the rotational or only for the translational gradient, choose a negative number for the parameter that must be left at default, and your chosen value for the other.
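
The default rule described for -pwdr can be sketched as follows. The function name and argument order (translational value first, rotational value second) are assumptions for this illustration; the rule itself (negative value means "use the default", 25% of the angular sampling capped at 10°, 25% of the voxel spacing) is taken from the description above.

```python
def initial_powell_steps(deg, voxel_spacing, trans=-1.0, rot=-1.0):
    """Resolve the initial translational and rotational values for a colores run.

    A negative argument selects the documented default for that parameter.
    """
    if trans < 0.0:
        trans = 0.25 * voxel_spacing   # default: 25% of the voxel spacing
    if rot < 0.0:
        rot = min(0.25 * deg, 10.0)    # default: 25% of -deg, at most 10 degrees
    return trans, rot

# Defaults for -deg 15 and a hypothetical voxel spacing of 4 Angstrom.
default_steps = initial_powell_steps(deg=15.0, voxel_spacing=4.0)
# Custom translational value, rotational value left at its default.
mixed_steps = initial_powell_steps(deg=15.0, voxel_spacing=4.0, trans=2.0, rot=-1.0)
```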

-pwcorr <number> This option sets the Powell correlation algorithm. By default, the fastest algorithm which reproduces the standard cross correlation coefficient to within the Powell tolerance is determined at runtime.

-pwcorr 0                     Determined at runtime [default]
-pwcorr 1                     Standard three-step code
-pwcorr 2                     Three-step code with mask applied
-pwcorr 3                     One-step code for small probe structures

-nopeaksharp This flag skips the peak sharpness estimation procedure in order to save processing time. By default, the peak sharpness estimation is turned on.

Input at program prompt:

None.

Output:

Here is a brief description of the output files (see also the file headers):

col_best*.pdb The atomic coordinates in PDB format of the best fits found in the search. The total number of best fits saved is controlled by the option -explor, but only non-degenerate fits are returned, so the number may be smaller than specified by the -explor option. The PDB header contains information about the docking (sampling, fit criteria used, correlation values, position and orientation etc.). It also includes a table containing the angular variability of the correlation about the fit.

col_rotate.log This file contains the best translational fit (on-lattice) found for each rotation. The first 3 columns are the Euler angles (in degrees), the next 3 columns are the translational coordinates that gave the highest correlation value, followed by the correlation value (not normalized).
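
A minimal sketch of reading such a file, assuming whitespace-separated numeric columns in the order described above and skipping any line that does not parse as seven numbers (the exact file layout, including possible header lines, is an assumption; check the file header of your own output). The sample rows are made up for illustration.

```python
def parse_rotate_log(text):
    """Parse rows of: 3 Euler angles (deg), 3 translations, 1 correlation value."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        try:
            values = [float(f) for f in fields[:7]]
        except ValueError:
            continue  # skip header or comment lines
        rows.append({
            "euler_deg": tuple(values[0:3]),
            "translation": tuple(values[3:6]),
            "correlation": values[6],
        })
    return rows

def best_rotation(rows):
    """Return the row with the highest (unnormalized) correlation value."""
    return max(rows, key=lambda row: row["correlation"])

# Hypothetical file content, not actual colores output.
sample = """# psi theta phi x y z corr
0.0 0.0 0.0 10.0 12.0 8.0 1500.0
30.0 15.0 0.0 11.0 12.0 8.0 2100.5
60.0 15.0 0.0 10.0 13.0 9.0 1800.0
"""
rows = parse_rotate_log(sample)
best = best_rotation(rows)
```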

col_powell.log This file contains information about the Powell off-lattice search performed for the best fits from the 6D lattice search.  As before, rotational and translational coordinates correspond to the first 6 columns, but note that the Euler angles are in radians.

col_trans.sit  The on-lattice translation function in Situs format. Since the translational search space corresponds to the input map lattice, we can generate a map in which density values are the correlation values normalized by the maximum. This is a Situs format map, so you can use VMD to display it or use map2map to convert to other formats.

col_trans.log  Same as col_trans.sit, but instead of a map, the translational correlation values are stored in a regular file. Each row that corresponds to a lattice coordinate (columns 4,5,6) shows the corresponding Euler angles (columns 1,2,3) in degrees that exhibit the highest correlation value (column 7). 

col_eulers.dat This file contains the list of uniform Euler angle triplets that defines the rotational space search. You can load such a file by using the option -euler 3. You can also inspect this file with the eul2pdb tool.

Notes:

  • Depending on the overlap of structure and map, a good fit with the -corr 1 setting (default Laplacian filter) may have correlation values up to 0.5, and with -corr 0 (standard correlation) values up to 0.9. These correlation values are smaller if the structure does not account for the entire map density.
  • The time estimation gives a fairly accurate estimate of the on-lattice 6D search time; however, the subsequent off-lattice Powell optimization is not considered in the estimation and depends on the -explor number. You can balance the precision of the search (e.g. angular sampling, option -deg) with the compute expense. Remember that for medium to low resolution maps angular sampling steps below 6° are not particularly useful.
  • You can also save time if you use only the carbon alpha or backbone atoms of the input structure. Typically, at low resolution the docking does not depend sensitively on the level of detail in the atomic structure. Take into account that the exhaustive 6D search scales approximately as N log N with the number of voxels N in your map.
  • If you have a large map you can try to crop the data to a region of interest (e.g. the asymmetric unit). Allow for sufficient room for the probe structure since you do not know its exact location a priori.
  • This is a rigid body search. If you expect large, induced fit conformational changes, you can dissect your atomic structures in rigid-body domains and perform the docking with each of them individually. Alternatively, you may want to try our flexible fitting strategies.

eul2pdb - Graphical Representation of Euler Angles

Purpose:

The eul2pdb utility is used to generate a graphical representation of a set of Euler angles resulting from a colores run.  The eul2pdb program writes a pseudo-atomic structure to a PDB-formatted file in which the set of Euler angles is represented as a set of points on a sphere of 10 Å radius. This file can then be inspected with a visualization program (for example VMD).

Usage (at shell prompt):
 

./eul2pdb col_eulers.dat out_file
out_file: output file, PDB format

Input at program prompt:

None.

Output:

Pseudo-atomic structure in PDB format. Each triplet of Euler angles in the input file is represented as a point on a sphere of 10 Å radius. The phi Euler angle (rotation in the projection plane) is encoded in the B-factor column of the PDB file (in radians), whereas theta and psi correspond to longitude and latitude on the sphere.

map2map - Format Conversion

Former names: convert, conformat

Purpose:

Volumetric density data is converted to a Situs-specific format on a cubic lattice. This allows Situs programs to keep track of coordinate systems and it makes the core Situs programs independent of the ever changing map format standards. 

The map2map utility reads many file formats used by standard EM application software. These include the MRC, SPIDER, and CCP4 formats, as well as similar 4-byte floating-point binary formats (automatic byte-order adjustment). X-PLOR maps in ASCII format, and ASCII files that contain a sequence of density values in free format are also recognized. The reverse conversion of Situs format files to CCP4, MRC, X-PLOR, and SPIDER format is also supported to a limited extent, to facilitate the visualization of Situs-generated maps.

Usage (at shell prompt):
 

./map2map file1 file2

   file1: inputfile
   file2: outputfile

Interactive input at program prompt (for automation see below):

  • Input file format.
  • Number of x, y, and z increments (columns, rows, and sections).
  • Voxel size (grid spacing).
  • Order of x, y, and z increments in the input file.

Output:

Density file in selected format. The order of values in the sequence of densities is altered, if necessary, such that x increments change fastest and z increments change slowest. In Situs format, a short header holds the voxel size and numbers of x, y, and z increments, as well as the 3D coordinates of the first voxel (x=1, y=1, z=1). The default origin of the map coordinate system in Situs is the origin of the unit cell (if input file is in X-PLOR, MRC and CCP4 formats) or voxel (1,1,1) (all other input formats). The Situs header is followed by the sequence of data values. The converted Situs files are in ASCII format, allowing the user to verify the successful conversion of the data. If a SPIDER, MRC, CCP4, or X-PLOR map is created, the unit cell spans the space from the coordinate system origin to the maximum extent of the input Situs map. 
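
The ordering convention (x increments change fastest, z increments change slowest) can be illustrated with a small sketch. The function names and the 0-based indices are choices made for this example, and the reordering shown handles only one hypothetical input order (z fastest, x slowest); it does not read or write any actual map format.

```python
def flat_index(ix, iy, iz, nx, ny):
    """Position of voxel (ix, iy, iz), 0-based, when x changes fastest and z slowest."""
    return ix + nx * (iy + ny * iz)

def reorder_to_x_fastest(values, nx, ny, nz):
    """Reorder densities stored with z fastest and x slowest into x-fastest order."""
    out = []
    for iz in range(nz):
        for iy in range(ny):
            for ix in range(nx):
                out.append(values[iz + nz * (iy + ny * ix)])
    return out

# Hypothetical 2 x 1 x 2 grid stored z-fastest: (x0,z0), (x0,z1), (x1,z0), (x1,z1)
z_fastest = ["a", "b", "c", "d"]
x_fastest = reorder_to_x_fastest(z_fastest, nx=2, ny=1, nz=2)
```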

Notes and Known Bugs:

  • You can automate this program by "overloading" the standard input if you put expected values in a script!
  • You should always check out the free conversion programs MAPMAN, part of Gerard Kleywegt's RAVE package, and especially Image Science's em2em program.
  • The MRC and CCP4 map formats are not fully supported as many flavors are available. If you encounter problems, we recommend that you use the above third party conversion tools. It is generally safe to use the SPIDER format which is fully supported by map2map, so if in doubt convert to SPIDER first. Known problems: Some users have reported a mixup of axes / scrambling of densities with some particular maps. Very old CCP4 maps may correspond to the format currently denoted "MRC", whereas future MRC maps may correspond to the current "CCP4" format. For a history and future plans of this development read Stephen Fuller's page at EMBL.
  • Avoid round-trip conversions "EM map format" -> Situs -> "EM map format", because information about the unit cell extent beyond that of the actual range of data is lost. Also, Situs lattices are cubic, so any skewed unit cell parameters (alpha, beta, gamma, etc) of the original map are not saved.
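
The "overloading" of standard input mentioned in the notes can be sketched in Python as follows. The helper names are hypothetical, and the answers listed are placeholders only: the actual prompts, their order, and the accepted values depend on the input file and must be taken from an interactive run of the program.

```python
import subprocess

def build_stdin(answers):
    """Join the expected prompt answers into one newline-terminated input text."""
    return "".join(f"{answer}\n" for answer in answers)

def run_interactive(command, answers):
    """Run a prompt-driven program, feeding the prepared answers on standard input."""
    return subprocess.run(
        command,
        input=build_stdin(answers),
        text=True,
        capture_output=True,
        check=False,
    )

# Placeholder answers (hypothetical values, one per expected prompt).
answers = [2, 64, 64, 64, 2.5, 1]
script = build_stdin(answers)

# Example call, left commented out because it needs the map2map binary and real answers:
# result = run_interactive(["./map2map", "file1", "file2"], answers)
```

The same helper applies to the other utilities in this guide that read their parameters at the program prompt.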

pdb2sax - Create a Simulated SAXS Bead Model from a PDB

Purpose:

The pdb2sax utility allows one to fill an input atomic structure with close-packed spheres on a hexagonal lattice. It allows one to create simulated bead models for validating Situs modeling applications.

Usage (at shell prompt):
 

./pdb2sax file1 file2 radius

   file1: inputfile, PDB format
   file2: outputfile (bead model), PDB format
   radius: bead radius in Angstrom

Interactive input at program prompt (also suitable for automation):

  • Choice of atom mass-weighting and B-factor cutoff level. Atoms with B-factors above the cutoff level will be ignored. You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:

PDB file that contains the centers of the simulated beads with the radii in the occupancy column. This file can then be either inspected with a visualization program (for example VMD), or processed into a bead volume map using pdb2vol.

pdb2vol - Create a Volumetric Map from a PDB

Former name: pdblur

Purpose:

The pdb2vol utility is a real-space convolution tool. It allows one to lower the resolution of an atomic structure to a user-specified value, or to create a bead model from atomic coordinates. The structure is first projected to a cubic lattice by trilinear interpolation. Subsequently, each lattice point is convoluted with one of five supported kernel (point-spread) functions.

Usage (at shell prompt):
 

./pdb2vol file1 file2

   file1: inputfile, PDB format
   file2: outputfile, Situs format

Interactive input at program prompt (also suitable for automation):

  • If water, hydrogen, or codebook vector atoms are present, choice of ignoring them.
  • Choice of atom mass-weighting and B-factor cutoff level. Atoms with B-factors above the cutoff level will be ignored.
  • Desired voxel spacing for output map.
  • Kernel width, defined by either the kernel half-max radius r-half (enter positive value) or by the target resolution of the output map (enter value of resolution as negative number). The standard deviation (sigma) of the kernel is assumed to be half the target resolution.
  • Type of smoothing kernel:
    • Gaussian, exp(-1.5 r^2 / sigma^2)
    • Triangular, max(0, 1 - 0.5 |r| / r-half)
    • Semi-Epanechnikov, max(0, 1 - 0.5 |r|^1.5 / r-half^1.5)
    • Epanechnikov, max(0, 1 - 0.5 r^2 / r-half^2)
    • Hard Sphere,  max(0, 1 - 0.5 r^60 / r-half^60)
  • Choice of correction for lattice smoothing (subtract the lattice projection mean-square deviation from the kernel variance).
  • The kernel amplitude at the kernel origin (r=0).
  • You can automate this program by "overloading" the standard input if you put expected values in a script!
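
The five kernel formulas listed above can be transcribed directly, as in the following sketch. The function names are hypothetical. The half-max radius of the Gaussian is derived here from its formula (solving exp(-1.5 r^2 / sigma^2) = 0.5), and the conversion from target resolution to sigma uses the "half the target resolution" rule stated above; neither is taken from the pdb2vol source code.

```python
import math

def gaussian(r, sigma):
    return math.exp(-1.5 * r * r / (sigma * sigma))

def triangular(r, r_half):
    return max(0.0, 1.0 - 0.5 * abs(r) / r_half)

def semi_epanechnikov(r, r_half):
    return max(0.0, 1.0 - 0.5 * abs(r) ** 1.5 / r_half ** 1.5)

def epanechnikov(r, r_half):
    return max(0.0, 1.0 - 0.5 * r * r / (r_half * r_half))

def hard_sphere(r, r_half):
    return max(0.0, 1.0 - 0.5 * (abs(r) / r_half) ** 60)

def gaussian_r_half(sigma):
    """Radius at which gaussian(r, sigma) equals 0.5 (derived from the formula)."""
    return sigma * math.sqrt(math.log(2.0) / 1.5)

def sigma_from_resolution(resolution):
    """Sigma is assumed to be half the target resolution, as stated in the text."""
    return 0.5 * resolution

# Every kernel drops to half its peak amplitude at r = r_half.
r_half = 5.0
half_values = [k(r_half, r_half) for k in
               (triangular, semi_epanechnikov, epanechnikov, hard_sphere)]
sigma = sigma_from_resolution(15.0)
gauss_half = gaussian(gaussian_r_half(sigma), sigma)
```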

Output:

Density file in Situs format. The new grid follows the coordinate system origin convention of the atomic structure and forms the smallest possible box that fully encloses the structure convoluted by the kernel.

pdbsymm - Symmetry Builder

Supersedes former program: hlxbuild

Purpose:

The pdbsymm tool generates multiple copies of the input structure according to a user-specified symmetry. Currently supported symmetry types include: C, D and helical. If an input map (optional) is also specified and the map has square cross-sections, the x- and y-position of the principal symmetry axis (for symmetries C and D) or of the helical axis (for helical symmetry) is automatically set to the geometric center of a cross-section (in the map coordinate system). If the map is cubic and D symmetry is requested, the z-position of the secondary axes is set to the geometric center of the y-z cross-section. This functionality is useful if one wants to generate assemblies from subunits that had been fitted to the input map earlier.

Usage (at shell prompt):
 

./pdbsymm file1 [file2] file3

   file1: inputfile, PDB format
   file2: (optional) inputfile for helical or symmetry axis, Situs format
   file3: outputfile, PDB format

Interactive input at program prompt (also suitable for automation):

Depending on symmetry type:

  • Helical rise per subunit (in z-direction).
  • Angular twist per subunit (sign determines handedness).
  • Desired number of subunits to be placed before file1 structure.
  • Desired number of subunits to be placed after file1 structure.
  • Order of the principal symmetry axis.
  • [If file2 is unspecified: x- and y-position of helical axis (offset from file1 coordinate system origin).]
  • z-position of secondary symmetry axes (for D symmetry - offset from file1 coordinate system origin).
  • You can automate this program by "overloading" the standard input if you put expected values in a script!
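
The geometric operations behind C and helical symmetry can be sketched for a single point as follows. This is an illustration under stated assumptions, not the pdbsymm implementation: the function names, the counterclockwise sign convention for positive angles, and the ordering of the copies are all choices made for this example, and the symmetry axis is taken parallel to z.

```python
import math

def rotate_z(point, angle_deg, axis_xy=(0.0, 0.0)):
    """Rotate a point about an axis parallel to z that passes through axis_xy."""
    x, y, z = point
    ax, ay = axis_xy
    a = math.radians(angle_deg)
    dx, dy = x - ax, y - ay
    return (ax + dx * math.cos(a) - dy * math.sin(a),
            ay + dx * math.sin(a) + dy * math.cos(a),
            z)

def cyclic_copies(point, order, axis_xy=(0.0, 0.0)):
    """C symmetry: 'order' copies related by rotations of 360/order degrees."""
    return [rotate_z(point, i * 360.0 / order, axis_xy) for i in range(order)]

def helical_copies(point, rise, twist_deg, n_before, n_after, axis_xy=(0.0, 0.0)):
    """Helical symmetry: each subunit is rotated by the twist and shifted by the rise."""
    copies = []
    for i in range(-n_before, n_after + 1):
        x, y, z = rotate_z(point, i * twist_deg, axis_xy)
        copies.append((x, y, z + i * rise))
    return copies

c4 = cyclic_copies((1.0, 0.0, 0.0), order=4)
helix = helical_copies((1.0, 0.0, 0.0), rise=5.0, twist_deg=90.0, n_before=1, n_after=1)
```

In this sketch the sign of the twist determines the handedness of the helix, in line with the prompt description above.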

Output:

Symmetry PDB file containing multiple copies of input PDB file.

Note:

file2 input maps that don't have square (x,y) cross-sections or whose helical axis is not in the geometric center of the square cross-sections must first be crop-edited with voledit.

qdock - Rigid-Body Docking of High- and Low-Resolution Data

Purpose:

Specialized tool to dock a high-resolution structure to low-resolution data using a fixed number of codebook vectors. Much of the functionality of qdock has been superseded by the combined vector quantization / docking utility qrange and by the cross-correlation based colores. qdock remains part of the distribution to support correlation-coefficient based docking. It is assumed that the user has already determined a suitable number k of codebook vectors with qrange and has created the vector files with qpdb and qvol.

Similar to qrange, qdock carries out an exhaustive search of the k! = k*(k-1)* ...*2 permutations of corresponding codebook vectors and returns a list of best least-squares fits. The limitation of qdock is that the docking can be carried out only for a fixed number k of vectors. The advantage of qdock is that the results can be ranked (if desired) by the correlation coefficient that measures the overlap of the high- and low-resolution data. A lower rms deviation (rmsd) of the least-squares fit typically corresponds to a higher correlation coefficient. However, the coefficients typically lie within a very narrow numeric range and care must be taken because fits based on the correlation coefficient alone are often ambiguous.

The user has the choice to change the origin of the map coordinate system. This option might be helpful if a graphics program has a convention for defining the map coordinate system that differs from that of the map2map utility.

Usage (at shell prompt):
 

./qdock file1 file2 file3 file4
file1: inputfile 1, PDB format, codebook vectors from qvol

file2: inputfile 2, Situs format, volumetric map corresponding to qvol vectors 

file3: inputfile 3, PDB format, codebook vectors from qpdb

file4: inputfile 4, PDB format, high-resolution structure corresponding to qpdb vectors

Interactive input at program prompt (also suitable for automation):

  • Choice of volumetric map coordinate system. 
  • Ranking by vector rmsd or by correlation coefficient (carbon alpha or full atom). 
  • Selection of a docked high-resolution structure from a list. 
  • Filename for the selected output structure. 
  • If desired, select and export additional results from the list.
  • You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:
  • (Program level:) Possible pairs of corresponding codebook vectors. The (default: 20) best least-squares fits are ranked by their rmsd (in Angstrom), or by their correlation coefficient. Permutations indicate the order of qpdb vectors fitted to qvol vectors. See the Situs manuscripts for more information. Also, the positional and orientational accuracy of the fitting is given, calculated based on the statistical vector variability.
  • (Shell level:) The superposition of the selected corresponding codebook vectors defines a rigid-body transformation which superimposes high- and low-resolution data sets. After transformation, the docked high-resolution structure is written to a file in PDB format. The file3 codebook vectors are sorted and transformed in concert with the structure; they are appended, together with the file1 vectors, to the output PDB file.

qpdb - Vector Quantization of a PDB

Purpose:

Specialized tool to perform a vector quantization of atomic resolution data. qpdb supports the correlation-coefficient based docking with qdock, and also flexible docking. To enable skeleton-based flexible docking with qvol, qpdb includes options to learn vector distances and to export the Voronoi cells generated by the vectors. 

In qpdb a small number of calculations (8 by default) are repeated with different random number seeds. The averaged codebook vectors and their statistical variability are then written to the output file.

Usage:

First, the user must determine a suitable number of codebook vectors, e.g. with qrange or by visual inspection. For rigid-body docking, a small number (3-6) is usually sufficient. qpdb employs an efficient mass-weighting scheme using "equally weighted input vectors". The program also allows one to ignore flexible or poorly defined atoms with high crystallographic B-factors. This option should only be chosen if there is an indication that parts of the protein are not visible in the low-resolution data due to disorder.

Usage (at shell prompt):
 

./qpdb file1 file2

   file1: inputfile (atomic structure), PDB format
   file2: outputfile (codebook vectors), PDB format

Interactive input at program prompt (also suitable for automation):

  • If water, hydrogen, or codebook vector atoms are present, choice of ignoring them.
  • B-factor cutoff level. Atoms with B-factors above this level will be ignored.
  • Number of codebook vectors.
  • Choice of computing the vector connectivities (neighborhood relationships) with the Competitive Hebb Rule (Wriggers et al., 1998) and writing them to a file.
  • Choice of computing the Voronoi cells and writing them to a file.
  • You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:

  • (Program level:) The sphericity, a measure between 0 and 1 that characterizes how spherical the shape of the structure (file1) is. After the vector quantization the program prints the average rms variability of the codebook vectors in Angstrom. Also given in Angstrom is the radius of gyration of the vectors. 
  • (Shell level:) Codebook vectors in PDB-formatted output file. The vector rms variabilities, representing the precision of the codebook vectors, are written to the occupancy fields of the PDB-style atom entries. The effective radii of the Voronoi cells are written to the B-factor fields of the PDB-style atom entries. (Optional) Vector connectivities can be written to a PSF file or a distance constraints file. Constraint file entries are triples of free-format values in the order <index1> <index2> <distance in Angstrom>, where the indices correspond to the order of vectors in file2, counting from 1. (Optional) The Voronoi cells can be written to a PDB file consisting of the file1 atom entries where the index of each corresponding vector is written to the B-factor field of the output file.
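
A minimal sketch of reading the distance constraints file, assuming one free-format triple <index1> <index2> <distance in Angstrom> per line as described above. The function name and the sample entries are hypothetical; the conversion from 1-based file indices to 0-based list indices is a convenience of this example.

```python
def parse_constraints(text):
    """Return a list of (index1, index2, distance) with 0-based vector indices."""
    constraints = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        index1, index2, distance = int(fields[0]), int(fields[1]), float(fields[2])
        # File indices count from 1; shift to 0-based for use with Python lists.
        constraints.append((index1 - 1, index2 - 1, distance))
    return constraints

# Hypothetical file content for illustration.
sample = """1 2 12.50
2   3   9.75

1 4 20.0
"""
constraints = parse_constraints(sample)
```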
qrange - Automatic Vector Quantization and Rigid-Body Docking

Purpose:

This program automatically superimposes atomic structures with corresponding low-resolution density. Practically all density must be accounted for by the atomic structure for this to work. If in doubt use colores. You've been warned!

qrange consolidates the functions of the older vector quantization routines qvol and qpdb and of the rigid-body docking routine qdock into a single program to enable a more user-friendly fitting. The major conceptual innovation of qrange and related programs is the discretization of the configurational search space by vector quantization. This discretization yields a list of best-scoring superpositions of the codebook vectors that represent the structural data sets. The search is computationally very efficient and can be carried out on a standard workstation within seconds.

The global optimum of vector positions is difficult to find. In qrange a small number of TRN calculations (8 by default) are repeated with different random number seeds. The averaged codebook vectors and their statistical variability are then saved. In general, a low vector variability indicates good convergence and reliable vector positions. The variability depends on the shape of the 3D input density distribution and on the number of vectors. Therefore, it is a good selection criterion for finding an optimal number k of codebook vectors.
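
The averaging step can be illustrated with a short Python sketch. It assumes that the vectors from the repeated runs have already been matched to one another, and it takes the variability to be the root-mean-square distance of the individual estimates from their mean. Both are assumptions made for illustration, and the function name is hypothetical.

```python
import math

def average_with_variability(estimates):
    """Mean position and rms distance from the mean for repeated estimates of one vector."""
    n = len(estimates)
    mean = tuple(sum(p[i] for p in estimates) / n for i in range(3))
    rms = math.sqrt(sum(math.dist(p, mean) ** 2 for p in estimates) / n)
    return mean, rms

# Eight hypothetical estimates of the same codebook vector (in Angstrom),
# one per random number seed.
runs = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
        (0, 0, 1), (0, 0, -1), (1, 0, 0), (-1, 0, 0)]
mean, rms = average_with_variability(runs)
print(mean, rms)
```

A small rms value for every vector would correspond to the "low vector variability" described above.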

To optimally represent the input density distribution, qrange employs a multi-resolution approach and computes vectors for a range of k (3 <= k <= 9 by default). Note that for globular shapes a small number of vectors (3-4) is usually sufficient to encode the shape unambiguously. A maximum number of k=9 vectors ( = 27 degrees of freedom) provides sufficient leeway to encode even the most complex shapes.

After computing the k vectors for both high- and low-resolution data, the subsequent docking determines the six rigid-body degrees of freedom by a least-squares fit [Kabsch, 1976] of the k pairs of vectors. The corresponding vectors are not known a priori, so all k! = k*(k-1)* ...*2 possible permutations are explored. The program saves a list of best least-squares fits (for each number k), ranked by the remaining rms deviation (rmsd) after superposition of the vectors. The ranking by codebook vector rms deviation typically produces a clear prediction of the optimum docking configuration.
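
The exhaustive search over k! permutations can be sketched in Python. Note the substitution: qrange scores each permutation with a Kabsch least-squares fit, whereas this sketch uses a simpler stand-in score, the rms mismatch of the inter-vector distances, which is unchanged by rotation and translation but cannot tell mirror images apart. All names and coordinates are hypothetical.

```python
import itertools
import math

def pair_distances(points, order):
    """Distances between all pairs of points, taken in the given order."""
    return [math.dist(points[order[i]], points[order[j]])
            for i in range(len(order)) for j in range(i + 1, len(order))]

def best_permutation(high, low):
    """Try all k! orderings of `high`; return (mismatch, ordering) of the best one."""
    k = len(low)
    target = pair_distances(low, range(k))
    best = None
    for order in itertools.permutations(range(k)):
        trial = pair_distances(high, order)
        mismatch = math.sqrt(
            sum((a - b) ** 2 for a, b in zip(trial, target)) / len(target))
        if best is None or mismatch < best[0]:
            best = (mismatch, order)
    return best

high = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
# Low-resolution vectors: the same shape, rotated by 90 degrees about z,
# shifted, and listed in a different order.
shuffled = [high[i] for i in (2, 0, 3, 1)]
low = [(10 - y, 20 + x, 30 + z) for x, y, z in shuffled]

n_permutations = math.factorial(len(high))   # k! orderings to explore
mismatch, order = best_permutation(high, low)
print(n_permutations, mismatch, order)
```

For k = 4 the search visits 24 orderings and recovers the ordering (2, 0, 3, 1) used to build the test data. For the default maximum of k = 9 there are 362880 orderings, which is still a modest number to enumerate.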

Usage:

In a practical application, the user selects the number k that produces the smallest statistical vector variability. Alternatively, the user may also consider the number k that results in the smallest rmsd (the two selection criteria give the same results in most cases). For a given k the user then selects the fit with the smallest vector rmsd from a list of (default: 20) best scoring results. This strategy was validated in detail by Wriggers and Birmanns; it identifies a clear winner in most situations (for more info read the manuscript). The positional and orientational accuracy of the fitting is also estimated based on the statistical vector variability.

Usage (at shell prompt):
 

./qrange file1 file2

   file1: inputfile 1, Situs format
   file2: inputfile 2, PDB format

Interactive input at program prompt (also suitable for automation):

  • Choice of utilities to inspect the density distribution (e.g. voxel histogram).
  • Threshold (cutoff) density value.
  • Choice of volumetric map coordinate system.
  • If water molecules are present, choice of ignoring them.
  • B-factor cutoff level. Atoms with B-factors above this level will be ignored.
  • Selection of the vector number k from a list.
  • Selection of a docked high-resolution structure from a list.
  • Filename for the selected output structure.
  • If desired, select and export additional results from the lists.
  • You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:

  • (Program level:) The sphericity, a measure between 0 and 1 that characterizes how spherical the shape of the structure (file2) is. After the multiple vector quantizations the program prints the selection criteria (vector variability and vector rmsd) as a function of k. Also shown are the possible pairings of corresponding codebook vectors. The best least-squares fits are ranked by their vector rmsd (in Angstrom). Also, the correlation coefficient is given. Permutations indicate the internal order of vectors. Finally, the positional and orientational accuracy of the selected fit is given, as computed from the statistical vector variability.
  • (Shell level:) The superposition of the selected corresponding codebook vectors defines a rigid-body transformation which superimposes high- and low-resolution data sets. After transformation, the docked high-resolution structure is written to a file in PDB format. The codebook vectors are appended to this output file. QVOL atoms represent the low-resolution vectors and QPDB atoms represent the high resolution vectors. The vector rms variabilities, representing the precision of the codebook vectors, are written to the occupancy fields of the PDB-style atom entries. 

Notes:

  • qrange employs an efficient atom mass-weighting scheme using "equally weighted input vectors".
  • The user has the choice to change the origin of the map coordinate system. This option might be helpful if a graphics program has a different convention for defining the map coordinate system than map2map.
  • The program can also ignore flexible or poorly defined atoms with high crystallographic B-factors. This option should only be chosen if there is an indication that parts of the protein are not visible in the low-resolution data due to disorder.
  • Often there is no one-to-one correspondence between low-resolution data and what one would consider the physical density of the structure, so users may need to experiment with the volumetric and B-factor threshold values until the results are satisfactory. To help estimate appropriate threshold values for single molecules, qvol and qpdb print the radius of gyration of the codebook vectors, a measure of vector compactness. At the proper threshold densities the radius of gyration returned by qvol should be approximately equal to that of qpdb.
qvol - Vector Quantization of Volumetric Map

Purpose:

Specialized tool to perform a vector quantization of low-resolution, single-molecule data. For rigid-body docking, much of the functionality of qvol has been superseded by the combined vector quantization / docking utility qrange. qvol remains part of the distribution to support the correlation-coefficient based docking with qdock, and to support flexible docking. In the absence of existing vector positions, qvol carries out a global search using the TRN algorithm. If start vectors are already known, the LBG local search algorithm is used instead of TRN. LBG allows distance constraints to be added to the vector refinement, which is useful for flexible docking.

The global optimum of vector positions is difficult to find. With TRN, a small number of calculations (8 by default) are repeated with different random number seeds. The averaged codebook vectors and their statistical variability are then written to the output file. With LBG, no statistical clustering is performed. In this case it is important to specify reliable initial positions from a prior qvol run. 

Usage:

In a practical application of qvol, one should extract from the volumetric data a region of interest corresponding to a single molecule using e.g. voledit. Next, the user must determine a suitable number of codebook vectors. Only densities above a user-defined threshold value are considered by qvol to eliminate background noise in the low-resolution data. Depending on the noise, this threshold value should be at 50-80% of the level that is typically considered the "molecular surface" of the biopolymer in the low-resolution data. 

New vector positions are calculated automatically with the TRN method if no start vectors are specified. Subsequently, these vector positions can be refined in a second qvol run with the LBG method (this is done automatically if start vectors are specified). Also, any distance constraints can be read from a file or entered at the command prompt at this time.

Usage (at shell prompt):
 

./qvol file1 [file2] file3

   file1: inputfile, Situs format
   file2: inputfile, start vectors, PDB format (optional)
   file3: outputfile, PDB format

Interactive input at program prompt (also suitable for automation):

  • Choice of utilities to inspect the density distribution (e.g. voxel histogram).
  • Threshold (cutoff) density value.
  • Number of codebook vectors.
  • (If file2 is specified): Choice of entering distance constraints manually or from a file. There are two constraint file options. Constraint file entries generated e.g. with qpdb are triples of free-format values in the order <index1> <index2> <distance in Angstrom>, where the indices correspond to the order of vectors in file2, counting from 1. It is also possible to read the connectivities from a PSF file in which case the missing distances are computed from file2.
  • Choice of computing the vector connectivities (neighborhood relationships) with the Competitive Hebb Rule (Wriggers et al., 1998) and writing them to a file.
  • You can automate this program by "overloading" the standard input if you put expected values in a script!
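
The constraint file layout described above (free-format triples of <index1> <index2> <distance in Angstrom>) is simple enough to read or check with a few lines of Python. This is only a sketch of a reader for that layout, not the parser used by qvol; the function name and the sample values are hypothetical.

```python
def parse_constraints(text):
    """Parse free-format '<index1> <index2> <distance>' triples.

    Indices count from 1 and refer to the order of vectors in file2.
    Values may be separated by any whitespace, including line breaks.
    """
    tokens = text.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected a whole number of <index1> <index2> <distance> triples")
    return [(int(tokens[n]), int(tokens[n + 1]), float(tokens[n + 2]))
            for n in range(0, len(tokens), 3)]

# Hypothetical file contents.
example = """1 2 12.5
2   3   9.75
"""
constraints = parse_constraints(example)
print(constraints)
```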

Output:

  • (Program level:) Statistical analysis of the vectors and their radius of gyration, i.e. the radial rms deviation from the vector center of mass. 
  • (Shell level): Codebook vectors in a PDB-formatted output file. The vector rms variabilities, representing the precision of the codebook vectors, are written to the occupancy fields of the PDB-style atom entries. (Optional) Vector connectivities can be written to a PSF file or a distance constraints file.
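
The radius of gyration defined above (the radial rms deviation from the vector center of mass) can be reproduced for any set of codebook vectors. The Python sketch below assumes that all vectors carry equal weight; the function name and coordinates are hypothetical.

```python
import math

def radius_of_gyration(vectors):
    """Radial rms deviation of the vectors from their (equally weighted) center."""
    n = len(vectors)
    center = tuple(sum(v[i] for v in vectors) / n for i in range(3))
    return math.sqrt(sum(math.dist(v, center) ** 2 for v in vectors) / n)

# Four hypothetical codebook vectors (in Angstrom).
vectors = [(3, 0, 0), (-3, 0, 0), (0, 4, 0), (0, -4, 0)]
rg = radius_of_gyration(vectors)
print(rg)
```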

Notes:

  • Vector connectivities in PSF format can be visualized and edited as bond connections (together with the atom-style PDB entries of file2 and file3) using the molecular graphics program VMD. Simply overload the PSF file into the PDB file in the VMD 'Molecule' menu. Then under the 'Mouse' menu select 'Add/Remove Bonds'. The edited connectivity can then be saved later into a PSF file from the VMD command console (assuming your molecule is 'top'):

    set sel [atomselect top all]
    $sel writepsf my.psf


  • If there are cluster size deviations from the expected value (default: 8) when using the TRN algorithm, refine the vector positions found by passing them to qvol as the start vector input file (file2) of a second, LBG run.
  • Distance constraints do not determine the chirality (handedness) of vector connections. If you encounter mirror images or otherwise flipped connections after running qvol compared to connections determined with qpdb, you need to experiment with the indexing of your constraints. The LBG method combined with the SHAKE constraint algorithm is relatively insensitive to the position of start vectors.
vol2pdb - Create a PDB from a Volumetric Map 

Purpose:

The vol2pdb utility allows one to encode positive density values of a 3D map into a PDB file with the densities written to the PDB occupancy column. This is useful for colores and colacor, both of which require a PDB and a map as input parameters.

Usage (at shell prompt):
 

./vol2pdb file1 file2

   file1: inputfile 1, Situs format
   file2: outputfile, PDB format

Input at program prompt:

None.

Output:

PDB format file with densities written to the occupancy field (if rescaling is necessary, the conversion factor is reported by the program).

volcube - Creating Isocontour Surfaces

Purpose:

The program volcube is needed only for special applications or older versions of VMD, since VMD now reads Situs formatted maps directly. It produces wireframe meshes or solid surfaces of isocontours that can be sourced from the VMD command console. The isosurfaces are generated with an improved version of the "marching cubes" algorithm [Lorensen and Cline, 1987].

Usage (at shell prompt):
 

./volcube file1 file2

   file1: inputfile, Situs format
   file2: outputfile, graphics rendering primitives in VMD script format

Interactive input at program prompt (also suitable for automation):

  • New voxel size for rendering (the input grid is automatically interpolated).
  • Isocontour surface density level.
  • Rendering style: lines (wireframe), triangles (solid, flat shades), or trinorm (solid, smooth shades).
  • You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:


  • (Program level:) Diagnostic messages and classification of the "marching cube" intersection patterns according to Heiden et al., J. Comp. Chem. 14 (1993), 246-250.
  • (Shell level:) VMD script file with graphics primitives. The primitives can be sourced from the VMD text console (cf. VMD user guide). File parameters are written to a header. Example: The following sequence of commands in the VMD text console will first list all files in the current directory and then render the graphics primitives of file "trinorm.vmd" in red, and those of file "wireframe.vmd" in green:
ls 
draw color red
source trinorm.vmd
draw color green
source wireframe.vmd

Known Bugs:

To prevent shading of the wireframe lines, the Tcl output scripts turn off the VMD "materials" rendering property. Subsequently loaded solid surfaces will therefore be rendered flat. This can be prevented by sourcing all solid surfaces before sourcing any wireframes.

voldiff - Discrepancy / Difference Mapping

Former name: subtract

Purpose:

The voldiff utility allows one to compute the difference density map (discrepancy map) of two volume data sets in Situs format. The input datasets can differ in their parameters. If necessary the second input file is resampled to the grid of the first input file.

Usage (at shell prompt):
 

./voldiff file1 file2 file3

   file1: inputfile 1, Situs format
   file2: inputfile 2, Situs format
   file3: outputfile, Situs format

Input at program prompt:

None.

Output:

Density file in Situs format. The new density values are computed by subtracting the corresponding values of input grid 2 (which is resampled by trilinear interpolation, if necessary) from those of input grid 1. The output grid inherits the parameters from grid 1.
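
Trilinear interpolation, as used for the resampling, weights the eight grid points surrounding a position by their proximity along each axis. The Python sketch below illustrates the technique on a nested list indexed as grid[x][y][z]; it is not the voldiff implementation, and the function name, the index convention, and the sample values are assumptions.

```python
import math

def trilinear(grid, x, y, z):
    """Interpolate grid[ix][iy][iz] at fractional index coordinates (x, y, z).

    Assumes every grid dimension has at least two points and that the
    coordinates lie inside the grid.
    """
    shape = (len(grid), len(grid[0]), len(grid[0][0]))
    # Lower corner of the enclosing cell, clamped so that corner + 1 stays in range.
    ix, iy, iz = (min(int(math.floor(c)), n - 2) for c, n in zip((x, y, z), shape))
    fx, fy, fz = x - ix, y - iy, z - iz
    value = 0.0
    for dx, wx in ((0, 1 - fx), (1, fx)):
        for dy, wy in ((0, 1 - fy), (1, fy)):
            for dz, wz in ((0, 1 - fz), (1, fz)):
                value += wx * wy * wz * grid[ix + dx][iy + dy][iz + dz]
    return value

# Hypothetical 2x2x2 second map with density x + 2y + 4z at index (x, y, z).
grid2 = [[[x + 2 * y + 4 * z for z in range(2)] for y in range(2)] for x in range(2)]
density1 = 5.0   # hypothetical value of the first map at the same position
difference = density1 - trilinear(grid2, 0.5, 0.5, 0.5)
print(difference)
```

Because the sample density is linear in each coordinate, the interpolated value at the cell center equals the exact value 0.5 + 1 + 2 = 3.5.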

voledit - Inspecting and Editing 3D Maps

Supersedes former programs: volslice, floodfill, volpad (padup), volcrop (pindown)

Purpose:

The segmentation option of voledit (formerly the separate floodfill utility) extracts a targeted contiguous volume from a given density map. Originating from the vicinity of a given start voxel, it recursively finds the maximum contiguous volume formed by neighboring voxels that exceed a given contour density level. Suitable start positions can be identified by visual inspection with the voledit or voledit3d utilities. The routine tolerates near misses of the start position.

Cross sections of the density data in the (x,y)-, (y,z)-, or (z,x)-planes can be inspected with the simple terminal window graphics program voledit.  The utility can also be used to write individual 2D slices or 3D volumes to files. Volumes can be edited by cropping, zero padding, polygon clipping, and segmentation (specified under options).

Usage (at shell prompt):

 

./voledit file1 

   file1: inputfile, Situs format

Interactive input at program prompt (also suitable for automation):

  • Type of cross section, (x,y), (y,z), or (z,x).
  • Threshold (cutoff) value for the rendering of the density.
  • z, x, or y position of the cross section plane (grid units).
  • Polygon clipping parameters and vertices (options).
  • Cropping parameters in voxel units (options).
  • Zero padding in voxel units (options).
  • Segmentation parameters to extract a targeted contiguous volume. Originating from the vicinity of a given start voxel, voledit finds recursively the maximum contiguous volume formed by neighboring voxels that exceed a given threshold density level. An additional layer is added for aesthetic reasons to facilitate isocontouring near the cutoff level. Although the extracted grid contains some voxels (in the contour layer) with densities below the cutoff, all voxels with density values above the cutoff are guaranteed to be part of the found contiguous volume. Voxels outside the contour layer are assigned a density value of 0.
  • File name for 2D slice or 3D volume output file (options).
  • You can automate this program by "overloading" the standard input if you put expected values in a script!
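
The segmentation described in the list above is a flood fill. The Python sketch below illustrates the idea under several assumptions: voxels are neighbors only across shared faces (6-connectivity), a queue replaces the recursion, and the start voxel itself must exceed the threshold (the near-miss tolerance of the program is not reproduced). Function and variable names are hypothetical.

```python
from collections import deque

# Face neighbors only; the connectivity actually used by voledit is an assumption here.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))

def segment(grid, start, threshold):
    """Keep the contiguous region around `start` plus one contour layer; zero the rest."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    if grid[start[0]][start[1]][start[2]] <= threshold:
        return out   # no near-miss search in this sketch
    seen = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        out[x][y][z] = grid[x][y][z]
        for dx, dy, dz in NEIGHBORS:
            p = (x + dx, y + dy, z + dz)
            if not (0 <= p[0] < nx and 0 <= p[1] < ny and 0 <= p[2] < nz):
                continue
            value = grid[p[0]][p[1]][p[2]]
            if value > threshold:
                if p not in seen:
                    seen.add(p)
                    queue.append(p)
            else:
                out[p[0]][p[1]][p[2]] = value   # contour layer below the cutoff
    return out

# Hypothetical 5x1x1 map: the voxel with density 6 is not connected to the start.
grid = [[[5]], [[4]], [[1]], [[6]], [[0]]]
result = segment(grid, (0, 0, 0), 2)
print([plane[0][0] for plane in result])
```

In the example the region (5, 4) and its contour voxel (1) are kept, while the disconnected voxel with density 6 is set to zero.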

Output:

(Shell window:) Cross section of the input map in Situs format. Note that pairs of voxels neighboring in the vertical direction are represented by a single character:

'^' if the upper voxel density exceeds a threshold (cutoff) level,
'u' if the lower voxel density exceeds the threshold level,
'0' if both upper and lower voxel densities exceed the threshold level, and
' ' if the densities are below the threshold.
(2D Output File:) Voxel indices and density of specified cross section.

(3D Output File:) Density file in Situs format. The new grid inherits the voxel size (grid spacing) of the old grid. The number of x, y, and z increments, and the coordinates of voxel (1,1,1) depend on the chosen editing options (cropping, padding, segmentation). For example, when shrinking of the box is selected during segmentation, the grid dimensions are determined by the minimum box that contains the contiguous volume plus one layer of neighboring voxels with density values below the threshold.
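
The character encoding of the shell-window cross section (two vertically neighboring voxels per character) can be mimicked in a few lines of Python. This is a sketch of the encoding rule only; the function name and the sample densities are hypothetical.

```python
def render_rows(upper, lower, threshold):
    """Encode two vertically adjacent voxel rows as one line of characters."""
    chars = []
    for up, low in zip(upper, lower):
        if up > threshold and low > threshold:
            chars.append("0")      # both voxels above the threshold
        elif up > threshold:
            chars.append("^")      # only the upper voxel
        elif low > threshold:
            chars.append("u")      # only the lower voxel
        else:
            chars.append(" ")      # neither
    return "".join(chars)

line = render_rows([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 0.0], 0.5)
print(repr(line))
```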

Notes:

This program requires the use of a fixed-width font in the shell window.

volhist - Inspecting and Manipulating the Voxel Histogram

Former name: histovox

Purpose:

The volhist utility prints the voxel histogram [Frank et al., 1991] of the density values. The histogram illustrates two general properties of low-resolution density distributions. First, a pronounced peak at low densities is due to background scattering. The protein density typically corresponds to a second, broader peak at higher densities. When integrating the histogram "from the top down", the known molecular volume of a protein can be used to compute its boundary density value. The volhist program also allows the user to add a constant value to the densities to shift the background density peak to the origin, and to rescale the densities.
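
The "top down" integration can be illustrated with a small Python sketch. Instead of a binned histogram it walks through the sorted voxel densities, starting at the highest, until the accumulated voxel volume reaches the known molecular volume. The function name and the numbers are hypothetical, and the approach is a simplification of what volhist reports.

```python
def boundary_density(densities, voxel_size, molecular_volume):
    """Density level above which the enclosed voxels add up to the given volume."""
    voxel_volume = voxel_size ** 3
    enclosed = 0.0
    for value in sorted(densities, reverse=True):   # highest densities first
        enclosed += voxel_volume
        if enclosed >= molecular_volume:
            return value
    return None   # the map holds less volume than requested

# Hypothetical map: six voxels of 2 Angstrom size, target volume 24 cubic Angstrom.
level = boundary_density([0.1, 0.9, 0.5, 0.7, 0.2, 0.0], 2.0, 24.0)
print(level)
```

With a voxel volume of 8 cubic Angstrom, three voxels are needed, so the boundary is the third-highest density, 0.5.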

Usage (at shell prompt):
 

./volhist file1 [file2] 

   file1: inputfile, Situs format
   file2: (optional) outputfile, Situs format

Interactive input at program prompt (if file2 specified):

  • Offset density value (will be added to all voxels). 
  • Scaling factor.
  • You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:

  • (Program level:) Voxel histogram and fractional volume of volumetric data echoed to the screen. The histogram bars are normalized by the second highest density peak.
  • (Shell level:) Density file in Situs format (if specified). The new density values are computed by adding the offset value and multiplying by the scaling factor entered at the program prompt. The new grid inherits all size and position parameters of the old grid.
volvoxl - Grid Interpolation

Former name: interpolate

Purpose:

The volvoxl utility allows one to change the voxel size (grid spacing) of a grid. Density values of the new grid are computed by trilinear interpolation.

Usage (at shell prompt):
 

./volvoxl file1 file2

   file1: inputfile, Situs format
   file2: outputfile, Situs format

Interactive input at program prompt (also suitable for automation):

The desired new voxel size (grid spacing). You can automate this program by "overloading" the standard input if you put expected values in a script!

Output:

Density file in Situs format. The new grid inherits the coordinate system origin and forms the largest possible box that is fully enclosed by the old grid.
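
One way to read "the largest possible box that is fully enclosed by the old grid" is sketched below in Python: along each axis, new grid points start at the shared origin and are added as long as they do not pass the last old grid point. This reading, the function name, and the small rounding tolerance are assumptions, not a description of the volvoxl source code.

```python
import math

def new_grid_points(old_points, old_voxel, new_voxel):
    """Number of new grid points along one axis that fit inside the old grid."""
    extent = (old_points - 1) * old_voxel   # distance from first to last old grid point
    # The small tolerance guards against rounding when the ratio is a whole number.
    return int(math.floor(extent / new_voxel + 1e-9)) + 1

coarse = new_grid_points(10, 2.0, 3.0)   # 18 Angstrom extent, 3 Angstrom spacing
fine = new_grid_points(10, 2.0, 1.0)     # 18 Angstrom extent, 1 Angstrom spacing
print(coarse, fine)
```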

Header File and Library Routines

The suite of programs is supported by a header file (situs.h) containing user-defined parameters and by auxiliary library programs. The library programs and their respective header files handle input and output of atomic coordinates in PDB format (lib_pio.c), input and output of volumetric data (lib_vio.c), input of data at the prompt (lib_std.c), eigenvector computation for real symmetric 3x3 matrices (lib_jac.c), Euler angle generation (lib_eul.c), random number generation (lib_rnd.c), array management (lib_vec.c), Powell optimization (lib_pow.c), map manipulation (lib_vwk.c), PDB manipulation (lib_pwk.c), symmetric multiprocessing (lib_smp.c), and timing (lib_tim.c).
