|
User Guide
|
| This guide is
intended to be
used as a reference manual. You may also want to follow the simple
steps
described in the tutorials
which give usage
examples
of the most important utilities. More documentation is also available
on
the methodology page
and in the published
articles. |
|
Content:
colacor - Cross-Correlation Calculation and Refinement of Manual Docking
colores - FFT-Accelerated 6D Exhaustive Search
eul2pdb - Graphical Representation of Euler Angles
map2map - Format Conversion
matchpoint - Point Cloud Matching
pdb2sax - Create a Simulated SAXS Bead Model from a PDB
pdb2vol - Create a Volumetric Map from a PDB
pdbsymm - Symmetry Builder
qdock - Rigid-Body Docking of High- and Low-Resolution Data
qpdb - Vector Quantization of a PDB
qplasty - Interpolation of Sparsely Sampled Displacements
qrange - Automatic Vector Quantization and Rigid-Body Docking
qvol - Vector Quantization of Volumetric Map
vol2pdb - Create a PDB from a Volumetric Map
voldiff - Discrepancy / Difference Mapping
voledit - Inspecting 2D Cross Sections of Density Values
volhist - Inspecting and Shifting the Voxel Histogram
Header File and Library Routines
|
| colacor
- Cross-Correlation Calculation and Refinement of Manual Docking
Purpose:
colacor = Combined Off-LAttice
COrrelation and Refinement. Since colores
centers maps and structures automatically,
a special tool is needed to compute the cross-correlation coefficient
between a density map and a (resolution-lowered) atomic structure at a
given, user-defined geometry. As an additional option, colacor also
performs a single run of off-lattice Powell optimization that
refines (as a rigid body) a manually docked fit to the nearest maximum of the
cross-correlation.
Basic usage
(at shell prompt):
| ./colacor <Situs density map> <PDB
structure> -res <float>
-cutoff <float> |
The basic
input parameters
are:
<Situs
density map> Low resolution map in Situs format. To
convert your
EM map to Situs format use the map2map
utility.
<PDB
structure> Atomic structure in PDB format.
-res <float>
Estimated resolution of the density map in Å. [default
-res 15.0]
-cutoff
<float> Density map threshold value. All density levels below
this
value
are set to zero. You can use volhist
to
rescale
the density levels or to shift the background peak in the voxel
histogram
to the origin. [default -cutoff 0.0]
-corr
<int> This option
controls the fitting
criterion.
Two options
are implemented:
-corr
0 [default
for res.<10Å] Standard
linear cross-correlation. The scalar product
between
the density maps of the low resolution map and the low-pass filtered
atomic
structure. For resolutions > 10Å this criterion is less
discriminative.
-corr
1 [default for res.>=10Å] A
Laplacian filter is applied by default to maximize
the
fitting
contrast. This is the recommended docking criterion
for low resolution docking (up to 25Å resolution). To
provide for a more robust algorithm when dealing with cropped
or thresholded experimental
data we implemented a mask that filters out hard artificial
surfaces. Due to the masking and filtering expect overall smaller
correlation
values compared to -corr 0.
More advanced
options (at shell prompt):
-ani <float>
Defines the resolution anisotropy factor (z direction vs. x,y
plane)
[default: -ani 1.0]. Allows
one to set a different resolution in the z direction vs. the x,y plane.
E.g. "-ani 1.5 -res 20" specifies a 30 Å resolution in the z direction,
and a 20 Å resolution in x,y. This is useful for researchers dealing
with
membrane protein or tomography reconstructions that have a reduced
resolution
in the z direction.
-nopowell
This flag skips the Powell optimization. Only the cross-correlation
coefficient is computed. By default the Powell optimization is turned
on.
-pwti
<float int> Powell tolerance and max number of iterations of
the
Powell algorithm. These two parameters control the convergence of the
optimization.
By default the tolerance is set to 1e-6 and the max iterations are
limited
to 25.
-pwdr <float
float> Initial gradient of the translational and rotational search
in the
Powell optimization. By default the initial rotational gradient is set
to
3.5 degrees. The rotational gradient cannot be larger than 10 degrees.
If a larger value is chosen, that value is ignored and the gradient is
set to 10 degrees. The translational initial gradient is set to 25% of
the voxel spacing. To use the default value only for the rotational or
only for the translational gradient, choose a negative number for the
parameter that must be left at default, and your chosen value for the
other.
-pwcorr
<int> This option sets the
Powell correlation algorithm. By default, the fastest algorithm which
reproduces the standard cross correlation coefficient to within the
Powell tolerance is determined at runtime.
-pwcorr
0 Determined at runtime [default]
-pwcorr 1 Standard three-step code
-pwcorr 2 Three-step code with mask applied
-pwcorr 3 One-step code for small probe
structures
-sizef
<float> To prevent the atomic structure from being placed outside the
target
map, the low resolution map may be enlarged by a margin of
width sizef times
the map dimensions. In practice, values between 0 and 0.2 seem to
work
fine and save compute time. [default -sizef 0.0]
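The arithmetic behind -ani and -pwdr can be checked with a few lines of Python. This is only an illustrative sketch of the rules stated above, not code taken from colacor. The function names are hypothetical, and the argument order (translational value first, rotational value second) is an assumption based on the order in which the two quantities are named in the option description.

```python
def anisotropic_resolution(res, ani=1.0):
    """Return (xy_resolution, z_resolution) in Angstrom for given -res and -ani."""
    if res <= 0 or ani <= 0:
        # Assumption: both values are expected to be positive.
        raise ValueError("res and ani must be positive")
    return (float(res), float(res) * float(ani))


def colacor_powell_steps(voxel_spacing, trans=-1.0, rot=-1.0):
    """Resolve the initial Powell steps following the -pwdr description.

    A negative value selects the default for that parameter.
    Rotational values above 10 degrees are replaced by 10 degrees.
    """
    trans_step = 0.25 * voxel_spacing if trans < 0 else float(trans)
    rot_step = 3.5 if rot < 0 else min(float(rot), 10.0)
    return (trans_step, rot_step)


# "-ani 1.5 -res 20": resolution in x,y and in z
print(anisotropic_resolution(20, 1.5))
# Keep the default translational step, request a rotational step that is too large
print(colacor_powell_steps(4.0, -1.0, 25.0))
```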
Input at
program prompt:
None.
Output:
(Shell window:) The cross-correlation
coefficient and other useful information about the structure and map
used. Depending on the overlap of structure and map, a good fit with
-corr 1 (default Laplacian filter) setting may have values up to 0.5,
and with -corr 0 (standard correlation) may have values up to 0.9. These
values are smaller if the structure does not account for the entire map
density.
|
| colores
- FFT-Accelerated 6D Exhaustive Search
Purpose:
colores = COrrelation based LOw
RESolution docking. A general purpose, multi-processor
capable rigid-body
search
tool, suitable for situations where not all density is accounted for by
the atomic structure. The translational search is FFT-accelerated and
the program supports
the use of a Laplacian-style filter (can be turned off) that
drastically increases the fitting
contrast
at medium to low resolution (for more info see the corresponding
methods page). This tool performs a full exhaustive search in 6D
search space (3
translational
and 3 rotational degrees of freedom) within a couple of minutes of
compute
time, depending on system size and number of processors used.
Basic usage
(at shell prompt):
./colores <Situs density map> <PDB
structure> -res <float>
-cutoff <float> -deg <float> -nprocs <int>
|
The basic
input parameters
are:
<Situs
density map> Low resolution map in Situs format. To
convert your
EM map to Situs format use the map2map
utility.
<PDB
structure> Atomic structure in PDB format.
-nprocs
<int> This option sets the number of processors used for the
on-lattice 6D search and the off-lattice Powell optimization. Colores
supports systems with multi-core and/or hyperthreaded
processors. Note that on
some architectures the threads may not show in the UNIX "top" window;
if in doubt, run your own timing comparison. [default
-nprocs 1]
-res <float>
Estimated resolution of your density map in Å. [default
-res 15.0]
-cutoff
<float> Density map cutoff value. All density levels below this
value
are set to zero. You can use volhist
to
rescale
the density levels or to shift the background peak in the voxel
histogram
to the origin. [default -cutoff 0.0]
-deg
<float> Angular sampling of the rotational search space in
degrees.
For typical electron microscopy maps the angular step size should be
between
6° and 20°. [default -deg 15.0]
-corr
<int> This option controls the fitting
criterion.
Two
options are implemented:
-corr
0 [default
for res.<10Å] Standard
linear cross-correlation. The scalar product
between
the density maps of the low resolution map and the low-pass filtered
atomic
structure. For resolutions > 10Å this criterion is less
discriminative.
-corr
1 [default for res.>=10Å] A
Laplacian filter is applied by default to maximize
the
fitting
contrast. This is the recommended docking criterion
for low resolution docking (up to 25Å resolution). To
provide for a more robust algorithm when dealing with cropped
or thresholded experimental
data we implemented a mask that filters out hard artificial
surfaces. Due to the masking and filtering expect overall smaller
correlation
values compared to -corr 0.
More advanced
options (at shell prompt):
-ani <float>
Defines the resolution anisotropy factor (z direction vs. x,y
plane)
[default: -ani 1.0]. Allows
one to set a different resolution in the z direction vs. the x,y plane.
E.g. "-ani 1.5 -res 20" specifies a 30 Å resolution in the z direction,
and a 20 Å resolution in x,y. This is useful for researchers dealing
with
membrane protein or tomography reconstructions that have a reduced
resolution
in the z direction.
-erang
<float float float float float float>
Defines the rotational space limits according to the range of the Euler
angles (psi, theta, phi). By default the entire rotational space is
considered
[default: -erang 0 360 0 180 0 360]. Note that the Euler angle
range
is not limited to these standard intervals, so you can also specify
negative
values (within certain limits), but in any event all colores output
angles
are remapped to the standard intervals. For example, if you want to
perform
a fine search of 2° angular sampling in only one of the Euler
angles,
these are the options:
| ./colores em.sit atoms.pdb -res 9.0 -cutoff 1.0 -deg 2 -erang 0 360 0 0 0 0 |
-euler
<int> There are three ways to generate an exhaustive
list of Euler
angles that covers (nearly) uniformly the rotational space for a given
angular sampling
(option -deg). The proportional method yields very even
results and also performs well for smaller intervals specified via
'-erang'. The pole sparsing method is widely used but yields slightly
less
uniform distributions. The spiral method also produces a less uniform
distribution, but for medium
to low resolution docking it is quite reasonable. The Euler angles are
saved to a file col_eulers.dat, and you can edit this file and reload
it
using the -euler 3 option. This way, you can also load any manually
generated
Euler angle files.
-euler
0 Proportional method [default]
-euler 1
Pole sparsing method
-euler 2 Spiral method
-euler 3 <filename> Input file
Here is
an example generating the Euler angle distribution with 8° angular
sampling using the spiral method:
./colores em.sit atoms.pdb -res 9.0 -cutoff 1.0 -deg 8 -euler 2
|
-peak
<int> This option sets the peak search algorithm. By
default, a combined sorting and filtering-based approach is used. A
stand-alone filtering-based algorithm is available as an option.
-peak
0 Original peak search by sort and filter [default]
-peak 1 Peak search by filter only
-explor
<int> Controls the number of the best fits found in the 6D
on-lattice
search to be subsequently refined by Powell optimization. This number
is only an upper bound for the final number, since redundant solutions
are removed in the Powell stage. [default -explor 10]
-sizef
<float> FFT zero padding factor. The low resolution map is
enlarged by a margin of
width sizef times
the map dimensions. We have optimized the zero padding empirically [default
-sizef 0.1 for standard and 0.2 for Laplacian correlation]
-sculptor Save
additional output files for interactive
exploration of fits with Sculptor
[default: Off]
-nopowell
This flag skips the Powell optimization and only the on-lattice search
is performed. By default the Powell optimization is turned on.
-pwti
<float int> Powell tolerance and max number of iterations of
the
Powell algorithm. These two parameters control the convergence of the
optimization.
By default, the tolerance is set to 1e-6 and the max iterations are
limited
to 25.
-pwdr <float
float> Initial gradient of the translational and rotational search
in the
Powell optimization. By default the initial rotational gradient is set
to
25% of the angular sampling (but not larger than 10°), and the
translational
initial gradient is set to 25% of the voxel spacing. To use the default
value only for the rotational or only for the translational gradient,
choose a negative number for the parameter that must be left at
default, and your chosen value for the other.
-pwcorr
<int> This option sets the
Powell correlation algorithm. By default, the fastest algorithm which
reproduces the standard cross correlation coefficient to within the
Powell tolerance is determined at runtime.
-pwcorr
0 Determined at runtime [default]
-pwcorr 1 Standard three-step code
-pwcorr 2 Three-step code with mask applied
-pwcorr 3 One-step code for small probe
structures
-nopeaksharp
This flag skips the peak sharpness estimation procedure in order to
save processing time. By default, the peak sharpness estimation is
turned on.
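Several colores defaults depend on other options. The following Python sketch merely restates the default rules given in the option descriptions above (-corr from -res, -sizef from -corr, -pwdr from -deg and the voxel spacing). It is not taken from the colores source; the function name and the dictionary keys are hypothetical.

```python
def colores_defaults(res=15.0, deg=15.0, voxel_spacing=1.0, corr=None):
    """Resolve option defaults as described in this section (illustrative only)."""
    if corr is None:
        # -corr 0 is the default below 10 Angstrom, -corr 1 otherwise
        corr = 0 if res < 10.0 else 1
    # FFT zero padding: 0.1 for standard, 0.2 for Laplacian correlation
    sizef = 0.1 if corr == 0 else 0.2
    # Initial Powell steps: 25% of the angular sampling (at most 10 degrees)
    # and 25% of the voxel spacing
    return {
        "corr": corr,
        "sizef": sizef,
        "pwdr_rot": min(0.25 * deg, 10.0),
        "pwdr_trans": 0.25 * voxel_spacing,
    }


print(colores_defaults(res=9.0, deg=8.0, voxel_spacing=2.0))
print(colores_defaults(res=20.0, deg=60.0, voxel_spacing=4.0))
```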
Input at
program prompt:
None.
Output:
Here is a
brief description
of the output files (see also the file headers):
col_best*.pdb
The atomic coordinates in PDB format of the best fits found in the
search.
The total number of best fits saved is controlled by the option
-explor,
but only non-degenerate fits are returned, so the number may be smaller
than specified by the -explor option. The PDB header contains
information
about the docking (sampling, fit criteria used, correlation values,
position
and orientation etc.). It also includes a table containing the angular
variability of the correlation about the fit.
col_rotate.log This
file contains the best translational fit (on-lattice) found for each
rotation.
The first 3 columns are the Euler angles (in degrees), the next 3
columns
are the translational coordinates that gave the highest correlation
value,
followed by the correlation value (not normalized).
col_powell.log
This
file contains information about the Powell off-lattice search performed
for the best fits from the 6D lattice search. As before, rotational
and
translational coordinates correspond to the first 6 columns, but note
that the Euler angles are in radians.
col_trans.sit
The on-lattice translation function in Situs format. Since the
translational
search space corresponds to the input map lattice, we can generate a
map
in which density values are the correlation values normalized by the
maximum.
This is a Situs format map, so you can use VMD
to display it or use map2map to
convert to
other formats.
col_trans.log
Same as col_trans.sit, but instead of a map, the translational
correlation
values are stored in a regular file. Each row corresponds to a
lattice
coordinate (columns 4,5,6) and lists the Euler angles
(columns
1,2,3) in degrees that yield the highest correlation value (column
7).
col_eulers.dat
This file contains the list of uniform Euler angle triplets that
defines
the rotational space search. You can load such a file by using the
option
-euler 3. You can also inspect this file with the eul2pdb
tool.
col_lo_fil.sit
The zero padded and filtered target volume in Situs format just prior
to correlation calculation. This map is useful mainly for inspecting
the effect of filtering.
col_hi_fil.sit
The filtered (and centered) probe structure on the lattice in Situs
format just prior
to correlation calculation. This map is useful mainly for inspecting
the effect of filtering.
Additional output files will be written for interactive
exploration
with the -sculptor option (see
colores output and the Sculptor
documentation).
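For scripted post-processing, the column layout described for col_rotate.log (three Euler angles in degrees, three translational coordinates, one correlation value) can be read with a small Python helper. This is a minimal sketch: the function names are hypothetical, the sample text is made up for illustration, and the assumption that header or comment lines contain non-numeric fields should be checked against an actual log file. The second helper converts the Euler angles of col_powell.log from radians to degrees.

```python
import math


def parse_rotate_log(text):
    """Parse rows with 7 numeric columns: 3 Euler angles (degrees),
    3 translational coordinates, 1 correlation value."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue
        try:
            values = [float(f) for f in fields[:7]]
        except ValueError:
            continue  # assumed header or comment line
        rows.append({
            "euler_deg": tuple(values[0:3]),
            "trans": tuple(values[3:6]),
            "corr": values[6],
        })
    return rows


def best_row(rows):
    """Return the row with the highest correlation value."""
    return max(rows, key=lambda r: r["corr"])


def eulers_to_degrees(euler_rad):
    """Convert an Euler angle triplet from radians (col_powell.log) to degrees."""
    return tuple(math.degrees(a) for a in euler_rad)


# Made-up sample rows, only to exercise the parser
sample = (
    "psi theta phi x y z corr\n"
    "0.0 0.0 0.0 10.0 12.0 8.0 0.41\n"
    "30.0 15.0 0.0 11.0 12.0 8.0 0.57\n"
)
print(best_row(parse_rotate_log(sample)))
```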
Notes:
- Depending
on the overlap of structure and map, one can expect correlation values
of about 0.6-0.9 or 0.3-0.5 for standard and Laplacian correlation,
respectively. These guideline numbers would be
smaller if the fitted atomic structure does not account for the entire
map density.
- The time
estimate is quite accurate for the on-lattice 6D search. However,
the
subsequent off-lattice Powell optimization is not included in the
estimate;
its cost depends on the -explor number. You can balance the precision of the
search (e.g. angular sampling, option -deg) against the compute expense.
For medium to low resolution maps, angular sampling steps of less than
10° are not particularly useful since the Powell algorithm
has a large radius of convergence. You could save time
by using only the alpha carbon or backbone atoms of the input
structure, but the savings are often insignificant, so we recommend
fitting with all heavy (non-hydrogen) atoms.
- If you have a large map you can try to crop
the
data
to a region of interest (e.g. the asymmetric unit). Allow for
sufficient
room for the probe structure since you do not know its exact location a
priori.
- This
is a rigid-body
search. If you expect large induced-fit conformational changes, you
can
dissect your atomic structures into rigid-body domains and perform the
docking
with each of them individually. Alternatively, you may want to try our flexible
fitting strategies.
|
| eul2pdb - Graphical
Representation of Euler Angles
Purpose:
The eul2pdb
utility is used to generate a graphical representation of a set of
Euler angles resulting from a colores
run. The eul2pdb program writes a pseudo-atomic structure to a
PDB-formatted file where the set of Euler angles is represented as a
set
of points on a sphere of 10 Å radius. This file
can then be inspected with a visualization program (for example VMD).
Usage (at
shell prompt):
./eul2pdb col_eulers.dat out_file
out_file: output file, PDB format |
Input at
program prompt:
None.
Output:
Pseudo-atomic structure
in PDB format. Each triplet of Euler angles in the input
file is represented as a point on a sphere of 10 Å radius. The phi Euler
angle (rotation in the projection plane) is encoded in the B-factor
column of the PDB file (in radians), whereas theta and psi
correspond to longitude and latitude on the sphere.
|
| map2map
- Format Conversion
Former names: convert, conformat
Purpose:
Volumetric
density data is converted
to a Situs-specific format on a cubic lattice. This allows Situs
programs to keep track of coordinate systems and it makes the core
Situs programs
independent
of the ever-changing map format standards. In the editable (ASCII or text) Situs
format, a short header holds the voxel spacing WIDTH, the map origin as defined by
the 3D coordinates of the first voxel ORIGX,
ORIGY, ORIGZ, and the map dimensions
(number of increments) NX, NY,
NZ. This minimalist header is followed by the data fields such
that x increments change fastest and z
increments change slowest.
The map2map
utility reads many file
formats used by standard EM application software. These include the
MRC2000,
SPIDER,
and CCP4 formats, as well as similar 4-byte floating-point binary
formats
(automatic byte-order adjustment). X-PLOR maps in ASCII format, and
ASCII
files that contain a sequence of density values in free format are also
recognized. The reverse conversion of Situs format files to CCP4,
MRC2000,
X-PLOR, or SPIDER format is also supported.
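The data ordering of the Situs format (x fastest, z slowest) can be made concrete with a short Python sketch. The header field order below follows the order in which the fields are listed above (WIDTH, ORIGX, ORIGY, ORIGZ, NX, NY, NZ). The exact whitespace, the blank line after the header, the number formatting, and the number of values per line are assumptions made for illustration. The function names are hypothetical.

```python
def voxel_index(ix, iy, iz, nx, ny):
    """Position of voxel (ix, iy, iz) in the flat data field (x fastest, z slowest)."""
    return ix + nx * (iy + ny * iz)


def voxel_position(ix, iy, iz, width, origin):
    """Cartesian coordinates of a voxel, given the voxel spacing and the
    coordinates of the first voxel (ORIGX, ORIGY, ORIGZ)."""
    return (origin[0] + ix * width, origin[1] + iy * width, origin[2] + iz * width)


def format_situs(width, origin, dims, values, per_line=10):
    """Return the text of a small map: header line, blank line, data field."""
    nx, ny, nz = dims
    if len(values) != nx * ny * nz:
        raise ValueError("number of values must equal NX*NY*NZ")
    header = "%.6f %.6f %.6f %.6f %d %d %d" % (
        width, origin[0], origin[1], origin[2], nx, ny, nz)
    lines = [header, ""]
    for i in range(0, len(values), per_line):
        lines.append(" ".join("%.6f" % v for v in values[i:i + per_line]))
    return "\n".join(lines) + "\n"


# A 2 x 2 x 1 toy map with 3 Angstrom voxel spacing
print(format_situs(3.0, (0.0, 0.0, 0.0), (2, 2, 1), [0.0, 1.0, 2.0, 3.0]))
print(voxel_index(1, 1, 0, 2, 2), voxel_position(1, 1, 0, 3.0, (0.0, 0.0, 0.0)))
```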
Usage (at
shell prompt):
| ./map2map file1 file2
file1: inputfile
file2: outputfile
|
Interactive input at
program prompt (for automation see below):
- Input file
format.
- Input file
specific header fields if they are missing or if manual editing of
fields is selected.
Output:
Density file in
selected format.
If necessary, the program permutes the axis order (CCP4 and
MRC2000) and interpolates maps to a cubic lattice (X-PLOR, CCP4,
MRC2000). Details vary by map format and are too numerous to list
here; please inspect the program text in the terminal window carefully.
Notes:
- This version of
map2map was coordinated with the developers of VMD, Chimera, Sculptor, and em2em to give consistent side-by-side results
with other map formats. There are too many details to list here; if you
have questions, please contact us.
- Also check
out the free conversion programs MAPMAN, and
especially em2em
which also has a Situs format option.
- The maps are now quite robust under
most round trip conversions "EM
map format" -> Situs -> "EM map format", but note that the Situs fields WIDTH,
ORIGX,
ORIGY, ORIGZ are not part of the SPIDER
specification and cannot be saved.
- Advanced
users may try the manual assignment of header fields, if available.
This is useful for validation or debugging purposes as header fields
can be set arbitrarily.
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
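One possible way to script the "overloading" of standard input is sketched below in Python. The helper name run_with_answers is hypothetical. The answers must be supplied in exactly the order in which the program asks for them, which should be taken from the program text in the terminal window. The demonstration uses a small stand-in command that simply echoes its input, because the actual prompts depend on the input file format.

```python
import subprocess
import sys


def run_with_answers(command, answers):
    """Run an interactive program and feed one answer per line on standard input.

    command: list with the program and its command line arguments
    answers: values to be typed at the program prompts, in order
    Returns the text the program printed to standard output.
    """
    stdin_text = "".join("%s\n" % a for a in answers)
    result = subprocess.run(command, input=stdin_text,
                            capture_output=True, text=True, check=True)
    return result.stdout


# Stand-in for an interactive program: it prints every line it receives.
stand_in = [sys.executable, "-c",
            "import sys\nfor line in sys.stdin:\n    print('got', line.strip())"]
print(run_with_answers(stand_in, [1, 2.5]))
```

For map2map, the command list would hold ./map2map followed by file1 and file2, and the answers would be the format selection and any header values the program requests.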
|
| matchpoint (matchpt)
- Point Cloud Matching
Purpose:
The matchpoint utility is a
command-line program for matching arbitrarily sized 3D point sets, using the output of the Situs programs qpdb and qvol
for the docking. It
was developed as an alternative to the more limited qdock
and qrange tools and will eventually supersede
them. matchpoint can dock a subunit into a larger oligomeric structure
(find N codebook vectors within another set
with M vectors, N<M). To solve this NP-complete problem, matchpoint
uses a heuristic and investigates only a subset of all possible
permutations of feature-points.
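To get a feeling for why an exhaustive enumeration quickly becomes impractical, one can count the ordered assignments of N vectors to N distinct vectors chosen from M. The Python sketch below only illustrates this counting argument; it says nothing about the heuristic actually used by matchpoint, and the function name is hypothetical.

```python
import math


def count_assignments(n, m):
    """Number of ways to assign n vectors to n distinct vectors out of m
    (order matters): m! / (m - n)!"""
    return math.perm(m, n)


for n, m in [(4, 4), (4, 12), (7, 20)]:
    print(n, m, count_assignments(n, m))
```

For N = M the count reduces to N!, the number of permutations searched exhaustively by qdock.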
Basic usage (at
shell prompt):
./matchpt
file1 file2 file3 file4 [options]
file1: inputfile 1, Codebook
vectors from qvol in PDB format
file2: inputfile 2, Density map in Situs
format. Use NONE if no correlation calculation desired
file3: inputfile 3, Codebook vectors from
qpdb in PDB format
file4: inputfile 4, High-resolution
structure in PDB format. Use NONE if only the codebook vectors should
be matched.
|
Optional command
line parameters:
-res <float>
Estimated resolution of file2 in Å. This affects only the
computed cross-correlation value, not the docking. [default
15.0Å]
-explor
<int> Controls the maximum number of docking solutions that are
'explored' and written
to disk. This number is an upper bound since the solutions must pass
the anchor point matching criteria below. [default
10]
-anchor
<float> Radius of initial anchor point triangle search
space in Å.
[default 12.0Å, the larger the slower]
-radius <float> Radius of the neighbor-search in Å.
[default 10.0Å, the larger the slower]
-wildcards
<int> Wildcards: How many unmatched points are allowed. To avoid
false positives, it should not be larger than 10% of N. [default
0, the larger the slower]
-penalty
<float> Wildcard penalty in Å.
Determines how strongly solutions are penalized if they include unmatched
points. [default: 1Å]
-runs <int>
Number of runs. The algorithm will try different anchor point triangles
if set to > 1. [default: 3]
-ident
<float> Distance threshold in Å
for removing identical solutions. Useful only for oligomeric systems. [default
0Å, the higher the more results are filtered]
Output:
- (Program
level:) Solution
filenames, codebook vector RMSDs, cross-correlation coefficients and
permutations are printed out. The permutations indicate the order of
low
res. vectors fitted to high res. vectors, which is the opposite of how
they are shown in qdock and qrange.
- (Shell
level:) Solution files with the fitted structures. Codebook vector
files with the fitted feature-points corresponding to the fitted
structures.
Notes:
For situations where a smaller
monomer has to be docked into a
larger oligomeric map, two parameters should be adjusted. (1)
The -explor parameter controls how many files are written to disk.
This should be at least the number of subunits of the system, but in
practice it should be set to a higher value to avoid false negatives
(sometimes the algorithm finds multiple possible orientations for a
single subunit which might push another solution out of the top
ranking list). (2) The parameter -ident can also help avoid finding
multiple instances of the same unit. -ident will
filter the solutions based on the distance of the centroids of the
found subunits: if two configurations are too close, only the one with
the higher score is considered. It is recommended to try to increase
the number of solutions first before one filters the found units
with the -ident parameter.
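The centroid-based filtering described for -ident can be pictured with the following Python sketch. It is a simple greedy filter written for illustration, not the matchpoint implementation. The function name and data layout are hypothetical, and it assumes that a higher score is better and that "too close" means closer than the threshold.

```python
import math


def filter_identical(solutions, ident):
    """Keep only the best-scoring solution among those whose centroids lie
    closer than ident (in Angstrom) to each other.

    solutions: list of (score, (x, y, z)) tuples
    """
    kept = []
    for score, centroid in sorted(solutions, key=lambda s: s[0], reverse=True):
        if all(math.dist(centroid, c) >= ident for _, c in kept):
            kept.append((score, centroid))
    return kept


# Made-up scores and centroids
solutions = [(0.8, (0.0, 0.0, 0.0)), (0.7, (1.0, 0.0, 0.0)), (0.6, (30.0, 0.0, 0.0))]
print(filter_identical(solutions, 5.0))
print(filter_identical(solutions, 0.0))
```

With the default threshold of 0 no solution is removed, consistent with the default behavior described for -ident.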
Sometimes the default parameters simplify the search process too much
and an insufficient number of solutions (or none) are found. In this
case try first to increase the number of anchor-point triangles (via
-anchor), leading to only a moderately increased runtime of the program.
In a second step
one could also try to increase the search radius for potentially
matching points
(via -radius), which will increase the runtime more significantly.
|
| pdb2sax - Create a
Simulated Bead Model from a PDB
Purpose:
The pdb2sax
utility allows one to fill an input atomic structure with close-packed
spheres on a hexagonal lattice. It allows one to create simulated bead
models for validating Situs modeling applications.
Usage
(at shell prompt):
| ./pdb2sax file1 file2 radius
file1: inputfile, PDB format
file2: outputfile (bead model),
PDB format
radius: bead radius in Angstrom
|
Interactive input at
program prompt (also suitable for automation):
Choice of
atom
mass-weighting and
B-factor cutoff level. Atoms with B-factors above the cutoff level will
be ignored. You can automate
this program by "overloading" the standard input if you put expected
values in a script!
Output:
PDB file that contains the centers of
the simulated beads with the radii in the occupancy column. This file
can then be either inspected with a visualization program (for example VMD), or processed into a bead volume map
using pdb2vol.
|
pdb2vol
- Create a Volumetric Map from a PDB
Former name:
pdblur
Purpose:
The pdb2vol
utility is a real-space
convolution tool. It allows one to lower the resolution of an atomic
structure
to a user-specified value, or to create a bead model from atomic
coordinates.
The structure is first projected to a cubic lattice by trilinear
interpolation.
Subsequently, each lattice point is convoluted with one of five
supported
kernel (point-spread) functions.
Usage
(at shell prompt):
| ./pdb2vol file1 file2
file1: inputfile, PDB format
file2: outputfile, Situs format
|
Interactive input at
program prompt (also suitable for automation):
- If water,
hydrogen, or codebook vector atoms are present, choice
of ignoring them.
- Choice of
atom
mass-weighting and
B-factor cutoff level. Atoms with B-factors above the cutoff level will
be ignored.
- Desired
voxel
spacing for output
map.
- Kernel
width,
defined by either the
kernel half-max radius r-half (enter positive value) or by the
target
resolution of the output map (enter value of resolution as negative
number).
The standard deviation (sigma) of the kernel is assumed to be
half
the target resolution.
- Type of
smoothing kernel:
- Gaussian,
exp(-1.5
r^2 / sigma^2)
- Triangular,
max(0,
1 - 0.5 |r|
/ r-half)
- Semi-Epanechnikov,
max(0, 1 -
0.5 |r|^1.5 / r-half^1.5)
- Epanechnikov,
max(0, 1 - 0.5 r^2
/ r-half^2)
- Hard
Sphere, max(0, 1 -
0.5 r^60 / r-half^60)
- Choice of
correction for lattice
smoothing (subtract the lattice projection mean-square deviation from
the
kernel variance).
- The kernel
amplitude at the kernel
origin (r=0).
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
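The kernel formulas listed above can be evaluated with a short Python sketch. It follows the expressions as written in the list, with the amplitude at r=0 set to 1. The function names are hypothetical, and the half-max radius of the Gaussian is derived here from the listed formula rather than quoted from the program.

```python
import math

# Exponents p of the kernels of the form max(0, 1 - 0.5 |r|^p / r_half^p)
KERNEL_EXPONENTS = {
    "triangular": 1.0,
    "semi-epanechnikov": 1.5,
    "epanechnikov": 2.0,
    "hard sphere": 60.0,
}


def sigma_from_resolution(resolution):
    """Sigma is taken to be half the target resolution (the resolution is
    entered as a negative number at the prompt, so the sign is dropped)."""
    return 0.5 * abs(resolution)


def gaussian(r, sigma):
    """Gaussian kernel exp(-1.5 r^2 / sigma^2)."""
    return math.exp(-1.5 * r * r / (sigma * sigma))


def gaussian_r_half(sigma):
    """Radius at which the Gaussian drops to 0.5, from solving
    exp(-1.5 r^2 / sigma^2) = 0.5 for r."""
    return sigma * math.sqrt(math.log(2.0) / 1.5)


def power_kernel(r, r_half, exponent):
    """Kernels of the form max(0, 1 - 0.5 (|r| / r_half)^exponent)."""
    ratio = abs(r) / r_half
    if ratio >= 2.0 ** (1.0 / exponent):
        return 0.0  # beyond the support of the kernel (also avoids overflow)
    return 1.0 - 0.5 * ratio ** exponent


sigma = sigma_from_resolution(-15.0)
print(sigma, gaussian_r_half(sigma))
for name, p in KERNEL_EXPONENTS.items():
    print(name, power_kernel(5.0, 5.0, p))
```

Every kernel takes the value 0.5 at its half-max radius, which is what makes r-half a common width parameter across the five kernel types.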
Output:
Density file in
Situs format.
The new grid follows the coordinate system origin convention of the
atomic
structure and forms the smallest possible box that fully encloses the
structure
convoluted by the kernel.
|
pdbsymm
- Symmetry Builder
Supersedes
former program: hlxbuild
Purpose:
The
pdbsymm tool generates multiple copies of the input structure according
to a user-specified symmetry. Currently supported symmetry types
include: C, D and helical. If an input map (optional) is also specified
and the map has square cross-sections, the x- and y-position of the
principal symmetry axis (for symmetries C and D) or of the helical axis
(for helical symmetry) is automatically set to the geometric center of
a cross-section (in the map coordinate system). If the map is cubic and
D symmetry is requested, the z-position of the secondary axes is set to
the geometric center of the y-z cross-section. This functionality is
useful if one wants to generate assemblies from subunits that had been
fitted to the input map earlier.
Usage (at
shell prompt):
| ./pdbsymm file1 [file2] file3
file1: inputfile, PDB format
file2: (optional) inputfile for
helical or symmetry axis, Situs
format
file3: outputfile, PDB format
|
Interactive input at
program prompt (also suitable for automation):
Depending on
symmetry type:
- Helical
rise
per subunit (in z-direction).
- Angular
twist
per subunit (sign determines
handedness).
- Desired
number
of subunits to be
placed before file1 structure.
- Desired
number
of subunits to be
placed after file1 structure.
- Order of
the principal symmetry axis.
- [If file2
is unspecified: x- and y-position
of helical axis (offset from file1 coordinate system origin).]
- z-position
of secondary symmetry axes (for D symmetry - offset from file1
coordinate system origin).
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
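The helical parameters above (rise per subunit, angular twist per subunit, number of subunits before and after the input structure) can be illustrated with a few lines of Python. This is a sketch of a generic helical operation about an axis parallel to z, not the pdbsymm code. The function name is hypothetical, and the mapping of the twist sign to a particular handedness as well as the meaning of "before" (negative z direction) are assumptions.

```python
import math


def helical_copies(coords, rise, twist_deg, n_before, n_after, axis_xy=(0.0, 0.0)):
    """Generate helical copies of a list of (x, y, z) coordinates.

    Copy i is rotated by i*twist_deg about the axis parallel to z through
    axis_xy and shifted by i*rise along z. Copy 0 is the input structure.
    """
    copies = []
    for i in range(-n_before, n_after + 1):
        angle = math.radians(i * twist_deg)
        c, s = math.cos(angle), math.sin(angle)
        copy = []
        for x, y, z in coords:
            dx, dy = x - axis_xy[0], y - axis_xy[1]
            copy.append((axis_xy[0] + c * dx - s * dy,
                         axis_xy[1] + s * dx + c * dy,
                         z + i * rise))
        copies.append(copy)
    return copies


# One pseudo-atom, 5 Angstrom rise, 90 degree twist, one copy before and after
for copy in helical_copies([(1.0, 0.0, 0.0)], 5.0, 90.0, 1, 1):
    print(copy)
```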
Output:
Symmetry
PDB file containing
multiple copies of input PDB file.
Note:
file2 input maps
that don't have
square (x,y) cross-sections or whose helical axis is not in the
geometric
center of the square cross-sections must first be cropped with voledit.
|
| qdock
- Rigid-Body Docking of High- and Low-Resolution Data
Purpose:
Specialized tool
to dock a high-resolution
structure to low-resolution data using a fixed number k of codebook
vectors. Much of the functionality of qdock has been superseded by matchpoint, qrange
and colores. qdock is slated for
removal in one of the next Situs versions.
It is assumed that the user has
already determined a
suitable
number k of codebook vectors with qrange
and has
created the vector files with qpdb and
qvol.
Similar to qrange,
qdock carries out an exhaustive search of the k! = k*(k-1)* ...*2
permutations
of corresponding codebook vectors and returns a list of best
least-squares
fits. The limitation of qdock is that the docking can be carried out
only
for a fixed number k of vectors. The advantage of qdock is that the
results
can be ranked (if desired) by the correlation coefficient that measures
the overlap of the high- and low-resolution data. A lower rms deviation
(rmsd) of the least-squares fit typically corresponds to a higher
correlation
coefficient. However, the coefficients typically lie within a very
narrow
numeric range and care must be taken because fits based on the
correlation
coefficient alone are often ambiguous.
The user has the
choice to change
the origin of the map coordinate system. This option might be helpful
if
a graphics program has a convention for defining the map coordinate
system
that differs from that of the map2map
utility.
Usage (at
shell prompt):
./qdock file1 file2 file3 file4
file1: inputfile 1, PDB format,
codebook vectors from qvol
file2: inputfile 2, Situs format,
volumetric map corresponding to
qvol vectors
file3: inputfile 3, PDB format, codebook
vectors from qpdb
file4: inputfile 4, PDB format,
high-resolution structure corresponding
to qpdb vectors
|
Interactive input at
program prompt (also suitable for automation):
- Choice of
volumetric map coordinate
system.
- Ranking by
vector rmsd or by correlation
coefficient (carbon alpha or full atom).
- Selection
of a
docked high-resolution
structure from a list.
- Filename
for
the selected output
structure.
- If desired,
select and export additional
results from the list.
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
Output:
- (Program
level:) Possible pairs of
corresponding codebook vectors. The (default: 20) best least-squares
fits
are ranked by their rmsd (in Angstrom), or by their correlation
coefficient.
Permutations indicate the order of qpdb vectors fitted to qvol vectors.
See the Situs manuscripts
for more
information. Also, the positional and orientational accuracy of the
fitting
is given, calculated based on the statistical vector variability.
- (Shell
level:)
The superposition
of the selected corresponding codebook vectors defines a rigid-body
transformation
which superimposes high- and low-resolution data sets. After
transformation,
the docked high-resolution structure is written to a file in PDB
format.
The file3 codebook vectors are sorted and transformed in concert with
the
structure; they are appended, together with the file1 vectors, to the
output
PDB file.
|
| qpdb
- Vector Quantization of a PDB
Purpose:
Specialized tool
to perform a vector
quantization of atomic resolution data. qpdb supports the
correlation-coefficient based
docking
with qdock, and also flexible
docking. To enable skeleton-based
flexible
docking
with qvol, qpdb includes options to
learn vector
distances
and to export the Voronoi
cells generated by
the
vectors.
In qpdb a small number of calculations
(8 by
default)
are repeated with different random number seeds. The averaged codebook
vectors and their statistical variability are then written to the
output
file.
Usage:
First, the user
must determine
a suitable number of codebook vectors e.g. with qrange
or by visual inspection. For rigid-body docking, a small number (3-6)
is
usually sufficient. qpdb employs an efficient mass-weighting scheme
using
"equally weighted input vectors". The program also allows the user to ignore
flexible
or poorly defined atoms with high crystallographic B-factors. This
option
should only be chosen if there is an indication that parts of the
protein
are not visible in the low-resolution data due to disorder.
Usage (at
shell prompt):
| ./qpdb file1 file2
file1: inputfile (atomic
structure), PDB format
file2: outputfile (codebook
vectors), PDB format
|
Interactive input at
program prompt (also suitable for automation):
- If water, hydrogen, or codebook vector atoms are present, choice
of ignoring them.
- B-factor
cutoff
level. Atoms with
B-factors above this level will be ignored.
- Number of
codebook vectors.
- Choice of
computing the vector connectivities
(neighborhood relationships) with the Competitive Hebb Rule (Wriggers
et al., 1998) and writing them to a file.
- Choice of
computing the Voronoi
cells and writing them to a file.
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
Output:
- (Program
level:) The sphericity,
a measure between 0 and 1 that characterizes how spherical the shape of
the structure (file1) is. After the vector quantization the program
prints
the average rms variability of the codebook vectors in Angstrom. Also
given
in Angstrom is the radius of gyration of the vectors.
- (Shell
level:)
Codebook vectors in
PDB-formatted output file. The vector rms variabilities, representing
the
precision of the codebook vectors, are written to the occupancy fields
of the PDB-style atom entries. The effective radii of the Voronoi cells
are written to the B-factor fields of the PDB-style atom entries.
(Optional)
Vector connectivities can be written to a PSF file or a distance
constraints file. Constraint file entries are
triples
of free-format values in the order <index1> <index2>
<distance
in Angstrom>, where the indices correspond to the order of vectors
in file2,
counting from 1. (Optional) The Voronoi cells can be written to a PDB
file
consisting of the file1 atom entries where the index of each
corresponding
vector is written to the B-factor field of the output file.
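The Voronoi cell export described above amounts to labeling each atom with its nearest codebook vector. The following Python sketch illustrates that idea under stated assumptions: the function name is hypothetical, coordinates are plain (x, y, z) tuples rather than PDB records, and the 1-based cell index is an assumption made to match the constraint file convention.

```python
import math


def assign_voronoi_cells(atoms, codebook):
    """Return, for each atom, the 1-based index of its nearest codebook vector."""
    cells = []
    for atom in atoms:
        distances = [math.dist(atom, vector) for vector in codebook]
        # The nearest vector defines the Voronoi cell that contains the atom.
        cells.append(distances.index(min(distances)) + 1)
    return cells


atoms = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
codebook = [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
cells = assign_voronoi_cells(atoms, codebook)
```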
|
| qplasty
- Interpolation of Sparsely Sampled Displacements
Purpose:
This program performs an approximate
flexible fitting of
atomic resolution data
based on a coarse input model of displacements. The interpolation-based
flexing is quite reasonable
at the carbon alpha level of proteins, but bond lengths and angles at
the atomic level may be slightly distorted. Flexing with qplasty is
offered as a user-friendly alternative to a more complicated molecular
dynamics simulation protocol. The
qplasty-flexed structures may be processed further by a variety of
simulation or structure refinement tools.
Usage:
First, the user
must create
a suitable coarse model of the displacements using codebook
vectors as simulated markers for the feature positions before and
after flexing. Details of
the modeling steps are explained in the basic
and advanced flexing tutorials. The
displacements in the form of two PDB input files are then applied in
the UNIX command shell as follows. By default, the program assumes
Global IDW interpolation with exponent 8. The user may specify an
option -byres to turn on
interpolation by residue, or -interactive
for a free choice of interpolation methods and parameters.
Usage (at
shell prompt):
./qplasty file1 file2 file3 file4
[options]
file1:
inputfile (atomic structure), PDB
format
file2: inputfile (start codebook vectors),
PDB format
file3: inputfile (end codebook vectors or
displacements), PDB format
file4: outputfile (flexed atomic structure),
PDB format
[options]:
optional flag for default
parameters or full interactive mode:
(default) or
-byatom : interpolation by atom
-byres : interpolation by residue,
to reduce
the number of broken bonds
-interactive : free choice of
methods and parameters
|
Optional interactive / manual input at
program prompt with -interactive:
- The choice of interpolation method
(Thin-Plate-Splines, Elastic Body Splines, Global Inverse Distance
Weighting, Local Inverse Distance Weighting). The default method (Global IDW interpolation with exponent
8) performed best in our tests (Rusu et al., 2008),
so there is no need to change it except for further validation of the
supported algorithms.
- Various kernel and parameter choices
(for details see Rusu et al., 2008).
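For orientation, global inverse distance weighting with exponent 8 can be sketched as below. This is the textbook (Shepard) form of IDW, written under the assumption that each atom is shifted by a weighted average of the marker displacements; it is not taken from the qplasty source, and the function names are hypothetical.

```python
import math


def idw_displacement(point, markers, displacements, exponent=8):
    """Interpolate a displacement at point from sparse marker displacements."""
    weights = []
    for marker, shift in zip(markers, displacements):
        distance = math.dist(point, marker)
        if distance == 0.0:
            return tuple(shift)  # the point coincides with a marker
        weights.append(distance ** -exponent)
    total = sum(weights)
    return tuple(sum(w * d[axis] for w, d in zip(weights, displacements)) / total
                 for axis in range(3))


def flex(atoms, start_vectors, end_vectors, exponent=8):
    """Move every atom by the displacement interpolated at its position."""
    displacements = [tuple(e - s for s, e in zip(start, end))
                     for start, end in zip(start_vectors, end_vectors)]
    flexed = []
    for atom in atoms:
        shift = idw_displacement(atom, start_vectors, displacements, exponent)
        flexed.append(tuple(a + s for a, s in zip(atom, shift)))
    return flexed


start = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
end = [(0.0, 1.0, 0.0), (10.0, 0.0, 0.0)]
moved = flex([(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (1.0, 0.0, 0.0)], start, end)
```

With a high exponent such as 8, an atom close to one marker follows that marker almost rigidly, while atoms midway between markers receive a blend of the neighboring displacements.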
Output:
- (Program
level:) Various interpolation parameters and methods.
- (Shell
level:) The flexed atomic coordinates will be exported into file4.
|
| qrange
- Automatic Vector Quantization and Rigid-Body Docking
Purpose:
This program
automatically superimposes
atomic structures with corresponding low-resolution
density. Practically all density must be
accounted for by the atomic
structure for this to work. The functionality of qrange has been
superseded in part by the newer matchpoint, which is more tolerant of point
mismatches and outliers. If in doubt you may also use colores or colacor instead.
qrange
consolidates the functions
of the older vector
quantization routines qvol
and qpdb and of the rigid-body docking
routine qdock
into a single program to enable a more user-friendly fitting. The major
conceptual innovation of qrange and related programs was the
discretization
of the configurational search space by vector
quantization. This discretization yields a list of best-scoring
superpositions of
the
codebook vectors that represent the structural data sets. The search is
computationally very efficient and can be carried out on a standard
workstation
within seconds.
In qrange a small number of TRN calculations
(8 by default) are repeated with different random number seeds. The
averaged
codebook vectors and their statistical variability are then saved. In
general,
a low vector variability indicates good convergence and reliable vector
positions. The variability depends on the shape of the 3D input density
distribution and on the number of vectors. Therefore, it is a good
selection
criterion for finding an optimal number k of codebook vectors.
To optimally
represent the input
density distribution, qrange employs a multi-resolution approach and
computes
vectors for a range of k (3 <= k <= 9 by default). Note that for
globular shapes a small number of vectors (3-5) is usually sufficient
to
encode the shape unambiguously. A maximum number of k=9 vectors (27
degrees
of freedom) provides sufficient leeway to encode even the most complex
shapes.
After computing
the k vectors
for both high- and low-resolution data, the subsequent docking then
determines
the six rigid-body degrees of freedom by a least-squares fit [Kabsch,
1976]
of the k pairs of vectors. The corresponding vectors are not known a
priori,
and all k! = k*(k-1)*...*2 possible permutations are explored. The
program
saves a list of best least-squares fits (for each number k), ranked by
the remaining rms deviation (rmsd) after superposition of the vectors.
The ranking by codebook vector rms deviation typically produces a clear
prediction of the optimum docking configuration.
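The exhaustive search over all k! pairings can be illustrated with the following Python sketch. It is written under stated assumptions and is not the qrange implementation: the minimum rmsd of each pairing is obtained from Horn's quaternion formulation (which yields the same least-squares minimum as the Kabsch fit), the largest eigenvalue is found with a small Jacobi routine, and all function names are hypothetical.

```python
import itertools
import math


def _centered(points):
    n = len(points)
    center = [sum(p[a] for p in points) / n for a in range(3)]
    return [[p[a] - center[a] for a in range(3)] for p in points]


def _largest_eigenvalue(matrix, sweeps=50):
    """Largest eigenvalue of a small symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j) < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle (at most 45 degrees) that zeroes a[p][q].
                theta = 0.5 * math.atan2(2.0 * a[p][q] * math.copysign(1.0, diff),
                                         abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))


def superposition_rmsd(source, target):
    """Minimum rmsd of paired points after optimal rigid-body superposition."""
    x, y = _centered(source), _centered(target)
    s = [[sum(xi[r] * yi[c] for xi, yi in zip(x, y)) for c in range(3)]
         for r in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    horn = [[sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
            [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
            [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
            [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz]]
    norm = sum(v * v for p in x + y for v in p)
    residual = max(0.0, norm - 2.0 * _largest_eigenvalue(horn))
    return math.sqrt(residual / len(x))


def rank_permutations(low_res, high_res, keep=20):
    """Rank all k! pairings of high-res onto low-res vectors by vector rmsd."""
    fits = []
    for perm in itertools.permutations(range(len(high_res))):
        reordered = [high_res[i] for i in perm]  # high_res[perm[i]] pairs with low_res[i]
        fits.append((superposition_rmsd(reordered, low_res), perm))
    fits.sort()
    return fits[:keep]


low = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
# The same four vectors rotated by 90 degrees about z, shifted, and shuffled.
high = [(3, 5, 5), (5, 5, 5), (5, 5, 8), (5, 6, 5)]
fits = rank_permutations(low, high)
```

For k=9 the loop visits 362880 permutations, which is why the search remains feasible only for small k.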
Usage:
In a practical
application, the
user selects the number k that produces the smallest statistical vector
variability. Alternatively, the user may also consider the number k
that
results in the smallest rmsd (the two selection criteria give the same
results in most cases). For a given k the user then selects the fit
with
the smallest vector rmsd from a list of (default: 20) best scoring
results.
This strategy identifies a clear winner in most situations. The
positional and orientational accuracy of
the fitting is also estimated based on the statistical vector
variability.
Usage (at
shell prompt):
| ./qrange file1 file2
file1: inputfile 1, Situs format
file2: inputfile 2, PDB format
|
Interactive input at
program prompt (also suitable for automation):
- Choice of
utilities to inspect the
density distribution (e.g. voxel histogram).
- Threshold
(cutoff) density value.
- Choice of
volumetric map coordinate
system.
- If water
molecules are present, choice
of ignoring them.
- B-factor
cutoff
level. Atoms with
B-factors above this level will be ignored.
- Selection
of
the vector number k
from a list.
- Selection
of a
docked high-resolution
structure from a list.
- Filename
for
the selected output
structure.
- If desired,
select and export additional
results from the lists.
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
Output:
- (Program
level:) The sphericity,
a measure between 0 and 1 that characterizes how spherical the shape of
the structure (file2) is. After the multiple vector quantizations the
program
prints the selection criteria (vector variability and vector rmsd) as a
function of k. Also shown are the possible pairings of corresponding
codebook
vectors. The best least-squares fits are ranked by their vector rmsd
(in
Angstrom). Also, the correlation coefficient is given. Permutations
indicate
the internal order of vectors. Finally, the positional and
orientational
accuracy of the selected fit is given, as computed from the statistical
vector variability.
- (Shell
level:)
The superposition
of the selected corresponding codebook vectors defines a rigid-body
transformation
which superimposes high- and low-resolution data sets. After
transformation,
the docked high-resolution structure is written to a file in PDB
format.
The codebook vectors are appended to this output file. QVOL atoms
represent
the low-resolution vectors and QPDB atoms represent the high-resolution
vectors. The vector rms variabilities, representing the precision of
the
codebook vectors, are written to the occupancy fields of the PDB-style
atom entries.
Notes:
- qrange
employs
an efficient atom
mass-weighting scheme using "equally weighted input vectors".
- The user
has
the choice to change
the origin of the map coordinate system. This option might be helpful
if
a graphics program has a different convention for defining the map
coordinate
system than map2map.
- The program
also allows the user to ignore
flexible or poorly defined atoms with high crystallographic B-factors.
This option should only be chosen if there is an indication that parts
of the protein are not visible in the low-resolution data due to
disorder.
- Often there
is
no one-to-one correspondence
between low-resolution data and what one would consider the physical
density
of the structure, so users may need to experiment with the volumetric
and
B-factor threshold values until the results are satisfying. To help
estimate
appropriate threshold values for single molecules, qvol
and qpdb print the radius of
gyration of
the
codebook vectors, a measure of vector compactness. At the proper
threshold
densities the radius of gyration returned by qvol should be approximately equal to
that of qpdb.
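The radius of gyration comparison in the last note can be sketched in Python as follows. The sketch assumes equally weighted vectors, and the 10% relative tolerance in thresholds_consistent is a hypothetical choice for illustration; neither function is part of Situs.

```python
import math


def radius_of_gyration(vectors):
    """Radial rms deviation of the vectors from their (unweighted) center."""
    n = len(vectors)
    center = [sum(v[a] for v in vectors) / n for a in range(3)]
    mean_square = sum(math.dist(v, center) ** 2 for v in vectors) / n
    return math.sqrt(mean_square)


def thresholds_consistent(rg_map, rg_structure, tolerance=0.1):
    """True if both radii agree within a relative tolerance (assumed 10%)."""
    return abs(rg_map - rg_structure) <= tolerance * rg_structure


rg = radius_of_gyration([(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)])
```

If the check fails, one would adjust the density threshold or the B-factor cutoff and rerun the quantization.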
|
| qvol
- Vector Quantization of Volumetric Map
Purpose:
Specialized tool
to perform a vector
quantization of low-resolution, single molecule data. qvol supports
the correlation-coefficient
based docking with qdock, and
flexible
docking with qplasty. In the absence
of existing vector positions, qvol carries
out a global search using the TRN algorithm. If
start vectors are already known, the LBG local
search
algorithm is used instead of TRN, or connectivities can be learned. LBG
allows the user to add distance
constraints to the vector refinement that are useful for flexible
docking.
With TRN,
a small number
of calculations (8 by default) are repeated with different random
number
seeds. The averaged codebook vectors and their statistical variability
are then written to the output file. With LBG,
no statistical clustering is performed. In this case it is important to
specify reliable initial positions from a prior qvol run.
Usage:
In a practical
application of
qvol, one should extract from the volumetric data a region of interest
corresponding to a single molecule using e.g. voledit.
Next, the user
must determine
a suitable number of codebook vectors. Only densities above a
user-defined threshold
value are considered by qvol to eliminate background noise in the
low-resolution
data. Depending on the noise, this threshold value should be at 50-80%
of the level that is typically considered the "molecular surface" of
the
biopolymer in the low-resolution data.
New vector
positions are calculated
automatically with the TRN
method if no start
vectors
are specified. Subsequently, these vector positions can be refined in a
second qvol run with the LBG
method.
Also, any distance constraints can be read from a file or entered at
the
command prompt at this time.
Usage (at
shell prompt):
| ./qvol file1 [file2] file3
file1: inputfile, Situs format
file2: inputfile, start vectors,
PDB format (optional)
file3: outputfile, PDB format
|
Interactive input at
program prompt (also suitable for automation):
- Choice of
utilities to inspect the
density distribution (e.g. voxel histogram).
- Threshold
(cutoff) density value.
- Number of
codebook vectors.
- (If file2
is
specified): Choice of
entering distance constraints manually or from a file.
There are two constraint file options. Constraint
file entries generated e.g. with qpdb
are triples
of free-format values in the order
<index1>
<index2> <distance in Angstrom>, where the indices
correspond to
the order of vectors in file2, counting from 1. It is also possible to
read the connectivities from a PSF file in which case the missing
distances are computed from file2.
- Choice of
computing the vector connectivities
(neighborhood relationships) with the Competitive Hebb Rule (Wriggers
et al., 1998) and writing them to a file.
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
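The constraint file layout described above (triples of index1, index2, and distance in Angstrom, with indices counting from 1) can be read with a few lines of Python. The parser below is a hypothetical sketch: skipping blank lines and rejecting out-of-range indices or non-positive distances are assumptions, not documented qvol behavior.

```python
def parse_constraints(text, vector_count):
    """Parse '<index1> <index2> <distance>' triples; indices count from 1."""
    constraints = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 values")
        first, second, distance = int(fields[0]), int(fields[1]), float(fields[2])
        for index in (first, second):
            if not 1 <= index <= vector_count:
                raise ValueError(f"line {line_number}: index {index} out of range")
        if distance <= 0.0:
            raise ValueError(f"line {line_number}: distance must be positive")
        constraints.append((first, second, distance))
    return constraints


example = "1 2 3.8\n  2   3  4.25\n"
parsed = parse_constraints(example, vector_count=3)
```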
Output:
- (Program
level:) Statistical analysis
of the vectors and their radius of gyration, i.e. the radial rms
deviation
from the vector center of mass.
- (Shell
level:)
Codebook vectors in
a PDB-formatted output file. The vector rms variabilities, representing
the precision of the codebook vectors, are written to the occupancy
fields
of the PDB-style atom entries. (Optional) Vector connectivities can be
written to a PSF
file
or a distance constraints file.
Notes:
-
Vector
connectivities in PSF format
can be visualized and edited as bond connections (together with the
atom-style PDB
entries of file2 and file3) using the molecular graphics program VMD. Simply overload
the
PSF file into the PDB file in the VMD 'Molecule' menu. Then
under the 'Mouse' menu select 'Add/Remove Bonds'. The edited
connectivity can then be saved later into a PSF file from the VMD
command console (assuming your molecule
is 'top'):
set sel [atomselect top all]
$sel writepsf my.psf
|
- If there
are
cluster size deviations
from the expected value (default: 8) when using the TRN
algorithm, refine the found vector positions by passing them to qvol as
input file of a second, LBG
run.
- Distance
constraints do not determine
the chirality (handedness) of vector connections. If you encounter
mirror
images or otherwise flipped connections after running qvol compared to
connections determined with qpdb, you need to experiment with the
indexing
of your constraints. The LBG method combined with the SHAKE constraint
algorithm is relatively insensitive to the position of start vectors.
|
| vol2pdb
- Create a PDB from a
Volumetric Map
Purpose:
The vol2pdb
utility allows one to encode positive density values of a 3D map into a
PDB file with the densities written to the PDB occupancy column. This
is useful for colores and colacor,
both of which require a PDB and a map as
input parameters.
Usage (at
shell prompt):
| ./vol2pdb
file1 file2
file1:
inputfile 1, Situs format
file2:
outputfile, PDB format
|
Input at
program prompt:
None.
Output:
PDB format file
with densities written to the occupancy field (if rescaling is necessary,
the conversion factor is reported by the program).
|
| voldiff
- Discrepancy / Difference Mapping
Former name:
subtract
Purpose:
The voldiff
utility allows one to compute the difference density map (discrepancy
map)
of two volume data sets in Situs format. The input datasets can differ
in their parameters. If necessary the second input file is resampled to
the grid of the first input file.
Usage (at
shell prompt):
| ./voldiff file1 file2 file3
file1:
inputfile 1, Situs format
file2:
inputfile 2, Situs format
file3:
outputfile, Situs format
|
Input at
program prompt:
None.
Output:
Density
file in Situs format. The new density values are computed by
subtracting
the corresponding values of input grid 2 (which is resampled by
trilinear interpolation, if necessary) from those of input grid 1.
The output grid inherits the parameters from grid 1.
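The resampling and subtraction step can be sketched in Python as follows. This is an illustration under stated assumptions, not the voldiff source: grids are nested lists indexed as grid[z][y][x], each grid is described by an (x, y, z) origin and a cubic voxel spacing, and positions outside grid 2 are assumed to contribute a density of 0.

```python
import math


def trilinear(grid, x, y, z):
    """Interpolate grid[z][y][x] at fractional index coordinates (0.0 outside)."""
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    if not (0 <= x <= nx - 1 and 0 <= y <= ny - 1 and 0 <= z <= nz - 1):
        return 0.0
    x0, y0, z0 = math.floor(x), math.floor(y), math.floor(z)
    fx, fy, fz = x - x0, y - y0, z - z0
    value = 0.0
    for dz, wz in ((0, 1.0 - fz), (1, fz)):
        for dy, wy in ((0, 1.0 - fy), (1, fy)):
            for dx, wx in ((0, 1.0 - fx), (1, fx)):
                # Clamp so that points exactly on the upper face stay in range.
                corner = grid[min(z0 + dz, nz - 1)][min(y0 + dy, ny - 1)][min(x0 + dx, nx - 1)]
                value += wz * wy * wx * corner
    return value


def difference_map(grid1, origin1, spacing1, grid2, origin2, spacing2):
    """Return grid1 minus grid2, with grid2 resampled onto the lattice of grid1."""
    result = []
    for k, plane in enumerate(grid1):
        z = (origin1[2] + k * spacing1 - origin2[2]) / spacing2
        new_plane = []
        for j, row in enumerate(plane):
            y = (origin1[1] + j * spacing1 - origin2[1]) / spacing2
            new_row = []
            for i, density in enumerate(row):
                x = (origin1[0] + i * spacing1 - origin2[0]) / spacing2
                new_row.append(density - trilinear(grid2, x, y, z))
            new_plane.append(new_row)
        result.append(new_plane)
    return result


# Grid 2 holds the linear function x + 2y + 4z on a 2x2x2 lattice.
grid2 = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]
grid1 = [[[10.0]]]
diff = difference_map(grid1, (0.5, 0.5, 0.5), 1.0, grid2, (0.0, 0.0, 0.0), 1.0)
```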
|
voledit
- Inspecting and Editing 3D Maps
Supersedes
former programs: volslice, floodfill, volpad (padup), volcrop
(pindown), volvoxl (interpolate)
Purpose:
Cross
sections
of the density
data in the (x,y)-, (y,z)-, or (z,x)-planes can be inspected with the
simple terminal window graphics program voledit. The
utility
can also be used to write individual 2D slices or 3D volumes to files.
Volumes can be edited by cropping, zero padding, polygon clipping, and
segmentation (specified under options).
Usage (at
shell prompt):
| ./voledit file1
file1: inputfile, Situs format
|
Interactive input at
program prompt (also suitable for automation):
- Type of cross
section, (x,y),
(y,z), or (z,x).
- Threshold
(cutoff) value for
the rendering of the density (options).
- z, x, or y
position of the cross
section plane (grid units).
- Display voxel step (for display of
larger maps).
- Polygon
clipping
parameters and
vertices (options).
- Cropping
parameters in voxel units (options).
- Zero padding
in voxel units (options).
- Segmentation
parameters to extract a targeted
contiguous volume.
Originating
from the vicinity of a given start voxel, voledit recursively finds
the
maximum contiguous volume formed by neighboring voxels that exceed a
given threshold density level. An additional layer is added for
aesthetic
reasons
to facilitate isocontouring near
the cutoff
level. Although the extracted grid contains some voxels (in the contour
layer) with densities below the cutoff, all voxels with density values
above the cutoff are guaranteed to be part of the found contiguous
volume.
Voxels outside the contour layer are assigned a density value
of
0.
- File name for
2D slice or 3D
volume output
file (options).
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
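The segmentation option described above can be illustrated with a short flood-fill sketch in Python. Several details are assumptions rather than documented voledit behavior: the search is iterative instead of recursive, voxels are connected through their six face neighbors, the start voxel itself must exceed the threshold, and the grid is a nested list indexed as grid[z][y][x].

```python
from collections import deque

FACE_NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def segment(grid, start, threshold):
    """Keep the contiguous above-threshold volume around start plus one contour layer."""
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    output = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    if grid[start[0]][start[1]][start[2]] <= threshold:
        return output  # nothing to extract at this start voxel
    region, contour = {start}, set()
    queue = deque([start])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in FACE_NEIGHBORS:
            neighbor = (z + dz, y + dy, x + dx)
            nz_, ny_, nx_ = neighbor
            if not (0 <= nz_ < nz and 0 <= ny_ < ny and 0 <= nx_ < nx):
                continue
            if neighbor in region:
                continue
            if grid[nz_][ny_][nx_] > threshold:
                region.add(neighbor)
                queue.append(neighbor)
            else:
                contour.add(neighbor)  # one extra layer for isocontouring
    for z, y, x in region | contour:
        output[z][y][x] = grid[z][y][x]
    return output


line = [[[0.9, 0.2, 0.0, 0.1, 0.8, 0.9, 0.3]]]
extracted = segment(line, (0, 0, 4), threshold=0.5)
```

In the example, the isolated voxel with density 0.9 at the left end is discarded because it is not connected to the start voxel.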
Output:
(Shell window:)
Cross section
of the input map in Situs format. Larger maps are rescaled by the voxel
step parameter that can be set under options. Pairs of displayed voxels
neighboring
in the vertical direction are represented by a single character:
'^' if
the upper voxel
density exceeds a threshold (cutoff) level,
'u', if the lower
voxel density
exceeds the threshold level,
'0' if both upper
and lower voxel
densities exceed the threshold level, and
' ', if the
densities are below
the threshold.
(2D Output File:)
Voxel indices and
density of specified cross section.
(3D Output
File:) Density file in
Situs format.
The new grid inherits the voxel size (grid spacing) of the old grid.
The
number of x, y, and z increments, and the coordinates of voxel (1,1,1)
depend on the chosen editing options (cropping, padding, segmentation).
For example,
in segmentation when shrinking of the box is selected, the grid
dimensions are determined by the minimum box that contains
both
the contiguous volume plus one layer of neighboring voxels with density
values below the threshold.
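The character encoding of the shell window display can be reproduced with the following Python sketch. It assumes that the first row of each pair is the upper voxel row, that "exceeds" means strictly greater than the threshold, and that an unpaired last row is treated as if the missing lower row were below the threshold; the function name is hypothetical.

```python
def render_slice(rows, threshold):
    """Render a 2D slice; each character covers two vertically adjacent voxels."""
    lines = []
    for top in range(0, len(rows), 2):
        upper = rows[top]
        # Assumption: a missing lower row counts as below the threshold.
        lower = rows[top + 1] if top + 1 < len(rows) else [threshold] * len(upper)
        chars = []
        for up, low in zip(upper, lower):
            if up > threshold and low > threshold:
                chars.append("0")
            elif up > threshold:
                chars.append("^")
            elif low > threshold:
                chars.append("u")
            else:
                chars.append(" ")
        lines.append("".join(chars))
    return lines


picture = render_slice([[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 0]], threshold=0.5)
```

Packing two voxel rows into one text row roughly compensates for terminal characters being taller than they are wide, which is also why a fixed-width font is required.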
Notes:
This program
requires the use
of a fixed-width font in the shell window.
|
volhist
- Inspecting and Manipulating the Voxel Histogram
Former name:
histovox
Purpose:
The volhist
utility prints the
voxel histogram [Frank et al., 1991] of the density values. The
histogram
illustrates two general properties of low-resolution density
distributions.
First, a pronounced peak at low densities is due to background
scattering.
The protein density typically corresponds to a second, broader peak at
higher densities. When integrating the histogram "from the top down",
the known molecular volume of a protein can be used to compute its
boundary
density value. The volhist program also allows the user to add a
constant
value to the densities to shift the background density peak to the
origin,
and to rescale the densities.
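The "top down" integration can be sketched in Python as follows. This is an illustration of the idea rather than the volhist code: the function name is hypothetical, the voxels are assumed to be cubic with the given spacing in Angstrom, and the molecular volume is given in cubic Angstrom.

```python
def boundary_density(densities, voxel_spacing, molecular_volume):
    """Density level at which the highest voxels add up to the molecular volume."""
    voxel_volume = voxel_spacing ** 3
    wanted = round(molecular_volume / voxel_volume)  # number of voxels to enclose
    if not 1 <= wanted <= len(densities):
        raise ValueError("molecular volume does not fit into the map")
    ranked = sorted(densities, reverse=True)  # integrate from the top down
    return ranked[wanted - 1]


level = boundary_density([0.1, 0.9, 0.5, 0.3, 0.0, 0.7],
                         voxel_spacing=2.0, molecular_volume=24.0)
```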
Usage (at
shell prompt):
| ./volhist file1 [file2]
file1: inputfile, Situs format
file2: (optional) outputfile,
Situs format
|
Interactive input at
program prompt (if file2 specified):
- Offset
density
value (will be added
to all voxels).
- Scaling
factor.
- You can automate
this interactive program by "overloading" the standard input (if you
put expected
values in a script).
Output:
- (Program
level:) Voxel histogram
and fractional volume of volumetric data echoed to the screen. The
histogram
bars are normalized by the second highest density peak.
- (Shell
level:)
Density file in Situs
format (if specified). The new density values are computed by adding
the
offset value and by multiplying by the scaling factor entered at the
program
prompt. The new grid inherits all size and position parameters of the
old
grid.
|
| Header
File and Library Routines
The suite
of programs is supported by a header file (situs.h)
containing
user-defined parameters and by auxiliary library programs. The library
programs and their respective header files handle
input and output of atomic coordinates in PDB format (lib_pio.c), input
and output of volumetric data (lib_vio.c), input of data at the prompt
(lib_std.c), eigenvector computation for real symmetric 3x3 matrices
(lib_jac.c),
Euler angle generation (lib_eul.c), random number generation
(lib_rnd.c), array management (lib_vec.c), Powell optimization
(lib_pow.c),
map manipulation (lib_vwk.c), PDB manipulation (lib_pwk.c), matchpoint
support (lib_mpt.cpp), symmetric
multiprocessing (lib_smp.c), and timing (lib_tim.c).
|
|