|
User Guide
|
| This guide is intended to be used as a reference manual. You may also want to follow the simple
steps described in the tutorials, which give usage examples of the most important utilities.
More documentation is also available on the Methodology page and in the published articles. |
|
Content:
colacor - Cross-Correlation Calculation and Refinement of Manual Docking
colores - FFT-Accelerated 6D Exhaustive Search
eul2pdb - Graphical Representation of Euler Angles
map2map - Format Conversion
pdb2sax - Create a Simulated SAXS Bead Model from a PDB
pdb2vol - Create a Volumetric Map from a PDB
pdbsymm - Symmetry Builder
qdock - Rigid-Body Docking of High- and Low-Resolution Data
qpdb - Vector Quantization of a PDB
qrange - Automatic Vector Quantization and Rigid-Body Docking
qvol - Vector Quantization of Volumetric Map
vol2pdb - Create a PDB from a Volumetric Map
volcube - Creating Isocontour Surfaces
voldiff - Discrepancy / Difference Mapping
voledit - Inspecting 2D Cross Sections of Density Values
volhist - Inspecting and Shifting the Voxel Histogram
volvoxl - Grid Interpolation
Header File and Library Routines
|
| colacor - Cross-Correlation Calculation and Refinement of Manual Docking
Purpose:
colacor = Combined Off-LAttice COrrelation and Refinement. Since colores centers maps and
structures automatically, a special tool is needed to compute the cross-correlation coefficient
between a density map and a (resolution-lowered) atomic structure at a given, user-defined
geometry. As an additional option, colacor also performs a single run of off-lattice Powell
optimization that refines (as a rigid body) a manually docked fit to the nearest maximum of the
cross-correlation.
Basic usage (at shell prompt):
| ./colacor <Situs density map> <PDB structure> -res <number> -cutoff <number> |
The basic input parameters are:
<Situs density map> Low-resolution map in Situs format. To convert your EM map to Situs format, use
the map2map utility.
<PDB structure> Atomic structure in PDB format.
-res <value> Estimated resolution of the density map in Å. [default -res 15.0]
-cutoff <value> Density map threshold value. All density levels below this value are set to zero.
You can use volhist to rescale the density levels or to shift the background peak in the voxel
histogram to the origin. [default -cutoff 0.0]
-corr <number> Two options are implemented:
-corr 0 Standard linear cross-correlation: the scalar product between the density maps of the
low-resolution map and the low-pass filtered atomic structure. For resolutions > 10Å this criterion
is less discriminative. Note that this option needs to be turned on explicitly.
-corr 1 [default] A Laplacian filter is applied by default to maximize the fitting contrast. This
is the recommended docking criterion for low-resolution docking (up to 25Å resolution). To make the
algorithm more robust when dealing with cropped or thresholded experimental data, a mask filters
out hard artificial surfaces. Due to the masking and filtering, expect overall smaller correlation
values than with -corr 0.
More advanced options (at shell prompt):
-ani <value> Defines the resolution anisotropy factor (z direction vs. x,y plane) [default: -ani
1.0]. Allows one to set a different resolution in the z direction than in the x,y plane. E.g.,
"-ani 1.5 -res 20" specifies a 30 Å resolution in the z direction and a 20 Å resolution in x,y.
This is useful for membrane protein or tomography reconstructions that have a reduced resolution
in the z direction.
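The -ani arithmetic above can be written out as a small Python sketch. The function name is hypothetical and not part of Situs; it only assumes, as in the "-ani 1.5 -res 20" example, that the z resolution is the -res value multiplied by the -ani factor.

```python
def effective_resolutions(res, ani=1.0):
    """Return (xy_resolution, z_resolution) in Angstrom.

    Assumption taken from the example in the text: the z resolution
    equals the -res value multiplied by the -ani factor.
    """
    if res <= 0 or ani <= 0:
        raise ValueError("res and ani must be positive")
    return res, res * ani


if __name__ == "__main__":
    xy, z = effective_resolutions(res=20.0, ani=1.5)
    print(f"x,y resolution: {xy:.1f} A, z resolution: {z:.1f} A")
```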
-nopowell This flag skips the Powell optimization, so that only the cross-correlation coefficient
is computed. By default the Powell optimization is turned on.
-pwti <value number> Tolerance and maximum number of iterations of the Powell algorithm. These two
parameters control the convergence of the optimization. By default the tolerance is set to 1e-6
and the maximum number of iterations is limited to 25.
-pwdr <value value> Initial gradient of the translational and rotational search in the Powell
optimization. By default the initial rotational gradient is set to 3.5 degrees. The rotational
gradient cannot be larger than 10 degrees; if a larger value is chosen, that value is ignored and
the gradient is set to 10 degrees. The initial translational gradient is set to 25% of the voxel
spacing. To keep the default for only one of the two gradients, enter a negative number for the
parameter that should stay at its default and your chosen value for the other.
-pwcorr <number> This option sets the Powell correlation algorithm. By default, the fastest
algorithm which reproduces the standard cross-correlation coefficient to within the Powell
tolerance is determined at runtime.
-pwcorr 0 Determined at runtime [default]
-pwcorr 1 Standard three-step code
-pwcorr 2 Three-step code with mask applied
-pwcorr 3 One-step code for small probe structures
Input at program prompt:
None.
Output:
(Shell window:) The cross-correlation coefficient and other useful information about the structure
and map used. Depending on the overlap of structure and map, a good fit may have values up to 0.5
with the -corr 1 setting (default Laplacian filter) and up to 0.9 with -corr 0 (standard
correlation). These values are smaller if the structure does not account for the entire map
density.
|
| colores - FFT-Accelerated 6D Exhaustive Search
Purpose:
colores = COrrelation based LOw RESolution docking. A general-purpose, multi-processor capable
rigid-body search tool, suitable for situations where not all density is accounted for by the
atomic structure. The translational search is FFT-accelerated, and the program supports a
Laplacian-style filter (which can be turned off) that drastically increases the fitting contrast
at medium to low resolution (for more information see the corresponding methods page). This tool
performs a full exhaustive search in the 6D search space (3 translational and 3 rotational degrees
of freedom) within a couple of minutes of compute time, depending on the system size and the
number of processors used.
Basic usage (at shell prompt):
./colores <Situs density map> <PDB structure> -res <number> -cutoff <number> -deg <number> -nprocs <number>
|
The basic input parameters are:
<Situs density map> Low-resolution map in Situs format. To convert your EM map to Situs format, use
the map2map utility.
<PDB structure> Atomic structure in PDB format.
-nprocs <number> This option sets the number of processors used for the on-lattice 6D search and
the off-lattice Powell optimization. colores supports systems with multiple, dual-core, and/or
hyperthreaded processors. For example, on a dual-processor Intel Xeon machine with Hyperthreading,
'-nprocs 4' produces the shortest runtimes (two processors times two threads). The same holds for
a dual-core Pentium 4 with Hyperthreading. On the other hand, '-nprocs 2' is the optimal value for
a dual-processor AMD Opteron system. Note that on some architectures the threads may not show in
the UNIX "top" window; if in doubt, do your own timing comparison. [default -nprocs 1]
-res <value> Estimated resolution of your density map in Å. [default -res 15.0]
-cutoff <value> Density map cutoff value. All density levels below this value are set to zero. You
can use volhist to rescale the density levels or to shift the background peak in the voxel
histogram to the origin. [default -cutoff 0.0]
-deg <value> Angular sampling of the rotational search space in degrees. For typical electron
microscopy maps the angular step size should be between 6° and 20°. [default -deg 15.0]
-corr <number> This option controls the fitting criterion. Two options are implemented:
-corr 0 Standard linear cross-correlation: the scalar product between the density maps of the
low-resolution map and the low-pass filtered atomic structure. For resolutions > 10Å this criterion
is less discriminative. Note that this option needs to be turned on explicitly.
-corr 1 [default] A Laplacian filter is applied by default. This somewhat reduces the expected
correlation values, but the fitting is more precise for low-resolution maps.
More advanced options (at shell prompt):
-ani <value> Defines the resolution anisotropy factor (z direction vs. x,y plane) [default: -ani
1.0]. Allows one to set a different resolution in the z direction than in the x,y plane. E.g.,
"-ani 1.5 -res 20" specifies a 30 Å resolution in the z direction and a 20 Å resolution in x,y.
This is useful for membrane protein or tomography reconstructions that have a reduced resolution
in the z direction.
-erang <value value value value value value> Defines the rotational space limits as ranges of the
Euler angles (psi, theta, phi). By default the entire rotational space is considered [default:
-erang 0 360 0 180 0 360]. The Euler angle range is not limited to these standard intervals, so
you can also specify negative values (within certain limits), but in any event all colores output
angles are remapped to the standard intervals. For example, to perform a fine search with 2°
angular sampling in only one of the Euler angles, use these options:
| ./colores em.sit atoms.pdb -res 9.0 -cutoff 1.0 -deg 2 -erang 0 360 0 0 0 0 |
-euler <number> There are three ways to generate an exhaustive list of Euler angles that covers
the rotational space (nearly) uniformly for a given angular sampling (option -deg). The
proportional method yields very even results and also performs well for smaller intervals
specified via '-erang'. The pole sparsing method is widely used but yields slightly less uniform
distributions. The spiral method also produces a less uniform distribution, but for medium to low
resolution docking it is quite reasonable. The Euler angles are saved to a file col_eulers.dat;
you can edit this file and reload it using the -euler 3 option. This way, you can also load any
manually generated Euler angle file.
-euler 0 Proportional method [default]
-euler 1 Pole sparsing method
-euler 2 Spiral method
-euler 3 <filename> Input file
Here is an example generating the Euler angle distribution with 8° angular sampling using the
spiral method:
./colores em.sit atoms.pdb -res 9.0 -cutoff 1.0 -deg 8 -euler 2
|
-peak <number> This option sets the peak search algorithm. By default, an advanced filtering-based
approach is used. The older, sorting-based algorithm is available as an option.
-peak 0 Original peak search via sort
-peak 1 New peak search via filtering [default]
-explor <number> Controls the number of best fits found in the 6D on-lattice search that are
subsequently refined by Powell optimization. By default an off-lattice optimization is applied to
the 10 best on-lattice docking results. [default -explor 10]
-sizef <value> To prevent the atomic structure from being placed outside the target map during
rotation, the low-resolution map is enlarged by a margin of width sizef times the maximum radius
of the atomic structure. Ideally, sizef should be set to 1.0, but in practice a value as low as 0.2
seems to work fine and saves compute time. [default -sizef 0.2]
-nopowell This flag skips the Powell optimization, so that only the on-lattice search is
performed. By default the Powell optimization is turned on.
-pwti <value number> Tolerance and maximum number of iterations of the Powell algorithm. These two
parameters control the convergence of the optimization. By default, the tolerance is set to 1e-6
and the maximum number of iterations is limited to 25.
-pwdr <value value> Initial gradient of the translational and rotational search in the Powell
optimization. By default the initial rotational gradient is set to 25% of the angular sampling
(but not larger than 10°), and the initial translational gradient is set to 25% of the voxel
spacing. To keep the default for only one of the two gradients, enter a negative number for the
parameter that should stay at its default and your chosen value for the other.
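The default rule for -pwdr can be illustrated with a minimal Python sketch. It is not the colores source code: the function name is hypothetical, and the argument order (translational value first, rotational value second) is an assumption based on the wording "translational and rotational" above.

```python
def powell_initial_gradients(deg, voxel_spacing, trans=-1.0, rot=-1.0):
    """Return (translational, rotational) initial gradients.

    deg           -- angular sampling in degrees (option -deg)
    voxel_spacing -- voxel spacing of the map in Angstrom
    trans, rot    -- user values; a negative number means "use the default",
                     mirroring the -pwdr convention described in the text
    """
    # Default rotational gradient: 25% of the angular sampling, capped at 10 degrees.
    default_rot = min(0.25 * deg, 10.0)
    # Default translational gradient: 25% of the voxel spacing.
    default_trans = 0.25 * voxel_spacing
    rot_out = default_rot if rot < 0 else rot
    trans_out = default_trans if trans < 0 else trans
    return trans_out, rot_out


if __name__ == "__main__":
    # Keep the default rotational gradient, override the translational one.
    print(powell_initial_gradients(deg=15.0, voxel_spacing=4.0, trans=2.0))
```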
-pwcorr <number> This option sets the Powell correlation algorithm. By default, the fastest
algorithm which reproduces the standard cross-correlation coefficient to within the Powell
tolerance is determined at runtime.
-pwcorr 0 Determined at runtime [default]
-pwcorr 1 Standard three-step code
-pwcorr 2 Three-step code with mask applied
-pwcorr 3 One-step code for small probe structures
-nopeaksharp This flag skips the peak sharpness estimation procedure in order to save processing
time. By default, the peak sharpness estimation is turned on.
Input at program prompt:
None.
Output:
Here is a brief description of the output files (see also the file headers):
col_best*.pdb The atomic coordinates, in PDB format, of the best fits found in the search. The
total number of best fits saved is controlled by the option -explor, but only non-degenerate fits
are returned, so the number may be smaller than specified by the -explor option. The PDB header
contains information about the docking (sampling, fit criteria used, correlation values, position
and orientation, etc.). It also includes a table containing the angular variability of the
correlation about the fit.
col_rotate.log This file contains the best translational fit (on-lattice) found for each rotation.
The first 3 columns are the Euler angles (in degrees), the next 3 columns are the translational
coordinates that gave the highest correlation value, followed by the correlation value (not
normalized).
col_powell.log This file contains information about the Powell off-lattice search performed for
the best fits from the 6D lattice search. As before, rotational and translational coordinates
correspond to the first 6 columns, but note that the Euler angles are in radians.
col_trans.sit The on-lattice translation function in Situs format. Since the translational search
space corresponds to the input map lattice, a map can be generated in which the density values are
the correlation values normalized by the maximum. This is a Situs format map, so you can use VMD
to display it or map2map to convert it to other formats.
col_trans.log Same as col_trans.sit, but instead of a map, the translational correlation values
are stored in a regular file. Each row corresponds to a lattice coordinate (columns 4, 5, 6) and
shows the Euler angles (columns 1, 2, 3, in degrees) that exhibit the highest correlation value
(column 7).
col_eulers.dat This file contains the list of uniform Euler angle triplets that defines the
rotational search space. You can load such a file by using the option -euler 3. You can also
inspect this file with the eul2pdb tool.
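The column layout described above for col_rotate.log (three Euler angles, three translational coordinates, one correlation value) can be read with a short Python sketch. This is only an illustration: it assumes whitespace-separated numeric columns and silently skips any line that does not parse (for example header lines); the function and class names are hypothetical, and the angle order (psi, theta, phi) is assumed from the -erang description.

```python
from typing import List, NamedTuple, Tuple


class RotationFit(NamedTuple):
    euler_deg: Tuple[float, float, float]    # assumed order: psi, theta, phi (degrees)
    translation: Tuple[float, float, float]  # translational coordinates
    correlation: float                       # correlation value (not normalized)


def parse_rotate_log(lines) -> List[RotationFit]:
    """Parse 7-column rows; lines that are not 7 numbers are skipped."""
    fits = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue
        try:
            values = [float(f) for f in fields[:7]]
        except ValueError:
            continue  # assumed to be a header or comment line
        fits.append(RotationFit(tuple(values[0:3]), tuple(values[3:6]), values[6]))
    return fits


def best_fit(fits: List[RotationFit]) -> RotationFit:
    """Return the row with the highest correlation value."""
    return max(fits, key=lambda fit: fit.correlation)


if __name__ == "__main__":
    # Made-up rows for illustration only, not real colores output.
    sample = [
        "0.0 30.0 60.0 1.0 2.0 3.0 0.25",
        "90.0 45.0 10.0 4.0 5.0 6.0 0.75",
    ]
    print(best_fit(parse_rotate_log(sample)))
```

For a real run one would pass the lines of the log file, e.g. `parse_rotate_log(open("col_rotate.log"))`.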
Notes:
- Depending on the overlap of structure and map, a good fit may have correlation values up to 0.5
with the -corr 1 setting (default Laplacian filter) and up to 0.9 with -corr 0 (standard
correlation). These correlation values are smaller if the structure does not account for the
entire map density.
- The time estimation gives a quite accurate estimate of the on-lattice 6D search time. However,
the subsequent off-lattice Powell optimization is not considered in the estimation and depends on
the -explor number. You can balance the precision of the search (e.g. angular sampling, option
-deg) against the compute expense. Remember that for medium to low resolution maps, angular
sampling steps of < 6° are not particularly useful.
- You can also save time if you use only the carbon alpha or backbone atoms of the input
structure. Typically, at low resolution the docking does not depend sensitively on the level of
detail in the atomic structure. Take into account that the exhaustive 6D search scales
approximately as N log N with the voxel number N in your map.
- If you have a large map, you can try to crop the data to a region of interest (e.g. the
asymmetric unit). Allow sufficient room for the probe structure, since you do not know its exact
location a priori.
- This is a rigid-body search. If you expect large, induced-fit conformational changes, you can
dissect your atomic structure into rigid-body domains and perform the docking with each of them
individually. Alternatively, you may want to try our flexible fitting strategies.
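To get a feeling for the approximate N log N scaling mentioned in the notes above, the following Python sketch estimates how much search cost is saved by cropping a map. It is a back-of-the-envelope model only: the function name is hypothetical, and constant factors as well as the Powell refinement are ignored.

```python
import math


def relative_search_cost(n_voxels_full, n_voxels_cropped):
    """Ratio (cropped / full) of an idealized N log N search cost."""
    if n_voxels_full <= 1 or n_voxels_cropped <= 1:
        raise ValueError("voxel counts must be larger than 1")

    def cost(n):
        return n * math.log(n)

    return cost(n_voxels_cropped) / cost(n_voxels_full)


if __name__ == "__main__":
    # Cropping a 128^3 map to 64^3 voxels.
    ratio = relative_search_cost(128 ** 3, 64 ** 3)
    print(f"cropped search needs about {100 * ratio:.1f}% of the full cost")
```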
|
| eul2pdb - Graphical Representation of Euler Angles
Purpose:
The eul2pdb utility generates a graphical representation of a set of Euler angles resulting from a
colores run. The eul2pdb program writes a pseudo-atomic structure to a PDB-formatted file in which
the set of Euler angles is represented as a set of points on a sphere of 10 Å radius. This file
can then be inspected with a visualization program (for example VMD).
Usage (at shell prompt):
./eul2pdb col_eulers.dat out_file
out_file: output file, PDB format
|
Input at program prompt:
None.
Output:
Pseudo-atomic structure in PDB format. Each triplet of Euler angles in the input file is
represented as a point on a sphere of 10 Å radius. The phi Euler angle (rotation in the projection
plane) is encoded in the B-factor column of the PDB file (in radians), whereas theta and psi
correspond to longitude and latitude on the sphere.
|
| map2map - Format Conversion
Former names: convert, conformat
Purpose:
Volumetric density data is converted to a Situs-specific format on a cubic lattice. This allows
Situs programs to keep track of coordinate systems, and it makes the core Situs programs
independent of the ever-changing map format standards.
The map2map utility reads many file formats used by standard EM application software. These
include the MRC, SPIDER, and CCP4 formats, as well as similar 4-byte floating-point binary formats
(automatic byte-order adjustment). X-PLOR maps in ASCII format and ASCII files that contain a
sequence of density values in free format are also recognized. The reverse conversion of Situs
format files to CCP4, MRC, X-PLOR, and SPIDER format is also supported to a limited extent, to
facilitate the visualization of Situs-generated maps.
Usage (at shell prompt):
| ./map2map file1 file2
file1: inputfile
file2: outputfile
|
Interactive input at program prompt (for automation see below):
- Input file format.
- Number of x, y, and z increments (columns, rows, and sections).
- Voxel size (grid spacing).
- Order of x, y, and z increments in the input file.
Output:
Density file in selected format. The order of values in the sequence of densities is altered, if
necessary, such that x increments change fastest and z increments change slowest. In Situs format,
a short header holds the voxel size and the numbers of x, y, and z increments, as well as the 3D
coordinates of the first voxel (x=1, y=1, z=1). The default origin of the map coordinate system in
Situs is the origin of the unit cell (if the input file is in X-PLOR, MRC, or CCP4 format) or
voxel (1,1,1) (all other input formats). The Situs header is followed by the sequence of data
values. The converted Situs files are in ASCII format, allowing the user to verify the successful
conversion of the data. If a SPIDER, MRC, CCP4, or X-PLOR map is created, the unit cell spans the
space from the coordinate system origin to the maximum extent of the input Situs map.
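The ordering rule described above (x increments change fastest, z increments slowest) corresponds to the following index arithmetic, shown here as a Python sketch. The function names are hypothetical, and the sketch uses 0-based indices, so the voxel called (1,1,1) in the text is (0,0,0) here.

```python
def voxel_index(ix, iy, iz, nx, ny, nz):
    """Position of voxel (ix, iy, iz) in the flat density sequence.

    0-based indices; x changes fastest and z slowest.
    """
    if not (0 <= ix < nx and 0 <= iy < ny and 0 <= iz < nz):
        raise IndexError("voxel outside the map")
    return ix + nx * (iy + ny * iz)


def voxel_position(ix, iy, iz, first_voxel, voxel_size):
    """Cartesian position of a voxel on a cubic lattice.

    first_voxel -- coordinates of the first voxel, as stored in the header
    voxel_size  -- grid spacing, as stored in the header
    """
    ox, oy, oz = first_voxel
    return (ox + ix * voxel_size, oy + iy * voxel_size, oz + iz * voxel_size)


if __name__ == "__main__":
    nx, ny, nz = 4, 3, 2
    print(voxel_index(3, 2, 1, nx, ny, nz))  # last voxel of a 4 x 3 x 2 map
    print(voxel_position(2, 0, 1, (10.0, 0.0, -5.0), 2.0))
```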
Notes and Known Bugs:
- You can automate this program by "overloading" the standard input if you put expected values in a
script!
- You should always check out the free conversion programs MAPMAN, part of Gerard Kleywegt's RAVE
package, and especially Image Science's em2em program.
- The MRC and CCP4 map formats are not fully supported, as many flavors are available. If you
encounter problems, we recommend that you use the above third-party conversion tools. It is
generally safe to use the SPIDER format, which is fully supported by map2map, so if in doubt
convert to SPIDER first. Known problems: Some users have reported a mixup of axes / scrambling of
densities with some particular maps. Very old CCP4 maps may correspond to the format currently
denoted "MRC", whereas future MRC maps may correspond to the current "CCP4" format. For a history
and future plans of this development, read Stephen Fuller's page at EMBL.
- Avoid round-trip conversions "EM map format" -> Situs -> "EM map format", because information
about the unit cell extent beyond that of the actual range of data is lost. Also, Situs lattices
are cubic, so any skewed unit cell parameters (alpha, beta, gamma, etc.) of the original map are
not saved.
|
| pdb2sax - Create a Simulated Bead Model from a PDB
Purpose:
The pdb2sax utility allows one to fill an input atomic structure with close-packed spheres on a
hexagonal lattice. It allows one to create simulated bead models for validating Situs modeling
applications.
Usage (at shell prompt):
| ./pdb2sax file1 file2 radius
file1: inputfile, PDB format
file2: outputfile (bead model), PDB format
radius: bead radius in Angstrom
|
Interactive input at program prompt (also suitable for automation):
- Choice of atom mass-weighting and B-factor cutoff level. Atoms with B-factors above the cutoff
level will be ignored.
- You can automate this program by "overloading" the standard input if you put expected values in a
script!
Output:
PDB file that contains the centers of the simulated beads, with the radii in the occupancy column.
This file can then be either inspected with a visualization program (for example VMD) or processed
into a bead volume map using pdb2vol.
|
pdb2vol - Create a Volumetric Map from a PDB
Former name: pdblur
Purpose:
The pdb2vol utility is a real-space convolution tool. It allows one to lower the resolution of an
atomic structure to a user-specified value, or to create a bead model from atomic coordinates. The
structure is first projected onto a cubic lattice by trilinear interpolation. Subsequently, each
lattice point is convolved with one of five supported kernel (point-spread) functions.
Usage (at shell prompt):
| ./pdb2vol file1 file2
file1: inputfile, PDB format
file2: outputfile, Situs format
|
Interactive input at program prompt (also suitable for automation):
- If water, hydrogen, or codebook vector atoms are present, choice of ignoring them.
- Choice of atom mass-weighting and B-factor cutoff level. Atoms with B-factors above the cutoff
level will be ignored.
- Desired voxel spacing for output map.
- Kernel width, defined by either the kernel half-max radius r-half (enter positive value) or by
the target resolution of the output map (enter value of resolution as negative number). The
standard deviation (sigma) of the kernel is assumed to be half the target resolution.
- Type of smoothing kernel:
  - Gaussian, exp(-1.5 r^2 / sigma^2)
  - Triangular, max(0, 1 - 0.5 |r| / r-half)
  - Semi-Epanechnikov, max(0, 1 - 0.5 |r|^1.5 / r-half^1.5)
  - Epanechnikov, max(0, 1 - 0.5 r^2 / r-half^2)
  - Hard Sphere, max(0, 1 - 0.5 r^60 / r-half^60)
- Choice of correction for lattice smoothing (subtract the lattice projection mean-square
deviation from the kernel variance).
- The kernel amplitude at the kernel origin (r=0).
- You can automate this program by "overloading" the standard input if you put expected values in a
script!
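The five kernel shapes listed above can be written down directly. The Python sketch below is not the pdb2vol source code; it only evaluates the listed formulas with unit amplitude, and the function names are hypothetical. The Gaussian half-max radius is derived here by solving exp(-1.5 r^2 / sigma^2) = 0.5 for r, which is a step not spelled out in the text.

```python
import math

# Exponent p in max(0, 1 - 0.5 |r|^p / r_half^p) for the four non-Gaussian kernels.
KERNEL_EXPONENTS = {
    "triangular": 1.0,
    "semi-epanechnikov": 1.5,
    "epanechnikov": 2.0,
    "hard sphere": 60.0,
}


def sigma_from_resolution(resolution):
    """Sigma is assumed to be half the target resolution, as stated in the text."""
    return 0.5 * resolution


def gaussian_kernel(r, sigma):
    """Gaussian kernel exp(-1.5 r^2 / sigma^2) with unit amplitude."""
    return math.exp(-1.5 * r * r / (sigma * sigma))


def gaussian_half_max_radius(sigma):
    """Radius at which the Gaussian kernel drops to half its maximum."""
    return sigma * math.sqrt(math.log(2.0) / 1.5)


def power_kernel(r, r_half, name):
    """Evaluate one of the four finite-support kernels with unit amplitude."""
    p = KERNEL_EXPONENTS[name]
    return max(0.0, 1.0 - 0.5 * (abs(r) / r_half) ** p)


if __name__ == "__main__":
    sigma = sigma_from_resolution(20.0)
    print("Gaussian r-half for 20 A resolution:", gaussian_half_max_radius(sigma))
    for name in KERNEL_EXPONENTS:
        print(name, power_kernel(2.0, 3.0, name))
```

By construction every kernel equals 0.5 at r = r-half, which is what makes the half-max radius a kernel-independent measure of width.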
Output:
Density file in Situs format. The new grid follows the coordinate system origin convention of the
atomic structure and forms the smallest possible box that fully encloses the structure convolved
with the kernel.
|
pdbsymm - Symmetry Builder
Supersedes former program: hlxbuild
Purpose:
The pdbsymm tool generates multiple copies of the input structure according
to a user-specified symmetry. Currently supported symmetry types
include: C, D and helical. If an input map (optional) is also specified
and the map has square cross-sections, the x- and y-position of the
principal symmetry axis (for symmetries C and D) or of the helical axis
(for helical symmetry) is automatically set to the geometric center of
a cross-section (in the map coordinate system). If the map is cubic and
D symmetry is requested, the z-position of the secondary axes is set to
the geometric center of the y-z cross-section. This functionality is
useful if one wants to generate assemblies from subunits that had been
fitted to the input map earlier.
Usage (at shell prompt):
| ./pdbsymm file1 [file2] file3
file1: inputfile, PDB format
file2: (optional) inputfile for helical or symmetry axis, Situs format
file3: outputfile, PDB format
|
Interactive input at program prompt (also suitable for automation):
Depending on symmetry type:
- Helical rise per subunit (in z-direction).
- Angular twist per subunit (sign determines handedness).
- Desired number of subunits to be placed before file1 structure.
- Desired number of subunits to be placed after file1 structure.
- Order of the principal symmetry axis.
- [If file2 is unspecified: x- and y-position of helical axis (offset from file1 coordinate system
origin).]
- z-position of secondary symmetry axes (for D symmetry - offset from file1 coordinate system
origin).
- You can automate this program by "overloading" the standard input if you put expected values in a
script!
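For the helical case, the meaning of the rise, twist, and "before/after" inputs listed above can be sketched in Python as follows. This is an illustration rather than the pdbsymm implementation: the function names are hypothetical, the helical axis is assumed to be parallel to z, a positive twist is assumed to rotate counterclockwise when viewed down the z axis, and "before" subunits are assumed to correspond to negative copy numbers.

```python
import math


def helical_copy(coords, n, rise, twist_deg, axis_xy=(0.0, 0.0)):
    """Return copy number n of a list of (x, y, z) coordinates.

    The copy is rotated by n * twist_deg about an axis parallel to z that
    passes through axis_xy, and shifted by n * rise along z.
    """
    angle = math.radians(n * twist_deg)
    c, s = math.cos(angle), math.sin(angle)
    ax, ay = axis_xy
    copy = []
    for x, y, z in coords:
        dx, dy = x - ax, y - ay
        copy.append((ax + c * dx - s * dy, ay + s * dx + c * dy, z + n * rise))
    return copy


def build_helix(coords, rise, twist_deg, n_before, n_after, axis_xy=(0.0, 0.0)):
    """Return all copies, from n_before subunits before to n_after after the input."""
    return [
        helical_copy(coords, n, rise, twist_deg, axis_xy)
        for n in range(-n_before, n_after + 1)
    ]


if __name__ == "__main__":
    subunit = [(10.0, 0.0, 0.0)]  # a single made-up atom position
    for copy in build_helix(subunit, rise=5.0, twist_deg=90.0, n_before=1, n_after=2):
        print(copy)
```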
Output:
Symmetry PDB file containing multiple copies of input PDB file.
Note:
file2 input maps that don't have square (x,y) cross-sections or whose helical axis is not in the
geometric center of the square cross-sections must first be crop-edited with voledit.
|
| qdock - Rigid-Body Docking of High- and Low-Resolution Data
Purpose:
Specialized tool to dock a high-resolution structure to low-resolution data using a fixed number
of codebook vectors. Much of the functionality of qdock has been superseded by the combined vector
quantization / docking utility qrange and by the cross-correlation based colores. qdock remains
part of the distribution to support correlation-coefficient based docking. It is assumed that the
user has already determined a suitable number k of codebook vectors with qrange and has created
the vector files with qpdb and qvol.
Similar to qrange, qdock carries out an exhaustive search of the k! = k*(k-1)* ...*2 permutations
of corresponding codebook vectors and returns a list of best least-squares fits. The limitation of
qdock is that the docking can be carried out only for a fixed number k of vectors. The advantage
of qdock is that the results can be ranked (if desired) by the correlation coefficient that
measures the overlap of the high- and low-resolution data. A lower rms deviation (rmsd) of the
least-squares fit typically corresponds to a higher correlation coefficient. However, the
coefficients typically lie within a very narrow numeric range, and care must be taken because fits
based on the correlation coefficient alone are often ambiguous.
The user has the choice to change the origin of the map coordinate system. This option might be
helpful if a graphics program has a convention for defining the map coordinate system that differs
from that of the map2map utility.
Usage (at shell prompt):
./qdock file1 file2 file3 file4
file1: inputfile 1, PDB format, codebook vectors from qvol
file2: inputfile 2, Situs format, volumetric map corresponding to qvol vectors
file3: inputfile 3, PDB format, codebook vectors from qpdb
file4: inputfile 4, PDB format, high-resolution structure corresponding to qpdb vectors
|
Interactive input at program prompt (also suitable for automation):
- Choice of volumetric map coordinate system.
- Ranking by vector rmsd or by correlation coefficient (carbon alpha or full atom).
- Selection of a docked high-resolution structure from a list.
- Filename for the selected output structure.
- If desired, select and export additional results from the list.
- You can automate this program by "overloading" the standard input if you put expected values in a
script!
Output:
- (Program level:) Possible pairs of corresponding codebook vectors. The (default: 20) best
least-squares fits are ranked by their rmsd (in Angstrom) or by their correlation coefficient.
Permutations indicate the order of qpdb vectors fitted to qvol vectors. See the Situs manuscripts
for more information. Also, the positional and orientational accuracy of the fitting is given,
calculated based on the statistical vector variability.
- (Shell level:) The superposition of the selected corresponding codebook vectors defines a
rigid-body transformation which superimposes the high- and low-resolution data sets. After
transformation, the docked high-resolution structure is written to a file in PDB format. The file3
codebook vectors are sorted and transformed in concert with the structure; they are appended,
together with the file1 vectors, to the output PDB file.
|
| qpdb - Vector Quantization of a PDB
Purpose:
Specialized tool to perform a vector quantization of atomic resolution data. qpdb supports the
correlation-coefficient based docking with qdock, and also flexible docking. To enable
skeleton-based flexible docking with qvol, qpdb includes options to learn vector distances and to
export the Voronoi cells generated by the vectors.
In qpdb a small number of calculations (8 by default) are repeated with different random number
seeds. The averaged codebook vectors and their statistical variability are then written to the
output file.
Usage:
First, the user must determine a suitable number of codebook vectors, e.g. with qrange or by
visual inspection. For rigid-body docking, a small number (3-6) is usually sufficient. qpdb
employs an efficient mass-weighting scheme using "equally weighted input vectors". The program
also allows one to ignore flexible or poorly defined atoms with high crystallographic B-factors.
This option should only be chosen if there is an indication that parts of the protein are not
visible in the low-resolution data due to disorder.
Usage (at shell prompt):
| ./qpdb file1 file2
file1: inputfile (atomic structure), PDB format
file2: outputfile (codebook vectors), PDB format
|
Interactive input at program prompt (also suitable for automation):
- If water, hydrogen, or codebook vector atoms are present, choice of ignoring them.
- B-factor cutoff level. Atoms with B-factors above this level will be ignored.
- Number of codebook vectors.
- Choice of computing the vector connectivities (neighborhood relationships) with the Competitive
Hebb Rule (Wriggers et al., 1998) and writing them to a file.
- Choice of computing the Voronoi cells and writing them to a file.
- You can automate this program by "overloading" the standard input if you put expected values in a
script!
Output:
- (Program level:) The sphericity, a measure between 0 and 1 that characterizes how spherical the
shape of the structure (file1) is. After the vector quantization the program prints the average
rms variability of the codebook vectors in Angstrom. Also given in Angstrom is the radius of
gyration of the vectors.
- (Shell level:) Codebook vectors in PDB-formatted output file. The vector rms variabilities,
representing the precision of the codebook vectors, are written to the occupancy fields of the
PDB-style atom entries. The effective radii of the Voronoi cells are written to the B-factor
fields of the PDB-style atom entries. (Optional) Vector connectivities can be written to a PSF
file or a distance constraints file. Constraint file entries are triples of free-format values in
the order <index1> <index2> <distance in Angstrom>, where the indices correspond to the order of
vectors in file2, counting from 1. (Optional) The Voronoi cells can be written to a PDB file
consisting of the file1 atom entries, where the index of each corresponding vector is written to
the B-factor field of the output file.
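The distance constraints file described above (free-format triples <index1> <index2> <distance in Angstrom>, indices counting from 1) is simple to read back. The Python sketch below is a minimal reader under the assumption that each entry sits on its own line; the function names are hypothetical and lines that do not hold exactly three values are skipped.

```python
def parse_constraints(lines):
    """Return a list of (index1, index2, distance) tuples with 1-based indices."""
    constraints = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # not a constraint triple
        i, j, distance = int(fields[0]), int(fields[1]), float(fields[2])
        if i < 1 or j < 1:
            raise ValueError("vector indices count from 1")
        constraints.append((i, j, distance))
    return constraints


def neighbor_table(constraints):
    """Map each vector index to its connected neighbors and their distances."""
    table = {}
    for i, j, distance in constraints:
        table.setdefault(i, []).append((j, distance))
        table.setdefault(j, []).append((i, distance))
    return table


if __name__ == "__main__":
    # Made-up entries for illustration only, not real qpdb output.
    sample = ["1 2 12.5", "2 3 9.75"]
    print(neighbor_table(parse_constraints(sample)))
```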
|
| qrange
- Automatic Vector Quantization and Rigid-Body Docking
Purpose:
This program
automatically superimposes
atomic structures with corresponding low-resolution
density. Practically all density must be accounted for by the atomic
structure for this to work. If in doubt use colores.
You've been warned!
qrange
consolidates the functions
of the older vector
quantization routines qvol
and qpdb and of the rigid-body docking
routine qdock
into a single program to enable a more user-friendly fitting. The major
conceptual innovation of qrange and related programs is the
discretization
of the configurational search space by vector
quantization. This discretization yields a list of best-scoring superpositions of
the
codebook vectors that represent the structural data sets. The search is
computationally very efficient and can be carried out on a standard
workstation
within seconds.
The global
optimum of vector positions
is difficult to find. In qrange a small number of TRN
calculations
(8 by default) are repeated with different random number seeds. The
averaged
codebook vectors and their statistical variability are then saved. In
general,
a low vector variability indicates good convergence and reliable vector
positions. The variability depends on the shape of the 3D input density
distribution and on the number of vectors. Therefore, it is a good
selection
criterion for finding an optimal number k of codebook vectors.
To optimally
represent the input
density distribution, qrange employs a multi-resolution approach and
computes
vectors for a range of k (3 <= k <= 9 by default). Note that for
globular shapes a small number of vectors (3-4) is usually sufficient
to
encode the shape unambiguously. A maximum number of k=9 vectors ( = 27
degrees
of freedom) provides sufficient leeway to encode even the most complex
shapes.
After computing
the k vectors
for both high- and low resolution data, the subsequent docking then
determines
the six rigid-body degrees of freedom by a least-squares fit [Kabsch,
1976]
of the k pairs of vectors. The corresponding vectors are not known a
priori,
and all k! = k*(k-1)* ...*2 possible permutations are explored. The
program
saves a list of best least-squares fits (for each number k), ranked by
the remaining rms deviation (rmsd) after superposition of the vectors.
The ranking by codebook vector rms deviation typically produces a clear
prediction of the optimum docking configuration.
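To get a feeling for the size of the permutation search, the number of vector pairings k! can be tabulated for the default range of k. The following Python sketch only counts and enumerates pairings; it does not perform the least-squares fit, and the names pairing_counts and pairings_k3 are chosen for illustration.

```python
from itertools import permutations
from math import factorial


def pairing_counts(k_min=3, k_max=9):
    """Number of possible vector pairings k! for each k in the range."""
    return {k: factorial(k) for k in range(k_min, k_max + 1)}


counts = pairing_counts()
for k, n in counts.items():
    print(f"k = {k}: {n} permutations")

# For small k the pairings can be listed explicitly. Each tuple assigns
# low-resolution vector i to high-resolution vector perm[i].
pairings_k3 = list(permutations(range(3)))
print(pairings_k3)
```

Even at k = 9 the exhaustive search covers only a few hundred thousand pairings, which is consistent with the short run times quoted above.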
Usage:
In a practical
application, the
user selects the number k that produces the smallest statistical vector
variability. Alternatively, the user may also consider the number k
that
results in the smallest rmsd (the two selection criteria give the same
results in most cases). For a given k the user then selects the fit
with
the smallest vector rmsd from a list of (default: 20) best scoring
results.
This strategy was validated in detail by Wriggers
and
Birmanns; it identifies a clear winner in most situations (for
more
info read the manuscript). The positional and orientational accuracy of
the fitting is also estimated based on the statistical vector
variability.
Usage (at
shell prompt):
| ./qrange file1 file2
file1: inputfile 1, Situs format
file2: inputfile 2, PDB format
|
Interactive input at
program prompt (also suitable for automation):
- Choice of
utilities to inspect the
density distribution (e.g. voxel histogram).
- Threshold
(cutoff) density value.
- Choice of
volumetric map coordinate
system.
- If water
molecules are present, choice
of ignoring them.
- B-factor
cutoff
level. Atoms with
B-factors above this level will be ignored.
- Selection
of
the vector number k
from a list.
- Selection
of a
docked high-resolution
structure from a list.
- Filename
for
the selected output
structure.
- If desired,
select and export additional
results from the lists.
- You can automate this program by
"overloading" the standard input if you put expected values in a script!
Output:
- (Program
level:) The sphericity,
a measure between 0 and 1 that characterizes how spherical the shape of
the structure (file2) is. After the multiple vector quantizations the
program
prints the selection criteria (vector variability and vector rmsd) as a
function of k. Also shown are the possible pairings of corresponding
codebook
vectors. The best least-squares fits are ranked by their vector rmsd
(in
Angstrom). Also, the correlation coefficient is given. Permutations
indicate
the internal order of vectors. Finally, the positional and
orientational
accuracy of the selected fit is given, as computed from the statistical
vector variability.
- (Shell
level:)
The superposition
of the selected corresponding codebook vectors defines a rigid-body
transformation
which superimposes high- and low-resolution data sets. After
transformation,
the docked high-resolution structure is written to a file in PDB
format.
The codebook vectors are appended to this output file. QVOL atoms
represent
the low-resolution vectors and QPDB atoms represent the high resolution
vectors. The vector rms variabilities, representing the precision of
the
codebook vectors, are written to the occupancy fields of the PDB-style
atom entries.
Notes:
- qrange
employs
an efficient atom
mass-weighting scheme using "equally weighted input vectors".
- The user
has
the choice to change
the origin of the map coordinate system. This option might be helpful
if
a graphics program has a different convention for defining the map
coordinate
system than map2map.
- The program
also allows the user to ignore
flexible or poorly defined atoms with high crystallographic B-factors.
This option should only be chosen if there is an indication that parts
of the protein are not visible in the low-resolution data due to
disorder.
- Often there
is
no one-to-one correspondence
between low-resolution data and what one would consider the physical
density
of the structure, so users may need to experiment with the volumetric
and
B-factor threshold values until the results are satisfying. To help
estimate
appropriate threshold values for single molecules, qvol
and qpdb print the radius of
gyration of
the
codebook vectors, a measure of vector compactness. At the proper
threshold
densities the radius of gyration returned by qvol
should be approximately equal to that of qpdb.
|
| qvol
- Vector Quantization of Volumetric Map
Purpose:
Specialized tool
to perform a vector
quantization of low-resolution, single molecule data. For
rigid-body
docking, much of the functionality of qvol has been superseded by the
combined
vector quantization / docking utility qrange.
qvol
remains part of the distribution to support the correlation-coefficient
based docking with qdock, and to
support flexible
docking. In the absence of existing vector positions, qvol carries
out a global search using the TRN algorithm. If
start vectors are already known, the LBG local
search
algorithm is used instead of TRN. LBG allows distance
constraints to be added to the vector refinement, which is useful for flexible
docking.
The global
optimum of vector positions
is difficult to find. With TRN,
a small number
of calculations (8 by default) are repeated with different random
number
seeds. The averaged codebook vectors and their statistical variability
are then written to the output file. With LBG,
no statistical clustering is performed. In this case it is important to
specify reliable initial positions from a prior qvol run.
Usage:
In a practical
application of
qvol, one should extract from the volumetric data a region of interest
corresponding to a single molecule using e.g. voledit.
Next, the user
must determine
a suitable number of codebook vectors. Only densities above a
user-defined threshold
value are considered by qvol to eliminate background noise in the
low-resolution
data. Depending on the noise, this threshold value should be at 50-80%
of the level that is typically considered the "molecular surface" of
the
biopolymer in the low-resolution data.
New vector
positions are calculated
automatically with the TRN
method if no start
vectors
are specified. Subsequently, these vector positions can be refined in a
second qvol run with the LBG
method (this is
done
automatically if start vectors are
specified).
Also, any distance constraints can be read from a file or entered at
the
command prompt at this time.
Usage (at
shell prompt):
| ./qvol file1 [file2] file3
file1: inputfile, Situs format
file2: inputfile, start vectors,
PDB format (optional)
file3: outputfile, PDB format
|
Interactive input at
program prompt (also suitable for automation):
- Choice of
utilities to inspect the
density distribution (e.g. voxel histogram).
- Threshold
(cutoff) density value.
- Number of
codebook vectors.
- (If file2
is
specified): Choice of
entering distance constraints manually or from a file.
There are two constraint file options. Constraint
file entries generated e.g. with qpdb
are triples
of free-format values in the order
<index1>
<index2> <distance in Angstrom>, where the indices
correspond to
the order of vectors in file2, counting from 1. It is also possible to
read the connectivities from a PSF file in which case the missing
distances are computed from file2.
- Choice of
computing the vector connectivities
(neighborhood relationships) with the Competitive Hebb Rule (Wriggers
et al., 1998) and writing them to a file.
- You can automate
this program by "overloading" the standard input if you put expected
values in a script!
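When connectivities are read without distances, the missing distances follow from the start vector coordinates. The Python sketch below illustrates this step under the assumption that the connectivities are available as pairs of 1-based indices and the start vectors as a list of (x, y, z) tuples in file order. The function name constraints_from_bonds is hypothetical and no PSF parsing is attempted.

```python
import math


def constraints_from_bonds(coords, bonds):
    """Return (index1, index2, distance) triples for connected vector pairs.

    coords: list of (x, y, z) tuples in Angstrom, in the order of file2.
    bonds:  list of (index1, index2) pairs, counting from 1.
    """
    triples = []
    for i, j in bonds:
        # Convert the 1-based file indices to 0-based list positions.
        distance = math.dist(coords[i - 1], coords[j - 1])
        triples.append((i, j, distance))
    return triples


coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
for i, j, d in constraints_from_bonds(coords, [(1, 2), (2, 3)]):
    print(i, j, f"{d:.3f}")
```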
Output:
- (Program
level:) Statistical analysis
of the vectors and their radius of gyration, i.e. the radial rms
deviation
from the vector center of mass.
- (Shell
level):
Codebook vectors in
a PDB-formatted output file. The vector rms variabilities, representing
the precision of the codebook vectors, are written to the occupancy
fields
of the PDB-style atom entries. (Optional) Vector connectivities can be
written to a PSF
file
or a distance constraints file.
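The radius of gyration mentioned above, i.e. the radial rms deviation from the vector center of mass, can be recomputed from the output coordinates. The following Python sketch assumes equally weighted vectors given as (x, y, z) tuples; the function name radius_of_gyration is chosen for illustration.

```python
import math


def radius_of_gyration(vectors):
    """Radial rms deviation of the vectors from their center of mass."""
    n = len(vectors)
    center = [sum(v[axis] for v in vectors) / n for axis in range(3)]
    mean_square = sum(
        sum((v[axis] - center[axis]) ** 2 for axis in range(3)) for v in vectors
    ) / n
    return math.sqrt(mean_square)


vectors = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
print(f"radius of gyration: {radius_of_gyration(vectors):.3f} Angstrom")
```

Comparing this value for the vectors from the map and from the atomic structure gives a quick check on whether the chosen density threshold is plausible.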
Notes:
-
Vector
connectivities in PSF format
can be visualized and edited as bond connections (together with the
atom-style PDB
entries of file2 and file3) using the molecular graphics program VMD. Simply overload
the
PSF file into the PDB file in the VMD 'Molecule' menu. Then
under the 'Mouse' menu select 'Add/Remove Bonds'. The edited
connectivity can then be saved later into a PSF file from the VMD
command console (assuming your molecule
is 'top'):
set sel [atomselect top all]
$sel writepsf my.psf
|
- If there
are
cluster size deviations
from the expected value (default: 8) when using the TRN
algorithm, refine the found vector positions by passing them to qvol as
input file of a second, LBG
run.
- Distance
constraints do not determine
the chirality (handedness) of vector connections. If you encounter
mirror
images or otherwise flipped connections after running qvol compared to
connections determined with qpdb, you need to experiment with the
indexing
of your constraints. The LBG method combined with the SHAKE constraint
algorithm is relatively insensitive to the position of start vectors.
|
| vol2pdb
- Create a PDB from a
Volumetric Map
Purpose:
The vol2pdb
utility allows one to encode positive density values of a 3D map into a
PDB file with the densities written to the PDB occupancy column. This
is useful for colores and colacor,
both of which require a PDB and a map as
input parameters.
Usage (at
shell prompt):
| ./vol2pdb
file1 file2
file1:
inputfile 1, Situs format
file2:
outputfile, PDB format
|
Input at
program prompt:
None.
Output:
PDB format file
with densities written to occupancy field (if rescaling necessary the
conversion factor is given by the program).
|
| volcube
- Creating Isocontour Surfaces
Purpose:
The program
volcube is needed only for special applications or older versions of
VMD since VMD now reads Situs formatted maps directly. It produces
wireframe
meshes or solid surfaces of isocontours that can be sourced from the VMD command console.
The isosurfaces are generated with an improved version of the "marching
cubes" algorithm [Lorensen and Cline, 1987].
Usage (at
shell prompt):
./volcube file1 file2
file1: inputfile, Situs format
file2: outputfile, graphics rendering
primitives in VMD script format
|
Interactive input at
program prompt (also suitable for automation):
- New voxel
size
for rendering (the
input grid is automatically interpolated).
- Isocontour
surface density level.
- Rendering
style: lines (wireframe),
triangles (solid, flat shades), or trinorm (solid, smooth shades).
- You can automate
this program by "overloading" the standard input if you put expected
values in a script!
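One way to "overload" the standard input is to let a script supply the answers that would otherwise be typed at the program prompt. The Python sketch below shows the general pattern with the standard subprocess module. The helper name run_with_answers is hypothetical, and the answers must be listed in exactly the order in which the program asks for them, which is best determined from an interactive session.

```python
import subprocess


def run_with_answers(command, answers):
    """Run an interactive program and feed one answer per prompt via stdin.

    command: program and arguments, e.g. ["./volcube", "file1", "file2"]
    answers: values in the order of the program prompts
    Returns everything the program printed to standard output.
    """
    # Each answer goes on its own line, as if typed at the prompt.
    stdin_text = "".join(f"{answer}\n" for answer in answers)
    result = subprocess.run(
        command, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical call; the answers are placeholders, not recommended values:
# log = run_with_answers(["./volcube", "file1", "file2"], answers_in_prompt_order)
```

The same pattern applies to the other programs of the package that read their parameters from the prompt.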
Output:
- (Program
level:) Diagnostic messages
and classification of the "marching cube" intersection patterns
according
to Heiden et al., J. Comp. Chem. 14 (1993), 246-250.
- (Shell
level:)
VMD script file with
graphics primitives. The primitives can be sourced from the VMD text
console
(cf. the VMD
user guide). File parameters are written to a header. Example: The
following sequence of commands in the VMD text console will first list
all files in the current directory and then render the graphics
primitives
of file "trinorm.vmd" in red, and those of file "wireframe.vmd" in
green:
ls
draw color red
source trinorm.vmd
draw color green
source wireframe.vmd |
Known Bugs:
To prevent
shading of the wireframe
lines, the Tcl output scripts turn off the VMD "materials" rendering
property.
Subsequently loaded solid surfaces will therefore be rendered flat.
This
can be prevented by sourcing all solid surfaces before sourcing any
wireframes.
|
| voldiff
- Discrepancy / Difference Mapping
Former name:
subtract
Purpose:
The voldiff
utility allows one to compute the difference density map (discrepancy
map)
of two volume data sets in Situs format. The input datasets can differ
in their parameters. If necessary the second input file is resampled to
the grid of the first input file.
Usage (at
shell prompt):
| ./voldiff
file1 file2 file3
file1:
inputfile 1, Situs format
file2:
inputfile 2, Situs format
file3:
outputfile, Situs format
|
Input at
program prompt:
None.
Output:
Density
file in Situs format. The new density values are computed by
subtracting
the corresponding values of input grid 2 (which is resampled by
trilinear interpolation, if necessary) from those of input grid 1.
The output grid inherits the parameters from grid 1.
|
voledit
- Inspecting and Editing 3D Maps
Supersedes
former programs: volslice, floodfill, volpad (padup), volcrop (pindown)
Purpose:
The
floodfill
utility is used
to extract a targeted contiguous volume from a given density map.
Originating
from the vicinity of a given start voxel, floodfill finds recursively
the
maximum contiguous volume formed by neighboring voxels that exceed a
given
contour density level. Suitable start positions can be identified by
visual
inspection with the voledit or voledit3d
utilities. The routine tolerates near misses of the start position.
Cross
sections
of the density
data in the (x,y)-, (y,z)-, or (z,x)-planes can be inspected with the
simple terminal window graphics program voledit. The
utility
can also be used to write individual 2D slices or 3D volumes to files.
Volumes can be edited by cropping, zero padding, polygon clipping, and
segmentation (specified under options).
Usage (at
shell prompt):
| ./voledit file1
file1: inputfile, Situs format
|
Interactive input at
program prompt (also suitable for automation):
- Type of cross
section, (x,y),
(y,z), or (z,x).
- Threshold
(cutoff) value for
the rendering of the density.
- z, x, or y
position of the cross
section plane (grid units).
- Polygon
clipping
parameters and
vertices (options).
- Cropping
parameters in voxel units (options).
- Zero padding
in voxel units (options).
- Segmentation
parameters to extract a targeted
contiguous volume.
Originating
from the vicinity of a given start voxel, voledit finds recursively
the
maximum contiguous volume formed by neighboring voxels that exceed a
given threshold density level. An additional layer is added for
aesthetic
reasons
to facilitate isocontouring near
the cutoff
level. Although the extracted grid contains some voxels (in the contour
layer) with densities below the cutoff, all voxels with density values
above the cutoff are guaranteed to be part of the found contiguous
volume.
Voxels outside the contour layer are assigned a density value
of
0.
- File name for
2D slice or 3D
volume output
file (options).
- You can automate this program by
"overloading" the standard input if you put expected values in a script!
Output:
(Shell window:)
Cross section
of the input map in Situs format. Note that pairs of voxels neighboring
in the vertical direction are represented by a single character:
'^' if
the upper voxel
density exceeds a threshold (cutoff) level,
'u', if the lower
voxel density
exceeds the threshold level,
'0' if both upper
and lower voxel
densities exceed the threshold level, and
' ', if the
densities are below
the threshold.
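The character encoding above can be reproduced with a few lines of Python. This is a sketch of the display logic only, assuming that the cross section is given as a list of rows with the top row first; the function name render_cross_section is hypothetical.

```python
def render_cross_section(rows, cutoff):
    """Render pairs of vertically neighboring voxel rows as text lines."""
    lines = []
    for r in range(0, len(rows), 2):
        upper = rows[r]
        # An odd number of rows leaves the last row without a lower partner.
        if r + 1 < len(rows):
            lower = rows[r + 1]
        else:
            lower = [float("-inf")] * len(upper)
        chars = []
        for up, low in zip(upper, lower):
            if up > cutoff and low > cutoff:
                chars.append("0")  # both voxels exceed the threshold
            elif up > cutoff:
                chars.append("^")  # only the upper voxel
            elif low > cutoff:
                chars.append("u")  # only the lower voxel
            else:
                chars.append(" ")  # neither voxel
        lines.append("".join(chars))
    return lines


rows = [[1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 1.0, 0.0]]
for line in render_cross_section(rows, cutoff=0.5):
    print("|" + line + "|")
```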
(2D Output File:)
Voxel indices and
density of specified cross section.
(3D Output
File:) Density file in
Situs format.
The new grid inherits the voxel size (grid spacing) of the old grid.
The
number of x, y, and z increments, and the coordinates of voxel (1,1,1)
depend on the chosen editing options (cropping, padding, segmentation).
For example,
in segmentation when shrinking of the box is selected, the grid
dimensions are determined by the minimum box that contains
both
the contiguous volume plus one layer of neighboring voxels with density
values below the threshold.
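The segmentation described above can be illustrated with a small Python sketch. It differs from the description in several ways that should be kept in mind: it uses an iterative queue instead of recursion, it assumes that voxels are neighbors only across faces (6-connectivity), it requires the start voxel itself to exceed the cutoff (no tolerance for near misses), and it does not shrink the box. The function name segment and the flat data layout (x running fastest) are assumptions.

```python
from collections import deque

FACE_NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def segment(data, dims, start, cutoff):
    """Extract the contiguous volume around start plus one contour layer."""
    nx, ny, nz = dims

    def index(v):
        return v[0] + nx * (v[1] + ny * v[2])

    def inside(v):
        return 0 <= v[0] < nx and 0 <= v[1] < ny and 0 <= v[2] < nz

    if data[index(start)] <= cutoff:
        raise ValueError("start voxel must exceed the cutoff in this sketch")

    # Grow the region over neighboring voxels above the cutoff.
    region = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in FACE_NEIGHBORS:
            nb = (x + dx, y + dy, z + dz)
            if inside(nb) and nb not in region and data[index(nb)] > cutoff:
                region.add(nb)
                queue.append(nb)

    # Add one layer of neighbors below the cutoff to ease isocontouring.
    layer = set()
    for x, y, z in region:
        for dx, dy, dz in FACE_NEIGHBORS:
            nb = (x + dx, y + dy, z + dz)
            if inside(nb) and nb not in region:
                layer.add(nb)

    # Voxels outside the region and its contour layer are set to zero.
    out = [0.0] * len(data)
    for v in region | layer:
        out[index(v)] = data[index(v)]
    return out, region


data = [3.0, 3.0, 1.0, 3.0, 0.5]
segmented, region = segment(data, (5, 1, 1), (0, 0, 0), cutoff=2.0)
print(segmented)
```

In the example the fourth voxel exceeds the cutoff but is not connected to the start voxel, so it is set to zero, while the third voxel is kept as part of the contour layer.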
Notes:
This program
requires the use
of a fixed-width font in the shell window.
|
volhist
- Inspecting and Manipulating the Voxel Histogram
Former name:
histovox
Purpose:
The volhist
utility prints the
voxel histogram [Frank et al., 1991] of the density values. The
histogram
illustrates two general properties of low-resolution density
distributions.
First, a pronounced peak at low densities is due to background
scattering.
The protein density typically corresponds to a second, broader peak at
higher densities. When integrating the histogram "from the top down",
the known molecular volume of a protein can be used to compute its
boundary
density value. The volhist program also allows the user to add a
constant
value to the densities to shift the background density peak to the
origin,
and to rescale the densities.
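The "top down" integration can be sketched in Python as follows. Instead of a binned histogram the sketch works directly on the sorted list of voxel densities, which amounts to the same integration at the finest possible bin width. The function name boundary_density is hypothetical, and the units (Angstrom for the voxel size, cubic Angstrom for the volume) are assumptions.

```python
def boundary_density(densities, voxel_width, molecular_volume):
    """Density level above which the voxels fill the given volume.

    Voxels are accumulated from the highest density downwards until their
    combined volume reaches molecular_volume.
    """
    voxel_volume = voxel_width ** 3
    voxels_needed = molecular_volume / voxel_volume
    count = 0
    for value in sorted(densities, reverse=True):
        count += 1
        if count >= voxels_needed:
            return value
    return None  # the map holds less volume than requested


densities = [0.1, 0.2, 0.9, 0.8, 0.7, 0.05]
print(boundary_density(densities, voxel_width=2.0, molecular_volume=24.0))
```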
Usage (at
shell prompt):
| ./volhist file1 [file2]
file1: inputfile, Situs format
file2: (optional) outputfile,
Situs format
|
Interactive input at
program prompt (if file2 specified):
- Offset
density
value (will be added
to all voxels).
- Scaling
factor.
- You can automate
this program by "overloading" the standard input if you put expected
values in a script!
Output:
- (Program
level:) Voxel histogram
and fractional volume of volumetric data echoed to the screen. The
histogram
bars are normalized by the second highest density peak.
- (Shell
level:)
Density file in Situs
format (if specified). The new density values are computed by adding
the
offset value and multiplying by the scaling factor entered at the
program
prompt. The new grid inherits all size and position parameters of the
old
grid.
|
volvoxl
- Grid Interpolation
Former name:
interpolate
Purpose:
The volvoxl
utility allows
one to change the voxel size (grid spacing) of a grid. Density values
of
the new grid are computed by trilinear interpolation.
Usage (at
shell prompt):
| ./volvoxl file1 file2
file1: inputfile, Situs format
file2: outputfile, Situs format
|
Interactive input at
program prompt (also suitable for automation):
The desired new
voxel size (grid
spacing). You can automate
this program by "overloading" the standard input if you put expected
values in a script!
Output:
Density file in
Situs format.
The new grid inherits the coordinate system origin and forms the
largest
possible box that is fully enclosed by the old grid.
|
| Header
File and Library Routines
The suite
of programs is supported by a header file (situs.h)
containing
user-defined parameters and by auxiliary library programs. The library
programs and their respective header files handle
input and output of atomic coordinates in PDB format (lib_pio.c), input
and output of volumetric data (lib_vio.c), input of data at the prompt
(lib_std.c), eigenvector computation for real symmetric 3x3 matrices
(lib_jac.c),
Euler angle generation (lib_eul.c), random number generation
(lib_rnd.c), array management (lib_vec.c), Powell optimization
(lib_pow.c),
map manipulation (lib_vwk.c), PDB manipulation (lib_pwk.c), symmetric
multiprocessing (lib_smp.c), and timing (lib_tim.c).
|
| Return
to the front page . |
|