Flexible Docking Tutorial, Part I
|
| This tutorial introduces the basic ideas of the landmark-based flexible docking strategy, using actin as a test system. It helps if the user is already familiar with the classic EM tutorial and the correlation-based docking tutorial. This tutorial uses the qplasty tool for approximate flexing at the carbon-alpha level of detail; alternative (and more stereochemically accurate) Molecular Dynamics protocols are offered at Situs Flavors. The results of the flexing can be compared to the solutions distributed with the tutorial software. More documentation is available in the user guide, on the methodology page, and in the published articles. |
|
| Download and Installation
First, follow these registration and download steps (each Situs tutorial is separate and must be downloaded individually). Then, return to this page.
In addition to the executables, the Situs_2.7.1_flex_tutorial/bin directory contains 4 data files and an executable shell script:
- 0_actin_orig.pdb: Original actin (start) structure.
- 0_actin_target.pdb: Target structure for flexible docking.
- 0_rnap1.pdb: Atomic structure of RNA polymerase in "closed" conformation.
- 0_rnap2.situs: Simulated EM map of RNA polymerase in "open" conformation.
- run_tutorial.bash: Bash shell script containing all commands of this tutorial.
In the following, we will use the first two files to dock actin to various low-resolution maps. The user can compare all generated files to the files in the "solutions" directory. The two RNA polymerase files (0_rnap1.pdb and 0_rnap2.situs) are used in Part II of this tutorial.
|
Data Flow and Design
The series of steps and the programs that are required to dock an atomic-resolution structure flexibly to single-molecule, low-resolution data are shown schematically in the following figure. Detailed program explanations are given in the user guide.

Schematic diagram of flexing-related routines. Major Situs components (blue) are classified by their functionality. The main work flow is indicated by brown arrows. The advanced modeling of distance constraints for the motion capture skeleton is shown in dark blue. Visualization (orange) for the rendering of the data requires a molecular graphics viewer (we use the VMD graphics program here; Chimera and Sculptor also support the Situs format).
Standard EM formats are supported and are converted to cubic lattices in Situs format with the map2map utility. Subsequently, the data is inspected and, if necessary, prepared for the vector quantization using a variety of visualization and analysis tools. Atomic coordinates in PDB format can be transformed to low-resolution maps, if necessary, and vice versa. During vector quantization of the high-resolution structure, distances can be learnt that are sent to the vector quantizer of the low-resolution structure to enable skeleton-based fitting.
After the vector quantization, the qplasty tool flexibly docks the high-resolution structure to the low-resolution density, using the corresponding codebook vectors as landmarks.
|
| Creating a Simulated Target Map for Validation
First we create a simulated EM map from a known atomic structure for later validation. We lower the resolution of the target structure to 15 Å with the pdb2vol kernel convolution utility. Enter at the shell prompt:
| ./pdb2vol 0_actin_target.pdb 1_actin_target.situs |
Select mass-weighting (enter 2) and no B-factor cutoff (enter 1). Next, enter the desired voxel spacing of the output map. Given the dimensions of the structure, 2 Å appears to be a good compromise between lattice accuracy and storage requirement. Next, enter the desired output resolution in Å as a negative number: -15. Then select the Gaussian smoothing kernel (enter 1), select lattice correction (enter 1), and enter the maximum amplitude of the kernel (enter 1). You can also automate this procedure in a script by supplying the expected input on standard input (see run_tutorial.bash for details).
The program projects the atomic structure to the lattice, computes the Gaussian kernel, and carries out the real-space convolution, writing the resulting volumetric map to the file 1_actin_target.situs.
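As a minimal sketch of such an automated run, the interactive answers can be written to the standard input of pdb2vol. The helper name pdb2vol_answers is our own choice for illustration, and the order of the answers is simply the order of the prompts in this tutorial; the script distributed as run_tutorial.bash remains the authoritative version and may be organized differently.

```sh
#!/bin/sh
# Answers in the order pdb2vol asks for them in this tutorial.
# (pdb2vol_answers is a hypothetical helper name, not part of Situs.)
pdb2vol_answers() {
  printf '%s\n' 2     # mass-weight the atoms: Yes
  printf '%s\n' 1     # B-factor threshold: No
  printf '%s\n' 2     # voxel spacing in Angstrom
  printf '%s\n' -15   # negative value = target resolution (2 sigma)
  printf '%s\n' 1     # Gaussian smoothing kernel
  printf '%s\n' 1     # correct for lattice smoothing: Yes
  printf '%s\n' 1     # kernel amplitude (scaling factor)
}

if [ -x ./pdb2vol ]; then
  pdb2vol_answers | ./pdb2vol 0_actin_target.pdb 1_actin_target.situs
else
  echo "pdb2vol not found; run this from the tutorial bin directory" >&2
fi
```

Keeping one answer per line with a comment makes it easier to spot a mismatch if a different Situs version asks its questions in a different order.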
Here is the full program output:
./pdb2vol 0_actin_target.pdb 1_actin_target.situs
lib_pio> 3572 atoms read.
pdb2vol> Found 639 hydrogens, 0 water atoms, 0 codebook vectors, 0 density atoms
pdb2vol> Hydrogens will be ignored.
pdb2vol> Do you want to mass-weight the atoms ?
pdb2vol>
pdb2vol> 1: No
pdb2vol> 2: Yes
pdb2vol> 2
pdb2vol> Do you want to select atoms based on a B-factor threshold?
pdb2vol>
pdb2vol> 1: No
pdb2vol> 2: Yes
pdb2vol> 1
pdb2vol> 2933 out of 3572 atoms selected for conversion.
pdb2vol>
pdb2vol> The input structure measures 74.971 x 62.460 x 38.249 Angstrom
pdb2vol>
pdb2vol> Please enter the desired voxel spacing for the output map (in Angstrom): 2
pdb2vol>
pdb2vol> Kernel width. Please enter (in Angstrom):
pdb2vol> (as pos. value) kernel half-max radius or
pdb2vol> (as neg. value) target resolution (2 sigma)
pdb2vol> Now enter (signed) value: -15
pdb2vol>
pdb2vol> Please select the type of smoothing kernel:
pdb2vol>
pdb2vol> 1: Gaussian, exp(-1.5 r^2 / sigma^2)
pdb2vol> sigma = 7.500A, r-half = 5.098A, r-cut = 12.990A
pdb2vol>
pdb2vol> 2: Triangular, max(0, 1 - 0.5 |r| / r-half)
pdb2vol> sigma = 7.500A, r-half = 5.929A, r-cut = 11.859A
pdb2vol>
pdb2vol> 3: Semi-Epanechnikov, max(0, 1 - 0.5 |r|^1.5 / r-half^1.5)
pdb2vol> sigma = 7.500A, r-half = 7.331A, r-cut = 11.637A
pdb2vol>
pdb2vol> 4: Epanechnikov, max(0, 1 - 0.5 r^2 / r-half^2)
pdb2vol> sigma = 7.500A, r-half = 8.101A, r-cut = 11.456A
pdb2vol>
pdb2vol> 5: Hard Sphere, max(0, 1 - 0.5 r^60 / r-half^60)
pdb2vol> sigma = 7.500A, r-half = 9.722A, r-cut = 9.835A
pdb2vol> 1
pdb2vol>
pdb2vol> Do you want to correct for lattice interpolation smoothing effects?
pdb2vol>
pdb2vol> 1: Yes (slightly lowers the kernel width to maintain target resolution)
pdb2vol> 2: No
pdb2vol> 1
pdb2vol>
pdb2vol> Finally, please enter the desired kernel amplitude (scaling factor): 1
pdb2vol>
pdb2vol> Projecting atoms to cubic lattice by trilinear interpolation...
pdb2vol> ... done. Lattice smoothing (sigma = atom rmsd): 1.408 Angstrom
pdb2vol>
pdb2vol> Computing Gaussian kernel (correcting sigma for lattice smoothing)...
pdb2vol> ... done. Kernel map extent 15 x 15 x 15 voxels
pdb2vol>
pdb2vol> Convolving lattice with kernel...
pdb2vol> ... done. Spatial resolution (2 sigma) of output map: 15.000A
pdb2vol>
lib_vio> Writing density data...
lib_vio> Volumetric data written to file 1_actin_target.situs
lib_vio> Situs formatted map file 1_actin_target.situs - Header information:
lib_vio> Columns, rows, and sections: x=1-57, y=1-52, z=1-39
lib_vio> 3D coordinates of first voxel (1,1,1): (-58.000000,-50.000000,-36.000000)
lib_vio> Voxel size in Angstrom: 2.000000
|
By loading the resulting map into VMD (see below), one can vary the density threshold. A threshold of 100 (~15% of the maximum value) corresponds approximately to the surface of the molecule; the same value is used below as the qvol density cutoff and as the VMD isosurface level.
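The Gaussian kernel parameters that pdb2vol prints in its kernel menu can be checked by hand. The sketch below assumes only what the log states: the kernel is exp(-1.5 r^2 / sigma^2) and the resolution is reported as 2 sigma. The half-maximum radius then follows from solving exp(-1.5 r^2 / sigma^2) = 0.5. The cutoff radius is not derived here; we merely observe that the logged value coincides with sqrt(3) sigma. The function name kernel_params is hypothetical.

```sh
#!/bin/sh
# Reproduce sigma, r-half and r-cut for a given target resolution (in Angstrom).
# kernel_params is an illustrative helper, not a Situs command.
kernel_params() {
  LC_ALL=C awk -v res="$1" 'BEGIN {
    sigma  = res / 2                      # resolution is defined as 2 sigma
    r_half = sigma * sqrt(log(2) / 1.5)   # where exp(-1.5 r^2 / sigma^2) = 0.5
    r_cut  = sigma * sqrt(3)              # observation: matches the logged r-cut
    printf "sigma=%.3f r_half=%.3f r_cut=%.3f\n", sigma, r_half, r_cut
  }'
}

kernel_params 15
```

For a 15 Å target resolution this prints sigma=7.500 r_half=5.098 r_cut=12.990, in agreement with the values shown for kernel 1. These are the values before the lattice correction, which the log says slightly lowers the kernel width.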
|
| Preliminary Rigid-Body Registration
Before we start fitting the original actin structure to the target structure, it is important to roughly align the atomic structure and the target map by rigid-body fitting.
An initial alignment "by eye" can be done, for example, with VMD (move a loaded molecule by selecting the VMD menu Mouse -> Move -> Molecule, then translate it with the mouse and rotate it while pressing the Shift key; the new coordinates can then be saved by selecting File -> Save Coordinates).
Alternatively, an automated rigid-body fitting procedure can be employed, e.g. using colores, collage, or matchpt as explained in other tutorials. It is a good idea to export a number of best-scoring rigid-body fits, and to explore the alignment of these fits by eye, before selecting one for subsequent flexible fitting.
For example, using colores, enter at the shell prompt:
| ./colores 1_actin_target.situs 0_actin_orig.pdb -res 15.0 -deg 20 -explor 1 |
After this run we rename the resulting fit and remove the auxiliary files of the colores run:
mv col_best_001.pdb 2_actin_orig_dock_target.pdb
rm col_*
|
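In a script, it can be worth guarding the rename and cleanup so that the auxiliary files are only deleted once the best fit has actually been produced. The following is a minimal sketch under that assumption; keep_best_fit is a hypothetical helper name, and the only file names it relies on are the col_best_001.pdb output and the col_ prefix shown above.

```sh
#!/bin/sh
# Keep the top-ranked colores fit under a new name, then remove the
# remaining col_* files. Refuses to clean up if the fit is missing.
# (keep_best_fit is an illustrative helper, not part of Situs.)
keep_best_fit() {
  out="$1"
  if [ ! -s col_best_001.pdb ]; then
    echo "col_best_001.pdb is missing or empty; check the colores run" >&2
    return 1
  fi
  mv col_best_001.pdb "$out" && rm -f col_*
}

# Usage, after colores has finished:
#   keep_best_fit 2_actin_orig_dock_target.pdb
```

If several best-scoring fits are to be inspected by eye first, the cleanup should of course be postponed until one of them has been chosen.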
|
| Vector Quantization of the High-Resolution Structure with qpdb
Now, we perform the vector quantization of the rigidly fitted structure with the qpdb utility.
At the shell prompt, enter
| ./qpdb 2_actin_orig_dock_target.pdb 2_actin_orig_dock_target.qpdb |
Then select mass-weighting (enter 2) and ignore the B-factor cutoff (enter 1). Next, enter the number of codebook vectors: 4 (one for each of actin's subdomains). The program then computes a number of datasets for statistical averaging. The file 2_actin_orig_dock_target.qpdb now contains the four new codebook vectors, their rms variability, and the effective radius of their Voronoi cells, in PDB format. Finally, the user is asked whether nearest-neighbor connectivities should be learnt and whether the Voronoi cells should be saved. Here we enter 1 twice (save neither the connectivities nor the Voronoi cells). See also run_tutorial.bash.
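According to the qpdb log, the rms variabilities are stored in the PDB occupancy field and the equivalent spherical radii of the Voronoi cells in the B-factor field. As a minimal sketch, these two values can be listed per codebook vector with awk, assuming the file follows the standard fixed-column PDB layout (occupancy in columns 55-60, B-factor in columns 61-66) and uses ATOM or HETATM records. The function name codebook_summary is hypothetical.

```sh
#!/bin/sh
# Print rms variability and Voronoi cell radius for each codebook vector.
# Assumes standard PDB columns; codebook_summary is an illustrative helper.
codebook_summary() {
  LC_ALL=C awk '/^(ATOM|HETATM)/ {
    n++
    rms    = substr($0, 55, 6) + 0   # occupancy field: rms variability
    radius = substr($0, 61, 6) + 0   # B-factor field: equivalent spherical radius
    printf "vector %d: rms=%.2f A, Voronoi radius=%.2f A\n", n, rms, radius
  }' "$1"
}

# Usage:
#   codebook_summary 2_actin_orig_dock_target.qpdb
```

A vector with an unusually large rms variability may indicate that the chosen number of codebook vectors does not describe the structure reproducibly.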
Here is the output of the entire qpdb calculation:
./qpdb 2_actin_orig_dock_target.pdb 2_actin_orig_dock_target.qpdb
lib_pio> 3580 atoms read.
...qpdb> Found 639 hydrogens, 0 water atoms, 8 codebook vectors, 0 density atoms
...qpdb> Hydrogens will be ignored.
...qpdb> Do you want to mass-weight the atoms ?
...qpdb>
...qpdb> 1: No
...qpdb> 2: Yes
...qpdb> 2
...qpdb> Do you want to select atoms based on a B-factor threshold?
...qpdb>
...qpdb> 1: No
...qpdb> 2: Yes
...qpdb> 1
...qpdb> 2954 equally weighted inputs out of originally 3580 atoms selected for conversion.
...qpdb>
...qpdb> Sphericity of the atomic structure: 0.52
...qpdb> Enter desired number of codebook vectors for data quantization: (0 to exit): 4
...qpdb> Computing 8 datasets, 100000 iterations each...
...qpdb> Now producing dataset 1
...qpdb> Now producing dataset 2
...qpdb> Now producing dataset 3
...qpdb> Now producing dataset 4
...qpdb> Now producing dataset 5
...qpdb> Now producing dataset 6
...qpdb> Now producing dataset 7
...qpdb> Now producing dataset 8
...qpdb>
...qpdb> Codebook vectors have been written to file 2_actin_orig_dock_target.qpdb
...qpdb> The PDB B-factor field contains the equivalent spherical radii
...qpdb> of the corresponding Voronoi cells (in Angstrom).
...qpdb> Cluster analysis of the 8 independent calculations:
...qpdb> The PDB occupancy field in 2_actin_orig_dock_target.qpdb contains the rms variabilities of the vectors.
...qpdb> Average rms fluctuation of the 4 codebook vectors: 0.865 Angstrom
...qpdb> Radius of gyration of the 4 codebook vectors: 17.347 Angstrom
...qpdb>
...qpdb> Do you want to learn nearest-neighbor connectivities?
...qpdb> Choose one of the following options -
...qpdb> 1: No.
...qpdb> 2: Learn and save to a PSF file
...qpdb> 3: Learn and save to a constraints file
...qpdb> 4: Learn and save to both PSF and constraints files
...qpdb> 1
...qpdb>
...qpdb> Do you want to save the Voronoi cells?
...qpdb> Choose one of the following options -
...qpdb> 1: No. I'm done
...qpdb> 2: Yes. Save cells to a PDB file
...qpdb> 1
...qpdb> Bye bye!
|
|
| Vector Quantization of the Low-Resolution Map with qvol
For the vector quantization of the volumetric dataset with the qvol utility we use the previous qpdb codebook vectors as start positions. After entering
| ./qvol 1_actin_target.situs 2_actin_orig_dock_target.qpdb 3_actin_target.qvol |
the user is first asked whether to inspect the input density values (we enter 1 to continue) and is then prompted to enter the density cutoff value. We enter 100, which is an appropriate surface value for this map. Subsequently, the program asks whether we wish to optimize the start vectors or to skip directly to the connectivity analysis. We enter 1 for LBG optimization. Next, the program asks whether the user wishes to add distance constraints; we enter 1 (No). Finally, the user is asked whether the connectivities should be updated or saved. For now we enter 1 (No). The file 3_actin_target.qvol now contains the "QVOL" codebook vectors.
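The qvol dialog can be scripted in the same spirit as the other interactive tools, by piping the answers to the program. The sketch below takes the five answers in the order in which the prompts appear in the qvol log of this tutorial (inspection, cutoff, optimization, constraints, connectivities). The variable name qvol_answers is our own, and run_tutorial.bash remains the reference for how the tutorial itself does this.

```sh
#!/bin/sh
# Answers: no inspection, cutoff 100, LBG optimization,
# no distance constraints, no connectivity update.
# (qvol_answers is a hypothetical variable name.)
qvol_answers='1 100 1 1 1'

if [ -x ./qvol ]; then
  # Left unquoted on purpose so that each answer lands on its own line.
  printf '%s\n' $qvol_answers |
    ./qvol 1_actin_target.situs 2_actin_orig_dock_target.qpdb 3_actin_target.qvol
else
  echo "qvol not found; run this from the tutorial bin directory" >&2
fi
```

Only the second answer (the density cutoff) depends on the map; it should be revisited whenever a different map or kernel amplitude is used.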
Here is the output of this qvol calculation:
./qvol 1_actin_target.situs 2_actin_orig_dock_target.qpdb 3_actin_target.qvol
lib_vio> Situs formatted map file 1_actin_target.situs - Header information:
lib_vio> Columns, rows, and sections: x=1-57, y=1-52, z=1-39
lib_vio> 3D coordinates of first voxel: (-58.000000,-50.000000,-36.000000)
lib_vio> Voxel size in Angstrom: 2.000000
lib_vio> Reading density data...
lib_vio> Volumetric data read from file 1_actin_target.situs
...qvol> Density values below a user-defined cutoff value will not be considered
...qvol> Do you want to inspect the input density values before entering the cutoff value?
...qvol> Choose one of the following three options -
...qvol> 1: No (continue)
...qvol> 2: Show me the minimum and maximum density values only
...qvol> 3: Show me the voxel histogram
...qvol> 1
...qvol> Now enter the cutoff density value: 100
...qvol> Cutting off density values < 100.000000, remaining occupied volume: 12713 voxels (1.017040e+05 Angstrom^3)
lib_pio> 4 atoms read.
...qvol> Do you want to optimize the start vectors or skip and proceed to the connectivity analysis?
...qvol> Choose one of the following two options -
...qvol> 1: Optimize start vectors with LBG
...qvol> 2: Skip and proceed directly to connectivity analysis
...qvol> 1
...qvol>
...qvol> Using start vectors from file 2_actin_orig_dock_target.qpdb.
...qvol>
...qvol> Vector distance constraints restrict undesired degrees of freedom.
...qvol> Do you want to add distance constraints?
...qvol> Choose one of the following three options -
...qvol> 1: No
...qvol> 2: Yes. I want to enter them manually
...qvol> 3: Yes. I want to read connectivities from a PSF file and use start vector distances
...qvol> 4: Yes. I want to read them from a Situs constraints file
...qvol> 1
...qvol> Starting standard LBG vector quantization.
...qvol> It. 1 -- Average vector update: 3.794887e-01 Angstrom
...qvol> It. 2 -- Average vector update: 3.459883e-01 Angstrom
...qvol> It. 3 -- Average vector update: 3.164652e-01 Angstrom
...qvol> It. 4 -- Average vector update: 2.914341e-01 Angstrom
...qvol> It. 5 -- Average vector update: 2.667951e-01 Angstrom
...qvol> It. 6 -- Average vector update: 2.440749e-01 Angstrom
...qvol> It. 7 -- Average vector update: 2.238978e-01 Angstrom
...qvol> It. 8 -- Average vector update: 2.044688e-01 Angstrom
...qvol> It. 9 -- Average vector update: 1.876906e-01 Angstrom
...qvol> It. 10 -- Average vector update: 1.723334e-01 Angstrom
...qvol> It. 11 -- Average vector update: 1.587836e-01 Angstrom
...qvol> It. 12 -- Average vector update: 1.464327e-01 Angstrom
...qvol> It. 13 -- Average vector update: 1.345356e-01 Angstrom
...qvol> It. 14 -- Average vector update: 1.241281e-01 Angstrom
...qvol> It. 15 -- Average vector update: 1.145233e-01 Angstrom
...qvol> It. 16 -- Average vector update: 1.052219e-01 Angstrom
...qvol> It. 17 -- Average vector update: 9.707859e-02 Angstrom
...qvol> It. 18 -- Average vector update: 8.882026e-02 Angstrom
...qvol> It. 19 -- Average vector update: 8.154858e-02 Angstrom
...qvol> It. 20 -- Average vector update: 7.512557e-02 Angstrom
...qvol> It. 21 -- Average vector update: 6.959220e-02 Angstrom
...qvol> It. 22 -- Average vector update: 6.432284e-02 Angstrom
...qvol> It. 23 -- Average vector update: 5.972955e-02 Angstrom
...qvol> It. 24 -- Average vector update: 5.495829e-02 Angstrom
...qvol> It. 25 -- Average vector update: 5.124211e-02 Angstrom
...qvol> It. 26 -- Average vector update: 4.685754e-02 Angstrom
...qvol> It. 27 -- Average vector update: 4.316624e-02 Angstrom
...qvol> It. 28 -- Average vector update: 4.039170e-02 Angstrom
...qvol> It. 29 -- Average vector update: 3.792771e-02 Angstrom
...qvol> It. 30 -- Average vector update: 3.597380e-02 Angstrom
...qvol> It. 31 -- Average vector update: 3.496715e-02 Angstrom
...qvol> It. 32 -- Average vector update: 3.241709e-02 Angstrom
...qvol> It. 33 -- Average vector update: 3.048423e-02 Angstrom
...qvol> It. 34 -- Average vector update: 2.848838e-02 Angstrom
...qvol> It. 35 -- Average vector update: 2.647463e-02 Angstrom
...qvol> It. 36 -- Average vector update: 2.404363e-02 Angstrom
...qvol> It. 37 -- Average vector update: 2.244006e-02 Angstrom
...qvol> It. 38 -- Average vector update: 2.056871e-02 Angstrom
...qvol> It. 39 -- Average vector update: 1.959148e-02 Angstrom
...qvol> It. 40 -- Average vector update: 1.851113e-02 Angstrom
...qvol> It. 41 -- Average vector update: 1.725249e-02 Angstrom
...qvol> It. 42 -- Average vector update: 1.626679e-02 Angstrom
...qvol> It. 43 -- Average vector update: 1.602147e-02 Angstrom
...qvol> It. 44 -- Average vector update: 1.627645e-02 Angstrom
...qvol> It. 45 -- Average vector update: 1.567052e-02 Angstrom
...qvol> It. 46 -- Average vector update: 1.466849e-02 Angstrom
...qvol> It. 47 -- Average vector update: 1.390725e-02 Angstrom
...qvol> It. 48 -- Average vector update: 1.287678e-02 Angstrom
...qvol> It. 49 -- Average vector update: 1.187649e-02 Angstrom
...qvol> It. 50 -- Average vector update: 1.093469e-02 Angstrom
...qvol> It. 51 -- Average vector update: 1.013612e-02 Angstrom
...qvol> It. 52 -- Average vector update: 9.438498e-03 Angstrom
...qvol>
...qvol> Final clustering -- Average vector update: 0.000000e+00 Angstrom
...qvol>
...qvol> Codebook vectors have been written to file 3_actin_target.qvol
...qvol> The PDB B-factor field contains the equivalent spherical radii
...qvol> of the corresponding Voronoi cells (in Angstrom).
...qvol> Radius of gyration of the 4 codebook vectors: 19.685 Angstrom
...qvol>
...qvol> Do you want to update or save the input connectivities?
...qvol> Choose one of the following options -
...qvol> 1: No. I'm done
...qvol> 2: Update and save to a PSF file
...qvol> 3: Update and save to a constraints file
...qvol> 4: Update and save to both PSF and constraints files
...qvol> 1
...qvol> Bye bye!
|
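The iteration lines in the qvol log show the average vector update shrinking steadily; in this run the LBG optimization ends after iteration 52, the first one whose update falls below 0.01 Å. When the output has been saved to a file, the number of iterations and the last update can be extracted with a few lines of awk. This is a minimal sketch: the log file name qvol.log and the function name lbg_summary are assumptions, and the pattern relies only on the line format shown above.

```sh
#!/bin/sh
# Summarize LBG convergence from a saved qvol log.
# (lbg_summary and qvol.log are illustrative names, not part of Situs.)
lbg_summary() {
  LC_ALL=C awk '/It\. *[0-9]+ -- Average vector update:/ {
    n++
    last = $(NF - 1) + 0   # the number just before the trailing "Angstrom"
  }
  END {
    if (n == 0) { print "no LBG iterations found"; exit 1 }
    printf "iterations=%d last_update=%.6f\n", n, last
  }' "$1"
}

# Usage:
#   lbg_summary qvol.log
```

A run that needs many more iterations than usual, or whose updates do not decrease, may point to a poorly chosen density cutoff or to start vectors that are far from the map.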
|
| Flexible Fitting using Interpolation
The flexible docking is approximated by interpolation with qplasty, based on the sparsely sampled displacements between the above qpdb and qvol codebook vectors. This is sufficient for carbon-alpha level accuracy. Here we use the default parameters for qplasty. For more information on the algorithm see Rusu et al., 2008.
To start the flexing, enter at the shell prompt:
./qplasty 2_actin_orig_dock_target.pdb 2_actin_orig_dock_target.qpdb \
3_actin_target.qvol 4_flexed_to_target.pdb
|
The flexed structure has been written to the file 4_flexed_to_target.pdb.
|
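Because the target map was simulated from a known structure, one way to validate the flexing is to compare the flexed model with 0_actin_target.pdb directly. The sketch below computes a C-alpha RMSD without any superposition, which is reasonable here since the flexed model was fitted into a map generated in the frame of the target structure. It assumes that both files list the same C-alpha atoms in the same order and use standard PDB columns; the function name ca_rmsd is hypothetical, and this is not a Situs tool.

```sh
#!/bin/sh
# C-alpha RMSD between two PDB files with identical atom order (no fitting).
# (ca_rmsd is an illustrative helper, not part of Situs.)
ca_rmsd() {
  LC_ALL=C awk '
    /^ATOM/ && substr($0, 13, 4) == " CA " {
      x = substr($0, 31, 8) + 0
      y = substr($0, 39, 8) + 0
      z = substr($0, 47, 8) + 0
      if (FNR == NR) {            # first file: remember coordinates
        n1++; ax[n1] = x; ay[n1] = y; az[n1] = z
      } else {                    # second file: accumulate squared distances
        n2++
        if (n2 <= n1) sum += (x - ax[n2])^2 + (y - ay[n2])^2 + (z - az[n2])^2
      }
    }
    END {
      if (n1 + 0 == 0 || n1 != n2) {
        print "C-alpha counts differ: " (n1 + 0) " vs " (n2 + 0) > "/dev/stderr"
        exit 1
      }
      printf "%.3f\n", sqrt(sum / n1)
    }' "$1" "$2"
}

# Usage:
#   ca_rmsd 2_actin_orig_dock_target.pdb 0_actin_target.pdb   # before flexing
#   ca_rmsd 4_flexed_to_target.pdb 0_actin_target.pdb         # after flexing
```

Comparing the value before and after flexing gives a quick indication of how much the interpolation has moved the model toward the target.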
Visualization
We now inspect the above results with VMD. The following sequence of commands in the VMD text console (cf. VMD user guide) will load the original and flexed actin structures, 2_actin_orig_dock_target.pdb (red) and 4_flexed_to_target.pdb (green), and render them in colored tube representation. The script also renders the target density map, 1_actin_target.situs, in gray:
mol load pdb 4_flexed_to_target.pdb
mol load pdb 2_actin_orig_dock_target.pdb
mol load situs 1_actin_target.situs
mol top 0
rotate stop
display resetview
display projection orthographic
mol modstyle 0 0 Tube 0.3 6
mol modstyle 0 1 Tube 0.3 6
mol modstyle 0 2 Isosurface 100 0 0 1 2 1
mol modcolor 0 0 ColorID 7
mol modcolor 0 1 ColorID 1
mol modcolor 0 2 ColorID 2
|
Don't forget to hit "enter" after the last line! The result
should look very similar to this image:

|
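Instead of typing the commands into the text console, they can be saved to a file and passed to VMD at startup. The sketch below writes the same thirteen commands, unchanged, to a script file; the file name view_flexing.vmd is our own choice.

```sh
#!/bin/sh
# Save the visualization commands of this tutorial to a VMD script file.
# (view_flexing.vmd is a hypothetical file name.)
cat > view_flexing.vmd <<'EOF'
mol load pdb 4_flexed_to_target.pdb
mol load pdb 2_actin_orig_dock_target.pdb
mol load situs 1_actin_target.situs
mol top 0
rotate stop
display resetview
display projection orthographic
mol modstyle 0 0 Tube 0.3 6
mol modstyle 0 1 Tube 0.3 6
mol modstyle 0 2 Isosurface 100 0 0 1 2 1
mol modcolor 0 0 ColorID 7
mol modcolor 0 1 ColorID 1
mol modcolor 0 2 ColorID 2
EOF
```

The script can then be run with "vmd -e view_flexing.vmd" from the directory that contains the three data files. The isosurface level of 100 in the script is the same surface threshold that was used as the qvol cutoff.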
| Part II: Skeleton-Based Docking
We are now prepared to improve the stereochemical quality of the flexing. It is possible to constrain the distances between the landmarks to reduce the effect of noise and experimental limitations on the codebook vector positions. This "skeleton"-based approach, as described on the Vector Quantization page, is related to 3D motion capture technology used in the entertainment industry and in biomechanics. The application is demonstrated in the Flexible Docking Tutorial, Part II.
|
|