Flexible Docking Tutorial, Part II
The tutorial teaches skeleton-based flexible docking using RNA polymerase as a test system. It is helpful for the understanding of the tutorial if the user is already familiar with part I and has performed at least the installation step (the rest of part I can be skipped if the earlier output files are copied from the "solutions" directory). More documentation is available in the user guide, on the methodology page, and in the published articles.
Content:
Introduction

In principle, the resolution of flexible fitting of an atomic structure to a low-resolution map can be improved by increasing the number of codebook vectors. However, there are practical limitations. If there are experimental limitations (e.g. noise, missing parts) not all vectors converge towards equivalent features in the two data sets if the vector number is substantially increased. The solution to this problem is to freeze longitudinal degrees of freedom, and to impose distance constraints on the vectors. This can be done very efficiently with the tools provided by Situs.

We consider here an interesting case, the flexible fitting of the closed RNA polymerase structure to a (simulated) open form at low resolution. Inspection of this file reveals that an isocontour value of 50 units is appropriate.

Preliminary Rigid-Body Registration

Before we start the flexing, it is mandatory to roughly align the atomic structure and the target map by rigid-body fitting. This can be done "by eye" or with one of our Situs tools, as described earlier. Here the structures are already sufficiently aligned.
Vector Quantization of the High-Resolution Structure

Now, we perform the vector quantization of the roughly aligned atomic structure with the quanpdb utility.

At the shell prompt, enter

./quanpdb 0_rnap1.pdb 5_rnap1.qpdb

and select mass weigthing (enter 2) and no B-factor cutoff (enter 1). Next, enter the number of codebook vectors: 15. Watch the program compute a number of datasets for statistical averaging. The file 5_rnap1.qpdb now contains the 15 new codebook vectors, their rms variability, and the effective radius of their Voronoi cells, in PDB format. Finally, the user is asked whether nearest-neighbor connectivities should be learnt, or whether the Voronoi cells should be saved. Here we enter 2 to save the connectivities to file 5_rnap1.qpsf. We do not wish to save the Voronoi cells (enter 1). 

Here is the output of the entire quanpdb calculation:

./quanpdb 0_rnap1.pdb 5_rnap1.qpdb
lib_pio> 10760 atoms read.
quanpdb> Found 0 hydrogens, 0 water atoms, 0 codebook vectors, 0 density atoms
quanpdb> Do you want to mass-weight the atoms ?
quanpdb>
quanpdb> 1: No
quanpdb> 2: Yes
quanpdb> 2
quanpdb> Do you want to select atoms based on a B-factor threshold?
quanpdb>
quanpdb> 1: No
quanpdb> 2: Yes
quanpdb> 1
quanpdb> 10760 equally weighted inputs out of originally 10760 atoms selected for conversion.
quanpdb>
quanpdb> Sphericity of the atomic structure: 0.20
quanpdb> Enter desired number of codebook vectors for data quantization: (0 to exit): 15
quanpdb> Computing 8 datasets, 100000 iterations each...
quanpdb> Now producing dataset 1
quanpdb> Now producing dataset 2
quanpdb> Now producing dataset 3
quanpdb> Now producing dataset 4
quanpdb> Now producing dataset 5
quanpdb> Now producing dataset 6
quanpdb> Now producing dataset 7
quanpdb> Now producing dataset 8
quanpdb>
quanpdb> Codebook vectors have been written to file 5_rnap1.qpdb
quanpdb> The PDB B-factor field contains the equivalent spherical radii
quanpdb> of the corresponding Voronoi cells (in Angstrom).
quanpdb> Cluster analysis of the 8 independent calculations:
quanpdb> The PDB occupancy field in 5_rnap1.quanpdb contains the rms variabilities of the vectors.
quanpdb> Average rms fluctuation of the 15 codebook vectors: 3.732 Angstrom
quanpdb> Radius of gyration of the 15 codebook vectors: 41.810 Angstrom
quanpdb>
quanpdb> Do you want to learn nearest-neighbor connectivities?
quanpdb> Choose one of the following options -
quanpdb> 1: No.
quanpdb> 2: Learn and save to a PSF file
quanpdb> 3: Learn and save to a constraints file
quanpdb> 4: Learn and save to both PSF and constraints files
quanpdb> 2
quanpdb> Enter PSF filename: 5_rnap1.qpsf
quanpdb> Connectivity data written to PSF file 5_rnap1.qpsf.
quanpdb>
quanpdb> Do you want to save the Voronoi cells?
quanpdb> Choose one of the following options -
quanpdb> 1: No. I'm done
quanpdb> 2: Yes. Save cells to a PDB file
quanpdb> 1
quanpdb> Bye bye..

Assigning Corresponding Vectors and Distance Constraints

We now have codebook vectors and the distance information for the atomic structure. To proceed with the flexible docking, two problems must be solved:

  • We need to generate low-resolution vectors that are in correspondance with the atomic ones, i.e. here we must form 15 pairs of vectors that will be used as refinement parameters.
  • Modeling of the skeleton. The distance connectivity file 5_rnap1.qpsf contains connectivity information about adjacent high-resolution vectors. This file is only a starting point. E.g. enforcing all of these constraints would make the structure rather rigid, so we must select a sparse set of non-redundant distances, i.e. a skeleton, that enables flexible fitting.

There is no automated Situs routine for assigning corresponding vectors and for modeling the skeleton. It is recommended to select these vectors by visual inspection with a graphics program.

Vector connectivities in PSF format (recognized by Molecular Dynamics related programs such as VMD, CHARMM, X-PLOR, CNS, NAMD) can be visualized and edited as bond connections (together with the corresponding codebook vector PDB file) using the molecular graphics program VMD. Here we overload the PSF file into the PDB file in the VMD command console:

mol load pdb 5_rnap1.qpdb psf 5_rnap1.qpsf
mol load pdb 0_rnap1.pdb
mol modstyle 0 1 Tube 0 1
mol modcolor 0 0 ColorID 10
mol modcolor 0 1 ColorID 2
display projection orthographic


Then rotate your molecule such that it is oriented as in the figures below (you should recognize the pattern):

(click images to enlarge)

Then toggle the display of molecule 0_rnap1.pdb (in the VMD Molecule menu) so that only the blue connectivities remain. Don't rotate the molecule. Under the 'Mouse' menu select 'Add/Remove Bonds'. Remove any "red" bonds shown above by clicking on both end points. Then add the "green" bonds the same way. You should have a proper balance between freedom of motion and stability of the skeleton. This takes some experience, trial and error. The blue connectivities give a reasonable fit that can be improved later. The edited connectivity can then be saved into a PSF file from the VMD command console (assuming your connectivity molecule is still 'top'):

set sel [atomselect top all]
$sel writepsf 6_rnap1.qpsf


Note: in run_tutorial.bash we copy 6_rnap1.qpsf from the solution directory. This is a "cheat" that allows one to run the entire script without waiting for this manual step.
LBG Vector Quantization of the Low-Resolution Map

Now that the skeleton connectivities have been determined, the skeleton distances can be enforced with the local LBG optimization algorithm of the quanvol utility. After entering

./quanvol 0_rnap2.situs 5_rnap1.qpdb 7_rnap2.qvol

the user is prompted to enter the density cutoff value. We enter the approximate surface threshold, 50. The start vectors are taken from file 5_rnap1.qpdb and the user has the choice optimization with LBG (enter 1) and of assigning vector distances. We read them from the file 6_rnap1.qpsf (option 3). Finally, the user is asked whether nearest-neighbor connectivities should be learnt, we don't need this option at the moment (enter 1). The file 7_rnap2.qvol now contains the newly placed codebook vectors with the distance constraints enforced. As an exercise you should load both this file and
6_rnap1.qpsf, together with 0_rnap2.situs into VMD:

(click image to enlarge)

Below is the output of the entire quanvol LBG calculation. LBG is an iterative "gradient descent" method that converges slowly, whereas TRN (without the second argument as start vectors) uses a prespecified number of calculations. The additional SHAKE statements describe the convergence of the distance constraints:

./quanvol 0_rnap2.situs 5_rnap1.qpdb 7_rnap2.qvol
lib_vio> Situs formatted map file 0_rnap2.situs - Header information:
lib_vio> Columns, rows, and sections: x=1-44, y=1-47, z=1-45
lib_vio> 3D coordinates of first voxel (1,1,1): (140.000000,12.000000,0.000000)
lib_vio> Voxel size in Angstrom: 4.000000
lib_vio> Reading density data...
lib_vio> Volumetric data read from file 0_rnap2.situs
quanvol> Density values below a user-defined cutoff value will not be considered
quanvol> Do you want to inspect the input density values before entering the cutoff value?
quanvol> Choose one of the following three options -
quanvol>      1: No (continue)
quanvol>      2: Show me the minimum and maximum density values only
quanvol>      3: Show me the voxel histogram
quanvol> 1
quanvol> Now enter the cutoff density value: 50
quanvol> Cutting off density values < 50.000000, remaining occupied volume: 10005 voxels (6.403200e+05 Angstrom^3)
lib_pio> 15 atoms read.
quanvol> Do you want to optimize the start vectors or skip and proceed to the connectivity analysis?
quanvol> Choose one of the following two options -
quanvol>      1: Optimize start vectors with LBG
quanvol>      2: Skip and proceed directly to connectivity analysis
quanvol> 1
quanvol>
quanvol> Using start vectors from file 5_rnap1.qpdb.
quanvol>
quanvol> Vector distance constraints restrict undesired degrees of freedom.
quanvol> Do you want to add distance constraints?
quanvol> Choose one of the following three options -
quanvol>      1: No
quanvol>      2: Yes. I want to enter them manually
quanvol>      3: Yes. I want to read connectivities from a  file and use start vector distances
quanvol>      4: Yes. I want to read them from a Situs constraints file
quanvol> 3
quanvol> Enter filename: 6_rnap1.qpsf
quanvol> 30 connectivities read from file 6_rnap1.qpsf
quanvol> The corresponding distances were assigned from file 5_rnap1.quanpdb
quanvol> Distance preconditioning step 1 -- 48 SHAKE distance iterations
quanvol> Distance preconditioning step 2 -- 39 SHAKE distance iterations
quanvol> Distance preconditioning step 3 -- 37 SHAKE distance iterations
quanvol> Distance preconditioning step 4 -- 40 SHAKE distance iterations
quanvol> Distance preconditioning step 5 -- 39 SHAKE distance iterations
quanvol> Distance preconditioning step 6 -- 37 SHAKE distance iterations
quanvol> Distance preconditioning step 7 -- 35 SHAKE distance iterations
quanvol> Distance preconditioning step 8 -- 34 SHAKE distance iterations
quanvol> Distance preconditioning step 9 -- 33 SHAKE distance iterations
quanvol> Distance preconditioning step 10 -- 33 SHAKE distance iterations
quanvol> Starting standard LBG vector quantization.
quanvol> It. 1 -- 53 SHAKE distance iterations
quanvol> It. 1 -- Average vector update: 6.108390e-01 Angstrom
quanvol> It. 2 -- 48 SHAKE distance iterations
quanvol> It. 2 -- Average vector update: 5.948577e-01 Angstrom
quanvol> It. 3 -- 48 SHAKE distance iterations
quanvol> It. 3 -- Average vector update: 5.773613e-01 Angstrom
quanvol> It. 4 -- 49 SHAKE distance iterations
quanvol> It. 4 -- Average vector update: 5.605641e-01 Angstrom
quanvol> It. 5 -- 52 SHAKE distance iterations
quanvol> It. 5 -- Average vector update: 5.440111e-01 Angstrom

...

quanvol> It. 118 -- 89 SHAKE distance iterations
quanvol> It. 118 -- Average vector update: 1.105221e-02 Angstrom
quanvol> It. 119 -- 89 SHAKE distance iterations
quanvol> It. 119 -- Average vector update: 9.938773e-03 Angstrom
quanvol>
quanvol> Final clustering -- 71 SHAKE distance iterations
quanvol> Final clustering -- Average vector update: 2.273772e-03 Angstrom
quanvol> Final clustering -- 71 SHAKE distance iterations
quanvol> Final clustering -- Average vector update: 2.607668e-07 Angstrom
quanvol>
quanvol> Codebook vectors have been written to file 7_rnap2.qvol
quanvol> The PDB B-factor field contains the equivalent spherical radii
quanvol> of the corresponding Voronoi cells (in Angstrom).
quanvol> Radius of gyration of the 15 codebook vectors: 42.462 Angstrom
quanvol>
quanvol> Do you want to update or save the input connectivities?
quanvol> Choose one of the following options -
quanvol>      1: No. I'm done
quanvol>      2: Update and save to a PSF file
quanvol>      3: Update and save to a constraints file
quanvol>      4: Update and save to both PSF and constraints files
quanvol>      5: Just save (don't update) to a PSF file
quanvol> 1
quanvol> Bye bye!

Flexible, Skeleton-Based Docking with Interpolation

As described earlier, the flexible docking is performed here in an approximative way by interpolation from the 15 pairs of codebook vectors. 

To start the qplasty optimization, enter at the shell prompt

./qplasty 0_rnap1.pdb 5_rnap1.qpdb 7_rnap2.qvol 8_flexed_rnap.pdb

The flexed structure will be written to the file 8_flexed_rnap.pdb.

Visualization I: 3D

The following sequence of commands in the VMD text console (cf. VMD user guide) will load the original and flexibly docked rnap structures, 0_rnap1.pdb (red) and 8_flexed_rnap.pdb (green), and render them in colored tube representation. The script also renders the target density map 0_rnap2.situs in gray:

mol load pdb 8_flexed_rnap.pdb
mol load pdb 0_rnap1.pdb
mol load situs 0_rnap2.situs
mol top 0
rotate stop
display resetview
display projection orthographic
mol modstyle 0 0 Tube 0.3 6
mol modstyle 0 1 Tube 0.3 6
mol modstyle 0 2 Isosurface 50 0 0 1 2 1
mol modcolor 0 0 ColorID 7
mol modcolor 0 1 ColorID 1
mol modcolor 0 2 ColorID 2

Don't forget to hit "enter" after the last line!  The final result should look like this: 


(click image to enlarge)

Note that some density regions are not optimally filled by the flexed structure. As an exercise, you can manually edit the connectivities or codebook vector positions to optimize the fit.
Visualization II: 2D

The following script shows how to take advantage of bash shell scripting to automate the work flow of rendering 2D projections and difference maps between a structure and a density map.  Projections of Difference maps = Difference of Projections (since there are no Situs tools for the latter, we use the former approach):

# create 15A resolution map from 0_rnap1.pdb
./pdb2vol 0_rnap1.pdb tmp.situs <<< '2
1
4
-15
1
1
1
'
# match size of tmp.situs to that of 0_rnap2.situs
rm 9_rnap1_15.situs
./voledit tmp.situs <<< '1
0
5
0
1
1
1
0
0
0
4
2
45
1
47
2
46
0
12
9_rnap1_15.situs
0
13
'
rm tmp.situs
# create Y projection from 9_rnap1_15.situs
rm 9_rnap1_15.dat
./voledit  9_rnap1_15.situs <<< '2
-999
0
11
3
9_rnap1_15.dat
0
13
'
# create Y projection from 0_rnap2.situs
rm 9_rnap2.dat
./voledit 0_rnap2.situs <<< '2
-999
0
11
3
9_rnap2.dat
0
13
'
# create difference map
./voldiff 0_rnap2.situs 9_rnap1_15.situs 9_diff.situs
# create Y projection from 9_diff.situs
rm 9_diff.dat
./voledit 9_diff.situs <<< '2
-999
0
11
3
9_diff.dat
0
13
'

The resulting projections (.dat files) can be inspected with a plotting program such as MATLAB. The final result should look like this: 


(click image to enlarge)

This script is part of the included run_tutorial.bash script (see the file for detailed documentation).
Return to the front page .