Icosahedral permuted comparison of relative shape similarity

Euromug '96 - New projects


The goal of this project is to develop molecular surface shape descriptors, examine the utility of various similarity metrics, and explore the possibility of building a high-speed search method for molecular shapes.


Fast methods for generating shape descriptors based on tessellated icosahedra and X-based development applications have been implemented. Current tasks include design of the search algorithm, evaluation of optimization methods, and integration of surface properties. Results are promising so far, except that our current approach for integrating surface properties plays havoc with many of the most powerful speed optimizations.



The IPCRESS strategy is to generate descriptions of molecular surfaces based on a tessellated icosahedron and to use these for fast comparisons of molecular shape. The general approach is similar to that used in the BURST program, but IPCRESS works from the inside out rather than the other way around. In theory, the method used for generating shape descriptors is essentially the same as that presented at MUG '96, and the methods used to compare molecular shape descriptors are, in general, also similar to those previously reported.

This method is theoretically amenable to very large speed optimizations. Exact shape descriptors are easy to compute (hundreds per second); approximate descriptors can be computed faster still using a grid method (~1000 operations per molecule). The rotational iteration is intrinsically fast because vector permutation can be used instead of transcendental functions (this is the main advantage of the icosahedral algorithm). Because there are no "real" geometry calculations, the whole search can be done with integer arithmetic. The descriptors are similar to fingerprints and are amenable to many of the same speed optimizations used in merlinserver.
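The permutation idea can be illustrated with a minimal Python sketch. It is not the ipcress implementation: it uses only the 12 vertices of an untessellated icosahedron for brevity, and the function names, the sum-of-absolute-differences score, and the choice of generators are all assumptions made for illustration. Floating-point trigonometry appears only once, when the 60 rotations are tabulated as index permutations; the comparison itself then needs nothing but integer lookups, subtractions, and additions.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def icosahedron_vertices():
    """The 12 vertices of a regular icosahedron centered on the origin."""
    verts = []
    for a in (-1.0, 1.0):
        for b in (-PHI, PHI):
            verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    return verts


def rotate(v, axis, angle):
    """Rotate vector v about axis by angle (Rodrigues' formula)."""
    n = math.sqrt(sum(c * c for c in axis))
    k = [c / n for c in axis]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cross = (k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0])
    dot = sum(k[i] * v[i] for i in range(3))
    return tuple(v[i] * cos_a + cross[i] * sin_a + k[i] * dot * (1 - cos_a)
                 for i in range(3))


def rotation_as_permutation(verts, axis, angle):
    """Index map p where the rotation carries vertex i onto vertex p[i]."""
    perm = []
    for v in verts:
        w = rotate(v, axis, angle)
        perm.append(min(range(len(verts)),
                        key=lambda j: math.dist(w, verts[j])))
    assert sorted(perm) == list(range(len(verts))), "not a symmetry rotation"
    return tuple(perm)


def rotation_group(verts):
    """All 60 icosahedral rotations, as permutations of vertex indices."""
    step = 2 * math.pi / 5  # five-fold rotation about a vertex axis
    generators = [rotation_as_permutation(verts, verts[0], step),
                  rotation_as_permutation(verts, verts[1], step)]
    identity = tuple(range(len(verts)))
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = tuple(g[i] for i in p)  # apply p, then g
            if q not in group:
                group.add(q)
                frontier.append(q)
    return sorted(group)


def best_match(a, b, group):
    """Smallest summed |a[i] - b[p[i]]| over all rotations p; integers only."""
    return min((sum(abs(a[i] - b[p[i]]) for i in range(len(a))), p)
               for p in group)


# Hypothetical integer descriptors (e.g. quantized ray distances).
group = rotation_group(icosahedron_vertices())
score, perm = best_match([5, 7, 7, 6, 5, 9, 8, 7, 6, 5, 7, 8],
                         [7, 5, 8, 7, 6, 5, 9, 8, 7, 6, 5, 7], group)
```

Because tessellating the icosahedron preserves its symmetry, the same 60 rotations should also act as pure permutations on a finer vertex set; only the length of each permutation table changes.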

This is the main window of the development program xvip:

The following examples are from an experiment using selected conformations from the medchem95c database. The idea was to determine whether ipcress could find similarities in "3-D" molecular shape among a small set of structures that differed in their "2-D" connectivity. Starting with standard medchem95c data (using the conformations and clusters in the standard distribution), structures were selected that were the "centroid" of 2D-based J-P clusters and that were also present in wdi95 with a known mechanism of action, indications, and at least 10 tradenames. Molecular shape descriptors were generated by computing 162 exact ray intersection distances from the atomic geometric mean to the VDW surface using non-unified radii (hydrogens were included in these computations). Results were evaluated by comparing various metrics across all possible inter-conformational comparisons. Graphical displays were used only to verify reasonable operation and are shown here for purposes of illustration.
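The descriptor computation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ipcress code: the 162 ray directions are taken to be the vertices of an icosahedron whose faces are subdivided twice (12, then 42, then 162 vertices), the "geometric mean" is taken to be the unweighted centroid of the atom positions, and the surface distance along each ray is taken to be the outermost exit point from any atomic sphere. The function names and the `(position, radius)` input format are hypothetical.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def tessellated_icosahedron(levels=2):
    """Unit direction vectors from an icosahedron subdivided `levels` times."""
    verts = []
    for a in (-1.0, 1.0):
        for b in (-PHI, PHI):
            verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]

    # With these coordinates the edge length is 2; faces are the
    # vertex triples that are pairwise one edge apart.
    def is_edge(i, j):
        return abs(math.dist(verts[i], verts[j]) - 2.0) < 1e-9

    faces = [f for f in itertools.combinations(range(12), 3)
             if all(is_edge(i, j) for i, j in itertools.combinations(f, 2))]
    verts = [unit(v) for v in verts]

    for _ in range(levels):
        midpoint = {}

        def mid(i, j):
            # One shared midpoint vertex per edge, pushed out to the sphere.
            key = (min(i, j), max(i, j))
            if key not in midpoint:
                m = tuple((verts[i][k] + verts[j][k]) / 2 for k in range(3))
                verts.append(unit(m))
                midpoint[key] = len(verts) - 1
            return midpoint[key]

        new_faces = []
        for a, b, c in faces:
            ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
            new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
        faces = new_faces
    return verts


def shape_descriptor(atoms, directions):
    """Ray distances from the atom centroid to the outermost sphere surface.

    atoms: list of ((x, y, z), radius) pairs. A ray that hits no sphere
    gets distance 0.0 (an arbitrary convention for this sketch).
    """
    n = len(atoms)
    center = tuple(sum(p[k] for p, _ in atoms) / n for k in range(3))
    descriptor = []
    for u in directions:
        far = 0.0
        for p, r in atoms:
            m = tuple(p[k] - center[k] for k in range(3))
            b = sum(m[k] * u[k] for k in range(3))
            disc = b * b - (sum(c * c for c in m) - r * r)
            if disc >= 0.0:  # the ray's line meets this sphere
                far = max(far, b + math.sqrt(disc))
        descriptor.append(far)
    return descriptor
```

Rounding each distance to a fixed-point integer would then give a descriptor suitable for integer-only comparison.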

Although the structures were selected to represent clusters of differing connectivity, some pairs remain which are similar by all other criteria, e.g., ofloxacin and norfloxacin:

Graphical representation of shape descriptors for these conformations of ofloxacin and norfloxacin:

Display of ipcress match: overlaid conformations (offset is due to geometric mean centering) and deviation in ray distances (white stars are ofloxacin intersections).

Sulindac is similar to ofloxacin in shape but not in connectivity. Here is the matching conformation of sulindac and its shape descriptor:

Overlaid conformations and shape descriptors of ofloxacin and sulindac (note that the match was for sulindac's mirror image):

Surface properties can be included in the similarity metric, but it is hard to determine a "correct" scaling between the shape and surface terms. Using simple connectivity-based charges as surface properties, and counting normalized shape and surface errors equally, the best match to testosterone in this set is stanolone:

Testosterone and stanolone conformations are shown as matched by ipcress, colored by surface property. The testosterone-oxymetazoline match is also shown; it is close in shape but not in surface property.
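One way to picture a metric that counts shape and surface errors equally is the sketch below. The normalization is an assumption made for illustration (the summed absolute difference divided by the mean summed magnitude of the two vectors, giving a value between 0 and 2); the normalization actually used in ipcress is not described here, and the function names and the `weight` parameter are hypothetical.

```python
def normalized_error(a, b):
    """Summed absolute difference, scaled by the mean summed magnitude."""
    diff = sum(abs(x - y) for x, y in zip(a, b))
    scale = sum(abs(x) + abs(y) for x, y in zip(a, b)) / 2
    return diff / scale if scale else 0.0


def combined_score(shape_a, shape_b, prop_a, prop_b, weight=0.5):
    """Blend of shape and surface-property error; weight=0.5 counts both equally."""
    return ((1 - weight) * normalized_error(shape_a, shape_b)
            + weight * normalized_error(prop_a, prop_b))


# Hypothetical per-ray values: identical shapes, opposite surface charges.
shape = [10, 20, 30]
score = combined_score(shape, shape, [1, -1, 0], [-1, 1, 0])
```

With such a blend, two candidates that fit equally well in shape are ranked by how well their surface properties agree, which is the situation described for the stanolone and oxymetazoline matches. Changing `weight` is the scaling decision the text calls hard to get "correct".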

Daylight Chemical Information Systems, Inc.