Neighbor_Search: General Usage & Hints

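Neighbor_Search calculates the "neighborhood" of one or more compounds input by the user, that is, the set of database compounds whose similarity to each input compound falls within a chosen range.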

Compounds may be input in two formats:

If the compound is an "SC" compound or a "CP" compound, then the appropriate label, such as "SC-58635", may be entered in the "ID:" field of Neighbor_Search.
Otherwise, the SMILES string must be entered into the ID field. SMILES stands for Simplified Molecular Input Line Entry System and takes a bit of time to learn. Never fear, however, for you don't have to learn SMILES if you don't want to! There are at least two helper programs which can work out the SMILES string for a structure that you sketch:
- ChemDraw (on the Macintosh) can give you the string if you lasso your structure and then select "Copy SMILES" from the "Edit" menu. The string will be in the Clipboard and you can then paste it into the ID field of Neighbor_Search.
- GRINS (Graphical INput of Smiles) may be used on any platform. A button just below the ID field in Neighbor_Search allows you to enter the GRINS editor. If you have entered a SMILES string into the ID field and then press the Grins editor button, the structure will be depicted on the next page. You may then go on to the GRINS editor, which will start with the structure you have entered, or just go back.
You can see the SMILES strings for structures you have entered by pushing the Display Smiles toggle and then pressing the Redisplay button. This also applies to structures retrieved from any of the databases. So, for example, you may retrieve SC-58125, toggle on the SMILES display, cut and paste the SMILES string into the ID field, and press the Grins editor button to create a substituted SC-58125.

Once the compound is input, one or more databases to search against may be selected. The default is to search only the SC database.

Finally, a similarity range may be input. This range represents similarity as measured by the Tanimoto Coefficient (Tc) between molecules characterized by the Daylight fingerprint. The Tc ranges from 1.0 (highest homology) to 0.0 (no homology). Extensive experience with this similarity measure suggests that a range from 1.0 to 0.70 (the default values in Neighbor_Search) is a good choice for a first pass. If too many compounds are found using these values, the lower bound should be raised from 0.70. Likewise, if too few compounds are found, the range may be expanded by decreasing the lower bound. Research at Abbott Laboratories with the Tc and the Daylight fingerprint suggests that, on average, relatively high homology (Tc >= 0.85) is required in order to see similarity in terms of biological activity.
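
For readers curious how the similarity range behaves, here is a minimal sketch in Python of the standard Tanimoto formula, Tc = c / (a + b - c), where a and b are the numbers of bits set in the two fingerprints and c is the number of bits set in both. It is an illustration only, not the code behind Neighbor_Search. The fingerprints are toy sets of bit positions rather than real Daylight fingerprints, and the function names, compound labels, and the handling of empty fingerprints are all assumptions made for the example.

```python
from __future__ import annotations


def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit positions."""
    common = len(fp_a & fp_b)                  # c: bits set in both fingerprints
    total = len(fp_a) + len(fp_b) - common     # a + b - c: bits set in either one
    # Assumption: two empty fingerprints are scored 0.0; this is a convention chosen here.
    return common / total if total else 0.0


def neighbors(query: set[int], database: dict[str, set[int]],
              lower: float = 0.70, upper: float = 1.0) -> list[tuple[str, float]]:
    """Return (label, Tc) pairs with lower <= Tc <= upper, most similar first."""
    hits = []
    for label, fp in database.items():
        tc = tanimoto(query, fp)
        if lower <= tc <= upper:
            hits.append((label, tc))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data: hypothetical labels and made-up bit positions, for illustration only.
query_fp = set(range(1, 11))                            # bits 1..10
toy_db = {
    "cmpd-A": set(range(1, 11)),                        # identical to the query
    "cmpd-B": set(range(1, 10)) | {11},                 # 9 bits shared, 11 in the union
    "cmpd-C": set(range(1, 6)) | set(range(20, 25)),    # 5 bits shared, 15 in the union
}

default_hits = neighbors(query_fp, toy_db)                # range 1.0 to 0.70
strict_hits = neighbors(query_fp, toy_db, lower=0.85)     # raised lower bound
wide_hits = neighbors(query_fp, toy_db, lower=0.30)       # lowered lower bound
```

With this toy data, the default range keeps cmpd-A and cmpd-B, raising the lower bound to 0.85 keeps only cmpd-A, and lowering it to 0.30 brings in cmpd-C as well. This mirrors the advice above: raise the lower bound to shrink the neighborhood, and lower it to expand the neighborhood.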