Neighbor_Search: General Usage & Hints
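Neighbor_Search calculates the "neighborhood" of one or more compounds entered
by the user, that is, the database compounds that fall within a chosen
similarity range of each input compound.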
Compounds may be input in two formats:
- If the compound is an "SC" compound or a "CP" compound, then the appropriate
label, such as "SC-58635", may be entered in the "ID:" field of Neighbor_Search.
- Otherwise, the SMILES string must be entered into the ID field. SMILES stands for
Simplified Molecular Input Line Entry System and takes a bit of time to learn.
Never fear, however: you don't have to learn SMILES if you don't want to!
There are at least two helper programs which can figure out the SMILES strings
for structures which you sketch in:
- ChemDraw (on the Macintosh) can give you the string if you lasso your structure
and then select "Copy SMILES" from the "Edit" menu. The string will be in the
Clipboard and you can then paste it into the ID field of Neighbor_Search.
- GRINS (Graphical INput of Smiles) may be used on any platform. A button just
below the ID field in Neighbor_Search lets you enter the GRINS editor.
If you have entered a SMILES string into the ID field and then press
the GRINS editor button, the structure will be depicted on the
next page. You may then go on to the GRINS editor, which will start with the
structure you have entered, or just go back.
You can see the SMILES strings for structures you have entered by
pushing the Display Smiles toggle and then pressing the
Redisplay button. This also applies to structures retrieved from any of
the databases. So, for example, you may retrieve SC-58125, toggle on
the SMILES display, copy and paste the SMILES string into the ID field, and
press the GRINS editor button to create a substituted SC-58125.
Once the compound is input, one or more databases to search against may
be selected. The default is to search only the SC database.
Finally, a similarity range may be input. This range represents similarity
as measured by the Tanimoto Coefficient (Tc) between molecules characterized
by the Daylight fingerprint. The Tc ranges from 1.0 (highest homology) to 0.0
(no homology). Extensive experience with this similarity measure suggests that
a range from 1.0 to 0.70 (the default values in Neighbor_Search) is a good
choice for a first pass. If too
many compounds are found using these values, the lower bound should be raised
from 0.70. Likewise, if too few compounds are found, the range may be expanded
by decreasing the lower bound. Research at Abbott Laboratories with the
Tc/Daylight fingerprint suggests that, on average, relatively high homology
(Tc >= 0.85) is required in order to see similarity in terms of biological
activity.
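To make the similarity range concrete, here is a minimal sketch in Python of how a Tanimoto coefficient and a range filter could be computed. It is an illustration only, not the code Neighbor_Search runs: each fingerprint is represented as a set of "on" bit positions, the bit patterns are made-up toy values rather than real Daylight fingerprints, and the names tanimoto, neighbors, and the compound labels are hypothetical. For two fingerprints, the Tc is the number of bits set in both divided by the number of bits set in either.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints.

    Each fingerprint is a set of "on" bit positions (an assumption made
    for this sketch; real fingerprints are fixed-length bit strings).
    """
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def neighbors(query_fp, database, lower=0.70, upper=1.0):
    """Return (name, Tc) pairs with lower <= Tc <= upper, best match first."""
    hits = []
    for name, fp in database.items():
        tc = tanimoto(query_fp, fp)
        if lower <= tc <= upper:
            hits.append((name, tc))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data: hypothetical compound labels and invented bit patterns.
query = {1, 2, 3, 4}
toy_db = {
    "cmpd-A": {1, 2, 3, 4},     # identical bits
    "cmpd-B": {1, 2, 3, 5},     # 3 shared bits, 5 bits in the union
    "cmpd-C": {1, 2, 3, 4, 5},  # 4 shared bits, 5 bits in the union
    "cmpd-D": {9},              # no shared bits
}

first_pass = neighbors(query, toy_db)            # default range 1.0 to 0.70
stricter = neighbors(query, toy_db, lower=0.85)  # raise the lower bound
wider = neighbors(query, toy_db, lower=0.50)     # decrease the lower bound
```

With this toy data, the default range keeps cmpd-A and cmpd-C, raising the lower bound to 0.85 keeps only cmpd-A, and lowering it to 0.50 also admits cmpd-B. This mirrors the advice above: raise the lower bound when too many compounds are found, and decrease it when too few are found.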