MUG '03 -- 25 - 28 Feb, 2003

Tuneable Similarity Searching

David Gange
Row2 Technologies, Inc.


Current implementations of similarity searching handle the weighting of variables in one of two ways. Either the weighting issue is ignored, in which case variables with larger numeric values count for more in the similarity calculation, or the variables are rescaled to have a mean of 0 and unit variance, in which case each variable has the same weight. We allow users to apply their own weights to the variables, giving them control over which variables matter most in the determination of similarity. By adjusting the weights used in the similarity search, users can tune the search to achieve the results they desire.
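
The idea of user-supplied weights can be illustrated with a minimal sketch. It is not the implementation presented in the talk: the abstract does not name a similarity measure, so a weighted Euclidean distance over standardized variables is assumed here, and the function names, the two descriptor columns and the data values are all hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each column to mean 0 and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead of 0.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (assumed metric, not from the abstract)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def rank_by_similarity(rows, query_index, weights):
    """Return the indices of the other rows, most similar to the query first."""
    z = standardize(rows)
    q = z[query_index]
    others = [i for i in range(len(rows)) if i != query_index]
    return sorted(others, key=lambda i: weighted_distance(q, z[i], weights))


# Hypothetical data: two variables per compound (column 0, column 1).
compounds = [
    [300.0, 2.0],  # 0: query
    [310.0, 5.0],  # 1: close to the query in column 0, far in column 1
    [450.0, 2.1],  # 2: far from the query in column 0, close in column 1
]

# Weighting only column 0 ranks compound 1 first.
print(rank_by_similarity(compounds, 0, [1.0, 0.0]))  # -> [1, 2]
# Weighting only column 1 ranks compound 2 first.
print(rank_by_similarity(compounds, 0, [0.0, 1.0]))  # -> [2, 1]
```

Standardizing first puts every variable on the same footing, so the weights alone express how much each variable should count; intermediate weights such as [1.0, 0.25] blend the two rankings.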

Presentation slides:

Daylight Chemical Information Systems, Inc.