17 June 2004

Daylight CIS UK Ltd

Three’s a crowd

•The process of taking a large set of objects and partitioning them into subsets, such that objects within a subset are more *like* each other than they are *like* objects in other subsets, is known as clustering.

•If we take our ordered lists for all possible targets then, in the same way that a pair of compounds is said to be similar if they contain a proportion of the same substructures (shared bits = **c**), compounds can be grouped if they share a *proportion of nearest neighbours*.

•This grouping by proportion of shared nearest neighbours is an appropriate algorithm for Daylight non-parametric descriptors and is the basis of the Jarvis-Patrick clustering algorithm.
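
The grouping rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Daylight's implementation: fingerprints are modelled as plain sets of "on" bit positions, similarity is Tanimoto, and the names `tanimoto`, `nearest_neighbours`, `jarvis_patrick`, `k`, `k_min` and the `toy` data are all hypothetical. It uses one common variant of the rule, in which two compounds are joined when each appears in the other's list of `k` nearest neighbours and the two lists share at least `k_min` members (a proportion of `k_min / k`).

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    c = len(a & b)                    # shared bits (the slide's c)
    union = len(a) + len(b) - c
    return c / union if union else 0.0


def nearest_neighbours(fps, k):
    """Map each compound to the set of its k most similar other compounds."""
    nn = {}
    for name, fp in fps.items():
        # Ordered list of all other compounds, most similar first;
        # ties are broken by name so the result is deterministic.
        ranked = sorted((o for o in fps if o != name),
                        key=lambda o: (-tanimoto(fp, fps[o]), o))
        nn[name] = set(ranked[:k])
    return nn


def jarvis_patrick(fps, k, k_min):
    """Group compounds that are mutual neighbours sharing >= k_min neighbours."""
    nn = nearest_neighbours(fps, k)
    parent = {name: name for name in fps}   # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in combinations(fps, 2):
        mutual = a in nn[b] and b in nn[a]
        if mutual and len(nn[a] & nn[b]) >= k_min:
            parent[find(a)] = find(b)       # merge the two clusters

    clusters = {}
    for name in fps:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy fingerprints: two obvious families of three compounds.
toy = {
    "A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 2, 4, 5},
    "X": {10, 11, 12}, "Y": {10, 11, 13}, "Z": {10, 12, 13},
}

if __name__ == "__main__":
    print(jarvis_patrick(toy, k=2, k_min=1))
```

With `k=2` and `k_min=1` the toy data falls into two clusters, `A, B, C` and `X, Y, Z`. Raising `k_min` to 2 leaves every compound on its own, because no two neighbour lists of length 2 can share two members when neither compound is counted in its own list. Some formulations do include each compound in its own list, which changes the thresholds slightly.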