Using Descriptor Counts in Clustering

Descriptor-Count Information

  • Descriptor Counts: Number of times a fingerprint bit is on (equals 1) in a dataset.


  • Useful Information: Variability of each bit (0-50%).


  • Probability that a bit will have the same value in a pair of compounds is related to the bit's variability by


    • p = (var)² + (1 - var)², with var expressed as a fraction (0 to 0.5)
    Variability    Prob. of a Match
    0.1            0.82
    0.2            0.68
    0.3            0.58
    0.4            0.52
    0.5            0.50
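
The relationship above can be checked with a short sketch. It rests on two assumptions that the slide does not state: that variability means the fraction of compounds holding the less common value of a bit (which would explain the 0-50% range), and that the two compounds in a pair are drawn independently, so both have the rarer value with probability var² and both have the commoner value with probability (1 - var)². The function names and the toy fingerprints are hypothetical, chosen only for illustration.

```python
def bit_variability(fingerprints):
    """Return (counts, variability) per bit for equal-length 0/1 fingerprints.

    counts[i] is the descriptor count: how many fingerprints have bit i on.
    variability[i] is ASSUMED here to be the fraction of fingerprints that
    hold the less common value of bit i, so it lies between 0 and 0.5.
    """
    n = len(fingerprints)
    # zip(*fingerprints) walks the dataset column by column (bit by bit).
    counts = [sum(column) for column in zip(*fingerprints)]
    variability = [min(c, n - c) / n for c in counts]
    return counts, variability


def match_probability(var):
    """Probability that a bit has the same value in two compounds,
    assuming the two compounds are drawn independently."""
    return var ** 2 + (1 - var) ** 2


# Hypothetical toy dataset: four compounds, four fingerprint bits.
fingerprints = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
]

counts, variability = bit_variability(fingerprints)
for bit, (count, var) in enumerate(zip(counts, variability)):
    print(f"bit {bit}: count={count} variability={var:.2f} "
          f"p(match)={match_probability(var):.2f}")
```

Bits that are always on or always off have variability 0 and match in every pair, so they cannot help separate compounds, while a bit at variability 0.5 matches only half the time.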


Robin Hewitt (rhewitt@acm.org), Feb 2003