Daylight Summer School 1998, July 28-30, St. John's College, Santa Fe, NM

Daylight Worksheet - Cluster Package

The Cluster Package generates clusters of compounds based on the Daylight fingerprint descriptor and the Jarvis-Patrick clustering algorithm. It can be used to select subsets of large datasets and to add clustering data to TDT files for insertion into Daylight databases. Keep track of the files from this exercise; they are used in the Day 2 labs.
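
For orientation before starting the exercise, the Jarvis-Patrick idea can be sketched in plain Python. This is not the Cluster Package code and uses none of its programs; the function names, the set-of-on-bits fingerprint representation, and the toy data are all assumptions made for illustration. In this sketch a compound is not counted in its own neighbor list, and two compounds are joined when each appears in the other's list and the lists share at least min_shared entries. Other Jarvis-Patrick variants differ on these details.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def nearest_neighbors(fps, k, threshold=0.0):
    """For each fingerprint, list up to k most similar others at or above threshold."""
    table = []
    for i, fp in enumerate(fps):
        scored = [(tanimoto(fp, other), j) for j, other in enumerate(fps) if j != i]
        scored = [(s, j) for s, j in scored if s >= threshold]
        scored.sort(key=lambda sj: (-sj[0], sj[1]))  # most similar first, ties by index
        table.append([j for _, j in scored[:k]])
    return table

def jarvis_patrick(nn, min_shared):
    """Join i and j when each lists the other and they share >= min_shared neighbors."""
    parent = list(range(len(nn)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(nn)), 2):
        mutual = j in nn[i] and i in nn[j]
        if mutual and len(set(nn[i]) & set(nn[j])) >= min_shared:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(nn)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Hypothetical toy fingerprints: two groups of three and one outlier.
toy_fps = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5},
    {10, 11, 12}, {10, 11, 13}, {10, 11, 12, 13},
    {20, 21},
]
toy_nn = nearest_neighbors(toy_fps, k=2, threshold=0.5)
toy_clusters = jarvis_patrick(toy_nn, min_shared=1)
print(toy_clusters)
```

Raising min_shared, or raising the similarity threshold so that neighbor lists get shorter, splits clusters apart. That is the trade-off to keep in mind when choosing a clustering level.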

  1. Generate a TDT file containing a clustered dataset from the ~mug/data/day1.cluster.smi dataset, using fixed-length fingerprints, 5 nearest neighbors, a Tanimoto threshold of 0.7, and a "reasonable" JP clustering level chosen from the jpscan output.
  2. Pick a representative subset of the clustered dataset from step one by selecting only the cluster centroids and the singletons.
  3. Update the nearneighbors table generated from the day1.cluster.tdt dataset with the ~mug/data/day1.smi dataset, fingerprinted with the same parameter set used in step one.
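
The selection of centroids and singletons can be pictured with a small Python sketch. It is only an illustration and does not use the Cluster Package. The worksheet does not say how a centroid is defined, so the sketch assumes it is the cluster member with the highest summed Tanimoto similarity to the other members. The function names, the set-of-on-bits fingerprints, and the example clusters are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def pick_representatives(fps, clusters):
    """Return the index of each singleton plus one assumed centroid per larger cluster."""
    reps = []
    for members in clusters:
        if len(members) == 1:
            reps.append(members[0])  # a singleton represents itself
            continue

        def total_similarity(i):
            # Assumed centroid score: summed similarity to the other members.
            return sum(tanimoto(fps[i], fps[j]) for j in members if j != i)

        # max() keeps the first member on ties, so the earliest listed member wins.
        reps.append(max(members, key=total_similarity))
    return sorted(reps)

# Hypothetical toy fingerprints and cluster assignments.
example_fps = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5},
    {10, 11, 12}, {10, 11, 13}, {10, 11, 12, 13},
    {20, 21},
]
example_clusters = [[0, 1, 2], [3, 4, 5], [6]]
representatives = pick_representatives(example_fps, example_clusters)
print(representatives)
```

The resulting subset has one entry per cluster, so its size equals the number of clusters, singletons included.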

Daylight Chemical Information Systems, Inc.