MUG '04 -- 24 - 27 Feb, 2004

From 2D based Descriptors to QSAR and Classification Approaches

Darko Butina
ChemoMine Consultancy


SMARTS-based descriptors, built with the SMILES and SMARTS toolkits in Daylight software, offer a rather unique way of developing project-specific, highly relevant descriptors that capture the chemical diversity present. In my experience of developing QSAR (Quantitative Structure-Activity Relationship) models in drug discovery, describing a molecule in terms of chemical descriptors is probably the most crucial of the three key components needed to develop an SAR:

  1. Experimental Data
  2. Structural Descriptors
  3. Statistical Method

A Daylight-based program that reads an external, SMARTS-based definition file and calculates the count of each descriptor will be demonstrated. The use of the resulting fingerprints will be demonstrated in building a clogP model as a PLS linear regression (SIMCA), and also for more complex models that require decision trees (C5).
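The descriptor-counting step can be illustrated with a minimal sketch. It is not the program from the talk and does not use the Daylight toolkit. The definition-file layout (one "name SMARTS" pair per line), the function names, and the example molecules are all assumptions made for illustration. Substructure matching is passed in as a function, and the stand-in used here only counts literal substrings of the SMILES string, which is not real SMARTS matching.

```python
from typing import Callable, List, Tuple

# Assumed (hypothetical) definition-file layout: one descriptor per line,
# "name <whitespace> SMARTS"; blank lines and lines starting with '#' are skipped.
DEFINITIONS_TEXT = """\
# name      pattern
chlorine    Cl
nitrogen    N
oxygen      O
"""

# A matcher takes (smiles, smarts) and returns the number of matches.
Matcher = Callable[[str, str], int]


def parse_definitions(text: str) -> List[Tuple[str, str]]:
    """Return (name, smarts) pairs in file order."""
    definitions = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, smarts = line.split(None, 1)
        definitions.append((name, smarts.strip()))
    return definitions


def naive_count(smiles: str, pattern: str) -> int:
    """Stand-in matcher: counts literal substrings of the SMILES string.

    This is NOT SMARTS matching; it only keeps the sketch runnable
    without a cheminformatics toolkit.
    """
    return smiles.count(pattern)


def count_descriptors(smiles: str,
                      definitions: List[Tuple[str, str]],
                      count_matches: Matcher) -> List[int]:
    """One count per descriptor, in definition order."""
    return [count_matches(smiles, smarts) for _, smarts in definitions]


def descriptor_table(smiles_list: List[str],
                     definitions: List[Tuple[str, str]],
                     count_matches: Matcher) -> List[List[int]]:
    """Rows are molecules, columns are descriptors: the model input matrix."""
    return [count_descriptors(s, definitions, count_matches) for s in smiles_list]


definitions = parse_definitions(DEFINITIONS_TEXT)
molecules = ["CCO", "NCCO", "ClCCCl"]  # illustrative SMILES only
table = descriptor_table(molecules, definitions, naive_count)

print([name for name, _ in definitions])
for smiles, row in zip(molecules, table):
    print(smiles, row)
```

In practice the stand-in matcher would be replaced by a toolkit call that performs true substructure matching. With RDKit, for example, the count could be taken as the length of `mol.GetSubstructMatches(pattern)`. The resulting count matrix is the kind of input a regression or decision-tree method would then consume.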

Presentation slides:

Daylight Chemical Information Systems, Inc.