(University of Sheffield)
John Bradshaw (GlaxoWellcome)
Data mining is currently an important topic in drug discovery, where vast amounts of data about compounds are being accumulated through combinatorial chemistry and high-throughput screening. Much effort has been devoted to developing and validating descriptors that are both relevant to bioactivity and rapid enough to calculate that large quantities of data can be handled. In this context, 2D fingerprints have proved successful in identifying close analogues ("me-too" compounds); however, they are less effective in identifying compounds that exhibit the same activity but belong to different structural classes.
We have been investigating the effectiveness of reduced graphs in identifying structure-activity relationships. Reduced graphs can provide a different view of compounds from that given by 2D fingerprints, since they generalise the structural features within a molecule while retaining the topology between the features. Preliminary experiments on activity classes extracted from the World Drug Index (WDI) show that reduced graphs are effective in identifying molecules belonging to the same activity class, and that the actives they retrieve complement those found using conventional fingerprints.
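The idea of generalising features while retaining topology can be illustrated with a minimal sketch. It is not the reduction scheme used in this work: the feature labels ("aromatic_ring", "hbond_donor"), the hand-assigned atom-to-group mappings, and the function names reduce_graph and signature are all assumptions made for illustration. Each group of atoms collapses to a single node, and two nodes are connected whenever a bond joins atoms in different groups.

```python
def reduce_graph(atom_group, group_type, bonds):
    """Collapse atoms into feature nodes, keeping the connections between them.

    atom_group: list mapping atom index -> group id (hand-assigned here)
    group_type: dict mapping group id -> feature label (hypothetical labels)
    bonds:      list of (atom_index, atom_index) pairs
    """
    nodes = dict(group_type)
    edges = set()
    for a, b in bonds:
        ga, gb = atom_group[a], atom_group[b]
        if ga != gb:  # bonds inside one feature vanish in the reduced graph
            edges.add(frozenset((ga, gb)))
    return nodes, edges


def signature(nodes, edges):
    """Crude comparison key: sorted node labels and sorted edge label pairs.

    This is not a full graph-isomorphism test; it suffices for tiny examples.
    """
    node_types = sorted(nodes.values())
    edge_types = sorted(tuple(sorted(nodes[g] for g in e)) for e in edges)
    return node_types, edge_types


# Toy molecule 1: phenol. Atoms 0-5 are the ring carbons, atom 6 is the oxygen.
phenol = reduce_graph(
    atom_group=[0, 0, 0, 0, 0, 0, 1],
    group_type={0: "aromatic_ring", 1: "hbond_donor"},
    bonds=[(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)],
)

# Toy molecule 2: 2-aminothiophene. Atom 0 is sulfur, 1-4 are carbons, 5 is nitrogen.
aminothiophene = reduce_graph(
    atom_group=[0, 0, 0, 0, 0, 1],
    group_type={0: "aromatic_ring", 1: "hbond_donor"},
    bonds=[(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 5)],
)

phenol_sig = signature(*phenol)
thiophene_sig = signature(*aminothiophene)
print(phenol_sig == thiophene_sig)
```

The two toy molecules differ in ring size and atom types, so atom-level descriptors would differ, yet both reduce to the same two-node graph. A practical implementation would need automatic feature perception and a proper graph-matching or similarity step, neither of which is shown here.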