Seeking new compounds but intimidated by the size of the catalogue? How do we handle a continually growing population of over three million structures?

Dave Langley

GlaxoSmithKline R & D, Stevenage, Hertfordshire, SG1 2NY, UK

My presentation draws mainly on my experience in recent years of managing compound acquisition for GlaxoWellcome research, with a little crystal-ball gazing for GSK. I will discuss the internal factors that influenced our decisions to acquire compounds, and the external opportunities as they have evolved.

As in many pharmaceutical companies, we realised some years ago that the compound collection was of limited diversity, as it was heavily dominated by compounds made in-house for previous programmes.

We also considered the collection too small to serve as a screening resource. Although compound numbers alone do not guarantee success in screening, a certain critical mass is needed. Over the past few years a growing number of specialist compound supply companies have emerged, which together offer hundreds of thousands of compounds. The compounds are mostly appropriate for biological screening, are of good quality, and can be provided in formats that reduce the handling demands in-house.

A critical element in the acquisition process is dealing effectively with the structure files from suppliers. We have actively collected supplier databases and merged them into cumulative 2D and 3D databases containing over three million unique structures. This powerful resource has enabled us to select compounds in various ways. Around 80% of the compounds we have acquired were intended to enhance the collection for across-the-board screening. The other 20% were selected specifically for screening against target families or, in some cases, single targets.
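As a rough illustration of what building a cumulative database of unique structures involves, the sketch below merges several supplier catalogues and collapses duplicates. It is not the procedure used in-house: it assumes each supplier record already carries a canonical structure string (for example a canonical SMILES produced upstream), and the function name, record layout and supplier names are all hypothetical.

```python
def merge_supplier_files(catalogues):
    """Merge supplier catalogues into one table keyed by unique structure.

    catalogues: dict mapping a supplier name to an iterable of
    (canonical_structure, catalogue_id) pairs. The structure strings are
    assumed to be canonicalised already, so equal strings mean equal
    structures (an assumption of this sketch).
    Returns a dict mapping each unique structure to the list of
    (supplier, catalogue_id) pairs that offer it.
    """
    merged = {}
    for supplier, records in catalogues.items():
        for structure, catalogue_id in records:
            key = structure.strip()
            if not key:
                # Skip records with no structure rather than guess one.
                continue
            merged.setdefault(key, []).append((supplier, catalogue_id))
    return merged


# Hypothetical example data: two suppliers offering one compound in common.
catalogues = {
    "SupplierA": [("CCO", "A-1"), ("c1ccccc1", "A-2")],
    "SupplierB": [("CCO", "B-9"), ("", "B-10")],
}
merged = merge_supplier_files(catalogues)
unique_count = len(merged)
```

Keeping every supplier reference against each unique structure, rather than discarding the duplicates, preserves the information needed later to choose between sources for the same compound.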

Looking ahead, we see the flow of new compounds and the establishment of new suppliers continuing, with new structure files arriving daily. Another significant sector consists of new companies that are generating compounds by library chemistries and evolving various business models.

Perhaps the most important area for the future is the combination of significant advances in computing capacity, which make it possible to handle huge numbers of structures, and developments in chemical software, which are enabling ever more precise definitions of lead-likeness and better compound selection techniques. Overall, external sources will continue to make a significant contribution to hit generation chemistry.

Daylight Chemical Information Systems, Inc.