MUG'99 -- 23-26 February 1999 -- Santa Fe, NM

Daylight Overview, MUG '99 Edition - Q&A

Q: What is Daylight software used for?
A: Chemical information processing. Storage, retrieval, and display of chemical structures and other data and manipulation of that data in a network environment. Formulation and implementation of "chemical intelligence". High performance searching even for large databases. Our mission: to store all the world's chemical information. (Examples: minoxidil in medchem99, Beta-Aspartate-Hydroxamate in wdi984.)
Q: How is Daylight software better than, or different from, competing software?
A: A toolkit-based approach (since ~1990) facilitating cross-platform, multi-vendor interoperability over local and large networks. Rigorous and public chemical information languages and algorithms facilitating the open communication of chemical information and knowledge. Foremost among these languages is SMILES, a simple, comprehensive, uniquifiable nomenclature for chemistry, invented by Dave Weininger.
Q: Huh? You mean SMILES is a Daylight invention? You've got to be kidding me.
A: I'm not kidding. And SMILES has steadily gained acceptance and popularity and become an industry standard. Other Daylight languages include SMARTS, for flexible substructure pattern matching, Thor DataTrees (TDTs) for data, and Reaction SMILES.
Q: What do you mean by "toolkit-based" exactly?
A: The Daylight Toolkits are programming libraries of functions comprising a well defined and supported API -- application programming interface -- suitable for commercial software developers and code-happy rapid-prototyping computational chemists alike. The full power and functionality of Daylight software is made available at the toolkit interface. In some respects, many of the Daylight applications are simply examples of Daylight Toolkit programs.

In addition, many Daylight applications can be employed as tools, and integrated with other software to create a custom solution.

Q: So... is this an expert system? A neural network? A genetic algorithm?
A: No, but these things could be written using the toolkits. Daylight provides a "chemical information processing infrastructure" which offers many basic and advanced features "out of the box" and is infinitely customizable to accommodate the needs of users.
Q: I don't really want to do any programming, and why should I? You are the software company. What can your application software do out of the box?
A: Well, here's an example, a scenario. Let's say you have a file of 500,000 compounds with associated data. You can convert them to Daylight TDTs and create a Daylight database. In so doing you find that there are 30,000 duplicate structures; these will be merged automatically thanks to the uniquification of SMILES. 40,000 are stereoisomers; these will be merged also, with the stereoisomeric SMILES retained along with their respective associated data. You can also:
  • Use ClogP to calculate logP values and add them to the database.
  • Calculate approximate low-energy conformation data with Rubicon.
  • Cluster all 500K structures with the Clustering Package.
  • Perform Thor lookups on all corporate reg numbers and find that several are erroneously ambiguous.
  • Perform similarity searches in about 1.5 seconds.
  • Make the database available to everyone in your company, either by Daylight's X-clients, or with a web browser plus the new JavaGRINS molecular editor.
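The duplicate merging in this scenario can be sketched in a few lines of Python. This is only an illustration, not Daylight code: it assumes the SMILES strings have already been made unique (canonicalized) by a toolkit, so identical structures carry identical strings. The record layout, the registry IDs, the data fields, and the function name merge_by_unique_smiles are all hypothetical.

```python
from collections import defaultdict

# Hypothetical input records: (registry id, unique SMILES, associated data).
# The SMILES are assumed to be canonical already, so equal structures
# have equal strings and can serve directly as dictionary keys.
records = [
    ("REG-001", "CCO", {"supplier": "A"}),
    ("REG-002", "c1ccccc1", {"supplier": "B"}),
    ("REG-003", "CCO", {"batch": 7}),
]


def merge_by_unique_smiles(records):
    """Merge records sharing one unique SMILES into a single entry."""
    merged = defaultdict(lambda: {"ids": [], "data": {}})
    for reg_id, smiles, data in records:
        entry = merged[smiles]
        entry["ids"].append(reg_id)   # keep every registry id
        entry["data"].update(data)    # pool the associated data
    return dict(merged)


database = merge_by_unique_smiles(records)
for smiles, entry in database.items():
    print(smiles, entry["ids"], entry["data"])
```

Because the unique SMILES is the key, no structure-by-structure comparison is needed; duplicates collapse as a side effect of loading.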
Q: Impressive. I'm almost convinced. Say, does it run on a PC?
A: Short answer: No.
Longer answer: No, but insofar as PCs run browsers, you can run it "from a PC". Also, there is a Remote Toolkit for toolkit programming with 16- or 32-bit Windows (or MacOS).
Vaguer answer: Maybe one day a Linux port or NT port.
Q: So what platforms does Daylight run on? What hardware do I need?
  • SGI, IRIX 5.3 and up
  • Sun, SunOS 5.4 (Solaris 2.4) and up

    The Merlinserver requires enough physical RAM to hold all searchable data "pools". For a typical 500,000-structure database, that means perhaps 200MB of additional RAM. Multiple CPUs will improve search speed.

Q: Does Daylight sell databases too?
A: Yes, Daylight publishes and distributes databases authored and maintained by other organizations. We convert these datasets to Daylight format, often adding data such as clustering data and 3D coordinates. There is a growing list: Medchem (BioByte/Pomona), World Drug Index (Derwent), Index Chemicus and Current Chemical Reactions (ISI), Spresi and SpresiReact (InfoChem), ACD (MDLIS), the Maybridge catalog, AsInEx catalog, and others. For more info see our website.
Q: Let's see how you'll answer this one: What is Daylight's greatest weakness?
A: Daylight has focused on its toolkits, not its user interfaces. Early on, the need to learn SMILES was an activation energy barrier which discouraged many potential users. Still, in many ways, we expect more effort and savvy from the user, but we believe the effort is justified and well rewarded.
Q: My company uses Oracle to store biological data. Can I integrate a Daylight system with Oracle?
A: I'm glad you asked me that. Yes. Daylight and Oracle have an ongoing joint effort to create a Daylight chemistry cartridge for Oracle. Also, Daylight has partnered with several other companies, such as Informix, CAS, MSI, Netgenics, and Synopsys, to integrate products and improve their overall usefulness for users. See our partners page for the list. Moreover, as described, Daylight's toolkit-based architecture facilitates software integration, which anyone can do.
Q: Can Daylight handle reactions?
A: Yes. The Reaction Toolkit was introduced in 1996, along with the Reaction SMILES language for describing reactions consisting of reactants, agents and products. Reactions are fully integrated into the Daylight system and can be searched powerfully and flexibly. Reaction transforms, encoded as SMIRKS, allow the toolkit to implement "virtual chemistry", enabling a wide range of programmatic possibilities, from virtual combinatorial libraries to reaction expert systems. (References: 1. Daylight Theory Manual; 2. Reaction Talk, MUG '97 [J. Delany].)
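As a small illustration of the reactants/agents/products layout: a Reaction SMILES string is written as reactants>agents>products, with "." separating the molecules within each part. The Python sketch below only splits the text into those three roles; it does not parse or validate any chemistry, and the function name split_reaction_smiles is hypothetical rather than part of any Daylight toolkit.

```python
def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into lists of component SMILES."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators: %r" % rxn)
    # An empty part (for example, no agents) becomes an empty list.
    reactants, agents, products = (p.split(".") if p else [] for p in parts)
    return {"reactants": reactants, "agents": agents, "products": products}


# Acid-catalysed esterification: acetic acid + ethanol -> ethyl acetate + water
parsed = split_reaction_smiles("CC(=O)O.OCC>[H+]>CC(=O)OCC.O")
print(parsed["reactants"])
print(parsed["agents"])
print(parsed["products"])

# A reaction with no agents leaves the middle part empty.
no_agents = split_reaction_smiles("C=C.BrBr>>BrCCBr")
print(no_agents["agents"])
```

A real toolkit would go further and interpret each component as a molecule; this sketch only shows how the three roles are laid out in the string.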
Q: How about the Monomer Toolkit -- what does that do?
A: The Monomer Toolkit was introduced in 1993 to support combinatorial chemistry, with languages Chuckles, Chortles and Charts. Monomers are user-configurable molecular building blocks. Combinatorial mixtures can be specified for oligomeric or scaffold-based libraries and stored in a database. Enumeration is thus avoided, and speed and efficiency are enhanced. Monomer Toolkit capability is built into the database servers and clients; however, no user-friendly library building application currently exists. (References: 1. Daylight Theory Manual; 2. White Paper on Daylight Combinatorics, C. James, 1996.)
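To see why avoiding enumeration matters, consider a rough Python sketch. It does not use Chuckles, Chortles or Charts syntax; the scaffold sites and monomer fragments below are hypothetical placeholders. The point is only that a library specification grows with the sum of the monomer counts, while the enumerated library grows with their product.

```python
from itertools import islice, product
from math import prod

# Hypothetical scaffold with three substitution sites and the monomers
# allowed at each site (fragment strings are placeholders).
sites = {
    "R1": ["C", "CC", "CCC"],
    "R2": ["O", "N"],
    "R3": ["F", "Cl", "Br", "I"],
}

# Storing the specification: one entry per monomer.
stored_items = sum(len(monomers) for monomers in sites.values())

# Enumerating the library: one entry per combination.
library_size = prod(len(monomers) for monomers in sites.values())

print("monomers stored:", stored_items)
print("compounds represented:", library_size)

# Members can still be generated lazily when needed, without
# materializing the whole library first.
first_three = list(islice(product(*sites.values()), 3))
print(first_three)
```

With realistic monomer lists (hundreds per site), the gap between the two numbers becomes many orders of magnitude.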
Q: So what's new at Daylight these days?

    New code:

  • 4.61 release (July '98) / 4.62 update (March '99?)
  • Daylight JavaGRINS and improvements
  • Daylight JavaTools in prototype
  • SMIRKS 4.6 improvements and fixes
  • smi2gif atom coloring via TDT input
  • Merlin part n-tuple fingerprint searching.
  • Atom (Stigmata) fingerprints supported in toolkit
  • Thor and Merlin connection timeout
  • Merlinserver pool load speed improvements
  • Rubicon rule changes
  • FPP Reaction searching
  • ClogP update
  • Daylight scripting wrappers
  • 64-bit clean port
  • PC Remote Toolkit Update (Win95/NT)

    New Databases:

  • Medchem99 (BioByte) -- more compounds and data
  • WDI 98.4 (Derwent) -- w/ new formatting
  • ACD 98.2 (MDL) -- bigger, more suppliers
  • CCR 98.1 (soon?) (ISI)
  • Asinex
  • Index Chemicus 98.1 (soon?) (ISI)
    Collaborations:

  • Oracle
  • MSI (Diversity Explorer, etc.)
  • Informix
  • Derwent
  • Synopsys (Accord)
  • CambridgeSoft
  • NetGenics
  • DataAspects
  • CAS
  • Others


  • Terravivo

    New Santa Fe office!

    Daylight Chemical Information Systems Inc.