SMILES are not the only root
Traditionally, thor and merlin databases have been rooted in SMILES. Three example databases illustrate how that link is increasingly being broken, which places demands on the toolkit programs to handle trees that are not rooted in SMILES.
- DNP database
This is the Chapman and Hall database of natural products, which has been converted in-house to a thor database. It contains information about the natural sources of compounds and some of their properties.
If this database is sufficiently interesting to other DAYLIGHT users, we will try to arrange with Chapman and Hall to provide a version through DAYLIGHT. It is not currently available to users.
Anyone interested should mail email@example.com or me at firstname.lastname@example.org.
Alternatively, talk to me!
Many of the entries do not have structures, even though they may have molecular weight or formula information. It would be nice to be able to go back to the thor tree in the usual merlin manner, by clicking on the column.
- SIGMA database
As a tenth anniversary present, this is the substituent constant data, π, that came from Pomona College on microfiche. Various data have been added, such as STERIMOL parameters, and we have access to the H-bonding data of Abrahams.
The trees here are formally rooted in SMILES, albeit SMILES of partial structures.
Subject to Biobyte's agreement, this database can be distributed or put on the http server list. The Abrahams' data are not in the MUG version of the database.
- SMARTS database
One of the great strengths of the DAYLIGHT system has always been the ability to describe chemical concepts. Remember GCL!!!
Whilst SMARTS is powerful, it is not a language that occasional users remember easily. What has been put together is a database which allows SMARTS to be stored and retrieved in a sensible fashion. The trees are rooted on the name for the SMARTS; a suggested convention is that the first character is uppercase and the rest lowercase, like monomers. So, if we have a basic nitrogen, N_basic would be a suitable, though arbitrary, name.
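The naming convention could be checked automatically before a tree is accepted. The following is only a minimal sketch: the function name is hypothetical, and allowing digits and underscores after the first character is an assumption drawn from the N_basic example rather than a stated rule.

```python
import re

# Assumed reading of the convention: one uppercase letter, followed by
# lowercase letters, digits or underscores (as in "N_basic").
NAME_PATTERN = re.compile(r"[A-Z][a-z0-9_]*")


def is_conventional_name(name: str) -> bool:
    """Return True if the whole name matches the assumed convention."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("N_basic", "n_basic", "N_Basic"):
        print(candidate, is_conventional_name(candidate))
```

Under this reading only N_basic passes; the other two fail on the lowercase first character and on the second uppercase letter respectively.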
The tree contains data items for the SMARTS itself and for the English language description of what the SMARTS means. For an international company like ours, there is no reason why the description could not be in Italian, French or Spanish. Finally, the originator of the SMARTS is added.
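As an illustration of such a tree, the sketch below builds one entry and writes it out in the TAG<data> style of a thor datatree, closed by a "|" line. The tag names ($NAME, SMARTS, DESC, WHO), the class, the function and the originator are all hypothetical, since the datatypes actually defined in this database are not listed here, and no quoting of special characters is attempted.

```python
from dataclasses import dataclass


@dataclass
class SmartsEntry:
    name: str         # root identifier, e.g. "N_basic"
    smarts: str       # the SMARTS pattern itself
    description: str  # plain-language meaning of the pattern
    originator: str   # who contributed it


def to_datatree(entry: SmartsEntry) -> str:
    """Format an entry as TAG<data> lines, ending with a '|' line.

    The tag names are placeholders, not the database's real datatypes.
    """
    fields = [
        ("$NAME", entry.name),
        ("SMARTS", entry.smarts),
        ("DESC", entry.description),
        ("WHO", entry.originator),
    ]
    lines = [f"{tag}<{value}>" for tag, value in fields]
    lines.append("|")
    return "\n".join(lines)


if __name__ == "__main__":
    example = SmartsEntry(
        name="Amine_not_amide",
        smarts="[NX3;H2,H1;!$(NC=O)]",
        description="Primary or secondary amine, not amide",
        originator="A. Chemist",  # hypothetical contributor
    )
    print(to_datatree(example))
```

The root item comes first because the tree is rooted on the name rather than on a SMILES.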
To encourage users to contribute their useful SMARTS, we have a web-based mailer which mails the thor datatree to the database administrator. In principle we could thorload directly using the thorfilters but, as Jack has mentioned, there may be security implications for this type of operation.
Because the trees are post-processed, it is possible to put whitespace in the SMARTS to make them easier for humans to parse and so reduce transcription errors.
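One way to picture the post-processing is a step that removes the whitespace again before the SMARTS is used. This is a sketch under that assumption only; the function name is hypothetical and the actual post-processing may do more than this.

```python
def compact_smarts(spaced: str) -> str:
    """Remove all whitespace (spaces, tabs, newlines) from a SMARTS string.

    Assumes whitespace carries no meaning inside the pattern.
    """
    return "".join(spaced.split())


if __name__ == "__main__":
    readable = "[NX3; H2,H1; !$(NC=O)]"
    print(compact_smarts(readable))
```

The contributor can then space out the pattern however reads best, and the stored form stays compact.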
It would be nice to be able to use thorlookup to retrieve a SMARTS directly. Unfortunately, current versions require the $SMI datatype, which is entirely missing from this database.
This, too, could perhaps act as a 'contrib' database, added to across the web and maintained centrally at DAYLIGHT. Again, comments are welcome.