Upgrading Databases to 4.71

Note: 4.7x and 4.81 database formats are identical. It is unnecessary to upgrade 4.71 databases to 4.72, 4.73, or 4.81.

Databases from versions 4.61 or 4.62 of Daylight software are not compatible with the 4.71 system, so you must follow the database upgrade steps before proceeding. (Pre-4.61 databases must first be upgraded to 4.61 before they are upgraded to 4.71. See the documentation for the program thordbfix461. Contact support@daylight.com for details and help.)

4.71 editions of the current versions of all Daylight-supplied commercial databases are available from Daylight (contact info@daylight.com, 949-367-9990).

For in-house databases, or if you prefer to do the upgrade yourself, the following procedure is provided. It uses the program thordbcheck471.

Why the need to upgrade? Several fixes and improvements to the SMILES Toolkit were made for 4.71. See the release notes for details. As a consequence, some SMILES are canonicalized differently in 4.71 than in the past. Aromaticity detection has changed somewhat, so the corresponding molecule objects have changed, and therefore the fingerprints have changed. So, for some fraction (usually a small minority) of pre-4.71 database TDTs, the SMILES (root-SMILES and any other SMILES) and fingerprints (FP and others) need to be replaced. thordbcheck471 provides a convenient and robust way to accomplish this task.

Note that the Thor database file format has not changed, so the 4.71 Thorserver will open a 4.62 database without complaint. The incompatibility lies in the data stored in the file: a SMILES whose stored canonical form differs from the one 4.71 generates will not be found by a Thor lookup.

STEPS:

  1. The 4.71 Thorserver should be installed and running. The environment should be set up so that applications such as thorls, thordbping, thorload and thordelete are available.
  2. BACK UP ALL DATABASES BEFORE UPGRADING! If any errors occur during the upgrade procedure, you will want access to the database in its original state.
  3. The 4.6* databases should be in Thor's database path as defined by $DY_DATABASE_PATH. Typically, databases are in a directory specified by $DY_THORDB; we will assume that this is the case for this procedure.
  4. Run thordbcheck471 as follows on the database to be upgraded. In this example we use the database name "foobar".
       thordbcheck471 -f foobar foobar@myhost
       Enter server access password for myhost:thor:myname >>
       Enter password for database foobar@myhost:thor:* >>
       ...
  5. Ensure that the database is not read-only. The command thordbinfo will provide this information. A read-only database can be made writable by using sthorman, thorchange, or by simply editing the header file (*.THOR) and removing the line "read only: TRUE". Here's the thorchange syntax:
       thorchange -SETACCESS WRITEABLE foobar@myhost::thor
  6. Use thordelete to delete the TDTs written to the file foobar.old, as follows:
       cat foobar.old | thordelete -RAW_DATA TRUE -INPUT_FORMAT TDT foobar@myhost
  7. Use thorload to load the updated TDTs written to the file foobar.new, as follows:
       cat foobar.new \
         | thorload -RAW_DATA TRUE -MERGE TRUE -OVERWRITE FALSE foobar@myhost
  8. If any errors occurred, or if any TDTs were written to foobar.bad, those TDTs will need to be corrected manually and then loaded as in the previous step. If it is not clear how to do this, contact Daylight Support.
  9. The "version:" field in the database header files (*.THOR) should be updated to "4.71". This must be done by editing the file. The version line should read: version: 4.71
  10. You may wish to verify that the upgrade has been successful by running thordbcheck471 a second time, on the converted database.

  IMPORTANT NOTE: This program does not handle the FPP dataitems automatically. If a database contains mixtures using the FPP datatype, these should be dumped, re-fingerprinted, and reloaded.
  NOTE: (OPTIONAL) It may be desirable to replace the $FPG<> datatree in a converted database to reflect the fact that the fingerprints are 4.71-compatible, since future loads of the database may otherwise produce warnings about it. Warnings about the $FPG datatype do not affect the function of the database.
  NOTE: Multi-field dataitems with SMILES normalizations are not handled by thordbcheck471. These must be fixed manually. If such a SMILES is not a root-id, it will be corrected by simply re-thorloading the database with option -RAW_DATA FALSE.
  NOTE: Daylight server and database passwords are required. These may be specified interactively when prompted by each thorfilter application, or in the dbspec, if DY_SECURE_PASSWORDS is set to TRUE.
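
The header edit described in the steps above (setting the version line to 4.71) could be scripted rather than done by hand. The following is a minimal sketch, not a Daylight-supplied tool. It assumes the *.THOR header is a plain-text file in which the version setting occupies a line of its own beginning with "version:". The function name set_thor_version and the header path in the usage line are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes the header file is plain text and that the version
# setting is a single line starting with "version:" (an assumption; check
# your own header file before relying on this).

# set_thor_version FILE
# Hypothetical helper: rewrites the "version:" line of FILE to "version: 4.71",
# leaving every other line unchanged. The original is kept as FILE.bak.
set_thor_version() {
    header="$1"
    # Keep a copy of the untouched header before changing anything.
    cp "$header" "$header.bak" || return 1
    # Replace the whole version line; lines that do not match pass through.
    sed -e 's/^version:.*$/version: 4.71/' "$header.bak" > "$header"
}

# Example use (the path is an assumption about where the header lives):
#   set_thor_version "$DY_THORDB/foobar.THOR"
```

The backup copy is written first so that the original header can be restored if the result is not what was expected.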

Automating the upgrade to 4.71

If you have a large number of databases to upgrade, you may wish to automate this process with a Unix shell script. The script thordbfix471 is provided in the contrib directory for this purpose (v471/contrib/src/admin/thordbfix471).
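
To illustrate what such automation might look like, here is a minimal sketch that strings together the thordbcheck471, thordelete, and thorload commands from the manual procedure for each database named on the command line. It is not the contents of thordbfix471 and makes no claim about how that script works. The variable names THOR_HOST and DRY_RUN and the functions run and upgrade_db are hypothetical. The sketch assumes that the .old, .new, and .bad files are written to the current directory and that database names contain no shell metacharacters. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only; NOT the contrib script thordbfix471.
# THOR_HOST and DRY_RUN are hypothetical settings for this example.
THOR_HOST="${THOR_HOST:-myhost}"   # placeholder host name
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute them

# Print a command line and, unless in dry-run mode, execute it.
run() {
    echo "+ $1"
    [ "$DRY_RUN" = "1" ] || eval "$1"
}

# Run the check/delete/load sequence from the manual steps for one database.
upgrade_db() {
    db="$1"
    spec="$db@$THOR_HOST"
    run "thordbcheck471 -f $db $spec" || return 1
    run "cat $db.old | thordelete -RAW_DATA TRUE -INPUT_FORMAT TDT $spec" || return 1
    run "cat $db.new | thorload -RAW_DATA TRUE -MERGE TRUE -OVERWRITE FALSE $spec" || return 1
    # TDTs in the .bad file still need manual correction and loading.
    if [ -s "$db.bad" ]; then
        echo "$db: TDTs in $db.bad need manual correction" >&2
    fi
}

# Process every database name given on the command line.
for db in "$@"; do
    upgrade_db "$db" || echo "FAILED: $db" >&2
done
```

Saved under a name of your choosing and invoked with one or more database names, the sketch prints the three commands for each database. Setting DRY_RUN=0 would execute them, in which case the tools may still prompt for passwords. Backing up the databases, making them writable, and updating the header version remain separate steps.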