Thor/Merlin 4.5: Reaction normalizations


General

Version 4.5 of Thor supports existing datatype normalizations for reaction data and introduces two new reaction-specific normalizations.

Examples

The following simplified datatrees, which contain carbonate equilibria reactions, serve as an example. Assume that the $SMI datatype is defined with the USMILES, MAKEGRAPH and MAKERXNMOL normalizations; that $RMOL, $PMOL and REM are defined in the usual, simple way; that PK has two NUMERIC fields (pK and Temperature); and that PC has a single PART_NTUPLE field (pC). The following datatrees might then be loaded into a Thor database.

  $SMI<"O=C=O.[OH2]>>[OH]C(=O)[OH]">
  REM<"Dissolution of carbon dioxide in water.">
  PK<1.47;25.0>
  PC<6.99,-1.74,4.40>
  |
  $SMI<"[OH]C(=O)[OH]>>[H+].[O-]C(=O)[OH]">
  REM<"Dissociation of carbonic acid to bicarbonate.">
  PK<6.35;25.0>
  |
  $SMI<"[O-]C(=O)[OH]>>[H+].[O-]C(=O)[O-]">
  REM<"Dissociation of bicarbonate to carbonate.">
  PK<10.33;25.0>
  |

Note that the SMILES data are quoted in the above datatrees, as they must be because of the `>' characters. The REM(ark) data are also quoted, though they do not need to be in these cases. The three numbers 6.99, -1.74, and 4.40 in the PC dataitem form a component-tuple, corresponding to the reaction components in order: carbon dioxide, water, and carbonic acid, respectively.
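
The quoting behavior described above can be sketched in a few lines of Python. This is an illustration only, not part of Thor or any Daylight toolkit: the helper names (quote_field, format_dataitem) are hypothetical, and the set of characters that force quoting is an assumption, since the text above confirms only `>'. The doubling of embedded quotes is likewise an assumed convention.

```python
# Characters assumed to force quoting in a datatree field. Only '>' is
# confirmed by the surrounding text; the rest are assumptions for this sketch.
SPECIAL_CHARS = set('<>;|"')


def quote_field(value):
    """Return value unchanged if it is safe, otherwise wrap it in double quotes."""
    if any(ch in SPECIAL_CHARS for ch in value):
        # Doubling embedded quotes is an assumed escaping convention.
        return '"' + value.replace('"', '""') + '"'
    return value


def format_dataitem(tag, fields):
    """Build TAG<field1;field2;...>, quoting each field only when needed."""
    return tag + "<" + ";".join(quote_field(f) for f in fields) + ">"


# The reaction SMILES contains '>' and is quoted; the other data are not.
print(format_dataitem("$SMI", ["O=C=O.[OH2]>>[OH]C(=O)[OH]"]))
print(format_dataitem("REM", ["Dissolution of carbon dioxide in water."]))
print(format_dataitem("PK", ["1.47", "25.0"]))
```

Under these assumptions only the $SMI dataitem comes out quoted, while the REM and PK dataitems are written without quotes.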

Loading the above trees into a Thor database and then thorlist-ing them would produce the following datatrees containing normalized data:


  $SMI<"O.O=C=O>>OC(=O)O">
  REM<Dissolution of carbon dioxide in water.>
  PK<1.47;25.0>
  PC<-1.74,6.99,4.40>
  $GRF<O.OCO.OC(O)O>
  $RMOL<O>
  $RMOL<O=C=O>
  $PMOL<OC(=O)O>
  |
  $SMI<"OC(=O)O>>[H+].OC(=O)[O-]">
  REM<Dissociation of carbonic acid to bicarbonate.>
  PK<6.35;25.0>
  $GRF<OC(O)O.OC(O)O>
  $RMOL<OC(=O)O>
  $PMOL<[H+]>
  $PMOL<OC(=O)[O-]>
  |
  $SMI<"OC(=O)[O-]>>[H+].[O-]C(=O)[O-]">
  REM<Dissociation of bicarbonate to carbonate.>
  PK<10.33;25.0>
  $GRF<OC(O)O.OC(O)O>
  $RMOL<OC(=O)[O-]>
  $PMOL<[H+]>
  $PMOL<[O-]C(=O)[O-]>
  |

Several normalization features are illustrated here.
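
One visible effect in the first datatree is that the PC values change order together with the reaction components: carbon dioxide and water swap places in the normalized SMILES, and 6.99 and -1.74 swap with them. The Python sketch below reproduces that bookkeeping under loud assumptions. It does not use Thor or a Daylight toolkit; the CANONICAL table is a hand-written stand-in for unique-SMILES generation that covers only this example, and the function names are hypothetical.

```python
# Hand-written stand-in for unique-SMILES canonicalization (an assumption;
# a real system would compute these strings). Covers only this example.
CANONICAL = {"O=C=O": "O=C=O", "[OH2]": "O", "[OH]C(=O)[OH]": "OC(=O)O"}


def components(reaction_smiles):
    """List the components of 'reactants>agents>products' in written order."""
    reactants, agents, products = reaction_smiles.split(">")
    parts = []
    for role in (reactants, agents, products):
        if role:  # an empty role (e.g. no agents) contributes no components
            parts.extend(role.split("."))
    return parts


def permute_tuple(input_smiles, normalized_smiles, values):
    """Reorder per-component values to follow the normalized component order.

    Limitation of this sketch: components are matched by their canonical
    string, so a reaction with duplicate components is not handled.
    """
    original = [CANONICAL[c] for c in components(input_smiles)]
    by_component = dict(zip(original, values))
    return [by_component[c] for c in components(normalized_smiles)]


result = permute_tuple(
    "O=C=O.[OH2]>>[OH]C(=O)[OH]",  # SMILES as loaded
    "O.O=C=O>>OC(=O)O",            # SMILES as listed after normalization
    [6.99, -1.74, 4.40],           # PC values in loaded order
)
print(result)  # [-1.74, 6.99, 4.4]
```

The result matches the order of the normalized PC dataitem in the listing above: water first, then carbon dioxide, then carbonic acid.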
Daylight Chemical Information Systems, Inc.
info@daylight.com