Daylight Properties Package Reference Manual

Daylight Version 4.9
Release Date 08/01/11

Copyright notice: This document is copyrighted © 2011 by Daylight Chemical Information Systems, Inc. of Laguna Niguel, CA. Daylight explicitly grants permission to reproduce this document under the condition that it is reproduced in its entirety, including this notice. All other rights are reserved.

Table of Contents

  1. Introduction
  2. Prerequisites
  3. Conventions.
  4. Using Dayprop and Tautomer
  5. Using Dayproptalk and Tautomertalk
  6. Using Dayproptalk and Tautomertalk with DayCart[tm]
  7. Property Names and Descriptions
  8. Appendix: References

1. Introduction

1.1 Why use the Daylight Properties Package?

With the advent of virtual screening, there is a requirement for rapid estimation of physical properties directly from molecular structure. The Daylight Properties Package addresses this need by providing a command line program called Dayprop and an interactive program object called Dayproptalk. In addition, there are SQL scripts for using the Dayproptalk program with DayCart[tm] within Oracle. The modular architecture of the package allows additional properties to be added in the future.

2. Prerequisites

2.1 Programming Knowledge

Users of this manual should have general UNIX skills as well as general knowledge of Daylight software. Users of Dayproptalk should be familiar with the concept of program objects, as well as how to use them. Users of DayCart should have general knowledge of Oracle.

2.2 Software Requirements

Dayprop is included with the standard Daylight distribution (versions 4.82 or later), which can be downloaded from Daylight's web site (http://www.daylight.com). In order to use Dayprop, a "dayprop" program license from Daylight is required. Within Oracle, DayCart must be licensed to call the property package functions.

2.3 UNIX Configuration

Users need to have the appropriate environment variables set. For more information see the Daylight Installation Guide instructions.

3. Conventions.

Teletype is used for routine names, source code, computer output, object names, and environment variables.

Bold is used for commands you type.

Italic is used for filenames, variable names (except in examples), and emphasis in a description.


4. Using Dayprop and Tautomer

Dayprop and Tautomer are non-interactive programs that take Thor Data Tree (TDT) input containing SMILES and produce output in TDT format.

4.1 Dayprop

Dayprop takes a property to be calculated with the -PROPERTY option and outputs the calculated value into the PPROP datatype in the TDT.

Usage:

% dayprop [options] [infile.tdt [outfile.tdt]]
If filenames are given, Dayprop reads its input from infile.tdt and writes its output to outfile.tdt. If filenames are not specified, Dayprop reads from standard input and writes to standard output.

The output PPROP datatype consists of four parts: the property to be calculated, the calculated value, the method used to calculate the value, and a comment.

For example:
    % dayprop -property AVERAGE_MOL_WT -propid "MyComment"
    $SMI<CC>|
    $SMI<CC>PPROP<1;30.08;0;MyComment;>|

In the above example output, 1 is the indirect identifier for the property AVERAGE_MOL_WT, 30.08 is the computed AVERAGE_MOL_WT of ethane, 0 is the indirect identifier for the method version used to calculate the property, and "MyComment" is the user-supplied comment.
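The four-part layout described above can be unpacked by a short script. The following Python sketch is illustrative only and is not part of the Daylight distribution; the names PProp and parse_pprop are hypothetical. It assumes plain, unquoted field values that contain neither ";" nor ">", which may not hold for every property (for example FINGERPRINT), and it keeps the value as text.

```python
import re
from typing import List, NamedTuple


class PProp(NamedTuple):
    """One PPROP datatype (hypothetical helper type, not a Daylight API)."""
    property_id: int  # indirect identifier of the property
    value: str        # calculated value, kept as text
    method_id: int    # indirect identifier of the method version
    comment: str      # user-supplied comment, empty if none was given


# Assumption: field values contain no ';' or '>' characters.
_PPROP = re.compile(r"PPROP<([^>]*)>")


def parse_pprop(tdt_text: str) -> List[PProp]:
    """Extract every PPROP datatype found in a piece of TDT text."""
    results = []
    for body in _PPROP.findall(tdt_text):
        fields = body.split(";")
        # The comment is optional, so pad short records to four fields.
        fields += [""] * (4 - len(fields))
        results.append(
            PProp(int(fields[0]), fields[1], int(fields[2]), fields[3])
        )
    return results


record = parse_pprop("$SMI<CC>PPROP<1;30.08;0;MyComment;>|")[0]
print(record.property_id, record.value, record.method_id, record.comment)
```

Keeping the value as text avoids guessing a numeric type, since some properties (such as MOL_FORM) are not numbers.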

Dayprop options:
Note: Options are case insensitive and cannot be abbreviated.

% dayprop -h
% dayprop -HELP [TRUE|FALSE]
    Write the help message to standard output and exit.
% dayprop -version
% dayprop -VERSION [TRUE|FALSE]
    Write the version number of the dayprop program to standard output (e.g. 4.95) and exit. If the dayprop program is not accessible for any reason, dayprop outputs 000.
% dayprop -dump_indirect
% dayprop -DUMP_INDIRECT [TRUE|FALSE]
    Write the indirect data for use in the Thor databases to standard output and exit:

      % dayprop -dump_indirect
      $PI<0;NO_PROPERTY>|
      $PI<1;AVERAGE_MOL_WT>|
      $PI<2;MOL_FORM>|
      $PI<3;ROTBOND_COUNT>|
      $PI<4;HDONOR_COUNT>|
      $PI<5;HACCEPTOR_COUNT>|
      $PI<6;PARACHOR>|
      $PI<7;ACCURATE_MASS>|
      $PI<8;MOLAR_VOLUME>|
      $PI<9;RING_COUNT>|
      $PI<10;RIGIDITY>|
      $PI<11;FRAGMENT_COUNT>|
      $PI<12;FLEXIBILITY>|
      $PI<13;FINGERPRINT>|
      $PI<14;DEPICTION>|
      $PI<15;STEREOCENTER_COUNT>|
      $PI<16;PART_COUNT>|
      $PI<17;POLAR_SURFACE_AREA>|
      $PI<18;ATOM_COUNT>|
      $PI<19;MATCH_COUNT>|
      $PI<20;ALL>|
      $MI<0;4.95 DCIS Standard>|

% dayprop -PROPERTY property
    Works with all the properties listed below in the properties section, as well as the property ALL, which lists all the properties (except for MATCH_COUNT) at once.

    Example of a single property calculation:

      % dayprop -PROPERTY ATOM_COUNT
      $SMI<CC>|
      $SMI<CC>PPROP<18;2;0;>|

    Example using the ALL property to calculate all the properties at once:

      % dayprop -PROPERTY ALL
      $SMI<CC>
      |
      $SMI<CC>
      PPROP<1;30.08;0;>
      PPROP<2;C2H6;0;>
      PPROP<3;0;0;>
      PPROP<4;0;0;>
      PPROP<5;0;0;>
      PPROP<6;110.40;0;>
      PPROP<7;30.046951;0;>
      PPROP<8;56.00;0;>
      PPROP<9;0;0;>
      PPROP<10;1.0000;0;>
      PPROP<11;1;0;>
      PPROP<12;0.00;0;>
      PPROP<13;......................................E
      .................................................
      ..................................................................+
      ..........................20........................
      ...................................6.2..........6
      ...............................................................
      .........................1;0;>
      PPROP<14;-0.51,1.53,0.51,1.53;0;>
      PPROP<15;0;0;>
      PPROP<16;1;0;>
      PPROP<17;0.00;0;>
      PPROP<18;2;0;>
      |


% dayprop -PROPID "comment"
    Adds a comment to the comment field in PPROP.

    For example:

      % dayprop -PROPERTY AVERAGE_MOL_WT -PROPID "MyComment"
      $SMI<CC>|
      $SMI<CC>PPROP<1;30.08;0;MyComment;>|
% dayprop -SMARTS 'smarts'
    Defines the user pattern to be used with the MATCH_COUNT property.

    For example:

      % dayprop -PROPERTY MATCH_COUNT -SMARTS '[#6]'
      $SMI<CC>|
      $SMI<CC>PPROP<19;2;0;>|
% dayprop -SINGLE_PART [TRUE|FALSE]
    When TRUE, treats the entire input SMILES as a single molecule and computes a single property value. When FALSE, computes a comma separated list of property values for the dot separated components within the input SMILES. The default is FALSE.
    NOTE: -RXNDIFF TRUE can not be used with -SINGLE_PART FALSE option.


    For example:

      % dayprop -PROPERTY AVERAGE_MOL_WT -SINGLE_PART TRUE
      $SMI<CC.CC>|
      $SMI<CC.CC>PPROP<1;60.16;0;>|
% dayprop -RXNDIFF [TRUE|FALSE]
    Calculates the difference between the property values of the product and reactant. The default for the RXNDIFF option is FALSE. Note that -RXNDIFF TRUE can not be used with -SINGLE_PART FALSE.

    For example:

      % dayprop -PROPERTY AVERAGE_MOL_WT -SINGLE_PART TRUE -RXNDIFF TRUE
      $SMI<"CCCN>>CN">|
      $SMI<"CCCN>>CN">PPROP<1;-28.06;0;>|
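Because PPROP records carry indirect identifiers rather than names, a consumer of Dayprop output usually needs the table written by -DUMP_INDIRECT. The following Python sketch shows one way to build that lookup. It is an illustration rather than a Daylight utility: the function name parse_indirect is hypothetical, and the sketch assumes the $PI and $MI records look exactly like the listing shown for -DUMP_INDIRECT above.

```python
import re


def parse_indirect(dump_text):
    """Return ({property_id: name}, {method_id: description}).

    dump_text is the text written by 'dayprop -dump_indirect'.
    Assumption: names and descriptions contain no '>' character.
    """
    props = {
        int(ident): name
        for ident, name in re.findall(r"\$PI<(\d+);([^>]*)>", dump_text)
    }
    methods = {
        int(ident): name
        for ident, name in re.findall(r"\$MI<(\d+);([^>]*)>", dump_text)
    }
    return props, methods


sample = """\
$PI<0;NO_PROPERTY>|
$PI<1;AVERAGE_MOL_WT>|
$PI<18;ATOM_COUNT>|
$MI<0;4.95 DCIS Standard>|
"""
props, methods = parse_indirect(sample)
print(props[1], "/", methods[0])
```

With such a table, the leading 1 in PPROP<1;30.08;0;> can be reported as AVERAGE_MOL_WT without hard-coding identifiers in the calling script.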

4.2 Tautomer

% tautomer [options] [infile.tdt [outfile.tdt]]
If filenames are given, Tautomer reads its input from infile.tdt and writes its output to outfile.tdt. If filenames are not specified, Tautomer reads from standard input and writes to standard output. The original SMILES is returned in the list of tautomers.

The output $TAUT datatype consists of one part: the calculated value.

For example:
    % tautomer
    $SMI<Oc1cc(O)ncn1>
    |

    $SMI<Oc1cc(O)ncn1>
    $TAUT<O=C1CC(=O)N=CN1>
    $TAUT<OC1=NC=NC(=O)C1>
    $TAUT<Oc1cc(=O)[nH]cn1>
    $TAUT<Oc1cc(=O)nc[nH]1>
    $TAUT<Oc1cc(O)ncn1>
    |

Tautomer options:
Note: Options are case insensitive and cannot be abbreviated.

% tautomer -h
% tautomer -HELP [TRUE|FALSE]
    Write the help message to standard output and exit.
% tautomer -version
% tautomer -VERSION [TRUE|FALSE]
    Write the version number of the tautomer program to standard output (e.g. 4.95) and exit. If the tautomer program is not accessible for any reason, tautomer outputs 000.

% tautomer -NO_ENOL [TRUE|FALSE]

    This option allows only heteroatoms to participate as hydrogen donors or acceptors. This option suppresses keto-enol type tautomerism.
    The default is FALSE.

    % tautomer -NO_ENOL TRUE
    $SMI<Oc1cc(O)ncn1>
    |

    $SMI<Oc1cc(O)ncn1>
    $TAUT<Oc1cc(=O)[nH]cn1>
    $TAUT<Oc1cc(=O)nc[nH]1>
    $TAUT<Oc1cc(O)ncn1>
    |

% tautomer -ISO [TRUE|FALSE]
    When FALSE, the program returns the tautomers as unique SMILES. When TRUE, the program returns the tautomers as absolute SMILES.
    The default is FALSE.

    % tautomer -ISO TRUE
    $SMI<CC(=O)[C@H](C)C(=O)OC>
    |

    $SMI<CC(=O)[C@H](C)C(=O)OC>
    $TAUT<COC(=C(C)C(=C)O)O>
    $TAUT<COC(=C(C)C(=O)C)O>
    $TAUT<COC(=O)C(=C(C)O)C>
    $TAUT<COC(=O)[C@@H](C)C(=C)O>
    $TAUT<COC(=O)[C@@H](C)C(=O)C>
    |

% tautomer -KEKULE [TRUE|FALSE]
    When TRUE, the KEKULE option generates Kekule structures using dt_xsmiles(). When FALSE, the program generates canonical SMILES using dt_cansmiles().
    The default is FALSE.


    % tautomer -KEKULE TRUE
    $SMI<Oc1cc(O)ncn1>
    |
    $SMI<Oc1cc(O)ncn1>
    $TAUT<O=C1CC(=O)N=CN1>
    $TAUT<OC1=CC(=O)N=CN1>
    $TAUT<OC1=CC(=O)NC=N1>
    $TAUT<OC1=NC=NC(=O)C1>
    $TAUT<OC=1C=C(O)N=CN1>
    |

% tautomer -UNIQUE [TRUE|FALSE]
    When TRUE the UNIQUE option writes out the canonical tautomer. The canonical tautomer is the one generated by using the relative electronegativities of the atom types (O > S > Se > Te > N > C) as graph invariants to preferentially assign double bond and hydrogen positions in the tautomer. Although this tautomer often corresponds to the lowest energy form, it is not guaranteed, as it's generated from graph theory and does not consider extended electronic factors.
    The default is FALSE.

    % tautomer -UNIQUE TRUE
    $SMI<Oc1cc(O)ncn1>
    |

    $SMI<Oc1cc(O)ncn1>
    $TAUT<Oc1cc(=O)[nH]cn1>
    |
% tautomer -ITERATION_LIMIT [LIMIT]
    This is the maximum number of donor or acceptor positions to iterate. If a structure has more than LIMIT donors or more than LIMIT acceptors, no tautomer enumeration is performed. The default limit is 0, which causes the program to process every input structure until all possible tautomers have been generated. A limit of 10 is a reasonable value for avoiding long-running, pathological cases.

% tautomer -FIXED_SUBSTRUCTURE [SUBSTRUCTURE]
    A comma-separated list of SMARTS which are matched against each input molecule. Any atoms which match are marked as non-tautomerizable and hence stay fixed throughout the enumeration. Useful for excluding specific functional groups from the calculation.

    % tautomer
    $SMI<O=CCCCC(=O)N>
    |

    $SMI<O=CCCCC(=O)N>
    $TAUT<NC(=CCC=CO)O>
    $TAUT<NC(=CCCC=O)O>
    $TAUT<NC(=O)CCC=CO>
    $TAUT<NC(=O)CCCC=O>
    $TAUT<OC(=N)CCCC=O>
    $TAUT<OC=CCCC(=N)O>
    |

    vs:
    % tautomer -FIXED_SUBSTRUCTURE 'O=CN'
    $SMI<O=CCCCC(=O)N>
    |

    $SMI<O=CCCCC(=O)N>
    $TAUT<NC(=O)CCC=CO>
    $TAUT<NC(=O)CCCC=O>
    |
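Tautomer output lists one $TAUT datatype per tautomer under the $SMI it belongs to, so a small script can collect the results per input structure. The Python sketch below is an illustration only; the function name tautomers_by_smiles is hypothetical. It assumes unquoted values, so it would not handle quoted TDT data such as the reaction SMILES shown in the Dayprop section.

```python
import re


def tautomers_by_smiles(tdt_text):
    """Map each $SMI value to the list of $TAUT values that follow it.

    Assumption: values are unquoted and contain no '>' character.
    """
    result = {}
    current = None
    for tag, value in re.findall(r"\$(SMI|TAUT)<([^>]*)>", tdt_text):
        if tag == "SMI":
            current = value
            result.setdefault(current, [])
        elif current is not None:
            result[current].append(value)
    return result


output = """\
$SMI<O=CCCCC(=O)N>
$TAUT<NC(=O)CCC=CO>
$TAUT<NC(=O)CCCC=O>
|
"""
for smiles, tautomers in tautomers_by_smiles(output).items():
    print(smiles, len(tautomers), tautomers)
```

A dictionary keyed by the input SMILES makes it straightforward to compare runs, for example with and without -FIXED_SUBSTRUCTURE.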

5. Using Dayproptalk and Tautomertalk

Dayproptalk is an interactive program object that parallels Dayprop.
Tautomertalk is an interactive program object that parallels Tautomer.

5.1 What are Program Objects?

Program objects are used to provide two-way communication with an external process. An example is the clogptalk program, which computes hydrophobicity for a structure represented in SMILES. Using program objects, a calling program can start an external program, send input, receive the program's output, and perform other tasks while the external program remains running and ready for more input. This is particularly important for the many property programs that spend a significant amount of time initializing themselves. A number of programs supporting program objects are supplied with the release of Daylight Software. Most of these are supplied as contributed code in the progob directory. The commercial programs clogptalk and cmrtalk (in $DY_ROOT/bin) operate as program objects. The dayproptalk program operates in the same way, using the PIPETALK protocol.

5.2 Dayproptalk program object messages.

Dayproptalk responds to all of the standard program object messages, as defined in the Program Objects section of the Toolkit Programmer's Guide.

One minor difference is the behavior of Qwerty: Say HELP. If the current property is set, HELP returns a help message specific to that property, otherwise a generic help message is returned.
    Qwerty: Set PROPERTY UNDEFINED.
    Qwerty: Over.

    Property type set to UNDEFINED...0
    Qwerty: Over.
    Qwerty: Say HELP.
    Qwerty: Over.

    Reads SMILES, computes requested physical property
    -- takes one argument, the property to be calculated.
    One of AVERAGE_MOL_WT, MOL_FORM, ROTBOND_COUNT, HDONOR_COUNT,
    HACCEPTOR_COUNT, PARACHOR, ACCURATE_MASS, MOLAR_VOLUME,
    RING_COUNT, RIGIDITY, FRAGMENT_COUNT, FLEXIBILITY,
    FINGERPRINT, DEPICTION, STEREOCENTER_COUNT, PART_COUNT,
    ATOM_COUNT, POLAR_SURFACE_AREA
    Qwerty: Over.
    Qwerty: Set PROPERTY AVERAGE_MOL_WT.
    Qwerty: Over.

    Property type set to AVERAGE_MOL_WT...1
    Qwerty: Over.
    Qwerty: Say HELP.
    Qwerty: Over.

    Reads SMILES, calculates molecular weight based on
    average atomic weights for naturally occuring elements
    Qwerty: Over.
Dayproptalk also responds to the following object messages:

"Qwerty: Set PROPERTY property."
    Sets the property value.

    For example:

      Qwerty: Set PROPERTY AVERAGE_MOL_WT.
      Qwerty: Over.

      Property type set to AVERAGE_MOL_WT...1
      Qwerty: Over.

    NOTE: The Dayproptalk program optionally takes a single argument, the property name. When present, dayproptalk starts with the property set to the given value.

"Qwerty: Say PROPERTY".
    Shows the current property value.

      Qwerty: Say PROPERTY.
      Qwerty: Over.

      Property set to AVERAGE_MOL_WT.
      Qwerty: Over.
"Qwerty: Set SMARTS SMARTS."
    Sets the user defined SMARTS pattern.

      Qwerty: Set SMARTS [#6].
      Qwerty: Over.

      SMARTS set to [#6]
      Qwerty: Over.

"Qwerty: Say SMARTS."
    Shows the current user defined SMARTS pattern.

      Qwerty: Say SMARTS.
      Qwerty: Over.
      SMARTS set to [#6]
      Qwerty: Over.
"Qwerty: Set VALUE_ONLY TRUE."
"Qwerty: Set VALUE_ONLY FALSE."

    When TRUE, suppresses the SMILES in the output and returns only the calculated value. When FALSE, returns both the input SMILES and the calculated value. The default is FALSE.

    For example:

      Qwerty: Set PROPERTY AVERAGE_MOL_WT.
      Qwerty: Over.
      Property type set to AVERAGE_MOL_WT...1
      Qwerty: Over.
      CC
      Qwerty: Over.

      CC 30.08
      Qwerty: Over.
      Qwerty: Set VALUE_ONLY TRUE.
      Qwerty: Over.

      Value only output toggled
      Qwerty: Over.
      CC
      Qwerty: Over.

      30.08
      Qwerty: Over.
      Qwerty: Set VALUE_ONLY FALSE.
      Qwerty: Over.

      Value only output toggled
      Qwerty: Over.
      CC
      Qwerty: Over.

      CC 30.08
      Qwerty: Over.

"Qwerty: Set SINGLE_PART TRUE."
"Qwerty: Set SINGLE_PART FALSE."

    When TRUE treats the input SMILES as a single molecule and computes a single property value. When FALSE, computes a property value for each dot separated component within a SMILES. SINGLE_PART FALSE can not be used with the option RXNDIFF TRUE. The default is FALSE.

    For example:

      Qwerty: Set SINGLE_PART FALSE.
      Qwerty: Set PROPERTY AVERAGE_MOL_WT.
      Qwerty: Over.

      Property type set to AVERAGE_MOL_WT...1
      Qwerty: Over.
      CC.CC
      CC.CC 30.08,30.08
      Qwerty: Over.
      Qwerty: Set SINGLE_PART TRUE.
      Qwerty: Over.

      Single part only output toggled
      Qwerty: Over.
      CC.CC
      Qwerty: Over.

      CC.CC 60.16
      Qwerty: Over.
"Qwerty: Set RXNDIFF TRUE."
"Qwerty: Set RXNDIFF FALSE."

    When TRUE, returns the difference of computed property values between the product and reactant molecules. When FALSE, returns the computed property values for the individual component molecules. The default is FALSE.
    NOTE: SINGLE_PART FALSE can not be used with the option RXNDIFF TRUE.

      Welcome to dayproptalk
      Qwerty: Over.
      Qwerty: Set PROPERTY AVERAGE_MOL_WT.
      Qwerty: Over.

      Property type set to AVERAGE_MOL_WT...1
      Qwerty: Over.
      Qwerty: Set SINGLE_PART TRUE.
      Qwerty: Over.

      Single part only output toggled
      Qwerty: Over.
      Qwerty: Set RXNDIFF TRUE.
      Qwerty: Over.

      Single part rxn only output toggled
      Qwerty: Over.
      CCCN>>CN
      Qwerty: Over.
      CCCN>>CN -28.06
      Qwerty: Over.
Dayproptalk can be used successively to compute a set of properties, because the data after the SMILES is preserved with each run of dayproptalk:
    #!/bin/sh
    #
    # This shell script produces a space separated table containing:
    # SMILES Mol_Wt Rot_Bonds Frag_Count
    # suitable for an analysis program such as Excel or JMP
    #
    # Usage: sh test.sh smiles_file

    # Put in column headers
    echo "SMILES Mol_wt Rot_Bonds Frag_Count" > outfile
    #
    # Each dayproptalk stage appends its value after the data already
    # on the line; the "Welcome" banner lines are filtered out at the end.
    cat "$1" \
    | pipetalker dayproptalk AVERAGE_MOL_WT \
    | pipetalker dayproptalk ROTBOND_COUNT \
    | pipetalker dayproptalk FRAGMENT_COUNT \
    | grep -v "Welcome" >> outfile

    $ sh test.sh smiles
    $ more outfile

    SMILES Mol_wt Rot_Bonds Frag_Count
    CCCCN 73.16 2 1
    CCCN 59.13 1 1
    CCN 45.10 0 1

5.3 Tautomertalk program object messages.

Tautomertalk responds to all of the standard program object messages, as defined in the Program Objects section of the Toolkit Programmer's Guide.
Tautomertalk also responds to the following object messages:

"Qwerty: Set NO_ENOL TRUE."
"Qwerty: Set NO_ENOL FALSE."

    This option allows only heteroatoms to participate as hydrogen donors or acceptors. This option suppresses keto-enol type tautomerism.
    The default is FALSE.

    For example:

      Welcome to tautomertalk
      Qwerty: Over.
      Oc1cc(O)ncn1
      Qwerty: Over.

      O=C1CC(=O)N=CN1
      OC1=NC=NC(=O)C1
      Oc1cc(=O)[nH]cn1
      Oc1cc(=O)nc[nH]1
      Oc1cc(O)ncn1
      Qwerty: Over.
      Qwerty: Set NO_ENOL TRUE.
      Qwerty: Over.
      No_enol option toggled
      Qwerty: Over.
      Oc1cc(O)ncn1
      Qwerty: Over.
      Oc1cc(=O)[nH]cn1
      Oc1cc(=O)nc[nH]1
      Oc1cc(O)ncn1
      Qwerty: Over.

"Qwerty: Set ISO TRUE."
"Qwerty: Set ISO FALSE."

    When FALSE, the program returns the tautomers as unique SMILES. When TRUE, the program returns the tautomers as absolute SMILES.
    The default is FALSE.

    For example:

      Welcome to tautomertalk
      Qwerty: Over.
      CC(=O)[C@H](C)C(=O)OC
      Qwerty: Over.
      COC(=C(C)C(=C)O)O
      COC(=C(C)C(=O)C)O
      COC(=O)C(=C(C)O)C
      COC(=O)C(C)C(=C)O
      COC(=O)C(C)C(=O)C
      Qwerty: Over.
      Qwerty: Set ISO TRUE.
      Qwerty: Over.

      Iso option toggled
      Qwerty: Over.
      CC(=O)[C@H](C)C(=O)OC
      Qwerty: Over.
      COC(=C(C)C(=C)O)O
      COC(=C(C)C(=O)C)O
      COC(=O)C(=C(C)O)C
      COC(=O)[C@@H](C)C(=C)O
      COC(=O)[C@@H](C)C(=O)C
      Qwerty: Over.
"Qwerty: Set KEKULE TRUE."
"Qwerty: Set KEKULE FALSE."

    When TRUE, the KEKULE option generates Kekule structures using dt_xsmiles(). When FALSE, the program generates canonical SMILES using dt_cansmiles().
    The default is FALSE.

    For example:

      Welcome to tautomertalk
      Qwerty: Over.
      Oc1cc(O)ncn1
      Qwerty: Over.
      O=C1CC(=O)N=CN1
      OC1=NC=NC(=O)C1
      Oc1cc(=O)[nH]cn1
      Oc1cc(=O)nc[nH]1
      Oc1cc(O)ncn1
      Qwerty: Over.
      Qwerty: Set KEKULE TRUE.
      Qwerty: Over.

      Kekule option toggled
      Qwerty: Over.
      Oc1cc(O)ncn1
      Qwerty: Over.

      O=C1CC(=O)N=CN1
      OC1=CC(=O)N=CN1
      OC1=CC(=O)NC=N1
      OC1=NC=NC(=O)C1
      OC=1C=C(O)N=CN1
      Qwerty: Over.

"Qwerty: Set UNIQUE TRUE."
"Qwerty: Set UNIQUE FALSE."

    When TRUE the UNIQUE option writes out the canonical tautomer.
    The default is FALSE.

    For example:

      Welcome to tautomertalk
      Qwerty: Over.
      Oc1cc(O)ncn1
      Qwerty: Over.

      O=C1CC(=O)N=CN1
      OC1=NC=NC(=O)C1
      Oc1cc(=O)[nH]cn1
      Oc1cc(=O)nc[nH]1
      Oc1cc(O)ncn1
      Qwerty: Over.
      Qwerty: Set UNIQUE TRUE.
      Qwerty: Over.
      Unique option toggled
      Qwerty: Over.
      Oc1cc(O)ncn1
      Qwerty: Over.
      Oc1cc(=O)[nH]cn1
      Qwerty: Over.
"Qwerty: Set ITERATION_LIMIT [LIMIT]."
    This is the maximum number of donor or acceptor positions to iterate. If a structure has more than LIMIT donors or more than LIMIT acceptors, no tautomer enumeration is performed. The default limit is 0, which causes the program to process every input structure until all possible tautomers have been generated. A limit of 10 is a reasonable value for avoiding long-running, pathological cases.
"Qwerty: Set FIXED_SUBSTRUCTURE [SUBSTRUCTURE]."
    The SUBSTRUCTURE is a comma separated list of SMARTS which are matched against each input molecule. Any atoms which match are marked as non-tautomerizable and hence stay fixed throughout the enumeration. Useful for excluding specific functional groups from the calculation.

    For example:

      Welcome to tautomertalk
      Qwerty: Over.
      NC(=O)CCCC=O
      Qwerty: Over.

      NC(=CCC=CO)O
      NC(=CCCC=O)O
      NC(=O)CCC=CO
      NC(=O)CCCC=O
      OC(=N)CCCC=O
      OC=CCCC(=N)O
      Qwerty: Over.
      Qwerty: Set FIXED_SUBSTRUCTURE O=CN.
      Qwerty: Over.

      Substructure fixed to O=CN
      Qwerty: Over.
      O=CCCCC(=O)N
      Qwerty: Over.

      NC(=O)CCC=CO
      NC(=O)CCCC=O
      Qwerty: Over.

6. Using Dayproptalk and Tautomertalk with DayCart[tm]

The PIPETALK protocol provides an ideal way for the Daylight Oracle cartridge, DayCart[tm], to communicate with external functions. For instance, to populate a table column with property values using a SQL command:
    UPDATE my_table SET Molwt = average_mol_wt(smiles);

6.1 Dayproptalk with DayCart[tm]

The script dy_props_create.plb creates the ddprop package to be used with DayCart. The script dy_props_clean.plb removes the ddprop package. The singular versions of the functions (e.g. atom_count) compute the property with SINGLE_PART TRUE. The plural versions of the functions (e.g. atom_counts) compute a comma-separated list of properties with SINGLE_PART FALSE.

For example:
    SQL> select atom_count('CCC') from dual;

    ATOM_COUNT('CCC')
    -----------------
    3

    SQL> select atom_counts('CCC.CCC') from dual;

    ATOM_COUNTS('CCC.CCC')
    --------------------------------------------------------------------------------
    3,3
The following property functions are provided:
    The following functions take a VARCHAR2 or CLOB and return a single computed value using SINGLE_PART TRUE.
      operator function_name ( 
      smiles in VARCHAR2_OR_CLOB 
      ) 
      => NUMBER 
      
      Where function_name is:
      accurate_mass, atom_count, average_mol_wt, hacceptor_count, hdonor_count, flexibility, fragment_count, molar_volume, parachor, part_count, polar_surface_area, rigidity, ring_count, rotbond_count, stereocenter_count

      operator function_name ( 
      smiles in VARCHAR2_OR_CLOB 
      ) 
      => VARCHAR2_OR_CLOB 
      
      Where function_name is:
      depiction, fingerprint, mol_form

    The following functions take a VARCHAR2 or CLOB and return a comma separated list of computed values using SINGLE_PART FALSE.
      operator function_name ( 
      smiles in VARCHAR2_OR_CLOB 
      ) 
      => VARCHAR2_OR_CLOB 
      
      Where function_name is:
      accurate_masses, atom_counts, average_mol_wts, depictions, hacceptor_counts, hdonor_counts, flexibilities, fingerprints, fragment_counts, mol_forms, molar_volumes, parachors, polar_surface_areas, rigidities, ring_counts, rotbond_counts, stereocenter_counts


    The functions match_count and match_counts have the following function prototypes:
      operator match_count (
      smiles in VARCHAR2_OR_CLOB,
      smarts in VARCHAR2_OR_CLOB
      )
      => NUMBER
      
      operator match_counts ( smiles in VARCHAR2_OR_CLOB, smarts in VARCHAR2_OR_CLOB ) => VARCHAR2_OR_CLOB

6.2 Tautomertalk with DayCart[tm]

The PIPETALK protocol provides an ideal way for the Daylight Oracle cartridge, DayCart[tm], to communicate with external functions. For instance, to populate a table column with tautomer values using a SQL command:
    UPDATE my_table SET tautomer = compute_tautomer(smiles, tautomer_number, isomer_flag, no_enol_flag);
The script dy_props_create.plb creates the ddprop package to be used with DayCart. The script dy_props_clean.plb removes the ddprop package.

For example:
    SQL> select compute_tautomer('Oc1cc(=O)[nH]cn1',1,0,0) from dual;

    COMPUTE_TAUTOMER('OC1CC(=O)[NH]CN1',1,'0','0')
    --------------------------------------------------------------------------------
    O=C1CC(=O)N=CN1

The following property functions are provided:
    The following functions take a VARCHAR2 or CLOB and return a single computed value.
    The function count_tautomer counts the total number of tautomers.
      operator count_tautomer (
      sosdata IN VARCHAR2_OR_CLOB, isomer_flag IN VARCHAR2,
      no_enol_flag IN VARCHAR2, kekule_flag IN VARCHAR2
      )
      => NUMBER
      
    The function compute_tautomer takes a SMILES as input, the number of the tautomer to be returned, and option flags for isomer and no_enol. Tautomers are computed using aromaticity and are returned as SMILES strings.
      operator compute_tautomer (
      smiles in VARCHAR2_OR_CLOB, tautomer_number in NUMBER,
      isomer_flag IN VARCHAR2, no_enol_flag IN VARCHAR2
      )
      => VARCHAR2_OR_CLOB
    For example: To get the number of tautomers for the smiles 'Oc1cc(=O)[nH]cn1'.
      SQL> select count_tautomer('Oc1cc(=O)[nH]cn1',0,0,0) from dual;

      COUNT_TAUTOMER('OC1CC(=O)[NH]CN1',0,0)
      ----------------------------------------------------------
      5

      To get the 5th tautomer returned:
      SQL> select compute_tautomer('Oc1cc(=O)[nH]cn1',5,0,0) from dual;

      COMPUTE_TAUTOMER('OC1CC(=O)[NH]CN1',5,0,0)
      -------------------------------------------------------------------------------
      Oc1cc(O)ncn1
    The function compute_xtautomer takes a SMILES as input, the number of the tautomer to be returned, and option flags for isomer and no_enol. Tautomers are computed without using aromaticity and their Kekule SMILES are returned.
      operator compute_xtautomer (
      smiles in VARCHAR2_OR_CLOB, tautomer_number in NUMBER,
      isomer_flag IN VARCHAR2, no_enol_flag IN VARCHAR2
      )
      => VARCHAR2_OR_CLOB

      To get the 5th tautomer returned:
      SQL> select compute_xtautomer('Oc1cc(=O)[nH]cn1',5,0,0) from dual;

      COMPUTE_XTAUTOMER('OC1CC(=O)[NH]CN1',5,0,0)
      -------------------------------------------------------------------------------
      OC=1C=C(O)N=CN1
    The compute_tautomer and compute_xtautomer functions can be put into a loop to get out all the tautomers. For example:
      set serveroutput on
      declare
      tautomer varchar2(32000);
      xtautomer varchar2(32000);
      taut_num number := 0;
      n number;
      begin
       taut_num := ddprop.fcount_tautomer('O=c1[nH]cccc1',0,0,0);
       n := 1;
       while n <= taut_num loop
          tautomer := ddprop.fcompute_tautomer('O=c1[nH]cccc1',n,0,0);
          dbms_output.put_line(' tautomer ' || tautomer);
          n := n + 1;
       end loop;

       taut_num := ddprop.fcount_tautomer('O=c1[nH]cccc1',0,0,1);
       n := 1;
       while n <= taut_num loop
          xtautomer := ddprop.fcompute_xtautomer('O=c1[nH]cccc1',n,0,0);
          dbms_output.put_line(' xtautomer ' || xtautomer);
          n := n + 1;
       end loop;
      end;

      The output of the above code is:

      tautomer O=C1C=CCC=N1
      tautomer O=C1CC=CC=N1
      tautomer O=c1cccc[nH]1
      tautomer Oc1ccccn1
      xtautomer O=C1C=CC=CN1
      xtautomer O=C1C=CCC=N1
      xtautomer O=C1CC=CC=N1
      xtautomer OC1=CC=CC=N1
    The function smi2tautomer takes in a smiles and returns the unique tautomer.
    operator smi2tautomer (
    smiles in VARCHAR2_OR_CLOB
    ) 
    => VARCHAR2_OR_CLOB
    
    
      For example:
      SQL> select smi2tautomer('Oc1cc(O)ncn1') from dual;

      SMI2TAUTOMER('OC1CC(O)NCN1')
      -------------------------------------------------------------------------------
      Oc1cc(=O)[nH]cn1
    The function smi2xtautomer takes in a smiles and returns the unique xtautomer.
      For example:
      SQL> select smi2xtautomer('Oc1cc(O)ncn1') from dual;
      
      SMI2XTAUTOMER('OC1CC(O)NCN1')
      -------------------------------------------------------------------------------
      OC1=CC(=O)NC=N1
      
      
    The function count_subtautomer counts the total number of tautomers with a fixed substructure. The fixed substructure is a comma-separated list of SMARTS which are matched against each input molecule. Any atoms which match are marked as non-tautomerizable and hence stay fixed throughout the enumeration. Useful for excluding specific functional groups from the calculation.
      operator count_subtautomer (
      sosdata IN VARCHAR2_OR_CLOB, isomer_flag IN VARCHAR2,
      no_enol_flag IN VARCHAR2, kekule_flag IN VARCHAR2,
      substructure IN VARCHAR2_OR_CLOB
      )
      => NUMBER
      
    The function compute_subtautomer takes a SMILES as input, the number of the tautomer to be returned, the option flags for isomer and no_enol, and the fixed substructure. Tautomers are computed using aromaticity and are returned as SMILES strings.
      operator compute_subtautomer (
      smiles in VARCHAR2_OR_CLOB, tautomer_number in NUMBER,
      isomer_flag IN VARCHAR2, no_enol_flag IN VARCHAR2,
      substructure IN VARCHAR2_OR_CLOB
      )
      => VARCHAR2_OR_CLOB

    The function compute_xsubtautomer takes a SMILES as input, the number of the tautomer to be returned, the option flags for isomer and no_enol, and the fixed substructure. Tautomers are computed without using aromaticity and their Kekule SMILES are returned.
      operator compute_xsubtautomer (
      smiles in VARCHAR2_OR_CLOB, tautomer_number in NUMBER,
      isomer_flag IN VARCHAR2, no_enol_flag IN VARCHAR2,
      substructure IN VARCHAR2_OR_CLOB
      )
      => VARCHAR2_OR_CLOB
      
    The following example shows the difference in output between compute_tautomer and compute_subtautomer and between compute_xtautomer and compute_xsubtautomer.

    For example:
      set serveroutput on
      declare
      tautomer varchar2(32000);
      xtautomer varchar2(32000);
      subtautomer varchar2(32000);
      subxtautomer varchar2(32000);
      taut_num number := 0;
      n number;
      begin
       taut_num := ddprop.fcount_tautomer('NC(=O)CCCC=O',0,0,0);
       n := 1;
       while n <= taut_num loop
          tautomer := ddprop.fcompute_tautomer('NC(=O)CCCC=O',n,0,0);
          dbms_output.put_line(' tautomer ' || tautomer);
          n := n + 1;
       end loop;
      
       taut_num := ddprop.fcount_tautomer('NC(=O)CCCC=O',0,0,1);
       n := 1;
       while n <= taut_num loop
          xtautomer := ddprop.fcompute_xtautomer('NC(=O)CCCC=O',n,0,0);
          dbms_output.put_line(' xtautomer ' || xtautomer);
          n := n + 1;
       end loop;
      
       taut_num := ddprop.fcount_subtautomer('NC(=O)CCCC=O',0,0,0,'O=CN');
        n := 1;
        while n <= taut_num loop
           subtautomer := ddprop.fcompute_subtautomer('NC(=O)CCCC=O',n,0,0,'O=CN');
           dbms_output.put_line(' subtautomer ' || subtautomer);
           n := n + 1;
       end loop;
      
       taut_num := ddprop.fcount_subtautomer('NC(=O)CCCC=O',0,0,1,'O=CN');
       n := 1;
       while n <= taut_num loop
          subxtautomer := ddprop.fcompute_subtautomer('NC(=O)CCCC=O',n,0,0,'O=CN');
          dbms_output.put_line(' subxtautomer ' || subxtautomer);
          n := n + 1;
       end loop;
      end;
      
      The output for the above code is:

      tautomer NC(=CCC=CO)O
      tautomer NC(=CCCC=O)O
      tautomer NC(=O)CCC=CO
      tautomer NC(=O)CCCC=O
      tautomer OC(=N)CCCC=O
      tautomer OC=CCCC(=N)O
      xtautomer NC(=CCC=CO)O
      xtautomer NC(=CCCC=O)O
      xtautomer NC(=O)CCC=CO
      xtautomer NC(=O)CCCC=O
      xtautomer OC(=N)CCCC=O
      xtautomer OC=CCCC(=N)O
      subtautomer NC(=CCC=CO)O
      subtautomer NC(=CCCC=O)O
      subxtautomer NC(=O)CCC=CO
      subxtautomer NC(=O)CCCC=O


    The function set_iteration_limit takes a number as the limit. The limit is the maximum number of donor or acceptor positions to iterate over. If a structure has more than LIMIT donors or more than LIMIT acceptors, no tautomer enumeration is performed. The default limit is 0, which causes the program to enumerate every input structure until all possible tautomers have been generated. A reasonable limit for avoiding long-running, pathological cases is 10.
      operator set_iteration_limit(limit in NUMBER) => NUMBER
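
    The limit rule described above can be pictured with a short Python sketch. It is illustrative only and is not the Daylight implementation; the function name and the donor and acceptor counts passed to it are hypothetical.

```python
def should_enumerate(donor_count: int, acceptor_count: int, limit: int = 0) -> bool:
    """Return True if tautomer enumeration would be attempted under the stated rule.

    Hypothetical helper: the counts are assumed to be supplied by the caller.
    """
    # A limit of 0 is the documented default and means "no limit".
    if limit == 0:
        return True
    # Enumeration is skipped when either count exceeds the limit.
    return donor_count <= limit and acceptor_count <= limit


print(should_enumerate(50, 50))       # default limit of 0
print(should_enumerate(11, 2, 10))    # 11 donors exceed a limit of 10
```

    With a limit of 10, a structure with 11 donor positions would be skipped, while the default limit of 0 lets every structure through.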
      

7. Property Names and Descriptions.

The following properties can be calculated using dayprop or dayproptalk:

ACCURATE_MASS
    Molecular weight in atomic mass units using the most common isotope of each element. This version uses IUPAC 1989 values. NOTE: Isotopic atom specifications in the input SMILES are ignored for this calculation.

    For example, the weight of the most common isotope for hydrogen is 1.007825 amu:

      Dayprop
      % echo '$SMI<[H]>|' | dayprop -property accurate_mass
      $SMI<[H]>PPROP<7;1.007825;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY ACCURATE_MASS.
      Qwerty: Over.
      Property type set to ACCURATE_MASS...7
      Qwerty: Over.
      [H]
      Qwerty: Over.
      [H] 1.007825
      Qwerty: Over.

      SQL
      SQL> select accurate_mass('[H]') from dual;

      ACCURATE_MASS('[H]')
      --------------------
      1.007825
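
    Dayprop writes each result as a PPROP datatype following the $SMI datatype, as in the record shown above. The following Python sketch splits such a record into its fields. The function name parse_pprop is hypothetical, and the meaning of the third PPROP field (0 in every example in this manual) is not described here, so it is returned uninterpreted.

```python
import re

# One record of the form $SMI<smiles>PPROP<number;value;flag;>|
# The layout is inferred from the examples in this manual.
RECORD = re.compile(
    r"^\$SMI<(?P<smiles>.*?)>PPROP<(?P<number>\d+);(?P<value>[^;]*);(?P<flag>[^;]*);>\|$"
)


def parse_pprop(line: str) -> dict:
    """Split one dayprop output record into SMILES, property number, value and flag."""
    match = RECORD.match(line.strip())
    if match is None:
        raise ValueError("not a $SMI/PPROP record: %r" % line)
    return {
        "smiles": match.group("smiles"),
        "property_number": int(match.group("number")),
        "value": match.group("value"),   # kept as text; may be No_Value
        "flag": match.group("flag"),     # meaning not documented here
    }


print(parse_pprop("$SMI<[H]>PPROP<7;1.007825;0;>|"))
```

    The value is kept as text because some properties return strings (for example MOL_FORM) or No_Value.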


ATOM_COUNT
    The count of heavy atoms in a molecule. Hydrogens are always ignored. Used to moderate the molecular weight values.

    For example, the atom count of methane is 1:

      Dayprop
      % echo '$SMI<C([H])([H])([H])[H]>|' | dayprop -property atom_count
      $SMI<C([H])([H])([H])[H]>PPROP<18;1;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY ATOM_COUNT.
      Qwerty: Over.

      Property type set to ATOM_COUNT...18
      Qwerty: Over.
      C([H])([H])([H])[H]
      Qwerty: Over.

      C([H])([H])([H])[H] 1
      Qwerty: Over.

      SQL
      SQL> select atom_count('C([H])([H])([H])[H]') from dual;

      ATOM_COUNT('C([H])([H])([H])[H]')
      ---------------------------------
      1

AVERAGE_MOL_WT
    Molecular weight based on average atomic weights for naturally occurring elements. NOTE: Isotopic atom specifications in the input SMILES are ignored for this calculation.

    For example, the average molecular weight of hydrogen is 1.01 amu:

      Dayprop
      % echo '$SMI<[H]>|' | dayprop -property average_mol_wt
      $SMI<[H]>PPROP<1;1.01;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY AVERAGE_MOL_WT.
      Qwerty: Over.
      Property type set to AVERAGE_MOL_WT...1
      Qwerty: Over.
      [H]
      Qwerty: Over.
      [H] 1.01
      Qwerty: Over.

      SQL
      SQL> select average_mol_wt('[H]') from dual;

      AVERAGE_MOL_WT('[H]')
      ---------------------
      1.01
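
    The difference between ACCURATE_MASS and AVERAGE_MOL_WT can be illustrated with a small Python sketch that sums per-element masses over a molecular formula. The mass tables below are commonly tabulated values assumed for illustration; they are not taken from the Daylight source and cover only four elements.

```python
# Masses in amu. Assumed, commonly tabulated values (not from the Daylight source).
MONOISOTOPIC = {"H": 1.007825, "C": 12.000000, "N": 14.003074, "O": 15.994915}
AVERAGE = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def mass(formula: dict, table: dict) -> float:
    """Sum the per-element mass over a formula given as {element: count}."""
    return sum(table[element] * count for element, count in formula.items())


ethanol = {"C": 2, "H": 6, "O": 1}
print(round(mass(ethanol, MONOISOTOPIC), 4))  # most-common-isotope mass
print(round(mass(ethanol, AVERAGE), 2))       # natural-abundance average
```

    For a single hydrogen atom the two tables give 1.007825 and 1.01 (after rounding), matching the examples above.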
DEPICTION
    Compute planar coordinates for a depiction using the DEPICT Toolkit. Coordinates are computed for explicit atoms only.

    For example, if the stereochemical hydrogen in alanine is implicit, 6 pairs of coordinates are computed:

      Dayprop
      % echo '$SMI<N[C@@H](C)C(=O)O>|' | dayprop -property depiction
      $SMI<N[C@@H](C)C(=O)O>PPROP<14;-0.10,1.53,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY DEPICTION.
      Qwerty: Over.

      Property type set to DEPICTION...14
      Qwerty: Over.
      N[C@@H](C)C(=O)O
      Qwerty: Over.

      N[C@@H](C)C(=O)O -0.10,1.53,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05
      Qwerty: Over.

      SQL
      SQL> select depiction('N[C@@H](C)C(=O)O') from dual;

      DEPICTION('N[C@@H](C)C(=O)O')
      --------------------------------------------------------------------------------
      -0.10,1.53,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05

    In contrast, if the hydrogen is explicit, 7 pairs of coordinates will be computed:

      Dayprop
      % echo '$SMI<N[C@@]([H])(C)C(=O)O>|' | dayprop -property depiction
      $SMI<N[C@@]([H])(C)C(=O)O>PPROP<14;-0.10,1.53,-0.36,2.51,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY DEPICTION.
      Qwerty: Over.
      Property type set to DEPICTION...14
      Qwerty: Over.
      N[C@@]([H])(C)C(=O)O
      Qwerty: Over.
      N[C@@]([H])(C)C(=O)O -0.10,1.53,-0.36,2.51,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05
      Qwerty: Over.

      SQL
      SQL> select depiction('N[C@@]([H])(C)C(=O)O') from dual;

      DEPICTION('N[C@@]([H])(C)C(=O)O')
      --------------------------------------------------------------------------------
      -0.10,1.53,-0.36,2.51,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05

    For more information, see the manual page on dt_calcxy().
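
    The DEPICTION value is a flat, comma-separated list of numbers. A minimal Python sketch for turning it into coordinate pairs follows; the function name is hypothetical, and the pairing into (x, y) in atom order is assumed from the "pairs of coordinates" wording above.

```python
def parse_depiction(value: str) -> list:
    """Convert 'x1,y1,x2,y2,...' into a list of (x, y) tuples."""
    numbers = [float(token) for token in value.split(",")]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Pair consecutive numbers: even positions are x, odd positions are y.
    return list(zip(numbers[0::2], numbers[1::2]))


implicit_h = "-0.10,1.53,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05"
explicit_h = "-0.10,1.53,-0.36,2.51,-0.36,2.51,-0.63,3.49,0.62,2.77,0.88,3.75,1.34,2.05"
print(len(parse_depiction(implicit_h)))  # 6
print(len(parse_depiction(explicit_h)))  # 7
```
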
FINGERPRINT
    Generate a fingerprint using the FINGERPRINT Toolkit. Default parameters for MINSTEP, MAXSTEP and SIZE are 0, 7, and 2048, respectively. For example, the fingerprint for ethanol is computed as follows:

      Dayprop
      % echo '$SMI<CCO>|' | dayprop -property fingerprint
      $SMI<CCO>PPROP<13;......................................E......2..............
      .............................0............U................................+...
      +..............+..........................20...................................
      ..........2........+....6.2..........6..................6......................
      ....U..........................................1;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY FINGERPRINT.
      Qwerty: Over.

      Property type set to FINGERPRINT...13
      Qwerty: Over.
      CCO
      Qwerty: Over.

      CCO ......................................E......2..............................
      .............0............U................................+...+..............+.
      .........................20.............................................2.......
      .+....6.2..........6..................6..........................U..............
      ............................1
      Qwerty: Over.

      SQL
      SQL> select fingerprint('CCO') from dual;

      FINGERPRINT('CCO')
      --------------------------------------------------------------------------------
      ......................................E......2..................................
      .........0............U................................+...+..............+.....
      .....................20.............................................2........+..
      ..6.2..........6..................6..........................U..................
      ........................1
FLEXIBILITY
    The ratio of rotatable bonds to the total count of bonds. Values range from 1.0 (totally flexible) to 0.0 (rigid structures). Bonds to hydrogen are excluded. For more information on rotatable bonds refer to the ROTBOND_COUNT property.

    For example, the flexibility of hexane is 0.60:

      Dayprop
      % echo '$SMI<CCCCCC>|' | dayprop -property flexibility
      $SMI<CCCCCC>PPROP<12;0.60;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY FLEXIBILITY.
      Qwerty: Over.

      Property type set to FLEXIBILITY...12
      Qwerty: Over.
      CCCCCC
      Qwerty: Over.
      CCCCCC 0.60
      Qwerty: Over.

      SQL
      SQL> select flexibility('CCCCCC') from dual;

      FLEXIBILITY('CCCCCC')
      ---------------------
      .6
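
    The ratio can be reproduced by hand for hexane, which has 5 bonds between heavy atoms, 3 of which (the interior C-C bonds) match the ROTBOND_COUNT pattern. The Python sketch below only performs the division; the function name is hypothetical, and returning 0.0 for a structure with no bonds is an assumption.

```python
def flexibility(rotatable_bonds: int, total_bonds: int) -> float:
    """Ratio of rotatable bonds to all bonds (bonds to hydrogen excluded)."""
    if total_bonds == 0:
        return 0.0  # assumption: a structure without bonds is treated as rigid
    return rotatable_bonds / total_bonds


print("%.2f" % flexibility(3, 5))  # hexane: 0.60
```
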

FRAGMENT_COUNT
    The count of the number of fragments formed by removal of the isolated carbons from the structure. An isolated carbon is defined by the SMARTS pattern
    [$([#6]);!$(C(F)(F)F);!$(c(:[!c]):[!c]);!$([#6]=,#[!#6]);!$([#6;!+0])]
    Note that lone isotopic hydrogen atoms are not counted as fragments after removal of the isolating carbons. So CC[2H] and CC both have fragment counts of zero.
    For example, the fragment count for ethanol is 1:

      Dayprop
      % echo '$SMI<CCO>|' | dayprop -property fragment_count
      $SMI<CCO>PPROP<11;1;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY FRAGMENT_COUNT.
      Qwerty: Over.
      Property type set to FRAGMENT_COUNT...11
      Qwerty: Over.
      CCO
      Qwerty: Over.

      CCO 1
      Qwerty: Over.

      SQL
      SQL> select fragment_count('CCO') from dual;

      FRAGMENT_COUNT('CCO')
      ---------------------
      1

    The fragment count for ethane is 0:

      Dayprop
      % echo '$SMI<CC>|' | dayprop -property fragment_count
      $SMI<CC>PPROP<11;0;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY FRAGMENT_COUNT.
      Qwerty: Over.
      Property type set to FRAGMENT_COUNT...11
      Qwerty: Over.
      CC
      Qwerty: Over.
      CC 0
      Qwerty: Over.

      SQL
      SQL> select fragment_count('CC') from dual;

      FRAGMENT_COUNT('CC')
      --------------------
      0

HACCEPTOR_COUNT
    Number of hydrogen-bonding acceptor sites as defined by the SMARTS pattern [$([!#6;+0]);!$([F,Cl,Br,I]);!$([o,s,nX3]);!$([Nv5,Pv5,Sv4,Sv6])]. Heavy atoms may have multiple hydrogen-bonding sites.

    For example, the number of hydrogen-bonding acceptor sites for water is 2 (one for each electron pair):

      Dayprop
      % echo '$SMI<O>|' | dayprop -property hacceptor_count
      $SMI<O>PPROP<5;2;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY HACCEPTOR_COUNT.
      Qwerty: Over.
      Property type set to HACCEPTOR_COUNT...5
      Qwerty: Over.
      O
      Qwerty: Over.

      O 2
      Qwerty: Over.

      SQL
      SQL> select hacceptor_count('O') from dual;

      HACCEPTOR_COUNT('O')
      --------------------
      2
HDONOR_COUNT
    Number of hydrogen-bonding donor sites as defined by the SMARTS pattern [!#6;!H0]. Heavy atoms may have multiple hydrogen-bonding sites.

    For example, the number of hydrogen-bonding donor sites for water is 2 (one for each hydrogen):

      Dayprop
      % echo '$SMI<O>|' | dayprop -property hdonor_count
      $SMI<O>PPROP<4;2;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY HDONOR_COUNT.
      Qwerty: Over.
      Property type set to HDONOR_COUNT...4
      Qwerty: Over.
      O
      Qwerty: Over.

      O 2
      Qwerty: Over.

      SQL
      SQL> select hdonor_count('O') from dual;

      HDONOR_COUNT('O')
      -----------------
      2
MATCH_COUNT
    The number of unique matches of a user defined SMARTS in a molecule.
    For example, the number of carbons in ethanol is 2:

      Dayprop
      % echo '$SMI<CCO>|' | dayprop -property match_count -smarts '[#6]'
      $SMI<CCO>PPROP<19;2;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY MATCH_COUNT.
      Qwerty: Over.
      Property type set to MATCH_COUNT...19
      Qwerty: Over.
      Qwerty: Set SMARTS [#6].
      Qwerty: Over.
      SMARTS set to [#6]
      Qwerty: Over.
      CCO
      Qwerty: Over.
      CCO 2
      Qwerty: Over.

      SQL
      SQL> select match_count('CCO','[#6]') from dual;

      MATCH_COUNT('CCO','[#6]')
      -------------------------
      2
MOLAR_VOLUME
    Average molar volume based on Schroeder's additive constitutive method. Only works for C, H, N, O, S, F, Cl, Br, I. The molar volume is the volume occupied by one mole of a compound, i.e. the molecular weight divided by the density. Molar volume is used in estimating interfacial tension, liquid viscosity, surface tension and water solubility.

    For example, the molar volume of ethanol is 63.00:

      Dayprop
      % echo '$SMI<CCO>|' | dayprop -property molar_volume
      $SMI<CCO>PPROP<8;63.00;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY MOLAR_VOLUME.
      Qwerty: Over.
      Property type set to MOLAR_VOLUME...8
      Qwerty: Over.
      CCO
      Qwerty: Over.

      CCO 63.00
      Qwerty: Over.

      SQL
      SQL> select molar_volume('CCO') from dual;

      MOLAR_VOLUME('CCO')
      -------------------
      63
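
    An additive method of this kind sums a fixed increment per atom plus corrections for multiple bonds and rings. The Python sketch below uses the increments commonly tabulated for Schroeder's method (for example in Reid, Prausnitz and Poling); these numbers are assumed here and are not taken from the Daylight source, so results for other structures may differ from dayprop.

```python
# Commonly tabulated Schroeder increments (assumed values).
SCHROEDER = {"C": 7.0, "H": 7.0, "N": 7.0, "O": 7.0, "S": 21.0,
             "F": 10.5, "Cl": 24.5, "Br": 31.5, "I": 38.5}
DOUBLE_BOND = 7.0
TRIPLE_BOND = 14.0
RING = -7.0


def schroeder_volume(atoms: dict, double_bonds: int = 0,
                     triple_bonds: int = 0, rings: int = 0) -> float:
    """Sum atom increments and bond/ring corrections for {element: count}."""
    volume = sum(SCHROEDER[element] * count for element, count in atoms.items())
    volume += DOUBLE_BOND * double_bonds + TRIPLE_BOND * triple_bonds + RING * rings
    return volume


print(schroeder_volume({"C": 2, "H": 6, "O": 1}))  # ethanol: 63.0
```

    With these increments ethanol (nine atoms at 7 each, no multiple bonds or rings) gives 63, in agreement with the example above.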
MOL_FORM
    Calculates molecular formula in Hill order. Charges are ignored:

      Dayprop
      % echo '$SMI<N[C@@](C)C(=O)O>|' | dayprop -property mol_form
      $SMI<N[C@@](C)C(=O)O>PPROP<2;C3H6NO2;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY MOL_FORM.
      Qwerty: Over.
      Property type set to MOL_FORM...2
      Qwerty: Over.
      N[C@@](C)C(=O)O
      Qwerty: Over.

      N[C@@](C)C(=O)O C3H6NO2
      Qwerty: Over.

      SQL
      SQL> select mol_form('N[C@@](C)C(=O)O') from dual;

      MOL_FORM('N[C@@](C)C(=O)O')
      --------------------------------------------------------------------------------
      C3H6NO2
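
    Hill order places carbon first, hydrogen second, and the remaining elements alphabetically; when no carbon is present, all elements (including hydrogen) are alphabetical. A minimal Python sketch of that ordering follows. The function name is hypothetical and the element counts are assumed to be supplied by the caller.

```python
def hill_formula(counts: dict) -> str:
    """Format {element: count} as a molecular formula in Hill order."""
    elements = sorted(el for el, n in counts.items() if n > 0)
    if "C" in elements:
        # Carbon first, then hydrogen, then everything else alphabetically.
        rest = [el for el in elements if el not in ("C", "H")]
        ordered = ["C"] + (["H"] if "H" in elements else []) + rest
    else:
        # Without carbon the order is strictly alphabetical.
        ordered = elements
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "") for el in ordered)


print(hill_formula({"N": 1, "O": 2, "C": 3, "H": 6}))  # C3H6NO2
```
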

PARACHOR
    Computes molar surface tension in dynes per centimeter using McGowan's method.

    In this example, the parachor of ethanol is 127.60 dyn/cm:

      Dayprop
      % echo '$SMI<CCO>|' | dayprop -property parachor
      $SMI<CCO>PPROP<6;127.60;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY PARACHOR.
      Qwerty: Over.

      Property type set to PARACHOR...6
      Qwerty: Over.
      CCO
      Qwerty: Over.

      CCO 127.60
      Qwerty: Over.


      SQL
      SQL> select parachor('CCO') from dual;

      PARACHOR('CCO')
      ---------------
      127.6
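
    The ethanol value can be reproduced with an atom-contribution sum. The Python sketch below uses the commonly tabulated McGowan contributions for C, H and O and a correction of 19.0 per bond; these numbers are assumed here rather than taken from the Daylight source, and only three elements are included. Returning None for an unsupported element mirrors the No Value behaviour.

```python
# Commonly tabulated McGowan atom contributions (assumed; C, H and O only).
MCGOWAN = {"C": 47.6, "H": 24.7, "O": 36.2}
BOND_CORRECTION = 19.0  # subtracted once per bond, regardless of bond order


def mcgowan_parachor(atoms: dict, bond_count: int):
    """Sum atom contributions for {element: count} and subtract the bond term."""
    if any(element not in MCGOWAN for element in atoms):
        return None  # element outside the table used in this sketch
    total = sum(MCGOWAN[element] * count for element, count in atoms.items())
    return total - BOND_CORRECTION * bond_count


# Ethanol, C2H6O: 9 atoms joined by 8 bonds.
print(round(mcgowan_parachor({"C": 2, "H": 6, "O": 1}, 8), 2))  # 127.6
```
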

    The supported set of atoms is C,H,N,O,S,P,F,Cl,Br,I. This property is set to No Value for molecules that contain atoms outside the supported set.

    For example, the parachor for silicon dioxide cannot be computed:

      Dayprop
      % echo '$SMI<O=[Si]=O>|' | dayprop -property parachor
      $SMI<O=[Si]=O>PPROP<6;No_Value;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY PARACHOR.
      Qwerty: Over.

      Property type set to PARACHOR...6
      Qwerty: Over.
      O=[Si]=O
      Qwerty: Over.

      O=[Si]=O No_Value
      Qwerty: Over.

      SQL
      SQL> select parachor('O=[Si]=O') from dual;
      PARACHOR('O=[SI]=O')
      --------------------

    Parachor can be used to estimate surface tension and boiling point. Parachor is also used in estimating soil absorption coefficients and water solubility. This version uses the atom contributions of McGowan. More complicated schemes are available in the review by Quayle.

    MacLeod-Sugden method for surface tension estimation:[3],[6]


    For aniline, c1ccccc1N, σ = 43.80 using estimates from dayprop(); the experimental value is 42.9 at 20°C (Handbook of Chemistry and Physics, CRC Press).

    For ethyl acetate, CCOC(=O)C, σ = 22.69 using estimates from dayprop(); the experimental value is 23.9 at 20°C (Handbook of Chemistry and Physics, CRC Press).
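
    The Macleod-Sugden relation is commonly written as σ = (P·ρ/M)^4 when the vapor density is neglected, where P is the parachor, ρ the liquid density in g/cm³ and M the molecular weight. The Python sketch below evaluates this common form for ethyl acetate. The parachor is computed from the commonly tabulated McGowan contributions and the density is an assumed handbook figure, so the result is close to, but not identical with, the dayprop-based estimate quoted above.

```python
def macleod_sugden(parachor: float, density: float, mol_weight: float) -> float:
    """Surface tension in dyn/cm from parachor, liquid density (g/cm^3) and
    molecular weight, neglecting the vapor density."""
    return (parachor * density / mol_weight) ** 4


# Ethyl acetate, C4H8O2, 13 bonds. Assumed inputs:
#   parachor = 4*47.6 + 8*24.7 + 2*36.2 - 13*19.0 = 213.4
#   density  = 0.900 g/cm^3 at 20 C (assumed), mol_weight = 88.11
sigma = macleod_sugden(213.4, 0.900, 88.11)
print(round(sigma, 1))
```

    The fourth power makes the estimate sensitive to the density used: a change of 0.1% in ρ shifts σ by about 0.4%.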

    Meissner's method for boiling point estimation:[4]


    For chloroethyl vinyl ether, ClCCOC=C, Tb = 371.9 K using estimates from dayprop(); the experimental value is 381.2 K (Handbook of Chemistry and Physics, CRC Press).

    For nicotine, c1ccc(C2N(C)CCC2)cn1, Tb = 493.0 K using estimates from dayprop(); the experimental value is 515.7 K (Handbook of Chemistry and Physics, CRC Press).
PART_COUNT
    Number of components.

    For example, the number of components in a mixture of ethanol and water is 2:

      Dayprop
      % echo '$SMI<CCO.O>|' | dayprop -property part_count
      $SMI<CCO.O>PPROP<16;2;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY PART_COUNT.
      Qwerty: Over.

      Property type set to PART_COUNT...16
      Qwerty: Over.
      CCO.O
      Qwerty: Over.

      CCO.O 2
      Qwerty: Over.

      SQL
      SQL> select part_count('CCO.O') from dual;

      PART_COUNT('CCO.O')
      -------------------
      2
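
    For simple inputs the component count equals the number of dot-separated pieces of the SMILES. The Python sketch below is a naive dot-splitting approximation, not the Daylight algorithm: it would overcount a SMILES such as C1.C1, where a ring closure bonds two dot-separated pieces into one component.

```python
def naive_part_count(smiles: str) -> int:
    """Count dot-separated pieces of a SMILES (approximation only)."""
    return len([piece for piece in smiles.split(".") if piece])


print(naive_part_count("CCO.O"))  # 2
```
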
POLAR_SURFACE_AREA
    Compute the topological polar surface area (TPSA) according to the method of Ertl, Rohde, and Selzer.[1]

    For example, the TPSA for ethanol is 20.23:

      Dayprop
      % echo '$SMI<CCO>|' | dayprop -property polar_surface_area
      $SMI<CCO>PPROP<17;20.23;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY POLAR_SURFACE_AREA.
      Qwerty: Over.

      Property type set to POLAR_SURFACE_AREA...17
      Qwerty: Over.
      CCO
      Qwerty: Over.

      CCO 20.23
      Qwerty: Over.


      SQL
      SQL> select polar_surface_area('CCO') from dual;

      POLAR_SURFACE_AREA('CCO')
      -------------------------
      20.23
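
    TPSA is a sum of tabulated contributions, one per polar atom type. The Python sketch below uses three contributions from the published Ertl table as commonly quoted (hydroxyl oxygen, carbonyl oxygen and primary amine nitrogen); the dictionary keys are hypothetical labels, and the caller is assumed to have already assigned atom types.

```python
# Three commonly quoted Ertl contributions, in square angstroms (assumed subset).
TPSA_CONTRIBUTION = {
    "hydroxyl_O": 20.23,       # [OH] attached by a single bond
    "carbonyl_O": 17.07,       # O double-bonded to another atom
    "primary_amine_N": 26.02,  # [NH2] attached by a single bond
}


def tpsa(atom_types: list) -> float:
    """Sum the contribution of each polar atom type in the list."""
    return sum(TPSA_CONTRIBUTION[atom_type] for atom_type in atom_types)


print(round(tpsa(["hydroxyl_O"]), 2))  # ethanol: 20.23
print(round(tpsa(["primary_amine_N", "carbonyl_O", "hydroxyl_O"]), 2))
```

    Ethanol has a single polar atom, so its TPSA is just the hydroxyl contribution.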
RIGIDITY
    Compute the Tanimoto similarity value between a molecule and a hypothetical version of itself with the rotatable bonds removed. Values range from 1 (rigid) to 0 (not rigid).

    For example, the rigidity value of hexane is 0.5833.

      Dayprop
      % echo '$SMI<CCCCCC>|' | dayprop -property rigidity
      $SMI<CCCCCC>PPROP<10;0.5833;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY RIGIDITY.
      Qwerty: Over.
      Property type set to RIGIDITY...10
      Qwerty: Over.
      CCCCCC
      Qwerty: Over.
      CCCCCC 0.5833

      SQL
      SQL> select rigidity('CCCCCC') from dual;

      RIGIDITY('CCCCCC')
      ------------------
      .5833

    Note: Path lengths greater than 7 are not recognized. Rings with 8 or more atoms are not perceived.
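
    The Tanimoto similarity between two fingerprints is the number of bits set in both divided by the number of bits set in either. The Python sketch below shows the calculation on sets of bit positions. The two sets are hypothetical and chosen only so that the ratio is 7/12; this manual does not state the actual bit counts behind the hexane value.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of set-bit positions."""
    union = bits_a | bits_b
    if not union:
        return 1.0  # assumption: two empty fingerprints are treated as identical
    return len(bits_a & bits_b) / len(union)


whole_molecule = set(range(12))   # hypothetical: 12 bits set
rigid_version = set(range(7))     # hypothetical: 7 of those bits remain
print(round(tanimoto(whole_molecule, rigid_version), 4))  # 0.5833
```
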
RING_COUNT
    Number of rings in the smallest set of smallest rings (SSSR).

    For example, the SSSR of cubane contains 5 rings:

      Dayprop
      % echo '$SMI<C12C3C4C1C1C4C3C21>|' | dayprop -property ring_count
      $SMI<C12C3C4C1C1C4C3C21>PPROP<9;5;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY RING_COUNT.
      Qwerty: Over.

      Property type set to RING_COUNT...9
      Qwerty: Over.
      C12C3C4C1C1C4C3C21
      Qwerty: Over.

      C12C3C4C1C1C4C3C21 5
      Qwerty: Over.

      SQL
      SQL> select ring_count('C12C3C4C1C1C4C3C21') from dual;

      RING_COUNT('C12C3C4C1C1C4C3C21')
      --------------------------------
      5
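
    The size of an SSSR depends only on the connectivity counts: it equals the number of bonds minus the number of atoms plus the number of components. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the counts are assumed to be supplied by the caller.

```python
def sssr_ring_count(atom_count: int, bond_count: int, part_count: int = 1) -> int:
    """Number of rings in an SSSR: bonds - atoms + components."""
    return bond_count - atom_count + part_count


print(sssr_ring_count(8, 12))  # cubane: 8 atoms, 12 bonds -> 5
```

    Cubane has six faces, but only five of them are independent, which is why the count is 5 rather than 6.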
ROTBOND_COUNT
    Number of rotatable bonds using the SMARTS pattern: [!$(*#*)&!D1&$(*(-[!#1])~[!#1])]-&!@[!$(*#*)&!D1&$(*(-[!#1])~[!#1])]. It matches acyclic bonds between two atoms that have additional non-Hydrogen substituents and that are not alkynes. Secondary amides are further excluded with the pattern ([N&H1&D2]-&!@[#6&X3]).

    For example, the number of rotatable bonds in alanine is 1:

      Dayprop
      % echo '$SMI<N[C@@](C)C(=O)O>|' | dayprop -property rotbond_count
      $SMI<N[C@@](C)C(=O)O>PPROP<3;1;0;>|


      Dayproptalk
      Qwerty: Set PROPERTY ROTBOND_COUNT.
      Qwerty: Over.

      Property type set to ROTBOND_COUNT...3
      Qwerty: Over.
      N[C@@](C)C(=O)O
      Qwerty: Over.

      N[C@@](C)C(=O)O 1
      Qwerty: Over.

      SQL
      SQL> select rotbond_count('N[C@@](C)C(=O)O') from dual;

      ROTBOND_COUNT('N[C@@](C)C(=O)O')
      --------------------------------
      1

    Note: Symmetrical tri-substituted groups are considered rotatable (e.g., N-trimethyl anilinium cation, [N+](C)(C)(C)c1ccccc1) and symmetrical non-substituted groups are not (e.g., [N+]([H])([H])([H])c1ccccc1).
    Bonds to explicit (isotopic) hydrogens are not counted as needed substituents on rotatable bonds. CCC[2H] and CCC both have zero rotatable bonds.
    Amidines (C=C(N)N) are not recognized as rotatable (for example, the carbon-carbon double bond in Zantac, CN/C(=C\[N+](=O)[O-])/NCCSCc1ccc(CN(C)C)o1, rotates on the NMR time scale).

    Triple bonds and the two adjacent single bonds are not recognized as rotatable bonds.

    Sulphonamides (NS(=O)*) are not recognized as rotatable (the double bond character of the N-S bond is questionable).

    Some other groups not considered rotatable as a unit include adamantyl, barrelenes, propellanes, and extended cumulenes.
STEREOCENTER_COUNT
    The number of stereogenic centers found using the following SMARTS patterns. The patterns are necessary but not sufficient conditions for stereochemistry:
    Atom stereo: [$([X4&!v6&!v5;H0,H1]),$([SX3]([#6])([#6])~O)]
    Bond stereo: [CX3;!H2]=[CX3;!H2]
    Allene stereo: [CX3;H0]=C=[CX3;H0,H1]

    For example, the number of stereogenic centers in alanine is 1:

      Dayprop
      % echo '$SMI<N[C@@H](C)C(=O)O>|' | dayprop -property stereocenter_count
      $SMI<N[C@@H](C)C(=O)O>PPROP<15;1;0;>|

      Dayproptalk
      Qwerty: Set PROPERTY STEREOCENTER_COUNT.
      Qwerty: Over.

      Property type set to STEREOCENTER_COUNT...15
      Qwerty: Over.
      N[C@@H](C)C(=O)O
      Qwerty: Over.

      N[C@@H](C)C(=O)O 1
      Qwerty: Over.

      SQL
      SQL> select stereocenter_count('N[C@@H](C)C(=O)O') from dual;

      STEREOCENTER_COUNT('N[C@@H](C)C(=O)O')
      --------------------------------------
      1


8. Appendix: References

1. Ertl, P.; Rohde, B.; Selzer, P., "Fast Calculation of Molecular Polar Surface Area as a Sum of Fragment-based Contributions and Its Application to the Prediction of Drug Transport Properties", J. Med. Chem. (2000), 43, 3714-3717.

2. Katritzky, A. R.; Lobanov, V. S.; Karelson, M. QSPR: The correlation and quantitative prediction of chemical and physical properties from structure. Chem. Soc. Rev. 1995, 24, 279-287

3. MacLeod, D.B. "On a Relation between Surface Tension and Density" Trans. Faraday Soc. 19 384-42 (1923)

4. Meissner H.P., "Critical Constants from Parachor and Molar Refraction" Chem. Eng. Prog. 45 149-153 (1949)

5. Reid, R.C.; Prausnitz, J.M.; and Poling, B.E. The Properties of Liquids and Gases, 4th ed., New York: McGraw-Hill Book Company (1987). As a historical note, this is identical to the example in the Medchem 3.41 GCL manual.

6. Sugden, S., "The Influence of the Orientation of Surface Molecules on the Surface Tension of Pure Liquids" J. Chem. Soc. 125 1167-89 (1925)