Daylight v4.9
Release Date: 1 February 2008


sd2smarts - converts a query connection table-based file into a SMARTS-based file.

Unix Synopsis

sd2smarts [options] [infile [outfile]]

Description

sd2smarts(1) converts an MDL molfile or SDfile containing a query into a Daylight SMARTS (SMA) or Thor Data Tree (TDT) file. Alternatively, the output can be directed to two SQL loader (SQLLDR) files.

The input file must be a molfile or SDfile in v2000 format. R-group but not S-group features are recognized. The input file can contain any query feature that can be represented in both MDL CTfile format and Daylight SMARTS. Examples include H-counts, ring-bond counts, bond types, bond topology, etc.

Default output is to stdout. In the case of SQLLDR output, the user must specify the rootname for the two output files.

Double bond stereochemistry and tetrahedral chirality are inferred from the atom coordinates and bond style information in the connection table and encoded as SMARTS. SMARTS and associated structural information including coordinates are automatically stored in the TDT and SQLLDR outputs.

Data in SDfiles are converted for TDT and SQLLDR outputs. The SQLLDR format stores data in one file (.dat) and structural information in another (.str). Legal characters for data tags are limited to: $, _, /, A-Z, a-z, and 0-9.
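The tag restriction and the two-file SQLLDR layout described above can be checked before running a conversion. The following is a minimal Python sketch, not part of sd2smarts; the helper names are hypothetical, and it assumes the .dat and .str extensions are appended directly to the rootname.

```python
import re
from typing import Tuple

# Characters the text lists as legal in data tags: $, _, /, A-Z, a-z, 0-9.
_LEGAL_TAG = re.compile(r"[$_/A-Za-z0-9]+")


def is_legal_tag(tag: str) -> bool:
    """Return True if `tag` is non-empty and uses only legal characters."""
    return _LEGAL_TAG.fullmatch(tag) is not None


def sqlldr_files(rootname: str) -> Tuple[str, str]:
    """Return the (data, structure) file names for a SQLLDR rootname.

    Assumption: the extensions are appended directly to the rootname.
    """
    return rootname + ".dat", rootname + ".str"
```

A tag such as "MOL WT" fails the check because of the embedded space, while "MOL_WT" passes.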

Unless otherwise specified using the ID_FIELD option, the characters in the first line of the header block of each connection table are assumed to be a unique ID. In the SMA output, this ID follows the space-delimited SMARTS. The ID is stored in the $NAM field for the TDT format and as the first line of the SQLLDR files. If the first line of the header block is blank, the SMARTS will be used as the name for TDT and SQLLDR output.
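Because the SMA output places the ID after the space-delimited SMARTS, each output line can be split back into its two parts. This is a minimal Python sketch with a hypothetical function name. It assumes the SMARTS contains no whitespace and that everything after the first run of whitespace is the ID.

```python
def parse_sma_line(line):
    """Split one SMA line into (smarts, record_id).

    Assumes the SMARTS itself contains no whitespace, so everything after
    the first run of whitespace is the ID. The ID is None when absent.
    """
    parts = line.rstrip("\n").split(None, 1)
    if not parts:
        raise ValueError("empty SMA line")
    smarts = parts[0]
    record_id = parts[1].strip() if len(parts) > 1 else None
    return smarts, record_id
```

For example, the made-up line "[#6;R]-[#8] query_001" splits into the SMARTS "[#6;R]-[#8]" and the ID "query_001".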

The manual page for "convert" describes features common to this and the other "convert" programs. Please refer to it for more information on general usage and options such as -HELP, -VERSION, -SKIP_RECORDS, -DO_RECORDS, -ERROR_LEVEL, -ERROR_LOG, and -REJECT_LOG.

Options

Controls whether the output is in SMARTS, TDT or SQLLDR format. The default is SMA. For the TDT and SQLLDR formats, information on the first line of each input header block and any non-standard atom labels in the input file are stored as LINE1 and as an atom-tuple in the ASYM datatype, respectively. The original atom is designated by '*' in the SMARTS. For TDT output, a special $SMIG datatype is written containing data about the conversion program name and version.


-ADD_2D [TRUE|FALSE]
-ADD_3D [TRUE|FALSE]
Adds 2D and/or 3D coordinates to the TDT or SQLLDR output. These data are taken from the actual coordinates in the input atom block and stored as a comma-separated list of values in unique SMILES order. The default is TRUE for both -ADD_2D and -ADD_3D. If non-zero coordinates are found in the atom block, then either 2D or 3D coordinates are written to the output file, depending on which are available. Setting one of these options to FALSE eliminates the entry for that set of coordinates.
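To see which kind of coordinates an input record carries, the atom block of a v2000 molfile can be inspected directly. This is a minimal Python sketch under stated assumptions: the function name is hypothetical, the fixed-column layout is that of the v2000 atom block (x, y, and z in three 10-character fields), and the rule "any non-zero z means 3D" is an illustrative guess, not the documented behavior of sd2smarts.

```python
def coordinate_kind(molblock):
    """Classify a v2000 molfile's coordinates as '3D', '2D', or None.

    Assumption: a record counts as 3D if any z value is non-zero, as 2D if
    only x or y values are non-zero, and as None if all values are zero.
    """
    lines = molblock.splitlines()
    counts = lines[3]                    # counts line follows the 3 header lines
    n_atoms = int(counts[0:3])           # first 3 columns hold the atom count
    has_xy = has_z = False
    for atom_line in lines[4:4 + n_atoms]:
        x = float(atom_line[0:10])
        y = float(atom_line[10:20])
        z = float(atom_line[20:30])
        has_xy = has_xy or x != 0.0 or y != 0.0
        has_z = has_z or z != 0.0
    if has_z:
        return "3D"
    if has_xy:
        return "2D"
    return None
```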
-SPLIT_FIELDS [TRUE|FALSE]
Splits data that is spread across multiple lines in an input file into separate entries for the TDT or SQLLDR output. The default is FALSE, so multiple lines are treated as a single value. Setting -SPLIT_FIELDS to TRUE causes each line of a multi-line field to be treated as a separate value with the same data field identifier.
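The difference between the two settings can be illustrated with a small Python model. This is only a sketch: the function name is hypothetical, and joining the lines with a newline in the single-value case is an assumption, since the text does not say how the lines are combined.

```python
def field_values(lines, split_fields=False):
    """Return the values recorded for one data field.

    `lines` holds the lines of a multi-line data item from the input file.
    Assumption: with split_fields=False the lines are joined by newlines.
    """
    if split_fields:
        # One value per input line, all under the same field identifier.
        return list(lines)
    # All lines together form a single value.
    return ["\n".join(lines)]
```

A two-line item therefore yields one value by default and two values when splitting is enabled.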
-ID_FIELD <name>
Sets the data field identifier to be used as a unique ID. As described above, the default ID is the first line of each header block. If there is no ID on line 1, the SMARTS is used. Alternatively, designating a data field identifier as the ID_FIELD causes the data in that field to be used as the ID. Note: It may be necessary to place the data field identifier in quotes and to use '\\' before '$'. Input records that contain no information in the designated field are rejected.
-IMPLICIT_CHIRALITY [TRUE|FALSE]
Alters the way in which chirality is determined in order to detect implicit chiral centers. This is useful for some natural products. For a bond A-hash-B, the interpretation is that B is below A from the perspective of A, and A is above B from the perspective of B. The default is FALSE. Setting -IMPLICIT_CHIRALITY to TRUE allows both ends of chiral bonds to be used in the determination of chiral centers.
Toggles whether stereochemistry for ring double bonds is indicated. The default is FALSE. Setting this option to TRUE marks the cis/trans stereochemistry for all ring double bonds.
-M__ISO_ARE_DEFECTS [TRUE|FALSE]
Indicates whether the values in the M ISO line of the property block are mass defects or actual masses for the isotopes listed. The default is FALSE. When -M__ISO_ARE_DEFECTS is set to TRUE, values in the line are treated as mass defects.
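The two interpretations of an M ISO value can be compared with a short Python sketch. It rests on an assumption that the text does not confirm: a "mass defect" is taken here to be an integer offset from the nominal mass of the most abundant isotope. The function name and the small mass table are hypothetical and for illustration only.

```python
# Nominal (integer) masses of the most abundant isotopes; illustrative subset.
NOMINAL_MASS = {"H": 1, "C": 12, "N": 14, "O": 16}


def isotope_mass(symbol, value, are_defects=False):
    """Return the isotope mass implied by one M ISO value.

    Assumption: when are_defects is True, `value` is an offset from the
    nominal mass of the most abundant isotope; otherwise it is the mass.
    """
    if are_defects:
        return NOMINAL_MASS[symbol] + value
    return value
```

Under this reading, a value of 13 for carbon with the default setting and a value of 1 with defects enabled both describe carbon-13.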
-FIX_RADICAL_RINGS [TRUE|FALSE]
Converts radical rings to aromatic. The default is TRUE, which allows certain types of five-, six-, and seven-membered radical rings to be converted to aromatic. Changing this option to FALSE keeps the rings as specified in the input file. For a ring to be converted, all atoms in the ring must be carbon and designated as doublet radicals. In addition, no atom in the ring may have a charge.
-DAYLIGHT_LIKE [TRUE|FALSE]
Determines whether both explicit hydrogens and the H-count field are used. The default is TRUE. If this option is set to FALSE, only the explicit hydrogens are used.
Determines whether only specified stereochemistry is used. The default is TRUE. When this option is set to FALSE, both specified and unspecified stereochemistry are used.
Determines whether chiral atoms in the input file must have explicit hydrogens. The default is TRUE. Setting this option to FALSE removes the requirement that chiral atoms have all hydrogens explicitly indicated.
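The conditions listed under -FIX_RADICAL_RINGS can be written as a simple predicate. This is a minimal Python sketch with hypothetical names. It checks only the conditions stated in the text (ring size, all-carbon, doublet radicals, no charges). Because the text says only "certain types" of rings are converted, passing this check is necessary but may not be sufficient.

```python
from dataclasses import dataclass


@dataclass
class RingAtom:
    """Hypothetical minimal description of one ring atom."""
    symbol: str
    charge: int = 0
    is_doublet_radical: bool = False


def may_convert_to_aromatic(ring):
    """Check the stated preconditions for converting a radical ring.

    `ring` is a list of RingAtom objects, one per ring member.
    """
    if len(ring) not in (5, 6, 7):
        return False
    return all(
        atom.symbol == "C" and atom.is_doublet_radical and atom.charge == 0
        for atom in ring
    )
```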

Return Value

sd2smarts returns 0 to the environment if it succeeds without errors or a non-zero value if there are errors.
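A calling script can use the return value to decide whether a conversion succeeded. This is a minimal Python sketch; the function name is hypothetical, and the file names in the usage comment are placeholders.

```python
import subprocess


def conversion_succeeded(command):
    """Run `command` (a list of arguments) and report whether it exited with 0."""
    completed = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    return completed.returncode == 0


# Hypothetical usage, assuming sd2smarts is on the PATH:
# ok = conversion_succeeded(["sd2smarts", "query.mol", "query.sma"])
```

Any non-zero exit status is treated as failure, which matches the return-value convention stated above.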



Daylight License

programs: convert

Related Topics

convert(1) mol2smi(1) smi2mol(1) rd2smi(1) smi2rd(1) rd2smarts(1) rd2smirks(1) licensing(5) options(5)