SMILES Tutorial: Bonds

Bond specification

Single, double, triple, and aromatic bonds are represented by the symbols `-', `=', `#', and `:', respectively. Adjacent atoms without an intervening bond symbol are connected by a valence-dictated bond (typically a single or aromatic bond). The `-' (single) and `:' (aromatic) bond symbols may always be omitted on input.

The syntax for the bond sublanguage is:

   bond :  <empty> | '-' | '=' | '#' | ':'
        ;
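
As a rough illustration of the grammar above, the following Python sketch reads one optional bond symbol at a given position in a SMILES string. The names `BOND_SYMBOLS` and `read_bond` are hypothetical and do not come from any toolkit; the sketch covers only the bond production, not atoms, branches, or ring closures.

```python
# Hypothetical helper, not part of any SMILES toolkit.
# Maps each explicit bond symbol in the grammar to a descriptive label.
BOND_SYMBOLS = {"-": "single", "=": "double", "#": "triple", ":": "aromatic"}


def read_bond(smiles, pos):
    """Read an optional bond symbol at smiles[pos].

    Returns (bond, next_pos). bond is None for the <empty> alternative,
    in which case the bond type has to be inferred from the neighbouring
    atoms (typically single or aromatic).
    """
    if pos < len(smiles) and smiles[pos] in BOND_SYMBOLS:
        return BOND_SYMBOLS[smiles[pos]], pos + 1
    return None, pos


print(read_bond("C=O", 1))  # ('double', 2)
print(read_bond("CC", 1))   # (None, 1)
```

Returning `None` for the empty case, rather than guessing "single", leaves the valence-dependent decision to whatever code knows the atoms on either side.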

Advanced issues

There is no "preferred" or "correct" ordering in SMILES, e.g., CCO and OCC are equally valid SMILES for ethanol.
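
The point about ordering can be checked on very simple cases with a small sketch. It assumes a deliberately narrow subset: unbranched, acyclic SMILES made only of single-letter aliphatic atoms and the `-', `=', and `#' symbols. The functions `parse_chain` and `same_chain` are hypothetical names introduced here; this is not a general SMILES parser or a canonicalization algorithm.

```python
# Hypothetical sketch for unbranched chains of single-letter aliphatic atoms.
BOND_ORDERS = {"-": 1, "=": 2, "#": 3}


def parse_chain(smiles):
    """Return a list alternating atom symbols and bond orders."""
    chain = []
    pending = 1  # an omitted bond between aliphatic atoms is taken as single
    for ch in smiles:
        if ch in BOND_ORDERS:
            pending = BOND_ORDERS[ch]
        elif ch.isalpha() and ch.isupper():
            if chain:
                chain.append(pending)
            chain.append(ch)
            pending = 1
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return chain


def same_chain(a, b):
    """True if two chain SMILES describe the same chain, read in either direction."""
    ca, cb = parse_chain(a), parse_chain(b)
    return ca == cb or ca == cb[::-1]


print(same_chain("CCO", "OCC"))  # True
print(same_chain("CC", "C-C"))   # True
print(same_chain("C=C", "CC"))   # False
```

Comparing a chain with its reverse is enough only because the subset has no branches or rings; real molecules need a graph comparison.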

The SMILES language specifies no predefined length limit on a SMILES string. In practice, most implementations define such a limit, typically between 20,000 and 80,000 characters.

Ring closure is an alternative way of specifying bonds, as explained in the section on ring specification.

Examples

Table 4. Bond specification in SMILES.

SMILES: CC or C-C or [CH3]-[CH3]
Name:   ethane
Remark: Adjacent aliphatic atoms are assumed to be bonded by a single bond; the single bond symbol `-' is not needed on input.

SMILES: C=O or O=C
Name:   formaldehyde
Remark: Double bonds are represented by an equals sign. Note that the order of input doesn't matter (a SMILES may start with any atom).

SMILES: C#N or N#C
Name:   hydrogen cyanide
Remark: Triple bonds are represented by a hash (or "pound") sign. (There is no handy triple-bond-like symbol in standard ASCII.)

SMILES: C=C or cc
Name:   ethene
Remark: Ethene is normally written C=C, but (surprise!) the default bond between non-aromatic sp2 atoms may be a double bond ...

SMILES: C=CC=C or cccc
Name:   butadiene
Remark: ... but not always. (Butadiene is normally written C=CC=C.)

SMILES: ccc
Name:   ?
Remark: "There ain't no sech animal."

Daylight Chemical Information Systems, Inc.
info@daylight.com