8. SMILES Toolkits: Substructures and Paths

8.1 Introduction

The Daylight Substructure Toolkit provides objects and functions to represent and operate on substructures and paths:

Substructure:
A set of atoms and bonds from one molecule. Typically a substructure is obtained as the result of a substructure search (see the chapter on the SMARTS Toolkit), but one can also be created "from scratch" by an algorithm of your own design, using the Toolkit functions described below.

A substructure is simply a set -- there is no implied order to the atoms or bonds in the set, and each atom and bond from the molecule occurs at most one time in the substructure. We normally think of a substructure as a set of atoms and bonds that are connected together in some chemically meaningful way. A substructure object can be used to represent these "ordinary" substructures, but it can also be used to represent less conventional collections. For example, a substructure object could hold all of the double bonds in a structure, all of the atoms with an odd number of protons, and so forth. In other words, a substructure object is just an arbitrary set of atoms and bonds; it is up to the programmer using the object to decide what the set means.

Path:
A path through a substructure. That is, a set of atoms and bonds from a single base molecule, and a particular ordering of those atoms and bonds.

The word "path" suggests that the ordering in the object is related to the actual connectivity of the molecule (as though you could "walk" the path without jumping around), but this is not a requirement. The path object is defined only as an ordered set of atoms and bonds. For example, a path object could contain all bonds ordered by their bond order (i.e. single, double, triple, then aromatic), or could contain all atoms ordered by atomic number. As with the substructure object, it is up to the programmer to assign meaning to the path object's contents.
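
The two definitions above can be made concrete with a toy model in plain Python. This sketch does not call the Daylight Toolkit: the molecule tables (`ATOMS`, `BONDS`) and both helper functions are invented for illustration. It builds one "unconventional" substructure (all double bonds, as an unordered set) and one "unconventional" path (all atoms ordered by atomic number, heaviest first).

```python
# Toy model only: plain Python data, not the Daylight Toolkit API.
# Acetic acid, CC(=O)O, heavy atoms only. All names here are hypothetical.
ATOMS = {0: ("C", 6), 1: ("C", 6), 2: ("O", 8), 3: ("O", 8)}  # id -> (symbol, atomic number)
BONDS = {0: (0, 1, 1), 1: (1, 2, 2), 2: (1, 3, 1)}  # id -> (atom a, atom b, bond order)


def double_bond_substructure(bonds):
    """Unordered set of bond ids whose bond order is 2."""
    return frozenset(bid for bid, (_, _, order) in bonds.items() if order == 2)


def atoms_by_atomic_number(atoms):
    """Ordered list of atom ids, heaviest first; ties are broken by atom id."""
    return sorted(atoms, key=lambda aid: (-atoms[aid][1], aid))


double_bonds = double_bond_substructure(BONDS)  # a set: no order implied
atom_path = atoms_by_atomic_number(ATOMS)       # a sequence: order is the point
```

In this model the two oxygen atoms (ids 2 and 3) sit next to each other in `atom_path` even though they are not bonded to each other, which is exactly the kind of ordering the text above permits.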

The SMARTS Toolkit also uses a closely related object type, the pathset:

Pathset:
A set of zero or more path objects from a single molecule. The pathset object and its uses are discussed at length in the SMARTS Toolkit chapter.

NOTE: A common point of confusion is that the SMILES Toolkit provides substructure and path objects, but does not do substructure searching (substructure searching is available in the SMARTS Toolkit -- sold separately). There are many sources of substructures and paths besides SMARTS; the path and substructure objects provide a convenient way to represent them whether or not you purchase the SMARTS Toolkit.

To retrieve the contents of a path or substructure, you can create streams of atoms or bonds (see dt_stream()). Any modification to a path or substructure (adding or removing an atom or bond) causes all streams to be deallocated.
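
The invalidation rule is easy to trip over, so here is a minimal sketch of it in plain Python. Everything in it (`ToyStream`, `ToyPath`, their methods, and the string labels standing in for atoms and bonds) is hypothetical and only mimics the behavior described above; it is not the Toolkit's implementation. Two details are assumptions of the sketch: an ignored duplicate addition does not count as a modification, and a deallocated stream simply yields `None`.

```python
# Toy model only: mimics the invalidation rule in plain Python.
# The class and method names are hypothetical, not Daylight Toolkit calls.
class ToyStream:
    def __init__(self, items):
        self._items = list(items)  # snapshot taken when the stream is created
        self._pos = 0
        self.valid = True

    def next(self):
        """Return the next item, or None when exhausted or deallocated."""
        if not self.valid or self._pos >= len(self._items):
            return None
        item = self._items[self._pos]
        self._pos += 1
        return item


class ToyPath:
    def __init__(self):
        self._items = []
        self._streams = []

    def add(self, item):
        if item in self._items:
            return  # assumption: an ignored duplicate is not a modification
        self._items.append(item)
        # Any real modification "deallocates" every outstanding stream.
        for stream in self._streams:
            stream.valid = False
        self._streams.clear()

    def stream(self):
        new_stream = ToyStream(self._items)
        self._streams.append(new_stream)
        return new_stream


path = ToyPath()
path.add("a1")
path.add("b1")
s = path.stream()
first = s.next()   # read one item from the stream
path.add("a2")     # modifying the path invalidates s
after = s.next()   # the old stream is no longer usable
fresh = list(iter(path.stream().next, None))  # a new stream sees all items
```

The practical consequence is the usual one for this pattern: finish reading a stream, or collect the handles you need from it, before adding to or removing from the object it came from, and create a new stream afterwards.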

Substructure and path objects always have a molecule as their base object (see dt_base()). Their existence depends on the existence of the base molecule; deallocating a molecule causes all path and substructure objects of that molecule to be deallocated.
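
The ownership relationship can be sketched the same way. The classes below are hypothetical stand-ins written in plain Python, not Toolkit calls; they only illustrate that each substructure or path keeps a reference to its base molecule and that deallocating the molecule takes its dependents with it.

```python
# Toy model only: hypothetical classes, not Daylight Toolkit calls.
class ToyChild:
    """Stands in for a substructure or path object."""

    def __init__(self, base, kind):
        self.base = base   # the molecule this object was allocated from
        self.kind = kind   # "substructure" or "path"
        self.alive = True


class ToyMolecule:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self._children = []

    def _alloc(self, kind):
        child = ToyChild(self, kind)
        self._children.append(child)
        return child

    def alloc_substruct(self):
        return self._alloc("substructure")

    def alloc_path(self):
        return self._alloc("path")

    def dealloc(self):
        # Deallocating the molecule deallocates everything based on it.
        for child in self._children:
            child.alive = False
        self._children.clear()
        self.alive = False


mol = ToyMolecule("example")
sub = mol.alloc_substruct()
path = mol.alloc_path()
base_before = sub.base  # the base object is the molecule itself
mol.dealloc()           # sub and path do not outlive mol
```

A program that needs a substructure's contents after its molecule is gone should therefore copy out whatever it needs first, rather than holding on to the substructure or path object.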

8.2 Functions on Substructures and Paths

dt_alloc_substruct(Handle mol) => substruct
Returns a new substructure object whose base object is the given molecule. The new substructure object is initially empty (contains no atoms or bonds).

dt_alloc_path(Handle mol) => path
Returns a new path object whose base object is the given molecule. The new path object is initially empty (contains no atoms or bonds).

dt_add(Handle sp, Handle ab) => boolean
Add an atom or bond to the substructure or path. The atom or bond must be from the same molecule as the substructure's or path's base object.

Adding an object to a substructure simply adds it if it is not there; the order in which objects are added to a substructure is not remembered. Adding an object to a path adds it to the end of the path unless it is already in the path, in which case the requested addition is ignored.

dt_member(Handle sp, Handle ab) => boolean
Returns TRUE if and only if the given atom or bond is a member of the substructure or path.

dt_remove(Handle sp, Handle ab) => boolean
Remove an atom or bond from a substructure or path.

Removing an atom or bond from a substructure may cause its ordering to change in arbitrary ways. Removing an atom or bond from a path leaves the order of the remaining objects unchanged.
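
The difference between the two object types under dt_add(), dt_member(), and dt_remove() can be summarized with a small model in plain Python. The classes `ToySubstructure` and `ToyPath` are hypothetical and do not call the Toolkit; strings stand in for atom and bond handles, and the boolean return values of the real functions are deliberately not modeled, since their exact meaning is not stated above.

```python
# Toy model only: hypothetical classes that mirror the documented behavior.
class ToySubstructure:
    """Unordered: membership matters, insertion order does not."""

    def __init__(self):
        self._items = set()

    def add(self, item):
        self._items.add(item)  # adding an existing member changes nothing

    def member(self, item):
        return item in self._items

    def remove(self, item):
        self._items.discard(item)

    def contents(self):
        return set(self._items)


class ToyPath:
    """Ordered: items keep the position at which they were first added."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:  # a repeated addition is ignored
            self._items.append(item)

    def member(self, item):
        return item in self._items

    def remove(self, item):
        if item in self._items:
            self._items.remove(item)  # the remaining order is unchanged

    def contents(self):
        return list(self._items)


path = ToyPath()
sub = ToySubstructure()
for handle in ["a1", "b1", "a2", "b2", "a3", "a1"]:  # "a1" is added twice
    path.add(handle)
    sub.add(handle)

path.remove("b1")
sub.remove("b1")

path_contents = path.contents()  # order of the survivors is preserved
sub_contents = sub.contents()    # compare as a set; no order is promised
```

When order matters, use a path and read it back in sequence; when only membership matters, use a substructure and never rely on the order in which its contents are returned.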
