2. Basics: Daylight Toolkit Objects

2.1 Introduction to Objects

The Daylight Toolkit uses an object mechanism to simplify the task of programming for chemistry. We begin with a short example that illustrates the object-aspects of the Daylight Toolkit (many other examples come on the tapes with the Toolkits).

     /*----------------------------------------------------
      *  thorload.c -- a simple program to load data into
      *  a THOR database.
      *---------------------------------------------------*/

     #include <stdio.h>
     #include <string.h>
     #include <dt_smiles.h>
     #include <dt_thor.h>

     int main(int argc, char **argv)
     {
       dt_Handle server, db, tdt;
       char *dbname, *servername;
       char *tdtbuf;
       int   tdtbufsize, tdtlen, isnew;

       /**** Specify the server (machine) and database name ***/
       servername = "my_machine_name";
       dbname = "medchem02demo";

       /**** Connect to server and open database ****/
       server = dt_thor_server(strlen(servername), servername,
                     4, "thor", 4, "thor", 0, "");
       db     = dt_open(server, strlen(dbname), dbname, 1, "w", 0, "", &isnew);

       /**** Load data until input is exhausted ****/
       while (1 == du_readtdt(stdin, &tdtlen, &tdtbuf, &tdtbufsize)) {
         tdt = dt_thor_str2tdt(db, tdtlen, tdtbuf, 1);
         dt_thor_tdtput(tdt, 0);
       }
       dt_dealloc(server);
       return 0;
     }

The important features of this example are:

  • The variables server, db and tdt are identifiers, or handles, that your code and the Daylight Toolkit use to refer to an object (in this case, Thor server, Thor database, and Thor datatree objects, respectively).

  • The handles themselves contain no information - they are not pointers to complex structures.

  • The Daylight Toolkit manages the objects for you - you need not be concerned with details of how the Toolkit represents a server, database or datatree (or, in other parts of the Toolkit, a molecule or depiction).

  • Error handling is simplified: the NULL object (returned when errors are detected) is a valid handle that refers to nothing. In the above example, the Daylight Toolkit will not generate fatal errors even when the server can't be reached or the database can't be opened; it simply will not load anything. Functions (not illustrated) are provided to retrieve error messages from the Toolkit.

  • Because objects are managed by the Daylight Toolkit, the interface to various programming languages is straightforward: the Daylight Toolkit works equally well with C, FORTRAN, Pascal, or LISP.

Many programmers will recognize the similarity of this approach to Object-Oriented Programming (OOP). Although many of the ideas described here are borrowed from OOP, the Daylight Toolkit is not as complete or complicated as a true OOP system. However, the Toolkit uses the following key OOP concepts:

You deal with objects:
Everything you work with, such as molecules, depictions, databases and THOR Data Trees, is an object.

Objects are referenced by their handles:
An object is referred to by its handle, which the Toolkit gives you when you create the object. This handle is typically an arbitrary 32-bit integer, but even this level of detail is irrelevant: it does not matter to you what a handle is so long as you use it correctly.

Handles are opaque:
An object's handle is all you know about directly. The handle is opaque -- you can't see what is inside the object it represents.

The Toolkit uses a strict functional interface:
You never work on data structures or "common blocks". Instead, you call Toolkit functions to create, modify, use, and destroy Toolkit objects.

Objects are self-describing:
Each object "knows" what it is. Many Toolkit functions will take a variety of different object types (they are polymorphic; see the chapter entitled POLYMORPHIC FUNCTIONS); the function "asks" the object what type it is and performs the appropriate action.
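The concepts above can be illustrated with a toy handle table. This is only a minimal sketch under stated assumptions, not the Toolkit's implementation: every name in it (toy_Handle, toy_create(), toy_type(), toy_text(), the TOY_ constants) is hypothetical. It shows a handle that is just an integer, an object structure that stays private to the "toolkit" side, and objects that record their own type.

```c
#include <assert.h>
#include <string.h>

/* All names below are hypothetical; none belong to the Daylight Toolkit. */
typedef int toy_Handle;

enum { TOY_TYP_INVALID = 0, TOY_TYP_MOLECULE = 1, TOY_TYP_STRING = 2 };
#define TOY_NULL_OB 0
#define TOY_MAX     16

/* Private to the "toolkit": callers only ever see toy_Handle values. */
struct toy_object {
    int  type;       /* each object records what it is */
    char text[32];   /* payload */
};

/* Slot 0 is never used, so handle 0 can mean "no object". */
static struct toy_object toy_table[TOY_MAX + 1];
static int toy_used = 0;

/* Create an object; returns TOY_NULL_OB if the table is full or text is too long. */
toy_Handle toy_create(int type, const char *text)
{
    assert(text != NULL);
    if (toy_used >= TOY_MAX || strlen(text) >= sizeof toy_table[0].text)
        return TOY_NULL_OB;
    toy_used++;
    toy_table[toy_used].type = type;
    strcpy(toy_table[toy_used].text, text);
    return toy_used;
}

/* Ask an object what it is; unknown handles report TOY_TYP_INVALID. */
int toy_type(toy_Handle h)
{
    if (h <= 0 || h > toy_used)
        return TOY_TYP_INVALID;
    return toy_table[h].type;
}

/* Strict functional interface: data is reached only through calls like this. */
const char *toy_text(toy_Handle h)
{
    if (toy_type(h) == TOY_TYP_INVALID)
        return NULL;
    return toy_table[h].text;
}
```

A caller would write, for example, h = toy_create(TOY_TYP_MOLECULE, "CCO") and later toy_type(h) or toy_text(h); it never touches struct toy_object directly, so the representation could change without affecting application code.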

2.2 Handles

As noted above, handles are the Toolkit's "name" for each object that it creates, and the handle is the only thing your application program knows directly about the object's representation. Because objects are opaque, it is irrelevant to you what a handle actually represents (in fact, different versions of the Toolkit use different methods to assign handles to objects).

Although handles are opaque, they have several properties that are important to the application programmer. These properties are the only ones that the Toolkit guarantees:

Uniqueness: Each handle is guaranteed to be unique at all times:

  • If two handles are equal, they refer to the same object.
  • If two handles are not equal, they refer to different objects.

Note that uniqueness is not guaranteed over time: the Toolkit may re-use a handle if the original object it represents is discarded.

Revocation: Some toolkit functions cause previously-returned handles to become invalid. For example, the handle for an object becomes invalid if the object is removed from the system with dt_dealloc(). A handle that has become invalid in this way is said to have been revoked. Generally speaking, all operations on revoked handles produce undefined results. It is up to the application programmer to guarantee that revoked handles are not used. For functions that cause revocations, the specific description of each function in the Daylight Programmer's Reference Manual will say exactly which handles are revoked.

Vigilance: To assist programmers during code development, "vigilant" versions of the Daylight Toolkit are available. These versions may be able to detect the use of an invalid handle. In other words, some toolkit implementations do define a behavior when an operation is applied to a revoked handle. In such vigilant versions, passing a revoked handle to a toolkit function will cause an error return. For extra help in detecting errors, a function named dt_invalid() may be used to test the validity of a handle; it is explained more fully in the chapter entitled POLYMORPHIC FUNCTIONS. A second vigilance function, dt_vh_stop_here(), is provided for use with a debugger. The Toolkit calls this function when an invalid handle is detected.
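One way an implementation could detect revoked handles is to pack a generation count into each handle, so that a handle issued before a deallocation no longer matches its slot. The sketch below is an assumption made for illustration and says nothing about how the vigilant Daylight libraries actually work; toy_alloc(), toy_dealloc() and toy_invalid() are hypothetical names. Unlike plain handle re-use, this scheme gives a re-used slot a different handle value.

```c
#include <assert.h>

/* Hypothetical toy; not the Daylight Toolkit's mechanism. */
typedef unsigned int toy_Handle;
#define TOY_NULL_OB 0u
#define TOY_SLOTS   8u

static unsigned int toy_generation[TOY_SLOTS]; /* bumped on every dealloc */
static int          toy_live[TOY_SLOTS];       /* 1 while the slot holds an object */

/* Low 8 bits: slot number + 1 (so 0 is never issued); upper bits: generation.
   Generation wrap-around is ignored in this toy. */
static toy_Handle toy_pack(unsigned int slot, unsigned int gen)
{
    assert(slot < TOY_SLOTS);
    return (gen << 8) | (slot + 1u);
}

/* Issue a handle for the first free slot, or TOY_NULL_OB if none is free. */
toy_Handle toy_alloc(void)
{
    unsigned int slot;
    for (slot = 0; slot < TOY_SLOTS; slot++) {
        if (!toy_live[slot]) {
            toy_live[slot] = 1;
            return toy_pack(slot, toy_generation[slot]);
        }
    }
    return TOY_NULL_OB;
}

/* Returns 1 if h was never issued or has been revoked; TOY_NULL_OB counts as valid. */
int toy_invalid(toy_Handle h)
{
    unsigned int low = h & 0xFFu;
    unsigned int gen = h >> 8;
    if (h == TOY_NULL_OB)
        return 0;
    if (low == 0u || low > TOY_SLOTS)
        return 1;
    return !toy_live[low - 1u] || toy_generation[low - 1u] != gen;
}

/* Revoke h: free its slot and advance the generation. Returns 1 on success. */
int toy_dealloc(toy_Handle h)
{
    unsigned int slot;
    if (h == TOY_NULL_OB || toy_invalid(h))
        return 0;
    slot = (h & 0xFFu) - 1u;
    toy_live[slot] = 0;
    toy_generation[slot]++;
    return 1;
}
```

With this layout, a stale handle is rejected even after its slot has been handed out again, which is the kind of check that is useful during development but costs a little on every call.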

2.3 Object Types

The Daylight Toolkit supports a small number of object types. These are divided into several sections, corresponding to the Toolkit's parts (e.g. SMILES, Depict, THOR, etc.). Each of these object types is explained in more detail in the chapter for that section of the Daylight Toolkit; here we give an abbreviated list as an introduction to the object-type concept.

General:

  stream        an ordered enumeration of objects from a base object
  sequence      an ordered sequence of objects of any type

SMILES:

  molecule      a molecule structure
  atom          an atom in a molecule
  bond          a bond in a molecule
  cycle         a cycle in a molecule

Depict:

  depiction     a 2-d representation of a molecule
  conformation  a 3-d representation of a molecule

THOR:

  server        a connection to a THOR server process
  database      a database of chemical information
  datatree      a single entry from a database
  dataitem      a datum from a datatree

Each of the above object types is represented by a symbolic constant:

  Object Type   Symbolic Name
  server        TYP_SERVER
  stream        TYP_STREAM
  molecule      TYP_MOLECULE

and so forth. The exact symbolic names for each object type can be found in the Daylight Programmer's Reference Manual.

There are two "pseudo object" types: TYP_ANY and TYP_INVALID. The pseudo object type TYP_ANY is used when any object is acceptable. Since it is a pseudo object type, there are no actual objects of type TYP_ANY. Similarly, TYP_INVALID may be returned by functions to indicate that the specified object is unknown or incorrect. There are no actual objects of type TYP_INVALID.

2.4 The NULL_OB Handle

One special handle value is used to represent "nothing"; it indicates that no object is present. It is called the null object, and its handle is represented by the symbolic constant NULL_OB. A handle whose value is NULL_OB is a valid handle, but it does not refer to any object and it has no type.

NULL_OB plays a special role in the Daylight Toolkit: Functions that return objects will return NULL_OB if an error occurs, and functions that take object parameters will accept NULL_OB as a valid handle (they ignore it and do nothing). This means that error management in applications that use the Toolkit is somewhat simplified -- in many cases the handle returned by one function can be safely passed to the next function whether the first function failed (returned NULL_OB) or succeeded (returned a handle to a real object). It is safe to pass NULL_OB anywhere a handle is expected. See the chapter on error handling for more discussion of this topic.
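The effect of this convention can be sketched with a few toy functions. The names toy_connect(), toy_open(), toy_put() and toy_load() are hypothetical stand-ins and do not mirror the behavior of any specific Toolkit call; the sketch only shows how a null handle can flow through a chain of calls without intermediate error checks.

```c
#include <assert.h>

/* Hypothetical toy; names and return values are illustrative only. */
typedef int toy_Handle;
#define TOY_NULL_OB 0

static int toy_records_stored = 0;

/* Connect to a server; fails (returns TOY_NULL_OB) when reachable is 0. */
toy_Handle toy_connect(int reachable)
{
    return reachable ? 1 : TOY_NULL_OB;
}

/* Opening a database on a null server quietly yields a null database. */
toy_Handle toy_open(toy_Handle server)
{
    return (server == TOY_NULL_OB) ? TOY_NULL_OB : 2;
}

/* Storing into a null database does nothing and reports failure. */
int toy_put(toy_Handle db)
{
    if (db == TOY_NULL_OB)
        return 0;
    toy_records_stored++;
    return 1;
}

/* No checks between the steps: a failure early on simply means nothing is stored. */
int toy_load(int reachable, int nrecords)
{
    toy_Handle server = toy_connect(reachable);
    toy_Handle db = toy_open(server);
    int i, stored = 0;

    assert(nrecords >= 0);
    for (i = 0; i < nrecords; i++)
        stored += toy_put(db);
    return stored;
}
```

Here toy_load(0, 3) stores nothing and returns 0 rather than failing fatally, while toy_load(1, 3) stores three records. A real application would still want to inspect the final result and report the error.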

NOTE: In current implementations, NULL_OB is defined to be zero. However, there is no guarantee that this will always be the case. Application programs should explicitly compare for equality or inequality to NULL_OB rather than using constructs like "if (!my_handle) ...". Programs that assume NULL_OB is zero are explicitly non-portable.
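To see why the explicit comparison matters, consider a hypothetical implementation in which the null handle is not zero. The sketch below deliberately defines TOY_NULL_OB as -1 (an assumption made purely for illustration; all names are invented) and contrasts the portable test with the "if (!my_handle)" style.

```c
#include <assert.h>

/* Hypothetical toy in which the null handle is deliberately nonzero. */
typedef int toy_Handle;
#define TOY_NULL_OB (-1)

/* A call that fails and returns the null handle. */
toy_Handle toy_open_fails(void) { return TOY_NULL_OB; }

/* A call that succeeds and happens to return handle 0. */
toy_Handle toy_open_ok(void) { return 0; }

/* Portable: compares against the symbolic constant. */
int handle_is_usable(toy_Handle h)
{
    return h != TOY_NULL_OB;
}

/* Non-portable: equivalent to "if (h) ...", which assumes null is zero. */
int handle_is_usable_wrong(toy_Handle h)
{
    return h != 0;
}
```

In this toy, handle_is_usable_wrong() rejects the perfectly good handle 0 and accepts the failed handle -1; handle_is_usable() gives the right answer in both cases and would keep doing so whatever value the constant takes.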

2.5 Daylight Version Handling

The Daylight Toolkit provides both runtime and compile-time version handling. Runtime version handling lets user code report which version of the runtime libraries is currently in use. The user code can compare that version number with the expected Daylight release version and, if they differ, print an error message describing the inconsistency, along with a suggestion to check LD_LIBRARY_PATH, which determines which runtime libraries are loaded.

The runtime version and creation date can be accessed with the dt_info() function. When dt_info() is called with the "toolkit_version" parameter, runtime libraries built at version 4.81 or later return a version number; libraries built before 4.81 return NULL.

The compile-time versions refer to when the entire toolkit was compiled. They are defined in dt_smiles.h as DX_TOOLKIT_VERSION and DX_TOOLKIT_DATE. These are also the version numbers and dates that are referenced in the man pages and other Daylight documentation.

The user can use DX_TOOLKIT_DATE and DX_TOOLKIT_VERSION to check that the runtime libraries in use match the toolkit release the code was compiled with.

Example of using runtime and compile time versions.

     int main()
     {
       ...
       rver = dt_info(&rlen, NULL_OB, "toolkit_version");
       if (rver == NULL)
         fprintf(stderr, "WARNING: you're using an older (pre-4.81) "
                 "toolkit runtime library, check LD_LIBRARY_PATH\n");
       else if (0 != strncmp(rver, DX_TOOLKIT_VERSION, rlen))
         fprintf(stderr, "You compiled this program with "
                 "version %s but are running it against the "
                 "%.*s toolkit runtime library.\n", DX_TOOLKIT_VERSION,
                 rlen, rver);
     }
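One caveat about the strncmp() test in the example: it compares only the first rlen characters, so a runtime string that is a proper prefix of the compile-time string (say "4.8" against "4.81") would be reported as a match. A stricter comparison can be sketched as follows, assuming the runtime version is delivered as a pointer plus length, as the example's use of "%.*s" suggests. The helper name version_matches() is hypothetical and not part of the Toolkit.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 only if the counted string (runtime, rlen)
   is exactly equal to the NUL-terminated string expected, otherwise 0.
   A NULL runtime string (e.g. from an old library) never matches. */
int version_matches(const char *runtime, int rlen, const char *expected)
{
    assert(expected != NULL);
    if (runtime == NULL || rlen < 0)
        return 0;
    /* Lengths must agree before the bytes are compared. */
    return strlen(expected) == (size_t)rlen
        && memcmp(runtime, expected, (size_t)rlen) == 0;
}
```

In the example, the second test could then read "else if (!version_matches(rver, rlen, DX_TOOLKIT_VERSION))". Because only the first rlen bytes of the runtime string are examined, the helper does not depend on that string being NUL-terminated.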

3. Basics: Polymorphic Functions

3.1 Polymorphism

There are many functions, such as counting, copying, deallocating, and naming, that can be applied to several different types of objects. We refer to these functions as polymorphic.

The idea of a polymorphic function might seem foreign at first, but it is actually quite familiar to all programmers. Take, for example, the "*" operator in FORTRAN. When applied to two numbers, we expect it to cause the two numbers to be multiplied. However, on closer inspection, the "*" operator turns out to be polymorphic: it can be applied to integers, single-precision floating-point numbers, double-precision floating-point numbers, and complex numbers.

The difference between the FORTRAN style of polymorphism and that employed by the Daylight Toolkit is only that the nature of the operation is determined at run time rather than at compile time. That is, the FORTRAN compiler looks at the operands and decides which of several functions to apply, then generates the appropriate code; at run-time the decision as to which function to apply has already been made. In the Daylight Toolkit, a dispatch function examines the object of interest and decides "on the spot" (i.e. at run time) which function to apply.
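Run-time dispatch of this kind can be sketched in a few lines of C. The example is a hypothetical illustration and not the Toolkit's dispatch code: the structure is left visible (unlike real Toolkit objects) so that the mechanism is easy to follow, and the names toy_count() and the TOY_ constants are invented.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy types; a real toolkit would hide this structure. */
enum toy_type { TOY_INTEGER, TOY_STRING, TOY_LIST };

struct toy_object {
    enum toy_type type;      /* the object "knows" what it is */
    int           ivalue;    /* used by TOY_INTEGER */
    const char   *svalue;    /* used by TOY_STRING  */
    int           nmembers;  /* used by TOY_LIST    */
};

/* One entry point for several object types. The branch is chosen at run
   time from the object's own type tag. Returns -1 for object types to
   which counting does not apply, and for a missing object. */
int toy_count(const struct toy_object *ob)
{
    if (ob == NULL)
        return -1;
    switch (ob->type) {
    case TOY_STRING:
        assert(ob->svalue != NULL);
        return (int)strlen(ob->svalue);   /* count characters */
    case TOY_LIST:
        return ob->nmembers;              /* count members */
    default:
        return -1;                        /* counting not defined */
    }
}
```

The caller writes toy_count(ob) without knowing whether ob is a string or a list, in contrast to the FORTRAN case where the compiler must know the operand types when it generates code.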

Not all polymorphic functions can be applied to all objects. The following two sections respectively describe "generic" polymorphic functions (those that apply to all objects) and "semi-generic" polymorphic functions (those that apply to more than one object type but not to all of them).

NOTE: The specific behavior of polymorphic functions when given different object types is rigorously defined in the reference manual pages.

As a simple example of the power of polymorphism, the following function accepts any object and prints out all of its string value(s):

     dt_Integer dump_strings(dt_Handle ob) {
         dt_Handle m;
         dt_String line;
         dt_Integer len;

         /*  Get the string value. */
         line = dt_stringvalue(&len, ob);

         /*  If the object has one, print it. */
         if (line != NULL)
           fprintf(stderr, "Stringvalue is: %.*s\n", len, line);

         /* Check to see if the object is a stream or sequence.  If
            so, examine the members also. */
         if ((dt_type(ob) == TYP_STREAM) ||
             (dt_type(ob) == TYP_SEQUENCE))
           {
           dt_reset(ob);
           while (NULL_OB != (m = dt_next(ob)) && !dt_atend(ob))
             dump_strings(m);  /* Call recursively for each member. */
           }
         return (TRUE);
     }

The important features of this example are:

  • The function need not know in advance what type of object it will be given. The only exception is where special processing is desired (here, for streams and sequences).

  • If dt_stringvalue() fails, we don't do anything special. It simply means that the given object doesn't have a string value, or doesn't respond to the dt_stringvalue() function. In either event, we continue.

3.2 Generic Functions

The following functions work on all Daylight Toolkit objects.

dt_adjunct(Handle ob) => object
Retrieve the adjunct object associated with ob (see dt_setadjunct()).

dt_invalid(Handle ob) => boolean
If the Daylight Toolkit is of the "vigilant" type and can determine that the given handle ob is invalid, return TRUE. Otherwise return FALSE.

dt_setadjunct(Handle ob, object adjunct_ob) => object
Makes adjunct_ob the adjunct of ob -- a simple mechanism to let one object "point" to another.

dt_type(Handle ob) => integer
Return the type of the given object, represented as an integer.

dt_typename(Handle ob) => string
Return a string naming the type of the given object, e.g. "molecule" for a molecule object.

3.3 Semi-Generic Functions

The following functions are generic in that they apply to more than one object type, but there may be object types to which they do not apply.

dt_add(Handle set, Handle object) => boolean
Adds object to set.

dt_base(Handle ob) => object
Returns the base object -- the object from which ob was derived. Examples of objects that have a base object are: depictions and conformations (base object is a molecule); and streams (base objects are molecules, THOR data trees, etc.).

dt_copy(Handle ob) => object
Returns a handle for a copy of the given object. A copy of an object shares no structure with the original. A copy of an object is guaranteed to behave exactly like the original in every respect.

dt_count(Handle ob, integer typeval) => integer
Counts and returns the number of objects of the specified type within or associated with the object ob.

dt_dealloc(Handle ob) => boolean
The given object is removed from the system and its handle is revoked. Frees all resources used by the object (memory, open files, etc.). Once revoked, a handle must not be used; doing so has undefined results, which may include "crashes" of the Toolkit.

dt_info(Handle ob, string whatinfo) => string
Return information about an object. Many objects have special properties in the sense that they are not set by Toolkit functions, but rather arise from external sources. An example is a THOR database: a call to dt_info(db_handle, "users") will return a string containing the names of all other users who currently have the database open. Using dt_info with a NULL_OB will return information about the runtime library that is currently being used. Using "creation_date" as the string will return the creation date of the runtime library and using "toolkit_version" will return the toolkit version number.

dt_member(Handle set, Handle object) => boolean
Returns TRUE if the given object is a member of the set.

dt_molecule(Handle ob) => molecule
Some objects (e.g. THOR Data Trees) have a "hidden" molecule object associated with them. It is often convenient to use this molecule object rather than re-creating it; for example, when you want a unique SMILES (dt_cansmiles()) for the object of interest. This function will return the hidden molecule's handle.

dt_parent(Handle ob) => object
Returns the parent of the specified object. Examples of objects that have parents are: atoms and bonds (parent is a molecule object); THOR data tree (parent is the database from which the data was retrieved).

dt_remove(Handle set, Handle object) => boolean
Removes object from the set.

dt_setstringvalue(Handle strobj, string str) => boolean
Changes an object's contents to the specified string. Several objects, including string objects, THOR datatree objects, and Fingerprint objects, have a string value that can be set by this function.

The object maintains its own copy of str, so the contents of str need not be maintained after calling this function. Note that not all objects that return a string (see dt_stringvalue()) allow you to set the string value; in some cases the string value is derived from other properties.

dt_stringvalue(Handle ob) => string
Returns the string value of the object. Many objects, including string objects, THOR objects, and Fingerprint objects, have a string value that can be accessed by this function.

The string returned by this function is "owned" by the Toolkit, and should not be modified in any way by the application. For example, attempting to directly overwrite the contents of a string object is an error -- although it may work in one implementation or with one particular language, it is an unsupported operation and may fail in future releases of the Toolkit or with compilers on different operating systems. One should always use dt_setstringvalue() to change an object's contents.
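When an application needs a modifiable version of a returned string, a safe pattern is to copy it into memory the application owns. The helper below is a minimal sketch: copy_counted_string() is a hypothetical name, and the sketch assumes the string arrives as a pointer plus length and may not be NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd, NUL-terminated copy of the first
   len bytes of src, or NULL if src is NULL or memory runs out.
   len must not be negative. The caller owns the copy and must free() it. */
char *copy_counted_string(const char *src, int len)
{
    char *copy;

    assert(len >= 0);
    if (src == NULL)
        return NULL;
    copy = malloc((size_t)len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, (size_t)len);
    copy[len] = '\0';   /* terminate, since the source may not be */
    return copy;
}
```

The application can then edit the copy freely and, if the object allows it, hand the result back through dt_setstringvalue(); the buffer that the Toolkit returned is never written to.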

dt_stream(Handle ob, integer typecode) => stream
This polymorphic function is described in detail in the chapter on Stream and Sequence objects.