Daylight v4.9
Release Date: 1 February 2008


thordiff - print differences between two Thor databases

Unix Synopsis

thordiff [options] database1 database2 [outfile]


Compares each TDT in database1 to the equivalent TDT in database2. Differences, if any, are reported on standard output.

thordiff(1) reports the minimum information needed to show the differences between two TDTs. When two differing TDTs are printed, dataitems that are the same are suppressed, and identifiers are shown only if they differ or if dataitems that are part of the identifier's subtree differ.

TDT differences are printed with "1: " and "2: " prefixes to each line, to make it clear that the TDT is from database1 or database2, respectively.
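Because every difference line carries one of these prefixes, the report can be separated by database with standard tools. The following is a minimal sketch, not part of thordiff(1) itself; the function name lines_from_db is hypothetical, and it assumes only the "1: " and "2: " line prefixes described above.

```shell
# lines_from_db N
# Read a thordiff report on standard input and print only the lines
# that came from database N (1 or 2), with the "N: " prefix removed.
# Hypothetical helper; relies only on the documented line prefixes.
lines_from_db() {
    sed -n "s/^$1: //p"
}
```

For example, "thordiff db1 db2 | lines_from_db 2" would print only the database2 side of each reported difference.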

It is very likely that the two TDTs will have different timestamps, and there may also be other datatypes that are not of interest. The -INCLUDE_DATATYPES and -EXCLUDE_DATATYPES options allow you to specify which datatypes are to be considered (see below).

Note that the comparison is one-way. In particular, TDTs that are in database2 but not in database1 are not reported. For a complete list of differences, you must invoke thordiff(1) twice, reversing the order of the databases.


-INCLUDE_DATATYPES "tag [tag ...]"
-EXCLUDE_DATATYPES "tag [tag ...]"
These two options name specific datatypes (via their tags) that are to be considered or ignored during comparison. INCLUDE_DATATYPES is considered first. If it is the special word "ALL", then all datatypes present in the database are marked for consideration. Otherwise, only datatypes on the list of tags are considered. Next, any datatype tag on the list specified by EXCLUDE_DATATYPES is removed from the list of datatypes to be considered.

Multiple tags can be specified; they are separated by space, tab, comma, or a vertical bar "|" character. If spaces or tabs are used, the tags must be quoted on the command line to make them a single parameter. For example, $SMI,CP and "$SMI CP" both specify the SMILES and Computed-logP datatypes.

The default for INCLUDE_DATATYPES is "ALL", and for EXCLUDE_DATATYPES is "" (none). Thus, the default is to consider all data.
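The include-then-exclude order described above can be illustrated with a short shell sketch. It is not how thordiff(1) is implemented; the function names split_tags and resolve_datatypes are hypothetical, the list of datatypes present in the database is passed in by hand, and "ALL" is assumed to be matched exactly as written.

```shell
# split_tags LIST
# Print one tag per line, splitting on space, tab, comma, or "|".
split_tags() {
    printf '%s\n' "$1" | tr ' \t,|' '\n\n\n\n' | sed '/^$/d'
}

# resolve_datatypes ALL_TAGS INCLUDE EXCLUDE
# Print the tags that would be considered: start from INCLUDE
# (or from ALL_TAGS when INCLUDE is the word ALL), then drop
# every tag that also appears in EXCLUDE.
resolve_datatypes() {
    all=$1 include=$2 exclude=$3
    if [ "$include" = "ALL" ]; then
        considered=$(split_tags "$all")
    else
        considered=$(split_tags "$include")
    fi
    printf '%s\n' "$considered" | while IFS= read -r tag; do
        [ -n "$tag" ] || continue
        # -F: fixed string, so "$" in a tag such as $SMI is literal.
        # -x: the whole line must match.
        if ! split_tags "$exclude" | grep -Fqx -e "$tag"; then
            printf '%s\n' "$tag"
        fi
    done
}
```

With a database assumed to contain the datatypes $SMI, CP, and TS, calling resolve_datatypes '$SMI CP TS' ALL TS leaves $SMI and CP, which corresponds to excluding timestamps from an otherwise complete comparison.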

Specifies whether TDTs are written in "list" format (one line per dataitem) or "dump" format (one line per TDT). Default is LIST.
Specifies the cache level. OFF disables caching altogether. WRITETHRU causes the database's hash table to be cached for reading, but writes are immediately posted to the disk. READWRITE caches the hash table for reading and writing; changes aren't posted to the disk until the database is closed. WRITETHRU_ALL caches the entire database (which may require considerable memory, depending on the database's size) for reading, but immediately posts modified records to the disk. READWRITE_ALL caches the entire database; changes aren't posted to the disk until the database is closed or "sync'ed". Note that this option has no effect if caching is disabled or forced for the database (see thormake(1), sthorman(1)). Default is "" (unspecified -- use the database's default).

The following options are common to most or all "thorfilter" programs. They are described in more detail in thorfilters(1).


TRUE means don't allow passwords on the command line (require interactive entry). Default: TRUE.
Names the default TCP/IP service or "port" of the Thor server. Default: thor.
The interval (number of TDTs) between minor reports. The minor report is a period "." printed on "standard error". N = 0 suppresses the minor report. Default is 10.
The interval (number of TDTs) between major reports. The major report prints the number of TDTs processed and the number of errors so far, followed by a newline, to "standard error". N = 0 suppresses the major report. Default is 500.

Return Value

Returns status zero if it is able to open and compare the two databases, or status one if there is a problem.


Check that everything in db1 is identical, except for timestamps, to db2:
thordiff -exclude_datatype TS db1 db2
Check that all of the SMILES in db1 are also present in db2 (note that the UNIX syntax shown below requires $SMI to be quoted so that it won't be interpreted as an environment variable):
thordiff -include_datatype '$SMI' db1 db2
Check that two databases are identical in every respect. We have to run the program twice, because it only checks one way (as discussed above):
thordiff db1 db2
thordiff db2 db1
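The two-pass check can be wrapped in a small shell function so that both directions are always run together. This is a sketch under stated assumptions: the function name thordiff_both and the THORDIFF variable are hypothetical, and any options are simply passed through to both invocations. Note that, as described under Return Value, the exit status indicates whether the databases could be opened and compared, so the combined status here reports problems, not whether differences were found.

```shell
# Command to run; set THORDIFF to point at a specific installation.
: "${THORDIFF:=thordiff}"

# thordiff_both DB1 DB2 [options ...]
# Run the one-way comparison in both directions.
# Returns 0 if both runs succeeded, 1 if either reported a problem.
thordiff_both() {
    db1=$1 db2=$2
    shift 2
    status=0
    echo "== $db1 compared against $db2 =="
    "$THORDIFF" "$@" "$db1" "$db2" || status=1
    echo "== $db2 compared against $db1 =="
    "$THORDIFF" "$@" "$db2" "$db1" || status=1
    return "$status"
}
```

For example, "thordiff_both db1 db2 -exclude_datatype TS" would run the timestamp-insensitive comparison shown above in both directions.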



Daylight License

programs: thor

Related Topics

dayevict(1) daymessage(1) merlindbping(1) merlinload(1) merlinls(1) merlinping(1) merlinwho(1) thorchange(1) thorcrunch(1) thordbping(1) thordelete(1) thordestroy(1) thordump(1) thorlist(1) thorload(1) thorlookup(1) thorls(1) thormake(1) thorping(1) thorwho(1)

sthorman(1) thorserver(1) merlinserver(1) licensing(5)

Daylight Theory Manual, Daylight System Administration Manual


Bugs

None known.