| 15. THOR and Merlin Toolkits: DatabasesBack to Table of Contents
 
 15.1 Introduction
Although THOR and Merlin present very different views of the data in 
a database, both systems present the very same data.  Because of 
this, most operations on databases are identical in the THOR and 
Merlin Toolkits.  This includes opening and closing databases, 
setting the server's "search path" security operations, and 
datatype-object operations.  This chapter covers all of these common 
operations.
 
An old adage might be paraphrased here: "An example is worth a
thousand words"  Many tutorial examples of Thor and
Merlin Toolkit usage can be found in the "contrib"
directory:
 
     $DY_ROOT/contrib/src/thor/
     $DY_ROOT/contrib/src/merlin/
We strongly encourage you to study these examples before
attempting to write Thor and/or Merlin Toolkit programs.
 15.2 Search Path
THOR
and
Merlin servers
maintain a "search path" -- a list of 
directories which are to be searched for databases (see the Daylight 
System Administration Manual for more details).
(Note that the search path is a property of a server, not a database.  We
put it in the databases chapter rather than the server chapter because
it fits with other database operations.)
 
Note that the directories in the search path are interpreted by the
server's operating system, hence are in a format appropriate to that
operating system.  For example, a Macintosh client connected to a UNIX
server would use UNIX syntax to specify a database path (e.g.
"/thordb/mydb").  Similarly, environment variables are interpreted on
the server's operating system, not the client's.
 
 
dt_getsearchpath(Handle server) ==> Handle sos
Returns sequence of strings (SOS), each string-object of which 
contains a directory in the server's search path.  The order of 
directories in the SOS is the order in which the directories will be 
searched to find a database.
dt_setsearchpath(Handle server, string password, string path, integer replace);
Sets the server's search-path.  You can either replace the current path,
or add to it.
dt_getdatabases(Handle server) ==> Handle sos
Returns a sequence of string objects (SOS) containing all databases 
in server's search path.
 
 15.3 Creating and Configuring Databases
Database creation is only done by the THOR server, so the functions 
in this section don't apply to the Merlin Toolkit.  The following 
functions are used to create and configure a
THOR database.
 
 15.3.1 Database Creation
dt_thor_createdb(Handle server,int dlen, string path, int sizepri, int sizexref) => Handle database
Creates a new empty THOR database and opens it with "executive" 
permission. 
The parameter pathmust be a complete path, not a relative 
path or just a filename.
The parameters sizepriandsizexrefare the requested sizes of the 
primary and cross-reference hash tables, respectively.  For more
information about database sizes, see the
reference page for this function and the
Daylight THOR-Merlin Administration Manual. 
 15.3.2 Database Configuration
Each database can have one, two or three auxiallary databases associated 
with it:
 
 
 The datatypes database:  Contains special-purpose datatype-
definition TDTs (e.g. "$D<$SMI>_V...|".  Each time a new 
datatype
is encountered, its definition is retrieved from this 
database.  For more information about datatypes, see the
Daylight Theory Guide. 
  The indirect-data database.  Contains the expansions for 
indirect references.  For more information about indirect data, see the
Daylight Theory Guide.
 
Generally speaking:
 
 a datatypes database will have no associated databases
 an indirect-data database will have a datatypes database associated with 
it that defines the indirect-datatype definitions
 a regular chemical database will always have a datatypes database
and will often have an indirect-data database.
 
 
dt_thor_getauxillarydb(Handle database, integer type) => string path
Returns the path (directory, filename, and suffix) of the auxillary 
database associated with db of type type.  Type will be either 
DX_THOR_DATATYPESDB or DX_THOR_INDIRECTDB.
dt_thor_setauxillarydb(Handle database, integer type, string path) => boolean ok
Sets the database that is to be associated with db as type
type, wheretypeis either
DX_THOR_DATATYPESDB or DX_THOR_INDIRECTDB. 
 15.3.3 Database Crunching
After a series of deletions and/or replacements, a database's data 
files may have "holes" in them.  For example, if a TDT is enlarged 
(e.g. new dataitems added), it will no longer fit in its original 
spot; new space is allocated for it and the old space is marked 
"unused".  THOR can sometimes re-use these available spaces 
(depending on the server's implementation and configuration), but 
generally the server is unable to make 100% use of the space in a 
database that has been extensively modified.  This can cause a 
database to grow to be much larger than the amount of actual data it 
contains.
 
Crunching is the process of moving all data "forward" in the file to 
fill in these unused spaces, leaving all unused space at the end of 
the file; a pass is made through the entire database, reading and re-
writing data and rebuilding the hash table.  Once this is done, the 
file is truncated to get rid of the unused space at the end, freeing 
the file-system space for other uses.
 
The crunch operation should not be undertaken lightly, as the crunch 
operation is indivisible; while a crunch is under way, the server 
doesn't respond to other clients.  Depending on database size, a 
crunch can take anywhere from several seconds to many minutes.
 
During a crunch, the database is temporarily in invalid states; for 
example, the hash table file is invalid until the crunch operation is 
complete.  The database may be corrupted if some error occurs 
(usually an interruption such as a power failure) midway through a 
crunch.  The actual data records may not be damaged, but hash 
information is usually destroyed; the data are no longer accessible.  
In such a case, the thordump(1) utility may be required to recover 
the data.
 
 
dt_thor_crunchdata(Handle database) => boolean
Crunches (recovers unused space from) the primary data file of a
database.
dt_thor_crunchxref(Handle database) => boolean
Crunches (recovers unused space from) the cross-reference data file 
of a database.
dt_thor_autocrunch_limit(Handle database, float limit) => float limit
The database's "autocrunch" parameter is used to trigger an
automatic database crunch whenever the fraction of free space exceeds
a limit.  The fraction is computed as:
			  bytes_free
     free space =  -----------------------
		   bytes free + bytes used
This function both sets and returns the "autocrunch"
limit -- the fraction of free space which, if exceeded, will trigger
an automatic crunch.  The limit applies to both the primary and
cross-reference data files.  Iflimitis <= 0.0, the
database's limit is unaffected; this serves as a way to query the
current value without modifying it.  Values greater than 1.0 are not
permitted.  A value of 1.0 will disable autocrunching. 
 15.4 Opening and Closing Databases
The Toolkit calls to open and close databases in THOR and Merlin are 
identical, but the actual operations performed by the two servers are 
quite different:
 
 
 THOR opens all of the database's data files (the primary and 
cross-reference data files, and the primary and cross-reference hash 
tables). These files remain open as long as the database is open.  
If caching is enabled (see below), data
are read from the disk files into the Thor server's memory.
If multiple clients open the same database, the server creates a 
"client context" for each, but shares the database resources (i.e. 
the files) among the clients.
 Merlin opens the primary data file, reads its contents into 
memory, and closes the file.  The memory remains in use as long as 
the database is in use (by any user).  Each client that opens the 
same database has its own "client context" in the server, but all 
clients share the database's in-memory image.
 
dt_open(Handle server, string dbname, string permission, string password,RETURN integer isnew) ==> Handle database
Opens a database on a
THOR
or
Merlin
server.
The pathis the path (directories and filename) of the 
database on the server machine.
If it is a simple filename (no directory information), the server 
will search its search path for the database -- the first database 
found in the path that matches the name is used.
If path contains any directory information, it must be a complete 
path - partial and relative paths 
are not allowed.  When a complete path is specified, the server's 
search path is ignored.
The string permis one of "r", "w",
or "e", representing read, read/write, and executive
permission.  Thepasswordmust be the database's password for
the requested permission or higher (i.e. the executive password always
works, the write password works for reading or writing, and the read
password only works for reading.) 
dt_exists (Handle server, string dbname) ==> boolean isopen
Returns TRUE if the named database exists.
dt_isopen (Handle server, string dbname) ==> boolean isopen
Returns TRUE if the named database is already open (either open by
some other client, or marked "hold" - see
dt_hold() and
below).
dt_ispublic(Handle server, string name) ==> boolean ispublic
Returns TRUE if the named database is "public"; that is, if
it has an empty read-permission password so that it can be opened
without a password.
 15.5 Memory Usage: Cache and Hold15.5.1 Merlin HOLDIt can take a long time for a Thor or Merlin server to open a
database: Merlin's in-memory high-speed searching requires that it
scan the entire database into memory; Thor provides various levels of
"caching" -- loading heavily-used parts of the database (or
even all of the database) into memory to improve performance.
Because of the potentially high overhead to open a database, both Thor
and Merlin provide a "hold" for databases which causes the
database to remain open even when no client is using it.  For Merlin,
"hold" means the database is retained in memory.  For Thor,
"hold" means the database files remain open, and cached
portions of the database remain in memory.
dt_hold(Handle database, string thorpassword) ==> boolean ok
Marks the specified database "held", so that it will be
retained in the Merlin server's memory.  The password is that of the
user "thor", and must be supplied even if you connected to
the server as the user "thor".  Returns TRUE if the
operation succeeded.  The operation fails if the server determines
that the password is incorrect, or if database is not a Merlin
database (pool) object.
dt_isheld(Handle database) ==> boolean isheld
Returns TRUE if database is marked "hold".  Returns FALSE if the 
database is not marked "hold", or if database is not a Merlin 
database (pool) object.
dt_release(Handle database, string execpassword) ==> boolean ok
Marks the specified database "released" (not held), so that it will 
be removed from the Merlin server's memory when the last client 
closes it.  The password is that of the user "thor", and must be 
supplied even if you connected to the server as the user "thor".  
Returns TRUE if the operation succeeded.  The operation fails if the 
server determines that the  password is incorrect, or if database is 
not a Merlin database (pool) object.  Note that the database is
not released as long as any client (including the one
performing this operation) has the database open.  Clients can be
"evicted" to force closure; see
dt_evict().
 
 15.5.2 THOR Caching
A THOR server's performance can be improved by "caching": storing 
frequently-used sub-parts of the database in the server's memory.  
This is discussed in more detail in the Daylight Theory Manual and 
the Daylight System Administration Manual.
 
Remember that a server is free to silently ignore any and all caching 
requests, depending on the particular implementation and the server's 
configuration.
 
Valid caching levels are symbolic constants in the THOR Toolkit:
 
 
| Thor Caching Levels |  
  | DX_THOR_OFF | no caching |  
  | DX_THOR_RTABLE | write-through cache of hash table |  
  | DX_THOR_TABLE | complete cache of hash table |  
  | DX_THOR_RALL | write-through cache of everything |  
  | DX_THOR_ALL | complete cache of everything |  
The following functions control caching:
 
dt_thor_cache(Handle database, int level) => boolean
Enable caching for the database.  The parameter levelindicates what type of caching to perform; see the table above.
dt_thor_cachecontrol(Handle database, int when, int level) => boolean
Overrides cache requests from normal users; the cache-control 
specification becomes a property of the database, and remains in 
effect when the database is closed and reopened.  Requires executive 
permission.  The parameter levelindicates how much caching to 
perform, as described above.  The parameterwhenindicates:
 
  DX_THOR_CACHE_NEVER
  Caching is always disabled; caching requests from other clients
      are prohibited and are silently ignored.
  DX_THOR_CACHE_OK
  Caching requests from clients are allowed; the parameter
      levelis ignored.  This is the default.DX_THOR_CACHE_ALWAYS
  Caching is forced whenever a database is 
      opened, to the level specified by level; caching requests from other
      clients are prohibited and are silently ignored.
 
dt_thor_cachesync(Handle database) => boolean
Forces all cached data to be written to the disk immediately.  This 
should only be done occasionally, as it is an "atomic"
operation -- the entire sync is completed before any other client
requests are served, which can adversely affect performance.
 
 
There is only one function for managing the security of databases.  
Note that it is polymorphic; it also applies to server objects; its 
behavior when applied to server objects is described in the
Server Security Functions
chapter of this manual.
 
dt_setpassword(Handle database, string what, string authorizing_pw,string newpw) => boolean
Changes a password for the database.
Note that when a database's password is changed any existing users of 
that database are unaffected; a client program can keep a database 
open indefinitely even though the password used to open the database 
is no longer valid.  Authorization is only checked when the database 
is opened.
 
The string what indicates which of the three passwords is to be changed; it
must be one of "r", "w", or "e", for read,
write, or executive passwords, respectively.
 
 15.7 Record LockingThor provides a mechanism for "locking" a TDT ("record").
When a client program locks a record, the record is said to be "owned"
by that client.  The owner of a record has exclusive write access to
that record; no other client can modify or delete that record
(although they can read the record).  A record can only be locked by
one client at a time.
Record locking is an all-or-nothing affair:  Conceptually, if record
locking is enforced, then all records
must be locked before they can be modified.  In practice,
if you write an unlocked record, it is automatically locked, written,
then unlocked.  This means if another client has that record locked, your
write will fail due to a lock violation.
 
Once a record is locked, the client that owns the lock can do the
following:
 
When a record is locked by one client, all other clients that try
to use the record are restricted to read-only operations.
That is, they can only retrieve and examine the record (see
dt_thor_tdtget()),
and find out has it locked (see
dt_thor_tdtlockedby()).
Change the record:
The client with the lock can modify the record; no other client can.
Write the record to the database:
If a modified, locked record is written to the database, the changes
are "invisible" to other clients until that record is
unlocked ("committed").  Other clients will "see"
the original record, even though the client holding the lock sees the
changes.
Delete the record:
A deletion is essentially the same as a change:  Only the owner of the
lock can delete the record, and the record will appear unchanged
(undeleted) to other clients until it is unlocked ("committed").
Deleting a record does not unlock it -- the lock remains in
effect until it is explicitely removed (which causes the deletion to
be "committed").
Rollback modifications:
As long as a record remains locked, it can be "rolled back"
to its original state.  That is, if it has been modified or deleted,
those changes are undone by the "rollback" operation.
Rolling a record back does not unlock the record.
Commit modifications:
When the record is unlocked, it is "committed".  That is,
all modifications are finalized and become visible to other clients
using the database.  This includes deletion -- deletions take effect
when the record is unlocked.
 
It is possible to lock a record that does not exist.  This is commonly
necessary when writing a new record to the database -- the record is
locked, then written and finally unlocked ("committed").
 
The actual record locks are maintained by the Thor server.  If a
client disconnects from a Thor server or closes a database while it
still has records locked, the locks are automatically discarded and
the records are "rolled back".  Any changes made but not
committed are lost.  Locks can only be retained while a client is
connected to a Thor server and has a database open.
 
Record locking is not necessary in most situations.  Thor's ability to
merge records makes it possible for users to simultaneously modify
records with little chance of conflicts.  On the rare occasion when
conflicts arise, Thor's timestamp facility provides adequate warning.
 
The following functions control locking enforcement:
 
Other functions related to or affectd by record-locking enforcement are:
dt_thor_settdtlocking(Handle database, string password, dt_Integer enforce_locking) ==> boolean OK
Sets or unsets "record locking" enforcement for THOR database.
If enforce_lockingis TRUE, locking is enforced; if it is
FALSE, locking is disabled.
You can't change record locking enforcement while the database
is in use (i.e. open by any other client).
 
When record locking is enforced, records that are retrieved from a
writeable database are automatically locked (see
dt_thor_tdtget()).
A writeable database is one opened with "w" or
"e" permission using dt_open().
 
Record locking is a permanent property of the database
(i.e. it is retained when the database is closed and reopened), and it applies
to all client programs using the database.
 
dt_thor_tdtdttlocking(Handle database)
Returns TRUE or FALSE, indicating respectively that record locking is or is not
enforced for the specified database.
 
     dt_thor_tdtget
     dt_thor_tdtget_raw
     dt_thor_tdtlockedby
     dt_thor_tdtput
     dt_thor_tdtput_raw
     dt_thor_tdtremove
     dt_thor_tdtremove_raw
Back to Table of Contents
Go to previous chapter THOR and Merline Servers
 Go to next chapter THOR and Merlin Datatypes.
 |