Daylight Summer School 2002, June 5-7, Santa Fe, NM

DayCart TCM Interface and Worksheet -- WITH HINTS

The Daycart TCM interface is a simple CGI interface which uses a combination of normal CGI and Oracle Call Interface code to provide a web interface to the Traditional Chinese Medicines data. The interface can be accessed HERE. The web interface shows, in the source for each HTML page, the SQL query or queries which were used to retrieve the data. You will query the TCM data from the Web interface and then will perform the same queries from SQLPlus and get the results.

Starting SQLPLUS:

telnet day0, login mug/coffee
sqlplus mug/coffee
The main table of interest is: tcm01_mol. It has chemical structures. The other four tables (tcm01_plant, tcm01_plant_mol, tcm01_ref, tcm01_mol_ref) contain data about plants and references which are associated with the molecules. You can get information about the columns in the tables with the 'describe' command. For example:
describe tcm01_mol;

TCM Interface Queries

  1. Perform a structural query for Dopamine (NCCc1cc(O)c(O)cc1). Note that the Web interface returns three records. Now, perform the same query using SQLPlus. Again, you'll find the actual SQL query embedded in the source to the html ("View...Page Source" from Netscape).

    Note that the SQLPlus query returns the TCMM field, but this isn't displayed in the web page. Where is it used?

  2. Perform a structural query for the following SMILES: NCCc1cc(S)ccc1, chosen because it is not in the database. Note that the CGI attempts to find exact matches, then substructures, and failing both of those, will return similar structures. Perform the similarity search using SQLPlus. Again, you'll find the actual SQL query embedded in the source to the html.

    Modify the SQL query to display the molecular weight of each hit in the results.

    Modify the SQL query to only display the hits with a molecular weight of greater than 150.

Oracle Data Manipulation

  1. Create a demo table as a subset of tcm01_mol. Use your initials as part of the name (eg. tcm_jjd) so it doesn't collide with a classmates. Only include the first 100 records from tcm01_mol.

  2. Add a column for average molecular weight. Populate the new table with calculated values.

  3. Validate the calculated molecular weight (new_mw) against the stored molecular weight (mw). Look for a discrepancy greater than 1.0.

Multi-Table Queries

  1. Find out how many structures are common between the nci table and the tcm01_mol table.

  2. Refine the query and find all structures in both nci and tcm01_mol that contain the phenyl ethylamine substructure.

  3. Both the nci and tcm01_mol tables have a CAS Registry number column. Use the matching SMILES to validate the CAS registry numbers. That is, for all the structures in common between the two tables, how many have CAS registry number discrepancies?

Daylight Chemical Information Systems Inc.