Running Stigmata

A Stigmata analysis takes place in two parts. The first is the analysis program, stigmata. The second is the visualization program, xvstigmata, which is used to view the results. The supplied stigmata executable is built for an SGI running system 5. If you need a different binary, modify the "makefile" and type make; a stigmata executable should then appear in your current working directory. Before running the program, you can see all of the options by typing:

stigmata -h

You should see the following options:


 -m (query a second set of structures in file second.smi)
 -n (query a database titled second.smi, ascii output only)
 -o (will generate ascii.dat containing name, MODP, MSIM)
 -t thresh (modal threshold, default 0.5000)

The following example uses the sample.smi dataset to demonstrate stigmata's command-line options. To run a stigmata analysis of sample.smi at a threshold of 1.0, generating both the stigmata output for visual analysis (a TDT file) and the ASCII output (which contains just the MODP and MSIM values), type:

stigmata -t 1.0 -o sample.smi >sample.stig

The -t option sets the threshold. It can be any real number R, where 0.5 <= R <= 1.0; the default value is 0.5. The -o option generates the file "ascii.dat", which contains the MODP and MSIM values for each structure analyzed. Recall that a MODP value is similar to a substructure searching score. It ranges from 0 to 1, where 0 means that the structure has nothing in common with the modal fingerprint and 1 means that all bits set in the molecular fingerprint are also set in the modal fingerprint. MSIM is the Tanimoto coefficient between the molecular fingerprint and the modal fingerprint, and indicates the frequency of unique features that the structure contains. The last two lines of the ascii.dat file should read:

Threshold=1.000000 RMAXF=0.387755 RMINF=1.000000
Maxfp Name=ortho,bromophenol Minfp Name=benzene
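
For scripted post-processing, these two summary lines can be read back into named values. The following is a minimal Python sketch, not part of Stigmata: the function name parse_summary and the dictionary keys are hypothetical, and the layout it expects is assumed to be exactly the one shown above.

```python
import re


def parse_summary(lines):
    """Parse the two trailing summary lines of ascii.dat into a dict.

    `lines` is a sequence of two strings laid out as in the example above.
    The field layout is an assumption taken from that example only.
    """
    numbers = re.search(
        r"Threshold=(\S+)\s+RMAXF=(\S+)\s+RMINF=(\S+)", lines[0]
    )
    # Structure names may contain commas or spaces, so match lazily up to
    # the literal "Minfp Name=" label rather than splitting on whitespace.
    names = re.search(r"Maxfp Name=(.*?)\s+Minfp Name=(.*)$", lines[1])
    if numbers is None or names is None:
        raise ValueError("summary lines are not in the expected layout")
    return {
        "threshold": float(numbers.group(1)),
        "rmaxf": float(numbers.group(2)),
        "rminf": float(numbers.group(3)),
        "maxfp_name": names.group(1).strip(),
        "minfp_name": names.group(2).strip(),
    }


summary = parse_summary([
    "Threshold=1.000000 RMAXF=0.387755 RMINF=1.000000",
    "Maxfp Name=ortho,bromophenol Minfp Name=benzene",
])
print(summary["maxfp_name"], summary["rmaxf"])
```

To apply it to a real run, pass the last two non-empty lines of ascii.dat to parse_summary.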

Threshold is the value that was used for the analysis. RMAXF is the ratio of the number of bits set in the modal fingerprint to the number of bits set in the molecular fingerprint that contains the largest number of set bits (ortho,bromophenol). RMINF is the ratio of the number of bits set in the modal fingerprint to the number of bits set in the molecular fingerprint with the fewest set bits. Maxfp Name and Minfp Name are the names of the structures that have the maximum and minimum number of bits set in their molecular fingerprints across the whole dataset. The RMAXF and RMINF values provide a clue as to the number of common features contained in the modal fingerprint. In this case all of the paths in benzene are contained in the modal fingerprint, but ortho,bromophenol has significantly more features than those contained in the modal fingerprint. This is reasonable, since the modal fingerprint for this analysis is just the molecular fingerprint for benzene. To visualize the results from this analysis, the sample.stig file must be loaded into xvstigmata. See Running Xvstigmata for help in running xvstigmata. The results from this run are:
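
Apart from the xvstigmata display, the scores and ratios described above can be illustrated with a small Python sketch. It is not Stigmata's implementation and rests on several assumptions: fingerprints are modelled as sets of bit positions, a bit is taken to belong to the modal fingerprint when the fraction of structures setting it is at least the threshold, and MODP is normalized by the number of bits in the molecular fingerprint (the text only fixes its two endpoints). All function names and the bit positions in the toy dataset are made up for illustration.

```python
from collections import Counter


def modal_fingerprint(fingerprints, threshold=0.5):
    """Bits set in at least `threshold` of the structures (assumed rule)."""
    n = len(fingerprints)
    counts = Counter(bit for fp in fingerprints.values() for bit in fp)
    return {bit for bit, count in counts.items() if count / n >= threshold}


def modp(mol_fp, modal_fp):
    """Fraction of the molecule's bits also set in the modal (assumed form)."""
    if not mol_fp:
        return 0.0
    return len(mol_fp & modal_fp) / len(mol_fp)


def msim(mol_fp, modal_fp):
    """Tanimoto coefficient between molecular and modal fingerprints."""
    union = mol_fp | modal_fp
    if not union:
        return 0.0
    return len(mol_fp & modal_fp) / len(union)


def size_ratios(fingerprints, modal_fp):
    """RMAXF and RMINF as described in the text (fingerprints non-empty)."""
    sizes = {name: len(fp) for name, fp in fingerprints.items()}
    max_name = max(sizes, key=sizes.get)
    min_name = min(sizes, key=sizes.get)
    return {
        "RMAXF": len(modal_fp) / sizes[max_name],
        "RMINF": len(modal_fp) / sizes[min_name],
        "Maxfp Name": max_name,
        "Minfp Name": min_name,
    }


# Toy dataset with invented bit positions, not real fingerprints.
toy = {
    "benzene": {0, 1, 2},
    "phenol": {0, 1, 2, 3, 4},
    "ortho,bromophenol": {0, 1, 2, 3, 4, 5, 6, 7},
}
modal = modal_fingerprint(toy, threshold=1.0)  # only bits common to all
print(size_ratios(toy, modal))
for name, fp in toy.items():
    print(name, modp(fp, modal), msim(fp, modal))
```

With a threshold of 1.0 the toy modal fingerprint equals the benzene fingerprint, so RMINF is 1.0 and RMAXF is 3/8 = 0.375, mirroring the pattern in the sample.smi summary without reproducing its actual bit counts.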

The modal fingerprint can be exported from the previous analysis and used as a query against a second set of structures or a database, which must be contained in the file "second.smi". The command is:

stigmata -t 1.0 -m sample.smi >sample2.stig

The hits from the modal search derived from sample.smi will appear in sample2.stig and can be visualized using xvstigmata. The results would look like:

Running Xvstigmata
