Daylight Summer School 2002, June 5-7, Santa Fe, NM

Daylight Worksheet - Modify cansmi for reaction functions. -- WITH HINTS

We'll start reaction toolkit programming with cansmi.c. It should be familiar, and it illustrates one of the main ideas for reaction toolkit programming.

Reaction basics

  1. Build cansmi and run it. Try entering reactions. Note that if the reaction toolkit license is available, cansmi will successfully canonicalize reactions.

    % cansmi
    [OH2]>>OCC
    O>>CCO

  2. Modify cansmi.c. Use the function dt_type() to determine whether a molecule or reaction was entered. Print out the result. Compile and test.

    Add after the SMILES is interpreted by dt_smilin():

       {
          int type;
          type = dt_type(mol);
          if (type == TYP_MOLECULE) printf("TYP_MOLECULE\n");
          else if (type == TYP_REACTION) printf("TYP_REACTION\n");
       }
    

  3. If a reaction was input, use dt_count() to determine the number of molecules in the reaction and print out the result. Compile and test.

    Modify the above to:

       {
          int type,count;
          type = dt_type(mol);
          if (type == TYP_MOLECULE) printf("TYP_MOLECULE\n");
          else if (type == TYP_REACTION) {
            count = dt_count(mol,TYP_MOLECULE);
            printf("TYP_REACTION containing %d molecules\n",count);
          }
       }
    

  4. Use dt_stream() to get each molecule from the reaction, and print the canonical SMILES of each individual molecule. Identify the role of each molecule in the reaction using dt_getrole(). Compile and test.

    Modify the above to:

       {
          int type,count,role,len;
          dt_Handle mols,mol2;
          char *smi;
          type = dt_type(mol);
          if (type == TYP_MOLECULE) printf("TYP_MOLECULE\n");
          else if (type == TYP_REACTION) {
             count = dt_count(mol,TYP_MOLECULE);
             printf("TYP_REACTION containing %d molecules\n",count);
             mols = dt_stream(mol,TYP_MOLECULE);
             while (NULL_OB!=(mol2=dt_next(mols))) {
                role = dt_getrole(mol2,mol);
                smi = dt_cansmiles(&len,mol2,1);
                printf("\t%.*s (",len,smi);
                if      (role == DX_ROLE_REACTANT) printf("DX_ROLE_REACTANT)\n");
                else if (role == DX_ROLE_AGENT)    printf("DX_ROLE_AGENT)\n");
                else if (role == DX_ROLE_PRODUCT)  printf("DX_ROLE_PRODUCT)\n");
                else printf("unknown role = %d)\n",role);
             }
          }
       }
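
A toolkit-free cross-check can help when testing the role output above. The sketch below splits a reaction SMILES of the form reactants>agents>products on '>' and counts the dot-separated components in each part. The names RoleCounts and count_by_role are hypothetical and are not part of the Daylight toolkit. It is an assumption here that each dot-separated component corresponds to one molecule reported by dt_stream(); ring closures that join components across a dot are not handled.

```c
/* Per-role component counts for one reaction SMILES (hypothetical type). */
typedef struct {
    int reactants;
    int agents;
    int products;
} RoleCounts;

/*
 * Split "reactants>agents>products" on '>' and count the non-empty,
 * dot-separated components in each part.
 * Returns 1 and fills *out if the string contains exactly two '>'
 * separators (a reaction); returns 0 and leaves *out untouched otherwise.
 * Hypothetical helper: it does not call the Daylight toolkit.
 */
int count_by_role(const char *smi, RoleCounts *out)
{
    int counts[3] = {0, 0, 0};
    int part = 0;          /* 0 = reactants, 1 = agents, 2 = products */
    int in_component = 0;  /* nonzero while inside a component */
    const char *p;

    for (p = smi; *p != '\0'; p++) {
        if (*p == '>') {
            if (++part > 2) return 0;   /* too many separators */
            in_component = 0;
        } else if (*p == '.') {
            in_component = 0;           /* next character starts a new component */
        } else if (!in_component) {
            in_component = 1;
            counts[part]++;
        }
    }
    if (part != 2) return 0;            /* a plain molecule, or malformed */

    out->reactants = counts[0];
    out->agents    = counts[1];
    out->products  = counts[2];
    return 1;
}
```

Calling count_by_role("[OH2]>>OCC", &rc) fills rc with one reactant, no agents, and one product, which can be compared by eye with the roles printed by the modified cansmi.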


Atoms and Bonds in reactions

  1. Continuing with cansmi.c, let's look at how atoms and bonds behave for reactions. Use the function dt_count() to determine the number of atoms and bonds for each SMILES (reaction or molecule) entered. Compile and test.

    Modify the above to:

       {
          int type,count,role,len,natoms,nbonds;
          dt_Handle mols,mol2;
          char *smi;
          natoms = dt_count(mol,TYP_ATOM);
          nbonds = dt_count(mol,TYP_BOND);
          type = dt_type(mol);
          if (type == TYP_MOLECULE) {
             printf("TYP_MOLECULE; natoms=%d; nbonds=%d\n",natoms,nbonds);
          } else if (type == TYP_REACTION) {
             count = dt_count(mol,TYP_MOLECULE);
             printf("TYP_REACTION containing %d molecules\n",count);
             mols = dt_stream(mol,TYP_MOLECULE);
             while (NULL_OB!=(mol2=dt_next(mols))) {
                role = dt_getrole(mol2,mol);
                smi = dt_cansmiles(&len,mol2,1);
                printf("\t%.*s (",len,smi);
                if      (role == DX_ROLE_REACTANT) printf("DX_ROLE_REACTANT)\n");
                else if (role == DX_ROLE_AGENT)    printf("DX_ROLE_AGENT)\n");
                else if (role == DX_ROLE_PRODUCT)  printf("DX_ROLE_PRODUCT)\n");
                else printf("unknown role = %d)\n",role);
             }
          }
       }
    

  2. For reactions, use dt_stream() to get each molecule. Print out the number of atoms and bonds in each molecule. Keep a running total of atoms and bonds for each reaction and compare it to the overall count of atoms and bonds for the reaction (from step 1). Compile and test.

    Modify the above to:

       {
          int type,count,role,len,natoms,nbonds;
          int i,j,natomsr=0,nbondsr=0;
          dt_Handle mols,mol2;
          char *smi;
          natoms = dt_count(mol,TYP_ATOM);
          nbonds = dt_count(mol,TYP_BOND);
          type = dt_type(mol);
          if (type == TYP_MOLECULE) {
             printf("TYP_MOLECULE; natoms=%d; nbonds=%d\n",natoms,nbonds);
          } else if (type == TYP_REACTION) {
             count = dt_count(mol,TYP_MOLECULE);
             printf("TYP_REACTION containing %d molecules\n",count);
             mols = dt_stream(mol,TYP_MOLECULE);
             while (NULL_OB!=(mol2=dt_next(mols))) {
                role = dt_getrole(mol2,mol);
                i = dt_count(mol2,TYP_ATOM);
                j = dt_count(mol2,TYP_BOND);
                smi = dt_cansmiles(&len,mol2,1);
                printf("\t%.*s (",len,smi);
                if      (role == DX_ROLE_REACTANT) printf("DX_ROLE_REACTANT");
                else if (role == DX_ROLE_AGENT)    printf("DX_ROLE_AGENT");
                else if (role == DX_ROLE_PRODUCT)  printf("DX_ROLE_PRODUCT");
                else printf("unknown role=%d",role);
                printf(";%d atoms, %d bonds)\n",i,j);
                natomsr += i;
                nbondsr += j;
             }
             printf("\ttotal in reaction: %d atoms, %d bonds (overall count: %d atoms, %d bonds)\n",natomsr,nbondsr,natoms,nbonds);
          }
       }
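
To sanity-check the atom and bond counts printed above without the toolkit, a rough estimate can be computed directly from the SMILES string. The sketch below counts each bracket atom and each organic-subset symbol as one atom, and estimates bonds as atoms minus components plus ring closures. SmiCounts and estimate_counts are hypothetical names, not Daylight functions. Treat the result as an assumption-laden estimate: it counts explicit [H] atoms, which the toolkit may fold into hydrogen counts, and it ignores ring closures that span a dot.

```c
#include <ctype.h>
#include <string.h>

/* Estimated counts for one SMILES string (hypothetical type). */
typedef struct {
    int atoms;
    int bonds;
} SmiCounts;

/*
 * Estimate atom and bond counts for a molecule or reaction SMILES.
 * Hypothetical helper: it does not call the Daylight toolkit.
 */
SmiCounts estimate_counts(const char *smi)
{
    SmiCounts result;
    int atoms = 0, components = 0, ring_digits = 0;
    int in_component = 0;
    const char *p = smi;

    while (*p != '\0') {
        char c = *p;
        if (c == '.' || c == '>') {        /* component or role separator */
            in_component = 0;
            p++;
            continue;
        }
        if (!in_component) {
            in_component = 1;
            components++;
        }
        if (c == '[') {                    /* bracket atom: one atom */
            atoms++;
            while (*p != '\0' && *p != ']') p++;
            if (*p == ']') p++;
        } else if (c == '%') {             /* two-digit ring closure, %nn */
            ring_digits++;
            p++;
            if (isdigit((unsigned char)*p)) p++;
            if (isdigit((unsigned char)*p)) p++;
        } else if (isdigit((unsigned char)c)) {
            ring_digits++;                 /* one-digit ring closure */
            p++;
        } else if ((c == 'C' && p[1] == 'l') || (c == 'B' && p[1] == 'r')) {
            atoms++;                       /* two-letter organic-subset atom */
            p += 2;
        } else if (strchr("BCNOPSFIbcnops*", c) != NULL) {
            atoms++;                       /* one-letter organic-subset atom */
            p++;
        } else {
            p++;                           /* bond symbols, parentheses, etc. */
        }
    }

    /* Each ring closure appears twice and contributes one extra bond. */
    result.atoms = atoms;
    result.bonds = atoms - components + ring_digits / 2;
    return result;
}
```

For the reaction [OH2]>>OCC this estimate gives 4 atoms and 2 bonds, which is the total the running sums in the modified cansmi would be expected to reach if the assumptions above hold.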
    


Daylight Chemical Information Systems Inc.
support@daylight.com