Assembling off-the-shelf Components Into Useful Applications, TJ O'Donnell, MUG2004

tdt2histo and filter_dayprop

The tdt2histo utility reads TDTs containing numerical data and produces HTML tables showing histogram bin counts of the data values. tdt2histo is a Perl function meant to be loaded with "require" from a Perl main program.
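
The binning rule is not described here. The following is a minimal Python sketch (not the Perl tdt2histo code) that assumes equal-width bins between the minimum and maximum value, with the maximum counted in a final row of its own. That assumption is consistent with the sample output further down, but it is only an assumption. The names histogram and to_html_table are hypothetical.

```python
from html import escape


def histogram(values, nbins=10):
    """Return (count, bin_start) rows for equal-width bins.

    Assumption: the maximum value is counted in an extra final row
    of its own, so nbins + 1 rows are returned.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: a single row holds everything.
        return [(len(values), lo)]
    width = (hi - lo) / nbins
    counts = [0] * (nbins + 1)
    for v in values:
        if v == hi:
            idx = nbins
        else:
            # Clamp to guard against floating-point rounding at the top edge.
            idx = min(int((v - lo) / width), nbins - 1)
        counts[idx] += 1
    return [(counts[i], lo + i * width) for i in range(nbins + 1)]


def to_html_table(title, rows):
    """Render rows as a small HTML table; empty bins get an empty cell."""
    total = sum(count for count, _ in rows)
    lines = ["<table>", f"<caption>{escape(title)} {total}</caption>"]
    for count, start in rows:
        cell = str(count) if count else ""
        lines.append(f"<tr><td>{cell}</td><td>{start:g}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = histogram([0.0, 1.0, 2.0, 5.0, 10.0])
    print(to_html_table("EXAMPLE", rows))
```

Leaving the cell empty for a zero count mirrors the blank cells in the sample output; the HTML markup itself is illustrative and is not taken from tdt2histo.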

The filter_dayprop utility converts the generic PPROP tags output by dayprop into more meaningful names, such as AVERAGE_MOL_WT. filter_dayprop is a Perl application.

The following command sequence dumps the nci00demo database, computes all the dayprops, adds meaningful tag names, and writes the result to nci00demo_props.tdt.

$DY_ROOT/bin/thorlist nci00demo | $DY_ROOT/bin/dayprop -property ALL | filter_dayprop > nci00demo_props.tdt

Sample output from tdt2histo
  Count  Bin
    585  116.20
    992  425.83
    240  735.46
     51  1045.09
     13  1354.72
      5  1664.35
      1  1973.98
         2283.61
         2593.24
         2902.87
      1  3212.5

AVERAGE_MOL_WT 1937

  Count  Bin
    468  45.10
   1055  184.191
    310  323.282
     78  462.373
     17  601.464
      6  740.555
      2  879.646
         1018.737
         1157.828
         1296.919
      1  1436.01

HDONOR_COUNT 1937

  Count  Bin
   1273  0
    501  2
    122  4
     38  6
      2  8
         10
      1  12

RIGIDITY 1937

  Count  Bin
     28  0.1562
    130  0.24058
    266  0.32496
    327  0.40934
    352  0.49372
    257  0.5781
    193  0.66248
    113  0.74686
     42  0.83124
     20  0.91562
    209  1

FRAGMENT_COUNT 1937

  Count  Bin
   1069  0
    758  3
     93  6
     10  9
      5  12
         15
         18
      1  21
      1  24

PART_COUNT 1970

  Count  Bin
   1937  1
     33  2

FLEXIBILITY 1937

  Count  Bin
    469  0.00
    530  0.095
    364  0.19
    194  0.285
    147  0.38
     93  0.475
     53  0.57
     44  0.665
     30  0.76
     12  0.855
      1  0.95

ACCURATE_MASS 1937

  Count  Bin
    470  45.057850
   1054  184.0325215
    309  323.007193
     78  461.9818645
     17  600.956536
      6  739.9312075
      2  878.905879
         1017.8805505
         1156.855222
         1295.8298935
      1  1434.804565

RING_COUNT 1937

  Count  Bin
    292  0
    630  1
    550  2
    274  3
    127  4
     41  5
     17  6
      6  7

ROTBOND_COUNT 1937

  Count  Bin
   1677  0
    204  8
     41  16
     10  24
      2  32
      1  40
      1  48
         56
         64
      1  72

MOLAR_VOLUME 1888

  Count  Bin
    651  56.00
    952  212.1
    221  368.2
     45  524.3
     15  680.4
      2  836.5
         992.6
      1  1148.7
         1304.8
         1460.9
      1  1617

POLAR_SURFACE_AREA 1937

  Count  Bin
    839  0.00
    760  42.044
    246  84.088
     66  126.132
     15  168.176
      3  210.22
      6  252.264
      1  294.308
         336.352
         378.396
      1  420.44

STEREOCENTER_COUNT 1937

  Count  Bin
   1741  0
    126  3
     39  6
     26  9
      2  12
      1  15
      1  18
      1  21

HACCEPTOR_COUNT 1937

  Count  Bin
   1021  0
    753  6
    126  12
     24  18
      6  24
      5  30
      1  36
         42
      1  48

ATOM_COUNT 1937

  Count  Bin
    489  3
   1061  13
    307  23
     61  33
     11  43
      6  53
      1  63
         73
         83
      1  93

1971
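
The number following each tag name appears to be the total of the counts in that table. A short Python check on two of the tables, with values copied from the sample output above; reading the layout as count/bin pairs is an assumption, and the helper name total is hypothetical.

```python
# (count, bin) rows copied from the RING_COUNT and PART_COUNT sample output.
ring_count = [(292, 0), (630, 1), (550, 2), (274, 3),
              (127, 4), (41, 5), (17, 6), (6, 7)]
part_count = [(1937, 1), (33, 2)]


def total(rows):
    """Sum the counts over all rows of one histogram."""
    return sum(count for count, _ in rows)


if __name__ == "__main__":
    print("RING_COUNT", total(ring_count))  # 1937
    print("PART_COUNT", total(part_count))  # 1970
```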