Dealing with Sequences

Efficiently finding similar proteins by sequence similarity requires specialized algorithms that operate on the sequences as strings. Even so, there are several interesting parallels between the methods used for small-molecule substructure searching and those used for sequence similarity searching:

                    Substructure searching   Sequence similarity
Topology            2D graphs                1D strings
Screening           Fingerprints             Local identities
Similarity measure  Tanimoto, etc.           Scoring matrices
Matching            Graph matching           Dynamic programming
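
To make the parallels concrete, here is a minimal Python sketch of three entries from the comparison: a Tanimoto coefficient on fingerprints, a word-based screen for local identities, and a dynamic-programming local alignment score. Everything in it is an illustrative assumption rather than a description of any particular tool: the function names are hypothetical, fingerprints are represented as sets of on-bit indices, the word length of 3 is arbitrary, and the match/mismatch/gap values are a toy stand-in for a real scoring matrix such as BLOSUM62 or PAM250.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def shared_words(seq_a, seq_b, k=3):
    """Screening step: words of length k that occur in both sequences."""
    words_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    words_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return words_a & words_b


def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman recurrence, linear gap cost).

    The match/mismatch values are a toy substitute for a scoring matrix.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    cols = len(seq_b) + 1
    prev = [0] * cols
    best = 0
    for a in seq_a:
        curr = [0] * cols
        for j in range(1, cols):
            pair = match if a == seq_b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + pair,  # align the two residues
                prev[j] + gap,       # gap in seq_b
                curr[j - 1] + gap,   # gap in seq_a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(tanimoto({1, 2, 3}, {2, 3, 4}))                # 0.5
    print(sorted(shared_words("XXACGTXX", "YYACGTYY")))  # ['ACG', 'CGT']
    print(local_alignment_score("XXACGTXX", "YYACGTYY")) # 8
```

In both domains the cheap screening step (fingerprint comparison or shared words) is used to discard most candidates, so that the expensive matching step (graph matching or dynamic programming) runs only on the survivors.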