Dealing with Sequences

Efficiently finding similar proteins by sequence similarity requires specialized algorithms that operate on the sequence (string) itself. There are, however, several interesting parallels between the methods used for small-molecule substructure searching and those used for sequence similarity searching:

	Substructure searching	Sequence similarity
Topology	2D graphs	1D strings
Screening	Fingerprints	Local identities
Similarity measure	Tanimoto, etc.	Scoring matrices
Matching	Graph matching	Dynamic programming
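
The parallels in the table can be made concrete with a small sketch. The Python below is illustrative only and makes several assumptions that are not in the text above: fingerprints are represented as sets of on-bit indices, "local identities" are interpreted as shared k-mers (short exact word matches), the scoring matrix is reduced to flat match/mismatch/gap values rather than a real matrix such as BLOSUM62, and dynamic programming is shown as a Smith-Waterman-style local alignment score. All function names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def shared_kmers(seq_a, seq_b, k=3):
    """Screening step (assumed interpretation): exact k-mers present in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score by dynamic programming (Smith-Waterman style).

    The flat match/mismatch/gap values stand in for a real scoring matrix.
    Only two rows of the DP table are kept, since just the score is needed.
    """
    prev = [0] * (len(seq_b) + 1)
    best = 0
    for a in seq_a:
        curr = [0]
        for j, b in enumerate(seq_b, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Small-molecule side: compare two toy fingerprints.
    print(tanimoto({1, 2, 3}, {2, 3, 4}))

    # Sequence side: cheap screen first, then the more expensive DP match.
    query, target = "GGACGTCC", "TTACGTAA"
    if shared_kmers(query, target):
        print(local_alignment_score(query, target))
```

As with fingerprint screening before graph matching, the k-mer check is a cheap filter that decides whether the more expensive dynamic programming step is worth running at all.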