Distance Based Indexing for String Proximity Search (2003)
Abstract
In many database applications involving string data, it is common to have near neighbor queries (asking for strings that are similar to a query string) or nearest neighbor queries (asking for strings that are most similar to a query string). The similarity between strings is defined in terms of a distance function determined by the application domain. The most popular string distance measures are based on a (weighted) count of the (i) character edit or (ii) block edit operations needed to transform one string into the other. Examples include the Levenshtein edit distance and the recently introduced compression distance.
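To make the two query types concrete, here is a minimal Python sketch that answers them by brute-force linear scan under unit-cost Levenshtein distance. It is only an illustration of the definitions, not the indexing method of the paper; the function names `levenshtein`, `near_neighbors`, and `nearest_neighbors`, and the `radius` parameter, are hypothetical names chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost character edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the DP row short
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def near_neighbors(query: str, strings: list[str], radius: int) -> list[str]:
    """Near neighbor query: every string within `radius` of the query."""
    return [s for s in strings if levenshtein(query, s) <= radius]


def nearest_neighbors(query: str, strings: list[str]) -> list[str]:
    """Nearest neighbor query: the strings at minimum distance (ties kept)."""
    if not strings:
        return []
    dists = [levenshtein(query, s) for s in strings]
    best = min(dists)
    return [s for s, d in zip(strings, dists) if d == best]
```

Both queries here compute the distance to every stored string, which is the cost a distance-based index is meant to reduce.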
Publication details