Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering (2008)
Abstract
A distance-mapping algorithm takes a set of objects and a distance metric and maps those objects to a Euclidean or pseudo-Euclidean space in such a way that the distances among objects are approximately preserved. Distance-mapping algorithms are a useful tool for clustering and visualization in data-intensive applications, because they replace expensive distance calculations with sum-of-squares calculations. This can make clustering practical in large databases with expensive distance metrics. In this paper we present five distance-mapping algorithms and conduct experiments to compare their performance in data clustering applications. These include two algorithms called FastMap and MetricMap, and three hybrid heuristics that combine the two algorithms in different ways. Experimental results on both synthetic and RNA data show the superiority of the hybrid algorithms. The results imply that FastMap and MetricMap capture complementary information about distance metrics and therefore can be used together to great benefit. The net effect is that multi-day computations may be done in minutes.
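To make the idea of distance mapping concrete, the following is a minimal sketch of a FastMap-style projection as it is commonly described: pick two far-apart pivot objects, project every object onto the line through them using only pairwise distances, then repeat on the residual distances. It is an illustration under stated assumptions, not necessarily the variant evaluated in the paper. The names `fastmap`, `embedded_dist`, and `euclid` are hypothetical, and the sketch simply stops adding axes once the residual pivot distance is no longer positive rather than handling the pseudo-Euclidean case.

```python
import math


def fastmap(objects, dist, k):
    """Map objects to k-dimensional coordinates using only dist(x, y).

    Illustrative sketch: `dist` stands in for the expensive metric.
    """
    n = len(objects)
    coords = [[0.0] * k for _ in range(n)]

    def resid_sq(i, j, dim):
        # Squared original distance minus the part already explained
        # by the first `dim` coordinates.
        d2 = dist(objects[i], objects[j]) ** 2
        for c in range(dim):
            d2 -= (coords[i][c] - coords[j][c]) ** 2
        return d2

    for dim in range(k):
        # Pivot heuristic: start anywhere, jump to the farthest object,
        # then to the object farthest from that one.
        a = 0
        b = max(range(n), key=lambda j: resid_sq(a, j, dim))
        a = max(range(n), key=lambda j: resid_sq(b, j, dim))
        dab_sq = resid_sq(a, b, dim)
        if dab_sq <= 1e-12:
            break  # nothing left to explain; remaining axes stay 0
        dab = math.sqrt(dab_sq)
        for i in range(n):
            # Cosine-law projection of object i onto the pivot line a-b.
            coords[i][dim] = (
                resid_sq(a, i, dim) + dab_sq - resid_sq(b, i, dim)
            ) / (2.0 * dab)
    return coords


def embedded_dist(u, v):
    # The cheap sum-of-squares distance used after mapping.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def euclid(p, q):
    # Stand-in for an expensive metric in this toy example.
    return math.dist(p, q)


points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0), (1.0, 1.0)]
embedding = fastmap(points, euclid, 2)
```

Once `embedding` is computed, a clustering routine can call `embedded_dist` on the mapped coordinates instead of invoking the original metric for every pair. As written, the sketch recomputes `dist` on each call; a practical version would cache or bound those calls.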