
Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering (2008)

Abstract
A distance-mapping algorithm takes a set of objects and a distance metric and then maps those objects to a Euclidean or pseudo-Euclidean space in such a way that the distances among objects are approximately preserved. Distance-mapping algorithms are a useful tool for clustering and visualization in data-intensive applications, because they replace expensive distance calculations by sum-of-square calculations. This can make clustering in large databases with expensive distance metrics practical. In this paper we present five distance-mapping algorithms and conduct experiments to compare their performance in data clustering applications. These include two algorithms called FastMap and MetricMap, and three hybrid heuristics that combine the two algorithms in different ways. Experimental results on both synthetic and RNA data show the superiority of the hybrid algorithms. The results imply that FastMap and MetricMap capture complementary information about distance metrics and therefore can be used together to great benefit. The net effect is that multi-day computations may be done in minutes.
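
To make the idea of a distance mapping concrete, the following is a minimal sketch of a FastMap-style projection as it is commonly described in the literature. It is not taken from the paper and may differ from the variant the authors evaluated. The names `fastmap`, `resid_sq`, and `pivot_iters` are illustrative assumptions, and the pivot-selection heuristic shown is only one simple choice. Each output axis is obtained by projecting every object onto the line through two far-apart pivot objects using the law of cosines.

```python
import math


def fastmap(objects, dist, k, pivot_iters=5):
    """Map objects to k-dimensional coordinates (FastMap-style sketch).

    objects: a list of arbitrary items
    dist:    a function dist(x, y) returning a non-negative distance
    k:       the number of target dimensions
    """
    n = len(objects)
    coords = [[0.0] * k for _ in range(n)]

    def resid_sq(i, j, dim):
        # Squared original distance minus the squared differences along
        # the axes computed so far. This can become negative when the
        # metric is not Euclidean.
        d2 = dist(objects[i], objects[j]) ** 2
        for c in range(dim):
            d2 -= (coords[i][c] - coords[j][c]) ** 2
        return d2

    for dim in range(k):
        # Heuristic pivot choice: repeatedly jump to the farthest object.
        a = 0
        b = 0
        for _ in range(pivot_iters):
            b = max(range(n), key=lambda j: resid_sq(a, j, dim))
            a = max(range(n), key=lambda j: resid_sq(b, j, dim))

        dab2 = resid_sq(a, b, dim)
        if dab2 <= 1e-12:
            break  # no residual distance left; remaining axes stay zero
        dab = math.sqrt(dab2)

        # Law of cosines: projection of each object onto the line a-b.
        for i in range(n):
            coords[i][dim] = (
                resid_sq(a, i, dim) + dab2 - resid_sq(b, i, dim)
            ) / (2.0 * dab)

    return coords


# Usage: plain 2-D points stand in for objects with an expensive metric.
points = [(0, 0), (3, 0), (0, 4), (3, 4), (1, 1)]
embedding = fastmap(points, math.dist, k=2)
```

Once the coordinates are computed, a clustering routine can compare two objects with a sum of squared coordinate differences instead of calling the original, possibly expensive, `dist` function again.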

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.72.4328
Source http://faculty.fullerton.edu/xwang/pub/p307-wang.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.39.5767, 10.1.1.17.2504