Winner-Take-All EM Clustering (2009)

Abstract
The EM algorithm is often used with mixture models to cluster data, but for efficiency reasons it is sometimes desirable to produce hard clusters. Several hard clustering limits of EM are known. For example, k-means clustering can be derived from EM in a Gaussian mixture model by taking the limit of all variances going to zero. We present a new method of deriving Winner-Take-All versions of EM that can be used for mixtures, such as heteroscedastic Gaussians, where it is not possible to take that limit. The resulting clusters can have non-convex boundaries, allowing some clusters to reside "inside" others, producing dense foreground clusters embedded in a more diffuse background. Experiments show that using unequal variances can give better clusters on real data sets in terms of external quality measures.
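
The idea of hard assignment with unequal variances can be illustrated with a minimal sketch. This is not the paper's derivation, which the abstract does not spell out. It is a generic hard-assignment EM loop that assumes spherical Gaussians with one variance per cluster. The function name `wta_em`, its parameters, and the choice to leave empty clusters unchanged are all assumptions made for illustration.

```python
import math


def wta_em(points, init_means, n_iter=20, min_var=1e-6):
    """Hard-assignment EM sketch for spherical Gaussians with per-cluster variance.

    points: list of equal-length tuples; init_means: list of starting centres.
    Returns (labels, means, variances, weights).
    """
    k = len(init_means)
    d = len(points[0])
    means = [list(m) for m in init_means]
    variances = [1.0] * k          # assumed starting variance
    weights = [1.0 / k] * k        # assumed uniform starting weights
    labels = [0] * len(points)

    for _ in range(n_iter):
        # Hard E-step: each point goes to the single component with the
        # highest weighted log-density (the "winner").
        for i, x in enumerate(points):
            best, best_score = 0, -math.inf
            for j in range(k):
                sq = sum((a - b) ** 2 for a, b in zip(x, means[j]))
                score = (math.log(weights[j])
                         - 0.5 * d * math.log(2 * math.pi * variances[j])
                         - sq / (2 * variances[j]))
                if score > best_score:
                    best, best_score = j, score
            labels[i] = best

        # M-step: re-estimate each component from its own members only.
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if not members:
                continue  # assumption: an empty cluster keeps its old parameters
            n = len(members)
            means[j] = [sum(col) / n for col in zip(*members)]
            sq = sum(sum((a - b) ** 2 for a, b in zip(x, means[j]))
                     for x in members)
            variances[j] = max(sq / (n * d), min_var)  # floor avoids zero variance
            weights[j] = n / len(points)

    return labels, means, variances, weights
```

Because the log-density includes the per-cluster variance term, the decision boundary between two components is no longer a hyperplane as in k-means. A low-variance component can win only close to its mean, while a high-variance component wins everywhere farther out, which is one way a dense cluster can end up surrounded by a diffuse one.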

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.146.5184
Source http://www.cis.upenn.edu/~kandylas/papers/nescai07.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.133.4884, 10.1.1.18.2720, 10.1.1.33.2557, 10.1.1.38.4937, 10.1.1.12.309, 10.1.1.6.2778, 10.1.1.48.3989, 10.1.1.29.2482, 10.1.1.1.2064, 10.1.1.140.3309, 10.1.1.2.5895, 10.1.1.98.2001