Winner-Take-All EM Clustering (2009)
Abstract
The EM algorithm is often used with mixture models to cluster data, but for efficiency reasons it is sometimes desirable to produce hard clusters. Several hard clustering limits of EM are known. For example, k-means clustering can be derived from EM in a Gaussian mixture model by taking the limit of all variances going to zero. We present a new method of deriving Winner-Take-All versions of EM that can be used for mixtures, such as heteroscedastic Gaussians, where it is not possible to take that limit. The resulting clusters can have non-convex boundaries, allowing for some of the clusters to reside “inside” others, producing dense foreground clusters embedded in a more diffuse background. Experiments show that using unequal variances can give better clusters on real data sets in terms of external quality measures.
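The two effects mentioned in the abstract can be illustrated with a small one-dimensional sketch. This is not the paper's derivation: it is a generic hard-assignment (argmax) E-step applied to a Gaussian mixture, and the function names `log_gauss`, `responsibilities`, and `hard_assign`, along with all parameter values, are assumptions chosen for illustration.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def responsibilities(x, means, variances, weights):
    """Soft E-step: posterior probability of each component for point x."""
    logs = [math.log(w) + log_gauss(x, m, v)
            for m, v, w in zip(means, variances, weights)]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def hard_assign(x, means, variances, weights):
    """Winner-take-all E-step: index of the most probable component."""
    r = responsibilities(x, means, variances, weights)
    return max(range(len(r)), key=r.__getitem__)

# Shared variance shrinking toward zero: the soft assignment hardens
# onto the nearest mean, which is the k-means assignment rule.
for var in (1.0, 0.1, 0.01):
    print(var, responsibilities(0.4, [0.0, 1.0], [var, var], [0.5, 0.5]))

# Unequal variances (illustrative values): a tight component (index 0)
# sits inside a diffuse one (index 1) with the same mean.
labels = [hard_assign(x, [0.0, 0.0], [0.25, 25.0], [0.5, 0.5])
          for x in (-6.0, -0.2, 0.0, 0.2, 6.0)]
print(labels)
```

In the second experiment the diffuse component wins on both sides of the tight one, so its region is not convex. This is the one-dimensional analogue of a dense foreground cluster embedded in a diffuse background.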