Publication View

Multivariate Spatio-Temporal Clustering of Time-Series Data: An Approach for Diagnosing Cloud Properties and Understanding ARM Site Representativeness (2008)

Abstract
A multivariate statistical clustering technique, based on the iterative k-means algorithm of Hartigan (1975), has been used to extract patterns of climatological significance from 200 years of general circulation model (GCM) output. Originally developed and implemented on a Beowulf-style parallel computer constructed by Hoffman and Hargrove from surplus commodity desktop PCs (Hargrove et al. 2001), the high-performance parallel clustering algorithm (Hoffman and Hargrove 1999) was previously applied to the derivation of ecoregions from map stacks of 9 and 25 geophysical conditions or variables for the conterminous U.S. at a resolution of 1 sq km (Hargrove and Hoffman 1999). Figure 1 describes this application of the k-means approach to Multivariate Geographic Clustering (MGC). The left side of Figure 1 represents geographic space, while the right side shows the same map cells, or observations, in a multi-dimensional data space. The N characteristics of each map cell on the left serve as the N coordinates of that observation in data space on the right. In Figure 1, N is 3: temperature, organic matter, and rainfall. The iterative clustering algorithm has no information about the geographic coordinates of each observation; it finds k groups of observations based solely on their proximity, measured by simple Euclidean distance, in data space. Reassembling the map cells in geographic space
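
The data-space clustering described in the abstract can be illustrated with a minimal sketch. It assumes a plain Lloyd-style k-means iteration, not Hartigan's variant or the authors' parallel implementation. The function name `kmeans`, the `cells` dictionary, and all numeric values are hypothetical and chosen only for illustration; N is 3 (temperature, organic matter, rainfall), as in Figure 1.

```python
import math
import random


def kmeans(observations, k, max_iter=100, seed=0):
    """Group observations (tuples of N values) into k clusters in data space.

    Illustrative Lloyd-style iteration; not the authors' parallel code.
    """
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = rng.sample(observations, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each observation to the nearest centroid (Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda c: math.dist(obs, centroids[c]))
            for obs in observations
        ]
        if new_assignment == assignment:
            break  # no observation changed cluster: converged
        assignment = new_assignment
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [o for o, a in zip(observations, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return assignment, centroids


# Hypothetical map cells: (row, col) -> (temperature, organic matter, rainfall)
cells = {
    (0, 0): (25.0, 2.0, 300.0),
    (0, 1): (24.0, 2.2, 320.0),
    (1, 0): (10.0, 8.0, 1200.0),
    (1, 1): (11.0, 7.5, 1150.0),
}

coords = list(cells)
# Only the N characteristics are passed in; geographic coordinates are withheld.
labels, _ = kmeans([cells[c] for c in coords], k=2)
# Reassemble the cluster labels in geographic space.
cluster_map = dict(zip(coords, labels))
```

In this sketch the geographic coordinates are used only to map the resulting labels back onto cells; they never enter the distance computation.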

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.113.9833
Source http://www.arm.gov/publications/proceedings/conf13/extended_abs/hoffman-fm.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.41.9314