9.4 USING CLUSTERED CLIMATE REGIMES FOR UNDERSTANDING WATER CYCLE VARIABILITY (2008)
Abstract
A multivariate statistical clustering technique, based on the iterative k-means algorithm of Hartigan (Hartigan, 1975), has been used to extract patterns of climatological significance from 200 years of general circulation model (GCM) output. Originally developed and implemented on a Beowulf-style parallel computer constructed by Hoffman and Hargrove from surplus commodity desktop PCs (Hargrove et al., 2001), the high-performance parallel clustering algorithm (Hoffman and Hargrove, 1999) was previously applied to the derivation of ecoregions from map stacks of 9 and 25 geophysical conditions or variables for the conterminous U.S. at a resolution of 1 sq km (Hargrove and Hoffman, 1999). Figure 1 describes this application of the k-means approach to Multivariate Geographic Clustering (MGC). The left side of Figure 1 represents geographic space, while the right side illustrates the same map cells or observations in a multi-dimensional data space. The N characteristics of each map cell on the left are used as the N coordinates for that observation in data space on the right. In Figure 1, N is 3: temperature, organic matter, and rainfall. Having no information about the geographic coordinates of each observation, the iterative clustering algorithm finds k groups of observations based on their proximity, by simple Euclidean distance, in data space. Reassembling the map cells in geographic space and coloring them according to their cluster assignment yields a new map showing regions of approximately equal variance with respect to the N charac-
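The MGC workflow described above (map cells to data space, clustering by Euclidean distance, labels back to geographic space) can be illustrated with a minimal serial sketch. This is not the authors' parallel implementation, and it uses a basic Lloyd-style k-means loop rather than Hartigan's exact procedure. The toy 2 x 4 grid, its pre-scaled values, and the names `kmeans`, `grid`, and `cluster_map` are hypothetical and chosen only for illustration.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Basic Lloyd-style k-means using Euclidean distance in data space.

    Illustrative only: a stand-in for the parallel algorithm cited in the
    abstract, not a reproduction of it.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct observations
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Hypothetical map: each cell holds N = 3 values, assumed already scaled to 0..1:
# (temperature, organic matter, rainfall).
grid = [
    [(0.10, 0.80, 0.90), (0.20, 0.70, 0.80), (0.90, 0.10, 0.20), (0.80, 0.20, 0.10)],
    [(0.15, 0.75, 0.85), (0.10, 0.90, 0.95), (0.85, 0.15, 0.10), (0.95, 0.10, 0.15)],
]

# Geographic space -> data space: flatten cells, discarding their coordinates.
points = [cell for row in grid for cell in row]
labels, centroids = kmeans(points, k=2)

# Data space -> geographic space: reassemble labels into the original grid shape.
n_cols = len(grid[0])
cluster_map = [labels[i:i + n_cols] for i in range(0, len(labels), n_cols)]

for row in cluster_map:
    print(row)
```

In this toy case the cool, wet cells on the left and the hot, dry cells on the right fall into separate clusters, even though the algorithm never sees a cell's position. Real inputs with mixed units would need to be standardized first so that no single variable dominates the distance; that step is assumed away here.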