pygeoda.schc

pygeoda.schc(k, w, data, linkage_method, **kwargs)

Spatially Constrained Hierarchical Clustering (SCHC)

Spatially constrained hierarchical clustering is a special form of constrained clustering in which the constraint is contiguity (common borders). The method builds clusters bottom-up with agglomerative hierarchical clustering, using single linkage, complete linkage, average linkage, or Ward’s method (a special form of centroid linkage). Spatial contiguity is maintained whenever two clusters are merged, so every resulting cluster is spatially connected.
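The merge rule described above can be illustrated with a minimal pure-Python sketch. It is not pygeoda's implementation: the function name `schc_sketch`, the neighbor dictionary, and the restriction to single linkage on one variable are all assumptions made for illustration.

```python
from itertools import combinations


def schc_sketch(values, neighbors, k):
    """Toy contiguity-constrained single-linkage clustering (illustrative only).

    values    -- list of floats, one per observation
    neighbors -- dict mapping each observation index to the set of
                 indices it shares a border with (hypothetical input format)
    k         -- number of clusters to stop at
    """
    # Start with every observation in its own cluster.
    clusters = {i: {i} for i in range(len(values))}

    while len(clusters) > k:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Contiguity constraint: skip pairs that share no border.
            if not any(neighbors[i] & clusters[b] for i in clusters[a]):
                continue
            # Single linkage: distance between the two closest members.
            d = min(abs(values[i] - values[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        if best is None:
            break  # no contiguous pair left to merge
        _, a, b = best
        clusters[a] |= clusters.pop(b)

    # Label clusters 1..k in order of their smallest member index
    # (an arbitrary choice made for this sketch).
    labels = [0] * len(values)
    for label, members in enumerate(sorted(clusters.values(), key=min), start=1):
        for i in members:
            labels[i] = label
    return labels


# Six observations arranged in a chain: 0-1-2-3-4-5.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
values = [1.0, 1.1, 5.0, 5.2, 1.05, 1.2]
print(schc_sketch(values, chain, 3))  # [1, 1, 2, 2, 3, 3]
```

Observations 0, 1, 4 and 5 have similar values, but they end up in two separate clusters because observations 2 and 3 lie between them in the chain. An unconstrained clustering would have grouped all four together.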

Parameters
  • k (int) – number of clusters

  • w (Weight) – An instance of the Weight class

  • data (tuple) – A list of numeric vectors, one per selected variable

  • linkage_method (str) – The method of agglomerative hierarchical clustering: {“single”, “complete”, “average”, “ward”}. Defaults to “ward”.

  • bound_variable (tuple, optional) – A numeric vector of the selected bounding variable

  • min_bound (float, optional) – A minimum value that the sum of the bounding variable in each cluster should be greater than

  • scale_method (str, optional) – One of the scaling methods {‘raw’, ‘standardize’, ‘demean’, ‘mad’, ‘range_standardize’, ‘range_adjust’} to apply to the input data. Defaults to ‘standardize’ (Z-score normalization).

  • distance_method (str, optional) – {“euclidean”, “manhattan”} The distance method used to compute the distance between observations i and j. Defaults to “euclidean”.
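To make the bound_variable / min_bound pair concrete, the sketch below checks whether a given cluster assignment satisfies a minimum bound. The helper `check_min_bound` is hypothetical and not part of pygeoda. It follows the wording above literally, so a cluster sum must be strictly greater than min_bound; how pygeoda treats a sum exactly equal to the bound is not stated here.

```python
def check_min_bound(labels, bound_variable, min_bound):
    """Sum the bounding variable per cluster and test it against min_bound.

    Hypothetical helper for illustration; not a pygeoda function.
    Returns (sums, ok) where sums maps cluster label -> summed value and
    ok maps cluster label -> True if the sum is greater than min_bound.
    """
    sums = {}
    for label, value in zip(labels, bound_variable):
        sums[label] = sums.get(label, 0.0) + value
    ok = {label: total > min_bound for label, total in sums.items()}
    return sums, ok


# Example: population per observation, each cluster should exceed 200.
labels = [1, 1, 2, 2, 3, 3]
population = [100, 150, 80, 90, 300, 20]
sums, ok = check_min_bound(labels, population, 200)
print(sums)  # {1: 250.0, 2: 170.0, 3: 320.0}
print(ok)    # {1: True, 2: False, 3: True}
```

In this example cluster 2 sums to 170 and would violate a min_bound of 200.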

Returns

A dict with keys {“Clusters”, “TotalSS”, “Within-clusterSS”, “TotalWithin-clusterSS”, “Ratio”}

Return type

dict
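The sum-of-squares keys in the returned dict can be illustrated with a small pure-Python sketch. The function `cluster_fit` is hypothetical and only mirrors the documented key names. It assumes that “Ratio” is the between-cluster sum of squares divided by the total sum of squares, and it works on the values as given, without applying any scale_method.

```python
def cluster_fit(columns, labels):
    """Compute sum-of-squares summaries for a cluster assignment.

    columns -- list of numeric vectors, one per variable
    labels  -- cluster label for each observation
    Illustrative only; the value shapes are assumptions, not pygeoda's.
    """
    n = len(labels)
    total_ss = 0.0
    within = {}
    for col in columns:
        # Total sum of squares: squared deviations from the overall mean.
        mean = sum(col) / n
        total_ss += sum((x - mean) ** 2 for x in col)
        # Within-cluster sum of squares: deviations from each cluster mean.
        for lab in set(labels):
            members = [x for x, l in zip(col, labels) if l == lab]
            m = sum(members) / len(members)
            within[lab] = within.get(lab, 0.0) + sum((x - m) ** 2 for x in members)
    total_within = sum(within.values())
    return {
        "TotalSS": total_ss,
        "Within-clusterSS": [within[lab] for lab in sorted(within)],
        "TotalWithin-clusterSS": total_within,
        # Assumed definition: between-cluster SS over total SS.
        "Ratio": (total_ss - total_within) / total_ss,
    }


result = cluster_fit([[1.0, 3.0, 10.0, 12.0]], [1, 1, 2, 2])
print(result["TotalSS"])                # 85.0
print(result["Within-clusterSS"])       # [2.0, 2.0]
print(result["TotalWithin-clusterSS"])  # 4.0
```

Under this assumed definition, a ratio closer to 1 means that more of the total variation lies between clusters than within them.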