pygeoda.azp_greedy
- pygeoda.azp_greedy(p, w, data, **kwargs)
A greedy algorithm to solve the AZP problem
Note
The automatic zoning procedure (AZP) was initially outlined in Openshaw (1977) as a way to address some of the consequences of the modifiable areal unit problem (MAUP). In essence, it is a heuristic that searches for the best grouping of contiguous spatial units into p regions, using the within-region sum of squares as the homogeneity criterion to minimize. The number of regions must be specified beforehand.
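To make the homogeneity criterion concrete, the following is a minimal sketch of a within-region sum of squares computed for a given assignment of observations to regions. The function name `within_sum_of_squares` and the toy data are illustrative assumptions, not part of pygeoda, and the sketch ignores the contiguity constraint that AZP enforces.

```python
from collections import defaultdict


def within_sum_of_squares(rows, labels):
    """Sum of squared deviations of each observation from its region's mean.

    rows   -- one tuple of variable values per observation (hypothetical layout)
    labels -- the region id assigned to each observation
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    total = 0.0
    for members in groups.values():
        n_vars = len(members[0])
        # Per-variable mean within this region.
        means = [sum(m[j] for m in members) / len(members) for j in range(n_vars)]
        total += sum((m[j] - means[j]) ** 2 for m in members for j in range(n_vars))
    return total


# Four observations, two variables each (made-up values).
rows = [(1.0, 2.0), (2.0, 2.0), (9.0, 8.0), (10.0, 8.0)]
tight = within_sum_of_squares(rows, [0, 0, 1, 1])  # similar units grouped together
loose = within_sum_of_squares(rows, [0, 1, 0, 1])  # dissimilar units grouped together
```

A heuristic such as AZP would prefer the first assignment, since it yields the smaller criterion value.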
- Parameters
p (int) – The number of spatially constrained clusters
w (Weight) – An instance of the Weight class
data (tuple) – A list of numeric vectors of the selected variables
bound_variable (tuple, optional) – A numeric vector of the selected bounding variable
min_bound (float, optional) – The minimum value that the sum of the bounding variable in each cluster must exceed
inits (int, optional) – The number of construction re-runs, used for ARiSeL (“automatic regionalization with initial seed location”)
init_regions (tuple, optional) – The initial regions that the local search starts with. Default is empty, which means the local search starts with a random process to “grow” clusters
scale_method (str, optional) – One of the scaling methods {‘raw’, ‘standardize’, ‘demean’, ‘mad’, ‘range_standardize’, ‘range_adjust’} to apply to the input data. Default is ‘standardize’ (Z-score normalization).
distance_method (str, optional) – The distance method used to compute the distance between observations i and j. Options are “euclidean” and “manhattan”. Defaults to “euclidean”
random_seed (int, optional) – The seed for the random number generator. Defaults to 123456789, the same default as in the GeoDa software
cpu_threads (int, optional) – The number of cpu threads used for parallel computation
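The sketch below illustrates, in plain Python, what three of the options above describe: Z-score scaling, the two distance methods, and the minimum-bound condition on clusters. All function names are hypothetical helpers written for this illustration and are not pygeoda functions. The use of the sample standard deviation is an assumption, since the denominator used by the library is not stated here.

```python
import math
import statistics


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the standard deviation.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def distance(a, b, method="euclidean"):
    """Distance between two observations under the named method."""
    if method == "euclidean":
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if method == "manhattan":
        return sum(abs(x - y) for x, y in zip(a, b))
    raise ValueError(f"unknown distance method: {method}")


def satisfies_min_bound(labels, bound_values, min_bound):
    """True if the bounding variable summed within every cluster exceeds min_bound."""
    totals = {}
    for label, value in zip(labels, bound_values):
        totals[label] = totals.get(label, 0.0) + value
    return all(total > min_bound for total in totals.values())


scaled = standardize([1.0, 2.0, 3.0])
d_euclid = distance((0.0, 0.0), (3.0, 4.0), "euclidean")
d_manhattan = distance((0.0, 0.0), (3.0, 4.0), "manhattan")
# Cluster 0 sums to 30, cluster 1 sums to 10 (made-up bounding values).
ok_low = satisfies_min_bound([0, 0, 1, 1], [10.0, 20.0, 5.0, 5.0], 9.0)
ok_high = satisfies_min_bound([0, 0, 1, 1], [10.0, 20.0, 5.0, 5.0], 15.0)
```

With these values, the assignment passes a bound of 9 but fails a bound of 15, because the second cluster only sums to 10.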
- Returns
A list of numeric vectors that represents a group of clusters
- Return type
list
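A possible way to call the function is sketched below. The wrapper `run_azp_greedy`, the shapefile path, and the variable names are hypothetical. The sketch assumes that `pygeoda.open`, `pygeoda.queen_weights`, and list-based column selection on the opened dataset behave as in recent pygeoda releases, so check them against the installed version. The result is returned unmodified because its exact structure may differ between versions.

```python
def run_azp_greedy(shapefile_path, variables, p, **kwargs):
    """Open a dataset, build queen contiguity weights, and run AZP (greedy).

    shapefile_path -- path to a spatial dataset (hypothetical input)
    variables      -- list of column names to cluster on
    p              -- number of spatially constrained clusters
    kwargs         -- forwarded to pygeoda.azp_greedy, e.g. random_seed=...
    """
    import pygeoda  # imported lazily so this module loads without pygeoda installed

    gda = pygeoda.open(shapefile_path)
    w = pygeoda.queen_weights(gda)  # assumption: queen contiguity suits the data
    data = gda[variables]           # assumption: list indexing returns the columns
    return pygeoda.azp_greedy(p, w, data, **kwargs)


# Example call (path and column names are placeholders):
# result = run_azp_greedy("./data/Guerry.shp", ["Crm_prs", "Litercy"], 5,
#                         scale_method="standardize", random_seed=123456789)
```

Passing `random_seed` explicitly keeps repeated runs comparable, since the search starts from a random process when `init_regions` is empty.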