The max-p-regions problem is a special case of constrained clustering in which a finite number of geographical areas are aggregated into the maximum number of regions (max-p-regions), such that each region is geographically connected and internal homogeneity within the regions is maximized.
An instance of the Weight class
A data frame with selected variables only. E.g. guerry[c("Crm_prs", "Crm_prp", "Litercy")]
A numeric vector of the selected bounding variable
The minimum value that the sum of the bounding variable in each cluster must exceed
(optional) The number of iterations of the greedy algorithm. Defaults to 99.
(optional) The initial regions that the local search starts with. Defaults to empty, which means the local search starts with a random process to "grow" clusters.
(optional) One of the scaling methods ('raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust') to apply on input data. Default is 'standardize' (Z-score normalization).
(optional) The distance method used to compute the distance between observations i and j. Options are "euclidean" and "manhattan". Defaults to "euclidean".
(optional) The seed for random number generator. Defaults to 123456789.
(optional) The number of CPU threads used for parallel computation
(optional) The distance matrix (lower triangular matrix, column wise storage)
A named list with names "Clusters", "Total sum of squares", "Within-cluster sum of squares", "Total within-cluster sum of squares", and "The ratio of between to total sum of squares".
if (FALSE) { # \dontrun{
library(sf)
guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
guerry <- st_read(guerry_path)
queen_w <- queen_weights(guerry)
data <- guerry[c('Crm_prs','Crm_prp','Litercy','Donatns','Infants','Suicids')]
bound_variable <- guerry['Pop1831']
min_bound <- 3236.67 # 10% of the sum of Pop1831
maxp_clusters <- maxp_greedy(queen_w, data, bound_variable, min_bound, iterations=99)
maxp_clusters
} # }
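
The preprocessing described in the arguments above can be sketched in base R, without rgeoda. This is only an illustration under stated assumptions: the toy data frame and its column names are made up, 'standardize' is assumed to use the sample standard deviation, and the last step only shows what a column-wise lower-triangular layout looks like; it is not confirmed to be the exact input the distance-matrix argument expects.

```r
# Toy data (hypothetical values): five areas, two clustering variables,
# and one bounding variable.
toy <- data.frame(
  crime    = c(10, 20, 30, 40, 50),
  literacy = c(5, 3, 8, 1, 9),
  pop      = c(100, 250, 150, 300, 200)
)

# A minimum bound set to 10% of the total of the bounding variable.
min_bound <- 0.1 * sum(toy$pop)

# Z-score standardization of the clustering variables: (x - mean) / sd.
# Assumption: sample standard deviation, which is what scale() uses.
z <- scale(toy[c("crime", "literacy")])

# Pairwise distances between observations i and j under both metrics.
d_euclidean <- dist(z, method = "euclidean")
d_manhattan <- dist(z, method = "manhattan")

# A "dist" object holds the lower triangle column by column;
# as.vector() exposes it as a plain numeric vector.
lower_tri <- as.vector(d_euclidean)
```

With five observations the lower triangle holds 5 * 4 / 2 = 10 distances. Choosing the bound as a share of the total, as in the example above, keeps the threshold meaningful when the bounding variable changes.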