R/clustering.R
redcap.Rd
REDCAP (Regionalization with dynamically constrained agglomerative clustering and partitioning) was developed by D. Guo (2008). Like SKATER, REDCAP starts by building a spanning tree, which it can do in 4 different ways (single-linkage, average-linkage, Ward-linkage and complete-linkage). Single-linkage produces a minimum spanning tree. REDCAP then provides 2 different ways (first-order and full-order constraining) to prune the tree to find clusters. The first-order approach with a minimum spanning tree is exactly the same as SKATER. In GeoDa and pygeoda, the following methods are provided:
* First-order and Single-linkage
* Full-order and Complete-linkage
* Full-order and Average-linkage
* Full-order and Single-linkage
* Full-order and Ward-linkage
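To make the spanning-tree step concrete, here is a minimal base R sketch of building a minimum spanning tree over contiguity edges with Prim's algorithm. It illustrates the general idea only and is not how rgeoda implements REDCAP; the toy values in `x`, the neighbor pairs in `edges`, and the use of absolute difference as the edge cost are all assumptions made for the example.

```r
# Toy attribute values for 5 areas (hypothetical data).
x <- c(1.0, 1.2, 5.0, 5.1, 9.0)

# Hypothetical contiguity pairs; each row is an undirected edge between
# two neighboring areas. The graph is assumed to be connected.
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(1, 3))

# Edge cost: attribute dissimilarity between the two neighbors.
cost <- abs(x[edges[, 1]] - x[edges[, 2]])

# Prim's algorithm restricted to the contiguity edges, starting at area 1.
in_tree <- c(TRUE, rep(FALSE, length(x) - 1))
mst <- integer(0)
while (!all(in_tree)) {
  # Candidate edges have exactly one endpoint already in the tree.
  crossing <- which(in_tree[edges[, 1]] != in_tree[edges[, 2]])
  # Pick the cheapest candidate and add its endpoints to the tree.
  best <- crossing[which.min(cost[crossing])]
  mst <- c(mst, best)
  in_tree[edges[best, ]] <- TRUE
}

# Edges of the resulting spanning tree, in the order they were added.
mst_edges <- edges[mst, , drop = FALSE]
mst_edges
```

Pruning this tree into `k` clusters is a separate step (first-order or full-order in REDCAP) and is not shown here.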
redcap(
  k,
  w,
  df,
  method = "fullorder-averagelinkage",
  bound_variable = data.frame(),
  min_bound = 0,
  scale_method = "standardize",
  distance_method = "euclidean",
  random_seed = 123456789,
  cpu_threads = 6,
  rdist = numeric()
)
k: The number of clusters.
w: An instance of Weight class.
df: A data frame with selected variables only, e.g. guerry[c("Crm_prs", "Crm_prp", "Litercy")].
method: The REDCAP method to use, one of "firstorder-singlelinkage", "fullorder-completelinkage", "fullorder-averagelinkage", "fullorder-singlelinkage", "fullorder-wardlinkage". Defaults to "fullorder-averagelinkage".
bound_variable: (optional) A data frame with the selected bound variable.
min_bound: (optional) A minimum bound value that applies to all clusters.
scale_method: (optional) One of the scaling methods 'raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust' to apply to the input data. Defaults to 'standardize' (Z-score normalization).
distance_method: (optional) The distance method used to compute the distance between observations i and j. Options are "euclidean" and "manhattan". Defaults to "euclidean".
random_seed: (int, optional) The seed for the random number generator. Defaults to 123456789.
cpu_threads: (optional) The number of CPU threads used for parallel computation. Defaults to 6.
rdist: (optional) The distance matrix (lower triangular matrix, column-wise storage).
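As an illustration of "lower triangular matrix, column-wise storage", the sketch below uses base R's `dist()`, whose result stores the lower triangle of the distance matrix column by column. Whether `redcap()` expects exactly this layout for `rdist` is an assumption based on the parameter description above; the toy matrix `m` is hypothetical.

```r
# Hypothetical toy data: 4 observations, 2 variables.
m <- matrix(c(0, 0,
              3, 4,
              6, 8,
              0, 1), ncol = 2, byrow = TRUE)

# dist() returns the lower triangle of the pairwise distance matrix,
# stored column by column: d(2,1), d(3,1), d(4,1), d(3,2), d(4,2), d(4,3).
d <- dist(m, method = "euclidean")
rdist_vec <- as.numeric(d)

# Cross-check against the full symmetric matrix: lower.tri() selects the
# same entries, and matrix subsetting in R is column-major.
full <- as.matrix(d)
same_layout <- isTRUE(all.equal(rdist_vec, full[lower.tri(full)]))
same_layout

# Assumed usage (not run here): pass the vector via the rdist argument,
# e.g. redcap(4, queen_w, data, rdist = rdist_vec)
```

For n observations the vector has n(n-1)/2 entries, so its length is a quick sanity check before passing it on.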
A named list with names "Clusters", "Total sum of squares", "Within-cluster sum of squares", "Total within-cluster sum of squares", and "The ratio of between to total sum of squares".
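Because most of these names contain spaces, they are easiest to read with `[[ ]]`. The sketch below builds a mock list with the same names from a hypothetical one-variable toy example, to show both the access pattern and one plausible reading of how the quantities relate. The exact definitions rgeoda uses (for example, how values are scaled or aggregated across variables) are not stated here, so treat the arithmetic as an assumption.

```r
# Hypothetical toy data: one variable, four observations, two clusters.
x <- c(1, 2, 10, 11)
clusters <- c(1, 1, 2, 2)

# Sum of squared deviations from the overall mean.
total_ss <- sum((x - mean(x))^2)
# Sum of squared deviations from each cluster's own mean.
within_ss <- as.numeric(tapply(x, clusters, function(v) sum((v - mean(v))^2)))

# Mock result using the documented names (not produced by redcap()).
mock <- list(
  "Clusters" = clusters,
  "Total sum of squares" = total_ss,
  "Within-cluster sum of squares" = within_ss,
  "Total within-cluster sum of squares" = sum(within_ss),
  "The ratio of between to total sum of squares" =
    (total_ss - sum(within_ss)) / total_ss
)

# Names without spaces work with $; names with spaces need [[ ]].
mock$Clusters
ratio <- mock[["The ratio of between to total sum of squares"]]
ratio
```

A ratio close to 1 means most of the variation lies between clusters rather than within them.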
if (FALSE) { # \dontrun{
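library(sf)
library(rgeoda)
# Load the Guerry sample data shipped with rgeoda
guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
guerry <- st_read(guerry_path)
# Queen contiguity weights define which areas are neighbors
queen_w <- queen_weights(guerry)
# Keep only the variables used for clustering
data <- guerry[c('Crm_prs','Crm_prp','Litercy','Donatns','Infants','Suicids')]
# Partition into 4 clusters with full-order complete-linkage REDCAP
guerry_clusters <- redcap(4, queen_w, data, "fullorder-completelinkage")
guerry_clusters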
} # }