SKATER (Spatial 'K'luster Analysis by Tree Edge Removal) forms spatially contiguous clusters by partitioning observations that have similar values for the variables of interest.
skater(
  k,
  w,
  df,
  bound_variable = data.frame(),
  min_bound = 0,
  scale_method = "standardize",
  distance_method = "euclidean",
  random_seed = 123456789,
  cpu_threads = 6,
  rdist = numeric()
)
k: The number of clusters.
w: An instance of the Weight class.
df: A data frame with selected variables only. E.g. guerry[c("Crm_prs", "Crm_prp", "Litercy")]
bound_variable: (optional) A data frame with the selected bound variable.
min_bound: (optional) A minimum bound value that applies to all clusters. Defaults to 0.
scale_method: (optional) One of the scaling methods ('raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust') to apply to the input data. Defaults to 'standardize' (Z-score normalization).
distance_method: (optional) The distance method used to compute the distance between observations i and j. Options are "euclidean" and "manhattan". Defaults to "euclidean".
random_seed: (int, optional) The seed for the random number generator. Defaults to 123456789.
cpu_threads: (optional) The number of CPU threads used for parallel computation. Defaults to 6.
rdist: (optional) The distance matrix (lower triangular matrix, column-wise storage).
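The scaling and distance options can be illustrated with base R alone. The sketch below is not rgeoda's internal code: it assumes that 'standardize' behaves like `scale()` (Z-scores using the sample standard deviation) and uses `dist()` as a stand-in for the two distance methods. The toy matrix `x` and the name `rdist_candidate` are made up for illustration.

```r
# Toy data: 4 observations, 2 variables (made-up values for illustration only).
x <- matrix(c(1, 2, 3, 4,
              10, 20, 30, 60),
            ncol = 2,
            dimnames = list(NULL, c("a", "b")))

# Z-score standardization: subtract the column mean, divide by the sample SD.
z <- scale(x)

# After standardization every column has a sum of squares of n - 1.
colSums(z^2)

# Pairwise distances between observations under the two documented metrics.
d_euclidean <- dist(z, method = "euclidean")
d_manhattan <- dist(z, method = "manhattan")

# A dist object holds the lower triangle column by column: n * (n - 1) / 2 values.
# Assumption: whether rdist expects exactly this vector is not confirmed here.
rdist_candidate <- as.numeric(d_euclidean)
length(rdist_candidate)
```

Manhattan distances are never smaller than the corresponding Euclidean ones, so switching `distance_method` changes how strongly differences spread across several variables count toward dissimilarity.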
A named list with names "Clusters", "Total sum of squares", "Within-cluster sum of squares", "Total within-cluster sum of squares", and "The ratio of between to total sum of squares".
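Because most of these names contain spaces, elements are easiest to reach with `[[ ]]` or with backticks after `$`. The sketch below uses a hypothetical stand-in list `res` rather than a real `skater()` result. Its two values are copied from the example output on this page, not computed.

```r
# Stand-in list using two of the documented names; the values are copied from
# the printed example output on this page, not computed here.
res <- list(
  "Total sum of squares" = 504,
  "The ratio of between to total sum of squares" = 0.3156447
)

# Names with spaces need [[ ]] or backticks after $.
res[["Total sum of squares"]]
res$`The ratio of between to total sum of squares`
```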
library(rgeoda)
library(sf)
guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
guerry <- st_read(guerry_path)
#> Reading layer `Guerry' from data source
#> `/Users/runner/work/_temp/Library/rgeoda/extdata/Guerry.shp'
#> using driver `ESRI Shapefile'
#> Simple feature collection with 85 features and 29 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 47680 ymin: 1703258 xmax: 1031401 ymax: 2677441
#> Projected CRS: NTF (Paris) / Lambert zone II
queen_w <- queen_weights(guerry)
data <- guerry[c('Crm_prs','Crm_prp','Litercy','Donatns','Infants','Suicids')]
guerry_clusters <- skater(4, queen_w, data)
guerry_clusters
#> $Clusters
#> [1] 3 2 3 1 1 1 2 1 2 1 1 1 2 1 1 3 3 3 2 4 3 1 2 1 2 2 4 1 1 1 1 1 4 3 4 1 2 1
#> [39] 4 3 3 4 2 1 1 1 4 4 2 2 4 2 2 4 2 3 2 2 4 2 3 1 1 1 2 2 1 2 3 4 2 2 2 2 3 2
#> [77] 1 1 1 1 3 3 3 2 2
#>
#> $`Total sum of squares`
#> [1] 504
#>
#> $`Within-cluster sum of squares`
#> [1] 57.89077 59.95242 28.72571 69.38030 62.30781 66.65809
#>
#> $`Total within-cluster sum of squares`
#> [1] 159.0849
#>
#> $`The ratio of between to total sum of squares`
#> [1] 0.3156447
#>
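The printed figures can be cross-checked with plain arithmetic. The sketch below only re-uses numbers copied from the output above; the variable names are made up for illustration. It assumes the default 'standardize' scaling uses the sample standard deviation, in which case 6 variables over 85 observations give a total sum of squares of 6 * 84. It also shows that the six listed values sum to 344.9151, and that 504 minus that sum is the reported 159.0849. That is a reading of the numbers, not something the documentation states.

```r
# Figures copied from the printed output above.
total_ss <- 504
listed_ss <- c(57.89077, 59.95242, 28.72571, 69.38030, 62.30781, 66.65809)
reported_total_within <- 159.0849
reported_ratio <- 0.3156447

n_obs <- 85   # features in the Guerry layer
n_vars <- 6   # columns passed to skater()

# With z-scored columns, the total sum of squares is n_vars * (n_obs - 1).
n_vars * (n_obs - 1)

# The reported ratio equals the reported 159.0849 divided by the total.
reported_total_within / total_ss

# Sum of the six listed values, and what remains of the total without it.
sum(listed_ss)
total_ss - sum(listed_ss)
```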