SKATER forms clusters by spatially partitioning observations so that each cluster groups contiguous observations with similar values for the features of interest.

skater(
  k,
  w,
  df,
  bound_variable = data.frame(),
  min_bound = 0,
  scale_method = "standardize",
  distance_method = "euclidean",
  random_seed = 123456789,
  cpu_threads = 6,
  rdist = numeric()
)

Arguments

k

The number of clusters

w

An instance of Weight class

df

A data frame with selected variables only. E.g. guerry[c("Crm_prs", "Crm_prp", "Litercy")]

bound_variable

(optional) A data frame with selected bound variable

min_bound

(optional) A minimum bound value that applies to all clusters
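As a rough illustration of what a minimum bound can mean, the base R sketch below checks whether every cluster's total of a bound variable reaches a threshold. It assumes the bound applies to the per-cluster sum of the bound variable; `pop`, `clusters` and the threshold are made-up values, not rgeoda output.

```r
# Hypothetical bound variable (e.g. population) for six observations
pop <- c(120, 80, 200, 150, 90, 60)
# Hypothetical cluster labels for the same observations
clusters <- c(1, 1, 2, 2, 3, 3)
# Hypothetical minimum bound
min_bound <- 150

# Sum the bound variable within each cluster
cluster_totals <- tapply(pop, clusters, sum)
# TRUE where a cluster meets the assumed minimum-bound condition
meets_bound <- cluster_totals >= min_bound

print(cluster_totals)
print(meets_bound)
```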

scale_method

One of the scaling methods 'raw', 'standardize', 'demean', 'mad', 'range_standardize', 'range_adjust' to apply on input data. Default is 'standardize' (Z-score normalization).
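To make the default concrete, here is a minimal base R sketch of Z-score standardization on made-up data. It assumes the sample (n - 1) standard deviation, which is consistent with the total sum of squares of 504 in the example on this page (85 observations, 6 variables). The 'demean' line is an assumed definition (subtract the column mean only), not taken from the package source.

```r
# Made-up data: 5 observations, 2 variables
x <- data.frame(a = c(1, 2, 3, 4, 5), b = c(10, 20, 15, 30, 25))

# 'standardize': Z-score, (x - mean) / sd, using the sample standard deviation
z <- scale(x)

# 'demean' (assumed definition): subtract the column mean, keep the original spread
d <- scale(x, center = TRUE, scale = FALSE)

# After Z-scoring, each column's sum of squares equals n - 1,
# so the total sum of squares is (n - 1) * number of variables
total_ss <- sum(z^2)
print(total_ss)
```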

distance_method

(optional) The distance method used to compute the distance between observations i and j. Defaults to "euclidean". Options are "euclidean" and "manhattan".
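The two options differ only in how per-variable differences are combined. A small base R sketch with two made-up observations `p` and `q`:

```r
# Two made-up observations with three variables each
p <- c(1, 2, 3)
q <- c(4, 6, 3)

# Euclidean: square root of the summed squared differences
euclid <- sqrt(sum((p - q)^2))

# Manhattan: sum of the absolute differences
manhattan <- sum(abs(p - q))

# The same values via stats::dist()
m <- rbind(p, q)
euclid_dist <- as.numeric(dist(m, method = "euclidean"))
manhattan_dist <- as.numeric(dist(m, method = "manhattan"))

print(c(euclidean = euclid, manhattan = manhattan))
```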

random_seed

(int, optional) The seed for the random number generator. Defaults to 123456789.

cpu_threads

(optional) The number of CPU threads used for parallel computation. Defaults to 6.

rdist

(optional) The distance matrix (lower triangular matrix, column-wise storage)
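For reference, base R's `dist` objects already hold the lower triangle of a distance matrix column by column, without the diagonal. The sketch below shows that layout on made-up coordinates. Whether `skater()` expects exactly this vector (for example, without the diagonal) is an assumption to verify before passing it as `rdist`.

```r
# Three made-up points in two dimensions
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)

# dist() returns the lower triangle, stored column by column:
# d(2,1), d(3,1), d(3,2)
d <- dist(x)
rdist_vec <- as.numeric(d)

# The same values pulled from the full square matrix
full <- as.matrix(d)
lower_colwise <- full[lower.tri(full)]

print(rdist_vec)
```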

Value

A named list with names "Clusters", "Total sum of squares", "Within-cluster sum of squares", "Total within-cluster sum of squares", and "The ratio of between to total sum of squares".
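Because most of these names contain spaces, elements are easiest to reach with `[[ ]]` or backticks. The sketch below uses a hand-built stand-in list, `res`, with a few values copied from the example output on this page; it is not produced by calling `skater()`.

```r
# Stand-in for a skater() result; values copied from the example output,
# with only the first four cluster labels kept for brevity
res <- list(
  "Clusters" = c(3, 2, 3, 1),
  "Total sum of squares" = 504,
  "The ratio of between to total sum of squares" = 0.3156447
)

# Names without spaces work with plain $
labels <- res$Clusters

# Names with spaces need [[ ]] or backticks
total_ss <- res[["Total sum of squares"]]
ratio <- res$`The ratio of between to total sum of squares`

# Count observations per cluster label
sizes <- table(labels)
print(sizes)
```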

Examples

library(sf)
guerry_path <- system.file("extdata", "Guerry.shp", package = "rgeoda")
guerry <- st_read(guerry_path)
#> Reading layer `Guerry' from data source 
#>   `/Users/runner/work/_temp/Library/rgeoda/extdata/Guerry.shp' 
#>   using driver `ESRI Shapefile'
#> Simple feature collection with 85 features and 29 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 47680 ymin: 1703258 xmax: 1031401 ymax: 2677441
#> Projected CRS: NTF (Paris) / Lambert zone II
queen_w <- queen_weights(guerry)
data <- guerry[c('Crm_prs','Crm_prp','Litercy','Donatns','Infants','Suicids')]
guerry_clusters <- skater(4, queen_w, data)
guerry_clusters
#> $Clusters
#>  [1] 3 2 3 1 1 1 2 1 2 1 1 1 2 1 1 3 3 3 2 4 3 1 2 1 2 2 4 1 1 1 1 1 4 3 4 1 2 1
#> [39] 4 3 3 4 2 1 1 1 4 4 2 2 4 2 2 4 2 3 2 2 4 2 3 1 1 1 2 2 1 2 3 4 2 2 2 2 3 2
#> [77] 1 1 1 1 3 3 3 2 2
#> 
#> $`Total sum of squares`
#> [1] 504
#> 
#> $`Within-cluster sum of squares`
#> [1] 57.89077 59.95242 28.72571 69.38030 62.30781 66.65809
#> 
#> $`Total within-cluster sum of squares`
#> [1] 159.0849
#> 
#> $`The ratio of between to total sum of squares`
#> [1] 0.3156447
#>
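The printed figures can be cross-checked with plain arithmetic. The sketch below only re-enters the numbers shown above and makes no claim about how the package computes them. With 85 observations and 6 standardized variables, (85 - 1) * 6 gives the total of 504. The six values printed under "Within-cluster sum of squares" sum to about 344.9151, and 504 minus that sum gives the 159.0849 printed under "Total within-cluster sum of squares". Dividing that figure by 504 reproduces the reported ratio up to rounding.

```r
# Numbers copied from the output above
within_ss <- c(57.89077, 59.95242, 28.72571, 69.38030, 62.30781, 66.65809)
total_ss <- 504
n_obs <- 85   # features read from Guerry.shp
n_vars <- 6   # variables selected into `data`

# Total sum of squares expected for Z-scored data
expected_total <- (n_obs - 1) * n_vars

# Difference between the total and the summed printed values
remainder <- total_ss - sum(within_ss)

# Ratio implied by the printed figures
ratio <- remainder / total_ss

print(c(expected_total = expected_total, remainder = remainder, ratio = ratio))
```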