5 Spatial Weights
Spatial weights are a key component in any cross-sectional analysis of spatial dependence. They are an essential element in the construction of spatial autocorrelation statistics, and provide the means to create spatially explicit variables, such as spatially lagged variables and spatially smoothed rates.
Spatial weights represent the possible spatial interactions between observations in space. Like the GeoDa desktop software, pygeoda provides a rich variety of methods to create several different types of spatial weights:
Contiguity Based Weights: queen_weights(), rook_weights()
Distance Based Weights: distance_weights()
K-Nearest Neighbor Weights: knn_weights()
Kernel Weights: kernel_weights()
5.1 Queen Contiguity Weights
To create Queen contiguity weights, we can call pygeoda’s function
pygeoda.queen_weights(gda, order=1, include_lower_order = False, precision_threshold = 0)
by passing the geoda object guerry created using pygeoda.open():
>>> queen_w = pygeoda.queen_weights(guerry)
>>> queen_w
Weights Meta-data:
number of observations: 85
is symmetric: True
sparsity: 0.05813148788927336
# min neighbors: 2
# max neighbors: 8
# mean neighbors: 4.9411764705882355
# median neighbors: 5.0
has isolates: False
The function queen_weights() returns an instance of the Weight object. The metadata of the spatial weights can be accessed through the attributes and methods of the Weight object:
5.2 Attributes of Weight object
num_obs
is_symmetric()
has_isolates()
weights_sparsity()
min_neighbors()
median_neighbors()
mean_neighbors()
max_neighbors()
get_neighbors()
spatial_lag()
save_weights()
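The summary values printed for the queen weights can be reproduced from a plain neighbor list. The following is a minimal sketch in pure Python, not pygeoda's implementation: `weights_summary` and the toy data are hypothetical, and sparsity is assumed to be the share of nonzero entries in the n × n weights matrix. That assumption is consistent with the queen output, where 85 × 4.94 ≈ 420 neighbor links and 420 / 85² ≈ 0.0581.

```python
from statistics import mean, median

def weights_summary(neighbors):
    """Summarize a neighbor structure given as {obs_index: tuple_of_neighbor_indices}."""
    n = len(neighbors)
    counts = [len(nbrs) for nbrs in neighbors.values()]
    # Symmetric means: whenever j is a neighbor of i, i is also a neighbor of j.
    symmetric = all(i in neighbors[j] for i, nbrs in neighbors.items() for j in nbrs)
    return {
        "num_obs": n,
        "is_symmetric": symmetric,
        # Assumption: sparsity = nonzero entries / all entries of the n x n matrix.
        "sparsity": sum(counts) / (n * n),
        "min_neighbors": min(counts),
        "max_neighbors": max(counts),
        "mean_neighbors": mean(counts),
        "median_neighbors": median(counts),
        "has_isolates": min(counts) == 0,
    }

# Hypothetical toy data: four observations in a row (0-1-2-3).
toy = {0: (1,), 1: (0, 2), 2: (1, 3), 3: (2,)}
summary = weights_summary(toy)
print(summary)
```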
The details of the weights are also accessible. For example, get_neighbors() returns the neighbors of a specified observation, which is useful in exploratory spatial data analysis:
>>> nbrs = queen_w.get_neighbors(0)
>>> print("Neighbors of observation 0 are:", nbrs)
Neighbors of observation 0 are: (35, 36, 66, 68)
We can also compute the spatial lag of a variable, for every observation, by passing the values of the selected variable to spatial_lag():
>>> lag = queen_w.spatial_lag(guerry['Crm_prp'])
>>> print("Spatial lagged values of Crm_prp:", lag)
Spatial lagged values of Crm_prp: [7899.25, 6593.5,...]
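To make the idea of a spatial lag concrete, here is a small pure-Python sketch. It assumes row-standardized weights, so the lag of an observation is simply the average of its neighbors' values; the function name, the toy data, and the choice to return 0.0 for an observation without neighbors are illustrative assumptions, not pygeoda's code.

```python
def spatial_lag(neighbors, values):
    """Average of each observation's neighbors' values (row-standardized weights)."""
    lags = []
    for i in range(len(values)):
        nbrs = neighbors.get(i, ())
        # Assumption: an observation without neighbors gets a lag of 0.0.
        lags.append(sum(values[j] for j in nbrs) / len(nbrs) if nbrs else 0.0)
    return lags

# Hypothetical toy data: four observations in a row (0-1-2-3).
toy_neighbors = {0: (1,), 1: (0, 2), 2: (1, 3), 3: (2,)}
toy_values = [10.0, 20.0, 30.0, 40.0]
toy_lag = spatial_lag(toy_neighbors, toy_values)
print(toy_lag)
```

Each lagged value depends only on the neighbors, not on the observation's own value, which is why observation 0 (value 10.0) gets the value of its single neighbor.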
5.3 Rook Contiguity Weights
To create Rook contiguity weights, we can call pygeoda’s function
pygeoda.rook_weights(gda, order=1, include_lower_order=False, precision_threshold = 0)
by passing the geoda object guerry we just created:
>>> rook_w = pygeoda.rook_weights(guerry)
>>> print(rook_w)
Weights Meta-data:
number of observations: 85
is symmetric: True
sparsity: 0.05813148788927336
# min neighbors: 2
# max neighbors: 8
# mean neighbors: 4.9411764705882355
# median neighbors: 5.0
has isolates: False
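Queen and rook contiguity differ in what counts as a shared border: rook neighbors share an edge, while queen neighbors share an edge or a single vertex. The sketch below illustrates the difference on a regular grid instead of real polygons; `grid_neighbors` is a hypothetical helper written for this illustration and is unrelated to how pygeoda detects contiguity.

```python
def grid_neighbors(nrows, ncols, queen=True):
    """Contiguity neighbors for the cells of a regular grid, numbered row by row."""
    # Rook: cells sharing an edge. Queen: cells sharing an edge or a corner.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if queen:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    neighbors = {}
    for r in range(nrows):
        for c in range(ncols):
            cell = r * ncols + c
            neighbors[cell] = tuple(sorted(
                (r + dr) * ncols + (c + dc)
                for dr, dc in steps
                if 0 <= r + dr < nrows and 0 <= c + dc < ncols
            ))
    return neighbors

rook = grid_neighbors(3, 3, queen=False)
queen = grid_neighbors(3, 3, queen=True)
print(rook[4])   # center cell: edge neighbors only
print(queen[4])  # center cell: edge and corner neighbors
```

On irregular polygons the two criteria can produce the same neighbor sets whenever no pair of polygons touches at a single point only.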
To save the weights to a file, we can call the Weight object’s method
save_weights(ofname, layer_name, id_name, id_vec)
The layer_name is the layer name of the loaded dataset. For an ESRI shapefile, the layer name is the file name without the suffix (e.g. Guerry).
The id_name is a key column, which contains unique values that identify the observations.
The id_vec is the actual column data of id_name. It can be a tuple of integer or string values.
For example, in the Guerry dataset, the column “CODE_DE” can be used as a key to save a weights file:
>>> rook_w.save_weights('./Guerry_r.gal', 'Guerry', 'CODE_DE', guerry['CODE_DE'])
True
The file “Guerry_r.gal” should then appear in the current working directory.
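For orientation, a GAL file is a plain-text listing of neighbors. The sketch below builds text in the layout commonly described for GeoDa's GAL format: a header line with 0, the number of observations, the layer name and the key column, followed by two lines per observation (its id and neighbor count, then the ids of its neighbors). This layout is an assumption here, and `to_gal_text` is a hypothetical helper; the file written by save_weights() may differ in details such as ordering.

```python
def to_gal_text(neighbors, layer_name, id_name, id_vec):
    """Build GAL-style text: a header line, then two lines per observation."""
    # Assumed header layout: "0 <number of observations> <layer name> <key column>".
    lines = [f"0 {len(id_vec)} {layer_name} {id_name}"]
    for i, obs_id in enumerate(id_vec):
        # Neighbors are stored by position; the file refers to them by key value.
        nbr_ids = [str(id_vec[j]) for j in neighbors.get(i, ())]
        lines.append(f"{obs_id} {len(nbr_ids)}")
        lines.append(" ".join(nbr_ids))
    return "\n".join(lines) + "\n"

# Hypothetical toy data: three observations with made-up key values.
toy_neighbors = {0: (1,), 1: (0, 2), 2: (1,)}
gal_text = to_gal_text(toy_neighbors, "toy", "CODE", (101, 102, 103))
print(gal_text)
```

The key column matters because the file identifies observations by their key values, not by their position in the dataset.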
5.4 Distance Based Weights
To create distance based weights, we can call pygeoda’s function
pygeoda.distance_weights(gda, dist_thres, power=1.0, is_inverse=False, is_arc=False, is_mile=True)
by passing the geoda object guerry we just created and the value of the distance threshold. Like GeoDa, pygeoda provides a function to help you find an optimized distance threshold that guarantees that every observation has at least one neighbor:
pygeoda.min_distthreshold(gda, is_arc=False, is_mile=True)
>>> dist_thres = pygeoda.min_distthreshold(guerry)
>>> dist_w = pygeoda.distance_weights(guerry, dist_thres)
>>> dist_w
Weights Meta-data:
number of observations: 85
is symmetric: True
sparsity: 0.043460207612456746
# min neighbors: 1
# max neighbors: 7
# mean neighbors: 3.6941176470588237
# median neighbors: 4.0
has isolates: False
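A threshold that guarantees at least one neighbor per observation can be understood as the largest of all nearest-neighbor distances. The following pure-Python sketch works under that reading, using Euclidean distance on hypothetical toy points; it ignores the is_arc and is_mile options, and counting a pair at exactly the threshold distance as neighbors is an assumption.

```python
from math import dist

def min_dist_threshold(points):
    """Largest nearest-neighbor distance: the smallest cutoff that leaves no isolates."""
    return max(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )

def distance_band(points, threshold):
    """Neighbors of each point: all other points within the threshold distance."""
    return {
        i: tuple(j for j, q in enumerate(points) if j != i and dist(p, q) <= threshold)
        for i, p in enumerate(points)
    }

# Hypothetical toy data: the last point is far away from the others.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
toy_thres = min_dist_threshold(toy_points)
toy_band = distance_band(toy_points, toy_thres)
print(toy_thres, toy_band)
```

The remote point determines the threshold, so points in denser areas end up with more neighbors. This is why the number of neighbors varies across observations in the output above.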
5.5 K-Nearest Neighbor Weights
A special case of distance based weights is K-nearest neighbor weights, in which every observation has exactly k neighbors. To create KNN weights, we can call pygeoda’s function:
pygeoda.knn_weights(gda, k, power=1.0, is_inverse=False, is_arc=False, is_mile=True)
For example, to create a 6-nearest neighbor weights using Guerry dataset:
>>> knn6_w = pygeoda.knn_weights(guerry, 6)
>>> print(knn6_w)
Weights Meta-data:
number of observations: 85
is symmetric: False
sparsity: 0.07058823529411765
# min neighbors: 6
# max neighbors: 6
# mean neighbors: 6.0
# median neighbors: 6.0
has isolates: False
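The output reports that the KNN weights are not symmetric. A small pure-Python sketch shows why: being among the k nearest neighbors of a point is not a mutual relation. The helper `knn_neighbors` and the toy points are hypothetical, and Euclidean distance is assumed.

```python
from math import dist

def knn_neighbors(points, k):
    """For each point, the indices of its k nearest other points."""
    result = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        result[i] = tuple(others[:k])
    return result

# Hypothetical toy data, chosen so that no two distances tie.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (6.0, 0.0)]
knn1 = knn_neighbors(toy_points, 1)
print(knn1)
```

Here point 3 has point 2 as its nearest neighbor, but point 2 is closer to point 1, so the relation holds in one direction only.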
5.6 Kernel Weights
Kernel weights apply a kernel function to determine the distance decay in the derived continuous weights. They are defined as a function K(z) of the ratio between the distance d_ij from i to j and the bandwidth h_i, with z = d_ij / h_i.
The kernel functions include:
triangular
uniform
quadratic
epanechnikov
quartic
gaussian
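As an illustration of K(z), the sketch below implements common textbook forms of five of these kernels. The exact formulas and scaling constants used inside pygeoda are not given here, so treat these as assumptions. The quadratic kernel is left out because the text does not say how it differs from epanechnikov, and giving pairs beyond the bandwidth a weight of zero for every kernel, gaussian included, is also an assumption.

```python
from math import exp, pi, sqrt

# Textbook kernel forms for 0 <= z <= 1 (assumed, not taken from pygeoda).
KERNELS = {
    "uniform": lambda z: 0.5,
    "triangular": lambda z: 1.0 - z,
    "epanechnikov": lambda z: 0.75 * (1.0 - z * z),
    "quartic": lambda z: (15.0 / 16.0) * (1.0 - z * z) ** 2,
    "gaussian": lambda z: exp(-0.5 * z * z) / sqrt(2.0 * pi),
}

def kernel_weight(d_ij, h_i, kind):
    """Weight for a pair at distance d_ij, given bandwidth h_i, with z = d_ij / h_i."""
    z = d_ij / h_i
    if z > 1.0:
        return 0.0  # beyond the bandwidth: not a neighbor
    return KERNELS[kind](z)

for kind in KERNELS:
    print(kind, kernel_weight(1.0, 2.0, kind))
```

All kernels except uniform give smaller weights to more distant pairs, which is the distance decay mentioned above.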
Two functions are provided in pygeoda to create kernel weights:
5.6.1 Kernel Weights with fixed bandwidth
To create kernel weights with a fixed bandwidth:
>>> kernel_w = pygeoda.kernel_weights(guerry, dist_thres, "uniform")
>>> print(kernel_w)
Weights Meta-data:
number of observations: 85
is symmetric: False
sparsity: 0.043460207612456746
# min neighbors: 1
# max neighbors: 7
# mean neighbors: 3.6941176470588237
# median neighbors: 4.0
has isolates: False
Besides the options is_inverse, power, is_arc and is_mile, which work the same way as in the distance based weights, this kernel weights function has one more option:
use_kernel_diagonals
(optional) False (default) or True: whether to apply the kernel on the
diagonal of the weights matrix
5.6.2 Kernel Weights with adaptive bandwidth
To create kernel weights with an adaptive bandwidth, or using the max KNN distance as the bandwidth:
>>> adptkernel_w = pygeoda.kernel_knn_weights(guerry, 6, "uniform")
>>> print(adptkernel_w)
Weights Meta-data:
number of observations: 85
is symmetric: False
sparsity: 0.07058823529411765
# min neighbors: 6
# max neighbors: 6
# mean neighbors: 6.0
# median neighbors: 6.0
has isolates: False
This kernel weights function has two more options:
adaptive_bandwidth
(optional) True (default) or False: True uses an adaptive bandwidth,
calculated for each observation from the distance to its k nearest neighbors; False uses a single bandwidth, the maximum
distance from any observation to its k nearest neighbors
use_kernel_diagonals
(optional) False (default) or True: whether to apply the kernel on the diagonal
of the weights matrix
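The two bandwidth choices can be sketched in pure Python as follows. This is an illustration under stated assumptions, not pygeoda's code: `knn_bandwidths` is a hypothetical helper, distances are Euclidean, and the adaptive bandwidth of an observation is taken to be the distance to its k-th nearest neighbor.

```python
from math import dist

def knn_bandwidths(points, k, adaptive=True):
    """Bandwidth per observation, derived from k-nearest-neighbor distances."""
    kth = []
    for i, p in enumerate(points):
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        kth.append(d[k - 1])  # distance to the k-th nearest neighbor
    # Adaptive: each observation keeps its own k-th neighbor distance.
    # Otherwise: one shared bandwidth, the largest of those distances.
    return kth if adaptive else [max(kth)] * len(points)

# Hypothetical toy data.
toy_points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (6.0, 0.0)]
adaptive_h = knn_bandwidths(toy_points, 2, adaptive=True)
fixed_h = knn_bandwidths(toy_points, 2, adaptive=False)
print(adaptive_h)
print(fixed_h)
```

With an adaptive bandwidth, observations in dense areas get a narrow kernel and remote observations a wide one; with the shared maximum, every observation uses the widest of those bandwidths.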