5 Spatial Weights

Spatial weights are a key component in any cross-sectional analysis of spatial dependence. They are an essential element in the construction of spatial autocorrelation statistics, and provide the means to create spatially explicit variables, such as spatially lagged variables and spatially smoothed rates.

Spatial weights represent the possible spatial interactions between observations. Like the GeoDa desktop software, pygeoda provides methods to create several types of spatial weights:

  • Contiguity Based Weights: queen_weights(), rook_weights()

  • Distance Based Weights: distance_weights()

  • K-Nearest Neighbor Weights: knn_weights()

  • Kernel Weights: kernel_weights()

5.1 Queen Contiguity Weights

To create Queen contiguity weights, we can call pygeoda’s function

pygeoda.queen_weights(gda, order=1, include_lower_order = False, precision_threshold = 0)

by passing the geoda object guerry created using pygeoda.open():

>>> queen_w = pygeoda.queen_weights(guerry)
>>> queen_w
Weights Meta-data:
 number of observations:                   85
           is symmetric:                 True
               sparsity:  0.05813148788927336
        # min neighbors:                    2
        # max neighbors:                    8
       # mean neighbors:   4.9411764705882355
     # median neighbors:                  5.0
           has isolates:                False
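The order and include_lower_order arguments in the signature above are not explained in this section. The sketch below illustrates the usual meaning of higher-order contiguity on a hypothetical four-observation chain: second-order neighbors are the neighbors of neighbors that are not already first-order neighbors, and include_lower_order decides whether the lower orders are kept as well. The helper higher_order() and the toy data are assumptions made for illustration; this is not pygeoda's implementation.

```python
def higher_order(neighbors, i, order=2, include_lower_order=False):
    """Return the contiguity neighbors of observation i up to a given order."""
    visited = {i}      # observations already assigned to some order (or i itself)
    frontier = {i}     # observations reached in the previous step
    rings = []         # rings[k] holds the neighbors of order k + 1
    for _ in range(order):
        frontier = {j for f in frontier for j in neighbors[f]} - visited
        visited |= frontier
        rings.append(frontier)
    selected = set().union(*rings) if include_lower_order else rings[-1]
    return tuple(sorted(selected))

# Hypothetical chain of four observations: 0 - 1 - 2 - 3
chain = {0: (1,), 1: (0, 2), 2: (1, 3), 3: (2,)}

second_only = higher_order(chain, 0, order=2)
first_and_second = higher_order(chain, 0, order=2, include_lower_order=True)
print(second_only, first_and_second)
```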

The function queen_weights() returns an instance of the Weight object. One can access the meta-data of the spatial weights through the attributes and methods of the Weight object.

5.2 Attributes of Weight object

  • num_obs

  • is_symmetric()

  • has_isolates()

  • weights_sparsity()

  • min_neighbors()

  • median_neighbors()

  • mean_neighbors()

  • max_neighbors()

  • get_neighbors()

  • spatial_lag()

  • save_weights()
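To make these summary values concrete, the following sketch recomputes them from a plain dictionary of neighbor lists. It assumes that sparsity is the share of non-zero cells in the n x n weights matrix; the Queen meta-data shown earlier is consistent with that reading (85 observations with a mean of about 4.94 neighbors give 420 links, and 420 / 85² ≈ 0.0581). The function weights_summary() and the toy data are hypothetical and independent of pygeoda.

```python
from statistics import mean, median

# Hypothetical toy weights: observation index -> tuple of neighbor indices.
neighbors = {
    0: (1, 2),
    1: (0, 2, 3),
    2: (0, 1, 3),
    3: (1, 2),
}

def weights_summary(neighbors):
    """Summarize a neighbor dictionary in the style of the meta-data printout."""
    n = len(neighbors)
    counts = [len(nbrs) for nbrs in neighbors.values()]
    return {
        "num_obs": n,
        # Symmetric: whenever j is a neighbor of i, i is also a neighbor of j.
        "is_symmetric": all(
            i in neighbors[j] for i, nbrs in neighbors.items() for j in nbrs
        ),
        # Sparsity (assumed definition): non-zero cells divided by n * n.
        "sparsity": sum(counts) / (n * n),
        "min_neighbors": min(counts),
        "max_neighbors": max(counts),
        "mean_neighbors": mean(counts),
        "median_neighbors": median(counts),
        # An isolate is an observation without any neighbor.
        "has_isolates": any(c == 0 for c in counts),
    }

summary = weights_summary(neighbors)
print(summary)
```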

We can also access the details of the weights, e.g. get the neighbors of a specified observation, which is useful in exploratory spatial data analysis:

>>> nbrs = queen_w.get_neighbors(0)
>>> print("Neighbors of observation 0 are:", nbrs)
Neighbors of observation 0 are: (35, 36, 66, 68)

We can also compute the spatial lag of a selected variable by passing its values:

>>> lag = queen_w.spatial_lag(guerry['Crm_prp'])
>>> print("Spatial lagged values of Crm_prp:", lag)
Spatial lagged values of Crm_prp: [7899.25, 6593.5,...]

5.3 Rook Contiguity Weights

To create Rook contiguity weights, we can call pygeoda’s function

pygeoda.rook_weights(gda, order=1, include_lower_order=False, precision_threshold = 0)

by passing the geoda object guerry we just created:

>>> rook_w = pygeoda.rook_weights(guerry)
>>> print(rook_w)
Weights Meta-data:
 number of observations:                   85
           is symmetric:                 True
               sparsity:  0.05813148788927336
        # min neighbors:                    2
        # max neighbors:                    8
       # mean neighbors:   4.9411764705882355
     # median neighbors:                  5.0
           has isolates:                False
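The difference between the two contiguity criteria is easiest to see on a regular grid: Rook neighbors share an edge, while Queen neighbors share an edge or a corner. The sketch below builds both for a hypothetical 3 x 3 grid whose cells are numbered row by row; grid_neighbors() is an illustrative helper and not part of pygeoda.

```python
def grid_neighbors(nrows, ncols, criterion="queen"):
    """Build contiguity neighbors for a regular grid numbered row by row."""
    # Rook: cells that share an edge (up, down, left, right).
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if criterion == "queen":
        # Queen: additionally cells that share only a corner.
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    neighbors = {}
    for r in range(nrows):
        for c in range(ncols):
            cell = r * ncols + c
            neighbors[cell] = tuple(sorted(
                (r + dr) * ncols + (c + dc)
                for dr, dc in offsets
                if 0 <= r + dr < nrows and 0 <= c + dc < ncols
            ))
    return neighbors

rook = grid_neighbors(3, 3, "rook")
queen = grid_neighbors(3, 3, "queen")
print("center cell, rook: ", rook[4])
print("center cell, queen:", queen[4])
```

On irregular polygons the two criteria differ only where polygons meet at a single point, so the resulting weights can be close or even identical for some datasets.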

To save the weights to a file, we can call the save_weights() method of the Weight object:

save_weights(ofname, layer_name, id_name, id_vec)

The layer_name is the layer name of the loaded dataset. For an ESRI shapefile, the layer name is the file name without the suffix (e.g. Guerry).

The id_name is a key column, which contains unique values that identify the observations.

The id_vec is the actual column data of id_name; it can be a tuple of integer or string values.

For example, in the Guerry dataset, the column “CODE_DE” can be used as a key to save a weights file:

>>> rook_w.save_weights('./Guerry_r.gal', 'Guerry', 'CODE_DE', guerry['CODE_DE'])
True

Then, we should find the file “Guerry_r.gal” in the output directory.
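For orientation, a GAL file is a plain-text listing of neighbors. The sketch below writes the commonly used layout: a header line with 0, the number of observations, the layer name and the key column, followed by two lines per observation (its key and neighbor count, then the keys of its neighbors). The helper to_gal() and the toy data are hypothetical, and the layout is an assumption about what save_weights() produces rather than a description of its code.

```python
def to_gal(neighbors, layer_name, id_name, ids):
    """Render a neighbor dictionary as GAL-style text (assumed layout)."""
    # Header: 0, number of observations, layer name, key column name.
    lines = [f"0 {len(ids)} {layer_name} {id_name}"]
    for i, nbrs in sorted(neighbors.items()):
        # First line per observation: its key and its number of neighbors.
        lines.append(f"{ids[i]} {len(nbrs)}")
        # Second line per observation: the keys of its neighbors.
        lines.append(" ".join(str(ids[j]) for j in nbrs))
    return "\n".join(lines) + "\n"

# Hypothetical toy weights with key values 101, 102, 103.
toy_neighbors = {0: (1,), 1: (0, 2), 2: (1,)}
toy_ids = (101, 102, 103)

gal_text = to_gal(toy_neighbors, "toy", "ID", toy_ids)
print(gal_text)
```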

5.4 Distance Based Weights

To create distance based weights, we can call pygeoda’s function

pygeoda.distance_weights(gda, dist_thres, power=1.0,  is_inverse=False, is_arc=False, is_mile=True)

by passing the geoda object guerry we just created and the value of the distance threshold. Like GeoDa, pygeoda provides a function to help you find the minimum distance threshold that guarantees that every observation has at least one neighbor:

pygeoda.min_distthreshold(gda, is_arc = False, is_mile = True)

>>> dist_thres = pygeoda.min_distthreshold(guerry)
>>> dist_w = pygeoda.distance_weights(guerry, dist_thres)
>>> dist_w
Weights Meta-data:
 number of observations:                   85
           is symmetric:                 True
               sparsity: 0.043460207612456746
        # min neighbors:                    1
        # max neighbors:                    7
       # mean neighbors:   3.6941176470588237
     # median neighbors:                  4.0
           has isolates:                False
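The idea behind such a threshold can be sketched in a few lines: the smallest distance band that leaves no observation without a neighbor is the largest of the nearest-neighbor distances. The code below assumes Euclidean distance between hypothetical points; the helper names are made up for this example and the code is not pygeoda's implementation (it also ignores the is_arc and is_mile options).

```python
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical point locations (e.g. polygon centroids).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 0.0)]

def min_dist_threshold(points):
    """Largest nearest-neighbor distance: every point gets >= 1 neighbor."""
    nearest = [
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return max(nearest)

def distance_band(points, threshold):
    """Neighbors are all other points within the distance threshold."""
    return {
        i: tuple(
            j for j, q in enumerate(points)
            if j != i and dist(p, q) <= threshold
        )
        for i, p in enumerate(points)
    }

threshold = min_dist_threshold(points)
band = distance_band(points, threshold)
print(threshold, band)
```

With this threshold the most remote point has exactly one neighbor, which matches the "min neighbors: 1" entry that a distance band built this way tends to show.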

5.5 K-Nearest Neighbor Weights

A special case of distance based weights is K-Nearest neighbor weights, in which every observation has exactly k neighbors. To create KNN weights, we can call pygeoda’s function:

pygeoda.knn_weights(gda, k, power = 1.0, is_inverse = False, is_arc = False, is_mile = True)

For example, to create 6-nearest neighbor weights using the Guerry dataset:

>>> knn6_w = pygeoda.knn_weights(guerry, 6)
>>> print(knn6_w)
Weights Meta-data:
 number of observations:                   85
           is symmetric:                False
               sparsity:  0.07058823529411765
        # min neighbors:                    6
        # max neighbors:                    6
       # mean neighbors:                  6.0
     # median neighbors:                  6.0
           has isolates:                False
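The meta-data above reports that the KNN weights are not symmetric. The sketch below shows why this can happen, using four hypothetical points on a line and k = 1: an observation's nearest neighbor does not necessarily have that observation as its own nearest neighbor. The helper knn_neighbors() is illustrative and uses Euclidean distance; it is not pygeoda's implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+

def knn_neighbors(points, k):
    """For each point, keep the k closest other points."""
    result = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        result[i] = tuple(others[:k])
    return result

# Hypothetical points on a line with increasing gaps.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]

knn1 = knn_neighbors(points, 1)
print(knn1)

# Point 2 picks point 1, but point 1 picks point 0: the relation is one-way.
symmetric = all(i in knn1[j] for i, nbrs in knn1.items() for j in nbrs)
print("is symmetric:", symmetric)
```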

5.6 Kernel Weights

Kernel weights apply a kernel function to determine the distance decay in the derived continuous weights. The kernel weights are defined as a function K(z) of the ratio between the distance d_ij from i to j and the bandwidth h_i, with z = d_ij / h_i.

The kernel functions include:

  • triangular

  • uniform

  • quadratic

  • epanechnikov

  • quartic

  • gaussian
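The section does not give the formulas behind these names. The sketch below uses the standard textbook forms of K(z) and assumes that observations beyond the bandwidth (z > 1) receive a weight of zero; whether pygeoda uses exactly these constants is an assumption. The "quadratic" entry is left out because its formula is not stated in the source. The dictionary KERNELS and the function kernel_weight() are hypothetical names.

```python
from math import exp, pi, sqrt

# Standard textbook kernel forms as functions of z = d_ij / h_i (assumed).
KERNELS = {
    "uniform": lambda z: 0.5,
    "triangular": lambda z: 1.0 - abs(z),
    "epanechnikov": lambda z: 0.75 * (1.0 - z * z),
    "quartic": lambda z: 15.0 / 16.0 * (1.0 - z * z) ** 2,
    "gaussian": lambda z: exp(-0.5 * z * z) / sqrt(2.0 * pi),
}

def kernel_weight(d_ij, h_i, kernel="triangular"):
    """Weight for a pair at distance d_ij given the bandwidth h_i."""
    z = d_ij / h_i
    if abs(z) > 1.0:
        # Outside the bandwidth: not a neighbor, so no weight (assumption).
        return 0.0
    return KERNELS[kernel](z)

# Weights at half the bandwidth for each kernel.
half = {name: kernel_weight(0.5, 1.0, name) for name in KERNELS}
print(half)
```

All of these functions decrease (or stay constant, for the uniform kernel) as the distance approaches the bandwidth, which is the distance decay described above.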

Two functions are provided in pygeoda to create kernel weights: kernel_weights() and kernel_knn_weights().

5.6.1 Kernel Weights with fixed bandwidth

To create kernel weights with a fixed bandwidth:

>>> kernel_w = pygeoda.kernel_weights(guerry, dist_thres, "uniform")
>>> print(kernel_w)
Weights Meta-data:
 number of observations:                   85
           is symmetric:                False
               sparsity: 0.043460207612456746
        # min neighbors:                    1
        # max neighbors:                    7
       # mean neighbors:   3.6941176470588237
     # median neighbors:                  4.0
           has isolates:                False

Besides the options is_inverse, power, is_arc and is_mile, which are the same as for the distance based weights, this kernel weights function has another option:

use_kernel_diagonals
(optional) False (default) or True, apply kernel on the
diagonal of weights matrix

5.6.2 Kernel Weights with adaptive bandwidth

To create kernel weights with an adaptive bandwidth, or using the max KNN distance as the bandwidth:

>>> adptkernel_w = pygeoda.kernel_knn_weights(guerry, 6, "uniform")
>>> print(adptkernel_w)
Weights Meta-data:
 number of observations:                   85
           is symmetric:                False
               sparsity:  0.07058823529411765
        # min neighbors:                    6
        # max neighbors:                    6
       # mean neighbors:                  6.0
     # median neighbors:                  6.0
           has isolates:                False

This kernel weights function has two more options:

adaptive_bandwidth
(optional) True (default) or False: True uses an adaptive bandwidth
calculated from the distance to the k-nearest neighbors, False uses the max
distance of all observations to their k-nearest neighbors

use_kernel_diagonals
(optional) False (default) or True, apply kernel on the diagonal
of weights matrix
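The two bandwidth choices can be compared with a small sketch. It assumes that the adaptive bandwidth of an observation is the distance to its k-th nearest neighbor, and that the non-adaptive variant applies the largest of these distances to every observation. The function knn_bandwidths() and the points are hypothetical; this is a reading of the option descriptions above, not pygeoda's code.

```python
from math import dist  # Euclidean distance, Python 3.8+

def knn_bandwidths(points, k, adaptive=True):
    """Bandwidth h_i per observation, derived from k-nearest-neighbor distances."""
    kth = []
    for i, p in enumerate(points):
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        kth.append(d[k - 1])  # distance to the k-th nearest neighbor
    if adaptive:
        # One bandwidth per observation.
        return kth
    # Single bandwidth for all: the largest k-th neighbor distance.
    return [max(kth)] * len(points)

# Hypothetical points on a line with increasing gaps.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]

adaptive_h = knn_bandwidths(points, 2, adaptive=True)
fixed_h = knn_bandwidths(points, 2, adaptive=False)
print(adaptive_h)
print(fixed_h)
```

With the adaptive choice, observations in dense areas get a narrow bandwidth and remote observations a wide one, whereas the non-adaptive choice uses the widest bandwidth everywhere.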