Local Spatial Autocorrelation (2)
Other Local Spatial Autocorrelation Statistics
Luc Anselin^{1}
10/10/2020 (updated)
Introduction
In this chapter, we continue our exploration of local spatial autocorrelation statistics. First, we cover several extensions of the Local Moran, such as the Median Local Moran, the Differential Local Moran and a specialized version that deals with the variance instability in rates or proportions, the EB Local Moran. In addition, we discuss the Local Geary statistic. We also cover the local statistics introduced by Getis and Ord. These statistics are not a LISA in a strict sense, but nevertheless are important tools to discover local hot spots and cold spots.
We use two of the example data sets we introduced earlier, the natregimes data on homicides in U.S. counties and the Guerry collection of socioeconomic indicators for 1830 France.
Objectives

Assess the sensitivity of Local Moran to the use of a median spatial lag instead of an average spatial lag

Identify clusters and outliers in the change of a variable over time

Correct the local Moran statistic for variance instability in rates

Identify clusters and outliers by means of the Local Geary

Identify hot spots and cold spots by means of the Gi and Gi* statistics
GeoDa functions covered
 Space > Median Local Moran’s I
 Space > Differential Local Moran’s I
 Space > Local Moran’s I with EB rate
 Space > Univariate Local Geary
 Space > Local G
 Space > Local G*
Preliminaries
We will alternately use the Guerry data set and the natregimes data set. For each of these, we need an active spatial weights matrix, e.g., guerry_85_q for the former and natregimes_q for the latter. See the description in earlier Chapters on how to accomplish this.
For the natregimes data set, we need some further preparation to make the variables time sensitive. By means of the Time Editor, we first group the homicide rates hr60 through hr90 into the variable hr, the homicide counts hc60 through hc90 into the variable hc, and the population counts po60 through po90 into po.
Extensions of the Local Moran
We consider three extensions of the Local Moran. One uses the median of the neighboring values instead of the average in the computation of the local statistic. We refer to it as the Median Local Moran. A second extension provides an easy way to compute the Local Moran for the difference for a given variable at two points in time. The Differential Local Moran is equivalent to first computing the difference and then applying the Local Moran, but our implementation carries everything out in one step. Finally, we consider an adjustment to the Local Moran for the inherent variance instability of rates in the form of the EB Local Moran.
These extensions can be found among the first group in the Space item on the Menu, or in the corresponding drop down list obtained from the toolbar icon. For example, Figure 1 shows the selection of the Median Local Moran.
Median Local Moran
Principle
So far, we have defined a spatially lagged variable as \(\sum_j w_{ij} z_j\), or the average of the values observed at the neighboring locations. As we saw in the discussion of the interpretation of the clusters and outliers identified by a significant Local Moran, the spatial lag is sensitive to the presence of outliers. This may pull the average up (or down), even when many of the neighbors do not have high (low) values, creating a potentially misleading impression of a cluster or outlier.
An alternative can be based loosely on the idea of a median smoother (e.g., Wall and Devine 2000). In the latter, the value at a location (typically a rate) is replaced by the median of the neighboring locations. In the interpretation here, the median of the neighbors is used in the place of the average as a median spatial lag.
Consequently, the Median Local Moran becomes: \[I_{i}^M = z_i \times \mbox{med}(z_j, j \in N_i),\] where \(N_i\) is the neighbor set of location \(i\) (i.e., those locations for which \(w_{ij} \neq 0\)).
Inference and interpretation are identical to that for the original Local Moran.
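The difference between the two spatial lags can be illustrated with a minimal sketch in Python with NumPy. This is not the GeoDa implementation: the function names, the neighbor lists and the toy values are all hypothetical, and every location is assumed to have at least one neighbor. Location 0 has two low neighbors and one extreme high neighbor, so the average lag is pulled above the mean while the median lag stays below it.

```python
import numpy as np

def standardize(y):
    """Return z-scores (mean 0, standard deviation 1) of a 1-D array."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean()) / y.std()

def local_moran_mean_lag(z, neighbors):
    """z_i times the average of the neighbors' z-values (average spatial lag)."""
    return np.array([z[i] * z[nbrs].mean() for i, nbrs in enumerate(neighbors)])

def local_moran_median_lag(z, neighbors):
    """z_i times the median of the neighbors' z-values (median spatial lag)."""
    return np.array([z[i] * np.median(z[nbrs]) for i, nbrs in enumerate(neighbors)])

# Hypothetical toy data: location 0 has two low neighbors (1 and 2)
# and one extreme neighbor (3). Neighbor lists are symmetric.
y = np.array([2.0, 1.0, 1.5, 30.0, 2.5, 3.0])
neighbors = [[1, 2, 3], [0, 2], [0, 1], [0, 4], [3, 5], [4]]

z = standardize(y)
i_mean = local_moran_mean_lag(z, neighbors)
i_median = local_moran_median_lag(z, neighbors)

# For location 0 the two statistics have opposite signs:
# the average lag suggests a spatial outlier, the median lag a cluster.
print("average lag:", i_mean[0], "median lag:", i_median[0])
```

In this toy case the single extreme neighbor is enough to flip the sign of the statistic at location 0 when the average is used, which is the leverage effect the median spatial lag is meant to dampen.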
Implementation
As shown in Figure 1, the Median Local Moran is invoked as the second item in the first group of the Cluster Maps drop down list from the toolbar, or as Space > Univariate Median Local Moran’s I from the menu.
This brings up the usual variable selection dialog, as we saw in the previous Chapter. All the options are the same as for the Local Moran, i.e., the randomization, significance filter and saving of the results. We refer to that discussion for details.
Comparing Local Moran to Median Local Moran
To compare the results between the traditional Local Moran and the more robust Median Local Moran, we put the respective significance maps and cluster maps side by side. We use the Guerry data set with the Donatns variable in this illustration.
In Figure 2, we have the significance map for the Median Local Moran on the left, compared to the traditional Local Moran on the right. Both results are for 99,999 random permutations and use p=0.05 as the significance level, with queen contiguity as the spatial weights.
The Median Local Moran has slightly fewer significant locations, with 25 compared to 29 for the traditional Local Moran. Also, there is no longer a location that achieves a p-value of 0.0001. The changes work in both directions. Several locations that are significant for the Local Moran are no longer significant in the median version. However, especially in the group of locations in the center-west, there is a small number of newly significant locations.
The respective cluster maps are compared side by side in Figure 3, with again the results for the Median Local Moran in the left panel. Now, we see more clearly what is going on. The main effect seems to be on the former spatial outliers (one High-Low and one Low-High), which are no longer significant. Recall the discussion in the previous Chapter about the potential leverage effect of some outliers. Using the median spatial lag effectively removes that influence. One of the spatial outliers remains significant however, strengthening our identification of that observation as an interesting location. The main effect on the clusters is a shrinkage of the number of Low-Low observations and an increase in the size of the center-west High-High cluster.
Overall, a comparison of the results for the Median Local Moran to the traditional Local Moran provides insight into the sensitivity of the results to potential outliers. It should be part of a standard sensitivity analysis, together with an assessment of different p-value cut-offs.
Differential Local Moran
Principle
The Differential Local Moran statistic is the local counterpart to the differential Moran scatter plot, discussed in an earlier Chapter. Instead of using the observations on a variable at two different time periods separately, this statistic is based on the change over time, i.e., the difference between \(y_t\) and \(y_{t-1}\). Note that this is the actual difference and not the absolute difference, so that a positive change will be viewed as high, and a negative change as low. The differences are used in standardized form, i.e., they are not the differences between the standardized variable at two points in time, but the standardized differences between the original values for the variable.
The formal expression for this statistic follows the same logic as before, and consists of the cross product of the difference between \(y_t\) and \(y_{t-1}\) at \(i\) with the associated spatial lag:
\[I_{i}^D = c (y_{i,t} - y_{i,t-1}) \sum_j w_{ij} (y_{j,t} - y_{j,t-1}).\] The scaling constant \(c\) can be ignored. In essence, this is the same as the traditional Local Moran applied to the difference, but our implementation is based on selecting the two variables separately.
As before, inference is based on conditional permutation. All the usual caveats hold about multiple comparisons and the choice of a p-value. In all respects, the interpretation is the same as for the traditional Local Moran.
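As a rough illustration of the statistic, the sketch below computes it in Python with NumPy for a hypothetical line of four locations. The helper names and the toy values are assumptions made for this example, the scaling constant \(c\) is set to one, and the weights are row-standardized. Note that the raw difference is taken first and standardized afterwards, as described above.

```python
import numpy as np

def row_standardize(adjacency):
    """Divide each row of a binary adjacency matrix by its row sum."""
    A = np.asarray(adjacency, dtype=float)
    return A / A.sum(axis=1, keepdims=True)

def differential_local_moran(y_t, y_prev, W):
    """Local Moran of the standardized difference y_t - y_prev (c = 1).

    Returns the statistic and the raw (unstandardized) difference.
    """
    d = np.asarray(y_t, dtype=float) - np.asarray(y_prev, dtype=float)
    z = (d - d.mean()) / d.std()      # standardize the difference itself
    return z * (W @ z), d

# Hypothetical toy example: four locations on a line (0-1-2-3).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
W = row_standardize(A)
y_prev = np.array([1.0, 2.0, 3.0, 4.0])
y_t = np.array([3.0, 4.0, 3.0, 3.0])

i_d, diff = differential_local_moran(y_t, y_prev, W)
print("change:", diff)
print("differential local Moran:", i_d)
```

Here the two locations with the largest increase are neighbors, as are the two with no change or a decrease, so all four values of the statistic are positive.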
Implementation
In order to provide some reference, we first create a traditional Local Moran cluster map for the variable hr in 90 in the time enabled natregimes data set (see Preliminaries). The result, for the default of 999 permutations and p=0.05 is shown in Figure 4.
The differential local Moran functionality is invoked from the Cluster Maps toolbar icon, as shown in Figure 1, or from the menu as Space > Differential Local Moran’s I.
Next follows a variable selection dialog, shown in Figure 5, which is slightly different from the standard interface, but the same as for the differential Moran scatter plot.
First, the variable of interest is selected (here, hr), and then the two time periods are chosen (here, 90 and 80). Note that the system is agnostic about the actual time periods, so that any combination can be selected. The statistic is computed for the difference between the time period specified as the first item and that given as the second item. In our example, the spatial weights are natregimes_q.
As before, a choice is presented between a Significance Map, Cluster Map, and Moran Scatter Plot, with just the Cluster Map selected as the default. With the default setting, the result is based on 999 permutations with a p-value of 0.05, as shown in Figure 6.
All the options, such as randomization and significance filter are the same as for the univariate local Moran and will not be further discussed here. There is a slight difference in how the results are saved, which is illustrated below.
Interpretation
The first aspect of the results is a much smaller number of significant locations compared to the standard cluster map (compare to Figure 4). The result here gives the locations where the change in the variable over time is matched by similar/dissimilar changes in the surrounding locations. It is important to keep in mind that the focus is on change, and there is no direct connection to whether this change is from high or from low values.
Two situations can be distinguished, depending on whether the change variable takes on both positive and negative values, or when all the changes are of the same sign (i.e., all observations either increase or decrease over time).
When both positive and negative change values are present, the High-High locations will tend to be locations with a large increase (positive change), surrounded by locations with similar large increases. The Low-Low locations will be observations with a large decrease (negative change), surrounded by locations with similar large decreases. Spatial outliers will be locations where an increase is surrounded by a decrease and vice versa.
When all changes are of the same sign, the interpretation of High-High and Low-Low depends on the sign. Due to the standardization, large positive values will be considered high (above the mean), whereas large negative values will be labeled low (below the mean). This should be kept in mind when interpreting the results.
Saving the results
Similar to the functionality for the differential Moran scatter plot, the Save Results option includes an item to save the actual change variable (in raw form, not in standardized form). In the dialog, this corresponds to the Diff Values item, as shown in Figure 7. The other options are the same as for all local spatial autocorrelation statistics, i.e., the value of the statistic, cluster type, and p-value.
Once the difference is saved as a separate variable, it can be used in a standard univariate Local Moran operation.
Local Moran with EB Rate
Principle
The last of the extensions of the local Moran’s I pertains to the special case where the variable of interest is a rate or proportion. As discussed for the Moran scatter plot, the resulting variance instability can cause problems for the Moran statistic. The EB standardization suggested by Assunção and Reis (1999) for the global case can be extended to the local statistic in a straightforward manner. The statistic has the usual form, but is computed for the standardized rates, \(z\).
\[I_{i}^{EB} = c z_i \sum_j w_{ij} z_j.\]
The standardization of the raw rate \(r_i\) is the same as before, and is repeated here for completeness (for a more detailed discussion, see the relevant Chapter):
\[z_i = \frac{r_i - \beta}{\sqrt{\alpha + (\beta/P_i)}}\] with \(\beta\) as an estimate of the mean and the denominator as an estimate of the standard error.^{2}
All inference and interpretation is the same as for the univariate case and is not further pursued here.
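The standardization and the resulting statistic can be sketched in a few lines of Python with NumPy. The estimates of \(\beta\) and \(\alpha\) follow the recap given in the footnote to the formula, with \(P\) taken to be the total population at risk (an assumption, consistent with \(P/n\) being the average population). The function name, the toy counts and the line-shaped weights are hypothetical, and \(c\) is set to one.

```python
import numpy as np

def eb_standardize(events, population):
    """EB standardized rates z_i = (r_i - beta) / sqrt(alpha + beta / P_i)."""
    O = np.asarray(events, dtype=float)
    P = np.asarray(population, dtype=float)
    n = P.size
    P_total = P.sum()                      # assumed meaning of P in the footnote
    r = O / P                              # raw rates
    beta = O.sum() / P_total               # overall rate, estimate of the mean
    alpha = (P * (r - beta) ** 2).sum() / P_total - beta / (P_total / n)
    alpha = max(alpha, 0.0)                # a negative estimate is set to zero
    return (r - beta) / np.sqrt(alpha + beta / P)

# Hypothetical toy data: event counts and populations at risk for four areas.
events = np.array([2.0, 30.0, 5.0, 60.0])
population = np.array([100.0, 2000.0, 150.0, 5000.0])

# Row-standardized weights for four locations on a line (0-1-2-3).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=1, keepdims=True)

z = eb_standardize(events, population)
i_eb = z * (W @ z)                         # local statistic with c = 1
print("EB standardized rates:", z)
print("EB local Moran:", i_eb)
```

Because the denominator grows as \(P_i\) shrinks, a given deviation from the overall rate counts for less in a small population than in a large one.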
Implementation and interpretation
The local Moran functionality for standardized rates is invoked as the last item in the Moran group on the Cluster Maps toolbar icon, shown in Figure 1. Alternatively, it can be selected from the menu as Space > Local Moran’s I with EB Rate.
Since the rate standardization is computed as part of the operation, the variable selection interface is similar to that used for rate maps. With the time enabled variables, we need to make sure the Time is synchronized so that both the numerator (the events) and the denominator (the population at risk) pertain to the same period (here, 90). In our example, shown in Figure 8, we take the Event Variable as homicide counts, hc(90), and the Base Variable as population, po(90). As before, we use the queen contiguity, natregimes_q.
Again, we can choose between a Significance Map, a Cluster Map and the Moran Scatter Plot options. With the default settings (cluster map, 999 permutations and p = 0.05), the resulting map is as in Figure 9.
Compared to the cluster map for the raw rates in Figure 4, we observe major differences in the LowLow clusters in the upper midwest. Typically, the greater the variation among values for the base variable (population at risk), especially when the latter is small, the more the two maps will tend to differ. However, when the base population is more or less equal by design (e.g., for census tracts in the U.S.), there is little gain from using the EB rates.
Saving the results
Here again, the Save Results option includes an item to save the actual EB rate (this is identical to the EB Rate standardization that can be computed with the Table Calculator option). In the dialog, this corresponds to the EB Rates item, as shown in Figure 10. The other options are the same as for all local spatial autocorrelation statistics, i.e., the value of the statistic, cluster type, and p-value.
Local Geary
Principle
The Local Geary statistic, first outlined in Anselin (1995), and further elaborated upon in Anselin (2019), is a Local Indicator of Spatial Association (LISA) that uses a different measure of attribute similarity. As in its global counterpart, the focus is on squared differences, or, rather, dissimilarity. In other words, small values of the statistic suggest positive spatial autocorrelation, whereas large values suggest negative spatial autocorrelation.
The Geary c statistic of spatial autocorrelation (Geary 1954) takes on the following form: \[c = \frac{\sum_i \sum_j w_{ij}(x_i - x_j)^2/2S_0}{\sum_i (x_i - \bar{x})^2 / (n-1)},\] with \(S_0 = \sum_i \sum_j w_{ij}\), and where the \(x\) in the numerator do not need to be in standardized form, due to the squared difference. The statistic has a mean value of 1 under the null hypothesis of spatial randomness. Significant values less than 1 indicate positive spatial autocorrelation and values larger than 1 negative spatial autocorrelation.
After controlling for the parts in the expression that do not change with \(i\), a local version of the statistic can be found as (for technical details, see Anselin 1995): \[LG_i = \sum_j w_{ij}(x_i - x_j)^2,\] in the usual notation. Again, because of the squared difference, there is no need to standardize \(x\).
Closer examination reveals that this statistic consists of a weighted sum of the squared distance in attribute space for the geographical neighbors of observation \(i\). Since there is no cross-product involved, there is no direct relation to linear similarity. In other words, since the Local Geary uses a different criterion of attribute similarity, it may detect patterns that escape the Local Moran, and vice versa.
As for the Local Moran, analytical inference is based on an approximation and generally not very reliable. Instead, the same conditional permutation procedure as for the Local Moran is implemented. The results are interpreted in the same way, with the caveat regarding the p-values and the notion of significance.
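The statistic and the conditional permutation logic can be sketched as follows in Python with NumPy. This is a simplified illustration and not the GeoDa code: the function names and toy data are hypothetical, and the pseudo p-value convention used here (the smaller of the two tail counts plus one, divided by the number of permutations plus one) is an assumption. In each permutation the value at location \(i\) is held fixed and the values for its neighbors are drawn at random, without replacement, from the remaining observations.

```python
import numpy as np

def local_geary(x, W):
    """LG_i = sum_j w_ij (x_i - x_j)^2 for every location i."""
    x = np.asarray(x, dtype=float)
    squared_diff = (x[:, None] - x[None, :]) ** 2
    return (W * squared_diff).sum(axis=1)

def conditional_permutation(x, W, n_perm=999, seed=0):
    """Pseudo p-values and reference means under conditional permutation."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    observed = local_geary(x, W)
    p_values = np.empty(x.size)
    ref_means = np.empty(x.size)
    for i in range(x.size):
        weights = W[i, W[i] != 0]            # weights of the neighbors of i
        others = np.delete(x, i)             # x_i itself is held fixed
        ref = np.empty(n_perm)
        for r in range(n_perm):
            draw = rng.choice(others, size=weights.size, replace=False)
            ref[r] = np.sum(weights * (x[i] - draw) ** 2)
        # Assumed convention: fold the two tails, then (M + 1) / (R + 1).
        extreme = min(np.sum(ref <= observed[i]), np.sum(ref >= observed[i]))
        p_values[i] = (extreme + 1) / (n_perm + 1)
        ref_means[i] = ref.mean()
    return observed, p_values, ref_means

# Hypothetical toy example: four locations on a line, row-standardized weights.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=1, keepdims=True)
x = np.array([0.0, 1.0, 3.0, 6.0])

lg, p, ref_mean = conditional_permutation(x, W, n_perm=99)
print("Local Geary:", lg)
```

With only four observations the reference distribution is far too coarse for meaningful inference. The example only shows the mechanics.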
Clusters and spatial outliers
The interpretation of significant locations in terms of the type of association is not as straightforward for the Local Geary as it was for the Local Moran. In essence, this is because the attribute similarity is not a cross-product and thus has no direct correspondence with the slope in a scatter plot. Nevertheless, we can use the linking capability within GeoDa to make an incomplete classification.
Those locations identified as significant and with the Local Geary statistic smaller than its mean, suggest positive spatial autocorrelation (small differences imply similarity). For those observations that can be classified in the upper-right or lower-left quadrants of a matching Moran scatter plot, we can identify the association as High-High or Low-Low. However, given that the squared difference can cross the mean, there may be observations for which such a classification is not possible. We will refer to those as other positive spatial autocorrelation.
For negative spatial autocorrelation (large values imply dissimilarity), it is not possible to assess whether the association is between High-Low or Low-High outliers, since the squaring of the differences removes the sign.
We will illustrate this further below.
Implementation
We return to using the Guerry data set for our illustration.
In the same way as for the Local Moran, the Local Geary can be invoked from the Cluster Maps toolbar icon, as the first item in the fourth block in Figure 1. Alternatively, it can be started from the main menu, as Space > Univariate Local Geary.
The subsequent step is the same as before, bringing up the Variable Settings dialog that contains the names of the available variables as well as the spatial weights. Everything operates in the same way for all local statistics, so we will not dwell on those aspects here. We again select Donatns as the variable, with guerry_85_q as the queen contiguity weights.
The following dialog offering different window options is slightly different, in that there is no Moran scatter plot option. The only options are for the Significance Map and the Cluster Map. The default is that only the latter is checked, as in Figure 11.
After selecting the Significance Map option as well, the OK button generates two maps, using a default p-value of 0.05 and 999 permutations, as shown in Figures 12 and 13. In our example, there are 28 significant locations, highlighted in Figure 12.
As discussed above, some of the locations with a positive spatial autocorrelation can be distinguished between the High-High and Low-Low cases. As shown in Figure 13, there are 9 such High-High locations and 17 Low-Low locations. There are no locations with positive spatial autocorrelation classified as other in this case. There are two observations with negative spatial autocorrelation, although, as discussed, it is not possible to characterize the type of spatial outliers they correspond with.
All the options operate the same for all local statistics, including the randomization setting, the selection of significance levels, the selection of cores and neighbors, the conditional map option, as well as the standard operations of setting the selection shape and saving the image.
Below, we only discuss the interpretation and how to save the results, which differ slightly from the standard case.
Interpretation and significance
Before proceeding further, we change the randomization option to 99,999 permutations. This results in minor changes in the cluster map, with two of the marginal (i.e., only significant at p < 0.05) Low-Low locations removed. As a result, there are now 26 significant locations.
Clusters and spatial outliers
To illustrate the rationale behind the classification of the local clusters, we link the locations identified as High-High with a matching Moran scatter plot. As usual, we select the observations in question by clicking on the red rectangle in the legend next to High-High. This highlights the corresponding locations in the cluster map (the other locations become more transparent) and simultaneously selects the matching points in the Moran scatter plot. As illustrated in Figure 14, the type of association is between locations above the mean and a spatial lag that is also above the mean, which we have characterized as High-High.
Similarly, selecting the Low-Low cluster cores (click the orange rectangle in the legend) shows the corresponding points in the lower-left quadrant of the Moran scatter plot in Figure 15.
In our example, all cluster locations can be classified by means of the Moran scatter plot. However, typically, this is not the case and some observations have to be classified as other. Those are locations surrounded by neighbors that are similar (small squared difference), but they may be located on different sides of the mean (e.g., a value slightly above the mean and a neighbor slightly below the mean).
For negative spatial autocorrelation, there is no unambiguous classification, since the squared differences eliminate the sign of the dissimilarity between an observation and its neighbors. The corresponding points in the Moran scatter plot are not informative, as shown in Figure 16.
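The classification logic can be summarized in a short sketch in Python with NumPy. It is a hypothetical helper, not the GeoDa implementation. It assumes that the mean against which the statistic is compared is supplied for each location (for example, the mean of a permutation reference distribution), that the Moran scatter plot quadrant is given by the signs of the standardized value and its spatial lag, and that a location is significant when its pseudo p-value does not exceed the cut-off. The integer codes follow those listed for the saved cluster indicator in the Saving the results section (0 not significant, 1 High-High, 2 Low-Low, 3 other positive, 4 negative).

```python
import numpy as np

def classify_local_geary(lg, lg_mean, z, lag, p, cutoff=0.05):
    """Assign Local Geary cluster codes 0-4 to each location.

    lg      : Local Geary statistics
    lg_mean : mean to compare against (assumed: reference distribution mean)
    z, lag  : standardized value and its spatial lag (Moran scatter plot axes)
    p       : pseudo p-values
    """
    lg, lg_mean, z, lag, p = map(np.asarray, (lg, lg_mean, z, lag, p))
    codes = np.zeros(lg.size, dtype=int)          # 0: not significant
    significant = p <= cutoff
    positive = significant & (lg < lg_mean)       # small differences: similarity
    negative = significant & (lg >= lg_mean)      # large differences: dissimilarity
    codes[positive] = 3                           # other positive, unless refined below
    codes[positive & (z > 0) & (lag > 0)] = 1     # upper-right quadrant: High-High
    codes[positive & (z < 0) & (lag < 0)] = 2     # lower-left quadrant: Low-Low
    codes[negative] = 4                           # sign is lost, no High-Low / Low-High
    return codes

# Made-up values for five locations, one for each category.
lg = [0.1, 0.2, 0.1, 5.0, 1.0]
lg_mean = [1.0, 1.0, 1.0, 1.0, 1.0]
z = [1.2, -0.8, 0.1, 2.0, 0.5]
lag = [0.9, -0.5, -0.1, -1.0, 0.4]
p = [0.01, 0.02, 0.03, 0.01, 0.40]

codes = classify_local_geary(lg, lg_mean, z, lag, p)
print(codes)
```

The third location shows the other case: the squared difference is small, but the value and its spatial lag fall on opposite sides of the mean.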
Changing the significance threshold
With 99,999 permutations, the significance map allows for a much finer grained assessment of significance. In our example, in Figure 17, 14 locations are significant at 0.05, 10 at 0.01, and one each for 0.001 and 0.00001. Note that there is some correspondence between the Local Moran and the Local Geary cluster maps, but there is by no means a perfect match. Specifically, while the most significant locations are in the same region (the South of France), the location with p < 0.00001 found here is not the same as the one identified for the Local Moran, but a neighbor.
In the same way as for the Local Moran statistic, we can manipulate the Significance Filter to assess the sensitivity of the identified clusters and spatial outliers to the choice of the cut-off point. For example, in Figure 18, with p < 0.01, there are six High-High and six Low-Low cluster cores, but there is no longer any evidence of spatial outliers.
In this particular case, the Bonferroni bound and the FDR yield the same cutoff value of 0.00012, with only one significant location, highlighted in Figure 19.
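A small Python sketch with NumPy shows why the two criteria can coincide. It assumes an overall \(\alpha\) of 0.01 (an assumption, chosen because 0.01/85 rounds to the reported 0.00012 for the 85 Guerry observations) and takes the FDR to be the Benjamini-Hochberg style step-up rule, in which the sorted pseudo p-values \(p_{(i)}\) are compared to \(i \times \alpha / n\). The function names and the p-values are made up for illustration.

```python
import numpy as np

def bonferroni_cutoff(alpha, n):
    """Bonferroni bound: alpha divided by the number of comparisons."""
    return alpha / n

def fdr_cutoff(p_values, alpha):
    """Largest i * alpha / n such that the i-th smallest p-value is below it.

    Returns None when no p-value satisfies its threshold.
    """
    p_sorted = np.sort(np.asarray(p_values, dtype=float))
    n = p_sorted.size
    thresholds = np.arange(1, n + 1) * alpha / n
    passing = np.nonzero(p_sorted <= thresholds)[0]
    if passing.size == 0:
        return None
    return thresholds[passing.max()]

alpha = 0.01          # assumed overall significance level
n = 85                # number of observations in the Guerry data set

# Made-up pseudo p-values: one very small value, everything else much larger.
p_values = np.full(n, 0.5)
p_values[0] = 0.00001
p_values[1] = 0.001

print("Bonferroni:", bonferroni_cutoff(alpha, n))
print("FDR:", fdr_cutoff(p_values, alpha))
```

When only the smallest p-value passes its threshold, the FDR cut-off is \(1 \times \alpha / n\), which is exactly the Bonferroni bound.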
Saving the results
We can again add selected statistics to the data table by means of the Save Results option. As before, the dialog gives the option to save the statistic itself, the cluster indication and the significance, as shown in Figure 20. Default values for the variable names are suggested, but these will typically need to be customized.
The code for the cluster classification used for the Local Geary is 0 for not significant, 1 for a High-High cluster core, 2 for a Low-Low cluster core, 3 for other (positive spatial autocorrelation), and 4 for negative spatial autocorrelation.
As always, any addition to the data table is only made permanent after a Save operation.
Getis-Ord Statistics
Principle
An early class of statistics for local spatial autocorrelation was suggested by Getis and Ord (1992), and further elaborated upon in Ord and Getis (1995). It is derived from a point pattern analysis logic. In its earliest formulation the statistic consisted of a ratio of the number of observations within a given range of a point to the total count of points. In a more general form, the statistic is applied to the values at neighboring locations (as defined by the spatial weights). There are two versions of the statistic. They differ in that one takes the value at the given location into account, and the other does not.
The \(G_i\) statistic consists of a ratio of the weighted average of the values in the neighboring locations, to the sum of all values, not including the value at the location (\(x_i\)). \[G_i = \frac{\sum_{j \neq i} w_{ij} x_j}{\sum_{j \neq i} x_j}\]
In contrast, the \(G_i^*\) statistic includes the value \(x_i\) in both numerator and denominator: \[G_i^* = \frac{\sum_j w_{ij} x_j}{\sum_j x_j}.\] Note that in this case, the denominator is constant across all observations and simply consists of the total sum of all values in the data set. The statistic is the ratio of the average values in a window centered on an observation to the total sum of observations.
The interpretation of the Getis-Ord statistics is very straightforward: a value larger than the mean (or, a positive value for a standardized z-value) suggests a High-High cluster or hot spot, a value smaller than the mean (or, negative for a z-value) indicates a Low-Low cluster or cold spot. In contrast to the Local Moran and Local Geary statistics, the Getis-Ord approach does not consider spatial outliers.^{3}
Inference can be derived from an analytical approximation, as given in Getis and Ord (1992) and Ord and Getis (1995). However, as for the Local Moran and Local Geary, such approximation may not be reliable in practice. Instead, conditional random permutation can be employed, using an identical procedure as for the other statistics.
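The two ratios can be written out in a few lines of Python with NumPy. The sketch below is a hypothetical illustration, not the GeoDa code. For \(G_i^*\) it assumes that the window is formed by adding location \(i\) to its own neighbor set before row-standardizing, and the `row_standardize` flag mirrors the choice between row-standardized and binary weights.

```python
import numpy as np

def getis_ord(x, adjacency, star=False, row_standardize=True):
    """Return G_i (star=False) or G_i* (star=True) for every location."""
    x = np.asarray(x, dtype=float)
    A = np.array(adjacency, dtype=float)
    # G_i excludes the location itself, G_i* includes it in the window.
    np.fill_diagonal(A, 1.0 if star else 0.0)
    W = A / A.sum(axis=1, keepdims=True) if row_standardize else A
    numerator = W @ x
    # Denominator: total sum for G_i*, total sum without x_i for G_i.
    denominator = x.sum() if star else x.sum() - x
    return numerator / denominator

# Hypothetical toy example: four locations on a line (0-1-2-3).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = np.array([1.0, 2.0, 3.0, 10.0])

g = getis_ord(x, A)
g_star = getis_ord(x, A, star=True)
print("G_i :", g)
print("G_i*:", g_star)
```

With all values positive and row-standardized weights, both statistics are positive ratios smaller than one, and the denominator of \(G_i^*\) is the same for every location.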
Implementation
The implementation of the Getis-Ord statistics is largely identical to that of the other local statistics. Each statistic can be selected from the second group in the drop down menu generated by the Cluster Maps toolbar icon. Alternatively, they can be invoked from the menu as Space > Local G or Space > Local G*.
The next step brings up the Variable Settings dialog. We continue with the Guerry example and again select Donatns as the variable, with guerry_85_q as the queen contiguity weights.
This is followed by a choice of windows to be opened. The latter is again slightly different from the previous cases. The default, shown in Figure 21, is to use row-standardized weights and to generate only the Cluster Map. The Significance Map option needs to be invoked explicitly by checking the corresponding box.
The Getis-Ord statistics also allow the use of binary weights (i.e., not row-standardized), by having the row-standardized weights box unchecked. In practice, the results rarely differ much.
Using the default settings of 999 permutations, with p at 0.05, yields the significance map for the \(G_i\) statistic shown in Figure 22.
The corresponding cluster map, illustrated in Figure 23, shows 10 High-High cluster cores or hot spots (in red on the map), and 19 Low-Low cluster cores or cold spots (in blue on the map). Note that these are the exact same locations as identified for the Local Moran, except that the spatial outliers are now classified as part of the clusters (one in the High-High group and one in the Low-Low group).
In this particular example, the cluster map for the \(G_i^*\) statistic, shown in Figure 24, gives the identical results. This is often the case, but not always, so there is a point in computing both statistics.
For the Getis-Ord statistics, all the same options are available as for the Local Moran and the Local Geary statistics, and we refer to those discussions for details.
Interpretation and significance
In the same way as for the other statistics, changing the number of permutations to 99,999 provides a more detailed insight into the importance of the different locations, as indicated in the significance map. While 21 locations are deemed to be significant for 0.05, there are only four such locations for 0.01, three for 0.001 and one for 0.00001. This is the exact same result as for the Local Moran, illustrated in Figure 25 for the \(G_i\) statistic (the results for the \(G_i^*\) statistic are the same).
Using the Significance Filter, we can assess the effect of a change of critical p-value to 0.01. In Figure 26, only one High-High cluster core remains, whereas the Low-Low cluster is reduced to seven observations.
As shown in Figure 27, the FDR criterion further reduces the number of significant locations to three in the South of the country. These are the same three locations also identified by the Local Moran statistic.
The result for the Bonferroni bound is again the same as for the Local Moran, with only one significant location (see the previous Chapter).
Saving the results
The Save Results option makes it possible to add the statistics and their characteristics to the data table. As shown in Figure 28, three options are available: the statistic itself (either \(G_i\) or \(G_i^*\)), the associated cluster category and pseudo p-values.
For the Getis-Ord statistics, there are only three cluster categories, with observations taking the value of 0 for not significant, 1 for a High-High cluster, and 2 for a Low-Low cluster.
As always, the addition of the new variables to the table is made permanent by a save operation.
References
Anselin, Luc. 1995. “Local Indicators of Spatial Association — LISA.” Geographical Analysis 27: 93–115.
———. 2019. “A Local Indicator of Multivariate Spatial Association, Extending Geary’s c.” Geographical Analysis 51 (2): 133–50.
Assunção, Renato, and Edna A. Reis. 1999. “A New Proposal to Adjust Moran’s I for Population Density.” Statistics in Medicine 18: 2147–61.
Geary, R. 1954. “The Contiguity Ratio and Statistical Mapping.” The Incorporated Statistician 5: 115–45.
Getis, Arthur, and J. Keith Ord. 1992. “The Analysis of Spatial Association by Use of Distance Statistics.” Geographical Analysis 24: 189–206.
Ord, J. Keith, and Arthur Getis. 1995. “Local Spatial Autocorrelation Statistics: Distributional Issues and an Application.” Geographical Analysis 27: 286–306.
Wall, Patrick, and Owen Devine. 2000. “Interactive Analysis of the Spatial Distribution of Disease Using a Geographic Information System.” Journal of Geographical Systems 2 (3): 243–56.

University of Chicago, Center for Spatial Data Science – anselin@uchicago.edu

To recap, \(\beta = \sum_i O_i / \sum_i P_i\), where \(O_i\) is the number of events at \(i\) and \(P_i\) is the population at risk. The estimate of \(\alpha = [\sum_i P_i ( r_i - \beta )^2 ] / P - \beta / ( P / n)\), with \(P = \sum_i P_i\) as the total population at risk and \(n\) as the total number of observations, such that \(P/n\) is the average population. Note that the estimate of \(\alpha\) can be negative, in which case it is set to zero.

When all observations for a variable are positive, as is the case in our examples, the G statistics are positive ratios less than one. Large ratios (more precisely, less small values since all ratios are small) correspond with High-High hot spots, small ratios with Low-Low cold spots.