This dataset contains polygons that correspond to the South London Natural Experiment attribute data compiled by Coleman (2019). Attribute data are available for only 28 subdistricts; the remaining observations were excluded because they had no attribute data. The resulting (now projected) map consists of these 28 adjacent subdistricts, as described below.

The documentation is available at:


The Geoda script is available at:


Overview of data

Variable Description
dis_ID London district ID
district London district
sub_ID London subdistrict ID
subdist London subdistrict
pop1851 Population for 1851
supplier Water company suppliers that served the subdistrict
supplierID Water company supplier ID
perc_sou Proportion of the population that was served by the Southwark & Vauxhall company
perc_lam Proportion of the population that was served by the Lambeth company
perc_other Proportion of the population that was served by a company other than Southwark & Vauxhall or Lambeth
lam_degree Categories for the proportion of the population that was served by the Lambeth company
d_overall Number of deaths attributed to the cholera epidemic in the seven weeks ending August 26, 1854
d_sou Number of deaths attributed to the cholera epidemic in the seven weeks ending August 26, 1854 where the water supply was the Southwark & Vauxhall company
d_lam Number of deaths attributed to the cholera epidemic in the seven weeks ending August 26, 1854 where the water supply was the Lambeth company
d_pump Number of deaths attributed to the cholera epidemic in the seven weeks ending August 26, 1854 where the water supply was pump-wells
d_thames Number of deaths attributed to the cholera epidemic in the seven weeks ending August 26, 1854 where the water supply was the Thames River and ditches
rate_sou7w Southwark & Vauxhall cholera death rate per 10000 people in the seven weeks ending August 26, 1854
rate_lam7w Lambeth cholera death rate per 10000 people in the seven weeks ending August 26, 1854. Missing values are undefined and should not be converted to 0.
rate_oth7w Cholera death rate per 10000 people for ‘other’ category in the seven weeks ending August 26, 1854. Missing values are undefined and should not be converted to 0.
deaths1849 Number of deaths attributed to the cholera epidemic in 1849
deaths1854 Number of deaths attributed to the cholera epidemic in 1854
rate1849 Cholera death rate per 10000 people in 1849
rate1854 Cholera death rate per 10000 people in 1854
pop1849 Population for 1849
pop1854 Population for 1854
rAvSupR_49 Average supplier-region-specific cholera mortality rate per 10000 people in 1849
rAvSupR_54 Average supplier-region-specific cholera mortality rate per 10000 people in 1854
pred_Snow Snow’s cholera death count prediction (from his 1856 Table VI)
pred_DiD49 Cholera death count prediction from difference-in-differences regression analysis for 1849
pred_DiD54 Cholera death count prediction from difference-in-differences regression analysis for 1854
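
The note on missing values in rate_lam7w and rate_oth7w matters in practice, so here is a minimal Python sketch of one way to handle it. It assumes that a rate per 10000 people is the death count divided by the population, times 10000; the documentation above does not state the formula, so treat it as an assumption. The records and the helper names (rate_per_10000, mean_defined) are hypothetical and are not values or functions from the dataset or from Coleman (2019).

```python
from typing import Iterable, Optional


def rate_per_10000(deaths: Optional[float],
                   population: Optional[float]) -> Optional[float]:
    """Deaths per 10000 people, or None when the rate is undefined.

    Assumption: rate = deaths * 10000 / population.
    """
    if deaths is None or population is None or population <= 0:
        return None  # undefined, deliberately NOT 0
    return deaths * 10000 / population


def mean_defined(values: Iterable[Optional[float]]) -> Optional[float]:
    """Average only the defined values; None if nothing is defined."""
    defined = [v for v in values if v is not None]
    return sum(defined) / len(defined) if defined else None


# Hypothetical records that reuse the variable names from the table above.
records = [
    {"sub_ID": 1, "deaths1854": 50, "pop1854": 20000},
    {"sub_ID": 2, "deaths1854": None, "pop1854": 15000},  # no death count
]

for rec in records:
    rec["rate1854"] = rate_per_10000(rec["deaths1854"], rec["pop1854"])

# The undefined rate stays None and is skipped rather than counted as 0.
average_rate = mean_defined(rec["rate1854"] for rec in records)
```

Keeping undefined rates as None, rather than 0, prevents a subdistrict with no recorded value from being read as a subdistrict with no deaths, and keeps it out of averages.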


Coleman, T. (2019). Causality in the Time of Cholera: John Snow as a Prototype for Causal Inference. Working paper, available at SSRN. Data last accessed September 2, 2020.

Prepared by Center for Spatial Data Science. Last updated February 1, 2021. Data provided “as is,” no warranties.