Because there is no single, authoritative source of COVID case data in the United States, we use multiple sources. We include 1P3A data, a crowdsourced dataset from international volunteers that offered the first available county-level data for public use in early March. We also include USAFacts data, a journalistic dataset that made county-level data available in March. In future releases, we may also incorporate county-level reports by NYT and Johns Hopkins University. We recommend checking across multiple data sources and validating against local health department numbers.
At this time, hospitalization data is not available in a single data source for all counties across the United States. If this changes, we will be certain to incorporate it.
At this time, data broken down by race, age, or other demographics is not available in a single data source for all counties across the United States. If this changes, we will be certain to incorporate it. There are several projects we recommend for state-wide race and ethnicity metrics, including the Racial Dashboard by Covidtracking.com and Data 4 Black Lives.
Yes, we recently added 7-Day Average Daily Confirmed Count. We calculate this variable by taking the difference between the current day’s confirmed count and the confirmed count 7 days earlier, and then dividing this difference by 7. For example, we take the difference between the confirmed count as of June 30th and that as of June 23rd, divide the difference by 7, and use the result as the 7-Day Average Daily Confirmed Count for June 30th. Technically speaking, this measures the average daily growth over the week ending June 30th. Also note that this calculation is only available with USAFacts data because of data completeness (1P3A has several days’ data missing in January and February).
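The calculation described above can be sketched in a few lines of Python. This is an illustration only, not the Atlas's own code: the function name `seven_day_average` and the input format (a list of cumulative confirmed counts, one per consecutive day) are assumptions made for the example.

```python
def seven_day_average(cumulative_counts, window=7):
    """Average daily new cases over the preceding `window` days.

    `cumulative_counts` is assumed to hold one cumulative confirmed
    count per consecutive day, oldest first, with no missing days.
    Days that do not yet have a value `window` days earlier get None.
    """
    averages = []
    for i, today in enumerate(cumulative_counts):
        if i < window:
            averages.append(None)  # not enough history yet
        else:
            # (count today - count `window` days ago) / window
            averages.append((today - cumulative_counts[i - window]) / window)
    return averages
```

The requirement of one value per consecutive day is also why a dataset with missing days cannot be used directly: the entry 7 positions back would no longer correspond to 7 calendar days earlier.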
Because there is no single authoritative, validated source for county-level COVID cases and deaths for real-time analysis, we incorporate multiple datasets from multiple projects to allow for comparisons. For more information about each data source, see our Data page.
High-Low & Low-High clusters refer to Outliers. A High-Low area has a high number of cases within the county and a low number of cases in neighboring counties. A Low-High area is the reverse: a low number of cases within the county and a high number in neighboring counties. Either pattern highlights an emerging risk or a priority for containment.
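To make the labels concrete, here is a simplified sketch that compares each county's count, and the average count of its neighbors, against the overall mean. It is not the Atlas's actual method: the function name `quadrant_labels`, the input dictionaries, and the use of the overall mean as the high/low cutoff are all assumptions, and the sketch omits any statistical significance test.

```python
def quadrant_labels(values, neighbors):
    """Label each county as High-High, High-Low, Low-High, or Low-Low.

    `values` maps county -> case count; `neighbors` maps county -> list
    of neighboring counties. Both structures are hypothetical inputs.
    The first word describes the county, the second its neighbors.
    """
    mean = sum(values.values()) / len(values)
    labels = {}
    for county, value in values.items():
        nbrs = neighbors.get(county, [])
        if not nbrs:
            labels[county] = None  # no neighbors, nothing to compare
            continue
        # Average count across neighboring counties
        neighbor_avg = sum(values[n] for n in nbrs) / len(nbrs)
        own = "High" if value > mean else "Low"
        around = "High" if neighbor_avg > mean else "Low"
        labels[county] = f"{own}-{around}"
    return labels
```

In this sketch, the High-Low and Low-High labels are the outliers, since the county and its surroundings fall on opposite sides of the cutoff.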
County-level visualizations show a dramatically more detailed pandemic landscape, while aggregate data alone can miss local hot spots of surging COVID cases. If one only looks at state-level data, a county cluster would have to be extreme to show up, and by then it might be too late for prevention measures to be enacted.
With the fixed bins option, the legend stays the same as you go back in time. This makes it easier to watch the spread of COVID over time. In the other option, the legend changes each day, adjusting for the optimal classification according to how many cases exist for that day.
For a more technical response: both options use a non-linear algorithm to group observations such that within-group homogeneity is maximized. However, the Natural Breaks (fixed bins) option applies the algorithm to the data for the most recent date and reuses the resulting bins to group observations for all historical data. This allows us to better visualize the pandemic's spread over time. The Natural Breaks option, on the other hand, applies the non-linear algorithm separately to each day's data. This maximizes within-group homogeneity for every day, but makes changes over time difficult to visualize.
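The difference between the two options can be sketched as follows. This is an illustration under stated assumptions, not the Atlas's implementation: `natural_breaks` is a naive brute-force stand-in that minimizes the within-class sum of squared deviations (practical only for tiny inputs), and both function names are hypothetical.

```python
from bisect import bisect_left
from itertools import combinations


def _sse(group):
    """Sum of squared deviations from the group mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks(values, k):
    """Upper bounds of the first k-1 classes, chosen by brute force so
    that the total within-class sum of squares is as small as possible."""
    data = sorted(values)
    n = len(data)
    best_cost, best_cuts = None, None
    for cuts in combinations(range(1, n), k - 1):
        edges = (0,) + cuts + (n,)
        cost = sum(_sse(data[a:b]) for a, b in zip(edges, edges[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_cuts = cost, cuts
    return [data[c - 1] for c in best_cuts]


def classify(values, breaks):
    """Class index for each value; a value equal to a break falls in
    the lower class."""
    return [bisect_left(breaks, v) for v in values]
```

With fixed bins, `natural_breaks` would be called once on the most recent day's data and the same breaks passed to `classify` for every earlier day. With the per-day option, `natural_breaks` would be called again for each day, so the class boundaries (and the legend) shift from day to day.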
The Atlas Learning Community is a project by CSI Solutions to provide Atlas super-users, health practitioners, and planners a place to interact. It is moderated by coalition members. Ask questions, provide feedback, and help make the Atlas Coalition stronger!