What Data is in Data Commons


While Data Commons (DC) has information about many types of entities (cities, states, countries, schools, companies, facilities, etc.), most of the information today is about places. DC contains a catalog of about 2.9 million places. In addition to basic metadata like location, type, and containment information, many places also have information about their shape, area, etc.

The most common type of information about places is in the form of a time series. Each time series is a set of observations, across a set of time periods, about a combination of a place and a variable (also called a statistical variable) from a particular source. One example is the median income in Berkeley, CA over a period of ten years, according to the US Census Bureau.
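To make the definition above concrete, here is a minimal sketch of how a time series could be modeled: one record per combination of place, variable, and source, holding a mapping from time period to observed value. The class name `TimeSeries`, its fields, and the numbers are illustrative assumptions, not the Data Commons schema or real Census figures.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    """Observations for one (place, variable, source) combination (hypothetical model)."""
    place: str      # e.g. a place name; real systems would use a stable identifier
    variable: str   # the statistical variable being measured
    source: str     # the provider of the observations
    observations: dict[str, float] = field(default_factory=dict)  # period -> value

    def add(self, period: str, value: float) -> None:
        """Record the observed value for one time period."""
        self.observations[period] = value

    def latest(self) -> tuple[str, float]:
        """Return the most recent (period, value) pair.

        Four-digit years and ISO dates sort correctly as plain strings.
        """
        period = max(self.observations)
        return period, self.observations[period]


series = TimeSeries("Berkeley, CA", "Median income", "US Census Bureau")
series.add("2011", 100.0)  # placeholder values, not real data
series.add("2012", 110.0)
print(series.latest())
```

Keying the series on place, variable, and source together means that two providers reporting the same variable for the same place yield two separate series rather than one merged one.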

See the sample queries about Places and Properties of Places to explore further.

Statistical Variables

DC contains over 90,000 distinct variables, a.k.a. ‘statistical variables’, all of which can be explored in the statistical variable explorer tool. Not all places have observations for every one of these variables. For example, the Average Retail Price of Electricity is available for all states in the US but not at the city or county level, whereas the Count of Households with no Health Insurance is available at those levels. The Map and Timeline tools can be used not only for visualizations, but also to determine which variables are available for a given geography. To do so, visit the Map Explorer, type the name of a geographic region like “United States” or “India”, and select an administrative entity level (place type) from the dropdown on the right. The Statistical Variables in the left pane are then filtered to the relevant set.
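The filtering that the Map Explorer performs can be pictured as a lookup from variable to the place types that have observations for it. The sketch below is only an illustration of that idea: the `AVAILABILITY` table and the `variables_for` helper are hypothetical, and the table lists only the levels named in the example above, not the full coverage in DC.

```python
# Hypothetical availability table: variable name -> place types with observations.
# Only the levels mentioned in the text are listed; real coverage may be broader.
AVAILABILITY = {
    "Average Retail Price of Electricity": {"State"},
    "Count of Households with no Health Insurance": {"City", "County"},
}


def variables_for(place_type: str) -> list[str]:
    """Return the variables that have observations at the given place type."""
    return sorted(
        name for name, place_types in AVAILABILITY.items()
        if place_type in place_types
    )


print(variables_for("State"))
print(variables_for("County"))
```

A place type with no matching variables simply yields an empty list, which mirrors an empty left pane in the tool.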

See the sample queries section to experiment with variables.

Data Sources

Different sources often provide data for the same variable. As an example, the sources for a given series can be seen for Life Expectancy in the Statistical Variable Explorer. Life Expectancy data in DC is sourced from the World Bank and the Organisation for Economic Co-operation and Development (OECD), and the same place can have data from both sources. You can select the data source in the Map View or Timeline visualization by clicking on the Select Source button below the chart, for example in the Map View for Life Expectancy in all countries of the world.

See the sample queries about Properties of Places to see how to filter by source. Sources are named, and as the sample queries show, a specific source can be requested when needed.
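As a rough sketch of what specifying a source means, the example below stores several series for the same place and variable, one per named source, and returns the requested one. The `SERIES_BY_SOURCE` table, the `get_series` helper, the place name, and all values are hypothetical placeholders. The fallback used when no source is given (the source with the most observations) is an assumption made for this illustration, not a description of how DC chooses a default.

```python
from typing import Optional

# Hypothetical store: (place, variable) -> {source name -> {period -> value}}.
# All values are placeholders, not real life expectancy figures.
SERIES_BY_SOURCE = {
    ("Example Country", "Life Expectancy"): {
        "World Bank": {"2019": 1.0, "2020": 2.0},
        "OECD": {"2019": 3.0},
    },
}


def get_series(place: str, variable: str, source: Optional[str] = None) -> dict:
    """Return the observations for a place and variable, optionally from one source."""
    candidates = SERIES_BY_SOURCE.get((place, variable), {})
    if source is not None:
        # An explicit source was requested: return it, or nothing if it is absent.
        return candidates.get(source, {})
    if not candidates:
        return {}
    # Assumed fallback: prefer the source with the most observations.
    best = max(candidates, key=lambda name: len(candidates[name]))
    return candidates[best]


print(get_series("Example Country", "Life Expectancy", source="OECD"))
print(get_series("Example Country", "Life Expectancy"))
```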