Over the past few months, we’ve continued to incorporate new data into our knowledge graph and develop new tools. Here are some of the highlights:
New Statistical Variable Explorer
As Data Commons has grown, the number of Statistical Variables has increased. With over 300k variables to choose from (and counting!), we wanted to make it easier for you to find the right variables for your analysis. To address this, we added a new tool for exploring Statistical Variables. The tool provides metadata about the observations, places, and provenances we have for each variable.
New Data
Lately, we’ve been focused on building up our inventory of sustainability-related data. Some of our recent imports include:
- Several of the IPCC RCP scenarios (e.g. Max Daily Temperature Based on RCP 8.5 in the US)
- WHO’s Global Health Observatory (e.g. Prevalence (%) of females in the US with BMI of 30 or greater, Percent of rural population in South Africa with at least basic drinking water services, and Percent of urban population in the US with household expenditures on health greater than 10% of total household expenditure or income)
- UN’s Energy Statistics Database (e.g. Annual Generation of Coal in the US)
- EPA’s Greenhouse Gas Reporting Program (e.g. Greenhouse Gas emissions from large facilities in Santa Clara County and in California, as well as EPA reporting facilities such as Anheuser Busch Baldwinsville Brewery and Glen Burnie Landfill)
- Stanford’s DeepSolar (e.g. Count of Solar Installation per capita in California)
We’re also in the process of importing a large number of US Census American Community Survey Subject Tables, which contain detailed demographic data about a variety of topics. For example:
- Count of With Food Stamps in The Past 12 Months, Below Poverty Level in The Past 12 Months per capita
- Count of Single Mother Family Household, Some College or Associate’s Degree
New Import Tool
We’ve made it easier for contributors to add datasets to Data Commons with our new open-source command-line tool. This tool provides linting and detailed stats validation, streamlining our data ingestion process and making it more accessible.
Check out our GitHub repo here.
As always, please feel free to share any feedback.
Thanks!
Natalie on behalf of the Data Commons team