The Data Commons Graph aggregates data from many different data sources into a single knowledge graph. Data Commons is based on the data model used by schema.org; for more information, see our guide to the data model.
The Data Commons API is a set of APIs that allow developers to programmatically access the data in the Data Commons graph. This access is provided through a set of REST APIs, with additional wrappers in Python and R.
Our APIs can be roughly grouped into the following:
Local Node Exploration: Given a node (or set of nodes), explore the graph around those node(s).
Domain-specific APIs: Groups of APIs specific to particular domains, e.g., places and statistics.
Graph Query/SPARQL: Given a subgraph where some of the nodes are variables, retrieve possible matches. This corresponds to a subset of the graph query language SPARQL.
Utilities: Python notebook-specific APIs that help with Pandas DataFrames, etc.
For each API, in addition to the arguments and return value for the REST calls, we also document language-specific bindings.
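The Graph Query/SPARQL category listed above can be illustrated with a small sketch of subgraph matching. This is a toy in-memory matcher written for illustration, not the Data Commons query API: the `match` function, the `?`-prefixed variable convention, and the sample triples are all assumptions made for this example.

```python
# Toy graph as (subject, property, object) triples. Sample data for illustration only.
TRIPLES = [
    ("geoId/06", "typeOf", "State"),
    ("geoId/06085", "typeOf", "County"),
    ("geoId/06085", "name", "Santa Clara County"),
    ("geoId/06085", "containedInPlace", "geoId/06"),
    ("geoId/06001", "typeOf", "County"),
    ("geoId/06001", "name", "Alameda County"),
    ("geoId/06001", "containedInPlace", "geoId/06"),
    ("geoId/48201", "typeOf", "County"),
    ("geoId/48201", "name", "Harris County"),
    ("geoId/48201", "containedInPlace", "geoId/48"),
]


def is_var(term):
    """Terms starting with '?' are variables (a convention assumed for this sketch)."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Return every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        return [dict(binding)]
    first, rest = patterns[0], patterns[1:]
    results = []
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            results.extend(match(rest, triples, candidate))
    return results


# Subgraph pattern: counties contained in geoId/06, together with their names.
query = [
    ("?county", "typeOf", "County"),
    ("?county", "containedInPlace", "geoId/06"),
    ("?county", "name", "?name"),
]
matches = match(query, TRIPLES)
for row in matches:
    print(row["?county"], row["?name"])
```

Each pattern plays the role of one triple in a SPARQL `WHERE` clause; a variable shared across patterns (here `?county`) forces the matched triples to join on the same node.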
Almost all our APIs take references to nodes and properties as arguments. Every
node (properties are also nodes) has a
Data Commons ID (DCID), which is used
to pass nodes as arguments to API calls. The DCID of schema.org terms used in
Data Commons is their schema.org ID.
Using the Data Commons API requires an API Key. Details on obtaining a key, installing Python libraries, etc. can be found in the API setup guide.
Local Node Exploration
- get_property_labels: given a node, return the
DCIDs of the properties associated with this node. In graph terminology, return the
DCIDs of the arc-labels of the arcs into and out of this node.
- get_property_values: given a node and a property, return the values of this property for that node. In graph terminology, return the targets of the arcs out of this node, or the sources of the arcs into it, that carry that arc label.
- get_triples: given a node, return all the triples in which this node is either the subject/source or object/target.
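The semantics of the three calls above can be sketched against a toy in-memory graph. This is an illustration of the graph terminology only, not the Data Commons client library: the function bodies, the `out` parameter used here to choose arc direction, and the sample triples are assumptions made for this example.

```python
# Toy graph as (subject, property, object) triples. Sample data for illustration only.
TRIPLES = [
    ("geoId/06", "typeOf", "State"),
    ("geoId/06", "name", "California"),
    ("geoId/06", "containedInPlace", "country/USA"),
    ("geoId/06085", "containedInPlace", "geoId/06"),
    ("geoId/06085", "typeOf", "County"),
]


def get_property_labels(dcid, out=True):
    """Return the sorted arc labels out of (out=True) or into (out=False) a node."""
    if out:
        labels = {p for s, p, o in TRIPLES if s == dcid}
    else:
        labels = {p for s, p, o in TRIPLES if o == dcid}
    return sorted(labels)


def get_property_values(dcid, prop, out=True):
    """Return arc targets (out=True) or arc sources (out=False) for one arc label."""
    if out:
        return [o for s, p, o in TRIPLES if s == dcid and p == prop]
    return [s for s, p, o in TRIPLES if o == dcid and p == prop]


def get_triples(dcid):
    """Return every triple in which the node is the subject or the object."""
    return [t for t in TRIPLES if t[0] == dcid or t[2] == dcid]


print(get_property_labels("geoId/06"))
print(get_property_values("geoId/06", "containedInPlace"))
print(get_property_values("geoId/06", "containedInPlace", out=False))
print(len(get_triples("geoId/06")))
```

Note that the same property, `containedInPlace`, yields different results depending on direction: outgoing arcs give the places that contain the node, while incoming arcs give the places the node contains.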
Domain Specific APIs
To simplify access to these statistics, Data Commons provides StatisticalVariables: human-readable identifiers, each of which represents a metric for a place and time and corresponds to a pair of StatisticalPopulation and Observation with some generalization.
- get statistics: given a list of place DCIDs, return a time series of statistical values for the
- get population: given a list of place DCIDs, return the DCID of
StatisticalPopulations for these places, constrained by the given property values.
- get observation: given a list of
StatisticalPopulation DCIDs, return the DCID of
Observations for these statistical populations, constrained by the given observations’ property values.
- get population and observation: given the DCID of a node, return all the
Observations for this node.
- get place observation: return all
Observations for all
Places of a certain type, for a given
observationDate, given a set of constraints on the
Many applications need listings of places of a given type, often within containing areas.