Data Commons REST API V2
- Overview
- What’s new in V2
- Service endpoints
- Query parameters
- POST requests
- Authentication
- Find available entities, variables, and their DCIDs
- Relation expressions
- URL-encoding reserved characters in GET requests
- Pagination
Overview
The Data Commons REST API enables developers to programmatically access data in the Data Commons knowledge graph over HTTP. You can use it to explore the structure of the graph, integrate statistics from the graph into data analysis applications, and much more.
Following HTTP, a REST API call consists of a request that you provide, and a response from the Data Commons servers with the data you requested, in JSON format. You can use the REST API with any tool or language that supports HTTP. You can make queries on the command line (e.g. using cURL), by scripting HTTP requests in another language like JavaScript, or even by entering an endpoint into your web browser!
What’s new in V2
The V2 API collapses functionality from the V1 API into a smaller number of endpoints, by introducing a syntax for relation expressions, described below. Each API endpoint can also handle both single and bulk requests.
Service endpoints
You make requests through API endpoints. You access each endpoint using its unique URL, which is a combination of a base URL and the endpoint’s URI.
The base URL for all REST endpoints is:
https://api.datacommons.org/VERSION
The current version is v2.
To access a particular endpoint, append the URI to the base URL (e.g. https://api.datacommons.org/v2/node).
The URIs for the V2 API are below:
API | URI path | Description |
---|---|---|
Node | /node | Fetches information about edges and neighboring nodes |
Observation | /observation | Fetches statistical observations |
Resolve entities | /resolve | Returns a Data Commons ID (DCID) for entities in the graph |
SPARQL | /sparql | Returns matches to a SPARQL graph query |
Endpoints for custom instances
If you are running your own Data Commons, the URL/URI endpoints are slightly different:
CUSTOM_URL/core/api/v2
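As a minimal sketch of how a base URL and an endpoint URI combine, the Python helper below joins the two. The function name endpoint_url and the custom host datacommons.example.org are illustrative assumptions, not part of the API.

```python
# Base URL for the hosted Data Commons REST API, version v2 (taken from the text above).
BASE_URL = "https://api.datacommons.org/v2"


def endpoint_url(uri: str, base_url: str = BASE_URL) -> str:
    """Join a base URL and an endpoint URI such as "/node" (hypothetical helper)."""
    return base_url.rstrip("/") + "/" + uri.lstrip("/")


# Endpoint on the hosted API.
print(endpoint_url("/node"))

# For a custom instance the base becomes CUSTOM_URL/core/api/v2.
# "https://datacommons.example.org" is a placeholder host.
print(endpoint_url("/observation", "https://datacommons.example.org/core/api/v2"))
```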
Query parameters
Endpoints take a set of parameters which allow you to specify the entities, variables, timescales, etc. you are interested in. The V2 APIs only use query parameters.
Query parameters are chained at the end of a URL after a ? symbol. Separate multiple parameter entries with an & symbol. For example, this would look like:
https://api.datacommons.org/v2/node?key=API_KEY&nodes=DCID1&nodes=DCID2&property=<-*
Still confused? Each endpoint’s documentation page has examples at the bottom tailored to the endpoint you’re trying to use.
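One possible way to assemble such a query string in Python is sketched below, using only the standard library. The helper name build_get_url is an assumption made for illustration. Passing parameters as a list of pairs lets a name such as nodes repeat, and urlencode also percent-encodes reserved characters in the values (see URL-encoding reserved characters in GET requests).

```python
from urllib.parse import urlencode


def build_get_url(endpoint: str, params: list) -> str:
    """Append (name, value) pairs to an endpoint URL as a query string.

    Hypothetical helper: repeat a name in `params` to pass several values.
    """
    return endpoint + "?" + urlencode(params)


url = build_get_url(
    "https://api.datacommons.org/v2/node",
    [
        ("key", "API_KEY"),   # placeholder, substitute a real key
        ("nodes", "DCID1"),   # placeholder DCIDs
        ("nodes", "DCID2"),
        ("property", "<-*"),
    ],
)
print(url)
# https://api.datacommons.org/v2/node?key=API_KEY&nodes=DCID1&nodes=DCID2&property=%3C-%2A
```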
POST requests
All V2 endpoints allow for POST requests. For POST requests, feed all parameters in JSON format. For example, in cURL, this would look like:
curl -X POST \
  -H "X-API-Key: API_KEY" \
  --url https://api.datacommons.org/v2/node \
  --data '{
    "nodes": ["geoId/06085", "geoId/06086"],
    "property": "->[name, latitude, longitude]"
  }'
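The same POST request could be prepared from Python roughly as follows. This is a sketch using the standard library only. The helper name build_post_request is hypothetical, and the Content-Type header is an assumption that the cURL example above does not include.

```python
import json
import urllib.request


def build_post_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "X-API-Key": api_key,
        # Assumption: declare the body as JSON; not shown in the cURL example.
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_post_request(
    "https://api.datacommons.org/v2/node",
    "API_KEY",  # placeholder, substitute a real key
    {
        "nodes": ["geoId/06085", "geoId/06086"],
        "property": "->[name, latitude, longitude]",
    },
)

# To actually send the request:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Because the parameters travel in the JSON body, the relation expression does not need to be percent-encoded here.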
Authentication
All access to base Data Commons using the REST APIs must be authenticated and authorized with an API key.
We provide a trial API key for general public use. This key will let you try the API and make single requests.
AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI
The trial key has a limited request quota. If you plan to use the APIs more heavily (e.g. for personal or school projects, developing applications, etc.), please request an official key without any quota limits; see Obtain an API key for information.
Note: If you are sending API requests to a custom Data Commons instance, do not include any API key in the requests.
To include an API key, add your API key to the URL as a query parameter by appending ?key=API_KEY.
For GET requests, this looks like:
https://api.datacommons.org/v2/ENDPOINT?key=API_KEY
If the key is not the first query parameter, use &key=API_KEY instead. This looks like:
https://api.datacommons.org/v2/ENDPOINT?QUERY=VALUE&key=API_KEY
For POST requests, pass the key as a header. For example, in cURL, this looks like:
curl -X POST \
  --url https://api.datacommons.org/v2/node \
  --header 'X-API-Key: API_KEY' \
  --data '{
    "nodes": ["ENTITY_DCID_1", "ENTITY_DCID_2", ...],
    "property": "RELATION_EXPRESSION"
  }'
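For GET requests, the choice between ?key= and &key= described above can be sketched as a small Python helper. The name add_api_key is an assumption for illustration. It only checks whether the URL already has a query string and does not validate the key.

```python
def add_api_key(url: str, api_key: str) -> str:
    """Append the key as a query parameter to a GET URL (hypothetical helper).

    Uses "?" when the URL has no query string yet, "&" otherwise.
    """
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}key={api_key}"


print(add_api_key("https://api.datacommons.org/v2/node", "API_KEY"))
# https://api.datacommons.org/v2/node?key=API_KEY

print(add_api_key("https://api.datacommons.org/v2/node?nodes=DCID1", "API_KEY"))
# https://api.datacommons.org/v2/node?nodes=DCID1&key=API_KEY
```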
Find available entities, variables, and their DCIDs
Many requests require the DCID of the entity or variable you wish to query. For tips on how to find relevant DCIDs, entities, and variables, see the Key concepts document.
Relation expressions
Data Commons represents real world entities and data as nodes. These nodes are connected by directed edges, or arcs, to form a knowledge graph. The label of the arc is the name of the property.
Relation expressions use arrow notation and other symbols to represent neighboring nodes, and to support chaining and filtering. These expressions allow all of the functionality of the V1 API to be expressed with fewer API endpoints in V2. All V2 API calls require relation expressions in the property or expression parameter.
The following table describes symbols in the V2 API relation expressions:
Symbol | Description |
---|---|
-> | An outgoing arc |
<- | An incoming arc |
{PROPERTY:VALUE} | Filtering; identifies the property and associated value |
[] | Multiple properties, separated by commas |
* | All properties linked to this node |
+ | One or more expressions chained together for indirect relationships, like containedInPlace+{typeOf:City} |
Incoming and outgoing arcs
Arcs in the Data Commons graph have directions. In the example below, for the node Argentina, the property containedInPlace exists in both the incoming and outgoing directions, as illustrated in the following figure:
Note the directionality of the property containedInPlace: the incoming arc represents “Argentina contains Buenos Aires”, while the outgoing arc represents “Argentina is in South America”.
Nodes for outgoing arcs are represented by ->, while nodes for incoming arcs are represented by <-. To illustrate using the above example:
- Regions that include Argentina (DCID: country/ARG): country/ARG->containedInPlace
- All cities directly contained in Argentina (DCID: country/ARG): country/ARG<-containedInPlace{typeOf:City}
Filters
You can use filters to restrict results to nodes that have a specified property and value. Use {} to enclose the property:value pair that defines the filter. Using the same example, country/ARG<-containedInPlace+{typeOf:City} only returns nodes with typeOf:City, filtering out nodes with typeOf:AdministrativeArea1 and so on.
Specify multiple properties
You can combine multiple properties together within []. For example, to request a few outgoing arcs for a node, use ->[name, latitude, longitude]. See more in this Node API example.
Wildcard
To retrieve all properties linked to a node, use the * wildcard, e.g. <-*. See more in this Node API example.
Chain properties
Use + to express a chain expression. A chain expression requests information about nodes that are connected by the same property but are more than one hop away. This is supported only for the containedInPlace property.
To illustrate again using the Argentina example:
- All cities directly contained in Argentina (DCID: country/ARG): country/ARG<-containedInPlace{typeOf:City}
- All cities indirectly contained in Argentina (DCID: country/ARG): country/ARG<-containedInPlace+{typeOf:City}
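To make the syntax above concrete, here is a minimal Python sketch that assembles relation expressions from their parts. The function relation_expression and its parameter names are hypothetical, the helper is not part of any Data Commons library, and it supports only a single property:value filter because that is the only form shown in this document.

```python
from typing import Optional, Sequence, Tuple, Union


def relation_expression(
    direction: str,
    properties: Union[str, Sequence[str]],
    chain: bool = False,
    filter_by: Optional[Tuple[str, str]] = None,
    node: str = "",
) -> str:
    """Compose a relation expression string (hypothetical helper).

    direction  -- "->" for outgoing arcs, "<-" for incoming arcs
    properties -- one property name, "*" for all, or a list of names
    chain      -- append "+" to follow the property over several hops
    filter_by  -- optional (property, value) pair rendered as {property:value}
    node       -- optional DCID prefix, e.g. "country/ARG"
    """
    if direction not in ("->", "<-"):
        raise ValueError("direction must be '->' or '<-'")
    if isinstance(properties, str):
        prop_part = properties
    else:
        # Multiple properties are wrapped in [] and separated by commas.
        prop_part = "[" + ", ".join(properties) + "]"
    chain_part = "+" if chain else ""
    filter_part = "{%s:%s}" % filter_by if filter_by else ""
    return f"{node}{direction}{prop_part}{chain_part}{filter_part}"


print(relation_expression("->", "containedInPlace", node="country/ARG"))
# country/ARG->containedInPlace
print(relation_expression("<-", "containedInPlace", chain=True,
                          filter_by=("typeOf", "City"), node="country/ARG"))
# country/ARG<-containedInPlace+{typeOf:City}
print(relation_expression("->", ["name", "latitude", "longitude"]))
# ->[name, latitude, longitude]
print(relation_expression("<-", "*"))
# <-*
```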
URL-encoding reserved characters in GET requests
HTTP GET requests do not allow some of the characters used by Data Commons DCIDs and relation expressions. When sending GET requests, you may need to use the corresponding percent codes for reserved characters. For example, a query string such as the following:
https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId/06&property=<-*
should be encoded as:
https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId%2F06&property=%3C-%2A
Although sometimes the original characters may work, it’s safest to always encode them.
Tip: Don’t URL-encode delimiters between parameters (&), separators between parameter names and values (=), or -.
See https://www.w3schools.com/tags/ref_urlencode.ASP for a handy reference.
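As an illustration, the encoding shown above can be reproduced with Python's standard urllib.parse module. The helper name encode_value is an assumption. Passing safe="" tells quote to encode / as well, which it would otherwise leave untouched.

```python
from urllib.parse import quote, unquote


def encode_value(value: str) -> str:
    """Percent-encode a single query parameter value (hypothetical helper)."""
    # safe="" so that "/" is encoded too; "-" is never encoded by quote().
    return quote(value, safe="")


print(encode_value("geoId/06"))  # geoId%2F06
print(encode_value("<-*"))       # %3C-%2A

# Decoding recovers the original relation expression.
print(unquote("%3C-%2A"))        # <-*
```

Encode each value separately, then join names and values with = and &, so that the delimiters themselves stay unencoded.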
Pagination
When the response to a request is too long, the returned payload is paginated. Only a subset of the response is returned, along with a long string of characters called a token. To get the next set of entries, repeat the request with nextToken as a query parameter, with the token as its value.
For example, the request:
curl --request GET \
'https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId%2F06&property=%3C-%2A'
will return something like:
{
  "data": {
    "geoId/06": {
      "arcs": < ... output truncated for brevity ...>
    }
  },
  "nextToken": "SoME_veRy_L0ng_STrIng"
}
To get the next set of entries, repeat the previous command and append the nextToken:
curl --request GET \
'https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId%2F06&property=%3C-%2A&nextToken=SoME_veRy_L0ng_STrIng'
Similarly for POST requests, this would look like:
curl -X POST \
  -H "X-API-Key: AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI" \
  --url https://api.datacommons.org/v2/node \
  --data '{
    "nodes": ["geoId/06"],
    "property": "<-*",
    "nextToken": "SoME_veRy_L0ng_STrIng"
  }'
Don’t forget to URL-encode any special characters that appear in the string.
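The repeat-until-done pattern described in this section could be sketched in Python as follows. The names iter_pages and fake_fetch are hypothetical, and the sketch assumes that the final page either omits nextToken or returns it empty, which this document does not state explicitly. The fetch function is passed in so that the loop works the same for GET and POST; here a stand-in returns canned pages instead of calling the API.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield each page of a paginated response (hypothetical helper).

    `fetch_page(token)` must perform one request, passing `token` as
    nextToken (or omitting it when token is None), and return the parsed JSON.
    """
    token = None
    while True:
        page = fetch_page(token)
        yield page
        # Assumption: a missing or empty nextToken marks the last page.
        token = page.get("nextToken")
        if not token:
            break


# Stand-in for real HTTP calls, keyed by the token sent with the request.
FAKE_PAGES = {
    None: {"data": {"geoId/06": {"arcs": "page 1"}}, "nextToken": "TOKEN_2"},
    "TOKEN_2": {"data": {"geoId/06": {"arcs": "page 2"}}},
}


def fake_fetch(token: Optional[str]) -> dict:
    return FAKE_PAGES[token]


pages = list(iter_pages(fake_fetch))
print(len(pages))  # 2
```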