Data Commons REST API V2

Overview

The Data Commons REST API enables developers to programmatically access data in the Data Commons knowledge graph over HTTP. You can use it to explore the structure of the graph, integrate statistics from the graph into data analysis applications, and much more.

As with any HTTP exchange, a REST API call consists of a request that you provide and a response from the Data Commons servers containing the data you requested, in JSON format. You can use the REST API with any tool or language that supports HTTP. You can make queries on the command line (e.g. using cURL), by scripting HTTP requests in another language like JavaScript, or even by entering an endpoint into your web browser!

What’s new in V2

The V2 API collapses functionality from the V1 API into a smaller number of endpoints, by introducing a syntax for relation expressions, described below. Each API endpoint can also handle both single and bulk requests.

Service endpoints

You make requests through API endpoints. You access each endpoint using its unique URL, which is a combination of a base URL and the endpoint’s URI.

The base URL for all REST endpoints is:

https://api.datacommons.org/VERSION

The current version is v2.

To access a particular endpoint, append the URI to the base URL (e.g. https://api.datacommons.org/v2/node ). The URIs for the V2 API are below:

API               URI path       Description
Node              /node          Fetches information about edges and neighboring nodes
Observation       /observation   Fetches statistical observations
Resolve entities  /resolve       Returns a Data Commons ID (DCID) for entities in the graph
SPARQL            /sparql        Returns matches to a SPARQL graph query

Endpoints for custom instances

If you are running your own Data Commons instance, the base URL for the endpoints is slightly different:

CUSTOM_URL/core/api/v2
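The URL construction described above can be sketched in Python as follows. This is only an illustration: the helper name `endpoint_url` and the example custom host are hypothetical, and the sketch assumes that a custom instance serves each endpoint path directly under `CUSTOM_URL/core/api/v2`.

```python
BASE_URL = "https://api.datacommons.org"
API_VERSION = "v2"


def endpoint_url(path, custom_url=None):
    """Return the full URL for an endpoint path such as "/node".

    custom_url is the root URL of a custom Data Commons instance
    (hypothetical parameter, not part of the API itself).
    """
    path = "/" + path.strip("/")
    if custom_url is not None:
        # Assumption: custom instances expose endpoints under /core/api/v2.
        return custom_url.rstrip("/") + "/core/api/" + API_VERSION + path
    return BASE_URL + "/" + API_VERSION + path


print(endpoint_url("/node"))
print(endpoint_url("observation", custom_url="https://datacommons.example.org/"))
```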

Query parameters

Endpoints take a set of parameters which allow you to specify the entities, variables, timescales, etc. you are interested in. The V2 APIs only use query parameters.

Query parameters are appended to the end of a URL after a ? symbol. Separate multiple parameter entries with an & symbol. For example:

https://api.datacommons.org/v2/node?key=API_KEY&nodes=DCID1&nodes=DCID2&property=<-*

Still confused? Each endpoint’s documentation page has examples at the bottom tailored to the endpoint you’re trying to use.
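As a rough illustration of the query-string format above, the following Python sketch builds a GET URL with a repeated `nodes` parameter using only the standard library. The helper name `build_get_url` is hypothetical. Note that `urlencode` also percent-encodes reserved characters in the values, so `<-*` appears in its encoded form.

```python
from urllib.parse import urlencode


def build_get_url(endpoint, params):
    """Append params to endpoint as a query string.

    doseq=True expands list values into repeated parameters,
    e.g. {"nodes": ["A", "B"]} becomes nodes=A&nodes=B.
    """
    return endpoint + "?" + urlencode(params, doseq=True)


url = build_get_url(
    "https://api.datacommons.org/v2/node",
    {"key": "API_KEY", "nodes": ["DCID1", "DCID2"], "property": "<-*"},
)
print(url)
```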

POST requests

All V2 endpoints accept POST requests. For POST requests, pass all parameters in the request body in JSON format. For example, in cURL, this would look like:

curl -X POST \
-H "X-API-Key: API_KEY" \
--url https://api.datacommons.org/v2/node \
--data '{
  "nodes": [
    "geoId/06085",
    "geoId/06086"
  ],
  "property": "->[name, latitude, longitude]"
}'
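The same POST request could be prepared from Python roughly as follows. This is a sketch using only the standard library. The helper name `build_post_request` is hypothetical, and the `Content-Type: application/json` header is an assumption that the cURL example does not set. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_post_request(url, api_key, payload):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "X-API-Key": api_key,
        # Assumption: declaring the body as JSON; not shown in the cURL example.
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_post_request(
    "https://api.datacommons.org/v2/node",
    "API_KEY",
    {
        "nodes": ["geoId/06085", "geoId/06086"],
        "property": "->[name, latitude, longitude]",
    },
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to `urllib.request.urlopen(request)` and parse the response body with `json.load`.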

Authentication

All access to base Data Commons using the REST APIs must be authenticated and authorized with an API key.

We provide a trial API key for general public use. This key will let you try the API and make single requests.

The trial key has a limited request quota. If you plan to use the APIs more heavily (e.g. for personal or school projects, developing applications, etc.), please request an official key without quota limits; see Obtain an API key for details.

Note: If you are sending API requests to a custom Data Commons instance, do not include any API key in the requests.

To include an API key, add it to the URL as a query parameter by appending ?key=API_KEY.

For GET requests, this looks like:

https://api.datacommons.org/v2/ENDPOINT?key=API_KEY

If the key is not the first query parameter, use &key=API_KEY instead. This looks like:

https://api.datacommons.org/v2/ENDPOINT?QUERY=VALUE&key=API_KEY

For POST requests, pass the key as a header. For example, in cURL, this looks like:

curl -X POST \
--url https://api.datacommons.org/v2/node \
--header 'X-API-Key: API_KEY' \
--data '{
  "nodes": [
    "ENTITY_DCID_1",
    "ENTITY_DCID_2",
    ...
  ],
  "property": "RELATION_EXPRESSION"
}'
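For GET requests, the choice between `?key=` and `&key=` described above can be automated. The following is a minimal sketch with a hypothetical helper name, `add_api_key`. It assumes the URL has no fragment and that any existing query string is already well formed.

```python
from urllib.parse import quote


def add_api_key(url, api_key):
    """Append the API key as a query parameter to a GET URL."""
    # Use "&" if the URL already has query parameters, otherwise "?".
    separator = "&" if "?" in url else "?"
    return url + separator + "key=" + quote(api_key, safe="")


print(add_api_key("https://api.datacommons.org/v2/node", "API_KEY"))
print(add_api_key("https://api.datacommons.org/v2/node?nodes=DCID1", "API_KEY"))
```

For a custom Data Commons instance, skip this step, since those requests should not carry a key.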

Find available entities, variables, and their DCIDs

Many requests require the DCID of the entity or variable you wish to query. For tips on how to find relevant DCIDs, entities and variables, please see the Key concepts document.

Relation expressions

Data Commons represents real world entities and data as nodes. These nodes are connected by directed edges, or arcs, to form a knowledge graph. The label of the arc is the name of the property.

Relation expressions use arrow notation and other symbols to represent neighboring nodes and to support chaining and filtering. These expressions allow all of the functionality of the V1 API to be expressed with fewer API endpoints in V2. All V2 API calls require relation expressions in the property or expression parameter.

The following table describes symbols in the V2 API relation expressions:

->                 An outgoing arc
<-                 An incoming arc
{PROPERTY:VALUE}   Filtering; identifies the property and associated value
[]                 Multiple properties, separated by commas
*                  All properties linked to this node
+                  One or more expressions chained together for indirect relationships, like containedInPlace+{typeOf:City}

Incoming and outgoing arcs

Arcs in the Data Commons graph are directed. In the example below, for the node Argentina, the property containedInPlace exists in both the incoming and outgoing directions, as illustrated in the following figure:

Note the directionality of the property containedInPlace: the incoming arc represents “Argentina contains Buenos Aires”, while the outgoing arc represents “Argentina is in South America”.

Nodes for outgoing arcs are represented by ->, while nodes for incoming arcs are represented by <-. To illustrate using the above example:

  • Regions that include Argentina (DCID: country/ARG): country/ARG->containedInPlace
  • All cities directly contained in Argentina (DCID: country/ARG): country/ARG<-containedInPlace{typeOf:City}

Filters

You can use filters to restrict results to nodes with a specified property and value. Use {} to specify the property:value pair that defines the filter. Using the same example, country/ARG<-containedInPlace+{typeOf:City} only returns nodes with typeOf:City, filtering out typeOf:AdministrativeArea1 and so on.

Specify multiple properties

You can combine multiple properties within []. For example, to request a few outgoing arcs for a node, use ->[name, latitude, longitude]. See more in this Node API example.

Wildcard

To retrieve all properties linked to a node, use the * wildcard, e.g. <-*. See more in this Node API example.

Chain properties

Use + to express a chain expression. A chain expression requests information about nodes that are connected by the same property but are more than one hop away. This is supported only for the containedInPlace property.

To illustrate again using the Argentina example:

  • All cities directly contained in Argentina (DCID: country/ARG): country/ARG<-containedInPlace{typeOf:City}
  • All cities indirectly contained in Argentina (DCID: country/ARG): country/ARG<-containedInPlace+{typeOf:City}
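The relation expression syntax described above (arrows, filters, property lists, wildcard, and chaining) can be assembled programmatically. The following Python sketch is one possible way to do so. The function `relation_expression` and its parameters are hypothetical, and it produces only the value for the property parameter, without the leading node DCID. It supports a single filter, since the text above does not describe how to combine several.

```python
def relation_expression(direction, properties, chain=False,
                        filter_property=None, filter_value=None):
    """Build a relation expression such as "<-containedInPlace+{typeOf:City}"."""
    if direction not in ("->", "<-"):
        raise ValueError("direction must be '->' (outgoing) or '<-' (incoming)")
    if isinstance(properties, str):
        # A single property name, or the "*" wildcard.
        expression = direction + properties
    else:
        # Several properties are wrapped in [] and separated by commas.
        expression = direction + "[" + ", ".join(properties) + "]"
    if chain:
        # Chaining is documented as supported only for containedInPlace.
        if properties != "containedInPlace":
            raise ValueError("chaining is only supported for containedInPlace")
        expression += "+"
    if filter_property is not None:
        expression += "{" + filter_property + ":" + filter_value + "}"
    return expression


print(relation_expression("->", "containedInPlace"))
print(relation_expression("<-", "containedInPlace", chain=True,
                          filter_property="typeOf", filter_value="City"))
print(relation_expression("->", ["name", "latitude", "longitude"]))
print(relation_expression("<-", "*"))
```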

URL-encoding reserved characters in GET requests

URLs used in HTTP GET requests do not allow some of the characters used by Data Commons DCIDs and relation expressions. When sending GET requests, you may need to replace reserved characters with their percent-encoded equivalents. For example, a query string such as the following:

https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId/06&property=<-*

should be encoded as:

https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId%2F06&property=%3C-%2A

Although sometimes the original characters may work, it’s safest to always encode them.

Tip: Don’t URL-encode delimiters between parameters (&), separators between parameter names and values (=), or -.

See https://www.w3schools.com/tags/ref_urlencode.ASP for a handy reference.
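If you build requests from a script, the encoding can be done for you. As an illustration, the following Python sketch uses the standard library's `urllib.parse.quote` to encode each parameter value while leaving the `&` and `=` delimiters alone. The helper name `encode_value` is hypothetical and `API_KEY` is a placeholder.

```python
from urllib.parse import quote, unquote


def encode_value(value):
    """Percent-encode a single query parameter value.

    safe="" makes quote() encode "/" as well; "-" is never encoded.
    """
    return quote(value, safe="")


# Encode only the values; the "&" and "=" delimiters stay as they are.
query = "&".join([
    "key=" + encode_value("API_KEY"),
    "nodes=" + encode_value("geoId/06"),
    "property=" + encode_value("<-*"),
])
encoded_url = "https://api.datacommons.org/v2/node?" + query
print(encoded_url)

# unquote() reverses the encoding, which is handy for checking the result.
print(unquote("%3C-%2A"))
```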

Pagination

When the response to a request is too long, the returned payload is paginated. Only a subset of the response is returned, along with a long string of characters called a token. To get the next set of entries, repeat the request with nextToken as a query parameter, with the token as its value.

For example, the request:

curl --request GET \
  'https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId%2F06&property=%3C-%2A'

will return something like:

{
  "data": {
    "geoId/06": {
      "arcs": < ... output truncated for brevity ...>
    }
  },
  "nextToken": "SoME_veRy_L0ng_STrIng"
}

To get the next set of entries, repeat the previous command and append the nextToken:

curl --request GET \
  'https://api.datacommons.org/v2/node?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=geoId%2F06&property=%3C-%2A&nextToken=SoME_veRy_L0ng_STrIng'

Similarly for POST requests, this would look like:

curl -X POST \
-H "X-API-Key: AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI" \
--url https://api.datacommons.org/v2/node \
--data '{
  "nodes": ["geoId/06"],
  "property": "<-*",
  "nextToken": "SoME_veRy_L0ng_STrIng"
}'

Don’t forget to URL-encode any special characters that appear in the string.
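The pagination procedure above lends itself to a loop. The sketch below is a hedged illustration, not an official client: `post_node`, `fetch_all_pages`, and the `max_pages` safety cap are hypothetical names, and the loop assumes that the final page either omits nextToken or returns it empty. The fetch function is passed in as a parameter so the loop can be exercised without network access.

```python
import json
import urllib.request

NODE_URL = "https://api.datacommons.org/v2/node"


def post_node(payload, api_key="API_KEY"):
    """Send one POST request to the node endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        NODE_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Content-Type is an assumption; the cURL examples do not set it.
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_pages(payload, fetch_page=post_node, max_pages=100):
    """Repeat the request, passing back nextToken, until no token is returned."""
    pages = []
    token = None
    for _ in range(max_pages):
        body = dict(payload)  # copy so the caller's payload is not modified
        if token:
            body["nextToken"] = token
        page = fetch_page(body)
        pages.append(page)
        token = page.get("nextToken")
        if not token:
            break
    return pages


# Example call (performs real requests, so it is left commented out):
# pages = fetch_all_pages({"nodes": ["geoId/06"], "property": "<-*"})
```

Because the token travels in a JSON body here, it does not need URL-encoding; that only applies when it is sent as a GET query parameter.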