Advanced setups

This page covers hybrid setups that are not recommended for most use cases, but may be helpful for some custom Data Commons instances:

Run the data management container locally and the service container in the cloud

This process is similar to running both data management and services containers locally, with a few exceptions:

  • Your input directory will be the local file system, while the output directory will be a Google Cloud Storage bucket and folder.
  • You must start the container with credentials to pass to Google Cloud, so that it can access the Cloud SQL instance.

Before you proceed, ensure you have set up all necessary GCP services.
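
To quickly confirm that the required APIs are enabled, you can run a check along the following lines. The service names here are the standard Google API identifiers for Cloud Run, Cloud SQL, and Cloud Storage, not values taken from this page:

gcloud services list --enabled --filter="config.name:(run.googleapis.com sqladmin.googleapis.com storage.googleapis.com)"

Any API missing from the output can be enabled with gcloud services enable.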

Set environment variables

To run a local instance of the data management container, you need to set all of the environment variables in the custom_dc/env.list file, including all the GCP ones.

  1. Obtain the values output by Terraform scripts: Go to https://console.cloud.google.com/run/jobs for your project, select the relevant job from the list, and click View and edit job configuration.
  2. Expand Edit container, and select the Variables and secrets tab.
  3. Copy the values of all the variables, with the exception of FORCE_RESTART and INPUT_DIR, to your env.list file.
  4. Set the value of INPUT_DIR to the full local path where your CSV and JSON files are located.
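
As an illustration only, a completed env.list might look something like the following. All names and values below are hypothetical stand-ins; use the actual variables copied from your Cloud Run job configuration:

# Local input directory (set by hand, not copied from the job)
INPUT_DIR=/home/alice/custom_dc/input
# Cloud Storage output bucket and folder (copied from the job)
OUTPUT_DIR=gs://my-datacommons-bucket/output
GOOGLE_CLOUD_PROJECT=my-gcp-project
USE_CLOUDSQL=true
CLOUDSQL_INSTANCE=my-gcp-project:us-central1:my-dc-instance
DB_USER=my-db-user
DB_PASS=my-db-password
DC_API_KEY=my-api-key

Note that Docker env files don't support inline comments after a value, so keep any comments on their own lines.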

Run the data management Docker container

Bash script:

./run_cdc_dev_docker.sh --container data [--release latest]

If you don't specify the --release option, the script uses the stable release by default.

Docker commands:
  1. Generate credentials for Cloud application authentication:
    gcloud auth application-default login
  2. Run the container:
    docker run \
    --env-file $PWD/custom_dc/env.list \
    -v INPUT_DIRECTORY:INPUT_DIRECTORY \
    -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
    -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
    gcr.io/datcom-ci/datacommons-data:VERSION
  • INPUT_DIRECTORY is the full local path to your input directory (the same value as INPUT_DIR in your env.list file). You don't specify the output directory, as you aren't mounting a local output volume.
  • The version is latest or stable.
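
For example, if your input files are in /home/alice/custom_dc/input (a hypothetical path) and you want the stable release, the full command would be:

docker run \
--env-file $PWD/custom_dc/env.list \
-v /home/alice/custom_dc/input:/home/alice/custom_dc/input \
-e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
-v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
gcr.io/datcom-ci/datacommons-data:stable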

To verify that the data is correctly created in your Cloud SQL database, use the procedure in Inspect the Cloud SQL database.
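
If you prefer a quick command-line spot check, you can also connect directly with gcloud. The instance and database names below are hypothetical, and the observations table name is an assumption, not taken from this page:

gcloud sql connect my-dc-instance --user=root --database=datacommons

Then, at the SQL prompt, run a query such as SELECT COUNT(*) FROM observations; to confirm that rows were imported.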

(Optional) Run the data management Docker container in schema update mode

If you have tried to start a container and received a SQL check failed error, this indicates that a database schema update is needed. You need to restart the data management container, and you can set an additional, optional environment variable, DATA_RUN_MODE, to minimize the startup time.

Bash script:

./run_cdc_dev_docker.sh --container data --schema_update [--release latest]

Docker commands:
docker run \
--env-file $PWD/custom_dc/env.list \
-v INPUT_DIRECTORY:INPUT_DIRECTORY \
-e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
-v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
-e DATA_RUN_MODE=schemaupdate \
gcr.io/datcom-ci/datacommons-data:VERSION

Restart the services container in Google Cloud

Follow any of the procedures provided in Start/restart the services container.
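
If you'd rather do this from the command line, one common approach is to deploy a new revision by updating an environment variable. The service name and region below are hypothetical; FORCE_RESTART is the variable mentioned above that is excluded from env.list:

gcloud run services update datacommons-web-service \
--region us-central1 \
--update-env-vars FORCE_RESTART="$(date +%s)"

Updating any environment variable causes Cloud Run to create and roll out a new revision, which restarts the service.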

Access Cloud data from a local services container

For testing purposes, you may wish to run the services Docker container locally while accessing the data in Google Cloud. This process is similar to running both data management and services containers in the cloud, but with an additional step to start a local Docker services container.

Before you proceed, ensure you have set up all necessary GCP services.

Set environment variables

To run a local instance of the services container, you need to set all of the environment variables in the custom_dc/env.list file, including all the GCP ones.

  1. Obtain the values output by Terraform scripts: Go to https://console.cloud.google.com/run/services for your project, select the relevant service from the list, and click the Revisions tab.
  2. In the right-hand window, scroll to Environment variables.
  3. Copy the values of all the variables, with the exception of FORCE_RESTART, to your env.list file.

Run the services Docker container

Bash script:

To build and run a custom image:

./run_cdc_dev_docker.sh --actions build_run --container service --image IMAGE_NAME:IMAGE_TAG

To run a previously built custom image:

./run_cdc_dev_docker.sh --container service --image IMAGE_NAME:IMAGE_TAG

To run a Data Commons standard release:

./run_cdc_dev_docker.sh --container service [--release latest]

If you don't specify the --release option, the script uses the stable release by default.

Docker commands:
  1. Generate credentials for Cloud application authentication:
    gcloud auth application-default login
  2. Run the container.
    To run a custom image:
    docker run -it \
    --env-file $PWD/custom_dc/env.list \
    -p 8080:8080 \
    -e DEBUG=true \
    -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
    -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
    -v $PWD/server/templates/custom_dc/custom:/workspace/server/templates/custom_dc/custom \
    -v $PWD/static/custom_dc/custom:/workspace/static/custom_dc/custom \
    IMAGE_NAME:IMAGE_TAG
    • The image name and image tag are the values you set when you created the package.
    • You don't specify any directories here, as you aren't mounting any local volumes.

    To run a Data Commons standard release:
    docker run -it \
    --env-file $PWD/custom_dc/env.list \
    -p 8080:8080 \
    -e DEBUG=true \
    -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
    -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
    gcr.io/datcom-ci/datacommons-services:VERSION
    • The version is latest or stable.
    • You don't specify any directories here, as you aren't mounting any local volumes.

Once the services are up and running, visit your local instance by pointing your browser to http://localhost:8080.
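
If you'd like to check readiness from the command line first, a plain HTTP request against the root page is a simple generic probe (this is not a dedicated health endpoint):

curl -sI http://localhost:8080 | head -n 1

A response line of HTTP/1.1 200 OK indicates the services are up and serving.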

If you encounter any issues, look at the detailed output log on the console, and visit the Troubleshooting Guide for detailed solutions to common problems.
