Advanced setups
This page covers hybrid setups that are not recommended for most use cases, but may be helpful for some custom Data Commons instances:
- Running the data management container locally, and the service container in Google Cloud. In this scenario, you store your input data locally and write the output to Cloud Storage and Cloud SQL. This might be useful for users with very large data sets who would like to cut down on output generation times and avoid the cost of storing input data in addition to output data.
- Running the service container locally, and the data management container in Google Cloud. If you have already set up a data processing pipeline to send your input data to Google Cloud, but are still iterating on the website code, this might be a useful option.
- Running the service container locally, and custom MCP instructions in Google Cloud. If you’re already using Google Cloud Storage but want to test the server locally, you can use this option.
Run the data management container locally and the service container in the cloud
This process is similar to running both the data management and services containers locally, with a few exceptions:
- Your input directory will be the local file system, while the output directory will be a Google Cloud Storage bucket and folder.
- You must start the job with credentials that are passed to Google Cloud, so the job can access the Cloud SQL instance.
Before you proceed, ensure you have set up all necessary GCP services.
Step 1: Set environment variables
To run a local instance of the data management container, you need to set all of the GCP-related environment variables in the custom_dc/env.list file.
- Obtain the values output by the Terraform scripts, using either the Cloud Console or the gcloud CLI:

  Cloud Console:
  - Go to https://console.cloud.google.com/run/jobs for your project, select the relevant job from the list, and click View and edit job configuration.
  - Under the Containers tab, select the Variables & Secrets tab.
  - Look up the name of the secret for the DB_PASS variable. It is in the form NAMESPACE-datacommons-mysql-password.
  - Go to https://console.cloud.google.com/secret-manager and in the list of secrets, click the link with the secret name.
  - Select Actions > View secret value. Copy the value to your env.list file.

  gcloud CLI:
  - Run the following command:

    gcloud run jobs describe JOB_NAME --region REGION

  - From the Secrets section of the output, note the name of the DB_PASS secret. It is in the form NAMESPACE-datacommons-mysql-password.
  - Run this command to obtain its value:

    gcloud secrets versions access latest --secret=SECRET_ID

- Copy all of the variable values obtained above into your env.list file, with the exception of FORCE_RESTART and INPUT_DIR.
- Set the value of INPUT_DIR to the full local path where your CSV, JSON, and MCF files are located.
- If needed, update the value of OUTPUT_DIR to the Google Cloud Storage folder where the output should be written, in the form gs://GCS_BUCKET/FOLDER.
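For example, the relevant lines in env.list might look like the following (the local path, bucket name, and password value are placeholders; copy the real values from the Cloud Run configuration and Secret Manager as described above):

INPUT_DIR=/Users/alice/custom_dc/input
OUTPUT_DIR=gs://my-datacommons-bucket/output
DB_PASS=my-secret-value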
Step 2: Run the data management Docker container
Bash script:
- From the website root directory, run the following command:

  ./run_cdc_dev_docker.sh --container data

Docker commands:
- Generate credentials for Cloud application authentication:

  gcloud auth application-default login

- From the website root directory, run the data container:

  docker run \
  --env-file $PWD/custom_dc/env.list \
  -v INPUT_DIRECTORY:INPUT_DIRECTORY \
  -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
  -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
  gcr.io/datcom-ci/datacommons-data:stable

  The input directory is the local path. You don't specify the output directory, as you aren't mounting a local output volume.
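As a concrete illustration of the substitution (the local path is hypothetical), the command might look like this. Note that the same path appears on both sides of the -v mount, so that the INPUT_DIR value from env.list resolves to the same location inside the container:

docker run \
--env-file $PWD/custom_dc/env.list \
-v /Users/alice/custom_dc/input:/Users/alice/custom_dc/input \
-e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
-v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
gcr.io/datcom-ci/datacommons-data:stable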
To verify that the data is correctly created in your Cloud SQL database, use the procedure in Inspect the Cloud SQL database.
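If you prefer a quick command-line check instead, a minimal sketch using the gcloud CLI (the instance and database names are placeholders for your own setup):

gcloud sql connect INSTANCE_NAME --user=root --database=DATABASE_NAME

At the resulting MySQL prompt, you can list the tables (SHOW TABLES;) or run a simple row count to confirm that data was written.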
(Optional) Run the data management Docker container in schema update mode
If you have tried to start a container and received a SQL check failed error, this indicates that a database schema update is needed. Restart the data management container, optionally setting the DATA_RUN_MODE flag to minimize the startup time.
Bash script:
- From the website root directory, run:

  ./run_cdc_dev_docker.sh --container data --schema_update

Docker commands:
- From the website root directory, run:

  docker run \
  --env-file $PWD/custom_dc/env.list \
  -v INPUT_DIRECTORY:INPUT_DIRECTORY \
  -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
  -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
  -e DATA_RUN_MODE=schemaupdate \
  gcr.io/datcom-ci/datacommons-data:stable
Step 3: Restart the services container in Google Cloud
Follow any of the procedures provided in Manage your service.
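One lightweight way to trigger a restart from the command line (a sketch, not the only option: it forces a new Cloud Run revision by updating the FORCE_RESTART variable mentioned above; the service name and region are placeholders):

gcloud run services update SERVICE_NAME --region REGION \
  --update-env-vars FORCE_RESTART="$(date +%s)"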
Access Cloud data from a local services container
For testing purposes, you may wish to run the services Docker container locally while accessing the data in Google Cloud. This process is similar to running both the data management and services containers in the cloud, but with a step to start a local Docker services container.
Before you proceed, ensure you have set up all necessary GCP services.
Step 1: Set environment variables
To run a local instance of the services container, you need to set all of the GCP-related environment variables in the env.list file.
- Obtain the values output by the Terraform scripts, using either the Cloud Console or the gcloud CLI:

  Cloud Console:
  - Go to https://console.cloud.google.com/run/services for your project, and select the relevant service from the list.
  - In the Service details screen, click the Revisions tab.
  - In the right-hand window, select the Containers tab and scroll down to the Environment variables section.
  - Look up the name of the secret for the DB_PASS variable. It is in the form NAMESPACE-datacommons-mysql-password.
  - Go to https://console.cloud.google.com/secret-manager and in the list of secrets, click the link with the secret name.
  - Select Actions > View secret value.

  gcloud CLI:
  - Run the following command:

    gcloud run services describe SERVICE_NAME --region REGION

  - From the Secrets section of the output, note the name of the DB_PASS secret. It is in the form NAMESPACE-datacommons-mysql-password.
  - Run this command to obtain its value:

    gcloud secrets versions access latest --secret=SECRET_ID

- Copy all of the variable values obtained above into your env.list file, with the exception of FORCE_RESTART.
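For example, the cloud-related lines in env.list might look like the following (the values are placeholders; copy the real ones from the Cloud Run configuration and Secret Manager as described above):

DB_PASS=my-secret-value
OUTPUT_DIR=gs://my-datacommons-bucket/output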
Step 2: Run the services Docker container
Bash script:
- Generate credentials for Cloud application authentication:

  gcloud auth application-default login

- From your website root directory, run the services container:

  ./run_cdc_dev_docker.sh --container service [--image IMAGE_CONTAINER_URL]

  If you're using a custom-built image, the image container URL is required, in the form name:tag.

Docker commands:
- Generate credentials for Cloud application authentication:

  gcloud auth application-default login

- Run the container:

  docker run -it \
  -p 8080:8080 \
  -e DEBUG=true \
  --env-file $PWD/custom_dc/env.list \
  -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
  -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
  IMAGE_CONTAINER_URL

  The image container URL is the name and tag of a prebuilt or custom-built image.
Once the services are up and running, visit your local instance by pointing your browser to http://localhost:8080.
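For a quick smoke test from the command line (a generic check that the server responds, not specific to Data Commons):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/

A 200 response indicates the services are up.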
If you encounter any issues, look at the detailed output log on the console, and visit the Troubleshooting Guide for detailed solutions to common problems.
Run the service container locally, with custom MCP instruction files in Google Cloud
This process is similar to the above, assuming that you are also accessing data files in Google Cloud Storage.
Before you proceed, ensure you have set up all necessary GCP services.
Step 1: Upload Markdown files to Google Cloud Storage
Follow step 1 of Provide custom MCP instructions files, using any of the methods to create the directories and upload the files.
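If you use the gcloud CLI for the upload, it might look like the following (the bucket and folder names match the example in the next step; the file names correspond to the instruction files shown in the verification output below):

gcloud storage cp server.md gs://mybucket/instructions/
gcloud storage cp tools/get_observations.md tools/search_indicators.md gs://mybucket/instructions/tools/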
Step 2: Configure local environment variable
In your env.list file, set the DC_INSTRUCTIONS_DIR variable to the folder you created in Google Cloud Storage in the previous step, using the form gs://GCS_BUCKET/INSTRUCTIONS_FOLDER. For example, if your Cloud Storage bucket is named mybucket and the folder you created in it is called instructions, you would specify the following:
DC_INSTRUCTIONS_DIR=gs://mybucket/instructions
Step 3: Restart the services container
Run the services container as in step 2 above.
To verify that the custom files are loaded, check the MCP server output; you should see something like the following:
INFO:datacommons_mcp.app:Loaded custom instructions for server.md from gs://mybucket/instructions
INFO:datacommons_mcp.app:Loaded custom instructions for tools/get_observations.md from gs://mybucket/instructions
INFO:datacommons_mcp.app:Loaded custom instructions for tools/search_indicators.md from gs://mybucket/instructions
Step 4: Connect an agent to the server
Follow any of the procedures in Connect an AI agent to a local server.