Docker Deployment

This guide covers deploying IndentiaDB using Docker, from a single-container quick start to a three-node high-availability cluster with HAProxy load balancing.

Prerequisites

  • Docker Engine 24.0 or later
  • Docker Compose v2 (included with Docker Desktop, or install the Compose plugin separately)
  • At least 2 GB of available RAM per IndentiaDB node
  • A directory for persistent data storage

Trial image

ghcr.io/indentiaplatform/indentiadb-trial is a rolling 2-month evaluation build. The expiry date is printed in the startup logs. License keys for unlimited use can be requested at indentia.ai.

Quick Start

Run a single IndentiaDB instance with default settings:

docker run -d \
  --name indentiadb \
  -p 7001:7001 \
  -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v indentiadb-data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest

Verify the instance is running:

curl http://localhost:7001/health
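
A freshly started container may need some seconds before /health answers, so a single curl can fail even when nothing is wrong. The sketch below is one way to poll until the check succeeds; the helper name `retry` and the attempt and delay values are illustrative choices, not part of IndentiaDB.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it exits 0 or ATTEMPTS runs have failed.
# Hypothetical helper for illustration; not shipped with IndentiaDB.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: poll the health endpoint up to 30 times, 2 seconds apart.
#   retry 30 2 curl -fsS -o /dev/null http://localhost:7001/health
```

Passing the check as a command keeps the loop reusable for other probes, such as the Elasticsearch-compatible port.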

The endpoints are:

Endpoint | URL
SPARQL | http://localhost:7001/sparql
GraphQL | http://localhost:7001/graphql
Elasticsearch-compatible API | http://localhost:9200
Health | http://localhost:7001/health
Metrics | http://localhost:7001/metrics

Docker Compose — Single Node

For a development or small production deployment, use Docker Compose with a single node.

docker-compose.yml

services:
  indentiadb:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb
    restart: unless-stopped
    ports:
      - "7001:7001"
      - "9200:9200"
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: rrf
      ES_VECTOR_SEARCH_MODE: hnsw
    volumes:
      - indentiadb-data:/data
      - ./config/indentiagraph.toml:/config/indentiagraph.toml:ro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 1G
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "5"

volumes:
  indentiadb-data:
    driver: local

Start the stack:

docker compose up -d

View logs:

docker compose logs -f indentiadb

Stop the stack:

docker compose down

Docker Compose HA — 3-Node Cluster with HAProxy

For production with high availability, deploy three IndentiaDB nodes behind HAProxy.

docker-compose.ha.yml

services:
  indentiadb-1:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb-1
    restart: unless-stopped
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: bayesian
      ES_VECTOR_SEARCH_MODE: hnsw
      RAFT_SEED_NODES: "indentiadb-1:7002,indentiadb-2:7002,indentiadb-3:7002"
      NODE_ID: "1"
    volumes:
      - indentiadb-data-1:/data
    networks:
      - indentiadb-cluster
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G

  indentiadb-2:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb-2
    restart: unless-stopped
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: bayesian
      ES_VECTOR_SEARCH_MODE: hnsw
      RAFT_SEED_NODES: "indentiadb-1:7002,indentiadb-2:7002,indentiadb-3:7002"
      NODE_ID: "2"
    volumes:
      - indentiadb-data-2:/data
    networks:
      - indentiadb-cluster
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G

  indentiadb-3:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb-3
    restart: unless-stopped
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: bayesian
      ES_VECTOR_SEARCH_MODE: hnsw
      RAFT_SEED_NODES: "indentiadb-1:7002,indentiadb-2:7002,indentiadb-3:7002"
      NODE_ID: "3"
    volumes:
      - indentiadb-data-3:/data
    networks:
      - indentiadb-cluster
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G

  haproxy:
    image: haproxy:2.9-alpine
    container_name: indentiadb-haproxy
    restart: unless-stopped
    ports:
      - "7001:7001"
      - "9200:9200"
      - "8404:8404"    # HAProxy stats
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    networks:
      - indentiadb-cluster
    depends_on:
      indentiadb-1:
        condition: service_healthy
      indentiadb-2:
        condition: service_healthy
      indentiadb-3:
        condition: service_healthy

volumes:
  indentiadb-data-1:
    driver: local
  indentiadb-data-2:
    driver: local
  indentiadb-data-3:
    driver: local

networks:
  indentiadb-cluster:
    driver: bridge

haproxy.cfg

global
    log stdout format raw local0
    maxconn 50000

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5s
    timeout client  60s
    timeout server  60s
    retries 3

frontend indentiadb_http
    bind *:7001
    default_backend indentiadb_nodes

frontend indentiadb_es
    bind *:9200
    default_backend indentiadb_es_nodes

frontend stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s

backend indentiadb_nodes
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server node1 indentiadb-1:7001 check inter 5s rise 2 fall 3
    server node2 indentiadb-2:7001 check inter 5s rise 2 fall 3
    server node3 indentiadb-3:7001 check inter 5s rise 2 fall 3

backend indentiadb_es_nodes
    balance roundrobin
    option httpchk GET /_cluster/health
    http-check expect status 200
    server node1 indentiadb-1:9200 check inter 5s rise 2 fall 3
    server node2 indentiadb-2:9200 check inter 5s rise 2 fall 3
    server node3 indentiadb-3:9200 check inter 5s rise 2 fall 3

Start the HA cluster:

docker compose -f docker-compose.ha.yml up -d

The cluster automatically elects a Raft leader. All nodes serve read requests; writes are forwarded to the leader.
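
To see which nodes HAProxy currently considers up, the stats frontend defined in haproxy.cfg (port 8404, URI /stats) can be queried in CSV form by appending `;csv` to the URI. The sketch below summarises that CSV; the function name `haproxy_server_status` is hypothetical, and it assumes the standard HAProxy CSV header with `pxname`, `svname`, and `status` columns.

```shell
#!/bin/sh
# Reads HAProxy stats CSV on stdin and prints "<backend>/<server> <status>"
# for each server row, skipping the FRONTEND and BACKEND summary rows.
# Column positions are looked up from the header instead of being hard-coded.
haproxy_server_status() {
  awk -F, '
    NR == 1 {
      sub(/^# /, "", $1)                    # header starts with "# pxname"
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    NF > 1 && $(col["svname"]) != "FRONTEND" && $(col["svname"]) != "BACKEND" {
      print $(col["pxname"]) "/" $(col["svname"]), $(col["status"])
    }
  '
}

# Example:
#   curl -s "http://localhost:8404/stats;csv" | haproxy_server_status
```

With all three nodes passing their checks, each of node1, node2, and node3 should appear once per backend with status UP.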


Environment Variable Configuration

Variable | Description | Default
SURREAL_USER | Root username | (required)
SURREAL_PASS | Root password | (required)
ES_HYBRID_SCORER | Hybrid search fusion algorithm: rrf, bayesian, or linear | rrf
ES_VECTOR_SEARCH_MODE | Vector index type: hnsw or flat | hnsw
RAFT_SEED_NODES | Comma-separated list of host:port for Raft peer discovery (cluster mode) | (unset)
NODE_ID | Unique node identifier within the cluster | (auto)
LOG_LEVEL | Log verbosity: error, warn, info, debug, trace | info
DATA_DIR | Override the data directory path | /data
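
Because SURREAL_USER and SURREAL_PASS have no defaults, a launch script can check for them before calling docker. The following is a minimal sketch; `require_env` is a hypothetical helper, and it expects trusted variable names because it uses `eval` for the indirect lookup.

```shell
#!/bin/sh
# require_env NAME...
# Returns non-zero and reports on stderr if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup; names must be trusted input
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   require_env SURREAL_USER SURREAL_PASS && docker compose up -d
```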

Configuration File

For advanced tuning, mount a TOML configuration file at /config/indentiagraph.toml:

docker run -d \
  --name indentiadb \
  -p 7001:7001 \
  -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v ./indentiagraph.toml:/config/indentiagraph.toml:ro \
  -v indentiadb-data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest

Example indentiagraph.toml:

[server]
bind = "0.0.0.0"
http_port = 7001
raft_port = 7002
es_port = 9200

[storage]
data_dir = "/data"

[search]
hybrid_scorer = "rrf"
vector_search_mode = "hnsw"

[cache]
query_cache_size_mb = 256

[logging]
level = "info"
format = "json"

Volume Mount Strategy

Named Volume

-v indentiadb-data:/data

Named volumes are managed by Docker and survive container removal. Use this for most deployments.

Bind Mount (Host Directory)

-v /srv/indentiadb/data:/data

Bind mounts give direct access to data from the host. Ensure the host directory is owned by the UID that the container process runs as (typically UID 1000); otherwise the database cannot write to it.
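
The ownership requirement can be checked before the container starts. This sketch assumes a `stat` that accepts `-c '%u'` (GNU coreutils or BusyBox) and takes the expected UID as an argument, since the actual UID depends on the image; `check_owner` is a hypothetical helper name.

```shell
#!/bin/sh
# check_owner DIR UID
# Succeeds only if DIR exists and is owned by the given numeric UID.
check_owner() {
  dir=$1
  want_uid=$2
  have_uid=$(stat -c '%u' "$dir") || return 1
  if [ "$have_uid" != "$want_uid" ]; then
    echo "$dir is owned by UID $have_uid, expected $want_uid" >&2
    return 1
  fi
}

# Example:
#   check_owner /srv/indentiadb/data 1000 || sudo chown -R 1000:1000 /srv/indentiadb/data
```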

Multiple Mounts

Separate data and configuration:

-v indentiadb-data:/data \
-v ./config:/config:ro \
-v ./logs:/var/log/indentiadb

Health Check Configuration

The /health endpoint returns HTTP 200 when the node is ready. Docker Compose and Docker Swarm use this for container health tracking:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 30s

Check container health status:

docker inspect --format='{{.State.Health.Status}}' indentiadb

Log Streaming

Stream logs from a running container:

docker logs -f indentiadb

Stream logs with timestamps:

docker logs -f --timestamps indentiadb

With Docker Compose:

docker compose logs -f --tail=100 indentiadb

Configure the json-file log driver with rotation in docker-compose.yml:

logging:
  driver: "json-file"
  options:
    max-size: "100m"
    max-file: "5"

Backup and Restore

Creating a Backup

# Stop the container gracefully
docker stop indentiadb

# Create a tarball of the data volume
docker run --rm \
  -v indentiadb-data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf "/backup/indentiadb-backup-$(date +%Y%m%d).tar.gz" -C /data .

# Restart the container
docker start indentiadb
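
A backup is only useful if the archive can be read back. The sketch below checks a tarball before it is relied on; `verify_backup` is a hypothetical helper and only confirms that the gzip stream and tar index are readable, not that the database contents are consistent.

```shell
#!/bin/sh
# verify_backup ARCHIVE
# Fails if the file is missing or empty, the gzip stream is corrupt,
# or the tar listing cannot be read.
verify_backup() {
  archive=$1
  if [ ! -s "$archive" ]; then
    echo "missing or empty: $archive" >&2
    return 1
  fi
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "corrupt gzip stream: $archive" >&2
    return 1
  fi
  if ! tar -tzf "$archive" >/dev/null 2>&1; then
    echo "unreadable tar archive: $archive" >&2
    return 1
  fi
}

# Example:
#   verify_backup "indentiadb-backup-$(date +%Y%m%d).tar.gz"
```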

Restoring from Backup

# Stop and remove the existing container
docker stop indentiadb && docker rm indentiadb

# Clear and restore the data volume
docker run --rm \
  -v indentiadb-data:/data \
  -v "$(pwd)":/backup \
  alpine sh -c "find /data -mindepth 1 -delete && tar xzf /backup/indentiadb-backup-20260321.tar.gz -C /data"

# Start the container again
docker run -d \
  --name indentiadb \
  -p 7001:7001 \
  -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v indentiadb-data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest

Production Considerations

Resource Limits

Always set CPU and memory limits in production to prevent runaway resource consumption:

deploy:
  resources:
    limits:
      cpus: "4"
      memory: 8G
    reservations:
      cpus: "1"
      memory: 2G

Log Rotation

Configure log rotation to prevent disk exhaustion:

logging:
  driver: "json-file"
  options:
    max-size: "200m"
    max-file: "10"

Restart Policy

Use restart: unless-stopped or restart: always to ensure the container recovers from crashes:

restart: unless-stopped

Upgrading

  1. Pull the new image:
docker pull ghcr.io/indentiaplatform/indentiadb-trial:latest
  2. For Docker Compose:
docker compose pull
docker compose up -d
  3. For HA clusters, perform a rolling upgrade by restarting one node at a time and waiting for it to rejoin the Raft cluster before proceeding to the next node:
docker compose -f docker-compose.ha.yml pull
docker compose -f docker-compose.ha.yml up -d --no-deps indentiadb-1
# Wait for node 1 to be healthy
docker compose -f docker-compose.ha.yml up -d --no-deps indentiadb-2
# Wait for node 2 to be healthy
docker compose -f docker-compose.ha.yml up -d --no-deps indentiadb-3
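
The "wait for node N to be healthy" steps can be scripted by polling the container health status that the healthcheck reports. The sketch below assumes the container names from docker-compose.ha.yml and treats a `healthy` status as the signal to continue; whether that also means the node has fully rejoined the Raft cluster is an assumption to confirm for your deployment. `wait_healthy` and its default timeout are illustrative.

```shell
#!/bin/sh
# wait_healthy CONTAINER [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls `docker inspect` until the container reports "healthy" or the timeout passes.
wait_healthy() {
  container=$1
  timeout=${2:-120}
  interval=${3:-5}
  elapsed=0
  status=unknown
  while [ "$elapsed" -le "$timeout" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=unknown
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for $container (last status: $status)" >&2
  return 1
}

# Example rolling upgrade, stopping at the first node that fails to recover:
#   for n in 1 2 3; do
#     docker compose -f docker-compose.ha.yml up -d --no-deps "indentiadb-$n"
#     wait_healthy "indentiadb-$n" || exit 1
#   done
```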

Ports Reference

Port | Protocol | Description
7001 | HTTP | SPARQL, GraphQL, REST API, health, and metrics
7002 | gRPC | Raft inter-node communication (cluster only)
9200 | HTTP | Elasticsearch-compatible search API