Docker Deployment¶
This guide covers deploying IndentiaDB using Docker, from a single-container quick start to a three-node high-availability cluster with HAProxy load balancing.
Prerequisites¶
- Docker Engine 24.0 or later
- Docker Compose v2 (included with Docker Desktop, or install the Compose plugin separately)
- At least 2 GB of available RAM per IndentiaDB node
- A directory for persistent data storage
Trial image
ghcr.io/indentiaplatform/indentiadb-trial is a rolling 2-month evaluation build. The expiry date is printed in the startup logs. License keys for unlimited use can be requested at indentia.ai.
Quick Start¶
Run a single IndentiaDB instance with default settings:
```shell
docker run -d \
  --name indentiadb \
  -p 7001:7001 \
  -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v indentiadb-data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest
```
Verify the instance is running:
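A quick check, using the health endpoint on the default port mapping:

```shell
docker ps --filter name=indentiadb
curl -f http://localhost:7001/health
```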
The endpoints are:
| Endpoint | URL |
|---|---|
| SPARQL | http://localhost:7001/sparql |
| GraphQL | http://localhost:7001/graphql |
| Elasticsearch-compatible API | http://localhost:9200 |
| Health | http://localhost:7001/health |
| Metrics | http://localhost:7001/metrics |
Docker Compose — Single Node¶
For a development or small production deployment, use Docker Compose with a single node.
docker-compose.yml¶
```yaml
version: "3.9"

services:
  indentiadb:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb
    restart: unless-stopped
    ports:
      - "7001:7001"
      - "9200:9200"
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: rrf
      ES_VECTOR_SEARCH_MODE: hnsw
    volumes:
      - indentiadb-data:/data
      - ./config/indentiagraph.toml:/config/indentiagraph.toml:ro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G
        reservations:
          cpus: "0.5"
          memory: 1G
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "5"

volumes:
  indentiadb-data:
    driver: local
```
Start the stack:
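From the directory containing `docker-compose.yml`:

```shell
docker compose up -d
```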
View logs:
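Follow the service's log output:

```shell
docker compose logs -f indentiadb
```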
Stop the stack:
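Stop and remove the containers (the named volume is preserved):

```shell
docker compose down
```

Add `-v` to also remove the named volume; this deletes all stored data.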
Docker Compose HA — 3-Node Cluster with HAProxy¶
For production with high availability, deploy three IndentiaDB nodes behind HAProxy.
docker-compose.ha.yml¶
```yaml
version: "3.9"

services:
  indentiadb-1:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb-1
    restart: unless-stopped
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: bayesian
      ES_VECTOR_SEARCH_MODE: hnsw
      RAFT_SEED_NODES: "indentiadb-1:7002,indentiadb-2:7002,indentiadb-3:7002"
      NODE_ID: "1"
    volumes:
      - indentiadb-data-1:/data
    networks:
      - indentiadb-cluster
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G

  indentiadb-2:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb-2
    restart: unless-stopped
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: bayesian
      ES_VECTOR_SEARCH_MODE: hnsw
      RAFT_SEED_NODES: "indentiadb-1:7002,indentiadb-2:7002,indentiadb-3:7002"
      NODE_ID: "2"
    volumes:
      - indentiadb-data-2:/data
    networks:
      - indentiadb-cluster
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G

  indentiadb-3:
    image: ghcr.io/indentiaplatform/indentiadb-trial:latest
    container_name: indentiadb-3
    restart: unless-stopped
    environment:
      SURREAL_USER: root
      SURREAL_PASS: changeme
      ES_HYBRID_SCORER: bayesian
      ES_VECTOR_SEARCH_MODE: hnsw
      RAFT_SEED_NODES: "indentiadb-1:7002,indentiadb-2:7002,indentiadb-3:7002"
      NODE_ID: "3"
    volumes:
      - indentiadb-data-3:/data
    networks:
      - indentiadb-cluster
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 4G

  haproxy:
    image: haproxy:2.9-alpine
    container_name: indentiadb-haproxy
    restart: unless-stopped
    ports:
      - "7001:7001"
      - "9200:9200"
      - "8404:8404" # HAProxy stats
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    networks:
      - indentiadb-cluster
    depends_on:
      indentiadb-1:
        condition: service_healthy
      indentiadb-2:
        condition: service_healthy
      indentiadb-3:
        condition: service_healthy

volumes:
  indentiadb-data-1:
    driver: local
  indentiadb-data-2:
    driver: local
  indentiadb-data-3:
    driver: local

networks:
  indentiadb-cluster:
    driver: bridge
```
haproxy.cfg¶
```
global
    log stdout format raw local0
    maxconn 50000

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5s
    timeout client  60s
    timeout server  60s
    retries 3

frontend indentiadb_http
    bind *:7001
    default_backend indentiadb_nodes

frontend indentiadb_es
    bind *:9200
    default_backend indentiadb_es_nodes

frontend stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s

backend indentiadb_nodes
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server node1 indentiadb-1:7001 check inter 5s rise 2 fall 3
    server node2 indentiadb-2:7001 check inter 5s rise 2 fall 3
    server node3 indentiadb-3:7001 check inter 5s rise 2 fall 3

backend indentiadb_es_nodes
    balance roundrobin
    option httpchk GET /_cluster/health
    http-check expect status 200
    server node1 indentiadb-1:9200 check inter 5s rise 2 fall 3
    server node2 indentiadb-2:9200 check inter 5s rise 2 fall 3
    server node3 indentiadb-3:9200 check inter 5s rise 2 fall 3
```
Start the HA cluster:
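With `docker-compose.ha.yml` and `haproxy.cfg` in the working directory:

```shell
docker compose -f docker-compose.ha.yml up -d
```

HAProxy starts only after all three nodes report healthy. The stats page at http://localhost:8404/stats should then show every backend as UP.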
The cluster automatically elects a Raft leader. All nodes serve read requests; writes are forwarded to the leader.
Environment Variable Configuration¶
| Variable | Description | Default |
|---|---|---|
| `SURREAL_USER` | Root username | (required) |
| `SURREAL_PASS` | Root password | (required) |
| `ES_HYBRID_SCORER` | Hybrid search fusion algorithm: `rrf`, `bayesian`, or `linear` | `rrf` |
| `ES_VECTOR_SEARCH_MODE` | Vector index type: `hnsw` or `flat` | `hnsw` |
| `RAFT_SEED_NODES` | Comma-separated list of `host:port` pairs for Raft peer discovery (cluster mode) | (unset) |
| `NODE_ID` | Unique node identifier within the cluster | (auto) |
| `LOG_LEVEL` | Log verbosity: `error`, `warn`, `info`, `debug`, `trace` | `info` |
| `DATA_DIR` | Override the data directory path | `/data` |
Configuration File¶
For advanced tuning, mount a TOML configuration file at /config/indentiagraph.toml:
```shell
docker run -d \
  --name indentiadb \
  -p 7001:7001 \
  -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v $(pwd)/indentiagraph.toml:/config/indentiagraph.toml:ro \
  -v indentiadb-data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest
```

Note that `docker run -v` requires an absolute host path, hence `$(pwd)`.
Example indentiagraph.toml:
```toml
[server]
bind = "0.0.0.0"
http_port = 7001
raft_port = 7002
es_port = 9200

[storage]
data_dir = "/data"

[search]
hybrid_scorer = "rrf"
vector_search_mode = "hnsw"

[cache]
query_cache_size_mb = 256

[logging]
level = "info"
format = "json"
```
Volume Mount Strategy¶
Recommended: Named Docker Volume¶
Named volumes are managed by Docker and survive container removal. Use this for most deployments.
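The volume is created implicitly on first `docker run`, or explicitly up front:

```shell
docker volume create indentiadb-data
docker volume inspect indentiadb-data   # shows the on-disk location
# mount it with: -v indentiadb-data:/data
```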
Bind Mount (Host Directory)¶
Bind mounts expose the data directory directly on the host filesystem. Ensure the host directory is owned by the UID the container process runs as (typically 1000), or the node will be unable to write its data files.
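For example, with an illustrative host path of `/srv/indentiadb/data`:

```shell
sudo mkdir -p /srv/indentiadb/data
sudo chown 1000:1000 /srv/indentiadb/data
docker run -d \
  --name indentiadb \
  -p 7001:7001 -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v /srv/indentiadb/data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest
```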
Multiple Mounts¶
Separate data and configuration:
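A common split keeps data on a named volume and mounts the configuration file read-only from the host:

```shell
docker run -d \
  --name indentiadb \
  -p 7001:7001 -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v indentiadb-data:/data \
  -v $(pwd)/config/indentiagraph.toml:/config/indentiagraph.toml:ro \
  ghcr.io/indentiaplatform/indentiadb-trial:latest
```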
Health Check Configuration¶
The /health endpoint returns HTTP 200 when the node is ready. Docker Compose and Docker Swarm use this for container health tracking:
```yaml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:7001/health"]
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 30s
```
Check container health status:
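Docker tracks the result of each probe; query it with:

```shell
docker inspect --format '{{.State.Health.Status}}' indentiadb
```

This prints `starting`, `healthy`, or `unhealthy`.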
Log Streaming¶
Stream logs from a running container:
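```shell
docker logs -f indentiadb
```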
Stream logs with timestamps:
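```shell
docker logs -f --timestamps indentiadb
```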
With Docker Compose:
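```shell
docker compose logs -f
```

Append a service name to follow a single service only.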
Configure JSON log driver with rotation in docker-compose.yml:
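Per service, as in the single-node compose file:

```yaml
logging:
  driver: "json-file"
  options:
    max-size: "100m"
    max-file: "5"
```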
Backup and Restore¶
Creating a Backup¶
```shell
# Stop the container gracefully
docker stop indentiadb

# Create a tarball of the data volume
docker run --rm \
  -v indentiadb-data:/data:ro \
  -v $(pwd):/backup \
  alpine tar czf /backup/indentiadb-backup-$(date +%Y%m%d).tar.gz -C /data .

# Restart the container
docker start indentiadb
```
Restoring from Backup¶
```shell
# Stop and remove the existing container
docker stop indentiadb && docker rm indentiadb

# Clear and restore the data volume
docker run --rm \
  -v indentiadb-data:/data \
  -v $(pwd):/backup \
  alpine sh -c "rm -rf /data/* && tar xzf /backup/indentiadb-backup-20260321.tar.gz -C /data"

# Start the container again
docker run -d \
  --name indentiadb \
  -p 7001:7001 \
  -p 9200:9200 \
  -e SURREAL_USER=root \
  -e SURREAL_PASS=changeme \
  -v indentiadb-data:/data \
  ghcr.io/indentiaplatform/indentiadb-trial:latest
```
Production Considerations¶
Resource Limits¶
Always set CPU and memory limits in production to prevent runaway resource consumption:
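The limits from the single-node compose file are a reasonable starting point; tune them to your workload:

```yaml
deploy:
  resources:
    limits:
      cpus: "2"
      memory: 4G
    reservations:
      cpus: "0.5"
      memory: 1G
```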
Log Rotation¶
Configure log rotation to prevent disk exhaustion:
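With the `json-file` driver, cap size and file count per service:

```yaml
logging:
  driver: "json-file"
  options:
    max-size: "100m"
    max-file: "5"
```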
Restart Policy¶
Use restart: unless-stopped or restart: always to ensure the container recovers from crashes:
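```yaml
restart: unless-stopped
```

`unless-stopped` survives daemon restarts but respects a manual `docker stop`; `always` restarts the container in every case.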
Upgrading¶
- Pull the new image:
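```shell
docker pull ghcr.io/indentiaplatform/indentiadb-trial:latest
```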
- For Docker Compose:
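```shell
docker compose pull
docker compose up -d
```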
- For HA clusters, perform a rolling upgrade by restarting one node at a time and waiting for it to rejoin the Raft cluster before proceeding to the next node:
```shell
docker compose -f docker-compose.ha.yml pull
docker compose -f docker-compose.ha.yml up -d --no-deps indentiadb-1
# Wait for node 1 to be healthy
docker compose -f docker-compose.ha.yml up -d --no-deps indentiadb-2
# Wait for node 2 to be healthy
docker compose -f docker-compose.ha.yml up -d --no-deps indentiadb-3
```
Ports Reference¶
| Port | Protocol | Description |
|---|---|---|
| 7001 | HTTP | SPARQL, GraphQL, REST API, health, and metrics |
| 7002 | gRPC | Raft inter-node communication (cluster only) |
| 9200 | HTTP | Elasticsearch-compatible search API |