Data Models¶
IndentiaDB supports six data models within a single engine. This document describes each model, explains when to use it, and provides complete working examples.
Overview¶
| Model | Query Interface | Write Interface | When to Use |
|---|---|---|---|
| Relational | SurrealQL SELECT | SurrealQL CREATE / UPDATE | Structured records with typed schemas, reports, aggregations |
| Document | SurrealQL SELECT | SurrealQL CREATE / UPSERT | Semi-structured, nested JSON, flexible schemas |
| Graph RDF | SPARQL 1.2 | SPARQL UPDATE / Graph Store | Knowledge graphs, ontologies, provenance, inference |
| Graph LPG | LPG JSON DSL | Projected from RDF or documents | Traversals, shortest path, PageRank, connected components |
| Vector | SurrealQL HNSW operators | SurrealQL CREATE with embedding field | Similarity search, RAG pipelines |
| Full-Text | SurrealQL @@ / ES Query DSL |
SurrealQL CREATE with indexed field | Keyword search, fuzzy matching, BM25 ranking |
1. Relational (SurrealQL)¶
What It Is¶
The relational model in IndentiaDB uses SCHEMAFULL tables — tables where every field is declared with a type, and records that violate the schema are rejected at write time. This gives you the predictability of a traditional SQL database with the ergonomics of SurrealQL.
SurrealQL record links eliminate the need for explicit JOIN clauses. A field of type record<department> automatically resolves to the referenced record during a SELECT.
When to Use It¶
- You have well-defined, stable schemas
- You need aggregations, grouping, and ordered results
- You want referential integrity enforced at the field level
- You are replacing a PostgreSQL / MySQL workload
Full Working Example¶
-- Define schema
DEFINE TABLE department SCHEMAFULL;
DEFINE FIELD name ON department TYPE string;
DEFINE FIELD budget ON department TYPE number;
DEFINE FIELD location ON department TYPE string;
DEFINE TABLE employee SCHEMAFULL;
DEFINE FIELD name ON employee TYPE string;
DEFINE FIELD email ON employee TYPE string ASSERT string::is::email($value);
DEFINE FIELD department ON employee TYPE record<department>;
DEFINE FIELD salary ON employee TYPE number;
DEFINE FIELD hired ON employee TYPE datetime;
-- Insert departments
CREATE department:engineering CONTENT {
name: "Engineering", budget: 500000, location: "Amsterdam"
};
CREATE department:research CONTENT {
name: "Research", budget: 300000, location: "Utrecht"
};
-- Insert employees
CREATE employee:alice CONTENT {
name: "Alice van den Berg",
email: "alice@example.com",
department: department:engineering,
salary: 85000,
hired: d'2023-03-15T09:00:00Z'
};
CREATE employee:bob CONTENT {
name: "Bob de Vries",
email: "bob@example.com",
department: department:engineering,
salary: 92000,
hired: d'2022-01-10T09:00:00Z'
};
CREATE employee:carol CONTENT {
name: "Carol Jansen",
email: "carol@example.com",
department: department:research,
salary: 78000,
hired: d'2024-06-01T09:00:00Z'
};
-- Basic query: employees earning above 80k with department resolved
SELECT name, salary, department.name AS dept
FROM employee
WHERE salary > 80000
ORDER BY salary DESC;
-- Results: Bob (92000, Engineering), Alice (85000, Engineering)
-- Aggregation: average salary per department
SELECT department.name AS dept, math::mean(salary) AS avg_salary, count() AS headcount
FROM employee
GROUP BY department
ORDER BY dept;
-- Engineering: avg=88500, headcount=2; Research: avg=78000, headcount=1
-- Subquery: employees in departments with budget > 400k
SELECT name, department.name AS dept
FROM employee
WHERE department IN (
SELECT VALUE id FROM department WHERE budget > 400000
);
-- Alice, Bob
-- Update: give Engineering a raise
UPDATE employee SET salary += 5000 WHERE department = department:engineering;
2. Document (SurrealQL)¶
What It Is¶
The document model uses SCHEMALESS tables — tables that accept any JSON-like structure without a predefined schema. Different records in the same table can have completely different fields. This is analogous to MongoDB collections.
SurrealDB's dot-notation field access, array filters, and record links work identically in schemaless tables.
When to Use It¶
- Your data structure is evolving or not yet fully defined
- You are storing heterogeneous records (different shapes per record)
- You need to model nested objects with arbitrary depth
- You are replacing a MongoDB / CouchDB workload
Full Working Example¶
DEFINE TABLE project SCHEMALESS;
DEFINE TABLE task SCHEMALESS;
-- Create a project with nested metadata and milestones array
CREATE project:indentiagraph CONTENT {
name: "IndentiaGraph",
status: "active",
tags: ["database", "graph", "rdf", "rust"],
metadata: {
started: "2024-01-15",
lead: "Alice",
priority: "high",
budget: 250000
},
milestones: [
{ name: "Alpha", date: "2024-06-01", completed: true },
{ name: "Beta", date: "2024-12-01", completed: true },
{ name: "GA", date: "2025-06-01", completed: false }
]
};
-- Create tasks with record links to the project
CREATE task:lpg_csr CONTENT {
title: "Implement CSR adjacency structure",
project: project:indentiagraph,
assignee: "Bob",
status: "done",
labels: ["performance", "lpg"],
story_points: 8
};
CREATE task:sparql_engine CONTENT {
title: "SPARQL 1.2 property path evaluation",
project: project:indentiagraph,
assignee: "Alice",
status: "in_progress",
labels: ["rdf", "sparql"],
depends_on: [task:lpg_csr],
story_points: 13
};
-- Nested field access
SELECT name, metadata.lead, metadata.priority
FROM project WHERE status = "active";
-- Array contains filter
SELECT title, assignee FROM task WHERE labels CONTAINS "lpg";
-- Filter array elements inline (returns only completed milestones)
SELECT milestones[WHERE completed = true] AS done_milestones
FROM project:indentiagraph;
-- Auto-resolve record link: project.name is fetched automatically
SELECT title, project.name AS project_name, assignee
FROM task
WHERE assignee = "Alice";
-- UPSERT: create or overwrite
UPSERT project:fleetapi SET
name = "Fleet API",
status = "active",
metadata.lead = "Carol";
-- MERGE: partial update (only specified fields change)
UPDATE task:sparql_engine MERGE { status: "done", completed_at: time::now() };
-- DELETE with condition
DELETE task WHERE status = "done" AND story_points < 5;
3. Graph RDF (SPARQL 1.2)¶
What It Is¶
IndentiaDB stores RDF triples natively in a 6-permutation index (see Architecture). The SPARQL 1.2 endpoint supports all SPARQL 1.1 features plus RDF-star quoted triples, the TRIPLE() function family, and the latest Working Draft changes through the 9 April 2026 update (aligned with the RDF 1.2 Candidate Recommendation of 7 April 2026).
An RDF triple is a statement of fact: (subject, predicate, object). Named graphs group triples into logical partitions for provenance tracking, access control, or organizational separation.
When to Use It¶
- You are building a knowledge graph or ontology
- You need to represent provenance (who asserted this fact, with what confidence)
- You need to run semantic inference (RDFS/OWL entailment)
- You are federating queries across multiple SPARQL endpoints
- You need to query linked open data
Full Working Example: Social Knowledge Graph¶
# Insert data with named graphs
INSERT DATA {
GRAPH <http://example.org/social> {
<http://example.org/alice> a <http://xmlns.com/foaf/0.1/Person> ;
<http://xmlns.com/foaf/0.1/name> "Alice van den Berg" ;
<http://xmlns.com/foaf/0.1/age> 30 ;
<http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> ,
<http://example.org/carol> .
<http://example.org/bob> a <http://xmlns.com/foaf/0.1/Person> ;
<http://xmlns.com/foaf/0.1/name> "Bob de Vries" ;
<http://xmlns.com/foaf/0.1/age> 28 ;
<http://xmlns.com/foaf/0.1/knows> <http://example.org/carol> .
<http://example.org/carol> a <http://xmlns.com/foaf/0.1/Person> ;
<http://xmlns.com/foaf/0.1/name> "Carol Jansen" ;
<http://xmlns.com/foaf/0.1/age> 35 .
}
}
# SELECT with FILTER and OPTIONAL
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?age ?email WHERE {
GRAPH <http://example.org/social> {
?person a foaf:Person ;
foaf:name ?name ;
foaf:age ?age .
OPTIONAL { ?person foaf:mbox ?email }
}
FILTER (?age >= 29)
}
ORDER BY ?name
# Property path: friends-of-friends (2 hops)
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?fof ?name WHERE {
<http://example.org/alice> foaf:knows/foaf:knows ?fof .
?fof foaf:name ?name .
FILTER (?fof != <http://example.org/alice>)
}
# RDF-star: annotate a triple with provenance metadata
INSERT DATA {
<< <http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> >>
<http://example.org/since> "2020-01-15"^^<http://www.w3.org/2001/XMLSchema#date> ;
<http://example.org/confidence> "0.95"^^<http://www.w3.org/2001/XMLSchema#decimal> ;
<http://example.org/source> <http://example.org/linkedin_import> .
}
# Query provenance via RDF-star
PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?subject ?object ?since ?confidence WHERE {
<< ?subject foaf:knows ?object >> ex:since ?since ;
ex:confidence ?confidence .
FILTER (?confidence > 0.9)
}
ORDER BY DESC(?confidence)
See the SPARQL 1.2 Reference for the complete SPARQL reference.
4. Graph LPG (LPG JSON DSL)¶
What It Is¶
The Labeled Property Graph model represents graphs as nodes (with labels and properties) and typed edges (with properties). IndentiaDB's LPG engine builds a Compressed Sparse Row (CSR) adjacency structure projected from RDF triples and/or document tables.
The CSR layout provides: - O(1) neighbor lookup for any node - Cache-friendly memory layout for BFS/DFS traversals - Efficient graph algorithm execution (PageRank, connected components)
Importantly: LPG does not have a separate write path. You write data as RDF triples or SurrealQL documents, and the LPG view is built from those sources. This means your RDF knowledge graph automatically gains graph algorithm capabilities.
When to Use It¶
- You need PageRank, connected components, or other graph algorithms
- You need shortest path queries across a large graph
- You are analyzing social networks, dependency graphs, or supply chains
- You want to augment RDF data with graph-theoretic analysis
Full Working Example¶
Step 1: Write data as RDF triples
PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
ex:alice a foaf:Person ; foaf:name "Alice" .
ex:bob a foaf:Person ; foaf:name "Bob" .
ex:carol a foaf:Person ; foaf:name "Carol" .
ex:dave a foaf:Person ; foaf:name "Dave" .
ex:alice <http://xmlns.com/foaf/0.1/knows> ex:bob .
ex:bob <http://xmlns.com/foaf/0.1/knows> ex:carol .
ex:carol <http://xmlns.com/foaf/0.1/knows> ex:dave .
ex:alice <http://xmlns.com/foaf/0.1/knows> ex:dave .
}
Step 2: Query the LPG projection
POST /lpg/query
{
"kind": {
"Traverse": {
"start": { "iri": "http://example.org/alice" },
"edge": "knows",
"direction": "Out",
"min_hops": 1,
"max_hops": 3,
"target_label": "Person"
}
},
"limit": 100,
"return_fields": ["id", "name", "hop_count"]
}
Shortest path:
POST /lpg/query
{
"kind": {
"ShortestPath": {
"start": { "iri": "http://example.org/alice" },
"target": { "iri": "http://example.org/carol" },
"edge": "knows",
"direction": "Out"
}
},
"limit": 1,
"return_fields": ["path", "length"]
}
PageRank:
POST /lpg/query
{
"kind": {
"PageRank": {
"damping": 0.85,
"max_iterations": 100,
"tolerance": 1e-8,
"label_filter": "Person"
}
},
"limit": 100,
"return_fields": ["id", "score"]
}
See the LPG Reference for the complete JSON DSL specification.
5. Vector / Embeddings¶
What It Is¶
IndentiaDB stores high-dimensional embedding vectors alongside structured data fields in the same record. Vectors are indexed using HNSW (Hierarchical Navigable Small World), an approximate nearest neighbor algorithm that provides O(log n) search time with tunable recall.
This enables vector similarity search (semantic search, image similarity, recommendation) without a separate vector database.
When to Use It¶
- You are building a RAG (Retrieval-Augmented Generation) pipeline
- You need semantic search beyond keyword matching
- You are comparing embeddings from an ML model (text, image, audio)
- You want to store vectors alongside metadata without a separate system
Full Working Example¶
-- Define table with embedding field
DEFINE TABLE document SCHEMAFULL;
DEFINE FIELD title ON document TYPE string;
DEFINE FIELD content ON document TYPE string;
DEFINE FIELD source ON document TYPE string;
DEFINE FIELD embedding ON document TYPE array<float>;
-- Define HNSW index (1536-dimensional, OpenAI text-embedding-3-small compatible)
DEFINE INDEX idx_embedding ON document FIELDS embedding
HNSW DIMENSION 1536 DIST COSINE
EFC 200 M 16;
-- Insert documents with pre-computed embeddings (embedding values abbreviated)
CREATE document:doc1 CONTENT {
title: "Introduction to Knowledge Graphs",
content: "Knowledge graphs represent real-world entities and their relationships...",
source: "internal_wiki",
embedding: [0.023, -0.041, 0.087, /* ... 1536 floats total ... */]
};
CREATE document:doc2 CONTENT {
title: "SPARQL 1.2 Working Draft",
content: "SPARQL is the W3C query language for RDF data...",
source: "w3c_spec",
embedding: [0.015, -0.033, 0.091, /* ... */]
};
-- Similarity search: find 5 most relevant documents for a query vector
LET $query_vec = [0.021, -0.039, 0.085, /* ... */];
SELECT id, title, source,
vector::similarity::cosine(embedding, $query_vec) AS score
FROM document
WHERE embedding <|5,200|> $query_vec
ORDER BY score DESC;
-- Euclidean distance search
SELECT id, title,
vector::distance::euclidean(embedding, $query_vec) AS distance
FROM document
WHERE embedding <|5|> $query_vec
ORDER BY distance ASC;
-- Hybrid search: combine BM25 keyword score with vector similarity
DEFINE INDEX idx_content ON document FIELDS content
SEARCH ANALYZER english_analyzer BM25;
LET $text_score = search::score(1);
LET $vec_score = vector::similarity::cosine(embedding, $query_vec);
SELECT id, title,
($text_score * 0.4 + $vec_score * 0.6) AS combined_score
FROM document
WHERE content @1@ "knowledge graph"
AND embedding <|20,200|> $query_vec
ORDER BY combined_score DESC
LIMIT 10;
-- RAG context retrieval pattern
LET $context_chunks = (
SELECT title, content,
vector::similarity::cosine(embedding, $query_vec) AS relevance
FROM document
WHERE embedding <|5,100|> $query_vec
ORDER BY relevance DESC
);
-- Return context for LLM prompt construction
SELECT title, content FROM $context_chunks;
6. Full-Text Search (BM25)¶
What It Is¶
IndentiaDB includes a built-in full-text search engine based on BM25/TF-IDF ranking. Text fields are indexed using configurable analyzers that control tokenization, normalization (lowercase, ASCII folding), stemming, and n-gram generation. Search uses BM25 scoring with optional field boosting.
The Elasticsearch-compatible API on port 9200 exposes the same search capability through the Elasticsearch-compatible Query DSL, enabling drop-in compatibility with ES client libraries.
When to Use It¶
- You need ranked keyword search over text fields
- You need fuzzy matching (typo tolerance)
- You need phrase queries or proximity scoring
- You are replacing an Elasticsearch / OpenSearch workload
- You want to combine full-text search with vector similarity (hybrid search)
Full Working Example¶
-- Define analyzers
DEFINE ANALYZER english_content
TOKENIZERS blank, class
FILTERS lowercase, ascii, snowball(english), edgengram(2, 15);
DEFINE ANALYZER exact_match
TOKENIZERS blank
FILTERS lowercase, ascii;
-- Define table and search indexes
DEFINE TABLE article SCHEMAFULL;
DEFINE FIELD title ON article TYPE string;
DEFINE FIELD body ON article TYPE string;
DEFINE FIELD author ON article TYPE string;
DEFINE FIELD tags ON article TYPE array<string>;
DEFINE FIELD published ON article TYPE datetime;
DEFINE INDEX idx_title ON article FIELDS title SEARCH ANALYZER english_content BM25(1.2, 0.75);
DEFINE INDEX idx_body ON article FIELDS body SEARCH ANALYZER english_content BM25(1.2, 0.75);
DEFINE INDEX idx_tags ON article FIELDS tags SEARCH ANALYZER exact_match BM25;
-- Insert articles
CREATE article:a1 CONTENT {
title: "Building Knowledge Graphs with SPARQL",
body: "SPARQL enables complex querying of RDF knowledge graphs...",
author: "Alice",
tags: ["sparql", "rdf", "knowledge-graph"],
published: d'2025-03-01T00:00:00Z'
};
CREATE article:a2 CONTENT {
title: "Vector Search in Rust",
body: "HNSW (Hierarchical Navigable Small World) enables fast approximate nearest neighbor search...",
author: "Bob",
tags: ["vector", "rust", "hnsw"],
published: d'2025-04-15T00:00:00Z'
};
CREATE article:a3 CONTENT {
title: "Multi-Model Databases Explained",
body: "Modern applications need relational, document, graph, vector, and full-text search in one system...",
author: "Carol",
tags: ["database", "multi-model"],
published: d'2025-05-10T00:00:00Z'
};
-- Single-field full-text search (title only)
SELECT title, search::score(1) AS relevance
FROM article
WHERE title @1@ "knowledge graph"
ORDER BY relevance DESC;
-- Multi-field search with boosted title (title score × 2 + body score)
SELECT title, author,
search::score(1) * 2.0 + search::score(2) AS relevance
FROM article
WHERE title @1@ "graph"
OR body @2@ "graph"
ORDER BY relevance DESC
LIMIT 10;
-- Fuzzy prefix matching (edgengram allows partial tokens)
SELECT title FROM article WHERE title @1@ "knowl";
-- Returns "Building Knowledge Graphs..." because of edgengram(2,15)
-- Combined: full-text + date filter + tag filter
SELECT title, author, published,
search::score(1) + search::score(2) AS relevance
FROM article
WHERE (title @1@ "database" OR body @2@ "database")
AND published > d'2025-01-01T00:00:00Z'
ORDER BY relevance DESC;
-- Elasticsearch-compatible API (port 9200)
-- POST http://localhost:9200/article/_search
-- {
-- "query": {
-- "multi_match": {
-- "query": "knowledge graph",
-- "fields": ["title^2", "body"],
-- "fuzziness": "AUTO"
-- }
-- },
-- "size": 10
-- }
Combining Models¶
The true power of IndentiaDB is combining these models in a single query or transaction. See Hybrid Queries for detailed multi-model examples including:
- RDF SPARQL results fed into SurrealQL document queries
- Vector similarity search enriched with RDF metadata
- Full-text search results traversed via LPG graph edges
- Knowledge graph analytics with results stored in document tables