RDF-star Guide
RDF-star (part of the RDF 1.2 specification) is the single most impactful improvement to the RDF standard in over a decade. This guide explains what it is, why it matters, and how to use it effectively in IndentiaDB.
The Problem: Annotating Facts in RDF 1.1
A knowledge graph stores facts as triples: subject → predicate → object. But real-world knowledge is rarely clean and absolute. You need to record:
- Where a fact came from (data source, ingestion job, agent)
- How confident you are in it (ML score, expert review, heuristic)
- When it was valid (temporal provenance, bitemporal modelling)
- Who is allowed to see it (access control)
In RDF 1.1, the standard way to annotate a triple was reification, which creates an rdf:Statement blank node that represents the triple and then attaches properties to that blank node:
# Goal: "Alice knows Bob", confidence 0.9, from LinkedIn
# RDF 1.1 reification: 6 triples for 1 annotated fact
_:stmt1 a rdf:Statement ;
    rdf:subject ex:alice ;
    rdf:predicate foaf:knows ;
    rdf:object ex:bob .
_:stmt1 ex:confidence "0.9"^^xsd:decimal .
_:stmt1 ex:source <http://linkedin.com/import> .
This is painful in three ways:
| Problem | Impact |
|---|---|
| Verbosity | 6 triples instead of 1 — storage and ingestion cost multiplies |
| Blank node fragility | Blank nodes have no stable identity; you cannot reference _:stmt1 from another named graph or across SPARQL federation |
| Query complexity | Finding all high-confidence facts requires a three-way join through the blank node |
With two annotations per fact, as in the example above, a knowledge graph with 10 million annotated facts requires 60 million triples under reification. Queries that filter on confidence scores must traverse an extra hop for every result.
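To make the query cost concrete, here is a sketch of the reification-style lookup for all facts with a confidence of at least 0.8. It assumes the data is shaped like the _:stmt1 example above and that ex: is bound to http://example.org/; the 0.8 threshold is arbitrary.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>

# Each fact must be reassembled from its rdf:Statement node
# before the confidence filter can be applied.
SELECT ?s ?p ?o ?confidence
WHERE {
    ?stmt a rdf:Statement ;
          rdf:subject   ?s ;
          rdf:predicate ?p ;
          rdf:object    ?o ;
          ex:confidence ?confidence .
    FILTER (?confidence >= 0.8)
}
```

Every result row depends on the intermediate ?stmt node, which is the extra hop described above.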
The Solution: RDF-star Quoted Triples
RDF 1.2 introduces quoted triples: the triple << s p o >> is itself a first-class RDF term that can appear as the subject or object of another triple.
# RDF 1.2: same annotation in 1 statement
INSERT DATA {
    << ex:alice foaf:knows ex:bob >>
        ex:confidence "0.9"^^xsd:decimal ;
        ex:source <http://linkedin.com/import> .
}
No blank node. No indirection. The quoted triple << ex:alice foaf:knows ex:bob >> is the subject — you annotate it directly. The base triple and its metadata are stored together.
Impact at scale
10 million annotated facts with two annotations each → 30 million triples (10 million base triples plus 20 million annotation triples) instead of 60 million under reification. Queries that filter on annotation properties are direct, with no intermediate joins through blank nodes.
Syntax Reference
Inline annotation (the common case)
PREFIX ex: <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
INSERT DATA {
    << ex:alice foaf:knows ex:bob >>
        ex:since "2020-01-15"^^xsd:date ;
        ex:confidence "0.95"^^xsd:decimal ;
        ex:source <http://crm.example.org/import-001> .
}
Shorthand with ~ operator (SPARQL 1.2)
The ~ operator is syntactic sugar for inline annotation of the immediately preceding triple:
INSERT DATA {
    ex:alice foaf:knows ex:bob
        ~ ex:confidence "0.95"^^xsd:decimal
        ~ ex:source <http://crm.example.org/import-001> .
}
This is exactly equivalent to the << >> form — choose whichever reads more clearly for your use case.
Quoted triple as subject
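A minimal sketch of a quoted triple in subject position, in the same syntax as the inline annotation example above. The ex:assertedBy property and ex:carol are illustrative names, not part of any vocabulary defined in this guide.

```sparql
PREFIX ex:   <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# The quoted triple is the subject; ex:assertedBy (hypothetical) is the predicate.
INSERT DATA {
    << ex:alice foaf:knows ex:bob >> ex:assertedBy ex:carol .
}
```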
Quoted triple as object
# ex:claim1 asserts a specific triple
INSERT DATA {
    ex:claim1 ex:asserts << ex:alice foaf:knows ex:bob >> .
}
Nested quoted triples
A quoted triple can itself contain a quoted triple:
# "It was recorded on 2026-01-15 that Source X claims that Alice knows Bob"
INSERT DATA {
    << ex:sourceX ex:claims << ex:alice foaf:knows ex:bob >> >>
        ex:recordedAt "2026-01-15T09:00:00Z"^^xsd:dateTime .
}
Querying Quoted Triples
Match and retrieve annotation properties
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/>
SELECT ?person1 ?person2 ?confidence ?source
WHERE {
    << ?person1 foaf:knows ?person2 >>
        ex:confidence ?confidence ;
        ex:source ?source .
}
ORDER BY DESC(?confidence)
Filter by confidence threshold
SELECT ?person1 ?person2 ?confidence
WHERE {
    << ?person1 foaf:knows ?person2 >>
        ex:confidence ?confidence .
    FILTER (?confidence >= 0.8)
}
Find all facts from a specific source
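One way to write this lookup, assuming facts carry an ex:source annotation as in the inline annotation example. The source IRI is the one used there and stands in for whichever source you want to audit.

```sparql
PREFIX ex: <http://example.org/>

# Leave all three positions of the quoted triple unbound and fix the source.
SELECT ?s ?p ?o
WHERE {
    << ?s ?p ?o >> ex:source <http://crm.example.org/import-001> .
}
```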
Retrieve annotation or fall back gracefully
Use OPTIONAL when not every triple has an annotation:
SELECT ?person1 ?person2 ?confidence
WHERE {
    ?person1 foaf:knows ?person2 .
    OPTIONAL {
        << ?person1 foaf:knows ?person2 >> ex:confidence ?confidence .
    }
}
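If downstream code needs a value in every row, the optional annotation can be combined with COALESCE, a standard SPARQL 1.1 function. This sketch assumes 0.0 is an acceptable default for unannotated facts; that default and the ?effectiveConfidence name are illustrations, not IndentiaDB conventions.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/>

SELECT ?person1 ?person2 (COALESCE(?confidence, 0.0) AS ?effectiveConfidence)
WHERE {
    ?person1 foaf:knows ?person2 .
    # ?confidence stays unbound when the fact has no annotation.
    OPTIONAL {
        << ?person1 foaf:knows ?person2 >> ex:confidence ?confidence .
    }
}
```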
The TRIPLE() Function Family
SPARQL 1.2 adds functions for constructing and decomposing triple terms programmatically.
Construct a triple term
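A sketch of building a triple term from bound variables with TRIPLE(subject, predicate, object), as defined in the SPARQL 1.2 drafts. It assumes foaf:knows data like the earlier examples; the ?fact variable name is arbitrary.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?fact
WHERE {
    ?s foaf:knows ?o .
    # Build a triple term from the matched subject and object.
    BIND(TRIPLE(?s, foaf:knows, ?o) AS ?fact)
}
```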
Decompose a triple term
SELECT ?s ?p ?o
WHERE {
    << ex:alice foaf:knows ex:bob >> ex:confidence ?conf .
    BIND(SUBJECT(<< ex:alice foaf:knows ex:bob >>) AS ?s)
    BIND(PREDICATE(<< ex:alice foaf:knows ex:bob >>) AS ?p)
    BIND(OBJECT(<< ex:alice foaf:knows ex:bob >>) AS ?o)
}
Type-test with isTRIPLE()
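A sketch using isTRIPLE() to keep only values that are triple terms. It assumes ex:asserts data like the quoted-triple-as-object example earlier in this guide, where the object could be either a triple term or an ordinary IRI or literal.

```sparql
PREFIX ex: <http://example.org/>

SELECT ?claim ?value
WHERE {
    ?claim ex:asserts ?value .
    # Discard rows whose value is an IRI, literal, or blank node.
    FILTER (isTRIPLE(?value))
}
```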
Use Cases
1. Provenance and lineage
Track where every fact came from and how it got into the graph:
INSERT DATA {
    GRAPH <http://example.org/hr> {
        ex:alice ex:worksAt ex:Acme .
    }
    << ex:alice ex:worksAt ex:Acme >>
        prov:wasGeneratedBy ex:hr_import_2026_q1 ;
        prov:generatedAtTime "2026-01-15T10:00:00Z"^^xsd:dateTime ;
        ex:confidence "0.99"^^xsd:decimal ;
        ex:source "hr-system-v2" .
}
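Reading the lineage back is a direct pattern match. This sketch lists every fact produced by the import job above, oldest first; it assumes prov: is bound to the standard W3C PROV namespace, which the snippet above does not declare.

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX ex:   <http://example.org/>

SELECT ?s ?p ?o ?time
WHERE {
    << ?s ?p ?o >>
        prov:wasGeneratedBy  ex:hr_import_2026_q1 ;
        prov:generatedAtTime ?time .
}
ORDER BY ?time
```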
2. ML confidence scores
Attach model confidence directly to inferred relationships:
INSERT DATA {
    << ex:product_a ex:similarTo ex:product_b >>
        ex:mlScore "0.87"^^xsd:decimal ;
        ex:modelId "rec-model-v3" ;
        ex:computedAt "2026-03-01T00:00:00Z"^^xsd:dateTime .
}
Query: "find all product similarities above 0.85 computed by the latest model":
SELECT ?a ?b ?score
WHERE {
    << ?a ex:similarTo ?b >>
        ex:mlScore ?score ;
        ex:modelId "rec-model-v3" .
    FILTER (?score > 0.85)
}
ORDER BY DESC(?score)
3. Access control (triple-level ACL)
IndentiaDB uses RDF-star for per-triple access control — the only database that does this at the storage layer:
INSERT DATA {
    << ex:patient_record_42 ex:diagnosis ex:condition_cancer >>
        acl:allowedSid "S-1-5-21-domain-1050" ; # medical-staff group
        acl:allowedSid "S-1-5-21-domain-2001" . # oncology group
}
Triples without acl:allowedSid annotations are visible to all principals with graph access. See the ACL documentation for the full model.
4. Temporal validity
Combine quoted triples with validity periods for bitemporal modelling:
INSERT DATA {
    << ex:alice ex:ceo_of ex:Acme >>
        ex:validFrom "2023-06-01"^^xsd:date ;
        ex:validUntil "2025-12-31"^^xsd:date .
}
Query current CEOs:
SELECT ?person ?company
WHERE {
    << ?person ex:ceo_of ?company >>
        ex:validFrom ?from ;
        ex:validUntil ?until .
    FILTER (
        ?from <= "2026-03-28"^^xsd:date &&
        ?until >= "2026-03-28"^^xsd:date
    )
}
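The query above only returns facts that carry both bounds. If ongoing roles are stored without an ex:validUntil annotation (a hypothetical convention, not one this guide prescribes), a variant along these lines also picks up open-ended periods:

```sparql
PREFIX ex:  <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?person ?company
WHERE {
    << ?person ex:ceo_of ?company >> ex:validFrom ?from .
    # The end date is optional: a missing value is treated as "still valid".
    OPTIONAL {
        << ?person ex:ceo_of ?company >> ex:validUntil ?until .
    }
    FILTER (
        ?from <= "2026-03-28"^^xsd:date &&
        (!BOUND(?until) || ?until >= "2026-03-28"^^xsd:date)
    )
}
```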
Comparison: Reification vs RDF-star
| Aspect | RDF 1.1 Reification | RDF 1.2 (RDF-star) |
|---|---|---|
| Triples per annotated fact | 4 rdf:Statement triples + 1 per annotation | 1 base triple + 1 per annotation |
| Subject identity | Blank node (fragile, local) | The triple itself (stable, global) |
| Cross-graph reference | Not possible with blank nodes | Works naturally |
| Query complexity | 3-way join through blank node | Direct pattern match |
| SPARQL syntax | Complex ?stmt rdf:subject ?s ... | << ?s ?p ?o >> ?prop ?val |
| Storage overhead (10M facts, 2 annotations each) | ~60M triples | ~30M triples (10M base + 20M annotation) |
| Interoperability | Wide (RDF 1.1 is universal) | Growing — RDF 1.2 is a W3C Candidate Recommendation (7 April 2026) |
Other RDF 1.2 Additions
RDF-star is the headline feature of RDF 1.2, but the specification adds two other important changes that IndentiaDB also supports.
rdf:JSON literals
rdf:JSON is a new built-in datatype for embedding JSON values as RDF literals:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>
INSERT DATA {
    ex:sensor42 ex:config
        "{\"threshold\": 25, \"unit\": \"celsius\"}"^^rdf:JSON .
}
In IndentiaDB, fields backed by rdf:JSON literals are automatically indexed as Elasticsearch keyword fields (exact match) rather than text (full-text). This preserves the raw JSON for filtering and aggregation without tokenization.
Language tags with base direction (rdf:dirLangString)
RDF 1.2 introduces a new datatype for language-tagged strings that also carry a text rendering direction (ltr or rtl). The SPARQL syntax uses a double-dash suffix:
INSERT DATA {
    ex:title1 rdfs:label "Hello World"@en--ltr .
    ex:title2 rdfs:label "مرحبا بالعالم"@ar--rtl .
}
The LANGDIR() SPARQL function returns the direction string ("ltr" or "rtl"):
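For instance, a sketch that lists right-to-left labels together with their language and direction, assuming data like the rdfs:label example above:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?label (LANG(?label) AS ?lang) (LANGDIR(?label) AS ?dir)
WHERE {
    ?s rdfs:label ?label .
    # Keep only labels whose base direction is right-to-left.
    FILTER (LANGDIR(?label) = "rtl")
}
```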
Language tag canonicalization (BCP47)
Per the RDF 1.2 Candidate Recommendation, language tag comparisons are case-insensitive, and the canonical form of a tag is all lowercase. The direction component of a directional language-tagged string follows the same rule.
IndentiaDB preserves the original casing that was written into the store for regular output, but the canonical N-Triples serialization (see SPARQL endpoints — Response formats) emits tags in their canonical form:
| Stored | Canonical N-Triples output |
|---|---|
| "chat"@EN-GB | "chat"@en-gb |
| "مرحبا"@AR--RTL | "مرحبا"@ar--rtl |
| "Hallo"@De-At--Ltr | "Hallo"@de-at--ltr |
Use the canonical form whenever byte-identical output is required — e.g. for hashing, signing, or diffing across environments.
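Because comparison is case-insensitive, queries do not need to reproduce the stored casing. A sketch using the standard langMatches() function, which compares language ranges case-insensitively; it assumes rdfs:label data and uses en-GB purely as an example range.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?label
WHERE {
    ?s rdfs:label ?label .
    # Matches labels tagged EN-GB, en-gb, or En-Gb alike.
    FILTER (langMatches(LANG(?label), "en-GB"))
}
```

Note that langMatches() treats its second argument as a range, so longer tags that begin with en-GB also match.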
RDF Reference IRI validation (§3.3.1)
RDF 1.2 Section 3.3.1 defines the notion of an RDF Reference IRI: an IRI
that is suitable as a stable resource identifier and that must survive
reference resolution unchanged. When strict validation is enabled (via
IriValidator::strict_rdf12() in the builder library), the following are
flagged:
- IRIs with "." or ".." path segments (error: would collapse during reference resolution)
- Missing authority component in hierarchical schemes such as http/https (error)
- Non-lowercase scheme names (warning: canonical form is lowercase)
- Empty path segments (warning)
This validation runs during index builds when strict mode is enabled; it does not reject writes on the live SPARQL endpoint.
Content-Type version signalling
When IndentiaDB returns RDF from the Graph Store Protocol (GET /data), the response includes a version=1.2 parameter in the Content-Type header.
RDF 1.2-aware clients can use this to enable quoted-triple parsing. RDF 1.1 clients that ignore unknown parameters continue to work unchanged.
When to Use Quoted Triples
Use RDF-star when you need to annotate facts — provenance, confidence, access control, temporal validity, audit trails.
Use standard triples when the fact itself is the complete statement and you have no per-fact metadata to attach.
Don't reify unless you are integrating with a legacy system that only understands RDF 1.1. Even then, consider storing in RDF-star format and exporting as reification for compatibility.
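As a rough sketch of such an export, the following CONSTRUCT query rewrites each annotation into RDF 1.1 reification, using the quoted-triple syntax from this guide. The http://example.org/stmt/ namespace and the hash-based naming scheme are assumptions made for illustration. Rows whose subject or object is a blank node or a nested quoted triple are skipped because STR() does not accept them, and an IRI and a literal with the same text would hash to the same statement node.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

CONSTRUCT {
    ?stmt a rdf:Statement ;
          rdf:subject   ?s ;
          rdf:predicate ?p ;
          rdf:object    ?o ;
          ?annotationProperty ?annotationValue .
}
WHERE {
    << ?s ?p ?o >> ?annotationProperty ?annotationValue .
    # Derive a stable statement IRI from the triple's components so that all
    # annotations of the same fact share one rdf:Statement node.
    BIND(IRI(CONCAT("http://example.org/stmt/",
                    SHA256(CONCAT(STR(?s), " ", STR(?p), " ", STR(?o))))) AS ?stmt)
}
```

Minting an IRI instead of a blank node sidesteps the blank-node identity problem described at the start of this guide, at the cost of the collision caveats noted above.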
Further Reading
- SPARQL 1.2 Reference — Full query language with RDF-star examples
- Lineage & Provenance — Production provenance pipeline built on RDF-star
- ACL Filtering — Per-triple access control using RDF-star
- W3C RDF 1.2 Candidate Recommendation — Specification (7 April 2026)
- W3C SPARQL 1.2 Working Draft — Query language (9 April 2026)