RDF-star Guide

RDF-star, now part of the RDF 1.2 specification, is arguably the most significant addition to RDF since RDF 1.1 was published in 2014. This guide explains what it is, why it matters, and how to use it effectively in IndentiaDB.

The Problem: Annotating Facts in RDF 1.1

A knowledge graph stores facts as triples: subject → predicate → object. But real-world knowledge is rarely clean and absolute. You need to record:

Where a fact came from (data source, ingestion job, agent)
How confident you are in it (ML score, expert review, heuristic)
When it was valid (temporal provenance, bitemporal modelling)
Who is allowed to see it (access control)

In RDF 1.1, the standard way to annotate a triple was reification: creating an rdf:Statement blank node that represents the triple, then attaching properties to the blank node:

# Goal: "Alice knows Bob" — confidence 0.9, from LinkedIn
# RDF 1.1 reification — 6 triples for 1 annotated fact:

_:stmt1 a               rdf:Statement ;
        rdf:subject     ex:alice ;
        rdf:predicate   foaf:knows ;
        rdf:object      ex:bob .
_:stmt1 ex:confidence   "0.9"^^xsd:decimal .
_:stmt1 ex:source       <http://linkedin.com/import> .

This is painful in three ways:

Problem	Impact
Verbosity	6 triples instead of 1 — storage and ingestion cost multiplies
Blank node fragility	Blank nodes have no stable identity; you cannot reference `_:stmt1` from another named graph or across SPARQL federation
Query complexity	Finding all high-confidence facts requires a three-way join through the blank node

A knowledge graph with 10 million annotated facts requires 60 million triples under reification. Queries that filter on confidence scores must traverse an extra hop for every result.
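
The 60 million figure follows from simple arithmetic, sketched below in Python. The function name is hypothetical, and the count covers only the reification and annotation triples, not any separately asserted base triple.

```python
REIFICATION_TRIPLES = 4  # rdf:type, rdf:subject, rdf:predicate, rdf:object


def reified_triple_count(facts: int, annotations_per_fact: int) -> int:
    """Triples needed to annotate `facts` statements via RDF 1.1 reification."""
    return facts * (REIFICATION_TRIPLES + annotations_per_fact)


# The example above attaches two annotations (confidence and source).
print(reified_triple_count(10_000_000, 2))  # 60000000
# A third annotation per fact pushes the total to 70 million.
print(reified_triple_count(10_000_000, 3))  # 70000000
```

The fixed cost of four structural triples per fact is what reification adds on top of the annotations themselves.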

The Solution: RDF-star Quoted Triples

RDF 1.2 introduces quoted triples: the triple << s p o >> is itself a first-class RDF term that can appear as the subject or object of another triple.

# RDF 1.2 — same annotation in 1 statement:
INSERT DATA {
    << ex:alice foaf:knows ex:bob >>
        ex:confidence "0.9"^^xsd:decimal ;
        ex:source     <http://linkedin.com/import> .
}

No blank node. No indirection. The quoted triple << ex:alice foaf:knows ex:bob >> is the subject — you annotate it directly. The base triple and its metadata are stored together.
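
To make "a triple as a term" concrete, here is a minimal in-memory sketch in Python. It says nothing about how IndentiaDB stores data; the Triple type and the annotations set are hypothetical names chosen for illustration.

```python
from decimal import Decimal
from typing import NamedTuple


class Triple(NamedTuple):
    """A triple that can itself be used as a term (hashable, comparable)."""
    s: object
    p: object
    o: object


# The quoted triple is an ordinary value ...
knows = Triple("ex:alice", "foaf:knows", "ex:bob")

# ... so it can sit directly in the subject position of other triples.
annotations = {
    Triple(knows, "ex:confidence", Decimal("0.9")),
    Triple(knows, "ex:source", "http://linkedin.com/import"),
}

# Looking up metadata is a direct match on the subject, with no
# intermediate statement node to join through.
confidence = next(
    t.o for t in annotations if t.s == knows and t.p == "ex:confidence"
)
print(confidence)  # 0.9
```

Because the triple is compared by value, any two occurrences of the same subject, predicate and object refer to the same term, which is what gives a quoted triple its stable identity.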

Impact at scale

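With two annotations per fact, as in the example above, 10 million annotated facts need 20 million annotation triples on quoted-triple subjects instead of 60 million reification triples. Queries that filter on annotation properties match the quoted triple directly, with no intermediate joins through blank nodes.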

Syntax Reference

Inline annotation (the common case)

PREFIX ex:   <http://example.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

INSERT DATA {
    << ex:alice foaf:knows ex:bob >>
        ex:since      "2020-01-15"^^xsd:date ;
        ex:confidence "0.95"^^xsd:decimal ;
        ex:source     <http://crm.example.org/import-001> .
}

Shorthand with `~` operator (SPARQL 1.2)

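The ~ operator is shorthand for annotating the immediately preceding triple in place. In the SPARQL 1.2 drafts, the annotation block is written {| ... |} and ~ introduces a reifier name, so check that your IndentiaDB version accepts the form below before relying on it in portable queries: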

INSERT DATA {
    ex:alice foaf:knows ex:bob
        ~ ex:confidence "0.95"^^xsd:decimal
        ~ ex:source     <http://crm.example.org/import-001> .
}

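This attaches the same annotations as the << >> form. Under standard RDF-star semantics the shorthand also asserts the base triple, whereas << >> on its own only quotes it; choose whichever reads more clearly for your use case.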

Quoted triple as subject

INSERT DATA {
    << ex:alice foaf:knows ex:bob >>
        ex:confidence "0.95"^^xsd:decimal .
}

Quoted triple as object

# ex:claim1 asserts a specific triple
INSERT DATA {
    ex:claim1 ex:asserts << ex:alice foaf:knows ex:bob >> .
}

Nested quoted triples

A quoted triple can itself contain a quoted triple:

# "Source X claims that Source Y claims that Alice knows Bob"
INSERT DATA {
    << ex:sourceX ex:claims << ex:alice foaf:knows ex:bob >> >>
        ex:recordedAt "2026-01-15T09:00:00Z"^^xsd:dateTime .
}
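
If you generate such statements from application code, nesting falls out naturally from a recursive data type. The sketch below is an illustration in Python; Triple, Term and to_sparql are hypothetical names, not part of any IndentiaDB client library.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    s: "Term"
    p: str
    o: "Term"


Term = Union[str, Triple]


def to_sparql(term: Term) -> str:
    """Render a term, recursing into quoted triples."""
    if isinstance(term, Triple):
        return f"<< {to_sparql(term.s)} {to_sparql(term.p)} {to_sparql(term.o)} >>"
    return term


inner = Triple("ex:alice", "foaf:knows", "ex:bob")
outer = Triple("ex:sourceX", "ex:claims", inner)
print(to_sparql(outer))
# << ex:sourceX ex:claims << ex:alice foaf:knows ex:bob >> >>
```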

Querying Quoted Triples

Match and retrieve annotation properties

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/>

SELECT ?person1 ?person2 ?confidence ?source
WHERE {
    << ?person1 foaf:knows ?person2 >>
        ex:confidence ?confidence ;
        ex:source     ?source .
}
ORDER BY DESC(?confidence)

Filter by confidence threshold

SELECT ?person1 ?person2 ?confidence
WHERE {
    << ?person1 foaf:knows ?person2 >>
        ex:confidence ?confidence .
    FILTER (?confidence >= 0.8)
}

Find all facts from a specific source

SELECT ?s ?p ?o
WHERE {
    << ?s ?p ?o >> ex:source <http://crm.example.org/import-001> .
}

Retrieve annotation or fall back gracefully

Use OPTIONAL when not every triple has an annotation:

SELECT ?person1 ?person2 ?confidence
WHERE {
    ?person1 foaf:knows ?person2 .
    OPTIONAL {
        << ?person1 foaf:knows ?person2 >> ex:confidence ?confidence .
    }
}
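
OPTIONAL behaves like a left join: every base triple produces a row, and the annotation variable is simply left unbound when nothing matches. The Python sketch below mimics that behaviour on hypothetical sample data (ex:carol is invented for the example), with None standing in for an unbound variable.

```python
from decimal import Decimal

# Asserted foaf:knows triples as (subject, object) pairs.
knows = [("ex:alice", "ex:bob"), ("ex:alice", "ex:carol")]

# Confidence annotations keyed by the quoted triple they describe.
confidence = {
    ("ex:alice", "foaf:knows", "ex:bob"): Decimal("0.95"),
}

# Every base triple yields a row; the third column is None when the
# quoted triple carries no ex:confidence annotation.
rows = [(s, o, confidence.get((s, "foaf:knows", o))) for s, o in knows]

for row in rows:
    print(row)
```

Client code that consumes the query results should therefore expect a missing ?confidence binding rather than a default value.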

The TRIPLE() Function Family

SPARQL 1.2 adds functions for constructing and decomposing triple terms programmatically.

Construct a triple term

SELECT (TRIPLE(ex:alice, foaf:knows, ex:bob) AS ?t) WHERE {}

Decompose a triple term

SELECT ?s ?p ?o ?conf
WHERE {
    # ?t binds to any annotated term; keep only triple terms
    ?t ex:confidence ?conf .
    FILTER (isTRIPLE(?t))
    BIND(SUBJECT(?t)   AS ?s)
    BIND(PREDICATE(?t) AS ?p)
    BIND(OBJECT(?t)    AS ?o)
}

Type-test with isTRIPLE()

SELECT ?value
WHERE {
    ex:claim1 ex:asserts ?value .
    FILTER (isTRIPLE(?value))
}

Use Cases

1. Provenance and lineage

Track where every fact came from and how it got into the graph:

INSERT DATA {
    GRAPH <http://example.org/hr> {
        ex:alice ex:worksAt ex:Acme .
    }

    << ex:alice ex:worksAt ex:Acme >>
        prov:wasGeneratedBy  ex:hr_import_2026_q1 ;
        prov:generatedAtTime "2026-01-15T10:00:00Z"^^xsd:dateTime ;
        ex:confidence        "0.99"^^xsd:decimal ;
        ex:source            "hr-system-v2" .
}

2. ML confidence scores

Attach model confidence directly to inferred relationships:

INSERT DATA {
    << ex:product_a ex:similarTo ex:product_b >>
        ex:mlScore     "0.87"^^xsd:decimal ;
        ex:modelId     "rec-model-v3" ;
        ex:computedAt  "2026-03-01T00:00:00Z"^^xsd:dateTime .
}

Query: "find all product similarities above 0.85 computed by model rec-model-v3":

SELECT ?a ?b ?score
WHERE {
    << ?a ex:similarTo ?b >>
        ex:mlScore  ?score ;
        ex:modelId  "rec-model-v3" .
    FILTER (?score > 0.85)
}
ORDER BY DESC(?score)

3. Access control (triple-level ACL)

IndentiaDB uses RDF-star annotations for per-triple access control, enforced at the storage layer:

INSERT DATA {
    << ex:patient_record_42 ex:diagnosis ex:condition_cancer >>
        acl:allowedSid "S-1-5-21-domain-1050" ;  # medical-staff group
        acl:allowedSid "S-1-5-21-domain-2001" .  # oncology group
}

Triples without acl:allowedSid annotations are visible to all principals with graph access. See the ACL documentation for the full model.
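
The visibility rule can be sketched as a small predicate. This is an illustration only, not IndentiaDB's implementation: the function name is hypothetical, and treating a single matching SID as sufficient is an assumption made for this sketch (the ACL documentation describes the actual model). The SID ending in 9999 is invented.

```python
def is_visible(allowed_sids: set[str], principal_sids: set[str]) -> bool:
    """Return True if a principal may see a triple.

    allowed_sids holds the acl:allowedSid values annotated on the triple;
    an empty set means the triple carries no ACL annotations.
    """
    if not allowed_sids:
        # Unannotated triples are visible to everyone with graph access.
        return True
    # Assumption: one matching SID is enough to grant visibility.
    return not allowed_sids.isdisjoint(principal_sids)


diagnosis_acl = {"S-1-5-21-domain-1050", "S-1-5-21-domain-2001"}
print(is_visible(diagnosis_acl, {"S-1-5-21-domain-2001"}))  # True
print(is_visible(diagnosis_acl, {"S-1-5-21-domain-9999"}))  # False
print(is_visible(set(), {"S-1-5-21-domain-9999"}))          # True
```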

4. Temporal validity

Combine quoted triples with validity periods to record when a fact held (valid time); add a recorded-at annotation as well if you need full bitemporal modelling:

INSERT DATA {
    << ex:alice ex:ceo_of ex:Acme >>
        ex:validFrom  "2023-06-01"^^xsd:date ;
        ex:validUntil "2025-12-31"^^xsd:date .
}

Query who held the CEO role on a given date (here 28 March 2026):

SELECT ?person ?company
WHERE {
    << ?person ex:ceo_of ?company >>
        ex:validFrom  ?from ;
        ex:validUntil ?until .
    FILTER (
        ?from  <= "2026-03-28"^^xsd:date &&
        ?until >= "2026-03-28"^^xsd:date
    )
}
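
The FILTER is an inclusive interval check, which the following Python sketch mirrors; valid_on is a hypothetical helper name. With the sample data above the tenure ends on 2025-12-31, so a check for 2026-03-28 is expected to come back empty.

```python
from datetime import date


def valid_on(valid_from: date, valid_until: date, day: date) -> bool:
    """Mirror of the FILTER: both bounds are inclusive."""
    return valid_from <= day <= valid_until


# The annotation inserted above: CEO from 2023-06-01 until 2025-12-31.
tenure = (date(2023, 6, 1), date(2025, 12, 31))

print(valid_on(*tenure, date(2024, 1, 1)))   # True
print(valid_on(*tenure, date(2026, 3, 28)))  # False
```

Facts that are still valid need either a far-future ex:validUntil value or a query that treats a missing upper bound as open-ended; the query above requires both bounds to be present.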

Comparison: Reification vs RDF-star

Aspect	RDF 1.1 Reification	RDF 1.2 (RDF-star)
Triples per annotated fact	4 reification triples + 1 per annotation	1 per annotation (plus the base triple, if asserted)
Subject identity	Blank node (fragile, local)	The triple itself (stable, global)
Cross-graph reference	Not possible with blank nodes	Works naturally
Query complexity	3-way join through blank node	Direct pattern match
SPARQL syntax	Complex `?stmt rdf:subject ?s ...`	`<< ?s ?p ?o >> ?prop ?val`
Storage overhead (10M facts, 3 annotations each)	~70M triples (40M reification + 30M annotation)	~10M base + 30M annotation triples
Interoperability	Wide (RDF 1.1 is universal)	Growing — RDF 1.2 is a W3C Candidate Recommendation (7 April 2026)

Other RDF 1.2 Additions

RDF-star is the headline feature of RDF 1.2, but the specification adds several other changes that IndentiaDB also supports.

rdf:JSON literals

rdf:JSON is a new built-in datatype for embedding JSON values as RDF literals:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>

INSERT DATA {
    ex:sensor42 ex:config
        "{\"threshold\": 25, \"unit\": \"celsius\"}"^^rdf:JSON .
}

In IndentiaDB, fields backed by rdf:JSON literals are automatically indexed as Elasticsearch keyword fields (exact match) rather than text (full-text). This preserves the raw JSON for filtering and aggregation without tokenization.
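
Hand-escaping JSON inside a quoted literal is error-prone, so it is usually easier to generate the literal. The sketch below uses Python's standard json module; json_literal is a hypothetical helper, and it assumes the rdf: prefix is declared in the surrounding query.

```python
import json


def json_literal(value: object) -> str:
    """Serialize a Python value as an rdf:JSON literal for SPARQL or Turtle."""
    text = json.dumps(value)
    # Escape backslashes first, then double quotes, as required inside
    # a double-quoted string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^rdf:JSON'


print(json_literal({"threshold": 25, "unit": "celsius"}))
# "{\"threshold\": 25, \"unit\": \"celsius\"}"^^rdf:JSON
```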

Language tags with base direction (`rdf:dirLangString`)

RDF 1.2 introduces a new datatype for language-tagged strings that also carry a text rendering direction (ltr or rtl). The SPARQL syntax uses a double-dash suffix:

INSERT DATA {
    ex:title1 rdfs:label "Hello World"@en--ltr .
    ex:title2 rdfs:label "مرحبا بالعالم"@ar--rtl .
}

The LANGDIR() SPARQL function returns the direction string ("ltr" or "rtl"):

SELECT ?label (LANG(?label) AS ?lang) (LANGDIR(?label) AS ?dir) WHERE {
    ?doc rdfs:label ?label .
}
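
On the client side, the same split between language and direction can be reproduced from the tag text alone. A minimal sketch, assuming the tag has already been extracted from the literal; split_lang_tag is a hypothetical name, and the empty string for a missing direction follows the usual LANGDIR() convention.

```python
def split_lang_tag(tag: str) -> tuple[str, str]:
    """Split 'ar--rtl' into ('ar', 'rtl'); direction is '' when absent."""
    lang, _, direction = tag.partition("--")
    return lang, direction


print(split_lang_tag("en--ltr"))  # ('en', 'ltr')
print(split_lang_tag("ar--rtl"))  # ('ar', 'rtl')
print(split_lang_tag("de-at"))    # ('de-at', '')
```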

Language tag canonicalization (BCP47)

Per the RDF 1.2 Candidate Recommendation, language tag comparisons are case-insensitive, and the canonical form of a tag is all lowercase. The direction component of a directional language-tagged string follows the same rule.

IndentiaDB preserves the original casing that was written into the store for regular output, but the canonical N-Triples serialization (see SPARQL endpoints — Response formats) emits tags in their canonical form:

Stored	Canonical N-Triples output
`"chat"@EN-GB`	`"chat"@en-gb`
`"مرحبا"@AR--RTL`	`"مرحبا"@ar--rtl`
`"Hallo"@De-At--Ltr`	`"Hallo"@de-at--ltr`

Use the canonical form whenever byte-identical output is required — e.g. for hashing, signing, or diffing across environments.
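
If you need the canonical form outside the database, for example to pre-normalize data before hashing, the rule in the table can be approximated in a few lines of Python. This is a sketch under the assumption that the input is a single serialized literal; canonicalize_literal is a hypothetical helper, and the pattern is deliberately simpler than the full BCP47 grammar.

```python
import re

# A quoted lexical form followed by a language tag and optional '--dir'.
_LANG_LITERAL = re.compile(
    r'(".*")@([A-Za-z]+(?:-[A-Za-z0-9]+)*(?:--[A-Za-z]+)?)',
    re.DOTALL,
)


def canonicalize_literal(literal: str) -> str:
    """Lowercase the tag and direction; leave the lexical form untouched."""
    match = _LANG_LITERAL.fullmatch(literal)
    if match is None:
        return literal  # not a language-tagged literal
    lexical, tag = match.groups()
    return f"{lexical}@{tag.lower()}"


for stored in ('"chat"@EN-GB', '"مرحبا"@AR--RTL', '"Hallo"@De-At--Ltr'):
    print(canonicalize_literal(stored))
```

Only the tag is lowercased; the text inside the quotes must stay byte-for-byte identical, otherwise the hash would change for the wrong reason.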

RDF Reference IRI validation (§3.3.1)

RDF 1.2 Section 3.3.1 defines the notion of an RDF Reference IRI: an IRI that is suitable as a stable resource identifier and that must survive reference resolution unchanged. When strict validation is enabled (via IriValidator::strict_rdf12() in the builder library), the following are flagged:

IRIs with . or .. path segments (error — would collapse during reference resolution)
Missing authority component in hierarchical schemes such as http/https (error)
Non-lowercase scheme names (warning — canonical form is lowercase)
Empty path segments (warning)

This validation runs during index builds when strict mode is enabled; it does not reject writes on the live SPARQL endpoint.
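
To pre-screen data before an index build, the four checks listed above can be approximated with Python's standard library. The sketch below is not the IriValidator implementation: check_iri, the message strings, and the decision to ignore a trailing slash when looking for empty segments are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

HIERARCHICAL = {"http", "https"}


def check_iri(iri: str) -> list[tuple[str, str]]:
    """Return (severity, message) findings for one IRI."""
    findings = []
    raw_scheme = iri.split(":", 1)[0]
    parts = urlsplit(iri)  # note: urlsplit lowercases parts.scheme
    segments = parts.path.split("/")

    if "." in segments or ".." in segments:
        findings.append(("error", "dot segment in path"))
    if parts.scheme in HIERARCHICAL and not parts.netloc:
        findings.append(("error", "missing authority"))
    if parts.scheme and raw_scheme != raw_scheme.lower():
        findings.append(("warning", "scheme is not lowercase"))
    # segments[0] is the empty string before the leading '/', and a
    # trailing '/' leaves an empty last element; only empty segments
    # in between (e.g. 'a//b') are reported.
    if "" in segments[1:-1]:
        findings.append(("warning", "empty path segment"))
    return findings


print(check_iri("http://example.org/a/../b"))
print(check_iri("HTTP://example.org/a//b"))
```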

Content-Type version signalling

When IndentiaDB returns RDF from the Graph Store Protocol (GET /data), the response includes version=1.2 in the Content-Type header — for example:

Content-Type: application/n-triples; version=1.2
Content-Type: text/turtle; version=1.2

RDF 1.2-aware clients can use this to enable quoted-triple parsing. RDF 1.1 clients that ignore unknown parameters continue to work unchanged.
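
A client can read the parameter with the header parser from Python's standard library. In this sketch, rdf_version is a hypothetical helper; it takes the raw header value and returns None when no version parameter is present.

```python
from email.message import Message
from typing import Optional


def rdf_version(content_type: str) -> Optional[str]:
    """Return the 'version' parameter of a Content-Type value, if present."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("version")


print(rdf_version("text/turtle; version=1.2"))  # 1.2
print(rdf_version("text/turtle"))               # None
```

A client might enable quoted-triple parsing only when the returned value is "1.2" and fall back to its RDF 1.1 parser otherwise.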

When to Use Quoted Triples

Use RDF-star when you need to annotate facts — provenance, confidence, access control, temporal validity, audit trails.

Use standard triples when the fact itself is the complete statement and you have no per-fact metadata to attach.

Don't reify unless you are integrating with a legacy system that only understands RDF 1.1. Even then, consider storing in RDF-star format and exporting as reification for compatibility.