
WebAssembly Runtime

IndentiaDB's SPARQL engine compiles to WebAssembly (WASM), enabling RDF triple pattern lookups and vocabulary resolution to run entirely in the browser or on edge runtimes -- with no server round-trips. The WASM module loads pre-built index and vocabulary files into memory and exposes a JavaScript/TypeScript API for scanning triples, resolving term IDs, and inspecting index statistics.


Overview

The WASM runtime packages IndentiaDB's core storage layer -- permutation indices and vocabulary -- into a WebAssembly module that runs in any environment with a WASM runtime: modern browsers, Cloudflare Workers, Deno Deploy, Node.js, and Fastly Compute.

| Capability | WASM Runtime | Full Server |
| --- | --- | --- |
| Triple pattern scan (SPO, POS, OSP) | Yes | Yes |
| Vocabulary lookup (term ID to string) | Yes | Yes |
| SPARQL SELECT/CONSTRUCT/ASK | Planned | Yes |
| SPARQL UPDATE | No | Yes |
| Full-text / BM25 search | No | Yes |
| Vector search (HNSW) | No | Yes |
| Federation (SERVICE) | No | Yes |
| Persistence / write-ahead log | No | Yes |
| Authentication / ACL | No | Yes |
| Elasticsearch-compatible API | No | Yes |

Query-Only, No Persistence

The WASM runtime is read-only. It loads pre-built index files into memory and supports triple pattern lookups and vocabulary resolution. It does not support SPARQL UPDATE, write operations, full-text search, or vector search. For a full-featured deployment, use the native server binary.


Installation

npm Package

npm install @indentia/graph-wasm

CDN (ES Module)

<script type="module">
  import init, { WasmEngine, WasmVocabulary, WasmIndex }
    from 'https://cdn.indentia.ai/wasm/latest/indentiagraph_wasm.js';

  await init();
</script>

Manual Build

Build the WASM module from source using wasm-pack:

# Install wasm-pack
cargo install wasm-pack

# Build for browser (ES module output)
cd indentiagraph-wasm
wasm-pack build --target web --release

# Build for Node.js
wasm-pack build --target nodejs --release

# Build for bundlers (webpack, vite, etc.)
wasm-pack build --target bundler --release

The build output is placed in pkg/:

pkg/
  indentiagraph_wasm.js       # JavaScript glue code
  indentiagraph_wasm_bg.wasm  # WebAssembly binary (~1.2 MB gzipped)
  indentiagraph_wasm.d.ts     # TypeScript type declarations
  package.json                # npm package metadata

Core API

The WASM runtime exposes four main classes: WasmEngine, WasmVocabulary, WasmIndex, and WasmQueryResults. The first three are described below, together with the version() function.

WasmVocabulary

The vocabulary maps between integer term IDs and their string representations (IRIs, blank nodes, literals).

class WasmVocabulary {
  /** Load vocabulary from raw byte arrays. */
  static from_bytes(vocab_data: Uint8Array, offset_data: Uint8Array): WasmVocabulary;

  /** Create an empty vocabulary (for testing). */
  static empty(): WasmVocabulary;

  /** Number of terms in the vocabulary. */
  len(): number;

  /** Check if the vocabulary is empty. */
  is_empty(): boolean;

  /** Look up a term by its integer ID. Returns null if not found. */
  get_by_index(id: bigint): string | null;

  /** Check if a term ID represents a literal (vs IRI or blank node). */
  is_literal(id: bigint): boolean;

  /** Check if a term ID represents an IRI or blank node. */
  is_iri_or_blank(id: bigint): boolean;

  /** Get the boundary index between literals and IRIs. */
  first_iri_index(): number;

  /** Get vocabulary statistics as a JSON object. */
  stats(): { term_count: number; first_iri_index: number; is_empty: boolean };
}
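
As a rough sketch of how these methods combine, the helper below turns a term ID into a labelled string and handles unknown IDs explicitly. VocabularyLike mirrors only the two documented signatures it needs. StubVocabulary is a hypothetical in-memory stand-in (not part of the package) so the example runs without the WASM module; its literal/IRI split assumes that IDs below the boundary are literals, which the method descriptions suggest but do not state.

```typescript
// Subset of the documented WasmVocabulary surface used by this sketch.
interface VocabularyLike {
  get_by_index(id: bigint): string | null;
  is_literal(id: bigint): boolean;
}

// Hypothetical stand-in for illustration only; a real vocabulary would be
// created with WasmVocabulary.from_bytes(vocab_data, offset_data).
class StubVocabulary implements VocabularyLike {
  private terms: string[];
  private firstIri: number;

  constructor(terms: string[], firstIri: number) {
    this.terms = terms;
    this.firstIri = firstIri;
  }

  get_by_index(id: bigint): string | null {
    const i = Number(id);
    return i >= 0 && i < this.terms.length ? this.terms[i] : null;
  }

  // Assumption: IDs below the boundary index are literals.
  is_literal(id: bigint): boolean {
    return Number(id) < this.firstIri;
  }
}

// Resolve an ID to a labelled string; get_by_index returns null for unknown IDs.
function describeTerm(vocab: VocabularyLike, id: bigint): string {
  const term = vocab.get_by_index(id);
  if (term === null) return `unknown term ${id}`;
  return vocab.is_literal(id) ? `literal ${term}` : `IRI or blank node ${term}`;
}

const stubVocab = new StubVocabulary(['"Alice"', '<http://example.org/alice>'], 1);
const described = [0, 1, 7].map((n) => describeTerm(stubVocab, BigInt(n)));
```

Because describeTerm depends only on the interface, a real WasmVocabulary instance should be usable in place of the stub.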

WasmIndex

A permutation index stores triples in a specific column order (e.g., SPO, POS, OSP) to enable efficient prefix-based lookups.

class WasmIndex {
  /** Load an index from raw byte arrays. */
  static from_bytes(
    permutation: 'SPO' | 'SOP' | 'PSO' | 'POS' | 'OSP' | 'OPS',
    data: Uint8Array,
    metadata: Uint8Array
  ): WasmIndex;

  /** Get the permutation type as a string. */
  permutation(): string;

  /** Total number of triples in this index. */
  num_triples(): bigint;

  /** Number of compressed blocks. */
  num_blocks(): number;

  /** Check if the index is empty. */
  is_empty(): boolean;

  /**
   * Scan for triples matching a prefix pattern.
   * prefix: Array of 0-3 term IDs.
   *   [] = all triples
   *   [s] = triples with subject s
   *   [s, p] = triples with subject s and predicate p
   *   [s, p, o] = exact triple lookup
   * Returns: Array of [subject_id, predicate_id, object_id] tuples.
   */
  scan(prefix: bigint[], limit: number): [bigint, bigint, bigint][];

  /** Count triples matching a prefix (more efficient than scan). */
  count(prefix: bigint[]): bigint;

  /** Check if any triple matches the prefix. */
  exists(prefix: bigint[]): boolean;

  /** Get index statistics as a JSON object. */
  stats(): { permutation: string; triple_count: number; block_count: number; is_empty: boolean };
}
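
One way to use exists, count, and scan together is to report whether a limited scan returned everything that matched. The sketch below does that against a hypothetical StubIndex, an in-memory stand-in written for this example so it runs without the WASM module. IndexLike copies the three documented signatures; ScanPage and scanPage are illustrative names, not part of the package.

```typescript
type Triple = [bigint, bigint, bigint];

// Subset of the documented WasmIndex surface used by this sketch.
interface IndexLike {
  scan(prefix: bigint[], limit: number): Triple[];
  count(prefix: bigint[]): bigint;
  exists(prefix: bigint[]): boolean;
}

// Hypothetical in-memory stand-in; a real index comes from WasmIndex.from_bytes(...).
class StubIndex implements IndexLike {
  private rows: Triple[];

  constructor(rows: Triple[]) {
    this.rows = rows;
  }

  private matches(prefix: bigint[]): Triple[] {
    return this.rows.filter((row) => prefix.every((id, i) => row[i] === id));
  }

  scan(prefix: bigint[], limit: number): Triple[] {
    return this.matches(prefix).slice(0, limit);
  }

  count(prefix: bigint[]): bigint {
    return BigInt(this.matches(prefix).length);
  }

  exists(prefix: bigint[]): boolean {
    return this.matches(prefix).length > 0;
  }
}

interface ScanPage {
  triples: Triple[];
  total: bigint;
  truncated: boolean;
}

// Skip the scan when nothing matches, and flag results cut off by the limit.
function scanPage(index: IndexLike, prefix: bigint[], limit: number): ScanPage {
  if (!index.exists(prefix)) {
    return { triples: [], total: BigInt(0), truncated: false };
  }
  const total = index.count(prefix);
  const triples = index.scan(prefix, limit);
  return { triples, total, truncated: total > BigInt(triples.length) };
}

const b = (n: number): bigint => BigInt(n);
const stubIndex = new StubIndex([
  [b(1), b(10), b(100)],
  [b(1), b(11), b(101)],
  [b(2), b(10), b(102)],
]);
const firstPage = scanPage(stubIndex, [b(1)], 1);
const emptyPage = scanPage(stubIndex, [b(9)], 10);
```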

WasmEngine

A convenience wrapper that combines a vocabulary with one or more permutation indices.

class WasmEngine {
  /** Create an engine with the given vocabulary. */
  constructor(vocabulary: WasmVocabulary);

  /** Attach the SPO index. */
  set_spo_index(index: WasmIndex): void;

  /** Attach the POS index. */
  set_pos_index(index: WasmIndex): void;

  /** Attach the OSP index. */
  set_osp_index(index: WasmIndex): void;

  /** Get vocabulary statistics. */
  vocab_stats(): { term_count: number; is_empty: boolean };

  /** Get index statistics for all loaded indices. */
  index_stats(): {
    spo: { triple_count: number; block_count: number } | null;
    pos: { triple_count: number; block_count: number } | null;
    osp: { triple_count: number; block_count: number } | null;
  };

  /** Look up a term string by its ID. */
  lookup_term(id: bigint): string | null;

  /** Scan triples from the SPO index. */
  scan(prefix: bigint[], limit: number): [bigint, bigint, bigint][];

  /** Resolve term IDs to their string representations. */
  resolve_triple(
    subject_id: bigint,
    predicate_id: bigint,
    object_id: bigint
  ): { subject: string; predicate: string; object: string };
}
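
Since index_stats() reports null for any index that has not been attached, a small helper can summarize what an engine has loaded. The sketch below works on the documented return shape only; summarizeIndices and the sample values are illustrative and not real output.

```typescript
// Shape of the object returned by the documented WasmEngine.index_stats().
interface PerIndexStats {
  triple_count: number;
  block_count: number;
}

interface EngineIndexStats {
  spo: PerIndexStats | null;
  pos: PerIndexStats | null;
  osp: PerIndexStats | null;
}

// Produce one line per attached index; indices that are null are skipped.
function summarizeIndices(stats: EngineIndexStats): string[] {
  const names: (keyof EngineIndexStats)[] = ['spo', 'pos', 'osp'];
  const lines: string[] = [];
  for (const name of names) {
    const entry = stats[name];
    if (entry !== null) {
      lines.push(`${name.toUpperCase()}: ${entry.triple_count} triples in ${entry.block_count} blocks`);
    }
  }
  return lines;
}

// Hand-written sample value for illustration.
const sampleStats: EngineIndexStats = {
  spo: { triple_count: 1200, block_count: 3 },
  pos: null,
  osp: null,
};
const summary = summarizeIndices(sampleStats);
```

With a real engine, the call would presumably be summarizeIndices(engine.index_stats()).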

version()

Returns the library version string:

import { version } from '@indentia/graph-wasm';
console.log(version()); // "1.4.2"

Loading Data

The WASM runtime requires pre-built index files exported from a running IndentiaDB instance. These files are typically served as static assets.

Required Files

| File | Description | Typical Size |
| --- | --- | --- |
| vocabulary | Concatenated term strings | 10 MB - 2 GB |
| vocabulary.offsets | Offset table into vocabulary | ~8 bytes per term |
| spo | SPO permutation index (compressed blocks) | 5 MB - 1 GB |
| spo.meta | SPO block metadata | ~20 bytes per block |
| pos | POS permutation index | 5 MB - 1 GB |
| pos.meta | POS block metadata | ~20 bytes per block |
| osp | OSP permutation index | 5 MB - 1 GB |
| osp.meta | OSP block metadata | ~20 bytes per block |

Not All Indices Are Required

Load only the permutation indices you need. For subject-centric lookups, only the SPO index is necessary. For predicate-centric queries (e.g., "find all triples with predicate rdf:type"), load POS. Load all three for full triple pattern flexibility.
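
A minimal sketch of turning that choice into a download list, using the file names from the Required Files table. filesToFetch is a hypothetical helper, and it assumes the two vocabulary files are always wanted so that term IDs can be resolved to strings.

```typescript
type LoadablePermutation = 'spo' | 'pos' | 'osp';

// Each index contributes its data file and its .meta file; the vocabulary
// and its offset table are listed first (assumed to be always required).
function filesToFetch(permutations: LoadablePermutation[]): string[] {
  const files: string[] = ['vocabulary', 'vocabulary.offsets'];
  for (const p of permutations) {
    // Ignore duplicates so each file is fetched once.
    if (files.indexOf(p) === -1) {
      files.push(p, `${p}.meta`);
    }
  }
  return files;
}

const subjectOnly = filesToFetch(['spo']);
const subjectAndPredicate = filesToFetch(['spo', 'pos', 'spo']);
```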

Exporting Index Files

Export index files from a running IndentiaDB instance:

# Export all index files to a directory
indentiagraph backup create \
  --source /var/lib/indentiadb/data \
  --repository ./wasm-export \
  --index "spo=permutations/spo" \
  --index "pos=permutations/pos" \
  --index "osp=permutations/osp" \
  --path "vocabulary=vocabulary"

Browser Example

Complete Browser Application

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>IndentiaDB WASM Triple Browser</title>
</head>
<body>
  <h1>IndentiaDB WASM Triple Browser</h1>
  <div id="stats"></div>
  <div>
    <input id="subject-id" type="number" placeholder="Subject ID">
    <button id="scan-btn">Scan Triples</button>
  </div>
  <pre id="results"></pre>

  <script type="module">
    import init, { WasmEngine, WasmVocabulary, WasmIndex }
      from './pkg/indentiagraph_wasm.js';

    async function loadBytes(url) {
      const response = await fetch(url);
      if (!response.ok) throw new Error(`Failed to fetch ${url}: ${response.status}`);
      return new Uint8Array(await response.arrayBuffer());
    }

    async function main() {
      // Initialize WASM module
      await init();

      // Load vocabulary
      const [vocabData, offsetData] = await Promise.all([
        loadBytes('/data/vocabulary'),
        loadBytes('/data/vocabulary.offsets'),
      ]);
      const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);

      // Load SPO index
      const [spoData, spoMeta] = await Promise.all([
        loadBytes('/data/spo'),
        loadBytes('/data/spo.meta'),
      ]);
      const spoIndex = WasmIndex.from_bytes('SPO', spoData, spoMeta);

      // Create engine
      const engine = new WasmEngine(vocab);
      engine.set_spo_index(spoIndex);

      // Display statistics
      const vocabStats = engine.vocab_stats();
      const indexStats = engine.index_stats();
      document.getElementById('stats').innerHTML = `
        <p>Vocabulary: ${vocabStats.term_count} terms</p>
        <p>SPO Index: ${indexStats.spo?.triple_count ?? 0} triples
           in ${indexStats.spo?.block_count ?? 0} blocks</p>
      `;

      // Scan button handler
      document.getElementById('scan-btn').addEventListener('click', () => {
        const results = document.getElementById('results');
        const subjectId = document.getElementById('subject-id').value.trim();

        // BigInt() throws on non-integer input such as "1.5", so validate first
        if (subjectId && !/^\d+$/.test(subjectId)) {
          results.textContent = 'Subject ID must be a non-negative integer.';
          return;
        }

        const prefix = subjectId ? [BigInt(subjectId)] : [];
        const triples = engine.scan(prefix, 50);

        const resolved = triples.map(([s, p, o]) => {
          const triple = engine.resolve_triple(s, p, o);
          return `${triple.subject}  ${triple.predicate}  ${triple.object}`;
        });

        results.textContent = resolved.join('\n');
      });
    }

    main().catch(console.error);
  </script>
</body>
</html>

Node.js / Server-Side Example

const { WasmEngine, WasmVocabulary, WasmIndex } = require('@indentia/graph-wasm');
const fs = require('fs');

// Load files from disk
const vocabData = new Uint8Array(fs.readFileSync('./data/vocabulary'));
const offsetData = new Uint8Array(fs.readFileSync('./data/vocabulary.offsets'));
const spoData = new Uint8Array(fs.readFileSync('./data/spo'));
const spoMeta = new Uint8Array(fs.readFileSync('./data/spo.meta'));

// Build engine
const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);
const spoIndex = WasmIndex.from_bytes('SPO', spoData, spoMeta);

const engine = new WasmEngine(vocab);
engine.set_spo_index(spoIndex);

// Scan all triples (first 100)
const triples = engine.scan([], 100);
for (const [s, p, o] of triples) {
  const resolved = engine.resolve_triple(s, p, o);
  console.log(`${resolved.subject}  ${resolved.predicate}  ${resolved.object}`);
}

Edge Runtime Example (Cloudflare Workers)

Deploy a lightweight triple lookup API on Cloudflare Workers:

import init, { WasmEngine, WasmVocabulary, WasmIndex }
  from '@indentia/graph-wasm';

let engine = null;

async function initEngine(env) {
  if (engine) return engine;

  await init();

  // Load from R2; get() resolves to null when the object does not exist
  const loadObject = async (key) => {
    const object = await env.INDEX_BUCKET.get(key);
    if (!object) throw new Error(`Missing index object: ${key}`);
    return object.arrayBuffer();
  };
  const [vocabData, offsetData, spoData, spoMeta] = await Promise.all([
    loadObject('vocabulary'),
    loadObject('vocabulary.offsets'),
    loadObject('spo'),
    loadObject('spo.meta'),
  ]);

  const vocab = WasmVocabulary.from_bytes(
    new Uint8Array(vocabData),
    new Uint8Array(offsetData)
  );
  const spoIndex = WasmIndex.from_bytes(
    'SPO',
    new Uint8Array(spoData),
    new Uint8Array(spoMeta)
  );

  engine = new WasmEngine(vocab);
  engine.set_spo_index(spoIndex);
  return engine;
}

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const e = await initEngine(env);

    if (url.pathname === '/stats') {
      return Response.json({
        vocab: e.vocab_stats(),
        indices: e.index_stats(),
      });
    }

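    if (url.pathname === '/scan') {
      const subjectId = url.searchParams.get('subject');
      // BigInt() throws on non-integer input, so reject it with a 400 first
      if (subjectId && !/^\d+$/.test(subjectId)) {
        return new Response('Invalid subject parameter', { status: 400 });
      }
      const parsedLimit = parseInt(url.searchParams.get('limit') || '20', 10);
      // Fall back to the default when limit is non-numeric or not positive
      const limit = parsedLimit > 0 ? parsedLimit : 20;
      const prefix = subjectId ? [BigInt(subjectId)] : [];
      const triples = e.scan(prefix, limit);

      const resolved = triples.map(([s, p, o]) => e.resolve_triple(s, p, o));
      return Response.json({ count: resolved.length, triples: resolved });
    }

    if (url.pathname === '/lookup') {
      const termId = url.searchParams.get('id');
      if (!termId) return new Response('Missing id parameter', { status: 400 });
      if (!/^\d+$/.test(termId)) return new Response('Invalid id parameter', { status: 400 });
      const term = e.lookup_term(BigInt(termId));
      return Response.json({ id: termId, term });
    }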

    return new Response('Not found', { status: 404 });
  }
};

TypeScript Integration

The npm package ships with TypeScript declarations. Import types directly:

import init, {
  WasmEngine,
  WasmVocabulary,
  WasmIndex,
  version,
} from '@indentia/graph-wasm';

async function buildEngine(baseUrl: string): Promise<WasmEngine> {
  await init();

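  const load = async (path: string): Promise<Uint8Array> => {
    const res = await fetch(`${baseUrl}/${path}`);
    if (!res.ok) throw new Error(`Failed to fetch ${path}: ${res.status}`);
    return new Uint8Array(await res.arrayBuffer());
  };

  // Fetch both vocabulary files in parallel
  const [vocabData, offsetData] = await Promise.all([
    load('vocabulary'),
    load('vocabulary.offsets'),
  ]);
  const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);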

  const engine = new WasmEngine(vocab);

  // Load indices in parallel
  const [spoData, spoMeta, posData, posMeta] = await Promise.all([
    load('spo'), load('spo.meta'),
    load('pos'), load('pos.meta'),
  ]);

  engine.set_spo_index(WasmIndex.from_bytes('SPO', spoData, spoMeta));
  engine.set_pos_index(WasmIndex.from_bytes('POS', posData, posMeta));

  return engine;
}

// Usage
const engine = await buildEngine('https://cdn.example.com/indentiadb-index');
console.log(`Loaded ${engine.vocab_stats().term_count} terms`);

Performance Considerations

Memory

The WASM runtime loads all index data into linear memory. Plan for approximately:

| Dataset Size (triples) | Vocabulary | Index Files (SPO) | Total Memory |
| --- | --- | --- | --- |
| 100,000 | ~5 MB | ~3 MB | ~15 MB |
| 1,000,000 | ~40 MB | ~25 MB | ~100 MB |
| 10,000,000 | ~300 MB | ~200 MB | ~800 MB |
| 100,000,000 | ~2 GB | ~1.5 GB | ~5 GB |

Browser Memory Limits

Most browsers limit WASM linear memory to 2-4 GB. For datasets larger than ~50 million triples, use the native server binary instead of the WASM runtime. Edge runtimes (Cloudflare Workers, Deno Deploy) typically have lower memory limits (128 MB - 512 MB).
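
To check a deployment against these limits before shipping, a back-of-the-envelope estimate may be enough. The sketch below multiplies the combined size of the files to be loaded by an overhead factor of 1.6. That factor is an assumption read off the memory table above, where total memory is roughly 1.4x to 1.9x the vocabulary plus SPO index size. It is not a documented constant, and estimateMemory is a hypothetical helper.

```typescript
// Assumption: total memory is about 1.6x the combined file size.
const OVERHEAD_FACTOR = 1.6;
const MB = 1024 * 1024;

interface MemoryEstimate {
  estimatedBytes: number;
  fits: boolean;
}

// Sum the file sizes, apply the overhead factor, and compare to the budget.
function estimateMemory(fileSizesBytes: number[], budgetBytes: number): MemoryEstimate {
  const raw = fileSizesBytes.reduce((sum, size) => sum + size, 0);
  const estimatedBytes = Math.ceil(raw * OVERHEAD_FACTOR);
  return { estimatedBytes, fits: estimatedBytes <= budgetBytes };
}

// 1M-triple row: ~40 MB vocabulary + ~25 MB SPO index, 128 MB edge budget.
const edgeEstimate = estimateMemory([40 * MB, 25 * MB], 128 * MB);

// 10M-triple row: ~300 MB vocabulary + ~200 MB SPO index, 512 MB edge budget.
const largeEstimate = estimateMemory([300 * MB, 200 * MB], 512 * MB);
```

Under these assumptions the 1M-triple dataset fits a 128 MB budget and the 10M-triple dataset does not fit in 512 MB. Loading additional permutation indices adds their sizes to the list.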

Startup Time

Initial loading involves fetching index files over the network and decompressing blocks into memory. Expect:

| Dataset Size | File Transfer (50 Mbps) | Initialization | Total |
| --- | --- | --- | --- |
| 100K triples | < 1s | < 100ms | ~1s |
| 1M triples | ~5s | ~500ms | ~6s |
| 10M triples | ~40s | ~3s | ~45s |

Use HTTP range requests or split indices into chunks for progressive loading of large datasets.
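
A possible shape for range-based loading is sketched below. It assumes the server honours the Range header (answering 206 Partial Content) and that the file size is known in advance, for example from a HEAD request. Because from_bytes takes complete byte arrays, the chunks are reassembled before use, so the benefit here is progress reporting and smaller individual requests rather than querying partial data. planRanges, concatChunks, fetchInChunks, and the onProgress callback are hypothetical names.

```typescript
// Split totalSize bytes into inclusive [start, end] ranges, matching the
// "bytes=start-end" form of the HTTP Range header.
function planRanges(totalSize: number, chunkSize: number): [number, number][] {
  if (chunkSize <= 0) throw new RangeError('chunkSize must be positive');
  const ranges: [number, number][] = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, totalSize) - 1]);
  }
  return ranges;
}

// Reassemble downloaded chunks into one contiguous Uint8Array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Fetch one file range by range, reporting progress after each chunk.
async function fetchInChunks(
  url: string,
  totalSize: number,
  chunkSize: number,
  onProgress?: (loadedBytes: number) => void,
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let loaded = 0;
  for (const [start, end] of planRanges(totalSize, chunkSize)) {
    const res = await fetch(url, { headers: { Range: `bytes=${start}-${end}` } });
    if (res.status !== 206) {
      throw new Error(`Expected 206 Partial Content, got ${res.status}`);
    }
    const chunk = new Uint8Array(await res.arrayBuffer());
    chunks.push(chunk);
    loaded += chunk.length;
    if (onProgress) onProgress(loaded);
  }
  return concatChunks(chunks);
}

const plannedRanges = planRanges(10, 4);
const joined = concatChunks([new Uint8Array([1, 2]), new Uint8Array([3])]);
```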

Query Latency

Once loaded, triple pattern scans are fast:

| Operation | Latency |
| --- | --- |
| Exact triple lookup (scan([s, p, o], 1)) | < 1 ms |
| Subject scan (scan([s], 100)) | 1-5 ms |
| Full scan (scan([], 1000)) | 5-20 ms |
| Term lookup (lookup_term(id)) | < 0.1 ms |

Supported Permutations

Each permutation index optimizes for a different query pattern:

| Permutation | Column Order | Optimized Query Pattern |
| --- | --- | --- |
| SPO | Subject, Predicate, Object | "Find all properties of entity X" |
| SOP | Subject, Object, Predicate | "Find the relationship between X and Y" |
| PSO | Predicate, Subject, Object | "Find all entities with property P" |
| POS | Predicate, Object, Subject | "Find entities where P = value" |
| OSP | Object, Subject, Predicate | "Find all entities that reference Y" |
| OPS | Object, Predicate, Subject | "Find entities linked to Y via P" |

For most use cases, loading SPO and POS provides good coverage. Add OSP if you need reverse lookups (finding which subjects reference a given object).
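
The choice of index for a given triple pattern can be sketched as a small planner. The version below covers the three permutations the engine can attach (SPO, POS, OSP) and picks the one whose column order turns the bound positions into a prefix. It assumes that scan prefixes follow the column order of the chosen permutation, which the column-order table implies but the scan documentation does not spell out. TriplePattern, ScanPlan, and planScan are hypothetical names.

```typescript
type Permutation = 'SPO' | 'POS' | 'OSP';

// A pattern with any subset of positions bound to term IDs.
interface TriplePattern {
  s?: bigint;
  p?: bigint;
  o?: bigint;
}

interface ScanPlan {
  permutation: Permutation;
  prefix: bigint[];
}

// Choose the index for which the bound positions form a leading prefix.
function planScan(pattern: TriplePattern): ScanPlan {
  const { s, p, o } = pattern;
  if (s !== undefined && p !== undefined && o !== undefined) {
    return { permutation: 'SPO', prefix: [s, p, o] };
  }
  if (s !== undefined && p !== undefined) return { permutation: 'SPO', prefix: [s, p] };
  if (p !== undefined && o !== undefined) return { permutation: 'POS', prefix: [p, o] };
  if (o !== undefined && s !== undefined) return { permutation: 'OSP', prefix: [o, s] };
  if (s !== undefined) return { permutation: 'SPO', prefix: [s] };
  if (p !== undefined) return { permutation: 'POS', prefix: [p] };
  if (o !== undefined) return { permutation: 'OSP', prefix: [o] };
  // Nothing bound: any index works for a full scan.
  return { permutation: 'SPO', prefix: [] };
}

const byPredicateObject = planScan({ p: BigInt(5), o: BigInt(9) });
const bySubjectObject = planScan({ s: BigInt(1), o: BigInt(2) });
const fullScan = planScan({});
```

The resulting plan would be run against the matching WasmIndex directly, since WasmEngine.scan is documented as scanning the SPO index only.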


Build Configuration

The Cargo.toml for the WASM crate:

[package]
name = "indentiagraph-wasm"
description = "WebAssembly bindings for IndentiaGraph SPARQL engine"

[lib]
crate-type = ["cdylib", "rlib"]

[dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde-wasm-bindgen = "0.6"
getrandom = { version = "0.3", features = ["wasm_js"] }
console_error_panic_hook = { version = "0.1", optional = true }

[features]
default = ["console_error_panic_hook"]

Reducing WASM Binary Size

The release build with wasm-opt disabled produces a ~1.2 MB gzipped binary. Enable wasm-opt in wasm-pack for further size reduction (~15-20% smaller), at the cost of longer build times. Set lto = true and codegen-units = 1 in the workspace [profile.release] for maximum optimization.