WebAssembly Runtime

IndentiaDB's SPARQL engine compiles to WebAssembly (WASM), enabling RDF triple pattern lookups and vocabulary resolution to run entirely in the browser or on edge runtimes -- with no server round-trips. The WASM module loads pre-built index and vocabulary files into memory and exposes a JavaScript/TypeScript API for scanning triples, resolving term IDs, and inspecting index statistics.

Overview

The WASM runtime packages IndentiaDB's core storage layer -- permutation indices and vocabulary -- into a WebAssembly module that runs in any environment with a WASM runtime: modern browsers, Cloudflare Workers, Deno Deploy, Node.js, and Fastly Compute.

Capability	WASM Runtime	Full Server
Triple pattern scan (SPO, POS, OSP)	Yes	Yes
Vocabulary lookup (term ID to string)	Yes	Yes
SPARQL SELECT/CONSTRUCT/ASK	Planned	Yes
SPARQL UPDATE	No	Yes
Full-text / BM25 search	No	Yes
Vector search (HNSW)	No	Yes
Federation (SERVICE)	No	Yes
Persistence / write-ahead log	No	Yes
Authentication / ACL	No	Yes
Elasticsearch-compatible API	No	Yes

Query-Only, No Persistence

The WASM runtime is read-only. It loads pre-built index files into memory and supports triple pattern lookups and vocabulary resolution. It does not support SPARQL UPDATE, write operations, full-text search, or vector search. For a full-featured deployment, use the native server binary.

Installation

npm Package

npm install @indentia/graph-wasm

CDN (ES Module)

<script type="module">
  import init, { WasmEngine, WasmVocabulary, WasmIndex }
    from 'https://cdn.indentia.ai/wasm/latest/indentiagraph_wasm.js';

  await init();
</script>

Manual Build

Build the WASM module from source using wasm-pack:

# Install wasm-pack
cargo install wasm-pack

# Build for browser (ES module output)
cd indentiagraph-wasm
wasm-pack build --target web --release

# Build for Node.js
wasm-pack build --target nodejs --release

# Build for bundlers (webpack, vite, etc.)
wasm-pack build --target bundler --release

The build output is placed in pkg/:

pkg/
  indentiagraph_wasm.js       # JavaScript glue code
  indentiagraph_wasm_bg.wasm  # WebAssembly binary (~1.2 MB gzipped)
  indentiagraph_wasm.d.ts     # TypeScript type declarations
  package.json                # npm package metadata

Core API

The WASM runtime exposes four main classes: WasmEngine, WasmVocabulary, WasmIndex, and WasmQueryResults. The first three are documented here, together with the version() function.

`WasmVocabulary`

The vocabulary maps between integer term IDs and their string representations (IRIs, blank nodes, literals).

class WasmVocabulary {
  /** Load vocabulary from raw byte arrays. */
  static from_bytes(vocab_data: Uint8Array, offset_data: Uint8Array): WasmVocabulary;

  /** Create an empty vocabulary (for testing). */
  static empty(): WasmVocabulary;

  /** Number of terms in the vocabulary. */
  len(): number;

  /** Check if the vocabulary is empty. */
  is_empty(): boolean;

  /** Look up a term by its integer ID. Returns null if not found. */
  get_by_index(id: bigint): string | null;

  /** Check if a term ID represents a literal (vs IRI or blank node). */
  is_literal(id: bigint): boolean;

  /** Check if a term ID represents an IRI or blank node. */
  is_iri_or_blank(id: bigint): boolean;

  /** Get the boundary index between literals and IRIs. */
  first_iri_index(): number;

  /** Get vocabulary statistics as a JSON object. */
  stats(): { term_count: number; first_iri_index: number; is_empty: boolean };
}
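
The lookup and classification methods above can be combined to resolve a batch of term IDs. The following is a minimal sketch, not part of the library: `VocabularyLike`, `ResolvedTerm`, and `resolveTerms` are hypothetical names, and the interface only mirrors the two documented methods it needs so the helper can be exercised without loading the WASM module.

```typescript
// Hypothetical structural type: mirrors only the documented WasmVocabulary
// methods used by this helper.
interface VocabularyLike {
  get_by_index(id: bigint): string | null;
  is_literal(id: bigint): boolean;
}

// Hypothetical result shape for illustration.
interface ResolvedTerm {
  id: bigint;
  kind: 'literal' | 'iri-or-blank';
  value: string;
}

// Resolve a batch of term IDs, skipping IDs the vocabulary does not contain.
function resolveTerms(vocab: VocabularyLike, ids: bigint[]): ResolvedTerm[] {
  const out: ResolvedTerm[] = [];
  for (const id of ids) {
    const value = vocab.get_by_index(id);
    if (value == null) continue; // unknown ID (loose check also covers undefined)
    out.push({
      id,
      kind: vocab.is_literal(id) ? 'literal' : 'iri-or-blank',
      value,
    });
  }
  return out;
}
```

A real `WasmVocabulary` instance should satisfy `VocabularyLike` structurally, so it can be passed in directly.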

`WasmIndex`

A permutation index stores triples in a specific column order (e.g., SPO, POS, OSP) to enable efficient prefix-based lookups.

class WasmIndex {
  /** Load an index from raw byte arrays. */
  static from_bytes(
    permutation: 'SPO' | 'SOP' | 'PSO' | 'POS' | 'OSP' | 'OPS',
    data: Uint8Array,
    metadata: Uint8Array
  ): WasmIndex;

  /** Get the permutation type as a string. */
  permutation(): string;

  /** Total number of triples in this index. */
  num_triples(): bigint;

  /** Number of compressed blocks. */
  num_blocks(): number;

  /** Check if the index is empty. */
  is_empty(): boolean;

  /**
   * Scan for triples matching a prefix pattern.
   * prefix: Array of 0-3 term IDs, given in this index's column order.
   * For an SPO index:
   *   [] = all triples
   *   [s] = triples with subject s
   *   [s, p] = triples with subject s and predicate p
   *   [s, p, o] = exact triple lookup
   * limit: Maximum number of triples to return.
   * Returns: Array of [subject_id, predicate_id, object_id] tuples.
   */
  scan(prefix: bigint[], limit: number): [bigint, bigint, bigint][];

  /** Count triples matching a prefix (more efficient than scan). */
  count(prefix: bigint[]): bigint;

  /** Check if any triple matches the prefix. */
  exists(prefix: bigint[]): boolean;

  /** Get index statistics as a JSON object. */
  stats(): { permutation: string; triple_count: number; block_count: number; is_empty: boolean };
}
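
Because `exists` and `count` avoid materializing triples, they can be used to check a prefix before scanning it. The sketch below is illustrative only: `IndexLike`, `PrefixSample`, and `samplePrefix` are hypothetical names, and the interface mirrors just the three documented methods involved.

```typescript
// Hypothetical structural type: mirrors the documented WasmIndex methods used here.
interface IndexLike {
  exists(prefix: bigint[]): boolean;
  count(prefix: bigint[]): bigint;
  scan(prefix: bigint[], limit: number): [bigint, bigint, bigint][];
}

// Hypothetical result shape for illustration.
interface PrefixSample {
  total: bigint;                      // number of matching triples in the index
  rows: [bigint, bigint, bigint][];   // at most `limit` of them
  truncated: boolean;                 // true when more matches exist than were returned
}

// Return up to `limit` matching triples plus the total match count.
function samplePrefix(index: IndexLike, prefix: bigint[], limit: number): PrefixSample {
  if (prefix.length > 3) {
    throw new RangeError('prefix takes at most 3 term IDs');
  }
  // Cheap existence check first: skip count and scan when nothing matches.
  if (!index.exists(prefix)) {
    return { total: BigInt(0), rows: [], truncated: false };
  }
  const total = index.count(prefix);
  const rows = index.scan(prefix, limit);
  return { total, rows, truncated: total > BigInt(rows.length) };
}
```

The `truncated` flag lets a caller show something like "50 of 1,234 triples" without scanning the full result.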

`WasmEngine`

A convenience wrapper that combines a vocabulary with one or more permutation indices.

class WasmEngine {
  /** Create an engine with the given vocabulary. */
  constructor(vocabulary: WasmVocabulary);

  /** Attach the SPO index. */
  set_spo_index(index: WasmIndex): void;

  /** Attach the POS index. */
  set_pos_index(index: WasmIndex): void;

  /** Attach the OSP index. */
  set_osp_index(index: WasmIndex): void;

  /** Get vocabulary statistics. */
  vocab_stats(): { term_count: number; is_empty: boolean };

  /** Get index statistics for all loaded indices. */
  index_stats(): {
    spo: { triple_count: number; block_count: number } | null;
    pos: { triple_count: number; block_count: number } | null;
    osp: { triple_count: number; block_count: number } | null;
  };

  /** Look up a term string by its ID. */
  lookup_term(id: bigint): string | null;

  /** Scan triples from the SPO index. */
  scan(prefix: bigint[], limit: number): [bigint, bigint, bigint][];

  /** Resolve term IDs to their string representations. */
  resolve_triple(
    subject_id: bigint,
    predicate_id: bigint,
    object_id: bigint
  ): { subject: string; predicate: string; object: string };
}
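
A common use of `scan` together with `resolve_triple` is to collect the properties of one subject. The following sketch groups resolved objects by predicate; `EngineLike` and `describeSubject` are hypothetical names introduced for illustration, and the interface mirrors only the two documented methods it calls.

```typescript
// Hypothetical structural type: mirrors the documented WasmEngine methods used here.
interface EngineLike {
  scan(prefix: bigint[], limit: number): [bigint, bigint, bigint][];
  resolve_triple(
    subject_id: bigint,
    predicate_id: bigint,
    object_id: bigint
  ): { subject: string; predicate: string; object: string };
}

// Group the resolved objects of one subject by predicate string.
function describeSubject(
  engine: EngineLike,
  subjectId: bigint,
  limit = 50
): Map<string, string[]> {
  const byPredicate = new Map<string, string[]>();
  for (const [s, p, o] of engine.scan([subjectId], limit)) {
    const { predicate, object } = engine.resolve_triple(s, p, o);
    const values = byPredicate.get(predicate) ?? [];
    values.push(object);
    byPredicate.set(predicate, values);
  }
  return byPredicate;
}
```

Since `scan` reads the SPO index, this helper assumes that index has been attached with `set_spo_index`.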

`version()`

Returns the library version string:

import { version } from '@indentia/graph-wasm';
console.log(version()); // e.g. "1.4.2"

Loading Data

The WASM runtime requires pre-built index files exported from a running IndentiaDB instance. These files are typically served as static assets.

Required Files

File	Description	Typical Size
`vocabulary`	Concatenated term strings	10 MB - 2 GB
`vocabulary.offsets`	Offset table into vocabulary	~8 bytes per term
`spo`	SPO permutation index (compressed blocks)	5 MB - 1 GB
`spo.meta`	SPO block metadata	~20 bytes per block
`pos`	POS permutation index	5 MB - 1 GB
`pos.meta`	POS block metadata	~20 bytes per block
`osp`	OSP permutation index	5 MB - 1 GB
`osp.meta`	OSP block metadata	~20 bytes per block

Not All Indices Are Required

Load only the permutation indices you need. For subject-centric lookups, only the SPO index is necessary. For predicate-centric queries (e.g., "find all triples with predicate rdf:type"), load POS. Load all three for full triple pattern flexibility.
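
To make the file selection concrete, the sketch below maps a set of permutations to the files that need to be fetched, using the file names from the Required Files table. `LoadedPermutation` and `requiredFiles` are hypothetical names, not part of the package.

```typescript
// Permutations that WasmEngine has setters for.
type LoadedPermutation = 'spo' | 'pos' | 'osp';

// List the files to fetch for the chosen permutations.
// The vocabulary and its offset table are always needed.
function requiredFiles(permutations: LoadedPermutation[]): string[] {
  const files = ['vocabulary', 'vocabulary.offsets'];
  // De-duplicate so each index is only listed once.
  for (const perm of Array.from(new Set(permutations))) {
    files.push(perm, `${perm}.meta`);
  }
  return files;
}
```

For a subject-centric application, `requiredFiles(['spo'])` yields four files; adding `'pos'` adds two more.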

Exporting Index Files

Export index files from a running IndentiaDB instance:

# Export all index files to a directory
indentiagraph backup create \
  --source /var/lib/indentiadb/data \
  --repository ./wasm-export \
  --index "spo=permutations/spo" \
  --index "pos=permutations/pos" \
  --index "osp=permutations/osp" \
  --path "vocabulary=vocabulary"

Browser Example

Complete Browser Application

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>IndentiaDB WASM Triple Browser</title>
</head>
<body>
  <h1>IndentiaDB WASM Triple Browser</h1>
  <div id="stats"></div>
  <div>
    <input id="subject-id" type="number" placeholder="Subject ID">
    <button id="scan-btn">Scan Triples</button>
  </div>
  <pre id="results"></pre>

  <script type="module">
    import init, { WasmEngine, WasmVocabulary, WasmIndex }
      from './pkg/indentiagraph_wasm.js';

    async function loadBytes(url) {
      const response = await fetch(url);
      if (!response.ok) throw new Error(`Failed to fetch ${url}: ${response.status}`);
      return new Uint8Array(await response.arrayBuffer());
    }

    async function main() {
      // Initialize WASM module
      await init();

      // Load vocabulary
      const [vocabData, offsetData] = await Promise.all([
        loadBytes('/data/vocabulary'),
        loadBytes('/data/vocabulary.offsets'),
      ]);
      const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);

      // Load SPO index
      const [spoData, spoMeta] = await Promise.all([
        loadBytes('/data/spo'),
        loadBytes('/data/spo.meta'),
      ]);
      const spoIndex = WasmIndex.from_bytes('SPO', spoData, spoMeta);

      // Create engine
      const engine = new WasmEngine(vocab);
      engine.set_spo_index(spoIndex);

      // Display statistics
      const vocabStats = engine.vocab_stats();
      const indexStats = engine.index_stats();
      document.getElementById('stats').innerHTML = `
        <p>Vocabulary: ${vocabStats.term_count} terms</p>
        <p>SPO Index: ${indexStats.spo?.triple_count ?? 0} triples
           in ${indexStats.spo?.block_count ?? 0} blocks</p>
      `;

      // Scan button handler
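      document.getElementById('scan-btn').addEventListener('click', () => {
        const subjectId = document.getElementById('subject-id').value.trim();
        const output = document.getElementById('results');

        // An empty input scans all triples; otherwise scan by subject ID.
        let prefix = [];
        if (subjectId) {
          // BigInt() throws on non-integer input such as "1.5" or "1e3"
          if (!/^\d+$/.test(subjectId)) {
            output.textContent = 'Subject ID must be a non-negative integer';
            return;
          }
          prefix = [BigInt(subjectId)];
        }
        const triples = engine.scan(prefix, 50);

        const resolved = triples.map(([s, p, o]) => {
          const triple = engine.resolve_triple(s, p, o);
          return `${triple.subject}  ${triple.predicate}  ${triple.object}`;
        });

        output.textContent = resolved.join('\n');
      });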
    }

    main().catch(console.error);
  </script>
</body>
</html>

Node.js / Server-Side Example

const { WasmEngine, WasmVocabulary, WasmIndex } = require('@indentia/graph-wasm');
const fs = require('fs');

// Load files from disk
const vocabData = new Uint8Array(fs.readFileSync('./data/vocabulary'));
const offsetData = new Uint8Array(fs.readFileSync('./data/vocabulary.offsets'));
const spoData = new Uint8Array(fs.readFileSync('./data/spo'));
const spoMeta = new Uint8Array(fs.readFileSync('./data/spo.meta'));

// Build engine
const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);
const spoIndex = WasmIndex.from_bytes('SPO', spoData, spoMeta);

const engine = new WasmEngine(vocab);
engine.set_spo_index(spoIndex);

// Scan all triples (first 100)
const triples = engine.scan([], 100);
for (const [s, p, o] of triples) {
  const resolved = engine.resolve_triple(s, p, o);
  console.log(`${resolved.subject}  ${resolved.predicate}  ${resolved.object}`);
}

Edge Runtime Example (Cloudflare Workers)

Deploy a lightweight triple lookup API on Cloudflare Workers:

import init, { WasmEngine, WasmVocabulary, WasmIndex }
  from '@indentia/graph-wasm';

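let enginePromise = null;

// Fetch one object from the R2 bucket binding (INDEX_BUCKET) as bytes.
// R2's get() resolves to null when the key does not exist.
async function loadObject(env, key) {
  const object = await env.INDEX_BUCKET.get(key);
  if (object === null) throw new Error(`Missing index file in bucket: ${key}`);
  return new Uint8Array(await object.arrayBuffer());
}

async function buildEngine(env) {
  await init();

  const [vocabData, offsetData, spoData, spoMeta] = await Promise.all([
    loadObject(env, 'vocabulary'),
    loadObject(env, 'vocabulary.offsets'),
    loadObject(env, 'spo'),
    loadObject(env, 'spo.meta'),
  ]);

  const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);
  const spoIndex = WasmIndex.from_bytes('SPO', spoData, spoMeta);

  const engine = new WasmEngine(vocab);
  engine.set_spo_index(spoIndex);
  return engine;
}

function initEngine(env) {
  // Cache the promise so concurrent requests share a single load;
  // reset it on failure so the next request can retry.
  if (!enginePromise) {
    enginePromise = buildEngine(env).catch((err) => {
      enginePromise = null;
      throw err;
    });
  }
  return enginePromise;
}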

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const e = await initEngine(env);

    if (url.pathname === '/stats') {
      return Response.json({
        vocab: e.vocab_stats(),
        indices: e.index_stats(),
      });
    }

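    if (url.pathname === '/scan') {
      const subjectId = url.searchParams.get('subject');
      const limit = parseInt(url.searchParams.get('limit') || '20', 10);
      if (!Number.isInteger(limit) || limit < 1) {
        return new Response('Invalid limit parameter', { status: 400 });
      }
      // BigInt() throws on non-integer strings, so validate first
      if (subjectId && !/^\d+$/.test(subjectId)) {
        return new Response('Invalid subject parameter', { status: 400 });
      }
      const prefix = subjectId ? [BigInt(subjectId)] : [];
      const triples = e.scan(prefix, limit);

      const resolved = triples.map(([s, p, o]) => e.resolve_triple(s, p, o));
      return Response.json({ count: resolved.length, triples: resolved });
    }

    if (url.pathname === '/lookup') {
      const termId = url.searchParams.get('id');
      if (!termId) return new Response('Missing id parameter', { status: 400 });
      if (!/^\d+$/.test(termId)) {
        return new Response('Invalid id parameter', { status: 400 });
      }
      const term = e.lookup_term(BigInt(termId));
      return Response.json({ id: termId, term });
    }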

    return new Response('Not found', { status: 404 });
  }
};

TypeScript Integration

The npm package ships with TypeScript declarations. Import types directly:

import init, {
  WasmEngine,
  WasmVocabulary,
  WasmIndex,
  version,
} from '@indentia/graph-wasm';

async function buildEngine(baseUrl: string): Promise<WasmEngine> {
  await init();

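  const load = async (path: string): Promise<Uint8Array> => {
    const res = await fetch(`${baseUrl}/${path}`);
    if (!res.ok) throw new Error(`Failed to fetch ${path}: ${res.status}`);
    return new Uint8Array(await res.arrayBuffer());
  };

  // Fetch both vocabulary files in parallel
  const [vocabData, offsetData] = await Promise.all([
    load('vocabulary'),
    load('vocabulary.offsets'),
  ]);
  const vocab = WasmVocabulary.from_bytes(vocabData, offsetData);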

  const engine = new WasmEngine(vocab);

  // Load indices in parallel
  const [spoData, spoMeta, posData, posMeta] = await Promise.all([
    load('spo'), load('spo.meta'),
    load('pos'), load('pos.meta'),
  ]);

  engine.set_spo_index(WasmIndex.from_bytes('SPO', spoData, spoMeta));
  engine.set_pos_index(WasmIndex.from_bytes('POS', posData, posMeta));

  return engine;
}

// Usage
const engine = await buildEngine('https://cdn.example.com/indentiadb-index');
console.log(`Loaded ${engine.vocab_stats().term_count} terms`);

Performance Considerations

Memory

The WASM runtime loads all index data into linear memory. Plan for approximately:

Dataset Size (triples)	Vocabulary	Index Files (SPO)	Total Memory
100,000	~5 MB	~3 MB	~15 MB
1,000,000	~40 MB	~25 MB	~100 MB
10,000,000	~300 MB	~200 MB	~800 MB
100,000,000	~2 GB	~1.5 GB	~5 GB

Browser Memory Limits

Most browsers limit WASM linear memory to 2-4 GB. For datasets larger than ~50 million triples, use the native server binary instead of the WASM runtime. Edge runtimes (Cloudflare Workers, Deno Deploy) typically have lower memory limits (128 MB - 512 MB).
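
As a rough planning aid, the memory table can be turned into a lookup that checks a dataset against a runtime's memory limit. This is only a sketch built from the approximate figures above: `MEMORY_TABLE`, `planningTotalMB`, and `fitsInMemory` are hypothetical names, 1 GB is treated as 1000 MB, and dataset sizes are rounded up to the next table row rather than interpolated.

```typescript
// Approximate "Total Memory" figures from the table above (1 GB taken as 1000 MB).
const MEMORY_TABLE: { triples: number; totalMB: number }[] = [
  { triples: 100_000, totalMB: 15 },
  { triples: 1_000_000, totalMB: 100 },
  { triples: 10_000_000, totalMB: 800 },
  { triples: 100_000_000, totalMB: 5000 },
];

// Conservative estimate: use the first table row at least as large as the dataset.
// Returns null when the dataset is larger than any row in the table.
function planningTotalMB(triples: number): number | null {
  const row = MEMORY_TABLE.find((r) => triples <= r.triples);
  return row ? row.totalMB : null;
}

// Check whether the estimate fits within a runtime's memory limit.
function fitsInMemory(triples: number, limitMB: number): boolean {
  const total = planningTotalMB(triples);
  return total !== null && total <= limitMB;
}
```

Under these figures, a 1,000,000-triple dataset fits a 128 MB edge limit (`fitsInMemory(1_000_000, 128)` is true), while a 10,000,000-triple dataset does not fit a 512 MB one.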

Startup Time

Initial loading involves fetching index files over the network and decompressing blocks into memory. Expect:

Dataset Size	File Transfer (50 Mbps)	Initialization	Total
100K triples	< 1s	< 100ms	~1s
1M triples	~5s	~500ms	~6s
10M triples	~40s	~3s	~45s

Use HTTP range requests or split indices into chunks for progressive loading of large datasets.
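
One way to approach range-based loading is to split a file into byte ranges, fetch them separately, and reassemble the bytes before calling `from_bytes`, which, as documented here, takes complete byte arrays. The sketch below covers only the range planning and reassembly steps; `planRanges` and `concatChunks` are hypothetical helpers, and the server is assumed to support `Range` requests.

```typescript
// Build HTTP Range header values ("bytes=start-end", end inclusive)
// that cover a file of totalBytes in chunks of at most chunkBytes.
function planRanges(totalBytes: number, chunkBytes: number): string[] {
  if (!Number.isInteger(totalBytes) || totalBytes < 0) {
    throw new RangeError('totalBytes must be a non-negative integer');
  }
  if (!Number.isInteger(chunkBytes) || chunkBytes < 1) {
    throw new RangeError('chunkBytes must be a positive integer');
  }
  const ranges: string[] = [];
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    const end = Math.min(start + chunkBytes, totalBytes) - 1; // inclusive end
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}

// Concatenate downloaded chunks, in order, into one contiguous byte array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Each returned string can be sent as the `Range` header of a separate `fetch` call, which allows progress reporting or parallel downloads; the total size would typically come from the `Content-Length` of a prior `HEAD` request.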

Query Latency

Once loaded, triple pattern scans are fast:

Operation	Latency
Exact triple lookup (`scan([s, p, o], 1)`)	< 1 ms
Subject scan (`scan([s], 100)`)	1-5 ms
Full scan (`scan([], 1000)`)	5-20 ms
Term lookup (`lookup_term(id)`)	< 0.1 ms

Supported Permutations

Each permutation index optimizes for a different query pattern:

Permutation	Column Order	Optimized Query Pattern
`SPO`	Subject, Predicate, Object	"Find all properties of entity X"
`SOP`	Subject, Object, Predicate	"Find the relationship between X and Y"
`PSO`	Predicate, Subject, Object	"Find all entities with property P"
`POS`	Predicate, Object, Subject	"Find entities where P = value"
`OSP`	Object, Subject, Predicate	"Find all entities that reference Y"
`OPS`	Object, Predicate, Subject	"Find entities linked to Y via P"

For most use cases, loading SPO and POS provides good coverage. Add OSP if you need reverse lookups (finding which subjects reference a given object).
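
The table suggests a simple selection rule: pick a loaded permutation whose leading columns are exactly the bound positions of the triple pattern. The following is a minimal sketch of that rule, assuming scan prefixes are given in the permutation's column order; `Pattern` and `planScan` are hypothetical names, not part of the package.

```typescript
type Permutation = 'SPO' | 'SOP' | 'PSO' | 'POS' | 'OSP' | 'OPS';
type Position = 'S' | 'P' | 'O';

// A triple pattern: bound positions carry a term ID, unbound ones are omitted.
interface Pattern {
  S?: bigint;
  P?: bigint;
  O?: bigint;
}

// Pick the first available permutation whose leading columns cover every
// bound position, and build the scan prefix in that permutation's column order.
// Returns null when no available permutation covers the pattern.
function planScan(
  pattern: Pattern,
  available: Permutation[]
): { permutation: Permutation; prefix: bigint[] } | null {
  const positions: Position[] = ['S', 'P', 'O'];
  const boundCount = positions.filter((k) => pattern[k] !== undefined).length;
  for (const permutation of available) {
    const prefix: bigint[] = [];
    for (const letter of permutation.split('') as Position[]) {
      const value = pattern[letter];
      if (value === undefined) break; // prefix ends at the first unbound column
      prefix.push(value);
    }
    if (prefix.length === boundCount) return { permutation, prefix };
  }
  return null;
}
```

With SPO, POS, and OSP loaded, a pattern binding predicate and object resolves to POS, and one binding subject and object resolves to OSP with the prefix `[o, s]`. With only SPO and POS loaded, the latter pattern has no covering index and would need a wider scan plus filtering.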

Build Configuration

The Cargo.toml for the WASM crate:

[package]
name = "indentiagraph-wasm"
description = "WebAssembly bindings for IndentiaGraph SPARQL engine"

[lib]
crate-type = ["cdylib", "rlib"]

[dependencies]
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
js-sys = "0.3"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde-wasm-bindgen = "0.6"
getrandom = { version = "0.3", features = ["wasm_js"] }
console_error_panic_hook = { version = "0.1", optional = true }

[features]
default = ["console_error_panic_hook"]

Reducing WASM Binary Size

The release build with wasm-opt disabled produces a ~1.2 MB gzipped binary. Enable wasm-opt in wasm-pack for further size reduction (~15-20% smaller), at the cost of longer build times. Set lto = true and codegen-units = 1 in the workspace [profile.release] for maximum optimization.