Live Mode

For a hands-on walkthrough, see the live worked example.

Live mode starts an HTTP server that matches records on the fly as they arrive. It supports two storage backends:

- In-memory (default): records and crossmap are held in RAM and persisted via a write-ahead log (WAL) and a crossmap CSV.
- SQLite (set live.db_path): records, crossmap, and review queue are stored in a SQLite database. Durable by default, with fast warm restarts.

Both modes use the same scoring pipeline, so a match score means the same thing regardless of which backend produced it.

Starting the server

meld serve --config config.yaml --port 8090

Once ready, the server prints:

meld serve listening on port 8090

Flag	Short	Description
`--config`	`-c`	Path to YAML config file (required)
`--port`	`-p`	TCP port to listen on (default: 8080)

Storage backends

SQLite throughput is ~18% lower than in-memory at the same scale (~1,395 vs ~1,698 req/s at 10k, c=10). The gap comes from B-tree traversal and per-connection page cache overhead, compared with DashMap hash lookups in the in-memory backend. Tail latencies (p95, p99) are better with SQLite, because the connection pool smooths out contention spikes. In SQLite, records are stored in columnar format (one column per field), so reads are fast and incur no JSON serialization overhead.
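The ~18% figure follows from the two throughput numbers quoted above. A quick check of the arithmetic (the variable names are illustrative only):

```python
# Throughput figures quoted above (req/s at 10k, c=10).
in_memory_rps = 1698
sqlite_rps = 1395

# Relative drop of SQLite versus the in-memory backend.
relative_drop = (in_memory_rps - sqlite_rps) / in_memory_rps
print(f"SQLite is {relative_drop:.1%} slower")  # prints "SQLite is 17.8% slower"
```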

SQLite uses a writer + reader pool architecture: sqlite_read_pool_size (default 4) read-only connections serve concurrent reads, while a single write connection handles all mutations. sqlite_pool_worker_cache_mb (default 128) controls the page cache per read connection.
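To make the writer + reader pool idea concrete, here is a minimal Python sketch of the pattern using the standard `sqlite3` module. It is not the melder's implementation: the `records` schema, the `open_pool` helper and the demo database are assumptions made for illustration, and only the two default values come from the text above. With those defaults the read connections can hold up to 4 × 128 = 512 MB of page cache between them.

```python
import os
import sqlite3
import tempfile

READ_POOL_SIZE = 4      # mirrors the sqlite_read_pool_size default
WORKER_CACHE_MB = 128   # mirrors the sqlite_pool_worker_cache_mb default


def open_pool(db_path, pool_size=READ_POOL_SIZE, cache_mb=WORKER_CACHE_MB):
    """Open one read-write connection plus `pool_size` read-only connections."""
    writer = sqlite3.connect(db_path)
    # Hypothetical schema; the file must exist before read-only opens succeed.
    writer.execute("CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, name TEXT)")
    writer.commit()
    readers = []
    for _ in range(pool_size):
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        # A negative cache_size is interpreted by SQLite as a size in KiB.
        conn.execute(f"PRAGMA cache_size = {-cache_mb * 1024}")
        readers.append(conn)
    return writer, readers


with tempfile.TemporaryDirectory() as tmp:
    writer, readers = open_pool(os.path.join(tmp, "demo.db"))
    writer.execute("INSERT INTO records VALUES ('a1', 'Acme Ltd')")
    writer.commit()

    # Every reader sees the committed row.
    seen = [r.execute("SELECT name FROM records").fetchone()[0] for r in readers]

    # Mutations through a reader are rejected by SQLite itself.
    try:
        readers[0].execute("DELETE FROM records")
        reader_can_write = True
    except sqlite3.OperationalError:
        reader_can_write = False

    for conn in [writer, *readers]:
        conn.close()

# Upper bound on page cache across the read pool, in MB.
max_cache_mb = READ_POOL_SIZE * WORKER_CACHE_MB
```

Routing all mutations through one connection avoids write-lock contention between workers, while the read-only connections can serve lookups concurrently.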

Startup sequences

What happens at launch depends on the storage backend:

In-memory (no `db_path`)

1. Dataset files (CSV/JSONL/Parquet) are loaded as the base record set
2. Embedding index caches are loaded from disk (if present and valid)
3. Blocking indices are built from the dataset records
4. Crossmap CSV is loaded
5. Exclusions CSV is loaded (if configured)
6. All WAL files are replayed in chronological order:
   - Records are inserted/removed from the in-memory store
   - Blocking indices are updated for each replayed record
   - Crossmap confirms/breaks are applied
   - Exclusions are applied/removed
   - Embedding vectors already in the cached index are skipped (no ONNX re-encoding)
7. Unmatched sets and common-ID indices are rebuilt from the final state
8. Review queue is populated from unresolved ReviewMatch WAL events
9. A new timestamped WAL file is opened for the current run
10. Initial matching pass: all unmatched B records are scored against the A pool using the full scoring pipeline (blocking, BM25, ANN, synonym). Auto-matches are claimed in the crossmap and persisted via the WAL. Review-band matches are added to the review queue. This ensures pre-loaded datasets are fully matched before the API starts listening. Set live.skip_initial_match: true to skip this step and start the API immediately.

SQLite — cold start (DB file does not exist)

1. A new SQLite database is created
2. Dataset files (CSV/JSONL/Parquet) are loaded and inserted into SQLite
3. If a crossmap CSV exists at cross_map.path, its pairs are imported into SQLite (a one-time migration; the CSV is not updated afterwards)
4. Embedding indices are built or loaded from cache
5. Blocking indices are built
6. A new WAL file is opened

SQLite — warm start (DB file exists)

1. SQLite database is opened directly; records, crossmap, and reviews are already there
2. Embedding index caches are loaded from disk
3. Blocking indices are rebuilt from the SQLite records
4. A new WAL file is opened

No CSV loading. No WAL replay. Restarts are fast.

Logging

All log output goes to stderr, so it can be redirected independently of any stdout output. Control the level with RUST_LOG:

# Default — info-level messages (startup, shutdown, errors)
meld serve --config config.yaml --port 8090

# Debug — includes per-request timing, encode/search/score spans
RUST_LOG=melder=debug meld serve --config config.yaml --port 8090

# JSON structured logs (for piping to a log aggregator)
meld serve --config config.yaml --port 8090 --log-format json

# Run in background and tail the log
meld serve --config config.yaml --port 8090 2>serve.log &
tail -f serve.log

Pipeline hooks

For event notifications (match confirmed, review queued, no match, match broken), see Hooks. Hooks run a single long-lived subprocess that receives events as JSON on stdin — zero impact on scoring throughput.
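As a rough illustration of the consuming side, the sketch below shows a hook process body in Python that reads events and tallies them by type. The one-event-per-line framing, the `type` field and the event names in the sample are assumptions for illustration; the actual event schema is described in Hooks.

```python
import io
import json
from collections import Counter


def consume_events(stream):
    """Read one JSON event per line and tally events by their "type" field."""
    counts = Counter()
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        event = json.loads(line)
        counts[event.get("type", "unknown")] += 1
    return counts


# Stand-in for sys.stdin; a real hook would pass sys.stdin instead.
# Event names and fields below are hypothetical.
sample = io.StringIO(
    '{"type": "match_confirmed", "a_id": "a1", "b_id": "b7"}\n'
    '{"type": "review_queued", "a_id": "a2", "b_id": "b9"}\n'
    '{"type": "match_confirmed", "a_id": "a3", "b_id": "b4"}\n'
)
counts = consume_events(sample)
```

Because the process is long-lived, the loop simply ends when the stream is closed.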

Write-ahead log (WAL)

Every record mutation (add or remove) and crossmap change is appended to the WAL file (configured via live.upsert_log, e.g. wal.ndjson). The WAL is a newline-delimited JSON file with one event per line.

In in-memory mode, the WAL is essential for crash recovery: if the server is killed, the next startup replays these events to restore state. In SQLite mode, the WAL is still written as a redundant safety net but is not needed for recovery.
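Replay amounts to applying each logged event, in order, to an empty (or dataset-seeded) store. The Python sketch below shows the idea under stated assumptions: the event names (`AddRecord`, `RemoveRecord`, `Confirm`, `Break`) and the fields other than `type` are hypothetical, not the melder's actual WAL schema.

```python
import json


def replay_wal(lines, records=None, crossmap=None):
    """Apply NDJSON WAL events in order and return the rebuilt state."""
    records = {} if records is None else records
    crossmap = {} if crossmap is None else crossmap
    for line in lines:
        if not line.strip():
            continue
        ev = json.loads(line)
        kind = ev["type"]
        if kind == "AddRecord":            # hypothetical event name
            records[(ev["side"], ev["id"])] = ev["fields"]
        elif kind == "RemoveRecord":       # hypothetical event name
            records.pop((ev["side"], ev["id"]), None)
        elif kind == "Confirm":            # hypothetical event name
            crossmap[ev["a_id"]] = ev["b_id"]
        elif kind == "Break":              # hypothetical event name
            crossmap.pop(ev["a_id"], None)
    return records, crossmap


wal = [
    '{"type": "AddRecord", "side": "a", "id": "a1", "fields": {"name": "Acme Ltd"}}',
    '{"type": "AddRecord", "side": "b", "id": "b1", "fields": {"name": "ACME Limited"}}',
    '{"type": "Confirm", "a_id": "a1", "b_id": "b1"}',
    '{"type": "AddRecord", "side": "b", "id": "b2", "fields": {"name": "Globex"}}',
    '{"type": "RemoveRecord", "side": "b", "id": "b2"}',
]
records, crossmap = replay_wal(wal)
```

Because events are applied strictly in log order, a record that was added and later removed does not reappear after a restart.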

On clean shutdown the WAL is compacted (duplicate entries collapsed) and can be inspected with:

# See recent WAL entries
tail -20 wal.ndjson

# Count events by type
jq -r .type wal.ndjson | sort | uniq -c

Each server run creates a new timestamped WAL file (e.g. wal_20260312T143207Z.ndjson). On startup, all WAL files matching the configured base path are discovered and replayed in lexicographic (chronological) order. Each run’s WAL is compacted at shutdown. Old WAL files accumulate across runs; delete them manually if disk space is a concern (only the most recent compacted file is needed for full recovery).
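The reason lexicographic order doubles as chronological order is that the timestamp in the filename is fixed-width and runs from most to least significant digit. A small Python sketch of the discovery step, assuming files are named `<base stem>_<timestamp><suffix>` as in the example above (the `discover_wal_files` helper is illustrative, not part of the melder):

```python
import tempfile
from pathlib import Path


def discover_wal_files(base_path):
    """Return WAL files for the configured base path, oldest first."""
    base = Path(base_path)
    pattern = f"{base.stem}_*{base.suffix}"   # e.g. wal_*.ndjson
    return sorted(base.parent.glob(pattern), key=lambda p: p.name)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("wal_20260312T143207Z.ndjson",
                 "wal_20260101T090000Z.ndjson",
                 "wal_20251231T235959Z.ndjson",
                 "notes.txt"):
        (Path(tmp) / name).touch()
    replay_order = [p.name for p in discover_wal_files(Path(tmp) / "wal.ndjson")]
```

Here `replay_order` lists the December 2025 file first and the March 2026 file last, and ignores `notes.txt`.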

Cross-map persistence

How confirmed matches are persisted depends on the storage backend:

- In-memory: The crossmap is held in RAM and flushed to the crossmap CSV periodically (every crossmap_flush_secs, default 5 seconds) and on shutdown. The CSV is the durable record of which pairs have been matched.
- SQLite: Every confirm/break is written to the database immediately. The crossmap CSV is never updated. Use the /crossmap/pairs API endpoint or query the crossmap table in the SQLite DB directly to export pairs.

Shutdown

Press Ctrl-C (SIGINT) or send SIGTERM. The melder stops accepting new connections, drains in-flight requests, flushes and compacts the WAL, saves the crossmap CSV (in-memory mode only; SQLite mode has nothing to flush), and persists the index caches. No data is lost.

Persistence and restart

Live mode is designed to survive restarts. The full state — records added via the API, confirmed crossmap pairs, and embedding vectors — is persisted to disk and restored on the next startup.

In-memory mode (default)

What is persisted

Component	Mechanism	When
Record mutations (add, remove)	Write-ahead log (WAL)	Every API call
Crossmap confirmations/breaks	WAL + crossmap CSV	API call + periodic flush
Embedding vectors	Index cache (`.usearchdb` or `.index`)	Shutdown
Review queue	WAL (`ReviewMatch` events)	Every API call

Shutdown sequence

1. WAL is flushed and compacted (deduplicates per record ID, last-write-wins)
2. Crossmap CSV is flushed to disk
3. Combined embedding index caches are saved (includes all API-added vectors)
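Last-write-wins compaction in the first step can be pictured as keeping only the final event seen for each record ID. The Python sketch below illustrates that rule for record events only; the `AddRecord` event name and the `side`, `id` and `fields` keys are hypothetical, and how the melder treats crossmap or review events during compaction is not covered here.

```python
import json


def compact_wal(lines):
    """Keep only the last event for each record ID (last-write-wins)."""
    latest = {}
    for line in lines:
        if not line.strip():
            continue
        ev = json.loads(line)
        key = (ev.get("side"), ev["id"])
        # Re-insert so the surviving event sits at the position of its last write.
        latest.pop(key, None)
        latest[key] = ev
    return [json.dumps(ev) for ev in latest.values()]


wal = [
    '{"type": "AddRecord", "side": "b", "id": "b1", "fields": {"name": "Acme"}}',
    '{"type": "AddRecord", "side": "b", "id": "b2", "fields": {"name": "Globex"}}',
    '{"type": "AddRecord", "side": "b", "id": "b1", "fields": {"name": "Acme Ltd"}}',
]
compacted = [json.loads(line) for line in compact_wal(wal)]
```

Three events collapse to two: `b2` is unchanged, and `b1` survives only in its latest form.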

Startup sequence

1. Dataset files (CSV/JSONL/Parquet) are loaded as the base record set
2. Embedding index caches are loaded from disk (if present and valid)
3. Blocking indices are built from the dataset records
4. Crossmap CSV is loaded
5. All WAL files are replayed in chronological order:
   - Records are inserted/removed from the in-memory store
   - Blocking indices are updated for each replayed record
   - Crossmap confirms/breaks are applied
   - Embedding vectors already in the cached index are skipped (no ONNX re-encoding)
6. Unmatched sets and common-ID indices are rebuilt from the final state
7. Review queue is populated from unresolved ReviewMatch WAL events
8. A new timestamped WAL file is opened for the current run

What this means in practice

- Records added via /a/add or /b/add survive restarts. They are replayed from the WAL and their embedding vectors are loaded from the index cache, so no re-encoding is required.
- Confirmed crossmap pairs survive via both the crossmap CSV and WAL replay (belt and suspenders).
- Blocking works correctly for WAL-replayed records. A new record added after restart will find WAL-replayed records on the opposite side as match candidates.
- The review queue is rebuilt from WAL events, minus any pairs that were subsequently confirmed or broken.
- The base dataset files are never modified. The WAL captures the delta.
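The review-queue rule above (ReviewMatch events, minus pairs later confirmed or broken) can be sketched in a few lines of Python. `ReviewMatch` is the event name used in this document; the `Confirm` and `Break` names and all field names are assumptions for illustration.

```python
def rebuild_review_queue(events):
    """Return pairs still awaiting review after replaying events in order."""
    pending = {}
    for ev in events:
        pair = (ev["a_id"], ev["b_id"])
        if ev["type"] == "ReviewMatch":
            pending[pair] = ev.get("score")
        elif ev["type"] in ("Confirm", "Break"):   # hypothetical event names
            # A later decision on the pair resolves the review.
            pending.pop(pair, None)
    return pending


events = [
    {"type": "ReviewMatch", "a_id": "a1", "b_id": "b1", "score": 0.71},
    {"type": "ReviewMatch", "a_id": "a2", "b_id": "b2", "score": 0.68},
    {"type": "Confirm", "a_id": "a1", "b_id": "b1"},
    {"type": "ReviewMatch", "a_id": "a3", "b_id": "b3", "score": 0.74},
    {"type": "Break", "a_id": "a3", "b_id": "b3"},
]
queue = rebuild_review_queue(events)
```

Only the `("a2", "b2")` pair remains queued, since the other two were resolved by later events.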

WAL files

SQLite mode (`live.db_path` set)

What is persisted

Component	Mechanism	When
Records	SQLite `records` table	Immediately on every add/remove
Crossmap pairs	SQLite `crossmap` table	Immediately on every confirm/break
Review queue	SQLite `reviews` table	Immediately on every review-band match
Embedding vectors	Index cache (`.usearchdb` or `.index`)	Shutdown
WAL	Same as in-memory mode	Every API call (redundant safety net)

Shutdown sequence

1. WAL is flushed and compacted
2. Combined embedding index caches are saved

There is no crossmap CSV flush, because SQLite is already durable.

Warm startup (DB exists)

1. SQLite database is opened directly; records, crossmap, and reviews are already there
2. Embedding index caches are loaded from disk
3. Blocking indices are rebuilt from the SQLite records
4. A new WAL file is opened

No CSV loading. No WAL replay. Restarts are fast.

Cold startup (no DB file)

1. A new SQLite database is created
2. Dataset files (CSV/JSONL/Parquet) are loaded and inserted into SQLite
3. If a crossmap CSV exists at cross_map.path, its pairs are imported into SQLite (a one-time migration; the CSV is not updated afterwards)
4. Embedding indices are built or loaded from cache
5. Blocking indices are built
6. A new WAL file is opened

Migration from in-memory to SQLite

Add live.db_path to your config and restart. The first startup is a cold start: datasets are loaded from their source files, the crossmap CSV is imported into the database, and the WAL is written as a redundant log. From the second startup onwards, the database is the sole source of truth and restarts are warm starts, with no CSV loading and no WAL replay. The crossmap CSV is never written to again.
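The cold/warm distinction behind this migration comes down to whether the database file already exists. The Python sketch below imitates the one-time crossmap import under stated assumptions: the `open_store` helper, the `a_id`/`b_id` CSV header and the table schema are hypothetical, not the melder's actual code.

```python
import csv
import os
import sqlite3
import tempfile


def open_store(db_path, crossmap_csv):
    """Return (connection, cold); import the crossmap CSV only on a cold start."""
    cold = not os.path.exists(db_path)   # check before connect() creates the file
    conn = sqlite3.connect(db_path)
    if cold:
        conn.execute("CREATE TABLE crossmap (a_id TEXT PRIMARY KEY, b_id TEXT)")
        if os.path.exists(crossmap_csv):
            with open(crossmap_csv, newline="") as fh:
                pairs = [(row["a_id"], row["b_id"]) for row in csv.DictReader(fh)]
            conn.executemany("INSERT INTO crossmap VALUES (?, ?)", pairs)
        conn.commit()
    return conn, cold


def pair_count(conn):
    return conn.execute("SELECT COUNT(*) FROM crossmap").fetchone()[0]


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "live.db")
    csv_path = os.path.join(tmp, "crossmap.csv")
    with open(csv_path, "w", newline="") as fh:
        fh.write("a_id,b_id\na1,b7\na2,b9\n")

    conn, first_cold = open_store(db_path, csv_path)    # cold: CSV is imported
    first_count = pair_count(conn)
    conn.close()

    # Later edits to the CSV are ignored once the database exists.
    with open(csv_path, "a", newline="") as fh:
        fh.write("a3,b4\n")

    conn, second_cold = open_store(db_path, csv_path)   # warm: DB opened as-is
    second_count = pair_count(conn)
    conn.close()
```

The second open still reports two pairs, which mirrors the point that the CSV stops being a source of truth after the first startup.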
