# Building

## Build commands
```sh
cargo build --release

# With HNSW vector index (recommended for production)
cargo build --release --features usearch

# With Parquet support
cargo build --release --features usearch,parquet-format

# With builtin embedding model (self-contained binary, no network needed at runtime)
cargo build --release --features usearch,builtin-model

# With GPU-accelerated encoding (CoreML on macOS, CUDA on Linux)
cargo build --release --features usearch,gpu-encode

# All features together
cargo build --release --features usearch,parquet-format,builtin-model,gpu-encode
```
The binary is produced at `./target/release/meld` (Windows: `.\target\release\meld.exe`). Either add it to your PATH or invoke it directly.
## Feature flags
| Feature | What it does |
|---|---|
| `usearch` | Enables the HNSW approximate nearest-neighbour graph index for O(log N) candidate search instead of the flat backend’s O(N) brute-force scan. Up to 5x faster at scale. |
| `parquet-format` | Enables reading Parquet files as input datasets. All column types (string, integer, float, boolean) are converted to strings internally. Snappy-compressed Parquet files are supported. |
| `builtin-model` | Compiles an embedding model into the binary so no network access or model download is needed at runtime. Set `model: builtin` in config to use it. See Builtin model below. |
| `gpu-encode` | Enables GPU-accelerated ONNX encoding for batch mode. Uses CoreML on macOS and CUDA on Linux. Requires the ONNX Runtime shared library at runtime. See GPU encoding below. |
> [!TIP]
> The `usearch` feature (HNSW vector index) is enabled by default. It provides O(log N) candidate search instead of the flat backend’s O(N) brute-force scan. At 100k records, usearch is the difference between a 12-second warm run and a 4-minute one — see Performance for full numbers. To build without it (pure Rust, no C++ dependency): `cargo build --release --no-default-features`.
## Model download
The ONNX model is downloaded automatically on first run to `~/.cache/fastembed/` on Linux/macOS or `%LOCALAPPDATA%\fastembed\` on Windows.
## Builtin model
Build with `--features builtin-model` to compile an embedding model directly into the binary. The resulting binary needs no network access and no model files on disk — ideal for air-gapped, containerised, or edge deployments.
```sh
# Default: embeds themelder/arctic-embed-xs-entity-resolution from HuggingFace
cargo build --release --features usearch,builtin-model

# Custom model from HuggingFace
MELDER_BUILTIN_MODEL=sentence-transformers/all-MiniLM-L6-v2 \
  cargo build --release --features usearch,builtin-model

# Custom model from a local directory
MELDER_BUILTIN_MODEL=./models/my-fine-tuned-model \
  cargo build --release --features usearch,builtin-model
```
The model directory must contain the standard HuggingFace ONNX export files: `model.onnx`, `tokenizer.json`, `config.json`, `special_tokens_map.json`, and `tokenizer_config.json`.
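A quick way to sanity-check a local model directory before kicking off a long build is to verify those five files are present. This helper is illustrative, not part of the melder:

```python
from pathlib import Path

# The five files --features builtin-model expects in a local model directory.
REQUIRED_EXPORT_FILES = [
    "model.onnx",
    "tokenizer.json",
    "config.json",
    "special_tokens_map.json",
    "tokenizer_config.json",
]

def missing_export_files(model_dir: str) -> list[str]:
    """Return the required HuggingFace ONNX export files absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_EXPORT_FILES if not (root / name).is_file()]
```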
To use the builtin model at runtime, set `model: builtin` in your config:

```yaml
embeddings:
  model: builtin
```
The binary size increases by the size of the ONNX model (~90 MB for arctic-embed-xs). The model is downloaded once at build time and cached in the Cargo build directory — subsequent builds reuse the cached files.
## GPU encoding
Build with `--features gpu-encode` to enable GPU-accelerated ONNX encoding for batch mode (`meld run`). This offloads the embedding inference to the GPU, dramatically reducing encoding time for large datasets.
```sh
cargo build --release --features usearch,gpu-encode
```
**GPU encoding is batch mode only.** It is ignored in live mode (`meld serve`), where single-record GPU dispatch overhead exceeds the compute savings.
### Platform setup
GPU encoding uses dynamic linking against the ONNX Runtime shared library. The library must be present at runtime.
**macOS (Apple Silicon).** The melder auto-detects ONNX Runtime installed via Homebrew. No manual configuration needed:

```sh
brew install onnxruntime
```
If you install ONNX Runtime to a non-standard location, set the `ORT_DYLIB_PATH` environment variable:

```sh
export ORT_DYLIB_PATH=/path/to/libonnxruntime.dylib
```
The CoreML execution provider is used automatically. It dispatches compute across CPU, GPU, and the Neural Engine (ANE) based on the model graph.
**Linux (NVIDIA GPU).** Download a CUDA-enabled ONNX Runtime build from Microsoft’s releases and set `ORT_DYLIB_PATH`:

```sh
export ORT_DYLIB_PATH=/path/to/libonnxruntime.so
```
Requires CUDA toolkit and cuDNN to be installed. The CUDA execution provider is used automatically.
**Windows.** GPU encoding is not currently supported on Windows. The `gpu-encode` feature will compile, but `encoder_device: gpu` in the config will fall back to CPU with a warning.
### Configuration
Set `encoder_device: gpu` in the `performance` section of your config file:

```yaml
performance:
  encoder_device: gpu
  encoder_pool_size: 12    # ~60% of CPU cores (see tuning guide)
  encoder_batch_size: 256  # optimal for GPU; default when device is gpu
```
See Configuration for the full tuning guide including benchmark results.
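The `encoder_pool_size: 12` above corresponds to a 20-core machine under the ~60% heuristic; a tiny helper makes the arithmetic explicit (the function is illustrative, not part of the melder):

```python
def suggested_encoder_pool_size(logical_cores: int, fraction: float = 0.6) -> int:
    """~60% of logical CPU cores, never below 1 (the heuristic from the tuning note)."""
    return max(1, round(logical_cores * fraction))

# 20 logical cores -> pool size 12, matching the example config above
print(suggested_encoder_pool_size(20))
```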
> [!NOTE]
> If the binary was built without `--features gpu-encode`, setting `encoder_device: gpu` in the config will produce a clear error at startup: “GPU encoding requires building with --features gpu-encode”.
# Benchmark data
The repository includes a synthetic data generator and per-directory generation scripts. Only the 10k datasets are committed (small enough for CI); larger sizes must be regenerated after cloning.
Prerequisites:

```sh
pip install faker pandas pyarrow
```
## Generation scripts
Each benchmark directory has its own `generate_data.sh` that produces exactly the datasets its benchmarks need:
| Script | Sizes | Output location |
|---|---|---|
| `benchmarks/batch/generate_data.sh` | 10k, 100k, 1M | `benchmarks/data/` |
| `benchmarks/live/generate_data.sh` | 10k, 100k, 1M | `benchmarks/data/` |
| `benchmarks/accuracy/generate_data.sh` | 10k | `benchmarks/data/` |
The batch and live scripts accept size arguments:
```sh
# Generate all sizes (10k + 100k + 1M)
./benchmarks/batch/generate_data.sh

# Generate specific sizes
./benchmarks/batch/generate_data.sh 10k 100k
./benchmarks/live/generate_data.sh 1M

# Accuracy benchmarks only need 10k
./benchmarks/accuracy/generate_data.sh
```
All generators use seed 42 for deterministic, reproducible output. The `--addresses` flag is always passed so datasets include address fields used by some configs.
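The reproducibility guarantee boils down to seeding the generator once up front. A minimal Python sketch of the pattern the scripts follow — the real scripts use faker, and the field values here are invented:

```python
import random

def make_record(rng: random.Random) -> dict:
    # Stand-in for a faker-generated row; the values are illustrative
    first = rng.choice(["Ana", "Ben", "Chloe", "Dev"])
    last = rng.choice(["Ito", "Khan", "Lopez", "Nguyen"])
    return {"name": f"{first} {last}", "zip": f"{rng.randrange(100000):05d}"}

def generate(n: int, seed: int = 42) -> list[dict]:
    rng = random.Random(seed)  # one seed up front => identical output every run
    return [make_record(rng) for _ in range(n)]

# Two independent runs with the same seed produce identical datasets
assert generate(1000) == generate(1000)
```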
## What’s committed vs generated
| Data | Status |
|---|---|
| 10k datasets (`benchmarks/data/dataset_*_10k.*`) | Committed in git |
| 100k, 1M datasets | Gitignored — regenerate with scripts above |
## Special cases
- `accuracy/10kx10k_exclusions` generates its own data during Phase 0 of its `run_test.py` (uses `n_exact=1000` for exact-match exclusion testing). It is not covered by the accuracy generation script.
- `accuracy/science/` is a research journal documenting 14 sequential fine-tuning experiments, not a standard benchmark suite. It is not intended to be reproduced — the committed results (`metrics.csv`, `learning_curve.png`, `experiments.md`) are the scientific record. Reproducing experiments would require days of GPU compute and running experiments in dependency order (later experiments depend on model weights produced by earlier ones). See `benchmarks/accuracy/science/experiments.md` for the full experiment history, results, and methodology.
# Environment variables
| Variable | Purpose |
|---|---|
| `RUST_LOG=melder=debug` | Enable debug logging |
| `RUST_LOG=melder=info` | Default log level |
| `RAYON_NUM_THREADS` | Batch scoring thread count (defaults to logical CPU count) |
| `ORT_DYLIB_PATH` | Path to `libonnxruntime.dylib` (macOS) or `libonnxruntime.so` (Linux). Required for `gpu-encode` if ONNX Runtime is not in a standard location. On macOS, Homebrew installs are auto-detected. |
| `MELDER_BUILTIN_MODEL` | Build-time only. HuggingFace repo ID or local path for the model to embed when `--features builtin-model` is enabled. Defaults to `themelder/arctic-embed-xs-entity-resolution`. |
# Windows
The melder builds and runs on Windows. A few things to be aware of:
**Prerequisites.** Install Rust via rustup, which defaults to the MSVC toolchain on Windows. You will need the Visual Studio Build Tools (or full Visual Studio) with the “Desktop development with C++” workload installed — this provides the MSVC compiler and linker that Rust requires.
**Building.** The same cargo build commands work. The binary is produced at `target\release\meld.exe`:

```sh
cargo build --release --features parquet-format
.\target\release\meld.exe validate --config config.yaml
```
**Environment variables.** `RUST_LOG` and `RAYON_NUM_THREADS` work the same way. Set them in PowerShell with `$env:RUST_LOG = "melder=debug"` or in cmd with `set RUST_LOG=melder=debug`.
**Graceful shutdown.** On Unix, the melder listens for both Ctrl-C and SIGTERM. On Windows, SIGTERM is not available — use Ctrl-C to trigger a clean shutdown (in-flight requests drain, WAL is compacted, cross-map and index caches are saved).
**Config paths.** Both forward slashes and backslashes work in YAML config file paths (`datasets.a.path`, `output.results_path`, etc.).