fast_plaid_rust

v2.0.0

FastPlaid

⭐️ Overview

FastPlaid implements efficient multi-vector search for ColBERT-style models. Unlike traditional single-vector search, multi-vector approaches maintain token-level embeddings for fine-grained similarity matching.

Key Features:

  • 🚀 WASM Support - Browser-native search with mxbai-edge-colbert-v0-17m (48-dim embeddings)
  • 4-bit Quantization + IVF - 8x compression, 3-5x faster search
  • 🔄 Incremental Updates - Add documents without full rebuild (NEW!)
  • 🎯 MaxSim Search - Token-level late interaction for accurate retrieval
  • 📦 Pure Rust - Fast, safe, and portable
  • 🗂️ Offline Index Building - Pre-compute indexes for instant browser loading

🏗️ Architecture

FastPlaid has two implementations for different use cases:

| Component | Purpose | Use Case |
|-----------|---------|----------|
| Native Rust (search/, index/) | Full PLAID with Product Quantization | Python bindings, CLI, server-side |
| WASM (lib_wasm_quantized.rs) | Lightweight 4-bit + IVF | Browser demos, GitHub Pages |

Why two implementations?

  • Native uses Candle (PyTorch-like) tensors to run the full PLAID algorithm
  • WASM sticks to pure Rust, without Candle, for browser compatibility
  • Both share the same 4-bit quantization codec
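
For readers new to the "IVF" (inverted file) part of the WASM row, here is a minimal Python sketch of how an inverted file narrows the candidate set before scoring. It is not FastPlaid's code: the function names `build_ivf` and `candidates`, the toy 2-dim vectors, and the use of dot products to pick the nearest centroid are all assumptions made for illustration.

```python
from collections import defaultdict

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_ivf(doc_tokens, centroids):
    """Map each centroid id to the set of doc ids that own a token nearest to it."""
    lists = defaultdict(set)
    for doc_id, tokens in enumerate(doc_tokens):
        for tok in tokens:
            nearest = max(range(len(centroids)), key=lambda c: dot(tok, centroids[c]))
            lists[nearest].add(doc_id)
    return lists

def candidates(query_tokens, centroids, lists, nprobe=1):
    """Collect docs from the nprobe best-matching centroid lists of every query token."""
    found = set()
    for q in query_tokens:
        ranked = sorted(range(len(centroids)), key=lambda c: dot(q, centroids[c]), reverse=True)
        for c in ranked[:nprobe]:
            found |= lists.get(c, set())
    return found

# Hypothetical toy data: three centroids and three short documents.
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
doc_tokens = [
    [[0.9, 0.1], [0.8, 0.2]],    # doc 0: tokens near centroid 0
    [[0.1, 0.9]],                # doc 1: token near centroid 1
    [[-0.9, 0.1], [0.2, 0.8]],   # doc 2: tokens near centroids 2 and 1
]
ivf = build_ivf(doc_tokens, centroids)
query = [[0.0, 1.0]]             # one query token pointing at centroid 1
shortlist = candidates(query, centroids, ivf, nprobe=1)
print(shortlist)                 # only docs 1 and 2 need full scoring
```

Only the shortlisted documents would then go through full MaxSim scoring, which is where the search-time savings come from.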

📖 See OFFLINE_INDEX_GUIDE.md for detailed architecture and workflows

💻 Installation

Python Package

pip install fast-plaid

PyTorch Compatibility:

| FastPlaid | PyTorch | Command |
|-----------|---------|---------|
| 1.2.4.280 | 2.8.0 | pip install fast-plaid==1.2.4.280 |
| 1.2.4.271 | 2.7.1 | pip install fast-plaid==1.2.4.271 |

WASM Demo

cd docs
python3 serve.py
# Visit http://localhost:8000/

Offline Index Building

# 1. Compute embeddings (Python)
python scripts/build_offline_wasm_index.py \
    --papers data/papers_1000.json \
    --output docs/data

# 2. Build .fastplaid index (Node.js + WASM)
node scripts/build_fastplaid_index.js \
    docs/data \
    docs/data/index.fastplaid

# 3. Deploy to browser
# index.fastplaid: 6.2 MB, loads in <1s

📖 See OFFLINE_INDEX_GUIDE.md for complete workflows

🎯 Quick Start

Python API

from fast_plaid import FastPlaid

# Initialize with ColBERT embeddings (48-dim token vectors)
index = FastPlaid(dim=48, nbits=4)  # 4-bit quantization

# Add documents (shape: [num_docs, max_tokens, 48])
index.add(doc_embeddings)

# Search (shape: [num_queries, query_tokens, 48])
scores = index.search(query_embeddings, k=10)

WASM Browser Demo

// Load model
const colbert = new ColBERT(
    modelWeights, dense1Weights, dense2Weights,
    tokenizer, config, stConfig,
    dense1Config, dense2Config, tokensConfig, 32
);

// Encode and search
const queryEmb = await colbert.encode({sentences: [query], is_query: true});
const results = await fastPlaid.search(queryEmb, 10);

// Incremental updates (NEW!)
const newDocEmb = await colbert.encode({sentences: [newDoc], is_query: false});
fastPlaid.update_index_incremental(newDocEmb, newDocInfo);

Incremental Index Updates 🔄

FastPlaid now supports adding documents without rebuilding the entire index:

// Create initial index
fastPlaid.load_documents_quantized(embeddings, docInfo, 256);

// Add new documents incrementally (8x faster than rebuild!)
fastPlaid.update_index_incremental(newEmbeddings, newDocInfo);

// Check statistics
const info = JSON.parse(fastPlaid.get_index_info());
console.log(`${info.num_documents} docs, ${info.pending_deltas} deltas`);

// Manual compaction (optional - auto-compacts at 10%)
fastPlaid.compact_index();

Performance:

  • 8.3x faster than a full rebuild for small batches (<100 docs)
  • 2.7x faster than a full rebuild for large batches (1000 docs)
  • Auto-compaction when deltas exceed 10%
  • <5% search overhead while deltas are pending
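
The delta-plus-compaction idea behind these numbers can be sketched in a few lines of Python. This is a toy model, not the WASM implementation: the `DeltaIndex` class is hypothetical, and it assumes the 10% threshold is measured as pending deltas relative to the documents already in the compacted index.

```python
class DeltaIndex:
    """Toy index: a compacted main list plus a buffer of pending additions."""

    def __init__(self, docs, compact_ratio=0.10):
        self.main = list(docs)          # stands in for the built, compacted index
        self.deltas = []                # documents added since the last compaction
        self.compact_ratio = compact_ratio
        self.compactions = 0

    def update_incremental(self, new_docs):
        """Append to the delta buffer; compact only once the buffer grows too large."""
        self.deltas.extend(new_docs)
        if len(self.deltas) > self.compact_ratio * len(self.main):
            self.compact()

    def compact(self):
        """Fold pending deltas into the main index (the expensive step)."""
        self.main.extend(self.deltas)
        self.deltas.clear()
        self.compactions += 1

    def info(self):
        return {"num_documents": len(self.main) + len(self.deltas),
                "pending_deltas": len(self.deltas)}

index = DeltaIndex(docs=[f"doc-{i}" for i in range(100)])
index.update_incremental(["new-a", "new-b"])                # 2 deltas, under the threshold
after_small = index.info()
index.update_incremental([f"new-{i}" for i in range(9)])    # 11 deltas, over the threshold
after_large = index.info()
print(after_small, after_large)
```

Small updates only touch the buffer, which is why they are cheap. The rebuild cost is paid once the buffer crosses the threshold.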

📖 See INCREMENTAL_UPDATES.md for full API documentation

🏗️ Architecture

Multi-Vector Pipeline

Text → Tokenizer → ModernBERT (256d) → 1_Dense (512d) → 2_Dense (48d) → MaxSim Search

Key Components:

  • ModernBERT: 17M parameter encoder
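  • Dense Projections (1_Dense, 2_Dense): 256→512→48 dimensions (10.6x smaller than the 512-dim intermediate)
  • 4-bit Quantization: Additional 8x storage savings
  • MaxSim Scoring: score = Σ_q max_d (q_token · d_token), i.e. each query token takes its best-matching document token and the maxima are summed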
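
To make the MaxSim formula concrete, here is a small pure-Python sketch. The function name `maxsim` and the 2-dim toy vectors are illustrative assumptions. Real embeddings are 48-dim, and the library's own scoring runs in Rust.

```python
def maxsim(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best dot product against any document token."""
    total = 0.0
    for q in query_tokens:
        total += max(sum(qi * di for qi, di in zip(q, d)) for d in doc_tokens)
    return total

query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # matches both query tokens exactly
doc_b = [[1.0, 0.0], [1.0, 0.0]]               # only matches the first query token

scores = {"doc_a": maxsim(query, doc_a), "doc_b": maxsim(query, doc_b)}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Because every query token picks its own best match, a document that covers all query terms outranks one that repeats a single matching term.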

WASM Implementation

  • Model: mixedbread-ai/mxbai-edge-colbert-v0-17m
  • Runtime: Pure browser (no server)
  • Index Size: ~2.7MB for 200 documents at 48 dimensions, ~0.7MB with 4-bit quantization (see Index Size Comparison)
  • Search Speed: <50ms for 1000 documents

📊 Performance

Index Size Comparison (200 documents)

| Method | Dimensions | Size | Compression |
|--------|-----------|------|-------------|
| Without 2_Dense | 512 | ~28.6 MB | 1x |
| With 2_Dense | 48 | ~2.7 MB | 10.6x |
| With 2_Dense + 4-bit | 48 | ~0.7 MB | 40x |
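
The compression column can be checked by dividing the 512-dim baseline by each size. The short Python check below uses only the figures from the table. Since those sizes are rounded, the ratios come out approximately rather than exactly.

```python
sizes_mb = {
    "Without 2_Dense (512-dim)": 28.6,
    "With 2_Dense (48-dim)": 2.7,
    "With 2_Dense + 4-bit (48-dim)": 0.7,
}
baseline = sizes_mb["Without 2_Dense (512-dim)"]

# Compression relative to the 512-dim baseline, rounded to one decimal place.
ratios = {name: round(baseline / size, 1) for name, size in sizes_mb.items()}

# The 48-dim row should track the raw dimension ratio.
dim_ratio = round(512 / 48, 2)

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
print(f"512 / 48 = {dim_ratio}")
```

The 48-dim row matches the dimension ratio 512 / 48, so that saving comes entirely from the projection.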

Speed Benchmarks

  • Encoding: ~50ms per document (WASM)
  • Search: ~10ms for 100 docs, ~50ms for 1000 docs
  • Index Build: ~500ms for 200 documents

🔧 WASM Build

The WASM package includes both FastPlaid indexing and ColBERT model inference:

# Quick build (recommended)
./build_wasm.sh

# Or manual build:
# 1. Build pylate-rs with 2_Dense support
cd pylate-rs
cargo build --lib --release --target wasm32-unknown-unknown \
    --no-default-features --features wasm

# 2. Generate bindings
cargo install wasm-bindgen-cli --version 0.2.104
wasm-bindgen target/wasm32-unknown-unknown/release/pylate_rs.wasm \
    --out-dir pkg --target web

# 3. Build FastPlaid WASM
cd ..
RUSTFLAGS="-C target-feature=+simd128" wasm-pack build --target web --out-dir docs/pkg --release

# 4. Fix WASM table limits (required for v1.3.0+)
python3 fix_wasm_table.py

Output:

  • pylate_rs_bg.wasm (4.9MB) - ColBERT model + 2_Dense
  • fast_plaid_rust_bg.wasm (171KB) - Indexing + search with incremental updates

Note: The table fix step is required for v1.3.0+ to support incremental update methods. See WASM_TABLE_FIX.md for details.

🎨 Demo Features

1. Real-Time Search (index.html)

  • Load mxbai-edge-colbert-v0-17m model
  • Index 100 documents
  • Interactive search with result highlighting
  • Performance metrics display

2. Paper Search (papers-demo.html)

  • Adjustable dataset size (10-1000 papers)
  • Compare FastPlaid vs Direct MaxSim
  • Index size visualization
  • Search method toggle

3. Method Comparison

  • FastPlaid (Indexed): 4-bit quantized, ~7KB for 10 docs
  • Direct MaxSim: Full precision, ~57KB for 10 docs
  • Speedup: 2-5x faster with FastPlaid for 100+ documents

📁 Project Structure

fast-plaid/
├── rust/                  # Core Rust implementation
│   ├── lib.rs            # FastPlaid index
│   └── lib_wasm.rs       # WASM bindings
├── docs/                 # Browser demos (GitHub Pages)
│   ├── index.html        # Main demo
│   ├── build-index.html  # Index builder
│   ├── mxbai-integration.js  # ColBERT integration
│   └── node_modules/     # WASM modules
├── python/               # Python bindings
└── README.md            # This file

🔬 Technical Details

2_Dense Support

FastPlaid uses pylate-rs with full 2_Dense layer support for mxbai-edge-colbert-v0-17m:

Architecture:

  1. 1_Dense: 256 → 512 (expansion for representation)
  2. 2_Dense: 512 → 48 (compression for efficiency)

Benefits:

  • Correct 48-dim output (not 512)
  • 10.6x smaller indexes
  • Matches official model specifications
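
The two projection steps can be pictured as a pair of per-token linear maps. The Python sketch below is a shape walkthrough only: the weights are random placeholders rather than the model's, and treating both layers as bias-free with no activation is an assumption, not something this README confirms.

```python
import random

def linear(vec, weights):
    """Apply a bias-free linear layer: weights has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def random_weights(out_dim, in_dim, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

rng = random.Random(0)
dense1 = random_weights(512, 256, rng)   # stand-in for 1_Dense: 256 -> 512
dense2 = random_weights(48, 512, rng)    # stand-in for 2_Dense: 512 -> 48

token = [rng.uniform(-1.0, 1.0) for _ in range(256)]  # one encoder output vector
hidden = linear(token, dense1)
embedding = linear(hidden, dense2)

# Floats stored per token if the index kept `hidden` instead of `embedding`.
floats_saved_ratio = len(hidden) / len(embedding)
print(len(token), len(hidden), len(embedding), round(floats_saved_ratio, 2))
```

Stopping after the first layer would leave 512 floats per token in the index, which is the source of the 10.6x size difference.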

Quantization

4-bit quantization with centroids:

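// Quantize to 4-bit (16 levels); round to the nearest level instead of truncating
let quantized: Vec<u8> = embeddings.iter()
    .map(|&x| ((x - min) / (max - min) * 15.0).round().clamp(0.0, 15.0) as u8)
    .collect();

// Dequantize for search
let reconstructed: Vec<f32> = quantized.iter()
    .map(|&q| min + (q as f32 / 15.0) * (max - min))
    .collect();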

Trade-offs:

  • Storage: 8x smaller
  • Speed: ~10% faster (less memory bandwidth)
  • Quality: <2% accuracy loss
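
The 8x storage figure follows from packing two 4-bit codes into each byte instead of spending four bytes per float32 value. The Python sketch below illustrates that arithmetic under simple assumptions: one global min/max per vector, and hypothetical helper names (`quantize_4bit`, `pack_nibbles`, and so on) that are not part of the FastPlaid API.

```python
def quantize_4bit(values):
    """Map floats onto codes 0..15 using the min/max of the input."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) or 1.0                      # guard against constant input
    codes = [round((v - lo) / scale * 15) for v in values]
    return codes, lo, hi

def pack_nibbles(codes):
    """Store two 4-bit codes per byte (pads with a zero code if the count is odd)."""
    if len(codes) % 2:
        codes = codes + [0]
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))

def unpack_nibbles(packed, count):
    """Split each byte back into its high and low 4-bit codes."""
    codes = []
    for byte in packed:
        codes.extend((byte >> 4, byte & 0x0F))
    return codes[:count]

def dequantize_4bit(codes, lo, hi):
    return [lo + (c / 15) * (hi - lo) for c in codes]

values = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 0.75, -0.25]
codes, lo, hi = quantize_4bit(values)
packed = pack_nibbles(codes)
restored = dequantize_4bit(unpack_nibbles(packed, len(values)), lo, hi)

float32_bytes = 4 * len(values)                   # size of the same values as float32
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(float32_bytes, len(packed), max_error)
```

The reconstruction error is bounded by half a quantization step, here (hi - lo) / 30, which is the price paid for the smaller payload.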

🚀 Deployment

GitHub Pages

The WASM demo can be deployed to GitHub Pages:

# Build for production
cd demo
./build-prod.sh

# Deploy
git add .
git commit -m "Update demo"
git push origin main

Limitations:

  • Max file size: 100MB (GitHub Pages limit)
  • Total site size: <1GB recommended
  • Use 4-bit quantization for large datasets
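
Before pushing, it can help to scan the output directory for files over the per-file limit. The helper below is a hypothetical sketch, not a script shipped with this repository. It reads "100MB" as 100 × 1024 × 1024 bytes, which is an assumption, and it demonstrates itself on a temporary directory with a tiny limit so it runs anywhere.

```python
from pathlib import Path
import tempfile

MAX_FILE_BYTES = 100 * 1024 * 1024   # per-file limit quoted above, read as MiB (assumption)

def oversized_files(root, limit=MAX_FILE_BYTES):
    """Return (relative path, size) for every file under root larger than limit."""
    root = Path(root)
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file() and p.stat().st_size > limit
    )

# Self-contained demonstration with a tiny limit so no large files are needed.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "index.fastplaid").write_bytes(b"\0" * 2048)
    (Path(tmp) / "small.json").write_bytes(b"{}")
    too_big = oversized_files(tmp, limit=1024)

print(too_big)
```

In practice you would call `oversized_files` on the directory you deploy with the default limit, and treat a non-empty result as a reason to quantize further or split the index.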

Local Development

cd demo
python3 serve.py  # http://localhost:8000/

📝 Recent Updates

v5.0 (2025-01-22):

  • ✅ Full 2_Dense support (48-dim embeddings)
  • ✅ 4-bit quantization (8x compression)
  • ✅ WASM demo with real ColBERT model
  • ✅ Query expansion support
  • ✅ Index size comparison UI
  • ✅ Adjustable dataset size

Previous:

  • SIMD optimizations
  • Offline index caching
  • PLAID implementation
  • Python/Rust bindings

🤝 Contributing

Contributions welcome! Key areas:

  • Performance optimizations
  • Additional quantization methods
  • More demo examples
  • Documentation improvements

📄 License

MIT License - see LICENSE file for details


Status: Production Ready | WASM: 4.9MB | Embedding Dim: 48 | Model: mxbai-edge-colbert-v0-17m