@zlaabsi/turboquant-wasm

v0.1.1

TurboQuant vector quantization for the browser — compress embeddings 8x, search client-side

turboquant-wasm is a Rust/WebAssembly implementation of the TurboQuant MSE variant (Algorithm 1 from the paper). It is built for applications that already have embeddings and want local retrieval without shipping a vector database or a graph index.

Why this repo does not ship the QJL variant

The short version is that QJL works against the main design goal of turboquant-wasm: keep browser-side retrieval small and memory-efficient.

  • QJL adds an extra projection matrix, which materially increases runtime memory pressure.
  • In browser and WASM settings, that extra matrix becomes expensive quickly, especially once embedding dimensions get large.
  • The MSE variant already delivers strong recall at the bit-rates this repo targets, especially at 3 bits and above.
  • For this project, the tradeoff was not worth it: more complexity and more memory, without fitting the core promise of a tiny browser-first package.
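
To put rough numbers on the memory point: assuming the extra projection is a dense d × d float32 matrix (an illustration of the scaling, not QJL's exact parameterization), the cost grows quadratically with embedding dimension:

```javascript
// Approximate extra memory for a dense d x d float32 projection matrix.
// Illustrative sketch only -- QJL's real parameterization may differ.
function projectionMatrixBytes(d) {
  return d * d * 4; // float32 = 4 bytes per entry
}

console.log(projectionMatrixBytes(384)); // 589824 bytes (~0.56 MiB)
console.log(projectionMatrixBytes(768)); // 2359296 bytes (~2.25 MiB)
```

At 768 dimensions that single matrix already dwarfs the entire ~30 KiB package, which is why it conflicts with the browser-first size budget.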

So the repo deliberately optimizes for the TurboQuant MSE path: smaller package, lower memory footprint, simpler runtime story.

At a glance

  • Small web package. The current measured browser npm build is about 30.3 KiB gzipped.
  • Aggressive compression. With 4-bit quantization, a 384d vector takes about 196 B and a 768d vector about 388 B.
  • Direct search on compressed vectors. No full decode step on every query.
  • Portable packaging. Runs in browsers, Node.js, and WASM-friendly edge runtimes.
  • Persistence built in. Save indexes with save() and restore them with Index.load().
  • Example-first repo. Includes browser, WebGPU, and Cloudflare demos.
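
The compression bullet is easy to sanity-check. A few lines reproduce the arithmetic; the roughly 4 B gap between the packed payload and the quoted per-vector figures is assumed to be per-vector overhead, not a documented constant:

```javascript
// Packed payload size for one dim-dimensional vector at `bits` bits per dimension.
function packedPayloadBytes(dim, bits) {
  return Math.ceil((dim * bits) / 8);
}

console.log(packedPayloadBytes(384, 4)); // 192, vs ~196 B quoted (small overhead assumed)
console.log(packedPayloadBytes(768, 4)); // 384, vs ~388 B quoted

// Versus raw float32 (384 * 4 = 1536 B), 4-bit packing is the advertised 8x.
console.log((384 * 4) / packedPayloadBytes(384, 4)); // 8
```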

Bundle Size Analysis

Current turboquant-wasm bundle numbers below come from the latest measured snapshot in benchmarks/results/2026-04-09-m1-max-node22.json. That snapshot keeps the 2026-04-08 search measurements and refreshes the browser npm package size to the current pkg-bundler/ output. Alternative-library rows are maintained comparison estimates from benchmarks/wasm_analysis.md, not a fresh side-by-side rerun in this repo.

Current measured package

The npm browser entrypoint now ships the wasm-pack --target bundler output rather than the raw web loader. That keeps the published package free of a runtime fetch()-based Wasm bootstrap, which avoids the Socket alert on pkg/turboquant_wasm.js while still keeping the repo-local demos on the plain web target.

Comparison with alternative browser-side vector search libraries

turboquant-wasm is materially smaller than graph-based WASM alternatives. That matters most for edge deployments, mobile web, and embedded search widgets where bundle budget is tight.

Why it stays small

  • No HNSW graph or graph-tuning machinery in the binary.
  • No external native dependency stack, BLAS, or LAPACK.
  • A small core: PRNG, orthogonalization, centroid tables, scalar quantization, packed storage, and compressed brute-force scan.
  • Size-oriented WASM build settings, plus a design that matches the algorithm instead of wrapping a larger ANN engine.
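
To make the "small core" concrete, here is a minimal sketch of two of those pieces: uniform scalar quantization with packed 4-bit storage, and a scan that scores packed codes directly. The uniform quantizer, the [-1, 1] range, and the function names are illustrative, not the crate's actual internals:

```javascript
// Minimal sketch: 4-bit scalar quantization, two codes packed per byte, and a
// dot-product scan over packed codes with no full decode buffer per vector.
const BITS = 4;
const LEVELS = (1 << BITS) - 1; // 15

function quantize4(vec, lo = -1, hi = 1) {
  const packed = new Uint8Array(Math.ceil(vec.length / 2));
  for (let i = 0; i < vec.length; i++) {
    const t = Math.min(1, Math.max(0, (vec[i] - lo) / (hi - lo)));
    const code = Math.round(t * LEVELS); // 0..15
    if (i % 2 === 0) packed[i >> 1] = code; // low nibble
    else packed[i >> 1] |= code << 4;       // high nibble
  }
  return packed;
}

function dot4(query, packed, lo = -1, hi = 1) {
  const step = (hi - lo) / LEVELS;
  let acc = 0;
  for (let i = 0; i < query.length; i++) {
    const byte = packed[i >> 1];
    const code = i % 2 === 0 ? byte & 0x0f : byte >> 4;
    acc += query[i] * (lo + code * step); // dequantize one dimension on the fly
  }
  return acc;
}

const packed = quantize4(new Float32Array([1, -1, 0, 0.5]));
console.log(packed.length);              // 2 bytes for 4 dimensions
console.log(dot4([1, 1, 1, 1], packed)); // close to the exact dot product 0.5
```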

Feature Comparison

This table keeps the product-level comparison from benchmarks/wasm_analysis.md, but refreshes the turboquant-wasm numbers to the current implementation.

Key Advantages Summary

Good fit

  • Static-site search for docs, blogs, and catalogs
  • Local-first semantic search in PWAs or desktop apps
  • Client-side RAG where documents never leave the machine
  • Browser extensions indexing tabs or notes locally
  • Edge APIs with a prebuilt compressed index

Probably not the right tool

  • Very large corpora where you want graph-based ANN over 100k+ vectors
  • Workloads that need sub-millisecond latency at large N
  • Benchmarks where you need a mature head-to-head comparison suite today

Install

npm install @zlaabsi/turboquant-wasm

For npm consumers, the browser entrypoint is packaged with the wasm-pack bundler target. The repo-local examples/ continue to use the raw web target in pkg/.

Quick start

Minimal usage

import { createQuantizer } from "@zlaabsi/turboquant-wasm";

const dim = 384; // embedding dimensionality
const bits = 4;  // bits per dimension

const quantizer = await createQuantizer({ dim, bits });

// embeddings: flat Float32Array of nVectors * dim values
const index = quantizer.buildIndex(embeddings, nVectors);

// ids of the 10 nearest stored vectors to queryEmbedding
const resultIds = index.search(queryEmbedding, 10);

Persist and reload

import { createQuantizer, Index } from "@zlaabsi/turboquant-wasm";

const quantizer = await createQuantizer({ dim: 384, bits: 4 });
const index = quantizer.buildIndex(embeddings, nVectors);

// Serialize the compressed index to bytes...
const bytes = index.save();

// ...and later rebuild it against a quantizer with the same dim/bits config.
const restored = Index.load(bytes, quantizer);
const resultIds = restored.search(queryEmbedding, 10);
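
On the Node.js target (pkg-node/), the saved bytes can go straight to disk. In this sketch the Uint8Array contents and the index.tq filename are placeholders standing in for a real index.save() result:

```javascript
import { writeFileSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Stand-in for the Uint8Array returned by index.save().
const bytes = new Uint8Array([0x54, 0x51, 0x00, 0x01]);

// Persist the serialized index to disk...
const path = join(tmpdir(), "index.tq");
writeFileSync(path, bytes);

// ...and reload it later; the result is what you would hand to Index.load().
const restoredBytes = new Uint8Array(readFileSync(path));
console.log(restoredBytes.length); // 4
```

In browsers, the same bytes could instead be cached in IndexedDB or fetched as a static asset alongside the page.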

Build from source

rustup target add wasm32-unknown-unknown
cargo install wasm-pack

git clone https://github.com/zlaabsi/turboquant-wasm.git
cd turboquant-wasm
npm run build

Use npm run build:node when you also want the Node.js target in pkg-node/.

Try the examples

npm run build
python3 -m http.server 8080

Then open:

  • http://localhost:8080/examples/browser/
  • http://localhost:8080/examples/transformers-js/
  • http://localhost:8080/examples/onnx-webgpu/

Example matrix:

More detail: examples/README.md

Cookbook

Use these guides when you want an integration pattern instead of a toy demo:

Performance snapshot

Honest version: the implementation looks useful for moderate corpus sizes, but this repo still does not have a full benchmark suite across devices, browsers, public datasets, and competing libraries.

The table below is the current source of truth for measured TurboQuant behavior in this repo. The old March analysis mixed theory, estimates, and older implementation assumptions; benchmarks/wasm_analysis.md now explains explicitly why current measured search latency is higher than those early estimates.

Current evidence is a local snapshot on:

  • Apple M1 Max
  • Node v22.11.0
  • npm 10.9.0
  • Darwin 25.3.0 arm64
  • synthetic clustered embeddings

That means the numbers below are directional evidence, not a universal SLA.

Current snapshot

Charts

Raw benchmark data

Comparative context

The charts above are about turboquant-wasm alone. The charts below add comparative context using the positioning tables in benchmarks/wasm_analysis.md.

Important caveat: these comparative plots are not a fresh controlled benchmark suite run side-by-side in this repo. The TurboQuant bars use the current measured package size and current packed storage model; the alternative-library bars come from the maintained comparison estimates in benchmarks/wasm_analysis.md. They are here for positioning and tradeoff discussion, not to pretend we already have airtight head-to-head numbers.

Reading guide: purple is the current measured turboquant-wasm result, gray bars are the comparison points documented in benchmarks/wasm_analysis.md, and the small labels under the gray bars show the relative overhead versus TurboQuant.

What is still missing

  • repeated runs with variance reporting
  • lower-variance harnesses for build and search sweeps
  • browser benchmarks on low-end and mid-range hardware
  • public real-world embedding corpora
  • head-to-head comparisons against exact float32 search and graph-based ANN libraries

API and package notes

  • Install from npm with @zlaabsi/turboquant-wasm
  • Repository: github.com/zlaabsi/turboquant-wasm
  • Primary workflow: create quantizer -> build or stream index -> save/load -> search
  • Generated artifacts live in pkg/ and pkg-node/

Development

For local workflow, release process, and commit conventions, see CONTRIBUTING.md.

Common commands:

npm run build
npm run build:node
npm run test
npm run verify
npm run bench:realworld
npm run bench:charts

References

License

Apache-2.0