npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@ruvector/ruvllm-wasm

v2.0.2

Published

WASM bindings for RuvLLM - browser-compatible LLM inference runtime with WebGPU acceleration

Readme

@ruvector/ruvllm-wasm

npm License

Browser-compatible LLM inference runtime with WebAssembly. Semantic routing, adaptive learning, KV cache management, and chat template formatting — directly in the browser, no server required.

Features

  • KV Cache Management — Two-tier cache (FP32 tail + u8 quantized store) for efficient token storage
  • Memory Pooling — Arena allocator + buffer pool for minimal allocation overhead
  • Chat Templates — Llama3, Mistral, Qwen, ChatML, Phi, Gemma format support
  • HNSW Semantic Router — 150x faster pattern matching with bidirectional graph search
  • MicroLoRA — Sub-millisecond model adaptation (rank 1-4)
  • SONA Instant Learning — EMA quality tracking + adaptive rank adjustment
  • Web Workers — Parallel inference with SharedArrayBuffer detection
  • Full TypeScript — Complete .d.ts type definitions for all exports

Install

npm install @ruvector/ruvllm-wasm

Quick Start

import init, {
  RuvLLMWasm,
  ChatTemplateWasm,
  ChatMessageWasm,
  HnswRouterWasm,
  healthCheck
} from '@ruvector/ruvllm-wasm';

// Initialize WASM module
await init();

// Verify module loaded
console.log(healthCheck()); // true

// Format chat conversations
const template = ChatTemplateWasm.llama3();
const messages = [
  ChatMessageWasm.system("You are a helpful assistant."),
  ChatMessageWasm.user("What is WebAssembly?"),
];
const prompt = template.format(messages);

// Semantic routing with HNSW
const router = new HnswRouterWasm(384, 1000);
router.addPattern(new Float32Array(384).fill(0.1), "coder", "code tasks");
const result = router.route(new Float32Array(384).fill(0.1));
console.log(result.name, result.score); // "coder", 1.0

API

Core Types

| Type | Description | |------|-------------| | RuvLLMWasm | Main inference engine with KV cache + buffer pool | | GenerateConfig | Generation parameters (temperature, top_k, top_p, repetitionPenalty) | | KvCacheWasm | Two-tier KV cache for token management | | InferenceArenaWasm | O(1) bump allocator for inference temporaries | | BufferPoolWasm | Pre-allocated buffer pool (1KB-256KB size classes) |

Chat Templates

// Auto-detect from model ID
const template = ChatTemplateWasm.detectFromModelId("meta-llama/Llama-3-8B");
// Or use directly
const template = ChatTemplateWasm.mistral();
const prompt = template.format([
  ChatMessageWasm.system("You are helpful."),
  ChatMessageWasm.user("Hello!"),
]);

Supported: llama3(), mistral(), chatml(), phi(), gemma(), custom(name, pattern)

HNSW Semantic Router

const router = new HnswRouterWasm(384, 1000); // dimensions, max_patterns
router.addPattern(embedding, "agent-name", "metadata");
const result = router.route(queryEmbedding);
console.log(result.name, result.score);

// Persistence
const json = router.toJson();
const restored = HnswRouterWasm.fromJson(json);

MicroLoRA Adaptation

const config = new MicroLoraConfigWasm();
config.rank = 2;
config.inFeatures = 384;
config.outFeatures = 384;

const lora = new MicroLoraWasm(config);
const adapted = lora.apply(inputVector);
lora.adapt(new AdaptFeedbackWasm(0.9)); // quality score

SONA Instant Learning

const config = new SonaConfigWasm();
config.hiddenDim = 384;
const sona = new SonaInstantWasm(config);

const result = sona.instantAdapt(inputVector, 0.85); // quality
console.log(result.applied, result.qualityEma);

sona.recordPattern(embedding, "agent", true); // success pattern
const suggestion = sona.suggestAction(queryEmbedding);

Parallel Inference (Web Workers)

import { ParallelInference, feature_summary } from '@ruvector/ruvllm-wasm';

console.log(feature_summary()); // browser capability report

const engine = await new ParallelInference(4); // 4 workers
const result = await engine.matmul(a, b, m, n, k);
engine.terminate();

Build from Source

# Install prerequisites
rustup target add wasm32-unknown-unknown
cargo install wasm-pack

# Release build (workaround for Rust 1.91 codegen bug)
CARGO_PROFILE_RELEASE_CODEGEN_UNITS=256 CARGO_PROFILE_RELEASE_LTO=off \
  wasm-pack build crates/ruvllm-wasm --target web --scope ruvector --release

# Dev build
wasm-pack build crates/ruvllm-wasm --target web --scope ruvector --dev

# With WebGPU support
CARGO_PROFILE_RELEASE_CODEGEN_UNITS=256 CARGO_PROFILE_RELEASE_LTO=off \
  wasm-pack build crates/ruvllm-wasm --target web --scope ruvector --release -- --features webgpu

Browser Compatibility

| Browser | Version | Notes | |---------|---------|-------| | Chrome | 57+ | Full support | | Edge | 79+ | Full support | | Firefox | 52+ | Full support | | Safari | 11+ | Full support |

Optional enhancements:

  • SharedArrayBuffer: Requires Cross-Origin-Opener-Policy: same-origin + Cross-Origin-Embedder-Policy: require-corp
  • WebGPU: Available with webgpu feature flag (Chrome 113+)

Size

~435 KB release WASM (~178 KB gzipped)

Related

License

MIT