wasmicro
v0.3.1
BERT, GPT-2 and T5 inference in a 199 KB WASM bundle. Browser, Node, Cloudflare Workers.
Run transformer inference in any JavaScript environment — browser, Node,
Cloudflare Workers, Electron — with a single 199 KB WebAssembly file.
Model type is auto-detected from config.json; no hardcoded parameters needed.
Install
npm install wasmicro
Quick start
import init, { WasmPipeline } from "wasmicro";
await init();
// Load files (fetch, fs.readFile, etc.)
const model = new Uint8Array(await (await fetch("model.safetensors")).arrayBuffer());
const tokenizer = new Uint8Array(await (await fetch("vocab.txt")).arrayBuffer());
const config = await (await fetch("config.json")).text();
// merges.txt is required for GPT-2 / T5; pass null for BERT.
const pipeline = WasmPipeline.fromBytes(model, tokenizer, config, null);
// ── BERT: semantic search / embeddings ────────────────────────────────────────
const embedding = pipeline.embed("Hello world", 128); // Float32Array
const batch = pipeline.embedBatch(["text 1", "text 2"], 128); // Float32Array
// ── GPT-2 / T5: text generation ───────────────────────────────────────────────
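// model and config must be the GPT-2 / T5 checkpoint's own files, loaded as above.
const vocabJson = new Uint8Array(await (await fetch("vocab.json")).arrayBuffer());
const merges = new Uint8Array(await (await fetch("merges.txt")).arrayBuffer());
const pipeline2 = WasmPipeline.fromBytes(model, vocabJson, config, merges);
const text = pipeline2.generate("Once upon a time", 50);
Supported models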
| model_type in config.json | Tokenizer file | Methods |
|---|---|---|
| bert, roberta, distilbert, electra | vocab.txt | embed, embedBatch |
| gpt2, gpt_neo, gpt_neox | vocab.json + merges.txt | generate |
| t5, mt5, longt5 | vocab.json + merges.txt | generate |
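Because the tokenizer files differ per model family, it can help to decide what to fetch before building the pipeline. The sketch below encodes the table above as a lookup keyed on `model_type`. The names `FAMILIES` and `requirementsFor` are hypothetical and not part of the wasmicro API; the library performs its own detection internally.

```javascript
// Hypothetical helper, not part of wasmicro: maps the `model_type` field of
// config.json to the tokenizer files and methods listed in the table above.
const FAMILIES = [
  { types: ["bert", "roberta", "distilbert", "electra"], files: ["vocab.txt"], methods: ["embed", "embedBatch"] },
  { types: ["gpt2", "gpt_neo", "gpt_neox"], files: ["vocab.json", "merges.txt"], methods: ["generate"] },
  { types: ["t5", "mt5", "longt5"], files: ["vocab.json", "merges.txt"], methods: ["generate"] },
];

function requirementsFor(configJson) {
  // configJson is the raw text of config.json, as loaded in the quick start.
  const { model_type: modelType } = JSON.parse(configJson);
  const family = FAMILIES.find((f) => f.types.includes(modelType));
  if (!family) {
    throw new Error(`Unsupported model_type: ${String(modelType)}`);
  }
  return { modelType, files: family.files, methods: family.methods };
}

// Usage: for a GPT-2 config, both vocab.json and merges.txt are needed.
const req = requirementsFor('{"model_type": "gpt2"}');
console.log(req.files, req.methods);
```

An unknown `model_type` throws early here, so the problem shows up before any large download starts.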
Bundle size
| Runtime | WASM/JS payload |
|---|---|
| wasmicro | 199 KB |
| Candle WASM | 1.5 – 5 MB |
| transformers.js | ~10 MB |
| ONNX Runtime Web | 8 – 20 MB |
Verified against real HuggingFace checkpoints
| Model | Result |
|---|---|
| bert-base-uncased | cosine similarity 1.000000, max \|Δ\| 8.3 × 10⁻⁷ vs PyTorch |
| openai-community/gpt2 | loads and generates correctly |
| google-t5/t5-small | encoder shape [seq, 512], all values finite |
Full verification source: wasmicro-verify
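The two metrics reported for bert-base-uncased, cosine similarity and max|Δ|, can be computed along these lines. This is a minimal sketch, not the code from wasmicro-verify; the function names and the placeholder vectors are assumptions for illustration.

```javascript
// Cosine similarity between two equal-length vectors (Float32Array or plain arrays).
// Returns NaN if either vector is all zeros.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("length mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Largest element-wise absolute difference, i.e. max|Δ|.
function maxAbsDiff(a, b) {
  if (a.length !== b.length) throw new Error("length mismatch");
  let max = 0;
  for (let i = 0; i < a.length; i++) {
    max = Math.max(max, Math.abs(a[i] - b[i]));
  }
  return max;
}

// Placeholder vectors. In practice `ours` would come from pipeline.embed(...)
// and `reference` from a PyTorch run on the same input.
const ours = new Float32Array([0.5, -0.25, 1.0]);
const reference = new Float32Array([0.5, -0.25, 1.0]);
console.log(cosineSimilarity(ours, reference).toFixed(6), maxAbsDiff(ours, reference));
```

Reporting both numbers is useful because cosine similarity ignores overall scale, while max|Δ| catches a single badly wrong element.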
License
MIT OR Apache-2.0
