doppler-gpu

v0.4.3

Browser-native WebGPU inference engine for local intent and inference loops

Browser-native inference on raw WebGPU. Pure JS + WGSL.

Try the live demo | npm | docs

Broader model status and the surrounding compare evidence live in the support and release matrices. See the benchmark methodology for the receipt contract and disclosure rules.

New in 0.4.3

  • sideEffects is now scoped to *.wgsl and src/gpu/device.js, making most modules tree-shakeable.
  • Tooling and support-surface APIs are split into smaller subpaths (doppler-gpu/tooling/device, doppler-gpu/tooling/storage, doppler-gpu/tooling/manifest, doppler-gpu/structured, and doppler-gpu/client/model-manager).
  • New per-family metadata modules for gemma3, gemma4, embeddinggemma, and qwen3 add model-family constants plus resolveModel and resolveHfBaseUrl helpers.
  • Added direct shader-seeding support via registerShaderSources(map) and hasPreseededShaderSource(name).

Quick start

Browser

Use the live demo link above — it runs entirely in the browser with no server required. Models load into the browser cache and work offline after first download.

CLI

npx doppler-gpu

Downloads the default quickstart model, runs a local prompt, and prints the answer. Node quickstart artifacts are cached in ~/.cache/doppler-gpu/models after the first run; set DOPPLER_QUICKSTART_CACHE_DIR to move the cache or DOPPLER_QUICKSTART_CACHE=0 to disable it.

npx doppler-gpu "Summarize WebGPU in one sentence"
npx doppler-gpu --model qwen3-0.8b --prompt "Write a haiku about GPUs"
npx doppler-gpu --list-models
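The cache settings described above can be read as a small decision rule. The sketch below is a hypothetical helper, not code from doppler-gpu: the function name is invented for illustration, and the precedence (the disable flag winning over a custom directory when both are set) is an assumption.

```javascript
// Hypothetical helper that mirrors the cache rules described in this README.
// It is not part of doppler-gpu and does not inspect the package's internals.
function resolveQuickstartCacheDir(env, homeDir) {
  // DOPPLER_QUICKSTART_CACHE=0 disables caching (assumed to take precedence).
  if (env.DOPPLER_QUICKSTART_CACHE === '0') return null;
  // DOPPLER_QUICKSTART_CACHE_DIR moves the cache to a custom location.
  if (env.DOPPLER_QUICKSTART_CACHE_DIR) return env.DOPPLER_QUICKSTART_CACHE_DIR;
  // Otherwise use the documented default under the home directory.
  return `${homeDir}/.cache/doppler-gpu/models`;
}

// Print the directory that the current environment would select.
console.log(resolveQuickstartCacheDir(process.env, process.env.HOME ?? '~'));
```

A return value of `null` stands for "caching disabled" in this sketch.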

Root API

The doppler facade is the primary app-facing API. The root package intentionally stays small: it exports doppler and DOPPLER_VERSION. Advanced surfaces now live on explicit subpaths such as doppler-gpu/loaders, doppler-gpu/generation, doppler-gpu/tooling, and doppler-gpu/orchestration. Support tiers for those subpaths are tracked in the subsystem support matrix rather than assumed from export shape alone.

import { doppler } from 'doppler-gpu';

// Stream tokens
const model = await doppler.load('gemma3-270m');
for await (const token of model.generate('Describe WebGPU briefly')) {
  process.stdout.write(token);
}

// One-shot
const text = await model.generateText('Explain WebGPU in one sentence');
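Because `generate` is consumed with `for await`, ordinary control flow such as `break` can cap the output. The following is a minimal sketch of that pattern. `fakeGenerate` is a stand-in for `model.generate(...)` so the snippet runs without a GPU, and `collectTokens` is a hypothetical helper, not a doppler-gpu export.

```javascript
// Stand-in for model.generate(prompt): any async iterable of string tokens works.
async function* fakeGenerate() {
  for (const token of ['Web', 'GPU', ' is', ' a', ' GPU', ' API', '.']) {
    yield token;
  }
}

// Collects streamed tokens into one string, stopping after at most maxTokens.
async function collectTokens(stream, maxTokens = Infinity) {
  if (maxTokens <= 0) return '';
  let text = '';
  let count = 0;
  for await (const token of stream) {
    text += token;
    count += 1;
    // Leaving the loop calls the generator's return(), which ends the stream.
    if (count >= maxTokens) break;
  }
  return text;
}

collectTokens(fakeGenerate(), 3).then((text) => console.log(text)); // logs "WebGPU is"
```

With a real model, the first argument would be the iterable returned by `model.generate(...)`.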

OpenAI-compatible server

For existing apps, SDKs, and eval stacks that speak the OpenAI protocol:

npx doppler-serve --model gemma3-270m --port 8080

Then point any OpenAI client at http://localhost:8080/v1:

import OpenAI from 'openai';
const client = new OpenAI({ baseURL: 'http://localhost:8080/v1', apiKey: 'unused' });
const response = await client.chat.completions.create({
  model: 'gemma3-270m',
  messages: [{ role: 'user', content: 'Hello' }],
});

This is a compatibility bridge — the core engine runs identically in the browser or Node.
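For clients that do not use the OpenAI SDK, the same call can be sketched with plain `fetch`. This assumes the server accepts the standard `POST /chat/completions` request under the `/v1` base URL and returns the usual `choices[0].message.content` shape, which is what the SDK example above relies on. It also assumes no `Authorization` header is required. The helper names are illustrative only.

```javascript
// Builds the URL and fetch options for an OpenAI-style chat completion request.
function buildChatRequest(baseUrl, model, userContent) {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: userContent }],
      }),
    },
  };
}

// Sends the request. This needs a server started with `npx doppler-serve`.
async function chat(baseUrl, model, userContent) {
  const { url, init } = buildChatRequest(baseUrl, model, userContent);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const data = await res.json();
  // Assumed response shape: the standard OpenAI chat completion object.
  return data.choices[0].message.content;
}
```

Calling `chat('http://localhost:8080/v1', 'gemma3-270m', 'Hello')` would then resolve to the assistant's reply text, provided the server is running.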

Registry IDs resolve to hosted RDRR artifacts from Clocksmith/rdrr by default. See the Root API guide.

Support contract

Doppler keeps model support and subsystem support separate: models are tracked in the model support matrix, and subsystems in the subsystem support matrix.

The tier1 proof surface is the hosted browser demo, the root doppler API, the quickstart CLI, the OpenAI-compatible localhost server, and the verified text-inference path behind them.

Why Doppler

Browser-native. Runs in a WebGPU browser tab with OPFS caching, so models stay available offline after the first load.

JavaScript-first execution. JSON resolves policy, JavaScript handles orchestration, and WGSL kernels handle compute. Kernel paths, dtype choices, and runtime behavior stay visible in the shipped source.

Fast iteration. JS, WGSL, and JSON changes run directly through the same stack used by the browser and Node surfaces, which keeps debugging and profiling close to real runtime behavior.

for await streaming. Generation uses a native AsyncGenerator that fits normal app control flow.

LoRA hot-swap. Experimental advanced surface for swapping adapters at runtime without reloading the base model.

Independent model instances. Run multiple models concurrently. Each owns its pipeline, buffers, and KV cache.
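Since each instance exposes its own async generator, two models can be driven side by side with `Promise.all`. The sketch below uses stand-in model objects (`fakeModel` is hypothetical) in place of instances returned by `doppler.load(...)`, so it illustrates only the control flow, not real GPU scheduling.

```javascript
// Stand-in for a loaded model; a real instance would come from doppler.load(...).
function fakeModel(tokens) {
  return {
    async *generate(_prompt) {
      for (const token of tokens) {
        // Simulate per-token decode latency.
        await new Promise((resolve) => setTimeout(resolve, 1));
        yield token;
      }
    },
  };
}

// Reads a token stream to the end and returns the concatenated text.
async function drain(stream) {
  let text = '';
  for await (const token of stream) text += token;
  return text;
}

// Starts both generations at once; neither waits for the other to finish.
function runBoth(modelA, modelB, prompt) {
  return Promise.all([
    drain(modelA.generate(prompt)),
    drain(modelB.generate(prompt)),
  ]);
}

runBoth(fakeModel(['a', 'b']), fakeModel(['1', '2']), 'hi')
  .then(([first, second]) => console.log(first, second));
```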

Benchmark evidence

Figure: Doppler vs Transformers.js phase timing on Gemma 4 and Qwen 3.5 0.8B workloads.

Quickstart-supported models

All models below are verified with deterministic greedy decoding on WebGPU hardware. These registry IDs resolve to hosted RDRR artifacts automatically from the browser demo, npx doppler-gpu, or doppler.load(...).

| Model | Registry ID | Quant | Size | Family |
| --- | --- | --- | --- | --- |
| Gemma 3 270M IT | gemma3-270m | Q4K | 270M | Gemma |
| Gemma 3 1B IT | gemma3-1b | Q4K | 1B | Gemma |
| Gemma 4 E2B IT | gemma4-e2b | Q4K | E2B | Gemma |
| EmbeddingGemma 300M | embeddinggemma-300m | Q4K | 300M | Gemma |
| Qwen 3.5 0.8B | qwen3-0.8b | Q4K | 0.8B | Qwen |
| Qwen 3.5 2B | qwen3-2b | Q4K | 2B | Qwen |

Additional verified local-artifact models (TranslateGemma 4B, LFM2.5 1.2B, and Gemma 4 E2B INT4PLE) are available outside the quickstart registry. Conversion configs exist for Gemma 4 MoE and Janus, but those models are not yet in the quickstart registry. See the model support matrix. Subsystem support tiers for direct-source inputs, advanced subpaths, diffusion, energy, and training live in the subsystem support matrix.

Under the hood

  • Sharded weight loading via OPFS moves multi-GB weights into VRAM without blocking the main thread.
  • Quantized inference (Q4K, F16) runs practical model sizes on consumer GPUs.
  • TurboQuant KV-cache profiles are available for quantized decode-cache runs.
  • Kernel hot-swap between prefill and decode paths with zero graph recompilation.
  • Config-driven runtime with explicit profiles, kernel-path selection, and sampling.
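To give a feel for why 4-bit quantization shrinks weights, here is a deliberately simplified block quantizer. It is not the Q4K layout and not Doppler's kernel code. It stores one minimum and one scale per block and packs two 4-bit codes per byte, and the block size of 32 is an assumption chosen for illustration.

```javascript
// Simplified 4-bit block quantization for illustration (NOT the real Q4K format).
const BLOCK_SIZE = 32; // assumed block size for this sketch

// Quantizes one block of floats to 4-bit codes with a per-block min and scale.
function quantizeBlock(values) {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  // 16 levels span [min, max]; fall back to 1 for constant blocks to avoid dividing by zero.
  const scale = (max - min) / 15 || 1;
  const packed = new Uint8Array(Math.ceil(values.length / 2));
  for (let i = 0; i < values.length; i++) {
    const q = Math.min(15, Math.max(0, Math.round((values[i] - min) / scale)));
    // Even indices fill the low nibble, odd indices the high nibble.
    packed[i >> 1] |= i & 1 ? q << 4 : q;
  }
  return { min, scale, packed, length: values.length };
}

// Reconstructs approximate floats from a quantized block.
function dequantizeBlock({ min, scale, packed, length }) {
  const out = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    const byte = packed[i >> 1];
    const q = i & 1 ? byte >> 4 : byte & 0x0f;
    out[i] = q * scale + min;
  }
  return out;
}

const demoWeights = Float32Array.from({ length: BLOCK_SIZE }, (_, i) => Math.sin(i));
const demoBlock = quantizeBlock(demoWeights);
console.log(demoBlock.packed.length, 'bytes for', demoWeights.length, 'weights');
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the trade-off that buys the smaller footprint.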

Documentation

Environment requirements

  • WebGPU is required.
  • Browser: Current Chromium browsers with WebGPU enabled, including Chrome and Edge. WebGPU shipped in Chrome/Edge 113+. Firefox and Safari support varies.
  • Node: Requires a WebGPU provider (webgpu npm package). Installed automatically as an optional dependency.
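A quick way to check the first requirement before loading a model is standard WebGPU feature detection. The sketch below uses only the standard `navigator.gpu.requestAdapter()` call. The `hasWebGPU` name and the injectable `nav` parameter are illustrative choices, not part of doppler-gpu.

```javascript
// Returns true when a WebGPU adapter can be obtained from the given navigator.
async function hasWebGPU(nav = globalThis.navigator) {
  // The API is not exposed at all (older browser, or Node without a provider).
  if (!nav || !nav.gpu) return false;
  // requestAdapter() resolves to null when no suitable adapter is available.
  const adapter = await nav.gpu.requestAdapter();
  return adapter != null;
}

hasWebGPU().then((ok) => console.log(ok ? 'WebGPU available' : 'WebGPU not available'));
```

In a browser this can gate the UI before any model download starts.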

License

Apache License 2.0 (Apache-2.0). See LICENSE and NOTICE.