@wasmagent/model-local

v1.0.3

Embedded local LLM provider for wasmagent — node-llama-cpp adapter with grammar-constrained tool calling, multi-mirror download (HF/hf-mirror/ModelScope), and cert pipeline

@wasmagent/model-local

Embedded local-LLM provider for wasmagent: a node-llama-cpp adapter with grammar-constrained tool calling, multi-mirror downloads (HuggingFace / hf-mirror / ModelScope), and a certification harness for picking which models actually work in agent workflows.

The whole agent stack — model, code execution, state — runs on the user's machine. No cloud LLM, no API key, no telemetry.

Install

# Provider (small package, no native deps).
npm install @wasmagent/model-local

# Optional native peer — pre-built binaries for macOS/Linux/Windows + ARM/x64.
npm install node-llama-cpp

The native peer is optional: if you only want the registry/downloader/types (e.g. to ship a server that proxies models), you can skip it. LocalModel.generate() will throw a typed LocalModelDependencyError with an actionable install hint if it's missing.
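The optional-peer behaviour described above follows a common pattern: defer loading the native module until it is needed, and convert a load failure into a typed error that carries an install hint. The sketch below is a minimal illustration of that pattern, not the package's implementation. `MissingPeerError` and `loadPeer` are hypothetical names, and the loader is injected so the example runs without node-llama-cpp installed.

```typescript
// Hypothetical names: this is NOT the package's LocalModelDependencyError.
class MissingPeerError extends Error {
  readonly peer: string;
  readonly installHint: string;

  constructor(peer: string) {
    const installHint = `npm install ${peer}`;
    super(`Optional peer "${peer}" is not installed. Run: ${installHint}`);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, MissingPeerError.prototype);
    this.name = "MissingPeerError";
    this.peer = peer;
    this.installHint = installHint;
  }
}

// Run the injected loader only when the peer is actually needed.
// In real code `load` would wrap a dynamic import of the native module.
function loadPeer<T>(peer: string, load: () => T): T {
  try {
    return load();
  } catch {
    // Any load failure is reported as a typed, actionable error.
    throw new MissingPeerError(peer);
  }
}
```

Callers can then branch on `instanceof MissingPeerError` and print `installHint` instead of surfacing a raw module-resolution stack trace.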

Quick start

import { LocalModel, localFirst } from "@wasmagent/model-local";
import { AnthropicModel, CodeAgent } from "@wasmagent/core";

// Pick one of three sources:
const local = new LocalModel({ source: { model: "qwen2.5-1.5b" } });        // alias
// or:        new LocalModel({ source: { path: "./my-model.gguf" } });       // user GGUF
// or:        new LocalModel({ source: { url: "https://..." } });            // direct URL

// Use it directly:
const agent = new CodeAgent({ model: local, tools: [] });

// Or compose with a cloud fallback for prod:
const model = localFirst(
  local,
  new AnthropicModel("claude-haiku-4-5-20251001", process.env.ANTHROPIC_API_KEY),
);

Three model sources

| Source | Use when | Verification |
|---|---|---|
| { model: "alias" } | You want a maintained, vetted model | sha256 (registry-pinned) |
| { path: "./x.gguf" } | You have a self-trained or hand-downloaded GGUF | none (your file, your trust) |
| { url: "https://..." } | One-off pull from any URL | sha256 only if you supply expectedSha256 |
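One way to read the table is as a discriminated union in which the shape of the source decides which verification applies. The types below are an illustrative sketch rather than the package's exported types. In particular, placing `expectedSha256` on the URL source object is an assumption, and `verificationFor` is a hypothetical helper.

```typescript
// Illustrative types only; the package's real option shapes may differ.
type ModelSource =
  | { model: string }                          // registry alias
  | { path: string }                           // user-supplied GGUF on disk
  | { url: string; expectedSha256?: string };  // direct URL (field placement assumed)

type Verification = "registry-sha256" | "caller-sha256" | "none";

// Map each source shape to the verification row in the table.
function verificationFor(source: ModelSource): Verification {
  if ("model" in source) return "registry-sha256";
  if ("path" in source) return "none";
  // Only the URL variant remains here.
  return source.expectedSha256 ? "caller-sha256" : "none";
}
```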

Mirror selection (mainland-China friendly)

Three resolution layers, high → low precedence:

  1. Programmatic: new LocalModel({ source: { model: "qwen2.5-1.5b" }, mirror: "modelscope" })
  2. Environment: WASMAGENT_MODEL_MIRROR=hf-mirror (or modelscope, or any URL prefix)
  3. Registry default: HuggingFace first, then mirrors

Built-in presets:

  • huggingface: origin (sha256 anchor)
  • hf-mirror: hf-mirror.com, community-run, URL-compatible with HF
  • modelscope: modelscope.cn, ModelScope's CDN inside mainland China

Custom CDN: pass any URL prefix as mirror, and the downloader will append the canonical filename and hit your CDN first, falling back to the registry chain if it fails.
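The precedence rules above can be sketched as a small resolver that returns an ordered list of base URLs to try. This is a hedged illustration under several assumptions: `resolveMirrorChain` is a hypothetical helper, the order of the two mirrors after HuggingFace is a guess, and treating a preset choice as "try first, then fall back" mirrors what the text states only for custom CDN prefixes.

```typescript
// Preset hosts come from the list above; everything else is assumed.
const PRESETS: Record<string, string> = {
  huggingface: "https://huggingface.co",
  "hf-mirror": "https://hf-mirror.com",
  modelscope: "https://modelscope.cn",
};

// Registry default: HuggingFace first, then mirrors (mirror order assumed).
const REGISTRY_DEFAULT = ["huggingface", "hf-mirror", "modelscope"];

function resolveMirrorChain(programmatic?: string, envValue?: string): string[] {
  // Layer 1 (programmatic) beats layer 2 (environment).
  const choice = programmatic || envValue;
  const chain = REGISTRY_DEFAULT.map((name) => PRESETS[name]);
  // Layer 3: nothing selected, use the registry default chain.
  if (!choice) return chain;
  // A value that is not a preset name is treated as a custom URL prefix.
  const first = PRESETS[choice] ?? choice;
  // The chosen base goes first; the rest of the chain stays as fallback.
  return [first, ...chain.filter((base) => base !== first)];
}
```

In use, `resolveMirrorChain(options.mirror, process.env.WASMAGENT_MODEL_MIRROR)` would give the downloader its try-in-order list.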

# One-line CLI override:
WASMAGENT_MODEL_MIRROR=modelscope npx wasmagent model pull qwen2.5-1.5b

⚠️ Mirror trust model: every download is sha256-verified against the registry value (which is anchored to the HuggingFace original). Mirrors are transport channels, not trust roots.
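To illustrate why a mirror does not need to be trusted, the check below accepts bytes only if their sha256 matches a pinned value, regardless of where they came from. It is a minimal sketch using Node's built-in `crypto` module. `sha256Hex` and `isTrusted` are hypothetical names, and a real downloader would more likely hash the file as a stream than hold it in memory.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded sha256 digest of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Accept the bytes only if they hash to the pinned value; the transport
// (origin or mirror) plays no part in the decision.
function isTrusted(data: Uint8Array, pinnedSha256: string): boolean {
  return sha256Hex(data) === pinnedSha256.toLowerCase();
}
```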

Grammar-constrained tool calling

Sub-1B models routinely emit malformed JSON when asked to call tools. LocalModel enables JSON-schema grammar in the sampler by default, so tool_use output is structurally legal 100% of the time. Semantic correctness still depends on the model.

const model = new LocalModel({
  source: { model: "qwen2.5-1.5b" },
  enableGrammar: true,  // default
});

Set enableGrammar: false to compare A/B against free-form sampling — useful for diffing on the cert harness.
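When comparing the two sampling modes, it helps to score structure and meaning separately. The scorer below is a hypothetical sketch, not the cert harness's implementation. It assumes a tool call is a JSON object with `name` and `arguments`, and it reads "picked" as "the expected tool was chosen", which is an interpretation of the cert column names.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface Score {
  form: boolean;     // output parses and has the tool-call shape
  picked: boolean;   // the expected tool was chosen (assumed meaning)
  semantic: boolean; // the arguments also match the expectation
}

// Structural check: an object with a string name and an object of arguments.
function isToolCall(value: unknown): value is ToolCall {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === "string" &&
    typeof candidate.arguments === "object" &&
    candidate.arguments !== null
  );
}

function scoreToolCall(raw: string, expected: ToolCall): Score {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Malformed JSON: the failure grammar-constrained sampling rules out.
    return { form: false, picked: false, semantic: false };
  }
  if (!isToolCall(parsed)) return { form: false, picked: false, semantic: false };
  const picked = parsed.name === expected.name;
  // Naive comparison; it is sensitive to key order, which is fine for a sketch.
  const semantic =
    picked && JSON.stringify(parsed.arguments) === JSON.stringify(expected.arguments);
  return { form: true, picked, semantic };
}
```

With grammar enabled, `form` should always be true, so any remaining failures show up in `picked` and `semantic`.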

CLI

# Browse the registry.
wasmagent model list

# Pull (resumable, sha256-verified, multi-mirror).
wasmagent model pull qwen2.5-1.5b

# Force a mirror.
wasmagent model pull qwen2.5-1.5b --mirror modelscope

# Verify a cached file's sha256.
wasmagent model verify qwen2.5-1.5b

# Free up disk.
wasmagent model rm qwen2.5-1.5b

@wasmagent/cli declares @wasmagent/model-local as an optional peer: if you don't install this package, the CLI reports a clean error message rather than crashing.

Routing presets

import { localFirst, offlineOnly, devLocalOr } from "@wasmagent/model-local";

// Try local; fall through to cloud on any error.
const a = localFirst(localModel, cloudModel);

// Loud "no cloud, ever" envelope (passthrough today; reserves a hook for
// future enforcement).
const b = offlineOnly(localModel);

// Dev convenience: WASMAGENT_DEV_LOCAL=1 → local; otherwise → cloud.
const c = devLocalOr(localModel, cloudModel);

These are documented combinations of the existing FallbackModel from @wasmagent/core, not a parallel routing mechanism. You get the same retry/failover semantics as everywhere else in the framework.
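For intuition about what the presets do, here is a standalone sketch of the two behaviours described in the comments above: fall through to the cloud on any local error, and choose by environment flag. It does not use FallbackModel and is not the package's code. The `Model` interface and both function names are hypothetical, and `generate` is synchronous only to keep the example short (a real model call would be async).

```typescript
// Hypothetical minimal model interface, synchronous for brevity.
interface Model {
  generate(prompt: string): string;
}

// Try the local model; on any error, fall through to the cloud model.
function localFirstSketch(local: Model, cloud: Model): Model {
  return {
    generate(prompt) {
      try {
        return local.generate(prompt);
      } catch {
        return cloud.generate(prompt);
      }
    },
  };
}

// WASMAGENT_DEV_LOCAL=1 selects the local model; anything else selects cloud.
function devLocalOrSketch(
  local: Model,
  cloud: Model,
  env: Record<string, string | undefined>,
): Model {
  return env.WASMAGENT_DEV_LOCAL === "1" ? local : cloud;
}
```

Passing `process.env` as the third argument gives the dev-convenience behaviour, and an explicit object keeps the selection testable.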

Recommended models — current registry

All entries are <1.5 GB at q4_k_m or smaller quantisation. The recommended flag flips on once the cert harness publishes a passing score (see L4). Until then you can still run wasmagent model pull <alias> and evaluate the model yourself.

| Alias | Best for | License | Size |
|---|---|---|---|
| qwen2.5-0.5b | Tool calling on a tiny footprint; 3/3 form/picked/semantic on cert (real-machine, 2026-06-12) | Apache-2.0 | ~409 MB (q4_0, sha256 pinned 2026-06-13) |
| qwen3-0.6b | English/code, only Q8_0 quant published | Apache-2.0 | ~610 MB (q8_0, sha256 pinned 2026-06-13) |
| qwen2.5-1.5b | Chinese + English, 32K context (Stage-0 ≤2GB winner per evomerge GSM8K 70.5%) | Apache-2.0 | ~1.07 GB (q4_k_m, sha256 pinned 2026-06-13) |
| gemma-3-1b | English tasks, ggml-org mirror | Gemma ToU | ~769 MB (q4_k_m, sha256 pinned 2026-06-13) |
| llama-3.2-1b | English/code, 128K context, lmstudio-community mirror | Llama 3.2 Community | ~770 MB (q4_k_m, sha256 pinned 2026-06-13) |

See docs/reports/local-model-cert-2026-06-12.md in the wasmagent repo for the full real-machine baseline.
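If you want to pick an alias by disk budget or license before pulling, the registry table can be treated as plain data. The snippet below is only an illustration. The `RegistryRow` shape and `aliasesWithin` are hypothetical and not the structure of MODEL_REGISTRY, the sizes are the approximate figures from the table (with ~1.07 GB taken as roughly 1070 MB), and sha256 values are omitted because this page does not list them.

```typescript
// Hypothetical row shape; values are copied from the table above.
interface RegistryRow {
  alias: string;
  license: string;
  quant: string;
  approxSizeMB: number;
}

const ROWS: RegistryRow[] = [
  { alias: "qwen2.5-0.5b", license: "Apache-2.0", quant: "q4_0", approxSizeMB: 409 },
  { alias: "qwen3-0.6b", license: "Apache-2.0", quant: "q8_0", approxSizeMB: 610 },
  { alias: "qwen2.5-1.5b", license: "Apache-2.0", quant: "q4_k_m", approxSizeMB: 1070 },
  { alias: "gemma-3-1b", license: "Gemma ToU", quant: "q4_k_m", approxSizeMB: 769 },
  { alias: "llama-3.2-1b", license: "Llama 3.2 Community", quant: "q4_k_m", approxSizeMB: 770 },
];

// Aliases that fit a disk budget, optionally restricted to one license.
function aliasesWithin(budgetMB: number, license?: string): string[] {
  return ROWS
    .filter((row) => row.approxSizeMB <= budgetMB)
    .filter((row) => license === undefined || row.license === license)
    .map((row) => row.alias);
}
```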

Run the cert harness on any of them (or your own GGUF):

node examples/benchmarks/local-model-cert.mjs --model qwen2.5-1.5b --kernel quickjs
node examples/benchmarks/local-model-cert.mjs --path ./my-model.gguf --out report.md

Honest caveats

  • Sub-1B models are not Claude/GPT-class. Complex tool routing, multi-step reasoning, and long-form synthesis are still cloud-class jobs. The local model is for high-frequency, lower-difficulty work — drafts, intent classification, summarisation, dev/CI runs.
  • Grammar guarantees form, not semantics. A grammar-clean output can still pick the wrong tool or wrong arguments. The cert harness's form rate and semantic rate are reported separately.
  • Native binding. node-llama-cpp brings prebuilt binaries but requires Node.js 20+ on a desktop/server platform. Cloudflare Workers cannot run this. Use localFirst with a cloud model if you deploy to edge runtimes.

License

Apache-2.0 — see LICENSE.

Model files have their own licenses; they are downloaded from the publisher's host on demand and never re-distributed by this package. See MODEL_REGISTRY (in src/registry.ts) for the license attribute on each entry.