browser-llm-engine

v0.1.3

A browser-friendly library for running LLM inference using Wllama with preset and dynamic model loading, caching, and download capabilities.

browser-llm-engine

A browser-friendly library for running large language models (LLMs) directly in the browser using Wllama. This library provides a simple interface to load .gguf or .bin models (e.g., from Hugging Face) and generate text completions, including streaming token support.


Features

  • Plug-and-Play: Easy to integrate into your web projects.
  • Local or Remote Models: Load a model from a Hugging Face URL or pass local File objects.
  • Token-by-Token Streaming: Handle partial results in real time via the onNewToken callback.
  • Templates: Leverages Jinja to format chat-based prompts.
  • Lightweight: Bundles a minimal set of dependencies.

Table of Contents

  1. Installation
  2. Usage
  3. Preset Models
  4. API
  5. Local Development
  6. License

Installation

npm install browser-llm-engine

Or with Yarn:

yarn add browser-llm-engine

Usage

Quick Start

import { createLlmEngine, CHAT_ROLE, PRESET_MODELS } from 'browser-llm-engine';

(async () => {
  // 1) Create an engine instance
  const llm = createLlmEngine({
    // Optional: provide custom WASM paths or config
    wasmPaths: {}
  });

  // 2) Load a preset model from the library
  const modelUrl = PRESET_MODELS["SmolLM2 (360M)"].url;
  await llm.loadModel(modelUrl, {
    progressCallback: (progress) => console.log(`Loading: ${progress}%`),
  });

  // 3) Generate a completion
  const result = await llm.createCompletion("Hello from the browser!");
  console.log("Full model response:", result);

  // 4) Clean up
  await llm.exit();
})();

That’s it! You have a working LLM in the browser.


Streaming

To get partial tokens as they are generated, supply an onNewToken callback:

const llm = createLlmEngine();
await llm.loadModel(PRESET_MODELS["SmolLM2 (360M)"].url);

let outputSoFar = "";
await llm.createCompletion("What's the weather today?", {
  nPredict: 128,
  sampling: { temp: 0.7, penalty_repeat: 1.1 },
  onNewToken: (token) => {
    outputSoFar += token;
    console.log("Streamed token:", token);
  }
});

console.log("Final streamed output:", outputSoFar);

Loading Local Files

If you want to load the model from your local machine:

<input type="file" id="modelFile" multiple />
<script type="module">
  import { createLlmEngine } from 'browser-llm-engine';

  const fileInput = document.getElementById("modelFile");
  const llm = createLlmEngine();

  fileInput.addEventListener("change", async () => {
    try {
      // fileInput.files is a FileList
      await llm.loadModel(fileInput.files);
      console.log("Model loaded locally!");
    } catch (error) {
      console.error("Failed to load local model:", error);
    }
  });
</script>

Preset Models

The library includes a models.json with references to a few hosted models. You can get them via:

import { PRESET_MODELS } from 'browser-llm-engine';

console.log("Available models:", PRESET_MODELS);

Feel free to add or remove entries if you fork this library.
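
For example, you can enumerate the preset names and load one by its url field (the same field used in the Quick Start above). A minimal sketch:

import { createLlmEngine, PRESET_MODELS } from 'browser-llm-engine';

// List the names of the bundled presets
const presetNames = Object.keys(PRESET_MODELS);
console.log("Preset names:", presetNames);

// Each entry exposes a url (as used in the Quick Start); load the first preset
const llm = createLlmEngine();
await llm.loadModel(PRESET_MODELS[presetNames[0]].url);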


API

createLlmEngine(config?)

Creates a new engine instance.

  • Parameters:
    • config (Object) – Optional configuration, e.g. { wasmPaths: { ... } }.
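
A small sketch; the wasmPaths contents below are placeholders, so check your Wllama setup for the actual file mapping:

import { createLlmEngine } from 'browser-llm-engine';

// Default configuration
const llm = createLlmEngine();

// Or provide a config object; the wasmPaths entries here are placeholders
const llmCustom = createLlmEngine({
  wasmPaths: { /* map the Wllama WASM files to URLs you host yourself */ },
});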

loadModel(source, options?)

Loads the model from either a remote URL or local File objects.

  • Parameters:
    • source (String | File[] | FileList) – The source of the model.
    • options (Object) – Additional load options:
      • progressCallback (Function): (progress) => {} for tracking loading progress
      • useCache (Boolean): Cache the model for faster reloads
      • allowOffline (Boolean): If false, tries to fetch the model from the network
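
For example, loading a remote model with caching and progress logging, using only the options listed above:

import { createLlmEngine, PRESET_MODELS } from 'browser-llm-engine';

const llm = createLlmEngine();

await llm.loadModel(PRESET_MODELS["SmolLM2 (360M)"].url, {
  useCache: true,       // cache the model for faster reloads
  allowOffline: false,  // if false, tries to fetch the model from the network
  progressCallback: (progress) => console.log(`Loading: ${progress}%`),
});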

formatChat(messages, useProvidedTemplate?)

Takes an array of messages (each with role and content) and formats them into a single prompt with Jinja.
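
A sketch of chaining formatChat into createCompletion. The specific CHAT_ROLE keys used below (user, assistant) are assumptions, since the exported values are not listed here:

import { createLlmEngine, CHAT_ROLE, PRESET_MODELS } from 'browser-llm-engine';

const llm = createLlmEngine();
await llm.loadModel(PRESET_MODELS["SmolLM2 (360M)"].url);

// The role keys below (user/assistant) are assumptions; check CHAT_ROLE for the exact values
const messages = [
  { role: CHAT_ROLE.user, content: "What is WebAssembly?" },
  { role: CHAT_ROLE.assistant, content: "A portable binary format that runs in the browser." },
  { role: CHAT_ROLE.user, content: "Can it run large language models?" },
];

const prompt = await llm.formatChat(messages);
const reply = await llm.createCompletion(prompt);
console.log(reply);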

createCompletion(prompt, options?)

Creates the text completion for a given prompt.

  • Parameters:
    • prompt (String) – The text to generate from.
    • options (Object) – Fine-tuning generation:
      • nPredict (Number) – Maximum tokens to predict (default 512)
      • sampling (Object) – e.g. { temp: 0.7, penalty_repeat: 1.1 }
      • onNewToken (Function) – A callback for streaming tokens
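
A minimal non-streaming call using only the options documented above:

// Assumes `llm` was created and a model was loaded as shown in the Quick Start
const haiku = await llm.createCompletion("Write a haiku about WebAssembly.", {
  nPredict: 64,                                  // cap the number of generated tokens
  sampling: { temp: 0.8, penalty_repeat: 1.1 },  // same sampling fields used in the Streaming example
});
console.log(haiku);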

exit()

Cleans up resources used by Wllama.

  • Example:
    await llm.exit();

Local Development

If you want to develop locally:

  1. Clone the repo:
    git clone https://github.com/you/browser-llm-engine.git
    cd browser-llm-engine
  2. Install dependencies:
    npm install
  3. Build the library:
    npm run build
    This will create dist/ with both ESM and CJS bundles.
  4. (Optional) Start a dev server (if you add a script in package.json):
    npm run dev
  5. Open index.html (or any dev test page) in your browser to play around with the library.
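
For step 5, a minimal test page could look like the sketch below. It assumes a dev server that resolves the bare 'browser-llm-engine' import (for example Vite); adjust the import path if you load the built files from dist/ directly:

<!doctype html>
<html>
  <body>
    <pre id="out">Loading model...</pre>
    <script type="module">
      import { createLlmEngine, PRESET_MODELS } from 'browser-llm-engine';

      const out = document.getElementById("out");
      const llm = createLlmEngine();
      await llm.loadModel(PRESET_MODELS["SmolLM2 (360M)"].url);
      out.textContent = await llm.createCompletion("Say hello from the dev page!");
      await llm.exit();
    </script>
  </body>
</html>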

License

This project is released under the MIT License. Feel free to fork, adapt, and contribute!


Happy coding and enjoy using your LLM in the browser!