runinbrowser-ai 🚀

Run any open-source AI model in React directly in the browser. No servers, no setup.

Live Demo (Pocket Brain)

Overview

runinbrowser-ai is a React-first TypeScript SDK for running AI models directly in the browser. Built with React hooks and OpenAI-compatible types, it leverages WebGPU so you can build AI-powered applications that run entirely client-side, with no backend required.

Why runinbrowser-ai?

  • runinbrowser-ai is the only fully TypeScript-based React SDK for browser AI. This means a better developer experience, safer integrations, and full type-checking across your entire AI workflow.

  • runinbrowser-ai is OpenAI API compatible: integrate your application using OpenAI-compatible types for features such as streaming and JSON mode, with tool calling, seeding, and more coming soon.

  • runinbrowser-ai offers plug-and-play integration: add and run AI models in your React app by following the comprehensive examples and using the out-of-the-box hooks, which lift the burden of managing AI state so you can focus on UI.

  • runinbrowser-ai ships the latest advancements from its underlying engines, such as Web Worker & Service Worker support, and more.

And of course, everything is 🔒 100% Local & Private, 🌐 Offline Friendly, 📦 Production Ready, and comes with 💰 Zero Server Costs.


Quick start

You will need Node.js 18+ and npm (or another package manager) installed on your local development machine.

npm install runinbrowser-ai

Basic Usage

import { useMLCEngine, useChat } from "runinbrowser-ai";

function ChatApp() {
  // Manage engine lifecycle with the useMLCEngine hook
  const { engine, isLoading, progress } = useMLCEngine({
    modelId: "Llama-3.2-1B-Instruct-q4f32_1-MLC",
    useWorker: true,
  });

  // Manage chat state with the useChat hook
  const { messages, send, stop, clear, isStreaming } = useChat({ engine });

  if (isLoading) {
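    // LoadingIndicator and ChatUI (below) are placeholder components you provide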
    return <LoadingIndicator progress={progress} />;
  }

  return <ChatUI messages={messages} onSend={send} />;
}

API Reference

useMLCEngine

Manages MLC engine lifecycle, model loading, and caching.

Usage

import { useMLCEngine } from "runinbrowser-ai";

const { engine, isLoading, progress } = useMLCEngine({
  modelId: "Llama-3.2-1B-Instruct-q4f32_1-MLC",
  useWorker: true,
});

Options

| Option       | Type                 | Default     | Description                  |
| ------------ | -------------------- | ----------- | ---------------------------- |
| modelId      | MLCModelId           | undefined   | Model to auto-load on mount  |
| useWorker    | boolean              | true        | Run inference in Web Worker  |
| onProgress   | (progress) => void   | undefined   | Progress callback            |
| onError      | (error) => void      | undefined   | Error callback               |

Returns

| Property            | Type                 | Description                                                |
| ------------------- | -------------------- | ---------------------------------------------------------- |
| engine              | MLCEngineAdapter     | Engine instance (pass to useChat)                          |
| status              | EngineStatus         | Current status: 'idle' \| 'loading' \| 'ready' \| 'error'  |
| isReady             | boolean              | True when ready to generate                                |
| isLoading           | boolean              | True during model loading                                  |
| progress            | number               | Loading progress (0-1)                                     |
| error               | Error \| null        | Last error that occurred                                   |
| currentModelId      | MLCModelId \| null   | Currently loaded model                                     |
| loadModel()         | function             | Load a model manually                                      |
| unload()            | function             | Unload current model                                       |
| cachedModels        | MLCModelId[]         | List of cached models                                      |
| hasModelInCache()   | function             | Check if model is cached                                   |
| clearCache()        | function             | Clear all cached models                                    |

Examples

Auto-load on mount:

const { engine, isReady } = useMLCEngine({
  modelId: "Llama-3.2-1B-Instruct-q4f32_1-MLC",
});

Manual loading:

const { loadModel, isLoading } = useMLCEngine();

// Later...
await loadModel("Llama-3.2-1B-Instruct-q4f32_1-MLC");

Progress tracking:

const { progress, status } = useMLCEngine({
  modelId: "Llama-3.2-1B-Instruct-q4f32_1-MLC",
  onProgress: (p) => console.log(`${p.text} - ${Math.round(p.progress * 100)}%`),
});
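Cache management (a minimal sketch using the cache helpers from the Returns table; whether hasModelInCache and clearCache return promises isn't specified above, so the awaits are an assumption):

const { cachedModels, hasModelInCache, clearCache } = useMLCEngine();

// Models already stored in the browser's cache
console.log(cachedModels);

// Check before loading, e.g. to warn about a large first-time download
const cached = await hasModelInCache("Llama-3.2-1B-Instruct-q4f32_1-MLC");

// Remove all cached models to free storage
await clearCache();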

useChat

Manages chat messages, streaming, and conversation state. Engine agnostic.

Usage

import { useChat } from "runinbrowser-ai";

const { messages, send, isStreaming } = useChat({
  engine,
  systemPrompt: "You are a helpful assistant.",
});

Options

| Option              | Type                | Description                         |
| ------------------- | ------------------- | ----------------------------------- |
| engine              | Engine              | Engine instance from useMLCEngine   |
| systemPrompt        | string              | System message for context          |
| generationOptions   | GenerationOptions   | Default generation options          |

Returns

| Property       | Type            | Description                          |
| -------------- | --------------- | ------------------------------------ |
| messages       | ChatMessage[]   | Full chat history                    |
| isStreaming    | boolean         | True during active generation        |
| engine         | Engine          | The provided engine instance         |
| send()         | function        | Send a message and stream response   |
| stop()         | function        | Stop mid-stream                      |
| regenerate()   | function        | Regenerate last assistant response   |
| clear()        | function        | Clear all messages                   |

ChatMessage Type

type ChatMessage = {
  id: string;
  role: "system" | "user" | "assistant";
  content: string;
  meta?: { stopReason?: string; usage?: unknown };
};
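Rendering messages (a minimal sketch; id, role, and content come straight from the type above):

{messages.map((m) => (
  <div key={m.id} className={m.role}>
    {m.content}
  </div>
))}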

Examples

Basic chat:

const { messages, send } = useChat({ engine });

<button onClick={() => send("Hello!")}>Send</button>;

With system prompt:

const { messages, send } = useChat({
  engine,
  systemPrompt: "You are a coding assistant. Always provide examples.",
});

With default options:

const { messages, send } = useChat({
  engine,
  generationOptions: {
    temperature: 0.7,
    max_tokens: 500,
  },
});

Override per message:

const { send } = useChat({ engine });

// Use defaults
send("Tell me a story");

// Override for this message
send("Generate code", {
  temperature: 0.2,
  max_tokens: 1000,
});
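Stop, regenerate, and clear (a sketch wiring the remaining helpers from the Returns table; the button markup is illustrative):

const { stop, regenerate, clear, isStreaming } = useChat({ engine });

// Cancel the response that is currently streaming
<button onClick={stop} disabled={!isStreaming}>Stop</button>;

// Re-run the last assistant response
<button onClick={regenerate}>Regenerate</button>;

// Start a fresh conversation
<button onClick={clear}>New chat</button>;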

useWebGPUSupport

Checks if WebGPU is available in the current browser.

Usage

import { useWebGPUSupport } from "runinbrowser-ai";

const isSupported = useWebGPUSupport();

if (!isSupported) {
  return <p>WebGPU is not supported in this browser.</p>;
}

Returns

boolean - true if WebGPU is available, false otherwise.

Example

function App() {
  const isSupported = useWebGPUSupport();

  if (!isSupported) {
    return (
      <div>
        <h1>WebGPU Not Supported</h1>
        <p>Please use Chrome 113+ or Edge 113+</p>
      </div>
    );
  }

  return <ChatApp />;
}

Demos & examples

  • Live demo: runinbrowser.vercel.app (Pocket Brain experience)
  • Examples in repo: examples/pocket-brain (full UI with roadmap/FAQ sections) — Note: This example is vibe-coded 🎨

Requirements

  • Modern browser with WebGPU (Chrome/Edge 113+, Arc).
  • Devices with shader-f16 support run larger models faster; smaller models are selected by default for compatibility.

Kind-of roadmap (near term)

  • ✅ Solid React chat hook + MLC adapter, WebGPU + Worker support, basic model catalog with caching.
  • ⏳ JSON/structured outputs and richer system prompt helpers.
  • ⏳ More engines (Transformers.js adapter) and voice I/O experiments.
  • 🎯 Developer docs inline (this README) until a docs site is ready; examples stay authoritative.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Quick start for local development:

git clone https://github.com/insertmike/runinbrowser-ai.git
cd runinbrowser-ai
npm install  # Auto-generates required files
npm run build

Note: Some files are auto-generated and gitignored:

  • src/config/models/mlc-models.ts - Generated during npm install (postinstall) and npm run build (prebuild)
  • src/core/engines/mlc/worker-utils.ts - Generated during npm run build (prebuild, requires build:worker first)

Important: You must run npm run build before TypeScript compilation will work, as worker-utils.ts depends on the worker bundle being built first.

For more details, see CONTRIBUTING.md.

Acknowledgements

This project is built on top of MLC-AI and their incredible work on WebLLM, which enables high-performance language model inference in the browser using WebGPU.

Installation

npm install runinbrowser-ai
# or
yarn add runinbrowser-ai
# or
pnpm add runinbrowser-ai

TypeScript Support

runinbrowser-ai is written in TypeScript and includes comprehensive type definitions. All APIs are fully typed, including:

  • GenerationOptions - OpenAI-compatible generation parameters
  • ChatMessage - Message types with role and content
  • Engine - Core engine interface for extensibility
  • MLCModelId - Type-safe model identifiers
  • Hook return types and options

Exported Types

import type {
  // Core types
  Engine,
  EngineStatus,
  BaseChatMessage,
  ChatRole,
  InputMessage,
  LoadingProgress,
  StreamCallbacks,
  GenerationOptions,
  // Hook types
  UseChatOptions,
  UseChatReturn,
  UseMLCEngineOptions,
  UseMLCEngineReturn,
  // Model types
  MLCModelId,
} from "runinbrowser-ai";
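These types are handy in your own code too. A minimal sketch of reusable generation defaults (temperature and max_tokens appear in the useChat examples above; other GenerationOptions fields aren't listed here):

import type { GenerationOptions } from "runinbrowser-ai";

// Reusable generation defaults, type-checked against the SDK's own definitions
const codeDefaults: GenerationOptions = {
  temperature: 0.2,
  max_tokens: 1000,
};

// Same per-message override pattern as in the useChat examples
send("Generate code", codeDefaults);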

Additional Exports

MLCEngineAdapter

The MLC engine adapter class that implements the Engine interface. You can use this directly if you need more control:

import { MLCEngineAdapter } from "runinbrowser-ai";

const engine = new MLCEngineAdapter();
await engine.loadModel("Llama-3.2-1B-Instruct-q4f32_1-MLC");

enableDebug()

Enable debug logging for troubleshooting:

import { enableDebug } from "runinbrowser-ai";

enableDebug(); // Sets localStorage.debug = 'runinbrowser-ai'

MLC_MODEL_IDS

Array of all available MLC model IDs:

import { MLC_MODEL_IDS } from "runinbrowser-ai";

console.log(MLC_MODEL_IDS); // ["Llama-3.2-1B-Instruct-q4f32_1-MLC", ...]
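One way to use it is a model picker (a sketch; it assumes the loadModel helper from useMLCEngine and the MLCModelId type shown under Exported Types):

const { loadModel } = useMLCEngine();

// Let the user choose any available model
<select onChange={(e) => loadModel(e.target.value as MLCModelId)}>
  {MLC_MODEL_IDS.map((id) => (
    <option key={id}>{id}</option>
  ))}
</select>;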

prebuiltAppConfig

Full MLC-AI configuration with extended model records:

import { prebuiltAppConfig } from "runinbrowser-ai";

// Access model list with group and quantization_id fields
const models = prebuiltAppConfig.model_list;

// Filter by model family
const qwenModels = models.filter((m) => m.group === "Qwen");

// Find specific model
const llamaModel = models.find((m) => m.model_id.includes("Llama-3.2"));

getMLCModel()

Get a model configuration by model ID:

import { getMLCModel } from "runinbrowser-ai";

const model = getMLCModel("Llama-3.2-1B-Instruct-q4f32_1-MLC");

Additional Types

import type { ExtendedModelRecord, ExtendedAppConfig } from "runinbrowser-ai";

Browser Compatibility

  • WebGPU: Required for model inference (see the detection sketch after this list). Supported in:
    • Chrome/Edge 113+
    • Arc browser
    • Safari 18+ (with experimental WebGPU support)
  • Web Workers: Recommended for better performance (enabled by default)
  • IndexedDB: Used for model caching
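Outside of React, you can feature-detect WebGPU with the browser's own API (this is the standard navigator.gpu check, not part of runinbrowser-ai; typing navigator.gpu properly may require the @webgpu/types package):

// Returns true only if a WebGPU adapter can actually be acquired
async function checkWebGPU(): Promise<boolean> {
  const gpu = (navigator as any).gpu;
  if (!gpu) return false;
  const adapter = await gpu.requestAdapter();
  return adapter !== null;
}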

Performance Tips

  1. Use Web Workers: Keep inference off the main thread (default: useWorker: true)
  2. Model Selection: Smaller models (1B-3B) load faster and use less memory
  3. Caching: Models are automatically cached in IndexedDB after first load
  4. Shader Support: Devices with shader-f16 support run models faster

Troubleshooting

WebGPU Not Available

Check browser support:

import { useWebGPUSupport } from "runinbrowser-ai";

const isSupported = useWebGPUSupport();
if (!isSupported) {
  // Show fallback UI
}

Enable Debug Logging

import { enableDebug } from "runinbrowser-ai";
enableDebug();

Model Loading Issues

  • Ensure you have sufficient storage space (models can be 1-4GB)
  • Check browser console for IndexedDB errors
  • Try clearing cache: engine.clearCache()

License

MIT License - see LICENSE file for details.

Support

This project is free and open source. If you find it useful, consider subscribing to @mihailyonchev on YouTube for more privacy-focused tech projects and tutorials!