
@prathikrao/foundry-local-sdk

v0.0.17

Foundry Local JavaScript SDK

Foundry Local JS SDK

The Foundry Local JS SDK provides a JavaScript/TypeScript interface for running AI models locally on your machine. Discover, download, load, and run inference — all without cloud dependencies.

Features

  • Local-first AI — Run models entirely on your machine with no cloud calls
  • Model catalog — Browse and discover available models, check what's cached or loaded
  • Automatic model management — Download, load, unload, and remove models from cache
  • Chat completions — OpenAI-compatible chat API with both synchronous and streaming responses
  • Audio transcription — Transcribe audio files locally with streaming support
  • Multi-variant models — Models can have multiple variants (e.g., different quantizations) with automatic selection of the best cached variant
  • Embedded web service — Start a local HTTP service for OpenAI-compatible API access
  • WinML support — Automatic execution provider download on Windows for NPU/GPU acceleration
  • Configurable inference — Control temperature, max tokens, top-k, top-p, frequency penalty, and more

Installation

npm install foundry-local-sdk

WinML: Automatic Hardware Acceleration (Windows)

On Windows, install with the --winml flag to enable automatic execution provider management. The SDK will automatically discover, download, and register hardware-specific execution providers (e.g., Qualcomm QNN for NPU acceleration) via the Windows App Runtime — no manual driver or EP setup required.

npm install foundry-local-sdk --winml

When WinML is enabled:

  • Execution providers like QNNExecutionProvider, OpenVINOExecutionProvider, etc. are downloaded and registered on the fly, enabling NPU/GPU acceleration without manual configuration
  • No code changes needed — your application code stays the same whether WinML is enabled or not

Note: The --winml flag is only relevant on Windows. On macOS and Linux, the standard installation is used regardless of this flag.

Quick Start

import { FoundryLocalManager } from 'foundry-local-sdk';

const manager = FoundryLocalManager.create({
    appName: 'foundry_local_samples',
    logLevel: 'info'
});

// Get the model object
const modelAlias = 'qwen2.5-0.5b';
const model = await manager.catalog.getModel(modelAlias);

// Download the model
console.log(`\nDownloading model ${modelAlias}...`);
await model.download((progress) => {
    process.stdout.write(`\rDownloading... ${progress.toFixed(2)}%`);
});

// Load the model
await model.load();

// Create chat client
const chatClient = model.createChatClient();

// Example chat completion
console.log('\nTesting chat completion...');
const completion = await chatClient.completeChat([
    { role: 'user', content: 'Why is the sky blue?' }
]);
console.log(completion.choices[0]?.message?.content);

// Example streaming completion
console.log('\nTesting streaming completion...');
for await (const chunk of chatClient.completeStreamingChat(
    [{ role: 'user', content: 'Write a short poem about programming.' }]
)) {
    const content = chunk.choices?.[0]?.message?.content;
    if (content) {
        process.stdout.write(content);
    }
}
console.log('\n');

// Unload the model
await model.unload();

Usage

Browsing the Model Catalog

The Catalog lets you discover what models are available, which are already cached locally, and which are currently loaded in memory.

const catalog = manager.catalog;

// List all available models
const models = await catalog.getModels();
models.forEach(model => {
    console.log(`${model.alias} — cached: ${model.isCached}`);
});

// See what's already downloaded
const cached = await catalog.getCachedModels();

// See what's currently loaded in memory
const loaded = await catalog.getLoadedModels();

Loading and Running Models

Each Model can have multiple variants (different quantizations or formats). The SDK automatically selects the best available variant, preferring cached versions.

const model = await catalog.getModel('qwen2.5-0.5b');

// Download if not cached (with optional progress tracking)
if (!model.isCached) {
    await model.download((progress) => {
        console.log(`Download: ${progress}%`);
    });
}

// Load into memory and run inference
await model.load();
const chatClient = model.createChatClient();
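The download callback above receives a percentage between 0 and 100. A small helper (illustrative, not part of the SDK) can turn that number into a single-line console progress bar:

```typescript
// Render a 0–100 percentage as a fixed-width text progress bar.
function formatProgressBar(percent: number, width = 20): string {
  const clamped = Math.max(0, Math.min(100, percent));
  const filled = Math.round((clamped / 100) * width);
  return `[${'#'.repeat(filled)}${'.'.repeat(width - filled)}] ${clamped.toFixed(1)}%`;
}

// Usage with the download callback shown above:
// await model.download((progress) => {
//   process.stdout.write(`\r${formatProgressBar(progress)}`);
// });
```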

You can also select a specific variant manually:

const variants = model.variants;
model.selectVariant(variants[0]);

Chat Completions

The ChatClient follows the OpenAI Chat Completion API structure.

const chatClient = model.createChatClient();

// Configure settings
chatClient.settings.temperature = 0.7;
chatClient.settings.maxTokens = 800;
chatClient.settings.topP = 0.9;

// Synchronous completion
const response = await chatClient.completeChat([
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms.' }
]);
console.log(response.choices[0].message.content);
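Because completeChat takes the full message history on every call, multi-turn conversations work by appending each assistant reply before sending the next user turn. A minimal sketch (the ChatClientLike and message shapes here are assumptions matching the calls shown above, not exported SDK types):

```typescript
// Minimal shapes matching the completeChat usage shown above (assumed).
interface ChatMessage { role: 'system' | 'user' | 'assistant'; content: string; }
interface ChatCompletion { choices: { message?: { content?: string } }[]; }
interface ChatClientLike {
  completeChat(messages: ChatMessage[]): Promise<ChatCompletion>;
}

// Send one user turn, record the assistant's reply in the history, return it.
async function runTurn(
  client: ChatClientLike,
  history: ChatMessage[],
  userInput: string
): Promise<string> {
  history.push({ role: 'user', content: userInput });
  const completion = await client.completeChat(history);
  const reply = completion.choices[0]?.message?.content ?? '';
  history.push({ role: 'assistant', content: reply });
  return reply;
}
```

Calling runTurn repeatedly against the same history array gives the model the full conversation context each time.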

Streaming Responses

For real-time output, use streaming:

for await (const chunk of chatClient.completeStreamingChat(
    [{ role: 'user', content: 'Write a short poem about programming.' }]
)) {
    const content = chunk.choices?.[0]?.message?.content;
    if (content) {
        process.stdout.write(content);
    }
}
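If you also need the complete text once streaming finishes, you can accumulate the chunks as they arrive. A small helper, assuming the chunk shape read in the loop above:

```typescript
// Chunk shape as read in the streaming loop above (assumed; check the
// SDK's actual exported types).
interface ChatChunk {
  choices?: { message?: { content?: string } }[];
}

// Collect the content pieces of a streaming chat response into one string.
async function collectStreamedContent(
  stream: AsyncIterable<ChatChunk>
): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    const content = chunk.choices?.[0]?.message?.content;
    if (content) text += content;
  }
  return text;
}

// Usage with the client shown above:
// const full = await collectStreamedContent(
//   chatClient.completeStreamingChat([{ role: 'user', content: 'Hi!' }])
// );
```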

Audio Transcription

Transcribe audio files locally using the AudioClient:

const audioClient = model.createAudioClient();
audioClient.settings.language = 'en';

// Synchronous transcription
const result = await audioClient.transcribe('/path/to/audio.wav');

// Streaming transcription
for await (const chunk of audioClient.transcribeStreaming('/path/to/audio.wav')) {
    console.log(chunk);
}

Embedded Web Service

Start a local HTTP server that exposes an OpenAI-compatible API:

manager.startWebService();
console.log('Service running at:', manager.urls);

// Use with any OpenAI-compatible client library
// ...

manager.stopWebService();
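Because the embedded service is OpenAI-compatible, any plain HTTP client can talk to it. The sketch below assumes the conventional /v1/chat/completions route of the OpenAI API; verify the actual routes the service exposes before relying on them:

```typescript
// Build a request body for an OpenAI-style /v1/chat/completions call.
function buildChatRequestBody(model: string, userContent: string): string {
  return JSON.stringify({
    model,
    messages: [{ role: 'user', content: userContent }],
  });
}

// Hypothetical call against the embedded service, using a URL from
// manager.urls as baseUrl:
// const res = await fetch(`${baseUrl}/v1/chat/completions`, {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: buildChatRequestBody('qwen2.5-0.5b', 'Hello!'),
// });
// const completion = await res.json();
```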

Configuration

The SDK is configured via FoundryLocalConfig when creating the manager:

| Option | Description | Default |
|--------|-------------|---------|
| appName | Required. Application name for logs and telemetry. | — |
| appDataDir | Directory where application data should be stored | ~/.{appName} |
| logLevel | Logging level: trace, debug, info, warn, error, fatal | warn |
| modelCacheDir | Directory for downloaded models | ~/.{appName}/cache/models |
| logsDir | Directory for log files | ~/.{appName}/logs |
| libraryPath | Path to native Foundry Local Core libraries | Auto-discovered |
| serviceEndpoint | URL of an existing external service to connect to | — |
| webServiceUrls | URL(s) for the embedded web service to bind to | — |
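Putting the table together, a manager with explicit configuration might look like this (the directory paths are illustrative placeholders, not defaults):

```typescript
import { FoundryLocalManager } from 'foundry-local-sdk';

const manager = FoundryLocalManager.create({
  appName: 'my_app',              // required
  logLevel: 'debug',              // trace | debug | info | warn | error | fatal
  modelCacheDir: '/data/models',  // where downloaded models are stored
  logsDir: '/data/logs',          // where log files are written
});
```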

API Reference

Auto-generated class documentation lives in docs/classes/.

Running Tests

npm test

See test/README.md for details on prerequisites and setup.

Running Examples

npm run example

This runs the chat completion example in examples/chat-completion.ts.