apple-local-llm

Call Apple's on-device Foundation Models from JavaScript — no servers, no setup.

Works with Node.js, Electron, and VS Code extensions.

Requirements

  • macOS 26+ (Tahoe)
  • Apple Silicon (M Series)
  • Apple Intelligence enabled in System Settings

Installation

npm install apple-local-llm

Quick Start

Simple API

import { createClient } from "apple-local-llm";

const client = createClient();

// Check compatibility first
const compat = await client.compatibility.check();
if (!compat.compatible) {
  console.log("Not available:", compat.reasonCode);
  // Handle fallback to cloud API
}

// Generate a response
const result = await client.responses.create({
  input: "What is the capital of France?",
});

if (result.ok) {
  console.log(result.text); // "The capital of France is Paris."
}

Streaming

for await (const chunk of client.stream({ input: "Count from 1 to 5." })) {
  if ("delta" in chunk) {
    process.stdout.write(chunk.delta);
  }
}

API Reference

createClient(options?)

Creates a new client instance.

const client = createClient({
  model: "default",               // Optional: model identifier (currently only "default")
  onLog: (msg) => console.log(msg), // Optional: debug logging
  idleTimeoutMs: 5 * 60 * 1000,     // Optional: helper idle timeout (default: 5 min)
});

Defaults:

  • Helper auto-shuts down after 5 minutes of inactivity
  • Helper auto-restarts up to 3 times on crash (with exponential backoff)
  • Request timeout: 60 seconds (configurable via timeoutMs)

You can also import and instantiate the class directly:

import { AppleLocalLLMClient } from "apple-local-llm";
const client = new AppleLocalLLMClient(options);

client.compatibility.check()

Check if the local model is available. Always call this before making requests.

const result = await client.compatibility.check();
// { compatible: true }
// or { compatible: false, reasonCode: "AI_DISABLED" }

Reason codes:

| Code | Description |
|------|-------------|
| NOT_DARWIN | Not running on macOS |
| UNSUPPORTED_HARDWARE | Not Apple Silicon |
| AI_DISABLED | Apple Intelligence not enabled |
| MODEL_NOT_READY | Model still downloading |
| SPAWN_FAILED | Helper binary failed to start |
| HELPER_NOT_FOUND | Helper binary not found |
| HELPER_UNHEALTHY | Helper process not responding correctly |
| PROTOCOL_MISMATCH | Helper version incompatible with client |
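Some of these conditions are transient while others are permanent for a given machine, so it can be worth branching on the code. A minimal triage sketch (the grouping below is a suggestion, not part of the API):

const compat = await client.compatibility.check();
if (!compat.compatible) {
  switch (compat.reasonCode) {
    case "MODEL_NOT_READY":
      // Transient: the model is still downloading; retry later.
      break;
    case "AI_DISABLED":
      // User-fixable: prompt the user to enable Apple Intelligence.
      break;
    default:
      // NOT_DARWIN, UNSUPPORTED_HARDWARE, etc.: fall back to a cloud API.
      break;
  }
}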

client.capabilities.get()

Get detailed model capabilities (calls the helper).

const caps = await client.capabilities.get();
// { available: true, model: "apple-on-device" }
// or { available: false, reasonCode: "AI_DISABLED" }

client.responses.create(params)

Generate a response.

const result = await client.responses.create({
  input: "Your prompt here",
  model: "default",         // Optional: model identifier
  max_output_tokens: 500,   // Optional: limit response tokens
  stream: false,            // Optional
  signal: abortController.signal, // Optional: AbortSignal
  timeoutMs: 60000,         // Optional: request timeout (ms)
  response_format: {        // Optional: structured JSON output
    type: "json_schema",
    json_schema: {
      name: "Result",
      schema: { type: "object", properties: { ... } }
    }
  }
});

Structured Output Example:

const result = await client.responses.create({
  input: "List 3 colors",
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "Colors",
      schema: {
        type: "object",
        properties: {
          colors: { type: "array", items: { type: "string" } }
        }
      }
    }
  }
});
if (result.ok) {
  const data = JSON.parse(result.text); // { colors: ["red", "blue", "green"] }
}

response_format is not supported with streaming.

Returns ResponseResult on success, or an error object:

// Success:
{ ok: true, text: "...", request_id: "..." }
// Error:
{ ok: false, error: { code: "...", detail: "..." } }

Note: The return type is a discriminated union, not the exported ResponseResult interface.
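In TypeScript, narrowing on ok selects the right branch of the union; a quick sketch:

const result = await client.responses.create({ input: "Hello" });

if (result.ok) {
  // Success branch: text and request_id are available.
  console.log(result.text, result.request_id);
} else {
  // Error branch: only the error object is available.
  console.error(result.error.code, result.error.detail);
}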

Error codes:

| Code | Description |
|------|-------------|
| UNAVAILABLE | Model not available (see reason codes above) |
| TIMEOUT | Request timed out (default: 60s) |
| CANCELLED | Request was cancelled via AbortSignal |
| RATE_LIMITED | System rate limit exceeded |
| GUARDRAIL | Content violated Apple's safety guidelines |
| INTERNAL | Unexpected error |

client.stream(params)

Async generator for streaming responses.

for await (const chunk of client.stream({ input: "..." })) {
  if ("delta" in chunk) {
    // Partial content
    console.log(chunk.delta);
  } else if ("done" in chunk) {
    // Final complete text
    console.log(chunk.text);
  }
}

client.responses.cancel(requestId)

Cancel an in-progress request.

const result = await client.responses.cancel("req_123");
// { ok: true } or { ok: false, error: { code: "NOT_RUNNING", detail: "..." } }
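For most callers, passing an AbortSignal to responses.create (see the params above) is a simpler way to cancel; a sketch:

const controller = new AbortController();

// Give up if the request takes longer than 10 seconds.
const timer = setTimeout(() => controller.abort(), 10_000);

const result = await client.responses.create({
  input: "Summarize this very long document",
  signal: controller.signal,
});
clearTimeout(timer);

if (!result.ok && result.error.code === "CANCELLED") {
  console.log("Request was cancelled");
}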

client.shutdown()

Gracefully shut down the helper process.

await client.shutdown();
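In a long-lived process you might hook this into your shutdown path; one option (a usage suggestion, not required by the API):

process.on("SIGINT", async () => {
  await client.shutdown(); // stop the helper before exiting
  process.exit(0);
});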

TypeScript Types

All types are exported:

import type {
  ClientOptions,
  ReasonCode,
  CompatibilityResult,
  CapabilitiesResult,
  ResponsesCreateParams,
  ResponseResult,
  JSONSchema,
  ResponseFormat,
} from "apple-local-llm";
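These types are handy for wrapping the client in your own helpers; a small sketch:

import type { ResponsesCreateParams } from "apple-local-llm";

async function ask(params: ResponsesCreateParams): Promise<string | null> {
  const result = await client.responses.create(params);
  return result.ok ? result.text : null; // null on any error
}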

CLI Usage

The fm-proxy binary can also be used directly from the command line:

# Simple prompt
fm-proxy "What is the capital of France?"

# Streaming output
fm-proxy --stream "Tell me a story"
fm-proxy -s "Tell me a story"

# Limit output tokens
fm-proxy --max-tokens=50 "Count to 100"

# Start HTTP server
fm-proxy --serve
fm-proxy --serve --port=3000

# Other options
fm-proxy --help      # Show usage (or -h)
fm-proxy --version   # Show version (or -v)
fm-proxy --stdio     # stdio mode (used internally by the npm package)

HTTP Server Mode

Run fm-proxy --serve to start a local HTTP server:

fm-proxy --serve --port=8080

Endpoints:

| Endpoint | Method | Description |
|----------|--------|-------------|
| /health | GET | Health check and availability status |
| /generate | POST | Text generation (supports streaming) |

Options:

| Option | Description |
|--------|-------------|
| --port=<PORT> | Set server port (default: 8080) |
| --auth-token=<TOKEN> | Require Bearer token for /generate |

You can also set the AUTH_TOKEN environment variable instead of passing --auth-token.

CORS: All endpoints support CORS with Access-Control-Allow-Origin: *.

Examples:

# Health check
curl http://127.0.0.1:8080/health
# Response: {"status":"ok","model":"apple-on-device","available":true}

# Simple generation
curl -X POST http://127.0.0.1:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"input": "What is 2+2?"}'
# Response: {"text":"2+2 equals 4."}

# With max_output_tokens
curl -X POST http://127.0.0.1:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"input": "Count to 100", "max_output_tokens": 50}'

# With structured output (response_format)
curl -X POST http://127.0.0.1:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"input": "List 3 colors", "response_format": {"type": "json_schema", "json_schema": {"name": "Colors", "schema": {"type": "object", "properties": {"colors": {"type": "array", "items": {"type": "string"}}}}}}}'

# With authentication
curl -X POST http://127.0.0.1:8080/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"input": "Hello"}'

Streaming (SSE)

Add "stream": true to get Server-Sent Events with OpenAI-compatible chunks:

curl -N -X POST http://127.0.0.1:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"input": "Write a haiku", "stream": true}'

Response:

data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"}}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"..."}}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]
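From JavaScript you can consume this with fetch by splitting the body on data: lines; a minimal sketch for Node 18+, where the response body is async-iterable:

const res = await fetch("http://127.0.0.1:8080/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ input: "Write a haiku", stream: true }),
});

const decoder = new TextDecoder();
let buffer = "";
for await (const bytes of res.body) {
  buffer += decoder.decode(bytes, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep any partial line for the next read
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") continue;
    const delta = JSON.parse(payload).choices[0]?.delta;
    if (delta?.content) process.stdout.write(delta.content);
  }
}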

How It Works

This package bundles a small native helper (fm-proxy) that talks to Apple's Foundation Models framework natively and to your process over stdio. The helper is spawned on the first request and stays alive to keep the model warm.

  • No localhost server — the npm package uses stdio, not HTTP
  • No user setup — just npm install
  • Fails gracefully — call compatibility.check() and fall back to the cloud (see the sketch below)
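Putting it together, a typical integration prefers the local model and falls back to a remote one. A sketch in which callCloudLLM is a hypothetical function you would supply, not part of this package:

async function generate(prompt) {
  const compat = await client.compatibility.check();
  if (compat.compatible) {
    const result = await client.responses.create({ input: prompt });
    if (result.ok) return result.text;
    // Local generation failed (e.g. GUARDRAIL, TIMEOUT); fall through to the cloud.
  }
  return callCloudLLM(prompt); // hypothetical cloud fallback
}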

Runtime Support

JS API (createClient()):

| Environment | Supported |
|-------------|-----------|
| Node.js | ✅ |
| Electron (main process) | ✅ |
| VS Code extensions | ✅ |
| Electron (renderer) | ❌ No child_process |
| Browser | ❌ |

HTTP Server (fm-proxy --serve):

| Environment | Supported |
|-------------|-----------|
| Any HTTP client | ✅ |
| Browser (fetch) | ✅ |
| Electron (renderer) | ✅ |
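Since the server allows cross-origin requests, a web page or Electron renderer can call it directly (assuming fm-proxy --serve is running on the default port):

const res = await fetch("http://127.0.0.1:8080/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ input: "What is 2+2?" }),
});
const { text } = await res.json();
console.log(text); // e.g. "2+2 equals 4."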

License

MIT