ModelPool

Free-first model routing with transparent fallback, local model support, and respect for provider limits.

Why ModelPool?

ModelPool is designed for AI agents and applications that need to maximize free/low-cost model usage, respect provider limits, and protect user privacy. It routes requests to local or remote models based on policy, provider health, and privacy requirements, always preferring free or local options when possible.

Features

  • Free-first routing: always tries free/low-cost providers first for eligible requests
  • Transparent fallback: all fallback decisions are visible and explainable
  • Local model support: privacy-sensitive or secret data is routed to local models (Ollama) by default
  • Strict provider limit respect: no quota bypass, no hidden retries, no key/account rotation
  • Configurable profiles and routing policies
  • OpenAI-compatible gateway and CLI
  • Experimental/configurable OpenCode Go/Zen support (v0.1)

Quickstart

  1. Install dependencies and build the local CLI:
    pnpm install
    pnpm build
  2. Initialize config:
    pnpm modelpool init
  3. Edit .modelpool/config.yaml for one of the live setups below.
  4. Start the gateway server:
    pnpm modelpool serve --port 4545
  5. Send your first request through the routing alias:
    curl -s http://127.0.0.1:4545/v1/chat/completions \
      -H 'content-type: application/json' \
      -d '{"model":"modelpool/free","messages":[{"role":"user","content":"Reply with: hello from modelpool"}],"max_tokens":80}'

modelpool/free means "let ModelPool choose from the active profile". v0.2 also exposes modelpool/fast, modelpool/balanced, and modelpool/capable for health-aware route groups. Use a concrete provider model ID only when you want exact-model routing.
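
For example, targeting the fast route group is the same call as the quickstart request with a different alias (the prompt here is illustrative):

    curl -s http://127.0.0.1:4545/v1/chat/completions \
      -H 'content-type: application/json' \
      -d '{"model":"modelpool/fast","messages":[{"role":"user","content":"Reply with: hello from the fast group"}],"max_tokens":80}'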

Fastest Live Setup: Groq

Use this when you want a live cloud test without running local models.

  1. Export a Groq key in your shell:
    export GROQ_API_KEY=...
  2. Use a config like this:
    ledgerPath: .modelpool/ledger.sqlite
    server:
      port: 4545
    providers:
      groq:
        enabled: true
        apiKey: ${GROQ_API_KEY}
        models:
          - llama-3.3-70b-versatile
    profiles:
      default:
        description: Groq live profile
        providers:
          - groq
        model: llama-3.3-70b-versatile
        fallbackModels: []
    routing:
      defaultProfile: default
      maxAttempts: 1
    privacy:
      sensitivity: public
      allowLogging: false
    modelRegistry:
      - id: llama-3.3-70b-versatile
        name: Llama 3.3 70B Versatile
        provider: groq
        capabilities:
          - chat
        experimental: false
  3. Verify the route before sending prompts:
    pnpm modelpool route explain --model modelpool/free --privacy public
    pnpm modelpool run --model modelpool/free "Reply with exactly: live-ok"
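
With the gateway running (pnpm modelpool serve --port 4545), you can also check what the gateway exposes via GET /v1/models, listed under Server Mode & Endpoints below; the exact response shape isn't documented here:

    curl -s http://127.0.0.1:4545/v1/models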

OpenCode Zen Live Setup

OpenCode Zen support is experimental and configurable in v0.1; the OpenAI-compatible base URL verified during development is https://opencode.ai/zen/v1.

  1. Export an OpenCode key in your shell:
    export OPENCODE_API_KEY=...
  2. Use a config like this:
    ledgerPath: .modelpool/ledger.sqlite
    server:
      port: 4545
    providers:
      opencode:
        enabled: true
        experimental: true
        apiKey: ${OPENCODE_API_KEY}
        baseUrl: https://opencode.ai/zen/v1
        models:
          - big-pickle
    profiles:
      default:
        description: OpenCode Zen live profile
        providers:
          - opencode
        model: big-pickle
        fallbackModels: []
    routing:
      defaultProfile: default
      maxAttempts: 1
    privacy:
      sensitivity: public
      allowLogging: false
    modelRegistry:
      - id: big-pickle
        name: Big Pickle
        provider: opencode
        capabilities:
          - chat
        experimental: true
  3. Allow enough output budget for reasoning-capable models when calling them through the HTTP API:
    curl -s http://127.0.0.1:4545/v1/chat/completions \
      -H 'content-type: application/json' \
      -d '{"model":"modelpool/free","messages":[{"role":"user","content":"Reply with exactly: opencode-ok"}],"max_tokens":80,"temperature":0}'

Note: reasoning-capable OpenCode Zen models can spend early tokens on reasoning before assistant content appears. If max_tokens is too low, the upstream response may contain reasoning but no visible assistant text.
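
If that happens, retrying with a larger budget is usually enough; 400 below is an arbitrary example value, not a documented minimum:

    curl -s http://127.0.0.1:4545/v1/chat/completions \
      -H 'content-type: application/json' \
      -d '{"model":"modelpool/free","messages":[{"role":"user","content":"Reply with exactly: opencode-ok"}],"max_tokens":400,"temperature":0}'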

Ollama Local Setup

Use this when prompts are private/sensitive or you want local-only behavior.

  1. Install and run Ollama, then pull a model:
    ollama serve
    ollama pull llama3.2
  2. Keep the default local-first config generated by pnpm modelpool init, or ensure your profile uses Ollama:
    providers:
      ollama:
        enabled: true
        baseUrl: http://127.0.0.1:11434
        models:
          - llama3.2
    profiles:
      default:
        providers:
          - ollama
        model: llama3.2
        fallbackModels: []
    privacy:
      sensitivity: private
      allowLogging: false
  3. Run:
    pnpm modelpool run "Reply with exactly: local-ok"

Configuration

  • Main config: .modelpool/config.yaml
  • Ledger (usage log): .modelpool/ledger.sqlite
  • Override the config and ledger paths with the MODELPOOL_CONFIG and MODELPOOL_LEDGER environment variables (see the example after this list)
  • Keep provider keys in environment variables; do not write raw keys into committed config files
  • CLI/server commands (run locally with pnpm modelpool ... after pnpm build):
    • modelpool init [--config path] [--force]
    • modelpool serve [--port number] [--config path]
    • modelpool run [--model id] [--prompt text] <prompt>
    • modelpool status
    • modelpool models
    • modelpool doctor [--probe]
    • modelpool usage [--json]
    • modelpool route explain [--model id] [--privacy public|private|sensitive|secret]
    • modelpool scan <file>
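
A minimal sketch of the path overrides, assuming an alternate config and ledger under ./staging (the paths are illustrative):

    # Point ModelPool at a non-default config and ledger for this shell session
    export MODELPOOL_CONFIG=./staging/config.yaml
    export MODELPOOL_LEDGER=./staging/ledger.sqlite
    pnpm modelpool status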

Profiles

Profiles define routing order and fallback for different use cases; a config sketch follows the list:

  • default: Local-first with cloud fallback (Ollama -> Groq)
  • public: Free/low-cost providers first (OpenCode Go/Zen -> Groq -> Ollama)
  • private/sensitive/secret: Local only (Ollama), unless explicitly allowed
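
A sketch of how the default and public profiles above could be expressed in .modelpool/config.yaml, reusing the profile shape from the live setups; the descriptions, model choices, and fallback ordering here are assumptions derived from the list above, not the generated defaults:

    profiles:
      default:
        description: Local-first with cloud fallback
        providers:
          - ollama
          - groq
        model: llama3.2
        fallbackModels:
          - llama-3.3-70b-versatile
      public:
        description: Free/low-cost providers first
        providers:
          - opencode
          - groq
          - ollama
        model: big-pickle
        fallbackModels:
          - llama-3.3-70b-versatile
          - llama3.2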

Supported Providers (MVP)

  • Ollama (local, default for private/sensitive/secret)
  • Groq (cloud, free/low-cost, OpenAI-compatible)
  • OpenCode Go/Zen (configurable/experimental in v0.1; verified against https://opencode.ai/zen/v1 with configured model IDs)

Note: OpenCode Go/Zen support remains experimental/configurable in v0.1. Model availability and response behavior can vary by account and model. Do not rely on hardcoded OpenCode model IDs outside your own config.

Privacy Policy

  • Private, sensitive, or secret data is routed to local models (Ollama) by default
  • No prompt/completion content or credentials are stored in the ledger
  • No external provider fallback for secret-classified requests
  • modelpool scan <file> and POST /v1/modelpool/policy/check redact supported secret patterns before returning findings
  • Local privacy support is a first-order product goal

Secret Scanner

The v0.1 scanner is deterministic and regex-based. It detects and redacts .env-style key assignments, OpenAI-style keys, GitHub tokens, JWTs, AWS access key IDs, and SSH/private key blocks. Findings expose redacted matches plus location metadata only; raw matched secret values are not returned. Known limitation: this is not full DLP or entropy analysis, so unusual credential formats may require explicit policy classification.
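
For example, to exercise the scanner against an .env-style assignment (the file name and the obviously fake key below are illustrative):

    # Create a throwaway file containing a fake .env-style key, then scan it
    printf 'OPENAI_API_KEY=sk-test-not-a-real-key\n' > /tmp/sample.env
    pnpm modelpool scan /tmp/sample.env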

Server Mode & Endpoints

  • Start with modelpool serve (default port 4545)
  • Endpoints:
    • GET /v1/models
    • POST /v1/chat/completions
    • GET /v1/modelpool/status
    • POST /v1/modelpool/route/explain
    • POST /v1/modelpool/policy/check
  • Unsupported OpenAI endpoints return 501 Not Implemented
  • Streaming is not supported in v0.1 (returns 501)
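
A quick smoke test against a running gateway; per the list above, the status endpoint is read-only, and unsupported OpenAI routes (embeddings is used here only as an example) should answer 501:

    curl -s http://127.0.0.1:4545/v1/modelpool/status
    # Expect a 501 status code for an unsupported OpenAI endpoint
    curl -s -o /dev/null -w '%{http_code}\n' -X POST http://127.0.0.1:4545/v1/embeddings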

Route Explain and Usage

  • Use modelpool route explain or POST /v1/modelpool/route/explain to see provider selection, fallback reasons, and policy decisions for a given request
  • Use modelpool/free, modelpool/fast, modelpool/balanced, or modelpool/capable when you want ModelPool to choose from the active profile instead of requiring an exact provider model ID
  • Use modelpool usage --json for privacy-safe aggregate metadata; prompts, completions, credentials, headers, and request bodies are not stored
  • modelpool doctor --probe is opt-in and can consume provider quota; doctor without --probe reports configuration and passive health only
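
Both commands below come straight from the CLI list above; modelpool/balanced and the sensitive privacy level are from the route-group and privacy sections earlier in this README:

    pnpm modelpool route explain --model modelpool/balanced --privacy sensitive
    pnpm modelpool usage --json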

Roadmap

  • v0.1: MVP with Ollama, Groq, and experimental OpenCode Go/Zen support
  • v0.2: No-Ink free-first routing UX with route groups, passive health, cooldowns, and usage metadata
  • v0.3+: Streaming, Anthropic-compatible endpoint if useful, expanded provider support if verified, and advanced policy/routing

Non-goals & Forbidden Behaviors

  • No OpenRouter, Anthropic, Gemini, Fireworks, DeepInfra, or other non-MVP providers
  • No dashboard, billing UI, teams, or LiteLLM replacement
  • No quota bypass, unlimited access, key/account rotation, or hidden retries
  • No claims of unlimited free usage or provider-limit evasion

For provider setup and policy details, see docs/provider-setup.md and docs/policy.md.