# ModelWeaver
Multi-provider model orchestration proxy for Claude Code. Route different agent roles to different model providers with automatic fallback, exact model routing, config hot-reload, and crash recovery.
## How It Works

ModelWeaver sits between Claude Code and upstream model providers as a local HTTP proxy. It inspects the `model` field in each Anthropic Messages API request and routes it to the best-fit provider.
```
Claude Code ──→ ModelWeaver ──→ Anthropic (primary)
 (localhost)        │      └──→ OpenRouter (fallback)
                    │
                    ├─ 1. Match exact model name (modelRouting)
                    ├─ 2. Match tier via substring (tierPatterns)
                    └─ 3. Fall back on 429 / 5xx errors
```

## Features
- Tier-based routing — route by model family (sonnet/opus/haiku) using substring pattern matching
- Exact model routing — route specific model names to dedicated providers (checked first)
- Automatic fallback — transparent failover on rate limits (429) and server errors (5xx)
- Model name rewriting — each provider in the chain can use a different model name
- Interactive setup wizard — guided configuration with API key validation
- Config hot-reload — changes to the config file are picked up automatically; no restart needed
- Daemon mode — run as a background process with start/stop/status/remove commands
- Crash recovery — auto-restarts on crash with rate limiting (max 5 restarts/60s)
- Multiple auth types — supports `x-api-key` (Anthropic) and `Bearer` token auth
- Per-provider timeouts — configurable timeout with AbortController
- Structured logging — JSON logs with request IDs for tracing
- Env var substitution — config references like `${API_KEY}` are resolved from the environment
- Circuit breaker — per-provider circuit breaker with closed/open/half-open states, prevents hammering unhealthy providers
- Adaptive fallback — on 429 rate limits, automatically races remaining providers simultaneously instead of sequential fallback
- Connection pooling — per-provider undici Agent dispatcher with configurable pool size, closes old agents on config reload
- Health endpoint — `/api/status` returns circuit breaker state and uptime
- Desktop GUI — native app with one-command launch (`modelweaver gui`), auto-downloads from GitHub Releases
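Of the features above, env var substitution is easy to picture in code. A minimal sketch, assuming a simple `${VAR}` syntax — the function name and regex are illustrative, not ModelWeaver's actual implementation:

```typescript
// Replace ${VAR} references in a raw config string with values from an
// environment map. Unknown variables are left untouched so the problem
// surfaces later as an obviously unresolved placeholder.
function resolveEnvRefs(
  raw: string,
  env: Record<string, string | undefined>,
): string {
  return raw.replace(/\$\{([A-Z0-9_]+)\}/gi, (match, name: string) =>
    env[name] !== undefined ? env[name]! : match,
  );
}
```

Leaving unknown references intact (rather than substituting an empty string) tends to make misconfigured keys much easier to spot in logs.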
## Prerequisites

- Node.js 20 or later — install from [nodejs.org](https://nodejs.org/)
- `npx` — included with Node.js (no separate install needed)
## Installation

ModelWeaver requires no permanent install — `npx` downloads and runs it on the fly. If you prefer a global install:

```sh
npm install -g modelweaver
```

After that, replace `npx modelweaver` with `modelweaver` in all commands below.
## Quick Start

### 1. Run the setup wizard

```sh
npx modelweaver init
```

The wizard guides you through:

- Selecting from 6 preset providers (Anthropic, OpenRouter, Together AI, GLM/Z.ai, Minimax, Fireworks)
- Testing API keys to verify connectivity
- Setting up model routing tiers
- Auto-configuring `~/.claude/settings.json` for Claude Code integration
2. Start ModelWeaver
# Foreground (see logs in terminal)
npx modelweaver
# Background daemon (auto-restarts on crash)
npx modelweaver start3. Point Claude Code to ModelWeaver
export ANTHROPIC_BASE_URL=http://localhost:3456
export ANTHROPIC_API_KEY=unused-but-required
claudeCLI Commands
```sh
npx modelweaver init       # Interactive setup wizard
npx modelweaver start      # Start as background daemon
npx modelweaver stop       # Stop background daemon
npx modelweaver status     # Show daemon status
npx modelweaver remove     # Stop daemon + remove PID and log files
npx modelweaver gui        # Launch desktop GUI (auto-downloads binary)
npx modelweaver [options]  # Run in foreground
```

## CLI Options

```
-p, --port <number>   Server port (default: from config)
-c, --config <path>   Config file path (auto-detected)
-v, --verbose         Enable debug logging (default: off)
-h, --help            Show help
```

## Daemon Mode
Run ModelWeaver as a background process that survives terminal closure and auto-recovers from crashes.
```sh
npx modelweaver start    # Start (forks monitor + daemon)
npx modelweaver status   # Check if running
npx modelweaver stop     # Graceful stop (SIGTERM → SIGKILL after 5s)
npx modelweaver remove   # Stop + remove PID file + log file
```

**How it works:** `start` forks a lightweight monitor process that owns the PID file. The monitor spawns the actual daemon worker. If the worker crashes, the monitor auto-restarts it after a 2-second delay (up to 5 restarts per 60-second window to prevent crash loops).

```
modelweaver.pid → Monitor process (handles signals, watches child)
  └── modelweaver.worker.pid → Daemon worker (runs HTTP server)
```

**Files:**

- `~/.modelweaver/modelweaver.pid` — monitor PID
- `~/.modelweaver/modelweaver.worker.pid` — worker PID
- `~/.modelweaver/modelweaver.log` — daemon output log
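The crash-loop guard (max 5 restarts per 60-second window) amounts to a sliding-window rate limiter. A minimal sketch of that idea — the class name and shape are illustrative, not the actual monitor code:

```typescript
// Sliding-window restart limiter: permit a restart only if fewer than
// `maxRestarts` have occurred within the last `windowMs` milliseconds.
class RestartLimiter {
  private restarts: number[] = []; // timestamps of recent restarts

  constructor(
    private maxRestarts = 5,
    private windowMs = 60_000,
  ) {}

  tryRestart(now = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    this.restarts = this.restarts.filter((t) => now - t < this.windowMs);
    if (this.restarts.length >= this.maxRestarts) return false; // crash loop: give up
    this.restarts.push(now);
    return true;
  }
}
```

The sliding window (rather than a fixed counter) means a daemon that crashes occasionally over hours keeps recovering, while one that crashes six times in a minute is stopped.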
## Desktop GUI

ModelWeaver ships a native desktop GUI built with Tauri. No Rust toolchain is needed — the binary is auto-downloaded from GitHub Releases.

```sh
npx modelweaver gui
```

The first run downloads the latest binary for your platform (~10-30 MB). Subsequent launches use the cached version.
Supported platforms:
| Platform | Format |
|---|---|
| macOS (Apple Silicon) | .dmg |
| macOS (Intel) | .dmg |
| Linux (x86_64) | .AppImage |
| Windows (x86_64) | .msi |
Cached files are stored in `~/.modelweaver/gui/` with version tracking — new versions download automatically on the next `gui` launch.
## Configuration

### Config file locations

Checked in order (first found wins):

1. `./modelweaver.yaml` (project-local)
2. `~/.modelweaver/config.yaml` (user-global)
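The first-found-wins lookup can be sketched like this. It is an illustration only: `findConfig` and the injected `exists` predicate are assumptions, not ModelWeaver's actual API (real code would pass `fs.existsSync`):

```typescript
// Return the first config path that exists, or undefined if none do.
// `exists` is injected so the lookup is testable without touching disk.
function findConfig(
  home: string,
  exists: (path: string) => boolean,
): string | undefined {
  const candidates = [
    "./modelweaver.yaml",               // project-local
    `${home}/.modelweaver/config.yaml`, // user-global
  ];
  return candidates.find(exists);
}
```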
### Full config schema

```yaml
server:
  port: 3456            # Server port (default: 3456)
  host: localhost       # Bind address (default: localhost)

providers:
  anthropic:
    baseUrl: https://api.anthropic.com
    apiKey: ${ANTHROPIC_API_KEY}  # Env var substitution
    timeout: 30000      # Request timeout in ms (default: 30000)
    poolSize: 10        # Connection pool size (default: varies by provider)
    authType: anthropic # "anthropic" | "bearer" (default: anthropic)
  openrouter:
    baseUrl: https://openrouter.ai/api
    apiKey: ${OPENROUTER_API_KEY}
    authType: bearer
    timeout: 60000

# Tier-based routing (substring pattern matching)
routing:
  sonnet:
    - provider: anthropic
      model: claude-sonnet-4-20250514  # Optional: rewrite model name
    - provider: openrouter
      model: anthropic/claude-sonnet-4 # Fallback
  opus:
    - provider: anthropic
      model: claude-opus-4-20250514
  haiku:
    - provider: anthropic
      model: claude-haiku-4-5-20251001

# Pattern matching: model name includes any string → matched to tier
tierPatterns:
  sonnet: ["sonnet", "3-5-sonnet", "3.5-sonnet"]
  opus: ["opus", "3-opus", "3.5-opus"]
  haiku: ["haiku", "3-haiku", "3.5-haiku"]

# Exact model name routing (checked FIRST, before tier patterns)
modelRouting:
  "glm-5-turbo":
    - provider: anthropic            # Route to a specific provider
  "MiniMax-M2.7":
    - provider: openrouter
      model: minimax/MiniMax-M2.7    # With model name rewrite
```

### Routing priority
1. **Exact model name** (`modelRouting`) — if the request model matches exactly, use that route
2. **Tier pattern** (`tierPatterns` + `routing`) — substring-match the model name against the patterns, then use the tier's provider chain
3. **No match** — returns 502 with a descriptive error listing configured tiers and model routes
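The two-stage lookup above can be sketched in a few lines. This is illustrative: `resolveChain` and the config types are assumptions about shape, not the actual source:

```typescript
type Route = { provider: string; model?: string };

interface RoutingConfig {
  modelRouting: Record<string, Route[]>;  // exact-name routes (checked first)
  tierPatterns: Record<string, string[]>; // tier → substring patterns
  routing: Record<string, Route[]>;       // tier → provider chain
}

// Resolve the provider chain for a requested model name, or undefined
// when nothing matches (the proxy answers 502 in that case).
function resolveChain(model: string, cfg: RoutingConfig): Route[] | undefined {
  // 1. An exact model name wins outright.
  if (cfg.modelRouting[model]) return cfg.modelRouting[model];
  // 2. Otherwise substring-match against each tier's patterns.
  for (const [tier, patterns] of Object.entries(cfg.tierPatterns)) {
    if (patterns.some((p) => model.includes(p))) return cfg.routing[tier];
  }
  // 3. No match → caller returns a descriptive 502.
  return undefined;
}
```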
### Provider chain behavior

- First provider is primary; the rest are fallbacks
- Fallback triggers on: 429 (rate limit), 5xx (server error), network timeout
- Adaptive race mode — when a 429 is received, remaining providers are raced simultaneously (not sequentially) for faster recovery
- Circuit breaker — providers that repeatedly fail are temporarily skipped (auto-recovers after a cooldown)
- No fallback on 4xx (bad request, auth failure, forbidden) — these are returned immediately
- Model rewriting — each provider entry can override the `model` field in the request body
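The per-provider circuit breaker follows the classic closed/open/half-open pattern. A minimal sketch of that state machine — the threshold, cooldown, and class name are assumptions, not ModelWeaver's actual values:

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Classic circuit breaker: trip open after `threshold` consecutive failures,
// then let one probe request through ("half-open") once `cooldownMs` passes.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";

  constructor(private threshold = 3, private cooldownMs = 30_000) {}

  // Should the next request be sent to this provider?
  allow(now = Date.now()): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow one probe
    }
    return this.state !== "open";
  }

  onSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  onFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open"; // probe failed or threshold hit: trip the breaker
      this.openedAt = now;
      this.failures = 0;
    }
  }
}
```

While a breaker is open, the routing layer skips that provider entirely, which is what prevents hammering an upstream that is already failing.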
### Supported providers
| Provider | Auth Type | Base URL |
|---|---|---|
| Anthropic | x-api-key | https://api.anthropic.com |
| OpenRouter | Bearer | https://openrouter.ai/api |
| Together AI | Bearer | https://api.together.xyz |
| GLM (Z.ai) | x-api-key | https://api.z.ai/api/anthropic |
| Minimax | x-api-key | https://api.minimax.io/anthropic |
| Fireworks | Bearer | https://api.fireworks.ai/inference/v1 |
Any OpenAI/Anthropic-compatible API works — just set `baseUrl` and `authType` appropriately.
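The two auth types in the table map to different request headers. A small illustrative sketch of how a proxy might build them (the helper name is hypothetical):

```typescript
// Build the auth header for an upstream request based on the provider's
// configured authType: Anthropic-style x-api-key vs. standard Bearer token.
function authHeaders(
  authType: "anthropic" | "bearer",
  apiKey: string,
): Record<string, string> {
  return authType === "anthropic"
    ? { "x-api-key": apiKey }
    : { Authorization: `Bearer ${apiKey}` };
}
```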
### Config hot-reload

In daemon mode, ModelWeaver watches the config file for changes and reloads it automatically (debounced 300ms). You can also send a manual reload signal:

```sh
kill -SIGUSR1 $(cat ~/.modelweaver/modelweaver.pid)
```

Or just re-run `npx modelweaver init` — it automatically signals the running daemon to reload.
## API

### Health check

```sh
curl http://localhost:3456/api/status
```

Returns circuit breaker state for all providers and server uptime.
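A response might look like the following. The field names here are illustrative only, based on the description above, not the actual schema:

```json
{
  "uptime": 86400,
  "providers": {
    "anthropic": { "circuit": "closed" },
    "openrouter": { "circuit": "half-open" }
  }
}
```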
## How Claude Code Uses Model Tiers
Claude Code sends different model names for different agent roles:
| Agent Role | Model Tier | Typical Model Name |
|---|---|---|
| Main conversation, coding | Sonnet | claude-sonnet-4-20250514 |
| Explore (codebase search) | Haiku | claude-haiku-4-5-20251001 |
| Plan (analysis) | Sonnet | claude-sonnet-4-20250514 |
| Complex subagents | Opus | claude-opus-4-20250514 |
ModelWeaver uses the model name to determine which agent tier is calling, then routes accordingly.
## Development

```sh
npm install     # Install dependencies
npm test        # Run tests (174 tests)
npm run build   # Build for production (tsup)
npm run dev     # Run in dev mode (tsx)
```

## License

Apache-2.0
