claw-auto-router
v0.7.4
Self-hosted OpenAI-compatible LLM router that auto-imports providers from your OpenClaw config
claw-auto-router
A self-hosted, OpenAI-compatible LLM router for OpenClaw — automatically imports your provider/model configuration and routes each request to the best available model.
Primary use case: Discord → OpenClaw → claw-auto-router → best provider/model
Why this exists
OpenClaw lets you configure multiple LLM providers and run agents across them. But when you want a single "smart" endpoint that automatically picks the best model for each request — without duplicating configuration — you need a router.
claw-auto-router:
- Reads your existing OpenClaw config (zero duplication)
- Exposes an OpenAI-compatible API so OpenClaw treats it like a normal provider
- Routes requests to the most suitable model based on content (tier-based heuristics or optional RouterAI classification + explicit assignments)
- Falls back automatically when a provider fails
- Tracks routing stats, estimated spend/savings, and active session overrides in a live dashboard
- Lets users switch models or tiers mid-conversation with natural-language commands such as use opus or prefer code
- Supports OpenClaw-native thinking overrides
- Delegates all model calls back through the OpenClaw Gateway instead of reimplementing provider OAuth here
Architecture
flowchart TD
A[Incoming Request] --> B{Model ID?}
B -->|simple / medium / complex / reasoning| C[Forced Tier]
B -->|auto| D[Heuristic Classifier\nor RouterAI]
C --> E[Assigned Tier]
D --> E
E --> F[Pick Best Model for Tier]
F --> G[Fallback Proxy]
G --> H[OpenClaw Gateway]
H --> I[LLM Provider]
I -->|Stream| H
H -->|Stream| G
G -->|Stream| A
sequenceDiagram
participant Discord
participant OpenClaw as OpenClaw (Gateway)
participant Router as claw-auto-router (port 43123)
Discord->>OpenClaw: Send message
OpenClaw->>Router: POST /v1/chat/completions
Router->>Router: Classify prompt → pick tier & model
Router->>OpenClaw: Forward to best model via Gateway
OpenClaw-->>Router: Stream response
Router-->>OpenClaw: Pipe response
OpenClaw-->>Discord: Display response
Routing tiers
Each request is classified into one of four tiers. By default this is done with deterministic heuristics; during claw-auto-router setup you can optionally enable RouterAI, which asks a dedicated model to choose the tier before routing. claw-auto-router then picks the best model for that tier:
| Tier | Triggers | Preferred model traits |
|------|----------|------------------------|
| CODE | code fences, "implement/debug/refactor/function/class" | reasoning models, coders |
| COMPLEX | analysis keywords, messages > 2000 tokens | large context, reasoning |
| SIMPLE | short greetings, simple Q&A, < 200 tokens | fast, cheap |
| STANDARD | everything else | config order |
Explicit tier assignments (set via claw-auto-router setup or router.config.json) always override automatic scoring.
Heuristics vs RouterAI:
- Heuristic is faster, deterministic, and adds no extra model call. This is the safest default.
- RouterAI can do better on ambiguous prompts, but every auto-routed request pays for one small classifier call first.
- If RouterAI fails, claw-auto-router automatically falls back to heuristics for that request.
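The heuristic path can be sketched roughly as follows. This is an illustrative sketch only, not the router's actual code: the keyword lists, the 4-characters-per-token estimate, and the function names are assumptions drawn from the tier table above.

```typescript
// Illustrative tier heuristics — thresholds and keywords are assumptions
// based on the tier table in this README, not the real implementation.
type Tier = "SIMPLE" | "STANDARD" | "COMPLEX" | "CODE";

const CODE_HINTS = /```|implement|debug|refactor|function|class/i;
const COMPLEX_HINTS = /analy[sz]e|compare|evaluate|summariz/i;

// Rough token estimate: ~4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function classifyTier(prompt: string): Tier {
  if (CODE_HINTS.test(prompt)) return "CODE";
  const tokens = estimateTokens(prompt);
  if (tokens > 2000 || COMPLEX_HINTS.test(prompt)) return "COMPLEX";
  if (tokens < 200) return "SIMPLE";
  return "STANDARD";
}
```

Because the checks are ordered, a long prompt that contains a code fence still lands in CODE rather than COMPLEX, which matches the tier-priority behavior described above.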
Setup wizard
During claw-auto-router setup, claw-auto-router prompts you to classify any model that lacks a tier assignment:
┌──────────────────────────────────────────────────────────────┐
│ claw-auto-router — Model Tier Setup Wizard │
│ Assign each model to its best routing tier. │
│ Press Enter or type 5 to skip (heuristics decide). │
└──────────────────────────────────────────────────────────────┘
Model : Kimi for Coding
ID : kimi-coding/k2p5
Context : 256k tokens Reasoning: yes ✓
1) SIMPLE Fast, cheap — quick Q&A, one-liners, lookups
2) STANDARD General purpose — default routing for most tasks
3) COMPLEX Large context, deep reasoning — analysis, long docs
4) CODE Code generation, debugging, refactoring, PRs
5) Skip — use auto-heuristics
Choice [1-5, Enter=skip]: 4
✓ Assigned to CODE
Assignments are saved to ~/.openclaw/router.config.json by default (or next to the config file you target with --config) and take effect immediately.
The setup wizard also asks whether you want to keep deterministic heuristics or enable RouterAI, and if you choose RouterAI it lets you pick the classifier model to use.
How OpenClaw config is imported
Config discovery order:
1. OPENCLAW_CONFIG_PATH env var
2. ~/.openclaw/openclaw.json
3. ~/.openclaw/moltbot.json
From the config it extracts:
- models.providers.* — base URLs, API styles, model definitions
- openclaw models list --json and {agentDir}/models.json — built-in provider/model registry (OpenRouter, GitHub Copilot, OpenAI Codex, MiniMax Portal, Google Antigravity, etc.)
- openclaw models list --json — full model catalog with context window and capability metadata
- agents.defaults.model.primary — top-priority model
- agents.defaults.model.fallbacks — fallback chain order
- agents.defaults.models.* — aliases
Execution path:
- All providers run through the OpenClaw Gateway with a provider/model override
- Built-in and OAuth-backed providers like OpenRouter, GitHub Copilot, OpenAI Codex, MiniMax Portal, Qwen Portal, and Google Antigravity stay on OpenClaw's auth/runtime path
API key resolution
| Source | Resolution |
|--------|-----------|
| Literal key in config | Used directly |
| "xxx-oauth" sentinel | Checks {PROVIDER}_TOKEN env var (e.g. QWEN_PORTAL_TOKEN) |
| No key in config | Checks {PROVIDER_ID_UPPER}_API_KEY env var |
| Not resolvable | Hidden from routing pool |
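The resolution table can be read as a small decision function. The sketch below is hypothetical (the interface and function names are invented for illustration), but it follows the four rows of the table in order:

```typescript
// Sketch of the key-resolution rules in the table above. Names like
// `ProviderConfig` and `resolveApiKey` are illustrative, not the router's API.
interface ProviderConfig {
  id: string;      // e.g. "qwen-portal"
  apiKey?: string; // literal key, "xxx-oauth" sentinel, or absent
}

function resolveApiKey(
  p: ProviderConfig,
  env: Record<string, string | undefined>,
): string | undefined {
  const upper = p.id.toUpperCase().replace(/-/g, "_");
  if (p.apiKey && p.apiKey !== "xxx-oauth") return p.apiKey;  // literal key in config
  if (p.apiKey === "xxx-oauth") return env[`${upper}_TOKEN`]; // OAuth sentinel
  return env[`${upper}_API_KEY`];                             // no key in config
}
// A provider whose key resolves to undefined is hidden from the routing pool.
```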
Visibility rules:
- Models appear in /v1/models when the OpenClaw Gateway is reachable
- If the Gateway is down, models are hidden until it comes back
Current caveats for OpenClaw-backed execution:
- Standard chat requests now flow through OpenClaw Gateway's OpenAI-compatible HTTP API, so messages, temperature, max_tokens, and SSE streaming stay on the native OpenClaw path
- Conversation-level thinking overrides still fall back to the Gateway agent bridge until OpenClaw's HTTP chat endpoint exposes the same per-request thinking controls
- Agent-bridge fallback still only supports data:image/...;base64,... URLs from the latest user turn
Quick start
Pick the path that matches your setup:
- Use npm if you already have Node.js 20+
- Use Docker if you do not want to install Node.js
Easiest install: npm
If you already use Node.js, the simplest install path is a single npm command.
Install from npm
npm install -g claw-auto-router
claw-auto-router setup
Make sure your OpenClaw Gateway is running before you expect imported models to route:
openclaw gateway status
claw-auto-router setup automatically:
- detects your active OpenClaw config via openclaw config file
- imports the current OpenClaw model catalog, including built-in configured providers like OpenRouter, GitHub Copilot, OpenAI Codex, MiniMax Portal, and Google Antigravity
- asks you to assign tiers to your current models
- shows the current order inside each tier and lets you save explicit priority overrides
- asks whether routing decisions should stay heuristic or use RouterAI, and lets you pick the classifier model
- writes ~/.openclaw/router.config.json
- updates your OpenClaw config to point claw-auto-router/auto at the local router
- ensures gateway.http.endpoints.chatCompletions.enabled=true so the router can use OpenClaw's native OpenAI-compatible Gateway path
- on macOS, installs and starts a launchd background service automatically
If you want to throw away previous claw-auto-router tier assignments and rebuild them from scratch, use:
claw-auto-router clean-setup
It also installs a short alias: clawr
Useful examples:
# Use an explicit OpenClaw config path
claw-auto-router setup --config ~/.openclaw/moltbot.json
# Rebuild existing claw-auto-router setup from scratch
claw-auto-router clean-setup
# Use a custom router port during setup
claw-auto-router setup --port 3001
# Check the background service on macOS
claw-auto-router service status
# Start or restart the background service manually
claw-auto-router service start
claw-auto-router service restart
See recent routing decisions and why they were chosen:
claw-auto-router logs --limit 20
claw-auto-router logs --json
Open the live dashboard:
open http://127.0.0.1:43123/dashboard
Background service management on macOS:
claw-auto-router service install
claw-auto-router service status
claw-auto-router service stop
claw-auto-router service uninstall
If you want the latest unreleased version straight from GitHub instead:
npm install -g github:yuga-hashimoto/claw-auto-router
claw-auto-router setup
claw-auto-router
No-Node install: Docker Compose
If you want clawr running without installing Node.js locally, use Docker.
What you need
- Docker Desktop or Docker Engine + Docker Compose
- Your OpenClaw config at ~/.openclaw/openclaw.json or ~/.openclaw/moltbot.json
- Provider API keys only if they are not already stored in your OpenClaw config
1. Clone and start
git clone https://github.com/yuga-hashimoto/claw-auto-router.git
cd claw-auto-router
cp .env.example .env
docker compose up --build -d
docker-compose.yml mounts ~/.openclaw read-only and loads values from your local .env file automatically.
2. Add keys only if needed
Open .env and fill in only the provider keys that are missing from your OpenClaw config:
ZAI_API_KEY=
KIMI_CODING_API_KEY=
GOOGLE_API_KEY=
OPENROUTER_API_KEY=
NVIDIA_API_KEY=
QWEN_PORTAL_TOKEN=
Then restart:
docker compose restart
3. Verify it is up
curl http://localhost:43123/health
curl http://localhost:43123/v1/models
If /v1/models returns an empty list:
- start or fix the OpenClaw Gateway for imported models
- then reload the router config with POST /reload-config or restart the router service
Local install: Node.js + pnpm
Use this if you want local development, hot reload, or to modify the code.
What you need
- Node.js 20+
- pnpm
- Your OpenClaw config at ~/.openclaw/openclaw.json or ~/.openclaw/moltbot.json
# Install dependencies
pnpm install
# Build the CLI once
pnpm build
# Run one-time setup against your OpenClaw config
pnpm start -- setup
# Or run the server directly during development
pnpm dev
The router starts on http://localhost:43123 and reads your OpenClaw config automatically. On macOS, setup also installs a launchd agent so the router can keep running in the background after setup.
For a production-style local run:
pnpm install # also builds dist/ via prepare hook
pnpm start -- setup
pnpm start
pnpm dev        # Dev server with hot reload
pnpm build      # Compile TypeScript
pnpm start      # Run compiled output
pnpm test       # Run all tests
pnpm typecheck  # Type-check
router.config.json
Optional claw-auto-router-specific settings.
Default path: ~/.openclaw/router.config.json
- If you run claw-auto-router setup --config /path/to/openclaw.json, it writes /path/to/router.config.json
Example:
{
"modelTiers": {
"kimi-coding/k2p5": "CODE",
"nvidia/qwen/qwen3.5-397b-a17b": "COMPLEX",
"google/gemini-flash": "SIMPLE"
},
"tierPriority": {
"CODE": ["kimi-coding/k2p5", "nvidia/qwen/qwen3.5-397b-a17b"],
"SIMPLE": ["google/gemini-flash"]
},
"routerAI": {
"mode": "ai",
"model": "google/gemini-3-flash-preview",
"timeoutMs": 8000
},
"dashboard": {
"baselineModel": "openai-codex/gpt-5.4",
"refreshSeconds": 5
},
"denylist": ["some-provider/bad-model"]
}
| Field | Description |
|-------|-------------|
| modelTiers | Explicit tier per model — overrides heuristic scoring. Set by setup wizard. |
| tierPriority | Preferred model order within each tier (explicit beats score). Setup wizard can write this too. |
| routerAI | Optional AI classifier for tier decisions. If it fails, routing falls back to heuristics automatically. |
| dashboard | Baseline model + refresh interval for /dashboard estimated spend/savings. |
| denylist | Models to exclude from routing |
claw-auto-router setup also writes openClawIntegration metadata here so the router can remember your original OpenClaw primary/fallback chain without routing to itself.
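As a rough sketch of how tierPriority and denylist could combine when picking a model for a tier (pickModel is a hypothetical name, and the real selector also weighs per-model scores, which are omitted here):

```typescript
// Hypothetical model picker: explicit priority beats score, denylisted
// models never route. Field names follow router.config.json above.
function pickModel(
  tier: string,
  available: string[], // models currently resolvable and routable
  tierPriority: Record<string, string[]>,
  denylist: string[] = [],
): string | undefined {
  const usable = available.filter((m) => !denylist.includes(m));
  // Take the first explicitly-prioritized model that is actually available.
  for (const preferred of tierPriority[tier] ?? []) {
    if (usable.includes(preferred)) return preferred;
  }
  return usable[0]; // otherwise fall back to config order
}
```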
Conversation controls
You can change routing in the middle of a conversation by sending a short user message. When claw-auto-router can identify a stable session (session_id, user, x-session-id, x-openclaw-thread-id, or a derived conversation fingerprint), the override is saved for that conversation until you clear it.
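One plausible shape for that session resolution — explicit identifiers win, otherwise hash the opening user turn, which stays constant as the conversation grows. This is purely illustrative; the router's actual fingerprint derivation is not specified here, and sessionKey is an invented name:

```typescript
import { createHash } from "node:crypto";

interface Msg { role: string; content: string }

// Illustrative session resolution: prefer explicit headers, otherwise
// derive a stable fingerprint from the first user message.
function sessionKey(headers: Record<string, string>, messages: Msg[]): string {
  const explicit = headers["x-session-id"] ?? headers["x-openclaw-thread-id"];
  if (explicit) return explicit;
  const seed = messages
    .filter((m) => m.role === "user")
    .slice(0, 1) // the opening turn does not change as messages are appended
    .map((m) => m.content)
    .join("\n");
  return createHash("sha256").update(seed).digest("hex").slice(0, 16);
}
```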
Examples:
use opus
use gpt-5.4
use auto again
prefer code
clear tier
thinking high
thinking off
reset routing
What these do:
- use opus / use gpt-5.4 locks that conversation to a specific model
- use auto again returns to normal auto-routing
- prefer code forces the CODE tier for that conversation
- thinking high enables a conversation-level thinking override
- reset routing clears all conversation overrides at once
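A minimal sketch of parsing these commands, assuming the grammar shown above (the real command matcher is likely broader and fuzzier; the type and function names are invented for illustration):

```typescript
// Illustrative parser for the conversation commands listed above.
type Override =
  | { kind: "model"; model: string }
  | { kind: "tier"; tier: string }
  | { kind: "thinking"; level: string }
  | { kind: "reset" }
  | null;

function parseOverride(message: string): Override {
  const text = message.trim().toLowerCase();
  if (text === "reset routing" || text === "use auto again") return { kind: "reset" };
  let m = text.match(/^use (\S+)$/);
  if (m) return { kind: "model", model: m[1] };
  m = text.match(/^prefer (simple|standard|complex|code)$/);
  if (m) return { kind: "tier", tier: m[1].toUpperCase() };
  m = text.match(/^thinking (high|medium|low|off)$/);
  if (m) return { kind: "thinking", level: m[1] };
  return null; // not a routing command — treat as a normal chat message
}
```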
Current thinking support:
- thinking high / thinking medium / thinking low are forwarded to OpenClaw as gateway thinking-level overrides
- reasoning_effort is mapped onto the same OpenClaw thinking levels
- Budget and interleaved hints are normalized to the closest OpenClaw level before dispatch
- Standard OpenAI-style generation controls such as temperature and max_tokens are forwarded through the OpenClaw Gateway HTTP path
- Requests with thinking overrides currently use the Gateway agent bridge so OpenClaw can apply the requested thinking level
API reference
POST /v1/chat/completions
OpenAI-compatible chat completions.
# Auto-routing
curl -X POST http://localhost:43123/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"auto","messages":[{"role":"user","content":"Hello"}]}'
# Explicit model
curl -X POST http://localhost:43123/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"nvidia/qwen/qwen3.5-397b-a17b","messages":[{"role":"user","content":"Explain neural networks"}]}'
GET /v1/models
Returns all models with resolved API keys.
GET /health
Liveness check with model counts.
GET /stats
Routing stats: requests, per-model counts, fallback rate, classifier modes, active session overrides, config status, and estimated spend/savings when model pricing is known.
GET /dashboard
Live HTML dashboard on top of /stats.
- Request volume, success rate, fallback rate
- Estimated spend and savings versus a baseline model
- Tier and classifier distribution
- Per-model usage and recent routing history
- Active conversation overrides
POST /reload-config
Reload OpenClaw config without restart. Atomically replaces the routing pool.
curl -X POST http://localhost:43123/reload-config
# With admin token:
curl -X POST http://localhost:43123/reload-config \
-H "Authorization: Bearer your-token"
Pointing OpenClaw at claw-auto-router
If you use claw-auto-router setup, you do not need to edit OpenClaw manually.
Add to your moltbot.json (or openclaw.json):
{
"models": {
"providers": {
"claw-auto-router": {
"baseUrl": "http://localhost:43123",
"apiKey": "any-value",
"api": "openai-completions",
"models": [
{
"id": "auto",
"name": "Auto Router",
"api": "openai-completions",
"contextWindow": 262144,
"maxTokens": 32768
}
]
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "claw-auto-router/auto"
}
}
}
}
Set your agent model to claw-auto-router/auto. OpenClaw sends chat completions to claw-auto-router, which routes to the actual best model internally.
Environment variables
| Variable | Default | Description |
|----------|---------|-------------|
| PORT | 43123 | HTTP port |
| HOST | 0.0.0.0 | Bind address |
| LOG_LEVEL | info | trace\|debug\|info\|warn\|error |
| OPENCLAW_CONFIG_PATH | auto-detect | Override config path |
| ROUTER_REQUEST_TIMEOUT_MS | 30000 | Per-provider timeout (ms) |
| ROUTER_ADMIN_TOKEN | (none) | Token for /reload-config |
| ZAI_API_KEY | (none) | zai provider key |
| KIMI_CODING_API_KEY | (none) | kimi-coding provider key |
| GOOGLE_API_KEY | (none) | Google provider key |
| OPENROUTER_API_KEY | (none) | OpenRouter key |
| NVIDIA_API_KEY | (none) | NVIDIA key (if not in config) |
| QWEN_PORTAL_TOKEN | (none) | qwen-portal OAuth token |
| OPENAI_CODEX_TOKEN | (none) | Override token for openai-codex |
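The defaults in the table can be summarized as a small settings reader. The variable names match the table; the routerSettings function itself is an illustrative sketch, not the router's actual config loader:

```typescript
// Illustrative env reader mirroring the defaults in the table above.
function routerSettings(env: Record<string, string | undefined>) {
  return {
    port: Number(env.PORT ?? 43123),
    host: env.HOST ?? "0.0.0.0",
    logLevel: env.LOG_LEVEL ?? "info",
    requestTimeoutMs: Number(env.ROUTER_REQUEST_TIMEOUT_MS ?? 30000),
    adminToken: env.ROUTER_ADMIN_TOKEN, // undefined → /reload-config needs no token
  };
}
```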
Docker
# Start with docker compose
docker compose up
# Manual run (mounts your OpenClaw config read-only)
docker build -t claw-auto-router .
docker run -p 43123:43123 \
-v ~/.openclaw:/root/.openclaw:ro \
-e ZAI_API_KEY=your-key \
claw-auto-router
Troubleshooting
"No resolvable candidates"
→ OpenClaw Gateway is unavailable or OpenClaw cannot resolve any enabled models. Check openclaw gateway status, then inspect GET /stats → configStatus.warnings.
Provider in fallbacks but not in routing pool
→ Phantom ref — add that provider/model to your OpenClaw config so openclaw models list --json can see it.
"env_missing" but key is set
→ Check the provider auth inside OpenClaw itself. claw-auto-router now delegates auth/model execution back through OpenClaw Gateway.
502 All providers failed
→ All providers returned errors. Check GET /stats for per-model failure counts and server logs for specific HTTP errors.
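The fallback behavior behind this error can be sketched as a simple try-in-order loop: attempt each candidate model, return the first success, and only surface a 502-style error once every candidate has failed. This is an illustrative sketch (withFallback is an invented name), not the router's internals:

```typescript
// Illustrative fallback loop: first success wins, errors are collected so
// something like /stats can report per-model failures.
async function withFallback<T>(
  candidates: string[],
  call: (model: string) => Promise<T>,
): Promise<T> {
  const errors: string[] = [];
  for (const model of candidates) {
    try {
      return await call(model);
    } catch (err) {
      errors.push(`${model}: ${String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```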
Natural-language model switch did not stick
→ Send a stable session identifier (session_id, user, or x-session-id) so claw-auto-router can remember the override across turns.
Wizard doesn't appear
→ claw-auto-router only runs the wizard when stdin/stdout are TTYs. In Docker or CI, set modelTiers in router.config.json manually.
Release automation
npm publishing is handled by GitHub Actions trusted publishing in publish.yml.
- Bump the version in package.json
- Register yuga-hashimoto/claw-auto-router + .github/workflows/publish.yml once as an npm trusted publisher
- Push to main or run the workflow manually from GitHub Actions
- The workflow runs pnpm typecheck, pnpm test, and pnpm build
- If that version is not already on npm, it publishes automatically without an npm token
- The same workflow also creates a vX.Y.Z Git tag and GitHub Release with generated release notes
