@sifxprime/krouter
v0.5.9
kRouter (Kodelyth Router) CLI — Kodelyth AI Infrastructure
kRouter — Kodelyth AI Infrastructure
A hardened fork of 9Router. Same FREE AI router & token saver — 11 audited security/stability fixes layered on top.
Never stop coding. Save 20-40% tokens with RTK + auto-fallback to FREE & cheap AI models. Connect Claude Code, Cursor, Antigravity, Copilot, Codex, Gemini, OpenCode, Cline, Kiro, OpenClaw... to 40+ AI providers & 100+ models — through one local endpoint.
🚀 Quick Start • 💡 Features • 📖 Setup • 🔧 About this fork
🇻🇳 Tiếng Việt • 🇨🇳 中文 • 🇯🇵 日本語 • 🇷🇺 Русский
You're viewing this on npm. Full docs, screenshots, and changelog: github.com/sifxprime/krouter.
🔧 About this fork — sifxprime/krouter
This is a hardened fork of the upstream decolua/9router, rebranded as kRouter by Kodelyth AI Infrastructure and maintained at sifxprime/krouter.
It tracks upstream but adds an audited security + stability layer focused on production reliability when kRouter is exposed via tunnel (Cloudflare, Tailscale) or used under concurrent multi-agent load. Eleven audit findings closed across nine atomic commits, each with reproducing unit tests and live end-to-end verification before commit.
| Area | Hardening in this fork |
|---|---|
| MITM stream layer | Upstream HTTP error → parseable Kiro exception frame instead of "Truncated event message received". Read-loop guarded against socket hang-up. EventStream encoder bounds-checked (name ≤ 255, value ≤ 65 535, frame ≤ 16 MiB) to prevent silent frame corruption. |
| Auth & sessions | Token refresh no longer mutates caller credentials under concurrency. Post-refresh retry always adopts its own response (no stale-401 masking). Timing-safe CLI token compare. Per-IP brute-force lockout for API-key + CLI-token paths. |
| Account fallback | backoffLevel read-modify-write made atomic via SQLite transaction — concurrent failures no longer lose increments and stall exponential backoff. |
| Provider SSRF | GET /api/providers/[id]/models validates user-supplied baseUrl: blocks cloud metadata endpoints (169.254.169.254, ECS, GCP, Alibaba) + non-http(s) schemes. Loopback and private LAN ranges remain allowed for legitimate self-hosted LLMs. |
| Combo routing | Recursion depth guard (max 3) prevents infinite loops on misconfigured combo-of-combo cycles. Single getSettings() DB read per request (was 2–3). |
See CHANGELOG.md entry v0.4.80+sifxprime.1 for the full commit log with one-line summaries. Each finding's reproduction + verification approach is recorded in the commit body.
Upstream features, providers, docs, and roadmap remain authoritative — credit and stars belong to @decolua and the upstream contributors.
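To make the Provider SSRF row of the table concrete, here is a minimal sketch of that kind of `baseUrl` check. It is not the fork's actual code: the function name `validateBaseUrl`, the specific blocklist entries, and the return shape are assumptions chosen for illustration.

```javascript
// Hypothetical helper, not kRouter's implementation. The blocklist is an
// illustrative subset of well-known cloud metadata addresses.
const BLOCKED_METADATA_HOSTS = new Set([
  "169.254.169.254",          // link-local metadata address (AWS, GCP, Azure)
  "169.254.170.2",            // AWS ECS task metadata
  "metadata.google.internal", // GCP metadata hostname
  "100.100.100.200",          // Alibaba Cloud metadata
]);

function validateBaseUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }
  // Only http(s) is allowed; file:, gopher:, etc. are rejected.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, reason: "scheme must be http or https" };
  }
  if (BLOCKED_METADATA_HOSTS.has(url.hostname.toLowerCase())) {
    return { ok: false, reason: "cloud metadata endpoint" };
  }
  // Loopback and private LAN hosts pass, so self-hosted LLMs keep working.
  return { ok: true, url };
}
```

Under these assumptions, `validateBaseUrl("http://localhost:11434/v1")` is accepted while `validateBaseUrl("http://169.254.169.254/latest/meta-data")` and `validateBaseUrl("file:///etc/passwd")` are rejected.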
🤔 Why kRouter?
Stop wasting money, tokens and hitting limits:
- ❌ Subscription quota expires unused every month
- ❌ Rate limits stop you mid-coding
- ❌ Tool outputs (git diff, grep, ls...) burn tokens fast
- ❌ Expensive APIs ($20-50/month per provider)
- ❌ Manual switching between providers
kRouter solves this:
- ✅ RTK Token Saver - Auto-compress tool_result content, save 20-40% tokens per request
- ✅ Maximize subscriptions - Track quota, use every bit before reset
- ✅ Auto fallback - Subscription → Cheap → Free, zero downtime
- ✅ Multi-account - Round-robin between accounts per provider
- ✅ Universal - Works with Claude Code, Codex, Cursor, Cline, any CLI tool
🔄 How It Works
┌─────────────┐
│ Your CLI │ (Claude Code, Codex, OpenClaw, Cursor, Cline...)
│ Tool │
└──────┬──────┘
│ http://localhost:20128/v1
↓
┌─────────────────────────────────────────────┐
│ kRouter (Smart Router) │
│ • RTK Token Saver (cut tool_result tokens) │
│ • Format translation (OpenAI ↔ Claude) │
│ • Quota tracking │
│ • Auto token refresh │
└──────┬──────────────────────────────────────┘
│
├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, GitHub Copilot
│ ↓ quota exhausted
├─→ [Tier 2: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
│ ↓ budget limit
└─→ [Tier 3: FREE] Kiro, OpenCode Free, Vertex ($300 credits)
Result: Never stop coding, minimal cost + 20-40% token savings via RTK
⚡ Quick Start (this fork)
The npm package `9router` and the Docker image `decolua/9router` are still the upstream project. To get this fork's hardening pass + Kiro Google/GitHub OAuth, install from source as shown below.
TL;DR — run these four commands one at a time
Press Enter after each line. Windows CMD/PowerShell will not split a single pasted block into separate commands and will treat `git clone …\n cd krouter` as one command: you'll see `Repository not found` because Git tries to clone a URL with `cd` appended.
git clone https://github.com/sifxprime/krouter.git
cd krouter
npm install
npm run dev
Dashboard opens at http://localhost:20128/dashboard.
That's the whole install. The rest of this section breaks down what each step needs and the production-style run.
Prerequisites
| Tool | Minimum | Notes |
|---|---|---|
| Node.js | ≥ 20 (22 recommended) | node -v to check. macOS via Homebrew: brew install node@22. Linux: nvm or your distro's Node 22. Windows: nodejs.org installer. |
| Git | any recent | git --version |
| A package manager | npm (bundled with Node) | pnpm / yarn / bun also work — the project detects and uses whichever you ran install with. |
| Sudo / admin | only if you enable MITM | Pure router mode (chat completions only) needs no privileges. MITM intercept for Kiro / Antigravity / Copilot / Cursor binds :443 and edits /etc/hosts, which does. |
Step-by-step
1. Get the code
git clone https://github.com/sifxprime/krouter.git
cd krouter
2. Install dependencies (≈ 1–3 minutes)
npm install
# or: pnpm install / yarn install / bun install
This pulls Next.js, React, the better-sqlite3 native binding, and the rest. The SQLite binding compiles on install; on a fresh macOS you may be prompted for Xcode Command Line Tools (one-time xcode-select --install).
3. Start it
For day-to-day use:
npm run dev # Next.js dev server, hot reload, port 20128
For production-style standalone (smaller memory, no HMR):
npm run build:deploy # one-time build + copy static assets to standalone, ≈ 30s
npm run start # standalone server on PORT=20128
Use `npm run build:deploy`, not plain `npm run build`: the standalone bundle that `npm run start` runs needs the static assets copied alongside it. The `:deploy` variant does both steps.
Either way, open: http://localhost:20128/dashboard
On first run the app creates ~/.krouter/ (SQLite DB, machine-id, MITM CA) — it's gitignored, per-user, and fully reset by deleting that folder. Upgrading from a pre-rename install? The legacy ~/.9router/ is auto-migrated to ~/.krouter/ on first launch, idempotent and lossless.
What happens next (no reboot, no extra config)
1. Dashboard → Providers → pick any provider tile.
   - Free, no signup needed: MiMo Code Free, OpenCode Free → click [+] on the suggested model.
   - Free, login required: Kiro AI (AWS Builder ID or Google or GitHub thanks to this fork's device-code OAuth), Gemini CLI, Qoder.
   - API key: OpenRouter, NVIDIA NIM, Anthropic, OpenAI, etc.
2. Dashboard → API Keys → create one local API key (e.g. `sk-krouter-XXXX`).
3. Point any AI tool at kRouter:
   Endpoint: http://localhost:20128/v1
   API key: sk-krouter-XXXX (from step 2)
   Model: kr/claude-sonnet-4.5 (or whatever provider/model you connected)
Works with Claude Code, Cursor, Antigravity, Copilot, Codex CLI, OpenCode, Cline, OpenClaw, any OpenAI-compatible client.
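For a client that is not one of the listed tools, the same three settings map onto a plain OpenAI-style HTTP request. The sketch below only builds the request, using the placeholder key and model from the steps above; `buildChatRequest` is a hypothetical helper name, and the `/chat/completions` path is the one given in the API Reference section.

```javascript
// Builds (but does not send) an OpenAI-style chat completion request for the
// local kRouter endpoint. Key and model are the placeholders from the text.
function buildChatRequest({ baseUrl, apiKey, model, prompt }) {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
        stream: false,
      }),
    },
  };
}

const req = buildChatRequest({
  baseUrl: "http://localhost:20128/v1",
  apiKey: "sk-krouter-XXXX",
  model: "kr/claude-sonnet-4.5",
  prompt: "Say hello",
});
// With Node 20's built-in fetch, sending it would look like:
//   const res = await fetch(req.url, req.init);
```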
Updating to the latest fork commits
cd krouter
git pull origin main
npm install # only if package.json changed
# restart dev/start
Don't need the hardening pass? Use upstream
If you don't need this fork's audit fixes or Kiro Google/GitHub login, the upstream npm/Docker artifacts still work — they just don't carry this fork's changes:
npm install -g 9router && 9router
# or
docker run -d -p 20128:20128 -v ~/.9router:/root/.9router decolua/9router:latest
Default URLs
- Dashboard: http://localhost:20128/dashboard
- OpenAI-compatible API: http://localhost:20128/v1
- Anthropic-compatible API: http://localhost:20128/v1/messages
- Health probe: http://localhost:20128/api/health
Platform support — honest status
This fork's changes are pure cross-platform JavaScript and re-use upstream's per-OS cert/DNS install code. Here is what is actually verified vs what should work but I haven't personally driven on every OS:
| OS | Pure router mode (chat completions) | MITM mode (Kiro / Antigravity / Copilot IDE intercept) |
|---|---|---|
| macOS (Intel + Apple Silicon) | ✅ verified in development | ✅ Kiro driven live, end-to-end |
| Windows 10 / 11 | ✅ verified by a fork user | ✅ confirmed by a fork user: Antigravity streamed gemini-pro-agent end-to-end through MITM, 10.6 s, complete, no errors. Cert install via elevated PowerShell + certutil -addstore -f Root works as upstream ships it. |
| Linux (Debian / Ubuntu / Arch / Fedora / RHEL / openSUSE) | ✅ pure-JS, high confidence | ⚠️ untested as of now; upstream's distro-aware update-ca-certificates / update-ca-trust path is in place and the same JS runs everywhere |
MITM mode is confirmed on macOS and Windows — Antigravity, Copilot, and Kiro share the same hardened pipe layer (base.js) that I verified live. Upstream already supports the Linux distro families above via the OS-aware code in src/mitm/cert/install.js. If MITM fails on Linux, the most likely cause is the OS-level cert-install flow (upstream territory), not anything this fork touched.
Windows note: MITM requires an elevated process (binds :443, edits C:\Windows\System32\drivers\etc\hosts, installs the root CA). Launch your CMD or PowerShell as Run as administrator, then cd into the project and run npm run dev. The dashboard will hide the "Administrator required" banner once net session succeeds inside the dev server's Node process.
If you hit anything OS-specific, open an issue — I'll try to repro and patch.
🛠️ Supported CLI Tools
kRouter works seamlessly with all major AI coding tools:
🌐 Supported Providers
🔐 OAuth Providers
🆓 Free Providers
Note: iFlow, Qwen and Gemini CLI free tiers were discontinued in 2026. Use Kiro / OpenCode Free / Vertex instead.
🔑 API Key Providers (40+)
💡 Key Features
| Feature | What It Does | Why It Matters |
|---------|--------------|----------------|
| 🚀 RTK Token Saver (RTK ⭐40K) | Compress tool outputs (git diff, grep, ls, tree...) before sending to LLM | Save 20-40% input tokens per request |
| 🪨 Caveman Mode (Caveman ⭐52K) | Inject caveman-speak prompt → LLM replies terse, technical substance preserved | Save up to 65% output tokens |
| 🎯 Smart 3-Tier Fallback | Auto-route: Subscription → Cheap → Free | Never stop coding, zero downtime |
| 📊 Real-Time Quota Tracking | Live token count + reset countdown | Maximize subscription value |
| 🔄 Format Translation | OpenAI ↔ Claude ↔ Gemini ↔ Cursor ↔ Kiro ↔ Vertex | Works with any CLI tool |
| 👥 Multi-Account Support | Multiple accounts per provider | Load balancing + redundancy |
| 🔄 Auto Token Refresh | OAuth tokens refresh automatically | No manual re-login needed |
| 🎨 Custom Combos | Create unlimited model combinations | Tailor fallback to your needs |
| 📝 Request Logging | Debug mode with full request/response logs | Troubleshoot issues easily |
| 💾 Cloud Sync | Sync config across devices | Same setup everywhere |
| 📊 Usage Analytics | Track tokens, cost, trends over time | Optimize spending |
| 🌐 Deploy Anywhere | Localhost, VPS, Docker, Cloudflare Workers | Flexible deployment options |
🚀 RTK Token Saver
Tool outputs (git diff, grep, find, ls, tree, log dumps...) often eat 30-50% of your prompt budget. RTK detects them and applies smart, lossless compression before the request hits the LLM:
- Filters: `git-diff`, `git-status`, `grep`, `find`, `ls`, `tree`, `dedup-log`, `smart-truncate`, `read-numbered`, `search-list`
- Auto-detect: No config needed. RTK peeks the first 1KB of each `tool_result` and picks the right filter.
- Safe by design: If a filter fails, throws, or makes output bigger, RTK silently keeps the original text. Errors never break your request.
- Universal: Works across all formats (OpenAI, Claude, Gemini, Cursor, Kiro, OpenAI Responses) because it runs before any format translation.
- Default ON: Toggle anytime in Dashboard → Endpoint settings.
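The "safe by design" rule in the list above can be sketched as a small wrapper. This is an illustration, not RTK's code: `applyFilterSafely` and the toy `dedupLines` filter are hypothetical names, and treating an equal-length result as "no gain" is an assumption.

```javascript
// Hypothetical wrapper: run a filter, but fall back to the original text if
// the filter throws, returns a non-string, or fails to shrink the input.
function applyFilterSafely(filter, text) {
  try {
    const compressed = filter(text);
    if (typeof compressed !== "string" || compressed.length >= text.length) {
      return text;
    }
    return compressed;
  } catch {
    return text; // a failing filter must never break the request
  }
}

// Toy filter for demonstration only: collapse consecutive duplicate lines.
function dedupLines(text) {
  const out = [];
  for (const line of text.split("\n")) {
    if (out[out.length - 1] !== line) out.push(line);
  }
  return out.join("\n");
}
```

Because the wrapper compares lengths after the filter runs, a filter can be added or changed without any risk of inflating the prompt.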
Without RTK: 47K tokens sent to LLM
With RTK: 28K tokens sent to LLM (40% saved · same context · same answer)
🎯 Smart 3-Tier Fallback
Create combos with automatic fallback:
Combo: "my-coding-stack"
1. cc/claude-opus-4-6 (your subscription)
2. glm/glm-4.7 (cheap backup, $0.6/1M)
3. kr/claude-sonnet-4.5 (free fallback)
→ Auto switches when quota runs out or errors occur
📊 Real-Time Quota Tracking
- Token consumption per provider
- Reset countdown (5-hour, daily, weekly)
- Cost estimation for paid tiers
- Monthly spending reports
🔄 Format Translation
Seamless translation between formats:
- OpenAI ↔ Claude ↔ Gemini ↔ Cursor ↔ Kiro ↔ Vertex ↔ Antigravity ↔ Ollama ↔ OpenAI Responses
- Your CLI tool sends OpenAI format → kRouter translates → Provider receives native format
- Works with any tool that supports custom OpenAI endpoints
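As a rough illustration of one translation direction, the sketch below converts a text-only OpenAI chat request into the shape of an Anthropic Messages request, where the system prompt is a top-level `system` field and `max_tokens` is required. It is deliberately simplified (no tools, images, or model-name mapping); `openAiToAnthropic` and the 1024-token default are assumptions, not kRouter's translator.

```javascript
// Simplified sketch: text-only messages, no tools or images.
function openAiToAnthropic(body, defaultMaxTokens = 1024) {
  // Anthropic takes the system prompt as a top-level field, not a message.
  const system = body.messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n\n");
  const messages = body.messages
    .filter((m) => m.role === "user" || m.role === "assistant")
    .map((m) => ({ role: m.role, content: m.content }));
  const out = {
    model: body.model,
    messages,
    max_tokens: body.max_tokens ?? defaultMaxTokens, // required by Anthropic
  };
  if (system) out.system = system;
  if (body.stream !== undefined) out.stream = body.stream;
  return out;
}
```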
👥 Multi-Account Support
- Add multiple accounts per provider
- Auto round-robin or priority-based routing
- Fallback to next account when one hits quota
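One way to picture round-robin with quota fallback is the picker below. It is a sketch under assumptions: the `quotaExhausted` flag and `createAccountPicker` are hypothetical names, and the real router may track quota differently.

```javascript
// Hypothetical account picker: rotate through accounts in order, skipping
// any account currently marked as out of quota.
function createAccountPicker(accounts) {
  let cursor = 0;
  return function next() {
    for (let i = 0; i < accounts.length; i++) {
      const candidate = accounts[(cursor + i) % accounts.length];
      if (!candidate.quotaExhausted) {
        // Resume after the account we just handed out.
        cursor = (cursor + i + 1) % accounts.length;
        return candidate;
      }
    }
    return null; // every account is exhausted; the caller moves to the next tier
  };
}
```

Returning `null` rather than throwing keeps the "all accounts exhausted" case an ordinary signal that the caller can turn into a tier fallback.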
🔄 Auto Token Refresh
- OAuth tokens automatically refresh before expiration
- No manual re-authentication needed
- Seamless experience across all providers
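A refresh-before-expiry check can be sketched in a few lines. The credential shape (`accessToken`, `expiresAt` in epoch milliseconds), the 60-second safety margin, and both function names are assumptions for illustration. Returning a new object instead of mutating the caller's credentials reflects the concurrency fix described in the About this fork table.

```javascript
// Hypothetical credential shape: { accessToken, expiresAt } (epoch ms).
const REFRESH_SKEW_MS = 60_000; // assumed safety margin, not a documented value

// True when the token is expired or will expire within the safety margin.
function needsRefresh(creds, now = Date.now()) {
  return creds.expiresAt - now <= REFRESH_SKEW_MS;
}

// Returns a new frozen object instead of mutating `creds`, so concurrent
// callers holding the old object never observe a half-updated token.
function withRefreshedToken(creds, refreshed) {
  return Object.freeze({ ...creds, ...refreshed });
}
```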
🎨 Custom Combos
- Create unlimited model combinations
- Mix subscription, cheap, and free tiers
- Name your combos for easy access
- Share combos across devices with Cloud Sync
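Since a combo can reference another combo, expansion needs a depth limit to stay safe against cycles. The sketch below assumes a plain object mapping combo names to ordered entries and a limit of 3 nesting levels, echoing the guard listed under Combo routing; `expandCombo` and the data layout are hypothetical.

```javascript
// Hypothetical combo table: each combo maps to an ordered list of model IDs
// or other combo names.
const MAX_DEPTH = 3;

function expandCombo(name, combos, depth = 0) {
  if (!Object.hasOwn(combos, name)) return [name]; // a plain model ID
  if (depth >= MAX_DEPTH) {
    throw new Error(`combo nesting deeper than ${MAX_DEPTH}: ${name}`);
  }
  // Expand nested combos in place, preserving fallback order.
  return combos[name].flatMap((entry) => expandCombo(entry, combos, depth + 1));
}
```

With this guard, a misconfigured pair such as `a → b → a` fails fast with an error instead of recursing forever.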
📝 Request Logging
- Enable debug mode for full request/response logs
- Track API calls, headers, and payloads
- Troubleshoot integration issues
- Export logs for analysis
💾 Cloud Sync
- Sync providers, combos, and settings across devices
- Automatic background sync
- Secure encrypted storage
- Access your setup from anywhere
Cloud Runtime Notes
- Prefer server-side cloud variables in production:
  - `BASE_URL` (internal callback URL used by sync scheduler)
  - `CLOUD_URL` (cloud sync endpoint base)
- `NEXT_PUBLIC_BASE_URL` and `NEXT_PUBLIC_CLOUD_URL` are still supported for compatibility/UI, but server runtime now prioritizes `BASE_URL` / `CLOUD_URL`.
- Cloud sync requests now use timeout + fail-fast behavior to avoid UI hanging when cloud DNS/network is unavailable.
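The precedence and fail-fast behavior described in these notes could look roughly like the following. Both helper names and the 5-second timeout are assumptions; the fallback defaults are the ones listed for `BASE_URL` and `CLOUD_URL` in the Environment Variables table.

```javascript
// Server-side variables win; NEXT_PUBLIC_* are the compatibility fallback.
function resolveCloudConfig(env = process.env) {
  return {
    baseUrl: env.BASE_URL || env.NEXT_PUBLIC_BASE_URL || "http://localhost:20128",
    cloudUrl: env.CLOUD_URL || env.NEXT_PUBLIC_CLOUD_URL || "https://9router.com",
  };
}

// Fail-fast fetch: the request is aborted if the cloud endpoint does not
// answer in time. The 5 s default is an assumption for this sketch.
async function fetchWithTimeout(url, init = {}, timeoutMs = 5000) {
  return fetch(url, { ...init, signal: AbortSignal.timeout(timeoutMs) });
}
```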
📊 Usage Analytics
- Track token usage per provider and model
- Cost estimation and spending trends
- Monthly reports and insights
- Optimize your AI spending
💡 IMPORTANT - Understanding Dashboard Costs:
The "cost" displayed in Usage Analytics is for tracking and comparison purposes only. kRouter itself never charges you anything. You only pay providers directly (if using paid services).
Example: If your dashboard shows "$290 total cost" while using Kiro models, this represents what you would have paid using paid APIs directly. Your actual cost = $0 (Kiro is free).
Think of it as a "savings tracker" showing how much you're saving by using free models or routing through kRouter!
🌐 Deploy Anywhere
- 💻 Localhost - Default, works offline
- ☁️ VPS/Cloud - Share across devices
- 🐳 Docker - One-command deployment
- 🚀 Cloudflare Workers - Global edge network
💰 Pricing at a Glance
| Tier | Provider | Cost | Quota Reset | Best For |
|------|----------|------|-------------|----------|
| 🚀 TOKEN SAVER | RTK (built-in) | FREE | Always on | Save 20-40% tokens on EVERY request |
| 💳 SUBSCRIPTION | Claude Code (Pro/Max) | $20-200/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| | Cursor IDE | $20/mo | Monthly | Cursor users |
| 💰 CHEAP | GLM-5.1 / GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.7 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2.5 | $9/mo flat | 10M tokens/mo | Predictable cost |
| 🆓 FREE | Kiro AI | $0 | Unlimited | Claude 4.5 + GLM-5 + MiniMax free |
| | OpenCode Free | $0 | Unlimited | No auth, auto-fetch models |
| | Vertex AI | $300 credits | New GCP accounts | Gemini 3 Pro + DeepSeek + GLM-5 |
💡 Pro Tip: RTK + Kiro AI + OpenCode Free combo = $0 cost + 20-40% token savings!
📊 Understanding kRouter Costs & Billing
kRouter Billing Reality:
✅ kRouter software = FREE forever (open source, never charges)
✅ Dashboard "costs" = Display/tracking only (not actual bills)
✅ You pay providers directly (subscriptions or API fees)
✅ FREE providers stay FREE (Kiro, OpenCode Free = $0; Vertex = $300 free credits)
❌ kRouter never sends invoices or charges your card
How Cost Display Works:
The dashboard shows estimated costs as if you were using paid APIs directly. This is not billing - it's a comparison tool to show your savings.
Example Scenario:
Dashboard Display:
• Total Requests: 1,662
• Total Tokens: 47M
• Display Cost: $290
Reality Check:
• Provider: Kiro (FREE)
• Actual Payment: $0.00
• What $290 Means: Amount you SAVED by using free models!
Payment Rules:
- Subscription providers (Claude Code, Codex): Pay them directly via their websites
- Cheap providers (GLM, MiniMax): Pay them directly, kRouter just routes
- FREE providers (Kiro, OpenCode Free, Vertex credits): Free to use, no hidden charges
- kRouter: Never charges anything, ever
🎯 Use Cases
Case 1: "I have Claude Pro subscription"
Problem: Quota expires unused, rate limits during heavy coding
Solution:
Combo: "maximize-claude"
1. cc/claude-opus-4-7 (use subscription fully)
2. glm/glm-5.1 (cheap backup when quota out)
3. kr/claude-sonnet-4.5 (free emergency fallback)
Monthly cost: $20 (subscription) + ~$5 (backup) = $25 total
vs. $20 + hitting limits = frustration
Case 2: "I want zero cost"
Problem: Can't afford subscriptions, need reliable AI coding
Solution:
Combo: "free-forever"
1. kr/claude-sonnet-4.5 (Claude 4.5 free unlimited)
2. kr/glm-5 (GLM-5 free via Kiro)
3. oc/<auto> (OpenCode Free, no auth)
Monthly cost: $0
Quality: Production-ready models + RTK saves 20-40% tokens
Case 3: "I need 24/7 coding, no interruptions"
Problem: Deadlines, can't afford downtime
Solution:
Combo: "always-on"
1. cc/claude-opus-4-7 (best quality)
2. cx/gpt-5.5 (second subscription)
3. glm/glm-5.1 (cheap, resets daily)
4. minimax/MiniMax-M2.7 (cheapest, 5h reset)
5. kr/claude-sonnet-4.5 (free unlimited)
Result: 5 layers of fallback = zero downtime
Monthly cost: $20-200 (subscriptions) + $10-20 (backup)
Case 4: "I want FREE AI in OpenClaw"
Problem: Need AI assistant in messaging apps (WhatsApp, Telegram, Slack...), completely free
Solution:
Combo: "openclaw-free"
1. kr/claude-sonnet-4.5 (Claude 4.5 free)
2. kr/glm-5 (GLM-5 free)
3. kr/MiniMax-M2.5 (MiniMax free)
Monthly cost: $0
Access via: WhatsApp, Telegram, Slack, Discord, iMessage, Signal...
❓ Frequently Asked Questions
The dashboard tracks your token usage and displays estimated costs as if you were using paid APIs directly. This is not actual billing - it's a reference to show how much you're saving by using free models or existing subscriptions through kRouter.
Example:
- Dashboard shows: "$290 total cost"
- Reality: You're using Kiro (FREE)
- Your actual cost: $0.00
- What $290 means: Amount you saved by using free models instead of paid APIs!
The cost display is a "savings tracker" to help you understand your usage patterns and optimization opportunities.
No. kRouter is free, open-source software that runs on your own computer. It never charges you anything.
You only pay:
- ✅ Subscription providers (Claude Code $20/mo, Codex $20-200/mo) → Pay them directly on their websites
- ✅ Cheap providers (GLM, MiniMax) → Pay them directly, kRouter just routes your requests
- ❌ kRouter itself → Never charges anything, ever
kRouter is a local proxy/router. It doesn't have your credit card, can't send invoices, and has no billing system. It's completely free software.
Yes! The current FREE providers (Kiro, OpenCode Free, Vertex) are genuinely free with no hidden charges.
These are free services offered by those respective companies:
- Kiro AI: Free unlimited Claude 4.5 + GLM-5 + MiniMax via AWS Builder ID / Google / GitHub OAuth
- OpenCode Free: No-auth passthrough proxy, models auto-fetched from `opencode.ai/zen/v1/models`
- Vertex AI: $300 free credits for new Google Cloud accounts (90 days)
kRouter just routes your requests to them - there's no "catch" or future billing. They're truly free services, and kRouter makes them easy to use with fallback support.
Discontinued free tiers (no longer recommended):
- ❌ iFlow: Was free unlimited, now changed to paid (2026)
- ❌ Qwen Code: Free OAuth tier discontinued by Alibaba on 2026-04-15
- ❌ Gemini CLI: Still works, but using it with non-CLI tools (Claude, Codex, Cursor...) may result in account bans — only use if you stick to Gemini CLI itself
Free-First Strategy:
Start with 100% free combo:
1. kr/claude-sonnet-4.5 (Claude 4.5 free via Kiro)
2. kr/glm-5 (GLM-5 free via Kiro)
3. oc/<auto> (OpenCode Free, no auth)
Cost: $0/month
Add cheap backup only if you need it:
4. glm/glm-4.7 ($0.6/1M tokens)
Additional cost: Only pay for what you actually use
Use subscription providers last:
- Only if you already have them
- kRouter helps maximize their value through quota tracking
Result: Most users can operate at $0/month using only free tiers!
kRouter's smart fallback prevents surprise charges:
Scenario: You're on a coding sprint and blow through your quotas
Without kRouter:
- ❌ Hit rate limit → Work stops → Frustration
- ❌ Or: Accidentally rack up huge API bills
With kRouter:
- ✅ Subscription hits limit → Auto-fallback to cheap tier
- ✅ Cheap tier gets expensive → Auto-fallback to free tier
- ✅ Never stop coding → Predictable costs
You're in control: Set spending limits per provider in dashboard, and kRouter respects them.
📖 Setup Guide
Claude Code (Pro/Max)
Dashboard → Providers → Connect Claude Code
→ OAuth login → Auto token refresh
→ 5-hour + weekly quota tracking
Models:
cc/claude-opus-4-7
cc/claude-opus-4-6
cc/claude-sonnet-4-6
cc/claude-haiku-4-5-20251001
Pro Tip: Use Opus for complex tasks, Sonnet for speed. kRouter tracks quota per model!
OpenAI Codex (Plus/Pro)
Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset
Models:
cx/gpt-5.5
cx/gpt-5.4
cx/gpt-5.3-codex
cx/gpt-5.2-codex
GitHub Copilot
Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)
Models:
gh/gpt-5.4
gh/claude-opus-4.7
gh/claude-sonnet-4.6
gh/gemini-3.1-pro-preview
gh/grok-code-fast-1
Cursor IDE
Dashboard → Providers → Connect Cursor
→ OAuth login
→ Monthly subscription
Models:
cu/claude-4.6-opus-max
cu/claude-4.5-sonnet-thinking
cu/gpt-5.3-codex
GLM-5.1 / GLM-4.7 (Daily reset, $0.6/1M)
- Sign up: Zhipu AI
- Get API key from Coding Plan
- Dashboard → Add API Key:
  - Provider: `glm`
  - API Key: `your-key`
Use: glm/glm-5.1, glm/glm-5, glm/glm-4.7
Pro Tip: Coding Plan offers 3× quota at 1/7 cost! Reset daily 10:00 AM.
MiniMax M2.7 (5h reset, $0.20/1M)
- Sign up: MiniMax
- Get API key
- Dashboard → Add API Key
Use: minimax/MiniMax-M2.7, minimax/MiniMax-M2.5
Pro Tip: Cheapest option for long context (1M tokens)!
Kimi K2.5 ($9/month flat)
- Subscribe: Moonshot AI
- Get API key
- Dashboard → Add API Key
Use: kimi/kimi-k2.5, kimi/kimi-k2.5-thinking
Pro Tip: Fixed $9/month for 10M tokens = $0.90/1M effective cost!
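The per-million figures quoted for the cheap tier are simple arithmetic, shown here as two small helpers (hypothetical names) so flat and pay-per-token plans can be compared on the same scale.

```javascript
// Effective $/1M tokens for a flat monthly plan with a token allowance.
function effectiveCostPerMillion(monthlyFeeUsd, tokensPerMonth) {
  return monthlyFeeUsd / (tokensPerMonth / 1_000_000);
}

// Cost of a given token volume on a pay-per-token plan.
function payPerTokenCost(tokens, usdPerMillion) {
  return (tokens / 1_000_000) * usdPerMillion;
}

// Kimi K2.5: $9 flat for 10M tokens, i.e. about $0.90 per 1M tokens.
const kimiPerMillion = effectiveCostPerMillion(9, 10_000_000);
// GLM at $0.6/1M: 15M tokens cost about $9.
const glmCost = payPerTokenCost(15_000_000, 0.6);
```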
Kiro AI (Claude 4.5 + GLM-5 + MiniMax FREE)
Dashboard → Connect Kiro
→ AWS Builder ID, AWS IAM Identity Center, Google, or GitHub
→ Unlimited usage
Models:
kr/claude-sonnet-4.5
kr/claude-haiku-4.5
kr/glm-5
kr/MiniMax-M2.5
kr/qwen3-coder-next
kr/deepseek-3.2
Pro Tip: Best free option for Claude. No API key, no payment, fully unlimited.
OpenCode Free (No auth, auto-fetch models)
Dashboard → Connect OpenCode Free
→ No login required (passthrough proxy)
→ Models auto-fetched from opencode.ai/zen/v1/models
Pro Tip: Fastest setup. Just connect and start coding.
Vertex AI ($300 free credits for new GCP accounts)
Dashboard → Connect Vertex AI
→ Upload Google Cloud Service Account JSON
→ Enable Vertex AI API in your GCP project
Models:
vertex/gemini-3.1-pro-preview
vertex/gemini-3-flash-preview
vertex/gemini-2.5-flash
Vertex Partner (Anthropic / DeepSeek / GLM / Qwen via Vertex):
vertex-partner/glm-5-maas
vertex-partner/deepseek-v3.2-maas
vertex-partner/qwen3-next-80b-a3b-thinking-maas
Pro Tip: New Google Cloud accounts get $300 credits free for 90 days. Plenty for daily coding.
Example 1: Maximize Subscription → Cheap Backup
Dashboard → Combos → Create New
Name: premium-coding
Models:
1. cc/claude-opus-4-7 (Subscription primary)
2. glm/glm-5.1 (Cheap backup, $0.6/1M)
3. minimax/MiniMax-M2.7 (Cheapest fallback, $0.20/1M)
Use in CLI: premium-coding
Monthly cost example (100M tokens):
80M via Claude (subscription): $0 extra
15M via GLM: $9
5M via MiniMax: $1
Total: $10 + your subscription
Example 2: Free-Only (Zero Cost)
Name: free-combo
Models:
1. kr/claude-sonnet-4.5 (Claude 4.5 free unlimited)
2. kr/glm-5 (GLM-5 free via Kiro)
3. vertex/gemini-3.1-pro-preview ($300 free credits)
Cost: $0 forever (+ 20-40% token savings via RTK)!
Cursor IDE
Settings → Models → Advanced:
OpenAI API Base URL: http://localhost:20128/v1
OpenAI API Key: [from kRouter dashboard]
Model: cc/claude-opus-4-7
Or use combo: premium-coding
Claude Code
Edit ~/.claude/config.json:
{
  "anthropic_api_base": "http://localhost:20128/v1",
  "anthropic_api_key": "your-krouter-api-key"
}
Codex CLI
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-krouter-api-key"
codex "your prompt"
OpenClaw
Option 1 — Dashboard (recommended):
Dashboard → CLI Tools → OpenClaw → Select Model → Apply
Option 2 - Manual: Edit ~/.openclaw/openclaw.json:
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "krouter/kr/claude-sonnet-4.5"
      }
    }
  },
  "models": {
    "providers": {
      "krouter": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_krouter",
        "api": "openai-completions",
        "models": [
          {
            "id": "kr/claude-sonnet-4.5",
            "name": "Claude Sonnet 4.5 (Kiro Free)"
          }
        ]
      }
    }
  }
}
Note: OpenClaw only works with local kRouter. Use `127.0.0.1` instead of `localhost` to avoid IPv6 resolution issues.
Cline / Continue / RooCode
Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from dashboard]
Model: cc/claude-opus-4-7
VPS Deployment
# Clone and install
git clone https://github.com/sifxprime/krouter.git
cd krouter
npm install
npm run build:deploy
# Configure
export JWT_SECRET="your-secure-secret-change-this"
export INITIAL_PASSWORD="your-password"
export DATA_DIR="/var/lib/9router"
export PORT="20128"
export HOSTNAME="0.0.0.0"
export NODE_ENV="production"
export NEXT_PUBLIC_BASE_URL="http://localhost:20128"
export NEXT_PUBLIC_CLOUD_URL="https://9router.com"
export API_KEY_SECRET="endpoint-proxy-api-key-secret"
export MACHINE_ID_SALT="endpoint-proxy-salt"
# Start
npm run start
# Or use PM2
npm install -g pm2
pm2 start npm --name 9router -- start
pm2 save
pm2 startup
Docker
Published images (multi-platform linux/amd64 + linux/arm64):
- Docker Hub: `decolua/9router`
- GHCR: `ghcr.io/decolua/9router`
Quick start (use published image):
docker run -d \
--name 9router \
-p 20128:20128 \
-v "$HOME/.9router:/app/data" \
-e DATA_DIR=/app/data \
decolua/9router:latest
→ Open http://localhost:20128
Build from source (dev):
git clone https://github.com/decolua/9router.git
cd 9router/app
docker build -t 9router .
docker run -d --name 9router -p 20128:20128 \
-v "$HOME/.9router:/app/data" -e DATA_DIR=/app/data 9router
Container defaults:
- PORT=20128
- HOSTNAME=0.0.0.0
Useful commands:
docker logs -f 9router
docker restart 9router
docker stop 9router && docker rm 9router
docker pull decolua/9router:latest # update to latest
Data persistence: $HOME/.9router/db/data.sqlite on host ↔ /app/data/db/data.sqlite in container.
Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| JWT_SECRET | Auto-generated (~/.krouter/jwt-secret) | JWT signing secret for dashboard auth cookie (override to share across instances) |
| INITIAL_PASSWORD | 123456 | First login password when no saved hash exists |
| DATA_DIR | ~/.krouter | Main app data location (SQLite at $DATA_DIR/db/data.sqlite) |
| PORT | framework default | Service port (20128 in examples) |
| HOSTNAME | framework default | Bind host (Docker defaults to 0.0.0.0) |
| NODE_ENV | runtime default | Set production for deploy |
| BASE_URL | http://localhost:20128 | Server-side internal base URL used by cloud sync jobs |
| CLOUD_URL | https://9router.com | Server-side cloud sync endpoint base URL |
| NEXT_PUBLIC_BASE_URL | http://localhost:3000 | Backward-compatible/public base URL (prefer BASE_URL for server runtime) |
| NEXT_PUBLIC_CLOUD_URL | https://9router.com | Backward-compatible/public cloud URL (prefer CLOUD_URL for server runtime) |
| API_KEY_SECRET | endpoint-proxy-api-key-secret | HMAC secret for generated API keys |
| MACHINE_ID_SALT | endpoint-proxy-salt | Salt for stable machine ID hashing |
| ENABLE_REQUEST_LOGS | false | Enables request/response logs under logs/ |
| AUTH_COOKIE_SECURE | false | Force Secure auth cookie (set true behind HTTPS reverse proxy) |
| REQUIRE_API_KEY | false | Enforce Bearer API key on /v1/* routes (recommended for internet-exposed deploys) |
| HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY | empty | Optional outbound proxy for upstream provider calls |
Notes:
- Lowercase proxy variables are also supported: `http_proxy`, `https_proxy`, `all_proxy`, `no_proxy`.
- `.env` is not baked into Docker image (`.dockerignore`); inject runtime config with `--env-file` or `-e`.
- On Windows, `APPDATA` can be used for local storage path resolution.
- `INSTANCE_NAME` appears in older docs/env templates, but is currently not used at runtime.
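To illustrate what enforcing a Bearer key on `/v1/*` (the `REQUIRE_API_KEY` row) involves, here is a minimal sketch using Node's built-in `crypto.timingSafeEqual`. The helper names are hypothetical and this is not kRouter's middleware. Hashing both sides first is a design choice of this sketch, made so that the buffers passed to `timingSafeEqual` always have equal length.

```javascript
const crypto = require("node:crypto");

// Extracts the token from an "Authorization: Bearer <token>" header value.
function extractBearer(headerValue) {
  const match = /^Bearer\s+(\S+)$/i.exec(headerValue || "");
  return match ? match[1] : null;
}

// Compares two secrets without leaking where they first differ.
// timingSafeEqual throws on unequal lengths, so compare SHA-256 digests.
function safeEqual(a, b) {
  const ha = crypto.createHash("sha256").update(String(a)).digest();
  const hb = crypto.createHash("sha256").update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

function isAuthorized(headerValue, expectedKey) {
  const token = extractBearer(headerValue);
  return token !== null && safeEqual(token, expectedKey);
}
```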
Runtime Files and Storage
- Main app state: `${DATA_DIR}/db/data.sqlite` (SQLite: providers, combos, aliases, keys, settings, usage history)
- Auto backups: `${DATA_DIR}/db/backups/`
- Optional request/translator logs: `<repo>/logs/...` when `ENABLE_REQUEST_LOGS=true`
- Both `${DATA_DIR}` and `~/.9router` resolve to the same location in a Docker container: the symlink `/root/.9router -> /app/data` is created at build time.
📊 Available Models
Claude Code (cc/) - Pro/Max:
- cc/claude-opus-4-7
- cc/claude-opus-4-6
- cc/claude-sonnet-4-6
- cc/claude-sonnet-4-5-20250929
- cc/claude-haiku-4-5-20251001
Codex (cx/) - Plus/Pro:
- cx/gpt-5.5
- cx/gpt-5.4
- cx/gpt-5.3-codex
- cx/gpt-5.2-codex
- cx/gpt-5.1-codex-max
GitHub Copilot (gh/):
- gh/gpt-5.4
- gh/claude-opus-4.7
- gh/claude-sonnet-4.6
- gh/gemini-3.1-pro-preview
- gh/grok-code-fast-1
Cursor (cu/) - Subscription:
- cu/claude-4.6-opus-max
- cu/claude-4.5-sonnet-thinking
- cu/gpt-5.3-codex
- cu/kimi-k2.5
GLM (glm/) - $0.6/1M:
- glm/glm-5.1
- glm/glm-5
- glm/glm-4.7
MiniMax (minimax/) - $0.2/1M:
- minimax/MiniMax-M2.7
- minimax/MiniMax-M2.5
Kimi (kimi/) - $9/mo flat:
- kimi/kimi-k2.5
- kimi/kimi-k2.5-thinking
Kiro (kr/) - FREE unlimited:
- kr/claude-sonnet-4.5
- kr/claude-haiku-4.5
- kr/glm-5
- kr/MiniMax-M2.5
- kr/qwen3-coder-next
- kr/deepseek-3.2
OpenCode Free (oc/) - FREE no-auth:
- Auto-fetched from opencode.ai/zen/v1/models
Vertex AI (vertex/) - $300 free credits:
- vertex/gemini-3.1-pro-preview
- vertex/gemini-3-flash-preview
- vertex/gemini-2.5-flash
- vertex-partner/glm-5-maas
- vertex-partner/deepseek-v3.2-maas
🐛 Troubleshooting
"Language model did not provide messages"
- Provider quota exhausted → Check dashboard quota tracker
- Solution: Use combo fallback or switch to cheaper tier
Rate limiting
- Subscription quota out → Fallback to GLM/MiniMax
- Add combo: cc/claude-opus-4-7 → glm/glm-5.1 → kr/claude-sonnet-4.5
OAuth token expired
- Auto-refreshed by kRouter
- If issues persist: Dashboard → Provider → Reconnect
High costs
- Enable RTK in Dashboard → Endpoint settings (default ON, saves 20-40% tokens)
- Check usage stats in Dashboard
- Switch primary model to GLM/MiniMax
- Use free tier (Kiro, OpenCode Free, Vertex) for non-critical tasks
Dashboard opens on wrong port
- Set `PORT=20128` and `NEXT_PUBLIC_BASE_URL=http://localhost:20128`
First login not working
- Check `INITIAL_PASSWORD` in `.env`
- If unset, fallback password is `123456`
No request logs under logs/
- Set `ENABLE_REQUEST_LOGS=true`
🛠️ Tech Stack
- Runtime: Node.js 20+
- Framework: Next.js 16
- UI: React 19 + Tailwind CSS 4
- Database: SQLite (better-sqlite3 / node:sqlite / sql.js fallback)
- Streaming: Server-Sent Events (SSE)
- Auth: OAuth 2.0 (PKCE) + JWT + API Keys
📝 API Reference
Chat Completions
POST http://localhost:20128/v1/chat/completions
Authorization: Bearer your-api-key
Content-Type: application/json
{
"model": "cc/claude-opus-4-6",
"messages": [
{"role": "user", "content": "Write a function to..."}
],
"stream": true
}
List Models
GET http://localhost:20128/v1/models
Authorization: Bearer your-api-key
→ Returns all models + combos in OpenAI format
📧 Support
- GitHub: github.com/sifxprime/krouter
- Issues: github.com/sifxprime/krouter/issues
- npm: `@sifxprime/krouter`
🙏 Acknowledgments
Built on the shoulders of giants:
- decolua/9router — the upstream project this fork tracks. All core architecture, providers, dashboard, and ongoing feature work are authored upstream. kRouter only adds the security + stability hardening described in the About this fork section.
- CLIProxyAPI — original Go implementation that inspired the upstream JavaScript port.
- RTK: Rust token-saver. kRouter ports its compression pipeline to JS → −20-40% input tokens on every request.
- Caveman by @JuliusBrussee: viral "why use many token when few token do trick". kRouter adapts its prompt → −65% output tokens.
Huge thanks to these authors — without their work, kRouter's token-saving features wouldn't exist. ⭐ them on GitHub!
📄 License
MIT License - see LICENSE for details.
