subagent-router
v0.2.3
Route subagent work from primary coding agents to local, low-cost, or cloud model backends.
Subagent Router
Subagent Router lets primary coding agents delegate subagent tasks to alternate model backends such as DeepSeek, Ollama, Groq, and other OpenAI-compatible providers. It features a robust routing engine, fallback logic, and granular budget controls to ensure privacy, performance, and cost-efficiency.
The current Codex integration talks to this proxy through a local /v1/responses HTTP endpoint. The proxy normalizes requests to various backend formats, manages streaming SSE, and tracks usage across tasks, sessions, and days.

Install
From npm:
npm install -g subagent-router
From source:
cd subagent-router
python -m venv .venv
. .venv/bin/activate
pip install -e '.[server]'
Quick Start
Check your configuration:
subagent-router doctor
subagent-router paths
Install Codex integration files:
subagent-router init # same as --profile cost-optimization
Start the proxy:
# Foreground
DEEPSEEK_API_KEY=... subagent-router start
# Background
DEEPSEEK_API_KEY=... subagent-router start --background
subagent-router tui --watch
Run Codex with an ephemeral proxy:
DEEPSEEK_API_KEY=... subagent-router run -- codex
Features
- Multi-Provider Routing: Seamlessly switch between DeepSeek, local Ollama, and OpenAI-compatible endpoints (Groq, vLLM, etc.).
- Smart Fallbacks: Automatically retry failed requests on alternative backends.
- Budget Controls: Hard-stop or warn based on token usage or dollar cost per task, per session, or per day.
- Observability: Structured audit logs with deep token tracking (in/cache/out), real-time usage tracking, and a lightweight interactive Terminal UI (tui).
- Protocol Flexibility: First-class support for the Codex internal Responses protocol, while also transparently accepting standard OpenAI /v1/chat/completions messages payloads for drop-in compatibility with curl and standard libraries.
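Because the proxy accepts standard chat-completions payloads, any OpenAI-style HTTP client can talk to it. A minimal Python sketch using only the standard library; the port (8787, matching the mock example in Development) and the model name are illustrative assumptions, not defaults:

```python
import json
import urllib.request

# A standard OpenAI-style chat-completions payload; the model name is
# illustrative and should match whatever your configured provider serves.
payload = {
    "model": "llama-3.3-70b-versatile",
    "messages": [
        {"role": "user", "content": "Summarize the repo layout in one sentence."}
    ],
}

def post_chat(base_url: str = "http://127.0.0.1:8787") -> dict:
    """POST the payload to the proxy's /v1/chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same payload works unchanged against any OpenAI-compatible backend, which is what makes the proxy a drop-in target.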
Delegation Profiles
Installation profiles control how router subagents are used by the parent coding agent.
Pass --profile to subagent-router init to select one:
| Profile | Default | Parent model role | Delegation style |
|---|---|---|---|
| cost-optimization | yes | Minimal coordinator | Best-effort parent token minimization with compact output, retry caps, and selective delegation |
| deep-delegation | | Delegation coordinator and final acceptor | Maximizes router offload for exploration, implementation, review, and remediation |
| orchestrator | | Primary orchestrator | Keeps broader Codex/GPT-5.5 control while using router agents as bounded helpers |
| manual | | Explicit invocation only | Installs provider and role files without global automatic delegation |
Running subagent-router init without flags is equivalent to subagent-router init --profile cost-optimization.
Cost optimization is best-effort and measured through reduced parent Codex token
usage, not wall-clock time. It does not guarantee savings.
Use subagent-router init --profile deep-delegation to maximize offload to
router agents for experiments, external review, and quality-through-delegation.
Use subagent-router init --profile orchestrator to keep Codex/GPT-5.5 in
broader control. Use subagent-router init --profile manual, --mode opt-in,
or --mode provider-only when you do not want global automatic delegation.
--profile only affects --mode default. The opt-in and provider-only
modes install no global profile instructions and print a warning if --profile
is also supplied.
Installed agent roles (written to ~/.codex/agents/subagent-router-*.toml during init):
- subagent_router_explorer: read-only repo discovery, file mapping, call-path tracing, and scoped technical questions
- subagent_router_worker: delegated implementation, refactors, tests, and bounded bug fixes
- subagent_router_reviewer: first-pass code review, regression analysis, and implementation critique
See docs/usage.md for details on each profile and role.
Configuration
The router can be configured via environment variables or a config.toml file.
Common Environment Variables
- SUBAGENT_ROUTER_PROVIDER: Default provider (deepseek, ollama, openai-compatible)
- SUBAGENT_ROUTER_BUDGET_MODE: warn (default) or hard-stop
- SUBAGENT_ROUTER_MAX_COST_PER_DAY: Maximum daily spend in USD
- SUBAGENT_ROUTER_MAX_TOKENS_PER_SESSION: Token budget for the current session
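For a quick cost-capped setup, these variables can be combined in the shell before starting the proxy. The values below are illustrative, not recommendations:

```shell
# Illustrative values; adjust provider and limits to your setup.
export SUBAGENT_ROUTER_PROVIDER=deepseek
export SUBAGENT_ROUTER_BUDGET_MODE=hard-stop
export SUBAGENT_ROUTER_MAX_COST_PER_DAY=5.00
DEEPSEEK_API_KEY=... subagent-router start --background
```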
Example Config (config.toml)
[providers.groq]
type = "openai-compatible"
base_url = "https://api.groq.com/openai/v1"
model = "llama-3.3-70b-versatile"
[budgets]
max_cost_per_task = 0.05
max_cost_per_day = 5.00
mode = "hard-stop"
More configuration details are in docs/usage.md.
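Multiple providers can be declared side by side. A sketch of a local Ollama entry, assuming the provider table follows the same shape as the Groq example above; the type value, port (Ollama's default OpenAI-compatible endpoint), and model name are assumptions:

```toml
[providers.ollama]
type = "ollama"
base_url = "http://127.0.0.1:11434/v1"
model = "qwen2.5-coder:7b"
```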
Roadmap
See docs/ROADMAP.md for implemented and planned features including intelligent provider scoring and advanced routing policies.
Documentation
- Usage and configuration
- Architecture and behavior
- Protocol notes
- Troubleshooting
- Provider compatibility
- Test matrix
- Release checklist
- Changelog
Development
uv run pytest
Run a mock proxy for local checks:
subagent-router start --mock --port 8787
curl -sS http://127.0.0.1:8787/health
curl -sS http://127.0.0.1:8787/debug/activity
License
MIT. See LICENSE.
