subagent-router
v0.2.3
Route subagent work from primary coding agents to local, low-cost, or cloud model backends.
Subagent Router
Subagent Router lets primary coding agents delegate subagent tasks to alternate model backends such as DeepSeek, Ollama, Groq, and other OpenAI-compatible providers. It features a robust routing engine, fallback logic, and granular budget controls to ensure privacy, performance, and cost-efficiency.
The current Codex integration talks to this proxy through a local /v1/responses HTTP endpoint. The proxy normalizes requests to various backend formats, manages streaming SSE, and tracks usage across tasks, sessions, and days.

Install
From npm:
npm install -g subagent-router
From source:
cd subagent-router
python -m venv .venv
. .venv/bin/activate
pip install -e '.[server]'
Quick Start
Check your configuration:
subagent-router doctor
subagent-router paths
Install Codex integration files:
subagent-router init # same as --profile cost-optimization
Start the proxy:
# Foreground
DEEPSEEK_API_KEY=... subagent-router start
# Background
DEEPSEEK_API_KEY=... subagent-router start --background
subagent-router tui --watch
Run Codex with an ephemeral proxy:
DEEPSEEK_API_KEY=... subagent-router run -- codex
Features
- Multi-Provider Routing: Seamlessly switch between DeepSeek, local Ollama, and OpenAI-compatible endpoints (Groq, vLLM, etc.).
- Smart Fallbacks: Automatically retry failed requests on alternative backends.
- Budget Controls: Hard-stop or warn based on token usage or dollar cost per task, per session, or per day.
- Observability: Structured audit logs with deep token tracking (in/cache/out), real-time usage tracking, and a lightweight interactive Terminal UI (tui).
- Protocol Flexibility: First-class support for the Codex internal Responses protocol, while also transparently accepting standard OpenAI /v1/chat/completions messages payloads for drop-in compatibility with curl and standard libraries.
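Because the proxy accepts standard chat-completions payloads, any OpenAI-style HTTP client can talk to it. A minimal Python sketch using only the standard library; the port (8787, matching the mock example in Development) and the model name are illustrative assumptions, not defaults:

```python
import json
import urllib.request

# A standard OpenAI-style chat-completions payload; the model name is
# illustrative and should match whatever your configured provider serves.
payload = {
    "model": "llama-3.3-70b-versatile",
    "messages": [
        {"role": "user", "content": "Summarize the repo layout in one sentence."}
    ],
}

def post_chat(base_url: str = "http://127.0.0.1:8787") -> dict:
    """POST the payload to the proxy's /v1/chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same payload works unchanged against any OpenAI-compatible backend, which is what makes the proxy a drop-in target.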
Delegation Profiles
Installation profiles control how router subagents are used by the parent coding agent.
Pass --profile to subagent-router init to select one:
| Profile | Default | Parent model role | Delegation style |
|---|---|---|---|
| cost-optimization | yes | Minimal coordinator | Best-effort parent token minimization with compact output, retry caps, and selective delegation |
| deep-delegation | | Delegation coordinator and final acceptor | Maximizes router offload for exploration, implementation, review, and remediation |
| orchestrator | | Primary orchestrator | Keeps broader Codex/GPT-5.5 control while using router agents as bounded helpers |
| manual | | Explicit invocation only | Installs provider and role files without global automatic delegation |
Running subagent-router init without flags is equivalent to subagent-router init --profile cost-optimization.
Cost optimization is best-effort and measured through reduced parent Codex token
usage, not wall-clock time. It does not guarantee savings.
Use subagent-router init --profile deep-delegation to maximize offload to
router agents for experiments, external review, and quality-through-delegation.
Use subagent-router init --profile orchestrator to keep Codex/GPT-5.5 in
broader control. Use subagent-router init --profile manual, --mode opt-in,
or --mode provider-only when you do not want global automatic delegation.
--profile only affects --mode default. The opt-in and provider-only
modes install no global profile instructions and print a warning if --profile
is also supplied.
Installed agent roles (written to ~/.codex/agents/subagent-router-*.toml during init):
- subagent_router_explorer: read-only repo discovery, file mapping, call-path tracing, and scoped technical questions
- subagent_router_worker: delegated implementation, refactors, tests, and bounded bug fixes
- subagent_router_reviewer: first-pass code review, regression analysis, and implementation critique
See docs/usage.md for details on each profile and role.
Configuration
The router can be configured via environment variables or a config.toml file.
Common Environment Variables
- SUBAGENT_ROUTER_PROVIDER: Default provider (deepseek, ollama, openai-compatible)
- SUBAGENT_ROUTER_BUDGET_MODE: warn (default) or hard-stop
- SUBAGENT_ROUTER_MAX_COST_PER_DAY: Maximum daily spend in USD
- SUBAGENT_ROUTER_MAX_TOKENS_PER_SESSION: Token budget for the current session
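For a quick cost-capped setup, these variables can be combined in the shell before starting the proxy. The values below are illustrative, not recommendations:

```shell
# Illustrative values; adjust provider and limits to your setup.
export SUBAGENT_ROUTER_PROVIDER=deepseek
export SUBAGENT_ROUTER_BUDGET_MODE=hard-stop
export SUBAGENT_ROUTER_MAX_COST_PER_DAY=5.00
DEEPSEEK_API_KEY=... subagent-router start --background
```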
Example Config (config.toml)
[providers.groq]
type = "openai-compatible"
base_url = "https://api.groq.com/openai/v1"
model = "llama-3.3-70b-versatile"
[budgets]
max_cost_per_task = 0.05
max_cost_per_day = 5.00
mode = "hard-stop"
More configuration details are in docs/usage.md.
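Multiple providers can be declared side by side. A sketch of a local Ollama entry, assuming the provider table follows the same shape as the Groq example above; the type value, port (Ollama's default OpenAI-compatible endpoint), and model name are assumptions:

```toml
[providers.ollama]
type = "ollama"
base_url = "http://127.0.0.1:11434/v1"
model = "qwen2.5-coder:7b"
```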
Roadmap
See docs/ROADMAP.md for implemented and planned features including intelligent provider scoring and advanced routing policies.
Documentation
- Usage and configuration
- Architecture and behavior
- Protocol notes
- Troubleshooting
- Provider compatibility
- Test matrix
- Release checklist
- Changelog
Development
uv run pytest
Run a mock proxy for local checks:
subagent-router start --mock --port 8787
curl -sS http://127.0.0.1:8787/health
curl -sS http://127.0.0.1:8787/debug/activity
License
MIT. See LICENSE.
