runoff

v3.0.0

Published

24 days ago

Multi-step code-change pipelines for coding agents — race mode, git worktree isolation, local traces

Downloads

149

Vulnerabilities: 0 High, 0 Medium, 0 Low

zhangyiqian

mcp llm pipeline orchestration ai-agent dag coding-agent race-mode anthropic

runoff

Run two coding agents on the same task. Pick the winner.

runoff is a multi-step code-change pipeline for coding agents — declarative DAG, git worktree isolation, provider races, and local traces. Works as an MCP server (Cursor, Claude Desktop, Claude Code) or a standalone CLI.

  IDE / MCP host                runoff                coding-agent CLI
  ──────────────►   implement → review → retry   ──►   Claude Code / Codex / Gemini / …
                           ↑ race mode ↑
                   two providers, one task, you pick

Install

npx runoff init --work-dir /path/to/your/repo

Or clone to develop / self-host:

git clone https://github.com/alexangelzhang/runoff.git && cd runoff
npm install
npm run demo          # zero API keys — mock run with trace + experiment

Race mode

Put two providers in an array — they run in parallel, each in its own git worktree, and the pipeline pauses for you to pick:

{
  "pipeline": {
    "implement": [["claude-code", "opencode"]],
    "review":    ["claude-code", "implement"]
  }
}

candidate 0  (claude-code)      src/utils/format.ts  +27 lines
  formatRelativeTime(isoString: string)   — string input only

candidate 1  (opencode/DeepSeek)  src/utils/format.ts  +60 lines
  formatRelativeTime(dateInput: string | Date)  — accepts Date too
  + future dates ("2 hours from now"), week unit, edge-case guards

npx runoff race apply --session abc123 --winner 1

Same spec, two models, different API decisions. With raceFinalize: defer, you see both diffs before any code lands.
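The race flow described above (run providers in parallel, collect candidates, apply nothing until a winner is picked) can be sketched in a few lines. This is a minimal illustration, not runoff's implementation: the `Provider` and `Candidate` types and the `raceProviders` and `pickWinner` functions are hypothetical names, and git worktree isolation is left out entirely.

```typescript
// Hypothetical types for illustration only; not runoff's actual API.
type Provider = { name: string; run: (task: string) => Promise<string> };
type Candidate = { index: number; provider: string; diff: string };

// Run every provider on the same task in parallel. A provider that fails
// is dropped; the others still produce candidates. Nothing is applied here.
async function raceProviders(task: string, providers: Provider[]): Promise<Candidate[]> {
  const results = await Promise.all(
    providers.map((p, index) =>
      p.run(task).then(
        (diff): Candidate | null => ({ index, provider: p.name, diff }),
        (): Candidate | null => null,
      ),
    ),
  );
  return results.filter((c): c is Candidate => c !== null);
}

// Deferred finalization: the caller inspects all candidates first,
// then selects one by index (compare `--winner 1` in the CLI example).
function pickWinner(candidates: Candidate[], winner: number): Candidate {
  const chosen = candidates.find((c) => c.index === winner);
  if (!chosen) {
    throw new Error(`no candidate with index ${winner}`);
  }
  return chosen;
}
```

Keeping collection and selection as separate functions mirrors the pause in the pipeline: the diff of the winning candidate is only returned once a human (or a judge step) has chosen an index.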

→ Full mechanics: docs/features/race-mode.md
→ Real races with diffs: docs/reference/race-showcase.md (6 real runs, real providers, real design decisions)
→ Token cost data: docs/reference/benchmarks-data.md

Run on your repo

# 1. Generate pipeline.config.json for your repo
npx runoff init --work-dir /path/to/repo --profile feature

# 2. Verify config + backend connectivity
npx runoff doctor --config /path/to/repo/pipeline.config.json

# 3. Run a task
npx runoff run \
  --prompt "Add hello() with unit tests" \
  --work-dir /path/to/repo \
  --config /path/to/repo/pipeline.config.json

Edit config in a browser (providers, DAG, retry — saves via local HTTP):

npx runoff config edit --config /path/to/pipeline.config.json

Example configs: examples/configs/ — feature, bugfix, refactor, cli

Real CLI backends: docs/guides/coding-agent-backends.md — Codex, Gemini, Claude Code, OpenCode

MCP server

{
  "mcpServers": {
    "runoff": {
      "command": "npx",
      "args": ["runoff", "mcp"],
      "cwd": "/absolute/path/to/your/project"
    }
  }
}
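If the host already has other MCP servers configured, the entry above has to be merged in rather than pasted over the file. The sketch below shows one way to do that merge in memory. It is an assumption-laden illustration: `McpServer`, `HostConfig`, and `withRunoffServer` are hypothetical names, and this is not a description of what `npm run setup:mcp` does.

```typescript
// Hypothetical shapes for illustration; real host config files may carry more keys.
type McpServer = { command: string; args: string[]; cwd?: string };
type HostConfig = { mcpServers?: Record<string, McpServer> };

// Return a copy of the host config with a "runoff" server entry added,
// leaving any servers that were already configured untouched.
function withRunoffServer(
  config: HostConfig,
  projectDir: string,
): { mcpServers: Record<string, McpServer> } {
  return {
    ...config,
    mcpServers: {
      ...config.mcpServers,
      runoff: { command: "npx", args: ["runoff", "mcp"], cwd: projectDir },
    },
  };
}

// Starting from an empty config yields an entry equivalent to the JSON above.
const exampleConfig = withRunoffServer({}, "/absolute/path/to/your/project");
```

The function never mutates its input, so a caller can compare the old and new objects before writing anything to disk.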

Auto-configure for Cursor / Claude Desktop / Claude Code:

npm run setup:mcp

| Tool | Purpose |
|------|---------|
| runoff_run_pipeline | Full DAG + retries + checkpoints + race pause |
| runoff_run_step | Single step |
| runoff_query_traces / runoff_query_experiments | Local observability |
| runoff_race_apply / runoff_race_abort | Race finalization |

Full list + governance/memory tools: docs/README.md

Why runoff?

| | runoff | LangGraph | CrewAI | AutoGen | OpenHands |
|-|:------:|:---------:|:------:|:-------:|:---------:|
| Declarative config DAG (JSON) | ✅ | code-first | Crew/Task | code-first | UI + agent |
| Git worktree + lock contract | ✅ | — | — | — | partial |
| Provider race + judge pause | ✅ | — | — | — | — |
| MCP tool surface for IDE hosts | ✅ | optional | recent | — | different |
| Local trace + experiment eval | ✅ | +LangSmith | DIY | DIY | partial |

Full comparison: docs/reference/differentiation.md

Prerequisites

Node 20+, Python 3, Git

bash scripts/shell/check-prereqs.sh

Development & CI

| Command | Purpose |
|---------|---------|
| npm test | Full suite (~800 tests) |
| npm run ci:gates | IPC sync + gate e2e + unit tests |
| npm run ci:gates:smoke | PR smoke (allow-skip without secrets) |
| npm run check-ipc-sync | After src/core/ipc.ts changes |
| npm run typecheck | tsc --noEmit (required in CI) |

Documentation

Full index: docs/README.md

| Doc | Topic |
|-----|-------|
| getting-started-30min.md | First run → real repo |
| coding-agent-backends.md | Codex, Gemini, Claude Code, OpenCode |
| race-mode.md | Running multiple LLMs on the same step |
| observability.md | Trace + experiment (no LangSmith required) |
| differentiation.md | vs LangGraph, CrewAI, AutoGen, OpenHands |
| security-model.md | Threat model (self-hosted) |
| structure.md | src/ + scripts/ layout |
| advanced/ | A2A, Dream, Dreamify (optional) |

Features

Declarative DAG pipeline: implement → review → retry
Provider race mode with judge pause and worktree isolation
Governance: policy, guardrails, plan approval gate
Checkpoint / resume; durable run store
Local trace + experiment logs at ~/.runoff/ (no SaaS required)
Optional: external memory, Dream offline worker, A2A federation (experimental)
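The implement → review → retry ordering in the first feature can be pictured as a dependency graph that is resolved into an execution order. The sketch below is a generic topological ordering under assumed names: `StepGraph` and `executionOrder` are hypothetical, and the graph shape (step name mapped to the steps it depends on) is a simplification, not the pipeline.config.json schema.

```typescript
// Hypothetical shape: each step name maps to the step names it depends on.
type StepGraph = Record<string, string[]>;

// Resolve a dependency graph into a valid execution order.
// Each pass schedules every step whose dependencies have all been scheduled.
function executionOrder(graph: StepGraph): string[] {
  const order: string[] = [];
  let pending = Object.keys(graph);
  while (pending.length > 0) {
    const ready = pending.filter((step) =>
      graph[step].every((dep) => order.indexOf(dep) !== -1),
    );
    if (ready.length === 0) {
      // No step can make progress: a cycle or a dependency that was never declared.
      throw new Error(`cycle or unknown dependency among: ${pending.join(", ")}`);
    }
    order.push(...ready);
    pending = pending.filter((step) => ready.indexOf(step) === -1);
  }
  return order;
}

// Steps may be declared in any order; dependencies decide when they run.
const exampleOrder = executionOrder({
  review: ["implement"],
  implement: [],
  retry: ["review"],
});
```

Steps that become ready in the same pass have no dependency on each other, which is the point at which a scheduler could run them in parallel.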

License

MIT — LICENSE
