@agentic-swe/agentic-swe
v3.3.0
Autonomous SWE pipeline (Claude Code–first; Cursor, Codex, OpenCode, and compatible hosts) — install as a plugin (marketplace or --plugin-dir). Per-work state uses .worklogs/ in your project.
An open-source autonomous SWE pipeline that runs in your editor or CI, writes every decision into your repo, and gives you a shareable audit trail of what the AI did and why.
What you get:
- Structured PRs: Claude works through a state machine (lean / standard / rigorous), not one mega-prompt
- Cost-attributed decisions: every phase has a dollar amount and an artifact in .worklogs/<id>/
- Audit trail: /receipt renders a shareable summary suitable for a PR description, Slack, or a compliance ticket
Quickstart
npm install -g @agentic-swe/agentic-swe
claude --plugin-dir "$(agentic-swe path)"
Then in Claude Code:
/work Add retry logic to the API client
# ...pipeline runs, opens a PR...
/receipt
→ See Install & first run for Cursor, Codex, OpenCode, Gemini CLI, and Claude Code plugin marketplace alternatives.
What /receipt looks like
/receipt reads .worklogs/<id>/ and renders the work item as markdown. Sample from test/fixtures/receipt/lean-happy/:
# /work add-retry-logic — Add retry logic to the API client
| Field | Value |
|---|---|
| Work ID | add-retry-logic |
| Track | lean |
| Status | completed |
| Duration | 47 min |
| Cost | $1.84 |
| PR | https://github.com/example/repo/pull/142 |
## Decisions made (6)
1. **feasibility → lean-track-check** ($0.08) — lean signal → feasibility.md#L1-L20
2. **lean-track-check → lean-track-implementation** ($0.04) — verdict: simple
3. **lean-track-implementation → validation** ($1.33) — implementation complete → implementation.md
4. **validation → pr-creation** ($0.21) — tests green
5. **pr-creation → approval-wait** ($0.18) — PR opened
6. **approval-wait → completed** ($0.00) — approved by suraj
## Human gates respected (1)
- `approval-wait` resolved by user at 2026-05-17T14:47:00Z — approved by suraj
## Loop counters
- `self_review_iter`: 0
- `doubt_cycles`: 1
- `code_review_iter`: 0
## Verifiable references
- All artifacts: `test/fixtures/receipt/lean-happy/`
- Audit log: `test/fixtures/receipt/lean-happy/audit.log` (9 entries)
Every line above is computed from .worklogs/<id>/: no LLM summary, no hallucinated PRs. Reproduce locally:
node scripts/render-receipt.cjs --work-dir test/fixtures/receipt/lean-happy
Docs: agentic-swe.github.io/agentic-swe-site
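As a rough illustration of how a receipt total can follow from per-decision data rather than from a model-written summary, the sketch below sums the six decision costs shown in the sample receipt. The array shape and the field names (`from`, `to`, `costUsd`) are assumptions made for this example; they are not the actual audit.log or state.json schema, and this is not the code in scripts/render-receipt.cjs.

```javascript
// Hypothetical in-memory form of the six decisions in the sample receipt.
// Field names are illustrative assumptions, not the project's real schema.
const decisions = [
  { from: 'feasibility', to: 'lean-track-check', costUsd: 0.08 },
  { from: 'lean-track-check', to: 'lean-track-implementation', costUsd: 0.04 },
  { from: 'lean-track-implementation', to: 'validation', costUsd: 1.33 },
  { from: 'validation', to: 'pr-creation', costUsd: 0.21 },
  { from: 'pr-creation', to: 'approval-wait', costUsd: 0.18 },
  { from: 'approval-wait', to: 'completed', costUsd: 0.0 },
];

// Sum in integer cents so floating-point drift cannot change the total.
function totalCost(items) {
  const cents = items.reduce((sum, d) => sum + Math.round(d.costUsd * 100), 0);
  return `$${(cents / 100).toFixed(2)}`;
}

console.log(totalCost(decisions)); // prints "$1.84", matching the Cost row
```

Summing in cents is a design choice for the sketch: it keeps the rendered total stable no matter how many small amounts are added.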
Pipeline at a glance
After feasibility, lean-track-check sets pipeline.track in state.json. Tracks merge into PR creation → approval-wait → completed.
%%{init: {'theme': 'dark', 'fontFamily': 'ui-sans-serif, system-ui, -apple-system, Segoe UI, sans-serif'}}%%
flowchart TD
start(["/work — start or resume"])
feasibility["feasibility"]
check{"lean-track-check<br/>sets pipeline.track"}
lean["Lean track<br/>lean-track-implementation<br/>validation · pr-creation"]
std["Standard track<br/>design → verification → test-strategy<br/>implementation → self-review<br/>validation · pr-creation"]
rig["Rigorous track<br/>design → design-review<br/>verification → test-strategy<br/>implementation → self-review<br/>code-review → permissions-check<br/>validation · pr-creation"]
gate{{"approval-wait<br/>human gate"}}
done(["completed"])
start --> feasibility --> check
check -->|lean| lean
check -->|standard| std
check -->|rigorous| rig
lean --> gate
std --> gate
rig --> gate
gate --> done
classDef accent fill:#1f6feb,stroke:#58a6ff,color:#ffffff,stroke-width:2px
classDef step fill:#21262d,stroke:#30363d,color:#e6edf3,stroke-width:1px
classDef branch fill:#21262d,stroke:#388bfd,color:#c9d1d9,stroke-width:2px
classDef decide fill:#21262d,stroke:#d29922,color:#ffdfb8,stroke-width:2px
classDef gateNode fill:#21262d,stroke:#a371f7,color:#e6edf3,stroke-width:2px
class start,done accent
class feasibility step
class check decide
class lean,std,rig branch
class gate gateNode
Canonical transitions: state-machine.json and the fenced graph in CLAUDE.md (checked in CI).
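To make the idea of checked transitions concrete, here is a minimal sketch of a transition table for the lean track only, using the state names that appear in the sample receipt. The object shape and the helper names `canTransition` and `walk` are hypothetical; the real state-machine.json may be structured quite differently and also covers the standard and rigorous tracks.

```javascript
// Lean-track transitions taken from the sample receipt.
// The table shape is an assumption, not the schema of state-machine.json.
const LEAN_TRANSITIONS = {
  'feasibility': ['lean-track-check'],
  'lean-track-check': ['lean-track-implementation'],
  'lean-track-implementation': ['validation'],
  'validation': ['pr-creation'],
  'pr-creation': ['approval-wait'],
  'approval-wait': ['completed'],
  'completed': [],
};

// True only when `to` is listed as a successor of `from`.
function canTransition(table, from, to) {
  const known = Object.prototype.hasOwnProperty.call(table, from);
  return known && table[from].includes(to);
}

// Follow the first successor from `start` until a terminal or repeated state.
function walk(table, start) {
  const path = [start];
  const seen = new Set(path);
  let current = start;
  while (table[current] && table[current].length > 0) {
    current = table[current][0];
    if (seen.has(current)) break; // guard against cycles
    seen.add(current);
    path.push(current);
  }
  return path;
}

console.log(canTransition(LEAN_TRANSITIONS, 'validation', 'pr-creation')); // true
console.log(canTransition(LEAN_TRANSITIONS, 'feasibility', 'completed')); // false
console.log(walk(LEAN_TRANSITIONS, 'feasibility').join(' -> '));
```

A table like this lets a skipped phase (for example, jumping from feasibility straight to completed) be rejected by a lookup rather than by prompt instructions.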
Install & first run
Beyond the Quickstart above, alternate paths:
Claude Code (plugin marketplace)
/plugin marketplace add agentic-swe/agentic-swe
/plugin install agentic-swe@agentic-swe-catalog
Other hosts
| Host | How |
|------|-----|
| Cursor | curl -fsSL https://raw.githubusercontent.com/agentic-swe/agentic-swe/main/scripts/install-cursor-plugin.sh \| bash |
| Codex | .codex/INSTALL.md |
| OpenCode | .opencode/ |
| Gemini CLI | gemini-extension.json · GEMINI.md |
After enabling the plugin, run /install once to merge CLAUDE.md into your project and, optionally, add a .gitignore entry for .worklogs/. Maintainers: see docs/PUBLISHING.md.
→ Full installation guide · Golden path (~15 min)
Commands
| Command | Role |
|---------|------|
| /work | Start or resume a work item |
| /plan-only | Feasibility / design without implementation |
| /brainstorm | Design-first exploration (optional UI server) |
| /write-plan · /execute-plan | Plan bar then execution |
| /check budget · /check transition · /check artifacts | Enforcement before phases / transitions |
| /subagent | Browse / invoke specialists |
| /repo-scan · /test-runner · /lint | Evidence helpers |
Full list: Usage · commands/
Subagents
138+ specialists under agents/subagents/
Across 10 categories — Language Specialists (29), Infrastructure (16), Specialized Domains (15), Quality & Security (14), Data & AI (13), Developer Experience (13), Business & Product (11), Core Development (10), Meta & Orchestration (10), Research & Analysis (7).
Details: Subagent catalog · Catalog routing
Work state
.worklogs/<id>/ holds state.json (source of truth), progress.md (timeline), audit.log (append-only), and per-phase markdown (e.g. feasibility.md, implementation.md, validation-results.md, pr-link.txt).
- State over chat: resume from files, not from thread memory alone.
- Evidence: tie claims to commands, paths, or CI (templates/evidence-standard.md).
- CI parity: scripts/work-engine.cjs enforces /check-style rules.
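The file-based state described above can be pictured with a small sketch: read the track from state.json and append to audit.log without rewriting earlier entries. Only the `pipeline.track` field and the file names come from this README. The helper names `readTrack` and `appendAudit`, the one-JSON-object-per-line log format, and the entry fields are assumptions for illustration, and the demo writes to a temporary directory rather than a real .worklogs/<id>/.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Read pipeline.track from state.json; the rest of the schema is unknown here.
function readTrack(workDir) {
  const raw = fs.readFileSync(path.join(workDir, 'state.json'), 'utf8');
  const state = JSON.parse(raw);
  return state.pipeline ? state.pipeline.track : undefined;
}

// Append one entry per line; existing lines are never rewritten.
// JSON-lines is an assumed format, not the documented audit.log format.
function appendAudit(workDir, entry) {
  fs.appendFileSync(path.join(workDir, 'audit.log'), JSON.stringify(entry) + '\n');
}

// Demo against a throwaway directory standing in for .worklogs/<id>/.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'worklog-'));
fs.writeFileSync(
  path.join(workDir, 'state.json'),
  JSON.stringify({ pipeline: { track: 'lean' } })
);
appendAudit(workDir, { event: 'resumed' });
appendAudit(workDir, { event: 'transition', to: 'validation' });

const auditLines = fs
  .readFileSync(path.join(workDir, 'audit.log'), 'utf8')
  .trim()
  .split('\n');

console.log(readTrack(workDir)); // prints "lean"
console.log(auditLines.length); // prints 2
```

Because everything needed to continue lives in the directory, a new session could pick up the work item by reading these files instead of relying on chat history.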
Architecture
A single Hypervisor session (this one) owns transitions, gates, and synthesis. Three core agents (developer, git-operations, pr-manager) carry bounded work. A design panel (architect, security, adversarial) reviews in parallel on the rigorous track. All of them consult the catalog of 138+ subagents, with specialists auto-selected from repo signals.
→ Architecture overview (full diagram)
Extending · CI · License
| Topic | Where |
|-------|--------|
| Extend pipeline | /author-pipeline · references/authoring-pipeline-capabilities.md |
| CI | .github/workflows/ci.yml — npm run ci locally |
| Research basis | CLAUDE.md — Research basis |
| License | MIT · Licensing |
