work-kit-cli
v0.7.0
Published
Structured development workflow for Claude Code. Two modes, 7 phases, 31 steps, plus debug recovery.
Maintainers
Readme
work-kit
Structured development workflow for Claude Code. Two modes, 7 phases, 13 steps, plus auto-debug recovery — orchestrated by a TypeScript CLI with reusable skill files. DDD-disciplined end-to-end (Plan models the domain; Build implements it).
What's new
- Structured receipts +
work-kit run— every step now writes a JSON receipt that the CLI validates and derives the step outcome from. Agents no longer pick--outcomeflags. The orchestrator loop collapses to a singlework-kit rundriver: it tells you what agent to spawn and what bash to run after it. Routing decisions are pure functions of receipt fields, not free-form model calls. - Triage phase replaces Define as the first phase. One step (Classify) — picks the work class (bug-fix / small-change / refactor / feature / large-feature), and the CLI builds the workflow. Refine and Spec moved into Plan/Understand.
- Review/Triage step renamed to Review/Scope to disambiguate from the new front Triage.
- DDD discipline baked into Plan and Build — bounded contexts, aggregates, value objects, repository contracts in Plan; TDD-implemented cleanly inside
build/implement. - Build collapsed to 3 steps — Setup (branch + deps + migrations), Implement (full TDD cycle internally: Red → Core → UI → Refactor → Integration), Commit.
- Plan collapsed to 3 steps — Understand (refine + spec for features, then criteria + investigation), Design, Audit.
- wk-debug triage skill auto-fires when any step reports
needs_debug, then the originating step retries (max 2 iterations). Not user-invocable. - Test collapsed to 2 steps — Exercise (fans out 3 parallel lens sub-agents internally: Verify, E2E, Browser) and Validate (criteria mapping + verdict). Browser lens drives the running app via Chrome DevTools MCP and skips gracefully if the MCP isn't installed.
- Decisions matching
**<context>**: chose X over Y — <why>auto-graduate into.work-kit-knowledge/decisions.mdat wrap-up.
Installation
One-step setup (recommended):
npx work-kit-cli setupThis auto-detects Claude Code projects in your workspace and installs the skill files. If multiple projects are found, it lists them for you to choose.
Global install:
npm install -g work-kit-cliQuick start
# 1. Set up work-kit in your project
npx work-kit-cli setup
# 2. In Claude Code, start a workflow
/full-kit add user avatar upload # strict, all phases
/auto-kit fix login redirect bug # dynamic, only needed stagesModes
/full-kit <description>
Runs every phase and step in strict order. No shortcuts.
Best for: large features, new systems, maximum rigor.
/auto-kit <description>
Classifies the request (bug-fix, small-change, refactor, feature, large-feature) and builds a dynamic workflow with only the steps needed.
Best for: bug fixes, small changes, refactors, well-understood tasks.
CLI commands
| Command | Description |
|---------|-------------|
| init <description> | Initialize a new workflow with a task description |
| run | Skinny orchestrator driver — returns the next imperative action with the bash to run after it. The orchestrator skill loops on this. |
| run --finished <phase>/<step> | Mark a step finished (reads its receipt, derives the outcome) and return the next action |
| next | Advance to the next step (low-level — run wraps this) |
| complete | Mark a step complete (low-level — run wraps this) |
| status | Show current workflow state (phase, step, progress) |
| context | Generate context summary for the current phase |
| validate | Validate state integrity and phase prerequisites |
| loopback | Route back to a previous stage (max 2 per route) |
| workflow | Display the full workflow plan |
| pause | Pause the active session (state preserved on disk) |
| resume [--slug <slug>] | Without --slug: list resumable sessions in this repo. With --slug: resume the named session |
| observe [--all] | Live TUI dashboard of active/paused/completed sessions. --all watches every work-kit project on the system |
| doctor | Run environment health checks (supports --json) |
| setup | Install work-kit skills into a Claude Code project |
Picking up where you left off
work-kit resume (or /resume-kit in Claude Code) scans every worktree of the current repo for .work-kit/tracker.json files in paused or in-progress state and lets you pick one. It works from the main repo root — no need to cd into a worktree first. In-progress sessions are listed too, so a terminal you closed without pausing can be recovered: just look for the row with a stale lastUpdatedAgoMs.
Watching multiple projects
work-kit observe --all discovers every work-kit-enabled repo from your ~/.claude/projects/ history and watches them all in one dashboard. Each row shows the project name, work item slug, mode, type, current state, and worktree.
Phases
| Phase | Steps | Agent | |-------|-----------|-------| | Triage | Classify | Single (always — picks the work class) | | Plan | Understand (refine + spec + criteria + investigate), Design, Audit | Single (DDD-disciplined) | | Build | Setup, Implement, Commit | Single (Implement runs the full TDD cycle internally, DDD-disciplined) | | Test | Exercise, Validate | Exercise (fans out 3 parallel lens sub-agents internally: Verify, E2E, Browser) → Validate (criteria mapping + verdict) | | Review | Scope, Review, Resolve | Scope → Review (fans out 4 parallel reviewer sub-agents internally: Quality, Efficiency, Security, Compliance) → Resolve (fix + ship decision) | | Deploy | Ship | Single (optional — runs pre-flight, merge, monitor, remediate as one autonomous flow) | | Wrap-up | Summary, Knowledge | Single |
Architecture
Context management
Each phase runs as a fresh agent. The Build agent doesn't carry Plan's investigation notes — no context bloat.
Phases communicate through Final sections in .work-kit/state.md. Each phase writes a ### <Phase>: Final section that the next phase reads.
State management
Dual state files in .work-kit/:
- tracker.json — state machine (current phase, step, transitions, loop-back counts)
- state.md — content (working notes, Final sections, accumulated context)
All writes are atomic to prevent state corruption.
Parallel agents
- Test phase: Exercise fans out 3 parallel lens sub-agents (Verify, E2E, Browser) using the Agent tool, then Validate runs sequentially to map criteria and produce the verdict
- Review phase: Scope classifies the diff, then Review fans out 4 parallel reviewer sub-agents (Quality, Efficiency, Security, Compliance) using the Agent tool, then Resolve aggregates findings, fixes aggressively, and makes the ship decision
Loop-back routing
Any stage can route back to a previous stage. Each route is enforced with a max count of 2 to prevent infinite loops.
Validation
- full-kit: phase-level prerequisites (Plan before Build before Test...)
- auto-kit: step-level validation against the
## Workflowchecklist - Both modes refuse to skip ahead
Output
.work-kit-tracker/
index.md # log of all completed work (links to summaries + archives)
archive/
avatar-upload-2026-04-03/
state.md # full phase outputs
tracker.json # full JSON tracker (phases, timing, status)
summary.md # distilled wrap-up summaryRepo structure
work-kit/
cli/
src/
commands/ # CLI command implementations
config/ # Configuration and defaults
context/ # Context generation for phases
engine/ # Workflow engine and transitions
state/ # State machine and file management
index.ts # Entry point
skills/
full-kit/SKILL.md # /full-kit orchestrator (all phases, strict order)
auto-kit/SKILL.md # /auto-kit orchestrator (dynamic, only needed steps)
wk-bootstrap/SKILL.md # /wk-bootstrap session detection
wk-triage/SKILL.md # Triage phase runner (1 step: classify)
wk-triage/steps/ # 1 step file: classify
wk-plan/SKILL.md # Plan phase runner
wk-plan/steps/ # 3 step files: understand (refine+spec+criteria+investigate), design, audit
wk-build/SKILL.md # Build phase runner
wk-build/steps/ # 3 step files: setup, implement, commit
wk-test/SKILL.md # Test phase runner
wk-test/steps/ # 2 step files: exercise (3 parallel lens sub-agents internally), validate
wk-review/SKILL.md # Review phase runner
wk-review/steps/ # 3 step files: scope, review (4 parallel lens sub-agents internally), resolve
wk-deploy/SKILL.md # Deploy phase runner
wk-deploy/steps/ # 1 step file: ship (pre-flight + merge + monitor + remediate)
wk-wrap-up/SKILL.md # Wrap-up: summary + knowledge harvest
wk-wrap-up/steps/ # 2 step files: summary, knowledge
wk-debug/SKILL.md # Auto-debug triage (not user-invocable)
pause-kit/SKILL.md # /pause-kit — pause active session
resume-kit/SKILL.md # /resume-kit — resume paused session
cancel-kit/SKILL.md # /cancel-kit — cancel and clean up session
package.jsonRequirements
- Node.js >= 18
- Git
- Claude Code
