@pieerry/harness-kit
v4.1.0
Claude Code harness for product + engineering delivery. From idea to merged PR, one pipeline.
harness-kit
From idea to merged PR. One pipeline. Six stages. Or spec-driven loop until the spec is met.

~2min walkthrough · agents · skills · install · 6 commands · sensors+evals matrix · auto-watch PR until merged.
What it is
Two Claude Code agents — product-manager and staff-software-engineer — sharing one pipeline:
prd → prp → plan → dev → test → pr

Each stage produces a markdown artifact, gated by deterministic sensors (pass/fail) and a scored eval (≥ 8.0). After the PR opens, an in-session monitor watches for merge.
For local-only work, two extra modes skip the PR stage:
/sse:run --local    plan → dev → test → STOP             (single shot, no loop)
/sse:sdd            plan → [dev↔test↔eval]×3 → STOP     (spec-driven goal loop)

/sse:sdd is the SDD variant: the PRP is the spec, and an independent supervisor session re-checks the repo against Success criteria (verifiable) + Validation gates after every dev↔test iteration. The PR is never auto-opened; the user runs /sse:pr after reviewing the loop transcript.
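The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the kit's actual implementation: `run_dev_test` and `supervise` are hypothetical callables standing in for one dev↔test iteration and the independent supervisor session, and the `Verdict` shape and the returned dictionary are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_ITERS = 3  # cap taken from the README ("x3")


@dataclass
class Verdict:
    """Hypothetical supervisor result: PASS/FAIL plus what is still unmet."""
    passed: bool
    unmet: List[str] = field(default_factory=list)
    next_iter_focus: Optional[str] = None


def sdd_loop(
    run_dev_test: Callable[[Optional[str]], None],
    supervise: Callable[[], Verdict],
    max_iters: int = MAX_ITERS,
) -> dict:
    """Iterate dev/test, then ask the supervisor, until PASS or the cap is hit."""
    focus: Optional[str] = None
    verdict = Verdict(passed=False)
    for i in range(1, max_iters + 1):
        run_dev_test(focus)      # one dev<->test iteration, guided by the last hint
        verdict = supervise()    # independent re-check of the repo against the spec
        if verdict.passed:
            return {"status": "PASS", "iterations": i, "unmet": []}
        focus = verdict.next_iter_focus  # FAIL: carry the hint into the next pass
    # Cap reached without PASS: report a blocker listing the unmet criteria.
    return {"status": "BLOCKED", "iterations": max_iters, "unmet": verdict.unmet}
```

Note that the sketch never opens a PR in either branch, which mirrors the manual /sse:pr gate.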
Install
npm i -g @pieerry/harness-kit
hk install

Restart Claude Code. Done.
Without npm:
git clone https://github.com/Pierry/harness-kit ~/.harness-kit
bash ~/.harness-kit/setup/install.sh

CLI: hk install · hk update · hk uninstall · hk status · hk version.
Getting started
Pick the flow that matches the task. All of them share the same pipeline state, so you can switch between them mid-feature.
Big task — full pipeline (PM + Eng)
A new feature with stakes, ambiguity, or a Jira ticket attached. You want a written PRD, a thought-through PRP, a plan, code, tests, and a PR.
/product-manager:run # drafts PRD then PRP, with sensor + eval gates
/sse:run # plans, implements, tests, opens PR, watches for merge

Approve each artifact when prompted. The status bar tracks where you are in the six stages.
Spec only — no code yet
You need the PRD and PRP to align with stakeholders before any engineering work. Stop after the PRP.
/product-manager:run

When engineering is ready, hand them the repo and they run /sse:run against the approved PRP.
Dev only — small change, plan in your head
A bug fix, a small enhancement, or a refactor where writing a PRD would be theatre. Skip PM, run engineering directly.
/sse:run # plan → dev → test → PR
/sse:run --local # plan → dev → test, stop before PR (push manually later)

Or run a single stage if that's all you need:
/sse:plan # just the plan
/sse:dev # just the code (against an approved plan)
/sse:test # just the tests
/sse:pr # just open the PR

Spec-driven loop: iterate locally until the PRP is satisfied
You have an approved PRP and want Claude to loop dev↔test until the spec actually passes, judged by an independent supervisor session. No PR until you say so.
/sse:sdd # plan once + dev↔test↔spec-satisfied eval, cap 3 iters
# review .claude/runtime/outputs/sse/sdd/{feature_id}.md
/sse:pr # manual gate when ready

The loop predicate is built from the PRP's Success criteria (verifiable) and Validation gates sections. Both must be present and concrete, or the prp-has-acceptance-criteria sensor blocks before the first iteration runs. If the cap is hit without a PASS verdict, the loop returns a blocker listing the unmet criteria.
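A pre-flight check of this kind might look like the sketch below. It is an assumption-heavy illustration, not the shipped prp-has-acceptance-criteria sensor: it assumes the PRP uses markdown `#` headings with exactly these two titles, and it approximates "concrete" as "has at least one bullet or numbered item".

```python
import re
from typing import Dict, List

# Section titles as named in the README.
REQUIRED_SECTIONS = ["Success criteria (verifiable)", "Validation gates"]


def split_sections(markdown: str) -> Dict[str, List[str]]:
    """Map each markdown heading title to the non-blank lines beneath it."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


def prp_has_acceptance_criteria(markdown: str) -> List[str]:
    """Return a list of problems; an empty list means the check passes."""
    sections = split_sections(markdown)
    problems = []
    for title in REQUIRED_SECTIONS:
        if title not in sections:
            problems.append(f"missing section: {title}")
            continue
        # "Concrete" is approximated here as at least one list item (assumption).
        items = [l for l in sections[title] if re.match(r"^([-*]|\d+\.)\s+\S", l)]
        if not items:
            problems.append(f"no concrete items under: {title}")
    return problems
```

Returning the list of problems rather than a bare boolean keeps the check deterministic while still telling the author what to fix before the loop starts.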
Resume — pick up where you left off
Closed the session, restarted Claude Code, or got interrupted. State persists at .claude/.pipeline-state.json.
/pipeline:continue # next pending stage for the active feature
/pipeline:reset # abandon the active run and start fresh

When the PR merges, the in-session monitor clears state automatically.
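To make the resume behaviour concrete, here is a small sketch of how a "next pending stage" lookup could work. The file path and stage order come from this README; the JSON schema (`{"stages": {"prd": "approved", ...}}`) is purely hypothetical, since the real layout of .claude/.pipeline-state.json is not documented here.

```python
import json
from pathlib import Path
from typing import Optional

STAGES = ["prd", "prp", "plan", "dev", "test", "pr"]  # order from the README
STATE_PATH = Path(".claude/.pipeline-state.json")     # path from the README


def load_state(path: Path = STATE_PATH) -> Optional[dict]:
    """Return the persisted state, or None when no state file exists."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def next_pending_stage(state: dict) -> Optional[str]:
    """First stage, in pipeline order, that is not yet marked approved.

    Assumes a hypothetical schema: {"stages": {"prd": "approved", ...}}.
    """
    done = state.get("stages", {})
    for stage in STAGES:
        if done.get(stage) != "approved":
            return stage
    return None  # every stage approved: nothing left to resume
```

With `prd` and `prp` marked approved under this assumed schema, `next_pending_stage` returns `"plan"`.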
Use it
/product-manager:run          draft PRD then PRP
/sse:run                      plan, dev, test, open PR, watch for merge
/sse:run --local              plan, dev, test; stop before PR
/sse:sdd                      spec-driven loop: dev↔test↔eval until PRP met, no PR
/context:pack <feature_id>    repomix snapshot of target repo (per-feature cache)
/context:graph [repo]         graphify knowledge graph of a target repo (per-repo cache)
/pipeline:continue            resume next pending stage
/pipeline:reset               abandon active run

Need just one stage? Each is its own slash command:
| Stage | Command | Gates |
|---|---|---|
| prd | /product-manager:prd | prd-structure, prd-acceptance-criteria · prd-quality, prd-readiness |
| prp | /product-manager:prp | prp-structure, prp-context-quality, prp-links, link-validator · prp-quality, prp-context-readiness |
| plan | /sse:plan | plan-structure · plan-quality |
| dev | /sse:dev | code-conventions, test-coverage, dev-structure · dev-quality |
| test | /sse:test | test-structure · test-quality |
| pr | /sse:pr | pr-structure · pr-quality · auto-arms /sse:pr-monitor |
| sdd | /sse:sdd | prp-has-acceptance-criteria (pre-flight) · spec-satisfied per iter (fresh session) · cap 3 iters |
Sensors block on failure (Claude regenerates). Evals score; threshold 8.0; retried up to 3 times. SDD eval returns PASS/FAIL — FAIL re-enters the loop with a next_iter_focus hint.
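The gating rule above can be sketched as a single function. This is a minimal illustration, not the kit's runner: `generate`, `sensors`, and `evaluate` are hypothetical callables, and the sketch assumes sensor failures and low eval scores share one budget of three attempts, which the README does not spell out.

```python
from typing import Callable, Dict, List, Tuple

EVAL_THRESHOLD = 8.0  # from the README
MAX_ATTEMPTS = 3      # assumption: one shared budget of three attempts


def run_gate(
    generate: Callable[[int], str],
    sensors: Dict[str, Callable[[str], bool]],
    evaluate: Callable[[str], float],
    threshold: float = EVAL_THRESHOLD,
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[bool, List[str]]:
    """Return (passed, log) for one stage artifact."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        artifact = generate(attempt)
        # Sensors are deterministic pass/fail checks; any failure blocks.
        failed = [name for name, check in sensors.items() if not check(artifact)]
        if failed:
            log.append(f"attempt {attempt}: sensors failed: {', '.join(failed)}")
            continue  # regenerate before spending an eval
        # The eval is only scored once every sensor has passed.
        score = evaluate(artifact)
        if score >= threshold:
            log.append(f"attempt {attempt}: passed with score {score:.1f}")
            return True, log
        log.append(f"attempt {attempt}: score {score:.1f} below {threshold:.1f}")
    return False, log
```

Running sensors before the eval keeps the cheap deterministic checks ahead of the LLM-judged score, so a structurally broken artifact never reaches scoring.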
Agents
Registered in AGENTS.md at the repo root. Each ships its own sensors, evals, guides, skills.
product-manager — turns a problem into an engineering-ready spec
- Skills: prd, prp
- Sensors: 5 (structure + acceptance criteria + cross-links)
- Evals: 4 (quality + readiness for each of PRD, PRP)
- Guides: pipeline.md, prd-guidelines.md, prp-guidelines.md, writing-style.md, templates/, examples/
- Full docs →
staff-software-engineer — turns an approved PRP into a merged PR (or a satisfied spec)
- Skills: backend, web, mobile, devops (auto-detected from repo)
- Sensors: 7 (plan-structure, code-conventions, test-coverage, dev-structure, test-structure, pr-structure, prp-has-acceptance-criteria)
- Evals: 5 (plan, dev, test, pr quality; spec-satisfied supervisor for SDD loop)
- Guides: pipeline.md, coding-style.md, commit-style.md, conventions-override.md, sdd-loop.md
- Modes: /sse:run (full pipeline), /sse:run --local (skip PR), /sse:sdd (spec-driven loop)
- Full docs →
Anatomy of every stage
GUIDE    how to write it        pipeline.md · coding-style.md
REF      context to pull in     AGENTS.md · prp/<feature>.md · conventions/{area}.md
SENSOR   must-pass structure    deterministic, blocks approval
EVAL     scored rubric          LLM-judge, threshold 8.0

An approval marker (<!-- approved: -->) gates the next stage. Token spend per phase is appended as an inline <!-- tokens: ... --> comment.
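As an illustration of marker-based gating, the sketch below looks for those two HTML comments in an artifact. It assumes only the comment prefixes shown above (`approved:` and `tokens:`); whatever the kit writes after the colon is treated as opaque text, and both helper names are hypothetical.

```python
import re
from typing import List

# Match the comment forms shown in the README; the payload is left unparsed.
APPROVED_RE = re.compile(r"<!--\s*approved:(.*?)-->", re.DOTALL)
TOKENS_RE = re.compile(r"<!--\s*tokens:(.*?)-->", re.DOTALL)


def is_approved(artifact: str) -> bool:
    """True when the artifact carries an approval marker comment."""
    return APPROVED_RE.search(artifact) is not None


def token_notes(artifact: str) -> List[str]:
    """Raw payload of every tokens comment, in document order."""
    return [payload.strip() for payload in TOKENS_RE.findall(artifact)]
```

Because both markers are HTML comments, they stay invisible when the markdown artifact is rendered.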
Status bar
Live indicator at the bottom of every Claude Code session:
idle · /product-manager:run · /sse:run · /pipeline:continue
billing-fix [prd+prp+plan+dev+test+pr] · prp approved · plan drafting · next /sse:plan · sensor: plan-structure
billing-fix · complete (prd/prp/plan/dev/test/pr)

State persists at .claude/.pipeline-state.json. Close the session and reopen it, and /pipeline:continue picks up at the next pending stage. When the PR merges, state auto-clears.
Project conventions
The SSE agent has defaults per area. Override per repo:
{your-repo}/.claude/conventions/{backend,web,mobile,devops}.md

Add only the area files you need. The agent reads them on top of the defaults. See conventions-override.md.
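One way to picture the override lookup is the sketch below. The directory and area names come from this README; how the agent actually layers overrides on defaults is not documented here, so appending the override text after the default text is an assumption, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Dict

AREAS = ("backend", "web", "mobile", "devops")  # areas named in the README


def find_overrides(repo_root: Path) -> Dict[str, Path]:
    """Map each area to its override file, for the files that exist."""
    conventions = repo_root / ".claude" / "conventions"
    return {
        area: conventions / f"{area}.md"
        for area in AREAS
        if (conventions / f"{area}.md").is_file()
    }


def effective_conventions(defaults: Dict[str, str], repo_root: Path) -> Dict[str, str]:
    """Defaults first, then the repo's override text appended per area."""
    merged = dict(defaults)
    for area, path in find_overrides(repo_root).items():
        override = path.read_text(encoding="utf-8")
        merged[area] = (merged.get(area, "") + "\n\n" + override).strip()
    return merged
```

Areas without an override file keep their default text unchanged, which matches the "only the area files you need" rule.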
Layout
What hk install lays down in your repo:
{your-repo}/
├── AGENTS.md                      agent registry + routing
├── CLAUDE.md                      workspace style + role
└── .claude/
    ├── agents/                    agent definitions (sensors, evals, guides, skills)
    ├── commands/                  slash command entry points (pm, sse, context, pipeline)
    ├── shared/                    cross-agent guides (context-strategy.md)
    ├── hooks/                     status-line + lifecycle hooks
    ├── scripts/                   pipeline.py · activity.py · pr-monitor.py · pack-repo.sh · graph-repo.sh
    ├── runtime/
    │   ├── hooks/<agent>/         per-agent lifecycle (post-write, post-eval, pre-prp-check)
    │   ├── scripts/<agent>/       per-agent utilities (sensor-runner, token-phase, link-validator)
    │   ├── outputs/{pm,sse}/      generated artifacts, markers, tokens (incl. sse/sdd/ loop transcripts)
    │   └── cache/                 repomix packs + graphify graphs (optional, gitignored)
    ├── conventions/               your per-repo overrides
    └── settings.json              hook wiring

Full path-by-path map in AGENTS.md.
Tooling
| Tool | Why | Required |
|------|-----|----------|
| Claude Code | agent runtime | yes |
| python3 | sensors, token accounting, pipeline state | yes |
| gh CLI | opens PR, polls for merge | for /sse:pr |
| git | branch + commit ops | yes |
| repomix | snapshot target repo for AI context (/context:pack) | optional |
| graphify | queryable knowledge graph of a repo (/context:graph) | optional |
Install optional tools:
npm i -g repomix # or: brew install repomix
uv tool install graphifyy # or: pipx install graphifyy (CLI cmd is `graphify`)

hk install detects both and prints a hint if either is missing; it never auto-installs them. See .claude/shared/context-strategy.md for when each tier is worth it (grep vs pack vs graph).
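A detect-and-hint check like the one described can be sketched with the standard library. This is not the code hk install runs: the function name and message wording are hypothetical, and only the executable names and install commands are taken from this README.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Executable name -> install command, both taken from the README.
OPTIONAL_TOOLS: Dict[str, str] = {
    "repomix": "npm i -g repomix",
    "graphify": "uv tool install graphifyy",
}


def missing_tool_hints(
    tools: Dict[str, str] = OPTIONAL_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """One hint line per optional tool that is not on PATH; installs nothing."""
    return [
        f"{name} not found; to install: {hint}"
        for name, hint in tools.items()
        if which(name) is None
    ]
```

Calling `missing_tool_hints()` and printing each returned line reproduces the hint-only behaviour; the `which` parameter exists so the lookup can be stubbed in tests.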
Other optional: jq for token JSON queries. JIRA_USERNAME + JIRA_API_TOKEN to publish PRD/PRP to Confluence.
MIT. Built on Claude Code. Works in any repo Claude Code touches.
