valent-pipeline
v0.19.50
v3 multi-agent AI pipeline for software development lifecycle
A multi-agent AI pipeline that takes user stories and ships tested, reviewed, committed code. Built on Claude Code agent teams.
You write the story. The pipeline handles requirements analysis, UX specification, test planning, implementation, adversarial code review, test execution, and a final evidence-based ship decision -- producing a full artifact trail for every story.
Quick Start
# Initialize in your project — no global install needed.
# `init` scaffolds .valent-pipeline/ AND vendors the CLI into it, so every project
# pins its own version and the agents call it via `node .valent-pipeline/bin/cli.js`.
cd your-project
npx valent-pipeline init
# Run the interactive configuration wizard
/valent-configure
# Execute a story
/valent-run-story-workflow STORY-001
No global install required. npx valent-pipeline init copies the CLI (bin/ + src/) into .valent-pipeline/ and installs its dependencies there. Agents invoke node .valent-pipeline/bin/cli.js <cmd>, so different projects can run different CLI versions, and you can customize the pipeline (including src/) per project. A global install (npm install -g valent-pipeline) still works if you prefer the bare valent-pipeline command for manual use.
How It Works
A deterministic Workflow orchestrator (sprint.workflow.js) reads your story, spawns specialist agents per task, and drives them through a dependency-driven pipeline:
REQS -> UXA -> QA-A -> SPECCHECK -> RED -> BEND + FEND -> STATIC -> CRITIC -> GREEN -> QA-B -> EVIDENCE -> JUDGE -> SHIP
- REQS translates acceptance criteria into an implementation brief (+ a machine-readable AC manifest)
- UXA converts UX specs into component specifications (frontend projects)
- QA-A writes behavioral test specifications before any code exists (+ a spec manifest mapping every case to its AC)
- SPECCHECK gate validates the spec chain mechanically (valent spec check + trace check); RED gate proves the acceptance suite fails pre-implementation (ATDD)
- BEND + FEND implement production code and tests in parallel
- STATIC gate runs deterministic checks at mechanical cost: lint/type commands, spec-conformance (no dropped assertion targets or un-waived skips), trace check (every AC has a covering case)
- CRITIC runs a 3-pass adversarial code review (blind hunt, edge cases, acceptance audit), pinning the git SHA it reviewed
- QA-B executes tests against real infrastructure through evidence run: exit codes, full logs, junit reports (frozen + SHA-256-hashed), and the git SHA are captured by the CLI, never transcribed by a model. It then files bugs and builds the traceability matrix
- EVIDENCE gate deterministically re-verifies the machine record: report hashes intact, junit green (with the brownfield pre-existing carve-out), every spec'd case executed, CRITIC's pin matches the tested SHA
- JUDGE makes the SHIP or REJECT decision from the machine verdicts (its self-reported numbers are recounted from the artifacts afterward — a mismatch stops the run)
Quality gates (SPECCHECK, RED, STATIC, CRITIC, GREEN, EVIDENCE, JUDGE) enforce pass/fail checkpoints. Rejection loops send work back to the responsible agent with specific corrections, with a code-owned circuit breaker to prevent infinite cycles. The trust property throughout: facts are captured by code, not transcribed by models — test results live in stories/<id>/evidence/ as hashed, machine-written records that gates re-verify.
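The "captured by code, re-verified by gates" idea can be illustrated with a minimal sketch. It assumes a hypothetical record shape ({ name, sha256 }) and hypothetical helper names (sha256Hex, freezeReport, reportIntact); the actual evidence format written by the CLI is not shown in this README.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: hex SHA-256 digest of a report's contents.
function sha256Hex(contents) {
  return createHash('sha256').update(contents).digest('hex');
}

// Hypothetical record shape: the digest captured when the report is frozen.
function freezeReport(name, contents) {
  return { name, sha256: sha256Hex(contents) };
}

// Re-verification: recompute the digest and compare it to the recorded one.
function reportIntact(record, currentContents) {
  return sha256Hex(currentContents) === record.sha256;
}

const junit = '<testsuite tests="3" failures="0"></testsuite>';
const record = freezeReport('junit.xml', junit);

console.log(reportIntact(record, junit)); // same bytes as when frozen
console.log(reportIntact(record, junit.replace('failures="0"', 'failures="1"'))); // edited afterwards
```

Because the gate compares digests rather than trusting a summary, any edit to a frozen report after capture is detectable without a model in the loop.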
Project Types
The pipeline supports 9 project types, each with a tailored task graph and specialized developer agent:
| Project Type | Developer Agent | Agents Skipped |
|---|---|---|
| fullstack-web | BEND + FEND | (none) |
| backend-api | BEND | UXA, FEND, PMCP |
| frontend-only | FEND | BEND |
| data-pipeline | DATA | UXA, FEND, PMCP |
| mcp-server | MCP-DEV | UXA, FEND, PMCP |
| document-generation | DOCGEN | UXA, FEND, PMCP |
| library | LIBDEV | UXA, FEND, PMCP |
| cli-tool | CLI-DEV | UXA, FEND, PMCP |
| mobile-app | MOBILE | (conditional) |
The workflow selects which agents to spawn based on project.type in your pipeline-config.yaml and the story's testing_profiles (resolved deterministically by resolve-graph).
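As a rough illustration of the table above, the lookup below encodes the developer agents and skip lists for the eight non-conditional project types. The object shape and function names are hypothetical; the real selection is done by resolve-graph and also depends on the story's testing_profiles, which this sketch ignores.

```javascript
// Values copied from the Project Types table. mobile-app is left out
// because its skip list is conditional.
const PROJECT_TYPES = {
  'fullstack-web': { developers: ['BEND', 'FEND'], skipped: [] },
  'backend-api': { developers: ['BEND'], skipped: ['UXA', 'FEND', 'PMCP'] },
  'frontend-only': { developers: ['FEND'], skipped: ['BEND'] },
  'data-pipeline': { developers: ['DATA'], skipped: ['UXA', 'FEND', 'PMCP'] },
  'mcp-server': { developers: ['MCP-DEV'], skipped: ['UXA', 'FEND', 'PMCP'] },
  'document-generation': { developers: ['DOCGEN'], skipped: ['UXA', 'FEND', 'PMCP'] },
  'library': { developers: ['LIBDEV'], skipped: ['UXA', 'FEND', 'PMCP'] },
  'cli-tool': { developers: ['CLI-DEV'], skipped: ['UXA', 'FEND', 'PMCP'] },
};

// Hypothetical helper: look up the table row for a project type.
function rosterFor(projectType) {
  const entry = PROJECT_TYPES[projectType];
  if (!entry) {
    throw new Error(`Unknown or conditional project type: ${projectType}`);
  }
  return entry;
}

// Hypothetical helper: is this agent skipped for this project type?
function isSkipped(projectType, agent) {
  return rosterFor(projectType).skipped.includes(agent);
}

console.log(rosterFor('backend-api').developers);
console.log(isSkipped('backend-api', 'UXA'));
console.log(isSkipped('fullstack-web', 'UXA'));
```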
Agent Roster
Per-Story Agents (10)
Spawned fresh per story, torn down after ship or cancel.
| Agent | Model | Role | Output |
|---|---|---|---|
| REQS | Sonnet | Requirements analyst | reqs-brief.md |
| UXA | Sonnet | UX specification | uxa-spec.md |
| QA-A | Sonnet | Test specification | qa-test-spec.md, visual-validation-checklist.md |
| SPECCHECK | Haiku | Mechanical spec gate (artifact matrix + AC coverage CLIs) | gate transcript |
| BEND | Sonnet | Backend developer | bend-handoff.md |
| FEND | Sonnet | Frontend developer | fend-handoff.md |
| CRITIC | Opus | Adversarial code reviewer | critic-review.md |
| QA-B | Sonnet | Test executor | execution-report.md, bugs.md, traceability-matrix.md |
| JUDGE | Sonnet/Opus | Final quality gate (evidence pass / binding decision) | judge-review.md, judge-decision.md |
Knowledge retrieval has no dedicated agent — every agent self-serves from curated files and the SQLite knowledge base via the db CLI commands.
Domain Developer Agents
Specialized agents that replace BEND for non-API project types:
| Agent | Model | Project Type | Output |
|---|---|---|---|
| DATA | Sonnet | data-pipeline | data-handoff.md |
| MCP-DEV | Sonnet | mcp-server | mcp-dev-handoff.md |
| LIBDEV | Sonnet | library | libdev-handoff.md |
| CLI-DEV | Sonnet | cli-tool | cli-dev-handoff.md |
| DOCGEN | Sonnet | document-generation | docgen-handoff.md |
| IAC | Sonnet | Cross-cutting (any type) | iac-handoff.md |
| MOBILE | Sonnet | mobile-app | mobile-handoff.md |
Cross-Story Stages
| Stage | Model | Trigger |
|---|---|---|
| PMCP (visual evidence) | Haiku | Orchestrator stage after QA-B on ui stories |
| Retrospective | Sonnet | Retro workflow after a sprint |
| Knowledge persist | Haiku | db index-curated CLI step driven by the retro workflow |
Installation
Prerequisites
- Node.js >= 18
- Claude Code CLI
- npm account (for publishing)
Initialize a Project
cd your-project
npx valent-pipeline init
The init command:
- Runs an interactive wizard to set project type, tech stack, and model assignments
- Copies pipeline infrastructure to .valent-pipeline/
- Vendors the CLI (bin/ + src/) into .valent-pipeline/ and installs its runtime dependencies there, so the project is self-contained and agents run node .valent-pipeline/bin/cli.js <cmd> with no global install or npx round-trip at run time
- Generates pipeline-config.yaml from your answers
- Creates knowledge directories and initializes the backlog
- Installs Claude Code skills for story/epic/project execution
A global install (npm install -g valent-pipeline) is optional — only needed if you want the
bare valent-pipeline command available for manual use outside a project.
Upgrade
npx valent-pipeline upgrade
npx valent-pipeline upgrade --dry-run # preview changes without applying
Upgrades pipeline infrastructure (prompts, templates, task graphs, scripts) and re-vendors the CLI (bin/ + src/) while preserving your project-specific files (config, knowledge, backlog).
Validate Configuration
node .valent-pipeline/bin/cli.js config validate
Configuration
All configuration lives in .valent-pipeline/pipeline-config.yaml. Run /valent-configure to edit interactively, or edit the file directly.
Key Sections
project:
  type: fullstack-web # Project type (determines agent roster)
  root: . # Project root directory
  story_directory: ./stories # Where story inputs live
  backlog_path: ./pipeline-backlog.yaml
tech_stack:
  language: TypeScript
  backend_framework: Express
  frontend_framework: React
  test_framework_unit: Vitest
  test_framework_e2e: Playwright
  browser_automation_mcp: playwright-mcp
models:
  opus: [BEND, FEND, CRITIC] # Complex code generation, review
  sonnet: [REQS, UXA, QA-A, ...] # Analysis, spec writing, judgment
  haiku: [Knowledge, Embed, Help] # Retrieval, indexing, lookups
quality:
  max_rejection_cycles: 3 # Circuit breaker for rejection loops
  retrospective_every_n_stories: 5 # Retrospective trigger frequency
  stall_threshold_minutes: 15 # Agent stall detection timeout
git: # Story git flow (valent git start-story|commit-phase|ship-story|leave-story)
  enabled: true # Branch-per-story + typed commit trail + merge --no-ff on ship
  target_branch: "" # Base/merge-back branch ("" = the branch the sprint starts on)
  story_branch_prefix: story/ # Branch naming convention (story/<id>)
  parallelism: # Concurrent stories in git worktrees (merges stay serialized)
    enabled: false # OFF = strictly sequential sprint (the default)
    max_stories: 2 # Concurrency cap when enabled
    worktree_dir: .valent-worktrees # Worktrees at <dir>/<story-id> (ignored via .git/info/exclude)
    setup_commands: [] # Run in each fresh worktree (env files, per-story ports, installs)
knowledge:
  mode: sqlite # none | sqlite | local-docker | connect-to-existing
  sqlite_db_path: ./.valent-pipeline/pipeline.db
sprint: # Only used in epic/project mode
  duration_minutes: 480
  initial_velocity_points: 60
  estimation_model: calibrated # calibrated | baseline
  fibonacci_scale: [1, 2, 3, 5, 8, 13, 21]
CLI Commands
Pipeline Management
| Command | Description |
|---|---|
| valent-pipeline init | Initialize pipeline in current project |
| valent-pipeline upgrade | Upgrade pipeline infrastructure |
| valent-pipeline upgrade --dry-run | Preview upgrade changes |
| valent-pipeline config validate | Validate pipeline-config.yaml |
Database Commands
| Command | Description |
|---|---|
| valent-pipeline db init | Initialize SQLite knowledge database |
| valent-pipeline db rebuild | Rebuild the database from story artifacts |
| valent-pipeline db index-handoff --file <path> | Index a single handoff artifact |
| valent-pipeline db search --query <text> | Full-text search across all artifacts |
| valent-pipeline db query-artifact --story <id> --type <type> | Fetch a specific artifact |
| valent-pipeline db query-directives [--agent <role>] | Get active correction directives |
| valent-pipeline db add-directive --json-file <path> --batch <n> | Persist correction directives (canonical YAML + queryable index, one call) |
| valent-pipeline db retire-directive --id <id> [--status expired\|superseded] | Expire/supersede a directive (status change only — audit trail kept) |
| valent-pipeline db sync-directives | Re-derive the directives index from knowledge/correction-directives.yaml |
| valent-pipeline db index-curated --file <path> | Process embed-instructions.md from a retrospective |
Calibration/query helpers: db record-calibration, db query-velocity, db query-list, db query-stories, db query-bugs-since (see valent-pipeline db --help).
Claude Code Skills
Invoked as slash commands inside Claude Code:
| Skill | Description |
|---|---|
| /valent-configure | Interactive configuration wizard |
| /valent-setup-backlog | Convert epics/stories into pipeline backlog |
| /valent-run-story-workflow STORY-ID | Execute a single story via the Claude Code Workflow orchestrator |
| /valent-run-epic-workflow EPIC-ID | Execute an epic (sprint planning + execution) via the Workflow orchestrator |
| /valent-run-project-workflow | Execute a full project across all epics via the Workflow orchestrator |
| /valent-run-spike SPIKE-ID | Run a time-boxed de-risking spike to a recorded GO/STEP-DOWN/HALT decision |
| /valent-run-deferred-tests | Run deferred iOS tests on Mac |
| /valent-debug-export | Export diagnostic dump |
| /valent-help | Pipeline documentation and FAQ |
Story Inputs
Create a story directory with at least a story.md file:
stories/
  STORY-001/
    story.md               # Required: user story + acceptance criteria
    ux-spec.md             # Optional: UX specification
    trigger-map.md         # Optional: interaction flows
    scenarios.md           # Optional: behavioral scenarios
    architecture-notes.md  # Optional: constraints and decisions
The pipeline writes all output to stories/STORY-001/output/.
Pipeline Output
For each story, the pipeline produces 15+ artifacts in stories/{story-id}/output/:
| Artifact | Agent | Purpose |
|---|---|---|
| reqs-brief.md | REQS | Implementation brief from ACs |
| uxa-spec.md | UXA | Component specs from UX spec |
| qa-test-spec.md | QA-A | Behavioral test specifications |
| visual-validation-checklist.md | QA-A | Browser automation checklist |
| {dev}-handoff.md | BEND/FEND/etc. | Implementation summary |
| critic-review.md | CRITIC | 3-pass code review findings |
| execution-report.md | QA-B | Test execution results |
| bugs.md | QA-B | Filed bugs with priorities |
| traceability-matrix.md | QA-B | AC-to-test coverage map |
| evidence/atdd-red.json, evidence/proof.json | RED/GREEN gates (CLI-written) | ATDD red baseline + the red/green/diff proof object |
| judge-review.md | JUDGE | Bug review findings |
| judge-decision.md | JUDGE | Ship/reject decision with evidence |
| story-report.md | orchestrator | Story completion summary |
Plus committed, tested production code in your project source tree.
Communication Model
All inter-agent communication follows the Distilled Communication Standard:
- Handoff documents -- structured artifacts with YAML frontmatter, orchestrator summary, and facts-only content. Every handoff follows a template skeleton. The handoff file IS the completion signal -- the orchestrator sequences agents from it.
- Structured returns -- each agent returns a terse, schema-validated machine block with file pointers; the orchestrator routes all coordination (rejections, bugs, escalations) from these.
- Design Council -- structured deliberation protocol for contested design decisions with position statements, synthesis, and escalation to user if consensus fails.
- Human Escalation -- when agent deliberation is insufficient, the orchestrator surfaces the issue to the user with full context.
Knowledge System
The pipeline learns from its own output through a knowledge system with three data sources:
| Source | Location | Purpose |
|---|---|---|
| Correction directives | knowledge/correction-directives.yaml | Behavioral changes for agents from past patterns |
| Curated knowledge | knowledge/curated/ | Conventions, validated patterns, known pitfalls |
| SQLite (FTS5) | .valent-pipeline/pipeline.db | Full-text retrieval over indexed artifacts and curated lessons |
The retrospective workflow (retro.workflow.js, triggered every N stories) is the sole gatekeeper for what enters persistent knowledge: it analyzes batch outputs, synthesizes gated correction directives, and emits indexing instructions executed by its EMBED stage. During story execution, agents self-serve from the knowledge sources via the valent-knowledge skill.
Knowledge Modes
| Mode | Dependencies | Description |
|---|---|---|
| none | None | Curated files + correction directives only |
| sqlite | better-sqlite3 | Local SQLite with FTS5 and vector search |
| local-docker | Docker | ChromaDB via Docker Compose + curated files |
| connect-to-existing | Network | Remote ChromaDB instance + curated files |
Execution Modes
The pipeline runs on a single orchestration path: the Claude Code Workflow path. A deterministic Workflow script (pipeline/orchestrators/claude-code/{plan,sprint,retro}.workflow.js) drives the pipeline with schema-validated gates, a code-owned rejection cap, parallel CRITIC passes, and journal-based resume (resumeFromRunId). Control flow lives in JavaScript and the journal — not in a model interpreting prose. Control flow is validated by scripts/test-workflow.js, and the orchestrator is exercised end-to-end against live stories (live runs have driven the version history). See pipeline/orchestrators/claude-code/README.md.
Requires Claude Code (the Workflow tool). runtime.provider must be claude-code.
Single Story
/valent-run-story-workflow STORY-001
Executes one story through the full pipeline.
Epic (Sprint-Based)
/valent-run-epic-workflow EPIC-001
Runs an epic with sprint planning: grooms stories, estimates sizing using calibrated Fibonacci points, plans sprints, executes stories in priority order, and runs retrospectives between sprints.
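The README does not describe the planning algorithm, so the following is only a sketch of one plausible approach: greedy packing of stories into sprints in priority order, capped by a velocity budget. The function name planSprints, the story fields (id, priority, points), and the packing rule are all assumptions; only the Fibonacci scale comes from the sample configuration.

```javascript
// Scale taken from the sample pipeline-config.yaml.
const FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13, 21];

// Hypothetical planner: walk stories in priority order (lower number first)
// and start a new sprint whenever the next story would exceed the budget.
function planSprints(stories, velocityPoints) {
  const ordered = [...stories].sort((a, b) => a.priority - b.priority);
  const sprints = [[]];
  let used = 0;

  for (const story of ordered) {
    if (!FIBONACCI_SCALE.includes(story.points)) {
      throw new Error(`${story.id}: ${story.points} is not on the Fibonacci scale`);
    }
    if (used > 0 && used + story.points > velocityPoints) {
      sprints.push([]);
      used = 0;
    }
    sprints[sprints.length - 1].push(story.id);
    used += story.points;
  }

  // Drop the initial empty sprint when there were no stories at all.
  return sprints.filter((sprint) => sprint.length > 0);
}

const plan = planSprints(
  [
    { id: 'STORY-003', priority: 3, points: 13 },
    { id: 'STORY-001', priority: 1, points: 21 },
    { id: 'STORY-002', priority: 2, points: 8 },
  ],
  30,
);
console.log(JSON.stringify(plan));
```

With a budget of 30 points, the first two stories (21 + 8) fit in one sprint and the 13-point story starts a second.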
Full Project
/valent-run-project-workflow
Executes all epics in the backlog with cross-epic dependency resolution.
Backlog Setup
/valent-setup-backlog
Converts your epics and stories documents into a prioritized pipeline-backlog.yaml with vertical slice ordering and knowledge base initialization.
Quality Gates
SPECCHECK Gate (mechanical)
Validates the spec chain before any code is written — as CLIs, not LLM judgment:
- valent spec check: artifact-existence matrix per testing profile, acceptance-file existence, and the acceptance-tier mock ban
- valent trace check: every unwaived AC covered by at least one spec'd case
Rework routes to the CLI-named owner (REQS/UXA/QA-A) with downstream specs re-derived automatically.
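The coverage rule behind the trace check can be sketched in a few lines. The manifest shapes here (ACs as { id, waived }, cases as { id, ac }) and the function name uncoveredAcs are hypothetical; the real manifests and CLI output are not documented in this README.

```javascript
// Hypothetical check: return the IDs of unwaived ACs that no spec'd case covers.
function uncoveredAcs(acManifest, specCases) {
  const covered = new Set(specCases.map((testCase) => testCase.ac));
  return acManifest
    .filter((ac) => !ac.waived && !covered.has(ac.id))
    .map((ac) => ac.id);
}

const acs = [
  { id: 'AC-1', waived: false },
  { id: 'AC-2', waived: true },  // waived, so it needs no covering case
  { id: 'AC-3', waived: false }, // unwaived and uncovered
];
const cases = [
  { id: 'TC-1', ac: 'AC-1' },
  { id: 'TC-2', ac: 'AC-1' },
];

const gaps = uncoveredAcs(acs, cases);
console.log(gaps.length === 0 ? 'trace check passed' : `uncovered: ${gaps.join(', ')}`);
```

A check like this is purely mechanical, which is why it can run as a CLI gate rather than as an LLM judgment.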
RED / GREEN Gates (ATDD)
When atdd.command is configured, QA-A authors executable acceptance tests before any implementation exists:
- RED (pre-dev): the suite runs via valent evidence run and every required acceptance case must FAIL; a pre-passing test is a spec bug. The passing red writes evidence/atdd-red.json, snapshot-hashing the acceptance sources.
- The acceptance tests are read-only for dev agents: any edit is hash-detected and auto-REJECTed to QA-A arbitration (restore, or amend with an audited evidence rebaseline).
- GREEN (post-CRITIC): the suite re-runs and must pass with the sources byte-identical since red. valent evidence proof then assembles proof.json: the failing run before AI wrote code, the passing run after, the exact diff between, and the hashes.
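The RED and GREEN rules above can be sketched as two small checks. This is an illustration under assumptions, not the CLI's implementation: the result shape ({ id, required, status }), the function names, and the way sources are hashed are all hypothetical.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical snapshot hash over the acceptance test sources.
function hashSources(sources) {
  const hash = createHash('sha256');
  for (const source of sources) hash.update(source).update('\0');
  return hash.digest('hex');
}

// RED: every required case must fail before implementation exists.
function redGate(results, sources) {
  const prePassing = results
    .filter((r) => r.required && r.status !== 'failed')
    .map((r) => r.id);
  return { ok: prePassing.length === 0, prePassing, sourcesHash: hashSources(sources) };
}

// GREEN: the suite passes and the sources are unchanged since red.
function greenGate(redRecord, results, sources) {
  const unchanged = hashSources(sources) === redRecord.sourcesHash;
  const allPassed = results.every((r) => r.status === 'passed');
  return { ok: unchanged && allPassed, unchanged, allPassed };
}

const sources = ['test("login rejects a bad password")'];
const red = redGate([{ id: 'TC-1', required: true, status: 'failed' }], sources);
const green = greenGate(red, [{ id: 'TC-1', required: true, status: 'passed' }], sources);
console.log(red.ok, green.ok);
```

If a dev agent edits an acceptance test to make it pass, the recomputed hash no longer matches the red snapshot and the green check fails even though every case reports passed.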
JUDGE Gate
Makes the final ship decision based on evidence:
- Bug priority review (can reclassify P4 bugs to P1-P3)
- Test execution results verification
- Traceability matrix completeness
- PMCP visual evidence (UI projects)
- Applies "evidence over assertion" -- independently verifies every upstream claim
Verdicts: SHIP (the story branch merges --no-ff into the target branch and the backlog flips to shipped), SHIP-PARTIAL (mobile: ship Android, defer iOS), REJECT (the story branch is left unmerged for the fix/retry; send back with corrections).
Rejection Loops
When CRITIC or JUDGE rejects work:
- Lead re-queues the responsible agent with the specific rejection findings
- Agent reworks and resubmits
- Circuit breaker (max_rejection_cycles, default 3) prevents infinite loops
- After max cycles, Lead escalates to user
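The loop above can be sketched as a capped retry. The function name reviewLoop, the verdict shape ({ accepted, findings }), and the exact point at which the cap triggers are assumptions for illustration; only the max_rejection_cycles default of 3 comes from the configuration.

```javascript
// Hypothetical capped review loop. `review` receives the findings from the
// previous rejection (empty on the first attempt) and returns a verdict.
function reviewLoop(review, maxRejectionCycles = 3) {
  let rejections = 0;
  let findings = [];

  while (true) {
    const verdict = review(findings);
    if (verdict.accepted) {
      return { outcome: 'accepted', rejections };
    }
    rejections += 1;
    findings = verdict.findings;
    if (rejections >= maxRejectionCycles) {
      // Circuit breaker: stop cycling and hand the findings to the user.
      return { outcome: 'escalated', rejections, findings };
    }
  }
}

// A reviewer that accepts on the third attempt.
let attempts = 0;
const eventuallyAccepted = reviewLoop(() => {
  attempts += 1;
  return attempts >= 3
    ? { accepted: true }
    : { accepted: false, findings: [`issue from attempt ${attempts}`] };
});

// A reviewer that never accepts.
const neverAccepted = reviewLoop(() => ({ accepted: false, findings: ['still broken'] }));

console.log(eventuallyAccepted.outcome, neverAccepted.outcome);
```

Keeping the counter in code rather than in a prompt means the cap cannot be talked around by an agent.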
Crash Recovery
All pipeline state is persisted to disk:
- pipeline-state.json -- current story, backlog, phase timing, team roster
- Handoff files with YAML frontmatter tracking step progress
- Git working directory preserves code state
- Inbox files preserve communication history
If the Lead crashes, it can reconstruct the full pipeline state from these artifacts on restart.
Directory Structure
After initialization, the pipeline installs to .valent-pipeline/ in your project:
.valent-pipeline/
  pipeline-config.yaml         # Your project configuration
  pipeline-state.json          # Pipeline runtime state (derived, human-readable view)
  orchestrators/claude-code/   # Workflow orchestrator scripts (plan/sprint/retro)
  prompts/                     # Agent prompt templates
  templates/                   # Handoff document templates
  task-graphs/                 # Task dependency graphs per project type
  steps/                       # Agent step files
    bend/                      # Backend developer steps
    fend/                      # Frontend developer steps
    critic/                    # Code review steps
    qa-a/                      # Test spec steps (domain-specific)
    qa-b/                      # Test execution steps (domain-specific)
    reqs/                      # Requirements analysis steps
    judge/                     # Judge gate steps
    orchestration/             # Shared orchestration steps (config, story resolution, status)
    retrospective/             # Retrospective analysis steps
    common/                    # Shared agent protocols
    data/                      # Data pipeline developer steps
    docgen/                    # Document generation steps
    iac/                       # Infrastructure-as-code steps
    libdev/                    # Library developer steps
    mcp-dev/                   # MCP server developer steps
    mobile/                    # Mobile developer steps
    uxa/                       # UX specification steps
  scripts/                     # Pipeline utility scripts
  docs/                        # Pipeline reference documentation
  knowledge/
    curated/                   # Curated knowledge files
    correction-directives.yaml
  pipeline.db                  # SQLite knowledge database
Documentation
Full reference documentation lives in pipeline/docs/:
| Document | Description |
|---|---|
| Pipeline Overview | Architecture, flow, artifact map |
| Agent Reference | All agents, models, inputs/outputs |
| Communication Standard | Handoff format, inbox protocol, Design Council |
| Task Graph Specification | Dependencies, task states, claiming |
| Pipeline State Schema | JSON schema for pipeline-state.json |
| Knowledge System | RAG assessment, correction directives, curation |
| Template Skeleton | Universal handoff document structure |
| NPX Packaging | Package distribution and init workflow |
License
MIT
