pi-pipelines

v0.2.0

Published

2 days ago

Predefined, reusable agent loops for Pi: declarative YAML pipelines with review gates, scoring, and goal-loop convergence. Built on pi-subagents.

rybens92

agent-loops agentic self-refine review-loops ai-agents multi-agent llm pi-package pi pi-coding-agent pipelines orchestration workflows

🧩 Pi Pipelines

⭐ Key Ideas

1. Loops as first-class artifacts. Agent loops are normally ad-hoc — a quick /goal, a bash script driving a harness in non-interactive mode, a plugin or skill glued inside a harness. They work, but the loop is improvised each time and lives in a chat history or a throwaway script. Pi Pipelines makes a loop a named YAML file you commit, version, and reuse across projects — with the convergence mechanics built in rather than re-implemented each time.

2. Review gates — AI reviewing AI, iterating until convergence. The core loop mechanic. A worker produces; independent reviewers (in their own context, isolated from the worker and each other) score it 0–10. If the average is below the target, the worker retries with their feedback. Repeat until the quality target is met or rounds run out.

3. Preflight — human taste before automation. Before any stage runs, the extension interviews you — one question at a time — to extract what you actually mean and what your vision for the task is. Your answers are injected into every stage's task. It's the grill-me pattern embedded into the pipeline. This is my take on adding a human's taste to agent loops — and it's novel for this type of extension. Agentic coding tends to rush to execution; preflight makes the loop slow down and align with intent first.

If you follow the agent-loop discourse, here's where the terms land:

| In the discourse | In this project |
|---|---|
| Agent loop / self-refinement | Review gate: worker → reviewers → score → retry |
| Goal-driven loop | goal-driven pipeline: derived goal → iterate → converge |
| Agent recurrence / spawning | Dynamic stage expansion (one level deep) |
| Quality gate / guardrail | targetScore + parallel independent reviewers |

📦 Install

Prerequisites

| Requirement | Version | Install |
|---|---|---|
| Pi | >= 0.74 | `npm install -g @earendil-works/pi-coding-agent` |
| pi-subagents | latest | `pi install npm:pi-subagents` |

Install the extension

```bash
# From npm (recommended)
pi install npm:pi-pipelines

# From GitHub
pi install git:github.com/Rybens92/pi-pipelines

# From a local path
pi install /path/to/pi-pipelines
```

🚀 Quick Start

In your shell, from the project root:

```bash
# 1. Create a pipeline directory
mkdir -p .pi/pipelines

# 2. Create a pipeline file
cat > .pi/pipelines/hello.pipeline.yaml << 'EOF'
name: hello
description: "Quick project exploration and action plan"

stages:
  - id: explore
    agent: scout
    task: "Explore the project for: {task}"
  - id: plan
    agent: planner
    task: "Create a plan based on: {outputs.explore}"
EOF
```

Then, inside a Pi session (these are slash commands, not shell commands):

```
# 3. Run it (via command)
/run-pipeline hello "Add user authentication"

# 4. Or discover available pipelines
/list-pipelines

# 5. Or let the agent auto-select the best pipeline for any task
/pipeline-auto "Fix the flaky login tests and make sure they stay green"

# 6. Or let the agent author a new pipeline for you
/pipeline-new "Create a code review pipeline with a security gate"
```

Letting the Agent Create Pipelines for You

You can also have the agent author pipelines for you — see /pipeline-new in Running & Driving Pipelines below.

💡 Why This Exists

The three pillars above are the thesis. The honest framing is that this is an attempt — a personal take on the agent-loop conversation, built for my own use first and shared openly.

Pi Pipelines isn't "better" than a quick /goal or a CI bash script — those are the right tool at their point on the complexity spectrum. It earns its keep when you want a loop that's reusable, version-controlled, and has automated quality gates:

| Approach | Abstraction | Reusable | Versioned | Quality gate |
|---|---|---|---|---|
| /goal (built-in) | Single manual loop, ad-hoc | No | No | Manual |
| Bash + harness in non-interactive mode | Scripted, rigid | The script | Git-tagged | Manual / re-scripted |
| In-harness plugins / skills | Coupled to one harness | Partial | Partial | Varies |
| Pi Pipelines | Declarative, composable, multi-stage | YAML file | Git-native | Scored + auto-retry |

How the mechanics work — review gates, dynamic stage expansion, preflight — is detailed in Features and Running & Driving Pipelines below.

🧱 Built on pi-subagents

Pi Pipelines doesn't ship its own subagent runtime. It's built on pi-subagents — the mature subagent delegation extension for Pi — and runs its agents (scout, planner, worker, reviewer, oracle, …) through pi-subagents' event bridge.

The current focus is the high-level orchestration layer — defining, running, and managing pipeline YAML, plus the loop and gate mechanics on top. A custom subagent runtime is a possible future direction, but only if a real need for one emerges.

Non-goals

- Not a subagent runtime. It reuses pi-subagents rather than reinventing that layer.
- Not true agent recursion (yet). Dynamic stage expansion is one level deep: a step toward recurrence, not the full thing.
- Not a competitor to /goal or bash loops. Different points on the complexity spectrum; use the lightest tool that fits.
- Not a polished product. A personal project, built for my own use first and shared openly. It works for my use cases; your mileage may vary.

✨ Features

🎯 Declarative YAML Pipelines

Define entire workflows in .pi/pipelines/*.pipeline.yaml. No code, just YAML.

🔄 Three Stage Types

| Stage Type | Description |
|---|---|
| Sequential | Agents run one after another, passing outputs via `{outputs.stageId}` |
| Parallel | Fan-out to multiple agents concurrently for independent work |
| Review Gate | Worker → parallel reviewers → score → retry loop until quality target is met |

📊 Review Gates with Scoring — The Loop Pattern

This is the core idea: AI reviewing what AI produced, iterating until it's good enough.

```
Round 1: Worker → 3 Reviewers (parallel, independent context) → Avg Score: 7.3
         ❌ 7.3 < 9.0 → Feedback collected → Round 2

Round 2: Worker (with feedback from round 1) → 3 Reviewers → Avg Score: 9.3
         ✅ PASS — quality target met
```

Each reviewer runs in its own context — isolated from the worker and from each other — to provide an independent quality assessment. They score on a 0–10 scale. The worker retries with the collected feedback until the average meets targetScore or maxRounds is exhausted.
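To make the mechanic concrete, here is a minimal sketch of that loop in TypeScript. It is not the extension's actual code: `runReviewGate`, the `Worker` and `Reviewer` types, and the synchronous fake agents are hypothetical stand-ins for illustration, and the real runner dispatches reviewers in parallel through pi-subagents.

```typescript
// Hypothetical types for illustration only; not the extension's real API.
type Review = { score: number; feedback: string };
type Worker = (task: string, feedback: string[]) => string;
type Reviewer = (output: string) => Review;

interface GateResult {
  passed: boolean;
  rounds: number;
  average: number;
  output: string;
}

function runReviewGate(
  task: string,
  worker: Worker,
  reviewers: Reviewer[],
  targetScore: number,
  maxRounds: number,
): GateResult {
  let feedback: string[] = [];
  let output = "";
  let average = 0;

  for (let round = 1; round <= maxRounds; round++) {
    // The worker sees the task plus the feedback collected in the previous round.
    output = worker(task, feedback);

    // Each reviewer receives only the output, never another reviewer's verdict.
    const reviews = reviewers.map((review) => review(output));
    average = reviews.reduce((sum, r) => sum + r.score, 0) / reviews.length;

    if (average >= targetScore) {
      return { passed: true, rounds: round, average, output };
    }
    feedback = reviews.map((r) => r.feedback);
  }
  // Rounds exhausted without reaching the target.
  return { passed: false, rounds: maxRounds, average, output };
}

// Fake agents that reproduce the trace above: 7.3 in round 1, 9.3 in round 2.
const draftWorker: Worker = (task, feedback) =>
  feedback.length === 0 ? `${task}: draft` : `${task}: revised`;

const makeReviewer = (first: number, revised: number): Reviewer => (output) => ({
  score: output.indexOf("revised") !== -1 ? revised : first,
  feedback: "tighten error handling",
});

const gateResult = runReviewGate(
  "Add login",
  draftWorker,
  [makeReviewer(7, 9), makeReviewer(7, 9), makeReviewer(8, 10)],
  9,
  3,
);
console.log(gateResult.passed, gateResult.rounds, gateResult.average.toFixed(1));
```

The sketch passes only the most recent round's feedback to the worker; whether the real runner accumulates feedback across rounds is not specified here.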

For the mission-level counterpart — did we solve the whole problem, not just one step — see Goal-Loop below.

🔁 Goal-Loop — Mission-Level Convergence

Review gates ask: "Is this step's output good enough?" Goal-loops ask: "Did we actually solve the user's problem?"

A goal-loop is a composite stage that wraps inner stages and runs them iteratively. After each pass, a critic (typically oracle) inspects the results and decides GOAL: ACHIEVED or NOT ACHIEVED — if not, its feedback feeds back in and the stages re-run, until convergence or maxIterations is exhausted. The two mechanics compose: automatic-loop wraps scored review gates inside a goal-loop for two-layer convergence — every step high-quality and the whole mission achieved.

The goal can be static (declared inline) or derived (a planner generates it from your task and context). Inner stages get {goal}, {goalFeedback} (empty on iteration 1), and {iteration}. The loop never hard-fails — on exhaustion it returns partial progress.

```yaml
stages:
  - id: converge              # composite stage
    goal-loop:
      maxIterations: 3
      goal:
        derive: true
        hint: "Feature works end-to-end"
      critic:
        agent: oracle
    stages:                   # inner stages, repeated each iteration
      - id: implement
        agent: worker
        task: "Work on: {task}. Goal: {goal}. Feedback: {goalFeedback}."
      - id: verify
        agent: reviewer
        task: "Verify against: {goal}"
```
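The control flow behind a goal-loop can be sketched roughly as follows. This is an illustration under assumptions, not the extension's implementation: `runGoalLoop`, `InnerStages`, and `Critic` are hypothetical names, and the inner stages are collapsed into a single synchronous function.

```typescript
// Hypothetical shapes for illustration; the real critic is an agent such as oracle.
type Verdict = { achieved: boolean; feedback: string };
type LoopContext = { goal: string; goalFeedback: string; iteration: number };
type InnerStages = (ctx: LoopContext) => string;
type Critic = (goal: string, result: string) => Verdict;

function runGoalLoop(
  goal: string,
  stages: InnerStages,
  critic: Critic,
  maxIterations: number,
) {
  let goalFeedback = ""; // empty on iteration 1, as described above
  let result = "";

  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    // Inner stages receive {goal}, {goalFeedback}, and {iteration}.
    result = stages({ goal, goalFeedback, iteration });

    const verdict = critic(goal, result);
    if (verdict.achieved) {
      return { achieved: true, iterations: iteration, result };
    }
    // NOT ACHIEVED: the critic's feedback feeds the next pass.
    goalFeedback = verdict.feedback;
  }
  // Exhaustion is not an error: return whatever progress was made.
  return { achieved: false, iterations: maxIterations, result };
}

const loopResult = runGoalLoop(
  "Feature works end-to-end",
  ({ goalFeedback, iteration }) =>
    goalFeedback === ""
      ? `attempt ${iteration}`
      : `attempt ${iteration}, addressing: ${goalFeedback}`,
  (_goal, result) => ({
    achieved: result.indexOf("addressing") !== -1,
    feedback: "login flow still fails",
  }),
  3,
);
console.log(loopResult.achieved, loopResult.iterations);
```

Returning a result object on exhaustion, rather than throwing, mirrors the "never hard-fails" behavior described above.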

🧩 Dynamic Stage Expansion — Towards Agent Recurrence

A related idea in the agentic-workflow space is recurrence — agents spawning subagents that spawn further subagents, recursively decomposing work.

Dynamic Stage Expansion is this project's take on that idea. It takes structured output from one stage (JSON, YAML, or a markdown list) and dynamically fans it out into N parallel stages. It's not true agent recursion — each expansion is one level deep and each expanded stage runs the same agent type. But it's a step in that direction, and it's already useful for real workflows.

```yaml
stages:
  - id: find-files
    agent: scout
    task: "Return JSON files to refactor: [{\"path\":\"...\"}, ...]"

  - id: refactor-each
    expand:
      from: find-files
      maxItems: 10
    agent: worker
    task: "Refactor {item.path}"
```
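As a rough sketch of what expansion involves, the snippet below parses a stage's output into items, caps them at `maxItems`, and fills in `{item}` / `{item.<key>}`. The function names `parseItems` and `expandTask` are hypothetical, and the sketch assumes the whole output is either a JSON array or a markdown list; YAML output and JSON embedded in prose are deliberately left out.

```typescript
// Hypothetical helpers for illustration; not the extension's real parser.
type Item = string | Record<string, unknown>;

function parseItems(output: string, maxItems: number): Item[] {
  const trimmed = output.trim();
  let items: Item[] = [];
  try {
    const parsed: unknown = JSON.parse(trimmed);
    if (Array.isArray(parsed)) items = parsed as Item[];
  } catch {
    // Not JSON: fall back to "- item", "* item", or "1. item" lines.
    items = trimmed
      .split("\n")
      .map((line) => /^\s*(?:[-*]|\d+\.)\s+(.*)$/.exec(line))
      .filter((m): m is RegExpExecArray => m !== null)
      .map((m) => (m[1] ?? "").trim());
  }
  return items.slice(0, maxItems); // enforce the maxItems cap
}

function expandTask(template: string, item: Item): string {
  return template.replace(/\{item(?:\.(\w+))?\}/g, (match: string, key?: string) => {
    if (key === undefined) {
      // {item}: the whole item
      return typeof item === "string" ? item : JSON.stringify(item);
    }
    if (typeof item === "string") return match; // no fields on a plain string
    const value = item[key];
    // Assumption: unknown keys are left as-is so mistakes stay visible.
    return value === undefined ? match : String(value);
  });
}

const scoutOutput = '[{"path":"src/a.ts"},{"path":"src/b.ts"},{"path":"src/c.ts"}]';
const refactorTasks = parseItems(scoutOutput, 2).map((item) =>
  expandTask("Refactor {item.path}", item),
);
console.log(refactorTasks); // two tasks, one per item, capped by maxItems = 2
```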

🔗 Template Variables

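| Variable | Resolves to |
|---|---|
| `{task}` | Original user task passed to the pipeline |
| `{outputs.<stageId>}` | Output from a previous stage |
| `{lastFeedback}` | Latest review feedback (auto-injected in gate retries) |
| `{item}` | Whole item from dynamic stage expansion |
| `{item.<key>}` | Single field from a dynamic expansion item |
| `{goal}` | Goal of the enclosing goal-loop (static or derived) |
| `{goalFeedback}` | Critic feedback from the previous goal-loop iteration (empty on iteration 1) |
| `{iteration}` | Current goal-loop iteration |
| `{answers}` | Preflight answers, replaced inline where referenced |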
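For intuition, substitution of the pipeline-level variables can be sketched as a single regex pass. `resolveTemplate` and `TemplateContext` are hypothetical names, and leaving unknown stage ids untouched is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical context shape for illustration only.
interface TemplateContext {
  task: string;
  outputs: Record<string, string>;
  lastFeedback?: string;
}

function resolveTemplate(template: string, ctx: TemplateContext): string {
  // Stage ids may contain hyphens (e.g. code-review), hence [\w-]+.
  return template.replace(
    /\{(task|lastFeedback|outputs\.([\w-]+))\}/g,
    (match: string, name: string, stageId?: string) => {
      if (name === "task") return ctx.task;
      if (name === "lastFeedback") return ctx.lastFeedback ?? "";
      const output = stageId === undefined ? undefined : ctx.outputs[stageId];
      // Assumption: an unknown stage id is left as-is so typos stay visible.
      return output ?? match;
    },
  );
}

const resolved = resolveTemplate("Create a plan based on: {outputs.explore}", {
  task: "Add user authentication",
  outputs: { explore: "Express app with no auth middleware" },
});
console.log(resolved);
```

Because the replacement runs in one pass, a stage output that happens to contain `{task}` is not expanded a second time.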

🤖 LLM-Friendly Tools

Pi Pipelines registers tools that the LLM can use:

| Tool | Purpose |
|---|---|
| `run_pipeline({ pipeline, task })` | Execute any defined pipeline |
| `list_pipelines({ query? })` | Discover and filter available pipelines |

📋 Automatic Report Synthesis

After all stages complete, a synthesis agent automatically generates a structured report of what was accomplished, key findings, issues, and next steps.

📝 Persistent Run Logging

Every pipeline run is automatically logged to .pi/pipelines/runs/ with stage-by-stage details (resolved prompts, outputs, scores, timing) and incremental writes that survive crashes. Inspect logs with /run-history.

🧭 Running & Driving Pipelines

Slash commands at a glance

| Command | What it does |
|---|---|
| `/pipeline-<name> [task]` | Run a specific pipeline by name (auto-discovered from `.pi/pipelines/`) |
| `/run-pipeline <name> [task]` | Generic fallback: run any pipeline by name |
| `/pipeline-auto [task]` | Interview you, auto-select the best pipeline, confirm, then run it |
| `/pipeline-new [description]` | Have the agent author a new pipeline YAML for you |
| `/list-pipelines` | List every available pipeline |

`/pipeline-auto` — interview, then pick the right pipeline

Don't know which pipeline fits a task? /pipeline-auto runs you through a short preflight interview (see below), then matches your task and answers to the best available pipeline, confirms the choice with you, and executes it. If nothing is a strong match, it falls back to the universal automatic-loop.

```
/pipeline-auto "Fix the flaky login tests and make sure they stay green"
```

`/pipeline-new` — ask the agent to author a pipeline

The extension ships with a built-in skill that teaches the agent how to author, validate, and place pipeline YAML. Describe what you want in plain language and the agent writes the file, validates it, and can even test it. Example prompts that trigger the skill:

- "Create a security review pipeline that checks for secrets and SQL injection"
- "Create a CI-style pipeline that runs tests and lints before merging"
- "Create a pipeline that researches a topic, expands each source into a draft post, then reviews them all"

For the full authoring guide (all stage types, gates, expand, template variables), see Creating Custom Pipelines below.

Preflight — adding human taste to agent loops

This is pillar #3 from Key Ideas above: before any stage runs, the extension pulls your actual intent out of you. A pipeline can declare preflight topics, that is, areas to clarify before execution. When you run such a pipeline, you're interviewed one question at a time (via the bundled pi-pipelines-grill skill, a grill-me pattern); your answers are collected and injected into every stage's task. /pipeline-auto flows through preflight automatically.

```yaml
name: release-check
description: Check a release for quality and stability
preflight:
  topics:
    - scope of the release (what changed)
    - target environment (staging / production)
    - rollback strategy
stages:
  - id: analyze
    agent: scout
    task: "Analyze the changes within scope: {answers}"
```

Your answers become a ## User Answers block prepended to every stage's task, and the {answers} variable is replaced inline where you reference it. It's guidance for the agents — not a rigid spec.
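A small sketch of that injection step, assuming answers are kept as a topic-to-answer map. `injectAnswers` is a hypothetical name, and the exact layout of the `## User Answers` block (bullets, spacing) is a guess for illustration; only the heading and the `{answers}` variable come from the description above.

```typescript
// Hypothetical representation: preflight topic -> the user's answer.
type Answers = Record<string, string>;

function injectAnswers(task: string, answers: Answers): string {
  const topics = Object.keys(answers);

  // Block form, prepended to every stage's task (layout is an assumption).
  const block = topics.map((topic) => `- ${topic}: ${answers[topic]}`).join("\n");

  // Inline form, substituted wherever the task references {answers}.
  const inline = topics.map((topic) => `${topic}: ${answers[topic]}`).join("; ");
  const body = task.split("{answers}").join(inline);

  return `## User Answers\n${block}\n\n${body}`;
}

const stageTask = injectAnswers("Analyze the changes within scope: {answers}", {
  "target environment": "staging",
  "rollback strategy": "redeploy previous tag",
});
console.log(stageTask);
```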

📚 Built-in Pipelines

The extension ships with 8 pipelines. Seven are ready-to-use in any project; hello-world is a minimal smoke test for first runs. They're automatically copied to ~/.pi/pipelines/ on each startup to stay in sync with the extension. Custom pipelines should go in .pi/pipelines/ (project-scoped) or use different names in ~/.pi/pipelines/ to avoid being overwritten.

| Pipeline | Stages | Gates | Intent | Best for |
|---|---|---|---|---|
| automatic-loop | 3 | 2 + goal-loop | Ready to use | Daily driver: scored gates + goal-loop critic |
| goal-driven | 3 | critic | Ready to use | Goal-loop iteration with a derived goal |
| dev-sprint | 6 | 2 | Ready to use | Full development cycle + project review |
| tdd-review | 5 | 2 | Ready to use | TDD with test + code review gates |
| refactor | 5 | 1 | Ready to use | Safe refactoring with a gate |
| release-check | 2 | 0 | Ready to use | Parallel pre-release quality gates |
| bug-triage | 5 | 0 (expand) | Ready to use | Multi-file fixes via dynamic expand |
| hello-world | 2 | 0 | Smoke test | First run / verify the extension works |

Use them as-is, or fork one as a template for your own.

🏗️ Creating Custom Pipelines

Three ways to create pipelines

| Method | When to use |
|---|---|
| Manually: write `.pipeline.yaml` files in `.pi/pipelines/` | When you know exactly what you want |
| Via the agent: describe your workflow, and the agent uses the built-in skill to write the YAML (see /pipeline-new in Running & Driving Pipelines) | When you want the LLM to handle the details |
| Copy and modify: fork one of the 8 built-in pipelines | The fastest way to get started |

Minimal Pipeline

```yaml
# .pi/pipelines/my-pipeline.pipeline.yaml
name: my-pipeline
description: "Short description"

stages:
  - id: explore
    agent: scout
    task: "Explore: {task}"

  - id: plan
    agent: planner
    task: "Plan based on: {outputs.explore}"
```

Pipeline with a Review Gate

```yaml
stages:
  # Fragment: assumes an earlier stage with id "analyze" (not shown)
  - id: implement
    agent: worker
    task: "Implement: {outputs.analyze}"
    gate:
      type: review-loop
      maxRounds: 3          # Default: 3
      targetScore: 8        # Default: 8 (use 9 for tests)
      reviewers:
        - focus: "Does the implementation satisfy all criteria?"
        - focus: "Is the code clean and maintainable?"
        - focus: "Are error paths handled correctly?"
```

Parallel Stage

```yaml
stages:
  - id: checks
    parallel:
      - id: code-review
        agent: reviewer
        task: "Review code quality: {task}"
      - id: security
        agent: reviewer
        task: "Security audit: {task}"
      - id: perf
        agent: scout
        task: "Performance analysis: {task}"

  - id: decision
    agent: planner
    task: >
      Based on:
      Code Review: {outputs.code-review}
      Security: {outputs.security}
      Performance: {outputs.perf}
      Decide next steps.
```

Expand Stage (Dynamic)

```yaml
stages:
  - id: research
    agent: researcher
    task: >
      Find topics for: {task}.
      Return JSON: [{"title":"...","angle":"..."}]

  - id: write-posts
    expand:
      from: research
      maxItems: 5
    agent: worker
    task: "Write blog post: {item.title}. Angle: {item.angle}"
```

For the full pipeline authoring guide with all configuration options, see skills/pi-pipelines/SKILL.md.

📐 Architecture

```
User / LLM
  │
  ├── /run-pipeline <name> <task>    (TUI command)
  ├── /pipeline-<name> <task>        (dedicated command per pipeline)
  ├── /pipeline-new <description>    (prompt agent to author a pipeline)
  └── run_pipeline()  /  list_pipelines()   (LLM tools)
  │
  ▼
Pi Pipelines Extension (TypeScript)
  │
  ├── config-loader.ts   - Parse & validate YAML pipeline definitions
  ├── pipeline-runner.ts - Orchestrate stages, gates, and expansions
  ├── subagent-bridge.ts - Event bridge to pi-subagents (with fallback)
  ├── tui-widgets.ts     - TUI status widget for pipeline progress
  └── utils.ts           - Shared utilities
  │
  ▼
pi-subagents (event bridge)
  │
  └── Subagents (scout, planner, worker, reviewer, oracle, ...)
```

📊 Project Status

| Metric | Status |
|---|---|
| Tests | 791 passing |
| Code Coverage | 95.9% statements, 90% branches |
| Linter | ESLint + Prettier (flat config) |
| Runtime | TypeScript (no build step; Pi loads via jiti) |
| Dependencies | 1 production dep (js-yaml) |

🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a branch for your feature or fix
3. Write tests for your changes
4. Run the check suite: `pnpm check` (lint + format + tests)
5. Open a pull request with a clear description

Development Setup

```bash
git clone https://github.com/Rybens92/pi-pipelines.git
cd pi-pipelines
pnpm install
pnpm test            # Run tests (791 tests, ~1s)
pnpm check           # Full suite: lint + format + tests
pnpm test:coverage   # With coverage report
```

Code of Conduct

This project follows the Contributor Covenant.

❓ FAQ

Q: Do I need pi-subagents? Yes. Pi Pipelines delegates agent execution to pi-subagents. Install it first: pi install npm:pi-subagents.

Q: Can I use this without Pi? No. This is a Pi extension and requires the Pi CLI environment.

Q: Do I need to build / compile TypeScript? No. Pi uses jiti to load TypeScript directly. No build step needed.

Q: How many pipelines can I have? As many as you like. Each .pipeline.yaml file in .pi/pipelines/ becomes a /pipeline-<name> command.

Q: Can the agent create pipelines for me? Yes. The extension includes a skill that teaches the Pi agent how to create, validate, and manage pipeline YAML files. Just describe what you need.

Q: What agents are available for pipeline stages?

| Agent | Read-only | Edits files | Use case |
|---|---|---|---|
| scout | ✅ | ❌ | Code exploration and analysis |
| planner | ✅ | ❌ | Planning and synthesis |
| worker | ❌ | ✅ | Implementation (only one at a time) |
| reviewer | ✅ | ❌ | Code review and quality assessment |
| oracle | ✅ | ❌ | Strategic analysis and second opinions |
| researcher | ✅ | ❌ | Web research (requires pi-web-access) |

Q: How do reviewers score? Each reviewer ends their output with SCORE: N on the last line (0–10). The pipeline runner parses these scores automatically.
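For illustration, parsing that trailing line could look like the sketch below. `parseScore` is a hypothetical helper, not the runner's real function; accepting decimals and returning `null` for missing or out-of-range scores are assumptions of this sketch.

```typescript
// Hypothetical parser for a reviewer output that ends with "SCORE: N".
function parseScore(reviewOutput: string): number | null {
  const lines = reviewOutput.trim().split(/\r?\n/);
  const lastLine = lines[lines.length - 1] ?? "";

  // Only the last line counts; "SCORE:" mentioned earlier in the review is ignored.
  const match = /^\s*SCORE:\s*(\d+(?:\.\d+)?)\s*$/.exec(lastLine);
  if (match === null) return null;

  const score = Number(match[1]);
  // Assumption: anything outside the 0-10 scale is treated as unparseable.
  return score >= 0 && score <= 10 ? score : null;
}

console.log(parseScore("Error paths look solid.\nSCORE: 9")); // 9
console.log(parseScore("SCORE: 9\nOne more thought."));        // null
```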