copilot-flow

v2.17.3

Published

a month ago

Multi-agent orchestration framework for GitHub Copilot CLI — inspired by Ruflo (claude-flow)


kmamalis

github-copilot copilot multi-agent ai swarm orchestration cli

copilot-flow

copilot-flow lets you orchestrate fleets of GitHub Copilot agents that work together on research, design, coding, testing, and review, so you can go from a product idea to working software faster.

Unlike other orchestration tools, copilot-flow accumulates experience across every run. Agents learn from their own successes and failures, build up project-specific knowledge, and carry that knowledge into every future session — automatically.

Inspired by Ruflo (claude-flow), built on the official @github/copilot-sdk.

A system that gets smarter with every run

Most orchestration tools start fresh each time. copilot-flow doesn't.

After every phase, agent task, or swarm run, the system distils what it learned — key decisions, constraints, patterns, and pitfalls — and stores them for future runs. When something goes wrong and an agent recovers, the recovery strategy becomes a lesson. When an important architectural decision is made, it's retained permanently. Over time, your agents carry institutional project knowledge that no single run could hold.

Three layers of persistent context

Every agent prompt is assembled from three layers, in this order:

| Layer | Source | Lifetime |
|-------|--------|----------|
| Project identity | .github/memory-identity.md | Permanent: written once, always present |
| Lessons learned | .github/lessons/<agentType>.md + _global.md | Permanent: appended automatically as agents run |
| Remembered context | SQLite memory (BM25-ranked by task relevance) | 30-day TTL, refreshed on every run |
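The layering order in the table can be pictured with a small sketch. This is not copilot-flow's implementation: `PromptLayers`, `buildPrompt`, and the section headings are hypothetical names chosen for illustration, and only the order (identity, then lessons, then remembered context) comes from the table.

```typescript
// Illustrative only: the type, function, and heading names are hypothetical.
// Only the layer order is taken from the table above.
type PromptLayers = {
  identity: string;      // contents of .github/memory-identity.md
  lessons: string[];     // agent-type lessons followed by _global.md lessons
  remembered: string[];  // facts retrieved from memory, most relevant first
};

function buildPrompt(layers: PromptLayers, task: string): string {
  const sections: string[] = [];
  if (layers.identity.trim() !== "") {
    sections.push("## Project identity\n" + layers.identity.trim());
  }
  if (layers.lessons.length > 0) {
    sections.push("## Lessons learned\n" + layers.lessons.map(l => "- " + l).join("\n"));
  }
  if (layers.remembered.length > 0) {
    sections.push("## Remembered context\n" + layers.remembered.map(f => "- " + f).join("\n"));
  }
  // The task always comes last, after whatever context is available.
  sections.push("## Task\n" + task);
  return sections.join("\n\n");
}
```

Empty layers are skipped, so a fresh project with no lessons or memory simply produces a prompt containing the task alone.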

What gets captured automatically

Successful runs → facts are distilled and stored; high-importance findings (importance 4–5) are additionally promoted to the permanent lessons file for that agent type
Acceptance failures that recover → the failure reason is written as a permanent lesson so the agent knows what to avoid next time
Swarm task failures → appended to the failing agent's lesson file
All-retries-exhausted → appended to the agent's lesson file
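The capture rules above can be summarised as a small decision function. The sketch below is an assumption-laden illustration: `RunOutcome`, `Destination`, and `destinationsFor` are hypothetical names, the lower end of the importance scale is assumed, and the source does not say whether failures are also stored in memory, so the sketch records them only as lessons.

```typescript
// Hypothetical names throughout; the rules mirror the list above.
type RunOutcome =
  | { kind: "success"; importance: number }   // the source mentions importance 4-5
  | { kind: "acceptance-failure-recovered" }
  | { kind: "swarm-task-failure" }
  | { kind: "retries-exhausted" };

type Destination = "memory" | "lessons";

function destinationsFor(outcome: RunOutcome): Destination[] {
  switch (outcome.kind) {
    case "success":
      // Every success is stored as facts; importance 4-5 is also promoted
      // to the permanent lessons file for that agent type.
      return outcome.importance >= 4 ? ["memory", "lessons"] : ["memory"];
    case "acceptance-failure-recovered":
    case "swarm-task-failure":
    case "retries-exhausted":
      // Failures become permanent lessons for the agent involved.
      return ["lessons"];
  }
}
```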

Lessons are scoped to agent type

A coder's TypeScript patterns don't pollute a security auditor's context. Each agent sees only its own lessons plus global lessons — nothing more:

.github/
  lessons/
    coder.md            ← only coder agents see this
    reviewer.md         ← only reviewer agents see this
    architect.md
    _global.md          ← all agents see this (cross-cutting lessons)
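As a rough illustration of this scoping rule, the files an agent would read can be derived from its type alone. `lessonFilesFor` is a hypothetical helper, not part of copilot-flow; the paths follow the directory layout shown above.

```typescript
// Hypothetical helper; the paths follow the directory layout shown above.
function lessonFilesFor(agentType: string): string[] {
  const dir = ".github/lessons";
  // The agent's own lessons first, then the cross-cutting global lessons.
  return [`${dir}/${agentType}.md`, `${dir}/_global.md`];
}
```

Calling `lessonFilesFor("coder")` yields the coder and global files only, so `reviewer.md` and `architect.md` never enter a coder's context.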

Active memory tidying

# Consolidate a namespace: deduplicate, merge related facts, promote lessons
copilot-flow memory lint --namespace my-project

# Preview first
copilot-flow memory lint --namespace my-project --dry-run

Agent prompts are yours to edit

Run copilot-flow init and every agent gets a .github/agents/<type>.md file containing its default system prompt. Edit any file to add project-specific rules — your changes are picked up immediately, no rebuild needed:

.github/
  agents/
    coder.md       ← add your stack, coding conventions, type constraints
    reviewer.md    ← add your review checklist
    architect.md   ← add your architecture principles

The result: agents that understand your project's conventions from day one, and get progressively better at applying them as they accumulate experience.

What can you build?

From a napkin idea to a shipped product

Imagine you have an idea for TripMind — an AI travel planning SaaS. Here's how copilot-flow takes it from concept to code:

# Write your idea in a file
cat > tripmind-prd.md << 'EOF'
# TripMind
An AI-powered travel planning SaaS. Users describe their dream trip
in plain English and get a personalised itinerary, flight options,
hotel recommendations, and a packing list — all in one place.
Target: frequent travellers, aged 28–45, who hate spending hours
researching trips on multiple sites.
EOF

# Let an agent break it into epics and user stories
copilot-flow agent spawn \
  --spec tripmind-prd.md \
  --output tripmind-stories.md \
  --agent-dir .copilot/agents \
  --skill-dir .copilot/skills \
  --agent product-manager

# Then let a swarm research, design, and implement the first epic
copilot-flow swarm start \
  --spec tripmind-stories.md \
  --output tripmind-implementation.md \
  --topology hierarchical \
  --agents researcher,architect,coder,coder,tester,reviewer

Or run the whole journey as a phased pipeline:

copilot-flow plan tripmind-prd.md          # AI generates phases.yaml
copilot-flow exec phases.yaml --stream     # runs research → stories → implement → review

More ideas you can build

SaaS & B2B

BuilderStack — a no-code internal tool builder for ops teams

copilot-flow plan builderstack-prd.md
# generates: research → data-model → api → ui → tests phases
copilot-flow exec phases.yaml --phase research
copilot-flow exec phases.yaml --phase data-model

CareSync — patient coordination platform for private clinics

# Break down the appointment booking epic into stories
copilot-flow agent spawn \
  --agent product-manager \
  --agent-dir .copilot/agents \
  --skill-dir .copilot/skills \
  --task "Write user stories for the appointment booking epic in CareSync.
          Patients should be able to book, reschedule, and cancel appointments.
          Clinics should receive notifications and manage their calendar."

# Then implement
copilot-flow swarm start --spec booking-stories.md --agents coder,tester

FleetOps — vehicle fleet management for logistics companies

# Audit existing codebase for compliance and security
copilot-flow swarm start \
  --task "Audit the fleet tracking module for GDPR compliance and security vulnerabilities" \
  --topology mesh \
  --agents security-auditor,reviewer,documenter

Granular everyday use

copilot-flow is just as useful for day-to-day tasks as it is for full product builds.

Story → Code → Tests in one command

copilot-flow swarm start \
  --task "Implement user story: As a user, I want to reset my password via email,
          so that I can regain access to my account if I forget it.
          Stack: Next.js, Prisma, PostgreSQL, Resend for email." \
  --agents coder,tester \
  --topology sequential \
  --stream

Write a PRD for a single feature

copilot-flow agent spawn \
  --agent product-manager \
  --agent-dir .copilot/agents \
  --skill-dir .copilot/skills \
  --task "Write a PRD for a CSV bulk-import feature for our user management screen.
          Admin users should be able to upload a CSV of up to 10 000 rows." \
  --output csv-import-prd.md

Debug a production issue

copilot-flow agent spawn \
  --type debugger \
  --spec error-report.md \
  --output root-cause.md

Refactor a module with tests

copilot-flow swarm start \
  --task "Refactor src/billing/ to use the new Stripe SDK v5. Keep all existing tests passing and add tests for the new retry logic." \
  --agents coder,tester,reviewer \
  --topology sequential

Generate API documentation

copilot-flow agent spawn \
  --type documenter \
  --task "Generate OpenAPI 3.1 documentation for all routes in src/routes/" \
  --output openapi.yaml

Greenfield project kickstart

echo "Build a REST API for a todo app: users, todos, tags, auth with JWT" > spec.md
copilot-flow plan spec.md
copilot-flow exec phases.yaml --stream
# Phase outputs: phase-research.md, phase-design.md, phase-implement.md, phase-review.md

Prerequisites

Node.js >= 22.5
GitHub Copilot CLI (copilot) installed and authenticated
A GitHub account with Copilot access

copilot-flow doctor   # checks everything for you

Three ways to use copilot-flow

0. TUI — interactive terminal UI

Launch a full-screen, slash-command-driven interface that wraps every command:

copilot-flow tui              # start on the home dashboard
copilot-flow tui --screen exec  # jump straight to a screen

Navigate with /spec, /plan, /exec, /memory, /swarm, /telemetry, /doctor, and more. See docs/commands/tui.md for the full screen reference.

1. CLI — run commands directly

Install globally and orchestrate agents from your terminal:

npm install -g copilot-flow
# or without installing:
npx copilot-flow <command>

2. AI-first — let your AI assistant do the orchestrating

Copy .copilot/agents/ and .copilot/skills/ into your project (they're included in this repo). Any AI assistant — GitHub Copilot, Claude, Codex — that loads the skill will use npx copilot-flow to orchestrate tasks on your behalf. You describe the goal in plain English; the AI picks the right strategy and runs the commands.

# The orchestrator agent receives any goal and figures out the right approach
copilot-flow agent spawn \
  --agent orchestrator \
  --agent-dir .copilot/agents \
  --skill-dir .copilot/skills \
  --task "Build the user authentication feature described in AUTH.md"

No need to know which agent type to use, which topology, or how many retries — the orchestrator uses copilot-flow route task internally to make those decisions.

Quick Start

# 1. Check your setup
copilot-flow doctor

# 2. Initialise in your project
copilot-flow init

# 3. Spawn a single agent
copilot-flow agent spawn --task "Explain the architecture of this codebase" --stream

# 4. Run a multi-agent swarm
copilot-flow swarm start --task "Implement a JWT auth middleware" --stream

# 5. Generate a phased plan and execute it
copilot-flow plan spec.md
copilot-flow exec phases.yaml --stream

The Product Manager agent & skill

copilot-flow ships with two ready-to-use agents and a copilot-flow skill, located in .copilot/agents/ and .copilot/skills/:

orchestrator — the entry point for AI-first usage. Receives any goal, uses copilot-flow route task to decide the right strategy (single agent / swarm / phased plan), delegates entirely to specialist agents, and never does implementation work itself.

product-manager — turns a rough idea or PRD into structured epics, user stories, and Given/When/Then acceptance criteria. Delegates research sub-tasks to a researcher agent via npx copilot-flow automatically.

The skill (SKILL.md) teaches any AI assistant — GitHub Copilot, Claude, Codex — how to use copilot-flow commands to orchestrate work. Load it via --skill-dir and the model knows when to call agent spawn, swarm start, or plan/exec, and will always use copilot-flow route task when uncertain which agent fits.

# Let the orchestrator figure out the best strategy for any goal
copilot-flow agent spawn \
  --agent orchestrator \
  --agent-dir .copilot/agents \
  --skill-dir .copilot/skills \
  --task "Build the notifications feature described in NOTIFICATIONS.md"

# Or use the product-manager directly for product planning
copilot-flow agent spawn \
  --agent product-manager \
  --agent-dir .copilot/agents \
  --skill-dir .copilot/skills \
  --spec your-idea.md \
  --output stories.md

Command Reference

| Command | Description | Docs |
|---------|-------------|------|
| agent spawn | Run a single specialist agent | → docs/commands/agent.md |
| swarm start | Orchestrate multiple agents | → docs/commands/swarm.md |
| plan / exec | Phased multi-swarm pipelines | → docs/commands/plan-exec.md |
| memory | Persist and query the knowledge base | → docs/commands/memory.md |
| memory lint | LLM-powered dedup, merge, and lesson promotion | → docs/commands/memory.md |
| doctor / init / status | Setup and diagnostics | → docs/commands/doctor.md |
| models | List models available on your Copilot plan | → docs/commands/doctor.md |
| hooks | List and configure hooks | → docs/commands/hooks.md |
| telemetry | Run metrics, latency, and agent performance dashboard | → docs/commands/telemetry.md |
| tui | Interactive full-screen terminal UI | → docs/commands/tui.md |

Skills, Custom Agents & Repo Instructions

Repo instructions (auto-loaded)

Place a copilot-instructions.md file at .github/copilot-instructions.md and use it for stack rules, coding conventions, and constraints. It is injected into every session automatically.

copilot-flow agent spawn --task "..."                     # auto-detected
copilot-flow agent spawn --task "..." --no-instructions   # disable
copilot-flow agent spawn --task "..." --instructions docs/rules.md  # explicit path
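The three invocations above suggest a simple resolution order, sketched below. `InstructionOptions` and `resolveInstructionsPath` are hypothetical names, and letting `--no-instructions` win when both flags are supplied is an assumption rather than documented behaviour.

```typescript
// Hypothetical resolver for which instructions file a run would load.
type InstructionOptions = {
  noInstructions?: boolean; // --no-instructions
  instructions?: string;    // --instructions <path>
};

function resolveInstructionsPath(
  opts: InstructionOptions,
  defaultFileExists: boolean
): string | null {
  if (opts.noInstructions) return null;             // explicitly disabled (assumed to win)
  if (opts.instructions) return opts.instructions;  // explicit path
  // Auto-detection falls back to the conventional location, if present.
  return defaultFileExists ? ".github/copilot-instructions.md" : null;
}
```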

Custom agent format

Agents are .md files — YAML frontmatter for metadata, markdown body for the system prompt:

---
name: my-agent
displayName: My Agent
description: What this agent does
tools:
  - read_file
  - write_file
---

Your agent's system prompt goes here.
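To make the two-part layout concrete, here is a minimal sketch that separates the frontmatter from the system prompt. It does not parse the YAML itself, and `splitAgentFile` is a hypothetical name; how copilot-flow actually reads agent files is not described here.

```typescript
// Minimal sketch: separates frontmatter from the prompt body without
// parsing the YAML itself. Function and field names are hypothetical.
function splitAgentFile(source: string): { frontmatter: string; prompt: string } {
  // Expects the file to open with a "---" line and the metadata to end
  // at the next "---" line; everything after that is the prompt.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No frontmatter: treat the whole file as the system prompt.
    return { frontmatter: "", prompt: source.trim() };
  }
  return { frontmatter: match[1], prompt: match[2].trim() };
}
```

The `frontmatter` string can then be handed to any YAML parser to obtain `name`, `displayName`, `description`, and `tools`.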

Persisting defaults in config

{
  "instructions": { "file": ".github/copilot-instructions.md", "autoLoad": true },
  "skills":       { "directories": [".copilot/skills", ".github"], "disabled": [] },
  "agents":       { "directories": [".copilot/agents"] }
}

See docs/custom-agents-example.md for a full worked example.

Model selection

By default copilot-flow lets the Copilot CLI choose the model, so you don't need to configure anything. If you want to pin a specific model, or you see a "model X is not available" error:

# Per-run override
copilot-flow agent spawn --task "..." --model claude-sonnet-4-5

# Permanent default via environment variable
export COPILOT_FLOW_DEFAULT_MODEL=claude-sonnet-4-5

# Permanent default via config file (.copilot-flow/config.json)
{ "defaultModel": "claude-sonnet-4-5" }

Which models are available depends on your GitHub Copilot plan. Common options include claude-sonnet-4-5, gpt-4o, gpt-4o-mini, and o3-mini. If a model name is rejected, try another or omit --model entirely to use your plan's default.
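With three places to set a model, some precedence has to apply. The sketch below shows one plausible order. `ModelSources` and `resolveModel` are hypothetical names; putting the flag first is implied by "per-run override", while ranking the environment variable above the config file is an assumption.

```typescript
// Hypothetical resolver. The flag-first order is implied by "per-run override";
// ranking the environment variable above the config file is an assumption.
type ModelSources = {
  flag?: string;        // value of --model
  env?: string;         // value of COPILOT_FLOW_DEFAULT_MODEL
  configFile?: string;  // "defaultModel" in .copilot-flow/config.json
};

function resolveModel(sources: ModelSources): string | undefined {
  // undefined means "let the Copilot CLI choose", which is the default.
  return sources.flag || sources.env || sources.configFile || undefined;
}
```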

Enterprise & managed Macs

If authentication fails with a macOS keychain prompt timing out:

export GH_TOKEN=$(gh auth token)        # reuse your GitHub CLI token
# or
export GITHUB_TOKEN=ghp_your_pat_here   # personal access token

Add the export line to your shell profile to make it permanent. See docs/commands/doctor.md.

Retry System

Every agent call retries automatically on transient failures with configurable backoff:

| Flag | Default | Description |
|------|---------|-------------|
| --max-retries <n> | 3 | Maximum attempts |
| --retry-delay <ms> | 1000 | Initial delay |
| --retry-strategy | exponential | One of: exponential, linear, constant, fibonacci |
| --no-retry | - | Disable retries |

Retried automatically: network errors, rate limits (429), session crashes, timeouts.
Never retried: authentication errors, authorization errors, validation errors.
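For intuition about how the four strategy names differ, the sketch below computes the wait before each retry and classifies failures as retryable or not. The formulas are assumptions based on the usual meaning of each name, and `backoffDelay`, `isRetryable`, and `FailureKind` are hypothetical; copilot-flow's actual calculation may differ.

```typescript
// A sketch of the four strategy names from the table. The formulas are
// assumptions; copilot-flow's actual delay calculation may differ.
type BackoffStrategy = "exponential" | "linear" | "constant" | "fibonacci";

// attempt is 1-based: attempt 1 is the wait before the first retry.
function backoffDelay(strategy: BackoffStrategy, attempt: number, initialMs = 1000): number {
  switch (strategy) {
    case "constant":
      return initialMs;
    case "linear":
      return initialMs * attempt;
    case "exponential":
      return initialMs * 2 ** (attempt - 1);
    case "fibonacci": {
      // Multipliers follow 1, 1, 2, 3, 5, ...
      let a = 1;
      let b = 1;
      for (let i = 1; i < attempt; i++) {
        [a, b] = [b, a + b];
      }
      return initialMs * a;
    }
  }
}

// Failure categories taken from the two lists above.
type FailureKind =
  | "network" | "rate-limit" | "session-crash" | "timeout"
  | "authentication" | "authorization" | "validation";

function isRetryable(kind: FailureKind): boolean {
  return kind === "network" || kind === "rate-limit"
    || kind === "session-crash" || kind === "timeout";
}
```

Under these assumed formulas and the default 1000 ms initial delay, the third retry would wait 4000 ms with `exponential`, 3000 ms with `linear`, 2000 ms with `fibonacci`, and 1000 ms with `constant`.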

Programmatic API

import { runAgentTask, runSwarm, withRetry, getMemoryStore, globalHooks } from 'copilot-flow';

// Single agent
const result = await runAgentTask('coder', 'Write a binary search in TypeScript', {
  retryConfig: { maxAttempts: 3, backoffStrategy: 'exponential' },
  onChunk: chunk => process.stdout.write(chunk),
});

// Multi-agent swarm
const results = await runSwarm([
  { id: 'research', agentType: 'researcher', prompt: 'Research OAuth2 best practices' },
  { id: 'implement', agentType: 'coder', prompt: 'Implement OAuth2 login', dependsOn: ['research'] },
  { id: 'test', agentType: 'tester', prompt: 'Write tests for OAuth2', dependsOn: ['implement'] },
], 'hierarchical');

// Shared memory
const mem = getMemoryStore();
mem.store('project', 'stack', 'Next.js + Prisma + PostgreSQL');

// Hooks
globalHooks.on('post-task', async ctx => console.log('Task done:', ctx.data));
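The `dependsOn` fields in the swarm example imply an execution order. The sketch below shows one way such dependencies could be grouped into waves, where tasks in the same wave have no unmet dependencies and could run concurrently. `planWaves` and `SwarmTask` are hypothetical and not part of the copilot-flow API; how the library schedules tasks internally is not described here.

```typescript
// Hypothetical helper, not part of the copilot-flow API: groups tasks into
// waves so that every task runs after the tasks listed in its dependsOn.
type SwarmTask = { id: string; dependsOn?: string[] };

function planWaves(tasks: SwarmTask[]): string[][] {
  const ids = new Set(tasks.map(t => t.id));
  for (const t of tasks) {
    for (const dep of t.dependsOn ?? []) {
      if (!ids.has(dep)) {
        throw new Error(`Task "${t.id}" depends on unknown task "${dep}"`);
      }
    }
  }

  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = tasks;
  while (remaining.length > 0) {
    // A task is ready once all of its dependencies have completed.
    const ready = remaining.filter(t => (t.dependsOn ?? []).every(d => done.has(d)));
    if (ready.length === 0) {
      throw new Error("Dependency cycle detected");
    }
    waves.push(ready.map(t => t.id));
    for (const t of ready) done.add(t.id);
    remaining = remaining.filter(t => !done.has(t.id));
  }
  return waves;
}
```

For the three tasks in the example above, this produces three single-task waves (research, then implement, then test), because each task depends on the one before it.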

Memory system

The full memory system — importance scoring, BM25 search, layered injection, TTL management, memory types, project identity, wisdom retention, and memory lint — is documented in:

docs/commands/memory.md — full command reference and feature guide
docs/future-improvements/memory.md — implementation history and decisions
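As a rough picture of the 30-day TTL that applies to remembered context, the sketch below renews an entry's expiry whenever it is refreshed. `MemoryEntry`, `refresh`, and `isExpired` are hypothetical names, and which entries a run actually refreshes is not specified here, so treat this as an illustration of the idea only.

```typescript
// Sketch of a 30-day TTL that is renewed when an entry is refreshed.
// MemoryEntry, refresh, and isExpired are hypothetical names.
const TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days in milliseconds

type MemoryEntry = { key: string; value: string; expiresAt: number };

// Returns a copy of the entry whose expiry is pushed 30 days past nowMs.
function refresh(entry: MemoryEntry, nowMs: number): MemoryEntry {
  return { ...entry, expiresAt: nowMs + TTL_MS };
}

// An entry is expired once the current time reaches its expiry timestamp.
function isExpired(entry: MemoryEntry, nowMs: number): boolean {
  return nowMs >= entry.expiresAt;
}
```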

Examples

Ready-to-run phases.yaml files in the examples/ folder:

| File | Demonstrates |
|------|-------------|
| examples/acceptance-criteria.yaml | Per-phase acceptance criteria with generous timeouts: each phase is evaluated by a reviewer agent and automatically retried until it passes (the "Ralph Wiggum" loop) |
| examples/memory-and-models.yaml | Cross-run memory via --memory-namespace, contextTags for per-phase fact filtering, and choosing the right model tier (fast/cheap vs reasoning) per phase |
| examples/parallel-execution.yaml | Wave-based parallel execution: five backend and frontend phases run concurrently in Wave 3, three audit phases run concurrently in Wave 4, with a single final-review gate |

Quick start:

# Acceptance-criteria pipeline (auth API with retry loops)
copilot-flow exec examples/acceptance-criteria.yaml --stream

# Memory + model selection (collaborative editor)
copilot-flow exec examples/memory-and-models.yaml \
  --memory-namespace my-project \
  --stream

# Parallel e-commerce platform build
copilot-flow exec examples/parallel-execution.yaml --stream

Attribution

copilot-flow is inspired by Ruflo (claude-flow) — the multi-agent orchestration framework for Claude. copilot-flow brings the same swarm coordination patterns to the GitHub Copilot ecosystem.

License

MIT
