claude-orchestra
v1.0.1-alpha.13
Multi-agent orchestration system for Claude Code - maintain architectural integrity at scale
Claude Orchestra: Multi-Agent Orchestration System
Break the complexity ceiling. Build production systems with coordinated AI agents that maintain architectural integrity at scale.
This repository implements the 4-layer orchestra architecture for managing multiple Claude Code agents without context pollution, architectural drift, or agent collision.
The Problem
Traditional single-agent approaches hit a complexity ceiling around 10-15 file modifications. Beyond that:
- Context Window Death Spiral: Implementation details consume 73% of context, pushing architectural requirements below attention threshold
- Permission Interrupt Cascade: Every file modification fragments context, creating subtle inconsistencies
- Agent Collision Syndrome: Multiple agents create incompatible implementations without coordination
The Solution: 4-Layer Orchestra Architecture
Layer 1: The Orchestrator Agent
Pure orchestration - never writes code. Only decomposes tasks and coordinates specialists.
Layer 2: Context Management System
Maintains state across all agents without mixing implementation details. Tracks tasks, dependencies, and interfaces.
Layer 3: Specialized Execution Agents
Domain experts that receive minimal, focused context and return only completed work.
Layer 4: Integration Validation Layer
Prevents subtle bugs from parallel development by validating interfaces and contracts.
Quick Start
Installation
Add to existing project (recommended):
npx claude-orchestra init
This single command will:
- Copy the .claude/ directory to your project
- Install required dependencies (smol-toml, TypeScript)
- Update .gitignore to exclude state files
No project pollution - everything stays in .claude/
Clone for development:
git clone https://github.com/crs/claude-orchestra.git
cd claude-orchestra
yarn install
Updating
To update to the latest version of Claude Orchestra:
# Recommended: Use the update command
npx claude-orchestra@latest update
# Alternative: Re-run init
npx claude-orchestra@latest init
# Force reinstall (discards customizations)
npx claude-orchestra@latest init --force
What gets updated:
- ✅ All agents, commands, and tools
- ✅ Documentation and context manager
- ✅ Preserves your orchestration state (.claude/state/)
- ✅ Merges your settings.local.json customizations
- ✅ Preserves your .mcp.json configuration
- ✅ Creates backup in .claude.backup/
Run Your First Orchestrated Task
In Claude Code, use the orchestrate command:
/orchestrate Implement user authentication with OAuth and email/password support
The orchestrator will:
- Decompose into specialist tasks
- Present you with a plan
- Launch agents in parallel waves
- Validate integration
- Report completion
Recent Updates
System Refinements (November 2024)
Reduced Verbosity Across All Agents (58% reduction)
- Extracted common patterns to shared documentation
- Created SECURITY_CHECKLIST.md - Security requirements reference for all agents
- Created AGENT_CONTRACT.md - Core principles and handoff protocols
- Created CODE_REVIEW_SEVERITY_GUIDE.md - Calibration examples for consistent reviews
- Reduced 8 major specialist files from 4,051 to 1,719 lines while maintaining all essential information
New Specialists
- deployment-specialist.md - CI/CD, Docker, Kubernetes expertise
- documentation-specialist.md - API docs, OpenAPI, README generation
Enhanced Orchestration
- Added /orchestrate status - Track progress of current orchestration
- Added /orchestrate resume - Resume interrupted orchestrations from saved state
- Added comprehensive status tracking methods to context manager:
  - getOrchestrationStatus() - Full orchestration overview
  - getTaskDetails(taskId) - Detailed task information
  - getResumableOrchestration() - Resume capabilities
Improved Dependency Parsing
- Replaced fragile regex-based parsing with proper TOML parser
- Now correctly handles complex pyproject.toml and Cargo.toml files
- Better Go mod parsing with state machine approach
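The state-machine approach to go.mod parsing mentioned above could look roughly like the sketch below. It is an illustration only, not the project's parser: `parseGoModRequires`, `GoModDependency`, and `GoModState` are hypothetical names, and the sketch handles only single-line `require` directives and `require ( ... )` blocks.

```typescript
// Hypothetical shape for a parsed dependency; not taken from the project source.
interface GoModDependency {
  path: string;
  version: string;
  indirect: boolean;
}

// Two parser states: outside any block, or inside a `require ( ... )` block.
type GoModState = 'top' | 'require-block';

function parseGoModRequires(source: string): GoModDependency[] {
  const deps: GoModDependency[] = [];
  let state: GoModState = 'top';

  for (const rawLine of source.split(/\r?\n/)) {
    // Remember the `// indirect` marker before comments are stripped.
    const indirect = /\/\/\s*indirect\b/.test(rawLine);
    const line = rawLine.replace(/\/\/.*$/, '').trim();
    if (line === '') continue;

    if (state === 'top') {
      if (/^require\s*\($/.test(line)) {
        state = 'require-block';
      } else {
        const single = /^require\s+(\S+)\s+(\S+)$/.exec(line);
        if (single) deps.push({ path: single[1], version: single[2], indirect });
      }
    } else if (line === ')') {
      state = 'top';
    } else {
      const entry = /^(\S+)\s+(\S+)$/.exec(line);
      if (entry) deps.push({ path: entry[1], version: entry[2], indirect });
    }
  }
  return deps;
}
```

Tracking an explicit state means a line such as `github.com/gin-gonic/gin v1.9.1` is read as a dependency only while the parser is inside a `require` block, which is harder to guarantee with a single regular expression over the whole file.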
Validator Clarity
- Added dedicated section explaining Integration Validator vs Code Reviewer roles
- Clear distinction: "Does it fit together?" vs "Is it production-ready?"
- Updated documentation to show when each validator should be used
Version Management & CI/CD
- Added Justfile for semantic versioning (major/minor/patch)
- Automated GitHub Actions workflows with OIDC trusted publishing
- Git tag-based releases with embedded release notes
- Three-stage CI/CD: validate → publish → create-release
- Pre-release support with automatic dist-tag detection (alpha/beta/rc)
Architecture Overview
graph TB
subgraph Layer1["Layer 1: Orchestration"]
Orch[Orchestrator Agent<br/>Never writes code, only coordinates]
end
subgraph Layer2["Layer 2: Context Management"]
CM[Context Manager<br/>State & Dependencies]
TT[Task Tracker<br/>Progress & Handoffs]
end
subgraph Layer3["Layer 3: Specialist Agents"]
BE[Backend Specialist]
FE[Frontend Specialist]
DB[Database Specialist]
TEST[Test Specialist]
ML[ML Specialist]
DOC[Documentation Specialist]
end
subgraph Layer4["Layer 4: Validation"]
IV[Integration Validator<br/>Checks interfaces & contracts]
CR[Code Reviewer<br/>Security & quality gates]
end
Orch --> CM
Orch --> TT
CM --> BE
CM --> FE
CM --> DB
CM --> TEST
CM --> ML
CM --> DOC
BE --> IV
FE --> IV
DB --> IV
TEST --> IV
IV --> CR
CR --> Orch
style Orch fill:#e1f5ff
style CM fill:#fff4e1
style TT fill:#fff4e1
style IV fill:#e8f5e9
style CR fill:#e8f5e9
Technology Detection & Routing
The orchestrator automatically detects your project's technology stack and routes to the appropriate specialists:
flowchart LR
Start[Project Analysis] --> Detect{Detect Config Files}
Detect -->|package.json| Node[Node.js Stack]
Detect -->|pyproject.toml| Python[Python Stack]
Detect -->|go.mod| Go[Go Stack]
Detect -->|Cargo.toml| Rust[Rust Stack]
Detect -->|pom.xml/build.gradle| Java[Java Stack]
Node --> NodeSpec[Backend Node.js<br/>Frontend React<br/>Test Jest]
Python --> PythonChoice{ML Task?}
PythonChoice -->|Yes| MLSpec[ML Specialist<br/>polars + PyTorch]
PythonChoice -->|No| BackendPy[Backend Python<br/>FastAPI + uv]
Go --> GoSpec[Backend Go<br/>High Performance]
Rust --> RustSpec[Backend Rust<br/>Systems Programming]
Java --> JavaSpec[Backend Java<br/>Enterprise Apps]
style Node fill:#68a063
style Python fill:#3776ab
style Go fill:#00add8
style Rust fill:#ce412b
style Java fill:#f89820
Available Agents
Orchestrator
Role: Decomposes complex tasks, detects technology stack, routes to specialists
Location: .claude/agents/orchestrator.md
When to use: For any multi-file operation or complex feature
Features: Auto-detects Python/Node.js/Go/Rust/Java, recommends tech for greenfield
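As a rough illustration of the auto-detection described here, the sketch below maps the config files named in the routing diagram to stacks. The names `Stack`, `STACK_MARKERS`, and `detectStacks` are hypothetical, and the real orchestrator is an agent prompt rather than this function, so treat it as a model of the routing rule only.

```typescript
type Stack = 'nodejs' | 'python' | 'go' | 'rust' | 'java';

// Marker files taken from the routing diagram; the table itself is an assumption.
const STACK_MARKERS: { file: string; stack: Stack }[] = [
  { file: 'package.json', stack: 'nodejs' },
  { file: 'pyproject.toml', stack: 'python' },
  { file: 'go.mod', stack: 'go' },
  { file: 'Cargo.toml', stack: 'rust' },
  { file: 'pom.xml', stack: 'java' },
  { file: 'build.gradle', stack: 'java' },
];

// Returns every stack whose marker file appears in the project root listing.
function detectStacks(rootFiles: string[]): Stack[] {
  const found: Stack[] = [];
  for (const marker of STACK_MARKERS) {
    const present = rootFiles.indexOf(marker.file) !== -1;
    if (present && found.indexOf(marker.stack) === -1) {
      found.push(marker.stack);
    }
  }
  return found;
}

// Backend agent files in this README follow the pattern backend-<stack>-specialist.md.
function backendSpecialistFor(stack: Stack): string {
  return '.claude/agents/backend-' + stack + '-specialist.md';
}
```

A project containing both `package.json` and `go.mod` yields two stacks, which is why the function returns a list rather than a single value.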
Backend Specialists (Multi-Language)
Node.js Backend
Location: .claude/agents/backend-nodejs-specialist.md
Stack: Node.js 20+, TypeScript, Express/Fastify/NestJS, Prisma
Package Manager: npm, pnpm
Python Backend
Location: .claude/agents/backend-python-specialist.md
Stack: Python 3.11+, FastAPI/Django/Flask, SQLAlchemy, Pydantic
Package Manager: uv (10-100x faster than pip)
Config: pyproject.toml (modern standard)
Go Backend
Location: .claude/agents/backend-go-specialist.md
Stack: Go 1.21+, Gin/Echo, pgx/GORM
Package Manager: go modules (built-in)
Best for: High-performance APIs, microservices, concurrency
Rust Backend
Location: .claude/agents/backend-rust-specialist.md
Stack: Rust 1.75+, Axum/Actix-web, sqlx
Package Manager: Cargo (built-in)
Best for: Maximum performance, systems programming
Java Backend
Location: .claude/agents/backend-java-specialist.md
Stack: Java 17+, Spring Boot/Quarkus, JPA/Hibernate
Package Manager: Maven, Gradle
Best for: Enterprise applications
Machine Learning Specialist
Location: .claude/agents/ml-specialist.md
Stack: Python 3.11+, PyTorch/TensorFlow/scikit-learn, polars (fast data), MLflow
Package Manager: uv
Best for: ML pipelines, model training, data science
Frontend Specialist
Location: .claude/agents/frontend-specialist.md
Stack: React 18+, TypeScript, Redux/Zustand, Tailwind CSS
Expertise: Components, state, accessibility, performance
Test Specialist
Location: .claude/agents/test-specialist.md
Expertise: Jest/Vitest, React Testing Library, Playwright, integration tests
Database Specialist
Location: .claude/agents/database-specialist.md
Expertise: PostgreSQL, Prisma/TypeORM/SQLAlchemy, indexing, migrations
Integration Validator
Location: .claude/agents/integration-validator.md
Validates: Type interfaces, API contracts, race conditions, security
Code Reviewer (Quality Gate)
Location: .claude/agents/code-reviewer.md
Role: Reviews completed code for security, performance, technical debt
Called: After each implementation wave, before feature completion
Actions:
- 🔴 P0 Critical: Blocks deployment (security vulnerabilities, data corruption)
- 🟡 P1 Major: Blocks release (performance issues, significant bugs)
- 🟢 P2 Minor: Logs as technical debt (code quality improvements)
Creates: Refinement tasks for orchestrator when P0/P1 issues found
Deployment Specialist
Location: .claude/agents/deployment-specialist.md
Expertise: CI/CD pipelines, Docker, Kubernetes, cloud deployment
Handles: Dockerfiles, GitHub Actions, deployment strategies, rollback procedures
Documentation Specialist
Location: .claude/agents/documentation-specialist.md
Expertise: API docs, README files, code documentation, OpenAPI/Swagger
Creates: API documentation, inline docstrings, architecture docs, changelogs
Create Your Own!
Guide: CREATING_CUSTOM_AGENTS.md
Template: .claude/agents/_template-specialist.md
Add support for: Ruby, PHP, Elixir, or any language/domain you need
Understanding the Validation Layer
The system uses two complementary validators with distinct responsibilities:
Integration Validator (Structural Validation)
Purpose: Ensures all agent outputs work together cohesively
Validates:
- ✅ Type interfaces match across agent boundaries (frontend ↔ backend)
- ✅ API contracts are consistent (request/response formats, status codes)
- ✅ Race conditions in concurrent code
- ✅ Data flow is lossless across transformations
- ✅ Performance (N+1 queries, missing indexes)
When called: After each wave when multiple agents have completed work
Example issues found:
// Backend returns user_id: number
// Frontend expects userId: string
// → Integration Validator detects this mismatch
Code Reviewer (Quality Gate)
Purpose: Reviews completed code for security, maintainability, and technical debt
Reviews:
- 🔴 Security vulnerabilities (SQL injection, XSS, auth bypass)
- 🟡 Performance issues (memory leaks, inefficient algorithms)
- 🟢 Code quality (naming, duplication, missing tests)
- 📋 Technical debt (hardcoded config, missing error handling)
When called: After implementation waves, before feature completion
Severity system:
- P0 Critical: Blocks deployment (security, data corruption)
- P1 Major: Blocks release (performance, significant bugs)
- P2 Minor: Logged as tech debt (code quality)
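One way to read the severity system as a gate is sketched below. `ReviewIssue`, `ReviewGate`, and `evaluateReview` are hypothetical names, and the rule that a P0 issue also blocks release is an assumption layered on top of the list above.

```typescript
type Severity = 'P0' | 'P1' | 'P2';

interface ReviewIssue {
  severity: Severity;
  summary: string;
}

interface ReviewGate {
  blocksDeployment: boolean;       // any P0 issue
  blocksRelease: boolean;          // any P0 or P1 issue (P0 part is an assumption)
  refinementTasks: ReviewIssue[];  // P0/P1 issues handed back to the orchestrator
  technicalDebt: ReviewIssue[];    // P2 issues, logged only
}

function evaluateReview(issues: ReviewIssue[]): ReviewGate {
  const p0 = issues.filter((issue) => issue.severity === 'P0');
  const p1 = issues.filter((issue) => issue.severity === 'P1');
  const p2 = issues.filter((issue) => issue.severity === 'P2');
  return {
    blocksDeployment: p0.length > 0,
    blocksRelease: p0.length > 0 || p1.length > 0,
    refinementTasks: p0.concat(p1),
    technicalDebt: p2,
  };
}
```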
Example issues found:
# P0: Passwords stored in plaintext
user.password = request_data['password']
# → Code Reviewer flags as Critical, creates refinement task
Key Difference
| Integration Validator              | Code Reviewer                  |
| ---------------------------------- | ------------------------------ |
| "Does it fit together?"            | "Is it production-ready?"      |
| Validates contracts between agents | Reviews implementation quality |
| Structural/architectural           | Security/performance/quality   |
| Prevents agent collision           | Prevents technical debt        |
| After parallel waves               | After implementation complete  |
Both are essential - Integration Validator prevents subtle integration bugs, Code Reviewer prevents production incidents.
Context Management System
The context manager (.claude/lib/context-manager.ts) provides:
Task Registration
const hub = new AgentContextHub();
const task = hub.registerTask({
  description: 'Implement user API endpoints',
  assignedTo: 'backend-specialist',
  dependencies: ['types-task-id'],
  estimatedContextTokens: 15000,
});
Dependency Tracking
// Check if task can start
const result = hub.canStartTask(taskId);
if (result.canStart) {
  // Launch agent
} else {
  console.log('Blocked by:', result.blockedBy);
}
Handoff Protocol
// Prepare structured handoff between agents
const handoff = hub.prepareHandoff(fromTaskId, 'frontend-specialist');
// Contains: interfaces, notes, dependencies, context budget
Progress Monitoring
const report = hub.getProgressReport();
// Shows: total tasks, by status, completion rate, blocked tasks
Orchestration Workflow
sequenceDiagram
participant User
participant Orchestrator
participant Context as Context Manager
participant Specialist as Specialist Agent
participant Validator as Integration Validator
participant Reviewer as Code Reviewer
User->>Orchestrator: /orchestrate <task>
Orchestrator->>Context: Decompose task
Context-->>Orchestrator: Task plan with waves
Orchestrator->>User: Present plan for approval
User-->>Orchestrator: Approve
loop For each wave
Orchestrator->>Context: Register tasks
Context->>Specialist: Launch with minimal context
Specialist->>Specialist: Implement assigned work
Specialist->>Context: Report completion + artifacts
Context->>Validator: Validate integration
Validator-->>Context: Check interfaces & contracts
end
Context->>Reviewer: Review all implementations
Reviewer-->>Orchestrator: P0/P1/P2 issues
alt Critical issues found
Orchestrator->>Context: Create refinement tasks
Context->>Specialist: Fix issues
end
Orchestrator->>User: Feature complete
Usage Examples
Example 1: Simple Feature (User Profile)
/orchestrate Add user profile page with avatar upload and bio editing
Orchestrator breaks down into:
Wave 1 (Parallel):
├─ backend-specialist: User profile API endpoints
└─ frontend-specialist: ProfilePage component
Wave 2 (Sequential):
├─ integration-validator: Validate API contracts
└─ test-specialist: Integration tests
Result: Feature complete in ~30 minutes vs 2-3 hours single-agent
Example 2: Complex System (Real-time Collaboration)
/orchestrate Implement real-time collaborative editing with operational transformation,
supporting 50+ concurrent users, WebSocket + Redis, sub-100ms latency
Orchestrator breaks down into:
Wave 1 (Parallel) - Foundation:
├─ types-specialist: Define Operation, Transform, Document types
└─ database-specialist: Design documents, operations, presence schema
Wave 2 (Parallel) - Implementation:
├─ backend-specialist: WebSocket server with room management
├─ redis-specialist: Pub/sub channels for operations
├─ algorithm-specialist: Operational transformation merge logic
└─ frontend-specialist: Collaborative editor component
Wave 3 (Sequential) - Integration:
├─ integration-validator: Validate all interfaces
└─ test-specialist: Concurrent operation tests
Wave 4 (Sequential) - Optimization:
└─ performance-specialist: Latency optimization, connection pooling
Result: Production system in 3-4 days vs 2-3 weeks single-developer
Example 3: Validation Only
After manual implementation, validate everything works together:
/validate-integration
Validator checks:
- Type interface consistency across files
- API contract alignment (backend ↔ frontend)
- Race conditions in concurrent code
- Security practices (auth, validation, injection prevention)
- Performance issues (N+1 queries, missing indexes)
Token Economics
Single-Agent Approach:
Initial context: 50,000 tokens
Implementation: 180,000 tokens
Debugging: 270,000 tokens
Total: 500,000+ tokens → partial success + architectural drift
Orchestrated Approach:
Orchestrator: 5,000 tokens
6 specialists: 60,000 tokens (10k each)
Integration: 15,000 tokens
Total: 80,000 tokens → complete success, zero drift
Efficiency gain: 84% fewer tokens while achieving 100% completion
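The 84% figure follows directly from the two totals above. The small check below reproduces the arithmetic; `tokenSavingsPercent` is a hypothetical helper written for this illustration.

```typescript
// Percentage of tokens saved by the second approach relative to the first.
function tokenSavingsPercent(baselineTokens: number, improvedTokens: number): number {
  return Math.round(((baselineTokens - improvedTokens) * 100) / baselineTokens);
}

const singleAgentTokens = 50000 + 180000 + 270000;   // 500,000
const orchestratedTokens = 5000 + 6 * 10000 + 15000; // 80,000

console.log(tokenSavingsPercent(singleAgentTokens, orchestratedTokens)); // prints 84
```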
Best Practices
1. Always Start with Orchestrator
For any task involving 3+ files or multiple domains, use /orchestrate <complex task> instead of manually asking Claude to implement everything.
2. Present Plans Before Executing
The orchestrator should always show you the decomposition before launching agents.
3. Use Wave-Based Deployment
Don't launch all agents at once. Deploy in waves based on dependencies:
- Wave 1: Foundation (types, schemas)
- Wave 2: Implementation (parallel specialist work)
- Wave 3: Integration (validation, tests)
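Grouping tasks into waves from their dependencies can be sketched as a layered topological sort. This is an assumption about how such planning could work, not the context manager's actual algorithm; `PlannedTask` and `planWaves` are hypothetical names.

```typescript
interface PlannedTask {
  id: string;
  dependencies: string[]; // ids of tasks that must finish first
}

// Each wave contains every task whose dependencies were completed in earlier waves.
function planWaves(tasks: PlannedTask[]): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let remaining = tasks.slice();

  while (remaining.length > 0) {
    const ready = remaining.filter((task) =>
      task.dependencies.every((dep) => done.indexOf(dep) !== -1)
    );
    if (ready.length === 0) {
      const stuck = remaining.map((task) => task.id).join(', ');
      throw new Error('Circular or unknown dependency among: ' + stuck);
    }
    waves.push(ready.map((task) => task.id));
    for (const task of ready) done.push(task.id);
    remaining = remaining.filter((task) => ready.indexOf(task) === -1);
  }
  return waves;
}
```

Tasks in the same wave have no dependencies on each other, so they are the ones that can safely run in parallel.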
4. Maintain Context Boundaries
Each specialist should see:
- Their specific task
- Relevant interfaces
- Project conventions
Each specialist should NOT see:
- Other agents' implementation details
- Full project history
- Unrelated code
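The boundary above amounts to a filter over what gets handed to each specialist. The sketch below assumes a hypothetical data model (`SpecialistTask`, `InterfaceSpec`, `buildSpecialistContext`); the project's real handoff structure may differ.

```typescript
interface InterfaceSpec {
  name: string;
  definition: string;
}

interface SpecialistTask {
  id: string;
  description: string;
  requiredInterfaces: string[]; // names of the interfaces this task touches
}

interface SpecialistContext {
  task: SpecialistTask;
  interfaces: InterfaceSpec[];
  conventions: string[];
}

// Passes along only the task, the interfaces it names, and project conventions.
// Other agents' implementation details and project history are never included.
function buildSpecialistContext(
  task: SpecialistTask,
  allInterfaces: InterfaceSpec[],
  conventions: string[]
): SpecialistContext {
  return {
    task: task,
    interfaces: allInterfaces.filter(
      (spec) => task.requiredInterfaces.indexOf(spec.name) !== -1
    ),
    conventions: conventions.slice(),
  };
}
```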
5. Always Validate Integration
Before considering work complete:
/validate-integration
6. Keep Orchestrator Pure
The orchestrator never writes code. If you catch yourself writing implementation as orchestrator, STOP and delegate to a specialist.
Customization
Adding Custom Specialists
Create a new agent in .claude/agents/:
# My Custom Specialist
## Role
You are a [domain] specialist focused on [specific responsibilities].
## Technical Constraints
- Stack/framework requirements
- Coding standards
- Performance requirements
## Responsibilities
1. Primary responsibility
2. Secondary responsibility
## Input/Output Format
[Specify what this agent receives and returns]
## Quality Standards
[Checklist of requirements before completion]
Modifying Workflow Commands
Edit files in .claude/commands/ to customize orchestration workflow.
Adjusting Context Budgets
In context-manager.ts, modify:
private maxContextPerWave: number = 100000; // Adjust per wave budget
Monitoring and Debugging
View Current State
import { AgentContextHub } from './context-manager';
const hub = new AgentContextHub();
const report = hub.getProgressReport();
console.log(report);
Check for Conflicts
const conflicts = hub.detectConflicts();
conflicts.forEach((c) => {
  console.log(`${c.severity}: ${c.description}`);
});
View Dependency Graph
const graph = hub.getDependencyGraph();
// Visualize task dependencies
Troubleshooting
"Agent keeps rewriting the same code"
Cause: Single agent exceeding context capacity
Solution: Use /orchestrate to break into smaller tasks
"Interfaces don't match between frontend/backend"
Cause: No integration validation
Solution: Run /validate-integration regularly
"Agents making conflicting changes"
Cause: No dependency management
Solution: Orchestrator should define clear dependencies and wave deployment
"Too many permission prompts"
Cause: Not batching file operations
Solution: Each specialist should complete its domain fully before handoff
Real-World Results
From the article's case studies:
Microservices Migration:
- Traditional: Failed after 3 days, 347 conflicting commits
- Orchestrated: Completed in 4 days, 12 agents, zero breaking changes
WebSocket Collaboration System:
- Traditional: 2-3 weeks, senior developer
- Orchestrated: 3 days, 2 developers + agents, 92% test coverage
Authentication Module:
- Single agent: 72 hours of circular rewrites, 3.2M tokens
- Orchestrated: 8 hours, 89% context accuracy maintained
Advanced Topics
Progressive Context Summarization
For long-running sessions, the context manager can compress previous waves while maintaining architectural decisions.
Agent Lifecycle Management
Automatic termination when:
- Three consecutive incorrect suggestions
- Context usage > 85%
- Circular modifications detected
- Task complete
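The termination conditions above can be expressed as a single check. In the sketch below, `AgentHealth` and `terminationReason` are hypothetical names, and the order in which conditions are evaluated is an assumption.

```typescript
interface AgentHealth {
  consecutiveIncorrectSuggestions: number;
  contextUsage: number; // fraction of the context window in use, 0 to 1
  circularModificationDetected: boolean;
  taskComplete: boolean;
}

type TerminationReason =
  | 'task-complete'
  | 'incorrect-suggestions'
  | 'context-exhausted'
  | 'circular-modifications';

// Returns why an agent should be stopped, or null if it may keep running.
function terminationReason(health: AgentHealth): TerminationReason | null {
  if (health.taskComplete) return 'task-complete';
  if (health.consecutiveIncorrectSuggestions >= 3) return 'incorrect-suggestions';
  if (health.contextUsage > 0.85) return 'context-exhausted';
  if (health.circularModificationDetected) return 'circular-modifications';
  return null;
}
```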
Performance Optimization
- Connection pooling for context manager
- Parallel wave execution
- Lazy loading of agent prompts
Contributing
To add new specialists or improve orchestration:
- Create agent definition in .claude/agents/
- Add tests to verify agent behavior
- Update this README with usage examples
- Submit PR with example orchestration
License
MIT - Use this orchestration system however you want
Credits
Architecture based on research by Alireza Rezvani: 97% of Developers Kill Their Claude Code Agents in the First 10 Minutes
Implemented with practical tools for production use.
Quick Reference
Commands
- /orchestrate <task> - Start new orchestration (decompose and coordinate)
- /orchestrate status - Show current orchestration progress
- /orchestrate resume - Resume from saved orchestration state
- /validate-integration - Check all interfaces and contracts
Agents
- orchestrator - Task decomposition and coordination
- backend-specialist - APIs, services, business logic
- frontend-specialist - UI components, state management
- test-specialist - Test coverage and quality assurance
- database-specialist - Schema design and optimization
- integration-validator - Interface validation and conflict detection
Context Manager API
const hub = new AgentContextHub();
// Task management
hub.registerTask({ description, assignedTo, dependencies });
hub.canStartTask(taskId);
hub.updateTaskStatus(taskId, 'completed', artifacts);
// Status tracking
hub.getProgressReport(); // Progress summary
hub.getOrchestrationStatus(); // Comprehensive status
hub.getTaskDetails(taskId); // Detailed task info
hub.getResumableOrchestration(); // Resume info
// Validation
hub.detectConflicts();
File Structure
.claude/
agents/ # Agent definitions (specialists)
commands/ # Workflow commands (/orchestrate, /validate-integration)
docs/ # Shared documentation (security, contracts, guides)
lib/ # Context manager and core utilities
tools/ # Helper scripts (validate-integration, state management)
state/ # Runtime state (gitignored)
README.md # This documentation
package.json # Dependencies (smol-toml, TypeScript)
Version Management & Publishing
This project uses an automated version management and CI/CD system powered by Justfile and GitHub Actions.
Quick Release Workflow
# 1. Validate everything works
just validate
# 2. Bump version with release notes
just bump patch "Fix memory leak in context manager"
# or: just bump minor "Add new ML specialist"
# or: just bump major "Breaking: Redesign agent API"
# 3. Review the changes
git show HEAD
# 4. Push to trigger CI/CD
just push
Available Commands
just version # Show current version
just bump [type] "notes" # Bump version (patch/minor/major)
just validate # Run build and validation
just release [type] "..." # Complete workflow (bump + push)
just push # Push commits and tags
just tags # List recent version tags
just show-tag # Show latest tag details
just undo # Undo last bump (before push!)
Pre-release Versions
Pre-releases are published to npm with dist-tags so users can opt-in:
# Create and push pre-release
just prerelease alpha # Creates v1.0.1-alpha.0
just push # Triggers automatic publish to npm@alpha
# Users install with:
npm install claude-orchestra@alpha # Latest alpha
npm install claude-orchestra@<version> # Specific version
Supported pre-release types:
- alpha - Early testing releases
- beta - Feature-complete testing releases
- rc - Release candidates
Dist-tag behavior:
- Stable releases → npm install claude-orchestra (uses @latest)
- Pre-releases → npm install claude-orchestra@alpha (explicit opt-in)
CI/CD Pipeline
The project uses GitHub Actions with OIDC trusted publishing and shared workflows to eliminate redundancy.
Architecture:
- _publish-shared.yml - Reusable workflow with core logic (validate, build, publish, create release)
- publish-npm.yml - Unified workflow that auto-detects release type and calls shared workflow
Automatic workflows:
| Tag Format | Detected Type | npm dist-tag | Installation |
| ---------------- | ------------- | ------------ | ------------------------------------ |
| v1.0.0 | Stable | latest | npm install claude-orchestra |
| v1.0.1-alpha.0 | Pre-release | alpha | npm install claude-orchestra@alpha |
| v1.0.1-beta.1 | Pre-release | beta | npm install claude-orchestra@beta |
| v2.0.0-rc.0 | Pre-release | rc | npm install claude-orchestra@rc |
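The tag-to-dist-tag mapping in this table can be modeled with a single regular expression. The sketch below is an illustration of the rule, not the workflow's actual detection logic (which runs in GitHub Actions); `distTagForGitTag` is a hypothetical name, and rejecting unrecognized tags is an assumption.

```typescript
// Maps a git tag such as "v1.0.1-alpha.0" to the npm dist-tag it would publish under.
function distTagForGitTag(gitTag: string): string {
  const match = /^v\d+\.\d+\.\d+(?:-(alpha|beta|rc)\.\d+)?$/.exec(gitTag);
  if (!match) {
    throw new Error('Unrecognized tag format: ' + gitTag);
  }
  // The capture group is undefined when the tag has no pre-release suffix.
  return match[1] !== undefined ? match[1] : 'latest';
}
```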
Publishing Setup:
Create npm access token:
- Go to https://www.npmjs.com/settings/YOUR_USERNAME/tokens
- Create a new "Automation" token
- Add as GitHub secret: Settings → Secrets and variables → Actions → New repository secret
- Name: NPM_TOKEN, Value: your npm token
npm uses hybrid authentication:
- NPM_TOKEN: Used for publishing authentication
- --provenance flag: Uses OIDC to sign attestations (supply chain security)
- Both together provide secure, verifiable publishing
Development with Yarn
This project uses yarn for dependency management:
yarn install # Install dependencies
yarn build # Compile TypeScript
yarn validate # Run integration validation
Publishing uses npm publish for OIDC support, but all development uses yarn.
Next Steps
- Read through the agent definitions in .claude/agents/
- Try your first orchestrated task: /orchestrate <your task>
- Validate integration with /validate-integration
- Monitor progress using the context manager API
- Scale to complex, multi-agent projects
Welcome to unstoppable multi-agent development.
