@realtimex/collab

v0.3.2

Published

2 months ago

Multi-agent code review and autonomous improvement framework

realtimex

code-review ai multi-agent collaboration autonomous gemini codex cli

Collab - Multi-Agent Code Review Framework

A revolutionary multi-agent orchestration framework for autonomous code review and continuous improvement.

🚀 What is Collab?

Collab is a new kind of coding agent that orchestrates multiple AI agents to review and improve your code. Unlike single-agent tools, Collab runs specialized agents in parallel, each finding different types of issues:

Codex finds structural and architectural issues
Gemini finds logic, configuration, and performance issues
Custom agents can be added for specialized reviews

✨ Features

🤖 Multi-Agent Orchestration - Run multiple AI agents in parallel
🔍 Deep Code Analysis - Agents use filesystem, git, and tools (not just prompts)
📊 Comprehensive Logging - Full audit trail of all reviews
⚙️ Configuration-Driven - Add agents via JSON, no code changes
🔄 Autonomous Loop - Continuous improvement until no issues remain
🌍 Language-Agnostic - Works with any Git repository

📦 Installation

npm install -g @realtimex/collab

Or run directly:

npx @realtimex/collab init

🎯 Quick Start

1. Initialize in your project

cd my-project
collab init

collab init now runs in smart mode by default:

auto-initializes Git in empty folders (or prompts in interactive mode)
auto-detects init lane: attach (existing repo) vs bootstrap (empty repo)
scans your codebase (scripts, stack, API/DB/env signals)
probes likely commands with safe timeouts
asks for missing inputs in interactive terminals
generates .collaboration/docs/*.md guides from discovered project context
in bootstrap lane, runs a wizard and creates starter project boilerplate

It also creates .collaboration/ with config and logging defaults.
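
The lane choice described above can be pictured as a small decision function. This is only a sketch: `chooseLane` and `repoHasFiles` are hypothetical names, and how Collab actually decides that a repo counts as empty is not documented here.

```typescript
// Hypothetical sketch of init lane selection; not Collab's actual code.
type InitMode = "auto" | "attach" | "bootstrap";
type Lane = "attach" | "bootstrap";

// `repoHasFiles` is an assumed input: true when the folder already holds project files.
function chooseLane(mode: InitMode, repoHasFiles: boolean): Lane {
  // An explicit mode (as with --attach or --bootstrap) wins over detection.
  if (mode !== "auto") return mode;
  // In auto mode, existing repos are attached and empty ones are bootstrapped.
  return repoHasFiles ? "attach" : "bootstrap";
}
```

Under these assumptions, `chooseLane("auto", false)` yields `"bootstrap"`, while `chooseLane("attach", false)` keeps the forced `"attach"` lane.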

2. Run a review

collab review
# Disable live per-agent stdout/stderr threads:
collab review --no-threads

This runs all enabled agents on your uncommitted changes.

Live Agent Attach

# Attach to latest running/recorded agent stream:
collab attach coder

# Attach to a specific session + agent:
collab attach 20260213212940_loop contractor

3. View results

collab analytics

🧭 Engineering Guardrails

Contributor and agent standards are documented in:

AGENTS.md (cross-agent engineering guardrails)
CLAUDE.md (Claude execution contract under Collab)
docs-dev/ENGINEERING_GUARDRAILS.md (developer notes and definition of done)

📚 Commands

| Command | Description |
|---------|-------------|
| collab init | Smart-initialize collaboration in current Git repo (default) |
| collab review [feature] | Run multi-agent code review |
| collab gen contracts --all | Alias for full contract generation (contracts sync --all) |
| collab contracts sync | Generate/update executable test contracts from changed files |
| collab contracts enrich | Fill missing executable contract fields using hints + heuristics |
| collab contracts run | Execute generated contracts (Playwright runtime) |
| collab contracts doctor | Diagnose unresolved/non-executable contracts |
| collab agents list | List available agents |
| collab skills init playwright-cli | Install/sync Playwright CLI skill into managed paths |
| collab skills doctor | Validate skill setup and detect drift between managed/agent copies |
| collab config show | View current configuration |
| collab config get <path> | Read a config value (supports dotted paths) |
| collab config set <path> <value> | Update a config value (supports dotted paths) |
| collab config edit | Edit configuration in $EDITOR |
| collab analytics | View collaboration metrics |
| collab doctor | Run preflight health checks |
| collab loop | Start autonomous improvement |
| collab loop roles <action> | Scaffold/edit loop multi-role workflow |
| collab loop workflows <list\|show> | List and inspect workflow templates |
| collab attach [session] [agent] | Attach to a live per-agent stream from session logs |

⚙️ Configuration

After collab init, edit .collaboration/config.json:

{
  "project": {
    "name": "my-app",
    "language": "TypeScript",
    "testCommand": "npm test"
  },
  "agents": [
    {
      "name": "codex",
      "enabled": true,
      "tier": "primary"
    },
    {
      "name": "gemini",
      "enabled": true,
      "tier": "primary"
    }
  ]
}
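
Since a review runs every enabled agent, it can help to see how the `enabled` flag would select agents from this file. The sketch below assumes a config shape inferred from the example above. The interface names and `enabledAgentNames` are hypothetical, and the real schema may carry more fields.

```typescript
// Assumed shape, inferred from the config example; not an official schema.
interface AgentConfig {
  name: string;
  enabled: boolean;
  tier?: string;
}

interface CollabConfig {
  project?: { name?: string; language?: string; testCommand?: string };
  agents: AgentConfig[];
}

// Return the names of the agents a review would run (those with enabled: true).
function enabledAgentNames(config: CollabConfig): string[] {
  return config.agents.filter((agent) => agent.enabled).map((agent) => agent.name);
}

// Same structure as the example config, with gemini switched off for illustration.
const exampleConfig: CollabConfig = JSON.parse(`{
  "project": { "name": "my-app", "language": "TypeScript", "testCommand": "npm test" },
  "agents": [
    { "name": "codex", "enabled": true, "tier": "primary" },
    { "name": "gemini", "enabled": false, "tier": "primary" }
  ]
}`);

const activeAgents = enabledAgentNames(exampleConfig); // ["codex"]
```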

Smart Init Options

collab init                 # smart mode (default)
collab init --template      # legacy template-only initialization
collab init --yes           # non-interactive (accept detected defaults)
collab init --mode auto     # lane selection: auto|attach|bootstrap
collab init --attach        # force existing-repo onboarding lane
collab init --bootstrap     # force starter-boilerplate lane
collab init --no-probe      # skip command probes
collab init --force         # reinitialize existing .collaboration/

Smart init generates:

.collaboration/docs/PROJECT.md
.collaboration/docs/DEV.md
.collaboration/docs/TEST.md
.collaboration/docs/API.md
.collaboration/docs/GUARDRAILS.md

Action-Specific Agent Commands (Superagent Mode)

You can route each agent to different commands per action:

{
  "agents": [
    {
      "name": "codex",
      "commands": {
        "review": {
          "command": "codex review --uncommitted",
          "inputMethod": "args",
          "nonInteractive": true,
          "interactionMode": "non-interactive"
        },
        "implement": {
          "command": "codex exec --full-auto --sandbox workspace-write",
          "inputMethod": "args",
          "nonInteractive": true,
          "interactionMode": "non-interactive"
        }
      }
    }
  ]
}

collab review uses the review entry; loop and codegen paths use the implement and fix entries.

For loop reliability, commands should explicitly declare interaction behavior:

nonInteractive: true for commands safe to run without prompts.
requiresTTY: true / interactionMode: "interactive" for prompt-driven commands.

Loop defaults to non-interactive execution and skips interactive commands with a needs-tty status. Use collab loop "<task>" --allow-interactive to run interactive commands in pseudo-TTY mode and answer agent prompts inline.
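
The skip-or-run decision described above can be sketched as follows. The `needs-tty` status comes from the text. `planCommand`, the other two labels, and the choice to treat undeclared commands as non-interactive are assumptions of this sketch rather than documented behavior.

```typescript
// Assumed shape, based on the fields shown in the command config above.
interface AgentCommand {
  command: string;
  nonInteractive?: boolean;
  requiresTTY?: boolean;
  interactionMode?: "interactive" | "non-interactive";
}

// "run" and "run-in-pty" are hypothetical labels; "needs-tty" is the documented skip status.
type Plan = "run" | "run-in-pty" | "needs-tty";

// `allowInteractive` stands in for the --allow-interactive flag.
function planCommand(cmd: AgentCommand, allowInteractive: boolean): Plan {
  const interactive = cmd.requiresTTY === true || cmd.interactionMode === "interactive";
  // Commands that do not declare themselves interactive run normally (assumption).
  if (!interactive) return "run";
  // Interactive commands are skipped unless the caller opted in.
  return allowInteractive ? "run-in-pty" : "needs-tty";
}
```

Declaring interaction behavior explicitly in the config keeps this decision out of guesswork, which is the reliability point the section makes.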

Autonomous Loop Defaults

collab loop "<task>" now runs with autonomous defaults:

approval prompts are skipped by default (--require-approval to enable)
review/test remediation is auto-attempted by implementation agents (--no-auto-fix to disable)
manual Enter pauses are off by default (--manual to enable step-by-step intervention)
test contracts are auto-synced from changed UI/form/view files before checks (--no-contracts-sync to disable)
executable contracts are auto-run before tests (--no-contracts-run to disable)
unresolved contracts auto-trigger enrichment before next checks (--no-contracts-enrich to disable)
before each contract execution, loop runs skill preflight and auto-heals Playwright skill paths (.collaboration/skills ↔ agent skill dirs)
unresolved contracts are treated as blocking by default in collab loop (--contracts-allow-unresolved for best-effort mode)
failing executable contracts are auto-remediated by coding agents and re-run on failed IDs only
autonomous --no-manual runs continue to the next iteration when remediation returns no edits (bounded by loop.maxIterations); use --fail-fast to abort immediately
collab init (smart/TDD) now scaffolds contract-aware default roles: spec, coder, contractor, reviewer, qa
agent terminal mirrors auto-open by default in TTY runs; each running role opens a dedicated collab attach <session> <role> terminal stream (--no-agent-terminals to disable)
loop supports workflow templates (--workflow <id>) with dynamic mapping to enabled local agents; run collab loop workflows list to discover templates
bugfix loops use an escalation ladder by default: L1 one-shot owner, L2 owner+investigator, L3 broader multi-agent remediation/review
bugfix review prompts are evidence-driven (repro steps, stack traces, console/network artifacts) instead of broad "review everything" prompts
loop now runs targeted checks first (impacted tests inferred from changed files) before full checks; configure project.targetedTestCommand with {files} for project-specific targeting

Terminal mirror controls:

CLI:
- --agent-terminals / --open-agent-terminals
- --no-agent-terminals / --no-open-agent-terminals
- --agent-terminal-mode auto|terminal|tmux|off
- --agent-terminals-max 4
Config (.collaboration/config.json):
- ui.agentTerminals.autoOpen
- ui.agentTerminals.mode
- ui.agentTerminals.maxOpen
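
The config keys above are written as dotted paths, the same form that collab config get <path> accepts. A minimal sketch of how such a lookup could work is shown here. `getByPath` is a hypothetical helper, and the sample values are illustrative rather than Collab's actual defaults.

```typescript
// Walk a dotted path such as "ui.agentTerminals.maxOpen" through a nested object.
// Returns undefined as soon as a segment is missing.
function getByPath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Illustrative values only; the defaults Collab writes may differ.
const sampleConfig = {
  ui: { agentTerminals: { autoOpen: true, mode: "auto", maxOpen: 4 } },
};

const maxOpenTerminals = getByPath(sampleConfig, "ui.agentTerminals.maxOpen"); // 4
```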

Contract Sync + Enrich + Run

Generate/update contract files that describe expected behavior for changed surfaces:

collab gen contracts --all
collab contracts sync
collab contracts enrich
collab contracts sync --all
collab contracts sync --base HEAD~1 --json
# run changed contracts
collab contracts run --base-url http://localhost:5173
# run only selected contracts
collab contracts run --only src-views-accounts,src-components-forms-leadform --base-url http://localhost:5173
# run all contracts, strict mode
collab contracts run --all --base-url http://localhost:5173 --strict
# run with orchestration hooks and reliability controls
collab contracts run --base-url http://localhost:5173 \
  --seed-command "npm run seed:test -- --run-id={runId}" \
  --cleanup-command "npm run seed:cleanup -- --run-id={runId}" \
  --max-wall-clock-ms 300000 --step-timeout-ms 15000 --flake-retries 1
# diagnose unresolved contracts
collab contracts doctor --json

Contracts are written to:

.collaboration/contracts/*.contract.json
.collaboration/contracts/index.json
.collaboration/collab.db (SQLite tracking of contracts now, plus room for memory/lessons/analytics tables)

Optional hints file:

.collaboration/contracts/hints.json

Hints can define:

routes.byFile / routes.bySurfaceName
selectors.byFile / selectors.bySurfaceName / selectors.byType
network.byFile expected request patterns for load-data

collab contracts run executes contracts deterministically with Playwright and reports:

passed: contract assertions passed
failed: runtime/assertion failures
unresolved: contract missing required execution hints (for example entry.url)
skipped: runtime unavailable (for example Playwright not installed)
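
The four statuses above lend themselves to a simple tally. The sketch below also shows one way a loop could gate on them, following the default described earlier in which unresolved contracts block unless --contracts-allow-unresolved is passed. The type and function names are hypothetical. Treating failures as always blocking and skipped contracts as never blocking are assumptions of this sketch.

```typescript
type ContractStatus = "passed" | "failed" | "unresolved" | "skipped";

interface ContractResult {
  id: string;
  status: ContractStatus;
}

// Count results per status.
function tally(results: ContractResult[]): Record<ContractStatus, number> {
  const counts: Record<ContractStatus, number> = { passed: 0, failed: 0, unresolved: 0, skipped: 0 };
  for (const result of results) counts[result.status] += 1;
  return counts;
}

// Failures block (assumption); unresolved contracts block unless best-effort mode is on.
function isBlocking(results: ContractResult[], allowUnresolved: boolean): boolean {
  const counts = tally(results);
  if (counts.failed > 0) return true;
  return counts.unresolved > 0 && !allowUnresolved;
}
```

With one passed and one unresolved contract, `isBlocking(results, false)` is `true` and `isBlocking(results, true)` is `false`.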

Flow contracts can execute step types:

visit, click, fill, submit, expectText, expectResponse, expectNoConsoleError

Contract runtime orchestration can:

start/reuse dev server via project.devCommand
wait for healthUrl readiness
run seedCommand / cleanupCommand hooks with {runId} and {baseUrl} placeholders
capture artifacts for failures (trace/screenshot/events) under .collaboration/logs/contracts/
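
To make the {runId} and {baseUrl} placeholders concrete, here is a rough sketch of how a hook command string could be expanded before it runs. `expandHook` is a hypothetical helper and the run ID format is invented for illustration. Values are inserted verbatim with no shell escaping, which a real implementation would likely need to handle.

```typescript
// Replace {runId} and {baseUrl} placeholders in a hook command string.
// Other brace expressions are left untouched so mistakes stay visible.
function expandHook(template: string, vars: { runId: string; baseUrl: string }): string {
  return template.replace(/\{(runId|baseUrl)\}/g, (_match, key: "runId" | "baseUrl") => vars[key]);
}

const seedCommand = expandHook("npm run seed:test -- --run-id={runId}", {
  runId: "run-001", // hypothetical ID format
  baseUrl: "http://localhost:5173",
});
// seedCommand is "npm run seed:test -- --run-id=run-001"
```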

Multi-Role Loop Workflow

You can configure loop to run dedicated roles with different agents (including multiple instances of the same base CLI):

{
  "loop": {
    "enabled": true,
    "roles": [
      {
        "role": "spec",
        "phase": "planning",
        "agent": "codex",
        "action": "review",
        "instruction": "Create acceptance criteria and executable contract expectations."
      },
      {
        "role": "coder",
        "phase": "implementation",
        "agent": "claude",
        "action": "implement",
        "instruction": "Implement scoped changes and add/update tests."
      },
      {
        "role": "contractor",
        "phase": "implementation",
        "agent": "gemini",
        "action": "implement",
        "instruction": "Enrich executable contracts and fix browser/runtime contract failures."
      },
      {
        "role": "reviewer",
        "phase": "review",
        "agent": "codex",
        "action": "review"
      },
      {
        "role": "qa",
        "phase": "review",
        "agent": "qwen",
        "action": "test"
      }
    ]
  }
}

Role-scoped execution controls are supported via loop.roles[].execution:

model
maxTurns (attempt budget for that role)
retries (attempts = retries + 1 when maxTurns is unset)
timeoutMs
maxWallClockMs
temperature
extraArgs
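
The relationship between maxTurns and retries can be written out as a small function. This is a sketch of the rule as stated above. `attemptBudget` is a hypothetical name, and treating a missing retries value as 0 is an assumption.

```typescript
// Assumed subset of loop.roles[].execution.
interface RoleExecution {
  maxTurns?: number;
  retries?: number;
}

// maxTurns, when set, is the attempt budget; otherwise attempts = retries + 1.
function attemptBudget(execution: RoleExecution): number {
  if (execution.maxTurns !== undefined) return execution.maxTurns;
  // A missing `retries` is treated as 0 here (assumption), giving a single attempt.
  return (execution.retries ?? 0) + 1;
}
```

For example, `attemptBudget({ retries: 2 })` gives 3, while `attemptBudget({ maxTurns: 4, retries: 2 })` gives 4 because maxTurns takes over.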

Timeout precedence in loop execution is deterministic:

participant.executionOverrides (if present)
loop.roles[].execution
agents[].execution
phase defaults (loop.phaseTimeoutMs, loop.phaseMaxWallClockMs)
transport defaults (loop.transportTimeoutMs, loop.transportMaxWallClockMs, then built-in defaults)

Agent defaults can be set once in agents[].execution and are overridden by loop.roles[].execution.

Example:

{
  "agents": [
    {
      "name": "claude",
      "execution": {
        "model": "sonnet",
        "timeoutMs": 300000
      }
    }
  ],
  "loop": {
    "roles": [
      {
        "role": "coder",
        "phase": "implementation",
        "agent": "claude",
        "action": "implement",
        "execution": {
          "model": "opus",
          "maxTurns": 4,
          "maxWallClockMs": 1200000
        }
      }
    ]
  }
}
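
One way to read this example alongside the precedence list is as a "first defined value wins" lookup, scanning from the highest-precedence layer down. The sketch below applies that idea to the values above. `firstDefined` and the built-in timeout value are hypothetical, and the participant, phase, and transport layers are omitted for brevity.

```typescript
// Assumed subset of the execution fields listed above.
interface Execution {
  model?: string;
  maxTurns?: number;
  timeoutMs?: number;
  maxWallClockMs?: number;
}

// Return the first defined value, scanning from highest to lowest precedence.
function firstDefined<T>(...values: Array<T | undefined>): T | undefined {
  for (const value of values) {
    if (value !== undefined) return value;
  }
  return undefined;
}

// Values copied from the example config.
const agentExecution: Execution = { model: "sonnet", timeoutMs: 300000 };
const roleExecution: Execution = { model: "opus", maxTurns: 4, maxWallClockMs: 1200000 };
const builtInTimeoutMs = 600000; // hypothetical built-in default

// The role sets a model, so it overrides the agent default: "opus".
const resolvedModel = firstDefined(roleExecution.model, agentExecution.model);
// The role sets no timeoutMs, so the agent default applies: 300000.
const resolvedTimeoutMs = firstDefined(roleExecution.timeoutMs, agentExecution.timeoutMs, builtInTimeoutMs);
```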

Supported phases: planning, implementation, review.

🤖 How It Works

Single Review

Human → collab review
  ├─ AgentManager loads config
  ├─ Runs agents in parallel
  │   ├─ Codex reviews code quality
  │   └─ Gemini reviews logic
  ├─ Aggregates findings
  └─ Logs results
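
The flow above can be sketched in a few lines: run agents concurrently, merge their findings, and count them by priority. All names here (`Agent`, `runAgents`, `aggregate`, `countByPriority`) are hypothetical and do not reflect Collab's internal API. Letting a failed agent contribute no findings is an assumption of this sketch.

```typescript
type Priority = "high" | "medium" | "low";

interface Finding {
  agent: string;
  priority: Priority;
  title: string;
}

// Hypothetical agent interface: each agent returns its own findings.
type Agent = { name: string; review: () => Promise<Finding[]> };

// Flatten per-agent results into one list.
function aggregate(perAgent: Finding[][]): Finding[] {
  return perAgent.reduce<Finding[]>((all, findings) => all.concat(findings), []);
}

// Run all agents concurrently. A failing agent contributes no findings
// instead of failing the whole review (assumption).
async function runAgents(agents: Agent[]): Promise<Finding[]> {
  const perAgent = await Promise.all(
    agents.map((agent) => agent.review().catch((): Finding[] => [])),
  );
  return aggregate(perAgent);
}

// Count findings per priority, as a review summary would.
function countByPriority(findings: Finding[]): Record<Priority, number> {
  const counts: Record<Priority, number> = { high: 0, medium: 0, low: 0 };
  for (const finding of findings) counts[finding.priority] += 1;
  return counts;
}
```

Because the agents run through `Promise.all`, total review time is bounded by the slowest agent rather than the sum of all agents.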

Autonomous Loop

collab loop
  ├─ Agents review code
  ├─ Claude fixes issues
  ├─ Tests validate fixes
  ├─ Changes committed
  └─ Repeat until no issues
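
As a rough illustration of the cycle above, the sketch below repeats review, fix, test, and commit until a review reports no issues or an iteration cap is reached. The cap mirrors the loop.maxIterations bound mentioned earlier. The `LoopSteps` callbacks and `runLoop` are hypothetical stand-ins for real agent and test invocations.

```typescript
// Hypothetical callbacks standing in for agent and test runs.
interface LoopSteps {
  review: () => number; // number of open issues found
  fix: () => void;      // attempt to fix them
  test: () => boolean;  // true when tests pass
  commit: () => void;   // record the validated changes
}

function runLoop(steps: LoopSteps, maxIterations: number): { fixRounds: number; clean: boolean } {
  let fixRounds = 0;
  while (fixRounds < maxIterations) {
    // Stop as soon as a review comes back with no issues.
    if (steps.review() === 0) return { fixRounds, clean: true };
    steps.fix();
    // Only commit changes that the tests validate.
    if (steps.test()) steps.commit();
    fixRounds += 1;
  }
  // Budget exhausted: report whether a final review is clean.
  return { fixRounds, clean: steps.review() === 0 };
}
```

The iteration cap keeps the loop from running indefinitely when fixes stop making progress.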

🌟 Why Collab?

Traditional Code Review

⏱️ Slow (hours to days)
👥 Requires human reviewers
🎯 One perspective

Single AI Agent

⚡ Fast
🤖 Automated
🎯 One perspective

Collab (Multi-Agent)

⚡ Fast (parallel execution)
🤖 Automated
🎯 Multiple perspectives
🔄 Continuous improvement
📊 Full audit trail

📖 Example Output

$ collab review

🤖🤝👨‍💻 Collab - Multi-Agent Collaboration Framework

============================================================
🤖 Multi-Agent Review: uncommitted-changes
============================================================

📝 Running 2 agents: codex, gemini

✅ Review completed in 45s
   Agents: 2/2 succeeded
   Findings: 3 issues found

────────────────────────────────────────────────────────────
📊 Review Summary
────────────────────────────────────────────────────────────
Total Issues: 3
  🔴 High Priority: 1
  🟡 Medium Priority: 2
  🟢 Low Priority: 0

🔴 High Priority Issues:

  1. [codex] Schema migration compatibility issue
     💡 Add forward migration for unique constraint

🟡 Medium Priority Issues:

  1. [gemini] Timeout misalignment (35s vs 30s)
     💡 Sync frontend and backend timeouts to 45s

  2. [codex] Missing CLI scripts
     💡 Remove references to non-existent files

🛠️ Development

# Clone and install
git clone https://github.com/realtimex/collab.git
cd collab
npm install
npm run build

# Link for local testing
npm link

# Test
cd /path/to/your/project
collab init
collab review

🤝 Contributing

We welcome contributions! Areas to help:

🤖 Add new agent integrations (Cursor, Claude, GPT-4)
📦 Improve parsers for different output formats
🔄 Build the autonomous loop system
📚 Write documentation and examples

📄 License

MIT

🙏 Credits

Built by the RealTimeX team. Inspired by the belief that AI agents should collaborate, not compete.

"If collaboration succeeds, we can 100x development. Maybe even 1000x." 🚀