quorum-audit
v0.4.5
Cross-model audit gate with structural enforcement. Edit → audit → agree → retro → commit.
quorum
Cross-model audit gate with structural enforcement. One model cannot approve its own code.
edit → audit → agree → retro → commit
What it does
quorum enforces a consensus protocol between AI agents. When code is written, an independent auditor reviews the evidence. If rejected, the author must fix and resubmit. The cycle repeats until consensus is reached — only then can the code be committed.
The key principle: no single model can both write and approve code. This is the "quorum" — a minimum number of independent voices required for a decision.
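The cycle described above can be sketched as a small loop. This is an illustration only: `runAuditLoop`, its option names, and the `maxRounds` cap are hypothetical and are not part of quorum's API.

```javascript
// Hypothetical sketch of the edit → audit → fix cycle (not quorum's real API).
function runAuditLoop({ author, auditor, review, fix, maxRounds = 5 }) {
  // The quorum rule: the author may never act as its own auditor.
  if (author === auditor) {
    throw new Error("author and auditor must be different models");
  }
  let submission = { round: 1 };
  for (let round = 1; round <= maxRounds; round++) {
    const verdict = review(submission); // "approved" or "rejected"
    if (verdict === "approved") return { committed: true, rounds: round };
    submission = fix(submission); // the author fixes and resubmits
  }
  return { committed: false, rounds: maxRounds };
}

// Example: a stub auditor that approves the third submission.
const loopResult = runAuditLoop({
  author: "model-a",
  auditor: "model-b",
  review: (s) => (s.round >= 3 ? "approved" : "rejected"),
  fix: (s) => ({ ...s, round: s.round + 1 }),
});
console.log(loopResult);
```

The commit is allowed only on the path that returns `committed: true`; every other path leaves the change blocked.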
Installation
Standalone (any AI tool)
quorum works without any IDE plugin. Just the CLI.
npm install -g quorum-audit # global install
# or
npx quorum-audit setup # one-shot without install
cd your-project
quorum setup # creates config + MCP server registration
quorum daemon # TUI dashboard
Works with any AI coding tool: Claude Code, Codex, Gemini, or manual use.
As a Claude Code plugin
For automatic hook integration (event-driven audit on every edit):
claude plugin marketplace add berrzebb/quorum
claude plugin install quorum@berrzebb-plugins
This registers 22 lifecycle hooks, 22 MCP tools, 9 skills, and 12 specialist agents automatically. The CLI still works alongside the plugin.
As a Gemini CLI extension
For automatic hook integration with Gemini CLI:
gemini extensions install https://github.com/berrzebb/quorum.git
# or for development:
gemini extensions link adapters/gemini
This registers 11 hooks, 8 skills, 4 commands, and 22 MCP tools. Same audit engine as Claude Code.
As a Codex CLI hook
For automatic hook integration with OpenAI Codex CLI:
# Copy hooks config to project
cp adapters/codex/hooks/hooks.json .codex/hooks.json
# Enable hooks feature flag
codex -c features.codex_hooks=true
This registers 5 hooks (SessionStart, Stop, UserPromptSubmit, AfterAgent, AfterToolUse). Same audit engine as Claude Code and Gemini.
From source
git clone https://github.com/berrzebb/quorum.git
cd quorum && npm install && npm run build
npm link # makes 'quorum' available globally
CLI
quorum <command>
setup Initialize quorum in current project
interview Interactive requirement clarification
daemon Start TUI dashboard
status Show audit gate status
audit Trigger manual audit
plan Work breakdown planning
orchestrate Track orchestration (parallel execution) # v0.4.0
ask <provider> Query a provider directly
tool <name> Run MCP analysis tool
migrate Import consensus-loop data into quorum
help Show help
Migrating from consensus-loop
If you were using consensus-loop (v2.5.0), quorum can import your existing data:
quorum migrate # import config, audit history, session state
quorum migrate --dry-run # preview without changes
What it migrates:
| Data | From | To |
|------|------|----|
| Config | .claude/consensus-loop/config.json | .claude/quorum/config.json |
| Audit history | .claude/audit-history.jsonl | SQLite EventStore |
| Session state | .session-state/retro-marker.json | Preserved (shared location) |
| Evidence submission | docs/feedback/claude.md | audit_submit MCP tool |
| MCP server | .mcp.json consensus-loop entry | Cloned as quorum entry |
Your existing evidence is preserved: quorum reads it from SQLite via the audit_submit tool.
How it works
Without a plugin (standalone)
you write code
→ quorum audit # trigger manually
→ auditor reviews # Codex, GPT, Claude, or any provider
→ quorum status # check verdict
→ fix if rejected # resubmit
→ quorum daemon # watch the cycle in real-time TUI
With Claude Code plugin (automatic)
you write code
→ PostToolUse hook fires # automatic
→ regex scan + AST refine # hybrid: false positive removal
→ fitness score computed # 7-component quality metric
→ fitness gate # auto-reject / self-correct / proceed
→ trigger eval (12 factors) # skip, simple, or deliberative
→ auditor runs # background, debounced
→ verdict syncs # tag promotion/demotion
→ session-gate # blocks until retro complete
→ commit allowed
Both paths use the same core engine: bus/ + providers/ + core/.
Architecture
quorum/
├── cli/ ← unified entry point (works without any plugin)
├── daemon/ ← Ink TUI dashboard + FitnessPanel (works standalone)
├── bus/ ← EventStore (SQLite) + pub/sub + stagnation + LockService + Fitness + Claims + Orchestrator
├── providers/ ← consensus protocol + trigger (12-factor) + router + domain specialists + AST analyzer
├── core/ ← audit protocol (7 modules), templates, 22 MCP tools
├── languages/ ← pluggable language specs (fragment-based: spec.mjs + spec.{domain}.mjs)
├── agents/knowledge/ ← shared agent protocols (cross-adapter: implementer, scout, 9 specialist domains)
└── adapters/
    ├── shared/ ← adapter-agnostic business logic (17 modules, incl. HookRunner, NDJSON, MuxAdapter)
    ├── claude-code/ ← Claude Code hooks (22) + agents (12) + skills (9)
    ├── gemini/ ← Gemini CLI hooks (11) + skills (8) + commands (4)
    └── codex/ ← Codex CLI hooks (5)
The adapters/ layer is optional. Everything above it runs independently. Adding a new adapter requires only I/O wrappers; the business logic lives in adapters/shared/.
Core Concepts
Parliament Protocol
Legislative deliberation framework for structured consensus:
quorum parliament "topic" → CPS (Context-Problem-Solution)
quorum orchestrate plan <track> → interactive planner (Socratic + CPS)
quorum orchestrate run <track> → full implementation loop (auto)
Enforcement Gates
Eight gates block progress until their conditions are met. They are enforced in code, not just documented:
| Gate | Blocks when | Releases when |
|------|------------|---------------|
| Audit | Evidence submitted | Auditor approves |
| Retro | Audit approved | Retrospective complete |
| Quality | Lint/test fails | All checks pass |
| Amendment | Pending amendments | All resolved (vote) |
| Verdict | Last verdict ≠ approved | Re-audit passes |
| Confluence | Integrity check failed | 4-point verification passes |
| Design | Design artifacts missing | Spec + Blueprint exist |
| Regression | Normal-form stage regressed | Alert only |
Deliberative Consensus
For complex changes (T3), a 3-role protocol runs:
- Advocate: finds merit in the submission
- Devil's Advocate: challenges assumptions, checks root cause vs symptom
- Judge: weighs both opinions, delivers final verdict
Language Spec Fragments (v0.4.1)
Quality patterns are defined per language in pluggable fragment files:
languages/typescript/
  spec.mjs ← core: id, name, extensions (3 lines)
  spec.symbols.mjs ← symbol extraction patterns
  spec.imports.mjs ← dependency parsing
  spec.perf.mjs ← performance anti-patterns
  spec.a11y.mjs ← accessibility patterns
  spec.observability.mjs
  spec.compat.mjs
  spec.doc.mjs ← documentation coverage
Adding a new language = spec.mjs (3 lines) + relevant fragments. Adding a domain to an existing language = one new fragment file. The registry (languages/registry.mjs) auto-discovers and merges fragments at load time.
Domain Specialists (v0.3.0)
When changes touch specialized domains, quorum conditionally activates expert reviewers:
| Domain | Tool | Agent | Min Tier |
|--------|------|-------|----------|
| Performance | perf_scan | perf-analyst | T2 |
| Migration | compat_check | compat-reviewer | T2 |
| Accessibility | a11y_scan | a11y-auditor | T2 |
| Compliance | license_scan | compliance-officer | T2 |
| i18n | i18n_validate | — | T2 |
| Infrastructure | infra_scan | — | T2 |
| Observability | observability_check | — | T3 |
| Documentation | doc_coverage | — | T3 |
| Concurrency | — | concurrency-verifier | T3 |
Tools are deterministic (zero cost, always run). Agents are LLM-powered (only at sufficient tier).
Hybrid Scanning
Pattern scanning uses a 3-layer defense against false positives:
- Regex first pass: fast (<1ms/file), catches candidates
- scan-ignore pragma: // scan-ignore suppresses self-referential matches
- AST second pass: precise (<50ms/file), removes comment/string matches, analyzes control flow
The perf_scan tool uses hybrid scanning: regex detects while(true), AST verifies if break/return exists.
Program mode (ts.createProgram()) enables cross-file analysis: unused export detection and import cycle detection via dependency graph DFS.
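The three layers can be illustrated with a simplified scanner. This is a sketch under stated assumptions: the function names are hypothetical, and the third layer blanks string literals and line comments with regexes as a stand-in for a real AST pass. It does not perform the control-flow check (break/return detection) described above.

```javascript
// Sketch only: layer 3 is a regex approximation, NOT a real AST analysis.
const CANDIDATE = /while\s*\(\s*true\s*\)/;

// Blank out string literals first, then drop line comments.
function stripCommentsAndStrings(line) {
  return line
    .replace(/"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g, '""')
    .replace(/\/\/.*$/, "");
}

// Returns the 1-based line numbers that survive all three layers.
function scanForInfiniteLoops(source) {
  const findings = [];
  source.split("\n").forEach((line, index) => {
    if (!CANDIDATE.test(line)) return; // layer 1: cheap regex candidates
    if (line.includes("scan-ignore")) return; // layer 2: pragma opt-out
    if (!CANDIDATE.test(stripCommentsAndStrings(line))) return; // layer 3
    findings.push(index + 1);
  });
  return findings;
}

const scanSample = [
  "while (true) { poll(); }",
  'const msg = "while(true) is risky";',
  "// while(true) appears only in this comment",
  "while (true) { tick(); } // scan-ignore",
].join("\n");

console.log(scanForInfiniteLoops(scanSample)); // [ 1 ]
```

Only the first line is reported: the second match sits inside a string, the third inside a comment, and the fourth is suppressed by the pragma.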
Fitness Score Engine
Inspired by Karpathy's autoresearch: what can be measured is computed, not asked of the LLM.
Seven components combine into a 0.0–1.0 fitness score:
| Component | Weight | Input |
|-----------|--------|-------|
| Type Safety | 0.20 | as any count per KLOC |
| Test Coverage | 0.20 | Line + branch coverage |
| Pattern Scan | 0.20 | HIGH-severity findings |
| Build Health | 0.15 | tsc + eslint pass rate |
| Complexity | 0.10 | Avg cyclomatic complexity |
| Security | 0.10 | Vulnerability findings |
| Dependencies | 0.05 | Outdated/vulnerable deps |
The FitnessLoop gates LLM audit with 3 decisions:
- auto-reject: score drop >0.15 or absolute <0.3 → skip LLM audit (cost savings)
- self-correct: mild drop (0.05–0.15) → warn agent, continue
- proceed: stable/improved → update baseline, continue to audit
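The weights and the three decisions above combine roughly as follows. This is a minimal sketch: the key names and function names are hypothetical, each component is assumed to be already normalized to 0.0–1.0, and drops smaller than 0.05 are assumed to count as "stable".

```javascript
// Weights copied from the table above; the key names are illustrative.
const FITNESS_WEIGHTS = {
  typeSafety: 0.2,
  testCoverage: 0.2,
  patternScan: 0.2,
  buildHealth: 0.15,
  complexity: 0.1,
  security: 0.1,
  dependencies: 0.05,
};

// Weighted sum of component scores, each assumed to be in [0, 1].
function fitnessScore(components) {
  let total = 0;
  for (const [name, weight] of Object.entries(FITNESS_WEIGHTS)) {
    total += weight * components[name];
  }
  return total;
}

// Gate decision from the new score and the stored baseline.
function fitnessGate(score, baseline) {
  const drop = baseline - score;
  if (drop > 0.15 || score < 0.3) return "auto-reject"; // skip the LLM audit
  if (drop >= 0.05) return "self-correct"; // warn the agent, continue
  return "proceed"; // stable or improved (assumes drops < 0.05 are stable)
}

console.log(fitnessGate(0.5, 0.8)); // "auto-reject"
console.log(fitnessGate(0.7, 0.8)); // "self-correct"
console.log(fitnessGate(0.85, 0.8)); // "proceed"
```

Checking the cheap deterministic score first means the LLM auditor is only invoked for changes that have not already failed on measurable grounds.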
Conditional Trigger
Not every change needs full consensus. A 13-factor scoring system (6 base + domain + plan + fitness + blast radius + velocity + stagnation + interaction multipliers) determines the audit level:
| Tier | Score | Mode |
|------|-------|------|
| T1 | < 0.3 | Skip (micro change) |
| T2 | 0.3–0.7 | Simple (single auditor) |
| T3 | > 0.7 | Deliberative (3-role) |
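The tier thresholds can be read as a simple mapping from trigger score to audit mode. The sketch below covers only that final step; `tierFor` is a hypothetical name, and how the individual factors are scored is not shown here.

```javascript
// Maps a trigger score in [0, 1] to an audit tier, per the table above.
function tierFor(score) {
  if (score < 0.3) return "T1"; // skip (micro change)
  if (score <= 0.7) return "T2"; // simple (single auditor)
  return "T3"; // deliberative (3-role)
}

console.log(tierFor(0.1)); // "T1"
console.log(tierFor(0.5)); // "T2"
console.log(tierFor(0.9)); // "T3"
```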
3-Layer Adapter Pattern (v0.4.2)
Shared business logic across adapters. Only I/O differs per runtime:
I/O (adapters/{adapter}/)
Claude Code: hookSpecificOutput, permissionDecision
Gemini CLI: JSON-only stdout, hookSpecificOutput
Codex CLI: .codex/hooks.json, config.toml
↓ readStdinJson() + withBridge() + createHookContext()
Business Logic (adapters/shared/ — 17 modules)
hook-runner, hook-loader, trigger-runner, ndjson-parser,
cli-adapter, mux-adapter, jsonrpc-client, sdk-tool-bridge, ...
↓ bridge.init() + bridge.checkHookGate()
Core (core/)
audit, tools (22 MCP), EventStore, bus, providers
Adding a new adapter requires ~280 lines (as demonstrated by the Codex adapter).
HookRunner Engine (v0.4.2)
User-defined hooks. Configure in config.json or HOOK.md:
// .claude/quorum/config.json
{
"hooks": {
"audit.submit": [
{ "name": "freeze-guard", "handler": { "type": "command", "command": "node scripts/check-freeze.mjs" } }
]
}
}
Supports command and http handlers, env interpolation ($VAR, ${VAR}), deny-first-break, async fire-and-forget, and regex matcher filtering.
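Two of these behaviors, env interpolation and deny-first-break, can be sketched in a few lines. Everything here is an assumption for illustration: the function names, the hook object shape (`name`, `matcher`, `run`), matching the regex against the event name, and leaving unknown variables untouched are not taken from quorum's source.

```javascript
// Expands $VAR and ${VAR}; unknown variables are left as-is (an assumption).
function interpolate(template, env) {
  return template.replace(/\$\{(\w+)\}|\$(\w+)/g, (match, braced, bare) => {
    const value = env[braced ?? bare];
    return value === undefined ? match : String(value);
  });
}

// Runs hooks in order and stops at the first "deny" (deny-first-break).
function runHooks(hooks, eventName, payload) {
  for (const hook of hooks) {
    // Hypothetical: the optional regex matcher filters by event name.
    if (hook.matcher && !new RegExp(hook.matcher).test(eventName)) continue;
    if (hook.run(payload) === "deny") {
      return { decision: "deny", by: hook.name };
    }
  }
  return { decision: "allow", by: null };
}

console.log(interpolate("node $SCRIPT --root ${ROOT}/src", { SCRIPT: "check.mjs", ROOT: "/repo" }));
// node check.mjs --root /repo/src
```

Stopping at the first denial keeps later hooks from running side effects for an event that has already been blocked.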
Multi-Model NDJSON Protocol (v0.4.2)
Unified parsing of 3 CLI runtime outputs:
| Runtime | Format | Adapter |
|---------|--------|---------|
| Claude Code | stream-json | ClaudeCliAdapter |
| Codex | exec --json | CodexCliAdapter |
| Gemini | stream-json | GeminiCliAdapter |
All outputs are normalized to AgentOutputMessage (assistant_chunk, tool_use, tool_result, complete, error). MuxAdapter bridges ProcessMux (tmux/psmux) sessions for real-time cross-model consensus.
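NDJSON streams arrive in arbitrary chunks, so a parser has to buffer partial lines before decoding. The sketch below shows that generic technique only: `createNdjsonParser` is a hypothetical name, and the per-runtime mapping into AgentOutputMessage is not reproduced. Unparseable lines are surfaced as a message with type "error", which is an assumption about shape.

```javascript
// Generic chunk-buffered NDJSON parser (illustrative, not quorum's parser).
function createNdjsonParser(onMessage) {
  let buffer = "";
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // keep the trailing partial line for the next chunk
      for (const line of lines) {
        if (line.trim() === "") continue;
        try {
          onMessage(JSON.parse(line));
        } catch {
          onMessage({ type: "error", raw: line }); // hypothetical error shape
        }
      }
    },
  };
}

// Example: one message is split across two chunks.
const ndjsonSeen = [];
const ndjsonParser = createNdjsonParser((message) => ndjsonSeen.push(message));
ndjsonParser.push('{"type":"assistant_chunk","text":"hi"}\n{"type":"comp');
ndjsonParser.push('lete"}\nnot json\n');
console.log(ndjsonSeen.map((message) => message.type));
// [ 'assistant_chunk', 'complete', 'error' ]
```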
Stagnation Detection
If the audit loop cycles without progress, 5 patterns are detected:
- Spinning: same verdict 3+ times
- Oscillation: approve → reject → approve → reject
- No drift: identical rejection codes repeating
- Diminishing returns: improvement rate declining
- Fitness plateau: fitness score slope ≈ 0 over last N evaluations
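Three of these patterns can be expressed as small predicates over the verdict and fitness history. This is a sketch with assumed details: the function names, the four-verdict window for oscillation, the five-sample window for the plateau, and the 0.01 slope tolerance are all illustrative choices.

```javascript
// Spinning: the last `threshold` verdicts are all identical.
function isSpinning(verdicts, threshold = 3) {
  if (verdicts.length < threshold) return false;
  const tail = verdicts.slice(-threshold);
  return tail.every((verdict) => verdict === tail[0]);
}

// Oscillation: the last four verdicts alternate (A, B, A, B).
function isOscillating(verdicts) {
  const tail = verdicts.slice(-4);
  if (tail.length < 4) return false;
  return tail[0] !== tail[1] && tail[0] === tail[2] && tail[1] === tail[3];
}

// Least-squares slope of values against their index.
function slope(values) {
  const n = values.length;
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((sum, y) => sum + y, 0) / n;
  let numerator = 0;
  let denominator = 0;
  values.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return denominator === 0 ? 0 : numerator / denominator;
}

// Fitness plateau: slope over the last `window` scores is close to zero.
function isPlateau(scores, window = 5, epsilon = 0.01) {
  if (scores.length < window) return false;
  return Math.abs(slope(scores.slice(-window))) < epsilon;
}

console.log(isSpinning(["rejected", "rejected", "rejected"])); // true
console.log(isOscillating(["approved", "rejected", "approved", "rejected"])); // true
console.log(isPlateau([0.6, 0.6, 0.6, 0.6, 0.6])); // true
```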
Blast Radius Analysis (v0.4.0)
BFS on the reverse import graph computes transitive dependents of changed files:
quorum tool blast_radius --changed_files '["core/bridge.mjs"]'
# → 12/95 files affected (12.6%) — depth-sorted impact list
- 10th trigger factor: ratio > 10% → score += up to 0.15 (auto-escalation to T3)
- Pre-verify evidence: blast radius section included in auditor evidence
- Reuses buildRawGraph(), extracted from dependency_graph (TTL-cached)
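The BFS over the reverse import graph can be sketched as follows. The input shape (a map from each file to the files it imports), the function name, and every file name other than core/bridge.mjs are hypothetical. Excluding the changed files themselves from the affected count is also an assumption.

```javascript
// imports: { file: [files it imports] }. Every file must appear as a key.
function blastRadius(imports, changedFiles) {
  // Build the reverse graph: dependency -> files that import it.
  const reverse = new Map();
  for (const [file, deps] of Object.entries(imports)) {
    for (const dep of deps) {
      if (!reverse.has(dep)) reverse.set(dep, []);
      reverse.get(dep).push(file);
    }
  }
  // BFS from the changed files, recording the depth of each dependent.
  const depth = new Map(changedFiles.map((file) => [file, 0]));
  const queue = [...changedFiles];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of reverse.get(current) ?? []) {
      if (depth.has(dependent)) continue; // already reached at a shallower depth
      depth.set(dependent, depth.get(current) + 1);
      queue.push(dependent);
    }
  }
  const affected = [...depth]
    .filter(([, d]) => d > 0) // assumption: changed files are not counted
    .sort((a, b) => a[1] - b[1])
    .map(([file, d]) => ({ file, depth: d }));
  return { affected, ratio: affected.length / Object.keys(imports).length };
}

const sampleImports = {
  "core/bridge.mjs": [],
  "adapters/a.mjs": ["core/bridge.mjs"],
  "adapters/b.mjs": ["adapters/a.mjs"],
  "cli/main.mjs": ["adapters/b.mjs", "core/bridge.mjs"],
  "docs/unrelated.mjs": [],
};
const radius = blastRadius(sampleImports, ["core/bridge.mjs"]);
console.log(radius.affected.map((entry) => `${entry.file}@${entry.depth}`));
// [ 'adapters/a.mjs@1', 'cli/main.mjs@1', 'adapters/b.mjs@2' ]
```

In this example three of five files are affected, a ratio of 0.6, which would exceed the 10% threshold mentioned above.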
Structured Orchestration (v0.4.0)
Multi-agent coordination for parallel worktree execution:
| Component | Purpose |
|-----------|---------|
| ClaimService | Per-file ownership (INSERT...ON CONFLICT), TTL-based expiry |
| ParallelPlanner | Graph coloring for conflict-free execution groups |
| OrchestratorMode | Auto-selects: serial / parallel / fan-out / pipeline / hybrid |
| Auto-learning | Detects repeat rejection patterns (3+), suggests CLAUDE.md rules |
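Conflict-free execution groups can be derived with a greedy first-fit graph coloring, where two tasks conflict if they touch the same file. This sketch shows that general technique under assumed names and task shapes (`planGroups`, `{ id, files }`). It is not the ParallelPlanner's actual algorithm or data model.

```javascript
// tasks: [{ id, files: [...] }]. Tasks sharing a file may not run together.
// First-fit coloring: each group is one "color" of the conflict graph.
function planGroups(tasks) {
  const groups = []; // each entry: { ids: [], files: Set }
  for (const task of tasks) {
    // A task fits a group if none of its files are already claimed there.
    let group = groups.find((g) => task.files.every((file) => !g.files.has(file)));
    if (!group) {
      group = { ids: [], files: new Set() };
      groups.push(group);
    }
    group.ids.push(task.id);
    task.files.forEach((file) => group.files.add(file));
  }
  return groups.map((g) => g.ids);
}

const plannedGroups = planGroups([
  { id: "t1", files: ["a.ts"] },
  { id: "t2", files: ["a.ts", "b.ts"] },
  { id: "t3", files: ["c.ts"] },
  { id: "t4", files: ["b.ts"] },
]);
console.log(plannedGroups); // [ [ 't1', 't3', 't4' ], [ 't2' ] ]
```

Tasks within a group can run in parallel worktrees; the groups themselves run one after another. Greedy coloring does not guarantee the minimum number of groups, only that each group is conflict-free.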
Event Reactor (v0.4.0)
respond.mjs was rewritten as a pure event reactor: it reads SQLite verdict events and executes side effects only, with no markdown reads or writes. The refactor removed 1043 lines and added 211.
Dynamic Escalation
The tier router tracks failure history per task:
- 2 consecutive failures → escalate to higher tier
- 2 consecutive successes → downgrade back
- Frontier failures → stagnation signal
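The escalation rules can be modeled as a small state machine per task. This is a sketch with several assumptions: `createTierRouter` is a hypothetical name, the tier labels are passed in by the caller, "frontier" is taken to mean the highest tier, and a downgrade never goes below the task's starting tier.

```javascript
// Hypothetical per-task router; the tier names are supplied by the caller.
function createTierRouter(tiers, startTier) {
  const base = tiers.indexOf(startTier);
  let level = base;
  let failures = 0;
  let successes = 0;
  return {
    // Record one outcome and return the tier to use next.
    record(ok) {
      if (ok) {
        successes++;
        failures = 0;
      } else {
        failures++;
        successes = 0;
      }
      let stagnation = false;
      if (failures >= 2) {
        if (level < tiers.length - 1) level++; // escalate
        else stagnation = true; // already at the frontier: signal stagnation
        failures = 0;
      } else if (successes >= 2) {
        if (level > base) level--; // downgrade, never below the start
        successes = 0;
      }
      return { tier: tiers[level], stagnation };
    },
  };
}

const router = createTierRouter(["T1", "T2", "T3"], "T2");
router.record(false);
console.log(router.record(false)); // { tier: 'T3', stagnation: false }
```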
Planner Documents
The planner skill produces 10 document types for structured project planning:
| Document | Level | Purpose |
|----------|-------|---------|
| PRD | Project | Product requirements — problem, goals, features, acceptance criteria |
| Execution Order | Project | Track dependency graph — which tracks to execute first |
| Work Catalog | Project | All tasks across all tracks with status and priority |
| ADR | Project | Architecture Decision Records — why, not just what |
| Track README | Track | Track scope, goals, success criteria, constraints |
| Work Breakdown | Track | Task decomposition — ### [task-id] blocks with depends_on/blocks |
| API Contract | Track | Endpoint specs, request/response schemas, auth |
| Test Strategy | Track | Test plan — unit/integration/e2e scope, coverage targets |
| UI Spec | Track | Component hierarchy, states, interactions |
| Data Model | Track | Entity relationships, schemas, migrations |
Providers
quorum is provider-agnostic. Bring your own auditor.
| Provider | Mechanism | Hooks | Plugin needed? |
|----------|-----------|-------|---------------|
| Claude Code | 22 native hooks | SessionStart, PreToolUse, PostToolUse, Stop, PermissionRequest, Notification, ... | Optional (auto-triggers) |
| Gemini CLI | 11 hooks + 8 skills | SessionStart, BeforeAgent, AfterAgent, BeforeTool, AfterTool, BeforeModel, ... | Optional (gemini extensions install) |
| Codex CLI | 5 hooks | SessionStart, Stop, UserPromptSubmit, AfterAgent, AfterToolUse | Optional (.codex/hooks.json) |
| Manual | quorum audit | — | No |
Tools & Verification
Deterministic tools that replace LLM judgment with facts. No hallucination possible.
Analysis tools (19):
# Core analysis
quorum tool code_map src/ # symbol index
quorum tool dependency_graph . # import DAG, cycles
quorum tool blast_radius --changed_files '["src/api.ts"]' # transitive impact (v0.4.0)
quorum tool audit_scan src/ # type-safety, hardcoded patterns
quorum tool coverage_map # per-file test coverage
quorum tool audit_history --summary # verdict patterns
quorum tool ai_guide # context-aware onboarding (v0.4.0)
# RTM & verification
quorum tool rtm_parse docs/rtm.md # parse RTM → structured rows
quorum tool rtm_merge --base a --updates '["b"]' # merge worktree RTMs
quorum tool fvm_generate /project # FE×API×BE access matrix
quorum tool fvm_validate --fvm_path x --base_url http://localhost:3000 --credentials '{}'
# Domain specialists (v0.3.0)
quorum tool perf_scan src/ # performance anti-patterns (hybrid: regex+AST)
quorum tool compat_check src/ # API breaking changes
quorum tool a11y_scan src/ # accessibility (JSX/TSX)
quorum tool license_scan . # license compliance + PII
quorum tool i18n_validate . # locale key parity
quorum tool infra_scan . # Dockerfile/CI security
quorum tool observability_check src/ # empty catch, logging gaps
quorum tool doc_coverage src/ # JSDoc coverage %
Verification pipeline (quorum verify):
quorum verify # all checks
quorum verify CQ # code quality (eslint)
quorum verify SEC # OWASP security (10 patterns, semgrep if available)
quorum verify LEAK # secrets in git (gitleaks if available, built-in fallback)
quorum verify DEP # dependency vulnerabilities (npm audit)
quorum verify SCOPE # diff vs evidence match
Full reference: docs/en/TOOLS.md | docs/ko/TOOLS.md
Tests
npm test # 1055 tests
npm run typecheck # TypeScript check
npm run build # compile
CI/CD
GitHub Actions builds cross-platform binaries on tag push:
git tag v0.4.2
git push origin v0.4.2
# → linux-x64, darwin-x64, darwin-arm64, win-x64 binaries in Releases
License
MIT
