magi-ai
v0.1.4
MAGI System - Multi-AI CLI Deliberation Framework (Evangelion-inspired, unofficial fan-made)
MAGI System
When one AI isn't enough — Three AI models debate, cross-examine, and vote. Catch what a single model misses.
Unofficial fan project inspired by Neon Genesis Evangelion. Not affiliated with Khara, Inc.
Try It in 60 Seconds
No API keys needed. Watch a pre-recorded 3-body deliberation with the Evangelion TUI:
npx magi-ai demo
Ready to deliberate for real? See Quick Start.
Why Multi-AI Consensus?
Every AI model has blind spots. Claude is thorough but verbose. GPT is practical but sometimes overconfident. Gemini is cautious but can be vague. Using any single model means inheriting its biases undetected.
MAGI forces disagreement to surface. When three models independently analyze the same code and two flag a security issue that the third missed, the issue is very likely real and worth a closer look. When all three agree, you can act with higher confidence than any single review provides.
| Single-model review | MAGI 3-body review |
|---------------------|--------------------|
| One perspective, one bias | Three perspectives, biases cancel out |
| "Looks good to me" | 2/3 APPROVE, 1 REJECT (dissent recorded) |
| You trust the model | The models verify each other |
| Silent failures | Cross-examination catches oversights |
What is MAGI?
In Neon Genesis Evangelion, the MAGI supercomputer consists of three systems, each imprinted with a different aspect of Dr. Naoko Akagi's personality, making decisions through collective deliberation.
This project recreates that architecture: three AI models from different providers analyze the same problem from distinct perspectives. Cross-validation between models offsets provider-specific biases, aiming for a judgment quality that a single model is unlikely to reach alone.
| MAGI Unit | CLI Tool | Persona | Focus |
|-----------|----------|---------|-------|
| MELCHIOR-1 | Claude Code (Anthropic) | Scientist | Logic, consistency, structured reasoning |
| BALTHASAR-2 | Codex CLI (OpenAI) | Engineer | Practicality, code quality |
| CASPER-3 | Gemini CLI (Google) | Auditor | Safety, risk, holistic perspective |
N-body support: Defaults to 3 units, but supports 2-7 arbitrary AI units for deliberation.
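As a rough illustration of the N-body fan-out, here is a minimal sketch that runs 2-7 units in parallel with `Promise.allSettled()` and records a failed unit as ABSTAIN. The `Unit`, `Opinion`, and `collectOpinions` names are hypothetical and are not taken from the magi-ai API; how the real engine treats a failed unit is an assumption.

```typescript
// Hypothetical types for illustration only; not the magi-ai API.
type Verdict = "APPROVE" | "REJECT" | "ABSTAIN";

interface Unit {
  name: string;
  ask: (prompt: string) => Promise<Verdict>;
}

interface Opinion {
  unit: string;
  verdict: Verdict;
}

// Query every unit in parallel. A unit whose call rejects is recorded as
// ABSTAIN (an assumption) so one failing CLI does not abort the round.
async function collectOpinions(units: Unit[], prompt: string): Promise<Opinion[]> {
  if (units.length < 2 || units.length > 7) {
    throw new RangeError(`expected 2-7 units, got ${units.length}`);
  }
  const settled = await Promise.allSettled(units.map((u) => u.ask(prompt)));
  return settled.map((result, i) => ({
    unit: units[i].name,
    verdict: result.status === "fulfilled" ? result.value : "ABSTAIN",
  }));
}
```

Using `Promise.allSettled()` rather than `Promise.all()` means the round always completes with one entry per unit, even when a provider is down.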
How It Works
Phase 1: INITIAL OPINION (parallel)
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ MELCHIOR │ │ BALTHASAR │ │ CASPER │
│ (Claude) │ │ (Codex) │ │ (Gemini) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
└────── Promise.allSettled() ─────┘
│
Early exit if unanimous
│
Phase 2: CROSS-EXAMINATION (parallel)
Each unit critiques the other two — may change position
│
Phase 3: FINAL VOTE (parallel)
Final vote informed by full deliberation history → Consensus Engine decides
│
┌───────▼───────┐
│ MAGI DECISION │
└───────────────┘
| Decision | Condition | Outcome |
|----------|-----------|---------|
| Unanimous | 3/3 agree | Immediate adoption |
| Majority | 2/3 agree | Adopted (dissent recorded) |
| Deadlock | All disagree | Tiebreak / Human-in-the-loop |
| No quorum | 2+ ABSTAIN | Unable to decide |
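The decision table can be sketched as a small tallying function. This is an illustration under stated assumptions, not the actual Consensus Engine: only `MAJORITY_APPROVE` appears in this README, so the other decision names, the `decide` function, and the order in which the rules are checked are hypothetical.

```typescript
// Hypothetical names for illustration; only MAJORITY_APPROVE is taken from the README.
type Verdict = "APPROVE" | "REJECT" | "ABSTAIN";
type Decision =
  | "UNANIMOUS_APPROVE"
  | "UNANIMOUS_REJECT"
  | "MAJORITY_APPROVE"
  | "MAJORITY_REJECT"
  | "DEADLOCK"
  | "NO_QUORUM";

function decide(votes: Verdict[]): Decision {
  const count = (v: Verdict): number => votes.filter((x) => x === v).length;
  const approve = count("APPROVE");
  const reject = count("REJECT");
  const abstain = count("ABSTAIN");

  // No quorum: nothing to tally, or two or more units abstained.
  if (votes.length === 0 || abstain >= 2) return "NO_QUORUM";
  // Unanimous: every unit cast the same non-abstain vote.
  if (approve === votes.length) return "UNANIMOUS_APPROVE";
  if (reject === votes.length) return "UNANIMOUS_REJECT";
  // Majority: strictly more than half of all units agree.
  if (approve * 2 > votes.length) return "MAJORITY_APPROVE";
  if (reject * 2 > votes.length) return "MAJORITY_REJECT";
  // Anything else has no winning side and goes to tiebreak or a human.
  return "DEADLOCK";
}
```

With three units and two possible positions, a deadlock in this sketch only arises when one unit abstains and the other two split, for example `decide(["APPROVE", "REJECT", "ABSTAIN"])`.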
Use Cases
| Scenario | Command | What MAGI adds |
|----------|---------|----------------|
| PR code review | magi review src/auth.ts | 3 models catch different bug classes |
| Architecture decision | magi decide "PostgreSQL vs MongoDB?" | Structured pros/cons from 3 perspectives |
| Security audit | magi deliberate "Audit this endpoint" | Cross-validation reduces false negatives |
| Technical debt assessment | /ops prophecy src/legacy/ | Random forest prediction from git history |
| CI/CD gate | magi review --output json | Machine-readable consensus for merge gates |
| Team tie-breaking | magi decide "Monorepo migration?" | Objective multi-perspective arbitration |
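For the CI/CD gate row, a merge gate could map the JSON output to a process exit code. The sketch below assumes the JSON mirrors the `result.consensus.decision` field shown in the Programmatic API section and that approving decisions end in `_APPROVE`; both are assumptions, and `gateExitCode` is a hypothetical helper, not part of magi-ai.

```typescript
// Assumed output shape, mirroring `result.consensus.decision` from the programmatic API.
interface GateInput {
  consensus?: { decision?: unknown };
}

// Returns 0 to allow the merge, 1 to block it, 2 when the output is not valid JSON.
function gateExitCode(jsonText: string): number {
  let parsed: GateInput | null;
  try {
    parsed = JSON.parse(jsonText) as GateInput | null;
  } catch {
    return 2; // unparseable output: treat as an infrastructure failure
  }
  const decision = parsed?.consensus?.decision;
  // Assumption: approving decisions are strings ending in "_APPROVE".
  return typeof decision === "string" && decision.endsWith("_APPROVE") ? 0 : 1;
}
```

Keeping a separate exit code for unparseable output lets a pipeline distinguish "the models rejected this change" from "the review itself did not run".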
Quick Start
Prerequisites
- Node.js v24+ recommended (v20+ minimum)
- The following 3 AI CLIs installed and authenticated:
# Install (either one)
curl -fsSL https://claude.ai/install.sh | bash # Recommended
npm install -g @anthropic-ai/claude-code # Via Node.js
# Auth — browser authentication opens on first launch
claude
Requires a paid plan (Pro / Max / Teams / Enterprise). → Official docs
# Install
npm install -g @openai/codex
# Auth — setup wizard starts on first launch
codex
Requires a ChatGPT subscription (Plus / Pro / Team / Enterprise). → Official repo
# Install
npm install -g @google/gemini-cli
# Auth — select "Sign in with Google" on first launch
gemini
Free with a Google account (60 req/min, 1,000 req/day). → Official repo
Install
npm install -g magi-ai
Verify
magi doctor
✓ MELCHIOR: 2.x.x (Claude Code)
✓ BALTHASAR: codex-cli 0.x.x
✓ CASPER: 0.x.x
Start
magi
Usage
On launch, a NERV boot sequence plays and the interactive REPL starts. Type a question in natural language and a 3-body deliberation begins, with the full-screen TUI showing the process in real time.
╔════════════════════════════════════════════╗
║ M A G I S Y S T E M ║
║ REPL INTERFACE READY ║
╚════════════════════════════════════════════╝
Type a question to deliberate. /help for commands.
MAGI[01|NORM|3/3|CTX:0]> Should we mass-adopt this new ORM?
[Full-screen TUI → NERV boot → 3-body deliberation → result]
╔═══════════════════════════════════════════════╗
║ MAGI DECISION: MAJORITY_APPROVE CONF 82% ║
╠═══════════════════════════════════════════════╣
║ MELCHIOR ✓ APPROVE ████████░░ 84% ║
║ BALTHASAR ✓ APPROVE ██████████ 95% ║
║ CASPER ✗ REJECT ██████░░░░ 67% ║
╚═══════════════════════════════════════════════╝
MAGI[02|NORM|3/3|CTX:1]> Can you elaborate on the risks?
[Re-deliberation with previous context auto-injected]
Type any question to start deliberation. Slash commands provide specialized operations:
| Command | Action |
|---------|--------|
| /review <file> | Code review deliberation (Tab completion) |
| /decide <question> | Architecture decision |
| /berserk <prompt> | BERSERK mode (5 strategies x N units deathmatch) |
| /status | System dashboard |
| /export [file] | Export deliberation results |
| /ops watch start | Start Angel detection daemon |
| /diag evolve | S² Engine self-evolution |
| /admin self-destruct <reason> | Self-destruct sequence (unanimous required) |
| /help | Full command list |
Context persistence: The last 5 deliberation summaries are retained and automatically injected into follow-up questions. Project context (git status, package.json, related files) is also auto-collected.
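The retention behaviour described above can be pictured as a bounded queue of summaries that is prepended to each follow-up question. This is a minimal sketch: the `ContextWindow` class and the prompt layout are hypothetical, and only the limit of 5 comes from the text.

```typescript
// Hypothetical helper for illustration; not the magi-ai implementation.
class ContextWindow {
  private readonly summaries: string[] = [];
  private readonly limit: number;

  constructor(limit = 5) {
    this.limit = limit;
  }

  // Store a summary, discarding the oldest one once the limit is exceeded.
  add(summary: string): void {
    this.summaries.push(summary);
    if (this.summaries.length > this.limit) this.summaries.shift();
  }

  get size(): number {
    return this.summaries.length;
  }

  // Prepend retained summaries to a follow-up question (the layout is an assumption).
  buildPrompt(question: string): string {
    if (this.summaries.length === 0) return question;
    const history = this.summaries.map((s, i) => `[${i + 1}] ${s}`).join("\n");
    return `Previous deliberations:\n${history}\n\nQuestion: ${question}`;
  }
}
```

In this sketch, `size` plays the role of the `CTX:n` counter shown in the REPL prompt.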
MCP Server (Claude Code Integration)
An .mcp.json is included at the repo root. Restart Claude Code and the 4 tools (magi_deliberate, magi_review, magi_decide, magi_doctor) become available automatically. Verify with MCP Inspector: npx @modelcontextprotocol/inspector magi-mcp
Programmatic API
import { Magi } from 'magi-ai';
const magi = new Magi();
const result = await magi.deliberate({
  type: 'code-review',
  title: 'Review auth middleware',
  artifacts: [{ type: 'file', path: 'src/auth.ts', content: '...', language: 'typescript' }],
});
console.log(result.consensus.decision); // => 'MAJORITY_APPROVE'
Non-interactive commands (magi deliberate, magi review, magi decide) are also available for CI/scripts. See magi --help.
Features
Evangelion TUI
An Evangelion-themed full-screen interface launches by default. From the NERV boot sequence to the inverted-Y 3-panel vote visualization, it faithfully recreates the feel of the original. Zero additional dependencies (raw ANSI + chalk).
- NERV boot sequence (~1.5s)
- Inverted-Y panels — BALTHASAR (top center), CASPER (bottom left), MELCHIOR (bottom right)
- Breathing animation — panel borders pulse on a sin() curve while thinking
- Cascading vote reveal + decision stamps
- BERSERK flash — red flash on berserk warning
- Detail overlay — press 1/2/3 to view per-unit reasoning
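The breathing animation can be approximated by mapping elapsed time through a sine wave and scaling the border colour with the result. The following is a sketch only: the function names, the 2-second period, the brightness range, and the orange base colour are all assumptions, not values from the TUI source.

```typescript
// Map elapsed time to a brightness level in [min, max] using a sine wave.
// Period and range defaults are illustrative assumptions.
function breathingLevel(elapsedMs: number, periodMs = 2000, min = 0.3, max = 1.0): number {
  const phase = (2 * Math.PI * elapsedMs) / periodMs;
  const unit = (Math.sin(phase) + 1) / 2; // rescale sin from [-1, 1] to [0, 1]
  return min + (max - min) * unit;
}

// Build a 24-bit ANSI foreground escape for the border, dimmed by `level`.
function borderColor(level: number, rgb: [number, number, number] = [255, 140, 0]): string {
  const [r, g, b] = rgb.map((c) => Math.round(c * level));
  return `\x1b[38;2;${r};${g};${b}m`;
}
```

A render loop would call `borderColor(breathingLevel(Date.now() - startedAt))` on each frame while a unit is thinking.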
Soul System — 20 Subsystems That Give MAGI a Soul
20 subsystems map Evangelion lore onto technically meaningful features, including:
- Memory & Personality (EngRam) — TF-IDF similarity search, 3-layer memory, 12-dim drift detection
- Sync Rate — Beta-distribution Bayesian estimation for task-fitness tracking
- A.T. Field — Quantitative groupthink bias detection and neutralization
- BERSERK Mode — 5 strategies x N units in parallel, fitness-based selection deathmatch
- Instrumentality — MoA weighted reasoning fusion for deadlock resolution
- LCL — Phase-based information density control + 4-type hallucination detection/purification
- Angel Detection — Git-diff-based 6-type code threat pattern detection
- Dead Sea Scrolls — Random forest (50 trees) technical debt prediction
- Type-666 Firewall — 4-layer defense-in-depth firewall
- Self-Destruct Sequence — BFT f=0 unanimous irreversible operation ritual
- S² Engine — Self-diagnosis, improvement proposals, meta-deliberation self-evolution
- SEELE Council — PBFT+Raft distributed consensus protocol
- Neon Genesis — Fisher-Yates memory survival full reset
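To make one of these concrete, the Sync Rate item names Beta-distribution Bayesian estimation. A minimal sketch of that technique is a Beta-Bernoulli update, where each task outcome increments one of the two parameters. The `SyncRate` type, the uniform Beta(1, 1) prior, and the success/failure signal are assumptions for illustration; the actual subsystem may differ.

```typescript
// Beta(alpha, beta) belief about how often a unit succeeds on a task type.
interface SyncRate {
  alpha: number; // 1 + observed successes (under a uniform prior)
  beta: number;  // 1 + observed failures
}

// Uniform prior Beta(1, 1): no preference before any evidence (an assumption).
const uniformPrior = (): SyncRate => ({ alpha: 1, beta: 1 });

// Conjugate update: a success increments alpha, a failure increments beta.
function observe(rate: SyncRate, success: boolean): SyncRate {
  return success
    ? { alpha: rate.alpha + 1, beta: rate.beta }
    : { alpha: rate.alpha, beta: rate.beta + 1 };
}

// Posterior mean, usable as the displayed sync rate.
function mean(rate: SyncRate): number {
  return rate.alpha / (rate.alpha + rate.beta);
}

// Posterior variance, which shrinks as evidence accumulates.
function variance(rate: SyncRate): number {
  const n = rate.alpha + rate.beta;
  return (rate.alpha * rate.beta) / (n * n * (n + 1));
}
```

For example, three successes and one failure starting from the uniform prior give Beta(4, 2), whose mean is 4/6.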
Security
9-layer security hardening: Zod schema input validation → prompt injection prevention (9 patterns) → stdin delivery + dangerous flag detection → process sandbox (ENV/CMD allowlist) → SafeOpinionSchema output validation → SHA-256 hash-chain audit log → semaphore + timeouts → replay prevention nonce → random-salt encryption
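One of these layers, the SHA-256 hash-chain audit log, follows a well-known pattern: each entry stores the hash of the previous entry, so editing any past entry breaks verification from that point on. The sketch below shows the general technique using Node's built-in `node:crypto`; the `AuditEntry` shape, the all-zero genesis value, and the hashed field layout are assumptions and not the project's actual log format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape for illustration.
interface AuditEntry {
  index: number;
  payload: string;
  prevHash: string;
  hash: string;
}

const GENESIS = "0".repeat(64); // placeholder "previous hash" for the first entry

function hashEntry(index: number, payload: string, prevHash: string): string {
  return createHash("sha256").update(`${index}\n${prevHash}\n${payload}`).digest("hex");
}

// Return a new log with one entry appended and linked to the previous hash.
function append(log: AuditEntry[], payload: string): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const index = log.length;
  return [...log, { index, payload, prevHash, hash: hashEntry(index, payload, prevHash) }];
}

// Recompute every hash and link; any edited entry makes this return false.
function verify(log: AuditEntry[]): boolean {
  let prevHash = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const entry = log[i];
    if (entry.index !== i || entry.prevHash !== prevHash) return false;
    if (entry.hash !== hashEntry(entry.index, entry.payload, entry.prevHash)) return false;
    prevHash = entry.hash;
  }
  return true;
}
```

A chain like this detects tampering but does not prevent it; an attacker who can rewrite the whole log can also recompute every hash, so the latest hash is usually anchored somewhere outside the log.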
Auto-Context Collection
Automatically collects and injects project context (git status, package.json, directory tree) and related files (imports, test files) during deliberation. Opt out with --no-auto-context. Sensitive files (.env, .key, credentials, etc.) are excluded automatically.
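The sensitive-file exclusion can be sketched as a path filter applied before any file is added to the context. The patterns below are guesses derived from the examples named above (.env, .key, credentials); the real exclusion list is not documented here, and `isSensitive` and `filterContextFiles` are hypothetical names.

```typescript
// Hypothetical patterns based on the examples in the README; the real list may differ.
const SENSITIVE_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\.[^/]*)?$/i, // .env, .env.local, ...
  /\.key$/i,                 // private key files
  /credentials/i,            // anything with "credentials" in its path
];

function isSensitive(path: string): boolean {
  const normalized = path.replace(/\\/g, "/"); // treat Windows separators like POSIX ones
  return SENSITIVE_PATTERNS.some((re) => re.test(normalized));
}

// Keep only the files that are safe to inject into a deliberation prompt.
function filterContextFiles(paths: string[]): string[] {
  return paths.filter((p) => !isSensitive(p));
}
```

Anchoring the `.env` pattern to a path separator avoids false positives on ordinary source files such as `src/environment.ts`.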
Architecture
MAGI-system/
├── bin/ # CLI entry points (magi, magi-mcp, magi-benchmark)
├── src/
│ ├── index.ts # Public API (Magi class)
│ ├── types/ # TypeScript types & Zod schemas
│ ├── adapters/ # CLI wrappers (claude, codex, gemini)
│ ├── engine/ # Core deliberation (kernel, consensus, middleware)
│ │ ├── kernel/ # Deliberation loop, phase runner, unit executor
│ │ └── middleware/ # Koa-style chain (cache, firewall)
│ ├── parsers/ # Opinion extraction (JSON extractor, Zod schema, unstructured)
│ ├── pipelines/ # Task-specific flows (code-review, architecture, bug-analysis)
│ ├── tui/ # Evangelion full-screen TUI (raw ANSI, double-buffered)
│ ├── repl/ # Interactive REPL (19 slash commands, state machine)
│ ├── mcp/ # MCP server (4 tools, stdio transport)
│ ├── context/ # Auto-context collection (git, imports, tests)
│ ├── cache/ # SHA-256 keyed result cache with TTL
│ ├── metrics/ # Token usage tracking
│ ├── audit/ # SHA-256 hash chain audit logging
│ └── utils/ # Shared utilities (process sandbox, file validation)
├── test/ # 2135 tests (vitest)
│ ├── e2e/ # E2E with real CLIs (MAGI_E2E=1 gated)
│ ├── integration/ # Orchestrator + TUI integration
│ └── unit/ # Per-module unit tests
└── docs/ # User-facing documentation
Development
To develop from source:
git clone https://github.com/ryu-tada/MAGI-system.git
cd MAGI-system
npm install
npm test # 2135 tests
npm run typecheck # Type check
npm run build # Build
Documentation
- Practical Examples — 8 scenarios with commands and expected output
- Tuning Guide — Configuration reference and optimization strategies
- Troubleshooting — Common issues and solutions
- Benchmark Results — Scoring framework validation (run with real CLIs for production data)
- Contributing — Development setup, adding adapters/pipelines
- Changelog — Release history
Tech Stack
- Language: TypeScript (ESM)
- Runtime: Node.js v24+ recommended (v20+ minimum)
- Dependencies: commander, chalk, zod, @modelcontextprotocol/sdk
- Test: Vitest (2135 tests)
- Dev: tsx
Credits & Legal
"The MAGI's answer is — unanimous." — Ritsuko Akagi, Neon Genesis Evangelion
MAGI System is inspired by the MAGI supercomputer system from Neon Genesis Evangelion, created by Hideaki Anno and produced by Gainax / Khara, Inc.
Disclaimer
This is an unofficial, non-commercial, fan-made open-source project. It is not affiliated with, endorsed by, or connected to Khara, Inc. or any official Evangelion production.
Neon Genesis Evangelion and related names are copyrighted works of Khara, Inc. All rights reserved by their respective owners.
License
MIT — See LICENSE for details.
