karajan-code
v2.12.0
Local multi-agent coding orchestrator with TDD, SonarQube, and code review pipeline
v2.12.0 released — Quality-measurement release. Two new features land together: every `kj run` against a known plan now scores how faithfully the coder followed it (deterministic 0–100 plan adherence metric, four weighted components, rendered in `summary.md`), and a small golden-tasks regression suite (`todo-rest-api`, `npm-package-cli`, `react-counter-component`) catches output-quality drops between Karajan versions before npm publish. Plus the shrink-budget CI gate now exempts human-facing docs from its 200-LOC ceiling while keeping AI-rule files (CLAUDE.md, AGENTS.md, role prompts) capped. 3 PRs for plan adherence (#645–#647), 4 for golden tasks (#648, #650–#652), 1 for the CI policy (#649). 4522/4522 tests passing. Safe upgrade from 2.11.0.

v2.11.0 — Dogfooding pass release. A two-day pass through a 10-level test plan surfaced and fixed a long tail of UX papercuts and three latent bugs that only show up on fresh `/tmp` repos: the `SonarStage` no longer loops on remoteless projects (it was burning iterations until `max_iterations`-fallback-approval), the post-loop `commitAll` now tolerates the locale-specific "nothing to commit" race, the HU sub-pipeline branches off `master`/`HEAD` when the configured `main` doesn't exist, and `runFlow` now seals `session.status` at the boundary so `kj status` never shows zombie `running` runs again. Plus `hu-board` gains automatic ephemeral-project cleanup and an in-UI help modal for the five views. 14 PRs (#624–#637), 4452/4452 tests passing. Safe upgrade from 2.10.2.

v2.10.2 — Patch release. The `kj init` wizard expanded from 9 prompts to a full setup: per-role provider selection (10 roles, "inherit / pick CLI / disable"), automatic SonarQube token generation via REST API (no more web UI walkthrough), git automation flags (auto_commit/push/pr) and HU Board security (bind host + port). +16 new tests. Safe upgrade from 2.10.1.

v2.10.1 — Patch release. One-line fix for a stdout contamination bug in `kj audit --agent-readiness --json` (the `[info]` banner was breaking downstream `jq` pipes), plus polish in the asciinema demo scripts. Safe upgrade from 2.10.0.

v2.10.0 — Agent-readiness release. Karajan is now the first orchestrator with a full agent-readability surface: an `llms.txt` index at the root, a `SKILL.md` per CLI command under `docs/agents/`, and a static auditor that scores any third-party repo against the same shape. Highlights: (1) `kj audit --agent-readiness` scores any repo 0–100 across seven checks (llms.txt, robots AI-bot allowlist, page token budgets ≤ 32 KB, heading hierarchy, agents/README, SKILL.md coverage). LLM-free, deterministic, JSON-able. Karajan-on-Karajan: 100/100. (2) Six new SKILL.md files (kj doctor / init / board / review / resume / clean) under `docs/agents/`, all with the same What it does · Inputs · Outputs · Side effects · Failure modes · Example contract; CI guards that every link in `llms.txt` resolves. (3) Webperf quality gate inside the iteration loop (`pipeline.perf.enabled`): PASS continues, FAIL pushes blocking-metric feedback to the coder, scanner-missing skips best-effort. (4) HU Board hardening: binds 127.0.0.1 by default, opt-in `--bind 0.0.0.0` enforces an auto-generated token, helmet headers, rate limiting at 300 req/min — "safe by default on a coffee-shop WiFi". (5) a11y skills auto-route: tasks mentioning accessibility / WCAG / ARIA / screen reader / keyboard nav automatically pull the `frontend-ui-engineering` skill. (6) Asciinema demo scripts under `docs/demos/` so the recordings re-record per release instead of rotting. 5 PRs merged (#605–#609 + #610), 4358/4358 tests passing. See CHANGELOG.md for the full punch list.
You describe what you want to build. Karajan orchestrates multiple AI agents to plan it, implement it, test it, review it with SonarQube, and iterate. No babysitting required.
What is Karajan?
Karajan is a local coding orchestrator. It runs on your machine, uses your existing AI providers (Claude, Codex, Gemini, Aider, OpenCode), and coordinates a pipeline of specialized agents that work together on your code.
It is not a hosted service. It is not a VS Code extension. It is a tool you install once and use from the terminal or as an MCP server inside your AI agent.
The name comes from Herbert von Karajan, the conductor who believed that the best orchestras are made of great independent musicians who know exactly when to play and when to listen. Same idea, applied to AI agents.
Why not just use Claude Code?
Claude Code is excellent. Use it for interactive, session-based coding.
Use Karajan when you want:
- A repeatable, documented pipeline that runs the same way every time
- TDD by default. Tests are written before implementation, not after
- SonarQube integration. Code quality gates as part of the flow, not an afterthought
- Solomon as pipeline boss. Every reviewer rejection is evaluated by a supervisor that decides if it's valid or just style noise
- Multi-provider routing. Claude as coder, Codex as reviewer, or any combination
- Zero-config operation. Auto-detects test frameworks, starts SonarQube, simplifies pipeline for trivial tasks
- Composable role architecture. Agent behaviors defined as plain markdown files that travel with your project
- Local-first. Your code, your keys, your machine. No data leaves unless you say so
- Zero API costs. Karajan uses AI agent CLIs (Claude Code, Codex, Gemini CLI), not APIs. You pay your existing subscription (Claude Pro, ChatGPT Plus), not per-token API fees
If Claude Code is a smart pair programmer, Karajan is the CI/CD pipeline for AI-assisted development. They work great together: Karajan is designed to be used as an MCP server inside Claude Code.
How Karajan differs from AI frameworks
While Genkit, Mastra, LangChain and Vercel AI SDK call /v1/messages, Karajan orchestrates the AI CLIs your developers already use in their terminals.
| Axis | Karajan | Genkit / Mastra / LangChain / Vercel AI SDK |
|------|---------|---------------------------------------------|
| Calls provider HTTP API (/v1/messages, etc.) | ❌ Delegates to CLIs | ✅ |
| Orchestrates existing AI CLIs (claude, codex, gemini, aider, opencode) as subprocesses | ✅ | ❌ |
| Depends on cloud infrastructure | ❌ Fully local | ⚠️ Varies |
| Vanilla JS (no TypeScript required) | ✅ | ⚠️ TS-first |
| Token billing | Uses your existing CLI subscriptions | Pay per API call |
Two technical facts worth keeping straight:
- Subprocess, not PTY. Karajan spawns each CLI via `execa`/`child_process` with plain `stdin`/`stdout`/`stderr` — see `src/infrastructure/command-runner.js` and `src/agents/*.js`. There is no PTY emulation.
- Fresh subprocess per invocation + state on disk. Every coder run is a new process; the state lives in `~/.karajan/sessions/` (see `src/session-store.js`) and the per-session journal under `.reviews/<session-id>/`. This is what makes pipelines reproducible and resumable with `kj resume`.
Full write-up with mental mapping for Genkit / Mastra / LangChain / Vercel AI SDK developers: docs/COMPARISON.md.
Install
npm (recommended):
```bash
npm install -g karajan-code
```

Homebrew (macOS):

```bash
brew install manufosela/tap/karajan-code
```

Standalone binary (no Node.js needed):

```bash
# macOS (Apple Silicon)
curl -L https://github.com/manufosela/karajan-code/releases/latest/download/kj-darwin-arm64 -o kj && chmod +x kj

# Linux x64
curl -L https://github.com/manufosela/karajan-code/releases/latest/download/kj-linux-x64 -o kj && chmod +x kj

# Windows
curl -L https://github.com/manufosela/karajan-code/releases/latest/download/kj-win-x64.exe -o kj.exe
```

One-liner (detects OS, installs via npm):

```bash
curl -fsSL https://raw.githubusercontent.com/manufosela/karajan-code/main/scripts/install-kj.sh | sh
```

Docker:

```bash
docker run --rm -v $(pwd):/workspace karajan-code kj --version
```

Python:

```bash
cd wrappers/python && pip install .
```

That's it. `kj init` auto-detects your installed agents and installs RTK for token optimization.
Optional scanners for kj audit + kj webperf
Karajan auto-skips any scanner that isn't installed. Add the ones that match your projects:
| Tool | Install | What you get |
|------|---------|--------------|
| SonarQube | docker compose -f ~/sonarqube/docker-compose.yml up -d | Code quality + security rules with line-precision in kj audit |
| OSV-Scanner | go install github.com/google/osv-scanner@latest | Dependency CVE coverage broader than npm audit |
| Semgrep | pipx install semgrep | SAST: XSS, SQLi, taint flow, secrets — equivalent to snyk code, free for OSS |
| Lighthouse | npm install -g lighthouse | Core Web Vitals + opportunities for kj webperf (auto-feeds kj audit) |
Skip any per-run with --no-sonar, --no-osv, --no-semgrep. See docs/GETTING-STARTED.md for full table.
Three ways to use Karajan
Karajan installs three commands: kj, kj-tail, and karajan-mcp.
1. CLI: direct from terminal
Run Karajan directly. You see the full pipeline output in real time.
```bash
kj run "Create a utility function that validates Spanish DNI numbers, with tests"
kj code "Add input validation to the signup form"   # Coder only
kj review "Check the authentication changes"        # Review current diff
kj audit "Full health analysis of this codebase"    # Read-only audit

# Planning workflow (v2.5+)
kj plan "Refactor the database layer"               # Generate plan + HUs
kj plan list                                        # List plans for this project
kj plan show <planId>                               # Show plan details + HU table
kj plan validate <planId>                           # Check structure and deps
kj plan ready <planId>                              # Certify all HUs, mark ready
kj plan add-hu <planId> --title "..." --type feat   # Add HU to plan
kj plan remove-hu <planId> <huId>                   # Remove HU from plan
kj plan delete <planId>                             # Delete plan from disk
kj run --plan <planId> "task"                       # Execute an approved plan

# HU Board dashboard (v1.34.0+)
kj board start    # Start web dashboard (port 4000)
kj board open     # Start + open in browser
kj board status   # Check if running
kj board stop     # Stop the board
```

2. MCP: inside your AI agent
This is the primary use case. Karajan runs as an MCP server inside Claude Code, Codex, or Gemini. You ask your AI agent to do something, and it delegates the heavy lifting to Karajan's pipeline.
You → Claude Code → kj_run (via MCP) → triage → coder → sonar → reviewer → tester → security

The MCP server auto-registers during npm install. Your AI agent sees 24 tools (kj_run, kj_code, kj_review, etc.) and uses them as needed.
The problem: when Karajan runs inside an AI agent, you lose visibility. The agent shows you the final result, but not the pipeline stages, iterations, or Solomon decisions happening in real time.
3. kj-tail: monitor from a separate terminal
This is the companion tool. Open a second terminal in the same project directory where your AI agent is working, and run:
```bash
kj-tail
```

You'll see the live pipeline output (stages, results, iterations, errors) as they happen. Same view as running `kj run` directly.
```bash
kj-tail           # Follow pipeline in real time (default)
kj-tail -v        # Verbose: include agent heartbeats and budget
kj-tail -t        # Show timestamps
kj-tail -s        # Snapshot: show current log and exit
kj-tail -n 50     # Show last 50 lines then follow
kj-tail --help    # Full options
```

Important: `kj-tail` must run from the same directory where the AI agent is executing. It reads `<project>/.kj/run.log`, which is created when Karajan starts a pipeline via MCP.
Typical workflow:
```
Terminal 1                      Terminal 2
$ claude                        $ kj-tail
> implement the next
  priority task                 [triage] medium (sw)
                                [researcher] 3 patterns, 5 constraints
(Claude calls kj_run            [planner] 6 steps (tests first)
 via MCP, you see               [coder] 3 endpoints + 18 tests
 only the final result)         [tdd] PASS (3 src, 2 test)
                                [sonar] Quality gate OK
                                [reviewer] REJECTED (2 blocking)
                                [solomon] 2 conditions
                                [coder] fixed, 22 tests now
                                [reviewer] APPROVED
                                [tester] 94% coverage, 22 tests
                                [security] passed
                                Result: APPROVED
```

Watch the full pipeline demo: triage, architecture, TDD, SonarQube, code review, Solomon arbitration, security audit.
The pipeline
hu-reviewer? → triage → domain-curator? → discover? → architect? → planner? → coder → sonar? → impeccable? → reviewer → tester? → security? → solomon → commiter?

16 roles, each executed by the AI agent you choose:
| Role | What it does | Default |
|------|-------------|---------|
| hu-reviewer | Certifies user stories before coding (6 dimensions, 7 antipatterns) | Auto (medium/complex) |
| triage | Classifies complexity, activates roles, detects domain hints | On |
| domain-curator | Discovers, proposes and synthesizes business-domain knowledge for the pipeline | Auto (when domains exist) |
| discover | Detects gaps in requirements (Mom Test, Wendel, JTBD) | Off |
| architect | Designs solution architecture before planning | Off |
| planner | Generates structured implementation plans | Off |
| coder | Writes code and tests following TDD methodology | Always on |
| refactorer | Improves code clarity without changing behavior | Off |
| sonar | SonarQube static analysis with quality gate enforcement | On (auto-managed) |
| impeccable | UI/UX audit for frontend tasks (a11y, performance, theming) | Auto (frontend) |
| reviewer | Code review with configurable strictness profiles | Always on |
| tester | Test quality gate and coverage verification | On |
| security | OWASP security audit | On |
| solomon | Pipeline boss: evaluates every rejection, overrides style-only blocks | On |
| commiter | Git commit, push, and PR automation after approval | Off |
| audit | Read-only codebase health analysis (5 dimensions, A-F scores) | Standalone |
5 AI agents supported
| Agent | CLI | Install |
|-------|-----|---------|
| Claude | claude | npm install -g @anthropic-ai/claude-code |
| Codex | codex | npm install -g @openai/codex |
| Gemini | gemini | See Gemini CLI docs |
| Aider | aider | pipx install aider-chat (or pip3 install aider-chat) |
| OpenCode | opencode | See OpenCode docs |
Mix and match. Use Claude as coder and Codex as reviewer. Karajan auto-detects installed agents during kj init.
MCP server (24 tools)
After `npm install -g karajan-code`, the MCP server auto-registers in Claude and Codex. Manual config if needed:

```bash
# Claude: add to ~/.claude.json → "mcpServers":
#   { "karajan-mcp": { "command": "karajan-mcp" } }

# Codex: add to ~/.codex/config.toml → [mcp_servers."karajan-mcp"]
#   command = "karajan-mcp"
```

24 tools available: kj_run, kj_code, kj_review, kj_plan, kj_board, kj_audit, kj_scan, kj_doctor, kj_config, kj_report, kj_resume, kj_roles, kj_agents, kj_preflight, kj_status, kj_init, kj_discover, kj_triage, kj_researcher, kj_architect, kj_impeccable, kj_hu, kj_skills, kj_suggest.
Use kj-tail in a separate terminal to see what the pipeline is doing in real time (see Three ways to use Karajan).
The role architecture
Every role in Karajan is defined by a markdown file: a plain document that describes how the agent should behave, what to check, and what good output looks like.
```
.karajan/roles/      # Project overrides (optional)
~/.karajan/roles/    # Global overrides (optional)
templates/roles/     # Built-in defaults (shipped with package)
```

You can override any built-in role or create new ones. No code required. The agents read the role files and adapt their behavior. Encode your team's conventions, domain rules, and quality standards, and every run of Karajan applies them automatically.
Use kj roles show <role> to inspect any template.
Zero-config by design
Karajan auto-detects and auto-configures everything it can:
- TDD: Detects test framework for 12 languages (vitest, jest, JUnit, pytest, go test, cargo test, and more). Auto-enables TDD for code tasks, skips for doc/infra
- Bootstrap gate: Validates all prerequisites (git repo, remote, config, agents, SonarQube) before any tool runs. Fails hard with actionable fix instructions, never silently degrades
- Injection guard: Scans diffs for prompt injection before AI review. Detects directive overrides, invisible Unicode, oversized comment payloads. Also runs as a GitHub Action on every PR
- SonarQube: Auto-starts Docker container, waits up to 60s for startup, generates config if missing
- Pipeline complexity: Triage classifies task → trivial tasks skip reviewer loop
- Provider outages: Retries on 500/502/503/504 with backoff (same as rate limits)
- Coverage: Coverage-only quality gate failures treated as advisory
- HU Manager: Complex tasks auto-decompose into formal user stories with dependencies. Each HU runs as its own sub-pipeline with state tracking visible in the HU Board
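The invisible-Unicode part of the injection guard can be approximated with a character-class scan. This is a hypothetical sketch; Karajan's real guard also checks directive overrides and oversized comment payloads, and its actual character list may differ:

```javascript
// Sketch: flag zero-width / invisible Unicode that could hide instructions in a diff.
// Character set and function name are illustrative, not Karajan's actual rules.
const INVISIBLE = /[\u200B-\u200D\u2060\uFEFF\u00AD]/; // zero-widths, word joiner, BOM, soft hyphen

function hasInvisibleUnicode(diffText) {
  return INVISIBLE.test(diffText);
}

console.log(hasInvisibleUnicode('const x = 1;'));            // → false
console.log(hasInvisibleUnicode('const x = 1;\u200B// hi')); // → true
```

Running such a check before any AI reviewer sees the diff means a poisoned comment is rejected deterministically, without spending a single token.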
No per-project configuration required. If you want to customize, config is layered: session > project > global.
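The session > project > global layering behaves like a right-biased deep merge: each higher layer overrides only the keys it sets. A sketch under that assumption — `mergeConfig` and the example keys are illustrative, not Karajan's actual loader:

```javascript
// Sketch of layered config: later layers (higher precedence) win per key.
function mergeConfig(...layers) {
  // layers ordered lowest → highest precedence: global, project, session
  return layers.reduce((acc, layer) => deepMerge(acc, layer), {});
}

function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override ?? {})) {
    out[key] =
      value && typeof value === 'object' && !Array.isArray(value)
        ? deepMerge(base?.[key] ?? {}, value) // recurse into nested sections
        : value;                              // scalars/arrays replace outright
  }
  return out;
}

const effective = mergeConfig(
  { sonar: { enabled: true }, telemetry: true }, // global (~/.karajan)
  { sonar: { enabled: false } },                 // project (.karajan)
  { telemetry: false }                           // session flags
);
console.log(effective); // → { sonar: { enabled: false }, telemetry: false }
```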
Why vanilla JavaScript?
Not nostalgia, not stubbornness. I've been using JavaScript since 1997, when Brendan Eich created it in a week and changed the lives of everyone building for the web. I know its guts, its bugs, its quirks. And I know that whoever truly understands JS turns those bugs into features. TypeScript exists so that developers used to strongly-typed languages don't panic when they see JS. I respect that. But I don't need it. Tests are my type safety. JSDoc and a good IDE are my intellisense. And not having a compiler between the code and me is what lets me ship 57 releases in 45 days without fear.
Why vanilla JavaScript: the long version
Recommended companions
| Tool | Why |
|------|-----|
| RTK | Reduces token consumption by 60-90% on Bash command outputs |
| Planning Game MCP | Agile project management (tasks, sprints, estimation), XP-native |
| GitHub MCP | Create PRs, manage issues directly from the agent |
| Chrome DevTools MCP | Verify UI changes visually after frontend modifications |
Contributing
```bash
git clone https://github.com/manufosela/karajan-code.git
cd karajan-code
npm install
npm test           # Run ~2599 tests with Vitest
npm run validate   # Lint + test
```

Issues and pull requests welcome. If something doesn't work as documented, open an issue. That's the most useful contribution at this stage.
Telemetry
Karajan collects anonymous usage statistics to improve the tool: version, OS, command used, pipeline duration and success rate. No code, task descriptions, or personal data are ever sent.
Opt out: set `telemetry: false` in `~/.karajan/kj.config.yml`
Links
Built by @manufosela. Head of Engineering at Geniova Technologies, co-organizer of NodeJS Madrid, author of Liderazgo Afectivo. 90+ npm packages published.
Contributors
- @aitormf — OpenCode agent (5th built-in agent)
- @reiaguilera — Beta testing, feature proposals, and quality feedback
