arkaos

v4.20.0

Published

15 hours ago

The Operating System for AI Agent Teams

0High
0Medium
0Low

wizardingcode

ai agents orchestration claude codex gemini cursor saas teams automation

ArkaOS

The Operating System for AI Agent Teams.

86 agents. 17 departments. 287 skills. Enterprise frameworks. Multi-runtime. One install.

npx arkaos install

All counts in this document are generated by python scripts/tools/docs_stats.py and locked by a test — they cannot drift from the repository.

Why ArkaOS

Every AI coding tool gives you one developer agent. ArkaOS gives you an entire company.

Marketing teams. Brand designers. Financial analysts. Strategy consultants. Security auditors. E-commerce specialists. Content creators. Sales negotiators. Project managers. Quality reviewers. All working together, following enterprise workflows, with mandatory quality gates.

And ArkaOS learns from experience: it closes two feedback loops end to end. Every session writes semantic memory that the next session retrieves, and every Quality Gate verdict feeds back into routing, so a department that keeps getting work rejected gets flagged before it ships again. Both loops run on your machine, and every claim about them is reproducible: the hook that fires on every tool call runs at 18ms p50 (down from 83ms, measured by benchmarks/hooks-bench.sh), and the default install keeps its always-on context surface inside a CI-enforced budget.

You: "add stripe subscription billing"

ArkaOS: → Routes to Development department
        → Paulo (Tech Lead) plans the feature
        → Ana (Backend) implements with TDD
        → Sofia (Security) audits for OWASP Top 10
        → Quality Gate: Marta reviews, Eduardo checks docs, Francisca verifies code
        → Delivers: tested, secure, documented implementation

You: "create a go-to-market plan for my SaaS"

ArkaOS: → Routes to Strategy + Marketing + SaaS departments
        → Tomas (Strategist) builds competitive analysis
        → Luna (Growth) designs acquisition funnels
        → Tiago (SaaS) models PLG metrics and pricing
        → Quality Gate ensures Silicon Valley-grade output
        → Delivers: complete GTM plan with financial projections

Quick Start

Prerequisites

Node.js 18+ (or Bun)
Python 3.11+
One of: Claude Code, Codex CLI, Gemini CLI, or Cursor

Install

npx arkaos install

The installer auto-detects your runtime and configures everything:

Python dependencies (Pydantic, PyYAML, Rich, Click)
Hook system (9 lifecycle hooks, with Node fast-path shims that decide trivial calls in 18ms instead of 83ms)
A curated skill set that stays inside a 60 to 80 skill context budget, enforced in CI; the full catalog installs with npx arkaos update --skills full or per department from the plugin marketplace
The arka-tools MCP server: 12 programmatic tools for KB search, workflow state, the Quality Gate queue, recipes, session memory, and telemetry
Cognitive Layer scheduler (Dreaming + Research)

Prefer a specific runtime?

npx arkaos install --runtime claude-code
npx arkaos install --runtime codex
npx arkaos install --runtime gemini
npx arkaos install --runtime cursor

Update

# Step 1: Update core (terminal)
npx arkaos@latest update

# Step 2: Sync projects (inside your AI tool)
/arka update

Verify

npx arkaos doctor    # Health check

Skill packs, à la carte

The default install ships a curated core so your context window stays lean. Everything else lives in the ArkaOS plugin marketplace: 16 department packs with 206 skills, generated straight from the same sources the core uses. Inside Claude Code:

/plugin marketplace add andreagroferreira/arka-os
/plugin install arkaos-marketing@arkaos     # or arkaos-dev, arkaos-finance, ...

Each pack namespaces its skills, so nothing collides with your setup. Prefer everything preinstalled the old way? npx arkaos update --skills full restores the complete catalog and keeps that choice across updates.

Run the MCP server anywhere

Claude Code picks up arka-tools automatically. For other runtimes, debugging, or CI:

npx arkaos mcp start           # read-only
npx arkaos mcp start --write   # enables workflow/QG writes

How It Works

1. You describe what you need

In plain language. No special syntax required.

2. ArkaOS routes to the right squad

The Synapse engine (12-layer context injection, ~87ms cold / ~83ms warm — see Benchmarks) analyzes your request and routes it to the correct department. Each department has a lead agent who orchestrates specialists.

3. Agents execute with enterprise frameworks

Not generic prompts. Validated methodologies:

| Area | Frameworks Used | |------|----------------| | Development | Clean Code, SOLID, DDD, TDD, DORA, OWASP | | Strategy | Porter's Five Forces, Blue Ocean, BMC, Wardley Maps | | Finance | DCF Valuation, Unit Economics, COSO ERM | | Marketing | AARRR, Growth Loops, PLG, STEPPS | | Brand | Primal Branding, StoryBrand, 12 Archetypes | | E-Commerce | ResearchXL, RFM, Baymard, MACH | | Compliance | GDPR, ISO 27001, SOC 2, ISO 31000 |

4. Quality Gate reviews everything

Nothing reaches you without review by three agents:

Marta (CQO) — orchestrates, issues final APPROVED/REJECTED
Eduardo — text quality: spelling, grammar, tone, zero AI cliches
Francisca — technical quality: code, tests, UX, security

5. Knowledge persists

Every decision, solution, and pattern is captured. The Cognitive Layer curates it overnight. Tomorrow, ArkaOS is smarter than today.

17 Departments

| Department | Prefix | Agents | What It Does | |-----------|--------|--------|-------------| | Development | /dev | 15 | Full-stack features, APIs, architecture, security, CI/CD | | Brand & Design | /brand | 10 | Brand identity, UX/UI, design systems, naming | | Marketing | /mkt | 4 | SEO, paid ads, email campaigns, growth loops | | Strategy | /strat | 4 | Market analysis, competitive intelligence, business models | | E-Commerce | /ecom | 4 | Store optimization, CRO, pricing, RFM segmentation | | Knowledge | /kb | 4 | Research, Zettelkasten, persona building, ingestion | | Project Mgmt | /pm | 4 | Scrum, Shape Up, discovery, roadmaps | | Content | /content | 4 | Viral hooks, scripts, repurposing, content calendars | | Sales | /sales | 4 | Pipeline management, SPIN selling, negotiation | | SaaS | /saas | 5 | Idea validation, metrics, PLG strategy, scaffolding | | Organization | /org | 5 | Org design, team topologies, matrix structure | | Landing Pages | /landing | 4 | Sales copy, funnels, offers, page generation | | Finance | /fin | 3 | DCF valuation, unit economics, budgets, investor prep | | Operations | /ops | 3 | Automation, SOPs, compliance (GDPR, ISO, SOC 2) | | Communities | /community | 3 | Groups, membership, gamification, engagement | | Leadership | /lead | 3 | Team health, OKRs, culture, hiring frameworks | | Quality Gate | (auto) | 3 | Mandatory review on every workflow. Veto power. |

86 agents across 17 departments (85 unique; cro-specialist is shared by E-Commerce and Landing in the matrix structure).

Cognitive Layer (v2.10)

ArkaOS doesn't just execute — it learns, dreams, and researches.

Institutional Memory

Every solution you implement is captured and indexed. When you need authentication in a new Laravel project, ArkaOS already knows how you did it in the last three projects — with the exact pattern, configuration, and lessons learned.

Dual-write: Obsidian (human-readable) + Vector DB (semantic search)
Cross-project: Knowledge from ClientRetail applies to ClientFashion
Confidence scoring: Patterns validated 3+ times become "validated patterns"

Dreaming (runs at 02:00)

Every night, ArkaOS reviews the entire day:

Self-critique: "Did I do this the best way? Was there a simpler approach?"
Pattern detection: Promotes recurring solutions to validated patterns
Anti-pattern detection: Flags repeated mistakes
Strategic reflection: "Does this serve the business or just the developer?"
Actionable insights: Concrete recommendations per project

When you open a project the next morning:

Pending reflections from Dreaming:

1. [business] Offer model — rethink
   The offers table doesn't consider volume pricing tiers.
   Shopify B2B uses min_qty + tier_price for 23% higher conversion.

2. [technical] Sync retry — improve
   Fixed backoff can cause thundering herd. Use exponential
   backoff with jitter (validated pattern from ClientRetail).

Want me to elaborate?

Research (runs at 05:00)

Every morning, ArkaOS researches updates relevant to your work:

Stack-aware: Laravel security patches, Nuxt 4 migration guides, Python releases
Domain-aware: E-commerce trends, AI/ML updates, industry news
Business-aware: Competitor moves, market opportunities, funding trends
Adaptive: Infers your profile from active projects — no manual configuration

Intelligence Briefing — 2026-04-10

ACTION REQUIRED:
- Laravel 12.1.3 security patch — SQL injection in whereHas.
  Affects: ClientFashion, ClientCommerce. Fix: composer update laravel/framework.

OPPORTUNITIES:
- Shopify Winter '26 bulk product API — ClientCommerce sync could be 10x faster.
- Nuxt 4 RC2 migration guide published — start preparing ClientVideo.

COMPETITOR WATCH:
- CrewAI v3 launched memory layer — similar to our Cognitive Layer
  but without dual-write. ArkaOS is ahead.

Cross-Platform Scheduler

The scheduler works on macOS, Linux, and Windows:

arkaos scheduler status       # Check status
arkaos scheduler run dreaming # Run manually
arkaos scheduler run research # Run manually
arkaos scheduler logs         # View logs

Intelligence Loop (v2.21.0+)

Three-layer system that makes ArkaOS consult its brain before going external, and write every session's learnings back to the vault automatically. The system compounds daily — tomorrow's ArkaOS is smarter than today's.

Synapse L2.5 — KB Context Injection (v2.21.0)

On every user prompt, ArkaOS runs semantic search against the Obsidian vault and injects the top relevant notes (with wikilinks and excerpts) as context before the model starts planning. The agent sees what the vault already knows, naturally — no hard gate, no friction.

Top 3–5 notes by similarity
Jaccard fallback when vector store absent
Feature flag synapse.l25KbContext default true
Kill switch ARKA_BYPASS_L25=1

Research Gate — Natural Nudge (v2.21.0)

When an agent tries Context7, WebSearch, WebFetch, or Firecrawl without consulting Obsidian first:

First attempt → natural PT-PT nudge listing top 3 vault hits. Allows the call.
Second attempt in same turn → deny. The nudge wasn't enough.

Feature flag hooks.kbFirst default false in v2.21.0 — dormant until operator flips. Kill switch ARKA_BYPASS_KB_FIRST=1 (audited).

Auto-Documentor + Cataloger + Relator (v2.21.0)

After every approved session, an async worker writes the learnings back to the vault:

Extract — parses the transcript for sources consulted, decisions made, deliverables produced
Classify — cataloger chooses the right taxonomic home (code pattern, persona, client strategy, marketing test, ADR, research finding, framework, or session fallback)
Write — creates the note at the correct vault path with frontmatter + tags
Relate — relator finds semantically similar existing notes, creates bidirectional [[wikilinks]], updates MOCs

The vault becomes a relational graph that grows with every session.

LLM Agnostic Auto-Documentor (v2.22.0)

The synthesis LLM call is runtime-agnostic. Zero model hardcoding. Whatever runtime you have configured decides the model:

| Provider | Default | Model decided by | |---|---|---| | subagent | ✅ | Active runtime's headless CLI (claude -p, gemini -p …) | | anthropic-direct | — | ANTHROPIC_MODEL env var; no code default | | stub | tests | Template fallback |

Fallback chain subagent → anthropic-direct → stub never raises. Prompt caching ON by default (5-min TTL cache_control: ephemeral on system block).

Budget Telemetry + `/arka costs` (v2.22.0)

Every LLM call appends to ~/.arkaos/telemetry/llm-cost.jsonl with tokens, cache hits, and estimated cost. Per ADR-011, this is visibility only — never blocks.

/arka costs           # today (default)
/arka costs week      # last 7 days
/arka costs month     # last 30 days
/arka costs sessions  # top 10 expensive sessions

Soft advisory when a single session exceeds $5 equivalent. No hard caps.

Flow Marker v2 (v2.21.0)

The constitutional flow enforcement (binding in v2.20, then covering the 13-phase flow) got a turn-scoped marker cache, eliminating the cross-turn false positives that triggered on subagent dispatch and short user continuations. ADR-compliant: cache accelerates ALLOW decisions only; the transcript remains authoritative for DENY. Since v4.1.0 the enforcer gates on the evidence-flow [arka:gate:N] markers (legacy [arka:phase:N] accepted during the deprecation window).

Polish & Consolidation (v2.22.1)

core/shared/safe_session_id.py — single source of truth for the path-traversal / injection allowlist, deduped across 6 modules (100% coverage, 33 security-contract tests)
Gemini CLI headless live (gemini -p "…" --output-format json)
_MAX_FALLBACK_NOTES = 2000 bound on Synapse L2.5 degraded-mode scan
Auto-documentor coverage lifted 93% → 98%
Template synthesiser now mirrors LLM prompt structure (Key Facts → Decisions → Sources)

ADRs

2026-04-20-flow-marker-v2.md — turn-scoped cache amendment
2026-04-20-kb-first-intelligence.md — Synapse L2.5 + gate + auto-documentor
2026-04-20-llm-agnostic.md — runtime-agnostic LLM wiring

Ecosystem Management

ArkaOS manages client projects as ecosystems — groups of related projects with dedicated squads.

/client_retail          → ClientRetail ecosystem (4 projects: API, frontend, admin, docs)
/client_commerce            → ClientCommerce ecosystem (supplier sync + Shopify theme)
/client_fashion      → ClientFashion (6 projects: CRM, store, API, migration...)
/client_energy     → ClientEnergy ecosystem (3 projects: portal, API, analytics)

Each ecosystem gets:

A dedicated squad with specialized roles
Project-specific context loaded automatically when you cd into the project
Overnight insights tailored to that ecosystem's domain
Knowledge that compounds across all projects in the ecosystem

The Conclave

Your personal AI advisory board. 20 real-world advisor personas — Munger, Dalio, Bezos, Naval, Jobs, Sinek, and more — matched to your behavioral DNA.

/arka conclave         # 17-question profiling to build your board
/arka conclave ask     # Ask all advisors a question
/arka conclave debate  # Watch advisors debate a topic

Agent DNA

Every agent has a complete behavioral profile from 4 psychological frameworks:

| Framework | What It Defines | Example (Paulo, Tech Lead) | |-----------|----------------|---------------------------| | DISC | Communication style | D: 85, I: 60, S: 40, C: 75 | | Enneagram | Core motivation | Type 5w6 (Investigator) | | Big Five | Personality traits | O:88 C:92 E:55 A:65 N:22 | | MBTI | Information processing | INTJ |

This isn't cosmetic — it affects how agents collaborate, what they prioritize, and how they communicate. A high-D agent pushes for speed. A high-C agent insists on thoroughness. The tension produces better outcomes.

Python CLI Tools

8 standalone tools for quantitative analysis. No dependencies beyond stdlib.

python scripts/tools/headline_scorer.py "10x Your Revenue" --json
python scripts/tools/seo_checker.py page.html --json
python scripts/tools/dcf_calculator.py --revenue 1000000 --growth 20 --json
python scripts/tools/rice_prioritizer.py features.json --json
python scripts/tools/saas_metrics.py --new-mrr 50000 --json
python scripts/tools/tech_debt_analyzer.py src/ --json
python scripts/tools/brand_voice_analyzer.py content.txt --json
python scripts/tools/okr_cascade.py growth --json

Multi-Runtime Support

| Runtime | Status | Features | |---------|--------|----------| | Claude Code | Primary | Hooks, subagents, MCP, 1M context | | Codex CLI | Supported | Subagents, sandboxed execution | | Gemini CLI | Supported | Subagents, MCP, 1M context | | Cursor | Supported | Agent mode, MCP |

Architecture

User Input
  │
  ▼
Synapse v2 (12-layer context injection, ~87ms cold / ~83ms warm)
  │
  ▼
Orchestrator (/do → department routing)
  │
  ▼
Squad (YAML workflow with phases and gates)
  │
  ▼
Quality Gate (Marta + Eduardo + Francisca)
  │
  ▼
Cognitive Layer (capture → dual-write → insights)
  │
  ▼
Output (Obsidian vault + structured deliverables)

Core Systems

| System | Purpose | |--------|---------| | Synapse v2 | 12-layer context injection (~87ms cold, ~83ms warm; cacheable layers are sub-millisecond) | | Workflow Engine | YAML workflows with phases, gates, parallelization | | Agent Schema | 4-framework behavioral DNA with consistency validation | | Squad Framework | Department squads + ad-hoc project squads (matrix) | | Cognitive Layer | Memory, Dreaming, Research, Scheduler | | Living Specs | Bidirectional spec/code sync | | Governance | Constitution with 25 non-negotiable rules (+ 11 must, 8 should) | | Multi-Runtime | Claude Code, Codex, Gemini, Cursor adapters |

Tech Stack

| Component | Technology | |-----------|-----------| | Core Engine | Python 3.11+ (Pydantic, PyYAML, Rich) | | Installer | Node.js/Bun (ESM) | | Hooks | Bash | | Workflows | YAML | | Agent Definitions | YAML | | Knowledge | Obsidian + SQLite-VSS | | Tests | pytest (4,500+ tests) |

Documentation

Full documentation lives in two places in this repository:

wiki/ — the user-facing guide (step-by-step, features, benchmarks):

Home — the index of everything
Getting Started — install and run your first command
Core Concepts — squads, agents, tiers, behavioral DNA
The Evidence Flow (4 Gates) — how every request is handled
Departments — one page per department
Commands Reference
Cognitive Layer — memory, dreaming, research
Skill Packs — the curated core and the plugin marketplace
Benchmarks — measured, reproducible numbers
Competitive Analysis and Benefits & ROI

docs/ — the technical/contributor reference (architecture, API, agent schema, core engine, ADRs).

CLI Reference

Installer (terminal):

npx arkaos install             # Fresh install (auto-detects runtime)
npx arkaos@latest update       # Update core + hooks to latest version
npx arkaos migrate             # Migrate from v1
npx arkaos doctor              # Health check
npx arkaos dashboard           # Start monitoring dashboard
npx arkaos keys                # Manage API keys
npx arkaos uninstall           # Remove ArkaOS

In-session commands (inside Claude Code / Codex / Gemini / Cursor):

/arka update                   # Sync all project configs after core update
/arka status                   # System health + LLM costs + Enforcement + Reorganization (24h)
/arka costs [today|week|month|all|sessions]  # LLM cost visibility
/arka enforcement [today|week|month|all]     # Flow-marker compliance (v2.41.0)
/arka reorganize [--since-days N]            # KB → agent proposal markdown (v2.42.0)
/arka index                    # (Re)index Obsidian vault into vector store
/arka search <query>           # Semantic search in knowledge base
/arka standup                  # Daily standup (projects, blockers, updates)
/arka onboard <path>           # Onboard an existing project
/arka conclave                 # Personal AI advisory board
/do <description>              # Universal routing — natural language to department

Release pipeline (operator-side, headless):

~/.arkaos/bin/arka-py -m core.release.preflight_cli --expected-npm-user <user>
# Step 0 NON-NEGOTIABLE before any tag/push/publish (v2.43.0+).
# Six checks: version-alignment / npm-auth / npm-publish-capability /
# gh-auth / git-remote / git-clean. Plus no-client-name-leaks
# (v2.44.0+, BLOCKING since v2.45.0).
# Exit 1 = STOP, fix every remediation, re-run.

Department commands: /dev, /mkt, /brand, /fin, /strat, /ecom, /kb, /ops, /pm, /saas, /landing, /content, /community, /sales, /lead, /org.

Contributing

See CONTRIBUTING.md. PRs welcome — all changes require passing the full test suite (4,500+ tests as of v4.0.0) and Quality Gate review (Marta CQO + Eduardo Copy + Francisca Tech, all Opus).

License

MIT — WizardingCode