kata-os

v2.4.0

Published

11 days ago

A Claude Code skill suite with deterministic verification gates. 24 skills, 24 enforcement hooks, engram library daemon.

wolfonwings

claude claude-code ai development pipeline tdd architecture skills

Kata — A Claude Code skill suite with deterministic verification gates

Kata is for engineers who want AI to accelerate their work without grading its own homework. Every verification layer is classified as L1 (deterministic tools, highest trust), L2 (AI anchored to external sources, medium trust), or L3 (AI evaluating AI, lowest trust). The rule is absolute: every gate that can be L1 must be L1, and L3 is never the only layer.

What Is Kata?

Eleven of the skills compose into an end-to-end planning-and-build pipeline, but the pipeline is one composition, not the only one. Every skill works standalone.

What You Get First: `context-kata`

What it does. Loads your whole project into Claude at once: Repomix packing, secret redaction, image/PDF capture, a manifest, and ordered XML output.
How to use. In any Claude Code session inside a repo, type /context-kata.
Why it is the lead. It pays off without the pipeline, without a build, and without a second skill. Claude holds the project instead of searching it.

Quick Start

Requirements

Claude Code — the CLI, VS Code extension, JetBrains extension, desktop app, or claude.ai/code
Git — for cloning and for skills that read git history

Install

npx kata-os install

One command. Works on Windows, Mac, and Linux. No symlinks, no Developer Mode, no admin required.

The installer copies 24 skills plus 3 adversarial agents to ~/.claude/, installs hook scripts and the engineering foundation to ~/.kata/, and sets up claude-mem for persistent cross-session memory. Hooks are activated per project by /init-kata, not globally.

Requires Node.js 18+. Run the same command anytime to upgrade.

Setup (per project)

# Open your project in Claude Code, then:
/init-kata

Setup detects your stack, installs missing verification tools, configures hooks, sets up MCP servers, and activates the orchestrator. For existing projects, it also analyzes your codebase automatically.

After setup, close and restart Claude Code. The orchestrator only takes effect in the next session.

Upgrade

npx kata-os install

This is the same command as install: it detects the existing installation and overwrites it with the latest version. Alternatively, run /upgrade-kata from inside Claude Code. The installed version is tracked at ~/.kata/version.

How It Works

One command (/init-kata) configures everything — tool detection, hooks, project analysis, and the orchestrator. After a restart, the Kata orchestrator becomes your default session agent. You talk to it. It runs skills directly in your context — you collaborate in real-time on every decision. For QA, it runs dual-mode: a collaborative pass with you plus an independent adversarial pass that catches what you both missed. You make the decisions. Kata does everything else.

The guardrails are structural, not optional. Every TDD cycle hits an L1 deterministic gate (linters, type-check, credential scanning) that blocks on failure. The AI fixes the issue or it doesn't proceed. No exceptions, no reasoning around it. Figma MCP provides design specs, the Chrome extension gives live visual collaboration, and Context7 prevents hallucinated APIs with live library docs.
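The blocking behaviour of an L1 gate can be sketched in a few lines of shell. This is a minimal illustration under stated assumptions, not Kata's actual hook code: the function name `run_l1_gate` is hypothetical, and the gate simply treats any non-zero exit status as a failure.

```bash
# Hypothetical sketch of a blocking L1 gate (not Kata's real implementation).
# Each argument is one check command; the gate stops at the first failure.
run_l1_gate() {
  local check
  for check in "$@"; do
    # Run each check in its own shell; a non-zero exit status fails the gate.
    if ! bash -c "$check"; then
      echo "L1 gate FAILED: $check" >&2
      return 1
    fi
  done
  echo "L1 gate passed ($# checks)"
}
```

For a Python project, a call might look like `run_l1_gate "ruff check ." "mypy ." "pytest -q"`. Because the result depends only on exit codes, the AI cannot argue its way past a failing check.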

Every skill also works standalone. You don't need the pipeline to get value. Drop /architect-kata into any conversation to think through a system design. Run /investigate-kata when something breaks. Use /security-kata before any deploy. The pipeline is there when you want it — the skills are useful the moment you install them.

How Do I Actually Use This?

After setup, the orchestrator handles routing. You don't need to memorize skill names — just tell it what you want to do.

"I have an idea." Run /init-kata in an empty project folder. Restart. The orchestrator routes you to brainstorm, then scope, architect, design, test specs, security audit. By the time planning is done, every build cycle is mapped out. Say "let's build" and the orchestrator runs build-code directly — you collaborate on every TDD cycle.

"I have an existing project and want Kata to help." Run /init-kata in your project folder. Setup detects your stack, configures verification hooks, and automatically analyzes your entire codebase. Restart Claude Code. The orchestrator knows your project and is ready to plan, build, or debug.

"Something is broken and I've been staring at it for an hour." Tell the orchestrator "debug this" — or run /investigate-kata directly. It systematically isolates the root cause using binary search, hypothesis testing, and scope locks. For complex bugs with multiple theories, it uses /fork to test hypotheses in parallel.
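The binary-search idea can be sketched generically. This is not investigate-kata's actual procedure: `bisect_first_bad` is a hypothetical helper, and it assumes failures are monotonic, meaning every index before the first bad one passes and every index from it onward fails.

```bash
# Hypothetical sketch: find the first failing index between a known-good
# and a known-bad index by halving the range on each step.
# Usage: bisect_first_bad GOOD BAD PREDICATE [ARGS...]
# PREDICATE is called with the candidate index appended; exit 0 means "passes".
bisect_first_bad() {
  local good="$1" bad="$2" mid
  shift 2
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    if "$@" "$mid"; then
      good="$mid"   # mid still passes, so the fault lies later
    else
      bad="$mid"    # mid already fails, so the fault is here or earlier
    fi
  done
  echo "$bad"
}
```

The indices could stand for commits, config entries, or input lines. For commit history specifically, `git bisect` applies the same strategy.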

"I just want to ship this PR." Tell the orchestrator "ship it", or run /ship-kata directly. It handles diff review, version bump, PR creation, deploy, canary monitoring, and documentation update.

"I stepped away for a week." Just open Claude Code in the project. The orchestrator auto-loads, shows pipeline state, and orients you to where you left off. claude-mem captures everything automatically — every tool use, every decision, every file change — compresses it with AI, and injects relevant context from past sessions on startup.

What Makes Kata Different?

There are other great projects for Claude Code on GitHub — Superpowers for subagent-based TDD, gstack for virtual engineering teams. Kata builds on their work (and credits them). The difference is verification philosophy: Kata is organized around the L1/L2/L3 trust hierarchy. Every output is gated by deterministic tools the AI cannot influence (L1) wherever possible; AI anchored to external sources (L2) where L1 is unavailable; AI-on-AI (L3) only as an outer layer, never as the only layer. Dual-mode QA uses independent adversarial review to catch blind spots shared-bias evaluation cannot.

The Guardrails

The core premise: AI-generated code is guilty until proven innocent. Every verification layer in Kata exists because trusting AI output at face value is how slop ships.

Three bias levels. Kata classifies every verification by how much you should trust it. Level 1: deterministic tools — binary pass/fail, highest trust. Level 2: AI anchored to external sources (Context7 live docs, API contracts) — medium trust. Level 3: AI evaluating AI — lowest trust. The rule: every gate that can be Level 1 must be Level 1. Level 3 is never the only layer.
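The two rules can be expressed as a small validation check. The sketch below is illustrative only: `validate_gate` and its argument convention are hypothetical, and whether an L1 tool is available for a gate is supplied by the caller.

```bash
# Hypothetical helper: validate the trust layers configured for one gate.
# Usage: validate_gate <l1_possible: yes|no> <layer>...
validate_gate() {
  local l1_possible="$1" has_l1=no has_l2=no has_l3=no layer
  shift
  for layer in "$@"; do
    case "$layer" in
      L1) has_l1=yes ;;
      L2) has_l2=yes ;;
      L3) has_l3=yes ;;
      *)  echo "unknown layer: $layer" >&2; return 2 ;;
    esac
  done
  # Rule 1: a gate that can be deterministic must include an L1 layer.
  if [ "$l1_possible" = yes ] && [ "$has_l1" = no ]; then
    echo "violation: L1 is possible but not used" >&2
    return 1
  fi
  # Rule 2: L3 may only sit on top of another layer.
  if [ "$has_l3" = yes ] && [ "$has_l1" = no ] && [ "$has_l2" = no ]; then
    echo "violation: L3 is the only layer" >&2
    return 1
  fi
  return 0
}
```

For example, `validate_gate yes L1 L3` succeeds, while `validate_gate no L3` fails because AI-on-AI review would be the only layer.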

24 enforcement hooks, deterministic where possible. Every hook script is bash, sources a shared JSON parsing library (_kata_json.sh), and degrades gracefully — warn visibly, never crash, never fail silently.

| Event | Hook | What it does |
|---|---|---|
| SessionStart | identity-foundation.sh + identity-briefing.sh | First-run welcome, foundation injection, branch/pipeline/next-cycle banner |
| Stop | check-new-files.sh | Alert on new files not documented in CLAUDE.md |
| Stop | check-p1-todos.sh | Flag open P1 TODOs potentially addressed by session changes |
| Stop | check-askuser-usage.sh | Block responses ending with a plain prose question (AUQ contract) |
| PreToolUse (PowerShell) | inline blocker | Reject PowerShell tool calls, force Bash |
| PreToolUse (AskUserQuestion) | validate-askuser.sh | L1 regex gate: completeness scores + emoji + Recommended on option labels |
| PreToolUse (Edit\|Write) | protect-config.sh | Block edits to linter/formatter/type-checker config files |
| PreToolUse (Edit\|Write) | skill-read-checkpoint.sh | Remind about pending lazy-load reference files |
| PreToolUse (Grep) | lsp-guard.sh | Redirect code-symbol searches to cclsp MCP |
| PreToolUse (Bash) | lsp-guard.sh | Redirect grep/rg/ag code-symbol searches to cclsp MCP |
| PostToolUse (Edit\|Write) | validate-artifact.sh | Validate pipeline artifact required sections |
| PostToolUse (Read) | track-reads.sh | Track skill reads + surface lazy-load reference instructions |
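To make the pattern concrete, here is a minimal sketch in the spirit of protect-config.sh from the table above. It is not the shipped script: the function names and the list of config file names are assumptions, and it relies on Claude Code's convention that a PreToolUse hook exiting with status 2 blocks the tool call. In a real hook the file path would be parsed from the JSON payload that Claude Code passes on stdin; that parsing is omitted here.

```bash
# Hypothetical sketch (not the shipped protect-config.sh).
# Returns 0 if the path looks like a linter/formatter/type-checker config file.
is_protected_config() {
  local name="${1##*/}"   # strip directories, keep the file name
  case "$name" in
    .eslintrc|.eslintrc.*|eslint.config.*) return 0 ;;
    .prettierrc|.prettierrc.*)             return 0 ;;
    ruff.toml|.ruff.toml|mypy.ini)         return 0 ;;
    tsconfig.json)                         return 0 ;;
    *)                                     return 1 ;;
  esac
}

# Decide what to do with one Edit/Write target.
# Exit status: 0 = allow, 2 = block.
check_edit_target() {
  local path="${1:-}"
  if [ -z "$path" ]; then
    # Degrade gracefully: warn visibly, but do not crash or block.
    echo "protect-config sketch: no file path given, allowing" >&2
    return 0
  fi
  if is_protected_config "$path"; then
    echo "Blocked: $path is a verification config file" >&2
    return 2
  fi
  return 0
}
```

The point of such a gate is that the AI cannot make a failing check pass by loosening the checker's configuration.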

Project-fitted leashing. Kata doesn't ship a generic tool list. It detects your stack and builds a custom gate for your project. A Python project gets ruff, mypy, pytest. A TypeScript project gets eslint, tsc, vitest. Tools are added as your codebase grows. The leash is fitted to your project, out of the box.
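Stack detection of this kind could be sketched as follows. The tool names come from the paragraph above; the function name `detect_gate_tools` and the marker files it looks for are assumptions made for illustration, not Kata's actual detection logic.

```bash
# Hypothetical stack detector: prints one line of gate tools per detected stack.
# Marker files are assumptions; the real detection rules are not documented here.
detect_gate_tools() {
  local dir="${1:-.}" found=1
  if [ -f "$dir/pyproject.toml" ] || [ -f "$dir/requirements.txt" ]; then
    echo "python: ruff mypy pytest"
    found=0
  fi
  if [ -f "$dir/tsconfig.json" ]; then
    echo "typescript: eslint tsc vitest"
    found=0
  fi
  if [ "$found" -ne 0 ]; then
    echo "no known stack detected in $dir" >&2
  fi
  return "$found"
}
```

A mixed repository would print both lines, which matches the idea that the gate grows with the codebase.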

Test design separation. The skill that writes test specs (testdesign) never sees the implementation. The skill that implements code (build-code) works from those specs. A fresh-context subagent verifies test independence. The exam writer and the answer checker never see each other's work. This prevents the most common AI testing failure: tests that pass because they were written to match the code, not the spec.

Live library docs via MCP + API doc registry. Context7 provides version-specific documentation for 9,000+ libraries during builds. Figma MCP bridges design-to-code with 16 tools (read design specs, write frames, extract tokens, visual diff). Every dependency is registered in kata-tools.json with a doc source. Build-code checks the registry during setup and queries docs inline — no hook needed.

Chrome as collaboration surface. During UI builds, the Chrome extension gives both user and Claude a shared live view of the app. Either party can flag issues. Console access, visual verification, and GIF capture are built into the build cycle. Adversarial QA uses headless /browse instead — different tool, different perspective.

Skills

Every skill works standalone. The pipeline is one composition; it is not the only one.

Pipeline (`src/pipeline/`)

Sequential planning and build skills. Each reads what the previous wrote. You can run any one of them on its own — the chain is a convenience, not a requirement.

| Skill | What it does |
|---|---|
| discover-kata | Validates the problem space before any solution work. Produces brainstorm.md. |
| onboard-kata | Project archaeology for existing codebases. Produces onboarding.md. |
| scope-kata | Defines what is in, out, and deferred. User stories, acceptance criteria, success metrics. |
| interrogate-kata | Exhaustively traverses every unresolved design decision, one AskUserQuestion at a time. |
| architect-kata | Component boundaries, four-path data flow, API contracts, failure mode analysis. |
| design-kata | User research, UX strategy, visual system, implementation guidance, handoff. |
| spec-kata | Test-first specification. Acceptance criteria, test matrix, edge cases, invariants. |
| refine-kata | Optimize / delight / full mode. Performance, workflow, and polish. |
| code-kata | TDD-first implementation with hard gates. Red-green-refactor with L1 deterministic verification. |
| test-kata | Three-tier adversarial QA with health scoring. Dual-mode: direct + adversarial agent. |
| ship-kata | Ship + reflect. Diff review, version bump, CHANGELOG, push gate, PR, deploy, canary, retro. |

Standalone (`src/standalone/`)

Skills that do not fit the linear pipeline. Each is self-contained and can run at any time.

| Skill | What it does |
|---|---|
| security-kata | OWASP + STRIDE security audit with severity scoring. Daily or comprehensive mode. |
| investigate-kata | Full-spectrum debugging with Iron Law (no fix without root cause), 3-strike rule, scope lock. |
| justify-kata | Five-pass adversarial audit of any AI-generated analysis. Claim extraction, evidence classification, trust verdict. |
| auditpass-kata | Single-shot adversarial SKILL.md auditor. 11 defect categories + 7 improvement dimensions, fixes in place. |
| context-kata | Full-project context loader. Runs Repomix, redacts secrets, captures images/PDFs, prepends a collaborative manifest. |

Meta (`src/meta/`)

Skills that manage other skills, hooks, annotations, upgrades, and task boards.

| Skill | What it does |
|---|---|
| init-kata | One-command project setup: stack detection, hooks, MCP, auto-onboarding for existing projects. |
| upgrade-kata | One-command Kata upgrade from inside Claude Code. Fetch, reinstall, patch hooks, show changelog. |
| annotate-kata | Reconcile CLAUDE.md files with actual project state using claude-mem observations. |
| next-kata | ROI-driven task prioritizer. Effort × impact scoring, dependency awareness, next-action surfacing. |
| skillsetup-kata | Scaffold new skills to the canonical 11-section template. |
| hooksetup-kata | Design or rework Claude Code hooks. Drop-in implementations, execution-tested, conformance-validated. |
| engram-kata | Engram library curator. Author, review, stats, seed. Fronts the kata-daemon over IPC: semantic + L1 matching, adherence tracking, candidate draft from memory writes. |

Utils (`src/utils/`)

| Skill | What it does |
|---|---|
| browse-kata | Headless browser utility for QA testing, visual verification, and site inspection. |

Adversarial agents (`src/adversarial/`)

Three adversarial agents (investigate-kata-adversarial, refine-kata-adversarial, test-kata-adversarial) run on Sonnet with clean context during dual-mode QA. The main agent steers the direct pass; the adversarial agent runs in parallel and findings are diffed afterward.
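The final diff step can be illustrated with standard tools. This sketch assumes an invented format in which each pass writes one finding per line to a text file; `adversarial_only` is a hypothetical name, and Kata's real findings format is not specified here.

```bash
# Hypothetical sketch: list findings that only the adversarial pass reported.
# Assumes each file holds one finding per line (an invented format).
adversarial_only() {
  local direct="$1" adversarial="$2"
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file:
  # print lines of the adversarial file that do not appear in the direct file.
  grep -Fxv -f "$direct" "$adversarial" || true
}
```

Whatever this prints is what the collaborative pass missed, which is the case for running the second pass with clean context.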

The Pipeline (optional composition)

The 11 pipeline skills compose into a linear chain when you want them to. Each reads the previous artifact and writes its own.

discover-kata   ──┐
                  ├─→ scope-kata ─→ interrogate-kata ─→ architect-kata ─→ design-kata
onboard-kata    ──┘

                       ┌─→ spec-kata ─→ refine-kata ─→ code-kata ─→ test-kata ─→ ship-kata
design-kata ──→ security-kata ──┘

You can enter the chain at any point — every skill auto-detects whether the upstream artifact exists and prompts for context if it does not. You can also skip any step; nothing in the chain forces the next.
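Upstream detection might look like the sketch below for a step that follows discover-kata or onboard-kata. The artifact file names come from the skill table; the function name `find_upstream_artifact` and the assumption that artifacts sit in the project root are hypothetical.

```bash
# Hypothetical sketch of upstream-artifact detection for a scope-like step.
# File names come from the skill table; the project-root location is an assumption.
find_upstream_artifact() {
  local dir="${1:-.}" candidate
  for candidate in brainstorm.md onboarding.md; do
    if [ -f "$dir/$candidate" ]; then
      echo "$dir/$candidate"
      return 0
    fi
  done
  # Nothing upstream: the caller should ask the user for context instead.
  return 1
}
```

A non-zero return is not an error here. It only signals that the skill should prompt for context rather than read a file.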

Anytime skills (outside the pipeline): investigate-kata for debugging, justify-kata for AI-reasoning audits, context-kata for full-project ingestion, security-kata on demand, refine-kata in standalone optimization mode.

Settings

Kata stores preferences at ~/.kata/settings.json. All features default to ON. Disable any individually:

mkdir -p ~/.kata
echo '{"welcome_banner": false}' > ~/.kata/settings.json

Available settings: welcome_banner, completion_banner, first_run_intro, decision_tally, pipeline_progress, model_warnings, artifact_preview, help_on_request, quicksave, milestone_celebrations, pseudocode_mode
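The default-ON behaviour can be sketched as a lookup that only turns a feature off when the file says so explicitly. This is a naive pattern match for illustration, not a JSON parser and not Kata's own reader; `kata_setting_enabled` is a hypothetical name.

```bash
# Hypothetical reader for ~/.kata/settings.json (a naive sketch, not a JSON parser).
# A feature counts as ON unless the file explicitly sets its key to false.
kata_setting_enabled() {
  local key="$1" file="${2:-$HOME/.kata/settings.json}"
  [ -f "$file" ] || return 0   # no settings file: everything stays ON
  if grep -Eq "\"$key\"[[:space:]]*:[[:space:]]*false" "$file"; then
    return 1
  fi
  return 0
}
```

Note that the `echo ... >` form shown above overwrites the whole file. To disable several features, write them in one object, for example `echo '{"welcome_banner": false, "quicksave": false}' > ~/.kata/settings.json`.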

Architecture

kata/
├── shared/                          ← Shared sources (edit here)
│   ├── kata-foundation.md            ← Engineering foundation scaffolded into project CLAUDE.md
│   ├── project-hooks.json            ← 12 enforcement hooks
│   ├── artifact-schemas.md           ← Pipeline artifact schemas
│   ├── operational-patterns.md       ← Cross-skill operational patterns
│   ├── glossary-template.md          ← Project glossary template
│   └── counts.json                   ← Authoritative skill/hook/agent/stage counts
├── src/                             ← Skill sources (edit here)
│   ├── pipeline/                     ← 11 chained skills (plan → build → QA → release)
│   ├── standalone/                   ← 5 skills (security, investigate, justify, auditpass, context)
│   ├── meta/                         ← 7 skills (init, upgrade, annotate, next, skillsetup, hooksetup, engram)
│   ├── utils/                        ← 1 skill (browse)
│   └── adversarial/                  ← 3 adversarial agent source files
├── bin/                             ← CLI + verifier + hooks
│   ├── cli.js / install.js / uninstall.js / doctor.js
│   ├── verify.sh / kata-upgrade.sh
│   └── hooks/                        ← 12 hook scripts + _kata_json.sh + _kata_hook.js
├── dist/                            ← Generated compiled skills (do not edit)
├── agents/                          ← Generated agent definitions (do not edit)
├── *-kata/                          ← Generated per-skill dirs at repo root (do not edit)
├── build.sh                         ← Compiles src/ + shared/ → dist/ + agents/ + root *-kata/
├── package.json                     ← npm metadata (name: kata-os)
├── VERSION                          ← Semantic version
├── CHANGELOG.md                     ← Release history
├── CLAUDE.md                        ← Engineering foundation + orchestrator identity
└── manifest.md                      ← One-page product summary

Edit surfaces. shared/ and src/ are the only places you edit. dist/, agents/, and the root-level *-kata/ directories are generated by build.sh and committed for clone-and-use installs.

Foundation. The engineering foundation (core principles, bias classification, quality gates, AskUserQuestion contract) lives in CLAUDE.md at the project root and is loaded once per session. Skill-specific content (role, method, steps, cognitive patterns, calibration examples) lives in src/ source files.

Adversarial agents. Three adversarial agents (investigate-kata-adversarial, refine-kata-adversarial, test-kata-adversarial) run on Sonnet with clean context during dual-mode QA. Source is in src/adversarial/.

Contributing

git clone https://github.com/WolfOnWings/Kata.git
cd Kata

# Edit source files in shared/ or src/
# Build, verify, and check structural integrity
./build.sh
./build.sh --verify-only
./bin/verify.sh

A note on tokens: Kata skills are deep, with thousands of lines of cognitive patterns, quality gates, and calibration examples, so they consume more tokens than shallow instructions. That is the tradeoff: shallow instructions are cheaper, but you get out what you put in.

Built On

Kata synthesizes ideas, patterns, and techniques from these projects:

Major — gstack · Superpowers · everything-claude-code

Significant — Trail of Bits (security audit methodology)

Supporting — Anthropic Cookbook · Context7 MCP

License

MIT — WolfOnWings, 2026
