@wednesday-sol/keel

v0.18.0

Published

4 days ago

A quality gate + AI engineering toolkit you drop into any repo: skills hold your AI's code to production standards; the gate enforces them on every commit — blocking untested code, leaked secrets, stale docs, sloppy commits — scoped to your diff, with `ke

praveenkumar1798-org

keel ⚓

An AI engineering harness for the whole software lifecycle. keel runs your AI through every stage — spec → plan → build → verify → review → ship → release — and enforces the standard at each step, so what your AI ships is production-grade, not just plausible.

Jump to: See it work · The three moves · What each phase does · What makes it work · Adopt it · The gate · Install

See it work — one feature, start to finish

You want a search box that filters your list as you type. Here's the whole thing — idea to a merged release — and the only two moments that need you are the first and the last.

1 · Describe it — `/keel:spec`

You hand keel a problem. It interviews you until the intent is sharp, then commits a spec on its own.

you  ▸  /keel:spec "add a search box to filter my task list"

keel ▸  A few questions to sharpen it:
          • match the title only, or notes too?
          • case-sensitive?
          • what shows when nothing matches?

you  ▸  title + notes · case-insensitive · show "No tasks match"

keel ▸  ✓ committed  docs/specs/SPEC_07_task_search.md
          (Goal · Tasks · Done — each Done line becomes a test)

        next ▸  /keel:plan docs/specs/SPEC_07_task_search.md

2 · Plan it — `/keel:plan <spec path>`

You pass the spec path back. keel slices the work into small, shippable steps and writes them into the spec, so the plan survives the session.

you  ▸  /keel:plan docs/specs/SPEC_07_task_search.md

keel ▸  3 slices, smallest-shippable first:

          P1 ·  filterTasks() — pure, case-insensitive match on title + notes
          P2 ·  the search box, wired to the list
          P3 ·  the "No tasks match" empty state

        ✓ written into the spec's  ## Plan  section

        next ▸  /keel:run docs/specs/SPEC_07_task_search.md

3 · Build it — `/keel:run <spec path>`

You pass the spec path one last time. Now keel is the autopilot: it takes each slice through the full lifecycle — Build → Verify → Review → Ship — and stops only at an open, green, review-resolved PR. Read each slice top to bottom; every phase is its own step:

you ▸  /keel:run docs/specs/SPEC_07_task_search.md auto


━━━━━  slice P1 · filterTasks()  ━━━━━━━━━━━━━━━━━━━━━━━━━━

  BUILD    Writes the failing test first (red), from the Done line:

               it("matches case-insensitively, title + notes", () => {
                 expect(filterTasks(tasks, "MILK"))
                   .toEqual([{ title: "Buy milk" }])
               })

           …then the smallest code to make it pass — pure, no I/O.

  VERIFY   Tests green · keel eval green · coverage 100%.
           Runs it for real, leaves an end-to-end test behind.

  REVIEW   Adversarial read: "what about an empty query?"
           Adds that case as a test — clean.

  SHIP     Opens PR #41, watches CI  ·········  🟢

  GEMINI   Review bot: "trim the query before matching?"
           keel checks it → valid → fixes test-first,
           replies on the thread, re-runs CI  ·········  🟢

  ✓  P1 done — PR #41 open, green, review resolved


━━━━━  slice P2 · search box  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  …same loop (build → verify → review → ship)  →  PR #42  🟢


━━━━━  slice P3 · empty state  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  …same loop  →  PR #43  🟢


3 PRs, open and green.  Review and merge when you're ready.
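For orientation, slice P1 might end up looking roughly like the sketch below. This is not output captured from keel, only an illustration consistent with the transcript. The `Task` shape (with an optional `notes` field) and the decision to return every task for an empty query are assumptions; the trimming follows the review bot's suggestion.

```typescript
// Hypothetical task shape: the transcript only shows `title`; `notes` is assumed.
interface Task {
  title: string;
  notes?: string;
}

// Pure, case-insensitive substring match on title and notes (slice P1).
function filterTasks(tasks: readonly Task[], query: string): Task[] {
  // Trim first, as the review comment in the transcript suggests.
  const needle = query.trim().toLowerCase();

  // Assumption: an empty query means "no filter", so every task is returned.
  if (needle === "") {
    return [...tasks];
  }

  return tasks.filter((task) => {
    const title = task.title.toLowerCase();
    const notes = (task.notes ?? "").toLowerCase();
    return title.includes(needle) || notes.includes(needle);
  });
}

// Sample data, invented for this example.
const tasks: Task[] = [
  { title: "Buy milk" },
  { title: "Call the plumber", notes: "ask about the milk-white tiles" },
  { title: "File taxes" },
];

// Logs the titles "Buy milk" and "Call the plumber".
console.log(filterTasks(tasks, "  MILK ").map((t) => t.title));
```

Keeping the function pure (no I/O, no mutation of its input) is what lets each Done line be exercised as a plain unit test.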

4 · Merge — that part's yours

You review each PR and click merge. On merge, keel releases automatically: reads the Conventional Commits, bumps the version, rolls the CHANGELOG, publishes, and tags.

The only two decisions that were yours: what to build (the spec interview) and whether to merge. keel drove everything in between — and never merged on its own.
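The release step's "read the Conventional Commits, bump the version" logic can be pictured with the sketch below. It is a minimal illustration under common conventions (a breaking change gives a major bump, `feat` a minor, `fix` a patch), not keel's actual implementation; `bumpFor` and `nextVersion` are hypothetical names, and any special handling keel applies to 0.x versions is not modeled.

```typescript
type Bump = "major" | "minor" | "patch" | "none";

// Conventional Commits header: type(scope)!: description
const HEADER = /^(\w+)(\([^)]*\))?(!)?: .+/;

// Decide the largest bump implied by a list of full commit messages.
function bumpFor(commitMessages: string[]): Bump {
  let bump: Bump = "none";
  for (const message of commitMessages) {
    const header = message.split("\n")[0] ?? "";
    const match = HEADER.exec(header);
    if (!match) continue; // not a Conventional Commit; ignored in this sketch

    const type = match[1];
    const breaking =
      match[3] === "!" || /^BREAKING[ -]CHANGE: /m.test(message);

    if (breaking) return "major"; // nothing can outrank a breaking change
    if (type === "feat") bump = "minor";
    else if (type === "fix" && bump === "none") bump = "patch";
  }
  return bump;
}

// Apply the bump to a plain MAJOR.MINOR.PATCH string (no pre-release tags).
function nextVersion(current: string, bump: Bump): string {
  const [major, minor, patch] = current.split(".").map(Number);
  switch (bump) {
    case "major":
      return `${major + 1}.0.0`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
    default:
      return current;
  }
}

// Invented commit history and starting version, for illustration only.
const history = [
  "fix: trim the query before matching",
  "feat(search): add filterTasks",
];
console.log(nextVersion("1.4.2", bumpFor(history))); // 1.5.0
```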

The three moves

For almost any change it's three commands; everything in the example above happens between them:

/keel:spec "<what you want>": keel interviews you, then commits the spec.
/keel:plan <spec path>: keel slices it into the spec's ## Plan; you sign off.
/keel:run <spec path>: keel drives each slice Build → Verify → Review → Ship to a green PR. In step mode it pauses after each slice so you can eyeball it; in auto mode (as in the example above, where auto follows the spec path) it runs them all on its own.

Then you review and merge. Verify and Review run on every slice — never skipped.

Who decides what: you own "is this the right thing to build?" and "is it good enough to merge?" keel drives everything mechanical (the failing test, the gate, CI, the review bots, the release) to green on its own. You're never re-running CI or chasing a bot's comment.

What each phase does

/keel:run runs these for you per slice — but each is also a command you can drive by hand. The full depth (the questions each phase asks, worked inputs) lives in docs/LIFECYCLE.md.

| Phase | Command | What happens | The gate enforces |
| --- | --- | --- | --- |
| Define | /keel:spec | interview a fuzzy idea into a committed spec: Goal · Tasks · Done | spec-gate · spec-quality |
| Plan | /keel:plan <spec> | slice the spec into its ## Plan, dependency-ordered; you sign off | PR-size budget |
| Build | /keel:build | the next slice, test-first, to the code-craft bar | TDD · size · lint · types · duplication |
| Verify | /keel:verify | tests + coverage green, then exercise it live and leave an e2e | patch coverage · e2e |
| Review | /keel:review | adversarial pass: correctness, security, performance, doc honesty | a red gate is an auto-BLOCK |
| Ship | /keel:ship | commit, open the PR, watch CI to green, resolve review feedback | commit-msg · pr-description · changeset |
| Release | automatic | on merge: SemVer bump · CHANGELOG · publish · tag | (none) |

Two helpers tie it together: /keel:run <spec> is the driver that walks the ## Plan and runs Build → Ship for each slice (so you don't call them one at a time); /keel:address-review is the loop Ship uses to triage and answer reviewer comments until none remain. Neither ever merges — that stays your call.
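The control flow of the driver can be sketched as a small loop. This is an illustration of the behavior described above, not keel's code: `runPlan`, `RunResult`, and the `runPhase` callback are hypothetical, and step mode is simplified to "return after one slice and let the caller resume with the rest".

```typescript
type Phase = "build" | "verify" | "review" | "ship";
type Mode = "step" | "auto";

// Note what is absent: there is no "merge" phase. Merging stays a human call.
const PHASES: readonly Phase[] = ["build", "verify", "review", "ship"];

interface RunResult {
  completed: string[]; // slices that made it through every phase
  stoppedAt?: { slice: string; phase: Phase }; // first phase that went red
  pausedAfter?: string; // set in step mode once a slice finishes
}

// `runPhase` stands in for the real phase work and returns true when green.
function runPlan(
  slices: readonly string[],
  mode: Mode,
  runPhase: (slice: string, phase: Phase) => boolean,
): RunResult {
  const completed: string[] = [];
  for (const slice of slices) {
    for (const phase of PHASES) {
      // Verify and Review are part of the fixed sequence, so they cannot be skipped.
      if (!runPhase(slice, phase)) {
        return { completed, stoppedAt: { slice, phase } };
      }
    }
    completed.push(slice);
    if (mode === "step") {
      return { completed, pausedAfter: slice };
    }
  }
  return { completed };
}

// Record the order in which phases run for three invented slices.
const phaseLog: string[] = [];
const result = runPlan(["P1", "P2", "P3"], "auto", (slice, phase) => {
  phaseLog.push(`${slice}:${phase}`);
  return true;
});
console.log(result.completed.join(", ")); // P1, P2, P3
```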

What makes it work

keel isn't a pile of scripts you invoke by hand — it understands the request and the repo:

🧠 It drives the lifecycle. Phase commands walk the AI through each stage, loading the right skill so it knows how to do that step — not just that it should.
🦷 It enforces the bar. The same rules run as a deterministic gate (keel eval) on every commit and in CI, so "≤300 lines," "patch coverage ≥ the bar," "no secret in the diff," and "a spec behind this code" actually bite. Verify and Review are mandatory, never skipped.
🎯 It's smart about where you are. A free-form prompt ("add a search box") is classified into the right phase, keel states its read back to you ("this looks like Define — no spec yet"), and 1–3 questions sharpen it before any code. You describe the work; you don't memorize commands.
♻️ It closes its own loops. Ship watches CI to green; address-review answers reviewer feedback to resolved; on merge, release runs itself. It never hands you a half-done state.
📏 It only judges the lines you changed. Every check is diff-scoped — a 4,000-line legacy file never blocks you; the function you touch today is held to the bar.
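The diff-scoping idea in the last bullet can be illustrated with a short sketch: collect the line numbers a unified diff adds, then keep only the findings that land on those lines. How keel actually does this is not documented here; `addedLinesFromDiff`, `scopeToDiff`, and the `Finding` shape are hypothetical names for the example.

```typescript
// File path → line numbers (in the new file) that the diff added.
type AddedLines = Map<string, Set<number>>;

function addedLinesFromDiff(diff: string): AddedLines {
  const added: AddedLines = new Map();
  let file: string | null = null;
  let newLine = 0; // new-file line number of the next hunk line
  let oldLeft = 0; // old-side lines still expected in the current hunk
  let newLeft = 0; // new-side lines still expected in the current hunk

  for (const line of diff.split("\n")) {
    if (oldLeft > 0 || newLeft > 0) {
      // Inside a hunk body: classify the line by its first character.
      if (line.startsWith("+")) {
        if (file !== null) {
          const set = added.get(file) ?? new Set<number>();
          set.add(newLine);
          added.set(file, set);
        }
        newLine++;
        newLeft--;
      } else if (line.startsWith("-")) {
        oldLeft--;
      } else if (!line.startsWith("\\")) {
        // Context line: present on both sides.
        newLine++;
        newLeft--;
        oldLeft--;
      }
      continue;
    }
    if (line.startsWith("+++ ")) {
      // "+++ b/src/foo.ts" names the new side; "/dev/null" means the file was deleted.
      const path = line.slice(4).trim();
      file = path === "/dev/null" ? null : path.replace(/^b\//, "");
      continue;
    }
    const hunk = /^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunk) {
      oldLeft = hunk[1] === undefined ? 1 : Number(hunk[1]);
      newLine = Number(hunk[2]);
      newLeft = hunk[3] === undefined ? 1 : Number(hunk[3]);
    }
  }
  return added;
}

interface Finding {
  file: string;
  line: number;
  message: string;
}

// Keep only findings on lines this diff added; legacy findings are dropped.
function scopeToDiff(findings: Finding[], added: AddedLines): Finding[] {
  return findings.filter((f) => added.get(f.file)?.has(f.line) ?? false);
}

// An invented diff and two invented findings, one on a legacy line.
const diff = [
  "diff --git a/src/legacy.ts b/src/legacy.ts",
  "--- a/src/legacy.ts",
  "+++ b/src/legacy.ts",
  "@@ -10,3 +10,4 @@",
  " const a = 1;",
  "-const b = 2;",
  "+const b = 3;",
  '+const token = "abc";',
  " const c = 4;",
].join("\n");

const addedLines = addedLinesFromDiff(diff);
const blocking = scopeToDiff(
  [
    { file: "src/legacy.ts", line: 3, message: "old issue on an untouched line" },
    { file: "src/legacy.ts", line: 12, message: "probable secret" },
  ],
  addedLines,
);
console.log(blocking.map((f) => `${f.file}:${f.line} ${f.message}`));
```

Only the finding on line 12 survives, which is the sense in which a large legacy file never blocks a small change.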

Under the hood, every discipline is a skill file with the same anatomy — Process · Rationalizations · Red flags · Verification — so the AI follows the exact pattern and the anti-rationalizations stop it from talking itself out of the rule. Thresholds, logic dirs, base branches, and stack profile all live in keel.config.json; skills and checks read config, never hardcode.

Adopt it

keel meets you wherever you are.

Starting a new project

Run keel init to adopt, then describe what you're building. keel routes you to Define and interviews you into a first spec.
Work the lifecycle from a clean slate: every subsystem gets a spec, every slice is test-first, the gate is green from commit one.
You end up with a spec-driven codebase, living docs, and a tamper-proof CI gate built in from the start — no retrofit later.

Best for: greenfield work where you want discipline baked in, not bolted on.

Adopting into an existing repo

Run pnpm add -D @wednesday-sol/keel && pnpm exec keel init. It copies the skills/commands into .claude/, writes keel.config.json, provisions ESLint, and installs the PR template. It never clobbers your config and is safe to re-run.
Run keel eval — it judges only the lines you change. Your 4,000-line legacy file isn't blocked; the function you touch today is held to the bar.
Turn on the opt-in gates (spec-gate, spec-quality, changelog) when your team is ready — they're off by default, so adoption is gradual.

Best for: raising the floor on an existing codebase without a big-bang cleanup. You raise the floor on new code and grandfather the old.

Refactoring an existing repo, top to bottom

When you've decided to bring the whole legacy up to standard — not grandfather it — run a refactor campaign: /keel:refactor. It reconciles a full cleanup with the diff-scoped gate by doing it as a planned sequence of small, behavior-preserving slices — so the floor rises file by file and you never get a big-bang rewrite or a 4,000-issue dump.

Baseline — keel eval / keel report to record where you're starting from (and confirm the suite is green).
Spec the target — a Define pass for the refactor itself: the standard every module must reach, the behavior that must be preserved, and what's out of scope.
Characterize first — pin current behavior with tests before you touch untested code. The net before the cut.
Prioritize — rank modules by risk × churn × blast-radius; find the shallow ones with the deletion test (would removing it concentrate complexity, or just move it?).
Refactor slice by slice — each module runs the normal lifecycle (Build → Verify → Review → Ship). One intent per PR: refactor or feature, never both. keel eval holds each diff to the full bar.
Ratchet up — as a subsystem reaches standard, tighten its thresholds and turn on the opt-in gates so it can't regress. The grandfathered set shrinks to zero by intent, never by lowering the bar.

Backed by the refactor skill (characterization-first, deepen-don't-reshuffle, seam-first), which leans on code-craft, testing, and deprecation-migration.
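The Prioritize step's "risk × churn × blast-radius" ranking can be sketched as a simple product score. The inputs and their scales below are assumptions made for the example (the source does not say how keel or the refactor skill measures them), and `rankForRefactor` is a hypothetical helper.

```typescript
interface ModuleStats {
  name: string;
  risk: number; // assumed: a 1 to 5 judgment of how fragile the module is
  churn: number; // assumed: commits touching the module recently
  blastRadius: number; // assumed: number of modules that depend on it
}

// Highest score first: the modules where a cleanup pays off soonest.
function rankForRefactor(modules: readonly ModuleStats[]): ModuleStats[] {
  const score = (m: ModuleStats): number => m.risk * m.churn * m.blastRadius;
  // Copy before sorting so the caller's array is left untouched.
  return [...modules].sort((a, b) => score(b) - score(a));
}

// Invented numbers: scores are 8, 480, and 120 respectively.
const ranked = rankForRefactor([
  { name: "legacy/report", risk: 4, churn: 1, blastRadius: 2 },
  { name: "billing", risk: 5, churn: 12, blastRadius: 8 },
  { name: "utils/date", risk: 2, churn: 3, blastRadius: 20 },
]);
console.log(ranked.map((m) => m.name).join(" > "));
// billing > utils/date > legacy/report
```

A product (rather than a sum) means a module that scores near zero on any one axis drops to the bottom, which matches the intent of spending effort where all three factors are high.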

Best for: an inherited or legacy repo you've committed to modernizing — a deliberate campaign, not a weekend rewrite.

The gate — `keel eval`

One command, before every push and in CI. Every check is diff-scoped — new/changed lines held to the full bar; untouched legacy shown, never blocking.

| Check | What it enforces |
| --- | --- |
| pr-size | substantive changed files under budget, for reviewable PRs |
| file size | changed files within limits.fileLines (300) / componentLines (150) |
| naming | no lazily-named files (numbered duplicates, copy/backup/final) |
| TDD | every new logic file ships with a matching test |
| lint | ESLint errors block; warnings on added lines block; legacy grandfathered |
| duplication | copy-paste under threshold (jscpd) |
| type check | tsc --noEmit clean |
| patch coverage | coverage of the lines this diff added ≥ coverage.min |
| doc-sync | every backticked repo path in markdown actually exists |
| jsdoc | each documented function has a description, @function, @param |
| spec-sync | changing a subsystem touches its spec (configurable) |
| plan-sync | changing a subsystem moves its spec's ## Plan, i.e. a slice ticked off (off by default) |
| spec-gate | substantive code ships with a spec behind it (off by default) |
| spec-quality | a changed spec is clearly written: Goal · Tasks · Done (off by default) |
| changelog | a changed spec ships with a dated .changelog/ ledger entry (off by default) |
| e2e | a feature change ships with an end-to-end test (configurable) |
| secret-scan | no probable secret introduced on added lines |
| changeset | a published-package change adds a changeset |
| commit-msg | commit headers in base..HEAD are Conventional Commits |
| pr-description | required PR-template sections present (on a PR run) |
| boundary-guard | no edits to configured read-only/protected paths |
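To make the patch coverage row concrete, here is a minimal sketch of the arithmetic: count how many of the lines a diff added are covered, and compare the percentage with the configured minimum. The function name, the input shapes, and the 80% threshold are assumptions for illustration; the source does not state the default for coverage.min or how keel gathers coverage data.

```typescript
interface PatchCoverage {
  added: number; // executable lines the diff added
  covered: number; // how many of those a test run executed
  percent: number;
  passes: boolean;
}

function patchCoverage(
  addedLines: readonly number[],
  coveredLines: ReadonlySet<number>,
  min: number,
): PatchCoverage {
  const covered = addedLines.filter((n) => coveredLines.has(n)).length;
  // Assumption: a diff that adds no executable lines has nothing to cover, so it passes.
  const percent =
    addedLines.length === 0 ? 100 : (covered / addedLines.length) * 100;
  return { added: addedLines.length, covered, percent, passes: percent >= min };
}

// Lines 11 to 14 were added; the test run covered 1, 2, 11, 12 and 13.
const coverageReport = patchCoverage(
  [11, 12, 13, 14],
  new Set([1, 2, 11, 12, 13]),
  80, // hypothetical value for coverage.min
);
console.log(`${coverageReport.percent}% covered, passes: ${coverageReport.passes}`);
// 75% covered, passes: false
```

Lines 1 and 2 are covered but were not part of the diff, so they do not count either way; only the added lines are judged.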

keel fix auto-clears the mechanical ones (eslint --fix over just your diff — formatting, quotes, import order) without touching legacy; anything needing judgment stays reported, never silently rewritten.

Tamper-proof in CI — reference keel's reusable workflow by a pinned version, so a repo can't quietly weaken its own checks:

# .github/workflows/keel.yml
jobs:
  gate:
    uses: wednesday-solutions/keel/.github/workflows/[email protected]
    with:
      version: "0.8.0"

Only Node ≥ 20 and git are required. Every other tool is optional and degrades gracefully — a missing tool skips with a reason, never a silent install. Pure Node, so it works on Windows and offline.

Install

keel has two halves — use either or both.

The gate (any repo, any editor):

pnpm add -D @wednesday-sol/keel
pnpm exec keel init    # copies skills/commands → .claude/, writes keel.config.json
pnpm exec keel eval    # check your current changes

The skills + commands (inside Claude Code):

/plugin marketplace add wednesday-solutions/keel
/plugin install keel@keel

You now have the /keel:* phase commands, the skills, and the reviewer subagents.

Reference

CLI — keel <command>:

| Command | What it does |
| --- | --- |
| keel init / --update | adopt keel / refresh an adopted copy (never touches your config or skills/local/) |
| keel eval [base] | the full diff-scoped gate |
| keel fix [base] | apply safe auto-fixes (eslint --fix) to the diff |
| keel report [base] | run eval + write a CI job-summary table |
| keel <check> | run any single check (keel lint, keel coverage, keel spec-gate, …) |
| keel version | next SemVer from Conventional Commits (--write bumps files) |
| keel hooks install | git pre-commit / commit-msg / pre-push hooks (no husky) |
| keel codeowners · keel eslint-config | generate .github/CODEOWNERS / the provisioned ESLint config (complexity rules + the Airbnb style guide at warn, eslint.airbnb default on) from config |

Skills & subagents — each skill follows the same anatomy (Process · Rationalizations · Red flags · Verification). Crown jewels: spec-driven, living-docs, changesets, code-craft, ship-check, testing. Add your own under skills/local/ — keel init --update never touches it.
The document system — templates in docs-system/ for the source-of-truth chain (BRD → PRD → TRD → SPEC → ADR → architecture); per-spec change history accrues in .changelog/.
Configuration — every knob lives in keel.config.json, resolved as defaults ∪ stack-profile overlay ∪ your override. Full shape: config/keel.schema.json.
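The "defaults ∪ stack-profile overlay ∪ your override" resolution can be pictured as a layered deep merge in which later layers win. The sketch below is one plausible reading, not keel's resolver: `mergeLayers` is a hypothetical helper, the nesting of `componentLines` under `limits` and the precedence order are assumptions, and the `coverage.min` values are invented (only 300 and 150 come from the gate table above).

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Later layers win. Nested objects merge key by key; anything else is replaced.
function mergeLayers(...layers: JsonObject[]): JsonObject {
  const result: JsonObject = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      const incoming = layer[key];
      const existing = result[key];
      result[key] =
        isObject(existing) && isObject(incoming)
          ? mergeLayers(existing, incoming)
          : incoming;
    }
  }
  return result;
}

const defaults = {
  limits: { fileLines: 300, componentLines: 150 },
  coverage: { min: 80 }, // invented default
};
const stackProfile = { limits: { componentLines: 200 } }; // invented overlay
const userOverride = { coverage: { min: 90 } }; // invented override

const resolved = mergeLayers(defaults, stackProfile, userOverride);
console.log(JSON.stringify(resolved));
// {"limits":{"fileLines":300,"componentLines":200},"coverage":{"min":90}}
```

Because each layer only needs to state what it changes, an override file can stay a few lines long while the defaults carry the rest.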

Credits

keel synthesizes ideas from wednesday-harness (the enforcement spine), addyosmani/agent-skills (lifecycle + anti-rationalization anatomy), mattpocock/skills (composable meta-skills), and gstack (the CI-watch loop).
