@ruchit07/specloom

v0.7.0

Published

3 days ago

A spec-driven development framework for agent-driven SDLC — a deterministic context compiler that ships per-task token bundles, hash-anchored drift detection, and engine-enforced stage gates

ruchit07

spec-driven-development sdd context-engineering deterministic-context-compiler claude-code cursor copilot agents llm token-efficiency drift-detection

SpecLoom

Hand your coding agent a one-paragraph requirement. Get back working code, tests, specs, ADRs, and traceability docs — after answering three review questions in chat. You never write a spec block, never run a compile command; the agent and a deterministic engine do that.

By Ruchit Suthar — Software Architect & Technical Leader. 📖 Background: SpecLoom: Deterministic Context for Coding Agents

Get started — three steps

Step 1 — one command in your project

npx @ruchit07/specloom init

That's the entire setup. It is safe for existing projects: your CLAUDE.md, AGENTS.md, and .github/copilot-instructions.md are never overwritten. SpecLoom appends its ~250-token protocol at the end of each file, between markers, and leaves your content untouched (re-running updates only that section). It installs:

specloom/ — the single artifact home: specloom/specs/ (constitution, pipeline, 10 persona contracts, a sample), specloom/flow/ (pipeline state), and specloom/docs/ (generated docs)
the protocol into every agent's instructions file (Claude Code, Cursor, Copilot, Windsurf, AGENTS.md)
the specloom-ship command (solo) plus the specloom-po / specloom-arch / specloom-dev team commands for every tool + universal SPECLOOM_*.md files

Upgrading from a pre-specloom/ layout (specs/ + docs/ + .specloom/)? Run npx @ruchit07/specloom migrate once — it moves everything under specloom/ and leaves your code untouched.
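
The append-safe behavior described for init (add a section between markers on the first run, refresh only that section on later runs) can be pictured with a minimal sketch. This is not SpecLoom's implementation: the marker strings and the `upsertSection` helper are hypothetical, since the real markers are not shown in this README.

```typescript
// Hypothetical marker strings; the markers SpecLoom actually writes are not shown here.
const START = "<!-- protocol:start -->";
const END = "<!-- protocol:end -->";

/**
 * Insert or refresh a marker-delimited section without touching anything else.
 * If both markers already exist, only the text between them is replaced;
 * otherwise the section is appended after the existing content.
 */
function upsertSection(existing: string, protocol: string): string {
  const section = `${START}\n${protocol.trim()}\n${END}`;
  const start = existing.indexOf(START);
  const end = existing.indexOf(END);
  if (start !== -1 && end > start) {
    // Re-run: swap only the managed section, keep the user's content byte-for-byte.
    return existing.slice(0, start) + section + existing.slice(end + END.length);
  }
  // First run: append after the user's content, separated by a blank line.
  const base = existing.length === 0 || existing.endsWith("\n") ? existing : existing + "\n";
  return base + (base.length > 0 ? "\n" : "") + section + "\n";
}
```

Calling `upsertSection` twice with the same protocol text returns the same string both times, which is the property that makes re-running a setup command safe.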

Step 2 — hand over the requirement

| Your tool | What you type |
|---|---|
| Claude Code | /specloom-ship users can reset their password via an email link |
| Cursor | /specloom-ship <requirement> |
| Copilot (VS Code) | /specloom-ship in chat (enable Chat: Prompt Files in settings) |
| Windsurf | the specloom-ship workflow |
| Any other agent | "Follow SPECLOOM_SHIP.md and ship this: <requirement>" |

Already have an analyzed requirement written up? Pass the file: /specloom-ship requirements.md.

Step 3 — answer three questions

The agent drives all seven stages (discover → specify → design → plan → build → verify → ship) and pauses only three times, in chat:

Spec review — the goal, every requirement's acceptance lines (A1, A2…), the architecture decisions, the task list. Approve, or say what to change.
Code review — the diff, an A# → test traceability table, a clean drift check.
Release go/no-go.

That's your whole job. If the agent is missing information it asks (NEED: …) — it is forbidden from inventing requirements.

What you have when it ships

src/…                          code, every unit stamped @spec:REQ-xxx#hash
test/…                         one test per acceptance line
specloom/specs/<feature>/      typed spec blocks — the single source of truth
specloom/docs/<feature>/       generated, stakeholder-ready:
  spec.md             requirements + acceptance criteria
  architecture.md     architecture, interfaces, security
  decisions/DEC-*.md  one ADR per decision
  verification.md     REQ → A# → test traceability
  release.md          runbook + go/no-go evidence

Multiple features can run concurrently — each keeps its own state and gates.

Why bother? With vs. without

You ask an agent to implement signup.

❌ Without — the agent re-reads your PRD + architecture doc + rules file every turn (~25k tokens, mostly irrelevant), picks different context on every run, leaves no record of what it saw, and three weeks later nobody knows if the spec still matches the code.

✅ With — a deterministic compiler (plain code, no LLM) resolves exactly which spec blocks one task depends on and hands the agent ~370 tokens, byte-identical on every run, hash-logged. Code is stamped with the spec's content hash, so the day someone changes the spec without the code, npx specloom verify turns CI red.

| | Without | With SpecLoom |
|---|---|---|
| Context per task | ~25k tokens, agent-assembled | ~370 tokens, compiled |
| Same task → same context? | no, varies per run | yes, byte-identical + hashed |
| "Is the spec still true?" | nobody knows | verify answers mechanically, in CI |
| Who does the bookkeeping | the LLM (your tokens) | a deterministic engine (free) |

The agent cannot skip your review: stage gates are enforced by the engine, not the prompt. A draft block can't be compiled into a build bundle, and flow approve only passes when the required approved artifacts and a clean drift check actually exist.
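
To make "deterministic compiler" concrete, here is a minimal sketch of the idea: walk the `needs` edges from one task, refuse draft blocks, emit the result in a fixed order, and hash the bytes. It is an illustration under stated assumptions, not SpecLoom's actual code: the `Block` shape, the `resolveClosure` and `compileBundle` names, the bundle layout, and the choice of SHA-256 are all hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory shape; the fields mirror the block header attributes shown in this README.
interface Block {
  id: string;
  needs: string[];
  status: "draft" | "approved";
  body: string;
}

/** Collect every block a task transitively depends on, refusing drafts and dangling edges. */
function resolveClosure(blocks: Map<string, Block>, rootId: string): Block[] {
  const seen = new Set<string>();
  const stack = [rootId];
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (seen.has(id)) continue;
    const block = blocks.get(id);
    if (!block) throw new Error(`needs edge to nowhere: ${id}`);
    if (block.status !== "approved") throw new Error(`draft block cannot be bundled: ${id}`);
    seen.add(id);
    stack.push(...block.needs);
  }
  // Sort by id so the output never depends on traversal or insertion order.
  return Array.from(seen).sort().map((id) => blocks.get(id)!);
}

/** Serialize the closure and hash it; the same inputs always yield the same bytes and digest. */
function compileBundle(blocks: Map<string, Block>, taskId: string): { text: string; hash: string } {
  const text = resolveClosure(blocks, taskId)
    .map((b) => `## ${b.id}\n${b.body.trim()}\n`)
    .join("\n");
  // SHA-256 is an assumption; the README only says bundles are content-hashed.
  const hash = createHash("sha256").update(text, "utf8").digest("hex");
  return { text, hash };
}
```

Because the closure is sorted before serialization and no model is involved, two runs over the same spec blocks produce the same text and the same digest, which is what lets a bundle hash be logged and compared later.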

Gate your CI in one line

# .github/workflows/specs.yml
name: specs
on: [pull_request]
jobs:
  gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruchit07/specloom@main   # spec lint + drift gate, findings annotated inline on the PR

Two checks run on every PR, and findings appear as inline ::error annotations on the changed files:

specloom lint — malformed specs never reach a reviewer: requirements without acceptance lines, broken A# numbering, approved blocks depending on drafts, needs edges to nowhere.
specloom verify — drift never lands: spec changed after the code (STALE), anchor to a deleted block (ORPHAN), approved requirement nobody implemented (UNBOUND).

This pair is the smallest possible team adoption: nobody has to change how they work, and "is the spec still true?" gets answered on every PR.
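
The three drift findings named above can be sketched as a comparison between the anchors stamped in code and the current spec blocks. This is a hedged illustration, not the engine's implementation: the `Anchor`, `SpecBlock`, and `Finding` shapes, the `extractAnchors` and `findDrift` helpers, and the assumption that hashes are lowercase hex are all hypothetical.

```typescript
// Hypothetical shapes for illustration; SpecLoom's internal types are not documented in this README.
interface Anchor { file: string; id: string; hash: string }
interface SpecBlock { id: string; type: string; status: string; hash: string }
type Finding = { kind: "STALE" | "ORPHAN" | "UNBOUND"; id: string; file?: string };

/** Pull "@spec:REQ-010#abc123" style stamps out of one source file (hex hashes are an assumption). */
function extractAnchors(file: string, source: string): Anchor[] {
  const pattern = /@spec:([A-Z]+-\d+)#([0-9a-f]+)/g;
  const anchors: Anchor[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    anchors.push({ file, id: match[1], hash: match[2] });
  }
  return anchors;
}

/** Classify drift between code anchors and the current spec blocks. */
function findDrift(anchors: Anchor[], specs: SpecBlock[]): Finding[] {
  const byId = new Map<string, SpecBlock>();
  for (const s of specs) byId.set(s.id, s);
  const bound = new Set<string>();
  const findings: Finding[] = [];
  for (const a of anchors) {
    const spec = byId.get(a.id);
    if (!spec) {
      findings.push({ kind: "ORPHAN", id: a.id, file: a.file }); // anchor points at a deleted block
      continue;
    }
    bound.add(a.id);
    if (spec.hash !== a.hash) {
      findings.push({ kind: "STALE", id: a.id, file: a.file }); // spec changed after the code
    }
  }
  for (const s of specs) {
    if (s.type === "requirement" && s.status === "approved" && !bound.has(s.id)) {
      findings.push({ kind: "UNBOUND", id: s.id }); // approved requirement nobody implemented
    }
  }
  return findings;
}
```

A CI wrapper around such a check would exit non-zero whenever the findings list is non-empty, matching the "exit 1 on findings" behavior this README describes for verify.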

Adopting at your scale

Solo developer — use the whole loop: init, hand requirements to /specloom-ship, review three checkpoints. You get a second brain that never forgets the spec, plus the paper trail (specloom/docs/<feature>/) that makes your POC look like a team built it.

Team / mid-size company — start with the CI gate above on one repo; zero workflow change, immediate value. Then run the pipeline as three asynchronous handoffs instead of one solo session — the same engine, split across roles:

| Command | Role | Owns | Hands off when |
|---|---|---|---|
| /specloom-po | PO | discover + specify (goal, metrics, requirements, UX) | the spec is approved |
| /specloom-arch | Architect | design + plan (architecture, decisions, security, NFRs, tasks) | the design + plan are approved |
| /specloom-dev | Dev + QA | build → verify → ship (code, tests, verification, release) | the feature ships |

A PO can refine FEAT-005 two sprints ahead while Dev builds FEAT-003 — each feature keeps its own state, so concurrent work never collides. Each command refuses to work a feature out of band (flow next --role exits non-zero), so nobody designs against an unapproved spec or builds against an unapproved plan. specloom flow list is the board (feature · stage · band · gate); flow approve --by <name> records who signed each gate off; flow reject --to <stage> bounces a feature back to an earlier band for rework. /specloom-ship (solo, all bands) stays available for small features.
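
The band guard can be read as a lookup from stage to owning role. The sketch below encodes the stage ownership from the table above and returns the exit code a guard of this kind would produce; the `BAND_OF` map and `bandGuardExitCode` function are hypothetical names, not part of SpecLoom's API.

```typescript
type Stage = "discover" | "specify" | "design" | "plan" | "build" | "verify" | "ship";
type Role = "po" | "architect" | "dev";

// Stage ownership as listed in the role table: PO, then Architect, then Dev + QA.
const BAND_OF: Record<Stage, Role> = {
  discover: "po",
  specify: "po",
  design: "architect",
  plan: "architect",
  build: "dev",
  verify: "dev",
  ship: "dev",
};

/** Return 0 when the feature's current stage belongs to the caller's band, 1 otherwise. */
function bandGuardExitCode(stage: Stage, role: Role): 0 | 1 {
  return BAND_OF[stage] === role ? 0 : 1;
}
```

With this mapping, an architect asking for the next step on a feature still in `specify` gets a non-zero result, which is how a command can refuse to design against an unapproved spec.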

Enterprise / AI-native workflow — the audit story is built in: every bundle is content-hashed (any agent output is traceable to the exact context that produced it), every gate approval is recorded in an append-only ledger per feature, decisions are first-class DEC-* blocks exported as ADRs, and verify makes spec/code conformance a continuously enforced control rather than a quarterly review. Map the seven stages onto your sprint ritual: spec review = refinement, checkpoints = the definition of done.

Troubleshooting

/specloom-ship doesn't appear in Claude Code: commands are read at session start, so restart the session (or run claude fresh) after init.
/specloom-ship doesn't appear in Copilot: enable Chat: Prompt Files ("chat.promptFiles": true) in VS Code settings, then reload. On older Copilot versions without prompt files, just say "Follow SPECLOOM_SHIP.md and ship this: …"; the workflow is identical.
Any tool, zero setup: every agent that can read a file can follow SPECLOOM_SHIP.md.
Upgrading from ≤ 0.4: old versions overwrote existing CLAUDE.md/AGENTS.md on adapt. Versions 0.5.0 and later append between markers and never touch your content. (If you were bitten: git checkout the file, then re-run npx @ruchit07/specloom init.)

Under the hood (what the agent runs so you don't have to)

A .loom file is markdown; only :::-fenced typed blocks are machine-read:

::: id=REQ-010 type=requirement needs=[GOAL-001] tier=1 owner=pm status=approved
Password reset via a one-time email link.
Acceptance:
- A1: requesting a reset for a known email sends a link within 30s
- A2: the link is single-use and expires after 1 hour
:::

needs=[…] gives the compiler the dependency graph; tier controls what may be summarized or dropped under a token budget (acceptance lines are never lost); status gates the pipeline. The agent's loop per task:

specloom compile --task TASK-010 --persona engineer   # → minimal deterministic bundle
# implement ONLY from the bundle; stamp code with @spec:REQ-010#<hash>; 1 test per A#
specloom verify                                       # drift gate: STALE / ORPHAN / UNBOUND ⇒ exit 1
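
The :::-fenced block format described above is simple enough to read with a short line scanner. The following is a minimal sketch, not SpecLoom's parser: `LoomBlock` and `parseLoom` are hypothetical names, and it assumes that header attributes are whitespace-separated `key=value` pairs and that multiple `needs` entries are comma-separated without spaces.

```typescript
// Hypothetical result shape for one typed block.
interface LoomBlock {
  attrs: Record<string, string>;      // id, type, tier, owner, status, ...
  needs: string[];                    // dependency edges from needs=[...]
  acceptance: Record<string, string>; // "A1" -> acceptance text
  body: string;
}

/** Extract typed blocks from a .loom document; prose outside ::: fences is ignored. */
function parseLoom(source: string): LoomBlock[] {
  const blocks: LoomBlock[] = [];
  let current: LoomBlock | null = null;
  let bodyLines: string[] = [];
  for (const line of source.split(/\r?\n/)) {
    if (current === null) {
      const open = /^:::\s+(.+)$/.exec(line);
      if (!open) continue; // ordinary markdown, not machine-read
      const attrs: Record<string, string> = {};
      for (const pair of open[1].trim().split(/\s+/)) {
        const eq = pair.indexOf("=");
        if (eq > 0) attrs[pair.slice(0, eq)] = pair.slice(eq + 1);
      }
      const needsRaw = (attrs["needs"] ?? "[]").replace(/^\[|\]$/g, "");
      current = { attrs, needs: needsRaw ? needsRaw.split(",") : [], acceptance: {}, body: "" };
      bodyLines = [];
    } else if (line.trim() === ":::") {
      current.body = bodyLines.join("\n"); // closing fence: finish the block
      blocks.push(current);
      current = null;
    } else {
      bodyLines.push(line);
      const acc = /^-\s+(A\d+):\s*(.+)$/.exec(line);
      if (acc) current.acceptance[acc[1]] = acc[2];
    }
  }
  return blocks;
}
```

Keeping acceptance lines as a separate keyed map is a design choice of this sketch: it makes it straightforward to check that every A# has a test, and to keep those lines intact when the rest of a body is summarized under a budget.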

All commands (you'll rarely need them yourself):

| Command | What it does |
|---|---|
| specloom init | the one setup command: specloom/ workspace + protocol files (append-safe) + all commands |
| specloom migrate | move a legacy specs/ + docs/ + .specloom/ workspace into the single specloom/ home |
| specloom flow start "<req>" \| --from req.md | start a feature; flow status/next/approve/reject drive the stages |
| specloom flow next [FEAT] --role po\|architect\|dev | the band guard: prints the stage instruction, exits 1 if the feature is out of your band |
| specloom flow approve [FEAT] --by <name> · flow reject [FEAT] --to <stage> | sign a gate off (recorded) · bounce a feature back to an earlier band |
| specloom compile --task TASK-001 [--persona engineer] [--budget 8000] [--manifest-only] | deterministic bundle → stdout (--manifest-only = hash + block ids, no bodies) |
| specloom digest [FEAT-001] [--stage S] | compact, deterministic checkpoint review of a band's blocks |
| specloom verify [--format github] | drift gate; exit 1 on findings (put it in CI) |
| specloom lint [--format github] | spec-quality gate: missing acceptance lines, bad numbering, premature/dangling needs |
| specloom flow list | the board: every feature · stage · band · gate |
| specloom export [FEAT-001] | render a feature's blocks → specloom/docs/ (spec, ADRs, traceability, release) |
| specloom hash REQ-001 · specloom ls | print an anchor · list all blocks |
| specloom adapt [target\|all] · specloom command [target\|all] | re-emit protocol files / commands individually |

Programmatic API: import { compile, verify, exportFeature, digest, blockHash } from '@ruchit07/specloom'.

When to use it — and when not

Use it for POC/MVP and product work where the spec must stay honest as the code grows, where you're tired of re-feeding docs to agents, or where "does the code still match the spec?" needs a mechanical answer.

Skip it for throwaway prototypes (vibe-code those; spec the rewrite), or if you want a full agent runtime — SpecLoom is an engine + protocol that pairs with Claude Code / Cursor / Copilot / Windsurf, not a replacement for them.

Smallest possible adoption: anchor your existing requirements with @spec: comments and put specloom verify in CI. Drift detection alone pays the rent.

How it compares

| | GitHub Spec Kit | BMAD-METHOD | SpecLoom |
|---|---|---|---|
| Spec form | prose docs | prose + templates | typed, ID'd, hashed blocks |
| Context assembly | agent reads files | agent reads files | deterministic compiler |
| Token posture | heavy | very heavy | budgeted, ~10–30× lighter |
| Drift detection | none | none | hash anchors + CI gate |
| Human's role | reads everything | reads everything | three review checkpoints, engine-enforced |
| Determinism | no | no | byte-identical bundles, hashed |

Deep design rationale: ANALYSIS.md.

Design notes: zero runtime dependencies (Node ≥ 18, plain ESM); 17 deterministic e2e tests in CI across Linux/Windows/macOS × Node 18/20/22; hashes stable across line endings (autocrlf-proof); token budgeting uses a deterministic chars/4 estimate (set budgets ~15% under target); a bundle that can't fit fails loudly instead of silently truncating. SpecLoom's own specs are anchored and gate its CI — this repo dogfoods itself.
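
Three of these design notes are easy to show in a few lines: the chars/4 token estimate, the ~15% safety margin on budgets, and hashes that ignore line-ending differences. The sketch below is an assumption-laden illustration rather than the package's code: the function names are hypothetical, rounding up is a guess at how the estimate is rounded, and SHA-256 stands in for whatever hash SpecLoom uses.

```typescript
import { createHash } from "node:crypto";

/** Deterministic token estimate: one token per four characters (rounding up is an assumption). */
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

/** Budget to request for a real target, leaving roughly 15% headroom for estimate error. */
function safeBudget(targetTokens: number): number {
  return Math.floor((targetTokens * 85) / 100);
}

/** Fail loudly when a bundle cannot fit, instead of silently truncating it. */
function assertFits(text: string, budget: number): number {
  const estimate = estimateTokens(text);
  if (estimate > budget) {
    throw new Error(`bundle needs ~${estimate} tokens but the budget is ${budget}`);
  }
  return estimate;
}

/** Hash that is stable across CRLF, CR, and LF checkouts of the same content. */
function stableHash(text: string): string {
  const normalized = text.replace(/\r\n?/g, "\n");
  // SHA-256 is an assumption; the README does not name the algorithm.
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}
```

A character-count estimate will not match any particular model's tokenizer, but it is a pure function of the text, so the same bundle gets the same budget decision on every run and on every machine.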

Pairs with

The Spec-First Workflow — the method this evolves from.
ai-spec — scaffold a single AI-feature spec before you write code.

License

MIT