claude-devkit-cli

v1.5.8

CLI toolkit for spec-first development with Claude Code — hooks, commands, guards, and test runners

A lightweight, spec-first development toolkit for Claude Code. It enforces the cycle spec (with acceptance scenarios) → code + tests → build pass through custom commands, automatic hooks, and a universal test runner.

Works with: Swift, TypeScript/JavaScript, Python, Rust, Go, Java/Kotlin, C#, Ruby.

Dependencies: None (requires only Claude Code CLI, Node.js, Git, and Bash).


Table of Contents

  1. Philosophy
  2. Quick Start
  3. Setup
  4. Daily Workflows
  5. Commands Reference
  6. Automatic Guards (Hooks)
  7. Build Test Script
  8. Spec Format
  9. Customization
  10. Token Cost Guide
  11. Troubleshooting
  12. FAQ

1. Philosophy

The Core Cycle

SPEC (with acceptance scenarios) → CODE + TESTS → BUILD PASS

Every code change — feature, fix, or removal — follows this cycle. The spec is the source of truth. Acceptance scenarios (Given/When/Then) are embedded directly in the spec — no separate test plan file. If code contradicts the spec, the code is wrong.

Why Spec-First?

  • Prevents drift. Acceptance scenarios live inside the spec — no separate test plan to fall out of sync.
  • Tests have purpose. Scenarios derived from specs test behavior, not implementation details. This means tests survive refactoring.
  • AI writes better code. When Claude Code has a spec with concrete Given/When/Then scenarios, it generates more accurate implementations and more meaningful tests.
  • Reviews are grounded. Reviewers can check code against the spec rather than guessing at intent.

Principles

  1. Specs are source of truth — Code changes require spec updates first.
  2. Incremental, not big-bang — Test after each code chunk, not after everything is done.
  3. Tests travel with code — Every PR includes production code + tests + spec updates.
  4. Build pass is the gate — Nothing merges with failing tests.
  5. Everything in the repo — Specs, plans, tests, and code are version-controlled and reviewable.

2. Quick Start

Time needed: 5 minutes.

# 1. Install dev-kit into your project
npx claude-devkit-cli init .

# 2. Open your project in Claude Code
claude

# 3. Create your first spec
/mf-plan "describe your feature here"

# 4. Write code, then test
/mf-build

# 5. Review before merging
/mf-review

# 6. Commit
/mf-commit

That's it. The CLI auto-detects your project type and configures everything.


3. Setup

Prerequisites

| Tool | Required | Why |
|------|----------|-----|
| Claude Code CLI | Yes | Runs the commands and hooks |
| Git | Yes | Change detection, commit workflow |
| Node.js (18+) | Yes | File guard hook, JSON parsing |
| Bash (4+) | Yes | Path guard hook, build-test script |
| Language toolchain | Yes | Whatever your project uses (Swift, npm, pytest, etc.) |

Installation

Option A: One-command install (recommended)

npx claude-devkit-cli init .

Option B: Global install

npm install -g claude-devkit-cli

# Then, in any project:
cd my-project
claude-devkit init .

Option C: Global skills install (available in all projects without running init again)

claude-devkit init --global
# or after per-project init, answer "yes" to the global prompt

Skills installed globally at ~/.claude/skills/ are available in every project. Per-project .claude/skills/ always takes precedence over global — so projects can still override individual skills.
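The precedence rule can be pictured as a lookup that tries the project directory before the global one. This is a minimal sketch, not the CLI's code: `resolveSkill` and its `exists` callback are hypothetical names, and the `<skill>/SKILL.md` layout is taken from the install tree in "What Gets Installed".

```javascript
const path = require("node:path");

// Hypothetical helper: returns the SKILL.md path that would win for a skill name.
// `exists` is injected so the sketch can run without touching the disk.
function resolveSkill(name, projectDir, homeDir, exists) {
  const candidates = [
    path.join(projectDir, ".claude", "skills", name, "SKILL.md"), // per-project first
    path.join(homeDir, ".claude", "skills", name, "SKILL.md"),    // global fallback
  ];
  return candidates.find(exists) ?? null;
}
```

Passing `fs.existsSync` as `exists` would turn this into a real disk lookup.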

Option D: Force re-install (overwrites existing files)

npx claude-devkit-cli init --force .

Option E: Selective install (only specific components)

npx claude-devkit-cli init --only hooks,skills .

What Gets Installed

your-project/
├── .claude/
│   ├── CLAUDE.md              ← Project rules hub
│   ├── settings.json          ← Hook wiring
│   ├── hooks/
│   │   ├── file-guard.js      ← Warns on large files
│   │   ├── path-guard.sh      ← Blocks wasteful Bash paths
│   │   ├── glob-guard.js      ← Blocks broad glob patterns
│   │   ├── comment-guard.js   ← Blocks placeholder comments
│   │   ├── sensitive-guard.sh ← Blocks access to secrets
│   │   └── self-review.sh     ← Quality checklist on stop
│   └── skills/
│       ├── mf-explore/SKILL.md      ← /mf-explore skill
│       ├── mf-plan/SKILL.md         ← /mf-plan skill
│       ├── mf-challenge/SKILL.md    ← /mf-challenge skill
│       ├── mf-build/SKILL.md        ← /mf-build skill
│       ├── mf-fix/SKILL.md          ← /mf-fix skill
│       ├── mf-review/SKILL.md       ← /mf-review skill
│       └── mf-commit/SKILL.md       ← /mf-commit skill
├── scripts/
│   └── build-test.sh          ← Universal test runner
└── docs/
    ├── specs/                 ← Your specs (folder-per-feature)
    │   └── <feature>/
    │       ├── <feature>.md   ← Spec with acceptance scenarios
    │       └── snapshots/     ← Version history (managed by /mf-plan)
    └── WORKFLOW.md            ← Process reference

Post-Install Configuration

The CLI auto-detects your project type and fills in CLAUDE.md. Verify it's correct:

cat .claude/CLAUDE.md

Look for the Project Info section. Ensure language, test framework, and directories are correct. Edit manually if needed.

Upgrade

npx claude-devkit-cli upgrade

Smart upgrade — updates kit files but preserves any you've customized. Use --force to overwrite everything.

# Check if update is available
npx claude-devkit-cli check

# See what changed
npx claude-devkit-cli diff

# View installed files and status
npx claude-devkit-cli list

Uninstall

npx claude-devkit-cli remove

This removes hooks, skills, settings, and build-test.sh. It preserves CLAUDE.md (which you may have customized) and docs/ (which contains your specs).


4. Daily Workflows

Explore Before Planning

When: Requirements are unclear, you're debating between approaches, or it's a brownfield feature with existing code to understand first.

1. /mf-explore "feature description"
   → Asks questions as a Client Technical Lead — one topic at a time.
   → Clarifies: why, behavior, boundaries, business rules, edge cases, permissions, UI.
   → Output: docs/explore/<feature>.md

2. /mf-plan "feature description"
   → Auto-detects docs/explore/<feature>.md, skips redundant discovery.
   → Continue with the normal New Feature flow.

Example:

/mf-explore "cancel order request"

New Feature

When: Building something new — no existing code or spec.

1. /mf-plan "description of the feature"
   → Generates spec with acceptance scenarios at docs/specs/<feature>/<feature>.md.

2. Implement code in chunks.
   After each chunk: /mf-build
   Repeat until green.

3. /mf-review (before merge)

4. /mf-commit

Example:

/mf-plan "User authentication with email/password login, password reset via email, and session management with 24h expiry"

Update Existing Feature

When: Changing behavior of something that already exists.

1. /mf-plan docs/specs/<feature>/<feature>.md "description of changes"
   → Mode C handles everything: snapshot → classification → change report → apply.
   Do NOT manually edit the spec before running /mf-plan.

2. Implement the code change.
   /mf-build
   Fix until green.

3. /mf-review → /mf-commit

Bug Fix

When: Something is broken.

1. /mf-fix "description of the bug"
   → Writes failing test → fixes code → runs full suite.

2. /mf-commit

Example:

/mf-fix "Search returns no results when query contains apostrophes like O'Brien"

Remove Feature

When: Deleting code, removing deprecated functionality.

1. /mf-plan docs/specs/<feature>/<feature>.md "remove stories S-XXX"
   → Mode C creates a snapshot (removing stories = Major), then marks as removed.

2. Delete production code + related tests.

3. bash scripts/build-test.sh (run full suite)
   Fix cascading breaks.

4. /mf-commit

5. Commands Reference

/mf-explore — Feature Discovery as Client Technical Lead

Usage:

/mf-explore "cancel order request"
/mf-explore "user notification preferences"

When to use: Requirements are unclear, you're debating between approaches, or you want to clarify a feature deeply before committing to a spec. Runs before /mf-plan.

How it works:

  1. Phase 0: Codebase scan — Silently checks for existing code, related specs, and existing explore docs before asking anything.
  2. Phase 1: Why, not what — Asks what problem requires this feature, who faces it, and how they handle it today. Prevents building the wrong thing.
  3. Phase 2: Desired behavior — Walks through the flow step by step, identifies trigger and final result, checks for multi-role approval chains.
  4. Phase 2.5: UI/UX expectation — Clarifies interface type (table, form, wizard, dashboard). Offers sensible defaults when the client is unsure. Suggests simpler approaches when expectations are complex.
  5. Phase 3: Boundaries — Impact on existing screens, data changes, migration needs, out of scope, permissions.
  6. Phase 3.5: Scope optimization — Identifies what can ship fast vs what can defer to phase 2.
  7. Phase 4: Business rules & validation — Conditions, formulas (with real numbers), input validation, notifications, time constraints, concurrency.
  8. Phase 5: Edge cases — Empty states, error messages, double submit, network loss, limits, sensitive data, domain-specific cases (payment double-charge, booking overbooking, etc.).
  9. Phase 6: Scenario confirmation — Presents concrete happy path + unhappy paths with fake data. Confirms with user before proceeding.
  10. Phase 7: Handoff summary — Compiles everything into a structured doc, confirms with user, writes to docs/explore/<feature>.md.

Output: docs/explore/<feature>.md — auto-detected by /mf-plan, which skips redundant discovery and maps explore findings directly to spec sections.

Token cost: 10–20k


/mf-plan — Generate Spec with Acceptance Scenarios

Usage:

/mf-plan "user authentication with OAuth2"                          # Mode A: new spec from description
/mf-plan docs/specs/auth/auth.md                                    # Mode B: add scenarios to existing spec
/mf-plan docs/specs/auth/auth.md "add password reset flow"          # Mode C: update existing spec

Modes:

  • Mode A — Creates a new spec with stories and acceptance scenarios from your description.
  • Mode B — Reads an existing spec that has no acceptance scenarios yet, adds them.
  • Mode C — Updates an existing spec: creates a snapshot before Major changes, shows a change report, waits for confirmation, then applies.

How it works:

  1. Phase 0: Codebase Awareness — Scans existing code, docs/specs/, and project patterns before planning. Prevents specs that conflict with existing implementations.
  2. Phase 1: Scope & Split + Scope Challenge — Evaluates feature size (>7 stories or >20 AS → must split). When a feature is large, applies Sizing & Phasing: Phase 1 (minimum viable — smallest slice with value), Phase 2 (core experience — happy path), Phase 3 (edge cases, polish), Phase 4 (optimization, monitoring) — each phase mergeable independently. Also runs a Scope Challenge before drafting: checks for existing code that already solves sub-problems (reuse vs rebuild), flags complexity smells (8+ files or 2+ new classes/services), searches for framework built-ins, checks for distribution needs (new artifact → CI/CD in scope?), and applies the Completeness Principle (complete version costs only CC: ≤15m more → recommend it directly).
  3. Phase 2: Draft Spec — Generates a structured spec with stories and acceptance scenarios (Given/When/Then). Depth scales by priority: P0 gets full GWT + test data, P1 gets GWT, P2 gets 1-2 line descriptions. Runs consistency checks (CC1-CC6) before showing draft.
  4. Phase 3: Clarify Ambiguities — Systematically finds gaps across behavioral, data, auth, non-functional, integration, and concurrency dimensions. Questions include (human: ~X / CC: ~Y) effort scales and Completeness: X/10 scores for each option.
  5. Phase 4: Summary — Shows story counts, AS counts, implementation order, next steps. Every spec also gets a "What Already Exists" section (existing code that partially solves the problem) and a "Not in Scope" section (deferred work with rationale — prevents work from silently dropping).

Mode C (Update) adds:

  • Classification — Walks through M1-M6 checklist to determine Major vs Minor change.
  • Snapshot — Major changes trigger an automatic snapshot (cp, bit-perfect) before editing.
  • Change report — Shows what will change, waits for user confirmation.
  • Consistency check — Runs CC1-CC6 after every update.

Traceability IDs:

  • S-NNN — Stories (with priority P0/P1/P2)
  • AS-NNN — Acceptance Scenarios (Given/When/Then, embedded in stories)
  • FR-NNN — Functional Requirements (if needed)
  • SC-NNN — Success Criteria (if needed)
  • IDs are immutable — deleted IDs are never reused.
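The "never reused" rule means the next ID has to account for retired IDs as well as active ones. The sketch below illustrates that under two assumptions: IDs are zero-padded to three digits (as `NNN` suggests), and the caller keeps a list of retired IDs. `nextId` is a hypothetical helper, not something /mf-plan exposes.

```javascript
// Hypothetical helper: next free ID for a prefix ("S", "AS", "FR", "SC").
// Retired IDs still count as used, so a deleted number is never handed out again.
function nextId(prefix, activeIds, retiredIds = []) {
  const pattern = new RegExp(`^${prefix}-(\\d{3,})$`);
  let max = 0;
  for (const id of [...activeIds, ...retiredIds]) {
    const match = pattern.exec(id);
    if (match) max = Math.max(max, Number(match[1]));
  }
  return `${prefix}-${String(max + 1).padStart(3, "0")}`;
}
```

For example, with active `AS-001` and `AS-002` and a retired `AS-003`, the next scenario ID is `AS-004`, not `AS-003`.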

Directory structure:

docs/specs/<feature>/
  <feature>.md              # single source of truth — always read this file
  snapshots/                # version history (managed by mf-plan, not developers)
    YYYY-MM-DD.md
    YYYY-MM-DD-<REF>.md

Output:

  • Spec with acceptance scenarios: docs/specs/<feature>/<feature>.md

/mf-challenge — Adversarial Plan Review

Usage:

/mf-challenge docs/specs/auth/auth.md   # challenge a spec
/mf-challenge "user authentication"     # challenge by feature name

How it works (7 phases):

  1. Read & Map — Reads the spec (including acceptance scenarios) and maps: decisions made, assumptions (stated AND implied), dependencies, scope boundaries, risk acknowledgments, story-AS consistency.

  2. Scale Reviewers — Assesses complexity and selects reviewers:

    | Complexity | Signals | Reviewers |
    |------------|---------|-----------|
    | Simple | 1 spec section, <20 acceptance scenarios, no auth/data | 2 |
    | Standard | Multiple sections, auth or data involved | 3 |
    | Complex | Multiple integrations, concurrency, migrations, 6+ phases | 4 |

  3. Spawn Reviewers — Launches parallel subagents, each with an adversarial lens:

    • Security Adversary

      • OWASP Top 10
      • Injection vectors
      • Auth/authz bypass
      • Crypto issues
      • Data exposure
      • Supply chain risks
    • Failure Mode Analyst: "Everything that can go wrong, will — simultaneously, at 3 AM, during peak traffic"

      • Partial failures
      • Concurrency & race conditions
      • Cascading failures
      • Recovery paths
      • Idempotency
      • Observability gaps
    • Assumption Destroyer: "'It should work' is not evidence"

      • Unverified claims
      • Scale assumptions
      • Environment differences
      • Integration contracts
      • Data shape assumptions
      • Timing dependencies
      • Hidden dependencies
    • Scope & YAGNI Critic: "The best code is no code. The best feature is the one you didn't build"

      • Over-engineering
      • Premature abstraction
      • Missing MVP cuts
      • Gold plating
      • Simpler alternatives
  4. Deduplicate & Rate — Collects all findings, removes duplicates, rates severity using a Likelihood x Impact matrix. Caps at 15 findings: keeps all Critical, top High by specificity, notes how many Medium were dropped. Each reviewer is limited to top 7 findings.

  5. Adjudicate — Evaluates each finding: Accept (valid flaw, plan should change) or Reject (false positive, acceptable risk, already handled). 1-sentence rationale for each.

  6. User Choice — Two modes: "Apply all accepted" (fast) or "Review each" (walk through one by one).

  7. Apply — Surgical edits only to accepted findings. Doesn't rewrite surrounding sections.

Finding format: Each finding includes Title, Severity, Confidence score (9-10 = verified; 7-8 = strong match; 5-6 = note caveat; ≤4 = omit unless Critical), Location, Flaw description, Evidence (direct quote from the plan), step-by-step Failure scenario, and Suggested fix.
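The confidence bands translate into a small decision function. This is only a sketch of the thresholds quoted above; the function name and the `"report"` label for low-confidence Critical findings are assumptions made for illustration.

```javascript
// Maps a 1-10 confidence score to the handling described in the finding format.
function confidenceAction(score, severity) {
  if (score >= 9) return "verified";
  if (score >= 7) return "strong match";
  if (score >= 5) return "note caveat";
  // 4 or below: dropped unless the finding is Critical (label is an assumption).
  return severity === "Critical" ? "report" : "omit";
}
```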

6 non-negotiable rules:

  1. Spawn reviewers in parallel (not sequential)
  2. Reviewers read files directly, not summarized content
  3. Be hostile — no praise, no softening
  4. Every finding must quote the plan directly as evidence
  5. Quality over quantity — 3 honest findings > 15 padded ones
  6. Skip style/formatting — substance only

When to use:

  • After /mf-plan, before coding — for complex features
  • Features involving auth, payments, data pipelines, multi-service integration
  • NOT needed for simple CRUD, small bug fixes, or trivial features

Token cost: 15-30k (uses parallel subagents, doesn't bloat main context)

/mf-build — TDD Delivery Loop

Usage:

/mf-build                              # build all changes vs base branch
/mf-build src/api/users.ts             # build specific file
/mf-build "user authentication"        # build specific feature

How it works:

  1. Phase 0: Build Context — Finds changed files vs base branch, reads the spec (acceptance scenarios in ## Stories section are the roadmap), checks docs/specs/<feature>/.build-progress to resume from a previous interrupted session, reads existing tests for patterns, fixtures, and naming conventions. Doesn't duplicate what already exists.
  2. Phase 1: Decide What to Test — Determines test scope from acceptance scenarios. Applies the Completeness Principle: AI writes tests ~50x faster than humans, so if full coverage costs CC: ≤15m, it writes complete tests without asking. Always checks 8 mandatory edge case categories: null/undefined, empty arrays/strings, invalid types, boundary values (min/max), error paths (network failures, DB errors), race conditions, large data (10k+ items), and special characters (Unicode, SQL chars).
  3. Phase 1.5: Coverage Map — Before writing a single test, traces every code path (if/else, switch, guard, try/catch) AND user flows (double-click, stale session, navigate away mid-op). Draws an ASCII diagram marking each path as [★★★ TESTED], [★★ TESTED], [★ TESTED], or [GAP]. Gaps marked [GAP] [→E2E] need E2E tests; [GAP] [→EVAL] need evals — when flagged, defines capability + regression evals before implementing and reports pass@1/pass@3. Regression rule: if the diff changes existing behavior with no covering test, a regression test is a CRITICAL requirement — no asking, no skipping.
  4. Phase 2: Write Tests — Writes tests for every [GAP] identified in the Coverage Map. Before moving to Phase 3, verifies: all public functions have unit tests, all API endpoints have integration tests, edge cases covered, error paths tested, tests independent, assertions specific.
  5. Phase 3: Build and Run — Compiles/typechecks first, then runs tests.
  6. Phase 4: Fix Loop — If tests fail, fixes test code only (max 3 attempts, then hard stop and report). If tests expect X but code does Y, asks whether to fix production code or adjust the test — with effort scales (human: ~X / CC: ~Y).
  7. Phase 5: Report — Summary with test counts, results, coverage, files touched, and any E2E/eval gaps to follow up on.
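The Phase 4 fix loop is essentially a bounded retry. The sketch below shows that shape only; `runTests` and `fixTests` are hypothetical callbacks standing in for what Claude does during /mf-build, and the returned object is an assumption.

```javascript
// Bounded fix loop: re-run tests after each fix attempt, hard stop after maxAttempts.
function fixLoop(runTests, fixTests, maxAttempts = 3) {
  let result = runTests();
  let attempts = 0;
  while (!result.passed && attempts < maxAttempts) {
    attempts += 1;
    fixTests(result); // adjusts test code only, never production code
    result = runTests();
  }
  // stopped === true means the limit was hit and the issue should be reported.
  return { passed: result.passed, attempts, stopped: !result.passed };
}
```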

Rules:

  • Never changes production code without asking first
  • Never deletes or weakens existing tests
  • Never adds skip/xit/@disabled to hide failures
  • Max 3 fix attempts — then stops and reports the issue

What NOT to test: Private/internal methods, framework behavior, trivial getters/setters, implementation details.

/mf-fix — Test-First Bug Fix

Usage:

/mf-fix "description of the bug"

How it works:

  1. Phase 0: Investigate — Parses the bug report, locates relevant code, checks git history, and forms a root cause hypothesis. Then draws a Bug Path Diagram (same [GAP]/[★★ TESTED] format as /mf-build) for the buggy function — if no specific [GAP] path can be identified, the hypothesis isn't specific enough yet.
  2. Phase 1: Write Failing Test — Regression rule first: if the bug exists because the diff changed existing behavior with no test covering that path, a regression test is a CRITICAL requirement. Creates a test that reproduces the bug and MUST fail with current code.
  3. Phase 2: Fix — Minimal change only. Blast radius check: if fix touches >5 files, stops and asks before editing.
  4. Phase 3: Verify — Bug test must pass; full suite must show no new regressions.
  5. Phase 4: Root Cause Analysis — Documents: Symptom, Root cause, Gap (why wasn't this caught earlier?), Prevention (one of: type constraint, validation, lint rule, spec update). Non-optional for serious bugs.
  6. Phase 5: Report — Structured debug report with hypothesis, fix, evidence, and regression test reference.

Multiple bugs: Triages by severity, fixes one at a time, commits each separately.

/mf-review — Pre-Merge Quality Gate

Usage:

/mf-review                            # review all changes vs base branch
/mf-review src/auth/                  # review specific directory

How it works:

  1. Phase 0: Understand Intent — Reads commit messages, checks for related spec, expands blast radius. Also notes what already exists: flags if the diff rebuilds something that already exists in the codebase.
  2. Phase 1: Smart Focus — Auto-detects what to focus on based on the diff (auth → security, SQL → injection, payments → idempotency, etc.). Spends 60% of analysis on the primary focus.
  3. Phase 2: Review — Security, correctness, API/Backend patterns (unvalidated input, missing rate limiting, missing timeouts, missing CORS, error message leakage), spec-test alignment, code quality (including diagram maintenance: stale ASCII diagrams in comments are flagged), performance, a Failure Mode Grid for each new codepath (3 dimensions: test covers it? error handling exists? user sees a clear error or silent failure? — all 3 missing = Critical gap), and an AI-generated code addendum when reviewing AI-written changes (behavioral regressions, trust boundaries, architecture drift, model cost escalation).
  4. Phase 3: Report — Structured report. Every finding includes a confidence score (confidence: N/10): 9-10 = verified in code; 7-8 = strong pattern match; 5-6 = possible false positive; <5 = appendix only. Includes a "Not in scope" section listing deferred work with rationale.
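The Failure Mode Grid in Phase 2 reduces to three yes/no checks per codepath. Here is a rough sketch of that grading; only the "all 3 missing = Critical gap" rule comes from the description above, while the field names and the `"covered"` and `"partial"` labels are assumptions.

```javascript
// Grades one new codepath on the three Failure Mode Grid dimensions.
function gradeCodepath({ tested, handled, visibleError }) {
  const missing = [tested, handled, visibleError].filter((ok) => !ok).length;
  if (missing === 3) return "Critical gap"; // no test, no handling, silent failure
  return missing === 0 ? "covered" : "partial";
}
```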

Proportional review: A 5-line doc change gets a light review. A 500-line auth rewrite gets file-by-file deep analysis.

Verdicts: APPROVE / REQUEST CHANGES / NEEDS DISCUSSION.

Rules:

  • At least 1 positive note — reinforces good patterns, not just problems
  • Never auto-fixes code — report only
  • Checks spec-test alignment: code changed → spec/acceptance scenarios/tests also changed?

/mf-commit — Smart Git Commit

Usage:

/mf-commit

How it works:

  1. Analyze — Scans git status, diff stats, and file contents in one pass.
  2. Scan for secrets — Matches patterns: api_key, token, password, secret, private_key, credential, auth_token. Hard block — stops immediately if found, non-negotiable.
  3. Scan for debug code — Matches: console.log, debugger, print(), TODO:remove, HACK:, FIXME:temp, binding.pry, var_dump. Soft warn — proceeds if you confirm.
  4. Stage files — Stages specific files by name. Never uses git add -A.
  5. Generate message — Conventional format: type(scope): description. Imperative tense ("add" not "added"), no period, WHAT+WHY not HOW.
  6. Commit — Does NOT push (safe default). Ask Claude explicitly to push.
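Steps 2 and 3 differ mainly in what happens on a match: secrets block, debug code warns. The sketch below illustrates that split using the pattern lists above. Plain substring matching over added lines is an assumption; the skill itself is an instruction to Claude and may match differently. `scanDiff` is a hypothetical name.

```javascript
const SECRET_WORDS = ["api_key", "token", "password", "secret", "private_key", "credential", "auth_token"];
const DEBUG_MARKERS = ["console.log", "debugger", "print(", "TODO:remove", "HACK:", "FIXME:temp", "binding.pry", "var_dump"];

// Returns "block" for secret-like lines, "warn" for debug leftovers, "ok" otherwise.
function scanDiff(addedLines) {
  const secrets = addedLines.filter((line) =>
    SECRET_WORDS.some((word) => line.toLowerCase().includes(word)));
  const debug = addedLines.filter((line) =>
    DEBUG_MARKERS.some((marker) => line.includes(marker)));
  if (secrets.length > 0) return { action: "block", secrets, debug };
  if (debug.length > 0) return { action: "warn", secrets, debug };
  return { action: "ok", secrets, debug };
}
```

Substring matching like this is deliberately noisy (a variable named `tokenizer` would block), which fits a hard-block rule that prefers false positives over leaks.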

Large diff warning: If >10 files OR >300 lines changed, suggests splitting into smaller commits for easier review.

Never stages: .env, credentials, build artifacts, generated files, binaries >1MB.

Breaking changes: If the diff removes/renames a public function, export, or API endpoint, uses feat! or fix! type, or adds a BREAKING CHANGE: footer.
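The message format and the large-diff threshold are both simple enough to sketch. The helper names below are hypothetical. The placement of `!` after the scope follows the Conventional Commits convention, and the thresholds are the ones stated above.

```javascript
// Builds the first line of a conventional commit: type(scope): description.
function commitHeader(type, scope, description, breaking = false) {
  const subject = description.replace(/\.$/, ""); // no trailing period
  const bang = breaking ? "!" : "";
  return scope ? `${type}(${scope})${bang}: ${subject}` : `${type}${bang}: ${subject}`;
}

// Large diff warning: more than 10 files OR more than 300 changed lines.
function shouldSuggestSplit(filesChanged, linesChanged) {
  return filesChanged > 10 || linesChanged > 300;
}
```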


6. Automatic Guards (Hooks)

Hooks run automatically — you don't invoke them. They provide passive protection.

File Guard (file-guard.js)

Trigger: After every Write or Edit operation.

Action: If a modified source code file exceeds 350 lines, injects a warning suggesting modularization. Docs, configs, and templates are intentionally excluded — they are naturally long.

Blocking: No — warns only, does not prevent the edit.

Checked extensions: .ts, .tsx, .js, .jsx, .py, .php, .rb, .rs, .go, .swift, .kt, .java, .cs, .cpp, .c, .dart, .vue, .svelte, .astro, and more. Not checked: .md, .json, .yaml, .toml, .html, .css, .sh, and other non-source files.

Configuration:

# Change the line threshold (default: 350)
export FILE_GUARD_THRESHOLD=500

# Exclude files from checking (comma-separated globs)
export FILE_GUARD_EXCLUDE="*.generated.swift,*.pb.go,*.min.js"
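To make the rule concrete, here is a minimal sketch of the check, assuming the hook counts lines and compares against the threshold. It is not the contents of file-guard.js: the function names are hypothetical, the extension set is a subset of the list above, the exclude globs are not modeled, and the sketch takes file content as a string rather than reading from disk.

```javascript
const path = require("node:path");

// Subset of the checked extensions listed above.
const CHECKED = new Set([".ts", ".tsx", ".js", ".jsx", ".py", ".rb", ".rs", ".go", ".swift"]);

// Counts lines without treating a trailing newline as an extra line.
function countLines(content) {
  if (content === "") return 0;
  const parts = content.split("\n").length;
  return content.endsWith("\n") ? parts - 1 : parts;
}

// Returns a warning string, or null when the file is fine or not a source file.
function fileGuardWarning(filePath, content, threshold = Number(process.env.FILE_GUARD_THRESHOLD) || 350) {
  if (!CHECKED.has(path.extname(filePath))) return null;
  const lines = countLines(content);
  return lines > threshold
    ? `${filePath} has ${lines} lines (threshold ${threshold}); consider splitting it`
    : null;
}
```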

Path Guard (path-guard.sh)

Trigger: Before every Bash command.

Action: Blocks commands that reference large directories (node_modules, build artifacts, etc.).

Blocking: Yes — prevents the command from running.

Default blocked paths: node_modules, __pycache__, .git/objects, dist/, build/, .next/, vendor/, Pods/, .build/, DerivedData/, .gradle/, target/debug, target/release, .nuget, .cache

Configuration:

# Add project-specific blocked paths (pipe-separated)
export PATH_GUARD_EXTRA="\.terraform|\.vagrant|\.docker"
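Since `PATH_GUARD_EXTRA` is pipe-separated regex, one plausible reading is that it is appended to the default alternation. The sketch below shows that idea in JavaScript for consistency with the other examples. The real hook is a Bash script, and `isBlockedCommand` and its naive whole-command match are assumptions.

```javascript
// Default blocked paths from the list above, escaped for use in a regex.
const DEFAULT_BLOCKED = [
  "node_modules", "__pycache__", "\\.git/objects", "dist/", "build/", "\\.next/",
  "vendor/", "Pods/", "\\.build/", "DerivedData/", "\\.gradle/",
  "target/debug", "target/release", "\\.nuget", "\\.cache",
];

// True when the command text mentions any blocked path (defaults plus extras).
function isBlockedCommand(command, extra = process.env.PATH_GUARD_EXTRA || "") {
  const source = DEFAULT_BLOCKED.join("|") + (extra ? `|${extra}` : "");
  return new RegExp(source).test(command);
}
```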

Glob Guard (glob-guard.js)

Trigger: Before every Glob (file search) operation.

Action: Blocks overly broad glob patterns at project root that would return thousands of files and fill the context window.

Blocking: Yes — prevents the glob and suggests scoped alternatives.

What it blocks:

  • **/*.ts at project root (use src/**/*.ts instead)
  • **/* at project root (use src/**/* instead)
  • * or ** at project root
  • Any recursive glob without a specific directory prefix

What it allows:

  • src/**/*.ts — scoped to a specific directory
  • tests/**/*.test.js — scoped to tests
  • **/*.ts when run from inside a scoped directory (e.g., path: "src")
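The block and allow lists above can be approximated with one rule: at the project root, reject a bare `*` and any recursive pattern whose first segment is a wildcard. This is a sketch of that reading, not the logic in glob-guard.js; `isBroadGlob` is a hypothetical name, and non-recursive patterns such as `*.ts` are treated as allowed because the lists above do not mention them.

```javascript
// True when the pattern is considered too broad for the given search path.
function isBroadGlob(pattern, searchPath = "") {
  const atRoot = searchPath === "" || searchPath === "." || searchPath === "./";
  if (!atRoot) return false; // scoped by path, e.g. path: "src"
  if (pattern === "*") return true;
  const firstSegment = pattern.split("/")[0];
  // Recursive glob with no directory prefix, e.g. "**/*.ts" or "**".
  return pattern.includes("**") && firstSegment.includes("*");
}
```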

Comment Guard (comment-guard.js)

Trigger: After every Edit operation.

Action: Detects when real code is replaced with placeholder comments like // ... existing code ... or // rest of implementation. This is a common LLM laziness pattern.

Blocking: Yes — rejects the edit and tells Claude to preserve the original code.

What it catches:

  • // ... existing code ..., // ... rest of implementation
  • // [previous code remains], // unchanged
  • /* ... */ replacing real code
  • # ... existing ... (Python placeholders)
  • // TODO: implement replacing real code
  • Any edit where real code is replaced with a much shorter comment-only block

What it allows:

  • Editing comments (old content was already comments)
  • Adding comments alongside code (new content has both)
  • Normal code replacements
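One way to read these rules: an edit is rejected only when the old text contained code and the new text is nothing but comments. The sketch below implements that reading. It is not the code in comment-guard.js: the names are hypothetical, the comment detection is naive (any line starting with `//`, `#`, `/*`, or `*`), and "much shorter" is assumed to mean under half the original length.

```javascript
const PLACEHOLDER = /\.\.\.\s*(existing|rest)\b|previous code remains|^\s*(\/\/|#)\s*unchanged\b|TODO:\s*implement/im;

// True when every non-blank line looks like a comment.
function isCommentOnly(text) {
  return text.split("\n")
    .filter((line) => line.trim() !== "")
    .every((line) => /^\s*(\/\/|#|\/\*|\*)/.test(line));
}

// True when the edit swaps real code for a placeholder or a much shorter comment.
function shouldRejectEdit(oldString, newString) {
  if (newString.trim() === "") return false;   // plain deletion, not a placeholder
  if (isCommentOnly(oldString)) return false;  // editing comments is allowed
  if (!isCommentOnly(newString)) return false; // new content still contains code
  return PLACEHOLDER.test(newString) || newString.length < oldString.length / 2;
}
```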

Sensitive Guard (sensitive-guard.sh)

Trigger: Before every Read, Write, Edit, and Bash command.

Action: Protects files containing secrets: .env, private keys, credentials, tokens.

Blocking: Read/Write/Edit → blocks (exit 2). Bash commands → warns only (allows access).

The Bash warn-only behavior enables an approval flow: Claude asks the user for permission and, if approved, can run cat .env through Bash to read the file.

Protected files:

  • .env, .env.local, .env.production, etc. (but NOT .env.example)
  • Private keys: *.pem, *.key, *.p12, *.pfx, *.jks
  • SSH keys: id_rsa, id_ecdsa, id_ed25519
  • Cloud credentials: serviceAccountKey.json, firebase-adminsdk*
  • Token files: .npmrc, .pypirc, .netrc
  • Any file matching *credential*, *secret*, *private_key*

Supports .agentignore: Create a .agentignore file (or .aiignore, .cursorignore) in the project root with gitignore-style patterns to add project-specific protections.
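The protected-file list maps naturally onto a set of filename patterns. The following is a sketch under that assumption, written in JavaScript for consistency with the other examples even though the real hook is a Bash script. `isSensitive` is a hypothetical name, and neither `.agentignore` nor `SENSITIVE_GUARD_EXTRA` is modeled here.

```javascript
const path = require("node:path");

const PROTECTED = [
  /^\.env(\..+)?$/,                  // .env, .env.local, .env.production
  /\.(pem|key|p12|pfx|jks)$/,        // private keys
  /^id_(rsa|ecdsa|ed25519)$/,        // SSH keys
  /^serviceAccountKey\.json$/,       // cloud credentials
  /^firebase-adminsdk/,
  /^\.(npmrc|pypirc|netrc)$/,        // token files
  /credential|secret|private_key/i,  // catch-all name patterns
];

// True when the file name matches a protected pattern.
function isSensitive(filePath) {
  const name = path.basename(filePath);
  if (name === ".env.example") return false; // explicitly allowed
  return PROTECTED.some((pattern) => pattern.test(name));
}
```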

Configuration:

# Add extra patterns (pipe-separated regex)
export SENSITIVE_GUARD_EXTRA="\.vault|.*_token\.json"

Self-Review (self-review.sh)

Trigger: When Claude is about to stop (Stop event).

Action: Injects a self-review checklist reminding Claude to verify quality before finishing.

Blocking: No — just a reminder.

Questions asked:

  1. Did you leave any TODO/FIXME that should be resolved now?
  2. Did you create mock/fake implementations just to pass tests?
  3. Did you replace real code with placeholder comments?
  4. Do all changed files compile and typecheck cleanly?
  5. Did you run the full test suite, not just the new tests?
  6. Are there any files you modified but forgot to include in the summary?

Configuration:

# Disable self-review
export SELF_REVIEW_ENABLED=false

Testing Hooks Manually

You can test hooks by piping mock JSON payloads:

# ── Path Guard ──
# Should exit 2 (blocked)
echo '{"tool_input":{"command":"ls node_modules"}}' | bash .claude/hooks/path-guard.sh
echo $?  # expect: 2

# Should exit 0 (allowed)
echo '{"tool_input":{"command":"ls src"}}' | bash .claude/hooks/path-guard.sh
echo $?  # expect: 0

# ── File Guard ──
# Use a source extension and more lines than the default threshold (350)
seq 1 400 > /tmp/test-large.ts
echo '{"tool_input":{"file_path":"/tmp/test-large.ts"}}' | node .claude/hooks/file-guard.js
# Should output JSON with additionalContext warning

# ── Comment Guard ──
# Should exit 2 (blocked — replacing code with placeholder)
echo '{"tool_input":{"old_string":"function hello() {\n  return world;\n}","new_string":"// ... existing code ..."}}' | node .claude/hooks/comment-guard.js
echo $?  # expect: 2

# Should exit 0 (allowed — replacing code with code)
echo '{"tool_input":{"old_string":"return a;","new_string":"return b;"}}' | node .claude/hooks/comment-guard.js
echo $?  # expect: 0

# ── Sensitive Guard ──
# Should exit 2 (blocked)
echo '{"tool_input":{"file_path":".env"}}' | bash .claude/hooks/sensitive-guard.sh
echo $?  # expect: 2

# Should exit 0 (allowed)
echo '{"tool_input":{"file_path":".env.example"}}' | bash .claude/hooks/sensitive-guard.sh
echo $?  # expect: 0

# Should exit 0 (warn only — bash commands are allowed for approved access)
echo '{"tool_input":{"command":"cat .env.local"}}' | bash .claude/hooks/sensitive-guard.sh
echo $?  # expect: 0 (with warning on stderr)

# ── Glob Guard ──
# Should exit 2 (blocked — broad pattern at root)
echo '{"tool_input":{"pattern":"**/*.ts"}}' | node .claude/hooks/glob-guard.js
echo $?  # expect: 2

# Should exit 0 (allowed — scoped pattern)
echo '{"tool_input":{"pattern":"src/**/*.ts"}}' | node .claude/hooks/glob-guard.js
echo $?  # expect: 0

7. Build Test Script

Usage

bash scripts/build-test.sh                    # run all tests
bash scripts/build-test.sh --filter "Auth"    # filter by pattern
bash scripts/build-test.sh --list             # show detected project type
bash scripts/build-test.sh --ci               # machine-readable output
bash scripts/build-test.sh --help             # show usage

Supported Languages

| Language | Detected By | Test Command |
|----------|-------------|--------------|
| Swift (SPM) | Package.swift | swift test |
| Swift (Xcode) | *.xcworkspace / *.xcodeproj | xcodebuild test |
| Node (Vitest) | vitest.config.* or vitest in package.json | npx vitest run |
| Node (Jest) | jest.config.* or jest in package.json | npx jest |
| Python (pytest) | pyproject.toml, setup.py, pytest.ini | python3 -m pytest |
| Rust | Cargo.toml | cargo test |
| Go | go.mod | go test -race ./... |
| Java (Gradle) | build.gradle / build.gradle.kts | ./gradlew test |
| Java (Maven) | pom.xml | mvn test |
| C# (.NET) | *.sln / *.csproj | dotnet test |
| Ruby (RSpec) | Gemfile with rspec | bundle exec rspec |
| Ruby (Minitest) | Gemfile without rspec | bundle exec rake test |

Detection order: first match wins. The script also detects package managers (pnpm, bun) for Node projects.

Exit Codes

| Code | Meaning |
|------|---------|
| 0 | All tests passed |
| 1 | Tests failed |
| 2 | No project detected or missing tooling |
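
A calling script can branch on these codes to tell a real test failure apart from a setup problem. The sketch below is illustrative only: describe_exit is a hypothetical helper, and the messages simply restate the table.

```shell
# Hypothetical helper: translate a build-test.sh exit code into a message.
describe_exit() {
  case "$1" in
    0) echo "all tests passed" ;;
    1) echo "tests failed" ;;
    2) echo "no project detected or missing tooling" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage:
#   status=0
#   bash scripts/build-test.sh --ci || status=$?
#   describe_exit "$status"
```

Treating code 2 separately is useful in CI, where a missing toolchain usually calls for a different fix than a failing test.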

CI Integration

# GitHub Actions example
- name: Run tests
  run: bash scripts/build-test.sh --ci

Adding a New Language

Edit scripts/build-test.sh:

  1. Add a detect_<language>() function
  2. Add it to the DETECTORS array
  3. The function should set LANG_NAME and TEST_CMD
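
The three steps above might look like the following for Elixir. This is a sketch under assumptions, not code from the shipped script: detect_elixir and detect_project are hypothetical, the real DETECTORS array contains more entries, and the convention that a detector returns 0 on a match is assumed rather than confirmed.

```shell
#!/usr/bin/env bash
# Hypothetical detector for Elixir; not part of the shipped script.
# Assumption: a detector returns 0 on a match and non-zero otherwise.
detect_elixir() {
  [ -f mix.exs ] || return 1
  LANG_NAME="Elixir"
  TEST_CMD="mix test"
}

# Stand-in for the real DETECTORS array; its order encodes "first match wins".
DETECTORS=(detect_elixir)

# Hypothetical driver loop showing how the array could be consumed.
detect_project() {
  local detector
  LANG_NAME=""
  TEST_CMD=""
  for detector in "${DETECTORS[@]}"; do
    if "$detector"; then
      return 0   # stop at the first detector that matches
    fi
  done
  return 2       # mirrors the documented "no project detected" exit code
}
```

Where the new detector sits in the array matters: a project containing both mix.exs and package.json would be reported as whichever detector is listed first.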

8. Spec Format

Spec Template

Create specs at docs/specs/<feature>/<feature>.md:

# Spec: <Feature Name>

**Created:** 2026-04-02
**Last updated:** 2026-04-02
**Status:** Draft | Active | Deprecated

## Overview
What this feature does, why it exists, who uses it. 2-3 sentences.

## Data Model
Entities, attributes, relationships (if applicable).

## Stories

### S-001: <Story name> (P0)

**Description:** [user story]
**Source:** [optional: ticket/issue ref]

**Acceptance Scenarios:**

AS-001: <short description>
- **Given:** [state]
- **When:** [action]
- **Then:** [expected]
- **Data:** [test data]

AS-002: <short description>
- **Given:** [error state]
- **When:** [action]
- **Then:** [error handling]

### S-002: <Story name> (P1)

AS-003: <short description>
- **Given:** [state]
- **When:** [action]
- **Then:** [expected]

### S-003: <Story name> (P2)

AS-004: <short description>
- [flow description + expected behavior]

## Constraints & Invariants
Rules that must always hold.

## Change Log

| Date | Change | Ref |
|------|--------|-----|
| 2026-04-02 | Initial creation | -- |

Skip sections that don't apply. Match depth to feature complexity.

Acceptance Scenario depth by priority:

  • P0: Full Given + When + Then + Data + Setup. At least 1 happy path + 1 error path.
  • P1: Given + When + Then. At least 1 happy path.
  • P2: 1-2 line flow description. At least 1 scenario.

Snapshots (Version History)

When /mf-plan Mode C detects a Major change (new story, removed story, priority change, flow change, behavior change for P0, or constraint change), it automatically creates a snapshot before updating:

docs/specs/<feature>/snapshots/
  2026-04-02.md              ← full copy at that point in time
  2026-04-05-BILL-101.md     ← with ticket reference

Snapshots are immutable, managed by mf-plan (not by developers), and capped at the 5 most recent.
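
To see which files a "keep the 5 most recent" rule would retain, a read-only listing is enough. This is an illustration only, since mf-plan manages snapshots itself. kept_snapshots is a hypothetical helper, and it assumes the YYYY-MM-DD prefix makes reverse lexical order equal to newest-first.

```shell
# Illustration only: mf-plan manages snapshots itself. This lists the files
# a "keep the 5 most recent" rule would retain and deletes nothing.
# Assumption: YYYY-MM-DD prefixes make reverse lexical order newest-first.
kept_snapshots() {
  local dir="$1"
  ls "$dir" | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}.*\.md$' | sort -r | head -n 5
}

# Usage:
#   kept_snapshots docs/specs/user-auth/snapshots
```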

Naming Conventions

| Item | Convention | Example |
|------|-----------|---------|
| Spec directory | docs/specs/<feature>/ | docs/specs/user-auth/ |
| Spec file | <feature>.md in feature directory | user-auth.md |
| Story ID | S-NNN sequential per spec | S-001, S-005 |
| Scenario ID | AS-NNN sequential across all stories | AS-001, AS-042 |
| Priority | P0 (critical), P1 (important), P2 (nice-to-have) — per story | — |
| Snapshot | YYYY-MM-DD.md or YYYY-MM-DD-<REF>.md in snapshots/ | 2026-04-02.md |
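
These conventions are regular enough to check mechanically, for example in a pre-commit script. The validators below are a hypothetical sketch, not part of the kit. They assume NNN means exactly three digits and that <REF> contains only letters, digits, and hyphens, as in the examples.

```shell
# Hypothetical validators for the naming conventions.
# Assumptions: NNN is exactly three digits; <REF> is letters, digits, hyphens.
is_story_id() {
  printf '%s\n' "$1" | grep -Eq '^S-[0-9]{3}$'
}

is_scenario_id() {
  printf '%s\n' "$1" | grep -Eq '^AS-[0-9]{3}$'
}

is_snapshot_name() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}(-[A-Za-z0-9-]+)?\.md$'
}

# Usage:
#   is_snapshot_name "2026-04-05-BILL-101.md" && echo "valid snapshot name"
```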


9. Customization

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| FILE_GUARD_THRESHOLD | 200 | Max lines before file guard warns |
| FILE_GUARD_EXCLUDE | (empty) | Comma-separated globs to skip (e.g. *.generated.swift) |
| PATH_GUARD_EXTRA | (empty) | Additional pipe-separated patterns to block (e.g. \.terraform) |
| SENSITIVE_GUARD_EXTRA | (empty) | Additional pipe-separated patterns for sensitive files (e.g. \.vault) |
| SELF_REVIEW_ENABLED | true | Set to false to disable the self-review checklist on Stop |

Set these in your shell profile or project .envrc (if using direnv).
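
A project-level .envrc combining these variables could look like the sketch below. The values are illustrative, not recommendations, and the \.cache pattern is an invented example of a second pipe-separated entry.

```shell
# Example .envrc (illustrative values only)
export FILE_GUARD_THRESHOLD=300                        # warn above 300 lines instead of 200
export FILE_GUARD_EXCLUDE="*.generated.swift,*.pb.go"  # comma-separated globs
export PATH_GUARD_EXTRA='\.terraform|\.cache'          # pipe-separated patterns (\.cache is hypothetical)
export SENSITIVE_GUARD_EXTRA='\.vault'
export SELF_REVIEW_ENABLED=false
```

Single quotes keep the backslashes and the pipe character literal, so the pattern reaches the guard unchanged.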

Extending CLAUDE.md

Add project-specific rules to .claude/CLAUDE.md:

## Project-Specific Rules

- All API endpoints must have OpenAPI annotations
- Database migrations must be reversible
- UI components must support dark mode
- All strings must be localized via i18n keys

Adding Custom Skills

Create new skills in .claude/skills/<name>/SKILL.md:

# .claude/skills/deploy/SKILL.md

Run the deployment pipeline:
1. /mf-review
2. /mf-commit
3. Run: bash scripts/deploy.sh $ARGUMENTS
4. Verify deployment health: curl -f https://api.example.com/health

Then use: /deploy staging


10. Token Cost Guide

| Activity | Tokens | Frequency |
|----------|--------|-----------|
| /mf-build (incremental, 1-3 files) | 5–10k | Every code chunk |
| /mf-fix (single bug) | 3–5k | As needed |
| /mf-commit | 2–4k | Every commit |
| /mf-review (diff-based) | 10–20k | Before merge |
| /mf-plan (new feature) | 20–40k | Start of feature |
| /mf-challenge (adversarial review) | 15–30k | After /mf-plan, complex features |
| Full audit (manual prompt) | 100k+ | Before release |

Minimizing Token Usage

  • Test incrementally. Running /mf-build after each small chunk uses 5–10k tokens. Waiting until everything is done and then running /mf-build on one large diff uses 50k+.
  • Use filters. /mf-build src/auth/login.ts is cheaper than /mf-build on the whole project.
  • Skip /mf-plan for tiny changes. Under 5 lines with no behavior change? Just /mf-build and /mf-commit.
  • Use /mf-review only before merge. Not after every commit.

11. Troubleshooting

Hook not firing

Symptom: File guard or path guard doesn't trigger.

Check:

  1. Is settings.json valid? node -e "JSON.parse(require('fs').readFileSync('.claude/settings.json','utf-8'))"
  2. Are hooks executable? ls -la .claude/hooks/
  3. Is Node.js available? node --version
  4. Is $CLAUDE_PROJECT_DIR set? Check in Claude Code with: echo $CLAUDE_PROJECT_DIR

Tests not detected

Symptom: build-test.sh says "No supported project detected."

Check:

  1. Are you in the project root? pwd
  2. Does the project marker file exist? (e.g., package.json, Cargo.toml)
  3. Run bash scripts/build-test.sh --list for diagnostic output.

Wrong base branch

Symptom: /mf-build or /mf-review compares against the wrong branch.

Check:

git symbolic-ref refs/remotes/origin/HEAD

If this is wrong or missing:

git remote set-head origin <your-main-branch>

Path guard blocking a legitimate command

Symptom: Claude can't run a command you need.

Fix: The path guard blocks broad patterns. If you need to access build/ for a specific reason, run the command directly in your terminal (not through Claude Code).

File guard warning on generated files

Fix: Set the exclude pattern:

export FILE_GUARD_EXCLUDE="*.generated.swift,*.pb.go,*.min.js,*.snap"
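
Before relying on a pattern list, it can help to check which file names it would cover. The sketch below is an illustration only: is_excluded is a hypothetical helper, and it assumes each comma-separated glob is matched against the file's basename with shell-style globbing. The actual matching in file-guard.js may differ.

```shell
# Illustration only: file-guard.js may match differently.
# Assumption: each comma-separated glob is tested against the file's basename.
# The function body runs in a subshell, so IFS and noglob changes stay local.
is_excluded() (
  name="$(basename "$1")"
  IFS=','
  set -f   # do not expand the globs against the current directory
  for pattern in $2; do
    # An unquoted variable in a case pattern is treated as a glob.
    case "$name" in
      $pattern) exit 0 ;;
    esac
  done
  exit 1
)

# Usage:
#   is_excluded "src/api/user.pb.go" "$FILE_GUARD_EXCLUDE" && echo "skipped by file guard"
```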

12. FAQ

Q: Do I need specs for every tiny change? A: No. Changes under 5 lines with no behavior change can skip the spec. Just /mf-build and /mf-commit. The spec-first rule is for meaningful behavior changes.

Q: Can I use mocks in tests? A: Only for external services you can't run locally (third-party APIs, email services). Never mock your own code or database just to make tests pass faster.

Q: What if Claude writes a test that tests the wrong thing? A: This usually means the spec is ambiguous. Clarify the spec first, then re-run /mf-build. Good specs produce good tests.

Q: Can I use this with other AI coding tools? A: The commands and hooks are Claude Code-specific. The specs, workflow, and build-test.sh work with any tool or manual workflow.

Q: When should I use /mf-challenge? A: After /mf-plan, for complex features involving authentication, payments, data pipelines, or multi-service integration. It spawns parallel hostile reviewers that find security holes, failure modes, and false assumptions BEFORE you write code. Skip it for simple CRUD or small features — the overhead isn't worth it.

Q: How do I do a full coverage audit? A: This is intentionally not a command (it's expensive and rare). When needed, prompt Claude directly: "Audit test coverage for feature X against docs/specs/X/X.md acceptance scenarios. Identify gaps and write missing tests."

Q: What if my project uses multiple languages? A: build-test.sh detects the first match. For monorepos, you may need to run it from each sub-project directory or customize the script.

Q: Can I add more skills? A: Yes. Create the file .claude/skills/<name>/SKILL.md and it becomes available as the slash command /<name>. See Customization.

Q: How do I update the kit in existing projects? A: Run npx claude-devkit-cli upgrade. It automatically detects which files you've customized and only updates unchanged files. Use --force to overwrite everything.

Q: I installed with the old setup.sh — how do I migrate? A: Run npx claude-devkit-cli init --adopt . to generate a manifest from your existing files without overwriting anything. Future upgrades will then work normally.