@schilling.mark.a/atdd-guardian
v1.0.1
Context-aware ATDD enforcement MCP server. Tracks the full development lifecycle from Jira story through nested unit TDD cycles to PR submission, enforcing test-first discipline at every phase transition. Companion to @schilling.mark.a/software-methodolog
ATDD Guardian
A context-aware, ATDD-enforcing MCP server for VS Code + GitHub Copilot. ATDD Guardian doesn't just lint code — it tracks your full development lifecycle from Jira story through nested unit TDD cycles to PR submission, enforcing test-first discipline at every phase transition.
Companion to @schilling.mark.a/software-methodology — the methodology defines how to build software; this server enforces that you actually did.
The Problem
Copilot generates code that passes tests → PR rejected by teammates
→ rework, lost trust
Three gaps cause this:
- No context — Copilot doesn't know what story you're building, what the team decided architecturally, or which files should exist.
- No discipline enforcement — Nothing stops you from writing production code without a failing test. Nothing checks that acceptance criteria are covered.
- No feedback loop — PR rejections don't feed back into the system to prevent the same issues next time.
ATDD Guardian closes all three gaps.
How It Fits Together
@schilling.mark.a/software-methodology     @schilling.mark.a/atdd-guardian
┌──────────────────────────────────┐       ┌──────────────────────────────┐
│  SKILLS (knowledge layer)        │       │  MCP SERVER (enforcement)    │
│                                  │       │                              │
│  product-strategy                │       │  Phase gates block if you    │
│  ux-research                     │       │  skip TDD steps              │
│  story-mapping                   │       │                              │
│  bdd-specification               │       │  Context-aware review checks │
│  ux-design                       │       │  AC coverage, file           │
│  ui-design-workflow              │       │  completeness, architecture  │
│  ui-design-system *              │       │                              │
│  atdd-workflow                   │       │  Nested unit TDD cycle       │
│  green-implementation            │◄─────►│  tracking (RED→GREEN→        │
│  clean-code *                    │ sync  │  REFACTOR per AC)            │
│  cicd-pipeline                   │       │                              │
│  continuous-improvement          │       │  PDCA feedback loop          │
│                                  │       │  (PR feedback → new rules)   │
│  Tells AI agents HOW to think    │       │  Validates AI agents DID it  │
└──────────────────────────────────┘       └──────────────────────────────┘
The green-implementation skill and ATDD Guardian's team-standards.json stay in sync. The skill prevents violations from being generated. The server catches what slips through. Both update from the same source: PR feedback through the PDCA Act step.
ATDD Workflow with Phase Gates
The server enforces a strict workflow. Each phase transition passes through a gate that validates preconditions. Each gate runs in one of three configurable modes: strict (blocks on any finding), guided (blocks on errors, warns on warnings), or lenient (warns only).
requirements ──▶ test-design ──▶ implementation ──▶ refactor ──▶ pre-pr-review ──▶ pr-submitted
                              │                  │            │                 │
                        GATE: Every AC     GATE: All ACs  GATE: Tests     GATE: Passing
                        must have a        GREEN + unit   still green     review exists
                        RED test           TDD cycles     after cleanup   + DoD checked
                                           recorded
Phase Gate Details
| Transition | What the Gate Checks | Blocked If |
|---|---|---|
| requirements → test-design | Valid path | Trying to skip phases |
| test-design → implementation | Every AC has a RED acceptance test | Any AC still has no-test status |
| implementation → refactor | All ACs GREEN + unit TDD cycles exist | ACs went green without recorded unit cycles (PC-1 violation) |
| refactor → pre-pr-review | Tests still green + unit cycles refactored | Tests regressed during refactoring |
| pre-pr-review → pr-submitted | Passing review exists + DoD checked | No passing review on record |
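The first three rows of the table, combined with the gate modes, can be sketched roughly as follows. This is an illustration only: the type names, `checkTransition`, and `isBlocked` are hypothetical and not taken from the server's source, and the sketch assumes that strict mode blocks on any finding.

```typescript
// Hypothetical model of the gate table; the real server's types may differ.
type Phase =
  | "requirements"
  | "test-design"
  | "implementation"
  | "refactor"
  | "pre-pr-review"
  | "pr-submitted";
type GateMode = "strict" | "guided" | "lenient";

interface GateFinding {
  severity: "error" | "warning";
  message: string;
}

interface Criterion {
  id: string;
  testStatus: "no-test" | "red" | "green";
  unitCycles: number; // count of recorded unit TDD cycles for this AC
}

const PHASE_ORDER: Phase[] = [
  "requirements", "test-design", "implementation",
  "refactor", "pre-pr-review", "pr-submitted",
];

// Collect findings for a transition (valid path, RED tests, GREEN + unit cycles).
function checkTransition(from: Phase, to: Phase, criteria: Criterion[]): GateFinding[] {
  // Only a move to the immediately following phase is treated as a valid path.
  if (PHASE_ORDER.indexOf(to) !== PHASE_ORDER.indexOf(from) + 1) {
    return [{ severity: "error", message: `invalid path: ${from} -> ${to}` }];
  }
  const findings: GateFinding[] = [];
  for (const ac of criteria) {
    if (to === "implementation" && ac.testStatus === "no-test") {
      findings.push({ severity: "error", message: `${ac.id} has no RED acceptance test` });
    }
    if (to === "refactor" && ac.testStatus !== "green") {
      findings.push({ severity: "error", message: `${ac.id} is not GREEN` });
    } else if (to === "refactor" && ac.unitCycles === 0) {
      findings.push({ severity: "error", message: `${ac.id} went green without unit TDD cycles (PC-1)` });
    }
  }
  return findings;
}

// Apply the gate mode to decide whether the findings block the transition.
function isBlocked(mode: GateMode, findings: GateFinding[]): boolean {
  if (mode === "lenient") return false; // warns only
  if (mode === "strict") return findings.length > 0; // assumption: any finding blocks
  return findings.some((f) => f.severity === "error"); // guided
}

const gateDemo = checkTransition("implementation", "refactor", [
  { id: "AC-1", testStatus: "green", unitCycles: 2 },
  { id: "AC-2", testStatus: "green", unitCycles: 0 },
]);
console.log(gateDemo.map((f) => f.message), isBlocked("guided", gateDemo));
```

Keeping the findings separate from the blocking decision means the same checks can serve all three modes; only `isBlocked` changes behavior.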
Nested Unit TDD Enforcement
Inside the implementation phase, each acceptance criterion is driven to green through multiple unit-level RED→GREEN→REFACTOR cycles:
AC-1 (RED acceptance test)
├── Unit Cycle 1: validate email format
│ 🔴 RED → write failing unit test
│ 🟢 GREEN → minimal code to pass
│ ♻️ REFACTORED → clean up, extract constant
│
├── Unit Cycle 2: authenticate against API
│ 🔴 RED → write failing unit test
│ 🟢 GREEN → minimal code to pass
│ ♻️ REFACTORED → add typed errors
│
└── AC-1 acceptance test now GREEN ✅
The phase gate blocks the transition from implementation → refactor if any AC went green without recorded unit cycles. This enforces PC-1: no production code without a failing unit test.
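The red → green → refactored sequence inside each unit cycle behaves like a small state machine. The sketch below illustrates that idea; `UnitCycle`, `startUnitCycle`, and `advanceUnitCycle` here are hypothetical helpers that mirror the tool names, not the server's actual code.

```typescript
type CycleState = "red" | "green" | "refactored";

interface UnitCycle {
  cycleNumber: number;
  description: string;
  state: CycleState;
}

// The only legal successor of each state; "refactored" is terminal.
const NEXT_STATE: Record<CycleState, CycleState | null> = {
  red: "green",
  green: "refactored",
  refactored: null,
};

// Every cycle starts RED, which records that a failing test came first.
function startUnitCycle(cycles: UnitCycle[], description: string): UnitCycle {
  const cycle: UnitCycle = { cycleNumber: cycles.length + 1, description, state: "red" };
  cycles.push(cycle);
  return cycle;
}

// Reject any move that is not the single allowed next step.
function advanceUnitCycle(cycle: UnitCycle, target: CycleState): void {
  if (NEXT_STATE[cycle.state] !== target) {
    throw new Error(`cycle ${cycle.cycleNumber}: cannot move from ${cycle.state} to ${target}`);
  }
  cycle.state = target;
}

const ac1Cycles: UnitCycle[] = [];
const emailCycle = startUnitCycle(ac1Cycles, "validate email format");
advanceUnitCycle(emailCycle, "green");
advanceUnitCycle(emailCycle, "refactored");

const apiCycle = startUnitCycle(ac1Cycles, "authenticate against API");
let skipRejected = false;
try {
  advanceUnitCycle(apiCycle, "refactored"); // skips GREEN, so this throws
} catch {
  skipRejected = true;
}
console.log(emailCycle.state, apiCycle.state, skipRejected);
```

Because a cycle can only be created in the red state, the existence of a cycle record is itself evidence that the unit test was written first.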
Deming's PDCA Cycle
        PLAN                          DO
┌──────────────────┐         ┌──────────────────┐
│ Define rules in  │         │ Write code, run  │
│ team-standards   │──────▶  │ review_code tool │
│ .json            │         │                  │
└──────────────────┘         └────────┬─────────┘
       ▲                              │
       │                              ▼
┌──────┴───────────┐         ┌──────────────────┐
│ add_rule from    │         │ Compare findings │
│ PR feedback      │ ◀────── │ to actual PR     │
│ Update SKILL.md  │         │ feedback         │
└──────────────────┘         └──────────────────┘
        ACT                         CHECK
- Plan: Rules and standards defined in team-standards.json
- Do: Code against them, run review_code
- Check: record_pr_feedback captures what reviewers flagged vs. what the server caught
- Act: add_rule captures new patterns; update the green-implementation skill in software-methodology
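The Check step amounts to splitting reviewer feedback into what the server already caught and what it missed. The sketch below shows one way to think about that split. It assumes each feedback item can optionally be linked to an existing rule ID; `FeedbackItem`, `checkFeedback`, and the `naming-001` rule ID are all hypothetical and not part of the documented tool interface.

```typescript
// Hypothetical shapes for illustration only.
interface FeedbackItem {
  text: string;
  ruleId?: string; // set when the comment corresponds to an existing rule
}

interface CheckResult {
  caught: FeedbackItem[]; // the server already flagged these
  missed: FeedbackItem[]; // candidates for a new rule in the Act step
}

function checkFeedback(feedback: FeedbackItem[], firedRuleIds: ReadonlySet<string>): CheckResult {
  const caught: FeedbackItem[] = [];
  const missed: FeedbackItem[] = [];
  for (const item of feedback) {
    const wasCaught = item.ruleId !== undefined && firedRuleIds.has(item.ruleId);
    if (wasCaught) {
      caught.push(item);
    } else {
      missed.push(item);
    }
  }
  return { caught, missed };
}

const pdcaCheck = checkFeedback(
  [
    { text: "switchMap without catchError in auth service" },
    { text: "Missing loading spinner during MFA check" },
    { text: "Component class name does not follow convention", ruleId: "naming-001" },
  ],
  new Set(["naming-001"]),
);
console.log(pdcaCheck.missed.map((item) => item.text));
```

Items in `missed` are the ones worth turning into rules, which is what keeps the loop from repeating the same PR rejections.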
MCP Tools (14 total)
Context Tools — know what you're building
| Tool | Purpose |
|---|---|
| start_feature | Load story context: ACs, architecture decisions, expected files, DoD. Call this first. |
| advance_phase | Move through ATDD phases. Phase-gated — validates TDD discipline before allowing. |
| update_criterion | Track AC acceptance test status: no-test → red → green. |
| get_context | Show current feature state, AC status, review history. |
Enforcement Tools — TDD discipline
| Tool | Purpose |
|---|---|
| start_unit_cycle | Begin a unit-level RED→GREEN→REFACTOR cycle within an AC. Records that you started from a failing test. |
| advance_unit_cycle | Move a unit cycle: red → green → refactored. Strict sequence enforced. |
| run_tests | Execute test commands (Playwright, Jest, Vitest) and capture structured results. Results feed into phase gates. |
| get_tdd_status | Show nested TDD cycle status for all ACs: which cycles exist, what state they're in, violation history. |
Review Tools — check the code
| Tool | Purpose |
|---|---|
| review_code | Full context-aware review. Checks pattern rules (phase-filtered), AC coverage, file completeness, architecture compliance, DoD. |
| review_file | Quick single-file check against team standards. |
Rule Tools — manage team standards
| Tool | Purpose |
|---|---|
| list_rules | Show all team coding standard rules with applicable phases. |
| explain_rule | Deep-dive on a specific rule with examples. |
| add_rule | Add a new rule from PR feedback (PDCA Act phase). |
Feedback Tools — close the loop
| Tool | Purpose |
|---|---|
| record_pr_feedback | Capture PR rejection reasons. PDCA Check phase. |
Setup
Install
npm install @schilling.mark.a/atdd-guardian
npm run build
Or clone and build locally:
git clone https://github.com/schilling-mark-a/atdd-guardian.git
cd atdd-guardian
npm install
npm run build
Configure VS Code
Add to .vscode/mcp.json:
{
  "servers": {
    "atdd-guardian": {
      "type": "stdio",
      "command": "node",
      "args": ["node_modules/@schilling.mark.a/atdd-guardian/dist/index.js"]
    }
  }
}
Or if installed globally / from a local path:
{
  "servers": {
    "atdd-guardian": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/atdd-guardian/dist/index.js"]
    }
  }
}
Pair with Software Methodology
For the full system — skills that teach + server that enforces:
npm install @schilling.mark.a/software-methodology
npm install @schilling.mark.a/atdd-guardian
The skills shape how AI agents generate code. ATDD Guardian validates they followed through. The green-implementation skill and team-standards.json encode the same standards in two forms: one for prevention, one for detection.
Configure Test Commands
Create .atdd-guardian/test-config.json in your project root:
{
  "allTests": "npm test",
  "acceptanceTests": "npx playwright test",
  "unitTests": "npx jest --passWithNoTests",
  "singleFilePattern": "npx jest {file} --passWithNoTests",
  "timeoutMs": 300000
}
If this file doesn't exist, the server uses sensible defaults.
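As a rough illustration of how optional fields and the `{file}` placeholder might be handled, the sketch below merges a partial config over a set of defaults and builds a single-file command. The default values are an assumption copied from the example file, not the server's actual built-in defaults, and `resolveTestConfig` and `singleFileCommand` are hypothetical names.

```typescript
interface TestConfig {
  allTests: string;
  acceptanceTests: string;
  unitTests: string;
  singleFilePattern: string;
  timeoutMs: number;
}

// Assumed defaults: these mirror the example test-config.json,
// not necessarily what the server ships with.
const ASSUMED_DEFAULTS: TestConfig = {
  allTests: "npm test",
  acceptanceTests: "npx playwright test",
  unitTests: "npx jest --passWithNoTests",
  singleFilePattern: "npx jest {file} --passWithNoTests",
  timeoutMs: 300000,
};

// Fields present in the user's file win; everything else falls back.
function resolveTestConfig(userConfig: Partial<TestConfig>): TestConfig {
  return { ...ASSUMED_DEFAULTS, ...userConfig };
}

// Substitute every {file} placeholder with the given test file path.
function singleFileCommand(config: TestConfig, file: string): string {
  return config.singleFilePattern.split("{file}").join(file);
}

const vitestConfig = resolveTestConfig({ unitTests: "npx vitest run" });
const emailTestCommand = singleFileCommand(vitestConfig, "tests/unit/validate-email.spec.ts");
console.log(vitestConfig.unitTests, "|", emailTestCommand);
```

Using `split` and `join` rather than `String.prototype.replace` avoids surprises when a file path contains `$`, which `replace` would interpret as a special replacement pattern.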
Customize Team Rules
Edit src/rules/team-standards.json to match your team's conventions. The bundled rules cover Angular + Playwright patterns. Each rule specifies which ATDD phases it applies to: naming rules fire only during refactor and later, while test-quality rules fire from test-design onward.
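Phase filtering can be pictured as a simple lookup on each rule's list of phases. The sketch below is a minimal illustration; the `Rule` shape is reduced to the fields needed here, and the rule IDs `naming-001` and `test-quality-001` are made up for the example.

```typescript
type Phase =
  | "requirements"
  | "test-design"
  | "implementation"
  | "refactor"
  | "pre-pr-review"
  | "pr-submitted";

// Reduced, hypothetical rule shape: only what phase filtering needs.
interface Rule {
  id: string;
  name: string;
  applicablePhases: Phase[];
}

// Keep only the rules that should fire in the current phase.
function rulesForPhase(rules: Rule[], phase: Phase): Rule[] {
  return rules.filter((rule) => rule.applicablePhases.includes(phase));
}

const sampleRules: Rule[] = [
  {
    id: "naming-001",
    name: "Naming conventions",
    applicablePhases: ["refactor", "pre-pr-review"],
  },
  {
    id: "test-quality-001",
    name: "Test quality",
    applicablePhases: ["test-design", "implementation", "refactor", "pre-pr-review"],
  },
];

const duringTestDesign = rulesForPhase(sampleRules, "test-design").map((r) => r.id);
const duringRefactor = rulesForPhase(sampleRules, "refactor").map((r) => r.id);
console.log(duringTestDesign, duringRefactor);
```

With this arrangement, naming findings stay silent while you are still driving tests to green and only appear once cleanup begins.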
Example Workflow
1. Load the story
start_feature
  projectRoot: "/home/mark/my-app"
  storyId: "PROJ-1234"
  title: "User Login with MFA"
  acceptanceCriteria:
    - id: "AC-1", text: "Given valid creds, When submit, Then redirect to dashboard"
    - id: "AC-2", text: "Given MFA enabled, When login, Then prompt for code"
    - id: "AC-3", text: "Given invalid creds, When submit, Then show error"
  architectureDecisions:
    - id: "AD-1", decision: "Use AuthService for all auth logic"
  expectedFiles:
    - path: "src/auth/auth.service.ts", purpose: "Auth HTTP calls", fileType: "service"
    - path: "src/auth/login.component.ts", purpose: "Login UI", fileType: "component"
    - path: "tests/e2e/login.po.ts", purpose: "Login page object", fileType: "page-object"
2. Write red tests
advance_phase → test-design
update_criterion criterionId: "AC-1" testStatus: "red" testFile: "tests/e2e/login.spec.ts"
update_criterion criterionId: "AC-2" testStatus: "red" testFile: "tests/e2e/login-mfa.spec.ts"
update_criterion criterionId: "AC-3" testStatus: "red" testFile: "tests/e2e/login-error.spec.ts"
3. Implement with unit TDD cycles
advance_phase → implementation
start_unit_cycle criterionId: "AC-1" unitDescription: "validate email format"
  testFile: "tests/unit/validate-email.spec.ts"
  sourceFile: "src/auth/validate-email.ts"
run_tests testLevel: "unit" testFile: "tests/unit/validate-email.spec.ts"
# → ❌ 1 FAILED (good — it's RED)
# Write minimal code to pass...
advance_unit_cycle criterionId: "AC-1" cycleNumber: 1 targetState: "green"
run_tests testLevel: "unit" testFile: "tests/unit/validate-email.spec.ts"
# → ✅ ALL PASS
advance_unit_cycle criterionId: "AC-1" cycleNumber: 1 targetState: "refactored"
  notes: "Extracted regex to EMAIL_PATTERN constant"
update_criterion criterionId: "AC-1" testStatus: "green"
4. Phase gate blocks if you skip steps
advance_phase → refactor
# 🚫 BLOCKED: 1 blockers
# - no-unit-tests: 1 ACs went green without any unit TDD cycles: AC-2
# → PC-1: No production code without a failing unit test.
5. Refactor and review
advance_phase → refactor
review_code directory: "/home/mark/my-app"
# Fix findings...
review_code directory: "/home/mark/my-app"
# ✅ PASS
advance_phase → pre-pr-review
advance_phase → pr-submitted
6. PR feedback → PDCA
record_pr_feedback
  feedback:
    - "switchMap without catchError in auth service"
    - "Missing loading spinner during MFA check"
add_rule
  id: "rxjs-001"
  name: "switchMap must have error handling"
  severity: "error"
  pattern: "switchMap\\s*\\([^)]*\\)(?![\\s\\S]*catchError)"
  appliesTo: ["**/*.ts"]
  applicablePhases: ["implementation", "refactor", "pre-pr-review"]
Then update green-implementation/references/rxjs-patterns.md in software-methodology with the same pattern. Prevention + detection stay in sync.
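To see what the rxjs-001 pattern does and does not catch, it can be tried against a few snippets. The sketch below uses the same regex as the `add_rule` call (the doubled backslashes in the string form become single ones in a regex literal); the sample sources and the `violatesRxjs001` helper are made up for illustration, and the server's review engine may apply patterns differently.

```typescript
// Same pattern as the add_rule example, written as a regex literal.
const SWITCHMAP_WITHOUT_CATCH = /switchMap\s*\([^)]*\)(?![\s\S]*catchError)/;

function violatesRxjs001(source: string): boolean {
  return SWITCHMAP_WITHOUT_CATCH.test(source);
}

// No catchError anywhere: the rule should fire.
const unguardedSource = `
this.login$.pipe(
  switchMap(creds => this.http.post(url, creds))
);`;

// catchError follows the switchMap: the rule should stay quiet.
const guardedSource = `
this.login$.pipe(
  switchMap(creds => this.http.post(url, creds).pipe(
    catchError(() => of(null))
  ))
);`;

// Limitation: the lookahead scans the rest of the file, so an unrelated
// catchError later on also suppresses the finding.
const unrelatedCatchSource = unguardedSource + `
this.other$.pipe(catchError(() => EMPTY));`;

console.log(
  violatesRxjs001(unguardedSource),
  violatesRxjs001(guardedSource),
  violatesRxjs001(unrelatedCatchSource),
);
```

The third case is worth knowing about when writing regex rules: a negative lookahead over the whole remaining file trades precision for simplicity, so a rule like this can miss violations in files that handle errors elsewhere.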
Project Structure
atdd-guardian/
├── src/
│ ├── index.ts # MCP server entry (stdio + HTTP transport)
│ ├── types.ts # Full lifecycle type model
│ ├── constants.ts # Shared constants
│ ├── tools/
│ │ └── review-tools.ts # All 14 MCP tool registrations
│ ├── services/
│ │ ├── phase-gate.ts # Phase transition enforcement
│ │ ├── test-runner.ts # Test execution + result capture
│ │ ├── session-manager.ts # Persistent workflow state + unit cycle tracking
│ │ ├── context-review.ts # AC coverage, file completeness, DoD, architecture
│ │ ├── review-engine.ts # Pattern-based code review engine
│ │ ├── formatter.ts # Markdown/JSON output formatting
│ │ └── rule-loader.ts # Team rules file loading + validation
│ ├── schemas/
│ │ └── tool-schemas.ts # Zod input validation for all tools
│ └── rules/
│ └── team-standards.json # Team coding rules (customize this)
├── package.json
└── tsconfig.json
Requirements Traceability
The table below maps each ATDD requirement for the MCP server to the part of the server that implements it:
| Requirement | Implementation |
|---|---|
| AC-1: Red Phase test creation | start_feature + update_criterion with red status |
| AC-2: Nested unit TDD loop | start_unit_cycle + advance_unit_cycle (red→green→refactored) |
| AC-3: Refactor phase validation | Phase gate checks all tests stay green; review_code runs full standards check |
| PC-1: Unit test first enforcement | Phase gate blocks implementation→refactor if ACs lack unit cycles |
| PC-2: Minimal implementation | Guided by unit cycle granularity — each cycle drives one small behavior |
| PC-3: Test independence | One AC per Playwright test; unit tests tracked independently |
| CR-1: Workflow state tracking | SessionState in .atdd-guardian/session.json — phase, test level, cycle counts, checkpoints |
| CR-2: Guidance and prompting | getPhaseChecks() + phase gate guidance messages + get_tdd_status |
| CR-3: Documentation generation | AC↔test mapping in session, test run history, violation log, PDCA entries |
| TI-1: Test framework integration | test-runner.ts — Playwright, Jest, Vitest output parsing |
| TI-2: Workflow automation | Phase gates automate transition validation; run_tests automates execution |
| TI-3: AI agent compatibility | Structured tool inputs/outputs; phase gate provides clear next-step guidance |
Configuration Reference
.atdd-guardian/session.json (auto-managed)
Persisted workflow state. Created by start_feature, updated by all tools. Contains: active feature context, review history, test run history, violation log, PDCA entries.
.atdd-guardian/test-config.json (user-created)
Test command configuration. All fields optional — defaults apply if missing.
{
  "allTests": "npm test",
  "acceptanceTests": "npx playwright test",
  "unitTests": "npx jest --passWithNoTests",
  "singleFilePattern": "npx jest {file} --passWithNoTests",
  "timeoutMs": 300000
}
src/rules/team-standards.json (user-customized)
Team coding rules. Each rule specifies: pattern (regex), severity, applicable ATDD phases, file globs, fix suggestion with good/bad examples. Add rules with the add_rule tool or edit directly.
License
MIT
