@schilling.mark.a/atdd-guardian

v1.0.1

Published

4 days ago

schilling.mark.a

atdd tdd mcp model-context-protocol playwright angular code-review phase-gate pdca deming acceptance-test-driven-development test-first ai-coding-agent

ATDD Guardian

A context-aware, ATDD-enforcing MCP server for VS Code + GitHub Copilot. ATDD Guardian doesn't just lint code — it tracks your full development lifecycle from Jira story through nested unit TDD cycles to PR submission, enforcing test-first discipline at every phase transition.

Companion to @schilling.mark.a/software-methodology: the methodology defines how to build software; this server verifies that you actually followed it.

The Problem

Copilot generates code that passes tests  →  PR rejected by teammates
                                           →  rework, lost trust

Three gaps cause this:

No context — Copilot doesn't know what story you're building, what the team decided architecturally, or which files should exist.
No discipline enforcement — Nothing stops you from writing production code without a failing test. Nothing checks that acceptance criteria are covered.
No feedback loop — PR rejections don't feed back into the system to prevent the same issues next time.

ATDD Guardian closes all three gaps.

How It Fits Together

@schilling.mark.a/software-methodology     @schilling.mark.a/atdd-guardian
┌──────────────────────────────────┐       ┌──────────────────────────────┐
│  SKILLS (knowledge layer)        │       │  MCP SERVER (enforcement)    │
│                                  │       │                              │
│  product-strategy                │       │  Phase gates block if you    │
│  ux-research                     │       │  skip TDD steps              │
│  story-mapping                   │       │                              │
│  bdd-specification               │       │  Context-aware review checks │
│  ux-design                       │       │  AC coverage, file           │
│  ui-design-workflow              │       │  completeness, architecture  │
│  ui-design-system           *    │       │                              │
│  atdd-workflow                   │       │  Nested unit TDD cycle       │
│  green-implementation            │◄─────►│  tracking (RED→GREEN→        │
│  clean-code                 *    │  sync │  REFACTOR per AC)            │
│  cicd-pipeline                   │       │                              │
│  continuous-improvement          │       │  PDCA feedback loop          │
│                                  │       │  (PR feedback → new rules)   │
│  Tells AI agents HOW to think    │       │  Validates AI agents DID it  │
└──────────────────────────────────┘       └──────────────────────────────┘

The green-implementation skill and ATDD Guardian's team-standards.json stay in sync. The skill prevents violations from being generated. The server catches what slips through. Both update from the same source: PR feedback through the PDCA Act step.

ATDD Workflow with Phase Gates

The server enforces a strict workflow. Each phase transition passes through a gate that validates preconditions. Gates are configurable: strict (blocks everything), guided (blocks errors, warns on warnings), or lenient (warns only).
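
The three gate modes can be pictured as a small decision function. The sketch below is illustrative only: `GateMode`, `GateFinding`, `GateResult`, and `evaluateGate` are hypothetical names rather than the server's actual API, and it assumes that "blocks everything" in strict mode means warnings block as well as errors.

```typescript
// Hypothetical types; the real server's names and shapes may differ.
type GateMode = "strict" | "guided" | "lenient";
type Severity = "error" | "warning";

interface GateFinding {
  severity: Severity;
  message: string;
}

interface GateResult {
  allowed: boolean;
  blockers: GateFinding[];
  warnings: GateFinding[];
}

// Decide which findings block a phase transition under a given mode.
// Assumption: "strict" blocks on any finding, "guided" blocks on errors only,
// and "lenient" never blocks.
function evaluateGate(mode: GateMode, findings: GateFinding[]): GateResult {
  const blocks = (f: GateFinding): boolean =>
    mode === "strict" || (mode === "guided" && f.severity === "error");

  const blockers = findings.filter(blocks);
  const warnings = findings.filter((f) => !blocks(f));
  return { allowed: blockers.length === 0, blockers, warnings };
}
```

Under these assumptions, the same pair of findings (one error, one warning) yields two blockers in strict mode, one in guided mode, and none in lenient mode.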

requirements ──▶ test-design ──▶ implementation ──▶ refactor ──▶ pre-pr-review ──▶ pr-submitted
                      │               │                  │              │
                  GATE: Every AC   GATE: All ACs      GATE: Tests   GATE: Passing
                  must have a      GREEN + unit       still green    review exists
                  RED test         TDD cycles         after cleanup  + DoD checked
                                   recorded

Phase Gate Details

| Transition | What the Gate Checks | Blocked If |
|---|---|---|
| requirements → test-design | Valid path | Trying to skip phases |
| test-design → implementation | Every AC has a RED acceptance test | Any AC still has no-test status |
| implementation → refactor | All ACs GREEN + unit TDD cycles exist | ACs went green without recorded unit cycles (PC-1 violation) |
| refactor → pre-pr-review | Tests still green + unit cycles refactored | Tests regressed during refactoring |
| pre-pr-review → pr-submitted | Passing review exists + DoD checked | No passing review on record |
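
The first three rows of the gate table can be sketched as a single check function. This is a minimal sketch, not the server's implementation: `Criterion`, `checkTransition`, and every blocker code except `no-unit-tests` (which appears in the server output shown later in this README) are hypothetical.

```typescript
// Hypothetical model of the gate table; names are illustrative only.
const PHASES = [
  "requirements", "test-design", "implementation",
  "refactor", "pre-pr-review", "pr-submitted",
] as const;
type Phase = (typeof PHASES)[number];

interface Criterion {
  id: string;
  testStatus: "no-test" | "red" | "green";
  unitCycles: number; // count of recorded unit TDD cycles
}

// Return the reasons a transition is blocked (an empty array means allowed).
function checkTransition(from: Phase, to: Phase, acs: Criterion[]): string[] {
  // Only a move to the immediately following phase is a valid path.
  if (PHASES.indexOf(to) !== PHASES.indexOf(from) + 1) {
    return [`invalid-path: cannot move from ${from} to ${to}`];
  }
  const ids = (list: Criterion[]): string => list.map((ac) => ac.id).join(", ");
  const blockers: string[] = [];

  if (to === "implementation") {
    // Every AC needs an acceptance test before implementation starts.
    const untested = acs.filter((ac) => ac.testStatus === "no-test");
    if (untested.length > 0) blockers.push(`no-red-test: ${ids(untested)}`);
  }
  if (to === "refactor") {
    // All ACs must be green, and each green AC needs recorded unit cycles (PC-1).
    const notGreen = acs.filter((ac) => ac.testStatus !== "green");
    if (notGreen.length > 0) blockers.push(`not-green: ${ids(notGreen)}`);
    const noCycles = acs.filter((ac) => ac.testStatus === "green" && ac.unitCycles === 0);
    if (noCycles.length > 0) blockers.push(`no-unit-tests: ${ids(noCycles)}`);
  }
  return blockers;
}
```

The last two transitions depend on test runs and review history, so they are left out of this sketch.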

Nested Unit TDD Enforcement

Inside the implementation phase, each acceptance criterion is driven to green through multiple unit-level RED→GREEN→REFACTOR cycles:

AC-1 (RED acceptance test)
  ├── Unit Cycle 1: validate email format
  │     🔴 RED   → write failing unit test
  │     🟢 GREEN → minimal code to pass
  │     ♻️ REFACTORED → clean up, extract constant
  │
  ├── Unit Cycle 2: authenticate against API
  │     🔴 RED   → write failing unit test
  │     🟢 GREEN → minimal code to pass
  │     ♻️ REFACTORED → add typed errors
  │
  └── AC-1 acceptance test now GREEN ✅

The phase gate blocks the transition from implementation → refactor if any AC went green without recorded unit cycles. This enforces PC-1: no production code without a failing unit test.
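
The strict red → green → refactored sequence for a unit cycle can be modelled as a tiny state machine. The sketch below assumes a hypothetical in-memory `UnitCycle` record; the functions borrow the names of the `start_unit_cycle` and `advance_unit_cycle` tools for readability but are not the server's code.

```typescript
// Hypothetical sketch; the server's real cycle model may differ.
type CycleState = "red" | "green" | "refactored";

interface UnitCycle {
  cycleNumber: number;
  description: string;
  state: CycleState;
}

// The only legal next state for each state; "refactored" is terminal.
const NEXT_STATE: Record<CycleState, CycleState | null> = {
  red: "green",
  green: "refactored",
  refactored: null,
};

// Every cycle starts in "red", recording that work began from a failing test.
function startUnitCycle(cycles: UnitCycle[], description: string): UnitCycle {
  const cycle: UnitCycle = { cycleNumber: cycles.length + 1, description, state: "red" };
  cycles.push(cycle);
  return cycle;
}

// Advance exactly one step; skipping a state or moving backwards throws.
function advanceUnitCycle(cycle: UnitCycle, target: CycleState): void {
  if (NEXT_STATE[cycle.state] !== target) {
    throw new Error(`Cannot move cycle ${cycle.cycleNumber} from ${cycle.state} to ${target}`);
  }
  cycle.state = target;
}
```

With this model, an AC whose cycle list is empty when its acceptance test goes green is the case the PC-1 gate rejects.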

Deming's PDCA Cycle

    PLAN                          DO
    ┌──────────────────┐          ┌──────────────────┐
    │ Define rules in  │          │ Write code, run   │
    │ team-standards   │──────▶   │ review_code tool  │
    │ .json            │          │                   │
    └──────────────────┘          └────────┬─────────┘
           ▲                               │
           │                               ▼
    ┌──────┴───────────┐          ┌──────────────────┐
    │ add_rule from    │          │ Compare findings  │
    │ PR feedback      │  ◀────── │ to actual PR      │
    │ Update SKILL.md  │          │ feedback          │
    └──────────────────┘          └──────────────────┘
    ACT                           CHECK

Plan: Rules and standards defined in team-standards.json
Do: Code against them, run review_code
Check: record_pr_feedback captures what reviewers flagged vs. what the server caught
Act: add_rule captures new patterns; update the green-implementation skill in software-methodology
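
The Check step amounts to separating reviewer comments the server already covered from those it missed. The sketch below is one hedged way to picture that comparison; `ReviewerComment`, `coveredByRuleId`, and `checkFeedback` are hypothetical and do not describe the real `record_pr_feedback` payload.

```typescript
// Hypothetical shapes; the real record_pr_feedback payload may differ.
interface ReviewerComment {
  text: string;
  // ID of an existing rule that covers this comment, if any
  // (assumed to be assigned by a human or an agent during triage).
  coveredByRuleId?: string;
}

interface CheckResult {
  caught: ReviewerComment[]; // the server flagged these before the PR
  missed: ReviewerComment[]; // reviewers found these first: candidates for add_rule
}

// PDCA Check: split reviewer comments by whether a rule that fired during
// review_code already covered them.
function checkFeedback(comments: ReviewerComment[], firedRuleIds: string[]): CheckResult {
  const result: CheckResult = { caught: [], missed: [] };
  for (const comment of comments) {
    const id = comment.coveredByRuleId;
    const wasCaught = id !== undefined && firedRuleIds.indexOf(id) !== -1;
    (wasCaught ? result.caught : result.missed).push(comment);
  }
  return result;
}
```

Everything in `missed` is input for the Act step: a new rule via `add_rule` plus a matching update to the green-implementation skill.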

MCP Tools (14 total)

Context Tools — know what you're building

| Tool | Purpose |
|---|---|
| start_feature | Load story context: ACs, architecture decisions, expected files, DoD. Call this first. |
| advance_phase | Move through ATDD phases. Phase-gated: validates TDD discipline before allowing. |
| update_criterion | Track AC acceptance test status: no-test → red → green. |
| get_context | Show current feature state, AC status, review history. |

Enforcement Tools — TDD discipline

| Tool | Purpose |
|---|---|
| start_unit_cycle | Begin a unit-level RED→GREEN→REFACTOR cycle within an AC. Records that you started from a failing test. |
| advance_unit_cycle | Move a unit cycle: red → green → refactored. Strict sequence enforced. |
| run_tests | Execute test commands (Playwright, Jest, Vitest) and capture structured results. Results feed into phase gates. |
| get_tdd_status | Show nested TDD cycle status for all ACs: which cycles exist, what state they're in, violation history. |

Review Tools — check the code

| Tool | Purpose |
|---|---|
| review_code | Full context-aware review. Checks pattern rules (phase-filtered), AC coverage, file completeness, architecture compliance, DoD. |
| review_file | Quick single-file check against team standards. |

Rule Tools — manage team standards

| Tool | Purpose |
|---|---|
| list_rules | Show all team coding standard rules with applicable phases. |
| explain_rule | Deep-dive on a specific rule with examples. |
| add_rule | Add a new rule from PR feedback (PDCA Act phase). |

Feedback Tools — close the loop

| Tool | Purpose |
|---|---|
| record_pr_feedback | Capture PR rejection reasons. PDCA Check phase. |

Setup

Install

npm install @schilling.mark.a/atdd-guardian
npm run build

Or clone and build locally:

git clone https://github.com/schilling-mark-a/atdd-guardian.git
cd atdd-guardian
npm install
npm run build

Configure VS Code

Add to .vscode/mcp.json:

{
  "servers": {
    "atdd-guardian": {
      "type": "stdio",
      "command": "node",
      "args": ["node_modules/@schilling.mark.a/atdd-guardian/dist/index.js"]
    }
  }
}

Or, if the server is installed globally or built from a local clone, point to its absolute path:

{
  "servers": {
    "atdd-guardian": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/atdd-guardian/dist/index.js"]
    }
  }
}

Pair with Software Methodology

For the full system — skills that teach + server that enforces:

npm install @schilling.mark.a/software-methodology
npm install @schilling.mark.a/atdd-guardian

The skills shape how AI agents generate code. ATDD Guardian validates they followed through. The green-implementation skill and team-standards.json encode the same standards in two forms: one for prevention, one for detection.

Configure Test Commands

Create .atdd-guardian/test-config.json in your project root:

{
  "allTests": "npm test",
  "acceptanceTests": "npx playwright test",
  "unitTests": "npx jest --passWithNoTests",
  "singleFilePattern": "npx jest {file} --passWithNoTests",
  "timeoutMs": 300000
}

If this file doesn't exist, the server uses sensible defaults.
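
One way to picture how a partial test-config.json could be combined with defaults is shown below. This is a sketch under stated assumptions: the default values are simply copied from the example file above (the server's real defaults are not documented here), and `TestConfig`, `resolveConfig`, `buildCommand`, and the `"all"` and `"acceptance"` level names are hypothetical.

```typescript
interface TestConfig {
  allTests: string;
  acceptanceTests: string;
  unitTests: string;
  singleFilePattern: string;
  timeoutMs: number;
}

// ASSUMED defaults, copied from the example file; not confirmed by the server.
const ASSUMED_DEFAULTS: TestConfig = {
  allTests: "npm test",
  acceptanceTests: "npx playwright test",
  unitTests: "npx jest --passWithNoTests",
  singleFilePattern: "npx jest {file} --passWithNoTests",
  timeoutMs: 300000,
};

// Fields the user omits fall back to the assumed defaults.
function resolveConfig(user: Partial<TestConfig> = {}): TestConfig {
  return { ...ASSUMED_DEFAULTS, ...user };
}

// Only "unit" appears in this README; the other level names are guesses.
type TestLevel = "all" | "acceptance" | "unit";

// Pick the command for a test level, or fill in {file} for a single-file run.
function buildCommand(config: TestConfig, level: TestLevel, testFile?: string): string {
  if (testFile !== undefined) {
    // A replacer function avoids special "$" handling in the file path.
    return config.singleFilePattern.replace("{file}", () => testFile);
  }
  switch (level) {
    case "acceptance":
      return config.acceptanceTests;
    case "unit":
      return config.unitTests;
    default:
      return config.allTests;
  }
}
```

For example, a project that only overrides `unitTests` with a Vitest command would keep the remaining four assumed defaults.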

Customize Team Rules

Edit src/rules/team-standards.json to match your team's conventions. The bundled rules cover Angular + Playwright patterns. Each rule specifies which ATDD phases it applies to: naming rules fire only from refactor onward, while test-quality rules fire from test-design onward.
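
Phase filtering can be sketched as a simple lookup on each rule's `applicablePhases` list (the field name used in the `add_rule` example later in this README). The rule IDs and names below are made up for illustration, and `rulesForPhase` is a hypothetical helper.

```typescript
type Phase =
  | "requirements" | "test-design" | "implementation"
  | "refactor" | "pre-pr-review" | "pr-submitted";

// Reduced rule shape: only the fields needed for phase filtering.
interface TeamRule {
  id: string;
  name: string;
  applicablePhases: Phase[];
}

// Keep only the rules that should fire in the current phase.
function rulesForPhase(rules: TeamRule[], phase: Phase): TeamRule[] {
  return rules.filter((rule) => rule.applicablePhases.indexOf(phase) !== -1);
}

// Hypothetical rules: a naming rule that waits until refactor, and a
// test-quality rule that applies from test-design onward.
const exampleRules: TeamRule[] = [
  {
    id: "naming-001",
    name: "Components use PascalCase",
    applicablePhases: ["refactor", "pre-pr-review"],
  },
  {
    id: "test-001",
    name: "No hard-coded waits in tests",
    applicablePhases: ["test-design", "implementation", "refactor", "pre-pr-review"],
  },
];
```

With these two rules, only `test-001` would be considered during implementation, so early code is not flagged for naming before the refactor phase.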

Example Workflow

1. Load the story

start_feature
  projectRoot: "/home/mark/my-app"
  storyId: "PROJ-1234"
  title: "User Login with MFA"
  acceptanceCriteria:
    - id: "AC-1", text: "Given valid creds, When submit, Then redirect to dashboard"
    - id: "AC-2", text: "Given MFA enabled, When login, Then prompt for code"
    - id: "AC-3", text: "Given invalid creds, When submit, Then show error"
  architectureDecisions:
    - id: "AD-1", decision: "Use AuthService for all auth logic"
  expectedFiles:
    - path: "src/auth/auth.service.ts", purpose: "Auth HTTP calls", fileType: "service"
    - path: "src/auth/login.component.ts", purpose: "Login UI", fileType: "component"
    - path: "tests/e2e/login.po.ts", purpose: "Login page object", fileType: "page-object"

2. Write red tests

advance_phase → test-design

update_criterion  criterionId: "AC-1"  testStatus: "red"  testFile: "tests/e2e/login.spec.ts"
update_criterion  criterionId: "AC-2"  testStatus: "red"  testFile: "tests/e2e/login-mfa.spec.ts"
update_criterion  criterionId: "AC-3"  testStatus: "red"  testFile: "tests/e2e/login-error.spec.ts"

3. Implement with unit TDD cycles

advance_phase → implementation

start_unit_cycle  criterionId: "AC-1"  unitDescription: "validate email format"
                  testFile: "tests/unit/validate-email.spec.ts"
                  sourceFile: "src/auth/validate-email.ts"

run_tests  testLevel: "unit"  testFile: "tests/unit/validate-email.spec.ts"
# → ❌ 1 FAILED (good — it's RED)

# Write minimal code to pass...
advance_unit_cycle  criterionId: "AC-1"  cycleNumber: 1  targetState: "green"

run_tests  testLevel: "unit"  testFile: "tests/unit/validate-email.spec.ts"
# → ✅ ALL PASS

advance_unit_cycle  criterionId: "AC-1"  cycleNumber: 1  targetState: "refactored"
                    notes: "Extracted regex to EMAIL_PATTERN constant"

update_criterion  criterionId: "AC-1"  testStatus: "green"

4. Phase gate blocks if you skip steps

advance_phase → refactor

# 🚫 BLOCKED: 1 blockers
# - no-unit-tests: 1 ACs went green without any unit TDD cycles: AC-2
#   → PC-1: No production code without a failing unit test.

5. Refactor and review

advance_phase → refactor
review_code  directory: "/home/mark/my-app"
# Fix findings...
review_code  directory: "/home/mark/my-app"
# ✅ PASS

advance_phase → pre-pr-review
advance_phase → pr-submitted

6. PR feedback → PDCA

record_pr_feedback
  feedback:
    - "switchMap without catchError in auth service"
    - "Missing loading spinner during MFA check"

add_rule
  id: "rxjs-001"
  name: "switchMap must have error handling"
  severity: "error"
  pattern: "switchMap\\s*\\([^)]*\\)(?![\\s\\S]*catchError)"
  appliesTo: ["**/*.ts"]
  applicablePhases: ["implementation", "refactor", "pre-pr-review"]

Then update green-implementation/references/rxjs-patterns.md in software-methodology with the same pattern. Prevention + detection stay in sync.
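
It can help to try a new pattern against sample snippets before adding it. The sketch below runs the `rxjs-001` pattern from the example above through JavaScript's `RegExp`; whether the server evaluates patterns the same way is an assumption, and `flagsUnguardedSwitchMap` and the sample strings are hypothetical.

```typescript
// The rxjs-001 pattern from the add_rule example, unchanged.
const switchMapPattern = new RegExp("switchMap\\s*\\([^)]*\\)(?![\\s\\S]*catchError)");

// True when the source contains a switchMap call with no catchError
// anywhere after it in the same text.
function flagsUnguardedSwitchMap(source: string): boolean {
  return switchMapPattern.test(source);
}

// Sample inputs (plain strings; nothing here is executed as RxJS code).
const unguarded = "this.code$.pipe(switchMap((code) => this.auth.verify(code)))";
const guarded =
  "this.code$.pipe(switchMap((code) => this.auth.verify(code)), catchError(() => of(null)))";
const unrelatedCatch =
  "a$.pipe(switchMap((x) => load(x)));\nb$.pipe(catchError(() => of(0)));";

console.log(flagsUnguardedSwitchMap(unguarded));      // true
console.log(flagsUnguardedSwitchMap(guarded));        // false
console.log(flagsUnguardedSwitchMap(unrelatedCatch)); // false
```

The third case shows a limitation of the lookahead: a `catchError` anywhere later in the text, even in a different pipe, suppresses the match. A team may accept that trade-off or tighten the pattern.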

Project Structure

atdd-guardian/
├── src/
│   ├── index.ts                  # MCP server entry (stdio + HTTP transport)
│   ├── types.ts                  # Full lifecycle type model
│   ├── constants.ts              # Shared constants
│   ├── tools/
│   │   └── review-tools.ts       # All 14 MCP tool registrations
│   ├── services/
│   │   ├── phase-gate.ts         # Phase transition enforcement
│   │   ├── test-runner.ts        # Test execution + result capture
│   │   ├── session-manager.ts    # Persistent workflow state + unit cycle tracking
│   │   ├── context-review.ts     # AC coverage, file completeness, DoD, architecture
│   │   ├── review-engine.ts      # Pattern-based code review engine
│   │   ├── formatter.ts          # Markdown/JSON output formatting
│   │   └── rule-loader.ts        # Team rules file loading + validation
│   ├── schemas/
│   │   └── tool-schemas.ts       # Zod input validation for all tools
│   └── rules/
│       └── team-standards.json   # Team coding rules (customize this)
├── package.json
└── tsconfig.json

Requirements Traceability

The table below maps each ATDD requirement for the MCP server to the feature that implements it:

| Requirement | Implementation |
|---|---|
| AC-1: Red Phase test creation | start_feature + update_criterion with red status |
| AC-2: Nested unit TDD loop | start_unit_cycle + advance_unit_cycle (red→green→refactored) |
| AC-3: Refactor phase validation | Phase gate checks all tests stay green; review_code runs full standards check |
| PC-1: Unit test first enforcement | Phase gate blocks implementation→refactor if ACs lack unit cycles |
| PC-2: Minimal implementation | Guided by unit cycle granularity: each cycle drives one small behavior |
| PC-3: Test independence | One AC per Playwright test; unit tests tracked independently |
| CR-1: Workflow state tracking | SessionState in .atdd-guardian/session.json (phase, test level, cycle counts, checkpoints) |
| CR-2: Guidance and prompting | getPhaseChecks() + phase gate guidance messages + get_tdd_status |
| CR-3: Documentation generation | AC↔test mapping in session, test run history, violation log, PDCA entries |
| TI-1: Test framework integration | test-runner.ts: Playwright, Jest, Vitest output parsing |
| TI-2: Workflow automation | Phase gates automate transition validation; run_tests automates execution |
| TI-3: AI agent compatibility | Structured tool inputs/outputs; phase gate provides clear next-step guidance |

Configuration Reference

`.atdd-guardian/session.json` (auto-managed)

Persisted workflow state. Created by start_feature, updated by all tools. Contains: active feature context, review history, test run history, violation log, PDCA entries.

`.atdd-guardian/test-config.json` (user-created)

Test command configuration. All fields optional — defaults apply if missing.

{
  "allTests": "npm test",
  "acceptanceTests": "npx playwright test",
  "unitTests": "npx jest --passWithNoTests",
  "singleFilePattern": "npx jest {file} --passWithNoTests",
  "timeoutMs": 300000
}

`src/rules/team-standards.json` (user-customized)

Team coding rules. Each rule specifies: pattern (regex), severity, applicable ATDD phases, file globs, fix suggestion with good/bad examples. Add rules with the add_rule tool or edit directly.
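
A pattern rule of this kind boils down to running a regex over a file and reporting where it matched. The sketch below is a simplified illustration, not the review engine itself: `RuleSketch`, `Violation`, and `findViolations` are hypothetical, and the rule shape is reduced to three of the fields listed above.

```typescript
// Reduced, hypothetical rule shape (globs, phases, and fix examples omitted).
interface RuleSketch {
  id: string;
  severity: string;
  pattern: string; // regex source, as stored in JSON
}

interface Violation {
  ruleId: string;
  severity: string;
  line: number; // 1-based line of the match start
  excerpt: string;
}

// Scan one file's text and report every match with its line number.
function findViolations(rule: RuleSketch, source: string): Violation[] {
  const regex = new RegExp(rule.pattern, "g");
  const violations: Violation[] = [];
  let match: RegExpExecArray | null;
  while ((match = regex.exec(source)) !== null) {
    if (match[0].length === 0) {
      regex.lastIndex++; // step past empty matches to avoid an infinite loop
      continue;
    }
    // Line number = count of newlines before the match, plus one.
    const line = source.slice(0, match.index).split("\n").length;
    violations.push({ ruleId: rule.id, severity: rule.severity, line, excerpt: match[0] });
  }
  return violations;
}
```

For instance, a hypothetical rule with the pattern `console\.log` applied to a four-line file that logs on lines 2 and 4 would produce two violations with those line numbers.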

License

MIT