agent-health-monitor

v0.1.0

Published

4 months ago

CLI tool to detect AI coding agent session degradation

Agent Health Monitor

A lightweight CLI tool that detects when an AI coding agent session is going wrong — repeated fixes, regressions, hallucinated imports, or context overload.

Zero LLM calls. Fully local. Sub-second.

The Problem

When using AI coding agents (Claude Code, Cursor, Copilot, etc.) over many iterations, the agent can start to degrade:

Repeating the same fixes in a loop
Introducing regressions that weren't there before
Hallucinating imports or packages that don't exist
Touching unrelated files across the codebase
Making test results worse with every step

There's no built-in signal for this. You only notice when it's too late.

The Solution

Run am health whenever something feels off. The tool analyzes your git history and session log and gives you a diagnostic report in under a second.

Agent Health Report
═══════════════════════════════════════

  Score: [██████░░░░] 58/100
  Risk:  MEDIUM
  Steps: 6

── Signals ────────────────────────────

  ● Repeated File Edits (weight: 0.25) — 40/100
    → src/auth/middleware.ts edited 5 times

  ● Regression Risk (weight: 0.25) — 60/100
    → Test failures increased in 2 of 5 steps

  ● Large Diff Risk (weight: 0.20) — 80/100
    → No unusually large diffs detected

  ● Hallucinated Imports (weight: 0.15) — 50/100
    → src/utils/parser.ts: 'fast-xml-parser' not in package.json

  ● Context Complexity (weight: 0.15) — 70/100
    → 14 files across 5 directories

═══════════════════════════════════════

Suggested Action:
  Restart the agent with a summarized context.
  Consider breaking the task into smaller steps.

Install

npm install -g agent-health-monitor

No server. No background process. No API keys.

Usage

1. Start a session

Run this once at the beginning of an AI coding session, inside your git project:

cd your-project
am init

2. Log a step

After each meaningful AI change (commit, batch of edits), record a step:

am log

With test results:

am log --test-passed 42 --test-failed 2
am log --test-passed 38 --test-failed 6 -m "auth refactor"

3. Check health

When the AI starts behaving strangely, run:

am health

For machine-readable output:

am health --json

Other commands

am status   # Show session info (ID, step count, last activity)
am reset    # Clear the session and start fresh

All Commands

| Command | Description |
|---|---|
| am init | Start a new monitoring session |
| am init --force | Overwrite an existing session |
| am log | Record the current step from git |
| am log --test-passed N --test-failed N | Record step with test results |
| am log -m "message" | Record step with a note |
| am health | Run diagnostics and print report |
| am health --json | Output report as JSON |
| am status | Show current session info |
| am reset | Clear the current session |

How It Works

Session Log

am init creates a .agent-monitor/session.json file in your project. Each am log call reads the git diff since the last step and appends a record:

{
  "meta": {
    "id": "a1b2c3d4-...",
    "startedAt": "2025-03-24T10:00:00.000Z",
    "projectRoot": "/your/project",
    "initialCommit": "abc123"
  },
  "steps": [
    {
      "stepNumber": 1,
      "timestamp": "2025-03-24T10:05:00.000Z",
      "commitSha": "def456",
      "filesChanged": [
        { "path": "src/app.ts", "linesAdded": 12, "linesRemoved": 3, "changeType": "modified" }
      ],
      "linesAdded": 12,
      "linesRemoved": 3,
      "testResults": { "passed": 40, "failed": 2, "skipped": 0, "total": 42 },
      "errorsDetected": []
    }
  ]
}
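The record above maps naturally onto a few TypeScript interfaces. The sketch below mirrors the field names from the example; the interface names (`SessionLog`, `Step`, `FileChange`, `TestResults`) and the `summarize` helper are illustrative assumptions, not the tool's actual types, which live in src/types/index.ts.

```typescript
// Shapes inferred from the example session.json above.
// Names and optionality are assumptions made for this sketch.
interface FileChange {
  path: string;
  linesAdded: number;
  linesRemoved: number;
  changeType: string; // the example shows "modified"
}

interface TestResults {
  passed: number;
  failed: number;
  skipped: number;
  total: number;
}

interface Step {
  stepNumber: number;
  timestamp: string;
  commitSha: string;
  filesChanged: FileChange[];
  linesAdded: number;
  linesRemoved: number;
  testResults?: TestResults; // assumed optional: `am log` can run without test flags
  errorsDetected: unknown[]; // element type unknown; the example shows an empty array
}

interface SessionLog {
  meta: { id: string; startedAt: string; projectRoot: string; initialCommit: string };
  steps: Step[];
}

// Hypothetical helper: totals across all recorded steps.
function summarize(log: SessionLog): {
  steps: number;
  linesAdded: number;
  linesRemoved: number;
  distinctFiles: number;
} {
  const files = new Set<string>();
  let linesAdded = 0;
  let linesRemoved = 0;
  for (const step of log.steps) {
    linesAdded += step.linesAdded;
    linesRemoved += step.linesRemoved;
    for (const file of step.filesChanged) files.add(file.path);
  }
  return { steps: log.steps.length, linesAdded, linesRemoved, distinctFiles: files.size };
}

// The example record from above, typed against the assumed interfaces.
const exampleLog: SessionLog = {
  meta: {
    id: "a1b2c3d4-...",
    startedAt: "2025-03-24T10:00:00.000Z",
    projectRoot: "/your/project",
    initialCommit: "abc123",
  },
  steps: [
    {
      stepNumber: 1,
      timestamp: "2025-03-24T10:05:00.000Z",
      commitSha: "def456",
      filesChanged: [{ path: "src/app.ts", linesAdded: 12, linesRemoved: 3, changeType: "modified" }],
      linesAdded: 12,
      linesRemoved: 3,
      testResults: { passed: 40, failed: 2, skipped: 0, total: 42 },
      errorsDetected: [],
    },
  ],
};
```

In practice the object would come from `JSON.parse` on the contents of .agent-monitor/session.json rather than a literal.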

Health Scoring

The health score (0–100, higher is healthier) is a weighted combination of five signals:

| Signal | Weight | What it detects |
|---|---|---|
| Repeated File Edits | 25% | Same file modified many times in a short window |
| Regression Risk | 25% | Test failures trending upward across steps |
| Large Diff Risk | 20% | Unusually large modifications to many files |
| Hallucinated Imports | 15% | Imports that don't exist in package.json, requirements.txt, or go.mod |
| Context Complexity | 15% | Too many files or directories touched at once |
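As an illustration of the first signal, here is a minimal sketch of counting repeated edits over a sliding window of recent steps. The function names, the `windowSize` and `threshold` parameters, and the reduced `StepLike` shape are assumptions for this example; the tool's real thresholds are defined in src/utils/constants.ts and are not shown here.

```typescript
// Only the field this sketch needs from a recorded step.
interface StepLike {
  filesChanged: { path: string }[];
}

// Count edits per file across the last `windowSize` steps.
// Assumes each path appears at most once per step.
function countRecentEdits(steps: StepLike[], windowSize: number): Map<string, number> {
  const counts = new Map<string, number>();
  const recent = windowSize > 0 ? steps.slice(-windowSize) : [];
  for (const step of recent) {
    for (const file of step.filesChanged) {
      counts.set(file.path, (counts.get(file.path) ?? 0) + 1);
    }
  }
  return counts;
}

// Files edited at least `threshold` times in the window, most edited first.
function repeatedEdits(
  steps: StepLike[],
  windowSize: number,
  threshold: number
): Array<[string, number]> {
  const result: Array<[string, number]> = [];
  countRecentEdits(steps, windowSize).forEach((count, path) => {
    if (count >= threshold) result.push([path, count]);
  });
  return result.sort((a, b) => b[1] - a[1]);
}
```

A result such as `["src/auth/middleware.ts", 5]` corresponds to the "edited 5 times" line in the sample report.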

Risk levels:

| Score | Risk |
|---|---|
| 75 – 100 | LOW |
| 50 – 74 | MEDIUM |
| 25 – 49 | HIGH |
| 0 – 24 | CRITICAL |
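The scoring and risk tables can be read as the following sketch. It assumes the overall score is a plain weighted sum of per-signal scores rounded to an integer; the exact arithmetic in src/core/health-scorer.ts may differ, and the names `SignalScore`, `overallScore`, and `riskLevel` are made up for this example.

```typescript
type Risk = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface SignalScore {
  name: string;
  weight: number; // fraction of the total; all weights should sum to 1.0
  score: number; // 0-100, higher is healthier
}

// Weighted sum of per-signal scores. Rounding to an integer is an assumption.
function overallScore(signals: SignalScore[]): number {
  const totalWeight = signals.reduce((sum, s) => sum + s.weight, 0);
  if (Math.abs(totalWeight - 1) > 1e-9) {
    throw new Error(`weights must sum to 1.0, got ${totalWeight}`);
  }
  const weighted = signals.reduce((sum, s) => sum + s.weight * s.score, 0);
  return Math.round(weighted);
}

// Thresholds taken from the risk table.
function riskLevel(score: number): Risk {
  if (score >= 75) return "LOW";
  if (score >= 50) return "MEDIUM";
  if (score >= 25) return "HIGH";
  return "CRITICAL";
}

// Illustrative per-signal scores (not taken from a real session).
const exampleSignals: SignalScore[] = [
  { name: "Repeated File Edits", weight: 0.25, score: 90 },
  { name: "Regression Risk", weight: 0.25, score: 80 },
  { name: "Large Diff Risk", weight: 0.2, score: 100 },
  { name: "Hallucinated Imports", weight: 0.15, score: 60 },
  { name: "Context Complexity", weight: 0.15, score: 70 },
];
```

With these inputs the weighted sum is 22.5 + 20 + 20 + 9 + 10.5 = 82, which falls in the LOW band.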

Import Validation

The hallucinated imports signal supports three languages:

JavaScript / TypeScript — checks against node_modules/ and Node.js builtins
Python — checks against requirements.txt / pyproject.toml and stdlib
Go — checks against go.mod and stdlib
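For the JavaScript / TypeScript case, the check can be sketched roughly as below. This is a regex-based approximation under stated assumptions, not the logic in src/core/import-validator.ts: the function names are hypothetical, the caller supplies the declared packages and builtin names as sets, dynamic `import()` calls are not handled, and import-like text inside strings or comments may produce false positives.

```typescript
// Reduce an import specifier to its package name:
// "lodash/fp" -> "lodash", "@scope/pkg/sub" -> "@scope/pkg".
function packageName(specifier: string): string {
  const parts = specifier.split("/");
  return specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0];
}

// Return package names imported in `source` that are neither declared nor builtin.
// `declared` and `builtins` are supplied by the caller in this sketch.
function findUnknownImports(
  source: string,
  declared: Set<string>,
  builtins: Set<string>
): string[] {
  // Matches `... from "x"`, `import "x"`, and `require("x")`.
  const pattern = /(?:\bfrom\s*|\bimport\s*|\brequire\s*\(\s*)["']([^"']+)["']/g;
  const unknown = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1];
    // Relative and absolute paths refer to project files, not packages.
    if (specifier.startsWith(".") || specifier.startsWith("/")) continue;
    const name = packageName(specifier.replace(/^node:/, ""));
    if (!declared.has(name) && !builtins.has(name)) unknown.add(name);
  }
  return Array.from(unknown);
}
```

Called on a file that imports `fast-xml-parser` when only `chalk` is declared, the function returns `["fast-xml-parser"]`, which matches the kind of finding shown in the sample report.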

Requirements

Node.js >= 18
Git (project must be a git repository)

Project Install (per-project, not global)

npm install --save-dev agent-health-monitor

Then use with npx:

npx am init
npx am log
npx am health

Or add to your package.json scripts:

"scripts": {
  "am:init": "am init",
  "am:log": "am log",
  "am:health": "am health"
}

Add to .gitignore

The session log contains local paths and is not meant to be committed:

.agent-monitor/

Contributing

Contributions are welcome. This is an early-stage open-source project.

Setup:

git clone https://github.com/your-username/agent-health-monitor
cd agent-health-monitor
npm install
npm run build
npm test

Project structure:

src/
  cli.ts                  # Entry point (Commander)
  commands/               # init, log, health, status, reset
  core/
    session.ts            # Session read/write
    git-analyzer.ts       # Git diff via simple-git
    health-scorer.ts      # Orchestrates signals
    import-validator.ts   # Multi-language import checks
  signals/
    repeated-edits.ts
    regression-risk.ts
    large-diff.ts
    hallucinated-imports.ts
    context-complexity.ts
  types/index.ts          # All TypeScript interfaces
  utils/
    constants.ts          # Weights, thresholds, builtin lists
    formatting.ts         # Chalk-based report rendering

To add a new signal:

Create src/signals/your-signal.ts — export a function (session: Session) => SignalResult
Add it to src/core/health-scorer.ts
Adjust weights in src/utils/constants.ts so they still sum to 1.0
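A new signal following the steps above might look like the sketch below. The `Session` and `SignalResult` shapes here are reduced guesses for illustration (the real interfaces are in src/types/index.ts and may carry different fields), and the signal itself, which watches for growth in skipped tests, is a hypothetical example rather than one of the five built-in signals.

```typescript
// Assumed minimal shapes; the project's real interfaces may differ.
interface Session {
  steps: { testResults?: { skipped: number } }[];
}

interface SignalResult {
  name: string;
  score: number; // 0-100, higher is healthier
  details: string[];
}

// Hypothetical signal: penalize steps where the number of skipped tests grew,
// which can indicate failures being silenced rather than fixed.
// In the project this would be exported from src/signals/your-signal.ts.
function skippedTestsSignal(session: Session): SignalResult {
  // Collect skipped counts from steps that recorded test results.
  const skipped: number[] = [];
  for (const step of session.steps) {
    if (step.testResults) skipped.push(step.testResults.skipped);
  }

  // Count transitions where the skipped count went up.
  let increases = 0;
  for (let i = 1; i < skipped.length; i++) {
    if (skipped[i] > skipped[i - 1]) increases++;
  }

  const transitions = Math.max(skipped.length - 1, 0);
  const score =
    transitions === 0 ? 100 : Math.round(100 * (1 - increases / transitions));
  const details =
    increases > 0
      ? [`Skipped tests increased in ${increases} of ${transitions} steps`]
      : ["No growth in skipped tests"];

  return { name: "Skipped Tests", score, details };
}
```

Keeping the signal a pure function of the session makes it easy to unit-test with hand-built session objects before wiring it into the scorer.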

Scripts:

npm run build     # Type-check + bundle
npm test          # Run tests
npm run dev       # Watch mode (TypeScript)

License

MIT