@phoenixaihub/scope-guard

v0.1.0

phoenixaihub

ScopeGuard 🛡️

Delegation Corruption Detector — detects when AI coding agents silently modify code outside their assigned scope.

Research shows frontier models corrupt ~25% of document content during long delegated workflows. ScopeGuard is the missing enforcement layer: it AST-diffs what the agent changed against what it was asked to change.

Features

🎯 Scope Classification — Every change classified as ✅ in-scope, ⚠️ adjacent, or 🚨 corruption
📊 Drift Tracking — Monitor scope creep across multi-turn sessions
🔍 Hunk-level Analysis — Per-hunk classification with confidence scores
📋 Multiple Reporters — Console, SARIF 2.1.0, GitHub Actions annotations
🚫 Zero LLM Dependency — Pure algorithmic analysis using token similarity + structural matching
⚡ CI-Ready — Pre-commit hook, GitHub Action, exit codes for automation

Install

npm install @phoenixaihub/scope-guard

CLI Usage

# Pipe from git diff
git diff HEAD~1 | scopeguard check -t "Add user authentication"

# From a diff file
scopeguard check -t "Fix login bug" -d changes.diff

# SARIF output for CI
git diff main | scopeguard check -t "Refactor database layer" -f sarif > report.sarif

# GitHub Actions annotations
git diff main | scopeguard check -t "Add caching" -f annotations

# Strict mode (exit 1 on ANY corruption)
git diff HEAD~1 | scopeguard check -t "Fix auth" --strict

# Custom thresholds
git diff | scopeguard check -t "Update API" --corruption-threshold 0.2 --adjacent-threshold 0.4

# With drift tracking
git diff | scopeguard check -t "Migrate to v2" --track-drift

Programmatic API

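import { execSync } from 'node:child_process';
import { checkScope } from '@phoenixaihub/scope-guard';

// Unified diff of the last commit as a string (any unified diff text works here)
const gitDiffString = execSync('git diff HEAD~1', { encoding: 'utf8' });

const report = checkScope({
  task: 'Add input validation to the login handler',
  diff: gitDiffString,
});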

console.log(report.summary);
// {
//   totalChanges: 5,
//   inScope: 3,
//   adjacent: 1,
//   corruption: 1,
//   scopeScore: 0.8,
//   verdict: 'warn'
// }

// Report each change classified as corruption
for (const change of report.changes) {
  if (change.classification === 'corruption') {
    console.log(`🚨 ${change.filePath}: ${change.reason}`);
  }
}
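
For a quick triage view, the loop above can be generalized to group every change by its classification. This is a minimal sketch that relies only on the `classification`, `filePath`, and `reason` fields shown in the example. `groupByClassification` is a hypothetical helper, not part of the package, the sample array is made-up illustration data, and the `'in-scope'` label string is an assumption (only `'corruption'` appears in this README).

```javascript
// Hypothetical helper (not part of @phoenixaihub/scope-guard):
// groups change objects by their classification label.
function groupByClassification(changes) {
  const groups = new Map();
  for (const change of changes) {
    const label = change.classification;
    if (!groups.has(label)) groups.set(label, []);
    groups.get(label).push(change);
  }
  return groups;
}

// Made-up data using the same fields as report.changes above.
// The 'in-scope' label string is an assumption.
const sampleChanges = [
  { filePath: 'src/login.js', classification: 'in-scope', reason: 'matches task identifiers' },
  { filePath: 'src/billing.js', classification: 'corruption', reason: 'unrelated to task' },
  { filePath: 'src/payments.js', classification: 'corruption', reason: 'unrelated to task' },
];

const grouped = groupByClassification(sampleChanges);
for (const [label, items] of grouped) {
  console.log(`${label}: ${items.map((c) => c.filePath).join(', ')}`);
}
```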

Reporters

import fs from 'node:fs';
import { checkScope, formatConsole, generateSarif, formatAnnotations } from '@phoenixaihub/scope-guard';

const report = checkScope({ task: '...', diff: '...' });

// Console output
console.log(formatConsole(report));

// SARIF 2.1.0
const sarif = generateSarif(report);
fs.writeFileSync('report.sarif', JSON.stringify(sarif, null, 2));

// GitHub Actions annotations
console.log(formatAnnotations(report));
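
GitHub Actions annotations are plain lines written to stdout in the workflow command format `::level file=...::message`. The sketch below shows how one change could be rendered that way, including the escaping that workflow commands require. It is not the package's `formatAnnotations` implementation, whose exact output is not documented here. `toAnnotation` is a hypothetical helper, and mapping corruption to `error` and everything else to `warning` is an assumption.

```javascript
// Escaping rules for GitHub Actions workflow commands:
// message data escapes %, CR, and LF; property values also escape ':' and ','.
function escapeData(value) {
  return String(value).replace(/%/g, '%25').replace(/\r/g, '%0D').replace(/\n/g, '%0A');
}

function escapeProperty(value) {
  return escapeData(value).replace(/:/g, '%3A').replace(/,/g, '%2C');
}

// Hypothetical helper: renders one change as an annotation line.
// The level mapping (corruption -> error, otherwise warning) is an assumption.
function toAnnotation(change) {
  const level = change.classification === 'corruption' ? 'error' : 'warning';
  return `::${level} file=${escapeProperty(change.filePath)}::${escapeData(change.reason)}`;
}

console.log(
  toAnnotation({ filePath: 'src/billing.js', classification: 'corruption', reason: 'Unrelated to task' })
);
// ::error file=src/billing.js::Unrelated to task
```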

Drift Tracking

// latestDiff: unified diff text for the newest commit in the session
const report = checkScope({
  task: 'Migrate auth to OAuth2',
  diff: latestDiff,
  trackDrift: true,
  previousCommits: [
    { hash: 'abc', message: 'Start OAuth2', scopeScore: 0.95, corruptionCount: 0, adjacentCount: 1, inScopeCount: 5 },
    { hash: 'def', message: 'Add token refresh', scopeScore: 0.8, corruptionCount: 1, adjacentCount: 2, inScopeCount: 4 },
  ],
});

console.log(report.drift);
// { trend: 'drifting', driftScore: 0.35, commits: [...] }
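
To build the `previousCommits` history across a session, each earlier report has to be turned into an entry of the shape shown above. The sketch below assumes the summary fields (`inScope`, `adjacent`, `corruption`, `scopeScore`) map one-to-one onto the entry fields (`inScopeCount`, `adjacentCount`, `corruptionCount`, `scopeScore`). The names line up, but this README does not state the mapping, and `toCommitRecord` is a hypothetical helper, not a package export.

```javascript
// Hypothetical helper: converts a report summary into a previousCommits entry.
// The field mapping is an assumption based on the matching names in the examples.
function toCommitRecord(hash, message, summary) {
  return {
    hash,
    message,
    scopeScore: summary.scopeScore,
    corruptionCount: summary.corruption,
    adjacentCount: summary.adjacent,
    inScopeCount: summary.inScope,
  };
}

// Summary values copied from the Programmatic API example.
const exampleSummary = {
  totalChanges: 5,
  inScope: 3,
  adjacent: 1,
  corruption: 1,
  scopeScore: 0.8,
  verdict: 'warn',
};

const history = [];
history.push(toCommitRecord('abc', 'Start OAuth2', exampleSummary));
console.log(history);
```

Appending one record per checked commit and passing the array as `previousCommits` on the next call keeps the history in the caller's hands.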

GitHub Actions

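- name: ScopeGuard Check
  env:
    # Pass PR data through env vars so an attacker-controlled title is never
    # interpolated into the shell script.
    PR_TITLE: ${{ github.event.pull_request.title }}
    BASE_SHA: ${{ github.event.pull_request.base.sha }}
  run: |
    # Requires the base commit in the local clone (e.g. checkout with fetch-depth: 0).
    npm install -g @phoenixaihub/scope-guard
    git diff "$BASE_SHA" | \
      scopeguard check \
        -t "$PR_TITLE" \
        -f annotations \
        --strict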

How It Works

Scope Parser — Extracts task intent from descriptions: tokenization, identifier extraction, file pattern matching
Change Extractor — Parses unified diffs into structured file/hunk objects
Scope Classifier — Maps each hunk to task requirements using TF-IDF-like token similarity, identifier matching, and file path relevance
Drift Tracker — Monitors scope creep across multiple commits with trend detection
Reporter — Outputs results as console text, SARIF 2.1.0, or GitHub Actions annotations
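
As an illustration of the Change Extractor step, the sketch below parses unified diff text into file and hunk objects. It is a simplified stand-in, not ScopeGuard's parser: the function name `parseUnifiedDiff` and the object shape (`filePath`, `hunks`, `oldStart`, `newStart`, `added`, `removed`) are assumptions chosen for the example. It uses the line counts in each `@@` header to decide where a hunk ends, so removed lines that happen to start with `--` are not mistaken for file headers.

```javascript
// Strip a trailing timestamp (after a tab) and git's "a/" or "b/" prefix.
function cleanPath(raw) {
  return raw.split('\t')[0].trim().replace(/^[ab]\//, '');
}

// Simplified sketch: unified diff text -> [{ filePath, hunks: [...] }].
function parseUnifiedDiff(diffText) {
  const files = [];
  let file = null;
  let hunk = null;
  let oldPath = null;
  let oldLeft = 0; // old-side lines still expected in the current hunk
  let newLeft = 0; // new-side lines still expected in the current hunk

  for (const line of diffText.split('\n')) {
    if (hunk && (oldLeft > 0 || newLeft > 0)) {
      if (line.startsWith('+')) { hunk.added.push(line.slice(1)); newLeft--; }
      else if (line.startsWith('-')) { hunk.removed.push(line.slice(1)); oldLeft--; }
      else if (line.startsWith('\\')) { /* "\ No newline at end of file" marker */ }
      else { oldLeft--; newLeft--; } // context line, present on both sides
      continue;
    }
    if (line.startsWith('--- ')) { oldPath = cleanPath(line.slice(4)); continue; }
    if (line.startsWith('+++ ')) {
      const newPath = cleanPath(line.slice(4));
      // Deleted files have "+++ /dev/null"; fall back to the old path.
      file = { filePath: newPath === '/dev/null' ? oldPath : newPath, hunks: [] };
      files.push(file);
      hunk = null;
      continue;
    }
    const m = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (m && file) {
      oldLeft = m[2] === undefined ? 1 : Number(m[2]); // an omitted count means 1
      newLeft = m[4] === undefined ? 1 : Number(m[4]);
      hunk = { oldStart: Number(m[1]), newStart: Number(m[3]), added: [], removed: [] };
      file.hunks.push(hunk);
    }
  }
  return files;
}

const sampleDiff = [
  'diff --git a/src/login.js b/src/login.js',
  'index 1111111..2222222 100644',
  '--- a/src/login.js',
  '+++ b/src/login.js',
  '@@ -1,3 +1,4 @@',
  ' function login(user) {',
  '-  return auth(user);',
  '+  validate(user);',
  '+  return auth(user);',
  ' }',
].join('\n');

console.log(JSON.stringify(parseUnifiedDiff(sampleDiff), null, 2));
```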

Classification Logic

Each change gets a composite relevance score:

40% Token similarity (task description ↔ code tokens)
30% Identifier matching (named functions, classes, variables)
30% File path relevance (mentioned paths, directory overlap)
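
The weighting above can be sketched as a plain weighted sum. In the example below, the 40/30/30 weights come from this README; everything else is an assumption for illustration. In particular, `tokenOverlap` is a simple stand-in (the fraction of task tokens that also appear in the code), not the TF-IDF-like measure ScopeGuard uses, and the function names are hypothetical.

```javascript
// Weights from the list above.
const WEIGHTS = { token: 0.4, identifier: 0.3, path: 0.3 };

// Split camelCase, lowercase, and break on non-alphanumerics.
function tokenize(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2')
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 1);
}

// Stand-in similarity: share of distinct task tokens found in the code.
// This is NOT TF-IDF; it only illustrates a score in the range [0, 1].
function tokenOverlap(taskText, codeText) {
  const taskTokens = new Set(tokenize(taskText));
  const codeTokens = new Set(tokenize(codeText));
  if (taskTokens.size === 0) return 0;
  let hits = 0;
  for (const t of taskTokens) {
    if (codeTokens.has(t)) hits++;
  }
  return hits / taskTokens.size;
}

// Weighted sum of three component scores, each expected in [0, 1].
function compositeScore({ tokenSimilarity, identifierMatch, pathRelevance }) {
  return (
    WEIGHTS.token * tokenSimilarity +
    WEIGHTS.identifier * identifierMatch +
    WEIGHTS.path * pathRelevance
  );
}

const score = compositeScore({ tokenSimilarity: 0.5, identifierMatch: 1, pathRelevance: 0 });
console.log(score.toFixed(2)); // 0.50
```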

Score thresholds (configurable):

≥ 0.2 → ✅ In-scope
≥ 0.1 → ⚠️ Adjacent (imports, formatting, config files)
< 0.1 → 🚨 Corruption (unrelated modification)
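
The threshold rules translate directly into a small function. This sketch uses the default cut-offs listed above. The function name, the shape of the thresholds object, and the `'in-scope'` and `'adjacent'` label strings are assumptions for illustration (only `'corruption'` appears as a classification value in the API example).

```javascript
// Default cut-offs from the list above.
const DEFAULT_THRESHOLDS = { inScope: 0.2, adjacent: 0.1 };

// Maps a composite relevance score to a classification label.
// Label strings other than 'corruption' are assumptions.
function classify(score, thresholds = DEFAULT_THRESHOLDS) {
  if (score >= thresholds.inScope) return 'in-scope';
  if (score >= thresholds.adjacent) return 'adjacent';
  return 'corruption';
}

for (const score of [0.5, 0.2, 0.15, 0.1, 0.05]) {
  console.log(score, classify(score));
}
```

Both boundaries are inclusive on the upper class, so a score of exactly 0.2 counts as in-scope and exactly 0.1 counts as adjacent.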

License

MIT