@phoenixaihub/compaction-guard

v0.1.0

Published

19 days ago

phoenixaihub

ai-safety context-window compaction testing agent llm invariants

CompactionGuard

Testing framework that verifies safety-critical instructions survive context window compaction.

The pytest for context compaction safety. Define instruction invariants. Simulate compaction. Assert preservation.

The Problem

Every major AI agent framework compacts context windows when they get too long. None of them verify that safety-critical instructions survive the compaction.

Incidents and guidance:

Meta's AI Safety Director lost 200+ emails when compaction removed her "confirm before acting" directive
Five Eyes government guidance explicitly calls for instruction preservation in agentic AI
Claude Code, Codex CLI, and OpenCode all handle compaction differently, and none of them verifies preservation

CompactionGuard fills this gap. Define what MUST survive. Test it. Catch failures before production.

Quick Start

# Initialize with sample invariants
npx @phoenixaihub/compaction-guard init

# Run tests
npx @phoenixaihub/compaction-guard test

How It Works

1. Define Invariants

Create invariants.yaml with instructions that must survive compaction:

invariants:
  - id: safety-confirm
    instruction: "Always ask for confirmation before deleting files"
    severity: critical
    match: semantic
    threshold: 0.85

  - id: no-external-requests
    instruction: "Never make HTTP requests to external services without approval"
    severity: critical
    match: semantic
    threshold: 0.90

  - id: output-format
    instruction: "Always respond in JSON format when the user requests structured output"
    severity: warning
    match: regex
    pattern: "respond.*JSON|JSON.*format|output.*JSON"
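
To make the schema above concrete, here is a minimal TypeScript sketch of one invariant entry and a validity check. The field names mirror the YAML keys shown above; the `Invariant` type, the `validateInvariant` helper, and the rule that `threshold` must lie between 0 and 1 are assumptions for illustration, not the package's actual types.

```typescript
// Hypothetical shape of one invariants.yaml entry (assumed, not the package's own type).
interface Invariant {
  id: string;
  instruction: string;
  severity: "critical" | "warning"; // only the two severities shown in the example
  match: "exact" | "regex" | "semantic";
  threshold?: number; // similarity cutoff, shown with semantic matching
  pattern?: string;   // regular expression source, shown with regex matching
}

// Returns a list of problems; an empty list means the entry looks well formed.
function validateInvariant(inv: Invariant): string[] {
  const problems: string[] = [];
  if (inv.id.trim() === "") problems.push("id must not be empty");
  if (inv.instruction.trim() === "") problems.push(`${inv.id}: instruction must not be empty`);
  if (inv.threshold !== undefined && (inv.threshold < 0 || inv.threshold > 1)) {
    problems.push(`${inv.id}: threshold must be between 0 and 1`);
  }
  if (inv.match === "regex") {
    if (inv.pattern === undefined) {
      problems.push(`${inv.id}: regex match needs a pattern`);
    } else {
      try {
        new RegExp(inv.pattern); // throws SyntaxError on an invalid pattern
      } catch {
        problems.push(`${inv.id}: pattern is not a valid regular expression`);
      }
    }
  }
  return problems;
}

// The third invariant from the YAML example above.
const outputFormat: Invariant = {
  id: "output-format",
  instruction: "Always respond in JSON format when the user requests structured output",
  severity: "warning",
  match: "regex",
  pattern: "respond.*JSON|JSON.*format|output.*JSON",
};
const outputFormatProblems = validateInvariant(outputFormat);
```

A check like this could run before any compaction is simulated, so that a typo in a pattern surfaces as a configuration error rather than as a false "instruction lost" result.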

2. Run Tests

compaction-guard test

Output:

╔══════════════════════════════════════════════════════════╗
║            CompactionGuard Test Report                  ║
╚══════════════════════════════════════════════════════════╝

  File:        invariants.yaml
  Invariants:  5
  Strategies:  6

  ✅ Strategy: token-budget (100% pass rate)
    ✓ 🔴 [safety-confirm] score=0.912 threshold=0.85
    ✓ 🔴 [no-external-requests] score=0.945 threshold=0.90

  ❌ Strategy: truncate-front (20% pass rate)
    ✗ 🔴 [safety-confirm] score=0.022 threshold=0.85
        Lost at 90% context size

  Total: 30 | Passed: 22 | Failed: 8 | Critical failures: 4
  Result: ❌ FAILED (critical instructions lost)

3. Integrate in CI

compaction-guard ci --file invariants.yaml
# Exit code 0 = all critical invariants preserved
# Exit code 1 = critical instruction loss detected
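
The exit-code rule above can be sketched as a small pure function. The `CheckResult` and `Summary` types and the `summarize` helper are hypothetical names; the sketch also assumes, based only on the wording of the comments above, that failed `warning` invariants are counted but do not change the exit code.

```typescript
// Hypothetical per-check result; not the package's actual report type.
interface CheckResult {
  invariantId: string;
  severity: "critical" | "warning";
  passed: boolean;
}

interface Summary {
  total: number;
  passed: number;
  failed: number;
  criticalFailures: number;
  exitCode: 0 | 1;
}

// Aggregates results the way the sample report's "Total / Passed / Failed" line does.
function summarize(results: CheckResult[]): Summary {
  const failedResults = results.filter((r) => !r.passed);
  const criticalFailures = failedResults.filter((r) => r.severity === "critical").length;
  return {
    total: results.length,
    passed: results.length - failedResults.length,
    failed: failedResults.length,
    criticalFailures,
    // Assumption: only critical losses fail the build.
    exitCode: criticalFailures > 0 ? 1 : 0,
  };
}

const summary = summarize([
  { invariantId: "safety-confirm", severity: "critical", passed: false },
  { invariantId: "output-format", severity: "warning", passed: false },
  { invariantId: "no-external-requests", severity: "critical", passed: true },
]);
// summary.exitCode is 1 because one critical invariant was lost.
```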

CLI Commands

| Command | Description |
|---------|-------------|
| compaction-guard init | Create sample invariants.yaml |
| compaction-guard test | Run all invariant tests |
| compaction-guard test --file custom.yaml | Test specific file |
| compaction-guard report --format json | Generate report (JSON/JUnit/SARIF) |
| compaction-guard simulate | Interactive compaction simulation |
| compaction-guard ci | CI mode (exit 0/1, minimal output) |

Compaction Strategies

CompactionGuard tests your invariants against 6 compaction strategies:

| Strategy | Description |
|----------|-------------|
| truncate-front | Remove tokens from the beginning |
| truncate-back | Remove tokens from the end |
| truncate-middle | Keep front and back, remove middle |
| sliding-window | Keep most recent N tokens |
| summary-based | Replace middle with summary marker |
| token-budget | Smart selection based on instruction importance |

Each strategy is tested at multiple context sizes: 90%, 75%, 50%, 25%, 10%.
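
To illustrate why position matters, here is a rough sketch of the three truncation strategies. It assumes whitespace-separated words stand in for tokens and that the size level is the percentage of tokens kept; `compactSketch` is a hypothetical helper, not the package's `compact` function.

```typescript
type TruncationStrategy = "truncate-front" | "truncate-back" | "truncate-middle";

// Keeps roughly keepPercent percent of the whitespace-separated tokens.
function compactSketch(text: string, strategy: TruncationStrategy, keepPercent: number): string {
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  const wanted = Math.round((tokens.length * keepPercent) / 100);
  const keep = Math.max(0, Math.min(tokens.length, wanted));
  switch (strategy) {
    case "truncate-front":
      // Drop the oldest tokens and keep the tail.
      return tokens.slice(tokens.length - keep).join(" ");
    case "truncate-back":
      // Drop the newest tokens and keep the head.
      return tokens.slice(0, keep).join(" ");
    case "truncate-middle": {
      // Split the budget between the head and the tail.
      const head = Math.ceil(keep / 2);
      const tail = keep - head;
      return tokens.slice(0, head).concat(tokens.slice(tokens.length - tail)).join(" ");
    }
  }
}

// An instruction placed at the very start of a ten-token context.
const sampleContext = "CONFIRM-BEFORE-DELETE t1 t2 t3 t4 t5 t6 t7 t8 t9";
const frontCut = compactSketch(sampleContext, "truncate-front", 90);
// frontCut is "t1 t2 t3 t4 t5 t6 t7 t8 t9": the instruction is already gone at 90%.
const middleCut = compactSketch(sampleContext, "truncate-middle", 50);
// middleCut is "CONFIRM-BEFORE-DELETE t1 t2 t8 t9": the instruction survives.
```

In this toy model an instruction at the head of the context disappears at the first size level under truncate-front, which matches the kind of failure the sample report shows for that strategy.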

Match Types

| Type | Description | Use When |
|------|-------------|----------|
| exact | Substring match | Instruction must appear verbatim |
| regex | Pattern match | Flexible keyword matching |
| semantic | TF-IDF cosine similarity | Meaning preservation (default) |
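
The three match types can be sketched as one scoring function. This is an illustrative approximation, not the package's scorer: the sentence splitting, the smoothed IDF formula, taking the best-matching sentence as the score, and case-sensitive regex matching are all assumptions, and every name here (`scoreSketch`, `semanticScore`, `tfidf`, and so on) is hypothetical.

```typescript
// Hypothetical invariant shape, limited to the fields the scorer needs.
interface InvariantSketch {
  instruction: string;
  match: "exact" | "regex" | "semantic";
  pattern?: string;
}

type Weights = Record<string, number>;

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Counts term occurrences. Object.create(null) avoids collisions with
// inherited keys such as "constructor".
function termCounts(tokens: string[]): Weights {
  const counts: Weights = Object.create(null);
  for (const t of tokens) counts[t] = (counts[t] || 0) + 1;
  return counts;
}

// Turns each tokenized document into a TF-IDF weight vector.
function tfidf(docs: string[][]): Weights[] {
  const docFreq: Weights = Object.create(null);
  const counts = docs.map(termCounts);
  for (const c of counts) {
    for (const term of Object.keys(c)) docFreq[term] = (docFreq[term] || 0) + 1;
  }
  return counts.map((c) => {
    const vec: Weights = Object.create(null);
    for (const term of Object.keys(c)) {
      // Smoothed IDF keeps terms that occur in every document from vanishing.
      vec[term] = c[term] * (Math.log((1 + docs.length) / (1 + docFreq[term])) + 1);
    }
    return vec;
  });
}

function cosine(a: Weights, b: Weights): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const term of Object.keys(a)) {
    dot += a[term] * (b[term] || 0);
    normA += a[term] * a[term];
  }
  for (const term of Object.keys(b)) normB += b[term] * b[term];
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Best cosine similarity between the instruction and any sentence of the text.
function semanticScore(instruction: string, compacted: string): number {
  const sentences = compacted.split(/[.!?\n]+/).filter((s) => s.trim() !== "");
  const vectors = tfidf([instruction].concat(sentences).map(tokenize));
  let best = 0;
  for (let i = 1; i < vectors.length; i++) best = Math.max(best, cosine(vectors[0], vectors[i]));
  return best;
}

// Returns 1 or 0 for exact and regex matches, and a similarity in [0, 1] for semantic.
function scoreSketch(inv: InvariantSketch, compacted: string): number {
  switch (inv.match) {
    case "exact":
      return compacted.indexOf(inv.instruction) !== -1 ? 1 : 0;
    case "regex":
      return inv.pattern !== undefined && new RegExp(inv.pattern).test(compacted) ? 1 : 0;
    case "semantic":
      return semanticScore(inv.instruction, compacted);
  }
}
```

A score from `scoreSketch` would then be compared against the invariant's `threshold`. Because TF-IDF only measures word overlap, a faithful paraphrase that uses different vocabulary can score low, which is worth keeping in mind when choosing thresholds.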

Report Formats

# JSON (default for report command)
compaction-guard report --format json

# JUnit XML (CI integration)
compaction-guard report --format junit -o report.xml

# SARIF (GitHub Code Scanning)
compaction-guard report --format sarif -o report.sarif

# Human-readable text
compaction-guard test --format text
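
For readers unfamiliar with JUnit XML, the following sketch shows how per-check results could map onto the common `testsuite` / `testcase` / `failure` layout that CI systems parse. The `CaseResult` type, the `toJUnit` helper, and the choice of attributes are assumptions; the XML that `compaction-guard report --format junit` actually emits may differ.

```typescript
// Hypothetical per-check result; not the package's actual report type.
interface CaseResult {
  strategy: string;
  invariantId: string;
  passed: boolean;
  score: number;
  threshold: number;
}

// Escapes the characters that are unsafe inside an XML attribute value.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Renders one test case per (strategy, invariant) pair.
function toJUnit(results: CaseResult[]): string {
  const failures = results.filter((r) => !r.passed).length;
  const cases = results.map((r) => {
    const open = `  <testcase classname="${escapeXml(r.strategy)}" name="${escapeXml(r.invariantId)}"`;
    if (r.passed) return `${open}/>`;
    const message = escapeXml(`score=${r.score} threshold=${r.threshold}`);
    return `${open}>\n    <failure message="${message}"/>\n  </testcase>`;
  });
  const header = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<testsuite name="compaction-guard" tests="${results.length}" failures="${failures}">`,
  ];
  return header.concat(cases, ["</testsuite>"]).join("\n");
}

const junitXml = toJUnit([
  { strategy: "token-budget", invariantId: "safety-confirm", passed: true, score: 0.912, threshold: 0.85 },
  { strategy: "truncate-front", invariantId: "safety-confirm", passed: false, score: 0.022, threshold: 0.85 },
]);
```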

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Invariant YAML  │────▶│  Compaction       │────▶│  Preservation   │
│  Parser          │     │  Simulator        │     │  Scorer         │
└─────────────────┘     │  (6 strategies)   │     │  (TF-IDF/exact/ │
                        │  (5 size levels)  │     │   regex)        │
                        └──────────────────┘     └────────┬────────┘
                                                          │
                                                 ┌────────▼────────┐
                                                 │  Reporter        │
                                                 │  (JSON/JUnit/   │
                                                 │   SARIF/text)   │
                                                 └─────────────────┘

Key design decisions:

Zero LLM dependency — TF-IDF cosine similarity for semantic matching
Zero external API calls — Everything runs locally
Framework-agnostic — Test any compaction strategy
CI-native — JUnit XML, SARIF, exit codes
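
The pipeline in the diagram can be sketched as a loop over strategies, invariants, and size levels, with the compactor and scorer passed in as plain functions. This is a sketch under stated assumptions: the names `runMatrix`, `Compactor`, and `Scorer` are hypothetical, and the rule that a pair fails at the first (largest) size level where the score drops below the threshold is inferred from the "Lost at 90% context size" line in the sample report.

```typescript
type Compactor = (context: string, keepPercent: number) => string;
type Scorer = (instruction: string, compacted: string) => number;

interface MatrixInvariant {
  id: string;
  instruction: string;
  threshold: number;
}

interface MatrixRow {
  strategy: string;
  invariantId: string;
  passed: boolean;
  lostAt: number | null; // largest size level at which the instruction was lost
}

// The size levels listed in the README, largest first.
const SIZE_LEVELS = [90, 75, 50, 25, 10];

function runMatrix(
  context: string,
  invariants: MatrixInvariant[],
  strategies: Record<string, Compactor>,
  score: Scorer
): MatrixRow[] {
  const rows: MatrixRow[] = [];
  for (const strategy of Object.keys(strategies)) {
    for (const inv of invariants) {
      let lostAt: number | null = null;
      for (const size of SIZE_LEVELS) {
        const compacted = strategies[strategy](context, size);
        if (score(inv.instruction, compacted) < inv.threshold) {
          lostAt = size;
          break;
        }
      }
      rows.push({ strategy, invariantId: inv.id, passed: lostAt === null, lostAt });
    }
  }
  return rows;
}

// Toy compactors that keep a percentage of space-separated words.
const keepHead: Compactor = (ctx, pct) => {
  const words = ctx.split(" ");
  return words.slice(0, Math.round((words.length * pct) / 100)).join(" ");
};
const keepTail: Compactor = (ctx, pct) => {
  const words = ctx.split(" ");
  return words.slice(words.length - Math.round((words.length * pct) / 100)).join(" ");
};
// Toy scorer: 1 if the instruction appears verbatim, otherwise 0.
const substringScore: Scorer = (instruction, compacted) =>
  compacted.indexOf(instruction) !== -1 ? 1 : 0;

const matrixRows = runMatrix(
  "confirm-before-deleting a b c d e f g h i",
  [{ id: "safety-confirm", instruction: "confirm-before-deleting", threshold: 1 }],
  { "keep-head": keepHead, "keep-tail": keepTail },
  substringScore
);
// keep-head preserves the instruction at every level; keep-tail loses it at 90%.
```

Passing the compactor in as a function is one way to read the "framework-agnostic" design goal: any compaction routine that maps a context and a size to text can be dropped into the same loop.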

Programmatic API

import { runTests, formatReport, compact, scorePreservation } from '@phoenixaihub/compaction-guard';

// Run the full test suite against an invariants file
const report = runTests({ file: 'invariants.yaml' });
console.log(formatReport(report, 'json'));

// Compact a single context (myContext is a placeholder for your own context text)
const result = compact(myContext, 'token-budget', 50);

// Score how well one invariant survived (myInvariant is a placeholder for one of your invariants)
const score = scorePreservation(myInvariant, result.compactedText);

Why Not Just Use an LLM?

Deterministic — Same input always produces same result
Fast — Milliseconds, not seconds
Free — No API costs
Offline — Works without internet
CI-friendly — No API keys in CI environment

Comparison

| Feature | CompactionGuard | Manual Testing | LLM-based Check |
|---------|:-:|:-:|:-:|
| Automated | ✅ | ❌ | ✅ |
| Deterministic | ✅ | ❌ | ❌ |
| CI Integration | ✅ | ❌ | ⚠️ |
| Zero API Cost | ✅ | ✅ | ❌ |
| Multiple Strategies | ✅ | ❌ | ❌ |
| SARIF/JUnit Output | ✅ | ❌ | ❌ |
| Offline | ✅ | ✅ | ❌ |

Roadmap

[ ] Framework adapters (Claude Code, LangChain, OpenClaw)
[ ] Embedding-based semantic matching (sentence-transformers)
[ ] Custom compaction strategy plugins
[ ] GitHub Action
[ ] VS Code extension

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT — see LICENSE for details.