contrarianai-context-inspector

v0.2.19

Published

a month ago

See AI bugs before they ship. Free bell-curve sensor that watches your AI's working memory in real time. Plugs into Claude Desktop, Cursor, Cline, Windsurf, Claude Code. MIT-licensed.

Context Inspector

The drop-in MCP inspector for Bell Tuning™ — reading the statistical bell curve of AI context windows to catch domain-alignment degradation 3 steps before output failure.

Your AI system is failing 3 steps before you notice. Context Inspector applies Bell Tuning to your AI workflow: continuously reading the statistical bell curve of context-window content so you can spot degradation as it forms, not after it ships.

📄 Research-backed: Every signal in this tool traces to a controlled experiment documented in the white paper.

What is Bell Tuning?

Bell Tuning is the practice of treating an AI context window as a measurable distribution — and tuning your workflow against the shape of that distribution rather than against the output it eventually produces.

Tighter bell, right-shifted → context is on-domain and consistent
Wider bell, drifting left → contamination, summary loss, or topic drift entering
Flat bell near zero → original content gone; system still answering, but on noise

You don't tune by listening to the output. You tune by watching the bell. Context Inspector is the instrument.
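The three shapes above can be read as a rough decision rule over the mean and σ of the alignment scores. The sketch below is only an illustration: the `classifyBell` function and every threshold in it are hypothetical and are not taken from Context Inspector itself.

```javascript
// Hypothetical thresholds for illustration only; tune them against your own baseline runs.
const THRESHOLDS = { healthyMean: 0.6, tightSigma: 0.25, flatMean: 0.1, flatSigma: 0.05 };

// Map a (mean, sigma) pair onto the three bell shapes described above.
function classifyBell(mean, sigma, t = THRESHOLDS) {
  if (mean <= t.flatMean && sigma <= t.flatSigma) return 'flat-near-zero';
  if (mean >= t.healthyMean && sigma <= t.tightSigma) return 'tight-right-shifted';
  return 'wide-or-drifting';
}

console.log(classifyBell(0.85, 0.1));  // on-domain and consistent
console.log(classifyBell(0.4, 0.3));   // contamination or drift entering
console.log(classifyBell(0.02, 0.01)); // original content gone
```

In practice the useful signal is how the label changes from step to step, not any single reading.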

🚀 One-Command Install

npx contrarianai-context-inspector --install-mcp

This auto-detects Claude Desktop, Claude Code, Cursor, Windsurf, or Cline and adds the MCP server to your config. Restart your client, and you're done.

For a specific client: --client=claude-desktop (or cursor, windsurf, cline, claude-code).

🔬 Part of the Bell Tuning™ suite

Context Inspector is the base sensor. Four companion tools extend the same discipline to the other places AI production fails:

| Tool | What it watches | Install |
|---|---|---|
| predictor-corrector | Forecasts the bell-curve trajectory using numerical methods; detects context rot several turns ahead of output collapse | npx contrarianai-predictor-corrector --baseline prescriptive |
| retrieval-auditor | RAG-specific sensor: scores each chunk-to-query alignment and flags six silent retrieval bugs. Unsupervised health score tracks ground-truth precision@5 at r=0.999 | npx contrarianai-retrieval-auditor trace.json |
| tool-call-grader | Multi-agent / MCP sensor: grades tool calls per-call and per-session; catches silent failures, fixation, response bloat, schema drift | npx contrarianai-tool-call-grader session.json |
| audit-report-generator | Consumes the output of the four sensors and emits a unified audit report in markdown, HTML, or JSON | npx contrarianai-audit-report-generator audit.json --format html |

Full manifesto + four whitepapers: contrarianai-landing.onrender.com/bell-tuning

Free audit offer: The first 10 production users that email [email protected] (subject: "Bell Tuning audit") get a 1-hour engineer sit-down on their actual traces — bell curve, silent-bug flags, written report. No slides.

🖥️ Claude Desktop Setup

Quick install (recommended)

npx contrarianai-context-inspector --install-mcp --client=claude-desktop

Then restart Claude Desktop. Look for the hammer icon (🔨) in the bottom-right of the chat input — it should show 4 new tools from context-inspector.

Manual install

Edit claude_desktop_config.json at:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

Add:

{
  "mcpServers": {
    "context-inspector": {
      "command": "npx",
      "args": ["-y", "contrarianai-context-inspector", "--mcp"]
    }
  }
}

Save, restart Claude Desktop, and look for the hammer icon.
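If the config file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. A minimal sketch of that merge, assuming the config has already been parsed into an object; `addContextInspector` and the `some-other-server` entry are hypothetical names used only for illustration, and this is not the installer's actual code.

```javascript
// Returns a new config object with the context-inspector entry added,
// leaving any servers that are already registered untouched.
function addContextInspector(config) {
  const existing = config && typeof config === 'object' ? config : {};
  return {
    ...existing,
    mcpServers: {
      ...(existing.mcpServers || {}),
      'context-inspector': {
        command: 'npx',
        args: ['-y', 'contrarianai-context-inspector', '--mcp'],
      },
    },
  };
}

// Example: a config that already registers another (hypothetical) server.
const before = { mcpServers: { 'some-other-server': { command: 'node', args: ['server.js'] } } };
const after = addContextInspector(before);
console.log(JSON.stringify(after, null, 2));
```

To apply it to a real file, parse the file contents with `JSON.parse`, pass the result through the helper, and write back `JSON.stringify(result, null, 2)`.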

What you'll see

Screenshots not visible? They're captured per docs/assets/README.md. Run the installer + verification prompt and drop the PNGs there.

Expected indicators:

🔨 Hammer icon visible at bottom of chat input
Clicking it shows 4 tools: analyze_context, get_bell_curve, get_chunks, compare_alignment
Tool tooltips describe what each one does

Verify it works

Paste this into Claude Desktop:

Use the get_bell_curve tool to analyze this text: "The three little pigs built houses from straw, sticks, and bricks."

Claude should call the tool and respond with a bell curve summary (mean, σ, histogram shape).

Troubleshooting

| Symptom | Fix |
|---------|-----|
| No hammer icon | Check config JSON is valid. Restart Claude Desktop fully (quit, not just close window). |
| Tools greyed out | Check the npx path in your shell (which npx). Claude Desktop uses system PATH. |
| "Server disconnected" | The first npx -y install can take 30-60s. Wait and try again. |
| Config file missing | Create the directory (mkdir -p ~/Library/Application\ Support/Claude), then add the JSON above. |

For Cursor, Windsurf, Cline, or Claude Code — use the same JSON but in the respective client's config file (the installer does this automatically).

Need Help Diagnosing Context Degradation?

Seeing σ collapse, flattening bell curves, or early mean drift in your MCP workflow?

I offer AI System Diagnosis Consulting this week:

Attach the inspector to your existing MCP setup (Claude Desktop, Cursor, custom agents, etc.)
Run real-time analysis on your context flows
Deliver a prioritized report with root causes and fixes (refresh strategies, eviction rules, prompt hygiene, etc.)
Built from the research in the white paper

Book a 30-minute discovery call (free or low-cost) this week: cal.com/kevin-luddy-0dlzuu

Let's find what's actually wrong with your AI before your users do.

# AI-guided setup — the easiest way to start
npx contrarianai-context-inspector --setup

# Analyze a file from the command line
npx contrarianai-context-inspector conversation.txt --domain --verbose

# Web dashboard
npx contrarianai-context-inspector --serve

MCP integration — add to your .mcp.json (don't run --mcp manually):

{
  "mcpServers": {
    "context-inspector": {
      "command": "npx",
      "args": ["-y", "contrarianai-context-inspector", "--mcp"]
    }
  }
}

Context Inspector Dashboard: the bell curve at step 10 (healthy, right-shifted) vs step 15 (collapsed, flat). The graph warned 3 steps before the output failed.

The Problem

Most teams evaluate AI by checking the output. The answer looks right? Ship it.

But the context window can be structurally degraded while the output still appears correct. By the time the output fails, the context has already been compromised — and recovery may be impossible.

Context Inspector watches the context, not the output. It computes domain alignment distributions across every chunk and alerts when the bell curve starts to flatten — the statistical signature of context rot.
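To make "domain alignment across every chunk" concrete, here is a minimal sketch of one way to score chunks. It assumes a caller-supplied reference text that stands in for the domain, and it uses raw term-frequency cosine similarity in place of the TF-IDF weighting that core.js is described as using. All function names are hypothetical; the 500-character default mirrors the chunk size shown in the CLI example output.

```javascript
// Lowercase word tokens; a deliberately simple tokenizer for illustration.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Count how often each term occurs.
function termFrequencies(tokens) {
  const tf = new Map();
  for (const tok of tokens) tf.set(tok, (tf.get(tok) || 0) + 1);
  return tf;
}

// Cosine similarity between two term-frequency maps (0 when either is empty).
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const [term, count] of a) {
    normA += count * count;
    if (b.has(term)) dot += count * b.get(term);
  }
  for (const count of b.values()) normB += count * count;
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Split text into fixed-size character chunks.
function chunkText(text, size = 500) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) chunks.push(text.slice(i, i + size));
  return chunks;
}

// One alignment score per chunk, each in [0, 1].
function scoreChunks(text, reference, size = 500) {
  const refTf = termFrequencies(tokenize(reference));
  return chunkText(text, size).map((chunk) => cosine(termFrequencies(tokenize(chunk)), refTf));
}

console.log(scoreChunks('The pigs built houses of straw and bricks.', 'pigs straw sticks bricks'));
```

The list of per-chunk scores is the raw material for the bell curve: its mean and σ are what the rest of this README tracks.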

What It Measures

| Metric | What It Tells You |
|--------|------------------|
| Domain σ (standard deviation) | How consistently the context aligns with the target domain. Rising σ = contamination entering. Collapsing σ toward zero = original content gone. |
| Domain mean | Where the bell curve is centered. Leftward drift = domain alignment weakening. |
| Bell curve shape | Tight right-shifted = healthy. Bimodal = contamination present. Flat near zero = total rot. |
| Per-chunk scores | Individual measurements shown as a rug plot. See exactly which chunks are on-topic and which aren't. |

Plus: readability scores, sentiment, entropy, cosine similarity, N-grams, POS tagging, NER, LDA topics, BPE token counts, and the full statistical suite (skewness, kurtosis, percentiles, IQR, MAD, z-scores, trend detection, correlation, moving averages).
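For reference, the headline statistics can be computed from a list of per-chunk scores as in the sketch below. It uses population formulas (dividing by n) and excess kurtosis; whether the tool uses these or the sample-corrected variants is not stated here, so treat the exact conventions as an assumption. The `describe` name is hypothetical.

```javascript
// Summary statistics over per-chunk alignment scores.
// Assumption: population moments (divide by n) and excess kurtosis (normal = 0).
function describe(scores) {
  const n = scores.length;
  if (n === 0) throw new Error('need at least one score');
  const mean = scores.reduce((s, x) => s + x, 0) / n;
  const moment = (p) => scores.reduce((s, x) => s + (x - mean) ** p, 0) / n;
  const m2 = moment(2);
  const sorted = [...scores].sort((a, b) => a - b);
  const mid = Math.floor(n / 2);
  const median = n % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return {
    mean,
    sigma: Math.sqrt(m2),
    median,
    skewness: m2 === 0 ? 0 : moment(3) / m2 ** 1.5,
    kurtosis: m2 === 0 ? 0 : moment(4) / m2 ** 2 - 3,
  };
}

console.log(describe([0.29, 0.85, 0.83, 0.9, 1.0]));
```

A negative skewness, as in the example CLI output, means the tail of the distribution points left: most chunks score high and a few score low.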

From White Paper to Tool

"We found that context degradation follows predictable statistical patterns. The bell curve of domain alignment scores is a leading indicator that reveals structural decay before output quality degrades."
— Context Rot: Statistical Early Warning for AI System Degradation

The white paper documents a controlled experiment: feeding fairy tales through an AI system with a constrained context window, then progressively adding unrelated content (Cinderella, Christopher Columbus, the Battle of the Alamo) while monitoring the bell curve.

Key finding: The bell curve σ spiked at step 11 while the output still scored 0.85. Three steps later, the output collapsed to 0.00 — and never recovered. The graph saw it coming. The output didn't.
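One simple way to turn a σ spike into an alert is to compare each step's σ against a trailing baseline. The sketch below is an assumption about how such a check could look, not the white paper's method. The `firstSigmaSpike` name, its parameters, and the sample series are all illustrative; the numbers are made up and are not the experiment's data.

```javascript
// Return the zero-based index of the first step whose sigma exceeds the trailing
// baseline by more than max(k * baseline std dev, minDelta), or -1 if none does.
function firstSigmaSpike(sigmas, { baselineSteps = 5, k = 3, minDelta = 0.05 } = {}) {
  for (let i = baselineSteps; i < sigmas.length; i++) {
    const window = sigmas.slice(i - baselineSteps, i);
    const mean = window.reduce((s, x) => s + x, 0) / window.length;
    const variance = window.reduce((s, x) => s + (x - mean) ** 2, 0) / window.length;
    const threshold = mean + Math.max(k * Math.sqrt(variance), minDelta);
    if (sigmas[i] > threshold) return i;
  }
  return -1;
}

// Illustrative series only: ten quiet steps, then a jump.
const sigmaByStep = [0.10, 0.11, 0.10, 0.12, 0.11, 0.10, 0.11, 0.12, 0.11, 0.10, 0.30, 0.34, 0.31];
console.log(firstSigmaSpike(sigmaByStep)); // index 10, i.e. step 11 when counting from 1
```

The `minDelta` floor keeps a very quiet baseline from triggering on tiny fluctuations.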

Every metric and visualization in Context Inspector was designed to surface this specific signal. The tool is the research, productized.

Read the full white paper →

Quick Start

As an MCP Server (drop into any AI workflow)

Add to your .mcp.json or claude_desktop_config.json (then restart your MCP client):

{
  "mcpServers": {
    "context-inspector": {
      "command": "npx",
      "args": ["-y", "contrarianai-context-inspector", "--mcp"]
    }
  }
}

Note: Don't run --mcp manually in your terminal — it's a stdio server that MCP clients manage. Use --setup or --serve for interactive use.

Available MCP tools:

| Tool | Description |
|------|------------|
| analyze_context | Full analysis: domain/user alignment, stats, bell curve data, per-chunk breakdown |
| get_bell_curve | Quick bell curve stats (mean, σ, histogram) for domain or user alignment |
| get_chunks | Per-chunk scores with top-N highest and lowest scoring chunks |
| compare_alignment | Side-by-side domain vs user alignment comparison |
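MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for custom agents that speak the protocol directly. The envelope follows the MCP specification, but the `text` argument name is an assumption: check the input schema the server reports via `tools/list` before relying on it. `buildToolCall` is a hypothetical helper.

```javascript
// The four tool names documented above.
const TOOL_NAMES = ['analyze_context', 'get_bell_curve', 'get_chunks', 'compare_alignment'];

// Build a JSON-RPC 2.0 request for the MCP "tools/call" method.
function buildToolCall(id, name, args) {
  if (!TOOL_NAMES.includes(name)) throw new Error(`unknown tool: ${name}`);
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// "text" is a hypothetical argument name used only for illustration.
const request = buildToolCall(1, 'get_bell_curve', {
  text: 'The three little pigs built houses from straw, sticks, and bricks.',
});
console.log(JSON.stringify(request, null, 2));
```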

CLI

# Domain alignment (default) — reports σ, mean, histogram
npx contrarianai-context-inspector conversation.txt

# User alignment
npx contrarianai-context-inspector conversation.txt --user

# Custom chunk size
npx contrarianai-context-inspector conversation.txt --chunk-size 300

# Full JSON output (pipe to jq, store, compare)
npx contrarianai-context-inspector conversation.txt --json

# Per-chunk breakdown with visual bars
npx contrarianai-context-inspector conversation.txt --verbose

# Read from stdin
cat system-prompt.txt | npx contrarianai-context-inspector -

Example output:

  context-inspector — domain alignment

  Input:       4,696 chars, 10 chunks @ 500 chars
  Mean:        0.7867
  Std Dev:     0.2455  [moderate]
  Median:      0.8319
  Skewness:    -0.6421
  Kurtosis:    -0.3812
  Range:       0.2884 — 1.0000
  Alignment:   strong
  Narrative:   moderate bell curve (σ=0.2455): strong domain-aligned content.

  Distribution (domain):

  0.00 |
  0.05 |
  0.10 |
  0.15 |
  0.20 |#########
  0.25 |
  0.30 |
  ...
  0.85 |#########################################
  0.90 |##################
  0.95 |#########
  1.00
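The distribution view above is a text histogram over 0.05-wide rows. A minimal sketch of how per-chunk scores could be rendered this way, assuming each score is assigned to its nearest 0.05 row and bars are scaled to the tallest row. The binning rule, the bar width, and the `histogramRows` name are assumptions, not the CLI's actual implementation.

```javascript
// Render scores in [0, 1] as rows like "0.85 |#####".
// Assumption: a score goes to the nearest multiple of `step`.
function histogramRows(scores, { step = 0.05, maxWidth = 40 } = {}) {
  const bins = Math.round(1 / step) + 1;
  const counts = new Array(bins).fill(0);
  for (const s of scores) {
    const clamped = Math.min(1, Math.max(0, s));
    counts[Math.round(clamped / step)] += 1;
  }
  const peak = Math.max(...counts, 1); // avoid dividing by zero on empty input
  return counts.map((c, i) => {
    const bar = '#'.repeat(Math.round((c / peak) * maxWidth));
    return `${(i * step).toFixed(2)} |${bar}`;
  });
}

console.log(histogramRows([0.85, 0.85, 0.2]).join('\n'));
```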

Web Dashboard

# Analysis tool (port 4000)
npx contrarianai-context-inspector --serve

# Simulation dashboard (port 4001)
node sim/index.js

The web dashboard provides:

Interactive bell curve with mean line (red), ±1σ/±2σ bands (blue), Gaussian fit (green), rug plot (white dots)
Concentrator toggle (domain vs user)
Chunk size slider
Color-coded chunk view sorted by alignment score
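The Gaussian fit and σ bands in the chart follow directly from the mean and σ. A short sketch of the underlying arithmetic, assuming the bands are clamped to the [0, 1] score range; the clamping and the function names are assumptions about presentation, not the dashboard's code.

```javascript
// Normal probability density at x for the given mean and sigma.
function gaussianPdf(x, mean, sigma) {
  if (sigma <= 0) throw new Error('sigma must be positive');
  const z = (x - mean) / sigma;
  return Math.exp(-0.5 * z * z) / (sigma * Math.sqrt(2 * Math.PI));
}

// The +/-1 sigma and +/-2 sigma intervals around the mean, clamped to [0, 1].
function sigmaBands(mean, sigma) {
  const clamp = (v) => Math.min(1, Math.max(0, v));
  return {
    one: [clamp(mean - sigma), clamp(mean + sigma)],
    two: [clamp(mean - 2 * sigma), clamp(mean + 2 * sigma)],
  };
}

// Values from the example CLI output above (mean 0.7867, sigma 0.2455).
console.log(sigmaBands(0.7867, 0.2455));
console.log(gaussianPdf(0.7867, 0.7867, 0.2455));
```

With those example values the upper edge of both bands hits the clamp at 1, which is typical for a healthy right-shifted bell.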

Context Inspector vs MCP Inspector

| | Anthropic MCP Inspector | Context Inspector |
|---|---|---|
| Purpose | Debug MCP tool calls and responses | Monitor context window statistical health |
| Approach | Manual testing of individual tool calls | Proactive statistical monitoring across turns |
| What it shows | Tool input/output, error messages, latency | Bell curve shape, σ trend, domain drift, chunk scores |
| When to use | "Is my MCP tool returning the right data?" | "Is my context degrading before my output fails?" |
| Integration | Standalone debugging UI | MCP server that drops into any workflow |
| Detects | Tool errors, schema mismatches | Context rot, domain drift, contamination, information loss |

They're complementary. MCP Inspector verifies that tools work correctly. Context Inspector verifies that the context built from tool results remains structurally sound. Use both.

Simulation Framework

Context Inspector includes a simulation framework for testing context management strategies:

# Run 150 simulations (50 per scenario: RAG, multi-agent, support bot)
node sim/runner.js

# Run the context rot experiment (fairy tales + contamination)
node sim/rot-runner.js

# Run a specific story
node sim/rot-runner.js --story three_little_pigs

# Start the simulation dashboard
node sim/index.js

Pre-built scenarios:

| Scenario | What it simulates | Failure patterns injected |
|----------|------------------|--------------------------|
| RAG Pipeline | Retrieval drift, context accumulation | Irrelevant retrieval, context overflow |
| Multi-Agent | Coordinator + researcher + coder + reviewer | Coordination bloat, tool misroute, self-evaluation |
| Support Bot | Customer conversation with topic changes | Topic drift, sentiment escalation |
| Story Lessons | Context rot demonstration with fairy tales | Drop-oldest + resummarize, progressive contamination |

Each simulation stores per-step analysis data (domain/user stats, bell curve snapshots, lessons alignment scores) in SQLite for comparison across runs.
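The drop-oldest pattern named in the Story Lessons scenario is easy to picture in code. The sketch below shows a constrained window that evicts its oldest items once a character budget is exceeded, which is how original content can silently leave the context as unrelated material arrives. It is an illustration under assumed names and a character-based budget, not the simulation engine's implementation.

```javascript
// Append an item to the window, then drop the oldest items until the total
// length fits within maxChars. The newest item is always kept.
function appendWithDropOldest(window, item, maxChars) {
  const next = [...window, item];
  let total = next.reduce((sum, text) => sum + text.length, 0);
  while (next.length > 1 && total > maxChars) {
    total -= next.shift().length;
  }
  return next;
}

// Illustrative run: the original story is evicted as unrelated content arrives.
let context = ['pigs built houses'];
context = appendWithDropOldest(context, 'columbus sailed west', 40);
context = appendWithDropOldest(context, 'the alamo fell in 1836', 40);
console.log(context);
```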

Architecture

context-inspector/
├── core.js              # Analysis engine (TF-IDF, stats, NLP, scoring)
├── cli.js               # Command-line interface
├── mcp-server.js        # MCP server (stdio transport, 4 tools)
├── web-server.js        # Web UI server (port 4000)
├── web/index.html       # Analysis web UI
├── docs/
│   └── whitepaper.md    # Research paper
└── sim/
    ├── index.js          # Simulation dashboard server (port 4001)
    ├── runner.js          # Batch simulation runner
    ├── rot-runner.js      # Context rot experiment runner
    ├── story-runner.js    # Story lessons runner
    ├── engine.js          # Simulation engine
    ├── db.js              # SQLite storage
    ├── llm.js             # Anthropic API integration
    ├── scoring.js         # Vector alignment scoring
    ├── seed-rng.js        # Deterministic PRNG
    ├── scenarios/          # RAG, multi-agent, support bot, story-rot
    ├── content/            # Templates for simulation content
    ├── stories/            # Fairy tale texts + ground truth
    └── web/index.html      # Simulation dashboard UI

Dependencies

| Package | Purpose | Required? |
|---------|---------|:---------:|
| express | Web UI server | Yes |
| @modelcontextprotocol/sdk | MCP server | Yes |
| gpt-tokenizer | Exact BPE token counting | Optional (falls back to word estimate) |
| compromise | POS tagging, NER | Optional (skipped if not installed) |
| lda | Topic modeling | Optional (skipped if not installed) |
| sql.js | Simulation data storage | For simulations only |
| ws | WebSocket for live dashboard | For simulations only |
| @anthropic-ai/sdk | LLM calls in story experiments | For story simulations only |
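The optional-dependency behavior in the table can be pictured as a guarded `require`. This is a sketch of the general pattern only: `encode` is gpt-tokenizer's exported encoder, while the fallback shown here (counting whitespace-separated words) is an assumption about what "word estimate" means, and the helper names are hypothetical.

```javascript
// Fallback when gpt-tokenizer is not installed.
// Assumption: one token per whitespace-separated word; the tool's estimate may differ.
function wordEstimate(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Prefer exact BPE counts when the optional package is available.
function loadTokenCounter() {
  try {
    const { encode } = require('gpt-tokenizer'); // optional dependency
    return (text) => encode(text).length;
  } catch (err) {
    return wordEstimate;
  }
}

const countTokens = loadTokenCounter();
console.log(countTokens('The three little pigs built houses.'));
```

The same try-and-skip shape would apply to compromise and lda, except that those features are skipped entirely instead of falling back to an estimate.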

License

MIT

Bell looking weird? Get help.

→ Buy the Rapid Audit — $2,500 / 48hr turnaround

We run all five Bell Tuning tools against your AI's search + agent pipeline and ship you an 8-12 page PDF: bell curves showing what's happening under the hood, the silent bugs we found, a ranked fix list, plus a walkthrough call and 7 days of Q&A. Limited to 3 slots a week.

Want to talk first? → Free 30-min call

Context Inspector spots the bug. Fixing it — redesigning the search step, rewriting prompts, rebuilding how memory works — is where most teams get stuck.

contrarianAI does this work:

Context audits — we hook into your pipeline, find where the bell curve collapses, and show you why.
Architecture reviews — for AI that looks things up (RAG), multiple agents working together, or long-running chat. We pressure-test the parts you can't see failing yet.
Whitepaper-grade diagnostics — the same method documented in docs/whitepaper.md, applied to your system.

We find what's actually broken in your AI before your users do. → contrarianai-landing.onrender.com