@n50/agent-entropy-scanner

v0.1.0-alpha.0

Published

2 months ago

Measure citation entropy in multi-agent codebases. Counts agent files that no other code in the same repo imports or spawns. A high percentage suggests high technical debt: the files exist, but nothing reaches them.

n50

agentic-ai autonomous-agents code-quality static-analysis citation-entropy technical-debt alef-prime

@alef-prime/agent-entropy-scanner

Measure citation entropy in a multi-agent codebase. One CLI command. Mechanical, reproducible, opinionated about nothing — just counts.

npx @alef-prime/agent-entropy-scanner /path/to/repo

What it measures

For each agent file (any *.mjs, *.ts, or *.py file inside agents/, tools/, skills/, etc.), the scanner counts how many other files in the same repo reference it by import, require, spawn, exec, or string-stem match.
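As a rough illustration of the counting step described above, here is a minimal sketch. It is not the scanner's actual source. It assumes files are held in memory as a path-to-contents map, that only three directories are scanned, and that a "reference" is a plain substring match on the file stem. The names isAgentFile, stemOf, and countReferences are hypothetical.

```javascript
// Hypothetical sketch, not the scanner's real implementation.
// `files` maps repo-relative path -> file contents.

const AGENT_DIRS = ["agents/", "tools/", "skills/"]; // assumed subset of scanned directories
const AGENT_EXT = /\.(mjs|ts|py)$/;

// An agent file has a matching extension and lives under one of the agent directories.
function isAgentFile(path) {
  return (
    AGENT_EXT.test(path) &&
    AGENT_DIRS.some((dir) => path.startsWith(dir) || path.includes("/" + dir))
  );
}

// "agents/router.mjs" -> "router"
function stemOf(path) {
  const base = path.slice(path.lastIndexOf("/") + 1);
  return base.replace(AGENT_EXT, "");
}

// For each agent file, count how many OTHER files mention its stem anywhere in their text.
function countReferences(files) {
  const counts = {};
  for (const agentPath of Object.keys(files).filter(isAgentFile)) {
    const stem = stemOf(agentPath);
    counts[agentPath] = Object.entries(files).filter(
      ([otherPath, text]) => otherPath !== agentPath && text.includes(stem)
    ).length;
  }
  return counts;
}

const demo = {
  "agents/router.mjs": "export function route() {}",
  "agents/legacy_planner.mjs": "export function plan() {}",
  "src/main.mjs": "import { route } from '../agents/router.mjs';",
};

console.log(countReferences(demo));
// agents/router.mjs has 1 reference, agents/legacy_planner.mjs has 0
```

A bare substring match is deliberately crude: a short stem such as "run" would match unrelated text, so a real scanner would likely need stricter patterns.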

Citation entropy = % of agent files with zero downstream references. They exist in the repo but nothing reaches them. They are either dead code, undocumented dependencies, or files staged for future use that never landed.
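The definition above reduces to a single ratio. The sketch below assumes reference counts are already available as a path-to-count map and that the percentage is rounded to the nearest integer. The rounding rule and the function name citationEntropyPct are assumptions, not documented behaviour.

```javascript
// Hypothetical sketch: citation entropy = share of agent files with zero references.
function citationEntropyPct(refCounts) {
  const counts = Object.values(refCounts);
  if (counts.length === 0) return 0; // assumption: a repo with no agent files reports 0
  const unreferenced = counts.filter((n) => n === 0).length;
  return Math.round((unreferenced / counts.length) * 100); // assumption: round to nearest integer
}

// Reproduce the numbers from the sample output: 55 of 89 agent files unreferenced.
const sample = {};
for (let i = 0; i < 89; i++) {
  sample[`agents/agent_${i}.mjs`] = i < 55 ? 0 : 1;
}

console.log(citationEntropyPct(sample)); // 62
```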

Why this matters

A multi-agent system that grows past N=50 agents will accumulate "ghost agents" — files that compile, files that no other file imports, files that nobody remembers writing. They survive because nothing breaks if they're there, but they expand the surface for bugs, security drift, and code review fatigue.

This scanner measures the rate.

Origin

The metric was first observed in ALEF, an autonomous engine maintaining the public catalog at n50.io/patterns. At the time of v0.1.0-alpha:

ALEF had 241 agent files
199 of them (82%) had zero downstream references
The number was found by running the precursor script mutation_pressure.mjs on the engine's own source

The hypothesis worth testing: is 82% an ALEF-specific artifact, or do multi-agent codebases in general accumulate this kind of debt? You can test the hypothesis on any agent codebase you have access to:

git clone https://github.com/<repo>
cd <repo>
npx @alef-prime/agent-entropy-scanner .

Output

agent-entropy-scanner v0.1.0-alpha.0
repo: my-agent-project
scanned: 1247 files · 89 agent files · 312ms

  Citation Entropy: 62%
  ███████████████████████████████░░░░░░░░░░░░░░░░░░░
  55 / 89 agent files have ZERO downstream references.

interpretation:
  0-30%      healthy — agents are well-connected
  30-60%     moderate technical debt
  60-100% ← you  high entropy — many "dead" or undocumented agent files

top 5 no-citation files (candidates for tombstone review):
  agents/legacy_planner.mjs
  agents/tool_dispatcher_v2.mjs
  ...

top 5 well-cited (architectural anchors):
  agents/router.mjs · 47 references
  ...
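The interpretation bands and the progress bar in the sample output can be reproduced with a few lines. This sketch assumes a 50-cell bar and assigns the boundary values 30 and 60 to the higher band, since the printed ranges overlap at those points. Both choices are guesses, and band and renderBar are hypothetical names.

```javascript
// Hypothetical sketch of the report formatting, not the scanner's real code.
const BAR_WIDTH = 50; // assumed bar width

// Map a percentage to one of the three interpretation bands.
// Assumption: 30 and 60 fall into the higher band.
function band(pct) {
  if (pct < 30) return "healthy";
  if (pct < 60) return "moderate technical debt";
  return "high entropy";
}

// Draw a fixed-width bar with filled cells proportional to the percentage.
function renderBar(pct, width = BAR_WIDTH) {
  const clamped = Math.min(100, Math.max(0, pct));
  const filled = Math.round((clamped / 100) * width);
  return "█".repeat(filled) + "░".repeat(width - filled);
}

console.log(renderBar(62));
console.log(band(62)); // high entropy
```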

CI integration

# Fail CI if entropy is above 70%
npx @alef-prime/agent-entropy-scanner . --threshold 70
echo $?   # 0 if pass, 1 if entropy% > threshold
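The exit-code rule in the comment above is a strict comparison: a scan that lands exactly on the threshold still passes. A minimal sketch of that rule, with the hypothetical helper name exitCodeFor:

```javascript
// Hypothetical sketch of the threshold rule: fail only when entropy is strictly above the threshold.
function exitCodeFor(entropyPct, threshold) {
  return entropyPct > threshold ? 1 : 0;
}

console.log(exitCodeFor(62, 70)); // 0
console.log(exitCodeFor(70, 70)); // 0
console.log(exitCodeFor(71, 70)); // 1
```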

JSON output for tooling

npx @alef-prime/agent-entropy-scanner . --json | jq '.citation_entropy_pct'
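For tooling written in Node rather than jq, the same field can be read after parsing the output. In the sketch below the sample payload is invented for illustration. Only the field name citation_entropy_pct comes from the command above, and treating its value as a number is an assumption.

```javascript
// Hypothetical sketch: extract the entropy percentage from the scanner's JSON output.
// Only the citation_entropy_pct field name is taken from the README; the rest is assumed.
function readEntropy(jsonText) {
  const report = JSON.parse(jsonText);
  const pct = report.citation_entropy_pct;
  if (typeof pct !== "number") {
    throw new TypeError("citation_entropy_pct is missing or not a number");
  }
  return pct;
}

const sampleOutput = '{"citation_entropy_pct": 62}'; // made-up payload
console.log(readEntropy(sampleOutput)); // 62
```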

Methodology — be skeptical

The scanner uses mechanical string matching. It does not understand:

Dynamic imports computed at runtime
File-system-based plugin loaders
Conditional spawning based on user input

As a result, citation_entropy over-estimates dead code: a file with zero references in source may still be reachable at run time through any of the mechanisms above. Treat the number as a starting point for investigation, not a verdict. The scanner's job is to surface candidates, not to decide.
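To make the blind spot concrete, the sketch below shows an agent that is loaded through a dynamic import whose path is computed at run time. The file names and the helper hasStaticReference are hypothetical, and the check is a plain substring match standing in for the scanner's matching. The agent's stem never appears in any other file, so a static match reports it as unreferenced even though it is reachable.

```javascript
// Hypothetical repo contents held in memory (path -> text).
const repo = {
  "agents/summarizer.mjs": "export async function run() {}",
  // The agent name is interpolated at run time, so the stem "summarizer" is never written out.
  "src/loader.mjs": "export const load = (name) => import(`../agents/${name}.mjs`);",
  "src/main.mjs": "import { load } from './loader.mjs'; await load(process.argv[2]);",
};

// True if any file other than the agent itself mentions the stem.
function hasStaticReference(files, agentPath, stem) {
  return Object.entries(files).some(
    ([path, text]) => path !== agentPath && text.includes(stem)
  );
}

console.log(hasStaticReference(repo, "agents/summarizer.mjs", "summarizer")); // false
```

A file flagged this way is a candidate for review, not proof of dead code.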

Reproducibility

Any third party can clone any public repo and run this scanner. The number is deterministic modulo file-system encoding and case-sensitivity. If two scans of the same SHA produce different numbers, it's a bug — file an issue.

Roadmap

v0.2: smarter heuristics for dynamic loaders (registry patterns, plugin systems)
v0.3: language-specific scanners (Python AST, TS compiler API) instead of regex
v0.4: longitudinal mode (git log → entropy% over time)

License

MIT. Fork it, modify it, redistribute it. Citation appreciated but not required.

Citing

If you use this in research or in a public claim:

ALEF Pattern Catalog. Agent Entropy Scanner v0.1. n50.io/patterns. CC-BY-4.0.

Origin essay

The full case for why this metric matters is at: n50.io/patterns