excavate
v1.0.1
Dig up what is buried in your codebase.
Excavate
Dig up what is buried in your codebase.
Every codebase has files nobody wants to touch. The ones where a small bug fix turns into a week of archaeology. The ones that keep showing up in post-mortems. Excavate finds them — in seconds, with no setup.
npx excavate
Requires Node.js ≥ 18 and a git repository.
What you get
████
███ Excavate v0.1.0
██
█ your codebase, laid bare
scanning ./src ⠸
BEDROCK src/auth/legacy-session.ts 91 churn ▲ coverage ▲ authors:1
BEDROCK src/payments/stripe-v1.ts 84 coverage ▲ knowledge ▲
BEDROCK src/utils/formatters.js 78 complexity ▲
DEEP src/api/users.ts 62
DEEP src/components/DataTable.tsx 55
SURFACE src/hooks/useTheme.ts 34
CLEAR src/hooks/useAuth.ts 18
─────────────────────────────────────────────
files scanned 847 duration 4.2s
bedrock 34 deep 212
surface 189 clear 412
avg score 41 est. cleanup ~340h
─────────────────────────────────────────────
Each file gets a debt score from 0–100. Excavate uses geological language because debt accumulates in layers, oldest and most compressed at the bottom.
| Level       | Score  | What it means                      |
| ----------- | ------ | ---------------------------------- |
| bedrock     | 70–100 | ancient, untouchable, nobody dares |
| deep        | 40–70  | accumulated over years             |
| surface     | 20–40  | recent deposits, still manageable  |
| clear       | 0–20   | no significant accumulation        |
How the score is calculated
Excavate combines six signals into a single weighted score per file:
| Signal | Weight | What it's measuring |
| ------------ | ------ | ---------------------------------------------------- |
| churn | 25% | How often this file changes — instability indicator |
| coverage | 25% | How much of it is untested — risk with no safety net |
| complexity | 20% | Cyclomatic complexity — how hard it is to follow |
| knowledge | 15% | Bus factor — only one person has ever touched this |
| docs | 10% | Comment density + TODO/FIXME/HACK count |
| deps | 5% | Circular dependencies, stale packages, CVEs |
All signals are normalised to 0–100 before weighting. No magic — just git history, your AST, and your coverage report.
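As an illustration of the weighting described above, here is a minimal sketch that combines six already-normalised signals into one score. The weights come from the table; the `debtScore` helper, its rounding, and the input shape are assumptions made for this example, not Excavate's actual implementation.

```typescript
// Signal names and default weights are taken from the table above.
type Signal = "churn" | "coverage" | "complexity" | "knowledge" | "docs" | "deps";

const DEFAULT_WEIGHTS: Record<Signal, number> = {
  churn: 0.25,
  coverage: 0.25,
  complexity: 0.2,
  knowledge: 0.15,
  docs: 0.1,
  deps: 0.05,
};

// Hypothetical helper: combine per-signal values (each already normalised
// to 0-100) into a single weighted score, rounded to the nearest integer.
function debtScore(
  signals: Record<Signal, number>,
  weights: Record<Signal, number> = DEFAULT_WEIGHTS,
): number {
  let total = 0;
  for (const name of Object.keys(weights) as Signal[]) {
    total += signals[name] * weights[name];
  }
  return Math.round(total);
}

// Illustrative values only: 20 + 25 + 12 + 15 + 4 + 0
const example = debtScore({
  churn: 80,
  coverage: 100,
  complexity: 60,
  knowledge: 100,
  docs: 40,
  deps: 0,
});
console.log(example); // 76
```

Because every signal is on the same 0–100 scale and the weights sum to 1.0, the result stays in the 0–100 range as well.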
Getting started
# Run instantly with no install
npx excavate
# Or install globally
npm install -g excavate
Then point it at any JS/TS repo:
# Scan current directory
npx excavate
# Scan a specific path
npx excavate /path/to/repo
# Generate a shareable HTML report
npx excavate --report
open excavate-report/index.html
# Show only the worst 10 files
npx excavate --top 10
# Use only the last 30 days of git history
npx excavate --since 30
The HTML report
Pass --report to get a self-contained HTML file you can share with your team or management — no server, no external deps, just open it in a browser. Excavate bundles everything inline.
It includes:
- A D3 treemap where box size = lines of code, colour = debt level
- Click any file to see its full signal breakdown with plain-English explanations
- A sortable table of every scanned file
- Summary counts and an estimated cleanup hours figure
- A debt trend chart showing avg score over time (appears after 2+ scans)
- A comparison toggle — switch between Latest, vs Base (first-ever scan), and vs Last (previous scan) to see which files improved or worsened
npx excavate --report
open excavate-report/index.html
Trend tracking (automatic)
Every scan automatically saves a snapshot to excavate-report/history/. The first scan becomes your permanent base. Subsequent scans show a delta vs that base in the terminal:
baseline → today 52 → 46 avg delta −6 ✓
The HTML report (--report) shows a trend line chart and a toggle to compare files vs base or vs the previous scan.
History is kept at 5 snapshots by default (base always preserved + last 4). Configure with historyLimit. Disable entirely with history: false (useful in CI where reportDir is ephemeral).
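The retention rule above (base always preserved, plus the most recent snapshots up to the limit) can be sketched as follows. The `Snapshot` shape and the `pruneHistory` helper are hypothetical; the real files under excavate-report/history/ may be structured differently.

```typescript
// Hypothetical snapshot shape, reduced to what this sketch needs.
interface Snapshot {
  takenAt: string; // ISO timestamp of the scan
  avgScore: number;
}

// Keep the first-ever snapshot (the base) plus the most recent (limit - 1).
// Assumes `snapshots` is ordered oldest first.
function pruneHistory(snapshots: Snapshot[], limit = 5): Snapshot[] {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  if (snapshots.length <= limit) return [...snapshots];
  const base = snapshots.slice(0, 1);
  // slice(-0) would return the whole array, so handle limit === 1 explicitly.
  const recent = limit > 1 ? snapshots.slice(-(limit - 1)) : [];
  return [...base, ...recent];
}
```

With the default limit of 5 and seven stored scans, this keeps scan 1 (the base) and scans 4 to 7.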
Comparing scans (manual diff)
For ad-hoc comparison between any two saved JSON reports:
excavate . --output json
cp excavate-report/excavate-report.json baseline.json
# ... make code changes ...
excavate . --output json
excavate diff baseline.json excavate-report/excavate-report.json
Output shows which files got better or worse, with colour-coded deltas.
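To make the comparison concrete, here is a rough sketch of a per-file delta between two scans. The `FileScore` shape (`path`, `score`) is an assumption for illustration; check the actual JSON report for the real field names before relying on it.

```typescript
// Hypothetical, simplified report entries: only the fields this sketch needs.
interface FileScore {
  path: string;
  score: number;
}

interface Delta {
  path: string;
  before: number;
  after: number;
  delta: number; // positive = more debt than before, negative = improved
}

// Compare two scans and return changed files, worst regressions first.
// Files that appear in only one of the two reports are skipped.
function diffReports(base: FileScore[], current: FileScore[]): Delta[] {
  const before = new Map<string, number>();
  base.forEach((f) => before.set(f.path, f.score));

  const deltas: Delta[] = [];
  current.forEach((f) => {
    const old = before.get(f.path);
    if (old !== undefined && old !== f.score) {
      deltas.push({ path: f.path, before: old, after: f.score, delta: f.score - old });
    }
  });
  return deltas.sort((a, b) => b.delta - a.delta);
}
```

Sorting by delta puts the files that got worse at the top, which is usually what a reviewer wants to see first.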
GitHub Action
Post a debt summary comment on every pull request automatically:
# .github/workflows/excavate.yml
name: Debt Scan
on: [pull_request]
jobs:
  excavate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: sonic-sabers/excavate-action@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          fail-above: '70'
See sonic-sabers/excavate-action for full docs.
CI integration
Use --fail-above to gate your CI pipeline on debt score:
npx excavate --fail-above 70 --json
Exits with code 1 if the average score across all files exceeds the threshold. Pair with --json for machine-readable output.
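The gate described above boils down to a single comparison. This sketch assumes "exceeds" means strictly greater than, and that a null threshold (the failAbove default shown in the configuration) disables the gate; `exitCodeFor` is a hypothetical name used only here.

```typescript
// Hypothetical helper mirroring the --fail-above behaviour described above.
// Assumptions: "exceeds" is a strict comparison, and null means "no gate".
function exitCodeFor(avgScore: number, failAbove: number | null): 0 | 1 {
  if (failAbove === null) return 0;
  return avgScore > failAbove ? 1 : 0;
}
```

Under these assumptions an average of exactly 70 with --fail-above 70 would still pass.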
# .github/workflows/debt.yml
- name: Check technical debt
  run: npx excavate --fail-above 70 --json
Configuration
Excavate works out of the box with zero config. When you're ready to tune it, drop a .excavaterc in your repo root (or add an excavate key to package.json):
{
  "include": ["src/**/*.ts", "src/**/*.tsx"],
  "exclude": ["**/*.test.ts", "**/node_modules/**"],
  "weights": {
    "churn": 0.25,
    "coverage": 0.25,
    "complexity": 0.2,
    "knowledge": 0.15,
    "docs": 0.1,
    "deps": 0.05
  },
  "thresholds": {
    "bedrock": 70,
    "deep": 40,
    "surface": 20
  },
  "output": ["terminal", "html"],
  "reportDir": "./excavate-report",
  "failAbove": null,
  "history": true,
  "historyLimit": 5
}
Weights must sum to 1.0. See .excavaterc.example for the full list of options.
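Before committing a custom config, it can help to sanity-check it. The sketch below checks that weights sum to 1.0 (within a small floating-point tolerance) and maps a score to a level using the thresholds above. Both helpers are hypothetical, and treating each threshold as an inclusive lower bound is an assumption, since the score ranges in the level table share their boundary values.

```typescript
interface Thresholds {
  bedrock: number;
  deep: number;
  surface: number;
}

type Level = "bedrock" | "deep" | "surface" | "clear";

// Floating-point sums such as 0.1 + 0.2 are rarely exact, so compare
// against 1.0 with a tolerance instead of using strict equality.
function weightsSumToOne(weights: Record<string, number>, epsilon = 1e-9): boolean {
  const sum = Object.values(weights).reduce((a, b) => a + b, 0);
  return Math.abs(sum - 1) <= epsilon;
}

// Assumption: a score equal to a threshold belongs to the higher level.
function levelFor(
  score: number,
  t: Thresholds = { bedrock: 70, deep: 40, surface: 20 },
): Level {
  if (score >= t.bedrock) return "bedrock";
  if (score >= t.deep) return "deep";
  if (score >= t.surface) return "surface";
  return "clear";
}
```

With the default thresholds, the scores from the sample output map as expected: 91 is bedrock, 62 is deep, 34 is surface, and 18 is clear.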
All options
excavate [path] [options]
Arguments:
  path                    Repo to scan (default: current directory)
Options:
  -o, --output <formats>  Output formats: terminal, html, json (comma-separated)
  -r, --report            Shorthand for --output terminal,html
  --report-dir <dir>      Where to write reports (default: ./excavate-report)
  --fail-above <score>    Exit 1 if avg score exceeds this (useful in CI)
  --config <path>         Path to config file (default: auto-discover .excavaterc)
  --no-color              Disable terminal colours
  --json                  Shorthand for --output json
  --top <n>               Show only the top N worst files
  --since <days>          Git history window in days (default: 90)
  -v, --version           Show version
  -h, --help              Show help
Programmatic API
import { scan } from "excavate";
const result = await scan("/path/to/repo", {
  output: [],
  failAbove: null,
});
console.log(result.summary.avgScore);
console.log(result.files.filter((f) => f.level === "bedrock"));
Why not just ask an LLM to review your code?
You can. But there are four things an LLM cannot do:
1. It can't read your git history. Debt isn't just in how code looks — it's in how it behaves over time. A file that's been touched 60 times in 90 days by a single author is a risk that no static snapshot reveals. LLMs see one commit. Excavate sees the whole dig site.
2. It can't tell you where to look first. Ask an LLM to "find the worst file in this repo" and it'll pick something that looks messy. Excavate ranks by a weighted signal model (churn, coverage gap, complexity, bus factor, circular deps, and SATD, i.e. self-admitted technical debt such as TODO/FIXME/HACK comments) built on the same empirical research that Code Maat and DebtViz are based on. The worst-looking file is rarely the highest-risk file.
3. It can't fit your repo in context. A 50,000-line codebase won't fit in any context window. Summaries lose the signal. Excavate runs locally against the full repo — every file, every commit in the window, every import graph — and collapses it into a ranked list in seconds.
4. It produces no artefact you can act on. An LLM gives you a paragraph. Excavate gives you a score per file, a sortable table, a D3 treemap you can share with your manager, and a --fail-above CI gate that enforces debt thresholds on every PR. The insight becomes process.
Use LLMs to fix the files Excavate surfaces. Use Excavate to find them.
Why not SonarQube, Code Climate, or Plato?
| Tool | The gap |
| --- | --- |
| SonarQube | Requires a Java server, a running instance, and admin setup. Zero-config it is not. |
| Code Climate | SaaS. Your code leaves your network. Pricing starts at $10/seat. |
| Plato | JS-only, CJS, unmaintained since 2018. No git signals, no coverage integration. |
| ESLint | Lints syntax and style. Has no model of time, history, or who knows the code. |
| npm audit | CVEs only. One signal out of six. |
Excavate is the only tool that combines git history + AST complexity + coverage gaps + bus factor + circular deps + SATD into a single ranked score, runs in one npx command with no server or account, and produces a report your manager can open in a browser.
FAQ
Does it need any setup?
No. Excavate works on any git repo with JS or TS files — just run npx excavate. Coverage signal is optional — if no coverage report is found, that signal is skipped and weights are redistributed automatically.
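One plausible way to redistribute weights when a signal is skipped is to rescale the remaining ones proportionally so they still sum to 1.0. This is an assumption about how the redistribution could work, not a description of Excavate's internals; `redistributeWeights` is a hypothetical helper.

```typescript
// Drop the skipped signals and rescale the rest proportionally so the
// remaining weights sum to 1.0 again. Proportional rescaling is assumed here.
function redistributeWeights(
  weights: Record<string, number>,
  skipped: string[],
): Record<string, number> {
  const kept = Object.entries(weights).filter(([name]) => !skipped.includes(name));
  const total = kept.reduce((sum, [, weight]) => sum + weight, 0);
  if (total <= 0) throw new Error("no signals left to weight");

  const result: Record<string, number> = {};
  for (const [name, weight] of kept) {
    result[name] = weight / total;
  }
  return result;
}
```

With the default weights and coverage skipped, churn would move from 0.25 to roughly 0.33 and the other signals would scale up by the same factor.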
Does it send any data anywhere?
No, with one exception. All analysis runs locally; the only network call is npm audit for CVE data, which goes to the npm registry, the same call npm audit makes normally.
What about monorepos?
Pass the path to the sub-package: npx excavate packages/api.
Can I use it with Vitest / Jest / c8?
Yes — as long as your test runner outputs a coverage/coverage-summary.json or lcov.info, Excavate will pick it up automatically.
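For reference, a coverage-summary.json in the Istanbul json-summary format holds a "total" entry plus one entry per file, each with line counts. The sketch below turns those counts into an "untested" percentage per file. How Excavate actually reads the report is not specified here; `coverageGaps` is a hypothetical helper, and treating files with no measurable lines as fully covered is an assumption.

```typescript
// Subset of the Istanbul json-summary format used by this sketch.
interface Metric {
  total: number;
  covered: number;
}

interface FileCoverage {
  lines: Metric;
}

type CoverageSummary = Record<string, FileCoverage>;

// Return the percentage of uncovered lines (0-100) per file.
function coverageGaps(summary: CoverageSummary): Map<string, number> {
  const gaps = new Map<string, number>();
  for (const [file, entry] of Object.entries(summary)) {
    if (file === "total") continue; // aggregate row, not a file
    const { total, covered } = entry.lines;
    // Assumption: a file with no measurable lines has no coverage gap.
    gaps.set(file, total === 0 ? 0 : Math.round(100 * (1 - covered / total)));
  }
  return gaps;
}
```

A file with 10 of 40 lines covered would come out at a gap of 75, which is already on the same 0–100 scale as the other signals.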
Using excavate with AI assistants
A machine-readable summary of excavate's API, CLI flags, output schema, and configuration is available at llms.txt — following the llms.txt convention. AI assistants and coding tools can fetch this file to use excavate correctly without hallucinating flags or types.
npx excavate llms.txt # not a real command: fetch from GitHub raw or npm package
Direct URL (raw):
https://raw.githubusercontent.com/sonic-sabers/excavate/main/llms.txt
License
MIT
