erf-analyzer

v0.2.1

Published

4 months ago

embarrassing relative finder - identify dead code, broken dependencies, and complexity hotspots

danja

code-analysis dead-code refactoring dependency-graph mcp rdf

erf - Embarrassing Relative Finder

A relatively lightweight JavaScript code quality and dependency analysis tool that helps identify unused code, broken dependencies, isolated subgraphs, and complexity hotspots in your codebase. It has GUI, CLI, and MCP interfaces. Now you can determine who you shouldn't invite to the wedding.

When ERF tools may be useful:

Pre-refactoring planning - Know which files are hubs before making changes
Code cleanup - Find genuinely unused code
Understanding impact - See how many files depend on a module before modifying it
Tracking tech debt - Monitor health score and missing imports over time

Overview

erf analyzes JavaScript/Node.js projects to find code quality issues that might be embarrassing if discovered by others:

🔍 Unused Code Detection - Functions, classes, and modules that are never imported or called
🔗 Broken Dependencies - Missing imports, circular dependencies, unresolved modules
🏝️ Isolated Subgraphs - Code clusters with no connections to the main application
🔥 Complexity Hotspots - Files and functions with high cyclomatic complexity
📊 Dependency Health - Overall codebase health scores and metrics
🔄 Duplicate Detection - Find duplicate method/function names indicating potential code redundancy

Architecture

erf is designed as a standalone Node.js tool: a core analysis engine exposed through three interfaces.

Core Analysis Engine - JavaScript AST parsing and RDF-based dependency graph
MCP Server - Model Context Protocol integration for AI assistants
Web GUI - Interactive force-directed graph visualization
CLI - Markdown reports and analysis commands from the terminal

Status: 2025-10-02

It all appears to basically work on my setup; YMMV. It gives a very optimistic view of a particular codebase I know is messy, so take reports with a pinch of salt.

Quick Start

CLI

npm install -g erf-analyzer

# Generate comprehensive report (prints to stdout)
erf

# Save report to file
erf --file report.md

# Analyze specific directory
erf ./src

See sample-report.md for an example analysis report generated by running erf on this codebase.

GUI

From global install:

npm install -g erf-analyzer

# Launch GUI (starts API server on :3001 and Vite dev server on :3000)
erf-gui

Then open http://localhost:3000 in your browser.

From local clone:

git clone <repo-url>
cd erf
npm install
npm run dev

MCP

claude mcp add erf node /absolute/path/to/erf/bin/erf-mcp.js

Works for me: claude mcp add erf node /home/danny/hyperdata/erf/bin/erf-mcp.js

Later: claude mcp add erf npx erf-analyzer

File modified: /home/danny/.claude.json [project: /home/danny/hyperdata/erf]

Or manually edit your MCP configuration file:

Location: ~/.config/claude-code/mcp.json (Linux/Mac) or %APPDATA%\claude-code\mcp.json (Windows)

{
  "mcpServers": {
    "erf": {
      "command": "node",
      "args": ["/absolute/path/to/erf/bin/erf-mcp.js"]
    }
  }
}

Using `npx` for the Latest MCP Release

[mcp_servers.erf]
command = "npx"
args = ["--yes", "--package", "erf-analyzer@latest", "erf-mcp"]

Hyphenated binary: use erf-mcp (npx maps the npm name erf-analyzer to that executable).
First-run delay: npx downloads the package on first use; if your MCP client has a short startup timeout, prime the cache with a shell run of npx --yes --package erf-analyzer@latest erf-mcp or bump the client timeout.
Faster starts: once cached, npx launches almost instantly. Alternatively, install globally (npm install -g erf-analyzer) and point the client at the erf-mcp binary to skip npx entirely.
Pinning: swap @latest for a specific version if you need deterministic behaviour.

The following is what Claude suggested I put in CLAUDE.md after trying the MCP tools:

Using MCP ERF Tools for Codebase Analysis

The ERF (Embarrassing Relative Finder) MCP tools provide static analysis capabilities for understanding codebase structure and health:

Available Tools:

mcp__erf__erf_analyze - Generate comprehensive codebase statistics (files, modules, functions, imports, exports)
mcp__erf__erf_health - Get overall health score (0-100) with connectivity, structure, and quality metrics
mcp__erf__erf_dead_code - Find unreachable files and unused exports
mcp__erf__erf_isolated - Identify code subgraphs with no connection to entry points
mcp__erf__erf_hubs - Find hub files (core infrastructure that many files depend on)
mcp__erf__erf_functions - Analyze function/method distribution and complexity
mcp__erf__erf_duplicates - Find duplicate or similar method/function names

When to Use ERF Tools:

Before major refactoring - Use erf_health to get baseline metrics, erf_hubs to identify critical files needing extra testing
During code cleanup - Use erf_dead_code and erf_isolated to find candidates for removal, erf_duplicates to find redundant code
After architecture changes - Use erf_analyze to verify import/export structure, check connectivity
Identifying technical debt - Use erf_health to track missing imports, isolated files over time
Understanding unfamiliar codebases - Use erf_hubs to find the most important files to study first

Example Results from Semem:

Health Score: 63/100 (Good) - 567/589 files connected, 22 isolated, 53 missing imports
Top Hubs: Config.js (125 dependents), SPARQLHelper.js (77 dependents), Utils.js (50 dependents)
Dead Code: 0 dead files, 302 unused exports, 100% reachability
Functions: 5833 functions across 589 files (avg 9.9 per file)

These tools are faster and more accurate than text search for architectural questions, and they complement Gemini CLI, which is better for semantic code understanding.

Installation

npm install -g erf-analyzer

Or use locally:

npm install erf-analyzer
npx erf-analyzer

Usage

CLI

Default Command - Comprehensive Analysis:

# Run comprehensive analysis on current directory (prints to stdout)
erf
# or
erf .

# Analyze specific directory
erf /path/to/project

# Save report to file (default: erf-report.md)
erf --file
erf --file custom-report.md

# Export RDF graph as Turtle (default: erf.ttl)
erf --rdf
erf --rdf custom-graph.ttl

# Trace critical path from entry point
erf --entry bin/index.js

# Combine all flags
erf --file report.md --rdf graph.ttl --entry src/main.js

# Use custom config
erf --config .erfrc.json

The default command generates a comprehensive markdown report including:

Summary statistics (files, functions, imports, exports, dependencies)
Health score (0-100) with emoji indicator 🟢🟡🟠🔴 (now includes redundancy penalty)
Dead code analysis (reachable/dead files, unused exports, reachability %)
Duplicate methods analysis (cross-class, cross-file duplicates with redundancy score)
Top 5 largest files by lines of code
Critical path analysis (with --entry flag)
Actionable recommendations

Specific Analysis Commands:

# Full codebase analysis with JSON/RDF/stats output
erf analyze [directory]
erf analyze --format json
erf analyze --format rdf

# Find dead code (unreachable files and unused exports)
erf dead-code [directory]
erf dead-code --format text
erf dead-code --format json

# Generate health report
erf health [directory]

# Find isolated subgraphs (disconnected code)
erf isolated [directory]

# Find duplicate method/function names
erf duplicates [directory]
erf duplicates --threshold 3
erf duplicates --include-similar
erf duplicates --format json

# Launch interactive GUI visualization
erf show [directory]
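
As a rough illustration of what name-based duplicate detection involves, the sketch below groups function records by name and keeps the names that appear at least `threshold` times. The record shape, the `findDuplicateNames` helper, and the reading of `--threshold` as a minimum occurrence count are all assumptions made for illustration, not erf's actual implementation.

```javascript
// Hypothetical input: one record per function/method found while parsing.
const functionRecords = [
  { name: 'init', file: 'src/a.js' },
  { name: 'init', file: 'src/b.js' },
  { name: 'init', file: 'src/c.js' },
  { name: 'parse', file: 'src/a.js' },
  { name: 'parse', file: 'src/b.js' },
  { name: 'run', file: 'src/c.js' },
];

// Group records by name and keep names defined in at least `threshold` places.
// Treating the threshold as a minimum occurrence count is an assumption.
function findDuplicateNames(records, threshold = 2) {
  const filesByName = new Map();
  for (const { name, file } of records) {
    if (!filesByName.has(name)) filesByName.set(name, []);
    filesByName.get(name).push(file);
  }
  return [...filesByName.entries()]
    .filter(([, files]) => files.length >= threshold)
    .map(([name, files]) => ({ name, count: files.length, files }))
    .sort((a, b) => b.count - a.count); // most repeated names first
}

const duplicates = findDuplicateNames(functionRecords, 3);
console.log(duplicates.map((d) => `${d.name} x${d.count}`));
```

A shared name is only a hint of redundancy; two `init` methods on unrelated classes may be perfectly legitimate, so results like these are candidates for review rather than findings.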

MCP Server

# Start MCP server (stdio)
erf-mcp

# Or use via Claude Desktop / other MCP clients

Available MCP tools:

erf_analyze - Full codebase analysis
erf_dead_code - Find unused code
erf_isolated - Find disconnected modules
erf_health - Get health metrics
erf_hubs - Identify hub files (core infrastructure)
erf_functions - Analyze function/method distribution
erf_duplicates - Find duplicate/similar method names

Configuration

Create .erfrc.json in your project root:

{
  "entryPoints": ["src/index.js", "src/server.js"],
  "ignore": ["node_modules/**", "dist/**", "**/*.test.js"],
  "entryPointPatterns": [],
  "thresholds": {
    "complexity": 10,
    "minReferences": 1
  },
  "analyzers": {
    "deadCode": true,
    "complexity": true,
    "isolated": true
  }
}
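
One plausible way to combine such a file with built-in defaults is a shallow merge with special handling for the nested `thresholds` and `analyzers` objects. This is a minimal sketch only: `loadConfig` and `DEFAULTS` are hypothetical names, and the default values are copied from the example above rather than taken from ErfConfig.js.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Assumed defaults, mirroring the example config above.
const DEFAULTS = {
  entryPoints: [],
  ignore: ['node_modules/**', 'dist/**', '**/*.test.js'],
  entryPointPatterns: [],
  thresholds: { complexity: 10, minReferences: 1 },
  analyzers: { deadCode: true, complexity: true, isolated: true },
};

// Read .erfrc.json from the project root (if present) and merge it over
// the defaults. Nested objects are merged key by key; arrays are replaced.
function loadConfig(projectRoot) {
  const configFile = path.join(projectRoot, '.erfrc.json');
  let user = {};
  if (fs.existsSync(configFile)) {
    user = JSON.parse(fs.readFileSync(configFile, 'utf8'));
  }
  return {
    ...DEFAULTS,
    ...user,
    thresholds: { ...DEFAULTS.thresholds, ...user.thresholds },
    analyzers: { ...DEFAULTS.analyzers, ...user.analyzers },
  };
}
```

With this approach a project that only sets `"thresholds": { "complexity": 15 }` still gets the default `minReferences` and all analyzers enabled. Whether user-supplied arrays such as `ignore` should replace or extend the defaults is a design choice this sketch does not settle.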

Entry Point Detection

erf uses smart heuristics to distinguish genuine entry points from dead code:

Explicit entry points - Files listed in entryPoints config
Auto-detected entry points - Files with no incoming imports that match patterns:
- /bin/ - Binary/executable scripts
- /scripts?/ - Script directories
- server.js - Server entry points
- main.js - Main entry files
- .config.(js|mjs|cjs|ts) - Configuration files
- /(tests?|specs?|__tests__)/ - Test directories
- .(test|spec).(js|mjs|cjs|ts) - Test files
Dead code detection - Files with no incoming imports BUT with exports are considered dead code, not entry points

Custom entry point patterns can be added via entryPointPatterns config:

{
  "entryPointPatterns": ["/workers/", "\\.worker\\.js$"]
}

Patterns should be regular expression strings (without delimiters). They will be combined with the default patterns above.
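
The rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not erf's code: `classifyFile` and the file record shape are hypothetical, and the escaping and anchoring of the default patterns (for example requiring a path separator before `main.js`) is a guess at the intent of the list above.

```javascript
// Default patterns transcribed from the list above as regex source strings.
// Exact escaping and anchoring are assumptions.
const DEFAULT_ENTRY_PATTERNS = [
  '/bin/',
  '/scripts?/',
  '(^|/)server\\.js$',
  '(^|/)main\\.js$',
  '\\.config\\.(js|mjs|cjs|ts)$',
  '/(tests?|specs?|__tests__)/',
  '\\.(test|spec)\\.(js|mjs|cjs|ts)$',
];

// file: { path, incomingImports, exportCount } (hypothetical record shape)
function classifyFile(file, config = {}) {
  const { entryPoints = [], entryPointPatterns = [] } = config;

  // 1. Explicit entry points from config always win.
  if (entryPoints.includes(file.path)) return 'entry';

  // 2. Anything imported by another file is ordinary code.
  if (file.incomingImports > 0) return 'imported';

  // 3. Unimported files matching a default or custom pattern are entry points.
  const patterns = [...DEFAULT_ENTRY_PATTERNS, ...entryPointPatterns]
    .map((source) => new RegExp(source));
  if (patterns.some((re) => re.test(file.path))) return 'entry';

  // 4. Unimported files that export something are treated as dead code.
  //    The text above does not say what happens when there are no exports.
  return file.exportCount > 0 ? 'dead' : 'unknown';
}

console.log(classifyFile({ path: '/proj/bin/erf.js', incomingImports: 0, exportCount: 0 }));
console.log(classifyFile(
  { path: '/proj/src/workers/resize.js', incomingImports: 0, exportCount: 1 },
  { entryPointPatterns: ['/workers/'] },
));
```

Because the patterns are plain strings in JSON, backslashes must be doubled there (`"\\.worker\\.js$"`), exactly as in the config example above.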

RDF-Based Graph Model

erf uses RDF-Ext to model code structure as a semantic graph with a custom ontology:

@prefix erf: <http://purl.org/stuff/erf/> .
@prefix code: <http://purl.org/stuff/code/> .

# Nodes
erf:File, erf:Module, erf:Function, erf:Class, erf:Variable

# Edges
erf:imports, erf:exports, erf:calls, erf:extends, erf:references

# Properties
code:loc, code:complexity, code:lastModified, erf:isEntryPoint, erf:isExternal
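
To make the vocabulary above concrete, here is a small dependency-free sketch that writes a file node and its import edges as N-Triples lines. It deliberately does not use RDF-Ext, and the node IRI scheme (`http://example.org/erf-demo/...`), the `fileToTriples` helper, and the literal datatypes are assumptions for illustration; how erf actually names nodes and types its literals is not documented here.

```javascript
const ERF = 'http://purl.org/stuff/erf/';
const CODE = 'http://purl.org/stuff/code/';
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const XSD = 'http://www.w3.org/2001/XMLSchema#';

// Hypothetical IRI scheme for file nodes.
const fileIri = (relPath) => `http://example.org/erf-demo/${encodeURI(relPath)}`;

// Turn one file record into N-Triples lines using the erf: and code: terms.
function fileToTriples({ relPath, loc, imports = [], isEntryPoint = false }) {
  const subject = `<${fileIri(relPath)}>`;
  const lines = [
    `${subject} <${RDF_TYPE}> <${ERF}File> .`,
    `${subject} <${CODE}loc> "${loc}"^^<${XSD}integer> .`,
  ];
  if (isEntryPoint) {
    lines.push(`${subject} <${ERF}isEntryPoint> "true"^^<${XSD}boolean> .`);
  }
  for (const target of imports) {
    lines.push(`${subject} <${ERF}imports> <${fileIri(target)}> .`);
  }
  return lines;
}

console.log(fileToTriples({
  relPath: 'src/index.js',
  loc: 42,
  imports: ['src/util.js'],
  isEntryPoint: true,
}).join('\n'));
```

Each import becomes one `erf:imports` edge between two file nodes, which is what makes questions like "which files depend on this module" a simple graph query.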

Web GUI

Launch the interactive web interface:

# Start API server (port 3001)
npm run gui

# In another terminal, start dev server (port 3000)
npm run dev

Then open http://localhost:3000 in your browser.

Features:

Force-directed graph visualization with D3.js
Color-coded nodes: 🟢 Entry points, 🔴 Dead code, 🔵 Files, 🟣 External modules
Node size proportional to file size
Interactive: click nodes for details, drag to reposition
Search and filter controls
Real-time health scoring
Zoom/pan navigation
Click to view source code
Color-coded health indicators

Development

# Clone repository
git clone https://github.com/danja/erf.git
cd erf

# Install dependencies
npm install

# Run tests
npm test

# Run in development
npm run dev

# Build for production
npm run build

Project Structure

erf/
├── src/
│   ├── analyzers/          # Core analysis components
│   │   ├── FileScanner.js
│   │   ├── DependencyParser.js
│   │   ├── GraphBuilder.js
│   │   └── DeadCodeDetector.js
│   ├── config/
│   │   └── ErfConfig.js
│   ├── graph/
│   │   └── RDFModel.js     # RDF-Ext wrapper
│   └── utils/
│       └── ASTWalker.js
├── mcp/
│   ├── index.js            # MCP server entry
│   └── tools.js            # MCP tool definitions
├── ui/
│   ├── src/
│   │   ├── App.vue
│   │   ├── components/
│   │   └── stores/
│   └── vite.config.js
├── bin/
│   ├── erf.js              # CLI entry
│   └── erf-mcp.js          # MCP entry
├── tests/
├── .erfrc.json             # Default config
└── package.json

Technical Approach

Phase 1: Discovery & Parsing

Scan filesystem respecting .gitignore and config patterns
Parse JavaScript files with Babel to generate AST
Extract imports, exports, function calls from AST
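
Once import specifiers have been pulled out of the AST, each one has to be sorted into "local file" or "external package" before it can become a graph edge. The sketch below shows one simple way to do that with Node's `path` module. `classifyImport` is a hypothetical helper, and it skips extension and `index.js` resolution, which a real resolver would need.

```javascript
const path = require('node:path');

// Decide whether an import specifier points at a local file or a package.
// importingFile is the absolute path of the file containing the import.
function classifyImport(specifier, importingFile) {
  const isLocal =
    specifier === '.' ||
    specifier === '..' ||
    specifier.startsWith('./') ||
    specifier.startsWith('../') ||
    path.isAbsolute(specifier);

  if (isLocal) {
    // Resolve relative to the directory of the importing file.
    const resolved = path.resolve(path.dirname(importingFile), specifier);
    return { kind: 'local', resolved };
  }

  // Bare specifier: the package name is the first segment,
  // or the first two segments for scoped packages (@scope/name).
  const segments = specifier.split('/');
  const packageName = specifier.startsWith('@')
    ? segments.slice(0, 2).join('/')
    : segments[0];
  return { kind: 'external', packageName };
}

console.log(classifyImport('../graph/RDFModel.js', '/proj/src/analyzers/GraphBuilder.js'));
console.log(classifyImport('@babel/parser', '/proj/src/analyzers/DependencyParser.js'));
```

Imports that resolve to a path with no matching file on disk are the natural candidates for a "missing import" report.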

Phase 2: Graph Construction

Build RDF graph with files/modules/functions as nodes
Create edges for imports, exports, calls, references
Identify entry points from config

Phase 3: Analysis

Dead code: Traverse from entry points, mark unreachable nodes
Isolated subgraphs: Find connected components with no entry point
Complexity: Calculate metrics per file/function
Health scores: Aggregate metrics into overall scores
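
The first two analyses can be sketched on a plain adjacency map instead of an RDF graph. The code below is an illustration only: the `Map` shape and the function names are assumptions, and erf's own implementation works against its RDF model. Dead files are found with a breadth-first traversal along import edges from the entry points; isolated subgraphs are connected components (ignoring edge direction) that contain no entry point.

```javascript
// Hypothetical graph: file -> files or packages it imports.
const graph = new Map([
  ['src/index.js', ['src/util.js', 'rdf-ext']],
  ['src/util.js', []],
  ['src/old/a.js', ['src/old/b.js']],
  ['src/old/b.js', ['src/old/a.js']],
  ['src/orphan.js', ['src/util.js']],
]);

// Breadth-first traversal from the entry points along import edges.
function findReachable(graph, entryPoints) {
  const seen = new Set(entryPoints.filter((e) => graph.has(e)));
  const queue = [...seen];
  for (let i = 0; i < queue.length; i++) {
    for (const dep of graph.get(queue[i])) {
      // Skip external packages, which are not keys of the graph.
      if (graph.has(dep) && !seen.has(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  return seen;
}

function findDeadFiles(graph, entryPoints) {
  const reachable = findReachable(graph, entryPoints);
  return [...graph.keys()].filter((file) => !reachable.has(file));
}

// Connected components of the undirected graph that contain no entry point.
function findIsolatedSubgraphs(graph, entryPoints) {
  const neighbours = new Map([...graph.keys()].map((f) => [f, new Set()]));
  for (const [file, deps] of graph) {
    for (const dep of deps) {
      if (!neighbours.has(dep)) continue;
      neighbours.get(file).add(dep);
      neighbours.get(dep).add(file);
    }
  }
  const entries = new Set(entryPoints);
  const visited = new Set();
  const isolated = [];
  for (const start of neighbours.keys()) {
    if (visited.has(start)) continue;
    const component = [start];
    visited.add(start);
    for (let i = 0; i < component.length; i++) {
      for (const next of neighbours.get(component[i])) {
        if (!visited.has(next)) {
          visited.add(next);
          component.push(next);
        }
      }
    }
    if (!component.some((f) => entries.has(f))) isolated.push(component);
  }
  return isolated;
}

console.log(findDeadFiles(graph, ['src/index.js']));
console.log(findIsolatedSubgraphs(graph, ['src/index.js']));
```

In this example `src/orphan.js` is dead (nothing reachable imports it) but not isolated, because it still imports `src/util.js`; the two `src/old/` files are both dead and isolated. That distinction is why the two analyses are reported separately.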

Phase 4: Output

CLI: Text reports with colored output
JSON: Structured data for CI/CD integration
RDF: Export graph for further semantic analysis
HTML: Interactive visualization

Roadmap

[x] Project initialization and structure
[x] FileScanner implementation
[x] DependencyParser (ES modules + CommonJS)
[x] RDF graph model wrapper
[x] GraphBuilder for dependency graph construction
[x] DeadCodeDetector algorithm
[x] CLI interface with Commander.js
[x] MCP server with stdio protocol
[x] Web GUI with Vite + vanilla JS + D3.js
[ ] TypeScript support
[ ] Multi-language support (Python, Go, etc.)
[ ] CI/CD integration examples
[ ] VSCode extension

Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE file for details.

Status 2025-09-30

Completed ✅

Project Initialization
- Created package.json with all dependencies (@babel/parser, acorn, rdf-ext, commander, etc.)
- Set up directory structure (src/, mcp/, ui/, tests/, bin/)
- Created .erfrc.json default configuration
ErfConfig.js - Configuration Management
- Loads user config from .erfrc.json
- Merges with sensible defaults
- Provides validation methods
- Location: /home/danny/hyperdata/erf/src/config/ErfConfig.js
FileScanner.js - Filesystem Analysis
- Walks directory tree recursively
- Respects .gitignore patterns using ignore library
- Supports custom ignore patterns from config
- Returns file info with stats (size, mtime, etc.)
- Location: /home/danny/hyperdata/erf/src/analyzers/FileScanner.js
DependencyParser.js - AST-Based Dependency Extraction
- Parses JavaScript files using @babel/parser
- Extracts ES module imports/exports (import, export)
- Extracts CommonJS dependencies (require, module.exports)
- Handles dynamic imports and require() with variables
- Resolves relative import paths to absolute paths
- Distinguishes external packages from local files
- Implements caching for performance
- Location: /home/danny/hyperdata/erf/src/analyzers/DependencyParser.js
- 350+ lines of production-ready code
RDFModel.js - RDF-Ext Wrapper ✅
- Wraps RDF-Ext library with convenience methods
- Implements custom erf ontology (erf:, code: namespaces)
- Provides methods to add nodes (files, modules, functions, classes)
- Provides methods to add edges (imports, exports, calls, references)
- Query interface for graph analysis (by type, imports, exports, entry points)
- Export to N-Quads format and graph statistics
- Location: /home/danny/hyperdata/erf/src/graph/RDFModel.js
- 450+ lines with full CRUD operations on RDF graph
GraphBuilder.js - Dependency Graph Construction ✅
- Orchestrates FileScanner + DependencyParser
- Builds complete RDF dependency graph
- Identifies and marks entry points from config or package.json
- Handles external package detection
- Exports in multiple formats (json, rdf, stats)
- Location: /home/danny/hyperdata/erf/src/analyzers/GraphBuilder.js
- 4-phase build process with detailed logging
DeadCodeDetector.js - Reachability Analysis ✅
- Graph traversal from entry points using BFS
- Marks all reachable nodes
- Identifies dead files (unreachable from entry points)
- Detects unused exports
- Calculates reachability percentage
- Generates human-readable reports
- Location: /home/danny/hyperdata/erf/src/analyzers/DeadCodeDetector.js
- Path finding capability to trace why code is dead
CLI Interface - Commander.js Commands ✅
- erf analyze - Full codebase analysis with multiple output formats
- erf dead-code - Find unreachable code (text/json output)
- erf health - Generate health report with score (0-100)
- erf isolated - Find isolated subgraphs
- erf init - Create default .erfrc.json config
- Location: /home/danny/hyperdata/erf/bin/erf.js
- 250+ lines with full argument parsing and error handling
Vitest Test Suite ✅
- 40/43 tests passing (93% pass rate)
- Unit tests: ErfConfig (6/6), FileScanner (4/4), DependencyParser (10/10), RDFModel (13/13)
- Integration tests: Full analysis (7/10) using erf's own codebase as test target
- Test configuration: vitest.config.js with coverage reporting
- Test documentation: tests/README.md with comprehensive testing philosophy
- Dogfooding approach: Tests analyze erf itself to validate functionality
- Locations: tests/unit/**/*.test.js, tests/integration/**/*.test.js
- Run with: npm test, npm run test:unit, npm run test:integration, npm run test:coverage
MCP Server Interface ✅
- Full stdio protocol server using @modelcontextprotocol/sdk
- 4 MCP tools: erf_analyze, erf_dead_code, erf_health, erf_isolated
- Rich formatted responses with statistics and recommendations
- Health scoring (0-100) with visual indicators (🟢🟡🟠🔴)
- Error handling with stack traces
- Location: /home/danny/hyperdata/erf/mcp/index.js
- Entry point: /home/danny/hyperdata/erf/bin/erf-mcp.js
- Claude Code integration: claude mcp add erf node /path/to/erf/bin/erf-mcp.js
Development Documentation ✅
- Comprehensive CLAUDE.md with architecture, patterns, troubleshooting
- Code examples for common operations
- MCP tool documentation
- Testing guidelines and known issues
- Contributing guidelines and resources
Web GUI ✅
- Vite + vanilla JS + D3.js (no framework dependencies)
- Force-directed graph visualization with physics simulation
- Interactive filtering and search
- Node click to show detailed information
- Color-coded health indicators (red/orange/yellow/green)
- Node size proportional to file size
- Real-time analysis via Express API wrapper
- Successfully tested with erf analyzing its own codebase
- Location: /home/danny/hyperdata/erf/ui/
- Run with: npm run gui (API) + npm run dev (UI)

Technical Decisions

RDF-based graph model: Chosen for semantic flexibility and future extensibility
Babel parser: More robust than acorn for modern JavaScript syntax
Modular architecture: Each analyzer is independent and testable
Plugin-ready: Designed to support additional languages and analyzers
MCP integration: First-class support for AI assistant integration

Performance Considerations

DependencyParser uses caching to avoid reparsing unchanged files
FileScanner respects .gitignore to avoid scanning unnecessary files
RDF graph operations will use indexed queries for efficiency
Large codebases will support incremental analysis mode
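
A parse cache of the kind mentioned in the first point could look roughly like this. It is a sketch under assumptions: keying on the file's modification time is one common invalidation strategy, not necessarily the one DependencyParser uses, and `createParseCache` is a hypothetical helper.

```javascript
const fs = require('node:fs');

// Wrap an expensive parse function so that a file is only reparsed
// when its modification time has changed since the last call.
function createParseCache(parseFn) {
  const cache = new Map(); // filePath -> { mtimeMs, result }
  return function parseCached(filePath) {
    const { mtimeMs } = fs.statSync(filePath);
    const hit = cache.get(filePath);
    if (hit && hit.mtimeMs === mtimeMs) return hit.result;

    const source = fs.readFileSync(filePath, 'utf8');
    const result = parseFn(source, filePath);
    cache.set(filePath, { mtimeMs, result });
    return result;
  };
}
```

Usage would be along the lines of `const parse = createParseCache(myParser)` followed by `parse('/path/to/file.js')`, where `myParser` stands in for whatever function produces the dependency data. A content hash is a more robust key than `mtimeMs` when files may be rewritten without changing, at the cost of reading every file on each run.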

Test Results

$ npm test

Test Files  5 passed (5)
     Tests  40 passed | 3 failed (43)
  Duration  1.99s

✓ tests/unit/config/ErfConfig.test.js (6/6 passing)
✓ tests/unit/analyzers/FileScanner.test.js (4/4 passing)
✓ tests/unit/analyzers/DependencyParser.test.js (10/10 passing)
✓ tests/unit/graph/RDFModel.test.js (13/13 passing)
⚠ tests/integration/full-analysis.test.js (7/10 passing)

Passing Integration Tests:

✅ Analyze erf codebase and build complete dependency graph
✅ Identify external dependencies (rdf-ext, commander, etc.)
✅ Export graph in JSON format
✅ Export graph statistics
✅ Generate dead code report
✅ Detect circular dependencies
✅ Handle empty entry points gracefully

Known Limitations (3 failing tests):

Entry point detection needs glob pattern support
Import path resolution incomplete (affects reachability analysis)
These are expected for the current phase and will be addressed in future iterations.

Notes

Following semem project patterns for MCP integration
Configuration follows common patterns (.erfrc.json similar to .eslintrc)
CLI uses Commander.js for consistency with Node.js ecosystem
GUI is vanilla JS + D3.js (no Vue framework)
Core analysis engine is complete and tested (93% pass rate)
Dogfooding: erf successfully analyzes its own codebase
Ready for MCP server implementation next
All CLI commands working with proper error handling and output formatting