codebase-dna
v1.3.0
Codebase intelligence layer for AI coding agents - auto-discovers conventions, architecture, and static contract signals via MCP
codebase-dna 🧬
The Codebase Intelligence Layer for AI Coding Agents
codebase-dna is a local MCP (Model Context Protocol) server that statically analyzes your JavaScript/TypeScript and PHP/Laravel codebase to extract static evidence about local patterns. It serves this evidence to AI coding agents, helping them generate code that aligns with your project's conventions, architecture, and static contract signals.
Runs 100% locally. Zero cost.
Documentation
Start with docs/getting-started.md if you are new to the project.
Detailed docs:
- Getting Started
- Configuration
- MCP Tools and CLI Commands
- Trust Model and Limits
- Benchmarks
- Security Notes
- Troubleshooting
What Is It All About?
Existing AI coding agents (like Cursor, Claude Code, GitHub Copilot CLI, Windsurf) can read files, but they don't natively infer the implicit rules of your codebase. codebase-dna acts as a static-analysis bridge by extracting three layers of evidence:
- Conventions — Naming patterns, async style, error handling, export styles, and more.
- Architecture — Module boundaries, allowed import directions, and layer classification (e.g., flagging services that import directly from the DB layer when explicit rules say they should not).
- Static Contract Signals — Function signatures, parameter types, return annotations, side effects, throws, guard clauses, confidence, and extraction provenance.
Agents query this information via MCP before writing code, turning them from generic code generators into "team-aware contributors".
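To make the third layer more concrete, here is a hypothetical TypeScript shape for a single static contract signal. The type and field names (ContractSignal, InferredSignal, sideEffects, provenance, and so on) are assumptions drawn from the list above, not the schema codebase-dna actually returns.

```typescript
// Hypothetical shape of one static contract signal. Every name here is an
// illustrative assumption, not codebase-dna's real output schema.
type Confidence = "high" | "medium" | "low";

interface InferredSignal {
  value: string;          // e.g. "writes to database"
  confidence: Confidence; // how strongly static analysis supports the claim
  provenance: string;     // where the signal was extracted from
}

interface ContractSignal {
  symbol: string;
  parameters: { name: string; type: string | null }[];
  returnAnnotation: string | null;
  sideEffects: InferredSignal[];
  throws: InferredSignal[];
  guards: InferredSignal[];
}

// An agent might keep only the well-supported inferences before relying on them.
function highConfidence(signals: InferredSignal[]): InferredSignal[] {
  return signals.filter((s) => s.confidence === "high");
}

const exampleContract: ContractSignal = {
  symbol: "chargeCustomer",
  parameters: [{ name: "amount", type: "number" }],
  returnAnnotation: "Promise<void>",
  sideEffects: [
    { value: "writes to database", confidence: "high", provenance: "src/pay.ts:12" },
    { value: "sends email", confidence: "low", provenance: "src/pay.ts:30" },
  ],
  throws: [],
  guards: [],
};
```

Keeping confidence and provenance next to each inferred value lets a consumer separate syntax-level facts (parameters, return annotations) from heuristic ones.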
The Problem It Solves
When you tell an AI agent to "Add a payment processing service", it might:
- ✗ Use function declarations instead of your team's preferred arrow functions.
- ✗ Use console.log instead of your structured logger.
- ✗ Name the file PaymentService.ts instead of payment-service.ts.
- ✗ Violate architecture by importing database modules directly into a web route.
Why existing solutions fall short:
- AGENTS.md / .cursorrules: Require manual updates and go stale quickly.
- ESLint / Prettier: Enforce configured rules, but don't discover what the codebase actually does.
- Codegraph / GitNexus: Map import relationships but miss conventions and static contract signals.
- Full code intelligence platforms (Sourcegraph, SonarQube, CodeScene, etc.): Much broader and deeper, but heavier to adopt.
codebase-dna aims to be the local, lightweight MCP context layer that agents can query before edits.
What codebase-dna does differently:
- Auto-discovery: Zero manual rule writing. It scans and summarizes evidence from your existing code.
- Pre-generation Signals: Agents query before generating code, reducing avoidable mistakes.
- File-change Updates: Refreshes file-level contract, import, and call signals quickly, with full scans for conventions and inferred architecture.
- MCP-Native: Purpose-built for AI agents via the standard Model Context Protocol.
Setup & Installation
codebase-dna connects directly to your AI coding agent (Cursor, Claude Desktop, Windsurf, Antigravity, etc.) via the Model Context Protocol (MCP).
Prerequisites
- Node.js (v20+): Required to run the server. Check your version by running node -v in your terminal. If you don't have it, download it from nodejs.org.
Installation Steps
Step 1: Open your AI Agent's MCP Settings
Locate your AI tool's MCP configuration file. This is usually named mcp.json, claude_desktop_config.json, or found inside the tool's settings menu under "MCP Servers".
Step 2: Add the Configuration
[!IMPORTANT]
codebase-dna needs to know which project directory to scan. By default it uses the server process's working directory (cwd), but many IDEs set this to their own installation folder, not your project. You must tell the server your project root using one of the methods below.
Option A: Workspace-specific config (Recommended for Cursor / VS Code)
Place this in your workspace-level MCP settings (e.g., .cursor/mcp.json or .vscode/mcp.json). The IDE will typically set cwd to the workspace root automatically:
{
  "mcpServers": {
    "codebase-dna": {
      "command": "npx",
      "args": ["-y", "codebase-dna@latest", "serve"]
    }
  }
}
Option B: Global config with explicit --rootDir (Recommended for Antigravity / global setups)
If your IDE only supports a global MCP config, you must pass --rootDir to tell the server which project to scan:
{
  "mcpServers": {
    "codebase-dna": {
      "command": "npx",
      "args": [
        "-y",
        "codebase-dna@latest",
        "serve",
        "--rootDir", "C:/Projects/your-project"
      ]
    }
  }
}
[!NOTE]
Replace C:/Projects/your-project with the absolute path to the project you want to analyze. When switching projects, update this path accordingly.
Option C: Environment variable
You can also set the CODEBASE_DNA_ROOT environment variable instead of using --rootDir:
{
  "mcpServers": {
    "codebase-dna": {
      "command": "npx",
      "args": ["-y", "codebase-dna@latest", "serve"],
      "env": {
        "CODEBASE_DNA_ROOT": "C:/Projects/your-project"
      }
    }
  }
}
Priority order: --rootDir flag > CODEBASE_DNA_ROOT env var > process.cwd().
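The priority order can be sketched as a small pure function. resolveRootDir is a hypothetical helper written for illustration; it is not part of codebase-dna's API, and the server's real argument parsing may differ (for example, it may also accept a --rootDir=value form).

```typescript
// Sketch of the documented precedence:
//   --rootDir flag > CODEBASE_DNA_ROOT env var > process.cwd()
// resolveRootDir is a hypothetical name; inputs are passed in explicitly so
// the function stays free of Node.js globals.
function resolveRootDir(
  argv: string[],
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  // 1. An explicit flag wins when it is followed by a value.
  const flagIndex = argv.indexOf("--rootDir");
  const flagValue = flagIndex >= 0 ? argv[flagIndex + 1] : undefined;
  if (flagValue) return flagValue;

  // 2. Otherwise fall back to the environment variable.
  const envValue = env["CODEBASE_DNA_ROOT"];
  if (envValue) return envValue;

  // 3. Finally use the working directory the server was started in.
  return cwd;
}
```

With a global config, step 3 is the case that goes wrong: the working directory belongs to the IDE, so one of the first two sources has to be set.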
Step 3: Restart and Scan
- Restart your AI application to load the new configuration.
- Open a chat in your AI tool and say: "Please use the dna_scan tool to refresh analysis for this project."
- After you review the initial report/context and trust the current state, run dna_accept_baseline once so future dna_verify checks have a trusted comparison point.
[!NOTE]
Generated Files: Once the scan runs successfully, codebase-dna will automatically create a .codebase-dna folder in your project directory to store the analyzed knowledge. Add .codebase-dna/ to your project's .gitignore file.
Troubleshooting
Symptoms of wrong project root:
- The scan runs but finds files from the IDE's own installation directory
- dna_conventions returns examples pointing to IDE internal files (e.g., resources/app/extensions/...)
- dna_boundaries only finds one layer that doesn't match your project
- dna_contract can't find your project's functions
- dna_can_import rejects project files with "path traversal" errors
- No .codebase-dna/ folder appears in your project directory
Fix: Add --rootDir to your MCP config args pointing to your project's absolute path (see Option B above).
Configuration (Optional)
codebase-dna works out of the box with zero configuration, but you can customize its behavior by creating a codebase-dna.config.json file in your project root:
{
  "include": ["**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs,php}"],
  "exclude": [
    "**/node_modules/**",
    "**/vendor/**",
    "**/dist/**",
    "**/*.test.*",
    "**/__tests__/**"
  ],
  "conventionThreshold": 0.75,
  "architectureMode": "strict",
  "boundaryOverrides": [
    {
      "layer": "services",
      "allowedImports": ["repositories", "utils"]
    }
  ]
}
[!NOTE]
include controls which files are considered for scanning, but only supported languages are parsed. Adding extensions such as py to include will not enable Python analysis until a Python parser is implemented.
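One plausible reading of conventionThreshold is "the minimum share of observed samples a pattern needs before it is reported as a convention". The sketch below applies that reading to file naming. Both the interpretation and the helper names (classifyFileName, dominantConvention) are assumptions for illustration, not codebase-dna's actual algorithm.

```typescript
// Assumption: a pattern counts as a convention once its share of the observed
// samples reaches the threshold. This is an illustrative reading of
// conventionThreshold, not the tool's real implementation.
type NamingStyle = "kebab-case" | "PascalCase" | "camelCase" | "other";

function classifyFileName(fileName: string): NamingStyle {
  const stem = fileName.replace(/\.[^.]+$/, ""); // drop the last extension
  if (/^[a-z0-9]+(-[a-z0-9]+)+$/.test(stem)) return "kebab-case";
  if (/^[A-Z][a-zA-Z0-9]*$/.test(stem)) return "PascalCase";
  if (/^[a-z][a-zA-Z0-9]*$/.test(stem)) return "camelCase";
  return "other";
}

// Returns the first style whose share meets the threshold, or null when no
// style is dominant enough to call a convention.
function dominantConvention(
  fileNames: string[],
  threshold: number,
): NamingStyle | null {
  if (fileNames.length === 0) return null;
  const styles: NamingStyle[] = ["kebab-case", "PascalCase", "camelCase"];
  for (const style of styles) {
    const count = fileNames.filter((n) => classifyFileName(n) === style).length;
    if (count / fileNames.length >= threshold) return style;
  }
  return null;
}
```

Under this reading, a threshold of 0.75 means three kebab-case files out of four are enough to report kebab-case, while an even split reports nothing.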
Available MCP Tools
Once connected, your AI agent will have access to 14 native tools:
- dna_scan: Refreshes analysis without changing the verification baseline.
- dna_conventions: Retrieves discovered coding conventions and patterns.
- dna_check_style: Verifies if a proposed code snippet adheres to local styles.
- dna_boundaries: Explains the architectural layers and directory structures.
- dna_can_import: Checks if importing from directory A to B violates boundaries.
- dna_contract: Retrieves static contract signals for specific functions or symbols, including confidence/provenance.
- dna_verify: Checks current code against the latest explicit baseline by default.
- dna_context: Provides comprehensive intelligence for a specific task.
- dna_accept_baseline: Accepts the current codebase state as the new verification baseline after intentional changes.
- dna_report: Generates a Markdown report of conventions, boundaries, contracts, side effects, and risks.
- dna_callees: Lists static calls made by a function or method.
- dna_callers: Lists static callers of a function or method.
- dna_impact: Estimates upstream impact by walking static callers of a symbol.
- dna_suggest_boundaries: Emits advisory boundaryOverrides JSON with confidence labels and warnings; nothing is enforced unless copied into config.
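The idea behind dna_impact, walking static callers upstream from a symbol, can be sketched as a breadth-first traversal over call edges. The CallEdge type and upstreamImpact function are hypothetical names for illustration; the real tool's data model is not documented here, and this sketch covers only resolved edges.

```typescript
// Hypothetical representation of one resolved static call: caller -> callee.
interface CallEdge {
  caller: string;
  callee: string;
}

// Breadth-first walk up the caller graph. Returns every direct or transitive
// caller of `symbol` exactly once, in discovery order. The visited list also
// protects against cycles such as mutual recursion.
function upstreamImpact(edges: CallEdge[], symbol: string): string[] {
  const visited: string[] = [symbol];
  const queue: string[] = [symbol];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const edge of edges) {
      if (edge.callee === current && visited.indexOf(edge.caller) === -1) {
        visited.push(edge.caller);
        queue.push(edge.caller);
      }
    }
  }
  return visited.slice(1); // drop the starting symbol itself
}

const exampleEdges: CallEdge[] = [
  { caller: "createUser", callee: "saveUser" },
  { caller: "handleSignup", callee: "createUser" },
  { caller: "importUsers", callee: "createUser" },
  { caller: "saveUser", callee: "logAudit" },
];
```

Here, changing saveUser would flag createUser first, then handleSignup and importUsers as indirect callers. Unresolved call sites would need separate handling, since they cannot be followed as edges.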
Trust Model
codebase-dna separates hard static evidence from heuristic inference:
- Observed imports are evidence. Inferred architecture never treats an import that already exists in the scanned codebase as a violation by itself.
- Enforced architecture is explicit. Inferred boundaries are advisory; boundaryOverrides are the mechanism for rules that should fail dna_can_import or dna_verify.
- Contracts are static signals, not runtime proofs. Parameters and return annotations are extracted from syntax; side effects, throws, and guards include confidence/provenance metadata.
- Call graph results are graded. dna_callees, dna_callers, dna_impact, and dna_report distinguish resolved internal targets from unresolved static call sites.
- Quality is testable. This repository currently passes npm run lint, npm test, npm run build, and npm run benchmark with 13/13 static-signal scenarios passing; dna_report also surfaces resolved call-edge counts so teams can judge usefulness on their own codebase.
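The split between advisory and enforced architecture can be illustrated with a minimal check over boundaryOverrides entries. The canImport function is a hypothetical sketch, not the logic behind dna_can_import. It assumes that a layer without an override is unrestricted and that a layer may always import from itself; neither assumption is confirmed by the text above.

```typescript
// Mirrors the shape of boundaryOverrides entries in codebase-dna.config.json.
interface BoundaryOverride {
  layer: string;
  allowedImports: string[];
}

// Assumptions (not confirmed by the README):
//  - a layer with no override is unrestricted, because inferred boundaries
//    are advisory and only explicit overrides are enforced;
//  - a layer may always import from itself.
function canImport(
  overrides: BoundaryOverride[],
  fromLayer: string,
  toLayer: string,
): boolean {
  if (fromLayer === toLayer) return true;
  const rule = overrides.filter((o) => o.layer === fromLayer)[0];
  if (rule === undefined) return true; // no explicit rule, nothing to enforce
  return rule.allowedImports.indexOf(toLayer) !== -1;
}

const exampleOverrides: BoundaryOverride[] = [
  { layer: "services", allowedImports: ["repositories", "utils"] },
];
```

With the example override, services may import from repositories but not from a db layer, while a layer such as routes stays unrestricted until it gets its own entry.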
CLI Commands
- codebase-dna scan --rootDir <path>: Refresh analysis without changing the baseline.
- codebase-dna accept-baseline --rootDir <path>: Accept the current state as the new baseline.
- codebase-dna verify --rootDir <path>: Verify current analysis against the accepted baseline.
- codebase-dna report --rootDir <path> --out DNA_REPORT.md: Generate a Markdown intelligence report.
- codebase-dna doctor --rootDir <path>: Check root detection, store readiness, aliases, and baseline status.
- codebase-dna mcp-config cursor --rootDir <path>: Print ready-to-paste MCP config.
- codebase-dna suggest-boundaries --rootDir <path>: Print advisory boundary override suggestions.
- codebase-dna callees <symbol> --rootDir <path>: List static calls from a symbol.
- codebase-dna callers <symbol> --rootDir <path>: List static callers of a symbol.
Benchmarks
Run npm run benchmark to execute the local static-signal benchmark suite. The current verified result is 13/13 passing scenarios across naming drift, explicit boundary violations, contract signals, and call-impact resolution.
[!NOTE]
Inferred architecture is advisory by default. Add boundaryOverrides when a rule should be enforced by dna_can_import and dna_verify.
Project Structure
- src/server.ts - MCP Server & Tool Handlers
- src/scanner.ts - Orchestrates full codebase scanning
- src/analyzers/ - Logic for extracting conventions, architecture boundaries, and contracts
- src/store.ts - SQLite Database layer for state persistence
- src/watcher.ts - File watcher for incremental live-updates
Alternative: Running from Source
If you prefer to clone the repository and run the server directly from source instead of using npx via the npm registry, follow these steps:
- Clone and build the project in your terminal:
git clone https://github.com/your-username/codebase-dna.git
cd codebase-dna
npm install
npm run build
- Update your MCP JSON configuration to use node instead of npx, and point it to the absolute path of the compiled dist/bin/codebase-dna.js file on your computer:
{
  "mcpServers": {
    "codebase-dna": {
      "command": "node",
      "args": [
        "C:/absolute/path/to/cloned/codebase-dna/dist/bin/codebase-dna.js",
        "serve",
        "--rootDir", "C:/Projects/your-project"
      ]
    }
  }
}
(Replace both paths with the actual absolute paths on your system.)
License
MIT
