
@qlucent/code-dna v0.1.3

Zero-Token Pre-Analysis Layer for codebase analysis

code-dna

Zero-Token Pre-Analysis Layer — give any LLM instant codebase understanding


The Problem

LLMs waste 50,000–200,000 tokens exploring unfamiliar codebases. Typical workflows involve asking the model to read file trees, open individual files, trace imports, and re-derive architecture facts it will forget next session. Context packers ship raw source code. Knowledge graphs need infrastructure.

The result: slow, expensive, and inconsistent onboarding every time a new LLM session touches your codebase.

The Solution

code-dna runs static analysis in under 5 seconds and produces a compact 5–10k token "DNA file" that gives any LLM architectural understanding — without reading source files.

The DNA file captures:

  • The project's module structure and symbol inventory
  • Architectural style, detected framework, and layer organisation
  • Coding conventions derived from the actual codebase
  • Hot files, risk scores, and dependency centrality
  • Git churn data and ownership information

Give any LLM the DNA file as its first context document and it hits the ground running.

Quick Start

# Run once, output to stdout
npx code-dna analyze

# Save to a file (recommended)
npx code-dna analyze --output CODEBASE-DNA.md

# YAML output for programmatic consumption
npx code-dna analyze --format yaml --output CODEBASE-DNA.yaml

# Analyse a specific directory
npx code-dna analyze /path/to/project --output CODEBASE-DNA.md

What It Extracts (4 Layers)

code-dna runs four analysis layers (Layers 1 and 2 execute in parallel; Layers 3 and 4 then run in sequence, since each builds on earlier results):

Layer 1: Structural Skeleton

Discovers all source files, parses them with Tree-sitter AST grammars, and builds:

  • File tree with language and role annotations (controller, service, model, etc.)
  • Module map — hierarchical directory structure with per-file symbol inventories
  • Dependency graph — import/export edges with fan-in/fan-out metrics and circular dependency detection
  • Symbol index — every exported function, class, interface, type, and variable
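Conceptually, the fan-in/fan-out metrics are a fold over the import edge list. The sketch below is illustrative only; the `Edge` shape and `fanMetrics` name are assumptions, not code-dna's real API:

```typescript
// Illustrative sketch: compute fan-in/fan-out from a list of import edges.
// The Edge shape and function name are assumptions, not code-dna's real API.
type Edge = { from: string; to: string };

function fanMetrics(edges: Edge[]): Map<string, { fanIn: number; fanOut: number }> {
  const metrics = new Map<string, { fanIn: number; fanOut: number }>();
  const entry = (file: string) => {
    if (!metrics.has(file)) metrics.set(file, { fanIn: 0, fanOut: 0 });
    return metrics.get(file)!;
  };
  for (const { from, to } of edges) {
    entry(from).fanOut += 1; // `from` imports another file
    entry(to).fanIn += 1;    // `to` is imported by another file
  }
  return metrics;
}

const edges: Edge[] = [
  { from: "cli.ts", to: "engine.ts" },
  { from: "mcp.ts", to: "engine.ts" },
  { from: "engine.ts", to: "types.ts" },
];
console.log(fanMetrics(edges).get("engine.ts")); // { fanIn: 2, fanOut: 1 }
```

A high fan-in marks a file many others depend on, which is exactly what the centrality score in Layer 4 builds on.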

Layer 2: Git Archaeology

Queries the local git history to surface temporal patterns:

  • Commit heatmap — files ranked by total commits
  • Ownership map — primary author per file
  • Co-change coupling — files that change together frequently (configurable window)
  • Hot files — churn hotspots with commit counts and last-modified timestamps

Gracefully skipped when no git history is available.
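At its core, the commit heatmap is a frequency count over git log output. A hypothetical sketch, assuming input shaped like `git log --name-only --pretty=%H`; the parsing and function name are illustrative, not code-dna's real implementation:

```typescript
// Illustrative sketch: rank files by commit count from git log text.
// Assumes `git log --name-only --pretty=%H` shaped input; not code-dna's real parser.
function commitHeatmap(gitLog: string): [string, number][] {
  const counts = new Map<string, number>();
  for (const raw of gitLog.split("\n")) {
    const line = raw.trim();
    // Skip blank separators and 40-char commit hashes; the rest are file paths.
    if (!line || /^[0-9a-f]{40}$/.test(line)) continue;
    counts.set(line, (counts.get(line) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sampleLog = [
  "a".repeat(40), "src/engine.ts", "src/types.ts", "",
  "b".repeat(40), "src/engine.ts", "",
].join("\n");
console.log(commitHeatmap(sampleLog)); // src/engine.ts ranks first with 2 commits
```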

Layer 3: Pattern Inference

Uses Layer 1 results to infer higher-level patterns without configuration:

  • Framework detection — identifies Next.js, Express, FastAPI, Spring Boot, NestJS, and more from dependency manifests and file markers
  • Architecture style — classifies projects as MVC, hexagonal, layered, event-driven, or monolith
  • Naming conventions — detects camelCase, PascalCase, snake_case, kebab-case across files, functions, classes, and variables
  • File organisation — by-feature, by-layer, by-type, or hybrid
  • Import and export style — relative vs. aliased paths, named vs. default exports
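Naming-convention detection of this kind typically reduces to classifying identifiers with a few patterns and tallying the dominant style per category. A minimal sketch with simplified regexes; these are not code-dna's actual rules:

```typescript
// Illustrative sketch: classify an identifier's naming style.
// Regexes are simplified assumptions, not code-dna's actual detector.
function namingStyle(id: string): string {
  if (/^[a-z][a-z0-9]*(-[a-z0-9]+)+$/.test(id)) return "kebab-case";
  if (/^[a-z][a-z0-9]*(_[a-z0-9]+)+$/.test(id)) return "snake_case";
  if (/^[A-Z][A-Za-z0-9]*$/.test(id)) return "PascalCase";
  if (/^[a-z][A-Za-z0-9]*$/.test(id)) return "camelCase";
  return "unknown";
}

console.log(namingStyle("parseFile"));     // camelCase
console.log(namingStyle("ParserEngine"));  // PascalCase
console.log(namingStyle("max_commits"));   // snake_case
console.log(namingStyle("parser-engine")); // kebab-case
```

Counting the most common result across files, functions, classes, and variables then yields a convention report like the one in the Example Output section.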

Layer 4: Risk Surface

Combines all previous layers to produce a risk-ranked file list:

  • Centrality score — files with the highest in-degree (most imported)
  • Churn score — correlation between frequency of change and dependency weight
  • Coverage proxy — estimated test coverage based on co-located test files
  • Composite risk score — 0–100 rank with per-factor breakdowns
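One plausible way to fold the factors into a 0–100 composite is a weighted sum, shown here purely as an illustration; the weights and formula are assumptions, not code-dna's documented scoring:

```typescript
// Illustrative sketch: weighted composite risk score.
// Weights and formula are assumptions, not code-dna's documented scoring.
type Factors = { centrality: number; churn: number; coverage: number }; // each 0-1

function riskScore(f: Factors): number {
  // Higher centrality and churn raise risk; higher coverage proxy lowers it.
  const raw = 0.4 * f.centrality + 0.4 * f.churn + 0.2 * (1 - f.coverage);
  return Math.round(raw * 100);
}

console.log(riskScore({ centrality: 0.9, churn: 0.8, coverage: 0.2 })); // 84
```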

Supported Languages

| Language | Extensions | Support Tier |
|----------|-----------|--------------|
| TypeScript | .ts, .tsx | Full AST parsing |
| JavaScript | .js, .jsx, .mjs, .cjs | Full AST parsing |
| Python | .py, .pyi | Full AST parsing |
| Go | .go | File discovery + framework detection |
| Rust | .rs | File discovery + framework detection |
| Java | .java | File discovery + framework detection |
| Vue | .vue | File discovery + framework detection |
| C# | .cs | File discovery + framework detection |
| Ruby | .rb | File discovery + framework detection |
| Kotlin | .kt, .kts | File discovery + framework detection |
| Swift | .swift | File discovery + framework detection |
| PHP | .php | File discovery + framework detection |
| C / C++ | .c, .h, .cpp, .cc, .cxx, .hpp | File discovery + framework detection |
| Solidity | .sol | Discovery only |

Run code-dna info to verify the languages and tiers detected by your installed version.

CLI Usage

analyze [path]

Run the full analysis pipeline and output DNA.

code-dna analyze [path] [options]

Arguments:

| Argument | Description | Default |
|----------|-------------|---------|
| path | Directory to analyse | Current working directory |

Options:

| Flag | Description | Default |
|------|-------------|---------|
| -f, --format <format> | Output format: md or yaml | md |
| -o, --output <file> | Write output to file instead of stdout | stdout |
| -l, --layers <layers> | Comma-separated layers to run | 1,2,3,4 |
| --languages <langs> | Language filter, e.g. ts,py,go | all languages |
| --scope <dir> | Scope analysis to a subdirectory | none |
| --token-budget <n> | Target token count for Markdown output | 8000 |
| --git-depth <n> | Maximum git commits to traverse | 1000 |
| --no-git | Skip git archaeology (disables Layer 2) | false |
| -q, --quiet | Suppress progress output | false |

Examples:

# Full analysis, Markdown output to stdout
code-dna analyze

# Save to file with YAML format
code-dna analyze . --format yaml --output CODEBASE-DNA.yaml

# Only structural skeleton, no git or risk analysis
code-dna analyze --layers 1,3

# Analyse only TypeScript and Python files
code-dna analyze --languages ts,py

# Scope to a single service in a monorepo
code-dna analyze --scope services/api --output services/api/DNA.md

# Large repo with tight token budget
code-dna analyze --token-budget 5000 --git-depth 500

diff <dna-a> <dna-b>

Compare two DNA YAML snapshots and produce a Markdown diff report.

code-dna diff before.yaml after.yaml
code-dna diff before.yaml after.yaml --output diff-report.md

The diff report covers: files added/removed/modified, symbols added/removed, dependency graph changes, risk score movements, convention and framework shifts.
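The symbols-added/removed portion of such a report reduces to set differences over the two symbol inventories. A minimal sketch, with a hypothetical function name that is not part of code-dna's API:

```typescript
// Illustrative sketch: the added/removed symbol portion of a DNA diff.
// Function name and shapes are hypothetical, not code-dna's real API.
function symbolDiff(before: string[], after: string[]) {
  const b = new Set(before);
  const a = new Set(after);
  return {
    added: after.filter((s) => !b.has(s)),   // present now, absent before
    removed: before.filter((s) => !a.has(s)), // present before, absent now
  };
}

const d = symbolDiff(["analyze", "formatYaml"], ["analyze", "formatMarkdown"]);
console.log(d); // { added: [ 'formatMarkdown' ], removed: [ 'formatYaml' ] }
```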

mcp

Start the code-dna MCP server over stdio for use with MCP-compatible clients.

code-dna mcp
code-dna mcp --path /path/to/project
code-dna mcp --path /path/to/project --watch

See MCP Integration for client configuration details.

info

Show version, Node.js version, platform, and supported languages with their tiers.

code-dna info

MCP Integration

code-dna exposes its analysis pipeline as an MCP server, allowing LLM clients to query codebase DNA directly without running CLI commands.

Starting the Server

# Start against current directory
code-dna mcp

# Start against a specific project
code-dna mcp --path /path/to/project

# Watch mode: auto-refresh cache on file changes
code-dna mcp --path /path/to/project --watch

Claude Code Configuration

Add code-dna to your .mcp.json (project-scoped) or your global Claude Code settings:

{
  "mcpServers": {
    "code-dna": {
      "command": "npx",
      "args": ["code-dna", "mcp", "--path", "/absolute/path/to/project", "--watch"]
    }
  }
}

Cursor Configuration

In Cursor settings, add a new MCP server:

{
  "mcp": {
    "servers": {
      "code-dna": {
        "command": "npx",
        "args": ["code-dna", "mcp", "--path", "${workspaceFolder}", "--watch"]
      }
    }
  }
}

Available MCP Resources

Once connected, clients can read these resources:

| URI | Content |
|-----|---------|
| codedna://full | Complete DNA Markdown output |
| codedna://skeleton | Architecture and Module Map sections |
| codedna://dependencies | Dependencies section |
| codedna://conventions | Conventions section |
| codedna://risks | Risk Surface and Hot Files sections |
| codedna://hotfiles | Hot Files section only |

Available MCP Tools

| Tool | Description |
|------|-------------|
| analyze | Run analysis on a directory, update the cache, return full DNA |
| diff | Compute a structural diff between two DNA Markdown strings |

See docs/MCP.md for the full MCP reference including tool parameter schemas.

Configuration

Create a .codedna.yaml file in your project root to customise analysis:

# Additional glob patterns to ignore (built-in ignores always apply)
ignore:
  - "generated/**"
  - "vendor/**"
  - "*.pb.go"

# Toggle individual analysis layers
layers:
  skeleton: true
  git: true
  patterns: true
  risk: true

# Git archaeology settings
git:
  max_commits: 1000
  max_blame_files: 50
  coupling_window: 30   # days

# Per-language overrides
languages:
  python:
    enabled: true
    framework: "fastapi"   # override auto-detection
  solidity:
    enabled: false         # skip entirely

# Output preferences
output:
  format: md
  token_budget: 8000
  filename: CODEBASE-DNA.md
  sections:
    architecture: 15
    module_map: 25
    dependencies: 15
    conventions: 15
    hot_files: 10
    risk_surface: 10
    api_surface: 5

# Monorepo: include/exclude sub-directories
scope:
  include:
    - "services/api"
    - "packages/shared"
  exclude:
    - "packages/legacy"

All fields are optional and fall back to sensible defaults.
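As an illustration of how the sections block above could drive output sizing, a proportional allocator might divide the token budget by weight. This is a sketch under the assumption that the section numbers are relative weights; it is not code-dna's actual allocator:

```typescript
// Illustrative sketch: split a token budget across sections by relative weight.
// Assumes section numbers are weights; not code-dna's actual allocator.
function allocateBudget(
  total: number,
  weights: Record<string, number>
): Record<string, number> {
  const sum = Object.values(weights).reduce((a, b) => a + b, 0);
  return Object.fromEntries(
    Object.entries(weights).map(([name, w]) => [name, Math.floor((total * w) / sum)])
  );
}

const alloc = allocateBudget(8000, { architecture: 15, module_map: 25, dependencies: 15 });
console.log(alloc); // module_map gets the largest share
```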

Programmatic API

code-dna can be used as a library from TypeScript or JavaScript:

npm install code-dna

import { analyze, formatMarkdown, formatYaml } from 'code-dna/lib';

// Run the full 4-layer analysis
const dna = await analyze('/path/to/project', {
  layers: [1, 2, 3, 4],
  tokenBudget: 8000,
});

// Render as Markdown (token-budget aware)
const markdown = formatMarkdown(dna, 8000);

// Render as YAML (full data, no truncation)
const yaml = formatYaml(dna);

See docs/API.md for the complete programmatic API reference.

Example Output

The following is a truncated excerpt from code-dna analysing itself:

# Codebase DNA -- code-dna

> Generated by code-dna v0.1.0 on 2026-03-26.
> Languages: typescript (99%), javascript (1%) | Files: 101 | LOC: 35,864

## Architecture

**Style:** layered (85% confidence)
**Framework:** Node.js / Commander CLI

### Layers
- **cli** (3 files): entry point, MCP command
- **core** (8 files): engine, types, diff engine, token budget
- **analyzers** (6 files): git, framework, architecture, conventions, risk
- **parsers** (19 files): Tree-sitter extractors for 14 languages
- **output** (3 files): Markdown and YAML formatters
- **mcp** (2 files): MCP server

## Conventions

- **Files:** kebab-case
- **Functions:** camelCase
- **Classes:** PascalCase
- **Exports:** named
- **Imports:** external-first, relative paths
- **Tests:** co-located

## Risk Surface

| File | Score | Factors |
|------|-------|---------|
| src/core/engine.ts | 82 | high-centrality, high-churn |
| src/core/types.ts | 74 | high-centrality |
| src/parsers/parser-engine.ts | 65 | high-centrality |

Contributing

  1. Clone the repository and install dependencies: npm install
  2. Build: npm run build
  3. Run all tests: npm test (1199 tests, Node.js 20+ required)
  4. Lint: npm run lint
  5. Typecheck: npm run typecheck

All code changes require tests written first (TDD). Commits follow Conventional Commits (feat(scope):, fix(scope):).

License

MIT