onboard_codebase-mcp

v1.0.8

Published

21 days ago

MCP server that onboards developers and AI agents onto any codebase — framework and language agnostic

b_sydhom

mcp documentation onboarding component-tree vscode cursor agnostic

onboard-codebase-mcp

One command to understand any codebase. An MCP server that gives AI agents a complete map of your project — framework and language agnostic.

What it does

Install it once. Your AI agent (Cursor, VS Code Copilot, Claude Code, Windsurf) gains 11 tools that let it understand your entire codebase without reading every source file — saving tokens and giving better answers.

You:   "Onboard me on this project"
Agent: calls onboard_me() →
         ✅ Detected: Next.js (ts)
         ✅ Analyzed: 47 files, 83 units, 124 edges
         ✅ Generated: ONBOARDING.md, Tree.md, Graph.md, Matrix.md, Api.md, docs/src/
         📂 Obsidian: open docs/ as a vault or symlink into your existing one

Supported stacks

JavaScript / TypeScript — React, Next.js, Remix, Vue, Nuxt, Angular, Svelte, Express, Fastify, NestJS, Koa

Python — Django, FastAPI, Flask

Go — Gin, Echo, Fiber

JVM — Spring Boot, Java, Kotlin

Systems — Rust (Actix, Axum, Tauri)

Other — Ruby on Rails, PHP, Swift, C#, and any project with a standard manifest file

Install

Global (recommended)

npm install -g onboard-codebase-mcp

The postinstall script auto-configures every AI editor it detects on your machine:

| Editor | Config file written |
|---|---|
| Cursor | ~/.cursor/mcp.json |
| VS Code | .vscode/mcp.json in your project |
| Claude Desktop | platform config file |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |

Then restart your editor. The tools appear automatically.

Without installing (always latest)

# npm
npx -y onboard-codebase-mcp

# yarn
yarn dlx onboard-codebase-mcp

As a project dev dependency (good for teams)

npm install --save-dev onboard-codebase-mcp
# postinstall writes .vscode/mcp.json automatically
# commit it so the whole team gets the config on npm install

Manual editor config

If auto-configure missed your editor, add this to its MCP config file:

Cursor — `~/.cursor/mcp.json`

{
  "mcpServers": {
    "onboard-codebase-mcp": {
      "command": "npx",
      "args": ["-y", "onboard-codebase-mcp"]
    }
  }
}

VS Code — `.vscode/mcp.json`

⚠️ VS Code uses "servers" not "mcpServers". MCP tools only work in Agent Mode.

{
  "servers": {
    "onboard-codebase-mcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "onboard-codebase-mcp"]
    }
  }
}

Claude Desktop

{
  "mcpServers": {
    "onboard-codebase-mcp": {
      "command": "npx",
      "args": ["-y", "onboard-codebase-mcp"]
    }
  }
}

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
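
If you already have other MCP servers configured, the entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that merge in Node. The helper name `withOnboardServer` and the `editor` argument are hypothetical, not part of this package; only the key names and the entry shape come from the snippets above.

```javascript
// Hypothetical helper (not part of onboard-codebase-mcp): returns a new
// config object with the server entry merged in, leaving other servers intact.
function withOnboardServer(config, editor) {
  // VS Code keeps servers under "servers"; Cursor and Claude Desktop use "mcpServers".
  const key = editor === "vscode" ? "servers" : "mcpServers";
  const entry = {
    ...(editor === "vscode" ? { type: "stdio" } : {}),
    command: "npx",
    args: ["-y", "onboard-codebase-mcp"],
  };
  return {
    ...config,
    [key]: { ...(config[key] ?? {}), "onboard-codebase-mcp": entry },
  };
}

// Example: an existing Cursor config that already lists another server.
const existing = { mcpServers: { other: { command: "node", args: ["other.js"] } } };
console.log(JSON.stringify(withOnboardServer(existing, "cursor"), null, 2));
```

The function never mutates its input, so you can diff the result against the original file before writing it back.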

Claude Code CLI

claude mcp add onboard-codebase-mcp -- npx -y onboard-codebase-mcp

Usage

The one command you need

Tell your agent:

Onboard me on this project

It calls onboard_me(), which runs the full pipeline in sequence and ends with Obsidian instructions:

Detect — identifies your framework, language, source root, and entry point
Analyze — parses all source files and builds the dependency graph
Generate — writes structured documentation into docs/
Brief — returns a compact onboarding summary
Obsidian — tells you exactly how to open the docs as a vault

All 11 tools

| Tool | What it does | When to use |
|---|---|---|
| onboard_me | Full pipeline in one call | First time on a project |
| detect_project | Identify stack without analyzing | Quick sanity check |
| analyze_project | Parse + emit all docs | After major refactors |
| onboard | Read ONBOARDING.md and return briefing | Start of every AI session |
| get_tree | Dependency tree (JSON or ASCII) | Understand structure |
| get_graph | All edges as list or Mermaid | Visualize dependencies |
| get_unit | Deep info on one unit | Before editing a specific file |
| query_docs | Search docs by name or topic | Find something without reading source |
| read_unit_source | Return source code + auto-extracted props for a unit | Before writing one description |
| document_unit | Write description + props into a unit's doc file | After analyzing one unit |
| document_all_units | Bulk-document every undocumented unit in two calls | After running onboard_me |

Documenting components after onboarding

onboard_me and analyze_project generate skeleton doc pages with placeholder descriptions. They do not fill in descriptions or props automatically — that is a separate step you trigger yourself.

Document one unit at a time

Use read_unit_source + document_unit together:

# Step 1 — get the source so the agent can read it
read_unit_source({ unitName: "ProductCard" })

# Agent sees the source code and auto-extracted TypeScript props, then:

# Step 2 — write the description and props into the doc file
document_unit({
  unitName: "ProductCard",
  description: "Displays a product image, title, price, and an add-to-cart button.",
  props: ["product", "onAddToCart", "isLoading"]
})

Document all units at once

document_all_units handles the whole project in just two calls:

# Call 1 — READ mode: returns every undocumented unit's source + extracted props
document_all_units()

# Agent reads all the source blocks and generates descriptions, then:

# Call 2 — WRITE mode: saves all descriptions in one shot
document_all_units({
  units: [
    { unitName: "Button",      description: "Reusable button with loading state.", props: ["label", "onClick", "isLoading"] },
    { unitName: "ProductCard", description: "Product image, title, price, and cart button.", props: ["product", "onAddToCart"] },
    { unitName: "UserService", description: "Handles user authentication and profile fetching." }
  ]
})

Options on the first (READ mode) call:

onlyUndocumented: false — include units that already have descriptions
maxUnits: 30 — batch size (default 20, max 50)
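
To make the two options concrete, here is a minimal sketch of how a READ-mode batch could be selected. It is an illustration only, not the package's code: `selectBatch` is a hypothetical name, the default of `onlyUndocumented: true` is inferred from the option description above, and clamping to a minimum of 1 is an assumption.

```javascript
// Hypothetical sketch of READ-mode selection; not the package's actual code.
function selectBatch(units, { onlyUndocumented = true, maxUnits = 20 } = {}) {
  // Documented limits: default 20, max 50. The lower bound of 1 is an assumption.
  const limit = Math.min(Math.max(1, maxUnits), 50);
  const candidates = onlyUndocumented
    ? units.filter((u) => !u.description)
    : units;
  return candidates.slice(0, limit);
}

const sample = [
  { unitName: "Button", description: "Reusable button." },
  { unitName: "ProductCard" },
  { unitName: "UserService" },
];
console.log(selectBatch(sample).map((u) => u.unitName));
// → [ 'ProductCard', 'UserService' ]
```

On a large project this means several READ/WRITE rounds: each READ call returns at most one batch, so repeat until nothing undocumented comes back.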

Descriptions written by hand inside the `<!-- DOCS:START -->` / `<!-- DOCS:END -->` markers are never overwritten by analyze_project.

Token-efficient AI workflow

Session start → onboard()           ~500 tokens  full map of the codebase
Need a unit  → get_unit("Name")     ~100 tokens  uses, used-by, description
Find topic   → query_docs("auth")   ~300 tokens  search docs, not source
Need graph   → get_graph()          ~200 tokens  all dependency edges
Last resort  → read source file     many tokens  only when docs aren't enough

Load docs first. Read source files last.

Generated documentation

After running onboard_me or analyze_project, your project gets:

docs/
├── ONBOARDING.md     ← Start here. Briefing for humans and AI agents.
├── Tree.md           ← Mermaid mindmap of the full dependency tree
├── Graph.md          ← Dependency graph (DAG); each unit appears exactly once
├── Matrix.md         ← Reuse leaderboard + edge list + full matrix
├── Api.md            ← Class diagram of units with documented props
└── src/
    ├── App.md
    ├── components/
    │   ├── Button.md
    │   └── ProductCard.md
    └── services/
        └── UserService.md
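
To give a feel for what a file like Graph.md contains, the sketch below turns an edge list into Mermaid flowchart text. It is a rough illustration under assumptions: `toMermaid` is a hypothetical name, edges are assumed to be `[from, to]` pairs, and unit names are assumed to be plain identifiers that need no quoting in Mermaid. The actual generator may format its output differently.

```javascript
// Hypothetical sketch: render [from, to] edges as a Mermaid flowchart.
// Mermaid treats a repeated node id as the same node, so each unit is drawn once.
function toMermaid(edges) {
  const lines = new Set(); // a Set drops duplicate edges while keeping order
  for (const [from, to] of edges) lines.add(`  ${from} --> ${to}`);
  return ["graph TD", ...lines].join("\n");
}

console.log(
  toMermaid([
    ["App", "ProductCard"],
    ["ProductCard", "Button"],
    ["App", "ProductCard"], // duplicate, ignored
  ])
);
// graph TD
//   App --> ProductCard
//   ProductCard --> Button
```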

Per-unit pages

Each unit gets a wiki page with:

Structured frontmatter (type, source, uses, usedBy) for Dataview queries
What it uses (outgoing dependencies, as [[wiki-links]])
What uses it (incoming dependencies, as [[wiki-links]])
Its doc comment as description
A link back to the source file

---
type: component
source: src/components/ProductCard.tsx
uses:
  - Badge
  - Button
usedBy:
  - FeaturedProducts
  - ProductList
---

# ProductCard

<!-- DOCS:START -->
Displays a product image, title, price, and an add-to-cart button.

> **Props**
> - `product`
> - `onAddToCart`
> - `isLoading`
<!-- DOCS:END -->

## Uses
- [[Button]]
- [[Badge]]

## Used by
- [[ProductList]]
- [[FeaturedProducts]]
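
The page layout above is regular enough to express as a small template. The following sketch renders a page of that shape from a plain object. The function name `renderUnitPage` and the shape of the `unit` object are assumptions made for illustration, and the props block is left out for brevity.

```javascript
// Hypothetical sketch of a per-unit page template; not the package's code.
function yamlField(key, items) {
  // An empty list is written inline so the frontmatter stays a list, not null.
  if (items.length === 0) return `${key}: []`;
  return [`${key}:`, ...items.map((i) => `  - ${i}`)].join("\n");
}

function renderUnitPage(unit) {
  const links = (items) => items.map((i) => `- [[${i}]]`).join("\n");
  return [
    "---",
    `type: ${unit.type}`,
    `source: ${unit.source}`,
    yamlField("uses", unit.uses),
    yamlField("usedBy", unit.usedBy),
    "---",
    "",
    `# ${unit.name}`,
    "",
    "<!-- DOCS:START -->",
    unit.description,
    "<!-- DOCS:END -->",
    "",
    "## Uses",
    links(unit.uses),
    "",
    "## Used by",
    links(unit.usedBy),
  ].join("\n");
}

console.log(
  renderUnitPage({
    name: "ProductCard",
    type: "component",
    source: "src/components/ProductCard.tsx",
    description: "Displays a product image, title, price, and an add-to-cart button.",
    uses: ["Badge", "Button"],
    usedBy: [],
  })
);
```

Keeping the same names in the frontmatter arrays and in the wiki-links is what lets both Dataview queries and plain link navigation work on the same page.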

Hand-written blocks are preserved

Both the `<!-- DOCS:START -->` ... `<!-- DOCS:END -->` blocks in unit pages and the hand-written block in ONBOARDING.md survive every regeneration. Add architecture notes, gotchas, and conventions; they will never be overwritten.
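
One plausible way to get this behavior is to copy the marker-delimited block from the previous version of a page into the freshly generated one. The sketch below assumes the `DOCS:START` / `DOCS:END` markers shown in the per-unit page example; `preserveDocsBlock` is a hypothetical name and this is not necessarily how the package implements it.

```javascript
// Hypothetical sketch of block preservation across regeneration.
const DOCS_BLOCK = /<!-- DOCS:START -->[\s\S]*?<!-- DOCS:END -->/;

function preserveDocsBlock(previousPage, regeneratedPage) {
  const kept = previousPage.match(DOCS_BLOCK);
  if (!kept) return regeneratedPage; // nothing hand-written to carry over
  // A function replacer keeps "$" in hand-written text from being read
  // as a replacement pattern.
  return regeneratedPage.replace(DOCS_BLOCK, () => kept[0]);
}

const before = "# Button\n\n<!-- DOCS:START -->\nCosts $1 to render.\n<!-- DOCS:END -->\n\n## Uses\n- [[Old]]";
const fresh = "# Button\n\n<!-- DOCS:START -->\n_TODO_\n<!-- DOCS:END -->\n\n## Uses\n- [[New]]";
console.log(preserveDocsBlock(before, fresh));
```

Everything outside the markers comes from the new page, so dependency lists stay current while the description stays yours.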

Querying component relationships (Dataview)

Because every doc page includes uses and usedBy arrays in its frontmatter, you can query component relationships in Obsidian using the Dataview plugin — no graph view needed.

Find everything that renders a specific component:

LIST FROM "src" WHERE contains(uses, "Button")

Find everything a component depends on:

LIST uses WHERE file.name = "Button"

Full parent → child relationship table across all components:

TABLE uses, usedBy FROM "src" WHERE type = "component" SORT file.name ASC

Find the most-reused components (used by 3 or more others):

TABLE usedBy FROM "src"
WHERE type = "component" AND length(usedBy) >= 3
SORT length(usedBy) DESC
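
The same ranking can be computed outside Obsidian from a plain edge list, for example one returned by get_graph. This is a sketch under assumptions: `reuseLeaderboard` is a hypothetical name and edges are assumed to be `[from, to]` pairs meaning "from uses to"; the tool's real output format may differ.

```javascript
// Hypothetical sketch: rank units by how many distinct units use them.
function reuseLeaderboard(edges, minUsers = 3) {
  const usedBy = new Map();
  for (const [from, to] of edges) {
    if (!usedBy.has(to)) usedBy.set(to, new Set());
    usedBy.get(to).add(from); // a Set counts each user once
  }
  return [...usedBy]
    .map(([unit, users]) => ({ unit, usedBy: [...users].sort() }))
    .filter((row) => row.usedBy.length >= minUsers)
    .sort((a, b) => b.usedBy.length - a.usedBy.length || a.unit.localeCompare(b.unit));
}

const edgeList = [
  ["App", "Button"],
  ["ProductCard", "Button"],
  ["Form", "Button"],
  ["App", "ProductCard"],
];
console.log(reuseLeaderboard(edgeList));
// → [ { unit: 'Button', usedBy: [ 'App', 'Form', 'ProductCard' ] } ]
```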

Traverse all descendants of a component (DataviewJS):

const root = dv.page("Button");
const seen = new Set();
const queue = [root];
const rows = [];
while (queue.length) {
  const p = queue.shift();
  if (!p || seen.has(p.file.name)) continue;
  seen.add(p.file.name);
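  // Follow the frontmatter `uses` list; file.outlinks would also include
  // the "Used by" links and walk up to parents.
  const children = dv.array(p.uses ?? [])
    .map(name => dv.page(name))
    .filter(x => x?.type === "component");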
  rows.push([p.file.link, children.map(c => c.file.link)]);
  queue.push(...children);
}
dv.table(["Component", "Direct children"], rows);

Obsidian integration

The docs are plain Markdown with [[wiki-links]] and Mermaid diagrams — both work in Obsidian with zero plugins.

Quickest setup: Open Obsidian → Open folder as vault → select your docs/ folder.

For teams: Commit docs/ to git. Every developer opens it as their vault after pulling.

Symlink (stays in sync automatically):

ln -s /path/to/project/docs "<your-vault>/my-project-docs"

Smart auto-detection

workDir is the only required parameter — everything else is detected automatically:

Finds the nearest manifest file (package.json, go.mod, pyproject.toml, etc.)
Identifies the framework from dependencies
Locates the source root (src/, app/, lib/, etc.)
Finds the entry file (App.tsx, main.go, manage.py, etc.)
Derives the root unit name from the filename
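
As an illustration of the "framework from dependencies" step for a JavaScript project, the sketch below checks a parsed package.json against a list of well-known npm package names. The marker table, its ordering, and the name `detectFramework` are assumptions for this example; the package's real detection rules are not documented here and may differ.

```javascript
// Hypothetical sketch of dependency-based detection; not the package's code.
// Checked in order, so meta-frameworks win over the libraries they build on.
const FRAMEWORK_MARKERS = [
  ["next", "Next.js"],
  ["nuxt", "Nuxt"],
  ["@remix-run/react", "Remix"],
  ["@nestjs/core", "NestJS"],
  ["@angular/core", "Angular"],
  ["svelte", "Svelte"],
  ["vue", "Vue"],
  ["react", "React"],
  ["fastify", "Fastify"],
  ["koa", "Koa"],
  ["express", "Express"],
];

function detectFramework(pkg) {
  // Spreading undefined is a no-op, so missing sections are fine.
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const hit = FRAMEWORK_MARKERS.find(([name]) => Object.hasOwn(deps, name));
  return hit ? hit[1] : null;
}

console.log(detectFramework({ dependencies: { next: "14.0.0", react: "18.2.0" } }));
// → Next.js
```

The ordering matters: a Next.js project also depends on react, so the more specific marker has to be tested first.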

Override detection when needed:

analyze_project({
  workDir: "/path/to/project",
  rootFile: "/path/to/project/src/CustomEntry.tsx",
  rootName: "CustomEntry"
})

License

MIT