onboard-codebase-mcp
v1.0.8
MCP server that onboards developers and AI agents onto any codebase — framework and language agnostic
onboard-codebase-mcp
One command to understand any codebase. An MCP server that gives AI agents a complete map of your project — framework and language agnostic.
What it does
Install it once. Your AI agent (Cursor, VS Code Copilot, Claude Code, Windsurf) gains 11 tools that let it understand your entire codebase without reading every source file — saving tokens and giving better answers.
You: "Onboard me on this project"
Agent: calls onboard_me() →
✅ Detected: Next.js (ts)
✅ Analyzed: 47 files, 83 units, 124 edges
✅ Generated: ONBOARDING.md, Tree.md, Graph.md, Matrix.md, Api.md, docs/src/
📂 Obsidian: open docs/ as a vault or symlink into your existing one
Supported stacks
JavaScript / TypeScript — React, Next.js, Remix, Vue, Nuxt, Angular, Svelte, Express, Fastify, NestJS, Koa
Python — Django, FastAPI, Flask
Go — Gin, Echo, Fiber
JVM — Spring Boot, Java, Kotlin
Systems — Rust (Actix, Axum, Tauri)
Other — Ruby on Rails, PHP, Swift, C#, and any project with a standard manifest file
Install
Global (recommended)
npm install -g onboard-codebase-mcp
The postinstall script auto-configures every AI editor it detects on your machine:
| Editor | Config file written |
|---|---|
| Cursor | ~/.cursor/mcp.json |
| VS Code | .vscode/mcp.json in your project |
| Claude Desktop | platform config file |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
Then restart your editor. The tools appear automatically.
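As a rough illustration of what writing one of these config files involves, the sketch below merges the server entry into an already-parsed config object without disturbing other servers. `addServerEntry` is a hypothetical helper, not the package's actual postinstall code; the entry shape is the one shown under "Manual editor config".

```javascript
// Hypothetical helper: NOT the package's real postinstall logic.
// Adds the server entry to a parsed MCP config and keeps existing entries.
function addServerEntry(config, rootKey) {
  const entry = { command: "npx", args: ["-y", "onboard-codebase-mcp"] };
  // VS Code configs use the "servers" key and an explicit transport type.
  if (rootKey === "servers") entry.type = "stdio";
  const existing = (config && config[rootKey]) || {};
  return {
    ...config,
    [rootKey]: { ...existing, "onboard-codebase-mcp": entry },
  };
}

// A Cursor-style config that already lists another (made-up) server.
const cursorConfig = {
  mcpServers: { "some-other-server": { command: "node", args: ["server.js"] } },
};
const mergedConfig = addServerEntry(cursorConfig, "mcpServers");
console.log(Object.keys(mergedConfig.mcpServers));
// prints [ 'some-other-server', 'onboard-codebase-mcp' ]
```

Merging instead of overwriting matters because these files are shared with every other MCP server you have configured.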
Without installing (always latest)
# npm
npx -y onboard-codebase-mcp
# yarn
yarn dlx onboard-codebase-mcp
As a project dev dependency (good for teams)
npm install --save-dev onboard-codebase-mcp
# postinstall writes .vscode/mcp.json automatically
# commit it so the whole team gets the config on npm install
Manual editor config
If auto-configure missed your editor, add this to its MCP config file:
Cursor — ~/.cursor/mcp.json
{
  "mcpServers": {
    "onboard-codebase-mcp": {
      "command": "npx",
      "args": ["-y", "onboard-codebase-mcp"]
    }
  }
}
VS Code — .vscode/mcp.json
⚠️ VS Code uses "servers", not "mcpServers". MCP tools only work in Agent Mode.
{
  "servers": {
    "onboard-codebase-mcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "onboard-codebase-mcp"]
    }
  }
}
Claude Desktop
{
  "mcpServers": {
    "onboard-codebase-mcp": {
      "command": "npx",
      "args": ["-y", "onboard-codebase-mcp"]
    }
  }
}
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
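If you script the manual setup, the three locations above can be selected by platform. The following is a minimal sketch: `claudeConfigPath` is a hypothetical helper, the platform strings follow Node's `process.platform` values, and treating every non-macOS, non-Windows platform as Linux is an assumption.

```javascript
// Hypothetical helper: returns the Claude Desktop config path listed above.
// `home` and `appData` are passed in so the function has no side effects.
function claudeConfigPath(platform, home, appData) {
  const file = "claude_desktop_config.json";
  switch (platform) {
    case "darwin":
      return `${home}/Library/Application Support/Claude/${file}`;
    case "win32":
      return `${appData}\\Claude\\${file}`;
    default:
      // Assumption: anything else is handled like Linux.
      return `${home}/.config/Claude/${file}`;
  }
}

console.log(claudeConfigPath("linux", "/home/alex"));
// prints /home/alex/.config/Claude/claude_desktop_config.json
```

In a real script you would likely pass `process.platform`, `os.homedir()`, and `process.env.APPDATA` as the three arguments.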
Claude Code CLI
claude mcp add onboard-codebase-mcp -- npx -y onboard-codebase-mcp
Usage
The one command you need
Tell your agent:
Onboard me on this project
It calls onboard_me() which runs the full pipeline in sequence and ends with Obsidian instructions:
- Detect — identifies your framework, language, source root, and entry point
- Analyze — parses all source files and builds the dependency graph
- Generate — writes structured documentation into docs/
- Brief — returns a compact onboarding summary
- Obsidian — tells you exactly how to open the docs as a vault
All 11 tools
| Tool | What it does | When to use |
|---|---|---|
| onboard_me | Full pipeline in one call | First time on a project |
| detect_project | Identify stack without analyzing | Quick sanity check |
| analyze_project | Parse + emit all docs | After major refactors |
| onboard | Read ONBOARDING.md and return briefing | Start of every AI session |
| get_tree | Dependency tree (JSON or ASCII) | Understand structure |
| get_graph | All edges as list or Mermaid | Visualize dependencies |
| get_unit | Deep info on one unit | Before editing a specific file |
| query_docs | Search docs by name or topic | Find something without reading source |
| read_unit_source | Return source code + auto-extracted props for a unit | Before writing one description |
| document_unit | Write description + props into a unit's doc file | After analyzing one unit |
| document_all_units | Bulk-document every undocumented unit in two calls | After running onboard_me |
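To make the "list or Mermaid" output of get_graph more concrete, here is a small sketch of how an edge list maps onto Mermaid flowchart text. The `[from, to]` edge shape and the `edgesToMermaid` name are assumptions for illustration; the tool's actual output format may differ, and the App edge is made up.

```javascript
// Sketch: render dependency edges as Mermaid flowchart source.
// Each edge is assumed to be a [from, to] pair of unit names.
function edgesToMermaid(edges) {
  const lines = ["graph TD"];
  for (const [from, to] of edges) {
    lines.push(`  ${from} --> ${to}`);
  }
  return lines.join("\n");
}

const sampleEdges = [
  ["App", "ProductList"], // illustrative edge, not taken from the docs
  ["ProductList", "ProductCard"],
  ["ProductCard", "Button"],
];
console.log(edgesToMermaid(sampleEdges));
```

Pasting the printed text into a Mermaid code block in any of the generated docs would render it as a top-down graph.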
Documenting components after onboarding
onboard_me and analyze_project generate skeleton doc pages with placeholder descriptions. They do not fill in descriptions or props automatically — that is a separate step you trigger yourself.
Document one unit at a time
Use read_unit_source + document_unit together:
# Step 1 — get the source so the agent can read it
read_unit_source({ unitName: "ProductCard" })
# Agent sees the source code and auto-extracted TypeScript props, then:
# Step 2 — write the description and props into the doc file
document_unit({
  unitName: "ProductCard",
  description: "Displays a product image, title, price, and an add-to-cart button.",
  props: ["product", "onAddToCart", "isLoading"]
})
Document all units at once
document_all_units handles the whole project in just two calls:
# Call 1 — READ mode: returns every undocumented unit's source + extracted props
document_all_units()
# Agent reads all the source blocks and generates descriptions, then:
# Call 2 — WRITE mode: saves all descriptions in one shot
document_all_units({
  units: [
    { unitName: "Button", description: "Reusable button with loading state.", props: ["label", "onClick", "isLoading"] },
    { unitName: "ProductCard", description: "Product image, title, price, and cart button.", props: ["product", "onAddToCart"] },
    { unitName: "UserService", description: "Handles user authentication and profile fetching." }
  ]
})
Options on the first (READ mode) call:
- onlyUndocumented: false — include units that already have descriptions
- maxUnits: 30 — batch size (default 20, max 50)
Descriptions written by hand inside <!-- DOCS:START --> are never overwritten by analyze_project.
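The READ-mode options can be pictured as a filter followed by a size cap. The sketch below is a hypothetical model, not the tool's implementation: `selectBatch` is an invented name, the default of `onlyUndocumented: true` is inferred from the option description, and the lower bound of 1 is an assumption.

```javascript
// Hypothetical model of READ-mode batching (the real tool may differ).
function selectBatch(units, { onlyUndocumented = true, maxUnits = 20 } = {}) {
  // Clamp to the documented range: default 20, max 50 (min 1 is assumed).
  const limit = Math.min(Math.max(1, maxUnits), 50);
  const candidates = onlyUndocumented
    ? units.filter((unit) => !unit.description)
    : units;
  return candidates.slice(0, limit);
}

const sampleUnits = [
  { unitName: "Button", description: "Reusable button with loading state." },
  { unitName: "ProductCard", description: "" },
  { unitName: "UserService", description: "" },
];
console.log(selectBatch(sampleUnits).map((unit) => unit.unitName));
// prints [ 'ProductCard', 'UserService' ]
```

With more undocumented units than the cap allows, you would repeat the READ and WRITE calls until nothing is left.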
Token-efficient AI workflow
Session start → onboard() ~500 tokens full map of the codebase
Need a unit → get_unit("Name") ~100 tokens uses, used-by, description
Find topic → query_docs("auth") ~300 tokens search docs, not source
Need graph → get_graph() ~200 tokens all dependency edges
Last resort → read source file many tokens only when docs aren't enough
Load docs first. Read source files last.
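For a feel of what a session costs, the approximate figures above can simply be added up. This is a back-of-the-envelope sketch: the numbers are the rough estimates from the workflow table, and `estimateTokens` is a hypothetical helper, not one of the 11 tools.

```javascript
// Approximate per-call costs copied from the workflow table above.
const TOKEN_ESTIMATES = new Map([
  ["onboard", 500],
  ["get_unit", 100],
  ["query_docs", 300],
  ["get_graph", 200],
]);

// Sum the estimated cost of a sequence of tool calls.
function estimateTokens(calls) {
  let total = 0;
  for (const name of calls) {
    if (!TOKEN_ESTIMATES.has(name)) throw new Error(`no estimate for ${name}`);
    total += TOKEN_ESTIMATES.get(name);
  }
  return total;
}

// One briefing, three unit lookups, one docs search.
const sessionCalls = ["onboard", "get_unit", "get_unit", "get_unit", "query_docs"];
console.log(estimateTokens(sessionCalls)); // prints 1100
```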
Generated documentation
After running onboard_me or analyze_project, your project gets:
docs/
├── ONBOARDING.md ← Start here. Briefing for humans and AI agents.
├── Tree.md ← Mermaid mindmap of the full dependency tree
├── Graph.md ← Directed DAG — each unit appears exactly once
├── Matrix.md ← Reuse leaderboard + edge list + full matrix
├── Api.md ← Class diagram of units with documented props
└── src/
    ├── App.md
    ├── components/
    │   ├── Button.md
    │   └── ProductCard.md
    └── services/
        └── UserService.md
Per-unit pages
Each unit gets a wiki page with:
- Structured frontmatter (type, source, uses, usedBy) for Dataview queries
- What it uses (outgoing dependencies, as [[wiki-links]])
- What uses it (incoming dependencies, as [[wiki-links]])
- Its doc comment as description
- A link back to the source file
---
type: component
source: src/components/ProductCard.tsx
uses:
- Badge
- Button
usedBy:
- FeaturedProducts
- ProductList
---
# ProductCard
<!-- DOCS:START -->
Displays a product image, title, price, and an add-to-cart button.
> **Props**
> - `product`
> - `onAddToCart`
> - `isLoading`
<!-- DOCS:END -->
## Uses
- [[Button]]
- [[Badge]]
## Used by
- [[ProductList]]
- [[FeaturedProducts]]
Hand-written blocks are preserved
Both <!-- DOCS:START --> in unit pages and <!-- ONBOARD:START --> in ONBOARDING.md
survive every regeneration. Add architecture notes, gotchas, and conventions — they'll never be overwritten.
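One way such preservation can work is to copy the marked block from the old page into the freshly generated one. The sketch below illustrates that idea for the DOCS markers only; `extractDocsBlock` and `regenerate` are hypothetical names, and the package may implement this differently.

```javascript
const DOCS_START = "<!-- DOCS:START -->";
const DOCS_END = "<!-- DOCS:END -->";

// Return the marked block (markers included), or null if it is missing.
function extractDocsBlock(page) {
  const start = page.indexOf(DOCS_START);
  if (start === -1) return null;
  const end = page.indexOf(DOCS_END, start + DOCS_START.length);
  if (end === -1) return null;
  return page.slice(start, end + DOCS_END.length);
}

// Carry the hand-written block from the old page into the new page.
function regenerate(oldPage, freshPage) {
  const kept = extractDocsBlock(oldPage);
  const placeholder = extractDocsBlock(freshPage);
  if (kept === null || placeholder === null) return freshPage;
  // A replacer function avoids special handling of "$" in the kept text.
  return freshPage.replace(placeholder, () => kept);
}

const oldPage = `# ProductCard\n${DOCS_START}\nDisplays a product.\n${DOCS_END}\n## Uses\n- [[Button]]\n`;
const freshPage = `# ProductCard\n${DOCS_START}\n(placeholder)\n${DOCS_END}\n## Uses\n- [[Button]]\n- [[Badge]]\n`;
console.log(regenerate(oldPage, freshPage));
```

The result keeps the old description while picking up the newly detected Badge dependency.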
Querying component relationships (Dataview)
Because every doc page includes uses and usedBy arrays in its frontmatter, you can query component relationships in Obsidian using the Dataview plugin — no graph view needed.
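The two arrays mirror each other: if ProductCard lists Button under uses, Button lists ProductCard under usedBy. A minimal sketch of that inversion follows; `invertUses` is a hypothetical helper and the input shape is an assumption, not the analyzer's internal format.

```javascript
// Sketch: derive every unit's usedBy list by inverting the uses lists.
function invertUses(usesByUnit) {
  const usedBy = {};
  for (const unit of Object.keys(usesByUnit)) usedBy[unit] = [];
  for (const [unit, deps] of Object.entries(usesByUnit)) {
    for (const dep of deps) {
      if (!usedBy[dep]) usedBy[dep] = [];
      usedBy[dep].push(unit);
    }
  }
  // Sort for stable, readable output.
  for (const list of Object.values(usedBy)) list.sort();
  return usedBy;
}

const usesByUnit = {
  ProductList: ["ProductCard"],
  FeaturedProducts: ["ProductCard"],
  ProductCard: ["Badge", "Button"],
};
console.log(invertUses(usesByUnit).ProductCard);
// prints [ 'FeaturedProducts', 'ProductList' ]
```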
Find everything that renders a specific component:
LIST FROM "src" WHERE contains(uses, "Button")
Find everything a component depends on:
LIST uses WHERE file.name = "Button"
Full parent → child relationship table across all components:
TABLE uses, usedBy FROM "src" WHERE type = "component" SORT file.name ASC
Find the most-reused components (used by 3 or more others):
TABLE usedBy FROM "src"
WHERE type = "component" AND length(usedBy) >= 3
SORT length(usedBy) DESC
Traverse all descendants of a component (DataviewJS):
const root = dv.page("Button");
const seen = new Set();
const queue = [root];
const rows = [];
while (queue.length) {
  const p = queue.shift();
  if (!p || seen.has(p.file.name)) continue;
  seen.add(p.file.name);
  const children = p.file.outlinks
    .map(l => dv.page(l.path))
    .filter(x => x?.type === "component");
  rows.push([p.file.link, children.map(c => c.file.link)]);
  queue.push(...children);
}
dv.table(["Component", "Direct children"], rows);
Obsidian integration
The docs are plain Markdown with [[wiki-links]] and Mermaid diagrams — both
work in Obsidian with zero plugins.
Quickest setup: Open Obsidian → Open folder as vault → select your docs/ folder.
For teams: Commit docs/ to git. Every developer opens it as their vault after pulling.
Symlink (stays in sync automatically):
ln -s /path/to/project/docs "<your-vault>/my-project-docs"
Smart auto-detection
workDir is the only required parameter — everything else is detected automatically:
- Finds the nearest manifest file (package.json, go.mod, pyproject.toml, etc.)
- Identifies the framework from dependencies
- Locates the source root (src/, app/, lib/, etc.)
- Finds the entry file (App.tsx, main.go, manage.py, etc.)
- Derives the root unit name from the filename
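Two of the steps above, identifying the framework from dependencies and deriving the root unit name, can be sketched for a Node project as follows. The dependency-to-framework table is an illustrative subset and both function names are hypothetical; the package's real detection rules are not documented here.

```javascript
// Illustrative subset only. Order matters: Next.js projects also depend on react.
const FRAMEWORK_BY_DEPENDENCY = [
  ["next", "Next.js"],
  ["@nestjs/core", "NestJS"],
  ["express", "Express"],
  ["react", "React"],
];

// Return the first framework whose marker dependency is in package.json.
function detectFramework(pkg) {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const hit = FRAMEWORK_BY_DEPENDENCY.find(([dep]) =>
    Object.prototype.hasOwnProperty.call(deps, dep)
  );
  return hit ? hit[1] : "unknown";
}

// Derive the root unit name from the entry file, e.g. "src/App.tsx" -> "App".
function rootNameFromFile(filePath) {
  const base = filePath.split(/[\\/]/).pop();
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(0, dot) : base;
}

const samplePkg = { dependencies: { next: "14.0.0", react: "18.2.0" } };
console.log(detectFramework(samplePkg)); // prints Next.js
console.log(rootNameFromFile("/path/to/project/src/CustomEntry.tsx")); // prints CustomEntry
```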
Override detection when needed:
analyze_project({
  workDir: "/path/to/project",
  rootFile: "/path/to/project/src/CustomEntry.tsx",
  rootName: "CustomEntry"
})
License
MIT
