code-graph-context
v3.0.10
MCP server that builds code graphs to provide rich context to LLMs
Code Graph Context
Give your AI coding assistant a photographic memory of your codebase.
Code Graph Context is an MCP server that builds a semantic graph of your TypeScript codebase, enabling Claude to understand not just individual files, but how your entire system fits together.
Config-Driven & Extensible: Define custom framework schemas to capture domain-specific patterns beyond the included NestJS support. The parser is fully configurable to recognize your architectural patterns, decorators, and relationships.
┌─────────────────────────────────────────────────────────────┐
│ YOUR CODEBASE │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Service │ │Controller│ │ Module │ │ Entity │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
└───────┼─────────────┼─────────────┼─────────────┼──────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ CODE GRAPH CONTEXT │
│ │
│ AST Parser ──► Neo4j Graph ──► Vector Embeddings │
│ (ts-morph) (Relationships) (Local or OpenAI) │
│ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ CLAUDE CODE │
│ │
│ "What services depend on UserService?" │
│ "What's the blast radius if I change this function?" │
│ "Find all HTTP endpoints that accept a UserDTO" │
│ "Refactor this across all 47 files that use it" │
│ │
└─────────────────────────────────────────────────────────────┘
Why Code Graph Context?
| Without Code Graph | With Code Graph |
|---|---|
| Claude reads files one at a time | Claude understands the entire dependency tree |
| "What uses this?" requires manual searching | Instant impact analysis with risk scoring |
| Refactoring misses edge cases | Graph traversal finds every reference |
| Large codebases overwhelm context | Semantic search finds exactly what's relevant |
| Multi-file changes are error-prone | Swarm agents coordinate parallel changes |
Features
- Multi-Project Support: Parse and query multiple projects in a single database with complete isolation
- Semantic Search: Vector-based search using local or OpenAI embeddings to find relevant code
- Natural Language Querying: Convert questions into Cypher queries
- Framework-Aware: Built-in NestJS schema with ability to define custom framework patterns
- Weighted Graph Traversal: Intelligent traversal scoring paths by importance and relevance
- Workspace Support: Auto-detects Nx, Turborepo, pnpm, Yarn, and npm workspaces
- Parallel & Async Parsing: Multi-threaded parsing with Worker threads for large codebases
- Streaming Import: Chunked processing for projects with 100+ files
- Incremental Parsing: Only reparse changed files
- File Watching: Real-time graph updates on file changes
- Impact Analysis: Assess refactoring risk (LOW/MEDIUM/HIGH/CRITICAL)
- Dead Code Detection: Find unreferenced exports with confidence scoring
- Duplicate Detection: Structural (AST hash) and semantic (embedding similarity) duplicates
- Swarm Coordination: Multi-agent stigmergic coordination with pheromone decay
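To make the Duplicate Detection entry above more concrete, here is a minimal sketch of the two ideas it names: grouping by an identical structural hash, and comparing embeddings by cosine similarity. The `CodeItem` shape and both function names are hypothetical, and the README does not say how the AST hash is computed or which similarity threshold the tool applies, so neither is modeled here.

```typescript
// Hypothetical item shape: a structural hash of the AST plus an embedding vector.
interface CodeItem {
  name: string;
  astHash: string;
  embedding: number[];
}

// Structural duplicates: items whose AST hashes are identical.
function groupByAstHash(items: CodeItem[]): CodeItem[][] {
  const groups = new Map<string, CodeItem[]>();
  for (const item of items) {
    const group = groups.get(item.astHash) ?? [];
    group.push(item);
    groups.set(item.astHash, group);
  }
  // Only groups with more than one member are duplicates.
  return Array.from(groups.values()).filter((group) => group.length > 1);
}

// Semantic similarity: cosine of the angle between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

Hash equality is cheap and exact, so it can run first; embedding comparison is the fuzzier second pass for code that differs in text but not in intent.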
Architecture
TypeScript Source → AST Parser (ts-morph) → Neo4j Graph + Vector Embeddings → MCP Tools
Core Components:
- src/core/parsers/typescript-parser.ts - AST parsing with ts-morph
- src/storage/neo4j/neo4j.service.ts - Graph storage and queries
- src/core/embeddings/embeddings.service.ts - Embedding service (local sidecar or OpenAI)
- src/mcp/mcp.server.ts - MCP server and tool registration
Dual-Schema System:
- Core Schema: AST-level nodes (ClassDeclaration, MethodDeclaration, ImportDeclaration, etc.)
- Framework Schema: Semantic interpretation (NestController, NestService, HttpEndpoint, etc.)
Nodes have both coreType (AST) and semanticType (framework meaning), enabling queries like "find all controllers" while maintaining AST precision.
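As a rough illustration of the dual-schema idea, a node can be pictured as a record that carries both labels. The `GraphNode` shape and the `findBySemanticType` helper below are hypothetical names for this sketch, not the package's actual types; only `coreType` and `semanticType` come from the description above.

```typescript
// Hypothetical node shape: only coreType and semanticType come from the README.
interface GraphNode {
  id: string;
  name: string;
  coreType: string;      // AST-level type, e.g. "ClassDeclaration"
  semanticType?: string; // framework meaning, e.g. "NestController"
}

// Select nodes by framework meaning; the AST type stays available on each result.
function findBySemanticType(nodes: GraphNode[], semanticType: string): GraphNode[] {
  return nodes.filter((node) => node.semanticType === semanticType);
}

const sampleNodes: GraphNode[] = [
  { id: "n1", name: "UserController", coreType: "ClassDeclaration", semanticType: "NestController" },
  { id: "n2", name: "UserService", coreType: "ClassDeclaration", semanticType: "NestService" },
  { id: "n3", name: "formatDate", coreType: "FunctionDeclaration" },
];

// "Find all controllers" becomes a filter on the semantic label.
const controllers = findBySemanticType(sampleNodes, "NestController");
```

Both classes share the same `coreType`, so the semantic label is what distinguishes a controller from a service.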
Quick Start
Prerequisites
- Node.js >= 18
- Python >= 3.10 (for local embeddings)
- Docker (for Neo4j)
No API keys required. Local embeddings work out of the box using a Python sidecar.
1. Install
npm install -g code-graph-context
code-graph-context init # Sets up Neo4j + Python sidecar + downloads embedding model
The init command handles everything:
- Starts a Neo4j container via Docker
- Creates a Python virtual environment
- Installs embedding dependencies (PyTorch, sentence-transformers)
- Downloads the default embedding model (~3GB)
2. Configure Claude Code
claude mcp add --scope user code-graph-context -- code-graph-context
That's it. No API keys needed. Restart Claude Code and you're ready to go.
Want to use OpenAI instead? See Embedding Configuration below.
3. Parse Your Project
In Claude Code, say:
"Parse this project and build the code graph"
Claude will run parse_typescript_project and index your codebase.
Configuration Files
Claude Code stores MCP server configs in JSON files. The location depends on scope:
| Scope | File | Use Case |
|-------|------|----------|
| User (global) | ~/.claude.json | Available in all projects |
| Project | .mcp.json in project root | Shared with the team through version control |
| Local | ~/.claude.json (under the project's path) | Private to you, current project only |
Manual Configuration
If you prefer to edit the config files directly:
~/.claude.json (user scope - recommended):
{
  "mcpServers": {
    "code-graph-context": {
      "command": "code-graph-context"
    }
  }
}
With OpenAI (optional):
{
  "mcpServers": {
    "code-graph-context": {
      "command": "code-graph-context",
      "env": {
        "OPENAI_EMBEDDINGS_ENABLED": "true",
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}
From source installation:
{
  "mcpServers": {
    "code-graph-context": {
      "command": "node",
      "args": ["/absolute/path/to/code-graph-context/dist/cli/cli.js"]
    }
  }
}
Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| NEO4J_URI | No | bolt://localhost:7687 | Neo4j connection URI |
| NEO4J_USER | No | neo4j | Neo4j username |
| NEO4J_PASSWORD | No | PASSWORD | Neo4j password |
| EMBEDDING_MODEL | No | codesage/codesage-base-v2 | Local embedding model (see Embedding Configuration) |
| EMBEDDING_BATCH_SIZE | No | 8 | Texts per embedding batch (lower = less memory, higher = faster) |
| EMBEDDING_SIDECAR_PORT | No | 8787 | Port for local embedding server |
| EMBEDDING_DEVICE | No | auto (mps/cpu) | Device for embeddings. Auto-detects MPS on Apple Silicon |
| EMBEDDING_HALF_PRECISION | No | false | Set true for float16 (uses ~0.5x memory) |
| OPENAI_EMBEDDINGS_ENABLED | No | false | Set true to use OpenAI instead of local embeddings |
| OPENAI_API_KEY | No* | - | Required when OPENAI_EMBEDDINGS_ENABLED=true; also enables natural_language_to_cypher |
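The table above can be read as a set of defaults plus one conditional requirement. The sketch below shows that reading in code; `ResolvedConfig` and `resolveConfig` are hypothetical names, and the package's own configuration loading is not shown in this README, so treat this as an illustration of the defaults rather than the actual implementation.

```typescript
// Hypothetical config shape; variable names and defaults are taken from the table above.
interface ResolvedConfig {
  neo4jUri: string;
  neo4jUser: string;
  embeddingModel: string;
  embeddingBatchSize: number;
  sidecarPort: number;
  halfPrecision: boolean;
  openAiEnabled: boolean;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): ResolvedConfig {
  const openAiEnabled = env["OPENAI_EMBEDDINGS_ENABLED"] === "true";
  // The API key is only required once OpenAI embeddings are switched on.
  if (openAiEnabled && !env["OPENAI_API_KEY"]) {
    throw new Error("OPENAI_API_KEY is required when OPENAI_EMBEDDINGS_ENABLED=true");
  }
  return {
    neo4jUri: env["NEO4J_URI"] ?? "bolt://localhost:7687",
    neo4jUser: env["NEO4J_USER"] ?? "neo4j",
    embeddingModel: env["EMBEDDING_MODEL"] ?? "codesage/codesage-base-v2",
    embeddingBatchSize: Number(env["EMBEDDING_BATCH_SIZE"] ?? "8"),
    sidecarPort: Number(env["EMBEDDING_SIDECAR_PORT"] ?? "8787"),
    halfPrecision: env["EMBEDDING_HALF_PRECISION"] === "true",
    openAiEnabled,
  };
}

// With nothing set, every field falls back to the documented default.
const defaultConfig = resolveConfig({});
```

Passing the environment in as a plain object keeps the sketch independent of any runtime; in Node.js the argument would typically be `process.env`.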
Core Capabilities
Semantic Code Search
Find code by describing what you need, not by memorizing file paths:
"Find where user authentication tokens are validated"
"Show me the database connection pooling logic"
"What handles webhook signature verification?"
Impact Analysis
Before you refactor, understand the blast radius:
┌─────────────────────────────────────────────────────────────┐
│ Impact Analysis: UserService.findById() │
├─────────────────────────────────────────────────────────────┤
│ Risk Level: HIGH │
│ │
│ Direct Dependents (12): │
│ └── AuthController.login() │
│ └── ProfileController.getProfile() │
│ └── AdminService.getUserDetails() │
│ └── ... 9 more │
│ │
│ Transitive Dependents (34): │
│ └── 8 controllers, 15 services, 11 tests │
│ │
│ Affected Files: 23 │
│ Recommendation: Add deprecation warning before changing │
└─────────────────────────────────────────────────────────────┘
Graph Traversal
Explore relationships in any direction:
UserController
    │
    ├── INJECTS ──► UserService
    │                   │
    │                   ├── INJECTS ──► UserRepository
    │                   │                   │
    │                   │                   └── MANAGES ──► User (Entity)
    │                   │
    │                   └── INJECTS ──► CacheService
    │
    └── EXPOSES ──► POST /users
                        │
                        └── ACCEPTS ──► CreateUserDTO
Dead Code Detection
Find code that can be safely removed:
Dead Code Analysis: 47 items found
├── HIGH confidence (23): Exported but never imported
│   └── formatLegacyDate() in src/utils/date.ts:45
│   └── UserV1DTO in src/dto/legacy/user.dto.ts:12
│   └── ... 21 more
├── MEDIUM confidence (18): Private, never called
└── LOW confidence (6): May be used dynamically
Duplicate Code Detection
Identify DRY violations across your codebase:
Duplicate Groups Found: 8
Group 1 (Structural - 100% identical):
├── validateEmail() in src/auth/validation.ts:23
└── validateEmail() in src/user/validation.ts:45
Recommendation: Extract to shared utils
Group 2 (Semantic - 94% similar):
├── parseUserInput() in src/api/parser.ts:78
└── sanitizeInput() in src/webhook/parser.ts:34
Recommendation: Review for consolidation
Swarm Coordination
Execute complex, multi-file changes with parallel AI agents.
The swarm system enables multiple Claude agents to work on your codebase simultaneously, coordinating through the code graph without stepping on each other.
┌──────────────────┐
│ ORCHESTRATOR │
│ │
│ "Add JSDoc to │
│ all services" │
└────────┬─────────┘
│
┌─────────────┼─────────────┐
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Worker 1 │ │ Worker 2 │ │ Worker 3 │
│ │ │ │ │ │
│ Claiming │ │ Working │ │ Claiming │
│ AuthSvc │ │ UserSvc │ │ PaySvc │
└──────────┘ └──────────┘ └──────────┘
│ │ │
└─────────────┼─────────────┘
│
▼
┌─────────────────────────────┐
│ PHEROMONE TRAILS │
│ │
│ AuthService: [claimed] │
│ UserService: [modifying] │
│ PayService: [claimed] │
│ CacheService: [available] │
│ │
└─────────────────────────────┘
Two Coordination Mechanisms
1. Pheromone System (Stigmergic)
Agents leave markers on code nodes that decay over time—like ants leaving scent trails:
| Pheromone | Half-Life | Meaning |
|-----------|-----------|---------|
| exploring | 2 min | "I'm looking at this" |
| claiming | 1 hour | "This is my territory" |
| modifying | 10 min | "I'm actively changing this" |
| completed | 24 hours | "I finished work here" |
| warning | Never | "Don't touch this" |
| blocked | 5 min | "I'm stuck" |
Self-healing: If an agent crashes, its pheromones decay and the work becomes available again.
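One way to picture the self-healing behaviour is plain exponential decay driven by the half-lives in the table. The sketch below assumes that model; the function names and the `ACTIVE_THRESHOLD` cutoff are hypothetical, and the README does not state the exact formula or cutoff the tool uses.

```typescript
type PheromoneType = "exploring" | "claiming" | "modifying" | "completed" | "warning" | "blocked";

const MINUTE_MS = 60000;

// Half-lives from the table above; null means the marker never decays.
const HALF_LIFE_MS: Record<PheromoneType, number | null> = {
  exploring: 2 * MINUTE_MS,
  claiming: 60 * MINUTE_MS,
  modifying: 10 * MINUTE_MS,
  completed: 24 * 60 * MINUTE_MS,
  warning: null,
  blocked: 5 * MINUTE_MS,
};

// Exponential decay: the intensity halves once per half-life.
function currentIntensity(type: PheromoneType, initial: number, elapsedMs: number): number {
  const halfLife = HALF_LIFE_MS[type];
  if (halfLife === null) return initial;
  return initial * Math.pow(0.5, elapsedMs / halfLife);
}

// Hypothetical cutoff below which a marker is treated as gone.
const ACTIVE_THRESHOLD = 0.1;

function isActive(type: PheromoneType, initial: number, elapsedMs: number): boolean {
  return currentIntensity(type, initial, elapsedMs) >= ACTIVE_THRESHOLD;
}
```

Under these assumptions, a `modifying` marker left by a crashed agent is at half strength after 10 minutes and falls below the cutoff after four half-lives (40 minutes), at which point other agents would see the node as free. A `warning` marker stays at full strength until it is removed explicitly.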
2. Task Queue (Blackboard)
Explicit task management with dependencies:
┌─────────────────────────────────────────────────────────────┐
│ TASK QUEUE │
├─────────────────────────────────────────────────────────────┤
│ [available] Add JSDoc to UserService priority: high │
│ [claimed] Add JSDoc to AuthService agent: worker1 │
│ [blocked] Update API docs ─────────────────► depends on ──┤
│ [in_progress] Add JSDoc to PaymentService agent: worker2 │
│ [completed] Add JSDoc to CacheService ✓ │
└─────────────────────────────────────────────────────────────┘
Swarm Tools
| Tool | Purpose |
|------|---------|
| swarm_post_task | Add a task to the queue |
| swarm_get_tasks | Query tasks with filters |
| swarm_claim_task | Claim/start/release a task |
| swarm_complete_task | Complete/fail/request review |
| swarm_pheromone | Leave a marker on a code node |
| swarm_sense | Query what other agents are doing |
| swarm_cleanup | Remove pheromones after completion |
Example: Parallel Refactoring
// Orchestrator decomposes the task and creates individual work items
swarm_post_task({
  projectId: "backend",
  swarmId: "swarm_rename_user",
  title: "Update UserService.findUserById",
  description: "Rename getUserById to findUserById in UserService",
  type: "refactor",
  createdBy: "orchestrator"
})
// Workers claim and execute tasks
swarm_claim_task({ projectId: "backend", swarmId: "swarm_rename_user", agentId: "worker_1" })
// ... do work ...
swarm_complete_task({ taskId: "task_1", agentId: "worker_1", action: "complete", summary: "Renamed method" })
Install the Swarm Skill
For optimal swarm execution, install the included Claude Code skill that teaches agents the coordination protocol:
# Copy to your global skills directory
mkdir -p ~/.claude/skills
cp -r skills/swarm ~/.claude/skills/
Or for a specific project:
mkdir -p .claude/skills
cp -r skills/swarm .claude/skills/
The skill provides:
- Worker agent protocol with step-by-step workflow
- Multi-phase orchestration patterns (discovery, contracts, implementation, validation)
- Common failure modes and how to prevent them
- Complete tool reference
Once installed, just say "swarm" or "parallel agents" and Claude will use the skill automatically.
See skills/swarm/SKILL.md for the full documentation.
All Tools
| Tool | Description |
|------|-------------|
| Discovery | |
| list_projects | List parsed projects in the database |
| search_codebase | Semantic search using vector embeddings |
| traverse_from_node | Explore relationships from a node |
| natural_language_to_cypher | Convert questions to Cypher queries |
| Analysis | |
| impact_analysis | Assess refactoring risk (LOW/MEDIUM/HIGH/CRITICAL) |
| detect_dead_code | Find unreferenced exports and methods |
| detect_duplicate_code | Find structural and semantic duplicates |
| Parsing | |
| parse_typescript_project | Build the graph from source |
| check_parse_status | Monitor async parsing jobs |
| start_watch_project | Auto-update graph on file changes |
| stop_watch_project | Stop file watching |
| list_watchers | List active file watchers |
| Swarm | |
| swarm_post_task | Add task to the queue |
| swarm_get_tasks | Query tasks |
| swarm_claim_task | Claim/start/release tasks |
| swarm_complete_task | Complete/fail/review tasks |
| swarm_pheromone | Leave coordination markers |
| swarm_sense | Query what others are doing |
| swarm_cleanup | Clean up after swarm completion |
| Utility | |
| test_neo4j_connection | Verify database connectivity |
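For intuition about what impact_analysis reports, the direct and transitive dependents of a node can be found with a breadth-first walk over reverse dependency edges. The sketch below is a simplified in-memory version; the `DependentsMap` input and `collectDependents` are hypothetical, and the real tool queries Neo4j and also assigns a risk level, which is not modeled here.

```typescript
// Maps each node to the nodes that depend on it (reverse edges). Hypothetical input shape.
type DependentsMap = Map<string, string[]>;

interface DependentsResult {
  direct: string[];     // nodes that reference the start node directly
  transitive: string[]; // nodes reached only through other dependents
}

function collectDependents(start: string, dependents: DependentsMap): DependentsResult {
  const direct = dependents.get(start) ?? [];
  // The seen set prevents revisiting nodes, so cycles terminate.
  const seen = new Set<string>([start, ...direct]);
  const queue: string[] = [...direct];
  const transitive: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of dependents.get(current) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        transitive.push(next);
        queue.push(next);
      }
    }
  }
  return { direct: [...direct], transitive };
}

// Small example graph with a deliberate cycle back to the start node.
const exampleGraph: DependentsMap = new Map([
  ["UserService.findById", ["AuthController.login", "AdminService.getUserDetails"]],
  ["AdminService.getUserDetails", ["AdminController.show"]],
  ["AdminController.show", ["UserService.findById"]],
]);

const impact = collectDependents("UserService.findById", exampleGraph);
```

The counts of `direct` and `transitive` correspond to the "Direct Dependents" and "Transitive Dependents" figures in the Impact Analysis example earlier in this document.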
Tool Workflow Patterns
Pattern 1: Discovery → Focus → Deep Dive
list_projects → search_codebase → traverse_from_node → traverse (with skip for pagination)
Pattern 2: Pre-Refactoring Safety
search_codebase("function to change") → impact_analysis(nodeId) → review risk level
Pattern 3: Code Health Audit
detect_dead_code → detect_duplicate_code → prioritize cleanup
Pattern 4: Multi-Agent Work
swarm_post_task → swarm_claim_task → swarm_complete_task → swarm_get_tasks(includeStats) → swarm_cleanup
Multi-Project Support
All query tools require projectId for isolation. You can use:
- Project ID: proj_a1b2c3d4e5f6 (auto-generated)
- Project name: my-backend (from package.json)
- Project path: /path/to/project (resolved automatically)
// These all work:
search_codebase({ projectId: "my-backend", query: "auth" })
search_codebase({ projectId: "proj_a1b2c3d4e5f6", query: "auth" })
search_codebase({ projectId: "/path/to/my-backend", query: "auth" })
Framework Support
NestJS (Built-in)
Deep understanding of NestJS patterns:
- Controllers with route analysis (@Controller, @Get, @Post, etc.)
- Services with dependency injection mapping (@Injectable)
- Modules with import/export relationships (@Module)
- Guards, Pipes, Interceptors as middleware chains
- DTOs with validation decorators (@IsString, @IsEmail, etc.)
- Entities with TypeORM relationship mapping
NestJS-Specific Relationships:
- INJECTS - Dependency injection
- EXPOSES - Controller exposes HTTP endpoint
- MODULE_IMPORTS, MODULE_PROVIDES, MODULE_EXPORTS - Module system
- GUARDED_BY, TRANSFORMED_BY, INTERCEPTED_BY - Middleware
Custom Framework Schemas
The parser is config-driven. Define your own framework patterns:
// Example: Custom React schema
const REACT_SCHEMA = {
  name: 'react',
  decoratorPatterns: [
    { pattern: /^use[A-Z]/, semanticType: 'ReactHook' },
    { pattern: /^with[A-Z]/, semanticType: 'HOC' },
  ],
  nodeTypes: [
    { coreType: 'FunctionDeclaration', condition: (node) => node.name?.endsWith('Provider'), semanticType: 'ContextProvider' },
  ],
  relationships: [
    { type: 'PROVIDES_CONTEXT', from: 'ContextProvider', to: 'ReactHook' },
  ]
};
The dual-schema system means every node has:
- coreType: AST-level (ClassDeclaration, FunctionDeclaration)
- semanticType: Framework meaning (NestController, ReactHook)
This enables queries like "find all hooks that use context" while maintaining AST precision for refactoring.
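To show how pattern entries like the ones in the React example might be applied, the sketch below matches a name against each pattern in order and returns the first semantic type that fits. `NamePattern` and `classifyName` are hypothetical names for illustration; the parser's real matching logic is not described in this README.

```typescript
interface NamePattern {
  pattern: RegExp;
  semanticType: string;
}

// Same patterns as the React schema example above.
const reactPatterns: NamePattern[] = [
  { pattern: /^use[A-Z]/, semanticType: "ReactHook" },
  { pattern: /^with[A-Z]/, semanticType: "HOC" },
];

// Return the semanticType of the first matching pattern, or undefined if none match.
function classifyName(name: string, rules: NamePattern[]): string | undefined {
  return rules.find((rule) => rule.pattern.test(name))?.semanticType;
}

const hookType = classifyName("useAuth", reactPatterns);
const hocType = classifyName("withRouter", reactPatterns);
// "user" starts with "use" but is not followed by a capital letter, so it stays unclassified.
const plainType = classifyName("user", reactPatterns);
```

A node that matches no pattern keeps only its `coreType`, which is why the semantic label is best treated as optional.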
Embedding Configuration
Local embeddings are the default — no API key needed. The Python sidecar starts automatically on first use and runs a local model for high-quality code embeddings.
The sidecar uses MPS (Apple Silicon GPU) when available, falling back to CPU. It auto-shuts down after 3 minutes of inactivity to free memory, and restarts lazily when needed (~15-20s).
Device override: Set EMBEDDING_DEVICE=cpu to force CPU if MPS causes issues.
Half precision: Set EMBEDDING_HALF_PRECISION=true to load the model in float16, roughly halving memory usage.
Available Models
Set via the EMBEDDING_MODEL environment variable:
| Model | Dimensions | RAM (fp16) | Quality | Best For |
|-------|-----------|-----|---------|----------|
| codesage/codesage-base-v2 (default) | 1024 | ~700 MB | Best | Default, code-specific encoder, fast |
| Qodo/Qodo-Embed-1-1.5B | 1536 | ~4.5 GB | Great | Machines with 32+ GB RAM |
| BAAI/bge-base-en-v1.5 | 768 | ~250 MB | Good | General purpose, low RAM |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | ~100 MB | OK | Minimal RAM, fast |
| nomic-ai/nomic-embed-text-v1.5 | 768 | ~300 MB | Good | Code + prose mixed |
| sentence-transformers/all-mpnet-base-v2 | 768 | ~250 MB | Good | Balanced quality/speed |
| BAAI/bge-small-en-v1.5 | 384 | ~65 MB | OK | Smallest footprint |
Example: Use a lightweight model on a low-memory machine:
claude mcp add --scope user code-graph-context \
-e EMBEDDING_MODEL=BAAI/bge-base-en-v1.5 \
-- code-graph-context
Switching Models
Switching models requires re-parsing — vector index dimensions are locked per model. Drop existing indexes first:
DROP INDEX embedded_nodes_idx IF EXISTS;
DROP INDEX session_notes_idx IF EXISTS;
Then re-parse your project with the new model configured.
Using OpenAI Instead
If you prefer OpenAI embeddings (higher quality, requires API key):
claude mcp add --scope user code-graph-context \
-e OPENAI_EMBEDDINGS_ENABLED=true \
-e OPENAI_API_KEY=sk-your-key-here \
-- code-graph-context
Troubleshooting
MCP Server Not Connecting
# Check the server is registered
claude mcp list
# Verify Neo4j is running
docker ps | grep neo4j
# Test manually
code-graph-context status
Embedding Errors
"Failed to generate embedding" — The local sidecar may not have started. Check:
# Verify Python deps are installed
code-graph-context status
# Re-run init to fix sidecar setup
code-graph-context init
Out of memory (large model on 16GB machine) - Switch to a lighter model:
claude mcp add --scope user code-graph-context \
-e EMBEDDING_MODEL=BAAI/bge-base-en-v1.5 \
-- code-graph-context
Using OpenAI and getting auth errors - Ensure your key is configured:
claude mcp remove code-graph-context
claude mcp add --scope user code-graph-context \
-e OPENAI_EMBEDDINGS_ENABLED=true \
-e OPENAI_API_KEY=sk-your-key-here \
-- code-graph-context
Neo4j Memory Issues
For large codebases, increase memory limits:
# Stop and recreate with more memory
code-graph-context stop
code-graph-context init --memory 4G
Parsing Timeouts
Use async mode for large projects:
parse_typescript_project({
  projectPath: "/path/to/project",
  tsconfigPath: "/path/to/project/tsconfig.json",
  async: true // Returns immediately, poll with check_parse_status
})
CLI Commands
code-graph-context init [options] # Set up Neo4j + Python sidecar + embedding model
code-graph-context status # Check Docker/Neo4j/sidecar status
code-graph-context stop # Stop Neo4j container
Init options:
- -p, --port <port> - Bolt port (default: 7687)
- --http-port <port> - Browser port (default: 7474)
- --password <password> - Neo4j password (default: PASSWORD)
- -m, --memory <size> - Heap memory (default: 2G)
- -f, --force - Recreate container
Contributing
git clone https://github.com/andrew-hernandez-paragon/code-graph-context.git
cd code-graph-context
npm install
npm run build
npm run dev # Watch mode
Conventional Commits: feat|fix|docs|refactor(scope): description
License
MIT - see LICENSE
