large-repo-mcp-server
v1.1.1
High-performance MCP server for large repositories using ripgrep
Large Repo MCP Server
Stdio-based Model Context Protocol server designed as a standard toolkit for large and complex repositories. Uses ripgrep for fast, bounded search across codebases of any size.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│  MCP Client (Codex CLI / Claude Desktop / Claude Code)          │
│                                                                 │
│  tools/call ──► JSON-RPC 2.0 request                            │
│                 Content-Length: N\r\n\r\n{...}                  │
└─────────────┬───────────────────────────────────▲───────────────┘
              │ stdin                             │ stdout
              ▼                                   │
┌─────────────────────────────────────────────────────────────────┐
│                      large-repo-mcp server                      │
│                                                                 │
│  ┌──────────┐    ┌────────────┐    ┌──────────────────────┐     │
│  │  Frame   │───►│  JSON-RPC  │───►│  Tool Dispatch       │     │
│  │  Parser  │    │  Router    │    │                      │     │
│  │          │    │            │    │  project_search_rg   │     │
│  │ Content- │    │ initialize │    │  symbol_search       │     │
│  │ Length   │    │ tools/list │    │  read_range          │     │
│  │ framing  │    │ tools/call │    │  list_files          │     │
│  └──────────┘    │ ping       │    └──────────┬───────────┘     │
│                  └────────────┘               │                 │
│                                               │                 │
│  ┌────────────────────────────────────────────▼──────────┐      │
│  │                    Security Layer                     │      │
│  │                                                       │      │
│  │  • Path confinement (resolve + realpath + root check) │      │
│  │  • Command allowlist (rg only, shell: false)          │      │
│  │  • Request size limit (1 MB)                          │      │
│  │  • Response size limit (200 KB)                       │      │
│  │  • Null-byte rejection                                │      │
│  └───────────────────────────┬───────────────────────────┘      │
│                              │                                  │
│                              ▼                                  │
│                     ┌─────────────────┐                         │
│                     │  rg (ripgrep)   │                         │
│                     │  subprocess     │                         │
│                     │                 │                         │
│                     │  • 15s timeout  │                         │
│                     │  • JSON output  │                         │
│                     │  • Streaming    │                         │
│                     └─────────────────┘                         │
│  stderr ──►                                                     │
│  structured JSON logs                                           │
└─────────────────────────────────────────────────────────────────┘
Request Flow
Client                       Server                          ripgrep
│                               │                               │
│  Content-Length: N\r\n\r\n    │                               │
│  {"jsonrpc":"2.0",            │                               │
│   "method":"tools/call",...}  │                               │
│──────────────────────────────►│                               │
│                               │  validate request size (≤1MB) │
│                               │  parse JSON-RPC frame         │
│                               │  validate tool + arguments    │
│                               │  resolve & confine paths      │
│                               │                               │
│                               │  spawn rg --json ...          │
│                               │──────────────────────────────►│
│                               │                               │
│                               │  stream results (line-by-line)│
│                               │◄──────────────────────────────│
│                               │                               │
│                               │  enforce match/byte limits    │
│                               │  kill rg if limit reached     │
│                               │                               │
│  Content-Length: M\r\n\r\n    │                               │
│  {"jsonrpc":"2.0",            │                               │
│   "result":{...}}             │                               │
│◄──────────────────────────────│                               │
│                               │                               │
Safety Guarantees
| Guarantee | Mechanism | Limit |
|-----------|-----------|-------|
| Fast, bounded search | ripgrep with match caps | 500 max matches |
| Response size cap | Byte-level output budget | 200 KB |
| Subprocess timeout | setTimeout + child.kill() | 15 seconds |
| Path confinement | path.resolve + realpath + root prefix check | repo root only |
| Symlink escape prevention | fs.realpath() resolves then re-validates | double-checked |
| Command execution | Allowlist (rg only) + shell: false | no shell injection |
| Request size limit | Frame-level rejection before parse | 1 MB |
| Input sanitization | Null-byte rejection, type validation | all tool inputs |
| Child process cleanup | Tracked set, killed on shutdown signals | SIGTERM/SIGINT/exit |
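The match cap and byte budget in the table can be pictured as a single accounting loop over streamed results. The sketch below is illustrative only: `collectWithinBudget`, `BudgetResult`, and the `truncateReason` labels are hypothetical names, and treating 200 KB as 200 × 1024 bytes is an assumption.

```typescript
// Hypothetical sketch of a match cap plus byte budget; not the server's actual code.
type TruncateReason = "max_matches" | "max_bytes"; // assumed labels

interface BudgetResult {
  items: string[];
  truncated: boolean;
  truncateReason?: TruncateReason;
}

function collectWithinBudget(
  items: Iterable<string>,
  maxMatches = 500, // documented upper bound
  maxBytes = 200 * 1024, // documented 200 KB cap, assumed to mean KiB
): BudgetResult {
  const encoder = new TextEncoder();
  const kept: string[] = [];
  let usedBytes = 0;
  for (const item of items) {
    // Stop before exceeding either limit; a streaming server could kill rg here.
    if (kept.length >= maxMatches) {
      return { items: kept, truncated: true, truncateReason: "max_matches" };
    }
    const size = encoder.encode(item).length; // UTF-8 byte length, not string length
    if (usedBytes + size > maxBytes) {
      return { items: kept, truncated: true, truncateReason: "max_bytes" };
    }
    kept.push(item);
    usedBytes += size;
  }
  return { items: kept, truncated: false };
}

const capped = collectWithinBudget(["a", "b", "c"], 2);
console.log(capped.truncated, capped.truncateReason);
```

Counting UTF-8 bytes rather than characters keeps the budget honest for non-ASCII source text.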
Supported Clients
- Codex CLI
- Claude Desktop
- Claude Code
- Any MCP client supporting stdio transport (protocol version 2024-11-05)
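For a client that speaks the stdio transport by hand, the Content-Length framing shown in the Architecture and Request Flow diagrams might look like the sketch below. `encodeFrame` and `decodeFrame` are hypothetical helper names, not part of this package, and the 1 MB check simply mirrors the documented request limit.

```typescript
// Hypothetical helpers illustrating Content-Length framing; not the server's real parser.
const MAX_BODY_BYTES = 1024 * 1024; // documented 1 MB request limit

// Wrap a JSON-RPC message in a Content-Length frame.
function encodeFrame(message: unknown): Uint8Array {
  const encoder = new TextEncoder();
  const body = encoder.encode(JSON.stringify(message));
  const header = encoder.encode(`Content-Length: ${body.length}\r\n\r\n`);
  const frame = new Uint8Array(header.length + body.length);
  frame.set(header, 0);
  frame.set(body, header.length);
  return frame;
}

// Decode one frame from the start of a buffer; returns null if more bytes are needed.
function decodeFrame(buf: Uint8Array): { message: unknown; consumed: number } | null {
  let headerEnd = -1;
  for (let i = 0; i + 3 < buf.length; i++) {
    // Look for the \r\n\r\n separator between headers and body.
    if (buf[i] === 13 && buf[i + 1] === 10 && buf[i + 2] === 13 && buf[i + 3] === 10) {
      headerEnd = i;
      break;
    }
  }
  if (headerEnd < 0) return null;
  const decoder = new TextDecoder();
  const header = decoder.decode(buf.subarray(0, headerEnd));
  const match = /^Content-Length:\s*(\d+)\s*$/im.exec(header);
  if (!match) throw new Error("Missing Content-Length header");
  const length = Number(match[1]);
  // Reject oversized bodies before parsing any JSON.
  if (length > MAX_BODY_BYTES) throw new Error(`Request exceeds ${MAX_BODY_BYTES} bytes`);
  const bodyStart = headerEnd + 4;
  if (buf.length < bodyStart + length) return null;
  const body = decoder.decode(buf.subarray(bodyStart, bodyStart + length));
  return { message: JSON.parse(body), consumed: bodyStart + length };
}

const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "list_files", arguments: { globs: ["src/**/*.ts"], maxResults: 100 } },
};
const frame = encodeFrame(request);
const decoded = decodeFrame(frame);
console.log(decoded?.consumed === frame.length);
```

The length is measured in bytes of the UTF-8 body, so the header stays correct when the payload contains non-ASCII characters.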
Requirements
- Node.js 18+
- ripgrep (rg) on PATH
  - symbol_search requires PCRE2 support (most official packages include it)
Verify ripgrep:
rg --version # should show version
rg --pcre2-version # should show PCRE2 version
Install ripgrep:
| Platform | Command |
|----------|---------|
| macOS | brew install ripgrep |
| Ubuntu/Debian | sudo apt-get install ripgrep |
| Fedora | sudo dnf install ripgrep |
| Windows | winget install BurntSushi.ripgrep.MSVC |
| Cargo | cargo install ripgrep --features pcre2 |
Install
npm install
npm run build
Run
npm start
Default repo root behavior:
- If REPO_ROOT is unset, repo root is current working directory (process.cwd()).
- Set REPO_ROOT to override explicitly.
PowerShell:
$env:REPO_ROOT = "C:\absolute\path\to\repo"
npm start
MCP Configuration
Codex (Global)
Add to ~/.codex/config.toml:
[mcp_servers.large_repo_mcp]
command = "node"
args = ["C:/absolute/path/to/mcp/dist/server.js"]
startup_timeout_sec = 30
[mcp_servers.large_repo_mcp.env]
REPO_ROOT = "."
Codex (Project-Local)
Add to .codex/config.toml in a project:
[mcp_servers.large_repo_mcp]
command = "node"
args = ["C:/absolute/path/to/mcp/dist/server.js"]
startup_timeout_sec = 30
[mcp_servers.large_repo_mcp.env]
REPO_ROOT = "."
Claude Desktop
Add to Claude Desktop MCP config:
{
  "mcpServers": {
    "large_repo_mcp": {
      "command": "node",
      "args": ["C:/absolute/path/to/mcp/dist/server.js"],
      "env": {
        "REPO_ROOT": "C:/absolute/path/to/target-repo"
      }
    }
  }
}
Claude Code
Add to .mcp.json in a project or ~/.claude/mcp.json globally:
{
  "mcpServers": {
    "large_repo_mcp": {
      "command": "node",
      "args": ["C:/absolute/path/to/mcp/dist/server.js"],
      "env": {
        "REPO_ROOT": "."
      }
    }
  }
}
Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| REPO_ROOT | process.cwd() | Repository root directory |
| LARGE_REPO_MCP_LOG_LEVEL | error | Log level: error, warn, info, debug |
| LARGE_REPO_MCP_DEBUG | — | Set to 1 to force debug logging |
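One way to read the table above is as a small configuration loader. This is a sketch under assumptions, not the server's implementation: `loadConfig` and `ServerConfig` are hypothetical names, and falling back to `error` for an unrecognized log level and accepting mixed-case levels are guesses.

```typescript
// Hypothetical config loader mirroring the environment variable table.
type LogLevel = "error" | "warn" | "info" | "debug";

interface ServerConfig {
  repoRoot: string;
  logLevel: LogLevel;
}

const LOG_LEVELS: readonly LogLevel[] = ["error", "warn", "info", "debug"];

function loadConfig(env: Record<string, string | undefined>, cwd: string): ServerConfig {
  // REPO_ROOT overrides the working directory when set to a non-empty value.
  const rawRoot = env["REPO_ROOT"];
  const repoRoot = rawRoot !== undefined && rawRoot.length > 0 ? rawRoot : cwd;

  // LARGE_REPO_MCP_DEBUG=1 forces debug logging regardless of the log level variable.
  if (env["LARGE_REPO_MCP_DEBUG"] === "1") {
    return { repoRoot, logLevel: "debug" };
  }

  // Assumption: unknown levels fall back to the documented default, "error".
  const requested = (env["LARGE_REPO_MCP_LOG_LEVEL"] ?? "error").toLowerCase();
  const logLevel = LOG_LEVELS.find((level) => level === requested) ?? "error";
  return { repoRoot, logLevel };
}

console.log(loadConfig({ LARGE_REPO_MCP_LOG_LEVEL: "info" }, "/work/repo"));
```

In a Node process this would be called as `loadConfig(process.env, process.cwd())`; a relative REPO_ROOT such as "." would still need resolving against the working directory.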
Tools
project_search_rg
Search the repository with ripgrep. Supports regex patterns and optional glob filters.
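Since the server runs rg with JSON output, each result line is a JSON event that has to be mapped to the match fields listed for this tool. A minimal sketch of that mapping follows, assuming the column is the 1-based offset of the first submatch; `parseRgLine` and the interface names are hypothetical, not the server's API.

```typescript
// Hypothetical mapping from one `rg --json` event line to a match record.
interface Submatch {
  text: string;
  start: number;
  end: number;
}

interface Match {
  path: string;
  line: number;
  column: number;
  text: string;
  submatches: Submatch[];
}

function parseRgLine(jsonLine: string): Match | null {
  const event = JSON.parse(jsonLine);
  // rg also emits "begin", "end", "context", and "summary" events; skip those.
  if (event.type !== "match") return null;
  const data = event.data;
  const submatches: Submatch[] = (data.submatches ?? []).map(
    (s: { match?: { text?: string }; start: number; end: number }) => ({
      text: s.match?.text ?? "",
      start: s.start,
      end: s.end,
    }),
  );
  const first = submatches[0];
  return {
    path: data.path?.text ?? "",
    line: data.line_number,
    // Assumption: column is derived from the first submatch offset, 1-based.
    column: first ? first.start + 1 : 1,
    text: String(data.lines?.text ?? "").replace(/\r?\n$/, ""),
    submatches,
  };
}

// Sample event built in code to avoid hand-escaping the embedded newline.
const sampleLine = JSON.stringify({
  type: "match",
  data: {
    path: { text: "src/server.ts" },
    lines: { text: "// TODO: tidy\n" },
    line_number: 12,
    absolute_offset: 340,
    submatches: [{ match: { text: "TODO" }, start: 3, end: 7 }],
  },
});
console.log(parseRgLine(sampleLine));
```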
┌──────────────────────────────────────────────────────┐
│                  project_search_rg                   │
│                                                      │
│  Input                         Output                │
│  ─────                         ──────                │
│  pattern (string, required)    matches[]             │
│  globs (string[], max 50)        .path               │
│  maxMatches (1-500, def 100)     .line               │
│                                  .column             │
│                                  .text               │
│                                  .submatches[]       │
│                                truncated             │
│                                truncateReason        │
│                                timedOut              │
│                                serverVersion         │
│                                durationMs            │
└──────────────────────────────────────────────────────┘
Example call:
{
  "name": "project_search_rg",
  "arguments": {
    "pattern": "TODO|FIXME",
    "globs": ["*.ts", "*.tsx"],
    "maxMatches": 50
  }
}
symbol_search
Search for an exact symbol using word-boundary matching (\b...\b). Auto-detects project type (TypeScript or Python) and scopes file types accordingly.
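Word-boundary matching amounts to escaping the symbol and wrapping it in `\b` anchors. The sketch below shows the idea with hypothetical helpers (`escapeRegex`, `buildSymbolPattern`); how the server actually builds its pattern is an assumption, and JavaScript's `RegExp` is used here only to demonstrate the behavior, not as a stand-in for ripgrep's PCRE2 engine.

```typescript
// Hypothetical helpers; the server's real pattern construction may differ.

// Escape regex metacharacters so the symbol is matched literally.
function escapeRegex(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap the escaped symbol in word boundaries: \bsymbol\b
function buildSymbolPattern(symbol: string): string {
  if (symbol.includes("\0")) {
    throw new Error("symbol must not contain null bytes");
  }
  return `\\b${escapeRegex(symbol)}\\b`;
}

const pattern = buildSymbolPattern("handleRequest");
const re = new RegExp(pattern);
console.log(re.test("await handleRequest(req)")); // whole-word occurrence
console.log(re.test("handleRequests(req)")); // longer identifier, should not match
```

Escaping matters for symbols such as `$scope`, where an unescaped metacharacter would change the meaning of the pattern.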
┌──────────────────────────────────────────────────────┐
│                    symbol_search                     │
│                                                      │
│  Input                         Output                │
│  ─────                         ──────                │
│  symbol (string, required)     projectType           │
│  maxMatches (1-500, def 100)   matchMode             │
│                                matches[]             │
│                                  .path               │
│  Detection logic:                .line               │
│  ┌────────────────────┐          .column             │
│  │ tsconfig.json?     │          .text               │
│  │   yes → typescript │          .submatches[]       │
│  │   no → check files │                              │
│  │   .py only → python│                              │
│  │   .ts only → ts    │                              │
│  │   fallback → ts    │                              │
│  └────────────────────┘                              │
└──────────────────────────────────────────────────────┘
Example call:
{
  "name": "symbol_search",
  "arguments": {
    "symbol": "handleRequest",
    "maxMatches": 20
  }
}
read_range
Read a specific line range from a file. Path must be relative and resolve inside the repo root.
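The "relative and inside the repo root" rule corresponds to the resolve + realpath + root check described under Safety Guarantees. A minimal sketch of such a check follows; `confinePath` and `assertInside` are hypothetical names and the error messages are illustrative, not the server's.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch of path confinement; not the server's actual code.
function assertInside(root: string, target: string): void {
  const rel = path.relative(root, target);
  // ".." segments (or an absolute result on another drive) mean the target left the root.
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error("path escapes the repo root");
  }
}

function confinePath(repoRoot: string, relativePath: string): string {
  if (relativePath.includes("\0")) throw new Error("path contains a null byte");
  if (path.isAbsolute(relativePath)) throw new Error("path must be relative");

  const root = fs.realpathSync(repoRoot);
  const candidate = path.resolve(root, relativePath);
  assertInside(root, candidate); // first check: catches ../ traversal

  const real = fs.realpathSync(candidate); // follows symlinks; throws if the path is missing
  assertInside(root, real); // second check: catches symlinks pointing outside the root
  return real;
}

console.log(confinePath(".", "."));
```

Checking twice is the point of the design: the first check rejects textual traversal cheaply, and the second re-validates after symlinks have been resolved.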
┌──────────────────────────────────────────────────────┐
│                      read_range                      │
│                                                      │
│  Input                         Output                │
│  ─────                         ──────                │
│  path (relative, required)     path (normalized)     │
│  startLine (int, required)     requestedLines        │
│  endLine (int, required)       returnedLines         │
│                                lines[]               │
│  Constraints:                    .line               │
│  • max 500 lines per call        .text               │
│  • path confined to repo root  truncated             │
│  • symlinks resolved + checked truncateReason        │
└──────────────────────────────────────────────────────┘
Example call:
{
  "name": "read_range",
  "arguments": {
    "path": "src/server.ts",
    "startLine": 1,
    "endLine": 50
  }
}
list_files
List repository files using ripgrep --files with optional glob filters. Excludes .git, node_modules, dist, build, and coverage by default.
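As a rough illustration of how default exclusions and caller globs could be combined into an rg invocation, the sketch below builds an argument list for `rg --files`. `buildListFilesArgs` is a hypothetical helper, and expressing the exclusions as negated `--glob` patterns placed after the caller's globs is an assumption about one workable approach, not a description of the server's internals.

```typescript
// Hypothetical argument builder for `rg --files`; the server may do this differently.
const DEFAULT_EXCLUDED_DIRS = [".git", "node_modules", "dist", "build", "coverage"];

function buildListFilesArgs(globs: string[] = []): string[] {
  if (globs.length > 50) throw new Error("at most 50 globs are allowed");
  const args = ["--files"];
  for (const glob of globs) {
    if (glob.includes("\0")) throw new Error("glob contains a null byte");
    args.push("--glob", glob);
  }
  // Exclusions go last: when several globs match, rg gives precedence to later ones.
  for (const dir of DEFAULT_EXCLUDED_DIRS) {
    args.push("--glob", `!**/${dir}/**`);
  }
  return args;
}

console.log(buildListFilesArgs(["src/**/*.ts"]).join(" "));
```

Passing the result as an argument array to a non-shell spawn keeps glob characters from being interpreted by a shell. maxResults would then be enforced while reading the output.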
┌──────────────────────────────────────────────────────┐
│                      list_files                      │
│                                                      │
│  Input                         Output                │
│  ─────                         ──────                │
│  globs (string[], max 50)      files[] (paths)       │
│  maxResults (1-500, def 500)   returned              │
│                                truncated             │
│  Auto-excluded dirs:           truncateReason        │
│    .git, node_modules, dist,   timedOut              │
│    build, coverage                                   │
└──────────────────────────────────────────────────────┘
Example call:
{
  "name": "list_files",
  "arguments": {
    "globs": ["src/**/*.ts"],
    "maxResults": 100
  }
}
Project Type Detection
The server auto-detects whether a repository is TypeScript or Python to scope symbol_search file types. Detection runs once per process and is cached.
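The decision tree charted in this section can be expressed as a pure function over a few repository signals. This is a sketch under assumptions: `RepoSignals`, `detectProjectType`, and `detectOnce` are hypothetical names, and resolving a tie in file counts to TypeScript is a guess made to match the documented "fallback → TS" rule.

```typescript
// Hypothetical sketch of the detection rules; not the server's actual code.
type ProjectType = "typescript" | "python";

interface RepoSignals {
  hasTsconfig: boolean;
  hasPythonProjectFile: boolean; // pyproject.toml, requirements.txt, setup.py, etc.
  hasPackageJson: boolean;
  tsFileCount: number;
  pyFileCount: number;
}

function detectProjectType(s: RepoSignals): ProjectType {
  if (s.hasTsconfig) return "typescript";
  if (s.hasPythonProjectFile) {
    if (!s.hasPackageJson) return "python";
    // Both ecosystems present: only .py files means Python, anything else falls back to TS.
    return s.pyFileCount > 0 && s.tsFileCount === 0 ? "python" : "typescript";
  }
  if (s.hasPackageJson) return "typescript";
  // No manifest at all: compare file counts (tie goes to TypeScript, an assumption).
  return s.pyFileCount > s.tsFileCount ? "python" : "typescript";
}

// Detection runs once per process; later calls reuse the cached answer.
let cachedType: ProjectType | undefined;
function detectOnce(loadSignals: () => RepoSignals): ProjectType {
  if (cachedType === undefined) cachedType = detectProjectType(loadSignals());
  return cachedType;
}

console.log(
  detectOnce(() => ({
    hasTsconfig: false,
    hasPythonProjectFile: true,
    hasPackageJson: false,
    tsFileCount: 0,
    pyFileCount: 12,
  })),
);
```

Because the result is cached for the life of the process, a repository whose layout changes would need a server restart to be re-detected.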
tsconfig.json exists?
├─ yes → TYPESCRIPT
└─ no  → Python project file exists? (pyproject.toml, requirements.txt, setup.py, etc.)
         ├─ yes → package.json exists?
         │        ├─ yes → Both: scan files
         │        │          .py only → PYTHON
         │        │          .ts only → TS
         │        │          fallback → TS
         │        └─ no  → PYTHON (no package.json, has Python project file)
         └─ no  → package.json exists?
                  ├─ yes → TYPESCRIPT
                  └─ no  → Count .ts vs .py files
                             more .py → PYTHON
                             more .ts → TS
Dev Workflow
npm run typecheck # type-check without emitting
npm test # build + run unit & integration tests
npm run lint # run ESLint
npm run format:check # check Prettier formatting
Troubleshooting
| Problem | Solution |
|---------|----------|
| rg not found | Install ripgrep and ensure it is on PATH |
| symbol_search PCRE2 error | Install ripgrep with PCRE2 support (most official packages include it) |
| Request exceeds 1048576 bytes | Split large requests; max inbound JSON-RPC frame body is 1 MB |
| truncated: true | Increase maxMatches/maxResults (up to bounds), narrow your query, or add globs |
| No results from symbol_search | Check projectType in response — detection may have picked the wrong language; use project_search_rg as a fallback |
License
MIT
