code-context-gate

v0.1.1

Published

4 days ago

Context-aware code retrieval broker MCP server — minimal, ranked, backpressure-enforced results for coding agents

wisdomrock

mcp model-context-protocol code-retrieval ai-agent context-gate coding-assistant

code-context-gate

An MCP server that acts as a context-aware code retrieval broker for AI coding agents. It sits between the agent and your codebase, enforcing backpressure so the agent never receives more context than it can use.

Why use it?

Without a gate, AI agents call raw grep/find/cat on source files. On a large codebase that means:

- A single broad search returns 800 lines from 60 files, flooding the context window
- Irrelevant comments, tests, and vendored code push the signal out of the model's attention
- There is no token budget, no relevance ranking, and no guidance on how to narrow a bad query

code-context-gate solves this with four disciplined tools that rank, gate, and budget every result before it reaches the agent. Over-broad queries get a structured refine refusal with a per-directory hit distribution and actionable suggestions, so the agent knows how to narrow its next call.

| Scenario | Without gate | With gate |
|---|---|---|
| Broad search ("auth") | 800 hits across 60 files, full context dump | Refusal with distribution; agent narrows to src/auth/ in one round-trip |
| Large codebase (500+ files) | Agent reads several wrong files before finding the right one | locate returns the 3 most relevant paths |
| Cross-file relationships | Must read every candidate file to infer callers | explore returns the call chain directly |
| Path traversal | No protection | resolveSafe() hard-blocks any path outside project root |

Relationship to codebase-memory-mcp

code-context-gate can connect to a graph backend (codebase-memory-mcp or codegraph) for semantic symbol lookup and architecture overviews. But the gate's core value — ranking, backpressure, token budgeting, block expansion — is its own, independent of any graph backend. Even with --graph-backend none, it outperforms raw file access on any codebase where result volume is a concern.

codebase-memory-mcp     ← semantic understanding (what code MEANS)
        ↑ consumed by
code-context-gate       ← budget discipline (HOW MUCH reaches the agent)
        ↑ consumed by
AI agent

How it works

Every tool call flows through the same pipeline:

Tool call (locate / read / explore / search)
  └─ Dispatcher         — routes by query shape (symbol → graph, phrase → merge, pattern → fileio)
       ├─ Graph adapter  — codebase-memory-mcp or codegraph (optional)
       └─ FileIO adapter — async line-by-line file scan with gitignore support
            └─ Shaper   — BM25 lexical scoring + kind weighting → ranked ResultItem[]
                 └─ Gatekeeper — enforces maxFiles / maxHits / maxTokens / minRelevance
                      └─ OkEnvelope  (results fit budget)
                      └─ RefineEnvelope  (too broad — includes distribution + suggestions)

Query shape routing in the dispatcher:

| Query shape | Example | Strategy |
|---|---|---|
| Symbol (single word) | OrderHandler | Graph first → FileIO fallback on miss |
| Phrase (has spaces) | payment retry logic | FileIO + graph merged, deduped by file:line |
| Pattern (regex chars) | retry.*count | FileIO regex scan only |
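The routing table above could be sketched roughly as follows. This is an illustration only: `classifyQuery`, `mergeDedupe`, and the `Hit` type are hypothetical names, and the exact set of characters that marks a query as a regex is an assumption (a plain `.` is deliberately excluded so dotted symbol names are not treated as patterns).

```typescript
// Hypothetical names; the real dispatcher may classify queries differently.
type QueryShape = "symbol" | "phrase" | "pattern";

// Assumed regex indicators. "." is left out on purpose (see lead-in).
const REGEX_CHARS = /[*+?^${}()|[\]\\]/;

function classifyQuery(query: string): QueryShape {
  const q = query.trim();
  if (REGEX_CHARS.test(q)) return "pattern"; // e.g. retry.*count
  if (/\s/.test(q)) return "phrase"; // e.g. payment retry logic
  return "symbol"; // e.g. OrderHandler
}

interface Hit {
  file: string;
  line: number;
  source: "fileio" | "graph";
}

// Merge FileIO and graph hits, keeping the first hit seen per file:line key.
function mergeDedupe(fileioHits: Hit[], graphHits: Hit[]): Hit[] {
  const seen: Record<string, boolean> = {};
  const merged: Hit[] = [];
  for (const hit of fileioHits.concat(graphHits)) {
    const key = `${hit.file}:${hit.line}`;
    if (!seen[key]) {
      seen[key] = true;
      merged.push(hit);
    }
  }
  return merged;
}
```

Checking for regex characters before whitespace means a query such as `foo bar.*` is treated as a pattern; whether the real dispatcher orders the checks this way is not stated here.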

Installation

npx code-context-gate --project-root /path/to/your/project

Or install globally:

npm install -g code-context-gate
code-context-gate --project-root .

MCP Configuration

Add to your MCP client config (e.g. Claude Desktop claude_desktop_config.json or Claude Code .mcp.json):

{
  "mcpServers": {
    "code-context-gate": {
      "command": "npx",
      "args": ["code-context-gate", "--project-root", "/path/to/project"]
    }
  }
}

With a graph backend

{
  "mcpServers": {
    "code-context-gate": {
      "command": "npx",
      "args": [
        "code-context-gate",
        "--project-root", "/path/to/project",
        "--graph-backend", "codebase-memory-mcp"
      ]
    }
  }
}

CLI flags

| Flag | Default | Description |
|---|---|---|
| --project-root | (see below) | Root directory to search within |
| --config | (none) | Path to a JSON config file |
| --graph-backend | codebase-memory-mcp | codebase-memory-mcp, codegraph, or none |
| --debug | false | Write structured JSON logs to stderr |

Project root resolution order

1. --project-root CLI flag
2. CODE_CONTEXT_GATE_ROOT environment variable
3. process.cwd(), the working directory of the process that launched the server

For stdio-mode MCP clients (Claude Code CLI, Cursor, etc.) option 3 already points at the correct project root. For GUI clients (Claude Desktop) set the env var in your MCP config:

{
  "mcpServers": {
    "code-context-gate": {
      "command": "npx",
      "args": ["code-context-gate"],
      "env": { "CODE_CONTEXT_GATE_ROOT": "/path/to/your/project" }
    }
  }
}
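The resolution order described above can be sketched as a small pure function. `resolveProjectRoot` and `RootInputs` are hypothetical names used for illustration; the package's own loader in config.ts may be structured differently.

```typescript
// Hypothetical helper illustrating flag → env var → CWD precedence.
interface RootInputs {
  flag?: string; // value of --project-root, if given
  env: Record<string, string | undefined>; // e.g. process.env
  cwd: string; // e.g. process.cwd()
}

interface ResolvedRoot {
  root: string;
  source: "flag" | "env" | "cwd";
}

function resolveProjectRoot({ flag, env, cwd }: RootInputs): ResolvedRoot {
  if (flag) return { root: flag, source: "flag" };
  const fromEnv = env["CODE_CONTEXT_GATE_ROOT"];
  if (fromEnv) return { root: fromEnv, source: "env" };
  return { root: cwd, source: "cwd" };
}
```

Passing the environment and working directory in as arguments keeps the sketch easy to test; a real entry point would presumably read them from the process.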

Tools

`locate` — find where something lives

Returns file paths and line ranges only — no source bodies. Use first, before read.

{ "query": "OrderHandler", "kind": "definition", "scope": "src/" }

| Parameter | Type | Description |
|---|---|---|
| query | string | Symbol name or concept |
| kind | definition \| reference \| any | Filter by result kind (default: any) |
| scope | string | Subdirectory to restrict the search |
| max_results | number | Override the default hit cap |

`read` — get the actual source

Reads a symbol by name or an explicit file:start-end range. Block-expands to enclosing function/class boundaries. Size-bounded so a single call never dumps an entire file.

{ "target": "OrderHandler.handle" }
{ "target": "src/payments/OrderHandler.ts:142-179", "expand": false }
{ "target": "src/config.ts" }

| Parameter | Type | Description |
|---|---|---|
| target | string | Symbol name or file:start[-end] |
| expand | boolean | Expand to enclosing block (default: true) |
| max_tokens | number | Token budget for the response |
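To illustrate the two `target` forms, here is a minimal parsing sketch. `parseTarget` and `ReadTarget` are hypothetical, and the rule used to tell a bare file path from a symbol name (the presence of a `/`) is an assumption made for this example, not documented behavior.

```typescript
// Hypothetical parser for the read tool's target string.
type ReadTarget =
  | { kind: "symbol"; name: string }
  | { kind: "file"; path: string; start?: number; end?: number };

function parseTarget(target: string): ReadTarget {
  // file:start or file:start-end
  const range = /^(.+):(\d+)(?:-(\d+))?$/.exec(target);
  if (range) {
    return {
      kind: "file",
      path: range[1],
      start: Number(range[2]),
      end: range[3] ? Number(range[3]) : undefined,
    };
  }
  // Assumption: anything containing a path separator is a file path.
  if (target.indexOf("/") !== -1) return { kind: "file", path: target };
  return { kind: "symbol", name: target };
}
```

Under this heuristic a bare filename such as `config.ts` would be read as a symbol, so a real implementation would likely need a stronger check.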

`search` — keyword or regex search

The sanctioned replacement for grep. Always ranked, gated, and token-budgeted.

{ "pattern": "PathDeniedError", "scope": "src/" }
{ "pattern": "export.*function", "regex": true, "max_hits": 20 }

| Parameter | Type | Description |
|---|---|---|
| pattern | string | Keyword or regex pattern |
| regex | boolean | Treat pattern as regex (default: false) |
| scope | string | Subdirectory to restrict the search |
| exclude | string[] | Subdirectories to skip |
| max_hits | number | Override the default hit cap |
| max_tokens | number | Token budget for snippets |
| force | boolean | Bypass gating (use sparingly) |

`explore` — understand structure and relationships

Answers structural questions: callers, callees, dependencies, blast radius, architecture overview. Requires a graph backend for full results; falls back to a file scan summary when no graph is available.

{ "question": "what calls resolveSafe", "anchor": "resolveSafe", "relation": "callers" }
{ "question": "overview of the payments module", "relation": "overview" }
{ "question": "what would break if I change UserSession", "relation": "impact" }

| Parameter | Type | Description |
|---|---|---|
| question | string | Natural-language structural question |
| anchor | string | Symbol to anchor the query |
| relation | callers \| callees \| dependencies \| impact \| overview \| auto | Relationship type (default: auto) |

Backpressure

When a query is too broad, tools return a refine envelope instead of overwhelming the agent:

{
  "status": "refine",
  "reason": "breadth",
  "summary": { "totalFound": 340, "files": 48, "estTokens": 28000, "topScore": 0.21 },
  "distribution": [
    { "dir": "src/auth", "hits": 120 },
    { "dir": "src/core", "hits": 80 }
  ],
  "suggestions": [
    { "action": "narrow_scope", "arg": "src/auth", "wouldYield": 120 },
    { "action": "use_force", "hint": "Set force:true to bypass gating" }
  ],
  "refusalCount": 1
}
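The `distribution` and `narrow_scope` entries in the envelope above could be derived from raw hits along these lines. This is a sketch under assumptions: `buildDistribution`, `suggestNarrowing`, and the `FileHit` shape are hypothetical, and grouping by the immediate parent directory is a guess at how the per-directory counts are formed.

```typescript
// Hypothetical helpers for building the per-directory hit distribution.
interface FileHit {
  file: string;
}

interface DirCount {
  dir: string;
  hits: number;
}

function dirOf(file: string): string {
  const i = file.lastIndexOf("/");
  return i === -1 ? "." : file.slice(0, i);
}

// Count hits per directory and sort by hit count, highest first.
function buildDistribution(hits: FileHit[]): DirCount[] {
  const counts: Record<string, number> = {};
  for (const h of hits) {
    const dir = dirOf(h.file);
    counts[dir] = (counts[dir] || 0) + 1;
  }
  return Object.keys(counts)
    .map((dir) => ({ dir, hits: counts[dir] }))
    .sort((a, b) => b.hits - a.hits);
}

// Turn the busiest directories into narrow_scope suggestions.
function suggestNarrowing(distribution: DirCount[], top = 2) {
  return distribution
    .slice(0, top)
    .map((d) => ({ action: "narrow_scope", arg: d.dir, wouldYield: d.hits }));
}
```

An agent receiving such an envelope can retry with `scope` set to one of the suggested `arg` values.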

Refusal reasons:

| Reason | Condition |
|---|---|
| breadth | Unique files > maxFiles (default 25) |
| volume | Hit count > maxHits (120) or estimated tokens > absoluteMaxTokens (24000) |
| relevance | Top BM25 score below minRelevance (0.35) |
| compound | Two or more of the above |

Deadlock cap: after maxRefusals (default 3) consecutive refusals, the gate opens automatically.
Compound margin: single-axis near-misses within 15% of a threshold pass through without refusal.
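The refusal rules, deadlock cap, and compound margin can be combined into one decision function. The sketch below is one possible reading of those rules, not the gatekeeper's actual code: `decide`, `Summary`, and `GateLimits` are hypothetical names, and the way the 15% margin is applied (multiplying each threshold by 1.15, or by 0.85 for relevance) is an assumption.

```typescript
// A sketch only: names and near-miss handling are assumptions.
type Reason = "breadth" | "volume" | "relevance" | "compound";

interface Summary {
  totalFound: number;
  files: number;
  estTokens: number;
  topScore: number;
}

interface GateLimits {
  maxFiles: number;
  maxHits: number;
  absoluteMaxTokens: number;
  minRelevance: number;
  compoundMargin: number;
  maxRefusals: number;
}

// Defaults taken from the tables and config schema in this README.
const DEFAULT_LIMITS: GateLimits = {
  maxFiles: 25,
  maxHits: 120,
  absoluteMaxTokens: 24000,
  minRelevance: 0.35,
  compoundMargin: 0.15,
  maxRefusals: 3,
};

interface Decision {
  pass: boolean;
  reason: Reason | null;
}

function decide(s: Summary, priorRefusals: number, limits: GateLimits = DEFAULT_LIMITS): Decision {
  // Deadlock cap: after maxRefusals consecutive refusals the gate opens.
  if (priorRefusals >= limits.maxRefusals) return { pass: true, reason: null };

  const hi = 1 + limits.compoundMargin;
  const lo = 1 - limits.compoundMargin;
  const axes: { reason: Reason; breached: boolean; nearMiss: boolean }[] = [
    {
      reason: "breadth",
      breached: s.files > limits.maxFiles,
      nearMiss: s.files <= limits.maxFiles * hi,
    },
    {
      reason: "volume",
      breached: s.totalFound > limits.maxHits || s.estTokens > limits.absoluteMaxTokens,
      nearMiss: s.totalFound <= limits.maxHits * hi && s.estTokens <= limits.absoluteMaxTokens * hi,
    },
    {
      reason: "relevance",
      breached: s.topScore < limits.minRelevance,
      nearMiss: s.topScore >= limits.minRelevance * lo,
    },
  ];

  const failed = axes.filter((a) => a.breached);
  if (failed.length === 0) return { pass: true, reason: null };
  if (failed.length >= 2) return { pass: false, reason: "compound" };

  // Single-axis breach: let near-misses through, refuse the rest.
  const only = failed[0];
  return only.nearMiss ? { pass: true, reason: null } : { pass: false, reason: only.reason };
}
```

With the defaults, a result set spanning 27 files passes as a near-miss (27 is within 15% of 25), while 48 files is refused with reason `breadth`.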

Config file

Place a JSON file anywhere and pass it with --config:

{
  "gates": {
    "maxFiles": 25,
    "maxHits": 120,
    "maxTokens": 6000,
    "minRelevance": 0.35
  },
  "scan": {
    "excludeDirs": ["node_modules", "dist", ".git"],
    "respectGitignore": true
  }
}

Full config schema (all fields optional — shown with defaults):

{
  "gates": {
    "maxFiles": 25,
    "maxHits": 120,
    "maxTokens": 6000,
    "minRelevance": 0.35,
    "compoundMargin": 0.15
  },
  "limits": {
    "maxRefusals": 3,
    "absoluteMaxTokens": 24000
  },
  "estimation": {
    "charsPerToken": 3.5
  },
  "scoring": {
    "weights": {
      "kind": 0.30,
      "lexical": 0.25,
      "proximity": 0.20,
      "shape": 0.15,
      "recency": 0.10
    }
  },
  "backend": {
    "graph": "codebase-memory-mcp"
  },
  "scan": {
    "excludeDirs": ["node_modules", "dist", "build", "target", ".venv", "vendor", ".git"],
    "respectGitignore": true
  }
}
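The `estimation.charsPerToken` and `gates.maxTokens` settings suggest a character-based token estimate. A minimal sketch, assuming the estimate is character count divided by `charsPerToken` and rounded up, and assuming trimming keeps ranked items until the next one would exceed the budget; `estimateTokens`, `trimToBudget`, and `Snippet` are hypothetical names.

```typescript
// Hypothetical token estimation and budget trimming.
interface Snippet {
  id: string;
  text: string;
}

function estimateTokens(text: string, charsPerToken = 3.5): number {
  return Math.ceil(text.length / charsPerToken);
}

// Keep items in rank order until the next one would exceed the budget.
function trimToBudget(ranked: Snippet[], maxTokens = 6000, charsPerToken = 3.5): Snippet[] {
  const kept: Snippet[] = [];
  let used = 0;
  for (const s of ranked) {
    const cost = estimateTokens(s.text, charsPerToken);
    if (used + cost > maxTokens) break;
    kept.push(s);
    used += cost;
  }
  return kept;
}
```

Stopping at the first overflowing item preserves rank order; skipping it and trying smaller items further down would be an alternative design.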

Logging

All tool activity is written as newline-delimited JSON to code-context-gate.log in the project root. Pass --debug to also stream logs to stderr.

Log events

| Event | Source | Key fields | What it tells you |
|---|---|---|---|
| tool_call | server.ts | tool, args | Every inbound MCP tool call with its arguments |
| dispatch | dispatcher.ts | query, shape, graphAvailable | Query shape classification and whether graph is reachable |
| fileio_scan | fileio.ts | scope, patternType | When a file walk starts, what directory and pattern type (string or regex) |
| graph_search | graph.ts | backend, count | Graph search completed; how many results the backend returned |
| graph_search_error | graph.ts | backend, error | Graph search failed silently (previously swallowed) |
| graph_architecture | graph.ts | backend, hasResult | Architecture call completed and whether a non-empty summary was returned |
| graph_architecture_error | graph.ts | backend, error | Architecture call failed |
| dispatch_result | dispatcher.ts | strategy, count | Final strategy used (graph_symbol, fileio_only, fileio+graph_merged, etc.) and total hits |
| dispatch_graph_miss | dispatcher.ts | query, shape | Graph returned 0 results for a symbol query; falling back to FileIO |
| gate | server.ts | tool, pass, reason | Gate decision: passed or refused, and why |
| tool_result | server.ts | tool, returned, estTokens | How many results were returned and their estimated token cost |
| tool_error | server.ts | tool, error | Unhandled tool exception |

Reading the log

A typical symbol lookup with graph hit:

{"ts":"...","event":"tool_call","tool":"locate","args":{"query":"resolveSafe","kind":"definition"}}
{"ts":"...","event":"dispatch","query":"resolveSafe","shape":"symbol","graphAvailable":true}
{"ts":"...","event":"graph_search","backend":"codebase-memory-mcp","count":2}
{"ts":"...","event":"dispatch_result","query":"resolveSafe","strategy":"graph_symbol","count":2}
{"ts":"...","event":"gate","tool":"locate","pass":true,"reason":null}
{"ts":"...","event":"tool_result","tool":"locate","returned":2}

A broad search that triggers a refusal:

{"ts":"...","event":"tool_call","tool":"search","args":{"pattern":"export"}}
{"ts":"...","event":"dispatch","query":"export","shape":"symbol","graphAvailable":true}
{"ts":"...","event":"graph_search","backend":"codebase-memory-mcp","count":148}
{"ts":"...","event":"dispatch_result","query":"export","strategy":"graph_symbol","count":148}
{"ts":"...","event":"gate","tool":"search","pass":false,"reason":"breadth"}
{"ts":"...","event":"tool_result","tool":"search","returned":0}
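Because the log is newline-delimited JSON, it is straightforward to post-process. The sketch below parses log text and lists gate refusals; `parseLog` and `refusals` are hypothetical helpers written for this example, and only the `event`, `tool`, `pass`, and `reason` fields shown above are assumed.

```typescript
// Hypothetical helpers for inspecting code-context-gate.log contents.
interface LogEvent {
  ts: string;
  event: string;
  [key: string]: unknown;
}

// Parse newline-delimited JSON, ignoring blank lines.
function parseLog(ndjson: string): LogEvent[] {
  return ndjson
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as LogEvent);
}

// Collect every gate event that refused a call.
function refusals(events: LogEvent[]): { tool: string; reason: string }[] {
  return events
    .filter((e) => e.event === "gate" && e.pass === false)
    .map((e) => ({ tool: String(e.tool), reason: String(e.reason) }));
}
```

Feeding the refusal example above through `refusals(parseLog(text))` would yield a single entry for the `search` tool with reason `breadth`.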

Source structure (`src/`)

| File | Role |
|---|---|
| cli.ts | Entry point: parses CLI flags, loads config, starts the server |
| server.ts | MCP server: registers the four tools, wires dispatch → shape → gate → respond |
| dispatcher.ts | Routes queries by shape to graph and/or FileIO; merges and deduplicates results |
| shaper.ts | BM25 lexical scoring + kind weighting → ranked ResultItem[]; trim() for token budgeting |
| gatekeeper.ts | Enforces backpressure limits; returns RefineEnvelope with distribution and suggestions on failure |
| config.ts | Zod-validated config loader; resolves project root from flag → env var → CWD |
| logger.ts | Append-only JSON line logger to code-context-gate.log; never crashes the server on write failure |
| types.ts | Shared types and Zod input schemas for all four tools |
| adapters/fileio.ts | Filesystem adapter: scan(), readLines(), expandBlock(), resolveSafe() |
| adapters/graph.ts | Graph adapters: CodebaseMemoryAdapter, CodegraphAdapter, NullAdapter; createGraphAdapter() factory |

Result scoring

Each result gets a composite score before ranking:

score = kindScore × 0.30 + BM25lexical × 0.25 + graphScore × 0.45

Kind weights:

| Kind | Weight |
|---|---|
| definition | 1.0 |
| call, reference | 0.7 |
| unknown | 0.5 |
| test | 0.3 |
| comment, string | 0.2 |
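As a worked illustration of the formula and kind weights above, the composite score can be computed as follows. This sketch follows the three-term formula as written and assumes `lexical` and `graphScore` are already normalized to the range 0 to 1; how `graphScore` relates to the proximity, shape, and recency weights in the config schema is not stated here. `compositeScore` and `KIND_WEIGHT` are hypothetical names.

```typescript
// Hypothetical scoring helper following the three-term formula above.
type Kind = "definition" | "call" | "reference" | "unknown" | "test" | "comment" | "string";

const KIND_WEIGHT: Record<Kind, number> = {
  definition: 1.0,
  call: 0.7,
  reference: 0.7,
  unknown: 0.5,
  test: 0.3,
  comment: 0.2,
  string: 0.2,
};

// lexical and graphScore are assumed to be normalized to [0, 1].
function compositeScore(kind: Kind, lexical: number, graphScore: number): number {
  return KIND_WEIGHT[kind] * 0.30 + lexical * 0.25 + graphScore * 0.45;
}

// Rank candidates from highest to lowest composite score.
function rank<T extends { kind: Kind; lexical: number; graphScore: number }>(items: T[]): T[] {
  return items
    .slice()
    .sort((a, b) => compositeScore(b.kind, b.lexical, b.graphScore) - compositeScore(a.kind, a.lexical, a.graphScore));
}
```

For example, a test-file hit with a lexical score of 0.5 and no graph score comes to 0.3 × 0.30 + 0.5 × 0.25 = 0.215, which falls below the default `minRelevance` of 0.35.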

Graph backends

| Backend | Launch command | Adapter |
|---|---|---|
| codebase-memory-mcp | npx -y codebase-memory-mcp --project-root <root> | CodebaseMemoryAdapter: calls search_code and get_architecture |
| codegraph | npx -y codegraph <root> | CodegraphAdapter: calls search and architecture |
| none | (not launched) | NullAdapter: FileIO only, all tools still work |

Both graph backends are optional. Without one, all four tools work using FileIO only.

Agent Instructions (CLAUDE.md)

This repo ships a CLAUDE.md that Claude Code and compatible agents load automatically. It routes all code discovery through code-context-gate and prohibits direct calls to lower-level tools like codebase-memory-mcp.

When an agent session starts it first probes code-context-gate with a lightweight locate call:

- Server responds → the four tools are used for all code access
- Server unavailable → the agent stops and prompts the user to enable code-context-gate or re-enable the cbm-session-reminder fallback hook

Re-enabling the fallback hook

If you want agents to fall back to raw codebase-memory-mcp calls when code-context-gate is not running, add to ~/.claude/settings.json:

{
  "hooks": {
    "SessionStart": [
      { "matcher": "startup", "hooks": [{ "type": "command", "command": "~/.claude/hooks/cbm-session-reminder" }] },
      { "matcher": "resume",  "hooks": [{ "type": "command", "command": "~/.claude/hooks/cbm-session-reminder" }] },
      { "matcher": "clear",   "hooks": [{ "type": "command", "command": "~/.claude/hooks/cbm-session-reminder" }] },
      { "matcher": "compact", "hooks": [{ "type": "command", "command": "~/.claude/hooks/cbm-session-reminder" }] }
    ]
  }
}

Requirements

- Node.js 20+
- No native build tools required
