@lebatuananh/context-engine
v0.1.34
Semantic code search engine with MCP integration — vector search, call-graph expansion, built-in chat agent, and project memory for AI coding assistants
A semantic code search engine that indexes your repositories and exposes results via MCP (Model Context Protocol). Designed for AI coding assistants — add one MCP endpoint and your assistant gains deep codebase understanding with call-graph-aware retrieval.
Install
npx @lebatuananh/context-engine@latest
Or install globally:
npm install -g @lebatuananh/context-engine@latest
context-engine
This boots the HTTP server on port 6699:
- Web UI: http://127.0.0.1:6699
- MCP endpoint: http://127.0.0.1:6699/mcp
Supported Platforms
| Platform | Package |
|----------|---------|
| Windows x64 | @lebatuananh/context-engine-win32-x64 |
The correct binary is installed automatically via optionalDependencies.
CLI Options
context-engine --port 8080 --bind 0.0.0.0 --data-dir /path/to/data
| Flag | Env Variable | Default | Description |
|------|-------------|---------|-------------|
| --port | CONTEXT_ENGINE_PORT | 6699 | Port to listen on |
| --bind | CONTEXT_ENGINE_BIND | 127.0.0.1 | Bind address |
| --data-dir | CONTEXT_ENGINE_DATA_DIR | ~/.my-context-engine | Data directory |
| --embeddings-dir | CONTEXT_ENGINE_EMBEDDINGS_DIR | ~/.my-context-engine/embeddings | Embedding cache |
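As a rough illustration of how the flags, environment variables, and defaults in the table might combine, here is a minimal sketch. The names and defaults are taken from the table; the precedence order (flag, then environment variable, then default) and the resolve_settings helper are assumptions, not documented behaviour.

```python
import os

# (environment variable, default) pairs copied from the table above.
SETTINGS = {
    "port": ("CONTEXT_ENGINE_PORT", "6699"),
    "bind": ("CONTEXT_ENGINE_BIND", "127.0.0.1"),
    "data_dir": ("CONTEXT_ENGINE_DATA_DIR", "~/.my-context-engine"),
    "embeddings_dir": ("CONTEXT_ENGINE_EMBEDDINGS_DIR", "~/.my-context-engine/embeddings"),
}


def resolve_settings(flags, env=None):
    """Resolve each setting, assuming flag > environment variable > default."""
    env = os.environ if env is None else env
    resolved = {}
    for name, (env_var, default) in SETTINGS.items():
        resolved[name] = flags.get(name) or env.get(env_var) or default
    resolved["port"] = int(resolved["port"])  # ports are numeric
    return resolved


settings = resolve_settings({"port": "8080"}, env={"CONTEXT_ENGINE_BIND": "0.0.0.0"})
```

Here the port comes from the flag, the bind address from the environment, and both directories fall back to their defaults.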
Features
- Semantic code search — embedding-based vector search with cosine similarity
- Call-graph expansion — BFS-expands matched symbols through caller/callee edges (2 levels deep)
- Multi-language parsing: Tree-sitter extraction for 22 languages, including Python, JS, TS, Rust, Go, Java, C, C++, C#, PHP, Ruby, Objective-C, Swift, Kotlin, Dart, Lua, Luau, Svelte, Pascal, and Liquid
- Agentic RAG — optional tool-calling agent loop that iteratively searches and accumulates context
- LLM reranking — reorders and line-prunes candidate chunks via Google or OpenAI-compatible LLMs
- Dual embedding providers — Voyage AI or any OpenAI-compatible endpoint
- Content-addressed embedding cache — on-disk, shareable across instances
- Real-time file watching — incremental re-index on changes
- Knowledge Bases — global reference folders always included in every search query (0.8 score multiplier)
- Linked Workspaces — per-repo related repositories queried in parallel during search (0.9 score multiplier)
- Project Memory — per-workspace observations (decisions, bugfixes, patterns, learnings) with semantic recall
- Import resolution — resolves real import targets for candidate ranking (barrel chasing, tsconfig aliases, depth cap 8)
- Framework detection — React/Express/Django/Spring/Gin route and DI edges extracted at index time
- Generated-file downranking — auto-detected generated/minified files deprioritised in results
- Field-qualified search: filter_lang, filter_path, filter_name MCP params narrow results post-retrieval
- Built-in Chat Agent: per-repo AI chat with tool-calling (grep/read), per-turn model picker, streaming markdown
- Memory-mapped vector shards — persisted shard files with mmap loading for faster cold starts
- Windows Auto-start — one-click Task Scheduler registration via Web UI
- MCP session persistence — BoundedSessionStore (LRU 8192) restores sessions after idle timeout
- Web UI — multi-language (EN/VI/ZH), dark mode, index explorer, call-graph visualization, built-in chat agent, auto-start toggle
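The call-graph expansion listed above can be pictured as a depth-limited breadth-first search. The sketch below is an illustration only, not the engine's implementation: the function name, the adjacency-map shape, and the sample graph are hypothetical, and only the 2-level depth limit comes from the feature description.

```python
from collections import deque


def expand_call_graph(seeds, edges, max_depth=2):
    """Return {symbol: depth} for every symbol within max_depth hops of a seed.

    `edges` maps a symbol to its neighbours (callers and callees combined).
    """
    depth = {symbol: 0 for symbol in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not expand past the depth limit
        for neighbour in edges.get(current, ()):
            if neighbour not in depth:
                depth[neighbour] = depth[current] + 1
                queue.append(neighbour)
    return depth


# Hypothetical call chain: handler <-> service <-> repo <-> driver
edges = {
    "handler": ["service"],
    "service": ["handler", "repo"],
    "repo": ["service", "driver"],
    "driver": ["repo"],
}
expanded = expand_call_graph(["handler"], edges)
```

Starting from handler, the search reaches service (depth 1) and repo (depth 2) but stops before driver, which would be a third hop.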
MCP Tools
codebase-retrieval
Semantic search across indexed codebases. Takes a natural-language query and returns relevant code snippets with file paths and line numbers. Automatically merges results from Knowledge Bases, Linked Workspaces, and Project Memory.
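A minimal sketch of the JSON-RPC message an MCP client might send to call this tool. The tools/call method and the name/arguments envelope follow the MCP specification; the query argument name is an assumption, and passing filter_lang, filter_path, or filter_name to this tool is inferred from the Features list rather than confirmed. A real client would also complete the MCP initialize handshake first.

```python
import json

ALLOWED_FILTERS = {"filter_lang", "filter_path", "filter_name"}


def build_retrieval_request(query, request_id=1, **filters):
    """Build a JSON-RPC 2.0 tools/call request for codebase-retrieval."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    arguments = {"query": query, **filters}  # "query" is an assumed argument name
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "codebase-retrieval", "arguments": arguments},
    }


request = build_retrieval_request("where are JWT tokens validated?", filter_lang="rust")
print(json.dumps(request, indent=2))
```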
file-retrieval
Targeted search within a specific file. Takes a file path and a description of what you're looking for.
save-observation
Persist a project observation (decision, bugfix, pattern, or learning) to the workspace's memory store. Observations are embedded and automatically included in future codebase-retrieval queries with a 0.7 score multiplier.
recall-observations
Semantically search past observations for a workspace. Returns the most relevant observations matching a natural-language query.
Knowledge Bases
Global reference directories (coding rules, templates, standards) automatically included in every codebase-retrieval query. Results receive a 0.8 score multiplier so project code always ranks higher.
Configure via Web UI (Repo tab → Knowledge Bases) or REST API:
curl -X PUT http://127.0.0.1:6699/api/config \
-H "Content-Type: application/json" \
-d '{"knowledge_bases": ["D:\\Rules\\coding-standards"]}'
- Max 10 entries, absolute paths only
- Indexed using the same pipeline as repos
- Searched in parallel with the primary repo on every query
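The constraints above (at most 10 entries, absolute paths only) can be checked client-side before sending the request. This is a sketch under assumptions: the helper names are hypothetical, and the server's own validation may differ, for example in how it treats POSIX paths.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath

MAX_KNOWLEDGE_BASES = 10  # limit stated in the list above


def is_absolute(path):
    # Accept Windows-style (D:\Rules) and POSIX-style (/srv/rules) absolute paths.
    return PureWindowsPath(path).is_absolute() or PurePosixPath(path).is_absolute()


def knowledge_base_payload(paths):
    """Return the JSON body for PUT /api/config, or raise ValueError."""
    if len(paths) > MAX_KNOWLEDGE_BASES:
        raise ValueError(f"at most {MAX_KNOWLEDGE_BASES} knowledge bases allowed")
    relative = [p for p in paths if not is_absolute(p)]
    if relative:
        raise ValueError(f"paths must be absolute: {relative}")
    return json.dumps({"knowledge_bases": list(paths)})


body = knowledge_base_payload([r"D:\Rules\coding-standards"])
```

json.dumps escapes the backslashes, so body matches the string passed to curl -d in the example above.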
Linked Workspaces
Per-repo related repositories automatically queried alongside the primary workspace during codebase-retrieval. Unlike Knowledge Bases (global), linked workspaces are configured per-repo — useful for frontend/backend pairs, monorepo splits, or shared library relationships.
Configure via Web UI (open repo → MCP tab → Linked Workspaces) or REST API:
curl -X PUT http://127.0.0.1:6699/api/config \
-H "Content-Type: application/json" \
-d '{"linked_workspaces": {"D:\\Projects\\frontend": ["D:\\Projects\\backend-api"]}}'
- Max 5 linked workspaces per repo
- Results receive a 0.9 score multiplier (primary: 1.0 > linked: 0.9 > KB: 0.8 > observations: 0.7)
- Linking is one-directional: A→B does not automatically create B→A
- Unindexed linked workspaces are skipped silently
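To make the multiplier ordering concrete, here is a small sketch of how hits from the four sources could be weighted and merged. The multipliers are the ones listed above; the Hit type, the merge_hits function, and the assumption that the multiplier is applied to a raw similarity score are illustrative, not the engine's actual ranking code.

```python
from dataclasses import dataclass

# Multipliers as listed above.
SOURCE_MULTIPLIER = {
    "primary": 1.0,
    "linked": 0.9,
    "knowledge_base": 0.8,
    "observation": 0.7,
}


@dataclass
class Hit:
    source: str        # one of the SOURCE_MULTIPLIER keys
    path: str
    similarity: float  # raw similarity before weighting


def merge_hits(hits, limit=10):
    """Weight each hit by its source multiplier and return the top `limit`."""
    weighted = [(h.similarity * SOURCE_MULTIPLIER[h.source], h) for h in hits]
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return weighted[:limit]


hits = [
    Hit("knowledge_base", "rules/naming.md", 0.95),
    Hit("observation", "decision: use JWT", 0.90),
    Hit("linked", "backend-api/auth.rs", 0.85),
    Hit("primary", "frontend/login.ts", 0.80),
]
ranked = [hit.source for _, hit in merge_hits(hits)]
```

Even though the knowledge-base hit has the highest raw similarity, the weighting places the primary-repo hit first, which is the effect the multipliers are meant to have.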
Project Memory
Persistent per-workspace memory for AI assistants. Observations (decisions, bugfixes, patterns, learnings) are stored with vector embeddings and automatically surfaced during codebase-retrieval queries.
- Stored in a dedicated SurrealDB database (separate from code indexes)
- Max 200 observations per workspace, auto-prunes oldest
- Results receive a 0.7 score multiplier
- Types: decision, bugfix, pattern, learning
- AI assistants call save-observation to persist and recall-observations to search
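The 200-observation cap with oldest-first pruning can be sketched with a simple queue. This is a hypothetical in-memory stand-in (the real store is SurrealDB with embeddings); the class and method names are made up for illustration, and only the cap and the four types come from the list above.

```python
from collections import deque

MAX_OBSERVATIONS = 200
OBSERVATION_TYPES = {"decision", "bugfix", "pattern", "learning"}


class ObservationLog:
    """Hypothetical in-memory stand-in for one workspace's observations."""

    def __init__(self, capacity=MAX_OBSERVATIONS):
        self.capacity = capacity
        self.items = deque()  # oldest observation sits at the left end

    def save(self, kind, text):
        if kind not in OBSERVATION_TYPES:
            raise ValueError(f"unknown observation type: {kind!r}")
        self.items.append((kind, text))
        while len(self.items) > self.capacity:
            self.items.popleft()  # prune the oldest entry


log = ObservationLog()
for i in range(205):
    log.save("learning", f"note {i}")
```

After 205 saves the log holds 200 entries, and the five oldest notes (0 through 4) have been dropped.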
Chat Agent
Built-in AI chat accessible per-repo from the Web UI. Uses the same LLM provider configured in Settings, or select a custom endpoint per-turn via the model picker.
- Tool-calling agent — grep files and read source code to answer codebase questions
- Streaming responses — SSE-based with full markdown rendering (tables, code blocks)
- Multi-conversation — separate threads per repo, create and delete freely
- Custom model picker — switch model/endpoint per turn without changing global settings
Connect to AI Assistants
Claude Code
claude mcp add context-engine --transport http http://127.0.0.1:6699/mcp
Per-repo endpoint
claude mcp add my-project --transport http http://127.0.0.1:6699/mcp-repo/BASE64URL_ENCODED_PATH
Any MCP client
Point any Streamable-HTTP MCP client to http://127.0.0.1:6699/mcp.
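The per-repo endpoint shown earlier takes a base64url-encoded repository path. A minimal sketch of producing that URL follows; it assumes the path is encoded as UTF-8 and that trailing = padding is stripped, neither of which is confirmed here, and per_repo_endpoint is a hypothetical helper.

```python
import base64

BASE_URL = "http://127.0.0.1:6699"


def per_repo_endpoint(repo_path, base_url=BASE_URL):
    """Return the /mcp-repo/ URL for an absolute repository path."""
    raw = repo_path.encode("utf-8")  # assumed encoding
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/'.
    token = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")  # padding stripped: assumption
    return f"{base_url}/mcp-repo/{token}"


url = per_repo_endpoint(r"D:\Projects\frontend")
print(url)
```

The printed URL can be passed to claude mcp add in place of the BASE64URL_ENCODED_PATH placeholder. If the server rejects it, retrying with the padding kept is the first thing to check.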
Alternative Install Methods
Docker
docker compose up -d
Documentation
Full documentation, a configuration guide, and architecture details are available in the project's GitHub repository.
License
MIT
