tea-rags
v1.31.1
MCP server for semantic search using local Qdrant and Ollama (default) with support for OpenAI, Cohere, and Voyage AI
Your coding agent copies the first code it finds — not the right one.
TeaRAGs is an MCP server for code search that enriches every retrieved chunk with git history: authorship, churn, bug-fix rate, ownership. Your agent stops learning from hotspots and starts learning from stable, owned, battle-tested code.
📖 Full documentation · 🏁 15-minute quickstart · 🧠 Core concepts
The Problem
1. Understanding a monorepo is expensive — for humans AND agents
Every new developer pays in hours. Every fresh agent session pays in tokens. Naming conventions, domain logic, local idioms — all of it has to be rebuilt from scratch, every time.
2. Bad code hygiene is a tax on your agent
Confusing names mean the agent reads more files. More files mean more tokens, slower responses, and a higher chance of picking the wrong example. Your codebase's technical debt is now your AI bill.
3. Agents can't tell stable code from a hotspot
Standard code search ranks by embedding similarity alone. It doesn't know which function gets bug-fixed every sprint, which module hasn't been touched in two years, or whose name is on the commits. So the agent copies whatever looks similar — including the broken examples.
The Solution
TeaRAGs gives your agent two things it can't get from vanilla code search.
1. Every chunk carries its own history
Retrieved code comes with signals about who wrote it, how stable it is, how often it gets bug-fixed, and how impactful a change would be. Semantic similarity stops being the whole answer — it becomes the floor.
2. Pre-built skills, not just raw tools
TeaRAGs ships agent skills — ready-made playbooks that tell your agent when and how to use the signals. No prompt engineering required:
- explore: orient in an unfamiliar codebase
- data-driven-generation: write code backed by stable, owned templates
- risk-assessment: know what you'd break before you break it
- refactoring-scan, bug-hunt, pattern-search, and more
Install the plugin and your agent learns the workflow. See all skills →
Bonus: dinopowers — a companion plugin with 10 wrappers over
superpowers:* skills (Jesse Vincent's
skills library for Claude Code) that inject tea-rags signals into brainstorming,
planning, debugging, TDD, review, and completion flows. Mean eval delta +71pp
across 136 cases.
Learn more →
Use Cases
🛡️ Safe code generation
Your agent writes new code backed by stable, canonical templates — modules
with a low bug-fix rate, long stability, and a clear owner. No more copying from
last sprint's hotspot. Skill: data-driven-generation ·
Why stable code is safer →
🔧 Refactoring planning & problem-pattern discovery
Find the 5% of code responsible for 80% of incidents. High churn + high
bug-fix rate + concentrated ownership = your next production issue — and your
next refactoring candidate. Skills: refactoring-scan, bug-hunt
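One way to read the heuristic above as a score, purely as a sketch: the field names, the normalization against the busiest file, and the equal weighting are assumptions for illustration, not TeaRAGs' actual scoring.

```typescript
// Hypothetical per-file signals; field names are illustrative, not the TeaRAGs schema.
interface FileSignals {
  path: string;
  commitCount: number;    // total commits touching the file (churn)
  bugFixCommits: number;  // commits classified as bug fixes
  topAuthorShare: number; // fraction of commits by the dominant author, 0..1
}

// Combine churn, bug-fix rate, and ownership concentration into a 0..1 score.
// Churn is normalized against the busiest file in the set.
function hotspotScore(file: FileSignals, maxCommits: number): number {
  const churn = maxCommits > 0 ? file.commitCount / maxCommits : 0;
  const bugFixRate = file.commitCount > 0 ? file.bugFixCommits / file.commitCount : 0;
  // Equal weights are an assumption; a real ranker would calibrate them.
  return (churn + bugFixRate + file.topAuthorShare) / 3;
}

// Return a new array sorted from most to least hotspot-like.
function rankHotspots(files: FileSignals[]): FileSignals[] {
  const maxCommits = Math.max(0, ...files.map((f) => f.commitCount));
  return [...files].sort(
    (a, b) => hotspotScore(b, maxCommits) - hotspotScore(a, maxCommits),
  );
}

const sample: FileSignals[] = [
  { path: "src/utils/format.ts", commitCount: 4, bugFixCommits: 0, topAuthorShare: 0.5 },
  { path: "src/billing/invoice.ts", commitCount: 40, bugFixCommits: 20, topAuthorShare: 0.9 },
];
const hotspots = rankHotspots(sample);
```

With this sample data, the billing file ranks first: it has the most commits, half of them are bug fixes, and one author dominates it.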
🎯 Risk assessment before changes
Before modifying a function, the agent checks who depends on it, how often it
breaks, and what its ticket history says. Know the blast radius before you
blast. Skill: risk-assessment ·
Coupling & blast radius theory →
🗺️ Learning an unfamiliar codebase
Ask questions instead of reading directory trees. "How does auth work?"
returns the stable, canonical implementation with its history attached — not
a random similar-looking snippet. Skill: explore
How It Works
flowchart LR
User([👤 You])
subgraph mcp["TeaRAGs MCP Server"]
Agent[🤖 Agent<br/>runs skills]
TeaRAGs[🍵 TeaRAGs<br/>search · enrich · rerank]
Agent <--> TeaRAGs
end
Qdrant[(🗄️ Qdrant<br/>vector DB)]
Embeddings[✨ Embeddings<br/>Ollama/OpenAI]
Codebase[📁 Your Codebase<br/>+ Git History]
User <--> Agent
TeaRAGs <--> Qdrant
TeaRAGs <--> Embeddings
TeaRAGs <--> Codebase
You talk to your agent. The agent runs a TeaRAGs skill. TeaRAGs searches your code, enriches each result with git history, and ranks by what the skill needs: stability, ownership, risk, or pure relevance.
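The last step, ranking by what the skill needs, can be pictured as a weighted blend of similarity and history signals applied above a similarity floor. A minimal sketch, assuming hypothetical preset names, weights, and chunk fields; none of these are taken from the TeaRAGs source.

```typescript
// Hypothetical shape of an enriched search hit; not the actual TeaRAGs payload.
interface EnrichedChunk {
  id: string;
  similarity: number; // embedding similarity, 0..1
  stability: number;  // 0..1, higher means fewer recent changes
  bugFixRate: number; // 0..1, share of commits that were bug fixes
}

type Preset = "relevance" | "stable" | "risk";

// Illustrative weights per preset: [similarity, stability, bugFixRate].
const WEIGHTS: Record<Preset, [number, number, number]> = {
  relevance: [1.0, 0.0, 0.0],
  stable: [0.5, 0.4, -0.1], // reward stability, penalize bug-prone code
  risk: [0.3, -0.2, 0.5],   // surface unstable, bug-prone code first
};

function rerank(chunks: EnrichedChunk[], preset: Preset, floor = 0.5): EnrichedChunk[] {
  const [wSim, wStab, wBug] = WEIGHTS[preset];
  const score = (c: EnrichedChunk): number =>
    wSim * c.similarity + wStab * c.stability + wBug * c.bugFixRate;
  return chunks
    .filter((c) => c.similarity >= floor) // similarity acts as a floor
    .sort((a, b) => score(b) - score(a)); // history signals decide the order
}

const hits: EnrichedChunk[] = [
  { id: "retry-v2 (hotspot)", similarity: 0.92, stability: 0.1, bugFixRate: 0.6 },
  { id: "retry-core (stable)", similarity: 0.85, stability: 0.9, bugFixRate: 0.05 },
  { id: "unrelated", similarity: 0.3, stability: 1.0, bugFixRate: 0.0 },
];

const byRelevance = rerank(hits, "relevance").map((c) => c.id);
const byStability = rerank(hits, "stable").map((c) => c.id);
```

Under pure relevance the hotspot wins on similarity alone; under the hypothetical `stable` preset the stable chunk moves to the top, and the unrelated chunk is dropped by the floor in both cases.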
What You Get
- 🧬 Trajectory-aware retrieval — the only open-source code RAG that scores results by git history, not just embedding similarity
- 📚 Ships with agent skills — 6 ready-made playbooks for exploration, generation, risk assessment, and index management (plus 2 internal strategies)
- 🔒 Local-first, privacy-first — works fully offline with Ollama; your code never leaves your machine (cloud providers optional)
- 🚀 Built for monorepos — AST-aware chunking across 10+ languages, incremental reindexing, parallel pipelines, millions of LOC tested
🕸️ Codegraph Enrichments (beta)
⚠️ Beta. Structural graph signals are still being calibrated across languages and may change between releases.
Beyond git history, tea-rags can enrich chunks with structural graph signals
— call graph and import graph (fan-in, fan-out, instability, PageRank,
transitive impact) — and expose graph-query MCP tools (get_callers,
get_callees, find_cycles, trace_path). This powers blast-radius and
architectural-hub ranking.
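To make the simpler graph signals concrete: fan-in and fan-out are edge counts per node, and instability is commonly defined (following Robert C. Martin) as fan-out / (fan-in + fan-out). The sketch below computes them from a hypothetical edge list. Whether tea-rags uses exactly this formula is an assumption, and the module names are made up.

```typescript
// A directed edge: "from" imports or calls "to". Names are illustrative.
type Edge = [from: string, to: string];

interface GraphSignals {
  fanIn: number;       // how many edges point at this node
  fanOut: number;      // how many edges leave this node
  instability: number; // fanOut / (fanIn + fanOut); 0 = stable, 1 = unstable
}

function graphSignals(edges: Edge[]): Map<string, GraphSignals> {
  const signals = new Map<string, GraphSignals>();
  // Fetch the record for a node, creating it on first sight.
  const get = (node: string): GraphSignals => {
    let s = signals.get(node);
    if (!s) {
      s = { fanIn: 0, fanOut: 0, instability: 0 };
      signals.set(node, s);
    }
    return s;
  };
  for (const [from, to] of edges) {
    get(from).fanOut += 1;
    get(to).fanIn += 1;
  }
  signals.forEach((s) => {
    const total = s.fanIn + s.fanOut;
    s.instability = total === 0 ? 0 : s.fanOut / total;
  });
  return signals;
}

const edges: Edge[] = [
  ["api/orders", "core/db"],
  ["api/users", "core/db"],
  ["api/orders", "core/auth"],
  ["core/auth", "core/db"],
];
const signals = graphSignals(edges);
```

In this toy graph `core/db` has fan-in 3 and instability 0 (many dependents, no dependencies), which is the profile of an architectural hub where a change has a wide blast radius.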
Codegraph is disabled by default (beta). Opt in with the CODEGRAPH_ENABLED
environment variable, then re-index:
CODEGRAPH_ENABLED=true # enable graph signals + tools
CODEGRAPH_ENABLED=false # default: graph extraction off
See the Codegraph Enrichments docs for signals, presets, supported languages, and configuration.
Who It's For
- Developers in large monorepos — where "find similar code" returns a dozen near-duplicates and you need the canonical one
- Solo devs doing agentic development: agent-driven workflows produce bursts of micro-commits that wreck churn metrics. TeaRAGs ships a GIT SESSIONS mode (TRAJECTORY_GIT_SQUASH_AWARE_SESSIONS=true) that groups commits by (author, time gap) so a 20-commit refactor session counts as one. Churn, bug-fix rate, and ownership stay meaningful even with a single human + an agent as the only contributors.
- Tech leads worried about AI code quality, who want their team's agents to learn from stable modules, not from last sprint's hotspot
- Privacy-sensitive teams — finance, healthcare, defense, or anyone who can't send source code to a cloud API
Not for: repos without git history (no signal to enrich) or teams that only need autocomplete (use Copilot).
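The session grouping mentioned for solo devs can be sketched as follows: sort commits by time, then start a new session whenever the gap since the same author's previous commit exceeds a threshold. The 30-minute gap, the `Commit` shape, and the function name are assumptions for illustration; the actual behaviour behind TRAJECTORY_GIT_SQUASH_AWARE_SESSIONS may differ.

```typescript
interface Commit {
  sha: string;
  author: string;
  timestamp: number; // Unix seconds
}

// Group commits into sessions keyed by (author, time gap): for each author,
// consecutive commits no further apart than maxGapSeconds share a session.
function groupIntoSessions(commits: Commit[], maxGapSeconds = 30 * 60): Commit[][] {
  const sorted = [...commits].sort((a, b) => a.timestamp - b.timestamp);
  const sessions: Commit[][] = [];
  const openSession = new Map<string, Commit[]>(); // author -> latest session

  for (const commit of sorted) {
    const current = openSession.get(commit.author);
    const last = current ? current[current.length - 1] : undefined;
    if (current && last && commit.timestamp - last.timestamp <= maxGapSeconds) {
      current.push(commit);
    } else {
      const fresh = [commit];
      sessions.push(fresh);
      openSession.set(commit.author, fresh);
    }
  }
  return sessions;
}

const commits: Commit[] = [
  { sha: "a1", author: "dev", timestamp: 0 },
  { sha: "a2", author: "dev", timestamp: 300 },
  { sha: "a3", author: "dev", timestamp: 900 },
  { sha: "b1", author: "dev", timestamp: 20000 }, // hours later: new session
];
const sessions = groupIntoSessions(commits);
const churn = sessions.length; // churn counted per session, not per commit
```

Here four raw commits collapse into two sessions, so a burst of micro-commits no longer inflates churn for the files it touched.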
🚀 Quick Start
Inside Claude Code, install the TeaRAGs plugins and run the setup wizard:
/plugin marketplace add artk0de/TeaRAGs-MCP
/plugin install tea-rags-setup@tea-rags
/tea-rags-setup:install
Then install the skills plugin (Claude-only, final step):
/plugin install tea-rags@tea-rags
Optionally install dinopowers for wrappers over superpowers:* skills:
/plugin install dinopowers@tea-rags
Index your codebase:
/tea-rags:index
Ask your agent anything: "How does auth work in this project?", "Find stable examples of retry logic", "What should I know before touching the payment module?".
For other MCP clients, CI, or air-gapped setups, see the
manual install
(Node + npm install -g tea-rags + Ollama/ONNX/OpenAI/Cohere/Voyage).
🗂️ Project Registry
TeaRAGs maintains a per-machine registry at ~/.tea-rags/registry.json (or
$TEA_RAGS_DATA_DIR/registry.json) that records collection metadata and lets
you address indexed projects by a short name instead of an absolute path or
opaque collection id.
CLI:
tea-rags register-project --path ./my-repo --name myrepo
tea-rags list-projects
tea-rags list-projects --json
tea-rags tune --project myrepo
tea-rags unregister-project --name myrepo
MCP tools: register_project, list_projects, unregister_project.
Every project-aware tool and command also accepts an optional project
parameter alongside path and collection. Resolution priority is
collection > project > path. The registry is auto-populated at the end of each
indexing / reindexing run with the embedding model, embedding dimensions, Qdrant
URL, indexedAt timestamp, tea-rags version, and chunk count — no manual
register-project call is required for collections you index through tea-rags.
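The resolution order can be read as a simple fallthrough. A sketch under stated assumptions: the `TargetArgs` and `RegistryEntry` shapes, the `resolveTarget` name, and the sample collection ids are hypothetical, and they do not reflect the real registry.json format or the tea-rags implementation.

```typescript
interface TargetArgs {
  collection?: string;
  project?: string;
  path?: string;
}

// Hypothetical registry shape: short project name -> collection metadata.
interface RegistryEntry {
  collection: string;
  path: string;
}

type Resolved =
  | { via: "collection"; collection: string }
  | { via: "project"; collection: string }
  | { via: "path"; path: string };

// Apply the documented priority: collection > project > path.
function resolveTarget(
  args: TargetArgs,
  registry: Record<string, RegistryEntry>,
): Resolved {
  if (args.collection) {
    return { via: "collection", collection: args.collection };
  }
  if (args.project) {
    const entry = registry[args.project];
    if (!entry) throw new Error(`Unknown project: ${args.project}`);
    return { via: "project", collection: entry.collection };
  }
  if (args.path) {
    return { via: "path", path: args.path };
  }
  throw new Error("Provide one of: collection, project, path");
}

const registry: Record<string, RegistryEntry> = {
  myrepo: { collection: "code_abc123", path: "/home/me/my-repo" },
};

// project beats path; collection beats project.
const fromProject = resolveTarget({ project: "myrepo", path: "./elsewhere" }, registry);
const fromCollection = resolveTarget({ collection: "code_xyz", project: "myrepo" }, registry);
```

Failing loudly on an unknown project name, rather than silently falling back to `path`, is a design choice of this sketch; it keeps a typo from indexing or querying the wrong target.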
📚 Documentation
| I want to… | Start here |
| --- | --- |
| Get it running | Quickstart (15 min): install, index, first query |
| Understand the concept | Core Concepts: vectorization, trajectory enrichment, reranking |
| See what my agent can do | Skills: 6 ready-made agent playbooks for exploration, generation, risk |
| Look under the hood | Architecture: pipelines, data model, reranker internals |
| Learn the theory | Knowledge Base: RAG, code search, software evolution |
🤝 Contributing
See CONTRIBUTING.md for workflow and conventions.
🙏 Acknowledgments
Built on a fork of mhalder/qdrant-mcp-server — clean architecture, solid tests, open-source spirit. And its ancestor qdrant/mcp-server-qdrant. Code vectorization inspired by claude-context (Zilliz).
Feel free to fork this fork. It's forks all the way down. 🐢
