smartgrep
v1.4.1
Local-first semantic code search + Q&A for humans and AI agents. CLI + MCP server. No API keys, nothing leaves your machine.
If smartgrep saves you time, give it a ⭐ — it's the cheapest way to help.
What's new in v1.4.1
- Clearer first-run guidance: `smartgrep doctor` and the error messages now distinguish "Ollama isn't installed" (points you to ollama.com/download) from "Ollama is installed but not running" (`ollama serve`), instead of always suggesting `serve`.
- Quickstart now starts at step 0: an explicit "install Ollama" step so brand-new users aren't dropped straight into `ollama pull`.
What's new in v1.4.0
- Tree-sitter for Python, Go, Rust, and Java: these now chunk at function/class boundaries (previously sliding-window), so semantic hits land on whole functions instead of arbitrary line ranges.
- `smartgrep doctor`: one command checks Ollama, both models, Node, native modules, and your index, then offers to fix what it can (`ollama pull …`, `smartgrep index`). Run it first if anything seems off.
- Friendlier errors everywhere: a missing model, Ollama not running, or a missing/locked/unreadable index now produces a clear message plus the exact fix instead of a stack trace.
Already installed? Run `npm install -g smartgrep@latest` to update.
30-second quickstart
# 0. Install Ollama (one time) — powers the local AI, ~2 min
# macOS/Windows: download from https://ollama.com/download
# Linux: curl -fsSL https://ollama.com/install.sh | sh
# 1. Install smartgrep + the embedding model (one time)
npm install -g smartgrep
ollama pull nomic-embed-text
# 2. Index your project (once)
cd your-project
smartgrep doctor # (optional) verify everything is set up
smartgrep index
# 3. Search
smartgrep # interactive REPL
smartgrep ask "where do we send emails" # one-shot
# 4. (optional) Ask questions with citations
ollama pull qwen2.5-coder:7b # one-time, ~4.7 GB
smartgrep explain "how does authentication work"
You're searching your codebase with natural language in under a minute. No accounts, no internet, no API keys.
Ask anything about your code
smartgrep explain "how does watch mode debounce file changes"
Returns a paragraph with [N] file:line citations, generated 100% on-device via Ollama. No API keys, nothing leaves your machine. For follow-ups, run `smartgrep chat` for multi-turn Q&A that re-searches the index each turn.
For tooling and CI, `smartgrep explain "..." --json` returns the same answer plus citations as a single machine-readable object, the same shape exposed via the MCP `explain` tool.
One-time setup
ollama pull qwen2.5-coder:7b # ~4.7 GB chat model
Why vibe coders need smartgrep
You're vibe coding — you tell Claude Code or Cursor what to build and it does it. That flow breaks the moment your AI agent doesn't know your codebase. It starts hallucinating file paths, reading the wrong functions, duplicating logic that already exists somewhere.
The problem isn't the AI. It's that the AI is flying blind.
Without smartgrep With smartgrep
───────────────────────────────── ─────────────────────────────────
Agent: "I'll look for auth logic" Agent calls → semantic_search
Agent reads: index.ts ❌ Returns: src/auth/middleware.ts:34
Agent reads: app.ts ❌ src/auth/jwt.ts:12
Agent reads: server.ts ❌ src/routes/login.ts:67
Agent reads: routes.ts... ❌
Agent: "I'll guess the structure" Agent reads exactly the right files
First try. Every time.
smartgrep gives your AI agent semantic eyes: it finds the right code by understanding intent, not just matching words.
Supercharge your AI agent (Claude Code / Cursor / Windsurf)
This is the headline feature. smartgrep ships an MCP (Model Context Protocol) server so any AI coding agent searches your codebase as a tool — semantically, instantly, locally.
What your agent gains
┌─────────────────────────────────────────────────────────────┐
│ Your AI Agent │
│ │
│ "Add rate limiting to the auth routes" │
│ │ │
│ ▼ │
│ semantic_search("auth route middleware") │
│ │ │
│ ▼ │
│ ✓ src/auth/middleware.ts:34 ← exact file, exact line │
│ ✓ src/routes/auth.ts:12 ← no hallucination │
│ ✓ src/config/rateLimiter.ts:1 ← finds existing logic │
│ │ │
│ ▼ │
│ Agent reads the right code → writes the right fix │
└─────────────────────────────────────────────────────────────┘
The 3 tools your agent gets
| Tool | What it does | When the agent uses it |
|------|-------------|----------------------|
| semantic_search(query) | Finds code by intent, returns ranked file:line hits | Any time it needs to locate logic |
| get_chunk(file, line_start, line_end) | Reads a specific line range from any file | Expanding context around a search hit |
| explain(question) | Returns a cited paragraph answer about your code | "How does X work?", "Why is Y structured this way?" |
What your agent actually sees
Without smartgrep: the agent guesses, reads 10 wrong files, may invent a solution that conflicts with existing code.
With smartgrep: the agent gets exact file:line results with real code snippets — first call, every time.
Set up the MCP integration (5 minutes)
Step 1 — Index your repo:
cd your-project
smartgrep index
Step 2 — Add smartgrep to your agent's MCP config:
Claude Code: add to your project's .mcp.json (or run `claude mcp add`):
{
"mcpServers": {
"smartgrep": {
"command": "smartgrep",
"args": ["mcp", "--dir", "/absolute/path/to/your-project"]
}
}
}
Cursor: add to ~/.cursor/mcp.json:
{
"mcpServers": {
"smartgrep": {
"command": "smartgrep",
"args": ["mcp", "--dir", "/absolute/path/to/your-project"]
}
}
}
Windsurf: add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"smartgrep": {
"command": "smartgrep",
"args": ["mcp", "--dir", "/absolute/path/to/your-project"]
}
}
}
Windows users: use forward slashes or double backslashes in the path, e.g. `C:/Users/you/my-project` or `C:\\Users\\you\\my-project`.
Step 3 — Restart your agent. It now has three new tools:
| Tool | What the agent can do |
|------|--------------|
| semantic_search(query, top_k?) | Find code by intent — returns ranked file/line results with up to 80 lines of context per hit |
| get_chunk(file, line_start, line_end) | Pull a specific line range to expand context around any hit |
| explain(question, top_k?) | Get a paragraph-length answer with [N] file:line citations, synthesized by a local chat model |
Step 4 (optional) — Keep the index live while you code:
smartgrep watch
Run this in a separate terminal. The agent's searches will always reflect what's currently on disk.
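The Windows path note in Step 2 comes down to JSON string escaping. As a minimal sketch (the helper name `build_mcp_config` is hypothetical and not part of smartgrep), serializing the config with Python's standard `json` module produces the double-backslash form automatically:

```python
import json


def build_mcp_config(project_dir: str) -> str:
    """Return JSON text for an MCP config entry pointing at project_dir.

    Hypothetical helper for illustration only; the structure mirrors
    the config shown in Step 2.
    """
    config = {
        "mcpServers": {
            "smartgrep": {
                "command": "smartgrep",
                "args": ["mcp", "--dir", project_dir],
            }
        }
    }
    # json.dumps writes every backslash in the path as a double backslash.
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # A raw string keeps the single backslashes of a native Windows path.
    print(build_mcp_config(r"C:\Users\you\my-project"))
```

Generating the file this way avoids hand-escaping mistakes; forward slashes remain the simpler option when editing the JSON by hand.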
Don't have an MCP client? smartgrep also works as a standalone CLI — keep scrolling.
How smartgrep makes vibe coding stronger
Traditional vibe coding Vibe coding + smartgrep
────────────────────────────────── ──────────────────────────────────
You describe a feature You describe a feature
Agent reads random files Agent calls semantic_search
Agent hallucinates structure Agent finds exact file:line
You correct mistakes Agent writes targeted code
Agent tries again It works first time
You correct more mistakes
...eventually it works
The difference: your AI agent goes from exploring blindly to knowing exactly where to look.
Framework-aware from the start
$ smartgrep
◆ smartgrep semantic code search
─────────────────────────────────────────────────────
↳ Detected React — try: component render · hook state · context provider
Type a query, /help for commands, q to quit.
smartgrep detects your stack automatically (Next.js, React, Vue, Python, Rust, Go, Express, Angular, Fastify) and shows you what to search for right away.
Your agent never reads the wrong file again
Agent prompt: "add input validation to the user registration endpoint"
│
semantic_search("user registration endpoint validation")
│
┌─────────┴──────────┐
▼ ▼
src/routes/auth.ts:34 src/validators/user.ts:12
(registration handler) (existing validation utils)
│
get_chunk to read both files
│
writes correct code that uses existing validators
← no duplication, no wrong file, no hallucination
Standalone CLI usage
Commands at a glance
| Command | What it does |
|---------|--------------|
| smartgrep | Interactive REPL (default) — search, then press 1–9 to open a result |
| smartgrep ask "<q>" | One-shot semantic search, pipe-safe |
| smartgrep explain "<q>" | Cited paragraph answer (--json for tooling / CI) |
| smartgrep chat | Multi-turn Q&A — re-searches the index each turn |
| smartgrep index [dir] | Build or refresh the index (--full forces a clean rebuild) |
| smartgrep watch [dir] | Keep the index live as you edit |
| smartgrep doctor [dir] | Check your setup (Ollama, models, index) and offer to fix issues |
| smartgrep mcp | Run as an MCP server for AI agents |
Run smartgrep <command> --help for flags and examples on any command.
Interactive REPL (default)
Just run smartgrep with no arguments:
smartgrep
A REPL opens. Type a question in plain English. Press 1–9 to open any result in your editor. Press / to search again. Press q to quit.
| Key | Action |
|-----|--------|
| 1–9 | Open that result in your editor at the exact line |
| Enter or / | Start a new search (keeps session open) |
| q or Ctrl+C | Quit |
On startup smartgrep detects your project's framework (Next.js, React, Python, Rust, Go, etc.) and shows relevant example queries. The REPL remembers your last query — pressing Enter repeats it.
One-shot search (pipe-safe)
# Pipe to other tools
smartgrep ask "error handling" | head -20
# Search a specific directory
smartgrep ask "database connection" --dir /path/to/project
# More results
smartgrep ask "caching logic" --top 10
# Different embedding model
smartgrep ask "auth flow" --model mxbai-embed-large
Keep the index live
smartgrep watch
Foreground watcher. Re-indexes files as you save them. Press Ctrl+C to stop. Uses chokidar with a 500 ms debounce and batched re-embedding. Shows a per-drain ticker so you can see when changes flow in.
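The debounce-then-batch idea can be sketched in a few lines. This is a Python illustration of the general technique, not smartgrep's implementation (which is built on chokidar); the class name `DebouncedBatch` and its methods are hypothetical, and timestamps are passed in explicitly so the logic is easy to follow and test:

```python
from typing import List, Optional, Set


class DebouncedBatch:
    """Collect changed paths and release them as one batch after a quiet period."""

    def __init__(self, quiet_ms: int = 500) -> None:
        self.quiet_ms = quiet_ms
        self._pending: Set[str] = set()
        self._last_event_ms: Optional[float] = None

    def on_change(self, path: str, now_ms: float) -> None:
        # Each new event restarts the quiet-period timer; repeated saves of
        # the same file collapse into a single pending entry.
        self._pending.add(path)
        self._last_event_ms = now_ms

    def drain(self, now_ms: float) -> List[str]:
        """Return the pending batch once the quiet period has elapsed, else []."""
        if self._last_event_ms is None:
            return []
        if now_ms - self._last_event_ms < self.quiet_ms:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        self._last_event_ms = None
        return batch


if __name__ == "__main__":
    d = DebouncedBatch(quiet_ms=500)
    d.on_change("src/a.ts", now_ms=0)
    d.on_change("src/b.ts", now_ms=300)
    d.on_change("src/a.ts", now_ms=400)  # duplicate save, still one entry
    print(d.drain(now_ms=800))  # quiet period not over yet
    print(d.drain(now_ms=900))  # both files released together
```

Batching matters because each drain triggers embedding work; re-embedding once per burst of saves is cheaper than once per file event.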
Ask questions with citations
# One-shot streaming answer with [N] citations
smartgrep explain "how does the watch lock work"
# Multi-turn — each turn re-searches the index for fresh context
smartgrep chat
# Machine-readable output (same shape as the MCP explain tool)
smartgrep explain "auth flow" --json | jq
Powered by a local chat model (default qwen2.5-coder:7b). Citations are validated post-stream: any [N] the model invents that doesn't map to a retrieved chunk is stripped before display.
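The post-stream validation step can be pictured as a simple filter over the generated text. The sketch below is an assumption about how such a check might look, not smartgrep's actual code: it assumes citations are numbered 1 through the number of retrieved chunks, and the function name `strip_invalid_citations` is hypothetical.

```python
import re

# Matches a citation marker such as "[3]", plus one optional leading space
# so that removing the marker does not leave a stray gap before punctuation.
CITATION = re.compile(r"\s?\[(\d+)\]")


def strip_invalid_citations(answer: str, num_chunks: int) -> str:
    """Remove [N] markers whose N does not map to a retrieved chunk.

    Assumes chunks are numbered 1..num_chunks (an illustrative convention).
    """

    def keep_or_drop(match: "re.Match[str]") -> str:
        n = int(match.group(1))
        return match.group(0) if 1 <= n <= num_chunks else ""

    return CITATION.sub(keep_or_drop, answer)


if __name__ == "__main__":
    text = "Auth uses JWT [1] and sessions [7]."
    print(strip_invalid_citations(text, num_chunks=3))
```

Validating after the stream completes keeps the check independent of how the model tokenizes the markers.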
Check your setup
smartgrep doctor # current folder
smartgrep doctor ./my-app # a specific folder
`doctor` verifies Node, native modules, Ollama, both models, and your index in one pass.
For anything fixable it offers to run the fix on the spot (ollama pull …, smartgrep index);
print-only issues show the exact command. It exits non-zero when something's still wrong, so
it doubles as a CI readiness check. Run it first whenever a command behaves unexpectedly.
Install
Prerequisites
| Tool | Install |
|------|---------|
| Node.js 18+ | nodejs.org |
| Ollama | ollama.ai |
Fresh install
npm install -g smartgrep
Already installed? Update to the latest version
npm install -g smartgrep@latest
Check your current version anytime:
smartgrep --version
After updating, re-run `smartgrep index` in any project you want to search, because new versions may improve the index format.
Pull the models (once)
# Required — for indexing and semantic search
ollama pull nomic-embed-text
# Optional — only needed for `smartgrep explain` / `smartgrep chat` / MCP `explain`
ollama pull qwen2.5-coder:7b
`nomic-embed-text` is ~274 MB. `qwen2.5-coder:7b` is ~4.7 GB. Each is downloaded once per machine. If you only want search, skip the chat model.
# Alternative: build from source
git clone https://github.com/Benedictpatrick/Smartgrep.git
cd Smartgrep
npm install
npm run build
npm link
Features
- Local-first: runs entirely on your machine via Ollama. No cloud, no API keys, your code never leaves disk.
- MCP server built in: plug into Claude Code, Cursor, Windsurf, Aider, or any MCP-aware agent in two minutes. Three tools: `semantic_search`, `get_chunk`, `explain`.
- Repo Q&A with citations: `smartgrep explain "..."` returns a paragraph-length answer with inline `[N]` file:line citations, streamed from a local chat model. `smartgrep chat` adds multi-turn. `--json` for tooling / CI.
- Hybrid semantic + lexical search: vector similarity (LanceDB) and keyword matching fused via Reciprocal Rank Fusion for best-of-both-worlds relevance.
- Smart chunking: tree-sitter splits TS / TSX / JS / JSX, Python, Go, Rust, and Java by function and class boundary; sliding-window fallback for other languages.
- Incremental re-indexing: only re-embeds files that actually changed. Re-running `smartgrep index` on an up-to-date repo is near-free.
- Live watch mode: `smartgrep watch` keeps the index live as you code, with a per-drain ticker so you can see when changes flow in.
- Interactive REPL: animated search phases, syntax-highlighted results, press `1–9` to open in your editor. Detects your framework on startup (Next.js, React, Python, Rust, Go…) and shows tailored example queries. Slash commands `/help`, `/explain <q>`, `/chat`.
- Press-to-open editor integration: auto-detects VS Code, vim, nano, or `$EDITOR`.
- Pipe-safe: `smartgrep ask "query" | grep foo` and `smartgrep explain "..." --json | jq` both work cleanly.
- One-command setup check: `smartgrep doctor` verifies Ollama, both models, Node, native modules, and your index, then offers to fix what it can. CI-friendly (non-zero exit on failure).
- Friendly errors, never stack traces: Ollama down, a missing model, or a missing/locked/unreadable index each produce a clear message plus the exact command to fix it.
- Respects `.gitignore`: won't index `node_modules`, build artifacts, lock files.
- Any language: TypeScript, JavaScript, Python, Go, Rust, Java, C/C++, C#, Ruby, PHP, Swift, Kotlin, HTML, CSS, SQL, Markdown, and more.
Indexing
cd your-project
smartgrep index
◆ smartgrep semantic code search
──────────────────────────────────────────────────────────
◇ Indexing workspace
workspace /your-project
✓ Ollama ready
✓ Files 47 found
✓ Chunks 312 created
● Embedding with nomic-embed-text
━━━━━━━━━━━━━━━━━━━━ 62% (194/312) ETA 14s
src/payments/retryService.ts
✓ Index saved
✓ 312 chunks · 47 files · 18.3s
↳ Run smartgrep to search
The index is saved in .smartgrep/ inside your project. Add it to .gitignore:
.smartgrep/
Re-running `smartgrep index` is incremental: only files whose mtime, size, or content hash changed get re-embedded. `--full` forces a clean rebuild.
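A change check based on mtime, size, and content hash can be sketched as follows. This is an illustration under assumptions, not smartgrep's implementation: the `Fingerprint` record, the in-memory `manifest` dictionary, and the choice of SHA-256 are all hypothetical stand-ins for whatever the real manifest stores.

```python
import hashlib
import os
from typing import Dict, NamedTuple


class Fingerprint(NamedTuple):
    """What is remembered about a file between index runs (hypothetical shape)."""

    mtime_ns: int
    size: int
    sha256: str


def fingerprint(path: str) -> Fingerprint:
    """Read the file's current mtime, size, and content hash."""
    stat = os.stat(path)
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return Fingerprint(stat.st_mtime_ns, stat.st_size, digest)


def needs_reembed(path: str, manifest: Dict[str, Fingerprint]) -> bool:
    """True if the file is new or any recorded attribute differs."""
    previous = manifest.get(path)
    return previous is None or previous != fingerprint(path)
```

For example, a file that has never been indexed is reported as changed, a file whose fingerprint was stored and then left alone is skipped, and any edit flips it back to changed. A production version would likely compare mtime and size first and hash only when those differ, since hashing is the expensive part.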
Editor integration
When you press 1–9 in the REPL, smartgrep detects your editor and opens the file at the exact line.
| Editor | Detection | Open-at-line syntax |
|--------|-----------|---------------------|
| VS Code | $VISUAL, $EDITOR, or code on PATH | code --goto path:line:1 |
| vim / neovim | $VISUAL, $EDITOR, or vim on PATH | vim +line path |
| nano | $EDITOR | nano +line path |
| Any other | $VISUAL or $EDITOR | $EDITOR +line path |
Set $EDITOR or $VISUAL in your shell profile to override detection.
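The open-at-line syntaxes in the table reduce to two shapes. A minimal sketch, assuming the editor has already been detected and is given as a bare command name (the function `open_at_line_command` is hypothetical, for illustration only):

```python
from typing import List


def open_at_line_command(editor: str, path: str, line: int) -> List[str]:
    """Build the argument list that opens path at line in the given editor."""
    if editor == "code":
        # VS Code takes path:line:column after --goto.
        return ["code", "--goto", f"{path}:{line}:1"]
    # vim, nano, and the generic fallback all use the +line form.
    return [editor, f"+{line}", path]


if __name__ == "__main__":
    print(open_at_line_command("code", "src/auth/middleware.ts", 34))
    print(open_at_line_command("vim", "src/auth/middleware.ts", 34))
```

Returning a list rather than a single string lets the caller pass it straight to a process-spawning API without shell quoting issues.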
How it works
Your codebase
│
▼
File Walker respects .gitignore, skips noise files
│
▼
Tree-sitter splits code by function / class boundary
Chunker (falls back to sliding window for unsupported languages)
│
▼
Ollama embeds each chunk into a 768-dimensional vector
Embedder using nomic-embed-text running locally
│
▼
LanceDB stores vectors + metadata + manifest.json in .smartgrep/
Vector Store
│
▼
Your query ──► embed ──► vector search ─┐
──► keyword search ─┤──► Reciprocal Rank Fusion ──► top results
MCP server (smartgrep mcp) exposes the same ranking pipeline as
semantic_search + get_chunk tools over stdio JSON-RPC.
Why this works better than keyword search:
grep "email" finds 200 matches. smartgrep ask "where do we send transactional emails" finds the 3 functions that actually send them — because it understands intent, not just words.
Hybrid search: smartgrep combines semantic (vector similarity) and lexical (keyword scoring) results using Reciprocal Rank Fusion. Results that appear in both lists rank higher; neither list discards the other.
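Reciprocal Rank Fusion scores each result by summing 1 / (k + rank) over every list it appears in. The sketch below shows the standard formulation; the constant `k = 60` is the value commonly used in the literature and is an assumption here, since this README does not state which constant smartgrep uses. The chunk IDs are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of IDs into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # An item missing from a list simply contributes nothing for it.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic = ["a", "b", "c"]  # hypothetical vector-search order
    lexical = ["b", "d", "a"]   # hypothetical keyword-search order
    print(reciprocal_rank_fusion([semantic, lexical]))
    # prints ['b', 'a', 'd', 'c']
```

In this example `b` and `a` appear in both lists and therefore outrank `d` and `c`, which each appear in only one. Because the method uses ranks rather than raw scores, the vector and keyword scores never need to be normalized against each other.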
Supported languages
| Language | Chunking |
|----------|----------|
| TypeScript / TSX | Tree-sitter (function/class level) |
| JavaScript / JSX | Tree-sitter (function/class level) |
| Python | Tree-sitter (function/class level) |
| Go | Tree-sitter (function/class level) |
| Rust | Tree-sitter (function/class level) |
| Java | Tree-sitter (function/class level) |
| C/C++, C#, Ruby, PHP, Swift, Kotlin | Sliding window |
| HTML, CSS, SCSS, Vue, Svelte, SQL, YAML, TOML, Markdown | Sliding window |
Tree-sitter support for more languages is on the roadmap. PRs welcome.
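For the languages in the "Sliding window" rows, chunking means cutting the file into fixed-size, overlapping line ranges. A minimal sketch of that technique follows; the window and overlap sizes are illustrative assumptions, not smartgrep's actual settings, and the function name is hypothetical.

```python
from typing import List, Tuple


def sliding_window_chunks(
    lines: List[str], window: int = 40, overlap: int = 10
) -> List[Tuple[int, int, str]]:
    """Split lines into overlapping chunks of (start_line, end_line, text).

    Line numbers are 1-based and inclusive.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("window must be positive and overlap must be in [0, window)")
    step = window - overlap
    chunks: List[Tuple[int, int, str]] = []
    start = 0
    while start < len(lines):
        end = min(start + window, len(lines))
        chunks.append((start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break  # the last window already reached the end of the file
        start += step
    return chunks


if __name__ == "__main__":
    sample = [f"line {i}" for i in range(1, 11)]
    for first, last, _ in sliding_window_chunks(sample, window=4, overlap=1):
        print(first, last)
```

The overlap keeps a statement that straddles a boundary visible in both neighboring chunks, which is the main weakness tree-sitter chunking avoids by cutting at function and class boundaries instead.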
Configuration
| Flag | Default | Description |
|------|---------|-------------|
| --model | nomic-embed-text | Ollama embedding model |
| --top | 5 | Number of results to return |
| --dir | cwd | Project root |
| --full | (off) | Force a clean re-index (skip incremental cache) |
| --force | (off) | Override the watch.lock check |
Alternative embedding models (better accuracy, slower):
ollama pull mxbai-embed-large
smartgrep index --model mxbai-embed-large
smartgrep ask "your query" --model mxbai-embed-large
If you change the embedding model, smartgrep automatically triggers a full re-index, because the old vectors aren't compatible.
Troubleshooting
Run smartgrep doctor — it checks Ollama, both models, Node, native modules, and the
index, and offers to fix anything it can:
smartgrep doctor # current folder
smartgrep doctor ./my-app # a specific folder
It exits non-zero if anything is still wrong, so it works in CI too.
FAQ
Does this send my code anywhere? No. Ollama runs on your machine. LanceDB is an embedded file-based vector store. Nothing leaves disk.
Do I need a GPU? Recommended for fast indexing, not required. Ollama runs on CPU too — just slower.
How big can the repo be? Tested on repos up to ~10k files. The bottleneck is Ollama embedding throughput (a few hundred chunks/sec on CPU, much faster on GPU).
What's MCP? Model Context Protocol — Anthropic's open standard for giving AI agents tool access. Claude Code, Cursor, Windsurf, Aider, and others speak it.
Does it work offline? Yes. After npm install and ollama pull, you can use it on a plane.
Will it index private code safely? Yes — that's the whole point of local-first. Nothing crosses a network.
Which chat model should I use for explain / chat? qwen2.5-coder:7b (the default) is a strong code-tuned ~5 GB model. On laptops with less RAM, try qwen2.5-coder:3b. For more capable answers, try qwen2.5-coder:14b or llama3.1:8b. Override with smartgrep explain "..." --model <name> or smartgrep chat -m <name>.
Contributing
Pull requests are welcome — especially:
- Tree-sitter chunking for more languages (C/C++, Ruby, PHP, Swift, Kotlin — function/class boundaries)
- New MCP tools (e.g. `list_symbols`, `find_references`)
- VS Code / JetBrains extensions
To contribute:
- Fork the repo
- `git checkout -b feature/your-feature`
- `npm install && npm run build`
- `npm test` (all tests must pass)
- Open a PR
Roadmap
- [x] Interactive REPL with animated search phases
- [x] Syntax-highlighted results with press-to-open editor integration
- [x] Hybrid semantic + lexical search with Reciprocal Rank Fusion
- [x] Per-file ETA progress during indexing
- [x] Incremental re-indexing (only changed files)
- [x] Live watch mode (`smartgrep watch`)
- [x] MCP server for Claude Code / Cursor / agents
- [x] Published to npm
- [x] Repo Q&A with citations: `smartgrep explain`, `smartgrep chat`, MCP `explain` tool
- [x] Workspace-adaptive tips: REPL detects Next.js / React / Python / Rust / Go and shows framework-specific example queries on startup
- [x] Reliable cross-platform action bar: readline-based input works on Windows PowerShell (was broken with raw-mode)
- [x] 10-line result snippets: shows more of each function body (was 6 lines)
- [x] Tree-sitter for more languages: Python, Go, Rust, Java now chunk at function/class boundaries (was sliding-window for non-TS/JS)
- [x] `smartgrep doctor` + friendly errors: one-command setup check with offer-to-fix; every failure shows the exact fix instead of a stack trace
What's next
- [ ] Code-graph queries: `smartgrep refs <symbol>`, `smartgrep callers <fn>`, "who imports X"
- [ ] VS Code extension: wrap the MCP server as a sidebar panel
- [ ] `smartgrep explain` tool-use loop: let the chat model re-query the index mid-answer (follow imports, expand scope)
- [ ] Multi-repo / monorepo federation: search across many indexed roots
- [ ] GitHub Action: `smartgrep index` on push, cache the artifact for fast CI re-use
Have an idea? Open an issue — local-first, no-cloud, no-telemetry ideas welcome; cloud-LLM and analytics features are intentionally out of scope.
Built with Ollama · LanceDB · tree-sitter · TypeScript
If smartgrep saved you time, give it a ⭐. It's the cheapest way to help.
