smartgrep

v1.4.1

Published

21 days ago

Local-first semantic code search + Q&A for humans and AI agents. CLI + MCP server. No API keys, nothing leaves your machine.


bencodess

code-search semantic-search ai cli ollama mcp model-context-protocol claude-code cursor developer-tools rag

If smartgrep saves you time, give it a ⭐ — it's the cheapest way to help.

What's new in v1.4.1

Clearer first-run guidance: smartgrep doctor and the error messages now distinguish "Ollama isn't installed" (points you to ollama.com/download) from "Ollama is installed but not running" (run ollama serve), instead of always suggesting ollama serve.
Quickstart now starts at step 0 — an explicit "install Ollama" step so brand-new users aren't dropped straight into ollama pull.

What's new in v1.4.0

Tree-sitter for Python, Go, Rust, and Java — these now chunk at function/class boundaries (previously sliding-window), so semantic hits land on whole functions instead of arbitrary line ranges.
smartgrep doctor — one command checks Ollama, both models, Node, native modules, and your index, then offers to fix what it can (ollama pull …, smartgrep index). Run it first if anything seems off.
Friendlier errors everywhere — missing model, Ollama not running, no/locked/unreadable index now produce a clear message + the exact fix instead of a stack trace.

Already installed? Run npm install -g smartgrep@latest to update.

30-second quickstart

# 0. Install Ollama (one time) — powers the local AI, ~2 min
#    macOS/Windows: download from https://ollama.com/download
#    Linux:         curl -fsSL https://ollama.com/install.sh | sh

# 1. Install smartgrep + the embedding model (one time)
npm install -g smartgrep
ollama pull nomic-embed-text

# 2. Index your project (once)
cd your-project
smartgrep doctor                           # (optional) verify everything is set up
smartgrep index

# 3. Search
smartgrep                                  # interactive REPL
smartgrep ask "where do we send emails"    # one-shot

# 4. (optional) Ask questions with citations
ollama pull qwen2.5-coder:7b               # one-time, ~4.7 GB
smartgrep explain "how does authentication work"

You're searching your codebase with natural language in under a minute. No accounts, no internet, no API keys.

Ask anything about your code

smartgrep explain "how does watch mode debounce file changes"

Returns a paragraph with [N] file:line citations, generated 100% on-device via Ollama. No API keys, nothing leaves your machine. For follow-ups, run smartgrep chat — multi-turn Q&A that re-searches the index each turn.

For tooling and CI, smartgrep explain "..." --json returns the same answer and citations as a single machine-readable object, the same shape exposed via the MCP explain tool.

One-time setup

ollama pull qwen2.5-coder:7b   # ~4.7 GB chat model

Why vibe coders need smartgrep

You're vibe coding — you tell Claude Code or Cursor what to build and it does it. That flow breaks the moment your AI agent doesn't know your codebase. It starts hallucinating file paths, reading the wrong functions, duplicating logic that already exists somewhere.

The problem isn't the AI. It's that the AI is flying blind.

Without smartgrep                    With smartgrep
─────────────────────────────────    ─────────────────────────────────
Agent: "I'll look for auth logic"    Agent calls → semantic_search
Agent reads: index.ts ❌             Returns:  src/auth/middleware.ts:34
Agent reads: app.ts ❌                         src/auth/jwt.ts:12
Agent reads: server.ts ❌                      src/routes/login.ts:67
Agent reads: routes.ts... ❌
Agent: "I'll guess the structure"    Agent reads exactly the right files
                                     First try. Every time.

smartgrep gives your AI agent semantic eyes — it finds the right code by understanding intent, not just matching words.

Supercharge your AI agent (Claude Code / Cursor / Windsurf)

This is the headline feature. smartgrep ships an MCP (Model Context Protocol) server so any AI coding agent searches your codebase as a tool — semantically, instantly, locally.

What your agent gains

┌─────────────────────────────────────────────────────────────┐
│                    Your AI Agent                            │
│                                                             │
│  "Add rate limiting to the auth routes"                     │
│         │                                                   │
│         ▼                                                   │
│  semantic_search("auth route middleware")                   │
│         │                                                   │
│         ▼                                                   │
│  ✓ src/auth/middleware.ts:34   ← exact file, exact line     │
│  ✓ src/routes/auth.ts:12       ← no hallucination           │
│  ✓ src/config/rateLimiter.ts:1 ← finds existing logic      │
│         │                                                   │
│         ▼                                                   │
│  Agent reads the right code → writes the right fix         │
└─────────────────────────────────────────────────────────────┘

The 3 tools your agent gets

| Tool | What it does | When the agent uses it |
|------|-------------|----------------------|
| semantic_search(query) | Finds code by intent, returns ranked file:line hits | Any time it needs to locate logic |
| get_chunk(file, start, end) | Reads a specific line range from any file | Expanding context around a search hit |
| explain(question) | Returns a cited paragraph answer about your code | "How does X work?", "Why is Y structured this way?" |

What your agent actually sees

Without smartgrep: the agent guesses, reads 10 wrong files, may invent a solution that conflicts with existing code.

With smartgrep: the agent gets exact file:line results with real code snippets — first call, every time.

Set up the MCP integration (5 minutes)

Step 1 — Index your repo:

cd your-project
smartgrep index

Step 2 — Add smartgrep to your agent's MCP config:

Claude Code: add to your project's .mcp.json (or run claude mcp add):

{
  "mcpServers": {
    "smartgrep": {
      "command": "smartgrep",
      "args": ["mcp", "--dir", "/absolute/path/to/your-project"]
    }
  }
}

Cursor: add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "smartgrep": {
      "command": "smartgrep",
      "args": ["mcp", "--dir", "/absolute/path/to/your-project"]
    }
  }
}

Windsurf: add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "smartgrep": {
      "command": "smartgrep",
      "args": ["mcp", "--dir", "/absolute/path/to/your-project"]
    }
  }
}

Windows users: use forward slashes or double backslashes in the path — e.g. C:/Users/you/my-project or C:\\Users\\you\\my-project.
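The three config blocks above differ only in where they live, so the JSON can be generated rather than hand-edited. A minimal Python sketch, assuming you run it from the project root; the helper name build_mcp_config is hypothetical and not part of smartgrep. json.dumps escapes backslashes, which covers the Windows path case.

```python
import json
from pathlib import Path


def build_mcp_config(project_dir: str) -> str:
    """Return the mcpServers JSON block shown above for a given project path."""
    config = {
        "mcpServers": {
            "smartgrep": {
                "command": "smartgrep",
                "args": ["mcp", "--dir", project_dir],
            }
        }
    }
    # json.dumps escapes backslashes, so Windows paths stay valid JSON.
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # The config expects an absolute path, so resolve the current directory.
    print(build_mcp_config(str(Path.cwd().resolve())))
```

Paste the printed block into the config file for your client.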

Step 3 — Restart your agent. It now has three new tools:

| Tool | What the agent can do |
|------|--------------|
| semantic_search(query, top_k?) | Find code by intent; returns ranked file/line results with up to 80 lines of context per hit |
| get_chunk(file, line_start, line_end) | Pull a specific line range to expand context around any hit |
| explain(question, top_k?) | Get a paragraph-length answer with [N] file:line citations, synthesized by a local chat model |
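For the curious, here is roughly what an agent sends when it calls one of these tools. MCP tool calls are JSON-RPC 2.0 requests with the method tools/call, written as one JSON object per line over stdio. The sketch below only builds such a request; the argument names query and top_k come from the table above, while the id and the example query are illustrative. The response shape is not documented here, so it is not shown.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single JSON-RPC 2.0 line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Illustrative query; top_k is the optional parameter from the table.
    print(tool_call_request(1, "semantic_search",
                            {"query": "auth route middleware", "top_k": 5}))
```

A real session starts with an initialize handshake before any tool call; your MCP client handles that for you.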

Step 4 (optional) — Keep the index live while you code:

smartgrep watch

Run this in a separate terminal. The agent's searches will always reflect what's currently on disk.

Don't have an MCP client? smartgrep also works as a standalone CLI — keep scrolling.

How smartgrep makes vibe coding stronger

Traditional vibe coding                 Vibe coding + smartgrep
──────────────────────────────────      ──────────────────────────────────
You describe a feature                  You describe a feature
Agent reads random files                Agent calls semantic_search
Agent hallucinates structure            Agent finds exact file:line
You correct mistakes                    Agent writes targeted code
Agent tries again                       It works first time
You correct more mistakes
...eventually it works

The difference: your AI agent goes from exploring blindly to knowing exactly where to look.

Framework-aware from the start

$ smartgrep

  ◆ smartgrep  semantic code search
  ─────────────────────────────────────────────────────
  ↳  Detected React — try: component render · hook state · context provider
     Type a query, /help for commands, q to quit.

smartgrep detects your stack automatically — Next.js, React, Vue, Python, Rust, Go, Express, Angular, Fastify — and shows you what to search for right away.

Your agent never reads the wrong file again

Agent prompt:  "add input validation to the user registration endpoint"
                              │
               semantic_search("user registration endpoint validation")
                              │
                    ┌─────────┴──────────┐
                    ▼                    ▼
         src/routes/auth.ts:34    src/validators/user.ts:12
         (registration handler)   (existing validation utils)
                    │
              get_chunk to read both files
                    │
              writes correct code that uses existing validators
              ← no duplication, no wrong file, no hallucination

Standalone CLI usage

Commands at a glance

| Command | What it does |
|---------|--------------|
| smartgrep | Interactive REPL (default): search, then press 1–9 to open a result |
| smartgrep ask "<q>" | One-shot semantic search, pipe-safe |
| smartgrep explain "<q>" | Cited paragraph answer (--json for tooling / CI) |
| smartgrep chat | Multi-turn Q&A that re-searches the index each turn |
| smartgrep index [dir] | Build or refresh the index (--full forces a clean rebuild) |
| smartgrep watch [dir] | Keep the index live as you edit |
| smartgrep doctor [dir] | Check your setup (Ollama, models, index) and offer to fix issues |
| smartgrep mcp | Run as an MCP server for AI agents |

Run smartgrep <command> --help for flags and examples on any command.

Interactive REPL (default)

Just run smartgrep with no arguments:

smartgrep

A REPL opens. Type a question in plain English. Press 1–9 to open any result in your editor. Press / to search again. q to quit.

| Key | Action |
|-----|--------|
| 1–9 | Open that result in your editor at the exact line |
| Enter or / | Start a new search (keeps session open) |
| q or Ctrl+C | Quit |

On startup smartgrep detects your project's framework (Next.js, React, Python, Rust, Go, etc.) and shows relevant example queries. The REPL remembers your last query — pressing Enter repeats it.

One-shot search (pipe-safe)

# Pipe to other tools
smartgrep ask "error handling" | head -20

# Search a specific directory
smartgrep ask "database connection" --dir /path/to/project

# More results
smartgrep ask "caching logic" --top 10

# Different embedding model
smartgrep ask "auth flow" --model mxbai-embed-large

Keep the index live

smartgrep watch

Foreground watcher. Re-indexes files as you save them. Ctrl+C to stop. Uses chokidar + 500ms debounce + batched re-embed. Shows a per-drain ticker so you can see when changes flow in.
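To make the debounce-and-batch idea concrete, here is a minimal Python sketch. It is not smartgrep's implementation (that uses chokidar in Node); the class name DebouncedBatcher and the flush callback are hypothetical. Each change restarts a quiet-period timer, and when the timer finally fires, every path seen since the last flush is handed over as one batch.

```python
import threading


class DebouncedBatcher:
    """Collect changed paths and flush them once no new event arrives for `delay` seconds."""

    def __init__(self, flush, delay=0.5):
        self._flush = flush      # callback that receives a sorted list of paths
        self._delay = delay      # 0.5 s mirrors the 500ms debounce described above
        self._pending = set()    # a set, so repeated saves of one file collapse
        self._timer = None
        self._lock = threading.Lock()

    def notify(self, path):
        """Record a change and restart the quiet-period timer."""
        with self._lock:
            self._pending.add(path)
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay, self._drain)
            self._timer.daemon = True
            self._timer.start()

    def _drain(self):
        """Hand the accumulated batch to the flush callback."""
        with self._lock:
            batch, self._pending = sorted(self._pending), set()
            self._timer = None
        if batch:
            self._flush(batch)
```

Saving the same file three times in quick succession produces one batch containing that file once, which is the point of batching the re-embed.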

Ask questions with citations

# One-shot streaming answer with [N] citations
smartgrep explain "how does the watch lock work"

# Multi-turn — each turn re-searches the index for fresh context
smartgrep chat

# Machine-readable output (same shape as the MCP explain tool)
smartgrep explain "auth flow" --json | jq

Powered by a local chat model (default qwen2.5-coder:7b). Citations are validated post-stream — any [N] the model invents that doesn't map to a retrieved chunk is stripped before display.
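The citation check can be pictured as a small post-processing pass. This is a sketch under stated assumptions, not the actual code: it assumes citations are numbered from 1 and that [N] refers to the N-th retrieved chunk. The function name strip_invalid_citations is hypothetical.

```python
import re

# Match a citation marker such as " [3]", including the whitespace before it.
CITATION = re.compile(r"\s*\[(\d+)\]")


def strip_invalid_citations(answer: str, retrieved_count: int) -> str:
    """Drop any [N] marker whose N is not in 1..retrieved_count."""
    def keep_or_drop(match):
        n = int(match.group(1))
        return match.group(0) if 1 <= n <= retrieved_count else ""
    return CITATION.sub(keep_or_drop, answer)


if __name__ == "__main__":
    # Only 3 chunks were retrieved, so [7] cannot point at anything real.
    print(strip_invalid_citations(
        "Tokens are verified in jwt.ts [1] and cached [7].", 3))
```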

Check your setup

smartgrep doctor            # current folder
smartgrep doctor ./my-app   # a specific folder

doctor verifies Node, native modules, Ollama, both models, and your index in one pass. For anything fixable it offers to run the fix on the spot (ollama pull …, smartgrep index); print-only issues show the exact command. It exits non-zero when something's still wrong, so it doubles as a CI readiness check. Run it first whenever a command behaves unexpectedly.
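Because doctor signals problems through its exit code, a CI step only needs to check that code. A minimal Python sketch; the function name setup_is_ready is hypothetical, and closing stdin is an assumption made here so an offer-to-fix prompt cannot block an unattended run (how doctor behaves without a terminal is not documented above).

```python
import subprocess
import sys


def setup_is_ready(command=("smartgrep", "doctor")) -> bool:
    """Run the check command and report whether it exited with status 0."""
    # stdin is closed so an interactive prompt cannot hang the job (assumption).
    result = subprocess.run(list(command), stdin=subprocess.DEVNULL)
    return result.returncode == 0


if __name__ == "__main__":
    # Fail the CI step when doctor reports a problem.
    sys.exit(0 if setup_is_ready() else 1)
```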

Install

Prerequisites

| Tool | Install |
|------|---------|
| Node.js 18+ | nodejs.org |
| Ollama | ollama.com/download |

Fresh install

npm install -g smartgrep

Already installed? Update to the latest version

npm install -g smartgrep@latest

Check your current version anytime:

smartgrep --version

After updating, re-run smartgrep index in any project you want to search — new versions may improve the index format.

Pull the models (once)

# Required — for indexing and semantic search
ollama pull nomic-embed-text

# Optional — only needed for `smartgrep explain` / `smartgrep chat` / MCP `explain`
ollama pull qwen2.5-coder:7b

nomic-embed-text is ~274 MB. qwen2.5-coder:7b is ~4.7 GB. Each is downloaded once per machine. If you only want search, skip the chat model.

To build from source instead:

git clone https://github.com/Benedictpatrick/Smartgrep.git
cd Smartgrep
npm install
npm run build
npm link

Features

Local-first — runs entirely on your machine via Ollama. No cloud, no API keys, your code never leaves disk.
MCP server built in — plug into Claude Code, Cursor, Windsurf, Aider, or any MCP-aware agent in two minutes. Three tools: semantic_search, get_chunk, explain.
Repo Q&A with citations — smartgrep explain "..." returns a paragraph-length answer with inline [N] file:line citations, streamed from a local chat model. smartgrep chat adds multi-turn. --json for tooling / CI.
Hybrid semantic + lexical search — vector similarity (LanceDB) and keyword matching fused via Reciprocal Rank Fusion for best-of-both-worlds relevance.
Smart chunking — tree-sitter splits TS / TSX / JS / JSX, Python, Go, Rust, and Java by function and class boundary; sliding-window fallback for other languages.
Incremental re-indexing — only re-embeds files that actually changed. Re-running smartgrep index on an up-to-date repo is near-free.
Live watch mode — smartgrep watch keeps the index live as you code, with a per-drain ticker so you can see when changes flow in.
Interactive REPL — animated search phases, syntax-highlighted results, press 1–9 to open in your editor. Detects your framework on startup (Next.js, React, Python, Rust, Go…) and shows tailored example queries. Slash commands /help, /explain <q>, /chat.
Press-to-open editor integration — auto-detects VS Code, vim, nano, or $EDITOR.
Pipe-safe — smartgrep ask "query" | grep foo and smartgrep explain "..." --json | jq both work cleanly.
One-command setup check — smartgrep doctor verifies Ollama, both models, Node, native modules, and your index, then offers to fix what it can. CI-friendly (non-zero exit on failure).
Friendly errors, never stack traces — Ollama down, a missing model, or a missing/locked/unreadable index each produce a clear message plus the exact command to fix it.
Respects .gitignore — won't index node_modules, build artifacts, lock files.
Any language — TypeScript, JavaScript, Python, Go, Rust, Java, C/C++, C#, Ruby, PHP, Swift, Kotlin, HTML, CSS, SQL, Markdown, and more.

Indexing

cd your-project
smartgrep index

  ◆ smartgrep  semantic code search
  ──────────────────────────────────────────────────────────

  ◇ Indexing workspace
    workspace   /your-project

  ✓ Ollama  ready
  ✓ Files   47 found
  ✓ Chunks  312 created

  ●  Embedding with nomic-embed-text

  ━━━━━━━━━━━━━━━━━━━━  62%  (194/312)  ETA 14s
     src/payments/retryService.ts

  ✓ Index  saved

  ✓  312 chunks  ·  47 files  ·  18.3s
  ↳  Run smartgrep to search

The index is saved in .smartgrep/ inside your project. Add it to .gitignore:

.smartgrep/

Re-running smartgrep index is incremental — only files whose mtime, size, or content hash changed get re-embedded. --full forces a clean rebuild.
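The incremental check can be sketched as comparing a per-file fingerprint against what the last run stored. The Python below is illustrative only: the manifest layout, the SHA-256 choice, and the function names are assumptions, not smartgrep's actual format. It always hashes for simplicity; a real implementation would likely skip hashing when mtime and size are unchanged.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> dict:
    """Record the three signals named above for one file."""
    stat = path.stat()
    return {
        "mtime": stat.st_mtime_ns,
        "size": stat.st_size,
        "hash": hashlib.sha256(path.read_bytes()).hexdigest(),
    }


def files_to_reembed(paths, manifest: dict) -> list:
    """Return paths whose fingerprint differs from the manifest entry, or that have none."""
    return [str(p) for p in paths if manifest.get(str(p)) != fingerprint(Path(p))]
```

Files that come back from files_to_reembed are the only ones that need new embeddings; everything else keeps its stored vectors.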

Editor integration

When you press 1–9 in the REPL, smartgrep detects your editor and opens the file at the exact line.

| Editor | Detection | Open-at-line syntax |
|--------|-----------|---------------------|
| VS Code | $VISUAL, $EDITOR, or code on PATH | code --goto path:line:1 |
| vim / neovim | $VISUAL, $EDITOR, or vim on PATH | vim +line path |
| nano | $EDITOR | nano +line path |
| Any other | $VISUAL or $EDITOR | $EDITOR +line path |

Set $EDITOR or $VISUAL in your shell profile to override detection.
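The table boils down to a small decision: VS Code takes --goto path:line:1, everything else takes +line path. A simplified Python sketch of that rule; the function name open_command is hypothetical, and unlike the real detection it does not probe PATH, it simply falls back to vim when neither variable is set.

```python
import os
import shlex


def open_command(path: str, line: int, env=None) -> list:
    """Build an argv list that opens `path` at `line`, following the table above."""
    env = os.environ if env is None else env
    # $VISUAL wins over $EDITOR; vim is this sketch's fallback (assumption).
    editor = env.get("VISUAL") or env.get("EDITOR") or "vim"
    parts = shlex.split(editor)          # allows values like "code --wait"
    name = os.path.basename(parts[0])
    if name == "code":
        return parts + ["--goto", f"{path}:{line}:1"]
    return parts + [f"+{line}", path]    # vim, nano, and the generic case
```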

How it works

Your codebase
     │
     ▼
  File Walker        respects .gitignore, skips noise files
     │
     ▼
  Tree-sitter        splits code by function / class boundary
  Chunker            (falls back to sliding window for unsupported languages)
     │
     ▼
  Ollama             embeds each chunk into a 768-dimensional vector
  Embedder           using nomic-embed-text running locally
     │
     ▼
  LanceDB            stores vectors + metadata + manifest.json in .smartgrep/
  Vector Store
     │
     ▼
  Your query ──► embed ──► vector search   ─┐
                       ──► keyword search  ─┤──► Reciprocal Rank Fusion ──► top results

  MCP server (smartgrep mcp) exposes the same ranking pipeline as
  semantic_search + get_chunk tools over stdio JSON-RPC.
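The embed and vector-search steps in the diagram can be approximated in a few lines of Python. This is a sketch, not smartgrep's code: it assumes Ollama is listening on its default address and uses Ollama's /api/embed endpoint, and it uses cosine similarity as the comparison, which is a common choice but not confirmed above as the metric smartgrep configures in LanceDB.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embed"  # Ollama's default local address


def embed(text: str, model: str = "nomic-embed-text") -> list:
    """Ask a locally running Ollama for the embedding of one string."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # The response carries one vector per input; we sent a single input.
        return json.loads(response.read())["embeddings"][0]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

embed needs a running Ollama with the model pulled; cosine_similarity is pure Python and works on any two vectors of equal length.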

Why this works better than keyword search:

grep "email" finds 200 matches. smartgrep ask "where do we send transactional emails" finds the 3 functions that actually send them — because it understands intent, not just words.

Hybrid search: smartgrep combines semantic (vector similarity) and lexical (keyword scoring) results using Reciprocal Rank Fusion. Results that appear in both lists rank higher; neither list discards the other.
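Reciprocal Rank Fusion itself is short enough to show. In this Python sketch each result earns 1 / (k + rank) from every list it appears in, and the sums decide the final order. The constant k = 60 is the value commonly used in the literature; the value smartgrep uses is not stated above, and the file names are made up for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; ties are broken by name for deterministic output.
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    semantic = ["jwt.ts", "middleware.ts", "login.ts"]
    lexical = ["middleware.ts", "session.ts", "jwt.ts"]
    print(reciprocal_rank_fusion([semantic, lexical]))
    # → ['middleware.ts', 'jwt.ts', 'session.ts', 'login.ts']
```

middleware.ts and jwt.ts appear in both lists, so they outrank the files that only one search found, while session.ts and login.ts are still kept.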

Supported languages

| Language | Chunking |
|----------|----------|
| TypeScript / TSX | Tree-sitter (function/class level) |
| JavaScript / JSX | Tree-sitter (function/class level) |
| Python | Tree-sitter (function/class level) |
| Go | Tree-sitter (function/class level) |
| Rust | Tree-sitter (function/class level) |
| Java | Tree-sitter (function/class level) |
| C/C++, C#, Ruby, PHP, Swift, Kotlin | Sliding window |
| HTML, CSS, SCSS, Vue, Svelte, SQL, YAML, TOML, Markdown | Sliding window |

Tree-sitter support for more languages is on the roadmap. PRs welcome.
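For languages on the sliding-window path, chunking works along these lines: fixed-size windows of lines that overlap, so code near a window edge still appears whole in one of the chunks. A Python sketch with illustrative numbers; the window of 40 lines and overlap of 10 are assumptions, not smartgrep's actual settings.

```python
def sliding_window_chunks(lines, window=40, overlap=10):
    """Split a file into overlapping (start_line, end_line, text) chunks, 1-indexed inclusive."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    step = window - overlap
    chunks = []
    for start in range(0, len(lines), step):
        piece = lines[start:start + window]
        chunks.append((start + 1, start + len(piece), "\n".join(piece)))
        if start + window >= len(lines):
            break  # this window already reaches the end of the file
    return chunks


if __name__ == "__main__":
    source = [f"line {i}" for i in range(1, 101)]
    print([(s, e) for s, e, _ in sliding_window_chunks(source)])
    # → [(1, 40), (31, 70), (61, 100)]
```

The cost of this approach is what the tree-sitter path avoids: a window can start or end in the middle of a function.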

Configuration

| Flag | Default | Description |
|------|---------|-------------|
| --model | nomic-embed-text | Ollama embedding model |
| --top | 5 | Number of results to return |
| --dir | cwd | Project root |
| --full | (off) | Force a clean re-index (skip incremental cache) |
| --force | (off) | Override the watch.lock check |

Alternative embedding models (better accuracy, slower):

ollama pull mxbai-embed-large
smartgrep index --model mxbai-embed-large
smartgrep ask "your query" --model mxbai-embed-large

If you change the embedding model, smartgrep automatically triggers a full re-index — the old vectors aren't compatible.

Troubleshooting

Run smartgrep doctor — it checks Ollama, both models, Node, native modules, and the index, and offers to fix anything it can:

smartgrep doctor            # current folder
smartgrep doctor ./my-app   # a specific folder

It exits non-zero if anything is still wrong, so it works in CI too.

FAQ

Does this send my code anywhere? No. Ollama runs on your machine. LanceDB is an embedded file-based vector store. Nothing leaves disk.

Do I need a GPU? Recommended for fast indexing, not required. Ollama runs on CPU too — just slower.

How big can the repo be? Tested on repos up to ~10k files. The bottleneck is Ollama embedding throughput (a few hundred chunks/sec on CPU, much faster on GPU).

What's MCP? Model Context Protocol — Anthropic's open standard for giving AI agents tool access. Claude Code, Cursor, Windsurf, Aider, and others speak it.

Does it work offline? Yes. After npm install and ollama pull, you can use it on a plane.

Will it index private code safely? Yes — that's the whole point of local-first. Nothing crosses a network.

Which chat model should I use for explain / chat? qwen2.5-coder:7b (the default) is a strong code-tuned model of about 4.7 GB. On laptops with less RAM, try qwen2.5-coder:3b. For more capable answers, try qwen2.5-coder:14b or llama3.1:8b. Override with smartgrep explain "..." --model <name> or smartgrep chat -m <name>.

Contributing

Pull requests are welcome — especially:

Tree-sitter chunking for more languages (C/C++, Ruby, PHP, Swift, Kotlin — function/class boundaries)
New MCP tools (e.g. list_symbols, find_references)
VS Code / JetBrains extensions

To contribute:

Fork the repo
git checkout -b feature/your-feature
npm install && npm run build
npm test (all tests must pass)
Open a PR

Roadmap

[x] Interactive REPL with animated search phases
[x] Syntax-highlighted results with press-to-open editor integration
[x] Hybrid semantic + lexical search with Reciprocal Rank Fusion
[x] Per-file ETA progress during indexing
[x] Incremental re-indexing (only changed files)
[x] Live watch mode (smartgrep watch)
[x] MCP server for Claude Code / Cursor / agents
[x] Published to npm
[x] Repo Q&A with citations — smartgrep explain, smartgrep chat, MCP explain tool
[x] Workspace-adaptive tips — REPL detects Next.js / React / Python / Rust / Go and shows framework-specific example queries on startup
[x] Reliable cross-platform action bar — readline-based input works on Windows PowerShell (was broken with raw-mode)
[x] 10-line result snippets — shows more of each function body (was 6 lines)
[x] Tree-sitter for more languages — Python, Go, Rust, Java now chunk at function/class boundaries (was sliding-window for non-TS/JS)
[x] smartgrep doctor + friendly errors — one-command setup check with offer-to-fix; every failure shows the exact fix instead of a stack trace

What's next

[ ] Code-graph queries — smartgrep refs <symbol>, smartgrep callers <fn>, "who imports X"
[ ] VS Code extension — wrap the MCP server as a sidebar panel
[ ] smartgrep explain tool-use loop — let the chat model re-query the index mid-answer (follow imports, expand scope)
[ ] Multi-repo / monorepo federation — search across many indexed roots
[ ] GitHub Action — smartgrep index on push, cache the artifact for fast CI re-use

Have an idea? Open an issue — local-first, no-cloud, no-telemetry ideas welcome; cloud-LLM and analytics features are intentionally out of scope.

Built with Ollama · LanceDB · tree-sitter · TypeScript

If smartgrep saved you time, give it a ⭐. It's the cheapest way to help.