wb-grep

v0.1.1

A local semantic grep tool using Qwen3 embeddings via Ollama and LanceDB

wb200

semantic-search grep embeddings ollama lancedb qwen3 code-search

wb-grep

A fully local semantic grep tool for code search using Qwen3 embeddings via Ollama and LanceDB.

wb-grep runs semantic code search entirely on your local machine, with no cloud services or API keys. Search your codebase with natural language queries, find related code by meaning rather than exact text matches, and keep your index up to date automatically as you work.

Install now: npm install -g wb-grep

Note: wb-grep is a derivative work based on mgrep by Mixedbread AI, licensed under Apache-2.0. This project replaces the cloud-based embedding and vector storage with local alternatives (Ollama + LanceDB) while preserving the core CLI design and architecture.

What is wb-grep?

Traditional grep is an invaluable tool, but it requires you to know exactly what you're looking for. When exploring unfamiliar codebases, debugging complex issues, or trying to understand how features are implemented, you often need to search by intent rather than exact patterns.

wb-grep solves this by:

Understanding meaning: Search for "authentication logic" and find the actual auth implementation, even if it's called verifyCredentials or checkUserSession
Running 100% locally: All embeddings and vector storage happen on your machine using Ollama and LanceDB—no cloud services, no API costs, no data leaving your system
Staying up-to-date: Watch mode automatically re-indexes files as you edit them
Being agent-friendly: Designed to work seamlessly with coding agents, providing quiet output and thoughtful defaults

How It Works

wb-grep uses a three-stage pipeline:

Chunking: Source files are intelligently split into semantic chunks (functions, classes, logical blocks) that preserve context
Embedding: Each chunk is converted into a 1024-dimensional vector using the Qwen3-Embedding-0.6B model running locally via Ollama
Vector Search: Queries are embedded the same way, then LanceDB finds the most similar code chunks using approximate nearest neighbor search

The result is a search experience that understands what you mean, not just what you type.
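The search stage can be illustrated with a minimal sketch. It assumes similarity is measured by cosine similarity and uses exact brute-force ranking over toy 3-dimensional vectors; the real tool uses 1024-dimensional embeddings and LanceDB's approximate nearest neighbor search, so treat the function names and data here as hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so report no similarity
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=10):
    """Rank (label, vector) pairs by similarity to the query, best first."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in chunks]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real chunk embeddings (illustrative values only).
index = [
    ("auth.ts:45-67", [0.9, 0.1, 0.0]),
    ("logger.ts:1-20", [0.0, 0.2, 0.9]),
    ("session.ts:12-28", [0.7, 0.6, 0.1]),
]
query = [1.0, 0.2, 0.0]
results = top_k(query, index, k=2)
for label, score in results:
    print(f"{label} ({score * 100:.1f}%)")
```

Brute-force ranking is linear in the number of chunks, which is why a vector database with an approximate index becomes worthwhile for larger repositories.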

Why wb-grep?

| Feature | grep/ripgrep | Cloud Semantic Search | wb-grep |
|---------|--------------|----------------------|-------------|
| Exact pattern matching | ✅ | ✅ | ✅ |
| Natural language queries | ❌ | ✅ | ✅ |
| Works offline | ✅ | ❌ | ✅ |
| No API costs | ✅ | ❌ | ✅ |
| Data stays local | ✅ | ❌ | ✅ |
| Automatic re-indexing | ❌ | ✅ | ✅ |
| AST-aware chunking | ❌ | ✅ | ✅ |

Use grep for: exact symbol tracing, regex patterns, refactoring known identifiers

Use wb-grep for: code exploration, feature discovery, understanding unfamiliar codebases, natural language queries

Quick Start

Prerequisites

Node.js 18+ (for running wb-grep)
Ollama (for local embeddings)

Installation

Recommended: Install from npm (one-liner)

npm install -g wb-grep

Alternative: Install from source (for development)

git clone https://github.com/wb200/wb-grep.git
cd wb-grep
npm install
npm run build
npm link  # Makes 'wb-grep' available globally

Verify installation:

wb-grep --version

Setup Ollama

# Install Ollama (macOS)
brew install ollama

# Or on Linux
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama server (keep running in background)
ollama serve

# Pull the embedding model (in another terminal, one-time setup)
ollama pull qwen3-embedding:0.6b

Verify Ollama is running:

curl http://localhost:11434/api/tags
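The same check can be scripted. The sketch below assumes the `/api/tags` response is a JSON object with a `models` list whose entries carry a `name` field; the helper names are hypothetical, and the sample payload is hand-written rather than real server output.

```python
import json
import urllib.request


def model_available(tags_payload, model_name):
    """Return True if model_name appears in a parsed /api/tags response."""
    models = tags_payload.get("models", [])
    return any(entry.get("name") == model_name for entry in models)


def fetch_tags(base_url="http://localhost:11434", timeout=5.0):
    """Fetch and parse /api/tags; raises URLError if Ollama is unreachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


# Offline demonstration with a trimmed, hand-written payload.
sample = {"models": [{"name": "qwen3-embedding:0.6b"}, {"name": "llama3:8b"}]}
print(model_available(sample, "qwen3-embedding:0.6b"))
```

With Ollama running, `model_available(fetch_tags(), "qwen3-embedding:0.6b")` performs the live check.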

Your First Search

# Navigate to any codebase
cd /path/to/your/project

# Index the repository (runs once, then watches for changes)
wb-grep watch

# Search using natural language
wb-grep "where is authentication handled"
wb-grep "database connection setup"
wb-grep "error handling patterns"

Factory Droid Integration

wb-grep integrates seamlessly with Factory Droid to provide semantic code search capabilities within your AI coding workflows.

Setup with Droid

# Run the install command to set up wb-grep for Droid
wb-grep install-droid

This command:

Verifies Ollama connectivity and embedding model availability
Checks if your repository is indexed
Installs wb-grep as a Droid plugin with hooks and skills
Enables automatic watch mode when Droid sessions start

How It Works with Droid

Once installed, wb-grep integrates with Droid via:

Hooks - Automatically starts/stops wb-grep watch at session boundaries:
- SessionStart: Initializes wb-grep watch in background
- SessionEnd: Cleanly terminates the watch process
Skills - Two complementary capabilities:
- wb-grep skill: Quick reference for using semantic search
- advanced-grep skill: Comprehensive decision framework for choosing the right search tool (wb-grep, Grep, or ast-grep)

Usage in Droid

Within a Droid session, you can leverage semantic search in several ways:

# Droid can invoke wb-grep directly
droid> Search for the authentication middleware

# Or use the advanced-grep skill for optimal search strategy
droid> Where should I look for rate limiting?
# → advanced-grep skill recommends: wb-grep "rate limiting implementation"

# Or run semantic queries via Execute tool
droid> /exec wb-grep "session management"

Plugin Structure

~/.factory/plugins/wb-grep/
├── hooks/
│   ├── hook.json                    # Hook configuration
│   ├── wb_grep_watch.py             # Start watch process
│   └── wb_grep_watch_kill.py        # Clean shutdown
├── skills/
│   └── wb-grep/
│       └── SKILL.md                 # Quick reference skill
└── plugin.json                      # Plugin metadata

Requirements for Droid Integration

wb-grep binary installed globally (via npm install -g wb-grep, or npm link when building from source)
Ollama running and accessible at http://localhost:11434
Embedding model available: qwen3-embedding:0.6b
Factory Droid CLI installed

Troubleshooting Droid Integration

"Plugin not found"

# Verify installation
wb-grep install-droid --verify

# Reinstall if needed
wb-grep install-droid --force

"Hooks failing"

# Check hook logs
grep wb-grep ~/.factory/hooks/debug.log

# Verify Ollama connectivity
curl http://localhost:11434/api/tags

"Watch mode not starting"

# Test hook manually
python3 ~/.factory/plugins/wb-grep/hooks/wb_grep_watch.py

# Check if already running
pgrep -f "wb-grep watch"

Commands

`wb-grep search <pattern> [path]` (default)

Search for code using natural language queries. This is the default command, so you can omit the word `search`.

# Basic search
wb-grep "function that validates user input"

# Search with path filter
wb-grep "API endpoints" src/routes

# Show more results
wb-grep -m 20 "logging configuration"

# Include code snippets in output
wb-grep -c "authentication middleware"

| Option | Description | Default |
|--------|-------------|---------|
| -m, --max-count <n> | Maximum number of results | 10 |
| -c, --content | Show code snippets in results | false |

Output Format:

./src/lib/auth.ts:45-67 (85.2%)
./src/middleware/session.ts:12-28 (73.8%)
./src/utils/jwt.ts:5-22 (68.4%)

The percentage indicates semantic similarity—higher means more relevant.
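Because each result line has a regular shape, scripts and agents can parse it. The following is a small sketch that assumes every result line looks exactly like the sample above (`path:start-end (score%)`); the `Hit` type and `parse_hit` helper are hypothetical names, not part of wb-grep.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    path: str
    start_line: int
    end_line: int
    score: float  # percentage as printed, e.g. 85.2


# Shape assumed from the sample output: path:start-end (score%)
_HIT_RE = re.compile(
    r"^(?P<path>.+):(?P<start>\d+)-(?P<end>\d+) \((?P<score>\d+(?:\.\d+)?)%\)$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one result line, or return None if it does not match."""
    match = _HIT_RE.match(line.strip())
    if match is None:
        return None
    return Hit(
        match["path"], int(match["start"]), int(match["end"]), float(match["score"])
    )


output = """\
./src/lib/auth.ts:45-67 (85.2%)
./src/middleware/session.ts:12-28 (73.8%)
./src/utils/jwt.ts:5-22 (68.4%)
"""
hits = [hit for hit in map(parse_hit, output.splitlines()) if hit is not None]
# Keep only results at or above an arbitrary 70% threshold.
strong = [hit.path for hit in hits if hit.score >= 70.0]
print(strong)
```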

`wb-grep watch`

Index the repository and keep it up-to-date as files change.

# Start watching (indexes first, then monitors changes)
wb-grep watch

# Dry run—show what would be indexed without actually indexing
wb-grep watch --dry-run

| Option | Description |
|--------|-------------|
| -d, --dry-run | Preview files without indexing |

What gets indexed:

Source code files (.ts, .js, .py, .go, .rs, .java, etc.)
Documentation (.md, .mdx, .txt)
Configuration files (.json, .yaml, .toml, .xml)
Shell scripts (.sh, .bash, .zsh)
And 50+ other file types

What gets ignored:

.gitignore patterns are respected
.wbgrepignore for additional exclusions
Binary files, lock files, build outputs
node_modules, .git, dist, build directories

`wb-grep index`

One-shot indexing without file watching. Useful for CI/CD or when you don't need continuous updates.

# Index current directory
wb-grep index

# Index a specific path
wb-grep index --path /path/to/project

# Clear existing index and rebuild from scratch
wb-grep index --clear

| Option | Description |
|--------|-------------|
| -c, --clear | Clear existing index before indexing |
| -p, --path <path> | Path to index (defaults to cwd) |

`wb-grep status`

Show index statistics and system status.

# Basic status
wb-grep status

# Detailed status with file list
wb-grep status --verbose

| Option | Description |
|--------|-------------|
| -v, --verbose | Show detailed information including indexed files |

Example Output:

📊 wb-grep Status

Index Statistics:
  Files indexed:    142
  Total chunks:     1,847
  Last sync:        2024-01-15T10:32:45.000Z

Vector Store:
  Unique files:     142
  Total vectors:    1,847

Ollama Status:
  Connected:        yes
  Model available:  yes
  Model:            qwen3-embedding:0.6b
  URL:              http://localhost:11434

`wb-grep clear`

Remove all indexed data and start fresh.

# Show warning (requires --force to actually clear)
wb-grep clear

# Actually clear the index
wb-grep clear --force

| Option | Description |
|--------|-------------|
| -f, --force | Required to confirm deletion |

Configuration

wb-grep can be configured via configuration files or environment variables.

Configuration Files

Create one of these files in your project root:

.wbgreprc
.wbgreprc.json
wbgrep.config.json

Example .wbgreprc.json:

{
  "ollama": {
    "baseURL": "http://localhost:11434",
    "model": "qwen3-embedding:0.6b",
    "timeout": 30000,
    "retries": 3
  },
  "indexing": {
    "batchSize": 10,
    "maxFileSize": 1048576,
    "concurrency": 8
  },
  "search": {
    "maxResults": 10,
    "showContent": false
  },
  "ignore": [
    "*.generated.ts",
    "vendor/**"
  ]
}

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| WBGREP_OLLAMA_URL | Ollama server URL | http://localhost:11434 |
| WBGREP_OLLAMA_MODEL | Embedding model name | qwen3-embedding:0.6b |
| WBGREP_OLLAMA_TIMEOUT | Request timeout (ms) | 30000 |
| WBGREP_OLLAMA_RETRIES | Number of retries | 3 |
| WBGREP_MAX_COUNT | Default max results | 10 |
| WBGREP_CONTENT | Show content by default | false |
| WBGREP_BATCH_SIZE | Indexing batch size | 10 |
| WBGREP_CONCURRENCY | Embedding concurrency | 8 |
| WBGREP_LOG_LEVEL | Log level (debug/info/warn/error) | info |

Example:

export WBGREP_MAX_COUNT=25
export WBGREP_CONTENT=true
wb-grep "authentication"
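To make the variable-plus-default behaviour concrete, here is a minimal sketch of resolving settings from the environment. It covers environment variables only, since how they combine with a configuration file is not specified here. The function names and the accepted boolean spellings ("1", "true", "yes") are assumptions for illustration, not wb-grep's actual parsing rules.

```python
import os

# Defaults copied from the environment variable table above.
DEFAULTS = {
    "WBGREP_OLLAMA_URL": "http://localhost:11434",
    "WBGREP_OLLAMA_MODEL": "qwen3-embedding:0.6b",
    "WBGREP_OLLAMA_TIMEOUT": 30000,
    "WBGREP_OLLAMA_RETRIES": 3,
    "WBGREP_MAX_COUNT": 10,
    "WBGREP_CONTENT": False,
    "WBGREP_BATCH_SIZE": 10,
    "WBGREP_CONCURRENCY": 8,
    "WBGREP_LOG_LEVEL": "info",
}


def _coerce(raw, default):
    """Convert a raw string to the type of its default value."""
    # The bool check must come first because bool is a subclass of int.
    if isinstance(default, bool):
        return raw.strip().lower() in ("1", "true", "yes")  # assumed spellings
    if isinstance(default, int):
        return int(raw)
    return raw


def resolve_settings(env=None):
    """Overlay environment values on the defaults."""
    env = os.environ if env is None else env
    return {
        name: _coerce(env[name], default) if name in env else default
        for name, default in DEFAULTS.items()
    }


settings = resolve_settings({"WBGREP_MAX_COUNT": "25", "WBGREP_CONTENT": "true"})
print(settings["WBGREP_MAX_COUNT"], settings["WBGREP_CONTENT"])
```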

Ignore Patterns

Create a .wbgrepignore file in your project root to exclude additional files:

# Exclude generated files
*.generated.ts
*.g.dart

# Exclude specific directories
legacy/**
experiments/**

# Exclude large data files
*.csv
*.parquet

The syntax follows .gitignore conventions.
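As a rough illustration of how such patterns select files, the sketch below implements a deliberately simplified matcher. It handles only the pattern shapes used in the example above (slash-free globs and `dir/**`) and omits negation, anchoring with a leading slash, and other .gitignore features. Paths are assumed to be relative to the project root with forward slashes; the function names are hypothetical.

```python
import fnmatch
import posixpath


def load_patterns(text):
    """Drop blank lines and comments from ignore-file text."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(path, patterns):
    """Simplified matcher; not a full .gitignore implementation."""
    for pattern in patterns:
        if pattern.endswith("/**"):
            # "legacy/**" ignores everything under legacy/
            if path.startswith(pattern[:-2]):
                return True
        elif "/" not in pattern:
            # Slash-free patterns match the file name at any depth
            if fnmatch.fnmatchcase(posixpath.basename(path), pattern):
                return True
        elif fnmatch.fnmatchcase(path, pattern):
            return True
    return False


ignore_text = """
# Exclude generated files
*.generated.ts
legacy/**
*.csv
"""
patterns = load_patterns(ignore_text)
for candidate in ["src/api.generated.ts", "legacy/old/util.js", "src/main.ts"]:
    print(candidate, is_ignored(candidate, patterns))
```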

Examples

Exploring a New Codebase

# Get an overview of the architecture
wb-grep "main entry point"
wb-grep "application initialization"
wb-grep "routing configuration"

# Find specific functionality
wb-grep "user authentication flow"
wb-grep "database migrations"
wb-grep "API rate limiting"

# Understand patterns
wb-grep "error handling patterns"
wb-grep "logging implementation"
wb-grep "dependency injection"

Debugging

# Find error-related code
wb-grep "where errors are thrown"
wb-grep "exception handling for network requests"

# Trace data flow
wb-grep "where user data is saved"
wb-grep "session storage implementation"

Code Review

# Find security-sensitive code
wb-grep "password hashing"
wb-grep "SQL query construction"
wb-grep "file upload handling"

# Check for patterns
wb-grep "deprecated API usage"
wb-grep "TODO comments about security"

With Path Filters

# Search only in specific directories
wb-grep "validation logic" src/validators
wb-grep "React hooks" src/components
wb-grep "test utilities" tests/

# Search across multiple areas
wb-grep "configuration parsing" src/config

Detailed Output

# Get code snippets with results
wb-grep -c "middleware chain"

# Output:
# ./src/middleware/index.ts:15-32 (89.3%)
#   export function createMiddlewareChain(middlewares: Middleware[]) {
#     return async (ctx: Context, next: NextFunction) => {
#       let index = 0;
#       const dispatch = async (i: number): Promise<void> => {
#         if (i <= index) throw new Error('next() called multiple times');
#         index = i;
#         const fn = middlewares[i];
#         if (!fn) return next();
#         await fn(ctx, () => dispatch(i + 1));
#       };
#       return dispatch(0);
#     };
#   }

Technical Details

Embedding Model

wb-grep uses Qwen3-Embedding-0.6B, a compact but powerful embedding model:

Dimensions: 1024
Context Length: 32K tokens
Size: ~600MB
Languages: Multilingual support

The model runs locally via Ollama, ensuring your code never leaves your machine.

Vector Storage

LanceDB provides the vector database:

Embedded database (no server required)
Fast approximate nearest neighbor search
Efficient storage with columnar format
Supports millions of vectors

Index data is stored in .wb-grep/ in your project root.

Code Chunking

Files are intelligently split into chunks that:

Preserve function/class boundaries where possible
Keep related code together
Respect a maximum chunk size (~2000 characters)
Include context (imports, surrounding code)
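The size limit can be illustrated with a much simpler strategy than the one described above. This sketch treats blank-line-separated blocks as the unit and greedily packs them into chunks of at most `max_chars` characters. It does not parse functions or classes, and all names are hypothetical, so read it as a stand-in for the general idea rather than wb-grep's chunker.

```python
def split_blocks(source):
    """Return (start_line, lines) for each run of non-blank lines (1-based)."""
    blocks = []
    current = []
    start = 0
    for number, line in enumerate(source.splitlines(), start=1):
        if line.strip():
            if not current:
                start = number
            current.append(line)
        elif current:
            blocks.append((start, current))
            current = []
    if current:
        blocks.append((start, current))
    return blocks


def chunk_source(source, max_chars=2000):
    """Pack blocks into (start_line, end_line, text) chunks of at most max_chars.

    A single block longer than max_chars is kept whole rather than split.
    Blocks are rejoined with one blank line, so runs of blank lines collapse.
    """
    chunks = []
    pending = None
    for start, lines in split_blocks(source):
        text = "\n".join(lines)
        end = start + len(lines) - 1
        if pending is None:
            pending = (start, end, text)
        elif len(pending[2]) + 2 + len(text) <= max_chars:
            pending = (pending[0], end, pending[2] + "\n\n" + text)
        else:
            chunks.append(pending)
            pending = (start, end, text)
    if pending is not None:
        chunks.append(pending)
    return chunks


sample = "import os\n\ndef a():\n    return 1\n\ndef b():\n    return 2\n"
for start, end, text in chunk_source(sample, max_chars=30):
    print(f"lines {start}-{end}: {len(text)} chars")
```

Tracking line ranges alongside the text is what allows results to be reported as `path:start-end`.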

File Structure

your-project/
├── .wb-grep/
│   ├── vectors/          # LanceDB vector store
│   └── state.json        # Index metadata
├── .wbgrepignore         # Custom ignore patterns
└── .wbgreprc.json        # Configuration (optional)

Troubleshooting

"Cannot connect to Ollama"

# Make sure Ollama is running
ollama serve

# Check if it's accessible
curl http://localhost:11434/api/tags

"Model not found"

# Pull the embedding model
ollama pull qwen3-embedding:0.6b

# Verify it's installed
ollama list

Search returns no results

# Check if files are indexed
wb-grep status

# If no files indexed, run watch or index
wb-grep watch

Index seems stale

# Rebuild the index from scratch
wb-grep index --clear

Performance issues

# Reduce concurrency if Ollama is overwhelmed
export WBGREP_CONCURRENCY=4

# Or increase timeout for slow systems
export WBGREP_OLLAMA_TIMEOUT=60000
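To see why lowering concurrency eases the load on Ollama, consider this sketch of batched embedding with a bounded worker pool. It assumes batch size and concurrency behave like their names suggest; how wb-grep actually schedules requests is not documented here, and `embed_all`, `batched`, and `fake_embed` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def embed_all(chunks, embed_batch, batch_size=10, concurrency=8):
    """Embed chunks in batches with at most `concurrency` calls in flight.

    embed_batch is a caller-supplied function: list of texts -> list of vectors.
    """
    batches = batched(chunks, batch_size)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so vectors line up with chunks
        results = pool.map(embed_batch, batches)
        return [vector for batch in results for vector in batch]


def fake_embed(texts):
    # Stand-in for a real embedding request: one 1-element vector per text
    return [[float(len(text))] for text in texts]


vectors = embed_all(
    [f"chunk {n}" for n in range(25)], fake_embed, batch_size=10, concurrency=4
)
print(len(vectors))
```

With a smaller `concurrency`, fewer requests reach the embedding server at once, trading indexing speed for stability on slower machines.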

Comparison with mgrep

wb-grep is derived from mgrep, a cloud-based semantic grep tool by Mixedbread AI. The key differences:

| Feature | mgrep | wb-grep |
|---------|-------|---------|
| Embedding Provider | Mixedbread Cloud | Local Ollama |
| Vector Storage | Mixedbread Cloud | Local LanceDB |
| Authentication | Required | None |
| API Costs | Pay per use | Free |
| Data Privacy | Cloud-based | 100% local |
| Model | Mixedbread proprietary | Qwen3-Embedding-0.6B |
| Multimodal | Images, PDFs | Code/text only |

Choose mgrep if you want cloud convenience, multimodal search, and don't mind API costs.

Choose wb-grep if you need fully local operation, data privacy, or want to avoid recurring costs.

Contributing & Development

For Users

Install from npm:

npm install -g wb-grep

Or clone from source:

git clone https://github.com/wb200/wb-grep.git
cd wb-grep
npm install
npm run build
npm link  # Makes 'wb-grep' available globally

For Developers

# Install dependencies
npm install

# Build
npm run build

# Development (watch mode)
npm run dev

# Lint and format
npm run lint
npm run format

# Type check
npm run typecheck

# Run tests (if configured)
npm test

Publishing Updates

# Update version
npm version patch  # or minor/major

# Publish to npm
npm publish

# Push tags to GitHub
git push origin --tags

Project Structure

wb-grep/
├── src/
│   ├── index.ts              # CLI entry point
│   ├── commands/
│   │   ├── search.ts         # Search command
│   │   ├── watch.ts          # Watch command
│   │   ├── index-cmd.ts      # Index command
│   │   ├── status.ts         # Status command
│   │   └── clear.ts          # Clear command
│   └── lib/
│       ├── embeddings.ts     # Ollama embedding client
│       ├── vector-store.ts   # LanceDB wrapper
│       ├── chunker.ts        # Code chunking logic
│       ├── indexer.ts        # Indexing orchestration
│       ├── index-state.ts    # State management
│       ├── file.ts           # File system utilities
│       ├── config.ts         # Configuration loading
│       ├── constants.ts      # Shared constants
│       └── logger.ts         # Logging utilities
├── dist/                     # Compiled output
├── package.json
├── tsconfig.json
└── biome.json

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

wb-grep is a derivative work of mgrep by Mixedbread AI, which is also licensed under Apache-2.0.

Acknowledgments

Original Work

wb-grep is based on mgrep by Mixedbread AI. The original mgrep provides cloud-based semantic code search using Mixedbread's embedding API. This derivative work adapts the core architecture for fully local operation.

Original Project: mixedbread-ai/mgrep
Original License: Apache-2.0
Original Authors: Mixedbread AI team

Other Dependencies

Ollama - Local LLM inference
LanceDB - Embedded vector database
Qwen - Embedding model
