@sibyllinesoft/lens
v1.0.0-rc.2
Published
Local sharded code search system with three-layer processing pipeline
Downloads
3
Maintainers
Readme
Lens 🔍
Lightning-Fast Code Search That Actually Understands Your Code
Search your entire codebase in under 20ms - from fuzzy function names to complex semantic queries. Lens combines the speed of traditional text search with the intelligence of semantic understanding.
⚡ Why Lens?
Traditional search tools miss the mark:
- Text search is fast but doesn't understand code structure
- IDE search is limited to one project at a time
- Semantic search is smart but too slow for real-time use
- AST tools are precise but don't handle typos or natural language
Lens gives you the best of everything:
- 🏎️ Sub-20ms responses for any query size
- 🧠 Code-aware search that understands functions, classes, and symbols
- 🔤 Fuzzy matching that handles typos and variations
- 🌐 Semantic understanding for natural language queries
- 📊 Multi-repo support - search across your entire codebase
- 🛡️ Production-ready with comprehensive monitoring and benchmarking
🚀 Quick Start - Get Running in 2 Minutes
1. Install & Start
# Install Lens
npm install -g [email protected]
# Start the server
lens server
# ✅ Server running on http://localhost:30002. Index Your Code
# Index current directory
curl -X POST http://localhost:3000/index \
-H "Content-Type: application/json" \
-d '{"repo_path": ".", "repo_sha": "main"}'
# ✅ Indexed 1,247 files in 3.2 seconds3. Search Like Magic
# Find functions by name (fuzzy matching)
curl -X POST http://localhost:3000/search \
-d '{"repo_sha": "main", "q": "calcTotal", "mode": "hybrid", "k": 10}'
# Find classes and interfaces
curl -X POST http://localhost:3000/search \
-d '{"repo_sha": "main", "q": "class User", "mode": "struct", "k": 5}'
# Natural language search
curl -X POST http://localhost:3000/search \
-d '{"repo_sha": "main", "q": "authentication logic", "mode": "hybrid", "k": 10}'Response in < 20ms:
{
"hits": [
{
"file": "src/auth/User.ts",
"line": 15,
"snippet": "class User implements UserInterface {",
"score": 0.95,
"why": ["exact", "symbol"]
}
],
"total": 1,
"latency_ms": {
"stage_a": 4, // Lightning-fast text search
"stage_b": 6, // Code structure analysis
"stage_c": 8, // Semantic reranking
"total": 18
}
}💡 What Makes Lens Different?
The 3-Stage Intelligence Pipeline
Lens doesn't just do text search - it understands your code through three complementary layers:
🔤 Stage A: Smart Text Search (2-8ms)
- Fuzzy matching - finds
calcTotaleven if you typecalctotalorcalc_total - Subtoken awareness - understands camelCase and snake_case conventions
- N-gram indexing - blazing fast even on massive codebases
🧠 Stage B: Code Structure (3-10ms)
- Symbol resolution - knows the difference between function definitions and calls
- AST parsing - understands language syntax and structure
- Cross-references - finds where functions are defined vs used
🌐 Stage C: Semantic Understanding (5-15ms)
- Natural language - search for "user authentication logic"
- Code similarity - find similar functions even with different names
- Context-aware ranking - prioritizes results based on semantic relevance
🏆 Use Cases - Real Developers, Real Problems
👨💻 For Developers
# "Where did I define that helper function?"
curl -X POST localhost:3000/search -d '{"q": "formatDate", "mode": "struct"}'
# "Show me all the places where users are authenticated"
curl -X POST localhost:3000/search -d '{"q": "user auth", "mode": "hybrid"}'
# "Find similar error handling patterns"
curl -X POST localhost:3000/search -d '{"q": "try catch finally", "mode": "struct"}'🏢 For Teams & Organizations
- Code reviews: Find similar implementations across repos
- Refactoring: Identify all usages of deprecated functions
- Onboarding: Help new team members discover existing code
- Security audits: Search for potential security patterns
🤖 For AI & Tools
- Code completion: Better context for AI assistants
- Documentation: Auto-generate docs from code discovery
- Dependency analysis: Understand code relationships
- Migration tools: Find patterns to update automatically
🛠️ Language Support
Lens understands the structure and semantics of multiple programming languages:
| Language | Fuzzy Search | Symbol Recognition | AST Parsing | Semantic Search | |----------|:------------:|:------------------:|:-----------:|:---------------:| | TypeScript/JavaScript | ✅ | ✅ | ✅ | ✅ | | Python | ✅ | ✅ | ✅ | ✅ | | Rust | ✅ | ✅ | ✅ | ✅ | | Go | ✅ | ✅ | ✅ | ✅ | | Java | ✅ | ✅ | ✅ | ✅ | | Bash | ✅ | ✅ | ✅ | ⚠️ |
New languages added regularly - contribute support for your favorite language!
⚙️ Installation Options
NPM (Recommended)
# Install globally
npm install -g [email protected]
# Or install locally
npm install lens
npx lens serverDocker
# Quick start with Docker
docker run -p 3000:3000 -v $(pwd):/code lens:1.0.0
# Or with Docker Compose
curl -O https://raw.githubusercontent.com/lens-search/lens/main/docker-compose.yml
docker-compose upFrom Source
git clone https://github.com/lens-search/lens.git
cd lens
npm install
npm run build
npm start🎛️ Configuration
Basic Configuration
# Environment variables
export LENS_PORT=3000
export LENS_HOST=0.0.0.0
# Start with custom settings
LENS_PORT=8080 lens serverAdvanced Configuration
Create lens.config.json:
{
"server": {
"port": 3000,
"host": "0.0.0.0"
},
"performance": {
"max_concurrent_queries": 100,
"stage_timeouts": {
"stage_a": 200, // Lexical search timeout (ms)
"stage_b": 300, // Symbol search timeout (ms)
"stage_c": 300 // Semantic rerank timeout (ms)
}
},
"features": {
"fuzzy_search": true,
"semantic_rerank": true,
"learned_rerank": true
}
}Performance Tuning
# For large codebases (>1M files)
export LENS_MEMORY_LIMIT_GB=16
export LENS_MAX_CANDIDATES=500
# For low-latency requirements
export LENS_SEMANTIC_RERANK=false
export LENS_STAGE_A_ONLY=true🔌 API Reference
Core Endpoints
POST /search - Main Search
Search across your codebase with multiple modes:
curl -X POST http://localhost:3000/search \
-H "Content-Type: application/json" \
-d '{
"repo_sha": "main",
"q": "user authentication",
"mode": "hybrid", // "lex", "struct", or "hybrid"
"fuzzy": 1, // Edit distance (0-2)
"k": 10 // Number of results (1-200)
}'POST /index - Index Repository
Add a codebase to the search index:
curl -X POST http://localhost:3000/index \
-H "Content-Type: application/json" \
-d '{
"repo_path": "/path/to/code",
"repo_sha": "main" // Unique identifier
}'GET /health - System Health
Check system status and performance:
curl http://localhost:3000/health
# Returns: {"status": "ok", "shards_healthy": 5, "latency_p95": 18}Search Modes Explained
| Mode | Best For | Speed | Accuracy |
|------|----------|:-----:|:--------:|
| lex | Quick text search, exact matches | ⚡⚡⚡ | ⭐⭐ |
| struct | Code patterns, AST queries | ⚡⚡ | ⭐⭐⭐ |
| hybrid | Natural language, semantic similarity | ⚡ | ⭐⭐⭐⭐ |
Response Format
{
"hits": [
{
"file": "src/auth/login.ts",
"line": 42,
"col": 8,
"snippet": "function authenticateUser(credentials) {",
"score": 0.95,
"why": ["exact", "symbol"],
"symbol_kind": "function"
}
],
"total": 1,
"latency_ms": {"stage_a": 4, "stage_b": 6, "stage_c": 8, "total": 18}
}🚨 Troubleshooting
Common Issues
❌ "Repository not found in index"
# Solution: Index the repository first
curl -X POST http://localhost:3000/index \
-H "Content-Type: application/json" \
-d '{"repo_path": ".", "repo_sha": "main"}'❌ "Search returns no results"
# Check if repo_sha matches indexed value
curl http://localhost:3000/health
# Try broader search with fuzzy matching
curl -X POST http://localhost:3000/search \
-d '{"repo_sha": "main", "q": "function", "mode": "lex", "fuzzy": 2}'❌ "Slow search performance"
# Check latency breakdown
curl -X POST http://localhost:3000/search \
-H "X-Trace-Id: debug-123" \
-d '{"repo_sha": "main", "q": "test", "mode": "lex", "k": 5}'
# Disable semantic rerank for speed
curl -X POST http://localhost:3000/search \
-d '{"repo_sha": "main", "q": "test", "mode": "struct", "k": 5}'Performance Tips
- Use
mode: "lex"for fastest results (2-8ms) - Limit
kparameter to what you actually need (≤50 typically) - Use
fuzzy: 0when you need exact matches only - Monitor
/healthendpoint for system bottlenecks
🏗️ Architecture - The Technical Details
For architects and advanced users who want to understand how Lens achieves sub-20ms performance
Three-Layer Processing Pipeline
Query: "user authentication logic"
↓
[Stage A: Lexical+Fuzzy Search] // 2-8ms
├─ N-gram indexing
├─ Fuzzy matching (≤2 edits)
└─ Subtoken analysis
↓ (~100-200 candidates)
[Stage B: Symbol/AST Analysis] // 3-10ms
├─ Universal-ctags symbol resolution
├─ Tree-sitter AST parsing
└─ Structural pattern matching
↓ (~50-100 candidates)
[Stage C: Semantic Reranking] // 5-15ms (optional)
├─ ColBERT-v2 vector similarity
├─ Context-aware scoring
└─ Final ranking
↓ (Top 10-50 results)
Final Results: 18ms totalTechnology Stack
- Core Engine: TypeScript + Fastify
- Indexing: Memory-mapped segments, NATS/JetStream messaging
- AST Parsing: Tree-sitter, universal-ctags
- Vector Search: ColBERT-v2, HNSW indexing
- Observability: OpenTelemetry tracing & metrics
- Storage: Append-only segments with periodic compaction
📈 Production & Monitoring
Built-in Observability
- Real-time metrics at
/metricsendpoint - Distributed tracing with OpenTelemetry
- Performance breakdown by search stage
- Health monitoring with automatic alerts
Benchmarking & Validation
# Run performance benchmarks
npm run benchmark:smoke # Quick 5-minute benchmark
npm run benchmark:full # Comprehensive evaluation
# Validate system health
npm run gates:validate # Check all quality gatesProduction Deployment
# docker-compose.yml
version: '3.8'
services:
lens:
image: lens:1.0.0
ports: ["3000:3000"]
volumes: ["./code:/code:ro"]
environment:
LENS_MEMORY_LIMIT_GB: 8
LENS_MAX_CONCURRENT_QUERIES: 200
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:3000/health"]🤝 Contributing & Community
Get Involved
- 📖 Documentation: Full docs
- 🐛 Issues: Report bugs
- 💬 Discussions: Join the community
- 🛠️ Development: See CONTRIBUTING.md
Roadmap
- ✅ v1.0: Core search with 3-stage pipeline
- 🚧 v1.1: IDE extensions (VS Code, IntelliJ)
- 📋 v1.2: GraphQL/REST API integrations
- 🔮 v2.0: Multi-language neural reranking
📄 License & Credits
MIT License - See LICENSE for details.
Built with love for developers who deserve better code search.
Special thanks to the open source community and the researchers behind ColBERT, Tree-sitter, and universal-ctags.
Ready to search your code like never before?
npm install -g [email protected]
lens server🔍 Happy searching!
