indexer-ai
v2.1.0
High-performance universal codebase indexer for AI assistants and development tools
High-performance universal code indexer optimized for AI assistants and modern development tools.
- Production Ready: Enterprise-grade performance with Worker Threads and async I/O
- Performance: 3.6x faster indexing with parallel processing (10,000 files in ~25s)
- Clean Codebase: Major cleanup completed, removing 260+ instances of dead code
🚀 Latest Updates (September 2025)
Code Quality Improvements
- Dead Code Removal: Eliminated 67 unused functions, 145 unused exports, 48 unused imports
- Parser Consolidation: Merged 3 Python parsers into 1 unified Tree-sitter parser
- VS Code Extension: Removed to focus on core indexing functionality
- Logger Migration: Replaced 398 console.log statements with proper Logger class
- Syntax Fixes: Fixed all cleanup-related syntax errors across 9 files
Performance Enhancements (v2.0.2)
- Worker Threads: Parallel parsing using all CPU cores
- Async I/O: Non-blocking file operations throughout
- Node.js 20 LTS: Latest runtime optimizations
- ES2023 Target: Modern JavaScript features for better performance
Performance Benchmarks
| Files | Before | After | Improvement |
|-------|--------|-------|-------------|
| 100 | ~2s | ~0.7s | 2.8x faster |
| 1,000 | ~15s | ~5s | 3x faster |
| 10,000 | ~90s | ~25s | 3.6x faster |
| Memory | 300MB | 180MB | 40% reduction |
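The improvement column follows directly from the before and after timings. As a quick check, the sketch below recomputes the ratios from the table's own figures. The type and function names are illustrative and not part of indexer-ai. Since the timings are approximate, the computed values can differ slightly from the rounded column (2 / 0.7 is about 2.86, listed as 2.8x).

```typescript
// Benchmark rows copied from the table above; times are in seconds.
interface BenchmarkRow {
  files: number;
  beforeSeconds: number;
  afterSeconds: number;
}

const rows: BenchmarkRow[] = [
  { files: 100, beforeSeconds: 2, afterSeconds: 0.7 },
  { files: 1000, beforeSeconds: 15, afterSeconds: 5 },
  { files: 10000, beforeSeconds: 90, afterSeconds: 25 },
];

// Speedup is the ratio of the old time to the new time.
function speedup(row: BenchmarkRow): number {
  return row.beforeSeconds / row.afterSeconds;
}

// Throughput after the change, in files per second.
function throughput(row: BenchmarkRow): number {
  return row.files / row.afterSeconds;
}

// Relative reduction, used here for the memory row (300 MB -> 180 MB).
function reduction(before: number, after: number): number {
  return (before - after) / before;
}

for (const row of rows) {
  console.log(
    `${row.files} files: ${speedup(row).toFixed(2)}x faster, ` +
      `${throughput(row).toFixed(0)} files/s`,
  );
}
console.log(`memory: ${(reduction(300, 180) * 100).toFixed(0)}% reduction`);
```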
Features
Core Capabilities
- Lightning Fast: Index 10,000+ files in ~25 seconds with Worker Threads
- Multi-Language: 9 languages, including JavaScript, TypeScript, Python, Go, SQL, GraphQL, YAML, and Astro
- AI-Optimized: Built for Claude, GPT-4, and other LLM assistants
- Real-time Updates: File watching with automatic index refresh
- Organization-Wide: Index entire organizations and monorepos automatically
- Cross-Repository: Track API calls and dependencies between services
- Modern Architecture: ES2023, async/await, Worker Threads
🆕 Advanced Features
- Call Graph Analysis: Bidirectional function call tracking and dead code detection
- AI Compression: 50-70% size reduction with token-aware optimization
- Worker Thread Pool: Automatic parallel processing for large codebases
- Streaming Support: Handle massive files without memory issues
- Impact Analysis: Track cascading effects of code changes
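To make the call graph and impact analysis ideas concrete, here is a minimal sketch of how both can be derived from one graph. The `CallGraph` shape, the function names, and the sample data are assumptions for illustration only. They do not reflect indexer-ai's internal data model. Dead code is treated as anything unreachable from the entry points, and impact as the set of transitive callers.

```typescript
// Hypothetical call graph: each key is a function, each value lists the
// functions it calls. This shape is an assumption, not indexer-ai's schema.
type CallGraph = Record<string, string[]>;

// Invert the graph so callers can be looked up from a callee.
function invert(graph: CallGraph): CallGraph {
  const callers: CallGraph = {};
  for (const fn of Object.keys(graph)) callers[fn] = [];
  for (const [caller, callees] of Object.entries(graph)) {
    for (const callee of callees) {
      const list = callers[callee] ?? [];
      list.push(caller);
      callers[callee] = list;
    }
  }
  return callers;
}

// Breadth-first walk over the graph from a set of start nodes.
function reachable(graph: CallGraph, starts: string[]): Set<string> {
  const seen = new Set<string>(starts);
  const queue = starts.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of graph[current] ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}

// Dead code: functions never reached from any entry point.
function deadCode(graph: CallGraph, entryPoints: string[]): string[] {
  const live = reachable(graph, entryPoints);
  return Object.keys(graph).filter((fn) => !live.has(fn)).sort();
}

// Impact: every function that transitively calls the changed one.
function impactOf(graph: CallGraph, changed: string): string[] {
  const affected = reachable(invert(graph), [changed]);
  affected.delete(changed);
  return Array.from(affected).sort();
}

const sampleGraph: CallGraph = {
  main: ["login", "render"],
  login: ["hashPassword"],
  render: [],
  hashPassword: [],
  legacyExport: ["hashPassword"],
};

console.log(deadCode(sampleGraph, ["main"])); // [ 'legacyExport' ]
console.log(impactOf(sampleGraph, "hashPassword")); // [ 'legacyExport', 'login', 'main' ]
```

Keeping both directions of the graph is what makes the tracking bidirectional: the forward edges answer "what does this call?", and the inverted edges answer "what breaks if this changes?".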
Export Formats
- JSON: Complete index with compression options (standard, compressed, minified)
- Markdown: Human-readable documentation
- Mermaid: Interactive diagrams for VS Code/Cursor
- GraphViz: Professional dependency graphs
- ASCII: Terminal-friendly visualizations
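For a sense of what the diagram formats contain, the sketch below renders the same dependency edges as Mermaid flowchart text and as GraphViz DOT. The `Edge` type and both functions are hypothetical, and the real exporter's output may be structured differently. The sketch also assumes names contain no double quotes.

```typescript
// Hypothetical dependency edge between two modules or services.
interface Edge {
  from: string;
  to: string;
}

// Node ids are restricted to letters, digits, and underscores here, so
// paths are sanitized for the id and the original text is kept as the label.
function nodeId(label: string): string {
  return label.replace(/[^A-Za-z0-9_]/g, "_");
}

// Mermaid flowchart, top-down layout.
function toMermaid(edges: Edge[]): string {
  const lines = ["graph TD"];
  for (const { from, to } of edges) {
    lines.push(`  ${nodeId(from)}["${from}"] --> ${nodeId(to)}["${to}"]`);
  }
  return lines.join("\n");
}

// GraphViz DOT directed graph with quoted node names.
function toDot(edges: Edge[]): string {
  const lines = ["digraph deps {"];
  for (const { from, to } of edges) {
    lines.push(`  "${from}" -> "${to}";`);
  }
  lines.push("}");
  return lines.join("\n");
}

const edges: Edge[] = [
  { from: "src/a.ts", to: "src/b.ts" },
  { from: "src/b.ts", to: "src/c.ts" },
];

console.log(toMermaid(edges));
console.log(toDot(edges));
```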
Quick Start
Requirements
- Node.js >= 20.0.0 (LTS recommended)
- npm or yarn
- 4GB RAM recommended for large codebases
Installation Prerequisites
Linux/WSL Requirements
For tree-sitter and native dependencies to compile:
# Ubuntu/Debian/WSL
sudo apt-get update
sudo apt-get install -y build-essential python3
# macOS (if needed)
xcode-select --install
Installation
# Install globally from npm (recommended)
npm install -g indexer-ai
# Quick usage with short command
idxr # Shortest command (4 chars!)
indexer # Alternative command
indexer-ai # Full package name
# Or install from source
git clone https://github.com/tacit-code/indexer.git
cd indexer
yarn install && yarn build
npm install -g .
Basic Usage
# Smart mode - analyzes everything automatically
idxr # Quick 4-character command!
# or
indexer
# Index entire organization (all repos in subdirectories)
cd /your/organization
idxr
# Index specific project with Worker Threads (automatic for >50 files)
idxr scan /path/to/project
# Index with specific options
indexer scan --parallel 8 --output custom-index.json
# Watch mode with real-time updates
idxr watch
# Interactive chat with Claude about your codebase
idxr chat
# Query the index
idxr query "function.*Auth" --fuzzy
Multi-Repository & Organization Indexing
The indexer automatically detects and analyzes entire organization structures, monorepos, and multi-repository setups without configuration.
Organization-Wide Indexing
# Index your entire organization
cd /path/to/organization # Parent directory containing all repos
indexer # Automatically indexes ALL repositories
# Example: Clone Global organization structure
/clone-global/
├── indexer/ # This tool
├── backend/ # API services
├── frontend/ # Web applications
├── mobile/ # Mobile apps
├── skills/ # Microservices
└── data-ops/ # Data pipelines
# Run from parent directory:
cd /clone-global
indexer # Creates comprehensive cross-repository knowledge graph
Automatic Detection
The SmartIndexer automatically detects:
- Monorepo structures: lerna.json, yarn workspaces, pnpm workspaces
- Multi-repository setups: Multiple .git directories
- Service architectures: Microservices, APIs, frontends
- Shared dependencies: Cross-repository imports and libraries
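The marker files listed above suggest a simple decision procedure. The following is a minimal sketch of one way to classify a directory. It is not SmartIndexer's actual implementation. `DirectoryListing`, `detectLayout`, and the rule that two or more nested `.git` directories mean a multi-repository setup are all assumptions. The file names `lerna.json` and `pnpm-workspace.yaml` and the `workspaces` field in `package.json` are the standard markers for those tools.

```typescript
type Layout = "monorepo" | "multi-repo" | "single-repo" | "unknown";

// Hypothetical input: names found at the root, plus the names found
// inside each immediate subdirectory.
interface DirectoryListing {
  rootEntries: string[];
  children: Record<string, string[]>;
  // Parsed root package.json, if one exists.
  packageJson?: { workspaces?: unknown };
}

function detectLayout(listing: DirectoryListing): Layout {
  const root = new Set(listing.rootEntries);

  // Workspace markers: lerna.json, pnpm-workspace.yaml, or a
  // "workspaces" field in package.json (Yarn and npm workspaces).
  const hasWorkspaceMarker =
    root.has("lerna.json") ||
    root.has("pnpm-workspace.yaml") ||
    listing.packageJson?.workspaces !== undefined;
  if (hasWorkspaceMarker) return "monorepo";

  // Count subdirectories that carry their own .git directory.
  const nestedRepos = Object.values(listing.children).filter((entries) =>
    entries.includes(".git"),
  ).length;
  if (nestedRepos >= 2) return "multi-repo";

  if (root.has(".git") || nestedRepos === 1) return "single-repo";
  return "unknown";
}

// Mirrors the organization layout shown earlier: several sibling repositories.
const organization: DirectoryListing = {
  rootEntries: ["indexer", "backend", "frontend"],
  children: {
    indexer: [".git", "package.json"],
    backend: [".git", "go.mod"],
    frontend: [".git", "package.json"],
  },
};

console.log(detectLayout(organization)); // multi-repo
```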
Cross-Repository Analysis
Tracks relationships across your entire codebase:
- Frontend → Backend: API calls, GraphQL queries, REST endpoints
- Service → Service: Inter-service communication, event streams
- Shared Libraries: Import/export dependencies, version tracking
- Database Schemas: Cross-service data flows and dependencies
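As an illustration of the Frontend → Backend case, the sketch below pairs client-side requests with backend routes by method and path. Every type and function here is hypothetical, the route syntax is assumed to be Express-style (`:id` placeholders), and indexer-ai may resolve these links quite differently.

```typescript
// Hypothetical shapes; indexer-ai's real index format is not shown here.
interface Route {
  service: string;
  method: string;
  path: string; // Express-style, e.g. "/api/users/:id"
}

interface ApiCall {
  caller: string; // file that issues the request
  method: string;
  url: string;
}

interface Link {
  caller: string;
  service: string;
  route: string;
}

// Turn "/api/users/:id" into a RegExp where ":id" matches one path segment.
function routeToRegExp(path: string): RegExp {
  const pattern = path
    .split("/")
    .map((segment) =>
      segment.startsWith(":")
        ? "[^/]+"
        : segment.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"),
    )
    .join("/");
  return new RegExp("^" + pattern + "/?$");
}

// Pair each client call with the first backend route that accepts it.
function linkCalls(calls: ApiCall[], routes: Route[]): Link[] {
  const links: Link[] = [];
  for (const call of calls) {
    const pathOnly = call.url.split("?")[0] ?? call.url; // drop the query string
    const match = routes.find(
      (route) =>
        route.method === call.method && routeToRegExp(route.path).test(pathOnly),
    );
    if (match) {
      links.push({ caller: call.caller, service: match.service, route: match.path });
    }
  }
  return links;
}

const routes: Route[] = [
  { service: "backend", method: "GET", path: "/api/users/:id" },
  { service: "backend", method: "POST", path: "/api/orders" },
];

const calls: ApiCall[] = [
  { caller: "frontend/src/profile.ts", method: "GET", url: "/api/users/42?expand=true" },
  { caller: "mobile/src/cart.ts", method: "POST", url: "/api/orders" },
  { caller: "frontend/src/legacy.ts", method: "GET", url: "/api/v0/ping" },
];

// The first two calls resolve to backend routes; the third has no match.
console.log(linkCalls(calls, routes));
```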
Generated Outputs
.indexer-output/current/
├── PROJECT_INDEX.json # Combined index of ALL repositories
├── service-graph.json # Complete dependency graph
├── multi-repo-overview.md # Visual architecture diagram
├── multi-repo-interactive.html # Interactive dependency explorer
└── [repo-name]/ # Individual repository indexes
    └── PROJECT_INDEX.json # Repo-specific index
Use Cases
- Architecture Documentation: Auto-generate system architecture diagrams
- Dependency Analysis: Find all consumers of an API endpoint
- Impact Assessment: See affected services before making changes
- Code Navigation: Jump between repos following API calls
- AI Context: Give LLMs complete understanding of your entire system
Benefits for AI Assistants
When you provide the generated PROJECT_INDEX.json to Claude, GPT-4, or other AI assistants:
- Complete Context: AI understands your entire organization from a single file
- Cross-Repo Intelligence: AI can trace API calls across service boundaries
- Accurate Suggestions: AI knows exact function signatures and dependencies
- Reduced Token Usage: Compressed index uses 50-70% fewer tokens than raw code
- System-Wide Refactoring: AI can suggest changes considering all affected services
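One way to see how an index's size affects the context it consumes is to compare a pretty-printed and a minified serialization of the same data. This sketch only shows whitespace removal, so it does not reproduce the 50-70% figure, which depends on the tool's token-aware optimization. The `FileEntry` shape is an assumption, and the four-characters-per-token estimate is a rough rule of thumb, not a real tokenizer.

```typescript
// Hypothetical index entry; the real PROJECT_INDEX.json schema may differ.
interface FileEntry {
  path: string;
  functions: Array<{ name: string; complexity: number }>;
}

// Fraction of characters saved by the compact form (0.5 means half the size).
function sizeReduction(original: string, compact: string): number {
  return 1 - compact.length / original.length;
}

// Very rough token estimate: about four characters per token. This is a
// rule of thumb, not a real tokenizer, and is only used for comparison.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const sampleIndex: FileEntry[] = [
  {
    path: "src/auth/login.ts",
    functions: [
      { name: "login", complexity: 4 },
      { name: "logout", complexity: 1 },
    ],
  },
  { path: "src/auth/hash.ts", functions: [{ name: "hashPassword", complexity: 2 }] },
];

const pretty = JSON.stringify(sampleIndex, null, 2);
const minified = JSON.stringify(sampleIndex);

console.log(`pretty: ${pretty.length} chars, ~${estimateTokens(pretty)} tokens`);
console.log(`minified: ${minified.length} chars, ~${estimateTokens(minified)} tokens`);
console.log(`saved: ${(sizeReduction(pretty, minified) * 100).toFixed(0)}%`);
```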
Performance Configuration
Optimize for Your System
# Maximum performance (uses all CPU cores; on macOS, replace $(nproc) with $(sysctl -n hw.ncpu))
indexer scan --parallel $(nproc)
# Memory-constrained environment
indexer scan --parallel 2 --max-memory 256
# Disable Worker Threads for debugging
indexer scan --no-workers
# Incremental mode for large codebases
indexer scan --incremental
Environment Variables
# Performance
INDEXER_PARALLEL=8 # Number of parallel workers
INDEXER_MAX_MEMORY=1000 # Max memory in MB
INDEXER_USE_WORKERS=true # Enable Worker Threads
# Node.js 20+ optimizations
NODE_OPTIONS="--max-old-space-size=4096" # 4GB heap
UV_THREADPOOL_SIZE=16 # Larger thread pool
# AI Features
ANTHROPIC_API_KEY=sk-ant-... # Claude integration
Architecture
Modern Tech Stack
- Runtime: Node.js 20 LTS with native ES modules support
- Language: TypeScript 5.3+ with ES2023 target
- Parallelization: Worker Threads for CPU-intensive parsing
- Async I/O: Promises-based file system operations
- Parsing: Tree-sitter (Python), Babel (JS/TS), native AST parsers
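Parallel parsing needs some way to divide files among workers. The sketch below shows one simple scheme, round-robin batching, which gives each worker a similar number of files. The function name and the scheme itself are assumptions. The README does not say how indexer-ai actually schedules work.

```typescript
// Split work round-robin so each worker receives a similar number of items.
function partitionIntoBatches<T>(items: T[], workerCount: number): T[][] {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError("workerCount must be a positive integer");
  }
  // Never create more batches than there are items (but always at least one).
  const batchCount = Math.min(workerCount, Math.max(items.length, 1));
  const batches: T[][] = Array.from({ length: batchCount }, (): T[] => []);
  items.forEach((item, index) => {
    const batch = batches[index % batchCount];
    if (batch) batch.push(item);
  });
  return batches;
}

const files = ["a.ts", "b.ts", "c.py", "d.go", "e.sql"];
console.log(partitionIntoBatches(files, 2));
// [ [ 'a.ts', 'c.py', 'e.sql' ], [ 'b.ts', 'd.go' ] ]
```

Round-robin ignores file size, so a real scheduler might instead hand out files from a shared queue as workers become idle.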
Performance Architecture
┌─────────────────────────────────────┐
│             Main Thread             │
│  ┌─────────────────────────────┐    │
│  │     Orchestration Layer     │    │
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │     Worker Thread Pool      │    │
│  │  ┌────┐ ┌────┐ ... ┌────┐   │    │
│  │  │ W1 │ │ W2 │     │ Wn │   │    │
│  │  └────┘ └────┘     └────┘   │    │
│  └─────────────────────────────┘    │
│                                     │
│  ┌─────────────────────────────┐    │
│  │       Async I/O Layer       │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
API Server
REST API
# Start API server
indexer api --port 4000
# Endpoints
POST /api/index # Build index
GET /api/index/status # Get status
POST /api/query # Query index
GET /api/stats # Statistics
POST /api/ai/analyze # AI analysis
GraphQL API
query {
  index {
    files {
      path
      functions {
        name
        complexity
      }
    }
  }
}
Advanced Features
Multi-Repository Analysis
# Analyze entire organization
indexer multi-repo /path/to/org --cross-dependencies
# Generate knowledge graph
indexer export mermaid --multi-repo
AI-Powered Analysis
# Security scanning
indexer ai security-scan
# Bug prediction
indexer ai predict-bugs --confidence 0.8
# Code smell detection
indexer ai detect-smells
Call Graph Analysis
# Find unused code
indexer analyze dead-code
# Trace execution paths
indexer analyze call-paths main
# Detect circular dependencies
indexer analyze circular
Configuration
.indexer.yml
version: 2
performance:
  parallel: 8
  useWorkers: true
  maxMemory: 1000
  cache: true
include:
  - "**/*.{js,jsx,ts,tsx,py,go,sql}"
ignore:
  - "**/node_modules/**"
  - "**/dist/**"
export:
  formats:
    json:
      compression: true
      maxSize: 10MB
Integrations
IDE Support
- Cursor: Native integration with AI features
- WebStorm: Via REST API
- Vim/Neovim: LSP integration
CI/CD
- GitHub Actions: Pre-built workflows
- GitLab CI: Docker images available
- Jenkins: Plugin support
- CircleCI: Orb available
Monitoring
- Datadog: APM and metrics integration
- New Relic: Performance monitoring
- Sentry: Error tracking
- Grafana: Custom dashboards
Performance Tips
- Use Worker Threads for codebases >50 files (automatic)
- Enable incremental mode for large projects
- Configure parallel workers based on CPU cores
- Use compression for large indexes
- Enable caching for repeated operations
- Set appropriate memory limits for your system
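The tips above can be combined into a small heuristic for choosing scan flags. The sketch below is illustrative only. `recommendScanFlags` and the per-worker memory budget are assumptions, while the 50-file threshold and the `--parallel` and `--max-memory` flags come from this README.

```typescript
interface SystemInfo {
  cpuCores: number;
  freeMemoryMb: number;
}

// Threshold taken from the tips above: Worker Threads apply above 50 files.
const WORKER_THRESHOLD = 50;
// Assumed memory budget per worker in MB; purely illustrative.
const ASSUMED_MB_PER_WORKER = 128;

function recommendScanFlags(fileCount: number, system: SystemInfo): string[] {
  // Small codebases: keep the defaults, parallelism brings little benefit.
  if (fileCount <= WORKER_THRESHOLD) return [];

  // Cap the worker count by both CPU cores and available memory.
  const byMemory = Math.max(1, Math.floor(system.freeMemoryMb / ASSUMED_MB_PER_WORKER));
  const parallel = Math.max(1, Math.min(system.cpuCores, byMemory));
  const flags = ["--parallel", String(parallel)];

  // When memory is the limiting factor, also pass an explicit memory limit.
  if (byMemory < system.cpuCores) {
    flags.push("--max-memory", String(system.freeMemoryMb));
  }
  return flags;
}

console.log(recommendScanFlags(10000, { cpuCores: 8, freeMemoryMb: 16384 }).join(" "));
// --parallel 8
console.log(recommendScanFlags(10000, { cpuCores: 4, freeMemoryMb: 256 }).join(" "));
// --parallel 2 --max-memory 256
```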
Troubleshooting
Common Issues
Out of Memory
# Increase Node.js heap size
NODE_OPTIONS="--max-old-space-size=8192" indexer scan
Slow Performance
# Check Worker Thread status
indexer debug --workers
# Profile performance
indexer scan --profile
Parser Errors
# Use fallback parser
indexer scan --parser-fallback
# Skip problematic files
indexer scan --skip-errors
Development
Building
npm install
npm run build # Compiles TypeScript and Worker scripts
npm run dev # Watch mode
npm test # Run tests
Testing
npm test # Run all tests (~25% coverage)
npm run test:unit # Unit tests
npm run test:e2e # End-to-end tests (Cypress)
npm run test:coverage # Generate coverage report
Current Test Status: 6 test suites passing with ~25% coverage. Target: 80%.
Contributing
See CONTRIBUTING.md for development guidelines.
License
MIT © Clone Global
Support
- Documentation: docs/
- Issues: GitHub Issues
- Discord: Join our community
Built for the future of AI-assisted development 🚀
