indexer-ai

v2.1.0

Published

4 months ago

High-performance universal codebase indexer for AI assistants and development tools

jamesrosing

indexer indexer-ai idxr code-analysis ast parser codebase ai-tools claude copilot cursor ai-assistant development-tools

indexer-ai

High-performance universal code indexer optimized for AI assistants and modern development tools.

Production Ready: Enterprise-grade performance with Worker Threads and async I/O
Performance: 3.6x faster indexing with parallel processing (10,000 files in ~25s)
Clean Codebase: Major cleanup completed - removed 260+ instances of dead code

🚀 Latest Updates (September 2025)

Code Quality Improvements

Dead Code Removal: Eliminated 67 unused functions, 145 unused exports, 48 unused imports
Parser Consolidation: Merged 3 Python parsers into 1 unified Tree-sitter parser
VS Code Extension: Removed to focus on core indexing functionality
Logger Migration: Replaced 398 console.log statements with proper Logger class
Syntax Fixes: Fixed all cleanup-related syntax errors across 9 files

Performance Enhancements (v2.0.2)

Worker Threads: Parallel parsing using all CPU cores
Async I/O: Non-blocking file operations throughout
Node.js 20 LTS: Latest runtime optimizations
ES2023 Target: Modern JavaScript features for better performance

Performance Benchmarks

| Files | Before | After | Improvement |
|-------|--------|-------|-------------|
| 100 | ~2s | ~0.7s | 2.8x faster |
| 1,000 | ~15s | ~5s | 3x faster |
| 10,000 | ~90s | ~25s | 3.6x faster |
| Memory | 300MB | 180MB | 40% reduction |
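
The improvement figures follow from dividing the "before" time by the "after" time, and the memory figure from the relative drop. The sketch below recomputes them from the rounded values in the table; the helper names are illustrative and not part of indexer-ai.

```typescript
// Recompute the "Improvement" column from the rounded figures in the table above.
// Helper names are illustrative; they are not part of indexer-ai.

/** Speedup factor: how many times faster the new run is. */
function speedup(beforeSeconds: number, afterSeconds: number): number {
  return beforeSeconds / afterSeconds;
}

/** Relative reduction as a percentage, e.g. for memory usage. */
function reductionPercent(before: number, after: number): number {
  return ((before - after) * 100) / before;
}

const tenThousandFiles = speedup(90, 25);       // 3.6
const oneThousandFiles = speedup(15, 5);        // 3
const memorySaved = reductionPercent(300, 180); // 40
```

Because the timings are approximate, the 100-file row works out to roughly 2.86 from the rounded inputs, which is close to the 2.8x listed.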

Features

Core Capabilities

Lightning Fast: Index 10,000+ files in ~25 seconds with Worker Threads
Multi-Language: JavaScript, TypeScript, Python, Go, SQL, GraphQL, YAML, and Astro
AI-Optimized: Built for Claude, GPT-4, and other LLM assistants
Real-time Updates: File watching with automatic index refresh
Organization-Wide: Index entire organizations and monorepos automatically
Cross-Repository: Track API calls and dependencies between services
Modern Architecture: ES2023, async/await, Worker Threads

🆕 Advanced Features

Call Graph Analysis: Bidirectional function call tracking and dead code detection
AI Compression: 50-70% size reduction with token-aware optimization
Worker Thread Pool: Automatic parallel processing for large codebases
Streaming Support: Handle massive files without memory issues
Impact Analysis: Track cascading effects of code changes

Export Formats

JSON: Complete index with compression options (standard, compressed, minified)
Markdown: Human-readable documentation
Mermaid: Interactive diagrams for VS Code/Cursor
GraphViz: Professional dependency graphs
ASCII: Terminal-friendly visualizations

Quick Start

Requirements

Node.js >= 20.0.0 (LTS recommended)
npm or yarn
4GB RAM recommended for large codebases

Installation Prerequisites

Linux/WSL Requirements

For tree-sitter and native dependencies to compile:

# Ubuntu/Debian/WSL
sudo apt-get update
sudo apt-get install -y build-essential python3

# macOS (if needed)
xcode-select --install

Installation

# Install globally from npm (recommended)
npm install -g indexer-ai

# Quick usage with short command
idxr                    # Shortest command (4 chars!)
indexer                 # Alternative command
indexer-ai              # Full package name

# Or install from source
git clone https://github.com/tacit-code/indexer.git
cd indexer
yarn install && yarn build
npm install -g .

Basic Usage

# Smart mode - analyzes everything automatically
idxr                     # Quick 4-character command!
# or
indexer

# Index entire organization (all repos in subdirectories)
cd /your/organization
idxr

# Index a specific project (Worker Threads are enabled automatically above 50 files)
idxr scan /path/to/project

# Index with specific options
indexer scan --parallel 8 --output custom-index.json

# Watch mode with real-time updates
idxr watch

# Interactive chat with Claude about your codebase
idxr chat

# Query the index
idxr query "function.*Auth" --fuzzy

Multi-Repository & Organization Indexing

The indexer automatically detects and analyzes entire organization structures, monorepos, and multi-repository setups without configuration.

Organization-Wide Indexing

# Index your entire organization
cd /path/to/organization  # Parent directory containing all repos
indexer                    # Automatically indexes ALL repositories

# Example: Clone Global organization structure
/clone-global/
├── indexer/         # This tool
├── backend/         # API services
├── frontend/        # Web applications
├── mobile/          # Mobile apps
├── skills/          # Microservices
└── data-ops/        # Data pipelines

# Run from parent directory:
cd /clone-global
indexer  # Creates comprehensive cross-repository knowledge graph

Automatic Detection

The SmartIndexer automatically detects:

Monorepo structures: lerna.json, yarn workspaces, pnpm workspaces
Multi-repository setups: Multiple .git directories
Service architectures: Microservices, APIs, frontends
Shared dependencies: Cross-repository imports and libraries

Cross-Repository Analysis

Tracks relationships across your entire codebase:

Frontend → Backend: API calls, GraphQL queries, REST endpoints
Service → Service: Inter-service communication, event streams
Shared Libraries: Import/export dependencies, version tracking
Database Schemas: Cross-service data flows and dependencies
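
Questions such as "which services consume this one?" amount to walking the dependency edges in reverse. The following is a minimal sketch under an assumed graph shape (service name mapped to the services it calls); it does not reflect the actual service-graph.json schema, and `affectedBy` is a hypothetical helper.

```typescript
// Sketch: find every service affected by a change, given "A depends on B" edges.
// The graph shape and function name are assumptions, not the service-graph.json schema.

type DependencyGraph = Record<string, string[]>; // service -> services it calls

function affectedBy(graph: DependencyGraph, changed: string): string[] {
  // Invert the edges so we can walk from a service to its consumers.
  const consumers = new Map<string, string[]>();
  for (const [service, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      const list = consumers.get(dep) ?? [];
      list.push(service);
      consumers.set(dep, list);
    }
  }

  // Breadth-first walk over consumers, so transitive dependents are included.
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  const affected: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const consumer of consumers.get(current) ?? []) {
      if (!seen.has(consumer)) {
        seen.add(consumer);
        affected.push(consumer);
        queue.push(consumer);
      }
    }
  }
  return affected.sort();
}

const serviceGraph: DependencyGraph = {
  frontend: ["backend"],
  mobile: ["backend"],
  backend: ["data-ops"],
  "data-ops": [],
};
const impacted = affectedBy(serviceGraph, "data-ops");
// impacted: ["backend", "frontend", "mobile"]
```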

Generated Outputs

.indexer-output/current/
├── PROJECT_INDEX.json          # Combined index of ALL repositories
├── service-graph.json          # Complete dependency graph
├── multi-repo-overview.md      # Visual architecture diagram
├── multi-repo-interactive.html # Interactive dependency explorer
└── [repo-name]/               # Individual repository indexes
    └── PROJECT_INDEX.json     # Repo-specific index

Use Cases

Architecture Documentation: Auto-generate system architecture diagrams
Dependency Analysis: Find all consumers of an API endpoint
Impact Assessment: See affected services before making changes
Code Navigation: Jump between repos following API calls
AI Context: Give LLMs complete understanding of your entire system

Benefits for AI Assistants

When you provide the generated PROJECT_INDEX.json to Claude, GPT-4, or other AI assistants:

Complete Context: AI understands your entire organization from a single file
Cross-Repo Intelligence: AI can trace API calls across service boundaries
Accurate Suggestions: AI knows exact function signatures and dependencies
Reduced Token Usage: Compressed index uses 50-70% fewer tokens than raw code
System-Wide Refactoring: AI can suggest changes considering all affected services

Performance Configuration

Optimize for Your System

# Maximum performance (uses all CPU cores; nproc is a Linux tool, on macOS use $(sysctl -n hw.ncpu))
indexer scan --parallel $(nproc)

# Memory-constrained environment
indexer scan --parallel 2 --max-memory 256

# Disable Worker Threads for debugging
indexer scan --no-workers

# Incremental mode for large codebases
indexer scan --incremental
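
One common way to implement an incremental run is to compare per-file fingerprints (for example content hashes) from the previous run with the current ones and re-index only what changed. The sketch below shows that idea; whether `--incremental` works this way is an assumption, and all names are hypothetical.

```typescript
// Sketch of change detection for an incremental run: compare per-file
// fingerprints (e.g. content hashes) from the previous run with the current ones.
// Names and the fingerprint scheme are assumptions about how --incremental could work.

type Fingerprints = Record<string, string>; // path -> content hash

interface ChangeSet {
  toReindex: string[]; // new or modified files
  toDrop: string[];    // files that disappeared since the last run
}

function diffFingerprints(previous: Fingerprints, current: Fingerprints): ChangeSet {
  const toReindex = Object.keys(current)
    .filter((path) => previous[path] !== current[path])
    .sort();
  const toDrop = Object.keys(previous)
    .filter((path) => !(path in current))
    .sort();
  return { toReindex, toDrop };
}

const changes = diffFingerprints(
  { "src/a.ts": "h1", "src/b.ts": "h2", "src/old.ts": "h3" },
  { "src/a.ts": "h1", "src/b.ts": "h2-modified", "src/new.ts": "h4" },
);
// changes.toReindex: ["src/b.ts", "src/new.ts"]
// changes.toDrop:    ["src/old.ts"]
```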

Environment Variables

# Performance
INDEXER_PARALLEL=8           # Number of parallel workers
INDEXER_MAX_MEMORY=1000      # Max memory in MB
INDEXER_USE_WORKERS=true     # Enable Worker Threads

# Node.js 20+ optimizations
NODE_OPTIONS="--max-old-space-size=4096"  # 4GB heap
UV_THREADPOOL_SIZE=16        # Larger thread pool

# AI Features
ANTHROPIC_API_KEY=sk-ant-... # Claude integration
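
To show how the performance variables could map onto a typed configuration, here is a small sketch. The fallback values and the `loadPerformanceConfig` name are assumptions made for illustration; the tool's real defaults are not documented in this README.

```typescript
// Sketch: turn the performance environment variables above into a typed config object.
// The defaults and the `loadPerformanceConfig` name are assumptions for illustration.

interface PerformanceConfig {
  parallel: number;
  maxMemoryMb: number;
  useWorkers: boolean;
}

type Env = Record<string, string | undefined>;

function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const value = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

function loadPerformanceConfig(env: Env): PerformanceConfig {
  return {
    parallel: parsePositiveInt(env["INDEXER_PARALLEL"], 4),
    maxMemoryMb: parsePositiveInt(env["INDEXER_MAX_MEMORY"], 1000),
    // Treat anything except an explicit "false" as enabled.
    useWorkers: env["INDEXER_USE_WORKERS"] !== "false",
  };
}

const perfConfig = loadPerformanceConfig({
  INDEXER_PARALLEL: "8",
  INDEXER_USE_WORKERS: "true",
});
// perfConfig: { parallel: 8, maxMemoryMb: 1000, useWorkers: true }
```

In a Node.js program the `env` argument would typically be `process.env`.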

Architecture

Modern Tech Stack

Runtime: Node.js 20 LTS with native ES modules support
Language: TypeScript 5.3+ with ES2023 target
Parallelization: Worker Threads for CPU-intensive parsing
Async I/O: Promises-based file system operations
Parsing: Tree-sitter (Python), Babel (JS/TS), native AST parsers

Performance Architecture

┌─────────────────────────────────────┐
│         Main Thread                  │
│  ┌─────────────────────────────┐    │
│  │   Orchestration Layer       │    │
│  └──────────┬──────────────────┘    │
│             │                        │
│  ┌──────────▼──────────────────┐    │
│  │   Worker Thread Pool        │    │
│  │  ┌────┐ ┌────┐ ... ┌────┐  │    │
│  │  │ W1 │ │ W2 │     │ Wn │  │    │
│  │  └────┘ └────┘     └────┘  │    │
│  └─────────────────────────────┘    │
│                                      │
│  ┌─────────────────────────────┐    │
│  │   Async I/O Layer           │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
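
The hand-off from the orchestration layer to the worker pool can be pictured as splitting the file list into one batch per worker. The sketch below uses round-robin batching; the 50-file threshold is taken from this README, while the function name and the batching strategy are illustrative assumptions rather than the tool's actual scheduler.

```typescript
// Sketch of how an orchestration layer might hand files to a worker pool.
// The 50-file threshold comes from this README; everything else is illustrative.

const WORKER_THRESHOLD = 50;

/** Split `files` into `workers` round-robin batches, or one batch for small inputs. */
function planBatches(files: string[], workers: number): string[][] {
  // Small projects stay on the main thread as a single batch.
  if (files.length <= WORKER_THRESHOLD || workers <= 1) {
    return [files.slice()];
  }
  // Worker w receives files w, w + workers, w + 2 * workers, ...
  return Array.from({ length: workers }, (_slot, w) =>
    files.filter((_file, i) => i % workers === w),
  );
}

const sourceFiles = Array.from({ length: 120 }, (_slot, i) => `src/file${i}.ts`);
const batches = planBatches(sourceFiles, 8);
// 8 batches of 15 files each
```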

API Server

REST API

# Start API server
indexer api --port 4000

# Endpoints
POST /api/index          # Build index
GET  /api/index/status   # Get status
POST /api/query          # Query index
GET  /api/stats          # Statistics
POST /api/ai/analyze     # AI analysis

GraphQL API

query {
  index {
    files {
      path
      functions {
        name
        complexity
      }
    }
  }
}
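
On the client side, the fields selected by this query suggest a response shape like the one sketched below, together with a helper that lists functions above a complexity threshold. The response envelope and the `complexFunctions` helper are assumptions; check the actual schema before relying on them.

```typescript
// Types mirroring the fields selected by the query above, plus a helper that
// lists functions at or above a complexity threshold. The response envelope and
// helper name are assumptions, not the published schema.

interface IndexedFunction {
  name: string;
  complexity: number;
}

interface IndexedFile {
  path: string;
  functions: IndexedFunction[];
}

interface IndexQueryResult {
  index: { files: IndexedFile[] };
}

function complexFunctions(result: IndexQueryResult, minComplexity: number): string[] {
  return result.index.files.flatMap((file) =>
    file.functions
      .filter((fn) => fn.complexity >= minComplexity)
      .map((fn) => `${file.path}#${fn.name}`),
  );
}

const sampleResult: IndexQueryResult = {
  index: {
    files: [
      {
        path: "src/auth.ts",
        functions: [
          { name: "login", complexity: 12 },
          { name: "logout", complexity: 2 },
        ],
      },
      { path: "src/util.ts", functions: [{ name: "slugify", complexity: 3 }] },
    ],
  },
};
const hotspots = complexFunctions(sampleResult, 10);
// hotspots: ["src/auth.ts#login"]
```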

Advanced Features

Multi-Repository Analysis

# Analyze entire organization
indexer multi-repo /path/to/org --cross-dependencies

# Generate knowledge graph
indexer export mermaid --multi-repo

AI-Powered Analysis

# Security scanning
indexer ai security-scan

# Bug prediction
indexer ai predict-bugs --confidence 0.8

# Code smell detection
indexer ai detect-smells

Call Graph Analysis

# Find unused code
indexer analyze dead-code

# Trace execution paths
indexer analyze call-paths main

# Detect circular dependencies
indexer analyze circular
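
Conceptually, dead-code detection on a call graph reports every function that cannot be reached from an entry point. The sketch below illustrates that idea with a simple reachability walk; the data shape and `findDeadCode` are hypothetical and are not the analyzer's real interface.

```typescript
// Sketch of dead-code detection on a call graph: anything not reachable from
// an entry point is reported as unused. Data shape and names are illustrative.

type CallGraph = Record<string, string[]>; // function -> functions it calls

function findDeadCode(callGraph: CallGraph, entryPoints: string[]): string[] {
  const reachable = new Set<string>();
  const stack = [...entryPoints];

  // Depth-first walk along "calls" edges; the visited set also guards against cycles.
  while (stack.length > 0) {
    const fn = stack.pop() as string;
    if (reachable.has(fn)) continue;
    reachable.add(fn);
    stack.push(...(callGraph[fn] ?? []));
  }

  return Object.keys(callGraph)
    .filter((fn) => !reachable.has(fn))
    .sort();
}

const calls: CallGraph = {
  main: ["parseArgs", "run"],
  run: ["loadConfig", "run"], // self-recursion is handled by the visited set
  parseArgs: [],
  loadConfig: [],
  legacyExport: ["formatV1"],
  formatV1: [],
};
const unused = findDeadCode(calls, ["main"]);
// unused: ["formatV1", "legacyExport"]
```

Inverting the same map (callee to callers) gives the other direction of a bidirectional call graph.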

Configuration

.indexer.yml

version: 2
performance:
  parallel: 8
  useWorkers: true
  maxMemory: 1000
  cache: true

include:
  - "**/*.{js,jsx,ts,tsx,py,go,sql}"

ignore:
  - "**/node_modules/**"
  - "**/dist/**"

export:
  formats:
    json:
      compression: true
      maxSize: 10MB
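
To make the effect of the include and ignore patterns concrete, the sketch below applies a simplified version of them to individual paths. It handles only the two pattern shapes shown in the config ("**/*.{a,b}" and "**/dir/**"); a real implementation would rely on a full glob matcher, and `shouldIndex` is a hypothetical helper.

```typescript
// Simplified illustration of the include/ignore rules from the config above.
// It handles only the two pattern shapes shown ("**/*.{a,b}" and "**/dir/**");
// a real implementation would use a full glob library.

const includeExtensions = ["js", "jsx", "ts", "tsx", "py", "go", "sql"];
const ignoredDirectories = ["node_modules", "dist"];

function shouldIndex(path: string): boolean {
  const segments = path.split("/");
  const fileName = segments[segments.length - 1] ?? "";
  const directories = segments.slice(0, -1);

  // "**/node_modules/**" and "**/dist/**": skip anything under these directories.
  if (directories.some((dir) => ignoredDirectories.includes(dir))) {
    return false;
  }

  // "**/*.{js,jsx,...}": keep files whose final extension is in the list.
  // Requiring dot > 0 leaves out dotfiles such as ".js".
  const dot = fileName.lastIndexOf(".");
  const extension = dot > 0 ? fileName.slice(dot + 1) : "";
  return includeExtensions.includes(extension);
}
```

For example, `shouldIndex("src/app.tsx")` is true, while `shouldIndex("node_modules/lodash/index.js")` and `shouldIndex("docs/readme.md")` are false.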

Integrations

IDE Support

Cursor: Native integration with AI features
WebStorm: Via REST API
Vim/Neovim: LSP integration

CI/CD

GitHub Actions: Pre-built workflows
GitLab CI: Docker images available
Jenkins: Plugin support
CircleCI: Orb available

Monitoring

Datadog: APM and metrics integration
New Relic: Performance monitoring
Sentry: Error tracking
Grafana: Custom dashboards

Performance Tips

Use Worker Threads for codebases >50 files (automatic)
Enable incremental mode for large projects
Configure parallel workers based on CPU cores
Use compression for large indexes
Enable caching for repeated operations
Set appropriate memory limits for your system

Troubleshooting

Common Issues

Out of Memory

# Increase Node.js heap size
NODE_OPTIONS="--max-old-space-size=8192" indexer scan

Slow Performance

# Check Worker Thread status
indexer debug --workers

# Profile performance
indexer scan --profile

Parser Errors

# Use fallback parser
indexer scan --parser-fallback

# Skip problematic files
indexer scan --skip-errors

Development

Building

npm install
npm run build     # Compiles TypeScript and Worker scripts
npm run dev       # Watch mode
npm test          # Run tests

Testing

npm test              # Run all tests (~25% coverage)
npm run test:unit     # Unit tests
npm run test:e2e      # End-to-end tests (Cypress)
npm run test:coverage # Generate coverage report

Current Test Status: 6 test suites passing with ~25% coverage. Target: 80%.

Contributing

See CONTRIBUTING.md for development guidelines.

License

Support

Documentation: docs/
Issues: GitHub Issues
Discord: Join our community

Built for the future of AI-assisted development 🚀
