@sylphx/coderag

Version: v0.1.23

Published: a day ago

Semantic code search with vector embeddings - RAG-ready for AI assistants

Downloads: 2,525

Vulnerabilities: 0 high, 0 medium, 0 low

Maintainers: ansonsylphx, shtse8

Keywords: rag, code-search, retrieval-augmented-generation, vector-search, embeddings, tfidf, semantic-search, code-indexing, mcp

@sylphx/coderag

Core library for semantic code search using vector embeddings with TF-IDF fallback.

Installation

bun add @sylphx/coderag

Features

🔍 Semantic Search - Vector search with embeddings, TF-IDF fallback
🌳 AST-based Chunking - Smart code splitting using Synth parsers
💾 Persistent Storage - SQLite-backed index for instant startup
⚡ Incremental Updates - Only reindex changed files
👁️ File Watching - Auto-update index on file changes
🧠 Embeddings - OpenAI embeddings for semantic search

Quick Start

import { CodebaseIndexer, PersistentStorage } from '@sylphx/coderag'

// Create indexer with persistent storage
const storage = new PersistentStorage({ codebaseRoot: './my-project' })
const indexer = new CodebaseIndexer({
  codebaseRoot: './my-project',
  storage,
})

// Index (instant on subsequent runs)
await indexer.index({ watch: true })

// Search
const results = await indexer.search('authentication', { limit: 10 })

API

`CodebaseIndexer`

Main class for indexing and searching.

const indexer = new CodebaseIndexer({
  codebaseRoot: string,          // Project root path
  storage?: Storage,             // Storage backend (default: in-memory)
  maxFileSize?: number,          // Max file size in bytes (default: 1MB)
  onFileChange?: (event) => void // File change callback
})

// Methods
await indexer.index(options)     // Index codebase
await indexer.search(query, options) // Search
await indexer.startWatch()       // Start file watcher
await indexer.stopWatch()        // Stop file watcher

`PersistentStorage`

SQLite-backed persistent storage.

const storage = new PersistentStorage({
  codebaseRoot: string,          // Project root (for .coderag/ folder)
  dbPath?: string                // Custom database path
})

`buildSearchIndex` / `searchDocuments`

Low-level TF-IDF functions.

import { buildSearchIndex, searchDocuments } from '@sylphx/coderag'

const documents = [
  { uri: 'file://auth.ts', content: '...' },
  { uri: 'file://user.ts', content: '...' },
]

const index = buildSearchIndex(documents)
const results = searchDocuments('auth', index, { limit: 5 })
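
To make the TF-IDF fallback concrete, here is a minimal, self-contained sketch of TF-IDF ranking. It is not the library's implementation: `buildTfidfIndex`, `rankByTfidf`, the tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration.

```typescript
// Illustrative TF-IDF ranking; NOT how @sylphx/coderag implements it.
interface Doc {
  uri: string
  content: string
}

interface IndexedDoc {
  uri: string
  termFreq: Map<string, number> // raw count of each term in the document
  length: number                // total number of tokens in the document
}

interface TfidfIndex {
  docs: IndexedDoc[]
  docFreq: Map<string, number>  // number of documents containing each term
}

// Hypothetical tokenizer: lowercase, split on anything that is not a word character.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9_]+/).filter((t) => t.length > 0)
}

function buildTfidfIndex(documents: Doc[]): TfidfIndex {
  const docFreq = new Map<string, number>()
  const docs = documents.map((doc) => {
    const tokens = tokenize(doc.content)
    const termFreq = new Map<string, number>()
    for (const token of tokens) {
      termFreq.set(token, (termFreq.get(token) ?? 0) + 1)
    }
    // Each distinct term in this document counts once toward document frequency.
    termFreq.forEach((_count, term) => {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1)
    })
    return { uri: doc.uri, termFreq, length: tokens.length }
  })
  return { docs, docFreq }
}

function rankByTfidf(query: string, index: TfidfIndex, limit = 5): { uri: string; score: number }[] {
  const terms = tokenize(query)
  const total = index.docs.length
  return index.docs
    .map((doc) => {
      let score = 0
      for (const term of terms) {
        const tf = (doc.termFreq.get(term) ?? 0) / Math.max(doc.length, 1)
        // Smoothed IDF (an assumption): rarer terms get a larger weight.
        const idf = Math.log(1 + total / (1 + (index.docFreq.get(term) ?? 0)))
        score += tf * idf
      }
      return { uri: doc.uri, score }
    })
    .filter((result) => result.score > 0) // drop documents with no matching term
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
}
```

A document only scores above zero when it contains at least one query term, which is why keyword-style search like this works as a fallback when no embeddings are available.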

AST Chunking

Splits code into chunks along AST boundaries using Synth parsers.

import { chunkCodeByAST } from '@sylphx/coderag'

const chunks = await chunkCodeByAST(code, 'typescript', {
  maxChunkSize: 1500,
  minChunkSize: 100,
})
// Returns: [{ content, type, startLine, endLine }, ...]
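
As a rough illustration of what size bounds such as `maxChunkSize` and `minChunkSize` can mean, the sketch below greedily packs adjacent source units into chunks. It is an assumption about one possible strategy, not the library's algorithm; `SourceUnit` and `packUnits` are hypothetical names, and the real chunker works on AST nodes rather than pre-split strings.

```typescript
// Hypothetical greedy packing of adjacent units; NOT the chunkCodeByAST algorithm.
interface SourceUnit {
  content: string
  startLine: number
  endLine: number
}

function packUnits(units: SourceUnit[], maxChunkSize: number, minChunkSize: number): SourceUnit[] {
  const chunks: SourceUnit[] = []
  let current: SourceUnit | null = null

  for (const unit of units) {
    if (current === null) {
      current = { ...unit }
      continue
    }
    // +1 accounts for the newline used to join the two units.
    const mergedSize = current.content.length + 1 + unit.content.length
    if (current.content.length < minChunkSize && mergedSize <= maxChunkSize) {
      // The current chunk is still too small and the merged result fits: merge.
      current = {
        content: current.content + '\n' + unit.content,
        startLine: current.startLine,
        endLine: unit.endLine,
      }
    } else {
      // Close the current chunk and start a new one with this unit.
      chunks.push(current)
      current = { ...unit }
    }
  }
  if (current !== null) chunks.push(current)
  return chunks
}
```

In this sketch a unit larger than `maxChunkSize` is emitted whole rather than split; an AST-based chunker might instead descend into the node's children.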

Supported languages (15+):

JavaScript: JS, TS, JSX, TSX
Systems: Python, Go, Java, C, Rust
Markup: Markdown, HTML, XML
Data/Config: JSON, YAML, TOML, INI, Protobuf

Vector Storage

Stores embedding vectors and searches them by similarity, for semantic search.

import { VectorStorage, createEmbeddingProvider } from '@sylphx/coderag'

const provider = await createEmbeddingProvider({
  provider: 'openai',
  model: 'text-embedding-3-small',
})

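const vectorStorage = new VectorStorage()

// `embedding` and `queryEmbedding` are embedding vectors generated with `provider`
// for the document text and the search query respectively (generation not shown).
await vectorStorage.addDocument('doc1', embedding, { path: 'auth.ts' })

const results = await vectorStorage.search(queryEmbedding, { limit: 5 })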
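
For readers new to vector search, the following sketch shows the general idea behind searching stored embeddings: rank documents by cosine similarity to the query vector. It is a brute-force illustration under stated assumptions, not the library's `VectorStorage`; `TinyVectorStore` and `cosineSimilarity` are hypothetical names, and real embeddings have hundreds or thousands of dimensions rather than three.

```typescript
// Brute-force cosine-similarity search; NOT the library's VectorStorage.
interface StoredVector {
  id: string
  embedding: number[]
  metadata: Record<string, string>
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions must match')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

class TinyVectorStore {
  private items: StoredVector[] = []

  add(id: string, embedding: number[], metadata: Record<string, string> = {}): void {
    this.items.push({ id, embedding, metadata })
  }

  // Score every stored vector against the query and return the closest ones.
  search(query: number[], limit = 5): { id: string; similarity: number }[] {
    return this.items
      .map((item) => ({ id: item.id, similarity: cosineSimilarity(query, item.embedding) }))
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, limit)
  }
}

// Usage with toy 3-dimensional vectors
const store = new TinyVectorStore()
store.add('doc1', [1, 0, 0], { path: 'auth.ts' })
store.add('doc2', [0, 1, 0], { path: 'user.ts' })
const nearest = store.search([0.9, 0.1, 0], 1)
```

Scanning every vector is linear in the number of documents, which is fine for a sketch; production stores typically use an approximate nearest-neighbor index instead.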

Search Options

interface SearchOptions {
  limit?: number           // Max results (default: 10)
  includeContent?: boolean // Include snippets (default: true)
  fileExtensions?: string[] // Filter by extension
  pathFilter?: string      // Filter by path pattern
  excludePaths?: string[]  // Exclude paths
}
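
The sketch below shows one plausible reading of how filters like these could narrow a list of candidate paths. The matching rules are assumptions (suffix match for `fileExtensions`, substring match for `pathFilter` and `excludePaths`); the library's actual semantics may differ, and `PathFilterOptions` and `filterPaths` are hypothetical names.

```typescript
// Hypothetical filter semantics; the library's actual matching rules may differ.
interface PathFilterOptions {
  limit?: number
  fileExtensions?: string[]
  pathFilter?: string
  excludePaths?: string[]
}

function filterPaths(paths: string[], options: PathFilterOptions = {}): string[] {
  const { limit = 10, fileExtensions, pathFilter, excludePaths = [] } = options
  return paths
    // Keep only paths ending in one of the requested extensions, if any were given.
    .filter((p) => !fileExtensions || fileExtensions.some((ext) => p.endsWith(ext)))
    // Keep only paths containing the filter string, if one was given.
    .filter((p) => !pathFilter || p.includes(pathFilter))
    // Drop paths containing any excluded fragment.
    .filter((p) => !excludePaths.some((excluded) => p.includes(excluded)))
    .slice(0, limit)
}

const kept = filterPaths(
  ['src/auth/login.ts', 'src/auth/login.test.ts', 'src/user/profile.ts', 'node_modules/x/index.ts', 'README.md'],
  { fileExtensions: ['.ts'], pathFilter: 'src/', excludePaths: ['.test.'] },
)
```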

Performance

| Metric | Value |
|--------|-------|
| Indexing speed | ~1000-2000 files/sec |
| Startup with cache | <100ms |
| Search latency | <50ms |
| Memory per 1000 files | ~1-2 MB |

License

MIT

Powered by Sylphx

Built with @sylphx/synth parsers (15+ languages)
