# @claude-vector/core

v2.5.11
Core vector search engine for semantic code search. This package provides the fundamental building blocks for creating embeddings-based search systems.
## Features
- 🚀 High-performance vector similarity search
- 💾 Built-in caching system
- 🔧 Configurable chunk processing
- 📁 Smart project analysis
- 🎯 Multiple embedding model support
- 🔄 Extensible architecture
## Installation

```bash
npm install @claude-vector/core
```

## Quick Start
```js
import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';

// Create search engine with default config
const config = createDefaultConfig();
const searchEngine = new VectorSearchEngine(config);

// Initialize and search
await searchEngine.initialize('./your-project');
const results = await searchEngine.search('function definition', { limit: 5 });
console.log(results);
```

## Environment Setup
Set your OpenAI API key:

```bash
export OPENAI_API_KEY="sk-your-api-key-here"
```

Or create a `.env` file:

```
OPENAI_API_KEY=sk-your-api-key-here
```

## Project Analysis
The ProjectAdapter helps analyze your project structure and generate appropriate configurations:
```js
import { ProjectAdapter } from '@claude-vector/core';

const adapter = new ProjectAdapter('/path/to/project');

// Analyze project type and structure
const projectInfo = await adapter.analyzeProject();
// { type: 'nextjs', language: 'typescript', framework: 'next', ... }

// Get optimized configuration for your project
const config = await adapter.getConfig();

// Get all files matching the configuration
const files = await adapter.getFiles();
```

## Configuration
### Default Configuration

```js
{
  search: {
    threshold: 0.7,        // Minimum similarity score (0-1)
    maxResults: 10,        // Maximum results to return
    includeMetadata: true
  },
  embeddings: {
    model: 'text-embedding-3-small',
    batchSize: 100,
    dimensions: 1536
  },
  chunks: {
    maxSize: 1000,         // Maximum tokens per chunk
    minSize: 100,          // Minimum tokens per chunk
    overlap: 200,          // Token overlap between chunks
    splitByParagraph: true,
    preserveCodeBlocks: true
  },
  cache: {
    enabled: true,
    ttl: 3600,             // Cache TTL in seconds
    compression: true
  }
}
```

### Custom Configuration
Create a `.claude-search.config.js` in your project root:

```js
export default {
  patterns: {
    include: ['src/**/*.{js,ts}', 'docs/**/*.md'],
    exclude: ['**/*.test.js', '**/__tests__/**']
  },
  chunks: {
    maxSize: 1500,
    overlap: 300
  },
  search: {
    threshold: 0.8
  }
};
```

## API Reference
### VectorSearchEngine

#### Constructor Options

- `openaiApiKey` (string): OpenAI API key
- `embeddingModel` (string): Model to use for embeddings
- `searchThreshold` (number): Minimum similarity score (0-1)
- `maxResults` (number): Maximum results to return
- `cacheEnabled` (boolean): Enable/disable caching
- `cacheTTL` (number): Cache time-to-live in seconds
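For reference, the options above can be sketched as a TypeScript object type. This is a minimal illustration assembled from the list above; the interface name `VectorSearchEngineOptions` is illustrative, not part of the published API:

```typescript
// Hypothetical shape of the constructor options listed above.
interface VectorSearchEngineOptions {
  openaiApiKey: string;      // OpenAI API key
  embeddingModel?: string;   // Model to use for embeddings
  searchThreshold?: number;  // Minimum similarity score (0-1)
  maxResults?: number;       // Maximum results to return
  cacheEnabled?: boolean;    // Enable/disable caching
  cacheTTL?: number;         // Cache time-to-live in seconds
}

// Example values taken from the default configuration shown earlier.
const options: VectorSearchEngineOptions = {
  openaiApiKey: 'sk-your-api-key-here',
  embeddingModel: 'text-embedding-3-small',
  searchThreshold: 0.7,
  maxResults: 10,
  cacheEnabled: true,
  cacheTTL: 3600,
};
```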
#### Methods

- `loadIndex(embeddingsPath, chunksPath)`: Load pre-computed embeddings and chunks from JSON files.
- `search(query, options)`: Search for similar chunks using semantic similarity.
- `findRelated(chunkIndex, options)`: Find chunks similar to a given chunk.
- `generateQueryEmbedding(query)`: Generate an embedding vector for a query string.
- `getStats()`: Get index statistics, including chunk count, token count, and size estimates.
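Conceptually, `search` scores each chunk's embedding against the query embedding, keeps scores at or above the configured threshold, and returns the top results. The sketch below illustrates that ranking with cosine similarity; it is an assumption about the approach, not the package's actual implementation:

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunk embeddings against a query embedding, filter by threshold,
// and return the best matches first (mirrors searchThreshold/maxResults).
function rankChunks(
  queryEmbedding: number[],
  chunkEmbeddings: number[][],
  threshold = 0.7,
  maxResults = 10,
): { index: number; score: number }[] {
  return chunkEmbeddings
    .map((emb, index) => ({ index, score: cosineSimilarity(queryEmbedding, emb) }))
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```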
### ProjectAdapter

#### Methods

- `analyzeProject()`: Analyze project structure and detect type, framework, and features.
- `getDefaultConfig()`: Get the default configuration based on project type.
- `loadCustomConfig()`: Load custom configuration from project config files.
- `getConfig()`: Get the merged configuration (default + custom).
- `getFiles(config)`: Get all files matching the include/exclude patterns.
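The "default + custom" merge performed by `getConfig()` can be pictured as a recursive merge in which custom values win and nested sections are combined key by key. The merge semantics below are an assumption for illustration, not taken from the package source:

```typescript
type Config = { [key: string]: unknown };

// Recursively merge a custom config over a default config:
// custom scalar values replace defaults, nested objects merge key by key.
function mergeConfig(defaults: Config, custom: Config): Config {
  const merged: Config = { ...defaults };
  for (const [key, value] of Object.entries(custom)) {
    const base = merged[key];
    if (
      value && typeof value === 'object' && !Array.isArray(value) &&
      base && typeof base === 'object' && !Array.isArray(base)
    ) {
      merged[key] = mergeConfig(base as Config, value as Config);
    } else {
      merged[key] = value;
    }
  }
  return merged;
}
```

Under these semantics, a `.claude-search.config.js` only needs to list the keys it overrides; every other default survives the merge.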
## Caching

The built-in cache system helps improve performance by storing search results:

```js
import { SimpleCache } from '@claude-vector/core';

const cache = new SimpleCache('./cache', 3600); // 1 hour TTL

// Basic operations
await cache.set('key', { data: 'value' });
const value = await cache.get('key');
await cache.delete('key');

// Maintenance
await cache.cleanup();                 // Remove expired entries
const stats = await cache.getStats();  // Get cache statistics
```

## Advanced Usage
### Custom Embedding Models

```js
const engine = new VectorSearchEngine({
  embeddingModel: 'text-embedding-3-large',
  // Dimensions change based on model
  config: { embeddings: { dimensions: 3072 } }
});
```

### Batch Processing
For large codebases, process embeddings in batches:

```js
const config = {
  embeddings: {
    batchSize: 50,    // Process 50 chunks at a time
    maxRetries: 3,
    retryDelay: 2000
  }
};
```

## Type Definitions
TypeScript users can import the JSDoc-based type definitions:

```ts
import type {
  SearchOptions,
  SearchResult,
  ProjectConfig
} from '@claude-vector/core';
```

## Performance Tips
- **Pre-compute embeddings**: Generate embeddings once and reuse them
- **Enable caching**: Cache search results for repeated queries
- **Optimize chunk size**: Balance between context and performance
- **Use appropriate models**: Smaller models for speed, larger for accuracy
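The batching settings shown earlier (`batchSize`, `maxRetries`, `retryDelay`) combine naturally with the "pre-compute embeddings" tip. A minimal sketch of such a batching loop, where `embedBatch` is a hypothetical stand-in for a real embeddings API call:

```typescript
// Embed chunks in fixed-size batches, retrying each failed batch up to
// maxRetries times with a fixed delay between attempts.
async function embedInBatches(
  chunks: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  batchSize = 50,
  maxRetries = 3,
  retryDelay = 2000,
): Promise<number[][]> {
  const embeddings: number[][] = [];
  for (let start = 0; start < chunks.length; start += batchSize) {
    const batch = chunks.slice(start, start + batchSize);
    for (let attempt = 0; ; attempt++) {
      try {
        embeddings.push(...(await embedBatch(batch)));
        break; // batch succeeded
      } catch (err) {
        if (attempt + 1 >= maxRetries) throw err; // retries exhausted
        await new Promise((resolve) => setTimeout(resolve, retryDelay));
      }
    }
  }
  return embeddings;
}
```

The resulting vectors can then be written to disk once and loaded later with `loadIndex`, avoiding repeated embedding calls.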
## License

MIT
