# @claude-vector/core

v2.5.11
Core vector search engine for semantic code search. This package provides the fundamental building blocks for creating embeddings-based search systems.
## Features
- 🚀 High-performance vector similarity search
- 💾 Built-in caching system
- 🔧 Configurable chunk processing
- 📁 Smart project analysis
- 🎯 Multiple embedding model support
- 🔄 Extensible architecture
## Installation

```bash
npm install @claude-vector/core
```

## Quick Start
```js
import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';

// Create search engine with default config
const config = createDefaultConfig();
const searchEngine = new VectorSearchEngine(config);

// Initialize and search
await searchEngine.initialize('./your-project');
const results = await searchEngine.search('function definition', { limit: 5 });
console.log(results);
```

## Environment Setup
Set your OpenAI API key:

```bash
export OPENAI_API_KEY="sk-your-api-key-here"
```

Or create a `.env` file:

```
OPENAI_API_KEY=sk-your-api-key-here
```

## Project Analysis
The ProjectAdapter helps analyze your project structure and generate appropriate configurations:
```js
import { ProjectAdapter } from '@claude-vector/core';

const adapter = new ProjectAdapter('/path/to/project');

// Analyze project type and structure
const projectInfo = await adapter.analyzeProject();
// { type: 'nextjs', language: 'typescript', framework: 'next', ... }

// Get optimized configuration for your project
const config = await adapter.getConfig();

// Get all files matching the configuration
const files = await adapter.getFiles();
```

## Configuration
### Default Configuration

```js
{
  search: {
    threshold: 0.7,        // Minimum similarity score (0-1)
    maxResults: 10,        // Maximum results to return
    includeMetadata: true
  },
  embeddings: {
    model: 'text-embedding-3-small',
    batchSize: 100,
    dimensions: 1536
  },
  chunks: {
    maxSize: 1000,         // Maximum tokens per chunk
    minSize: 100,          // Minimum tokens per chunk
    overlap: 200,          // Token overlap between chunks
    splitByParagraph: true,
    preserveCodeBlocks: true
  },
  cache: {
    enabled: true,
    ttl: 3600,             // Cache TTL in seconds
    compression: true
  }
}
```

### Custom Configuration
Create a `.claude-search.config.js` in your project root:

```js
export default {
  patterns: {
    include: ['src/**/*.{js,ts}', 'docs/**/*.md'],
    exclude: ['**/*.test.js', '**/__tests__/**']
  },
  chunks: {
    maxSize: 1500,
    overlap: 300
  },
  search: {
    threshold: 0.8
  }
};
```

## API Reference
### VectorSearchEngine

#### Constructor Options

- `openaiApiKey` (string): OpenAI API key
- `embeddingModel` (string): Model to use for embeddings
- `searchThreshold` (number): Minimum similarity score (0-1)
- `maxResults` (number): Maximum results to return
- `cacheEnabled` (boolean): Enable/disable caching
- `cacheTTL` (number): Cache time-to-live in seconds
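For reference, the options above can be sketched as a TypeScript object type. This is a minimal illustration assembled from the list above; the interface name `VectorSearchEngineOptions` is illustrative, not part of the published API:

```typescript
// Hypothetical shape of the constructor options listed above.
interface VectorSearchEngineOptions {
  openaiApiKey: string;      // OpenAI API key
  embeddingModel?: string;   // Model to use for embeddings
  searchThreshold?: number;  // Minimum similarity score (0-1)
  maxResults?: number;       // Maximum results to return
  cacheEnabled?: boolean;    // Enable/disable caching
  cacheTTL?: number;         // Cache time-to-live in seconds
}

// Example values taken from the default configuration shown earlier.
const options: VectorSearchEngineOptions = {
  openaiApiKey: 'sk-your-api-key-here',
  embeddingModel: 'text-embedding-3-small',
  searchThreshold: 0.7,
  maxResults: 10,
  cacheEnabled: true,
  cacheTTL: 3600,
};
```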
#### Methods

- `loadIndex(embeddingsPath, chunksPath)`: Load pre-computed embeddings and chunks from JSON files.
- `search(query, options)`: Search for similar chunks using semantic similarity.
- `findRelated(chunkIndex, options)`: Find chunks similar to a given chunk.
- `generateQueryEmbedding(query)`: Generate an embedding vector for a query string.
- `getStats()`: Get index statistics, including chunk count, token count, and size estimates.
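Conceptually, `search` scores each chunk's embedding against the query embedding, keeps scores at or above the configured threshold, and returns the top results. The sketch below illustrates that ranking with cosine similarity; it is an assumption about the approach, not the package's actual implementation:

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunk embeddings against a query embedding, filter by threshold,
// and return the best matches first (mirrors searchThreshold/maxResults).
function rankChunks(
  queryEmbedding: number[],
  chunkEmbeddings: number[][],
  threshold = 0.7,
  maxResults = 10,
): { index: number; score: number }[] {
  return chunkEmbeddings
    .map((emb, index) => ({ index, score: cosineSimilarity(queryEmbedding, emb) }))
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```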
### ProjectAdapter

#### Methods

- `analyzeProject()`: Analyze project structure and detect type, framework, and features.
- `getDefaultConfig()`: Get the default configuration based on project type.
- `loadCustomConfig()`: Load custom configuration from project config files.
- `getConfig()`: Get the merged configuration (default + custom).
- `getFiles(config)`: Get all files matching the include/exclude patterns.
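The "default + custom" merge performed by `getConfig()` can be pictured as a recursive merge in which custom values win and nested sections are combined key by key. The merge semantics below are an assumption for illustration, not taken from the package source:

```typescript
type Config = { [key: string]: unknown };

// Recursively merge a custom config over a default config:
// custom scalar values replace defaults, nested objects merge key by key.
function mergeConfig(defaults: Config, custom: Config): Config {
  const merged: Config = { ...defaults };
  for (const [key, value] of Object.entries(custom)) {
    const base = merged[key];
    if (
      value && typeof value === 'object' && !Array.isArray(value) &&
      base && typeof base === 'object' && !Array.isArray(base)
    ) {
      merged[key] = mergeConfig(base as Config, value as Config);
    } else {
      merged[key] = value;
    }
  }
  return merged;
}
```

Under these semantics, a `.claude-search.config.js` only needs to list the keys it overrides; every other default survives the merge.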
## Caching

The built-in cache system helps improve performance by storing search results:

```js
import { SimpleCache } from '@claude-vector/core';

const cache = new SimpleCache('./cache', 3600); // 1 hour TTL

// Basic operations
await cache.set('key', { data: 'value' });
const value = await cache.get('key');
await cache.delete('key');

// Maintenance
await cache.cleanup();                 // Remove expired entries
const stats = await cache.getStats();  // Get cache statistics
```

## Advanced Usage
### Custom Embedding Models

```js
const engine = new VectorSearchEngine({
  embeddingModel: 'text-embedding-3-large',
  // Dimensions change based on model
  config: { embeddings: { dimensions: 3072 } }
});
```

### Batch Processing
For large codebases, process embeddings in batches:

```js
const config = {
  embeddings: {
    batchSize: 50,    // Process 50 chunks at a time
    maxRetries: 3,
    retryDelay: 2000
  }
};
```

## Type Definitions
TypeScript users can import the JSDoc-based type definitions:

```ts
import type {
  SearchOptions,
  SearchResult,
  ProjectConfig
} from '@claude-vector/core';
```

## Performance Tips
- **Pre-compute embeddings**: Generate embeddings once and reuse them
- **Enable caching**: Cache search results for repeated queries
- **Optimize chunk size**: Balance between context and performance
- **Use appropriate models**: Smaller models for speed, larger for accuracy
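The batching settings shown earlier (`batchSize`, `maxRetries`, `retryDelay`) combine naturally with the "pre-compute embeddings" tip. A minimal sketch of such a batching loop, where `embedBatch` is a hypothetical stand-in for a real embeddings API call:

```typescript
// Embed chunks in fixed-size batches, retrying each failed batch up to
// maxRetries times with a fixed delay between attempts.
async function embedInBatches(
  chunks: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  batchSize = 50,
  maxRetries = 3,
  retryDelay = 2000,
): Promise<number[][]> {
  const embeddings: number[][] = [];
  for (let start = 0; start < chunks.length; start += batchSize) {
    const batch = chunks.slice(start, start + batchSize);
    for (let attempt = 0; ; attempt++) {
      try {
        embeddings.push(...(await embedBatch(batch)));
        break; // batch succeeded
      } catch (err) {
        if (attempt + 1 >= maxRetries) throw err; // retries exhausted
        await new Promise((resolve) => setTimeout(resolve, retryDelay));
      }
    }
  }
  return embeddings;
}
```

The resulting vectors can then be written to disk once and loaded later with `loadIndex`, avoiding repeated embedding calls.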
## License

MIT
