seekmix

v1.2.0

Published

10 months ago

🔍 A local semantic caching library for Node.js.

SeekMix

SeekMix is a semantic caching library for Node.js. It uses vector embeddings to cache results and return them for semantically similar queries, which reduces the number of calls to expensive LLM services.

Features

Semantic Caching: Cache results based on the semantic meaning of queries, not just exact matches
Configurable Similarity Threshold: Fine-tune how semantically similar queries need to be for a cache hit
Local Embedding Models: By default, SeekMix uses Hugging Face embedding models locally, reducing external API dependencies
Multiple Embedding Providers: Support for OpenAI and Hugging Face embedding models
Redis Vector Database: Leverages Redis as a vector database for efficient similarity search
Time-based Invalidation: Easily invalidate old cache entries based on time criteria
TTL Support: Configure time-to-live for all cache entries
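
To illustrate what the similarity threshold controls, here is a minimal sketch of a threshold check using cosine similarity. This README does not state which metric SeekMix uses internally, so cosine similarity and the helper names cosineSimilarity and isCacheHit are assumptions for illustration only, not the library's implementation.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: a stored entry counts as a hit when its similarity
// to the query reaches the configured threshold.
function isCacheHit(queryVector, storedVector, similarityThreshold) {
  return cosineSimilarity(queryVector, storedVector) >= similarityThreshold;
}
```

With a threshold of 0.9, vectors pointing in nearly the same direction count as a hit, while vectors 45 degrees apart (similarity of about 0.71) do not. Raising the threshold trades fewer cache hits for a lower risk of returning an answer to a different question.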

Benefits

Cost Reduction: Minimize expensive API calls to Large Language Models
Improved Response Times: Retrieve cached results for semantically similar queries instantly
Perfect for RAG Applications: Ideal for Retrieval-Augmented Generation systems
Flexible Configuration: Adapt to your specific use case with multiple configuration options
Multi-model Support: Use with OpenAI or open-source Hugging Face models

Requirements

Node.js (>= 14.x)
Redis with the RediSearch and RedisJSON modules enabled (Redis Stack recommended)
Disk space for locally downloaded Hugging Face embedding models

Installation with Redis Stack (Docker)

docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest

npm install seekmix

Basic Usage

const { SeekMix, OpenAIEmbeddingProvider } = require('seekmix');

// Function that simulates an expensive API call (e.g., to an LLM)
async function expensiveApiCall(query) {
    console.log(`Making expensive API call for: "${query}"`);
    // Simulate processing time
    await new Promise(resolve => setTimeout(resolve, 1000));

    // In a real-world scenario, this would be a call to an API like GPT-X
    return `Response for: ${query} - ${new Date().toISOString()}`;
}

// Top-level await is not available in CommonJS modules, so run inside an async function
async function main() {
    // Create and initialize the semantic cache
    const cache = new SeekMix({
        similarityThreshold: 0.9, // Semantic similarity threshold
        ttl: 60 * 60, // 1 hour TTL
        // embeddingProvider: new OpenAIEmbeddingProvider()
    });

    await cache.connect();
    console.log('Semantic cache connected successfully');

    // Examples of semantically similar queries
    const queries = [
        'What are the best restaurants in New York',
        'Recommend places to eat in New York',
        'I need information about restaurants in Chicago',
        'Looking for good dining spots in New York',
        'Tell me about hiking trails'
    ];

    // Process queries, using the cache when possible
    for (const query of queries) {
        console.log(`\nProcessing query: "${query}"`);

        // Try to get from cache
        const cachedResult = await cache.get(query);

        if (cachedResult) {
            console.log(`✅ CACHE HIT - Similarity: ${(1 - cachedResult.score).toFixed(4)}`);
            console.log(`Original query: "${cachedResult.query}"`);
            console.log(`Result: ${cachedResult.result}`);
            console.log(`Stored: ${Math.round((Date.now() - cachedResult.timestamp) / 1000)} seconds ago`);
        } else {
            console.log('❌ CACHE MISS - Making API call');

            // Make the expensive call
            const result = await expensiveApiCall(query);

            // Save to cache for future similar queries
            await cache.set(query, result);
            console.log(`Result: ${result}`);
            console.log('Saved to cache for future similar queries');
        }
    }

    await cache.disconnect();
}

main().catch(console.error);
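
The get-then-set pattern above can be folded into a small wrapper. The following is a minimal sketch; getOrCompute is a hypothetical helper that is not part of SeekMix, and it assumes only what the example above shows: cache.get(query) resolves to a falsy value on a miss or to an object with a result field on a hit, and cache.set(query, result) stores a new entry.

```javascript
// Hypothetical helper, not part of SeekMix: return a cached result when one
// exists; otherwise compute it, store it, and return it.
async function getOrCompute(cache, query, compute) {
  const cached = await cache.get(query);
  if (cached) {
    return { result: cached.result, fromCache: true };
  }
  // Cache miss: run the expensive function and remember its result.
  const result = await compute(query);
  await cache.set(query, result);
  return { result, fromCache: false };
}
```

Inside an async function, a call such as `const { result } = await getOrCompute(cache, query, expensiveApiCall);` would then replace the if/else branch in the loop.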

Advanced Configuration

const { SeekMix, OpenAIEmbeddingProvider } = require('seekmix');

// Create a semantic cache with OpenAI embeddings and custom settings
const cache = new SeekMix({
  redisUrl: 'redis://username:password@your-redis-host:6379',
  indexName: 'my-app:semantic-cache',
  keyPrefix: 'my-app:cache:',
  ttl: 60 * 60 * 24 * 7, // 1 week
  similarityThreshold: 0.85,
  dropIndex: false, // Set to true to recreate the index on connect
  dropKeys: false, // Set to true to clear all cache entries on connect
  embeddingProvider: new OpenAIEmbeddingProvider({
    model: 'text-embedding-ada-002',
    apiKey: process.env.OPENAI_API_KEY
  })
});
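
Settings such as the Redis URL, TTL, and threshold often differ between environments. As one possible approach, the sketch below builds the options object from environment variables. The helper name buildCacheOptions, the variable names REDIS_URL, CACHE_TTL_SECONDS, and CACHE_SIMILARITY, and the fallback values are all assumptions for illustration; only the option keys redisUrl, ttl, and similarityThreshold come from the example above.

```javascript
// Hypothetical helper: build a SeekMix options object from environment variables.
// The variable names and default values are assumptions, not SeekMix conventions.
function buildCacheOptions(env = process.env) {
  const ttl = Number.parseInt(env.CACHE_TTL_SECONDS ?? '3600', 10);
  const similarityThreshold = Number.parseFloat(env.CACHE_SIMILARITY ?? '0.85');

  if (!Number.isInteger(ttl) || ttl <= 0) {
    throw new Error('CACHE_TTL_SECONDS must be a positive integer');
  }
  if (!(similarityThreshold > 0 && similarityThreshold <= 1)) {
    throw new Error('CACHE_SIMILARITY must be greater than 0 and at most 1');
  }

  return {
    redisUrl: env.REDIS_URL ?? 'redis://localhost:6379',
    ttl,
    similarityThreshold,
  };
}
```

The result can be passed straight to the constructor, for example `new SeekMix(buildCacheOptions())`. Validating early makes a misconfigured deployment fail at startup instead of silently caching with unintended settings.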

Using with RAG Applications

SeekMix suits Retrieval-Augmented Generation applications because it can cache both the retrieval and generation steps:

// Note: run this setup inside an async function, or in an ES module that supports top-level await
// Caching the retrieval step
const retrievalCache = new SeekMix({ keyPrefix: 'rag:retrieval:' });
await retrievalCache.connect();

// Caching the generation step
const generationCache = new SeekMix({ keyPrefix: 'rag:generation:' });
await generationCache.connect();

// retrieveDocuments and generateAnswer are placeholders for your own retrieval and LLM calls
async function queryRAG(userQuestion) {
  // 1. Try to get the final answer from generation cache
  const cachedAnswer = await generationCache.get(userQuestion);
  if (cachedAnswer) return cachedAnswer.result;

  // 2. Try to get retrieved context from retrieval cache
  let context;
  const cachedRetrieval = await retrievalCache.get(userQuestion);
  
  if (cachedRetrieval) {
    context = cachedRetrieval.result;
  } else {
    // Perform actual retrieval from vector DB
    context = await retrieveDocuments(userQuestion);
    // Cache the retrieval results
    await retrievalCache.set(userQuestion, context);
  }

  // 3. Generate answer using LLM
  const answer = await generateAnswer(context, userQuestion);
  
  // 4. Cache the final answer
  await generationCache.set(userQuestion, answer);
  
  return answer;
}
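
To exercise a flow like queryRAG in unit tests without Redis or an embedding model, an in-memory stand-in can replace the two caches. The class below is a hypothetical test double, not part of SeekMix. It matches queries by exact string only, and it mimics just the get/set shape and the result fields (query, result, timestamp, score) that appear in this README.

```javascript
// Hypothetical test double, not part of SeekMix: exact-match only,
// no Redis and no embeddings.
class InMemoryCache {
  constructor() {
    this.entries = new Map();
  }

  // Resolves to null on a miss, or to an entry shaped like the ones
  // used in this README on a hit.
  async get(query) {
    const entry = this.entries.get(query);
    if (!entry) return null;
    // score is fixed at 0 here because only identical queries match.
    return { query, result: entry.result, timestamp: entry.timestamp, score: 0 };
  }

  async set(query, result) {
    this.entries.set(query, { result, timestamp: Date.now() });
  }
}
```

Because it never matches paraphrased queries, this stand-in only verifies the control flow (miss, compute, store, hit), not the semantic behavior of the real cache.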

Invalidating Old Cache Entries

You can manually invalidate old cache entries:

// Invalidate entries older than 1 hour
const invalidated = await cache.invalidateOld(60 * 60);
console.log(`Invalidated ${invalidated} old cache entries`);
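
For a long-running service, the same call can be repeated on a timer. The sketch below is one way to do that with setInterval. scheduleInvalidation is a hypothetical helper, not part of SeekMix, and it assumes only that cache.invalidateOld(seconds) resolves to the number of invalidated entries, as in the example above.

```javascript
// Hypothetical helper: call cache.invalidateOld(maxAgeSeconds) every
// intervalMs milliseconds. Returns a function that stops the schedule.
function scheduleInvalidation(cache, maxAgeSeconds, intervalMs) {
  const timer = setInterval(async () => {
    try {
      const removed = await cache.invalidateOld(maxAgeSeconds);
      console.log(`Invalidated ${removed} old cache entries`);
    } catch (err) {
      // Log and keep the schedule running; a failed run should not crash the process.
      console.error('Cache invalidation failed:', err);
    }
  }, intervalMs);

  // Do not keep the Node.js process alive just for this timer.
  timer.unref();

  return () => clearInterval(timer);
}
```

For example, `const stop = scheduleInvalidation(cache, 60 * 60, 10 * 60 * 1000);` would remove entries older than one hour every ten minutes, and calling `stop()` before `cache.disconnect()` ends the schedule.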

License

MIT
