VectorSync
Real-time RAG Synchronization & Retrieval for MongoDB + Pinecone
VectorSync solves the problem of keeping your separate Vector Database (Pinecone) in sync with your primary application database (MongoDB). Instead of writing manual hooks or cron jobs, VectorSync attaches to MongoDB Change Streams to automatically reflect insert, update, and delete operations in real-time.
It also provides a built-in Retrieval / RAG Engine to chat with your data immediately.
Features
- Real-time Sync: Reacts instantly to data changes.
- Multi-Provider: Support for OpenAI, Google Gemini, and Ollama (Local).
- Smart Updates: Only regenerates embeddings if specific fields change (saves money).
- RAG Engine: Built-in VectorRetrieval class for context-aware chat, history, and citations.
- Local-First: Full offline support using Ollama for both embeddings and LLM.
Installation
npm install vectorsync
Prerequisites
- MongoDB: Must be a Replica Set (required for Change Streams). Atlas has this by default.
- Pinecone: An index created (dimension must match your model, e.g., 1536 for OpenAI, 768 for Gemini).
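For reference, a matching serverless index can be created with the official @pinecone-database/pinecone client (separate from VectorSync itself); the index name, cloud, and region below are placeholders:

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Dimension must match your embedding model (1536 for text-embedding-3-small)
await pc.createIndex({
  name: 'my-index',
  dimension: 1536,
  metric: 'cosine',
  spec: { serverless: { cloud: 'aws', region: 'us-east-1' } }
});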
Quick Start
1. The Sync Engine (Background Process)
This code acts as the listener. Run it when your server starts.
import { VectorSync } from 'vectorsync';
// 1. Initialize
const syncer = new VectorSync({
mongoUri: process.env.MONGO_URI!,
// Optional: dbName: 'my_db',
vectorDb: {
type: 'pinecone',
options: { apiKey: process.env.PINECONE_API_KEY!, index: 'my-index' }
},
embeddingProvider: {
type: 'openai', // or 'gemini', 'ollama'
options: { apiKey: process.env.OPENAI_API_KEY!, model: 'text-embedding-3-small' }
}
});
// 2. Start Watching Collections
// Only syncs when 'name' or 'description' changes
await syncer.createContext('products', {
fields: ['name', 'description']
});
console.log('VectorSync is processing changes...');
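To illustrate the field-level filter above: with the plain MongoDB driver, an update that touches only unwatched fields is ignored, while a change to 'name' or 'description' triggers a fresh embedding. The 'sku' field and values below are placeholders:

import { MongoClient } from 'mongodb';

const client = new MongoClient(process.env.MONGO_URI!);
await client.connect();
const products = client.db().collection('products');

// Only 'price' changes: not in the watched fields, so no embedding is regenerated
await products.updateOne({ sku: 'CHAIR-01' }, { $set: { price: 199 } });

// 'description' changes: picked up by the change stream and re-embedded
await products.updateOne(
  { sku: 'CHAIR-01' },
  { $set: { description: 'Ergonomic mesh chair with lumbar support' } }
);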
2. The RAG Engine (Chat API)
Use this within your API routes to query the synced data.
import { VectorRetrieval } from 'vectorsync';
// Reuse the 'syncer' config, or create new instances of adapters
const retrieval = new VectorRetrieval(syncer.vectorDb, syncer.embeddingProvider);
// A. Stateful Chat (Maintains History)
const sessionId = retrieval.createSession({
systemPrompt: "You are a shopping assistant.",
contextSources: [{ collectionName: 'products', fields: ['name', 'description'] }],
model: 'gpt-4o',
provider: 'openai'
});
const response = await retrieval.query(sessionId, "Do you have ergonomic chairs?");
console.log(response.response); // "Yes, we have..."
console.log(response.retrievedDocuments); // [{ id: '...', score: 0.89, ... }]
// B. Stateless Query (One-off)
const result = await retrieval.queryOnce({
systemPrompt: "Answer based on context",
contextSources: [{ collectionName: 'products', fields: ['name'] }],
model: 'llama3.2',
provider: 'ollama'
}, "Tell me about product X", { debug: true });Supported Providers
Supported Providers
| Provider | Type | Config Type | Default Model |
| :--- | :--- | :--- | :--- |
| OpenAI | Embedding / LLM | 'openai' | text-embedding-3-small |
| Google Gemini | Embedding / LLM | 'gemini' | text-embedding-004 |
| Ollama (Local) | Embedding / LLM | 'ollama' | nomic-embed-text |
| Pinecone | Vector DB | 'pinecone' | - |
Configuration Reference
VectorSyncConfig
{
mongoUri: string;
dbName?: string;
vectorDb: {
type: 'pinecone' | 'custom';
options: { apiKey: string; index: string; };
};
embeddingProvider: {
type: 'openai' | 'gemini' | 'ollama' | 'custom';
options: {
apiKey?: string; // Not needed for Ollama
baseUrl?: string; // Mainly for Ollama (local server URL)
model?: string; // Optional override
};
};
}
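For a fully local embedding setup (see Local-First above), the same config can point the embedding provider at Ollama. The baseUrl below is Ollama's default local address and the index name is a placeholder; remember the index dimension must match the model (nomic-embed-text produces 768-dimensional vectors):

import { VectorSync } from 'vectorsync';

const localSyncer = new VectorSync({
  mongoUri: process.env.MONGO_URI!,
  vectorDb: {
    type: 'pinecone',
    options: { apiKey: process.env.PINECONE_API_KEY!, index: 'my-local-index' } // 768-dim index
  },
  embeddingProvider: {
    type: 'ollama',
    options: { baseUrl: 'http://localhost:11434', model: 'nomic-embed-text' } // no apiKey needed
  }
});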
Environment Variables
Typical setup in .env:
MONGO_URI=mongodb://localhost:27017/mydb?replSet=rs0
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...
PINECONE_API_KEY=pc-...
License
ISC
