# LangChain LambdaDB Integration

`@functional-systems/langchain-lambdadb` v0.2.1 — hmm
LangChain integration for LambdaDB vector database
A production-ready TypeScript library that integrates LambdaDB vector database with LangChain.js, providing seamless vector storage and retrieval capabilities for AI applications.
## Features
- 🚀 Easy Integration: Drop-in replacement for other LangChain vector stores
- 🎯 Vector Similarity Search: Support for cosine, euclidean, and dot product similarity metrics
- 🧠 Max Marginal Relevance (MMR): Diverse search results balancing relevance and diversity
- 📊 Batch Operations: Efficient bulk document insertion and processing
- 🔍 Flexible Configuration: Custom field names, similarity metrics, and collection settings
- 🛡️ Type Safety: Full TypeScript support with comprehensive type definitions
- ⚡ High Performance: Leverages LambdaDB's optimized vector search engine with consistent reads
- 🧪 Production Ready: Comprehensive test suite with 43 passing tests (16 unit + 27 integration)
- 🔄 Retry Logic: Built-in exponential backoff for robust error handling
- 📈 Collection Management: Full lifecycle management with state monitoring
- 🗑️ Document Deletion: LangChain `delete()` support with server-side LambdaDB filters (by `ids`, `filter`, or `deleteAll`)
## Installation

```bash
npm install langchain-lambdadb @langchain/core
```

## Quick Start

```typescript
import { LambdaDBVectorStore } from 'langchain-lambdadb';
import { OpenAIEmbeddings } from '@langchain/openai';
import { Document } from '@langchain/core/documents';

// Initialize embeddings
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY
});

// Configure LambdaDB connection
const config = {
  projectApiKey: process.env.LAMBDADB_API_KEY!,
  serverURL: process.env.LAMBDADB_SERVER_URL, // Optional: custom server
  collectionName: 'my-documents',
  vectorDimensions: 1536, // OpenAI embedding dimensions
  similarityMetric: 'cosine',
  // Optional: configure retry behavior
  retryOptions: {
    maxAttempts: 3,
    initialDelay: 500,
    maxDelay: 5000
  }
};

// Create vector store
const vectorStore = new LambdaDBVectorStore(embeddings, config);

// Create collection if it doesn't exist
await vectorStore.createCollection();

// Add documents
const documents = [
  new Document({
    pageContent: 'LangChain is a framework for developing applications powered by language models.',
    metadata: { source: 'documentation', category: 'framework' }
  }),
  new Document({
    pageContent: 'LambdaDB is a vector database optimized for AI applications.',
    metadata: { source: 'documentation', category: 'database' }
  })
];
await vectorStore.addDocuments(documents);

// Perform similarity search
const results = await vectorStore.similaritySearch('What is LangChain?', 5);
console.log(results);
```

## Configuration Options
### LambdaDBConfig
| Option | Type | Required | Description |
|--------|------|----------|-------------|
| `projectApiKey` | `string` | ✅ | Your LambdaDB project API key |
| `collectionName` | `string` | ✅ | Name of the collection to use |
| `vectorDimensions` | `number` | ✅ | Vector dimensions for embeddings |
| `similarityMetric` | `SimilarityMetric` | ❌ | Similarity metric (default: `'cosine'`) |
| `baseUrl` | `string` | ❌ | API base URL (e.g. `https://api.lambdadb.ai`). Use together with `projectName`. |
| `projectName` | `string` | ❌ | Project name (path under `/projects/`). Use together with `baseUrl`. |
| `serverURL` | `string` | ❌ | Deprecated. Full server URL override. Prefer `baseUrl` + `projectName`. |
| `textField` | `string` | ❌ | Field name for document content (default: `'content'`) |
| `vectorField` | `string` | ❌ | Field name for vectors (default: `'vector'`) |
| `validateCollection` | `boolean` | ❌ | Validate collection before operations (default: `false`) |
| `defaultConsistentRead` | `boolean` | ❌ | Use consistent reads by default (default: `true`) |
| `retryOptions` | `RetryOptions` | ❌ | Retry behavior with exponential backoff |
| `partitionConfig` | `PartitionConfigOption` | ❌ | Optional partition config for collection creation |
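To illustrate, a configuration combining several of the optional fields might look like this (the API key, project, collection, and field names are placeholders; option names are as listed above):

```typescript
// Illustrative LambdaDBConfig object (all values are placeholders)
const fullConfig = {
  projectApiKey: 'your-api-key',
  baseUrl: 'https://api.lambdadb.ai', // preferred over the deprecated serverURL
  projectName: 'my-project',
  collectionName: 'articles',
  vectorDimensions: 1536,
  similarityMetric: 'cosine',
  textField: 'content',               // defaults shown explicitly
  vectorField: 'vector',
  defaultConsistentRead: true,
  retryOptions: { maxAttempts: 3, initialDelay: 500, maxDelay: 5000 }
};
```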
### Similarity Metrics

- `'cosine'` - Cosine similarity (default, recommended for most use cases)
- `'euclidean'` - Euclidean distance
- `'dot_product'` - Dot product similarity
## Usage Examples

### Basic Vector Search
```typescript
import { LambdaDBVectorStore } from 'langchain-lambdadb';
import { OpenAIEmbeddings } from '@langchain/openai';

const vectorStore = new LambdaDBVectorStore(
  new OpenAIEmbeddings(),
  {
    projectApiKey: process.env.LAMBDADB_API_KEY!,
    collectionName: 'documents',
    vectorDimensions: 1536,
  }
);

// Search with custom parameters
const results = await vectorStore.similaritySearchWithScore('query text', 10);
results.forEach(([doc, score]) => {
  console.log(`Score: ${score}, Content: ${doc.pageContent}`);
});
```

### Using with Different Embedding Models
```typescript
import { HuggingFaceTransformersEmbeddings } from '@langchain/community/embeddings/hf_transformers';

// Using Hugging Face embeddings
const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: 'Xenova/all-MiniLM-L6-v2',
});

const vectorStore = new LambdaDBVectorStore(embeddings, {
  projectApiKey: process.env.LAMBDADB_API_KEY!,
  collectionName: 'hf-documents',
  vectorDimensions: 384, // all-MiniLM-L6-v2 dimensions
  similarityMetric: 'cosine'
});
```

### Creating from Texts and Metadata
```typescript
// Create vector store from texts
const texts = [
  'The quick brown fox jumps over the lazy dog.',
  'Machine learning is a subset of artificial intelligence.',
  'Vector databases enable efficient similarity search.'
];

const metadatas = [
  { category: 'literature' },
  { category: 'technology' },
  { category: 'database' }
];

const vectorStore = await LambdaDBVectorStore.fromTexts(
  texts,
  metadatas,
  embeddings,
  config
);
```

### Max Marginal Relevance (MMR) Search
```typescript
// MMR search for diverse results
const mmrResults = await vectorStore.maxMarginalRelevanceSearch(
  'machine learning frameworks',
  {
    k: 5,       // Number of results to return
    fetchK: 20, // Number of initial candidates to fetch
    lambda: 0.7 // Balance between relevance (1.0) and diversity (0.0)
  }
);
```

### Advanced Filtering
Search supports server-side filters (LambdaDB syntax) or a client-side filter function. Prefer server-side filtering for efficiency.

```typescript
// Server-side: LambdaDB query string (recommended)
const results = await vectorStore.similaritySearchVectorWithScore(
  queryVector,
  5,
  'category:technology'
);

// Server-side: full LambdaDB filter object
const results2 = await vectorStore.similaritySearchVectorWithScore(queryVector, 5, {
  queryString: { query: 'category:technology AND year:2024' },
});

// Client-side: filter function (applied after fetch)
const filterFn = (doc: Document) => doc.metadata?.category === 'technology';
const results3 = await vectorStore.similaritySearchVectorWithScore(queryVector, 5, filterFn);
```

See LambdaDB Query string for filter syntax.
### Deleting Documents

The store implements the LangChain VectorStore `delete()` interface. You must pass explicit parameters; there is no default that deletes everything, to avoid accidental wipes.

By IDs (most efficient when you know the IDs):

```typescript
await vectorStore.delete({ ids: ['id1', 'id2'] });
```

By LambdaDB filter (recommended when filtering by metadata; server-side, one API call):

```typescript
// Query string – converted to a LambdaDB queryString filter
await vectorStore.delete({ filter: 'genre:documentary AND year:2019' });

// Or a full LambdaDB filter object
await vectorStore.delete({
  filter: { queryString: { query: 'genre:documentary AND year:2019' } },
});
```

See LambdaDB Delete data and Query string for filter syntax.

Delete all documents in the collection (explicit):

```typescript
await vectorStore.delete({ deleteAll: true });
```

By client-side filter function (fetches all documents, then deletes by IDs; use only when a LambdaDB filter is not enough):

```typescript
await vectorStore.delete({
  filter: (doc) => doc.metadata.source === 'legacy',
});
```

### RAG (Retrieval-Augmented Generation) Integration
```typescript
import { ChatOpenAI } from '@langchain/openai';
import { ConversationalRetrievalQAChain } from 'langchain/chains';

const llm = new ChatOpenAI();
const retriever = vectorStore.asRetriever({
  searchType: 'similarity',
  searchKwargs: { k: 6 }
});

const chain = ConversationalRetrievalQAChain.fromLLM(llm, retriever);
const response = await chain.call({
  question: 'What is the main topic of the documents?',
  chat_history: []
});
```

## API Reference
### LambdaDBVectorStore Class

#### Constructor

```typescript
new LambdaDBVectorStore(embeddings: EmbeddingsInterface, config: LambdaDBConfig)
```

#### Methods
`addDocuments(documents: Document[]): Promise<string[] | void>`

Adds documents to the vector store with automatic embedding generation. Returns the assigned document IDs.

`addVectors(vectors: number[][], documents: Document[]): Promise<string[] | void>`

Adds pre-computed vectors with their associated documents. Returns the assigned document IDs.
`similaritySearch(query: string, k?: number, filter?: DocumentFilter): Promise<Document[]>`

Performs similarity search with a text query.

`similaritySearchVectorWithScore(query: number[], k: number, filter?: DocumentFilter | LambdaDBFilterObject | string): Promise<[Document, number][]>`

Performs similarity search with a vector query and returns documents with similarity scores. Filter handling: a string or LambdaDB object is applied server-side via `knn.filter`; a function is applied client-side after the fetch.
`maxMarginalRelevanceSearch(query: string, options?: MMRSearchOptions): Promise<Document[]>`

Performs MMR search using vector similarity: fetches candidates with `includeVectors: true`, then balances relevance to the query against diversity among the already-selected documents (cosine similarity).
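As a rough illustration of that selection step (a sketch of the standard MMR algorithm, not the library's actual implementation), MMR greedily picks the candidate maximizing `lambda * sim(query, doc) - (1 - lambda) * max sim(doc, selected)`:

```typescript
// Cosine similarity between two dense vectors
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedily select k candidate indices, trading off query relevance
// against redundancy with already-selected candidates
function mmrSelect(query: number[], candidates: number[][], k: number, lambda = 0.7): number[] {
  const selected: number[] = [];
  const remaining = new Set(candidates.map((_, i) => i));
  while (selected.length < k && remaining.size > 0) {
    let best = -1, bestScore = -Infinity;
    for (const i of remaining) {
      const relevance = cosine(query, candidates[i]);
      const redundancy = selected.length
        ? Math.max(...selected.map((j) => cosine(candidates[i], candidates[j])))
        : 0;
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) { bestScore = score; best = i; }
    }
    selected.push(best);
    remaining.delete(best);
  }
  return selected;
}
```

With `lambda` near 1 the selection approaches plain nearest-neighbor ranking; lower values penalize near-duplicates among the results.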
`createCollection(options?: Partial<CreateCollectionOptions>): Promise<void>`

Creates a new collection in LambdaDB with proper state monitoring.

`deleteCollection(): Promise<void>`

Deletes the collection from LambdaDB.

`getCollectionInfo(): Promise<CollectionInfo>`

Returns information about the collection, including status and document count.
`delete(_params?: Record<string, any>): Promise<void>` (LangChain VectorStore interface)

Deletes documents. Requires explicit params (no default). Use one of:

- `{ ids: string[] }` – delete by document IDs
- `{ filter: string | LambdaDBFilterObject }` – server-side delete (recommended); a string is used as `queryString.query`
- `{ filter: (doc: Document) => boolean }` – client-side filter (fetches all, then deletes by IDs)
- `{ deleteAll: true }` – delete all documents in the collection
`deleteDocuments(options: DeleteOptions): Promise<void>`

Lower-level delete with the same options as `delete()`: `ids`, `filter` (string, LambdaDB object, or function), or `deleteAll: true`.
#### Static Factory Methods

`fromTexts(texts: string[], metadatas: object[] | object, embeddings: EmbeddingsInterface, config: LambdaDBConfig): Promise<LambdaDBVectorStore>`

Creates a vector store from an array of texts.

`fromDocuments(docs: Document[], embeddings: EmbeddingsInterface, config: LambdaDBConfig): Promise<LambdaDBVectorStore>`

Creates a vector store from an array of documents.
## Environment Variables

You can set your LambdaDB credentials using environment variables:

```bash
export LAMBDADB_API_KEY="your-api-key-here"
export LAMBDADB_SERVER_URL="https://your-instance.lambdadb.ai" # Optional
```

## Error Handling
The library provides comprehensive error handling:
```typescript
try {
  await vectorStore.addDocuments(documents);
} catch (error) {
  if (error.message.includes('LambdaDB Error')) {
    console.error('LambdaDB service error:', error.message);
  } else if (error.message.includes('Vector dimension mismatch')) {
    console.error('Embedding dimension error:', error.message);
  } else {
    console.error('Unexpected error:', error.message);
  }
}
```

## Development
### Running Tests

```bash
# Run all tests
npm test

# Run only unit tests
npm run test:unit

# Run only integration tests (requires LAMBDADB_API_KEY)
npm run test:integration
```

Integration tests: set `LAMBDADB_API_KEY` (and optionally `LAMBDADB_SERVER_URL`) to run the integration tests against a real LambdaDB service.
### Building

```bash
npm run build
```

### Linting

```bash
npm run lint
```

## Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Implementation Details

### Key Features Implemented

- Eventual Consistency Handling: Uses `consistentRead: true` by default for immediate consistency
- Collection State Management: Waits for the collection to become ACTIVE before operations
- Error Handling: Comprehensive error handling with retry logic and exponential backoff
- Field Name Configuration: Supports custom field names for text and vector data
- Batch Processing: Efficient bulk operations with proper error handling
- MMR: Vector-based MMR with `includeVectors: true` and cosine similarity for relevance/diversity balance
- Client options: Prefer `baseUrl` + `projectName`; `serverURL` supported but deprecated
- Test Coverage: Unit and integration tests covering core functionality and edge cases
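The retry behavior above amounts to capped exponential backoff. A minimal sketch of the delay schedule, assuming the `RetryOptions` fields from the configuration table and a simple doubling rule (the exact growth factor inside the library is an assumption here):

```typescript
// Illustrative capped exponential backoff schedule (not the library's internals)
interface RetryOptions {
  maxAttempts: number;  // total attempts, including the first call
  initialDelay: number; // ms before the first retry
  maxDelay: number;     // upper bound on any single delay
}

// Delay before retry n: initialDelay * 2^(n - 1), capped at maxDelay
function backoffDelays(opts: RetryOptions): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < opts.maxAttempts; attempt++) {
    delays.push(Math.min(opts.initialDelay * 2 ** (attempt - 1), opts.maxDelay));
  }
  return delays;
}
```

With the Quick Start settings (`maxAttempts: 3`, `initialDelay: 500`, `maxDelay: 5000`) this yields two retries delayed by 500 ms and 1000 ms.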
### LambdaDB Integration Notes

- Uses the KNN query format: `{ knn: { field, queryVector, k } }`
- Prefer `baseUrl` + `projectName`; use `serverURL` (exact name, not `serverUrl`) only when overriding the full URL
- Supports immediate consistency with `consistentRead: true`
- Collection creation polls state until ACTIVE; optional `partitionConfig` supported
- Delete: prefer a server-side filter (`filter` as a string or LambdaDB object) for efficiency; `deleteAll: true` uses the LambdaDB filter `{ queryString: { query: "*:*" } }` (see Delete data)
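Putting the first note together with the filtering behavior described earlier, the KNN portion of a search request can be sketched as a plain object (field name per the `vectorField` default; the vector values and filter are placeholders):

```typescript
// Illustrative KNN query body per the format noted above (values are placeholders)
const knnQuery = {
  knn: {
    field: 'vector',                // vectorField from the config (default 'vector')
    queryVector: [0.1, 0.2, 0.3],   // embedding of the query text
    k: 5,                           // number of nearest neighbors to return
    // Optional server-side filter, as applied via knn.filter:
    // filter: { queryString: { query: 'category:technology' } },
  },
};
```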
## Links
- LambdaDB Documentation
- LangChain.js Documentation
- TypeScript Client GitHub
- Python Integration Reference
## Support
If you encounter any issues or have questions:
- Check the GitHub Issues
- Review the LambdaDB Documentation
- Join the LangChain Discord
