hnswsqlite
v0.2.1
Vector search with HNSWlib and SQLite in TypeScript.
HNSWSQLite
A TypeScript library that combines approximate nearest neighbor vector search (via HNSWlib) with SQLite storage for persistent, lightweight semantic search. It is suited to semantic search applications, recommendation systems, and similar similarity-based lookups.
Features
- 🚀 Fast Vector Search: Approximate nearest neighbor search using HNSW algorithm
- 💾 Persistence: All data stored in SQLite for durability and easy backup
- 🔌 Plugin System: Support for multiple embedding providers:
  - OpenAI
  - HuggingFace
  - Dummy (for testing)
  - WebLLM (browser-based LLMs)
  - MediaPipe (image/video feature extraction)
  - TensorFlow.js (text/image/audio feature extraction)
- 🛠️ CLI Tool: Full-featured command-line interface for easy interaction
- 📦 Lightweight: No external dependencies other than SQLite and HNSWlib
- 🧩 Extensible: Easy to integrate with existing applications
- 🔄 Batch Operations: Support for adding and deleting multiple documents at once
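To make "approximate nearest neighbor" concrete, the sketch below shows the exact search that an HNSW index approximates: score every stored vector against the query and keep the top k. It is an illustration only and not part of this library. The names `cosineSimilarity`, `exactSearch`, and `Doc` are hypothetical, and cosine similarity is an assumption, since the README does not state which distance metric the index uses.

```typescript
// Hypothetical document shape for this sketch (not the library's type).
type Doc = { id: number; embedding: number[] };

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('dimension mismatch');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Exact k-nearest-neighbor search: scans every document, so each query
// costs O(n * dim). An HNSW index trades a little accuracy to avoid this scan.
function exactSearch(
  docs: Doc[],
  query: number[],
  k: number
): { id: number; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(d.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A full scan like this is fine for a few thousand vectors and is a handy baseline for checking the recall of an approximate index on a small sample.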
Installation
As a Library
npm install hnswsqlite
As a CLI Tool
# Install globally
npm install -g hnswsqlite
# Or use with npx
npx hnswsqlite --help
View on npm: https://www.npmjs.com/package/hnswsqlite
View on GitHub: https://github.com/praveencs87/hnswsqlite
Usage
JavaScript/TypeScript API
import { VectorStore } from 'hnswsqlite';
// Initialize with SQLite database path and embedding dimension
const store = new VectorStore('my_vectors.db', 1536);
try {
  // Add documents with embeddings
  const docId = store.addDocument('hello world', [0.1, 0.2, 0.3, ...]);
  // Search for similar documents
  const results = store.search([0.1, 0.2, 0.3, ...], 5);
  // Delete a document
  const deleted = store.deleteDocument(docId);
  // Batch operations
  const docIds = store.addDocuments([
    { text: 'first document', embedding: [0.1, 0.2, ...] },
    { text: 'second document', embedding: [0.3, 0.4, ...] }
  ]);
} finally {
  // Always close the store when done
  store.close();
}
Command Line Interface (CLI)
Initialize a new database
hnswsqlite init
Add a document
# With automatic dummy embedding
hnswsqlite add "Your document text here"
# With custom embedding
hnswsqlite add "Another document" 0.1 0.2 0.3 ...
# (Planned) With specific provider (e.g., WebLLM, MediaPipe, TensorFlow.js)
hnswsqlite add "Text or image path" --provider webllm
Search for similar documents
hnswsqlite search "search query"
List all documents
hnswsqlite list
Delete a document
hnswsqlite delete 1
CLI Options
-d, --database <path> Path to the SQLite database (default: vectors.db)
--dim <dimension> Dimension of the vectors (default: 1536)
--provider <name> Embedding provider to use (openai, huggingface, webllm, mediapipe, tensorflowjs, dummy)
--verbose Enable verbose output
Advanced Usage
Using Different Embedding Providers
All embedding providers implement a common interface:
type EmbeddingPlugin = {
  name: string;
  generateEmbedding(input: string | Buffer): Promise<number[]>;
};
Example: OpenAI
import { VectorStore } from 'hnswsqlite';
import { OpenAIEmbedder } from 'hnswsqlite/plugins/openai';
const store = new VectorStore('my_vectors.db', 1536);
const embedder = new OpenAIEmbedder('your-api-key');
const embedding = await embedder.generateEmbedding('Your text here');
store.addDocument('Your text here', embedding);
Example: WebLLM (browser-based LLMs)
import { WebLLMPlugin } from 'hnswsqlite/plugins/webllm';
const plugin = new WebLLMPlugin();
const embedding = await plugin.generateEmbedding('Your text here');
Example: MediaPipe (image/video feature extraction)
import { MediaPipePlugin } from 'hnswsqlite/plugins/mediapipe';
const plugin = new MediaPipePlugin();
const embedding = await plugin.generateEmbedding(imageBuffer);
Example: TensorFlow.js (text/image/audio)
import { TensorFlowPlugin } from 'hnswsqlite/plugins/tensorflow';
const plugin = new TensorFlowPlugin();
const embedding = await plugin.generateEmbedding('Your text or image buffer');
Note: Each plugin may require additional dependencies or setup. See the plugin source for details.
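Because every provider shares the EmbeddingPlugin shape shown above, a custom provider is just an object with a `name` and an async `generateEmbedding` method. The following is a minimal sketch under stated assumptions, not one of the bundled plugins. `EmbeddingPluginLike`, `hashToVector`, and `HashingEmbedder` are hypothetical names, and the local type uses `Uint8Array` in place of `Buffer` so the sketch compiles outside Node (a Node `Buffer` is a `Uint8Array`, so Buffers are still accepted). The hashing scheme is deterministic but carries no semantic meaning, so it is only useful for tests and wiring checks.

```typescript
// Local stand-in for the README's EmbeddingPlugin shape (assumption: Uint8Array
// is substituted for Buffer to keep the sketch free of Node-specific types).
type EmbeddingPluginLike = {
  name: string;
  generateEmbedding(input: string | Uint8Array): Promise<number[]>;
};

// Hypothetical helper: bucket character or byte codes into `dim` slots,
// then L2-normalize the result. Deterministic, not semantically meaningful.
function hashToVector(input: string | Uint8Array, dim: number): number[] {
  const vec: number[] = new Array<number>(dim).fill(0);
  for (let i = 0; i < input.length; i++) {
    const code = typeof input === 'string' ? input.charCodeAt(i) : input[i];
    vec[(code * 31 + i) % dim] += 1;
  }
  // Guard against an all-zero vector (empty input) before normalizing.
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0)) || 1;
  return vec.map((x) => x / norm);
}

// Hypothetical custom provider that satisfies the plugin shape.
class HashingEmbedder implements EmbeddingPluginLike {
  readonly name = 'hashing-demo';
  private readonly dim: number;

  constructor(dim: number) {
    this.dim = dim;
  }

  async generateEmbedding(input: string | Uint8Array): Promise<number[]> {
    return hashToVector(input, this.dim);
  }
}
```

Such an embedder would be used like the OpenAI example: construct it with the same dimension passed to `VectorStore`, await `generateEmbedding(text)`, and pass the result to `store.addDocument(text, embedding)`.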
Performance Tuning
const store = new VectorStore('my_vectors.db', 1536, {
  maxElements: 100000, // Maximum number of elements in the index
  M: 16, // Maximum number of outgoing connections per node in the graph
  efConstruction: 200, // Candidate list size during index build: higher values improve recall but slow construction
  randomSeed: 100, // Random seed for reproducibility
});
Development
# Clone the repository
git clone https://github.com/praveencs87/hnswsqlite.git
cd hnswsqlite
# Install dependencies
npm install
# Build the project
npm run build
# Run tests
npm test
# Run the CLI in development mode
npm run cli -- --help
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT © Praveen CS
Author
Maintained by Praveen CS
- GitHub: praveencs87
