@rbalchii/tag-walker
v1.0.0
Graph-based semantic retrieval without embeddings - replaces vector DBs
@anchor/tag-walker
Semantic search without embeddings. No GPU. No cloud.
Overview
Tag-Walker is a graph-based retrieval system built on full-text search (FTS), intended as a replacement for traditional vector databases. It retrieves content deterministically through tags rather than embeddings, so it runs entirely on the CPU with no GPU or VRAM requirements.
The system implements a 3-phase search algorithm:
- Anchor: Direct Full-Text Search for initial results
- Pivot: Tag extraction and expansion using synonym rings
- Walk: Associative expansion through graph-based exploration
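To make the three phases concrete, here is a minimal in-memory sketch. It is not the package's implementation: the `Doc` shape, the `synonymRings` table, and the `anchor`, `pivot`, `walk`, and `search` helpers are all hypothetical, a substring match stands in for real FTS, and it assumes the pivot phase draws its tags from the anchor hits.

```typescript
// Hypothetical in-memory sketch of Anchor -> Pivot -> Walk; not the package's internals.
interface Doc {
  id: string;
  content: string;
  tags: string[];
}

// Assumed synonym ring format: each term maps to its equivalents.
const synonymRings: Record<string, string[]> = {
  ai: ["artificial intelligence", "machine learning"],
  "machine learning": ["ai"],
};

const docs: Doc[] = [
  { id: "d1", content: "Machine learning is transforming industries.", tags: ["ai", "technology"] },
  { id: "d2", content: "Neural networks power modern assistants.", tags: ["ai", "assistants"] },
  { id: "d3", content: "Sourdough needs a long fermentation.", tags: ["baking"] },
];

// Phase 1 (Anchor): direct text match, standing in for a real FTS query.
function anchor(query: string, corpus: Doc[]): Doc[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return corpus.filter((d) => terms.some((t) => d.content.toLowerCase().includes(t)));
}

// Phase 2 (Pivot): collect tags from the anchor hits and widen them with synonyms.
function pivot(hits: Doc[]): Set<string> {
  const tags = new Set<string>();
  for (const hit of hits) {
    for (const tag of hit.tags) {
      tags.add(tag);
      for (const syn of synonymRings[tag] ?? []) tags.add(syn);
    }
  }
  return tags;
}

// Phase 3 (Walk): pull in documents that share any pivot tag with the anchor hits.
function walk(tags: Set<string>, hits: Doc[], corpus: Doc[]): Doc[] {
  const seen = new Set(hits.map((d) => d.id));
  return corpus.filter((d) => !seen.has(d.id) && d.tags.some((t) => tags.has(t)));
}

function search(query: string, corpus: Doc[]): Doc[] {
  const hits = anchor(query, corpus);
  return [...hits, ...walk(pivot(hits), hits, corpus)];
}

const found = search("machine learning", docs).map((d) => d.id);
console.log(found); // [ 'd1', 'd2' ]
```

In this sketch `d2` never mentions the query terms; it is reached only because it shares the `ai` tag with the anchor hit `d1`, which is the kind of associative expansion the Walk phase describes.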
Installation
```bash
npm install @anchor/tag-walker
```
Quick Start
```typescript
import { TagWalker } from '@anchor/tag-walker';
import { PGlite } from '@electric-sql/pglite';

// Initialize PGlite database
const db = await PGlite.create();

// Initialize TagWalker
const tagWalker = new TagWalker({
  db,
  maxResults: 50,
  tokenBudget: 4096
});

// Ingest content with tags
await tagWalker.ingest(
  "Artificial intelligence is transforming industries worldwide.",
  ["AI", "technology", "innovation"],
  "source_document_1"
);

// Search for relevant content
const results = await tagWalker.search("machine learning technology");
console.log(results);
```
How It Works
Tag-Walker replaces embedding-based vector search with deterministic tag-based retrieval. Instead of relying on neural networks to create semantic vectors, it uses:
- Synonym Rings: Static lookup tables for semantic expansion
- Graph Navigation: Associative expansion through shared tags
- Full-Text Search: Fast indexing and retrieval using PostgreSQL FTS
This approach provides semantic search capabilities without the computational overhead of embedding models, making it ideal for local-first applications.
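A synonym ring can be pictured as a static lookup table in which every member of a ring resolves to the whole ring. The sketch below illustrates that idea only; the `rings` data, `buildRingIndex`, and `expandTerms` are hypothetical names, and the package's actual dictionary format is not documented here.

```typescript
// Hypothetical ring definition: each inner array is one ring of interchangeable terms.
const rings: string[][] = [
  ["ai", "artificial intelligence", "machine learning"],
  ["car", "automobile", "vehicle"],
];

// Build a static lookup table so every member of a ring resolves to the whole ring.
function buildRingIndex(groups: string[][]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const group of groups) {
    const normalized = group.map((term) => term.toLowerCase());
    for (const term of normalized) index.set(term, normalized);
  }
  return index;
}

// Expand query terms through the table; unknown terms pass through unchanged.
function expandTerms(terms: string[], index: Map<string, string[]>): string[] {
  const expanded = new Set<string>();
  for (const term of terms) {
    const key = term.toLowerCase();
    for (const member of index.get(key) ?? [key]) expanded.add(member);
  }
  return Array.from(expanded);
}

const ringIndex = buildRingIndex(rings);
console.log(expandTerms(["AI", "pricing"], ringIndex));
// [ 'ai', 'artificial intelligence', 'machine learning', 'pricing' ]
```

Because the table is fixed data rather than a learned model, the same query always expands to the same terms, which is where the deterministic behavior comes from.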
API Reference
new TagWalker(config)
Initialize the Tag-Walker with configuration options:
- db: PGlite database instance
- synonymsPath?: Path to custom synonym dictionary
- maxResults?: Maximum number of results to return (default: 50)
- tokenBudget?: Maximum token budget for results (default: 4096)
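The sketch below shows one way the documented defaults and the two limits could interact. It is illustrative only: `TagWalkerConfigSketch`, `withDefaults`, `estimateTokens`, and `applyLimits` are hypothetical names, and the estimate of roughly four characters per token is an assumption, not the package's tokenizer.

```typescript
// Mirrors the documented options; the type name and helpers are illustrative only.
interface TagWalkerConfigSketch {
  db: unknown; // a PGlite instance in real use
  synonymsPath?: string;
  maxResults?: number;
  tokenBudget?: number;
}

// Fill in the documented defaults for options the caller left out.
function withDefaults(config: TagWalkerConfigSketch) {
  return {
    ...config,
    maxResults: config.maxResults ?? 50,
    tokenBudget: config.tokenBudget ?? 4096,
  };
}

// Assumption: roughly 4 characters per token; the package may count tokens differently.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep results in order until either limit would be exceeded.
function applyLimits(texts: string[], maxResults: number, tokenBudget: number): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const text of texts) {
    const cost = estimateTokens(text);
    if (kept.length >= maxResults || used + cost > tokenBudget) break;
    kept.push(text);
    used += cost;
  }
  return kept;
}

const cfg = withDefaults({ db: null, maxResults: 10 });
console.log(cfg.maxResults, cfg.tokenBudget); // 10 4096

// Three 40-character results cost about 10 tokens each; a budget of 25 keeps two.
const sample = ["a".repeat(40), "b".repeat(40), "c".repeat(40)];
console.log(applyLimits(sample, cfg.maxResults, 25).length); // 2
```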
tagWalker.search(query, options?)
Perform a search with the following parameters:
- query: Search query string
- options?: Optional search parameters
  - buckets?: Filter by buckets
  - tags?: Filter by tags
  - provenance?: Filter by provenance ('internal', 'external', 'quarantine', 'all')
Returns: Promise<SearchResult[]>
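The filter options can be read as predicates applied to each candidate record. The following sketch makes several assumptions that the reference above does not confirm: that `buckets` and `tags` are string arrays, that a tag filter matches when any listed tag is present, and that an omitted `provenance` behaves like 'all'. `SearchOptionsSketch`, `RecordSketch`, and `matchesOptions` are hypothetical names.

```typescript
type Provenance = "internal" | "external" | "quarantine" | "all";

// Assumed option shapes; the package's real types may differ.
interface SearchOptionsSketch {
  buckets?: string[];
  tags?: string[];
  provenance?: Provenance;
}

// Hypothetical stored record, used only to illustrate filtering.
interface RecordSketch {
  id: string;
  bucket: string;
  tags: string[];
  provenance: Exclude<Provenance, "all">;
}

// A record passes when it satisfies every filter that was actually supplied.
function matchesOptions(record: RecordSketch, options: SearchOptionsSketch = {}): boolean {
  const { buckets = [], tags = [], provenance = "all" } = options;
  const bucketOk = buckets.length === 0 || buckets.includes(record.bucket);
  const tagOk = tags.length === 0 || tags.some((t) => record.tags.includes(t));
  const provenanceOk = provenance === "all" || provenance === record.provenance;
  return bucketOk && tagOk && provenanceOk;
}

const records: RecordSketch[] = [
  { id: "r1", bucket: "notes", tags: ["AI"], provenance: "internal" },
  { id: "r2", bucket: "web", tags: ["AI"], provenance: "external" },
  { id: "r3", bucket: "notes", tags: ["baking"], provenance: "internal" },
];

const visible = records.filter((r) => matchesOptions(r, { tags: ["AI"], provenance: "internal" }));
console.log(visible.map((r) => r.id)); // [ 'r1' ]
```

Under the same assumptions, the equivalent call against the package would pass the options object as the second argument to `tagWalker.search`.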
tagWalker.ingest(content, tags, source)
Ingest content into the database:
- content: Content string to store
- tags: Array of tags associated with content
- source: Source identifier for the content
Returns: Promise<string> - ID of the ingested content
Comparison: Tag-Walker vs Vector DB
| Feature | Tag-Walker | Vector DB |
|---------|------------|-----------|
| Hardware Requirements | CPU only | GPU recommended |
| Embedding Models | Not required | Required |
| Storage Efficiency | High | Lower (vectors) |
| Search Speed | Fast (FTS + Graph) | Variable |
| Semantic Understanding | Rule-based | Learned |
| Local Operation | Yes | Depends on model |
| Cost | Low | Potentially high |
When to Use Tag-Walker
- ✅ Local-first applications without cloud dependencies
- ✅ CPU-only environments without GPU access
- ✅ Applications requiring deterministic search results
- ✅ Systems with limited computational resources
- ✅ Privacy-focused applications (no external APIs)
- ✅ Fast prototyping without ML model setup
License
Elastic License v2.0 (ELv2)
