syrupdotts

v0.1.0

Published

14 days ago

A variable n-gram powered search engine for intelligent completion and semantic search


justinwhite

search ngram completion autocomplete semantic-search indexing text-search

🍯 Syrup.ts

A high-performance, variable n-gram search engine library for TypeScript and JavaScript. Syrup enables developers to build context-aware autocomplete, semantic completions, and multi-document search with minimal overhead.

✨ Key Features

Variable N-gram Indexing – Deep-index text to provide multi-token, context-sensitive suggestions.
Blazing Fast – Optimized in-memory architecture for sub-millisecond retrieval.
Document Management – Organize data into discrete documents for scoped querying and easy cleanup.
Reference Mapping – Attach metadata or IDs to indexed content for rich search results.
Fully Type-Safe – Written in TypeScript with comprehensive interface definitions.

Demo + Docs

Coming soon. A more rudimentary version of Syrup already powers the search suggestions on my personal portfolio, justinwhite.work.

🚀 Installation

npm install syrup-search

Quick Start

import { SyrupCore } from 'syrup-search';

// Initialize with custom depth
const engine = new SyrupCore({
    caseSensitive: false,
    maxDepth: 3
});

// "Infuse" the engine with content
engine.infuse("the quick brown fox jumps");

// Predict the next tokens
const { completions } = engine.predict({ query: "the quick" });

console.log(completions); 
// Output: ["brown", "brown fox", "brown fox jumps"]

🛠 Usage Patterns

Document-Scoped Search

Manage specific sets of data using SyrupDocumentHandle.

// Create a managed document
const doc = engine.document("user_manual_01", "To restart the device, hold the power button.");

// Add more content to the same document later
doc.infuse("Ensure the power cable is plugged in.");

// Query results will now include document IDs
const results = engine.predict({ query: "power" });
console.log(results.documents); // ["user_manual_01"]

// Remove document and its associated indices
doc.delete();

Metadata & References

Link external IDs or other references to the content you index.

engine.infuse("API documentation", ["link_01", "tag_docs"]);

const results = engine.predict({ query: "API" });
console.log(results.references); // ["link_01", "tag_docs"]

🧠 How it works

Tokenization: Splits input into units (e.g., ["hello", "world"]).
Permutation: Generates n-grams from length 1 up to maxDepth.
Mapping: Each n-gram acts as a key to an array of future token sequences.

Example: "hello world how are you" (maxDepth: 3)

Key: "hello" -> Val: ["world"], ["world how"], ["world how are"]
Key: "hello world" -> Val: ["how"], ["how are"], ["how are you"]
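The three steps above can be sketched in a few lines of TypeScript. This is a minimal illustration of the idea, not Syrup's actual implementation: the names `tokenize`, `buildIndex`, and `lookupCompletions` are hypothetical, and whitespace tokenization with lowercasing is an assumption.

```typescript
// Hypothetical sketch of a variable n-gram index; not Syrup's real internals.
type NgramIndex = Map<string, string[]>;

// Assumption: tokens are lowercased, whitespace-separated words.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

function buildIndex(text: string, maxDepth: number): NgramIndex {
  const tokens = tokenize(text);
  const index: NgramIndex = new Map();

  for (let start = 0; start < tokens.length; start++) {
    // Keys are n-grams of length 1..maxDepth starting at `start`.
    for (let keyLen = 1; keyLen <= maxDepth; keyLen++) {
      const keyEnd = start + keyLen;
      if (keyEnd >= tokens.length) break; // nothing follows this key

      const key = tokens.slice(start, keyEnd).join(" ");
      const bucket = index.get(key) ?? [];

      // Values are the token sequences of length 1..maxDepth that follow the key.
      for (let len = 1; len <= maxDepth && keyEnd + len <= tokens.length; len++) {
        const continuation = tokens.slice(keyEnd, keyEnd + len).join(" ");
        if (!bucket.includes(continuation)) bucket.push(continuation);
      }
      index.set(key, bucket);
    }
  }
  return index;
}

// A lookup is a single Map access on the normalized query.
function lookupCompletions(index: NgramIndex, query: string): string[] {
  return index.get(tokenize(query).join(" ")) ?? [];
}

const index = buildIndex("hello world how are you", 3);
console.log(lookupCompletions(index, "hello"));
// ["world", "world how", "world how are"]
console.log(lookupCompletions(index, "hello world"));
// ["how", "how are", "how are you"]
```

Storing continuations as joined strings keeps the sketch short; a real engine would more likely store token offsets or IDs to avoid duplicating text.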

📊 Performance & Complexity

n = total tokens, d = maxDepth

Indexing Time: O(n * d^2) (Creating grams and updating internal pointers)
Search Time: O(1) (Average case via Hash Map lookup)
Space: O(n * d) (Total index entries stored in memory)
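To make these bounds concrete, the sketch below counts how many key entries and continuation slots a naive variable n-gram indexer would touch for n tokens at a given depth d. `countIndexWork` is a hypothetical helper written for this illustration, and the counting model (every key of length 1..d paired with every following sequence of length 1..d) is an assumption about the scheme described above, not a measurement of Syrup itself.

```typescript
// Hypothetical counting model; it does not call into Syrup.
function countIndexWork(n: number, d: number): { keys: number; continuations: number } {
  let keys = 0;
  let continuations = 0;

  for (let start = 0; start < n; start++) {
    for (let keyLen = 1; keyLen <= d; keyLen++) {
      const keyEnd = start + keyLen;
      if (keyEnd >= n) break; // a key at the very end has no future tokens

      keys += 1; // one key entry per (start, keyLen) pair: at most n * d
      continuations += Math.min(d, n - keyEnd); // up to d continuations per key: at most n * d^2
    }
  }
  return { keys, continuations };
}

console.log(countIndexWork(5, 3));
// { keys: 9, continuations: 18 }
```

Under this model the number of key entries is bounded by n * d and the number of continuation slots by n * d^2, which lines up with the indexing-time figure. The O(n * d) space figure presumably counts key entries, or relies on a more compact way of storing continuations than the naive one counted here.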

🧑‍💻 API Summary

SyrupCoreHandle

The main API you interact with unless you're querying a specific document.

.infuse(content: string, refs?: string[])    // Index text globally
.predict(options: SyrupQueryOptions)         // Query completions/docs
.document(id: string, content: string)       // Create/manage a document
.use(id: string)                             // Access an existing document
.deleteDocument(id: string)                  // Wipe document from index
.getDocumentIds()                            // List all active doc IDs

SyrupDocumentHandle

The scoped API used by documents. You can get a handle from .document() or .use().

.infuse(content: string, refs?: string[])    // Add content to this doc
.predict(options: SyrupQueryOptions)         // Query only this doc's scope
.delete()                                    // Self-destruct document data
