@lumosearch/search

v1.0.2

Published

8 days ago

Browser-first hybrid search engine with candidate pruning, fuzzy ranking, and worker support.


search fuzzy-search full-text-search bm25 inverted-index autocomplete browser-search client-side-search trigram highlighting worker fuse

LumoSearch

Fast, browser-first search engine for local datasets. Fuzzy matching, BM25F ranking, and candidate pruning — no server required.

LumoSearch replaces Fuse-style client-side search with a staged retrieval pipeline: inverted indexes narrow the candidate set quickly, trigrams recover matches despite typos, and only a bounded number of candidates is rescored. The result is accurate, weighted multi-field search that stays fast as your dataset grows.

Features

BM25F ranking — field-weighted scoring with token rarity and length normalization
Fuzzy matching — trigram-based typo tolerance (handles javscrippt -> javascript)
Candidate pruning — token, trigram, and prefix indexes narrow candidates before scoring
Multi-field search — configure per-field weights and search across nested keys
Highlighting — character-level match ranges for each result field
Autocomplete — prefix-aware suggestions with the same scoring pipeline
Synonyms — token-level alias expansion (js -> javascript)
Filters & predicates — exact-match filters and custom predicate functions
Incremental mutations — add, remove, and replace documents without rebuilding
Persistence — snapshot export/import with InMemoryStorage and IndexedDbStorage adapters
Worker support — off-main-thread search via LumoSearchWorker
Hybrid reranking — async searchAsync() with pluggable reranker for semantic/custom scoring
Zero dependencies — ships as pure ES modules
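
To make the "BM25F ranking" feature above more concrete, here is a minimal sketch of the textbook BM25F formula: per-field term frequencies are length-normalized, multiplied by the field weight, summed, and then saturated and scaled by token rarity (IDF). This is not LumoSearch's internal code. The function name `bm25fScore`, the `FieldTokens` type, and the `k1`/`b` values are assumptions chosen for illustration.

```typescript
// Textbook BM25F sketch (assumption: not LumoSearch's actual implementation).
type FieldTokens = Partial<Record<string, string[]>>; // field name -> tokens

interface Bm25fParams {
  weights: Record<string, number>; // per-field weight
  k1: number; // term-frequency saturation
  b: number; // length-normalization strength, 0..1
}

function bm25fScore(
  queryTokens: string[],
  doc: FieldTokens,
  corpus: FieldTokens[],
  params: Bm25fParams
): number {
  const { weights, k1, b } = params;
  const fields = Object.keys(weights);

  // Average token count per field across the corpus.
  const avgLen: Record<string, number> = {};
  for (const f of fields) {
    const total = corpus.reduce((sum, d) => sum + (d[f]?.length ?? 0), 0);
    avgLen[f] = total / Math.max(corpus.length, 1);
  }

  let score = 0;
  for (const term of queryTokens) {
    // Token rarity: how many documents contain the term in any configured field.
    const df = corpus.filter((d) =>
      fields.some((f) => (d[f] ?? []).some((t) => t === term))
    ).length;
    const idf = Math.log(1 + (corpus.length - df + 0.5) / (df + 0.5));

    // Field-weighted, length-normalized term frequency.
    let weightedTf = 0;
    for (const f of fields) {
      const tokens = doc[f] ?? [];
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const norm = 1 - b + b * (tokens.length / (avgLen[f] || 1));
      weightedTf += (weights[f] * tf) / norm;
    }
    score += idf * (weightedTf / (k1 + weightedTf));
  }
  return score;
}

// Hand-tokenized versions of the Quick Start documents.
const corpus: FieldTokens[] = [
  { title: ['javascript', 'patterns'], body: ['reusable', 'design', 'patterns', 'for', 'javascript'] },
  { title: ['typescript', 'handbook'], body: ['core', 'typescript', 'syntax', 'and', 'type', 'system'] },
  { title: ['node', 'in', 'action'], body: ['server', 'side', 'javascript', 'with', 'express'] }
];
const params: Bm25fParams = { weights: { title: 3, body: 1 }, k1: 1.2, b: 0.75 };
const scores = corpus.map((d) => bm25fScore(['javascript'], d, corpus, params));
```

With these inputs the first document scores highest because the term appears in its heavily weighted title, the third scores lower because the term appears only in its body, and the second scores zero.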

Install

npm install @lumosearch/search

Quick Start

import { LumoSearch } from '@lumosearch/search'

const docs = [
  { title: 'JavaScript Patterns', body: 'Reusable design patterns for JavaScript.', category: 'books' },
  { title: 'TypeScript Handbook', body: 'Core TypeScript syntax and type system.', category: 'docs' },
  { title: 'Node.js in Action', body: 'Server-side JavaScript with Express.', category: 'books' }
]

const search = new LumoSearch(docs, {
  keys: [{ name: 'title', weight: 3 }, { name: 'body', weight: 1 }],
  candidateLimit: 250,
  limit: 10
})

const results = search.search('javscrippt paterns')
// => [{ item: { title: 'JavaScript Patterns', ... }, score: 0.94, highlights: [...] }]

API

Constructor

new LumoSearch(docs, {
  keys: ['title', { name: 'body', weight: 0.8 }],
  limit: 10,
  candidateLimit: 250,
  synonyms: { js: ['javascript'], auth: ['authentication'] }
})
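
As a rough illustration of what the `synonyms` option implies, the sketch below expands query tokens with their configured aliases before lookup. The helper `expandTokens` is hypothetical and not part of the LumoSearch API. It also assumes expansion is one-directional (`js` adds `javascript`, but not the reverse), which the library may or may not do.

```typescript
type SynonymMap = Record<string, string[]>;

// Hypothetical helper: adds each token's aliases to the query token set.
// Assumption: expansion is one-directional and keeps the original token.
function expandTokens(tokens: string[], synonyms: SynonymMap): string[] {
  const expanded = new Set<string>();
  for (const token of tokens) {
    expanded.add(token);
    for (const alias of synonyms[token] ?? []) expanded.add(alias);
  }
  return Array.from(expanded);
}

const expandedQuery = expandTokens(['js', 'auth', 'guide'], {
  js: ['javascript'],
  auth: ['authentication']
});
// expandedQuery: ['js', 'javascript', 'auth', 'authentication', 'guide']
```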

Search

// Basic search
const results = search.search('patterns')

// With options: exact-match filter plus a custom predicate
const filtered = search.search('patterns', {
  limit: 5,
  filters: { category: 'books' },
  predicate: (doc) => doc.title.length > 10
})

Autocomplete

const suggestions = search.autocomplete('jav', { limit: 5 })

Async Hybrid Reranking

const results = await search.searchAsync('ui architecture', {
  limit: 5,
  rerankLimit: 10,
  reranker: {
    async rerank({ query, candidates }) {
      // plug in your own semantic/ML reranker here
      return candidates.map((c) => ({ refIndex: c.refIndex, score: c.lexicalScore }))
    }
  }
})
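
The reranker above simply echoes the lexical score. A slightly more useful sketch blends the lexical score with a custom signal. The candidate and result shapes below are inferred from the snippet above and are assumptions, not the library's published types. The sketch also assumes `refIndex` is the document's position in the original collection and that `lexicalScore` is roughly in the 0 to 1 range. The title-overlap signal is only a stand-in for a real semantic model.

```typescript
// Shapes inferred from the example above (assumptions, not official types).
interface RerankCandidate { refIndex: number; lexicalScore: number }
interface RerankInput { query: string; candidates: RerankCandidate[] }
interface RerankOutput { refIndex: number; score: number }
interface TitledDoc { title: string }

// Pure scoring step: alpha * lexical + (1 - alpha) * custom signal.
function blendScores(
  query: string,
  candidates: RerankCandidate[],
  docs: TitledDoc[],
  alpha = 0.7
): RerankOutput[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return candidates.map((c) => {
    // Assumption: refIndex points into the original docs array.
    const doc = docs[c.refIndex];
    const title = doc ? doc.title.toLowerCase() : '';
    // Stand-in for a semantic score: fraction of query terms found in the title.
    const hits = terms.filter((t) => title.indexOf(t) !== -1).length;
    const custom = terms.length > 0 ? hits / terms.length : 0;
    return { refIndex: c.refIndex, score: alpha * c.lexicalScore + (1 - alpha) * custom };
  });
}

// Wraps the scoring step in the { rerank } object shape used by searchAsync().
function makeBlendingReranker(docs: TitledDoc[], alpha = 0.7) {
  return {
    async rerank({ query, candidates }: RerankInput): Promise<RerankOutput[]> {
      return blendScores(query, candidates, docs, alpha);
    }
  };
}

const rerankDocs: TitledDoc[] = [{ title: 'UI Architecture Guide' }, { title: 'Release Notes' }];
const blended = blendScores(
  'ui architecture',
  [{ refIndex: 0, lexicalScore: 0.5 }, { refIndex: 1, lexicalScore: 0.6 }],
  rerankDocs
);
```

Here the first document overtakes the second despite its lower lexical score, because both query terms appear in its title. Keeping the scoring step synchronous and separate from the async wrapper makes it easy to unit test.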

Mutations

search.add({ title: 'New Doc', body: '...', category: 'docs' })
search.removeAt(0)
search.remove((doc) => doc.category === 'archived')
search.setCollection(newDocs)

Persistence

import { InMemoryStorage } from '@lumosearch/search'

// Save and restore
const storage = new InMemoryStorage()
await search.save(storage)
const restored = await LumoSearch.load(storage)

// Snapshot export/import (synchronous)
const snapshot = search.exportSnapshot()
const fromSnap = LumoSearch.fromSnapshot(snapshot)

For browsers, use IndexedDbStorage:

import { IndexedDbStorage } from '@lumosearch/search'

const storage = new IndexedDbStorage({ dbName: 'my-app-search', key: 'docs-index' })
await search.save(storage)

Worker Mode

import { LumoSearchWorker } from '@lumosearch/search/worker'

const worker = new LumoSearchWorker(docs, {
  keys: [{ name: 'title', weight: 3 }, { name: 'body', weight: 1 }]
})

const results = await worker.search('javscrippt paterns')
worker.terminate()

Result Shape

interface SearchResult<T> {
  item: T
  refIndex: number
  score: number
  matchedFields: string[]
  highlights: SearchHighlight[]
  lexicalScore?: number   // present when reranked
  rerankScore?: number    // present when reranked
}

interface SearchHighlight {
  field: string
  value: string
  indices: [number, number][]  // character ranges
}
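
The `indices` ranges can be turned into marked-up text with a small helper. The sketch below assumes each `[start, end]` pair is inclusive at both ends, sorted, and non-overlapping. The result shape above does not state this, so verify it against real output before relying on it. `renderHighlight` is a hypothetical helper, not a library export.

```typescript
interface SearchHighlight {
  field: string;
  value: string;
  indices: [number, number][];
}

// Hypothetical helper. Assumes inclusive, sorted, non-overlapping ranges.
function renderHighlight(h: SearchHighlight, open = '<mark>', close = '</mark>'): string {
  let out = '';
  let cursor = 0;
  for (const [start, end] of h.indices) {
    out += h.value.slice(cursor, start); // unmatched text before the range
    out += open + h.value.slice(start, end + 1) + close; // matched range
    cursor = end + 1;
  }
  return out + h.value.slice(cursor); // trailing unmatched text
}

const marked = renderHighlight({
  field: 'title',
  value: 'JavaScript Patterns',
  indices: [[0, 9], [11, 18]]
});
// marked: '<mark>JavaScript</mark> <mark>Patterns</mark>'
```

If the output is assigned to `innerHTML`, escape each text segment first so document content cannot inject markup.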

How It Works

Index — Normalize and tokenize configured fields. Build token, trigram, and prefix inverted indexes.
Retrieve — For each query, gather candidates from postings (token 4x, prefix 3x, trigram 1x weight).
Prune — Keep only the top candidateLimit candidates.
Score — Rank with BM25F + exact/prefix/phrase/proximity boosts.
Return — Top-k results with highlights.
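
The Index, Retrieve, and Prune steps above can be sketched as a small voting scheme. This is an illustration under assumptions, not the library's code: the 4x/3x/1x weights come from the Retrieve step, but additive voting, unpadded trigrams, the tokenizer, and all function names are hypothetical.

```typescript
type Postings = Map<string, Set<number>>; // key -> document ids
interface Indexes { token: Postings; prefix: Postings; trigram: Postings }

// Weights quoted in the Retrieve step.
const TOKEN_WEIGHT = 4;
const PREFIX_WEIGHT = 3;
const TRIGRAM_WEIGHT = 1;

const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// Sliding 3-character windows (assumption: no padding at token edges).
function trigramsOf(token: string): string[] {
  const grams: string[] = [];
  for (let i = 0; i + 3 <= token.length; i++) grams.push(token.slice(i, i + 3));
  return grams;
}

function addPosting(index: Postings, key: string, docId: number): void {
  let ids = index.get(key);
  if (!ids) {
    ids = new Set<number>();
    index.set(key, ids);
  }
  ids.add(docId);
}

// Index: build token, prefix, and trigram postings for every document token.
function buildIndexes(texts: string[]): Indexes {
  const idx: Indexes = { token: new Map(), prefix: new Map(), trigram: new Map() };
  texts.forEach((text, docId) => {
    for (const tok of tokenize(text)) {
      addPosting(idx.token, tok, docId);
      for (let n = 1; n <= tok.length; n++) addPosting(idx.prefix, tok.slice(0, n), docId);
      for (const g of trigramsOf(tok)) addPosting(idx.trigram, g, docId);
    }
  });
  return idx;
}

// Retrieve + Prune: accumulate weighted votes, keep the top candidateLimit ids.
function retrieveCandidates(query: string, idx: Indexes, candidateLimit: number): number[] {
  const votes = new Map<number, number>();
  const vote = (ids: Set<number> | undefined, weight: number): void => {
    if (!ids) return;
    ids.forEach((id) => votes.set(id, (votes.get(id) ?? 0) + weight));
  };
  for (const tok of tokenize(query)) {
    vote(idx.token.get(tok), TOKEN_WEIGHT);
    vote(idx.prefix.get(tok), PREFIX_WEIGHT);
    for (const g of trigramsOf(tok)) vote(idx.trigram.get(g), TRIGRAM_WEIGHT);
  }
  return Array.from(votes.entries())
    .sort((a, b) => b[1] - a[1] || a[0] - b[0]) // most votes first, then lowest id
    .slice(0, candidateLimit)
    .map(([id]) => id);
}

const sampleIndexes = buildIndexes(['JavaScript Patterns', 'TypeScript Handbook', 'Node.js in Action']);
const pruned = retrieveCandidates('javscrippt paterns', sampleIndexes, 1);
// pruned: [0]  (only 'JavaScript Patterns' survives pruning)
```

Neither misspelled query token matches a token or prefix exactly, so the trigram postings alone bring in the candidates: the first document collects eight shared trigrams, the second collects three (`scr`, `cri`, `rip`), and the third collects none.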

Package Exports

| Export | Description |
|--------|-------------|
| @lumosearch/search | Main entry: LumoSearch, types, persistence adapters |
| @lumosearch/search/worker | LumoSearchWorker for off-main-thread search |
| @lumosearch/search/worker-script | Raw worker script entry for custom bundler setups |

Browser Demo

A static demo is included in examples/browser-demo.

npm run build
npm run demo:serve
# Open http://localhost:4173/examples/browser-demo/

Contributing

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

git clone https://github.com/lumosearch/search.git
cd search
npm install
npm test

License

MIT