@yoch/minisearch

v8.1.0

Node.js full-text search with FrozenMiniSearch and binary index snapshots

In-memory full-text search for Node.js — a fork of MiniSearch by Luca Ongaro, extended for production serving: smaller indexes, faster loads, and a read-only fast path.

Current release: 8.1.0 · install with npm install @yoch/minisearch


Why this fork?

MiniSearch is excellent for building and querying an index in JavaScript. This fork keeps that API for mutable indexing, and adds FrozenMiniSearch for when the index is built once and queried many times:

| | Mutable MiniSearch | FrozenMiniSearch |
|---|---------------------|-------------------|
| Use when | Documents change (add, remove, discard) | Corpus is fixed, or you reload from disk |
| Memory | Maps and nested objects per posting | Flat Uint32Array / Uint8Array postings |
| On disk | toJSON / loadJSON | saveBinary / loadBinary (MSv4 / MSv3) |
| Typical search | Baseline | Often ~20–35% faster p50 on the same corpus (see benchmarks) |
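To make the Memory row concrete, here is a minimal sketch of flattening Map-based postings into typed arrays. The layout (an offsets array plus parallel docIds and termFreqs arrays) and the name flattenPostings are illustrative assumptions, not the fork's actual internal format.

```javascript
// Hypothetical sketch: flatten Map<termIndex, Map<docId, termFreq>> postings
// into parallel typed arrays. This is NOT the library's real layout.
function flattenPostings (postings, termCount) {
  // Count postings first so each array is allocated exactly once.
  let total = 0
  for (const docs of postings.values()) total += docs.size

  // Postings of term t live in the half-open range [offsets[t], offsets[t + 1]).
  const offsets = new Uint32Array(termCount + 1)
  const docIds = new Uint32Array(total)
  const termFreqs = new Uint8Array(total)

  let cursor = 0
  for (let t = 0; t < termCount; t++) {
    offsets[t] = cursor
    const docs = postings.get(t)
    if (docs === undefined) continue
    // Sort by doc id so each term's slice is ordered.
    const entries = [...docs.entries()].sort((a, b) => a[0] - b[0])
    for (const [docId, freq] of entries) {
      docIds[cursor] = docId
      termFreqs[cursor] = Math.min(freq, 255) // clamp to the Uint8 range
      cursor++
    }
  }
  offsets[termCount] = cursor
  return { offsets, docIds, termFreqs }
}

const postings = new Map([
  [0, new Map([[2, 1], [0, 3]])],
  [1, new Map([[1, 300]])]
])
const flat = flattenPostings(postings, 2)
```

Three typed arrays replace one Map per term plus one entry object per posting, which is the kind of saving the table refers to.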

Same BM25 scoring, prefix/fuzzy search, autoSuggest, and query combinators — frozen indexes aim for search ranking parity with addAll + freeze() when built with the same options. Term frequencies are stored as Uint8 (max 255 per document/field); extreme repetition can cause a small score drift versus the mutable index.


Quick start

npm install @yoch/minisearch
# pre-releases: npm install @yoch/minisearch@beta

One-shot frozen index (no mutable step):

import { FrozenMiniSearch } from '@yoch/minisearch'

const options = { fields: ['title', 'text'], storeFields: ['title'] }

const index = FrozenMiniSearch.fromDocuments(documents, options)
index.search('ishmael', { prefix: true })
index.autoSuggest('zen')

// Persist and reload
const buf = index.saveBinary()
const loaded = FrozenMiniSearch.loadBinary(buf, options)

Mutable index, then freeze (incremental build):

import MiniSearch, { FrozenMiniSearch } from '@yoch/minisearch'

const ms = new MiniSearch({ fields: ['title', 'text'] })
ms.addAll(documents)

const frozen = ms.freeze()   // immutable snapshot
const buf = frozen.saveBinary()

Import styles (ESM and CommonJS):

// ESM
import MiniSearch, { FrozenMiniSearch, buildFrozenFromDocuments } from '@yoch/minisearch'

// CommonJS
const MiniSearch = require('@yoch/minisearch')
const { FrozenMiniSearch } = require('@yoch/minisearch')

Pick the right API

| Goal | API |
|------|-----|
| Live index that changes over time | MiniSearch → freeze() when you need read-only serving |
| Fixed corpus, build frozen directly | FrozenMiniSearch.fromDocuments(documents, options) |
| Build doc-by-doc (no documents[] buffer) | createFrozenIndexBuilder(options) → .add(doc) → freezeFrozenIndexBuilder(builder) |
| Async stream of documents | FrozenMiniSearch.fromAsyncIterable(iterable, options) |
| Load a snapshot from disk | FrozenMiniSearch.loadBinary(buffer, options) |
| Custom assembly pipeline | buildFrozenFromDocuments, assembleFrozen, freezeFromMiniSearch |

fromDocuments matches new MiniSearch(opts).addAll(docs).freeze() for search ranking on the same corpus and options (fields, tokenize, processTerm, …). Frozen indexes do not support add / remove.

External corpus (e.g. lookup by id after search): keep full rows in your own store (dataCache, DB, etc.) and use minimal storeFields (often ['id'] only) so the frozen index does not duplicate payload text:

import { createFrozenIndexBuilder, freezeFrozenIndexBuilder } from '@yoch/minisearch'

function buildFrozenIndexFromRows (rows, options) {
  const builder = createFrozenIndexBuilder(options, {
    estimatedDocumentCount: rows.length
  })
  for (let i = 0; i < rows.length; i++) {
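    // buildIndexDocument is an application-defined helper (not part of the library)
    // that maps a row to the document shape you want indexed.
    builder.add(buildIndexDocument(rows[i], i))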
  }
  return freezeFrozenIndexBuilder(builder)
}

// After search: enrich from your store — frozen.getStoredFields(res.id) or dataCache[type][res.id]
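As an illustration of the enrichment step, the sketch below joins search hits against an external store. The hit shape ({ id, score }) follows upstream MiniSearch results; the rowsById map and the enrichResults helper are hypothetical names for your own code, and the hits array is stand-in data.

```javascript
// Hypothetical helper: join search hits with full rows kept outside the index.
function enrichResults (results, rowsById) {
  const enriched = []
  for (const hit of results) {
    const row = rowsById.get(hit.id)
    if (row === undefined) continue // row no longer in the store; skip the hit
    enriched.push({ ...row, score: hit.score })
  }
  return enriched
}

// Stand-in data: in a real app `hits` would come from frozen.search(query).
const rowsById = new Map([
  [1, { id: 1, title: 'Moby Dick', text: 'Call me Ishmael...' }],
  [2, { id: 2, title: 'Zen and the Art of Motorcycle Maintenance', text: '...' }]
])
const hits = [{ id: 2, score: 4.2 }, { id: 99, score: 1.0 }]
const enriched = enrichResults(hits, rowsById)
```

Keeping payload text in rowsById means the index only needs to store ids, so the snapshot stays small.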

Async stream (no intermediate array; documents are indexed as they arrive):

import { createReadStream } from 'node:fs'
import { parse } from 'csv-parse'
import { FrozenMiniSearch } from '@yoch/minisearch'

async function buildFromCsv (path, options) {
  async function * documents () {
    const parser = createReadStream(path).pipe(parse({ columns: true }))
    for await (const row of parser) {
      yield { id: row.cis, denomination: row.denomination, /* … */ }
    }
  }
  return FrozenMiniSearch.fromAsyncIterable(documents(), options)
}

For a sync iterable (for...of on an array or generator), use the builder directly:

import { createFrozenIndexBuilder, freezeFrozenIndexBuilder } from '@yoch/minisearch'

const builder = createFrozenIndexBuilder(options)
for (const doc of documentGenerator()) {
  builder.add(doc)
}
const frozen = freezeFrozenIndexBuilder(builder)

estimatedDocumentCount in the second argument to createFrozenIndexBuilder pre-allocates per-document arrays when the final size is known; internal buffers are trimmed to the actual count on freeze if the hint was too large.
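The effect of a size hint can be sketched in isolation. The FieldLengthBuffer class below is a hypothetical stand-in, not the fork's builder: it pre-allocates from the hint, grows if the hint was too small, and trims on finalize if it was too large.

```javascript
// Hypothetical sketch of a size hint. Not the library's actual code.
class FieldLengthBuffer {
  constructor (estimatedDocumentCount = 16) {
    this.lengths = new Uint32Array(Math.max(1, estimatedDocumentCount))
    this.count = 0
  }

  add (fieldLength) {
    if (this.count === this.lengths.length) {
      // Hint too small: double the capacity and copy existing values.
      const grown = new Uint32Array(this.lengths.length * 2)
      grown.set(this.lengths)
      this.lengths = grown
    }
    this.lengths[this.count++] = fieldLength
  }

  finalize () {
    // slice() copies, so the oversized backing buffer can be garbage-collected.
    return this.lengths.slice(0, this.count)
  }
}

const buffer = new FieldLengthBuffer(1000) // hint larger than the real corpus
for (const len of [12, 7, 30]) buffer.add(len)
const trimmed = buffer.finalize()
```

An accurate hint avoids the repeated grow-and-copy steps; an oversized one only costs temporary memory until finalize.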


FrozenMiniSearch in a bit more detail

  • freeze() — snapshot a mutable index into compact typed postings + a radix tree keyed by term index.
  • fromDocuments() — build that structure in one pass (skips nested Map postings and radix cloning at freeze time).
  • createFrozenIndexBuilder() — same output without a temporary documents[] array; finalize with freezeFrozenIndexBuilder(builder) (or assembleFrozen(builder.freezeParams()) for custom assembly).
  • fromAsyncIterable() — async document stream (e.g. CSV parser) into a frozen index; equivalent to builder + for await + freezeFrozenIndexBuilder.
  • saveBinary() / loadBinary(): MSv4 (sparse multi-field, Uint16 doc ids when possible) or MSv3 (single-field dense, Uint32 doc ids). MSv1/MSv2 are not supported; re-save older snapshots. Field names are stored in the snapshot, so fields in the loadBinary options is optional (if provided, it must match exactly). Custom tokenize / processTerm functions are not stored; pass the same functions at load time if you customized them. storeFields data is embedded in the snapshot.
  • Term frequencies — stored as Uint8 (max 255 per doc/term); only affects scores for extreme term repetition.
  • frozenMemoryBreakdown() — introspect postings, radix tree, and stored-field footprint (estimates only; not exact heap accounting).
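The "Uint16 doc ids when possible" note comes down to picking the narrowest typed array that can hold every id. The sketch below assumes dense ids in the range [0, documentCount); the helper name and the exact threshold are assumptions, since the rule the MSv4 writer applies is not spelled out here.

```javascript
// Hypothetical sketch: choose the narrowest typed array for doc ids.
// Assumes dense ids 0..documentCount-1; the real MSv4 rule may differ.
function allocateDocIds (documentCount, postingCount) {
  const fitsUint16 = documentCount <= 0x10000 // ids 0..65535
  return fitsUint16 ? new Uint16Array(postingCount) : new Uint32Array(postingCount)
}

const small = allocateDocIds(50000, 1000000) // 2 bytes per posting
const large = allocateDocIds(70000, 1000000) // 4 bytes per posting
```

For the same number of postings, the Uint16 variant halves the bytes spent on doc ids.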

Mutable index → frozen: prefer a fixed corpus. If you used discard() on a MiniSearch index, run vacuum() before freeze() to shrink the snapshot; search parity is still expected without vacuum, but the binary may retain sparse slots.

Advanced API (assembleFrozen, freezeFromMiniSearch, FrozenIndexBuilder) is for custom pipelines — most apps should use fromDocuments, freeze(), or the builder helpers above.

Advanced exports:

import {
  FrozenMiniSearch,
  createFrozenIndexBuilder,
  freezeFrozenIndexBuilder,
  FrozenIndexBuilder,
  type FrozenIndexBuilderHints,
  buildFrozenFromDocuments,
  assembleFrozen,
  freezeFromMiniSearch,
  frozenMemoryBreakdown
} from '@yoch/minisearch'

MiniSearch (mutable)

Full upstream-style API: field boosts, fuzzy/prefix, nested queries, AND / OR / AND_NOT, filters, autoSuggest, vacuum after discard, etc.

import MiniSearch from '@yoch/minisearch'

const miniSearch = new MiniSearch({ fields: ['title', 'text'] })
miniSearch.addAll(documents)
miniSearch.search('zen art motorcycle')

TypeScript definitions: dist/es/index.d.ts.


FrozenMiniSearch — optimizations

Already in MSv3 / MSv4 (8.0.0+)

| Area | Change | Effect |
|------|--------|--------|
| Format | MSv3 replaces MSv1/MSv2 (breaking) | CRC32 payload check; binary field names, ids, stored fields, term tree |
| Binary load | Structural validation in decodeFrozenSnapshot / validateFrozenSnapshot | Corrupt snapshots fail fast with Invalid frozen index: … |
| loadBinary | fields optional (embedded in snapshot); if provided, must match exactly | Simpler reload; no silent field subset |
| saveBinary | Single pre-allocated buffer | Lower peak memory while serializing |
| Search | Per-query cache for fieldTermDataFor(termIndex) | Fewer allocations on prefix/fuzzy queries |
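For readers unfamiliar with checksum-guarded snapshots, here is a minimal sketch of a CRC32 payload check. The CRC-32 routine is the standard IEEE one; the framing (a 4-byte little-endian checksum in front of the payload), the helper names, and the error text are assumptions for illustration and do not describe the real MSv3/MSv4 header.

```javascript
// Standard CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320), table-driven.
const CRC_TABLE = new Uint32Array(256)
for (let n = 0; n < 256; n++) {
  let c = n
  for (let k = 0; k < 8; k++) c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1)
  CRC_TABLE[n] = c >>> 0
}

function crc32 (bytes) {
  let crc = 0xFFFFFFFF
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xFF] ^ (crc >>> 8)
  }
  return (crc ^ 0xFFFFFFFF) >>> 0
}

// Hypothetical framing: [4-byte little-endian CRC][payload].
function wrapPayload (payload) {
  const framed = new Uint8Array(4 + payload.length)
  new DataView(framed.buffer).setUint32(0, crc32(payload), true)
  framed.set(payload, 4)
  return framed
}

function unwrapPayload (framed) {
  const view = new DataView(framed.buffer, framed.byteOffset, framed.byteLength)
  const expected = view.getUint32(0, true)
  const payload = framed.subarray(4)
  // Fail before any decoding work is done on corrupt bytes.
  if (crc32(payload) !== expected) throw new Error('checksum mismatch (illustrative message)')
  return payload
}

const framed = wrapPayload(new TextEncoder().encode('123456789'))
```

Verifying the checksum before structural decoding is what lets a corrupt file fail fast instead of producing a subtly broken index.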

Measure regressions with benchmarks/ (freezeMs, saveBinary, loadBinary, search p50, heap frozen).
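If you want a quick p50 figure outside the repository's benchmark scripts, a nearest-rank percentile over timed runs is enough for a rough comparison. The helpers below are a generic sketch, not the harness under benchmarks/; runQuery is a placeholder for a call such as () => index.search('zen').

```javascript
// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile (samples, p) {
  if (samples.length === 0) throw new RangeError('no samples')
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil(p * sorted.length))
  return sorted[rank - 1]
}

// Time a query function repeatedly and report the median latency.
function measureP50 (runQuery, iterations = 200) {
  const samples = []
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint()
    runQuery()
    samples.push(Number(process.hrtime.bigint() - start) / 1e6) // ns to ms
  }
  return percentile(samples, 0.5)
}

// Stand-in workload so the sketch runs without an index.
const p50 = measureP50(() => JSON.parse('{"stand":"in"}'))
```

Run the same query mix against the mutable and frozen index on the same machine, and discard the first iterations if JIT warm-up matters for your comparison.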

Suggested follow-ups (not implemented yet)

| Area | Topic | Idea | Trade-off |
|------|-------|------|-----------|
| Format | Term dictionary | Drop runtime _terms[] duplicate at rest | Saves heap; more complex save path |
| API | loadBinaryAsync | Chunked/async load like loadJSONAsync | Better cold start on huge indexes |
| API | Input types | Accept Uint8Array as well as Buffer on loadBinary | Broader runtime support |
| Build | freeze / builder | One-pass posting flatten with size estimate | Faster freeze on very large corpora |
| Search | Wildcard | Iterate only active document slots after dense remap | Faster wildcard after many discards |
| Search | Hot path | Direct subarray posting access in aggregateTerm | Lower GC; invasive |

Intentionally deferred: embedding tokenize / processTerm in the snapshot. Raising the Uint8 term-frequency cap needs a new postings encoding.
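To see why the Uint8 cap rarely changes ranking, consider the term-frequency part of BM25, which saturates as the count grows. The sketch below uses the textbook saturation formula with illustrative parameters (k = 1.2, b = 0.7 are assumptions here; check the library's defaults) and leaves out the IDF factor, which the cap does not affect.

```javascript
// Term-frequency saturation component of BM25. Parameters are illustrative
// assumptions; the IDF factor is omitted because the cap does not touch it.
function bm25TermWeight (termFreq, fieldLength, avgFieldLength, k = 1.2, b = 0.7) {
  const norm = 1 - b + b * (fieldLength / avgFieldLength)
  return (termFreq * (k + 1)) / (termFreq + k * norm)
}

const UINT8_MAX = 255
const capTermFreq = (termFreq) => Math.min(termFreq, UINT8_MAX)

// A term that occurs 300 times in a field of average length.
const exact = bm25TermWeight(300, 1000, 1000)
const capped = bm25TermWeight(capTermFreq(300), 1000, 1000)
const drift = exact - capped
```

Because the weight is already close to its ceiling of k + 1 at 255 occurrences, the drift is well under 1% of the term weight in this example.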

For contributor-oriented notes, see DESIGN_DOCUMENT.md — FrozenMiniSearch.


Benchmarks

Reproducible comparisons (heap, load time, search latency) live under benchmarks/:

npm run benchmark:compare    # terminal report
npm run benchmark:diff       # vs versioned baseline

Development

npm install
npm test
npm run build

Use npm run for scripts (Yarn 1.x on Node 22 prints url.parse deprecation noise when invoking yarn test / yarn build).

Publish stable (updates npm latest):

npm run release:stable

Publish a pre-release (dist-tag beta only):

npm run release:beta

Requirements: a Node.js runtime with ES2018+ language support. This fork ships no browser UMD/CDN build (Node-only ESM + CJS).


Changelog & credits

See CHANGELOG.md.

  • MiniSearch: Luca Ongaro (MIT)
  • This fork (yoch/minisearch): FrozenMiniSearch, MSv4/MSv3 binary snapshots, shared scoring refactor

Upstream docs: MiniSearch site · intro article