smgen-search

v0.0.1

Zero-deps Node.js Bloom-filter–based full-text search and indexing for Markdown files.

Features

  • Approximate full-text search using Bloom filters
  • Phrase matching, n-grams, prefixes, and fuzzy-word features
  • Custom binary index for fast loading
  • Zero runtime dependencies in JavaScript
  • Prebuilt index files are a fraction of the size of the corpus
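
To make the "approximate" part concrete, here is a minimal Bloom filter sketch. It is not smgen-search's actual implementation: the class name, the FNV-1a hash, and the double-hashing scheme are all assumptions chosen for illustration. It sizes the bit array from an expected item count and a target false-positive rate using the standard formulas, and shows why lookups can return false positives but never false negatives.

```javascript
// Illustrative Bloom filter (hypothetical names; not the package's code).

// Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
function bloomParams(n, p) {
  const m = Math.max(8, Math.ceil((-n * Math.log(p)) / (Math.LN2 * Math.LN2)));
  const k = Math.max(1, Math.round((m / n) * Math.LN2));
  return { m, k };
}

// 32-bit FNV-1a with a seed mixed in (an arbitrary hash choice for the sketch).
function fnv1a(str, seed) {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

class BloomFilter {
  constructor(expectedItems, errorRate) {
    const { m, k } = bloomParams(expectedItems, errorRate);
    this.m = m; // number of bits
    this.k = k; // number of hash positions per item
    this.bits = new Uint8Array(Math.ceil(m / 8));
  }

  // Double hashing: position_i = (h1 + i * h2) mod m.
  positions(item) {
    const h1 = fnv1a(item, 0);
    const h2 = (fnv1a(item, 0x9e3779b9) | 1) >>> 0; // force an odd, unsigned step
    const out = [];
    for (let i = 0; i < this.k; i++) out.push((h1 + i * h2) % this.m);
    return out;
  }

  add(item) {
    for (const pos of this.positions(item)) this.bits[pos >> 3] |= 1 << (pos & 7);
  }

  // True means "possibly present"; false means "definitely absent".
  has(item) {
    return this.positions(item).every(
      (pos) => (this.bits[pos >> 3] & (1 << (pos & 7))) !== 0
    );
  }
}

const filter = new BloomFilter(1000, 0.08);
filter.add('markdown');
console.log(filter.has('markdown')); // true: added items are always found
```

Because each document only stores a bit array rather than its words, an index built this way can stay much smaller than the text it covers, at the cost of occasional false matches.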

Requirements

  • Node.js (v14 or later)
  • yq (only needed at development time, for building indexes)

Installation

npm i -g smgen-search

Usage

1. Build the index

Scan a directory of Markdown files and write the binary index (search.bin by default):

smgen-search build-index PAGES_DIR INDEX_FILE

2. Search the index

Run a query against the generated index:

smgen-search search "search terms..."

3. Example usage

smgen build-index path/to/markdown/files
smgen search "Hello, world!"
5.357 michael-yin.wagtail-whoosh
3.299 springload.wagtailembedder
3.250 treasure-data.pandas-td
2.904 luckydonald.pytgbot
2.904 nathancatania.ryurest
2.857 harrislapiroff.wagtail-foliage
2.799 astronouth7303.gpotato
2.786 xtream1101.cutil

4. Browser (web) usage

You can also perform searches directly in a web page by serving the generated search.bin as a static asset. Include SearchReader.mjs as an ES module (directly or via your bundler), then load and query the index:

<script type="module">
  import { SearchReader } from './SearchReader.mjs';

  async function initSearch() {
    // Fetch the binary index from your static asset server
    const resp = await fetch('/search.bin');
    const buffer = await resp.arrayBuffer();
    const reader = new SearchReader(buffer);

    // Perform a search (threshold parameter is optional)
    const results = reader.search('your search terms here', 0.00);
    // Output top 10 results
    console.log(results.slice(0, 10));
  }

  initSearch().catch(console.error);
</script>

Commands

smgen-search build-index

Alias: smgen bi

Build the search index and store it in a file.

smgen-search build-index PAGES_DIR INDEX_FILE
  • PAGES_DIR - Directory to scan for pages.
  • INDEX_FILE - File to store the index.

PAGES_DIR and INDEX_FILE can also be provided as environment variables.

smgen-search search

Alias: smgen s

Search the index and return the results.

smgen-search search "search terms..."

INDEX_FILE can also be provided as an environment variable.

Environment variables

The following environment variables can be used to configure the program in CLI mode:

  • INDEX_FILE: path to the index file (default: search.bin)
  • PAGES_DIR: directory to scan for pages
  • BLOOM_ERROR_RATE: desired false-positive rate for Bloom filters (default: 0.08)
  • MIN_NGRAMS: minimum n-gram length, in words, to index (default: 2)
  • MAX_NGRAMS: maximum n-gram length, in words, to index (default: 3)
  • MIN_PREFIX: minimum prefix char length to index (default: 3)
  • MAX_PREFIX: maximum prefix char length to index (default: 5)
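
As a rough illustration of how these settings might fit together, the sketch below reads them from the environment with the defaults listed above, then derives the features a document could contribute to its Bloom filter. The function names readConfig and extractFeatures are hypothetical, and the tokenization (lowercasing, splitting on non-letter and non-digit characters) is an assumption; the package's real rules may differ.

```javascript
// Read the documented settings, falling back to the documented defaults.
// readConfig is a hypothetical helper, not part of smgen-search's API.
function readConfig(env = process.env) {
  const num = (name, fallback) => {
    const raw = env[name];
    if (raw === undefined || raw.trim() === '') return fallback;
    const value = Number(raw);
    return Number.isFinite(value) ? value : fallback;
  };
  return {
    indexFile: env.INDEX_FILE || 'search.bin',
    pagesDir: env.PAGES_DIR, // no documented default
    bloomErrorRate: num('BLOOM_ERROR_RATE', 0.08),
    minNgrams: num('MIN_NGRAMS', 2),
    maxNgrams: num('MAX_NGRAMS', 3),
    minPrefix: num('MIN_PREFIX', 3),
    maxPrefix: num('MAX_PREFIX', 5),
  };
}

// One plausible reading of the n-gram and prefix settings (an assumption):
// single words, word n-grams of minNgrams..maxNgrams words, and word
// prefixes of minPrefix..maxPrefix characters.
function extractFeatures(text, cfg) {
  const words = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
  const features = new Set(words);
  for (let n = cfg.minNgrams; n <= cfg.maxNgrams; n++) {
    for (let i = 0; i + n <= words.length; i++) {
      features.add(words.slice(i, i + n).join(' '));
    }
  }
  for (const word of words) {
    // Stop one character short of the word: the full word is already a feature.
    const longest = Math.min(cfg.maxPrefix, word.length - 1);
    for (let len = cfg.minPrefix; len <= longest; len++) {
      features.add(word.slice(0, len));
    }
  }
  return features;
}

const cfg = readConfig({}); // empty environment: all defaults
console.log([...extractFeatures('Hello wide world', cfg)]);
```

Under this reading, raising MAX_NGRAMS or MAX_PREFIX adds more features per document, which improves phrase and prefix matching but makes each filter larger for the same error rate.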

Binary format

The custom binary index format consists of:

  • File header (4 bytes): ASCII SRCH
  • Four 4-byte little-endian integers, in order:
    • MIN_NGRAMS
    • MAX_NGRAMS
    • MIN_PREFIX
    • MAX_PREFIX
  • One or more document chunks, each:
    • 4 bytes: chunk header ASCII SCHK
    • 4 bytes (LE): title length, followed by the title UTF-8 bytes
    • 4 bytes (LE): path length, followed by the path UTF-8 bytes (without the .md extension)
    • 4 bytes (LE): filter length, followed by the raw Bloom filter bits
  • File trailer (4 bytes): ASCII HCRS
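
The layout above can be sketched as a small encoder and parser. This is an illustrative reading of the format, not the package's own reader: the function names encodeIndex and parseIndex are hypothetical, and the code assumes that every length field counts bytes (including the filter length) and that the integers are unsigned.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Serialize settings and documents into the layout described above.
// Each doc is { title: string, path: string, filter: Uint8Array }.
function encodeIndex(settings, docs) {
  const parts = [];
  const u32 = (n) => {
    const bytes = new Uint8Array(4);
    new DataView(bytes.buffer).setUint32(0, n, true); // little-endian
    return bytes;
  };
  const field = (bytes) => parts.push(u32(bytes.length), bytes);

  parts.push(encoder.encode('SRCH'));
  parts.push(u32(settings.minNgrams), u32(settings.maxNgrams));
  parts.push(u32(settings.minPrefix), u32(settings.maxPrefix));
  for (const doc of docs) {
    parts.push(encoder.encode('SCHK'));
    field(encoder.encode(doc.title));
    field(encoder.encode(doc.path));
    field(doc.filter);
  }
  parts.push(encoder.encode('HCRS'));

  const out = new Uint8Array(parts.reduce((sum, p) => sum + p.length, 0));
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

// Parse a Uint8Array in the same layout back into settings and documents.
function parseIndex(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 0;
  const tag = () => {
    const text = decoder.decode(bytes.subarray(offset, offset + 4));
    offset += 4;
    return text;
  };
  const u32 = () => {
    const n = view.getUint32(offset, true); // little-endian
    offset += 4;
    return n;
  };
  const field = () => {
    const length = u32();
    const slice = bytes.subarray(offset, offset + length);
    offset += length;
    return slice;
  };

  if (tag() !== 'SRCH') throw new Error('missing SRCH file header');
  const settings = { minNgrams: u32(), maxNgrams: u32(), minPrefix: u32(), maxPrefix: u32() };
  const docs = [];
  let marker = tag();
  while (marker === 'SCHK') {
    const title = decoder.decode(field());
    const path = decoder.decode(field());
    const filter = field();
    docs.push({ title, path, filter });
    marker = tag();
  }
  if (marker !== 'HCRS') throw new Error('missing HCRS file trailer');
  return { settings, docs };
}
```

In a browser, the ArrayBuffer returned by fetch would be wrapped first, for example parseIndex(new Uint8Array(buffer)). Storing the n-gram and prefix settings in the header lets a reader derive query features the same way the index was built, without any separate configuration.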

Development

  • Contributions and improvements are welcome