jaccard-suggest

v1.0.0

Fast Jaccard-similarity based auto-suggestion for JS/TS

📦 jaccard-suggest

Fast Jaccard-similarity based auto-suggestion for JavaScript & TypeScript.
Useful for search, autocomplete, spell-check, or fuzzy matching tasks.


✨ Features

  • 🔍 Suggest similar items using Jaccard similarity ([0,1] score).
  • ⚡ Maintains an inverted index for efficient lookup.
  • 📝 Supports add, remove, update operations.
  • 🔧 Configurable tokenizer, minScore, topK results.
  • 🛑 Built-in English stopword list (customizable).
  • ✅ Fully typed (TypeScript).

📥 Installation

npm install jaccard-suggest

or

yarn add jaccard-suggest

🚀 Usage

import JaccardSuggester from "jaccard-suggest";

// Initialize with some items
const suggester = new JaccardSuggester([
  "apple pie",
  "banana smoothie",
  "chocolate cake",
  "apple juice",
]);

console.log("Total items:", suggester.size()); 
// -> 4

// Find suggestions
console.log(suggester.suggest("apple"));
/*
[
  { item: { id: '0', text: 'apple pie' }, score: 0.5 },
  { item: { id: '3', text: 'apple juice' }, score: 0.5 }
]
*/

// Update an item
suggester.update("1", "banana milkshake");

// Remove an item
suggester.remove("2");

// Suggest again
console.log(suggester.suggest("banana"));

⚙️ API

Constructor

new JaccardSuggester(data?: (string | T)[], options?: Options<T>)
  • data: initial items (strings or custom objects).
  • options:
    • tokenizer?: (s: string) => string[] → split text into tokens (default: lowercased words & numbers).
    • stopWords?: Set<string> → stopwords to ignore (default: built-in English stopwords).
    • minScore?: number → minimum Jaccard similarity score (default: 0).
    • topK?: number → maximum number of results to return (default: 5).
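
As an illustration of the tokenizer option, here is a minimal sketch of a custom tokenizer that emits character trigrams instead of whole words. The helper name trigramTokenizer is hypothetical and not part of jaccard-suggest; the only assumption taken from the docs above is the signature (s: string) => string[]. With word-level tokens, a query like "man" shares nothing with "superman", whereas trigrams give the two strings a common token.

```typescript
// Hypothetical helper (not shipped with jaccard-suggest): split text into
// lowercase character trigrams so that partial words can overlap.
function trigramTokenizer(s: string): string[] {
  // Keep only letters and digits, collapsing everything else into spaces.
  const normalized = s.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
  const grams = new Set<string>();
  normalized.split(" ").forEach((word) => {
    if (word.length === 0) return;
    if (word.length <= 3) {
      // Short words are kept whole.
      grams.add(word);
      return;
    }
    for (let i = 0; i + 3 <= word.length; i++) {
      grams.add(word.slice(i, i + 3));
    }
  });
  return Array.from(grams);
}

console.log(trigramTokenizer("superman"));
// -> [ 'sup', 'upe', 'per', 'erm', 'rma', 'man' ]
```

Assuming the constructor accepts it as documented, such a function would be passed as { tokenizer: trigramTokenizer }. Trigram sets are larger than word sets, so scores for the same pair of strings will generally differ from the word-based ones.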

Methods

  • .add(d: string | T): T → Add a new item (string or object with { id, text, meta? }).
  • .remove(id: string): boolean → Remove an item by its id.
  • .update(id: string, text: string): boolean → Update the text of an existing item.
  • .suggest(query: string, options?: Options): SuggestResult[] → Return the items most similar to the query, sorted by Jaccard score.
  • .size(): number → Get total number of items in the index.

Types

export interface Item {
  id: string;
  text: string;
  meta?: unknown;
}

export interface SuggestResult<T = Item> {
  item: T;
  score: number; // Jaccard score in [0,1]
}

export interface Options<T = Item> {
  tokenizer?: (s: string) => string[];
  stopWords?: Set<string>;
  minScore?: number;
  topK?: number;
}

🛑 Stopwords

Stopwords are common words (like the, is, in, at) that often don’t add meaning for search/suggestions. This library ships with a default English stopword list (defaultStopWords), which is used automatically.

Example with Custom Stopwords

import JaccardSuggester, { defaultStopWords } from "jaccard-suggest";

const customStopWords = new Set([...defaultStopWords, "pie", "juice"]);

const suggester = new JaccardSuggester(
  ["apple pie", "apple juice", "apple tree"],
  { stopWords: customStopWords }
);

console.log(suggester.suggest("apple"));
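// -> all three items still match on "apple"; with "pie" and "juice" dropped,
//    "apple pie" and "apple juice" reduce to { "apple" } and score 1,
//    while "apple tree" scores 0.5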

Why Stopwords?

  1. Improves relevance by ignoring filler words.
  2. Reduces noise in similarity scoring.
  3. Keeps inverted index smaller.
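
To make the first two points concrete, here is a small self-contained sketch that scores the same pair of strings with and without stopword filtering. The names toTokenSet, jaccardScore and demoStopWords are hypothetical, and the stopword set is a tiny illustrative one, not the library's defaultStopWords.

```typescript
// Illustrative only: a tiny stopword set, not the library's defaultStopWords.
const demoStopWords = new Set<string>(["the", "is", "in", "at", "a"]);

// Lowercase the text, split it into words/numbers, and drop stopwords.
function toTokenSet(text: string, stopWords: Set<string>): Set<string> {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return new Set(words.filter((w) => !stopWords.has(w)));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 for two empty sets.
function jaccardScore(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  const unionSize = a.size + b.size - shared;
  return unionSize === 0 ? 0 : shared / unionSize;
}

const queryText = "apple pie";
const itemText = "the apple pie is in the oven";
const noStop = new Set<string>();

const rawScore = jaccardScore(
  toTokenSet(queryText, noStop),
  toTokenSet(itemText, noStop)
);
const filteredScore = jaccardScore(
  toTokenSet(queryText, demoStopWords),
  toTokenSet(itemText, demoStopWords)
);

console.log(rawScore.toFixed(2), filteredScore.toFixed(2));
// -> 0.33 0.67
```

Without filtering, the filler words "the", "is" and "in" inflate the union and pull the score down to 2/6. With them removed, the union shrinks to three tokens and the score rises to 2/3.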

🔬 How it Works

  1. Each item’s text is tokenized into words (e.g., "apple pie" → { "apple", "pie" }).

  2. Stopwords are removed ("the", "is", etc.).

  3. An inverted index maps each token → items that contain it.

    Example: "apple" → items [0, 3].

  4. When you search:

    • The query is tokenized.

    • Candidate items are retrieved from the inverted index.

    • Each candidate is scored with Jaccard similarity:

      score = |A ∩ B| / |A ∪ B|

      where A is the set of query tokens and B is the set of the candidate item's tokens.

  5. Results are sorted by score and returned.
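
The steps above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the library's source: MiniIndex, Scored, tokenize and the four-word STOP set are all hypothetical names, and the defaults (minScore 0, topK 5) simply mirror the documented ones.

```typescript
// A sketch of the steps above, not jaccard-suggest's implementation.
type Scored = { id: string; text: string; score: number };

// Tiny illustrative stopword list (assumption, not the library default).
const STOP = new Set<string>(["the", "is", "in", "at"]);

// Steps 1 and 2: tokenize into lowercase words/numbers and drop stopwords.
function tokenize(text: string): Set<string> {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return new Set(words.filter((w) => !STOP.has(w)));
}

class MiniIndex {
  private docs = new Map<string, { text: string; tokens: Set<string> }>();
  // Step 3: inverted index, token -> ids of items containing it.
  private postings = new Map<string, Set<string>>();

  add(id: string, text: string): void {
    const tokens = tokenize(text);
    this.docs.set(id, { text, tokens });
    tokens.forEach((t) => {
      if (!this.postings.has(t)) this.postings.set(t, new Set<string>());
      this.postings.get(t)!.add(id);
    });
  }

  // Step 4: collect candidates via the index, then score each with Jaccard.
  suggest(query: string, minScore = 0, topK = 5): Scored[] {
    const q = tokenize(query);
    const shared = new Map<string, number>(); // id -> |A ∩ B|
    q.forEach((t) => {
      const ids = this.postings.get(t);
      if (ids) ids.forEach((id) => shared.set(id, (shared.get(id) || 0) + 1));
    });
    const results: Scored[] = [];
    shared.forEach((inter, id) => {
      const doc = this.docs.get(id)!;
      const score = inter / (q.size + doc.tokens.size - inter);
      if (score >= minScore) results.push({ id, text: doc.text, score });
    });
    // Step 5: highest score first, truncated to topK.
    return results.sort((a, b) => b.score - a.score).slice(0, topK);
  }
}

const mini = new MiniIndex();
["apple pie", "banana smoothie", "chocolate cake", "apple juice"].forEach(
  (text, i) => mini.add(String(i), text)
);

console.log(mini.suggest("apple"));
// -> [ { id: '0', text: 'apple pie', score: 0.5 },
//      { id: '3', text: 'apple juice', score: 0.5 } ]
```

Only items that share at least one token with the query are ever scored, which is what makes the inverted index cheaper than comparing the query against every item.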

Example with Custom Objects

const suggester = new JaccardSuggester([
  { id: "u1", text: "iron man", meta: { year: 2008 } },
  { id: "u2", text: "superman returns", meta: { year: 2006 } },
  { id: "u3", text: "batman begins", meta: { year: 2005 } },
]);

console.log(suggester.suggest("man"));
/*
With the default word tokenizer, "superman" and "batman" are single tokens
that share nothing with "man", so only "iron man" is expected to match:
[
  { item: { id: 'u1', text: 'iron man', meta: { year: 2008 } }, score: 0.5 }
]
*/