slimsearch

v2.1.1

Tiny but powerful full-text search engine for browser and Node

Downloads

6,619

Readme

SlimSearch

slimsearch is a tiny but powerful in-memory full-text search engine written in JavaScript. It is respectful of resources, and it can comfortably run both in Node and in the browser.

SlimSearch is based on MiniSearch and shares the same index structure.

Use case

slimsearch addresses use cases where full-text search features are needed (e.g. prefix search, fuzzy search, ranking, boosting of fields…), but the data to be indexed can fit locally in the process memory. While you won't index the whole Internet with it, there are surprisingly many use cases that are served well by slimsearch. By storing the index in local memory, slimsearch can work offline, and can process queries quickly, without network latency.

A prominent use case is real-time "search as you type" in web and mobile applications, where keeping the index on the client enables fast, reactive UIs and removes the need to make requests to a search server.

Features

  • Memory-efficient index, designed to support memory-constrained use cases like mobile browsers.

  • Exact match, prefix search, fuzzy match, field boosting.

  • Auto-suggestion engine, for auto-completion of search queries.

  • Modern search result ranking algorithm.

  • Documents can be added and removed from the index at any time.

  • Zero external dependencies.

slimsearch strives to expose a simple API that provides the building blocks for custom solutions, while keeping a small and well-tested codebase.

Installation

With npm:

npm install slimsearch

With yarn:

yarn add slimsearch

With pnpm:

pnpm add slimsearch

Then require or import it in your project:

// If you are using import:
import {
  createIndex,
  // apis...
} from "slimsearch";

// If you are using require:
const {
  createIndex,
  // apis...
} = require("slimsearch");

Usage

Basic usage

import { addAll, createIndex, search } from "slimsearch";

// A collection of documents for our examples
const documents = [
  {
    id: 1,
    title: "Moby Dick",
    text: "Call me Ishmael. Some years ago...",
    category: "fiction",
  },
  {
    id: 2,
    title: "Zen and the Art of Motorcycle Maintenance",
    text: "I can see by my watch...",
    category: "fiction",
  },
  {
    id: 3,
    title: "Neuromancer",
    text: "The sky above the port was...",
    category: "fiction",
  },
  {
    id: 4,
    title: "Zen and the Art of Archery",
    text: "At first sight it must seem...",
    category: "non-fiction",
  },
  // ...and more
];

const index = createIndex({
  fields: ["title", "text"], // fields to index for full-text search
  storeFields: ["title", "category"], // fields to return with search results
});

// Index all documents
addAll(index, documents);

// Search with default options
const results = search(index, "zen art motorcycle");
// => [
//   { id: 2, title: 'Zen and the Art of Motorcycle Maintenance', category: 'fiction', score: 2.77258, match: { ... } },
//   { id: 4, title: 'Zen and the Art of Archery', category: 'non-fiction', score: 1.38629, match: { ... } }
// ]

Search options

slimsearch supports several options for more advanced search behavior:

import { addAll, createIndex, search } from "slimsearch";

// Search only specific fields
search(index, "zen", { fields: ["title"] });

// Boost some fields (here "title")
search(index, "zen", { boost: { title: 2 } });

// Prefix search (so that 'moto' will match 'motorcycle')
search(index, "moto", { prefix: true });

// Search within a specific category
search(index, "zen", {
  filter: (result) => result.category === "fiction",
});

// Fuzzy search, in this example with a max edit distance of 0.2 * term length,
// rounded to the nearest integer. The misspelled 'ismael' will match 'ishmael'.
search(index, "ismael", { fuzzy: 0.2 });

// You can set the default search options upon initialization.
// A new variable is used here, because `index` was declared with `const` earlier.
const boostedIndex = createIndex({
  fields: ["title", "text"],
  searchOptions: {
    boost: { title: 2 },
    fuzzy: 0.2,
  },
});
addAll(boostedIndex, documents);

// It will now by default perform fuzzy search and boost "title":
search(boostedIndex, "zen and motorcycles");
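To make the fractional fuzzy value concrete, here is a small self-contained sketch of the arithmetic described in the fuzzy search comment above. The helpers maxEditDistance and levenshtein are hypothetical names written only for illustration. They are not part of the slimsearch API, and the library's internal matching may differ in detail.

```javascript
// Hypothetical helper: the edit-distance budget described above,
// i.e. fuzzy * term length, rounded to the nearest integer.
function maxEditDistance(term, fuzzy) {
  return Math.round(fuzzy * term.length);
}

// Plain Levenshtein distance (insertions, deletions, substitutions),
// used here only to check whether two terms fall within the budget.
function levenshtein(a, b) {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      current[j] = Math.min(previous[j] + 1, current[j - 1] + 1, substitution);
    }
    previous = current;
  }
  return previous[b.length];
}

const budget = maxEditDistance("ismael", 0.2); // Math.round(1.2)
const distance = levenshtein("ismael", "ishmael"); // one inserted "h"
console.log({ budget, distance, withinBudget: distance <= budget });
```

Under these assumptions, the six-letter query 'ismael' gets a budget of one edit, which is exactly what is needed to reach 'ishmael'.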

Auto suggestions

slimsearch can suggest search queries given an incomplete query:

import { autoSuggest } from "slimsearch";

autoSuggest(index, "zen ar");
// => [ { suggestion: 'zen archery art', terms: [ 'zen', 'archery', 'art' ], score: 1.73332 },
//      { suggestion: 'zen art', terms: [ 'zen', 'art' ], score: 1.21313 } ]

The autoSuggest method takes the same options as the search method, so you can get suggestions for misspelled words using fuzzy search:

autoSuggest(index, "neromancer", { fuzzy: 0.2 });
// => [ { suggestion: 'neuromancer', terms: [ 'neuromancer' ], score: 1.03998 } ]

Suggestions are ranked by the relevance of the documents that would be returned by that search.

Sometimes, you might need to filter auto suggestions to, say, only a specific category. You can do so by providing a filter option:

autoSuggest(index, "zen ar", {
  filter: (result) => result.category === "fiction",
});
// => [ { suggestion: 'zen art', terms: [ 'zen', 'art' ], score: 1.21313 } ]

Field extraction

By default, documents are assumed to be plain key-value objects with field names as keys and field values as simple values. In order to support custom field extraction logic (for example for nested fields, or non-string field values that need processing before tokenization), a custom field extractor function can be passed as the extractField option:

import { createIndex } from "slimsearch";

// Assuming that our documents look like:
const documents = [
  {
    id: 1,
    title: "Moby Dick",
    author: { name: "Herman Melville" },
    pubDate: new Date(1851, 9, 18),
  },
  {
    id: 2,
    title: "Zen and the Art of Motorcycle Maintenance",
    author: { name: "Robert Pirsig" },
    pubDate: new Date(1974, 3, 1),
  },
  {
    id: 3,
    title: "Neuromancer",
    author: { name: "William Gibson" },
    pubDate: new Date(1984, 6, 1),
  },
  {
    id: 4,
    title: "Zen in the Art of Archery",
    author: { name: "Eugen Herrigel" },
    pubDate: new Date(1948, 0, 1),
  },
  // ...and more
];

// We can support nested fields (author.name) and date fields (pubDate) with a
// custom `extractField` function:

const index = createIndex({
  fields: ["title", "author.name", "pubYear"],
  extractField: (document, fieldName) => {
    // If field name is 'pubYear', extract just the year from 'pubDate'
    if (fieldName === "pubYear") {
      const pubDate = document["pubDate"];
      return pubDate && pubDate.getFullYear().toString();
    }

    // Access nested fields
    return fieldName.split(".").reduce((doc, key) => doc && doc[key], document);
  },
});

The default field extractor can be obtained by calling getDefaultValue('extractField').
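To see what such an extractor returns for each configured field, the same logic can be run as a standalone function, without creating an index. This is only an illustrative sketch: sampleDocument is made-up data that mirrors the documents above, and nothing here calls slimsearch.

```javascript
// Same extractor logic as in the example above, as a standalone function.
function extractField(document, fieldName) {
  // 'pubYear' is a virtual field derived from 'pubDate'
  if (fieldName === "pubYear") {
    const pubDate = document["pubDate"];
    return pubDate && pubDate.getFullYear().toString();
  }
  // Walk dotted paths such as 'author.name' one key at a time
  return fieldName.split(".").reduce((doc, key) => doc && doc[key], document);
}

const sampleDocument = {
  id: 1,
  title: "Moby Dick",
  author: { name: "Herman Melville" },
  pubDate: new Date(1851, 9, 18),
};

const extracted = ["title", "author.name", "pubYear"].map((fieldName) => [
  fieldName,
  extractField(sampleDocument, fieldName),
]);
console.log(extracted);

// A missing nested field yields undefined instead of throwing:
console.log(extractField({ id: 5, title: "Untitled" }, "author.name"));
```

The `doc && doc[key]` guard is what keeps documents without an author from raising an error during extraction.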

Tokenization

By default, documents are tokenized by splitting on Unicode space or punctuation characters. The tokenization logic can be easily changed by passing a custom tokenizer function as the tokenize option:

import { createIndex } from "slimsearch";

// Tokenize splitting by hyphen
const index = createIndex({
  fields: ["title", "text"],
  tokenize: (string, _fieldName) => string.split("-"),
});

Upon search, the same tokenization is used by default, but it is possible to pass a tokenize search option in case a different search-time tokenization is necessary:

import { createIndex } from "slimsearch";

// Tokenize splitting by hyphen
const index = createIndex({
  fields: ["title", "text"],
  tokenize: (string) => string.split("-"), // indexing tokenizer
  searchOptions: {
    tokenize: (string) => string.split(/[\s-]+/), // search query tokenizer
  },
});

The default tokenizer can be obtained by calling getDefaultValue('tokenize').
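The effect of using different tokenizers at index time and search time is easy to check in isolation. The sketch below runs the two custom tokenizers from the examples above on sample strings. It also includes approximateDefaultTokenizer, which is an assumption: a rough stand-in for "split on Unicode space or punctuation", not the library's actual default implementation.

```javascript
// The two custom tokenizers from the examples above.
const indexTokenizer = (string) => string.split("-");
const searchTokenizer = (string) => string.split(/[\s-]+/);

// Assumption: a rough approximation of splitting on Unicode space or
// punctuation. This is NOT the library's actual default tokenizer.
const approximateDefaultTokenizer = (string) =>
  string.split(/[\s\p{P}]+/u).filter((token) => token.length > 0);

// Hyphen-only splitting keeps "zen art" together as a single token...
console.log(indexTokenizer("zen art-motorcycle"));
// ...while the search tokenizer also splits on whitespace.
console.log(searchTokenizer("zen art-motorcycle"));

console.log(approximateDefaultTokenizer("Call me Ishmael. Some years ago..."));
```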

Term processing

Terms are downcased by default. No stemming is performed, and no stop-word list is applied. To customize how the terms are processed upon indexing, for example to normalize them, filter them, or to apply stemming, the processTerm option can be used. The processTerm function should return the processed term as a string, or a falsy value if the term should be discarded:

import { createIndex } from "slimsearch";

const stopWords = new Set([
  "and",
  "or",
  "to",
  "in",
  "a",
  "the",
  // ...and more
]);

// Perform custom term processing (here downcasing and discarding stop words)
const index = createIndex({
  fields: ["title", "text"],
  processTerm: (term, _fieldName) => {
    // Downcase first, so that capitalized stop words ("The") are also discarded
    const lowered = term.toLowerCase();
    return stopWords.has(lowered) ? null : lowered;
  },
});

By default, the same processing is applied to search queries. To process search queries differently, supply a processTerm search option:

import { createIndex } from "slimsearch";

const index = createIndex({
  fields: ["title", "text"],
  processTerm: (term) => {
    // index term processing: downcase, then discard stop words
    const lowered = term.toLowerCase();
    return stopWords.has(lowered) ? null : lowered;
  },
  searchOptions: {
    processTerm: (term) => term.toLowerCase(), // search query processing
  },
});

The default term processor can be obtained by calling getDefaultValue('processTerm').
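As a quick check of the "return a falsy value to discard" contract, a term processor can be applied to a list of tokens by hand. This sketch uses a variant that downcases before consulting the stop-word set, so capitalized stop words are dropped as well. The processTokens helper is hypothetical and only mimics how discarded terms would be skipped. It is not a slimsearch function.

```javascript
const stopWords = new Set(["and", "or", "to", "in", "a", "the"]);

// Downcase first, then drop stop words by returning a falsy value.
const processTerm = (term) => {
  const lowered = term.toLowerCase();
  return stopWords.has(lowered) ? null : lowered;
};

// Hypothetical helper: apply processTerm and skip discarded (falsy) terms.
const processTokens = (tokens) =>
  tokens.map((token) => processTerm(token)).filter(Boolean);

console.log(processTokens(["Zen", "and", "the", "Art", "of", "Archery"]));
```

Note that "of" survives here only because it is not in this short stop-word list.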

API Documentation

Refer to the API documentation for details about configuration options and methods.