metalsmith-search

v1.0.1

HTML-first Metalsmith search plugin that emits a Fuse.js-compatible index using Cheerio

Downloads

190

metalsmith:plugin npm: version license: MIT test coverage

An HTML-first Metalsmith search plugin that uses Cheerio to extract content from final rendered HTML and emits a Fuse.js-compatible JSON index for client-side search. For a live example see the Metalsmith Component Library website.

Version 1.0.0 is ESM-only and requires Node.js 22+. Fuse.js is no longer bundled — install it separately on the client. See the migration guide below.

Features

  • Processes final rendered HTML after layouts/templates
  • Uses Cheerio for HTML parsing
  • Configurable content exclusion via CSS selectors
  • Page-level indexing with automatic heading extraction
  • Slugified anchor ids generated in the index for headings without an id attribute
  • Emits a Fuse.js-compatible index with configurable search keys (Fuse runs client-side)
  • ESM-only (Node.js 22+)

Installation

npm install metalsmith-search

The plugin generates a Fuse.js-compatible JSON index at build time but does not depend on Fuse.js itself. To consume the index in the browser, install Fuse.js in your site separately:

npm install fuse.js

Usage

IMPORTANT: Place the search plugin AFTER layouts/templates in your pipeline:

import Metalsmith from 'metalsmith';
import layouts from '@metalsmith/layouts';
import search from 'metalsmith-search';

Metalsmith(import.meta.dirname)
  .source('./src')
  .destination('./build')
  .use(layouts()) // HTML generation FIRST
  .use(
    search({
      // Search indexing AFTER layouts
      pattern: '**/*.html',
      excludeSelectors: ['nav', 'header', 'footer'], // Optional chrome removal
    })
  )
  .build((err) => {
    if (err) throw err;
    console.log('Build complete!');
  });

Index Everything (Including Navigation)

metalsmith.use(
  search({
    pattern: '**/*.html',
    excludeSelectors: [], // Index ALL content including nav/header/footer
    fuseOptions: {
      keys: [
        { name: 'title', weight: 10 },
        { name: 'content', weight: 5 }
      ],
      minMatchCharLength: 3, // Ignore matches shorter than 3 characters (drops most stop words)
    },
  })
);

Complete Documentation & Examples →

How It Works

The plugin processes HTML files after layouts/templates for accurate search indexing:

  1. HTML Parsing: Uses Cheerio to parse the final rendered HTML.
  2. Content Exclusion: Optionally removes elements matching excludeSelectors (defaults to nav, header, footer).
  3. Content Extraction: Pulls all remaining text content for the index entry.
  4. Heading Collection: Walks every h1 through h6 heading and records {level, id, title} in the entry's headings array. If a heading carries an id attribute, that id is reused; otherwise a slugified id is generated for the index entry. The rendered HTML is not modified; see Heading IDs.
  5. Index Generation: Emits a Fuse.js-compatible search-index.json containing the page text, excerpt, word count, headings metadata, and the Fuse config the client should use to query it.
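
Steps 3 and 5 can be pictured with a small dependency-free sketch that assembles one index entry from text that has already been extracted. The helper name buildEntry, the 160-character excerpt length, and the whitespace handling are assumptions made for illustration, not the plugin's actual implementation; only the entry fields and the page:<url> id format are taken from the Search Index Structure section.

```javascript
// Sketch only: buildEntry and EXCERPT_LENGTH are hypothetical, chosen for
// illustration. The real plugin may compute excerpts and counts differently.
const EXCERPT_LENGTH = 160; // assumed cut-off, not documented by the plugin

function buildEntry({ url, title, text, headings }) {
  // Collapse the whitespace runs that removed markup leaves behind.
  const content = text.replace(/\s+/g, ' ').trim();
  const words = content === '' ? [] : content.split(' ');

  // Use the full text when it is short, otherwise a truncated prefix.
  const excerpt =
    content.length > EXCERPT_LENGTH
      ? `${content.slice(0, EXCERPT_LENGTH).trimEnd()}...`
      : content;

  return {
    id: `page:${url}`,
    type: 'page',
    url,
    title,
    excerpt,
    content,
    headings,
    wordCount: words.length,
  };
}

const entry = buildEntry({
  url: '/blog/post',
  title: 'Blog Post Title',
  text: '  Blog Post Title\n\n  Introduction   Hello   world. ',
  headings: [{ level: 'h2', id: 'introduction', title: 'Introduction' }],
});
console.log(JSON.stringify(entry, null, 2));
```

Keeping the entry flat, with title, content, and excerpt as top-level strings, is what lets the Fuse.js keys refer to those fields by name.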

Options

| Option | Type | Default | Description |
| ------------------ | -------------------- | ------------------------------------------------ | --------------------------------------- |
| pattern | string \| string[] | '**/*.html' | HTML files to process |
| ignore | string \| string[] | ['**/search-index.json'] | Files to ignore |
| indexPath | string | 'search-index.json' | Output path for search index |
| excludeSelectors | string[] | ['nav', 'header', 'footer'] | CSS selectors to exclude from indexing |
| fuseOptions | object | {keys: [...], threshold: 0.3, ...} | Fuse.js configuration options |

Fuse.js Options

The fuseOptions object is written into the generated index so that the client can pass it to Fuse.js unchanged. The plugin ships with these defaults:

fuseOptions: {
  // Search sensitivity (0.0 = exact match, 1.0 = match anything)
  threshold: 0.3,

  // Search keys with weights (must match fields produced by the extractor)
  keys: [
    { name: 'title', weight: 10 },   // Page title from <title> or <h1>
    { name: 'content', weight: 5 },  // All page text content
    { name: 'excerpt', weight: 3 }   // Auto-generated excerpt
  ],

  // Include match details and scores in results
  includeScore: true,
  includeMatches: true,

  // Minimum match length: ignores matches shorter than 3 characters
  minMatchCharLength: 3,  // Drops most short stop words such as "to", "be", "or", "in"
}
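
Because Fuse.js runs on the client, these options only take effect once the browser builds a Fuse instance from the index. The following is a minimal sketch of that step, assuming the entries array and the Fuse options have already been read out of search-index.json. The helper createSearch is hypothetical, and the Fuse constructor is passed in as an argument so the sketch itself has no dependencies.

```javascript
// Sketch only: createSearch is a hypothetical helper, not part of metalsmith-search.
// FuseCtor is assumed to be the default export of fuse.js, for example:
//   import Fuse from 'fuse.js';
//   const search = createSearch(Fuse, entries, fuseOptions);
function createSearch(FuseCtor, entries, fuseOptions) {
  const fuse = new FuseCtor(entries, fuseOptions);

  return function search(query, limit = 10) {
    // Fuse returns an array of { item, refIndex, score } objects;
    // score is only present when includeScore is true.
    return fuse
      .search(query.trim())
      .slice(0, limit)
      .map(({ item, score }) => ({
        url: item.url,
        title: item.title,
        excerpt: item.excerpt,
        score,
      }));
  };
}
```

Reusing the options stored with the index, rather than redefining them in client code, keeps the key names and weights in step with the fields the build actually produced.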

Search Index Structure

Each page generates a single search entry with this structure:

{
  "id": "page:/blog/post",
  "type": "page",
  "url": "/blog/post",
  "title": "Blog Post Title",
  "excerpt": "Brief excerpt...",
  "content": "All page text content...",
  "headings": [
    { "level": "h2", "id": "introduction", "title": "Introduction" },
    { "level": "h3", "id": "overview", "title": "Overview" },
    { "level": "h2", "id": "conclusion", "title": "Conclusion" }
  ],
  "wordCount": 1523
}
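
As an illustration of how a client might use the headings array, the sketch below turns one entry into a list of deep links. The helper name headingLinks and the output shape are assumptions; the input shape is the one shown above. A link only scrolls to its heading if the rendered HTML contains an element with that id, as described under Heading IDs.

```javascript
// Sketch only: headingLinks is a hypothetical client-side helper.
function headingLinks(entry) {
  return entry.headings.map(({ level, id, title }) => ({
    title,
    depth: Number(level.slice(1)), // 'h2' becomes 2
    href: `${entry.url}#${encodeURIComponent(id)}`,
  }));
}

const sampleEntry = {
  url: '/blog/post',
  headings: [
    { level: 'h2', id: 'introduction', title: 'Introduction' },
    { level: 'h3', id: 'overview', title: 'Overview' },
  ],
};
console.log(headingLinks(sampleEntry));
```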

Heading IDs

The plugin does not modify your HTML files. It reads the rendered HTML and emits the headings array on each index entry:

  • Headings with an id attribute: the existing id is recorded verbatim.
  • Headings without an id: a URL-safe slug is generated from the heading text and recorded in the index entry only. Suffixes such as -1 and -2 are appended to keep ids unique within a page.
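
The two rules above can be sketched as follows. The exact slug algorithm the plugin uses is not documented here, so slugify, the 'section' fallback, and the suffix numbering (first duplicate gets -1) are assumptions made for illustration.

```javascript
// Sketch only: slugify and assignHeadingIds are hypothetical helpers.
// The plugin's real slug rules may differ.
function slugify(text) {
  return text
    .toLowerCase()
    .normalize('NFKD')
    .replace(/[\u0300-\u036f]/g, '') // strip combining accent marks
    .replace(/[^a-z0-9]+/g, '-') // any other run of characters becomes one hyphen
    .replace(/^-+|-+$/g, ''); // trim leading and trailing hyphens
}

function assignHeadingIds(headings) {
  // Ids already present in the HTML are reserved first and kept verbatim.
  const used = new Set(headings.filter((h) => h.id).map((h) => h.id));

  return headings.map((heading) => {
    if (heading.id) return { ...heading };

    const base = slugify(heading.title) || 'section'; // assumed fallback
    let id = base;
    let suffix = 0;
    while (used.has(id)) {
      suffix += 1;
      id = `${base}-${suffix}`;
    }
    used.add(id);
    return { ...heading, id };
  });
}

const withIds = assignHeadingIds([
  { level: 'h2', title: 'Overview' },
  { level: 'h3', title: 'Overview' },
  { level: 'h3', title: 'Overview' },
  { level: 'h2', title: 'Café & Bar' },
  { level: 'h2', id: 'custom', title: 'Custom' },
]);
console.log(withIds.map((h) => h.id));
```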

For a deep link like /page#some-id to actually scroll the browser to that heading, the rendered HTML must have an element with that id. Two ways to make that work:

  1. Render with auto-anchored headings. Run a markdown plugin (e.g. markdown-it-anchor) or layout helper that adds id attributes during build, before this plugin sees the HTML. The plugin will reuse those ids verbatim.
  2. Resolve client-side. Read the index's headings array in your search component and scroll / highlight by matching the slug against heading text, without relying on DOM id attributes. The reference search component takes this approach.
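
A minimal sketch of the second approach, under stated assumptions: instead of comparing slugs, this variation matches an index heading to a DOM heading by tag level and normalized text, which avoids re-implementing the slug rules in the browser. The helper names are hypothetical and this is not the reference search component's code.

```javascript
// Sketch only: normalizeText and findHeadingElement are hypothetical helpers.
const normalizeText = (value) => value.replace(/\s+/g, ' ').trim().toLowerCase();

function findHeadingElement(headingElements, indexHeading) {
  const wantedTag = indexHeading.level.toUpperCase(); // 'h2' becomes 'H2'
  const wantedText = normalizeText(indexHeading.title);

  const match = Array.from(headingElements).find(
    (el) => el.tagName === wantedTag && normalizeText(el.textContent) === wantedText
  );
  return match ?? null;
}

// Browser usage (not executed here):
//   const nodes = document.querySelectorAll('h1, h2, h3, h4, h5, h6');
//   const target = findHeadingElement(nodes, entry.headings[0]);
//   if (target) target.scrollIntoView({ behavior: 'smooth' });
```

When a page repeats the same heading text at the same level, this returns the first match, so a stricter implementation would also need the heading's position in the document.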

Examples

For comprehensive examples including client-side implementation, component-based sites, traditional sites, and features, see GETTING-STARTED.md.

Test Coverage

Run the test suite with coverage via npm run coverage. The plugin uses Node's native test runner and built-in coverage reporter; the coverage badge above is updated by CI on each merge to main.

Debug

To enable debug logs, set the DEBUG environment variable to metalsmith-search*:

metalsmith.env('DEBUG', 'metalsmith-search*');

Migration from v0.x to v1.0

Version 1.0.0 modernizes the toolchain and trims the public surface.

Breaking Changes

  1. ESM only. The CommonJS build is gone. Use import search from 'metalsmith-search' from an ESM project.
  2. Node.js 22+ required. Earlier versions are unsupported.
  3. Fuse.js is no longer a runtime dependency. The plugin still produces a Fuse-compatible index, but consumers must npm install fuse.js separately on the client.
  4. contentFields option removed. Frontmatter fields are no longer indexed; the plugin processes only final rendered HTML. Move searchable content into your templates.
  5. Search index schema bumped to 2.0.0. The entry shape no longer includes description, tags, date, or author — only fields the HTML extractor produces (id, type, url, title, content, excerpt, headings, wordCount). Update any client code that reads those legacy fields.

Migration from v0.1.x (HTML-first architecture)

If you are still on v0.1.x, you also need the v0.2.0 changes:

  • Pattern default changed: from **/*.md to **/*.html
  • Pipeline position: place search() after layouts()
  • Options removed: async, batchSize, lazyLoad, processMarkdownFields, sectionsField, sectionTypeField, autoDetectSectionTypes, componentFields, maxSectionLength, chunkSize, minSectionLength, frontmatterFields
  • Option added: excludeSelectors
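
When auditing an old configuration, a small check like the one below can flag options that no longer exist. The helper findRemovedOptions is hypothetical and not part of the plugin; the option names come from the list above, plus contentFields, which the v1.0 breaking changes list as removed.

```javascript
// Sketch only: findRemovedOptions is a hypothetical migration aid.
const REMOVED_OPTIONS = [
  'async',
  'batchSize',
  'lazyLoad',
  'processMarkdownFields',
  'sectionsField',
  'sectionTypeField',
  'autoDetectSectionTypes',
  'componentFields',
  'maxSectionLength',
  'chunkSize',
  'minSectionLength',
  'frontmatterFields',
  'contentFields', // removed in v1.0
];

function findRemovedOptions(options) {
  return Object.keys(options).filter((key) => REMOVED_OPTIONS.includes(key));
}

const legacyConfig = {
  pattern: '**/*.md',
  batchSize: 10,
  contentFields: ['title'],
  excludeSelectors: [],
};
console.log(findRemovedOptions(legacyConfig));
```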

Migration from metalsmith-lunr

This plugin is a modern replacement for the deprecated metalsmith-lunr:

Advantages:

  • Better search: Fuse.js provides fuzzy matching
  • Accurate indexing: Cheerio HTML parsing
  • Smaller bundle: More efficient than Lunr.js
  • Active maintenance: Regular updates and bug fixes
  • Modern JavaScript: ESM-only, Node.js 22+

Migration:

// Old metalsmith-lunr
import lunr from 'metalsmith-lunr';

Metalsmith(import.meta.dirname)
  .use(markdown())
  .use(
    lunr({
      ref: 'title',
      fields: { contents: 1, title: 10 },
    })
  );

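// New metalsmith-search
import Metalsmith from 'metalsmith';
import markdown from '@metalsmith/markdown';
import layouts from '@metalsmith/layouts';
import search from 'metalsmith-search';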

Metalsmith(import.meta.dirname)
  .use(markdown())
  .use(layouts()) // Add layouts
  .use(
    search({
      pattern: '**/*.html', // HTML not markdown
      fuseOptions: {
        keys: [
          { name: 'content', weight: 1 },
          { name: 'title', weight: 10 },
        ],
      },
    })
  );

License

MIT © Werner Glinka