metalsmith-search

v0.2.0

HTML-first Metalsmith search plugin with Fuse.js and Cheerio

Downloads

7

metalsmith-search

An HTML-first Metalsmith search plugin with Fuse.js and Cheerio for accurate content indexing

Version 0.2.0 introduces breaking changes with HTML-first architecture. See migration guide below.

Features

  • Processes final rendered HTML after layouts/templates
  • Uses Cheerio for HTML parsing
  • Configurable content exclusion via CSS selectors
  • Page-level indexing with automatic heading extraction
  • Automatic anchor ID generation for headings without IDs
  • Integrates frontmatter fields into search index
  • Powered by Fuse.js with configurable search keys
  • Supports both ESM and CommonJS

Installation

npm install metalsmith-search

Usage

IMPORTANT: Place the search plugin AFTER layouts/templates in your pipeline:

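import { dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
import Metalsmith from 'metalsmith';
import layouts from '@metalsmith/layouts';
import search from 'metalsmith-search';

// __dirname is not defined in ES modules, so derive it from import.meta.url
const __dirname = dirname(fileURLToPath(import.meta.url));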

Metalsmith(__dirname)
  .source('./src')
  .destination('./build')
  .use(layouts()) // HTML generation FIRST
  .use(
    search({
      // Search indexing AFTER layouts
      pattern: '**/*.html',
      excludeSelectors: ['nav', 'header', 'footer'], // Optional chrome removal
    })
  )
  .build((err) => {
    if (err) throw err;
    console.log('Build complete!');
  });

Index Everything (Including Navigation)

metalsmith.use(
  search({
    pattern: '**/*.html',
    excludeSelectors: [], // Index ALL content including nav/header/footer
    contentFields: ['title', 'description', 'summary'],
    fuseOptions: {
      keys: [
        { name: 'title', weight: 10 },
        { name: 'content', weight: 5 },
        { name: 'description', weight: 3 },
      ],
      minMatchCharLength: 3, // Ignore matches shorter than 3 characters
    },
  })
);

Complete Documentation & Examples →

How It Works

The plugin processes HTML files after layouts/templates for accurate search indexing:

  1. HTML Parsing: Uses Cheerio to parse final rendered HTML
  2. Content Exclusion: Optionally removes nav, header, footer elements
  3. Content Extraction: Extracts all text content from the page
  4. Heading Processing: Finds all headings (h1-h6) and ensures they have IDs
  5. Index Generation: Creates Fuse.js-compatible search index with:
    • Full page text content
    • Metadata from frontmatter
    • Headings array for scroll-to functionality
  6. Anchor Generation: Automatically generates IDs for headings without them
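
The index-generation step can be pictured as a pure function. The sketch below is not the plugin's source: `buildEntry` and its input shape are hypothetical, and the URL derivation (dropping `index.html` and the `.html` extension) is an assumption. It only illustrates how extracted text, the configured frontmatter fields, and the headings could combine into a single page entry.

```javascript
// Hypothetical helper: combine already-extracted page data into one index entry.
// Assumption: URLs are derived by dropping "index.html" or the ".html" extension.
function buildEntry({ path, text, frontmatter = {}, headings = [], contentFields = [] }) {
  const url = '/' + path.replace(/(^|\/)index\.html$/, '').replace(/\.html$/, '');
  const entry = { id: `page:${url}`, type: 'page', url };

  // Copy only the configured frontmatter fields that are actually present.
  for (const field of contentFields) {
    if (frontmatter[field] !== undefined) entry[field] = frontmatter[field];
  }

  // Collapse whitespace so the stored text is compact and the word count is stable.
  const content = text.replace(/\s+/g, ' ').trim();
  entry.content = content;
  entry.headings = headings;
  entry.wordCount = content === '' ? 0 : content.split(' ').length;
  return entry;
}

const entry = buildEntry({
  path: 'blog/post.html',
  text: '  Introduction\n\nFuzzy search   for static sites. ',
  frontmatter: { title: 'Blog Post Title', layout: 'post.njk' },
  headings: [{ level: 'h2', id: 'introduction', title: 'Introduction' }],
  contentFields: ['title', 'description'],
});
console.log(entry.id, entry.wordCount);
```

Fields that are not listed in `contentFields` (here `layout`) never reach the entry, which keeps the index limited to what is meant to be searchable.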

Plugin Position

IMPORTANT: Place the search plugin AFTER layouts/templates:

metalsmith
  .use(layouts()) // HTML generation FIRST
  .use(search()) // Process final HTML AFTER layouts
  .build();

Benefits:

  • Indexes what users actually see
  • Works with any content architecture
  • No assumptions about frontmatter structure
  • Faster and more accurate than RegExp-based approaches

Options

| Option | Type | Default | Description |
| ------------------ | -------------------- | ------------------------------------------------ | --------------------------------------- |
| pattern | string \| string[] | '**/*.html' | HTML files to process |
| ignore | string \| string[] | ['**/search-index.json'] | Files to ignore |
| indexPath | string | 'search-index.json' | Output path for search index |
| excludeSelectors | string[] | ['nav', 'header', 'footer'] | CSS selectors to exclude from indexing |
| contentFields | string[] | ['title', 'description', 'summary', 'excerpt'] | Frontmatter fields to include in search |
| fuseOptions | object | {keys: [...], threshold: 0.3, ...} | Fuse.js configuration options |

Fuse.js Options

The fuseOptions object is passed directly to Fuse.js. The plugin includes optimized defaults:

fuseOptions: {
  // Search sensitivity (0.0 = exact match, 1.0 = match anything)
  threshold: 0.3,

  // Search keys with weights
  keys: [
    { name: 'title', weight: 10 },      // Page titles (highest priority)
    { name: 'content', weight: 5 },     // Main text content
    { name: 'description', weight: 3 }, // Page descriptions
    { name: 'tags', weight: 7 },        // Content tags
  ],

  // Include match details and scores in results
  includeScore: true,
  includeMatches: true,

  // Minimum match length: matches shorter than 3 characters are ignored,
  // which screens out most short stop words ("to", "be", "or", "in")
  minMatchCharLength: 3,
}
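
The key weights are relative rather than absolute. Assuming Fuse.js normalizes weights by dividing each one by their total (its documented behavior at the time of writing), only the ratios between keys matter. The sketch below reproduces that arithmetic for the default keys; `normalizeWeights` is a hypothetical helper for illustration and is not part of Fuse.js or this plugin.

```javascript
// Hypothetical helper: divide each weight by the total so the results sum to 1.
function normalizeWeights(keys) {
  const total = keys.reduce((sum, key) => sum + key.weight, 0);
  return keys.map((key) => ({ name: key.name, weight: key.weight / total }));
}

const defaultKeys = [
  { name: 'title', weight: 10 },
  { name: 'content', weight: 5 },
  { name: 'description', weight: 3 },
  { name: 'tags', weight: 7 },
];

const normalized = normalizeWeights(defaultKeys);
for (const { name, weight } of normalized) {
  console.log(`${name}: ${(weight * 100).toFixed(0)}%`);
}
// title: 40%, content: 20%, description: 12%, tags: 28%
```

Under that assumption, doubling every weight leaves the ranking unchanged, while raising only `title` shifts priority toward titles.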

Search Index Structure

Each page generates a single search entry with this structure:

{
  "id": "page:/blog/post",
  "type": "page",
  "url": "/blog/post",
  "title": "Blog Post Title",
  "description": "Page description",
  "excerpt": "Brief excerpt...",
  "content": "All page text content...",
  "tags": ["javascript", "tutorial"],
  "headings": [
    { "level": "h2", "id": "introduction", "title": "Introduction" },
    { "level": "h3", "id": "overview", "title": "Overview" },
    { "level": "h2", "id": "conclusion", "title": "Conclusion" }
  ],
  "wordCount": 1523
}

The headings array enables scroll-to functionality:

  • Fuse.js finds all matches within the page content
  • Client-side JavaScript uses headings to determine which section each match is in
  • Users can be scrolled to the nearest heading anchor
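
One simple way a client could use the headings array is sketched below. It is a simplification, not the plugin's client code: `linkForQuery` is a hypothetical helper that only checks whether the query appears in a heading title and falls back to the page URL otherwise. Locating matches inside body text would need the rendered DOM, which is out of scope here.

```javascript
// Hypothetical helper: pick an anchor link for a query using an entry's headings.
function linkForQuery(entry, query) {
  const needle = query.trim().toLowerCase();
  if (needle === '') return entry.url;

  // Use the first heading whose title contains the query (case-insensitive).
  const hit = entry.headings.find((h) => h.title.toLowerCase().includes(needle));
  return hit ? `${entry.url}#${hit.id}` : entry.url;
}

const page = {
  url: '/blog/post',
  headings: [
    { level: 'h2', id: 'introduction', title: 'Introduction' },
    { level: 'h3', id: 'overview', title: 'Overview' },
    { level: 'h2', id: 'conclusion', title: 'Conclusion' },
  ],
};

console.log(linkForQuery(page, 'overview'));   // /blog/post#overview
console.log(linkForQuery(page, 'metalsmith')); // /blog/post
```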

Automatic ID generation:

  • Headings with existing IDs: preserved as-is
  • Headings without IDs: automatically generated from heading text
  • Duplicate prevention: adds -2, -3 suffixes for uniqueness
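
The exact slug rules are not spelled out here, so the following is a minimal sketch under assumptions: lowercase the heading text, replace runs of non-alphanumeric characters with a hyphen, and append -2, -3, and so on when an ID is already taken. `assignHeadingIds` is a hypothetical name, not the plugin's API.

```javascript
// Hypothetical helper: give every heading an id, keeping existing ones.
// Assumed slug rule: lowercase, non-alphanumeric runs become single hyphens.
function assignHeadingIds(headings) {
  // Reserve ids that are already present so generated ones never collide with them.
  const used = new Set(headings.map((h) => h.id).filter(Boolean));

  return headings.map((heading) => {
    if (heading.id) return heading; // existing ids are preserved as-is

    const slug = heading.title
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '-')
      .replace(/^-+|-+$/g, '');
    const base = slug || 'section'; // fallback for headings with no usable characters

    // Try base, then base-2, base-3, ... until an unused id is found.
    let id = base;
    for (let n = 2; used.has(id); n += 1) id = `${base}-${n}`;
    used.add(id);
    return { ...heading, id };
  });
}

const result = assignHeadingIds([
  { level: 'h2', title: 'Overview', id: 'intro' },
  { level: 'h2', title: 'Getting Started!' },
  { level: 'h3', title: 'Getting started' },
  { level: 'h3', title: 'Getting started' },
]);
console.log(result.map((h) => h.id));
// [ 'intro', 'getting-started', 'getting-started-2', 'getting-started-3' ]
```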

Examples

For comprehensive examples including client-side implementation, component-based sites, traditional sites, and advanced features, see GETTING-STARTED.md.

Debug

To enable debug logs, set the DEBUG environment variable to metalsmith-search*:

metalsmith.env('DEBUG', 'metalsmith-search*');

Or via command line:

DEBUG=metalsmith-search* npm run build

Migration from v0.1.x

Version 0.2.0 introduces breaking changes with HTML-first architecture:

Breaking Changes

  1. Pattern default changed: **/*.md → **/*.html
  2. Pipeline position: Before layouts → After layouts
  3. Removed options: async, batchSize, lazyLoad, processMarkdownFields, sectionsField, sectionTypeField, autoDetectSectionTypes, componentFields, maxSectionLength, chunkSize, minSectionLength, frontmatterFields
  4. New options: excludeSelectors, contentFields
  5. New dependency: Cheerio added for HTML parsing

Migration Steps

Before (v0.1.x):

metalsmith
  .use(
    search({
      pattern: '**/*.md',
      indexLevels: ['page', 'section'],
      sectionsField: 'sections',
    })
  )
  .use(layouts());

After (v0.2.0):

metalsmith
  .use(layouts()) // Move layouts BEFORE search
  .use(
    search({
      pattern: '**/*.html', // Change to HTML
      indexLevels: ['page', 'section'],
      excludeSelectors: ['nav', 'header', 'footer'], // Optional
      contentFields: ['title', 'description'], // Configurable
    })
  );

Migration from metalsmith-lunr

This plugin is a modern replacement for the deprecated metalsmith-lunr:

Advantages:

  • Better search: Fuse.js provides fuzzy matching
  • Accurate indexing: Cheerio HTML parsing
  • Smaller bundle: More efficient than Lunr.js
  • Active maintenance: Regular updates and bug fixes
  • Modern JavaScript: ESM/CJS support

Migration:

// Old metalsmith-lunr
import lunr from 'metalsmith-lunr';

Metalsmith(__dirname)
  .use(markdown())
  .use(
    lunr({
      ref: 'title',
      fields: { contents: 1, title: 10 },
    })
  );

// New metalsmith-search
import search from 'metalsmith-search';

Metalsmith(__dirname)
  .use(markdown())
  .use(layouts()) // Add layouts
  .use(
    search({
      pattern: '**/*.html', // HTML not markdown
      fuseOptions: {
        keys: [
          { name: 'content', weight: 1 },
          { name: 'title', weight: 10 },
        ],
      },
    })
  );
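
When a metalsmith-lunr config lists many fields, the conversion to Fuse.js keys can be mechanical. This is a minimal sketch, assuming the only rename needed is Lunr's `contents` to this plugin's `content`; `lunrFieldsToFuseKeys` is a hypothetical helper that neither plugin provides.

```javascript
// Hypothetical helper: turn a metalsmith-lunr `fields` map into Fuse.js keys.
// Assumption: only `contents` needs renaming (to `content`); weights carry over unchanged.
function lunrFieldsToFuseKeys(fields) {
  return Object.entries(fields).map(([field, weight]) => ({
    name: field === 'contents' ? 'content' : field,
    weight,
  }));
}

const keys = lunrFieldsToFuseKeys({ contents: 1, title: 10 });
console.log(keys);
// [ { name: 'content', weight: 1 }, { name: 'title', weight: 10 } ]
```

The resulting array can be passed as `fuseOptions.keys`, as in the migration example above.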

License

MIT © Werner Glinka