metalsmith-seo

v1.1.3

A metalsmith plugin for SEO optimization including sitemap generation, meta tags, and more.

Inspired by metalsmith-sitemap, the plugin provides SEO optimization for Metalsmith with metadata generation, social media tags, and structured data including Open Graph tags, Twitter Cards, JSON-LD structured data, and sitemap generation.

Version 1.x is ESM-only and requires Node.js 22+. The plugin API and output are unchanged from v0.8.x — only the packaging was modernized in 1.0.0. See the migration guide below.

Features

Core SEO Optimization:

  • HTML Head Optimization - Meta tags, canonical URLs, robots directives
  • Open Graph Tags - Social media sharing with Facebook, LinkedIn, etc.
  • Twitter Cards - Rich Twitter previews with automatic card type detection
  • JSON-LD Structured Data - Article, Product, Organization, WebPage schemas
  • Sitemap Generation - Complete sitemap.xml with auto-calculation of priority, changefreq, and lastmod
  • Robots.txt Management - robots.txt generation and sitemap coordination
  • llms.txt Generation - Opt-in markdown index (and optional plaintext dump) for large language model consumers, per the llmstxt.org proposal

Automation:

  • Content Analysis - Auto-detects content type (article, product, page)
  • Metadata Derivation - Single source feeds all formats (title → og:title, twitter:title, JSON-LD headline)
  • Fallback Chains - Defaults from site.json, frontmatter, or content analysis
  • Site.json Integration - Integration with existing Metalsmith site configuration

Performance:

  • Head-only Cheerio parsing - Only the <head> section is fed to the HTML parser; the body (often the bulk of the document) is never parsed or serialized
  • Single-pass processing - All meta tags, Open Graph, Twitter Cards, JSON-LD, and link tags are injected in one parse/serialize cycle per file
  • Batch processing - Files are processed in configurable parallel batches

Developer Experience:

  • ESM-only - Modern Node.js (>= 22) with native ESM
  • Minimal Configuration - Works great with just a hostname
  • Comprehensive Testing - High test coverage with real-world scenarios (see badge above)

What this plugin won't do

Scope boundaries that come up often enough to be worth stating up front. Each is a deliberate design decision — see docs/THEORY.md §5 for the reasoning.

  • Score pages by content length. Sitemap priority is derived from URL hierarchy, not word count. Long privacy policies don't outrank the homepage.
  • Ship any client-side JavaScript. Output is static HTML. SEO that depends on runtime execution is invisible to crawlers that don't run JS.
  • Parse markdown. HTML in, HTML out. Run @metalsmith/markdown (or similar) before this plugin.
  • Modify anything outside <head>. No body-content rewriting, no mid-page schema injection, no image-tag manipulation.
  • Make network requests. No og:image dimension fetching, no live sitemap validation. Builds are hermetic.

Requirements

  • Node.js >= 22
  • Metalsmith >= 2.5.0
  • ESM-only (no CommonJS support)

Installation

npm install metalsmith-seo

Usage

Quick Start

Minimal Setup

import Metalsmith from 'metalsmith';
import seo from 'metalsmith-seo';

Metalsmith(import.meta.dirname)
  .use(
    seo({
      hostname: 'https://example.com',
    })
  )
  .build();

This simple configuration automatically generates:

  • Complete HTML meta tags
  • Open Graph tags for social sharing
  • Twitter Card tags
  • JSON-LD structured data
  • sitemap.xml with calculated priority/changefreq/lastmod values
  • robots.txt (with sitemap reference)

With site.json Integration (Recommended)

Create data/site.json:

{
  "name": "My Awesome Site",
  "title": "My Site - Welcome",
  "description": "The best site on the internet",
  "url": "https://example.com",
  "locale": "en_US",
  "twitter": "@mysite",
  "organization": {
    "name": "My Company",
    "logo": "https://example.com/logo.png"
  }
}

Then use the plugin:

import Metalsmith from 'metalsmith';
import metadata from '@metalsmith/metadata';
import seo from 'metalsmith-seo';

Metalsmith(import.meta.dirname)
  .use(metadata({ site: 'data/site.json' }))
  .use(seo()) // Automatically uses site.json values!
  .build();

Or if your site metadata is nested differently:

import Metalsmith from 'metalsmith';
import metadata from '@metalsmith/metadata';
import seo from 'metalsmith-seo';

// If metadata is at metadata().data.site instead of metadata().site
Metalsmith(import.meta.dirname)
  .use(
    metadata({
      data: {
        site: 'data/site.json',
      },
    })
  )
  .use(
    seo({
      metadataPath: 'data.site', // Tell plugin where to find site metadata
    })
  )
  .build();
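
A value such as 'data.site' implies a lookup along a dotted path. A minimal sketch of that kind of lookup follows; getByPath is a hypothetical helper written for illustration, not a function exported by the plugin:

```javascript
// Hypothetical helper (not part of metalsmith-seo): resolve a dotted path
// such as 'data.site' against a metadata object.
function getByPath(source, path) {
  return path.split('.').reduce(
    // Stop descending as soon as an intermediate value is missing.
    (current, key) => (current != null ? current[key] : undefined),
    source
  );
}

const metadata = { data: { site: { url: 'https://example.com' } } };

console.log(getByPath(metadata, 'data.site').url); // https://example.com
console.log(getByPath(metadata, 'missing.path'));  // undefined
```

Returning undefined for a missing segment, rather than throwing, lets a caller fall through to other sources of site metadata.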

Frontmatter Integration

Add SEO data to any page. The plugin extracts metadata from multiple locations:

---
title: 'My Blog Post'
date: 2024-01-15
seo:
  title: 'Advanced SEO Techniques - My Blog'
  description: 'Learn how to optimize your site for search engines'
  image: '/images/seo-guide.jpg'
  type: 'article'
---

Card Object Support

The plugin also extracts metadata from card objects (commonly used for blog post listings):

---
layout: pages/sections.njk
draft: false

seo:
  title: 'Override Title for SEO' # Highest priority
  description: 'SEO-specific description'

card:
  title: 'Architecture Philosophy' # Used if not in seo object
  date: '2025-06-02'
  author:
    - Albert Einstein
    - Isaac Newton
  image: '/assets/images/sample9.jpg'
  excerpt: 'This starter embodies several key principles...'
---

Metadata Extraction Priority:

  1. seo object (highest priority - explicit SEO overrides)
  2. card object (for blog posts and content cards)
  3. Root level properties
  4. Configured defaults
  5. Site-wide defaults (from site.json)
  6. Auto-generated content
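
The first five steps of this order can be read as "first defined value wins". The sketch below illustrates that reading only; resolveField is a hypothetical name, and it assumes every source uses the same property name for a field, which is a simplification (site.json, for example, uses defaultImage). Step 6, auto-generated content, is left out.

```javascript
// Illustrative only, not the plugin's source: return the first defined
// value, checking sources in the documented priority order.
function resolveField(field, file, { defaults = {}, site = {} } = {}) {
  const candidates = [
    file.seo?.[field],  // 1. seo object
    file.card?.[field], // 2. card object
    file[field],        // 3. root-level property
    defaults[field],    // 4. configured defaults
    site[field],        // 5. site-wide defaults
  ];
  return candidates.find((value) => value !== undefined && value !== null);
}

const file = {
  seo: { title: 'Override Title for SEO' },
  card: { title: 'Architecture Philosophy', image: '/assets/images/sample9.jpg' },
};
const site = { description: 'The default description for all pages' };

console.log(resolveField('title', file, { site }));       // Override Title for SEO
console.log(resolveField('image', file, { site }));       // /assets/images/sample9.jpg
console.log(resolveField('description', file, { site })); // The default description for all pages
```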

Author Fallback Chain: When no author is specified in frontmatter, the plugin uses siteOwner from your site.json as a fallback, ensuring all content has proper attribution for SEO and social media.

Result: Comprehensive SEO markup automatically generated:

<!-- Basic Meta -->
<title>Advanced SEO Techniques - My Blog</title>
<meta name="description" content="Learn how to optimize your site for search engines" />
<link rel="canonical" href="https://example.com/blog/advanced-seo" />

<!-- Open Graph -->
<meta property="og:title" content="Advanced SEO Techniques - My Blog" />
<meta property="og:type" content="article" />
<meta property="og:image" content="https://example.com/images/seo-guide.jpg" />

<!-- Twitter Cards -->
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Advanced SEO Techniques - My Blog" />

<!-- JSON-LD Structured Data -->
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Advanced SEO Techniques - My Blog",
    "image": "https://example.com/images/seo-guide.jpg",
    "datePublished": "2024-01-15",
    "author": { "@type": "Person", "name": "Site Author" }
  }
</script>
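
The markup above repeats the same title and image in three formats. A rough sketch of how one resolved metadata object could feed all of them is shown here; deriveArticleHead and its return shape are assumptions made for illustration, not the plugin's internal API.

```javascript
// Illustrative only: one metadata object feeds Open Graph, Twitter Card
// and JSON-LD values, so the three formats cannot drift apart.
function deriveArticleHead(meta, hostname) {
  // Resolve a site-relative image path against the hostname.
  const image = meta.image ? new URL(meta.image, hostname).href : undefined;
  return {
    openGraph: { 'og:title': meta.title, 'og:type': meta.type, 'og:image': image },
    twitter: { 'twitter:title': meta.title },
    jsonLd: {
      '@context': 'https://schema.org',
      '@type': 'Article',
      headline: meta.title,
      image,
    },
  };
}

const head = deriveArticleHead(
  { title: 'Advanced SEO Techniques - My Blog', image: '/images/seo-guide.jpg', type: 'article' },
  'https://example.com'
);

console.log(head.openGraph['og:image']); // https://example.com/images/seo-guide.jpg
console.log(head.jsonLd.headline);       // Advanced SEO Techniques - My Blog
```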

Site.json Configuration

The plugin integrates seamlessly with your existing site.json configuration:

Complete site.json Example

{
  "name": "My Awesome Site",
  "title": "My Site - Home Page",
  "description": "The default description for all pages",
  "url": "https://example.com",
  "locale": "en_US",

  "defaultImage": "/images/default-og.jpg",
  "twitter": "@mysite",
  "facebookAppId": "123456789",
  "siteOwner": "Your Name",

  "organization": {
    "name": "My Company",
    "logo": "https://example.com/logo.png",
    "sameAs": [
      "https://twitter.com/mycompany",
      "https://facebook.com/mycompany",
      "https://linkedin.com/company/mycompany"
    ],
    "contactPoint": {
      "telephone": "+1-555-123-4567",
      "contactType": "customer service"
    }
  },

  "social": {
    "twitterCreator": "@author",
    "locale": "en_US"
  },

  "sitemap": {
    "changefreq": "weekly",
    "priority": 0.8
  }
}

Site.json Property Mapping

| site.json Property | SEO Usage | Example |
| ------------------ | ------------------------ | --------------------------- |
| url | Hostname for all URLs | https://example.com |
| name / title | Site name in Open Graph | og:site_name |
| description | Default meta description | <meta name="description"> |
| defaultImage | Default social image | og:image, twitter:image |
| locale | Content language | og:locale |
| twitter | Twitter site handle | twitter:site |
| facebookAppId | Facebook integration | fb:app_id |
| siteOwner | Default author fallback | <meta name="author"> |
| organization | Company info | JSON-LD Organization schema |

Configuration Precedence

The plugin uses this priority order:

  1. Page frontmatter (seo property) - Highest priority
  2. Plugin options - Override site defaults
  3. site.json values - Site-wide defaults
  4. Automatic fallbacks - Auto-generated from content

Plugin Options

Basic Configuration

.use(seo({
  hostname: 'https://example.com',  // Required if not in site.json

  // Global defaults for all pages
  defaults: {
    title: 'My Site',
    description: 'Default page description',
    socialImage: '/images/default-og.jpg'
  },

  // Social media configuration
  social: {
    siteName: 'My Site',
    twitterSite: '@mysite',
    twitterCreator: '@author',
    facebookAppId: '123456789',
    locale: 'en_US'
  },

  // JSON-LD structured data
  jsonLd: {
    organization: {
      name: 'My Company',
      logo: 'https://example.com/logo.png'
    }
  }
}))

Advanced Options

.use(seo({
  hostname: 'https://example.com',

  // Customize where to find site metadata
  metadataPath: 'site',     // Default: 'site' (can be 'data.site' or any path)

  // Customize frontmatter property name
  seoProperty: 'seo',        // Default: 'seo'

  // Fallback property mappings (dotted paths like 'author.name' are supported)
  fallbacks: {
    title: 'title',            // Default: 'title'
    description: 'excerpt',    // Default: 'excerpt'
    image: 'featured_image',   // Default: 'featured_image'
    author: 'author'           // Default: 'author'
  },

  // Sitemap configuration
  sitemap: {
    output: 'sitemap.xml',
    auto: true,              // Default: true (automatic calculation)
    changefreq: 'weekly',    // Override auto-calculation
    priority: 0.8,           // Override auto-calculation
    omitIndex: false
  },

  // Robots.txt configuration
  robots: {
    generateRobots: true,      // Generate robots.txt if none exists
    addSitemapReference: true, // Add sitemap reference to existing robots.txt
    disallowPaths: ['/admin/', '/private/'], // Paths to disallow
    userAgent: '*'             // User agent directive
  },

  // llms.txt configuration (opt-in)
  llms: {
    enabled: true,             // Opt in to llms.txt generation
    output: 'llms.txt',        // Index filename
    fullText: false,           // Also emit llms-full.txt (concatenated plaintext)
    fullTextOutput: 'llms-full.txt',
    title: 'My Site',          // Defaults to social.siteName
    description: 'Tagline.',   // Defaults to defaults.description
    details: undefined,        // Extra markdown block under the description
    pattern: '**/*.html',      // Which files to include
    privateProperty: 'private',// Frontmatter flag that excludes a file
    groups: {                  // Optional: named groups by glob pattern
      Writing: 'writing/**/*.html',
      Notes:   'studio-notes/**/*.html'
    },
    perLocale: false,          // Emit one file pair per locale
    locales: ['en', 'de'],     // Known locale path prefixes (fallback when
                               // frontmatter.locale isn't set by a plugin)
    sort: 'date-desc'          // 'date-desc' | 'date-asc' | 'alpha'
  },

  // Reading time calculation
  wordsPerMinute: 200,    // Default: 200 (reading speed for reading-time metadata)

  // Performance options
  batchSize: 10,          // Process files in batches
  enableSitemap: true,    // Generate sitemap.xml
  enableRobots: true,     // Generate/update robots.txt
  enableLlms: true        // Generate llms.txt (or set llms.enabled: true)
}))
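
To make the batchSize option concrete, here is a sketch of batched processing in general. It assumes that files inside a batch run concurrently and that batches run one after another; the plugin's actual scheduling may differ, and processInBatches is a hypothetical helper.

```javascript
// Illustrative only: run an async task over items, batchSize at a time.
async function processInBatches(items, batchSize, task) {
  const results = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Items within a batch run concurrently; the next batch waits for this one.
    results.push(...(await Promise.all(batch.map(task))));
  }
  return results;
}

const files = ['index.html', 'about.html', 'blog/post.html'];
const done = processInBatches(files, 2, async (name) => name.toUpperCase());

done.then((names) => console.log(names));
```

A smaller batch size bounds how many files are in flight at once, which trades some throughput for lower peak memory.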

SEO Property Reference

Core SEO Properties (Frontmatter)

seo:
  # Essential properties (covers 90% of SEO needs)
  title: 'Page-specific title'
  description: 'Page-specific description'
  image: '/images/page-image.jpg'

  # Content type (auto-detected if not specified)
  type: 'article' # article, product, page, local-business

  # URL and indexing
  canonicalURL: 'https://example.com/custom-url'
  robots: 'index,follow' # Default: "index,follow"
  noIndex: false # Exclude from search engines

  # Dates (auto-detected from frontmatter if available)
  publishDate: '2024-01-15'
  modifiedDate: '2024-01-20'

  # Author and content metadata
  author: 'John Doe'
  keywords: ['seo', 'metalsmith', 'optimization']

Content Type Detection

The plugin automatically detects content type:

  • Article: Has date and author or tags
  • Product: Has price or sku properties
  • Local Business: Has address or phone
  • Page: Default fallback
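
The rules above can be sketched as a small function. This is one possible reading, not the plugin's source: it assumes "date and author or tags" means date && (author || tags), that the rules are checked in the order listed, and that an explicit seo.type always wins. detectContentType is a hypothetical name.

```javascript
// Illustrative only: one interpretation of the documented detection rules.
function detectContentType(frontmatter) {
  // An explicit type in the seo object skips detection entirely.
  if (frontmatter.seo?.type) return frontmatter.seo.type;

  const has = (key) => frontmatter[key] !== undefined && frontmatter[key] !== null;

  if (has('date') && (has('author') || has('tags'))) return 'article';
  if (has('price') || has('sku')) return 'product';
  if (has('address') || has('phone')) return 'local-business';
  return 'page';
}

console.log(detectContentType({ title: 'Ultimate SEO Guide', date: '2024-01-15', author: 'Jane Smith' })); // article
console.log(detectContentType({ title: 'Amazing Widget', price: '$99.99' })); // product
console.log(detectContentType({ title: 'About' })); // page
```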

Output Examples

Blog Article

Input:

---
title: 'Ultimate SEO Guide'
date: 2024-01-15
author: 'Jane Smith'
tags: ['seo', 'marketing']
seo:
  description: 'Complete guide to SEO optimization'
  image: '/images/seo-guide.jpg'
---

Generated SEO:

<title>Ultimate SEO Guide</title>
<meta name="description" content="Complete guide to SEO optimization" />
<meta property="og:type" content="article" />
<meta property="og:article:author" content="Jane Smith" />
<meta property="og:article:published_time" content="2024-01-15" />
<meta property="og:article:tag" content="seo" />
<meta property="og:article:tag" content="marketing" />

<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Ultimate SEO Guide",
    "author": { "@type": "Person", "name": "Jane Smith" },
    "datePublished": "2024-01-15",
    "keywords": ["seo", "marketing"]
  }
</script>

Product Page

Input:

---
title: 'Amazing Widget'
price: '$99.99'
seo:
  description: 'The best widget money can buy'
  image: '/images/widget.jpg'
  type: 'product'
---

Generated SEO:

<title>Amazing Widget</title>
<meta property="og:type" content="product" />
<meta property="og:price:amount" content="99.99" />
<meta property="og:price:currency" content="USD" />

<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Amazing Widget",
    "offers": {
      "@type": "Offer",
      "price": "99.99",
      "priceCurrency": "USD"
    }
  }
</script>

Robots.txt Management

The plugin handles robots.txt files:

Automatic Generation

If no robots.txt exists, the plugin generates a basic one:

User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml

Coordination with Existing Files

If robots.txt already exists, the plugin:

  1. Preserves existing content - Never overwrites your custom directives
  2. Adds sitemap reference - Automatically adds sitemap URL if missing
  3. Avoids duplicates - Won't add multiple sitemap references

Example - Before:

User-agent: *
Disallow: /admin/
Disallow: /private/

Example - After plugin processing:

User-agent: *
Disallow: /admin/
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
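
The three guarantees above (preserve, append if missing, never duplicate) can be sketched as a pure string function. ensureSitemapReference is a hypothetical helper, and the exact whitespace handling is an assumption chosen to reproduce the before/after example.

```javascript
// Illustrative only: append a Sitemap line to existing robots.txt content
// unless that exact reference is already present.
function ensureSitemapReference(existing, sitemapUrl) {
  const wanted = `sitemap: ${sitemapUrl}`.toLowerCase();
  const alreadyListed = existing
    .split(/\r?\n/)
    .some((line) => line.trim().toLowerCase() === wanted);
  if (alreadyListed) return existing; // nothing to do, avoid duplicates

  // Keep the existing directives untouched; only normalize trailing whitespace.
  const body = existing.replace(/\s+$/, '');
  return `${body}\n\nSitemap: ${sitemapUrl}\n`;
}

const before = 'User-agent: *\nDisallow: /admin/\nDisallow: /private/\n';
const after = ensureSitemapReference(before, 'https://example.com/sitemap.xml');

console.log(after);
// Running it again leaves the content unchanged.
console.log(ensureSitemapReference(after, 'https://example.com/sitemap.xml') === after); // true
```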

Custom Robots.txt Configuration

.use(seo({
  hostname: 'https://example.com',
  robots: {
    generateRobots: true,      // Generate if missing (default: true)
    addSitemapReference: true, // Add sitemap to existing (default: true)
    disallowPaths: ['/admin/', '/api/'], // Paths to disallow
    userAgent: 'Googlebot'     // Specific user agent (default: '*')
  }
}))

Generated output:

User-agent: Googlebot
Disallow: /admin/
Disallow: /api/

Sitemap: https://example.com/sitemap.xml
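
For reference, the mapping from these options to the generated file can be sketched as follows. renderRobots is a hypothetical function written to reproduce the outputs shown in this section; it is not the plugin's implementation.

```javascript
// Illustrative only: build robots.txt content from the documented options.
function renderRobots({ userAgent = '*', disallowPaths = [], sitemapUrl }) {
  const lines = [`User-agent: ${userAgent}`];
  if (disallowPaths.length === 0) {
    lines.push('Disallow:'); // an empty value disallows nothing
  } else {
    for (const path of disallowPaths) lines.push(`Disallow: ${path}`);
  }
  lines.push('', `Sitemap: ${sitemapUrl}`);
  return lines.join('\n') + '\n';
}

console.log(
  renderRobots({
    userAgent: 'Googlebot',
    disallowPaths: ['/admin/', '/api/'],
    sitemapUrl: 'https://example.com/sitemap.xml',
  })
);
```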

Disabling Robots.txt Processing

.use(seo({
  hostname: 'https://example.com',
  enableRobots: false  // Skip robots.txt processing entirely
}))

llms.txt Generation

The plugin can emit llms.txt — a machine-readable index that helps large language models discover and summarize your site — and an optional llms-full.txt containing the concatenated plaintext of every included page. Both files are opt-in: set llms.enabled: true (or enableLlms: true) to turn them on.

Minimal usage

.use(seo({
  hostname: 'https://example.com',
  llms: {
    enabled: true,
    title: 'My Site',
    description: 'Short tagline for LLM consumers.'
  }
}))

The output is spec-compliant markdown:

# My Site

> Short tagline for LLM consumers.

## writing

- [Post Title](https://example.com/writing/post.html): Page description.
- ...

When title or description are omitted, they default to the plugin's resolved social.siteName and defaults.description (pulled from site metadata).
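
The shape of that output is simple enough to sketch. The function below is an illustration of the markdown layout only; renderLlmsIndex and its input shape (a map of group name to page list) are assumptions, not the plugin's API.

```javascript
// Illustrative only: render an llms.txt-style index from grouped pages.
function renderLlmsIndex({ title, description, groups }) {
  const lines = [`# ${title}`, '', `> ${description}`];
  for (const [name, pages] of Object.entries(groups)) {
    lines.push('', `## ${name}`, '');
    for (const page of pages) {
      // The description after the link is optional.
      const summary = page.description ? `: ${page.description}` : '';
      lines.push(`- [${page.title}](${page.url})${summary}`);
    }
  }
  return lines.join('\n') + '\n';
}

const index = renderLlmsIndex({
  title: 'My Site',
  description: 'Short tagline for LLM consumers.',
  groups: {
    writing: [
      {
        title: 'Post Title',
        url: 'https://example.com/writing/post.html',
        description: 'Page description.',
      },
    ],
  },
});

console.log(index);
```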

Grouping

Pages are grouped in the output in this priority order:

  1. llms.groups — an explicit map of { 'Group Name': 'glob/pattern' }. The first pattern a file matches wins.
  2. The first entry of frontmatter.collection (as set by metalsmith-collections).
  3. The top-level directory of the file path.

llms: {
  enabled: true,
  groups: {
    Writing: 'writing/**/*.html',
    Art:     'art/**/*.html'
  }
}
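
A sketch of the three-step grouping order is given below. It is an illustration under assumptions: groupFor is a hypothetical name, the glob matcher handles only the `**/` and `*` forms used in this README (a real implementation would use a glob library), and returning null for root-level files is a guess, since that case is not documented here.

```javascript
// Simplified glob matcher: supports '**/' and '*' only. Illustrative.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const pattern = escaped
    .replace(/\*\*\//g, '\u0000')      // placeholder for '**/'
    .replace(/\*/g, '[^/]*')           // '*' stays within one path segment
    .replace(/\u0000/g, '(?:.*/)?');   // '**/' spans zero or more directories
  return new RegExp(`^${pattern}$`);
}

// Illustrative only: pick a group name using the documented priority order.
function groupFor(filePath, frontmatter = {}, groups = {}) {
  // 1. Explicit groups: the first matching pattern wins.
  for (const [name, glob] of Object.entries(groups)) {
    if (globToRegExp(glob).test(filePath)) return name;
  }
  // 2. First entry of frontmatter.collection.
  const { collection } = frontmatter;
  if (Array.isArray(collection) && collection.length > 0) return collection[0];
  if (typeof collection === 'string' && collection) return collection;
  // 3. Top-level directory of the file path.
  const segments = filePath.split('/');
  return segments.length > 1 ? segments[0] : null; // root-level case is an assumption
}

const groups = { Writing: 'writing/**/*.html', Art: 'art/**/*.html' };

console.log(groupFor('writing/2024/post.html', {}, groups));                  // Writing
console.log(groupFor('notes/a.html', { collection: ['journal'] }, groups));   // journal
console.log(groupFor('misc/a.html', {}, groups));                             // misc
```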

Full-text dump

llms: { enabled: true, fullText: true }

Emits /llms-full.txt alongside /llms.txt. Each page is rendered as a ### heading, its URL, and the plaintext body (HTML stripped). Good when you want to hand an LLM the entire site in one file.

Multilingual sites

For sites built with metalsmith-multilingual (or any plugin that sets frontmatter.locale), set perLocale: true to emit one file pair per locale, rooted under the locale path:

llms: { enabled: true, perLocale: true }
// → /en_US/llms.txt, /de_DE/llms.txt, ...

If no plugin sets locale, provide a locales array and the processor will detect locale from the leading path segment:

llms: { enabled: true, perLocale: true, locales: ['en', 'de'] }
// 'de/texte/post.html' → goes into the 'de' bucket
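
The path-based fallback can be sketched like this. detectLocale is a hypothetical helper; returning undefined when nothing matches is an assumption, since the handling of unmatched files is not described here.

```javascript
// Illustrative only: prefer frontmatter.locale, otherwise match the leading
// path segment against the configured locales.
function detectLocale(filePath, frontmatter = {}, locales = []) {
  if (frontmatter.locale) return frontmatter.locale;
  const [first, ...rest] = filePath.split('/');
  // Only a directory prefix counts; a root-level file named 'de' would not.
  return rest.length > 0 && locales.includes(first) ? first : undefined;
}

console.log(detectLocale('de/texte/post.html', {}, ['en', 'de']));           // de
console.log(detectLocale('about.html', {}, ['en', 'de']));                   // undefined
console.log(detectLocale('blog/post.html', { locale: 'en_US' }, ['en']));    // en_US
```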

Excluding pages

Pages whose frontmatter has a truthy private value are excluded. The property name defaults to private and can be changed with llms.privateProperty. Use llms.pattern to restrict which files are considered at all (default: **/*.html).

Disabling

The feature is off by default. To force-disable even when config is present, set enableLlms: false.

Sitemap Generation

Sitemap Configuration Options

All sitemap options are configured under the sitemap property:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| output | string | 'sitemap.xml' | Filename for the generated sitemap |
| pattern | string | '**/*.html' | Glob pattern to match files for inclusion |
| auto | boolean | true | Enable automatic priority and changefreq calculation |
| changefreq | string | - | Default change frequency (always, hourly, daily, weekly, monthly, yearly, never) |
| priority | number | - | Default priority (0.0 to 1.0) |
| lastmod | Date \| string | - | Default last modified date for all files |
| omitIndex | boolean | false | Remove /index.html from URLs (e.g., about/index.html → about/) |
| urlProperty | string | 'canonical' | Frontmatter property name to read canonical URL overrides |
| modifiedProperty | string | 'lastmod' | Frontmatter property name to read last modified dates |
| privateProperty | string | 'private' | Frontmatter property to exclude files (if true, file is excluded) |
| priorityProperty | string | 'priority' | Frontmatter property name to read priority values |
| links | string | - | Property name for alternate language links (hreflang) |

URL Transformation Examples:

// Example 1: Default behavior (no transformation)
// File: about/index.html → URL: https://example.com/about/index.html
.use(seo({ hostname: 'https://example.com' }))

// Example 2: Clean URLs with omitIndex (recommended for permalink-style URLs)
// File: about/index.html → URL: https://example.com/about/
.use(seo({
  hostname: 'https://example.com',
  sitemap: { omitIndex: true }
}))

// Example 3: Permalink-style URLs (for use with metalsmith-permalinks)
// File: blog/my-post/index.html → URL: https://example.com/blog/my-post/
.use(seo({
  hostname: 'https://example.com',
  sitemap: { omitIndex: true }
}))
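
The effect of omitIndex on a single file path can be sketched as below. toSitemapUrl is a hypothetical helper, and mapping the root index.html to the bare hostname with a trailing slash is an assumption.

```javascript
// Illustrative only: turn a build-relative file path into a sitemap URL.
function toSitemapUrl(hostname, filePath, { omitIndex = false } = {}) {
  let urlPath = filePath;
  if (omitIndex) {
    // Drop a trailing 'index.html' but keep the directory's trailing slash.
    urlPath = urlPath.replace(/(^|\/)index\.html$/, '$1');
  }
  const base = hostname.endsWith('/') ? hostname : `${hostname}/`;
  return new URL(urlPath, base).href;
}

console.log(toSitemapUrl('https://example.com', 'about/index.html'));
// https://example.com/about/index.html
console.log(toSitemapUrl('https://example.com', 'about/index.html', { omitIndex: true }));
// https://example.com/about/
console.log(toSitemapUrl('https://example.com', 'blog/my-post/index.html', { omitIndex: true }));
// https://example.com/blog/my-post/
```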

Excluding Files from Sitemap:

---
title: 'Draft Page'
private: true  # This page won't appear in sitemap
---

Or use a custom property name:

.use(seo({
  hostname: 'https://example.com',
  sitemap: {
    privateProperty: 'draft'  // Exclude files with draft: true
  }
}))

Automatic Calculation (Default)

By default, the plugin automatically calculates optimal values for sitemap entries:

.use(seo('https://example.com'))  // Auto-calculation enabled by default

What gets auto-calculated:

  • Priority (0.1-1.0) is derived from URL hierarchy only — content length and content type are deliberately not used:

    • index.html (the homepage) → 1.0
    • Root-level pages → 0.8
    • One level deep → 0.6
    • Two levels deep → 0.4
    • Deeper pages → 0.3
    • Any non-root index.html (a section landing page) gets +0.2
  • Change Frequency is derived from URL pattern plus the page's lastmod:

    • index.html (homepage) → weekly
    • Other index.html pages (section landings) → monthly
    • Pages modified in the last 30 days → monthly
    • Pages modified in the last 365 days → yearly
    • Default → yearly
  • Last Modified uses the frontmatter lastmod property (or whichever property modifiedProperty is set to), then the global lastmod option as a fallback. Invalid date strings are skipped rather than emitted as the epoch.
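
The priority and change-frequency rules above can be sketched as two small functions. This is one reading of the list, not the plugin's source: it assumes depth is the number of parent directories, that the +0.2 boost is capped at 1.0, and that the rules are checked in the order listed. sketchPriority and sketchChangefreq are hypothetical names.

```javascript
// Illustrative only: priority from URL hierarchy.
function sketchPriority(filePath) {
  if (filePath === 'index.html') return 1.0; // homepage
  const segments = filePath.split('/');
  const depth = segments.length - 1;         // number of parent directories
  const base = [0.8, 0.6, 0.4][depth] ?? 0.3; // root, one deep, two deep, deeper
  const isIndex = segments[segments.length - 1] === 'index.html';
  const boosted = isIndex ? base + 0.2 : base; // section landing pages get +0.2
  // Round to one decimal to avoid floating-point noise such as 0.6000000000000001.
  return Math.min(1.0, Math.round(boosted * 10) / 10);
}

// Illustrative only: changefreq from URL pattern plus lastmod.
function sketchChangefreq(filePath, lastmod, now = new Date()) {
  if (filePath === 'index.html') return 'weekly';
  if (filePath.endsWith('/index.html')) return 'monthly';
  if (lastmod) {
    const ageDays = (now - new Date(lastmod)) / 86_400_000; // ms per day
    if (ageDays <= 30) return 'monthly';
    if (ageDays <= 365) return 'yearly';
  }
  return 'yearly';
}

console.log(sketchPriority('index.html'));                  // 1
console.log(sketchPriority('blog/seo-guide/index.html'));   // 0.6
console.log(sketchChangefreq('blog/seo-guide/index.html')); // monthly
```

Under this reading, blog/seo-guide/index.html is two levels deep (0.4) plus the landing-page boost (0.2), which gives 0.6.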

Example auto-generated sitemap:

<url>
  <loc>https://example.com/index.html</loc>
  <lastmod>2024-01-15</lastmod>
  <changefreq>weekly</changefreq>
  <priority>1.0</priority>
</url>
<url>
  <loc>https://example.com/blog/seo-guide/index.html</loc>
  <lastmod>2024-01-10</lastmod>
  <changefreq>monthly</changefreq>
  <priority>0.6</priority>
</url>

Manual Override Options

Disable auto-calculation for minimal sitemaps:

.use(seo({
  hostname: 'https://example.com',
  sitemap: {
    auto: false  // Disable auto-calculation (minimal sitemap)
  }
}))

Set global defaults (auto-calculation disabled):

.use(seo({
  hostname: 'https://example.com',
  sitemap: {
    auto: false,
    changefreq: 'weekly',
    priority: 0.8
  }
}))

Per-page overrides in frontmatter:

---
title: 'Important Page'
seo:
  priority: 1.0 # Override auto-calculated priority
  changefreq: 'daily' # Override auto-calculated frequency
  lastmod: '2024-01-15' # Override file modification date
---

Benefits of Auto-Calculation

Better SEO Performance:

  • Accurate lastmod dates that Google trusts and uses
  • Consistent priorities derived from URL hierarchy
  • Change frequencies derived from URL patterns and each page's lastmod

Developer Experience:

  • Minimal configuration - works out of the box with just a hostname
  • No manual maintenance - adapts as your site grows
  • Override capability for special cases

Migration from v0.x to v1.0

Version 1.0.0 modernizes the toolchain. No plugin API or behavior changed — only the packaging.

Breaking Changes

  1. ESM only. The CommonJS build is gone. Use import seo from 'metalsmith-seo' from an ESM project. On Node.js 22.12+, require('metalsmith-seo').default also works thanks to require(esm) — no separate CJS build is needed.
  2. Node.js 22+ required. Earlier versions are unsupported.

No API changes. Plugin options, extraction logic, and generated output are identical to v0.8.x.

Migration from metalsmith-sitemap

This plugin includes all metalsmith-sitemap functionality:

Before:

.use(sitemap({
  hostname: 'https://example.com',
  changefreq: 'weekly',
  priority: 0.8
}))

After:

.use(seo({
  hostname: 'https://example.com',
  sitemap: {
    changefreq: 'weekly',
    priority: 0.8
  }
  // Now you also get SEO optimization!
}))

License

MIT License - see LICENSE file for details.

Contributing

Contributions welcome! Please read our contributing guidelines first.

Attribution

The sitemap functionality in this plugin was inspired by and adapted from:

Related