metalsmith-seo
v1.1.3
A metalsmith plugin for SEO optimization including sitemap generation, meta tags, and more.
Inspired by metalsmith-sitemap, this plugin provides SEO optimization for Metalsmith: HTML meta tags, Open Graph tags, Twitter Cards, JSON-LD structured data, and sitemap generation.
Version 1.x is ESM-only and requires Node.js 22+. The plugin API and output are unchanged from v0.8.x — only the packaging was modernized in 1.0.0. See the migration guide below.
Features
Core SEO Optimization:
- HTML Head Optimization - Meta tags, canonical URLs, robots directives
- Open Graph Tags - Social media sharing with Facebook, LinkedIn, etc.
- Twitter Cards - Rich Twitter previews with automatic card type detection
- JSON-LD Structured Data - Article, Product, Organization, WebPage schemas
- Sitemap Generation - Complete sitemap.xml with auto-calculation of priority, changefreq, and lastmod
- Robots.txt Management - robots.txt generation and sitemap coordination
- llms.txt Generation - Opt-in markdown index (and optional plaintext dump) for large language model consumers, per the llmstxt.org proposal
Automation:
- Content Analysis - Auto-detects content type (article, product, page)
- Metadata Derivation - Single source feeds all formats (title → og:title, twitter:title, JSON-LD headline)
- Fallback Chains - Defaults from site.json, frontmatter, or content analysis
- Site.json Integration - Integration with existing Metalsmith site configuration
Performance:
- Head-only Cheerio parsing - Only the <head> section is fed to the HTML parser; the body (often the bulk of the document) is never parsed or serialized
- Single-pass processing - All meta tags, Open Graph, Twitter Cards, JSON-LD, and link tags are injected in one parse/serialize cycle per file
- Batch processing - Files are processed in configurable parallel batches
Developer Experience:
- ESM-only - Modern Node.js (>= 22) with native ESM
- Minimal Configuration - Works great with just a hostname
- Comprehensive Testing - High test coverage with real-world scenarios (see badge above)
What this plugin won't do
Scope boundaries that come up often enough to be worth stating up front. Each is a deliberate design decision — see docs/THEORY.md §5 for the reasoning.
- Score pages by content length. Sitemap priority is derived from URL hierarchy, not word count. Long privacy policies don't outrank the homepage.
- Ship any client-side JavaScript. Output is static HTML. SEO that depends on runtime execution is invisible to crawlers that don't run JS.
- Parse markdown. HTML in, HTML out. Run @metalsmith/markdown (or similar) before this plugin.
- Modify anything outside <head>. No body-content rewriting, no mid-page schema injection, no image-tag manipulation.
- Make network requests. No og:image dimension fetching, no live sitemap validation. Builds are hermetic.
Requirements
- Node.js >= 22
- Metalsmith >= 2.5.0
- ESM-only (no CommonJS support)
Installation
npm install metalsmith-seo
Usage
Quick Start
Minimal Setup
import Metalsmith from 'metalsmith';
import seo from 'metalsmith-seo';
Metalsmith(import.meta.dirname)
.use(
seo({
hostname: 'https://example.com',
})
)
.build();
This simple configuration automatically generates:
- Complete HTML meta tags
- Open Graph tags for social sharing
- Twitter Card tags
- JSON-LD structured data
- sitemap.xml with calculated priority/changefreq/lastmod values
- robots.txt (with sitemap reference)
With site.json Integration (Recommended)
Create data/site.json:
{
"name": "My Awesome Site",
"title": "My Site - Welcome",
"description": "The best site on the internet",
"url": "https://example.com",
"locale": "en_US",
"twitter": "@mysite",
"organization": {
"name": "My Company",
"logo": "https://example.com/logo.png"
}
}
Then use the plugin:
import Metalsmith from 'metalsmith';
import metadata from '@metalsmith/metadata';
import seo from 'metalsmith-seo';
Metalsmith(import.meta.dirname)
.use(metadata({ site: 'data/site.json' }))
.use(seo()) // Automatically uses site.json values!
.build();
Or if your site metadata is nested differently:
import Metalsmith from 'metalsmith';
import metadata from '@metalsmith/metadata';
import seo from 'metalsmith-seo';
// If metadata is at metadata().data.site instead of metadata().site
Metalsmith(import.meta.dirname)
.use(
metadata({
data: {
site: 'data/site.json',
},
})
)
.use(
seo({
metadataPath: 'data.site', // Tell plugin where to find site metadata
})
)
.build();
Frontmatter Integration
Add SEO data to any page. The plugin extracts metadata from multiple locations:
---
title: 'My Blog Post'
date: 2024-01-15
seo:
title: 'Advanced SEO Techniques - My Blog'
description: 'Learn how to optimize your site for search engines'
image: '/images/seo-guide.jpg'
type: 'article'
---
Card Object Support
The plugin also extracts metadata from card objects (commonly used for blog post listings):
---
layout: pages/sections.njk
draft: false
seo:
title: 'Override Title for SEO' # Highest priority
description: 'SEO-specific description'
card:
title: 'Architecture Philosophy' # Used if not in seo object
date: '2025-06-02'
author:
- Albert Einstein
- Isaac Newton
image: '/assets/images/sample9.jpg'
excerpt: 'This starter embodies several key principles...'
---
Metadata Extraction Priority:
- seo object (highest priority - explicit SEO overrides)
- card object (for blog posts and content cards)
- Root level properties
- Configured defaults
- Site-wide defaults (from site.json)
- Auto-generated content
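The lookup order above can be pictured as a first-match-wins chain. The following is a minimal sketch and not the plugin's actual code: `resolveField` and its argument shape are hypothetical, and it only checks a same-named key in each source, whereas the plugin also supports configurable fallback property names.

```javascript
// Hypothetical helper: returns the first defined, non-empty value
// found while walking the sources in priority order.
function resolveField(name, { frontmatter = {}, defaults = {}, site = {} } = {}) {
  const sources = [
    frontmatter.seo,  // 1. explicit SEO overrides
    frontmatter.card, // 2. card object
    frontmatter,      // 3. root-level properties
    defaults,         // 4. configured defaults
    site,             // 5. site-wide defaults (site.json)
  ];
  for (const source of sources) {
    const value = source?.[name];
    if (value !== undefined && value !== null && value !== '') return value;
  }
  return undefined; // 6. a caller would fall back to auto-generated content here
}

const page = {
  title: 'My Blog Post',
  seo: { description: 'SEO-specific description' },
  card: { title: 'Architecture Philosophy' },
};

console.log(resolveField('title', { frontmatter: page }));       // 'Architecture Philosophy'
console.log(resolveField('description', { frontmatter: page })); // 'SEO-specific description'
```

Note that the card title wins over the root-level title here because the card object sits higher in the chain.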
Author Fallback Chain:
When no author is specified in frontmatter, the plugin uses siteOwner from your site.json as a fallback, ensuring all content has proper attribution for SEO and social media.
Result: Comprehensive SEO markup automatically generated:
<!-- Basic Meta -->
<title>Advanced SEO Techniques - My Blog</title>
<meta name="description" content="Learn how to optimize your site for search engines" />
<link rel="canonical" href="https://example.com/blog/advanced-seo" />
<!-- Open Graph -->
<meta property="og:title" content="Advanced SEO Techniques - My Blog" />
<meta property="og:type" content="article" />
<meta property="og:image" content="https://example.com/images/seo-guide.jpg" />
<!-- Twitter Cards -->
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Advanced SEO Techniques - My Blog" />
<!-- JSON-LD Structured Data -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Advanced SEO Techniques - My Blog",
"image": "https://example.com/images/seo-guide.jpg",
"datePublished": "2024-01-15",
"author": { "@type": "Person", "name": "Site Author" }
}
</script>
Site.json Configuration
The plugin integrates seamlessly with your existing site.json configuration:
Complete site.json Example
{
"name": "My Awesome Site",
"title": "My Site - Home Page",
"description": "The default description for all pages",
"url": "https://example.com",
"locale": "en_US",
"defaultImage": "/images/default-og.jpg",
"twitter": "@mysite",
"facebookAppId": "123456789",
"siteOwner": "Your Name",
"organization": {
"name": "My Company",
"logo": "https://example.com/logo.png",
"sameAs": [
"https://twitter.com/mycompany",
"https://facebook.com/mycompany",
"https://linkedin.com/company/mycompany"
],
"contactPoint": {
"telephone": "+1-555-123-4567",
"contactType": "customer service"
}
},
"social": {
"twitterCreator": "@author",
"locale": "en_US"
},
"sitemap": {
"changefreq": "weekly",
"priority": 0.8
}
}
Site.json Property Mapping
| site.json Property | SEO Usage | Example |
| ------------------ | ------------------------ | --------------------------- |
| url | Hostname for all URLs | https://example.com |
| name / title | Site name in Open Graph | og:site_name |
| description | Default meta description | <meta name="description"> |
| defaultImage | Default social image | og:image, twitter:image |
| locale | Content language | og:locale |
| twitter | Twitter site handle | twitter:site |
| facebookAppId | Facebook integration | fb:app_id |
| siteOwner | Default author fallback | <meta name="author"> |
| organization | Company info | JSON-LD Organization schema |
Configuration Precedence
The plugin uses this priority order:
- Page frontmatter (seo property) - Highest priority
- Plugin options - Override site defaults
- site.json values - Site-wide defaults
- Automatic fallbacks - Auto-generated from content
Plugin Options
Basic Configuration
.use(seo({
hostname: 'https://example.com', // Required if not in site.json
// Global defaults for all pages
defaults: {
title: 'My Site',
description: 'Default page description',
socialImage: '/images/default-og.jpg'
},
// Social media configuration
social: {
siteName: 'My Site',
twitterSite: '@mysite',
twitterCreator: '@author',
facebookAppId: '123456789',
locale: 'en_US'
},
// JSON-LD structured data
jsonLd: {
organization: {
name: 'My Company',
logo: 'https://example.com/logo.png'
}
}
}))
Advanced Options
.use(seo({
hostname: 'https://example.com',
// Customize where to find site metadata
metadataPath: 'site', // Default: 'site' (can be 'data.site' or any path)
// Customize frontmatter property name
seoProperty: 'seo', // Default: 'seo'
// Fallback property mappings (dotted paths like 'author.name' are supported)
fallbacks: {
title: 'title', // Default: 'title'
description: 'excerpt', // Default: 'excerpt'
image: 'featured_image', // Default: 'featured_image'
author: 'author' // Default: 'author'
},
// Sitemap configuration
sitemap: {
output: 'sitemap.xml',
auto: true, // Default: true (automatic calculation)
changefreq: 'weekly', // Override auto-calculation
priority: 0.8, // Override auto-calculation
omitIndex: false
},
// Robots.txt configuration
robots: {
generateRobots: true, // Generate robots.txt if none exists
addSitemapReference: true, // Add sitemap reference to existing robots.txt
disallowPaths: ['/admin/', '/private/'], // Paths to disallow
userAgent: '*' // User agent directive
},
// llms.txt configuration (opt-in)
llms: {
enabled: true, // Opt in to llms.txt generation
output: 'llms.txt', // Index filename
fullText: false, // Also emit llms-full.txt (concatenated plaintext)
fullTextOutput: 'llms-full.txt',
title: 'My Site', // Defaults to social.siteName
description: 'Tagline.', // Defaults to defaults.description
details: undefined, // Extra markdown block under the description
pattern: '**/*.html', // Which files to include
privateProperty: 'private',// Frontmatter flag that excludes a file
groups: { // Optional: named groups by glob pattern
Writing: 'writing/**/*.html',
Notes: 'studio-notes/**/*.html'
},
perLocale: false, // Emit one file pair per locale
locales: ['en', 'de'], // Known locale path prefixes (fallback when
// frontmatter.locale isn't set by a plugin)
sort: 'date-desc' // 'date-desc' | 'date-asc' | 'alpha'
},
// Reading time calculation
wordsPerMinute: 200, // Default: 200 (reading speed for reading-time metadata)
// Performance options
batchSize: 10, // Process files in batches
enableSitemap: true, // Generate sitemap.xml
enableRobots: true, // Generate/update robots.txt
enableLlms: true // Generate llms.txt (or set llms.enabled: true)
}))
SEO Property Reference
Core SEO Properties (Frontmatter)
seo:
# Essential properties (covers 90% of SEO needs)
title: 'Page-specific title'
description: 'Page-specific description'
image: '/images/page-image.jpg'
# Content type (auto-detected if not specified)
type: 'article' # article, product, page, local-business
# URL and indexing
canonicalURL: 'https://example.com/custom-url'
robots: 'index,follow' # Default: "index,follow"
noIndex: false # Exclude from search engines
# Dates (auto-detected from frontmatter if available)
publishDate: '2024-01-15'
modifiedDate: '2024-01-20'
# Author and content metadata
author: 'John Doe'
keywords: ['seo', 'metalsmith', 'optimization']
Content Type Detection
The plugin automatically detects content type:
- Article: Has date and author or tags
- Product: Has price or sku properties
- Local Business: Has address or phone
- Page: Default fallback
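The detection rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's implementation: `detectContentType` is a hypothetical name, the Article rule is read as "date and (author or tags)", and the order in which the rules are checked is a guess.

```javascript
// Hypothetical sketch of the content type detection rules.
function detectContentType(frontmatter = {}) {
  const has = (key) => frontmatter[key] !== undefined && frontmatter[key] !== null;

  // An explicit seo.type is used as-is; detection only runs when it is absent.
  if (frontmatter.seo?.type) return frontmatter.seo.type;

  // Assumed reading: date AND (author OR tags).
  if (has('date') && (has('author') || has('tags'))) return 'article';
  if (has('price') || has('sku')) return 'product';
  if (has('address') || has('phone')) return 'local-business';
  return 'page'; // default fallback
}

console.log(detectContentType({ date: '2024-01-15', tags: ['seo'] })); // 'article'
console.log(detectContentType({ price: '$99.99' }));                   // 'product'
console.log(detectContentType({ title: 'About' }));                    // 'page'
```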
Output Examples
Blog Article
Input:
---
title: 'Ultimate SEO Guide'
date: 2024-01-15
author: 'Jane Smith'
tags: ['seo', 'marketing']
seo:
description: 'Complete guide to SEO optimization'
image: '/images/seo-guide.jpg'
---
Generated SEO:
<title>Ultimate SEO Guide</title>
<meta name="description" content="Complete guide to SEO optimization" />
<meta property="og:type" content="article" />
<meta property="og:article:author" content="Jane Smith" />
<meta property="og:article:published_time" content="2024-01-15" />
<meta property="og:article:tag" content="seo" />
<meta property="og:article:tag" content="marketing" />
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Ultimate SEO Guide",
"author": { "@type": "Person", "name": "Jane Smith" },
"datePublished": "2024-01-15",
"keywords": ["seo", "marketing"]
}
</script>
Product Page
Input:
---
title: 'Amazing Widget'
price: '$99.99'
seo:
description: 'The best widget money can buy'
image: '/images/widget.jpg'
type: 'product'
---
Generated SEO:
<title>Amazing Widget</title>
<meta property="og:type" content="product" />
<meta property="og:price:amount" content="99.99" />
<meta property="og:price:currency" content="USD" />
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Product",
"name": "Amazing Widget",
"offers": {
"@type": "Offer",
"price": "99.99",
"priceCurrency": "USD"
}
}
</script>
Robots.txt Management
The plugin handles robots.txt files:
Automatic Generation
If no robots.txt exists, the plugin generates a basic one:
User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml
Coordination with Existing Files
If robots.txt already exists, the plugin:
- Preserves existing content - Never overwrites your custom directives
- Adds sitemap reference - Automatically adds sitemap URL if missing
- Avoids duplicates - Won't add multiple sitemap references
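A minimal sketch of that coordination logic, assuming a hypothetical `addSitemapReference` helper and assuming that any existing Sitemap line counts as "already present" (the plugin's exact matching rules are not documented here):

```javascript
// Hypothetical sketch: append a Sitemap line only when none is present.
function addSitemapReference(robotsTxt, sitemapUrl) {
  const hasSitemap = robotsTxt
    .split(/\r?\n/)
    .some((line) => /^\s*sitemap\s*:/i.test(line));
  if (hasSitemap) return robotsTxt; // existing content is left untouched

  // Make sure the new directive starts on its own line.
  const separator = robotsTxt === '' || robotsTxt.endsWith('\n') ? '' : '\n';
  return `${robotsTxt}${separator}Sitemap: ${sitemapUrl}\n`;
}

const existing = 'User-agent: *\nDisallow: /admin/\nDisallow: /private/';
const updated = addSitemapReference(existing, 'https://example.com/sitemap.xml');
console.log(updated);
```

Running the helper a second time on its own output returns the text unchanged, which is the duplicate-avoidance behavior described above.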
Example - Before:
User-agent: *
Disallow: /admin/
Disallow: /private/
Example - After plugin processing:
User-agent: *
Disallow: /admin/
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
Custom Robots.txt Configuration
.use(seo({
hostname: 'https://example.com',
robots: {
generateRobots: true, // Generate if missing (default: true)
addSitemapReference: true, // Add sitemap to existing (default: true)
disallowPaths: ['/admin/', '/api/'], // Paths to disallow
userAgent: 'Googlebot' // Specific user agent (default: '*')
}
}))
Generated output:
User-agent: Googlebot
Disallow: /admin/
Disallow: /api/
Sitemap: https://example.com/sitemap.xml
Disabling Robots.txt Processing
.use(seo({
hostname: 'https://example.com',
enableRobots: false // Skip robots.txt processing entirely
}))
llms.txt Generation
The plugin can emit llms.txt — a machine-readable index that
helps large language models discover and summarize your site — and an optional
llms-full.txt containing the concatenated plaintext of every included page.
Both files are opt-in: set llms.enabled: true (or enableLlms: true) to
turn them on.
Minimal usage
.use(seo({
hostname: 'https://example.com',
llms: {
enabled: true,
title: 'My Site',
description: 'Short tagline for LLM consumers.'
}
}))
The output is spec-compliant markdown:
# My Site
> Short tagline for LLM consumers.
## writing
- [Post Title](https://example.com/writing/post.html): Page description.
- ...
When title or description are omitted, they default to the plugin's
resolved social.siteName and defaults.description (pulled from site
metadata).
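To make the index format concrete, here is a rough sketch of how such a file could be assembled. `renderLlmsIndex` and its input shape are hypothetical, and the blank-line spacing between sections is an assumption rather than the plugin's verified output.

```javascript
// Hypothetical sketch of rendering an llms.txt index.
// `groups` maps a group name to an array of { title, url, description }.
function renderLlmsIndex({ title, description, groups }) {
  const lines = [`# ${title}`, ''];
  if (description) lines.push(`> ${description}`, '');

  for (const [groupName, pages] of Object.entries(groups)) {
    lines.push(`## ${groupName}`, '');
    for (const page of pages) {
      const summary = page.description ? `: ${page.description}` : '';
      lines.push(`- [${page.title}](${page.url})${summary}`);
    }
    lines.push(''); // blank line after each group
  }
  return lines.join('\n');
}

const index = renderLlmsIndex({
  title: 'My Site',
  description: 'Short tagline for LLM consumers.',
  groups: {
    writing: [
      {
        title: 'Post Title',
        url: 'https://example.com/writing/post.html',
        description: 'Page description.',
      },
    ],
  },
});
console.log(index);
```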
Grouping
Pages are grouped in the output in this priority order:
- llms.groups: an explicit map of { 'Group Name': 'glob/pattern' }. The first pattern a file matches wins.
- The first entry of frontmatter.collection (as set by metalsmith-collections).
- The top-level directory of the file path.
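The grouping order can be sketched as a three-step lookup. Everything here is illustrative: `resolveGroup` and `globToRegExp` are hypothetical names, the tiny glob matcher only understands `**/` and `*` (a real build would use a proper glob library), and returning `null` for root-level files is an assumption since that case is not specified above.

```javascript
// Minimal glob matcher for this sketch only: supports `**/` and `*`.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, '\\$&');
  const source = escaped
    .replace(/\*\*\//g, '\u0000')    // placeholder so the `*` step skips it
    .replace(/\*/g, '[^/]*')         // `*` stays within one path segment
    .replace(/\u0000/g, '(?:.*/)?'); // `**/` spans zero or more directories
  return new RegExp(`^${source}$`);
}

// Hypothetical sketch of the group lookup order.
function resolveGroup(filePath, frontmatter = {}, groups = {}) {
  // 1. Explicit groups: the first matching pattern wins.
  for (const [name, pattern] of Object.entries(groups)) {
    if (globToRegExp(pattern).test(filePath)) return name;
  }
  // 2. First entry of frontmatter.collection (string or array).
  const [firstCollection] = [].concat(frontmatter.collection ?? []);
  if (firstCollection) return firstCollection;

  // 3. Top-level directory of the file path.
  const segments = filePath.split('/');
  return segments.length > 1 ? segments[0] : null; // root-level files: unspecified
}

const groups = { Writing: 'writing/**/*.html', Art: 'art/**/*.html' };
console.log(resolveGroup('writing/2024/post.html', {}, groups));              // 'Writing'
console.log(resolveGroup('notes/a.html', { collection: ['journal'] }, groups)); // 'journal'
console.log(resolveGroup('notes/a.html', {}, groups));                        // 'notes'
```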
llms: {
enabled: true,
groups: {
Writing: 'writing/**/*.html',
Art: 'art/**/*.html'
}
}
Full-text dump
llms: { enabled: true, fullText: true }
Emits /llms-full.txt alongside /llms.txt. Each page is rendered as a
### heading, its URL, and the plaintext body (HTML stripped). Good when
you want to hand an LLM the entire site in one file.
Multilingual sites
For sites built with metalsmith-multilingual (or any plugin that sets
frontmatter.locale), set perLocale: true to emit one file pair per
locale, rooted under the locale path:
llms: { enabled: true, perLocale: true }
// → /en_US/llms.txt, /de_DE/llms.txt, ...
If no plugin sets locale, provide a locales array and the processor
will detect locale from the leading path segment:
llms: { enabled: true, perLocale: true, locales: ['en', 'de'] }
// 'de/texte/post.html' → goes into the 'de' bucket
Excluding pages
Pages with a truthy private frontmatter value are excluded (the default;
override with llms.privateProperty). Use llms.pattern to restrict
which files are considered at all (default: **/*.html).
Disabling
The feature is off by default. To force-disable even when config is
present, set enableLlms: false.
Sitemap Generation
Sitemap Configuration Options
All sitemap options are configured under the sitemap property:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| output | string | 'sitemap.xml' | Filename for the generated sitemap |
| pattern | string | '**/*.html' | Glob pattern to match files for inclusion |
| auto | boolean | true | Enable automatic priority and changefreq calculation |
| changefreq | string | - | Default change frequency (always, hourly, daily, weekly, monthly, yearly, never) |
| priority | number | - | Default priority (0.0 to 1.0) |
| lastmod | Date\|string | - | Default last modified date for all files |
| omitIndex | boolean | false | Remove /index.html from URLs (e.g., about/index.html → about/) |
| urlProperty | string | 'canonical' | Frontmatter property name to read canonical URL overrides |
| modifiedProperty | string | 'lastmod' | Frontmatter property name to read last modified dates |
| privateProperty | string | 'private' | Frontmatter property to exclude files (if true, file is excluded) |
| priorityProperty | string | 'priority' | Frontmatter property name to read priority values |
| links | string | - | Property name for alternate language links (hreflang) |
URL Transformation Examples:
// Example 1: Default behavior (no transformation)
// File: about/index.html → URL: https://example.com/about/index.html
.use(seo({ hostname: 'https://example.com' }))
// Example 2: Clean URLs with omitIndex (recommended for permalink-style URLs)
// File: about/index.html → URL: https://example.com/about/
.use(seo({
hostname: 'https://example.com',
sitemap: { omitIndex: true }
}))
// Example 3: Permalink-style URLs (for use with metalsmith-permalinks)
// File: blog/my-post/index.html → URL: https://example.com/blog/my-post/
.use(seo({
hostname: 'https://example.com',
sitemap: { omitIndex: true }
}))
Excluding Files from Sitemap:
---
title: 'Draft Page'
private: true # This page won't appear in sitemap
---
Or use a custom property name:
.use(seo({
hostname: 'https://example.com',
sitemap: {
privateProperty: 'draft' // Exclude files with draft: true
}
}))
Automatic Calculation (Default)
By default, the plugin automatically calculates optimal values for sitemap entries:
.use(seo('https://example.com')) // Auto-calculation enabled by default
What gets auto-calculated:
Priority (0.1-1.0) is derived from URL hierarchy only; content length and content type are deliberately not used:
- index.html (the homepage) → 1.0
- Root-level pages → 0.8
- One level deep → 0.6
- Two levels deep → 0.4
- Deeper pages → 0.3
- Any non-root index.html (a section landing page) gets +0.2
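As a rough illustration of the priority rules, the sketch below maps a file path to a priority. `calculatePriority` is a hypothetical name, and "depth" is assumed to mean the number of directory segments in the path; the plugin may count levels differently.

```javascript
// Hypothetical sketch of the URL-hierarchy priority rules.
// Assumption: depth = number of directory segments in the file path.
function calculatePriority(filePath) {
  if (filePath === 'index.html') return 1.0; // homepage

  const segments = filePath.split('/');
  const fileName = segments.pop();
  const depth = segments.length;

  // 0 dirs → 0.8, 1 dir → 0.6, 2 dirs → 0.4, deeper → 0.3
  const base = [0.8, 0.6, 0.4][depth] ?? 0.3;
  const bonus = fileName === 'index.html' ? 0.2 : 0; // section landing page

  // Round to one decimal to avoid floating-point noise; cap at 1.0.
  return Math.min(1.0, Math.round((base + bonus) * 10) / 10);
}

console.log(calculatePriority('index.html'));                // 1
console.log(calculatePriority('about.html'));                // 0.8
console.log(calculatePriority('blog/seo-guide/index.html')); // 0.6
```

Under this reading, a page two directories deep that is also a section landing page gets 0.4 + 0.2 = 0.6.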
Change Frequency is derived from URL pattern plus the page's lastmod:
- index.html (homepage) → weekly
- Other index.html pages (section landings) → monthly
- Pages modified in the last 30 days → monthly
- Pages modified in the last 365 days → yearly
- Default → yearly
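The change frequency rules can be sketched the same way. `calculateChangefreq` is a hypothetical name, the rules are assumed to be checked in the order listed, and `now` is passed in so the result is reproducible.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical sketch of the change frequency rules, checked in listed order.
function calculateChangefreq(filePath, lastmod, now = new Date()) {
  if (filePath === 'index.html') return 'weekly';          // homepage
  if (filePath.endsWith('/index.html')) return 'monthly';  // section landing

  if (lastmod instanceof Date && !Number.isNaN(lastmod.getTime())) {
    const ageInDays = (now.getTime() - lastmod.getTime()) / DAY_MS;
    if (ageInDays <= 30) return 'monthly';
    if (ageInDays <= 365) return 'yearly';
  }
  return 'yearly'; // default
}

const referenceDate = new Date('2024-02-01T00:00:00Z');
console.log(calculateChangefreq('index.html', undefined, referenceDate)); // 'weekly'
console.log(
  calculateChangefreq('blog/post.html', new Date('2024-01-15T00:00:00Z'), referenceDate)
); // 'monthly'
```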
Last Modified uses the frontmatter lastmod property (or whichever property modifiedProperty is set to), then the global lastmod option as a fallback. Invalid date strings are skipped rather than emitted as the epoch.
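The lastmod fallback, including the skipping of invalid dates, might look like the sketch below. `resolveLastmod` and `toValidDate` are hypothetical helpers; only the option names `modifiedProperty` and `lastmod` come from the options table.

```javascript
// Hypothetical helper: returns a Date, or null when the value is missing or unparseable.
function toValidDate(value) {
  if (value === undefined || value === null) return null;
  const date = value instanceof Date ? value : new Date(value);
  return Number.isNaN(date.getTime()) ? null : date;
}

// Hypothetical sketch: frontmatter value first, then the global option.
function resolveLastmod(frontmatter, { modifiedProperty = 'lastmod', lastmod } = {}) {
  return toValidDate(frontmatter[modifiedProperty]) ?? toValidDate(lastmod);
}

// An invalid frontmatter date is skipped and the global option is used instead.
const resolved = resolveLastmod({ lastmod: 'not a date' }, { lastmod: '2024-01-15' });
console.log(resolved.toISOString().slice(0, 10)); // '2024-01-15'
```

When neither source yields a valid date, the helper returns null, so a caller can omit the lastmod element entirely.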
Example auto-generated sitemap:
<url>
<loc>https://example.com/index.html</loc>
<lastmod>2024-01-15</lastmod>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://example.com/blog/seo-guide/index.html</loc>
<lastmod>2024-01-10</lastmod>
<changefreq>monthly</changefreq>
<priority>0.6</priority>
</url>
Manual Override Options
Disable auto-calculation for minimal sitemaps:
.use(seo({
hostname: 'https://example.com',
sitemap: {
auto: false // Disable auto-calculation (minimal sitemap)
}
}))
Set global defaults (auto-calculation disabled):
.use(seo({
hostname: 'https://example.com',
sitemap: {
auto: false,
changefreq: 'weekly',
priority: 0.8
}
}))
Per-page overrides in frontmatter:
---
title: 'Important Page'
seo:
priority: 1.0 # Override auto-calculated priority
changefreq: 'daily' # Override auto-calculated frequency
lastmod: '2024-01-15' # Override file modification date
---
Benefits of Auto-Calculation
Better SEO Performance:
- ✅ Accurate lastmod dates that Google trusts and uses
- ✅ Consistent priorities derived from URL hierarchy
- ✅ Change frequencies calculated from URL pattern and lastmod
Developer Experience:
- ✅ Zero configuration - works perfectly out of the box
- ✅ No manual maintenance - adapts as your site grows
- ✅ Override capability for special cases
Migration from v0.x to v1.0
Version 1.0.0 modernizes the toolchain. No plugin API or behavior changed — only the packaging.
Breaking Changes
- ESM only. The CommonJS build is gone. Use import seo from 'metalsmith-seo' from an ESM project. On Node.js 22.12+, require('metalsmith-seo').default also works thanks to require(esm), so no separate CJS build is needed.
- Node.js 22+ required. Earlier versions are unsupported.
No API changes. Plugin options, extraction logic, and generated output are identical to v0.8.x.
Migration from metalsmith-sitemap
This plugin includes all metalsmith-sitemap functionality:
Before:
.use(sitemap({
hostname: 'https://example.com',
changefreq: 'weekly',
priority: 0.8
}))
After:
.use(seo({
hostname: 'https://example.com',
sitemap: {
changefreq: 'weekly',
priority: 0.8
}
// Now you also get SEO optimization!
}))
License
MIT License - see LICENSE file for details.
Contributing
Contributions welcome! Please read our contributing guidelines first.
Attribution
The sitemap functionality in this plugin was inspired by and adapted from:
- metalsmith-sitemap by ExtraHop (MIT License)
Related
- @metalsmith/metadata - For loading site.json
- metalsmith - The static site generator
