@framework-cwf/seo
v0.2.6
Published
Per-route generateMetadata + JSON-LD schema generators + sitemap.xml/robots.txt/internal-link builders. Pure functions from typed config.
@framework-cwf/seo
Pure-function SEO builders for the customer-website framework. Consumes
the typed OperationalConfig + MarketingConfig from
@framework-cwf/contracts, emits Next.js metadata, JSON-LD (T1.E.2),
sitemap.xml/robots.txt entries (T1.E.3), and internal-link maps.
No I/O, no React, no window access — the package imports cleanly from
Node 24 and is safe to call from Server Components or the build-time
static-export pipeline.
Installation
Published to GitHub Packages under the @framework-cwf scope. Consumers need an
.npmrc pointing the scope at the GitHub Packages registry plus an auth token:
@framework-cwf:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=${NODE_AUTH_TOKEN}
pnpm add @framework-cwf/seo
What's in here (T1.E.1)
| Export | Purpose |
| ------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------- |
| generateMetadata | Per-route Next.js Metadata builder. Title / description / canonical / openGraph / twitter / robots. |
| buildCanonicalUrl | Hostname + path → fully-resolved canonical URL. Trailing-slash safe. |
| buildOgImageUrl | Resolves a marketing-block ogImageRef to a full URL. Passes absolute URLs through. |
| pageTypeToPath | Route discriminator → canonical path string. Reused by sitemap + JSON-LD downstream. |
| expandTemplate | {placeholder} template expansion used inside generateMetadata. Useful when composing your own. |
| PAGE_TYPES | Frozen 12-item array of every page type the framework supports. |
| Route, PageType, MetadataOutput, GenerateMetadataInput, TemplateVars | Types. |
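To make the two URL helpers in the table concrete, here is a minimal sketch of what "trailing-slash safe" resolution and absolute-URL pass-through could look like. The names `sketchCanonicalUrl` and `sketchOgImageUrl` are hypothetical, and the forced `https://` scheme is an assumption; the package's real `buildCanonicalUrl` and `buildOgImageUrl` may take different arguments.

```typescript
// Hypothetical stand-ins for buildCanonicalUrl / buildOgImageUrl.
// Assumption: the canonical scheme is always https.
function sketchCanonicalUrl(hostname: string, path: string): string {
  // Tolerate a pasted scheme and any trailing slashes on the hostname.
  const host = hostname.replace(/^https?:\/\//i, "").replace(/\/+$/, "");
  // Normalise the path to exactly one leading slash and no trailing slash.
  const cleanPath = "/" + path.replace(/^\/+/, "").replace(/\/+$/, "");
  return `https://${host}${cleanPath}`;
}

function sketchOgImageUrl(hostname: string, ogImageRef: string): string {
  // Absolute URLs pass through untouched.
  if (/^https?:\/\//i.test(ogImageRef)) return ogImageRef;
  // Anything else is treated as a site-relative reference.
  return sketchCanonicalUrl(hostname, ogImageRef);
}
```

With these assumptions, `sketchCanonicalUrl("example.com/", "/services/")` and `sketchCanonicalUrl("example.com", "services")` resolve to the same URL, which is the property a canonical tag needs.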
Page-type union
generateMetadata accepts any of these 12 routes. Routes that need extra
data carry it on the discriminated union; the rest are bare type tags.
| route.type | Path | Extra fields | OG type |
| ---------------- | ------------------ | -------------------------------- | --------- |
| home | / | — | website |
| services | /services | — | website |
| service-detail | /services/{slug} | slug, serviceName? | website |
| about | /about | — | website |
| contact | /contact | — | website |
| gallery | /gallery | — | website |
| blog | /blog | — | website |
| blog-post | /blog/{slug} | slug, title?, publishedAt? | article |
| booking | /book | — | website |
| account | /account | — | website |
| 404 | /404 | — | website |
| 500 | /500 | — | website |
The list intentionally exceeds the four ROUTE_KEYS (home/services/
about/visit) in @framework-cwf/contracts — those are the
enabled-pages whitelist for the marketing editor, not the metadata
surface. See the T1.E.1 PR description for the divergence note.
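The route table above can be read as a discriminated union plus a path mapper. The following is a sketch under that reading; `SketchRoute`, `BareType`, and `sketchPageTypeToPath` are illustrative names inferred from the table, not the package's exported `Route` or `pageTypeToPath`.

```typescript
// Hypothetical shapes inferred from the table; the real Route type may differ.
type BareType =
  | "home" | "services" | "about" | "contact" | "gallery"
  | "blog" | "booking" | "account" | "404" | "500";

type SketchRoute =
  | { type: BareType }
  | { type: "service-detail"; slug: string; serviceName?: string }
  | { type: "blog-post"; slug: string; title?: string; publishedAt?: string };

// Static paths for the routes that are bare type tags.
const STATIC_PATHS: Record<BareType, string> = {
  home: "/",
  services: "/services",
  about: "/about",
  contact: "/contact",
  gallery: "/gallery",
  blog: "/blog",
  booking: "/book",
  account: "/account",
  "404": "/404",
  "500": "/500",
};

function sketchPageTypeToPath(route: SketchRoute): string {
  switch (route.type) {
    case "service-detail":
      return `/services/${encodeURIComponent(route.slug)}`;
    case "blog-post":
      return `/blog/${encodeURIComponent(route.slug)}`;
    default:
      // After the two slug-carrying cases, only bare tags remain.
      return STATIC_PATHS[route.type];
  }
}
```

Carrying `slug` on the union member means the compiler rejects a `service-detail` route that forgets it, while bare routes stay a single-field object.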
Field resolution order
Per field (title, description, ogImage, robots), generateMetadata
falls through in this order:
- Per-route override at marketing.seo.pages[path] or marketing.seo.pages[routeKey] (without leading slash). The Admin marketing editor lands either form here.
- Marketing-block template: marketing.seo.defaults.titleTemplate / descriptionTemplate with {pageTitle} / {pageDescription} / {businessName} / {businessTagline} placeholders. Unknown placeholders are left in place as a "loud bug" rather than silenced.
- Built-in per-page-type defaults: copy keyed on route.type that always includes the business name and a hand-written content hook.
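The fall-through above can be sketched for a single field (the title) as follows. This is an illustration only: `sketchExpandTemplate`, `sketchResolveTitle`, and the `TitleSources` shape are hypothetical, and whether the real override also passes through template expansion is not stated in this README.

```typescript
// Hypothetical helpers; a sketch of the fall-through, not the package source.
type Vars = Record<string, string | undefined>;

// Replace {name} with vars[name]. Unknown placeholders stay in place so the
// mistake is visible in the rendered page (the "loud bug" behaviour).
function sketchExpandTemplate(template: string, vars: Vars): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
    const value = vars[name];
    return value === undefined ? match : value;
  });
}

interface TitleSources {
  override?: string; // marketing.seo.pages[path] or [routeKey]
  template?: string; // marketing.seo.defaults.titleTemplate
  builtInDefault: string; // copy keyed on route.type
}

function sketchResolveTitle(sources: TitleSources, vars: Vars): string {
  if (sources.override) return sources.override; // 1. per-route override
  if (sources.template) {
    return sketchExpandTemplate(sources.template, vars); // 2. template
  }
  return sources.builtInDefault; // 3. built-in default always exists
}
```

Because the last step is a required string, the resolver can never return `undefined`, which lines up with the invariant described in the next section.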
Usage
// app/services/[slug]/page.tsx
import type { Metadata } from "next";
import {
  generateMetadata as seoMetadata,
  type Route,
} from "@framework-cwf/seo";
import { getWebsiteConfig } from "@/lib/website-config";

// lookupServiceName is assumed to be an app-level helper; it is not
// exported by this package.
export async function generateMetadata({
  params,
}: {
  params: { slug: string };
}): Promise<Metadata> {
  const { operational, marketing } = await getWebsiteConfig();
  const route: Route = {
    type: "service-detail",
    slug: params.slug,
    serviceName: lookupServiceName(operational, params.slug),
  };
  return seoMetadata({ route, business: operational, marketing });
}
The no-undefined invariant
Every field in the returned object carries a concrete value. Crawlers
treat undefined-rendered tags as parse errors; the snapshot suite
walks every fixture × every route and asserts no value is undefined
anywhere in the tree. Optional content (an empty OG image array, an
absent publishedTime) is either an empty array or simply omitted —
never undefined.
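A recursive walker of the kind the snapshot suite relies on might look like the sketch below. The function names are hypothetical and the real test helper may report failures differently; the point is only that every nested object and array element gets visited.

```typescript
// Sketch of a recursive "no undefined anywhere" check.
// Returns the JSON-path-like location of every undefined value found.
function findUndefinedPaths(value: unknown, path: string = "$"): string[] {
  if (value === undefined) return [path];
  if (value === null || typeof value !== "object") return [];
  const found: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item: unknown, i: number) => {
      found.push(...findUndefinedPaths(item, `${path}[${i}]`));
    });
    return found;
  }
  const record = value as Record<string, unknown>;
  // Object.keys includes keys whose value is explicitly undefined.
  for (const key of Object.keys(record)) {
    found.push(...findUndefinedPaths(record[key], `${path}.${key}`));
  }
  return found;
}

function assertNoUndefinedSketch(value: unknown): void {
  const paths = findUndefinedPaths(value);
  if (paths.length > 0) {
    throw new Error(`undefined at: ${paths.join(", ")}`);
  }
}
```

An omitted key passes this check while a key set to `undefined` fails it, which matches the "empty array or simply omitted" rule above.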
What's in here (T1.E.2 — JSON-LD)
buildSchemas({ pageType, business, marketing, ... }) returns
{ jsonLd, bytes } — an array of schema.org-shaped objects plus the
JSON-stringified UTF-8 byte count, ready to inline into a
<script type="application/ld+json"> tag per page.
| Export | Purpose |
| -------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| buildSchemas | Page-type dispatch — the entry point consumers actually call. |
| buildLocalBusinessSchema | LocalBusiness / HairSalon / BeautySalon per the resolved businessType. |
| buildOrganizationSchema | Lightweight Organization — emitted on home + as publisher for blog posts. |
| buildWebSiteSchema | WebSite with optional SearchAction when searchPath is supplied. |
| buildServiceSchema | Service with provider, areaServed, optional Offer (price + currency). |
| buildPersonSchema | Person with worksFor Organization, optional jobTitle + image. |
| buildFaqPageSchema | FAQPage from contracts FaqEntry[] — skips entries missing question or answer. |
| buildBreadcrumbListSchema | BreadcrumbList for every non-home, non-error page. |
| buildImageGallerySchema | ImageGallery of ImageObjects — caption from gallery item altText. |
| buildArticleSchema | BlogPosting with author, publisher, mainEntityOfPage, optional image + dates. |
| buildOpeningHoursForEntries / buildOpeningHoursForLocation | Parses freeform hours: string ("9 — 17", "9am — 5pm", etc.) into OpeningHoursSpecification[]. Defensive: skips garbled rows. |
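The freeform hours parsing mentioned in the last table row can be illustrated with a small sketch. It handles a single "opens, separator, closes" string and returns null for anything it cannot read; `parseClock` and `parseHoursRange` are hypothetical names, and the accepted formats are an assumption based on the two examples in the table.

```typescript
// Sketch only: parses one range such as "9 - 17" or "9am - 5pm"
// (hyphen, en dash, or em dash as separator) into 24-hour HH:MM strings.
function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

function parseClock(token: string): string | null {
  const m = /^(\d{1,2})(?::(\d{2}))?\s*(am|pm)?$/i.exec(token.trim());
  if (!m) return null;
  let hour = Number(m[1]);
  const minute = m[2] ? Number(m[2]) : 0;
  const meridiem = m[3] ? m[3].toLowerCase() : undefined;
  if (meridiem) {
    if (hour < 1 || hour > 12) return null; // "13pm" is garbled
    if (meridiem === "pm" && hour !== 12) hour += 12;
    if (meridiem === "am" && hour === 12) hour = 0;
  }
  if (hour > 23 || minute > 59) return null;
  return `${pad2(hour)}:${pad2(minute)}`;
}

function parseHoursRange(
  hours: string,
): { opens: string; closes: string } | null {
  // \u2014 is an em dash, \u2013 an en dash.
  const parts = hours.split(/\s*[\u2014\u2013-]\s*/);
  if (parts.length !== 2) return null; // defensive: skip garbled rows
  const opens = parseClock(parts[0]);
  const closes = parseClock(parts[1]);
  if (opens === null || closes === null) return null;
  return { opens, closes };
}
```

Returning null instead of a best guess keeps a malformed row from producing a misleading OpeningHoursSpecification.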
Page-type → schema map (implementation of ARCHITECTURE §9.2)
| pageType | Schemas emitted |
| ---------------- | ----------------------------------------------------------------------- |
| home | LocalBusiness (HairSalon) + Organization + WebSite + FAQPage? |
| services | BreadcrumbList |
| service-detail | Service + FAQPage? + BreadcrumbList |
| about | Organization + BreadcrumbList |
| contact | LocalBusiness + BreadcrumbList |
| gallery | ImageGallery? + BreadcrumbList |
| blog | BreadcrumbList |
| blog-post | BlogPosting + BreadcrumbList |
| booking | BreadcrumbList |
| account | BreadcrumbList |
| 404 / 500 | none (not indexed) |
? = emitted only when the corresponding config data is present
(non-empty FAQ entries, non-empty gallery). The dispatch never emits a
partial schema — if a builder can't fill the required schema.org
fields it returns null and buildSchemas filters it out.
Defensive-skip invariant
Every builder returns null (and is filtered from the output) when it
lacks the data to form a complete schema:
- buildLocalBusinessSchema: null without business.name.
- buildOrganizationSchema: null without business.name.
- buildServiceSchema: null without service.name or business.name.
- buildPersonSchema: null without staff.name or business.name.
- buildFaqPageSchema: null on empty/undefined entries.
- buildBreadcrumbListSchema: null for home / 404 / 500; null for service-detail / blog-post without a slug.
- buildImageGallerySchema: null on empty gallery; skips items whose imageRef doesn't resolve.
- buildArticleSchema: null without slug + headline + business.name.
Google's Rich Results Test treats partial structured data as a hard validation failure; emitting nothing is strictly better than emitting an incomplete object with placeholder fields.
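The "return null, then filter" pattern can be sketched with two simplified builders. All names and input shapes here (`SketchBusiness`, `SketchFaq`, `sketchBuildSchemas`) are hypothetical stand-ins; the real builders take the contracts types and emit more fields.

```typescript
type JsonLd = Record<string, unknown>;

// Simplified, hypothetical input shapes.
interface SketchBusiness { name?: string; url?: string }
interface SketchFaq { question?: string; answer?: string }

function sketchOrganization(business: SketchBusiness): JsonLd | null {
  if (!business.name) return null; // required field missing: emit nothing
  const schema: JsonLd = {
    "@context": "https://schema.org",
    "@type": "Organization",
    name: business.name,
  };
  if (business.url) schema.url = business.url; // optional: omitted, never undefined
  return schema;
}

function sketchFaqPage(entries: SketchFaq[] | undefined): JsonLd | null {
  // Skip entries missing either half of the Q&A pair.
  const complete = (entries ?? []).filter((e) => e.question && e.answer);
  if (complete.length === 0) return null;
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: complete.map((e) => ({
      "@type": "Question",
      name: e.question,
      acceptedAnswer: { "@type": "Answer", text: e.answer },
    })),
  };
}

function sketchBuildSchemas(
  business: SketchBusiness,
  faqs?: SketchFaq[],
): JsonLd[] {
  // The dispatch keeps only the builders that produced a complete schema.
  return [sketchOrganization(business), sketchFaqPage(faqs)].filter(
    (schema): schema is JsonLd => schema !== null,
  );
}
```

Keeping the null check inside each builder means the dispatch stays a flat list plus one filter, regardless of how many schema types a page type maps to.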
Usage
// app/services/[slug]/page.tsx
import { buildSchemas } from "@framework-cwf/seo";
import { getWebsiteConfig } from "@/lib/website-config";
import { lookupService, lookupServiceFaqs } from "@/lib/services";

export default async function ServicePage({
  params,
}: {
  params: { slug: string };
}) {
  const { operational, marketing } = await getWebsiteConfig();
  const service = lookupService(operational, params.slug);
  const { jsonLd } = buildSchemas({
    pageType: "service-detail",
    business: operational,
    marketing,
    service: {
      slug: params.slug,
      service,
      faqs: lookupServiceFaqs(marketing, params.slug),
    },
  });
  return (
    <>
      {jsonLd.map((schema, i) => (
        <script
          key={i}
          type="application/ld+json"
          // JSON.stringify does not escape "<", so replace it to keep
          // config-sourced strings from closing the script tag early.
          dangerouslySetInnerHTML={{
            __html: JSON.stringify(schema).replace(/</g, "\\u003c"),
          }}
        />
      ))}
      {/* ...the rendered service page... */}
    </>
  );
}
What's in here (T1.E.3 — sitemap / robots / internal-links)
The crawl-surface layer per ARCHITECTURE §9.3. Three pure builders (sitemap, robots, internal links) plus supporting helpers:
| Export | Purpose |
| ---------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| buildSitemap | sitemaps.org-spec <urlset> XML for a list of pages. Skips entries flagged noindex (per-route override OR 404/500 page types). Auto-paginates into a <sitemapindex> + shards when exceeding 50,000 URLs or 50 MB. |
| buildDefaultSitemapEntries | Convenience: enumerate marketing.pages.enabled into a SitemapEntry[]. Dynamic routes (service-detail / blog-post) are caller-supplied. |
| buildRobots | Environment-aware robots.txt. Production emits Allow: / + Disallow: /auth/, /api/ + Sitemap: line; non-production emits Disallow: / with no sitemap leak. |
| buildInternalLinks | Topical-cluster map for service / location / staff pages. Implements the service-at-location collapse rule. |
| buildPaginationLinks | rel="prev" / rel="next" descriptors for paginated routes (blog index, archive pages). |
| slugify | Name → URL slug (lowercase, hyphenated, ASCII). |
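Two of the smaller helpers in the table are easy to picture in code. The sketch below is an assumption about their behaviour, not the package source: `sketchSlugify` and `sketchPaginationLinks` are hypothetical names, and the `?page=N` URL scheme in particular is invented for illustration.

```typescript
// Name -> URL slug: lowercase, hyphenated, ASCII.
function sketchSlugify(name: string): string {
  return name
    .normalize("NFKD") // split accented letters into base letter + mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any other run of characters becomes one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

interface SketchPageLink {
  rel: "prev" | "next";
  href: string;
}

// rel="prev" / rel="next" descriptors for a paginated route.
// Assumption: page 1 lives at basePath, later pages at basePath?page=N.
function sketchPaginationLinks(
  basePath: string,
  page: number,
  totalPages: number,
): SketchPageLink[] {
  const hrefFor = (n: number): string =>
    n === 1 ? basePath : `${basePath}?page=${n}`;
  const links: SketchPageLink[] = [];
  if (page > 1) links.push({ rel: "prev", href: hrefFor(page - 1) });
  if (page < totalPages) links.push({ rel: "next", href: hrefFor(page + 1) });
  return links;
}
```

The first page emits only `next` and the last page only `prev`, so a single-page listing produces no link descriptors at all.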
Sitemap defaults per page type
| pageType | changefreq | priority |
| ------------------------------- | -------------------------- | ---------- |
| home | weekly | 1.000 |
| services | monthly | 0.900 |
| service-detail | monthly | 0.800 |
| blog | weekly | 0.700 |
| about | monthly | 0.600 |
| contact | yearly | 0.600 |
| gallery | monthly | 0.600 |
| blog-post | yearly | 0.600 |
| booking | monthly | 0.500 |
| account | yearly | 0.300 |
| 404 / 500 | — (skipped) | — |
Entry-level overrides win; publishedAt is the site-wide lastmod fallback.
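How the defaults, entry-level overrides, and the lastmod fallback combine can be sketched as below. The `SketchSitemapEntry` shape and `sketchBuildSitemap` are hypothetical (the real `SitemapEntry` type is not shown in this README), and the sketch leaves out the sitemap-index sharding entirely.

```typescript
// Sketch only: field names are assumptions drawn from the table above.
interface SketchSitemapEntry {
  loc: string;
  pageType: string;
  lastmod?: string; // W3C datetime, e.g. "2025-01-31"
  changefreq?: string;
  priority?: number;
  noindex?: boolean;
}

const SKETCH_DEFAULTS: Record<string, { changefreq: string; priority: number }> = {
  home: { changefreq: "weekly", priority: 1.0 },
  services: { changefreq: "monthly", priority: 0.9 },
  "service-detail": { changefreq: "monthly", priority: 0.8 },
  blog: { changefreq: "weekly", priority: 0.7 },
  about: { changefreq: "monthly", priority: 0.6 },
  contact: { changefreq: "yearly", priority: 0.6 },
  gallery: { changefreq: "monthly", priority: 0.6 },
  "blog-post": { changefreq: "yearly", priority: 0.6 },
  booking: { changefreq: "monthly", priority: 0.5 },
  account: { changefreq: "yearly", priority: 0.3 },
};

function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function sketchBuildSitemap(
  entries: SketchSitemapEntry[],
  publishedAt: string,
): string {
  const urls: string[] = [];
  for (const entry of entries) {
    const defaults = SKETCH_DEFAULTS[entry.pageType];
    // 404 / 500 have no defaults row, so they are skipped with noindex entries.
    if (entry.noindex || defaults === undefined) continue;
    const changefreq = entry.changefreq ?? defaults.changefreq;
    const priority = entry.priority ?? defaults.priority;
    const lastmod = entry.lastmod ?? publishedAt; // site-wide fallback
    urls.push(
      [
        "  <url>",
        `    <loc>${escapeXml(entry.loc)}</loc>`,
        `    <lastmod>${escapeXml(lastmod)}</lastmod>`,
        `    <changefreq>${changefreq}</changefreq>`,
        `    <priority>${priority.toFixed(3)}</priority>`,
        "  </url>",
      ].join("\n"),
    );
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
    "",
  ].join("\n");
}
```

String templating like this is enough for a flat format such as `<urlset>`, provided every interpolated value that can contain user text goes through the XML escaper.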
Collapse rule (load-bearing, ARCHITECTURE §9.3)
buildInternalLinks evaluates each (service, location) pair:
- Differentiated: location has either a price override (vs the global service price) OR a location-specific category description.
  → Emits /services/{slug}/{location-slug} (programmatic page eligible)
- Not differentiated: location matches the global service.
  → Collapses to /services/{slug}#location-{location-slug} (in-page anchor on the canonical service page)
This is the "no doorway pages" mitigation: thin/duplicate templated pages don't get programmatic URLs.
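The collapse rule reduces to one boolean per (service, location) pair. The sketch below shows that decision under assumed input shapes; `SketchService`, `SketchLocationOffer`, and `sketchLocationLink` are hypothetical, and the real `buildInternalLinks` works from the contracts config rather than these flat objects.

```typescript
// Hypothetical, flattened inputs for one (service, location) pair.
interface SketchService {
  slug: string;
  price: number; // global service price
}

interface SketchLocationOffer {
  locationSlug: string;
  priceOverride?: number;
  categoryDescription?: string; // location-specific copy, if any
}

function sketchLocationLink(
  service: SketchService,
  offer: SketchLocationOffer,
): { href: string; differentiated: boolean } {
  // A price override only counts when it differs from the global price.
  const hasPriceOverride =
    offer.priceOverride !== undefined && offer.priceOverride !== service.price;
  const hasOwnDescription =
    offer.categoryDescription !== undefined &&
    offer.categoryDescription.trim().length > 0;
  const differentiated = hasPriceOverride || hasOwnDescription;
  return {
    differentiated,
    href: differentiated
      ? `/services/${service.slug}/${offer.locationSlug}` // own page
      : `/services/${service.slug}#location-${offer.locationSlug}`, // anchor
  };
}
```

Treating an override equal to the global price as "not differentiated" is an assumption of this sketch; it follows the spirit of the rule, since such a page would carry no location-specific content.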
Robots — environment-aware
production:
User-agent: *
Allow: /
Disallow: /auth/
Disallow: /api/
Sitemap: {canonical}/sitemap.xml
dev / uat / staging:
User-agent: *
Disallow: /
Non-production builds never emit a Sitemap: line. Even with the host robots-blocked, a leaked sitemap URL would point crawlers (and attackers) at planned routes.
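The two outputs above can be produced by a small environment switch. This is a sketch with hypothetical names (`SketchEnv`, `sketchBuildRobots`); the real `buildRobots` also supports overrides, which are not modelled here.

```typescript
type SketchEnv = "production" | "dev" | "uat" | "staging";

function sketchBuildRobots(env: SketchEnv, canonical: string): string {
  if (env !== "production") {
    // Block everything and deliberately omit the Sitemap: line.
    return ["User-agent: *", "Disallow: /", ""].join("\n");
  }
  // Avoid a double slash if the canonical origin ends with "/".
  const origin = canonical.replace(/\/+$/, "");
  return [
    "User-agent: *",
    "Allow: /",
    "Disallow: /auth/",
    "Disallow: /api/",
    `Sitemap: ${origin}/sitemap.xml`,
    "",
  ].join("\n");
}
```

Checking for `"production"` explicitly, rather than listing the non-production names, means any new or misspelled environment falls on the blocking side.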
Tests
pnpm --filter @framework-cwf/seo test
227 tests across ten files:
- route-paths.test.ts (13), template.test.ts (6), url-helpers.test.ts (12), generate-metadata.test.ts (58): T1.E.1 surface.
- schemas/builders.test.ts (35), build-schemas.test.ts (51): T1.E.2 JSON-LD surface.
- sitemap.test.ts (19): sitemap snapshot per fixture + structural validator (asserts XML declaration, root namespace, <loc> URL parseability, <lastmod> W3C datetime format, <changefreq> enum membership, <priority> range, no unknown tags inside <url>) + pagination + filtering behaviour.
- robots.test.ts (9): robots snapshot per environment (production / dev / uat / staging) + override behaviour.
- internal-links.test.ts (22): internal-links snapshot per fixture (single-location / minimal / multilocation-collapse-rule) + collapse-rule behaviour (differentiated vs collapsed, anchor URL emission, staff-services no-op without contracts patch) + pagination helper + slugify.
- ssr-import.test.ts (2): locks in the SSR-safe import path.
Every emitted schema passes a recursive assertNoUndefined walker, a schema.org-structural validator, and (for sitemaps) a sitemaps.org-spec structural validator. No native XML library dependency — the sitemap output is generated by string templating and validated structurally; a real XSD validator would require a libxml2 native binding the rest of the repo doesn't pull in.
Coming next
Track E is complete. The remaining SEO work happens in apps/template (T1.G.2), which composes everything in this package into the actual per-business site routes.
