@framework-cwf/seo

v0.2.8

Published

4 days ago

Per-route generateMetadata + JSON-LD schema generators + sitemap.xml/robots.txt/internal-link builders. Pure functions from typed config.


davidcompton84

@framework-cwf/seo

Pure-function SEO builders for the customer-website framework. Consumes the typed OperationalConfig + MarketingConfig from @framework-cwf/contracts, emits Next.js metadata, JSON-LD (T1.E.2), sitemap.xml/robots.txt entries (T1.E.3), and internal-link maps.

No I/O, no React, no window access — the package imports cleanly from Node 24 and is safe to call from Server Components or the build-time static-export pipeline.

Installation

Published to GitHub Packages under the @framework-cwf scope. Consumers need an .npmrc pointing the scope at the GitHub Packages registry plus an auth token:

@framework-cwf:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=${NODE_AUTH_TOKEN}

pnpm add @framework-cwf/seo

What's in here (T1.E.1)

| Export | Purpose |
| --- | --- |
| generateMetadata | Per-route Next.js Metadata builder. Title / description / canonical / openGraph / twitter / robots. |
| buildCanonicalUrl | Hostname + path → fully-resolved canonical URL. Trailing-slash safe. |
| buildOgImageUrl | Resolves a marketing-block ogImageRef to a full URL. Passes absolute URLs through. |
| pageTypeToPath | Route discriminator → canonical path string. Reused by sitemap + JSON-LD downstream. |
| expandTemplate | {placeholder} template expansion used inside generateMetadata. Useful when composing your own. |
| PAGE_TYPES | Frozen 12-item array of every page type the framework supports. |
| Route, PageType, MetadataOutput, GenerateMetadataInput, TemplateVars | Types. |
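To make "trailing-slash safe" concrete, here is a minimal sketch of the kind of normalisation buildCanonicalUrl is described as doing. The name canonicalUrl, its two-string signature, and the choice to always emit https are assumptions for illustration; the package's real signature is not shown in this README.

```typescript
// Hypothetical stand-in for buildCanonicalUrl; the real signature may differ.
function canonicalUrl(hostname: string, path: string): string {
  // Accept "example.com", "example.com/", or "https://example.com/".
  const host = hostname.replace(/^https?:\/\//, "").replace(/\/+$/, "");
  // Strip leading and trailing slashes so the join below adds exactly one.
  const trimmed = path.replace(/^\/+|\/+$/g, "");
  // Assumption: the site root keeps its trailing slash, other paths do not.
  return trimmed === "" ? `https://${host}/` : `https://${host}/${trimmed}`;
}
```

With this sketch, canonicalUrl("example.com/", "/services/") and canonicalUrl("https://example.com", "services") both resolve to the same URL, which is the property a canonical tag needs.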

Page-type union

generateMetadata accepts any of these 12 routes. Routes that need extra data carry it on the discriminated union; the rest are bare type tags.

| route.type | Path | Extra fields | OG type |
| --- | --- | --- | --- |
| home | / | — | website |
| services | /services | — | website |
| service-detail | /services/{slug} | slug, serviceName? | website |
| about | /about | — | website |
| contact | /contact | — | website |
| gallery | /gallery | — | website |
| blog | /blog | — | website |
| blog-post | /blog/{slug} | slug, title?, publishedAt? | article |
| booking | /book | — | website |
| account | /account | — | website |
| 404 | /404 | — | website |
| 500 | /500 | — | website |
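The table maps naturally onto a TypeScript discriminated union. The sketch below is illustrative only: SketchRoute, STATIC_PATHS and routeToPath are hypothetical names, and typing publishedAt as a string is an assumption. The exported Route type and pageTypeToPath may differ in detail.

```typescript
// Illustrative shape only; the package's exported Route type may differ.
type SketchRoute =
  | { type: "home" | "services" | "about" | "contact" | "gallery" | "blog" | "booking" | "account" | "404" | "500" }
  | { type: "service-detail"; slug: string; serviceName?: string }
  | { type: "blog-post"; slug: string; title?: string; publishedAt?: string };

// Fixed paths for the bare type tags, copied from the table.
const STATIC_PATHS = {
  home: "/",
  services: "/services",
  about: "/about",
  contact: "/contact",
  gallery: "/gallery",
  blog: "/blog",
  booking: "/book",
  account: "/account",
  "404": "/404",
  "500": "/500",
} as const;

function routeToPath(route: SketchRoute): string {
  switch (route.type) {
    case "service-detail":
      return `/services/${route.slug}`;
    case "blog-post":
      return `/blog/${route.slug}`;
    default:
      // Narrowed to the bare tags here, so the lookup is exhaustive.
      return STATIC_PATHS[route.type];
  }
}
```

Because slug only exists on the two dynamic members, the compiler rejects a service-detail route that omits it.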

The list intentionally exceeds the four ROUTE_KEYS (home/services/about/visit) in @framework-cwf/contracts: those are the enabled-pages whitelist for the marketing editor, not the metadata surface. See the T1.E.1 PR description for the divergence note.

Field resolution order

Per field (title, description, ogImage, robots), generateMetadata falls through in this order:

1. Per-route override at marketing.seo.pages[path] or marketing.seo.pages[routeKey] (without leading slash). The Admin marketing editor lands either form here.
2. Marketing-block template: marketing.seo.defaults.titleTemplate / descriptionTemplate with {pageTitle} / {pageDescription} / {businessName} / {businessTagline} placeholders. Unknown placeholders are left in place as a "loud bug" rather than silenced.
3. Built-in per-page-type defaults: copy keyed on route.type that always includes the business name and a hand-written content hook.
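The fall-through above can be sketched in a few lines. All three helpers (findOverride, expand, resolveField) are hypothetical names written for this example, not the package's internals; only the ordering and the leave-unknown-placeholders behaviour come from the description above.

```typescript
// Step 1: look up an override keyed by path ("/about") or route key ("about").
function findOverride<T>(pages: Record<string, T>, path: string): T | undefined {
  return pages[path] ?? pages[path.replace(/^\//, "")];
}

// Step 2: expand {name} tokens; unknown tokens stay in place so the bug is visible.
function expand(template: string, vars: Record<string, string | undefined>): string {
  return template.replace(/\{(\w+)\}/g, (token: string, name: string) => vars[name] ?? token);
}

// Steps 1 to 3 combined for a single field such as the title.
function resolveField(
  override: string | undefined,
  template: string | undefined,
  vars: Record<string, string | undefined>,
  builtIn: string,
): string {
  if (override) return override;
  if (template) return expand(template, vars);
  return builtIn;
}
```

For example, expand("{pageTitle} | {businessName}", { pageTitle: "Services" }) keeps the literal {businessName} token in the output, which is easy to spot in a rendered title.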

Usage

// app/services/[slug]/page.tsx
import type { Metadata } from "next";
import {
  generateMetadata as seoMetadata,
  type Route,
} from "@framework-cwf/seo";

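import { getWebsiteConfig } from "@/lib/website-config";
// lookupServiceName is an app-level helper; this import path is an assumption.
import { lookupServiceName } from "@/lib/services";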

export async function generateMetadata({
  params,
}: {
  params: { slug: string };
}): Promise<Metadata> {
  const { operational, marketing } = await getWebsiteConfig();
  const route: Route = {
    type: "service-detail",
    slug: params.slug,
    serviceName: lookupServiceName(operational, params.slug),
  };
  return seoMetadata({ route, business: operational, marketing });
}

The no-undefined invariant

Every field in the returned object carries a concrete value. Crawlers treat undefined-rendered tags as parse errors; the snapshot suite walks every fixture × every route and asserts no value is undefined anywhere in the tree. Optional content (an empty OG image array, an absent publishedTime) is either an empty array or simply omitted — never undefined.
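A recursive walker is enough to enforce this invariant in a test. The sketch below is a hypothetical helper (findUndefinedPaths is not an export of this package) that returns the path of every undefined value so a failing assertion can say where the problem is.

```typescript
// Hypothetical walker in the spirit of the snapshot suite's check.
function findUndefinedPaths(value: unknown, path = "$"): string[] {
  if (value === undefined) return [path];
  if (Array.isArray(value)) {
    return value.flatMap((item, i) => findUndefinedPaths(item, `${path}[${i}]`));
  }
  if (value !== null && typeof value === "object") {
    // Object.entries includes keys whose value is explicitly undefined.
    return Object.entries(value).flatMap(([key, child]) =>
      findUndefinedPaths(child, `${path}.${key}`),
    );
  }
  // Strings, numbers, booleans and null are all concrete values.
  return [];
}
```

A test would then assert that findUndefinedPaths(metadata) is empty for every fixture and route combination. An omitted key passes; a key set to undefined does not.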

What's in here (T1.E.2 — JSON-LD)

buildSchemas({ pageType, business, marketing, ... }) returns { jsonLd, bytes } — an array of schema.org-shaped objects plus the JSON-stringified UTF-8 byte count, ready to inline into a <script type="application/ld+json"> tag per page.
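As a rough illustration of the bytes value, the sketch below measures the UTF-8 size of the stringified array. The helper name withByteCount is hypothetical, and whether the package counts the whole array or each schema separately is an assumption here.

```typescript
// Sketch of the { jsonLd, bytes } pairing; not the package's actual code.
function withByteCount(jsonLd: object[]): { jsonLd: object[]; bytes: number } {
  // TextEncoder always encodes to UTF-8, so multi-byte characters are counted
  // by their encoded size rather than by string length.
  const bytes = new TextEncoder().encode(JSON.stringify(jsonLd)).length;
  return { jsonLd, bytes };
}
```

Counting bytes rather than string length matters for business names with accents or non-Latin scripts, where the two numbers diverge.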

| Export | Purpose |
| --- | --- |
| buildSchemas | Page-type dispatch: the entry point consumers actually call. |
| buildLocalBusinessSchema | LocalBusiness / HairSalon / BeautySalon per the resolved businessType. |
| buildOrganizationSchema | Lightweight Organization, emitted on home + as publisher for blog posts. |
| buildWebSiteSchema | WebSite with optional SearchAction when searchPath is supplied. |
| buildServiceSchema | Service with provider, areaServed, optional Offer (price + currency). |
| buildPersonSchema | Person with worksFor Organization, optional jobTitle + image. |
| buildFaqPageSchema | FAQPage from contracts FaqEntry[]; skips entries missing question or answer. |
| buildBreadcrumbListSchema | BreadcrumbList for every non-home, non-error page. |
| buildImageGallerySchema | ImageGallery of ImageObjects; caption from gallery item altText. |
| buildArticleSchema | BlogPosting with author, publisher, mainEntityOfPage, optional image + dates. |
| buildOpeningHoursForEntries / buildOpeningHoursForLocation | Parses freeform hours: string ("9 — 17", "9am — 5pm", etc.) into OpeningHoursSpecification[]. Defensive: skips garbled rows. |
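The opening-hours builders accept freeform strings, so the core of the work is a tolerant range parser. The following is a minimal sketch of one way to do that, assuming a single open/close range per string and accepting a hyphen, en dash or em dash as the separator. parseHoursRange is a hypothetical name; the formats the package actually accepts may be broader.

```typescript
// Parse one freeform range (24-hour or am/pm) into zero-padded HH:MM times.
// Returns null for anything it cannot read, mirroring the "skip garbled rows" rule.
function parseHoursRange(text: string): { opens: string; closes: string } | null {
  const pattern =
    /^\s*(\d{1,2})(?::(\d{2}))?\s*(am|pm)?\s*[\u2014\u2013-]\s*(\d{1,2})(?::(\d{2}))?\s*(am|pm)?\s*$/i;
  const m = pattern.exec(text);
  if (!m) return null;

  const to24 = (h: string, min: string | undefined, mer: string | undefined): string | null => {
    let hour = Number(h);
    const minute = Number(min ?? "0");
    const meridiem = mer?.toLowerCase();
    if (meridiem) {
      if (hour < 1 || hour > 12) return null;
      if (meridiem === "pm" && hour !== 12) hour += 12;
      if (meridiem === "am" && hour === 12) hour = 0;
    }
    if (hour > 23 || minute > 59) return null;
    return `${String(hour).padStart(2, "0")}:${String(minute).padStart(2, "0")}`;
  };

  const opens = to24(m[1], m[2], m[3]);
  const closes = to24(m[4], m[5], m[6]);
  return opens && closes ? { opens, closes } : null;
}
```

The opens and closes strings are in the HH:MM form that schema.org's OpeningHoursSpecification expects, so a caller can drop them straight into the emitted object.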

Page-type → schema map (implementation of ARCHITECTURE §9.2)

| pageType | Schemas emitted |
| --- | --- |
| home | LocalBusiness (HairSalon) + Organization + WebSite + FAQPage? |
| services | BreadcrumbList |
| service-detail | Service + FAQPage? + BreadcrumbList |
| about | Organization + BreadcrumbList |
| contact | LocalBusiness + BreadcrumbList |
| gallery | ImageGallery? + BreadcrumbList |
| blog | BreadcrumbList |
| blog-post | BlogPosting + BreadcrumbList |
| booking | BreadcrumbList |
| account | BreadcrumbList |
| 404 / 500 | none (not indexed) |

? = emitted only when the corresponding config data is present (non-empty FAQ entries, non-empty gallery). The dispatch never emits a partial schema — if a builder can't fill the required schema.org fields it returns null and buildSchemas filters it out.

Defensive-skip invariant

Every builder returns null (and is filtered from the output) when it lacks the data to form a complete schema:

- buildLocalBusinessSchema: null without business.name.
- buildOrganizationSchema: null without business.name.
- buildServiceSchema: null without service.name or business.name.
- buildPersonSchema: null without staff.name or business.name.
- buildFaqPageSchema: null on empty/undefined entries.
- buildBreadcrumbListSchema: null for home/404/500; null for service-detail/blog-post without a slug.
- buildImageGallerySchema: null on empty gallery; skips items whose imageRef doesn't resolve.
- buildArticleSchema: null without slug + headline + business.name.

Google's Rich Results Test treats partial structured data as a hard validation failure; emitting nothing is strictly better than emitting an incomplete object with placeholder fields.
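The builder-returns-null, dispatcher-filters pattern looks roughly like this. Both functions are hypothetical stand-ins written for this example (the input shape with name and url is an assumption), not the package's builders.

```typescript
type JsonLd = Record<string, unknown>;

// Hypothetical builder: returns null rather than a partial Organization.
function organizationOrNull(business: { name?: string; url?: string }): JsonLd | null {
  if (!business.name) return null;
  const schema: JsonLd = {
    "@context": "https://schema.org",
    "@type": "Organization",
    name: business.name,
  };
  // Optional fields are added only when present, never as undefined.
  if (business.url) schema.url = business.url;
  return schema;
}

// Dispatcher side: drop every null so only complete schemas are emitted.
function compact(candidates: Array<JsonLd | null>): JsonLd[] {
  return candidates.filter((s): s is JsonLd => s !== null);
}
```

The type-guard in compact lets the caller treat the result as a plain JsonLd[] with no null checks at the render site.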

Usage

// app/services/[slug]/page.tsx
import { buildSchemas } from "@framework-cwf/seo";
import { getWebsiteConfig } from "@/lib/website-config";
import { lookupService, lookupServiceFaqs } from "@/lib/services";

export default async function ServicePage({
  params,
}: {
  params: { slug: string };
}) {
  const { operational, marketing } = await getWebsiteConfig();
  const service = lookupService(operational, params.slug);

  const { jsonLd } = buildSchemas({
    pageType: "service-detail",
    business: operational,
    marketing,
    service: {
      slug: params.slug,
      service,
      faqs: lookupServiceFaqs(marketing, params.slug),
    },
  });

  return (
    <>
      {jsonLd.map((schema, i) => (
        <script
          key={i}
          type="application/ld+json"
          // JSON.stringify does not escape "<", so a "</script>" inside any config
          // string would close the tag early. Escape it before inlining.
          dangerouslySetInnerHTML={{
            __html: JSON.stringify(schema).replace(/</g, "\\u003c"),
          }}
        />
      ))}
      {/* ...the rendered service page... */}
    </>
  );
}

What's in here (T1.E.3 — sitemap / robots / internal-links)

The crawl-surface layer per ARCHITECTURE §9.3. Six pure functions covering the sitemap, robots.txt, and internal links:

| Export | Purpose |
| --- | --- |
| buildSitemap | sitemaps.org-spec <urlset> XML for a list of pages. Skips entries flagged noindex (per-route override OR 404/500 page types). Auto-paginates into a <sitemapindex> + shards when exceeding 50,000 URLs or 50 MB. |
| buildDefaultSitemapEntries | Convenience: enumerate marketing.pages.enabled into a SitemapEntry[]. Dynamic routes (service-detail / blog-post) are caller-supplied. |
| buildRobots | Environment-aware robots.txt. Production emits Allow: / + Disallow: /auth/, /api/ + Sitemap: line; non-production emits Disallow: / with no sitemap leak. |
| buildInternalLinks | Topical-cluster map for service / location / staff pages. Implements the service-at-location collapse rule. |
| buildPaginationLinks | rel="prev" / rel="next" descriptors for paginated routes (blog index, archive pages). |
| slugify | Name → URL slug (lowercase, hyphenated, ASCII). |
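For a sense of what "lowercase, hyphenated, ASCII" involves, here is one common way to write such a function. toSlug is a hypothetical equivalent; how the package's slugify handles edge cases (non-Latin scripts, empty results) is not documented here and may differ.

```typescript
// Hypothetical equivalent of slugify; edge-case handling is an assumption.
function toSlug(name: string): string {
  return name
    .normalize("NFKD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}
```

Under this sketch, "Cut & Blow-Dry" becomes cut-blow-dry, so a service name can be used directly as the {slug} segment of /services/{slug}.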

Sitemap defaults per page type

| pageType | changefreq | priority |
| --- | --- | --- |
| home | weekly | 1.000 |
| services | monthly | 0.900 |
| service-detail | monthly | 0.800 |
| blog | weekly | 0.700 |
| about | monthly | 0.600 |
| contact | yearly | 0.600 |
| gallery | monthly | 0.600 |
| blog-post | yearly | 0.600 |
| booking | monthly | 0.500 |
| account | yearly | 0.300 |
| 404 / 500 | — (skipped) | — |

Entry-level overrides win; publishedAt is the site-wide lastmod fallback.
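Generating the XML by string templating, with entry-level values taking precedence and a site-wide lastmod fallback, can be sketched as follows. SketchEntry and sketchSitemap are hypothetical names, the entry shape is an assumption, and pagination into a sitemap index is left out to keep the example short.

```typescript
interface SketchEntry {
  loc: string;
  lastmod?: string;
  changefreq?: string;
  priority?: number;
  noindex?: boolean;
}

// Escape the five XML special characters so query strings survive inside <loc>.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function sketchSitemap(entries: SketchEntry[], fallbackLastmod: string): string {
  const urls = entries
    .filter((e) => !e.noindex) // noindex entries never reach the sitemap
    .map((e) => {
      const lines = [
        `    <loc>${escapeXml(e.loc)}</loc>`,
        // The entry's own lastmod wins; otherwise use the site-wide fallback.
        `    <lastmod>${e.lastmod ?? fallbackLastmod}</lastmod>`,
      ];
      if (e.changefreq) lines.push(`    <changefreq>${e.changefreq}</changefreq>`);
      if (e.priority !== undefined) {
        lines.push(`    <priority>${e.priority.toFixed(3)}</priority>`);
      }
      return `  <url>\n${lines.join("\n")}\n  </url>`;
    });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```

Escaping matters most for <loc>: an unescaped & in a query string is enough to make the whole document invalid XML.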

Collapse rule (load-bearing, ARCHITECTURE §9.3)

buildInternalLinks evaluates each (service, location) pair:

- Differentiated: the location has either a price override (vs the global service price) OR a location-specific category description. Emits /services/{slug}/{location-slug} (eligible for a programmatic page).
- Not differentiated: the location matches the global service. Collapses to /services/{slug}#location-{location-slug} (an in-page anchor on the canonical service page).

This is the "no doorway pages" mitigation: thin/duplicate templated pages don't get programmatic URLs.
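Reduced to a single (service, location) pair, the rule is a two-way branch. The sketch below uses hypothetical names and a hypothetical input shape (LocationOffering, serviceLocationHref), and it assumes a price override only counts when it actually differs from the global price.

```typescript
interface LocationOffering {
  locationSlug: string;
  priceOverride?: number; // set when the location prices the service itself
  categoryDescription?: string; // location-specific copy, when present
}

// Hypothetical reduction of the collapse rule to one pair.
function serviceLocationHref(
  serviceSlug: string,
  globalPrice: number,
  offering: LocationOffering,
): string {
  const priceDiffers =
    offering.priceOverride !== undefined && offering.priceOverride !== globalPrice;
  const hasOwnCopy = Boolean(offering.categoryDescription);
  return priceDiffers || hasOwnCopy
    ? `/services/${serviceSlug}/${offering.locationSlug}` // differentiated: own URL
    : `/services/${serviceSlug}#location-${offering.locationSlug}`; // collapsed: anchor
}
```

The collapsed form still gives internal links a stable target, but it resolves to the canonical service page rather than a separate thin page.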

Robots — environment-aware

production              User-agent: *
                        Allow: /
                        Disallow: /auth/
                        Disallow: /api/

                        Sitemap: {canonical}/sitemap.xml

dev / uat / staging     User-agent: *
                        Disallow: /

Non-production builds never emit Sitemap: — even with the host robots-blocked, leaking the sitemap URL points crawlers (and attackers) at planned routes.
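The two outputs above can be reproduced with a small environment switch. This is a hypothetical reimplementation for illustration (sketchRobots and DeployEnv are not package exports, and the real buildRobots also supports overrides that are not modelled here).

```typescript
type DeployEnv = "production" | "dev" | "uat" | "staging";

// Hypothetical reimplementation of the two documented outputs.
function sketchRobots(env: DeployEnv, canonical: string): string {
  if (env !== "production") {
    // Block everything and deliberately omit the Sitemap: line.
    return ["User-agent: *", "Disallow: /", ""].join("\n");
  }
  // Avoid a double slash when the canonical origin ends with "/".
  const base = canonical.replace(/\/+$/, "");
  return [
    "User-agent: *",
    "Allow: /",
    "Disallow: /auth/",
    "Disallow: /api/",
    "",
    `Sitemap: ${base}/sitemap.xml`,
    "",
  ].join("\n");
}
```

Treating every non-production value the same way means a newly added environment name fails closed, provided it is added to the union rather than mapped to production.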

Tests

pnpm --filter @framework-cwf/seo test

227 tests across ten files:

- route-paths.test.ts (13), template.test.ts (6), url-helpers.test.ts (12), generate-metadata.test.ts (58): T1.E.1 surface.
- schemas/builders.test.ts (35), build-schemas.test.ts (51): T1.E.2 JSON-LD surface.
- sitemap.test.ts (19): sitemap snapshot per fixture + structural validator (asserts XML declaration, root namespace, <loc> URL parseability, <lastmod> W3C datetime format, <changefreq> enum membership, <priority> range, no unknown tags inside <url>) + pagination + filtering behaviour.
- robots.test.ts (9): robots snapshot per environment (production / dev / uat / staging) + override behaviour.
- internal-links.test.ts (22): internal-links snapshot per fixture (single-location / minimal / multilocation-collapse-rule) + collapse-rule behaviour (differentiated vs collapsed, anchor URL emission, staff-services no-op without contracts patch) + pagination helper + slugify.
- ssr-import.test.ts (2): locks in the SSR-safe import path.

Every emitted schema passes a recursive assertNoUndefined walker, a schema.org-structural validator, and (for sitemaps) a sitemaps.org-spec structural validator. No native XML library dependency — the sitemap output is generated by string templating and validated structurally; a real XSD validator would require a libxml2 native binding the rest of the repo doesn't pull in.

Coming next

Track E is complete. The remaining SEO work happens in apps/template (T1.G.2), which composes everything in this package into the actual per-business site routes.
