@moonye/schemaguardian

v0.4.0

Validate JSON-LD structured data on URLs, HTML files, or whole sites via sitemap. CI-friendly. Built for the AI search era. Now with a programmatic library API.

Downloads

205

schemaguardian

Validate JSON-LD structured data on any URL, HTML file, or whole site via sitemap. CI-friendly. Built for the AI search era.

# Validate one page
npx @moonye/schemaguardian check https://your-site.com

# Walk every URL in your sitemap.xml
npx @moonye/schemaguardian scan https://your-site.com

# Drop a ready-to-commit GitHub Actions workflow
npx @moonye/schemaguardian init --url https://your-site.com

Why this exists

Google scaled back FAQ and HowTo rich results in 2023 and cut them further in the March 2026 core update. But structured data is now a primary signal for citation in AI search engines (Perplexity, ChatGPT, Gemini, Google AI Overviews). schemaguardian validates your JSON-LD against schema.org rules plus the documented Google rejection patterns and the 2026 reality of which schema types still produce rich results.

It runs in CI. It exits non-zero on real problems. It tells you why.

Install

# one-off
npx @moonye/schemaguardian check https://example.com

# global
npm i -g @moonye/schemaguardian
schemaguardian check https://example.com

# project dev dependency
npm i -D @moonye/schemaguardian

Requires Node 18+.

Commands

schemaguardian check <url|file>      Validate a single URL or local HTML file.
schemaguardian scan  <site-url>      Walk a site's sitemap.xml and validate every page.
schemaguardian generate [type]       Interactively generate schema markup.
schemaguardian init                  Generate .github/workflows/schemaguardian.yml.
schemaguardian help
schemaguardian version

check — single page

schemaguardian check https://faqjsonld.com/faq-schema-generator
schemaguardian check ./dist/index.html
schemaguardian check https://staging.example.com --ci
schemaguardian check https://example.com --json | jq '.blocks[].issues'

Options: --ci (exit non-zero on errors) · --json (machine output) · --no-color.

scan — whole site via sitemap

Auto-discovers /sitemap-index.xml, /sitemap.xml, or /sitemap_index.xml. Recursively follows sitemap indices to their child sitemaps. Validates every URL in parallel.

schemaguardian scan https://faqjsonld.com
schemaguardian scan https://example.com --limit 25 --concurrency 8 --ci
schemaguardian scan https://example.com --sitemap https://example.com/news-sitemap.xml
schemaguardian scan https://example.com --json | jq '.summary'
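
The discovery and recursion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not schemaguardian's actual code: the function names, the regex-based parsing, and the injected `getXml` loader are all hypothetical. Only the three candidate paths come from this README.

```typescript
// Illustrative sketch only; names and parsing approach are assumptions,
// not schemaguardian's real implementation.

// Candidate sitemap locations, in the order listed in this README.
const CANDIDATE_PATHS = ["/sitemap-index.xml", "/sitemap.xml", "/sitemap_index.xml"];

function candidateSitemaps(siteUrl: string): string[] {
  const base = siteUrl.replace(/\/+$/, ""); // drop trailing slashes
  return CANDIDATE_PATHS.map(function (p) { return base + p; });
}

// Pull every <loc> value out of a sitemap document.
function extractLocs(xml: string): string[] {
  const locs: string[] = [];
  const re = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
  let m: RegExpExecArray | null;
  while ((m = re.exec(xml)) !== null) {
    const value = m[1];
    if (value) locs.push(value);
  }
  return locs;
}

// A sitemap index wraps child sitemaps in <sitemapindex>; a plain sitemap uses <urlset>.
function isSitemapIndex(xml: string): boolean {
  return /<sitemapindex[\s>]/i.test(xml);
}

// Recursively resolve a sitemap URL to page URLs.
// `getXml` is a hypothetical synchronous loader supplied by the caller
// (a real tool would fetch over HTTP); `seen` guards against cycles.
function collectPageUrls(
  sitemapUrl: string,
  getXml: (url: string) => string | undefined,
  seen: { [url: string]: boolean } = {}
): string[] {
  if (seen[sitemapUrl]) return [];
  seen[sitemapUrl] = true;
  const xml = getXml(sitemapUrl);
  if (xml === undefined) return [];
  const locs = extractLocs(xml);
  if (!isSitemapIndex(xml)) return locs;
  let pages: string[] = [];
  locs.forEach(function (child) {
    pages = pages.concat(collectPageUrls(child, getXml, seen));
  });
  return pages;
}
```

Injecting the loader keeps the recursion testable without network access; a regex is enough for a sketch, though a production parser would likely use a real XML library and decode entities such as `&amp;`.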

Options:

| Flag | Default | Meaning |
|---|---|---|
| --sitemap <url> | auto-discover | Use this sitemap URL instead of guessing. |
| --limit <n> | 100 | Max URLs to scan. |
| --concurrency <n> | 4 | Parallel requests (1-32). |
| --ci | off | Exit non-zero on any error or fetch failure. |
| --json | off | Machine-readable output. |
| --no-color | off | Disable ANSI color. |

Output includes per-page status, a per-type count of schemas found across the site, and a list of pages with no structured data at all.
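
To make the --limit and --concurrency semantics concrete, here is a minimal sketch of how the two flags might bound a scan. The `planScan` function and `ScanPlan` shape are hypothetical, and clamping out-of-range values (rather than rejecting them) is an assumption; only the defaults (100 and 4) and the 1-32 range come from the options table.

```typescript
// Illustrative sketch; not schemaguardian's actual code.
interface ScanPlan {
  urls: string[];      // URLs that will actually be scanned
  limited: boolean;    // true when the sitemap had more URLs than the limit
  concurrency: number; // parallel requests, clamped to 1..32
}

function planScan(allUrls: string[], limit: number = 100, concurrency: number = 4): ScanPlan {
  const cap = Math.max(0, Math.floor(limit));
  // Assumption: out-of-range values are clamped instead of rejected.
  const workers = Math.min(32, Math.max(1, Math.floor(concurrency)));
  return {
    urls: allUrls.slice(0, cap),
    limited: allUrls.length > cap,
    concurrency: workers
  };
}
```

With 150 URLs and the defaults, this plan would scan the first 100 and set `limited` to true.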

init — generate a CI workflow

# default: writes .github/workflows/schemaguardian.yml using `scan`
schemaguardian init --url https://my-site.com

# use single-page check instead of scan
schemaguardian init --url https://my-site.com --command check

# write somewhere else
schemaguardian init --url https://my-site.com --target .gitlab-ci.yml --force

Options: --url <url> (the site to validate) · --command check|scan (default scan) · --target <path> (output location) · --force (overwrite an existing file).

generate — interactively generate schema markup

# Interactive mode: select schema type and fill in fields
schemaguardian generate

# Direct mode: specify schema type directly
schemaguardian generate faq

# Preview without saving
schemaguardian generate product --preview

# Save to file
schemaguardian generate article --output schema.json

# Combine options
schemaguardian generate recipe --output my-recipe.json --preview

Options: --output <path> (save to file) · --preview (show without saving) · --type <type> (specify schema type directly instead of interactive selection).

Supported schema types: FAQPage, HowTo, Product, Recipe, Article, Review, LocalBusiness, Event, BreadcrumbList, Organization, Course, JobPosting, and Video.
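
For orientation, the kind of markup a FAQPage generator produces can be sketched as below. This is not the output of schemaguardian generate, whose exact format is not documented here; `buildFaqPage` and `FaqItem` are hypothetical names, and the object shape follows the schema.org FAQPage vocabulary (Question entries with an acceptedAnswer).

```typescript
// Illustrative sketch; not the generator's actual code or output format.
interface FaqItem {
  question: string;
  answer: string;
}

// Build a schema.org FAQPage object from question/answer pairs.
function buildFaqPage(items: FaqItem[]): object {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: items.map(function (item) {
      return {
        "@type": "Question",
        name: item.question,
        acceptedAnswer: { "@type": "Answer", text: item.answer }
      };
    })
  };
}

// Serialize for embedding in a <script type="application/ld+json"> tag.
const faqJson = JSON.stringify(
  buildFaqPage([{ question: "Does it run in CI?", answer: "Yes, pass --ci." }]),
  null,
  2
);
```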

CI integration

GitHub Actions

# .github/workflows/schema.yml
name: schemaguardian
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npx --yes @moonye/schemaguardian@latest scan https://your-site.com --ci

Alternatively, run npx @moonye/schemaguardian init --url https://your-site.com once and commit the generated workflow file.

GitLab CI

schema-check:
  image: node:20
  script:
    - npx --yes @moonye/schemaguardian@latest scan $CI_ENVIRONMENT_URL --ci

package.json

{
  "scripts": {
    "schema:check": "schemaguardian check https://faqjsonld.com --ci",
    "schema:scan":  "schemaguardian scan  https://faqjsonld.com --ci"
  }
}

What it validates

For every <script type="application/ld+json"> block found on a page:

  1. Generic envelope — JSON parses, @context includes schema.org, @type is present.
  2. Per-type required fields for the 12 schema types in the registry: FAQPage, HowTo, Product, Recipe, Article (and BlogPosting, NewsArticle), Review, LocalBusiness, Event, BreadcrumbList, Organization, Course, JobPosting.
  3. 2026-specific Google rejection patterns, including:
    • FAQ rich result deprecation since 2023, further cut March 2026
    • HowTo rich result removal since 2023-2024
    • Product without offers OR aggregateRating (no rich result)
    • JobPosting without validThrough (Google for Jobs suppression)
    • JobPosting without baseSalary (lower placement, AI filter skip)
    • Article without publisher logo (Top Stories ineligible)
    • BreadcrumbList with non-sequential positions
    • Many more, see src/lib/validators.ts.

Other @type values pass envelope checks and emit an info-level note that type-specific validation was skipped.
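
The envelope step and one of the listed patterns can be sketched as follows. This is a minimal illustration, not the code in src/lib/validators.ts: the function names, issue codes, and regex-based extraction are assumptions, and "sequential" is interpreted here as positions running 1, 2, 3, ... in order.

```typescript
// Illustrative sketch; names and issue codes are hypothetical.
interface Issue {
  severity: "error" | "warning" | "info";
  code: string;
  message: string;
}

// Extract the raw contents of every <script type="application/ld+json"> block.
function extractJsonLd(html: string): string[] {
  const re = /<script\b[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    blocks.push((m[1] || "").trim());
  }
  return blocks;
}

// Envelope check: JSON parses, @context mentions schema.org, @type is present.
function checkEnvelope(raw: string): Issue[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return [{ severity: "error", code: "invalid-json", message: "Block is not valid JSON." }];
  }
  // A block may hold one object or a top-level array of objects.
  const nodes: unknown[] = Array.isArray(data) ? data : [data];
  const issues: Issue[] = [];
  nodes.forEach(function (node) {
    if (typeof node !== "object" || node === null) {
      issues.push({ severity: "error", code: "not-an-object", message: "Expected a JSON object." });
      return;
    }
    const obj = node as { [key: string]: unknown };
    const context = JSON.stringify(obj["@context"] || "");
    if (context.indexOf("schema.org") === -1) {
      issues.push({ severity: "error", code: "missing-context", message: "@context must include schema.org." });
    }
    if (!obj["@type"]) {
      issues.push({ severity: "error", code: "missing-type", message: "@type is required." });
    }
  });
  return issues;
}

// BreadcrumbList check, assuming positions should run 1, 2, 3, ... in order.
function hasSequentialPositions(positions: number[]): boolean {
  return positions.every(function (p, i) { return p === i + 1; });
}
```

A regex is adequate for a sketch; a production validator would more likely parse the HTML properly, since script attributes and nesting can vary.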

What it does NOT do (yet)

  • Microdata or RDFa parsing (only JSON-LD)
  • Validating that visible page content matches schema text content (Google requires this; only a human or rendered diff can verify it)
  • Full schema.org SHACL validation
  • Multi-domain monitoring (planned for paid Pro tier)

Severity levels

| Level | Meaning | --ci exit code |
|---|---|---|
| ERR | Required field missing or wrong type. Will not produce rich results. | 1 |
| WARN | Best practice violation or 2026 deprecation note. Schema may still validate. | 0 |
| INFO | Type unsupported or other note. | 0 |

scan --ci also exits 1 on any fetch failure (HTTP 4xx/5xx, timeout, DNS).
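
The exit-code rules above reduce to a small decision function. The sketch below is illustrative only: `PageResult` and `ciExitCode` are hypothetical names, and the severity strings "error" and "info" are assumptions (only "warning" appears in the JSON output documented in this README).

```typescript
// Illustrative sketch of the --ci exit-code rules; not the CLI's actual code.
type Severity = "error" | "warning" | "info";

interface PageResult {
  fetchFailed: boolean;   // HTTP 4xx/5xx, timeout, or DNS failure
  severities: Severity[]; // one entry per issue found on the page
}

// Returns 1 if any page failed to fetch or has an error-level issue, else 0.
function ciExitCode(results: PageResult[]): number {
  const failed = results.some(function (r) {
    return r.fetchFailed || r.severities.indexOf("error") !== -1;
  });
  return failed ? 1 : 0;
}
```

Warnings and info notes never change the exit code in this model, which matches the table: only ERR maps to 1.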

JSON output schemas

check --json

{
  "target": "https://example.com",
  "blocksFound": 2,
  "blocks": [
    {
      "block": { "raw": "...", "parsed": { ... }, "position": 1 },
      "schemaType": "FAQPage",
      "issues": [{ "severity": "warning", "code": "faq-rich-result-deprecated", "message": "...", "path": "..." }]
    }
  ]
}
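
A consumer of this output might type it and tally issues as in the sketch below. The interfaces mirror only the fields shown in the sample above (the `block` field is omitted); the names `CheckReport` and `countBySeverity` are hypothetical and not part of any published API.

```typescript
// Illustrative types derived from the sample output; not an official API.
interface CheckIssue {
  severity: string;
  code: string;
  message: string;
  path?: string;
}

interface CheckBlock {
  schemaType: string;
  issues: CheckIssue[];
}

interface CheckReport {
  target: string;
  blocksFound: number;
  blocks: CheckBlock[];
}

// Count issues per severity across all blocks of one report.
function countBySeverity(report: CheckReport): { [severity: string]: number } {
  const counts: { [severity: string]: number } = {};
  report.blocks.forEach(function (block) {
    block.issues.forEach(function (issue) {
      counts[issue.severity] = (counts[issue.severity] || 0) + 1;
    });
  });
  return counts;
}
```

In practice the report would come from parsing the stdout of schemaguardian check <url> --json with JSON.parse.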

scan --json

{
  "sitemap": "https://example.com/sitemap-index.xml",
  "totalUrlsInSitemap": 14,
  "scanned": 14,
  "limited": false,
  "pages": [
    { "url": "...", "status": "ok", "blocksFound": 2, "schemaTypes": ["FAQPage", "BreadcrumbList"], "errors": 0, "warnings": 1 }
  ],
  "summary": {
    "ok": 1, "withErrors": 0, "withWarnings": 13, "fetchErrors": 0,
    "missingSchema": 0, "schemaTypeCounts": { "FAQPage": 13 },
    "totalErrors": 0, "totalWarnings": 13
  }
}
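
As a usage sketch, a script could read this report and list the pages worth a closer look. The interfaces cover only fields shown in the sample above, and `pagesNeedingAttention` is a hypothetical helper, not something the package is known to export.

```typescript
// Illustrative types derived from the sample output; not an official API.
interface ScanPage {
  url: string;
  status: string;
  blocksFound: number;
  schemaTypes: string[];
  errors: number;
  warnings: number;
}

interface ScanReport {
  sitemap: string;
  scanned: number;
  limited: boolean;
  pages: ScanPage[];
}

// URLs of pages that have error-level issues or no structured data at all.
function pagesNeedingAttention(report: ScanReport): string[] {
  return report.pages
    .filter(function (p) { return p.errors > 0 || p.blocksFound === 0; })
    .map(function (p) { return p.url; });
}
```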

Roadmap

  • v0.1: check command for a single URL or file
  • v0.2: scan for whole sites via sitemap, init for one-shot CI setup
  • v0.3: generate for interactive schema creation
  • v0.4+ (paid Pro, planned): multi-domain monitoring, auto-PR fix via GitHub API, team workflows, GitHub Action wrapper

The free CLI will always validate any site. Paid tiers add multi-domain operations and automation.

Contributing

Source lives at https://github.com/moonye6/faq under cli/. The 12 free schema generators on https://faqjsonld.com use the same validators. Issues and PRs welcome.

License

MIT