
@seo-solver/extract v0.3.0

Structured SEO data extraction pipeline for SEO Solver

@seo-solver/extract

@seo-solver/extract turns fetched pages into structured SEO data. It owns the canonical target catalog for the workspace, so packages and applications can all talk about the same extraction targets.

Installation

pnpm add @seo-solver/extract

What this package gives you

  • page-level extraction results with a stable { source, data, errors } wrapper and target-driven sparse data
  • focused helpers for extracting specific SEO targets such as meta tags, Open Graph, JSON-LD, headings, and canonical links
  • accessor helpers for reading target data and status from an ExtractedPage
  • a package-owned listTargets() catalog
  • an advanced pipeline surface for custom extractors and low-level extraction work

Simple API

Use the root API when you want extraction results that are ready to pass into comparison or validation.

The simple API returns the stable ExtractedPage contract from @seo-solver/types/extract:

| Need | Public import | Result shape |
| --- | --- | --- |
| Extract from an HTML string | extractHtml from @seo-solver/extract | ExtractedPage |
| Extract from a FetchResult | extractPage from @seo-solver/extract | ExtractedPage |
| Extract from a robots.txt body | extractRobotsText from @seo-solver/extract | ExtractedPage |
| Extract only meta tags | extractMetaTags from @seo-solver/extract | MetaTagsData \| null |
| Extract only Open Graph tags | extractOpenGraph from @seo-solver/extract | OpenGraphData \| null |
| Extract only JSON-LD blocks | extractJsonLd from @seo-solver/extract | JsonLdData \| null |
| Extract only headings | extractHeadings from @seo-solver/extract | HeadingsData \| null |
| Extract only canonical links | extractCanonical from @seo-solver/extract | CanonicalData \| null |
| Inspect supported targets | listTargets from @seo-solver/extract | TargetCatalogEntry[] |
| Read selected target data safely | getTargetData, getTargetStatus, hasTargetData from @seo-solver/extract | Typed target data, status, or boolean |
| Type the result in your app | ExtractedPage from @seo-solver/types/extract | Stable page-level contract |
| Build custom low-level pipelines | @seo-solver/extract/advanced | ExtractionEnvelope[] |

For third-party applications, prefer the ExtractedPage result unless you intentionally need custom extractor instances or raw pipeline envelopes.

import { extractHtml, getTargetData, getTargetStatus, hasTargetData, listTargets } from '@seo-solver/extract';
import type { ExtractedPage } from '@seo-solver/types/extract';

const page: ExtractedPage = extractHtml('<!doctype html><html><head><title>Hello</title></head></html>', {
  targets: ['meta', 'headings'],
});

console.log(page.source.url);
console.log(getTargetData(page, 'meta')?.title ?? 'No title found');
console.log(hasTargetData(page, 'headings') ? getTargetData(page, 'headings')?.length : 0);
console.log(getTargetStatus(page, 'opengraph') ?? 'not selected');
console.log(page.errors.map((error) => error.message));
console.log(listTargets().map((target) => target.key));

If you already have a canonical fetch result from @seo-solver/fetch, use extractPage().

If you specifically want to extract one SEO target directly, the root API also exposes focused helpers:

import { extractCanonical, extractHeadings, extractMetaTags, extractOpenGraph } from '@seo-solver/extract';

const html = '<html><head><title>Hello</title><meta property="og:title" content="Hello"></head><body><h1>Hello</h1></body></html>';

console.log(extractMetaTags(html));
console.log(extractOpenGraph(html));
console.log(extractHeadings(html));
console.log(extractCanonical(html));

If you specifically want to parse a robots.txt body, there is also a dedicated helper:

import { extractRobotsText } from '@seo-solver/extract';

const robotsPage = extractRobotsText('User-agent: *\nDisallow: /admin');

console.log(robotsPage.data.robotsTxt);

Advanced API

Use the advanced surface when your application needs direct access to pipelines or extractor classes rather than the simple extraction helpers.

import { listTargets } from '@seo-solver/extract';
import { createExtractorPipeline, MetaTagsExtractor } from '@seo-solver/extract/advanced';

const pipeline = createExtractorPipeline({ targets: listTargets().map((entry) => entry.key) });
const customOnly = createExtractorPipeline({ targets: [new MetaTagsExtractor()] });

Use @seo-solver/extract/advanced when you intentionally want low-level extractor envelopes, pipeline control, or custom extractor injection. For normal consumers, the page-level root API is the better fit.
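To make the custom-extractor idea concrete, here is a self-contained sketch of the pattern. The Extractor interface, TitleExtractor class, and runPipeline function below are hypothetical stand-ins for illustration, not the actual @seo-solver/extract/advanced API:

```typescript
// Hypothetical sketch of the custom-extractor pattern. The interface and
// pipeline runner here are illustrative stand-ins only; the real extractor
// classes and pipeline factory live in @seo-solver/extract/advanced.
interface Envelope {
  target: string;
  data: unknown | null;
}

interface Extractor {
  target: string;
  extract(html: string): unknown | null;
}

// A toy extractor that pulls the <title> text out of an HTML string.
class TitleExtractor implements Extractor {
  target = 'title';
  extract(html: string): unknown | null {
    const match = /<title>([^<]*)<\/title>/i.exec(html);
    return match ? { title: match[1] } : null;
  }
}

// A toy pipeline: run each extractor and collect low-level envelopes.
function runPipeline(extractors: Extractor[], html: string): Envelope[] {
  return extractors.map((e) => ({ target: e.target, data: e.extract(html) }));
}

const envelopes = runPipeline(
  [new TitleExtractor()],
  '<html><head><title>Hello</title></head></html>',
);
console.log(envelopes); // [{ target: 'title', data: { title: 'Hello' } }]
```

The point of the pattern is that injected extractor instances and catalog-keyed targets can flow through the same pipeline, which is what lets the advanced surface return raw envelopes per extractor.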

Core concepts

  • targets are the public selection vocabulary (meta, opengraph, jsonld, robotsTxt, and so on)
  • source describes what was fetched and from where
  • data contains only the selected or default-selected targets; requested targets with no extracted data remain present as null
  • targetStatus records whether each selected or default-selected target is present or missing
  • errors contains extractor-level warnings in a package-owned format
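The concepts above can be pictured with a sketch of an ExtractedPage-shaped value. The field names follow the contract described in this README; the concrete types live in @seo-solver/types/extract, and the literal below is an illustrative assumption, not the package's actual output:

```typescript
// Illustrative sketch of an ExtractedPage-shaped value. The field names
// follow the contract described above; the exact types are defined in
// @seo-solver/types/extract, and this literal is an assumption for
// demonstration only.
type TargetStatus = 'present' | 'missing';

interface ExtractedPageSketch {
  source: { url: string };
  // Sparse, target-driven: only selected targets appear as keys.
  data: Record<string, unknown | null>;
  targetStatus: Record<string, TargetStatus>;
  errors: { message: string }[];
}

const page: ExtractedPageSketch = {
  source: { url: 'https://example.com/' },
  data: {
    meta: { title: 'Hello' }, // selected and extracted
    headings: null,           // selected, but nothing was found
    // opengraph is absent entirely: it was not selected
  },
  targetStatus: { meta: 'present', headings: 'missing' },
  errors: [],
};

console.log(page.targetStatus.meta);   // 'present'
console.log('opengraph' in page.data); // false: never selected
```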

When reading data, treat it as target-driven and sparse: a selected target can have data, be present with null, or be absent when it was not selected. Use targetStatus when your app needs to tell the difference between “this target was checked and missing” and “this target was not part of this extraction.”

The target helper functions encode those checks for common consumers:

  • getTargetData(page, target) returns typed target data or null
  • getTargetStatus(page, target) returns present, missing, or undefined when the target was not selected
  • hasTargetData(page, target) returns true only when the selected target produced data
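The three checks above can be sketched in a few lines over an ExtractedPage-like shape. This is a minimal illustration of the semantics, not the package's actual implementation, and the PageLike type is a hypothetical stand-in:

```typescript
// Minimal sketches of the target helpers over an ExtractedPage-like shape.
// The real implementations ship in @seo-solver/extract; these stand-ins
// just encode the same sparse-data checks described above.
type TargetStatus = 'present' | 'missing';

interface PageLike {
  data: Record<string, unknown | null>;
  targetStatus: Record<string, TargetStatus>;
}

// Data or null, whether the target was unselected or just empty.
function getTargetData(page: PageLike, target: string): unknown | null {
  return page.data[target] ?? null;
}

// 'present' or 'missing' for selected targets, undefined otherwise.
function getTargetStatus(page: PageLike, target: string): TargetStatus | undefined {
  return page.targetStatus[target];
}

// True only when the target was selected AND produced data.
function hasTargetData(page: PageLike, target: string): boolean {
  return getTargetStatus(page, target) === 'present' && page.data[target] != null;
}

const page: PageLike = {
  data: { meta: { title: 'Hello' }, headings: null },
  targetStatus: { meta: 'present', headings: 'missing' },
};

console.log(hasTargetData(page, 'meta'));        // true
console.log(hasTargetData(page, 'headings'));    // false: selected, no data
console.log(getTargetStatus(page, 'opengraph')); // undefined: not selected
```

Note how the undefined-versus-null distinction falls out of the sparse contract: getTargetStatus is the only call that can distinguish "not selected" from "selected but missing".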

Related docs and examples