
srcfull

v2.0.1

Image extraction and source-resolution toolkit for high-quality web images.

srcfull is a package-first toolkit for extracting and upgrading web image URLs.

It is designed as a standalone library and CLI for image extraction and source resolution. The focus is:

  • extract image candidates from HTML
  • filter obvious junk like logos and icons
  • resolve CDN/transformed URLs back to larger originals
  • probe likely source variants when no curated pattern exists
  • optionally plug in HTML fetchers like ScrapingBee and fallback image providers like Firecrawl
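The "resolve CDN/transformed URLs back to larger originals" step often amounts to dropping resize/quality query parameters. This is an illustrative sketch of that idea, not srcfull's actual resolution logic (which also uses curated patterns and probing); the parameter names are common CDN conventions, assumed here:

```typescript
// Strip common CDN transform parameters (w, q, width, quality) from an
// image URL to recover a likely "original" variant. Illustrative only.
function stripTransformParams(url: string): string {
  const u = new URL(url);
  for (const param of ["w", "q", "width", "quality"]) {
    u.searchParams.delete(param);
  }
  return u.toString();
}

// "https://cdn.example.com/image.jpg?w=400&q=80"
//   -> "https://cdn.example.com/image.jpg"
```

A real resolver would verify the stripped URL actually serves a larger image before preferring it, which is what the probing step above is for.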

It handles the page-shape problems that usually make this kind of package annoying in practice:

  • relative image paths resolved against the page URL
  • lazy-loaded image attributes like data-src, data-srcset, and data-original
  • img srcset, picture source, inline background images, and social/meta image tags
  • private-host blocking for both page scraping and image validation
  • HEAD fallback to ranged GET for hosts that refuse metadata requests
  • persistent file-backed cache/pattern stores for repeat runs
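One concrete piece of the srcset handling above is choosing the widest candidate from a srcset string. A minimal standalone sketch (not srcfull's internals) of that selection:

```typescript
// Pick the URL with the largest width descriptor from a srcset value,
// e.g. "a.jpg 400w, b.jpg 800w" -> "b.jpg". Entries without a width
// descriptor are treated as width 0. Illustrative only.
function largestFromSrcset(srcset: string): string | undefined {
  let best: { url: string; width: number } | undefined;
  for (const part of srcset.split(",")) {
    const [url, descriptor] = part.trim().split(/\s+/);
    if (!url) continue;
    const width = descriptor?.endsWith("w") ? parseInt(descriptor, 10) : 0;
    if (!best || width > best.width) best = { url, width };
  }
  return best?.url;
}
```

A production parser would also handle density descriptors (`2x`) and malformed entries; this sketch only covers the width case.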

Install

pnpm install
pnpm build

Library Usage

import { scrapePage, resolveImageUrl } from "srcfull";

const resolved = await resolveImageUrl(
  "https://cdn.example.com/image.jpg?w=400&q=80"
);

const page = await scrapePage("https://example.com/product-page");

scrapePage() normalizes relative candidates against the page URL before validation and resolution, so typical product/article HTML works without extra preprocessing.
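That normalization is standard WHATWG URL resolution; a minimal standalone sketch of the same behavior (not srcfull's actual code):

```typescript
// Resolve a relative image candidate against the page URL, mirroring
// the normalization scrapePage() applies before validation.
function normalizeCandidate(candidate: string, pageUrl: string): string {
  return new URL(candidate, pageUrl).toString();
}

// normalizeCandidate("/images/hero.jpg", "https://example.com/product-page")
//   -> "https://example.com/images/hero.jpg"
```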

If you need rendered HTML instead of plain fetch, inject a custom fetcher:

import { scrapePage } from "srcfull";
import { createScrapingBeeHtmlFetcher } from "srcfull/providers/scrapingbee";

const fetchHtml = createScrapingBeeHtmlFetcher({
  apiKey: process.env.SCRAPINGBEE_API_KEY!,
});

const result = await scrapePage("https://example.com", { fetchHtml });

If you want the built-in fetcher with different timeout or header behavior:

import { createDefaultHtmlFetcher, scrapePage } from "srcfull";

const fetchHtml = createDefaultHtmlFetcher({
  timeoutMs: 15_000,
  headers: {
    "Accept-Language": "en-GB,en;q=0.9",
  },
});

const result = await scrapePage("https://example.com", { fetchHtml });

For image-only fallback:

import { createFirecrawlImageFallback } from "srcfull/providers/firecrawl";

If you want candidate extraction without the rest of the pipeline:

import { extractImageCandidatesFromHtml } from "srcfull";

const candidates = extractImageCandidatesFromHtml(
  html,
  "https://example.com/product-page"
);

For repeat jobs, persist cache and learned patterns on disk:

import {
  createFileCache,
  createFilePatternStore,
  resolveImageUrl,
} from "srcfull";

const cache = createFileCache({ filePath: ".srcfull/cache.json" });
const patternStore = createFilePatternStore({
  filePath: ".srcfull/patterns.json",
});

const result = await resolveImageUrl("https://cdn.example.com/photo.jpg?w=400", {
  cache,
  patternStore,
});

CLI

srcfull resolve 'https://cdn.example.com/photo.jpg?w=300'
srcfull scrape 'https://example.com/listing' --max-images=12
srcfull scrape 'https://example.com/listing' --max-images=12 --min-size=300 --resolve-concurrency=8
srcfull --version

The JSON response from scrape includes a stats object with found, returned, resolved, failed, and durationMs.
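A hypothetical shape for that payload, for orientation only: the stats field names come from the text above, while the top-level key holding the resolved URLs and all values are made up:

```json
{
  "images": ["https://example.com/images/hero.jpg"],
  "stats": {
    "found": 24,
    "returned": 12,
    "resolved": 11,
    "failed": 1,
    "durationMs": 1830
  }
}
```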

Demo Page

There is a self-contained demo page at docs/demo/index.html.

pnpm demo:build
pnpm demo:serve

The page is generated from real calls to the package, so the HTML samples, extracted candidates, resolved URLs, and persisted cache/pattern snapshots are actual outputs rather than hand-written mockups.

Development

pnpm test
pnpm test:live-patterns
pnpm typecheck
pnpm build

pnpm test:live-patterns revalidates the curated real-world CDN patterns in test/fixtures/curated-patterns.json against the live network.