broken-link-checker-html

v1.0.0

Fast broken link detection for HTML content. Detects broken URLs, broken images, and invalid href values.

Features

  • Fast parallel checking - Check multiple URLs concurrently
  • Live page checking - Fetch a URL and check all its links
  • HEAD + GET fallback - Tries HEAD first, falls back to GET for better compatibility
  • Invalid href detection - Finds malformed links like href="click here" instead of proper URLs
  • Safe domain whitelist - Skip social media domains that often block automated requests
  • HTML cleaning - Remove broken links while preserving content
  • Zero dependencies - Uses native fetch API (Node.js 18+)
  • CLI & API - Use from command line or as a library
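
The HEAD + GET fallback listed above can be sketched with the native fetch API. This is only an illustration of the general technique, not the package's actual implementation: `isUrlBroken` and the injectable `fetchImpl` option are hypothetical names, and treating any non-2xx response as broken is an assumption.

```javascript
// Sketch only: `isUrlBroken` and `fetchImpl` are hypothetical names,
// not part of the broken-link-checker-html API.
async function isUrlBroken(url, { timeout = 5000, fetchImpl = fetch } = {}) {
  // Issue one request with the given method, aborting after `timeout` ms.
  const attempt = async (method) => {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeout);
    try {
      const res = await fetchImpl(url, { method, signal: controller.signal });
      return res.ok; // true for any 2xx status
    } catch {
      return false; // network error, DNS failure, or timeout
    } finally {
      clearTimeout(timer);
    }
  };

  // Try the cheap HEAD request first; retry with GET if it did not succeed,
  // since some servers reject or mishandle HEAD.
  if (await attempt('HEAD')) return false;
  return !(await attempt('GET'));
}
```

Passing a custom `fetchImpl` makes the function easy to exercise without network access; by default it uses the global `fetch` available in Node.js 18+.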

Installation

npm install broken-link-checker-html

Or use directly with npx:

npx broken-link-checker-html -f index.html

CLI Usage

Check a live page (fetch and scan all links)

broken-link-checker -p https://example.com/page
# or simply
broken-link-checker https://example.com/page

Check if a single URL is broken

broken-link-checker -u https://example.com/page

Check a local HTML file

broken-link-checker -f index.html

Options

-u, --url <url>         Check if a single URL is broken
-p, --page <url>        Fetch a live page and check all its links
-f, --file <path>       Check HTML file for broken links
-t, --timeout <ms>      Request timeout (default: 5000)
-c, --concurrency <n>   Max parallel requests (default: 50)
-m, --method <method>   HTTP method: HEAD, GET, or auto (default: auto)
-o, --output <path>     Output results to JSON file
-q, --quiet             Only output errors
--no-invalid            Don't check for invalid hrefs
--json                  Output as JSON
-h, --help              Show help
-v, --version           Show version
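
To illustrate what a concurrency limit such as `-c 50` means in practice, here is a minimal sketch of a bounded worker pool. The helper name `mapWithConcurrency` is hypothetical, and the package may schedule its requests differently.

```javascript
// Sketch only: `mapWithConcurrency` is a hypothetical helper, not a package export.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none are left.
  const runner = async () => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  };

  // Start at most `limit` runners; they all share the `next` counter.
  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results; // same order as `items`
}
```

Results keep the input order even though requests may finish out of order. If `worker` throws, the returned promise rejects, so a link checker would normally catch errors inside the worker and report them as broken links.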

Examples

# Check all links on a live page
broken-link-checker -p https://example.com/blog

# Check with GET method only (some servers don't support HEAD)
broken-link-checker -p https://example.com -m GET

# Check HTML file and save results
broken-link-checker -f page.html -o results.json

# Output as JSON for piping
broken-link-checker -p https://example.com --json | jq '.brokenLinks'

# Check URL quietly (for CI/CD)
broken-link-checker -u https://example.com -q

API Usage

Find Broken Links

import { findBrokenLinks } from 'broken-link-checker-html';

const html = `
  <html>
    <body>
      <a href="https://example.com/valid">Valid Link</a>
      <a href="https://example.com/broken-page">Broken Link</a>
      <a href="click here">Invalid HREF</a>
      <img src="https://example.com/missing.jpg" />
    </body>
  </html>
`;

const result = await findBrokenLinks(html, {
  timeout: 5000,
  concurrency: 50,
});

console.log(result);
// {
//   brokenLinks: [{ url: 'https://example.com/broken-page', ... }],
//   brokenImages: [{ url: 'https://example.com/missing.jpg', ... }],
//   invalidLinks: [{ url: 'click here', reason: 'invalid_href', ... }],
//   stats: {
//     totalLinks: 2,
//     totalImages: 1,
//     totalInvalidLinks: 1,
//     brokenLinksCount: 1,
//     brokenImagesCount: 1,
//   }
// }
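
A result object of this shape is easy to turn into a report, for example in a CI script. The sketch below assumes only the `brokenLinks`, `brokenImages`, and `invalidLinks` arrays and their `url` fields shown above; `summarize` is a hypothetical helper, not a package export.

```javascript
// Sketch only: `summarize` is a hypothetical helper; the field names follow
// the example result shown above.
function summarize(result) {
  const { brokenLinks = [], brokenImages = [], invalidLinks = [] } = result;
  const lines = [
    ...brokenLinks.map((item) => `broken link: ${item.url}`),
    ...brokenImages.map((item) => `broken image: ${item.url}`),
    ...invalidLinks.map((item) => `invalid href: ${item.url}`),
  ];
  // `ok` is true only when nothing was reported in any category.
  return { ok: lines.length === 0, lines };
}
```

A CI job could print `summary.lines` and set `process.exitCode = 1` when `summary.ok` is false.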

Check and Clean HTML

import { checkAndClean } from 'broken-link-checker-html';

// `html` is an HTML string, such as the one defined in the previous example
const result = await checkAndClean(html);

console.log(result.html);
// HTML with broken links removed (content preserved)

console.log(result.stats.cleaned);
// Number of items cleaned
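
"Removing a link while preserving content" can be pictured as unwrapping the anchor tag and keeping its inner markup. The following is a simplified sketch of that idea, assuming a regex-based approach; `unwrapLinks` is a hypothetical helper, and the package's real cleaning logic may differ.

```javascript
// Sketch only: `unwrapLinks` is a hypothetical helper that illustrates the idea.
// A regex is enough for simple markup but is not a full HTML parser.
function unwrapLinks(html, brokenUrls) {
  const broken = new Set(brokenUrls);
  // Captures: (1) the quote character, (2) the href value, (3) the inner content.
  const anchor = /<a\b[^>]*\bhref=(["'])(.*?)\1[^>]*>([\s\S]*?)<\/a>/gi;
  return html.replace(anchor, (match, _quote, href, inner) =>
    // Keep only the inner content for broken links; leave other anchors untouched.
    broken.has(href) ? inner : match
  );
}
```

For example, unwrapping `https://example.com/broken-page` turns `<a href="https://example.com/broken-page">Broken Link</a>` into plain `Broken Link`.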

Check a Live Page

import { checkPage } from 'broken-link-checker-html';

// Fetch page and check all its links
const result = await checkPage('https://example.com/blog', {
  timeout: 10000,
  concurrency: 50,
});

console.log(result.pageUrl);        // 'https://example.com/blog'
console.log(result.brokenLinks);    // Array of broken links
console.log(result.invalidLinks);   // Array of invalid hrefs
console.log(result.stats);          // Statistics

Check a Single URL

import { checkUrl } from 'broken-link-checker-html';

const isBroken = await checkUrl('https://example.com/page');
console.log(isBroken); // true if the URL is broken, false otherwise

// With options
const isBroken2 = await checkUrl('https://example.com/page', {
  timeout: 5000,
  method: 'GET',  // 'HEAD', 'GET', or 'auto' (default)
});

Extract URLs from HTML

import { extractUrls } from 'broken-link-checker-html';

// `html` is an HTML string, such as the one defined in the first example
const { links, images, invalidLinks } = extractUrls(html);

console.log(links);
// [{ url: 'https://...', fullMatch: '<a href="...">...' }]

console.log(invalidLinks);
// [{ url: 'click here', fullMatch: '<a href="click here">...', reason: 'invalid_href' }]

Custom Safe Domains

import { findBrokenLinks, DEFAULT_SAFE_DOMAINS } from 'broken-link-checker-html';

// Add custom domains to skip
const customSafeDomains = [
  ...DEFAULT_SAFE_DOMAINS,
  'internal-tool.company.com',
  'cdn.example.com',
];

const result = await findBrokenLinks(html, {
  safeDomains: customSafeDomains,
});

What is "Invalid HREF"?

Invalid HREFs are malformed link attributes that don't follow standard URL formats:

<!-- Invalid: Plain text instead of URL -->
<a href="click here">Click Here</a>

<!-- Invalid: Spaces without encoding -->
<a href="my page">My Page</a>

<!-- Valid formats -->
<a href="https://example.com">Absolute URL</a>
<a href="/page">Root-relative</a>
<a href="#section">Anchor</a>
<a href="tel:+1234567890">Phone</a>
<a href="mailto:user@example.com">Email</a>

When a browser encounters href="click here", it treats the value as a relative URL, resolves it against the current page, and percent-encodes the space:

https://example.com/current-path/click%20here → 404 Not Found

This tool detects these invalid patterns before they cause 404 errors.
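
The resolution step can be reproduced with the standard `URL` constructor, which follows the same parsing rules browsers use. The base URL below is just the illustrative page address from the example above.

```javascript
// Resolve href values against a page URL, as a browser would.
const base = 'https://example.com/current-path/';

const resolved = new URL('click here', base).href;
console.log(resolved); // https://example.com/current-path/click%20here

// Valid relative forms resolve against the same base.
const rootRelative = new URL('/page', base).href;
console.log(rootRelative); // https://example.com/page

const anchorOnly = new URL('#section', base).href;
console.log(anchorOnly); // https://example.com/current-path/#section
```

The first result is a syntactically valid URL that almost certainly points at nothing, which is why such hrefs are worth flagging before any request is made.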

Default Safe Domains

The following domains are skipped by default (they often block automated requests):

  • facebook.com, instagram.com, twitter.com, x.com
  • youtube.com, linkedin.com, tiktok.com, pinterest.com
  • wa.me, whatsapp.com, t.me, telegram.org
  • discord.com, discord.gg
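
How a URL is matched against this list is not documented here. One plausible approach, shown as a sketch, is to compare hostnames and also accept subdomains such as `www.youtube.com`; `isSafeDomain` is a hypothetical helper, and the subdomain rule is an assumption.

```javascript
// Sketch only: `isSafeDomain` is a hypothetical helper; matching subdomains
// is an assumption, not documented package behavior.
function isSafeDomain(url, safeDomains) {
  let hostname;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // not an absolute URL, so there is no domain to compare
  }
  // Match the domain itself or any subdomain of it, but not look-alike hosts.
  return safeDomains.some(
    (domain) => hostname === domain || hostname.endsWith(`.${domain}`)
  );
}
```

Comparing parsed hostnames rather than raw substrings avoids false matches such as `notyoutube.com` or a safe domain name that only appears in a query string.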

Requirements

  • Node.js 18+ (uses native fetch)

License

MIT License. See the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Credits

Built with ❤️ by Hayati Ali Keles