@anabranch/broken-link-checker
v0.5.0
Published
Crawl websites and find broken links. Uses web-client for robust HTTP and anabranch streams for concurrent processing with backpressure.
Downloads
2,362
Readme
@anabranch/broken-link-checker
A concurrent website crawler that finds broken links. Built on anabranch streams for bounded-concurrency crawling, retry with exponential backoff, and streaming results.
Usage
import { BrokenLinkChecker } from '@anabranch/broken-link-checker'
const stream = new BrokenLinkChecker({
concurrency: 20,
timeout: 15_000,
retry: { attempts: 3, delay: (attempt) => 1000 * 2 ** attempt },
logLevel: 'info',
})
.filterUrls((url) => !url.pathname.endsWith('.pdf'))
.keepBroken((result) => result.status !== 401)
.check(['https://my-site.com', 'https://my-site.com/sitemap.xml'])
for await (const result of stream.successes()) {
if (!result.ok) {
console.log(
`BROKEN: ${result.url.href} (${result.reason}) on ${result.parent?.href}`,
)
}
}Installation
Deno (JSR)
import { BrokenLinkChecker } from 'jsr:@anabranch/broken-link-checker'Node / Bun (npm)
npm install @anabranch/broken-link-checkerAPI reference
See generated documentation for full API details.
