link-collector
v0.1.3
A generic paginated link collection CLI built with Bun
Collect detail links from paginated HTML pages and optionally extract target elements from each detail page.
Run
bunx link-collector \
--url-template 'https://example.com/list?page=[1-8]' \
--detail-selector 'a[href]' \
--detail-attr 'href'
Common Options
--url-template: URL containing an inline page range like [1-8]
--detail-selector: CSS selector used on list pages
--detail-attr: attribute to read from list-page elements, default data-href
--detail-fallback-attr: fallback attribute, default href
--target-selector: optional CSS selector used on detail pages
--target-attr: attribute to read from detail-page elements, default href
--target-text: extract text instead of an attribute
--output: write results to a file instead of stdout
--concurrency: number of concurrent detail-page requests, default 4
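To make the --url-template and --detail-attr / --detail-fallback-attr options more concrete, here is a minimal TypeScript sketch of how an inline page range and an attribute fallback could behave. It is not taken from link-collector's source: the helper names expandUrlTemplate and pickAttr are hypothetical, and the details (one range per template, inclusive bounds, fallback used only when the primary attribute is absent) are assumptions made for illustration.

```typescript
// Hypothetical helper, not part of link-collector's documented API.
// Expands a single inline range such as [1-8] into one URL per page number.
function expandUrlTemplate(template: string): string[] {
  const match = /\[(\d+)-(\d+)\]/.exec(template);
  if (match === null) {
    // Assumption: a template without a range names exactly one page.
    return [template];
  }
  const start = Number(match[1]);
  const end = Number(match[2]);
  if (start > end) {
    throw new RangeError(`Invalid page range in template: ${match[0]}`);
  }
  const urls: string[] = [];
  // Assumption: both ends of the range are inclusive.
  for (let page = start; page <= end; page++) {
    // Replace only the matched range text; the callback form avoids any
    // special treatment of "$" sequences in the replacement.
    urls.push(template.replace(match[0], () => String(page)));
  }
  return urls;
}

// Hypothetical helper mirroring --detail-attr and --detail-fallback-attr:
// read the primary attribute and use the fallback only when it is absent.
function pickAttr(
  attrs: Record<string, string | undefined>,
  primary: string = "data-href",
  fallback: string = "href",
): string | undefined {
  return attrs[primary] ?? attrs[fallback];
}

const pages = expandUrlTemplate("https://example.com/list?page=[1-3]");
console.log(pages);
console.log(pickAttr({ href: "/items/42" }));
```

Under these assumptions, the template above yields three URLs, starting with https://example.com/list?page=1, and an element that only carries href resolves to that value because data-href is missing.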
Progress logs are written to stderr. Result rows are written to stdout, so piping works cleanly:
bunx link-collector ... | pbcopy
Install In A Project
bun add link-collector
bun run link-collector \
--url-template 'https://example.com/list?page=[1-8]' \
--detail-selector 'a[href]' \
--detail-attr 'href'