silentscraper

v1.0.2

Published a month ago


SilentScraper

Silent. Stealth. Surgical. A complete scraping and reverse-engineering toolkit for Node.js.

SilentScraper is a batteries-included toolkit for engineers who scrape, crawl, and reverse-engineer web targets. It bundles a stealth HTTP client, rotating identities (User-Agents and proxies), parsers for HTML, JSON, XML, sitemaps, robots.txt, and RSS/Atom feeds, and reverse-engineering helpers. It can be used as a library or through the silent CLI.

Install

npm install silentscraper

At a glance

import { Silent } from "silentscraper";

const s = new Silent({
  proxies: ["http://user:pass@proxy.example.com:8080"], // placeholder proxy URL
  proxyStrategy: "round-robin",
  rateLimit: { rps: 2, jitterMs: 250 },
  concurrency: 8,
  retries: 4,
  stealth: true,
  deviceProfile: "desktop",
  logLevel: "info",
});

const page = await s.scrape("https://example.com");
console.log(page.text("h1"));
console.log(page.links());
console.log(page.tables());

const data = await s.json("https://api.example.com/v1/items");
console.log(data.query("$..price"));

const intel = await s.reverse("https://target.com", { followScripts: true, deobfuscate: true });
console.log(intel.protections);
console.log(intel.scan.endpoints);
console.log(intel.scan.secrets);
console.log(intel.scan.jsonBlobs);

await s.crawl("https://example.com", {
  maxDepth: 3,
  maxPages: 500,
  onPage: ({ url, parser }) => console.log(url, parser.text("title")),
});

Features

Stealth HTTP client

Rotating User-Agents (desktop + mobile pools)
Proxy rotation: round-robin / random / sticky, with health tracking and auto-bans
Coherent sec-ch-ua client hints matched to UA
HTTP/1.1 + HTTP/2 via undici, gzip / deflate / brotli auto-decoded
Cookie jar with persistence (tough-cookie)
Per-host token-bucket rate limiting with jitter
Bounded-concurrency task queue
Automatic retries with exponential backoff on configurable status codes
Configurable timeouts and redirect limits
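
The per-host token-bucket rate limiting with jitter listed above can be illustrated with a minimal sketch. This is not SilentScraper's internal code: the TokenBucket and HostRateLimiter classes, their option names, and the injectable clock are all hypothetical, chosen only to show the technique. Each host gets its own bucket that refills at rps tokens per second, and a random jitter of up to jitterMs is added to every computed delay.

```javascript
// Hypothetical sketch of a token bucket; not the library's implementation.
class TokenBucket {
  constructor({ rps, burst = rps, now = () => Date.now() }) {
    this.rps = rps;        // refill rate in tokens per second
    this.capacity = burst; // maximum number of tokens the bucket can hold
    this.tokens = burst;   // the bucket starts full
    this.now = now;        // injectable clock, useful for testing
    this.last = now();
  }

  // Reserve one token and return how many ms to wait before sending (0 = now).
  reserve() {
    const t = this.now();
    const elapsedSeconds = (t - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.rps);
    this.last = t;
    this.tokens -= 1; // a negative balance is the time "debt" of queued callers
    return this.tokens >= 0 ? 0 : Math.ceil((-this.tokens / this.rps) * 1000);
  }
}

// Hypothetical wrapper that keeps one bucket per host and adds jitter.
class HostRateLimiter {
  constructor({ rps, jitterMs = 0, now, random = Math.random }) {
    this.options = { rps, now };
    this.jitterMs = jitterMs;
    this.random = random;
    this.buckets = new Map(); // host -> TokenBucket
  }

  // Delay in ms that a request to `url` should wait before being sent.
  delayFor(url) {
    const host = new URL(url).host;
    if (!this.buckets.has(host)) {
      this.buckets.set(host, new TokenBucket(this.options));
    }
    const wait = this.buckets.get(host).reserve();
    const jitter = Math.floor(this.random() * this.jitterMs);
    return wait + jitter;
  }
}
```

A caller would typically sleep for the returned delay, for example with a promise wrapped around setTimeout, before issuing the request. Keeping buckets per host means a slow limit on one target does not throttle requests to another.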

Parsers

HTML (CSS selectors, links/images/scripts/stylesheets, meta, JSON-LD, forms, tables, tabular extract)
JSON (JSONPath via jsonpath-plus)
XML (JSON-style + XPath)
sitemap.xml + sitemapindex.xml
robots.txt
RSS 2.0 + Atom feeds
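
To make the robots.txt entry above concrete, here is a small, simplified sketch of how such a file can be parsed and queried. It is an illustration only, not the parser SilentScraper ships: parseRobots and isAllowed are hypothetical names, wildcard patterns (* and $) are not handled, and matching follows the longest-prefix rule with Allow winning ties.

```javascript
// Hypothetical sketch: parse robots.txt into user-agent groups and sitemaps.
function parseRobots(text) {
  const groups = [];   // [{ agents: [string], rules: [{ allow, path }] }]
  const sitemaps = [];
  let current = null;
  let lastWasAgent = false;

  for (const raw of text.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // drop comments
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      if (value !== "") current.agents.push(value.toLowerCase());
      lastWasAgent = true;
      continue;
    }
    lastWasAgent = false;

    if (field === "sitemap") {
      sitemaps.push(value);
    } else if ((field === "allow" || field === "disallow") && current && value !== "") {
      // An empty Disallow value means "no restriction", so it adds no rule.
      current.rules.push({ allow: field === "allow", path: value });
    }
  }
  return { groups, sitemaps };
}

// Hypothetical sketch: decide whether `path` may be fetched by `userAgent`.
function isAllowed(robots, userAgent, path) {
  const ua = userAgent.toLowerCase();
  const specific = robots.groups.filter((g) =>
    g.agents.some((a) => a !== "*" && ua.includes(a))
  );
  // Fall back to the wildcard group only when no specific group matches.
  const chosen = specific.length > 0
    ? specific
    : robots.groups.filter((g) => g.agents.includes("*"));

  let best = null;
  for (const group of chosen) {
    for (const rule of group.rules) {
      if (!path.startsWith(rule.path)) continue;
      const longer = !best || rule.path.length > best.path.length;
      const tieAllow = best && rule.path.length === best.path.length && rule.allow;
      if (longer || tieAllow) best = rule;
    }
  }
  return best ? best.allow : true; // no matching rule means allowed
}
```

A crawler could call isAllowed before enqueueing each URL and feed the returned sitemaps list into its sitemap parser.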

Reverse engineering

Endpoint finder (fetch, XHR, axios, generic /api/, GraphQL, WebSockets, __NEXT_DATA__, Apollo, application/json scripts, route hints)
Secret scanner (AWS, Google, Stripe, Slack, GitHub, JWT, generic)
Protection detector (Cloudflare, Akamai, PerimeterX, DataDome, Imperva, Kasada, reCAPTCHA, hCaptcha, Turnstile, ShieldSquare)
JS deobfuscator (escape decode, base64 extract, _0x rename, pretty-print)
String extractor for long constants
cURL parser + builder (DevTools "Copy as cURL" round-trip)
HAR parser
JA3 fingerprint hints
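
As an illustration of the "escape decode" step of a JS deobfuscator, the sketch below rewrites \xNN and \uNNNN escapes back into readable characters. It is a heuristic under stated assumptions, not the library's deobfuscator: decodeEscapes is a hypothetical name, only printable ASCII is decoded, and quotes, backticks, and backslashes are left escaped so string literals stay syntactically valid. It does not distinguish an escaped backslash followed by x41 from a real escape, which a full implementation would handle with a tokenizer.

```javascript
// Characters that must stay escaped to avoid breaking string literals.
const UNSAFE_CHARS = new Set(['"', "'", "`", "\\"]);

// Hypothetical sketch: decode \xNN and \uNNNN escapes that map to
// printable ASCII, leaving every other escape sequence untouched.
function decodeEscapes(source) {
  const decode = (match, hex) => {
    const code = parseInt(hex, 16);
    const ch = String.fromCharCode(code);
    const printable = code >= 0x20 && code <= 0x7e;
    return printable && !UNSAFE_CHARS.has(ch) ? ch : match;
  };
  return source
    .replace(/\\x([0-9a-fA-F]{2})/g, decode)
    .replace(/\\u([0-9a-fA-F]{4})/g, decode);
}
```

Applied to obfuscated source, this turns a literal such as "\x68\x65\x6c\x6c\x6f" into "hello", which makes later steps like identifier renaming and pretty-printing easier to read.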

Output

CSV, JSON, NDJSON exporters with streaming append
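
Streaming append works well for NDJSON and CSV because each record serializes to one self-contained line. The sketch below shows that serialization step only. The function names are hypothetical and this is not the library's exporter API; CSV quoting follows the usual convention of wrapping fields that contain commas, quotes, or line breaks and doubling embedded quotes.

```javascript
// Hypothetical sketch: one record becomes one NDJSON line.
function toNdjsonLine(record) {
  return JSON.stringify(record) + "\n";
}

// Quote a single CSV field only when it needs quoting.
function csvField(value) {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Hypothetical sketch: one record becomes one CSV row in a fixed column order.
function toCsvRow(record, columns) {
  return columns.map((column) => csvField(record[column])).join(",") + "\r\n";
}
```

Each returned line can then be appended to the output file as soon as a page is processed, for example with Node's fs.appendFile or a write stream opened with flags: "a", so a long crawl never has to hold all results in memory.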

CLI

silent fetch <url> [--proxy URL] [--ua STRING] [--out FILE]
silent scrape <url> --selector "h1" [--attr href]
silent crawl <url> [--depth 2] [--max 50] [--out file.ndjson]
silent sitemap <url>
silent robots <origin>
silent feed <url>
silent reverse <url> [--follow-scripts] [--deobfuscate]
silent deobfuscate <file.js> [--out file.js]
silent curl <file-or-->
silent endpoints <file>
silent protections <url>

Ethical use

This toolkit is intended for security research, accessibility, journalism, competitive intelligence on public data, and engineering on systems you have authorization to test. Respect robots.txt, terms of service, and applicable law.

License

MIT
