@scrappey/scrappey-cli v0.1.0
scrappey-cli
The official command-line client for the Scrappey scraping API. Scrape any URL, bypass common antibot systems, and pipe markdown into jq/llm/cron — all from your shell.
```sh
npm install -g @scrappey/scrappey-cli
scrappey-cli auth --api-key YOUR_KEY
scrappey-cli scrape https://example.com -m -o example.md
```

Features
- One binary, zero runtime dependencies — pure Node 18+, nothing but package.json.
- Full API surface — GET/POST/PUT/PATCH/DELETE, sessions, proxy selection, custom headers, cookies, screenshots.
- Antibot flags — --cloudflare, --datadome, --kasada, --solve-captchas.
- LLM-ready output — --markdown returns markdown for RAG pipelines.
- Shell-native — HTML to stdout by default; pipe straight into jq, grep, or an LLM.
- Safe key storage — ~/.config/scrappey-cli/config.json at mode 0600, overridable via SCRAPPEY_API_KEY or .env.
Install
```sh
# Global install
npm install -g @scrappey/scrappey-cli

# Or run without installing
npx @scrappey/scrappey-cli scrape https://example.com
```

Requires Node.js 18 or newer.
You'll need an API key from scrappey.com.
Authentication
Resolution order (first match wins):
1. --api-key <key> flag on any command
2. SCRAPPEY_API_KEY environment variable
3. SCRAPPEY_API_KEY=… in a .env file in the current directory
4. Saved config at ~/.config/scrappey-cli/config.json (file mode 0600)
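As a sketch, the precedence above amounts to a first-match lookup. The function and source labels below are illustrative, not the CLI's actual internals:

```javascript
// Hypothetical sketch of the key-resolution order (flag > env > .env > config).
// Function name and source labels are assumptions; the real CLI may differ.
function resolveApiKey({ flagKey, env = {}, dotenv = {}, config = {} }) {
  if (flagKey) return { key: flagKey, source: 'flag' };
  if (env.SCRAPPEY_API_KEY) return { key: env.SCRAPPEY_API_KEY, source: 'env' };
  if (dotenv.SCRAPPEY_API_KEY) return { key: dotenv.SCRAPPEY_API_KEY, source: '.env' };
  if (config.apiKey) return { key: config.apiKey, source: 'config' };
  return null; // no key found anywhere
}
```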
```sh
# Save key to config (one-time)
scrappey-cli auth --api-key YOUR_KEY

# Inspect what's currently resolved (key is masked)
scrappey-cli auth --show
# → key=abcd…wxyz source=config

# Remove saved key
scrappey-cli auth --logout
```

Commands
scrape <url>
Scrape a URL and print the body to stdout.
```sh
# Plain HTML to stdout
scrappey-cli scrape https://example.com

# Markdown for LLM pipelines
scrappey-cli scrape https://example.com -m

# Write HTML to file
scrappey-cli scrape https://example.com -o page.html

# Full JSON response (solution, cookies, status, timing)
scrappey-cli scrape https://example.com --json | jq '.solution.statusCode'

# Anti-bot + geo proxy
scrappey-cli scrape https://protected.example \
  --cloudflare --country UnitedStates --premium

# POST with JSON body and headers
scrappey-cli scrape https://httpbin.org/post \
  -X POST \
  -H 'content-type: application/json' \
  -d '{"name":"demo","count":42}' --json

# Cheap request mode (no JS render)
scrappey-cli scrape https://api.example.com/data --request-type request --json

# Page screenshot
scrappey-cli scrape https://example.com --screenshot shot.png
```

Options:
| Flag | Description |
|---|---|
| -o, --output <file> | Write body to file (default: stdout) |
| -m, --markdown | Return markdown instead of HTML |
| --json | Print full JSON response |
| -X, --method <METHOD> | HTTP method (GET, POST, PUT, DELETE, PATCH) |
| -d, --data <json\|string> | Request body for POST/PUT/PATCH |
| -H, --header <K:V> | Custom header (repeatable) |
| --cookies <string> | Cookie string to set |
| --request-type <t> | browser (default) or request |
| --country <code> | Proxy country (e.g. UnitedStates, Germany) |
| --premium / --mobile | Premium / mobile proxy pool |
| --cloudflare / --datadome / --kasada | Enable antibot bypass |
| --solve-captchas | Auto-solve detected captchas |
| --session <id> | Reuse an existing session |
| --screenshot <file> | Save page screenshot |
| --timeout <ms> | Per-request timeout |
auth
```sh
scrappey-cli auth --api-key KEY   # save key
scrappey-cli auth --show          # show source (masked)
scrappey-cli auth --logout        # remove saved key
scrappey-cli auth                 # prompt for key on stdin
```

balance
```sh
scrappey-cli balance
# → { "balance": 12345, ... }
```

session create | destroy
```sh
ID=$(scrappey-cli session create --country UnitedStates)
scrappey-cli scrape https://example.com --session "$ID"
scrappey-cli scrape https://example.com/next --session "$ID"
scrappey-cli session destroy "$ID"
```

Programmatic use
The CLI is also a library — useful if you want the client in a script without adding a full SDK:
```js
import { ScrappeyClient } from '@scrappey/scrappey-cli';

const client = new ScrappeyClient({ apiKey: process.env.SCRAPPEY_API_KEY });
const res = await client.get({ url: 'https://example.com', markdown: true });
console.log(res.solution.markdown);
```

Pipelines
Classic Unix composition:
```sh
# Scrape + filter JSON
scrappey-cli scrape https://httpbin.org/json --json | jq '.solution.response | fromjson'

# Scrape + LLM summary
scrappey-cli scrape https://blog.example.com/post -m | llm 'summarize in 3 bullets'

# Batch from a file (read -r preserves backslashes in URLs)
while IFS= read -r url; do
  scrappey-cli scrape "$url" -m -o "out/$(printf '%s' "$url" | md5sum | cut -d' ' -f1).md"
done < urls.txt
```

Environment variables
| Variable | Purpose |
|---|---|
| SCRAPPEY_API_KEY | API key; takes precedence over the saved config |
| SCRAPPEY_CONFIG_DIR | Override config directory (default ~/.config/scrappey-cli) |
| SCRAPPEY_LIVE=1 | Enable live integration tests |
Development
```sh
git clone https://github.com/YOUR_USER/scrappey-cli.git
cd scrappey-cli
npm test                                               # unit + CLI tests (no network)
SCRAPPEY_LIVE=1 SCRAPPEY_API_KEY=... npm run test:live # hits real API (~3 credits)
node bin/scrappey-cli.js scrape https://example.com    # run locally
```

No runtime dependencies; the test runner is Node's built-in node --test.
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | API / network error |
| 2 | Bad usage (missing key, unknown command, bad flag) |
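Scripts can branch on these codes directly. The snippet below sketches the mapping; the error-class name is hypothetical, and only the numeric codes come from the table above:

```javascript
// Illustrative mapping from failure kind to the documented exit codes.
// 'UsageError' is an assumed name; only the codes 0/1/2 are documented.
function exitCodeFor(err) {
  if (!err) return 0;                      // success
  if (err.name === 'UsageError') return 2; // bad usage: missing key, bad flag
  return 1;                                // API or network error
}
```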
License
MIT — see LICENSE. Contributions welcome via pull request.
Users are responsible for complying with the Scrappey terms of service and with the terms of any website they scrape.
