@xcrawl/cli
v0.3.3
XCrawl command line interface
XCrawl CLI
XCrawl CLI is the official command-line interface for XCrawl. Use it from your terminal to scrape pages, run search queries, map sites, manage crawl jobs, and inspect the available SERP engines, web scrapers, and LLM models.
Installation
Run directly with npx:
npx -y @xcrawl/cli@latest --help
Install globally with npm:
npm install -g @xcrawl/cli
xcrawl --help
Quick Start
Initialize authentication with browser login:
npx -y @xcrawl/cli@latest init -y --browser
Or authenticate after installation:
xcrawl login --browser
xcrawl login --api-key <your_api_key>
Run core commands:
xcrawl scrape https://example.com --format markdown
xcrawl llm --list-models
xcrawl llm chatgpt_model 'prompt=What is XCrawl CLI?' --param location=US
xcrawl search "xcrawl cli" --limit 10
xcrawl serp --list-engines
xcrawl serp google_search 'q=xcrawl cli'
xcrawl scraper --list-scrapers
xcrawl scraper reddit_user_posts 'url_list=["https://www.reddit.com/r/test/comments/abc"]'
xcrawl status
xcrawl map https://example.com --limit 10
xcrawl crawl https://example.com
xcrawl crawl status <job-id>
Default shortcut for scrape:
xcrawl https://example.com
Authentication
XCrawl CLI supports browser authentication, manual API key entry, and environment variables:
xcrawl init -y --browser
xcrawl login
xcrawl login --browser
xcrawl login --api-key <your_api_key>
xcrawl logout
export XCRAWL_API_KEY=<your_api_key>
Saved credentials are stored in ~/.xcrawl/config.json.
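In scripts and CI jobs, where an interactive login prompt is unwelcome, it can help to check up front whether a key is available. The following is a minimal sketch that relies only on the XCRAWL_API_KEY variable and the ~/.xcrawl/config.json path named here. The function name xcrawl_has_credentials is hypothetical and not part of XCrawl CLI, and the check only tests that the file exists, not that it holds a valid key.

```shell
# Hypothetical helper, not part of XCrawl CLI.
# Succeeds if XCRAWL_API_KEY is set and non-empty, or if the saved
# config file exists. It does not validate the key itself.
xcrawl_has_credentials() {
  if [ -n "${XCRAWL_API_KEY:-}" ]; then
    return 0
  fi
  [ -f "${HOME}/.xcrawl/config.json" ]
}

if xcrawl_has_credentials; then
  echo "credentials found"
else
  echo "no credentials: run 'xcrawl login' or export XCRAWL_API_KEY"
fi
```

A script could call this before any authenticated command and exit early instead of waiting on a prompt.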
If no API key is configured, running xcrawl or any authenticated command prompts:
XCrawl CLI
Turn websites into LLM-ready data
Welcome! To get started, authenticate with your XCrawl account.
1. Login with browser (recommended)
2. Enter API key manually
Tip: You can also set XCRAWL_API_KEY environment variable
Common Commands
xcrawl init [-y] [--browser | --api-key <key>]
xcrawl login [--browser | --api-key <key>] [--json]
xcrawl scrape <url...> [--format markdown|json|html|screenshot|text] [--output <path>] [--json] [--input <path>] [--concurrency <n>] [--interval <ms>] [--wait-timeout <ms>]
xcrawl llm --list-models
xcrawl llm <model> [key=value...] [--param <key=value>] [--describe] [--json]
xcrawl search <query> [--limit <n>] [--json]
xcrawl serp --list-engines
xcrawl serp <engine> [key=value...] [--param <key=value>] [--describe] [--json]
xcrawl scraper --list-scrapers
xcrawl scraper <scraper> [key=value...] [--param <key=value>] [--describe] [--json]
xcrawl map <url> [--limit <n>] [--json]
xcrawl crawl <url> [--wait]
xcrawl crawl status <job-id>
xcrawl status [--json]
Batch scrape example:
xcrawl scrape --input ./urls.txt --concurrency 3 --wait-timeout 120000 --json
urls.txt should contain one URL per line. Lines starting with # are ignored.
When multiple URLs are provided, XCrawl CLI creates a batch scrape job, waits for completion, and then fetches each referenced scrape result.
In default text mode, batch scraping prints a short job summary before file output paths.
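To preview which lines of an input file count as URLs under the comment rule, the file can be filtered before it is handed to the CLI. This is a sketch only: list_batch_urls is a hypothetical helper, and skipping blank lines is an assumption that this README does not state.

```shell
# Hypothetical helper: print the lines of a URL file that are not comments.
# Lines starting with '#' are dropped (documented behavior); blank lines
# are also dropped, which is an assumption.
list_batch_urls() {
  grep -v -e '^#' -e '^[[:space:]]*$' "$1"
}

# Build a small example file and preview it.
urls_file="$(mktemp)"
cat > "$urls_file" <<'EOF'
# product pages
https://example.com/a
https://example.com/b

# docs
https://example.com/docs
EOF

list_batch_urls "$urls_file"
echo "URL count: $(list_batch_urls "$urls_file" | wc -l | tr -d ' ')"

# The file could then be passed to the CLI as documented:
# xcrawl scrape --input "$urls_file" --concurrency 3 --wait-timeout 120000 --json
```

With the example file, the preview lists three URLs, so the resulting batch job would reference three scrape results.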
LLM examples:
xcrawl llm --list-models
xcrawl llm chatgpt_model --describe
xcrawl llm chatgpt_model 'prompt=What is XCrawl CLI?' --param location=US --json
SERP examples:
xcrawl serp --list-engines
xcrawl serp bing_shopping --describe
xcrawl serp google_search 'q=xcrawl cli' --param page=2 --param no_cache=true --json
Web scraper examples:
xcrawl scraper --list-scrapers
xcrawl scraper reddit_user_posts --describe
xcrawl scraper reddit_user_posts 'url_list=["https://www.reddit.com/r/test/comments/abc"]' --json
Configuration
Manage config values:
xcrawl config keys
xcrawl config get api-base-url
xcrawl config set api-base-url https://run.xcrawl.com
Configuration priority, highest first:
- CLI flags
- Environment variables
- Local config file ~/.xcrawl/config.json
- Defaults
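The priority order can be read as "the first source that supplies a value wins". The sketch below illustrates that rule and is not the CLI's actual implementation: resolve_setting is a hypothetical helper, the staging and file URLs are made-up stand-ins, and treating an empty value as unset is an assumption.

```shell
# Hypothetical helper: print the first non-empty argument.
# Arguments are passed in priority order:
# CLI flag, environment variable, config file value, built-in default.
resolve_setting() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example: no flag given, environment variable set, config file value present.
flag_value=""
env_value="https://staging.example.test"   # stand-in for an environment variable value
file_value="https://file.example.test"     # stand-in for a value read from config.json
default_value="https://run.xcrawl.com"     # documented default base URL

resolve_setting "$flag_value" "$env_value" "$file_value" "$default_value"
```

Here the environment value is printed, because no flag was given and the environment outranks both the config file and the default.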
Environment variables:
- XCRAWL_API_KEY
- XCRAWL_API_BASE_URL
- XCRAWL_TIMEOUT_MS
- XCRAWL_OUTPUT_DIR
- XCRAWL_DEBUG
Output
- Default output is human-readable text.
- xcrawl status starts with "XCrawl cli v<version>" before account usage details.
- Use --json for machine-readable output.
- Use --output <path> to save output to a file.
- Multi-URL scrape defaults to .xcrawl/ when no output path is provided.
API Routing Notes
- Default API base URL is https://run.xcrawl.com.
- Multi-URL scrape uses POST /v1/batch/scrape, polls GET /v1/batch/scrape/{batch_scrape_id}, and fetches completed results from GET /v1/scrape/{scrape_id}.
- llm executes against https://run.xcrawl.com/v1/llm and loads metadata from https://api.xcrawl.com/web_v1/scraping/xcrawl*.
- scraper executes against https://run.xcrawl.com/v1/data and loads metadata from https://api.xcrawl.com/web_v1/scraping/xcrawl*.
- status always calls https://api.xcrawl.com/web_v1/user/credit-user-info.
- status authentication is sent as query param: app_key=<your_api_key>.
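When debugging with a proxy or reading request logs, it can be useful to know the exact URLs these notes imply. The sketch below only assembles those URLs from the paths listed above. The function names and the example IDs are hypothetical, request bodies and authentication headers for the run.xcrawl.com endpoints are not documented here and are deliberately left out, and the key is inserted without URL encoding, which is an assumption.

```shell
# Base URLs taken from the routing notes; names below are illustrative only.
RUN_BASE_URL="https://run.xcrawl.com"
ACCOUNT_BASE_URL="https://api.xcrawl.com"

# POST target that creates a batch scrape job.
batch_create_url() { printf '%s/v1/batch/scrape\n' "$RUN_BASE_URL"; }

# GET target polled until the batch job completes; $1 is the batch_scrape_id.
batch_poll_url() { printf '%s/v1/batch/scrape/%s\n' "$RUN_BASE_URL" "$1"; }

# GET target for one completed result; $1 is a scrape_id.
scrape_result_url() { printf '%s/v1/scrape/%s\n' "$RUN_BASE_URL" "$1"; }

# Status endpoint; $1 is the API key, sent as the app_key query parameter.
status_url() {
  printf '%s/web_v1/user/credit-user-info?app_key=%s\n' "$ACCOUNT_BASE_URL" "$1"
}

# Made-up IDs, for illustration.
batch_create_url
batch_poll_url "batch_123"
scrape_result_url "scrape_456"
status_url "<your_api_key>"
```

Because status places the key in the query string, the full status URL should be treated as a secret in logs and shell history.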
