@brightdata/cli
v0.1.4
Command-line interface for Bright Data. Scrape, search, extract structured data, and automate browsers directly from your terminal.
Overview
@brightdata/cli is the official npm package for the Bright Data CLI. It installs the brightdata command (with bdata as a shorthand alias) for access to the full Bright Data API surface:
| Command | What it does |
|---|---|
| brightdata scrape | Scrape any URL — bypasses CAPTCHAs, JS rendering, anti-bot protections |
| brightdata search | Google / Bing / Yandex search with structured JSON output |
| brightdata pipelines | Extract structured data from 40+ platforms (Amazon, LinkedIn, TikTok…) |
| brightdata zones | List and inspect your Bright Data proxy zones |
| brightdata budget | View account balance and per-zone cost & bandwidth |
| brightdata skill | Install Bright Data AI agent skills into your coding agent |
| brightdata add mcp | Add the Bright Data MCP server to Claude Code, Cursor, or Codex |
| brightdata config | Manage CLI configuration |
| brightdata init | Interactive setup wizard |
Table of Contents
- Installation
- Quick Start
- Authentication
- Commands
- Configuration
- Environment Variables
- Output Modes
- Pipe-Friendly Usage
- Dataset Types Reference
- Troubleshooting
Installation
Requires Node.js ≥ 20
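Before installing, you can verify the Node.js requirement from a script (a sketch; it parses `node --version` output such as `v20.11.0`):

```shell
# Check the Node.js >= 20 requirement (falls back to v0.0.0 if node is missing).
version=$(node --version 2>/dev/null || echo v0.0.0)
major=${version#v}      # strip the leading "v"
major=${major%%.*}      # keep only the major version number
if [ "$major" -ge 20 ]; then
  echo "Node.js $version meets the requirement"
else
  echo "Node.js 20 or newer is required (found $version)"
fi
```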
macOS / Linux
curl -fsSL https://cli.brightdata.com/install.sh | sh
Windows
npm install -g @brightdata/cli
Or install manually on any platform
npm install -g @brightdata/cli
You can also run without installing:
npx --yes --package @brightdata/cli brightdata <command>
Quick Start
# 1. Run the interactive setup wizard
brightdata init
# 2. Scrape a page as markdown
brightdata scrape https://example.com
# 3. Search Google
brightdata search "web scraping best practices"
# 4. Extract a LinkedIn profile
brightdata pipelines linkedin_person_profile "https://linkedin.com/in/username"
# 5. Check your account balance
brightdata budget
# 6. Install the Bright Data MCP server into your coding agent
brightdata add mcp
Authentication
Get your API key from brightdata.com/cp/setting/users.
# Interactive — opens browser, saves key automatically
brightdata login
# Non-interactive — pass key directly
brightdata login --api-key <your-api-key>
# Environment variable — no login required
export BRIGHTDATA_API_KEY=your-api-key
On first login the CLI checks for required zones (cli_unlocker, cli_browser) and creates them automatically if missing.
# Clear saved credentials
brightdata logout
brightdata add mcp uses the API key stored by brightdata login. It does not currently read BRIGHTDATA_API_KEY or the global --api-key flag, so log in first before using it.
Commands
init
Interactive setup wizard. The recommended way to get started.
brightdata init
Walks through: API key detection → zone selection → default output format → quick-start examples.
| Flag | Description |
|---|---|
| --skip-auth | Skip the authentication step |
| -k, --api-key <key> | Provide API key directly |
scrape
Scrape any URL using Bright Data's Web Unlocker. Handles CAPTCHAs, JavaScript rendering, and anti-bot protections automatically.
brightdata scrape <url> [options]
| Flag | Description |
|---|---|
| -f, --format <fmt> | markdown · html · screenshot · json (default: markdown) |
| --country <code> | Geo-target by ISO country code (e.g. us, de, jp) |
| --zone <name> | Web Unlocker zone name |
| --mobile | Use a mobile user agent |
| --async | Submit async job, return a snapshot ID |
| -o, --output <path> | Write output to file |
| --json / --pretty | JSON output (raw / indented) |
| -k, --api-key <key> | Override API key |
Examples
# Scrape as markdown (default)
brightdata scrape https://news.ycombinator.com
# Scrape as raw HTML
brightdata scrape https://example.com -f html
# US geo-targeting, save to file
brightdata scrape https://amazon.com -f json --country us -o product.json
# Pipe to a markdown viewer
brightdata scrape https://docs.github.com | glow -
# Async — returns a snapshot ID you can poll with `status`
brightdata scrape https://example.com --async
search
Search Google, Bing, or Yandex via Bright Data's SERP API. Google results include structured data (organic results, ads, people-also-ask, related searches).
brightdata search <query> [options]
| Flag | Description |
|---|---|
| --engine <name> | google · bing · yandex (default: google) |
| --country <code> | Localized results (e.g. us, de) |
| --language <code> | Language code (e.g. en, fr) |
| --page <n> | Page number, 0-indexed (default: 0) |
| --type <type> | web · news · images · shopping (default: web) |
| --device <type> | desktop · mobile |
| --zone <name> | SERP zone name |
| -o, --output <path> | Write output to file |
| --json / --pretty | JSON output (raw / indented) |
| -k, --api-key <key> | Override API key |
Examples
# Formatted table output (default)
brightdata search "typescript best practices"
# German localized results
brightdata search "restaurants berlin" --country de --language de
# News search
brightdata search "AI regulation" --type news
# Page 2 of results
brightdata search "web scraping" --page 1
# Extract just the URLs
brightdata search "open source scraping" --json | jq -r '.organic[].link'
# Search Bing
brightdata search "bright data pricing" --engine bing
pipelines
Extract structured data from 40+ platforms using Bright Data's Web Scraper API. Triggers an async collection job, polls until ready, and returns results.
brightdata pipelines <type> [params...] [options]
| Flag | Description |
|---|---|
| --format <fmt> | json · csv · ndjson · jsonl (default: json) |
| --timeout <seconds> | Polling timeout (default: 600) |
| -o, --output <path> | Write output to file |
| --json / --pretty | JSON output (raw / indented) |
| -k, --api-key <key> | Override API key |
# List all available dataset types
brightdata pipelines list
Examples
# LinkedIn profile
brightdata pipelines linkedin_person_profile "https://linkedin.com/in/username"
# Amazon product → CSV
brightdata pipelines amazon_product "https://amazon.com/dp/B09V3KXJPB" \
--format csv -o product.csv
# Instagram profile
brightdata pipelines instagram_profiles "https://instagram.com/username"
# Amazon search by keyword
brightdata pipelines amazon_product_search "laptop" "https://amazon.com"
# Google Maps reviews
brightdata pipelines google_maps_reviews "https://maps.google.com/..." 7
# YouTube comments (top 50)
brightdata pipelines youtube_comments "https://youtube.com/watch?v=..." 50
See Dataset Types Reference for the full list.
status
Check the status of an async snapshot job (returned by --async or pipelines).
brightdata status <job-id> [options]
| Flag | Description |
|---|---|
| --wait | Poll until the job completes |
| --timeout <seconds> | Polling timeout (default: 600) |
| -o, --output <path> | Write output to file |
| --json / --pretty | JSON output (raw / indented) |
| -k, --api-key <key> | Override API key |
# Check current status
brightdata status s_abc123xyz
# Block until complete
brightdata status s_abc123xyz --wait --pretty
# Custom timeout (5 minutes)
brightdata status s_abc123xyz --wait --timeout 300
zones
List and inspect your Bright Data proxy zones.
brightdata zones # List all active zones
brightdata zones info <name> # Show full details for a zone
# Export all zones as JSON
brightdata zones --json -o zones.json
# Inspect a specific zone
brightdata zones info my_unlocker_zone --pretty
budget
View your account balance and per-zone cost and bandwidth usage. Read-only — no writes to the API.
brightdata budget # Show account balance (quick view)
brightdata budget balance # Account balance + pending charges
brightdata budget zones # Cost & bandwidth table for all zones
brightdata budget zone <name> # Detailed cost & bandwidth for one zone
| Flag | Description |
|---|---|
| --from <datetime> | Start of date range (e.g. 2024-01-01T00:00:00) |
| --to <datetime> | End of date range |
| --json / --pretty | JSON output (raw / indented) |
| -k, --api-key <key> | Override API key |
# Current account balance
brightdata budget
# Zone costs for January 2024
brightdata budget zones --from 2024-01-01T00:00:00 --to 2024-02-01T00:00:00
# Detailed view of a specific zone
brightdata budget zone my_unlocker_zone
skill
Install Bright Data AI agent skills into your coding agent (Claude Code, Cursor, Copilot, etc.). Skills provide your agent with context and instructions for using Bright Data APIs effectively.
brightdata skill add # Interactive picker — choose skill + agent
brightdata skill add <name> # Install a specific skill directly
brightdata skill list # List all available Bright Data skills
Available skills
| Skill | Description |
|---|---|
| search | Search Google and get structured JSON results |
| scrape | Scrape any webpage as clean markdown with bot bypass |
| data-feeds | Extract structured data from 40+ websites |
| bright-data-mcp | Orchestrate 60+ Bright Data MCP tools |
| bright-data-best-practices | Reference knowledge base for writing Bright Data code |
# Interactive — select skills and choose which agents to install to
brightdata skill add
# Install the scrape skill directly
brightdata skill add scrape
# See what's available
brightdata skill list
add mcp
Write a Bright Data MCP server entry into Claude Code, Cursor, or Codex config files using the API key already stored by brightdata login.
brightdata add mcp # Interactive agent + scope prompts
brightdata add mcp --agent claude-code --global
brightdata add mcp --agent claude-code,cursor --project
brightdata add mcp --agent codex --global
| Flag | Description |
|---|---|
| --agent <agents> | Comma-separated targets: claude-code,cursor,codex |
| --global | Install to the agent's global config file |
| --project | Install to the current project's config file |
Config targets
| Agent | Global path | Project path |
|---|---|---|
| Claude Code | ~/.claude.json | .claude/settings.json |
| Cursor | ~/.cursor/mcp.json | .cursor/mcp.json |
| Codex | $CODEX_HOME/mcp.json or ~/.codex/mcp.json | Not supported |
The command writes the MCP server under mcpServers["bright-data"]:
{
"mcpServers": {
"bright-data": {
"command": "npx",
"args": ["@brightdata/mcp"],
"env": {
"API_TOKEN": "<stored-api-key>"
}
}
}
}
Behavior notes:
- Existing config is preserved; only mcpServers["bright-data"] is added or replaced.
- If the target config contains invalid JSON, the CLI warns and offers to overwrite it in interactive mode.
- In non-interactive mode, pass both --agent and the appropriate scope flag to skip prompts.
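The merge behavior can be sketched with jq (a sketch, not the CLI's actual implementation; it assumes jq is installed, and the server entry mirrors the JSON shown above):

```shell
# Add or replace the bright-data entry while preserving existing servers.
existing='{"mcpServers":{"other-server":{"command":"foo"}}}'
echo "$existing" | jq '.mcpServers["bright-data"] = {
  "command": "npx",
  "args": ["@brightdata/mcp"],
  "env": { "API_TOKEN": "<stored-api-key>" }
}'
```

The output keeps `other-server` untouched alongside the new `bright-data` entry.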
config
View and manage CLI configuration.
brightdata config # Show all config
brightdata config get <key> # Get a single value
brightdata config set <key> <value> # Set a value
| Key | Description |
|---|---|
| default_zone_unlocker | Default zone for scrape and search |
| default_zone_serp | Default zone for search (overrides unlocker zone) |
| default_format | Default output format: markdown or json |
| api_url | Override the Bright Data API base URL |
brightdata config set default_zone_unlocker my_zone
brightdata config set default_format json
login / logout
brightdata login # Interactive login
brightdata login --api-key <key> # Non-interactive
brightdata logout # Clear saved credentials
Configuration
Config is stored in an OS-appropriate location:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/brightdata-cli/ |
| Linux | ~/.config/brightdata-cli/ |
| Windows | %APPDATA%\brightdata-cli\ |
Two files are stored:
- credentials.json: API key
- config.json: zones, output format, preferences
Priority order (highest → lowest):
CLI flags → Environment variables → config.json → Defaults
Environment Variables
| Variable | Description |
|---|---|
| BRIGHTDATA_API_KEY | API key (overrides stored credentials) |
| BRIGHTDATA_UNLOCKER_ZONE | Default Web Unlocker zone |
| BRIGHTDATA_SERP_ZONE | Default SERP zone |
| BRIGHTDATA_POLLING_TIMEOUT | Default polling timeout in seconds |
BRIGHTDATA_API_KEY=xxx BRIGHTDATA_UNLOCKER_ZONE=my_zone \
brightdata scrape https://example.com
Output Modes
Every command supports:
| Mode | Flag | Behavior |
|---|---|---|
| Human-readable | (default) | Formatted table or markdown, with colors |
| JSON | --json | Compact JSON to stdout |
| Pretty JSON | --pretty | Indented JSON to stdout |
| File | -o <path> | Write to file; format inferred from extension |
Auto-detected file formats:
| Extension | Format |
|---|---|
| .json | JSON |
| .md | Markdown |
| .html | HTML |
| .csv | CSV |
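The extension-based inference in the table above can be sketched as a small case statement (a hypothetical helper, not the CLI's actual code; the fallback for unknown extensions is an assumption):

```shell
# Hypothetical helper mirroring the extension table above.
infer_format() {
  case "$1" in
    *.json) echo json ;;
    *.md)   echo markdown ;;
    *.html) echo html ;;
    *.csv)  echo csv ;;
    *)      echo markdown ;;  # assumed fallback; check the CLI's actual behavior
  esac
}

infer_format product.csv   # prints "csv"
```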
Pipe-Friendly Usage
When stdout is not a TTY, colors and spinners are automatically disabled. Errors go to stderr, data to stdout.
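Your own wrapper scripts can make the same TTY check with the standard `[ -t 1 ]` test (a sketch of the general technique, not the CLI's internals):

```shell
# Branch on whether stdout is connected to a terminal.
if [ -t 1 ]; then
  echo "stdout is a TTY: human-readable output with colors"
else
  echo "stdout is piped: plain, machine-readable output"
fi
```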
# Extract URLs from search results
brightdata search "nodejs tutorials" --json | jq -r '.organic[].link'
# Scrape and view with a markdown reader
brightdata scrape https://docs.github.com | glow -
# Save scraped content to a file
brightdata scrape https://example.com -f markdown > page.md
# Amazon product data as CSV
brightdata pipelines amazon_product "https://amazon.com/dp/xxx" --format csv > product.csv
# Chain search → scrape
brightdata search "top open source projects" --json \
| jq -r '.organic[0].link' \
  | xargs brightdata scrape
Dataset Types Reference
brightdata pipelines list # See all types in your terminal
E-Commerce
| Type | Platform |
|---|---|
| amazon_product | Amazon product page |
| amazon_product_reviews | Amazon reviews |
| amazon_product_search | Amazon search results |
| walmart_product | Walmart product page |
| walmart_seller | Walmart seller profile |
| ebay_product | eBay listing |
| bestbuy_products | Best Buy |
| etsy_products | Etsy |
| homedepot_products | Home Depot |
| zara_products | Zara |
| google_shopping | Google Shopping |
Professional Networks
| Type | Platform |
|---|---|
| linkedin_person_profile | LinkedIn person |
| linkedin_company_profile | LinkedIn company |
| linkedin_job_listings | LinkedIn jobs |
| linkedin_posts | LinkedIn posts |
| linkedin_people_search | LinkedIn people search |
| crunchbase_company | Crunchbase |
| zoominfo_company_profile | ZoomInfo |
Social Media
| Type | Platform |
|---|---|
| instagram_profiles | Instagram profiles |
| instagram_posts | Instagram posts |
| instagram_reels | Instagram reels |
| instagram_comments | Instagram comments |
| facebook_posts | Facebook posts |
| facebook_marketplace_listings | Facebook Marketplace |
| facebook_company_reviews | Facebook reviews |
| facebook_events | Facebook events |
| tiktok_profiles | TikTok profiles |
| tiktok_posts | TikTok posts |
| tiktok_shop | TikTok shop |
| tiktok_comments | TikTok comments |
| x_posts | X (Twitter) posts |
| youtube_profiles | YouTube channels |
| youtube_videos | YouTube videos |
| youtube_comments | YouTube comments |
| reddit_posts | Reddit posts |
Other
| Type | Platform |
|---|---|
| google_maps_reviews | Google Maps reviews |
| google_play_store | Google Play |
| apple_app_store | Apple App Store |
| reuter_news | Reuters news |
| github_repository_file | GitHub repository files |
| yahoo_finance_business | Yahoo Finance |
| zillow_properties_listing | Zillow |
| booking_hotel_listings | Booking.com |
Troubleshooting
Error: No Web Unlocker zone specified
brightdata config set default_zone_unlocker <your-zone-name>
# or
export BRIGHTDATA_UNLOCKER_ZONE=<your-zone-name>
Error: Invalid or expired API key
brightdata login
Error: Access denied
Check zone permissions in the Bright Data control panel.
Error: Rate limit exceeded
Wait a moment and retry. Use --async for large jobs to avoid timeouts.
Async job is too slow
brightdata pipelines amazon_product <url> --timeout 1200
# or
export BRIGHTDATA_POLLING_TIMEOUT=1200
Garbled output in non-interactive terminal
Colors and spinners are disabled automatically when not in a TTY. If you still see ANSI codes, add | cat at the end of your command.
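Stray escape sequences can also be stripped explicitly with sed (a sketch; uses bash `$'...'` quoting for the literal ESC byte):

```shell
# Strip ANSI color/style sequences from piped output.
printf '\033[31mred text\033[0m\n' | sed $'s/\x1b\\[[0-9;]*m//g'
# prints: red text
```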
