@scraping-proxy/auth-cli
v0.5.0
CLI tool that captures browser authentication state (cookies, localStorage) for use with `@scraping-proxy/client`.
Some sites require a logged-in session that cannot be replicated via HTTP headers alone. This tool opens a real browser, lets you log in manually, then saves the resulting auth state to a local file. That file is passed to client.scrape() and injected into the Playwright context on the server — entirely in memory, never stored.
Prerequisites
Node.js ≥ 20. For Google or other OAuth providers, a real browser must be used:
# Install system Chrome if not already present, then:
pnpm exec playwright install chrome
For non-OAuth sites the bundled Chromium works fine:
pnpm exec playwright install chromium
Usage
scraping-proxy-auth <url> [options]
Arguments:
url Page to open (typically the login page)
Options:
-o, --output <file> Where to write the state file (default: state.json)
-b, --browser <name> Browser to use: chromium | chrome | edge (default: chromium)
-h, --help Show help
Basic
scraping-proxy-auth https://example.com/login
Opens a browser at the URL. Log in, then press Enter in the terminal. The auth state is saved to ./state.json.
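For reference, the saved file follows Playwright's storage-state shape: a `cookies` array plus per-origin `localStorage` entries. A sketch of what `./state.json` may contain (all values below are illustrative, not real credentials):

```json
{
  "cookies": [
    {
      "name": "session_id",
      "value": "abc123",
      "domain": "example.com",
      "path": "/",
      "expires": 1767225600,
      "httpOnly": true,
      "secure": true,
      "sameSite": "Lax"
    }
  ],
  "origins": [
    {
      "origin": "https://example.com",
      "localStorage": [
        { "name": "auth_token", "value": "illustrative-value" }
      ]
    }
  ]
}
```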
Google / OAuth sites
Google blocks Playwright's bundled Chromium. Use --browser chrome to launch your system Chrome instead:
scraping-proxy-auth https://accounts.google.com --browser chrome
Custom output path
scraping-proxy-auth https://example.com/login --output ./auth/example.json
Using the state file
Pass the file contents to client.scrape(). The server injects it into the Playwright browser context and discards it after the request — it is never written to disk or stored in the database.
import { readFile } from 'node:fs/promises';
import { ScrapingProxyClient } from '@scraping-proxy/client';
import type { BrowserStorageState } from '@scraping-proxy/client';
const authState: BrowserStorageState = JSON.parse(
await readFile('./state.json', 'utf-8')
);
const client = new ScrapingProxyClient({
baseUrl: 'https://your-proxy.example.com',
apiToken: 'oat_xxx',
});
const { data: job } = await client.scrape({
url: 'https://example.com/dashboard',
scrapeMode: 'browser',
authState,
selectors: {
title: { selector: 'h1' },
},
});
const result = await client.waitForJob(job.jobId);
console.log(result.result);
Security
- The state file contains session cookies. Treat it like a password — do not commit it to version control.
- Add state.json (or your custom output path) to .gitignore.
- The server never persists the auth state: it lives only in the Redis queue payload (ephemeral) and in memory during scrape execution.
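To keep the state file out of version control, one option is an idempotent append (this sketch assumes the default state.json path; substitute your --output path if you changed it):

```shell
# Add state.json to .gitignore only if it is not already listed
grep -qxF 'state.json' .gitignore 2>/dev/null || echo 'state.json' >> .gitignore
```

Running it repeatedly will not create duplicate entries.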
