@scraping-proxy/auth-cli
v0.5.0
CLI tool that captures browser authentication state (cookies, localStorage) for use with `@scraping-proxy/client`.
Some sites require a logged-in session that cannot be replicated via HTTP headers alone. This tool opens a real browser, lets you log in manually, then saves the resulting auth state to a local file. That file is passed to client.scrape() and injected into the Playwright context on the server — entirely in memory, never stored.
Prerequisites
Node.js ≥ 20. For Google or other OAuth providers, a real browser must be used:
# Install system Chrome if not already present, then:
pnpm exec playwright install chrome
For non-OAuth sites the bundled Chromium works fine:
pnpm exec playwright install chromium
Usage
scraping-proxy-auth <url> [options]
Arguments:
url Page to open (typically the login page)
Options:
-o, --output <file> Where to write the state file (default: state.json)
-b, --browser <name> Browser to use: chromium | chrome | edge (default: chromium)
-h, --help Show help
Basic
scraping-proxy-auth https://example.com/login
Opens a browser at the URL. Log in, then press Enter in the terminal. The auth state is saved to ./state.json.
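For reference, the saved file follows Playwright's storage-state shape: a `cookies` array plus per-origin `localStorage` entries. A sketch of what `./state.json` may contain (all values below are illustrative, not real credentials):

```json
{
  "cookies": [
    {
      "name": "session_id",
      "value": "abc123",
      "domain": "example.com",
      "path": "/",
      "expires": 1767225600,
      "httpOnly": true,
      "secure": true,
      "sameSite": "Lax"
    }
  ],
  "origins": [
    {
      "origin": "https://example.com",
      "localStorage": [
        { "name": "auth_token", "value": "illustrative-value" }
      ]
    }
  ]
}
```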
Google / OAuth sites
Google blocks Playwright's bundled Chromium. Use --browser chrome to launch your system Chrome instead:
scraping-proxy-auth https://accounts.google.com --browser chrome
Custom output path
scraping-proxy-auth https://example.com/login --output ./auth/example.json
Using the state file
Pass the file contents to client.scrape(). The server injects it into the Playwright browser context and discards it after the request — it is never written to disk or stored in the database.
import { readFile } from 'node:fs/promises';
import { ScrapingProxyClient } from '@scraping-proxy/client';
import type { BrowserStorageState } from '@scraping-proxy/client';
const authState: BrowserStorageState = JSON.parse(
await readFile('./state.json', 'utf-8')
);
const client = new ScrapingProxyClient({
baseUrl: 'https://your-proxy.example.com',
apiToken: 'oat_xxx',
});
const { data: job } = await client.scrape({
url: 'https://example.com/dashboard',
scrapeMode: 'browser',
authState,
selectors: {
title: { selector: 'h1' },
},
});
const result = await client.waitForJob(job.jobId);
console.log(result.result);
Security
- The state file contains session cookies. Treat it like a password — do not commit it to version control.
- Add state.json (or your custom output path) to .gitignore.
- The server never persists the auth state: it lives only in the Redis queue payload (ephemeral) and in memory during scrape execution.
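To keep the state file out of version control, one option is an idempotent append (this sketch assumes the default state.json path; substitute your --output path if you changed it):

```shell
# Add state.json to .gitignore only if it is not already listed
grep -qxF 'state.json' .gitignore 2>/dev/null || echo 'state.json' >> .gitignore
```

Running it repeatedly will not create duplicate entries.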
