octoparse-cli
v0.1.15
Standalone Octoparse engine CLI that runs local extraction without the Electron client.
octoparse-cli
Command-line runner for Octoparse extraction tasks.
octoparse can list cloud tasks, run tasks locally, control active local
runs, and export collected data.
Requirements
- Node.js 20 or newer
- A valid Octoparse API key
Quick start
1. Install
Install the CLI globally:
npm install -g octoparse-cli
The installed command is:
octoparse
Check the installation:
octoparse --version
octoparse doctor
2. Log in with an API key
Most commands require an Octoparse API key. Run:
octoparse auth login
auth login opens the API key page automatically in a browser when possible,
then verifies and saves the key locally.
Create the key here:
https://www.octoparse.com/console/account-center/api-keys
If you already copied the key, you can save time and pass it directly:
octoparse auth login XXXXX
For CI or scripts, set the key with an environment variable instead:
OCTO_ENGINE_API_KEY=xxx octoparse task list --json
3. Use the CLI
Query the task list:
octoparse task list
octoparse task list --page 2 --page-size 20
Query a single task:
octoparse task inspect <taskId>
Run a task locally:
octoparse run <taskId>
Run in the background:
octoparse run <taskId> --detach
Query the local run status, or stop the local process running a task:
octoparse local status <taskId>
octoparse local stop <taskId>
Note: local run status is tracked by this CLI only and is not synchronized with the Octoparse desktop client status.
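For scripts that drive a detached run, the commands above can be wrapped in a small helper. This is a minimal Python sketch and not part of octoparse-cli: the function names local_control_argv and run_detached are hypothetical, and only subcommands and flags shown in this README are used.

```python
import subprocess

# Local control actions that appear in this README.
LOCAL_ACTIONS = ("status", "pause", "resume", "stop")


def local_control_argv(action, task_id, json_output=False):
    """Build the argument list for `octoparse local <action> <taskId>`."""
    if action not in LOCAL_ACTIONS:
        raise ValueError(f"unsupported local action: {action!r}")
    argv = ["octoparse", "local", action, task_id]
    if json_output:
        # The README shows --json with `local status`; using it with the
        # other actions is an assumption.
        argv.append("--json")
    return argv


def run_detached(task_id):
    """Start a background run with `octoparse run <taskId> --detach`."""
    return subprocess.run(
        ["octoparse", "run", task_id, "--detach"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
```

A script would typically call run_detached(task_id) once, then pass the result of local_control_argv("status", task_id) to subprocess.run whenever it needs to check on the run.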
Export data:
octoparse data export <taskId> --source local --format xlsx
octoparse data export <taskId> --source cloud --format csv
Common commands
# Help and diagnostics
octoparse --help
octoparse doctor
octoparse browser doctor
# Authentication
octoparse auth login
octoparse auth login XXXXX
octoparse auth status
octoparse auth logout
# Task discovery
octoparse task list
octoparse task list --page 2 --page-size 20
octoparse task list --keyword news --page 2 --page-size 10
octoparse task inspect <taskId>
# Local extraction
octoparse run <taskId>
octoparse run <taskId> --jsonl
octoparse run <taskId> --detach
octoparse local status <taskId>
octoparse local pause <taskId>
octoparse local resume <taskId>
octoparse local stop <taskId>
# Cloud extraction
octoparse cloud start <taskId>
octoparse cloud stop <taskId>
octoparse cloud status <taskId>
octoparse cloud history <taskId>
# Data
octoparse data history <taskId> --source local
octoparse data history <taskId> --source cloud
octoparse data export <taskId> --source local --format xlsx
octoparse data export <taskId> --source cloud --format csv
By default, local run artifacts are stored in ~/.octoparse/runs. If you
customize the run artifact directory with --output, use the same --output
again when reading local history or exporting local data:
octoparse run <taskId> --output ./runs
octoparse data history <taskId> --source local --output ./runs
octoparse data export <taskId> --source local --output ./runs --format xlsx
API key
Most commands require an API key. Only setup and diagnostic commands such as
--help, --version, doctor, browser doctor, capabilities, and auth
can run before login.
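A wrapper script can check up front whether a command will need a key and where that key would come from. The sketch below is a hedged illustration in Python, not the CLI's own logic: needs_api_key and credential_source are hypothetical names, the command list is copied from the paragraph above, and the lookup order follows the credential precedence listed later in this section. It only checks that the credentials file exists and does not read it, because the file format is not documented here.

```python
import os
from pathlib import Path

ENV_VAR = "OCTO_ENGINE_API_KEY"

# Setup and diagnostic entry points that this README says work before login.
PRE_LOGIN_FLAGS = {"--help", "--version"}
PRE_LOGIN_COMMANDS = [("doctor",), ("browser", "doctor"), ("capabilities",), ("auth",)]


def needs_api_key(args):
    """Return True if `octoparse <args>` is expected to require an API key."""
    if any(arg in PRE_LOGIN_FLAGS for arg in args):
        return False
    # Compare the leading non-flag words against the known pre-login commands.
    words = [arg for arg in args if not arg.startswith("-")]
    return not any(tuple(words[: len(cmd)]) == cmd for cmd in PRE_LOGIN_COMMANDS)


def credential_source(env=None, home=None):
    """Return where a key would come from, or None if no source is found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if env.get(ENV_VAR):  # 1. the environment variable takes precedence
        return ENV_VAR
    saved = home / ".octoparse" / "credentials.json"
    if saved.is_file():  # 2. the key saved by `octoparse auth login`
        return str(saved)
    return None
```

For example, a CI job could fail early when needs_api_key(args) is true and credential_source() returns None, instead of waiting for the CLI to report a missing key.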
Create API keys in the Octoparse console:
https://www.octoparse.com/console/account-center/api-keys
For interactive use:
octoparse auth login
If the API key is already copied:
octoparse auth login XXXXX
Use --no-open if you want to copy the URL manually:
octoparse auth login --no-open
For CI or scripts:
OCTO_ENGINE_API_KEY=xxx octoparse task list --json
Credential precedence:
1. OCTO_ENGINE_API_KEY
2. ~/.octoparse/credentials.json
Local task files
You can run or validate a local task definition file:
octoparse task validate <taskId> --task-file ./task.json
octoparse run <taskId> --task-file ./task.json
octoparse run sample --task-file ./sample.otd
Supported local task file types:
- .json
- .xml
- .otd
Kernel browser tasks are not supported in this CLI.
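A script that launches runs from task files can reject unsupported file types before calling the CLI. The following Python sketch assumes the three extensions listed above are the complete set and treats them case-insensitively, which is an assumption; is_supported_task_file and run_argv are hypothetical helper names.

```python
from pathlib import Path

# File types listed in this README as supported local task files.
SUPPORTED_TASK_FILE_SUFFIXES = {".json", ".xml", ".otd"}


def is_supported_task_file(path):
    """Check the file extension only; the file content is not inspected."""
    return Path(path).suffix.lower() in SUPPORTED_TASK_FILE_SUFFIXES


def run_argv(task_id, task_file):
    """Build the arguments for `octoparse run <taskId> --task-file <file>`."""
    if not is_supported_task_file(task_file):
        raise ValueError(f"unsupported task file type: {task_file}")
    return ["octoparse", "run", task_id, "--task-file", str(task_file)]
```

The extension check is only a cheap pre-filter. Whether the definition itself is valid is still decided by octoparse task validate.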
Machine-readable output
Use --json for one JSON response:
octoparse task list --json
octoparse local status <taskId> --json
Use --jsonl for local run event streams:
octoparse run <taskId> --jsonl
The stream includes captcha and proxy events when the runtime asks the CLI
to resolve CAPTCHA or proxy resources automatically.
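Both output modes can be consumed from a script. The Python sketch below is one possible approach under stated assumptions: that stdout carries only JSON in these modes, and that each --jsonl line is one JSON value. The helper names parse_jsonl, run_json, and stream_run_events are hypothetical, and no field names are assumed because the event schema is not documented here.

```python
import json
import os
import subprocess


def parse_jsonl(lines):
    """Yield one decoded JSON value per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def run_json(args, api_key=None):
    """Run `octoparse <args> --json` and decode the single JSON response."""
    env = dict(os.environ)
    if api_key is not None:
        env["OCTO_ENGINE_API_KEY"] = api_key  # same variable as in the CI example
    done = subprocess.run(
        ["octoparse", *args, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
        env=env,
    )
    return json.loads(done.stdout)


def stream_run_events(task_id):
    """Yield events from `octoparse run <taskId> --jsonl` as they arrive."""
    with subprocess.Popen(
        ["octoparse", "run", task_id, "--jsonl"],
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        yield from parse_jsonl(proc.stdout)
```

For example, run_json(["task", "list"]) returns the decoded task list response, and a loop over stream_run_events(task_id) sees each event while the run is still in progress. parse_jsonl also accepts an open file, so it can read a saved .jsonl artifact.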
Local run artifacts are written under ~/.octoparse/runs by default, or under
the selected --output directory when configured:
<output>/<runId>/
  meta.json
  events.jsonl
  logs.jsonl
  rows.jsonl
Troubleshooting
Check the local environment:
octoparse doctor
octoparse browser doctor
If the browser is not detected automatically, pass its path:
octoparse run <taskId> --chrome-path "/path/to/chrome"
Clean stale local control state:
octoparse local cleanup
octoparse runs cleanup