@mercuryo-ai/magicbrowse-cli
v0.0.9
CLI for @mercuryo-ai/magicbrowse
Browser-only CLI around @mercuryo-ai/magicbrowse. Drives a real Chrome
session over CDP: launch / attach / observe / act, plus deterministic
click / type / fill / select / press. When a workflow reaches a
login, identity, checkout, donation, subscription, or payment page, stop and
return a handoff to the orchestrator or approved protected-data handler.
Install
npm i -g @mercuryo-ai/magicbrowse-cli@latest
magicbrowse --version
The package ships one binary, magicbrowse. All commands described
below are invoked as magicbrowse <cmd> [args].
Configure
act and run need credentials for the LLM that drives the planner +
navigator. Configure the gateway with magicbrowse init:
magicbrowse init <apiKey>
magicbrowse doctor
Current CLI compatibility note: the persisted config path and environment
override names still use the existing ~/.magicpay/config.json,
MAGICPAY_API_KEY, and MAGICPAY_API_URL names. Treat these as gateway
configuration names, not protected-form ownership.
doctor prints the current gateway endpoint and whether the API key is
configured. The deterministic commands
(launch, attach, observe, click, type, fill, select,
press, browser-status, close) do not need this config — they only
talk to Chrome.
The CLI uses its bundled default gateway URL for normal setup.
Pass --api-url <url> to init only when you intentionally target a
non-default staging, self-hosted, or test gateway.
Usage
# Open a browser
magicbrowse launch https://news.ycombinator.com
magicbrowse browser-status
# Let the LLM drive
magicbrowse act \
"Report the title and current score of the front-page top story, then stop." \
--max-steps 5
# Or take deterministic actions on observed targets
magicbrowse observe
magicbrowse click 3
magicbrowse type 4 "user@example.com"
magicbrowse fill 4 "user@example.com"
magicbrowse fill 4 ""
magicbrowse select 5 "United States"
magicbrowse type 4 -- --query=open
magicbrowse press Enter
# After a confirmed real CAPTCHA is solved externally on the current page
magicbrowse mark-captcha-resolved
magicbrowse act "continue from the resolved verification step"
magicbrowse close
# Attach to an existing CDP browser instead of launching one
magicbrowse attach http://127.0.0.1:9222
magicbrowse act "summarize the visible page"
# launch → act → close in one call
magicbrowse run \
--url https://en.wikipedia.org/wiki/Ada_Lovelace \
--goal "Report her year of birth and year of death from the infobox, then stop." \
--max-steps 6
Add --use-vision to act or run when a specific prompt needs the
current page screenshot in the LLM browser state.
Options
| Flag | Commands | Default | Notes |
|---|---|---|---|
| <apiKey> / --api-url <url> | init | bundled default API URL | Writes shared gateway credentials; --api-url is only for non-default gateways |
| [url] | launch | optional | Initial URL to open in the persistent session |
| <endpoint> | attach | required | CDP HTTP URL or browser websocket endpoint |
| <target-id> | click, type, fill, select | required | Bare observed target id from [N] in magicbrowse observe, such as 3 |
| <text> / <value> | type, fill | required | User-provided input; command output omits the raw value |
| <option-text> | select | required | Native <select> visible option text; command output omits the raw value |
| <keys> | press | required | Key chord sent to the focused context; command output omits the raw value |
| <prompt> / --goal <text> | act, run | required | Task description |
| --url <url> | run | required | Initial URL for the wrapper |
| --headful | launch, run | off | Debug/visible-browser override. Use only when a visible browser is explicitly requested; default runs headless. |
| --profile <name> | launch | session id | Owned Chrome profile name |
| --ttl <s> | mark-captcha-resolved | 300 | Seconds before the external CAPTCHA-solved marker expires |
| --max-steps N | act, run | 20 | Cap navigator iterations |
| --use-vision | act, run | off | Include a screenshot in the LLM browser state for this call only |
| --format human\|text\|json | act, run | human | human and text print readable event lines, json prints JSON Lines |
| --help | all | | Print help |
Direct action payloads are not parsed as CLI options. Values may start
with - or --; an optional -- separator before the payload is
accepted and is not sent to the SDK payload. For type, fill, and
select, the separator comes after <target-id>. fill <target-id> ""
clears ordinary text targets by passing an empty string.
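A wrapper script can apply the separator rule defensively by always inserting -- before the payload. The sketch below is illustrative only: mb_fill is a hypothetical helper name, and the only part taken from the text above is the magicbrowse fill <target-id> -- <value> invocation shape.

```shell
# mb_fill is a hypothetical wrapper name, not part of the CLI.
# It always inserts `--` after the target id so that a value beginning
# with `-` or `--` is passed through as the payload.
mb_fill() {
  magicbrowse fill "$1" -- "$2"
}

# Example call (requires a launched session and an observed target):
#   mb_fill 4 "--query=open"
```

Quoting "$2" keeps values that contain spaces in a single argument.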
Environment
MAGICPAY_API_KEY and MAGICPAY_API_URL override the gateway config for
act and run. Without environment overrides, use magicbrowse init once
and check the gateway config with magicbrowse doctor.
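For one-off runs, such as a CI job, the environment override can be scoped to a single command instead of being exported into the whole shell. A minimal sketch, assuming a hypothetical helper named with_api_key; the variable CI_GATEWAY_KEY in the usage comment is also made up for illustration.

```shell
# with_api_key is a hypothetical helper name, not part of the CLI.
# It runs one command with MAGICPAY_API_KEY set for that command only,
# leaving the persisted config file untouched.
with_api_key() {
  key=$1
  shift
  env MAGICPAY_API_KEY="$key" "$@"
}

# Example call (CI_GATEWAY_KEY is an assumed variable name):
#   with_api_key "$CI_GATEWAY_KEY" magicbrowse act "summarize the visible page"
```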
Output
act and run accept --format human|text|json. The default is
human: a structured, readable per-step renderer suitable for live
debugging. --format text prints one bare line per event
([hh:mm:ss] actor.state step=N details) followed by a final
=> <status> | finalUrl=<url> line and the final message if any.
--format json prints one JSON object per line — events and the final
result.
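When only --format text output is available, the status can be recovered from the final => <status> | finalUrl=<url> line. The helper below is a sketch under that assumption; final_status_from_text is a hypothetical name, and it relies on the status being one of the lowercase, underscore-separated values listed in this section.

```shell
# final_status_from_text is a hypothetical helper, not part of the CLI.
# It reads `--format text` output on stdin and prints the status taken from
# the `=> <status> | finalUrl=<url>` line. It prints nothing if no such
# line is present.
final_status_from_text() {
  sed -n 's/^=> \([a-z_]*\) | finalUrl=.*$/\1/p' | tail -n 1
}

# Example call:
#   magicbrowse act "summarize the visible page" --format text | final_status_from_text
```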
For LLM-driven act and run, branch on the final status field:
completed, blocked, needs_handoff, needs_approval, failed,
max_steps, or cancelled. blocked, needs_handoff, and
needs_approval are controlled semantic stops and exit 0; they are
not runtime failures. Use finalMessage as the explanation to show the
user or upstream orchestrator, not as the control-flow discriminator.
For blocked, branch on
blockedReason: missing_input | item_unavailable | ambiguous | no_path.
For needs_handoff, branch on
handoff.kind: protected_form | captcha | auth | identity_verification.
Protected-form handoffs include
handoff: { "kind": "protected_form", "resumeObjective": "..." }; after
the orchestrator or approved protected-data handler fills the form,
resume with that objective. CAPTCHA handoffs should be cleared by the user
or an approved external solver, followed by magicbrowse
mark-captcha-resolved.
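The branching rules above can be sketched as a small dispatcher. This is an illustration, not the CLI's own logic: next_step and the action labels it prints are hypothetical, and extracting status, blockedReason, and handoff.kind from the JSON Lines output is left to the caller because the exact object layout is not specified here.

```shell
# next_step STATUS [DETAIL]
# Hypothetical dispatcher. DETAIL is blockedReason when STATUS is `blocked`
# and handoff.kind when STATUS is `needs_handoff`. The printed labels are
# made-up names for what an orchestrator might do next.
next_step() {
  status=$1
  detail=${2:-}
  case "$status" in
    completed) echo "done" ;;
    blocked)
      case "$detail" in
        missing_input) echo "ask-for-input" ;;
        item_unavailable|ambiguous|no_path) echo "report-blocked:$detail" ;;
        *) echo "report-blocked:unknown" ;;
      esac ;;
    needs_handoff)
      case "$detail" in
        captcha) echo "solve-externally-then-mark-captcha-resolved" ;;
        protected_form) echo "hand-to-protected-data-handler" ;;
        auth|identity_verification) echo "hand-to-user:$detail" ;;
        *) echo "hand-to-user:unknown" ;;
      esac ;;
    needs_approval) echo "request-approval" ;;
    max_steps) echo "retry-with-narrower-goal" ;;
    failed|cancelled) echo "stop:$status" ;;
    *) echo "stop:unknown-status" ;;
  esac
}
```

Keeping the decision keyed on status and the two discriminator fields, rather than on finalMessage, matches the guidance that the message is for display only.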
magicbrowse mark-captcha-resolved [--ttl <s>] records that a real CAPTCHA
on the current active page was solved by an external participant. It does
not solve CAPTCHA or interact with the widget. The next act consumes the
marker once and still checks the actual page result; if CAPTCHA or human
verification remains visible, act returns needs_handoff again.
Direct deterministic commands print one redacted JSON result with
status, action, and safe target/reason context. Target-scoped direct
commands print the same bare target id that was passed on the command line.
completed exits 0; blocked direct actions exit 1.
launch and attach also print the persisted sessionId and runId.
browser-status prints one redacted JSON diagnostic and exits 0 when
the diagnostic completes, including when alive is false. close
prints the session id it closed.
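Because browser-status exits 0 even when alive is false, a script has to inspect the printed JSON rather than the exit code. The following is a rough sketch: session_alive is a hypothetical helper, and it assumes alive is serialized as a JSON boolean, which the text above implies but does not spell out.

```shell
# session_alive is a hypothetical helper, not part of the CLI.
# It reads the browser-status JSON on stdin and succeeds only when the text
# contains "alive": true. This is a crude text match, not a JSON parser.
session_alive() {
  grep -Eq '"alive"[[:space:]]*:[[:space:]]*true'
}

# Example call:
#   magicbrowse browser-status | session_alive || magicbrowse launch
```

A real JSON parser would be more robust if the diagnostic ever nests another alive key.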
The full per-run record is written under MAGICBROWSE_HOME (default
~/.magicbrowse):
- current-session.json: current CDP session pointer and active page identity.
- runs/<runId>.json: launch / attach / act / close events, executor events, planner/navigator debug dumps, LLM trace markers, final status, and final URL.
- run-index.json: session id to run id mapping.
No --debug flag is needed for diagnostics; act writes to the run
record by default, and repeated act calls in one session append to
the same run.
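To locate a run record from a script, resolve the directory the same way the text describes: MAGICBROWSE_HOME if set, otherwise ~/.magicbrowse. A minimal sketch, with run_record_path as a hypothetical helper name; the run id is whatever launch or attach printed.

```shell
# run_record_path is a hypothetical helper, not part of the CLI.
# It prints the path of runs/<runId>.json under MAGICBROWSE_HOME,
# falling back to ~/.magicbrowse when the variable is unset or empty.
run_record_path() {
  printf '%s/runs/%s.json\n' "${MAGICBROWSE_HOME:-$HOME/.magicbrowse}" "$1"
}

# Example call, with the run id printed by `magicbrowse launch`:
#   cat "$(run_record_path "$run_id")"
```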
Exit codes
| Code | Meaning |
|---|---|
| 0 | completed, blocked, needs_handoff, or needs_approval |
| 1 | failed |
| 2 | max_steps reached |
| 130 | cancelled |
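Since exit code 0 covers four different statuses, the code alone is only a coarse signal. The sketch below maps the table into a lookup for log messages; describe_exit is a hypothetical helper name, and any code not listed in the table is reported as unknown rather than guessed.

```shell
# describe_exit is a hypothetical helper, not part of the CLI.
# It maps an exit code from the table to its documented meaning.
describe_exit() {
  case "$1" in
    0) echo "completed, blocked, needs_handoff, or needs_approval" ;;
    1) echo "failed" ;;
    2) echo "max_steps reached" ;;
    130) echo "cancelled" ;;
    *) echo "unknown" ;;
  esac
}

# Example call:
#   magicbrowse act "summarize the visible page" --format json
#   describe_exit "$?"
```

For exit 0, the status field in the output is still needed to tell a completed run from a controlled stop.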
Tips for good runs
- Concrete goals with explicit stop conditions ("click X then confirm URL contains Y") work much better than vague ones ("click the link"). The navigator will otherwise keep verifying or wander.
- Start with --max-steps 5–10 for simple flows; bump to 15–20 only for multi-page tasks.
- Keep launches headless by default. Use --headful only for live debugging or when a visible browser is explicitly requested.
Local Development (This Monorepo)
If you're working inside the mercuryo-agent-pay repo and don't want
to publish or npm i -g on each change, build the package and call
the built entrypoint directly:
pnpm install
pnpm --filter @mercuryo-ai/magicbrowse build
pnpm --filter @mercuryo-ai/magicbrowse-cli build
node packages/magicbrowse-cli/dist/index.js init <apiKey>
node packages/magicbrowse-cli/dist/index.js launch https://example.com
node packages/magicbrowse-cli/dist/index.js act "..." --max-steps 5
node packages/magicbrowse-cli/dist/index.js close
pnpm exec magicbrowse is not guaranteed to be linked for in-repo
development; use the dist/index.js path above instead.
