@hasna/browser
v0.5.0
Published
General-purpose browser agent toolkit — Playwright, Chrome DevTools Protocol, Lightpanda with auto engine selection. CLI + MCP + REST + SDK.
Maintainers
Readme
@hasna/browser
General-purpose browser agent toolkit — Playwright, Chrome DevTools Protocol, Lightpanda with auto engine selection. CLI + MCP + REST + SDK.
Install
npm install -g @hasna/browserCLI Usage
browser --helpChoose The Control Lane
Use the narrowest browser lane that can finish the job:
| Lane | Use it for | Boundary |
|------|------------|----------|
| Browser-native automation | Owned sites, local fixtures, staging apps, CI, extraction, screenshots, audits, forms, and repeatable workflows in controlled browser sessions. | Use Playwright, CDP, Bun, or Lightpanda through the CLI, SDK, MCP, or REST API. --headed makes the automation browser visible; it does not turn automation into user-trusted hardware input. |
| Extension engine | Authorized workflows inside an operator-paired visible Chrome profile that is already logged in. | Use --engine extension only after browser-serve and browser extension pair. Extension sessions are explicit-only, policy-gated, and never selected by auto. |
| Pixel computer control | Browser chrome, OS dialogs, cross-app workflows, or visual-only UI that browser APIs cannot reach. | Use @hasna/computer for display-level mouse, keyboard, and screenshot control, then keep browser page work in this package when possible. |
Do not use browser automation, headed mode, CDP, stealth settings, or the extension engine to bypass CAPTCHA, MFA, bot detection, rate limits, paywalls, access controls, website terms, or anti-abuse systems. Prefer official APIs. Use automation only on domains, accounts, and data you are authorized to operate, and stop for manual approval when a site presents authentication, CAPTCHA, MFA, payment, or account-safety challenges.
MCP Server
browser-mcpHTTP mode
Run a long-lived Streamable HTTP MCP server on 127.0.0.1 (default port 8802):
browser-mcp --http
# or: MCP_HTTP=1 browser-mcp
# port override: --port 8802 or MCP_HTTP_PORT=8802- Health:
GET http://127.0.0.1:8802/health→{"status":"ok","name":"browser"} - MCP:
http://127.0.0.1:8802/mcp
Stdio remains the default when no --http / MCP_HTTP=1 is set.
REST API
browser-serveKernel Cloud Browsers
The kernel engine is explicit-only unless OPEN_BROWSER_BACKEND=kernel or
OPEN_BROWSER_REMOTE=1 is set. Local fallback remains unchanged for auto:
open-browser still picks Bun, Lightpanda, or Playwright locally unless you ask
for Kernel.
Configure Kernel with the established secret first. open-browser checks the
vault key before KERNEL_API_KEY and never prints the key:
secrets set hasna/xyz/opensource/browser/prod/kernel_api_key <kernel-api-key>
# or for one process:
export KERNEL_API_KEY=<kernel-api-key>
browser kernel status --remoteCreate an autonomous remote browser:
browser kernel open --url https://example.com --headed --kernel-profile-name example-agent
browser navigate https://example.com --engine kernel --kernel-profile-name example-agent --screenshot
browser session create --engine kernel --url https://example.com --kernel-timeout-seconds 600Kernel session options are available through CLI, SDK, MCP, and REST:
- Profiles:
--kernel-profile-name,--kernel-profile-id, or--kernel-persistence-id; named profiles are created when the SDK supports it. Profile state is saved when the Kernel browser is explicitly deleted or times out. - Runtime:
--kernel-timeout-seconds,--kernel-proxy-id,--kernel-gpu,--kernel-kiosk-mode,--kernel-tag key=value,--kernel-project-id, and--kernel-base-url. - Secrets/env:
--kernel-env KEY=VALUEfor non-secrets and--kernel-env-secret ENV_VAR=secret/keyfor values resolved from@hasna/secrets. - Auth:
--kernel-auth-mode managed|auto|cdp_autofill|off. Managed auth uses Kernel connections/credentials when a matching vault login exists;autofalls back to CDP autofill if managed auth cannot start.
Remote operations:
browser kernel sessions --json
browser kernel exec <session> "await page.goto('https://example.com'); return await page.title();"
browser kernel computer screenshot <session>
browser kernel files list <session> --path /tmp
browser kernel files download <session> /tmp/report.pdf
browser kernel replays start <session>
browser kernel replays stop <session> <replay_id>
browser kernel replays download <session> <replay_id>
browser kernel close <session>MCP tools include browser_kernel_status, browser_kernel_sessions,
browser_kernel_playwright_execute, browser_kernel_computer_action,
browser_kernel_computer_screenshot, browser_kernel_files_*, and
browser_kernel_replay_*. REST endpoints are exposed under
/api/kernel/.... SDK methods are available on BrowserSDK, for example
sdk.executeKernel(session, code), sdk.kernelFiles(session, "/tmp"), and
sdk.downloadKernelReplay(session, replayId).
Kernel create-time start_url is best-effort, so open-browser performs an
explicit navigation after attaching through CDP when startUrl is provided.
Use headful sessions (--headed) when live view or computer controls matter;
headless is still useful for fast script-only work. Kernel File I/O artifacts
are available only while the remote session is active, so download files or
replays before closing or allowing the session to time out.
Chrome Extension Engine
The extension engine is explicit-only: it is never auto-selected. It runs
jobs inside a paired, user-loaded Chrome MV3 extension, so actions execute in
the user's real logged-in browser session and network context.
Creating extension sessions requires explicit operator approval through
BROWSER_ALLOW_EXTENSION_SESSION=1 for a trusted local session, or
BROWSER_CAPABILITY_TOKEN plus the matching approval token.
bun run build:extension
browser-serve
browser extension pair
browser extension status
browser navigate https://example.com --engine extensionLoad extension/dist in Chrome via chrome://extensions -> Developer mode ->
Load unpacked, then enter the six-digit pairing code in the toolbar popup. The
service worker dials out to browser-serve over loopback WebSocket and keeps
the MV3 worker alive with 20s pings plus chrome.alarms.
Security defaults:
- No server-side website credentials are stored; the user's Chrome session is the auth.
- The bridge accepts only explicit, token-authenticated jobs.
- Arbitrary JavaScript
evaluatejobs are disabled unlessBROWSER_EXTENSION_ALLOW_EVAL=1. - Pairing codes are short-lived and single-use; tokens can be revoked with
browser extension unpair. - The default extension does not request
chrome.cookies; provider-specific cookie export should be added only behind an explicit opt-in build/scope. - DOM actions are injected into the real tab, but synthetic DOM events are not
browser-trusted user input (
Event.isTrustedstays false). Use the extension engine for authorized real-profile/session automation, not as a claim of hardware-level human input or a way around site anti-abuse controls.
Video Recording
Record browser sessions as WebM video instead of one-off screenshots:
browser video record https://example.com --duration 5 --quality high
browser video record "htop" --engine tui --duration 10 --quality high
browser video record "codewith --auth-profile account002" --engine tui --duration 30 --preset x-square
browser video record "codewith --auth-profile account002" --engine tui --duration 30 --preset reels --format mp4
browser video record https://example.com --duration 30 --quality ultra --format mp4 --capture-mode cdp --encoding crisp --crf 10
browser video record https://example.com --duration 30 --quality ultra --format mov --capture-mode cdp --encoding prores
browser video record http://spark01.taild59be2.ts.net:3325/ --duration 20 --quality ultra --format mp4 --capture-mode x11 --fps 60 --display-scale 2 --crf 10
browser video listMCP tools are available as browser_video_start, browser_video_stop,
browser_videos_list, and browser_video_delete. REST endpoints are exposed
under /api/videos. Quality presets map to the recording viewport (medium
= 720p, high = 1080p, ultra = 4K), and files are saved into the browser
downloads store with video metadata. TUI recording uses the existing ttyd
engine, so terminal apps are recorded through the rendered xterm.js browser
surface.
For crisp marketing captures, use --quality ultra --format mp4
--capture-mode cdp --encoding crisp --crf 10 to record a 4K canvas through
lossless PNG frames and export a high-bitrate H.264 file.
MP4 defaults to crisp UI-friendly encoding (CRF 12, x264 slow, animation
tuning) instead of ffmpeg's generic defaults, and high-fidelity MP4/MOV
exports automatically prefer CDP capture unless you pass --capture-mode
native. For a Mac/QuickTime-style master file, use --format mov
--capture-mode cdp --encoding prores; this creates a much larger ProRes 422 HQ
.mov intended for editing or archival handoff before social export. Use
--encoding lossless with MP4 only when you need a huge H.264 lossless
intermediate, and --fps 60, --video-bitrate 40M, or --ffmpeg-preset
veryslow when you need explicit encoder control.
For smooth real-time marketing video, use --capture-mode x11. This launches
a headed Chromium window on a private Xvfb display and records that display via
ffmpeg x11grab, avoiding Playwright WebM compression and screenshot polling.
Use --quality ultra --fps 60 --display-scale 2 for a Retina-style 4K output:
the browser lays out like 1920x1080 CSS pixels, while the captured video is
3840x2160 pixels. This mode requires Xvfb and an ffmpeg build with x11grab
and libx264; set BROWSER_XVFB_PATH or pass --xvfb-path if Xvfb is not on
PATH.
For social demos, use --preset x-square for X/Twitter feed posts or
--preset reels / --preset tiktok for vertical video. These presets render
the terminal inside a realistic light window with larger text, instead of
capturing an oversized raw desktop that becomes hard to read after platform
downscaling. Override the composition with --tui-font-size,
--tui-zoom, --tui-frame-fit canvas, --tui-padding, --tui-window-width,
--tui-window-height, --tui-theme, --background, or --tui-frame off.
Use --tui-frame-fit canvas --tui-padding 24 for a terminal window that fills
almost the whole video, --tui-frame-fit canvas --tui-padding 0 for framed
edge-to-edge output, or --tui-frame off for raw fullscreen terminal output.
Use --tui-zoom 0.85 for slightly smaller text without changing the preset.
For colorized terminal demos, run commands with FORCE_COLOR=3,
CLICOLOR_FORCE=1, and TERM=xterm-256color. Native capture produces WebM;
use --format mp4 --capture-mode cdp for a social-upload-friendly H.264 MP4
export or --format mov --capture-mode cdp --encoding prores for the
highest-fidelity master. Conversion uses the bundled ffmpeg-static binary and
falls back to BROWSER_FFMPEG_PATH or system ffmpeg.
Storage Sync
This package supports optional remote storage sync through a package-local Postgres connection:
export HASNA_BROWSER_DATABASE_URL=postgres://...
browser storage status
browser storage push
browser storage pull
browser storage syncThe MCP server also exposes storage_status, storage_push, storage_pull, and storage_sync.
Programmatic storage helpers are available from @hasna/browser/storage.
Programmatic video helpers are available from @hasna/browser/video, and the
extension bridge helpers are available from @hasna/browser/extension.
Data Directory
Data is stored in ~/.hasna/browser/.
License
Apache-2.0 -- see LICENSE
