@hera-al/browser-server
v1.0.5
Hera Browser Server — local browser automation via Playwright and CDP
@hera-al/browser-server
Part of Hera Artificial Life — an opinionated AI assistant platform that runs locally on your machine. This package provides the browser automation layer used by Hera agents to interact with the web. It can also be used independently in any Node.js project.
Local browser automation server powered by Playwright and the Chrome DevTools Protocol (CDP).
Exposes a lightweight HTTP API on 127.0.0.1 that lets AI agents — or any local process — launch, control, and inspect browser sessions without embedding a full browser SDK.
Features
- Launch or attach to a local Chrome/Chromium instance via CDP
- Multi-profile support (separate CDP ports, colors, isolated sessions)
- Tab management — open, list, focus, close tabs
- Page actions — click, type, hover, scroll, drag, select, fill forms, press keys, navigate
- Snapshots — role-based (Playwright), ARIA, DOM, text/HTML extraction, CSS query, AI-digest
- Screenshots & PDF — full-page or element-level capture
- Storage — read/write cookies, localStorage, sessionStorage
- JavaScript evaluation — run arbitrary JS in page context
- Wait conditions — wait for text, selector, URL, load state, or custom function
- Standalone or embedded — run as a standalone process or import into your own Node.js app
Install
npm install @hera-al/browser-server
Peer dependency: Requires playwright-core ≥ 1.50 and a Chromium-based browser available on the system.
Quick start
Standalone (CLI)
npx hera-browser --port 3002 --headless false
All CLI flags:
| Flag | Default | Description |
| ------------------ | ------- | ---------------------------------------- |
| --port | 3002 | HTTP control port |
| --headless | false | Run Chrome headless |
| --noSandbox | false | Disable Chrome sandbox (CI/Docker) |
| --attachOnly | false | Never launch Chrome, only attach via CDP |
| --executablePath | — | Custom Chrome/Chromium binary path |
Programmatic
import { startBrowserServer, stopBrowserServer } from "@hera-al/browser-server";
import { resolveBrowserConfig, BrowserConfigSchema } from "@hera-al/browser-server/config";
const config = resolveBrowserConfig(
  BrowserConfigSchema.parse({
    enabled: true,
    controlPort: 3002,
    headless: false,
  })
);
await startBrowserServer(config);
// ... your app logic ...
await stopBrowserServer();
HTTP API Reference
All endpoints listen on http://127.0.0.1:{port}. Responses are JSON with an ok boolean.
Status & lifecycle
| Method | Path | Description |
| ------ | -------- | --------------------------------- |
| GET | / | Check browser status (running / stopped) |
| POST | /start | Launch or attach to the browser |
| POST | /stop | Stop the browser |
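The lifecycle endpoints above can be exercised with a few lines of TypeScript. This is a minimal sketch, assuming Node.js 18+ (for the global fetch) and a server already listening on the default port 3002. The helper names endpoint, isOk, and ensureStarted are illustrative and not part of the package, and the sketch sends no request body to /start because none is documented. The only response field it relies on is the documented ok boolean.

```typescript
// Hypothetical helpers; not exported by @hera-al/browser-server.
const DEFAULT_PORT = 3002;

// Build a control URL; the server listens on 127.0.0.1.
function endpoint(path: string, port: number = DEFAULT_PORT): string {
  return `http://127.0.0.1:${port}${path}`;
}

// Responses are documented as JSON with an `ok` boolean; no other field is assumed.
function isOk(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { ok?: unknown }).ok === true
  );
}

// POST /start, then GET / to read back the browser status.
async function ensureStarted(port: number = DEFAULT_PORT): Promise<unknown> {
  const started = await fetch(endpoint("/start", port), { method: "POST" });
  if (!isOk(await started.json())) {
    throw new Error("POST /start did not return ok: true");
  }
  const status = await fetch(endpoint("/", port));
  return status.json();
}
```

Calling ensureStarted() resolves to the raw status JSON; its exact shape beyond ok is not specified here, so the sketch returns it untyped.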
Tabs
| Method | Path | Description |
| ------ | ------------- | ------------- |
| GET | /tabs | List all open tabs |
| POST | /tabs/open | Open a new tab ({ url }) |
| POST | /tabs/focus | Focus a tab ({ id }) |
| DELETE | /tabs/:id | Close a tab |
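One way to keep the tab routes in a single place is a small table of request builders. The sketch below assumes tab ids are strings and that POST bodies are sent as JSON with a content-type header; neither is stated explicitly above. TabRequest, tabs, and send are hypothetical names.

```typescript
// Hypothetical request descriptor; methods and paths mirror the Tabs table.
interface TabRequest {
  method: "GET" | "POST" | "DELETE";
  path: string;
  body?: Record<string, string>;
}

const tabs = {
  list: (): TabRequest => ({ method: "GET", path: "/tabs" }),
  open: (url: string): TabRequest => ({
    method: "POST",
    path: "/tabs/open",
    body: { url },
  }),
  focus: (id: string): TabRequest => ({
    method: "POST",
    path: "/tabs/focus",
    body: { id },
  }),
  // The id is URL-encoded in case it contains characters such as "/".
  close: (id: string): TabRequest => ({
    method: "DELETE",
    path: `/tabs/${encodeURIComponent(id)}`,
  }),
};

// Send a descriptor to a running server (assumes Node.js 18+ global fetch).
async function send(req: TabRequest, port = 3002): Promise<unknown> {
  const res = await fetch(`http://127.0.0.1:${port}${req.path}`, {
    method: req.method,
    headers: req.body ? { "content-type": "application/json" } : undefined,
    body: req.body ? JSON.stringify(req.body) : undefined,
  });
  return res.json();
}
```

For example, await send(tabs.open("https://example.com")) would open a tab, and await send(tabs.list()) would list the open ones.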
Actions (POST /act)
Send { kind, ...params } to perform a page action:
| Kind | Key params |
| ------------ | ------------------------------------------------------------- |
| navigate | url, timeoutMs? |
| click | ref, doubleClick?, button?, modifiers? |
| type | ref, text, submit?, slowly? |
| press | key, delayMs? |
| hover | ref |
| scroll | ref |
| drag | startRef, endRef |
| select | ref, values |
| fill_form | fields: [{ ref, type, value }] |
| screenshot | ref?, element?, fullPage?, type? (png or jpeg) |
| evaluate | fn (JS string), ref? |
| wait | timeMs?, text?, textGone?, selector?, url?, loadState?, fn? |
All actions accept optional profile, targetId, and timeoutMs.
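The { kind, ...params } shape maps naturally onto a TypeScript discriminated union. The sketch below types only four of the kinds from the table and assumes ref values are strings; the Action, CommonOptions, actBody, and act names are illustrative and not part of the package.

```typescript
// Partial, illustrative typing of the /act payload; only a few kinds are covered.
type Action =
  | { kind: "navigate"; url: string }
  | { kind: "click"; ref: string; doubleClick?: boolean }
  | { kind: "type"; ref: string; text: string; submit?: boolean }
  | { kind: "press"; key: string; delayMs?: number };

// Options every action accepts, per the note above.
interface CommonOptions {
  profile?: string;
  targetId?: string;
  timeoutMs?: number;
}

// Merge an action and the common options into one flat JSON body.
function actBody(action: Action, common: CommonOptions = {}): string {
  return JSON.stringify({ ...action, ...common });
}

// POST the body to /act (assumes Node.js 18+ global fetch and a JSON content type).
async function act(
  action: Action,
  common: CommonOptions = {},
  port = 3002
): Promise<unknown> {
  const res = await fetch(`http://127.0.0.1:${port}/act`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: actBody(action, common),
  });
  return res.json();
}
```

With this typing, a call such as act({ kind: "click" }) fails at compile time because ref is missing, which catches malformed payloads before they reach the server.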
Snapshots (GET /snapshot)
Query params:
| Param | Values | Description |
| --------------- | ------------------------------------------- | ---------------------------------- |
| mode | role (default), aria, dom, text, html, ai, query | Snapshot format |
| selector | CSS selector string | Scope to element (for text, html, query) |
| frameSelector | CSS selector string | Target iframe |
| interactive | true / false | Include interactive elements only |
| compact | true / false | Compact output |
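The query parameters above can be assembled with a small URL builder. This is a sketch under the assumption that booleans are passed as the literal strings true and false; snapshotUrl and its types are hypothetical helpers, not package exports.

```typescript
type SnapshotMode = "role" | "aria" | "dom" | "text" | "html" | "ai" | "query";

// Mirrors the query params table; every field is optional.
type SnapshotParams = {
  mode?: SnapshotMode;
  selector?: string;
  frameSelector?: string;
  interactive?: boolean;
  compact?: boolean;
};

// Build the GET /snapshot URL, skipping unset params and URL-encoding the rest.
function snapshotUrl(params: SnapshotParams = {}, port = 3002): string {
  const query = Object.entries(params)
    .filter(([, value]) => value !== undefined)
    .map(
      ([key, value]) =>
        `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`
    )
    .join("&");
  return `http://127.0.0.1:${port}/snapshot${query ? `?${query}` : ""}`;
}
```

Encoding matters for selectors: a selector such as #main must be sent as %23main, otherwise the # would be treated as a URL fragment and dropped from the request.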
Screenshots & PDF
| Method | Path | Description |
| ------ | ------------- | -------------------------------------- |
| POST | /screenshot | Capture screenshot ({ fullPage?, type?, ref?, element? }) |
| POST | /pdf | Export page as PDF |
Storage
| Method | Path | Description |
| ------ | ----------------- | ----------------------------- |
| GET | /cookies | Get all cookies |
| POST | /cookies | Set a cookie ({ cookie }) |
| DELETE | /cookies | Clear all cookies |
| GET | /storage/:kind | Get localStorage or sessionStorage (kind = local or session) |
| POST | /storage/:kind | Set a key ({ key, value }) |
| DELETE | /storage/:kind | Clear storage |
Configuration
When using the programmatic API, the full config schema is:
{
  enabled: boolean;            // default: false
  controlPort: number;         // default: 3002
  headless: boolean;           // default: false
  noSandbox: boolean;          // default: false
  attachOnly: boolean;         // default: false
  executablePath?: string;     // custom Chrome path
  remoteCdpTimeoutMs: number;  // default: 1500
  profiles: {
    [name: string]: {
      cdpPort?: number;        // default: 9222
      cdpUrl?: string;         // overrides cdpPort
      color?: string;          // hex color, default: "#FF4500"
    }
  }
}
Profiles
Profiles let you run multiple isolated browser sessions, each on its own CDP port. The default profile is always available.
const config = resolveBrowserConfig(
  BrowserConfigSchema.parse({
    enabled: true,
    profiles: {
      default: { cdpPort: 9222 },
      social: { cdpPort: 9223, color: "#1DA1F2" },
    },
  })
);
Pass ?profile=social (GET) or { "profile": "social" } (POST) to target a specific profile.
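The two ways of selecting a profile can be wrapped in a pair of helpers. This is a minimal sketch; withProfileQuery and withProfileBody are hypothetical names, and the example URLs assume the default control port.

```typescript
// GET: append ?profile=<name>, or &profile=<name> if a query string already exists.
function withProfileQuery(url: string, profile: string): string {
  const separator = url.includes("?") ? "&" : "?";
  return `${url}${separator}profile=${encodeURIComponent(profile)}`;
}

// POST: add a "profile" field to the JSON body, leaving the other fields untouched.
function withProfileBody<T extends object>(
  body: T,
  profile: string
): T & { profile: string } {
  return { ...body, profile };
}

// Example values for the "social" profile defined in the config above.
const tabsUrl = withProfileQuery("http://127.0.0.1:3002/tabs", "social");
const navigate = withProfileBody(
  { kind: "navigate", url: "https://example.com" },
  "social"
);
```

Here tabsUrl would be used for a GET request, and navigate would be serialized with JSON.stringify and posted to /act.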
Architecture
┌─────────────────────────────────────────┐
│             HTTP API (Hono)             │
│ basic · tabs · act · snapshot · storage │
└────────────────────┬────────────────────┘
                     │
            ┌────────┴────────┐
            │ BrowserContext  │  ← multi-profile manager
            └────────┬────────┘
                     │
        ┌────────────┼────────────┐
        │            │            │
     Chrome     Playwright       CDP
    launcher    session mgr    helpers
- Hono — lightweight HTTP framework (< 15 KB)
- Playwright — used for page interactions, snapshots, screenshots (connects to existing Chrome via CDP)
- CDP direct — used for ARIA/DOM snapshots, tab management, WebSocket resolution
Requirements
- Node.js ≥ 18
- Chrome or Chromium installed locally (or provide executablePath)
- macOS, Linux, or WSL
License
MIT — © 2026 TGP / Hera Artificial Life
