octoflow-browser-bridge
v1.0.1
Drive an OctoFlow agent (Claude Code by default) from a Chrome side-panel extension that streams the live page you're reviewing into the model and lets it edit local files.
Chrome side-panel extension and local bridge server for running an OctoFlow agent against the page you are viewing. The agent receives live DOM context, can stream answers back to the panel, and can use approved browser actions or workspace-scoped file edits when you enable them.
Use this README for the package surface. Use ../../docs/browser-bridge.md for the full setup runbook, screenshots, troubleshooting, and deeper protocol notes.
Install
npm install octoflow-core octoflow-browser-bridge
Node.js >=20 and Chrome/Chromium are required, plus at least one OctoFlow backend: a hosted API key (Anthropic / OpenAI / Gemini) or a local Ollama install (see Provide a backend). The extension is built from this package; the server binds to 127.0.0.1 by default.
Use It When
- You want an agent that can reason over the live page DOM instead of pasted HTML.
- You want browser-side approvals for page actions, navigation, tab actions, and file edits.
- You need a local side-panel workflow for reviewing web apps, docs, dashboards, or local code.
- You want persisted browser chat sessions without adding a hosted service.
Quick Start
Run it in three steps: provide a backend, start the loopback server, then load the extension in Chrome.
Provide a backend (Ollama for local, zero-key runs)
The bridge needs at least one OctoFlow backend to talk to. The default priority prefers hosted API backends (Anthropic / OpenAI / Gemini — set the matching API key), then falls back to local Ollama. To run fully locally with no API key, install and start Ollama first:
# Install Ollama (macOS): brew install ollama
# or download the installer from https://ollama.com/download
ollama serve # start the local daemon (127.0.0.1:11434)
ollama pull llama3.1 # pull a tool-capable chat model (e.g. qwen2.5 / mistral)
Ollama is reached at 127.0.0.1:11434 by default (override with OLLAMA_HOST). Page chat works with any chat model; for in-page browser actions prefer a tool-loop-capable model, since those actions run through OctoFlow-owned tools.
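The backend priority and the OLLAMA_HOST override described above can be sketched roughly as follows. This is an illustration, not OctoFlow's actual selection code: `pickBackend` and `resolveOllamaBaseUrl` are hypothetical helpers, the API-key variable names are conventional guesses, and the ordering among hosted backends is assumed from the order in which they are listed.

```typescript
// Sketch only: these helpers are hypothetical and not part of OctoFlow.
type Env = Record<string, string | undefined>;

type Backend = "anthropic" | "openai" | "gemini" | "ollama";

// ASSUMPTION: conventional API-key variable names; check your OctoFlow setup.
const HOSTED_KEYS: Array<[Backend, string]> = [
  ["anthropic", "ANTHROPIC_API_KEY"],
  ["openai", "OPENAI_API_KEY"],
  ["gemini", "GEMINI_API_KEY"],
];

// Hosted backends are tried first; local Ollama is the fallback.
function pickBackend(env: Env): Backend {
  for (const [backend, keyName] of HOSTED_KEYS) {
    if ((env[keyName] || "").trim() !== "") return backend;
  }
  return "ollama";
}

// OLLAMA_HOST overrides the default 127.0.0.1:11434 address.
// ASSUMPTION: the value may be "host:port" or a full http(s) URL.
function resolveOllamaBaseUrl(env: Env): string {
  const raw = (env["OLLAMA_HOST"] || "").trim() || "127.0.0.1:11434";
  const withScheme = /^https?:\/\//.test(raw) ? raw : "http://" + raw;
  return withScheme.replace(/\/+$/, ""); // drop trailing slashes
}
```

With no API keys set, `pickBackend` falls through to Ollama, and the resolved base URL can be checked against Ollama's `GET /api/tags` endpoint, which lists locally pulled models.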
Build the extension and start the server
From the repository root:
npm run -w octoflow-browser-bridge build:extension
npm run -w octoflow-browser-bridge start
The start script rebuilds the server, frees the port, and serves on 127.0.0.1:7878. For a faster development loop that runs from source without a server rebuild:
npm run -w octoflow-browser-bridge start:dev
Load the extension in Chrome
- Open chrome://extensions.
- Enable Developer mode.
- Choose Load unpacked.
- Select packages/octoflow-browser-bridge/chrome-extension.
- Pin OctoFlow Browser Bridge and open the side panel.
Runtime Model
Chrome side panel + background worker
-> captures page snapshots, stores global browser-tool toggles, routes approvals
-> WebSocket / SSE
Local bridge server on 127.0.0.1:7878
-> enriches prompts with page context
-> creates or reuses an OctoFlow agent
-> exposes only the enabled browser and file-edit tools for that session
Each tab can have its own conversation. Opening the side panel starts on a fresh current-tab chat so the user can type immediately; previously persisted sessions remain available from the session picker instead of taking over the active view.
The current page snapshot is attached to every turn, and browser-tool grants are global side-panel settings applied to whichever session/tab is active. Grants are re-synced with every known background session and again before a turn starts, so newly opened or restored chats use the globally enabled browser tools consistently.
Browser actions use OctoFlow-owned tools, so the default backend priority prefers API/tool-loop-capable backends before native CLI bridges; native CLI backends can chat with page context but cannot execute these in-process browser actions.
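The grant re-sync described above can be pictured with a small model. Everything here is hypothetical (the `ToolGrants` and `Session` shapes, and both function names); it only illustrates the idea that global grants are copied onto every known session and applied once more before a turn starts.

```typescript
// Hypothetical shapes for illustration; the real types live in the package source.
interface ToolGrants {
  pageActions: boolean;
  navigation: boolean;
  tab: boolean;
  fileEdits: boolean;
}

interface Session {
  id: string;
  tabId: number;
  grants: ToolGrants;
}

// Copy the global side-panel grants onto every known session (no mutation).
function syncGrants(sessions: Session[], globalGrants: ToolGrants): Session[] {
  return sessions.map((s) => ({ ...s, grants: { ...globalGrants } }));
}

// Re-sync once more right before a turn, then list the tool groups to expose.
function toolsForTurn(session: Session, globalGrants: ToolGrants): string[] {
  const synced = syncGrants([session], globalGrants)[0];
  const names = Object.keys(synced.grants) as Array<keyof ToolGrants>;
  return names.filter((name) => synced.grants[name]);
}
```

Because the sync returns fresh copies, a restored session that carried stale grants ends up with the same tool set as a newly opened one.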
Capabilities
| Capability | Default | What the agent can do |
| ------------ | ------- | ---------------------------------------------------------------------------------------------------------- |
| Page context | On | Read the active tab URL, title, DOM summary, frames, and selected element context. |
| Page actions | On | Request approved click, fill, submit, select, focus, or scroll actions in the session tab. |
| Navigation | On | Request approved go, back, forward, or reload actions. |
| Snapshot | On | Capture a fresh page snapshot on demand. Query the live DOM with browser_accessibility_tree (ARIA semantic tree via CDP — 10× token-efficient, CSP-safe), browser_screenshot (JPEG viewport capture for vision backends), browser_find_elements (CSS selector → confirmed element list), browser_get_text (targeted text extraction), and browser_dom_outline (layered DOM tree at configurable depth). |
| Diagnostics | On | Inspect recent user events, DOM counters, console/errors, network/resource timing, performance observer data, storage key names only (never values), and a compact live monitor feed. Filter network requests by URL pattern, method, or status with browser_query_network. |
| Tab | On | Read session-tab info, or request approved open, focus, and close actions scoped to the session tab. |
| File edits | Off | Edit files only under the workspace path you selected in the panel. |
Approvals appear as side-panel cards. Approve all applies only to the current session.
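As a rough model of the table and the approval rule above, the sketch below treats each capability as either blocked (disabled), allowed without a prompt (read-only), or requiring an approval card. The `ApprovalPolicy` class and the capability identifiers are hypothetical names, and lumping read-only tab info together with the mutating tab actions is a simplification.

```typescript
// Hypothetical model of the approval rules; names are illustrative only.
type Capability =
  | "pageContext" | "pageActions" | "navigation"
  | "snapshot" | "diagnostics" | "tab" | "fileEdits";

// Defaults from the capability table: everything on except file edits.
const DEFAULT_ENABLED: Record<Capability, boolean> = {
  pageContext: true, pageActions: true, navigation: true,
  snapshot: true, diagnostics: true, tab: true, fileEdits: false,
};

// Capabilities whose actions change state and therefore need approval.
const MUTATING: Capability[] = ["pageActions", "navigation", "tab", "fileEdits"];

class ApprovalPolicy {
  private approveAllSessions: Record<string, boolean> = {};

  constructor(private enabled: Record<Capability, boolean> = DEFAULT_ENABLED) {}

  // "Approve all" is remembered per session, never globally.
  approveAll(sessionId: string): void {
    this.approveAllSessions[sessionId] = true;
  }

  decide(sessionId: string, capability: Capability): "blocked" | "allow" | "ask" {
    if (!this.enabled[capability]) return "blocked";
    if (MUTATING.indexOf(capability) === -1) return "allow";
    return this.approveAllSessions[sessionId] ? "allow" : "ask";
  }
}
```

Calling `approveAll("a")` changes the outcome for session `a` only; a second session still gets an approval card for the same action.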
Side-Panel UX
- Fresh chat first: panel boot binds an empty session to the active tab; old history is available from the session picker.
- Sessions: + opens another current-tab chat. Clear deletes the current session and stored history. Delete all removes all sessions and histories while preserving preferences, then opens a fresh chat.
- Confirmation modal: destructive session actions use an in-panel confirmation dialog instead of blocking browser confirm() prompts.
- Settings screen: opening Settings hides the session picker, chat transcript, and composer. Settings owns scrolling, and the Save/Cancel bar stays reachable.
- Set all to default: restores the server URL, system prompt, auto-approve default, provider, and model in one action.
API Surfaces
The server exposes a small local API for the extension and CLI:
| Surface | Purpose |
| --------------------------------- | ---------------------------------------------------------- |
| GET /health | Liveness and counts. |
| GET /octogarage/providers | Ready and unavailable OctoFlow backends. |
| POST /octogarage/run | Start one streamed agent turn over SSE. |
| POST /octogarage/approve | Resolve a pending approval. |
| GET /octogarage/sessions | List persisted sessions. |
| GET /octogarage/sessions/:id | Read one persisted session. |
| DELETE /octogarage/sessions/:id | Drop one session, remove its JSONL history, and deny pending approvals. |
| GET /octogarage/preferences | Read local preferences. |
| PUT /octogarage/preferences | Patch local preferences. |
| WS /octogarage | Extension bus for snapshots, actions, status, and reloads. |
Use the Zod schemas in src/domain/ as the source of truth for request and event shapes.
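Since POST /octogarage/run streams its turn over SSE, a client has to split the response text into events. The sketch below is a minimal parser for the standard SSE wire format (event and data fields, blank-line dispatch, comment lines); it is not the extension's own client code, and the event names used in the usage note are invented for illustration. The real event shapes are defined by the Zod schemas in src/domain/.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event field was sent
  data: string;  // multiple data lines are joined with "\n"
}

// Parse a complete chunk of SSE text into dispatched events.
// An event without a terminating blank line is not dispatched.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  let event = "message";
  let data: string[] = [];
  for (const line of text.split(/\r?\n/)) {
    if (line === "") {
      // Blank line: dispatch only if at least one data field was seen.
      if (data.length > 0) events.push({ event, data: data.join("\n") });
      event = "message";
      data = [];
      continue;
    }
    if (line.charAt(0) === ":") continue; // comment / keep-alive line
    const idx = line.indexOf(":");
    const field = idx === -1 ? line : line.slice(0, idx);
    let value = idx === -1 ? "" : line.slice(idx + 1);
    if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
    if (field === "event") event = value;
    else if (field === "data") data.push(value);
  }
  return events;
}
```

For example, feeding it two hypothetical `token` events followed by a keep-alive comment yields two parsed events and silently skips the comment. A streaming client would additionally buffer partial chunks until a blank line arrives.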
Persistence And Environment
Sessions are stored as an index plus append-only JSONL transcripts by default:
~/.octoflow/browser-extension/
Useful environment variables:
| Variable | Default | Purpose |
| ------------------------------------- | -------------------------------- | ------------------------------------------------------------------ |
| OCTOFLOW_BROWSER_BRIDGE_PORT | 7878 | HTTP and WebSocket port. |
| OCTOFLOW_BROWSER_BRIDGE_HOST | 127.0.0.1 | Bind host. Keep loopback unless you add your own network controls. |
| OCTOFLOW_BROWSER_BRIDGE_UNSAFE_BIND | unset | Set 1 to allow a non-loopback HOST; otherwise startup refuses. |
| OCTOFLOW_BROWSER_BRIDGE_TOKEN | auto-generated | Shared token required on HTTP/WS calls; persisted to the data dir when unset. |
| OCTOFLOW_BROWSER_BRIDGE_LOG | normal | quiet, normal, or verbose. |
| OCTOFLOW_BROWSER_BRIDGE_DATA_DIR | ~/.octoflow/browser-extension/ | Persistence directory. |
| OCTOFLOW_BROWSER_BRIDGE_PERSISTENCE | file | file or memory. |
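The host, port, and unsafe-bind rules in the table can be sketched as a small resolver. This is an illustration under stated assumptions, not the bridge's startup code: `resolveBridgeConfig` and `isLoopback` are hypothetical names, and the exact definition of "loopback" (localhost, ::1, or 127.x.x.x) is assumed.

```typescript
// Sketch only: not the bridge's actual startup code.
type Env = Record<string, string | undefined>;

interface BridgeConfig {
  host: string;
  port: number;
}

// ASSUMPTION: "loopback" means localhost, ::1, or any 127.x.x.x address.
function isLoopback(host: string): boolean {
  return host === "localhost" || host === "::1" || /^127(\.\d{1,3}){3}$/.test(host);
}

function resolveBridgeConfig(env: Env): BridgeConfig {
  const host = env["OCTOFLOW_BROWSER_BRIDGE_HOST"] || "127.0.0.1";
  const port = Number(env["OCTOFLOW_BROWSER_BRIDGE_PORT"] || "7878");
  // Rejects NaN, fractions, and out-of-range values.
  if (Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error("invalid port: " + env["OCTOFLOW_BROWSER_BRIDGE_PORT"]);
  }
  // A non-loopback host is refused unless UNSAFE_BIND is exactly "1".
  if (!isLoopback(host) && env["OCTOFLOW_BROWSER_BRIDGE_UNSAFE_BIND"] !== "1") {
    throw new Error("refusing to bind " + host + " without OCTOFLOW_BROWSER_BRIDGE_UNSAFE_BIND=1");
  }
  return { host, port };
}
```

Passing the environment in as a plain object keeps the rule easy to test: an empty object yields 127.0.0.1:7878, and a host of 0.0.0.0 throws unless the unsafe-bind flag is set.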
The side panel deletes sessions through scoped DELETE /octogarage/sessions/:id calls so preferences are preserved. Session CLI helpers:
octoflow-browser-bridge sessions list
octoflow-browser-bridge sessions show <sessionId>
octoflow-browser-bridge sessions export <sessionId> file.json
octoflow-browser-bridge sessions purge --yes
Security Model
- The server binds to loopback by default; HTTP CORS and WebSocket upgrades accept only loopback plus Chrome extension origins.
- Browser tools are enabled by default for local side-panel workflows; mutating actions still require approval and can be disabled globally in Settings.
- Mutating page, navigation, tab, and file-edit actions require approval.
- Tab tools are scoped to the session-bound tab; the agent cannot enumerate or target arbitrary tabs.
- File writes use safePath({ base: workspaceRoot }) from octoflow-core and reject workspace escapes, symlink escapes, .. segments, and reserved system paths.
- Background-worker guards reject capability mismatches, tab mismatches, and URL drift before calling Chrome APIs.
- Diagnostics run in the page's main world plus Chrome DevTools Protocol to observe console/errors, network timing, user events, DOM mutation counts, PerformanceObserver entries, ReportingObserver reports, workers, cookie metadata, and storage key names. Storage values and cookie values are never read.
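To make the workspace-scoping rule for file writes concrete, here is a purely lexical sketch for POSIX-style paths. It is not the safePath implementation from octoflow-core: `resolveInWorkspace` is a hypothetical helper, and it covers only `..` segments and absolute paths outside the root. Symlink escapes and reserved system paths need filesystem checks that this sketch omits.

```typescript
// Illustrative only: NOT the safePath implementation from octoflow-core.
// Handles POSIX-style paths lexically; symlinks are not resolved.
function resolveInWorkspace(workspaceRoot: string, requested: string): string {
  const root = workspaceRoot.replace(/\/+$/, ""); // drop trailing slashes
  const segments = requested.split("/");

  // Reject any ".." segment outright rather than trying to normalize it.
  if (segments.indexOf("..") !== -1) {
    throw new Error("'..' segments are rejected: " + requested);
  }

  if (requested.charAt(0) === "/") {
    // Absolute paths must be the root itself or sit below "root/".
    if (requested !== root && requested.indexOf(root + "/") !== 0) {
      throw new Error("path escapes the workspace: " + requested);
    }
    return requested;
  }

  // Relative paths are joined onto the root, skipping "" and "." segments.
  const rel = segments.filter((s) => s !== "" && s !== ".").join("/");
  return rel === "" ? root : root + "/" + rel;
}
```

The prefix check compares against `root + "/"` on purpose, so a sibling directory such as `/home/u/ws-other` is not mistaken for a path inside `/home/u/ws`.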
Layout
| Path | Owns |
| -------------------------- | --------------------------------------------------------------------------------------- |
| src/server/ | Local HTTP/WS server, agent pool, run handler, approval wiring, safe-edit policy. |
| src/domain/ | Zod schemas for run requests, browser actions, extension capabilities, and SSE events. |
| src/runtime/persistence/ | SessionStore, file persistence, and memory persistence. |
| extension/ | MV3 extension, background worker, content script, frame snapshotting, React side panel. |
| bin/ | octoflow-browser-bridge CLI entrypoint. |
Learn More
- ../../docs/browser-bridge.md - full setup, capability behavior, and troubleshooting.
- ../../docs/security.md - OctoFlow approvals, safe paths, and sandboxing context.
- ../octoflow-core - runtime used by the bridge server.
Validate
npm run -w octoflow-browser-bridge lint
npm run -w octoflow-browser-bridge typecheck
npm run -w octoflow-browser-bridge test
npm run -w octoflow-browser-bridge build
Status
This package is a preview intended for local workflows. Pin versions and read ../../CHANGELOG.md before depending on it in production.
