swarmy-iris-mcp
v0.3.0
MCP server for IRIS — give your agents a real browser. Wraps the IRIS SSE endpoint as a single iris_run tool.
swarmy-iris-mcp
MCP server for IRIS. Give your agents a real browser via a single MCP tool.
Wraps the IRIS SSE endpoint (POST /api/agent/run) as one tool — iris_run — that takes a natural-language instruction and returns the final answer. Progress events are forwarded as MCP notifications/progress so host UIs can render a live status bar. When the in-container agent hits a captcha / login / MFA wall, the tool surfaces a vnc_url + container_id so a human can fix it and the caller can retry with the same container.
IRIS is a self-hosted headful-Chrome runtime for AI agents. See the swarmy repo for the manager, worker, and endpoint docs.
Need the agent to hit your local app? This MCP doesn't expose localhost tunneling — use the CLI instead:
swarmy-iris-cli test --expose 3000 --expose 8000 --then "<task>"
The CLI sets up a manager-mediated WS tunnel so the in-container Chrome can reach http://localhost:<port> on your machine natively (CORS unchanged). See help.html → "Test against your local app".
Install
The recommended path is via the FirstToFly marketplace plugin, which bundles this MCP server alongside the matching iris skill and auto-registers it in your host:
# Claude Code
/plugin marketplace add first-to-fly/claude-marketplace
/plugin install first-to-fly@first-to-fly-plugins
# Codex CLI
codex plugin marketplace add first-to-fly/claude-marketplace
codex plugin add first-to-fly@first-to-fly-plugins
You don't need to edit any host config file by hand; the marketplace's .mcp.json does the registration.
For headless or standalone use (CI, scripts, hosts without marketplace support):
npm install -g swarmy-iris-mcp
# or one-shot: npx -y swarmy-iris-mcp
Prerequisites:
- Node.js ≥ 18
- An IRIS manager URL (e.g. https://swarmy.firsttofly.com)
- An IRIS credential (see below)
Configure credentials
Config priority: CLI flags > env vars > ~/.config/swarmy-iris-cli/credentials.
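The precedence above can be sketched as a simple fallback chain. This is a minimal illustration, not the package's actual code: the `IrisConfig` shape and the `resolveConfig` helper are hypothetical, and the example host names are placeholders. Only the `IRIS_URL` and `IRIS_TOKEN` variable names come from this README.

```typescript
// Hypothetical shape: the README names a URL and a token, nothing else.
interface IrisConfig {
  url?: string;
  token?: string;
}

// Resolve each field independently: CLI flags win, then env vars,
// then whatever the credentials file provided.
function resolveConfig(
  flags: IrisConfig,
  env: Record<string, string | undefined>,
  file: IrisConfig,
): IrisConfig {
  return {
    url: flags.url ?? env["IRIS_URL"] ?? file.url,
    token: flags.token ?? env["IRIS_TOKEN"] ?? file.token,
  };
}

// --url was passed as a flag, the token only exists in env and in the file.
const resolved = resolveConfig(
  { url: "https://flags.example.test" },
  { IRIS_URL: "https://env.example.test", IRIS_TOKEN: "swm_from_env" },
  { url: "https://file.example.test", token: "dev_from_file" },
);
console.log(resolved);
```

Here the URL comes from the flag and the token from the environment, so the credentials file is only consulted for fields that neither higher-priority source sets.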
Recommended — pair via the CLI (no manual token paste)
Install swarmy-iris-cli and run the device-pairing flow once:
npm install -g swarmy-iris-cli
swarmy-iris-cli login --url https://swarmy.firsttofly.com
A browser tab opens, you click Approve, and the CLI saves a scoped dev_… token to ~/.config/swarmy-iris-cli/credentials (mode 0600). This MCP server picks up the url + token from that file automatically: no env vars, no CLI flags, no per-host config.
Manual — env vars or flags (for headless setups)
If you can't run the device-pairing flow (CI, container with no browser), generate a long-lived swm_… token in the manager UI at Settings → API Tokens, then pass it via:
IRIS_URL=https://swarmy.firsttofly.com IRIS_TOKEN=swm_xxx swarmy-iris-mcp
# or
swarmy-iris-mcp --url https://swarmy.firsttofly.com --token swm_xxx
Both dev_… and swm_… tokens are accepted on the agent endpoint. dev_… is scoped (agent tasks + profile management); swm_… covers the broader REST API.
Tool reference
iris_run
Run a browsing task via IRIS.
Input:
| field | type | required | description |
|---|---|---|---|
| instruction | string | yes | Natural-language task for the in-container agent. Include full URLs and desired output format. |
| profile | string | conditional | Profile name or id with the saved browser state. Required unless container_id is set. |
| container_id | string | no | Resume an existing container (e.g., after a human fixed a blocked state via VNC). |
| keyframes | boolean | no | Capture a PNG of the running tab at every agent tool boundary (plan/execute/revise/followup/navigate). |
| final_screenshot | boolean | no | Capture one PNG of the running tab right before stop. |
| final_video | boolean | no | Capture a change-driven CDP screencast and JIT-encode an MP4 (H.264 / yuv420p) at end-of-run. Starts when the agent leaves claude.ai/* extension-setup pages. |
| final_video_max_seconds | integer | no | Tail-cap on the screencast ring buffer in seconds. Default 60, max 3600. |
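As a rough illustration of the table above, the input could be modeled and pre-checked on the caller side like this. The `IrisRunInput` interface and `validateIrisRunInput` helper are hypothetical names, not exports of this package, and the lower bound of 1 second is an assumption (the table only states the default and the maximum).

```typescript
// Field names and types mirror the input table; the interface itself is illustrative.
interface IrisRunInput {
  instruction: string;
  profile?: string;
  container_id?: string;
  keyframes?: boolean;
  final_screenshot?: boolean;
  final_video?: boolean;
  final_video_max_seconds?: number; // integer, default 60, max 3600
}

// Return a list of problems; an empty list means the input looks acceptable.
function validateIrisRunInput(input: IrisRunInput): string[] {
  const problems: string[] = [];
  if (input.instruction.trim() === "") {
    problems.push("instruction must not be empty");
  }
  // profile is conditional: required unless an existing container is resumed.
  if (!input.profile && !input.container_id) {
    problems.push("profile is required unless container_id is set");
  }
  const cap = input.final_video_max_seconds;
  // Assumption: values below 1 are rejected; the README only gives the max.
  if (cap !== undefined && (!Number.isInteger(cap) || cap < 1 || cap > 3600)) {
    problems.push("final_video_max_seconds must be an integer between 1 and 3600");
  }
  return problems;
}

console.log(
  validateIrisRunInput({
    instruction: "Open https://example.com and return the page title as plain text",
    profile: "default",
    final_screenshot: true,
  }),
);
```

Checking the `profile` / `container_id` rule before the call avoids a round trip for an input the tool would presumably reject anyway.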
Output: the final answer as plain text. When any capture flag is set, the result text gets a trailing "--- Capture artifacts ---" block with run_id, final_screenshot URL, final_video URL, and the list of keyframe URLs. Fetch each with Authorization: Bearer $IRIS_TOKEN (or ?token=$IRIS_TOKEN query for use in <img>/<video> tags).
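The two ways of authenticating an artifact fetch could look roughly like the sketch below. Both helper names are hypothetical, and the artifact URL is a placeholder: real URLs come from the "--- Capture artifacts ---" block, whose exact layout is not specified here.

```typescript
// Header form: suitable for fetch() or any HTTP client that can set headers.
function artifactAuthHeaders(token: string): Record<string, string> {
  return { Authorization: "Bearer " + token };
}

// Query form: for <img>/<video> tags, which cannot send an Authorization header.
// Simplification: assumes the artifact URL carries no #fragment.
function withTokenQuery(artifactUrl: string, token: string): string {
  const separator = artifactUrl.indexOf("?") === -1 ? "?" : "&";
  return artifactUrl + separator + "token=" + encodeURIComponent(token);
}

// Placeholder URL for illustration only.
const exampleArtifactUrl = "https://iris.example.test/artifact.png";
console.log(artifactAuthHeaders("swm_xxx"));
console.log(withTokenQuery(exampleArtifactUrl, "swm_xxx"));
```

The query form puts the token into the URL, so it may end up in browser history and server logs; the header form is the safer choice wherever headers can be set.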
Progress notifications: ready → 0, progress phase → 10/20/40/90, activity → 50 (rolling message), done → 100.
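A host that wants to reason about these values could model the mapping as below. This is only a sketch: the `IrisEvent` type is hypothetical, and because the phase names are not listed here, the four progress phases are represented by an assumed ordinal `step` (0 to 3) rather than by name.

```typescript
// Hypothetical event model; only the event kinds and numbers come from the README.
type IrisEvent =
  | { type: "ready" }
  | { type: "progress"; step: 0 | 1 | 2 | 3 } // assumed ordinal, phase names unknown
  | { type: "activity"; message: string }
  | { type: "done" };

// Progress values for the four progress phases, in order.
const PHASE_PROGRESS: readonly number[] = [10, 20, 40, 90];

function toProgressValue(event: IrisEvent): number {
  switch (event.type) {
    case "ready":
      return 0;
    case "progress":
      return PHASE_PROGRESS[event.step] as number;
    case "activity":
      return 50; // rolling status message, value stays fixed
    case "done":
      return 100;
  }
}

const exampleEvents: IrisEvent[] = [
  { type: "ready" },
  { type: "progress", step: 1 },
  { type: "activity", message: "clicking the submit button" },
  { type: "done" },
];
console.log(exampleEvents.map(toProgressValue));
```

Because activity events report a fixed 50, the value is not strictly increasing over a run; a status bar that must never move backwards could keep the maximum seen so far.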
Error cases (returned as isError: true results):
- Blocked state (captcha / login / MFA): message carries container_id, vnc_url, resume_hint. Send a human to the VNC URL, then retry with container_id.
- Mid-stream error (container_start_failed, task_failed, etc.): code + message + optional hint / web_ui_url.
- Pre-stream HTTP error (401/403/400/409/503): code + message + onboarding URLs (sign_up_url, token_settings_url, profiles_url, containers_url).
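The blocked-state recovery described in the first error case can be sketched as follows. The `BlockedInfo` and `RunArgs` types and the `buildRetryArgs` helper are hypothetical, and the sketch assumes the caller has already extracted the three fields from the error message (how they are encoded in the message is not specified here). Dropping `profile` on retry is also an assumption, based on the input table marking it as required only when `container_id` is absent.

```typescript
// Fields the blocked-state message is documented to carry.
interface BlockedInfo {
  container_id: string;
  vnc_url: string;
  resume_hint: string;
}

// Subset of the iris_run input relevant to a retry.
interface RunArgs {
  instruction: string;
  profile?: string;
  container_id?: string;
  keyframes?: boolean;
}

// Build the arguments for the retry: same task and options, same container.
function buildRetryArgs(original: RunArgs, blocked: BlockedInfo): RunArgs {
  const retry: RunArgs = { ...original, container_id: blocked.container_id };
  delete retry.profile; // assumption: not needed once container_id is set
  return retry;
}

const firstAttempt: RunArgs = {
  instruction: "Log in to https://example.com and list the open invoices",
  profile: "billing",
  keyframes: true,
};
// Placeholder values standing in for what a blocked result would report.
const blocked: BlockedInfo = {
  container_id: "container-123",
  vnc_url: "https://iris.example.test/vnc/container-123",
  resume_hint: "solve the captcha, then retry",
};

console.log("Ask a human to open:", blocked.vnc_url);
console.log(buildRetryArgs(firstAttempt, blocked));
```

The original arguments are left untouched, so the same `firstAttempt` object can be reused if the retry itself gets blocked again.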
Development
git clone https://github.com/first-to-fly/swarmy
cd swarmy/mcp/swarmy-iris-mcp
npm install
npm run dev -- --url https://... --token <dev_… or swm_…>
Build a release artifact:
npm run build
node dist/index.js --url ... --token ...
The package is ESM-only ("type": "module") and built with tsc. Output goes to dist/.
Links
- IRIS endpoint docs: docs/agent-sse-endpoint.md
- OpenAPI spec: manager/docs/agent-openapi.yaml
- Swarmy project: github.com/first-to-fly/swarmy
- Model Context Protocol: modelcontextprotocol.io
License
MIT
