web-task-api
v0.2.4
Published
Browser-task runtime for MCP and HTTP automation with runs, sessions, and recipes.
Downloads
428
Web Task API
web-task-api is a generalized browser-task runtime for projects that want to treat websites like programmable APIs.
For MCP discovery and host UI, the human-facing title should be Web Task. The package and registry IDs stay web-task-api for compatibility because the same package ships both the HTTP API and the MCP server.[^2]
It exposes one API for:
- starting from a URL + goal
- letting an agent drive a real browser
- validating structured output against a schema
- reusing persistent browser profiles and promoted recipes
- storing traces and artifacts for replay/debugging
The same runtime now ships in two surfaces:
- HTTP API for application-to-application integration
- MCP server for Claude Code, OpenCode, and other MCP clients[^1]
Why this exists
Instead of building one brittle adapter per website, this project uses an agent-first browser runtime:
- default path: goal-driven browser control
- optimization path: reusable recipes for common flows
- escape hatch: login profiles and artifacts for debugging failures
That gives a more future-proof foundation for “API for any site” style automation.
Implemented MVP
- Fastify HTTP API
- stdio MCP server
- Playwright browser runtime
- CLIProxyAPI-backed planner for freeform browser control
- Optional OpenCode SDK planner adapter for environments that already use OpenCode well
- Auto planner mode that falls back to your existing local OpenCode auth/runtime when direct CLIProxy credentials are not wired yet
- Mock agent for local deterministic demos/tests
- Recipe registry and matching
- Persistent browser profile reuse
- Run artifacts and step traces
- Local demo and end-to-end tests
Initial workflow targets
- generic search/form workflows
- Dexscreener token/pair reading starter recipe
- GMGN token/wallet reading starter recipe
For Dexscreener/GMGN, treat the shipped recipes as starter recipes, not guaranteed turnkey integrations yet. A warmed persistent browser source is often required because fresh headless sessions can hit Cloudflare or similar anti-bot checks. The runtime now fails fast for these protected-site recipes unless you provide one of:
- request.profile
- BROWSER_USER_DATA_DIR
- sessionId for a warmed session that already preserves browser storage across tasks
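The fail-fast rule above can be pictured as a small pre-flight check. This is a minimal sketch, not the runtime's actual code: `TaskRequestLike` and `findWarmSource` are hypothetical names, the order in which sources are checked is an assumption, and the check cannot tell whether a session has really been warmed.

```typescript
// Hypothetical request shape: only the fields named in this README are modeled.
interface TaskRequestLike {
  profile?: string;
  sessionId?: string;
}

type Env = Record<string, string | undefined>;

// Returns the first continuity source found, or null when the task would hit
// a protected site with a fresh browser. The precedence order is an assumption.
function findWarmSource(request: TaskRequestLike, env: Env): string | null {
  if (request.profile) return "request.profile";
  if (env.BROWSER_USER_DATA_DIR) return "BROWSER_USER_DATA_DIR";
  if (request.sessionId) return "sessionId";
  return null;
}

console.log(findWarmSource({ profile: "my-profile" }, {}));
console.log(findWarmSource({}, {}));
```

A caller could run a check like this before submitting a Dexscreener or GMGN task and prompt for profile warmup when it returns null.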
Quick start
Install dependencies:
npm install
npm run playwright:install
Start the HTTP API:
npm run dev
Or start the MCP server:
npm run dev:mcp
Run the demo flow:
npm run demo
MCP
The package now exposes the MCP server binary directly:
npx -y web-task-api
That launches the stdio MCP server.
The HTTP runtime remains available separately:
npx -y -p web-task-api web-task-api-http
MCP tools
- webtask_run: start a new browser task from a goal and optional URL
- webtask_get_task: inspect one persisted task run
- webtask_list_recipes: list reusable starter recipes before a run
- webtask_create_session: create continuity for related tasks
- webtask_list_sessions: list saved continuity sessions
- webtask_get_session: inspect one saved session and its recent history
- webtask_update_session: update session metadata like name, notes, or defaults
- webtask_health: verify the runtime is alive
Tool-selection guidance follows MCP best practice: use human-readable titles/descriptions, make the “when should I use this tool?” boundary explicit, and publish accurate behavior hints instead of vague marketing copy.[^3]
MCP config examples
- Claude Code: examples/claude.mcp.json
- OpenCode: examples/opencode.json
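For hosts that read an `mcpServers` map with `command` and `args` entries, a stdio registration might look roughly like the object below. This is a sketch only and not the contents of the shipped example files: the server key `web-task` is an illustrative name, and other hosts may expect a different layout, so the files under examples/ remain the reference.

```typescript
// "web-task" is an illustrative server key, not taken from the package.
// The command and args mirror the `npx -y web-task-api` invocation above.
const mcpConfig = {
  mcpServers: {
    "web-task": {
      command: "npx",
      args: ["-y", "web-task-api"],
    },
  },
};

// Print the JSON that would go into the host's MCP config file.
console.log(JSON.stringify(mcpConfig, null, 2));
```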
API
POST /v1/tasks/run
Runs a browser task synchronously and returns structured results.
Example request is in examples/demo-task.json.
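For callers that do not use the bundled client, the request can be assembled by hand. The sketch below assumes the HTTP body uses the same field names as the bundled client's runTask arguments (goal, startUrl, sessionId, agent) and that the server listens on http://127.0.0.1:4317; `RunTaskBody` and `buildRunTaskRequest` are hypothetical names, and examples/demo-task.json is the authoritative request shape.

```typescript
// Assumed body shape, modeled on the fields this README shows elsewhere.
interface RunTaskBody {
  goal: string;
  startUrl?: string;
  sessionId?: string;
  profile?: string;
  agent?: { kind: "auto" | "cliproxy" | "opencode" };
}

// Builds the URL and fetch options for POST /v1/tasks/run without sending anything.
function buildRunTaskRequest(baseUrl: string, body: RunTaskBody) {
  return {
    url: new URL("/v1/tasks/run", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const runRequest = buildRunTaskRequest("http://127.0.0.1:4317", {
  goal: "Extract token name and price",
  startUrl: "https://example.com",
  agent: { kind: "auto" },
});

console.log(runRequest.url);
// With a server running, send it with: await fetch(runRequest.url, runRequest.init)
```

Keeping the builder separate from the network call makes the payload easy to log or unit-test before it reaches a real browser run.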
GET /v1/tasks/:taskId
Returns the persisted run record with step trace and artifact paths.
GET /v1/recipes
Lists registered recipes.
POST /v1/sessions
Creates a reusable session for connected tasks. Sessions can carry:
- guest vs profile mode
- default start URL
- default planner config
- notes
- compact task history
GET /v1/sessions
Lists saved sessions.
GET /v1/sessions/:sessionId
Returns session metadata and recent task history.
PATCH /v1/sessions/:sessionId
Updates session metadata like notes, default start URL, or the bound profile for an existing profile-mode session. Guest sessions cannot be rebound into named profiles by patch.
GET /health
Basic health endpoint.
TypeScript client
Applications can call the HTTP API through the bundled client:
import { WebTaskApiClient } from "web-task-api"
const client = new WebTaskApiClient({ baseUrl: "http://127.0.0.1:4317" })
const session = await client.createSession({
  name: "axiom trader",
  mode: "profile",
  profile: "axiom",
  notes: "Authenticated Axiom trading session"
})
const result = await client.runTask({
  goal: "Extract token name and price",
  startUrl: "https://example.com",
  sessionId: session.id,
  agent: { kind: "auto" },
})
Connected tasks with sessions
Sessions let related web tasks share:
- browser/profile identity
- guest-session cookies and local storage across tasks
- default start URL
- planner defaults
- recent task context
Example pattern:
- Create session for axiom profile
- Run login/manual warmup task once
- Run later research/action tasks with the same sessionId
- Inspect session history to see what the agent already found
Guest sessions also work: create a mode: "guest" session and repeated tasks will preserve browser storage between runs under that session ID.
For protected recipes, that guest session still needs to be warmed first before you rely on it as a continuity source.
Browser profiles
To create a reusable login profile:
npm run profile:login -- --id my-profile --url https://example.com/login
This opens a real persistent browser profile. Log in manually or solve bot challenges, then press Enter in the terminal. The runtime saves a reusable Chromium user-data directory at <data-root>/profiles/<id>/user-data-dir, and later tasks can use "profile": "my-profile".
This matters for sites like Dexscreener or GMGN that may block fresh headless sessions behind Cloudflare or similar anti-bot checks.
If you want the runtime to behave as closely as possible to your normal local Chrome, you can also point it at an existing browser profile:
BROWSER_USER_DATA_DIR=/path/to/your/chrome/profile
That is the closest match to “it works in my Chrome already”.
Runtime storage roots
By default the runtime keeps mutable data out of the ambient working directory:
- Linux: ~/.local/share/web-task-api
- macOS: ~/Library/Application Support/web-task-api
- Windows: %LOCALAPPDATA%\web-task-api
Under that data root the runtime writes:
- profiles/<id>/user-data-dir
- runs/<taskId>/...
- sessions/<sessionId>.json
Bundled starter recipes are read from the installed package's recipes/ directory, not from the current shell cwd.
Useful overrides:
- WEB_TASK_API_DATA_DIR: custom mutable data root
- WEB_TASK_API_RECIPES_DIR: custom recipes directory
- WEB_TASK_API_TEMP_DIR: custom temp root used when the incoming temp env points at your home/cwd
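The documented defaults and the WEB_TASK_API_DATA_DIR override can be summarized as a small resolver. This is an illustrative sketch of the rules listed above, not the runtime's own resolution code: `resolveDataRoot` and `profileUserDataDir` are hypothetical helpers, and the Windows fallback to `<home>\AppData\Local` when LOCALAPPDATA is unset is an assumption.

```typescript
type Platform = "linux" | "darwin" | "win32";

interface RootInputs {
  platform: Platform;
  home: string; // the user's home directory
  env: Record<string, string | undefined>;
}

// Mirrors the documented defaults; an explicit override always wins.
function resolveDataRoot({ platform, home, env }: RootInputs): string {
  if (env.WEB_TASK_API_DATA_DIR) return env.WEB_TASK_API_DATA_DIR;
  switch (platform) {
    case "linux":
      return `${home}/.local/share/web-task-api`;
    case "darwin":
      return `${home}/Library/Application Support/web-task-api`;
    case "win32":
      // Assumption: fall back to the conventional location if LOCALAPPDATA is unset.
      return `${env.LOCALAPPDATA ?? `${home}\\AppData\\Local`}\\web-task-api`;
  }
}

// Location of a saved profile under a given data root.
function profileUserDataDir(root: string, id: string, sep = "/"): string {
  return [root, "profiles", id, "user-data-dir"].join(sep);
}

const root = resolveDataRoot({ platform: "linux", home: "/home/ada", env: {} });
console.log(profileUserDataDir(root, "my-profile"));
```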
Planner backends
Recommended: CLIProxyAPI
This is the default non-mock path to avoid tying the system too tightly to OpenCode.
CLIProxyAPI is treated as a multi-provider router, not a single-provider API key wrapper. You can point this product at any model alias/provider path exposed by your CLIProxy setup.
Useful environment variables:
- CLIPROXY_BASE_URL: default http://127.0.0.1:8317/v1
- CLIPROXY_AUTH_TOKEN: optional client token if your CLIProxy instance requires one
- CLIPROXY_MODEL: planner model alias/name exposed by your proxy, for example whatever provider/model mapping you configured there
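As a rough illustration of how these variables combine, the sketch below reads them from an environment map and applies the documented default base URL. `CliproxyConfig` and `readCliproxyConfig` are hypothetical names, and `my-planner-alias` is a placeholder rather than a real model alias.

```typescript
interface CliproxyConfig {
  baseUrl: string;
  authToken?: string; // only needed if the CLIProxy instance requires a client token
  model?: string; // planner model alias exposed by the proxy
}

// Reads the three documented variables; only the base URL has a default.
function readCliproxyConfig(env: Record<string, string | undefined>): CliproxyConfig {
  return {
    baseUrl: env.CLIPROXY_BASE_URL ?? "http://127.0.0.1:8317/v1",
    authToken: env.CLIPROXY_AUTH_TOKEN,
    model: env.CLIPROXY_MODEL,
  };
}

// "my-planner-alias" is a placeholder value for illustration.
const cliproxyConfig = readCliproxyConfig({ CLIPROXY_MODEL: "my-planner-alias" });
console.log(cliproxyConfig.baseUrl, cliproxyConfig.model);
```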
Example:
{
  "agent": {
    "kind": "cliproxy"
  }
}
Easiest local path right now: auto
If your GPT/OAuth is already working through local OpenCode, use:
{
  "agent": {
    "kind": "auto"
  }
}
auto probes CLIProxy first and uses it when reachable/authenticated and a planner model is configured; otherwise it falls back to OpenCode so the product can still use your existing local auth/runtime. This path is verified locally against the fixture flow; for real sites, treat it as the recommended runtime path, not a guarantee that every protected site will work without profile warmup.
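The fallback rule can be read as a single decision over the probe result. The sketch below is one interpretation of the description above, not the runtime's implementation: `CliproxyProbe` and `chooseAutoPlanner` are hypothetical names, and it assumes all three conditions must hold before CLIProxy is chosen.

```typescript
// Hypothetical probe result: one flag per condition named in the description.
interface CliproxyProbe {
  reachable: boolean;
  authenticated: boolean;
  modelConfigured: boolean;
}

type PlannerKind = "cliproxy" | "opencode";

// CLIProxy is used only when every condition holds; otherwise fall back to OpenCode.
function chooseAutoPlanner(probe: CliproxyProbe): PlannerKind {
  const usable = probe.reachable && probe.authenticated && probe.modelConfigured;
  return usable ? "cliproxy" : "opencode";
}

console.log(chooseAutoPlanner({ reachable: true, authenticated: true, modelConfigured: true }));
console.log(chooseAutoPlanner({ reachable: true, authenticated: true, modelConfigured: false }));
```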
Optional: OpenCode
If you already run OpenCode headless and want to reuse that stack, the project also supports an OpenCode planner adapter.
Useful variables:
- OPENCODE_BASE_URL
- OPENCODE_MODEL
Then use:
{
  "agent": {
    "kind": "opencode"
  }
}
Main files
- docs/design.md: architecture, decisions, and implementation plan
- docs/releasing.md: tag-driven release flow and MCP registry packaging notes
- src/: server, runtime, agent, browser, and storage code
- tests/: end-to-end verification with a local fixture site
- scripts/: demo runner and profile bootstrap
References
[^1]: docs/design.md for the detailed system design, tradeoffs, and roadmap.
[^2]: server.json is the MCP registry metadata source of truth; package.json carries the npm package and executable metadata.
[^3]: Model Context Protocol, "Tools" specification and tool-annotation guidance — titles, descriptions, JSON Schema field descriptions, and accurate hints improve host UX and tool selection.
