@iflow-mcp/alibaba-page-agent
v1.7.0
Published
MCP server for controlling the browser via Page Agent extension
Readme
@page-agent/mcp
MCP server that lets AI agent clients (Claude Desktop, Copilot, etc.) control your browser through the Page Agent extension.
Prerequisites
- Node.js >= 20
- Page Agent Extension installed in Chrome
- An LLM API key (OpenAI-compatible)
Installation
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"page-agent": {
"command": "npx",
"args": ["-y", "@page-agent/mcp"],
"env": {
"LLM_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"LLM_API_KEY": "sk-xxx",
"LLM_MODEL_NAME": "qwen3.5-plus"
}
}
}
}Cursor / Copilot
Same format — add the config to the MCP settings of your client.
MCP Tools
| Tool | Input | Description |
| -------------- | ------------------ | ---------------------------------------------------- |
| execute_task | { task: string } | Execute a browser task in natural language. Blocking. |
| get_status | — | Returns { connected, busy } |
| stop_task | — | Stop the currently running task. |
Environment Variables
| Variable | Default | Description |
| ---------------- | ------- | --------------------- |
| LLM_BASE_URL | — | LLM API base URL |
| LLM_API_KEY | — | LLM API key |
| LLM_MODEL_NAME | — | Model name |
| PORT | 38401 | HTTP + WebSocket port |
How It Works
┌──────────────┐ stdio ┌──────────────────┐ WebSocket ┌──────────────┐
│ Claude / │◄────────►│ @page-agent/mcp │◄────────────►│ Hub tab │
│ Copilot │ (MCP) │ (Node.js) │ (localhost) │ (extension) │
└──────────────┘ └──────────────────┘ └──────┬───────┘
│ │
│ HTTP │ useAgent
▼ ▼
┌──────────────────┐ ┌──────────────┐
│ Launcher page │ │ MultiPage │
│ (localhost:PORT) │ │ Agent │
└──────────────────┘ └──────────────┘- Agent client starts the MCP server via stdio (
npx @page-agent/mcp). - Server starts HTTP + WS on
localhost:PORT, opens the launcher page in browser. - Launcher page triggers the extension to open a hub tab (
hub.html?ws=PORT). - Hub connects to the WS server. MCP tools now proxy tasks to the hub.
The hub tab speaks a generic WebSocket protocol (defined in hub-ws.ts in the extension package) and has no knowledge of MCP. See the hub's protocol docs for message format details.
Architecture
Pure JS ESM, no build step. Source files are the published artifacts.
src/
├── index.js # CLI entry: MCP server (stdio) + opens launcher
├── hub-bridge.js # HTTP server + WebSocket bridge to hub tab
└── launcher.html # Bootstrap page: detects extension, triggers hub openDev
npm run build:libs
npm run dev:ext
npx @modelcontextprotocol/inspector node packages/mcp/src/index.js