@devload/pagent
v1.0.0
An agent that acts on real web pages. CLI and AI-powered browser automation with MCP support.
PAGENT
PAGENT (Page AGENT) is an agent that acts on real web pages. It lets you control a browser from the CLI and from AI assistants, combining YAML-based UI testing (powered by Playwright) with a Chrome extension bridge for automating a real browser.
Features
- YAML DSL: Write test scenarios in simple, readable YAML
- Playwright-powered: Fast, reliable browser testing (Chromium only)
- Rich artifacts: Collect HTML, screenshots, console logs, network requests, HAR files, computed styles, and traces
- Parallel execution: Run multiple scenarios concurrently
- Retry support: Automatically retry failed tests
- CI-friendly: JSON output and proper exit codes
Installation
# Install globally from npm
npm install -g @devload/pagent
# Or install locally in your project
npm install @devload/pagent
From Source
# Clone the repository
git clone https://github.com/devload/pagent.git
cd pagent
# Install dependencies and build
npm install
npm run build
Quick Start
# Initialize with example scenarios
pagent init
# Validate scenarios
pagent validate "scenarios/*.yaml"
# Run all scenarios
pagent run "scenarios/*.yaml"
# Run with visible browser
pagent run scenarios/smoke-example.yaml --headed
CLI Commands
pagent init
Initialize PAGENT in the current directory with example scenarios.
pagent init [options]
Options:
--json Output as JSON
--force Overwrite existing files
Creates:
- scenarios/ directory with example YAML files
- artifacts/ directory with .gitignore
- Example scenarios: smoke-example.yaml, hackernews.yaml
pagent run <pattern>
Run YAML test scenarios.
pagent run <pattern> [options]
Arguments:
pattern Glob pattern for scenario files (e.g., "scenarios/*.yaml")
Options:
--headed Run in headed mode (visible browser)
--workers <number> Number of parallel workers (default: 1)
--base-url <url> Override base URL for all scenarios
--timeout <ms> Default timeout in milliseconds
--retries <number> Number of retries for failed tests (default: 0)
--artifact-dir <dir> Directory for artifacts (default: ./artifacts)
--json Output as JSON only
Examples:
# Run single scenario
pagent run scenarios/login.yaml
# Run all scenarios in parallel
pagent run "scenarios/*.yaml" --workers 4
# Run with retries and custom timeout
pagent run "scenarios/*.yaml" --retries 2 --timeout 60000
# CI mode with JSON output
pagent run "scenarios/*.yaml" --json
pagent validate <pattern>
Validate YAML scenario files without running them.
pagent validate <pattern> [options]
Arguments:
pattern Glob pattern for scenario files
Options:
--json Output as JSON
pagent list [dir]
List YAML scenarios in a directory.
pagent list [dir] [options]
Arguments:
dir Directory to search (default: "scenarios")
Options:
--json Output as JSON
YAML DSL Specification
Schema (Version 1)
version: 1 # Required: Schema version
name: my-test-scenario # Required: Unique scenario name
baseURL: https://example.com # Optional: Base URL for relative paths
use: # Optional: Browser configuration
  headless: true # Default: true
  viewport: # Default: { width: 1280, height: 720 }
    width: 1280
    height: 720
  timeoutMs: 30000 # Default: 30000 (30 seconds)
  locale: en-US # Optional: Browser locale
  timezoneId: America/New_York # Optional: Timezone
  userAgent: "..." # Optional: Custom user agent
artifacts: # Optional: Artifact collection settings
  html: true # Capture final page HTML
  screenshot: true # Capture final screenshot
  console: true # Capture console logs
  network: true # Capture network requests
  har: false # Save HAR file (default: false)
  trace: on-first-retry # Trace recording: on|off|on-first-retry|retain-on-failure
  styles: # Compute styles for specific selectors
    - selector: "#myButton"
      computed: ["display", "color", "font-size"]
  networkBodyCapture: # Capture response bodies (default: disabled)
    enabled: false
    maxSizeBytes: 1048576 # Max 1MB per response
    contentTypes: ["text/*", "application/json"]
steps: # Required: List of test steps
  - goto: "/login"
  - fill: { selector: "#email", text: "[email protected]" }
  - click: { selector: "#submit" }
  - expect: { urlContains: "/dashboard" }
Available Steps
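As an overview, the step kinds documented in this section can be summarized as a TypeScript union. This is a hypothetical model for illustration only: the field names mirror the YAML examples, but the type names, the `stepKind` helper, and the rule that a step holds exactly one keyword are assumptions, not PAGENT's published types.

```typescript
// Hypothetical model of the documented step kinds (not PAGENT's own types).
type Step =
  | { goto: string }
  | { click: { selector: string; button?: string; clickCount?: number } }
  | { fill: { selector: string; text: string } }
  | { type: { selector: string; text: string; delayMs?: number } }
  | { press: { key: string; selector?: string } }
  | { waitFor: { selector?: string; state?: string; timeoutMs?: number } }
  | { expect: { visible?: string; hidden?: string; textContains?: string; urlContains?: string } }
  | { snapshot: { name: string; fullPage?: boolean } }
  | { evaluate: { js: string } };

const STEP_KINDS = [
  "goto", "click", "fill", "type", "press",
  "waitFor", "expect", "snapshot", "evaluate",
] as const;
type StepKind = (typeof STEP_KINDS)[number];

// Returns the single keyword used by a step object. Throws when the object
// does not contain exactly one known keyword (an assumed rule).
function stepKind(step: Step): StepKind {
  const keys = Object.keys(step);
  const known = STEP_KINDS.filter((kind) => keys.includes(kind));
  if (keys.length !== 1 || known.length !== 1) {
    throw new Error(`Expected exactly one step keyword, got: ${keys.join(", ")}`);
  }
  return known[0];
}

// The same steps as the end of the schema example, expressed as objects.
const exampleSteps: Step[] = [
  { goto: "/login" },
  { click: { selector: "#submit" } },
  { expect: { urlContains: "/dashboard" } },
];
const exampleKinds = exampleSteps.map(stepKind);
```

A union like this lets an editor or the compiler flag a misspelled keyword before a scenario is ever run.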
Navigation
# Navigate to URL (absolute or relative to baseURL)
- goto: "/login"
- goto: "https://example.com/page"
Interactions
# Click element
- click: { selector: "#button" }
- click: { selector: "button", button: "right", clickCount: 2 }
# Fill input (clears existing content)
- fill: { selector: "#email", text: "[email protected]" }
# Type text (character by character with optional delay)
- type: { selector: "#search", text: "query", delayMs: 50 }
# Press keyboard key
- press: { key: "Enter" }
- press: { key: "Tab", selector: "#input" }
Waiting
# Wait for element
- waitFor: { selector: ".loaded" }
- waitFor: { selector: "#content", timeoutMs: 10000 }
# Wait for page state
- waitFor: { state: "load" }
- waitFor: { state: "domcontentloaded" }
- waitFor: { state: "networkidle", timeoutMs: 15000 }
Assertions
# Check element visibility
- expect: { visible: "#welcome-message" }
- expect: { hidden: ".loading-spinner" }
# Check page content
- expect: { textContains: "Welcome back" }
# Check URL
- expect: { urlContains: "/dashboard" }
Snapshots
# Take screenshot
- snapshot: { name: "after-login" }
- snapshot: { name: "full-page", fullPage: true }
Debug
# Execute JavaScript
- evaluate: { js: "console.log('Debug:', document.title)" }
Output Structure
artifacts/
└── 20251220-143052-abc123/    # Run ID
    ├── summary.json           # Overall run summary
    └── my-scenario/           # Scenario folder
        ├── summary.json       # Scenario summary
        ├── page.html          # Final page HTML
        ├── screenshot.png     # Final screenshot
        ├── console.jsonl      # Console logs (one JSON per line)
        ├── network.jsonl      # Network requests (one JSON per line)
        ├── network.har        # HAR file (if enabled)
        ├── styles.json        # Computed styles
        ├── trace.zip          # Playwright trace
        └── after-login.png    # Named snapshots
summary.json Format
{
  "ok": true,
  "runId": "20251220-143052-abc123",
  "scenario": "my-scenario",
  "filePath": "/path/to/scenario.yaml",
  "startedAt": "2025-12-20T14:30:52.000Z",
  "endedAt": "2025-12-20T14:31:04.000Z",
  "durationMs": 12345,
  "steps": [
    { "i": 1, "type": "goto", "ok": true, "durationMs": 1200 },
    { "i": 2, "type": "fill", "ok": true, "durationMs": 50 },
    { "i": 3, "type": "click", "ok": true, "durationMs": 100 },
    { "i": 4, "type": "expect", "ok": true, "durationMs": 30 }
  ],
  "artifacts": {
    "html": "page.html",
    "screenshot": "screenshot.png",
    "console": "console.jsonl",
    "network": "network.jsonl",
    "snapshots": ["after-login.png"]
  }
}
console.jsonl Format
{"timestamp":"2025-12-20T14:30:53.000Z","type":"log","text":"Page loaded"}
{"timestamp":"2025-12-20T14:30:54.000Z","type":"error","text":"API error","location":{"url":"app.js","lineNumber":42}}
network.jsonl Format
{"timestamp":"2025-12-20T14:30:52.500Z","method":"GET","url":"https://api.example.com/user","resourceType":"fetch","status":200,"statusText":"OK","requestHeaders":{"authorization":"[MASKED]"},"responseHeaders":{"content-type":"application/json"},"timing":{"startTime":1703082652500,"responseEnd":1703082652700,"durationMs":200}}
Exit Codes
- 0: All scenarios passed
- 1: One or more scenarios failed, or validation error
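The exit code follows directly from the per-scenario results. A minimal sketch, assuming a CI wrapper has already parsed each scenario's summary.json: the interfaces model only fields shown in the summary.json example, and `exitCodeFor` and `firstFailedStep` are hypothetical helper names, not part of PAGENT.

```typescript
// Models only the fields that appear in the summary.json example.
interface StepResult {
  i: number;
  type: string;
  ok: boolean;
  durationMs: number;
}

interface ScenarioSummary {
  ok: boolean;
  scenario: string;
  durationMs: number;
  steps: StepResult[];
}

// Hypothetical helper: 0 when every scenario passed, 1 otherwise.
// An empty list yields 0, because no scenario failed.
function exitCodeFor(summaries: ScenarioSummary[]): 0 | 1 {
  return summaries.every((summary) => summary.ok) ? 0 : 1;
}

// Hypothetical helper: the first step that failed in a scenario, if any.
function firstFailedStep(summary: ScenarioSummary): StepResult | undefined {
  return summary.steps.find((step) => !step.ok);
}
```

A wrapper script could print `firstFailedStep` for each failing scenario to point a reviewer at the step that broke the build.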
Security
Sensitive headers are automatically masked in network logs:
- authorization
- cookie
- set-cookie
- x-api-key
- x-auth-token
- x-access-token
Response body capture is disabled by default. When enabled, bodies are limited by size and content type.
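The two rules above can be sketched as a pair of small helpers. This is an illustration under stated assumptions, not PAGENT's actual implementation: the function names are hypothetical, header names are assumed to match case-insensitively, and a pattern such as `text/*` is assumed to match any subtype. The `[MASKED]` placeholder and the config fields come from the network.jsonl and schema examples.

```typescript
// Header names listed in the Security section.
const SENSITIVE_HEADERS = new Set([
  "authorization",
  "cookie",
  "set-cookie",
  "x-api-key",
  "x-auth-token",
  "x-access-token",
]);

// Returns a copy of the headers with sensitive values replaced.
// Case-insensitive matching is an assumption.
function maskHeaders(headers: Record<string, string>): Record<string, string> {
  const masked: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    masked[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "[MASKED]" : value;
  }
  return masked;
}

// Mirrors the networkBodyCapture fields from the schema example.
interface BodyCaptureConfig {
  enabled: boolean;
  maxSizeBytes: number;
  contentTypes: string[];
}

// Decides whether a response body may be stored: capture must be enabled,
// the body must fit the size limit, and the MIME type must match a pattern.
function shouldCaptureBody(
  contentType: string,
  sizeBytes: number,
  config: BodyCaptureConfig,
): boolean {
  if (!config.enabled || sizeBytes > config.maxSizeBytes) return false;
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mime = contentType.split(";")[0].trim().toLowerCase();
  return config.contentTypes.some((pattern) =>
    pattern.endsWith("/*") ? mime.startsWith(pattern.slice(0, -1)) : mime === pattern,
  );
}
```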
Programmatic Usage
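import { runScenario, loadScenario } from '@devload/pagent';
// Load the scenario and stop early if it fails validation
const loadResult = await loadScenario('./scenarios/test.yaml');
if (!loadResult.success) {
  throw new Error(`Invalid scenario: ${JSON.stringify(loadResult.errors)}`);
}
// Run the validated scenario
const result = await runScenario(loadResult.scenario!, './scenarios/test.yaml', {
  headless: true,
  artifactDir: './artifacts',
});
console.log(result.summary.ok ? 'Passed' : 'Failed');
MCP Integration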
The runScenario function can be wrapped as an MCP tool:
import { runScenario, loadScenario } from '@devload/pagent';
// In your MCP server tool handler
async function runUITest(scenarioPath: string) {
  const loadResult = await loadScenario(scenarioPath);
  if (!loadResult.success) {
    return { error: loadResult.errors };
  }
  const result = await runScenario(loadResult.scenario!, scenarioPath);
  return result.summary;
}
Assumptions & Defaults
- Browser: Chromium only (for speed and consistency)
- Timeout: 30 seconds default for all operations
- Viewport: 1280x720 default
- Artifacts: HTML, screenshot, console, network enabled by default
- HAR: Disabled by default (large files)
- Trace: Recorded on first retry by default
- Network body: Not captured by default (security)
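The defaults listed above can be pictured as a merge of built-in values with whatever a scenario overrides. A minimal sketch, assuming shallow merging with a nested merge for the viewport: the type and function names are hypothetical, and only the default values themselves come from this README.

```typescript
interface Viewport {
  width: number;
  height: number;
}

interface UseConfig {
  headless: boolean;
  viewport: Viewport;
  timeoutMs: number;
}

interface ArtifactConfig {
  html: boolean;
  screenshot: boolean;
  console: boolean;
  network: boolean;
  har: boolean;
  trace: "on" | "off" | "on-first-retry" | "retain-on-failure";
}

// Default values as documented in this README.
const DEFAULT_USE: UseConfig = {
  headless: true,
  viewport: { width: 1280, height: 720 },
  timeoutMs: 30000,
};

const DEFAULT_ARTIFACTS: ArtifactConfig = {
  html: true,
  screenshot: true,
  console: true,
  network: true,
  har: false,
  trace: "on-first-retry",
};

// Hypothetical helper: fills in missing browser settings. The viewport is
// merged separately so that overriding only the width keeps the default height.
function resolveUse(
  overrides: Partial<Omit<UseConfig, "viewport">> & { viewport?: Partial<Viewport> } = {},
): UseConfig {
  return {
    ...DEFAULT_USE,
    ...overrides,
    viewport: { ...DEFAULT_USE.viewport, ...overrides.viewport },
  };
}

// Hypothetical helper: fills in missing artifact settings.
function resolveArtifacts(overrides: Partial<ArtifactConfig> = {}): ArtifactConfig {
  return { ...DEFAULT_ARTIFACTS, ...overrides };
}
```

For example, `resolveUse({ timeoutMs: 60000 })` keeps the 1280x720 viewport and headless mode while raising only the timeout.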
Troubleshooting
"Browser executable not found"
npx playwright install chromium
"Timeout waiting for selector"
Increase the timeout in your scenario:
use:
  timeoutMs: 60000
Or for specific steps:
- waitFor: { selector: ".slow-element", timeoutMs: 30000 }
Viewing Traces
npx playwright show-trace artifacts/*/trace.zip
Chrome Extension Bridge
PAGENT includes a Chrome extension bridge for controlling a real browser instance.
Setup
Install the Extension:
# Open Chrome and go to chrome://extensions/
# Enable "Developer mode"
# Click "Load unpacked" and select chrome-extension/ folder
Start the Bridge Server:
pagent bridge start
Connect the Extension:
- Click the PAGENT extension icon
- Click "Connect"
Bridge Commands
# Start bridge server (default port 9222)
pagent bridge start
pagent bridge start --port 9000
# Get current page info
pagent bridge exec getPageInfo
# Capture screenshot
pagent bridge exec screenshot ./page.png
# Get page HTML
pagent bridge exec getDOM
pagent bridge exec getDOM "#main-content"
# Execute JavaScript
pagent bridge exec execute "document.title"
pagent bridge exec execute "document.querySelectorAll('a').length"
# Interact with elements
pagent bridge exec click "#submit-button"
pagent bridge exec fill "#email" "[email protected]"
pagent bridge exec navigate "https://example.com"
# Tab management
pagent bridge exec newTab "https://google.com"
pagent bridge exec listTabs
pagent bridge exec switchTab 123456789
pagent bridge exec closeTab 123456789
# Execute on specific tab (without switching)
pagent bridge exec getPageInfo --tab 123456789
pagent bridge exec screenshot --tab 123456789
# Get captured logs
pagent bridge exec consoleLogs
pagent bridge exec networkLogs
Bridge Architecture
┌──────────────────────────────────────┐
│            Chrome Browser            │
│  ┌────────────────────────────────┐  │
│  │        PAGENT Extension        │  │
│  │       (WebSocket Client)       │  │
│  └─────────────┬──────────────────┘  │
└────────────────┼─────────────────────┘
                 │ ws://localhost:9222
                 ▼
┌──────────────────────────────────────┐
│         pagent bridge start          │
│          (WebSocket Server)          │
└──────────────────────────────────────┘
Use Cases
- Real Browser Testing: Test in actual Chrome with real extensions
- DevTools Integration: Access console logs, network requests in real-time
- Manual + Automated: Combine manual browsing with CLI automation
- AI Integration: Ask AI assistants to control your browser via MCP
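For the "Manual + Automated" use case, a script can assemble the `pagent bridge exec` invocations listed under Bridge Commands. A minimal sketch: `bridgeExecArgs` and `ExecOptions` are hypothetical names, and the helper only builds an argument list from the documented command names and the `--tab` option; it does not talk to the bridge itself.

```typescript
// Options documented for `pagent bridge exec`; only --tab appears in this README.
interface ExecOptions {
  tab?: number;
}

// Hypothetical helper: builds the argument list for a `pagent bridge exec` call.
// Positional arguments come first and --tab is appended last, matching the
// order used in the Bridge Commands examples.
function bridgeExecArgs(
  command: string,
  args: string[] = [],
  options: ExecOptions = {},
): string[] {
  const argv = ["bridge", "exec", command, ...args];
  if (options.tab !== undefined) {
    argv.push("--tab", String(options.tab));
  }
  return argv;
}

// Equivalent to: pagent bridge exec getDOM "#main-content"
const getDomArgs = bridgeExecArgs("getDOM", ["#main-content"]);

// Equivalent to: pagent bridge exec getPageInfo --tab 123456789
const pageInfoArgs = bridgeExecArgs("getPageInfo", [], { tab: 123456789 });
```

Passing the list to a process-spawning API such as Node's `child_process.execFile("pagent", argv)` avoids shell quoting problems with selectors like `#main-content`.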
MCP Integration (Model Context Protocol)
PAGENT can be used as an MCP server, allowing AI assistants like Claude to control your browser.
Setup for Claude Code (CLI) - Recommended
Option 1: Using claude mcp add command
# Add PAGENT as MCP server (one command!)
claude mcp add pagent -s user -- npx -y @devload/pagent
# Or with environment variable
claude mcp add pagent -s user -e BROWSER_BRIDGE_URL=ws://localhost:9222 -- npx -y @devload/pagent
# Verify installation
claude mcp list
Option 2: Manual configuration
Add to your project's .mcp.json:
{
  "mcpServers": {
    "pagent": {
      "command": "npx",
      "args": ["-y", "@devload/pagent"],
      "env": {
        "BROWSER_BRIDGE_URL": "ws://localhost:9222"
      }
    }
  }
}
Or add to global config (~/.claude/settings.json):
{
  "mcpServers": {
    "pagent": {
      "command": "npx",
      "args": ["-y", "@devload/pagent"],
      "env": {
        "BROWSER_BRIDGE_URL": "ws://localhost:9222"
      }
    }
  }
}
Setup for Claude Desktop
Add to your Claude Desktop config (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "pagent": {
      "command": "npx",
      "args": ["-y", "@devload/pagent"],
      "env": {
        "BROWSER_BRIDGE_URL": "ws://localhost:9222"
      }
    }
  }
}
Prerequisites
Before using the MCP tools, complete these steps:
Install the Chrome Extension:
# Load unpacked extension from chrome-extension/ folder
# in chrome://extensions/
Start the Bridge Server:
pagent bridge start
Connect the Extension:
- Click the PAGENT extension icon
- Click "Connect"
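The MCP server needs to reach the bridge started in the steps above, and the configuration examples pass its address through `BROWSER_BRIDGE_URL`. A minimal sketch of resolving that address: `resolveBridgeUrl` is a hypothetical helper, and falling back to `ws://localhost:9222` when the variable is unset is an assumption based on the bridge's documented default port.

```typescript
// Assumed fallback, based on the bridge's documented default port 9222.
const DEFAULT_BRIDGE_URL = "ws://localhost:9222";

// Hypothetical helper: picks the bridge URL from an environment-like object
// and rejects anything that is not a WebSocket URL.
function resolveBridgeUrl(env: Record<string, string | undefined>): string {
  const raw = env.BROWSER_BRIDGE_URL ?? DEFAULT_BRIDGE_URL;
  const url = new URL(raw); // throws on malformed input
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error(`BROWSER_BRIDGE_URL must use ws:// or wss://, got ${url.protocol}`);
  }
  return raw;
}
```

If the bridge was started with `pagent bridge start --port 9000`, the variable would need to be set to `ws://localhost:9000` to match.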
Available MCP Tools
| Tool | Description |
|------|-------------|
| browser_list_tabs | List all open browser tabs |
| browser_get_page_info | Get URL, title, and state of a tab |
| browser_navigate | Navigate to a URL |
| browser_new_tab | Open a new tab |
| browser_close_tab | Close a tab |
| browser_click | Click an element by CSS selector |
| browser_fill | Fill an input field |
| browser_screenshot | Capture a screenshot |
| browser_get_dom | Get page HTML content |
| browser_console_logs | Get browser console logs |
| browser_network_logs | Get network request logs |
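MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. A minimal sketch of building such a request for the tools in the table: the `toolCallRequest` helper is hypothetical, and the `url` argument name is an assumption, since the table does not list tool parameters.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Tool names taken from the table above.
const TOOL_NAMES = [
  "browser_list_tabs",
  "browser_get_page_info",
  "browser_navigate",
  "browser_new_tab",
  "browser_close_tab",
  "browser_click",
  "browser_fill",
  "browser_screenshot",
  "browser_get_dom",
  "browser_console_logs",
  "browser_network_logs",
] as const;
type ToolName = (typeof TOOL_NAMES)[number];

// Hypothetical helper: wraps a tool name and its arguments in a request object.
function toolCallRequest(
  id: number,
  name: ToolName,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The `url` argument name is an assumption made for illustration.
const navigateRequest = toolCallRequest(1, "browser_navigate", { url: "https://example.com" });
```

In practice the AI assistant's MCP client builds these requests itself; the sketch only shows what travels over the wire when Claude decides to call a tool.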
Example Usage with Claude
Once connected, you can ask Claude to:
- "Open google.com and search for 'Anthropic Claude'"
- "Take a screenshot of the current page"
- "Fill in the login form with my email"
- "Click the submit button"
- "Get all the links on this page"
License
MIT
