agex
v0.2.12
**Let AI agents prove their work — with video.**
AI agents work in the background. They write code, fix bugs, ship features. But how do they prove their work?
No screenshots. No recordings. No proof. You just have to trust the logs — or worse, manually test it yourself. A bottleneck.
agex lets any AI agent (Cursor, Claude, Codex) open a browser, take screenshots, record video, and deliver visual evidence that the work is done. Run it locally. Run it in CI. Ship with proof.
Built on agent-browser by Vercel.
Quick Start
npx agex prove "the homepage has a search bar and a sign up button" --url https://github.com
On your PR CI
npx agex prove-pr --url http://localhost:3000 # your local dev URL; start your app in CI first
Install
npm install -g agex
Commands
agex prove - Visual assertion testing
Prove visual assertions on any URL with screenshots and video evidence.
# Is there a dog in Google's logo? (spoiler: no — and the agent catches it)
agex prove "google home page has a dog in the logo" \
--url "https://www.google.com/?hl=en&gl=us" --agent codex
# Verify GitHub's Code dropdown has HTTPS/SSH/GitHub CLI tabs
agex prove "clicking the Code dropdown opens a panel with HTTPS/SSH/GitHub CLI tabs" \
--url https://github.com/facebook/react
# Multi-element verification
agex prove "github.com homepage has a search bar, a sign up button, and at least 3 navigation links" \
--url https://github.com --agent claude
# Complex page structure
agex prove "the wikipedia page for JavaScript has a History section, a Syntax section, and an infobox sidebar" \
--url https://en.wikipedia.org/wiki/JavaScript --agent claude
# E-commerce flow
agex prove "search for 'mechanical keyboard', filter by 4+ stars, verify the first result shows product image, price, and rating" \
--url https://amazon.com --agent claude
# Video content
agex prove "youtube homepage shows at least 6 video thumbnails with titles and view counts" \
--url https://www.youtube.com --agent codex
| Flag | Description | Default |
|------|-------------|---------|
| --agent, -a | Agent to use: cursor, claude, codex | cursor |
| --url, -u | URL to navigate to | - |
| --output, -o | Output directory | ./prove-results |
| --video | Enable video recording | true |
| --screenshots | Enable screenshots | true |
| --model, -m | Model to use | - |
| --timeout, -t | Timeout in ms | 300000 |
| --viewport | Viewport size (WxH) | 1920x1080 |
| --headless | Run headless | true |
| --browser, -b | Browser mode: mcp or cli | mcp |
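When several assertions target the same URL, a small wrapper can run them one by one and keep each run's evidence in its own directory. The following is a minimal sketch, not part of agex: `prove_all` and the `AGEX` override variable are hypothetical names, and it assumes that `agex prove` exits non-zero when an assertion is not proven, which the README does not state.

```shell
# AGEX is a hypothetical override (not an agex feature) so the command can be
# swapped out, e.g. AGEX=echo for a dry run.
AGEX="${AGEX:-npx agex}"

# Run every assertion in a file (one per line) against a single URL.
# Assumption: `agex prove` returns a non-zero exit code on a failed assertion.
prove_all() {
  local url="$1" file="$2" n=0 failed=0 assertion
  while IFS= read -r assertion; do
    [ -n "$assertion" ] || continue   # skip blank lines
    n=$((n + 1))
    # </dev/null keeps the command from consuming the assertions file.
    $AGEX prove "$assertion" --url "$url" --output "./prove-results/$n" </dev/null \
      || failed=$((failed + 1))
  done < "$file"
  echo "$failed of $n assertions failed"
  [ "$failed" -eq 0 ]
}
```

Calling `prove_all https://github.com assertions.txt` would then leave evidence under `./prove-results/1`, `./prove-results/2`, and so on, using only the `--url` and `--output` flags from the table above.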
agex demo - Record demo videos
Record a narrated browser walkthrough as a video. The agent opens a URL, performs the task, and captures the session.
# Record a demo of selecting a brand design from a menu
agex demo "how to select a brand design from the menu" --url https://example.com
# With subtitles and custom colors (via task text)
agex demo "how to create a new project. use subtitles with white text on red background" \
--url https://app.example.com --agent cursor
# Non-headless for debugging
agex demo "sign up for a free trial" --url https://example.com --no-headless
| Flag | Description | Default |
|------|-------------|---------|
| --agent, -a | Agent to use: cursor, claude, codex | auto-detected |
| --url, -u | URL to navigate to | - |
| --output, -o | Output directory | ./demo-results |
| --video | Enable video recording | true |
| --screenshots | Enable screenshots | true |
| --model, -m | Model to use | - |
| --timeout, -t | Timeout in ms | 300000 |
| --viewport | Viewport size (WxH) | 1920x1080 |
| --headless | Run headless | true |
agex prove-pr - PR verification in CI
Run it in your CI. It reads the diff, thinks, acts, and reports autonomously.
agex prove-pr --base main --url http://localhost:3000
agex prove-pr --base HEAD~3 --agent claude --hypotheses 10 --hint "focus on mobile layout"
| Flag | Description | Default |
|------|-------------|---------|
| --base | Base branch/commit to compare against | auto-detected |
| --agent, -a | Agent to use: cursor, claude, codex | cursor |
| --url, -u | Dev server URL for visual testing | - |
| --output, -o | Output directory | ./prove-pr-results |
| --hypotheses | Number of hypotheses to generate | 5 |
| --model, -m | Model to use | - |
| --viewport | Viewport size (WxH) | 1920x1080 |
| --hint | Additional prompt hint for hypothesis generation | - |
| --timeout, -t | Timeout in ms | 300000 |
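Since `prove-pr` tests against a running dev server, a CI job needs the app to be reachable before the command starts. One way to sequence that is sketched below. The helper names `wait_for_url` and `prove_pr_when_ready` and the `AGEX` override are hypothetical, and polling with `curl` is an assumption about the CI environment rather than something agex provides.

```shell
# AGEX is a hypothetical override (not an agex feature), e.g. AGEX=echo for a dry run.
AGEX="${AGEX:-npx agex}"

# Poll a URL once per second until it responds or the attempts run out.
wait_for_url() {
  local url="$1" attempts="${2:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Wait for the dev server, then run prove-pr against it.
prove_pr_when_ready() {
  local url="$1" base="$2" attempts="${3:-30}"
  if ! wait_for_url "$url" "$attempts"; then
    echo "dev server not reachable: $url" >&2
    return 1
  fi
  $AGEX prove-pr --base "$base" --url "$url"
}
```

In a CI step this might be used as `npm run dev & prove_pr_when_ready http://localhost:3000 main`, where `npm run dev` stands in for whatever starts your app.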
agex review - AI code review
Review code changes using an AI agent.
agex review --base main --agent cursor
agex review --base HEAD~3 --agent claude --hint "focus on security"
agex review --agent codex --hypotheses 3 --no-worktree
| Flag | Description | Default |
|------|-------------|---------|
| --base | Base branch/commit to compare against | auto-detected |
| --agent, -a | Agent to use: cursor, claude, codex | required |
| --model, -m | Model to use | - |
| --worktree | Include worktree changes | true |
| --hypotheses | Number of hypotheses to generate | - |
| --hint | Additional prompt hint | - |
| --timeout, -t | Timeout in ms | 300000 |
agex run - Execute AI agent tasks
Run a prompt through any supported AI agent.
agex run "build a landing page" --agent cursor
agex run "refactor this function" --agent claude --mode json
agex run "fix the bug" --approval on-request --timeout 600000
| Flag | Description | Default |
|------|-------------|---------|
| --agent, -a | Agent to use: cursor, claude, codex | auto-detected |
| --model, -m | Model to use | - |
| --mode | Output mode: text, json, debug | text |
| --approval | Approval policy: never, on-request, on-failure, untrusted | - |
| --timeout | Timeout in ms | 300000 |
| --install | Run install command before agent | false |
| --install-command | Custom install command | - |
| --stream | Enable streaming output | true |
| --browser | Enable browser | true |
agex browse - Browser automation
Low-level browser automation via agent-browser.
agex browse open https://example.com
agex browse snapshot -i
agex browse click @e1
agex browse fill @e2 "user@example.com"
agex browse screenshot page.png
agex browse close
| Flag | Description | Default |
|------|-------------|---------|
| --install | Force reinstall browser | false |
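The `browse` subcommands above can be chained in a script. The sketch below opens a page, takes a snapshot and a screenshot, and closes the browser even if a middle step fails. `capture_page` and the `AGEX` override are hypothetical names, and the script assumes each `agex browse` subcommand signals failure through a non-zero exit code, which the README does not confirm.

```shell
# AGEX is a hypothetical override (not an agex feature), e.g. AGEX=echo for a dry run.
AGEX="${AGEX:-npx agex}"

# Open a URL, snapshot it, save a screenshot, and always close the browser.
# Assumption: each browse subcommand exits non-zero on failure.
capture_page() {
  local url="$1" shot="$2" status=0
  $AGEX browse open "$url" || return 1
  # Remember a failure in either step, but still close the browser afterwards.
  { $AGEX browse snapshot -i && $AGEX browse screenshot "$shot"; } || status=1
  $AGEX browse close
  return "$status"
}
```

For example, `capture_page https://example.com page.png` runs the same open, snapshot, screenshot, close sequence as the commands listed above.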
Supported Agents
| Agent | Description |
|-------|-------------|
| cursor | Cursor IDE agent |
| claude | Anthropic Claude CLI |
| codex | OpenAI Codex CLI |
Ship with proof.
MIT License
