agex

v0.2.19

Published

22 days ago

**Let AI agents prove their work — with video.**


mati1000

AI agents work in the background. They write code, fix bugs, ship features. But how do they prove their work?

No screenshots. No recordings. No proof. You have to trust the logs or, worse, test everything manually yourself, which turns review into a bottleneck.

agex lets any AI agent (Cursor, Claude, Codex) open a browser, take screenshots, record video, and deliver visual evidence that the work is done. Run it locally. Run it in CI. Ship with proof.

Built on playwright-cli by Microsoft.

Quick Start

npx agex prove "the homepage has a search bar and a sign up button" --url https://github.com

In your PR's CI

npx agex prove-pr --url http://localhost:3000 # your dev server URL; start your app in CI first

Install

npm install -g agex

Commands

`agex prove` - Visual assertion testing

Prove visual assertions on any URL with screenshots and video evidence.

# Is there a dog in Google's logo? (spoiler: no — and the agent catches it)
agex prove "google home page has a dog in the logo" \
  --url "https://www.google.com/?hl=en&gl=us" --agent codex

# Verify GitHub's Code dropdown has HTTPS/SSH/GitHub CLI tabs
agex prove "clicking the Code dropdown opens a panel with HTTPS/SSH/GitHub CLI tabs" \
  --url https://github.com/facebook/react

# Multi-element verification
agex prove "github.com homepage has a search bar, a sign up button, and at least 3 navigation links" \
  --url https://github.com --agent claude

# Complex page structure
agex prove "the wikipedia page for JavaScript has a History section, a Syntax section, and an infobox sidebar" \
  --url https://en.wikipedia.org/wiki/JavaScript --agent claude

# E-commerce flow
agex prove "search for 'mechanical keyboard', filter by 4+ stars, verify the first result shows product image, price, and rating" \
  --url https://amazon.com --agent claude

# Video content
agex prove "youtube homepage shows at least 6 video thumbnails with titles and view counts" \
  --url https://www.youtube.com --agent codex

| Flag | Description | Default |
|------|-------------|---------|
| --agent, -a | Agent to use: cursor, claude, codex | cursor |
| --url, -u | URL to navigate to | - |
| --output, -o | Output directory | ./prove-results |
| --video | Enable video recording | true |
| --screenshots | Enable screenshots | true |
| --model, -m | Model to use | - |
| --timeout, -t | Timeout in ms | 300000 |
| --viewport | Viewport size (WxH) | 1920x1080 |
| --headless | Run headless | true |
| --browser, -b | Browser mode: mcp or cli | mcp |
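To check several assertions against one URL, the command can be wrapped in a small loop. The sketch below is not part of agex: `prove_all` and `AGEX_CMD` are hypothetical names, and it assumes that `agex prove` exits with a non-zero status when an assertion is not proven, which you should confirm for your version. Writing each run to its own `--output` directory is also an assumption, made to keep evidence from separate runs apart.

```shell
# Hypothetical helper, not part of agex.
# ASSUMPTION: `agex prove` exits non-zero when the assertion is not proven.
AGEX_CMD="${AGEX_CMD:-npx agex}"

# prove_all URL ASSERTION...
# Runs one `agex prove` per assertion and prints PASS/FAIL for each.
# Returns 0 only if every assertion passed.
prove_all() {
  local url="$1"
  shift
  local failed=0
  local index=0
  local assertion
  for assertion in "$@"; do
    index=$((index + 1))
    # One output directory per assertion (assumed layout, not an agex rule).
    if $AGEX_CMD prove "$assertion" --url "$url" --output "./prove-results/$index"; then
      echo "PASS: $assertion"
    else
      echo "FAIL: $assertion"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

Usage would look like `prove_all https://github.com "homepage has a search bar" "homepage has a sign up button"`.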

`agex demo` - Record demo videos

Record a narrated browser walkthrough as a video. The agent opens a URL, performs the task, and captures the session.

# Record a demo of selecting a brand design from a menu
agex demo "how to select a brand design from the menu" --url https://example.com

# With subtitles and custom colors (via task text)
agex demo "how to create a new project. use subtitles with white text on red background" \
  --url https://app.example.com --agent cursor

# Non-headless for debugging
agex demo "sign up for a free trial" --url https://example.com --no-headless

| Flag | Description | Default |
|------|-------------|---------|
| --agent, -a | Agent to use: cursor, claude, codex | auto-detected |
| --url, -u | URL to navigate to | - |
| --output, -o | Output directory | ./demo-results |
| --video | Enable video recording | true |
| --screenshots | Enable screenshots | true |
| --model, -m | Model to use | - |
| --timeout, -t | Timeout in ms | 300000 |
| --viewport | Viewport size (WxH) | 1920x1080 |
| --headless | Run headless | true |
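Because subtitle styling is passed through the task text rather than a flag, a tiny helper can keep the wording consistent across demos. This is only a sketch: `subtitle_task` is a hypothetical name, and the phrasing simply mirrors the example above. How the agent interprets the text is up to the agent.

```shell
# Hypothetical helper, not part of agex.
# subtitle_task TASK TEXT_COLOR BACKGROUND_COLOR
# Prints the task text with a subtitle instruction appended,
# using the same wording as the README example.
subtitle_task() {
  printf '%s. use subtitles with %s text on %s background' "$1" "$2" "$3"
}
```

It could then be used as `agex demo "$(subtitle_task "how to create a new project" white red)" --url https://app.example.com`.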

`agex prove-pr` - PR verification in CI

Run it in your CI. It reads the diff, generates hypotheses about what changed, tests them against your running app at the given URL, and reports the results autonomously.

agex prove-pr --base main --url http://localhost:3000
agex prove-pr --base HEAD~3 --agent claude --hypotheses 10 --hint "focus on mobile layout"

| Flag | Description | Default |
|------|-------------|---------|
| --base | Base branch/commit to compare against | auto-detected |
| --agent, -a | Agent to use: cursor, claude, codex | cursor |
| --url, -u | Dev server URL for visual testing | - |
| --output, -o | Output directory | ./prove-pr-results |
| --hypotheses | Number of hypotheses to generate | 5 |
| --model, -m | Model to use | - |
| --viewport | Viewport size (WxH) | 1920x1080 |
| --hint | Additional prompt hint for hypothesis generation | - |
| --timeout, -t | Timeout in ms | 300000 |
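Since `prove-pr` needs a running app, a CI step usually has to start the dev server and wait for it before calling agex. The sketch below shows one way to do that. `wait_until`, `prove_pr_in_ci`, `AGEX_CMD`, and `DEV_CMD` are hypothetical names, `npm run dev` is an assumed start command, and `curl` is assumed to be available on the CI runner.

```shell
# Hypothetical CI helpers, not part of agex.
AGEX_CMD="${AGEX_CMD:-npx agex}"
DEV_CMD="${DEV_CMD:-npm run dev}"   # ASSUMPTION: replace with your app's start command

# wait_until ATTEMPTS DELAY_SECONDS COMMAND...
# Retries COMMAND until it succeeds or ATTEMPTS runs out.
wait_until() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local n=0
  while [ "$n" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# prove_pr_in_ci [URL] [BASE]
# Starts the dev server, waits for it to respond, runs prove-pr, stops the server.
prove_pr_in_ci() {
  local url="${1:-http://localhost:3000}"
  local base="${2:-main}"
  local status=0
  $DEV_CMD &
  local server_pid=$!
  if wait_until 30 2 curl --fail --silent "$url"; then
    $AGEX_CMD prove-pr --base "$base" --url "$url" || status=$?
  else
    echo "dev server did not respond at $url" >&2
    status=1
  fi
  kill "$server_pid" 2>/dev/null
  return "$status"
}
```

A CI job would call `prove_pr_in_ci http://localhost:3000 main` after installing dependencies. Polling the URL rather than sleeping for a fixed time keeps the step fast when the server starts quickly and still fails clearly when it never comes up.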

`agex review` - AI code review

Review code changes using an AI agent.

agex review --base main --agent cursor
agex review --base HEAD~3 --agent claude --hint "focus on security"
agex review --agent codex --hypotheses 3 --no-worktree

| Flag | Description | Default |
|------|-------------|---------|
| --base | Base branch/commit to compare against | auto-detected |
| --agent, -a | Agent to use: cursor, claude, codex | required |
| --model, -m | Model to use | - |
| --worktree | Include worktree changes | true |
| --hypotheses | Number of hypotheses to generate | - |
| --hint | Additional prompt hint | - |
| --timeout, -t | Timeout in ms | 300000 |

`agex run` - Execute AI agent tasks

Run a prompt through any supported AI agent.

agex run "build a landing page" --agent cursor
agex run "refactor this function" --agent claude --mode json
agex run "fix the bug" --approval on-request --timeout 600000

| Flag | Description | Default |
|------|-------------|---------|
| --agent, -a | Agent to use: cursor, claude, codex | auto-detected |
| --model, -m | Model to use | - |
| --mode | Output mode: text, json, debug | text |
| --approval | Approval policy: never, on-request, on-failure, untrusted | - |
| --timeout | Timeout in ms | 300000 |
| --install | Run install command before agent | false |
| --install-command | Custom install command | - |
| --stream | Enable streaming output | true |
| --browser | Enable browser | true |

`agex browse` - Browser automation

Low-level browser automation via playwright-cli.

agex browse open https://example.com
agex browse snapshot -i
agex browse click @e1
agex browse fill @e2 "[email protected]"
agex browse screenshot page.png
agex browse close

| Flag | Description | Default |
|------|-------------|---------|
| --install | Force reinstall browser | false |
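The browse subcommands can be chained in a script. The sketch below wraps open, screenshot, and close so the browser is closed even when the screenshot step fails. `capture_page` and `AGEX_CMD` are hypothetical names, and the sketch assumes each `agex browse` subcommand reports failure through a non-zero exit status.

```shell
# Hypothetical helper, not part of agex.
# ASSUMPTION: `agex browse` subcommands exit non-zero on failure.
AGEX_CMD="${AGEX_CMD:-npx agex}"

# capture_page URL [FILE]
# Opens URL, saves a screenshot to FILE (default page.png), then closes the browser.
# Returns the screenshot step's exit status.
capture_page() {
  local url="$1"
  local file="${2:-page.png}"
  local status=0
  $AGEX_CMD browse open "$url" || return 1
  $AGEX_CMD browse screenshot "$file" || status=$?
  # Always close, so a failed screenshot does not leave a browser running.
  $AGEX_CMD browse close
  return "$status"
}
```

For example, `capture_page https://example.com home.png`.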

Supported Agents

| Agent | Description |
|-------|-------------|
| cursor | Cursor IDE agent |
| claude | Anthropic Claude CLI |
| codex | OpenAI Codex CLI |

Ship with proof.

MIT License
