@goobz22/claude-runner

v1.0.0

Published

2 months ago

Autonomous browser testing framework powered by Claude Code. Sweeps your web app route-by-route using a Chrome audit extension, reports results, and keeps going unattended.


goobz22

testing browser-testing claude-code chrome-extension autonomous qa e2e audit ai-testing unattended

Claude Runner

Autonomous browser testing powered by Claude Code. Sweeps your web app route-by-route using a Chrome audit extension, reports results, and keeps going — unattended.

You go to work. Claude tests your app. You come back to a full report.

How It Works

bun run start
    |
    v
[wrapper] reads runner.config.ts → gets your routes
    |
    v
for each route:
    |
    +→ spawns: claude -p --chrome --verbose
    |     |
    |     +→ navigates to the route
    |     +→ handles auth (clicks account picker if needed)
    |     +→ triggers your audit extension's Run button via JS
    |     +→ waits for tests to complete
    |     +→ reads results from localStorage
    |     +→ prints summary + writes last-result.json
    |
    +→ wrapper records result, advances to next route
    +→ sends email notification on milestones
    +→ handles crashes, restarts dev server / Chrome if needed
    |
    v
[summary.json] — full results for every route
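
The flow above can be sketched as a small loop. This is a minimal illustration, not the code in wrapper.ts: `sweep`, `RunRoute`, and `RouteResult` are hypothetical names, and `runRoute` stands in for spawning `claude -p --chrome --verbose` and reading last-result.json. It assumes a crash of the spawned process is recorded as a skipped route, as described under How It Handles Failures.

```typescript
type RouteStatus = "passed" | "failed" | "skipped";

interface RouteResult {
  status: RouteStatus;
  passCount: number;
  failCount: number;
}

// Section name -> routes, same shape as `routes` in runner.config.ts.
type Routes = Record<string, string[]>;

// Hypothetical stand-in for one Claude run against one route.
type RunRoute = (section: string, route: string) => RouteResult;

function sweep(routes: Routes, runRoute: RunRoute): Record<string, RouteResult> {
  const results: Record<string, RouteResult> = {};
  for (const [section, paths] of Object.entries(routes)) {
    for (const route of paths) {
      try {
        results[route] = runRoute(section, route);
      } catch {
        // Assumption: a crashed run is recorded as skipped and the sweep moves on.
        results[route] = { status: "skipped", passCount: 0, failCount: 0 };
      }
    }
  }
  return results;
}
```

Injecting `runRoute` keeps the loop testable without a browser; a real wrapper would also persist progress after each route so the sweep can resume.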

Requirements

Bun (runtime)
Claude Code CLI with a Max, Pro, or Team account
Claude in Chrome extension (v1.0.36+)
Google Chrome or Microsoft Edge
A Chrome audit extension for your app (scaffold provided)

Quick Start

1. Install

git clone https://github.com/mkgoluba/claude-runner.git
cd claude-runner
bun install

2. Create your audit extension

Copy extension-scaffold/ and customize it for your app:

Edit the ROUTES object to match your app's routes
Edit PAGE_TESTS to define what to check on each page
Update manifest.json with your app's URL pattern
Load it in Chrome: chrome://extensions → Developer mode → Load unpacked

The scaffold gives you a working panel with verifyElement, verifyText, countElements, and clickButton test actions. Add more for your app's needs.

3. Configure

Create runner.config.ts in your project root:

import type { SweepConfig } from 'claude-runner/src/types';

const config: SweepConfig = {
  appUrl: 'http://localhost:3000',
  appName: 'My App',
  projectDir: process.cwd(),

  routes: {
    public: ['/', '/about'],
    dashboard: ['/dashboard', '/dashboard/settings'],
    admin: ['/admin', '/admin/users'],
  },

  auditPanel: {
    panelId: 'audit-panel',
    modeSelectId: 'audit-test-mode',
    pageSelectId: 'audit-page-select',
    runButtonId: 'audit-btn-run',
    abortButtonId: 'audit-btn-abort',
    logAreaId: 'audit-log',
    resultsId: 'audit-results-dash',
    trackerStorageKey: 'audit-tracker',
  },

  auth: {
    authPaths: ['/login', '/auth'],
    accountMap: {
      admin: 'ADMIN',
      dashboard: 'USER',
    },
  },

  notifications: {
    enabled: false,
    to: '',
    smtp: { host: '', port: 587, secure: false, user: '', pass: '', from: '' },
  },

  delayBetweenRoutes: 3000,
  delayAfterCrash: 5000,
  delayAuthRetry: 3000,
  maxAuthRetries: 10,
  maxConsecutiveCrashes: 5,
  pollInterval: 5000,
  maxPolls: 40,
  heartbeatEveryNRoutes: 15,
};

export default config;

4. Run

# Start your dev server and open Chrome first, then:
bun run start

Claude will sweep every route, test it, report results, and keep going.

5. Go to work

Come back to:

Terminal output with every test result
.runner/summary.json with the full report
.runner/runner.log with the complete log
Email notifications (if configured)

Commands

bun run start        # Start or resume a sweep
bun run reset        # Clear all state, start fresh

CLI

claude-runner          # Start or resume
claude-runner reset    # Clear state
claude-runner status   # Show progress

Graceful Stop

Create a .stop file in the .runner/ directory. The runner finishes the current route and then exits cleanly:

touch .runner/.stop
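
One way such a stop signal could work is a file check between routes. The sketch below is an assumption about the mechanism, not the package's code: `stopRequested` and `sweepUntilStopped` are hypothetical names, and the `exists` parameter is there only so the check can be exercised without touching the disk.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: true once `<runnerDir>/.stop` exists.
function stopRequested(
  runnerDir: string,
  exists: (path: string) => boolean = existsSync,
): boolean {
  return exists(join(runnerDir, ".stop"));
}

// The check runs before each route starts, so a route already in
// progress always finishes before the loop exits.
function sweepUntilStopped(
  routes: string[],
  runnerDir: string,
  testRoute: (route: string) => void,
  exists: (path: string) => boolean = existsSync,
): string[] {
  const tested: string[] = [];
  for (const route of routes) {
    if (stopRequested(runnerDir, exists)) break;
    testRoute(route);
    tested.push(route);
  }
  return tested;
}
```

Checking between routes rather than killing the child process is what makes the stop graceful: no route is left half-tested.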

Email Notifications

Configure SMTP in runner.config.ts to get emails at milestones:

Sweep started
Section/portal completed (with pass/fail counts)
Heartbeat every N routes
Auth expired
Fatal errors

Works with Mailgun, SendGrid, Amazon SES, or any SMTP provider.
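
For illustration, an enabled notifications block might look like the following. The field names come from the Configure example; every value here is a placeholder, not a working credential or a recommended provider.

```typescript
// Placeholder values only; field names follow the `notifications`
// block shown in the Configure step.
const notifications = {
  enabled: true,
  to: "you@example.com",
  smtp: {
    host: "smtp.example.com",
    port: 587, // standard mail submission port, typically used with STARTTLS
    secure: false,
    user: "smtp-username",
    // Assumption: in practice, load this from an environment variable
    // instead of committing it to the repository.
    pass: "<smtp-password>",
    from: "claude-runner@example.com",
  },
};
```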

Auth Handling

If your app has an account picker (multiple roles), configure auth.accountMap:

auth: {
  authPaths: ['/auth'],
  accountMap: {
    admin: 'ADMINISTRATOR',    // clicks card containing "ADMINISTRATOR"
    customer: 'CUSTOMER',      // clicks card containing "CUSTOMER"
    employee: 'EMPLOYEE',
  },
},

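The keys of accountMap are route section names (the keys of routes), and each value is text that appears on an account card. Claude clicks the card that matches the section it is currently testing.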
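
The lookup described here can be sketched in a few lines. `accountLabelFor` and `isAuthPath` are hypothetical helpers, and the prefix-matching rule for auth paths is an assumption; the package may match paths differently.

```typescript
type AccountMap = Record<string, string>;

// Returns the card text to click for a route section, or undefined when
// the section has no mapping (for example, public routes).
function accountLabelFor(section: string, accountMap: AccountMap): string | undefined {
  return Object.prototype.hasOwnProperty.call(accountMap, section)
    ? accountMap[section]
    : undefined;
}

// Assumption: a page counts as an auth page when its path equals an
// entry in authPaths or is nested under one.
function isAuthPath(pathname: string, authPaths: string[]): boolean {
  return authPaths.some((p) => pathname === p || pathname.startsWith(p + "/"));
}
```

Matching on a path boundary keeps `/authors` from being mistaken for the `/auth` page.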

Writing Your Audit Extension

The audit extension is where your domain knowledge lives. The scaffold provides a minimal working example, but real power comes from custom tests.

Your extension needs:

A floating panel with specific DOM element IDs (configurable in runner.config.ts)
A <select> for choosing which page to test
A Run button that starts tests and sets disabled=true while running
Test results stored in localStorage under your configured key
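
The disabled flag on the Run button gives the runner something to poll. The sketch below assumes that pollInterval and maxPolls from runner.config.ts bound that wait; `waitForRun` and `maxWaitMs` are hypothetical names, and `isRunning` stands in for reading the button's `disabled` property inside the page.

```typescript
interface PollConfig {
  pollInterval: number; // milliseconds between checks
  maxPolls: number; // checks before giving up
}

// Upper bound on the wait for one route, in milliseconds.
function maxWaitMs(cfg: PollConfig): number {
  return cfg.pollInterval * cfg.maxPolls;
}

const realSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForRun(
  isRunning: () => boolean | Promise<boolean>,
  cfg: PollConfig,
  sleep: (ms: number) => Promise<void> = realSleep,
): Promise<"finished" | "timed-out"> {
  for (let poll = 0; poll < cfg.maxPolls; poll++) {
    // The button is re-enabled once the extension has finished its tests.
    if (!(await isRunning())) return "finished";
    await sleep(cfg.pollInterval);
  }
  return "timed-out";
}
```

Under this assumption, the example values of 5000 ms and 40 polls allow up to 200 seconds per route, so long-running page tests may need a larger maxPolls.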

The scaffold supports these test actions:

| Action | Fields | What it does |
|--------|--------|-------------|
| verifyElement | selector | Checks if a CSS selector matches an element |
| verifyText | text | Checks if text appears on the page |
| countElements | selector, min | Counts elements matching selector, fails if below min |
| clickButton | text | Finds and clicks a button by its text |

Add custom actions for your app: CRUD operations, form validation, API checks, data integrity tests, etc.
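
To make the four scaffold actions concrete, here is one way they could be modeled and evaluated. This is a sketch, not the scaffold's actual code: the `action` discriminator, the `PageProbe` interface, and the shape of the step list are assumptions, and `PageProbe` exists only so the logic can run outside a browser. In a content script its methods would wrap DOM calls such as `document.querySelectorAll`.

```typescript
type TestAction =
  | { action: "verifyElement"; selector: string }
  | { action: "verifyText"; text: string }
  | { action: "countElements"; selector: string; min: number }
  | { action: "clickButton"; text: string };

// Hypothetical page abstraction standing in for the DOM.
interface PageProbe {
  count(selector: string): number; // number of elements matching the selector
  hasText(text: string): boolean; // whether the text appears on the page
  clickButton(text: string): boolean; // true if a matching button was found and clicked
}

function runAction(step: TestAction, page: PageProbe): boolean {
  switch (step.action) {
    case "verifyElement":
      return page.count(step.selector) > 0;
    case "verifyText":
      return page.hasText(step.text);
    case "countElements":
      return page.count(step.selector) >= step.min;
    case "clickButton":
      return page.clickButton(step.text);
  }
}

// Runs every step for a page and tallies the outcome.
function runPage(steps: TestAction[], page: PageProbe): { passCount: number; failCount: number } {
  let passCount = 0;
  for (const step of steps) {
    if (runAction(step, page)) passCount++;
  }
  return { passCount, failCount: steps.length - passCount };
}
```

A custom action would be added as another member of the `TestAction` union plus one more `case`, and the compiler then flags any switch that forgets to handle it.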

How It Handles Failures

| Scenario | What happens |
|----------|-------------|
| Test fails | Recorded, moves to next route |
| Claude crashes | Recorded as skipped, moves on |
| Dev server dies | Auto-restarts, retries |
| Chrome crashes | Auto-restarts, retries |
| Auth page appears | Clicks the right account card |
| Auth truly expired | Retries N times, then halts with email |
| 5+ consecutive crashes | Full recovery (restart everything) |
| You press Ctrl+C | Saves state, resume later with bun run start |

Output

.runner/summary.json:

{
  "totalRoutes": 50,
  "routesTested": 50,
  "testsPassed": 423,
  "testsFailed": 12,
  "routesSkipped": 2,
  "results": {
    "/dashboard": { "status": "passed", "passCount": 8, "failCount": 0 },
    "/dashboard/tasks": { "status": "failed", "passCount": 5, "failCount": 3, "errors": ["..."] }
  }
}
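
A report in this shape is easy to post-process. The sketch below types the fields shown in the sample and lists the routes that need attention; `Summary`, `RouteSummary`, and `failedRoutes` are hypothetical names, and fields not shown in the sample are not modeled.

```typescript
interface RouteSummary {
  status: string; // "passed" and "failed" appear in the sample
  passCount: number;
  failCount: number;
  errors?: string[];
}

interface Summary {
  totalRoutes: number;
  routesTested: number;
  testsPassed: number;
  testsFailed: number;
  routesSkipped: number;
  results: Record<string, RouteSummary>;
}

// Routes with at least one failing test, in report order.
function failedRoutes(summary: Summary): string[] {
  return Object.entries(summary.results)
    .filter(([, result]) => result.status === "failed" || result.failCount > 0)
    .map(([route]) => route);
}

// Usage, assuming the report lives at the path shown above:
//   import { readFileSync } from "node:fs";
//   const summary: Summary = JSON.parse(readFileSync(".runner/summary.json", "utf8"));
//   console.log(failedRoutes(summary));
```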

Architecture

your-project/
  runner.config.ts              ← your config
  .runner/                      ← created at runtime (gitignored)
    progress.json              ← current state (resumable)
    last-result.json           ← Claude's output per route
    system-prompt.md           ← generated from config
    runner.log                  ← full log
    summary.json               ← final report
    .lock                      ← prevents multiple instances
    .stop                      ← graceful stop signal

claude-runner/
  src/
    wrapper.ts                 ← main loop
    config.ts                  ← config loader
    prompt.ts                  ← system prompt + task prompt generator
    notify.ts                  ← SMTP email notifications
    recovery.ts                ← process management, lock, health checks
    types.ts                   ← TypeScript types
    reset.ts                   ← state reset utility
  bin/
    cli.ts                     ← CLI entry point
  extension-scaffold/          ← starter Chrome extension
    manifest.json
    background.js
    content.js

Why This Exists

Existing browser testing tools (Playwright, Cypress, Selenium) rely on hand-written test scripts whose CSS selectors tend to break when the UI changes. AI testing tools (TestSprite, Testim) are cloud-hosted, generic, and expensive.

Claude Runner is different:

Your extension carries your domain knowledge — you define what "correct" means for your app
Claude operates the browser like a human — real Chrome session, real auth, real clicks
Runs unattended — start it and leave
Self-recovering — handles crashes, auth, server restarts
Free — uses your existing Claude Max/Pro subscription
Open source — MIT licensed, customize everything

License

MIT — Matthew Goluba