@blopai/cli
v1.4.0
Agentic browser testing package for Blop. Runs intent-based E2E tests with Playwright and an agent loop.
Blop is an agentic browser testing package. You write intent-based E2E tests — a plain-English goal instead of a brittle click-by-click script — and a native agent loop (src/runtime/agent-loop.ts) drives a real browser to satisfy that goal. It streams OpenAI-compatible tool calls and executes Playwright-backed browser tools in-process (no agent binary, no RPC hop).
It is designed to work locally and in CI first. Result files are shaped so they can later be uploaded to Blop Platform for run history, traces, screenshots, cloud browsers, and hosted runners.
How it works
spec (.blop.ts) → runner       → agent loop  → browser tools   → Playwright browser
goal text         orchestrates   streams LLM   snapshot/click/   local launch OR
                  reporters      tool calls    expect/extract…   Docker container
The agent decomposes your goal into critical points, snapshots the page (visible text + ARIA roles), drives deterministic browser tools (browser_expect_*, browser_extract, browser_run_steps, …), and ends by calling finish_test with a pass/fail verdict and reason.
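The loop described above can be pictured with a minimal sketch. Everything here is hypothetical and simplified: the `ToolCall`, `Verdict`, `Model`, and `Tools` shapes, the synchronous calls, and the `query` argument are illustrative assumptions, not Blop's actual types in src/runtime/agent-loop.ts (which streams calls asynchronously). Only the tool names `browser_extract` and `finish_test` come from the README.

```typescript
// Hypothetical shapes for illustration only; the real runtime types may differ.
type ToolCall = { name: string; args: Record<string, unknown> };
type Verdict = { status: "pass" | "fail"; reason: string };
type Model = (history: readonly string[]) => ToolCall; // stand-in for the streamed LLM call
type Tools = Record<string, (args: Record<string, unknown>) => string>;

function runAgentLoop(
  goal: string,
  model: Model,
  tools: Tools,
  maxSteps: number = Number.POSITIVE_INFINITY, // no cap unless one is requested
): Verdict {
  const history: string[] = [`goal: ${goal}`];
  for (let step = 0; step < maxSteps; step++) {
    const call = model(history);
    // finish_test ends the run with the agent's own verdict.
    if (call.name === "finish_test") {
      return {
        status: call.args["status"] === "pass" ? "pass" : "fail",
        reason: String(call.args["reason"] ?? ""),
      };
    }
    // Any other call is dispatched to an in-process tool and its result is fed back.
    const tool = tools[call.name];
    const result = tool ? tool(call.args) : `unknown tool: ${call.name}`;
    history.push(`${call.name} -> ${result}`);
  }
  return { status: "fail", reason: "step budget exhausted" };
}

// Usage with a scripted stand-in for the model.
const scripted: ToolCall[] = [
  { name: "browser_extract", args: { query: "page title" } },
  { name: "finish_test", args: { status: "pass", reason: "homepage title is visible" } },
];
let cursor = 0;
const verdict = runAgentLoop(
  "Open the app and verify the homepage is usable.",
  () => scripted[cursor++] ?? { name: "finish_test", args: { status: "fail", reason: "script ended" } },
  { browser_extract: () => "Example Home" },
);
```

The point of the sketch is the control flow: tool results accumulate in the history the model sees, and only finish_test (or an explicit budget) ends the run.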
Package layout
src/
  cli/         # command parsing, discovery, watch/list/init flows
  runtime/     # agent test DSL, agent loop, runner, public runtime types
  browser/     # Playwright-backed native tools exposed to the agent
  reporters/   # JSON, event log, and JUnit-compatible report output
  platform/    # Blop Platform upload boundary
  node/        # Node/Bun host utilities (config, CI metadata, Docker container)
test/
  browser/     # browser tool behavior tests
  e2e/         # CLI, config, reporting, platform, and runner flows
Install
npm install --save-dev @blopai/cli
# or: pnpm add -D @blopai/cli | bun add -d @blopai/cli
The package ships a blop binary, so you can also run it without installing:
npx @blopai/cli init
npx @blopai/cli test --base-url http://localhost:3000
Playwright is a dependency. The first local run may need browsers installed:
npx playwright install chromium
(With --containerized, the browser lives in Docker and no local install is needed; see Docker integration.)
Quick start
# 1. Scaffold a test
npx @blopai/cli init # writes tests/homepage.blop.ts
# 2. Point the agent at a model provider (see Provider setup)
export BLOP_AGENT_PROVIDER=openrouter
export BLOP_AGENT_MODEL=anthropic/claude-sonnet-4-6
export BLOP_AGENT_API_KEY=sk-or-...
# 3. Run against your app
npx @blopai/cli test --base-url http://localhost:3000
From source inside this repo (uses Bun):
bun install
bun run src/cli/index.ts templates/basic.blop.ts --base-url http://localhost:3000
Provider setup
The native agent loop speaks the OpenAI chat-completions wire protocol only. Any provider that exposes that protocol works out of the box; Blop ships the base URLs for the common ones.
| Provider | --provider value | Base URL |
|--------------|--------------------|---------------------------------------|
| OpenRouter | openrouter (default) | https://openrouter.ai/api/v1 |
| OpenAI | openai | https://api.openai.com/v1 |
| Groq | groq | https://api.groq.com/openai/v1 |
| xAI | xai | https://api.x.ai/v1 |
| Mistral | mistral | https://api.mistral.ai/v1 |
| Cerebras | cerebras | https://api.cerebras.ai/v1 |
| NVIDIA | nvidia | https://integrate.api.nvidia.com/v1 |
Default provider is openrouter. model and an API key are always required — there is no implicit default model.
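As a rough illustration of the table above, the sketch below maps each provider value to its base URL and derives the chat-completions endpoint. The names `PROVIDER_BASE_URLS` and `chatCompletionsUrl` are hypothetical, not Blop exports; the `/chat/completions` path is the standard one for the OpenAI wire protocol.

```typescript
// Base URLs copied from the provider table; the constant and function names are illustrative.
const PROVIDER_BASE_URLS = {
  openrouter: "https://openrouter.ai/api/v1",
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  xai: "https://api.x.ai/v1",
  mistral: "https://api.mistral.ai/v1",
  cerebras: "https://api.cerebras.ai/v1",
  nvidia: "https://integrate.api.nvidia.com/v1",
} as const;

type Provider = keyof typeof PROVIDER_BASE_URLS;

// Build the endpoint a chat-completions client would POST to.
// An explicit override (for example a self-hosted gateway) wins over the table.
function chatCompletionsUrl(provider: Provider = "openrouter", baseUrlOverride?: string): string {
  const base = (baseUrlOverride ?? PROVIDER_BASE_URLS[provider]).replace(/\/+$/, "");
  return `${base}/chat/completions`;
}

const defaultEndpoint = chatCompletionsUrl();
const groqEndpoint = chatCompletionsUrl("groq");
```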
Anthropic & Google models
Anthropic (Claude) and Google (Gemini) do not speak the chat-completions wire protocol natively, so route them through OpenRouter:
export BLOP_AGENT_PROVIDER=openrouter
export BLOP_AGENT_MODEL=anthropic/claude-sonnet-4-6 # or google/gemini-2.5-pro
export BLOP_AGENT_API_KEY=sk-or-...
Three ways to configure
Provider, model, and API key resolve in this order — CLI flag → blop.config.ts → environment variable:
1. Environment variables (recommended for CI; keep keys out of the repo):
export BLOP_AGENT_PROVIDER=openai
export BLOP_AGENT_MODEL=gpt-5
export BLOP_AGENT_API_KEY=sk-...
# Optional: override the base URL for any provider (self-hosted gateways, proxies)
export BLOP_AGENT_BASE_URL=https://my-llm-gateway.internal/v1
2. CLI flags (handy for one-off runs):
blop test \
  --provider openrouter \
  --model anthropic/claude-sonnet-4-6 \
  --api-key sk-or-... \
  --base-url http://localhost:3000
3. blop.config.ts (per-project defaults; never commit live keys, read them from process.env instead):
import type { BlopConfig } from "@blopai/cli";
export default {
  provider: "openrouter",
  model: "anthropic/claude-sonnet-4-6",
  apiKey: process.env.BLOP_AGENT_API_KEY,
  baseUrl: "http://localhost:3000",
} satisfies BlopConfig;
Note: Blop reads the BLOP_AGENT_API_KEY environment variable for the key. Provider-native names like OPENAI_API_KEY are not read automatically; map them yourself (BLOP_AGENT_API_KEY=$OPENAI_API_KEY) or pass apiKey in config.
A custom/self-hosted OpenAI-compatible endpoint? Set BLOP_AGENT_BASE_URL and use any of the supported --provider values for the auth header shape.
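The resolution order stated in this section (CLI flag, then blop.config.ts, then environment variable) can be sketched as a chain of fallbacks. This is an assumption-laden illustration: `resolveAgentSettings`, `AgentSettings`, and the error messages are hypothetical, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Hypothetical types; only the BLOP_AGENT_* variable names come from the README.
type AgentSettings = { provider?: string; model?: string; apiKey?: string };
type Env = Record<string, string | undefined>;

function resolveAgentSettings(flags: AgentSettings, config: AgentSettings, env: Env) {
  // Each value falls through: CLI flag -> config file -> environment variable.
  const provider = flags.provider ?? config.provider ?? env["BLOP_AGENT_PROVIDER"] ?? "openrouter";
  const model = flags.model ?? config.model ?? env["BLOP_AGENT_MODEL"];
  const apiKey = flags.apiKey ?? config.apiKey ?? env["BLOP_AGENT_API_KEY"];
  // The README states there is no implicit default model, and a key is always required.
  if (!model) throw new Error("No model configured");
  if (!apiKey) throw new Error("No API key configured");
  return { provider, model, apiKey };
}

// The flag overrides the config model; the key comes from the environment.
const resolved = resolveAgentSettings(
  { model: "gpt-5" },
  { provider: "openai", model: "anthropic/claude-sonnet-4-6" },
  { BLOP_AGENT_API_KEY: "sk-example" },
);
```

Passing the environment explicitly also makes the precedence easy to unit test without mutating process.env.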
Writing tests
Function form (multi-step, reads like a user story):
import { agentTest, describe } from "@blopai/cli";
describe("onboarding", () => {
  agentTest("user can create a project", async ({ agent }) => {
    await agent.goto("/");
    await agent.goal(`
      Sign in as the test user.
      Create a project called Checkout QA.
      Verify the project appears in the dashboard.
    `);
  });
});
Object form (compact, great for generated tests):
import { defineAgentTest } from "@blopai/cli";
export default defineAgentTest({
  name: "smoke test",
  goal: "Open the app and verify the homepage is usable.",
});
Specs are discovered by the default **/*.blop.ts / **/*.blop.tsx glob. agent.goto(url) resolves relative URLs against --base-url.
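To make the default discovery glob concrete, here is a small approximation of which paths it would select. `isDefaultSpecFile` is a hypothetical helper using a regular expression rather than Blop's actual glob matcher, so edge cases (dotfiles, ignored directories) may behave differently.

```typescript
// Approximates **/*.blop.ts and **/*.blop.tsx: any depth, file name ending in .blop.ts or .blop.tsx.
function isDefaultSpecFile(path: string): boolean {
  return /(^|\/)[^/]+\.blop\.tsx?$/.test(path);
}

const candidates = [
  "tests/homepage.blop.ts",
  "src/checkout/cart.blop.tsx",
  "tests/homepage.test.ts",
  "tests/smoke.blop.ts.bak",
];
// Keeps only the first two entries.
const discovered = candidates.filter(isDefaultSpecFile);
```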
Commands
Blop follows a familiar testing-framework command shape, but the runtime is purpose-built for browser agents.
blop # discover and run **/*.blop.ts
blop test # same as blop
blop run tests/smoke.blop.ts # run once
blop watch tests # rerun when matching specs change
blop list # list discovered tests without launching a browser
blop init [file] # scaffold tests/homepage.blop.ts
Options
-u, --base-url <url> Base URL for the app under test
--report-dir <dir> Report output directory (default: .blop)
--progress-file <file> Append live NDJSON progress (test_start, action, frame, test_finish)
--capture-screenshots Save a screenshot after each browser action
--no-stream Disable the live CDP screencast (use per-action screenshots)
--frame-interval <ms> Min ms between streamed frame progress lines (default: 200)
--headed Run the browser headed (local launch only)
--browser <name> chromium, firefox, or webkit
--containerized Run the browser in a shared Docker Playwright container
--viewport <size> Viewport as WIDTHxHEIGHT (e.g. 390x844)
-c, --config <file> Config file path
--provider <provider> Agent provider (openrouter, openai, groq, xai, mistral, cerebras, nvidia)
--model <model> Agent model id
--api-key <key> Agent API key
--max-steps <number> Optional hard cap on agent tool steps
-r, --reporter <reporter> basic, json, junit, or all
-v, --verbose Print agent stream events in real time
-h, --help Show help
By default there is no step cap: the agent works until it calls finish_test, the test times out, or the runner's stall guard detects it looping without progress. Pass --max-steps <n> only for a hard budget.
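Among the options above, --viewport takes a WIDTHxHEIGHT string, while the config file uses a { width, height } object. A minimal sketch of converting one into the other follows; `parseViewport` is a hypothetical helper, and the CLI's real validation rules are not documented here.

```typescript
type Viewport = { width: number; height: number };

// Parse "390x844" into { width: 390, height: 844 }; reject anything else.
function parseViewport(size: string): Viewport {
  const match = /^(\d+)x(\d+)$/.exec(size.trim());
  if (!match) {
    throw new Error(`Invalid viewport "${size}", expected WIDTHxHEIGHT (e.g. 390x844)`);
  }
  return { width: Number(match[1]), height: Number(match[2]) };
}

const mobile = parseViewport("390x844");
```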
Reporters are basic, json, junit, and all. Blop always writes results.json and events.jsonl; junit/all also write report.xml.
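The reporter rule above can be summarized in a few lines. `reportFiles` is a hypothetical helper that only restates the README's description of which files each reporter produces.

```typescript
type Reporter = "basic" | "json" | "junit" | "all";

// results.json and events.jsonl are always written; report.xml only for junit and all.
function reportFiles(reporter: Reporter): string[] {
  const files = ["results.json", "events.jsonl"];
  if (reporter === "junit" || reporter === "all") {
    files.push("report.xml");
  }
  return files;
}

const basicFiles = reportFiles("basic");
const allFiles = reportFiles("all");
```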
Configuration file
Any CLI option has a blop.config.ts equivalent (blop.config.{ts,mts,js,mjs} is auto-discovered from the cwd):
import type { BlopConfig } from "@blopai/cli";
export default {
  include: ["tests/**/*.blop.ts"],
  baseUrl: "http://localhost:3000",
  browser: "chromium",
  viewport: { width: 1280, height: 720 },
  reporter: "all",
  provider: "openrouter",
  model: "anthropic/claude-sonnet-4-6",
  apiKey: process.env.BLOP_AGENT_API_KEY,
} satisfies BlopConfig;
Docker integration
Pass --containerized (or set containerized: true in config) to run the browser inside a Docker container instead of launching one on the host. Useful for CI, hermetic environments, or machines without a local browser install.
blop test --containerized --base-url http://host.docker.internal:3000
How it works:
- On first use, Blop starts one long-lived container named blop-playwright running playwright run-server, then connects to it over a WebSocket (CDP). Each run gets an isolated browser inside that shared container; the container stays warm (--restart unless-stopped) and is reused across runs, so only the first run pays startup cost.
- The image is pinned to your installed Playwright version (mcr.microsoft.com/playwright:v<version>-noble) because the run-server connect protocol must match the client minor version. A Playwright upgrade transparently replaces the container.
- Containerized runs use chromium (the --browser flag is ignored in this mode).
Reaching your app from inside the container:
The container is started with --add-host host.docker.internal:host-gateway, so a dev server running on the host is reachable at http://host.docker.internal:<port>. Use that for --base-url:
# App under test runs on the host at :3000
blop test --containerized --base-url http://host.docker.internal:3000
If the app under test runs in another container on a shared Docker network, use that service's hostname instead.
Environment overrides:
| Variable | Purpose |
|-----------------------------|---------------------------------------------------------------------|
| BLOP_PLAYWRIGHT_IMAGE | Force a specific Playwright server image instead of the auto-pinned one |
| BLOP_PLAYWRIGHT_CONTAINER | Override the shared container name (default blop-playwright) |
Requirements: Docker installed and the daemon running. The container is created on demand; Blop never tears it down automatically (remove it manually with docker rm -f blop-playwright if needed).
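Putting the image pinning and the two environment overrides together, the container settings could be derived roughly as follows. `resolveContainer` is a hypothetical helper and the version string in the usage line is only an example; the image pattern, default container name, and variable names are taken from this section.

```typescript
type Env = Record<string, string | undefined>;

// Pick the image and container name, letting the documented overrides win.
function resolveContainer(playwrightVersion: string, env: Env = {}) {
  return {
    image:
      env["BLOP_PLAYWRIGHT_IMAGE"] ??
      `mcr.microsoft.com/playwright:v${playwrightVersion}-noble`,
    name: env["BLOP_PLAYWRIGHT_CONTAINER"] ?? "blop-playwright",
  };
}

// Example version only; in practice it would be read from the installed Playwright package.
const pinned = resolveContainer("1.52.0");
const custom = resolveContainer("1.52.0", { BLOP_PLAYWRIGHT_CONTAINER: "ci-playwright" });
```

Because the image tag is derived from the installed version, bumping Playwright changes the tag, which is consistent with the container being replaced after an upgrade.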
Reports & CI
blop test ./tests/smoke.blop.ts \
  --base-url http://localhost:3000 \
  --report-dir .blop \
  --reporter all
Outputs in .blop/:
- results.json: structured run result (status, reason, actions, events, provider/model, CI metadata)
- events.jsonl: append-only agent event log
- screenshots/: captured step/evidence screenshots
- report.xml: JUnit-compatible report (junit/all reporters)
The CLI exits 0 when every test passes and 1 otherwise, so it drops straight into a CI gate.
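The exit-code rule is simple enough to state as code. The `TestResult` shape below is a hypothetical reduction of what results.json holds, and returning 0 for an empty run is an assumption of this sketch, not documented Blop behavior.

```typescript
// Hypothetical, minimal result shape for illustration.
type TestResult = { name: string; status: "pass" | "fail" };

// 0 when every test passed, 1 otherwise.
function exitCodeFor(results: TestResult[]): 0 | 1 {
  return results.every((result) => result.status === "pass") ? 0 : 1;
}

const green = exitCodeFor([{ name: "smoke test", status: "pass" }]);
const red = exitCodeFor([
  { name: "smoke test", status: "pass" },
  { name: "user can create a project", status: "fail" },
]);
```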
Example: GitHub Actions
- name: Run Blop agent tests
  env:
    BLOP_AGENT_PROVIDER: openrouter
    BLOP_AGENT_MODEL: anthropic/claude-sonnet-4-6
    BLOP_AGENT_API_KEY: ${{ secrets.BLOP_AGENT_API_KEY }}
  run: npx @blopai/cli test --base-url http://localhost:3000 --report-dir .blop --reporter all
Future platform integration
Set these once Blop Platform ingest exists:
BLOP_PLATFORM_URL=https://app.blop.dev
BLOP_API_KEY=...
The runner already preserves the fields needed for upload: run id, test id, CI metadata, model/provider, screenshots, agent events, status, and failure reason.
Releasing
Publishing is automated by .github/workflows/release-blop.yml, triggered by a version tag.
One-time setup: add an npm automation token as the NPM repository secret, and make sure the @blopai scope exists on npm with that token allowed to publish to it.
To cut a release:
# 1. Bump the version in packages/blop/package.json (e.g. 1.1.0 -> 1.2.0)
# 2. Commit the bump, then tag with the matching version:
git tag blop-v1.2.0
git push origin blop-v1.2.0
The workflow verifies the tag matches package.json, builds, then runs npm publish --access public and opens a GitHub Release. To publish manually instead:
cd packages/blop
npm run build
npm publish # publishConfig sets access: public
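The tag check that the release workflow performs before publishing can be approximated as below. `tagMatchesVersion` is a hypothetical helper; the actual check lives in .github/workflows/release-blop.yml and may be implemented differently. Only the blop-v tag prefix is taken from the release steps above.

```typescript
// True when the tag is the expected prefix followed by exactly the package.json version.
function tagMatchesVersion(tag: string, packageVersion: string, prefix: string = "blop-v"): boolean {
  return tag.startsWith(prefix) && tag.slice(prefix.length) === packageVersion;
}

const matching = tagMatchesVersion("blop-v1.2.0", "1.2.0");
const stale = tagMatchesVersion("blop-v1.2.0", "1.1.0"); // version bump was not committed
```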