@blopai/cli

v1.4.0

Published

3 days ago

Agentic browser testing package for Blop. Runs intent-based E2E tests with Playwright and an agent loop.


hananinas

blop testing e2e browser playwright agent qa cli

@blopai/cli

Blop is an agentic browser testing package. You write intent-based E2E tests — a plain-English goal instead of a brittle click-by-click script — and a native agent loop (src/runtime/agent-loop.ts) drives a real browser to satisfy that goal. It streams OpenAI-compatible tool calls and executes Playwright-backed browser tools in-process (no agent binary, no RPC hop).

It is designed to work locally and in CI first. Result files are shaped so they can later be uploaded to Blop Platform for run history, traces, screenshots, cloud browsers, and hosted runners.

How it works

spec (.blop.ts)  →  runner  →  agent loop  →  browser tools  →  Playwright browser
   goal text         orchestrates   streams LLM    snapshot/click/    local launch OR
                     reporters      tool calls     expect/extract…    Docker container

The agent decomposes your goal into critical points, snapshots the page (visible text + ARIA roles), drives deterministic browser tools (browser_expect_*, browser_extract, browser_run_steps, …), and ends by calling finish_test with a pass/fail verdict and reason.
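The loop described above can be sketched roughly as follows. This is a minimal illustration and not the code in src/runtime/agent-loop.ts: the ToolCall, Verdict, Model, and Tools shapes and the runAgentLoop name are hypothetical, and the model is reduced to a function that returns the next tool call. Only the finish_test tool name and the pass/fail verdict with a reason come from the description above.

```typescript
// Hypothetical shapes for illustration; Blop's real runtime types may differ.
type ToolCall = { name: string; args: Record<string, unknown> };
type Verdict = { status: "pass" | "fail"; reason: string };
type Model = (history: string[]) => Promise<ToolCall>;
type Tools = Record<string, (args: Record<string, unknown>) => Promise<string>>;

async function runAgentLoop(
  goal: string,
  model: Model,
  tools: Tools,
  maxSteps: number = Infinity, // no cap unless the caller sets one
): Promise<Verdict> {
  const history: string[] = [`goal: ${goal}`];
  for (let step = 0; step < maxSteps; step++) {
    const call = await model(history);
    // finish_test ends the loop with the agent's own verdict.
    if (call.name === "finish_test") {
      const status = call.args["status"] === "pass" ? "pass" : "fail";
      return { status, reason: String(call.args["reason"] ?? "") };
    }
    // Any other call is dispatched to an in-process browser tool.
    const tool = tools[call.name];
    const result = tool ? await tool(call.args) : `unknown tool: ${call.name}`;
    history.push(`${call.name} -> ${result}`);
  }
  return { status: "fail", reason: "step budget exhausted" };
}
```

In this sketch each tool result is appended to the history so the next model turn can see it. A real loop would also stream the model response and enforce the test timeout.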

Package layout

src/
  cli/        # command parsing, discovery, watch/list/init flows
  runtime/    # agent test DSL, agent loop, runner, public runtime types
  browser/    # Playwright-backed native tools exposed to the agent
  reporters/  # JSON, event log, and JUnit-compatible report output
  platform/   # Blop Platform upload boundary
  node/       # Node/Bun host utilities (config, CI metadata, Docker container)
test/
  browser/    # browser tool behavior tests
  e2e/        # CLI, config, reporting, platform, and runner flows

Install

npm install --save-dev @blopai/cli
# or: pnpm add -D @blopai/cli   |   bun add -d @blopai/cli

The package ships a blop binary, so you can also run it without installing:

npx @blopai/cli init
npx @blopai/cli test --base-url http://localhost:3000

Playwright is a dependency. The first local run may need browsers installed:

npx playwright install chromium

(With --containerized, the browser lives in Docker and no local install is needed — see Docker integration.)

Quick start

# 1. Scaffold a test
npx @blopai/cli init                  # writes tests/homepage.blop.ts

# 2. Point the agent at a model provider (see Provider setup)
export BLOP_AGENT_PROVIDER=openrouter
export BLOP_AGENT_MODEL=anthropic/claude-sonnet-4-6
export BLOP_AGENT_API_KEY=sk-or-...

# 3. Run against your app
npx @blopai/cli test --base-url http://localhost:3000

From source inside this repo (uses Bun):

bun install
bun run src/cli/index.ts templates/basic.blop.ts --base-url http://localhost:3000

Provider setup

The native agent loop speaks the OpenAI chat-completions wire protocol only. Any provider that exposes that protocol works out of the box; Blop ships the base URLs for the common ones.

| Provider   | --provider value     | Base URL                            |
|------------|----------------------|-------------------------------------|
| OpenRouter | openrouter (default) | https://openrouter.ai/api/v1        |
| OpenAI     | openai               | https://api.openai.com/v1           |
| Groq       | groq                 | https://api.groq.com/openai/v1      |
| xAI        | xai                  | https://api.x.ai/v1                 |
| Mistral    | mistral              | https://api.mistral.ai/v1           |
| Cerebras   | cerebras             | https://api.cerebras.ai/v1          |
| NVIDIA     | nvidia               | https://integrate.api.nvidia.com/v1 |

The default provider is openrouter. A model and an API key are always required; there is no implicit default model.
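As a rough illustration of these rules, the sketch below maps a provider name to its base URL and rejects a missing model or key. The base URLs are the ones listed in the table; the PROVIDER_BASE_URLS constant, the resolveAgentEndpoint function, and the error messages are hypothetical and are not part of the package's public API.

```typescript
// Base URLs copied from the provider table; everything else is illustrative.
const PROVIDER_BASE_URLS = {
  openrouter: "https://openrouter.ai/api/v1",
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  xai: "https://api.x.ai/v1",
  mistral: "https://api.mistral.ai/v1",
  cerebras: "https://api.cerebras.ai/v1",
  nvidia: "https://integrate.api.nvidia.com/v1",
} as const;

type Provider = keyof typeof PROVIDER_BASE_URLS;

function isProvider(value: string): value is Provider {
  return Object.prototype.hasOwnProperty.call(PROVIDER_BASE_URLS, value);
}

function resolveAgentEndpoint(settings: { provider?: string; model?: string; apiKey?: string }) {
  // The provider falls back to openrouter; model and key have no fallback.
  const provider = settings.provider ?? "openrouter";
  if (!isProvider(provider)) throw new Error(`Unknown provider: ${provider}`);
  if (!settings.model) throw new Error("A model is required; there is no default model");
  if (!settings.apiKey) throw new Error("An API key is required");
  return {
    provider,
    model: settings.model,
    apiKey: settings.apiKey,
    baseUrl: PROVIDER_BASE_URLS[provider],
  };
}
```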

Anthropic & Google models

Anthropic (Claude) and Google (Gemini) do not speak the chat-completions wire protocol natively, so route them through OpenRouter:

export BLOP_AGENT_PROVIDER=openrouter
export BLOP_AGENT_MODEL=anthropic/claude-sonnet-4-6     # or google/gemini-2.5-pro
export BLOP_AGENT_API_KEY=sk-or-...

Three ways to configure

Provider, model, and API key are resolved in this order: CLI flag, then blop.config.ts, then environment variable. Each of the three can be set as follows:

1. Environment variables (recommended for CI; keep keys out of the repo):

export BLOP_AGENT_PROVIDER=openai
export BLOP_AGENT_MODEL=gpt-5
export BLOP_AGENT_API_KEY=sk-...
# Optional: override the base URL for any provider (self-hosted gateways, proxies)
export BLOP_AGENT_BASE_URL=https://my-llm-gateway.internal/v1

2. CLI flags (handy for one-off runs):

blop test \
  --provider openrouter \
  --model anthropic/claude-sonnet-4-6 \
  --api-key sk-or-... \
  --base-url http://localhost:3000

3. blop.config.ts (per-project defaults; never commit live keys — read from process.env):

import type { BlopConfig } from "@blopai/cli";

export default {
  provider: "openrouter",
  model: "anthropic/claude-sonnet-4-6",
  apiKey: process.env.BLOP_AGENT_API_KEY,
  baseUrl: "http://localhost:3000",
} satisfies BlopConfig;

Note: Blop reads the BLOP_AGENT_API_KEY environment variable for the key. Provider-native names like OPENAI_API_KEY are not read automatically — map them yourself (BLOP_AGENT_API_KEY=$OPENAI_API_KEY) or pass apiKey in config.

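For a custom or self-hosted OpenAI-compatible endpoint, set BLOP_AGENT_BASE_URL and pick any of the supported --provider values; the provider then only determines the auth header shape. Note that --base-url (baseUrl in config) is the URL of the app under test, not the LLM endpoint.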
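The resolution order for provider, model, and API key can be pictured as a per-field fallback chain. The sketch below assumes that "resolved in this order" means the first source that defines a value wins; that reading, along with the SettingsLayer type and both function names, is an assumption. Only the BLOP_AGENT_* variable names come from this README.

```typescript
// Hypothetical layer shape; each source may leave any field unset.
type SettingsLayer = {
  provider?: string | undefined;
  model?: string | undefined;
  apiKey?: string | undefined;
};

// Build the environment layer from a plain env map (for example process.env).
function envLayer(env: Record<string, string | undefined>): SettingsLayer {
  return {
    provider: env["BLOP_AGENT_PROVIDER"],
    model: env["BLOP_AGENT_MODEL"],
    apiKey: env["BLOP_AGENT_API_KEY"],
  };
}

// Assumed precedence: CLI flag, then config file, then environment variable.
function resolveAgentSettings(
  cli: SettingsLayer,
  config: SettingsLayer,
  env: SettingsLayer,
): SettingsLayer {
  const pick = (key: keyof SettingsLayer) => cli[key] ?? config[key] ?? env[key];
  return { provider: pick("provider"), model: pick("model"), apiKey: pick("apiKey") };
}
```

Resolving each field independently means a one-off --model flag can override the config file while the API key still comes from the environment.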

Writing tests

Function form — multi-step, reads like a user story:

import { agentTest, describe } from "@blopai/cli";

describe("onboarding", () => {
  agentTest("user can create a project", async ({ agent }) => {
    await agent.goto("/");
    await agent.goal(`
      Sign in as the test user.
      Create a project called Checkout QA.
      Verify the project appears in the dashboard.
    `);
  });
});

Object form — compact, great for generated tests:

import { defineAgentTest } from "@blopai/cli";

export default defineAgentTest({
  name: "smoke test",
  goal: "Open the app and verify the homepage is usable.",
});

Specs are discovered by the default **/*.blop.ts / **/*.blop.tsx glob. agent.goto(url) resolves relative URLs against --base-url.
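To see what resolving a relative URL against --base-url produces, the sketch below uses the standard WHATWG URL constructor. Whether agent.goto uses exactly this call internally is an assumption, and resolveTarget is a hypothetical helper name.

```typescript
// Resolve a goto() target against the base URL using standard URL resolution.
function resolveTarget(target: string, baseUrl: string): string {
  return new URL(target, baseUrl).toString();
}

const base = "http://localhost:3000";
console.log(resolveTarget("/", base));                     // http://localhost:3000/
console.log(resolveTarget("/projects", base));             // http://localhost:3000/projects
console.log(resolveTarget("https://example.com/x", base)); // https://example.com/x
```

Under standard resolution, a target with a leading slash replaces any path in the base URL, and an absolute URL ignores the base entirely.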

Commands

Blop follows a familiar testing-framework command shape, but the runtime is purpose-built for browser agents.

blop                         # discover and run **/*.blop.ts
blop test                    # same as blop
blop run tests/smoke.blop.ts # run once
blop watch tests             # rerun when matching specs change
blop list                    # list discovered tests without launching a browser
blop init [file]             # scaffold tests/homepage.blop.ts

Options

-u, --base-url <url>       Base URL for the app under test
    --report-dir <dir>     Report output directory (default: .blop)
    --progress-file <file> Append live NDJSON progress (test_start, action, frame, test_finish)
    --capture-screenshots  Save a screenshot after each browser action
    --no-stream            Disable the live CDP screencast (use per-action screenshots)
    --frame-interval <ms>  Min ms between streamed frame progress lines (default: 200)
    --headed               Run the browser headed (local launch only)
    --browser <name>       chromium, firefox, or webkit
    --containerized        Run the browser in a shared Docker Playwright container
    --viewport <size>      Viewport as WIDTHxHEIGHT (e.g. 390x844)
-c, --config <file>        Config file path
    --provider <provider>  Agent provider (openrouter, openai, groq, xai, mistral, cerebras, nvidia)
    --model <model>        Agent model id
    --api-key <key>        Agent API key
    --max-steps <number>   Optional hard cap on agent tool steps
-r, --reporter <reporter>  basic, json, junit, or all
-v, --verbose              Print agent stream events in real time
-h, --help                 Show help

By default there is no step cap — the agent works until it calls finish_test, the test times out, or the runner's stall guard detects it looping without progress. Pass --max-steps <n> only for a hard budget.

Reporters are basic, json, junit, and all. Blop always writes results.json and events.jsonl; junit/all also write report.xml.
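The --viewport flag takes WIDTHxHEIGHT, while the config file takes an object with width and height. A conversion between the two might look like the sketch below; parseViewport is a hypothetical helper, and Blop's actual parser may accept or reject other inputs.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// Parse "390x844" into { width: 390, height: 844 }; reject anything else.
function parseViewport(size: string): Viewport {
  const match = /^(\d+)x(\d+)$/.exec(size.trim());
  if (!match) {
    throw new Error(`Invalid viewport "${size}", expected WIDTHxHEIGHT`);
  }
  return { width: Number(match[1]), height: Number(match[2]) };
}

console.log(parseViewport("390x844")); // { width: 390, height: 844 }
```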

Configuration file

Any CLI option has a blop.config.ts equivalent (blop.config.{ts,mts,js,mjs} is auto-discovered from the cwd):

import type { BlopConfig } from "@blopai/cli";

export default {
  include: ["tests/**/*.blop.ts"],
  baseUrl: "http://localhost:3000",
  browser: "chromium",
  viewport: { width: 1280, height: 720 },
  reporter: "all",
  provider: "openrouter",
  model: "anthropic/claude-sonnet-4-6",
  apiKey: process.env.BLOP_AGENT_API_KEY,
} satisfies BlopConfig;

Docker integration

Pass --containerized (or set containerized: true in config) to run the browser inside a Docker container instead of launching one on the host. Useful for CI, hermetic environments, or machines without a local browser install.

blop test --containerized --base-url http://host.docker.internal:3000

How it works:

On first use, Blop starts one long-lived container named blop-playwright running playwright run-server, then connects to it over a WebSocket using Playwright's connect protocol. Each run gets an isolated browser inside that shared container; the container stays warm (--restart unless-stopped) and is reused across runs, so only the first run pays the startup cost.
The image is pinned to your installed Playwright version — mcr.microsoft.com/playwright:v<version>-noble — because the run-server connect protocol must match the client minor version. A Playwright upgrade transparently replaces the container.
Containerized runs use chromium (the --browser flag is ignored in this mode).

Reaching your app from inside the container:

The container is started with --add-host host.docker.internal:host-gateway, so a dev server running on the host is reachable at http://host.docker.internal:<port>. Use that for --base-url:

# App under test runs on the host at :3000
blop test --containerized --base-url http://host.docker.internal:3000

If the app under test runs in another container on a shared Docker network, use that service's hostname instead.
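If the same script has to work for both local and containerized runs, the hostname swap can be done programmatically. The helper below is a hypothetical convenience for your own tooling; this README does not say that Blop rewrites localhost URLs for you.

```typescript
// Swap loopback hostnames for host.docker.internal; leave other hosts alone.
function toContainerReachableUrl(baseUrl: string): string {
  const url = new URL(baseUrl);
  if (url.hostname === "localhost" || url.hostname === "127.0.0.1") {
    url.hostname = "host.docker.internal";
  }
  return url.toString();
}

console.log(toContainerReachableUrl("http://localhost:3000")); // http://host.docker.internal:3000/
console.log(toContainerReachableUrl("http://web:8080/app"));   // http://web:8080/app
```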

Environment overrides:

| Variable                  | Purpose                                                                  |
|---------------------------|--------------------------------------------------------------------------|
| BLOP_PLAYWRIGHT_IMAGE     | Force a specific Playwright server image instead of the auto-pinned one |
| BLOP_PLAYWRIGHT_CONTAINER | Override the shared container name (default blop-playwright)            |
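Taken together, the image pinning rule and the two overrides amount to something like the sketch below. The image pattern, the default container name, and the variable names come from this section; the resolvePlaywrightContainer function, and the choice to treat an empty variable as unset, are assumptions.

```typescript
// Derive the image and container name from the installed Playwright version,
// letting the two environment variables override the defaults.
function resolvePlaywrightContainer(
  playwrightVersion: string,
  env: Record<string, string | undefined> = {},
): { image: string; name: string } {
  return {
    image:
      env["BLOP_PLAYWRIGHT_IMAGE"] ||
      `mcr.microsoft.com/playwright:v${playwrightVersion}-noble`,
    name: env["BLOP_PLAYWRIGHT_CONTAINER"] || "blop-playwright",
  };
}
```

Passing the environment as a parameter keeps the function easy to test; in a real script you would pass process.env.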

Requirements: Docker installed and the daemon running. The container is created on demand; Blop never tears it down automatically (remove it manually with docker rm -f blop-playwright if needed).

Reports & CI

blop test ./tests/smoke.blop.ts \
  --base-url http://localhost:3000 \
  --report-dir .blop \
  --reporter all

Outputs in .blop/:

results.json — structured run result (status, reason, actions, events, provider/model, CI metadata)
events.jsonl — append-only agent event log
screenshots/ — captured step/evidence screenshots
report.xml — JUnit-compatible report (junit/all reporters)

The CLI exits 0 when every test passes and 1 otherwise, so it drops straight into a CI gate.
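Because events.jsonl and the --progress-file output are line-delimited JSON, they are easy to post-process in CI. The sketch below counts events by kind. It assumes each line is a JSON object with a string type field; that field name is a guess, so check a real log before relying on it. The event names in the sample are the ones listed for --progress-file.

```typescript
// Count NDJSON events by their (assumed) "type" field.
function countEventTypes(ndjson: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank and trailing lines
    const event = JSON.parse(line) as { type?: unknown };
    const type = typeof event.type === "string" ? event.type : "unknown";
    counts.set(type, (counts.get(type) ?? 0) + 1);
  }
  return counts;
}

// Sample input using the progress event names; the payloads are made up.
const sample = [
  '{"type":"test_start"}',
  '{"type":"action"}',
  '{"type":"action"}',
  '{"type":"test_finish"}',
].join("\n");

console.log(countEventTypes(sample).get("action")); // 2
```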

Example: GitHub Actions

- name: Run Blop agent tests
  env:
    BLOP_AGENT_PROVIDER: openrouter
    BLOP_AGENT_MODEL: anthropic/claude-sonnet-4-6
    BLOP_AGENT_API_KEY: ${{ secrets.BLOP_AGENT_API_KEY }}
  run: npx @blopai/cli test --base-url http://localhost:3000 --report-dir .blop --reporter all

Future platform integration

Set these once Blop Platform ingest exists:

BLOP_PLATFORM_URL=https://app.blop.dev
BLOP_API_KEY=...

The runner already preserves the fields needed for upload: run id, test id, CI metadata, model/provider, screenshots, agent events, status, and failure reason.

Releasing

Publishing is automated by .github/workflows/release-blop.yml, triggered by a version tag.

One-time setup: add an npm automation token as a repository secret named NPM, and make sure the @blopai scope exists on npm and that the token is allowed to publish to it.

To cut a release:

# 1. Bump the version in packages/blop/package.json (e.g. 1.1.0 -> 1.2.0)
# 2. Commit the bump, then tag with the matching version:
git tag blop-v1.2.0
git push origin blop-v1.2.0

The workflow verifies the tag matches package.json, builds, then runs npm publish --access public and opens a GitHub Release. To publish manually instead:

cd packages/blop
npm run build
npm publish   # publishConfig sets access: public
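The tag check that the workflow runs before publishing can be approximated as below. The blop-v prefix comes from the tagging commands above; the function names and the exact version pattern are assumptions and are not taken from release-blop.yml.

```typescript
// Extract "1.2.0" from "blop-v1.2.0"; return null for tags in another format.
function versionFromTag(tag: string): string | null {
  const match = /^blop-v(\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?)$/.exec(tag);
  return match?.[1] ?? null;
}

// Throw when the pushed tag and the package.json version disagree.
function assertTagMatchesPackage(tag: string, packageVersion: string): void {
  const tagVersion = versionFromTag(tag);
  if (tagVersion !== packageVersion) {
    throw new Error(`Tag ${tag} does not match package.json version ${packageVersion}`);
  }
}

assertTagMatchesPackage("blop-v1.2.0", "1.2.0"); // passes silently
```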
