flowforge-capture

v0.2.4

Published

28 minutes ago

CLI that records a QA engineer's browser flow (screen + mic + HAR + activity markers) and emits a structured flow.md for FlowForge ideas generation.


saravanamv

qa testing api flow-capture playwright whisper flowforge

flowforge-capture

CLI that records a QA engineer's browser flow (screen + microphone narration + HAR + activity-marker hotkey) and emits a structured flow.md for ingestion by the FlowForge Ideas pipeline.

Status: Phase 1, Windows-only pilot. Node 20+. Cross-platform audio is on the roadmap.

Install

npm install -g flowforge-capture
flowforge-capture init

That's it. The npm install step also downloads the Playwright Chromium browser (~150 MB) and a bundled ffmpeg binary (~70 MB) via post-install hooks. First-time install takes a few minutes; after that, updates are tiny.

If you're behind a corporate proxy and the post-install Chromium download fails, run flowforge-capture install-browsers once your network can reach playwright.dev.

What `init` does

A one-time interactive setup:

Asks for your FlowForge URL (e.g. https://flowforge-document360-staging.azurestaticapps.net)
Fetches the available projects from that URL and lets you pick one
Sets a default capture save directory
Optionally accepts an OpenAI API key as a fallback (the FlowForge URL handles transcription server-side by default — most QA people don't need a key)

Config is written to ~/.flowforge-capture/config.json. Re-run init any time to reconfigure.

Log in (one-time, per machine)

flowforge-capture auth

A Chromium window opens. Log in to your application as usual, then return to the terminal and press Enter. Auth state is saved to .flowforge-capture-auth.json in the current directory and reused by later start commands run from that same directory until the session expires.

Capture a flow

flowforge-capture start drive-folder-test

A controlled Chromium window opens, pre-authenticated. Perform the flow, narrate what you're doing out loud, and press Ctrl+Shift+M at the start of each logical activity to drop a marker. Close the browser when done (or wait for the 120-second hard cap).

Output lands in <captures-dir>/drive-folder-test/:

recording.webm      # screen + system audio
narration.wav       # microphone only
session.har         # network capture
markers.json        # activity boundaries
transcript.json     # Whisper output
spec-matches.json   # HAR calls matched to OpenAPI operations
flow.md             # the deliverable — upload this to FlowForge

Open flow.md in the FlowForge web app via the Ideas → Generate from capture modal. The AI will read the narration + matched calls + chained variables and produce test ideas.
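To illustrate how activity markers can divide narration into per-activity chunks, here is a minimal sketch. The Marker and Segment shapes and the groupByMarkers function are assumptions made for illustration; the actual schemas of markers.json and transcript.json are not documented in this README.

```typescript
// Hypothetical shapes: the real markers.json / transcript.json schemas may differ.
interface Marker { timeSec: number }
interface Segment { startSec: number; endSec: number; text: string }
interface Activity { index: number; startSec: number; narration: string }

// Assign each transcript segment to the activity whose marker most recently
// preceded the segment's start time. Segments spoken before the first marker
// are dropped in this sketch.
function groupByMarkers(markers: Marker[], segments: Segment[]): Activity[] {
  const sorted = [...markers].sort((a, b) => a.timeSec - b.timeSec);
  return sorted.map((marker, i) => {
    // The activity ends where the next marker begins, or never for the last one.
    const endSec = sorted[i + 1]?.timeSec ?? Infinity;
    const narration = segments
      .filter((s) => s.startSec >= marker.timeSec && s.startSec < endSec)
      .map((s) => s.text.trim())
      .join(" ");
    return { index: i + 1, startSec: marker.timeSec, narration };
  });
}
```

Keying on the segment start time keeps a sentence that straddles a marker with the activity in which it began, which is one reasonable choice among several.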

Re-process an existing capture

If you tweak the OpenAPI spec or want to regenerate flow.md without recording again:

flowforge-capture process <captures-dir>/drive-folder-test
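Re-processing matters after a spec change because spec-matches.json depends on how recorded requests line up with OpenAPI path templates. The sketch below shows one common way to do that matching; the Operation shape, templateToRegExp, and matchOperation are hypothetical names, and the CLI's real matcher may work differently.

```typescript
// Hypothetical, simplified view of an OpenAPI operation.
interface Operation { method: string; pathTemplate: string; operationId: string }

// Turn "/v2/folders/{folderId}" into a regular expression in which each
// {param} segment matches exactly one non-empty path segment.
function templateToRegExp(template: string): RegExp {
  const pattern = template
    .split("/")
    .map((part) =>
      /^\{[^}]+\}$/.test(part)
        ? "[^/]+"
        : part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), // escape literal segments
    )
    .join("/");
  return new RegExp(`^${pattern}/?$`); // tolerate one trailing slash
}

// Return the first operation whose method and path template fit the request.
// `path` is the URL pathname only (no host, no query string).
function matchOperation(method: string, path: string, ops: Operation[]): Operation | undefined {
  return ops.find(
    (op) =>
      op.method.toUpperCase() === method.toUpperCase() &&
      templateToRegExp(op.pathTemplate).test(path),
  );
}
```

Because the first match wins, a spec that contains both a literal path and a parameterized path of the same shape would need the literal one listed first; a production matcher would likely rank candidates instead.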

Commands

flowforge-capture init                  First-run setup wizard
flowforge-capture install-browsers      Manual Chromium install (fallback)
flowforge-capture auth                  Log in once, save auth state for replay
flowforge-capture start <flow-name>     Record a new flow
flowforge-capture process <flow-dir>    Re-run processing on existing capture
flowforge-capture list-mics             Print available audio input devices

Run flowforge-capture <command> --help for the full per-command option list.

Updates

On startup the CLI checks the npm registry for a newer version. The check runs asynchronously and its result is cached for 24 hours. When a new version is available, the next command you run first installs it with npm install -g flowforge-capture@latest, prints a one-line ✓ Updated flowforge-capture to vX.Y.Z, and then continues with your command. The update never interrupts a recording in progress.

To opt out, set "autoUpdate": false in ~/.flowforge-capture/config.json, or set the environment variable FLOWFORGE_CAPTURE_NO_UPDATE=1.
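The opt-out rules and the 24-hour cache can be pictured as two small predicates. This is only a sketch of the behavior described above: the function names and the lastCheckedMs timestamp are assumptions, and only the "autoUpdate" key and the FLOWFORGE_CAPTURE_NO_UPDATE variable come from this README.

```typescript
const CHECK_INTERVAL_MS = 24 * 60 * 60 * 1000; // the 24-hour cache window

interface UserConfig { autoUpdate?: boolean }
type Env = Record<string, string | undefined>;

// Auto-update stays on unless either documented opt-out is present.
function autoUpdateEnabled(config: UserConfig, env: Env): boolean {
  if (env["FLOWFORGE_CAPTURE_NO_UPDATE"] === "1") return false;
  return config.autoUpdate !== false;
}

// Hypothetical cache rule: query the registry only if the last check is
// missing or at least 24 hours old.
function registryCheckDue(lastCheckedMs: number | undefined, nowMs: number): boolean {
  return lastCheckedMs === undefined || nowMs - lastCheckedMs >= CHECK_INTERVAL_MS;
}
```

Passing the environment and clock in as arguments, rather than reading them globally, keeps both rules easy to unit test.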

Transcription

When you configure a FlowForge URL via init, narration audio is sent to FlowForge's backend, which transcribes it via OpenAI Whisper using the org's API key. You never see or manage a key.

If the FlowForge proxy is unreachable (offline, server down), the CLI falls back to direct OpenAI calls using OPENAI_API_KEY from your environment or the optional fallback key you set in init. Cost reference: ~$0.006 per minute of audio.
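The proxy-first, OpenAI-second order can be sketched as follows. The Transcriber type and transcribeWithFallback are hypothetical names and the two backends are injected as plain functions, so nothing here reflects the CLI's real module layout. The cost helper simply applies the ~$0.006 per minute figure quoted above.

```typescript
interface Segment { startSec: number; endSec: number; text: string }
type Transcriber = (audio: Uint8Array) => Promise<Segment[]>;
type Backend = "proxy" | "openai";
interface TranscribeResult { backend: Backend; segments: Segment[] }

// Try the FlowForge proxy first; fall back to a direct OpenAI call only if
// the proxy fails and a fallback backend (that is, a key) is available.
async function transcribeWithFallback(
  audio: Uint8Array,
  viaProxy: Transcriber,
  viaOpenAI: Transcriber | undefined,
): Promise<TranscribeResult> {
  try {
    return { backend: "proxy", segments: await viaProxy(audio) };
  } catch {
    if (viaOpenAI === undefined) {
      throw new Error("FlowForge proxy unreachable and no OpenAI key configured");
    }
    return { backend: "openai", segments: await viaOpenAI(audio) };
  }
}

const USD_PER_MINUTE = 0.006; // cost reference quoted in this README

// Rough cost of transcribing a recording of the given length.
function estimateCostUsd(durationSec: number): number {
  return (durationSec / 60) * USD_PER_MINUTE;
}
```

At that rate, a capture that runs to the 120-second hard cap costs about $0.012 to transcribe.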

The narration is uploaded to your FlowForge instance or to OpenAI directly — whichever path runs. If that's not acceptable for a deployment, we can swap to local Whisper; the transcribe module is ~100 lines and the rest of the pipeline doesn't care which backend produced the segments.

Configuration overrides

| Setting | Lookup order |
|---|---|
| FlowForge URL, project, captures dir | ~/.flowforge-capture/config.json (written by init) |
| Per-project overrides | .flowforge-capture.local.config.mjs in CWD |
| OpenAI key | OPENAI_API_KEY env var → user config openaiApiKey |
| Auth state file | .flowforge-capture-auth.json in CWD |
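As an example of reading a lookup order, the OpenAI key rule (environment variable first, then the user config) could look like the sketch below. The resolveOpenAiKey function is a hypothetical name; OPENAI_API_KEY and openaiApiKey are the identifiers from the table. Treating blank values as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;
interface UserConfig { openaiApiKey?: string }

// OPENAI_API_KEY wins when set; otherwise use the key saved by init.
// Blank or whitespace-only values are treated as "not set" (an assumption).
function resolveOpenAiKey(env: Env, userConfig: UserConfig): string | undefined {
  const fromEnv = env["OPENAI_API_KEY"]?.trim();
  if (fromEnv) return fromEnv;
  const fromConfig = userConfig.openaiApiKey?.trim();
  return fromConfig ? fromConfig : undefined;
}
```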

Development (from source)

Clone the parent repo and work in the flowforge-capture/ subfolder:

cd flowforge-capture
npm install
npm run build
npm test -- --runInBand
node dist/cli.js --help

Tests use --runInBand because parallel Jest workers can run out of memory with the new transcribe and auto-update dependencies.

The package's postinstall hook skips itself when npm_config_global is unset, so cloning the repo for development doesn't accidentally trigger the ~150 MB Playwright download. To set up Playwright manually for dev work: npx playwright install chromium.
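A guard of this kind can be a single check on the install environment. The sketch below is an assumption about how such a hook might decide, not the package's actual script; shouldRunPostinstall is a hypothetical name, and treating the value "false" as a skip is an extra assumption.

```typescript
type Env = Record<string, string | undefined>;

// Run the heavy downloads only for global installs. npm exposes the
// --global flag to lifecycle scripts as the npm_config_global variable.
function shouldRunPostinstall(env: Env): boolean {
  const flag = env["npm_config_global"];
  // Unset or empty means a local install (for example a dev clone): skip.
  return flag !== undefined && flag !== "" && flag !== "false";
}
```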

Architecture

Three modules, with a deterministic markdown-generation contract locked by a golden test:

capture/   Playwright + ffmpeg + global hotkey -> raw artifacts
process/   raw artifacts -> ProcessedCapture (typed intermediate)
generate/  ProcessedCapture -> flow.md

The markdown template and contract live in src/generate/markdown.ts. The golden test in test/generate.test.ts locks it byte-for-byte against test/fixtures/expected-flow.md. Do not change the template without updating the fixture.
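For readers new to golden tests, the idea is to render a fixed input and require the output to equal a stored fixture exactly. The sketch below uses a toy renderer and a toy ProcessedCapture shape, both hypothetical; the real template in src/generate/markdown.ts and the real fixture are not reproduced here.

```typescript
// Hypothetical, heavily simplified stand-in for the typed intermediate.
interface ProcessedCapture {
  flowName: string;
  activities: { title: string; calls: string[] }[];
}

// Toy deterministic renderer: same input always yields the same string.
function renderFlow(capture: ProcessedCapture): string {
  const lines: string[] = [`# Flow: ${capture.flowName}`, ""];
  capture.activities.forEach((activity, i) => {
    lines.push(`## ${i + 1}. ${activity.title}`);
    for (const call of activity.calls) lines.push(`- ${call}`);
    lines.push("");
  });
  return lines.join("\n");
}

// Index of the first differing character, or -1 when the strings are equal.
function firstDifference(actual: string, expected: string): number {
  const limit = Math.min(actual.length, expected.length);
  for (let i = 0; i < limit; i++) {
    if (actual[i] !== expected[i]) return i;
  }
  return actual.length === expected.length ? -1 : limit;
}

// Golden check: any deviation from the stored fixture is a failure.
function assertMatchesGolden(actual: string, expected: string): void {
  const at = firstDifference(actual, expected);
  if (at !== -1) throw new Error(`Output differs from golden fixture at character ${at}`);
}
```

In a real test the expected string would be read from the fixture file, and reporting the first differing offset makes an accidental whitespace change much easier to locate.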

Releasing (maintainers)

Bump version in package.json
Commit + push
Tag: git tag capture-v1.2.3 && git push origin capture-v1.2.3

The .github/workflows/release-capture.yml workflow fires on the tag and publishes to npm. -rc / -beta / -alpha suffix tags run a --dry-run publish so you can validate without polluting the registry. Tags must match package.json version exactly — the workflow refuses mismatches.
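The tag rules above amount to a small validation step. The following sketch mirrors them under stated assumptions: planRelease and ReleasePlan are hypothetical names, the capture-v prefix is taken from the example tag, and the workflow's actual implementation is not shown in this README.

```typescript
interface ReleasePlan { version: string; dryRun: boolean }

// Validate a release tag against package.json and decide whether the
// publish should be a dry run.
function planRelease(tag: string, packageVersion: string): ReleasePlan {
  const version = /^capture-v(.+)$/.exec(tag)?.[1];
  if (version === undefined) {
    throw new Error(`Tag "${tag}" does not look like capture-vX.Y.Z`);
  }
  // The tag must match the package.json version exactly.
  if (version !== packageVersion) {
    throw new Error(`Tag version ${version} does not match package.json version ${packageVersion}`);
  }
  // -rc / -beta / -alpha suffixes publish with --dry-run only.
  const dryRun = /-(rc|beta|alpha)/.test(version);
  return { version, dryRun };
}
```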

One-time setup: an npm Automation token in repo secrets as NPM_TOKEN.
