cliproof
v1.0.0
Published
Prove your CLI actually works. Captures a real terminal command and its real output as a polished screenshot or GIF, redacts secrets, and embeds it into your README as proof-it-runs evidence. Agent-agnostic (Claude Code, Cursor, Codex, OpenCode, Gemini, W
Maintainers
Readme
📸 cliproof
Your README should show it works — not just say it.
Capture any real command and its real output — tests, builds, servers,
scripts, pipelines — as a polished screenshot or GIF, redact any secrets,
and embed it into your README.md as honest, durable evidence it works.
Then let CI fail the moment that evidence goes stale.
☝️ A real cliproof capture — generated by this repo's own skill, of its own preflight command.
Why cliproof
A README that says "it works" convinces no one. A screenshot of the command actually running — version banner, passing tests, real output — convinces everyone in two seconds. cliproof makes that screenshot honest, repeatable, and durable: it runs the real command, captures the genuine output, strips any secrets, embeds it in your README, and can re-check in CI that the command still produces the same result.
It works as a Claude Code Agent Skill, an installable plugin, and an
agent-agnostic npm CLI (npm i -g cliproof) for Cursor, Codex, OpenCode,
Gemini CLI, Windsurf, or any agent that runs shell commands.
How is this different?
Most "agent eyes" tools verify browser/UI sessions. cliproof verifies the terminal — the build, the test run, the CLI itself — and turns it into durable README proof.
| | screenshot/GIF tools | browser-proof tools | cliproof | |---|:---:|:---:|:---:| | Real command + real output | ✅ | ➖ (browser) | ✅ | | Static screenshot (themed window) | some | ❌ | ✅ | | Animated GIF | ✅ | ✅ | ✅ | | Enforced secret redaction | ❌ | ❌ | ✅ | | Destructive-command guard | ❌ | ❌ | ✅ | | Idempotent README embed | ❌ | ❌ | ✅ | | CI freshness check (fails when proof drifts) | ❌ | ❌ | ✅ | | Pass/fail verify (10+ langs) | ❌ | ✅ | ✅ | | Proof → PR comment | ❌ | ✅ | ✅ | | Agent-agnostic install | ❌ | ✅ | ✅ | | Zero-dependency, no-network core | ❌ | ❌ | ✅ |
cliproof's ✅s are backed by tests — see Testing. (Animated GIF is exercised best-effort in CI since it needs ttyd; the PR-comment post is a gh side-effect, only its body is unit-tested.)
Who it's for
cliproof works for anyone who runs shell commands and wants durable proof:
- Open-source maintainers — show contributors and users the project ships
- Backend & platform engineers — prove builds, servers, and scripts run
- QA & DevOps teams — capture test runs and pipeline output as evidence
- AI agent builders — wire capture → redact → embed as an MCP tool chain
- Anyone writing a README — stop saying it works; show it
Features
- 📸 Static screenshots — macOS/iOS window or Windows-terminal look, one-flag themes (
--preset macos|github-dark|nord|iterm|win11). - 🎬 Animated GIFs — scripted typing + live output via
vhs. - 🔒 Enforced secret redaction — masks API keys, tokens, JWTs, private keys; normalises emails/IPs/home paths before anything is committed. Extend with a
.cliproof/redact.jsonpolicy (custom patterns + allowlist). - 🛡️ Command-safety guard — refuses destructive/exfiltration commands (
rm -rf /,curl … | sh, credential reads). - ♻️ Idempotent README inserts — marker blocks with diff preview + backup.
- 🎯 Determinism + freshness check — normalises volatile output (durations, timestamps, paths) and fails CI when a proof drifts from reality. Reusable GitHub Action included.
- ✅ Verify — run a command and judge pass/fail from exit code + error signatures across 10+ languages; emit a PR-ready report.
- 🧠 Auto-suggest — scans your repo (
package.json/Makefile/--help/quickstart) for the best command to capture. - 🎞️ Storyboard — stitch a command sequence into one "session" image.
- 💬 PR proof — post the screenshot + verdict as a pull-request comment.
- 🧰 Zero-dependency core — pure Python stdlib, no network, fully auditable.
Install
npm — works in any agent, pipeline, or IDE:
npm install -g cliproof
cliproof install # detects Claude Code, Cursor, Codex, OpenCode, Gemini, WindsurfMCP server (Claude Code, Cursor, Windsurf, LangChain — any MCP-compatible agent):
// .mcp.json
{ "mcpServers": { "cliproof": { "command": "cliproof", "args": ["mcp"] } } }Python library:
pip install cliprooffrom cliproof import capture, redact, embed
result = capture("pytest -q", preset="catppuccin")
redact(result.image, in_place=True)
embed("README.md", image=result.image, block_id="tests")Docker (any CI — GitHub Actions, GitLab CI, Jenkins, CircleCI):
- docker run --rm -v $PWD:/repo \
ghcr.io/aks-builds/cliproof:latest \
capture --execute "pytest -q" -o /repo/.github/media/tests.svg --jsonLocal HTTP daemon (IDE extensions, polyglot callers):
cliproof serve # starts on localhost:7070
curl localhost:7070/healthClaude Code plugin:
/plugin marketplace add aks-builds/cliproof
/plugin install cliproof@cliproof # @cliproof is the marketplace name, not the repo pathManual skill: copy skills/cliproof/ into ~/.claude/skills/cliproof/.
Capture tools (installed on demand, official sources, version-pinned):
freeze—go install github.com/charmbracelet/[email protected]·brew install charmbracelet/tap/freeze·scoop install freezevhs(GIFs only) —brew install vhs(needsffmpeg+ttyd; on Windows use WSL)
Usage
In your agent, just ask: "screenshot myapp --help for the README" or
"add proof the tests pass". Under the hood (also runnable directly, or via the
cliproof <cmd> CLI):
cliproof suggest . # what's the best proof command?
cliproof guard -- "myapp --help" # safe to capture?
cliproof capture --execute "myapp --help" \
--preset macos -o .github/media/help.svg # real capture → redactable SVG
cliproof redact .github/media/help.svg --in-place # strip secrets (exit 3 = blocked)
cliproof embed README.md --image .github/media/help.svg \
--alt "myapp --help" --id help --heading Demo # idempotent insert (diff + .bak)embed maintains a keyed marker block, so re-running with the same --id
updates in place instead of duplicating:
<!-- cliproof:start id=help -->

<!-- cliproof:end id=help -->
capturewrapsfreeze: it closes stdin (rawfreezehangs on an inherited pipe in agent/CI shells) and writes SVG (its PNG rasterizer can crash on Windows). Userasterizeto make a PNG after redaction.
Keep your proofs fresh (the moat)
A screenshot proves your CLI worked once. cliproof keeps it honest. Record a
baseline of the command's normalised output, list it in .cliproof/proof.json,
and let CI re-run it and fail if it drifts:
// .cliproof/proof.json
{ "proofs": [
{ "id": "cli-help", "command": "myapp --help",
"baseline": ".github/media/cli-help.cliproof.txt",
"image": ".github/media/cli-help.svg" } ] }# .github/workflows/freshness.yml
- uses: aks-builds/cliproof/.github/actions/cliproof-check@v1
with: { manifest: .cliproof/proof.json }cliproof check --update records baselines; cliproof check compares. Volatile
noise (durations, timestamps, temp paths, ports) is normalised so only real
drift ("3 tests now fail") fails the build.
Security
cliproof runs real shell commands and writes into your repo, so security is enforced by code, not by convention:
- Pure stdlib, no network — zero third-party deps, no build step, no telemetry. Audit
skills/cliproof/scripts/. - Redaction before embedding —
redact.pyblocks SECRET-class findings (exit 3); extend via.cliproof/redact.json. - Command guard —
guard.pyrefuses destructive/exfiltration commands. - Pinned, official tools —
freeze/vhsversion-pinned, installed only with confirmation.
Threat model: references/security.md. Report privately: SECURITY.md.
Repository layout
cliproof/
├── .claude-plugin/ plugin.json · marketplace.json
├── bin/cli.js agent-agnostic npm CLI (install + passthrough)
├── skills/cliproof/
│ ├── SKILL.md the skill (rules, workflow, gates)
│ ├── references/ tooling.md · security.md
│ ├── scripts/ preflight · guard · capture · redact · rasterize · embed
│ │ · normalize · check · suggest · verify · storyboard · annotate · pr
│ └── assets/ demo.tape.template (VHS)
├── .cliproof/ proof.json (freshness manifest) · redact.json (policy)
├── tests/ pytest (87 tests)
└── .github/ CI · CodeQL · Dependabot · release/publish · freshness actionTesting
Every claim above is backed by tests, at one of two levels:
- Unit (114 pytest + 5 Node tests, runs on every PR): redaction (+ policy/allowlist), command guard, idempotent embed, determinism + freshness engine, verify across languages, suggest, storyboard, annotate, capture-wrapper + theme presets, rasterizer selection, manifest validation, the agent-agnostic
install, and a guard test that asserts the scripts import only stdlib and no network modules. - Integration (
integration.yml, real tools on Linux): installsfreezeand runs the full pipeline end-to-end —capture --execute→ assert SVG →redact→embed(asserts idempotency) →checkround-trip. A realvhsGIF is rendered best-effort (needsttyd).
Not asserted by tests: the freeze PNG rasterizer on Windows (it crashes there — that's why cliproof captures SVG), and the gh PR-comment network call (only the comment body is unit-tested). Run it all locally with pytest -q and npm test.
Contributing
PRs welcome — see CONTRIBUTING.md. Good first contributions:
redaction patterns, guard rules, themes, language signatures for verify. Found
a security issue? See SECURITY.md. Licensed under MIT.
