one-shot-ui

v0.9.0

Published

7 days ago

Deterministic UI extraction and diffing from screenshots

Vulnerabilities: 0 high, 0 medium, 0 low

tn0123

ui screenshot extract diff design-to-code cli agent vision mcp model-context-protocol claude-code

one-shot-ui

Catch what the eye can't.

Deterministic screenshot diffing for AI coding agents. Turn a reference screenshot into structured data, diff any build against it — pixel, layout, color, and type — then get the exact CSS to fix.

The problem

AI agents get UI ~90% of the way there — then stall. The layout looks right, but a card is 8px too tall, a panel is the wrong shade of gray, a shadow is flat, a gap is off by 24px. Asking the model to "look at the screenshot again and fix it" is slow, and you get a different answer every time.

one-shot-ui closes that last 10% deterministically. It extracts structured data from a reference screenshot — layout regions, colors, typography, spacing, design tokens — diffs your implementation against it, and returns specific, ranked fixes, not "make it look more like this."

Set width to 616px (currently 640px)
Change the fill color to #303040.
Set box-shadow to -3px 0px 24px 0px rgba(28, 29, 38, 0.32).
gap: 176px; /* currently ~152px */

Copy-paste CSS, ranked by visual impact — every example above is real output from the run below.

Watch it converge

The run command loops extract → capture → compare → fix until the heatmap goes quiet. In the demo run, the agent's first build looked identical to the eye, yet one-shot-ui flagged 15 concrete deltas (position, size, color, shadow, spacing). The loop then drove the build to ~2.5% pixel mismatch, within ~0.5 percentage points of the tool's own estimated irreducible floor for this design (≈2%, from sub-pixel font rendering).

Why one-shot-ui

Deterministic, not vibes. Stable pixel + structural diff scores — same input, same numbers — so you can gate CI on "is this pixel-close enough?"
Exact fixes, not nudges. It returns concrete CSS (width: 616px, #303040, gap: 176px), grouped by component and ranked by visual impact.
Structural, not just pixels. Detects missing/extra elements, position & size shifts, color, shadow, spacing, and typography — and labels which differences are irreducible (anti-aliasing, photographic content) so agents don't chase ghosts.
Agent-native. Ships an AGENTS.md (auto-discovered by Claude Code, Cursor, Codex, …) plus a Claude Code skill, so your agent drives it without hand-holding.
Local & private. Pixel diffing, OCR, and layout extraction all run on your machine. No images leave your box, no API keys.
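
As a rough illustration of what "grouped by component and ranked by visual impact" could look like in code, here is a minimal TypeScript sketch. The `Fix` shape, the `impact` scores, and the component names are hypothetical and are not taken from one-shot-ui's actual JSON output; only the CSS values come from the examples earlier in this README.

```typescript
// Hypothetical shape for a single suggested fix (NOT the tool's real schema).
interface Fix {
  component: string; // assumed: name of the UI region the fix applies to
  css: string;       // the concrete CSS declaration to apply
  impact: number;    // assumed: higher means more visually significant
}

// Sort fixes by descending impact, then group them by component.
// A Map keeps insertion order, so components appear in order of their
// highest-impact fix, and fixes inside each group stay ranked.
function rankFixes(fixes: Fix[]): Map<string, Fix[]> {
  const sorted = [...fixes].sort((x, y) => y.impact - x.impact);
  const groups = new Map<string, Fix[]>();
  for (const fix of sorted) {
    const group = groups.get(fix.component) ?? [];
    group.push(fix);
    groups.set(fix.component, group);
  }
  return groups;
}

// Usage with made-up impact scores and component names.
const ranked = rankFixes([
  { component: "sidebar", css: "width: 616px", impact: 0.4 },
  { component: "card-grid", css: "gap: 176px", impact: 0.9 },
  { component: "sidebar", css: "box-shadow: -3px 0px 24px 0px rgba(28, 29, 38, 0.32)", impact: 0.7 },
]);

for (const [component, group] of ranked) {
  console.log(component);
  for (const fix of group) console.log(`  ${fix.css}`);
}
```

An agent working through such a list top to bottom would address the most visible differences first, which is presumably the point of ranking by impact.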

Install

npm install -g one-shot-ui

For commands that need a browser (capture, run):

npx playwright install chromium

Quick start

# Diff your implementation against a reference and see exactly what's off
one-shot-ui compare reference.png build.png --json --heatmap heatmap.png

# Get copy-paste CSS fixes, ranked by impact
one-shot-ui suggest-fixes reference.png build.png --json

# Or run the full automated loop until it converges
one-shot-ui run reference.png --impl ./index.html --max-passes 5 --threshold 0.02

Every command supports --json for structured, agent-friendly output.
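
The `--threshold 0.02` in the `run` example reads naturally as a mismatch fraction (2% of pixels), though this README does not define the metric precisely. Under that assumption, the following TypeScript sketch shows one simple way to compute such a ratio from two same-size RGBA buffers and gate on it. `mismatchRatio`, `passesGate`, and the per-channel `tolerance` are illustrative names, not part of one-shot-ui; the tool's real scoring also involves alignment and structural analysis that this sketch omits.

```typescript
// Fraction of pixels (0..1) where any RGBA channel differs by more than
// `tolerance`. Both buffers must hold RGBA data for images of equal size.
function mismatchRatio(a: Uint8Array, b: Uint8Array, tolerance = 0): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be same-size RGBA data");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;

  let mismatched = 0;
  for (let p = 0; p < pixels; p++) {
    const offset = p * 4;
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[offset + c] - b[offset + c]) > tolerance) {
        mismatched++;
        break; // count each pixel at most once
      }
    }
  }
  return mismatched / pixels;
}

// Assumed gate semantics: pass when the ratio is strictly below the threshold.
function passesGate(ratio: number, threshold: number): boolean {
  return ratio < threshold;
}
```

Because the function is pure, the same two buffers always yield the same number, which is the property that makes a score usable as a CI gate.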

Use it with your coding agent

one-shot-ui ships an AGENTS.md (auto-discovered by Claude Code, Cursor, Codex, and other agent tools) plus a skill/SKILL.md for Claude Code.

Install the skill in one line:

mkdir -p .claude/skills/one-shot-ui && cp "$(npm root -g)/one-shot-ui/skill/SKILL.md" .claude/skills/one-shot-ui/

Use it as an MCP server

one-shot-ui also runs as a local MCP server, so any MCP-capable agent (Claude Code, Cursor, Cline, Windsurf, VS Code) can call compare, suggest_fixes, extract, tokens, and plan as tools — no shell glue. It runs over stdio, makes no network calls, and needs no API keys.

claude mcp add one-shot-ui -- npx -y one-shot-ui mcp

Or add to any client's MCP config:

{
  "mcpServers": {
    "one-shot-ui": {
      "command": "npx",
      "args": ["-y", "one-shot-ui", "mcp"]
    }
  }
}

See docs/MCP.md for per-client setup and registry publishing.

Commands

| Command | Purpose | Key Flags |
|---------|---------|-----------|
| extract | Analyze a screenshot into layout, color, and text data | --json, --no-ocr, --overlay, --fine |
| compare | Pixel + structural diff between two screenshots | --json, --heatmap, --dom-diff |
| tokens | Extract design tokens (colors, spacing, radii) | --json |
| plan | Generate an implementation strategy | --json |
| capture | Screenshot a URL or local HTML file | --url, --file, --output |
| suggest-fixes | Tailwind/CSS fix suggestions from a diff | --json, --top, --dom-diff, --framework |
| run | Multi-pass extract→capture→compare→fix loop | --impl, --max-passes, --threshold |
| benchmark | Run benchmark suites | --json, --output |

How it works

extract — segments the reference into layout regions, samples colors/tokens, and OCRs text.
capture — screenshots your implementation (URL or local HTML) at a matched viewport.
compare — aligns the two, computes a pixel heatmap and a structural diff, and classifies each issue (layout / color / typography / spacing) plus whether it's actionable.
suggest-fixes — turns issues into concrete, ranked CSS edits.

run chains all four in a loop until the diff drops below --threshold.
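
The control flow described above can be sketched in TypeScript as follows. This is a sketch under stated assumptions, not one-shot-ui's implementation: the `LoopSteps` callbacks are hypothetical stand-ins for the four stages, the comparison score is assumed to be a mismatch fraction between 0 and 1, and the reference is assumed to be extracted once up front because it does not change between passes.

```typescript
// Hypothetical stand-ins for the four stages; the real internals are not
// documented in this README.
interface LoopSteps {
  extract: () => void;                    // analyze the reference screenshot
  capture: () => string;                  // returns a path to the build screenshot
  compare: (buildPath: string) => number; // assumed: mismatch fraction, 0..1
  applyFixes: (buildPath: string) => void;
}

interface LoopResult {
  converged: boolean;
  passes: number;   // number of capture/compare passes actually executed
  mismatch: number; // last measured mismatch (Infinity if no pass ran)
}

function runLoop(steps: LoopSteps, maxPasses: number, threshold: number): LoopResult {
  steps.extract(); // assumption: the reference only needs extracting once

  let mismatch = Number.POSITIVE_INFINITY;
  let passes = 0;
  while (passes < maxPasses) {
    passes++;
    const buildPath = steps.capture();
    mismatch = steps.compare(buildPath);
    if (mismatch < threshold) {
      return { converged: true, passes, mismatch };
    }
    steps.applyFixes(buildPath); // only fix when the build is still off
  }
  return { converged: false, passes, mismatch };
}
```

Bounding the loop with a pass limit matters because, per the README, some differences are irreducible: a threshold set below that floor would otherwise never be reached.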

Development

Requires Bun.

bun install
bun run install:browsers   # Playwright Chromium
bun run typecheck

Dev scripts run directly from source:

bun run dev:extract -- ./reference.png --json
bun run dev:compare -- ./reference.png ./build.png --json

Build for npm:

bun run build

License

MIT
