clone-architect
v2.7.0
Extract real design (computed CSS + screenshots + narrative DESIGN.md) from any URL using Playwright + getComputedStyle. The verifiable alternative to manually-written design system docs.
Downloads
2,216
Clone Architect
Extract any website's real design system in 90 seconds. Every value is verified by getComputedStyle(), not guesswork.
🌐 Live demo: https://clone-architect.ps-tools.dev/ · 📦 npm: npm i -g clone-architect
Clone Architect extracts the computed CSS of any public website — not the raw stylesheets, but what the browser actually renders. It generates a 16-section narrative DESIGN.md (same format AI coding agents like Claude Code / Cursor / Bolt natively consume), plus structured tokens.json, full screenshots, and a layout analysis.
📄 Independent audit (2026-05-28): a stress test of 28 brands against getdesign.md (GD). Clone Architect (CA) wins on Volume (1.8×), Color richness (1.5×), and Verifiability (100%); GD wins on Narrative editorial.
Built with Playwright + TypeScript. MIT, local-first, no API keys, no accounts, no $39 per brand.
⚡ vs getdesign.md
| | getdesign.md | Clone Architect |
|---|---|---|
| Brands | 73 hand-curated | 88 pre-extracted · Unlimited new — any public URL |
| Source | Manual descriptions | getComputedStyle() ground truth |
| Proof | None | Desktop + Mobile screenshots saved |
| Updates | Static (frozen) | clone-architect update <domain> |
| Format | DESIGN.md only | DESIGN.md + tokens.json + raw-css.json + screenshots |
| Price | $39/brand for custom | Free MIT |
| Verification | Trust the author | Every claim screenshot-verified (demo) |
Quick start
Path A — Browse the catalog (30 sec, no Playwright needed)
npm install -g clone-architect
clone-architect list # Browse 88 pre-extracted brands
clone-architect add linear.app # Install Linear's DESIGN.md + tokens.json to cwd
clone-architect add stripe.com # Install Stripe's design system
clone-architect random # Discover a random brand
clone-architect search dark # Find brands by keyword
Then tell your AI: "Use DESIGN.md as reference before writing any UI."
Path B — Extract any URL (5 min, requires Playwright)
npm install -g clone-architect
npx playwright install chromium # One-time ~150MB download
clone-architect extract https://yoursite.com
Output lands in extractions/<domain>/:
- DESIGN.md: 9-section narrative (Visual Theme, Colors, Typography, Components, Layout, Depth, Do's/Don'ts, Responsive, Agent Prompts)
- tokens.json: normalized design tokens (semantic palette from CSS custom properties)
- raw-css.json: full getComputedStyle() dump (ground truth, audit-friendly)
- screenshots/: desktop 1440px + mobile 390px
- layout-analysis.md: structural analysis
Why not just use getdesign.md?
Respect to the VoltAgent team — getdesign.md proved the format. But it's a static, manually-written catalog.
| | getdesign.md | Clone Architect |
|---|---|---|
| Extract any URL | 66 pre-written brands | Unlimited |
| Source of truth | Manually written / AI | getComputedStyle() ground truth |
| Raw CSS audit shipped | No | raw-css.json (21k lines for Linear) |
| Screenshots | None | Desktop + Mobile |
| Snapshot diff between versions | Static file | Re-extract → token-diff |
| Pricing per brand | $39 + 2 days wait | Free CLI · 90 seconds |
| License | MIT catalog / paid custom | MIT end-to-end |
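The "Re-extract → token-diff" row can be pictured as a key-by-key comparison of two token snapshots. The sketch below is illustrative only: it assumes a flat name-to-value token map, which may not match the real tokens.json schema, and `diffTokens` and the sample token names are hypothetical, not part of the CLI.

```typescript
// Hypothetical flat token map; the real tokens.json schema may be nested.
type TokenMap = Record<string, string>;

interface TokenDiff {
  added: string[];
  removed: string[];
  changed: { name: string; before: string; after: string }[];
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Compares two snapshots and reports which tokens were added, removed, or changed.
function diffTokens(before: TokenMap, after: TokenMap): TokenDiff {
  const diff: TokenDiff = { added: [], removed: [], changed: [] };
  for (const [name, value] of Object.entries(after)) {
    const previous = hasOwn(before, name) ? before[name] : undefined;
    if (previous === undefined) diff.added.push(name);
    else if (previous !== value) diff.changed.push({ name, before: previous, after: value });
  }
  for (const name of Object.keys(before)) {
    if (!hasOwn(after, name)) diff.removed.push(name);
  }
  return diff;
}

// Made-up sample values, for illustration only.
const oldTokens: TokenMap = { "color.primary": "#5e6ad2", "radius.md": "6px" };
const newTokens: TokenMap = { "color.primary": "#6872e5", "space.lg": "24px" };
const tokenDiff = diffTokens(oldTokens, newTokens);
console.log(JSON.stringify(tokenDiff, null, 2));
```

Keeping the diff as plain data (rather than printing it directly) makes it easy to feed into a report or a CI check.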
Dev / from source
git clone https://files.ps-tools.dev/clone-architect/
cd clone-architect
npm install
npx playwright install chromium
# Full pipeline (extract + analyze + tokenize + DESIGN.md)
npx tsx scripts/clone.ts https://linear.app
# Or step-by-step
npx tsx scripts/extract.ts https://linear.app
npx tsx scripts/tokenize.ts linear.app
npx tsx scripts/generate-design-md.ts linear.app
# Extract a specific block (carousel, hero, product card)
npx tsx scripts/extract-block.ts https://addictsneakers.com "li.product-card"
# Register all extracted blocks into a queryable bank
npx tsx scripts/bank.ts register --all
# Generate clean HTML
npx tsx scripts/bank-inject.ts linear-app--block--header
# Apply another site's design tokens (retheme)
npx tsx scripts/bank-inject.ts wethenew-com--block--header \
--retheme addictsneakers.com \
--format react
What it actually does
Extraction pipeline
For any URL, Clone Architect captures:
- Every computed CSS value on every key element (body, header, nav, main, cards, buttons, inputs, headings…) via getComputedStyle()
- CSS custom properties (700+ on Shopify sites, enabling faithful retheming)
- @keyframes animations with all frames (0% → 100%), via a document.styleSheets walk
- Pseudo-elements (::before / ::after) with content + positioning
- Z-index stacking context (which element sits on top of what)
- Transform 3D matrix parsed (rotate, translate3d, perspective)
- Grid layouts with grid-template-columns/rows/areas
- Container queries (@container rules + container-type)
- Responsive per-breakpoint snapshots at 360px + 768px + 1440px
- Font faces (URL, weight, style) + OpenType features + variable axes
- Full DOM tree for any CSS selector (extract-block), recursively
- Screenshots full-page + scroll positions, desktop + mobile
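To make the first capture in the list above concrete, here is a minimal sketch of reading a fixed set of properties from a computed style. It is not the project's extractor: `pickComputedStyles`, `StyleLike`, and the property list are hypothetical names chosen for illustration, and a stub replaces the browser so the snippet runs under plain Node.

```typescript
// Structural subset of CSSStyleDeclaration, so the helper can run outside a browser.
interface StyleLike {
  getPropertyValue(property: string): string;
}

// Hypothetical property list; a real extraction presumably tracks far more.
const TRACKED_PROPERTIES = ["color", "background-color", "font-family", "font-size"] as const;

// Reads each requested property and keeps only the non-empty values.
function pickComputedStyles(style: StyleLike, properties: readonly string[]): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const property of properties) {
    const value = style.getPropertyValue(property).trim();
    if (value !== "") picked[property] = value;
  }
  return picked;
}

// In a page, `getComputedStyle(element)` returns an object that satisfies StyleLike.
// The stub below stands in for it, with made-up values.
const fakeValues: Record<string, string> = { color: "rgb(17, 17, 17)", "font-size": "16px" };
const fakeStyle: StyleLike = { getPropertyValue: (property) => fakeValues[property] ?? "" };

const headerStyles = pickComputedStyles(fakeStyle, TRACKED_PROPERTIES);
console.log(headerStyles);
```

Reading from the computed style rather than the stylesheet is what yields resolved values (for example `rgb(...)` colors and pixel sizes) instead of authored ones.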
Anti-bot stealth mode for sites behind Cloudflare / DataDome:
npx tsx scripts/extract-block.ts https://protected-site.com "header" --stealth
Generation pipeline
- HTML self-contained (inline CSS, assets downloaded locally)
- React JSX with typed props + style objects
- Tailwind with utility classes + arbitrary values fallback
- Token retheming — apply one site's design tokens on another's structure
- 5 built-in theme presets (dark-minimal, light-clean, warm-cream, neon-dark, ocean-light)
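Token retheming depends on mapping each source color to a counterpart in the target palette. The sketch below shows one simple way to do that, using nearest-neighbor matching by squared Euclidean distance in RGB. That metric is an assumption; the matching used by retheme.ts is not documented here, and `nearestColor` and `remapColors` are hypothetical names.

```typescript
type Rgb = [number, number, number];

// Parses a 6-digit hex color such as "#5e6ad2" into its RGB channels.
function parseHex(hex: string): Rgb {
  if (!/^#[0-9a-f]{6}$/i.test(hex)) throw new Error(`Unsupported color: ${hex}`);
  const value = parseInt(hex.slice(1), 16);
  return [(value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff];
}

function distanceSquared(a: Rgb, b: Rgb): number {
  return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2;
}

// Returns the palette entry closest to `source` (assumed metric: RGB distance).
function nearestColor(source: string, palette: readonly string[]): string {
  const target = parseHex(source);
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const candidate of palette) {
    const distance = distanceSquared(target, parseHex(candidate));
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  if (best === undefined) throw new Error("Palette is empty");
  return best;
}

// Builds a source-to-target color map for a whole set of source colors.
function remapColors(sources: readonly string[], palette: readonly string[]): Record<string, string> {
  const mapping: Record<string, string> = {};
  for (const source of sources) mapping[source] = nearestColor(source, palette);
  return mapping;
}

// Made-up palettes, for illustration only.
const colorMapping = remapColors(["#0a0a0b", "#f7f8f8", "#5e6ad2"], ["#000000", "#ffffff", "#3b82f6"]);
console.log(colorMapping);
```

Plain RGB distance is easy to reason about but not perceptually uniform; a perceptual color space would likely give better matches for saturated colors.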
Measured fidelity
Clone Architect publishes real pixelmatch scores against original screenshots — no hand-waving:
| Site | Header match (pixelmatch @ threshold 0.1) |
|------|-----:|
| raindrop.io | 97.49% |
| linear.app | 35.38% (CSS-in-JS styled-components) |
| wethenew.com | limited (extraction block rect 0×0, known issue) |
| addictsneakers.com | 6.36% (header at y=419 in flow, rect mismatch) |
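A match percentage like the ones above is typically derived from a count of mismatched pixels. The sketch below assumes the score is 100 × (1 − mismatched / total), which the README does not state, and uses a deliberately naive per-channel comparison as a stand-in. pixelmatch itself uses a perceptual color-difference metric and anti-aliasing detection, so real scores will differ. Function names are hypothetical.

```typescript
// Counts pixels whose largest per-channel difference exceeds `threshold` (0..1).
// Naive stand-in for pixelmatch, which uses a perceptual metric instead.
function countMismatchedPixels(
  a: Uint8Array,
  b: Uint8Array,
  width: number,
  height: number,
  threshold: number,
): number {
  const expectedLength = width * height * 4; // RGBA, 4 bytes per pixel
  if (a.length !== expectedLength || b.length !== expectedLength) {
    throw new Error("Image buffers must both be width * height * 4 bytes");
  }
  let mismatched = 0;
  for (let pixel = 0; pixel < width * height; pixel++) {
    let maxDelta = 0;
    for (let channel = 0; channel < 4; channel++) {
      const i = pixel * 4 + channel;
      maxDelta = Math.max(maxDelta, Math.abs((a[i] ?? 0) - (b[i] ?? 0)));
    }
    if (maxDelta / 255 > threshold) mismatched++;
  }
  return mismatched;
}

// Assumed scoring formula: share of pixels that were not flagged as different.
function matchPercent(mismatched: number, totalPixels: number): number {
  return 100 * (1 - mismatched / totalPixels);
}

// Two 2×2 white images; the last pixel of the second one is turned black.
const original = new Uint8Array(16).fill(255);
const clone = Uint8Array.from(original);
clone.set([0, 0, 0], 12); // alpha stays 255

const mismatchedPixels = countMismatchedPixels(original, clone, 2, 2, 0.1);
const headerMatch = matchPercent(mismatchedPixels, 2 * 2);
console.log(`${headerMatch.toFixed(2)}%`);
```

Because the score is a ratio over the compared rectangle, a wrong or empty rectangle (as in the wethenew.com and addictsneakers.com rows) skews the result regardless of how good the generated CSS is.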
Honest take: Clone Architect excels at clean semantic headers + pure CSS sites (raindrop.io). It struggles with:
- CSS-in-JS runtime (emotion, styled-components, Stitches): generated classes such as .css-abc123 change on every build
- Layouts where our "header" selector grabs a fragment rather than the visual top region
- Typekit / Adobe Fonts / DRM-protected web fonts (detection OK, download blocked)
- WebGL / Canvas content (screenshot only, not reconstructable)
We measure this against 5 baseline sites in benchmarks/baseline-*.json. Run it yourself:
npx tsx scripts/bench.ts --baseline
What makes this different
| | Clone Architect | getdesign.md | v0.dev / Bolt | Figma Dev Mode |
|---|---|---|---|---|
| Extract any public URL | ✅ | ❌ (66 hand-picked) | ⚠️ (prompt) | ❌ |
| Real computed CSS | ✅ | ❌ (manual) | ❌ (generated) | N/A |
| Cross-site retheming | ✅ | ❌ | ❌ | ❌ |
| Animations captured | ✅ | ❌ | ❌ | ❌ |
| Open source | ✅ | ❌ | ❌ | ❌ |
| Pixelmatch baseline | ✅ | ❌ | ❌ | N/A |
| Component bank queryable | ✅ | ❌ | ❌ | ✅ |
Architecture
scripts/
├── shared/ # Types, CSS helpers, named colors, logger
├── extractors/ # Advanced captures (keyframes, pseudo, grid, etc.)
├── extract.ts # Main extraction pipeline (Playwright)
├── extract-block.ts # Targeted block DOM extraction
├── tokenize.ts # raw-css.json → normalized tokens.json
├── generate-design-md.ts # Narrative DESIGN.md (9 sections, getdesign.md format)
├── bank.ts # Component bank CLI (register/query/stats/show)
├── bank-register.ts # Snapshot creation + indexing
├── bank-inject.ts # HTML/React/Tailwind code generation
├── retheme.ts # Token remapping + fuzzy color matching + 5 presets
├── asset-downloader.ts # Local asset download with SSRF guards
├── renderers.ts # React + Tailwind renderers
├── browser-stealth.ts # playwright-extra + stealth plugin
├── compare.ts # Visual diff (pixelmatch)
├── bench.ts # Automated baseline pixelmatch harness
└── clone.ts # Top-level orchestrator (extract → analyze → tokenize → design-md)
Stats: 9.5K LoC TypeScript, strict mode, 0 any in critical paths, 72 unit tests, GitHub Actions CI.
Commands
# Extraction
npx tsx scripts/clone.ts <URL> # Full pipeline
npx tsx scripts/extract.ts <URL> # Extraction only
npx tsx scripts/extract-block.ts <URL> <selector> # Targeted block
npx tsx scripts/tokenize.ts <domain> # raw-css → tokens
npx tsx scripts/generate-design-md.ts <domain> # Narrative markdown
# Bank
npx tsx scripts/bank.ts register --all # Index all extractions
npx tsx scripts/bank.ts query --type block --tag dark
npx tsx scripts/bank.ts show <component-id>
npx tsx scripts/bank.ts diff <domain1> <domain2> # Token comparison
npx tsx scripts/bank.ts stats
# Generation
npx tsx scripts/bank-inject.ts <id> # HTML (default)
npx tsx scripts/bank-inject.ts <id> --format react
npx tsx scripts/bank-inject.ts <id> --format tailwind
npx tsx scripts/bank-inject.ts <id> --download-assets # Standalone w/ images
npx tsx scripts/bank-inject.ts <id> --retheme <domain> # Apply another site's tokens
npx tsx scripts/bank-inject.ts <id> --preset neon-dark # Built-in preset
# Benchmarks
npx tsx scripts/bench.ts --baseline # Measure fidelity on 5 sites
npx tsx scripts/bench.ts --compare baseline-*.json # Diff vs baseline
# Tests
npm test
Use cases
Where Clone Architect shines:
- Design system audits (extract 10 competitors, compare tokens)
- Design token migration from legacy sites to Tailwind
- UI trend watch: weekly digest of design trends
- Prototype with real design language from a reference site
- Education — understand how professional sites structure their CSS
Where it doesn't fit:
- Cloning SPAs with heavy runtime CSS-in-JS (linear.app, notion.so)
- Pixel-perfect clones of animation-heavy sites (Framer, Stripe 3D hero)
- Anything behind an auth wall
- Content extraction (we extract structure, not copy)
Legal
Clone Architect extracts publicly accessible CSS data (no content, no proprietary code). This is consistent with browser dev tools usage. Respect robots.txt. Do not use it to clone and deploy competitor sites.
Custom fonts served under commercial license (Typekit, Monotype) are detected but not downloaded — use your own licensed copies.
Contributing
Issues and PRs welcome. Good first issues are tagged good-first-issue on GitHub.
Run tests before submitting:
npm test # 72 unit tests
npx tsc --noEmit # Type check
Roadmap
Post-launch priorities driven by user feedback (not pre-committed):
- [ ] Google Fonts auto-detection + download (~+5 pts measured fidelity)
- [ ] CSS-in-JS runtime capture via CSSStyleSheet.prototype.insertRule interception
- [ ] Vision LLM refinement loop (Claude Sonnet) for sites <80% pixelmatch
- [ ] Semantic component mapping (dropdown → Radix, carousel → Embla)
- [ ] HTTP API wrapper (only if usage justifies it)
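The insertRule interception item on the roadmap could work roughly as sketched below: wrap the method, record every rule passed through it, then delegate to the original. This is a guess at one possible approach, not the planned implementation. `captureInsertedRules`, `RuleSink`, and `FakeSheet` are hypothetical, and a fake sheet class stands in for CSSStyleSheet so the snippet runs under plain Node.

```typescript
// Minimal shape shared by CSSStyleSheet and the stand-in below.
interface RuleSink {
  insertRule(rule: string, index?: number): number;
}

// Wraps `insertRule` on a prototype, recording each rule before delegating.
// Returns the captured rules and a function that restores the original method.
function captureInsertedRules(proto: RuleSink): { rules: string[]; restore: () => void } {
  const original = proto.insertRule;
  const rules: string[] = [];
  proto.insertRule = function (this: RuleSink, rule: string, index?: number): number {
    rules.push(rule);
    return original.call(this, rule, index);
  };
  return {
    rules,
    restore: () => {
      proto.insertRule = original;
    },
  };
}

// In a browser the target would be CSSStyleSheet.prototype; this fake stands in for it.
class FakeSheet implements RuleSink {
  readonly stored: string[] = [];
  insertRule(rule: string, index: number = this.stored.length): number {
    this.stored.splice(index, 0, rule);
    return index;
  }
}

const capture = captureInsertedRules(FakeSheet.prototype);
const sheet = new FakeSheet();
sheet.insertRule(".css-abc123 { color: red; }");
capture.restore();
sheet.insertRule(".css-def456 { color: blue; }"); // inserted, but no longer recorded
console.log(capture.rules);
```

For this to see rules injected by a CSS-in-JS runtime, the patch would have to be installed before the page's own scripts run; in Playwright, `page.addInitScript` is the usual hook for that.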
Credits
Built by Paul Sainton. Inspired by getdesign.md (VoltAgent) and the Atareh "Clone Any Website with Claude Code" guide.
Stack: Playwright · playwright-extra · pixelmatch · pngjs · tsx.
License
MIT © Paul Sainton
