pagescan

v0.0.2

A CLI tool that extracts design elements from web pages and saves them as XML

PageScan

Extract design elements from any webpage and transform them into LLM-ready XML format.

PageScan is a CLI tool that captures the HTML and CSS of a web page, strips content that does not matter for design analysis, and extracts design tokens automatically. It is aimed at design system analysis, UI component extraction, and AI-assisted design workflows.

Features

  • Interactive Extraction - Launch a browser, set up the page exactly as you want, then extract with a single keypress
  • Smart Optimization - Automatically removes scripts, comments, hidden elements, and unnecessary attributes
  • Design Token Detection - Auto-categorizes CSS variables into colors, spacing, and typography
  • LLM-Ready Output - Structured XML format optimized for AI analysis and processing
  • Zero Configuration - Works out of the box with sensible defaults

Getting Started

Installation

Install globally via npm:

npm install -g pagescan

Or use directly with npx (no installation required):

npx pagescan https://example.com

Alternative Package Managers

Using pnpm:

pnpm add -g pagescan

Using yarn:

yarn global add pagescan

Quick Start

  1. Run PageScan with a URL:

    pagescan https://github.com

  2. A browser window opens. Interact with the page (scroll, click, expand menus).

  3. Press Enter in the terminal when you are ready to capture.

  4. Find your output in the output/ directory as structured XML.

Usage

Basic Usage

pagescan <URL>

Examples:

pagescan https://github.com
pagescan https://stripe.com/pricing

How It Works

  1. Launch - Run the command with your target URL
  2. Interact - A browser window opens; scroll, click, expand menus, etc.
  3. Capture - Press Enter in the terminal when ready
  4. Extract - PageScan automatically captures and optimizes the HTML/CSS
  5. Done - Find your structured XML in the output/ directory

Example Workflow

# Extract a component library
pagescan https://ui.shadcn.com/docs/components/button

# Capture a landing page
pagescan https://vercel.com

# Analyze a pricing page
pagescan https://stripe.com/pricing

Output Format

PageScan generates timestamped XML files in the output/ directory with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<design-extraction>
  <metadata>
    <url>https://example.com</url>
    <timestamp>2025-01-01T00:00:00.000Z</timestamp>
  </metadata>

  <styles>
    <design-tokens>
      <colors>
        <token name="--primary-color" value="#3498db" />
        <token name="--background" value="#ffffff" />
      </colors>
      <spacing>
        <token name="--spacing-md" value="16px" />
        <token name="--gap-lg" value="24px" />
      </spacing>
      <typography>
        <token name="--font-base" value="Arial, sans-serif" />
        <token name="--font-size-lg" value="18px" />
      </typography>
    </design-tokens>

    <global-styles><![CDATA[
      /* Optimized CSS content */
    ]]></global-styles>
  </styles>

  <structure>
    <html-content><![CDATA[
      <!-- Cleaned HTML structure -->
    ]]></html-content>
  </structure>
</design-extraction>
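To consume this output in a script, the tokens can be read back from the XML text. The following is a minimal sketch, not part of PageScan: `readDesignTokens` is a hypothetical helper, and it assumes tokens are written exactly as in the sample above (`name` before `value`, double-quoted, self-closing). It uses regular expressions rather than a full XML parser, so attribute values containing XML entities are returned unescaped as-is.

```typescript
// Hypothetical helper (not part of PageScan): reads design tokens back out
// of an output file's XML text, grouped by their category element.
type Token = { name: string; value: string };
type TokenGroups = Record<string, Token[]>;

function readDesignTokens(xml: string): TokenGroups {
  const groups: TokenGroups = {};

  // Isolate the <design-tokens> block so CDATA sections are never scanned.
  const block = /<design-tokens>([\s\S]*?)<\/design-tokens>/.exec(xml);
  if (block === null) return groups;

  // Each child element (<colors>, <spacing>, ...) holds <token /> entries.
  const categoryRe = /<([a-z-]+)>([\s\S]*?)<\/\1>/g;
  let category: RegExpExecArray | null;
  while ((category = categoryRe.exec(block[1])) !== null) {
    const tokens: Token[] = [];
    const tokenRe = /<token\s+name="([^"]*)"\s+value="([^"]*)"\s*\/>/g;
    let token: RegExpExecArray | null;
    while ((token = tokenRe.exec(category[2])) !== null) {
      tokens.push({ name: token[1], value: token[2] });
    }
    groups[category[1]] = tokens;
  }
  return groups;
}
```

Given the sample document above, `readDesignTokens(xml).colors` would hold the two color tokens. For anything beyond quick scripting, a real XML parser is the safer choice.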

Optimization Features

PageScan intelligently cleans extracted HTML by removing:

  • <script> tags and JavaScript code
  • HTML comments
  • Hidden elements (display:none, visibility:hidden)
  • Elements with the hidden attribute
  • Event handlers (onclick, onload, etc.)
  • Long text content (preserving headings and buttons)

The result is compact output suited to design analysis and LLM processing.
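As an illustration of a few of these rules, here is a minimal string-based sketch. It is an assumption about how such cleaning could be approximated, not PageScan's actual optimizer: `cleanHtml` is a hypothetical name, and it covers only scripts, comments, and inline event handlers. Removing hidden elements or trimming long text reliably needs a real DOM, so those rules are left out.

```typescript
// Hypothetical sketch (not PageScan's optimizer): applies three of the
// cleaning rules to an HTML string using regular expressions.
function cleanHtml(html: string): string {
  // Matches one inline handler attribute such as onclick="..." or onload=x.
  const handlerRe = /\s+on[a-z]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi;

  return html
    // 1. Drop <script> elements together with their JavaScript.
    .replace(/<script\b[\s\S]*?<\/script\s*>/gi, "")
    // 2. Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // 3. Strip handler attributes, but only inside opening tags so that
    //    ordinary text content is left untouched.
    .replace(/<[a-z][^>]*>/gi, (tag) => tag.replace(handlerRe, ""));
}

const cleaned = cleanHtml(
  `<div onclick="alert(1)" class="a"><!-- note --><script>var x = 1;</script><button>Buy</button></div>`
);
console.log(cleaned);
```

Regular expressions are fragile against unusual markup (for example, a `>` inside an attribute value), which is one reason a DOM-based approach is preferable in a real tool.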

Design Token Classification

CSS custom properties are automatically categorized:

| Category   | Detected Patterns                                 |
|------------|---------------------------------------------------|
| Colors     | color, background, bg, border-color, fill, stroke |
| Spacing    | margin, padding, spacing, gap, inset              |
| Typography | font, text, line-height, letter-spacing           |
| Other      | All other CSS variables                           |
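The classification above can be sketched as a keyword lookup on the variable name. This is a hedged illustration, not PageScan's code: `classifyToken` is a hypothetical function, and the README does not say how a name that matches several patterns is resolved. The sketch assumes case-insensitive substring matching and lets the longest matching keyword win, so that a name like `--letter-spacing-tight` lands in typography rather than spacing.

```typescript
// Hypothetical sketch (not PageScan's code): classifies a CSS custom
// property name using the keyword table above.
type TokenCategory = "colors" | "spacing" | "typography" | "other";

const PATTERNS: Array<[TokenCategory, string[]]> = [
  ["colors", ["color", "background", "bg", "border-color", "fill", "stroke"]],
  ["spacing", ["margin", "padding", "spacing", "gap", "inset"]],
  ["typography", ["font", "text", "line-height", "letter-spacing"]],
];

function classifyToken(name: string): TokenCategory {
  const lower = name.toLowerCase();
  let best: TokenCategory = "other";
  let bestLength = 0;
  for (const [category, keywords] of PATTERNS) {
    for (const keyword of keywords) {
      // Assumption: the longest matching keyword decides the category;
      // on equal length, the earlier table row wins.
      if (keyword.length > bestLength && lower.indexOf(keyword) !== -1) {
        best = category;
        bestLength = keyword.length;
      }
    }
  }
  return best;
}

console.log(classifyToken("--primary-color")); // colors
console.log(classifyToken("--radius-sm"));     // other
```

Substring matching is deliberately loose; a short keyword such as `bg` or `gap` can match unrelated names, so a stricter version might match whole hyphen-separated segments instead.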

Use Cases

  • Design System Analysis - Extract and analyze design tokens from existing sites
  • Component Libraries - Capture UI components for reference or reimplementation
  • AI-Assisted Design - Feed structured design data to LLMs for analysis or generation
  • Design Audits - Quickly capture and review design patterns across pages
  • Documentation - Generate design documentation from live examples

Development

Prerequisites

  • Node.js 16+
  • pnpm (recommended) or npm

Setup

git clone https://github.com/yourusername/design-extractor.git
cd design-extractor
pnpm install

Available Scripts

pnpm build          # Build the project
pnpm dev <URL>      # Run in development mode
pnpm test           # Run tests
pnpm test:coverage  # Run tests with coverage
pnpm lint           # Lint code
pnpm typecheck      # Type check

Project Structure

src/
├── index.ts           # CLI entry point
├── core/              # Core logic
│   ├── extractor.ts   # Page extraction
│   ├── optimizer.ts   # HTML/CSS optimization
│   └── packer.ts      # XML packaging
├── utils/             # Utilities
│   ├── css.ts         # CSS helpers
│   └── xml.ts         # XML helpers
├── types/             # Type definitions
│   └── index.ts
└── config.ts          # Configuration

tests/
├── css-utils.test.ts  # CSS utilities tests
├── xml-utils.test.ts  # XML utilities tests
├── packer.test.ts     # Packer tests
└── fixtures/          # Test data

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT License - see LICENSE for details

Made with ❤️ for designers and developers