mcp-server-image-extractor

v1.0.8

MCP server for extracting and categorizing images from web pages with intelligent classification

Image Extractor MCP Server

An MCP (Model Context Protocol) server that extracts and categorizes images from web pages using intelligent heuristics.

Features

  • Smart Image Extraction: Extracts images from various sources including:

    • <img> tags
    • CSS background images
    • Meta tags (og:image, twitter:image)
    • Favicons and touch icons
  • Intelligent Classification: Categorizes images into three types:

    • Icons: Logos, favicons, small brand images
    • Products: E-commerce product images
    • Other: Banners, article images, decorative content
  • Dual Extraction Modes:

    • Static Mode: Fast extraction using axios and cheerio
    • JavaScript Mode: Full rendering with Puppeteer for dynamic sites
  • Rich Metadata: Returns comprehensive information for each image:

    • Absolute URL
    • Dimensions (width/height)
    • Alt text and title
    • Position on page (header/main/footer)
    • Surrounding context
    • Classification confidence score
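
The "Absolute URL" field above implies that relative references found in the page are resolved against the page URL. A minimal sketch of that step, assuming the standard `URL` constructor is used; `resolveImageUrls` is a hypothetical helper name, not part of this package's documented API:

```typescript
// Hypothetical helper (not from the package): resolve raw image references
// against the page URL, drop duplicates, and optionally keep data URLs.
function resolveImageUrls(
  pageUrl: string,
  rawSources: string[],
  includeDataUrls = false
): string[] {
  const seen = new Set<string>();
  for (const raw of rawSources) {
    const trimmed = raw.trim();
    if (trimmed === "") continue;
    if (trimmed.startsWith("data:")) {
      // Data URLs are already self-contained; keep them only on request.
      if (includeDataUrls) seen.add(trimmed);
      continue;
    }
    try {
      // new URL(relative, base) resolves paths like "/img/a.png" or "a.png".
      seen.add(new URL(trimmed, pageUrl).href);
    } catch {
      // Skip values that cannot be parsed as a URL.
    }
  }
  return Array.from(seen);
}

const resolved = resolveImageUrls("https://example.com/shop/", [
  "/img/logo.png",
  "hero.jpg",
  "hero.jpg",
  "data:image/png;base64,AAAA",
]);
console.log(resolved);
```

Using a `Set` keeps the first occurrence of each URL, so an image referenced by both an `<img>` tag and a meta tag would be reported once.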

Installation

As an MCP Server

npm install -g mcp-server-image-extractor

For Development

# Download and extract the source code
cd image-extractor
npm install
npm run build

MCP Configuration

Add the server to your MCP settings:

Using npx (recommended)

{
  "mcpServers": {
    "image-extractor": {
      "command": "npx",
      "args": ["-y", "mcp-server-image-extractor"],
      "timeout": 120
    }
  }
}

Note: The first run with npx may take longer because it has to download the package. The 120-second timeout in the configuration above accommodates this.

Using global installation (faster startup)

First install globally:

npm install -g mcp-server-image-extractor

Then configure:

{
  "mcpServers": {
    "image-extractor": {
      "command": "mcp-server-image-extractor"
    }
  }
}

Using local installation

For development or local testing:

{
  "mcpServers": {
    "image-extractor": {
      "command": "node",
      "args": ["C:/path/to/image-extractor/build/index.js"]
    }
  }
}

Alternative: Using npx with cache

To avoid timeout issues, you can pre-cache the package:

npx mcp-server-image-extractor --version

Then use the standard npx configuration.

Usage

Once connected, you can use the extract_images tool:

Tool Parameters

  • url (required): The URL to extract images from
  • useJavaScript (optional): Use Puppeteer for JavaScript-rendered sites (default: false)
  • includeDataUrls (optional): Include base64 data URLs (default: false)
  • minSize (optional): Minimum image size in pixels (default: 0)
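
The parameters and defaults listed above can be written down as a type plus a small normalization step. This is a sketch only: the names `ExtractImagesArgs` and `withDefaults` are hypothetical, and the package's own validation may differ.

```typescript
// Hypothetical type mirroring the documented tool parameters.
interface ExtractImagesArgs {
  url: string;
  useJavaScript?: boolean;
  includeDataUrls?: boolean;
  minSize?: number;
}

// Fill in the documented defaults and reject a missing URL.
function withDefaults(args: ExtractImagesArgs): Required<ExtractImagesArgs> {
  if (typeof args.url !== "string" || args.url.trim() === "") {
    throw new Error("url is required");
  }
  return {
    url: args.url,
    useJavaScript: args.useJavaScript ?? false,
    includeDataUrls: args.includeDataUrls ?? false,
    minSize: args.minSize ?? 0,
  };
}

console.log(withDefaults({ url: "https://example.com", minSize: 100 }));
```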

Example Request

{
  "url": "https://example.com",
  "useJavaScript": false,
  "includeDataUrls": false,
  "minSize": 100
}
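
On the wire, MCP tools are invoked with a JSON-RPC 2.0 `tools/call` request that carries the tool name and its arguments. Most MCP clients build this message for you; the sketch below only illustrates how the example request above would be wrapped. `buildToolCall` is a hypothetical helper name.

```typescript
// Hypothetical helper: wrap tool arguments in an MCP "tools/call" request.
function buildToolCall(id: number, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "extract_images", arguments: args },
  };
}

const toolCall = buildToolCall(1, {
  url: "https://example.com",
  useJavaScript: false,
  includeDataUrls: false,
  minSize: 100,
});
console.log(JSON.stringify(toolCall, null, 2));
```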

Example Response

{
  "url": "https://example.com",
  "timestamp": "2024-01-07T12:00:00Z",
  "images": {
    "icons": [
      {
        "url": "https://example.com/logo.png",
        "alt": "Company Logo",
        "dimensions": { "width": 150, "height": 50 },
        "confidence": 0.95,
        "position": "header",
        "context": "Main navigation area"
      }
    ],
    "products": [
      {
        "url": "https://example.com/product1.jpg",
        "alt": "Product Image",
        "dimensions": { "width": 500, "height": 500 },
        "confidence": 0.88,
        "position": "main",
        "context": "Product gallery, near price $29.99"
      }
    ],
    "other": [
      {
        "url": "https://example.com/banner.jpg",
        "alt": "Hero Banner",
        "dimensions": { "width": 1200, "height": 400 },
        "confidence": 0.75,
        "position": "main",
        "context": "Hero section"
      }
    ]
  },
  "summary": {
    "total": 25,
    "icons": 5,
    "products": 10,
    "other": 10
  }
}
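
The response shape can be described with TypeScript types inferred from the example above. The type names (`ExtractedImage`, `ExtractionResult`) and the `summarize` helper are hypothetical, and which fields are optional is an assumption. Note that the example summary reports 25 images while each array shows one entry, so the arrays there are presumably abbreviated.

```typescript
type ImageCategory = "icons" | "products" | "other";

// Field names taken from the example response; optionality is assumed.
interface ExtractedImage {
  url: string;
  alt?: string;
  dimensions?: { width: number; height: number };
  confidence: number;
  position?: string;
  context?: string;
}

interface ExtractionResult {
  url: string;
  timestamp: string;
  images: Record<ImageCategory, ExtractedImage[]>;
  summary: { total: number } & Record<ImageCategory, number>;
}

// Hypothetical helper: derive the summary counts from the categorized lists.
function summarize(
  images: Record<ImageCategory, ExtractedImage[]>
): ExtractionResult["summary"] {
  const icons = images.icons.length;
  const products = images.products.length;
  const other = images.other.length;
  return { total: icons + products + other, icons, products, other };
}

const sampleImages: Record<ImageCategory, ExtractedImage[]> = {
  icons: [{ url: "https://example.com/logo.png", confidence: 0.95 }],
  products: [{ url: "https://example.com/product1.jpg", confidence: 0.88 }],
  other: [],
};
console.log(summarize(sampleImages));
```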

Classification Heuristics

The server uses multiple factors to classify images:

Icon Detection

  • Small dimensions (< 200x200px)
  • Located in header/navigation
  • Filename contains: logo, icon, favicon, brand
  • Alt text with company/brand names
  • Meta favicon tags

Product Detection

  • Medium to large size (> 300x300px)
  • Square aspect ratio
  • Located near price/cart elements
  • Product-related keywords in alt text
  • E-commerce context patterns
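
The icon and product signals listed above could be combined into simple additive scores. The sketch below is illustrative only: the weights, the 0.5 cutoff, the keyword lists, and all names (`ImageSignals`, `iconScore`, `productScore`, `classify`) are assumptions, not the package's actual classifier.

```typescript
// Hypothetical input: signals already gathered for one image.
interface ImageSignals {
  url: string;
  alt: string;
  width: number;
  height: number;
  position: "header" | "main" | "footer";
  nearPrice: boolean;
}

// Illustrative weights; the real heuristics may weigh signals differently.
function iconScore(img: ImageSignals): number {
  let score = 0;
  if (img.width < 200 && img.height < 200) score += 0.4;
  if (img.position === "header") score += 0.2;
  if (/logo|icon|favicon|brand/i.test(img.url)) score += 0.4;
  return Math.min(score, 1);
}

function productScore(img: ImageSignals): number {
  let score = 0;
  if (img.width > 300 && img.height > 300) score += 0.3;
  const ratio = img.width / img.height;
  if (ratio > 0.9 && ratio < 1.1) score += 0.2; // roughly square
  if (img.nearPrice) score += 0.3;
  if (/product|item|buy/i.test(img.alt)) score += 0.2;
  return Math.min(score, 1);
}

// Pick the stronger category; fall back to "other" when neither is convincing.
function classify(img: ImageSignals): {
  category: "icons" | "products" | "other";
  confidence: number;
} {
  const icon = iconScore(img);
  const product = productScore(img);
  const best = Math.max(icon, product);
  if (best < 0.5) return { category: "other", confidence: 1 - best };
  return icon >= product
    ? { category: "icons", confidence: icon }
    : { category: "products", confidence: product };
}

console.log(
  classify({
    url: "https://example.com/logo.png",
    alt: "Company Logo",
    width: 150,
    height: 50,
    position: "header",
    nearPrice: false,
  })
);
```

Anything that matches neither profile strongly falls into the "other" category, which mirrors the three-way split described under Features.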

Context Analysis

  • Examines surrounding HTML elements
  • Checks for e-commerce patterns
  • Analyzes parent container classes
  • Detects proximity to price elements
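
A sketch of how the context checks above might look once the text and class names around an image have been collected. The regular expressions and the names `ImageContext` and `analyzeContext` are assumptions for illustration; the package may use different patterns.

```typescript
// Hypothetical input: text near the image and class names of its ancestors.
interface ImageContext {
  surroundingText: string;
  ancestorClasses: string[];
}

// Assumed patterns: a currency symbol followed by an amount, and class names
// that commonly appear in e-commerce markup.
const PRICE_PATTERN = /[$€£]\s?\d+(?:[.,]\d{2})?/;
const ECOMMERCE_CLASS_PATTERN = /product|cart|price|shop|sku/i;

function analyzeContext(ctx: ImageContext): {
  nearPrice: boolean;
  ecommerceContainer: boolean;
} {
  return {
    nearPrice: PRICE_PATTERN.test(ctx.surroundingText),
    ecommerceContainer: ctx.ancestorClasses.some((name) =>
      ECOMMERCE_CLASS_PATTERN.test(name)
    ),
  };
}

console.log(
  analyzeContext({
    surroundingText: "Product gallery, near price $29.99",
    ancestorClasses: ["gallery", "product-card"],
  })
);
```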

Development

Project Structure

image-extractor/
├── src/
│   ├── index.ts        # MCP server entry point
│   ├── extractor.ts    # Core extraction logic
│   ├── classifier.ts   # Image classification
│   ├── utils.ts        # Helper functions
│   └── types.ts        # TypeScript types
├── build/              # Compiled JavaScript
├── package.json
└── tsconfig.json

Building

npm run build    # Compile TypeScript
npm run dev      # Watch mode

Testing

npm test         # Run tests (when implemented)

Use Cases

  • E-commerce Analysis: Extract product images from online stores
  • Brand Monitoring: Collect logos and brand images from websites
  • Content Aggregation: Gather images for content curation
  • Web Scraping: Extract visual content for analysis
  • SEO Auditing: Analyze image usage and optimization

Limitations

  • Image dimension detection requires downloading image headers
  • JavaScript mode is slower but more accurate for dynamic sites
  • Classification accuracy depends on page structure and naming conventions
  • Large pages with many images may take longer to process
  • Puppeteer requires additional system dependencies for headless Chrome
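
To illustrate the first limitation: an image's dimensions are stored near the start of the file, so only the header bytes are needed, but they still have to be fetched. The sketch below reads the width and height from a PNG header. `readPngDimensions` is a hypothetical name, and whether the package parses headers this way (or supports other formats the same way) is an assumption.

```typescript
// Hypothetical helper: read width/height from the first 24 bytes of a PNG.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
// then big-endian 32-bit width and height.
function readPngDimensions(
  bytes: Uint8Array
): { width: number; height: number } | null {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (bytes.length < 24) return null;
  for (let i = 0; i < signature.length; i++) {
    if (bytes[i] !== signature[i]) return null;
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // 0x49484452 is "IHDR" in ASCII; DataView reads big-endian by default.
  if (view.getUint32(12) !== 0x49484452) return null;
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// A hand-built header describing a 150x50 image.
const pngHeader = new Uint8Array([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, // signature
  0x00, 0x00, 0x00, 0x0d, // IHDR data length (13)
  0x49, 0x48, 0x44, 0x52, // "IHDR"
  0x00, 0x00, 0x00, 0x96, // width = 150
  0x00, 0x00, 0x00, 0x32, // height = 50
]);
console.log(readPngDimensions(pngHeader));
```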

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT