cleanweb-mcp

v1.0.1

A lightweight MCP server for extracting clean web content with intelligent content filtering and Markdown conversion

🌐 CleanWeb MCP

A lightweight Model Context Protocol (MCP) server

Specializes in extracting core web content, automatically filtering out ads and irrelevant elements, and converting the result to clean Markdown.

✨ Features

| 🌐 Smart Extraction | 🧹 Content Cleaning | 📝 Format Conversion | ⚡ Lightweight Deploy |
|:---:|:---:|:---:|:---:|
| Axios + Cheerio + Readability | Auto-filter ads & distractions | HTML → Markdown | Zero browser dependency |

🎯 Core Advantages

  • 🌐 Smart Content Extraction: Uses Axios, Cheerio, and the Readability algorithm to extract the main content of a web page
  • 🧹 Intelligent Content Cleaning: Automatically removes ads, navigation, sidebars and other distracting elements
  • 📝 Markdown Conversion: Converts HTML content to clean Markdown format
  • 🖼️ Image Link Optimization: Automatically handles overly long image links for better readability
  • ⚡ Lightweight Deployment: No browser dependencies, simple and fast deployment
  • 🔧 Multiple Output Formats: Supports pure Markdown or JSON format with metadata
  • 🚀 MCP Protocol: Fully compatible with Model Context Protocol standard
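
The README does not say how overly long image links are handled, so the following is only a minimal sketch of one possible approach, not the package's actual implementation. The names `shortenImageLinks` and `maxLength` are hypothetical. The sketch strips the query string and fragment from long image URLs and, if the URL is still too long (for example a base64 `data:` URI), replaces the image with a short text placeholder.

```typescript
// Hypothetical helper: not taken from the cleanweb-mcp source.
// Matches Markdown images of the form ![alt](url).
const IMAGE_PATTERN = /!\[([^\]]*)\]\(([^)\s]+)\)/g;

function shortenImageLinks(markdown: string, maxLength = 80): string {
  return markdown.replace(
    IMAGE_PATTERN,
    (match: string, alt: string, url: string): string => {
      if (url.length <= maxLength) {
        return match; // short links are left untouched
      }
      // Query strings and fragments are a common source of bloat; drop them first.
      const base = url.replace(/[?#].*$/, "");
      if (base.length <= maxLength) {
        return `![${alt}](${base})`;
      }
      // Still too long (e.g. an inline data: URI): keep only a text placeholder.
      return alt ? `[image: ${alt}]` : "[image]";
    }
  );
}
```

Replacing an unshortenable link with a placeholder loses the image itself, so a real implementation might prefer to keep the full URL in a metadata field of the JSON output.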

🛠️ Tech Stack

TypeScript, Node.js, Axios, Cheerio

🚀 Quick Start

📦 Installation

# Install from npm
npm install cleanweb-mcp

# Or clone the repository
git clone https://github.com/guangxiangdebizi/cleanweb-mcp.git
cd cleanweb-mcp
npm install

💡 Advantage: CleanWeb MCP uses a lightweight HTTP client, so there is no browser to download and deployment is simpler. The focus is on content cleaning and optimization.

🔧 Build Project

npm run build

🎯 Usage

1. Stdio Mode (Local Development)

npm run mcp:stdio

2. SSE Mode (via Supergateway)

npm run mcp:sse

The server starts at http://localhost:3100/sse

3. WebSocket Mode

npm run mcp:ws

4. Development Mode (Watch file changes)

npm run mcp:dev

🛠️ Claude Configuration

Stdio Mode Configuration

Add to Claude's configuration file:

{
  "mcpServers": {
    "cleanweb-mcp": {
      "command": "node",
      "args": ["path/to/your/project/build/index.js"]
    }
  }
}

SSE Mode Configuration

{
  "mcpServers": {
    "cleanweb-mcp-sse": {
      "type": "sse",
      "url": "http://localhost:3100/sse",
      "timeout": 600
    }
  }
}

🔨 API Reference

extract_web_content

Intelligently extract web content and convert to Markdown format.

Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| url | string | ✅ | - | The web URL to extract content from |
| format | string | ❌ | markdown | Return format: markdown or json |
| timeout | number | ❌ | 30000 | Page loading timeout (milliseconds) |
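
To make the table concrete, here is a minimal sketch of how a caller might validate arguments and apply the documented defaults before invoking the tool. The `ExtractArgs` type and the `normalizeArgs` function are hypothetical names introduced for illustration. The rule that `url` must start with `http://` or `https://` is also an assumption, not something the README states.

```typescript
// Hypothetical types and helper; only the parameter names and defaults
// (format = "markdown", timeout = 30000 ms) come from the table above.
type OutputFormat = "markdown" | "json";

interface ExtractArgs {
  url: string;
  format: OutputFormat;
  timeout: number; // milliseconds
}

function normalizeArgs(input: {
  url?: unknown;
  format?: unknown;
  timeout?: unknown;
}): ExtractArgs {
  // url is the only required parameter.
  if (typeof input.url !== "string" || input.url.length === 0) {
    throw new Error("url is required");
  }
  // Assumption: only http(s) URLs make sense for a web extractor.
  if (!/^https?:\/\//i.test(input.url)) {
    throw new Error(`unsupported url: ${input.url}`);
  }

  let format: OutputFormat;
  if (input.format === undefined || input.format === "markdown") {
    format = "markdown"; // documented default
  } else if (input.format === "json") {
    format = "json";
  } else {
    throw new Error(`format must be "markdown" or "json", got ${String(input.format)}`);
  }

  const timeout = input.timeout === undefined ? 30000 : input.timeout;
  if (typeof timeout !== "number" || !Number.isFinite(timeout) || timeout <= 0) {
    throw new Error("timeout must be a positive number of milliseconds");
  }

  return { url: input.url, format, timeout };
}
```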

Usage Examples

// Basic usage
extract_web_content({
  url: "https://example.com/article"
})

// Advanced usage
extract_web_content({
  url: "https://example.com/article",
  format: "json",
  timeout: 60000
})
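
The examples above use call notation. On the wire, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `extract_web_content`. The `buildToolCall` helper and the `ToolCallRequest` type are hypothetical names used only for illustration; in practice an MCP client library constructs this message for you.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps extract_web_content arguments in a request.
function buildToolCall(
  id: number,
  args: { url: string; format?: "markdown" | "json"; timeout?: number }
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "extract_web_content",
      arguments: { ...args },
    },
  };
}

// Equivalent of the "Advanced usage" example.
const request = buildToolCall(1, {
  url: "https://example.com/article",
  format: "json",
  timeout: 60000,
});
console.log(JSON.stringify(request, null, 2));
```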

📁 Project Structure

cleanweb-mcp/
├── 📄 README.md                 # Project documentation
├── 📦 package.json              # Project configuration
├── ⚙️ tsconfig.json             # TypeScript configuration
├── 🔧 claude-config-example.json # Claude configuration example
├── 📖 example-usage.md          # Usage examples
├── 🏗️ build/                    # Compiled output
│   ├── index.js
│   └── tools/
│       └── web-content-extractor.js
└── 📝 src/                      # Source code
    ├── index.ts                 # MCP server main entry
    └── tools/
        └── web-content-extractor.ts # Web content extraction tool

🔄 Migration from Express Server

The original Express server (server.js) can still run independently:

npm start

The MCP version provides the same core functionality but integrates with AI assistants through the MCP protocol.

🚨 Important Notes

  1. Lightweight Implementation: Uses an HTTP client to fetch static content, with no browser dependencies required
  2. Network Access: Requires network access to the target websites
  3. Static Content: Primarily suitable for static HTML content. Content rendered dynamically by JavaScript may not be accessible
  4. Timeout Settings: For slow-loading websites, increase the timeout parameter
  5. Content Optimization: Automatically optimizes image link display for better readability
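
Because of the static-content limitation noted above, it can help to detect pages that are probably rendered client-side before trusting an empty or very short extraction. The following is a rough, dependency-free heuristic and not part of cleanweb-mcp. The function names and the 200-character threshold are assumptions chosen for illustration. It flags a page when the HTML loads scripts but contains very little visible text.

```typescript
// Hypothetical heuristic: not part of the cleanweb-mcp API.
// Counts the characters of text left after removing scripts, styles, and tags.
function visibleTextLength(html: string): number {
  const text = html
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ") // drop non-visible blocks
    .replace(/<[^>]+>/g, " ") // drop remaining tags
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
  return text.length;
}

// Returns true when the page loads scripts but ships almost no text,
// which often means the content is rendered in the browser.
function looksClientRendered(html: string, minTextLength = 200): boolean {
  const hasScripts = /<script\b/i.test(html);
  return hasScripts && visibleTextLength(html) < minTextLength;
}
```

A regular expression is not a real HTML parser, so this is only a coarse signal. A caller could use it to warn that a page may need a browser-based tool instead.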

🤝 Contributing

Issues and pull requests are welcome! If you have any questions or suggestions, feel free to contact me.

📄 License

MIT License - See LICENSE file for details