npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

markudown-mcp

v1.0.2

Published

MCP server for MarkUDown — connect AI agents to web scraping, search, and extraction

Readme

markudown-mcp

MCP server for MarkUDown / Scrape Technology — gives AI agents (Claude, Cursor, Windsurf, n8n, LangGraph, etc.) access to web scraping, Google search, data extraction, and more.

No infrastructure needed. Just an API key.


How it works

AI Agent (Claude Desktop, Cursor, etc.)
        │  MCP protocol (stdio)
        ▼
  markudown-mcp  (this package)
        │  HTTPS + polling
        ▼
  api.scrapetechnology.com
        │  BullMQ jobs
        ▼
  MarkUDown Engine workers

Quick start

MARKUDOWN_API_KEY=your-key npx markudown-mcp

Get your API key at scrapetechnology.com.


Claude Desktop

Add to claude_desktop_config.json:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "markudown": {
      "command": "npx",
      "args": ["markudown-mcp"],
      "env": {
        "MARKUDOWN_API_KEY": "your-api-key-here"
      }
    }
  }
}

Restart Claude Desktop after saving.


Cursor / Windsurf

Add to your editor's MCP config:

{
  "markudown": {
    "command": "npx",
    "args": ["markudown-mcp"],
    "env": {
      "MARKUDOWN_API_KEY": "your-api-key-here"
    }
  }
}

Environment variables

| Variable | Required | Default | Description | |---|---|---|---| | MARKUDOWN_API_KEY | Yes | — | Your Scrape Technology API key | | MARKUDOWN_API_URL | No | https://api.scrapetechnology.com | Override the API base URL (self-hosted deployments) |


Available tools

scrape

Scrape a single URL and return its content as clean Markdown.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | URL to scrape | | mainContent | boolean | true | Strip navigation, ads, and sidebars | | includeLinks | boolean | false | Include hyperlinks in output | | includeHtml | boolean | false | Also return raw HTML | | excludeTags | string[] | ["header","nav","footer"] | HTML tags to strip | | timeout | number | 60 | Timeout in seconds |

Example prompt: "Scrape https://example.com and summarize the main content."


map

Discover all URLs on a website via sitemap parsing and link crawling.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | Root URL of the site | | maxUrls | number | 1000 | Maximum URLs to return | | allowedWords | string[] | [] | Only include URLs containing these words | | blockedWords | string[] | [] | Exclude URLs containing these words |

Example prompt: "Map all URLs on docs.example.com."


crawl

Recursively crawl a website, following internal links. Returns Markdown for each page.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | Starting URL | | maxDepth | number | 2 | Maximum link depth | | limit | number | 10 | Maximum pages to crawl | | mainContent | boolean | true | Extract only main content | | includeLinks | boolean | false | Include hyperlinks | | includeHtml | boolean | false | Include raw HTML | | excludeTags | string[] | — | HTML tags to strip | | blockedWords | string[] | — | Skip URLs with these words | | includeOnly | string[] | — | Only crawl URLs with these words | | timeout | number | 60 | Per-page timeout in seconds |

Example prompt: "Crawl https://docs.example.com up to depth 3 and find all mentions of 'authentication'."


search

Run a Google search and optionally scrape the full content of each result page.

| Parameter | Type | Default | Description | |---|---|---|---| | query | string | required | Search query | | limit | number | 5 | Number of results | | scrapeResults | boolean | true | Scrape full content for each result | | lang | string | "en" | Language code | | country | string | "us" | Country code | | timeout | number | 60 | Per-page timeout in seconds |

Example prompt: "Search for 'best web scraping libraries 2025' and summarize the top 5 results."


extract

Scrape a URL and extract structured data using a natural language prompt.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | URL to extract from | | prompt | string | required | What to extract, e.g. "All product names and prices" | | timeout | number | 60 | Timeout in seconds |

Example prompt: "Extract all job listings from https://jobs.example.com including title, location, and salary."


batch_scrape

Scrape multiple URLs in parallel. Returns Markdown for each.

| Parameter | Type | Default | Description | |---|---|---|---| | urls | string[] | required | URLs to scrape (at least 1) | | timeout | number | 60 | Per-page timeout in seconds |

Example prompt: "Scrape these 5 competitor pricing pages and compare their plans."


screenshot

Take a full-page screenshot of a URL.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | URL to screenshot | | timeout | number | 60 | Navigation timeout in seconds |

Example prompt: "Take a screenshot of https://example.com."


deep_research

Scrape multiple URLs and synthesize their content into a comprehensive research report using an LLM.

| Parameter | Type | Default | Description | |---|---|---|---| | query | string | required | Research question or topic | | urls | string[] | required | Source URLs to scrape (at least 1) | | maxTokens | number | 4096 | Maximum tokens for the report | | timeout | number | 180 | Timeout in seconds |

Example prompt: "Research the pricing strategies of these 3 SaaS companies and write a comparative analysis."


agent

Launch an AI agent that autonomously navigates a website to answer a question or complete a task.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | Starting URL | | prompt | string | required | Task or question for the agent | | maxSteps | number | 10 | Maximum navigation steps | | maxPages | number | 5 | Maximum pages to visit | | allowNavigation | boolean | true | Allow following links | | includeScreenshots | boolean | false | Screenshot each step | | timeout | number | 120 | Total timeout in seconds |

Example prompt: "Go to https://shop.example.com and find the return policy."


change_detection

Detect whether a URL's content has changed since it was last checked.

| Parameter | Type | Default | Description | |---|---|---|---| | url | string | required | URL to monitor | | mainContent | boolean | true | Compare only main content | | excludeTags | string[] | ["header","nav","footer"] | Tags to ignore when comparing | | timeout | number | 60 | Timeout in seconds |

Example prompt: "Check if the pricing page at https://example.com/pricing has changed."


Self-hosted deployment

If you run your own MarkUDown Engine, set MARKUDOWN_API_URL to point to your instance:

{
  "mcpServers": {
    "markudown": {
      "command": "npx",
      "args": ["markudown-mcp"],
      "env": {
        "MARKUDOWN_API_KEY": "your-api-key",
        "MARKUDOWN_API_URL": "https://your-engine.example.com"
      }
    }
  }
}

For direct BullMQ access (no API key needed), use the self-hosted MCP server in MarkUDown-Engine/services/mcp/.


Support & Community

| Channel | Link | |---------|------| | 💬 Discord | discord.gg/GBSKsC8DvS | | 📧 Email | [email protected] | | 🌐 Docs | scrapetechnology.com/markudown/docs |

License

MIT