mcpp-crawler

v1.0.0

Published

3 months ago

MCP web crawler tool - scrapes web page content and extracts structured data

wenhan123

mcp crawler scraper web

@mcpp/crawler

Web crawler MCP tool - lets AI assistants search for and scrape web content automatically

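Features

| Tool | Description |
|------|------|
| scrape | Scrape the content of a single web page (Markdown/HTML/Text) |
| batch_scrape | Scrape multiple URLs in one batch |
| search | Search the web (uses Baidu) |
| search_and_scrape | Search and scrape the result pages |
| map | Discover all links on a website |
| analyze_github | Analyze GitHub repository information |
| clone_github | Clone a GitHub repository to the server |
| push_to_github | Push a directory to a GitHub repository |
| close | Close the browser |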

Configuration

{
  "mcpServers": {
    "crawler": {
      "command": "npx",
      "args": ["-y", "mcpp-crawler@latest"],
      "env": {
        "MCP_ACCESS_KEY": "your-access-key"
      },
      "autoApprove": ["scrape", "search", "search_and_scrape", "map", "batch_scrape", "analyze_github", "clone_github", "push_to_github", "close"]
    }
  }
}
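
The same configuration can be generated in code, for example when the access key comes from a secret store or when only some tools should be auto-approved. The sketch below is illustrative only: `buildCrawlerConfig` and `TOOL_NAMES` are hypothetical helper names, not part of the package. The command, args, and tool names are copied from this README.

```typescript
// Tool names as listed in the features table of this README.
const TOOL_NAMES = [
  "scrape", "batch_scrape", "search", "search_and_scrape", "map",
  "analyze_github", "clone_github", "push_to_github", "close",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
  autoApprove: ToolName[];
}

// Hypothetical helper: builds the same structure as the JSON configuration.
// By default every tool is auto-approved, matching the README example.
function buildCrawlerConfig(
  accessKey: string,
  autoApprove: ToolName[] = TOOL_NAMES.slice(),
): { mcpServers: { crawler: McpServerEntry } } {
  if (accessKey.length === 0) {
    throw new Error("MCP_ACCESS_KEY is required");
  }
  return {
    mcpServers: {
      crawler: {
        command: "npx",
        args: ["-y", "mcpp-crawler@latest"],
        env: { MCP_ACCESS_KEY: accessKey },
        autoApprove: autoApprove,
      },
    },
  };
}

// Example: auto-approve only the read-only tools.
const readOnlyConfigJson = JSON.stringify(
  buildCrawlerConfig("your-access-key", ["scrape", "search", "map"]),
  null,
  2,
);
```

Leaving `clone_github` and `push_to_github` out of `autoApprove` is one possible choice, so that tools which write to disk or to a remote repository still ask for confirmation.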

Usage examples

Search and scrape

User: Help me find solutions for WeChat Mini Programs
AI: calls search_and_scrape({ query: "WeChat Mini Program development solutions" })
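
On the wire, an MCP client sends a tool call as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below shows roughly what the request for this example might look like. `toolCallRequest` is a hypothetical helper, and the `query` argument name is taken from this README rather than from the tool's published schema.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Simple incrementing request id; real clients manage ids themselves.
let nextRequestId = 1;

// Hypothetical helper that wraps a tool name and its arguments.
function toolCallRequest(
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const searchRequest = toolCallRequest("search_and_scrape", {
  query: "WeChat Mini Program development solutions",
});
```

In practice the AI assistant's MCP client builds and sends this request, so users normally never write it by hand.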

Scrape a single page

User: Scrape the content of this page https://example.com/article
AI: calls scrape({ url: "https://example.com/article" })

Discover website links

User: List all the article links on this website
AI: calls map({ url: "https://blog.example.com", filter: "/article/" })
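
This README does not say how `filter` is matched. Assuming it is a plain substring test against each discovered URL (an assumption, not confirmed behavior), its effect can be sketched locally as follows. `filterLinks` is a hypothetical helper, and the sample links are made up for illustration.

```typescript
// Hypothetical sketch of substring filtering over discovered links.
// Assumption: `filter` is a plain substring, not a regular expression.
function filterLinks(links: string[], filter?: string): string[] {
  // Drop duplicate URLs while preserving first-seen order.
  const unique = links.filter((link, i) => links.indexOf(link) === i);
  if (filter === undefined || filter === "") {
    return unique;
  }
  const needle: string = filter;
  return unique.filter((link) => link.indexOf(needle) !== -1);
}

const discovered = [
  "https://blog.example.com/article/1",
  "https://blog.example.com/about",
  "https://blog.example.com/article/1",
  "https://blog.example.com/article/2",
];
const articleLinks = filterLinks(discovered, "/article/");
```

With this sample input, `articleLinks` holds the two distinct `/article/` URLs, and calling `filterLinks(discovered)` without a filter returns the three distinct URLs.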

Analyze a GitHub repository

User: Analyze this repository https://github.com/vuejs/vue
AI: calls analyze_github({ repo: "vuejs/vue" })
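
In this example the user supplies a full URL while the tool call uses the `owner/repo` shorthand. A minimal sketch of that conversion is shown below. `toRepoSlug` is a hypothetical helper, and whether the tool itself also accepts full URLs is not stated in this README.

```typescript
// Hypothetical helper: converts a GitHub repository URL (or an existing
// "owner/repo" string) into the "owner/repo" shorthand.
// Deep links such as .../issues are rejected rather than truncated.
function toRepoSlug(input: string): string {
  const pattern =
    /^(?:https?:\/\/github\.com\/)?([\w.-]+)\/([\w.-]+?)(?:\.git)?\/?$/;
  const match = pattern.exec(input.trim());
  if (match === null) {
    throw new Error(`Not a GitHub repository reference: ${input}`);
  }
  return `${match[1]}/${match[2]}`;
}

const slug = toRepoSlug("https://github.com/vuejs/vue"); // "vuejs/vue"
```

The resulting string can then be passed as the `repo` argument, as in the `analyze_github` call in this example.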

Clone a repository to the server

User: Clone this repository to the server
AI: calls clone_github({ repo: "owner/repo", targetDir: "/tmp/repo" })

Push to my repository

User: Push this directory to my GitHub repository
AI: calls push_to_github({ sourceDir: "/tmp/repo", targetRepo: "myuser/myrepo", targetPath: "ui-templates/login" })

Environment variables

| Variable | Description | Default |
|------|------|--------|
| MCP_ACCESS_KEY | Access key (required) | - |
| HEADLESS | Headless mode | true |
| TIMEOUT | Timeout (ms) | 30000 |
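
The table can be read as a small parsing routine with defaults. The sketch below shows one way such variables might be interpreted. It is not the package's actual code: `readCrawlerEnv` and `CrawlerEnv` are hypothetical names, and treating only the literal string "false" as disabling headless mode is an assumption.

```typescript
interface CrawlerEnv {
  accessKey: string;
  headless: boolean;
  timeoutMs: number;
}

// Hypothetical helper: applies the defaults from the table
// (HEADLESS = true, TIMEOUT = 30000) and requires MCP_ACCESS_KEY.
function readCrawlerEnv(env: Record<string, string | undefined>): CrawlerEnv {
  const accessKey = env["MCP_ACCESS_KEY"];
  if (accessKey === undefined || accessKey === "") {
    throw new Error("MCP_ACCESS_KEY is required");
  }
  // Assumption: any value other than "false" keeps headless mode on.
  const headless = env["HEADLESS"] !== "false";
  const rawTimeout = env["TIMEOUT"];
  const timeoutMs = rawTimeout === undefined ? 30000 : Number(rawTimeout);
  if (!isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new Error("TIMEOUT must be a positive number of milliseconds");
  }
  return { accessKey: accessKey, headless: headless, timeoutMs: timeoutMs };
}

const defaults = readCrawlerEnv({ MCP_ACCESS_KEY: "your-access-key" });
```

In a Node.js process the helper could be called as `readCrawlerEnv(process.env)`. Failing fast on a missing key or an invalid timeout surfaces configuration mistakes at startup instead of during the first scrape.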

Development

cd mcp-platform/packages/mcpp-crawler
npm install
npm run build
