crawl4ai-mcp-sse-stdio
v1.2.1
MCP (Model Context Protocol) server for Crawl4AI - Universal web crawling and data extraction. Supports STDIO, SSE, and HTTP transports.
🕷️ Crawl4AI MCP Server
MCP (Model Context Protocol) server for Crawl4AI - Universal web crawling and data extraction for AI agents.
Integrate powerful web scraping capabilities into Claude, ChatGPT, and any MCP-compatible AI assistant.
🚀 Quick Start
NPM Installation (Recommended)
# Install globally
npm install -g crawl4ai-mcp-sse-stdio
# Run in different modes
npx crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com
npx crawl4ai-mcp --sse --port 3001 --endpoint https://your-crawl4ai-server.com
npx crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com
# With optional bearer token
npx crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com --bearer-token your-token
With Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "crawl4ai": {
      "command": "npx",
      "args": [
        "crawl4ai-mcp-sse-stdio",
        "--stdio",
        "--endpoint", "https://your-crawl4ai-server.com",
        "--bearer-token", "your-optional-token"
      ]
    }
  }
}
🐳 Docker Usage
Docker Hub
Official Docker image available on Docker Hub:
# Pull and run the official Docker image
docker pull stgmt/crawl4ai-mcp:latest
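# CRAWL4AI_ENDPOINT is required (see Configuration)
docker run -p 3000:3000 -e CRAWL4AI_ENDPOINT=https://your-crawl4ai-server.com stgmt/crawl4ai-mcp:latest
Docker Compose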
Create a docker-compose.yml:
version: '3.8'
services:
  crawl4ai-mcp:
    image: stgmt/crawl4ai-mcp:latest
    ports:
      - "3000:3000"
    environment:
      - CRAWL4AI_ENDPOINT=https://your-crawl4ai-server.com
      - CRAWL4AI_BEARER_TOKEN=your-optional-token
    restart: unless-stopped
Run with:
docker-compose up -d
🛠️ Available Tools
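Each tool in this section is shown as a `name` / `arguments` pair. In MCP, a client sends such a pair as the `params` of a JSON-RPC 2.0 `tools/call` request. The sketch below shows that wrapping. The type and function names (`ToolCall`, `ToolCallRequest`, `buildToolCallRequest`) are hypothetical and not part of this package; in practice an MCP client library usually builds this envelope for you.

```typescript
// One tool invocation, in the shape used throughout this section.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 envelope for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

// Wrap a name/arguments pair in the envelope. The client picks `id`,
// and the server echoes it back in the matching response.
function buildToolCallRequest(id: number, call: ToolCall): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: call };
}

// Example: the `md` tool with the arguments shown in this section.
const request = buildToolCallRequest(1, {
  name: "md",
  arguments: { url: "https://docs.example.com", clean: true },
});

console.log(JSON.stringify(request, null, 2));
```

Any of the `name` / `arguments` objects listed below can be passed to `buildToolCallRequest` in the same way.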
1. crawl - Full Web Crawling
Extract complete content from any webpage.
{
  "name": "crawl",
  "arguments": {
    "url": "https://example.com",
    "wait_for": "css:.content",
    "timeout": 30000
  }
}
2. md - Markdown Extraction
Get clean markdown content from webpages.
{
  "name": "md",
  "arguments": {
    "url": "https://docs.example.com",
    "clean": true
  }
}
3. html - Raw HTML
Retrieve raw HTML content.
{
  "name": "html",
  "arguments": {
    "url": "https://example.com"
  }
}
4. screenshot - Visual Capture
Take screenshots of webpages.
{
  "name": "screenshot",
  "arguments": {
    "url": "https://example.com",
    "full_page": true
  }
}
5. pdf - PDF Generation
Convert webpages to PDF.
{
  "name": "pdf",
  "arguments": {
    "url": "https://example.com",
    "format": "A4"
  }
}
6. execute_js - JavaScript Execution
Execute JavaScript on webpages.
{
  "name": "execute_js",
  "arguments": {
    "url": "https://example.com",
    "script": "document.title"
  }
}
⚙️ Configuration
Environment Variables
# REQUIRED: Crawl4AI endpoint URL
export CRAWL4AI_ENDPOINT="https://your-crawl4ai-server.com"
# OPTIONAL: Bearer authentication token
export CRAWL4AI_BEARER_TOKEN="your-api-token"
Command Line Options
crawl4ai-mcp --help
Options:
  --stdio                 Run in STDIO mode for MCP clients
  --sse                   Run in SSE mode for web interfaces
  --http                  Run in HTTP mode
  --endpoint ENDPOINT     Crawl4AI API endpoint URL (REQUIRED)
  --bearer-token TOKEN    Bearer authentication token (OPTIONAL)
  --port PORT             HTTP server port (default: 3000)
  --sse-port PORT         SSE server port (default: 9001)
  --version, -v           Show version
Basic Commands
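The commands in this section share one pattern: a transport flag, an optional `--port`, the required `--endpoint`, and an optional `--bearer-token`. The sketch below assembles that argument list, for example when spawning the server from a script. `LaunchOptions` and `buildArgs` are hypothetical helpers, not part of this package, and skipping `--port` in STDIO mode is an assumption based on the examples here.

```typescript
type Transport = "stdio" | "sse" | "http";

// Hypothetical options object mirroring the flags used in this section.
interface LaunchOptions {
  transport: Transport;
  endpoint: string;      // required by the CLI
  port?: number;         // assumed to apply only to sse/http
  bearerToken?: string;  // optional
}

// Build the argument list in the same order as the README's commands.
function buildArgs(opts: LaunchOptions): string[] {
  if (opts.endpoint.trim() === "") {
    throw new Error("--endpoint is required");
  }
  const args: string[] = [`--${opts.transport}`];
  if (opts.port !== undefined && opts.transport !== "stdio") {
    args.push("--port", String(opts.port));
  }
  args.push("--endpoint", opts.endpoint);
  if (opts.bearerToken !== undefined) {
    args.push("--bearer-token", opts.bearerToken);
  }
  return args;
}

const httpArgs = buildArgs({
  transport: "http",
  port: 3000,
  endpoint: "https://your-crawl4ai-server.com",
});

console.log(["crawl4ai-mcp", ...httpArgs].join(" "));
// prints: crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com
```

Failing early on an empty endpoint keeps a misconfigured launch from reaching the server process at all.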
# HTTP mode (recommended for testing)
crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com
# SSE mode (Server-Sent Events)
crawl4ai-mcp --sse --port 3001 --endpoint https://your-crawl4ai-server.com
# STDIO mode (for MCP clients)
crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com
# With optional bearer token
crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com --bearer-token your-token
🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
📄 License
MIT License - see LICENSE file for details.
🔗 Links
- NPM Package: https://www.npmjs.com/package/crawl4ai-mcp-sse-stdio
- GitHub Repository: https://github.com/stgmt/crawl4ai-mcp
- Docker Hub: https://hub.docker.com/r/stgmt/crawl4ai-mcp
Made with ❤️ for the AI community
