@voiys/fetcher-mcp
v0.2.0
MCP server for fetching web content via Playwright. Connects to an existing Chrome/Chromium instance via CDP.
Installation
npx @voiys/fetcher-mcp --cdp-url=ws://localhost:9222
MCP Client Configuration
{
  "mcpServers": {
    "fetcher": {
      "command": "npx",
      "args": [
        "@voiys/fetcher-mcp",
        "--cdp-url=ws://localhost:9222"
      ]
    }
  }
}
With Steel:
{
  "mcpServers": {
    "fetcher": {
      "command": "npx",
      "args": [
        "@voiys/fetcher-mcp",
        "--cdp-url=wss://connect.steel.dev?apiKey=XXX&sessionId=YYY"
      ]
    }
  }
}
Tools
fetch_url
Fetch a single URL:
await mcpClient.callTool("fetch_url", {
  url: "https://company.com/careers"
});
fetch_urls
Fetch multiple URLs in parallel:
await mcpClient.callTool("fetch_urls", {
  urls: ["https://a.com", "https://b.com"]
});
evaluate
Execute JavaScript on an existing session (requires keepSession: true on fetch_url):
// First fetch with keepSession
const result = await mcpClient.callTool("fetch_url", {
  url: "https://example.com",
  keepSession: true
});
// Extract sessionId from response
// Then interact with the page
await mcpClient.callTool("evaluate", {
  sessionId: "...",
  script: "document.querySelector('.load-more').click()"
});
// Get updated content
await mcpClient.callTool("evaluate", {
  sessionId: "...",
  script: "null",
  returnContent: true
});
close_session
Close a persistent session (or let it auto-close after 5 min):
await mcpClient.callTool("close_session", { sessionId: "..." });
// Or close all:
await mcpClient.callTool("close_session", {});
Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| url / urls | string / string[] | required | URL(s) to fetch |
| stripTags | string[] | ['script','style','svg','noscript','iframe','link[rel="stylesheet"]','meta'] | CSS selectors to remove before processing |
| returnHtml | boolean | false | Return HTML instead of Markdown |
| structured | boolean | false | Return JSON {title, url, content, metadata} instead of plain text |
| timeout | number | 30000 | Page load timeout in ms |
| waitUntil | string | 'load' | Navigation wait condition: 'load', 'domcontentloaded', 'networkidle' |
| maxLength | number | 0 | Max content length (0 = unlimited) |
| disableMedia | boolean | true | Block images, fonts, stylesheets |
| evaluate | string | undefined | JavaScript to execute on page before extracting content |
| keepSession | boolean | false | Keep session alive, returns sessionId for use with evaluate tool |
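As a rough illustration of how the options above fit together, the sketch below merges caller overrides with the documented defaults and rejects a few obviously invalid values before the arguments are passed to fetch_url. The names DEFAULT_FETCH_OPTIONS and buildFetchArgs are hypothetical and not part of @voiys/fetcher-mcp, and the validation rules are assumptions; the server applies its own defaults, so this is only a client-side convenience.

```javascript
// Defaults copied from the Options table; the server applies its own,
// so this object only mirrors them on the client side.
const DEFAULT_FETCH_OPTIONS = Object.freeze({
  stripTags: ["script", "style", "svg", "noscript", "iframe", 'link[rel="stylesheet"]', "meta"],
  returnHtml: false,
  structured: false,
  timeout: 30000,
  waitUntil: "load",
  maxLength: 0,
  disableMedia: true,
  keepSession: false,
});

// Wait conditions listed in the Options table.
const WAIT_UNTIL_VALUES = ["load", "domcontentloaded", "networkidle"];

// Hypothetical helper (not part of the package): merge overrides with the
// documented defaults and fail early on values the table rules out.
function buildFetchArgs(url, overrides = {}) {
  if (typeof url !== "string" || url.length === 0) {
    throw new TypeError("url is required");
  }
  const args = {
    ...DEFAULT_FETCH_OPTIONS,
    stripTags: [...DEFAULT_FETCH_OPTIONS.stripTags], // copy so callers can mutate safely
    ...overrides,
    url,
  };
  if (!WAIT_UNTIL_VALUES.includes(args.waitUntil)) {
    throw new RangeError(`waitUntil must be one of: ${WAIT_UNTIL_VALUES.join(", ")}`);
  }
  // Assumed rule: timeout and maxLength are non-negative numbers.
  if (!Number.isFinite(args.timeout) || args.timeout < 0) {
    throw new RangeError("timeout must be a non-negative number of milliseconds");
  }
  if (!Number.isInteger(args.maxLength) || args.maxLength < 0) {
    throw new RangeError("maxLength must be a non-negative integer (0 = unlimited)");
  }
  return args;
}

// Usage, assuming an already connected mcpClient:
// await mcpClient.callTool("fetch_url",
//   buildFetchArgs("https://example.com", { structured: true, maxLength: 5000 }));
```

Sending every option explicitly makes each request self-describing in logs, at the cost of drifting from the server if its defaults change in a later version.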
Examples
Custom tag stripping:
await mcpClient.callTool("fetch_url", {
  url: "https://company.com/jobs/123",
  stripTags: ["nav", "footer", "[class*='sidebar']", "script", "style"]
});
Execute JavaScript (click button, scroll, wait for content):
await mcpClient.callTool("fetch_url", {
  url: "https://example.com/infinite-scroll",
  evaluate: `
    // Click "Load More" button
    document.querySelector('.load-more-btn')?.click();
    // Wait for content to load
    await new Promise(r => setTimeout(r, 2000));
    // Scroll to bottom
    window.scrollTo(0, document.body.scrollHeight);
  `
});
Starting Chrome with CDP
chrome --remote-debugging-port=9222
Recommended Stealth Flags
chrome --remote-debugging-port=9222 \
  --disable-blink-features=AutomationControlled \
  --disable-features=IsolateOrigins,site-per-process \
  --no-sandbox \
  --disable-setuid-sandbox \
  --disable-dev-shm-usage \
  --disable-webgl \
  --disable-infobars \
  --disable-extensions
| Flag | Purpose |
|------|---------|
| --disable-blink-features=AutomationControlled | Hide the automation signal (navigator.webdriver) that detection scripts check |
| --disable-features=IsolateOrigins,site-per-process | Disable site isolation |
| --no-sandbox | Disable sandboxing (needed in containers) |
| --disable-setuid-sandbox | Disable setuid sandbox |
| --disable-dev-shm-usage | Avoid /dev/shm issues in Docker |
| --disable-webgl | Reduce fingerprinting surface |
| --disable-infobars | Hide "Chrome is being controlled" bar |
| --disable-extensions | Disable extensions |
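Once Chrome is running with --remote-debugging-port, it can help to confirm that the CDP endpoint is reachable before pointing the MCP server at it. A local Chrome serves a small JSON document at /json/version on the debugging port; the sketch below derives that URL from the --cdp-url value and fetches it. The helpers cdpVersionUrl and probeCdp are hypothetical names, not part of @voiys/fetcher-mcp. The sketch assumes Node 18+ (global fetch) and a local Chrome, and a hosted endpoint such as Steel may not expose this path.

```javascript
// Hypothetical helper: map a ws:// or wss:// CDP URL to Chrome's
// HTTP discovery endpoint on the same host and port.
function cdpVersionUrl(cdpUrl) {
  const parsed = new URL(cdpUrl);
  if (parsed.protocol !== "ws:" && parsed.protocol !== "wss:") {
    throw new TypeError("CDP URL must use ws:// or wss://");
  }
  const scheme = parsed.protocol === "wss:" ? "https:" : "http:";
  return `${scheme}//${parsed.host}/json/version`;
}

// Hypothetical helper: fetch the discovery document and return the fields
// most useful for a sanity check. Rejects if Chrome is unreachable, does not
// answer within timeoutMs, or returns a non-2xx status.
async function probeCdp(cdpUrl, timeoutMs = 3000) {
  const response = await fetch(cdpVersionUrl(cdpUrl), {
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!response.ok) {
    throw new Error(`CDP endpoint responded with HTTP ${response.status}`);
  }
  const info = await response.json();
  return {
    browser: info.Browser,
    webSocketDebuggerUrl: info.webSocketDebuggerUrl,
  };
}

// Usage:
// probeCdp("ws://localhost:9222")
//   .then((info) => console.log("Chrome is reachable:", info.browser))
//   .catch((err) => console.error("CDP probe failed:", err.message));
```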
License
Licensed under the MIT License
