

Hyperbrowser MCP Server with DeepCrawler

Production-ready Model Context Protocol (MCP) server for Hyperbrowser integration with AI-powered API discovery, advanced web scraping, and browser automation.

Features

🔍 API Discovery (DeepCrawler)

  • Discover APIs: Find hidden APIs on any website using AI agents
  • Network Analysis: Analyze network traffic for API endpoints
  • JavaScript Analysis: Extract APIs from JavaScript code
  • CAPTCHA Solving: Detect and solve supported CAPTCHA types (reCAPTCHA v2/v3, hCaptcha, image, audio)
  • OpenAPI Generation: Generate OpenAPI specifications automatically
  • WebSocket Analysis: Analyze WebSocket connections and messages

🌐 Web Automation (Hyperbrowser)

  • Link Extraction: Extract all hyperlinks from webpages with context
  • Web Crawling: Crawl multiple pages with intelligent navigation
  • Data Extraction: Extract structured data from webpages
  • Browser Automation: Full browser control with Playwright
  • Web Search: Search the web using Bing integration

⚙️ Production Features

  • Retry Logic: Automatic exponential backoff retry on rate limits and server errors
  • Credential Sanitization: API keys are masked in all logs (see the sketch after this list)
  • Zero-Config Execution: Works via npx without prior installation
  • Multi-location Config: Supports .env files in multiple locations
  • Full Type Safety: Complete TypeScript support with exported types
  • Comprehensive Testing: 100% test coverage with unit and integration tests
  • AI Assistant Support: Works with 20+ AI coding assistants
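
As a loose illustration of the credential-sanitization idea (not the package's actual logging code), masking a key before it reaches the logs can be as simple as:

// Illustrative only: keep the last four characters of a key, hide the rest.
function maskKey(key: string): string {
  return key.length <= 4 ? "****" : "*".repeat(key.length - 4) + key.slice(-4);
}

console.log(`Using Hyperbrowser key ${maskKey(process.env.HYPERBROWSER_API_KEY ?? "")}`);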

Installation

NPM (Recommended)

npm install -g deepcrawler-mcp

NPX (Zero-Config)

npx deepcrawler-mcp

PyPI

pip install deepcrawler-mcp

Quick Start

1. Get Your API Keys

  • OpenRouter API Key: Get one at https://openrouter.ai (for DeepCrawler AI agents)
  • Hyperbrowser API Key: Get one at https://hyperbrowser.ai (for browser automation)

2. Set Up Environment

Create a .env file in your project root:

OPENROUTER_API_KEY=sk_live_your_key_here
HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
LOG_LEVEL=info

Or set environment variables:

export OPENROUTER_API_KEY=sk_live_your_key_here
export HYPERBROWSER_API_KEY=your_hyperbrowser_key_here

3. Start the Server

# Using npx
npx deepcrawler-mcp

# Using npm
npm run start

# Using Python
python -m jaegis_hyperbrowser_mcp

4. List Available Tools

npx deepcrawler-mcp --list-tools
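
If you want to exercise the server from code rather than through an AI assistant, here is a minimal client sketch using the official MCP TypeScript SDK (@modelcontextprotocol/sdk). The client name and the tool arguments are illustrative:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the server over stdio, exactly as an AI assistant would.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["deepcrawler-mcp"],
  env: {
    OPENROUTER_API_KEY: process.env.OPENROUTER_API_KEY ?? "",
    HYPERBROWSER_API_KEY: process.env.HYPERBROWSER_API_KEY ?? "",
  },
});

const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Enumerate the tools the server exposes.
const { tools } = await client.listTools();
console.log(tools.map((tool) => tool.name));

// Call one of them; see the Tools section below for parameters.
const result = await client.callTool({
  name: "discover_apis",
  arguments: { url: "https://example.com", depth: 2 },
});
console.log(result.content);

await client.close();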

Configuration

Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| HYPERBROWSER_API_KEY | Yes | - | Your Hyperbrowser API key |
| HYPERBROWSER_BASE_URL | No | https://api.hyperbrowser.ai | API base URL |
| HYPERBROWSER_TIMEOUT | No | 30000 | Request timeout in ms |
| HYPERBROWSER_RETRY_ATTEMPTS | No | 3 | Number of retry attempts |
| HYPERBROWSER_RETRY_DELAY | No | 1000 | Initial retry delay in ms |
| LOG_LEVEL | No | info | Log level: debug, info, warn, error |
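
Purely as an illustration of how those defaults behave (the real resolution happens inside deepcrawler-mcp), the table maps onto configuration roughly like this:

// Illustrative only: documented defaults applied when a variable is unset.
const config = {
  apiKey: process.env.HYPERBROWSER_API_KEY, // required
  baseUrl: process.env.HYPERBROWSER_BASE_URL ?? "https://api.hyperbrowser.ai",
  timeoutMs: Number(process.env.HYPERBROWSER_TIMEOUT ?? 30000),
  retryAttempts: Number(process.env.HYPERBROWSER_RETRY_ATTEMPTS ?? 3),
  retryDelayMs: Number(process.env.HYPERBROWSER_RETRY_DELAY ?? 1000),
  logLevel: process.env.LOG_LEVEL ?? "info",
};

if (!config.apiKey) {
  throw new Error("HYPERBROWSER_API_KEY is required");
}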

Configuration File Locations

The server checks for .env files in this order:

  1. ./.env (project root)
  2. ~/.mcp/.env (user home)
  3. ~/.env (user home)
  4. /etc/mcp/.env (system-wide)
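
A rough sketch of that search order, assuming a dotenv-style loader that stops at the first file it finds (the server's own loader may combine files differently):

import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import dotenv from "dotenv";

// Candidate locations in the documented priority order.
const candidates = [
  path.resolve(".env"),
  path.join(os.homedir(), ".mcp", ".env"),
  path.join(os.homedir(), ".env"),
  "/etc/mcp/.env",
];

// Load the first .env file that exists.
const found = candidates.find((candidate) => fs.existsSync(candidate));
if (found) {
  dotenv.config({ path: found });
}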

AI Assistant Configuration

This MCP server works with 20+ AI coding assistants. See SETUP_GUIDE.md for detailed configuration instructions for:

  • Augment Code - config-examples/augment-config.json
  • Claude Desktop - config-examples/claude_desktop_config.json
  • Cursor - config-examples/cursor-config.json
  • Cline - config-examples/cline-config.json
  • GitHub Copilot - config-examples/github-copilot-config.json
  • Tabnine - config-examples/tabnine-config.json
  • Cody - config-examples/cody-config.json
  • And 13+ more...

Quick Configuration Example

For most assistants, add this to your MCP configuration:

{
  "mcpServers": {
    "hyperbrowser": {
      "command": "npx",
      "args": ["deepcrawler-mcp"],
      "env": {
        "OPENROUTER_API_KEY": "sk_live_your_key_here",
        "HYPERBROWSER_API_KEY": "your_hyperbrowser_key_here"
      }
    }
  }
}

Then restart your AI assistant and verify tools are available.

Tools

DeepCrawler Tools (AI-Powered API Discovery)

discover_apis

Discover hidden APIs on a website using AI agents.

Parameters:

  • url (string, required): Target website URL
  • depth (number, optional): Crawl depth (1-5, default: 2)
  • mode (string, optional): 'direct' or 'crew' (default: 'direct')
  • include_websockets (boolean, optional): Include WebSocket analysis (default: true)
  • include_static_analysis (boolean, optional): Include JavaScript analysis (default: true)
  • timeout (number, optional): Timeout in ms (default: 300000)

Example:

{
  "url": "https://example.com",
  "depth": 2,
  "mode": "direct"
}

analyze_network_traffic

Analyze network traffic for API endpoints.

Parameters:

  • url (string, required): Target website URL
  • duration (number, required): Analysis duration in seconds (1-300)
  • filter_by_type (string, optional): Filter by request type
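
Example (values are illustrative):

{
  "url": "https://example.com",
  "duration": 60
}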

analyze_javascript_code

Extract APIs from JavaScript code.

Parameters:

  • url (string, required): Target website URL
  • include_comments (boolean, optional): Include code comments
  • include_strings (boolean, optional): Include string literals
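
Example (values are illustrative):

{
  "url": "https://example.com",
  "include_comments": true,
  "include_strings": true
}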

solve_captcha

Detect and solve CAPTCHAs.

Parameters:

  • url (string, required): Target website URL
  • captcha_type (string, required): Type of CAPTCHA (recaptcha_v2, recaptcha_v3, hcaptcha, image, audio)
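
Example (values are illustrative):

{
  "url": "https://example.com/login",
  "captcha_type": "recaptcha_v2"
}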

generate_openapi_schema

Generate OpenAPI specifications from discovered APIs.

Parameters:

  • endpoints (array, required): List of API endpoints
  • base_url (string, required): API base URL
  • title (string, optional): API title
  • version (string, optional): API version
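
Example (the endpoint object shape here is illustrative; in practice you would pass endpoints produced by discover_apis):

{
  "endpoints": [
    { "method": "GET", "path": "/api/users" }
  ],
  "base_url": "https://api.example.com",
  "title": "Example API",
  "version": "1.0.0"
}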

analyze_websockets

Analyze WebSocket connections.

Parameters:

  • url (string, required): Target website URL
  • duration (number, required): Analysis duration in seconds (1-300)
  • filter_by_type (string, optional): Filter by message type
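
Example (values are illustrative):

{
  "url": "https://example.com",
  "duration": 60
}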

Hyperbrowser Tools (Web Automation)

scrape_links

Extract all hyperlinks from a webpage.

Parameters:

  • url (string, required): Target webpage URL
  • include_markdown (boolean, optional): Also return page content in markdown format
  • include_tags (array, optional): CSS selectors to include
  • exclude_tags (array, optional): CSS selectors to exclude
  • only_main_content (boolean, optional): Extract only main content links

Example:

{
  "url": "https://example.com",
  "include_markdown": true,
  "only_main_content": true
}

Response:

{
  "links": [
    {
      "url": "https://example.com/page1",
      "text": "Page 1",
      "context": "Navigation link"
    }
  ],
  "markdown": "# Example\n\nContent here...",
  "metadata": {
    "total_links": 42,
    "unique_links": 38,
    "scraped_at": "2025-01-15T10:30:00Z"
  }
}

crawl_webpages

Crawl multiple linked pages from a starting URL.

Parameters:

  • url (string, required): Starting webpage URL to crawl
  • followLinks (boolean, optional): Whether to follow links to other pages (default: false)
  • maxPages (number, optional): Maximum number of pages to crawl, 1-100 (default: 10)
  • outputFormat (array, optional): Desired output formats: markdown, html, links, screenshot

Example:

{
  "url": "https://example.com",
  "followLinks": true,
  "maxPages": 5,
  "outputFormat": ["markdown", "links"]
}

Response:

{
  "pages": [
    {
      "url": "https://example.com",
      "title": "Example",
      "content": "Page content...",
      "links": [
        {
          "url": "https://example.com/page1",
          "text": "Page 1"
        }
      ]
    }
  ],
  "metadata": {
    "total_pages": 5,
    "crawled_at": "2025-01-15T10:30:00Z",
    "duration_ms": 2500
  }
}

extract_structured_data

Extract structured data from webpages using JSON schemas.

Parameters:

  • urls (array, required): List of URLs to extract data from
  • schema (object, required): JSON schema defining the structure of data to extract
  • prompt (string, optional): Custom prompt for extraction guidance

Example:

{
  "urls": ["https://example.com/product1", "https://example.com/product2"],
  "schema": {
    "title": { "type": "string" },
    "price": { "type": "string" },
    "description": { "type": "string" }
  },
  "prompt": "Extract product information"
}

Response:

{
  "results": [
    {
      "url": "https://example.com/product1",
      "data": {
        "title": "Product 1",
        "price": "$99.99",
        "description": "Great product"
      },
      "success": true
    }
  ],
  "metadata": {
    "total_urls": 2,
    "successful": 2,
    "failed": 0,
    "extracted_at": "2025-01-15T10:30:00Z"
  }
}

browser_use_agent

Execute advanced browser automation tasks with step-by-step execution.

Parameters:

  • task (string, required): Description of the browser task to execute
  • url (string, optional): Starting URL for the task
  • maxSteps (number, optional): Maximum number of steps to execute, 1-100 (default: 10)
  • returnStepInfo (boolean, optional): Whether to return detailed step information (default: false)

Example:

{
  "task": "Click the submit button and wait for confirmation",
  "url": "https://example.com/form",
  "maxSteps": 5,
  "returnStepInfo": true
}

Response:

{
  "result": "Task completed successfully",
  "steps": [
    {
      "action": "navigate",
      "result": "Navigated to page",
      "timestamp": "2025-01-15T10:30:00Z"
    },
    {
      "action": "click",
      "result": "Clicked submit button",
      "timestamp": "2025-01-15T10:30:01Z"
    }
  ],
  "metadata": {
    "total_steps": 2,
    "completed_at": "2025-01-15T10:30:02Z",
    "success": true
  }
}

search_with_bing

Perform web searches using Bing search engine.

Parameters:

  • query (string, required): Search query string
  • numResults (number, optional): Number of results to return, 1-50 (default: 10)

Example:

{
  "query": "TypeScript best practices",
  "numResults": 5
}

Response:

{
  "results": [
    {
      "title": "TypeScript Best Practices",
      "url": "https://example.com/typescript-best-practices",
      "snippet": "Learn the best practices for writing TypeScript code..."
    }
  ],
  "metadata": {
    "query": "TypeScript best practices",
    "total_results": 5,
    "searched_at": "2025-01-15T10:30:00Z"
  }
}

AI Assistant Configuration

Augment Code

{
  "mcpServers": {
    "hyperbrowser": {
      "command": "npx",
      "args": ["deepcrawler-mcp"],
      "env": {
        "HYPERBROWSER_API_KEY": "${HYPERBROWSER_API_KEY}"
      }
    }
  }
}

Claude Desktop

{
  "mcpServers": {
    "hyperbrowser": {
      "command": "npx",
      "args": ["deepcrawler-mcp"],
      "env": {
        "HYPERBROWSER_API_KEY": "${HYPERBROWSER_API_KEY}"
      }
    }
  }
}

Cursor

{
  "mcpServers": {
    "hyperbrowser": {
      "command": "npx",
      "args": ["deepcrawler-mcp"],
      "env": {
        "HYPERBROWSER_API_KEY": "${HYPERBROWSER_API_KEY}"
      }
    }
  }
}

Development

Build

npm run build

Test

npm test
npm run test:coverage

Lint

npm run lint
npm run format

Troubleshooting

"Invalid API Key" Error

Ensure HYPERBROWSER_API_KEY is set correctly:

echo $HYPERBROWSER_API_KEY

Rate Limiting (429 Errors)

The server automatically retries with exponential backoff. Adjust retry settings:

HYPERBROWSER_RETRY_ATTEMPTS=5
HYPERBROWSER_RETRY_DELAY=2000
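
With those settings, an exponential backoff schedule typically doubles the delay on each attempt. A sketch of the arithmetic (illustrative, not the server's exact implementation):

// Illustrative: the base delay doubles per attempt (2000 ms, 4000 ms, 8000 ms, ...).
const attempts = Number(process.env.HYPERBROWSER_RETRY_ATTEMPTS ?? 3);
const baseDelayMs = Number(process.env.HYPERBROWSER_RETRY_DELAY ?? 1000);

for (let attempt = 0; attempt < attempts; attempt++) {
  const delayMs = baseDelayMs * 2 ** attempt;
  console.log(`retry ${attempt + 1} after ${delayMs} ms`);
}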

Connection Timeouts

Increase the timeout value:

HYPERBROWSER_TIMEOUT=60000

Debug Logging

Enable debug logging:

LOG_LEVEL=debug

API Reference

See Hyperbrowser API Documentation

License

MIT - See LICENSE file for details

Support

  • NPM Package: https://www.npmjs.com/package/deepcrawler-mcp
  • GitHub: TBD - To be configured
  • Issues: TBD - To be configured
  • Hyperbrowser Docs: https://docs.hyperbrowser.ai