@octivas/mcp

v0.1.4

Published

a month ago

MCP server for Octivas web scraping, crawling, and search API

0High
0Medium
0Low

soeffing

octivas mcp model-context-protocol scraping web-scraping crawling search ai

@octivas/mcp

MCP (Model Context Protocol) server for web scraping, search, and crawling powered by Octivas.

Quick Start

With Cursor

Add to your .cursor/mcp.json:

{
  "mcpServers": {
    "octivas": {
      "command": "npx",
      "args": ["-y", "@octivas/mcp@latest"],
      "env": {
        "OCTIVAS_API_KEY": "your-api-key"
      }
    }
  }
}

With Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "octivas": {
      "command": "npx",
      "args": ["-y", "@octivas/mcp@latest"],
      "env": {
        "OCTIVAS_API_KEY": "your-api-key"
      }
    }
  }
}

Get your API key at octivas.com/app/api-keys.

Tools

`scrape`

Extract content from a single web page. Returns markdown by default, with optional structured data extraction via JSON Schema or natural language prompts.

Parameters:

| Name | Type | Required | Description | | --- | --- | --- | --- | | url | string | Yes | URL of the page to scrape | | formats | string[] | No | Output formats: markdown, html, rawHtml, screenshot, links, json, images, summary (default: ["markdown"]) | | only_main_content | boolean | No | Strip navigation, headers, footers, and sidebars | | schema | object | No | JSON Schema for structured data extraction | | prompt | string | No | Natural language extraction instructions (e.g., "Extract the product name and price") |

`search`

Search the web and extract full page content from the results.

Parameters:

| Name | Type | Required | Description | | --- | --- | --- | --- | | query | string | Yes | Search query (max 500 chars) | | limit | number | No | Number of results (default: 5, max: 20) | | formats | string[] | No | Content formats for each result (default: ["markdown"]) | | country | string | No | ISO 3166-1 alpha-2 country code for geo-targeted results (e.g., US, GB) | | tbs | string | No | Time filter: qdr:h (hour), qdr:d (day), qdr:w (week), qdr:m (month), qdr:y (year) | | only_main_content | boolean | No | Strip boilerplate from extracted content |

`crawl`

Crawl a website starting from a URL and extract content from multiple pages. Useful for indexing documentation sites or gathering content across a domain.

Parameters:

| Name | Type | Required | Description | | --- | --- | --- | --- | | url | string | Yes | Starting URL to crawl from | | limit | number | No | Max pages to crawl (default: 10, max: 100) | | formats | string[] | No | Content formats for each page (default: ["markdown"]) | | max_depth | number | No | Max link depth from the starting URL (0 = start page only) | | include_paths | string[] | No | Regex patterns for paths to include (e.g., ["/docs/.*"]) | | exclude_paths | string[] | No | Regex patterns for paths to exclude (e.g., ["/admin/.*"]) | | allow_subdomains | boolean | No | Follow links to subdomains of the starting URL | | only_main_content | boolean | No | Strip boilerplate from extracted content |

Environment Variables

| Variable | Required | Description | | --- | --- | --- | | OCTIVAS_API_KEY | Yes | Your Octivas API key (get one at octivas.com/app/settings) | | OCTIVAS_BASE_URL | No | Override the API base URL |

Development

npm install
npm run build
npm run dev    # rebuild on changes

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@octivas/mcp

Quick Start

With Cursor

With Claude Desktop

Tools

scrape

search

crawl

Environment Variables

Development

License

`scrape`

`search`

`crawl`