
@supadata/mcp

v1.0.1

MCP server for Supadata video & web scraping integration. Features include video transcription for YouTube, TikTok, Instagram, Twitter, and file URLs, plus web scraping, batch processing, and structured data extraction.


Supadata MCP Server

A Model Context Protocol (MCP) server implementation that integrates with Supadata for video & web scraping capabilities.

Features

  • Video transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs
  • Web scraping, crawling, and discovery
  • Automatic retries and rate limiting

Play around with our MCP Server on Smithery or on MCP.so's playground.

Installation

Running with npx

env SUPADATA_API_KEY=your-api-key npx -y @supadata/mcp

Manual Installation

npm install -g @supadata/mcp

Running on Cursor

Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide.

To configure Supadata MCP in Cursor v0.48.6:

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add new global MCP server"
  4. Enter the following code:
    {
      "mcpServers": {
        "@supadata/mcp": {
          "command": "npx",
          "args": ["-y", "@supadata/mcp"],
          "env": {
            "SUPADATA_API_KEY": "YOUR-API-KEY"
          }
        }
      }
    }

To configure Supadata MCP in Cursor v0.45.6:

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add New MCP Server"
  4. Enter the following:
    • Name: "@supadata/mcp" (or your preferred name)
    • Type: "command"
    • Command: env SUPADATA_API_KEY=your-api-key npx -y @supadata/mcp

If you are using Windows and are running into issues, try cmd /c "set SUPADATA_API_KEY=your-api-key && npx -y @supadata/mcp"

Replace your-api-key with your Supadata API key. If you don't have one yet, create an account and get one at https://www.supadata.dev/app/api-keys

After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Supadata MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.

Running on Windsurf

Add this to your ./codeium/windsurf/model_config.json:

{
  "mcpServers": {
    "@supadata/mcp": {
      "command": "npx",
      "args": ["-y", "@supadata/mcp"],
      "env": {
        "SUPADATA_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}

Installing via Smithery

To install Supadata for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @supadata-ai/mcp --client claude

Running on VS Code


For manual installation, add the following JSON block to your User Settings (JSON) file in VS Code. You can do this by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).

{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "apiKey",
        "description": "Supadata API Key",
        "password": true
      }
    ],
    "servers": {
      "supadata": {
        "command": "npx",
        "args": ["-y", "@supadata/mcp"],
        "env": {
          "SUPADATA_API_KEY": "${input:apiKey}"
        }
      }
    }
  }
}

Optionally, you can add it to a file called .vscode/mcp.json in your workspace. This will allow you to share the configuration with others:

{
  "inputs": [
    {
      "type": "promptString",
      "id": "apiKey",
      "description": "Supadata API Key",
      "password": true
    }
  ],
  "servers": {
    "supadata": {
      "command": "npx",
      "args": ["-y", "@supadata/mcp"],
      "env": {
        "SUPADATA_API_KEY": "${input:apiKey}"
      }
    }
  }
}

Configuration

Environment Variables

  • SUPADATA_API_KEY: Your Supadata API key

Usage with Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "@supadata/mcp": {
      "command": "npx",
      "args": ["-y", "@supadata/mcp"],
      "env": {
        "SUPADATA_API_KEY": "YOUR_API_KEY_HERE"
      }
    }
  }
}

System Configuration

The server includes several configurable parameters that can be set via environment variables. Here are the default values if not configured:

const CONFIG = {
  retry: {
    maxAttempts: 3, // Number of retry attempts for rate-limited requests
    initialDelay: 1000, // Initial delay before first retry (in milliseconds)
    maxDelay: 10000, // Maximum delay between retries (in milliseconds)
    backoffFactor: 2, // Multiplier for exponential backoff
  },
};
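Under these defaults, the retry delay grows geometrically with each attempt and is capped at maxDelay. A minimal sketch of the resulting schedule, mirroring the values above:

```typescript
// Default retry configuration, copied from the CONFIG block above.
const retryConfig = {
  maxAttempts: 3,
  initialDelay: 1000, // ms
  maxDelay: 10000, // ms
  backoffFactor: 2,
};

// Compute the backoff delay (in ms) for a 1-based retry attempt:
// initialDelay * backoffFactor^(attempt - 1), capped at maxDelay.
function backoffDelay(attempt: number): number {
  const delay =
    retryConfig.initialDelay * retryConfig.backoffFactor ** (attempt - 1);
  return Math.min(delay, retryConfig.maxDelay);
}

// Delays for the three default attempts: 1000 ms, 2000 ms, 4000 ms.
const delays = Array.from(
  { length: retryConfig.maxAttempts },
  (_, i) => backoffDelay(i + 1),
);
```

With these defaults a rate-limited request waits at most 1 + 2 + 4 = 7 seconds in total before giving up; the 10-second cap only kicks in if maxAttempts is raised.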

Rate Limiting and Batch Processing

The server utilizes Supadata's built-in rate limiting and batch processing capabilities:

  • Automatic rate limit handling with exponential backoff
  • Efficient parallel processing for batch operations
  • Smart request queuing and throttling
  • Automatic retries for transient errors

How to Choose a Tool

Use this guide to select the right tool for your task:

  • If you need transcripts from video content: use transcript
  • If you know the exact URL(s) you want: use scrape
  • If you need to discover URLs on a site: use map
  • If you want to analyze a whole site or section: use crawl (with limits!)
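As an illustration only (this helper is not part of the package), the decision guide above can be written as a small chooser:

```typescript
type SupadataTool = "transcript" | "scrape" | "map" | "crawl";

// Hypothetical helper mirroring the tool-selection guide above;
// not part of @supadata/mcp.
function chooseTool(task: {
  videoContent?: boolean; // need a transcript from video content?
  knownUrls?: boolean; // already know the exact URL(s)?
  discoverUrls?: boolean; // need to find URLs on a site?
}): SupadataTool {
  if (task.videoContent) return "transcript";
  if (task.knownUrls) return "scrape";
  if (task.discoverUrls) return "map";
  return "crawl"; // whole-site analysis; remember to set limits
}
```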

Quick Reference Table

| Tool       | Best for                            | Returns         |
| ---------- | ----------------------------------- | --------------- |
| transcript | Video transcript extraction         | text/markdown   |
| scrape     | Single page content                 | markdown/html   |
| map        | Discovering URLs on a site          | URL[]           |
| crawl      | Multi-page extraction (with limits) | markdown/html[] |

Available Tools

1. Transcript Tool (supadata_transcript)

Extract transcripts from supported video platforms and file URLs.

Best for:

  • Video content analysis and transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs.

Not recommended for:

  • Non-video content (use scrape for web pages)

Common mistakes:

  • Using transcript for regular web pages (use scrape instead).

Prompt Example:

"Get the transcript from this YouTube video: https://youtube.com/watch?v=example"

Usage Example:

{
  "name": "supadata_transcript",
  "arguments": {
    "url": "https://youtube.com/watch?v=example",
    "lang": "en",
    "text": false,
    "mode": "auto"
  }
}

Returns:

  • Transcript content in text or formatted output
  • For async processing: Job ID for status checking

2. Check Transcript Status (supadata_check_transcript_status)

Check the status of a transcript job.

{
  "name": "supadata_check_transcript_status",
  "arguments": {
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Returns:

  • Response includes the status of the transcript job with completion progress and results.
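For async jobs, a client typically polls this tool until the job finishes. A generic sketch of such a loop (the callTool function below is a stand-in for however your MCP client invokes tools, and the status strings are assumptions for illustration):

```typescript
// Generic polling loop for an async transcript job. `callTool` is a
// placeholder for your MCP client's tool-invocation function; the
// "completed"/"failed" status values are assumed for this sketch.
type StatusResult = {
  status: "queued" | "active" | "completed" | "failed";
  content?: string;
};
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<StatusResult>;

async function waitForTranscript(
  callTool: CallTool,
  jobId: string,
  pollMs = 2000,
  maxPolls = 30,
): Promise<StatusResult> {
  for (let i = 0; i < maxPolls; i++) {
    const result = await callTool("supadata_check_transcript_status", {
      id: jobId,
    });
    if (result.status === "completed" || result.status === "failed") {
      return result;
    }
    // Wait before polling again.
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error(`Transcript job ${jobId} did not finish after ${maxPolls} polls`);
}
```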

3. Scrape Tool (supadata_scrape)

Scrape content from a single URL with advanced options.

Best for:

  • Single page content extraction, when you know exactly which page contains the information.

Not recommended for:

  • Extracting content from multiple pages (use crawl for comprehensive multi-page extraction)

Common mistakes:

  • Using scrape for a list of URLs (use crawl instead for multiple pages).

Prompt Example:

"Get the content of the page at https://example.com."

Usage Example:

{
  "name": "supadata_scrape",
  "arguments": {
    "url": "https://example.com",
    "noLinks": false,
    "lang": "en"
  }
}

Returns:

  • URL of the scraped page
  • Extracted content in Markdown format
  • Page name and description
  • Character count
  • List of URLs found on the page

4. Map Tool (supadata_map)

Map a website to discover all indexed URLs on the site.

Best for:

  • Discovering URLs on a website before deciding what to scrape
  • Finding specific sections of a website

Not recommended for:

  • When you already know which specific URL you need (use scrape)
  • When you need the content of the pages (use scrape after mapping)

Common mistakes:

  • Using crawl to discover URLs instead of map

Prompt Example:

"List all URLs on example.com."

Usage Example:

{
  "name": "supadata_map",
  "arguments": {
    "url": "https://example.com"
  }
}

Returns:

  • Array of URLs found on the site

5. Crawl Tool (supadata_crawl)

Start an asynchronous crawl job on a website and extract content from all pages.

Best for:

  • Extracting content from multiple related pages, when you need comprehensive coverage.

Not recommended for:

  • Extracting content from a single page (use scrape)
  • When token limits are a concern (use map first to discover URLs, then scrape individual pages)
  • When you need fast results (crawling can be slow)

Warning: Crawl responses can be very large and may exceed token limits. Limit the number of pages to crawl for better control.

Common mistakes:

  • Setting limit too high (causes token overflow)
  • Using crawl for a single page (use scrape instead)

Prompt Example:

"Get all pages from example.com/blog."

Usage Example:

{
  "name": "supadata_crawl",
  "arguments": {
    "url": "https://example.com/blog",
    "limit": 100
  }
}

Returns:

  • Response includes operation ID for status checking:
{
  "content": [
    {
      "type": "text",
      "text": "Started crawl for: https://example.com/* with job ID: 550e8400-e29b-41d4-a716-446655440000. Use supadata_check_crawl_status to check progress."
    }
  ],
  "isError": false
}

6. Check Crawl Status (supadata_check_crawl_status)

Check the status of a crawl job.

{
  "name": "supadata_check_crawl_status",
  "arguments": {
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}

Returns:

  • Response includes the status of the crawl job with details on completion progress and results.

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Run tests: npm test
  4. Submit a pull request

License

MIT License - see LICENSE file for details