
mcp-research-powerpack

v3.9.5

The ultimate research MCP toolkit: Reddit mining, web search with CTR aggregation, AI-powered deep research, and intelligent web scraping - all in one modular package


An MCP server that gives Claude, Cursor, Windsurf, and any MCP-compatible AI assistant a complete research toolkit. Google search, Reddit deep-dives, web scraping with AI extraction, and multi-model deep research — all as tools that chain into each other.

Zero config to start. Each API key you add unlocks more capabilities.

Tools

| Tool | What it does | Requires |
|:-----|:-------------|:---------|
| web_search | Parallel Google search across 3–100 keywords with CTR-weighted ranking and consensus detection | SERPER_API_KEY |
| search_reddit | Same search engine filtered to reddit.com — 10–50 queries in parallel | SERPER_API_KEY |
| get_reddit_post | Fetch 2–50 Reddit posts with full comment trees, smart comment budget allocation | REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET |
| scrape_links | Scrape 1–50 URLs with JS rendering fallback, HTML→Markdown, optional AI extraction | SCRAPEDO_API_KEY |
| deep_research | Send questions to research-capable models (Grok, Gemini) with web search, file attachments | OPENROUTER_API_KEY |

Tools are designed to chain: web_search → scrape_links → search_reddit → get_reddit_post → deep_research for synthesis. Each tool suggests the next logical step in its output.

Quick Start

Claude Desktop / Claude Code

Add to your MCP config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "research-powerpack": {
      "command": "npx",
      "args": ["-y", "mcp-research-powerpack"],
      "env": {
        "SERPER_API_KEY": "your-key-here",
        "OPENROUTER_API_KEY": "your-key-here"
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project:

{
  "mcpServers": {
    "research-powerpack": {
      "command": "npx",
      "args": ["-y", "mcp-research-powerpack"],
      "env": {
        "SERPER_API_KEY": "your-key-here"
      }
    }
  }
}

From Source

git clone https://github.com/yigitkonur/mcp-research-powerpack.git
cd mcp-research-powerpack
pnpm install && pnpm build
pnpm start

HTTP Transport

MCP_TRANSPORT=http MCP_PORT=3000 npx mcp-research-powerpack

Exposes /mcp endpoint (POST/GET/DELETE with session headers) and /health.

API Keys

Each key unlocks a capability. Missing keys silently disable their tools — the server never crashes.

| Variable | Enables | Free Tier |
|:---------|:--------|:----------|
| SERPER_API_KEY | web_search, search_reddit | 2,500 searches/mo — serper.dev |
| REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET | get_reddit_post | Unlimited — reddit.com/prefs/apps (script type) |
| SCRAPEDO_API_KEY | scrape_links | 1,000 credits/mo — scrape.do |
| OPENROUTER_API_KEY | deep_research, LLM extraction | Pay-per-token — openrouter.ai |
| CEREBRAS_API_KEY | Cerebras LLM extraction | — |
| USE_CEREBRAS | Enable Cerebras for extraction (set true) | false |
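The key-to-capability gating above can be sketched as a simple presence check over environment variables. This is an illustrative sketch, not the package's actual code; the function name and return shape are assumptions.

```typescript
// Sketch of capability detection: each tool is enabled only when its
// required env vars are present; missing keys disable, never crash.
function detectCapabilities(env: Record<string, string | undefined>) {
  return {
    webSearch: Boolean(env.SERPER_API_KEY),                             // web_search, search_reddit
    reddit: Boolean(env.REDDIT_CLIENT_ID && env.REDDIT_CLIENT_SECRET), // get_reddit_post
    scrape: Boolean(env.SCRAPEDO_API_KEY),                              // scrape_links
    research: Boolean(env.OPENROUTER_API_KEY),                          // deep_research
  };
}
```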

Configuration

Optional tuning via environment variables:

| Variable | Default | Description |
|:---------|:--------|:------------|
| RESEARCH_MODEL | x-ai/grok-4-fast | Primary deep research model |
| RESEARCH_FALLBACK_MODEL | google/gemini-2.5-flash | Fallback when primary fails |
| LLM_EXTRACTION_MODEL | openai/gpt-oss-120b:nitro | Model for scrape/reddit AI extraction |
| DEFAULT_REASONING_EFFORT | high | Research depth: low, medium, high |
| DEFAULT_MAX_URLS | 100 | Max search results per research question (10–200) |
| API_TIMEOUT_MS | 1800000 | Request timeout in ms (default: 30 min) |
| MCP_TRANSPORT | stdio | Transport mode: stdio or http |
| MCP_PORT | 3000 | Port for HTTP mode |
| USE_CEREBRAS | false | Set to true to use Cerebras for extraction instead of OpenRouter |
| CEREBRAS_API_KEY | — | API key for Cerebras cloud — cloud.cerebras.ai |

Cerebras Support

When USE_CEREBRAS=true is set and CEREBRAS_API_KEY is provided, the scrape_links tool uses Cerebras (Z.ai GLM 4.7) for AI content extraction instead of OpenRouter. This provides:

  • Ultra-fast extraction — Cerebras inference is optimized for speed
  • Independent from OpenRouter — extraction works even without OPENROUTER_API_KEY
  • Automatic fallback — if Cerebras is not configured, falls back to OpenRouter

# Enable Cerebras for extraction
USE_CEREBRAS=true CEREBRAS_API_KEY=your-key npx mcp-research-powerpack

Network Resilience

All LLM API calls include built-in stability protections:

  • Request deadlines — hard timeout prevents calls from hanging indefinitely
  • Stall detection — if no response arrives within a threshold, the request is aborted and retried
  • Exponential backoff — transient failures (429, 5xx) retry with jitter to avoid thundering herd
  • Connection loss recovery — network errors (ECONNRESET, ECONNREFUSED) trigger automatic retry
  • Graceful degradation — all tools return structured errors instead of crashing
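The backoff-with-jitter bullet above can be sketched as a delay function. The base and cap values here are illustrative placeholders, not the package's actual constants.

```typescript
// Sketch of exponential backoff with full jitter: delay doubles per
// attempt up to a cap, then a random fraction is taken so that many
// clients retrying at once don't stampede the API together.
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt); // 500, 1000, 2000, ...
  return Math.random() * exp; // "full jitter": uniform in [0, exp)
}
```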

How It Works

Search Ranking

Results from multiple queries are deduplicated by normalized URL and scored using CTR-weighted position values (position 1 = 100.0, position 10 = 12.56). URLs appearing across multiple queries get a consensus marker. Frequency threshold starts at ≥3, falls back to ≥2, then ≥1 to ensure results.
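The scoring and threshold-relaxation logic can be sketched as follows. Only the position-1 weight (100.0) and position-10 weight (12.56) come from the description above; the intermediate CTR weights, the URL normalization, and the function shape are illustrative assumptions.

```typescript
// Sketch of CTR-weighted ranking with consensus detection.
// Weights for positions 2-9 are placeholders between the two documented endpoints.
const CTR_WEIGHTS = [100.0, 61.0, 42.0, 30.0, 24.0, 20.0, 17.0, 15.0, 13.5, 12.56];

interface Hit { url: string; position: number } // 1-based SERP position

function rankResults(queries: Hit[][], minFreq = 3) {
  const scores = new Map<string, { score: number; freq: number }>();
  for (const hits of queries) {
    for (const { url, position } of hits) {
      const norm = url.replace(/\/$/, "").toLowerCase(); // dedupe by normalized URL
      const w = CTR_WEIGHTS[Math.min(position, 10) - 1] ?? 0;
      const cur = scores.get(norm) ?? { score: 0, freq: 0 };
      scores.set(norm, { score: cur.score + w, freq: cur.freq + 1 });
    }
  }
  // Frequency threshold relaxes >=3 -> >=2 -> >=1 until something passes.
  for (const threshold of [minFreq, 2, 1]) {
    const kept = [...scores.entries()].filter(([, v]) => v.freq >= threshold);
    if (kept.length > 0) {
      return kept
        .sort((a, b) => b[1].score - a[1].score)
        .map(([url, v]) => ({ url, score: v.score, consensus: v.freq > 1 }));
    }
  }
  return [];
}
```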

Reddit Comment Budget

Global budget of 1,000 comments, max 200 per post. After the first pass, surplus from posts with fewer comments is redistributed to truncated posts in a second fetch pass.
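A minimal sketch of that two-pass allocation, assuming a hypothetical per-post `available` count (how many comments each post actually has); the function name and iteration order are illustrative.

```typescript
// Two-pass comment budget: global cap 1,000 comments, 200 per post.
// Pass 1 gives each post an even share; pass 2 hands leftover budget
// from small posts to posts that were truncated.
const GLOBAL_BUDGET = 1_000;
const PER_POST_CAP = 200;

function allocateComments(available: number[]): number[] {
  const fairShare = Math.min(PER_POST_CAP, Math.floor(GLOBAL_BUDGET / available.length));
  const alloc = available.map((a) => Math.min(a, fairShare));
  let surplus = GLOBAL_BUDGET - alloc.reduce((s, x) => s + x, 0);
  for (let i = 0; i < alloc.length && surplus > 0; i++) {
    const want = Math.min(available[i], PER_POST_CAP) - alloc[i]; // truncated amount
    const extra = Math.min(want, surplus);
    alloc[i] += extra;
    surplus -= extra;
  }
  return alloc;
}
```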

Scraping Pipeline

Three-mode fallback per URL: basic → JS rendering → JS + US geo-targeting. Results go through HTML→Markdown conversion (Turndown), then optional AI extraction with a 100K char input cap and 8,000 token output per URL.
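The per-URL fallback ladder can be sketched like this; `fetchWithMode` is a hypothetical stand-in for the Scrape.do client call, and the mode names are illustrative.

```typescript
// Sketch of the three-mode fallback: try the cheapest mode first,
// escalate to JS rendering, then JS + US geo-targeting on failure.
type ScrapeMode = "basic" | "js" | "js+geo-us";

async function scrapeWithFallback(
  url: string,
  fetchWithMode: (url: string, mode: ScrapeMode) => Promise<string | null>,
): Promise<{ html: string; mode: ScrapeMode } | null> {
  for (const mode of ["basic", "js", "js+geo-us"] as ScrapeMode[]) {
    try {
      const html = await fetchWithMode(url, mode);
      if (html && html.length > 0) return { html, mode };
    } catch {
      // fall through to the next, more expensive mode
    }
  }
  return null; // all three modes failed
}
```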

Deep Research

32,000 token budget divided across questions (1 question = 32K, 10 questions = 3.2K each). Gemini models get google_search tool access. Grok/Perplexity get search_parameters with citations. Primary model fails → automatic fallback to secondary model.
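The even split of the 32K budget is a one-liner; this sketch just makes the arithmetic explicit (the guard against zero questions is an added assumption).

```typescript
// One shared 32,000-token output budget, divided evenly across the
// questions in a single deep_research call.
const TOTAL_TOKEN_BUDGET = 32_000;

function tokensPerQuestion(questionCount: number): number {
  return Math.floor(TOTAL_TOKEN_BUDGET / Math.max(1, questionCount));
}
```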

File Attachments

deep_research can read local files and include them as context. Files over 600 lines are smart-truncated (first 500 + last 100 lines). Line ranges supported. Line numbers preserved in output.
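The truncation rule above can be sketched as follows; the omission-marker text and the exact numbering format are illustrative assumptions.

```typescript
// Smart truncation: files over head + tail lines keep the first 500 and
// last 100 lines, with original 1-based line numbers preserved.
function smartTruncate(lines: string[], head = 500, tail = 100): string[] {
  const numbered = lines.map((l, i) => `${i + 1}: ${l}`);
  if (lines.length <= head + tail) return numbered;
  return [
    ...numbered.slice(0, head),
    `... ${lines.length - head - tail} lines omitted ...`,
    ...numbered.slice(lines.length - tail),
  ];
}
```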

Concurrency

| Operation | Parallel Limit |
|:----------|:---------------|
| Web search keywords | 8 |
| Reddit search queries | 8 |
| Reddit post fetches per batch | 5 (batches of 10) |
| URL scraping per batch | 10 (batches of 30) |
| LLM extraction | 3 |
| Deep research questions | 3 |

All clients use manual retry with exponential backoff and jitter. The OpenAI SDK's built-in retry is disabled (maxRetries: 0).
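Bounded parallel execution of the kind utils/concurrency.ts provides can be sketched with a worker-pool pMap; this is a minimal illustration, not the package's implementation.

```typescript
// Minimal bounded pMap: at most `limit` promises are in flight at once,
// and results come back in input order.
async function pMap<T, R>(
  items: T[],
  fn: (item: T) => Promise<R>,
  limit: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; safe because JS runs workers on one thread
  async function worker() {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```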

Architecture

src/
├── index.ts                    Entry point — STDIO + HTTP transport, graceful shutdown
├── worker.ts                   Cloudflare Workers entry (Durable Objects)
├── config/
│   ├── index.ts                Env parsing, capability detection, lazy Proxy config
│   ├── loader.ts               YAML → Zod → JSON Schema pipeline
│   └── yaml/tools.yaml         Single source of truth for tool definitions
├── schemas/                    Zod input validation (deep-research, scrape-links, web-search)
├── tools/
│   ├── registry.ts             Tool lookup → capability check → validate → execute
│   ├── search.ts               web_search handler
│   ├── reddit.ts               search_reddit + get_reddit_post handlers
│   ├── scrape.ts               scrape_links handler
│   └── research.ts             deep_research handler
├── clients/
│   ├── search.ts               Google Serper API client
│   ├── reddit.ts               Reddit OAuth + comment tree parser
│   ├── scraper.ts              Scrape.do client with fallback modes
│   └── research.ts             OpenRouter client with model-specific handling
├── services/
│   ├── llm-processor.ts        Shared LLM extraction (singleton OpenAI client)
│   ├── markdown-cleaner.ts     HTML → Markdown via Turndown
│   └── file-attachment.ts      Local file reading with line ranges
└── utils/
    ├── retry.ts                Shared backoff + retry constants
    ├── concurrency.ts          Bounded parallel execution (pMap, pMapSettled)
    ├── url-aggregator.ts       CTR-weighted scoring + consensus detection
    ├── errors.ts               Error classification + structured errors
    ├── logger.ts               MCP logging protocol
    └── response.ts             Standardized 70/20/10 output formatting

Deploy

Cloudflare Workers

npx wrangler deploy

Uses Durable Objects with SQLite storage. YAML-based tool definitions are replaced with inline definitions since there's no filesystem in Workers.

npm

Published as mcp-research-powerpack. Binary names: mcp-research-powerpack, research-powerpack-mcp.

Development

pnpm install          # Install dependencies
pnpm dev              # Run with tsx (live TypeScript)
pnpm build            # Compile to dist/
pnpm typecheck        # Type-check without emitting
pnpm start            # Run compiled output

Testing

pnpm test:web-search     # Test web search tool
pnpm test:reddit-search  # Test Reddit search
pnpm test:scrape-links   # Test scraping
pnpm test:deep-research  # Test deep research
pnpm test:all            # Run all tests
pnpm test:check          # Check environment setup

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run pnpm typecheck && pnpm build to verify
  5. Commit (git commit -m 'feat: add amazing feature')
  6. Push to your branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

License

MIT © Yiğit Konur