deep-research-hub-mcp

v0.1.0

Published

Deep research MCP server for Claude — Perplexity & OpenAI APIs with cost tracking, batch queries, gap analysis, and prompt enrichment. Local self-hosted TypeScript research agent.

Downloads

21


A deep research MCP server and research agent tool for Claude, Cursor, and any MCP-compatible client. Routes queries to Perplexity and OpenAI deep research APIs with cost tracking, batch queries, gap analysis, prompt enrichment, and a full audit trail. Local, self-hosted, provider-agnostic.

For comparisons, provider guides, and common questions, see FAQ.md.

Contents

FAQ → What is a deep research MCP? · How much does it cost? · Do I need Firecrawl? · Cursor setup · Gemini? · Best deep research MCP?


Quick Start

Prerequisites

  • Node.js 18+ (download)
  • Perplexity API key (get one) — primary provider
  • OpenAI API key (get one) — for prompt enrichment + optional escalation

Setup

git clone https://github.com/thedapperdev/deep-research-hub-mcp.git
cd deep-research-hub-mcp
npm install

Create .env with your API keys:

OPENAI_API_KEY=sk-...
PERPLEXITY_API_KEY=pplx-...
PORT=3100

Start the server:

npm run dev    # Development with live reload
npm start      # Production

Register with Your Client

Claude Code:

claude mcp add --transport http deep-research-hub http://localhost:3100/mcp

Cursor — add to .cursor/mcp.json:

{
  "mcpServers": {
    "deep-research-hub": {
      "url": "http://localhost:3100/mcp"
    }
  }
}

Works with any MCP-compatible client (Claude Desktop, VS Code, Windsurf, etc.).

Verify It Works

curl http://localhost:3100/health
# {"status":"ok","providers":{"perplexity":true,"openai":true}}
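A client script can gate on this response before submitting work. The sketch below assumes the body always has the shape shown in the sample above; `missingProviders` is a hypothetical helper for illustration, not part of the package.

```typescript
// Shape of the /health body, inferred from the sample response above.
interface HealthResponse {
  status: string;
  providers: Record<string, boolean>;
}

// Hypothetical helper: names of providers the server reports as not configured.
function missingProviders(rawBody: string): string[] {
  const health = JSON.parse(rawBody) as HealthResponse;
  if (health.status !== "ok") {
    throw new Error(`Server reported status "${health.status}"`);
  }
  return Object.entries(health.providers)
    .filter(([, ready]) => !ready)
    .map(([name]) => name);
}

const sampleBody = '{"status":"ok","providers":{"perplexity":true,"openai":false}}';
console.log(missingProviders(sampleBody)); // [ 'openai' ]
```

An empty array means every provider key was picked up from .env.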

Tools

5 composable tools — each does one thing well. As one r/mcp commenter noted, "MCPs should be atomic in nature." See usage examples below.

research_question

Submit a single research query. Returns a job ID immediately — research runs asynchronously.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| question | string | required | The research question |
| provider | "perplexity" \| "openai" | "perplexity" | Research provider |
| enrichPrompt | boolean | true | Rewrite query via gpt-4.1 before sending |
| context | string | — | Additional context (subject, persona) |
| outputPath | string | — | File path to write results when complete |
| persona | string | — | Research persona (e.g. "security-auditor") |

→ { jobId: "abc-123", status: "submitted", estimatedMinutes: 3 }
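For callers that build tool arguments in code, the parameter list above can be written as a type. This is a sketch transcribed from the table, not the server's actual schema; `withDefaults` is a hypothetical helper that shows which values apply when optional fields are omitted.

```typescript
// Argument shape transcribed from the research_question parameter table.
type Provider = "perplexity" | "openai";

interface ResearchQuestionArgs {
  question: string;
  provider?: Provider;
  enrichPrompt?: boolean;
  context?: string;
  outputPath?: string;
  persona?: string;
}

// Hypothetical helper: fills in the documented defaults so the effective
// request is explicit. Fields the caller sets always win.
function withDefaults(args: ResearchQuestionArgs): ResearchQuestionArgs {
  if (args.question.trim() === "") {
    throw new Error("question is required");
  }
  return { provider: "perplexity", enrichPrompt: true, ...args };
}

const effectiveArgs = withDefaults({
  question: "How does Kubernetes handle pod autoscaling?",
  persona: "security-auditor",
});
console.log(effectiveArgs.provider, effectiveArgs.enrichPrompt); // perplexity true
```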

research_batch

Process multiple questions with concurrency control.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| questions | string[] | — | Array of questions (or use questionsFile) |
| questionsFile | string | — | Path to a persona markdown file |
| provider | "perplexity" \| "openai" \| "both" | "perplexity" | Provider for all questions |
| concurrency | number (1-5) | 2 | Parallel jobs |
| outputDir | string | — | Directory for output files |
| enrichPrompts | boolean | true | Enrich all prompts via gpt-4.1 |

→ { batchId: "batch-456", totalQuestions: 5, estimatedMinutes: 8 }
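To make the concurrency parameter concrete, here is a minimal sketch of a bounded worker pool. It illustrates the general technique only; how the server schedules batch jobs internally is not documented here, and `mapWithConcurrency` is a hypothetical name.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results come back in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  // The table above documents a 1-5 range, so clamp to it.
  const lanes = Math.min(5, Math.max(1, Math.floor(limit)));
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++; // no await between read and increment, so no race
      results[index] = await worker(items[index]);
    }
  }

  await Promise.all(Array.from({ length: Math.min(lanes, items.length) }, lane));
  return results;
}
```

With the default concurrency of 2, five questions run as two lanes that each pick up the next question as soon as their current one finishes.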

check_status

Poll job or batch progress. Any agent can call this with any job ID.

| Parameter | Type | Description |
|-----------|------|-------------|
| jobId | string | Check a specific job |
| batchId | string | Check all jobs in a batch |

→ { status: "completed", costUsd: 0.06, hasResult: true }

get_results

Retrieve completed research with citations and cost data.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| jobId | string | — | Specific job ID |
| batchId | string | — | Get all results from a batch |
| format | "full" \| "summary" \| "citations-only" | "full" | Detail level |

find_gaps

Analyse results for gaps, contradictions, and shallow coverage. Optionally auto-submit follow-up questions.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| resultsDir | string | — | Directory of previous result files |
| jobIds | string[] | — | Specific job IDs to analyse |
| autoSubmit | boolean | false | Auto-submit follow-ups for identified gaps |
| provider | "perplexity" \| "openai" | "perplexity" | Provider for follow-ups |

→ { totalResultsAnalysed: 5, gapsFound: 2, gaps: [{ area, severity, suggestedQuestion }] }
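When autoSubmit is left at false, the suggested questions can be collected and passed to research_batch by hand. The sketch below types the response after the sample above; the severity values are not documented in this README, so the ones in the sample data are made up and the filter takes whatever strings the caller supplies.

```typescript
// Response shape inferred from the find_gaps sample above.
interface Gap {
  area: string;
  severity: string; // allowed values are not documented here
  suggestedQuestion: string;
}

interface FindGapsResponse {
  totalResultsAnalysed: number;
  gapsFound: number;
  gaps: Gap[];
}

// Hypothetical helper: collects follow-up questions, optionally keeping only
// gaps whose severity is in `severities`.
function followUpQuestions(response: FindGapsResponse, severities?: string[]): string[] {
  return response.gaps
    .filter((gap) => !severities || severities.includes(gap.severity))
    .map((gap) => gap.suggestedQuestion);
}

// Illustrative data only; "high" and "low" are assumed severity labels.
const sampleGaps: FindGapsResponse = {
  totalResultsAnalysed: 5,
  gapsFound: 2,
  gaps: [
    { area: "consistency", severity: "high", suggestedQuestion: "Question A" },
    { area: "pricing", severity: "low", suggestedQuestion: "Question B" },
  ],
};
console.log(followUpQuestions(sampleGaps, ["high"])); // [ 'Question A' ]
```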

Architecture

graph LR
    subgraph Input["🔵 Input"]
        A[You] --> B[Claude / Cursor<br/>MCP Client]
    end
    subgraph Processing["🟣 Processing"]
        B --> C[deep-research-hub-mcp]
        C --> D{Provider Router}
        D --> E[Prompt Enrichment<br/>gpt-4.1]
        E --> F[Perplexity<br/>sonar-deep-research]
        E --> G[OpenAI<br/>o4-mini-deep-research]
        F --> H[Job Manager<br/>async polling]
        G --> H
    end
    subgraph Output["🟢 Output"]
        H --> I[Cost Tracker<br/>audit log]
        I --> J[Gap Analyser<br/>auto follow-up]
        J --> K[Markdown Output<br/>with citations]
    end

    style Input fill:#dbeafe,stroke:#3b82f6,color:#1e3a5f
    style Processing fill:#ede9fe,stroke:#8b5cf6,color:#3b1f7e
    style Output fill:#dcfce7,stroke:#22c55e,color:#14532d

Prompt Enrichment

Raw questions often produce shallow results. Following OpenAI's recommended pattern, the server uses gpt-4.1 (~$0.007/query) to rewrite your question before sending it to the deep research model. Controlled per-query via the enrichPrompt parameter.

Before (raw):

How does prompt enrichment improve deep research quality?

After (enriched by gpt-4.1):

Provider: General
Persona: Prompt Engineer
Cluster: Prompt Optimisation Techniques

Research the impact of prompt enrichment on deep research output quality.
Include: specific before/after examples, evidence of quality improvement
from pre-processing models, implementation patterns, and citation quality
differences. Prioritise primary sources and technical documentation.

OUTPUT REQUIREMENTS:
- Structured markdown with inline citations
- Distinguish confirmed findings from speculation
- Include a "Research Gaps" section

Research Output

Completed research is automatically saved to output/. Output location is configurable via DATA_DIR. Each file includes YAML frontmatter:

---
question: "Compare Perplexity vs OpenAI deep research..."
provider: perplexity
model: sonar-deep-research
researched: 2026-03-28T18:27:01Z
duration_seconds: 122
cost_usd: 0.0524
input_tokens: 117
output_tokens: 6520
citations_count: 41
---
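Because the metadata is flat `key: value` frontmatter, downstream scripts can read it without a YAML library. The following is a minimal sketch that handles only the flat form shown above; `parseFrontmatter` is a hypothetical helper and not a general YAML parser.

```typescript
// Minimal reader for flat `key: value` frontmatter delimited by `---` lines.
function parseFrontmatter(markdown: string): Record<string, string> {
  const lines = markdown.split(/\r?\n/);
  if (lines[0].trim() !== "---") return {};
  const fields: Record<string, string> = {};
  for (const line of lines.slice(1)) {
    if (line.trim() === "---") break; // closing delimiter
    const colon = line.indexOf(":"); // first colon, so timestamps stay intact
    if (colon === -1) continue;
    const key = line.slice(0, colon).trim();
    // Drop one pair of surrounding double quotes, if present.
    const value = line.slice(colon + 1).trim().replace(/^"(.*)"$/, "$1");
    fields[key] = value;
  }
  return fields;
}

const sampleReport = [
  "---",
  "provider: perplexity",
  "researched: 2026-03-28T18:27:01Z",
  "cost_usd: 0.0524",
  "citations_count: 41",
  "---",
  "# Report body",
].join("\n");

const meta = parseFrontmatter(sampleReport);
console.log(meta.provider, Number(meta.cost_usd), meta.researched);
// perplexity 0.0524 2026-03-28T18:27:01Z
```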

Configuration

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| OPENAI_API_KEY | required | OpenAI API key (enrichment + escalation) |
| PERPLEXITY_API_KEY | required | Perplexity API key (primary provider) |
| PORT | 3100 | Server port |
| COST_LIMIT_SESSION_USD | 200 | Hard cap per session |
| COST_LIMIT_PER_JOB_USD | 5 | Cap per single query |
| COST_LIMIT_PER_BATCH_USD | 50 | Cap per batch |
| POLL_INTERVAL_PERPLEXITY_MS | 15000 | Poll Perplexity every 15s |
| POLL_INTERVAL_OPENAI_MS | 30000 | Poll OpenAI every 30s |
| MAX_POLL_ATTEMPTS | 60 | Timeout after ~15 min |
| DEFAULT_BATCH_CONCURRENCY | 2 | Parallel jobs in batch |
| DATA_DIR | ~/.deep-research-mcp | Job store + audit log location |
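The polling settings interact: the worst-case wait is the poll interval multiplied by the attempt limit. The arithmetic below assumes MAX_POLL_ATTEMPTS applies to both providers and that every attempt waits one full interval; neither detail is confirmed by the table.

```typescript
// Worst-case wait before a job times out: interval x attempts, in minutes.
function maxWaitMinutes(pollIntervalMs: number, maxAttempts: number): number {
  return (pollIntervalMs * maxAttempts) / 60000;
}

console.log(maxWaitMinutes(15000, 60)); // 15 (Perplexity defaults)
console.log(maxWaitMinutes(30000, 60)); // 30 (OpenAI defaults)
```

Under those assumptions the "~15 min" figure matches the Perplexity defaults, while the OpenAI defaults allow about 30 minutes.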

Cost Tracking

Every query is logged to ~/.deep-research-mcp/audit.log with timestamp, job ID, provider, cost, and question. check_status returns cost data for any job in real time. Full cost breakdown →
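The three COST_LIMIT_* caps can be pictured as a check that runs before a job is submitted. This is a sketch of the idea under stated assumptions, not the server's enforcement logic: `exceededCap` is hypothetical, and treating only spend strictly above a cap as exceeded is a guess.

```typescript
// Caps mirror the documented defaults for COST_LIMIT_PER_JOB_USD,
// COST_LIMIT_PER_BATCH_USD and COST_LIMIT_SESSION_USD.
interface CostLimits {
  perJobUsd: number;
  perBatchUsd: number;
  perSessionUsd: number;
}

const DEFAULT_LIMITS: CostLimits = { perJobUsd: 5, perBatchUsd: 50, perSessionUsd: 200 };

// Projected spend at each scope if the next job were to run.
interface Spend {
  jobUsd: number;
  batchUsd: number;
  sessionUsd: number;
}

// Hypothetical check: returns the first cap the projected spend would exceed,
// or null when the job may proceed.
function exceededCap(
  spend: Spend,
  limits: CostLimits = DEFAULT_LIMITS,
): "job" | "batch" | "session" | null {
  if (spend.jobUsd > limits.perJobUsd) return "job";
  if (spend.batchUsd > limits.perBatchUsd) return "batch";
  if (spend.sessionUsd > limits.perSessionUsd) return "session";
  return null;
}

console.log(exceededCap({ jobUsd: 0.92, batchUsd: 50.5, sessionUsd: 120 })); // batch
console.log(exceededCap({ jobUsd: 0.06, batchUsd: 0.3, sessionUsd: 12 })); // null
```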


Providers & Benchmarks

| Provider | Model | Cost/query | Speed | Best for |
|----------|-------|-----------|-------|----------|
| Perplexity (default) | sonar-deep-research | ~$0.06 | ~3 min | Factual research, citations |
| OpenAI (escalation) | o4-mini-deep-research | ~$0.92 | ~23 min | Complex reasoning |
| OpenAI (enrichment) | gpt-4.1 | ~$0.007 | <1s | Prompt rewriting |
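These per-query figures make it easy to budget a batch up front. The estimate below assumes the research cost and the enrichment cost are charged separately and simply add up; the table does not say whether its research figures already include enrichment, so treat the result as a rough upper estimate.

```typescript
// Approximate per-query costs from the table above, in USD.
const COST_PER_QUERY_USD = { perplexity: 0.06, openai: 0.92 } as const;
const ENRICHMENT_COST_USD = 0.007;

// Hypothetical estimator: question count x (research + optional enrichment),
// rounded to a tenth of a cent to hide floating-point noise.
function estimateBatchCostUsd(
  questionCount: number,
  provider: keyof typeof COST_PER_QUERY_USD,
  enrich: boolean = true,
): number {
  const perQuestion = COST_PER_QUERY_USD[provider] + (enrich ? ENRICHMENT_COST_USD : 0);
  return Math.round(questionCount * perQuestion * 1000) / 1000;
}

console.log(estimateBatchCostUsd(5, "perplexity")); // 0.335
console.log(estimateBatchCostUsd(5, "openai")); // 4.635
console.log(estimateBatchCostUsd(5, "perplexity", false)); // 0.3
```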

Why Perplexity Is the Default

| Benchmark | Perplexity DR | OpenAI o4-mini DR | Winner |
|-----------|--------------|-------------------|--------|
| DRACO (citation + accuracy) | 70.5% | 41.9% | Perplexity |
| SimpleQA (factual accuracy) | 93.9% | 20.2% (base) | Perplexity |
| Citation accuracy | 90.24% | N/A | Perplexity |
| Average response time | ~3 min | ~23 min | Perplexity |
| Cost per query | ~$0.06 | ~$0.92 | Perplexity |

Perplexity outperforms both OpenAI and Gemini on DRACO — the production-grounded benchmark for deep research (Gemini comparison →). OpenAI is available as manual escalation via provider: "openai" for questions requiring deeper reasoning.

Provider-Agnostic by Design

Our src/providers/ abstraction makes it straightforward to add new providers via PR. Implement src/providers/<name>.ts following the existing pattern. PRs welcome.
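As a rough picture of what a provider abstraction can look like, here is a sketch of a provider contract and registry. Everything in it is hypothetical: the `ResearchProvider` interface, the registry functions, and the stand-in provider are illustrations only, and the real contract in src/providers/ may differ, so read perplexity.ts and openai.ts before writing a new provider.

```typescript
// Hypothetical contract; the actual interface in src/providers/ may differ.
interface ResearchProvider {
  readonly name: string;
  submit(question: string): Promise<{ jobId: string }>;
}

const registry = new Map<string, ResearchProvider>();

// Registers a provider under its name, rejecting duplicates.
function registerProvider(provider: ResearchProvider): void {
  if (registry.has(provider.name)) {
    throw new Error(`Provider "${provider.name}" is already registered`);
  }
  registry.set(provider.name, provider);
}

// Looks up a provider by the value passed as the `provider` parameter.
function getProvider(name: string): ResearchProvider {
  const provider = registry.get(name);
  if (!provider) {
    throw new Error(`Unknown provider "${name}"`);
  }
  return provider;
}

// Stand-in provider for illustration; it performs no network calls.
registerProvider({
  name: "example",
  submit: async (question) => ({ jobId: `example-${question.length}` }),
});

console.log(getProvider("example").name); // example
```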


Usage Examples

Full parameter reference: Tools

Single question:

You: "Research how Kubernetes handles pod autoscaling."

Claude calls: research_question({
  question: "How does Kubernetes handle pod autoscaling...",
  provider: "perplexity"
})
→ Returns: { jobId: "abc-123", status: "submitted", estimatedMinutes: 3 }

Claude calls: check_status({ jobId: "abc-123" })
→ Returns: { status: "completed", costUsd: 0.06, hasResult: true }

Claude calls: get_results({ jobId: "abc-123" })
→ Full markdown report with 42 citations, saved to output/

Batch research:

You: "Compare these 5 databases: Postgres, MongoDB, CockroachDB, PlanetScale, Neon."

Claude calls: research_batch({
  questions: ["Postgres strengths...", "MongoDB strengths...", ...],
  provider: "perplexity",
  outputDir: "./output/db-comparison"
})
→ Returns: { batchId: "batch-456", totalQuestions: 5, estimatedMinutes: 8 }

Claude calls: check_status({ batchId: "batch-456" })
→ Returns: { completed: 3, running: 2, totalCost: 0.18 }

Gap analysis:

You: "Check if any of the database research has gaps."

Claude calls: find_gaps({ resultsDir: "./output/db-comparison" })
→ "CockroachDB result has only 1 citation and contains generic language.
   Suggested follow-up: What specific consistency guarantees does
   CockroachDB provide under network partition?"

Use Cases

  • Tech stack evaluation — "Compare 10 databases." Batch all queries, get structured output, know the total cost.
  • Competitive intelligence — Track 20 competitors weekly with full audit trail and gap flags.
  • Academic literature review — Batch research questions, auto-flag where citations are weak.
  • Lead enrichment — 500 company profiles at $0.06 each instead of $2+ manual research.
  • Due diligence — 100+ source synthesis with full citation trail for compliance.

Get started →


Why This Exists

Developers have been asking for exactly this — a deep research MCP server that runs locally, integrates with Claude in one command, and doesn't require standalone project setup. The existing options had real gaps:

  • No cost visibility. Deep research can cost $10+ per query with OpenAI. At scale, costs compound with zero audit trail.
  • Scraping dependencies. Most MCP research servers require Firecrawl or Scrapegraph — unnecessary when Perplexity and OpenAI handle search natively.
  • No batching. Researching 50 topics means copy-pasting 50 times.
  • No gap detection. You don't know which answers are shallow until you read every one.
  • Single-provider lock-in. The official @perplexity-ai/mcp-server is great for single queries but only supports Perplexity. OctagonAI locks you into their proprietary backend with no cost transparency.
  • Cost tracking was explicitly requested by the community as a missing feature.

For a full competitor breakdown, see FAQ.md — How does deep-research-hub-mcp compare?

Competitor Feature Matrix

| Feature | deep-research-hub-mcp | teelaitila | pminervini | ssdeanx | reading-plus-ai | Arindam200 | OctagonAI |
|---------|---|---|---|---|---|---|---|
| Language | 🟦 TypeScript | 🟦 TypeScript | 🐍 Python | 🟦 TypeScript | 🐍 Python | 🐍 Python | 🟨 JavaScript |
| No external service needed | ✅ | ❌ Firecrawl | ✅ | ✅ | ✅ | ❌ Scrapegraph | ❌ Octagon API |
| Truly self-hosted | ✅ | ⚠️ Partial | ✅ | ✅ | ✅ | ⚠️ Partial | ❌ SaaS |
| Cost tracking | ✅ Per-query audit + limits | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Batch processing | ✅ Concurrency control | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Gap analysis | ✅ Auto-detect + follow-up | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Prompt enrichment | ✅ gpt-4.1 | ❌ | ⚠️ Clarification | ⚠️ Persona prompts | ⚠️ Elaboration | ❌ | ❌ |
| Research approach | ✅ Native APIs | ⚠️ Scrape + LLM | ✅ Native APIs | ✅ Gemini API | ✅ Claude built-in | ⚠️ Scrape + LLM | ⚠️ Proprietary |
| Multiple providers | ✅ Perplexity + OpenAI | ✅ Multiple | ✅ OpenAI + Gemini | ❌ Gemini only | ❌ Claude only | ❌ Nebius only | ❌ Octagon only |
| Atomic tools | ✅ 5 composable | ❌ Single tool | ⚠️ Multiple | ❌ Single tool | ❌ Prompt only | ❌ Pipeline | ❌ 1 tool |
| Async jobs | ✅ Submit/poll/retrieve | ❌ | ✅ | ❌ | ❌ | ❌ | ❓ |
| Audit trail | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Test coverage | ✅ 97% (134 tests) | ❓ | ✅ pytest | ❓ | ❓ | ❌ | ❌ |


Dogfooding

The competitive analysis, community demand evidence, and cost data in this README were gathered by running this MCP server on itself, exercising all 5 tools end to end:

| # | Tool Used | Query | Cost | Duration | Tokens | Citations |
|---|-----------|-------|------|----------|--------|-----------|
| 1 | research_question | Developer demand for deep research MCP tools | $0.073 | 190s | 9,004 | 50 |
| 2 | research_batch | Competitor comparison (7 named repos) | $0.091 | ~180s | 8,800+ | 50 |
| 3 | research_batch | Why cost tracking matters at scale | $0.077 | ~180s | 7,700+ | 50 |
| 4 | find_gaps | Gap analysis on all 3 results | $0.00 | <1s | — | — |
| Total | All 5 tools demonstrated | 3 comprehensive reports | $0.24 | ~9 min | 25,500+ | 150 |

All query prompts were enriched by gpt-4.1 before sending. find_gaps found 0 gaps — all results were well-cited. Output files are in output/examples/.
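The totals row can be checked by summing the three research runs. Rows 2 and 3 report approximate durations and token counts ("~180s", "8,800+"), so the sketch below uses those figures as lower bounds; `sumRuns` is a hypothetical helper.

```typescript
interface Run {
  costUsd: number;
  seconds: number;
  tokens: number;
  citations: number;
}

// Figures copied from rows 1-3 of the table above.
const runs: Run[] = [
  { costUsd: 0.073, seconds: 190, tokens: 9004, citations: 50 },
  { costUsd: 0.091, seconds: 180, tokens: 8800, citations: 50 },
  { costUsd: 0.077, seconds: 180, tokens: 7700, citations: 50 },
];

// Adds up each column across all runs.
function sumRuns(items: Run[]): Run {
  return items.reduce(
    (total, run) => ({
      costUsd: total.costUsd + run.costUsd,
      seconds: total.seconds + run.seconds,
      tokens: total.tokens + run.tokens,
      citations: total.citations + run.citations,
    }),
    { costUsd: 0, seconds: 0, tokens: 0, citations: 0 },
  );
}

const totals = sumRuns(runs);
console.log(totals.costUsd.toFixed(2)); // 0.24
console.log(Math.round(totals.seconds / 60)); // 9
console.log(totals.tokens, totals.citations); // 25504 150
```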


Troubleshooting

Server starts but tools don't appear in Claude:

  • Ensure you registered with: claude mcp add --transport http deep-research-hub http://localhost:3100/mcp
  • Restart Claude Code after registering

"At least one provider API key must be set":

  • Check your .env file has valid OPENAI_API_KEY and/or PERPLEXITY_API_KEY
  • See Configuration for all environment variables

Jobs stuck in "running" forever:

  • Check ~/.deep-research-mcp/audit.log for errors
  • Verify API key is valid: curl -s http://localhost:3100/health
  • Deep research queries take 2-4 min (Perplexity) or 5-20 min (OpenAI) — this is normal

Known Limitations

This server uses the Streamable HTTP transport. Research jobs run asynchronously: research_question returns a job ID immediately, and you poll with check_status.

MCP clients (including Claude Code) do not yet support automatic resumption when async operations complete (anthropics/claude-code#1478). This is a limitation of all HTTP MCP servers with async patterns, not specific to this tool.
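Until clients can resume automatically, a script that drives the tools directly has to poll by hand. The loop below is a minimal sketch: `checkStatus` stands in for however your client invokes the check_status tool, "completed" is taken from the samples in this README, and "failed" is an assumed terminal state that the README does not document.

```typescript
interface JobStatus {
  status: string;
  costUsd?: number;
  hasResult?: boolean;
}

// Polls until the job reaches a terminal state or the attempt budget runs out.
// `sleep` is injectable so the loop can be tested without real delays.
async function pollUntilDone(
  checkStatus: () => Promise<JobStatus>,
  intervalMs: number,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<JobStatus> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await checkStatus();
    // "failed" is an assumption; adjust to the statuses the server returns.
    if (current.status === "completed" || current.status === "failed") {
      return current;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Job still running after ${maxAttempts} status checks`);
}
```

Passing 15000 and 60 reproduces the default Perplexity polling budget from the Configuration section.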


Scripts

| Script | Purpose |
|--------|---------|
| npm run dev | Development with live reload |
| npm start | Production start |
| npm test | Run test suite (134 tests) |
| npm run test:watch | Jest in watch mode |
| npm run test:coverage | Coverage report |
| npm run lint | TypeScript type checking |
| npm run build | Compile TypeScript |


Contributing

This project is provider-agnostic by design. PRs for new providers are welcome — implement src/providers/<name>.ts following the existing perplexity.ts and openai.ts patterns. stdio transport support would also be a welcome contribution.


License

MIT


This project is experimental and AI-assisted. No warranties provided. Use at your own risk.