researchmcp

v1.0.2

Published

3 months ago

MCP-powered Research Assistant for querying academic papers from arXiv, Semantic Scholar, and PubMed


mcp model-context-protocol research arxiv semantic-scholar pubmed academic-papers research-assistant ai llm citations bibtex

🧠 Research MCP - Model Context Protocol Research Assistant

A complete Model Context Protocol (MCP)-based Research Assistant that enables LLMs to fetch, analyze, and summarize academic research papers in real-time from multiple trusted sources: arXiv, Semantic Scholar, and PubMed.

🔍 Overview

The Research MCP system provides standardized access to academic research databases through three specialized MCP servers. Each server implements the MCP specification, allowing AI assistants to query live research data, process results, and return structured insights like summaries, comparisons, and citations.

✨ Features

📚 Multi-Source Search: Query arXiv, Semantic Scholar, and PubMed simultaneously
🔄 Automatic Deduplication: Smart paper matching across different sources
📊 Citation Analysis: Track citation counts and influential papers
📝 BibTeX Generation: Automatic citation formatting for all sources
⚡ Rate Limiting: Built-in request throttling to respect API limits
🎯 Advanced Filtering: Filter by year, author, venue, and more
🔍 Full Metadata: Complete paper information including abstracts, authors, and links

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                      LLM Client                             │
│  (Issues natural language queries)                          │
└────────────────┬────────────────────────────────────────────┘
                 │
                 │ MCP Protocol
                 │
┌────────────────┴───────────────────────────────────────────┐
│                   MCP Servers                              │
│  ┌──────────┐   ┌──────────────┐  ┌──────────┐             │
│  │  arXiv   │   │   Semantic   │  │  PubMed  │             │
│  │  Server  │   │    Scholar   │  │  Server  │             │
│  └────┬─────┘   └──────┬───────┘  └────┬─────┘             │
└───────┼────────────────┼───────────────┼───────────────────┘
        │                │               │
        │                │               │
┌───────┴────────────────┴───────────────┴───────────────────┐
│              External APIs                                 │
│    arXiv API    Semantic Scholar API    PubMed E-utils     │
└────────────────────────────────────────────────────────────┘

📋 Prerequisites

Node.js 18+
An MCP-compatible client (Claude Desktop, Cline, etc.)

🚀 Installation

Quick Start with npx (Recommended)

No installation or API keys needed! Just add the following to your MCP client configuration:

{
  "mcpServers": {
    "research-arxiv": {
      "command": "npx",
      "args": ["-y", "researchmcp", "arxiv"]
    },
    "research-semantic-scholar": {
      "command": "npx",
      "args": ["-y", "researchmcp", "semantic"]
    },
    "research-pubmed": {
      "command": "npx",
      "args": ["-y", "researchmcp", "pubmed"]
    }
  }
}

That's it! All three servers work perfectly without any API keys or configuration.

Local Development

For contributing or modifying the code:

git clone https://github.com/gyash1512/ResearchMCP.git
cd ResearchMCP
npm install
npm run build

🎮 Usage

Using with npx (Recommended)

Just configure in your MCP client - that's it! No API keys needed.

Local Development

Start servers individually for testing:

npm run start:arxiv
npm run start:semantic
npm run start:pubmed

MCP Configuration

Simple setup - no API keys required:

{
  "mcpServers": {
    "research-arxiv": {
      "command": "npx",
      "args": ["-y", "researchmcp", "arxiv"]
    },
    "research-semantic-scholar": {
      "command": "npx",
      "args": ["-y", "researchmcp", "semantic"]
    },
    "research-pubmed": {
      "command": "npx",
      "args": ["-y", "researchmcp", "pubmed"]
    }
  }
}

Optional - with API keys, for higher rate limits:

{
  "mcpServers": {
    "research-semantic-scholar": {
      "command": "npx",
      "args": ["-y", "researchmcp", "semantic"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your_key_here"
      }
    },
    "research-pubmed": {
      "command": "npx",
      "args": ["-y", "researchmcp", "pubmed"],
      "env": {
        "PUBMED_API_KEY": "your_key_here",
        "PUBMED_EMAIL": "your_email_here"
      }
    }
  }
}

Local build - run the compiled servers from a clone of the repository:

{
  "mcpServers": {
    "research-arxiv": {
      "command": "node",
      "args": ["./dist/servers/arxiv-server.js"],
      "cwd": "/absolute/path/to/ResearchMCP"
    },
    "research-semantic-scholar": {
      "command": "node",
      "args": ["./dist/servers/semantic-scholar-server.js"],
      "cwd": "/absolute/path/to/ResearchMCP"
    },
    "research-pubmed": {
      "command": "node",
      "args": ["./dist/servers/pubmed-server.js"],
      "cwd": "/absolute/path/to/ResearchMCP"
    }
  }
}

Note: Replace /absolute/path/to/ResearchMCP with your actual project path.

📚 Available Tools

arXiv Server

`search_arxiv`

Search for papers on arXiv by keyword, author, or subject.

Parameters:

query (string, required): Search query
maxResults (number, optional): Max results (default: 10, max: 100)
startYear (number, optional): Filter by start year
endYear (number, optional): Filter by end year
author (string, optional): Filter by author name
sortBy (string, optional): Sort by relevance, lastUpdatedDate, or submittedDate

Example:

{
  "query": "quantum computing",
  "maxResults": 5,
  "startYear": 2023,
  "sortBy": "relevance"
}
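
How the server turns these parameters into an arXiv request is not shown in this README. As a rough sketch only, the mapping onto the public arXiv API could look like the following. `buildArxivUrl` is a hypothetical helper, the open-ended year bounds (1991 and 2099) are arbitrary choices for the sketch, and the query text is passed through unchanged (quoting of multi-word phrases is left open).

```typescript
// Parameter shape taken from the search_arxiv list above.
interface SearchArxivParams {
  query: string;
  maxResults?: number;
  startYear?: number;
  endYear?: number;
  author?: string;
  sortBy?: "relevance" | "lastUpdatedDate" | "submittedDate";
}

// Hypothetical helper (not part of researchmcp): builds an arXiv API query URL.
function buildArxivUrl(p: SearchArxivParams): string {
  // Clamp to the documented range: default 10, at most 100.
  const maxResults = Math.min(Math.max(p.maxResults ?? 10, 1), 100);

  // arXiv field prefixes: "all:" searches every field, "au:" the author field.
  const clauses = [`all:${p.query}`];
  if (p.author) clauses.push(`au:${p.author}`);

  if (p.startYear !== undefined || p.endYear !== undefined) {
    // submittedDate ranges use the form [YYYYMMDDHHMM TO YYYYMMDDHHMM].
    // 1991 and 2099 are arbitrary open bounds chosen for this sketch.
    const from = `${p.startYear ?? 1991}01010000`;
    const to = `${p.endYear ?? 2099}12312359`;
    clauses.push(`submittedDate:[${from} TO ${to}]`);
  }

  const params = new URLSearchParams({
    search_query: clauses.join(" AND "),
    start: "0",
    max_results: String(maxResults),
    sortBy: p.sortBy ?? "relevance",
    sortOrder: "descending",
  });
  return `http://export.arxiv.org/api/query?${params.toString()}`;
}
```

Calling `buildArxivUrl({ query: "quantum computing", maxResults: 5, startYear: 2023 })` yields a URL for the example request above.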

`get_arxiv_paper`

Get detailed information about a specific arXiv paper by ID.

Parameters:

arxivId (string, required): arXiv paper ID (e.g., "2301.12345")

`arxiv_to_bibtex`

Convert arXiv paper to BibTeX format.

Parameters:

arxivId (string, required): arXiv paper ID

Semantic Scholar Server

`search_semantic_scholar`

Search for papers with citation information.

Parameters:

query (string, required): Search query
maxResults (number, optional): Max results (default: 10, max: 100)
startYear (number, optional): Filter by start year
endYear (number, optional): Filter by end year

Example:

{
  "query": "transformer architecture",
  "maxResults": 10,
  "startYear": 2023
}
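
For orientation, here is a hedged sketch of how such a search might be expressed against the Semantic Scholar Graph API, including the optional API key. `buildSemanticScholarRequest` and `RequestSpec` are hypothetical names, and the selected `fields` are an assumption based on the paper object documented in this README; the package's actual request logic may differ.

```typescript
// Parameter shape taken from the search_semantic_scholar list above.
interface SemanticSearchParams {
  query: string;
  maxResults?: number;
  startYear?: number;
  endYear?: number;
}

// Hypothetical container describing the HTTP request to send.
interface RequestSpec {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper (not part of researchmcp).
function buildSemanticScholarRequest(
  p: SemanticSearchParams,
  apiKey?: string
): RequestSpec {
  // Clamp to the documented range: default 10, at most 100.
  const limit = Math.min(Math.max(p.maxResults ?? 10, 1), 100);

  const params = new URLSearchParams({
    query: p.query,
    limit: String(limit),
    // Assumed field selection for this sketch.
    fields: "title,abstract,year,authors,citationCount,venue",
  });

  if (p.startYear !== undefined || p.endYear !== undefined) {
    // The year filter accepts ranges such as "2023-", "-2020", or "2019-2021".
    params.set("year", `${p.startYear ?? ""}-${p.endYear ?? ""}`);
  }

  // The API key, when present, is sent in the x-api-key header.
  const headers: Record<string, string> = {};
  if (apiKey) headers["x-api-key"] = apiKey;

  return {
    url: `https://api.semanticscholar.org/graph/v1/paper/search?${params.toString()}`,
    headers,
  };
}
```

Leaving `apiKey` undefined produces an unauthenticated request, which matches the default keyless setup.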

`get_semantic_scholar_paper`

Get paper by Semantic Scholar ID or DOI.

Parameters:

identifier (string, required): Paper ID or DOI

`get_paper_citations`

Get papers that cite a specific paper.

Parameters:

paperId (string, required): Semantic Scholar paper ID
maxResults (number, optional): Max citing papers (default: 10, max: 100)

`semantic_scholar_to_bibtex`

Convert paper to BibTeX format.

Parameters:

identifier (string, required): Paper ID or DOI

PubMed Server

`search_pubmed`

Search biomedical and life sciences papers.

Parameters:

query (string, required): Search query (supports MeSH terms)
maxResults (number, optional): Max results (default: 10, max: 100)
startYear (number, optional): Filter by start year
endYear (number, optional): Filter by end year

Example:

{
  "query": "cancer treatment",
  "maxResults": 5,
  "startYear": 2022
}
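
As an illustration only, the parameters above could map onto an NCBI E-utilities ESearch request roughly as follows. `buildPubmedSearchUrl` is a hypothetical helper, the fallback years 1800 and 3000 are arbitrary open bounds for the sketch, and the environment variable names are the ones from the optional configuration earlier in this README.

```typescript
// Parameter shape taken from the search_pubmed list above.
interface SearchPubmedParams {
  query: string;
  maxResults?: number;
  startYear?: number;
  endYear?: number;
}

// Optional credentials, named after the env vars in the MCP configuration.
interface PubmedEnv {
  PUBMED_API_KEY?: string;
  PUBMED_EMAIL?: string;
}

// Hypothetical helper (not part of researchmcp): builds an ESearch URL.
function buildPubmedSearchUrl(p: SearchPubmedParams, env: PubmedEnv = {}): string {
  // Clamp to the documented range: default 10, at most 100.
  const retmax = Math.min(Math.max(p.maxResults ?? 10, 1), 100);

  const params = new URLSearchParams({
    db: "pubmed",
    term: p.query,
    retmax: String(retmax),
    retmode: "json",
  });

  if (p.startYear !== undefined || p.endYear !== undefined) {
    // mindate and maxdate are supplied together; pdat = publication date.
    params.set("datetype", "pdat");
    params.set("mindate", String(p.startYear ?? 1800));
    params.set("maxdate", String(p.endYear ?? 3000));
  }

  // Credentials are optional; they are only appended when configured.
  if (env.PUBMED_API_KEY) params.set("api_key", env.PUBMED_API_KEY);
  if (env.PUBMED_EMAIL) params.set("email", env.PUBMED_EMAIL);

  return `https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?${params.toString()}`;
}
```

ESearch returns matching PMIDs rather than full records, so a real implementation would need a follow-up fetch for titles and abstracts.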

`get_pubmed_paper`

Get paper by PMID.

Parameters:

pmid (string, required): PubMed ID

`pubmed_to_bibtex`

Convert paper to BibTeX format.

Parameters:

pmid (string, required): PubMed ID

💡 Example Queries

Example 1: Multi-Source Research Query

Query: "Find recent papers on federated learning in healthcare"

Workflow:

Search arXiv: search_arxiv with query "federated learning healthcare", startYear: 2023
Search Semantic Scholar: search_semantic_scholar with same parameters
Search PubMed: search_pubmed with same parameters
Combine and deduplicate results
Sort by citation count and relevance
Generate summary with top 5 papers

Expected Output:

Comprehensive list of papers from all sources
Deduplicated results
Citation counts where available
Links to full papers
BibTeX citations
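
The "combine and deduplicate" step is not specified in detail here. One plausible approach, sketched below under assumptions, is to treat two records as the same paper when their DOIs match or their normalized titles match. `UnifiedPaper`, `normalizeTitle`, and `dedupePapers` are hypothetical names; the package's own matching logic may be different.

```typescript
// Hypothetical merged record shape; not the package's real type.
interface UnifiedPaper {
  source: "arxiv" | "semantic-scholar" | "pubmed";
  title: string;
  doi?: string | null;
  citationCount?: number;
}

// Lowercase the title and collapse punctuation/whitespace so minor
// formatting differences between sources do not prevent a match.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Keeps the first record seen for each paper and fills in fields
// (DOI, citation count) that later duplicates happen to provide.
function dedupePapers(papers: UnifiedPaper[]): UnifiedPaper[] {
  const seen = new Map<string, UnifiedPaper>();
  const result: UnifiedPaper[] = [];

  for (const paper of papers) {
    const keys = [`title:${normalizeTitle(paper.title)}`];
    if (paper.doi) keys.push(`doi:${paper.doi.toLowerCase()}`);

    const existing = keys.map((k) => seen.get(k)).find((p) => p !== undefined);
    const target = existing ?? { ...paper };

    if (existing) {
      if (existing.doi == null) existing.doi = paper.doi;
      if (existing.citationCount === undefined) {
        existing.citationCount = paper.citationCount;
      }
    } else {
      result.push(target);
    }
    // Register every key so later records can match on either one.
    for (const k of keys) seen.set(k, target);
  }
  return result;
}
```

Title matching is deliberately loose in this sketch; distinct papers with identical titles would be merged, so a stricter rule (for example, also comparing first authors) may be preferable in practice.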

Example 2: Most Cited Paper

Query: "What's the most cited 2023 paper on quantum machine learning?"

Workflow:

Call search_semantic_scholar:

{
  "query": "quantum machine learning",
  "maxResults": 50,
  "startYear": 2023,
  "endYear": 2023
}

Sort results by citationCount
Get detailed info with get_semantic_scholar_paper
Generate BibTeX with semantic_scholar_to_bibtex

Expected Output:

Paper title and authors
Citation count and venue
Abstract and key findings
BibTeX citation
Link to paper
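
The "sort results by citationCount" step can be sketched in a few lines. `CitedPaper` is a trimmed-down, illustrative subset of the Semantic Scholar paper object described later in this README, and `mostCited` is a hypothetical helper.

```typescript
// Illustrative subset of the Semantic Scholar paper object.
interface CitedPaper {
  paperId: string;
  title: string;
  year: number | null;
  citationCount: number;
}

// Returns the most cited paper published in the given year,
// or undefined when no paper matches.
function mostCited(papers: CitedPaper[], year: number): CitedPaper | undefined {
  return papers
    .filter((p) => p.year === year) // filter() copies, so the input is not reordered
    .sort((a, b) => b.citationCount - a.citationCount)[0];
}
```

The returned `paperId` can then be passed to `get_semantic_scholar_paper` and `semantic_scholar_to_bibtex`.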

Example 3: Research Trend Analysis

Query: "Summarize transformer innovations after 2023"

Workflow:

Search multiple sources for "transformer architecture" papers after 2023
Extract key information from abstracts
Identify common themes and methods
Generate trend analysis
Provide top papers with citations

Expected Output:

Overview of key innovations
Timeline of developments
Most influential papers
Citation network analysis
Recommended reading list

Example 4: Citation Network

Query: "Find papers citing 'Attention is All You Need'"

Workflow:

Find original paper: search_semantic_scholar with title
Get paper ID from results
Call get_paper_citations with the paper ID
Filter by year/relevance
Generate summary of citing papers

Expected Output:

List of papers that cite the original work
Citation contexts
Related research directions
Impact analysis
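
The workflow above chains two tool calls. The sketch below shows that chaining with an injected `callTool` function, so it does not depend on any particular MCP client library. It assumes, purely for illustration, that each tool result has already been parsed into an array of paper objects; real MCP tool results arrive as content blocks that a client would need to decode first. `ToolCaller`, `PaperSummary`, and `findCitingPapers` are hypothetical names.

```typescript
// Hypothetical abstraction over "call an MCP tool and return parsed data".
type ToolCaller = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Illustrative subset of the Semantic Scholar paper object.
interface PaperSummary {
  paperId: string;
  title: string;
  year: number | null;
}

async function findCitingPapers(
  callTool: ToolCaller,
  title: string,
  sinceYear: number
): Promise<PaperSummary[]> {
  // Steps 1-2: find the original paper and take its ID.
  const hits = (await callTool("search_semantic_scholar", {
    query: title,
    maxResults: 1,
  })) as PaperSummary[];
  const original = hits[0];
  if (!original) return [];

  // Step 3: fetch citing papers (100 is the documented maximum).
  const citing = (await callTool("get_paper_citations", {
    paperId: original.paperId,
    maxResults: 100,
  })) as PaperSummary[];

  // Step 4: keep only papers from sinceYear onward; unknown years are dropped.
  return citing.filter((p) => p.year !== null && p.year >= sinceYear);
}
```

Injecting `callTool` keeps the sketch testable with a stub and leaves the transport details to whichever MCP client is in use.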

🔧 API Response Schemas

arXiv Paper Object

{
  id: string;              // arXiv ID (e.g., "2301.12345")
  title: string;
  authors: string[];
  abstract: string;
  published: string;       // ISO date
  updated: string;         // ISO date
  url: string;             // Paper URL
  pdfUrl: string;          // PDF download URL
  categories: string[];    // Subject categories
  primaryCategory: string;
}
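
The exact output of `arxiv_to_bibtex` is not shown in this README. As a rough illustration of how the fields above could map onto a BibTeX entry, here is a minimal sketch. `arxivToBibtex` is a hypothetical function, and both the `@misc` entry type and the citation-key scheme (surname + year + "arxiv") are assumptions.

```typescript
// Subset of the arXiv paper object above that the sketch needs.
interface ArxivPaper {
  id: string;
  title: string;
  authors: string[];
  published: string; // ISO date
  url: string;
  primaryCategory: string;
}

// Hypothetical converter; the real tool's output format may differ.
function arxivToBibtex(p: ArxivPaper): string {
  // An ISO date starts with the four-digit year.
  const year = p.published.slice(0, 4);

  // Assumed key scheme: first author's last name + year + "arxiv".
  const firstAuthor = p.authors[0] ?? "unknown";
  const surname = firstAuthor.trim().split(/\s+/).pop() ?? "unknown";
  const key = `${surname.toLowerCase().replace(/[^a-z0-9]/g, "")}${year}arxiv`;

  return [
    `@misc{${key},`,
    `  title = {${p.title}},`,
    `  author = {${p.authors.join(" and ")}},`, // BibTeX separates authors with "and"
    `  year = {${year}},`,
    `  eprint = {${p.id}},`,
    `  archivePrefix = {arXiv},`,
    `  primaryClass = {${p.primaryCategory}},`,
    `  url = {${p.url}}`,
    `}`,
  ].join("\n");
}
```

The sketch does not escape special LaTeX characters in titles or author names, which a production converter would need to handle.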

Semantic Scholar Paper Object

{
  paperId: string;
  title: string;
  abstract: string | null;
  year: number | null;
  authors: Array<{
    authorId: string;
    name: string;
  }>;
  citationCount: number;
  referenceCount: number;
  influentialCitationCount: number;
  url: string;
  venue: string | null;
  publicationDate: string | null;
}

PubMed Paper Object

{
  pmid: string;            // PubMed ID
  title: string;
  abstract: string;
  authors: string[];
  journal: string;
  year: string;
  doi: string | null;
  url: string;
  publicationTypes: string[];
  meshTerms: string[];     // Medical Subject Headings
}

🛡️ Rate Limiting

All servers work great without API keys:

| Server | Default Rate | With API Key | Do You Need Keys? |
|--------|--------------|--------------|-------------------|
| arXiv | 3 req/sec | N/A | ❌ No - works perfectly! |
| Semantic Scholar | 1-3 req/sec | 10 req/sec | ❌ No - unless making 100+ queries/min |
| PubMed | 3 req/sec | 10 req/sec | ❌ No - unless making 100+ queries/min |

Recommendation: Start without any API keys. Only add them if you hit rate limits.
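
The throttling implementation itself is not described here. One simple way to stay under a requests-per-second budget is a minimum-interval limiter, sketched below. `MinIntervalLimiter` is a hypothetical class, not the package's actual code, and the injectable clock exists only to make the behavior easy to test.

```typescript
// Hypothetical limiter: spaces requests at least 1000 / rate ms apart.
class MinIntervalLimiter {
  private readonly intervalMs: number;
  private readonly now: () => number;
  private nextSlot = 0; // earliest time the next request may start

  constructor(requestsPerSecond: number, now: () => number = Date.now) {
    this.intervalMs = 1000 / requestsPerSecond;
    this.now = now;
  }

  // Reserves the next free slot and returns how long (in ms)
  // the caller should wait before sending its request.
  reserve(): number {
    const current = this.now();
    const slot = Math.max(current, this.nextSlot);
    this.nextSlot = slot + this.intervalMs;
    return slot - current;
  }

  // Convenience wrapper: resolves once the reserved slot has arrived.
  async wait(): Promise<void> {
    const delay = this.reserve();
    if (delay > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A caller would create one limiter per upstream API, for example `new MinIntervalLimiter(3)` for the 3 req/sec default, and `await limiter.wait()` before each outgoing request.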

🔒 Security Notes

No API keys needed - all servers work out of the box
If using API keys, pass via MCP config env section (see optional config above)
Never commit API keys to version control
Respect API rate limits and terms of service

📖 MCP Specification Compliance

This implementation follows the Model Context Protocol specification:

✅ Standard tool definition schema
✅ JSON-based request/response format
✅ Error handling with proper status codes
✅ Resource management and cleanup
✅ Stdio transport for client communication
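
To make the JSON-based format concrete, here is a small sketch of the message an MCP client sends to invoke a tool. MCP uses JSON-RPC 2.0 with the `tools/call` method, and the stdio transport delimits messages with newlines. `makeToolCall` and `frameForStdio` are hypothetical helper names used only for illustration.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps a tool name and its arguments in a request.
function makeToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: serializes a message for the stdio transport,
// one JSON document per line with no embedded newlines.
function frameForStdio(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}
```

For example, `frameForStdio(makeToolCall(1, "search_arxiv", { query: "quantum computing", maxResults: 5 }))` produces the line a client would write to the arXiv server's standard input.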

🤝 Contributing

Contributions are welcome! Areas for improvement:

Additional research sources (IEEE, ACM, etc.)
Advanced filtering and ranking algorithms
Paper recommendation system
Citation graph visualization
Full-text analysis capabilities

📄 License

MIT License - See LICENSE file for details

🙏 Acknowledgments

arXiv for open access to research papers
Semantic Scholar for citation data and API
PubMed/NCBI for biomedical research database
Model Context Protocol team for the MCP specification

📞 Support

For issues, questions, or contributions:

Open an issue on GitHub
Check API documentation for each service
Review MCP specification for protocol details

Built with ❤️ using TypeScript and the Model Context Protocol
