paperplain-mcp

v1.2.5

Published

3 months ago

MCP server — search 200M+ peer-reviewed papers from PubMed, ArXiv, and Semantic Scholar. Free. No API key.

Downloads

364

Vulnerabilities: 0 high, 0 medium, 0 low

sulmatajb

mcp model-context-protocol pubmed arxiv semantic-scholar research papers science ai-agent claude

PaperPlain MCP

Web search gives your agent links. PaperPlain gives it science.

Give any AI agent instant access to 200M+ peer-reviewed papers from PubMed, ArXiv, and Semantic Scholar — structured, verifiable, and ready for reasoning.

Free. No API key. No account. No backend.

Why not just use web search?

| Web Search | PaperPlain MCP |
|---|---|
| Snippets, SEO noise, blogs | Full abstracts, peer-reviewed only |
| Returns URLs to scrape | Structured JSON ready for reasoning |
| Can hallucinate or misattribute sources | Real DOIs, real PMIDs (verifiable) |
| Search engines block bots | PubMed/ArXiv/S2 built for programmatic access |
| No quality signal | Citation counts included |
| Mixed sources, no routing | Health → PubMed, CS/AI → ArXiv, general → all three |

Install

npx -y paperplain-mcp

Setup

Add to your MCP config file (Claude Desktop, Cursor, Windsurf, or any MCP-compatible client):

{
  "mcpServers": {
    "paperplain": {
      "command": "npx",
      "args": ["-y", "paperplain-mcp"]
    }
  }
}

Restart your client. That's it.

Config file locations:

- Claude Desktop (Mac): `~/Library/Application Support/Claude/claude_desktop_config.json`
- Cursor: `.cursor/mcp.json`
- Windsurf: `~/.codeium/windsurf/mcp_config.json`

Note: PaperPlain is a stdio-based MCP server. It works with local clients (Claude Desktop, Cursor, Windsurf, VS Code agents). It does not work with Claude.ai web chat, which requires remote HTTP-based MCP servers.
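If you script your setup, the config entry shown above can be merged into an existing config object programmatically. This is a minimal sketch, assuming the config is plain JSON with a top-level `mcpServers` object; `addPaperplain` is a hypothetical helper name, not something the package ships.

```javascript
// Hypothetical helper: returns a copy of an MCP client config with the
// PaperPlain entry added. Servers already in the config are preserved.
function addPaperplain(config) {
  const entry = { command: "npx", args: ["-y", "paperplain-mcp"] };
  return {
    ...config,
    mcpServers: { ...(config.mcpServers || {}), paperplain: entry },
  };
}

// Usage: merge into a parsed config, then write the result back to disk.
const existing = { mcpServers: { other: { command: "node", args: ["other.js"] } } };
console.log(JSON.stringify(addPaperplain(existing), null, 2));
```

The helper copies rather than mutates, so a failed write leaves the parsed original untouched.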

Limitations

PaperPlain uses free public APIs — no backend, no cost. The trade-off is rate limits imposed by each source:

- PubMed: generous, rarely an issue for normal agent usage
- ArXiv: strict under parallel load; PaperPlain falls back to Semantic Scholar's `ARXIV:` endpoint automatically
- Semantic Scholar: ~1 req/s unauthenticated; most likely to cause 429s in batch workflows

When a source is rate-limited, search_research returns a warnings field explaining which source failed and why. find_paper_by_title returns a plain-text error the agent can relay to the user.
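For batch workflows that run into 429s, a client-side wrapper can space out retries. The sketch below computes capped exponential backoff delays and retries a caller-supplied function; the helper names and the default base, cap, and attempt values are illustrative assumptions, not behavior built into PaperPlain.

```javascript
// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling from baseMs and never exceeding capMs.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Calls fn() and retries whenever it throws, waiting longer each time.
// `fn` stands for whatever your client uses to invoke a PaperPlain tool.
async function withRetry(fn, maxAttempts = 4) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up once the attempt budget is spent.
      if (attempt + 1 >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}

console.log([0, 1, 2, 3].map((a) => backoffDelayMs(a))); // [ 1000, 2000, 4000, 8000 ]
```

A 1-second base lines up with Semantic Scholar's roughly 1 req/s unauthenticated limit, so the first retry already waits out one full window.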

Optional: Semantic Scholar API key

For heavy usage (automated research workflows, batch fetches), you can add a free S2 API key to raise the rate limit from ~1 req/s to 100 req/s.

1. Request a key at semanticscholar.org/product/api (free, approved within a day)
2. Add it to your MCP config:

{
  "mcpServers": {
    "paperplain": {
      "command": "npx",
      "args": ["-y", "paperplain-mcp"],
      "env": {
        "S2_API_KEY": "your-key-here"
      }
    }
  }
}

Zero-config users are unaffected — the key is entirely optional.
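For context, Semantic Scholar accepts API keys in the `x-api-key` request header. Below is a minimal sketch of how a server process could turn the `S2_API_KEY` environment variable into request headers; it illustrates the idea and is not taken from PaperPlain's source.

```javascript
// Builds request headers for the Semantic Scholar API.
// With no key set, the request goes out unauthenticated (the zero-config path).
function s2Headers(env = process.env) {
  const key = (env.S2_API_KEY || "").trim();
  return key ? { "x-api-key": key } : {};
}

console.log(s2Headers({ S2_API_KEY: "your-key-here" })); // { 'x-api-key': 'your-key-here' }
console.log(s2Headers({})); // {}
```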

Tools

`search_research`

Search PubMed, ArXiv, and Semantic Scholar for peer-reviewed papers. Auto-routes based on topic — health queries go to PubMed + S2, CS/AI queries go to ArXiv + S2, everything else hits all three.

- `query`: Natural language question or topic
- `max_results`: 1–10 papers (default: 5)
- `domain`: `"auto"` | `"health"` | `"cs"` | `"general"`

Returns papers with title, authors, abstract, published date, URL, DOI, citation count, and a source_status field so your agent knows if any database was unavailable.
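MCP clients invoke tools with a JSON-RPC `tools/call` request. The sketch below builds one for `search_research` and applies the parameter limits listed above. `buildSearchCall` is a hypothetical helper, the `"auto"` default for `domain` is an assumption (the README lists the values but not the default), and most clients construct this message for you.

```javascript
const DOMAINS = ["auto", "health", "cs", "general"];

// Hypothetical helper: builds the JSON-RPC message an MCP client would send.
// ASSUMPTION: domain defaults to "auto"; the README does not state a default.
function buildSearchCall(id, query, maxResults = 5, domain = "auto") {
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  if (!DOMAINS.includes(domain)) {
    throw new Error(`domain must be one of: ${DOMAINS.join(", ")}`);
  }
  // Clamp to the documented 1-10 range; fall back to 5 for non-numeric input.
  const wanted = Number.isFinite(maxResults) ? Math.trunc(maxResults) : 5;
  const max_results = Math.min(10, Math.max(1, wanted));
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "search_research",
      arguments: { query: query.trim(), max_results, domain },
    },
  };
}

console.log(JSON.stringify(buildSearchCall(1, "statin use and dementia risk", 3, "health")));
```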

`fetch_paper`

Fetch full metadata and abstract for a specific paper. Supports:

- ArXiv IDs: "2301.07041", "arxiv:2301.07041v2", "https://arxiv.org/abs/2301.07041"
- PubMed IDs: "pubmed:37183813" or just "37183813"
- DOIs: "10.1145/3290605.3300857" or "doi:10.1145/3290605.3300857" (resolved via Semantic Scholar)

Falls back to Semantic Scholar's ARXIV: endpoint when the ArXiv API is rate-limited.
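To make the accepted identifier forms concrete, here is one way they could be told apart. This is a sketch under assumptions: `parsePaperId` and `s2ArxivUrl` are hypothetical names, the regular expressions cover only the forms listed above (old-style ArXiv IDs such as `hep-th/9901001` are not handled), and PaperPlain's own parsing may differ.

```javascript
// Hypothetical parser for the identifier forms listed above.
// Returns { type, id, ... } or null when the input matches none of them.
function parsePaperId(input) {
  const s = input.trim();

  // DOIs: optional "doi:" prefix, then "10.<registrant>/<suffix>".
  const doi = s.match(/^(?:doi:)?(10\.\d{4,9}\/\S+)$/i);
  if (doi) return { type: "doi", id: doi[1] };

  // ArXiv: bare ID, "arxiv:" prefix, or an abs URL; version suffix optional.
  const arxiv = s.match(
    /^(?:arxiv:|https?:\/\/arxiv\.org\/abs\/)?(\d{4}\.\d{4,5})(v\d+)?$/i
  );
  if (arxiv) return { type: "arxiv", id: arxiv[1], version: arxiv[2] || null };

  // PubMed: optional "pubmed:" prefix, then digits only.
  const pmid = s.match(/^(?:pubmed:)?(\d+)$/i);
  if (pmid) return { type: "pubmed", id: pmid[1] };

  return null;
}

// Semantic Scholar's Graph API accepts "ARXIV:<id>" as a paper identifier,
// which is the kind of lookup a rate-limit fallback can use.
function s2ArxivUrl(arxivId) {
  return `https://api.semanticscholar.org/graph/v1/paper/ARXIV:${arxivId}`;
}

console.log(parsePaperId("arxiv:2301.07041v2"));
console.log(s2ArxivUrl("2301.07041"));
```

The DOI check runs first because a bare numeric PubMed ID and an ArXiv ID can never start with `10.` followed by a slash-separated suffix, so the three branches do not overlap.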

`find_paper_by_title`

Find a specific paper when you only know its title. Uses Semantic Scholar's title-match search and returns the closest result.

- `title`: Full or partial paper title, e.g. "Attention Is All You Need"
- `year`: Publication year to narrow the match (optional)

Useful for verifying a citation or retrieving an abstract when you have no ID or DOI.
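Because the tool returns the closest match rather than a guaranteed exact one, an agent verifying a citation may want to compare titles itself before trusting the result. The following is a minimal sketch of such a check; `normalizeTitle` and `titleMatches` are hypothetical helpers, and containment after normalization is one simple matching rule among many.

```javascript
// Lowercase, replace punctuation with spaces, collapse whitespace.
function normalizeTitle(title) {
  return title
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical check: does the returned paper plausibly match the citation?
// A partial cited title counts when it is contained in the returned title.
function titleMatches(citedTitle, returnedTitle) {
  const cited = normalizeTitle(citedTitle);
  const returned = normalizeTitle(returnedTitle);
  return cited.length > 0 && returned.includes(cited);
}

console.log(titleMatches("attention is all you need", "Attention Is All You Need")); // true
console.log(titleMatches("Attention Is All You Need", "Some Unrelated Paper Title")); // false
```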

How it works

1. Agent calls `search_research("agentic AI for home energy management")`
2. PaperPlain classifies the domain (CS/AI) and routes to ArXiv + Semantic Scholar
3. Returns structured JSON: full abstracts, authors, dates, DOIs, citation counts
4. Agent's LLM synthesizes findings from the returned context, with no black-box summaries

No LLM calls on our side. No cost. No rate limits beyond what PubMed, ArXiv, and Semantic Scholar impose.
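The classify-and-route step can be sketched as follows. The source sets per domain come from the tool description; the keyword classifier is a toy stand-in invented for illustration, since this README does not describe how PaperPlain actually classifies a query, and the `pubmed` source name is assumed by analogy with the `arxiv` and `semanticscholar` keys in the example output.

```javascript
// Source sets per domain, as described for search_research.
const ROUTES = {
  health: ["pubmed", "semanticscholar"],
  cs: ["arxiv", "semanticscholar"],
  general: ["pubmed", "arxiv", "semanticscholar"],
};

// ASSUMPTION: toy keyword lists standing in for the real classifier.
const HEALTH_TERMS = /\b(clinical|disease|patient|therapy|drug|cancer|diabetes)\b/i;
const CS_TERMS = /\b(ai|agentic|neural|transformer|algorithm|llm|machine learning)\b/i;

function classifyDomain(query) {
  if (HEALTH_TERMS.test(query)) return "health";
  if (CS_TERMS.test(query)) return "cs";
  return "general";
}

// Resolves "auto" via the classifier; unknown domains fall back to all sources.
function routeQuery(query, domain = "auto") {
  const resolved = domain === "auto" ? classifyDomain(query) : domain;
  return { domain: resolved, sources: ROUTES[resolved] || ROUTES.general };
}

console.log(routeQuery("agentic AI for home energy management"));
```

Passing an explicit `domain` skips classification entirely, which is the safer choice when the agent already knows the field.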

Example output

{
  "query": "transformer architecture energy forecasting",
  "domain": "cs",
  "source_status": { "arxiv": "ok", "semanticscholar": "ok" },
  "total": 5,
  "papers": [
    {
      "id": "arxiv:1912.09363",
      "source": "arxiv",
      "title": "Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting",
      "authors": ["Bryan Lim", "Sercan Arik"],
      "published": "2019-12-19",
      "abstract": "...",
      "url": "https://arxiv.org/abs/1912.09363",
      "citations": 1423
    }
  ]
}
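A script or agent consuming this output would typically check `source_status` first, then rank the papers. This is a minimal sketch, assuming any status other than `"ok"` means the source was unavailable (the README only shows the `"ok"` value); both helper names are hypothetical.

```javascript
// Lists sources whose status is anything other than "ok".
// ASSUMPTION: non-"ok" values signal an unavailable source.
function unavailableSources(result) {
  return Object.entries(result.source_status || {})
    .filter(([, status]) => status !== "ok")
    .map(([name]) => name);
}

// One-line reference per paper, most cited first. Does not mutate the input.
function formatPapers(result) {
  return [...(result.papers || [])]
    .sort((a, b) => (b.citations || 0) - (a.citations || 0))
    .map((p) => {
      const year = (p.published || "").slice(0, 4) || "n.d.";
      return `${(p.authors || []).join(", ")} (${year}). ${p.title}. ${p.url}`;
    });
}

// Usage with a small stand-in result object (placeholder values).
const result = {
  source_status: { arxiv: "ok", semanticscholar: "ok" },
  papers: [
    {
      title: "Example paper",
      authors: ["A. Author"],
      published: "2023-06-08",
      url: "https://example.org/paper",
      citations: 12,
    },
  ],
};
console.log(unavailableSources(result)); // []
console.log(formatPapers(result));
```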

Self-host

git clone https://github.com/sulmatajb/paperplain
cd paperplain/mcp
npm install
node server.js

License

MIT — do whatever you want with it.
