trove-mcp
v0.1.9
Unified MCP server for open academic research search, retrieval, and synthesis.
Give Claude the ability to search, read, and synthesize academic research across 250M+ papers using free/open sources.
What it connects to
- OpenAlex (primary metadata, discovery, citations)
- Semantic Scholar (semantic recommendations and similarity)
- arXiv (preprint metadata and full text)
- Unpaywall (OA discovery by DOI)
- PubMed (biomedical indexing)
- Hugging Face Papers API (discovery context endpoints; strict trending uses citation snapshots only)
- CORE (full-text fallback)
Install (Claude Desktop, stdio)
No local install is required with npx.
Quick smoke test:
npx -y trove-mcp@latest sync --queries="graph neural network"
Optional global install:
npm i -g trove-mcp
trove-mcp sync --queries="graph neural network"
Claude Desktop on macOS
- Open Claude Desktop.
- Go to Settings->Developer->Edit Config.
- Edit ~/Library/Application Support/Claude/claude_desktop_config.json.
- Add this server entry under mcpServers:
{
  "mcpServers": {
    "trove": {
      "command": "npx",
      "args": ["-y", "trove-mcp"],
      "env": {
        "TROVE_CONTACT_EMAIL": "[email protected]",
        "TROVE_DB_PATH": "/Users/you/.trove-mcp/trove.db",
        "SEMANTIC_SCHOLAR_API_KEY": "",
        "UNPAYWALL_EMAIL": "[email protected]",
        "CORE_API_KEY": ""
      }
    }
  }
}
- Save the file and fully restart Claude Desktop.
- Start a new chat and use Trove tools.
No API keys are required to start. Missing optional keys produce graceful partial/degraded responses.
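The optional variables in the config control which providers Trove can use reliably. As a minimal sketch, a preflight check over those variables might look like the following. The helper `describeProviderSetup`, its return shape, and the rule that an empty string counts as unset are all assumptions made for illustration; none of them is part of trove-mcp.

```typescript
// Hypothetical preflight helper (not part of trove-mcp). It only restates the
// README's rules for the optional environment variables.
type EnvVars = { [name: string]: string | undefined };

interface ProviderSetup {
  unpaywallEnabled: boolean;     // needs UNPAYWALL_EMAIL or TROVE_CONTACT_EMAIL
  similarPapersEnabled: boolean; // find_similar_papers needs SEMANTIC_SCHOLAR_API_KEY
  coreKeyConfigured: boolean;    // CORE works without a key but may rate-limit
  notes: string[];
}

// Assumption: an empty string (as in the sample config) counts as "not set".
function isSet(value: string | undefined): boolean {
  return value !== undefined && value.trim() !== "";
}

function describeProviderSetup(env: EnvVars): ProviderSetup {
  const unpaywallEnabled =
    isSet(env["UNPAYWALL_EMAIL"]) || isSet(env["TROVE_CONTACT_EMAIL"]);
  const similarPapersEnabled = isSet(env["SEMANTIC_SCHOLAR_API_KEY"]);
  const coreKeyConfigured = isSet(env["CORE_API_KEY"]);

  const notes: string[] = [];
  if (!unpaywallEnabled) {
    notes.push("Unpaywall is skipped: set UNPAYWALL_EMAIL or TROVE_CONTACT_EMAIL.");
  }
  if (!similarPapersEnabled) {
    notes.push("find_similar_papers will fail fast: set SEMANTIC_SCHOLAR_API_KEY.");
  }
  if (!coreKeyConfigured) {
    notes.push("CORE may rate-limit heavily: set CORE_API_KEY.");
  }
  return { unpaywallEnabled, similarPapersEnabled, coreKeyConfigured, notes };
}
```

In a Node script this could be called as `describeProviderSetup(process.env)` before launching the server, printing `notes` so that degraded tools are not a surprise later.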
Provider hardening behavior
- Every tool returns a structured envelope with status, degraded, and warnings.
- Every tool envelope also includes meta.version so clients can verify the running package version.
- Provider outages/rate limits are reported as warnings (not process crashes), and other sources are still used.
- Semantic Scholar is publicly accessible without a key, but unauthenticated traffic is shared and may return 429; Trove reports this as degraded and falls back to OpenAlex where possible.
- Semantic Scholar recommendations are effectively API-key-backed for reliable use. find_similar_papers now fails fast with a clear warning when SEMANTIC_SCHOLAR_API_KEY is not configured.
- Unpaywall is only used when UNPAYWALL_EMAIL (or TROVE_CONTACT_EMAIL) is set.
- CORE works without key for basic access but may rate-limit heavily; set CORE_API_KEY for higher/stabler throughput.
- get_trending_papers is strict fail-closed: only mode = snapshot when non-zero citation velocity evidence exists; otherwise mode = unavailable.
- get_trending_papers self-bootstraps local citation snapshots from OpenAlex, Semantic Scholar, and Hugging Face candidate discovery, then reports snapshot_coverage diagnostics per source.
- get_trending_papers requires local snapshot history to become useful. Official provider APIs do not expose day-level citation history, so first-run mode = unavailable is expected until at least one older snapshot date exists.
- find_similar_papers is strict fail-closed: Semantic Scholar only; no lexical fallback output.
- compare_papers and build_literature_map return explicit insufficiency errors when extraction evidence quality is too low.
- PapersWithCode runtime enrichment is disabled in strict mode.
- get_references is strict fail-closed. Some large institutional arXiv preprints are retrievable as papers but still lack usable reference coverage in both Semantic Scholar and OpenAlex; in those cases Trove returns an explicit warning instead of guessed references. Use get_full_text to inspect inline citations directly when this happens.
- get_author returns mostCitedPaperIds and recentPaperIds as best-effort enrichment. If those follow-up lists cannot be fetched reliably, Trove returns the author profile as partial and does not treat empty arrays as authoritative cache data.
- OpenAlex-heavy search can still be domain-ambiguous for niche queries; search_papers and build_literature_map prefer precision gates, but queries with overloaded terms may still need tighter prompts or filters.
- trace_idea uses heuristic origin candidate ranking; timeline quality is often good, but the earliest canonical paper can still be missed when provider search ranking is imperfect.
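A client that consumes tool results can read the envelope defensively before trusting the payload. The sketch below assumes value types that the README does not spell out (status as a string, degraded as a boolean, warnings as an array); only the field names status, degraded, warnings, and meta.version come from the notes above, and `summarizeEnvelope` is a hypothetical helper.

```typescript
// Assumed envelope shape: only the field names come from the README;
// the value types (string, boolean, array) are assumptions.
interface EnvelopeSummary {
  status: string;
  degraded: boolean;
  warnings: string[];
  version: string | null;
}

function isRecord(value: unknown): value is { [key: string]: unknown } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: extracts the envelope fields without assuming they exist.
function summarizeEnvelope(raw: unknown): EnvelopeSummary {
  if (!isRecord(raw)) {
    throw new Error("tool result is not an object");
  }
  const status = raw["status"];
  const rawWarnings = raw["warnings"];
  const rawMeta = raw["meta"];
  const version = isRecord(rawMeta) ? rawMeta["version"] : undefined;

  const warnings: string[] = [];
  if (Array.isArray(rawWarnings)) {
    for (const w of rawWarnings) {
      // Keep non-string warnings visible instead of dropping them.
      warnings.push(typeof w === "string" ? w : JSON.stringify(w));
    }
  }
  return {
    status: typeof status === "string" ? status : "unknown",
    degraded: raw["degraded"] === true,
    warnings: warnings,
    version: typeof version === "string" ? version : null,
  };
}
```

Checking `degraded` and `warnings` before using a result lets a caller tell a complete answer from one produced while a provider was rate-limited or unavailable.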
HTTP mode (streamable)
TROVE_HTTP_BEARER_TOKEN=change-me npx trove-mcp --transport=http --port=3000
Browser client example:
TROVE_HTTP_BEARER_TOKEN=change-me TROVE_HTTP_CORS_ORIGIN=http://localhost:3000 npx trove-mcp --transport=http --port=3000
- POST /mcp for authenticated MCP requests
- GET /health for health checks
- Auth header: Authorization: Bearer <token>
- CORS exposes Mcp-Session-Id / MCP-Session-Id for browser MCP clients
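For clients that talk to the HTTP endpoint directly, the request can be assembled as below. This is a sketch under assumptions: the JSON-RPC 2.0 body and the Accept header follow the general MCP Streamable HTTP transport rather than anything stated in this README, and `buildMcpRequest` is a hypothetical helper. Only the /mcp path, the Bearer header, and the Mcp-Session-Id header name come from the list above.

```typescript
interface McpHttpRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: builds (but does not send) an authenticated MCP request.
function buildMcpRequest(
  baseUrl: string,
  token: string,
  rpcMethod: string,
  id: number,
  sessionId?: string
): McpHttpRequest {
  const headers: { [name: string]: string } = {
    "Authorization": "Bearer " + token,
    "Content-Type": "application/json",
    // Assumption from the MCP transport spec: accept JSON and SSE responses.
    "Accept": "application/json, text/event-stream",
  };
  if (sessionId !== undefined) {
    // Echo back the session id the server returned on an earlier response.
    headers["Mcp-Session-Id"] = sessionId;
  }
  return {
    url: baseUrl.replace(/\/+$/, "") + "/mcp",
    method: "POST",
    headers: headers,
    body: JSON.stringify({ jsonrpc: "2.0", id: id, method: rpcMethod }),
  };
}
```

The result can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. A real session would normally begin with an MCP `initialize` request before calling methods such as `tools/list`.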
Tools
| Tool | What it does |
|---|---|
| search_papers | Multi-source search + dedupe + deterministic ranking (OpenAlex/S2/arXiv/PubMed/CORE) |
| get_trending_papers | Topic papers ranked by citation velocity with mode = snapshot or unavailable |
| get_paper | Resolve paper by DOI/arXiv/S2/OpenAlex/PubMed/title |
| get_full_text | arXiv -> Unpaywall -> CORE full-text fallback with chunked output |
| get_citations | Papers that cite a target paper |
| get_references | Papers referenced by a target paper |
| find_similar_papers | Semantic Scholar recommendations; reliable use requires SEMANTIC_SCHOLAR_API_KEY |
| get_author | Author profile and impact metrics; paper-list enrichment is best-effort |
| get_institution_output | Institution profile + publication output |
| get_coauthor_network | Collaboration graph around an author |
| build_literature_map | Structured evidence map (claims/methods/limitations/consensus) |
| compare_papers | Structured 2-5 paper comparison |
| trace_idea | Concept lineage across time and influence |
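To make "dedupe + deterministic ranking" in the search_papers row concrete, here is one way such a merge could work. It is an illustrative sketch, not Trove's actual algorithm: the `PaperHit` shape, the DOI-then-title key, and the citation-count ordering are all assumptions.

```typescript
// Hypothetical search hit; field names are assumptions for illustration.
interface PaperHit {
  source: string; // e.g. "openalex", "s2", "arxiv"
  title: string;
  doi?: string;
  citationCount: number;
}

// Prefer the DOI as identity; fall back to a normalized title.
function dedupeKey(hit: PaperHit): string {
  if (hit.doi !== undefined && hit.doi.trim() !== "") {
    return "doi:" + hit.doi.trim().toLowerCase();
  }
  return "title:" + hit.title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

function mergeAndRank(hits: PaperHit[]): PaperHit[] {
  // Keep the hit with the highest citation count for each key.
  const best: { [key: string]: PaperHit } = {};
  for (const hit of hits) {
    const key = dedupeKey(hit);
    const current = best[key];
    if (current === undefined || hit.citationCount > current.citationCount) {
      best[key] = hit;
    }
  }
  const merged: PaperHit[] = [];
  for (const key of Object.keys(best)) {
    const hit = best[key];
    if (hit !== undefined) merged.push(hit);
  }
  // Deterministic order: citations descending, then key ascending for ties.
  merged.sort(function (a, b) {
    if (b.citationCount !== a.citationCount) {
      return b.citationCount - a.citationCount;
    }
    return dedupeKey(a) < dedupeKey(b) ? -1 : 1;
  });
  return merged;
}
```

The explicit tie-break is what makes the output independent of the order in which providers respond, which is the property "deterministic ranking" suggests.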
Resources
- trove://resources/version
- trove://resources/source-capability-matrix
- trove://resources/schema-reference
- trove://resources/cache-health
Prompts
- literature-review-workflow
- paper-comparison-workflow
- idea-lineage-workflow
Sync job for citation snapshots
npx trove-mcp sync
npx trove-mcp sync --queries="agentic ai,graph neural network,causal inference"
Run this on a schedule (e.g. cron) to improve get_trending_papers quality.
Without older local snapshots, first-run trending results will correctly return mode = unavailable.
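The reason an older snapshot is needed becomes clear when citation velocity is written out: it is a difference between two dated counts. The following sketch is not Trove's implementation; the `CitationSnapshot` record, the per-day unit, and the `rankByVelocity` helper are assumptions chosen to illustrate the fail-closed rule.

```typescript
// Hypothetical snapshot record: citation count observed for a paper on a date.
interface CitationSnapshot {
  paperId: string;
  date: string; // ISO date such as "2025-01-01" (sorts correctly as a string)
  citations: number;
}

interface RankedPaper {
  paperId: string;
  citationsPerDay: number;
}

interface TrendingResult {
  mode: "snapshot" | "unavailable";
  papers: RankedPaper[];
}

function rankByVelocity(snapshots: CitationSnapshot[]): TrendingResult {
  // Track the earliest and latest snapshot per paper.
  const byPaper: {
    [paperId: string]: { first: CitationSnapshot; last: CitationSnapshot };
  } = {};
  for (const snap of snapshots) {
    const seen = byPaper[snap.paperId];
    if (seen === undefined) {
      byPaper[snap.paperId] = { first: snap, last: snap };
    } else {
      if (snap.date < seen.first.date) seen.first = snap;
      if (snap.date > seen.last.date) seen.last = snap;
    }
  }

  const papers: RankedPaper[] = [];
  for (const paperId of Object.keys(byPaper)) {
    const entry = byPaper[paperId];
    if (entry === undefined) continue;
    const days =
      (Date.parse(entry.last.date) - Date.parse(entry.first.date)) / 86400000;
    if (days <= 0) continue; // a single snapshot date gives no velocity evidence
    const citationsPerDay = (entry.last.citations - entry.first.citations) / days;
    if (citationsPerDay > 0) {
      papers.push({ paperId: paperId, citationsPerDay: citationsPerDay });
    }
  }
  papers.sort(function (a, b) {
    if (b.citationsPerDay !== a.citationsPerDay) {
      return b.citationsPerDay - a.citationsPerDay;
    }
    return a.paperId < b.paperId ? -1 : 1;
  });

  // Fail closed: without non-zero velocity evidence, report "unavailable".
  if (papers.length === 0) {
    return { mode: "unavailable", papers: [] };
  }
  return { mode: "snapshot", papers: papers };
}
```

With only today's snapshots every paper has `days` equal to zero, so the result is `mode: "unavailable"`. Once a scheduled sync has produced a second date, papers with growing counts start to appear.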
Development
npm install
npm run typecheck
npm test
npm run test:quality
npm run verify:release
npm run build
npm run inspect
Live-contract tests:
LIVE_CONTRACT=1 npm run test:live
Temporarily hide tools from MCP registration (for strict release gating):
TROVE_DISABLED_TOOLS=get_trending_papers,find_similar_papers npx -y trove-mcp@latest
Data and compliance
- No scraping
- No Sci-Hub or paywall bypassing
- All sources are public/open APIs or legal OA links
License
MIT
