arxiv-mcp-server v0.1.5

An MCP server for searching and fetching papers from arXiv

Downloads: 926

arXiv MCP Server

License: MIT Python 3.11+ MCP

I built this MCP server to access 2.4M+ arXiv papers directly in Claude Desktop. It uses GROBID for academic PDF extraction and builds citation networks to track research connections.

What It Does

  • Search arXiv by keywords, authors, categories, and dates
  • Extract full text from PDFs using GROBID (handles equations and references)
  • Build citation networks using Semantic Scholar integration
  • Manage a local library with collections and tags
  • Generate summaries and compare papers side-by-side
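
To make the search feature concrete, here is a minimal sketch of how keyword, author, and category filters can be combined into one request against the public arXiv API. The function name `build_query_url` and its parameters are illustrative assumptions, not this server's actual code; only the endpoint and the `all:`, `au:`, and `cat:` field prefixes come from the arXiv API itself. Date filtering is left out to keep the sketch short.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(keywords=None, author=None, category=None, max_results=10):
    """Combine optional filters into one arXiv API query URL (hypothetical helper)."""
    clauses = []
    if keywords:
        clauses.append(f'all:"{keywords}"')   # search all fields
    if author:
        clauses.append(f'au:"{author}"')      # author field
    if category:
        clauses.append(f"cat:{category}")     # subject category, e.g. cs.CL
    if not clauses:
        raise ValueError("at least one search filter is required")

    params = {
        "search_query": " AND ".join(clauses),
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",      # newest submissions first
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Calling `build_query_url(keywords="large language models", category="cs.CL", max_results=5)` yields a URL whose Atom response can then be parsed for titles, authors, and IDs.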

PDF Extraction

I implemented three extraction tiers that adapt to document complexity:

  • FAST: pdfplumber for simple documents (~1s)
  • SMART: GROBID for academic papers (~5s) - preserves equations and references
  • PREMIUM: Mistral OCR for complex layouts (~2s) - requires API key
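
The tier choice can be pictured as a small decision function. This is only a sketch under stated assumptions: the inputs (`is_academic`, `has_complex_layout`, `grobid_available`) and the order of the checks are hypothetical, and the real server may weigh document complexity differently.

```python
from enum import Enum


class Tier(Enum):
    FAST = "pdfplumber"       # simple documents
    SMART = "grobid"          # academic papers
    PREMIUM = "mistral-ocr"   # complex layouts, needs an API key


def choose_tier(is_academic, has_complex_layout, grobid_available, mistral_api_key=None):
    """Pick an extraction tier from a few document signals (assumed heuristics)."""
    # PREMIUM is only an option when a Mistral API key is configured.
    if has_complex_layout and mistral_api_key:
        return Tier.PREMIUM
    # SMART needs a reachable GROBID server.
    if is_academic and grobid_available:
        return Tier.SMART
    # FAST has no external requirement, so it doubles as the fallback.
    return Tier.FAST
```

For example, `choose_tier(True, False, grobid_available=True)` returns `Tier.SMART`, while the same paper with GROBID offline falls back to `Tier.FAST`.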

🚀 Quick Start

Installation

Option 1: Install via npm (Recommended)

# Install globally
npm install -g arxiv-mcp-server

# Or install locally in a project
npm install arxiv-mcp-server

Option 2: Install from source

# Clone the repository
git clone https://github.com/r-uben/arxiv-mcp-server.git
cd arxiv-mcp-server

# Install dependencies with Poetry
poetry install

# Test the server
poetry run arxiv-mcp-server

Claude Desktop Integration

For a local npm installation, add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "arxiv": {
      "command": "npx",
      "args": ["arxiv-mcp-server"],
      "cwd": "/path/to/your/project"
    }
  }
}

For a global npm installation:

{
  "mcpServers": {
    "arxiv": {
      "command": "arxiv-mcp-server"
    }
  }
}

For Poetry installation:

{
  "mcpServers": {
    "arxiv": {
      "command": "poetry",
      "args": ["run", "arxiv-mcp-server"],
      "cwd": "/path/to/arxiv-mcp-server"
    }
  }
}

Restart Claude Desktop and you're ready to go!
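
If claude_desktop_config.json already lists other MCP servers, pasting a whole block over it would drop them. The sketch below merges a single entry instead. The helper `add_mcp_server` is hypothetical and not part of this package, and the config path is passed in by the caller because its location depends on your operating system.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args=None, cwd=None):
    """Add or update one entry under "mcpServers", keeping all other entries."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    entry = {"command": command}
    if args:
        entry["args"] = list(args)
    if cwd:
        entry["cwd"] = cwd

    # setdefault creates "mcpServers" only when it is missing.
    config.setdefault("mcpServers", {})[name] = entry
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

A call such as `add_mcp_server(path, "arxiv", "npx", args=["arxiv-mcp-server"], cwd="/path/to/your/project")` reproduces the local npm configuration shown above.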

Examples

"Search for recent papers on large language models in the last 6 months"
"Find all papers by Geoffrey Hinton on deep learning"
"Build a citation network around paper 2301.00001"
"Save paper 2301.00001 to my 'Transformers' collection"
"Summarize the key findings from paper 2301.00001"

⚙️ Configuration

API Keys (Optional)

For enhanced features, set these environment variables:

# For premium PDF extraction (Mistral OCR)
export MISTRAL_API_KEY="your-mistral-api-key"

# For faster citation lookups (Semantic Scholar)
export SEMANTIC_SCHOLAR_API_KEY="your-semantic-scholar-api-key"

External Services (Optional)

GROBID server - used for enhanced academic paper processing. To run one locally on port 8070 with Docker:

docker run --rm -it --init -p 8070:8070 lfoppiano/grobid:0.8.0
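
Once the container is up, a quick way to confirm it is reachable is GROBID's `/api/isalive` endpoint, which answers `true` when the service is ready. The helper name `grobid_is_alive` and the timeout value are assumptions for this sketch; how this server itself probes GROBID is not documented here.

```python
import http.client
import urllib.request


def grobid_is_alive(base_url="http://localhost:8070", timeout=2.0):
    """Return True if a GROBID server answers its health check at base_url."""
    url = base_url.rstrip("/") + "/api/isalive"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip().lower() == b"true"
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or malformed response:
        # treat the server as unavailable.
        return False
```

A `False` result is the signal to use a tier that does not depend on GROBID.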

Configuration Options

| Variable | Purpose | Default |
|----------|---------|---------|
| MISTRAL_API_KEY | Premium OCR extraction | None |
| SEMANTIC_SCHOLAR_API_KEY | Citation discovery API | None |
| GROBID_SERVER | GROBID server URL | http://localhost:8070 |
| FORCE_SMART | Always use SMART tier for academic papers | true |
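
As an illustration of how these variables and defaults fit together, the sketch below reads them into one settings object. The `Settings` class, the `load_settings` function, and the set of strings accepted as "true" for FORCE_SMART are assumptions; the server's own parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    mistral_api_key: Optional[str]
    semantic_scholar_api_key: Optional[str]
    grobid_server: str
    force_smart: bool


def load_settings(env=None):
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}  # assumed spellings of "true"
    return Settings(
        # Empty strings are treated the same as unset keys.
        mistral_api_key=env.get("MISTRAL_API_KEY") or None,
        semantic_scholar_api_key=env.get("SEMANTIC_SCHOLAR_API_KEY") or None,
        grobid_server=env.get("GROBID_SERVER", "http://localhost:8070").rstrip("/"),
        force_smart=env.get("FORCE_SMART", "true").strip().lower() in truthy,
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check without touching the real environment.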

Available Tools

I've implemented 25 tools across four categories:

  • Search & Discovery: search papers, find by author, get recent papers, find similar papers
  • Library Management: save papers, manage collections, track reading status, search library
  • Citation Analysis: extract references, find citing papers, build citation networks
  • Content Analysis: extract PDFs, summarize papers, compare papers, extract key findings

How It Works

The server automatically:

  1. Analyzes PDF complexity and selects the best extraction method
  2. Caches papers locally to reduce API calls
  3. Respects rate limits (arXiv: 3 req/s, Semantic Scholar: 1-4 req/s)
  4. Falls back gracefully when services are unavailable
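
Two of these behaviours, rate limiting and graceful fallback, are easy to show in isolation. The following is a sketch rather than the server's implementation: `RateLimiter` and `first_successful` are hypothetical names, and the limiter simply spaces calls evenly instead of allowing bursts.

```python
import time


class RateLimiter:
    """Allow at most `rate` calls per second by spacing calls evenly."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # first call never waits

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now += delay
        self._next_allowed = now + self.min_interval
        return max(delay, 0.0)


def first_successful(steps, *args):
    """Try (name, function) pairs in order; return the first result that works."""
    errors = []
    for name, fn in steps:
        try:
            return name, fn(*args)
        except Exception as exc:  # any failure moves on to the next step
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all steps failed: " + "; ".join(errors))
```

With the limits listed above, an arXiv client would create `RateLimiter(3)` and call `wait()` before each request, while `first_successful` could be given extractors ordered from most to least capable so that an unavailable service does not abort the whole request.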

Development

# Development setup
poetry install
poetry run pytest                    # Run tests
poetry run black .                   # Format code
poetry run ruff check .              # Lint code

# Other useful commands
poetry run python -m pytest tests/   # Full test suite
poetry run arxiv-mcp-server          # Start server manually

arXiv Categories

| Field | Popular Categories |
|-------|-------------------|
| Computer Science | cs.AI, cs.LG, cs.CV, cs.CL, cs.RO |
| Mathematics | math.CO, math.NT, math.AG, math.ST |
| Physics | astro-ph, cond-mat, hep-ph, quant-ph |
| Biology | q-bio.BM, q-bio.CB, q-bio.GN |

Complete arXiv taxonomy →

License

MIT License © 2025 Ruben Fernández-Fuertes