@kennyfrc/jina-reader

v1.0.1

Jina Reader MCP

A Model Context Protocol (MCP) server for integrating with Jina Reader API, allowing Claude and other AI assistants to read and extract content from webpages, HTML, and PDF files.

Features

  • Read content from URLs
  • Process HTML content directly
  • Extract information from PDF files (base64 encoded)
  • Format responses with links and metadata

Setup

  1. Clone this repository
  2. Install dependencies:
    npm install
  3. Build the project:
    npm run build

Usage

Using with npx

The easiest way to use this package is with npx. You need to provide your Jina API key as an environment variable:

# Set the API key directly in the command
JINA_API_KEY=your_jina_api_key npx -y @kennyfrc/jina-reader

Starting the server locally

If you've cloned the repository:

# Provide the API key when running
JINA_API_KEY=your_jina_api_key npm start

Integrating with Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "jina-reader": {
      "command": "npx",
      "args": ["-y", "@kennyfrc/jina-reader"],
      "env": {
        "JINA_API_KEY": "your_jina_api_key_here"
      }
    }
  }
}

Or if you've cloned the repository:

{
  "mcpServers": {
    "jina-reader": {
      "command": "node",
      "args": ["/absolute/path/to/your/dist/index.js"],
      "env": {
        "JINA_API_KEY": "your_jina_api_key_here"
      }
    }
  }
}

The config file is located at:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

Restart Claude Desktop to load your MCP server.
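
For scripted setups, the per-platform locations listed above can be resolved with a small helper. This is a minimal sketch and not part of this package: claudeConfigPath and its parameters are hypothetical names, and the home and %APPDATA% directories are passed in explicitly rather than read from the environment.

```typescript
// Hypothetical helper (not part of @kennyfrc/jina-reader): maps a platform
// name to the config file locations listed in this README.
type Platform = "darwin" | "win32" | "linux";

function claudeConfigPath(platform: Platform, homeDir: string, appData?: string): string {
  const file = "claude_desktop_config.json";
  switch (platform) {
    case "darwin":
      return `${homeDir}/Library/Application Support/Claude/${file}`;
    case "win32":
      // %APPDATA% has no sensible fallback, so require it explicitly.
      if (!appData) throw new Error("appData must be provided on Windows");
      return `${appData}\\Claude\\${file}`;
    case "linux":
      return `${homeDir}/.config/Claude/${file}`;
  }
}
```

In Node, the arguments would typically come from process.platform, os.homedir(), and process.env.APPDATA.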

Available Tools

Each tool accepts the following optional parameters:

Common Parameters

  • engine: Control the rendering engine:

    • none: Default rendering engine (good balance of quality and speed)
    • direct: Speed-optimized rendering
    • browser: Quality-optimized rendering
    • cf-browser-rendering: Experimental rendering
  • max_length: Maximum number of characters to return (default is 20000)

    • Important: This parameter is essential for controlling token usage, especially with large documents like research papers
    • By default, content is truncated to 20,000 characters to prevent excessive token consumption
    • The response includes pagination information when content is clipped
    • Pagination details show the current character range and guidance for subsequent requests
  • start_index: Character index at which the returned content starts (default is 0)

    • Essential for paginating through large documents
    • Works with the pagination info provided in clipped responses
    • For example, to get the next page of content after seeing "Currently showing characters 0-20000", use start_index=20000
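
The common parameters above can be modeled as a small TypeScript type. The sketch below fills in the documented defaults (engine none, max_length 20000, start_index 0). The type and function names are hypothetical, and the range checks are an assumption about sensible input, not confirmed server behavior.

```typescript
// Engine names and default values come from the parameter list above.
// CommonParams and withDefaults are hypothetical names for illustration.
type Engine = "none" | "direct" | "browser" | "cf-browser-rendering";

interface CommonParams {
  engine?: Engine;
  max_length?: number;
  start_index?: number;
}

function withDefaults(params: CommonParams): Required<CommonParams> {
  const resolved: Required<CommonParams> = {
    engine: params.engine ?? "none",
    max_length: params.max_length ?? 20000,
    start_index: params.start_index ?? 0,
  };
  // Assumed validation: the README does not say how the server treats bad values.
  if (!Number.isInteger(resolved.max_length) || resolved.max_length <= 0) {
    throw new RangeError("max_length must be a positive integer");
  }
  if (!Number.isInteger(resolved.start_index) || resolved.start_index < 0) {
    throw new RangeError("start_index must be a non-negative integer");
  }
  return resolved;
}
```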

Tools

  • jina_read_url: Read and extract content from a webpage

    {
      "url": "https://example.com",
      "engine": "browser", // Optional, for best quality
      "max_length": 10000, // Optional, limit output length
      "start_index": 0     // Optional, pagination starting point
    }
  • jina_read_html: Read and extract content from HTML

    {
      "html": "<html>...</html>",
      "engine": "direct",  // Optional, for faster processing
      "max_length": 10000, // Optional, limit output length
      "start_index": 0     // Optional, pagination starting point
    }
  • jina_read_pdf: Read and extract content from a PDF file

    {
      "pdf": "base64_encoded_pdf_content",
      "engine": "none",    // Optional, uses default engine
      "max_length": 10000, // Optional, limit output length
      "start_index": 0     // Optional, pagination starting point
    }
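
An assistant such as Claude normally issues these calls for you. For anyone driving the server from their own MCP client, the sketch below shows how the arguments above might be wrapped in a JSON-RPC tools/call request, the method MCP uses to invoke a tool. buildToolCall is a hypothetical helper, not something this package exports. The // comments in the examples above are annotations only, since strict JSON does not allow comments.

```typescript
type ToolName = "jina_read_url" | "jina_read_html" | "jina_read_pdf";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps tool arguments in an MCP "tools/call" request.
function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Same arguments as the jina_read_url example, minus the comments.
const request = buildToolCall(1, "jina_read_url", {
  url: "https://example.com",
  engine: "browser",
  max_length: 10000,
  start_index: 0,
});
```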

Pagination

When dealing with large documents (like research papers or lengthy articles), content is automatically clipped to max_length characters to control token usage. Clipped responses include clear pagination information:

[PAGINATION INFO]
- Content clipped: Currently showing characters 0-20000 of approximately 78000 total
- To view the next section, use start_index=20000 with the same max_length
- Complete content can be accessed by making multiple paginated requests

To retrieve subsequent pages, make additional requests with updated start_index values:

// First request (page 1)
{
  "url": "https://arxiv.org/html/2401.14196v1",
  "max_length": 20000
  // start_index defaults to 0
}

// Second request (page 2)
{
  "url": "https://arxiv.org/html/2401.14196v1",
  "max_length": 20000,
  "start_index": 20000
}

// Third request (page 3)
{
  "url": "https://arxiv.org/html/2401.14196v1",
  "max_length": 20000,
  "start_index": 40000
}
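
The arithmetic behind these requests is simple enough to sketch. The two helpers below are hypothetical and only model the behavior described here: one lists the start_index values needed to cover a document of known length, the other clips a string the way a paginated response is described. They assume a "character" is one UTF-16 code unit, which may differ from how the server counts.

```typescript
// Hypothetical helper: start_index values needed to read totalChars
// characters in pages of maxLength.
function pageStartIndexes(totalChars: number, maxLength = 20000): number[] {
  if (maxLength <= 0) throw new RangeError("maxLength must be positive");
  const starts: number[] = [];
  for (let start = 0; start < totalChars; start += maxLength) {
    starts.push(start);
  }
  return starts;
}

// Hypothetical model of clipping: returns one page of content, the character
// range it covers, and the start_index of the next page (null on the last page).
function clipPage(content: string, startIndex = 0, maxLength = 20000) {
  const end = Math.min(startIndex + maxLength, content.length);
  return {
    text: content.slice(startIndex, end),
    range: [startIndex, end] as const,
    nextStartIndex: end < content.length ? end : null,
  };
}

// For the 78000-character example above:
// pageStartIndexes(78000) → [0, 20000, 40000, 60000]
```

Under these assumptions, the document in the example needs a fourth request with start_index 60000 to reach the end.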

Caching

This MCP server implements two levels of caching:

  1. Local Memory Cache: Responses are cached in memory for 3 hours to improve performance and reduce API calls
    • Optimized for documentation content which changes infrequently
    • Significantly reduces API usage for repeated queries
  2. Jina API Server Cache: The Jina API also caches responses using the X-Cache header

The caching system helps to:

  • Improve response times, especially for repeated requests
  • Reduce API usage
  • Minimize costs when working with large documentation
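
To illustrate the local memory cache, here is a minimal sketch of a time-to-live cache with a 3-hour expiry. It is not the server's actual implementation: TtlCache, its lazy eviction on read, and the injectable clock are assumptions made for the example.

```typescript
// Hypothetical in-memory cache with a fixed time-to-live per entry.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// 3 hours, matching the retention period stated above.
const THREE_HOURS_MS = 3 * 60 * 60 * 1000;
```

A cache like this would likely be keyed on the request URL together with the engine, so that a different engine does not return a stale result. That keying scheme is also an assumption.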

Development

For development with automatic rebuilding:

npm run dev

For publishing instructions, see the docs/publishing.md file.