@sylphlab/tools-pdf

v0.7.1

Core logic for MCP PDF tools (text extraction, etc.)

Core logic for extracting text and potentially other content from PDF documents.

This package provides the underlying logic and tool definitions for processing PDF files, with a primary focus on text extraction. It uses the mupdf library for parsing and defines its tools with @sylphlab/tools-core. It is the foundation for @sylphlab/tools-pdf-mcp.

Purpose

Extracting information from PDF documents is a common requirement for data processing pipelines, RAG systems, and AI agents. This package offers standardized tools for accessing PDF content, defined using @sylphlab/tools-core for consistency and reusability across different platforms.

Tools Provided

  • getTextTool (or similar): Extracts text content from PDF files.
    • Can extract text from the entire document.
    • Can extract text from specific pages or page ranges.
    • Can optionally retrieve PDF metadata (author, title, etc.).
    • Can optionally retrieve the total page count.
  • (Potential Future Tools):
    • Conversion to Markdown or other formats.
    • Image extraction.
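As a rough illustration of the "specific pages or page ranges" option, the sketch below expands a page selection into a concrete list of 1-based page numbers. The `PageSelection` type, the `{ start, end }` range form, and the `expandPages` helper are hypothetical names made up for this example; the package's actual input schema may represent ranges differently.

```typescript
// Hypothetical types: the usage example in this README only shows `pages` as a
// list of numbers. The { start, end } range form is an assumption.
type PageRange = { start: number; end: number };
type PageSelection = number[] | PageRange;

// Expand a selection into a sorted, de-duplicated list of 1-based page numbers,
// dropping anything outside 1..pageCount.
function expandPages(selection: PageSelection | undefined, pageCount: number): number[] {
  let pages: number[] = [];
  if (selection === undefined) {
    // No selection: treat it as the whole document.
    pages = Array.from({ length: pageCount }, (_, i) => i + 1);
  } else if (Array.isArray(selection)) {
    pages = selection;
  } else {
    // Clamp the range first so a huge `end` does not cause a long loop.
    const start = Math.max(1, selection.start);
    const end = Math.min(pageCount, selection.end);
    for (let p = start; p <= end; p++) pages.push(p);
  }
  const valid = pages.filter((p) => Number.isInteger(p) && p >= 1 && p <= pageCount);
  return Array.from(new Set(valid)).sort((a, b) => a - b);
}
```

Normalizing the selection up front keeps the extraction loop simple: it only ever sees valid page numbers in ascending order.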

Key Features

  • Text Extraction: Provides flexible options for retrieving text content.
  • Metadata & Page Count: Allows fetching document properties.
  • Efficient Parsing: Uses the mupdf library, known for its performance and accuracy.
  • Standardized Definition: Uses the SylphTool structure from @sylphlab/tools-core.
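To make the "standardized definition" idea more concrete, here is a minimal sketch of a tool as a bundle of an input schema, an output schema, and an async handler. `SchemaLike`, `ToolLike`, `runTool`, and `doubleTool` are stand-ins invented for this example; they are not the actual SylphTool types from @sylphlab/tools-core. The only details taken from this README are the `inputSchema`, `outputSchema`, and `handler` property names.

```typescript
// Hypothetical stand-in for a schema: anything with a `parse` method that
// returns the typed value or throws (zod schemas have this shape).
interface SchemaLike<T> {
  parse(value: unknown): T;
}

// Hypothetical stand-in for a tool definition.
interface ToolLike<I, O> {
  inputSchema: SchemaLike<I>;
  outputSchema: SchemaLike<O>;
  handler(input: I): Promise<O>;
}

// Validate the raw input, run the handler, then validate what it returned.
async function runTool<I, O>(tool: ToolLike<I, O>, rawInput: unknown): Promise<O> {
  const input = tool.inputSchema.parse(rawInput);
  const output = await tool.handler(input);
  return tool.outputSchema.parse(output);
}

// A toy tool with hand-written schemas, used only to exercise runTool.
const doubleTool: ToolLike<{ n: number }, { doubled: number }> = {
  inputSchema: {
    parse(value) {
      const n = (value as { n?: unknown } | null)?.n;
      if (typeof n !== "number") throw new Error("expected { n: number }");
      return { n };
    },
  },
  outputSchema: {
    parse(value) {
      const doubled = (value as { doubled?: unknown } | null)?.doubled;
      if (typeof doubled !== "number") throw new Error("expected { doubled: number }");
      return { doubled };
    },
  },
  async handler({ n }) {
    return { doubled: n * 2 };
  },
};
```

Keeping validation outside the handler means an adapter can wrap any tool the same way, regardless of what the handler does internally.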

Installation

This package is primarily intended for internal use within the mcp monorepo, serving as a dependency for @sylphlab/tools-pdf-mcp and potentially other packages needing direct PDF processing logic.

# From the root of the monorepo
pnpm add @sylphlab/tools-pdf --filter <your-package-name>

Usage (Conceptual)

The tool definitions are typically consumed by adapters or MCP server implementations.

import { getTextTool } from '@sylphlab/tools-pdf';
import { adaptToolToMcp } from '@sylphlab/tools-adaptor-mcp'; // Example adapter

// Example: Using the tool definition directly
async function runGetText() {
  const input = {
    items: [
      { filePath: './path/to/document.pdf', pages: [1, 3] } // Extract pages 1 and 3
    ],
    includeMetadata: true
  };
  // Validate input against getTextTool.inputSchema...
  const output = await getTextTool.handler(input);
  // Validate output against getTextTool.outputSchema...
  const first = output.results?.[0];
  if (first?.success) {
    // Look up the extracted text for a given page number
    const textOf = (page: number) => first.data.page_texts.find(p => p.page === page)?.text;
    console.log('Metadata:', first.data.metadata);
    console.log('Page 1 Text:', textOf(1));
    console.log('Page 3 Text:', textOf(3));
  }
}

// Example: Adapting for MCP
const mcpPdfTool = adaptToolToMcp(getTextTool);

// This adapted definition would then be used to create the MCP server.
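Because the input takes a list of items, a caller will usually want to handle each result separately and keep failures apart from successes. The sketch below does that for a result shape inferred from the usage example (`success`, `data.metadata`, `data.page_texts`). The `error` field on failed items and the `collectPageTexts` helper are assumptions made for illustration, not part of the documented output.

```typescript
// Shapes inferred from the usage example; `error` is an assumed field.
type PageText = { page: number; text: string };
type ItemResult = {
  success: boolean;
  data?: { metadata?: Record<string, unknown>; page_texts: PageText[] };
  error?: string;
};

// For each successful item, index its text by page number.
// Failed items contribute an error message instead.
function collectPageTexts(results: ItemResult[]): {
  texts: Map<number, string>[];
  errors: string[];
} {
  const texts: Map<number, string>[] = [];
  const errors: string[] = [];
  for (const result of results) {
    if (result.success && result.data) {
      const entries = result.data.page_texts.map((p) => [p.page, p.text] as [number, string]);
      texts.push(new Map(entries));
    } else {
      errors.push(result.error ?? "unknown error");
    }
  }
  return { texts, errors };
}
```

With a helper like this, `texts[0].get(3)` would return the text of page 3 for the first successful item, or `undefined` if that page was not requested.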

Dependencies

  • @sylphlab/tools-core: Provides defineTool and core types.
  • zod: For input/output schema definition and validation.
  • mupdf: The core library used for parsing PDF documents.

Developed by Sylph Lab.