pdf-to-markdown-mcp
v1.0.9
MCP server that converts PDF pages to markdown using the Qwen VL model
PDF to Markdown MCP Server
A Model Context Protocol (MCP) server that converts PDF pages to markdown format using the Qwen VL vision model.
Features
- Convert PDF to Markdown: Extract text, tables, and document structure from any PDF page.
- Vision-Powered Accuracy: Uses the Qwen VL vision model to extract content, such as tables and layout, that regular text parsers often miss.
- Easy Integration: Works with any MCP client, such as Claude Desktop.
Requirements
- Node.js: Version 18 or higher.
- Qwen VL API Access: An API key and access to a Qwen VL endpoint.
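For orientation, an endpoint of this kind typically accepts an OpenAI-style chat completions request in which the rendered page is attached as an image. The sketch below only builds such a request. The helper names (`buildVisionRequest`, `buildHeaders`), the prompt wording, and the assumption that the page is sent as a base64 PNG data URL are illustrative and are not taken from this server's source.

```typescript
// Hypothetical request builders for an OpenAI-compatible vision endpoint.
// Names and prompt wording are illustrative assumptions, not this package's code.

interface TextPart { type: "text"; text: string; }
interface ImagePart { type: "image_url"; image_url: { url: string }; }

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: Array<TextPart | ImagePart> }[];
}

// Build the JSON body: one user message holding the instruction and the page image.
function buildVisionRequest(model: string, pagePngBase64: string): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Convert this PDF page to markdown." },
          { type: "image_url", image_url: { url: `data:image/png;base64,${pagePngBase64}` } },
        ],
      },
    ],
  };
}

// Build the HTTP headers; the API key is sent as a bearer token.
function buildHeaders(apiKey: string): Record<string, string> {
  return {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  };
}
```

Posting `JSON.stringify(buildVisionRequest(...))` with these headers to the configured endpoint (for example with the global `fetch` available in Node.js 18+) is one way to check that an API key and model name are accepted before wiring up the server.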
Installation & Configuration
1. Install Dependencies
npm install
npm run build
2. Environment Variables
The server needs the following environment variables:
- QWEN_API_URL: The endpoint URL (e.g., https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions).
- QWEN_API_KEY: Your authentication key.
- QWEN_MODEL: The specific model name (defaults to Qwen3-VL-235B-A22B-Instruct).
- WORKSPACE (optional): Comma-separated list of absolute directory paths. When set, the server only processes PDF files located within these directories, preventing access to files outside the allowed workspace.
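As a rough illustration of how these variables could be interpreted, the sketch below parses them from an environment-like object and applies the WORKSPACE restriction to a candidate path. The `ServerConfig` shape and the `loadConfig` and `isPathAllowed` helpers are hypothetical names chosen for this example; the server's actual implementation may differ, for instance in how it treats symlinks.

```typescript
import * as path from "node:path";

// Hypothetical config shape; field names are illustrative.
interface ServerConfig {
  apiUrl: string;
  apiKey: string;
  model: string;
  workspaceDirs: string[]; // empty means "no restriction"
}

const DEFAULT_MODEL = "Qwen3-VL-235B-A22B-Instruct";

// Read the documented variables from an environment-like object.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiUrl = env.QWEN_API_URL;
  const apiKey = env.QWEN_API_KEY;
  if (!apiUrl || !apiKey) {
    throw new Error("QWEN_API_URL and QWEN_API_KEY must be set");
  }
  // WORKSPACE is a comma-separated list; blank entries are ignored.
  const workspaceDirs = (env.WORKSPACE ?? "")
    .split(",")
    .map((dir) => dir.trim())
    .filter((dir) => dir.length > 0);
  return { apiUrl, apiKey, model: env.QWEN_MODEL || DEFAULT_MODEL, workspaceDirs };
}

// True when pdfPath lies inside one of the allowed directories,
// or when no workspace restriction is configured.
// Note: this sketch compares paths lexically and does not resolve symlinks.
function isPathAllowed(pdfPath: string, workspaceDirs: string[]): boolean {
  if (workspaceDirs.length === 0) return true;
  const target = path.resolve(pdfPath);
  return workspaceDirs.some((dir) => {
    const rel = path.relative(path.resolve(dir), target);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep);
    return rel !== "" && !escapes && !path.isAbsolute(rel);
  });
}
```

Calling `loadConfig(process.env)` at startup would fail fast on a missing URL or key. Comparing via `path.relative` rather than a plain string prefix keeps a sibling directory such as `/data/pdfs-evil` from matching an allowed `/data/pdfs`.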
3. Setup with Claude Desktop
Add this to your Claude Desktop configuration file:
{
  "mcpServers": {
    "pdf-to-markdown": {
      "command": "npx",
      "args": [
        "-y",
        "pdf-to-markdown-mcp"
      ],
      "env": {
        "QWEN_API_URL": "https://your-qwen-api-endpoint.com/v1/chat/completions",
        "QWEN_API_KEY": "your-api-key-here",
        "QWEN_MODEL": "Qwen3-VL-235B-A22B-Instruct"
      }
    }
  }
}
Usage
Once connected, you can use the convert_pdf_page_to_markdown tool.
Tool: convert_pdf_page_to_markdown
Converts a specific page from a PDF file to markdown.
Arguments:
- pdf_path (string): Absolute path to the PDF file on your computer.
- page_number (number): The page number you want to convert (starting from 1).
Example Prompt:
"Please convert page 5 of C:\Documents\Report.pdf to markdown for me."
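Behind a prompt like this, an MCP client sends a JSON-RPC `tools/call` request carrying the two arguments. The sketch below shows roughly what that request looks like. `buildConvertCall` is a hypothetical helper, and the page-number check is a client-side illustration rather than a description of the server's own validation.

```typescript
// JSON-RPC 2.0 request an MCP client sends to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: { pdf_path: string; page_number: number };
  };
}

// Build the request; rejects page numbers that are not integers >= 1.
function buildConvertCall(id: number, pdfPath: string, pageNumber: number): ToolCallRequest {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("page_number must be an integer starting from 1");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "convert_pdf_page_to_markdown",
      arguments: { pdf_path: pdfPath, page_number: pageNumber },
    },
  };
}

// The example prompt would translate to roughly this call.
const request = buildConvertCall(1, "C:\\Documents\\Report.pdf", 5);
```

In normal use the client builds this request for you, so the shape mainly matters when testing the server directly or reading its logs.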
System Dependencies
Depending on your OS, you may need additional libraries for PDF rendering:
- Windows: No additional steps required.
- macOS:
brew install pkg-config cairo pango libpng jpeg giflib librsvg
- Linux (Ubuntu/Debian):
sudo apt-get install build-essential libcairo2-dev libpango1.0-dev libjpeg-dev libgif-dev librsvg2-dev
Troubleshooting
- "PDF file not found": Ensure the path is absolute and the file is accessible.
- "Invalid page number": Check that the page number exists in the document.
- API Errors: Verify your QWEN_API_URL and QWEN_API_KEY.
- Render Failures: If conversion fails on Linux/macOS, ensure the System Dependencies above are installed.
License
Apache 2.0
