xmo-ocr-mcp-server
v1.0.14
MCP server for OCR document conversion with XMO API
OCR MCP Server
MCP (Model Context Protocol) server for OCR document conversion using the XMO API.
Features
- Convert documents (PDF, Word, PowerPoint) and images to Markdown
- Supports multipart/form-data file upload
- API key authentication via the XMO_APIKEY environment variable
- Support for advanced OCR options (LLM enhancement, page range, etc.)
Installation
Using npx (Recommended)
npx -y xmo-ocr-mcp-server
Install globally
npm install -g xmo-ocr-mcp-server
Install locally
npm install xmo-ocr-mcp-server
Usage
Set up API Key
You must set the XMO_APIKEY environment variable:
export XMO_APIKEY="your-api-key-here"
Run as MCP Server
The server runs on stdio and can be used with any MCP client:
XMO_APIKEY="your-api-key" npx -y xmo-ocr-mcp-server
Configure in Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"ocr": {
"command": "npx",
"args": ["-y", "xmo-ocr-mcp-server"],
"env": {
"XMO_APIKEY": "your-api-key-here"
}
}
}
}
Available Tools
convert_document
Convert documents or images to Markdown using OCR.
Parameters:
- file_path (required): Absolute path to the local file
- use_llm (optional): Use LLM to enhance accuracy for tables, forms, math (default: false)
- output_format (optional): Output format, one of 'markdown', 'json', 'html', or 'chunks' (default: 'markdown')
- page_range (optional): Pages to process, e.g., '0,2-5' for pages 0, 2, 3, 4, 5
- max_pages (optional): Maximum number of pages to process
- force_ocr (optional): Force OCR on all PDF pages (default: false)
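The parameters above can be checked on the client side before a request is sent. The following is a minimal sketch, not the server's own code: `expandPageRange` and `buildConvertRequest` are hypothetical helper names, and the validation rules are assumptions drawn from the parameter descriptions. The request shape follows the standard MCP `tools/call` JSON-RPC method.

```javascript
const path = require("node:path");

// Output formats listed in the parameter description.
const OUTPUT_FORMATS = ["markdown", "json", "html", "chunks"];

// Hypothetical helper: expand a page_range string such as "0,2-5"
// into the explicit list of page indices it describes.
function expandPageRange(range) {
  const pages = [];
  for (const part of range.split(",")) {
    const match = /^\s*(\d+)(?:\s*-\s*(\d+))?\s*$/.exec(part);
    if (!match) throw new Error(`Invalid page_range segment: "${part}"`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (end < start) throw new Error(`Descending range: "${part}"`);
    for (let p = start; p <= end; p++) pages.push(p);
  }
  return pages;
}

// Hypothetical helper: validate the arguments and wrap them in a
// JSON-RPC 2.0 "tools/call" request for the convert_document tool.
function buildConvertRequest(id, args) {
  if (typeof args.file_path !== "string" || !path.isAbsolute(args.file_path)) {
    throw new Error("file_path is required and must be an absolute path");
  }
  if (args.output_format !== undefined && !OUTPUT_FORMATS.includes(args.output_format)) {
    throw new Error(`output_format must be one of: ${OUTPUT_FORMATS.join(", ")}`);
  }
  if (args.page_range !== undefined) {
    expandPageRange(args.page_range); // throws if the string is malformed
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "convert_document", arguments: args },
  };
}

console.log(expandPageRange("0,2-5")); // [ 0, 2, 3, 4, 5 ]
```

Most MCP clients build the `tools/call` envelope themselves, so in practice only the argument checks are likely to be useful on their own.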
Example:
{
"file_path": "/path/to/document.pdf",
"use_llm": true,
"page_range": "0-5"
}
Supported File Formats
- Documents: PDF, DOC, DOCX, PPT, PPTX
- Images: PNG, JPG, JPEG, WEBP
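A caller may want to reject unsupported files before invoking the tool. The sketch below checks a path against the formats listed above; `isSupportedFile` is a hypothetical helper, and matching on the file extension case-insensitively is an assumption, since the README does not say how the server detects file types.

```javascript
const path = require("node:path");

// Extensions taken from the Supported File Formats list.
const SUPPORTED_EXTENSIONS = new Set([
  ".pdf", ".doc", ".docx", ".ppt", ".pptx", // documents
  ".png", ".jpg", ".jpeg", ".webp",         // images
]);

// Hypothetical helper: true when the file's extension is in the list.
// Case-insensitive matching is an assumption, not documented behavior.
function isSupportedFile(filePath) {
  return SUPPORTED_EXTENSIONS.has(path.extname(filePath).toLowerCase());
}

console.log(isSupportedFile("/path/to/document.pdf")); // true
console.log(isSupportedFile("/path/to/notes.txt"));    // false
```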
Development
Install dependencies
npm install
Run locally
export XMO_APIKEY="your-api-key"
node index.js
Test the package
export XMO_APIKEY="your-api-key"
npm test
Environment Variables
- XMO_APIKEY (required): Your XMO API key for authentication
- OCR_API_URL (optional): Custom OCR API endpoint (default: http://ocr.xm-opt.com/api/ocr/convert/sync)
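As an illustration of how these two variables combine, here is a minimal sketch of resolving them with the documented default. `loadConfig` is a hypothetical name, and the server's actual startup code may differ.

```javascript
// Default endpoint as documented above.
const DEFAULT_OCR_API_URL = "http://ocr.xm-opt.com/api/ocr/convert/sync";

// Hypothetical helper: read the required key and the optional endpoint
// from an environment object (process.env by default).
function loadConfig(env = process.env) {
  const apiKey = env.XMO_APIKEY;
  if (!apiKey) {
    throw new Error("XMO_APIKEY environment variable is required");
  }
  return { apiKey, apiUrl: env.OCR_API_URL || DEFAULT_OCR_API_URL };
}
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.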
Publishing to npm
Before publishing:
- Ensure you're logged in to npm:
  npm login
- Update the version in package.json:
  npm version patch|minor|major
- Publish to npm:
  npm publish --access public
Note: Make sure the package name xmo-ocr-mcp-server is available on npm before publishing.
License
MIT
