xmo-ocr-mcp-server
v1.0.20
MCP server for OCR document conversion with XMO API
OCR MCP Server
MCP (Model Context Protocol) server for OCR document conversion using the XMO API.
Features
- Convert documents (PDF, Word, PowerPoint) and images to Markdown
- Supports multipart/form-data file upload
- API key authentication via the XMO_APIKEY environment variable
- Support for advanced OCR options (LLM enhancement, page range, etc.)
Installation
Using npx (Recommended)
npx -y xmo-ocr-mcp-server
Install globally
npm install -g xmo-ocr-mcp-server
Install locally
npm install xmo-ocr-mcp-server
Usage
Set up API Key
You must set the XMO_APIKEY environment variable:
export XMO_APIKEY="your-api-key-here"
Run as MCP Server
The server runs on stdio and can be used with any MCP client:
XMO_APIKEY="your-api-key" npx -y xmo-ocr-mcp-server
Configure in Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "ocr": {
      "command": "npx",
      "args": ["-y", "xmo-ocr-mcp-server"],
      "env": {
        "XMO_APIKEY": "your-api-key-here"
      }
    }
  }
}
Available Tools
convert_document
Convert documents or images to Markdown using OCR.
Parameters:
- file_path (required): Absolute path to the local file
- use_llm (optional): Use LLM to enhance accuracy for tables, forms, math (default: false)
- output_format (optional): Output format - 'markdown', 'json', 'html', or 'chunks' (default: 'markdown')
- page_range (optional): Pages to process, e.g., '0,2-5' for pages 0,2,3,4,5
- max_pages (optional): Maximum number of pages to process
- force_ocr (optional): Force OCR on all PDF pages (default: false)
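The page_range syntax can be illustrated with a small sketch. The helper below, expandPageRange, is hypothetical and not part of this package; it assumes ranges are inclusive, as the '0,2-5' example suggests, and it may not match how the XMO API itself validates input.

```javascript
// Hypothetical helper (not part of xmo-ocr-mcp-server): expands a
// page_range string such as "0,2-5" into the page numbers it names.
// Assumes ranges are inclusive, following the README's own example.
function expandPageRange(pageRange) {
  const pages = [];
  for (const part of pageRange.split(",")) {
    const piece = part.trim();
    // Accept either a single page ("3") or a start-end pair ("2-5").
    const match = /^(\d+)(?:-(\d+))?$/.exec(piece);
    if (!match) {
      throw new Error(`Invalid page_range segment: "${piece}"`);
    }
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (end < start) {
      throw new Error(`Descending range in page_range: "${piece}"`);
    }
    for (let page = start; page <= end; page++) {
      pages.push(page);
    }
  }
  return pages;
}

console.log(expandPageRange("0,2-5")); // [ 0, 2, 3, 4, 5 ]
```

Running a check like this on the client side can catch a malformed page_range before a file is uploaded.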
Example:
{
  "file_path": "/path/to/document.pdf",
  "use_llm": true,
  "page_range": "0-5"
}
Supported File Formats
- Documents: PDF, DOC, DOCX, PPT, PPTX
- Images: PNG, JPG, JPEG, WEBP
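As a rough sketch of how a client might use the format list above, the code below checks a file's extension and then builds an MCP tools/call request for convert_document. The helper names isSupportedFile and buildConvertRequest are hypothetical, and matching extensions case-insensitively is an assumption; the server may apply its own checks.

```javascript
// Extensions taken from the "Supported File Formats" list.
const SUPPORTED_EXTENSIONS = new Set([
  "pdf", "doc", "docx", "ppt", "pptx", // documents
  "png", "jpg", "jpeg", "webp",        // images
]);

// Hypothetical helper: true if the path ends in a supported extension.
// Case-insensitive matching is an assumption, not documented behavior.
function isSupportedFile(filePath) {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return false;
  return SUPPORTED_EXTENSIONS.has(filePath.slice(dot + 1).toLowerCase());
}

// Hypothetical helper: builds the JSON-RPC message an MCP client sends
// to invoke the convert_document tool with the given arguments.
function buildConvertRequest(id, args) {
  if (!isSupportedFile(args.file_path)) {
    throw new Error(`Unsupported file type: ${args.file_path}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "convert_document", arguments: args },
  };
}

const request = buildConvertRequest(1, {
  file_path: "/path/to/document.pdf",
  use_llm: true,
  page_range: "0-5",
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would normally construct this message for you; the sketch only shows the shape of the call.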
Development
Install dependencies
npm install
Run locally
export XMO_APIKEY="your-api-key"
node index.js
Test the package
export XMO_APIKEY="your-api-key"
npm test
Environment Variables
- XMO_APIKEY (required): Your XMO API key for authentication
- OCR_API_URL (optional): Custom OCR API endpoint (default: http://ocr.xm-opt.com/api/ocr/convert/sync)
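The way these two variables combine can be sketched as follows. This is a minimal illustration, not the package's actual code: the loadConfig function and the shape of its return value are assumptions, while the variable names and the default URL come from the list above.

```javascript
// Default endpoint as documented for OCR_API_URL.
const DEFAULT_OCR_API_URL = "http://ocr.xm-opt.com/api/ocr/convert/sync";

// Hypothetical helper: reads the settings from an environment object.
// Fails fast when the required key is missing and falls back to the
// documented default when no custom endpoint is set.
function loadConfig(env = process.env) {
  const apiKey = env.XMO_APIKEY;
  if (!apiKey) {
    throw new Error("XMO_APIKEY environment variable is required");
  }
  return {
    apiKey,
    apiUrl: env.OCR_API_URL || DEFAULT_OCR_API_URL,
  };
}

// Example with an explicit environment object instead of process.env.
const config = loadConfig({ XMO_APIKEY: "your-api-key" });
console.log(config.apiUrl);
```

Passing the environment as a parameter keeps the function easy to test without modifying process.env.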
Publishing to npm
Before publishing:
- Ensure you're logged in to npm:
  npm login
- Update version in package.json:
  npm version patch|minor|major
- Publish to npm:
  npm publish --access public
Note: Make sure the package name xmo-ocr-mcp-server is available on npm before publishing.
License
MIT
