@stacktown/file-converter-mcp
v1.0.0
MCP server for high-quality file conversion with PDF-to-Markdown conversion and Universal Project Documentation Standard support
File Converter MCP
A Model Context Protocol (MCP) server that aggregates various file conversion tools for quick formatting and file type transformations.
Features
Supported Conversions
- PDF to Markdown - Convert PDF documents to markdown format
- Image Format Conversion - Transform between common image formats (PNG, JPG, WebP, etc.)
- Document Conversion - Convert between document formats (DOCX, TXT, HTML, etc.)
- Spreadsheet Conversion - Transform spreadsheet formats (CSV, XLSX, JSON, etc.)
- Code Format Conversion - Convert between code formats and apply syntax highlighting
- Archive Operations - Extract and create archive files (ZIP, TAR, etc.)
Conversion Engines
- PDF Engine: marker (recommended) and pymupdf4llm support
- Image Engine: Sharp and ImageMagick integration
- Document Engine: Pandoc integration for broad format support
- Archive Engine: Built-in Node.js compression libraries
Installation
npm install -g file-converter-mcp
Dependencies
Install conversion engines based on your needs:
# PDF conversion engines
pip install marker-pdf pymupdf4llm
# Image processing (choose one)
npm install sharp
# OR
brew install imagemagick # macOS
apt-get install imagemagick # Ubuntu
# Document conversion
brew install pandoc # macOS
apt-get install pandoc # Ubuntu
# Archive tools (usually pre-installed)
# zip, unzip, tar, gzip
Usage
MCP Configuration
Add to your MCP client configuration:
{
  "mcpServers": {
    "file-converter": {
      "command": "file-converter-mcp",
      "args": []
    }
  }
}
Available Tools
PDF Conversion
- convert_pdf_to_markdown - Convert PDF files to Markdown
- extract_pdf_text - Extract plain text from PDF files
- extract_pdf_images - Extract images from PDF files
Image Conversion
- convert_image_format - Convert between image formats
- resize_image - Resize images with quality options
- compress_image - Reduce image file size
Document Conversion
- convert_document - Convert between document formats using Pandoc
- extract_document_text - Extract text from various document formats
- convert_markdown_to_html - Convert Markdown to HTML with styling
Spreadsheet Conversion
- convert_csv_to_json - Convert CSV data to JSON format
- convert_json_to_csv - Convert JSON data to CSV format
- convert_xlsx_to_csv - Extract CSV data from Excel files
Archive Operations
- create_archive - Create ZIP or TAR archives from files/folders
- extract_archive - Extract contents from archive files
- list_archive_contents - List files in an archive without extracting
Utility Tools
- detect_file_type - Identify file format and encoding
- validate_conversion - Check if a conversion is supported
- batch_convert - Convert multiple files in one operation
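The tool names listed above can be captured in a small typed catalogue so that a client can check a name before sending a request. This is only a sketch: the tool names come from the lists above, while `TOOL_CATALOG`, `categoryOf`, and `isKnownTool` are hypothetical helpers and not part of the server's API.

```typescript
// Tool names copied from the lists above, grouped by category.
// The grouping keys and helper names are illustrative assumptions.
const TOOL_CATALOG = {
  pdf: ["convert_pdf_to_markdown", "extract_pdf_text", "extract_pdf_images"],
  image: ["convert_image_format", "resize_image", "compress_image"],
  document: ["convert_document", "extract_document_text", "convert_markdown_to_html"],
  spreadsheet: ["convert_csv_to_json", "convert_json_to_csv", "convert_xlsx_to_csv"],
  archive: ["create_archive", "extract_archive", "list_archive_contents"],
  utility: ["detect_file_type", "validate_conversion", "batch_convert"],
} as const;

type ToolCategory = keyof typeof TOOL_CATALOG;
type ToolName = (typeof TOOL_CATALOG)[ToolCategory][number];

// Returns the category a tool belongs to, or undefined for an unknown name.
function categoryOf(name: string): ToolCategory | undefined {
  const categories = Object.keys(TOOL_CATALOG) as ToolCategory[];
  for (const category of categories) {
    const names = TOOL_CATALOG[category] as readonly string[];
    if (names.indexOf(name) !== -1) {
      return category;
    }
  }
  return undefined;
}

// Type guard: narrows an arbitrary string to one of the listed tool names.
function isKnownTool(name: string): name is ToolName {
  return categoryOf(name) !== undefined;
}
```

Checking the name locally catches typos early; the server remains the authority on which tools it actually exposes.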
Examples
Basic PDF Conversion
// Convert PDF to Markdown
await client.callTool("convert_pdf_to_markdown", {
  input_path: "/path/to/document.pdf",
  output_path: "/path/to/output.md",
  options: {
    engine: "marker",
    preserve_formatting: true
  }
});
Image Format Conversion
// Convert PNG to WebP with compression
await client.callTool("convert_image_format", {
  input_path: "/path/to/image.png",
  output_path: "/path/to/image.webp",
  options: {
    quality: 80,
    format: "webp"
  }
});
Document Conversion
// Convert DOCX to Markdown using Pandoc
await client.callTool("convert_document", {
  input_path: "/path/to/document.docx",
  output_path: "/path/to/document.md",
  options: {
    format: "markdown",
    preserve_styles: false
  }
});
Batch Operations
// Convert multiple files at once
await client.callTool("batch_convert", {
  input_directory: "/path/to/input/",
  output_directory: "/path/to/output/",
  conversions: [
    { from: "pdf", to: "markdown" },
    { from: "png", to: "webp" },
    { from: "docx", to: "txt" }
  ]
});
Configuration Options
Conversion Settings
interface ConversionOptions {
  engine?: string;                      // Conversion engine to use
  quality?: number;                     // Output quality (1-100)
  preserve_formatting?: boolean;        // Maintain original formatting
  output_format?: string;               // Specific output format
  compression_level?: number;           // Compression level (0-9)
  custom_options?: Record<string, any>; // Engine-specific options
}
Supported File Types
Input Formats
- Documents: PDF, DOCX, DOC, RTF, TXT, HTML, XML
- Images: PNG, JPG, JPEG, WebP, GIF, BMP, TIFF, SVG
- Spreadsheets: CSV, XLSX, XLS, JSON, TSV
- Archives: ZIP, TAR, GZ, 7Z, RAR (extract only)
- Code: Various programming language files
Output Formats
- Text: Markdown, HTML, TXT, RTF
- Images: PNG, JPG, WebP, GIF, BMP
- Data: JSON, CSV, XML, YAML
- Archives: ZIP, TAR, GZ
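The format lists above can be turned into a simple pre-flight check on file extensions. The sketch below is an assumption-laden illustration: the tables mirror the lists above (the open-ended "Code" category is omitted), `md` is assumed as the extension for Markdown, and `extensionOf`, `findGroup`, and `checkFormats` are hypothetical helpers. It only reports whether each extension is listed, not whether that particular pair can be converted; the server's validate_conversion tool is the place to ask that.

```typescript
// Extension tables transcribed from the Input Formats and Output Formats lists.
const INPUT_FORMATS: Record<string, string[]> = {
  documents: ["pdf", "docx", "doc", "rtf", "txt", "html", "xml"],
  images: ["png", "jpg", "jpeg", "webp", "gif", "bmp", "tiff", "svg"],
  spreadsheets: ["csv", "xlsx", "xls", "json", "tsv"],
  archives: ["zip", "tar", "gz", "7z", "rar"],
};

const OUTPUT_FORMATS: Record<string, string[]> = {
  text: ["md", "html", "txt", "rtf"], // "md" is an assumed extension for Markdown
  images: ["png", "jpg", "webp", "gif", "bmp"],
  data: ["json", "csv", "xml", "yaml"],
  archives: ["zip", "tar", "gz"],
};

// Lower-cased extension of the final path segment, or "" if there is none.
function extensionOf(path: string): string {
  const base = path.slice(path.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot <= 0 ? "" : base.slice(dot + 1).toLowerCase();
}

// Name of the first group that lists the extension, or undefined.
function findGroup(table: Record<string, string[]>, ext: string): string | undefined {
  for (const group of Object.keys(table)) {
    if (table[group].indexOf(ext) !== -1) {
      return group;
    }
  }
  return undefined;
}

// Reports which input and output group (if any) each path falls into.
function checkFormats(inputPath: string, outputPath: string) {
  return {
    inputGroup: findGroup(INPUT_FORMATS, extensionOf(inputPath)),
    outputGroup: findGroup(OUTPUT_FORMATS, extensionOf(outputPath)),
  };
}
```

For example, `checkFormats("/path/to/document.pdf", "/path/to/output.md")` finds the input under `documents` and the output under `text`, while a `.rar` output path yields an undefined `outputGroup` because RAR appears only in the input list.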
Performance Considerations
- Memory Usage: Large files are processed in chunks to prevent memory issues
- Processing Speed: Different engines have different speed/quality tradeoffs
- Batch Processing: More efficient for multiple file conversions
- Caching: Converted files can be cached to avoid re-processing
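The caching point above can be illustrated with a small in-memory cache keyed on the input path, its modification time, and the conversion options. This is a minimal sketch of one possible approach, not the server's actual implementation: `CacheKeyParts`, `cacheKey`, and `ConversionCache` are hypothetical names, and the caller is assumed to supply the file's modification time.

```typescript
// Everything that should invalidate a cached result when it changes.
interface CacheKeyParts {
  inputPath: string;
  modifiedMs: number; // file modification time, supplied by the caller
  options: Record<string, unknown>;
}

// Builds a stable key. Top-level option keys are sorted so that
// { a: 1, b: 2 } and { b: 2, a: 1 } map to the same entry.
// Nested objects are serialized as-is, without sorting their keys.
function cacheKey(parts: CacheKeyParts): string {
  const sortedOptions = Object.keys(parts.options)
    .sort()
    .map((key) => `${key}=${JSON.stringify(parts.options[key])}`);
  return [parts.inputPath, String(parts.modifiedMs)].concat(sortedOptions).join("|");
}

// In-memory cache mapping a key to the path of the converted file.
class ConversionCache {
  private entries = new Map<string, string>();

  // Returns the cached output path, or runs `convert` once and stores its result.
  async getOrConvert(parts: CacheKeyParts, convert: () => Promise<string>): Promise<string> {
    const key = cacheKey(parts);
    const hit = this.entries.get(key);
    if (hit !== undefined) {
      return hit;
    }
    const outputPath = await convert();
    this.entries.set(key, outputPath);
    return outputPath;
  }
}
```

Including the modification time in the key means an edited source file is converted again, while an unchanged file with the same options is served from the cache.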
Error Handling
The server provides comprehensive error handling:
- Input file validation and format detection
- Graceful fallback between conversion engines
- Detailed error messages with suggested solutions
- Progress tracking for long-running conversions
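Graceful fallback between engines, as described above, might look roughly like the following: try each engine in order and return the first success, collecting the failures along the way. This is a sketch under stated assumptions. The `Engine` shape, `FallbackResult`, and `convertWithFallback` are hypothetical and do not describe the server's internals.

```typescript
// Hypothetical engine shape: a name plus an async conversion function.
interface Engine {
  name: string;
  convert: (inputPath: string) => Promise<string>;
}

interface FallbackResult {
  engine: string;   // name of the engine that succeeded
  output: string;   // whatever that engine returned
  errors: string[]; // messages from engines that failed before it
}

// Tries engines in the given order and returns the first successful result.
// Throws a single error listing every failure if no engine succeeds.
async function convertWithFallback(inputPath: string, engines: Engine[]): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const engine of engines) {
    try {
      const output = await engine.convert(inputPath);
      return { engine: engine.name, output, errors };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${engine.name}: ${message}`);
    }
  }
  throw new Error(`All engines failed for ${inputPath}: ${errors.join("; ")}`);
}
```

For PDF conversion the list could hold one entry for marker followed by one for pymupdf4llm, so that the second engine is attempted only when the first throws. Keeping the collected messages in the result lets the caller report why the preferred engine was skipped.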
Development
# Clone repository
git clone https://github.com/cordlesssteve/file-converter-mcp.git
cd file-converter-mcp
# Install dependencies
npm install
# Build project
npm run build
# Run development mode
npm run dev
# Run tests
npm test
Contributing
- Fork the repository
- Create a feature branch
- Add support for new file formats or conversion engines
- Add tests for new functionality
- Submit a pull request
License
MIT License - see LICENSE file for details.
