@stacktown/file-converter-mcp

v1.0.0

Published 7 months ago

MCP server for high-quality file conversion with PDF-to-Markdown conversion and Universal Project Documentation Standard support

cordlesssteve

mcp model-context-protocol document-organization pdf-conversion markdown project-documentation universal-standard

File Converter MCP

A Model Context Protocol (MCP) server that aggregates several file conversion engines behind one interface, so an MCP client can convert, extract, and reformat files.

Features

Supported Conversions

PDF to Markdown - Convert PDF documents to markdown format
Image Format Conversion - Transform between common image formats (PNG, JPG, WebP, etc.)
Document Conversion - Convert between document formats (DOCX, TXT, HTML, etc.)
Spreadsheet Conversion - Transform spreadsheet formats (CSV, XLSX, JSON, etc.)
Code Format Conversion - Convert between code formats and apply syntax highlighting
Archive Operations - Extract and create archive files (ZIP, TAR, etc.)

Conversion Engines

PDF Engine: marker (recommended) or pymupdf4llm
Image Engine: Sharp and ImageMagick integration
Document Engine: Pandoc integration for broad format support
Archive Engine: Built-in Node.js compression libraries

Installation

npm install -g @stacktown/file-converter-mcp

Dependencies

Install conversion engines based on your needs:

# PDF conversion engines
pip install marker-pdf pymupdf4llm

# Image processing (choose one)
npm install sharp
# OR
brew install imagemagick  # macOS
sudo apt-get install imagemagick  # Ubuntu

# Document conversion
brew install pandoc  # macOS
sudo apt-get install pandoc  # Ubuntu

# Archive tools (usually pre-installed)
# zip, unzip, tar, gzip

Usage

MCP Configuration

Add to your MCP client configuration:

{
  "mcpServers": {
    "file-converter": {
      "command": "file-converter-mcp",
      "args": []
    }
  }
}
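
If the client configuration already lists other servers, the entry above has to be merged into it rather than pasted over it. The following is a minimal sketch of that merge, assuming the `mcpServers` shape shown above. `McpConfig`, `ServerEntry`, and `addServer` are hypothetical names used for illustration and are not part of this package.

```typescript
// Shape of the client configuration shown above (assumed; clients may add other keys).
interface ServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, ServerEntry>;
}

// Hypothetical helper: returns a new config with the server entry added,
// leaving any servers that are already configured untouched.
function addServer(config: McpConfig, name: string, entry: ServerEntry): McpConfig {
  return {
    ...config,
    mcpServers: { ...config.mcpServers, [name]: entry },
  };
}

const existing: McpConfig = {
  mcpServers: { "other-server": { command: "other-server", args: [] } },
};

const updated = addServer(existing, "file-converter", {
  command: "file-converter-mcp",
  args: [],
});

console.log(JSON.stringify(updated, null, 2));
```

The helper copies instead of mutating, so the original configuration can still be written back unchanged if a later step fails.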

Available Tools

PDF Conversion

convert_pdf_to_markdown - Convert PDF files to Markdown
extract_pdf_text - Extract plain text from PDF files
extract_pdf_images - Extract images from PDF files

Image Conversion

convert_image_format - Convert between image formats
resize_image - Resize images with quality options
compress_image - Reduce image file size

Document Conversion

convert_document - Convert between document formats using Pandoc
extract_document_text - Extract text from various document formats
convert_markdown_to_html - Convert Markdown to HTML with styling

Spreadsheet Conversion

convert_csv_to_json - Convert CSV data to JSON format
convert_json_to_csv - Convert JSON data to CSV format
convert_xlsx_to_csv - Extract CSV data from Excel files

Archive Operations

create_archive - Create ZIP or TAR archives from files/folders
extract_archive - Extract contents from archive files
list_archive_contents - List files in archive without extracting

Utility Tools

detect_file_type - Identify file format and encoding
validate_conversion - Check if conversion is supported
batch_convert - Convert multiple files in one operation
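
A client that accepts tool names from user input may want to check them before sending a request. The sketch below groups the tool names listed above by category and narrows an arbitrary string to a known name. The `TOOLS` constant and the `isToolName` and `categoryOf` helpers are hypothetical, and the server itself remains the authority on which tools exist.

```typescript
// Tool names copied from the lists above, grouped by category.
const TOOLS = {
  pdf: ["convert_pdf_to_markdown", "extract_pdf_text", "extract_pdf_images"],
  image: ["convert_image_format", "resize_image", "compress_image"],
  document: ["convert_document", "extract_document_text", "convert_markdown_to_html"],
  spreadsheet: ["convert_csv_to_json", "convert_json_to_csv", "convert_xlsx_to_csv"],
  archive: ["create_archive", "extract_archive", "list_archive_contents"],
  utility: ["detect_file_type", "validate_conversion", "batch_convert"],
} as const;

type Category = keyof typeof TOOLS;
type ToolName = (typeof TOOLS)[Category][number];

// Hypothetical helper: narrows an arbitrary string to a known tool name.
function isToolName(name: string): name is ToolName {
  return Object.values(TOOLS).some((names) =>
    (names as readonly string[]).includes(name),
  );
}

// Hypothetical helper: finds the category a known tool belongs to.
function categoryOf(name: ToolName): Category {
  for (const category of Object.keys(TOOLS) as Category[]) {
    if ((TOOLS[category] as readonly string[]).includes(name)) return category;
  }
  throw new Error(`unknown tool: ${name}`);
}

console.log(isToolName("resize_image"));
console.log(categoryOf("batch_convert"));
```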

Examples

Basic PDF Conversion

// Convert PDF to Markdown
await client.callTool("convert_pdf_to_markdown", {
  input_path: "/path/to/document.pdf",
  output_path: "/path/to/output.md",
  options: {
    engine: "marker",
    preserve_formatting: true
  }
});

Image Format Conversion

// Convert PNG to WebP with compression
await client.callTool("convert_image_format", {
  input_path: "/path/to/image.png",
  output_path: "/path/to/image.webp",
  options: {
    quality: 80,
    format: "webp"
  }
});

Document Conversion

// Convert DOCX to Markdown using Pandoc
await client.callTool("convert_document", {
  input_path: "/path/to/document.docx",
  output_path: "/path/to/document.md",
  options: {
    format: "markdown",
    preserve_styles: false
  }
});

Batch Operations

// Convert multiple files at once
await client.callTool("batch_convert", {
  input_directory: "/path/to/input/",
  output_directory: "/path/to/output/",
  conversions: [
    { from: "pdf", to: "markdown" },
    { from: "png", to: "webp" },
    { from: "docx", to: "txt" }
  ]
});
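
Before running a batch, it can help to preview which files a rule set would touch. The sketch below does this on the client side, assuming that rules match on the file extension and that the output keeps the input's base name. Neither behavior is documented here, and `planBatch` and the `EXTENSION_FOR` mapping are hypothetical.

```typescript
interface ConversionRule {
  from: string; // input extension, as in the example above
  to: string;   // target format name, as in the example above
}

interface PlannedConversion {
  input: string;
  output: string;
}

// Assumed mapping from format names to file extensions; the server's own
// naming scheme is not documented here.
const EXTENSION_FOR: Record<string, string> = { markdown: "md" };

// Hypothetical helper: previews which files a rule set would convert.
// Expects bare file names, not full paths.
function planBatch(files: string[], rules: ConversionRule[]): PlannedConversion[] {
  const planned: PlannedConversion[] = [];
  for (const file of files) {
    const dot = file.lastIndexOf(".");
    if (dot <= 0) continue; // no extension (or a dotfile): nothing to match
    const ext = file.slice(dot + 1).toLowerCase();
    const rule = rules.find((r) => r.from === ext);
    if (!rule) continue; // no rule for this file type, so it is skipped
    const outExt = EXTENSION_FOR[rule.to] ?? rule.to;
    planned.push({ input: file, output: `${file.slice(0, dot)}.${outExt}` });
  }
  return planned;
}

const preview = planBatch(
  ["report.pdf", "logo.png", "notes.docx", "README"],
  [
    { from: "pdf", to: "markdown" },
    { from: "png", to: "webp" },
    { from: "docx", to: "txt" },
  ],
);
console.log(preview);
```

Files without a matching rule, such as `README` here, are left out of the plan rather than reported as errors.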

Configuration Options

Conversion Settings

interface ConversionOptions {
  engine?: string;                    // Conversion engine to use
  quality?: number;                   // Output quality (1-100)
  preserve_formatting?: boolean;      // Maintain original formatting
  output_format?: string;             // Specific output format
  compression_level?: number;         // Compression level (0-9)
  custom_options?: Record<string, any>; // Engine-specific options
}
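
The ranges in the comments above can be checked on the client before a request is sent. This is a minimal sketch of such a check. `validateOptions` is a hypothetical helper, and the interface is repeated only so that the block is self-contained. Whether the server rejects out-of-range values itself is not stated here.

```typescript
// Repeated from the interface above so the block stands alone.
interface ConversionOptions {
  engine?: string;
  quality?: number;
  preserve_formatting?: boolean;
  output_format?: string;
  compression_level?: number;
  custom_options?: Record<string, any>;
}

// Returns true when value is a finite number within [min, max].
function inRange(value: number, min: number, max: number): boolean {
  return Number.isFinite(value) && value >= min && value <= max;
}

// Hypothetical client-side check: returns a list of problems, empty if none.
function validateOptions(options: ConversionOptions): string[] {
  const problems: string[] = [];
  const { quality, compression_level } = options;
  if (quality !== undefined && !inRange(quality, 1, 100)) {
    problems.push(`quality must be between 1 and 100, got ${quality}`);
  }
  if (compression_level !== undefined && !inRange(compression_level, 0, 9)) {
    problems.push(`compression_level must be between 0 and 9, got ${compression_level}`);
  }
  return problems;
}

console.log(validateOptions({ quality: 80, compression_level: 6 }));
console.log(validateOptions({ quality: 101, compression_level: 10 }));
```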

Supported File Types

Input Formats

Documents: PDF, DOCX, DOC, RTF, TXT, HTML, XML
Images: PNG, JPG, JPEG, WebP, GIF, BMP, TIFF, SVG
Spreadsheets: CSV, XLSX, XLS, JSON, TSV
Archives: ZIP, TAR, GZ, 7Z, RAR (extract only)
Code: Various programming language files

Output Formats

Text: Markdown, HTML, TXT, RTF
Images: PNG, JPG, WebP, GIF, BMP
Data: JSON, CSV, XML, YAML
Archives: ZIP, TAR, GZ
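
The two lists above can serve as a quick client-side pre-check. The sketch below only tests list membership: it does not know which input and output pairs the server can actually convert, so `validate_conversion` remains the authoritative check. `looksSupported` is a hypothetical helper.

```typescript
// Formats copied from the lists above, lower-cased.
const INPUT_FORMATS = new Set([
  "pdf", "docx", "doc", "rtf", "txt", "html", "xml",
  "png", "jpg", "jpeg", "webp", "gif", "bmp", "tiff", "svg",
  "csv", "xlsx", "xls", "json", "tsv",
  "zip", "tar", "gz", "7z", "rar",
]);

const OUTPUT_FORMATS = new Set([
  "markdown", "html", "txt", "rtf",
  "png", "jpg", "webp", "gif", "bmp",
  "json", "csv", "xml", "yaml",
  "zip", "tar", "gz",
]);

// Hypothetical helper: true when both formats appear in the lists above.
// A true result does not guarantee that this particular pair is convertible.
function looksSupported(from: string, to: string): boolean {
  return INPUT_FORMATS.has(from.toLowerCase()) && OUTPUT_FORMATS.has(to.toLowerCase());
}

console.log(looksSupported("PDF", "Markdown"));
console.log(looksSupported("rar", "rar")); // RAR is listed as extract only
```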

Performance Considerations

Memory Usage: Large files are processed in chunks to keep memory use bounded
Processing Speed: Engines trade speed against output quality; select one per call with the engine option
Batch Processing: Converting many files with one batch_convert call is more efficient than issuing one call per file
Caching: Converted files can be cached to avoid re-processing
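
How the cache is keyed is not described above. One plausible approach, sketched here under that caveat, is to derive the key from the input path, the file's modification time, and the conversion options, so that a changed file or changed options cause a fresh conversion. `cacheKey` and `ConversionCache` are hypothetical names, and the cache is a plain in-memory map.

```typescript
interface CacheKeyParts {
  inputPath: string;
  modifiedMs: number;               // file modification time, supplied by the caller
  options: Record<string, unknown>; // conversion options used for the run
}

// Builds a key that changes when the file or the options change.
// Only top-level option keys are sorted, so nested objects must be
// passed with a stable key order.
function cacheKey({ inputPath, modifiedMs, options }: CacheKeyParts): string {
  const sorted = Object.keys(options)
    .sort()
    .map((k) => [k, options[k]]);
  return JSON.stringify([inputPath, modifiedMs, sorted]);
}

class ConversionCache {
  private readonly entries = new Map<string, string>();

  // Returns the cached output path, or runs `convert` and remembers its result.
  getOrConvert(parts: CacheKeyParts, convert: () => string): string {
    const key = cacheKey(parts);
    const hit = this.entries.get(key);
    if (hit !== undefined) return hit;
    const outputPath = convert();
    this.entries.set(key, outputPath);
    return outputPath;
  }
}

let runs = 0;
const cache = new ConversionCache();
const parts: CacheKeyParts = {
  inputPath: "/path/to/document.pdf",
  modifiedMs: 1700000000000,
  options: { engine: "marker" },
};
const convert = () => {
  runs += 1; // stand-in for a real conversion call
  return "/path/to/output.md";
};

cache.getOrConvert(parts, convert);
cache.getOrConvert(parts, convert); // served from the cache
cache.getOrConvert({ ...parts, modifiedMs: parts.modifiedMs + 1 }, convert); // file changed
console.log(runs); // 2: the second call was a cache hit
```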

Error Handling

The server provides comprehensive error handling:

Input file validation and format detection
Graceful fallback between conversion engines
Detailed error messages with suggested solutions
Progress tracking for long-running conversions
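
The fallback between engines mentioned above can be pictured as trying each engine in order and keeping the first success. The sketch below illustrates that idea only and is not the server's implementation. The `Engine` type, `convertWithFallback`, and the two stand-in engines are hypothetical.

```typescript
type Engine = {
  name: string;
  convert: (inputPath: string) => Promise<string>;
};

interface FallbackResult {
  engine: string; // name of the engine that succeeded
  output: string;
}

// Tries each engine in order and returns the first success.
// If every engine fails, the thrown error lists each failure.
async function convertWithFallback(
  inputPath: string,
  engines: Engine[],
): Promise<FallbackResult> {
  const failures: string[] = [];
  for (const engine of engines) {
    try {
      const output = await engine.convert(inputPath);
      return { engine: engine.name, output };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${engine.name}: ${reason}`);
    }
  }
  throw new Error(`All engines failed for ${inputPath}. ${failures.join("; ")}`);
}

// Stand-in engines for illustration; real ones would invoke marker or pymupdf4llm.
const failing: Engine = {
  name: "marker",
  convert: async () => {
    throw new Error("not installed");
  },
};
const working: Engine = {
  name: "pymupdf4llm",
  convert: async (path) => `# Converted ${path}`,
};

const pending = convertWithFallback("/path/to/document.pdf", [failing, working]);
pending.then((result) => console.log(result.engine));
```

Collecting every failure before giving up keeps the final error message useful: it shows which engines were tried and why each one was rejected.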

Development

# Clone repository
git clone https://github.com/cordlesssteve/file-converter-mcp.git
cd file-converter-mcp

# Install dependencies
npm install

# Build project
npm run build

# Run development mode
npm run dev

# Run tests
npm test

Contributing

Fork the repository
Create a feature branch
Add support for new file formats or conversion engines
Add tests for new functionality
Submit a pull request

License

MIT License - see LICENSE file for details.
