unified-markdown
v1.1.0
AI-powered CLI and web UI to convert images, PDFs, DOCX, and PPTX to Markdown using Google Gemini
UnifiedMarkdown (umd)
AI-powered CLI and web UI to convert images, PDFs, DOCX, and PPTX documents to Markdown using Google Gemini.
Features
- Format Support: Convert images (PNG, JPG, JPEG, WEBP, GIF, BMP, TIFF, SVG) and documents (PDF, DOCX, PPTX) to Markdown.
- Batch Processing: Scan and convert entire directories in parallel with configurable concurrency.
- Web UI: Built-in web dashboard to browse directories, manage conversions, configure exclusions, and monitor jobs in real time via SSE.
- Directory Exclusions: Respects .umdignore files (gitignore-style patterns) and custom exclusion rules managed through the UI.
- Claude Code Skills: Bundled skills offer an alternative way to convert files using natural language in Claude Code sessions.
- Easy Setup: Interactive CLI configuration for your Gemini API key.
Prerequisites
- Node.js >= 18.0.0
- Google Gemini API key (free tier available at Google AI Studio)
- LibreOffice (required for PPTX/PPT conversion only):
  - macOS: brew install libreoffice
  - Ubuntu/Debian: sudo apt install libreoffice
  - Windows: Install from the official website
Installation
npm install -g unified-markdown
Setup
Run the interactive setup to configure your Gemini API key:
umd setup
Alternatively, set the environment variable directly:
export GEMINI_API_KEY="your-api-key-here"
Configuration is stored at ~/.umd/config.json.
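For scripts that wrap umd, it can help to check up front that a key is available. The following is a minimal Python sketch, not the tool's actual lookup logic. It assumes the environment variable takes precedence over the stored configuration, and the JSON field name `apiKey` is hypothetical, since the layout of ~/.umd/config.json is not documented here.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Location named in this README. The file's internal layout is NOT documented,
# so the "apiKey" field used below is a hypothetical placeholder.
CONFIG_PATH = Path.home() / ".umd" / "config.json"


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = CONFIG_PATH,
    field: str = "apiKey",
) -> Optional[str]:
    """Return a Gemini API key from the environment or the config file, else None.

    Assumption: GEMINI_API_KEY in the environment wins over the config file.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY")
    if key:
        return key
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed config file: treat as "no key".
        return None
    value = data.get(field) if isinstance(data, dict) else None
    return value if isinstance(value, str) and value else None
```

A wrapper script could call `resolve_api_key()` and print a reminder to run `umd setup` when it returns `None`.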
Usage
Convert a Single File
Converts a file to a .md file alongside the original:
umd convert photo.png
umd convert /path/to/document.pdf
Convert a Directory (Sequential)
umd convert /path/to/directory
Batch Convert a Directory in Parallel
Scan and convert all supported files concurrently:
umd orchestrate convert /path/to/directory
umd orch convert /path/to/directory --concurrency 5
umd orch convert /path/to/directory --dry-run
Scan a Directory
Preview what would be converted without converting:
umd orchestrate scan /path/to/directory
umd orch scan /path/to/directory --pending-only
Start the Web UI
Launch a local web dashboard to visually manage conversions:
umd orchestrate ui
umd orch ui --port 8080 --open
The UI provides:
- Dashboard with live stats and activity feed
- File Browser with directory scanning, file tree with selection, and native OS folder picker
- Jobs page with real-time progress tracking and log viewer
- Settings for managing exclusion rules and viewing configuration
Check Job Status
umd orchestrate status # Show recent jobs
umd orch status <jobId> # Show specific job details
umd orch status --all # Show all jobs
Supported File Types
- Images: .png, .jpg, .jpeg, .webp, .gif, .bmp, .tiff, .tif, .svg
- Documents: .pdf, .docx, .pptx, .ppt
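To pre-filter files in your own scripts before handing them to umd, the extension lists above can be turned into a small lookup. This is a Python sketch under stated assumptions rather than the tool's own detection code: the function names `classify` and `needs_libreoffice` are hypothetical, and case-insensitive matching is assumed.

```python
from pathlib import Path
from typing import Optional

# Extension lists copied from this README.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp", ".tiff", ".tif", ".svg"}
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".pptx", ".ppt"}


def classify(path: str) -> Optional[str]:
    """Return "image", "document", or None for an unsupported file."""
    # Assumption: matching is case-insensitive, so "SCAN.PDF" counts as a PDF.
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"
    return None


def needs_libreoffice(path: str) -> bool:
    """PPTX/PPT are the only formats this README lists as requiring LibreOffice."""
    return Path(path).suffix.lower() in {".pptx", ".ppt"}
```

For example, `classify("photo.png")` yields `"image"`, while a file with no listed extension yields `None` and can be skipped.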
.umdignore
Place a .umdignore file in any directory to exclude files from scanning and batch conversion. Uses gitignore-style patterns:
# Ignore all PDFs in this directory
*.pdf
# Ignore a specific subdirectory
drafts/
# Negation to re-include
!important.pdf
Alternative: Claude Code Skills
If you use Claude Code, this package includes bundled skills that are automatically installed to ~/.claude/skills during npm install. Convert files using natural language:
Convert document.pdf to markdown
Convert all files in ./docs/ to markdown
The web UI also has a "Use Claude Code" toggle for batch conversions, which uses the bundled skill with Claude Code's --dangerously-skip-permissions flag.
License
MIT
