md-pdf-md
v1.0.2
Bidirectional Markdown-PDF converter with AI-powered vision. MD→PDF with beautiful themes, PDF→MD with LLaVA vision model - completely open source and privacy-first
Bidirectional Markdown ↔ PDF converter with AI-powered vision
Convert Markdown to beautiful PDFs AND extract Markdown from PDFs using local AI vision. Zero configuration, completely private, and open source.
✨ Features
Markdown → PDF
- 🎨 4 Beautiful Themes - GitHub, GitHub Dark, Academic, Minimal
- 💎 VS Code Syntax Highlighting - Powered by Shiki
- 📄 Smart Page Breaks - No orphaned headings or broken code blocks
- 📊 Auto Table of Contents - With page numbers
- 🚀 2-3 Second Generation - Fast and efficient
- ⚙️ Zero Configuration - Works out of the box
PDF → Markdown (NEW!)
- 🤖 AI-Powered Vision - Uses LLaVA to understand document structure
- 🔒 100% Private - Runs locally via Ollama (no cloud APIs)
- 📝 Structure Preservation - Maintains headings, lists, code blocks, tables
- 💰 Free Forever - No API costs, completely open source
🚀 Quick Start
# Install
npm install -g md-pdf-md
# Convert Markdown to PDF
md-pdf-md README.md
# Convert PDF to Markdown (requires Ollama + LLaVA)
md-pdf-md document.pdf
That's it! The tool auto-detects file type and converts appropriately.
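As a rough illustration of the auto-detection idea, the sketch below picks a conversion direction from the file extension. This is an assumption for illustration only: the README does not say how md-pdf-md actually detects file types, and `detectConversion` is a hypothetical helper, not part of the package. The returned names mirror the explicit `md2pdf` and `pdf2md` commands.

```javascript
// Hypothetical helper: shows extension-based dispatch only.
// The package's real detection logic may differ.
function detectConversion(filePath) {
  const dot = filePath.lastIndexOf('.');
  const ext = dot === -1 ? '' : filePath.slice(dot + 1).toLowerCase();
  if (ext === 'md' || ext === 'markdown') return 'md2pdf';
  if (ext === 'pdf') return 'pdf2md';
  throw new Error(`Unsupported file type: ${filePath}`);
}

console.log(detectConversion('README.md'));    // md2pdf
console.log(detectConversion('document.pdf')); // pdf2md
```

When the input has an unusual extension, the explicit `md2pdf` and `pdf2md` commands avoid relying on detection.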
📦 Installation
Basic (MD→PDF only)
npm install -g md-pdf-md
Full Setup (MD↔PDF bidirectional)
# 1. Install the package
npm install -g md-pdf-md
# 2. Install Ollama (for PDF→MD)
# Visit: https://ollama.ai
# 3. Pull LLaVA model (~4.7GB)
ollama pull llava
# 4. Verify setup
md-pdf-md check
💡 Usage
Smart Auto-Detection
# Just pass any file!
md-pdf-md README.md # → Converts to PDF
md-pdf-md document.pdf # → Converts to Markdown
md-pdf-md slides.md --theme github-dark
With Options
# Markdown to PDF
md-pdf-md docs.md -o output.pdf --theme academic --format Letter
# PDF to Markdown
md-pdf-md report.pdf -o report.md --model llava --quality 300
Explicit Commands (for power users)
md-pdf-md md2pdf input.md # Explicit MD→PDF
md-pdf-md pdf2md input.pdf # Explicit PDF→MD
md-pdf-md themes # List available themes
md-pdf-md check # Verify Ollama setup
🎨 Themes
| Theme | Description | Best For |
|-------|-------------|----------|
| github | Clean light theme | General docs |
| github-dark | Dark with syntax highlighting | Code-heavy docs |
| academic | Formal serif fonts | Papers & reports |
| minimal | Simple & clean | Minimalist design |
Preview: md-pdf-md themes
🔧 Options
Markdown → PDF
-o, --output <path> Output PDF path
-t, --theme <name> Theme (default: github)
--toc / --no-toc Table of contents (default: true)
--page-numbers Page numbers (default: true)
-f, --format <format> A4, Letter, or Legal (default: A4)
--css <path> Custom CSS file
--highlight-theme <theme> Syntax highlight theme
PDF → Markdown
-o, --output <path> Output markdown path
-m, --model <name> Ollama model (default: llava)
--host <url> Ollama server URL
-q, --quality <dpi> Image quality (default: 200)
--debug Debug mode
📝 Programmatic API
import { convertMarkdownToPdf, convertPdfToMarkdown } from 'md-pdf-md';

// Markdown → PDF
const pdfResult = await convertMarkdownToPdf({
  input: 'README.md',
  output: 'README.pdf',
  theme: 'github-dark',
  toc: true,
  pageNumbers: true
});

// PDF → Markdown (with progress)
// Distinct variable names avoid redeclaring `const result` in the same scope.
const markdownResult = await convertPdfToMarkdown({
  input: 'document.pdf',
  output: 'document.md',
  model: 'llava'
}, (progress) => {
  // Called as pages are processed
  console.log(`Page ${progress.currentPage}/${progress.totalPages}`);
});
🤖 How PDF→MD Works
Traditional PDF extractors dump raw text with no sense of layout. md-pdf-md uses the LLaVA vision model to:
- Understand structure - Identifies H1, H2, H3 correctly
- Preserve formatting - Maintains lists, code blocks, tables
- Detect code - Recognizes programming languages
- Keep hierarchy - Preserves document organization
All processing happens locally on your machine - no cloud APIs, no data leaving your computer.
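To make the local-only flow concrete, here is a minimal sketch of how one rendered page image could be sent to a LLaVA model through Ollama's local HTTP API. The request shape follows Ollama's `/api/generate` endpoint. The prompt text and the function names `buildVisionRequest` and `pageToMarkdown` are assumptions for illustration; the prompts and internals md-pdf-md actually uses are not documented here.

```javascript
// Sketch only: prompt text and function names are hypothetical.
// The request body follows Ollama's /api/generate endpoint.
function buildVisionRequest(pageImageBase64, model = 'llava') {
  return {
    model,
    prompt: 'Transcribe this page as Markdown. Preserve headings, lists, code blocks, and tables.',
    images: [pageImageBase64], // base64-encoded image of one rendered page
    stream: false,             // ask for a single JSON response instead of a stream
  };
}

// Sends one page to a local Ollama server and returns the generated text.
// Assumes Node 18+ (global fetch) and Ollama listening on its default port.
async function pageToMarkdown(pageImageBase64, host = 'http://localhost:11434') {
  const res = await fetch(`${host}/api/generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildVisionRequest(pageImageBase64)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = await res.json();
  return data.response;
}
```

Because the request goes to `localhost`, the page image stays on the machine running Ollama.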
🆚 Comparison
| Feature | md-pdf-md | pandoc | md-to-pdf | pdf2md |
|---------|-----------|--------|-----------|---------|
| MD→PDF Beautiful | ✅ | ⚠️ Complex | ⚠️ Basic | ❌ |
| PDF→MD AI-powered | ✅ | ❌ | ❌ | ⚠️ Poor |
| Zero config | ✅ | ❌ | ❌ | ✅ |
| 100% Private | ✅ | ✅ | ✅ | ✅ |
| Free | ✅ | ✅ | ✅ | ✅ |
💡 Use Cases
Developers: Beautiful README PDFs with syntax highlighting
md-pdf-md README.md --theme github-dark
Enterprises: Professional reports and documentation
md-pdf-md quarterly-report.md --theme academic --format Letter
Writers: Edit PDFs by converting to Markdown
md-pdf-md document.pdf # Edit the .md, then convert back!
md-pdf-md document.md
Students: Format papers and extract notes from PDFs
md-pdf-md thesis.md --theme academic
md-pdf-md lecture-slides.pdf
🐛 Troubleshooting
"Ollama is not running"
ollama serve # Start Ollama
ollama pull llava # Install model
md-pdf-md check # Verify
Poor PDF→MD results
md-pdf-md doc.pdf --quality 300 # Higher quality
md-pdf-md doc.pdf --model llama3.2-vision # Different model
md-pdf-md doc.pdf --debug # Debug mode
Memory issues
NODE_OPTIONS="--max-old-space-size=4096" md-pdf-md large.pdf
📊 Performance
- MD→PDF: 2-3 seconds for typical documents
- PDF→MD: ~5-10 seconds per page (CPU), ~2-5 seconds per page (GPU)
- Accuracy: 90%+ structure preservation
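For planning purposes, the per-page figures above can be turned into a rough time range for a whole PDF. The helper below is illustrative and not part of the package: `estimatePdfToMdSeconds` is a hypothetical name, and it assumes the GPU figure is also per page and that time scales linearly with page count.

```javascript
// Per-page ranges are taken from the performance figures above.
// The helper is illustrative and assumes linear scaling with page count.
const SECONDS_PER_PAGE = {
  cpu: { min: 5, max: 10 },
  gpu: { min: 2, max: 5 },
};

function estimatePdfToMdSeconds(pageCount, device = 'cpu') {
  if (!Object.prototype.hasOwnProperty.call(SECONDS_PER_PAGE, device)) {
    throw new Error(`Unknown device: ${device}`);
  }
  const range = SECONDS_PER_PAGE[device];
  return { min: pageCount * range.min, max: pageCount * range.max };
}

console.log(estimatePdfToMdSeconds(40, 'cpu')); // { min: 200, max: 400 }
console.log(estimatePdfToMdSeconds(40, 'gpu')); // { min: 80, max: 200 }
```

So a 40-page document would take roughly 3 to 7 minutes on CPU under these assumptions; actual times depend on hardware, model, and the `--quality` setting.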
🛠️ Requirements
- Node.js ≥ 16.0.0
- Ollama (PDF→MD only) - ollama.ai
- LLaVA model (PDF→MD only) - ollama pull llava
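The requirements above can also be checked from a script. The sketch below is a hypothetical pre-flight helper, not the implementation of `md-pdf-md check`. It assumes `tags` is the parsed JSON returned by Ollama's `GET /api/tags` endpoint, which lists locally installed models under a `models` array.

```javascript
// Hypothetical pre-flight check; `checkRequirements` is not part of the package.
// `tags` is assumed to be the parsed JSON from Ollama's GET /api/tags endpoint.
function checkRequirements(nodeVersion, tags, model = 'llava') {
  // nodeVersion looks like "v18.17.0" (the format of process.version).
  const major = parseInt(nodeVersion.replace(/^v/, '').split('.')[0], 10);
  const models = (tags && tags.models) || [];
  return {
    nodeOk: major >= 16,
    // Ollama reports names like "llava:latest", so compare the part before ":".
    modelOk: models.some((m) => m.name.split(':')[0] === model),
  };
}

const report = checkRequirements('v18.17.0', { models: [{ name: 'llava:latest' }] });
console.log(report); // { nodeOk: true, modelOk: true }
```

In real use, pass `process.version` and the JSON fetched from `http://localhost:11434/api/tags` (Ollama's default address).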
🤝 Contributing
Contributions welcome! Please feel free to submit a Pull Request.
📄 License
MIT License - see LICENSE file for details.
🙏 Built With
- Puppeteer - PDF generation
- Ollama - Local AI runtime
- LLaVA - Vision language model
- Shiki - Syntax highlighting
- Marked - Markdown parsing
Made with ❤️ by josharsh
⭐ Star this repo if you find it useful!
