f2md
v1.1.1
Convert PDF and DOCX files to Markdown using AI
f2md
Convert PDF, DOCX, and image files to Markdown using AI. This CLI tool extracts text and images and preserves table structure while converting documents to clean, well-formatted Markdown. It also supports OCR text extraction from images.
Features
- PDF Support - Full text extraction, image extraction, and page screenshots for layout understanding
- DOCX Support - Text and image extraction with structure preservation
- Image OCR - Extract text from images (PNG, JPG, JPEG, GIF, WEBP) using AI-powered OCR
- AI-Powered Conversion - Uses Google's Gemini AI to intelligently convert content to Markdown
- Interactive CLI - Friendly prompts using clack.js
- Easy Setup - Built-in configuration wizard for API keys
Installation
Using npx (no installation required)
npx f2md document.pdf
Using bunx
bunx f2md document.pdf
Using pnpm dlx
pnpm dlx f2md document.pdf
Global installation
npm install -g f2md
# or
bun install -g f2md
Setup
Before using the tool, you need to configure your Google AI API key.
Run the setup wizard
f2md setup
# or with npx
npx f2md setup
The setup wizard will:
- Show you where to get a Google AI API key (https://aistudio.google.com/apikey)
- Prompt you to enter your API key
- Ask where to save it (local project or global for all projects)
Manual setup
Alternatively, set the environment variable:
export GOOGLE_GENERATIVE_AI_API_KEY="your-api-key-here"
Or create a .env file in your project:
GOOGLE_GENERATIVE_AI_API_KEY=your-api-key-here
Usage
Interactive Mode
f2md
The tool will prompt you for:
- Input file path (PDF, DOCX, or image)
- Output file path
CLI Mode
# Convert with auto-generated output name
f2md document.pdf
# Convert with custom output path
f2md document.pdf output.md
# Extract text from an image (OCR)
f2md screenshot.png
# Extract text from image with custom output
f2md image.jpg output.md
Supported File Types
- PDF (.pdf)
- Word Documents (.docx)
- Images (.png, .jpg, .jpeg, .gif, .webp) - OCR text extraction
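The list above maps naturally onto a small lookup by file extension. The following is a minimal sketch of such a check, not f2md's actual code: the names `InputKind` and `detectInputKind` are hypothetical, and treating extensions as case-insensitive is an assumption.

```typescript
import { extname } from "node:path";

// Hypothetical type for illustration; f2md's internal types may differ.
type InputKind = "pdf" | "docx" | "image";

// Image extensions taken from the supported file types list.
const IMAGE_EXTENSIONS = new Set([".png", ".jpg", ".jpeg", ".gif", ".webp"]);

// Returns the kind of input for a path, or null if the extension is unsupported.
// Assumption: extensions are matched case-insensitively.
function detectInputKind(filePath: string): InputKind | null {
  const ext = extname(filePath).toLowerCase();
  if (ext === ".pdf") return "pdf";
  if (ext === ".docx") return "docx";
  if (IMAGE_EXTENSIONS.has(ext)) return "image";
  return null;
}

console.log(detectInputKind("report.PDF")); // prints "pdf"
console.log(detectInputKind("notes.txt")); // prints null
```

A check like this lets a caller reject unsupported files before any API request is made.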
Options
f2md --help # Show help
f2md --version # Show version
f2md setup # Configure API key
How It Works
For PDF and DOCX files:
- Extraction - Reads the input file and extracts text, images, and layout information
- Processing - For PDFs, captures page screenshots to understand visual layout
- AI Conversion - Sends extracted content to Google's Gemini AI model
- Markdown Generation - Receives AI-generated Markdown with proper formatting
- Cleanup - Removes unused images and saves the final output
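The cleanup step can be pictured as comparing the extracted images against the ones the generated Markdown actually references. The sketch below shows one way this could work, assuming images are referenced with standard `![alt](path)` syntax and matched by file name. `findUnusedImages` is a hypothetical helper, not part of f2md's public API.

```typescript
// Hypothetical helper: returns the image file names that the Markdown never references.
function findUnusedImages(markdown: string, imageNames: string[]): string[] {
  const referenced = new Set<string>();
  // Matches Markdown image links of the form ![alt](path) or ![alt](path "title").
  const pattern = /!\[[^\]]*\]\(([^)\s]+)[^)]*\)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const target = match[1];
    // Compare by base name so "images/chart.png" matches "chart.png".
    referenced.add(target.split("/").pop() ?? target);
  }
  return imageNames.filter((name) => !referenced.has(name));
}

const markdown = "# Report\n\n![Chart](images/chart.png)\n\nSome text.";
console.log(findUnusedImages(markdown, ["chart.png", "logo.png"])); // prints [ 'logo.png' ]
```

The returned names are the candidates for deletion. In the API usage example, the `imagesDeleted` field of the result presumably reports how many were removed.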
For Image files:
- Image Processing - Reads the image file and encodes it for AI processing
- OCR Analysis - Sends the image to Google's Gemini AI with specialized prompts for text extraction
- Text Extraction - AI extracts all visible text while preserving structure (headings, lists, tables)
- Markdown Generation - Converts extracted content to well-formatted Markdown
- Output - Saves the final Markdown file
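The image-processing step says the file is encoded before being sent to the model. A common encoding for this purpose is a base64 data URL. The sketch below assumes that approach, and f2md may encode images differently. `toDataUrl` and `encodeImageFile` are hypothetical names.

```typescript
import { readFile } from "node:fs/promises";
import { extname } from "node:path";

// MIME types for the image extensions listed under Supported File Types.
const MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
};

// Hypothetical helper: turns raw image bytes into a base64 data URL.
function toDataUrl(bytes: Uint8Array, extension: string): string {
  const mime = MIME_TYPES[extension.toLowerCase()];
  if (!mime) throw new Error(`Unsupported image extension: ${extension}`);
  return `data:${mime};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Hypothetical helper: reads an image from disk and encodes it.
async function encodeImageFile(filePath: string): Promise<string> {
  const bytes = await readFile(filePath);
  return toDataUrl(bytes, extname(filePath));
}

// The two bytes below are the ASCII characters "hi".
console.log(toDataUrl(new Uint8Array([104, 105]), ".png")); // prints data:image/png;base64,aGk=
```

Keeping the byte-to-string conversion separate from the file read makes the encoding easy to check without touching the file system.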
Development
Prerequisites
- Bun installed
Setup
# Clone the repository
git clone <repo-url>
cd f2md
# Install dependencies
bun install
# Run in development mode
bun run dev
Build
bun run build
Project Structure
src/
  cli.ts - CLI entry point with clack prompts
  convert.ts - Core conversion logic
  index.ts - Public API exports
dist/ - Built output (generated)
API Usage
You can also use this as a library in your Node.js/Bun projects:
import { convert } from "f2md";

const result = await convert("input.pdf", "output.md", {
  onProgress: (message) => console.log(message),
  respectPages: false,
});

console.log(`Saved to: ${result.outputPath}`);
console.log(`Images saved: ${result.imagesSaved}`);
console.log(`Images cleaned: ${result.imagesDeleted}`);
License
MIT
