pdf-compare
v2026.2.3
Compare two PDF files and generate a visual diff report with highlighted differences
PDF-Compare
A powerful tool for comparing PDF files. Generates vector-based side-by-side comparison reports with content-aware highlighting.
Features
- Vector-Based Rendering: Preserves text quality and keeps file sizes small (no image conversion)
- Searchable Output: Generated PDFs maintain searchable, selectable text
- Visual Comparison: Side-by-side view of two PDFs with intelligent page alignment
- Content-Aware Highlighting: Detects text changes based on content, ignoring layout shifts
- Smart Page Alignment: Automatically detects inserted/deleted pages
- Color-Coded Differences:
  - Red: Deleted text (on the original document)
  - Green: Added text (on the modified document)
- Multiple Interfaces: CLI, Programmatic API, TypeScript support
- Cross-Platform: Works on Windows, macOS, and Linux
Installation
npm install pdf-compare
The installation will automatically:
- Detect Python 3.12+ on your system
- Create an isolated virtual environment
- Install required Python dependencies
Prerequisites
Python 3.12+
Windows: Download from python.org and check "Add Python to PATH" during installation.
macOS:
brew install python@3.12
Linux (Ubuntu/Debian):
sudo apt install python3.12 python3.12-venv
Note: No additional dependencies are required. PyMuPDF handles all PDF operations natively.
Quick Start
CLI Usage
# Compare two PDFs
npx pdf-compare original.pdf modified.pdf -o diff.pdf
# Check dependencies
npx pdf-compare --check
# Run setup manually (if automatic setup failed)
npx pdf-compare --setup
# Show help
npx pdf-compare --help
Programmatic API
const pdfCompare = require('pdf-compare');
// Check if dependencies are ready
const status = pdfCompare.checkDependencies();
console.log(status);
// { ready: true, python: true, venv: true, pythonPath: '...' }
// Compare PDFs
async function compare() {
  const result = await pdfCompare.comparePDFs(
    'original.pdf',
    'modified.pdf',
    'report.pdf'
  );
  if (result.pageCount === 0) {
    console.log('No visual differences found');
  } else {
    console.log(`Report generated: ${result.reportPath}`);
    console.log(`Pages with differences: ${result.pageCount}`);
  }
}
compare();
TypeScript
import { comparePDFs, checkDependencies, CompareResult } from 'pdf-compare';
const status = checkDependencies();
if (!status.ready) {
  console.error('Dependencies not configured');
  process.exit(1);
}
const result: CompareResult = await comparePDFs('a.pdf', 'b.pdf', 'diff.pdf');
API Reference
comparePDFs(fileA, fileB, outputPath, options?)
Compare two PDF files and generate a visual diff report.
Parameters:
- fileA (string): Path to the first PDF (Original)
- fileB (string): Path to the second PDF (Modified)
- outputPath (string): Path for the output report
- options (object, optional):
  - timeout (number): Timeout in ms (default: 120000)
  - pythonPath (string): Custom Python path
  - cwd (string): Working directory for Python execution
Returns: Promise<CompareResult>
{
  success: boolean;
  pageCount: number | null;  // 0 if no differences
  reportPath: string | null; // null if no differences
  output: string;
}
comparePDFsFromBuffer(bufferA, bufferB, options?)
Compare PDFs from Uint8Array data (useful for streams/uploads).
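As a rough illustration of the buffer-based flow, the sketch below writes the returned report to disk when one is produced. The helper name `saveDiffFromBuffers` is hypothetical, and the compare function is passed in as an argument so the helper can be tried with a stub. Treating a null `reportBuffer` as "nothing to write" is an assumption based on the result shape documented for this function.

```javascript
const fs = require('node:fs/promises');

// Hypothetical helper, not part of pdf-compare.
// `compareFn` is expected to behave like comparePDFsFromBuffer.
async function saveDiffFromBuffers(compareFn, bufferA, bufferB, outputPath) {
  // `timeout` is one of the documented options (milliseconds).
  const result = await compareFn(bufferA, bufferB, { timeout: 60000 });

  // Assumption: a failed run or a null reportBuffer means there is no report to save.
  if (!result.success || result.reportBuffer === null) {
    return { written: false, pageCount: result.pageCount };
  }

  await fs.writeFile(outputPath, result.reportBuffer);
  return { written: true, pageCount: result.pageCount };
}
```

With the package installed, `compareFn` would be `require('pdf-compare').comparePDFsFromBuffer`, and the inputs could come from `fs.readFile`, since a Node.js `Buffer` is a `Uint8Array`.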
Parameters:
- bufferA (Uint8Array): First PDF as Uint8Array
- bufferB (Uint8Array): Second PDF as Uint8Array
- options (object, optional): Same as comparePDFs
Returns: Promise<CompareBufferResult>
{
  success: boolean;
  pageCount: number | null;
  reportBuffer: Uint8Array | null;
  output: string;
}
checkDependencies()
Check if all dependencies are installed and configured.
Returns: DependencyStatus
{
  ready: boolean;
  python: boolean;
  venv: boolean;
  pythonPath: string | null;
}
runSetup(options?)
Manually run the Python environment setup.
Parameters:
- options (object, optional):
  - force (boolean): Force reinstall
  - quiet (boolean): Suppress output
Returns: Promise<SetupResult>
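One possible way to combine checkDependencies() and runSetup() is a small "ensure ready" step run before any comparison. The helper name `ensureReady` is hypothetical, and `api` stands in for `require('pdf-compare')`. Because the fields of SetupResult are not listed here, the sketch re-checks the dependency status after setup instead of inspecting the setup result.

```javascript
// Hypothetical helper, not part of pdf-compare.
// `api` is expected to expose checkDependencies() and runSetup(options).
async function ensureReady(api) {
  let status = api.checkDependencies();
  if (status.ready) {
    return status;
  }

  // `quiet` is a documented runSetup option; the SetupResult shape is not
  // documented here, so its return value is deliberately ignored.
  await api.runSetup({ quiet: true });

  status = api.checkDependencies();
  if (!status.ready) {
    throw new Error(
      `pdf-compare setup incomplete (python: ${status.python}, venv: ${status.venv})`
    );
  }
  return status;
}
```

In an application this would typically be awaited once at startup, for example `await ensureReady(require('pdf-compare'))`.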
getVersion()
Get the package version.
Returns: string
Environment Variables
PDF_COMPARE_SKIP_SETUP=1: Skip automatic setup during npm install
How It Works
- Text Extraction: Extracts text and layout information from each page using PyMuPDF
- Similarity Scoring: Calculates similarity between pages using sequence matching
- Smart Alignment: Detects insertions, deletions, and shifts between documents
- Vector-Based Report: Creates a new PDF that preserves the original vector content
- Visual Highlighting: Adds vector-based highlights over text differences (no rasterization)
- Optimized Output: Maintains searchable text and small file sizes
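To make the similarity-scoring step more concrete, here is a minimal sketch of a sequence-matching score between the text of two pages. It is not the package's implementation, which runs in Python. The function names are illustrative, and using the longest common subsequence of words as the match count is an assumption.

```javascript
// Length of the longest common subsequence of two token arrays
// (classic dynamic programming, keeping only one previous row).
function lcsLength(a, b) {
  let prev = new Array(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const curr = [0];
    for (let j = 1; j <= b.length; j++) {
      curr[j] = a[i - 1] === b[j - 1]
        ? prev[j - 1] + 1
        : Math.max(prev[j], curr[j - 1]);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]: twice the matched words divided by the total word count.
function pageSimilarity(textA, textB) {
  const a = textA.split(/\s+/).filter(Boolean);
  const b = textB.split(/\s+/).filter(Boolean);
  if (a.length + b.length === 0) {
    return 1; // two empty pages are treated as identical
  }
  return (2 * lcsLength(a, b)) / (a.length + b.length);
}

console.log(pageSimilarity('the quick brown fox', 'the quick red fox')); // 0.75
```

A score of this kind depends on word order and content rather than on where the words sit on the page, which is in line with the content-aware behaviour described above.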
Example: Inserted Page
If you insert a page in the middle of a document:
- The inserted page is shown with a blank page on the left, labeled "Added"
- Subsequent pages are correctly aligned and labeled as "Shifted"
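The inserted-page case can be sketched as an alignment over page texts. This is an illustrative approximation, not the package's algorithm. The names `wordOverlap` and `alignPages`, the 0.5 threshold, and the lowercase labels `matched` and `removed` are assumptions; `added` and `shifted` mirror the labels mentioned above. To keep the block self-contained it uses a simple word-overlap score instead of a full sequence matcher.

```javascript
// Jaccard overlap of the word sets of two pages: a cheap stand-in
// for a real similarity score.
function wordOverlap(textA, textB) {
  const a = new Set(textA.split(/\s+/).filter(Boolean));
  const b = new Set(textB.split(/\s+/).filter(Boolean));
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const word of a) if (b.has(word)) shared++;
  return shared / (a.size + b.size - shared);
}

function alignPages(original, modified, threshold = 0.5) {
  const n = original.length;
  const m = modified.length;
  const similar = (i, j) => wordOverlap(original[i], modified[j]) >= threshold;

  // lcs[i][j]: matched page pairs obtainable from original[i..] and
  // modified[j..], where similar pages are always paired.
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = similar(i, j)
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting one row per side-by-side pair.
  const rows = [];
  let i = 0;
  let j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && similar(i, j)) {
      rows.push({ left: i, right: j, label: i === j ? 'matched' : 'shifted' });
      i++;
      j++;
    } else if (j < m && (i === n || lcs[i][j + 1] >= lcs[i + 1][j])) {
      rows.push({ left: null, right: j, label: 'added' }); // blank page on the left
      j++;
    } else {
      rows.push({ left: i, right: null, label: 'removed' }); // blank page on the right
      i++;
    }
  }
  return rows;
}

const originalPages = ['alpha beta gamma', 'delta epsilon zeta', 'eta theta iota'];
const modifiedPages = ['alpha beta gamma', 'new inserted page', 'delta epsilon zeta', 'eta theta iota'];
console.log(alignPages(originalPages, modifiedPages).map((row) => row.label).join(', '));
// matched, added, shifted, shifted
```

Each row holds page indexes for the left and right side, with `null` standing for the blank page shown opposite an added or removed page.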
Project Structure
pdf-compare/
├── lib/
│ ├── index.js # Main API exports
│ ├── cli.js # CLI entry point
│ ├── python-bridge.js # Python subprocess handling
│ └── setup.js # Environment setup
├── python/
│ └── requirements.txt # Python dependencies (py-pdf-compare)
├── scripts/
│ └── postinstall.js # Auto-setup on npm install
├── sample-files/ # Test PDFs for development
│ ├── original.pdf
│ ├── modified.pdf
│ ├── modified_extra_page.pdf
│ └── modified_removed_page.pdf
├── types/
│ └── index.d.ts # TypeScript definitions
└── package.json # npm configuration
Troubleshooting
Setup failed: Python not found
Ensure Python 3.12+ is installed and in your PATH:
python --version # or python3 --version
After installing Python, run:
npx pdf-compare --setup
Dependencies not found
Verify Python dependencies are installed:
npx pdf-compare --check
If needed, reinstall:
npx pdf-compare --setup
Permission errors on Linux/macOS
The virtual environment is created in node_modules/pdf-compare/.venv. Ensure you have write permissions.
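A quick way to confirm write access before re-running setup is to probe the package directory from Node.js. This is a small diagnostic sketch: `canWrite` is a hypothetical helper, and the path assumes the command is run from the project root that contains node_modules.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Returns true if the current user may write to `dir`, false otherwise
// (including when the directory does not exist).
function canWrite(dir) {
  try {
    fs.accessSync(dir, fs.constants.W_OK);
    return true;
  } catch {
    return false;
  }
}

// Assumption: the current working directory is the project root.
const packageDir = path.join(process.cwd(), 'node_modules', 'pdf-compare');
console.log(`${packageDir} writable: ${canWrite(packageDir)}`);
```

If this reports `false`, fixing ownership or permissions on that directory and then running the setup command again is the likely next step.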
Skipping automatic setup in CI
Set the environment variable:
PDF_COMPARE_SKIP_SETUP=1 npm install
Development
From Source
git clone https://github.com/grananda/PDF-Compare.git
cd PDF-Compare
npm install
npm scripts:
npm run check # Verify dependencies are installed
npm run setup # Run Python environment setup manually
npm test # Compare sample files (quick test)
npm run compare -- a.pdf b.pdf -o output.pdf # Compare any PDFs
Sample files included for testing:
- sample-files/original.pdf - Base document
- sample-files/modified.pdf - Document with text changes
- sample-files/modified_extra_page.pdf - Document with added page
- sample-files/modified_removed_page.pdf - Document with removed page
License
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
For issues, questions, or contributions, visit: https://github.com/grananda/PDF-Compare
