paddleocr-skills
v1.2.1
PaddleOCR Skills - Install PP-OCRv5 and PaddleOCR-VL 1.5 for Claude Code
PaddleOCR Skills
One-command installer to add PaddleOCR Skills to your Claude Code project.
Quick Start
Install PaddleOCR Skills in your Claude Code project with a single command:
npx paddleocr-skills
The installer will:
- Prompt you to select skills (PP-OCRv5 and/or PaddleOCR-VL)
- Copy skill files to your project
- Install Python dependencies
- Guide you through API configuration
- Verify the installation
What's Included
PP-OCRv5 (Text OCR)
- Fast text recognition for images and documents
- Adaptive quality modes (auto/fast/quality)
- Supports URLs and local files
- Confidence scoring and quality metrics
PaddleOCR-VL (Document Parsing)
- Advanced document structure analysis
- Table, formula, and chart recognition
- Layout detection (headers, footers, page numbers)
- Complete document parsing with reading order
Prerequisites
- Node.js: 14.0.0 or higher
- Python: 3.7 or higher
- Claude Code: Installed and configured
- API Access: Get your API credentials at Baidu AI Studio
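The Node.js and Python version requirements above can be checked before running the installer. The following is a minimal sketch, not part of the package: the helper names (parse_version, node_version, check_prerequisites) are hypothetical, and it assumes the node executable is discoverable on PATH.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions taken from the Prerequisites list above.
MIN_PYTHON = (3, 7)
MIN_NODE = (14, 0, 0)


def parse_version(text):
    """Turn a string such as 'v18.17.1' into a tuple of ints, e.g. (18, 17, 1)."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in {!r}".format(text))
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def node_version():
    """Return the installed Node.js version as a tuple, or None if node is not on PATH."""
    node = shutil.which("node")
    if node is None:
        return None
    result = subprocess.run(
        [node, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_version(result.stdout)


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7 or higher is required")
    found = node_version()
    if found is None:
        problems.append("Node.js was not found on PATH")
    elif found < MIN_NODE:
        problems.append("Node.js 14.0.0 or higher is required")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The script prints nothing when both requirements are met. It does not check for Claude Code or API credentials, which have to be verified separately.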
Installation
Interactive Mode (Recommended)
npx paddleocr-skills
The installer will guide you through:
- Skill selection
- Python dependency installation
- API configuration
Skip Configuration
If you want to configure later:
npx paddleocr-skills
# Choose "No" when asked about configuration
Then configure manually:
# For PP-OCRv5
python scripts/ppocrv5/configure.py
# For PaddleOCR-VL
python scripts/paddleocr-vl-1.5/configure.py
Usage
After installation, use the skills in your Claude Code session:
PP-OCRv5 Example
# Extract text from an image
python scripts/ppocrv5/ocr_caller.py --file-url "https://example.com/image.jpg" --pretty
# Save result to file
python scripts/ppocrv5/ocr_caller.py --file-path "document.pdf" --output result.json --pretty
PaddleOCR-VL Example
# Parse a complex document
python scripts/paddleocr-vl-1.5/vl_caller.py --file-url "https://example.com/paper.pdf" --pretty
# Save result to file
python scripts/paddleocr-vl-1.5/vl_caller.py --file-path "invoice.pdf" --output result.json --pretty
Project Structure
After installation, your project will have:
your-project/
├── skills/
│   ├── ppocrv5/
│   │   └── SKILL.md
│   └── paddleocr-vl-1.5/
│       └── SKILL.md
├── scripts/
│   ├── ppocrv5/
│   │   ├── ocr_caller.py
│   │   ├── configure.py
│   │   ├── smoke_test.py
│   │   └── requirements.txt
│   └── paddleocr-vl-1.5/
│       ├── vl_caller.py
│       ├── configure.py
│       ├── smoke_test.py
│       └── requirements.txt
├── references/
│   ├── ppocrv5/
│   └── paddleocr-vl-1.5/
└── .env.example
Configuration
Manual Configuration
If you skipped auto-configuration, create a .env file:
# Copy the example file
cp .env.example .env
# Edit with your credentials
nano .env
Add your API credentials:
# PP-OCRv5
API_URL=https://your-api-url.aistudio-app.com/ocr
TOKEN=your-token-here
# PaddleOCR-VL
VL_API_URL=https://your-vl-api-url.com/v1
VL_TOKEN=your-vl-token-here
Using Configuration Scripts
# Configure PP-OCRv5
python scripts/ppocrv5/configure.py --api-url "YOUR_URL" --token "YOUR_TOKEN"
# Configure PaddleOCR-VL
python scripts/paddleocr-vl-1.5/configure.py --api-url "YOUR_URL" --token "YOUR_TOKEN"
Verification
Test your installation:
# Test PP-OCRv5
python scripts/ppocrv5/smoke_test.py
# Test PaddleOCR-VL
python scripts/paddleocr-vl-1.5/smoke_test.py
Troubleshooting
Python Not Found
Ensure Python is in your PATH:
python --version
If not found, install Python 3.7+ from python.org.
API Configuration Error
Get your API credentials:
- Visit Baidu AI Studio
- Create a new task or use an existing one
- Copy the API URL and TOKEN
- Run the configuration script
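When a configuration error persists after these steps, it can help to confirm what the .env file actually contains. The sketch below is an illustration rather than the package's own logic: it assumes .env uses plain KEY=VALUE lines, takes the key names from the example in the Manual Configuration section, and treats any value containing "your-" as an unedited placeholder. The function names are hypothetical.

```python
from pathlib import Path

# Key names as they appear in the .env example in this README.
REQUIRED_KEYS = {
    "ppocrv5": ("API_URL", "TOKEN"),
    "paddleocr-vl": ("VL_API_URL", "VL_TOKEN"),
}


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def find_problems(values, skill):
    """List keys for one skill that are missing, empty, or still hold a placeholder."""
    problems = []
    for key in REQUIRED_KEYS[skill]:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif "your-" in value:  # heuristic: matches the placeholder text in .env.example
            problems.append(f"{key} still holds a placeholder value")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text(encoding="utf-8")) if env_path.exists() else {}
    for skill in REQUIRED_KEYS:
        for problem in find_problems(values, skill):
            print(f"[{skill}] {problem}")
```

Run it from the project root. If only one skill was installed, problems reported for the other skill's keys can be ignored.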
Permission Denied
On Windows, run your terminal as Administrator if you encounter permission errors.
Documentation
Each skill includes comprehensive documentation:
- skills/ppocrv5/SKILL.md: PP-OCRv5 usage guide
- skills/paddleocr-vl-1.5/SKILL.md: PaddleOCR-VL usage guide
- references/: Technical reference documentation
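To confirm that the documentation and scripts were copied into a project, the paths from the Project Structure section can be checked directly. This is a rough sketch under the assumption that the layout matches that section exactly; missing_files is a hypothetical helper, not something the installer provides.

```python
from pathlib import Path

# Paths copied from the project structure shown in this README.
EXPECTED_FILES = {
    "ppocrv5": [
        "skills/ppocrv5/SKILL.md",
        "scripts/ppocrv5/ocr_caller.py",
        "scripts/ppocrv5/configure.py",
        "scripts/ppocrv5/smoke_test.py",
        "scripts/ppocrv5/requirements.txt",
    ],
    "paddleocr-vl-1.5": [
        "skills/paddleocr-vl-1.5/SKILL.md",
        "scripts/paddleocr-vl-1.5/vl_caller.py",
        "scripts/paddleocr-vl-1.5/configure.py",
        "scripts/paddleocr-vl-1.5/smoke_test.py",
        "scripts/paddleocr-vl-1.5/requirements.txt",
    ],
}


def missing_files(project_root, skill):
    """Return the expected paths for one skill that are not present under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_FILES[skill] if not (root / rel).is_file()]


if __name__ == "__main__":
    for skill in EXPECTED_FILES:
        for rel in missing_files(".", skill):
            print(f"[{skill}] missing: {rel}")
```

An empty result for a skill means every file listed for it was found. It says nothing about whether the API credentials are valid, which is what the smoke tests are for.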
License
MIT
Support
For issues and questions:
- GitHub Issues: Report a bug
- Documentation: See the skills/*/SKILL.md files
Credits
Built with:
- PaddleOCR by PaddlePaddle
- Claude Code by Anthropic
