@mnapoli/ocr
v1.0.0
CLI to transcribe text from images using Google Gemini Vision
A minimal CLI to transcribe text from images using Google Gemini Vision.
Usage
npx @mnapoli/ocr image.jpg
npx @mnapoli/ocr page1.jpg page2.png > output.txt
npx @mnapoli/ocr *.jpg | less
Multiple images are transcribed in order, separated by a blank line on stdout.
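The commands above send every transcription to a single stdout stream. If one text file per image is preferred, a small shell wrapper along these lines could work. This is only a sketch: transcribe_each and the OCR_CMD override are hypothetical names introduced here for illustration (OCR_CMD exists so the loop can be tried with a stub command) and are not part of the CLI.

```shell
#!/bin/sh
# Sketch: write one .txt transcription next to each image.
# transcribe_each and OCR_CMD are hypothetical helpers, not CLI features.
transcribe_each() {
  for img in "$@"; do
    # Replace the final extension with .txt (page1.jpg -> page1.txt).
    out="${img%.*}.txt"
    # Defaults to the documented invocation; stop at the first failure.
    ${OCR_CMD:-npx @mnapoli/ocr} "$img" > "$out" || return 1
  done
}
```

Calling transcribe_each page1.jpg page2.png would then be expected to leave page1.txt and page2.txt beside the images, assuming the CLI exits with a non-zero status on failure.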
Requirements
A Google Gemini API key, set as the GEMINI_API_KEY environment variable:
export GEMINI_API_KEY=your_key_here
Get a key at aistudio.google.com.
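In scripts, it may help to check for the key before invoking the CLI, so that a missing variable produces a clear message. The README does not say how the CLI itself behaves without a key, so the guard below is just an illustrative sketch; require_gemini_key is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: fail early when GEMINI_API_KEY is unset or empty.
# require_gemini_key is a hypothetical helper, not part of the CLI.
require_gemini_key() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set; get a key at aistudio.google.com" >&2
    return 1
  fi
}
```

A script could then run require_gemini_key && npx @mnapoli/ocr image.jpg so that the transcription only starts when the key is present.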
Supported formats
JPEG, PNG
Options
| Flag | Description |
|-|-|
| --version, -v | Print version |
| --help, -h | Show usage |
