@robin7331/papyrus-cli
v0.1.11
Published
Convert PDF to markdown or text with the OpenAI Agents SDK
Maintainers
Readme
Installation
Install globally:
npm i -g @robin7331/papyrus-cli
papyrus --helpUsage
# Show installed CLI version
papyrus --version
# List available models for the current API key
papyrus --models
# Single file (default behavior; if no API key is found, Papyrus prompts you to paste one)
papyrus ./path/to/input.pdf
# Single file with explicit output extension/output/model
papyrus ./path/to/input.pdf --format md --output ./out/result.md --model gpt-4o-mini
# Default conversion with extra instructions
papyrus ./path/to/input.pdf --instructions "Prioritize table accuracy." --format txt
# Prompt conversion (inline prompt)
papyrus ./path/to/input.pdf --prompt "Extract all invoice line items as bullet points." --format md
# Prompt conversion (prompt file)
papyrus ./path/to/input.pdf --prompt-file ./my-prompt.txt --format txt
# Folder mode (recursive scan, asks for confirmation)
papyrus ./path/to/folder
# Folder mode with explicit concurrency and output directory
papyrus ./path/to/folder --concurrency 4 --output ./out
# Folder mode without confirmation prompt
papyrus ./path/to/folder --yesAPI Key Setup
Papyrus requires OPENAI_API_KEY.
If no API key is found in your environment or local config, Papyrus will prompt you interactively to paste one, and can save it for future runs.
macOS/Linux (persistent):
echo 'export OPENAI_API_KEY="your_api_key_here"' >> ~/.zshrc
source ~/.zshrcPowerShell (persistent):
setx OPENAI_API_KEY "your_api_key_here"
# restart PowerShell after running setxOne-off execution:
OPENAI_API_KEY="your_api_key_here" papyrus ./path/to/input.pdfPapyrus config commands (optional, local persistent storage in ~/.config/papyrus/config.json):
papyrus config init
papyrus config show
papyrus config clearArguments Reference
[input]
Path to a single PDF file or a folder containing PDFs (processed recursively).
Required unless you use --models.
Example:
papyrus ./docs/invoice.pdf-v, --version
Print the installed Papyrus CLI version.
Example:
papyrus --version--format <format>
Output file extension override. Any extension is allowed (for example md, txt, csv, json).
This flag controls the output filename extension only.
When provided, Papyrus also passes the extension as a guidance hint to the model.
Example:
papyrus ./docs/invoice.pdf --format csv-o, --output <path>
Output destination.
- Single file input: output file path.
- Folder input: output directory path (folder structure is mirrored).
Example:
papyrus ./docs --output ./converted--instructions <text>
Additional conversion instructions for default conversion behavior. Cannot be combined with --prompt or --prompt-file.
Example:
papyrus ./docs/invoice.pdf --instructions "Keep table columns aligned."--prompt <text>
Inline prompt text for prompt-based conversion. Must be non-empty. Use exactly one of --prompt or --prompt-file.
Example:
papyrus ./docs/invoice.pdf --prompt "Summarize payment terms."--prompt-file <path>
Path to a text file containing the prompt for prompt-based conversion. File must contain non-empty text. Use exactly one of --prompt or --prompt-file.
Example:
papyrus ./docs/invoice.pdf --prompt-file ./my-prompt.txt-m, --model <model>
OpenAI model name used for conversion. Default is gpt-4o-mini.
If the selected model is not available, Papyrus prints the available model IDs before exiting.
Example:
papyrus ./docs/invoice.pdf --model gpt-4.1-mini--models
Lists the available OpenAI model IDs for the current API key and exits.
Example:
papyrus --models--concurrency <n>
Maximum parallel workers for folder input. Must be an integer between 1 and 100. Default is 10.
Example:
papyrus ./docs --concurrency 4-y, --yes
Skips the interactive folder confirmation prompt.
Example:
papyrus ./docs --yesNotes
- In default conversion (without
--prompt/--prompt-file), the model returns structured JSON withformat+content. - Without
--format, output extension follows model-selected content format (.mdor.txt). - With
--format, only the output extension changes. - Single-file input now also shows a live worker lane in TTY while conversion is running.
- Folder input is scanned recursively for
.pdffiles and processed in parallel. - In folder mode,
--outputmust be a directory path and mirrored subfolders are preserved. - OpenAI rate-limit (
429) responses are retried automatically usingRetry-After(when present) plus exponential backoff. - Rate-limit retry tuning is available via environment variables:
PAPYRUS_RATE_LIMIT_MAX_RETRIES(default8)PAPYRUS_RATE_LIMIT_BASE_DELAY_MS(default2000)PAPYRUS_RATE_LIMIT_MAX_DELAY_MS(default120000)
- For scanned PDFs, output quality depends on OCR quality from the model.
Development
npm install
npm run build
npm run dev -- ./path/to/input.pdf
npm testLicense
MIT
