rivet-plugin-pdf-extractor
v0.2.9
A Rivet plugin that makes it easy to extract text and metadata from PDF files.
PDF Extractor Plugin for Rivet
Features
- Complete text extraction from PDFs
- Metadata extraction (author, title, creation date, etc.)
- Support for different PDF sources:
  - URLs (https://...)
  - Base64 data
  - Local file paths
- Ability to specify particular pages to extract
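As a rough illustration of how the three source kinds listed above might be told apart, here is a minimal TypeScript sketch. The `classifyPdfSource` helper and the `PdfSourceKind` type are hypothetical names introduced for this example. They are not part of the plugin's documented API, and the plugin may detect sources differently.

```typescript
// Hypothetical helper for illustration only; not part of rivet-plugin-pdf-extractor.
type PdfSourceKind = "url" | "base64" | "path";

function classifyPdfSource(source: string): PdfSourceKind {
  const trimmed = source.trim();

  // Remote documents: anything starting with http:// or https://
  if (/^https?:\/\//i.test(trimmed)) return "url";

  // Data URI that carries a base64-encoded PDF
  if (/^data:application\/pdf;base64,/i.test(trimmed)) return "base64";

  // Raw base64 of a PDF usually starts with "JVBERi0",
  // which is the base64 encoding of the "%PDF-" file header.
  if (trimmed.startsWith("JVBERi0")) return "base64";

  // Assumption: everything else is treated as a local file path.
  return "path";
}

console.log(classifyPdfSource("https://example.com/report.pdf")); // "url"
console.log(classifyPdfSource("JVBERi0xLjQK"));                   // "base64"
console.log(classifyPdfSource("./docs/report.pdf"));              // "path"
```

Falling back to "path" keeps the sketch simple; a stricter version could reject strings that match none of the three forms.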
Installation
In Rivet
- Open the plugins panel at the top of the screen
- Search for "rivet-plugin-pdf-extractor"
- Click "Install" to add the plugin to your project
In Code
import * as Rivet from "@ironclad/rivet-core";
import pdfExtractorPlugin from "rivet-plugin-pdf-extractor";
// Register the plugin
Rivet.globalRivetNodeRegistry.registerPlugin(pdfExtractorPlugin(Rivet));
Usage
After installation, you'll find the PDF Extractor node in the "Documents" group.
- Add the PDF Extractor node to your graph
- Configure the PDF source (file path, URL, or base64 data)
- Choose extraction options (text, metadata, specific pages)
- Connect the outputs to other nodes in your graph
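The "specific pages" option in the steps above can be pictured with a small sketch. This README does not document the format the node accepts, so the `"1,3-5"` syntax and the `parsePageSpec` helper below are assumptions made purely for illustration.

```typescript
// Hypothetical helper; the "1,3-5" page syntax is an assumption, not the plugin's documented format.
function parsePageSpec(spec: string, pageCount: number): number[] {
  const pages = new Set<number>();

  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas

    // Accept either a single page ("4") or an inclusive range ("3-5").
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) throw new Error(`Invalid page token: "${token}"`);

    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }

    // Pages beyond the end of the document are silently skipped.
    for (let p = start; p <= Math.min(end, pageCount); p++) {
      pages.add(p);
    }
  }

  // Return unique, 1-based page numbers in ascending order.
  return Array.from(pages).sort((a, b) => a - b);
}

console.log(parsePageSpec("1, 3-5, 4", 10)); // [ 1, 3, 4, 5 ]
```

Using a set removes duplicates when ranges overlap, so each page is extracted only once.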
Development
- Clone the repository
- Install dependencies with: npm install
- Build the plugin with: npm run build
- Start development mode with: npm run dev
Troubleshooting
If you encounter issues with PDF extraction:
- Check that the file exists and is accessible
- For URLs, make sure they're publicly accessible
- For large PDFs, you may need to increase the memory limits in Rivet
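The first two checks above can be run outside Rivet before the node is blamed. The sketch below is one possible way to do that in Node.js 18 or later (where `fetch` is global). The function names `checkLocalPdf` and `checkPdfUrl` are hypothetical and are not provided by the plugin.

```typescript
import { access, stat } from "node:fs/promises";
import { constants } from "node:fs";

// Hypothetical pre-flight checks; not part of rivet-plugin-pdf-extractor.
// Each function resolves to null when no problem is found, or to a message describing the problem.

async function checkLocalPdf(path: string): Promise<string | null> {
  try {
    // Verify that the current process has read permission on the path.
    await access(path, constants.R_OK);
  } catch {
    return `Cannot read file: ${path}`;
  }
  const info = await stat(path);
  return info.isFile() ? null : `Not a regular file: ${path}`;
}

async function checkPdfUrl(url: string): Promise<string | null> {
  try {
    // HEAD avoids downloading the whole document; some servers reject HEAD,
    // in which case a GET request would be needed instead.
    const res = await fetch(url, { method: "HEAD" });
    return res.ok ? null : `URL responded with HTTP ${res.status}`;
  } catch (err) {
    return `URL not reachable: ${(err as Error).message}`;
  }
}
```

A URL that needs cookies or an authorization header will typically fail this check with a 401 or 403 status, which matches the "publicly accessible" requirement above.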
