n8n-nodes-file-metadata
v1.0.10
n8n node for extracting metadata from files (PDF, images, ebooks, archives, office docs, audio, video, markdown) with namespace support
n8n node for extracting metadata from files with namespace support for Qdrant filtering.
Features
- Extract metadata from PDF files
- Extract EXIF data from images
- Process ebooks (EPUB)
- Extract archive information (ZIP)
- Parse Word documents
- Read Excel spreadsheets
- Get audio metadata
- Extract video information
- Parse markdown frontmatter
- NEW: Automatic namespace generation for Qdrant vector store filtering
Namespace Support
This version automatically adds a namespace field to the metadata based on the document title:
- Extracts title or info.Title from document metadata
- Sanitizes the title to create a valid namespace (alphanumeric and underscores only)
- Limits namespace length to 96 characters for Qdrant compatibility
- Enables filtering with queries like:
{
  "must": [
    {
      "key": "metadata.namespace",
      "match": {
        "value": "Writing_521A_Creative_Writing"
      }
    }
  ]
}
Installation
npm install n8n-nodes-file-metadata
Usage
- Add the File Metadata Extractor node to your workflow
- Connect it to your document source
- Configure the binary property name (default: 'data')
- The node will output metadata including the new namespace field
- Use the namespace for filtering in Qdrant vector stores
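The namespace rules and the filter shape described under Namespace Support can be sketched in TypeScript as follows. This is a minimal illustration, not the node's actual source: `toNamespace`, `namespaceFilter`, and `MAX_NAMESPACE_LENGTH` are hypothetical names, and the exact sanitization (here, each run of non-alphanumeric characters becomes a single underscore) is an assumption chosen to reproduce the example value in this README.

```typescript
// Hypothetical sketch: the node's real sanitization rules may differ.
const MAX_NAMESPACE_LENGTH = 96; // length limit stated in this README

// Turn a document title into a namespace made of letters, digits, and underscores.
function toNamespace(title: string): string {
  return title
    .trim()
    .replace(/[^A-Za-z0-9]+/g, "_") // assumption: each run of other characters becomes one "_"
    .slice(0, MAX_NAMESPACE_LENGTH) // enforce the 96-character limit
    .replace(/^_+|_+$/g, "");       // drop leading and trailing underscores
}

// Build a filter object with the same shape as the query shown under Namespace Support.
function namespaceFilter(namespace: string) {
  return {
    must: [{ key: "metadata.namespace", match: { value: namespace } }],
  };
}

const exampleNamespace = toNamespace("Writing 521A Creative Writing");
console.log(JSON.stringify(namespaceFilter(exampleNamespace), null, 2));
```

The resulting object would typically be supplied as the filter of a Qdrant search request, assuming the extracted metadata is stored under a metadata key in each point's payload.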
Example Output
{
  "title": "Writing 521A Creative Writing",
  "author": "John Doe",
  "fileType": "PDF",
  "numberOfPages": 25,
  "namespace": "Writing_521A_Creative_Writing",
  "creationDate": "2024-01-15T10:30:00.000Z"
}
License
MIT
