hazo_files
v1.4.3
File management including integration to cloud files
A powerful, modular file management package for Node.js and React applications with support for local filesystem and Google Drive storage. Built with TypeScript for type safety and developer experience.
Features
- Multiple Storage Providers: Local filesystem and Google Drive support out of the box
- Modular Architecture: Easily add custom storage providers
- Unified API: Single consistent interface across all storage providers
- React UI Components: Drop-in FileBrowser component with folder tree, file list, and preview
- Naming Rules System: Visual configurator and utilities for generating consistent file/folder names
- Naming Convention Management: Full CRUD with UI components for managing naming conventions in database
- Extraction Data Management: Track and manage LLM-extracted metadata with merge strategies
- LLM Integration: Built-in support for hazo_llm_api document/image extraction
- Upload + Extract Workflow: Combined service for uploading files with automatic LLM extraction and naming
- File Reference Tracking: Multi-entity file references with orphan detection, soft delete, and lifecycle management
- File Change Detection: xxHash-based content hashing for efficient change detection
- Content Tagging: Optional LLM-based content classification at upload time or on-demand via the content_tag field
- Schema Migrations: Built-in V2/V3 migration utilities for adding reference tracking and content tagging to existing databases
- TypeScript: Full type safety and IntelliSense support
- OAuth Integration: Built-in Google Drive OAuth authentication
- Progress Tracking: Upload/download progress callbacks
- File Validation: Extension filtering and file size limits
- Error Handling: Comprehensive error types and handling
Installation
npm install hazo_files

For React UI components, ensure you have React 18+ installed:

npm install react react-dom

For the NamingRuleConfigurator component (drag-and-drop interface), also install:

npm install @dnd-kit/core @dnd-kit/sortable @dnd-kit/utilities

For database tracking and LLM extraction features (optional):

npm install hazo_connect # Database tracking
npm install hazo_llm_api # LLM document extraction
npm install server-only # Server-side safety (recommended)
# Note: xxhash-wasm is included automatically as a dependency

Tailwind CSS v4 Setup (Required for UI Components)
If you're using Tailwind CSS v4 with the UI components, you must add a @source directive to your CSS file to ensure Tailwind scans the package's files for utility classes.
Add this to your globals.css or main CSS file AFTER the tailwindcss import:
@import "tailwindcss";
/* Required: Enable Tailwind to scan hazo_files package for utility classes */
@source "../node_modules/hazo_files/dist/ui";

Without this directive, Tailwind v4's JIT compiler will not generate CSS for the utility classes used in hazo_files components (such as hover:bg-gray-100, text-sm, and rounded-md), resulting in broken styling.
Note: This is only required for Tailwind v4. Earlier versions of Tailwind automatically scan node_modules and do not need this configuration.
Quick Start
Basic Usage (Server-side)
import { createInitializedFileManager } from 'hazo_files';
// Create and initialize file manager
const fileManager = await createInitializedFileManager({
config: {
provider: 'local',
local: {
basePath: './files',
maxFileSize: 10 * 1024 * 1024, // 10MB
allowedExtensions: ['jpg', 'png', 'pdf', 'txt']
}
}
});
// Create a directory
await fileManager.createDirectory('/documents');
// Upload a file
await fileManager.uploadFile(
'./local-file.pdf',
'/documents/file.pdf',
{
onProgress: (progress, bytes, total) => {
console.log(`Upload progress: ${progress}%`);
}
}
);
// List directory contents
const result = await fileManager.listDirectory('/documents');
if (result.success) {
console.log(result.data);
}
// Download a file
await fileManager.downloadFile('/documents/file.pdf', './downloaded.pdf');

Using Configuration File
Create hazo_files_config.ini in your project root:
[general]
provider = local
[local]
base_path = ./files
max_file_size = 10485760
allowed_extensions = jpg,png,pdf,txt

Then initialize without a config object:
import { createInitializedFileManager } from 'hazo_files';
const fileManager = await createInitializedFileManager();

React UI Component
import { FileBrowser } from 'hazo_files/ui';
import type { FileBrowserAPI } from 'hazo_files/ui';
// Create an API adapter that calls your server endpoints
const api: FileBrowserAPI = {
async listDirectory(path: string) {
const res = await fetch(`/api/files?action=list&path=${path}`);
return res.json();
},
async getFolderTree(path = '/', depth = 3) {
const res = await fetch(`/api/files?action=tree&path=${path}&depth=${depth}`);
return res.json();
},
async uploadFile(file: File, remotePath: string) {
const formData = new FormData();
formData.append('file', file);
formData.append('path', remotePath);
const res = await fetch('/api/files/upload', { method: 'POST', body: formData });
return res.json();
},
// ... implement other methods
};
function MyFileBrowser() {
return (
<FileBrowser
api={api}
initialPath="/"
showPreview={true}
showTree={true}
viewMode="grid"
/>
);
}

Advanced Usage
Google Drive Integration
1. Set up Google Cloud Console
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the Google Drive API
- Create OAuth 2.0 credentials
- Add authorized redirect URIs (e.g., http://localhost:3000/api/auth/callback/google)
2. Configure Environment Variables
Create .env.local:
HAZO_GOOGLE_DRIVE_CLIENT_ID=your-client-id.apps.googleusercontent.com
HAZO_GOOGLE_DRIVE_CLIENT_SECRET=your-client-secret
HAZO_GOOGLE_DRIVE_REDIRECT_URI=http://localhost:3000/api/auth/callback/google

3. Configure hazo_files
[general]
provider = google_drive
[google_drive]
client_id =
client_secret =
redirect_uri = http://localhost:3000/api/auth/callback/google
refresh_token =

Environment variables will automatically override empty values.
4. Implement OAuth Flow
import { createFileManager, GoogleDriveModule } from 'hazo_files';
// Initialize with Google Drive
const fileManager = createFileManager({
config: {
provider: 'google_drive',
google_drive: {
clientId: process.env.HAZO_GOOGLE_DRIVE_CLIENT_ID!,
clientSecret: process.env.HAZO_GOOGLE_DRIVE_CLIENT_SECRET!,
redirectUri: process.env.HAZO_GOOGLE_DRIVE_REDIRECT_URI!,
}
}
});
await fileManager.initialize();
// Get the Google Drive module to access auth methods
const module = fileManager.getModule() as GoogleDriveModule;
const auth = module.getAuth();
// Generate auth URL
const authUrl = auth.getAuthUrl();
console.log('Visit:', authUrl);
// After user authorizes, exchange code for tokens
const tokens = await auth.exchangeCodeForTokens(authCode);
// Authenticate the module
await module.authenticate(tokens);
// Now you can use the file manager
await fileManager.createDirectory('/MyFolder');

Next.js API Route Example
// app/api/files/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { createInitializedFileManager } from 'hazo_files';
async function getFileManager() {
return createInitializedFileManager({
config: {
provider: 'local',
local: {
basePath: process.env.LOCAL_STORAGE_BASE_PATH || './files',
}
}
});
}
export async function GET(request: NextRequest) {
const { searchParams } = new URL(request.url);
const action = searchParams.get('action');
const path = searchParams.get('path') || '/';
const fm = await getFileManager();
switch (action) {
case 'list':
return NextResponse.json(await fm.listDirectory(path));
case 'tree': {
const depth = parseInt(searchParams.get('depth') || '3', 10);
return NextResponse.json(await fm.getFolderTree(path, depth));
}
default:
return NextResponse.json({ success: false, error: 'Invalid action' });
}
}
export async function POST(request: NextRequest) {
const body = await request.json();
const { action, ...params } = body;
const fm = await getFileManager();
switch (action) {
case 'createDirectory':
return NextResponse.json(await fm.createDirectory(params.path));
case 'deleteFile':
return NextResponse.json(await fm.deleteFile(params.path));
case 'renameFile':
return NextResponse.json(await fm.renameFile(params.path, params.newName));
default:
return NextResponse.json({ success: false, error: 'Invalid action' });
}
}

File Upload API Route
// app/api/files/upload/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { createInitializedFileManager } from 'hazo_files';
export async function POST(request: NextRequest) {
const formData = await request.formData();
const file = formData.get('file') as File;
const path = formData.get('path') as string;
// Reuse the getFileManager() helper defined in the previous route example
const fm = await getFileManager();
// Convert File to Buffer
const arrayBuffer = await file.arrayBuffer();
const buffer = Buffer.from(arrayBuffer);
const result = await fm.uploadFile(buffer, path);
return NextResponse.json(result);
}

Progress Tracking
// Upload with progress tracking
await fileManager.uploadFile(
'./large-file.zip',
'/uploads/large-file.zip',
{
onProgress: (progress, bytesTransferred, totalBytes) => {
console.log(`Progress: ${progress.toFixed(2)}%`);
console.log(`${bytesTransferred} / ${totalBytes} bytes`);
}
}
);
// Download with progress tracking
await fileManager.downloadFile(
'/uploads/large-file.zip',
'./downloaded-file.zip',
{
onProgress: (progress, bytesTransferred, totalBytes) => {
console.log(`Download: ${progress.toFixed(2)}%`);
}
}
);

File Operations
// Create directory structure
await fileManager.createDirectory('/projects/2024/docs');
// Upload file
const uploadResult = await fileManager.uploadFile(
buffer,
'/projects/2024/docs/report.pdf'
);
// Move file
await fileManager.moveItem(
'/projects/2024/docs/report.pdf',
'/archive/2024/report.pdf'
);
// Rename file
await fileManager.renameFile(
'/archive/2024/report.pdf',
'annual-report.pdf'
);
// Copy file (convenience method)
await fileManager.copyFile(
'/archive/2024/annual-report.pdf',
'/backup/annual-report.pdf'
);
// Delete file
await fileManager.deleteFile('/backup/annual-report.pdf');
// Remove directory (recursive)
await fileManager.removeDirectory('/archive/2024', true);
// Check if file exists
const exists = await fileManager.exists('/projects/2024/docs');
// Get file/folder information
const itemResult = await fileManager.getItem('/projects/2024/docs/report.pdf');
if (itemResult.success && itemResult.data) {
console.log('File:', itemResult.data.name);
console.log('Size:', itemResult.data.size);
console.log('Modified:', itemResult.data.modifiedAt);
}
// List directory with options
const listResult = await fileManager.listDirectory('/projects', {
recursive: true,
includeHidden: false,
filter: (item) => !item.isDirectory && item.name.endsWith('.pdf')
});

Working with Text Files
// Write text file
await fileManager.writeFile('/notes/readme.txt', 'Hello, World!');
// Read text file
const readResult = await fileManager.readFile('/notes/readme.txt');
if (readResult.success) {
console.log(readResult.data); // "Hello, World!"
}

Folder Tree
// Get folder tree (3 levels deep by default)
const treeResult = await fileManager.getFolderTree('/projects', 3);
if (treeResult.success && treeResult.data) {
console.log(JSON.stringify(treeResult.data, null, 2));
}
// Output:
// [
// {
// "id": "abc123",
// "name": "2024",
// "path": "/projects/2024",
// "children": [
// {
// "id": "def456",
// "name": "docs",
// "path": "/projects/2024/docs",
// "children": []
// }
// ]
// }
// ]

Configuration
Configuration File (hazo_files_config.ini)
[general]
provider = local
[local]
base_path = ./files
allowed_extensions = jpg,png,pdf,txt,doc,docx
max_file_size = 10485760
[google_drive]
client_id = your-client-id.apps.googleusercontent.com
client_secret = your-client-secret
redirect_uri = http://localhost:3000/api/auth/callback/google
refresh_token =
access_token =
root_folder_id =
[naming]
; Supported date format tokens for naming rules
date_formats = YYYY,YY,MM,M,DD,D,MMM,MMMM,YYYY-MM-DD,YYYY-MMM-DD,DD-MM-YYYY,MM-DD-YYYY

Environment Variables
The following environment variables can override configuration file values:
- HAZO_GOOGLE_DRIVE_CLIENT_ID
- HAZO_GOOGLE_DRIVE_CLIENT_SECRET
- HAZO_GOOGLE_DRIVE_REDIRECT_URI
- HAZO_GOOGLE_DRIVE_REFRESH_TOKEN
- HAZO_GOOGLE_DRIVE_ACCESS_TOKEN
- HAZO_GOOGLE_DRIVE_ROOT_FOLDER_ID
Configuration via Code
import { createInitializedFileManager } from 'hazo_files';
const fileManager = await createInitializedFileManager({
config: {
provider: 'local',
local: {
basePath: './storage',
allowedExtensions: ['jpg', 'png', 'gif', 'pdf'],
maxFileSize: 5 * 1024 * 1024 // 5MB
}
}
});

UI Components
FileBrowser Component
The FileBrowser is a complete, drop-in file management UI with:
- Folder tree navigation
- File list (grid or list view)
- Breadcrumb navigation
- File preview (images, text, PDFs)
- Context menus and actions
- Upload, download, rename, delete operations
- Drag-and-drop file moving between folders
import { FileBrowser } from 'hazo_files/ui';
<FileBrowser
api={api}
initialPath="/"
showPreview={true}
showTree={true}
viewMode="grid"
treeWidth={250}
previewHeight={300}
onError={(error) => console.error(error)}
onNavigate={(path) => console.log('Navigated to:', path)}
onSelect={(item) => console.log('Selected:', item)}
/>

Drag-and-Drop File Moving
The FileBrowser includes built-in drag-and-drop functionality for moving files and folders:
Features:
- Drag files/folders from the file list
- Drop onto folders in the sidebar tree or main file list
- Visual feedback with opacity and colored borders during drag
- Prevents invalid operations (dropping on self, into current parent, folder into descendant)
- Shows dragged item preview during drag operation
How to use:
- Click and hold on any file or folder in the file list
- Drag it over a folder in either the tree sidebar or file list
- Valid drop targets show a green ring/background
- Release to move the item to the new location
Technical requirements:
- Requires the @dnd-kit/core peer dependency (already included for NamingRuleConfigurator)
- API must implement the moveItem(sourcePath, destinationPath) method
- Automatically validates drop targets to prevent invalid moves
Visual feedback:
- Dragging: Item becomes semi-transparent (opacity-50)
- Valid drop target: Green ring (ring-2 ring-green-500) and background (bg-green-50)
- Drag preview: Shows file/folder icon and name following the cursor
ID patterns used:
- File items: file-item-{path} (draggable)
- Folder tree drops: folder-drop-tree-{path} (droppable)
- Folder list drops: folder-drop-list-{path} (droppable)
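The invalid-move rules above (no dropping onto self, into the current parent, or a folder into its own descendant) can be sketched as a pure path check. Note that isValidDropTarget is a hypothetical helper for illustration, not part of the hazo_files API:

```typescript
// Sketch of the drop-target validation rules described above.
// Assumes POSIX-style paths ('/a/b/c') as used throughout this README.
function isValidDropTarget(sourcePath: string, targetFolder: string): boolean {
  // Dropping an item onto itself is invalid
  if (sourcePath === targetFolder) return false;
  // Dropping into the folder the item is already in is a no-op
  const parent = sourcePath.slice(0, sourcePath.lastIndexOf('/')) || '/';
  if (parent === targetFolder) return false;
  // A folder cannot be dropped into its own descendant
  if (targetFolder.startsWith(sourcePath + '/')) return false;
  return true;
}

console.log(isValidDropTarget('/a/file.txt', '/b')); // true
console.log(isValidDropTarget('/a/file.txt', '/a')); // false (current parent)
console.log(isValidDropTarget('/a', '/a/sub'));      // false (own descendant)
```

A valid target would then be forwarded to your API adapter's moveItem(sourcePath, destinationPath) implementation.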
Individual Components
You can also use individual components:
import {
PathBreadcrumb,
FolderTree,
FileList,
FilePreview,
FileActions,
FileInfoPanel
} from 'hazo_files/ui';
// Use individually with your own layout

FileInfoPanel Component
The FileInfoPanel displays file metadata in a structured format and can be used standalone in sidebars, custom dialogs, or inline panels:
import { FileInfoPanel } from 'hazo_files/ui';
// In a sidebar
function Sidebar({ selectedFile, metadata, isLoading }) {
return (
<div className="sidebar p-4">
<h3 className="font-bold mb-4">File Info</h3>
<FileInfoPanel
item={selectedFile}
metadata={metadata}
isLoading={isLoading}
/>
</div>
);
}
// Without custom metadata section
<FileInfoPanel
item={file}
showCustomMetadata={false}
className="bg-gray-50 rounded-lg p-4"
/>
// In a custom dialog
function MyCustomDialog({ file }) {
return (
<dialog>
<FileInfoPanel item={file} showCustomMetadata={false} />
</dialog>
);
}

Props:
- item: FileSystemItem | null - The file or folder to display info for
- metadata?: FileMetadata | null - Additional metadata from database
- isLoading?: boolean - Show loading state for custom metadata
- showCustomMetadata?: boolean - Whether to show the JSON metadata section (default: true)
- className?: string - Additional CSS classes for custom styling
Hooks
import { useFileBrowser, useFileOperations } from 'hazo_files/ui';
function MyCustomFileBrowser() {
const {
currentPath,
files,
tree,
selectedItem,
isLoading,
navigate,
refresh,
selectItem
} = useFileBrowser(api, '/');
const {
createFolder,
uploadFiles,
deleteItem,
renameItem
} = useFileOperations(api, currentPath);
// Build your custom UI
}

Naming Rule Configurator
Build consistent file/folder naming patterns with a visual drag-and-drop interface:
import { NamingRuleConfigurator } from 'hazo_files/ui';
import type { NamingVariable } from 'hazo_files/ui';
function NamingConfig() {
// Define user-specific variables
const userVariables: NamingVariable[] = [
{
variable_name: 'project_name',
description: 'Name of the project',
example_value: 'WebApp',
category: 'user'
},
{
variable_name: 'client_id',
description: 'Client identifier',
example_value: 'ACME',
category: 'user'
},
];
const handleSchemaChange = (schema) => {
console.log('New schema:', schema);
// Save to database or state
};
const handleExport = (schema) => {
// Export as JSON file
const blob = new Blob([JSON.stringify(schema, null, 2)], { type: 'application/json' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'naming-rule.json';
a.click();
};
return (
<NamingRuleConfigurator
variables={userVariables}
onChange={handleSchemaChange}
onExport={handleExport}
sampleFileName="proposal.pdf"
/>
);
}

The configurator provides:
- Category Tabs: User, Date, File, Counter variables
- Drag & Drop: Build patterns by dragging variables into file/folder patterns
- Segment Reordering: Drag segments within patterns to reorder them
- Live Preview: See generated names in real-time with example values
- Undo/Redo: Full history with keyboard shortcuts (Ctrl+Z, Ctrl+Y)
- Import/Export: Save and load naming rules as JSON
- Scrollable Layout: Works in fixed-height containers with scrollable content area
System variables included:
- Date: YYYY, YY, MM, DD, YYYY-MM-DD, MMM, MMMM, etc.
- File: original_name, extension, ext
- Counter: counter (auto-incrementing with padding)
Naming Convention Management Components
Full UI for managing naming conventions stored in the database:
import {
NamingConventionManager,
NamingConventionList,
NamingConventionEditor,
} from 'hazo_files/ui';
// Full management UI (list + editor combined)
<NamingConventionManager
api={namingAPI}
onSelect={(convention) => applyConvention(convention)}
/>
// Or use components separately
<NamingConventionList
api={namingAPI}
selectedId={selectedId}
onSelect={setSelectedId}
onEdit={(id) => openEditor(id)}
onDelete={(id) => confirmDelete(id)}
/>
<NamingConventionEditor
api={namingAPI}
conventionId={editingId}
userVariables={customVariables}
onSave={(convention) => handleSave(convention)}
onCancel={() => closeEditor()}
/>

Naming Rules API
Generate file and folder names programmatically from naming schemas:
import {
hazo_files_generate_file_name,
hazo_files_generate_folder_name,
createVariableSegment,
createLiteralSegment,
type NamingRuleSchema
} from 'hazo_files';
// Create a naming schema
const schema: NamingRuleSchema = {
version: 1,
filePattern: [
createVariableSegment('client_id'),
createLiteralSegment('_'),
createVariableSegment('project_name'),
createLiteralSegment('_'),
createVariableSegment('YYYY-MM-DD'),
createLiteralSegment('_'),
createVariableSegment('counter'),
],
folderPattern: [
createVariableSegment('YYYY'),
createLiteralSegment('/'),
createVariableSegment('client_id'),
createLiteralSegment('/'),
createVariableSegment('project_name'),
],
};
// Define variable values
const variables = {
client_id: 'ACME',
project_name: 'Website',
};
// Generate file name
const fileResult = hazo_files_generate_file_name(
schema,
variables,
'original-document.pdf',
{
counterValue: 42,
preserveExtension: true, // Keep original .pdf extension
date: new Date('2024-12-09'),
}
);
if (fileResult.success) {
console.log(fileResult.name);
// Output: "ACME_Website_2024-12-09_042.pdf"
}
// Generate folder path
const folderResult = hazo_files_generate_folder_name(schema, variables);
if (folderResult.success) {
console.log(folderResult.name);
// Output: "2024/ACME/Website"
}
// Use with FileManager
const uploadPath = `/${folderResult.name}/${fileResult.name}`;
await fileManager.uploadFile(buffer, uploadPath);

Available System Variables
Date Variables (use current date unless overridden):
- YYYY - Full year (2024)
- YY - Two-digit year (24)
- MM - Month with zero padding (01-12)
- M - Month without padding (1-12)
- DD - Day with zero padding (01-31)
- D - Day without padding (1-31)
- MMM - Short month name (Jan, Feb, etc.)
- MMMM - Full month name (January, February, etc.)
- YYYY-MM-DD - ISO date format (2024-01-15)
- YYYY-MMM-DD - Date with month name (2024-Jan-15)
- DD-MM-YYYY - European format (15-01-2024)
- MM-DD-YYYY - US format (01-15-2024)
File Metadata Variables (from original filename):
- original_name - Filename without extension
- extension - File extension with dot (.pdf)
- ext - Extension without dot (pdf)
Counter Variable:
- counter - Auto-incrementing number with zero padding (001, 042, 123)
Parsing Pattern Strings
You can also parse pattern strings directly:
import { parsePatternString, patternToString } from 'hazo_files';
// Parse string to segments
const segments = parsePatternString('{client_id}_{YYYY-MM-DD}_{counter}');
console.log(segments);
// [
// { id: '...', type: 'variable', value: 'client_id' },
// { id: '...', type: 'literal', value: '_' },
// { id: '...', type: 'variable', value: 'YYYY-MM-DD' },
// { id: '...', type: 'literal', value: '_' },
// { id: '...', type: 'variable', value: 'counter' },
// ]
// Convert back to string
const patternStr = patternToString(segments);
// "{client_id}_{YYYY-MM-DD}_{counter}"

Extraction Data Management
Manage LLM-extracted data stored within the file_data JSON field. The system maintains both raw extraction history and merged results.
Data Structure
interface FileDataStructure {
merged_data: Record<string, unknown>; // Combined data from all extractions
raw_data: ExtractionData[]; // Individual extraction entries
}
interface ExtractionData {
id: string; // Unique extraction ID
extracted_at: string; // ISO timestamp
source?: string; // Optional source identifier (e.g., model name)
data: Record<string, unknown>; // The extracted data
}

Using with FileMetadataService
import { FileMetadataService, createFileMetadataService } from 'hazo_files';
// Create service with your CRUD provider
const metadataService = createFileMetadataService(crudService);
// Add an extraction
const extraction = await metadataService.addExtraction(
'/documents/report.pdf',
'local',
{ title: 'Annual Report', author: 'John Doe', pages: 42 },
{ source: 'gpt-4', mergeStrategy: 'shallow' }
);
console.log('Added extraction:', extraction?.id);
// Get merged data (combined from all extractions)
const merged = await metadataService.getMergedData('/documents/report.pdf', 'local');
console.log('Merged data:', merged);
// Get all extractions
const extractions = await metadataService.getExtractions('/documents/report.pdf', 'local');
console.log('All extractions:', extractions);
// Get a specific extraction
const specific = await metadataService.getExtractionById(
'/documents/report.pdf',
'local',
extraction?.id
);
// Remove an extraction (recalculates merged_data by default)
await metadataService.removeExtractionById(
'/documents/report.pdf',
'local',
extraction?.id,
{ recalculateMerged: true, mergeStrategy: 'deep' }
);
// Clear all extractions
await metadataService.clearExtractions('/documents/report.pdf', 'local');

Using Utility Functions Directly
For working with parsed data structures without database operations:
import {
parseFileData,
addExtractionToFileData,
removeExtractionById,
getMergedData,
getExtractions,
deepMerge,
createEmptyFileDataStructure,
} from 'hazo_files';
// Parse existing JSON (auto-migrates old format)
const fileData = parseFileData(existingJsonString);
// Add an extraction (returns new structure, immutable)
const result = addExtractionToFileData(
fileData,
{ category: 'finance', summary: 'Q4 results' },
{ source: 'claude-3', mergeStrategy: 'deep' }
);
if (result.success) {
const newFileData = result.data;
console.log('New merged data:', newFileData.merged_data);
console.log('Extraction count:', newFileData.raw_data.length);
}
// Remove an extraction by ID
const removeResult = removeExtractionById(fileData, 'ext_12345', {
recalculateMerged: true,
mergeStrategy: 'shallow'
});
// Get copies of data
const mergedCopy = getMergedData(fileData);
const extractionsCopy = getExtractions(fileData);

Merge Strategies
Shallow (default): Spreads top-level properties; later values overwrite earlier ones.

// { a: 1, b: 2 } + { b: 3, c: 4 } = { a: 1, b: 3, c: 4 }

Deep: Recursively merges nested objects and concatenates arrays.

// { a: { x: 1 }, arr: [1] } + { a: { y: 2 }, arr: [2] } = { a: { x: 1, y: 2 }, arr: [1, 2] }
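The two strategies can be sketched as follows. These are hypothetical re-implementations to illustrate the semantics, not the deepMerge utility exported by hazo_files:

```typescript
type Obj = Record<string, unknown>;

// Shallow: later top-level keys overwrite earlier ones
function shallowMerge(a: Obj, b: Obj): Obj {
  return { ...a, ...b };
}

// Deep: nested objects recurse, arrays concatenate, scalars overwrite
function deepMergeSketch(a: Obj, b: Obj): Obj {
  const out: Obj = { ...a };
  for (const [key, value] of Object.entries(b)) {
    const existing = out[key];
    if (Array.isArray(existing) && Array.isArray(value)) {
      out[key] = [...existing, ...value];
    } else if (
      existing && typeof existing === 'object' && !Array.isArray(existing) &&
      value && typeof value === 'object' && !Array.isArray(value)
    ) {
      out[key] = deepMergeSketch(existing as Obj, value as Obj);
    } else {
      out[key] = value;
    }
  }
  return out;
}

console.log(shallowMerge({ a: 1, b: 2 }, { b: 3, c: 4 }));
// { a: 1, b: 3, c: 4 }
console.log(deepMergeSketch({ a: { x: 1 }, arr: [1] }, { a: { y: 2 }, arr: [2] }));
// { a: { x: 1, y: 2 }, arr: [1, 2] }
```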
Migration from Old Format
The parseFileData function automatically migrates old plain-object format to the new structure:
// Old format: { title: 'Report', author: 'John' }
// Becomes: { merged_data: { title: 'Report', author: 'John' }, raw_data: [] }

Naming Convention Management
Store and manage naming conventions in your database with full CRUD operations.
NamingConventionService
import { NamingConventionService, HAZO_FILES_NAMING_TABLE_SCHEMA } from 'hazo_files';
import { createCrudService } from 'hazo_connect/server';
// Create CRUD service for naming conventions table
const namingCrud = createCrudService(adapter, HAZO_FILES_NAMING_TABLE_SCHEMA.tableName);
const namingService = new NamingConventionService(namingCrud);
// Create a naming convention
const convention = await namingService.create({
naming_title: 'Tax Documents',
naming_type: 'both', // 'file', 'folder', or 'both'
naming_value: {
version: 1,
filePattern: [
{ id: '1', type: 'variable', value: 'client_id' },
{ id: '2', type: 'literal', value: '_' },
{ id: '3', type: 'variable', value: 'YYYY-MM-DD' },
],
folderPattern: [
{ id: '4', type: 'variable', value: 'YYYY' },
{ id: '5', type: 'literal', value: '/' },
{ id: '6', type: 'variable', value: 'client_id' },
],
},
variables: [
{ variable_name: 'client_id', description: 'Client ID', example_value: 'ACME', category: 'user' }
],
scope_id: 'optional-scope-uuid', // Link to hazo_scopes for organization
});
// Get all conventions
const allConventions = await namingService.list();
// Get parsed conventions (with schema and variables as objects)
const parsed = await namingService.listParsed();
// Get by scope (e.g., for a specific organization)
const scopedConventions = await namingService.getByScope('scope-uuid');
// Update
await namingService.update(convention.id, {
naming_title: 'Updated Tax Documents',
});
// Duplicate
const copy = await namingService.duplicate(convention.id, 'Tax Documents Copy');
// Delete
await namingService.delete(convention.id);

NamingConventionManager UI Component
import { NamingConventionManager } from 'hazo_files/ui';
import type { NamingConventionAPI } from 'hazo_files/ui';
// Create API adapter for your backend
const namingAPI: NamingConventionAPI = {
list: () => fetch('/api/naming-conventions').then(r => r.json()),
create: (input) => fetch('/api/naming-conventions', {
method: 'POST',
body: JSON.stringify(input),
}).then(r => r.json()),
update: (id, input) => fetch(`/api/naming-conventions/${id}`, {
method: 'PATCH',
body: JSON.stringify(input),
}).then(r => r.json()),
delete: (id) => fetch(`/api/naming-conventions/${id}`, {
method: 'DELETE',
}).then(r => r.json()),
};
function NamingConventionsPage() {
return (
<NamingConventionManager
api={namingAPI}
onSelect={(convention) => console.log('Selected:', convention)}
/>
);
}

Upload with LLM Extraction
Combine file uploads with automatic LLM extraction and naming convention application.
UploadExtractService
import {
TrackedFileManager,
NamingConventionService,
LLMExtractionService,
UploadExtractService,
} from 'hazo_files';
import { createLLM } from 'hazo_llm_api';
// Create LLM extraction service
const extractionService = new LLMExtractionService((provider, options) => {
return createLLM({ provider, ...options });
}, 'gemini');
// Create upload + extract service (with optional content tag config)
const uploadExtract = new UploadExtractService(
trackedFileManager,
namingService,
extractionService,
{
content_tag_set_by_llm: true,
content_tag_prompt_area: 'classification',
content_tag_prompt_key: 'classify_document',
content_tag_prompt_return_fieldname: 'document_type',
}
);
// Upload with extraction and naming convention
const result = await uploadExtract.uploadWithExtract(
pdfBuffer,
'quarterly-report.pdf',
{
// Enable LLM extraction
extract: true,
extractionOptions: {
promptArea: 'reports',
promptKey: 'extract_summary',
llmProvider: 'gemini',
},
// Apply naming convention
namingConventionId: 'convention-uuid',
namingVariables: { client_id: 'ACME', project: 'Q4' },
basePath: '/documents',
createFolders: true,
counterValue: 1,
}
);
if (result.success) {
console.log('Uploaded to:', result.generatedPath);
// e.g., '/documents/2024/ACME/ACME_Q4_2024-12-09_001.pdf'
console.log('Extracted data:', result.extraction?.data);
console.log('Content tag:', result.contentTag);
// e.g., 'invoice', 'report', 'contract'
}
// Generate path preview without uploading
const preview = await uploadExtract.generatePath(
'document.pdf',
'convention-uuid',
{ client_id: 'ACME' },
{ basePath: '/docs', counterValue: 5 }
);
console.log('Would upload to:', preview.fullPath);
// Create folder from naming convention
const folderResult = await uploadExtract.createFolderFromConvention(
'convention-uuid',
{ client_id: 'ACME', project: 'Website' },
{ basePath: '/projects' }
);

LLMExtractionService Standalone
import { LLMExtractionService } from 'hazo_files';
const extractionService = new LLMExtractionService(llmFactory, 'gemini');
// Extract from document
const result = await extractionService.extractFromDocument(
pdfBuffer,
'application/pdf',
{
customPrompt: 'Extract all financial figures and dates',
llmProvider: 'qwen',
}
);
// Extract from image
const imageResult = await extractionService.extractFromImage(
imageBuffer,
'image/jpeg',
{
promptArea: 'receipts',
promptKey: 'extract_receipt',
}
);
// Auto-detect based on MIME type
const autoResult = await extractionService.extract(
buffer,
mimeType,
extractionOptions
);

Content Tagging
Automatically classify uploaded files using LLM-based content analysis. The content_tag field stores a classification string (e.g., "invoice", "report", "contract") determined by an LLM prompt.
Configuration
import type { ContentTagConfig } from 'hazo_files';
const contentTagConfig: ContentTagConfig = {
content_tag_set_by_llm: true,
content_tag_prompt_area: 'classification',
content_tag_prompt_key: 'classify_document',
content_tag_prompt_return_fieldname: 'document_type',
content_tag_prompt_variables: { language: 'en' }, // optional
};

Automatic Tagging at Upload
Pass contentTagConfig to UploadExtractService constructor (default for all uploads) or per-upload via options:
// Per-upload override
const result = await uploadExtract.uploadWithExtract(buffer, 'file.pdf', {
basePath: '/docs',
contentTagConfig: {
content_tag_set_by_llm: true,
content_tag_prompt_area: 'classification',
content_tag_prompt_key: 'classify_document',
content_tag_prompt_return_fieldname: 'document_type',
},
});
console.log(result.contentTag); // e.g., 'invoice'

Manual Tagging
Tag existing files by their database record ID:
const tagResult = await uploadExtract.tagFileContent('file-record-id');
if (tagResult.success) {
console.log('Tagged as:', tagResult.data);
}

V3 Database Migration
If you have an existing hazo_files table, run the V3 migration to add the content_tag column:
import { migrateToV3, HAZO_FILES_MIGRATION_V3 } from 'hazo_files';
// Using the migration helper
await migrateToV3(
{ run: (sql) => db.run(sql) },
'sqlite'
);
// Or run statements manually
for (const stmt of HAZO_FILES_MIGRATION_V3.sqlite.alterStatements) {
try { await db.run(stmt); } catch { /* column exists */ }
}

New tables created with HAZO_FILES_TABLE_SCHEMA already include the content_tag column.
File Reference Tracking
Track which entities (form fields, chat messages, etc.) reference each file. Multiple entities can reference the same file, enabling shared files without duplication.
Adding and Removing References
import { TrackedFileManager } from 'hazo_files';
// Upload a file with an initial reference
const result = await trackedManager.uploadFileWithRef(buffer, '/docs/report.pdf', {
scope_id: 'workspace-123',
uploaded_by: 'user-456',
ref: {
entity_type: 'form_field',
entity_id: 'field-789',
created_by: 'user-456',
},
});
// result.data.file_id, result.data.ref_id
// Add another reference to the same file
await trackedManager.addRef(fileId, {
entity_type: 'chat_message',
entity_id: 'msg-abc',
});
// Remove a specific reference
const { remaining_refs } = await trackedManager.removeRef(fileId, refId);
// Get file with status info
const fileStatus = await trackedManager.getFileById(fileId);
// { record, refs: FileRef[], is_orphaned: boolean }
Orphan Detection and Cleanup
// Find files with zero references
const orphans = await trackedManager.findOrphanedFiles({
olderThanMs: 7 * 24 * 60 * 60 * 1000, // 7 days old
scope_id: 'workspace-123',
});
// Clean up orphaned files (delete physical files + DB records)
const { cleaned, errors } = await trackedManager.cleanupOrphanedFiles({
olderThanMs: 30 * 24 * 60 * 60 * 1000,
softDeleteOnly: false, // true to only mark as soft_deleted
});
// Soft-delete a specific file
await trackedManager.softDeleteFile(fileId);
// Verify physical file existence
const exists = await trackedManager.verifyFileExistence(fileId);
Database Migration (Existing Databases)
If you have an existing hazo_files table, run the V2 migration to add reference tracking columns:
import { migrateToV2, backfillV2Defaults, HAZO_FILES_MIGRATION_V2 } from 'hazo_files';
// Using the migration helper
await migrateToV2(
{ run: (sql) => db.exec(sql) }, // SQLite
'sqlite'
);
await backfillV2Defaults({ run: (sql) => db.exec(sql) }, 'sqlite');
// Or run statements manually
for (const stmt of HAZO_FILES_MIGRATION_V2.sqlite.alterStatements) {
try { await db.run(stmt); } catch { /* column exists */ }
}
for (const idx of HAZO_FILES_MIGRATION_V2.sqlite.indexes) {
await db.run(idx);
}
New tables created with HAZO_FILES_TABLE_SCHEMA already include the V2 columns. For the V3 content tagging migration, see Content Tagging above.
Reference Tracking Types
import type {
FileRef, // Individual reference from entity to file
FileMetadataRecordV2, // Extended record with refs, status, scope
FileWithStatus, // Rich view: record + parsed refs + is_orphaned
FileStatus, // 'active' | 'orphaned' | 'soft_deleted' | 'missing'
AddRefOptions, // Options for adding a reference
RemoveRefsCriteria, // Criteria for bulk ref removal
} from 'hazo_files';
File Change Detection
Detect file content changes using fast xxHash hashing.
import { TrackedFileManager, computeFileHash, hasFileContentChanged } from 'hazo_files';
// TrackedFileManager automatically tracks file hashes on upload
const result = await trackedManager.uploadFile(buffer, '/docs/report.pdf', {
skipHash: false, // Hash is computed by default
awaitRecording: true, // Wait for DB record before returning
});
// Check if a file has changed since it was tracked
const hasChanged = await trackedManager.hasFileChanged('/docs/report.pdf');
if (hasChanged) {
console.log('File has been modified since last upload');
}
// Get stored hash and size
const hash = await trackedManager.getStoredHash('/docs/report.pdf');
const size = await trackedManager.getStoredSize('/docs/report.pdf');
// Use hash utilities directly
const fileHash = await computeFileHash(buffer);
const changed = await hasFileContentChanged(oldHash, newBuffer);
Server Entry Point
For server-side applications, use the /server entry point, which exposes a factory function:
import { createHazoFilesServer } from 'hazo_files/server';
const hazoFiles = await createHazoFilesServer({
crudService: fileCrud,
namingCrudService: namingCrud,
config: {
provider: 'local',
local: { basePath: './storage' },
},
enableTracking: true,
llmFactory: (provider) => createLLM({ provider }),
// Optional: enable automatic content tagging for all uploads
defaultContentTagConfig: {
content_tag_set_by_llm: true,
content_tag_prompt_area: 'classification',
content_tag_prompt_key: 'classify_document',
content_tag_prompt_return_fieldname: 'document_type',
},
});
// Access all services
const { fileManager, metadataService, namingService, extractionService, uploadExtractService } = hazoFiles;
API Reference
FileManager
Main service class providing unified file operations.
Methods
- `initialize(config?: HazoFilesConfig): Promise<void>` - Initialize the file manager
- `createDirectory(path: string): Promise<OperationResult<FolderItem>>` - Create directory
- `removeDirectory(path: string, recursive?: boolean): Promise<OperationResult>` - Remove directory
- `uploadFile(source, remotePath, options?): Promise<OperationResult<FileItem>>` - Upload file
- `downloadFile(remotePath, localPath?, options?): Promise<OperationResult<Buffer | string>>` - Download file
- `moveItem(sourcePath, destinationPath, options?): Promise<OperationResult<FileSystemItem>>` - Move file/folder
- `deleteFile(path: string): Promise<OperationResult>` - Delete file
- `renameFile(path, newName, options?): Promise<OperationResult<FileItem>>` - Rename file
- `renameFolder(path, newName, options?): Promise<OperationResult<FolderItem>>` - Rename folder
- `listDirectory(path, options?): Promise<OperationResult<FileSystemItem[]>>` - List directory contents
- `getItem(path: string): Promise<OperationResult<FileSystemItem>>` - Get file/folder info
- `exists(path: string): Promise<boolean>` - Check if file/folder exists
- `getFolderTree(path?, depth?): Promise<OperationResult<TreeNode[]>>` - Get folder tree
- `writeFile(path, content, options?): Promise<OperationResult<FileItem>>` - Write text file
- `readFile(path: string): Promise<OperationResult<string>>` - Read text file
- `copyFile(sourcePath, destinationPath, options?): Promise<OperationResult<FileItem>>` - Copy file
- `ensureDirectory(path: string): Promise<OperationResult<FolderItem>>` - Ensure directory exists
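As a sketch of how these methods compose, here is a round trip (ensure a folder, write a file, read it back) coded against a structural slice of the surface above. The in-memory stand-in and the trimmed return types are illustrative assumptions so the example is self-contained; in real code you would call the package's FileManager instead:

```typescript
// OperationResult as documented below; trimmed return types for illustration only.
interface OperationResult<T = void> {
  success: boolean;
  data?: T;
  error?: string;
}

// Minimal structural slice of the FileManager surface used in this sketch.
interface FileOps {
  ensureDirectory(path: string): Promise<OperationResult<{ path: string }>>;
  writeFile(path: string, content: string): Promise<OperationResult<{ path: string }>>;
  readFile(path: string): Promise<OperationResult<string>>;
  exists(path: string): Promise<boolean>;
}

// A typical call sequence, checking each OperationResult before continuing.
async function roundTrip(fm: FileOps): Promise<string> {
  await fm.ensureDirectory('/notes');
  const written = await fm.writeFile('/notes/hello.txt', 'hi there');
  if (!written.success) throw new Error(written.error);
  const read = await fm.readFile('/notes/hello.txt');
  if (!read.success || read.data === undefined) throw new Error(read.error);
  return read.data;
}

// In-memory stand-in so the sketch runs without any storage backend (illustrative only).
function makeMemoryOps(): FileOps {
  const files = new Map<string, string>();
  const dirs = new Set<string>();
  return {
    async ensureDirectory(path) { dirs.add(path); return { success: true, data: { path } }; },
    async writeFile(path, content) { files.set(path, content); return { success: true, data: { path } }; },
    async readFile(path) {
      const data = files.get(path);
      return data === undefined ? { success: false, error: 'not found' } : { success: true, data };
    },
    async exists(path) { return files.has(path) || dirs.has(path); },
  };
}
```

Because the real FileManager satisfies the same structural interface, the same `roundTrip` function works unchanged against a local or Google Drive backend.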
Types
type StorageProvider = 'local' | 'google_drive';
interface FileItem {
id: string;
name: string;
path: string;
size: number;
mimeType: string;
createdAt: Date;
modifiedAt: Date;
isDirectory: false;
parentId?: string;
metadata?: Record<string, unknown>;
}
interface FolderItem {
id: string;
name: string;
path: string;
createdAt: Date;
modifiedAt: Date;
isDirectory: true;
parentId?: string;
children?: (FileItem | FolderItem)[];
metadata?: Record<string, unknown>;
}
interface OperationResult<T = void> {
success: boolean;
data?: T;
error?: string;
}
interface UploadOptions {
overwrite?: boolean;
onProgress?: (progress: number, bytesTransferred: number, totalBytes: number) => void;
metadata?: Record<string, unknown>;
}
See src/types/index.ts for complete type definitions.
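Since every method returns an OperationResult, callers either branch on success at each call site or centralize the check. A small helper like the following (an illustrative pattern, not part of the package API) turns a data-carrying result into a plain value or a thrown Error:

```typescript
// OperationResult as documented above; in real code, import it from 'hazo_files'.
interface OperationResult<T = void> {
  success: boolean;
  data?: T;
  error?: string;
}

// unwrapData: return result.data on success, otherwise throw.
// Only meaningful for results that actually carry data (T not void).
function unwrapData<T>(result: OperationResult<T>): T {
  if (!result.success || result.data === undefined) {
    throw new Error(result.error ?? 'operation failed');
  }
  return result.data;
}

// Usage sketch: const content = unwrapData(await fileManager.readFile('/notes/a.txt'));
```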
Error Handling
hazo_files provides comprehensive error types:
import {
FileNotFoundError,
DirectoryNotFoundError,
FileExistsError,
DirectoryExistsError,
DirectoryNotEmptyError,
PermissionDeniedError,
InvalidPathError,
FileTooLargeError,
InvalidExtensionError,
AuthenticationError,
ConfigurationError,
OperationError
} from 'hazo_files';
// Use in try-catch
try {
await fileManager.uploadFile(buffer, '/files/test.exe');
} catch (error) {
if (error instanceof InvalidExtensionError) {
console.error('File type not allowed');
} else if (error instanceof FileTooLargeError) {
console.error('File is too large');
}
}
Extending with Custom Storage Providers
See docs/ADDING_MODULES.md for a complete guide on creating custom storage modules.
Quick example:
import { BaseStorageModule } from 'hazo_files';
import type { StorageProvider, OperationResult, FileItem, HazoFilesConfig } from 'hazo_files';
class S3StorageModule extends BaseStorageModule {
readonly provider: StorageProvider = 's3' as StorageProvider;
async initialize(config: HazoFilesConfig): Promise<void> {
await super.initialize(config);
// Initialize S3 client
}
async uploadFile(source, remotePath, options?): Promise<OperationResult<FileItem>> {
// Implement S3 upload
}
// Implement other required methods...
}
// Register the module
import { registerModule } from 'hazo_files';
registerModule('s3', () => new S3StorageModule());
Testing
The package includes a test application in test-app/ demonstrating:
- Next.js 14+ integration
- API routes for file operations
- FileBrowser UI component usage
- Local storage and Google Drive switching
- OAuth flow implementation
To run the test app:
cd test-app
npm install
npm run dev
Visit http://localhost:3000
Browser Compatibility
The UI components require:
- Modern browsers with ES2020+ support
- React 18+
- CSS Grid and Flexbox support
Server-side code requires Node.js 16+.
License
MIT License - see LICENSE file for details
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Commit your changes with clear messages
- Add tests for new functionality
- Submit a pull request
Support
- GitHub Issues: https://github.com/pub12/hazo_files/issues
- Documentation: https://github.com/pub12/hazo_files
Roadmap
- Amazon S3 storage module
- Dropbox storage module
- OneDrive storage module
- WebDAV support
- Advanced search and filtering
- Batch operations
- File versioning
- Sharing and permissions
- Real-time file sync
- Thumbnail generation
Credits
Created by Pubs Abayasiri
Built with:
- TypeScript
- React
- Google APIs (googleapis)
- xxhash-wasm for fast file hashing
- @dnd-kit for drag-and-drop
- tsup for building
