hazo_files

v2.0.1

File management including integration to cloud files

License: MIT

A modular file management package for Node.js and React applications with support for local filesystem, Google Drive, and Dropbox storage. Built with TypeScript for type safety and a better developer experience.

Features

  • FileStorageProvider Interface: Lightweight put/get/delete/exists/getSignedUrl/probe abstraction for simpler use cases
  • AppFileServerProvider: Local filesystem + HMAC-signed download URLs — no cloud account required
  • GoogleDriveProvider: Service-account Google Drive for Shared Drives with lazy path-cache
  • InMemoryProvider: Zero-dependency in-memory store for unit tests (via hazo_files/testing)
  • Multiple Storage Providers: Local filesystem, Google Drive, and Dropbox support out of the box
  • Modular Architecture: Easily add custom storage providers
  • Unified API: Single consistent interface across all storage providers
  • React UI Components: Drop-in FileBrowser component with folder tree, file list, and preview
  • Naming Rules System: Visual configurator and utilities for generating consistent file/folder names
  • Naming Convention Management: Full CRUD with UI components for managing naming conventions in database
  • Extraction Data Management: Track and manage LLM-extracted metadata with merge strategies
  • LLM Integration: Built-in support for hazo_llm_api document/image extraction
  • Upload + Extract Workflow: Combined service for uploading files with automatic LLM extraction and naming
  • File Reference Tracking: Multi-entity file references with orphan detection, soft delete, and lifecycle management
  • File Change Detection: xxHash-based content hashing for efficient change detection
  • Content Tagging: Optional LLM-based content classification at upload time or on-demand via content_tag field
  • Schema Migrations: Built-in V2/V3 migration utilities for adding reference tracking and content tagging to existing databases
  • Background Upload Pipelines: Framework-agnostic UploadManager + React HazoFileUploadProvider for multi-step upload pipelines that survive component unmount, with optional sonner toast bridge
  • Quota Tracking: Per-scope opt-in quota with threshold callbacks and fail-open semantics
  • URL Import: importFromUrl with SSRF protection, streaming size cap, and source_url provenance tracking
  • Actor Tracking: Optional actor_id on all mutations, written to uploaded_by and changed_by columns
  • Purge Scheduler: createPurgeJobHandlers factory for integrating with hazo_jobs cron workers
  • TypeScript: Full type safety and IntelliSense support
  • OAuth Integration: Built-in Google Drive and Dropbox OAuth authentication
  • Prompt Cache Invalidation: Passthrough for hazo_llm_api prompt cache management via server instance
  • Progress Tracking: Upload/download progress callbacks
  • File Validation: Extension filtering and file size limits
  • Error Handling: Comprehensive error types and handling

Installation

npm install hazo_files

For React UI components, ensure you have React 18+ installed:

npm install react react-dom

For the NamingRuleConfigurator component (drag-and-drop interface), also install:

npm install @dnd-kit/core @dnd-kit/sortable @dnd-kit/utilities

For cloud storage providers (install only what you need):

npm install googleapis        # Google Drive support
npm install dropbox           # Dropbox support

For database tracking and LLM extraction features (optional):

npm install hazo_connect      # Database tracking
npm install hazo_llm_api      # LLM document extraction
npm install server-only       # Server-side safety (recommended)
npm install xxhash-wasm       # File change detection (optional)

For the background-upload sonner toast bridge (optional):

npm install sonner            # Toast notifications for background upload pipelines

Subpath exports

| Import | Contents |
|--------|----------|
| hazo_files | Core types, utilities, naming helpers |
| hazo_files/ui | React components (FileBrowser, NamingRuleConfigurator, etc.) |
| hazo_files/server | Server-only: FileManager, TrackedFileManager, FileStorageProvider providers, schema exports |
| hazo_files/background-upload | Framework-agnostic UploadManager pipeline engine |
| hazo_files/background-upload/react | React bindings for background uploads |
| hazo_files/testing | InMemoryProvider (import in test suites; safe to use without server-only) |

Tailwind CSS v4 Setup (Required for UI Components)

If you're using Tailwind CSS v4 with the UI components, you must add a @source directive to your CSS file to ensure Tailwind scans the package's files for utility classes.

Add this to your globals.css or main CSS file AFTER the tailwindcss import:

@import "tailwindcss";

/* Required: Enable Tailwind to scan hazo_files package for utility classes */
@source "../node_modules/hazo_files/dist/ui";

Without this directive, Tailwind v4's JIT compiler will not generate CSS for the utility classes used in hazo_files components (like hover:bg-gray-100, text-sm, rounded-md, etc.), resulting in broken styling.

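Note: The @source directive is specific to Tailwind v4. Tailwind v3 has no @source directive; there, the scanned paths are listed in the content array of tailwind.config.js, so the package's dist/ui files would need to be added to that array instead.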

Quick Start

Basic Usage (Server-side)

import { createInitializedFileManager } from 'hazo_files';

// Create and initialize file manager
const fileManager = await createInitializedFileManager({
  config: {
    provider: 'local',
    local: {
      basePath: './files',
      maxFileSize: 10 * 1024 * 1024, // 10MB
      allowedExtensions: ['jpg', 'png', 'pdf', 'txt']
    }
  }
});

// Create a directory
await fileManager.createDirectory('/documents');

// Upload a file
await fileManager.uploadFile(
  './local-file.pdf',
  '/documents/file.pdf',
  {
    onProgress: (progress, bytes, total) => {
      console.log(`Upload progress: ${progress}%`);
    }
  }
);

// List directory contents
const result = await fileManager.listDirectory('/documents');
if (result.success) {
  console.log(result.data);
}

// Download a file
await fileManager.downloadFile('/documents/file.pdf', './downloaded.pdf');

Using Configuration File

Create hazo_files_config.ini in your project root:

[general]
provider = local

[local]
base_path = ./files
max_file_size = 10485760
allowed_extensions = jpg,png,pdf,txt

Then initialize without config object:

import { createInitializedFileManager } from 'hazo_files';

const fileManager = await createInitializedFileManager();
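
The INI keys above line up with the camelCase options used when configuring in code (base_path and basePath, max_file_size and maxFileSize, allowed_extensions and allowedExtensions). As a rough illustration of that mapping, here is a minimal sketch of reading the [local] section. It is not the loader hazo_files ships; parseIni, toLocalConfig, and the "./files" fallback are hypothetical names and assumptions made for this example.

```typescript
// Illustrative only: a tiny INI reader, not the loader hazo_files uses.
interface LocalConfig {
  basePath: string;
  maxFileSize: number | undefined;
  allowedExtensions: string[] | undefined;
}

// Parse "[section]" headers and "key = value" lines into nested records.
function parseIni(text: string): Record<string, Record<string, string>> {
  const sections: Record<string, Record<string, string>> = {};
  let section: Record<string, string> | undefined;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines
    if (line === "" || line.startsWith(";") || line.startsWith("#")) continue;
    const header = /^\[(.+)\]$/.exec(line);
    if (header) {
      const sectionName = header[1] ?? "";
      section = sections[sectionName] ?? {};
      sections[sectionName] = section;
      continue;
    }
    const eq = line.indexOf("=");
    if (eq === -1 || section === undefined) continue;
    section[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return sections;
}

// Map the snake_case INI keys to the camelCase options shown in the code examples.
function toLocalConfig(ini: Record<string, Record<string, string>>): LocalConfig {
  const local = ini["local"] ?? {};
  const size = local["max_file_size"];
  const extensions = local["allowed_extensions"];
  return {
    basePath: local["base_path"] ?? "./files", // assumed fallback for this sketch
    maxFileSize: size ? Number(size) : undefined,
    allowedExtensions: extensions
      ? extensions.split(",").map((e) => e.trim()).filter((e) => e !== "")
      : undefined,
  };
}
```

Feeding the file shown above through toLocalConfig(parseIni(text)) yields a basePath of "./files", a numeric maxFileSize, and the extension list as an array.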

React UI Component

import { FileBrowser } from 'hazo_files/ui';
import type { FileBrowserAPI } from 'hazo_files/ui';

// Create an API adapter that calls your server endpoints
const api: FileBrowserAPI = {
  async listDirectory(path: string) {
    // Encode the path so characters such as "&", "#", and spaces survive the query string
    const res = await fetch(`/api/files?action=list&path=${encodeURIComponent(path)}`);
    return res.json();
  },
  async getFolderTree(path = '/', depth = 3) {
    const res = await fetch(`/api/files?action=tree&path=${encodeURIComponent(path)}&depth=${depth}`);
    return res.json();
  },
  async uploadFile(file: File, remotePath: string) {
    const formData = new FormData();
    formData.append('file', file);
    formData.append('path', remotePath);
    const res = await fetch('/api/files/upload', { method: 'POST', body: formData });
    return res.json();
  },
  // ... implement other methods
};

function MyFileBrowser() {
  return (
    <FileBrowser
      api={api}
      initialPath="/"
      showPreview={true}
      showTree={true}
      viewMode="grid"
    />
  );
}

Advanced Usage

Google Drive Integration

1. Set up Google Cloud Console

  1. Go to Google Cloud Console
  2. Create a new project or select an existing one
  3. Enable the Google Drive API
  4. Create OAuth 2.0 credentials
  5. Add authorized redirect URIs (e.g., http://localhost:3000/api/auth/callback/google)

2. Configure Environment Variables

Create .env.local:

HAZO_GOOGLE_DRIVE_CLIENT_ID=your-client-id.apps.googleusercontent.com
HAZO_GOOGLE_DRIVE_CLIENT_SECRET=your-client-secret
HAZO_GOOGLE_DRIVE_REDIRECT_URI=http://localhost:3000/api/auth/callback/google

3. Configure hazo_files

[general]
provider = google_drive

[google_drive]
client_id =
client_secret =
redirect_uri = http://localhost:3000/api/auth/callback/google
refresh_token =

Environment variables will automatically override empty values.
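
To make the override rule concrete, the following is a minimal sketch of how empty INI values could be filled from the environment. It is an illustration under assumptions, not the library's implementation: fillFromEnv and ENV_NAMES are hypothetical names, and the sketch only replaces values that are empty strings.

```typescript
// Illustrative only: fills empty config values from environment variables.
type GoogleDriveSection = {
  client_id: string;
  client_secret: string;
  redirect_uri: string;
  refresh_token: string;
};

// Maps each INI key to the environment variable name listed in this README.
const ENV_NAMES: Record<keyof GoogleDriveSection, string> = {
  client_id: "HAZO_GOOGLE_DRIVE_CLIENT_ID",
  client_secret: "HAZO_GOOGLE_DRIVE_CLIENT_SECRET",
  redirect_uri: "HAZO_GOOGLE_DRIVE_REDIRECT_URI",
  refresh_token: "HAZO_GOOGLE_DRIVE_REFRESH_TOKEN",
};

function fillFromEnv(
  section: GoogleDriveSection,
  env: Record<string, string | undefined>
): GoogleDriveSection {
  const result: GoogleDriveSection = { ...section };
  for (const key of Object.keys(ENV_NAMES) as Array<keyof GoogleDriveSection>) {
    const envValue = env[ENV_NAMES[key]];
    // Only empty values are replaced; values set in the file are kept.
    if (result[key] === "" && envValue !== undefined && envValue !== "") {
      result[key] = envValue;
    }
  }
  return result;
}
```

In a Node.js process, the second argument would typically be process.env.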

4. Implement OAuth Flow

import { createFileManager, GoogleDriveModule } from 'hazo_files';

// Initialize with Google Drive
const fileManager = createFileManager({
  config: {
    provider: 'google_drive',
    google_drive: {
      clientId: process.env.HAZO_GOOGLE_DRIVE_CLIENT_ID!,
      clientSecret: process.env.HAZO_GOOGLE_DRIVE_CLIENT_SECRET!,
      redirectUri: process.env.HAZO_GOOGLE_DRIVE_REDIRECT_URI!,
    }
  }
});

await fileManager.initialize();

// Get the Google Drive module to access auth methods
const module = fileManager.getModule() as GoogleDriveModule;
const auth = module.getAuth();

// Generate auth URL
const authUrl = auth.getAuthUrl();
console.log('Visit:', authUrl);

// After user authorizes, exchange code for tokens
const tokens = await auth.exchangeCodeForTokens(authCode);

// Authenticate the module
await module.authenticate(tokens);

// Now you can use the file manager
await fileManager.createDirectory('/MyFolder');

Next.js API Route Example

// app/api/files/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { createInitializedFileManager } from 'hazo_files';

async function getFileManager() {
  return createInitializedFileManager({
    config: {
      provider: 'local',
      local: {
        basePath: process.env.LOCAL_STORAGE_BASE_PATH || './files',
      }
    }
  });
}

export async function GET(request: NextRequest) {
  const { searchParams } = new URL(request.url);
  const action = searchParams.get('action');
  const path = searchParams.get('path') || '/';

  const fm = await getFileManager();

  switch (action) {
    case 'list':
      return NextResponse.json(await fm.listDirectory(path));
    case 'tree': {
      // Braces keep the `const` scoped to this case instead of the whole switch
      const depth = parseInt(searchParams.get('depth') || '3', 10);
      return NextResponse.json(await fm.getFolderTree(path, depth));
    }
    default:
      return NextResponse.json({ success: false, error: 'Invalid action' });
  }
}

export async function POST(request: NextRequest) {
  const body = await request.json();
  const { action, ...params } = body;

  const fm = await getFileManager();

  switch (action) {
    case 'createDirectory':
      return NextResponse.json(await fm.createDirectory(params.path));
    case 'deleteFile':
      return NextResponse.json(await fm.deleteFile(params.path));
    case 'renameFile':
      return NextResponse.json(await fm.renameFile(params.path, params.newName));
    default:
      return NextResponse.json({ success: false, error: 'Invalid action' });
  }
}

File Upload API Route

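// app/api/files/upload/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { createInitializedFileManager } from 'hazo_files';

// Same helper as in app/api/files/route.ts; in a real project, move it to a shared module
async function getFileManager() {
  return createInitializedFileManager({
    config: {
      provider: 'local',
      local: {
        basePath: process.env.LOCAL_STORAGE_BASE_PATH || './files',
      }
    }
  });
}

export async function POST(request: NextRequest) {
  const formData = await request.formData();
  const file = formData.get('file');
  const path = formData.get('path');

  // formData.get() returns File | string | null, so validate before use
  if (!(file instanceof File) || typeof path !== 'string') {
    return NextResponse.json({ success: false, error: 'Missing file or path' }, { status: 400 });
  }

  const fm = await getFileManager();

  // Convert File to Buffer
  const arrayBuffer = await file.arrayBuffer();
  const buffer = Buffer.from(arrayBuffer);

  const result = await fm.uploadFile(buffer, path);
  return NextResponse.json(result);
}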

Progress Tracking

// Upload with progress tracking
await fileManager.uploadFile(
  './large-file.zip',
  '/uploads/large-file.zip',
  {
    onProgress: (progress, bytesTransferred, totalBytes) => {
      console.log(`Progress: ${progress.toFixed(2)}%`);
      console.log(`${bytesTransferred} / ${totalBytes} bytes`);
    }
  }
);

// Download with progress tracking
await fileManager.downloadFile(
  '/uploads/large-file.zip',
  './downloaded-file.zip',
  {
    onProgress: (progress, bytesTransferred, totalBytes) => {
      console.log(`Download: ${progress.toFixed(2)}%`);
    }
  }
);
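
For large transfers, the callback may fire far more often than is useful to log. One option is to wrap it so that it reports only once per percentage step. The sketch below assumes the (progress, bytesTransferred, totalBytes) signature shown above with progress expressed as 0 to 100; createSteppedProgress is a hypothetical helper, not part of hazo_files.

```typescript
// Illustrative only: reports progress once per step (e.g. every 10%).
type ProgressCallback = (
  progress: number,
  bytesTransferred: number,
  totalBytes: number
) => void;

function createSteppedProgress(
  stepPercent: number,
  report: (message: string) => void
): ProgressCallback {
  let lastStep = -1;
  return (progress, bytesTransferred, totalBytes) => {
    const step = Math.floor(progress / stepPercent);
    if (step === lastStep) return; // still inside the step already reported
    lastStep = step;
    report(`${progress.toFixed(0)}% (${bytesTransferred} / ${totalBytes} bytes)`);
  };
}
```

The returned function can be passed as onProgress, for example with a report function that writes to console.log.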

File Operations

// Create directory structure
await fileManager.createDirectory('/projects/2024/docs');

// Upload file
const uploadResult = await fileManager.uploadFile(
  buffer,
  '/projects/2024/docs/report.pdf'
);

// Move file
await fileManager.moveItem(
  '/projects/2024/docs/report.pdf',
  '/archive/2024/report.pdf'
);

// Rename file
await fileManager.renameFile(
  '/archive/2024/report.pdf',
  'annual-report.pdf'
);

// Copy file (convenience method)
await fileManager.copyFile(
  '/archive/2024/annual-report.pdf',
  '/backup/annual-report.pdf'
);

// Delete file
await fileManager.deleteFile('/backup/annual-report.pdf');

// Remove directory (recursive)
await fileManager.removeDirectory('/archive/2024', true);

// Check whether a path exists (here, a directory)
const exists = await fileManager.exists('/projects/2024/docs');

// Get file/folder information
const itemResult = await fileManager.getItem('/projects/2024/docs/report.pdf');
if (itemResult.success && itemResult.data) {
  console.log('File:', itemResult.data.name);
  console.log('Size:', itemResult.data.size);
  console.log('Modified:', itemResult.data.modifiedAt);
}

// List directory with options
const listResult = await fileManager.listDirectory('/projects', {
  recursive: true,
  includeHidden: false,
  filter: (item) => !item.isDirectory && item.name.endsWith('.pdf')
});

Working with Text Files

// Write text file
await fileManager.writeFile('/notes/readme.txt', 'Hello, World!');

// Read text file
const readResult = await fileManager.readFile('/notes/readme.txt');
if (readResult.success) {
  console.log(readResult.data); // "Hello, World!"
}

Folder Tree

// Get folder tree (3 levels deep by default)
const treeResult = await fileManager.getFolderTree('/projects', 3);
if (treeResult.success && treeResult.data) {
  console.log(JSON.stringify(treeResult.data, null, 2));
}

// Output:
// [
//   {
//     "id": "abc123",
//     "name": "2024",
//     "path": "/projects/2024",
//     "children": [
//       {
//         "id": "def456",
//         "name": "docs",
//         "path": "/projects/2024/docs",
//         "children": []
//       }
//     ]
//   }
// ]
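
Because each node carries its own children array, the result can be walked recursively. The sketch below assumes only the four fields visible in the sample output above (id, name, path, children); FolderNode and renderTree are hypothetical names for this example, and the real type exported by the package may contain more fields.

```typescript
// Illustrative only: shape inferred from the sample output above.
interface FolderNode {
  id: string;
  name: string;
  path: string;
  children: FolderNode[];
}

// Produce one indented line per folder, depth-first.
function renderTree(nodes: FolderNode[], depth = 0): string[] {
  const lines: string[] = [];
  for (const node of nodes) {
    lines.push(`${"  ".repeat(depth)}${node.name}/`);
    lines.push(...renderTree(node.children, depth + 1));
  }
  return lines;
}
```

For the sample tree above, renderTree returns two lines: "2024/" followed by "  docs/".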

Configuration

Configuration File (hazo_files_config.ini)

[general]
provider = local

[local]
base_path = ./files
allowed_extensions = jpg,png,pdf,txt,doc,docx
max_file_size = 10485760

[google_drive]
client_id = your-client-id.apps.googleusercontent.com
client_secret = your-client-secret
redirect_uri = http://localhost:3000/api/auth/callback/google
refresh_token =
access_token =
root_folder_id =

[naming]
; Supported date format tokens for naming rules
date_formats = YYYY,YY,MM,M,DD,D,MMM,MMMM,YYYY-MM-DD,YYYY-MMM-DD,DD-MM-YYYY,MM-DD-YYYY

Environment Variables

The following environment variables can override configuration file values:

  • HAZO_GOOGLE_DRIVE_CLIENT_ID
  • HAZO_GOOGLE_DRIVE_CLIENT_SECRET
  • HAZO_GOOGLE_DRIVE_REDIRECT_URI
  • HAZO_GOOGLE_DRIVE_REFRESH_TOKEN
  • HAZO_GOOGLE_DRIVE_ACCESS_TOKEN
  • HAZO_GOOGLE_DRIVE_ROOT_FOLDER_ID

Configuration via Code

import { createInitializedFileManager } from 'hazo_files';

const fileManager = await createInitializedFileManager({
  config: {
    provider: 'local',
    local: {
      basePath: './storage',
      allowedExtensions: ['jpg', 'png', 'gif', 'pdf'],
      maxFileSize: 5 * 1024 * 1024 // 5MB
    }
  }
});
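
The allowedExtensions and maxFileSize options describe an extension filter and a size cap. As a rough picture of what such a check involves, here is a minimal sketch. It is not the package's validation code: checkUpload and UploadLimits are hypothetical, and the case-insensitive comparison is an assumption made for this example.

```typescript
// Illustrative only: one way an extension filter and size cap could be checked.
interface UploadLimits {
  allowedExtensions: string[]; // without the dot, e.g. "pdf"
  maxFileSize: number; // bytes
}

// Returns null when the file passes, otherwise a human-readable reason.
function checkUpload(
  fileName: string,
  sizeInBytes: number,
  limits: UploadLimits
): string | null {
  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  if (!limits.allowedExtensions.includes(extension)) {
    return `Extension "${extension}" is not allowed`;
  }
  if (sizeInBytes > limits.maxFileSize) {
    return `File exceeds the ${limits.maxFileSize}-byte limit`;
  }
  return null;
}
```

Running a check like this on the client before uploading can give faster feedback, while the server-side limits remain the authority.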

Database Schema

Database tables are only required if you use TrackedFileManager, FileMetadataService, NamingConventionService, or UploadExtractService. Plain FileManager (filesystem only) needs no tables.

There are two tables:

  • hazo_files — file metadata, hashes, references, content tags
  • hazo_files_naming — saved naming conventions

The DDL below is also exposed programmatically via HAZO_FILES_TABLE_SCHEMA and HAZO_FILES_NAMING_TABLE_SCHEMA (see Programmatic Setup below). Run the raw SQL if you prefer to manage migrations with your existing tooling (psql, sqlite3, Flyway, Knex, etc.).

hazo_files Table

PostgreSQL

CREATE TABLE IF NOT EXISTS hazo_files (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  filename TEXT NOT NULL,
  file_type TEXT NOT NULL,
  file_data TEXT DEFAULT '{}',
  created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
  changed_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
  file_path TEXT NOT NULL,
  storage_type TEXT NOT NULL,
  file_hash TEXT,
  file_size BIGINT,
  file_changed_at TIMESTAMP WITH TIME ZONE,
  file_refs TEXT DEFAULT '[]',
  ref_count INTEGER DEFAULT 0,
  status TEXT DEFAULT 'active',
  scope_id UUID,
  uploaded_by UUID,
  storage_verified_at TIMESTAMP WITH TIME ZONE,
  deleted_at TIMESTAMP WITH TIME ZONE,
  original_filename TEXT,
  content_tag TEXT
);

CREATE INDEX IF NOT EXISTS idx_hazo_files_path ON hazo_files (file_path);
CREATE INDEX IF NOT EXISTS idx_hazo_files_storage ON hazo_files (storage_type);
CREATE UNIQUE INDEX IF NOT EXISTS idx_hazo_files_path_storage ON hazo_files (file_path, storage_type);
CREATE INDEX IF NOT EXISTS idx_hazo_files_hash ON hazo_files (file_hash);
CREATE INDEX IF NOT EXISTS idx_hazo_files_status ON hazo_files (status);
CREATE INDEX IF NOT EXISTS idx_hazo_files_scope ON hazo_files (scope_id);
CREATE INDEX IF NOT EXISTS idx_hazo_files_ref_count ON hazo_files (ref_count);
CREATE INDEX IF NOT EXISTS idx_hazo_files_deleted ON hazo_files (deleted_at);
CREATE INDEX IF NOT EXISTS idx_hazo_files_content_tag ON hazo_files (content_tag);

SQLite

CREATE TABLE IF NOT EXISTS hazo_files (
  id TEXT PRIMARY KEY,
  filename TEXT NOT NULL,
  file_type TEXT NOT NULL,
  file_data TEXT DEFAULT '{}',
  created_at TEXT NOT NULL,
  changed_at TEXT NOT NULL,
  file_path TEXT NOT NULL,
  storage_type TEXT NOT NULL,
  file_hash TEXT,
  file_size INTEGER,
  file_changed_at TEXT,
  file_refs TEXT DEFAULT '[]',
  ref_count INTEGER DEFAULT 0,
  status TEXT DEFAULT 'active',
  scope_id TEXT,
  uploaded_by TEXT,
  storage_verified_at TEXT,
  deleted_at TEXT,
  original_filename TEXT,
  content_tag TEXT
);

CREATE INDEX IF NOT EXISTS idx_hazo_files_path ON hazo_files (file_path);
CREATE INDEX IF NOT EXISTS idx_hazo_files_storage ON hazo_files (storage_type);
CREATE UNIQUE INDEX IF NOT EXISTS idx_hazo_files_path_storage ON hazo_files (file_path, storage_type);
CREATE INDEX IF NOT EXISTS idx_hazo_files_hash ON hazo_files (file_hash);
CREATE INDEX IF NOT EXISTS idx_hazo_files_status ON hazo_files (status);
CREATE INDEX IF NOT EXISTS idx_hazo_files_scope ON hazo_files (scope_id);
CREATE INDEX IF NOT EXISTS idx_hazo_files_ref_count ON hazo_files (ref_count);
CREATE INDEX IF NOT EXISTS idx_hazo_files_deleted ON hazo_files (deleted_at);
CREATE INDEX IF NOT EXISTS idx_hazo_files_content_tag ON hazo_files (content_tag);

hazo_files_naming Table

Required only if you use NamingConventionService to persist saved naming rules.

PostgreSQL

CREATE TABLE IF NOT EXISTS hazo_files_naming (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  scope_id UUID,
  naming_title TEXT NOT NULL,
  naming_type TEXT NOT NULL CHECK(naming_type IN ('file', 'folder', 'both')),
  naming_value TEXT NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
  changed_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
  variables TEXT DEFAULT '[]'
);

CREATE INDEX IF NOT EXISTS idx_hazo_files_naming_scope ON hazo_files_naming (scope_id);
CREATE INDEX IF NOT EXISTS idx_hazo_files_naming_type ON hazo_files_naming (naming_type);

SQLite

CREATE TABLE IF NOT EXISTS hazo_files_naming (
  id TEXT PRIMARY KEY,
  scope_id TEXT,
  naming_title TEXT NOT NULL,
  naming_type TEXT NOT NULL CHECK(naming_type IN ('file', 'folder', 'both')),
  naming_value TEXT NOT NULL,
  created_at TEXT NOT NULL,
  changed_at TEXT NOT NULL,
  variables TEXT DEFAULT '[]'
);

CREATE INDEX IF NOT EXISTS idx_hazo_files_naming_scope ON hazo_files_naming (scope_id);
CREATE INDEX IF NOT EXISTS idx_hazo_files_naming_type ON hazo_files_naming (naming_type);

Programmatic Setup

The same DDL is available as exported constants so you can run it from your app's startup or migration script without hand-copying SQL:

import {
  HAZO_FILES_TABLE_SCHEMA,
  HAZO_FILES_NAMING_TABLE_SCHEMA,
} from 'hazo_files';

// Pick 'sqlite' or 'postgres'
const dbType: 'sqlite' | 'postgres' = 'sqlite';

// Create hazo_files table
await db.run(HAZO_FILES_TABLE_SCHEMA[dbType].ddl);
for (const idx of HAZO_FILES_TABLE_SCHEMA[dbType].indexes) {
  await db.run(idx);
}

// Create hazo_files_naming table (only if using NamingConventionService)
await db.run(HAZO_FILES_NAMING_TABLE_SCHEMA[dbType].ddl);
for (const idx of HAZO_FILES_NAMING_TABLE_SCHEMA[dbType].indexes) {
  await db.run(idx);
}

For PostgreSQL, swap db.run(...) for client.query(...).

To use a custom table name, see getSchemaForTable(name, dbType) and getNamingSchemaForTable(name, dbType).
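
Since both schema constants expose a ddl string and an indexes array, the two loops above can be folded into one small helper. This is a sketch under that assumption; applySchema, TableSchema, and RunSql are hypothetical names, and the runner function is whatever your database client provides.

```typescript
// Illustrative only: runs a table's DDL followed by its index statements.
interface TableSchema {
  ddl: string;
  indexes: string[];
}

type RunSql = (sql: string) => Promise<void>;

// Returns the number of statements executed.
async function applySchema(run: RunSql, schema: TableSchema): Promise<number> {
  await run(schema.ddl); // the table must exist before its indexes
  for (const indexSql of schema.indexes) {
    await run(indexSql);
  }
  return 1 + schema.indexes.length;
}
```

With a SQLite client this might be called as applySchema((sql) => db.run(sql), HAZO_FILES_TABLE_SCHEMA.sqlite), adapting the runner to the client's actual return type.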

Upgrading Existing Tables

If you already have a pre-V2 or pre-V3 hazo_files table, see Database Migration (Existing Databases) and V3 Database Migration for the ALTER TABLE scripts and migration helpers.

UI Components

FileBrowser Component

The FileBrowser is a complete, drop-in file management UI with:

  • Folder tree navigation
  • File list (grid or list view)
  • Breadcrumb navigation
  • File preview (images, text, PDFs)
  • Context menus and actions
  • Upload, download, rename, delete operations
  • Drag-and-drop file moving between folders

import { FileBrowser } from 'hazo_files/ui';

<FileBrowser
  api={api}
  initialPath="/"
  showPreview={true}
  showTree={true}
  viewMode="grid"
  treeWidth={250}
  previewHeight={300}
  onError={(error) => console.error(error)}
  onNavigate={(path) => console.log('Navigated to:', path)}
  onSelect={(item) => console.log('Selected:', item)}
/>

Drag-and-Drop File Moving

The FileBrowser includes built-in drag-and-drop functionality for moving files and folders:

Features:

  • Drag files/folders from the file list
  • Drop onto folders in the sidebar tree or main file list
  • Visual feedback with opacity and colored borders during drag
  • Prevents invalid operations (dropping on self, into current parent, folder into descendant)
  • Shows dragged item preview during drag operation

How to use:

  1. Click and hold on any file or folder in the file list
  2. Drag it over a folder in either the tree sidebar or file list
  3. Valid drop targets show a green ring/background
  4. Release to move the item to the new location

Technical requirements:

  • Requires @dnd-kit/core peer dependency (already included for NamingRuleConfigurator)
  • API must implement moveItem(sourcePath, destinationPath) method
  • Automatically validates drop targets to prevent invalid moves
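
The three invalid cases listed under Features (dropping on self, into the current parent, or a folder into its own descendant) can be expressed as simple path checks. The sketch below is an illustration of those rules, not the component's internal code; isValidDrop and parentOf are hypothetical names, and paths are assumed to be absolute and "/"-separated.

```typescript
// Illustrative only: path-based checks for the invalid drop cases.

// Returns the parent folder of a path, e.g. "/a/b/c.txt" gives "/a/b".
function parentOf(path: string): string {
  const index = path.lastIndexOf("/");
  return index <= 0 ? "/" : path.slice(0, index);
}

function isValidDrop(sourcePath: string, targetFolderPath: string): boolean {
  // Dropping an item on itself
  if (sourcePath === targetFolderPath) return false;
  // Dropping into the folder the item already lives in
  if (parentOf(sourcePath) === targetFolderPath) return false;
  // Dropping a folder into one of its own descendants
  if (targetFolderPath.startsWith(sourcePath + "/")) return false;
  return true;
}
```

Appending "/" before the prefix test matters: without it, moving "/doc" into "/docs" would be rejected even though the two folders are unrelated.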

Visual feedback:

  • Dragging: Item becomes semi-transparent (opacity-50)
  • Valid drop target: Green ring (ring-2 ring-green-500) and background (bg-green-50)
  • Drag preview: Shows file/folder icon and name following cursor

ID patterns used:

  • File items: file-item-{path} (draggable)
  • Folder tree drops: folder-drop-tree-{path} (droppable)
  • Folder list drops: folder-drop-list-{path} (droppable)

Individual Components

You can also use individual components:

import {
  PathBreadcrumb,
  FolderTree,
  FileList,
  FilePreview,
  FileActions,
  FileInfoPanel
} from 'hazo_files/ui';

// Use individually with your own layout

FileInfoPanel Component

The FileInfoPanel displays file metadata in a structured format and can be used standalone in sidebars, custom dialogs, or inline panels:

import { FileInfoPanel } from 'hazo_files/ui';

// In a sidebar
function Sidebar({ selectedFile, metadata, isLoading }) {
  return (
    <div className="sidebar p-4">
      <h3 className="font-bold mb-4">File Info</h3>
      <FileInfoPanel
        item={selectedFile}
        metadata={metadata}
        isLoading={isLoading}
      />
    </div>
  );
}

// Without custom metadata section
<FileInfoPanel
  item={file}
  showCustomMetadata={false}
  className="bg-gray-50 rounded-lg p-4"
/>

// In a custom dialog
function MyCustomDialog({ file }) {
  return (
    <dialog>
      <FileInfoPanel item={file} showCustomMetadata={false} />
    </dialog>
  );
}

Props:

  • item: FileSystemItem | null - The file or folder to display info for
  • metadata?: FileMetadata | null - Additional metadata from database
  • isLoading?: boolean - Show loading state for custom metadata
  • showCustomMetadata?: boolean - Whether to show the JSON metadata section (default: true)
  • className?: string - Additional CSS classes for custom styling

Hooks

import { useFileBrowser, useFileOperations } from 'hazo_files/ui';

function MyCustomFileBrowser() {
  const {
    currentPath,
    files,
    tree,
    selectedItem,
    isLoading,
    navigate,
    refresh,
    selectItem
  } = useFileBrowser(api, '/');

  const {
    createFolder,
    uploadFiles,
    deleteItem,
    renameItem
  } = useFileOperations(api, currentPath);

  // Build your custom UI
}

Naming Rule Configurator

Build consistent file/folder naming patterns with a visual drag-and-drop interface:

import { NamingRuleConfigurator } from 'hazo_files/ui';
import type { NamingVariable } from 'hazo_files/ui';

function NamingConfig() {
  // Define user-specific variables
  const userVariables: NamingVariable[] = [
    {
      variable_name: 'project_name',
      description: 'Name of the project',
      example_value: 'WebApp',
      category: 'user'
    },
    {
      variable_name: 'client_id',
      description: 'Client identifier',
      example_value: 'ACME',
      category: 'user'
    },
  ];

  const handleSchemaChange = (schema) => {
    console.log('New schema:', schema);
    // Save to database or state
  };

  const handleExport = (schema) => {
    // Export as JSON file
    const blob = new Blob([JSON.stringify(schema, null, 2)], { type: 'application/json' });
    const url = URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = url;
    a.download = 'naming-rule.json';
    a.click();
    // Release the object URL once the download has been triggered
    URL.revokeObjectURL(url);
  };

  return (
    <NamingRuleConfigurator
      variables={userVariables}
      onChange={handleSchemaChange}
      onExport={handleExport}
      sampleFileName="proposal.pdf"
    />
  );
}

The configurator provides:

  • Category Tabs: User, Date, File, Counter variables
  • Drag & Drop: Build patterns by dragging variables into file/folder patterns
  • Segment Reordering: Drag segments within patterns to reorder them
  • Live Preview: See generated names in real-time with example values
  • Undo/Redo: Full history with keyboard shortcuts (Ctrl+Z, Ctrl+Y)
  • Import/Export: Save and load naming rules as JSON
  • Scrollable Layout: Works in fixed-height containers with scrollable content area

System variables included:

  • Date: YYYY, YY, MM, DD, YYYY-MM-DD, MMM, MMMM, etc.
  • File: original_name, extension, ext
  • Counter: counter (auto-incrementing with padding)

Naming Convention Management Components

Full UI for managing naming conventions stored in the database:

import {
  NamingConventionManager,
  NamingConventionList,
  NamingConventionEditor,
} from 'hazo_files/ui';

// Full management UI (list + editor combined)
<NamingConventionManager
  api={namingAPI}
  onSelect={(convention) => applyConvention(convention)}
/>

// Or use components separately
<NamingConventionList
  api={namingAPI}
  selectedId={selectedId}
  onSelect={setSelectedId}
  onEdit={(id) => openEditor(id)}
  onDelete={(id) => confirmDelete(id)}
/>

<NamingConventionEditor
  api={namingAPI}
  conventionId={editingId}
  userVariables={customVariables}
  onSave={(convention) => handleSave(convention)}
  onCancel={() => closeEditor()}
/>

Naming Rules API

Generate file and folder names programmatically from naming schemas:

import {
  hazo_files_generate_file_name,
  hazo_files_generate_folder_name,
  createVariableSegment,
  createLiteralSegment,
  type NamingRuleSchema
} from 'hazo_files';

// Create a naming schema
const schema: NamingRuleSchema = {
  version: 1,
  filePattern: [
    createVariableSegment('client_id'),
    createLiteralSegment('_'),
    createVariableSegment('project_name'),
    createLiteralSegment('_'),
    createVariableSegment('YYYY-MM-DD'),
    createLiteralSegment('_'),
    createVariableSegment('counter'),
  ],
  folderPattern: [
    createVariableSegment('YYYY'),
    createLiteralSegment('/'),
    createVariableSegment('client_id'),
    createLiteralSegment('/'),
    createVariableSegment('project_name'),
  ],
};

// Define variable values
const variables = {
  client_id: 'ACME',
  project_name: 'Website',
};

// Generate file name
const fileResult = hazo_files_generate_file_name(
  schema,
  variables,
  'original-document.pdf',
  {
    counterValue: 42,
    preserveExtension: true,  // Keep original .pdf extension
    date: new Date('2024-12-09'),
  }
);

if (fileResult.success) {
  console.log(fileResult.name);
  // Output: "ACME_Website_2024-12-09_042.pdf"
}

// Generate folder path
const folderResult = hazo_files_generate_folder_name(schema, variables);

if (folderResult.success) {
  console.log(folderResult.name);
  // Output when run in 2024 (no date is passed, so the current date is used): "2024/ACME/Website"
}

// Use with FileManager
const uploadPath = `/${folderResult.name}/${fileResult.name}`;
await fileManager.uploadFile(buffer, uploadPath);

Available System Variables

Date Variables (use current date unless overridden):

  • YYYY - Full year (2024)
  • YY - Two-digit year (24)
  • MM - Month with zero padding (01-12)
  • M - Month without padding (1-12)
  • DD - Day with zero padding (01-31)
  • D - Day without padding (1-31)
  • MMM - Short month name (Jan, Feb, etc.)
  • MMMM - Full month name (January, February, etc.)
  • YYYY-MM-DD - ISO date format (2024-01-15)
  • YYYY-MMM-DD - Date with month name (2024-Jan-15)
  • DD-MM-YYYY - European format (15-01-2024)
  • MM-DD-YYYY - US format (01-15-2024)

File Metadata Variables (from original filename):

  • original_name - Filename without extension
  • extension - File extension with dot (.pdf)
  • ext - Extension without dot (pdf)

Counter Variable:

  • counter - Auto-incrementing number with zero padding (001, 042, 123)
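
To show how the date tokens and counter padding above translate into output, here is a minimal formatter sketch. It is an independent illustration, not the library's implementation: formatDateToken, pad, and MONTH_NAMES are hypothetical names, English month names are assumed, and the counter width of 3 is inferred from the examples (001, 042, 123).

```typescript
// Illustrative only: formats the date tokens listed above.
const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Left-pad a number with zeros, e.g. pad(42, 3) gives "042".
function pad(value: number, width: number): string {
  return String(value).padStart(width, "0");
}

// Returns the formatted token, or undefined when the token is not recognized.
function formatDateToken(token: string, date: Date): string | undefined {
  const year = date.getFullYear();
  const month = date.getMonth() + 1;
  const day = date.getDate();
  const monthName = MONTH_NAMES[month - 1] ?? "";
  const parts: Record<string, string> = {
    YYYY: String(year),
    YY: pad(year % 100, 2),
    MM: pad(month, 2),
    M: String(month),
    DD: pad(day, 2),
    D: String(day),
    MMM: monthName.slice(0, 3),
    MMMM: monthName,
  };
  // Compound tokens such as "YYYY-MMM-DD" are split on "-" and formatted piecewise.
  const pieces: Array<string | undefined> = token.split("-").map((piece) => parts[piece]);
  return pieces.every((piece) => piece !== undefined) ? pieces.join("-") : undefined;
}
```

The sketch reads the date in local time; a version based on the getUTC methods would avoid off-by-one days when a date is created from an ISO string.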

Parsing Pattern Strings

You can also parse pattern strings directly:

import { parsePatternString, patternToString } from 'hazo_files';

// Parse string to segments
const segments = parsePatternString('{client_id}_{YYYY-MM-DD}_{counter}');
console.log(segments);
// [
//   { id: '...', type: 'variable', value: 'client_id' },
//   { id: '...', type: 'literal', value: '_' },
//   { id: '...', type: 'variable', value: 'YYYY-MM-DD' },
//   { id: '...', type: 'literal', value: '_' },
//   { id: '...', type: 'variable', value: 'counter' },
// ]

// Convert back to string
const patternStr = patternToString(segments);
// "{client_id}_{YYYY-MM-DD}_{counter}"
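
The round trip between a pattern string and its segments can be sketched as follows. This is an illustrative stand-in, not the library's parser: the function names are hypothetical, the per-segment id field is omitted, and stray braces are simply skipped.

```typescript
// Hypothetical sketch of pattern parsing; the real parsePatternString may differ.
type SegmentSketch = { type: 'variable' | 'literal'; value: string };

function parsePatternSketch(pattern: string): SegmentSketch[] {
  const segments: SegmentSketch[] = [];
  // Group 1 captures "{name}" variables; group 2 captures runs of literal text.
  const re = /\{([^{}]+)\}|([^{}]+)/g;
  let match: RegExpExecArray | null;
  while ((match = re.exec(pattern)) !== null) {
    if (match[1] !== undefined) {
      segments.push({ type: 'variable', value: match[1] });
    } else if (match[2] !== undefined) {
      segments.push({ type: 'literal', value: match[2] });
    }
  }
  return segments;
}

function patternToStringSketch(segments: SegmentSketch[]): string {
  return segments
    .map((segment) => (segment.type === 'variable' ? `{${segment.value}}` : segment.value))
    .join('');
}

const demoPattern = '{client_id}_{YYYY-MM-DD}_{counter}';
const demoSegments = parsePatternSketch(demoPattern);
console.log(demoSegments.length);                 // 5
console.log(patternToStringSketch(demoSegments)); // "{client_id}_{YYYY-MM-DD}_{counter}"
```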

Extraction Data Management

Manage LLM-extracted data stored within the file_data JSON field. The system maintains both raw extraction history and merged results.

Data Structure

interface FileDataStructure {
  merged_data: Record<string, unknown>;  // Combined data from all extractions
  raw_data: ExtractionData[];            // Individual extraction entries
}

interface ExtractionData {
  id: string;           // Unique extraction ID
  extracted_at: string; // ISO timestamp
  source?: string;      // Optional source identifier (e.g., model name)
  data: Record<string, unknown>;  // The extracted data
}

Using with FileMetadataService

import { FileMetadataService, createFileMetadataService } from 'hazo_files';

// Create service with your CRUD provider
const metadataService = createFileMetadataService(crudService);

// Add an extraction
const extraction = await metadataService.addExtraction(
  '/documents/report.pdf',
  'local',
  { title: 'Annual Report', author: 'John Doe', pages: 42 },
  { source: 'gpt-4', mergeStrategy: 'shallow' }
);
console.log('Added extraction:', extraction?.id);

// Get merged data (combined from all extractions)
const merged = await metadataService.getMergedData('/documents/report.pdf', 'local');
console.log('Merged data:', merged);

// Get all extractions
const extractions = await metadataService.getExtractions('/documents/report.pdf', 'local');
console.log('All extractions:', extractions);

// Get a specific extraction
const specific = await metadataService.getExtractionById(
  '/documents/report.pdf',
  'local',
  extraction?.id
);

// Remove an extraction (recalculates merged_data by default)
await metadataService.removeExtractionById(
  '/documents/report.pdf',
  'local',
  extraction?.id,
  { recalculateMerged: true, mergeStrategy: 'deep' }
);

// Clear all extractions
await metadataService.clearExtractions('/documents/report.pdf', 'local');

Using Utility Functions Directly

For working with parsed data structures without database operations:

import {
  parseFileData,
  addExtractionToFileData,
  removeExtractionById,
  getMergedData,
  getExtractions,
  deepMerge,
  createEmptyFileDataStructure,
} from 'hazo_files';

// Parse existing JSON (auto-migrates old format)
const fileData = parseFileData(existingJsonString);

// Add an extraction (returns new structure, immutable)
const result = addExtractionToFileData(
  fileData,
  { category: 'finance', summary: 'Q4 results' },
  { source: 'claude-3', mergeStrategy: 'deep' }
);

if (result.success) {
  const newFileData = result.data;
  console.log('New merged data:', newFileData.merged_data);
  console.log('Extraction count:', newFileData.raw_data.length);
}

// Remove an extraction by ID
const removeResult = removeExtractionById(fileData, 'ext_12345', {
  recalculateMerged: true,
  mergeStrategy: 'shallow'
});

// Get copies of data
const mergedCopy = getMergedData(fileData);
const extractionsCopy = getExtractions(fileData);

Merge Strategies

  • Shallow (default): Spreads top-level properties; later values overwrite earlier ones

    // { a: 1, b: 2 } + { b: 3, c: 4 } = { a: 1, b: 3, c: 4 }

  • Deep: Recursively merges nested objects and concatenates arrays

    // { a: { x: 1 }, arr: [1] } + { a: { y: 2 }, arr: [2] } = { a: { x: 1, y: 2 }, arr: [1, 2] }
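
The two strategies can be reproduced with a short sketch. The functions here are hypothetical stand-ins written to match the behavior described in the bullets; the exported deepMerge may handle edge cases (null values, class instances) differently.

```typescript
// Hypothetical sketch of the two merge strategies; not the library's implementation.
type JsonObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Shallow: top-level keys from `next` overwrite keys from `base`.
function shallowMergeSketch(base: JsonObject, next: JsonObject): JsonObject {
  return { ...base, ...next };
}

// Deep: nested objects merge recursively, arrays concatenate, everything else is overwritten.
function deepMergeSketch(base: JsonObject, next: JsonObject): JsonObject {
  const out: JsonObject = { ...base };
  for (const [key, value] of Object.entries(next)) {
    const existing = out[key];
    if (Array.isArray(existing) && Array.isArray(value)) {
      out[key] = [...existing, ...value];
    } else if (isPlainObject(existing) && isPlainObject(value)) {
      out[key] = deepMergeSketch(existing, value);
    } else {
      out[key] = value;
    }
  }
  return out;
}

console.log(JSON.stringify(shallowMergeSketch({ a: 1, b: 2 }, { b: 3, c: 4 })));
// {"a":1,"b":3,"c":4}
console.log(JSON.stringify(deepMergeSketch({ a: { x: 1 }, arr: [1] }, { a: { y: 2 }, arr: [2] })));
// {"a":{"x":1,"y":2},"arr":[1,2]}
```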

Migration from Old Format

The parseFileData function automatically migrates old plain-object format to the new structure:

// Old format: { title: 'Report', author: 'John' }
// Becomes: { merged_data: { title: 'Report', author: 'John' }, raw_data: [] }
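
A minimal sketch of this kind of migration is shown here. It assumes the new format is recognized by the presence of an object-valued merged_data and an array-valued raw_data; the detection rule used by the real parseFileData is not documented in this section, so treat that check and the function name as assumptions.

```typescript
// Hypothetical sketch of old-format detection; the real parseFileData may use other rules.
interface FileDataSketch {
  merged_data: Record<string, unknown>;
  raw_data: unknown[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function parseFileDataSketch(json: string | null | undefined): FileDataSketch {
  const empty: FileDataSketch = { merged_data: {}, raw_data: [] };
  if (!json) return empty;

  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return empty; // unreadable JSON is treated as "no data" in this sketch
  }
  if (!isRecord(parsed)) return empty;

  const merged = parsed['merged_data'];
  const raw = parsed['raw_data'];
  if (isRecord(merged) && Array.isArray(raw)) {
    return { merged_data: merged, raw_data: raw }; // already in the new structure
  }
  // Old plain-object format: the whole object becomes merged_data.
  return { merged_data: parsed, raw_data: [] };
}

const migrated = parseFileDataSketch('{"title":"Report","author":"John"}');
console.log(JSON.stringify(migrated));
// {"merged_data":{"title":"Report","author":"John"},"raw_data":[]}
```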

Naming Convention Management

Store and manage naming conventions in your database with full CRUD operations.

NamingConventionService

import { NamingConventionService, HAZO_FILES_NAMING_TABLE_SCHEMA } from 'hazo_files';
import { createCrudService } from 'hazo_connect/server';

// Create CRUD service for naming conventions table
const namingCrud = createCrudService(adapter, HAZO_FILES_NAMING_TABLE_SCHEMA.tableName);
const namingService = new NamingConventionService(namingCrud);

// Create a naming convention
const convention = await namingService.create({
  naming_title: 'Tax Documents',
  naming_type: 'both', // 'file', 'folder', or 'both'
  naming_value: {
    version: 1,
    filePattern: [
      { id: '1', type: 'variable', value: 'client_id' },
      { id: '2', type: 'literal', value: '_' },
      { id: '3', type: 'variable', value: 'YYYY-MM-DD' },
    ],
    folderPattern: [
      { id: '4', type: 'variable', value: 'YYYY' },
      { id: '5', type: 'literal', value: '/' },
      { id: '6', type: 'variable', value: 'client_id' },
    ],
  },
  variables: [
    { variable_name: 'client_id', description: 'Client ID', example_value: 'ACME', category: 'user' }
  ],
  scope_id: 'optional-scope-uuid', // Link to hazo_scopes for organization
});

// Get all conventions
const allConventions = await namingService.list();

// Get parsed conventions (with schema and variables as objects)
const parsed = await namingService.listParsed();

// Get by scope (e.g., for a specific organization)
const scopedConventions = await namingService.getByScope('scope-uuid');

// Update
await namingService.update(convention.id, {
  naming_title: 'Updated Tax Documents',
});

// Duplicate
const copy = await namingService.duplicate(convention.id, 'Tax Documents Copy');

// Delete
await namingService.delete(convention.id);

NamingConventionManager UI Component

import { NamingConventionManager } from 'hazo_files/ui';
import type { NamingConventionAPI } from 'hazo_files/ui';

// Create API adapter for your backend
const namingAPI: NamingConventionAPI = {
  list: () => fetch('/api/naming-conventions').then(r => r.json()),
  create: (input) => fetch('/api/naming-conventions', {
    method: 'POST',
    body: JSON.stringify(input),
  }).then(r => r.json()),
  update: (id, input) => fetch(`/api/naming-conventions/${id}`, {
    method: 'PATCH',
    body: JSON.stringify(input),
  }).then(r => r.json()),
  delete: (id) => fetch(`/api/naming-conventions/${id}`, {
    method: 'DELETE',
  }).then(r => r.json()),
};

function NamingConventionsPage() {
  return (
    <NamingConventionManager
      api={namingAPI}
      onSelect={(convention) => console.log('Selected:', convention)}
    />
  );
}

Upload with LLM Extraction

Combine file uploads with automatic LLM extraction and naming convention application.

UploadExtractService

import {
  TrackedFileManager,
  NamingConventionService,
  LLMExtractionService,
  UploadExtractService,
} from 'hazo_files';
import { createLLM } from 'hazo_llm_api';

// Create LLM extraction service
const extractionService = new LLMExtractionService((provider, options) => {
  return createLLM({ provider, ...options });
}, 'gemini');

// Create upload + extract service (with optional content tag config)
const uploadExtract = new UploadExtractService(
  trackedFileManager,
  namingService,
  extractionService,
  {
    content_tag_set_by_llm: true,
    content_tag_prompt_area: 'classification',
    content_tag_prompt_key: 'classify_document',
    content_tag_prompt_return_fieldname: 'document_type',
  }
);

// Upload with extraction and naming convention
const result = await uploadExtract.uploadWithExtract(
  pdfBuffer,
  'quarterly-report.pdf',
  {
    // Enable LLM extraction
    extract: true,
    extractionOptions: {
      promptArea: 'reports',
      promptKey: 'extract_summary',
      llmProvider: 'gemini',
    },
    // Apply naming convention
    namingConventionId: 'convention-uuid',
    namingVariables: { client_id: 'ACME', project: 'Q4' },
    basePath: '/documents',
    createFolders: true,
    counterValue: 1,
  }
);

if (result.success) {
  console.log('Uploaded to:', result.generatedPath);
  // e.g., '/documents/2024/ACME/ACME_Q4_2024-12-09_001.pdf'
  console.log('Extracted data:', result.extraction?.data);
  console.log('Content tag:', result.contentTag);
  // e.g., 'invoice', 'report', 'contract'
}

// Generate path preview without uploading
const preview = await uploadExtract.generatePath(
  'document.pdf',
  'convention-uuid',
  { client_id: 'ACME' },
  { basePath: '/docs', counterValue: 5 }
);
console.log('Would upload to:', preview.fullPath);

// Create folder from naming convention
const folderResult = await uploadExtract.createFolderFromConvention(
  'convention-uuid',
  { client_id: 'ACME', project: 'Website' },
  { basePath: '/projects' }
);

LLMExtractionService Standalone

import { LLMExtractionService } from 'hazo_files';

const extractionService = new LLMExtractionService({
  create: llmFactory,
  invalidateCache: (area, key) => invalidate_prompt_cache(area, key), // optional
}, 'gemini');

// Extract from document
const result = await extractionService.extractFromDocument(
  pdfBuffer,
  'application/pdf',
  {
    customPrompt: 'Extract all financial figures and dates',
    llmProvider: 'qwen',
  }
);

// Extract from image
const imageResult = await extractionService.extractFromImage(
  imageBuffer,
  'image/jpeg',
  {
    promptArea: 'receipts',
    promptKey: 'extract_receipt',
  }
);

// Auto-detect based on MIME type
const autoResult = await extractionService.extract(
  buffer,
  mimeType,
  extractionOptions
);

Content Tagging

Automatically classify uploaded files using LLM-based content analysis. The content_tag field stores a classification string (e.g., "invoice", "report", "contract") determined by an LLM prompt.

Configuration

import type { ContentTagConfig } from 'hazo_files';

const contentTagConfig: ContentTagConfig = {
  content_tag_set_by_llm: true,
  content_tag_prompt_area: 'classification',
  content_tag_prompt_key: 'classify_document',
  content_tag_prompt_return_fieldname: 'document_type',
  content_tag_prompt_variables: { language: 'en' }, // optional
};

Automatic Tagging at Upload

Pass contentTagConfig to the UploadExtractService constructor to set a default for all uploads, or override it per upload via options:

// Per-upload override
const result = await uploadExtract.uploadWithExtract(buffer, 'file.pdf', {
  basePath: '/docs',
  contentTagConfig: {
    content_tag_set_by_llm: true,
    content_tag_prompt_area: 'classification',
    content_tag_prompt_key: 'classify_document',
    content_tag_prompt_return_fieldname: 'document_type',
  },
});
console.log(result.contentTag); // e.g., 'invoice'

Manual Tagging

Tag existing files by their database record ID:

const tagResult = await uploadExtract.tagFileContent('file-record-id');
if (tagResult.success) {
  console.log('Tagged as:', tagResult.data);
}

V3 Database Migration

If you have an existing hazo_files table, run the V3 migration to add the content_tag column:

import { migrateToV3, HAZO_FILES_MIGRATION_V3 } from 'hazo_files';

// Using the migration helper
await migrateToV3(
  { run: (sql) => db.run(sql) },
  'sqlite'
);

// Or run statements manually
for (const stmt of HAZO_FILES_MIGRATION_V3.sqlite.alterStatements) {
  try { await db.run(stmt); } catch { /* column exists */ }
}

New tables created with HAZO_FILES_TABLE_SCHEMA already include the content_tag column.

File Reference Tracking

Track which entities (form fields, chat messages, etc.) reference each file. Multiple entities can reference the same file, enabling shared files without duplication.

Adding and Removing References

import { TrackedFileManager } from 'hazo_files';

// Upload a file with an initial reference
const result = await trackedManager.uploadFileWithRef(buffer, '/docs/report.pdf', {
  scope_id: 'workspace-123',
  uploaded_by: 'user-456',
  ref: {
    entity_type: 'form_field',
    entity_id: 'field-789',
    created_by: 'user-456',
  },
});
// result.data.file_id, result.data.ref_id

// Add another reference to the same file
await trackedManager.addRef(fileId, {
  entity_type: 'chat_message',
  entity_id: 'msg-abc',
});

// Remove a specific reference
const { remaining_refs } = await trackedManager.removeRef(fileId, refId);

// Get file with status info
const fileStatus = await trackedManager.getFileById(fileId);
// { record, refs: FileRef[], is_orphaned: boolean }

Orphan Detection and Cleanup

// Find files with zero references
const orphans = await trackedManager.findOrphanedFiles({
  olderThanMs: 7 * 24 * 60 * 60 * 1000, // 7 days old
  scope_id: 'workspace-123',
});

// Clean up orphaned files (delete physical files + DB records)
const { cleaned, errors } = await trackedManager.cleanupOrphanedFiles({
  olderThanMs: 30 * 24 * 60 * 60 * 1000,
  softDeleteOnly: false, // true to only mark as soft_deleted
});

// Soft-delete a specific file
await trackedManager.softDeleteFile(fileId);

// Verify physical file existence
const exists = await trackedManager.verifyFileExistence(fileId);
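
The orphan filter can be pictured with a small in-memory sketch. The record shape and the findOrphansSketch helper are hypothetical: in particular, which timestamp olderThanMs is compared against is an assumption here (the time the file last changed reference state), and the real FileMetadataRecordV2 carries more fields.

```typescript
// Hypothetical, simplified record shape for illustration only.
interface TrackedRecordSketch {
  file_id: string;
  scope_id: string;
  ref_count: number;     // number of entities currently referencing the file
  changed_at_ms: number; // assumed: when the reference state last changed
}

interface OrphanQuerySketch {
  olderThanMs?: number;
  scope_id?: string;
}

// A file is an orphan candidate when it has zero references and passes the optional filters.
function findOrphansSketch(
  records: TrackedRecordSketch[],
  query: OrphanQuerySketch,
  nowMs: number,
): TrackedRecordSketch[] {
  return records.filter((record) => {
    if (record.ref_count > 0) return false;
    if (query.scope_id !== undefined && record.scope_id !== query.scope_id) return false;
    if (query.olderThanMs !== undefined && nowMs - record.changed_at_ms < query.olderThanMs) return false;
    return true;
  });
}

const DAY_MS = 24 * 60 * 60 * 1000;
const nowMs = 100 * DAY_MS;
const sampleRecords: TrackedRecordSketch[] = [
  { file_id: 'a', scope_id: 'workspace-123', ref_count: 0, changed_at_ms: nowMs - 10 * DAY_MS },
  { file_id: 'b', scope_id: 'workspace-123', ref_count: 2, changed_at_ms: nowMs - 10 * DAY_MS },
  { file_id: 'c', scope_id: 'workspace-123', ref_count: 0, changed_at_ms: nowMs - 1 * DAY_MS },
  { file_id: 'd', scope_id: 'other', ref_count: 0, changed_at_ms: nowMs - 10 * DAY_MS },
];

const orphanIds = findOrphansSketch(
  sampleRecords,
  { olderThanMs: 7 * DAY_MS, scope_id: 'workspace-123' },
  nowMs,
).map((record) => record.file_id);
console.log(orphanIds); // ["a"]
```

The age threshold gives recently unreferenced files a grace period, which matters when a reference is removed and re-added in quick succession.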

Database Migration (Existing Databases)

If you have an existing hazo_files table, run the V2 migration to add reference tracking columns:

import { migrateToV2, backfillV2Defaults, HAZO_FILES_MIGRATION_V2 } from 'hazo_files';

// Using the migration helper
await migrateToV2(
  { run: (sql) => db.exec(sql) }, // SQLite
  'sqlite'
);
await backfillV2Defaults({ run: (sql) => db.exec(sql) }, 'sqlite');

// Or run statements manually
for (const stmt of HAZO_FILES_MIGRATION_V2.sqlite.alterStatements) {
  try { await db.run(stmt); } catch { /* column exists */ }
}
for (const idx of HAZO_FILES_MIGRATION_V2.sqlite.indexes) {
  await db.run(idx);
}

New tables created with HAZO_FILES_TABLE_SCHEMA already include V2 columns. For V3 content tagging migration, see Content Tagging above.

Reference Tracking Types

import type {
  FileRef,           // Individual reference from entity to file
  FileMetadataRecordV2,  // Extended record with refs, status, scope
  FileWithStatus,    // Rich view: record + parsed refs + is_orphaned
  FileStatus,        // 'active' | 'orphaned' | 'soft_deleted' | 'missing'
  AddRefOptions,     // Options for adding a reference
  RemoveRefsCriteria, // Criteria for bulk ref removal
} from 'hazo_files';

File Change Detection

Detect file content changes using fast xxHash hashing.

import { TrackedFileManager, computeFileHash, hasFileContentChanged } from 'hazo_files';

// TrackedFileManager automatically tracks file hashes on upload
const result = await trackedManager.uploadFile(buffer, '/docs/report.pdf', {
  skipHash: false, // Hash is computed by default
  awaitRecording: true, // Wait for DB record before returning
});

// Check if a file has changed since it was tracked
const hasChanged = await trackedManager.hasFileChanged('/docs/report.pdf');
if (hasChanged) {
  console.log('File has been modified since last upload');
}

// Get stored hash and size
const hash = await trackedManager.getStoredHash('/docs/report.pdf');
const size = await trackedManager.getStoredSize('/docs/report.pdf');

// Use hash utilities directly
const fileHash = await computeFileHash(buffer);
const changed = await hasFileContentChanged(oldHash, newBuffer);
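
The underlying idea is to store a hash at upload time and compare it with a hash of the current bytes later. The sketch here illustrates that comparison with 32-bit FNV-1a as a dependency-free stand-in; it is not xxHash and not the library's computeFileHash, and the function names are hypothetical.

```typescript
// Stand-in hash for illustration: 32-bit FNV-1a, NOT the xxHash used by hazo_files.
function fnv1a32Sketch(bytes: Uint8Array): string {
  let hash = 0x811c9dc5; // FNV offset basis
  bytes.forEach((byte) => {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32-bit
  });
  return hash.toString(16).padStart(8, '0');
}

// Content is considered changed when the fresh hash differs from the stored one.
function hasContentChangedSketch(storedHash: string, currentBytes: Uint8Array): boolean {
  return fnv1a32Sketch(currentBytes) !== storedHash;
}

const originalBytes = Uint8Array.from([104, 101, 108, 108, 111]); // "hello"
const editedBytes = Uint8Array.from([104, 101, 108, 108, 112]);   // "hellp"
const storedHashSketch = fnv1a32Sketch(originalBytes);

console.log(hasContentChangedSketch(storedHashSketch, originalBytes)); // false
console.log(hasContentChangedSketch(storedHashSketch, editedBytes));   // true
```

A non-cryptographic hash is enough for change detection because the goal is speed, not resistance to deliberate collisions.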

Server Entry Point

For server-side applications, use the /server entry point, which includes a factory function:

import { createHazoFilesServer } from 'hazo_files/server';

const hazoFiles = await createHazoFilesServer({
  crudService: fileCrud,
  namingCrudService: namingCrud,
  config: {
    provider: 'local',
    local: { basePath: './storage' },
  },
  enableTracking: true,
  llmFactory: {
    create: (provider) => createLLM({ provider }),
    invalidateCache: (area, key) => invalidate_prompt_cache(area, key), // optional
  },
  // Optional: enable automatic content tagging for all uploads
  defaultContentTagConfig: {
    content_tag_set_by_llm: true,
    content_tag_prompt_area: 'classification',
    content_tag_prompt_key: 'classify_document',
    content_tag_prompt_return_fieldname: 'document_type',
  },
});

// Access all services
const { fileManager, metadataService, namingService, extractionService, uploadExtractService, invalidatePromptCache } = hazoFiles;

// Invalidate prompt cache without importing hazo_llm_api directly
invalidatePromptCache?.('classification', 'classify_document');

Background Upload Pipelines

A framework-agnostic upload pipeline engine that survives React component unmount. Useful when uploads include multi-step server work (upload → LLM extract → user confirmation → DB commit) and the user may navigate away mid-flight.

Two subpath exports:

  • hazo_files/background-upload — core, no React dependency (UploadManager, Job, PipelineExecutor, TypedEventEmitter, all types)
  • hazo_files/background-upload/react — React bindings (HazoFileUploadProvider, useFileUpload, useJobStatus, useFileUploadToasts)

Core API (framework-agnostic)

import { UploadManager } from 'hazo_files/background-upload';
import type { PipelineStep, PipelineContext, JobHandle } from 'hazo_files/background-upload';

const uploadStep: PipelineStep = {
  name: 'upload',
  async execute(ctx: PipelineContext, handle: JobHandle) {
    handle.set_status('uploading');
    for (let i = 0; i < ctx.files.length; i++) {
      // ... POST file to your /api/files/upload route
      handle.set_progress(i + 1, ctx.files.length);
    }
  },
};

const extractStep: PipelineStep = {
  name: 'extract',
  async execute(ctx, handle) {
    handle.set_status('processing');
    const extracted = await fetch('/api/extract', { /* ... */ }).then(r => r.json());
    ctx.extracted_data = extracted;
  },
};

const manager = new UploadManager({
  max_concurrent: 2,
  default_pipeline_steps: [uploadStep, extractStep],
});

manager.on('job:completed', ({ job }) => console.log('done', job.job_id));

const batch_id = manager.submit_batch({
  files: [file1, file2],
  group_id: 'project-123',
  group_label: 'Q4 Tax Documents',
});

React Provider + Hooks

// app/layout.tsx (Next.js) or your app root
'use client';
import { HazoFileUploadProvider } from 'hazo_files/background-upload/react';
import { Toaster } from 'sonner';

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <HazoFileUploadProvider config={{ max_concurrent: 2 }}>
      <Toaster richColors />
      {children}
    </HazoFileUploadProvider>
  );
}

'use client';
import { useFileUpload, useJobStatus } from 'hazo_files/background-upload/react';

export function UploadButton() {
  const { submit_batch, active_jobs } = useFileUpload();

  function onPick(files: FileList) {
    submit_batch({
      files: Array.from(files),
      group_id: 'project-123',
      group_label: 'Project 123',
      pipeline_steps: [/* your PipelineStep[] */],
    });
  }

  return (
    <>
      <input type="file" multiple onChange={(e) => onPick(e.target.files!)} />
      <ul>
        {active_jobs.map((j) => (
          <li key={j.job_id}>{j.group_label}: {j.status}</li>
        ))}
      </ul>
    </>
  );
}

// Track a single job
export function JobBadge({ job_id }: { job_id: string }) {
  const job = useJobStatus(job_id);
  if (!job) return null;
  return <span>{job.status} {job.progress && `${job.progress.current}/${job.progress.total}`}</span>;
}

Confirmation Steps (user-in-the-loop)

const confirmStep: PipelineStep = {
  name: 'confirm',
  async execute(ctx, handle) {
    handle.set_status('awaiting_confirmation');
    const result = await handle.request_confirmation({
      conflicts: ctx.extracted_data.conflicts,
    });
    if (!result.confirmed) throw new Error('User cancelled');
    // ctx.extracted_data now reflects user choices via result.data
  },
};

In the UI, subscribe to job:confirmation_needed (or read jobs in awaiting_confirmation status), render a dialog, then:

const { resolve_confirmation } = useFileUpload();
resolve_confirmation(job_id, { confirmed: true, data: userChoices });
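
One way to picture the request/resolve handshake is a map of pending promise resolvers keyed by job ID. The class here is a hypothetical sketch of that pattern, not the library's internals; the result shape mirrors the { confirmed, data } object used in the snippets.

```typescript
// Hypothetical sketch of a confirmation handshake; names are illustrative only.
interface ConfirmationResultSketch<T> {
  confirmed: boolean;
  data?: T;
}

class ConfirmationGateSketch<T> {
  private readonly pending = new Map<string, (result: ConfirmationResultSketch<T>) => void>();

  // Called from a pipeline step: returns a promise that stays pending until the UI answers.
  request(jobId: string): Promise<ConfirmationResultSketch<T>> {
    return new Promise((resolve) => {
      this.pending.set(jobId, resolve);
    });
  }

  // Called from the UI: settles the waiting step. Returns false if nothing was waiting.
  resolve(jobId: string, result: ConfirmationResultSketch<T>): boolean {
    const resolver = this.pending.get(jobId);
    if (resolver === undefined) return false;
    this.pending.delete(jobId);
    resolver(result);
    return true;
  }

  isWaiting(jobId: string): boolean {
    return this.pending.has(jobId);
  }
}

const gate = new ConfirmationGateSketch<{ keep: string }>();
const waitingStep = gate.request('job-1');
waitingStep.then((result) => console.log('confirmed:', result.confirmed));

const waitingBefore = gate.isWaiting('job-1');
const firstAnswer = gate.resolve('job-1', { confirmed: true, data: { keep: 'new' } });
const secondAnswer = gate.resolve('job-1', { confirmed: false });
console.log(waitingBefore, firstAnswer, secondAnswer); // true true false
```

Because the resolver lives outside any component, the step keeps waiting even if the dialog that will eventually answer it has not been mounted yet.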

Sonner Toast Bridge

The provider mounts a ToastBridge by default (enable_toasts={true}) that uses sonner to notify on job:completed, job:error, job:confirmation_needed, and batch:completed. Sonner is a soft optional peer dependency — if it isn't installed, the bridge is a no-op. Set enable_toasts={false} on the provider to opt out, or wire useFileUploadToasts(manager) yourself for custom toast behavior.

Events

| Event | Payload | Fired when |
|-------|---------|------------|
| job:created | { job } | Job enters the queue |
| job:status_changed | { job, previous_status } | Status transitions (queued → uploading → processing → ...) |
| job:progress | { job } | handle.set_progress called inside a pipeline step |
| job:completed | { job } | All pipeline steps finished successfully |
| job:error | { job, error } | A pipeline step threw |
| job:confirmation_needed | { job, payload } | handle.request_confirmation called |
| job:confirmation_resolved | { job, result } | resolve_confirmation called |
| batch:progress | { batch } | Any job in the batch settles |
| batch:completed | { batch } | All jobs in the batch are done or error |
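
A typed emitter ties each event name to its payload type, so a listener for one event cannot receive another event's payload. The sketch here is a hypothetical, minimal version of that idea; it is not the exported TypedEventEmitter, and the payloads are simplified to plain IDs and status strings.

```typescript
// Hypothetical minimal typed emitter; the real TypedEventEmitter API may differ.
type ListenerSketch<T> = (payload: T) => void;

class TinyEmitterSketch<Events extends Record<string, unknown>> {
  private listeners: { [K in keyof Events]?: Set<ListenerSketch<Events[K]>> } = {};

  // Subscribe and get back an unsubscribe function.
  on<K extends keyof Events>(event: K, listener: ListenerSketch<Events[K]>): () => void {
    const set = this.listeners[event] ?? new Set<ListenerSketch<Events[K]>>();
    this.listeners[event] = set;
    set.add(listener);
    return () => {
      set.delete(listener);
    };
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    this.listeners[event]?.forEach((listener) => listener(payload));
  }
}

// Simplified payloads (assumption): the documented payloads carry full job objects.
type UploadEventsSketch = {
  'job:status_changed': { job_id: string; status: string; previous_status: string };
  'job:completed': { job_id: string };
};

const emitter = new TinyEmitterSketch<UploadEventsSketch>();
const transitions: string[] = [];

const unsubscribe = emitter.on('job:status_changed', ({ previous_status, status }) => {
  transitions.push(`${previous_status} -> ${status}`);
});

emitter.emit('job:status_changed', { job_id: 'j1', status: 'uploading', previous_status: 'queued' });
unsubscribe();
emitter.emit('job:status_changed', { job_id: 'j1', status: 'processing', previous_status: 'uploading' });

console.log(transitions); // ["queued -> uploading"]
```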

Design Notes

  • Survives unmount: UploadManager lives on a useRef; pipelines run on the manager, not on React state. Navigating away does not abort uploads.
  • Single source of truth: useFileUpload / useJobStatus subscribe via useSyncExternalStore against the manager's event emitter, so multiple components stay consistent.
  • Concurrency: max_concurrent controls how many jobs the executor runs in parallel; the rest wait in the FIFO queue.
  • No DB writes: This module is purely an in-memory pipeline runner — your pipeline steps own all server I/O.
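
The concurrency note can be sketched as a small FIFO runner: jobs start in submission order until the limit is reached, and each finished job frees a slot for the next one. FifoRunnerSketch is a hypothetical illustration, not the library's PipelineExecutor, and it swallows job errors where a real runner would emit an error event.

```typescript
// Hypothetical sketch of a concurrency-limited FIFO queue; names are illustrative only.
type JobFnSketch = () => Promise<void>;

class FifoRunnerSketch {
  private readonly maxConcurrent: number;
  private readonly queue: JobFnSketch[] = [];
  private running = 0;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  get runningCount(): number {
    return this.running;
  }

  get queuedCount(): number {
    return this.queue.length;
  }

  submit(job: JobFnSketch): void {
    this.queue.push(job);
    this.drain();
  }

  // Start queued jobs in submission order until the concurrency limit is reached.
  private drain(): void {
    while (this.running < this.maxConcurrent) {
      const job = this.queue.shift();
      if (job === undefined) return;
      this.running += 1;
      Promise.resolve()
        .then(job)
        .catch(() => undefined) // a real runner would surface the error to listeners
        .finally(() => {
          this.running -= 1;
          this.drain(); // a freed slot pulls the next job from the queue
        });
    }
  }
}

const runner = new FifoRunnerSketch(2);
const neverSettles: JobFnSketch = () => new Promise<void>(() => {});
runner.submit(neverSettles);
runner.submit(neverSettles);
runner.submit(neverSettles);
console.log(runner.runningCount, runner.queuedCount); // 2 1
```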

API Reference

FileManager

Main service class providing unified file operations.

Methods

  • initialize(config?: HazoFilesConfig): Promise<void> - Initialize the file manager
  • createDirectory(path: string): Promise<OperationResult<FolderItem>> - Create directory
  • removeDirectory(path: string, recursive?: boolean): Promise<OperationResult> - Remove directory
  • uploadFile(source, remotePath, options?): Promise<OperationResult<FileItem>> - Upload file
  • downloadFile(remotePath, localPath?, options?): Promise<OperationResult<Buffer | string>> - Download file
  • moveItem(sourcePath, destinationPath, options?): Promise<OperationResult<FileSystemItem>> - Move file/folder
  • deleteFile(path: string): Promise<OperationResult> - Delete file
  • renameFile(path, newName, options?): Promise<OperationResult<FileItem>> - Rename file
  • renameFolder(path, newName, options?): Promise<OperationResult<FolderItem>> - Rename folder
  • listDirectory(path, options?): Promise<OperationResult<FileSystemItem[]>> - List directory contents
  • getItem(path: string): Promise<OperationResult<FileSystemItem>> - Get file/folder info
  • exists(path: string): Promise<boolean> - Check if file/folder exists
  • getFolderTree(path?, depth?): Promise<OperationResult<TreeNode[]>> - Get folder tree
  • writeFile(path, content, options?): Promise<OperationResult<FileItem>> - Write text file
  • readFile(path: string): Promise<OperationResult<string>> - Read text file
  • copyFile(sourcePath, destinationPath, options?): Promise<OperationResult<FileItem>> - Copy file
  • ensureDirectory(path: string): Promise<OperationResult<FolderItem>> - Ensure directory exists

Types

type StorageProvider = 'local' | 'google_drive';

interface FileItem {
  id: string;
  name: string;
  path: string;
  size: number;
  mimeType: string;
  createdAt: Date;
  modifiedAt: Date;
  isDirectory: false;
  parentId?: string;
  metadata?: Record<string, unknown>;
}

interface FolderItem {
  id: string;
  name: string;
  path: string;
  createdAt: Date;
  modifiedAt: Date;
  isDirectory: true;
  parentId?: string;
  children?: (FileItem | FolderItem)[];
  metadata?: Record<string, unknown>;
}

interface OperationResult<T = void> {
  success: boolean;
  data?: T;
  error?: string;
}

interface UploadOptions {
  overwrite?: boolean;
  onProgress?: (progress: number, bytesTransferred: number, totalBytes: number) => void;
  metadata?: Record<string, unknown>;
}

See src/types/index.ts for complete type definitions.
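
Since most methods return an OperationResult rather than a bare value, callers usually branch on success before touching data. A small helper can centralize that check; unwrapResult is a hypothetical convenience written for this sketch, not a hazo_files export, and the interface is repeated only to keep the snippet self-contained.

```typescript
// OperationResult is copied from the types above; unwrapResult is a hypothetical helper.
interface OperationResult<T = void> {
  success: boolean;
  data?: T;
  error?: string;
}

// Return the payload on success, otherwise throw with the reported error message.
function unwrapResult<T>(result: OperationResult<T>, fallbackMessage = 'Operation failed'): T {
  if (!result.success || result.data === undefined) {
    throw new Error(result.error ?? fallbackMessage);
  }
  return result.data;
}

const okResult: OperationResult<{ path: string }> = { success: true, data: { path: '/docs/a.txt' } };
const failedResult: OperationResult<{ path: string }> = { success: false, error: 'File not found' };

console.log(unwrapResult(okResult).path); // "/docs/a.txt"

let failureMessage = '';
try {
  unwrapResult(failedResult);
} catch (err) {
  failureMessage = err instanceof Error ? err.message : String(err);
}
console.log(failureMessage); // "File not found"
```

This helper only suits results that carry data; for OperationResult<void>, checking success directly is simpler.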

Error Handling

hazo_files provides comprehensive error types:

import {
  FileNotFoundError,
  DirectoryNotFoundError,
  FileExistsError,
  DirectoryExistsError,
  DirectoryNotEmptyError,
  PermissionDeniedError,
  InvalidPathError,
  FileTooLargeError,
  InvalidExtensionError,
  AuthenticationError,
  ConfigurationError,
  OperationError
} from 'hazo_files';

// Use in try-catch
try {
  await fileManager.uploadFile(buffer, '/files/test.exe');
} catch (error) {
  if (error instanceof InvalidExtensionError) {
    console.error('File type not allowed');
  } else if (error instanceof FileTooLargeError) {
    console.error('File is too large');
  }
}

FileStorageProvider (v2 Provider API)

v2 introduces a slimmer storage abstraction — FileStorageProvider — that sits alongside (not replacing) the existing FileManager/StorageModule stack. Use it when you don't need folder trees, metadata tracking, or naming conventions, and just want put/get/signed-URL semantics.

Interface

import type { FileStorageProvider } from 'hazo_files/server';

interface FileStorageProvider {
  put(path: string, body: Buffer | Readable, opts?: PutOpts): Promise<PutResult>;
  get(path: string): Promise<Buffer | Readable>;
  delete(path: string): Promise<void>;
  exists(path: string): Promise<boolean>;
  getSignedUrl(path: string, opts?: SignedUrlOpts): Promise<string>;
  probe(): Promise<ProbeResult>;
}

AppFileServerProvider

Local filesystem with HMAC-signed download URLs. Ideal for self-hosted apps where files are served through an API route.

import { AppFileServerProvider } from 'hazo_files/server';

const provider = new AppFileServerProvider({
  root: './storage',          // filesystem root
  hmac_secret: process.env.FILE_HMAC_SECRET!,
  default_ttl_seconds: 300,   // default 5 min
});

// Store a file
const result = await provider.put('uploads/report.pdf', buffer, {
  contentType: 'application/pdf',
});
// { provider: 'app_file_server', native_id: 'uploads/report.pdf', size: 12345 }

// Generate a signed URL (expires in 5 min)
const url = await provider.getSignedUrl('uploads/report.pdf');
// '/api/files/serve/1748080800.ABC123.../uploads/report.pdf'

Serving signed URLs (Next.js API route)

// app/api/files/serve/[...token]/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { AppFileServerProvider } from 'hazo_files/server';

const provider = new AppFileServerProvider({
  root: './storage',
  hmac_secret: process.env.FILE_HMAC_SECRET!,
});

export async function GET(
  _req: NextRequest,
  { params }: { params: { token: string[] } }
) {
  // token = ['<exp>.<sig>', 'uploads', 'report.pdf']
  const [token, ...pathParts] = params.token;
  const filePath = pathParts.join('/');

  if (!provider.verifySignedUrl(token, filePath)) {
    return new NextResponse('Forbidden', { status: 403 });
  }

  const buf = await provider.get(filePath) as Buffer;
  return new NextResponse(buf, {
    headers: { 'Content-Type': 'application/octet-stream' },
  });
}
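
For intuition, an expiring HMAC token of the '<exp>.<sig>' form can be built and checked with Node's crypto module as sketched here. This is an assumption-laden illustration: the signed payload format (expiry, a colon, then the path), the hex encoding, and the function names are all made up for the sketch, and AppFileServerProvider's actual token layout may differ.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical payload format: "<expiresAtSec>:<path>" signed with HMAC-SHA256.
function computeSignatureSketch(secret: string, path: string, expiresAtSec: number): string {
  return createHmac('sha256', secret).update(`${expiresAtSec}:${path}`).digest('hex');
}

// Token layout "<exp>.<sig>", matching the shape shown in the route comment.
function signPathSketch(secret: string, path: string, expiresAtSec: number): string {
  return `${expiresAtSec}.${computeSignatureSketch(secret, path, expiresAtSec)}`;
}

function verifyTokenSketch(secret: string, token: string, path: string, nowSec: number): boolean {
  const dot = token.indexOf('.');
  if (dot <= 0) return false;

  const expiresAtSec = Number(token.slice(0, dot));
  if (!Number.isInteger(expiresAtSec) || expiresAtSec < nowSec) return false; // malformed or expired

  const given = Buffer.from(token.slice(dot + 1), 'hex');
  const expected = Buffer.from(computeSignatureSketch(secret, path, expiresAtSec), 'hex');
  // timingSafeEqual requires equal lengths, so check that first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

const demoSecret = 'demo-secret';
const issuedAtSec = 1700000000;
const demoToken = signPathSketch(demoSecret, 'uploads/report.pdf', issuedAtSec + 300);

console.log(verifyTokenSketch(demoSecret, demoToken, 'uploads/report.pdf', issuedAtSec));       // true
console.log(verifyTokenSketch(demoSecret, demoToken, 'uploads/other.pdf', issuedAtSec));        // false
console.log(verifyTokenSketch(demoSecret, demoToken, 'uploads/report.pdf', issuedAtSec + 301)); // false
```

Binding the path into the signature is what stops a valid token for one file from being replayed against another, and the constant-time comparison avoids leaking how many signature bytes matched.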

GoogleDriveProvider

Service-account Google Drive for Shared Drives. No OAuth flow — configure a GCP service account with Shared Drive access.

import { GoogleDriveProvider } from 'hazo_files/server';
import type { DrivePathCache } from 'hazo_files/server';

// Implement the path cache (this example assumes an existing Redis client named `redis`; a store backed by your hazo_connect adapter works too)
const pathCache: DrivePathCache = {
  lookup: async (key) => redis.get(`gdrive:${key}`),
  write:  async (key, id) => redis.set(`gdrive:${key}`, id),
  invalidate: async (key) => redis.del(`gdrive:${key}`),
};

const provider = new GoogleDriveProvider({
  service_account_json: process.env.GOOGLE_SERVICE_ACCOUNT_JSON!,
  shared_drive_id: 'your-shared-drive-id',
  path_cache: pathCache,
});

// Upload a file (folders are created lazily)
const result = await provider.put('2026/ACME/invoice.pdf', buffer, {
  contentType: 'application/pdf',
});
// { provider: 'gdrive', native_id: '<drive-file-id>', size: 12345 }

// Check connectivity
const health = await provider.probe();
// { ok: true } or { ok: false, error: 'drive_not_shared', message: '...' }

Note: getSignedUrl for Google Drive returns a native drive.google.com/uc?id=... link, which requires the viewer to have Drive access. For public unauthenticated downloads, use AppFileServerProvider instead or implement your own proxy.

InMemoryProvider (testing)

Import from the dedicated hazo_files/testing subpath — keeps server-only modules out of your test bundle.

import { InMemoryProvider } from 'hazo_files/testing';

const store = new InMemoryProvider();

await store.put('docs/readme.txt', Buffer.from('hello'));
const buf = await store.get('docs/readme.txt');
console.log(buf.toString()); // 'hello'

// Test helper: inspect internal state
const snap = store.snapshot();
console.log([...snap.keys()]); // ['docs/readme.txt']

// getSignedUrl returns a data: URL — no HTTP server needed
const url = await store.getSignedUrl('docs/readme.txt');
// 'data:application/octet-stream;base64,aGVsbG8='

Provider Errors

import {
  StorageCollisionExhausted,
  StorageNotConfigured,
  StorageUnavailable,
} from 'hazo_files/server';

// StorageCollisionExhausted — thrown when put with ifNotExists keeps colliding
// StorageNotConfigured     — thrown when provider config is absent
// StorageUnavailable       — wraps a ProbeResult error for runtime checks

Extending with Custom Storage Providers

See docs/ADDING_MODULES.md for a complete guide on creating custom storage modules.

Quick example:

import { BaseStorageModule } from 'hazo_files';
import type { StorageProvider, OperationResult, FileItem, HazoFilesConfig } from 'hazo_files';

class S3StorageModule extends BaseStorageModule {
  readonly provider: StorageProvider = 's3' as StorageProvider;

  async initialize(config: HazoFilesConfig): Promise<void> {
    await super.initialize(config);
    // Initialize S3 client
  }

  async uploadFile(source, remotePath, options?): Promise<OperationResult<FileItem>> {
    // Implement S3 upload
  }

  // Implement other required methods...
}

// Register the module
import { registerModule } from 'hazo_files';
registerModule('s3', () => new S3StorageModule());

Debug Integration (hazo_debug)

hazo_files emits structured file_operation log entries with timing, storage provider, and operation type on every file operation. You can forward these to hazo_debug for a visual Files tab in the debug panel.

The integration uses the existing logger option, so hazo_files itself takes no direct dependency on hazo_debug:

import { createHazoFilesServer } from 'hazo_files/server';
import { use_debug_files } from 'hazo_debug/client';

// In your server setup:
const { log_file_op } = use_debug_files();

const { fileManager } = await createHazoFilesServer({
  config: { provider: 'local', local: { basePath: './files' } },
  logger: {
    info: (msg, data) => {
      if (msg === 'file_operation') log_file_op(data);
    },
    error: (msg, data) => {
      if (msg === 'file_operation') log_file_op(data);
    },
  },
});

Every file_operation log entry includes:

| Field | Type | Description |
|-------|------|-------------|
| operation | string | 'upload' \| 'download' \| 'delete' \| 'move' \| 'list' \| 'extract' |
| file_name | string? | File name |
| file_path | string? | Virtual file path |
| mime_type | string? | MIME type |
| size_bytes | number? | File size in bytes |
| storage | string? | Storage provider ('local', 'google_drive', etc.) |
| duration_ms | number | Operation duration in milliseconds |
| success | boolean | Whether the operation succeeded |
| error | string? | Error message (on failure) |
| metadata | object? | Extra context (e.g., { type: 'rename', new_name }) |

Logging is emitted at two levels:

  • FileManager logs the storage-level operation (actual file I/O timing)
  • FileMetadataService logs the database tracking operation (if tracking is enabled)
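
A log entry of this shape could be produced by a timing wrapper along these lines. The wrapper is a hypothetical sketch, not code from hazo_files: it is synchronous for brevity (real file operations are asynchronous) and fills in only a subset of the documented fields.

```typescript
// Hypothetical sketch of a timing wrapper that emits file_operation entries.
interface FileOperationEntrySketch {
  operation: string;
  file_path?: string;
  storage?: string;
  duration_ms: number;
  success: boolean;
  error?: string;
}

interface LoggerSketch {
  info(msg: string, data: FileOperationEntrySketch): void;
  error(msg: string, data: FileOperationEntrySketch): void;
}

function withFileOperationLog<T>(
  logger: LoggerSketch,
  base: { operation: string; file_path?: string; storage?: string },
  run: () => T,
): T {
  const startedAt = Date.now();
  try {
    const value = run();
    logger.info('file_operation', { ...base, duration_ms: Date.now() - startedAt, success: true });
    return value;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    logger.error('file_operation', {
      ...base,
      duration_ms: Date.now() - startedAt,
      success: false,
      error: message,
    });
    throw err; // logging must not swallow the failure
  }
}

const loggedEntries: FileOperationEntrySketch[] = [];
const memoryLogger: LoggerSketch = {
  info: (_msg, data) => { loggedEntries.push(data); },
  error: (_msg, data) => { loggedEntries.push(data); },
};

const bytesWritten = withFileOperationLog(
  memoryLogger,
  { operation: 'upload', file_path: '/docs/a.txt', storage: 'local' },
  () => 5,
);
console.log(bytesWritten, loggedEntries[0]?.success); // 5 true
```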

Testing

The package includes a test application in test-app/ demonstrating:

  • Next.js 14+ integration
  • API routes for file operations
  • FileBrowser UI component usage
  • Local storage and Google Drive switching
  • OAuth flow implementation

To run the test app:

cd test-app
npm install
npm run dev

Visit http://localhost:3000

Browser Compatibility

The UI components require:

  • Modern browsers with ES2020+ support
  • React 18+
  • CSS Grid and Flexbox support

Server-side code requires Node.js 16+.

License

MIT License - see LICENSE file for details

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes with clear messages
  4. Add tests for new functionality
  5. Submit a pull request

Quota Tracking (v1.6.0)

Per-scope opt-in quota tracking. A scope with no quota row has no limit — uploads succeed regardless (fail-open).
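
The fail-open rule can be expressed in a few lines. The row shape and field names here are hypothetical, chosen only to illustrate the decision; the actual quota table columns are not shown in this section.

```typescript
// Hypothetical quota row; the real column names may differ.
interface QuotaRowSketch {
  scope_id: string;
  limit_bytes: number;
  used_bytes: number;
}

// Fail-open: a scope without a quota row is never blocked.
function canUploadSketch(row: QuotaRowSketch | undefined, incomingBytes: number): boolean {
  if (row === undefined) return true;
  return row.used_bytes + incomingBytes <= row.limit_bytes;
}

const limitedScope: QuotaRowSketch = { scope_id: 'workspace-123', limit_bytes: 1000, used_bytes: 900 };

console.log(canUploadSketch(undefined, 5_000_000)); // true  (no row, no limit)
console.log(canUploadSketch(limitedScope, 100));    // true  (exactly at the limit)
console.log(canUploadSketch(limitedScope, 101));    // false (would exceed the limit)
```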

Setup

Run migration 006 to create the