@jucie-engine/database

v1.0.8

Published

2 months ago

Database extension for @jucie-engine/core - SQLite-based data storage and management

Downloads

0High
0Medium
0Low

codewerks

database sqlite

@brickworks/database-ext

A high-performance file-based database extension for the Brickworks engine, providing persistent data storage with collections, automatic checkpointing, and Write-Ahead Logging (WAL) for reliability.

Built on top of the reactive State system, this database provides a document-oriented interface with JSON file persistence.

Features

🗄️ JSON file persistence with automatic file management
📁 Collection-based data organization for structured storage
⚡ Write-Ahead Logging (WAL) for performance and reliability
🔄 Automatic checkpointing when change thresholds are reached
💾 Backup and restore functionality
🎯 Document storage with auto-generated unique IDs
🔍 Query capabilities with findWhere and findAllWhere
⚙️ Reactive state integration with the Brickworks State system
🧪 Debug methods for testing and development

Architecture

The database is built in three layers:

1. State (Core)

The reactive state management system (@jucie-state/core) provides:

Path-based reactive data access
Change tracking with ledger system
Query capabilities (findWhere, findAllWhere)
Conflict resolution and merging
Export/import functionality

2. Database (Persistence)

The Database class wraps State and adds file-based persistence:

Write-Ahead Log pattern: All changes append to .wal file immediately
Periodic checkpointing: After N changes (default 1000), full state exports to .db file
Crash recovery: On startup, loads .db then replays any .wal entries
Change listener: Automatically persists every State mutation

3. Collection (Document API)

Provides a document-oriented interface:

Auto-generates unique document IDs using @jucie/id
Nested collections with collection(name)
Query methods for finding documents
All operations delegate to underlying State reference

How It Works

Data Flow

User Operation → Collection → State → Database Change Listener → WAL File
                    ↓
              After 1000 changes
                    ↓
              Full State → .db File
                    ↓
              Clear WAL

File System

The database maintains two files:

.db file: Full snapshot of the state tree (JSON format)
.wal file: Append-only log of changes since last checkpoint (newline-delimited JSON)

Durability Guarantee

Every state change is immediately written to the WAL file before returning, ensuring:

Crash recovery: If app crashes, WAL entries are replayed on restart
No data loss: Changes are persisted synchronously
Performance: WAL appends are fast; expensive full writes happen periodically

Initialization & Recovery Process

When initializeDb() is called:

Check for .db file:
- If missing: Export current state to create initial .db file
- If exists: Load .db file into State
Check for .wal file:
- If exists: Replay all WAL entries on top of loaded state
- This recovers any changes made since last checkpoint
Checkpoint: Write the recovered state back to .db and clear WAL
Attach listener: Start listening for State changes to persist to WAL

Example recovery scenario:

Initial state:    users: { user1: { name: "Alice" } }
After 500 writes: users: { user1: { name: "Alice" }, user2: { name: "Bob" }, ... }
                  ↓ (checkpoint not reached, only in WAL)
Crash & restart:  
                  ↓ Load .db file
                  users: { user1: { name: "Alice" } }
                  ↓ Replay .wal entries
                  users: { user1: { name: "Alice" }, user2: { name: "Bob" }, ... }
                  ↓ Checkpoint and clear WAL
                  ✓ All data recovered

Installation

npm install @brickworks/database-ext @jucie-engine/core

Quick Start

import { Engine } from '@jucie-engine/core';
import { Database } from '@brickworks/database-ext';

// Create engine with database
const engine = await Engine.create()
  .install(Database.configure({
    dbPath: './data',
    dbName: 'myapp.db',
    walPath: './data', 
    walName: 'myapp.wal'
  }));

// Initialize database
await engine.database.initializeDb();

// Get a collection
const users = engine.database.collection('users');

// Add data
const userId = users.add({
  name: 'John Doe',
  email: '[email protected]',
  created: new Date()
});

// Query data
const user = users.get([userId]);
console.log(user); // { name: 'John Doe', email: '[email protected]', ... }

Configuration

Database.configure({
  // Database file settings
  dbPath: './data',        // Directory for database file
  dbName: 'app.db',        // Database filename
  
  // WAL file settings  
  walPath: './data',       // Directory for WAL file
  walName: 'app.wal',      // WAL filename
  
  // Performance settings
  checkpointThreshold: 100,  // Soft limit - defer checkpoint to idle (default: 100)
  forceThreshold: 1000,      // Hard limit - force immediate checkpoint (default: 1000)
  checkpointInterval: 5000,  // Max ms between checkpoints (default: 5000)
  
  // Debug settings
  debug: false             // Enable debug methods
})

Checkpoint Strategy

The database uses a dual-threshold, triple-trigger checkpoint strategy:

Triggers:

Soft threshold (checkpointThreshold): Defer checkpoint to idle time
Hard threshold (forceThreshold): Force immediate checkpoint
Time-based (checkpointInterval): Checkpoint after T milliseconds

Behavior:

At 100 changes (soft): Schedule checkpoint for next idle tick ⏱️
At 1000 changes (hard): Force immediate checkpoint ⚠️
After 5 seconds (time): Checkpoint regardless of change count 🕐

Features:

✅ Always deferred to idle (except force threshold)
✅ Transaction-aware - won't interrupt active batch operations
✅ Async writes - uses fsPromises to avoid blocking
✅ Backup rotation - keeps previous DB + WAL for corruption recovery

Collections API

Creating and Managing Collections

// Get a collection (creates if doesn't exist)
const posts = engine.database.collection('posts');
const comments = engine.database.collection('comments');

Adding Data

// Add with auto-generated ID
const postId = posts.add({
  title: 'My First Post',
  content: 'Hello world!',
  published: true
});

// Set with specific key
posts.set(['featured', 'hero-post'], {
  title: 'Featured Post',
  content: 'This is featured content'
});

Reading Data

// Get by ID
const post = posts.get([postId]);

// Get by path
const heroPost = posts.get(['featured', 'hero-post']);

// Check existence
const exists = posts.has([postId]);

// Get all keys
const allPostIds = posts.keys();
const featuredKeys = posts.keys(['featured']);

Updating Data

// Update with a function
posts.update([postId], post => ({
  ...post,
  views: (post.views || 0) + 1,
  lastViewed: new Date()
}));

// Direct set (overwrites)
posts.set([postId], {
  title: 'Updated Title',
  content: 'Updated content'
});

Removing Data

// Remove by ID
posts.remove([postId]);

// Remove by path
posts.remove(['featured', 'hero-post']);

Querying Data

// Find single document where field matches value
const publishedPost = posts.findWhere('published', '===', true);

// Find all documents matching criteria
const allPublished = posts.findAllWhere('published', '===', true);

// Query nested properties
const recentPosts = posts.findAllWhere(['meta', 'created'], '>', lastWeek);

Nested Collections

Collections can contain other collections for hierarchical organization:

const posts = engine.database.collection('posts');
const postId = posts.add({ title: 'My Post', content: '...' });

// Create a nested collection for comments
const comments = posts.collection('comments');
comments.set([postId, 'comment1'], {
  author: 'Alice',
  text: 'Great post!',
  timestamp: Date.now()
});

// Access nested data
const comment = comments.get([postId, 'comment1']);

// Even deeper nesting
const replies = comments.collection('replies');
replies.set([postId, 'comment1', 'reply1'], {
  author: 'Bob',
  text: 'I agree!',
  timestamp: Date.now()
});

This creates a structure like:

{
  "posts": {
    "doc_123": {
      "title": "My Post",
      "content": "..."
    }
  },
  "comments": {
    "doc_123": {
      "comment1": {
        "author": "Alice",
        "text": "Great post!",
        "timestamp": 1234567890
      }
    },
    "replies": {
      "doc_123": {
        "comment1": {
          "reply1": {
            "author": "Bob",
            "text": "I agree!",
            "timestamp": 1234567891
          }
        }
      }
    }
  }
}

Database Operations

Manual Persistence

// Force a checkpoint (respects active batches)
await engine.database.checkpoint();

// Check if database is initialized
const isReady = engine.database.isInitialized();

// Manual batch control
engine.database.beginBatch();
// ... your operations ...
engine.database.endBatch();

// Or use the batch wrapper (recommended)
engine.database.batch(state => {
  // Operations here won't be interrupted by checkpoints
  state.set(['key'], value);
});

Backup and Restore

// Create manual backup (full export)
const backupPath = await engine.database.backup('./backups/backup.json');
console.log(`Backup saved to: ${backupPath}`);

// Restore from manual backup
const restored = await engine.database.restore('./backups/backup.json');
if (restored) {
  console.log('Database restored successfully');
}

// Restore from automatic backup (corruption recovery)
const recovered = await engine.database.restoreFromBackup();
if (recovered) {
  console.log('Recovered from backup rotation files');
}

Automatic Backup Rotation

The database automatically maintains backup files for corruption recovery:

Files created:

data/
├── app.db            # Current database
├── app.db.backup     # Previous checkpoint
├── app.wal           # Current WAL
└── app.wal.backup    # WAL at time of previous checkpoint

How it works:

Before each checkpoint, current .db and .wal are copied to .backup files
New checkpoint is written to .db and .wal is cleared
If corruption detected, can restore from .backup files

Recovery from corruption:

// If main DB is corrupted
try {
  await engine.database.initializeDb();
} catch (error) {
  console.error('DB corrupted, restoring from backup');
  await engine.database.restoreFromBackup();
  await engine.database.initializeDb();
}

This gives you:

✅ One checkpoint back in time
✅ All changes from that checkpoint in the WAL backup
✅ Automatic rotation on every checkpoint
✅ No manual intervention needed

Cleanup

// Destroy database and clean up files
engine.database.destroy();

Advanced Usage

Nested Data Structures

const settings = engine.database.collection('settings');

// Store nested configuration
settings.set(['app', 'theme', 'colors'], {
  primary: '#007bff',
  secondary: '#6c757d'
});

// Retrieve nested data
const colors = settings.get(['app', 'theme', 'colors']);

// Update nested properties
settings.update(['app', 'theme'], theme => ({
  ...theme,
  darkMode: true
}));

Performance Considerations

// For high-throughput applications, tune the checkpoint threshold
const highPerformanceDb = Database.configure({
  maxChanges: 10000,  // Checkpoint less frequently
  dbPath: './data',
  dbName: 'high-perf.db'
});

// Use collections to organize data logically
const analytics = engine.database.collection('analytics');
const userEvents = engine.database.collection('user-events');
const systemLogs = engine.database.collection('system-logs');

Error Handling

try {
  await engine.database.initializeDb();
  const users = engine.database.collection('users');
  
  users.add({ name: 'John' });
} catch (error) {
  console.error('Database operation failed:', error);
}

State Integration

Since the database is built on the State system, you get all State features for free:

Reactivity

const users = engine.database.collection('users');

// Watch for changes to a specific user
const unwatch = state.watch(['users', userId], (newValue, oldValue) => {
  console.log('User updated:', newValue);
});

// Make a change - watcher fires automatically
users.update([userId], user => ({ ...user, lastSeen: Date.now() }));

Direct State Access

Collections delegate to State, so you can access State directly if needed:

const users = engine.database.collection('users');

// Collection methods
users.add({ name: 'Alice' });
users.get([userId]);

// Equivalent State access
const { state } = engine.useContext();
state.set(['users', 'doc_123'], { name: 'Alice' });
state.get(['users', 'doc_123']);

Batching Operations

Use collection batching for transaction-aware operations that prevent checkpoint interruptions:

const users = engine.database.collection('users');

// Recommended: Use collection.batch() - automatically defers checkpoints
users.batch(state => {
  // Checkpoint will be deferred until batch completes
  state.set(['user1'], { name: 'Alice', email: '[email protected]' });
  state.set(['user2'], { name: 'Bob', email: '[email protected]' });
  state.remove(['oldUser']);
});
// Checkpoint runs on next idle tick after batch completes

Why this matters: Without batch protection, a checkpoint could happen mid-transaction, causing:

Temporary performance degradation during critical operations
Inconsistent checkpoint timing
Potential event loop blocking

Database-level batching also available:

engine.database.batch(state => {
  state.set(['users', 'user1'], { name: 'Alice' });
  state.set(['posts', 'post1'], { title: 'Hello' });
});

Manual batch control for custom transaction logic:

engine.database.beginBatch();

// Your complex transaction
await performMultiStepOperation();

engine.database.endBatch();
// Checkpoint scheduled for next idle time if needed

State Export/Import

The database's backup/restore uses State's export/import:

// Manual state operations
const { state } = engine.useContext();
const snapshot = await state.export(['users']); // Export only users
await state.import(snapshot); // Import state

// Or use database methods
await engine.database.backup('./backup.json');
await engine.database.restore('./backup.json');

Debug Methods

When debug: true is enabled, additional methods are available for testing:

// Available only when debug: true
await engine.database._afterWrite();        // Wait for pending writes
await engine.database._checkpoint();        // Force checkpoint
const count = engine.database._getWALEntryCount(); // Get WAL entry count
const entries = engine.database._getWALEntries();  // Get WAL entries

File Structure

your-app/
├── data/
│   ├── app.db          # Main database file (JSON snapshot)
│   ├── app.db.backup   # Previous checkpoint backup
│   ├── app.wal         # Write-Ahead Log file (newline-delimited JSON)
│   └── app.wal.backup  # WAL at time of previous checkpoint
└── backups/
    └── backup.json     # Manual backup files

File Formats

.db file - Complete state snapshot:

{
  "state": {
    "users": {
      "doc_abc123": { "name": "Alice", "email": "[email protected]" },
      "doc_xyz789": { "name": "Bob", "email": "[email protected]" }
    }
  },
  "ledger": { /* ... metadata for conflict resolution ... */ }
}

.wal file - Change entries (one per line):

{"method":"set","path":["users","doc_abc123"],"to":{"name":"Alice","email":"[email protected]"}}
{"method":"set","path":["users","doc_xyz789"],"to":{"name":"Bob","email":"[email protected]"}}
{"method":"remove","path":["users","doc_old456"]}

Performance

The database extension is optimized for:

Fast writes: WAL append operations are extremely fast (synchronous file appends)
Low memory overhead: Only the active state tree is kept in memory
Configurable checkpointing: Tune maxChanges based on your workload
Reliability: WAL ensures data integrity during crashes

Performance Characteristics

Write operations: O(1) append to WAL + State tree mutation
Read operations: O(1) to O(log n) depending on State tree depth
Checkpoint operations: O(n) where n is the size of the state tree
Startup time: O(m) where m is the number of WAL entries to replay

Tuning

For write-heavy workloads with large state trees:

Database.configure({
  checkpointThreshold: 500,   // Defer to idle less often
  forceThreshold: 5000,       // Allow WAL to grow larger before forcing
  checkpointInterval: 10000   // Longer intervals between checkpoints
})

For applications with frequent restarts (smaller WAL replay):

Database.configure({
  checkpointThreshold: 50,    // Checkpoint more frequently
  forceThreshold: 500,        // Force sooner to keep WAL small
  checkpointInterval: 2000    // More frequent time-based checkpoints
})

For real-time applications that need consistent responsiveness:

Database.configure({
  checkpointThreshold: 100,   // Moderate soft threshold
  forceThreshold: 1000,       // Reasonable hard limit
  checkpointInterval: 5000    // Balance between WAL size and frequency
})
// Use collection.batch() around bulk operations

Understanding the thresholds:

checkpointThreshold: Triggers deferred checkpoint (no performance impact)
forceThreshold: Emergency brake to prevent unbounded WAL growth
checkpointInterval: Time-based safety net

Rule of thumb: Set forceThreshold to 5-10x checkpointThreshold

Important Concepts

Path-Based Architecture

Everything in the database uses path arrays to navigate the state tree:

// Path: ['users', 'doc_123', 'profile', 'avatar']
users.get(['doc_123', 'profile', 'avatar']);

// Behind the scenes, Collections prefix their name to paths:
const users = engine.database.collection('users');
users.get(['doc_123']); // Actually accesses state.get(['users', 'doc_123'])

Change Entries

The WAL stores structured change entries:

{
  method: 'set',           // or 'remove', 'update', 'apply'
  path: ['users', 'id'],   // Path array
  to: { name: 'Alice' },   // New value (for set/update)
  from: { name: 'Bob' }    // Previous value (for history)
}

Write Strategy

The database uses a hybrid write strategy:

WAL writes (synchronous):

✅ Changes written to WAL synchronously via fs.appendFileSync
✅ Guarantees durability - changes are on disk before function returns
✅ Fast append operations (~microseconds)
✅ No data loss on crashes

Checkpoint writes (async, deferred):

✅ Full state exports happen asynchronously via fsPromises
✅ Deferred to idle time using setImmediate
✅ Non-blocking - doesn't interrupt active requests
✅ Transaction-aware - respects batch operations
⚡ Best of both worlds: durability + responsiveness

Memory Model

The entire state tree is kept in memory:

Reads: Instant, no I/O
Writes: Fast WAL append
Memory: Scales with data size, not with number of operations
Checkpoint: Writes entire state, but happens infrequently

Crash Recovery Guarantees

What survives a crash:

✅ All changes written to WAL (synchronous writes)
✅ Last checkpoint state in .db file
❌ Changes in memory not yet written to WAL (shouldn't happen)

Recovery process:

Load last checkpoint from .db
Replay all WAL entries in order
State fully recovered

Idle Checkpointing Benefits

The dual-threshold idle checkpoint strategy prevents periodic performance degradation:

Problem: With synchronous checkpoints every N changes:

Request 1-99:   ⚡ Fast (WAL append only)
Request 100:    🐌 Slow (triggers full checkpoint, blocks event loop)
Request 101-199: ⚡ Fast
Request 200:    🐌 Slow (another blocking checkpoint)

Solution: With dual-threshold idle-deferred async checkpoints:

Request 1-99:    ⚡ Fast (WAL append only)
Request 100:     ⚡ Fast (soft threshold - schedules idle checkpoint)
  [Idle tick]:   ⏱️  Async checkpoint (non-blocking)
Request 101-199:  ⚡ Fast (consistent performance)
Request 200:     ⚡ Fast (schedules checkpoint)
  [Idle tick]:   ⏱️  Async checkpoint
...
Request 1000:    ⚠️  Force checkpoint (hard threshold, synchronous)
                 Prevents unbounded WAL growth

During batch operations:

users.batch(state => {
  // 500 operations
});
  ↓ Soft threshold reached multiple times
  ↓ All checkpoint triggers deferred
  ↓ Batch completes
  ↓ Single checkpoint scheduled for idle time

With backup rotation:

[Idle checkpoint triggered]
  ↓ Copy app.db → app.db.backup
  ↓ Copy app.wal → app.wal.backup
  ↓ Write new app.db
  ↓ Clear app.wal
  ✓ Corruption-resistant

Result: Predictable, consistent performance with no periodic lag spikes + automatic backup protection.

TypeScript Support

TypeScript definitions are included:

import { Engine } from '@jucie-engine/core';
import { Database } from '@brickworks/database-ext';

interface User {
  name: string;
  email: string;
  created: Date;
}

const engine = await Engine.create().install(Database.configure({
  dbPath: './data',
  dbName: 'app.db'
}));

const users = engine.database.collection('users');
const userId = users.add({ 
  name: 'John', 
  email: '[email protected]',
  created: new Date() 
} as User);

License

ISC

Contributing

Issues and pull requests are welcome. Please ensure tests pass before submitting.

npm test                    # Run all tests
npm run test:performance    # Run performance benchmarks