Sharded - Make any database fast
Sharded is a SQLite-based write buffer and caching system for Prisma that provides high-performance data operations with automatic synchronization to your main database.
🚀 Performance Benefits
- Dramatic Speed Improvement: Reduce write & read times from >100ms to <10ms
- 10x faster reads from SQLite cache vs. network database calls
- Reduced database load through write buffering
- Improved user experience with instant data access
- Scalable architecture supporting multiple worker nodes
🚀 Features
- Write Buffering: Buffer write operations in fast SQLite databases before syncing to your main database
- Intelligent Caching: Cache frequently accessed data for lightning-fast reads
- Automatic Sync: Background synchronization using Redis queues (BullMQ)
- Prisma Integration: Seamless integration with existing Prisma workflows
- Multi-Node Support: Master/worker architecture for distributed systems
- Schema Generation: CLI tools to generate optimized schemas for your blocks
- WAL Mode: Automatic WAL optimizations prevent "phantom data" issues
🎯 When to Use Sharded
✅ Perfect Use Cases
- Real-time applications where sub-10ms response times are critical
- High-frequency read/write operations on specific data subsets
- Chat/messaging systems with frequent message operations
- Gaming applications requiring ultra-fast state updates
- Live collaboration tools with real-time document editing
- Analytics dashboards with frequently accessed metrics
❌ When NOT to Use Sharded
- Infrequent database operations (< 10 operations per minute)
- Simple CRUD applications without performance bottlenecks
- Applications with simple, linear data access patterns
- One-time data processing or batch operations
- Systems where network latency isn't a concern
- Applications with mostly write-once, read-rarely data
- Systems where your database is already fast enough for your use case
📦 Installation
yarn add sharded
# or
npm install sharded
Prerequisites
- Node.js 16+
- Prisma 6.7.0+
- Redis (for queue management)
- SQLite support
Docker Deployment
When deploying with Docker, persist your blocks across redeployments by mounting the blocks data directory:
# Dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN yarn install
COPY . .
RUN yarn build
# Create blocks directory
RUN mkdir -p /app/prisma/blocks/data
EXPOSE 3000
CMD ["yarn", "start"]# docker-compose.yml
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    volumes:
      # Persist blocks across container restarts/redeployments
      - blocks_data:/app/prisma/blocks/data
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/mydb
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: mydb
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - postgres_data:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
volumes:
  blocks_data: # Persists your Sharded blocks
  postgres_data: # Persists your main database
  redis_data: # Persists Redis queue data
Important: Without persisting /app/prisma/blocks/data, your blocks will be recreated on every deployment, losing cached data and requiring full reloads.
🛠️ Quick Start
1. Generate Block Schema
First, generate a subset schema for the models you want to cache/buffer:
# Generate schema for specific models
npx sharded generate --schema=./prisma/schema.prisma --models=User,Order
# Or generate for all models
npx sharded generate --schema=./prisma/schema.prisma --all-models
This creates:
- prisma/blocks/block.prisma - Optimized schema for SQLite
- prisma/blocks/template.sqlite - Template database
- prisma/blocks/generated/ - Generated Prisma client
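For orientation, the generated block.prisma is a SQLite-flavored subset of your main schema. A minimal illustrative sketch follows; the models, fields, and paths here are placeholders, and the actual output depends on your source schema:
datasource db {
  provider = "sqlite"
  url      = "file:./template.sqlite"
}
generator client {
  provider = "prisma-client-js"
  output   = "./generated"
}
model User {
  id    String  @id
  email String  @unique
  name  String?
}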
2. Create a Block
import { Block } from "sharded";
import { PrismaClient } from "@prisma/client";
const mainClient = new PrismaClient();
// Define how to load initial data into the block
const loader = async (blockClient: PrismaClient, mainClient: PrismaClient) => {
// Load users from main database
const users = await mainClient.user.findMany();
for (const user of users) {
await blockClient.user.create({ data: user });
}
// Load orders from main database
const orders = await mainClient.order.findMany();
for (const order of orders) {
await blockClient.order.create({ data: order });
}
};
// Create block client
const blockClient = await Block.create({
blockId: "user-orders-cache",
client: mainClient,
loader,
ttl: 3600, // Cache TTL in seconds (optional)
connection: {
host: "localhost",
port: 6379,
// password: 'your-redis-password'
},
});
3. Use the Block Client
The block client works exactly like a regular Prisma client:
// Create operations are buffered and synced asynchronously
const user = await blockClient.user.create({
data: {
email: "[email protected]",
name: "John Doe",
},
});
// Read operations use cached data when available
const users = await blockClient.user.findMany({
include: {
orders: true,
},
});
// Updates are buffered and synced
await blockClient.user.update({
where: { id: user.id },
data: { name: "Jane Doe" },
});
// Deletes are buffered and synced
await blockClient.user.delete({
where: { id: user.id },
});
🏗️ Architecture
Block clients buffer operations locally and queue them in Redis. The Block.watch() method handles all background synchronization by automatically creating sync workers for blocks with pending operations.
// Every block client works the same way
const blockClient = await Block.create({
blockId: "my-cache",
client: mainClient,
loader,
ttl: 3600, // Optional: 1 hour cache
});
// Background sync and cleanup handled by Block.watch()
await Block.watch({
ttl: 3600,
intervals: {
invalidation: 10000, // Check TTL every 10 seconds
syncCheck: 2000, // Check sync workers every 2 seconds
cleanup: 3600000, // Clean Redis every hour (prevents performance degradation)
},
mainClient: prismaClient, // Used for sync workers
connection: {
host: "localhost",
port: 6379,
},
});
Data Flow
- Writes: Buffered in SQLite → Queued in Redis → Synced to main DB
- Reads: Check SQLite cache → Fallback to main DB → Cache result
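The sketch below illustrates this flow conceptually. All names and types are hypothetical; it shows the pattern, not Sharded's actual internals:
// Conceptual sketch of the write and read paths (hypothetical names/types).
interface Store {
  apply(op: object): Promise<void>;
  find(query: object): Promise<object | null>;
  store(row: object): Promise<void>;
}
interface Queue {
  enqueue(op: object): Promise<void>;
}

// Write path: apply locally first, then queue for background sync.
async function bufferedWrite(sqlite: Store, queue: Queue, op: object) {
  await sqlite.apply(op);  // 1. Fast local write to the SQLite block
  await queue.enqueue(op); // 2. Queue in Redis; a sync worker later
                           //    replays the operation against the main DB
}

// Read path: cache-first, with fallback to the main database.
async function cachedRead(sqlite: Store, mainDb: Store, query: object) {
  const hit = await sqlite.find(query); // 1. Check the SQLite cache
  if (hit) return hit;
  const row = await mainDb.find(query); // 2. Fall back to the main DB
  if (row) await sqlite.store(row);     // 3. Cache the result
  return row;
}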
💡 Use Cases
- High-traffic applications requiring fast read access
- Microservices needing local data caching
- Real-time applications with frequent database operations
- Analytics workloads requiring fast aggregations
- Multi-tenant applications with isolated data blocks
🎯 Block Scoping Strategy
Important: Sharded is designed for subsections of your application that need fast read/writes, not entire databases. Loading your entire database into a block would be inefficient and defeat the purpose.
✅ Good Block Scoping Examples
1. Per-User Blocks (User Dashboard)
// Block for a specific user's data
const userBlock = await Block.create({
blockId: `user-${userId}`,
client: mainClient,
loader: async (blockClient, mainClient) => {
// Load only this user's data
const user = await mainClient.user.findUnique({
where: { id: userId },
include: {
profile: true,
settings: true,
recentActivity: { take: 50 },
},
});
if (user) {
await blockClient.user.create({ data: user });
}
},
});
2. Per-Chat Blocks (Messaging App)
// Block for a specific chat room
const chatBlock = await Block.create({
blockId: `chat-${chatId}`,
client: mainClient,
loader: async (blockClient, mainClient) => {
// Load chat and recent messages
const chat = await mainClient.chat.findUnique({
where: { id: chatId },
include: {
messages: {
take: 100, // Last 100 messages
orderBy: { createdAt: "desc" },
},
participants: true,
},
});
if (chat) {
await blockClient.chat.create({ data: chat });
}
},
});
// Fast message operations
await chatBlock.message.create({
data: {
content: "Hello!",
chatId: chatId,
userId: senderId,
},
});
3. Per-Session Blocks (E-commerce Cart)
// Block for user's shopping session
const sessionBlock = await Block.create({
blockId: `session-${sessionId}`,
client: mainClient,
loader: async (blockClient, mainClient) => {
// Load cart, wishlist, and recently viewed
const session = await mainClient.session.findUnique({
where: { id: sessionId },
include: {
cart: { include: { items: true } },
wishlist: { include: { items: true } },
recentlyViewed: { take: 20 },
},
});
if (session) {
await blockClient.session.create({ data: session });
}
},
});
4. Per-Game Blocks (Gaming Application)
// Block for active game state
const gameBlock = await Block.create({
blockId: `game-${gameId}`,
client: mainClient,
loader: async (blockClient, mainClient) => {
// Load game state and player data
const game = await mainClient.game.findUnique({
where: { id: gameId },
include: {
players: true,
gameState: true,
moves: { take: 50 }, // Recent moves
},
});
if (game) {
await blockClient.game.create({ data: game });
}
},
});
// Ultra-fast game moves
await gameBlock.move.create({
data: {
gameId,
playerId,
action: "attack",
coordinates: { x: 10, y: 15 },
},
});
5. Per-Workspace Blocks (Collaboration Tools)
// Block for team workspace
const workspaceBlock = await Block.create({
blockId: `workspace-${workspaceId}`,
client: mainClient,
loader: async (blockClient, mainClient) => {
// Load workspace with active documents and team members
const workspace = await mainClient.workspace.findUnique({
where: { id: workspaceId },
include: {
documents: {
where: { status: "active" },
take: 50,
},
members: true,
recentActivity: { take: 100 },
},
});
if (workspace) {
await blockClient.workspace.create({ data: workspace });
}
},
});
❌ Avoid These Patterns
// ❌ DON'T: Load entire database
const badBlock = await Block.create({
blockId: "entire-app",
loader: async (blockClient, mainClient) => {
// This will be slow and memory-intensive
const allUsers = await mainClient.user.findMany(); // Could be millions
const allOrders = await mainClient.order.findMany(); // Could be millions
// ... loading everything
},
});
// ❌ DON'T: Overly broad scoping
const tooBroadBlock = await Block.create({
blockId: "all-users-data",
loader: async (blockClient, mainClient) => {
// Loading all users when you only need one
const users = await mainClient.user.findMany({
include: { orders: true, profile: true },
});
},
});
🎯 Scoping Best Practices
- Scope by User/Session: Create blocks per user session or user context
- Scope by Feature: Create blocks for specific features (chat, cart, game)
- Limit Data Size: Only load what you need (recent messages, active items)
- Use TTL Wisely: Set cache expiration to match how quickly your data goes stale (see the sketch after this list)
- Monitor Block Size: Keep blocks under 100MB for optimal performance
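For example, TTLs can be tuned per block via the documented ttl option on Block.create. The values below are arbitrary illustrations, and chatLoader/settingsLoader stand in for loader functions like the ones shown earlier:
// Short TTL for fast-changing data.
const chatBlock = await Block.create({
  blockId: `chat-${chatId}`,
  client: mainClient,
  loader: chatLoader,
  ttl: 300, // 5 minutes: messages go stale quickly
});

// Long TTL for slow-changing data.
const settingsBlock = await Block.create({
  blockId: `settings-${userId}`,
  client: mainClient,
  loader: settingsLoader,
  ttl: 86400, // 24 hours: settings rarely change
});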
🔄 Block Lifecycle Management
// Create block when user starts session
const userBlock = await Block.create({
blockId: `user-${userId}-${sessionId}`,
// ... config
});
// Use throughout session for fast operations
await userBlock.user.update({ ... });
await userBlock.activity.create({ ... });
// Clean up when session ends
await Block.delete_block(`user-${userId}-${sessionId}`);
📚 API Reference
Block.create(config)
Creates a new block instance.
interface BlockConfig<T> {
blockId: string; // Unique identifier for the block
client: T; // Main Prisma client
loader: (blockClient: T, mainClient: T) => Promise<void>; // Data loader function
debug?: boolean; // Enable debug logging
prismaOptions?: Prisma.PrismaClientOptions; // Additional Prisma options
connection?: { // Redis connection options
host: string;
port: number;
password?: string;
};
ttl?: number; // Cache TTL in seconds (used by watch() for invalidation)
}
Block.invalidate(blockId)
Manually invalidate a block cache:
await Block.invalidate("user-orders-cache");
Block.delete_block(blockId)
Delete a block and its associated files:
await Block.delete_block("user-orders-cache");
Block.watch(options)
Start watching for cache invalidation, sync worker management, and automatic Redis cleanup.
Interface:
interface WatchOptions<T> {
ttl?: number; // Default TTL in seconds for all blocks
intervals?: {
invalidation?: number; // Check TTL invalidation (ms, default: 10000)
syncCheck?: number; // Check sync workers (ms, default: 2000)
cleanup?: number; // Redis cleanup (ms, default: 3600000, set 0 to disable)
};
mainClient: T; // Required: Main Prisma client for sync worker creation
connection?: { // Redis connection options
host: string;
port: number;
password?: string;
};
}
Example:
await Block.watch({
ttl: 3600, // Cache TTL in seconds
intervals: {
invalidation: 10000, // Check TTL invalidation every 10 seconds
syncCheck: 2000, // Check sync workers every 2 seconds
cleanup: 3600000, // Clean Redis every 1 hour (default, optional)
},
mainClient: prismaClient, // Required for sync worker creation
connection: {
host: "localhost",
port: 6379,
},
});
Automatic Redis Cleanup: Block.watch() includes automatic cleanup to prevent performance degradation from accumulated stale data (failed jobs, orphaned keys, etc.). It runs every hour by default and cleans up:
- Old failed jobs (older than 1 hour)
- Stale operation keys for deleted blocks
- Orphaned Redis metadata (last_seen, block_ttl)
- Dead letter queues for non-existent blocks
Customize cleanup interval based on your workload:
// High-throughput apps: every 30 minutes
intervals: { cleanup: 1800000 }
// Normal workload: every 1 hour (default)
intervals: { cleanup: 3600000 }
// Disable automatic cleanup
intervals: { cleanup: 0 }
Important: The mainClient parameter is crucial for sync worker creation. When Block.watch() detects blocks with pending operations, it uses this client to create sync workers that process queued operations and sync them to the main database.
Block.cleanup()
Manually trigger Redis cleanup (also runs automatically via Block.watch()):
// Run immediate cleanup
const result = await Block.cleanup();
console.log('Cleaned:', result);
// { staleOperationKeys: 5, oldFailedJobs: 23, orphanedKeys: 8 }
Useful for:
- Serverless environments (scheduled via cron; see the sketch after this list)
- Immediate cleanup when Redis is slow
- Custom cleanup schedules outside of Block.watch()
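In a serverless setup, for instance, a scheduled function can trigger cleanup directly. A minimal sketch using the documented Block.cleanup(); the handler name and trigger mechanism depend on your platform:
import { Block } from "sharded";

// Hypothetical handler invoked by a cron/scheduler trigger.
export async function scheduledCleanup() {
  const result = await Block.cleanup();
  console.log("Redis cleanup finished:", result);
  // e.g. { staleOperationKeys: 5, oldFailedJobs: 23, orphanedKeys: 8 }
}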
📖 For a detailed Redis maintenance guide, see REDIS-MAINTENANCE.md
⚠️ Known Limitations
- Cache Consistency: If records are modified directly in the main database, the block cache won't be aware until invalidation
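One mitigation is to invalidate the affected block after any direct write to the main database, using the documented Block.invalidate(). The block ID below is illustrative:
// Direct write to the main database, bypassing the block...
await mainClient.user.update({
  where: { id: userId },
  data: { name: "Updated elsewhere" },
});

// ...then invalidate the block so the stale cached copy is discarded.
await Block.invalidate(`user-${userId}`);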
📋 TODO
- Multi-Machine Sync: Currently, blocks with the same ID across different machines don't sync with each other. Multi-process setups on the same machine work fine since they share the same SQLite file location, but cross-machine block synchronization still needs to be implemented.
🤝 Contributing
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
🧪 Testing
# Run tests
yarn test
# Run in development mode
yarn dev
📁 Project Structure
sharded/
├── cli/ # Command-line interface
│ ├── generate.ts # Schema generation
│ └── index.ts # CLI entry point
├── runtime/ # Core runtime
│ └── Block.ts # Main Block class
├── tests/ # Test files
├── prisma/ # Example schema
└── dist/ # Compiled output
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by the Sharded team
