@earth-app/collegedb

v1.2.3

Published

3 days ago

Downloads

1,131

0High
0Medium
0Low

gmitch215

CollegeDB

Universal Database Horizontal Sharding Router

NPM Version

A TypeScript library for true horizontal scaling of SQLite-style databases primarily for Cloudflare using D1 and KV, with additional provider adapters for Redis/Valkey KV and PostgreSQL/MySQL/SQLite SQL backends, plus Drizzle ORM interop across those SQL providers. CollegeDB distributes your data across multiple database shards, with each table's records split by primary key across different database instances.

Why CollegeDB

CollegeDB implements data distribution where a single logical table is physically stored across multiple D1 databases:

env.db-east (Shard 1)
┌────────────────────────────────────────────┐
│ table users: [user-1, user-3, user-5, ...] │
│ table posts: [post-2, post-7, post-9, ...] │
└────────────────────────────────────────────┘

env.db-west (Shard 2)
┌────────────────────────────────────────────┐
│ table users: [user-2, user-4, user-6, ...] │
│ table posts: [post-1, post-3, post-8, ...] │
└────────────────────────────────────────────┘

env.db-central (Shard 3)
┌────────────────────────────────────────────┐
│ table users: [user-7, user-8, user-9, ...] │
│ table posts: [post-4, post-5, post-6, ...] │
└────────────────────────────────────────────┘

This allows you to:

Break through D1's single database limits by spreading data across many databases
Improve query performance by reducing data per database instance
Scale geographically by placing shards in different regions
Increase write throughput by parallelizing across multiple database instances

Features

Automatic query routing (primary key to shard mapping)
Provider adapters for Redis/Valkey/NuxtHub KV plus PostgreSQL/MySQL/SQLite SQL
Drizzle interop through existing SQL providers (createPostgreSQLProvider, createMySQLProvider, createSQLiteProvider)
Auto-allocated generated-id inserts via insert() and direct-shard inserts via insertShard() for AUTOINCREMENT / RETURNING workflows
Hyperdrive helpers for PostgreSQL and MySQL
Multiple allocation strategies: round-robin, random, hash, location-aware, and mixed read/write strategies
Durable Object shard coordination and shard statistics
Migration helpers for integrating existing datasets and rebalancing mappings

Getting Started

Installation

bun add @earth-app/collegedb
# or
npm install @earth-app/collegedb

NuxtHub + Drizzle with CollegeDB Routing

Keep NuxtHub + Drizzle for schema/migrations and add CollegeDB as your routing layer.

import { db as hubDb } from '@nuxthub/db';
import { kv } from '@nuxthub/kv';
import { sql } from 'drizzle-orm';
import { drizzle } from 'drizzle-orm/d1';
import { createNuxtHubKVProvider, createSQLiteProvider, first, initialize, run } from '@earth-app/collegedb';

let initialized = false;

function ensureCollegeDB(env: { DB_SECONDARY: D1Database }) {
	if (initialized) return;

	initialize({
		kv: createNuxtHubKVProvider(kv),
		shards: {
			'db-primary': createSQLiteProvider(hubDb, sql),
			'db-secondary': createSQLiteProvider(drizzle(env.DB_SECONDARY), sql)
		},
		strategy: 'hash'
	});

	initialized = true;
}

export default defineEventHandler(async (event) => {
	const env = event.context.cloudflare.env;
	ensureCollegeDB(env);

	await run('post:123', 'INSERT OR REPLACE INTO blog_posts (id, title) VALUES (?, ?)', ['post:123', 'Hello from CollegeDB']);

	const post = await first<{ id: string; title: string }>('post:123', 'SELECT id, title FROM blog_posts WHERE id = ?', ['post:123']);

	return { post };
});

Drop-in Pattern for Existing `hub:db` + `hub:kv` Code

// before
import { eq } from 'drizzle-orm';
import { db } from 'hub:db';
import { kv } from 'hub:kv';
import { blogPosts } from '~/server/db/schema';

const cached = await kv.get('nuxtpress:post:slug');
if (cached) return cached;

const rows = await db.select().from(blogPosts).where(eq(blogPosts.slug, slug)).limit(1);
await kv.set('nuxtpress:post:slug', rows[0], { ttl: 3600 });

// after (CollegeDB routing + same NuxtHub KV cache)
import { kv } from '@nuxthub/kv';
import { sql } from 'drizzle-orm';
import { db } from 'hub:db';
import { createNuxtHubKVProvider, createSQLiteProvider, first, initialize } from '@earth-app/collegedb';

let initialized = false;

function setup() {
	if (initialized) return;
	initialize({
		kv: createNuxtHubKVProvider(kv),
		shards: {
			'db-primary': createSQLiteProvider(db, sql)
		},
		strategy: 'hash'
	});
	initialized = true;
}

setup();

const cacheKey = `nuxtpress:post:${slug}`;
const cached = await kv.get(cacheKey);
if (cached) return cached;

const row = await first<{ id: string; slug: string; title: string }>(
	cacheKey,
	'SELECT id, slug, title FROM blog_posts WHERE slug = ? LIMIT 1',
	[slug]
);

await kv.set(cacheKey, row, { ttl: 3600 });

Benchmark Suite

CollegeDB includes a benchmark runner that executes each SQL+KV combination across adapter profiles, then generates a report with profile-specific matrices.

Adapter Profiles

| Profile | Purpose | | ---------- | ----------------------------------------------------------------------- | | native | Direct provider clients (Cloudflare bindings or driver-native adapters) | | drizzle | Drizzle interop through SQL provider adapters | | hyperdrive | Hyperdrive connection-string wrappers for PostgreSQL/MySQL | | nuxthub | NuxtHub-style KV adapter with SQL provider interop |

Scenario Catalog

| Scenario Key | Scenario | What Happens | Workload Per Run | | ----------------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- | | basic_crud | Basic CRUD round-trip | Insert, read, update, and delete a user via routed queries. | 20 iterations; 4 routed SQL ops per iteration | | advanced_usage | Advanced lookup workflow | Writes user+post, adds lookup aliases, then validates join and alias-based lookup. | 15 iterations; ~5 routed SQL ops + KV lookup-key updates per iteration | | migration_mapping | Migration-style mapping creation | Inserts legacy records on a fixed shard, then builds shard mappings in batch and validates routing. | 10 iterations; 20 legacy records mapped per iteration | | bulk_crud | Bulk CRUD pressure | Performs bulk inserts, half updates, and full delete sweep, then validates shard-wide totals. | 7 iterations; 160 inserts + 80 updates + 160 deletes per iteration | | auto_increment | Auto-generated primary keys | Inserts rows with generated ids on an automatically selected shard, captures the generated key, then validates routed readback. | 6 iterations; insert + generated-id readback per iteration | | indexing | Indexed query scan | Creates an index on posts(user_id) and repeatedly queries the indexed path. | 15 iterations after warmup dataset build | | metadata_fetch | Metadata inspection | Reads table metadata/introspection rows from one shard. | 14 iterations; 1 metadata query per iteration | | pragma_or_info | PRAGMA / server info | Runs provider-specific PRAGMA/info query to sample low-level metadata latency. | 14 iterations; 1 pragma/info query per iteration | | counting | Cross-shard counting | Counts users across all shards to measure fanout aggregation overhead. | 14 iterations; all-shard count aggregation per iteration | | shard_fanout | Shard fanout query | Runs query fanout to all shards and aggregates shard-level responses. | 14 iterations; 1 all-shards query per iteration | | reassignment | Shard reassignment flow | Creates a record, reassigns it to another shard, and verifies routed reads still succeed. | 10 iterations; insert + reassignment + verification per iteration |

Report Matrices

Each generated report includes:

Matrix: SQL x KV (Overall)
Matrix: Adapter Profiles (Overall Avg)
Matrix: Core Scenario Latency (avg/p95)
Matrix: Introspection and Routing Latency (avg/p95)
Cloudflare Worker (wrangler dev --local)
Matrix: Cloudflare Adapter Profiles (Overall Avg)

Common Commands

bun run test:sandbox
bun run test:sandbox:drizzle
bun run test:sandbox:nuxthub
bun run test:sandbox:hyperdrive

For Docker-based benchmark details and filtering options, see Sandbox Benchmarks (Docker Compose).

Provider Adapters

CollegeDB can run with either native Cloudflare bindings or custom providers as long as they match the exported KVStorage and SQLDatabase interfaces.

Drizzle interop is enabled by passing a Drizzle sql tag as the optional second argument to createPostgreSQLProvider, createMySQLProvider, or createSQLiteProvider.

Supported adapters:

createRedisKVProvider
createValkeyKVProvider
createNuxtHubKVProvider
createPostgreSQLProvider
createMySQLProvider
createSQLiteProvider
createDrizzleSQLProvider (compatibility helper)
createHyperdrivePostgresProvider
createHyperdriveMySQLProvider

import { createClient as createRedisClient } from 'redis';
import { Pool } from 'pg';
import { createPostgreSQLProvider, createRedisKVProvider, initialize, run, type CollegeDBConfig } from '@earth-app/collegedb';

const redisClient = createRedisClient({ url: process.env.REDIS_URL });
const pgPool = new Pool({ connectionString: process.env.POSTGRES_URL });

const config: CollegeDBConfig = {
	kv: createRedisKVProvider(redisClient),
	shards: {
		'pg-east': createPostgreSQLProvider(pgPool)
	},
	strategy: 'hash',
	disableAutoMigration: true
};

async function bootstrap() {
	await redisClient.connect();
	initialize(config);
	await run('user-1', 'INSERT INTO users (id, name) VALUES (?, ?)', ['user-1', 'Taylor']);
}

bootstrap().catch(console.error);

For Hyperdrive-backed SQL connections, use createHyperdrivePostgresProvider or createHyperdriveMySQLProvider with your database client factory.

NuxtHub Runtime Example (D1 + KV)

import { db } from '@nuxthub/db';
import { kv } from '@nuxthub/kv';
import { sql } from 'drizzle-orm';
import { createNuxtHubKVProvider, createSQLiteProvider, initialize, run, first } from '@earth-app/collegedb';

initialize({
	kv: createNuxtHubKVProvider(kv),
	shards: {
		'db-primary': createSQLiteProvider(db, sql)
	},
	strategy: 'hash'
});

await run('draft:home', 'INSERT OR REPLACE INTO drafts (id, content) VALUES (?, ?)', ['draft:home', '# Home']);

const draft = await first<{ id: string; content: string }>('draft:home', 'SELECT id, content FROM drafts WHERE id = ?', ['draft:home']);

Keep Drizzle Schema + Migrations, Route Queries with CollegeDB

CollegeDB does not replace your Drizzle schema or NuxtHub migration workflow.

npx nuxt db generate
npx nuxt db migrate

Use those migrations as-is, then route runtime reads/writes through CollegeDB adapters.

For a complete non-Cloudflare setup, see examples/provider-sandbox.ts.

NuxtHub + Drizzle Recipes

Multi-Vendor Shards (Cloudflare + Postgres/MySQL)

NuxtHub supports multiple deployment/database vendors. CollegeDB can shard across any SQL backends that Drizzle can connect to.

import { sql } from 'drizzle-orm';
import { drizzle as drizzlePg } from 'drizzle-orm/postgres-js';
import { drizzle as drizzleMySQL } from 'drizzle-orm/mysql2';
import { drizzle as drizzleD1 } from 'drizzle-orm/d1';
import { kv } from '@nuxthub/kv';
import postgres from 'postgres';
import mysql from 'mysql2/promise';
import {
	createMySQLProvider,
	createNuxtHubKVProvider,
	createPostgreSQLProvider,
	createSQLiteProvider,
	initialize,
	run
} from '@earth-app/collegedb';

const pgClient = postgres(process.env.POSTGRES_URL!);
const mysqlPool = mysql.createPool(process.env.MYSQL_URL!);

function setup(env: { DB_CF: D1Database }) {
	initialize({
		kv: createNuxtHubKVProvider(kv),
		shards: {
			'db-cf': createSQLiteProvider(drizzleD1(env.DB_CF), sql),
			'db-pg': createPostgreSQLProvider(drizzlePg(pgClient), sql),
			'db-mysql': createMySQLProvider(drizzleMySQL(mysqlPool), sql)
		},
		strategy: 'hash'
	});
}

export default defineEventHandler(async (event) => {
	setup(event.context.cloudflare.env);

	await run('tenant:acme:user:1', 'INSERT INTO users (id, name) VALUES (?, ?)', ['tenant:acme:user:1', 'Ada']);
});

Keep NuxtHub Cache/KV Semantics Intact

Use NuxtHub KV for app cache while CollegeDB uses its own key namespace for shard mappings:

import { kv } from '@nuxthub/kv';
import { first } from '@earth-app/collegedb';

const cacheKey = `nuxtpress:post:${slug}`;
const cached = await kv.get(cacheKey);
if (cached) return cached;

const post = await first(cacheKey, 'SELECT * FROM blog_posts WHERE slug = ? LIMIT 1', [slug]);
await kv.set(cacheKey, post, { ttl: 3600 });

Sandbox Benchmarks (Docker Compose)

CollegeDB ships with an integration sandbox runner that benchmarks real latency across provider combinations.

Requirements:

Docker + Docker Compose plugin
Bun
Wrangler (installed as a dev dependency and invoked by scripts)

The Cloudflare benchmark path runs against the dedicated sandbox worker:

Worker entry: sandbox/worker.ts
Wrangler config: sandbox/wrangler.jsonc

Main commands:

# Run full SQL x KV matrix plus Cloudflare local benchmark
bun run test:sandbox

# Run full SQL x KV matrix only
bun run test:sandbox:all

# Run Cloudflare local benchmark only (wrangler dev --local)
bun run test:sandbox:cloudflare

Provider filters:

# One SQL provider against all KV providers (native profile by default)
bun run test:sandbox:mysql
bun run test:sandbox:postgres
bun run test:sandbox:mariadb
bun run test:sandbox:sqlite

# One KV provider against all SQL providers (native profile by default)
bun run test:sandbox:redis
bun run test:sandbox:valkey

# Run all SQL x KV combinations for one adapter profile
bun run test:sandbox:drizzle
bun run test:sandbox:nuxthub
bun run test:sandbox:hyperdrive

# Explicit pairwise combinations
bun run test:sandbox:postgres+redis
bun run test:sandbox:postgres+valkey
bun run test:sandbox:mysql+redis
bun run test:sandbox:mysql+valkey
bun run test:sandbox:mariadb+redis
bun run test:sandbox:mariadb+valkey
bun run test:sandbox:sqlite+redis
bun run test:sandbox:sqlite+valkey

Output behavior:

Every run writes a timestamped Markdown report to sandbox/results/
sandbox/results/latest.md is always updated to the newest report
The runner prints the report in-terminal using Bun's Markdown renderer with ANSI formatting
test:sandbox includes native, drizzle, hyperdrive, and nuxthub adapter profiles across supported SQL/KV combinations plus Cloudflare profile runs

Benchmark coverage includes:

basic CRUD
advanced lookup/routing workflows
migration-style mapping creation
bulk CRUD
auto-generated primary key inserts and readback
indexing queries
metadata fetch
pragma/info queries (provider-specific)
counting across shards
shard fanout aggregation
shard reassignment workflow

How to read benchmark rows:

Latency cells are formatted as average / p95 in milliseconds.
FAILED means the scenario returned an error.
N/A means the scenario was intentionally skipped in that environment.
Use the detailed section for full avg, p50, p95, min, max, and sample count (n).

In-Memory Providers for Testing & Development

CollegeDB includes lightweight, zero-dependency in-memory mock implementations of the KVStorage and SQLDatabase interfaces. These are ideal for:

Unit testing without external dependencies
Integration testing with multiple shard combinations
Local development and rapid iteration
Sandboxed playtesting of routing logic

The in-memory providers work in Cloudflare Workers, Node.js, and Deno environments.

Quick Start with In-Memory Providers

import { createInMemoryKVProvider, createInMemorySQLProvider, initialize, run, first } from '@earth-app/collegedb';

// Create fresh in-memory providers for each test
const config = {
	kv: createInMemoryKVProvider(),
	shards: {
		'shard-1': createInMemorySQLProvider(),
		'shard-2': createInMemorySQLProvider(),
		'shard-3': createInMemorySQLProvider()
	},
	strategy: 'hash'
};

initialize(config);

// Use as normal - all operations happen in-memory
await run('user-1', 'INSERT INTO users (id, name, email) VALUES (?, ?, ?)', ['user-1', 'Alice', '[email protected]']);

const user = await first<{ id: string; name: string }>('user-1', 'SELECT id, name FROM users WHERE id = ?', ['user-1']);
console.log(user); // { id: 'user-1', name: 'Alice' }

Unit Testing Example

import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { createInMemoryKVProvider, createInMemorySQLProvider, initialize, resetConfig, run, first } from '@earth-app/collegedb';

describe('User Shard Routing', () => {
	beforeEach(() => {
		// Fresh providers for each test
		initialize({
			kv: createInMemoryKVProvider(),
			shards: {
				'shard-1': createInMemorySQLProvider(),
				'shard-2': createInMemorySQLProvider(),
				'shard-3': createInMemorySQLProvider()
			},
			strategy: 'hash'
		});
	});

	afterEach(() => {
		resetConfig();
	});

	it('should insert and retrieve a user', async () => {
		await run('user-1', 'INSERT INTO users (id, name, email) VALUES (?, ?, ?)', ['user-1', 'Alice', '[email protected]']);

		const user = await first<{ name: string }>('user-1', 'SELECT name FROM users WHERE id = ?', ['user-1']);

		expect(user?.name).toBe('Alice');
	});

	it('should distribute users across shards', async () => {
		// Insert multiple users
		for (let i = 0; i < 9; i++) {
			await run(`user-${i}`, 'INSERT INTO users (id, name) VALUES (?, ?)', [`user-${i}`, `User ${i}`]);
		}

		// Verify each can be retrieved
		for (let i = 0; i < 9; i++) {
			const user = await first(`user-${i}`, 'SELECT id FROM users WHERE id = ?', [`user-${i}`]);
			expect(user).toBeDefined();
		}
	});

	it('should handle updates correctly', async () => {
		await run('user-1', 'INSERT INTO users (id, name) VALUES (?, ?)', ['user-1', 'Alice']);

		await run('user-1', 'UPDATE users SET name = ? WHERE id = ?', ['Alice Updated', 'user-1']);

		const user = await first<{ name: string }>('user-1', 'SELECT name FROM users WHERE id = ?', ['user-1']);

		expect(user?.name).toBe('Alice Updated');
	});
});

Integration Testing with Multiple Providers

Test different combinations without Docker or external services:

import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import {
	createInMemoryKVProvider,
	createInMemorySQLProvider,
	initialize,
	resetConfig,
	run,
	first,
	KVShardMapper
} from '@earth-app/collegedb';

describe('Multi-Provider Integration', () => {
	it('should work with different KV/SQL combinations', async () => {
		const combinations = [{ kvName: 'memory', sqlName: 'memory' }];

		for (const combo of combinations) {
			resetConfig();

			initialize({
				kv: createInMemoryKVProvider(),
				shards: {
					'shard-1': createInMemorySQLProvider(),
					'shard-2': createInMemorySQLProvider()
				},
				strategy: 'hash'
			});

			// Test basic operations
			await run('key-1', 'INSERT INTO data (id, value) VALUES (?, ?)', ['key-1', 'test-value']);

			const row = await first('key-1', 'SELECT value FROM data WHERE id = ?', ['key-1']);

			expect(row?.value).toBe('test-value');
		}
	});

	it('should support lookup key mapping', async () => {
		initialize({
			kv: createInMemoryKVProvider(),
			shards: { 'shard-1': createInMemorySQLProvider() },
			strategy: 'hash'
		});

		const mapper = new KVShardMapper(createInMemoryKVProvider());

		// Add lookup keys
		await mapper.addLookupKeys('user-123', ['email:[email protected]', 'username:alice']);

		// Retrieve via lookup key
		const mapping = await mapper.getShardMapping('email:[email protected]');
		expect(mapping?.shard).toBeDefined();
	});
});

Performance Testing with In-Memory Providers

Run quick performance tests locally without external dependencies:

import { createInMemoryKVProvider, createInMemorySQLProvider, initialize, run } from '@earth-app/collegedb';

async function benchmarkInserts(iterations: number): Promise<number> {
	initialize({
		kv: createInMemoryKVProvider(),
		shards: {
			'shard-1': createInMemorySQLProvider(),
			'shard-2': createInMemorySQLProvider(),
			'shard-3': createInMemorySQLProvider()
		},
		strategy: 'hash'
	});

	const startTime = performance.now();

	for (let i = 0; i < iterations; i++) {
		const id = `perf-user-${i}`;
		await run(id, 'INSERT INTO users (id, name) VALUES (?, ?)', [id, `User ${i}`]);
	}

	return performance.now() - startTime;
}

const duration = await benchmarkInserts(1000);
console.log(`1000 inserts: ${duration.toFixed(2)}ms (${((1000 / duration) * 1000).toFixed(0)} ops/sec)`);

Running the Memory Provider Sandbox

CollegeDB includes a ready-made sandbox example demonstrating multiple scenarios:

bun run test:memory

This runs comprehensive benchmarks including:

Basic CRUD operations
Multi-shard data distribution
KV storage operations
Round-robin strategy testing
JOIN query performance

Supported Operations

Both in-memory providers support the complete CollegeDB API:

SQLDatabase features:

CREATE TABLE / DROP TABLE
INSERT / UPDATE / DELETE
SELECT with WHERE clauses
COUNT(*) queries
JOIN queries
PRAGMA queries (basic support)

KVStorage features:

get() / put() / delete()
list() with prefix filtering and cursor-based pagination
TTL/expiration support

Limitations

The in-memory providers are intentionally simple to avoid dependencies:

SQL Parser: Basic pattern matching instead of full SQL parsing; works well for standard CollegeDB patterns but may not handle complex SQL edge cases
Joins: Supported at application level; cross-shard joins work via multiple routed queries
Transactions: Not supported; operations are atomic per-statement
Indexes: Created but not actually used for query optimization
Schema: Inferred from CREATE TABLE statements; dynamic column detection based on binding order

For production use, migrate to appropriate providers (D1, Redis, PostgreSQL, etc.). For testing/development, these limitations are intentional to keep the implementation lightweight and zero-dependency.

Advanced In-Memory Example: `run`, `all`, `insert`, and Aggregates

The in-memory SQL emulator supports a useful subset of SQLite syntax — enough to drive routing tests for real-world ORM-style code. The example below exercises the full routing stack (run, all, first, insert, insertShard, runShard, countAllShards, allAllShardsGlobal) entirely in-process without spinning up a database container.

import {
	allAllShardsGlobal,
	countAllShards,
	createInMemoryKVProvider,
	createInMemorySQLProvider,
	first,
	initialize,
	insert,
	insertShard,
	resetConfig,
	run,
	runShard
} from '@earth-app/collegedb';

resetConfig();

initialize({
	kv: createInMemoryKVProvider(),
	shards: {
		'db-east': createInMemorySQLProvider(),
		'db-west': createInMemorySQLProvider(),
		'db-central': createInMemorySQLProvider()
	},
	strategy: 'hash',
	hashShardMappings: false,
	disableAutoMigration: true
});

// 1. Schema setup: replicate the same DDL on every shard.
for (const shard of ['db-east', 'db-west', 'db-central']) {
	await runShard(
		shard,
		`CREATE TABLE IF NOT EXISTS users (
			id TEXT PRIMARY KEY,
			name TEXT NOT NULL,
			email TEXT UNIQUE,
			created_at INTEGER
		)`
	);
	await runShard(
		shard,
		`CREATE TABLE IF NOT EXISTS tickets (
			id INTEGER PRIMARY KEY AUTOINCREMENT,
			user_id TEXT NOT NULL,
			subject TEXT NOT NULL,
			created_at INTEGER
		)`
	);
}

// 2. Routed inserts: CollegeDB picks the shard from a stable hash of the
//    primary key, then records the mapping in KV.
await run('user-1', 'INSERT INTO users (id, name, email, created_at) VALUES (?, ?, ?, ?)', [
	'user-1',
	'Alice',
	'[email protected]',
	Date.now()
]);

// 3. AUTOINCREMENT inserts: `insert()` allocates the shard and captures the
//    generated id, storing it as a mapping so routed reads find the row.
const ticketA = await insert('INSERT INTO tickets (user_id, subject, created_at) VALUES (?, ?, ?)', [
	'user-1',
	'Cannot log in',
	Date.now()
]);

// 4. Pinned inserts: `insertShard()` writes to a specific shard, still
//    capturing the generated id mapping.
const ticketB = await insertShard('db-west', 'INSERT INTO tickets (user_id, subject, created_at) VALUES (?, ?, ?) RETURNING id', [
	'user-1',
	'Mobile sync is slow',
	Date.now()
]);

// 5. Routed reads: `first` and `all` resolve the shard from the mapping.
const user = await first<{ id: string; name: string; email: string }>('user-1', 'SELECT id, name, email FROM users WHERE id = ?', [
	'user-1'
]);

const ticket = await first<{ id: number; subject: string }>(String(ticketA.generatedId), 'SELECT id, subject FROM tickets WHERE id = ?', [
	ticketA.generatedId
]);

// 6. Cross-shard aggregates: the emulator evaluates COUNT/MAX/MIN/SUM/AVG and
//    COALESCE during SELECT, so utility queries like `SELECT COALESCE(MAX(id),
//    0) + 1 AS next FROM tickets` work the same way they do in production.
const totals = await countAllShards('tickets');

const recentTickets = await allAllShardsGlobal<{ id: number; subject: string; created_at: number }>(
	'SELECT id, subject, created_at FROM tickets WHERE user_id = ?',
	['user-1'],
	{ sortBy: 'created_at', sortDirection: 'desc', limit: 10 }
);

console.log({ user, ticket, ticketB: ticketB.generatedId, totals, recent: recentTickets.results });

What this example demonstrates:

Routing through run, first, all, and insert works end-to-end without a real database. The hash strategy assigns each new primary key to a shard the same way it would in production.
Auto-increment + RETURNING are honored. insert() captures the generated id from either provider metadata or RETURNING id rows.
Aggregate SQL is evaluated (COUNT(*), MIN, MAX, SUM, AVG, COALESCE, simple arithmetic), and compound WHERE with AND / OR / parens / LIKE / IS NULL is honored on UPDATE, DELETE, and SELECT.
Cross-shard fanout helpers (countAllShards, allAllShardsGlobal) operate over the in-memory store exactly as they would over D1/Postgres/MySQL.
No data leaks between tests: call resetConfig() (and instantiate fresh providers) to start each test with a clean state.

Migrating from In-Memory to Production Providers

When ready to migrate from testing to production:

// Before (testing)
import { createInMemoryKVProvider, createInMemorySQLProvider } from '@earth-app/collegedb';

const config = {
	kv: createInMemoryKVProvider(),
	shards: { 'shard-1': createInMemorySQLProvider() }
};

// After (production)
import { createRedisKVProvider, createPostgreSQLProvider } from '@earth-app/collegedb';

const config = {
	kv: createRedisKVProvider(redisClient),
	shards: { 'shard-1': createPostgreSQLProvider(pgPool) }
};

// Rest of configuration stays the same!

The API remains identical - only the provider initialization changes.

Basic Usage

import { collegedb, createSchema, run, first } from '@earth-app/collegedb';

// Initialize with your Cloudflare bindings (existing databases work automatically!)
collegedb(
	{
		kv: env.KV,
		coordinator: env.ShardCoordinator,
		shards: {
			'db-east': env['db-east'], // Can be existing DB with data
			'db-west': env['db-west'] // Can be existing DB with data
		},
		strategy: 'hash'
	},
	async () => {
		// Create schema on new shards only (existing shards auto-detected)
		await createSchema(env['db-new-shard'], 'CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, name TEXT, email TEXT)');

		// Insert data (automatically routed to appropriate shard)
		await run('user-123', 'INSERT INTO users (id, name, email) VALUES (?, ?, ?)', ['user-123', 'Johnson', '[email protected]']);

		// Query data (automatically routed to correct shard, works with existing data!)
		const result = await first<User>('existing-user-456', 'SELECT * FROM users WHERE id = ?', ['existing-user-456']);

		console.log(result); // User data from existing database
	}
);

Geographic Distribution Example

import { collegedb, first, run } from '@earth-app/collegedb';

// Optimize for North American users with geographic sharding
collegedb(
	{
		kv: env.KV,
		strategy: 'location',
		targetRegion: 'wnam', // Western North America
		shardLocations: {
			'db-west': { region: 'wnam', priority: 2 }, // SF - Preferred for target region
			'db-east': { region: 'enam', priority: 1 }, // NYC - Secondary
			'db-europe': { region: 'weur', priority: 0.5 } // London - Fallback
		},
		shards: {
			'db-west': env.DB_WEST,
			'db-east': env.DB_EAST,
			'db-europe': env.DB_EUROPE
		}
	},
	async () => {
		// New users will be allocated to db-west (closest to target region)
		await run('user-west-123', 'INSERT INTO users (id, name, location) VALUES (?, ?, ?)', [
			'user-west-123',
			'West Coast User',
			'California'
		]);

		// Queries are routed to the correct geographic shard
		const user = await first<User>('user-west-123', 'SELECT * FROM users WHERE id = ?', ['user-west-123']);
		console.log(`User found in optimal shard: ${user?.name}`);
	}
);

Mixed Strategy Example

import { collegedb, first, run, type MixedShardingStrategy } from '@earth-app/collegedb';

// Use location strategy for writes (optimal data placement) and hash for reads (optimal performance)
const mixedStrategy: MixedShardingStrategy = {
	write: 'location', // New data goes to geographically optimal shards
	read: 'hash' // Reads use consistent hashing for best performance
};

collegedb(
	{
		kv: env.KV,
		strategy: mixedStrategy,
		targetRegion: 'wnam', // Western North America for writes
		shardLocations: {
			'db-west': { region: 'wnam', priority: 2 },
			'db-east': { region: 'enam', priority: 1 },
			'db-central': { region: 'enam', priority: 1 }
		},
		shards: {
			'db-west': env.DB_WEST,
			'db-east': env.DB_EAST,
			'db-central': env.DB_CENTRAL
		}
	},
	async () => {
		// Write operations use location strategy - new users placed optimally
		await run('user-california-456', 'INSERT INTO users (id, name, location) VALUES (?, ?, ?)', [
			'user-california-456',
			'California User',
			'Los Angeles'
		]);

		// Read operations use hash strategy - consistent and fast routing
		const user = await first<User>('user-california-456', 'SELECT * FROM users WHERE id = ?', ['user-california-456']);

		// Different operations can route to different shards based on strategy
		// This optimizes both data placement (writes) and query performance (reads)
		console.log(`User: ${user?.name}, Location: ${user?.location}`);
	}
);

This approach provides:

Optimal data placement: New records are written to geographically optimal shards using location strategy
Optimal read performance: Queries use hash strategy for consistent, high-performance routing
Flexibility: Each operation type can use the most appropriate routing strategy

Auto-Generated Primary Keys

When your table assigns the primary key during insert, use insert() for the automatic shard-allocation path or insertShard() when you already know the target shard. Both helpers capture the generated id from provider metadata or RETURNING rows, then store the generated-id mapping so the normal routed first() / all() helpers can read the row back.

import { first, insert, insertShard } from '@earth-app/collegedb';

// SQLite / D1
await createSchema(
	env['db-east'],
	`
	CREATE TABLE IF NOT EXISTS auto_users (
		id INTEGER PRIMARY KEY AUTOINCREMENT,
		name TEXT NOT NULL,
		email TEXT UNIQUE,
		created_at INTEGER
	)
`
);

const created = await insert('INSERT INTO auto_users (name, email, created_at) VALUES (?, ?, ?)', ['Ada', '[email protected]', Date.now()]);

const row = await first(String(created.generatedId), 'SELECT * FROM auto_users WHERE id = ?', [created.generatedId]);

// Direct shard insert when you want to pin the write to a specific shard
const directCreated = await insertShard('db-east', 'INSERT INTO auto_users (name, email, created_at) VALUES (?, ?, ?)', [
	'Ada',
	'[email protected]',
	Date.now()
]);

console.log(directCreated.generatedId);

// PostgreSQL / MySQL 8.0.19+ RETURNING path
await createSchema(
	env['db-east'],
	`
	CREATE TABLE IF NOT EXISTS auto_users (
		id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
		name VARCHAR(255) NOT NULL,
		email VARCHAR(255) UNIQUE,
		created_at BIGINT
	)
`
);

const created = await insert('INSERT INTO auto_users (name, email, created_at) VALUES (?, ?, ?) RETURNING id', [
	'Ada',
	'[email protected]',
	Date.now()
]);

const row = await first(String(created.generatedId), 'SELECT * FROM auto_users WHERE id = ?', [created.generatedId]);

If your SQL dialect uses RETURNING, include it in the insert statement. The helper will use the returned row instead of provider metadata when present.

Multi-Key Shard Mappings

CollegeDB supports multiple lookup keys for the same record, allowing you to query by username, email, ID, or any unique identifier. Keys are automatically hashed with SHA-256 for security and privacy.

import { collegedb, first, run, KVShardMapper } from '@earth-app/collegedb';

collegedb(
	{
		kv: env.KV,
		shards: { 'db-east': env.DB_EAST, 'db-west': env.DB_WEST },
		hashShardMappings: true, // Default: enabled for security
		strategy: 'hash'
	},
	async () => {
		// Create a user with multiple lookup keys
		const mapper = new KVShardMapper(env.KV, { hashShardMappings: true });

		await mapper.setShardMapping('user-123', 'db-east', ['username:john_doe', 'email:[email protected]', 'id:123']);

		// Now you can query by ANY of these keys
		const byId = await first('user-123', 'SELECT * FROM users WHERE id = ?', ['user-123']);
		const byUsername = await first('username:john_doe', 'SELECT * FROM users WHERE username = ?', ['john_doe']);
		const byEmail = await first('email:[email protected]', 'SELECT * FROM users WHERE email = ?', ['[email protected]']);

		// All queries route to the same shard (db-east)
		console.log('All queries find the same user:', byId?.name);
	}
);

Adding Lookup Keys to Existing Mappings

const mapper = new KVShardMapper(env.KV);

// User initially created with just ID
await mapper.setShardMapping('user-456', 'db-west');

// Later, add additional lookup methods
await mapper.addLookupKeys('user-456', ['email:[email protected]', 'username:jane']);

// Now works with any key
const user = await first('email:[email protected]', 'SELECT * FROM users WHERE email = ?', ['[email protected]']);

Lookup-Key Read Helpers With Fanout Fallback

When you query by a secondary key and want safe behavior even when a lookup mapping is missing or stale, use the router-level helpers:

import { allByLookupKey, firstByLookupKey } from '@earth-app/collegedb';

// Uses lookup-key mapping when present, then falls back to all-shard fanout if needed
const user = await firstByLookupKey('email:[email protected]', 'SELECT * FROM users WHERE email = ? LIMIT 1', ['[email protected]']);

// Same resolution flow, but returns merged row sets
const matches = await allByLookupKey('username:john_doe', 'SELECT id, username FROM users WHERE username = ?', ['john_doe']);

This avoids accidentally creating a new primary-key mapping for secondary identifiers while still returning results when mappings are unavailable.

Security and Privacy

SHA-256 Hashing (Enabled by Default): Sensitive data like emails are hashed before being stored as KV keys, protecting user privacy:

// With hashShardMappings: true (default)
// KV stores: "shard:a1b2c3d4..." instead of "shard:email:[email protected]"

const config = {
	kv: env.KV,
	shards: {
		/* ... */
	},
	hashShardMappings: true, // Hashes keys with SHA-256
	strategy: 'hash'
};

⚠️ Performance Trade-off: When hashing is enabled, operations like getKeysForShard() cannot return original key names, only hashed versions. For full key recovery, disable hashing:

const config = {
	hashShardMappings: false // Disables hashing - keys stored in plain text
};

Multi-Key Management

const mapper = new KVShardMapper(env.KV);

// Get all lookup keys for a mapping
const allKeys = await mapper.getAllLookupKeys('email:[email protected]');
console.log(allKeys); // ['user-123', 'username:john', 'email:[email protected]']

// Update shard assignment (updates all keys)
await mapper.updateShardMapping('username:john', 'db-central');

// Delete mapping (removes all associated keys)
await mapper.deleteShardMapping('user-123');

Drop-in Replacement for Existing Databases

CollegeDB supports seamless, automatic integration with existing D1 databases that already contain data. Simply add your existing databases as shards in the configuration. CollegeDB will automatically detect existing data and create the necessary shard mappings without requiring any manual migration steps.

Requirements for Drop-in Replacement

Primary Keys: All tables must have a primary key column (typically named id)
Schema Compatibility: Tables should use standard SQLite data types
Access Permissions: CollegeDB needs read/write access to existing databases
KV Namespace: A Cloudflare KV namespace for storing shard mappings

import { collegedb, first, run } from '@earth-app/collegedb';

// Add your existing databases as shards - that's it!
collegedb(
	{
		kv: env.KV,
		shards: {
			'db-users': env.ExistingUserDB, // Your existing database with users
			'db-orders': env.ExistingOrderDB, // Your existing database with orders
			'db-new': env.NewDB // Optional new shard for growth
		},
		strategy: 'hash'
	},
	async () => {
		// Existing data works immediately!
		const existingUser = await first('user-from-old-db', 'SELECT * FROM users WHERE id = ?', ['user-from-old-db']);

		// New data gets distributed automatically
		await run('new-user-123', 'INSERT INTO users (id, name, email) VALUES (?, ?, ?)', ['new-user-123', 'New User', '[email protected]']);
	}
);

That's it! No migration scripts, no manual mapping creation, no downtime. Your existing data is immediately accessible through CollegeDB's sharding system.

Manual Validation (Optional)

You can manually validate databases before integration if needed:

import { validateTableForSharding, listTables } from '@earth-app/collegedb';

// Check database structure
const tables = await listTables(env.ExistingDB);
console.log('Found tables:', tables);

// Validate each table
for (const table of tables) {
	const validation = await validateTableForSharding(env.ExistingDB, table);
	if (validation.isValid) {
		console.log(`✅ ${table}: ${validation.recordCount} records ready`);
	} else {
		console.log(`❌ ${table}: ${validation.issues.join(', ')}`);
	}
}

Manual Data Discovery (Optional)

If you want to inspect existing data before automatic migration:

import { discoverExistingPrimaryKeys } from '@earth-app/collegedb';

// Discover all user IDs in existing users table
const userIds = await discoverExistingPrimaryKeys(env.ExistingDB, 'users');
console.log(`Found ${userIds.length} existing users`);

// Custom primary key column
const orderIds = await discoverExistingPrimaryKeys(env.ExistingDB, 'orders', 'order_id');

Manual Integration (Optional)

For complete control over the integration process:

import { integrateExistingDatabase, KVShardMapper } from '@earth-app/collegedb';

const mapper = new KVShardMapper(env.KV);

// Integrate your existing database
const result = await integrateExistingDatabase(
	env.ExistingDB, // Your existing D1 database
	'db-primary', // Shard name for this database
	mapper, // KV mapper instance
	{
		tables: ['users', 'posts', 'orders'], // Tables to integrate
		primaryKeyColumn: 'id', // Primary key column name
		strategy: 'hash', // Allocation strategy for future records
		addShardMappingsTable: true, // Add CollegeDB metadata table
		dryRun: false // Set true for testing
	}
);

if (result.success) {
	console.log(`✅ Integrated ${result.totalRecords} records from ${result.tablesProcessed} tables`);
} else {
	console.error('Integration issues:', result.issues);
}

After integration, initialize CollegeDB with your existing databases as shards:

import { initialize, first } from '@earth-app/collegedb';

// Include existing databases as shards
initialize({
	kv: env.KV,
	coordinator: env.ShardCoordinator,
	shards: {
		'db-primary': env.ExistingDB, // Your integrated existing database
		'db-secondary': env.AnotherExistingDB, // Another existing database
		'db-new': env.NewDB // Optional new shard for growth
	},
	strategy: 'hash'
});

// Existing data is now automatically routed!
const user = await first('existing-user-123', 'SELECT * FROM users WHERE id = ?', ['existing-user-123']);

Complete Drop-in Example

The simplest possible integration - just add your existing databases:

import { initialize, first, run } from '@earth-app/collegedb';

export default {
	async fetch(request: Request, env: Env): Promise<Response> {
		// Step 1: Initialize with existing databases (automatic migration happens here!)
		initialize({
			kv: env.KV,
			shards: {
				'db-users': env.ExistingUserDB, // Your existing database with users
				'db-orders': env.ExistingOrderDB, // Your existing database with orders
				'db-new': env.NewDB // New shard for future growth
			},
			strategy: 'hash'
		});

		// Step 2: Use existing data immediately - no migration needed!
		// Supports typed queries, inserts, updates, deletes, etc.
		const existingUser = await first<User>('user-from-old-db', 'SELECT * FROM users WHERE id = ?', ['user-from-old-db']);

		// Step 3: New data gets distributed automatically
		await run('new-user-123', 'INSERT INTO users (id, name, email) VALUES (?, ?, ?)', ['new-user-123', 'New User', '[email protected]']);

		return new Response(
			JSON.stringify({
				existingUser: existingUser.results[0],
				message: 'Automatic drop-in replacement successful!'
			})
		);
	}
};

Manual Integration Example

If your tables use different primary key column names:

// For tables with custom primary key columns
const productIds = await discoverExistingPrimaryKeys(env.ProductDB, 'products', 'product_id');
const sessionIds = await discoverExistingPrimaryKeys(env.SessionDB, 'sessions', 'session_key');

Integrate only specific tables from existing databases:

const result = await integrateExistingDatabase(env.ExistingDB, 'db-legacy', mapper, {
	tables: ['users', 'orders'] // Only integrate these tables
	// Skip 'temp_logs', 'cache_data', etc.
});

Test integration without making changes:

const testResult = await integrateExistingDatabase(env.ExistingDB, 'db-test', mapper, {
	dryRun: true // No actual mappings created
});

console.log(`Would process ${testResult.totalRecords} records from ${testResult.tablesProcessed} tables`);

Performance Impact

One-time Setup: Migration detection runs once per shard
Minimal Overhead: Only scans table metadata and sample records
Cached Results: Subsequent operations have no migration overhead
Async Processing: Doesn't block application startup or queries

// Simple rollback - clear all mappings
import { KVShardMapper } from '@earth-app/collegedb';
const mapper = new KVShardMapper(env.KV);
await mapper.clearAllMappings(); // Returns to pre-migration state

// Or clear cache to force re-detection
import { clearMigrationCache } from '@earth-app/collegedb';
clearMigrationCache(); // Forces fresh migration check

Troubleshooting

Tables without Primary Keys

// Error: Primary key column 'id' not found
// Solution: Add primary key to existing table
await db.prepare(`ALTER TABLE legacy_table ADD COLUMN id TEXT PRIMARY KEY`).run();

Large Database Integration

// For very large databases, integrate in batches
const allTables = await listTables(env.LargeDB);
const batchSize = 2;

for (let i = 0; i < allTables.length; i += batchSize) {
	const batch = allTables.slice(i, i + batchSize);
	await integrateExistingDatabase(env.LargeDB, 'db-large', mapper, {
		tables: batch
	});
}

Mixed Primary Key Types

// Handle different primary key column names per table
const customIntegration = {
	users: 'user_id',
	orders: 'order_number',
	products: 'sku'
};

for (const [table, pkColumn] of Object.entries(customIntegration)) {
	const keys = await discoverExistingPrimaryKeys(env.DB, table, pkColumn);
	await createMappingsForExistingKeys(keys, ['db-shard1'], 'hash', mapper);
}

Cross-Shard Pagination Behavior

allAllShards and firstAllShards execute the exact SQL on each shard independently. That means SQL LIMIT/OFFSET applies per shard, not globally.

// With two shards, this can return up to 20 total rows (10 per shard)
const perShard = await allAllShards('SELECT * FROM posts ORDER BY created_at DESC LIMIT 10');

If you need true global merge/sort/pagination across all shard results, use allAllShardsGlobal / firstAllShardsGlobal and pass sort/pagination options to the library:

import { allAllShardsGlobal, firstAllShardsGlobal } from '@earth-app/collegedb';

const page = await allAllShardsGlobal<{ id: string; created_at: number }>('SELECT id, created_at FROM posts', [], {
	sortBy: 'created_at',
	sortDirection: 'desc',
	offset: 20,
	limit: 10
});

const newest = await firstAllShardsGlobal<{ id: string; created_at: number }>('SELECT id, created_at FROM posts', [], {
	sortBy: 'created_at',
	sortDirection: 'desc'
});

Database Query Best Practices

Use Library Utility Operations for DDL and Inspection

CollegeDB now exposes utility helpers for operational tasks that need shard awareness:

import { countAllShards, explainAllShards, getDatabaseSizesAllShards, indexAllShards } from '@earth-app/collegedb';

// Create index across all shards
await indexAllShards('posts', [{ name: 'user_id' }, { name: 'created_at', order: 'DESC' }], {
	ifNotExists: true
});

// Inspect query plan across all shards
const plans = await explainAllShards('SELECT * FROM posts WHERE user_id = ? ORDER BY created_at DESC LIMIT 20', ['user-123']);

// Count rows globally
const counts = await countAllShards('posts');

// Get per-shard size measurements
const sizes = await getDatabaseSizesAllShards();

Recommended pattern:

Use indexAllShards for schema/index consistency.
Use explain/explainAllShards before adding indexes or changing query shapes.
Use countAllShards and getDatabaseSizesAllShards for operational dashboards and rebalancing thresholds.

Create Targeted Indexes

CREATE INDEX IF NOT EXISTS idx_posts_user_id_created_at ON posts(user_id, created_at DESC);
CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);

Inspect Query Plans and Statistics

EXPLAIN QUERY PLAN SELECT * FROM posts WHERE user_id = ? ORDER BY created_at DESC LIMIT 20;
PRAGMA optimize;
ANALYZE;

Use Bound Parameters (SQL Injection Protection)

// Safe: parameterized query
await first('user-123', 'SELECT * FROM users WHERE email = ?', [email]);

// Avoid string interpolation with user input
// BAD: `... WHERE email = '${email}'`

Design Search for Scale

Exact-match or prefix search fields should be indexed.
Prefer bounded result sets (LIMIT) and stable sorting.
For global search pages, pair lightweight per-shard SQL with allAllShardsGlobal for final merge/sort/pagination.

Pagination Guidance

For routed single-key reads (first, all), SQL pagination is naturally shard-local and predictable.
For fanout (allAllShards, firstAllShards), SQL pagination is per-shard.
For user-facing global pages, use allAllShardsGlobal so offset/limit apply once after merge.

API Reference

| Function | Description | Parameters | | ------------------------------------------------- | ---------------------------------------------------------------------- | ------------------------------------------------------------------ | | collegedb(config, callback) | Initialize CollegeDB, then run a callback | CollegeDBConfig, () => T | | initialize(config) | Initialize CollegeDB with configuration | CollegeDBConfig | | createSchema(db, schema) | Create schema on a shard database | SQLDatabase, string | | prepare(key, sql) | Prepare a SQL statement for execution | string, string | | run(key, sql, bindings) | Execute a SQL query with primary key routing | string, string, any[] | | insert(sql, bindings) | Insert on an automatically selected shard and capture the generated id | string, any[] | | insertShard(shard, sql, bindings) | Insert directly on a specific shard and capture the generated id | string, string, any[] | | first(key, sql, bindings) | Execute a SQL query and return first result | string, string, any[] | | all(key, sql, bindings) | Execute a SQL query and return all results | string, string, any[] | | index(key, table, columns, options) | Create an index on routed shard | string, string, string or index-column array, CreateIndexOptions | | indexShard(shard, table, columns, options) | Create an index on one shard | string, string, string or index-column array, CreateIndexOptions | | indexAllShards(table, columns, options) | Create an index on all shards | string, string or index-column array, CreateIndexOptions | | firstByLookupKey(key, sql, bindings, batchSize) | Resolve secondary-key mapping, fallback to fanout | string, string, any[], number | | allByLookupKey(key, sql, bindings, batchSize) | Resolve secondary-key mapping, fallback to fanout | string, string, any[], number | | runShard(shard, sql, bindings) | Execute a query directly on a specific shard | string, string, any[] | | allShard(shard, sql, bindings) | Execute a query on specific shard, return all results | string, string, any[] | | firstShard(shard, sql, bindings) | Execute a query on specific shard, return first result | string, string, any[] | | explain(key, sql, bindings, options) | Inspect query plan on routed shard | string, string, any[], ExplainOptions | | explainShard(shard, sql, bindings, options) | Inspect query plan on one shard | string, string, any[], ExplainOptions | | explainAllShards(sql, bindings, options) | Inspect query plan on all shards | string, any[], ExplainOptions | | count(key, table) | Count rows on routed shard | string, string | | countShard(shard, table) | Count rows on a specific shard | string, string | | countAllShards(table, batchSize) | Count rows per shard and global total | string, number | | runAllShards(sql, bindings, batchSize) | Execute query on all shards | string, any[], number | | allAllShards(sql, bindings, batchSize) | Execute query on all shards (SQL pagination applies per shard) | string, any[], number | | firstAllShards(sql, bindings, batchSize) | Execute query on all shards, return first row per shard | string, any[], number | | allAllShardsGlobal(sql, bindings, options) | Execute query on all shards, then globally merge/sort/paginate | string, any[], GlobalAllShardsOptions | | firstAllShardsGlobal(sql, bindings, options) | Return first row after global merge/sort/paginate | string, any[], GlobalAllShardsOptions | | reassignShard(key, newShard) | Move primary key to different shard | string, string | | listKnownShards() | Get list of available shards | void | | getShardStats() | Get statistics for all shards | void | | getDatabaseSizeForKey(key) | Get size of key-routed shard in bytes | string | | getDatabaseSizeForShard(shard) | Get size of a specific shard in bytes | string | | getDatabaseSizesAllShards(batchSize) | Get per-shard size data | number | | getTotalDatabaseSize(batchSize) | Get total size across all shards | number | | flush() | Clear all shard mappings (development only) | void |

Provider Adapter Functions

| Function | Description | Parameters | | ---------------------------------------------------------- | -------------------------------------------------------------------- | -------------------------------------------------------- | | createRedisKVProvider(client, options?) | Adapt a Redis client to CollegeDB's KVStorage contract | RedisLikeClient, { scanCount?: number } | | createValkeyKVProvider(client, options?) | Adapt a Valkey client to CollegeDB's KVStorage contract | RedisLikeClient, { scanCount?: number } | | createNuxtHubKVProvider(client) | Adapt NuxtHub/Unstorage-style KV clients to KVStorage | NuxtHubKVLike | | createPostgreSQLProvider(client, sqlTag?) | Adapt PostgreSQL or Drizzle PostgreSQL clients | PostgresClientLike, sqlTag? | | createMySQLProvider(client, sqlTag?) | Adapt MySQL/MariaDB or Drizzle MySQL/MariaDB clients | MySQLClientLike, sqlTag? | | createSQLiteProvider(client, sqlTag?) | Adapt SQLite/D1 or Drizzle SQLite/D1 clients | SQLiteClientLike, sqlTag? | | createDrizzleSQLProvider(client, sqlTag) | Generic Drizzle adapter (optional helper) | DrizzleClientLike, DrizzleSqlTagLike | | createHyperdrivePostgresProvider(binding, clientFactory) | Create a PostgreSQL SQLDatabase adapter using a Hyperdrive binding | HyperdriveBindingLike, HyperdrivePostgresClientFactory | | createHyperdriveMySQLProvider(binding, clientFactory) | Create a MySQL SQLDatabase adapter using a Hyperdrive binding | HyperdriveBindingLike, HyperdriveMySQLClientFactory | | isKVStorage(value) | Runtime guard for KVStorage | unknown | | isSQLDatabase(value) | Runtime guard for SQLDatabase | unknown |

Drop-in Replacement Functions

| Function | Description | Parameters | | ----------------------------------------- | ---------------------------------------------- | ------------------------------ | | autoDetectAndMigrate(d1, shard, config) | Automatically detect and migrate existing data | SQLDatabase, string, config | | checkMigrationNeeded(d1, shard, config) | Check if database needs migration | SQLDatabase, string, config | | validateTableForSharding(d1, table) | Check if table is suitable for sharding | SQLDatabase, string | | discoverExistingPrimaryKeys(d1, table) | Find all primary keys in existing table | SQLDatabase, string | | integrateExistingDatabase(d1, shard) | Complete drop-in integration of existing DB | SQLDatabase, string, mapper | | createMappingsForExistingKeys(keys) | Create shard mappings for existing keys | string[], string[], strategy | | listTables(d1) | Get list of tables in database | SQLDatabase | | clearMigrationCache() | Clear automatic migration cache | void |

Error Handling

| Class | Description | Usage | | ---------------- | ------------------------------------------- | ------------------------------------- | | CollegeDBError | Custom error class for CollegeDB operations | throw new CollegeDBError(msg, code) |

The CollegeDBError class extends the native Error class and includes an optional error code for better error categorization:

try {
	await run('invalid-key', 'SELECT * FROM users WHERE id = ?', ['invalid-key']);
} catch (error) {
	if (error instanceof CollegeDBError) {
		console.error(`CollegeDB Error (${error.code}): ${error.message}`);
	}
}

ShardCoordinator (Durable Object) API

The ShardCoordinator is an optional Durable Object that provides centralized shard allocation and statistics management. All endpoints return JSON responses.

HTTP API Endpoints

| Endpoint | Method | Description | Request Body | Response | | ----------- | ------ | ---------------------------------- | ------------------------------------------------ | -------------------------------------- | | /shards | GET | List all registered shards | None | ["db-east", "db-west"] | | /shards | POST | Register a new shard | {"shard": "db-new"} | {"success": true} | | /shards | DELETE | Unregister a shard | {"shard": "db-old"} | {"success": true} | | /stats | GET | Get shard statistics | None | [{"binding":"db-east","count":1542}] | | /stats | POST | Update shard statistics | {"shard": "db-east", "count": 1600} | {"success": true} | | /allocate | POST | Allocate shard for primary key | {"primaryKey": "user-123"} | {"shard": "db-west"} | | /allocate | POST | Allocate with specific strategy | {"primaryKey": "user-123", "strategy": "hash"} | {"shard": "db-west"} | | /flush | POST | Clear all state (development only) | None | {"success": true} | | /health | GET | Health check | None | "OK" |

Programmatic Methods

| Method | Description | Parameters | Returns | | ----------------------------- | ----------------------------- | -------------------- | ---------------