# litestone

> `@frontierjs/litestone` v1.0.2
A declarative, functional pipeline for transforming SQLite databases. Scope rows, strip columns, mask fields, split by entity, and ship clean databases — without writing imperative migration scripts.
Built with Bun and its native SQLite driver. The source database is never mutated. All output goes to new files.
## Install

```sh
npm install -g @frontierjs/litestone
# or
npm install -g litestone
```

## Quickstart

Create a `litestone.config.js` in your project:
```js
import { $ } from '@frontierjs/litestone'

export let pipeline = [
  $.accounts.scope(`id IN (1, 2, 3)`),
  $.users.filter(`deleted_at IS NULL`),
  $.all.drop('email', 'password'),
  $.leads.mask('phone', 'last4'),
]

export let config = {
  db: './production.db',
  pipeline,
}
```

Then run:

```sh
litestone              # runs litestone.config.js
litestone my-config.js # or pass a path explicitly
```

## How it works
The framework copies the source database, runs your pipeline against the copy, VACUUMs it, and writes the result. The pipeline is a plain array of declarative steps — no loops, no callbacks, no SQL migrations.
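The flow above can be modelled in a few lines. This is a toy sketch of the copy-then-transform idea using plain objects in place of SQLite files; the step shape (`{ table, op, arg }`) is an illustration, not litestone's internal format:

```js
// Toy model of the copy-then-transform flow: the "database" here is a plain
// object of table-name -> row arrays, standing in for the copied SQLite file.
function runPipeline(sourceDb, steps) {
  // 1. Work on a deep copy: the source is never mutated.
  const db = structuredClone(sourceDb)

  // 2. Apply each declarative step in order.
  for (const step of steps) {
    if (step.op === 'filter') {
      db[step.table] = db[step.table].filter(step.arg)
    } else if (step.op === 'dropColumn') {
      for (const row of db[step.table]) delete row[step.arg]
    }
  }

  // 3. (The real framework VACUUMs and writes the result file here.)
  return db
}

const source = { users: [{ id: 1, email: 'a@x.com' }, { id: 2, email: 'b@x.com' }] }
const out = runPipeline(source, [
  { table: 'users', op: 'filter', arg: (r) => r.id === 1 },
  { table: 'users', op: 'dropColumn', arg: 'email' },
])
// source still has both rows; out.users is [{ id: 1 }]
```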
## The `$` DSL

All operations are accessed through the `$` proxy:

```js
$.tableName.operation(...)
```

`$.all` targets every table at once for column operations.
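For illustration, a `$`-style proxy that records operations as plain data can be sketched like this (a hypothetical reimplementation, not the library's source; real column ops are also chainable, which this sketch omits):

```js
// Minimal sketch of a $-style DSL: property access picks the table,
// method calls record declarative steps instead of executing anything.
function makeDsl() {
  return new Proxy({}, {
    get: (_, table) =>
      new Proxy({}, {
        get: (_, op) =>
          (...args) => ({ table, op, args }), // a step is plain data
      }),
  })
}

const $ = makeDsl()
const step = $.users.filter(`deleted_at IS NULL`)
// step -> { table: 'users', op: 'filter', args: ['deleted_at IS NULL'] }
```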
## Operations

### Row operations

These are standalone pipeline steps that reduce rows.
#### `$.table.scope(sql | fn)` — filter with FK cascade

The primary scoping primitive. Filters the anchor table, then automatically cascades to all FK-linked child tables. Supports both SQL strings (fast path) and JS functions.

```js
$.accounts.scope(`id IN (1, 2, 3)`)         // SQL — fastest
$.accounts.scope(row => row.plan === 'pro') // JS predicate
```

The cascade follows the full FK graph depth-first and works with simple PKs (`id`), non-`id` PKs (`slug`, `uuid`), and composite FKs.
#### `$.table.filter(sql | fn)` — filter without cascade

Filters a single table. Does not cascade to child tables.

```js
$.users.filter(`deleted_at IS NULL`)
$.leads.filter(row => row.status !== 'test')
```

#### `$.table.limit(n | '10%' | '50mb')` — keep first N rows
```js
$.audit_logs.limit(1000)    // first 1000 rows
$.audit_logs.limit('10%')   // first 10% of rows
$.audit_logs.limit('500mb') // rows that fit in ~500mb
$.all.limit('2gb')          // whole-db target, distributed proportionally
```

#### `$.table.sample(n | '10%' | '50mb')` — keep random rows
Same syntax as `limit`, but with random selection.
```js
$.leads.sample(500)
$.leads.sample('5%')
$.all.sample('1gb')
```

#### `$.table.drop()` — drop an entire table

```js
$.audit_logs.drop()
$.debug_events.drop()
```

### Column operations
Chainable. Multiple ops on the same table are applied in order.
#### `$.table.drop('col1', 'col2')` — drop columns

```js
$.all.drop('email', 'password', 'internal_notes')
$.users.drop('legacy_field')
```

#### `$.table.keep('col1', 'col2')` — whitelist columns

Drops everything not in the list.

```js
$.leads.keep('id', 'account_id', 'status', 'phone')
```

#### `$.table.rename('from', 'to')` — rename a column

```js
$.users.rename('fname', 'first_name')
```

#### `$.table.set('name', fn)` — transform or add a column

```js
$.users.set('email', (val) => val.toLowerCase())
$.users.set('full_name', (_, row) => `${row.first_name} ${row.last_name}`.trim())
```

#### `$.table.mask('col', strategy)` — built-in masking
| Strategy | Example input        | Example output                      |
| -------- | -------------------- | ----------------------------------- |
| `redact` | `+1 555-867-5309`    | `***`                               |
| `last4`  | `+1 555-867-5309`    | `************5309`                  |
| `first2` | `[email protected]`  | `jo******************`              |
| `hash`   | `secret`             | `2bb80d537b1da3e3` (SHA-256 prefix) |
### Chaining

Column operations on the same table are chainable:

```js
$.users
  .set('full_name', (_, row) => `${row.first_name} ${row.last_name}`.trim())
  .drop('first_name', 'last_name')
  .mask('phone', 'last4')
  .rename('usr_role', 'role')
```

## Splitting
Use `$.shard()` as a marker in the pipeline to fan out into one output file per entity. Everything before the marker runs once; everything after runs per file.
```js
import { $, params } from '@frontierjs/litestone'

const { accountIds } = params({ accountIds: [1, 2, 3] })

export let pipeline = [
  $.accounts.scope(`id IN (${accountIds.join(',')})`),
  $.all.drop('email', 'password'),

  $.shard(),                // ← split here, one db per account row

  $.leads.sample(500),      // per-file: max 500 random leads
  $.audit_logs.limit(1000), // per-file: first 1000 audit logs
]

export let config = {
  db: './production.db',
  pipeline,
  filename: (row) => `tenant-${row.slug}.db`, // optional
}
```

Splits run in parallel (default: 8 workers). A `manifest.json` is written to the output directory after completion.
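Conceptually, the marker just splits the step list in two. A sketch, assuming steps are plain objects with an `op` field (litestone's internal representation may differ):

```js
// Split a pipeline at the shard marker: steps before it run once on the
// shared copy, steps after it run once per output file.
function splitAtShard(pipeline) {
  const i = pipeline.findIndex((step) => step.op === 'shard')
  if (i === -1) return { shared: pipeline, perFile: [] }
  return { shared: pipeline.slice(0, i), perFile: pipeline.slice(i + 1) }
}

const p = [{ op: 'scope' }, { op: 'shard' }, { op: 'sample' }]
splitAtShard(p)
// -> { shared: [{ op: 'scope' }], perFile: [{ op: 'sample' }] }
```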
## Recommended pipeline order

Order matters for both correctness and efficiency:
```js
export let pipeline = [
  // 1. Drop tables you don't need at all (leaf tables only)
  $.audit_logs.drop(),

  // 2. Scope — big row reduction, cascade while FK graph is intact
  $.accounts.scope(`id IN (...)`),

  // 3. Per-table row filters on the now-scoped subset
  $.users.filter(`deleted_at IS NULL`),
  $.leads.filter(`status != 'test'`),

  // 4. limit / sample — further reduction if needed
  $.leads.sample(500),

  // 5. Column whitelist / drops — after row reduction
  $.all.drop('password', 'internal_notes'),
  $.leads.keep('id', 'account_id', 'status', 'phone'),

  // 6. Mutations last — mask/set/rename on the final dataset
  $.leads.mask('phone', 'last4'),
  $.users
    .set('full_name', (_, r) => `${r.first_name} ${r.last_name}`.trim())
    .drop('first_name', 'last_name'),
]
```

The one correctness trap: don't drop a table that is an FK parent of your scope anchor before `scope` runs. Only drop leaf tables, or tables completely outside the scope chain.
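That rule can be checked mechanically. A hypothetical helper, not part of litestone, assuming you can list your FK edges as child → parent pairs; it conservatively flags any dropped table that some other table still references:

```js
// Given FK edges as [childTable, parentTable] pairs, a table is only safe
// to drop before scope() if no other table references it (i.e. it is a
// leaf, or sits completely outside the scope chain).
function unsafeEarlyDrops(fkEdges, tablesToDrop) {
  const parents = new Set(fkEdges.map(([, parent]) => parent))
  return tablesToDrop.filter((t) => parents.has(t))
}

const edges = [
  ['users', 'accounts'],      // users.account_id -> accounts.id
  ['leads', 'accounts'],
  ['audit_logs', 'users'],
]
unsafeEarlyDrops(edges, ['audit_logs']) // -> [] (leaf table, safe)
unsafeEarlyDrops(edges, ['users'])      // -> ['users'] (audit_logs still points at it)
```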
## Runtime parameters

Configs declare typed defaults that can be overridden at the CLI:
```js
import { $, params } from '@frontierjs/litestone'

const { accountIds, includeSoftDeleted } = params({
  accountIds: [1, 2, 3],
  includeSoftDeleted: false,
})

export let pipeline = [
  $.accounts.scope(`id IN (${accountIds.join(',')})`),
  ...(!includeSoftDeleted ? [$.users.filter(`deleted_at IS NULL`)] : []),
]

export let config = {
  db: './production.db',
  pipeline,
}
```

```sh
litestone --params='{"accountIds":[5,10,69],"includeSoftDeleted":true}'
```
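The override merge can be pictured like this. A minimal sketch, not the library's implementation; whether unknown keys are ignored or rejected is an assumption here:

```js
// Merge declared defaults with the --params='{"..."}' JSON override,
// leaving unspecified keys at their defaults.
function resolveParams(defaults, cliJson) {
  const overrides = cliJson ? JSON.parse(cliJson) : {}
  const merged = { ...defaults }
  for (const key of Object.keys(overrides)) {
    if (key in defaults) merged[key] = overrides[key] // assumption: unknown keys ignored
  }
  return merged
}

const resolved = resolveParams(
  { accountIds: [1, 2, 3], includeSoftDeleted: false },
  '{"accountIds":[5,10,69],"includeSoftDeleted":true}'
)
// -> { accountIds: [5, 10, 69], includeSoftDeleted: true }
```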
## CLI reference

```
litestone [config] [options]
```

| Flag                     | Description                                        |
| ------------------------ | -------------------------------------------------- |
| (no config)              | Defaults to `./litestone.config.js`                |
| `--out=path`             | Output file or directory                           |
| `--dry-run`              | Validate pipeline without executing                |
| `--preview`              | Show row counts per step without executing         |
| `--params='{"key":val}'` | Override `params()` defaults at runtime            |
| `--only=1,5,88`          | Split mode: process only these IDs                 |
| `--skip-existing`        | Split mode: skip files that already exist (resume) |
| `--force`                | Split mode: overwrite stale intermediate files     |
| `--concurrency=N`        | Split mode: parallel worker count (default: 8)     |
### Examples

```sh
# Basic run
litestone

# Custom config and output directory
litestone ./configs/prod-export.js --out=./exports/

# Dry run — catch errors before touching anything
litestone --dry-run

# Preview row counts per step, no execution
litestone --preview

# Override params at runtime
litestone --params='{"accountIds":[42,99]}'

# Resume a failed split run
litestone --skip-existing --out=./splits/

# Re-run only specific accounts
litestone --only=5,12,88 --out=./splits/

# Limit parallelism on a constrained machine
litestone --concurrency=2
```

## Static analysis
Before touching the database, the framework walks the pipeline and catches:

- Operations on unknown or already-dropped tables and columns
- Invalid `limit`/`sample` values
- Index impact — warns when a `drop()` or `keep()` would destroy an index:

```
⚠️ Step 4: dropping "email" on "users" will remove index "idx_users_email" (cols: email)
```

Run `--dry-run` to see all warnings without executing anything.
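The index-impact check reduces to a set intersection. An illustrative sketch; the real framework reads index definitions from the database schema, while here they are passed in as a plain map:

```js
// Warn when dropping columns would destroy an index that covers them.
// `indexes` maps index name -> the columns it covers.
function indexWarnings(indexes, table, droppedCols) {
  const dropped = new Set(droppedCols)
  const warnings = []
  for (const [name, cols] of Object.entries(indexes)) {
    const hit = cols.find((c) => dropped.has(c))
    if (hit !== undefined) {
      warnings.push(`dropping "${hit}" on "${table}" will remove index "${name}"`)
    }
  }
  return warnings
}

const result = indexWarnings({ idx_users_email: ['email'] }, 'users', ['email', 'password'])
// -> ['dropping "email" on "users" will remove index "idx_users_email"']
```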
## Manifest

After a split run, a `manifest.json` is written to the output directory:
```json
{
  "source": "/data/production.db",
  "config": "/configs/export.js",
  "splitBy": "accounts",
  "completedAt": "2026-03-01T18:37:51.277Z",
  "totalMs": 4432,
  "written": 3,
  "skipped": 0,
  "failed": 0,
  "failures": [],
  "files": [
    {
      "file": "/exports/accounts-1.db",
      "sizeBytes": 839680,
      "rows": { "accounts": 1, "users": 17143, "leads": 100, "audit_logs": 50 }
    }
  ]
}
```

## Performance notes
- **SQL fast paths:** String predicates use `INSERT INTO … SELECT WHERE` entirely in SQL — 2–8× faster than JS row iteration.
- **Parallel splits:** Each split runs in an independent Bun worker. On an 8-core machine, 100 splits finish in roughly the time of 1 split plus the shared pipeline pass.
- **WAL detection:** Stale `-wal`/`-shm` files next to the source database trigger a warning with the exact fix command.
- **VACUUM:** Every output file is VACUUMed before finalising, eliminating freelist pages left by intermediate operations.
## Project structure

```
packages/
  frontierjs-litestone/              — published as @frontierjs/litestone
    bin.js                           — platform binary detection + bun fallback
    package.json
    src/
      framework.js                   — core: introspect, plan, execute, $ DSL, params
      runner.js                      — pipeline executor, rebuildTable, resolveScope
      split-worker.js                — Bun worker for parallel split execution
      run.js                         — CLI entry point
  frontierjs-litestone-linux-x64/
  frontierjs-litestone-linux-arm64/
  frontierjs-litestone-darwin-x64/
  frontierjs-litestone-darwin-arm64/
  frontierjs-litestone-win32-x64/
  litestone/                         — published as litestone (unscoped shim)
examples/
  1-scope-and-anonymize.js
  2-shard.js
  3-redact-and-trim.js
  4-multi-tenant-export.js
scripts/
  bump-version.js                    — keeps all package versions in sync
  seed.js                            — generates sample.db for local development
assets/
  litestone.png
  litestone-black.png
```

## Development
```sh
# Install deps
bun install

# Seed the sample database
bun scripts/seed.js

# Run against an example config
bun run dev examples/1-scope-and-anonymize.js --preview
bun run dev examples/1-scope-and-anonymize.js

# Build worker bundle + all platform binaries
bun run build

# Test a binary directly
./packages/frontierjs-litestone-darwin-arm64/litestone examples/1-scope-and-anonymize.js --preview

# Link for local development
cd packages/frontierjs-litestone && npm link
litestone examples/1-scope-and-anonymize.js --preview

# Bump version across all packages
bun scripts/bump-version.js 1.1.0

# Publish everything
bun run publish:all
```