sqlgoose

v0.1.1

Published

2 months ago

The Most Feature-Complete ORM - Mongoose-like API for SQL with 120+ Enterprise Features

🦢 SQLGoose

SQLGoose is a production-grade, database-agnostic ORM for Node.js. It brings the elegant, schema-based API of Mongoose to the world of SQL and heterogeneous database clusters, shipping with over 100+ built-in enterprise features.

📖 Table of Contents

🚀 Introduction

SQLGoose was built to solve the frustration of disparate database APIs and the complexity of managing resilient data layers. Whether you are running a single PostgreSQL instance or a globally distributed cluster of PostgreSQL, MySQL, and MongoDB nodes, SQLGoose provides a unified interface and enterprise-grade stability.

Why SQLGoose?

Unified API: Write queries once using familiar Mongoose syntax, run them natively on Postgres, MySQL, or MongoDB.
Heterogeneous Bliss: Use Postgres for your primary writes and MySQL/MongoDB for read replicas—the ORM handles the dialect translation internally.
Bulletproof Resilience: Built-in per-adapter circuit breakers and automated failover route your traffic around unhealthy nodes automatically without crashing your application.

✨ Features

Highlights:

🦄 Mongoose-Like Schema Engine: Define data definitions, default values, virtuals, plugins, and deeply nested validation.
🚦 Per-Adapter Circuit Breakers: Instance-based health tracking prevents a single slow replica from lagging your entire app.
🔄 Graceful Failover: Intelligently redirects traffic to healthy replicas or falls back to the primary database.
💓 Pulse Health Checks: Proactive background monitoring automatically reintroduces recovered database nodes to the pool.
⚖️ Advanced Load Balancing: Choose between round-robin, random, least-connections, or latency-based strategies.

🧰 Feature Deep Dive & Usage Guide

We categorized our 60+ native features into major operational blocks so you understand why they exist, where to apply them in production, and how they look in code.

1. High Availability & Scalability

(Multi-Database, Read Replicas, Circuit Breakers, Connection Retry, Database Sharding, Replication Monitoring, Connection Pooling, Feature Detection)

Why: Production architectures often distribute load across primaries and secondaries. You need traffic to automatically reroute if a replication node dies without throwing 500 errors. Where: Used globally at the connection layer of heavily trafficked, horizontally scaled APIs.

// Automatic distribution across Postgres, MySQL, and MongoDB dynamically handling sharding and read-replicas.
await connect({
  replicaConfig: {
    write: 'postgresql://primary/db', 
    read: ['mysql://secondary/db', 'mongodb://mongo-replica/db'],
    sharding: { enabled: true, strategy: 'hash' }
  }
});

2. Schema Engineering & Data Integrity

(Mongoose-like API, Auto Schema Sync, Schema Inheritance, Computed Fields, Strict Validation, Hooks & Middleware, Soft Delete, Timestamps, Default Scopes, AI Schema Suggestions, Schema Evolution)

Why: Unstructured SQL leads to messy, irrecoverable data. Schema mappings enforce type checking, compute derived data on the fly, and fire operational triggers. Where: Use hooks to hash passwords before save, inherited schemas for polymorphic users, and Soft Delete for GDPR compliance.

// Inherit from Base Schema, automatically syncs tables, and uses computed fields.
const BaseSchema = new Schema({
    createdAt: { type: Date, default: () => new Date() }
}, { softDelete: true }); // Tracks deletedAt natively

const userSchema = cloneSchema(BaseSchema).add({
  firstName: String,
  lastName: String,
}).virtual('fullName').get((doc) => `${doc.firstName} ${doc.lastName}`);

3. Advanced Querying & Relationships

(Query Builder, Aggregation Pipelines, Populate/Joins, Virtual Populate, Full-Text Search, Geospatial Queries, Time-Series Support, Polymorphic Relations, Cascading Deletes)

Why: Mongoose's .populate() is legendary. We brought it to SQL alongside proper Aggregations and Full-Text searching so you avoid writing painful, raw string SQL joins. Where: Deeply nested GraphQL resolvers, geospatial proximity searches (Uber/Tinder like apps), and relational bridging.

// Deep nested joins with Full-Text Search and filtering
const nearbyUsers = await User.find({ status: 'active' })
    .search('john') // Native Full-text search
    .near({ lat: 40.71, lng: -74.00, maxDistance: 5000 }) // Geospatial intersections
    .populate({
        path: 'orders',
        match: { total: { $gt: 100 } }, // Mongoose Query Operators
        populate: 'products'
    });

4. Enterprise Performance & Acceleration

(Redis Cache, Cursor Pagination, Streaming, Bulk Ops, Upserts, Query Timeout, Query Cost Analysis, Auto Query Optimization)

Why: Handling millions of rows crashes V8 memory limits. You need cursors, native caching layers, and true bulk ingest abilities. Where: Real-time analytical dashboards, massive CSV data imports, and high-throughput endpoint caches.

// Cursor Pagination for infinite scroll & Redis Cache caching
const page = await User.find().paginate({ cursor: lastCursor, limit: 100 }).cache({ ttl: 60 });

// Bulk Upsert avoiding multiple row-by-row trips
await User.bulkUpsert(
  [ { email: '[email protected]', name: 'A' }, { email: '[email protected]', name: 'B' } ],
  ['email'] // Conflict keys
);

5. Multi-Tenancy & Zero-Trust Security

(Multi-Tenancy RLS, Data Sanitization, Data Anonymization, Audit Logging, Row-Level Security, Database Proxy)

Why: SaaS apps require strict logical isolation between tenants. PII requires anonymization for safe telemetry. Where: Enterprise B2B platforms, healthcare apps (HIPAA), or anywhere requiring raw database isolation mapped safely at the ORM layer.

// Inject Row Level Security dynamically for entire request lifecycle
db.setTenant('tenant-acme-123');

// This query automatically adds `WHERE tenantId = 'tenant-acme-123'`
// AND obscures PII fields via native Anonymization
const users = await User.find().anonymize(['email', 'phone']);

6. DevOps Tooling & Database Observability

(Migrations, Database Backup/Restore, Change Data Capture (CDC), Database Versioning, Database Views, GraphQL Generation, Performance Profiler, DB Compare, Query Templates, 7+ CLI Tools)

Why: Developers shouldn't write repetitive boilerplates. Observability stops slow queries before they cascade into production. Where: Generating standard APIs automatically, running CI/CD database dry-runs, tracking query memory overhead directly in logs.

# Example of the built-in CLI generating automatic implementations
npx sqlgoose generate:graphql --out ./schema.graphql
npx sqlgoose compare postgres://dev mysql://prod --sync

// Monitor complex CDC explicitly without external Kafka dependencies
eventBus.on('User.updated', (changeSet) => notifyWebhooks(changeSet));

7. Distributed Background Jobs & Event-Sourcing

(JobQueue, JobScheduler, PriorityJobQueue, JobTopics, RetryableJobQueue, Event System, Webhook Manager)

Why: Production apps need to defer heavy processing (emails, reports) and react to data changes asynchronously. Where: Use for scheduling weekly reports, processing image uploads in the background, or triggering external webhooks on record changes.

// Define a priority job with automatic retries and topic subscription
await queue.addJob('process-image', { id: 123 }, { 
  priority: JobPriority.HIGH, 
  retry: { attempts: 3, backoff: 'exponential' } 
});

// Schedule regular tasks (Cron-like)
schedule('0 0 * * *', () => syncInventory());

// React to global model events via the native event bus
eventBus.on('User.created', async (user) => {
  await webhookManager.trigger('user.welcome', user);
});

8. Advanced SQL Expressions & Data Science

(Window Functions, Recursive CTEs, Aggregation SQL Compiler, Window Functions Compiler, JSON Enhanced Ops)

Why: Modern data apps require complex ranking, tree traversal, and JSON manipulation that standard ORMs struggle with. Where: Building leaderboards (Rank/RowNumber), traversing organization charts (Recursive CTEs), or querying deeply nested JSON blobs in Postgres/MySQL.

// Rank users by score using Window Functions natively
const ranked = await User.find()
  .window('rank', (w) => w.over().orderBy('score').desc())
  .select(['name', 'score', 'rank']);

// Traverse tree structures using Recursive CTEs
const path = await Categories.recursiveCTE('category_tree', (cte) => {
  cte.anchor({ parentId: null }).recursive({ parentId: sql`category_tree.id` });
});

9. Enterprise Relationships & Data Lifecycle

(Pivot Tables, Polymorphic Relations, Eager Loading, Virtual Populate, Dirty Field Tracking, Cascading Deletes, Parent-Child Sync)

Why: Complex business logic often involves Many-to-Many relationships with extra data, or tracking EXACTLY which fields changed before saving. Where: Use Pivot Tables for user-role relationships with metadata, and trackChanges for manual audit logs.

// Setup a Many-to-Many pivot with extra metadata
await setupPivotRelationship(User, Role, 'UserRoles', { 
  extraFields: { assignedBy: String } 
});

// Track dirty fields for manual auditing or conditional logic
const doc = await User.findById(123);
doc.name = 'New Name';
const changes = trackChanges(doc); // { name: { old: 'Old', new: 'New' } }

10. High-Performance Caching & Registry

(Tag-based Caching, Global Cache Mapping, Model Registry, InMemoryCache, RedisCache, Query Templates)

Why: Invalidation is hard. Tag-based caching allows you to purge specific groups of data (e.g., "all-users-cache") without clearing the entire Redis database. Where: Heavily read-only endpoints where data needs to be cleared only when certain entities update.

// Cache queries with specific tags for precise invalidation later
await User.find({ status: 'admin' })
  .cache({ tags: ['admins', 'global-stats'], ttl: 3600 });

// Purge all cache entries tagged with 'admins'
await getGlobalCache().invalidateTags(['admins']);

11. Transaction Management & Persistence

(Nested Transactions, Transaction Manager, Savepoints, Bulk Delete, Parent-Child Cascades, Atomic Operations)

Why: Reliable apps need multi-step operations to fail gracefully. Nested transactions allow sub-tasks to fail without rolling back the entire parent operation. Where: Complex checkout flows, financial ledger entries, or multi-step account setup processes.

// Perform complex flows with Nested Transactions (Savepoints)
await db.tx(async (parent) => {
  await parent.model('Order').create({ ... });
  
  await nestedTransaction(parent, async (child) => {
    // If this fails, only 'Inventory' rolls back, 'Order' stays.
    await child.model('Inventory').decrement({ ... });
  });
});

📦 Installation

npm install sqlgoose
yarn add sqlgoose
pnpm add sqlgoose

Note: Depending on your database of choice, you may also need to install the respective drivers (pg, mysql2, mongodb).

⚡ Quick Start

import { connect, Schema, model } from 'sqlgoose';

// 1. Connect (Single DB or Primary/Replica Cluster)
await connect('postgresql://root:password@localhost:5432/myapp');

// 2. Define a Schema
const userSchema = new Schema({
  name: { type: String, required: true },
  email: { type: String, unique: true, required: true },
  role: { type: String, default: 'user' },
  age: { type: Number, min: 18 }
}, { timestamps: true });

// 3. Create a Model
const User = model('User', userSchema);

// 4. Use it!
async function run() {
  // Create a record
  const newUser = await User.create({ 
    name: 'Alice', 
    email: '[email protected]',
    age: 25
  });
  
  // Find a record
  const found = await User.findOne({ email: '[email protected]' });
  console.log(found.name); // Alice
}

run();

🧠 How It Works

SQLGoose operates on a Hybrid Adapter Architecture.

Query Translation: Internally, it translates a single Query Tree into the specific dialect of the active database backend (PostgreSQL, MySQL, or MongoDB).
Replica Manager: When you execute a read operation (find, findOne), the Replica Manager evaluates the state of each node via its active Circuit Breaker.
Resilience & Pulse Checks: If a node is "tripped" (timeouts, errors), the load balancer drops it from the rotation. A background Pulse Sync polls the node and smoothly re-integrates it once healthy. Write operations (create, update, delete) bypass replicas and go straight to the primary.

🔧 API Reference

`Connection & Setup`

`connect(urlOrOptions)`

Initializes the database connection or cluster.

Type: Function
Parameters: url: string | ConnectionOptions (See Configuration)
Return: Promise<void>

Example:

await connect('mysql://user:pass@localhost:3306/db');

`db.tx(callback)`

Runs operations inside an atomic transaction.

Type: Method
Parameters: callback: (ctx: TransactionContext) => Promise<T>
Return: Promise<T>

Example:

import { db } from 'sqlgoose';
  
await db.tx(async (ctx) => {
  const TxUser = ctx.model('User');
  await TxUser.create({ name: 'Bob' });
});

`Schema`

Defines the structure of your database tables/collections.

`new Schema(definition, options)`

Type: Class
Parameters:
- definition: Object representing columns and types (String, Number, Boolean, Date, Object, Array).
- options: Settings like { timestamps: true, softDelete: true, multiTenant: true }.

`Schema.prototype.pre(hookType, callback)` & `Schema.prototype.post(hookType, callback)`

Lifecycle hooks for validation and execution.

Example:

userSchema.pre('save', async function(doc) {
  doc.password = await hashPassword(doc.password);
});

`Schema.prototype.virtual(name)`

Computed properties.

Example:

userSchema.virtual('fullName').get(function(doc) {
  return `${doc.firstName} ${doc.lastName}`;
});

`Model`

The active record wrapper around the database using your Schema.

`Model.find(filter)` / `Model.findOne(filter)`

Finds documents matching the properties.

Parameters: filter: Object
Return: Query<T[]> or Query<T>
Example: await User.find({ role: 'admin' });

`Model.create(doc)`

Creates and saves one or multiple documents.

Parameters: doc: Object | Object[]
Return: Promise<T | T[]>

`Model.aggregate(pipeline)`

Performs complex data transformations using a Mongoose-compatible pipeline native to SQL or NoSQL.

Parameters: pipeline: Object[]
Return: Promise<any[]>

💡 Usage Examples

1. Basic CRUD Operations

SQLGoose provides a fully-featured Active Record pattern for all your data manipulation needs natively across SQL and NoSQL databases.

Create

// Insert a single document
const user = await User.create({ name: 'Alice', email: '[email protected]' });

// Alternative: New instance & save
const bob = new User({ name: 'Bob', email: '[email protected]' });
await bob.save();

// Bulk Insert
await User.insertMany([
  { name: 'Charlie', email: '[email protected]' },
  { name: 'Dave', email: '[email protected]' }
]);

Read

// Find all matching documents with MongoDB-style filter operators
const users = await User.find({ status: 'active', age: { $gt: 18 } });

// Find single document
const user = await User.findOne({ email: '[email protected]' });

// Find by Primary Key (_eid)
const bob = await User.findById('uuid-1234');

// Query Builder with chaining (SQL-like)
const topUsers = await User.find()
  .where('score').gt(100)
  .sort('-createdAt')
  .limit(10)
  .populate('profile'); // Joins 'Profile' model dynamically

Update

// Update first match
await User.updateOne({ email: '[email protected]' }, { status: 'inactive' });

// Update multiple
await User.updateMany({ age: { $lt: 18 } }, { status: 'restricted' });

// Find, update, and return the modified document
const updatedUser = await User.findByIdAndUpdate(
  'uuid-1234', 
  { age: 30 }, 
  { new: true } // Return modified version
);

// Modifying an instance
const myUser = await User.findById('uuid-5678');
myUser.name = 'New Name';
await myUser.save(); // Generates `UPDATE ... WHERE _eid = 'uuid-5678'`

Delete

// Delete first match
await User.deleteOne({ email: '[email protected]' });

// Delete multiple matching records
await User.deleteMany({ status: 'banned' });

// Delete and return the document
const deletedUser = await User.findByIdAndDelete('uuid-1234');

// Soft Delete (Requires `{ softDelete: true }` in Schema options)
await User.delete({ _eid: 'uuid-1234' }); // Sets `deletedAt` timestamp instead of destroying row

2. Advanced Cluster Configuration (Heterogeneous)

import { connect } from 'sqlgoose';

await connect({
  replicaConfig: {
    write: 'postgresql://primary:5432/db',
    read: [
      'mysql://replica-1:3306/db',
      'mongodb://replica-2:27017/db'
    ],
    loadBalancing: 'latency-based',
    failureThreshold: 5,
    cooldownMs: 10000,
    failoverToPrimary: true,
    onCircuitOpen: (url, error) => console.error(`Circuit Opened: ${url}`, error)
  }
});

3. Multi-Tenancy & Row-Level Security

const tenantSchema = new Schema({
  name: String,
  data: Object
}, { multiTenant: true }); // Automatically injects and enforces tenantId field

const TenantModel = model('TenantData', tenantSchema);

// Contextual routing safely isolates requests
db.setTenant('tenant-uuid-1234');
const data = await TenantModel.find({}); // Automatically scopes to tenant-uuid-1234

4. Graceful Error Handling & Resilience

try {
  const user = await User.findById('123');
} catch (err) {
  if (err.name === 'CircuitBreakerError') {
    // Handled case where absolutely no healthy read nodes are available
    return renderFallbackOfflineUI();
  }
  throw err;
}

⚙️ Configuration

When calling connect(options), you can pass detailed configurations:

Connection Options

Replica Config Settings

🌍 Real Use Cases

Heterogeneous Migrations: Migrate safely from MongoDB to PostgreSQL incrementally. Use MongoDB as your read-replica temporarily while writing identically to Postgres using a single Schema model.
Global Scaling: Deploy a singular central Postgres primary Write database, accompanied by localized read-replicas distributed across multiple regions (e.g., fast local MySQL instances near the user).
Legacy Modernization: Apply modern ORM features (like Virtuals, GraphQL generation, and Hooks) to bare legacy MySQL or MongoDB databases without having to rewrite huge swathes of logic.

🚨 Edge Cases & Limitations

Dialect Parity differences: While SQLGoose perfectly abstracts most types, advanced engine-specific features (such as strictly relying on PostgreSQL JSONB array searches or MongoDB complex Geospatial intersections) might require adapter-specific query structuring or raw() execution bypasses.
Transactions limitations: Active transactions (ACID guarantees via .tx()) are currently pinned purely to the primary (Write) adapter to prevent multi-master desync race conditions.

🧪 Testing

SQLGoose uses an automated Docker-based testing matrix confirming functionality across all database dialects.

# Start required databases internally (Postgres, MySQL, Mongo, Redis)
docker-compose up -d

# Run all schema, resilience, and feature tests natively
npm test

📊 Performance Insights

SQLGoose adds negligible overhead (< 1ms) for internal query translation logic and macro parsings.

When operating at scale in clusered environments, utilizing the latency-based load balancer can drastically increase application responsiveness. By dynamically scoring database nodes based on ping response time latency, the ORM consistently favors the hardware responding the fastest in real-time.

🗂️ Folder Structure

For developers looking closely at the repo:

/src/adapters: Physical driver bindings (postgres, mysql, mongodb).
/src/helpers: Dozens of independent sub-modules that handle cache tags, aggregation compilers, raw SQL mapping, cursor pagination, and diff checking.
/src/security: SQL Injection protection engine and sanitization tools.
/src/plugins: Built-in auditing integrations.
/src/read-replicas.ts: The core Replica manager tracking latency telemetry across adapters.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details on how to submit pull requests, report issues, and debug locally.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

🦢 SQLGoose

📖 Table of Contents

🚀 Introduction

✨ Features

🧰 Feature Deep Dive & Usage Guide

1. High Availability & Scalability

2. Schema Engineering & Data Integrity

3. Advanced Querying & Relationships

4. Enterprise Performance & Acceleration

5. Multi-Tenancy & Zero-Trust Security

6. DevOps Tooling & Database Observability

7. Distributed Background Jobs & Event-Sourcing

8. Advanced SQL Expressions & Data Science

9. Enterprise Relationships & Data Lifecycle

10. High-Performance Caching & Registry

11. Transaction Management & Persistence

📦 Installation

⚡ Quick Start

🧠 How It Works

🔧 API Reference

Connection & Setup

connect(urlOrOptions)

db.tx(callback)

Schema

new Schema(definition, options)

Schema.prototype.pre(hookType, callback) & Schema.prototype.post(hookType, callback)

Schema.prototype.virtual(name)

Model

Model.find(filter) / Model.findOne(filter)

Model.create(doc)

Model.aggregate(pipeline)

💡 Usage Examples

1. Basic CRUD Operations

2. Advanced Cluster Configuration (Heterogeneous)

3. Multi-Tenancy & Row-Level Security

4. Graceful Error Handling & Resilience

⚙️ Configuration

Connection Options

Replica Config Settings

🌍 Real Use Cases

🚨 Edge Cases & Limitations

🧪 Testing

📊 Performance Insights

🗂️ Folder Structure

🤝 Contributing

📄 License

`Connection & Setup`

`connect(urlOrOptions)`

`db.tx(callback)`

`Schema`

`new Schema(definition, options)`

`Schema.prototype.pre(hookType, callback)` & `Schema.prototype.post(hookType, callback)`

`Schema.prototype.virtual(name)`

`Model`

`Model.find(filter)` / `Model.findOne(filter)`

`Model.create(doc)`

`Model.aggregate(pipeline)`