tableloader

v0.0.3

Published

a year ago

DataLoader + SQL Tables, with adapters Knex.js & Kysely

Downloads

0High
0Medium
0Low

tgriesser

dataloader sqlite mysql postgresql kysely knex orm

TableLoader

DataLoader + Table Associations = TableLoader (From the original creator of Knex.js)

A JavaScript (Node + ESM) library for relational database querying, focused on optimal type-safety, batching (avoiding N+1 queries), consistency, and correctness.

Ships with adapters for both Knex.js & Kysely, and instructions for building or contributing a new adapter if you're using something else to construct your queries.

Aims for queries that are lightweight, simple to modify via filters or scopes, easy to inspect, and with as good or better performance than the equivalent hand written sql queries, due to proper batching and optional column selection API.

It's especially well suited to avoid the N+1 query problem in GraphQL, but is useful in any situations where you'd use a query builder or ORM to handle your select statements.

Table of Contents

Overview / Guiding Principles

Database as the Source of Truth: The database has a well defined schema, and stable guarantees around uniqueness and identity. Introspecting and pulling this information into our runtime layer gives us better information about what we should expect when querying.

No Magic: It should be simple to trace the logic used to generate a query, and the query should represent what you would have written by hand. No obfuscation of what's happening behind the scenes, logging out the queries, or adding a debug breakpoint and stepping through the code should feel familiar.

Intentional APIs: The API design should be consistent, and large enough to support the most common use cases, but no larger. There should be a single, obvious approach to accomplish a given task, and it shouldn't try to handle every edge case - you can handle those using a query builder or raw SQL.

Making use of Batching: The batch loading pattern is incredibly helpful in reducing the number of queries to the database, but requires a lot of effort to do correctly in all situations. Tableloader aims to address these cases as much as possible.

Favor Readability over Micro-Optimizations: The difference between a for loop and a for of loop is meaningless when compared with the latency of a network request. This library is designed to optimize for reducing latter.

Keep Dependencies to a Minimum:

DataLoader (Batching & Caching)
json-stable-stringify (DataLoader instance cache keys)
Minimist (For the CLI Helper)

Don't try to do everything: Goes hand-in-hand with "intentional APIs". This library doesn't aim to replace a full-featured query builder layer, such as Knex or Kysely. Instead it should compliment it, providing well thought out abstractions for common data-fetching use cases.

How It Works

A TableLoader instance is created with knowledge of both type information and runtime metadata for the columns & keys in the target database:

// Type Info:
const adapter = new Adapter<DBTypes>(); // KnexAdapter or KyselyAdapter

const tableLoader = new TableLoader(adapter, {
  // Runtime Info:
  // Introspection typically generated via the Tableloader CLI (see section below)
  introspection: {
    primaryKeys: {
      user: "id",
      project: "id",
      membership: ["project_id", "user_id"],
    },
    columns: {
      user: ["id", "name", "created_at"],
      userInfo: ["user_id", "info"],
      project: ["id", "slug", "project_name", "created_at", "updated_at"],
      membership: ["project_id", "user_id", "role", "created_at"],
    },
    uniqueKeys: {
      project: ["slug"],
    },
  },
  // Written by hand, to represent the relations you wish to support in the API,
  // regardless of whether these are true ForeignKey relationships
  relations: {
    user: (t) => ({
      project: t.manyToMany("project", "membership"),
      userInfo: t.hasOne("userInfo").orThrow,
    }),
    project: (t) => ({
      user: t.manyToMany("user", "membership"),
    }),
  },
});

The concrete introspection metadata allows us to validate all of the defined relations, and keep a proper identity cache of all rows returned, via their primary and secondary (unique) keys.

To use TableLoader, we create a new context:

const ctx = tableLoader.makeContext();

The context acts as an identity cache of row results and internal DataLoaders. You'll generally create a new context per request/response, or similar unit of work, such as a queue job.

Each time a table or relation is queried, a DataLoader is either used or created & cached if it doesn't exist yet. Loaders are cached based on the constraints to the query:

const [user1, user2] = await Promise.all([
  // Both access the same dataloader, due to having the same filter
  ctx.table("user").filter({ status: "active" }).loadPk(1),
  ctx.table("user").filter({ status: "active" }).loadPk(2),
  //
]);

await Promise.all([
  // Again, both access the same dataloader, since they have the same relation
  ctx.relation("user:userInfo").loadFor(user1),
  ctx.relation("user:userInfo").loadFor(user2),
]);

SQL:

select `user`.* from `user` where `status` = 'active' and id in (1, 2)
select `user_info`.* from `user_info` where `user_id` in (1, 2)

Goals / Non-Goals

Goals:
- Full type safety while dealing with relations
- Observability: logs all queries executed within a given context, for simple inspection or snapshotting in tests
- Works from a stable introspection of the database, catching obvious association errors at startup rather than runtime
- Hooks to selectively load columns, if the selection set of columns are known ahead of time, such as when executing a GraphQL Document traversal
- Handle composite keys as well as we can, don't make them an afterthought
For consideration:
- Polymorphic associations. Not strictly against these, just didn't take the time to add them. Feel free to open an issue if you'd like to see them
Non-Goals:
- Models / Model Files
- Managing any sort of inserts / updates / deletes

Non-Goal explanation:

Models / Model Files

Tools like GraphQL already provide good separation between reads & writes. There's a lot of opinions around what a Model can / should do, and TableLoader aims to steer clear of that. We are concerned with mapping as close to the database as possible, and to focus on wins that address optimal query loading.

Feel free to build a more robust ORM on top of TableLoader and open a PR to link to it here!

Managing any sort of inserts / updates / deletes

Something for this might be added eventually, but it's not in the short term roadmap. Keeping an identity map of tables can be tricky with mutations, and it's often simpler to just do you your logic, and then issue a new tableLoader.makeContext()

Knex / Kysely provide good helpers for dealing with these sorts of things in the query builder directly, and we recommend you just use those.

Getting Started

In order to maintain type-safety both in TypeScript and at runtime, TableLoader requires a bit of initial configuration to get our types synced up with the database definition.

Generate Introspection Metadata

TableLoader requires a introspection, incuding the primary keys, unique keys, and columns for each table in a target database.

export interface IntrospectionShape<DBShape> {
  // A list of all columns associated with all tables
  tables: {
    [K in keyof DBShape]: Arr<keyof DBShape[K]>;
  };
  // A list of all primary keys associated with the tables
  primaryKeys: Partial<{
    [K in keyof DBShape]: keyof DBShape[K] | Arr<keyof DBShape[K]>;
  }>;
  // A list of all non-primary unique keys for a given table.
  uniqueKeys: Record<string, Arr<string | string[]>>;
}

Using the CLI

TableLoader comes with a utility for generating this introspection from the CLI:

# Shows info about available options
tableloader --help

tableloader --adapter [knex,kysely] --connnection $DATABASE_URL

If you'd prefer not to use the CLI, and use the programmatic API, the introspectionBuilder can be imported from tableloader/dist/introspection, and can be invoked to generate the introspection types at runtime.

See the source for the introspection CLI for an example of how you might create a runtime tool to generate the introspection data.

Generate Types & Create Adapter

With Kysely

Types should be generated using kysely-codegen

import { Kysely /* PostgresDialect, SqliteDialect, etc. */ } from "kysely";
import { AdapterKysely } from "tableloader/adapters/kysely";
// change to wherever the kysely-codegen types are generated, if the output path is specified
import type { DB } from "kysely-codegen";
import { introspection } from "./path/to/introspection";

const kysely = new Kysely<DB>({
  // ... config for specific kysely adapter
});
const adapter = new AdapterKysely(kysely, {
  // camelCase: true
});
const tableLoader = new TableLoader(adapter, {
  introspection,
  relations: {
    // ... relations defined here
  },
});

With Knex

Types should be generated using @tgriesser/schemats v9+

npm run schemats generate --prettier --skipPrefix knex_ -o $OUTPUT_PATH -c $DATABASE_URL

import knex from "knex";
import { AdapterKnex } from "tableloader/adapters/knex";
import type { DBTables } from "./path/to/schemats-output";
import { introspection } from "./path/to/introspection";

const db = knex({
  // ... config for your Knex instance
});
const adapter = new AdapterKnex<DBTables>(db, {
  // camelCase: true
});
const tableLoader = new TableLoader(adapter, {
  introspection,
  relations: {
    // ...relation definitions
  },
});

API

TableLoader

tableLoader.makeContext()

Creates a new "context", where the context is the entry point for all APIs to query for tables & relations. Typically you'll do this once per-request, in a middleware or similar location where you setup state for your

const ctx = tableLoader.makeContext();

const userOne = await ctx.table("users").loadPk(1);

TableLoaderContext

ctx.table()

Given a table name, exposes the TableApi for querying tables.

Promise.all([
  ctx.table("users").loadPk(1),
  ctx.table("users").filter({ status: "inactive" }).getRows(),
]);

ctx.relation()

Returns the "relation" api, where the relation name is defined as the concatenation of the table name and object property

ctx.setSelections()

The setSelections API takes object, containing a mapping of table names, to columns that should be selected for that table. If not specified, the default is to select * for the table when loaded.

Primary & unique keys are always loaded, regardless of the setSelections configuration. This is to ensure that we're able to keep proper row identity, and reduce the number of subsequent row lookups.

ctx.setSelections({
  users: ["name"],
  projects: ["status"],
});

// select id, name from users where id in (?)
const user = await ctx.table("users").loadPk(1);

// select id, slug, status from project where id in (?)
const project = await ctx.table("project").loadPk(1);

ctx.queries

A getter, returning all of the queries issued

TableApi

table.loadPk()

Loads a row for a table by its Primary Key. The introspection metadata is utilized to infer the correct types for the argument.

ctx.table("user").loadPk(1);

If the primary key is composite (multiple columns), these should be provided as an object:

ctx.table("membership").loadPk({ user_id: 1, organization_id: 2 });

The loadPk utilizes the batching via DataLoader, so multiple parallel calls to loadPk will be sent in a single query:

// select "user".* from "user" where "id" in (1, 2, 3)
Promise.all([
  ctx.table("user").loadPk(1),
  ctx.table("user").loadPk(2),
  ctx.table("user").loadPk(3),
]);

Returns the LoadedRow

table.loadPkOrThrow()

Like the loadPk method, except it throw a RowNotFoundError if the record is not found

table.loadPkBatch()

table.loadPkBatchOrThrow()

table.loadPkBatchOrError()

table.loadUniq()

table.loadBy()

table.filter()

table.orderBy()

table.limit()

table.findAll()

table.asConnection()

table.findOne()

table.interpolate()

Takes a plain object, and coerces it into a LoadedRow for the table, caching it in the

LoadedRow

A LoadedRow object is a class which wraps the raw database object, providing getters for the individual fields. It should feel functionally similar to the POJO (plain-old JavaScript Object) returned from the query library, however it has a few unique characteristics specific to TableLoader.

Relation Api (one)

relation.loadFor()

relation.loadForBatch()

Relation Api (many)

relation.orderBy()

relation.filter()

relation.scope()

relation.loadFor()

relation.loadOneFor()

relation.loadForBatch()

relation.asConnectionFor()

Returns an object properly implementing the GraphQL Cursor Connections specification, exposing a "edges", "pageInfo", and "nodes" properties.

const connection = ctx.relation("user:posts").asConnectionFor(user, args);

const [edges, pageInfo, nodes] = await Promise.all([
  connection.edges,
  connection.pageInfo,
  connection.nodes,
]);

relation.asPageFor()

Constraining Queries

Filters

Filters add a where clause to the query.

Scopes

Table Scopes

TableLoader comes with a concept of "scopes", a way to predefine a set of constraints for a query, to be invoked at runtime. Scopes are added to tables, using the adapter's scopes() on a relation definition, adding the scope options as metadata to the relation:

new TableLoader(adapter, {
  introspection,
  relations: {
    user: (t) => ({
      posts: t.hasMany("post"),
    }),
  },
  scopes: adapter.scopes({
    post: {
      public: (qb) => qb.where("published", true).where("hidden", false),
    },
  }),
});

Later when using the relation:

ctx.relation("user:posts").scope("public").loadFor(user);

Global Scopes

TableLoader also comes with a concept of "global scopes". Global scopes are defined using the same adapter.scopes method, and they are automatically used whenever a table is accessed:

new TableLoader(adapter, {
  // ...
  globalScopes: adapter.scopes({
    project: {
      notDeleted: (qb) => qb.whereNull("deleted_at"),
    },
  }),
});

It's possible to remove global scopes, by calling withougGlobalScopes:

ctx.table("project").withoutGlobalScopes().getRows();

Scopes with arguments

Scopes can define arguments:

scopes: {
  user: adapter.scopes({
    active: (qb, excludeUsers: number[]) => {
      return qb.where("activated", true).andWhereNotIn("id", excludeUsers);
    },
  });
}

TableLoader will track the definition of these types and include them in the types for the scope invocation:

// OK
ctx.relation("organization:users").scope("active", [1, 2, 3]).loadFor(org);

// TS error, excludeUsers expected:
ctx.relation("organization:users").scope("active").loadFor(org);

Dynamic scope creation

If you need to create the scopes dynamically, based on the

Aggregates

When aggregating, we assume we're grouping based on the relation column

Tracking Column Usage

Relation Naming Conventions

Naming conventions are similar to other libraries / frameworks, such as Rails, Laravel, or Django. The foreign key of a table is assumed to be the concatenation of the source table's name and the primary key of the referenced table, unless otherwise specified by the .setKeys method.

user
  - id
  - name
  - created_at

post
  - id
  - title
  - user_id

user -> hasMany(post)
post -> belongsTo(user)

One-To-One Association

hasOne

A hasOne relationship defines a one-to-one relationship between the source table and the target table, where the source table is assumed to have exactly one reference to the target table.

({
  user: (t) => ({
    userInfo: t.hasOne("user_info").orThrow,
    optionalMeta: t.hasOne("user_meta"),
  }),
});

select user_info.* from user_info where user_id in (...) -- (user.id)

belongsTo

The belongsTo defines a scenario where the source table refers to another model, by the primary key reference of the target table, via the primary key reference in its own table.

({
  userInfo: (t) => ({
    user: t.belongsTo("user"),
  }),
});

hasOneThrough

The "hasOneThrough" relationship isn't too common, it defines a one-to-one relationship with between a source table and a target table, joined through a third joining table.

({
  session: (t) => ({
    userInfo: t.hasOneThrough("userInfo", "user"),
  }),
});

select user_info.*, user.id as user$$id
inner join user on user.id = user_info.user_id
where user_info.user_id in (...) -- session.user_id

One-To-Many Association

hasMany

A hasMany is a fairly common one-to-many relationship, where the source table has zero or more of the target table. This is defined by default convention where the target table having the source table, suffixed by the source table's primary key.

({
  user: (t) => ({
    posts: t.hasMany("post"),
  }),
});

select post.* from post where user_id in (...) -- (user.id)

hasManyThrough

The hasManyThrough relationship provides a way to access a table through an intermediate table, where the intermediate table is its own entity.

({
  user: (t) => ({
    payments: t.hasManyThrough("payment", "invoice"),
  }),
});

ctx.relation("user:payments").loadFor(user);

select payment.* from payment
inner join invoice on invoice.id = payment.invoice_id
where invoice.user_id in (...) -- (user.id)

Many-To-Many Association

manyToMany

Similar to the belongsToMany in Laravel, or the has_and_belongs_to_many in Rails, the manyToMany defines a relationship between two tables, through a third table, where the intermediate table contains the primary key references to the two tables it's joining.

({
  user: (t) => ({
    projects: t.manyToMany("project", "projectUser"),
  }),
});

Modifying Relation Keys

For any relation, it is possible to modify the keys to suit the relation, such as in a situation where the relation is self referential, or the keys don't map exactly to the standard naming assumptions

({
  project: {
    // a project can have a template, which refers to the same table
    template: t.belongsTo("project").setKeys({ sourceKey: "template_id" }),
  },
});

select project.* from project where id in (...) -- source.template_id

Errors

TableLoaderError

RelationNotFoundError

RowNotFoundError

ColumnNotLoadedError

TODO Before Launch

[x] CamelCase handling
[x] Pluralization handling
[ ] Aggregation API
[ ] Pagination API Configuration
[ ] Example of setSelections use
[ ] Api for detecting column usage
[x] Decide on relation types
[ ] Opt-out for union all
[ ] Scopes (global, local)
[ ] Cleanup cache key structure to read a bit better
[ ] Finish off Docs
[ ] Benchmarks

Roadmap:

[ ] Consider pulling more introspection info, nullability & foreign keys
[ ] Windowing
[ ] Configuring when to use union vs. not

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

TableLoader

Overview / Guiding Principles

How It Works

Goals / Non-Goals

Getting Started

Generate Introspection Metadata

Using the CLI

Generate Types & Create Adapter

With Kysely

With Knex

API

TableLoader

tableLoader.makeContext()

TableLoaderContext

ctx.table()

ctx.relation()

ctx.setSelections()

ctx.queries

TableApi

table.loadPk()

table.loadPkOrThrow()

table.loadPkBatch()

table.loadPkBatchOrThrow()

table.loadPkBatchOrError()

table.loadUniq()

table.loadBy()

table.filter()

table.orderBy()

table.limit()

table.findAll()

table.asConnection()

table.findOne()

table.interpolate()

LoadedRow

Relation Api (one)

relation.loadFor()

relation.loadForBatch()

Relation Api (many)

relation.orderBy()

relation.filter()

relation.scope()

relation.loadFor()

relation.loadOneFor()

relation.loadForBatch()

relation.asConnectionFor()

relation.asPageFor()

Constraining Queries

Filters

Scopes

Table Scopes

Global Scopes

Scopes with arguments

Dynamic scope creation

Aggregates

Tracking Column Usage

Relation Naming Conventions

One-To-One Association

hasOne

belongsTo

hasOneThrough

One-To-Many Association

hasMany

hasManyThrough

Many-To-Many Association

manyToMany

Modifying Relation Keys

Errors

TableLoaderError

RelationNotFoundError

RowNotFoundError

ColumnNotLoadedError

TODO Before Launch

Roadmap:

License