
coll-fns

v1.6.0


A functional interface to join MongoDB collections and add hooks before and after insertions, updates and removals.

Overview

Skip the repetitive glue code for joining collections and wiring related data. Define joins between collections once, then fetch documents in the exact shape you need in a nested tree. The data fetching is optimized to minimize database queries.
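
To picture how declared joins keep queries to a minimum, here is an illustrative sketch (hypothetical code, not coll-fns internals): a one-to-many join needs only one query for the parents and one batched query for all their children, rather than one query per parent.

```javascript
/* Illustrative only: batching a one-to-many join in two queries.
 * The in-memory "collections" and `findList` helper are hypothetical. */
const posts = [
  { _id: "p1", title: "A" },
  { _id: "p2", title: "B" },
];
const comments = [
  { _id: "c1", postId: "p1", text: "Nice!" },
  { _id: "c2", postId: "p1", text: "Great!" },
  { _id: "c3", postId: "p2", text: "Meh" },
];

let queryCount = 0;
const findList = (coll, pred) => (queryCount++, coll.filter(pred));

/* One query for the parents, one for ALL children (not one per parent) */
const parents = findList(posts, () => true);
const parentIds = parents.map((p) => p._id);
const children = findList(comments, (c) => parentIds.includes(c.postId));

/* Group children under their parent to build the nested tree */
const joined = parents.map((p) => ({
  ...p,
  comments: children.filter((c) => c.postId === p._id),
}));
```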

Stop repeating business logic all over your code base. Define hooks on collections that will be triggered conditionally before or after insertions, updates or removals: data validation, change propagation, logging, etc.

Rationale

The following sections explain the rationale behind this library.

While document databases with eventual consistency such as MongoDB are quite powerful, they often encourage denormalization patterns where information from one document must be copied onto related documents. If everything goes well, one should be able to limit data fetching to self-contained documents only... (yeah, right).

This pattern adds much unwelcome complexity:

  • the fields to denormalize must be determined in advance
  • adding functionalities that would require new related fields means more denormalization and migration scripts
  • the denormalized data must be kept in sync (somehow)
  • denormalization can fail halfway through, leading to an incoherent state

I designed this library as a Meteor developer. I was hence forced to use a MongoDB document database instead of a normalized database. Yet, I wanted to join collections so I could

  • keep data in only one place (where it made most sense)
  • retrieve documents easily in an intuitive shape.

Joins make this possible. They are declared in advance between collections, defining in a single place how different types of documents relate to each other, instead of repeating this logic every time the data is queried. Data fetching knows about them and allows arbitrary-depth relationships, where documents are returned as simple nested tree-shaped documents without the need for helpers, methods or other complex structures... It's just data!

And joins can be defined in code shared by both server and client (I hate redundant code 😒).

Although joins between collections reduce the need for denormalization, it is often essential to update related documents based on what happens to another one (cascading removals, for example).

Reacting to data changes to propagate the information over the wire through API calls is also quite frequent. As is the need for final validation before committing a change.

So how should we go about it? Should this code be incorporated in each mutation path? Should we create custom functions to update a specific type of document so all these side-effects are executed? If so, how do we not forget to use these functions instead of the normal collection methods? And how do we reduce the amount of boilerplate code that will simply handle the mechanics of data manipulation?

Hooks were introduced to solve these issues. They are defined in advance on each collection to intercept or react to insertions, updates and removals.

Even better, they can be defined so they fire only if certain conditions are met! And they can be defined in many places, making it much easier to group hooks by area of concern, instead of spreading their logic all over the place.

Focusing on the business logic, describing it only once, wasting no time on boilerplate code: that's a time (and sanity) saver for sure! 🤪

I also faced a challenge when migrating from Meteor 2.0 to 3.0, which rendered isomorphic code cumbersome (same code running on the server and the client).

On the server, the new Meteor architecture now required the use of promises and async/await constructs.

On the client, data fetching with Minimongo must be synchronous in most frameworks to avoid complicated front-end code to handle promises.

I wanted a tool that would help me keep the isomorphic query capabilities while eliminating redundant glue code.

By designing the library with a protocol architecture, the same code can be run with a different database implementation.

On the client, I use a synchronous Minimongo implementation.

On the server, while still on Meteor 2.0, I also used a synchronous implementation that worked fine with Fibers. When I got ready to move to async/await and Meteor 3.0, I simply switched to an async protocol implementation without much refactoring!

(Of course, the code had to be refactored to actually use coll-fns, but it comes with so many powerful features that it would have been a go-to anyway!)

And since it uses a protocol, it can be used with the native MongoDB driver too (built-in) and could even be adapted to almost any type of database... 🤯

A lot of libraries that add functionalities to the database layer mutate the collection instances themselves or, when more respectful, offer ways to extend the collection constructor somehow.

In either case, it can lead to potential issues where different enhancement libraries conflict with each other. Method names might change, data might be saved in added fields on the collection instances (is _joins really safe to use?)...

coll-fns, as the name implies, offers a functional API. Instead of doing Collection.insert(doc), you would insert(Collection, doc). I know... Moving the left parenthesis and replacing the dot with a comma is a lot to ask 😉, but it comes with benefits!

Instead of mutating the collections themselves, joins and hooks definitions are saved in a global registry. No collection instance is harmed (mutated) in the process. You could have a fancy custom collection wrapped and distorted by many different libraries; coll-fns won't add a thing to it.

Doing so makes it easy to offer a protocol interface: the type of collection involved doesn't matter at all. Heck, the collections could even be table names as strings and it would still work (if you implement a custom protocol)!
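
One way to picture this registry-based design (an illustrative sketch, not coll-fns source code): definitions live in a module-level Map keyed by the collection value itself, so the collection object is never mutated, and any value can serve as the key, even a plain string.

```javascript
/* Illustrative sketch of a global join registry; the `join` and
 * `getJoins` bodies here are hypothetical, not the library's code. */
const joinRegistry = new Map();

function join(Coll, defs) {
  // Merge with any joins already declared elsewhere for this collection
  const existing = joinRegistry.get(Coll) || {};
  joinRegistry.set(Coll, { ...existing, ...defs });
}

function getJoins(Coll) {
  return joinRegistry.get(Coll) || {};
}

/* Works with any value as the "collection", even a table name string */
join("posts", { author: { on: ["authorId", "_id"], single: true } });
join("posts", { comments: { on: ["_id", "postId"] } });
```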

For Meteor developers, it also means being able to enhance the Meteor.users collection itself... even without access to instantiation code! 🤓

In Meteor in particular, publication of data over DDP that gets synced on the client in Minimongo is very convenient. It makes for really reactive applications where data doesn't get stale. Optimistic UI is built in and doesn't require additional complex logic.

However, it can get difficult to publish exactly the right documents, especially when using a very useful library that allows joining collections together and keeping data much more normalized!

Although returning Meteor cursors from a publish function is still the most optimized path, it sometimes makes sense to publish children documents based on their relationship with their parent. But doing so is usually very difficult to implement.

Some existing community libraries that promise to do so either:

  • don't actually work (might understand added and removed callbacks, but won't react to updates)
  • are too simplistic (cause a lot of duplicated observers)
  • come with a much more complex data layer
  • add somewhat unpredictable reactivity on the server.

Since coll-fns is already very good at understanding relationships, a publish helper was introduced to allow such fine-tuned reactivity out of the box. 💪

Installation and configuration

IMPORTANT: For concision, the examples will use the synchronous Meteor protocol to avoid async/await boilerplate. Of course, your code will have to be adapted when used with an asynchronous protocol.

npm install coll-fns

setProtocol(protocol)

You will have to define which protocol to use before using any of the library's functionality.

import { setProtocol, protocols } from "coll-fns";

/* Built-in protocols include:
 * - meteorAsync
 * - meteorSync
 * - node
 *
 * Can also define a custom protocol! */
setProtocol(protocols.meteorAsync);

In a Meteor project, you should probably define a different protocol on client (synchronous) and server (asynchronous).

import { setProtocol, protocols } from "coll-fns";

const protocol = Meteor.isServer ? protocols.meteorAsync : protocols.meteorSync;
setProtocol(protocol);

There's also a native NodeJS MongoDB driver protocol built-in (protocols.node).

Execution context (bindEnvironment)

Protocols can optionally expose a bindEnvironment(fn) method.
It is used by hooks internals to run callbacks in the expected runtime context.

  • In Meteor Fiber-based servers, this prevents errors like:
    • Meteor code must always run within a Fiber...
  • In environments that don't need special wrapping, it can be omitted.

protocols.meteorSync already handles this automatically:

  • On Meteor server with Fibers, it uses Meteor.bindEnvironment when available.
  • On Meteor client (or any environment where Meteor.bindEnvironment is unavailable), it falls back to a normal direct call.

So if you use protocols.meteorSync on both client and server in a Fiber-based Meteor app, no override is required specifically for this behavior.

You could even define a custom protocol for the library to work with another interface to MongoDB or even to a completely different storage system! Joins and hooks should then work the same way (let me know if you do 🤓!).

import { setProtocol } from "coll-fns";

const customProtocol = {
  /* Return a documents count */
  count(/* Coll, selector = {}, options = {} */) {},

  /* Return a list of documents. */
  findList(/* Coll, selector = {}, options = {} */) {},

  /* Return the name of the collection. */
  getName(/* Coll */) {},

  /* Optional. Return a function that will transform each document
   * after being fetched with descendants. */
  getTransform(/* Coll */) {},

  /* Optional. Wrap callbacks to preserve runtime context
   * (ex: Meteor.bindEnvironment with Fibers). */
  bindEnvironment(/* fn */) {},

  /* Insert a document in a collection
   * and return the inserted _id. */
  insert(/* Coll, doc, options */) {},

  /* Observe document changes.
   * Must return a handle with a `stop()` method. */
  observe(/* Coll, selector = {}, callbacks = {}, options = {} */) {},

  /* Remove documents in a collection
   * and return the number of removed documents. */
  remove(/* Coll, selector, options */) {},

  /* Stable stringify function used for internal query keys.
   * EJSON canonical stringify is used as a good default,
   * but can be overridden. */
  stringify(/* value */) {},

  /* Update documents in a collection
   * and return the number of modified documents. */
  update(/* Coll, selector, modifier, options */) {},
};

setProtocol(customProtocol);

observe expected shape:

observe(Coll, selector = {}, callbacks = {}, options = {}) {
  const { added, changed, removed } = callbacks;

  // Your implementation subscribes reactively to selector/options changes.
  // Callbacks should be invoked with:
  // - added(id, fields)
  // - changed(id, fields)
  // - removed(id)
  //
  // Then return an object exposing a stop() method.
  return {
    stop() {
      // Tear down underlying observer/resources.
    },
  };
}

Notes:

  • observe can be sync or async (returning a Promise of the stop-handle).
  • fields should contain the changed/added fields payload expected by your transport.
  • publish() relies on this contract to keep nested observers in sync.

If your runtime has special callback context requirements, implement bindEnvironment.

const customProtocol = {
  // ...
  bindEnvironment(fn) {
    if (typeof Meteor?.bindEnvironment === "function") {
      return Meteor.bindEnvironment(fn);
    }
    return fn;
  },
};

If your runtime has no such requirement, simply omit bindEnvironment.

If you want to use the publish() composite publication helper:

  • observe must be defined on the protocol.
  • stringify can be defined if the default EJSON implementation is not sufficient.

Bypassing coll-fns

coll-fns intentionally keeps collection instances untouched. It neither adds methods nor changes existing ones' behaviour. To bypass any coll-fns functionality, simply use the normal collection methods (ex: Coll.insert, Coll.find().fetchAsync(), Coll.removeOne()). Joins and hooks will only get fired when using the library's functions... and only if joins and hooks have been pre-defined, of course! 😉

Joins and fetch

Quick start examples

import { fetchList, join } from "coll-fns";
import { Comments, Posts, Users } from "/collections";

/* Define joins on Posts collection */
join(Posts, {
  /* One-to-one join */
  author: {
    Coll: Users,
    on: ["authorId", "_id"],
    single: true,
  },

  /* One-to-many join */
  comments: {
    Coll: Comments,
    on: ["_id", "postId"],
    /* `single` defaults to false,
     * so joined docs are returned as an array */
  },
});

/* Fetch data with nested joined documents in the requested shape. */
fetchList(
  Posts,
  {},
  {
    fields: {
      title: 1, // <= Own
      author: { birthdate: 0 }, // <= Falsy = anything but these fields
      comments: { text: 1 },
    },
  }
);
[
  {
    "title": "Blabla",
    "authorId": "foo", // <= Included by join definition
    "author": {
      "name": "Foo Bar",
      "genre": "non-fiction",
    },
    /* Comments is a one-to-many join, so is returned as a list */
    "comments": [{ "text": "Nice!" }, { "text": "Great!" }],
  },
]

join(Coll, joinDefinitions)

Collections can be joined together with globally pre-registered joins to greatly simplify optimized data fetching.

Joins are not symmetrical by default. Each collection should define its own relationships.

import { join } from "coll-fns";

join(
  /* Parent collection */
  Coll,

  /* Map of joins on children collections.
   * Each key is the name of the field
   * where joined docs will be placed. */
  {
    joinProp1: {
      /* joinDefinition */
    },

    joinProp2: {
      /* joinDefinition */
    },
  }
);

Collections can define as many joins as needed without impacting performance. They will be used only when explicitly fetched. They can be declared in different places (as long as join names don't collide).

In the context of Meteor, joins could (should?) be defined in shared client and server code, but some might only ever be used in one environment or the other. They could also define a set of common joins, but add others in environment-specific code.

By default, joins link one document from the parent collection to multiple ones from the child collection. In the case of a one-to-one relationship, the single property should be set to true.

There are three main types of join definitions based on the argument to the on property: array, object and function joins.

Simple array join

on can be defined as an array of [parentProp, childProp] equality.

import { join } from "coll-fns";
import { Comments, Posts, Users } from "/collections";

join(Posts, {
  author: {
    Coll: Users,
    /* `post.authorId === user._id` */
    on: ["authorId", "_id"],
    /* Single doc instead of a list */
    single: true,
  },

  comments: {
    Coll: Comments,
    /* `post._id === comment.postId` */
    on: ["_id", "postId"],
  },
});

/* Reversed join from user to posts */
join(Users, {
  posts: {
    Coll: Posts,
    on: ["_id", "authorId"],
  },
});

Sub-array joins

Sometimes, the property referencing linked documents is an array (of ids, usually). In that case, the property name should be wrapped in an array in the on definition.

import { join } from "coll-fns";
import { Actions, Resources } from "/collections";

/* Each action can be associated with many resources and vice-versa.
 * Resource's `actionIds` array is the link between them. */
join(Actions, {
  resources: {
    Coll: Resources,
    on: ["_id", ["actionIds"]],
  },
});

/* The reverse join will flip the property names. */
join(Resources, {
  actions: {
    Coll: Actions,
    on: [["actionIds"], "_id"],
  },
});

Filtered array-joins

Some joins should target only specific documents in the foreign collection. A complementary selector can be passed as the third element of the on array.

import { join } from "coll-fns";
import { Resources, Tasks } from "/collections";

join(Resources, {
  /* Only active tasks (third array element is a selector) */
  activeTasks: {
    Coll: Tasks,
    on: ["_id", "resourceId", { active: true }],
  },

  /* All tasks associated with a resource */
  tasks: {
    Coll: Tasks,
    on: ["_id", "resourceId"],
  },
});

Object joins

The on join definition property can be an object representing a selector. It will always retrieve the same linked documents.

import { join } from "coll-fns";
import { Factory, Workers } from "/collections";

join(Workers, {
  /* All workers will have the same `factory` props. */
  factory: {
    Coll: Factory,
    on: { name: "FACTORY ABC" },
    single: true,
  },
});

Function joins

When joins are too complex to be defined with an array or object (although rare), a function can be used as the on property. Each parent document will be passed to this function, which should return a selector to use on the child collection.

When using function-based joins, a deps property should be added to the join definition to declare which parent fields are required for the join to work:

import { join } from "coll-fns";
import { Comments, Posts } from "/collections";
import { twoMonthsPrior } from "/lib/dates";

join(Posts, {
  recentComments: {
    Coll: Comments,
    on: (post) => {
      const { _id: postId, postedAt } = post;

      /* This argument must be defined at runtime. */
      const minDate = twoMonthsPrior(postedAt);

      /* Return a selector for the Comments collection */
      return {
        createdAt: { $gte: minDate },
        postId,
      };
    },
    /* Parent fields needed in the join function */
    deps: {
      _id: 1, // Optional. _id is implicit in any fetch.
      postedAt: 1,
    },
  },
});

fields remains accepted as a backward-compatible alias for deps.

Recursive joins

A collection can define joins on itself.

import { join } from "coll-fns";
import { Users } from "/collections";

join(Users, {
  friends: {
    /* Use the same collection in the join definition */
    Coll: Users,
    on: [["friendIds"], "_id"],
  },
});

Join additional options

Any additional properties defined on the join (other than Coll, on, single, postFetch, deps and legacy fields) will be treated as options to pass to the nested documents fetchList. It usually includes:

  • limit: Maximum joined documents count
  • skip: Documents to skip in the fetch
  • sort: Sort order of joined documents
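
This "anything else is an option" rule can be pictured with a small sketch (illustrative only, not coll-fns source): split a join definition into its reserved keys and the remaining properties forwarded to the nested fetchList.

```javascript
/* Illustrative: separate a join definition's reserved keys from the
 * extra options forwarded to the nested fetchList. The key list
 * mirrors the properties documented above. */
const RESERVED = ["Coll", "on", "single", "postFetch", "deps", "fields"];

function splitJoinDef(def) {
  const reserved = {};
  const options = {};
  for (const [key, value] of Object.entries(def)) {
    (RESERVED.includes(key) ? reserved : options)[key] = value;
  }
  return { reserved, options };
}

const def = {
  Coll: "Comments",
  on: ["_id", "postId"],
  limit: 5,
  sort: { createdAt: -1 },
};
const { reserved, options } = splitJoinDef(def);
// `options` now holds only { limit, sort }, ready to pass to fetchList
```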

postFetch

Children documents might need to be modified (transformed, ordered, filtered...) after being fetched. The postFetch: (childrenDocs, parentDoc) => childrenDocs join definition property can be used to do so.

The second argument of the function is the parent document. If some of its properties are needed, they should be declared in the join deps property so they are guaranteed to be fetched on the parent.

import { join } from "coll-fns";
import { Resources, Tasks } from "/collections";
import { sortTasks } from "/lib/tasks";

join(Resources, {
  tasks: {
    Coll: Tasks,
    on: ["_id", "resourceId"],

    /* Ensure `tasksOrder` will be fetched on parent docs */
    deps: { tasksOrder: 1 },

    /* Transform the joined tasks documents based on parent resource. */
    postFetch(tasks, resource) {
      const { tasksOrder = [] } = resource;
      return sortTasks(tasks, tasksOrder);
    },
  },
});

getJoins

Use getJoins(Coll) to retrieve the complete dictionary of the collection's joins.

fetchList(Coll, selector, options)

Fetch documents with the ability to use collection joins.

Options:

  • fields: Field projection object
  • limit: Maximum number of documents
  • skip: Number of documents to skip
  • sort: Sort specification

In its simplest form, fetchList can be used in much the same way as Meteor's Coll.find(...args).fetch().

const users = await fetchList(
  Users,
  { status: "active" },
  {
    fields: { name: 1, email: 1 },
    sort: { createdAt: -1 },
    limit: 10,
    skip: 0,
  }
);

fields option and joins

Contrary to regular MongoDB projection objects, the fields option accepts nested properties { car: { make: 1 } } instead of dot-notation ones { car: 1, "car.make": 1 }.

The joins defined on the collections must be explicitly specified in the fields object for the children documents to be fetched. The combined presence of join or own fields determines the shape of the fetched documents.

Examples

import { fetchList, join } from "coll-fns";
import { Comments, Posts, Users } from "/collections";

/* Define joins on Posts collection */
join(Posts, {
  /* One-to-one join */
  author: {
    Coll: Users,
    on: ["authorId", "_id"],
    single: true,
  },

  /* One-to-many join */
  comments: {
    Coll: Comments,
    on: ["_id", "postId"],
    /* `single` defaults to false,
     * so joined docs are returned as an array */
  },
});
fetchList(Posts, {});
[{ "title": "Blabla", "authorId": "foo", "likes": 7 }]
fetchList(
  Posts,
  {},
  {
    fields: {
      title: true, // <= Own. Any truthy value works
    },
  }
);
[{ "title": "Blabla" }]
fetchList(
  Posts,
  {},
  {
    fields: {
      author: 1, // <= Join
    },
  }
);
[
  {
    "title": "Blabla",
    "authorId": "foo",
    "likes": 7,
    "author": {
      "name": "Foo Bar",
      "birthdate": "Some Date",
      "genre": "non-fiction",
    },
  },
]
fetchList(
  Posts,
  {},
  {
    fields: {
      title: 1, // <= Own
      author: 1, // <= Join
    },
  }
);
[
  {
    "title": "Blabla",
    "authorId": "foo", // <= Included by join definition
    "author": {
      "name": "Foo Bar",
      "birthdate": "Some Date",
      "genre": "non-fiction",
    },
  },
]
fetchList(
  Posts,
  {},
  {
    fields: {
      title: 1, // <= Own
      author: { birthdate: 0 }, // <= Falsy = anything but these fields
      comments: { text: 1 },
    },
  }
);
[
  {
    "title": "Blabla",
    "authorId": "foo", // <= Included by join definition
    "author": {
      "name": "Foo Bar",
      "genre": "non-fiction",
    },
    /* Comments is a one-to-many join, so is returned as a list */
    "comments": [{ "text": "Nice!" }, { "text": "Great!" }],
  },
]

setJoinPrefix(prefix)

If this combination approach seems confusing, it is possible to define a prefix under which join fields must be explicitly nested. The prefix will be removed from the returned documents.

Setting the prefix to null or undefined allows using join fields at the document root like any normal field.

import { setJoinPrefix } from "coll-fns";

/* All join fields will have to be prefixed with "+" */
setJoinPrefix("+");

/* Some own fields, some join fields */
fetchList(
  Posts,
  {},
  {
    fields: {
      title: 1, // <= Own

      /* Join fields must be nested under the prefix key */
      "+": {
        author: { name: 1, birthdate: 1 }, // <= Join sub fields
      },
    },
  }
);

This option could also be useful if a document can have some denormalized data with the same property name as the join. The denormalized values or the joined document would then be returned based on the use of the prefix.

If, for some reason, you need to retrieve the prefix, you can do so with getJoinPrefix(Coll).

Nested Joins

Joins can be nested to fetch deeply related data. See Hooks best practices for how hooks can be used with nested joins.

import { fetchList } from "coll-fns";

const posts = fetchList(
  Posts,
  {},
  {
    fields: {
      title: 1,

      /* Level 1 : One-to-many join */
      comments: {
        text: 1,

        /* Level 2 : One-to-one join */
        user: {
          username: 1,
        },
      },
    },
  }
);
[
  {
    "title": "Blabla",
    "comments": [
      { "text": "Nice!", "user": { "username": "foo" } },
      { "text": "Great!", "user": { "username": "bar" } }
    ]
  }
]

Recursion levels

When a field is declared using a positive number, its value is treated as a recursion limit. This can help prevent infinite loops. The value Infinity can even be used to go as deep as possible (to exhaustion), although it involves a greater risk of infinite loops.

import { join } from "coll-fns";
import { Users } from "/collections";

/* Pre-register recursive join */
join(Users, {
  friends: {
    Coll: Users,
    on: [["friendIds"], "_id"],
  },
});

fetchList(
  Users,
  {},
  {
    fields: {
      name: 1,
      /* Join field. Limit to 2 levels deep, reusing parent fields */
      friends: 2,
    },
  }
);

Documents transformation

Documents can be transformed after fetching. Collection-level transforms are automatically applied if the protocol allows it:

Meteor:

import { Mongo } from "meteor/mongo";

const Users = new Mongo.Collection("users", {
  transform: (doc) => ({
    ...doc,
    fullName: `${doc.firstName} ${doc.lastName}`,
  }),
});

For a specific fetch, pass a transform option:

const users = await fetchList(
  Users,
  { status: "active" },
  {
    transform: (doc) => ({
      ...doc,
      fullName: `${doc.firstName} ${doc.lastName}`,
    }),
  }
);

To skip a collection's transform, pass transform: null. Transforms are applied after joins resolve, so they have access to joined data. See Nested Joins for examples of using transforms with complex data structures.

fetchOne(Coll, selector, options)

Fetch a single document from a collection. Same behaviour as fetchList.

import { fetchOne } from "coll-fns";
import { Users } from "/collections";

const user = fetchOne(
  Users,
  { _id: userId },
  {
    fields: {
      name: 1,
      friends: 1, // <= Join
    },
  }
);

fetchIds(Coll, selector, options)

Fetch only the _id field of matching documents. The fields option is ignored.

import { fetchIds } from "coll-fns";
import { Users } from "/collections";

const userIds = fetchIds(Users, { status: "active" });

exists(Coll, selector)

Check if any document matches the selector.

import { exists } from "coll-fns";
import { Users } from "/collections";

const hasActiveUsers = exists(Users, { status: "active" });
// Returns: true or false

count(Coll, selector)

Count documents matching the selector.

import { count } from "coll-fns";
import { Users } from "/collections";

const activeUsersCount = count(Users, { status: "active" });
// Returns an integer

flattenFields(fields)

Flatten a general field specifiers object (which could include nested objects) into a MongoDB-compatible one that uses dot-notation.

import { flattenFields } from "coll-fns";

const flattened = flattenFields({
  name: 1,
  address: {
    street: 1,
    city: 1,
  },
});
// Result: { name: 1, 'address.street': 1, 'address.city': 1 }
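
The documented behaviour can be sketched with a small recursive helper (an illustrative re-implementation; the library's actual code may differ):

```javascript
/* Illustrative re-implementation of the documented flattening:
 * nested objects become dot-notation paths, leaf values are kept. */
function flattenFields(fields, prefix = "") {
  const flat = {};
  for (const [key, value] of Object.entries(fields)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object") {
      // Recurse into nested specifiers, accumulating the dotted path
      Object.assign(flat, flattenFields(value, path));
    } else {
      flat[path] = value;
    }
  }
  return flat;
}
```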

Hooks and write operations

Hooks allow you to intercept and react to data mutations (insertions, updates, removals) on collections. They are triggered conditionally before or after write operations, making them ideal for validation, cascading updates, logging, and other side effects.

hook(Coll, hooksObj)

Register hooks on a collection to run before or after specific write operations (insert, update, remove).

The same object argument can define multiple hook types. Each hook type is defined using an array of hook definitions, making it possible to define multiple hooks at once.

Hooks can be defined in multiple places in your codebase. This allows grouping functionally related hooks together.

import { hook } from "coll-fns";
import { Users, Posts } from "/collections";

hook(Users, {
  beforeInsert: [
    {
      fn(doc) {
        if (!doc.email) throw new Error("Email is required");

        doc.createdAt = new Date();
      },
    },
  ],

  onInserted: [
    {
      fn(doc) {
        console.log(`New user created: ${doc._id}`);
      },
    },
  ],
});

Before hooks

These hooks run before the write operation and can prevent the operation by throwing an error.

  • beforeInsert: Runs before inserting a document. Receives (doc).
  • beforeUpdate: Runs before updating documents. Receives ([...docsToUpdate], modifier).
  • beforeRemove: Runs before removing documents. Receives ([...docsToRemove]).

Although arguments can be mutated, it is not the main purpose of these hooks. Mutations are brittle and hard to debug.

beforeUpdate and beforeRemove receive an array of targeted documents, whereas beforeInsert receives a single document.

After hooks

These hooks run after the write operation completes and are fire-and-forget (not awaited by the caller of the collection function). They are usually used to trigger side-effects. Errors they throw will not propagate back to the caller.

  • onInserted: Runs after a document is inserted. Receives (doc).
  • onUpdated: Runs after a document is updated. Receives (afterDoc, beforeDoc).
  • onRemoved: Runs after a document is removed. Receives (doc).

IMPORTANT!

  1. These hooks might create an incoherent state when used as a denormalization technique (a common and helpful use case) if a downstream update fails. This is NOT inherent to coll-fns, but rather to eventually consistent database designs. Even if the after hooks were awaited, errors would not roll back prior successful updates.

  2. Although hooks can define onError callbacks, if fn executes async code, it MUST await it or return it as a promise. Otherwise, the onError callback will never fire because the function will be running in a separate promise context, and any error will become an unhandled rejection (which may crash the process).

Wrong

hook(Coll, {
  onUpdated: [
    {
      fn(doc) {
        update(/* Some other collection */); // Not awaited / not returned
      },
    },
  ],
});

Might crash the process!

UnhandledPromiseRejection: Error: Validation failed
    at beforeUpdate (.../src/hooks.js:42:11)
    at update (.../src/update.js:128:7)
    ...

Right

hook(Coll, {
  onUpdated: [
    {
      async fn(doc) {
        await update(/* Some other collection */); // Awaited
      },
    },
  ],
});

or

hook(Coll, {
  onUpdated: [
    {
      fn(doc) {
        return update(/* Some other collection */); // Returned promise
      },
    },
  ],
});

Hook definition properties

Each hook definition is an object with the following properties:

{
  /* Required. The function to execute.
   * Arguments depend on the hook type (see above).
   * Can be either synchronous or asynchronous. */
  fn(...args) { /* ... */ },

  /* Optional. Fields to fetch for the documents passed to the hook.
   * Fields for multiple hooks of the same type are automatically combined.
   * If any hook of a type requests all fields (with `undefined` or `true`),
   * all hooks of that type receive the entire documents.
   * Has no effect on `beforeInsert`: the doc to be inserted is the argument. */
  fields: { name: 1, email: 1 },

  /* Optional (`onUpdated` only). If true, fetch the document state
   * before the update with the same `fields` value.
   * Otherwise, only `_id` is fetched initially (the _ids are needed
   * anyway to fetch the "after" versions). */
  before: true,

  /* Optional. Predicate that prevents the hook from running if it
   * returns a truthy value. Can be sync or async.
   * Receives the same arguments as fn. */
  unless(doc) { return doc.isBot; },

  /* Optional. Predicate that allows the hook to run only if it
   * returns a truthy value. Can be sync or async.
   * Receives the same arguments as fn. */
  when(doc) { return doc.status === "pending"; },

  /* Optional handler called if the hook function throws an error.
   * A default handler that logs to console.error is defined
   * for after-hooks (onInserted, onUpdated, onRemoved)
   * to prevent an error from crashing the server. */
  onError(err, hookDef) { /* ... */ },
}

Examples

hook(Users, {
  beforeInsert: [
    {
      fn(doc) {
        if (!doc.email || !doc.email.includes("@")) {
          throw new Error("Invalid email");
        }
      },
    },
  ],
});
/* If user's name changed, update their posts' denormalized data */
hook(Users, {
  onUpdated: [
    {
      fields: { name: 1 },
      /* Use `when` predicate to run the hook only on this condition.
       * Could also have used `unless` or checked condition inside `fn`. */
      when: (after, before) => after.name !== before.name,

      /* Effect to run - uses update() which also supports hooks */
      fn(after) {
        const { _id, name } = after;
        return update(Posts, { authorId: _id }, { $set: { authorName: name } });
      },
    },
  ],
});
hook(Posts, {
  beforeRemove: [
    {
      /* Limit fetched fields of docs to be removed */
      fields: { _id: 1 },
      /* Only run for non-admin users */
      unless() {
        return Meteor.user()?.isAdmin;
      },
      fn() {
        throw new Error("Only admins can delete posts");
      },
    },
  ],

  onRemoved: [
    {
      /* Only log removal of published posts */
      when(doc) {
        return doc.status === "published";
      },
      fn(doc) {
        logEvent("post_deleted", { postId: doc._id });
      },
    },
  ],
});
hook(Users, {
  beforeRemove: [
    {
      fn(usersToRemove) {
        const userIds = usersToRemove.map((u) => u._id);

        /* Prevent removal if user has published posts */
        const hasPublished = exists(Posts, {
          authorId: { $in: userIds },
          status: "published",
        });

        if (hasPublished) {
          throw new Error("Cannot delete users with published posts");
        }
      },
    },
  ],

  onRemoved: [
    {
      fn(user) {
        /* Clean up related data after user is removed.
         * See remove() for more details on how this integrates with hooks. */
        return remove(Comments, { authorId: user._id });
      },
    },
  ],
});

The data mutation methods below accept essentially the same arguments as Meteor collection methods.

insert(Coll, doc)

Insert a document into a collection. Returns the document _id. Runs beforeInsert and onInserted hooks if defined.

const newUser = insert(Users, {
  name: "Bob",
  email: "[email protected]",
});

Execution flow:

  1. Run beforeInsert hooks (can throw to prevent insertion)
  2. Insert the document
  3. Fire onInserted hooks asynchronously (without awaiting)
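The flow above can be sketched in plain JavaScript. This is an illustrative stand-in, not the coll-fns implementation: the "collection" is an in-memory array and the array index stands in for the `_id`.

```javascript
/* Minimal sketch of the insert flow: before hooks veto, write, then
 * fire-and-forget after hooks. `insertSketch` is hypothetical. */
async function insertSketch(coll, hooks, doc) {
  // 1. Before hooks may throw to prevent the insert.
  for (const h of hooks.beforeInsert ?? []) await h.fn(doc);
  // 2. Insert the document; the array length stands in for the _id.
  const _id = String(coll.docs.push(doc));
  // 3. Fire after hooks without awaiting them.
  for (const h of hooks.onInserted ?? []) {
    Promise.resolve().then(() => h.fn(doc)).catch(() => {});
  }
  return _id;
}
```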

update(Coll, selector, modifier, options)

Update documents matching the selector. Returns the number of documents modified. Runs beforeUpdate and onUpdated hooks if defined. Updates multiple documents by default (unlike Meteor's behavior).

update(Users, { status: "pending" }, { $set: { status: "active" } });

Execution flow:

  1. Fetch target documents with beforeUpdate and onUpdated.before fields
  2. Run beforeUpdate hooks with (docsToUpdate, modifier) (can throw to prevent update)
  3. Execute the update
  4. Fetch updated documents again with onUpdated fields
  5. Fire onUpdated hooks asynchronously with (afterDoc, beforeDoc) pairs

Options:

  • multi (default: true): update multiple documents or just the first match.
  • arrayFilters: optional. Used with the MongoDB filtered positional operator ($[<identifier>]) to specify which elements to modify in an array field.
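The five-step flow can be sketched as follows. This is an illustrative model, not the real implementation: documents live in a plain array and the modifier is reduced to a shallow `$set`.

```javascript
/* Sketch of the update flow: fetch targets, run before hooks with
 * (docs, modifier), write, then fire onUpdated with (after, before)
 * pairs. `updateSketch` is hypothetical. */
async function updateSketch(coll, hooks, predicate, modifier) {
  // 1-2. Fetch the targeted docs and run before hooks (may throw).
  const targets = coll.docs.filter(predicate);
  for (const h of hooks.beforeUpdate ?? []) await h.fn(targets, modifier);
  // 3-4. Execute the update, keeping the "before" snapshots.
  const pairs = targets.map((doc) => {
    const before = { ...doc };
    Object.assign(doc, modifier.$set);
    return [doc, before];
  });
  // 5. Fire onUpdated hooks without awaiting them.
  for (const [after, before] of pairs) {
    for (const h of hooks.onUpdated ?? []) {
      Promise.resolve().then(() => h.fn(after, before)).catch(() => {});
    }
  }
  return targets.length;
}
```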

remove(Coll, selector)

Remove documents matching the selector. Runs beforeRemove and onRemoved hooks if defined.

remove(Users, { inactive: true });

Execution flow:

  1. Fetch documents matching the selector with beforeRemove and onRemoved fields
  2. Run beforeRemove hooks with matched documents (can throw to prevent removal)
  3. Remove the documents
  4. Fire onRemoved hooks asynchronously with each removed document
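The same pattern applies to removal; note that beforeRemove receives the whole array of matched documents while onRemoved fires once per removed document. A hypothetical in-memory sketch:

```javascript
/* Sketch of the remove flow; `removeSketch` and the array-backed
 * collection are illustrative, not the coll-fns internals. */
async function removeSketch(coll, hooks, predicate) {
  // 1. Fetch the documents targeted by the selector.
  const matched = coll.docs.filter(predicate);
  // 2. Before hooks receive the whole array and may throw to veto.
  for (const h of hooks.beforeRemove ?? []) await h.fn(matched);
  // 3. Remove the documents.
  coll.docs = coll.docs.filter((d) => !matched.includes(d));
  // 4. Fire onRemoved per removed document, without awaiting.
  for (const doc of matched) {
    for (const h of hooks.onRemoved ?? []) {
      Promise.resolve().then(() => h.fn(doc)).catch(() => {});
    }
  }
  return matched.length;
}
```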

registerSoftRemove(Coll, options)

Register soft-remove behavior for a collection. This is a startup-time registration step similar to joins/hooks registration.

registerSoftRemove defines which targeted documents should be kept when softRemove is called, and optionally how those kept documents should be updated.

import { registerSoftRemove } from "coll-fns";

registerSoftRemove(Projects, {
  /* Optional. Fields fetched before evaluating keep predicates. */
  fields: { _id: 1, ownerId: 1, archived: 1 },

  /* Optional predicate. If true, the doc is kept. */
  when(project) {
    return project.archived;
  },

  /* Optional predicate by references.
   * Return [Coll, selector] pairs to test with exists(). */
  docToCollSelectorPairs(project) {
    return [
      [Tasks, { projectId: project._id, status: { $ne: "done" } }],
      [Invoices, { projectId: project._id }],
    ];
  },

  /* Optional default modifier applied to kept docs by softRemove(). */
  keepModifier: { $set: { archived: true, removedAt: new Date() } },
});

Options

  • fields: fields to fetch before keep checks (merged with _id internally).
  • when(doc): optional predicate; truthy means keep.
  • docToCollSelectorPairs(doc): optional function returning [[Coll, selector], ...]; if any selector matches at least one doc, keep.
  • keepModifier: optional default modifier used by softRemove for kept docs (can also be provided per call).

At least one of when or docToCollSelectorPairs must be provided.

softRemove(Coll, selector, keepModifier, options)

Run a remove operation that can keep some matched documents (and optionally update those kept documents).

import { softRemove } from "coll-fns";

/* Remove removable projects; archive the ones that must be kept. */
const result = await softRemove(
  Projects,
  { workspaceId },
  { $set: { archived: true, removedAt: new Date() } },
  { detailed: true }
);

// { removed: number, updated: number|null }

keepModifier can also be a function, including an async function. This is useful when the modifier must be built from runtime context.

await softRemove(
  Posts,
  { _id: postId },
  () => ({
    $set: {
      removedAt: new Date(),
      removedBy: Meteor.userId(),
      status: "archived",
    },
  }),
  { detailed: true }
);

If no keepModifier is passed, the default one from registerSoftRemove is used.
If neither is defined, kept docs are simply excluded from removal.

/* Uses the registered default keepModifier */
await softRemove(Projects, { workspaceId });

Execution flow

  1. Fetch docs targeted by selector.
  2. Evaluate keep predicates per doc (when and/or docToCollSelectorPairs).
  3. Remove docs not marked to keep.
  4. If a keep modifier exists, update kept docs with it.

softRemove uses coll-fns remove(...) and update(...) internally.
That means the usual hooks (beforeRemove/onRemoved, beforeUpdate/onUpdated) are still applied in the corresponding branch.
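The keep/remove split at the core of this flow can be sketched as below. This is a simplified model, not the real implementation: `when` and `keepModifier` mirror the registerSoftRemove options, docs are plain objects, and only a shallow `$set` is applied.

```javascript
/* Hypothetical sketch: split docs into kept/removed, then apply the
 * keep modifier (possibly a function) to the kept ones. */
async function softRemoveSketch(docs, { when, keepModifier } = {}) {
  const kept = [];
  const removed = [];
  for (const doc of docs) {
    // 2. Evaluate the keep predicate per doc (truthy means keep).
    ((await when?.(doc)) ? kept : removed).push(doc);
  }
  // 4. Resolve and apply the keep modifier, if any.
  const mod = typeof keepModifier === "function" ? await keepModifier() : keepModifier;
  if (mod?.$set) for (const doc of kept) Object.assign(doc, mod.$set);
  return { removed: removed.length, updated: mod ? kept.length : null };
}
```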

Options

  • detailed (default false):
    • false: returns total affected count (removed + updated).
    • true: returns { removed, updated }.

configurePool(options)

After hooks can generate significant background work, especially when they trigger cascading writes and more after hooks.

coll-fns uses an internal execution pool for fire-and-forget after hooks.
You can configure this pool at startup.

configurePool must be called before any after hook is processed.

Default behavior

By default, the pool uses:

  • maxConcurrent: 10
  • maxPending: 250
  • onOverflow: drop the new call and warn in the console

This prevents unbounded growth while allowing parallel processing.
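A toy model of such a pool, with the defaults above and the "drop" overflow policy, might look like this. It is a sketch of the concept, not the coll-fns internals.

```javascript
/* Hypothetical bounded fire-and-forget pool: at most `maxConcurrent`
 * tasks run at once; excess pending tasks are dropped. */
function createPool({ maxConcurrent = 10, maxPending = 250 } = {}) {
  let running = 0;
  const pending = [];
  function next() {
    if (running >= maxConcurrent || pending.length === 0) return;
    running += 1;
    const task = pending.shift();
    Promise.resolve()
      .then(task)
      .catch((err) => console.error(err)) // default error sink
      .finally(() => { running -= 1; next(); });
  }
  return {
    push(task) {
      if (pending.length >= maxPending) return false; // "drop" policy
      pending.push(task);
      next();
      return true;
    },
  };
}
```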

Options

configurePool({
  maxConcurrent?: number; // >= 1
  maxPending?: number | Infinity; // >= 0 or Infinity
  onOverflow?: "drop" | "shift" | (pendings, call) => reorderedPendings | void;
  onError?: (error, call) => void;
});
  • maxConcurrent: maximum number of after hooks executed in parallel.
  • maxPending: maximum number of queued hooks waiting for execution.
  • onOverflow: policy to apply when pendings overflow
    • "drop": ignore the new call.
    • "shift": remove oldest pending call, enqueue the new one.
    • function: return a reordered/filtered pending list.
  • onError: called when a pooled call fails. If omitted, errors are logged.

Example

import { configurePool } from "coll-fns";

/* Must be called BEFORE any after hook is processed. */
configurePool({
  maxConcurrent: 20,
  maxPending: 1000,
  onOverflow: "shift",
  onError(error, call) {
    console.error("After-hook pool error:", error, call);
  },
});

Notes

  • This configuration is startup-only. Calling configurePool after processing starts throws.
  • If your workload is sensitive to ordering, keep maxConcurrent low or implement ordering constraints in your hook logic.
  • Tune maxConcurrent/maxPending based on your app’s throughput and memory profile.

Hook best practices

Before hooks should throw errors to prevent operations:

hook(Users, {
  beforeInsert: [
    {
      fn(doc) {
        if (!isValidEmail(doc.email)) {
          throw new Error("Invalid email");
        }
      },
    },
  ],
});

After hooks have a default error handler that logs to console.error. Define a custom onError handler if you need different behavior. The handler receives (err, hookDef), where hookDef is the hook definition enhanced with metadata, including Coll, collName and hookType.

hook(Users, {
  onInserted: [
    {
      fn(doc) {
        /* ... */
      },
      onError(err, hookDef) {
        logToService(err, hookDef.collName);
      },
    },
  ],
});

Always declare which fields your hook needs with the fields property. This reduces database queries and improves performance:

hook(Posts, {
  onUpdated: [
    {
      /* Only fetch these fields */
      fields: { authorId: 1, title: 1 },
      fn(afterPost, beforePost) {
        if (afterPost.title !== beforePost.title) {
          notifySubscribers(afterPost);
        }
      },
    },
  ],
});

Use when and unless to avoid unnecessary side effects while keeping code clean and predictable:

hook(Users, {
  onUpdated: [
    {
      fields: { status: 1 },
      /* Only run if status actually changed */
      unless(after, before) {
        return after.status === before.status;
      },
      fn(after, before) {
        sendStatusChangeEmail(after);
      },
    },
  ],
});

Hooks support both synchronous and asynchronous code. Returning a promise from a before-hook will delay the write operation:

hook(Users, {
  beforeInsert: [
    {
      async fn(doc) {
        /* Wait for external service */
        doc.externalId = await createExternalUser(doc);
      },
    },
  ],
});

Nested reactive publications

If coll-fns is used in a Meteor project, publications are a way to create fully reactive applications. The publish function helps publish complex hierarchical data.

When to use publish

Use publish() when the publication tree is dynamic and cannot be represented as a simple static list of cursors.

Important caveat (Meteor): if a publication can return plain cursor(s) directly from Meteor.publish, prefer that approach. Native cursor-return publications are simpler and usually more optimized by Meteor internals than any userland helper.

publish() is intended for cases where you need one or more of:

  • nested reactive children depending on parent documents
  • selector recomputation based on changed parent fields
  • observer reuse and invalidation logic you do not want to hand-roll repeatedly

publish(publication, Coll, selector, options)

Create a reactive publication tree (Meteor-style) with support for:

  • explicit child observers ({ Coll, on, ... })
  • join shorthand children ({ join: "joinKey", ... })
  • implicit join children derived from parent requested join fields

publish internally uses protocol methods:

  • observe to track cursor changes
  • getName to emit DDP collection names
  • stringify to build stable query reuse keys
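A possible shape for such a stringify method, assuming coll-fns only needs key-order stability so that equivalent selectors produce the same reuse key (this is a sketch, not the actual default implementation):

```javascript
/* Deterministic serializer: object keys are sorted recursively so
 * { a: 1, b: 2 } and { b: 2, a: 1 } yield the same reuse key. */
function stableStringify(value) {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}
```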

As with normal joins, child on values can be:

  • static selector objects
  • functions (parent, ...ancestors) => selector
  • join-array selectors: [from, to, toSelector?]

Refer to the join() section for more details.

import { Meteor } from "meteor/meteor";
import { join, publish, setJoinPrefix } from "coll-fns";
import {
  Posts,
  Users,
  Comments,
  Tags,
  FeatureFlags,
  PostStats,
} from "/imports/api/collections";

setJoinPrefix("+");

join(Posts, {
  author: { Coll: Users, on: ["authorId", "_id"], single: true },
  comments: { Coll: Comments, on: ["_id", "postId"], sort: { createdAt: -1 } },
  stats: { Coll: PostStats, on: ["_id", "postId"], single: true },
});

join(Comments, {
  author: { Coll: Users, on: ["authorId", "_id"], single: true },
});

Meteor.publish("posts.tree", function postsTree() {
  return publish(
    this,
    Posts,
    { status: "published" },
    {
      fields: {
        title: 1,
        authorId: 1,
        tagIds: 1,
        editorId: 1,
        "+": {
          stats: 1,
          comments: {
            body: 1,
            createdAt: 1,
            "+": {
              author: { displayName: 1 },
            },
          },
        },
      },
      children: [
        /* Predefined join on Posts collection */
        {
          join: "author",
          fields: { displayName: 1, avatarUrl: 1 },
        },

        /* Array selector */
        {
          Coll: Tags,
          on: [["tagIds"], "_id"],
          fields: { label: 1, color: 1 },
          deps: undefined, // Array selector children implicitly derive deps
        },

        /* Function selector */
        {
          Coll: Users,
          on: (post) => ({ _id: post.editorId }),
          fields: { displayName: 1 },
          deps: ["editorId"],
        },

        /* Object selector. Always returns the same data irrespective of parent. */
        {
          Coll: FeatureFlags,
          on: { scope: "posts_publication" },
          fields: { key: 1, enabled: 1 },
          deps: false,
        },
      ],
    }
  );
});

Publication readiness and concurrency

Readiness is controlled per child with awaited (default: true):

  • awaited: true: wait for this child subtree before calling ready()
  • awaited: false: do not wait for this child subtree; it loads/reacts in background

Use awaited: true for data the screen must have immediately, and awaited: false for optional or heavy branches. If not specified, a child inherits awaited from its parent.

options.maxConcurrent controls how many child observer creations can run at the same time (10 by default). This can only be defined at the publication root, not on children.

Why it matters:

  • each parent added/changed can trigger many child observer creations
  • unbounded concurrency can create CPU spikes and DB pressure
  • too little concurrency can slow initial publication warm-up

What it controls exactly:

  • only child observer creation tasks are throttled
  • observer reuse still applies (duplicate query keys are collapsed)
  • invalidation logic still runs; the option just smooths creation bursts

Practical guidance:

  • decrease it if your publication causes heavy DB load or event-loop stalls
  • increase it if your DB/runtime handles parallelism well and warm-up is slow
  • keep in mind this is per publish() call, not a single global cap

Example:

Meteor.publish("posts.tree", function postsTree() {
  return publish(
    this,
    Posts,
    { status: "published" },
    {
      maxConcurrent: 5,
      children: [
        { Coll: Users, on: ["authorId", "_id"], awaited: true },
        {
          Coll: FeatureFlags,
          on: { scope: "posts_publication" },
          awaited: false,
        },
      ],
    }
  );
});

Publication context (this in Meteor)

publish() is designed first for Meteor publications. In Meteor usage, the first argument should be the publication session/context (this) received in Meteor.publish(name, function () { ... }).

Minimal expected shape of publication:

  • ready() (required)
  • added(collectionName, id, fields) (optional but normally provided by Meteor)
  • changed(collectionName, id, fields) (optional but normally provided by Meteor)
  • removed(collectionName, id) (optional but normally provided by Meteor)
  • onStop(fn) (optional, used for cleanup registration)
  • error(err) (optional, used as error sink)

Example:

Meteor.publish("posts.tree", function () {
  // `this` is the publication context/session.
  return publish(this, Posts, { status: "published" });
});

If ready is missing, publish() throws.

How child declarations work

children entries can be objects or falsy values (false, null, undefined). Falsy entries are ignored, which allows short-circuit declarations like isEnabled && { ...childArgs }.

Object entries can be defined with either:

  • explicit child args:
    • { Coll, on, fields?, deps?, awaited?, children?, ...cursorOptions }
  • join shorthand:
    • { join: "joinKey", ...overrides }

Additionally, join children can be derived implicitly from requested parent join fields.

Conflict rule:

  • If the same join key is declared both as explicit child ({ join: "..." }) and in parent join fields (fields["+"][joinKey] or root join key without prefix), publish() throws and asks you to choose one style.

Ancestors chain

For function selectors and function deps, the helper passes:

  • first argument: direct parent document
  • remaining arguments: full ancestors chain (grandparent, great-grandparent, ...)

This is useful when deep children must depend on context from higher levels.

Meteor.publish("resources.tasks.actions", function () {
  return publish(
    this,
    Resources,
    { archived: false },
    {
      children: [
        {
          Coll: Tasks,
          on: (resource) => ({ resourceId: resource._id }),
          deps: ["_id"],
          children: [
            {
              Coll: Actions,
              on: (task, resource) => ({
                taskId: task._id,
                tenantId: resource.tenantId,
              }),
              deps(changedFields, task, resource) {
                // Re-run when task link changes or resource tenancy changes.
                if ("tenantId" in changedFields) return true;
                return ["_id", "tenantId"];
              },
            },
          ],
        },
      ],
    }
  );
});

How deps works

deps controls when child observers are invalidated and recomputed after parent changed events.

Supported values:

  • true: always invalidate
  • false: never invalidate
  • "field" or ["fieldA", "fieldB"]: invalidate only when those keys are present in changed fields
  • { fieldA: 1, fieldB: true }: object shorthand converted to watched keys (truthy top-level keys only)
  • function (changedFields, parent, ...ancestors) => depsLike: dynamic rule
  • undefined: special behavior

Special behavior when deps is undefined:

  • static selector object child: treated as no invalidation ([])
  • array/function selector child: treated as potentially dependent and will always invalidate conservatively

For array selectors, implicit deps are auto-derived from the from key.

deps matching is flat:

  • matching is done against exact keys present in changedFields
  • no deep path traversal is performed by publish()
  • nested object deps only contribute top-level keys
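The matching rules above can be condensed into a small resolver. This is an illustrative sketch of the documented semantics, not the coll-fns source; the special `undefined` case (which depends on the selector kind) is deliberately left out.

```javascript
/* Hypothetical flat deps matcher: normalize `deps` to a key list and
 * test exact keys against `changedFields`. */
function shouldInvalidate(deps, changedFields, parent) {
  if (typeof deps === "function") {
    // Dynamic rule: resolve to a deps-like value, then recurse.
    return shouldInvalidate(deps(changedFields, parent), changedFields, parent);
  }
  if (deps === true) return true;   // always invalidate
  if (deps === false) return false; // never invalidate
  const keys =
    typeof deps === "string"
      ? [deps]
      : Array.isArray(deps)
        ? deps
        : Object.keys(deps).filter((k) => deps[k]); // truthy top-level keys
  return keys.some((k) => k in changedFields); // flat, exact-key match
}
```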

Debugging

publish() supports lightweight lifecycle debugging with:

  • debug: true to log all internal debug events
  • debug: ["EVENT_NAME", ...] to log selected events only
  • debug: { EVENT_NAME: true, ... } to log selected events only

Observer events:

  • CREATED: a new observer was successfully created and registered.
  • BYPASSED: observer creation was skipped because selector resolved to a void selector.
  • REUSED: an existing observer for the same query key was reused.
  • INVALIDATED: child observer graph for a parent document was recomputed after relevant parent changes.
  • UNFOLLOWED: one follower link to a sub-observer was removed.
  • CANCELLED: an observer was cancelled and its local cleanup started.

Observer documents events:

  • DOC_ADDED: a document was published through added.
  • DOC_CHANGED: a published document emitted a changed update.
  • DOC_REMOVED: a document was removed from publication (observer count for that document dropped to zero).

Publication events (available on root only):

  • READY: emitted once when publish() calls publication.ready().
  • STOPPED: emitted once when the publication stop handler runs.

Debug scope:

  • Root debug applies to the root observer and root publication locations (READY, STOPPED).
  • Each child defines its own debug argument.
  • Implicit join-derived children (from parent fields) inherit parent debug.

Built-in optimizations

publish() is designed to stay controlled even with nested reactive trees.

In practical terms, it aims to protect you from:

  • creating the same observer repeatedly for equivalent child queries
  • runaway bursts of child observer creation
  • stale async creations being attached after data already changed
  • leaked child observers when parent links disappear
  • duplicate add/remove churn for documents shared by multiple branches

Why this matters:

  • lower risk of memory growth from forgotten/stale observers
  • fewer unnecessary observers and DB watches
  • more predictable behavior during frequent parent changes
  • safer use of nested publications in real apps, not only toy examples

maxConcurrent is part of this safety model: it prevents uncontrolled parallel creation bursts and lets you tune throughput vs load.

Using publish outside Meteor

The helper can be used outside Meteor only if both layers are provided:

  • protocol layer (setProtocol) supporting at least:
    • observe
    • getName (optional, default implementation provided)
    • stringify (optional, default implementation provided)
  • publication transport/context object implementing the callbacks listed above (added/changed/removed/ready/onStop/error)

Protocol methods handle database reactivity. publication handles how data changes are emitted to clients.

Current limitations

  • publish() is a helper, not a replacement for simple cursor-return publications.
  • Deep/dynamic trees can still be expensive if selectors are broad and highly volatile.
  • If deps are too broad (or omitted for dynamic selectors), invalidations may be frequent.
  • For best results, keep parent selectors selective and declare precise deps.

License

MIT