@hashtree/collection

v0.2.4

Immutable content-addressed collections and federated queries for hashtree

Immutable content-addressed collections for hashtree.

For app-builder guidance and common pitfalls, see ../../GETTING_STARTED.md.

This package adds a small layer on top of @hashtree/index:

  • canonical byId roots
  • auto-updated key indexes
  • auto-updated search indexes
  • optional schema defaults, normalization, and migration hooks
  • published source manifests
  • federated search across many source manifests

It is meant for decentralized app data such as personal catalogs, followed-user datasets, local merged views, and broader platform-style apps where many publishers own their own records.

Design

This package is intended for decentralized data, so it does not assume one rigid global schema.

The intended model is:

  • each publisher owns their own source
  • canonical data is source-owned and content-addressed
  • indexes are derived projections
  • federated search is multi-query over many sources
  • local schema rules are allowed, but global schema lockstep is not required

In practice, that means a collection source should be thought of as:

  • raw item blobs
  • a canonical byId root
  • derived key/search indexes
  • a manifest that advertises those roots

The current package focuses on the index and manifest layer. It does not try to be a full database.
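As a rough illustration of "indexes are derived projections", the sketch below treats canonical byId entries as the only source of truth and rebuilds a key index from them. It uses in-memory maps instead of hashtree roots, and the names `Cid`, `CanonicalEntry`, and `deriveKeyIndex` are hypothetical; they are not part of this package's API.

```typescript
// Hypothetical in-memory model; a real source stores these as hashtree roots.
type Cid = string;

interface CanonicalEntry<T> {
  item: T; // canonical item snapshot
  cid: Cid; // content address of the raw item blob
}

// Rebuild a key index (key -> sorted item ids) purely from canonical entries.
function deriveKeyIndex<T>(
  byId: Map<string, CanonicalEntry<T>>,
  keys: (item: T) => string[],
): Map<string, string[]> {
  const index = new Map<string, string[]>();
  byId.forEach((entry, id) => {
    for (const key of keys(entry.item)) {
      const ids = index.get(key) ?? [];
      ids.push(id);
      index.set(key, ids);
    }
  });
  // Sort ids so the derived index does not depend on insertion order.
  index.forEach((ids) => ids.sort());
  return index;
}

interface Song {
  id: string;
  artist: string;
}

const byId = new Map<string, CanonicalEntry<Song>>([
  ['song-2', { item: { id: 'song-2', artist: 'ada' }, cid: 'cid-2' }],
  ['song-1', { item: { id: 'song-1', artist: 'Ada' }, cid: 'cid-1' }],
]);

const artistIndex = deriveKeyIndex(byId, (song) => [`artist:${song.artist.toLowerCase()}`]);
```

Because the index is a pure function of the canonical entries, it can be discarded and rebuilt at any time without losing data.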

Platform Apps

This package is intended to be the generic data/index layer for apps that used to default to centralized "platform" backends.

Examples:

  • marketplace listings
  • room or apartment inventories
  • ride availability and dispatch inputs
  • booking slots and service catalogs
  • jobs, offers, menus, and local reputation projections

The decentralized pattern is:

  • each participant publishes their own source
  • canonical state stays source-owned
  • browse/search/trust are local derived views
  • federated query replaces the one global SQL table

Raw Data vs Projections

For decentralized systems, the safest long-term split is:

  • raw item format: publisher-defined and potentially app-specific
  • projection/index format: small normalized fields used for search, browse, ranking, and lightweight display

That split matters because clients may not understand every publisher's raw format, but they can still query published projections and indexes.
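A minimal sketch of that split, assuming two publishers with different raw formats that both publish the same small projection. Every type and function name here (`ListingProjection`, `RawListingA`, `RawListingB`, `projectA`, `projectB`, `matchesTag`) is hypothetical and only illustrates the idea.

```typescript
// Hypothetical projection: the small normalized fields a client can rely on.
interface ListingProjection {
  id: string;
  title: string;
  tags: string[];
}

// Two publisher-defined raw formats that a generic client may not understand.
interface RawListingA {
  uuid: string;
  headline: string;
  labels?: string[];
}

interface RawListingB {
  id: string;
  name: string;
  categories: string; // comma-separated
}

// Each publisher maps its own raw format to the shared projection before indexing.
function projectA(raw: RawListingA): ListingProjection {
  return {
    id: raw.uuid,
    title: raw.headline.trim(),
    tags: (raw.labels ?? []).map((t) => t.toLowerCase()),
  };
}

function projectB(raw: RawListingB): ListingProjection {
  return {
    id: raw.id,
    title: raw.name.trim(),
    tags: raw.categories
      .split(',')
      .map((t) => t.trim().toLowerCase())
      .filter((t) => t.length > 0),
  };
}

// A client can filter projections without decoding either raw format.
function matchesTag(p: ListingProjection, tag: string): boolean {
  return p.tags.indexOf(tag.toLowerCase()) !== -1;
}

const projections = [
  projectA({ uuid: 'a-1', headline: ' Sunny room ', labels: ['Rooms'] }),
  projectB({ id: 'b-7', name: 'Bike for rent', categories: 'Bikes, Rentals' }),
];
const rentals = projections.filter((p) => matchesTag(p, 'rentals'));
```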

This package currently gives you the projection/index side:

  • canonical byId
  • named key indexes
  • named search indexes
  • source manifests
  • federated search helpers

It is compatible with a future codec/projection layer, where a source can declare an item format and clients can optionally decode richer item payloads when they know that format.

Published Metadata

When a collection root is published as a hashtree directory, reserve .collection-manifest.json for collection-level metadata that peers can inspect without any local runtime hooks.

Today that metadata is intentionally small:

  • schemaVersion
  • publishedSchema.itemFormat
  • publishedSchema.projectionFormat
  • optional publishedSchema.schemaRef

The JSON shape is the same in TypeScript and Rust. Index names are expected to be meaningful enough on their own, so there is no extra per-index description layer by default.
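For illustration, a peer could read that metadata roughly as follows. The field names come from the list above; the value types (a numeric `schemaVersion`, string formats), the example values, and the helper `parseCollectionMetadata` are assumptions, not the package's published types.

```typescript
// Field names follow the list above; the value types are assumptions.
interface PublishedSchema {
  itemFormat: string;
  projectionFormat: string;
  schemaRef?: string;
}

interface CollectionManifestMetadata {
  schemaVersion: number;
  publishedSchema: PublishedSchema;
}

// Hypothetical helper: parse the JSON text and check only the listed fields.
function parseCollectionMetadata(json: string): CollectionManifestMetadata | null {
  let value: unknown;
  try {
    value = JSON.parse(json);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== 'object' || value === null) return null;
  const root = value as Record<string, unknown>;
  const schemaVersion = root.schemaVersion;
  const published = root.publishedSchema;
  if (typeof schemaVersion !== 'number') return null;
  if (typeof published !== 'object' || published === null) return null;

  const fields = published as Record<string, unknown>;
  const itemFormat = fields.itemFormat;
  const projectionFormat = fields.projectionFormat;
  const schemaRef = fields.schemaRef;
  if (typeof itemFormat !== 'string' || typeof projectionFormat !== 'string') return null;
  if (schemaRef !== undefined && typeof schemaRef !== 'string') return null;

  const publishedSchema: PublishedSchema = { itemFormat, projectionFormat };
  if (typeof schemaRef === 'string') publishedSchema.schemaRef = schemaRef;
  return { schemaVersion, publishedSchema };
}

// Example file contents with made-up format identifiers.
const manifestText = JSON.stringify({
  schemaVersion: 1,
  publishedSchema: {
    itemFormat: 'example.song/v1',
    projectionFormat: 'example.song-projection/v1',
  },
});

const metadata = parseCollectionMetadata(manifestText);
```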

Schema

CollectionDefinition.schema is intentionally a local convenience, not a universal contract.

Use it for:

  • filling defaults
  • normalization before indexing
  • validation for your own writes
  • migrating known legacy item shapes

Do not assume every remote source on the network shares the same schema or a predictable migration chain.

For that reason, schema support in this package is intentionally small:

  • defaults
  • normalize
  • validate
  • migrate

If a decentralized source uses an unknown raw item format, the source can still participate in federated search as long as it publishes compatible derived indexes.
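To make the four hooks concrete, here is a self-contained sketch of how they could be applied to a local write. The `LocalSchema` interface, the `applySchema` helper, the hook signatures, the order (migrate, defaults, normalize, validate), and the legacy `name` field are all assumptions for illustration; the package's real behavior may differ.

```typescript
// Hypothetical hook shapes; the package's real signatures may differ.
interface LocalSchema<T> {
  version: number;
  defaults?: Partial<T>;
  migrate?: (raw: Record<string, unknown>) => Record<string, unknown>;
  normalize?: (item: T) => T;
  validate?: (item: T) => string | null; // error message, or null when valid
}

// Assumed order: migrate legacy shapes, fill defaults, normalize, then validate.
function applySchema<T>(schema: LocalSchema<T>, raw: Record<string, unknown>): T {
  const migrated = schema.migrate ? schema.migrate(raw) : raw;
  const withDefaults = { ...(schema.defaults ?? {}), ...migrated } as unknown as T;
  const normalized = schema.normalize ? schema.normalize(withDefaults) : withDefaults;
  const error = schema.validate ? schema.validate(normalized) : null;
  if (error !== null) throw new Error(`invalid item: ${error}`);
  return normalized;
}

interface Song {
  id: string;
  title: string;
  artist: string;
  tags: string[];
}

const songSchema: LocalSchema<Song> = {
  version: 2,
  defaults: { tags: [] },
  // Made-up legacy shape: older items used `name` instead of `title`.
  migrate: (raw) => {
    if (typeof raw.name === 'string' && raw.title === undefined) {
      const { name, ...rest } = raw;
      return { ...rest, title: name };
    }
    return raw;
  },
  normalize: (song) => ({ ...song, title: song.title.trim() }),
  validate: (song) => (song.title.length > 0 ? null : 'title is required'),
};

const migratedSong = applySchema(songSchema, {
  id: 'song-1',
  name: '  Starlight Echo ',
  artist: 'Ada',
});
```

These hooks only govern your own writes; items from remote sources may never pass through them.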

Federated Query Model

The intended default is:

  • query many source manifests in parallel
  • merge results locally
  • dedupe by logical id
  • optionally boost by trust or social distance

This is usually better than physically merging every source into one canonical, shared, mutable index.

Physical merge can still be useful as a local cache or overlay, but correctness should come from source snapshots, not from endlessly accumulating merged roots.
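The four steps of the default model can be sketched without any hashtree types. This is not the package's `federatedSearch` implementation: `QueryableSource`, `mergeHits`, `queryManySources`, the numeric `trust` weight, and the multiplicative boost are hypothetical choices made for the example.

```typescript
// Hypothetical shapes; a real source would be backed by a published manifest.
interface Hit {
  id: string; // logical item id
  score: number;
}

interface QueryableSource {
  sourceId: string;
  trust: number; // local weight, e.g. derived from social distance (assumption)
  search: (query: string) => Promise<Hit[]>;
}

interface MergedHit {
  id: string;
  score: number;
  sourceId: string;
}

// Merge locally: boost each hit by source trust, keep the best hit per logical id.
function mergeHits(perSource: { source: QueryableSource; hits: Hit[] }[]): MergedHit[] {
  const best = new Map<string, MergedHit>();
  for (const { source, hits } of perSource) {
    for (const hit of hits) {
      const boosted = hit.score * source.trust;
      const current = best.get(hit.id);
      if (!current || boosted > current.score) {
        best.set(hit.id, { id: hit.id, score: boosted, sourceId: source.sourceId });
      }
    }
  }
  const merged: MergedHit[] = [];
  best.forEach((hit) => merged.push(hit));
  // Highest score first; ties broken by id for a stable order.
  return merged.sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Query every source in parallel; a failing source contributes no hits.
async function queryManySources(sources: QueryableSource[], query: string): Promise<MergedHit[]> {
  const perSource = await Promise.all(
    sources.map(async (source) => {
      try {
        return { source, hits: await source.search(query) };
      } catch {
        return { source, hits: [] as Hit[] };
      }
    }),
  );
  return mergeHits(perSource);
}

const alice: QueryableSource = {
  sourceId: 'alice',
  trust: 1,
  search: async () => [
    { id: 'x', score: 0.5 },
    { id: 'y', score: 0.4 },
  ],
};

const bob: QueryableSource = {
  sourceId: 'bob',
  trust: 2,
  search: async () => [{ id: 'x', score: 0.4 }],
};
```

Calling `queryManySources([alice, bob], 'starlight')` resolves to one entry per logical id, and nothing is written back to any source.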

Install

npm install @hashtree/collection

Usage

import { MemoryStore } from '@hashtree/core';
import { CollectionWriter, CollectionSource, federatedSearch } from '@hashtree/collection';

const store = new MemoryStore();

const songs = new CollectionWriter(store, {
  sourceId: 'npub1.../audio',
  schema: {
    version: 2,
    defaults: { tags: [] },
    normalize: (song) => ({
      ...song,
      title: song.title.trim(),
    }),
  },
  getId: (song) => song.id,
  keyIndexes: [
    { name: 'artist', keys: (song) => [`artist:${song.artist.toLowerCase()}`] },
  ],
  searchIndexes: [
    { name: 'songs', prefix: 's:', text: (song) => [song.title, song.artist] },
  ],
});

// `someCid` is a placeholder for the item's CID; it is not defined in this snippet.
await songs.put({ id: 'song-1', title: 'Starlight Echo', artist: 'Ada' }, someCid);

const source = new CollectionSource(store, songs.manifest());
const results = await source.search('songs', 'starlight');

Notes

  • put(item, cid) is safe for inserts and by-id-only collections.
  • put(...) requires options.previous when replacing an existing item in a collection with key/search indexes, so the library can remove stale derived entries deterministically.
  • replace(item, cid, previous) is the explicit helper for indexed updates.
  • delete(item) requires the indexed fields of the item being removed.
  • count() uses the manifest's published itemCount when available; use exactCount() if you explicitly need to walk the byId tree.
  • reindex(entries) is the explicit way to rebuild all derived roots after adding indexes or changing derivation rules. It accepts sync or async entry streams, but each entry still needs the canonical item snapshot plus its CID; roots alone are not enough.
  • If query-time normalization differs from the default keyword parser, define searchIndexes[].terms(text, { parseKeywords }).
  • When the reader still has the collection definition, pass it to new CollectionSource(store, manifest, definition) so source.search(...) reuses the same term expansion.
  • When the reader only has the manifest, pair searchIndexes[].terms(...) with CollectionSource.searchTerms(...) and app-side query parsing so indexing and querying stay in sync.
  • Schemas are intentionally small: use defaults, normalize, validate, and migrate instead of a large schema framework.
  • Federated search is multi-query first. You do not need to physically merge roots just to search across many sources.
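The reason indexed updates need the previous item can be shown with a toy in-memory key index. This is a sketch of the general technique, not the library's code; `KeyIndex`, `addEntries`, `removeEntries`, and `replaceEntries` are hypothetical names.

```typescript
// Hypothetical in-memory key index: key -> set of item ids.
type KeyIndex = Map<string, Set<string>>;

interface Song {
  id: string;
  title: string;
  artist: string;
}

const artistKeys = (song: Song): string[] => [`artist:${song.artist.toLowerCase()}`];

function addEntries(index: KeyIndex, song: Song): void {
  for (const key of artistKeys(song)) {
    const ids = index.get(key) ?? new Set<string>();
    ids.add(song.id);
    index.set(key, ids);
  }
}

function removeEntries(index: KeyIndex, song: Song): void {
  for (const key of artistKeys(song)) {
    const ids = index.get(key);
    if (!ids) continue;
    ids.delete(song.id);
    if (ids.size === 0) index.delete(key);
  }
}

// Indexed update: keys derived from the previous snapshot are removed first,
// because the index alone does not record which keys an id was filed under.
function replaceEntries(index: KeyIndex, next: Song, previous: Song): void {
  removeEntries(index, previous);
  addEntries(index, next);
}

const songIndex: KeyIndex = new Map();
const v1: Song = { id: 'song-1', title: 'Starlight Echo', artist: 'Ada' };
const v2: Song = { ...v1, artist: 'Grace' };

addEntries(songIndex, v1);
replaceEntries(songIndex, v2, v1);
```

If `v2` were added without removing the entries derived from `v1`, the `artist:ada` key would keep pointing at `song-1` even though the item no longer matches it.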

Direction

The likely next layer on top of this package is a codec/projection model:

  • source declares an itemFormat
  • clients optionally register adapters/codecs for known formats
  • search and browse can still work from published projections even when raw items are unknown

That keeps the network open to many app-specific formats without giving up discoverability.
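Since this layer does not exist yet, the following is only a speculative sketch of what a client-side adapter registry might look like. `CodecRegistry`, `DisplayItem`, the string-based raw payload, and the format identifiers are all invented for the example.

```typescript
// Hypothetical view model built either from a decoded raw item or from a projection.
interface DisplayItem {
  title: string;
  details?: string;
}

// Hypothetical codec: turns a raw payload (a string here, for simplicity) into a DisplayItem.
type Codec = (raw: string) => DisplayItem;

class CodecRegistry {
  private codecs = new Map<string, Codec>();

  register(itemFormat: string, codec: Codec): void {
    this.codecs.set(itemFormat, codec);
  }

  // Decode the raw item when the format is known; otherwise fall back to the projection.
  render(itemFormat: string, raw: string, projection: DisplayItem): DisplayItem {
    const codec = this.codecs.get(itemFormat);
    if (!codec) return projection;
    try {
      return codec(raw);
    } catch {
      return projection; // undecodable payloads degrade to the projection
    }
  }
}

const registry = new CodecRegistry();
registry.register('example.song+json/v1', (raw) => {
  const decoded = JSON.parse(raw) as { title: string; album: string };
  return { title: decoded.title, details: `Album: ${decoded.album}` };
});

const projection: DisplayItem = { title: 'Starlight Echo' };

const richView = registry.render(
  'example.song+json/v1',
  '{"title":"Starlight Echo","album":"Orbit"}',
  projection,
);
const fallbackView = registry.render('example.other/v3', '<opaque payload>', projection);
```

Here `richView` carries the extra `details` field, while `fallbackView` is just the published projection, so unknown formats stay searchable and displayable.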