@skytruth/shared-datasets

v0.7.0

Framework-neutral TypeScript helpers for SkyTruth shared datasets.

SkyTruth Shared Datasets TypeScript Helpers

Framework-neutral TypeScript helpers for browser and server code that consumes SkyTruth shared-datasets PMTiles through the SkyTruth CDN.

Use this package for catalog-driven PMTiles URLs, browser CDN session handshakes, PMTiles fetch credential selection, lightweight access-tier lookups, and server-only Cloud CDN signed-cookie helpers. It does not own application authentication, authorization, routing, secret storage, logging, UI behavior, retries, or HTTP/error translation.

Package Status And Installation

The package name is:

@skytruth/shared-datasets

The package is published on npm. Consumers install it with:

npm install @skytruth/shared-datasets

Use a local path only for package development and integration testing against unreleased local changes:

npm install ../shared-datasets-1/api/typescript

Do not commit local-path installs to production consumers. Verify the registry version before changing production consumers:

npm view @skytruth/shared-datasets version

Entrypoints

Use the browser-safe main entrypoint from client code and shared code that may be bundled into a browser:

import {
  clearPmtilesCdnSession,
  ensurePmtilesCdnSession,
  getPmtilesFetchCredentials,
  isPrivatePmtilesUrl,
  resolveSharedDatasetPmtilesRef
} from "@skytruth/shared-datasets";

Use the server-only entrypoint from API routes, server actions, or backend code that can import Node built-ins:

import {
  decodePmtilesCdnSigningKey,
  getExpiredPmtilesCookies,
  getPrivatePmtilesSessionCookies
} from "@skytruth/shared-datasets/server";

Do not import @skytruth/shared-datasets/server from browser bundles. It uses Node crypto and should stay behind the consumer application's backend boundary.

Recommended Setup By Runtime

| Runtime | Use | Setup |
|---|---|---|
| Browser displaying public PMTiles | Catalog helpers and getPmtilesFetchCredentials | Resolve pmtiles_url from catalog JSON or use a known public URL; no session endpoint is required. |
| Browser displaying private or internal PMTiles | Main entrypoint session and fetch helpers | Call a consumer-owned backend session endpoint before mounting restricted layers and use credentialed PMTiles range requests. |
| Browser click needs public feature attributes | resolveSharedDatasetLayer plus fetchSharedDatasetMetadataRecords | Resolve the layer and its release metadata sidecar together, then join clicked features by feature_id. |
| Browser click needs private feature attributes | App backend route to a signed metadata sidecar URL | PMTiles expose feature_id; URL workflows should use the same canonical feature ID. The backend authenticates, authorizes, and returns only app-approved metadata access. |
| Backend PMTiles session route | createPmtilesSessionHandler / createNextPmtilesSessionHandler from the server entrypoint | Provide getViewer and getSigningKey; the handler implements the full tiered session contract. |
| Backend private metadata URL route | Server entrypoint artifact signing helper | Validate slug, release, locale, asset tier, and entitlement before signing an exact sidecar path. |
| Backend layer/config API | Catalog helpers or access-tier cache helpers | Resolve catalog JSON once, preserve accessTier, url, citation, source, release, and release metadata sidecar references in consumer-owned config. |

Use the Python SDK instead when backend code needs to download canonical data files or resolve durable gs:// object identities with Application Default Credentials.

Release-oriented vector PMTiles are intentionally lightweight and are not the source of full feature attributes. They expose exactly one feature property, feature_id. Use it for click-to-metadata joins through a release metadata sidecar resolved from the release index. The IAP-protected metadata lookup API (POST /v1/assets/{slug}/releases/{release}:lookup) is dormant while Firestore metadata serving is inactive: otherwise valid lookup requests return 409 index_not_ready, so active consumer workflows must use the sidecar.

feature_id values are URL-safe strings matching ^[A-Za-z0-9]{1,64}$, either copied from a verified-unique source field or assigned as monotonic decimal sequence strings preserved across releases. For user-visible URLs, pass that public feature_id handle through the app backend and resolve metadata with the same release-scoped sidecar contract.
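A backend that accepts feature_id from a user-visible URL may want to reject malformed values before resolving metadata. The check below is a minimal sketch built only from the pattern quoted above; isValidFeatureId is a hypothetical name, not a package export.

```typescript
// Pattern copied from the feature_id contract described above.
const FEATURE_ID_PATTERN = /^[A-Za-z0-9]{1,64}$/;

// Hypothetical helper (not part of @skytruth/shared-datasets): narrows an
// untrusted value to a string only when it matches the documented pattern.
function isValidFeatureId(value: unknown): value is string {
  return typeof value === "string" && FEATURE_ID_PATTERN.test(value);
}
```

Because the pattern allows only ASCII letters and digits, a value that passes can be placed in a URL path segment or query string without further escaping.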

Catalog Helpers

The default catalog JSON URL is:

https://tiles.skytruth.org/_catalog/web/catalog.json

Resolve one PMTiles reference:

import { resolveSharedDatasetPmtilesRef } from "@skytruth/shared-datasets";

const ref = await resolveSharedDatasetPmtilesRef("example-public-layer");

Resolve several or all PMTiles references:

import {
  resolveAllSharedDatasetPmtilesRefs,
  resolveSharedDatasetPmtilesRefs
} from "@skytruth/shared-datasets";

const selectedRefs = await resolveSharedDatasetPmtilesRefs([
  "example-public-layer",
  "example-private-layer"
]);

const allRefs = await resolveAllSharedDatasetPmtilesRefs();

If your app already fetched catalog JSON, avoid a second network call:

import { resolveSharedDatasetPmtilesRefsFromCatalogJson } from "@skytruth/shared-datasets";

const refs = resolveSharedDatasetPmtilesRefsFromCatalogJson(catalogJson, [
  "example-public-layer",
  "example-private-layer"
]);

When a PMTiles layer needs localized labels or feature-inspector display names, resolve the selected release metadata sidecar instead of reading display columns from PMTiles. Prefer the requested locale-specific {asset-slug}.metadata.{locale}.ndjson.gz sidecar when present, then the canonical {asset-slug}.metadata.ndjson.gz fallback. Do not hardcode source-native fields; localized source data is materialized from {asset-slug}.metadata-translations.csv rows keyed by feature_id, field, locale, and source_value_hash, while PMTiles expose only feature_id. Sidecar/API records include geometry_hash; use it as the stable geometry-equivalence key for grouping or de-duplicating footprints after metadata is loaded, not as a URL lookup handle.
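The locale preference described above can be sketched as an ordered candidate list. This is an illustration, not the package's resolver: metadataSidecarCandidates and pickSidecar are hypothetical names, and the set of available file names stands in for whatever the release index actually lists.

```typescript
// Hypothetical helper: sidecar file names in preference order, using the
// naming templates quoted above.
function metadataSidecarCandidates(assetSlug: string, locale?: string): string[] {
  const canonical = `${assetSlug}.metadata.ndjson.gz`;
  if (!locale) return [canonical];
  return [`${assetSlug}.metadata.${locale}.ndjson.gz`, canonical];
}

// Hypothetical helper: picks the first candidate that is available.
// `available` is an assumed stand-in for the release index contents.
function pickSidecar(
  candidates: string[],
  available: ReadonlySet<string>
): { name: string; isFallback: boolean } | null {
  const index = candidates.findIndex(name => available.has(name));
  if (index === -1) return null;
  // Any match after the first candidate means the canonical fallback was used.
  return { name: candidates[index], isFallback: index > 0 };
}
```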

Use review_state values from metadata records to show or filter confidence for source-provided, machine-translated, human-reviewed, and mixed labels.

Each resolved ref includes:

type SharedDatasetCatalogRef = {
  accessTier: "public" | "private" | "internal";
  url: string;
  title: string | null;
  description: string | null;
  status: string | null;
  consumerGuidance: string | null;
  citation: string | null;
  license: string | null;
  source: string | null;
  sourceUrl: string | null;
  docsUrl: string | null;
  releaseIndexUrl: string | null;
  latestRelease: Record<string, unknown> | null;
  lastUpdated: string | null;
  localizedNames: {
    storage?: string | null;
    join_key?: string | null;
    localization_file?: string | null;
    property_template?: string | null;
    locale_code_format?: string | null;
    fallback_locale?: string | null;
    fallback_field?: string | null;
    available_locales?: string[];
    translations?: Array<{
      locale_code: string;
      field: string;
      review_state_field?: string | null;
      label?: string | null;
      review_state: "source_provided" | "machine_translated" | "human_reviewed" | "mixed";
    }>;
  } | null;
};

localizedNames is retained for older catalog JSON and consumers. New release-oriented metadata sidecar integrations should prefer the release index metadata artifact helpers described below.

Catalog resolution throws SharedDatasetCatalogResolutionError when catalog data is missing, malformed, or cannot resolve a requested PMTiles asset. These helpers return PMTiles-capable assets only. For full catalog screens or default production layer lists, keep only refs with status === "active" and preserve the license, citation, source, docs, release, and metadata sidecar references returned with each ref. If your app needs non-PMTiles assets or fields outside this type, fetch and parse the catalog JSON directly.

Layer And Metadata Resolution

Because PMTiles expose only feature_id, mounting a layer and resolving its release metadata belong together. resolveSharedDatasetLayer is the recommended path: one call resolves the catalog ref, fetches the release index, and resolves the metadata sidecar from the same release:

import {
  fetchSharedDatasetMetadataRecords,
  resolveSharedDatasetLayer
} from "@skytruth/shared-datasets";

const layer = await resolveSharedDatasetLayer("example-public-layer", {
  locale: userLocale
});

renderPmtilesLayer(layer.ref.url);

if (layer.sidecar?.url) {
  const records = await fetchSharedDatasetMetadataRecords(layer.sidecar.url);
  const record = records.get(clickedFeatureId);
}

The returned layer includes ref (the PMTiles catalog ref), releaseIndex, resolvedRelease (the concrete YYYY-MM-DD release the sidecar came from; persist it when lineage matters), and sidecar with the resolved locale, fallback flag, and artifact URL. sidecar.url is null for private assets; route those through an app-owned signed-URL backend instead. Pass version: "YYYY-MM-DD" to pin the sidecar to an exact release. The default is the latest release in the release index, which matches what the latest/ PMTiles CDN URL serves.

fetchSharedDatasetMetadataRecords downloads the sidecar, transparently handles both CDN-decompressed NDJSON and raw gzip bytes, parses each line, and returns a Map keyed by feature_id. Loading eagerly is fine for small assets; for assets with very large sidecars, load lazily on first interaction and keep the parsed map cached. parseSharedDatasetMetadataRecords is exported separately for callers that fetch sidecar text themselves.
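One way a caller that fetches sidecar bytes itself could tell the two cases apart is the gzip magic number: gzip streams begin with the bytes 0x1f 0x8b. The sketch below assumes nothing about how the package does this internally; looksGzipped is a hypothetical helper.

```typescript
// Hypothetical helper: true when the buffer starts with the gzip magic
// bytes (0x1f 0x8b), false for plain NDJSON text or an empty buffer.
function looksGzipped(bytes: Uint8Array): boolean {
  return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}
```

When it returns true, decompress before decoding the text, for example with DecompressionStream("gzip") in browsers or zlib.gunzipSync in Node.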

Metadata Artifact Helpers

Release indexes list metadata sidecars for feature-inspector fields. Public assets can fetch those sidecars directly from the CDN artifact route:

import {
  resolvePublicSharedDatasetMetadataSidecarUrl
} from "@skytruth/shared-datasets";

const sidecar = resolvePublicSharedDatasetMetadataSidecarUrl({
  accessTier: ref.accessTier,
  releaseIndex,
  version: "latest",
  locale: userLocale
});

if (sidecar) {
  const response = await fetch(sidecar.url, { cache: "no-store" });
  const metadataBytes = await response.arrayBuffer();
}

The resolver tries the requested locale first and falls back to the canonical .metadata.ndjson.gz sidecar when a localized sidecar is absent. It returns null when the release index has no metadata sidecar.

Each sidecar is gzip NDJSON with one JSON record per feature:

{
  "schema_version": 2,
  "asset_slug": "example-boundary-layer",
  "release": "2026-06-09",
  "feature_id": "12345",
  "geometry_hash": "sha256:...",
  "properties_hash": "sha256:...",
  "properties": { "SOURCE_ID": 12345, "NAME": "Example feature" },
  "provenance": { "source": "Example source release" }
}

Join PMTiles features to records by feature_id. Localized sidecars keep the same record shape with translated display values already materialized into properties.
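For callers that already hold decompressed sidecar text, the join can be sketched as a Map keyed by feature_id. This is a simplified stand-in, not the package's parseSharedDatasetMetadataRecords: the record type lists only fields shown in the example above, and parseNdjsonByFeatureId is a hypothetical name.

```typescript
// Minimal record shape based on the example record above; other fields are
// carried through untyped.
type MetadataRecord = {
  feature_id: string;
  properties: Record<string, unknown>;
  [key: string]: unknown;
};

// Hypothetical parser: one JSON record per non-empty line, keyed by feature_id.
function parseNdjsonByFeatureId(text: string): Map<string, MetadataRecord> {
  const records = new Map<string, MetadataRecord>();
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    const record = JSON.parse(trimmed) as MetadataRecord;
    records.set(record.feature_id, record);
  }
  return records;
}
```

A click handler would then read the feature_id property from the clicked PMTiles feature and call records.get(featureId) to obtain the attributes to display.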

Private assets should not expose direct sidecar URLs from browser code. Consumer backends should expose an app-owned route such as:

GET /api/shared-datasets/metadata-url?slug=&version=&locale=

That route should authenticate the user, apply app-specific authorization, resolve the exact sidecar from catalog and release-index data, verify the asset is private, sign the artifact URL, return Cache-Control: no-store, and never sign arbitrary caller-provided object paths.
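As a rough sketch of the rule against signing arbitrary paths, a route can validate each query parameter against a strict pattern before resolving anything. The patterns below are assumptions (lowercase hyphenated slugs, latest or YYYY-MM-DD versions, short locale tags), and parseMetadataUrlQuery is a hypothetical helper; adjust them to the formats your catalog actually uses.

```typescript
type MetadataUrlQuery = { slug: string; version: string; locale: string | null };

// Assumed formats, not confirmed by the package documentation.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
const VERSION_PATTERN = /^(?:latest|\d{4}-\d{2}-\d{2})$/;
const LOCALE_PATTERN = /^[A-Za-z]{2,3}(?:-[A-Za-z0-9]{2,8})*$/;

// Hypothetical helper: returns the validated query, or null when any
// parameter could smuggle in a path segment or an unexpected value.
function parseMetadataUrlQuery(params: {
  slug?: string;
  version?: string;
  locale?: string;
}): MetadataUrlQuery | null {
  const slug = params.slug ?? "";
  const version = params.version ?? "latest";
  const locale = params.locale ?? null;
  if (!SLUG_PATTERN.test(slug) || !VERSION_PATTERN.test(version)) return null;
  if (locale !== null && !LOCALE_PATTERN.test(locale)) return null;
  return { slug, version, locale };
}
```

Validation is only the first gate: the route still has to authenticate, authorize, and resolve the sidecar from catalog and release-index data before signing.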

Server code can sign an exact resolved artifact path:

import { getSignedSharedDatasetArtifactUrl } from "@skytruth/shared-datasets/server";

const signedUrl = getSignedSharedDatasetArtifactUrl(gsUri, signingKey);

By default this server helper signs https://tiles.skytruth.org/private/... URLs for private metadata sidecars. Pass artifactBaseUrl only for tests or a deliberate deployment-specific CDN route.

Browser PMTiles Fetching

Before fetching private or internal PMTiles, call the consumer backend session endpoint. Public layers can skip the session call because they do not need a cookie.

const result = await ensurePmtilesCdnSession({
  accessTier: ref.accessTier,
  endpoint: "/api/pmtiles/session"
});

if (!result.ok) {
  if (result.denied) {
    hidePmtilesLayer(ref);
  } else {
    reportPmtilesSessionFailure(result);
  }
  return;
}

renderPmtilesLayer(ref.url);

A denied: true result (HTTP 403) means the signed-in viewer is not authorized for this tier, so hide the layer instead of retrying. To decide which layers to offer before mounting anything, probe the viewer's qualifying tiers:

const grants = await getPmtilesCdnGrants({ endpoint: "/api/pmtiles/session" });
if (grants.ok && grants.tiers.includes("internal")) {
  showInternalDatasetGroup();
}

Use getPmtilesFetchCredentials anywhere PMTiles bytes are fetched:

const response = await fetch(ref.url, {
  credentials: getPmtilesFetchCredentials(ref.url),
  headers: {
    Range: `bytes=${start}-${end}`
  }
});

The helper returns:

  • include for restricted PMTiles URLs under /pmtiles/private/ or /pmtiles/internal/
  • same-origin for public PMTiles URLs

Relative PMTiles URLs are resolved against https://tiles.skytruth.org by default. Pass baseUrl and restrictedPathPrefixes only for tests or a deliberate consumer-owned PMTiles route.
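The selection rule above can be approximated in a few lines. This is an illustrative re-implementation under the defaults stated in this section, not the package source; credentialsForPmtilesUrl is a hypothetical name, and the real helper may apply additional checks (for example on the host).

```typescript
type FetchCredentialsMode = "include" | "same-origin";

const DEFAULT_BASE_URL = "https://tiles.skytruth.org";
const DEFAULT_RESTRICTED_PREFIXES = ["/pmtiles/private/", "/pmtiles/internal/"];

// Hypothetical sketch of the documented rule: restricted path prefixes get
// credentialed requests, everything else stays same-origin.
function credentialsForPmtilesUrl(
  url: string,
  baseUrl: string = DEFAULT_BASE_URL,
  restrictedPathPrefixes: string[] = DEFAULT_RESTRICTED_PREFIXES
): FetchCredentialsMode {
  // new URL(url, base) resolves relative URLs and normalizes "../" segments.
  const { pathname } = new URL(url, baseUrl);
  const restricted = restrictedPathPrefixes.some(prefix => pathname.startsWith(prefix));
  return restricted ? "include" : "same-origin";
}
```

Matching on the parsed pathname rather than on the raw string keeps query strings and dot segments from influencing the decision.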

On sign-out, clear CDN cookies through the same consumer endpoint:

await clearPmtilesCdnSession({ endpoint: "/api/pmtiles/session" });
await signOutUser();

ensurePmtilesCdnSession and clearPmtilesCdnSession return result objects instead of throwing for HTTP or network failures. Consumers decide whether to warn, retry, hide a layer, redirect to sign-in, or ignore cleanup failures.

Backend CDN Session Route

Consumers should expose their own session endpoint:

GET /api/pmtiles/session?tier=public
GET /api/pmtiles/session?tier=private
GET /api/pmtiles/session?tier=internal
GET /api/pmtiles/session?tier=grants
DELETE /api/pmtiles/session

Do not hand-write the contract. createPmtilesSessionHandler (and the pages-router adapter createNextPmtilesSessionHandler) implement all of it: Cache-Control: no-store on every response, 204 for public, 401 for anonymous restricted-tier requests, 403 for authenticated-but-unauthorized viewers, one signed cookie per tier the viewer qualifies for, a ?tier=grants probe that reports qualifying tiers without setting cookies, DELETE expiry of every restricted-tier cookie, and 500 on signing failures without leaking the key.

// pages/api/pmtiles/session.ts
import {
  createNextPmtilesSessionHandler,
  decodePmtilesCdnSigningKey
} from "@skytruth/shared-datasets/server";

// getCurrentUserSession, getAppTierGrants, and readPmtilesSigningKey are
// app-owned helpers that this package does not provide.
export default createNextPmtilesSessionHandler({
  getViewer: async req => {
    const session = await getCurrentUserSession(req);
    if (!session) return null; // no signed-in user
    return {
      email: session.user.email,
      emailVerified: session.user.emailVerified,
      tierGrants: await getAppTierGrants(session.user)
    };
  },
  getSigningKey: async () =>
    decodePmtilesCdnSigningKey(await readPmtilesSigningKey())
});

Tier authorization is decided by isViewerAuthorizedForTier (exported from the main entrypoint): public allows anyone, private allows any authenticated viewer, and internal requires a verified email in the allowed domain list (default skytruth.org) or an unexpired app-level tierGrants entry. Use the same function for API payload filtering and metadata URL entitlement checks so cookie issuance and payload gates cannot drift apart. Guest-granted cookies are automatically clamped to the grant expiry.
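To make the policy concrete, here is a sketch of the decision as described in the paragraph above. It is not the package's isViewerAuthorizedForTier: the function name, the options object, and the fields inside each grant (tier, expiresAt) are assumptions made for illustration.

```typescript
type AccessTier = "public" | "private" | "internal";

// Hypothetical grant shape; expiresAt is epoch milliseconds.
type TierGrant = { tier: AccessTier; expiresAt: number };
type Viewer = { email: string; emailVerified: boolean; tierGrants?: TierGrant[] };

function isAuthorizedSketch(
  viewer: Viewer | null,
  tier: AccessTier,
  options: { allowedDomains?: string[]; now?: number } = {}
): boolean {
  if (tier === "public") return true; // anyone
  if (viewer === null) return false; // restricted tiers need a viewer
  if (tier === "private") return true; // any authenticated viewer

  // internal: verified email in an allowed domain, or an unexpired grant
  const allowedDomains = options.allowedDomains ?? ["skytruth.org"];
  const now = options.now ?? Date.now();
  const at = viewer.email.lastIndexOf("@");
  const domain = at > 0 ? viewer.email.slice(at + 1).toLowerCase() : "";
  if (viewer.emailVerified && allowedDomains.includes(domain)) return true;
  return (viewer.tierGrants ?? []).some(
    grant => grant.tier === "internal" && grant.expiresAt > now
  );
}
```

In a real application, prefer the exported function so that cookie issuance and payload filtering share one implementation.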

For non-Next runtimes, call the framework-neutral handler directly and apply the returned { status, headers, cookies, body } to your response. The low-level helpers (getPmtilesSessionCookiesForTiers, getExpiredPmtilesCookies, decodePmtilesCdnSigningKey) remain available for custom routes. Cookie helpers return arrays; send each returned string as a separate Set-Cookie header — do not comma-join the array into one header value.
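The separate-header rule can be sketched independently of any framework. The result type below mirrors the { status, headers, cookies, body } shape named above, but the exact field types are assumptions, and toHeaderPairs is a hypothetical helper.

```typescript
// Assumed field types for the handler result described above.
type SessionHandlerResult = {
  status: number;
  headers: Record<string, string>;
  cookies: string[];
  body: unknown;
};

// Flattens a result into ordered [name, value] pairs, emitting one
// Set-Cookie pair per cookie string instead of a comma-joined value.
function toHeaderPairs(result: SessionHandlerResult): Array<[string, string]> {
  const pairs: Array<[string, string]> = Object.entries(result.headers);
  for (const cookie of result.cookies) {
    pairs.push(["Set-Cookie", cookie]);
  }
  return pairs;
}
```

With Node's http module, res.setHeader("Set-Cookie", result.cookies) accepts the array directly; with Fetch API Headers, call headers.append("Set-Cookie", cookie) once per cookie.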

The default cookie settings target SkyTruth's PMTiles CDN:

  • cookie name: Cloud-CDN-Cookie
  • cookie domain: .skytruth.org
  • tier cookie paths: /pmtiles/private, /pmtiles/internal
  • tier URL prefixes: https://tiles.skytruth.org/pmtiles/private/, https://tiles.skytruth.org/pmtiles/internal/
  • signing key name: shared-datasets-pmtiles-v1
  • TTL: 30 days (clamped to grant expiry for guest-granted internal access)

Override these values only for tests or an explicitly different CDN route by passing a partial config (tierPaths, ttlSeconds, ...) to the handler or cookie helpers.
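The clamping mentioned in the TTL bullet amounts to taking the smaller of the configured TTL and the time left on the grant. A minimal sketch, assuming grant expiry is available as an epoch-millisecond timestamp (the package's actual grant representation is not shown in this README); clampCookieTtlSeconds is a hypothetical helper:

```typescript
const DEFAULT_TTL_SECONDS = 30 * 24 * 60 * 60; // 30 days

// Hypothetical helper: nowMs and grantExpiresAtMs are epoch milliseconds.
// Without a grant expiry the configured TTL is returned unchanged.
function clampCookieTtlSeconds(
  nowMs: number,
  grantExpiresAtMs?: number,
  ttlSeconds: number = DEFAULT_TTL_SECONDS
): number {
  if (grantExpiresAtMs === undefined) return ttlSeconds;
  const remainingSeconds = Math.max(0, Math.floor((grantExpiresAtMs - nowMs) / 1000));
  return Math.min(ttlSeconds, remainingSeconds);
}
```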

Access-Tier Cache Helpers

Use createSharedDatasetAccessTierLookup when a server needs a lightweight cached lookup from asset slug to public, private, or internal:

import {
  createSharedDatasetAccessTierLookup,
  getAccessTiersFromSharedDatasetPmtilesRefs,
  resolveAllSharedDatasetPmtilesRefs
} from "@skytruth/shared-datasets";

const getAccessTier = createSharedDatasetAccessTierLookup({
  loadAccessTiers: async () =>
    getAccessTiersFromSharedDatasetPmtilesRefs(
      await resolveAllSharedDatasetPmtilesRefs()
    )
});

const tier = await getAccessTier("example-public-layer");

The default cache TTL is 5 minutes. Pass ttlMs and now to customize or test cache behavior.
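The caching behavior can be pictured as a small time-to-live wrapper around the loader. The sketch below is not the package implementation: it assumes loadAccessTiers resolves to a Map from slug to tier and that now is a function returning epoch milliseconds, and createTierLookupSketch is a hypothetical name.

```typescript
type AccessTier = "public" | "private" | "internal";

// Hypothetical TTL cache: reloads the whole tier map once it is older than ttlMs.
function createTierLookupSketch(options: {
  loadAccessTiers: () => Promise<Map<string, AccessTier>>;
  ttlMs?: number;
  now?: () => number;
}): (slug: string) => Promise<AccessTier | undefined> {
  const ttlMs = options.ttlMs ?? 5 * 60 * 1000; // 5 minutes, as documented
  const now = options.now ?? (() => Date.now());
  let cached: { tiers: Map<string, AccessTier>; loadedAt: number } | null = null;

  return async slug => {
    let entry = cached;
    if (entry === null || now() - entry.loadedAt >= ttlMs) {
      // Simplification: concurrent callers may each trigger a reload.
      entry = { tiers: await options.loadAccessTiers(), loadedAt: now() };
      cached = entry;
    }
    return entry.tiers.get(slug);
  };
}
```

Injecting now lets a test advance a fake clock and confirm that the loader runs again only after the TTL has elapsed.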

createCatalogSharedDatasetAccessTierLookup wires the same lookup to the shared datasets catalog directly, so the common server case is one call:

import { createCatalogSharedDatasetAccessTierLookup } from "@skytruth/shared-datasets";

const getAccessTier = createCatalogSharedDatasetAccessTierLookup();
const tier = await getAccessTier("example-public-layer");

It accepts the standard catalog fetch options (catalogUrl, fetchJson) plus ttlMs and now.

Filtering Private Rows Out Of Untrusted Payloads

When a server endpoint returns rows derived from shared datasets to unauthenticated or otherwise untrusted requesters, use filterPrivateSharedDatasetRows instead of hand-rolling tier checks. It applies the standard access policy: rows from non-public datasets are dropped, rows whose tier cannot be resolved are dropped (fail closed), and rows without an asset slug pass through unchanged.

import {
  createCatalogSharedDatasetAccessTierLookup,
  filterPrivateSharedDatasetRows
} from "@skytruth/shared-datasets";

const getAccessTier = createCatalogSharedDatasetAccessTierLookup();

const { rows, tierLookupFailed } = await filterPrivateSharedDatasetRows(
  candidateRows, // each row carries an `assetSlug` field by default
  { getAccessTier }
);

Pass getAssetSlug when rows store the dataset slug under a different field. tierLookupFailed reports that at least one row was dropped because its tier could not be resolved: the result is safe to serve but over-filtered, so skip long-lived caching of it (otherwise a transient catalog outage pins a degraded payload until the cache expires).
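The fail-closed policy is easiest to see in a stripped-down form. The function below is an illustrative approximation, not filterPrivateSharedDatasetRows itself: it assumes rows carry an optional assetSlug, treats a thrown or empty lookup as unresolved, and uses the hypothetical name filterRowsSketch.

```typescript
type AccessTier = "public" | "private" | "internal";
type Row = { assetSlug?: string | null };

async function filterRowsSketch<T extends Row>(
  rows: T[],
  getAccessTier: (slug: string) => Promise<AccessTier | null | undefined>
): Promise<{ rows: T[]; tierLookupFailed: boolean }> {
  const kept: T[] = [];
  let tierLookupFailed = false;
  for (const row of rows) {
    if (!row.assetSlug) {
      kept.push(row); // no asset slug: passes through unchanged
      continue;
    }
    let tier: AccessTier | null | undefined;
    try {
      tier = await getAccessTier(row.assetSlug);
    } catch {
      tier = undefined; // a failed lookup counts as unresolved
    }
    if (tier == null) {
      tierLookupFailed = true; // fail closed: drop the row and flag it
      continue;
    }
    if (tier === "public") kept.push(row); // non-public rows are dropped
  }
  return { rows: kept, tierLookupFailed };
}
```

A caller can then key its cache lifetime off tierLookupFailed, for example serving the filtered rows immediately but skipping any long-lived cache write when the flag is true.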

Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| npm install @skytruth/shared-datasets returns 404 | The registry, scope, or package name is wrong, or npm has a transient registry issue. | Verify npm view @skytruth/shared-datasets version and use the public npm registry. |
| Browser bundle includes node:crypto | The server entrypoint was imported into client code. | Move signing helpers behind a backend route and import browser helpers from the main entrypoint only. |
| Private PMTiles session succeeds but tiles fail | The PMTiles library's internal range requests are missing credentials. | Configure or wrap its fetch implementation so all PMTiles requests use credentials: "include". |
| Private PMTiles return 401 or 403 | User is unauthenticated, unauthorized, or the signed cookie is missing/expired. | Re-call the session endpoint and verify the backend authorization path. |
| Public PMTiles fail with a cookie/session error | Public layers are unnecessarily using the private session path. | Skip ensurePmtilesCdnSession for known public layers or pass the catalog accessTier accurately. |

Development

Install package dependencies and run tests:

npm ci
npm test

Check the publish artifact before a release:

npm pack --dry-run