@x12i/countex

v1.8.3

Published

6 days ago

Generic sparse counter matrices for JSON events. Dynamic counting, aggregation, and retrieval from mapped JSON inputs.

0High
0Medium
0Low

x12i

counters aggregation analytics metrics sparse-matrix mongo redis events timeseries

@x12i/countex

Generic sparse counter matrices for JSON events. Dynamic counting, aggregation, and retrieval from mapped JSON inputs.

@x12i/countex is a generic mapped-event counter: a source-agnostic counting engine. You give it any JSON event, a declarative mapping config, and storage adapters; it projects events into sparse counter cells, batches deltas locally, and persists them atomically to a durable store (MongoDB) and a hot cache (Redis).

It does not know about users, organizations, documents, labels, or events. It only knows generic primitives: scope, subject, metric, dimensions, timeframe, delta. Whatever your domain calls those things — Countex doesn't care.

Terminology: metric is the generic Countex name for the thing being counted or categorized. In your domain that may be a label, status, event name, operation, outcome, class, model result, or business category — the metrics array in a map simply lists JSON paths that resolve to those categories. Storage and APIs refer to dictionary sourceKey (the string extracted from the event); avoid confusing that with application-level UUIDs unless your map uses UUIDs as keys.

Process memory vs retained state: Each worker keeps unflushed deltas in an in-memory buffer. After flush, authoritative aggregates live in durable storage (MongoDB); Redis holds a hot cache. Restarting a worker loses only buffered deltas that were not flushed yet — tune flush intervals accordingly.

event JSON  ──►  map  ──►  raw increments  ──►  buffer  ──►  flush ──► Mongo (durable)
                                                                 └───► Redis (hot cache)

               ◄── query ◄── Redis (full coverage)  or  ◄── Mongo (fallback) ──◄

Install

npm install @x12i/countex
# Bring your own clients — peer deps, both optional:
npm install mongodb ioredis

Node 18+. Zero runtime dependencies in the core package.

Documentation shipped with the package

These guides are included in the published tarball under docs/ and exposed as stable import subpaths (resolve to .md files for editors and tooling):

| Topic | Package subpath | File | |-------|-----------------|------| | Release & production deployment | @x12i/countex/docs/deployment | docs/DEPLOYMENT.md | | Tutorial — install → map → ingest → query | @x12i/countex/docs/end-to-end-example | docs/END_TO_END_EXAMPLE.md | | Feature guides index | @x12i/countex/docs/guides | docs/guides/README.md | | Counter maps | @x12i/countex/docs/guides/counter-maps | docs/guides/counter-maps.md | | Preview a map (dry-run) | @x12i/countex/docs/guides/preview | docs/guides/preview.md | | Ingestion & flush | @x12i/countex/docs/guides/ingestion-and-flush | docs/guides/ingestion-and-flush.md | | Timeframes | @x12i/countex/docs/guides/timeframes-and-bucketing | docs/guides/timeframes-and-bucketing.md | | Measures | @x12i/countex/docs/guides/measures | docs/guides/measures.md | | Querying | @x12i/countex/docs/guides/querying | docs/guides/querying.md | | Memory adapters | @x12i/countex/docs/guides/adapter-memory | docs/guides/adapter-memory.md | | Mongo durable | @x12i/countex/docs/guides/adapter-mongo | docs/guides/adapter-mongo.md | | Redis cache | @x12i/countex/docs/guides/adapter-redis | docs/guides/adapter-redis.md | | Cardinality | @x12i/countex/docs/guides/cardinality | docs/guides/cardinality.md | | Idempotency | @x12i/countex/docs/guides/idempotency | docs/guides/idempotency.md | | Backup export | @x12i/countex/docs/guides/backup-export | docs/guides/backup-export.md | | Client events | @x12i/countex/docs/guides/client-events | docs/guides/client-events.md | | Insights overview | @x12i/countex/docs/guides/insights-overview | docs/guides/insights-overview.md | | Insights ranking | @x12i/countex/docs/guides/insights-ranking | docs/guides/insights-ranking.md | | Insights comparison & trends | @x12i/countex/docs/guides/insights-comparison-and-trends | docs/guides/insights-comparison-and-trends.md | | Insights anomalies & rules | @x12i/countex/docs/guides/insights-anomalies-and-rules | docs/guides/insights-anomalies-and-rules.md | | Insights forecasting & budget | @x12i/countex/docs/guides/insights-forecasting-and-budget | docs/guides/insights-forecasting-and-budget.md | | Insights advisors | @x12i/countex/docs/guides/insights-advisors-and-optimization | docs/guides/insights-advisors-and-optimization.md | | CLI | @x12i/countex/docs/guides/cli | docs/guides/cli.md | | Simulation runtime (JS API) | @x12i/countex/simulation | dist/simulation/ | | Simulation overview | @x12i/countex/docs/guides/simulation | docs/guides/simulation.md | | Simulation profiles | @x12i/countex/docs/guides/simulation-profiles | docs/guides/simulation-profiles.md | | Simulation resource estimation | @x12i/countex/docs/guides/simulation-resource-estimation | docs/guides/simulation-resource-estimation.md |

Use cases (technical playbooks: contracts, fan-out math, exact QuerySpecs, production wiring, verification — see docs/use-cases/README.md):

| Topic | Package subpath | File | |-------|-----------------|------| | Use cases index | @x12i/countex/docs/use-cases | docs/use-cases/README.md | | Audit & security logs (volume, situations, distinct entities, logical event totals) | @x12i/countex/docs/use-cases/audit-security-log-counts | docs/use-cases/audit-security-log-counts.md | | Product & feature activity | @x12i/countex/docs/use-cases/product-feature-usage | docs/use-cases/product-feature-usage.md | | LLM cost & token metering | @x12i/countex/docs/use-cases/llm-cost-and-token-metering | docs/use-cases/llm-cost-and-token-metering.md | | Multi-tenant isolation | @x12i/countex/docs/use-cases/multi-tenant-isolation | docs/use-cases/multi-tenant-isolation.md | | Idempotent webhooks & queues | @x12i/countex/docs/use-cases/webhook-idempotent-ingestion | docs/use-cases/webhook-idempotent-ingestion.md | | Ops dashboards & rollups | @x12i/countex/docs/use-cases/ops-dashboard-daily-rollups | docs/use-cases/ops-dashboard-daily-rollups.md | | Processing & workflow activity | @x12i/countex/docs/use-cases/processing-workflow-activity | docs/use-cases/processing-workflow-activity.md |

Internals (how the engine works — diagrams and traces) live under docs/internals/ in the tarball; browse docs/internals/README.md.

After install: node_modules/@x12i/countex/docs/.

The mental model

Countex turns mapped JSON events into sparse aggregate cells.

A counter cell is uniquely identified by:

(scope, counterId, timeframeType, timeframeKey, subjectType, subjectId, metricId, dimensionsHash)

Each event can produce one or many cell increments. A counter increment is defined by:

scope — your tenant / org / account boundary
counterId — which counter family this is (e.g. access_by_subject_metric)
timeframe — minute / hour / day / week / month / quarter / year / all
subject — the thing being counted against (e.g. an actor or a target)
metric — the count category / label
dimensions — arbitrary additional breakdowns
delta — +1, -1, or any integer

Countex buffers increments locally, flushes atomic deltas to durable + cache, and serves queries from cache with durable fallback.

Measures and rates

Maps may declare optional measures: named numeric fields per event (cost, tokens, latency, …) with aggregations sum, count, min, and max. avg is computed at query time when both sum and count exist on the measure. Use scale on a measure for fixed-point storage (e.g. cents). Extracted values attach to each fan-out increment the same way as delta (see RFC-style docs for fan-out semantics).

Queries accept measures (explicit list or "*"), orderBy on a measure aggregation, and optional rate (count or a measure’s sum divided by elapsed time in the queried window; for format: "timeseries", per bucket). rate is invalid for timeframe all.

Redis cache stores sum and count per measure only; min / max are durable-backed. If a query needs min/max (including orderBy on them, or requesting measures that maintain min/max when using "*"), the planner reads from durable first so cache cannot drift.

bucketRange caps at 10,000 buckets; at minute resolution that is about 6.94 days — keep queries within that bound or narrow the range.

Percentiles & histograms

Core cells store additive measure aggregates (sum / count / min / max) per sparse cell. They do not store distributions, so percentile-style answers (p50/p90/p95/p99) can only come from min/max approximation or from histogram/sketch storage you add yourself. Insights advisors that mention percentile-shaped recommendations are heuristic; see docs/guides/insights-overview.md §"Distributions" and docs/DEPLOYMENT.md "Architecture decisions".

What kinds of events Countex is for

Countex is not tied to one event source or domain. It is a generic mapped-event counter: any JSON-like input stream where useful aggregates can be produced from paths in the event.

Typical sources include:

| Source family | Examples | Common Countex projection | |---|---|---| | Audit and security logs | Microsoft 365 audit logs, Google Workspace logs, Okta logs, cloud audit logs | actor/resource/action/situation/severity | | Product and user activity | Woopra, Segment, PostHog-style events, application activity logs | user/account/feature/event/page/funnel | | Processing and workflow activity | ETL runs, background jobs, agent graphs, workflow engines, queue processors | job/task/node/status/error/runtime | | LLM and AI service usage | Model gateway logs, OpenAI/Anthropic calls, internal LLM services | app/user/model/provider/tokens/cost/latency/result | | Business/domain events | CRM, support tickets, payments, commerce, lifecycle events | customer/object/state/category/outcome |

Countex only needs a map. The map tells Countex where to find:

the scope boundary
the event time
the subjects
the metrics
the dimensions
the delta/count
optional numeric measures such as cost, tokens, latency, or duration

The original event can come from a log, webhook, queue message, analytics event, system activity, processing trace, or generated test fixture. Countex does not care about the event source as long as the map can project it into generic counter primitives.

Audit-log style event

{
  "tenant": { "id": "org_123" },
  "event": {
    "id": "audit_001",
    "createdAt": "2026-05-08T14:33:00Z",
    "operation": "FileAccessed",
    "situation": "external"
  },
  "actor": { "id": "user_42", "type": "user" },
  "resource": { "id": "file_99", "type": "document" },
  "labels": ["PII", "Financial"]
}

Counts:

actor/resource × label × operation/situation/resourceType × timeframe

Product-activity style event

{
  "account": { "id": "acct_123" },
  "user": { "id": "user_42" },
  "event": {
    "name": "dashboard_viewed",
    "createdAt": "2026-05-08T14:33:00Z"
  },
  "product": {
    "feature": "risk_dashboard",
    "plan": "enterprise"
  }
}

Counts:

user/account × activity metric × feature/plan × timeframe

LLM-processing style event

{
  "scope": { "id": "org_123" },
  "request": {
    "id": "llm_req_001",
    "createdAt": "2026-05-08T14:33:00Z",
    "status": "success"
  },
  "app": { "id": "graphs-studio" },
  "model": {
    "provider": "openai",
    "name": "gpt-4.1"
  },
  "usage": {
    "inputTokens": 1200,
    "outputTokens": 350,
    "costCents": 4,
    "latencyMs": 1800
  }
}

Counts/measures:

app/model/provider/status × timeframe
sum inputTokens
sum outputTokens
sum costCents
avg latencyMs
count requests

Quick start

import { createCountex } from "@x12i/countex";
import { MemoryDurable, MemoryCache } from "@x12i/countex/adapters/memory";

const accessMap = {
  counterId: "access_by_subject_metric",

  scope: { path: "$.tenant.id" },
  eventTime: { path: "$.event.createdAt" },

  subjects: [
    { type: "actor",  path: "$.actor.id",  required: true  },
    { type: "target", path: "$.target.id", required: false },
  ],

  metrics: [
    { path: "$.labels[*]", required: true },
  ],

  dimensions: [
    { name: "eventType",  path: "$.event.type"      },
    { name: "situation",  path: "$.event.situation" },
    { name: "targetType", path: "$.target.type"     },
  ],

  delta: { default: 1 },
  timeframes: ["day", "week", "month", "quarter", "year"],
} as const;

const countex = createCountex({
  maps: [accessMap],
  durable: new MemoryDurable(),
  cache: new MemoryCache(),
});

await countex.ingest({
  tenant: { id: "org_123" },
  event:  { type: "share", situation: "external", createdAt: "2026-05-08T14:33:00Z" },
  actor:  { id: "user_42" },
  target: { id: "doc_99", type: "document" },
  labels: ["PCI", "PII"],
});

await countex.flush();

const result = await countex.query({
  scope: "org_123",
  counterId: "access_by_subject_metric",
  timeframe: { type: "day", from: "2026-05-01", to: "2026-05-08" },
  groupBy: ["subject", "metric", "dim:eventType"],
  format: "table",
  orderBy: "-count",
  limit: 100,
});

console.log(result.rows);

await countex.shutdown();

That single ingest produces:

1 actor + 1 target = 2 subjects
2 labels = 2 metrics
5 timeframes
→ 2 × 2 × 5 = 20 cell increments

Each cell uses an atomic $inc on durable and HINCRBY on the cache, so 20 ingest workers can safely run in parallel.

Preview a map (dry-run)

Use countex.preview(event, { counterId? }) to inspect what one event would produce without any I/O — no dictionary allocation, no buffering, no writes. Returns one MapPreviewResult per matching map, each with cells: PreviewIncrementCell[] (subjects, metrics, dimensions, dimensionsHash, timeframe keys, deltas, measures), extractionWarnings, and droppedAfterCardinality. See docs/guides/preview.md.

Production setup (Mongo + Redis)

Do not derive Mongo database names directly from raw tenant strings (length, $, /, and Unicode can break DB naming rules or leak structure). Hash or normalize first:

import { createHash } from "node:crypto";
import { MongoClient } from "mongodb";
import IORedis from "ioredis";
import { createCountex } from "@x12i/countex";
import { MongoDurable } from "@x12i/countex/adapters/mongo";
import { RedisCache }   from "@x12i/countex/adapters/redis";

function hashScopeId(scope: string): string {
  return createHash("sha256").update(scope, "utf8").digest("hex").slice(0, 16);
}

const mongo = new MongoClient(process.env.MONGO_URI!);
await mongo.connect();
const redis = new IORedis(process.env.REDIS_URL!);

const countex = createCountex({
  maps: [accessMap],
  durable: new MongoDurable({
    client: mongo,
    dbResolver: ({ scope }) => `countex_${hashScopeId(scope)}`,
  }),
  cache: new RedisCache({
    client: redis,
    keyPrefix: "countex",
    // Optional adapter default TTLs; when `retention.redis` is set below, the
    // engine resolves those strings to seconds and forwards them via
    // `setRetentionPolicy`, overriding adapter defaults for matching timeframes.
  }),
  retention: {
    redis: {
      minute: "48h",
      hour: "14d",
      day: "92d",
      week: "52w",
      month: "24mo",
      quarter: "24mo",
      year: "forever",
      all: "forever",
    },
    // Durable TTL / prune is not enforced by the engine — operators apply Mongo
    // TTL indexes or scheduled jobs. `retention.mongo` is optional per-timeframe
    // documentation (same duration grammar as redis); see types / DEPLOYMENT.
  },
  flush: {
    everyMs: 60_000,
    maxBufferedCells: 100_000,
    maxBufferedEvents: 50_000,
    flushOnShutdown: true,
  },
  cardinality: {
    maxDimensionValuesPerName: 10_000,
    onLimit: "overflow",
    overflowValue: "__OTHER__",
  },
});

await countex.init();

process.on("SIGTERM", async () => { await countex.shutdown(); });

Mapping reference

A CounterMap is the heart of the package. Everything Countex knows about your data is in here.

| Field | Type | Notes | |---------------|----------------------------|-------| | counterId | string | Stable identifier for this counter family. | | scope | { path } | Path into the event for the tenant/scope id. Required. | | eventTime | { path } | Path to a Date / ISO string / epoch ms. Drives bucket selection. | | subjects | SubjectMapping[] | One or many. Each has type and path; multi-value paths fan out. | | metrics | MetricMapping[] | Same fanout semantics. $.labels[*] is common. | | measures | MeasureMapping[] | Optional numeric fields per event (sum / count / min / max); see Measures. | | dimensions | DimensionMapping[] | Named extra breakdowns. Optional by default. | | delta | { path?, default? } | Numeric value at runtime. Default 1. Negative deltas are fine. | | timeframes | TimeframeType[] | Subset of minute / hour / day / week / month / quarter / year / all. | | idempotency | IdempotencyPolicy | Optional; needs a cache adapter that implements markEventSeen. | | cardinality | CardinalityPolicy | Optional; per-map override of engine-level defaults. | | options | MappingOptions | skipMissingOptionalDimensions (default true), normalizeValues (default true). |

Path syntax

A small JSONPath subset:

| Syntax | Meaning | |-------------------|--------------------------------------------------| | $ | Root. | | .name | Property. | | ['name'] | Property (bracket form, allows arbitrary chars). | | [N] | Numeric array index (negatives allowed). | | [*] | Wildcard — fan out across array elems / object values. |

Examples:

$.tenant.id
$.actor.id
$.labels[*]
$.items[*].id
$['weird.key.with.dots']

Fanout

A single event with 2 subjects and 3 labels (metrics) and 5 selected timeframes produces 2 × 3 × 5 = 30 atomic increments. This is intentional — read your mapping carefully. Countex emits a dropped event when cardinality protection bites, so you'll know.

Storage adapters

Countex defines two contracts and ships three adapters.

interface DurableStorage {
  applyDeltas(deltas: PendingDelta[]): Promise<void>;
  readCells(spec: QuerySpec): Promise<CounterCell[]>;
  resolveIds(lookups: DictionaryLookup[]): Promise<Map<string, number>>;
  streamCells(filter): AsyncIterable<CounterCell>;
  init?(): Promise<void>;
  close?(): Promise<void>;
}

interface CacheStorage {
  applyDeltas(deltas: PendingDelta[]): Promise<void>;
  readCells(spec: QuerySpec): Promise<{ cells: CounterCell[]; coverage: "full" | "partial" | "miss" }>;
  warm(cells: CounterCell[]): Promise<void>;
  markEventSeen?(scopeId, eventId, ttlSeconds): Promise<boolean>;
  init?(): Promise<void>;
  close?(): Promise<void>;
}

Built-in adapters

| Module | Purpose | |-----------------------------------|--------------------------------------| | @x12i/countex/adapters/memory | In-process durable + cache; tests / local dev. MemoryCache + Mongo is single-process only (init warns). | | @x12i/countex/adapters/mongo | Durable. Requires mongodb peer. | | @x12i/countex/adapters/redis | Hot cache. Requires ioredis peer. | | @x12i/countex/adapters/durable-cache | No Redis: each query reads durable once. Low volume / one machine. |

Bring your own clients — Countex doesn't open or close them. That keeps lifecycle in the host app's hands.

Mongo schema

countex_cells:

{
  "scopeId": "org_123",
  "counterId": "access_by_subject_metric",
  "timeframeType": "day",
  "timeframeKey": "2026-05-08",
  "subjectType": "actor",
  "subjectId": 10001,
  "metricId": 41,
  "dimensionsHash": "a8f21c0b1d2e",
  "dimensions": { "eventType": "share", "situation": "external" },
  "count": 42,
  "createdAt": "...",
  "updatedAt": "..."
}

Indexes (created automatically on first write):

uniq_cell_identity (unique, on the full identity tuple)
by_timeframe — for range scans
by_subject — for subject filters
by_metric — for metric filters

countex_dict — the per-scope subject/metric dictionary, with monotonic ids allocated via countex_seq. Allocation is concurrency-safe by construction: read existing → atomic sequence $inc → insertMany({ ordered: false }) → on duplicate-key, re-read. See docs/guides/adapter-mongo.md §"Dictionary allocation (resolveIds) and concurrency" and docs/internals/internal-storage-mongo.md.

dimensionsHash is canonicalized so any two workers produce the same hash for the same logical dimension tuple: keys sorted lexicographically, values type-tagged (n: / b: / i: / s:), serialized, hashed (SHA-1, truncated to 12 hex). See docs/internals/internal-dimension-hash.md.

Redis layout

countex:{scope}:{counterId}:{timeframeType}:{timeframeKey}     →  HASH
  field = "{subjectType}|{subjectId}|{metricId}|{dimensionsHash}"
  value = count

countex:{scope}:{counterId}:{timeframeType}:{timeframeKey}:cov →  HASH (coverage meta)
  status        = complete | partial | rebuilding   (see BucketCoverageMeta; reads may infer missing)
  lastWarmAt    = epoch ms (string)
  mapVersion    = optional opaque integer (string)
  schemaVersion = cache layout version (string)
  cellCount     = cells last warmed into this bucket (string, when known)

countex:{scope}:{counterId}:...:{timeframeKey}:meta:{field}    →  JSON of dimensions

countex:{scope}:dedupe:{eventId}                               →  "1" with TTL

One hash per (scope, counterId, bucket) keeps Redis key counts small and per-bucket reads to a single HGETALL. Do not treat “cell hash is non-empty” as full cache coverage — use the :cov hash (status === complete after warm, partial after incremental applyDeltas, rebuilding during cache rebuild). See docs/guides/adapter-redis.md and docs/internals/internal-storage-redis.md.

Query API

const result = await countex.query({
  scope: "org_123",
  counterId: "access_by_subject_metric",

  timeframe: {
    type: "day",
    from: "2026-05-01",
    to:   "2026-05-08",
  },

  // Filters — all optional (subject.ids / metrics: internal numbers and/or dictionary source strings)
  subject:    { type: "actor", ids: [10001, 10002, "user_uuid_a"] },
  metrics:    [41, 42, "PCI"],
  dimensions: { eventType: ["share", "download"], situation: ["external"] },

  // Shape
  groupBy: ["subject", "metric", "dim:eventType"],
  format:  "table", // "cells" | "table" | "matrix" | "timeseries"
  orderBy: "-count",
  limit:   100,

  // Optional: use dictionary source keys in grouped keys / cells instead of numeric ids only
  resolveLabels: true,

  // Diagnostics
  bypassCache: false,
  signal: AbortSignal.timeout(5_000),
});

Filters: subject.ids and metrics may mix internal numeric ids (from ingest) and string source keys (UUIDs, labels, and so on). String subject keys require subject.type. Set resolveLabels: true to return source keys in grouped keys.subject / keys.metric, and subjectSourceKey / metricSourceKey on format: "cells" rows. See docs/guides/querying.md.

explainQuery runs the planner without materializing cells. It returns a QueryPlan: expanded bucketsRequested, cacheEligible / cacheReason, predicted durableFallbackPredicted, cacheBucketsExpectedHit (when the cache adapter supports probeBucketCoverage), heuristic plannedCellsEstimate, and cost hints (expensive, groupByCost). Use it to preflight dashboards and expensive groupBy shapes.

const plan = await countex.explainQuery({
  scope: "org_123",
  counterId: "access_by_subject_metric",
  timeframe: { type: "day", from: "2026-05-01", to: "2026-05-08" },
  groupBy: ["subject", "metric"],
  format: "table",
});

Result:

{
  format: "table",
  rows: [
    { keys: { subject: 10001, metric: 41, "dim:eventType": "share" }, count: 142 },
    // With resolveLabels: true (and known dictionary rows), subject/metric keys are strings:
    // { keys: { subject: "user_42", metric: "PCI", "dim:eventType": "share" }, count: 142 },
    ...
  ],
  total: 1234,
  source: "cache" | "durable" | "merged",
}

String filters are resolved to internal ids before reads; unknown source keys are omitted from the filter. With resolveLabels: true, format: "cells" rows include subjectSourceKey and metricSourceKey in addition to numeric ids.

groupBy accepts:

"subject", "subjectType", "metric", "timeframeKey"
"dim:NAME" — group by a dimension value

Insights and advisors

Countex can optionally analyze stored counters and measures to produce higher-level insights. The core engine answers “what happened?” by maintaining sparse aggregate cells. The insights layer answers “what matters?” by ranking, comparing periods, detecting anomalies, forecasting usage, and emitting evidence-backed recommendations.

Typical uses:

Top / bottom N subjects, metrics, or dimensions; top/bottom percent of groups; Pareto and contribution views
Period-over-period comparison and simple trends on timeseries
Explainable anomaly detection (robust z-score, thresholds, missing activity)
Rule evaluation with normalized recommendations
Cost / budget / max-token style advisors (generic measures only — no domain lock-in)
Map and cardinality health when you pass the counter map into createCountexInsights
Resource projections from observed cells via estimateResourcesFromActuals (requires the durable adapter reference)

Safety: Insights honor maxBuckets (aligned with the core 10,000 bucket cap) and maxGroups. Grouped queries load all matching cells for the timeframe before aggregation, so narrow filters when possible.

Sparse counters: Bottom-N style rankings omit zero rows by default; Countex only stores cells that received increments.

import { createCountexInsights } from "@x12i/countex/insights";

const insights = createCountexInsights({
  countex,
  durable, // optional; required for estimateResourcesFromActuals → streamCells
  mapsByCounterId: new Map(maps.map((m) => [m.counterId, m])),
});

const topCostDrivers = await insights.topPercent({
  scope: "org_123",
  counterId: "llm_usage",
  timeframe: { type: "month", from: "2026-05", to: "2026-05" },
  groupBy: ["subject"],
  measure: "costMicros.sum",
  percent: 20,
});

Further reading: docs/guides/insights-overview.md, docs/guides/insights-ranking.md, docs/guides/insights-comparison-and-trends.md, docs/guides/insights-anomalies-and-rules.md, docs/guides/insights-forecasting-and-budget.md, docs/guides/insights-advisors-and-optimization.md. Deeper topic notes also live under docs/INSIGHTS*.md for maintainers.

CLI (after npm run build / install):

npx countex insights:top ./path/to/insight-context.mjs --scope org_123 --counter my_counter
# TypeScript configs: node --import=tsx node_modules/@x12i/countex/dist/cli.js insights:pareto ./ctx.ts

Events

The Countex client is an EventEmitter. Subscribe to observe what's happening:

countex.on("flush",                ({ cells, durationMs }) => {});
countex.on("flush_error",          ({ error })             => {});
countex.on("cardinality_warning",  ({ kind, count, limit })=> {});
countex.on("overflow",             ({ kind, sourceKey })   => {});
countex.on("dropped",              ({ reason })            => {});
countex.on("duplicate",            ({ scopeId, eventId })  => {});

All event payloads are typed via CountexEvents.

Operational notes

Buffering and flushing

Each Countex instance keeps only a delta buffer in memory:

Map<cellKey, delta>

Flush triggers (any of):

flush.everyMs timer (default 60s)
flush.maxBufferedCells reached (default 100k)
flush.maxBufferedEvents reached (default 50k)
shutdown() if flushOnShutdown is true (default)
explicit flush()

Flush is two-phase: durable first, cache second. If durable fails, the snapshot is merged back into the buffer — counts are not lost. If cache fails after durable succeeded, it's logged and continues; the next read warms cache from durable.

At most one flush runs at a time. Concurrent callers receive the same in-flight promise.

Cardinality protection

A bad mapping (e.g. a free-form URL as a dimension) can balloon the dictionary. Countex tracks per-(scope, kind, qualifier) how many distinct keys it has minted, and applies a policy when limits hit:

overflow (default) — rewrite the source key to __OTHER__. Emits overflow.
drop — silently drop the increment. Emits dropped.
error — throw CardinalityLimitError.

Defaults are conservative; tune them per workload. A warning fires at 80% of the limit (cardinality_warning).

Idempotency

Optional and per-map. Requires a cache adapter that implements markEventSeen (the included RedisCache and MemoryCache both do).

idempotency: {
  enabled: true,
  eventIdPath: "$.event.id",
  ttlSeconds: 86_400,
}

Dedupe keys are namespaced by counterId, so different maps with their own idempotency don't collide on shared event ids.

Late events

Bucket selection always uses the event time, never the ingestion time. Late-arriving events go into the correct historical bucket — $inc doesn't care about ordering.

Concurrency safety

All writes are atomic per cell:

Mongo: bulkWrite of updateOne with $inc and upsert: true
Redis: HINCRBY

20+ Countex workers can safely run in parallel against the same scope. Reads are eventually consistent across the two stores; if Redis is behind durable, the planner will detect partial coverage and read durable.

Backup / verify / restore / rebuild

Top-level free functions and a countex.backup namespace bound to your engine:

import {
  exportCells,
  computeCellsFingerprint,
  createBackupManifest,
  verifyBackupAgainstManifest,
  restoreSnapshotCells,
  rebuildCache,
  COUNTEX_CELL_SCHEMA_VERSION,
} from "@x12i/countex";

const filter = {
  scope: "org_123",
  counterId: "access_by_subject_metric",
  timeframe: { type: "day", from: "2026-02-01", to: "2026-05-08" },
};

// Stream all cells in a window — bounded async iteration.
for await (const cell of exportCells(durable, filter)) {
  // ndjson, S3, whatever you want
}

// Fingerprint + manifest for cold-storage verification.
const fp = await countex.backup.fingerprint(filter);          // { cellCount, contentSha256, countSum }
const manifest = countex.backup.createManifest(filter, fp);   // { manifestVersion, schemaVersion, filter, … }

// Verify a manifest against the live durable adapter.
const ok = await countex.backup.verify(manifest);
if (!ok.ok) throw new Error(ok.detail);

// Restore. Modes: "validate-only" | "replace-window" (default) | "merge-deltas" (reserved).
const { restored, mode } = await countex.backup.restore(
  countex.backup.export(filter),
  { mode: "replace-window" },
);

// Re-warm cache from durable. Each warmed bucket's :cov flips to status=complete.
await countex.backup.rebuildCache(filter);

See docs/guides/backup-export.md for the full recipe and manifest layout.

Docker integration testing

Countex ships a Docker-based stack (MongoDB, Redis, MinIO) and integration tests under tests/integration/docker/ that exercise real adapters: dictionary allocation under concurrency, Mongo lookupIds / reverseLookupIds, an end-to-end query() with source-key filters and resolveLabels, multi-worker ingestion, Redis cache coverage and durable fallback, rebuildCache, simulation fingerprints against a memory reference, and backup / manifest / verify / restore to S3-compatible storage (via the AWS SDK in devDependencies).

One-shot local run (starts services, runs tests with default localhost URLs, tears volumes down):

npm run test:e2e

Host-runner (services in Docker, tests on the host):

npm run docker:up
export COUNTEX_MONGO_URI='mongodb://countex:[email protected]:27017/admin'
export COUNTEX_REDIS_URL='redis://127.0.0.1:6379'
export COUNTEX_S3_ENDPOINT='http://127.0.0.1:9000'
export COUNTEX_S3_REGION='us-east-1'
export COUNTEX_S3_ACCESS_KEY_ID='countex'
export COUNTEX_S3_SECRET_ACCESS_KEY='countex-password'
export COUNTEX_S3_BUCKET='countex-backups'
export COUNTEX_INTEGRATION=1
npm run test:integration
npm run docker:down

Optional tuning: COUNTEX_TEST_PROFILE, COUNTEX_TEST_EVENTS, COUNTEX_TEST_WORKERS. Stress-style full pipeline (larger simulation + S3 round-trip): set COUNTEX_TEST_FULL=1 (requires S3 env as above).

Container-runner (CI parity): npm run test:integration:docker builds docker/test-runner.Dockerfile and runs npm run test:integration:exec with service hostnames injected by Compose.

Reports are written to artifacts/integration/report.json and artifacts/integration/report.md after npm run test:integration.

Errors

All errors thrown by Countex extend CountexError:

| Class | When | |--------------------------|------| | MappingError | Invalid mapping config (caught at engine startup). | | ExtractionError | Malformed path or value during extraction. | | CardinalityLimitError | Cardinality limit hit when policy is error. | | StorageError | Wrapping a storage-adapter failure. |

Versioning & scope

Additive counts are not distinct counts. If user_42 appears in 100 events, count = 100, not 1 unique user. Counter cells store sums of deltas, so a query result of count = 100 from a subject-grouped view means "100 increments hit this subject", not "100 distinct subjects". For exact distinct subject counts, group by ["subjectType", "subject"] and use rows.length after retrieving every matching cell — see docs/guides/querying.md §"Distinct subjects / metrics as row cardinality". Approximate distinct counts (HLL/sets) are a planned opt-in extension, not core.

v0.x is for stabilizing the public API. Once it's hardened:

exact unique counts (HLL / sets) are an opt-in extension, not core
complex expression deltas are not in core
distributed locks / exactly-once: not Countex's job; durable storage is the source of truth and $inc makes worker-count irrelevant

Repository layout

| Path | Role | |------|------| | src/index.ts | Public exports (createCountex, types, errors). | | src/core/ | Engine: ingest pipeline, buffer, extraction, mapping validation, time buckets, cardinality. | | src/query/ | Query planner (planQuery + executeQuery) and result formatting. | | src/storage/ | Adapters: memory (in-process), mongo (durable), redis (cache); shared coverage helpers. | | src/utils/ | JSONPath subset, hashing, logging. | | src/backup/ | Export, fingerprint, manifest, verify, restore, cache rebuild helpers. | | src/simulation/ | Simulation engine, profiles, expected aggregates, resource reports. | | src/insights/ | Ranking, comparison, anomaly detection, advisors, recommendations. | | src/cli.ts | CLI commands for simulation, insights, export, and operational helpers. | | examples/ | Runnable demos (npx tsx examples/basic.ts). | | docs/ | Deployment guide, tutorial, guides, internals, and use cases (published in the npm package). | | tests/unit/ | Fast Node test runner tests (npm test). | | tests/integration/ | Opt-in live tests against real services (npm run test:integration). |

Publishing (npm)

This package is scoped as @x12i/countex and is configured for public registry access (publishConfig.access in package.json).

Ensure you are logged in and allowed to publish under the @x12i scope on npm.
Put auth tokens only in ~/.npmrc — do not commit project .npmrc with credentials (it is listed in .gitignore).
From the repo root:
```
npm test
npm publish
```
prepublishOnly runs a clean build so the tarball always contains a fresh dist/.

The docs/ directory (DEPLOYMENT.md, END_TO_END_EXAMPLE.md) is part of the published package (files in package.json).

License

MIT — see LICENSE.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@x12i/countex

Install

Documentation shipped with the package

The mental model

Measures and rates

Percentiles & histograms

What kinds of events Countex is for

Audit-log style event

Product-activity style event

LLM-processing style event

Quick start

Preview a map (dry-run)

Production setup (Mongo + Redis)

Mapping reference

Path syntax

Fanout

Storage adapters

Built-in adapters

Mongo schema

Redis layout

Query API

Insights and advisors

Events

Operational notes

Buffering and flushing

Cardinality protection

Idempotency

Late events

Concurrency safety

Backup / verify / restore / rebuild

Docker integration testing

Errors

Versioning & scope

Repository layout

Publishing (npm)

License