@x12i/countex
v1.8.3
Published
Generic sparse counter matrices for JSON events. Dynamic counting, aggregation, and retrieval from mapped JSON inputs.
Maintainers
Readme
@x12i/countex
Generic sparse counter matrices for JSON events. Dynamic counting, aggregation, and retrieval from mapped JSON inputs.
@x12i/countex is a generic mapped-event counter: a source-agnostic counting engine. You give it any JSON event, a declarative mapping config, and storage adapters; it projects events into sparse counter cells, batches deltas locally, and persists them atomically to a durable store (MongoDB) and a hot cache (Redis).
It does not know about users, organizations, documents, labels, or events. It only knows generic primitives: scope, subject, metric, dimensions, timeframe, delta. Whatever your domain calls those things — Countex doesn't care.
Terminology: metric is the generic Countex name for the thing being counted or categorized. In your domain that may be a label, status, event name, operation, outcome, class, model result, or business category — the metrics array in a map simply lists JSON paths that resolve to those categories. Storage and APIs refer to dictionary sourceKey (the string extracted from the event); avoid confusing that with application-level UUIDs unless your map uses UUIDs as keys.
Process memory vs retained state: Each worker keeps unflushed deltas in an in-memory buffer. After flush, authoritative aggregates live in durable storage (MongoDB); Redis holds a hot cache. Restarting a worker loses only buffered deltas that were not flushed yet — tune flush intervals accordingly.
event JSON ──► map ──► raw increments ──► buffer ──► flush ──► Mongo (durable)
└───► Redis (hot cache)
◄── query ◄── Redis (full coverage) or ◄── Mongo (fallback) ──◄Install
npm install @x12i/countex
# Bring your own clients — peer deps, both optional:
npm install mongodb ioredisNode 18+. Zero runtime dependencies in the core package.
Documentation shipped with the package
These guides are included in the published tarball under docs/ and exposed as stable import subpaths (resolve to .md files for editors and tooling):
| Topic | Package subpath | File |
|-------|-----------------|------|
| Release & production deployment | @x12i/countex/docs/deployment | docs/DEPLOYMENT.md |
| Tutorial — install → map → ingest → query | @x12i/countex/docs/end-to-end-example | docs/END_TO_END_EXAMPLE.md |
| Feature guides index | @x12i/countex/docs/guides | docs/guides/README.md |
| Counter maps | @x12i/countex/docs/guides/counter-maps | docs/guides/counter-maps.md |
| Preview a map (dry-run) | @x12i/countex/docs/guides/preview | docs/guides/preview.md |
| Ingestion & flush | @x12i/countex/docs/guides/ingestion-and-flush | docs/guides/ingestion-and-flush.md |
| Timeframes | @x12i/countex/docs/guides/timeframes-and-bucketing | docs/guides/timeframes-and-bucketing.md |
| Measures | @x12i/countex/docs/guides/measures | docs/guides/measures.md |
| Querying | @x12i/countex/docs/guides/querying | docs/guides/querying.md |
| Memory adapters | @x12i/countex/docs/guides/adapter-memory | docs/guides/adapter-memory.md |
| Mongo durable | @x12i/countex/docs/guides/adapter-mongo | docs/guides/adapter-mongo.md |
| Redis cache | @x12i/countex/docs/guides/adapter-redis | docs/guides/adapter-redis.md |
| Cardinality | @x12i/countex/docs/guides/cardinality | docs/guides/cardinality.md |
| Idempotency | @x12i/countex/docs/guides/idempotency | docs/guides/idempotency.md |
| Backup export | @x12i/countex/docs/guides/backup-export | docs/guides/backup-export.md |
| Client events | @x12i/countex/docs/guides/client-events | docs/guides/client-events.md |
| Insights overview | @x12i/countex/docs/guides/insights-overview | docs/guides/insights-overview.md |
| Insights ranking | @x12i/countex/docs/guides/insights-ranking | docs/guides/insights-ranking.md |
| Insights comparison & trends | @x12i/countex/docs/guides/insights-comparison-and-trends | docs/guides/insights-comparison-and-trends.md |
| Insights anomalies & rules | @x12i/countex/docs/guides/insights-anomalies-and-rules | docs/guides/insights-anomalies-and-rules.md |
| Insights forecasting & budget | @x12i/countex/docs/guides/insights-forecasting-and-budget | docs/guides/insights-forecasting-and-budget.md |
| Insights advisors | @x12i/countex/docs/guides/insights-advisors-and-optimization | docs/guides/insights-advisors-and-optimization.md |
| CLI | @x12i/countex/docs/guides/cli | docs/guides/cli.md |
| Simulation runtime (JS API) | @x12i/countex/simulation | dist/simulation/ |
| Simulation overview | @x12i/countex/docs/guides/simulation | docs/guides/simulation.md |
| Simulation profiles | @x12i/countex/docs/guides/simulation-profiles | docs/guides/simulation-profiles.md |
| Simulation resource estimation | @x12i/countex/docs/guides/simulation-resource-estimation | docs/guides/simulation-resource-estimation.md |
Use cases (technical playbooks: contracts, fan-out math, exact QuerySpecs, production wiring, verification — see docs/use-cases/README.md):
| Topic | Package subpath | File |
|-------|-----------------|------|
| Use cases index | @x12i/countex/docs/use-cases | docs/use-cases/README.md |
| Audit & security logs (volume, situations, distinct entities, logical event totals) | @x12i/countex/docs/use-cases/audit-security-log-counts | docs/use-cases/audit-security-log-counts.md |
| Product & feature activity | @x12i/countex/docs/use-cases/product-feature-usage | docs/use-cases/product-feature-usage.md |
| LLM cost & token metering | @x12i/countex/docs/use-cases/llm-cost-and-token-metering | docs/use-cases/llm-cost-and-token-metering.md |
| Multi-tenant isolation | @x12i/countex/docs/use-cases/multi-tenant-isolation | docs/use-cases/multi-tenant-isolation.md |
| Idempotent webhooks & queues | @x12i/countex/docs/use-cases/webhook-idempotent-ingestion | docs/use-cases/webhook-idempotent-ingestion.md |
| Ops dashboards & rollups | @x12i/countex/docs/use-cases/ops-dashboard-daily-rollups | docs/use-cases/ops-dashboard-daily-rollups.md |
| Processing & workflow activity | @x12i/countex/docs/use-cases/processing-workflow-activity | docs/use-cases/processing-workflow-activity.md |
Internals (how the engine works — diagrams and traces) live under docs/internals/ in the tarball; browse docs/internals/README.md.
After install: node_modules/@x12i/countex/docs/.
The mental model
Countex turns mapped JSON events into sparse aggregate cells.
A counter cell is uniquely identified by:
(scope, counterId, timeframeType, timeframeKey, subjectType, subjectId, metricId, dimensionsHash)Each event can produce one or many cell increments. A counter increment is defined by:
- scope — your tenant / org / account boundary
- counterId — which counter family this is (e.g.
access_by_subject_metric) - timeframe —
minute/hour/day/week/month/quarter/year/all - subject — the thing being counted against (e.g. an actor or a target)
- metric — the count category / label
- dimensions — arbitrary additional breakdowns
- delta —
+1,-1, or any integer
Countex buffers increments locally, flushes atomic deltas to durable + cache, and serves queries from cache with durable fallback.
Measures and rates
Maps may declare optional measures: named numeric fields per event (cost, tokens, latency, …) with aggregations sum, count, min, and max. avg is computed at query time when both sum and count exist on the measure. Use scale on a measure for fixed-point storage (e.g. cents). Extracted values attach to each fan-out increment the same way as delta (see RFC-style docs for fan-out semantics).
Queries accept measures (explicit list or "*"), orderBy on a measure aggregation, and optional rate (count or a measure’s sum divided by elapsed time in the queried window; for format: "timeseries", per bucket). rate is invalid for timeframe all.
Redis cache stores sum and count per measure only; min / max are durable-backed. If a query needs min/max (including orderBy on them, or requesting measures that maintain min/max when using "*"), the planner reads from durable first so cache cannot drift.
bucketRange caps at 10,000 buckets; at minute resolution that is about 6.94 days — keep queries within that bound or narrow the range.
Percentiles & histograms
Core cells store additive measure aggregates (sum / count / min / max) per sparse cell. They do not store distributions, so percentile-style answers (p50/p90/p95/p99) can only come from min/max approximation or from histogram/sketch storage you add yourself. Insights advisors that mention percentile-shaped recommendations are heuristic; see docs/guides/insights-overview.md §"Distributions" and docs/DEPLOYMENT.md "Architecture decisions".
What kinds of events Countex is for
Countex is not tied to one event source or domain. It is a generic mapped-event counter: any JSON-like input stream where useful aggregates can be produced from paths in the event.
Typical sources include:
| Source family | Examples | Common Countex projection | |---|---|---| | Audit and security logs | Microsoft 365 audit logs, Google Workspace logs, Okta logs, cloud audit logs | actor/resource/action/situation/severity | | Product and user activity | Woopra, Segment, PostHog-style events, application activity logs | user/account/feature/event/page/funnel | | Processing and workflow activity | ETL runs, background jobs, agent graphs, workflow engines, queue processors | job/task/node/status/error/runtime | | LLM and AI service usage | Model gateway logs, OpenAI/Anthropic calls, internal LLM services | app/user/model/provider/tokens/cost/latency/result | | Business/domain events | CRM, support tickets, payments, commerce, lifecycle events | customer/object/state/category/outcome |
Countex only needs a map. The map tells Countex where to find:
- the scope boundary
- the event time
- the subjects
- the metrics
- the dimensions
- the delta/count
- optional numeric measures such as cost, tokens, latency, or duration
The original event can come from a log, webhook, queue message, analytics event, system activity, processing trace, or generated test fixture. Countex does not care about the event source as long as the map can project it into generic counter primitives.
Audit-log style event
{
"tenant": { "id": "org_123" },
"event": {
"id": "audit_001",
"createdAt": "2026-05-08T14:33:00Z",
"operation": "FileAccessed",
"situation": "external"
},
"actor": { "id": "user_42", "type": "user" },
"resource": { "id": "file_99", "type": "document" },
"labels": ["PII", "Financial"]
}Counts:
actor/resource × label × operation/situation/resourceType × timeframeProduct-activity style event
{
"account": { "id": "acct_123" },
"user": { "id": "user_42" },
"event": {
"name": "dashboard_viewed",
"createdAt": "2026-05-08T14:33:00Z"
},
"product": {
"feature": "risk_dashboard",
"plan": "enterprise"
}
}Counts:
user/account × activity metric × feature/plan × timeframeLLM-processing style event
{
"scope": { "id": "org_123" },
"request": {
"id": "llm_req_001",
"createdAt": "2026-05-08T14:33:00Z",
"status": "success"
},
"app": { "id": "graphs-studio" },
"model": {
"provider": "openai",
"name": "gpt-4.1"
},
"usage": {
"inputTokens": 1200,
"outputTokens": 350,
"costCents": 4,
"latencyMs": 1800
}
}Counts/measures:
app/model/provider/status × timeframe
sum inputTokens
sum outputTokens
sum costCents
avg latencyMs
count requestsQuick start
import { createCountex } from "@x12i/countex";
import { MemoryDurable, MemoryCache } from "@x12i/countex/adapters/memory";
const accessMap = {
counterId: "access_by_subject_metric",
scope: { path: "$.tenant.id" },
eventTime: { path: "$.event.createdAt" },
subjects: [
{ type: "actor", path: "$.actor.id", required: true },
{ type: "target", path: "$.target.id", required: false },
],
metrics: [
{ path: "$.labels[*]", required: true },
],
dimensions: [
{ name: "eventType", path: "$.event.type" },
{ name: "situation", path: "$.event.situation" },
{ name: "targetType", path: "$.target.type" },
],
delta: { default: 1 },
timeframes: ["day", "week", "month", "quarter", "year"],
} as const;
const countex = createCountex({
maps: [accessMap],
durable: new MemoryDurable(),
cache: new MemoryCache(),
});
await countex.ingest({
tenant: { id: "org_123" },
event: { type: "share", situation: "external", createdAt: "2026-05-08T14:33:00Z" },
actor: { id: "user_42" },
target: { id: "doc_99", type: "document" },
labels: ["PCI", "PII"],
});
await countex.flush();
const result = await countex.query({
scope: "org_123",
counterId: "access_by_subject_metric",
timeframe: { type: "day", from: "2026-05-01", to: "2026-05-08" },
groupBy: ["subject", "metric", "dim:eventType"],
format: "table",
orderBy: "-count",
limit: 100,
});
console.log(result.rows);
await countex.shutdown();That single ingest produces:
- 1 actor + 1 target = 2 subjects
- 2 labels = 2 metrics
- 5 timeframes
- → 2 × 2 × 5 = 20 cell increments
Each cell uses an atomic $inc on durable and HINCRBY on the cache, so 20 ingest workers can safely run in parallel.
Preview a map (dry-run)
Use countex.preview(event, { counterId? }) to inspect what one event would produce without any I/O — no dictionary allocation, no buffering, no writes. Returns one MapPreviewResult per matching map, each with cells: PreviewIncrementCell[] (subjects, metrics, dimensions, dimensionsHash, timeframe keys, deltas, measures), extractionWarnings, and droppedAfterCardinality. See docs/guides/preview.md.
Production setup (Mongo + Redis)
Do not derive Mongo database names directly from raw tenant strings (length, $, /, and Unicode can break DB naming rules or leak structure). Hash or normalize first:
import { createHash } from "node:crypto";
import { MongoClient } from "mongodb";
import IORedis from "ioredis";
import { createCountex } from "@x12i/countex";
import { MongoDurable } from "@x12i/countex/adapters/mongo";
import { RedisCache } from "@x12i/countex/adapters/redis";
function hashScopeId(scope: string): string {
return createHash("sha256").update(scope, "utf8").digest("hex").slice(0, 16);
}
const mongo = new MongoClient(process.env.MONGO_URI!);
await mongo.connect();
const redis = new IORedis(process.env.REDIS_URL!);
const countex = createCountex({
maps: [accessMap],
durable: new MongoDurable({
client: mongo,
dbResolver: ({ scope }) => `countex_${hashScopeId(scope)}`,
}),
cache: new RedisCache({
client: redis,
keyPrefix: "countex",
// Optional adapter default TTLs; when `retention.redis` is set below, the
// engine resolves those strings to seconds and forwards them via
// `setRetentionPolicy`, overriding adapter defaults for matching timeframes.
}),
retention: {
redis: {
minute: "48h",
hour: "14d",
day: "92d",
week: "52w",
month: "24mo",
quarter: "24mo",
year: "forever",
all: "forever",
},
// Durable TTL / prune is not enforced by the engine — operators apply Mongo
// TTL indexes or scheduled jobs. `retention.mongo` is optional per-timeframe
// documentation (same duration grammar as redis); see types / DEPLOYMENT.
},
flush: {
everyMs: 60_000,
maxBufferedCells: 100_000,
maxBufferedEvents: 50_000,
flushOnShutdown: true,
},
cardinality: {
maxDimensionValuesPerName: 10_000,
onLimit: "overflow",
overflowValue: "__OTHER__",
},
});
await countex.init();
process.on("SIGTERM", async () => { await countex.shutdown(); });Mapping reference
A CounterMap is the heart of the package. Everything Countex knows about your data is in here.
| Field | Type | Notes |
|---------------|----------------------------|-------|
| counterId | string | Stable identifier for this counter family. |
| scope | { path } | Path into the event for the tenant/scope id. Required. |
| eventTime | { path } | Path to a Date / ISO string / epoch ms. Drives bucket selection. |
| subjects | SubjectMapping[] | One or many. Each has type and path; multi-value paths fan out. |
| metrics | MetricMapping[] | Same fanout semantics. $.labels[*] is common. |
| measures | MeasureMapping[] | Optional numeric fields per event (sum / count / min / max); see Measures. |
| dimensions | DimensionMapping[] | Named extra breakdowns. Optional by default. |
| delta | { path?, default? } | Numeric value at runtime. Default 1. Negative deltas are fine. |
| timeframes | TimeframeType[] | Subset of minute / hour / day / week / month / quarter / year / all. |
| idempotency | IdempotencyPolicy | Optional; needs a cache adapter that implements markEventSeen. |
| cardinality | CardinalityPolicy | Optional; per-map override of engine-level defaults. |
| options | MappingOptions | skipMissingOptionalDimensions (default true), normalizeValues (default true). |
Path syntax
A small JSONPath subset:
| Syntax | Meaning |
|-------------------|--------------------------------------------------|
| $ | Root. |
| .name | Property. |
| ['name'] | Property (bracket form, allows arbitrary chars). |
| [N] | Numeric array index (negatives allowed). |
| [*] | Wildcard — fan out across array elems / object values. |
Examples:
$.tenant.id
$.actor.id
$.labels[*]
$.items[*].id
$['weird.key.with.dots']Fanout
A single event with 2 subjects and 3 labels (metrics) and 5 selected timeframes produces 2 × 3 × 5 = 30 atomic increments. This is intentional — read your mapping carefully. Countex emits a dropped event when cardinality protection bites, so you'll know.
Storage adapters
Countex defines two contracts and ships three adapters.
interface DurableStorage {
applyDeltas(deltas: PendingDelta[]): Promise<void>;
readCells(spec: QuerySpec): Promise<CounterCell[]>;
resolveIds(lookups: DictionaryLookup[]): Promise<Map<string, number>>;
streamCells(filter): AsyncIterable<CounterCell>;
init?(): Promise<void>;
close?(): Promise<void>;
}
interface CacheStorage {
applyDeltas(deltas: PendingDelta[]): Promise<void>;
readCells(spec: QuerySpec): Promise<{ cells: CounterCell[]; coverage: "full" | "partial" | "miss" }>;
warm(cells: CounterCell[]): Promise<void>;
markEventSeen?(scopeId, eventId, ttlSeconds): Promise<boolean>;
init?(): Promise<void>;
close?(): Promise<void>;
}Built-in adapters
| Module | Purpose |
|-----------------------------------|--------------------------------------|
| @x12i/countex/adapters/memory | In-process durable + cache; tests / local dev. MemoryCache + Mongo is single-process only (init warns). |
| @x12i/countex/adapters/mongo | Durable. Requires mongodb peer. |
| @x12i/countex/adapters/redis | Hot cache. Requires ioredis peer. |
| @x12i/countex/adapters/durable-cache | No Redis: each query reads durable once. Low volume / one machine. |
Bring your own clients — Countex doesn't open or close them. That keeps lifecycle in the host app's hands.
Mongo schema
countex_cells:
{
"scopeId": "org_123",
"counterId": "access_by_subject_metric",
"timeframeType": "day",
"timeframeKey": "2026-05-08",
"subjectType": "actor",
"subjectId": 10001,
"metricId": 41,
"dimensionsHash": "a8f21c0b1d2e",
"dimensions": { "eventType": "share", "situation": "external" },
"count": 42,
"createdAt": "...",
"updatedAt": "..."
}Indexes (created automatically on first write):
uniq_cell_identity(unique, on the full identity tuple)by_timeframe— for range scansby_subject— for subject filtersby_metric— for metric filters
countex_dict — the per-scope subject/metric dictionary, with monotonic ids allocated via countex_seq. Allocation is concurrency-safe by construction: read existing → atomic sequence $inc → insertMany({ ordered: false }) → on duplicate-key, re-read. See docs/guides/adapter-mongo.md §"Dictionary allocation (resolveIds) and concurrency" and docs/internals/internal-storage-mongo.md.
dimensionsHash is canonicalized so any two workers produce the same hash for the same logical dimension tuple: keys sorted lexicographically, values type-tagged (n: / b: / i: / s:), serialized, hashed (SHA-1, truncated to 12 hex). See docs/internals/internal-dimension-hash.md.
Redis layout
countex:{scope}:{counterId}:{timeframeType}:{timeframeKey} → HASH
field = "{subjectType}|{subjectId}|{metricId}|{dimensionsHash}"
value = count
countex:{scope}:{counterId}:{timeframeType}:{timeframeKey}:cov → HASH (coverage meta)
status = complete | partial | rebuilding (see BucketCoverageMeta; reads may infer missing)
lastWarmAt = epoch ms (string)
mapVersion = optional opaque integer (string)
schemaVersion = cache layout version (string)
cellCount = cells last warmed into this bucket (string, when known)
countex:{scope}:{counterId}:...:{timeframeKey}:meta:{field} → JSON of dimensions
countex:{scope}:dedupe:{eventId} → "1" with TTLOne hash per (scope, counterId, bucket) keeps Redis key counts small and per-bucket reads to a single HGETALL. Do not treat “cell hash is non-empty” as full cache coverage — use the :cov hash (status === complete after warm, partial after incremental applyDeltas, rebuilding during cache rebuild). See docs/guides/adapter-redis.md and docs/internals/internal-storage-redis.md.
Query API
const result = await countex.query({
scope: "org_123",
counterId: "access_by_subject_metric",
timeframe: {
type: "day",
from: "2026-05-01",
to: "2026-05-08",
},
// Filters — all optional (subject.ids / metrics: internal numbers and/or dictionary source strings)
subject: { type: "actor", ids: [10001, 10002, "user_uuid_a"] },
metrics: [41, 42, "PCI"],
dimensions: { eventType: ["share", "download"], situation: ["external"] },
// Shape
groupBy: ["subject", "metric", "dim:eventType"],
format: "table", // "cells" | "table" | "matrix" | "timeseries"
orderBy: "-count",
limit: 100,
// Optional: use dictionary source keys in grouped keys / cells instead of numeric ids only
resolveLabels: true,
// Diagnostics
bypassCache: false,
signal: AbortSignal.timeout(5_000),
});Filters: subject.ids and metrics may mix internal numeric ids (from ingest) and string source keys (UUIDs, labels, and so on). String subject keys require subject.type. Set resolveLabels: true to return source keys in grouped keys.subject / keys.metric, and subjectSourceKey / metricSourceKey on format: "cells" rows. See docs/guides/querying.md.
explainQuery runs the planner without materializing cells. It returns a QueryPlan: expanded bucketsRequested, cacheEligible / cacheReason, predicted durableFallbackPredicted, cacheBucketsExpectedHit (when the cache adapter supports probeBucketCoverage), heuristic plannedCellsEstimate, and cost hints (expensive, groupByCost). Use it to preflight dashboards and expensive groupBy shapes.
const plan = await countex.explainQuery({
scope: "org_123",
counterId: "access_by_subject_metric",
timeframe: { type: "day", from: "2026-05-01", to: "2026-05-08" },
groupBy: ["subject", "metric"],
format: "table",
});Result:
{
format: "table",
rows: [
{ keys: { subject: 10001, metric: 41, "dim:eventType": "share" }, count: 142 },
// With resolveLabels: true (and known dictionary rows), subject/metric keys are strings:
// { keys: { subject: "user_42", metric: "PCI", "dim:eventType": "share" }, count: 142 },
...
],
total: 1234,
source: "cache" | "durable" | "merged",
}String filters are resolved to internal ids before reads; unknown source keys are omitted from the filter. With resolveLabels: true, format: "cells" rows include subjectSourceKey and metricSourceKey in addition to numeric ids.
groupBy accepts:
"subject","subjectType","metric","timeframeKey""dim:NAME"— group by a dimension value
Insights and advisors
Countex can optionally analyze stored counters and measures to produce higher-level insights. The core engine answers “what happened?” by maintaining sparse aggregate cells. The insights layer answers “what matters?” by ranking, comparing periods, detecting anomalies, forecasting usage, and emitting evidence-backed recommendations.
Typical uses:
- Top / bottom N subjects, metrics, or dimensions; top/bottom percent of groups; Pareto and contribution views
- Period-over-period comparison and simple trends on timeseries
- Explainable anomaly detection (robust z-score, thresholds, missing activity)
- Rule evaluation with normalized recommendations
- Cost / budget / max-token style advisors (generic measures only — no domain lock-in)
- Map and cardinality health when you pass the counter map into
createCountexInsights - Resource projections from observed cells via
estimateResourcesFromActuals(requires the durable adapter reference)
Safety: Insights honor maxBuckets (aligned with the core 10,000 bucket cap) and maxGroups. Grouped queries load all matching cells for the timeframe before aggregation, so narrow filters when possible.
Sparse counters: Bottom-N style rankings omit zero rows by default; Countex only stores cells that received increments.
import { createCountexInsights } from "@x12i/countex/insights";
const insights = createCountexInsights({
countex,
durable, // optional; required for estimateResourcesFromActuals → streamCells
mapsByCounterId: new Map(maps.map((m) => [m.counterId, m])),
});
const topCostDrivers = await insights.topPercent({
scope: "org_123",
counterId: "llm_usage",
timeframe: { type: "month", from: "2026-05", to: "2026-05" },
groupBy: ["subject"],
measure: "costMicros.sum",
percent: 20,
});Further reading: docs/guides/insights-overview.md, docs/guides/insights-ranking.md, docs/guides/insights-comparison-and-trends.md, docs/guides/insights-anomalies-and-rules.md, docs/guides/insights-forecasting-and-budget.md, docs/guides/insights-advisors-and-optimization.md. Deeper topic notes also live under docs/INSIGHTS*.md for maintainers.
CLI (after npm run build / install):
npx countex insights:top ./path/to/insight-context.mjs --scope org_123 --counter my_counter
# TypeScript configs: node --import=tsx node_modules/@x12i/countex/dist/cli.js insights:pareto ./ctx.tsEvents
The Countex client is an EventEmitter. Subscribe to observe what's happening:
countex.on("flush", ({ cells, durationMs }) => {});
countex.on("flush_error", ({ error }) => {});
countex.on("cardinality_warning", ({ kind, count, limit })=> {});
countex.on("overflow", ({ kind, sourceKey }) => {});
countex.on("dropped", ({ reason }) => {});
countex.on("duplicate", ({ scopeId, eventId }) => {});All event payloads are typed via CountexEvents.
Operational notes
Buffering and flushing
Each Countex instance keeps only a delta buffer in memory:
Map<cellKey, delta>Flush triggers (any of):
flush.everyMstimer (default 60s)flush.maxBufferedCellsreached (default 100k)flush.maxBufferedEventsreached (default 50k)shutdown()ifflushOnShutdownis true (default)- explicit
flush()
Flush is two-phase: durable first, cache second. If durable fails, the snapshot is merged back into the buffer — counts are not lost. If cache fails after durable succeeded, it's logged and continues; the next read warms cache from durable.
At most one flush runs at a time. Concurrent callers receive the same in-flight promise.
Cardinality protection
A bad mapping (e.g. a free-form URL as a dimension) can balloon the dictionary. Countex tracks per-(scope, kind, qualifier) how many distinct keys it has minted, and applies a policy when limits hit:
overflow(default) — rewrite the source key to__OTHER__. Emitsoverflow.drop— silently drop the increment. Emitsdropped.error— throwCardinalityLimitError.
Defaults are conservative; tune them per workload. A warning fires at 80% of the limit (cardinality_warning).
Idempotency
Optional and per-map. Requires a cache adapter that implements markEventSeen (the included RedisCache and MemoryCache both do).
idempotency: {
enabled: true,
eventIdPath: "$.event.id",
ttlSeconds: 86_400,
}Dedupe keys are namespaced by counterId, so different maps with their own idempotency don't collide on shared event ids.
Late events
Bucket selection always uses the event time, never the ingestion time. Late-arriving events go into the correct historical bucket — $inc doesn't care about ordering.
Concurrency safety
All writes are atomic per cell:
- Mongo:
bulkWriteofupdateOnewith$incandupsert: true - Redis:
HINCRBY
20+ Countex workers can safely run in parallel against the same scope. Reads are eventually consistent across the two stores; if Redis is behind durable, the planner will detect partial coverage and read durable.
Backup / verify / restore / rebuild
Top-level free functions and a countex.backup namespace bound to your engine:
import {
exportCells,
computeCellsFingerprint,
createBackupManifest,
verifyBackupAgainstManifest,
restoreSnapshotCells,
rebuildCache,
COUNTEX_CELL_SCHEMA_VERSION,
} from "@x12i/countex";
const filter = {
scope: "org_123",
counterId: "access_by_subject_metric",
timeframe: { type: "day", from: "2026-02-01", to: "2026-05-08" },
};
// Stream all cells in a window — bounded async iteration.
for await (const cell of exportCells(durable, filter)) {
// ndjson, S3, whatever you want
}
// Fingerprint + manifest for cold-storage verification.
const fp = await countex.backup.fingerprint(filter); // { cellCount, contentSha256, countSum }
const manifest = countex.backup.createManifest(filter, fp); // { manifestVersion, schemaVersion, filter, … }
// Verify a manifest against the live durable adapter.
const ok = await countex.backup.verify(manifest);
if (!ok.ok) throw new Error(ok.detail);
// Restore. Modes: "validate-only" | "replace-window" (default) | "merge-deltas" (reserved).
const { restored, mode } = await countex.backup.restore(
countex.backup.export(filter),
{ mode: "replace-window" },
);
// Re-warm cache from durable. Each warmed bucket's :cov flips to status=complete.
await countex.backup.rebuildCache(filter);See docs/guides/backup-export.md for the full recipe and manifest layout.
Docker integration testing
Countex ships a Docker-based stack (MongoDB, Redis, MinIO) and integration tests under tests/integration/docker/ that exercise real adapters: dictionary allocation under concurrency, Mongo lookupIds / reverseLookupIds, an end-to-end query() with source-key filters and resolveLabels, multi-worker ingestion, Redis cache coverage and durable fallback, rebuildCache, simulation fingerprints against a memory reference, and backup / manifest / verify / restore to S3-compatible storage (via the AWS SDK in devDependencies).
One-shot local run (starts services, runs tests with default localhost URLs, tears volumes down):
npm run test:e2eHost-runner (services in Docker, tests on the host):
npm run docker:up
export COUNTEX_MONGO_URI='mongodb://countex:[email protected]:27017/admin'
export COUNTEX_REDIS_URL='redis://127.0.0.1:6379'
export COUNTEX_S3_ENDPOINT='http://127.0.0.1:9000'
export COUNTEX_S3_REGION='us-east-1'
export COUNTEX_S3_ACCESS_KEY_ID='countex'
export COUNTEX_S3_SECRET_ACCESS_KEY='countex-password'
export COUNTEX_S3_BUCKET='countex-backups'
export COUNTEX_INTEGRATION=1
npm run test:integration
npm run docker:downOptional tuning: COUNTEX_TEST_PROFILE, COUNTEX_TEST_EVENTS, COUNTEX_TEST_WORKERS. Stress-style full pipeline (larger simulation + S3 round-trip): set COUNTEX_TEST_FULL=1 (requires S3 env as above).
Container-runner (CI parity): npm run test:integration:docker builds docker/test-runner.Dockerfile and runs npm run test:integration:exec with service hostnames injected by Compose.
Reports are written to artifacts/integration/report.json and artifacts/integration/report.md after npm run test:integration.
Errors
All errors thrown by Countex extend CountexError:
| Class | When |
|--------------------------|------|
| MappingError | Invalid mapping config (caught at engine startup). |
| ExtractionError | Malformed path or value during extraction. |
| CardinalityLimitError | Cardinality limit hit when policy is error. |
| StorageError | Wrapping a storage-adapter failure. |
Versioning & scope
Additive counts are not distinct counts. If
user_42appears in 100 events,count = 100, not 1 unique user. Counter cells store sums of deltas, so a query result ofcount = 100from a subject-grouped view means "100 increments hit this subject", not "100 distinct subjects". For exact distinct subject counts, group by["subjectType", "subject"]and userows.lengthafter retrieving every matching cell — seedocs/guides/querying.md§"Distinct subjects / metrics as row cardinality". Approximate distinct counts (HLL/sets) are a planned opt-in extension, not core.
v0.x is for stabilizing the public API. Once it's hardened:
- exact unique counts (HLL / sets) are an opt-in extension, not core
- complex expression deltas are not in core
- distributed locks / exactly-once: not Countex's job; durable storage is the source of truth and
$incmakes worker-count irrelevant
Repository layout
| Path | Role |
|------|------|
| src/index.ts | Public exports (createCountex, types, errors). |
| src/core/ | Engine: ingest pipeline, buffer, extraction, mapping validation, time buckets, cardinality. |
| src/query/ | Query planner (planQuery + executeQuery) and result formatting. |
| src/storage/ | Adapters: memory (in-process), mongo (durable), redis (cache); shared coverage helpers. |
| src/utils/ | JSONPath subset, hashing, logging. |
| src/backup/ | Export, fingerprint, manifest, verify, restore, cache rebuild helpers. |
| src/simulation/ | Simulation engine, profiles, expected aggregates, resource reports. |
| src/insights/ | Ranking, comparison, anomaly detection, advisors, recommendations. |
| src/cli.ts | CLI commands for simulation, insights, export, and operational helpers. |
| examples/ | Runnable demos (npx tsx examples/basic.ts). |
| docs/ | Deployment guide, tutorial, guides, internals, and use cases (published in the npm package). |
| tests/unit/ | Fast Node test runner tests (npm test). |
| tests/integration/ | Opt-in live tests against real services (npm run test:integration). |
Publishing (npm)
This package is scoped as @x12i/countex and is configured for public registry access (publishConfig.access in package.json).
Ensure you are logged in and allowed to publish under the
@x12iscope on npm.Put auth tokens only in
~/.npmrc— do not commit project.npmrcwith credentials (it is listed in.gitignore).From the repo root:
npm test npm publishprepublishOnlyruns a clean build so the tarball always contains a freshdist/.
The docs/ directory (DEPLOYMENT.md, END_TO_END_EXAMPLE.md) is part of the published package (files in package.json).
License
MIT — see LICENSE.
