drain3-js

A zero-dependency JavaScript/TypeScript port of Drain3, a log parsing algorithm that automatically clusters log messages into templates.

Input: an array of structured log entries { id, log, severity }.
Output: clustered templates grouped by severity, with log IDs preserved.

Installation

npm install drain3-js

Quick Start

import { clusterLogs } from 'drain3-js';

const logs = [
  { id: "req-1", log: "Requested is served to 3uedhx2wedock", severity: "INFO" },
  { id: "req-2", log: "Requested is served to 8fjsk29dkx", severity: "INFO" },
  { id: "req-3", log: "Requested is served to pq83nxlow2", severity: "INFO" },
  { id: "req-4", log: "Connection timeout for user abc123", severity: "ERROR" },
  { id: "req-5", log: "Connection timeout for user xyz789", severity: "ERROR" },
  { id: "req-6", log: "Connection timeout for user lmn456", severity: "WARN" },
  { id: "req-7", log: "Disk usage at 85% on server-01", severity: "WARN" },
  { id: "req-8", log: "Disk usage at 92% on server-03", severity: "WARN" },
];

const clusters = clusterLogs(logs);
console.log(clusters);

Output:

[
  {
    clusterId: 1,
    template: "Requested is served to <*>",
    severity: "INFO",
    count: 3,
    logs: [
      { id: "req-1", log: "Requested is served to 3uedhx2wedock" },
      { id: "req-2", log: "Requested is served to 8fjsk29dkx" },
      { id: "req-3", log: "Requested is served to pq83nxlow2" },
    ]
  },
  {
    clusterId: 2,
    template: "Connection timeout for user <*>",
    severity: "ERROR",
    count: 2,
    logs: [
      { id: "req-4", log: "Connection timeout for user abc123" },
      { id: "req-5", log: "Connection timeout for user xyz789" },
    ]
  },
  {
    clusterId: 3,
    template: "Connection timeout for user <*>",
    severity: "WARN",           // same template, separate cluster (different severity)
    count: 1,
    logs: [
      { id: "req-6", log: "Connection timeout for user lmn456" },
    ]
  },
  {
    clusterId: 4,
    template: "Disk usage at <*> on <*>",
    severity: "WARN",
    count: 2,
    logs: [
      { id: "req-7", log: "Disk usage at 85% on server-01" },
      { id: "req-8", log: "Disk usage at 92% on server-03" },
    ]
  }
]

API

clusterLogs(logs, options?)

Clusters an array of log entries into templates, grouped by severity.

clusterLogs(logs: LogEntry[], options?: DrainOptions): ClusterResult[]

| Parameter | Type | Description |
|-----------|------|-------------|
| logs | LogEntry[] | Array of structured log entries |
| options | DrainOptions | Optional configuration (see below) |

DrainOptions

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| simTh | number | 0.4 | Similarity threshold (0-1). Higher values create more clusters. |
| depth | number | 4 | Prefix tree depth (minimum 3). Higher values give more precise matching. |
| maxChildren | number | 100 | Maximum children per tree node. |
| maxClusters | number | undefined | Maximum clusters to keep. Oldest are evicted via LRU when exceeded. |
| extraDelimiters | string[] | [] | Additional characters to split tokens on (beyond whitespace). |
| masking | MaskingRule[] | [] | Regex rules to mask tokens before clustering. |
| snapshot | DrainSnapshot | undefined | Restore state from a previous toJSON() call. Used with createDrain(). |
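
For reference, here is a call that sets every option from the table (the values are purely illustrative, not recommendations; snapshot is omitted since it applies to createDrain()):

const clusters = clusterLogs(logs, {
  simTh: 0.5,              // merge logs sharing >= 50% of tokens
  depth: 4,                // prefix tree depth
  maxChildren: 100,        // cap on children per tree node
  maxClusters: 1000,       // LRU-evict the oldest clusters beyond this
  extraDelimiters: ["="],  // also split tokens on "="
  masking: [
    { regex: /\d+/g, maskWith: "NUM" },  // numbers appear as <:NUM:>
  ],
});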

LogEntry

| Field | Type | Description |
|-------|------|-------------|
| id | string \| number | Unique identifier for the log entry |
| log | string | The log message content |
| severity | string | Severity level (e.g. "INFO", "ERROR", "WARN") |

ClusterResult

Logs are clustered per severity: the same log pattern with different severities produces separate clusters.

| Field | Type | Description |
|-------|------|-------------|
| clusterId | number | Unique cluster identifier (assigned in creation order) |
| template | string | Log template with <*> wildcards for variable parts |
| severity | string | Severity shared by all logs in this cluster |
| count | number | Number of logs in this cluster |
| logs | Array<{ id, log }> | Original entries (id + log message) in input order |

MaskingRule

| Field | Type | Description |
|-------|------|-------------|
| regex | RegExp | Pattern to match (use g flag for global replacement) |
| maskWith | string | Label for the mask (appears as <:label:> in templates) |
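
Taken together, the tables above correspond roughly to these TypeScript shapes (an illustrative sketch; the exported type names in the package may differ):

interface LogEntry {
  id: string | number;  // unique identifier for the log entry
  log: string;          // the log message content
  severity: string;     // e.g. "INFO", "ERROR", "WARN"
}

interface ClusterResult {
  clusterId: number;   // assigned in creation order
  template: string;    // contains <*> wildcards for variable parts
  severity: string;    // shared by all logs in the cluster
  count: number;       // number of logs in the cluster
  logs: Array<{ id: string | number; log: string }>;  // input order preserved
}

interface MaskingRule {
  regex: RegExp;     // use the g flag for global replacement
  maskWith: string;  // appears as <:label:> in templates
}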

Incremental / Batch Processing

For processing logs in batches (e.g. streaming, tailing files), use createDrain() to maintain state across batches:

import { createDrain } from 'drain3-js';

const drain = createDrain({ simTh: 0.4 });

// Process first batch - returns results for this batch
const result1 = drain.addLogs([
  { id: 1, log: "connected to server-01", severity: "INFO" },
  { id: 2, log: "connected to server-02", severity: "INFO" },
  { id: 3, log: "error 404 on /api/users", severity: "ERROR" },
]);
// result1[0] = { template: "connected to <*>", severity: "INFO", count: 2, logs: [{id:1, ...}, {id:2, ...}] }
// result1[1] = { template: "error <*> on /api/users", severity: "ERROR", count: 1, logs: [{id:3, ...}] }

// Process second batch - builds on existing templates
const result2 = drain.addLogs([
  { id: 4, log: "connected to server-03", severity: "INFO" },
  { id: 5, log: "error 500 on /api/orders", severity: "ERROR" },
]);
// result2[0] = { template: "connected to <*>", severity: "INFO", count: 3, logs: [{id:4, ...}] }
// result2[1] = { template: "error <*> on <*>", severity: "ERROR", count: 2, logs: [{id:5, ...}] }
//                                                          ^ total count          ^ only this batch

Saving and Restoring State

Save the Drain state as a JSON snapshot and restore it later. Snapshots are lightweight - they store only templates, counts, and severity, not the original log strings.

// Save state
const snapshot = drain.toJSON();
const json = JSON.stringify(snapshot);
// Store `json` in a file, database, Redis, etc.

// Later: restore and continue
const restored = createDrain({ simTh: 0.4, snapshot: JSON.parse(json) });
const result3 = restored.addLogs(newBatch);
// Picks up where it left off - counts accumulate, templates evolve

A snapshot for 1,000 clusters is roughly 80 KB regardless of how many logs were processed.
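
A quick way to measure this for your own data (plain JSON serialization, not part of the API):

const snapshotBytes = new TextEncoder()
  .encode(JSON.stringify(drain.toJSON()))
  .length;
console.log(`snapshot size: ${snapshotBytes} bytes`);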

createDrain(options?)

Creates a stateful Drain instance for incremental processing.

Returns: DrainInstance

| Method | Returns | Description |
|--------|---------|-------------|
| addLogs(logs) | ClusterResult[] | Process a batch. Returns results with count = total across all batches, logs = only this batch. |
| getClusters() | ClusterResult[] | All current clusters with total counts. logs is empty. |
| toJSON() | DrainSnapshot | Lightweight snapshot for persistence. Pass as options.snapshot to restore. |
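
To illustrate the difference between addLogs() and getClusters() described in the table, continuing the drain instance from the batch example above:

// addLogs() returns results for the clusters touched by this batch,
// with `logs` limited to the batch's own entries.
drain.addLogs([{ id: 6, log: "connected to server-04", severity: "INFO" }]);

// getClusters() returns every cluster with total counts; `logs` is empty.
const all = drain.getClusters();
console.log(all.map(c => `${c.severity} ${c.template} x${c.count}`));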

Examples

Masking sensitive patterns

const clusters = clusterLogs(logs, {
  masking: [
    { regex: /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/g, maskWith: "IP" },
    { regex: /\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/g, maskWith: "UUID" },
  ],
});
// Templates will contain <:IP:> and <:UUID:> instead of <*>

Tuning similarity

// Strict: only very similar logs merge
const strict = clusterLogs(logs, { simTh: 0.8 });

// Loose: aggressively merge similar logs
const loose = clusterLogs(logs, { simTh: 0.2 });

Extra delimiters

// Split on = and ; in addition to whitespace
const clusters = clusterLogs(logs, {
  extraDelimiters: ["=", ";"],
});

Limiting cluster count

// Keep at most 1000 clusters, evicting oldest via LRU
const clusters = clusterLogs(logs, { maxClusters: 1000 });

How It Works

The Drain algorithm parses logs using a fixed-depth prefix tree:

  1. Preprocess - Optionally mask known patterns (IPs, UUIDs, etc.) with labeled placeholders
  2. Tokenize - Split log by whitespace (and any extra delimiters) into tokens
  3. Group by severity - Each severity gets its own prefix tree, so clusters never mix severities
  4. Tree lookup - Walk the prefix tree: first by token count, then by leading tokens
  5. Similarity check - Compare with candidate clusters. If similarity >= threshold, merge
  6. Merge or create - Matched: update the template (differing tokens become <*>). Unmatched: create a new cluster. A simplified sketch of steps 5 and 6 follows below.
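
To make steps 5 and 6 concrete, here is a simplified sketch of the token-level similarity check and template merge (a minimal illustration of the idea, not the package's internal code):

// Similarity: fraction of positions where the tokens match;
// existing <*> wildcards in the template count as matches here.
function similarity(template: string[], tokens: string[]): number {
  if (template.length !== tokens.length) return 0;  // different lengths never merge
  let same = 0;
  for (let i = 0; i < template.length; i++) {
    if (template[i] === "<*>" || template[i] === tokens[i]) same++;
  }
  return same / template.length;
}

// Merge: positions that differ become <*> wildcards.
function mergeTemplate(template: string[], tokens: string[]): string[] {
  return template.map((t, i) => (t === tokens[i] ? t : "<*>"));
}

// e.g. mergeTemplate(["error","404","on","/api/users"], ["error","500","on","/api/orders"])
//      -> ["error", "<*>", "on", "<*>"]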

Compatibility

  • Node.js 18+
  • Cloudflare Workers
  • Modern browsers (ES2020)
  • Zero dependencies

License

MIT