@ananotherdeveloper/drain3js

v0.9.11-rev1

Published

15 days ago

TypeScript port of Drain3 - persistent & streaming log template miner

ananotherdeveloper

drain3 drain log parser template miner log-parsing log-template log-clustering streaming

drain3js

TypeScript port of Drain3 - a persistent and streaming log template miner that uses a fixed-depth parse tree.

What is Drain3?

Drain3 extracts log templates from raw log messages in a streaming fashion. Given logs like:

Connected to 10.0.0.1
Connected to 10.0.0.2
Disk error on /dev/sda1
Disk error on /dev/sdb2

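It mines templates (the `<IP>` placeholder appears when an IP masking rule is configured; other variable tokens become the `<*>` wildcard):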

Connected to <IP>
Disk error on <*>

Based on the paper: "Drain: An Online Log Parsing Approach with Fixed Depth Tree" by Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu (ICWS 2017).
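To make the mining step concrete, here is a minimal sketch of the two token-level operations the Drain paper describes: scoring a log line against a template, and generalizing the template where tokens differ. The names `WILDCARD`, `similarity`, and `mergeTemplate` are illustrative and not part of this package's API, and the sketch leaves out the parse tree that Drain uses to find candidate templates.

```typescript
// Illustrative helpers only; these are not exported by drain3js.
const WILDCARD = '<*>';

// Fraction of positions where the template token equals the log token.
// Sequences of different length never match; wildcard positions are not counted.
function similarity(template: string[], tokens: string[]): number {
  if (template.length !== tokens.length) return 0;
  if (template.length === 0) return 1;
  let same = 0;
  for (let i = 0; i < template.length; i++) {
    if (template[i] !== WILDCARD && template[i] === tokens[i]) same++;
  }
  return same / template.length;
}

// Keep tokens that agree and replace tokens that differ with the wildcard.
// Callers are expected to pass sequences of equal length.
function mergeTemplate(template: string[], tokens: string[]): string[] {
  return template.map((tok, i) => (tok === tokens[i] ? tok : WILDCARD));
}

const first = 'Disk error on /dev/sda1'.split(' ');
const second = 'Disk error on /dev/sdb2'.split(' ');

const score = similarity(first, second); // 3 of 4 tokens agree: 0.75
const merged = score >= 0.4 ? mergeTemplate(first, second).join(' ') : null;
// merged === 'Disk error on <*>'
```

The `0.4` cutoff mirrors the default `simTh` listed under Configuration; a higher threshold produces more, narrower templates.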

Installation

npm install @ananotherdeveloper/drain3js

Quick Start

import { TemplateMiner, TemplateMinerConfig } from '@ananotherdeveloper/drain3js';

const config = new TemplateMinerConfig();
const miner = new TemplateMiner(null, config);

const logs = [
  'Connected to 10.0.0.1',
  'Connected to 10.0.0.2',
  'Disk error on /dev/sda1',
  'Disk error on /dev/sdb2',
];

for (const log of logs) {
  const result = miner.addLogMessage(log);
  console.log(`Template: ${result.templateMined} (cluster #${result.clusterId})`);
}

Features

Streaming: Process logs one at a time - no need to batch
Persistent state: Save and restore miner state via pluggable persistence handlers
Masking: Regex-based masking to normalize IPs, numbers, hex values, etc. before mining
Two algorithms: Standard Drain (token-count tree) and JaccardDrain (Jaccard similarity)
Parameter extraction: Extract variable parameters from logs using mined templates
Memory efficient: Optional LRU-based cluster eviction with configurable max clusters
Inference mode: Match logs against existing templates without creating new clusters
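For readers unfamiliar with the similarity measure behind the second engine, the sketch below computes a generic Jaccard index over token sets. It is a textbook definition, not code from this package, and the exact scoring inside `JaccardDrain` may differ; `jaccard` is a hypothetical helper name.

```typescript
// Generic Jaccard index over token sets: |A ∩ B| / |A ∪ B|.
// Two empty inputs are treated as identical (score 1).
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let intersection = 0;
  setA.forEach((tok) => {
    if (setB.has(tok)) intersection++;
  });
  const union = setA.size + setB.size - intersection;
  return union === 0 ? 1 : intersection / union;
}

const shortMsg = 'user logged in'.split(' ');
const longMsg = 'user logged in from console'.split(' ');

const overlap = jaccard(shortMsg, longMsg); // 3 shared tokens / 5 distinct tokens = 0.6
```

Unlike a position-by-position comparison, a set-based score ignores token order and still yields a value when the two messages differ in length.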

Configuration

Use TemplateMinerConfig to customize behavior:

import { TemplateMiner, TemplateMinerConfig } from '@ananotherdeveloper/drain3js';

const config = new TemplateMinerConfig();

// Load from a JSON config object
config.load({
  drain: {
    engine: 'Drain',         // 'Drain' or 'JaccardDrain'
    simTh: 0.4,              // Similarity threshold (default: 0.4)
    depth: 4,                // Parse tree depth (default: 4, minimum: 3)
    maxChildren: 100,        // Max children per node (default: 100)
    maxClusters: null,       // Max clusters, null = unlimited (default: null)
    extraDelimiters: ['_'],  // Additional token delimiters (default: [])
    parametrizeNumericTokens: true,
  },
  masking: {
    instructions: [
      { regexPattern: '((?:\\d+\\.){3}\\d+)', maskWith: 'IP' },
      { regexPattern: '0x[a-fA-F0-9]+', maskWith: 'HEX' },
      { regexPattern: '\\d+', maskWith: 'NUM' },
    ],
    maskPrefix: '<',
    maskSuffix: '>',
  },
  snapshot: {
    snapshotIntervalMinutes: 5,
    compressState: true,
  },
});

const miner = new TemplateMiner(null, config);

You can also load configuration from a JSON file:

config.load('/path/to/config.json');
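The effect of the `extraDelimiters` option is easiest to see on a concrete message. The sketch below assumes each extra delimiter is replaced by a space before the message is split on whitespace, which is how the Python Drain3 tokenizes; whether this port does exactly the same is an assumption, and `tokenize` is a hypothetical helper, not a package export.

```typescript
// Sketch of tokenization with extra delimiters (assumed behavior, see lead-in).
function tokenize(message: string, extraDelimiters: string[] = []): string[] {
  let content = message.trim();
  for (const delimiter of extraDelimiters) {
    // Replace every occurrence of the delimiter with a space.
    content = content.split(delimiter).join(' ');
  }
  // Split on runs of whitespace and drop empty tokens.
  return content.split(/\s+/).filter((tok) => tok.length > 0);
}

const plain = tokenize('job_42 finished');            // ['job_42', 'finished']
const withDelim = tokenize('job_42 finished', ['_']); // ['job', '42', 'finished']
```

Splitting `job_42` into `job` and `42` lets the numeric part vary independently, at the cost of a higher token count per message.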

Persistence

Save and restore miner state using persistence handlers:

import { TemplateMiner, TemplateMinerConfig, FilePersistence } from '@ananotherdeveloper/drain3js';

const persistence = new FilePersistence('./drain3_state.bin');
const config = new TemplateMinerConfig();
const miner = new TemplateMiner(persistence, config);

// State is automatically saved based on snapshotIntervalMinutes
// and whenever a new cluster is created

Built-in persistence handlers:

| Handler | Description |
|---------|-------------|
| `FilePersistence` | Saves state to a local file |
| `MemoryBufferPersistence` | Stores state in memory (useful for testing) |

Implement the PersistenceHandler interface for custom storage (Redis, S3, etc.):

import { PersistenceHandler } from '@ananotherdeveloper/drain3js';

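class RedisPersistence implements PersistenceHandler {
  saveState(state: Buffer): void {
    // Write `state` to the backing store (Redis, S3, etc.).
  }
  loadState(): Buffer | null {
    return null; // Return the stored state, or null when none exists.
  }
}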

Masking

Mask sensitive or variable data before template mining:

config.load({
  masking: {
    instructions: [
      { regexPattern: '((?:\\d+\\.){3}\\d+)', maskWith: 'IP' },
      { regexPattern: '0x[a-fA-F0-9]+', maskWith: 'HEX' },
      { regexPattern: '(?<=user=)\\w+', maskWith: 'USER' },
    ],
  },
});

Input: Connected to 10.0.0.1 by user=admin

After masking: Connected to <IP> by user=<USER>
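The masking step can be sketched as a sequence of global regex replacements applied in the order the instructions are listed. This is a standalone illustration, not the package's implementation: `MaskingInstruction` mirrors the config shape shown above, and `applyMasks` is a hypothetical helper.

```typescript
// Mirrors the shape of one entry in `masking.instructions`.
interface MaskingInstruction {
  regexPattern: string;
  maskWith: string;
}

// Apply each instruction in order, replacing every match with <MASK_NAME>.
function applyMasks(
  line: string,
  instructions: MaskingInstruction[],
  prefix = '<',
  suffix = '>',
): string {
  return instructions.reduce(
    (masked, { regexPattern, maskWith }) =>
      // A replacer function avoids special handling of `$` in the replacement text.
      masked.replace(new RegExp(regexPattern, 'g'), () => `${prefix}${maskWith}${suffix}`),
    line,
  );
}

const rules: MaskingInstruction[] = [
  { regexPattern: '((?:\\d+\\.){3}\\d+)', maskWith: 'IP' },
  { regexPattern: '0x[a-fA-F0-9]+', maskWith: 'HEX' },
  { regexPattern: '(?<=user=)\\w+', maskWith: 'USER' },
];

const maskedLine = applyMasks('Connected to 10.0.0.1 by user=admin', rules);
// maskedLine === 'Connected to <IP> by user=<USER>'
```

Order matters in this sketch: a generic `\d+` rule placed before the IP rule would turn `10.0.0.1` into `<NUM>.<NUM>.<NUM>.<NUM>`, so specific patterns should come first.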

Parameter Extraction

Extract variable parameters from logs using mined templates:

const result = miner.addLogMessage('Connected to 10.0.0.1');
const params = miner.extractParameters(result.templateMined, 'Connected to 10.0.0.1');
// [{ value: '10.0.0.1', maskName: 'IP' }]
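One way to picture parameter extraction is to turn the template into a regular expression in which every placeholder becomes a capture group. The sketch below does that under simplifying assumptions: each parameter is a single whitespace-free token, and the helper names (`extractParams`, `escapeRegExp`, `ExtractedParameter`) are hypothetical. The package's own `extractParameters` may work differently.

```typescript
interface ExtractedParameter {
  value: string;
  maskName: string;
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns the parameters in template order, or null if the log does not fit the template.
function extractParams(template: string, logMessage: string): ExtractedParameter[] | null {
  const maskNames: string[] = [];
  // Split into literal text and <NAME> placeholders (the capture group keeps the placeholders).
  const pattern = template
    .split(/(<[^<>\s]+>)/)
    .map((part) => {
      const placeholder = /^<([^<>\s]+)>$/.exec(part);
      if (!placeholder) return escapeRegExp(part);
      maskNames.push(placeholder[1]);
      return '(\\S+?)'; // one whitespace-free token per placeholder
    })
    .join('');
  const match = new RegExp(`^${pattern}$`).exec(logMessage);
  if (!match) return null;
  return maskNames.map((maskName, i) => ({ value: match[i + 1], maskName }));
}

const extracted = extractParams('Connected to <IP>', 'Connected to 10.0.0.1');
// [{ value: '10.0.0.1', maskName: 'IP' }]
```

In this sketch a `<*>` wildcard is reported with the mask name `*`.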

Inference Mode

Match logs against existing templates without creating new clusters:

// Training phase
miner.addLogMessage('User alice logged in');
miner.addLogMessage('User bob logged in');

// Inference phase
const match = miner.match('User charlie logged in');
if (match) {
  console.log(`Matched template: ${match.getTemplate()}`);
}

The match() method supports three search strategies:

'never' (default) - Only search the tree path
'fallback' - Tree search first, then full search if no match
'always' - Always search all clusters for the token count
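The difference between the three strategies can be sketched with two candidate pools: the templates reachable through the parse-tree path and the full set of known templates. The code below is a simplified model, not the package's implementation. It assumes, as in the Python Drain3, that a log matches a template only when every token is equal or falls on a `<*>` wildcard, and the names `findMatch` and `matchesTemplate` are hypothetical.

```typescript
type SearchStrategy = 'never' | 'fallback' | 'always';

// A log matches when lengths agree and each token is equal or the template has a wildcard.
function matchesTemplate(template: string[], tokens: string[]): boolean {
  return (
    template.length === tokens.length &&
    template.every((tok, i) => tok === '<*>' || tok === tokens[i])
  );
}

function findMatch(
  treeCandidates: string[][], // templates reachable via the parse-tree path
  allTemplates: string[][],   // every known template
  tokens: string[],
  strategy: SearchStrategy = 'never',
): string[] | null {
  const search = (pool: string[][]): string[] | null =>
    pool.find((template) => matchesTemplate(template, tokens)) ?? null;

  if (strategy === 'always') return search(allTemplates);
  const viaTree = search(treeCandidates);
  if (viaTree !== null || strategy === 'never') return viaTree;
  return search(allTemplates); // 'fallback': the tree path found nothing
}

const known = [['User', '<*>', 'logged', 'in']];
const line = 'User charlie logged in'.split(' ');

// Simulate a tree path that misses the template (empty candidate pool).
const strict = findMatch([], known, line, 'never');      // null
const relaxed = findMatch([], known, line, 'fallback');  // ['User', '<*>', 'logged', 'in']
```

The trade-off is cost versus recall: `'never'` only inspects the tree path, while the other two may scan many more clusters.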

API Reference

`TemplateMiner`

| Method | Description |
|--------|-------------|
| `addLogMessage(logMessage)` | Process a log message and return mining result |
| `match(logMessage, strategy?)` | Match against existing templates |
| `extractParameters(template, logMessage)` | Extract parameters with mask names |
| `getParameterList(template, logMessage)` | Extract parameter values only |
| `saveState(reason)` | Manually save state |
| `loadState()` | Manually load state |

`MinerResult`

| Field | Type | Description |
|-------|------|-------------|
| `changeType` | string | 'cluster_created', 'cluster_template_changed', or 'none' |
| `clusterId` | number | ID of the matched/created cluster |
| `clusterSize` | number | Number of logs in the cluster |
| `templateMined` | string | The current template string |
| `clusterCount` | number | Total number of clusters |
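A common use of `changeType` is to react only when the set of templates changes, for example to flag a log shape that has not been seen before. The sketch below declares a local `MinerResultLike` interface that mirrors the fields in the table; it is not imported from the package, and narrowing `changeType` to a union of the three documented values is an assumption. `describeChange` is a hypothetical helper.

```typescript
// Local mirror of the MinerResult fields listed in the table.
interface MinerResultLike {
  changeType: 'cluster_created' | 'cluster_template_changed' | 'none';
  clusterId: number;
  clusterSize: number;
  templateMined: string;
  clusterCount: number;
}

// Returns a human-readable note when the template set changed, otherwise null.
function describeChange(result: MinerResultLike): string | null {
  switch (result.changeType) {
    case 'cluster_created':
      return `new template #${result.clusterId}: ${result.templateMined}`;
    case 'cluster_template_changed':
      return `template #${result.clusterId} generalized to: ${result.templateMined}`;
    default:
      return null; // 'none': the log fit an existing template unchanged
  }
}

const note = describeChange({
  changeType: 'cluster_created',
  clusterId: 1,
  clusterSize: 1,
  templateMined: 'Connected to <IP>',
  clusterCount: 1,
});
// note === 'new template #1: Connected to <IP>'
```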

Low-level API

Use Drain or JaccardDrain directly for advanced use cases:

import { Drain } from '@ananotherdeveloper/drain3js';

const drain = new Drain(4, 0.4, 100);
const [cluster, changeType] = drain.addLogMessage('Connected to 10.0.0.1');
console.log(cluster.getTemplate());

Attribution

This project is a TypeScript port of Drain3 by IBM Research.

The Drain algorithm is based on the paper:

Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu. "Drain: An Online Log Parsing Approach with Fixed Depth Tree," Proceedings of the 2017 IEEE International Conference on Web Services (ICWS), 2017.

License

MIT