@modular-intelligence/log-analysis
v1.0.2
MCP server for log parsing & pattern detection
Log Analysis MCP Server
A Model Context Protocol (MCP) server for parsing, searching, and analyzing structured logs with automatic format detection. Written in pure TypeScript with no external CLI dependencies.
Features
- Multi-format log parsing: Automatically detect and parse JSON, BSD syslog, ArcSight CEF, and Apache/Nginx CLF formats
- Pattern matching: Search logs using regex patterns with severity and time range filters
- Timeline correlation: Generate chronological event timelines across multiple log sources
- Statistical analysis: Aggregate event counts by severity, source, and time period
- Anomaly detection: Identify frequency spikes, high error rates, logging gaps, and unusual patterns
- Pure TypeScript: No external CLI tools required—everything runs in-process
- Security hardened: File size validation (200MB max), regex validation, and path normalization
Supported Log Formats
The server automatically detects the log format from file content, or you can specify the format explicitly. Each format has its own field-extraction and timestamp-parsing rules.
JSON Structured Logs
Newline-delimited JSON objects with flexible field naming.
Example:
{"timestamp": "2024-01-15T10:23:45.123Z", "level": "ERROR", "service": "api-server", "message": "Database connection failed", "error_code": 5001}
Recognized timestamp fields: timestamp, time, @timestamp, date, ts
Recognized severity fields: level, severity, log_level, priority
Recognized source fields: source, logger, service, hostname, host
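As a sketch of how this kind of field normalization can work (illustrative TypeScript, not the server's actual internals — `normalizeJsonLine` and the helper names are assumptions), the parser can try each candidate field name in order:

```typescript
// Candidate field names, mirroring the lists documented above.
const TIMESTAMP_FIELDS = ["timestamp", "time", "@timestamp", "date", "ts"];
const SEVERITY_FIELDS = ["level", "severity", "log_level", "priority"];
const SOURCE_FIELDS = ["source", "logger", "service", "hostname", "host"];

interface NormalizedEvent {
  timestamp?: string;
  severity?: string;
  source?: string;
  message?: string;
  fields: Record<string, unknown>;
}

// Return the first candidate field that is present on the object.
function pickField(obj: Record<string, unknown>, candidates: string[]): string | undefined {
  for (const name of candidates) {
    if (obj[name] !== undefined) return String(obj[name]);
  }
  return undefined;
}

// Normalize one NDJSON line; a parse failure is reported as null
// (the server counts such lines as parse errors).
function normalizeJsonLine(line: string): NormalizedEvent | null {
  let obj: Record<string, unknown>;
  try {
    obj = JSON.parse(line);
  } catch {
    return null;
  }
  if (typeof obj !== "object" || obj === null) return null;
  return {
    timestamp: pickField(obj, TIMESTAMP_FIELDS),
    severity: pickField(obj, SEVERITY_FIELDS)?.toUpperCase(),
    source: pickField(obj, SOURCE_FIELDS),
    message: typeof obj.message === "string" ? obj.message : undefined,
    fields: obj,
  };
}
```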
BSD Syslog
Standard syslog format with facility, hostname, service, and message.
Example:
Jan 15 10:23:45 web-01 nginx[1234]: GET /api/users HTTP/1.1 - 502 Bad Gateway
Format: <month> <day> <time> <hostname> <service>[<pid>]: <message>
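A line in this format can be captured with a single regex. The sketch below is illustrative (the regex and `parseSyslogLine` are assumptions, not the server's actual code):

```typescript
// <month> <day> <time> <hostname> <service>[<pid>]: <message>
// The [<pid>] part is optional in real-world syslog output.
const SYSLOG_RE =
  /^([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2}) (\S+) ([^\[:]+)(?:\[(\d+)\])?: (.*)$/;

function parseSyslogLine(line: string) {
  const m = SYSLOG_RE.exec(line);
  if (!m) return null;
  const [, month, day, time, hostname, service, pid, message] = m;
  return { timestamp: `${month} ${day} ${time}`, hostname, service, pid, message };
}
```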
ArcSight CEF (Common Event Format)
Vendor-agnostic format for security event logging with key-value extensions.
Example:
CEF:0|Vendor|Product|1.0|12345|User Login|8|src=192.168.1.100 dst=10.0.0.5 user=admin cs1=success
Severity mapping: 0-3=DEBUG, 4-6=INFO, 7-8=ERROR, 9-10=CRITICAL
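The severity mapping and extension parsing can be sketched as follows (illustrative helpers, assumed rather than taken from the server's source):

```typescript
// Map a CEF numeric severity (0-10) to the documented levels.
function cefSeverity(n: number): string {
  if (n <= 3) return "DEBUG";
  if (n <= 6) return "INFO";
  if (n <= 8) return "ERROR";
  return "CRITICAL";
}

// Split the key=value extension string. Keys are word characters;
// a value runs until the next " key=" boundary or end of string.
function parseCefExtensions(ext: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const m of ext.matchAll(/(\w+)=(.*?)(?= \w+=|$)/g)) {
    fields[m[1]] = m[2];
  }
  return fields;
}
```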
Apache/Nginx Common Log Format (CLF)
Standard web server access log format.
Example:
192.168.1.100 - frank [15/Jan/2024:10:23:45 -0700] "GET /index.html HTTP/1.0" 200 2326
Format: <client_ip> <identity> <user> [<timestamp>] "<request>" <status> <size>
HTTP status code severity mapping: 500+=ERROR, 400-499=WARNING, all others=INFO
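A minimal sketch of CLF parsing plus this status-to-severity mapping (the regex and `parseClfLine` are illustrative assumptions):

```typescript
// <client_ip> <identity> <user> [<timestamp>] "<request>" <status> <size>
const CLF_RE = /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$/;

function parseClfLine(line: string) {
  const m = CLF_RE.exec(line);
  if (!m) return null;
  const status = Number(m[6]);
  // 500+ => ERROR, 400-499 => WARNING, all others => INFO
  const severity = status >= 500 ? "ERROR" : status >= 400 ? "WARNING" : "INFO";
  return {
    client_ip: m[1],
    user: m[3],
    timestamp: m[4],
    request: m[5],
    status,
    size: m[7] === "-" ? 0 : Number(m[7]),
    severity,
  };
}
```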
Tools
The server provides five specialized tools for log analysis:
| Tool | Purpose |
|------|---------|
| log_parse | Parse log file and extract normalized events |
| log_search | Search logs by pattern, severity, time range, and field values |
| log_timeline | Create chronological event timeline across multiple files |
| log_stats | Generate statistical summaries by severity, source, and time |
| log_anomalies | Detect statistical anomalies and logging gaps |
log_parse
Parse a log file and extract all events in normalized format.
Input Schema:
{
"file": "string (required, absolute path)",
"format": "string (auto|json|syslog|cef|clf, default: auto)",
"max_results": "integer (1-5000, default: 500)"
}
Example Input:
{
"file": "/var/log/app.log",
"format": "auto",
"max_results": 100
}
Example Output:
{
"file": "/var/log/app.log",
"format": "syslog",
"total_lines": 2450,
"parsed_events": 2398,
"parse_errors": 52,
"events": [
{
"timestamp": "Jan 15 10:23:45",
"source": "web-01",
"severity": "ERROR",
"message": "Connection timeout to database",
"raw": "Jan 15 10:23:45 web-01 app[1234]: Connection timeout to database",
"fields": {
"service": "app",
"pid": "1234"
},
"format": "syslog",
"line_number": 156
},
{
"timestamp": "Jan 15 10:24:12",
"source": "web-01",
"severity": "INFO",
"message": "Database connection re-established",
"raw": "Jan 15 10:24:12 web-01 app[1234]: Database connection re-established",
"fields": {
"service": "app",
"pid": "1234"
},
"format": "syslog",
"line_number": 157
}
],
"sample_errors": []
}
log_search
Search logs using regex patterns, severity levels, time ranges, and specific field values.
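Conceptually the search is a filter pipeline over parsed events. This is a sketch under assumed names (`searchEvents` and the event shape mirror the documented output, not the server's internals):

```typescript
interface LogEvent {
  timestamp: string;
  severity: string;
  message: string;
  raw: string;
}

// Apply regex, severity, and time-range filters, then truncate to max_results.
function searchEvents(
  events: LogEvent[],
  pattern: string,
  severity?: string,
  timeRange?: { start: string; end: string },
  maxResults = 500,
): LogEvent[] {
  const re = new RegExp(pattern, "i"); // patterns are matched case-insensitively
  return events
    .filter((e) => re.test(e.raw))
    .filter((e) => !severity || e.severity === severity)
    .filter((e) => {
      if (!timeRange) return true;
      const t = Date.parse(e.timestamp);
      return t >= Date.parse(timeRange.start) && t <= Date.parse(timeRange.end);
    })
    .slice(0, maxResults);
}
```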
Input Schema:
{
"file": "string (required, absolute path)",
"pattern": "string (required, regex pattern)",
"format": "string (auto|json|syslog|cef|clf, default: auto)",
"severity": "string (DEBUG|INFO|WARNING|ERROR|CRITICAL, optional)",
"time_range": {
"start": "string (ISO 8601 or common log format)",
"end": "string (ISO 8601 or common log format)"
},
"field": "string (search within specific field name, optional)",
"max_results": "integer (1-5000, default: 500)"
}
Example Input:
{
"file": "/var/log/app.log",
"pattern": "ERROR|timeout|failed",
"severity": "ERROR",
"time_range": {
"start": "2024-01-15T10:00:00Z",
"end": "2024-01-15T11:00:00Z"
},
"max_results": 50
}
Example Output:
{
"file": "/var/log/app.log",
"query": "ERROR|timeout|failed",
"total_matches": 3,
"events": [
{
"timestamp": "Jan 15 10:23:45",
"source": "web-01",
"severity": "ERROR",
"message": "Connection timeout to database",
"raw": "Jan 15 10:23:45 web-01 app[1234]: Connection timeout to database",
"fields": {
"service": "app",
"pid": "1234"
},
"format": "syslog",
"line_number": 156
},
{
"timestamp": "Jan 15 10:45:30",
"source": "web-02",
"severity": "ERROR",
"message": "Failed to authenticate with service",
"raw": "Jan 15 10:45:30 web-02 app[5678]: Failed to authenticate with service",
"fields": {
"service": "app",
"pid": "5678"
},
"format": "syslog",
"line_number": 289
}
]
}
log_timeline
Generate a sorted chronological timeline of events across multiple log files, useful for correlating events from different systems.
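The core idea is a merge-and-sort over per-file event lists. A minimal sketch, assuming events have already been parsed and timestamps normalized to ISO 8601 (`buildTimeline` and the event shape are illustrative):

```typescript
interface TimelineEvent {
  timestamp: string; // ISO 8601 after normalization
  source: string;
  severity: string;
  message: string;
}

// Merge events from all files, apply the optional severity filter,
// sort chronologically, and truncate to max_results.
function buildTimeline(
  perFile: Map<string, TimelineEvent[]>,
  severity?: string,
  maxResults = 500,
): TimelineEvent[] {
  const merged: TimelineEvent[] = [];
  for (const events of perFile.values()) {
    for (const e of events) {
      if (!severity || e.severity === severity) merged.push(e);
    }
  }
  merged.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  return merged.slice(0, maxResults);
}
```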
Input Schema:
{
"files": "string[] (required, array of 1-10 absolute paths)",
"format": "string (auto|json|syslog|cef|clf, default: auto)",
"severity": "string (DEBUG|INFO|WARNING|ERROR|CRITICAL, optional)",
"max_results": "integer (1-5000, default: 500)"
}
Example Input:
{
"files": [
"/var/log/app.log",
"/var/log/database.log",
"/var/log/auth.log"
],
"format": "auto",
"severity": "WARNING",
"max_results": 100
}
Example Output:
{
"files": [
"/var/log/app.log",
"/var/log/database.log",
"/var/log/auth.log"
],
"total_events": 45,
"time_range": {
"start": "2024-01-15T10:00:15Z",
"end": "2024-01-15T11:45:32Z"
},
"events": [
{
"timestamp": "2024-01-15T10:00:15Z",
"source": "/var/log/app.log",
"severity": "WARNING",
"message": "High memory usage detected",
"raw": "{\"timestamp\": \"2024-01-15T10:00:15Z\", \"level\": \"WARNING\", \"message\": \"High memory usage detected\"}",
"fields": {},
"format": "json",
"line_number": 42
},
{
"timestamp": "2024-01-15T10:15:45Z",
"source": "/var/log/database.log",
"severity": "WARNING",
"message": "Slow query detected: execution time 5.2s",
"raw": "Jan 15 10:15:45 db-01 mysql[9876]: Slow query detected: execution time 5.2s",
"fields": {
"service": "mysql",
"pid": "9876"
},
"format": "syslog",
"line_number": 512
},
{
"timestamp": "2024-01-15T10:22:10Z",
"source": "/var/log/auth.log",
"severity": "WARNING",
"message": "Multiple failed login attempts",
"raw": "Jan 15 10:22:10 auth-01 sshd[3456]: Failed password for invalid user admin",
"fields": {
"service": "sshd",
"pid": "3456"
},
"format": "syslog",
"line_number": 789
}
]
}
log_stats
Generate statistical summaries including event counts by severity, source, and hourly breakdown.
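The aggregation can be done in a single pass over the parsed events. A sketch producing the by_severity and events_per_hour shapes shown below (helper names and the hour-bucketing approach are assumptions):

```typescript
// Count events by severity and bucket them into hours in one pass.
function aggregate(events: { timestamp: string; severity: string }[]) {
  const bySeverity: Record<string, number> = {};
  const byHour: Record<string, number> = {};
  for (const e of events) {
    bySeverity[e.severity] = (bySeverity[e.severity] ?? 0) + 1;
    // Truncate an ISO timestamp to the hour, e.g. "2024-01-15 10:00".
    const hour = e.timestamp.slice(0, 13).replace("T", " ") + ":00";
    byHour[hour] = (byHour[hour] ?? 0) + 1;
  }
  const eventsPerHour = Object.entries(byHour)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([hour, count]) => ({ hour, count }));
  return { by_severity: bySeverity, events_per_hour: eventsPerHour };
}
```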
Input Schema:
{
"file": "string (required, absolute path)",
"format": "string (auto|json|syslog|cef|clf, default: auto)"
}
Example Input:
{
"file": "/var/log/app.log",
"format": "auto"
}
Example Output:
{
"file": "/var/log/app.log",
"total_events": 2398,
"time_range": {
"start": "2024-01-15T08:00:00.000Z",
"end": "2024-01-15T17:30:45.000Z"
},
"by_severity": {
"DEBUG": 245,
"INFO": 1456,
"WARNING": 512,
"ERROR": 178,
"CRITICAL": 7
},
"by_source": {
"web-01": 823,
"web-02": 756,
"api-server": 589,
"worker-01": 230
},
"events_per_hour": [
{
"hour": "2024-01-15 08:00",
"count": 156
},
{
"hour": "2024-01-15 09:00",
"count": 298
},
{
"hour": "2024-01-15 10:00",
"count": 412
},
{
"hour": "2024-01-15 11:00",
"count": 385
}
]
}
log_anomalies
Detect statistical anomalies including frequency spikes, elevated error rates, unusual source cardinality, and logging gaps.
Input Schema:
{
"file": "string (required, absolute path)",
"format": "string (auto|json|syslog|cef|clf, default: auto)",
"sensitivity": "integer (1-5, default: 3, where 1=low, 5=high)"
}
Example Input:
{
"file": "/var/log/app.log",
"format": "auto",
"sensitivity": 3
}
Example Output:
{
"file": "/var/log/app.log",
"total_events": 5234,
"anomaly_count": 4,
"anomalies": [
{
"type": "frequency_spike",
"description": "Unusual event volume: 187 events/min vs avg 45.2/min (3.1 std devs above mean)",
"severity": "HIGH",
"timestamp": "2024-01-15 10:23",
"details": {
"events_per_minute": 187,
"average": 45.2,
"std_devs_above": 3.1
}
},
{
"type": "high_error_rate",
"description": "High error rate: 18.5% of events are ERROR/CRITICAL (128/692)",
"severity": "HIGH",
"details": {
"error_count": 128,
"total": 692,
"rate": 0.185
}
},
{
"type": "logging_gap",
"description": "45-minute gap in logging between 2024-01-15 14:30 and 2024-01-15 15:15",
"severity": "HIGH",
"timestamp": "2024-01-15 14:30",
"details": {
"gap_minutes": 45,
"from": "2024-01-15 14:30",
"to": "2024-01-15 15:15"
}
},
{
"type": "high_source_cardinality",
"description": "Unusually high number of unique sources: 67",
"severity": "LOW",
"details": {
"unique_sources": 67,
"top_sources": [
["192.168.1.50", 234],
["192.168.1.51", 189],
["192.168.1.52", 156]
]
}
}
],
"baseline": {
"avg_events_per_minute": 45.2,
"std_dev_per_minute": 12.8,
"unique_sources": 67,
"common_severities": {
"DEBUG": 512,
"INFO": 3456,
"WARNING": 892,
"ERROR": 289,
"CRITICAL": 85
}
}
}
Installation
Prerequisites
- Bun 1.0 or later
- Node.js 18+ (as an alternative runtime if not using Bun directly)
Setup
Clone the repository and install dependencies:
cd /path/to/log-analysis
bun install
Build the server:
bun run build
This produces dist/index.js, which is the compiled MCP server.
Usage
Stdio Transport (Direct Integration)
The server uses stdio (stdin/stdout) transport for communication with MCP clients.
Run the server directly:
bun run src/index.ts
Or run the compiled version:
bun dist/index.js
Claude Desktop Configuration
Add this to your Claude Desktop config file (~/.config/Claude/claude_desktop_config.json):
{
"mcpServers": {
"log-analysis": {
"command": "bun",
"args": ["/path/to/log-analysis/src/index.ts"]
}
}
}
On Windows, use the absolute path with backslashes or forward slashes:
{
"mcpServers": {
"log-analysis": {
"command": "bun",
"args": ["C:/path/to/log-analysis/src/index.ts"]
}
}
}
Claude Code MCP Settings
If using Claude Code with MCP support, configure it in your settings JSON:
{
"mcpServers": {
"log-analysis": {
"command": "bun",
"args": ["/absolute/path/to/log-analysis/src/index.ts"],
"env": {}
}
}
}
Example Tool Calls
Once configured, you can use the tools in Claude or your MCP client:
Parse a log file:
const result = await tools.log_parse({
file: "/var/log/application.log",
format: "auto",
max_results: 100
});
Search for errors in a time window:
const result = await tools.log_search({
file: "/var/log/application.log",
pattern: "error|failed|exception",
severity: "ERROR",
time_range: {
start: "2024-01-15T10:00:00Z",
end: "2024-01-15T11:00:00Z"
},
max_results: 50
});
Generate a timeline across multiple files:
const result = await tools.log_timeline({
files: [
"/var/log/app.log",
"/var/log/database.log",
"/var/log/nginx/access.log"
],
format: "auto",
severity: "WARNING",
max_results: 200
});
Get statistics for a log file:
const result = await tools.log_stats({
file: "/var/log/application.log",
format: "auto"
});
Detect anomalies:
const result = await tools.log_anomalies({
file: "/var/log/application.log",
format: "auto",
sensitivity: 3
});
Security
The server implements multiple security measures to prevent abuse and ensure safe operation:
File Size Limit
Log files are limited to 200 MB maximum. This prevents memory exhaustion from processing extremely large files.
Error: File too large (456.2MB). Maximum: 200MB
Path Validation
All file paths are validated to:
- Resolve to absolute paths
- Normalize path separators
- Verify the file exists and is a regular file (not a directory)
- Reject non-existent or inaccessible paths
Error: File does not exist: /var/log/nonexistent.log
Error: Path is not a file: /var/log/
Regex Validation
All regex patterns are validated before execution to catch syntax errors and prevent ReDoS (Regular Expression Denial of Service) attacks.
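One simple validation strategy (assumed for illustration, not taken from the server's source) is to compile the pattern up front and reject anything that fails, with a length cap as a coarse guard against pathological patterns:

```typescript
// Validate a user-supplied regex before it is ever executed against log lines.
// The length cap is a hypothetical value chosen for this sketch.
function validatePattern(pattern: string): { ok: true } | { ok: false; error: string } {
  if (pattern.length > 1000) {
    return { ok: false, error: "Pattern too long" };
  }
  try {
    new RegExp(pattern, "i"); // throws on syntactically invalid patterns
    return { ok: true };
  } catch (err) {
    return { ok: false, error: `Invalid regex pattern: ${(err as Error).message}` };
  }
}
```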
Error: Invalid regex pattern: Nothing to repeat
Rate Limiting
The server processes one request at a time on a single thread, preventing resource exhaustion from concurrent requests.
No External Commands
All log parsing and analysis runs in-process using pure TypeScript/JavaScript. This eliminates the security risks associated with spawning external CLI tools.
Architecture
The server is organized into modular components:
- index.ts - MCP server setup and tool registration
- schemas.ts - Zod validation schemas for all tool inputs
- types.ts - TypeScript interfaces for log events and results
- parsers.ts - Format detection and line parsing (JSON, syslog, CEF, CLF)
- security.ts - Path and regex validation
- tools/ - Individual tool implementations:
  - log-parse.ts - Format detection and event extraction
  - log-search.ts - Pattern matching with filters
  - log-timeline.ts - Multi-file event correlation
  - log-stats.ts - Statistical aggregation
  - log-anomalies.ts - Anomaly detection algorithms
Format Detection Logic
The server automatically detects log format based on line content:
- JSON: Lines starting with {
- Syslog: Lines matching BSD syslog format
- CLF: Lines matching Apache/Nginx access log format
- Generic: Falls back to generic parsing if no format matches
Detection uses the first non-empty line in a file. You can override detection by specifying a format explicitly.
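The heuristics above can be sketched roughly as follows (simplified patterns, assumed rather than copied from parsers.ts):

```typescript
type LogFormat = "json" | "cef" | "syslog" | "clf" | "generic";

// Detect the format from the first non-empty line, trying the most
// distinctive patterns first and falling back to generic.
function detectFormat(content: string): LogFormat {
  const line = content.split("\n").find((l) => l.trim().length > 0) ?? "";
  if (line.trimStart().startsWith("{")) return "json";
  if (/^CEF:\d+\|/.test(line)) return "cef";
  if (/^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} /.test(line)) return "syslog";
  if (/^\S+ \S+ \S+ \[[^\]]+\] "/.test(line)) return "clf";
  return "generic";
}
```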
Performance Considerations
- Log files are loaded entirely into memory. Keep files under 200 MB.
- Parsing is single-threaded but optimized for throughput.
- Regex patterns are case-insensitive for broader matching.
- Statistical calculations use single-pass algorithms where possible.
- Anomaly detection uses standard deviation-based thresholds.
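As an example of the standard-deviation approach, frequency spikes can be flagged by z-score over per-minute counts. This is a sketch: the sensitivity-to-threshold mapping and function name are assumptions, not the server's exact tuning.

```typescript
// Return indices of minutes whose event count exceeds the mean by more
// than a sensitivity-derived number of standard deviations.
function findFrequencySpikes(
  countsPerMinute: number[],
  sensitivity = 3, // 1 (least sensitive) .. 5 (most sensitive)
): number[] {
  const n = countsPerMinute.length;
  const mean = countsPerMinute.reduce((a, b) => a + b, 0) / n;
  const variance = countsPerMinute.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
  const stdDev = Math.sqrt(variance);
  // Higher sensitivity lowers the z-score threshold (hypothetical mapping):
  // sensitivity 3 => 2.5 standard deviations.
  const threshold = 4 - sensitivity * 0.5;
  return countsPerMinute
    .map((count, i) => ({ count, i }))
    .filter(({ count }) => stdDev > 0 && (count - mean) / stdDev > threshold)
    .map(({ i }) => i);
}
```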
License
MIT
