@modular-intelligence/forensic-analysis
v1.0.0
MCP server for file forensics & static analysis
Forensic Analysis MCP Server
A Model Context Protocol (MCP) server for file forensics and static analysis. Perform deep inspection of files including hashing, binary string extraction, entropy analysis, PE/ELF header parsing, metadata extraction, and comprehensive forensic investigation capabilities.
Overview
This MCP server provides ten specialized tools for analyzing files without executing them. Whether you're investigating suspicious binaries, analyzing malware samples, performing general file forensics, or building forensic investigation reports, this server enables Claude to perform comprehensive static analysis using industry-standard techniques.
Key capabilities:
- Calculate cryptographic hashes (MD5, SHA1, SHA256) on individual files and entire directories
- Extract printable strings with pattern recognition (URLs, IPs, suspicious APIs)
- Identify file types via magic bytes
- Calculate Shannon entropy to detect packing/encryption
- Parse Windows PE (Portable Executable) headers with section analysis
- Parse Linux ELF headers with readelf
- Extract metadata from images, documents, and media files
- Recursively hash directories and detect duplicate files
- Correlate file timestamps with log entries for timeline analysis
- Generate structured forensic investigation reports with findings and indicators
Tools
Tool Reference
| Tool | Purpose | Input | Output |
|------|---------|-------|--------|
| file_hash | Calculate MD5, SHA1, SHA256 hashes | File path | Three hash values + file size |
| file_strings | Extract strings with pattern highlighting | File path, min length, encoding | Strings + categorized interesting patterns |
| file_identify | Identify file type via magic bytes | File path | Type, MIME type, hex magic bytes |
| file_entropy | Calculate Shannon entropy | File path | Entropy value + rating (very_low to extremely_high) |
| pe_header | Parse Windows PE headers | File path | Machine type, sections, imports, packing detection |
| elf_header | Parse Linux ELF headers | File path | Architecture, sections, entry point |
| exif_metadata | Extract file metadata | File path | Key-value metadata dictionary |
| hash_directory | Recursively hash files in directory | Directory path, algorithm, pattern | File hashes + duplicate detection |
| file_correlate | Correlate file timestamps with log entries | Directory path, log file, time window | Timeline correlations between files and logs |
| forensic_report | Generate structured forensic report | Case ID, title, findings, evidence | Professional report with aggregated findings |
Tool Details
file_hash
Calculate cryptographic hashes for file integrity verification and malware database lookups.
Input Schema:
{
"file": "/path/to/file"
}
Example Output:
{
"file": "/Users/ehenry/Documents/sample.exe",
"size": 45056,
"md5": "d41d8cd98f00b204e9800998ecf8427e",
"sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
"sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
Use cases:
- Check files against VirusTotal or other malware databases
- Verify file integrity across different systems
- Create hash-based file inventories
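Under the hood this is ordinary use of Node's crypto module; a minimal sketch of the hashing step (`hashBuffer` is an illustrative name, not the server's actual function — the server also reads the file from disk, which is omitted here):

```typescript
import { createHash } from "node:crypto";

// Compute the same three digests file_hash reports, over an in-memory buffer.
function hashBuffer(data: Uint8Array) {
  const digest = (algo: "md5" | "sha1" | "sha256") =>
    createHash(algo).update(data).digest("hex");
  return {
    size: data.byteLength,
    md5: digest("md5"),
    sha1: digest("sha1"),
    sha256: digest("sha256"),
  };
}
```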
file_strings
Extract ASCII and wide-character strings from binaries with automatic pattern recognition for suspicious indicators.
Input Schema:
{
"file": "/path/to/file",
"min_length": 6,
"encoding": "both",
"max_results": 500
}
Parameters:
- min_length: Minimum string length (3-100, default: 6)
- encoding: "ascii", "wide" (Unicode), or "both" (default: "both")
- max_results: Maximum strings to return (default: 500)
Interesting Pattern Categories:
- url — HTTP/HTTPS URLs
- ip_address — IPv4 addresses
- email — Email addresses
- unc_path — UNC network paths (\\server\share)
- windows_path — Windows file paths (C:\...)
- registry_key — Windows registry keys (HKEY_...)
- sensitive_keyword — "password", "secret", "apikey", etc.
- shell_reference — Shell commands (cmd.exe, powershell, bash)
- suspicious_api — Win32 APIs (CreateProcess, VirtualAlloc, LoadLibrary, etc.)
- crypto_reference — Cryptography keywords (encrypt, decrypt, cipher)
Example Output:
{
"file": "/Users/ehenry/Documents/sample.exe",
"total_strings": 1247,
"strings": [
"This program cannot be run in DOS mode",
"kernel32.dll",
"advapi32.dll",
"user32.dll",
"LoadLibraryA",
"GetProcAddress"
],
"interesting": [
{
"value": "http://malware.example.com/beacon",
"category": "url"
},
{
"value": "CreateProcessA",
"category": "suspicious_api"
},
{
"value": "192.168.1.100",
"category": "ip_address"
},
{
"value": "VirtualAlloc WriteProcessMemory CreateRemoteThread",
"category": "suspicious_api"
},
{
"value": "password=",
"category": "sensitive_keyword"
}
]
}
Use cases:
- Quickly identify command & control domains
- Detect suspicious API usage patterns
- Find hardcoded credentials or configuration
- Analyze code reuse and similarities
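The categorization step can be sketched as a first-match regex table; the regexes below are illustrative approximations of the categories listed above, not the server's exact patterns:

```typescript
// First matching category wins; unmatched strings are not "interesting".
const PATTERNS: Array<[string, RegExp]> = [
  ["url", /https?:\/\/[^\s"']+/i],
  ["ip_address", /\b(?:\d{1,3}\.){3}\d{1,3}\b/],
  ["registry_key", /\bHKEY_[A-Z_]+/],
  ["suspicious_api", /\b(?:CreateProcess|VirtualAlloc|LoadLibrary|WriteProcessMemory)\w*/],
  ["sensitive_keyword", /\b(?:password|secret|apikey)\b/i],
];

function categorize(s: string): string | null {
  for (const [category, re] of PATTERNS) {
    if (re.test(s)) return category;
  }
  return null;
}
```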
file_identify
Identify file type using magic bytes (via the file command) to verify claimed file types.
Input Schema:
{
"file": "/path/to/file"
}
Example Output:
{
"file": "/Users/ehenry/Documents/sample.exe",
"type": "PE32 executable (console) Intel 80386, for MS Windows",
"mime_type": "application/x-msdownload",
"magic_bytes": "4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00"
}
Magic Bytes Reference:
- 4d 5a — PE/DOS executable (MZ header)
- 7f 45 4c 46 — ELF (Linux/Unix binary)
- 89 50 4e 47 — PNG image
- ff d8 ff — JPEG image
- 50 4b 03 04 — ZIP archive
Use cases:
- Verify file type hasn't been disguised with wrong extension
- Detect polyglot files that are multiple formats simultaneously
- Identify obfuscated or renamed files
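Matching signatures like the ones above needs only a prefix comparison; an illustrative sketch (the real file command's magic database is far larger and also inspects content beyond the first bytes):

```typescript
// Each entry: leading byte signature and a human-readable label.
const MAGIC: Array<[number[], string]> = [
  [[0x4d, 0x5a], "PE/DOS executable"],
  [[0x7f, 0x45, 0x4c, 0x46], "ELF binary"],
  [[0x89, 0x50, 0x4e, 0x47], "PNG image"],
  [[0xff, 0xd8, 0xff], "JPEG image"],
  [[0x50, 0x4b, 0x03, 0x04], "ZIP archive"],
];

function identify(bytes: Uint8Array): string {
  for (const [sig, name] of MAGIC) {
    if (sig.every((b, i) => bytes[i] === b)) return name;
  }
  return "unknown";
}
```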
file_entropy
Calculate Shannon entropy to detect compression, encryption, or packing.
Input Schema:
{
"file": "/path/to/file"
}
Example Output:
{
"file": "/Users/ehenry/Documents/sample.exe",
"overall_entropy": 6.847,
"entropy_rating": "very_high (likely compressed/encrypted)",
"size": 45056,
"max_possible_entropy": 8,
"entropy_percentage": 85.59
}
Entropy Rating Scale:
| Entropy Range | Rating | Meaning |
|---|---|---|
| < 1.0 | very_low | Empty or uniform data (suspicious padding) |
| 1.0 - 3.0 | low | Structured data, plain text, source code |
| 3.0 - 5.0 | medium | Mixed content, normal executables |
| 5.0 - 7.0 | high | Compressed or encoded sections |
| 7.0 - 7.5 | very_high | Likely compressed/encrypted |
| > 7.5 | extremely_high | Almost certainly encrypted or random data |
Interpretation:
- Legitimate binaries: Typically 4.0 - 6.5 (mix of code, strings, data)
- Packed malware: Often 7.0+ (entire payload compressed/encrypted)
- Encrypted data: Approaches 8.0 (maximum randomness)
Use cases:
- Detect code packing/obfuscation
- Identify encryption without cryptanalysis
- Spot unusual data patterns
- Compare sections of a binary (high entropy .text section = suspicious)
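The rating above is derived from byte-frequency Shannon entropy, H = -Σ p(b) · log2 p(b), measured in bits per byte with a maximum of 8.0. A self-contained sketch of the calculation:

```typescript
// Shannon entropy over byte values 0..255.
// p(b) is the observed frequency of byte b in the data.
function shannonEntropy(data: Uint8Array): number {
  if (data.length === 0) return 0;
  const counts = new Array<number>(256).fill(0);
  for (let i = 0; i < data.length; i++) counts[data[i]]++;
  let h = 0;
  for (const c of counts) {
    if (c === 0) continue;
    const p = c / data.length;
    h -= p * Math.log2(p);
  }
  return h;
}
```

Uniform data (all one byte value) scores 0; data containing every byte value equally often scores the maximum 8.0.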
pe_header
Parse Windows PE (Portable Executable) headers for comprehensive binary analysis. Pure TypeScript parsing, no external tools required.
Input Schema:
{
"file": "/path/to/file"
}
Example Output:
{
"file": "/Users/ehenry/Documents/malware.exe",
"is_64bit": true,
"machine": "AMD64",
"timestamp": "2023-06-15T10:23:45.000Z",
"characteristics": [
"EXECUTABLE_IMAGE",
"LARGE_ADDRESS_AWARE"
],
"sections": [
{
"name": ".text",
"virtual_size": 204800,
"raw_size": 205312,
"entropy": 6.234,
"characteristics": [
"CODE",
"EXECUTE",
"READ"
]
},
{
"name": ".data",
"virtual_size": 4096,
"raw_size": 4096,
"entropy": 3.456,
"characteristics": [
"INITIALIZED_DATA",
"READ",
"WRITE"
]
},
{
"name": ".rsrc",
"virtual_size": 8192,
"raw_size": 8192,
"entropy": 2.123,
"characteristics": [
"INITIALIZED_DATA",
"READ"
]
}
],
"imports": [
{
"dll": "kernel32.dll",
"functions": []
},
{
"dll": "user32.dll",
"functions": []
},
{
"dll": "advapi32.dll",
"functions": []
}
],
"is_packed": true,
"packing_indicators": "High entropy sections: .text, .reloc; RWX sections: .overlay"
}
Key Fields:
- machine: i386, AMD64, ARM, ARM64
- timestamp: Compilation date (UTC)
- characteristics (file-level): EXECUTABLE_IMAGE, DLL, 32BIT_MACHINE, LARGE_ADDRESS_AWARE
- sections: Code, data, resources, debug info
- entropy: Section-level entropy (high entropy = packed)
- characteristics (section-level): CODE, INITIALIZED_DATA, EXECUTE, READ, WRITE
- is_packed: Detected based on high-entropy sections or RWX permissions
- packing_indicators: Specific detection reasons
Packing Detection Heuristics:
- Section entropy > 7.0 (indicates compression/encryption)
- Read+Write+Execute permissions on same section (unusual for legitimate code)
- Size mismatches between virtual and raw sizes
Common Packed Binary Indicators:
.text entropy: 7.2+ (should be 4.5-6.5)
.reloc entropy: 7.5+ (normally 3.0-5.0)
RWX section present (normal code is RX only)
Timestamp: 1970-01-01 (modified to hide compilation time)
Use cases:
- Detect code packing and obfuscation
- Identify imported APIs (often reveals malware family)
- Analyze binary compilation metadata
- Check for suspicious section permissions
- Verify DLL dependencies
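The first steps of that pure-TypeScript parse can be sketched as follows (`peMachine` is an illustrative helper covering only the MZ check, the e_lfanew pointer at offset 0x3c, and the machine field; real parsing continues into the optional header and section table):

```typescript
// Minimal PE walk: "MZ" at offset 0, e_lfanew at 0x3c pointing to "PE\0\0",
// then the 2-byte machine type at the start of the COFF header.
function peMachine(buf: Uint8Array): string {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  if (view.getUint16(0, true) !== 0x5a4d) throw new Error("missing MZ signature");
  const peOffset = view.getUint32(0x3c, true);
  if (view.getUint32(peOffset, true) !== 0x00004550) throw new Error("missing PE signature");
  const MACHINES: Record<number, string> = {
    0x014c: "i386",
    0x8664: "AMD64",
    0xaa64: "ARM64",
  };
  return MACHINES[view.getUint16(peOffset + 4, true)] ?? "unknown";
}
```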
elf_header
Parse Linux/Unix ELF (Executable and Linkable Format) headers using readelf.
Input Schema:
{
"file": "/path/to/file"
}
Example Output:
{
"file": "/usr/bin/ls",
"class": "ELF64",
"data": "2's complement, little endian",
"type": "EXEC (Executable file)",
"machine": "Advanced Micro Devices X86-64",
"entry_point": "0x401000",
"section_count": 27,
"sections": [
{
"name": ".text",
"type": "PROGBITS",
"size": 184652,
"flags": "AX"
},
{
"name": ".rodata",
"type": "PROGBITS",
"size": 98304,
"flags": "A"
},
{
"name": ".data",
"type": "PROGBITS",
"size": 8192,
"flags": "WA"
},
{
"name": ".bss",
"type": "NOBITS",
"size": 4096,
"flags": "WA"
}
]
}
Key Fields:
- class: ELF32, ELF64
- data: Endianness (little/big endian)
- type: EXEC (executable), DYN (shared object), REL (relocatable)
- machine: x86, x86-64, ARM, ARM64, MIPS, PPC, etc.
- entry_point: Memory address where execution begins
- sections: Program sections (.text, .data, .bss, .rodata, etc.)
- flags: A (allocate), W (write), X (execute)
Section Types:
- PROGBITS — Program data in file
- NOBITS — Space but no file data (like .bss)
- SYMTAB — Symbol table
- STRTAB — String table
- RELA — Relocation entries
- DYNAMIC — Dynamic linking info
Use cases:
- Analyze Linux/Unix binaries
- Check architecture and entry points
- Identify stripped vs. unstripped binaries
- Verify PIE (Position Independent Executable) support
- Detect hardening features
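The server delegates the heavy lifting to readelf, but the ELF identification bytes behind the class and data fields are simple enough to read directly; an illustrative sketch (`elfIdent` is not the server's code):

```typescript
// e_ident layout: bytes 0-3 are the \x7fELF magic,
// byte 4 is EI_CLASS (1 = 32-bit, 2 = 64-bit),
// byte 5 is EI_DATA (1 = little endian, 2 = big endian).
function elfIdent(buf: Uint8Array): { class: string; data: string } {
  if (buf[0] !== 0x7f || buf[1] !== 0x45 || buf[2] !== 0x4c || buf[3] !== 0x46) {
    throw new Error("not an ELF file");
  }
  return {
    class: buf[4] === 2 ? "ELF64" : "ELF32",
    data: buf[5] === 1 ? "little endian" : "big endian",
  };
}
```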
exif_metadata
Extract metadata from image files, documents, and media using exiftool.
Input Schema:
{
"file": "/path/to/image.jpg"
}
Example Output:
{
"file": "/Users/ehenry/Documents/photo.jpg",
"field_count": 34,
"metadata": {
"FileName": "photo.jpg",
"FileSize": "2048 kB",
"FileType": "JPEG",
"MIMEType": "image/jpeg",
"ExifImageWidth": "4032",
"ExifImageHeight": "3024",
"Make": "Apple",
"Model": "iPhone 14 Pro",
"DateTime": "2023:06:15 14:23:45",
"LensModel": "iPhone 14 Pro main camera 6.86mm f/1.78",
"GPSLatitude": "37 deg 46' 54.32\" N",
"GPSLongitude": "122 deg 24' 13.70\" W",
"GPSAltitude": "12.3 m Above Sea Level",
"Copyright": "2023 John Doe",
"ImageDescription": "Vacation photo"
}
}
Common Metadata Fields:
- Image: Width, height, bit depth, color space
- Camera: Make, model, lens, aperture, shutter speed, ISO
- Location: GPS coordinates, altitude
- Timestamps: Original, modified, digitized
- Copyright: Creator, copyright notice, usage rights
- Software: Creator application, version
Privacy Note: EXIF data can reveal:
- Camera location (GPS coordinates)
- Device used (phone model)
- Creation timestamp
- Copyright/author information
- Camera settings and behavior patterns
Use cases:
- Extract geolocation data from images
- Identify device/software used
- Discover copyright and authorship info
- Detect privacy leaks in shared images
- Verify timestamp authenticity
hash_directory
Recursively hash all files in a directory with duplicate file detection.
Input Schema:
{
directory_path: string // Path to directory to hash
algorithm: "md5" | "sha1" | "sha256" // Hash algorithm (default: sha256)
recursive: boolean // Hash subdirectories (default: true)
max_files: number // Maximum files to process (default: 1000)
include_pattern?: string // Only include files matching pattern (e.g., '*.exe')
}
Example Request:
{
"directory_path": "/Users/ehenry/Downloads",
"algorithm": "sha256",
"recursive": true,
"max_files": 1000,
"include_pattern": "*.exe"
}
Example Output:
{
"directory": "/Users/ehenry/Downloads",
"algorithm": "sha256",
"total_files": 42,
"total_errors": 2,
"duplicates_found": 3,
"files": [
{
"path": "/Users/ehenry/Downloads/installer.exe",
"hash": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f",
"size": 524288,
"modified": "2024-01-15T10:23:45.000Z"
},
{
"path": "/Users/ehenry/Downloads/setup.exe",
"hash": "b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f2g",
"size": 1048576,
"modified": "2024-01-14T15:30:20.000Z"
}
],
"duplicates": [
{
"hash": "c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f2g3h",
"files": [
"/Users/ehenry/Downloads/report_v1.pdf",
"/Users/ehenry/Downloads/report_v1_copy.pdf",
"/Users/ehenry/Downloads/Archive/report_old.pdf"
]
}
],
"errors": [
{
"path": "/Users/ehenry/Downloads/large_file.iso",
"error": "File too large (>100MB), skipped"
}
]
}
Use cases:
- Find duplicate files in a directory tree
- Verify integrity of archived files
- Detect suspicious files disguised with different names
- Create forensic inventories of file systems
- Identify potential data exfiltration artifacts
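Once every file is hashed, duplicate detection is a grouping pass over (path, hash) pairs; a sketch of that step (`findDuplicates` is illustrative, not the server's code):

```typescript
// Group paths by hash; any hash with more than one path is a duplicate set,
// matching the shape of the "duplicates" array in the output above.
function findDuplicates(entries: Array<{ path: string; hash: string }>) {
  const byHash = new Map<string, string[]>();
  for (const entry of entries) {
    const group = byHash.get(entry.hash) ?? [];
    group.push(entry.path);
    byHash.set(entry.hash, group);
  }
  return Array.from(byHash.entries())
    .filter(([, files]) => files.length > 1)
    .map(([hash, files]) => ({ hash, files }));
}
```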
file_correlate
Correlate file modification timestamps with log entries within a specified time window for timeline analysis.
Input Schema:
{
directory_path: string // Path to directory containing files
log_file: string // Path to log file to correlate
time_window_minutes: number // Time window in minutes (default: 60)
max_files: number // Maximum files to analyze (default: 500)
}Example Request:
{
"directory_path": "/var/log/suspicious_dir",
"log_file": "/var/log/syslog",
"time_window_minutes": 30,
"max_files": 500
}
Example Output:
{
"directory": "/var/log/suspicious_dir",
"log_file": "/var/log/syslog",
"time_window_minutes": 30,
"total_files_analyzed": 15,
"total_log_entries": 2847,
"correlations_found": 8,
"correlations": [
{
"file": "/var/log/suspicious_dir/malware.exe",
"file_modified": "2024-01-15T14:23:45.000Z",
"related_log_entries": [
{
"line_number": 1245,
"timestamp": "2024-01-15T14:20:30.000Z",
"content": "Process execution detected: C:\\System32\\cmd.exe /c download malware.exe"
},
{
"line_number": 1246,
"timestamp": "2024-01-15T14:22:15.000Z",
"content": "File created: /var/log/suspicious_dir/malware.exe"
},
{
"line_number": 1247,
"timestamp": "2024-01-15T14:23:45.000Z",
"content": "Suspicious API call: CreateProcessA from unknown process"
}
]
},
{
"file": "/var/log/suspicious_dir/config.ini",
"file_modified": "2024-01-15T14:45:20.000Z",
"related_log_entries": [
{
"line_number": 1312,
"timestamp": "2024-01-15T14:43:00.000Z",
"content": "Registry key modified: HKEY_LOCAL_MACHINE\\Software\\Microsoft\\Windows\\Run"
}
]
}
]
}
Supported Timestamp Formats:
- ISO 8601: 2024-01-15T14:23:45Z or 2024-01-15 14:23:45
- Syslog: Jan 15 14:23:45
Use cases:
- Timeline analysis during incident response
- Correlate file modifications with system events
- Identify suspicious activity sequences
- Establish causality between files and log entries
- Support forensic investigation timelines
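The core of the correlation is a window test: a log entry relates to a file when the absolute difference between the log timestamp and the file's mtime falls within the configured window. A sketch (`withinWindow` is an illustrative helper):

```typescript
// True when |log timestamp - file mtime| <= time_window_minutes.
function withinWindow(
  fileModified: string,
  logTimestamp: string,
  windowMinutes: number
): boolean {
  const deltaMs = Math.abs(Date.parse(logTimestamp) - Date.parse(fileModified));
  return deltaMs <= windowMinutes * 60_000;
}
```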
forensic_report
Generate a structured forensic investigation report with findings, evidence, indicators of compromise, and recommendations.
Input Schema:
{
case_id: string // Unique case or incident identifier
title: string // Report title
findings: Array<{
category: string // Category (e.g., 'malware', 'network', 'filesystem')
severity: "LOW" | "MEDIUM" | "HIGH" | "CRITICAL"
description: string // Detailed finding description
evidence?: string[] // Evidence items (files, hashes, etc.)
tools_used?: string[] // Tools used to discover finding
}>
timeline?: Array<{
timestamp: string // ISO 8601 timestamp
event: string // Event description
source?: string // Source of the event
}>
affected_systems?: string[] // List of affected systems/hosts
iocs?: Array<{
type: string // IOC type (ip, domain, hash, url, email)
value: string // IOC value
context?: string // Context/source of IOC
}>
recommendations?: string[] // Recommended actions
}
Example Request:
{
"case_id": "INC-2024-0215",
"title": "Ransomware Incident Investigation Report",
"findings": [
{
"category": "malware",
"severity": "CRITICAL",
"description": "Detected known ransomware executable with high entropy and suspicious API calls",
"evidence": [
"/Users/shared/payload.exe",
"a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f"
],
"tools_used": ["file_identify", "pe_header", "file_strings"]
},
{
"category": "network",
"severity": "HIGH",
"description": "Suspicious outbound connections to known command and control server",
"evidence": [
"192.168.1.100",
"malware-c2.example.com"
],
"tools_used": ["file_strings", "network_analysis"]
}
],
"timeline": [
{
"timestamp": "2024-01-15T08:00:00Z",
"event": "Initial infection vector: phishing email with malicious attachment"
},
{
"timestamp": "2024-01-15T12:30:00Z",
"event": "Ransomware deployment and encryption process initiated"
},
{
"timestamp": "2024-01-15T14:00:00Z",
"event": "Ransom note displayed to user"
}
],
"affected_systems": ["WORKSTATION-01", "FILESERVER-02", "LAPTOP-15"],
"iocs": [
{
"type": "hash",
"value": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f",
"context": "SHA256 of ransomware executable"
},
{
"type": "domain",
"value": "malware-c2.example.com",
"context": "Command and control domain"
},
{
"type": "ip",
"value": "192.0.2.100",
"context": "C2 server IP address"
},
{
"type": "email",
"value": "[email protected]",
"context": "Ransom contact email"
}
],
"recommendations": [
"Isolate all affected systems from the network",
"Preserve forensic evidence before remediation",
"Reset credentials for all affected accounts",
"Update security monitoring and detection rules"
]
}
Example Output:
{
"report_metadata": {
"case_id": "INC-2024-0215",
"title": "Ransomware Incident Investigation Report",
"generated_at": "2024-01-16T09:30:00.000Z",
"overall_risk_level": "CRITICAL"
},
"executive_summary": {
"total_findings": 2,
"severity_breakdown": {
"CRITICAL": 1,
"HIGH": 1,
"MEDIUM": 0,
"LOW": 0
},
"affected_systems_count": 3,
"ioc_count": 4
},
"findings": [
{
"id": "F-001",
"category": "malware",
"severity": "CRITICAL",
"description": "Detected known ransomware executable with high entropy and suspicious API calls",
"evidence": [
"/Users/shared/payload.exe",
"a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f"
],
"tools_used": ["file_identify", "pe_header", "file_strings"]
},
{
"id": "F-002",
"category": "network",
"severity": "HIGH",
"description": "Suspicious outbound connections to known command and control server",
"evidence": [
"192.168.1.100",
"malware-c2.example.com"
],
"tools_used": ["file_strings", "network_analysis"]
}
],
"timeline": [
{
"timestamp": "2024-01-15T08:00:00Z",
"event": "Initial infection vector: phishing email with malicious attachment"
},
{
"timestamp": "2024-01-15T12:30:00Z",
"event": "Ransomware deployment and encryption process initiated"
},
{
"timestamp": "2024-01-15T14:00:00Z",
"event": "Ransom note displayed to user"
}
],
"affected_systems": ["WORKSTATION-01", "FILESERVER-02", "LAPTOP-15"],
"indicators_of_compromise": [
{
"type": "hash",
"value": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f",
"context": "SHA256 of ransomware executable"
},
{
"type": "domain",
"value": "malware-c2.example.com",
"context": "Command and control domain"
},
{
"type": "ip",
"value": "192.0.2.100",
"context": "C2 server IP address"
},
{
"type": "email",
"value": "[email protected]",
"context": "Ransom contact email"
}
],
"recommendations": [
"Isolate affected systems from the network",
"Preserve forensic evidence before remediation",
"Reset credentials for compromised accounts",
"Apply patches for exploited vulnerabilities",
"Review and update security monitoring rules"
]
}
Use cases:
- Generate professional incident response reports
- Aggregate findings from multiple forensic tools
- Document investigation timeline and evidence
- Track indicators of compromise
- Provide actionable recommendations to stakeholders
- Support incident response procedures and compliance documentation
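A natural aggregation rule for overall_risk_level (an assumption here, not necessarily the server's exact logic) is to take the highest severity present across findings:

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

// Severities in ascending order; overall risk is the worst one present.
const ORDER: Severity[] = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

function overallRisk(findings: Array<{ severity: Severity }>): Severity {
  let worst: Severity = "LOW";
  for (const f of findings) {
    if (ORDER.indexOf(f.severity) > ORDER.indexOf(worst)) worst = f.severity;
  }
  return worst;
}
```

With the two findings in the example above (one CRITICAL, one HIGH), this rule yields the CRITICAL overall risk shown in the output.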
Prerequisites
Required Tools
The server uses standard Unix/Linux utilities that come pre-installed on most systems:
strings — Extract printable strings from binaries
- Built into all POSIX systems
- Already available on macOS, Linux, BSD
file — Identify file types via magic bytes
- Standard command on all Unix-like systems
- Installed by default on macOS and Linux
readelf — Parse ELF binary headers
- Standard on Linux systems
- On macOS: brew install binutils (provides readelf)
- On BSD: pkg install binutils or equivalent
Optional Tools
- exiftool — Extract metadata from images/documents
  - macOS: brew install exiftool
  - Linux: apt-get install libimage-exiftool-perl or yum install perl-Image-ExifTool
  - Download: https://exiftool.org/
PE Header Parsing
PE header parsing is implemented in pure TypeScript with no external dependencies. The server automatically parses Windows executables without requiring any additional tools.
Verify Installation
# Check required tools
which strings
which file
which readelf # or brew install binutils
# Optional: Check exiftool
which exiftool # or brew install exiftool
Installation
Prerequisites
- Node.js 18+ or Bun 1.0+
- Bun runtime (recommended for better performance)
Steps
- Install Bun (if not already installed):
curl -fsSL https://bun.sh/install | bash
- Clone or download this repository:
cd /path/to/forensic-analysis
- Install dependencies:
bun install
- Build the server:
bun run build
This creates the compiled output in dist/index.js.
File Structure
forensic-analysis/
├── src/
│ ├── index.ts # Server entry point
│ ├── schemas.ts # Input validation schemas
│ ├── security.ts # File path validation
│ ├── types.ts # TypeScript interfaces
│ ├── cli-executor.ts # External command execution
│ └── tools/
│ ├── file-hash.ts
│ ├── file-strings.ts
│ ├── file-identify.ts
│ ├── file-entropy.ts
│ ├── pe-header.ts
│ ├── elf-header.ts
│ ├── exif-metadata.ts
│ ├── hash-directory.ts
│ ├── file-correlate.ts
│ └── forensic-report.ts
├── dist/ # Compiled output (generated)
├── package.json
└── README.md
Usage
Starting the Server
The server communicates via stdio (standard input/output), making it compatible with Claude Desktop and other MCP clients.
Direct execution:
bun run src/index.ts
Via compiled build:
bun dist/index.js
Claude Desktop Integration
Add the server to Claude Desktop's configuration:
File: ~/Library/Application Support/Claude/claude_desktop_config.json on macOS (or %APPDATA%\Claude\claude_desktop_config.json on Windows)
{
"mcpServers": {
"forensic-analysis": {
"command": "bun",
"args": [
"run",
"/Users/ehenry/Documents/code/mcp-servers/forensic-analysis/src/index.ts"
]
}
}
}
Or with built version:
{
"mcpServers": {
"forensic-analysis": {
"command": "bun",
"args": [
"/Users/ehenry/Documents/code/mcp-servers/forensic-analysis/dist/index.js"
]
}
}
}
Claude Code (Cline) Integration
Add to Claude Code settings JSON:
{
"mcpServers": {
"forensic-analysis": {
"command": "bun",
"args": [
"run",
"/Users/ehenry/Documents/code/mcp-servers/forensic-analysis/src/index.ts"
]
}
}
}
Programmatic Usage
When connected to the MCP server, Claude can invoke tools like:
Analyze /path/to/binary.exe for packing indicators
This would trigger:
- file_identify to verify it's a PE file
- file_entropy to check for high entropy
- pe_header to analyze sections and detect packing
Security
File Path Validation
All file paths are validated before access:
- Path normalization: Resolves relative paths and symlinks
- Absolute path requirement: Converted to absolute paths
- Blocked paths: Access to sensitive system paths is denied:
  - /etc/shadow — Password hashes
  - /proc — Process information
  - /sys — Kernel information
  - /dev — Device files
- Existence check: File must exist
- File type check: Must be a regular file (not directory)
- Size limit: Files must be under 100 MB
Security Module
// src/security.ts
import { normalize, resolve } from "node:path";
import { existsSync, statSync } from "node:fs";

const BLOCKED_PATHS = ["/etc/shadow", "/proc", "/sys", "/dev"];
const MAX_FILE_SIZE = 100 * 1024 * 1024; // 100MB
function validateFilePath(path: string): string {
// 1. Normalize and resolve to absolute path
const resolved = resolve(normalize(path));
// 2. Check against blocked paths
for (const blocked of BLOCKED_PATHS) {
if (resolved.startsWith(blocked)) {
throw new Error(`Access to ${blocked} is not allowed`);
}
}
// 3. Verify file exists
if (!existsSync(resolved)) {
throw new Error(`File does not exist: ${resolved}`);
}
// 4. Verify it's a regular file
const stat = statSync(resolved);
if (!stat.isFile()) {
throw new Error(`Path is not a file: ${resolved}`);
}
// 5. Check file size
if (stat.size > MAX_FILE_SIZE) {
throw new Error(`File too large (${(stat.size / 1024 / 1024).toFixed(1)}MB). Maximum: 100MB`);
}
return resolved;
}
External Command Execution
External commands (strings, file, readelf, exiftool) are executed with:
- Timeout: 30 seconds per command
- Buffer limit: 10 MB max output
- No shell: Commands executed directly without shell interpretation
- Argument validation: Zod schemas validate all inputs
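The timeout, buffer-limit, and no-shell properties map directly onto Node's execFile options; a sketch of such a wrapper (`runTool` is illustrative, not the server's exact code):

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// execFile (unlike exec) passes arguments straight to the binary with no
// shell interpretation; timeout and maxBuffer cap runaway commands.
async function runTool(command: string, args: string[]): Promise<string> {
  const { stdout } = await execFileAsync(command, args, {
    timeout: 30_000,             // 30-second timeout
    maxBuffer: 10 * 1024 * 1024, // 10 MB output cap
  });
  return stdout;
}
```

Because no shell is involved, metacharacters in file names (`;`, `|`, `$(...)`) are passed through as literal argument bytes rather than interpreted.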
Best Practices
- Only analyze trusted files — Never analyze files from untrusted sources without verification
- Use in isolated environment — Consider running in a container or VM for suspicious files
- Monitor resource usage — Large files (approaching 100 MB) may be slow to analyze
- Verify external tools — Ensure strings, file, readelf are from official sources
Examples
Analyze a Suspicious Binary
I found a suspicious executable at /Users/shared/unknown.exe.
Can you analyze it for signs of malware?
Claude would:
- Run file_identify to confirm it's a PE executable
- Run file_hash to get hashes for VirusTotal lookup
- Run file_entropy to check for packing
- Run pe_header to analyze imports and detect packing
- Run file_strings to find URLs, APIs, and indicators
Extract Metadata from Photo
I have a photo at /Users/ehenry/Documents/vacation.jpg
that I want to share publicly. What metadata does it contain?
Claude would:
- Run exif_metadata to extract all metadata
- Report GPS coordinates, camera info, timestamps
- Recommend removing sensitive fields before sharing
Compare Two Files
Are these two binaries the same?
File A: /tmp/program_v1.exe
File B: /tmp/program_v2.exe
Claude would:
- Run file_hash on both files
- Compare the SHA256 hashes
- If different, run file_entropy and pe_header on both
- Report differences in compilation time, sections, imports
Detect Packing
This binary seems obfuscated. How can I tell if it's packed?
Claude would:
- Run file_entropy to check overall entropy
- Run pe_header to analyze section entropy
- Check for high-entropy sections (.text, .reloc)
- Check for RWX (read-write-execute) sections
- Report packing indicators with confidence
Find Duplicate Files
Find all duplicate files in my Downloads directory
Claude would:
- Run hash_directory on /Users/ehenry/Downloads
- Report files with matching hashes
- Identify duplicate copies and suggest deletion candidates
Timeline Analysis During Incident Response
Correlate files in /var/suspect_files with system logs to build a timeline
Claude would:
- Run file_correlate with the suspect directory and system log file
- Show which log entries correspond to file modifications
- Help establish sequence of events during the incident
Generate Forensic Investigation Report
Create a comprehensive report for incident INC-2024-0215 with these findings:
- Detected ransomware executable
- Found C2 communications
- Affected systems: WORKSTATION-01, FILESERVER-02
Claude would:
- Run forensic_report to aggregate findings
- Generate professional report with timeline and IOCs
- Include severity levels and recommendations
Architecture
Design Philosophy
- Stateless: Each analysis is independent
- Sandboxed: File path validation prevents access to protected areas
- Non-invasive: No file modification or execution
- Transparent: All results include file paths and metadata
- Performance: Streaming analysis for large files where possible
Tool Organization
Each tool in src/tools/ exports:
- Schema: Zod validation for inputs (exported as {toolName}Schema)
- Function: Async handler that performs analysis (exported as {toolName})
- Result types: Defined in src/types.ts
MCP Server Structure
// src/index.ts
const server = new McpServer({
name: "forensic-analysis",
version: "1.0.0",
});
// Each tool registered with schema and handler
server.tool("file_hash", "Calculate hashes", fileHashSchema.shape, toolHandler(fileHash));
server.tool("file_strings", "Extract strings", fileStringsSchema.shape, toolHandler(fileStrings));
// ... etc
The toolHandler wrapper:
- Catches errors and returns them as MCP error responses
- Serializes results to JSON
- Maintains consistent response format
Troubleshooting
"Command not found: strings"
Ensure the strings utility is installed:
# macOS
which strings # Should be /usr/bin/strings
# Linux
sudo apt-get install binutils # or yum install binutils
# BSD
pkg install binutils
"Command not found: readelf"
Install binutils:
# macOS
brew install binutils
# Linux
sudo apt-get install binutils # or yum install binutils
"exiftool not found"
Install exiftool:
# macOS
brew install exiftool
# Linux
sudo apt-get install libimage-exiftool-perl
# or
sudo yum install perl-Image-ExifTool
# From source
# visit https://exiftool.org/
"File too large"
The 100 MB limit exists to prevent memory issues. For larger files:
- Analyze portions separately using other tools
- Use external forensics tools (Volatility, Ghidra, IDA)
- Consider file compression/splitting
"Command timed out"
External commands have a 30-second timeout. For slow operations:
- Try a smaller file
- Verify the file isn't corrupted
- Check system load
"Invalid PE file (missing MZ signature)"
The file is not a Windows executable:
- Use file_identify to confirm actual type
- PE header parsing only works on PE files
- Try elf_header for Linux binaries
Performance Notes
Analysis Speed
Tool performance depends on file size and complexity:
- file_hash: ~10-50 MB/sec (limited by disk I/O)
- file_identify: ~1-5 MB/sec (reads magic bytes only)
- file_entropy: ~50-100 MB/sec (reads entire file)
- file_strings: ~5-20 MB/sec (pattern matching overhead)
- pe_header: ~10-50 MB/sec (inline parsing, no external tools)
- elf_header: ~5-10 MB/sec (depends on readelf performance)
- exif_metadata: ~1-5 MB/sec (depends on file type and exiftool)
- hash_directory: ~5-30 MB/sec aggregate (depends on file count and sizes)
- file_correlate: ~1-10 MB/sec (depends on log file size)
- forensic_report: Instant (data aggregation only)
Memory Usage
- Small files (< 10 MB): < 50 MB RAM
- Medium files (10-50 MB): 100-300 MB RAM
- Large files (50-100 MB): 300-600 MB RAM
- Directory operations: Depends on file count (typically under 500 MB for 1000 files)
License
MIT License
Copyright (c) 2024
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
