JSONL Explorer MCP


A Model Context Protocol (MCP) server for analyzing JSONL (JSON Lines) files. Designed for local development workflows with files ranging from 1 MB to 1 GB.

Why JSONL Explorer?

Working with large JSONL files in development can be challenging:

  • Log files grow too large to open in editors
  • Data exports need exploration before processing
  • Event streams require real-time monitoring
  • Schema drift happens silently across records

JSONL Explorer solves these problems by providing streaming analysis tools that work efficiently with large files while integrating seamlessly with AI assistants via MCP.

Features

| Feature | Description |
|---------|-------------|
| Streaming Architecture | Process files of any size without loading them into memory |
| Schema Inference | Automatically detect and track schema across records |
| Statistical Analysis | Field-level stats including distributions, percentiles, and cardinality |
| Flexible Querying | Simple comparisons, regex, JSONPath, and compound queries |
| JSON Schema Validation | Validate syntax and structure against schemas |
| Live File Tailing | Monitor actively-written files with cursor-based tracking |
| File Comparison | Diff two JSONL files by key field |

Quick Start

Installation

npm install -g jsonl-explorer-mcp

Or run directly with npx:

npx jsonl-explorer-mcp

MCP Client Configuration

Claude Desktop

Add to ~/.config/claude/claude_desktop_config.json (Linux/macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "jsonl-explorer": {
      "command": "npx",
      "args": ["jsonl-explorer-mcp"]
    }
  }
}

Claude Code

Add to your project's .mcp.json:

{
  "mcpServers": {
    "jsonl-explorer": {
      "command": "npx",
      "args": ["jsonl-explorer-mcp"]
    }
  }
}

Transport Modes

Stdio Mode (default) - For MCP clients that communicate via stdin/stdout:

jsonl-explorer-mcp

HTTP Mode - For web-based integrations:

jsonl-explorer-mcp --http --port=3000

Tools Reference

jsonl_inspect

Get a comprehensive overview of a JSONL file including size, record count, inferred schema, and field statistics.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| sampleSize | number | 100 | Records to sample for schema inference |

Example Response:

{
  "file": "/data/events.jsonl",
  "size": "156.2 MB",
  "lineCount": 1248392,
  "validRecords": 1248392,
  "malformedLines": 0,
  "schema": {
    "type": "object",
    "fields": [
      { "name": "id", "types": ["string"], "nullable": false },
      { "name": "timestamp", "types": ["string"], "nullable": false },
      { "name": "event_type", "types": ["string"], "nullable": false },
      { "name": "payload", "types": ["object"], "nullable": true }
    ]
  }
}

jsonl_sample

Retrieve sample records using various sampling strategies.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| count | number | 10 | Number of records to sample |
| mode | string | "first" | Sampling mode: first, last, random, range |
| rangeStart | number | - | Start line for range mode (1-indexed) |
| rangeEnd | number | - | End line for range mode (1-indexed) |

Sampling Modes:

  • first - First N records (fast, streaming)
  • last - Last N records (requires file scan)
  • random - Random sample using reservoir sampling
  • range - Specific line range
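
For context, the random mode's reservoir sampling keeps a uniform sample of k records in a single pass, without knowing the total record count in advance. A minimal illustrative sketch (not the server's actual implementation):

```typescript
// Reservoir sampling (Algorithm R): after processing n items, each item
// has an equal k/n chance of being in the reservoir.
function reservoirSample<T>(stream: Iterable<T>, k: number): T[] {
  const reservoir: T[] = [];
  let seen = 0;
  for (const item of stream) {
    seen++;
    if (reservoir.length < k) {
      reservoir.push(item); // fill the reservoir with the first k items
    } else {
      // replace a random slot with probability k / seen
      const j = Math.floor(Math.random() * seen);
      if (j < k) reservoir[j] = item;
    }
  }
  return reservoir;
}
```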

jsonl_schema

Infer the schema of records by sampling.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| sampleSize | number | 1000 | Records to sample |
| outputFormat | string | "inferred" | Format: inferred, json-schema, formatted |

Output Formats:

  • inferred - Internal schema representation with type frequencies
  • json-schema - Standard JSON Schema (draft-07)
  • formatted - Human-readable summary

jsonl_stats

Collect aggregate statistics for fields.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| fields | string[] | all | Specific fields to analyze |
| maxRecords | number | all | Maximum records to analyze |

Statistics Provided:

  • Numeric fields: min, max, mean, median, stdDev, percentiles (p50, p90, p95, p99)
  • String fields: minLength, maxLength, avgLength, cardinality, value distribution
  • Boolean fields: true/false counts and percentages
  • All fields: null count, unique count
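
For example, a jsonl_stats request targeting specific fields might look like this (the file path and field names are illustrative):

```json
{
  "file": "/data/users.jsonl",
  "fields": ["age", "email"],
  "maxRecords": 100000
}
```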

jsonl_search

Search for records where a field matches a regex pattern.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| field | string | required | Field path (supports dot notation) |
| pattern | string | required | Regex pattern to match |
| caseSensitive | boolean | false | Case-sensitive matching |
| maxResults | number | 100 | Maximum results to return |
| returnFields | string[] | all | Fields to include in results |

Example:

{
  "file": "/data/logs.jsonl",
  "field": "message",
  "pattern": "error|failed|exception",
  "caseSensitive": false,
  "maxResults": 50
}

jsonl_filter

Filter records using powerful query expressions.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| query | string | required | Query expression |
| outputFormat | string | "records" | Output: records, count, lines |
| limit | number | 1000 | Maximum results |

Query Syntax:

| Type | Example | Description |
|------|---------|-------------|
| Equality | status == "active" | Exact match |
| Comparison | age > 30 | Numeric comparison (>, >=, <, <=, !=) |
| Regex | email =~ "@gmail\\.com$" | Pattern matching |
| Null check | deleted_at == null | Check for null values |
| JSONPath | $[?(@.price < 100)] | Full JSONPath expressions |
| Compound | status == "active" AND age > 30 | Combine with AND/OR |

Examples:

// Find active premium users
"subscription == \"premium\" AND active == true"

// Find orders over $100
"total > 100"

// Find emails from specific domain
"email =~ \"@company\\.com$\""

// Complex JSONPath
"$[?(@.items[*].quantity > 10)]"

jsonl_validate

Validate file syntax and optionally against a JSON Schema.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| schema | object/string | - | JSON Schema (inline or file path) |
| stopOnFirstError | boolean | false | Stop on first error |
| maxErrors | number | 100 | Maximum errors to report |

Response:

{
  "valid": false,
  "totalRecords": 10000,
  "validRecords": 9987,
  "invalidRecords": 13,
  "errors": [
    {
      "line": 1523,
      "error": "must have required property 'user_id'",
      "path": "/user_id"
    }
  ]
}

jsonl_tail

Monitor actively-written files for new records using cursor-based tracking.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file | string | required | Absolute path to the JSONL file |
| cursor | number | 0 | Byte position to start from |
| maxRecords | number | 100 | Maximum records to return |
| timeout | number | 0 | Wait time for new content (ms) |

Usage Pattern:

// Initial call - start from beginning
{ "file": "/var/log/app.jsonl", "cursor": 0 }
// Response: { records: [...], newCursor: 15234, hasMore: false }

// Subsequent calls - continue from cursor
{ "file": "/var/log/app.jsonl", "cursor": 15234, "timeout": 5000 }
// Waits up to 5s for new content
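
Conceptually, the cursor is a byte offset: each call reads only the bytes appended since the previous call and returns complete lines plus an updated offset. A minimal sketch of that mechanism (illustrative only; the real tool also handles timeouts and other edge cases):

```typescript
import { openSync, readSync, fstatSync, closeSync } from "node:fs";

// Read everything appended after `cursor`, returning only complete lines.
// An unterminated final line is left for the next call.
function readFromCursor(file: string, cursor: number): { lines: string[]; newCursor: number } {
  const fd = openSync(file, "r");
  try {
    const size = fstatSync(fd).size;
    if (size <= cursor) return { lines: [], newCursor: cursor };
    const buf = Buffer.alloc(size - cursor);
    readSync(fd, buf, 0, buf.length, cursor);
    const text = buf.toString("utf8");
    const lastNewline = text.lastIndexOf("\n");
    if (lastNewline === -1) return { lines: [], newCursor: cursor }; // no complete line yet
    const complete = text.slice(0, lastNewline);
    const lines = complete.split("\n").filter((l) => l.length > 0);
    // advance the cursor past the last complete line and its newline
    return { lines, newCursor: cursor + Buffer.byteLength(complete, "utf8") + 1 };
  } finally {
    closeSync(fd);
  }
}
```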

jsonl_diff

Compare two JSONL files and report differences.

Parameters:

| Name | Type | Default | Description |
|------|------|---------|-------------|
| file1 | string | required | Path to first file |
| file2 | string | required | Path to second file |
| keyField | string | - | Field to use as unique key for matching |
| compareFields | string[] | all | Specific fields to compare |
| maxDiffs | number | 100 | Maximum differences to report |

Diff Types:

  • added - Record exists only in file2
  • removed - Record exists only in file1
  • modified - Record exists in both but differs
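
A jsonl_diff request keyed on an id field might look like this (the paths and field names are illustrative):

```json
{
  "file1": "/data/export-v1.jsonl",
  "file2": "/data/export-v2.jsonl",
  "keyField": "id",
  "maxDiffs": 50
}
```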

Use Cases

Exploring Log Files

"Inspect the application log file at /var/log/app.jsonl and show me
the schema and any error messages from the last hour"

Data Quality Analysis

"Validate /data/export.jsonl against this schema and show me
statistics on the user_id field to check for duplicates"

Real-time Monitoring

"Tail the events file and alert me when you see any records
with event_type containing 'error'"

Comparing Exports

"Diff these two data exports using 'id' as the key field
and show me what changed"

Architecture

See ARCHITECTURE.md for detailed technical documentation including:

  • Streaming parser design
  • Schema inference algorithm
  • Statistics collection with Welford's algorithm
  • Query engine implementation
  • Memory efficiency strategies
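
As a sketch of the idea behind the statistics collection, Welford's algorithm updates mean and variance in a single numerically stable pass, one record at a time. This is an illustrative implementation, not the server's actual code:

```typescript
// Welford's online algorithm: maintains count, running mean, and M2
// (sum of squared deviations from the running mean) incrementally.
class RunningStats {
  private n = 0;
  private mean = 0;
  private m2 = 0;

  push(x: number): void {
    this.n++;
    const delta = x - this.mean;
    this.mean += delta / this.n;
    this.m2 += delta * (x - this.mean); // uses the updated mean
  }

  get count(): number { return this.n; }
  get average(): number { return this.mean; }
  get variance(): number { return this.n > 1 ? this.m2 / (this.n - 1) : 0; } // sample variance
  get stdDev(): number { return Math.sqrt(this.variance); }
}
```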

Development

Prerequisites

  • Node.js >= 18
  • npm >= 9

Setup

# Clone the repository
git clone https://github.com/YOUR_USERNAME/jsonl-explorer-mcp.git
cd jsonl-explorer-mcp

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

Scripts

| Command | Description |
|---------|-------------|
| npm run build | Compile TypeScript to JavaScript |
| npm run dev | Run with auto-reload (development) |
| npm run start | Run compiled server (stdio mode) |
| npm run start:http | Run compiled server (HTTP mode) |
| npm test | Run test suite in watch mode |
| npm run test:run | Run tests once |
| npm run typecheck | Type-check without emitting |

Project Structure

src/
├── index.ts              # Entry point, transport setup
├── server.ts             # MCP server configuration
├── core/                 # Core processing modules
│   ├── streaming-parser.ts   # Line-by-line JSONL processing
│   ├── schema-inferrer.ts    # Schema detection
│   ├── statistics.ts         # Stats collection
│   ├── query-engine.ts       # Query parsing/execution
│   └── file-tailer.ts        # Cursor-based tailing
├── tools/                # MCP tool implementations
│   ├── inspect.ts
│   ├── sample.ts
│   ├── schema.ts
│   ├── stats.ts
│   ├── search.ts
│   ├── filter.ts
│   ├── validate.ts
│   ├── tail.ts
│   └── diff.ts
└── utils/                # Shared utilities
    ├── format.ts
    ├── file-info.ts
    └── types.ts

Performance

Designed for efficiency with large files:

| File Size | Records | Inspect Time | Memory |
|-----------|---------|--------------|--------|
| 10 MB | 50,000 | ~0.5s | ~20 MB |
| 100 MB | 500,000 | ~3s | ~25 MB |
| 1 GB | 5,000,000 | ~25s | ~30 MB |

Memory usage stays constant regardless of file size due to streaming architecture.
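
The constant-memory behavior comes from processing one line at a time rather than buffering the whole file. A minimal sketch of that pattern using Node's readline module (illustrative, not the server's actual parser):

```typescript
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

// Stream a JSONL file line by line, counting valid and malformed records.
// Memory use is bounded by the longest single line, not the file size.
async function countRecords(file: string): Promise<{ valid: number; malformed: number }> {
  const rl = createInterface({ input: createReadStream(file), crlfDelay: Infinity });
  let valid = 0;
  let malformed = 0;
  for await (const line of rl) {
    if (line.trim() === "") continue; // skip blank lines
    try {
      JSON.parse(line);
      valid++;
    } catch {
      malformed++;
    }
  }
  return { valid, malformed };
}
```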

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT - see LICENSE for details.

Related Projects