@thangnm93/graylog-mcp
v1.0.1
Model Context Protocol (MCP) server for Graylog log searching with distributed tracing. Search logs, trace requests across services, get surrounding context, and debug production issues.
graylog-mcp
Model Context Protocol (MCP) server for Graylog log searching. Search logs by absolute/relative timestamps, filter by streams, and debug production issues directly from Claude Desktop.
Built for production debugging - Search Graylog logs using exact timestamps, filter by application streams, and get actionable insights for troubleshooting production issues.
Features
- ✅ Absolute timestamp search - Debug specific errors with exact time ranges
- ✅ Relative timestamp search - Search recent logs (last N seconds)
- ✅ Distributed tracing - Follow a trace_id across all services
- ✅ Surrounding-log context - See what happened ±N seconds around an error
- ✅ Composite incident analysis - One tool call fans out to trace + context + baseline
- ✅ Field aggregation - Group counts by service/level/pod/lead_id with bandwidth-efficient projection
- ✅ Stream discovery - List all available streams/applications
- ✅ System health check - Verify Graylog connectivity
- ✅ Comprehensive validation - ISO 8601 timestamps, query syntax, stream IDs
- ✅ Clear error messages - Actionable errors for auth, network, and API issues
- ✅ Timeout handling - 30-second timeouts prevent hanging
- ✅ Production-ready - 54 tests, 9.2/10 code quality score
Installation
Option 1: Use with npx (Recommended)
# No installation needed - use directly with npx
npx @thangnm93/graylog-mcp
Option 2: Global Installation
npm install -g @thangnm93/graylog-mcp
Option 3: Local Installation
# Clone the repository
git clone https://github.com/thangnm93/graylog-mcp.git
cd graylog-mcp
# Install dependencies
npm install
Configuration
Claude Desktop Setup
Add to your Claude Desktop config file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Using npx (Recommended)
{
"mcpServers": {
"graylog": {
"command": "npx",
"args": ["-y", "@thangnm93/graylog-mcp"],
"env": {
"BASE_URL": "https://graylog.example.com",
"API_TOKEN": "your_api_token_here",
"EXTRA_HEADER_X_ORG_ID": "my-org",
"EXTRA_HEADER_X_REQUEST_SOURCE": "mcp"
}
}
}
}
Extra headers: Any env var prefixed EXTRA_HEADER_ is forwarded as an HTTP header on every Graylog request. The suffix becomes the header name with _ → - (e.g. EXTRA_HEADER_X_ORG_ID → X-Org-Id). Add as many as needed.
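The mapping described above can be sketched as follows. This is a minimal illustration, not code from the package: the helper name extraHeadersFromEnv is hypothetical, and the per-segment capitalization is an assumption inferred from the X-Org-Id example (HTTP header names are case-insensitive, so the exact casing should not matter to Graylog).

```javascript
// Hypothetical helper, not taken from the graylog-mcp source.
// Collects EXTRA_HEADER_* variables and turns each suffix into a header name.
function extraHeadersFromEnv(env) {
  const PREFIX = "EXTRA_HEADER_";
  const headers = {};
  for (const [name, value] of Object.entries(env)) {
    // Skip unrelated variables and a bare "EXTRA_HEADER_" with no suffix.
    if (!name.startsWith(PREFIX) || name.length === PREFIX.length) continue;
    const headerName = name
      .slice(PREFIX.length)
      .split("_") // X_ORG_ID -> ["X", "ORG", "ID"]
      .map((part) => part.charAt(0).toUpperCase() + part.slice(1).toLowerCase())
      .join("-"); // -> "X-Org-Id"
    headers[headerName] = value;
  }
  return headers;
}

console.log(
  extraHeadersFromEnv({
    BASE_URL: "https://graylog.example.com",
    EXTRA_HEADER_X_ORG_ID: "my-org",
    EXTRA_HEADER_X_REQUEST_SOURCE: "mcp",
  })
);
```

With the npx configuration shown earlier, this sketch would yield the headers X-Org-Id and X-Request-Source.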
Using Local Installation
{
"mcpServers": {
"graylog": {
"command": "node",
"args": ["/path/to/graylog-mcp/src/index.js"],
"env": {
"BASE_URL": "https://graylog.example.com",
"API_TOKEN": "your_api_token_here"
}
}
}
}
Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| BASE_URL | Yes | Graylog server URL (e.g., https://graylog.example.com) |
| API_TOKEN | Yes | Graylog API token, sent as the Basic Auth username with the literal string "token" as the password |
| EXTRA_HEADER_* | No | Any env var starting with EXTRA_HEADER_ is sent as an HTTP header on every Graylog request. The suffix becomes the header name with _ replaced by -. Example: EXTRA_HEADER_X_ORG_ID=abc → X-Org-Id: abc |
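For reference, the API_TOKEN row above corresponds to a standard HTTP Basic Authorization header. The sketch below shows how such a header is typically built in Node.js; the function name basicAuthHeader is hypothetical and the package may construct the header differently.

```javascript
// Hypothetical helper, not taken from the graylog-mcp source.
// Graylog access tokens are used as the username, with the literal
// password "token", which is what `curl -u "YOUR_API_TOKEN:token"` sends.
function basicAuthHeader(apiToken) {
  const credentials = `${apiToken}:token`;
  return "Basic " + Buffer.from(credentials, "utf8").toString("base64");
}

// Example: headers for a request to the Graylog REST API.
const requestHeaders = {
  Authorization: basicAuthHeader("your_api_token_here"),
  Accept: "application/json",
};
console.log(Object.keys(requestHeaders));
```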
Getting Your Graylog API Token
- Log in to Graylog web interface
- Go to System → Users
- Select your user
- Click Edit tokens
- Create a new token with read permissions
- Copy the token value
Available Tools
1. search_logs_absolute
Search logs using absolute timestamps (from/to). Perfect for debugging errors with specific timestamps from monitoring tools or error tracking systems.
Parameters:
- query (required): Search query using Elasticsearch syntax
- from (required): Start timestamp in ISO 8601 format
- to (required): End timestamp in ISO 8601 format
- streamId (optional): Stream ID to filter results
- limit (optional): Maximum results (default: 50, max: 1000)
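The parameter rules above could be checked along these lines before a request is sent. This is only a sketch under stated assumptions: validateAbsoluteSearch is a hypothetical name, the regular expression is one reasonable reading of "ISO 8601 format", and whether the server rejects or clamps an out-of-range limit is not stated in this README (the sketch rejects it).

```javascript
// Hypothetical validation sketch, not taken from the graylog-mcp source.
// Accepts timestamps such as 2025-10-23T10:00:00.000Z or with a +hh:mm offset.
const ISO_8601 = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

function validateAbsoluteSearch({ query, from, to, limit = 50 }) {
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("Invalid query: query must be a non-empty string");
  }
  for (const [name, value] of [["from", from], ["to", to]]) {
    // Reject strings that do not look like ISO 8601 or that Date.parse cannot interpret.
    if (typeof value !== "string" || !ISO_8601.test(value) || Number.isNaN(Date.parse(value))) {
      throw new Error(`Invalid timestamp: ${name} must use ISO 8601 (e.g., 2025-10-23T10:00:00.000Z)`);
    }
  }
  if (Date.parse(from) >= Date.parse(to)) {
    throw new Error("Invalid timestamp: from must be earlier than to");
  }
  if (!Number.isInteger(limit) || limit < 1 || limit > 1000) {
    throw new Error("Invalid limit: expected an integer between 1 and 1000");
  }
  return { query, from, to, limit };
}

console.log(
  validateAbsoluteSearch({
    query: "level:ERROR",
    from: "2025-10-23T10:00:00.000Z",
    to: "2025-10-23T11:00:00.000Z",
  })
);
```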
Example:
{
"query": "\"/api/v1/registrations\" AND \"PUT\"",
"from": "2025-10-23T10:00:00.000Z",
"to": "2025-10-23T11:00:00.000Z",
"streamId": "646221a5bd29672a6f0246d8",
"limit": 100
}
2. search_logs_relative
Search logs using relative time range (e.g., last 15 minutes). Useful for recent log analysis.
Parameters:
- query (required): Search query using Elasticsearch syntax
- rangeSeconds (optional): Time range in seconds (default: 900 = 15 minutes, max: 86400 = 24 hours)
- streamId (optional): Stream ID to filter results
- limit (optional): Maximum results (default: 50, max: 1000)
Example:
{
"query": "level:ERROR",
"rangeSeconds": 3600,
"limit": 100
}
3. trace_request
Trace a request across ALL services using a trace_id. Fetches logs from every stream, groups by service/pod, and sorts each service's messages chronologically. Essential for distributed debugging in microservice architectures.
Parameters:
- traceId (required): The trace ID to follow (e.g., abbb27610a7fd76be8fb5af17edbe00d)
- from (required): Start timestamp in ISO 8601 format (search window)
- to (required): End timestamp in ISO 8601 format (search window)
- limit (optional): Maximum results (default: 200, max: 1000)
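The grouping step described for this tool (group by service/pod, then sort each group chronologically) can be illustrated as below. It is a sketch only: groupTraceByService is a hypothetical name, and the message shape with service, pod, and timestamp fields is an assumption about how the log entries look after they are fetched.

```javascript
// Hypothetical sketch, not taken from the graylog-mcp source.
// Assumes each message carries `service`, `pod`, and an ISO 8601 `timestamp`.
function groupTraceByService(messages) {
  const groups = new Map();
  for (const msg of messages) {
    const key = `${msg.service ?? "unknown"}/${msg.pod ?? "unknown"}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(msg);
  }
  // Sort each service's messages from oldest to newest.
  for (const list of groups.values()) {
    list.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  }
  return Object.fromEntries(groups);
}

const grouped = groupTraceByService([
  { service: "argus", pod: "pod-a", timestamp: "2026-05-13T15:43:27.900Z", message: "response sent" },
  { service: "auth", pod: "pod-b", timestamp: "2026-05-13T15:43:27.100Z", message: "token verified" },
  { service: "argus", pod: "pod-a", timestamp: "2026-05-13T15:43:27.800Z", message: "request received" },
]);
console.log(Object.keys(grouped));
```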
Example:
{
"traceId": "abbb27610a7fd76be8fb5af17edbe00d",
"from": "2026-05-13T15:38:00.000Z",
"to": "2026-05-13T15:48:00.000Z"
}
4. get_surrounding_logs
Return logs within ±N seconds of a timestamp, optionally filtered by source/pod/stream. Reveals what happened immediately before and after an error.
Parameters:
- timestamp (required): Center timestamp in ISO 8601 format
- source (optional): Source hostname or pod to filter by
- streamId (optional): Stream ID filter
- windowSeconds (optional): Window on each side (default: 5, max: 300)
- limit (optional): Maximum results (default: 100)
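The ±N second window amounts to simple timestamp arithmetic. The sketch below shows one way to turn timestamp and windowSeconds into an absolute from/to range; surroundingWindow is a hypothetical name and the error messages are illustrative.

```javascript
// Hypothetical sketch, not taken from the graylog-mcp source.
// Converts a center timestamp and a per-side window into an absolute range.
function surroundingWindow(timestamp, windowSeconds = 5) {
  const center = Date.parse(timestamp);
  if (Number.isNaN(center)) {
    throw new Error("Invalid timestamp: use ISO 8601 (e.g., 2025-10-23T10:00:00.000Z)");
  }
  if (!Number.isFinite(windowSeconds) || windowSeconds <= 0 || windowSeconds > 300) {
    throw new Error("Invalid windowSeconds: expected a value between 1 and 300");
  }
  const windowMs = windowSeconds * 1000;
  return {
    from: new Date(center - windowMs).toISOString(),
    to: new Date(center + windowMs).toISOString(),
  };
}

console.log(surroundingWindow("2026-05-13T15:43:27.844Z", 10));
// { from: '2026-05-13T15:43:17.844Z', to: '2026-05-13T15:43:37.844Z' }
```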
Example:
{
"timestamp": "2026-05-13T15:43:27.844Z",
"source": "argus-production-f747f5d4d-x9hpp",
"windowSeconds": 10
}
5. analyze_incident
Composite tool. One call fans out to three searches and returns an aggregated incident report — saves 2-3 LLM orchestration rounds when investigating a specific trace.
Internally executes:
- The full trace hop chain (trace_id:X)
- Pod-scoped surrounding logs around the first ERROR/CRITICAL/FATAL hop (filters by pod: to avoid multi-tenant noise on shared hosts)
- A trailing-hour error baseline for the anchor service
Parameters:
- traceId (required): The trace ID to investigate
- from (required): Start timestamp in ISO 8601 format
- to (required): End timestamp in ISO 8601 format
- window (optional): Surrounding-logs window in seconds (default: 10, max: 300)
- baselineSeconds (optional): Trailing window for the baseline lookup (default: 3600, max: 86400)
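How the "first ERROR/CRITICAL/FATAL hop" might be picked from a trace is sketched below. Everything here is illustrative: findAnchorHop is a hypothetical name, the `level` field name is an assumption (this README uses both level and logger_level in queries), and the actual selection logic in the package may differ.

```javascript
// Hypothetical sketch, not taken from the graylog-mcp source.
// Picks the earliest hop whose level is ERROR, CRITICAL, or FATAL.
const ERROR_LEVELS = new Set(["ERROR", "CRITICAL", "FATAL"]);

function findAnchorHop(hops) {
  const ordered = [...hops].sort(
    (a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp)
  );
  const anchor = ordered.find((hop) =>
    ERROR_LEVELS.has(String(hop.level ?? "").toUpperCase())
  );
  if (!anchor) return null; // No error in this trace, so there is nothing to anchor on.
  return { service: anchor.service, pod: anchor.pod, timestamp: anchor.timestamp };
}

const anchor = findAnchorHop([
  { level: "error", service: "argus", pod: "argus-production-f747f5d4d-x9hpp", timestamp: "2026-05-13T15:43:27.844Z" },
  { level: "info", service: "argus", pod: "argus-production-f747f5d4d-x9hpp", timestamp: "2026-05-13T15:43:27.800Z" },
]);
console.log(anchor);
```

The returned pod and timestamp would then feed the pod-scoped surrounding-logs search, and the service would scope the baseline lookup.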
Example:
{
"traceId": "abbb27610a7fd76be8fb5af17edbe00d",
"from": "2026-05-13T15:38:00.000Z",
"to": "2026-05-13T15:48:00.000Z",
"window": 10,
"baselineSeconds": 3600
}
Returns (abridged):
{
"trace_id": "abbb27610a7fd76be8fb5af17edbe00d",
"found": true,
"steps_executed": 4,
"summary": {
"hops": 4,
"services_involved": ["argus"],
"errors_in_trace": 1,
"anchor_service": "argus",
"anchor_pod": "argus-production-f747f5d4d-x9hpp",
"first_error": { "timestamp": "...", "service": "argus", "message": "nil fund_id ...", "lead_id": "..." },
"request": { "http_path": "/api/v2/user/graph", "http_method": "POST", "http_status": 200, "duration_ms": 67 },
"baseline_errors_in_service": 16,
"baseline_window_seconds": 3600
},
"trace_hops": [...],
"surrounding_logs": [...]
}
6. aggregate_logs
Count log entries grouped by a field — Graylog's most-used operation, made one-call. Issues a single search with fields=<group_field> projected (so only the column you want is downloaded) and aggregates client-side. Replaces Graylog 5.x's removed legacy terms-aggregation endpoint.
Parameters:
- query (required): Filter (Elasticsearch syntax). Use * for all entries.
- field (required): Field to group by. Common: service, logger_level, pod, lead_id, http_status, container_name.
- from + to OR rangeSeconds (required, mutually exclusive): Time window
- size (optional): Top N to return (default: 25, max: 100). The rest are summed into other.
- fetchLimit (optional): Max messages to aggregate (default: 5000, max: 10000). When the number of matched messages exceeds this, truncated: true is flagged.
- streamId (optional): Stream ID to filter results
Example:
{
"query": "logger_level:error",
"field": "service",
"rangeSeconds": 1800,
"size": 10
}
Returns:
{
"field": "service",
"query": "logger_level:error",
"time_range": "Last 1800 seconds",
"total_matched": 30,
"messages_aggregated": 30,
"truncated": false,
"unique_groups": 5,
"top": { "milkyway": 8, "argus": 4, "telex": 4, "advisory": 3, "auth": 1 },
"other": 0,
"missing": 10,
"api_calls": 1
}
The missing count is the number of messages that matched the query but had no value for the group-by field, which is a useful signal for log-hygiene issues.
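The client-side aggregation described for this tool can be sketched as a simple counting pass. This is an illustration under assumptions, not the package's code: aggregateByField is a hypothetical name, and treating undefined, null, and empty strings as "missing" is a guess at the rule.

```javascript
// Hypothetical sketch, not taken from the graylog-mcp source.
// Counts messages per field value, keeps the top N, and sums the rest.
function aggregateByField(messages, field, size = 25) {
  const counts = new Map();
  let missing = 0;
  for (const msg of messages) {
    const value = msg[field];
    if (value === undefined || value === null || value === "") {
      missing += 1; // Matched the query but has no value for the group-by field.
      continue;
    }
    const key = String(value);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const sorted = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  // Note: if the keys look like integers (e.g. http_status), JavaScript objects
  // list them in numeric order rather than by count.
  const top = Object.fromEntries(sorted.slice(0, size));
  const other = sorted.slice(size).reduce((sum, [, n]) => sum + n, 0);
  return {
    field,
    messages_aggregated: messages.length,
    unique_groups: counts.size,
    top,
    other,
    missing,
  };
}

console.log(
  aggregateByField(
    [{ service: "argus" }, { service: "argus" }, { service: "auth" }, {}],
    "service",
    10
  )
);
```

In this model the counts in top, plus other, plus missing add up to messages_aggregated, which matches the example response above (20 + 0 + 10 = 30).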
7. list_streams
List all available Graylog streams (applications). Use this to discover stream IDs for filtering.
Parameters: None
Returns:
{
"total": 3,
"streams": [
{
"id": "646221a5bd29672a6f0246d8",
"title": "application-api",
"description": "API application logs",
"disabled": false
}
]
}
8. get_system_info
Get Graylog system information and health status. Verify connectivity and check server version.
Parameters: None
Returns:
{
"version": "5.1.0",
"codename": "graylog",
"cluster_id": "abc123",
"is_processing": true,
"timezone": "UTC"
}
Query Examples
Search for Errors
level:ERROR
Search for Specific Endpoint
"/api/v1/registrations" AND "PUT"
Search for HTTP Status Codes
status:500
status:>=400
Search for User Actions
user_id:12345 AND action:login
Search for Slow Requests
duration_ms:>1000
Search for Exceptions
exception:NullPointerException
Combine Multiple Conditions
level:ERROR AND source:nexus AND message:*timeout*
Search with Wildcards
message:*connection refused*
Search by Field Existence
_exists_:error_code
Common Use Cases
1. Debug Production Error
When you get an error with a timestamp from your monitoring system:
1. Copy error timestamp from your monitoring tool
2. Use search_logs_absolute with ±5 minute window
3. Filter by application stream
4. Find root cause in logs
2. Monitor Recent Deployments
After deploying:
1. Use search_logs_relative with last 15 minutes
2. Search for level:ERROR
3. Verify no new errors introduced
3. Investigate API Failures
When an API endpoint fails:
1. Search for endpoint path: "/api/v1/endpoint"
2. Filter by status codes: status:>=400
3. Check error patterns
Error Messages
The server provides clear, actionable error messages:
| Error | Meaning | Solution |
|-------|---------|----------|
| Authentication failed | Invalid API token | Check API_TOKEN in configuration |
| Invalid query | Elasticsearch syntax error | Check query syntax and parameters |
| Endpoint not found | Wrong Graylog URL | Check BASE_URL in configuration |
| Cannot reach Graylog | Network connectivity issue | Verify Graylog is accessible |
| Invalid timestamp | Wrong timestamp format | Use ISO 8601 format (e.g., 2025-10-23T10:00:00.000Z) |
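One way such messages could be derived from a failed request is sketched below. The mapping from HTTP status codes and Node.js network error codes to the table's messages is an assumption for illustration; describeGraylogError is a hypothetical name and the package's real conditions are not documented here.

```javascript
// Hypothetical sketch, not taken from the graylog-mcp source.
// Maps a failed request to one of the messages in the table above.
// `status` is an HTTP status code; `code` is a Node.js network error code.
function describeGraylogError({ status, code } = {}) {
  const networkCodes = new Set(["ECONNREFUSED", "ENOTFOUND", "ETIMEDOUT", "ECONNRESET"]);
  if (code && networkCodes.has(code)) {
    return "Cannot reach Graylog: verify Graylog is accessible";
  }
  if (status === 401 || status === 403) {
    return "Authentication failed: check API_TOKEN in configuration";
  }
  if (status === 400) {
    return "Invalid query: check query syntax and parameters";
  }
  if (status === 404) {
    return "Endpoint not found: check BASE_URL in configuration";
  }
  return `Graylog request failed (status: ${status ?? "unknown"})`;
}

console.log(describeGraylogError({ status: 401 }));
console.log(describeGraylogError({ code: "ECONNREFUSED" }));
```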
Troubleshooting
Server Won't Start
Check environment variables:
# Verify BASE_URL and API_TOKEN are set in Claude Desktop config
# Check Claude Desktop logs:
# macOS: ~/Library/Logs/Claude/mcp*.log
# Windows: %APPDATA%\Claude\logs\mcp*.log
Verify Graylog accessibility:
curl -u "YOUR_API_TOKEN:token" https://graylog.example.com/api/system
Authentication Errors
- Verify API token has read permissions in Graylog
- Token format: Use token value as username, "token" as password
- Check token hasn't expired
No Results Returned
- Verify stream ID is correct using the list_streams tool
- Check timestamp range includes data
- Try simplifying query to * to see if any data exists
- Verify stream is not disabled
Integration Tests Failing
# Set environment variables for integration tests
export INTEGRATION_TESTS=true
export BASE_URL=https://graylog.example.com
export API_TOKEN=your_token_here
# Run integration tests
npm run test:integration
Development
Prerequisites
- Node.js >= 18.0.0
- npm >= 8.0.0
- Access to a Graylog instance (for integration tests)
Development Workflow
# Install dependencies
npm install
# Run in development mode (auto-reload)
npm run dev
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Run only unit tests
npm run test:unit
# Run integration tests (requires Graylog instance)
INTEGRATION_TESTS=true BASE_URL=https://graylog.example.com API_TOKEN=xxx npm run test:integration
# Check syntax
npm run lint
Project Structure
graylog-mcp/
├── src/
│ └── index.js # Main server implementation (429 lines)
├── test/
│ ├── helpers.test.js # Helper function tests (14 tests)
│ ├── validation.test.js # Input validation tests (24 tests)
│ ├── mcp-protocol.test.js # MCP protocol tests (16 tests)
│ └── integration.test.js # Integration tests (7 tests)
├── example-config.json # Claude Desktop config example
├── CONTRIBUTING.md # Contributing guidelines
├── CHANGELOG.md # Version history
└── package.json # npm configuration
Running Tests
# Run all tests (54 tests)
npm test
# Expected output:
# tests 54
# pass 54
# fail 0
Architecture
Simple, focused architecture in a single file (429 lines):
- Configuration & Validation - Environment variable checking
- Helper Functions - ISO 8601 validation, error formatting
- MCP Server Setup - Standard MCP protocol implementation
- Tool Definitions - 8 tools with clear schemas
- Tool Implementations - Clean, validated functions
- Server Startup - Validation then connection
Design Principles:
- ✓ Simple and maintainable
- ✓ One file, easy to understand
- ✓ Clear separation of concerns
- ✓ Comprehensive error handling
- ✓ Input validation at boundaries
- ✓ Consistent response format
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Quick Start:
- Fork the repository
- Create a feature branch
- Add tests for your changes
- Ensure all tests pass (npm test)
- Submit a pull request
Changelog
See CHANGELOG.md for version history and release notes.
Security
- Environment variables for sensitive data (never hardcoded)
- Basic authentication properly implemented
- Input validation prevents injection attacks
- Timeout prevents hanging requests
- Error messages don't leak sensitive information
To report security vulnerabilities, please create a private security advisory on GitHub.
License
MIT License - see LICENSE file for details.
Acknowledgments
- Built with @modelcontextprotocol/sdk
- Inspired by the MCP community
- Thanks to all contributors!
Made with ❤️ for the Claude Desktop community
For questions or support, please open an issue on GitHub
