autotel-mcp
v0.1.9
MCP server for AI agents to investigate OpenTelemetry traces, metrics, and logs
An MCP server that gives AI agents the ability to investigate OpenTelemetry traces, metrics, and logs. Ships with a built-in OTLP collector so any instrumented app can send data directly — no Jaeger, Grafana, or vendor setup required.
Key Features
- Backend-agnostic. Built-in OTLP collector on port 4318 accepts data from any OTel-instrumented app.
- All three signals. Traces, metrics, and logs — with cross-signal correlation.
- Agent-optimized. 33 tools designed for progressive investigation: discover → diagnose → correlate → root cause.
- Zero infrastructure. In-memory by default, persistent with --persist.
Requirements
- Node.js 20 or newer
- An MCP client: Claude Code, Claude Desktop, VS Code, Cursor, Windsurf, Goose, or any other MCP client
Getting started
Install the autotel-mcp server with your client.
Standard config works in most tools:
{
"mcpServers": {
"autotel": {
"command": "npx",
"args": ["autotel-mcp"]
}
}
}
Use the Claude Code CLI to add the server:
claude mcp add autotel npx autotel-mcp
Follow the MCP install guide. Add to your config:
{
"mcpServers": {
"autotel": {
"command": "npx",
"args": ["autotel-mcp"]
}
}
}
Follow the MCP install guide and use the standard config above. Or use the CLI:
code --add-mcp '{"name":"autotel","command":"npx","args":["autotel-mcp"]}'
Go to Cursor Settings -> MCP -> Add new MCP Server. Use command type with the command npx autotel-mcp.
Follow Windsurf MCP documentation. Use the standard config above.
Add to your cline_mcp_settings.json:
{
"mcpServers": {
"autotel": {
"type": "stdio",
"command": "npx",
"args": ["autotel-mcp"],
"disabled": false
}
}
}
codex mcp add autotel npx "autotel-mcp"
Or add to ~/.codex/config.toml:
[mcp_servers.autotel]
command = "npx"
args = ["autotel-mcp"]
/mcp add
Or add to ~/.copilot/mcp-config.json:
{
"mcpServers": {
"autotel": {
"type": "local",
"command": "npx",
"tools": ["*"],
"args": ["autotel-mcp"]
}
}
}
Follow the MCP install guide and use the standard config above.
Go to Advanced settings -> Extensions -> Add custom extension. Use type STDIO, set command to npx autotel-mcp.
amp mcp add autotel -- npx autotel-mcp
Or add to VS Code settings:
"amp.mcpServers": {
"autotel": {
"command": "npx",
"args": ["autotel-mcp"]
}
}
Go to Settings -> AI -> Manage MCP Servers -> + Add. Use the standard config above.
With Jaeger backend
To query an existing Jaeger instance instead of the built-in collector:
{
"mcpServers": {
"autotel": {
"command": "npx",
"args": ["autotel-mcp"],
"env": {
"AUTOTEL_BACKEND": "jaeger",
"JAEGER_BASE_URL": "http://localhost:16686"
}
}
}
}
With persistent storage
{
"mcpServers": {
"autotel": {
"command": "npx",
"args": ["autotel-mcp", "--persist", "./autotel.db"]
}
}
}
How it works
Your App ──OTLP──> autotel-mcp (port 4318) ──libsql──> in-memory store
                        │
AI Agent ──MCP──────────┘
          (stdio or HTTP)
- Your instrumented app sends traces/metrics/logs via OTLP to http://localhost:4318
- autotel-mcp stores the data in libsql (in-memory by default)
- Your AI agent connects via MCP and investigates using 33 tools
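To make the first step above concrete, here is a rough sketch of what an app might send over OTLP/HTTP. In practice the OpenTelemetry SDK builds and exports this for you. The payload shape and the /v1/traces path come from the public OTLP specification; that the built-in collector accepts the JSON encoding is an assumption, and buildTracePayload, sendTrace, and randomHex are hypothetical helper names used only for illustration.

```typescript
// Minimal OTLP/HTTP JSON trace payload, following the public OTLP specification.
// Assumption: the built-in collector accepts JSON-encoded OTLP at /v1/traces.

type KeyValue = { key: string; value: { stringValue: string } };

interface OtlpSpan {
  traceId: string;           // 32 hex characters
  spanId: string;            // 16 hex characters
  name: string;
  kind: number;              // 2 = SPAN_KIND_SERVER
  startTimeUnixNano: string; // nanoseconds since epoch, as a decimal string
  endTimeUnixNano: string;
  attributes: KeyValue[];
}

interface OtlpTraceRequest {
  resourceSpans: {
    resource: { attributes: KeyValue[] };
    scopeSpans: { scope: { name: string }; spans: OtlpSpan[] }[];
  }[];
}

// Random lowercase hex string of the given length (not cryptographically secure).
function randomHex(length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

// Build a request containing a single server span for the given service.
function buildTracePayload(
  serviceName: string,
  spanName: string,
  startMs: number,
  durationMs: number
): OtlpTraceRequest {
  // Whole milliseconds to nanoseconds by appending six zeros (avoids float precision loss).
  const toNanos = (ms: number): string => `${Math.trunc(ms)}000000`;
  return {
    resourceSpans: [
      {
        resource: {
          attributes: [{ key: "service.name", value: { stringValue: serviceName } }],
        },
        scopeSpans: [
          {
            scope: { name: "manual-example" },
            spans: [
              {
                traceId: randomHex(32),
                spanId: randomHex(16),
                name: spanName,
                kind: 2,
                startTimeUnixNano: toNanos(startMs),
                endTimeUnixNano: toNanos(startMs + durationMs),
                attributes: [],
              },
            ],
          },
        ],
      },
    ],
  };
}

// POST the payload to the collector and return the HTTP status code.
async function sendTrace(
  payload: OtlpTraceRequest,
  endpoint: string = "http://localhost:4318"
): Promise<number> {
  const res = await fetch(`${endpoint}/v1/traces`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return res.status;
}
```

With the server running, something like `sendTrace(buildTracePayload("checkout", "GET /cart", Date.now(), 25))` should make the span visible to the agent, assuming the JSON encoding is accepted.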
Backends
Collector (default)
Built-in OTLP collector with libsql storage. Accepts all three signals on port 4318. No external dependencies.
# In-memory (default) — data lost on restart
npx autotel-mcp
# Persistent storage — survives restarts
npx autotel-mcp --persist ./autotel.db
Point your app's OTLP exporter at the collector:
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 node your-app.js
Jaeger
Query an existing Jaeger instance. Traces only (metrics and logs unsupported by Jaeger API).
AUTOTEL_BACKEND=jaeger JAEGER_BASE_URL=http://localhost:16686 npx autotel-mcp
Offline cache and snapshots
Collector schema and semantic-convention tools can run in CI or air-gapped environments using a local cache and bundled snapshots.
- AUTOTEL_OFFLINE_MODE=true: disables network fetches and uses only the local cache and bundled snapshots.
- AUTOTEL_UPSTREAM_CACHE_DIR=/path/to/cache: overrides the cache directory (default: .autotel-cache in the current working directory).
Bundled snapshots ship under fixtures/upstream/ and include a baseline collector version/component catalog plus semantic-convention namespace data.
Configuration
Environment Variables
| Variable | Default | Description |
| ------------------------ | --------------------------------------------- | ------------------------------------- |
| AUTOTEL_BACKEND | collector | Backend: collector, jaeger |
| AUTOTEL_TRANSPORT | stdio | MCP transport: stdio, http |
| AUTOTEL_PORT | 3000 | MCP HTTP port |
| AUTOTEL_HOST | 127.0.0.1 | MCP HTTP bind address |
| AUTOTEL_COLLECTOR_PORT | 4318 | OTLP receiver port |
| AUTOTEL_PERSIST | — | libsql file path (omit for in-memory) |
| AUTOTEL_RETENTION_MS | 3600000 (1h mem) / 86400000 (24h persist) | Data retention |
| AUTOTEL_MAX_TRACES | 10000 | Max traces before eviction |
| JAEGER_BASE_URL | http://localhost:16686 | Jaeger API URL |
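As a reading aid, the defaults in the table can be written out as code. This is only a sketch of the table, not the server's actual parsing logic: resolveConfig, AutotelConfig, and the fallback behavior for empty or non-numeric values are assumptions made for illustration.

```typescript
// The environment variable table expressed as a function.
// Hypothetical helper: the real server may validate or parse values differently.

interface AutotelConfig {
  backend: "collector" | "jaeger";
  transport: "stdio" | "http";
  port: number;
  host: string;
  collectorPort: number;
  persistPath: string | undefined; // undefined means in-memory storage
  retentionMs: number;
  maxTraces: number;
  jaegerBaseUrl: string;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): AutotelConfig {
  // Parse a numeric variable, falling back when it is unset, empty, or not a number.
  const num = (name: string, fallback: number): number => {
    const raw = env[name];
    if (raw === undefined || raw === "") return fallback;
    const parsed = Number(raw);
    return isFinite(parsed) ? parsed : fallback;
  };

  const persistPath = env["AUTOTEL_PERSIST"] || undefined;

  return {
    backend: env["AUTOTEL_BACKEND"] === "jaeger" ? "jaeger" : "collector",
    transport: env["AUTOTEL_TRANSPORT"] === "http" ? "http" : "stdio",
    port: num("AUTOTEL_PORT", 3000),
    host: env["AUTOTEL_HOST"] || "127.0.0.1",
    collectorPort: num("AUTOTEL_COLLECTOR_PORT", 4318),
    persistPath,
    // Retention defaults to 24h with persistent storage and 1h in memory.
    retentionMs: num("AUTOTEL_RETENTION_MS", persistPath ? 86400000 : 3600000),
    maxTraces: num("AUTOTEL_MAX_TRACES", 10000),
    jaegerBaseUrl: env["JAEGER_BASE_URL"] || "http://localhost:16686",
  };
}
```

Note how the retention default depends on whether a persistence path is set, which is easy to miss when reading the table alone.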
HTTP mode
Run as a standalone HTTP server (for remote clients or environments without stdio):
npx autotel-mcp --transport http --port 3000
Then configure your MCP client with:
{
"mcpServers": {
"autotel": {
"url": "http://localhost:3000/mcp"
}
}
}
Tools
33 tools organized by investigation workflow.
- list_services — Services with span counts and error rates
- list_operations — Operations for a service, ranked by traffic
- backend_health — Backend reachability and ingestion status
- backend_capabilities — Signal support and query features
- list_capabilities — Full server manifest
- search_traces — Find traces by service, operation, status, duration, tags, time window
- search_spans — Span-level search across traces
- get_trace — Full trace detail by ID
- summarize_trace — Compact summary: span tree, errors, critical path, duration breakdown
- find_anomalies — Scan for statistical outliers: latency spikes, error rate jumps
- find_root_cause — Walk a trace span tree to identify the bottleneck span
- find_errors — Aggregate error spans grouped by service and operation
- check_slos — Report SLO violations given p99 latency and error rate targets
- service_map — Dependency graph with call counts, error rates, latency percentiles
- list_services / list_operations — Service and operation discovery
- get_llm_usage — Token usage by model and service
- list_llm_models — Models in use with request counts
- get_llm_model_stats — Latency/token/error percentiles per model
- get_llm_expensive_traces — Top traces by token count
- get_llm_slow_traces — Slowest LLM traces
- list_llm_tools — Tool/function call usage by name
- list_metrics — Available metric series
- get_metric_series — Time-series data for a metric
- search_logs — Log search by severity, service, trace ID, text
- correlate — Given a trace ID: return trace + metrics from involved services + correlated logs
- explain_slowdown — Combines anomaly detection with cross-signal correlation
- validate_collector_config — Validate OTLP receiver config fragment
- explain_collector_config — Explain config shape and defaults
- suggest_collector_config — Generate minimal config
- score_span_instrumentation — Quality score 0-100 with A-F grade
- explain_instrumentation_score — Scoring rubric details
Resources
MCP resources give agents context without burning tool calls:
| URI | Content |
| -------------------------------- | ------------------------------------------------- |
| otel://capabilities | Server manifest: transports, tool groups, signals |
| otel://tool-catalog | All tools with descriptions and workflow hints |
| otel://backend/capabilities | Active backend's signal support |
| otel://collector/config | OTLP receiver config guidance |
| otel://instrumentation/scoring | Scoring rubric explanation |
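MCP clients normally issue these calls for you, but it can help to see the messages involved. The sketch below builds the JSON-RPC 2.0 requests that the MCP specification defines for reading a resource (resources/read) and calling a tool (tools/call). The helper names toolCall and resourceRead are hypothetical, and the tool arguments shown (no arguments for list_services, a traceId field for summarize_trace) are guesses, since argument schemas are not documented here; the otel://tool-catalog resource is the place to check them.

```typescript
// JSON-RPC 2.0 request shapes defined by the MCP specification.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface ResourceReadRequest {
  jsonrpc: "2.0";
  id: number;
  method: "resources/read";
  params: { uri: string };
}

// Each request on a connection needs its own id.
let nextId = 1;

// Build a tools/call request for a tool name from the list above.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

// Build a resources/read request for one of the otel:// URIs.
function resourceRead(uri: string): ResourceReadRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "resources/read", params: { uri } };
}

// One possible discover -> diagnose sequence an agent might follow.
const requests = [
  resourceRead("otel://tool-catalog"),
  toolCall("list_services"),
  // "traceId" is a hypothetical argument name; check the tool catalog for the real schema.
  toolCall("summarize_trace", { traceId: "<trace-id>" }),
];

console.log(JSON.stringify(requests, null, 2));
```

Over the HTTP transport these messages would be sent to the /mcp endpoint after the usual MCP initialization handshake, which an MCP SDK or client handles.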
License
MIT
