observability-toolkit

v2.1.1

MCP server for observability tooling - query traces, metrics, logs from local JSONL or SigNoz

MCP server for observability tooling — query traces, metrics, logs, and LLM events from agentic coding tools. Works with any agent emitting OTel GenAI semantic conventions. Ingest via OTLP to Cloudflare R2 or read from local JSONL.

Version: v2.1.0 | Published: 2026-04-07 | License: MIT


Installation

# Claude Code
claude mcp add observability-toolkit -- npx -y observability-toolkit

# Cursor / Windsurf / Continue.dev / Cline — add to MCP config:
{ "mcpServers": { "observability-toolkit": { "command": "npx", "args": ["-y", "observability-toolkit"] } } }

# Local dev
node ~/.claude/mcp-servers/observability-toolkit/dist/server.js

Tools

| Tool | Description |
|------|-------------|
| obs_query_traces | Query spans with filtering, regex, numeric operators, agent/tool attributes |
| obs_query_metrics | Query metrics with aggregations (sum, avg, p50, p95, p99, rate), time buckets |
| obs_query_logs | Query logs with boolean search, field extraction, negation |
| obs_query_llm_events | Query LLM events with token usage, duration, provider/model filters |
| obs_query_evaluations | Query evaluation events with aggregations and groupBy |
| obs_query_verifications | Query human verification events for EU AI Act compliance |
| obs_query_regressions | Detect quality metric regressions via EWMA drift and consecutive breach tracking |
| obs_query_metric_histograms | Query OTLP histogram bucket distributions by metric name |
| obs_health_check | Telemetry system health with cache statistics |
| obs_context_stats | Context window utilization stats |
| obs_token_budget | Context utilization, cache hit rate, headroom per model/session with alert levels |
| obs_hallucination_detection | Hallucination risk from evaluation telemetry: rates, scores, model/method breakdowns |
| obs_multi_agent_coordination | Delegation depth, fan-out ratio, handoff latency, agent token usage |
| obs_routing_telemetry | Model distribution, cost savings, fallback rate, routing latency |
| obs_estimate_cost | Token cost estimation across models |
| obs_audit_trail | Query audit trail events (SHA-256 hash chain) |
| obs_manage_datasets | Create, list, get, delete evaluation datasets (trace promotion) |
| obs_inject_evaluations | Inject evaluation events into local telemetry |
| obs_ingest_spans | Ingest spans to cloud backend via OTLP protobuf |
| obs_ingest_traces | Push complete OTel traces (resourceSpans) with service metadata |
| obs_export_langfuse | Export evaluations to Langfuse via OTLP HTTP |
| obs_export_phoenix | Export evaluations to Arize Phoenix via OTLP HTTP |
| obs_export_datadog | Export evaluations to Datadog LLM Observability |
| obs_export_confident | Export evaluations to Confident AI |
| obs_get_trace_url | Get trace viewer URL |
| obs_setup_claudeignore | Add entries to .claudeignore |
| obs_export_jaeger | Export spans to a local Jaeger instance via OTLP HTTP |
| obs_detect_trace_anomalies | Detect anomalous spans: duration, error status, token usage, unknown names, instrumentation loops |
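
The percentile aggregations listed for obs_query_metrics (p50, p95, p99) can be illustrated with a nearest-rank calculation. This is a minimal sketch, not the toolkit's implementation: the README does not say which percentile method is used, and `percentile` and `aggregate` are hypothetical names. The `rate` aggregation is left out because it depends on a time window.

```typescript
// Hypothetical helper: nearest-rank percentile over a list of metric values.
// The toolkit's actual percentile method is not documented in this README.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no values");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...values].sort((a, b) => a - b);
  // Multiply before dividing so whole-number ranks are not pushed up by rounding.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

type Aggregation = "sum" | "avg" | "p50" | "p95" | "p99";

// Hypothetical dispatcher covering a subset of the aggregations named in the table.
function aggregate(values: number[], agg: Aggregation): number {
  const sum = values.reduce((a, b) => a + b, 0);
  switch (agg) {
    case "sum":
      return sum;
    case "avg":
      if (values.length === 0) throw new Error("no values");
      return sum / values.length;
    case "p50":
      return percentile(values, 50);
    case "p95":
      return percentile(values, 95);
    case "p99":
      return percentile(values, 99);
  }
}
```

Nearest-rank always returns an observed value, which keeps small samples easy to reason about; interpolating methods would give different results on the same data.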


Configuration

export OBTOOL_API_KEY="obtk_YOUR_KEY_HERE"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.integritystudio.ai"
# Or: doppler run --project integrity-studio --config dev -- npm start

| Variable | Default | Notes |
|----------|---------|-------|
| OBTOOL_API_KEY | (none) | Required for cloud backend tools |
| OTEL_EXPORTER_OTLP_ENDPOINT | (none) | Required for cloud telemetry export |
| BACKEND_TYPE | cloud | Query/ingest backend: local or cloud |
| TELEMETRY_DIR | ./.otel | Local telemetry directory (cwd-relative) |
| CACHE_TTL_MS | 60000 | Query cache TTL |
| RETENTION_DAYS | 7 | Telemetry file retention |
| OBTOOL_API_URL | (none) | Cloud backend query URL |
| OBTOOL_INGEST_URL | (none) | Cloud ingest URL |

See docs/ENVIRONMENT_SETUP.md for full setup.
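
As an illustration of how the variables and defaults in the table above fit together, here is a minimal sketch of a loader. `ToolkitConfig`, `loadConfig`, and `parseNonNegativeInt` are hypothetical names and not part of the package's API; the server's real parsing and validation rules may differ.

```typescript
// Sketch: resolve settings from environment variables using the documented defaults.
// All names below are hypothetical; only the variable names and defaults come from the README.
type BackendType = "local" | "cloud";

interface ToolkitConfig {
  apiKey?: string;
  otlpEndpoint?: string;
  backendType: BackendType;
  telemetryDir: string;
  cacheTtlMs: number;
  retentionDays: number;
  apiUrl?: string;
  ingestUrl?: string;
}

function parseNonNegativeInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return n;
}

function loadConfig(env: Record<string, string | undefined>): ToolkitConfig {
  const backend = env.BACKEND_TYPE ?? "cloud";
  if (backend !== "local" && backend !== "cloud") {
    throw new Error(`BACKEND_TYPE must be "local" or "cloud", got "${backend}"`);
  }
  return {
    apiKey: env.OBTOOL_API_KEY,
    otlpEndpoint: env.OTEL_EXPORTER_OTLP_ENDPOINT,
    backendType: backend,
    telemetryDir: env.TELEMETRY_DIR ?? "./.otel",
    cacheTtlMs: parseNonNegativeInt(env.CACHE_TTL_MS, 60000, "CACHE_TTL_MS"),
    retentionDays: parseNonNegativeInt(env.RETENTION_DAYS, 7, "RETENTION_DAYS"),
    apiUrl: env.OBTOOL_API_URL,
    ingestUrl: env.OBTOOL_INGEST_URL,
  };
}
```

In a Node process this would typically be called as `loadConfig(process.env)`.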


Data Sources

Local JSONL (default)

Scans ./.otel/ (cwd-relative; override with TELEMETRY_DIR) plus the project-local directories .claude/telemetry/, telemetry/, and .telemetry/. A directory is included only if it exists on disk at query time. Gzip-compressed files are supported. Compatible with Claude Code natively and with any agent using the OTel file exporter.

File patterns: {traces,logs,metrics,llm-events,evaluations,verifications}-YYYY-MM-DD.jsonl[.gz]
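
The file pattern above can be expressed as a regular expression. A minimal sketch, assuming the date is always zero-padded; `parseTelemetryFileName` and `TelemetryFile` are hypothetical names, and the toolkit's own matcher may be stricter or looser.

```typescript
// Sketch: recognize telemetry file names of the form <signal>-YYYY-MM-DD.jsonl[.gz].
const TELEMETRY_FILE =
  /^(traces|logs|metrics|llm-events|evaluations|verifications)-(\d{4}-\d{2}-\d{2})\.jsonl(\.gz)?$/;

interface TelemetryFile {
  signal: string;
  date: string; // YYYY-MM-DD as it appears in the file name (not validated as a real date)
  gzipped: boolean;
}

function parseTelemetryFileName(name: string): TelemetryFile | null {
  const m = TELEMETRY_FILE.exec(name);
  if (!m) return null;
  return { signal: m[1], date: m[2], gzipped: m[3] !== undefined };
}
```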

Cloud Backend (optional)

All query tools accept backend: 'local' | 'cloud' (default: cloud). The cloud backend queries obtool-api (D1/R2) using OBTOOL_API_URL and OBTOOL_API_KEY. A circuit breaker protects against cascading failures.
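
For readers unfamiliar with the pattern, a circuit breaker stops calling a failing dependency for a cool-down period instead of retrying it on every request. The sketch below is illustrative only: the threshold, cool-down, synchronous `call` signature, and the `CircuitBreaker` class itself are assumptions, not the toolkit's code.

```typescript
// Minimal circuit breaker sketch. Threshold and cool-down values are assumptions.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly coolDownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 3, coolDownMs = 30000, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.coolDownMs = coolDownMs;
    this.now = now;
  }

  // After the cool-down, an open breaker lets one trial call through ("half-open").
  getState(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.coolDownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  call<T>(fn: () => T): T {
    if (this.getState() === "open") throw new Error("circuit open");
    try {
      const result = fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

A real cloud query would be asynchronous; the synchronous form here only keeps the state transitions easy to follow.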

Data Pipelines

Local backend (backend: 'local'):

Claude Code hooks / OTel SDK
         │ write
         ▼
  ./.otel/<path>.jsonl       ← FileSpanExporter (SDK self-telemetry; path is caller-configured)
  ./.otel/*.jsonl            ← hook-written JSONL (TELEMETRY_DIR primary)
  .claude/telemetry/         ← legacy project-local hook spans (if present)
  telemetry/ .telemetry/     ← other project-local dirs (if present)
         │ read (getTelemetryDirectories union)
         ▼
  obs_query_* tools
         │ enrich
         ▼
  derive → judge → sync-to-kv.ts → Cloudflare KV

Cloud backend (backend: 'cloud'):

hooks → OTLP HTTP → ingest.integritystudio.ai → R2 → D1 → api.integritystudio.ai

Services

Ingest Worker (obtool-ingest)

Cloudflare Worker (Hono v4) receiving OTLP protobuf telemetry. Deployed at ingest.integritystudio.ai.

  • POST /v1/{traces,metrics,logs} — OTLP protobuf ingest (gzip supported)
  • R2 storage: telemetry/{signal}/{YYYY-MM-DD}/{HH}/batch-{ts}-{uuid8}.jsonl
  • SHA-256 bearer token auth, KV idempotency (5 min TTL)

cd services/obtool-ingest && npm run dev   # or: npm test, npm run deploy
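
The R2 key layout listed above can be sketched as a small function. The README does not define every placeholder, so this assumes UTC date parts, `ts` as epoch milliseconds, and `uuid8` as the first 8 hex characters of a UUID; `batchKey` is a hypothetical name.

```typescript
// Sketch: build an R2 object key of the form
// telemetry/{signal}/{YYYY-MM-DD}/{HH}/batch-{ts}-{uuid8}.jsonl
// Assumptions (not stated in the README): UTC, epoch-millisecond ts, uuid8 = first 8 hex chars.
type Signal = "traces" | "metrics" | "logs";

function batchKey(signal: Signal, at: Date, uuid: string): string {
  const iso = at.toISOString(); // always UTC, e.g. 2026-04-07T13:05:09.000Z
  const day = iso.slice(0, 10);
  const hour = iso.slice(11, 13);
  const uuid8 = uuid.replace(/-/g, "").slice(0, 8);
  return `telemetry/${signal}/${day}/${hour}/batch-${at.getTime()}-${uuid8}.jsonl`;
}
```

Partitioning keys by signal, day, and hour lets a reader list one prefix to fetch a bounded time range.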

API Worker (obtool-api)

Cloudflare Worker (Hono v4) querying D1/R2. Deployed at api.integritystudio.ai.

Routes: /v1/{traces,metrics,logs,sessions,cost,datasets} + histogram and raw span drill-down. Cursor-based pagination, per-key rate limiting.

cd services/obtool-api && npm run dev   # or: npm test, npm run deploy
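
The README does not describe how per-key rate limiting is implemented. As one possible shape, here is a fixed-window limiter keyed by API key. The algorithm, the limit, the window, and the `FixedWindowLimiter` name are all assumptions; an in-memory map like this is also not shared across Worker instances, so a deployed limiter would need shared storage.

```typescript
// Fixed-window limiter sketch; the limit, window, and algorithm are assumptions.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    if (limit < 1) throw new Error("limit must be at least 1");
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and counts it against the key's window.
  allow(apiKey: string, nowMs: number): boolean {
    const w = this.windows.get(apiKey);
    if (!w || nowMs - w.start >= this.windowMs) {
      this.windows.set(apiKey, { start: nowMs, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}
```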

API Provisioning

Two-worker system for Flutter client API key lifecycle.

  • Sender Worker (public): signup / signin / provision; signs payloads with HMAC-SHA256 before forwarding to receiver
  • Receiver Worker (services/api-provisioning-receiver/): verifies HMAC signature + replay window; validates email format, registrable domain, and MX records via Cloudflare DoH; authenticates JWT via Auth0 /userinfo; upserts team organization (type='team', current_plan=tier); adds user to org membership; calls api-keys-create Supabase Edge Function
  • api-keys-create (Supabase/Deno): generates obtk_ token, inserts api_keys with tier from request body, syncs to Cloudflare KV

Tier values (api_key_tier DB enum): starter | growth | enterprise — defaults to starter.

Receiver env bindings: AUTH0_DOMAIN, SUPABASE_URL, SUPABASE_ANON_KEY, SUPABASE_SERVICE_ROLE_KEY, SHARED_SECRET.

See docs/api-provisioning-flutter-contract.md | docs/auth/api-key-provisioning.md | docs/api-provisioning-security.md.
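
The sender/receiver handshake described above (HMAC-SHA256 signature plus replay window) can be sketched as follows. The signed string format `${timestamp}.${body}`, the hex encoding, and the 5-minute window are assumptions for illustration; the actual contract is defined in the linked docs. The sketch uses Node's crypto module for brevity, whereas a Worker might use Web Crypto instead.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed replay window; the real value is not given in the README.
const REPLAY_WINDOW_MS = 5 * 60 * 1000;

// Hypothetical sender side: sign the timestamp and body with the shared secret.
function signPayload(secret: string, timestampMs: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestampMs}.${body}`).digest("hex");
}

// Hypothetical receiver side: reject stale or future-dated requests, then compare signatures.
function verifyPayload(
  secret: string,
  timestampMs: number,
  body: string,
  signatureHex: string,
  nowMs: number,
): boolean {
  if (Math.abs(nowMs - timestampMs) > REPLAY_WINDOW_MS) return false;
  const expected = Buffer.from(signPayload(secret, timestampMs, body), "hex");
  const given = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Including the timestamp in the signed string means an attacker cannot replay an old body with a fresh timestamp without knowing the secret.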

cd services/e2e && npm test   # 7 files, 30 E2E scenarios (api-key-auth, dashboard-auth, receiver-security, sender-receiver, …)

Evaluation Libraries

LLM-as-Judge

G-Eval (chain-of-thought + logprob normalization), QAG faithfulness, position bias mitigation (mitigatedPairwiseEval), panel evaluation, circuit breaker + retry. Zod schemas co-located as single source of truth.

See docs/quality/llm-as-judge.md.
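
The logprob normalization mentioned for G-Eval can be illustrated as an expected value: each candidate score is weighted by the probability the judge model assigned to its token. This is a sketch of the general technique rather than the library's code. `ScoreLogprob` and `expectedScore` are hypothetical names, and renormalizing over the listed candidates is an assumption.

```typescript
// Sketch of G-Eval-style score normalization via token log-probabilities.
interface ScoreLogprob {
  score: number;   // candidate rating, e.g. 1..5
  logprob: number; // natural-log probability of the token for that rating
}

function expectedScore(candidates: ScoreLogprob[]): number {
  if (candidates.length === 0) throw new Error("no candidates");
  const probs = candidates.map((c) => Math.exp(c.logprob));
  const total = probs.reduce((a, b) => a + b, 0);
  // Renormalize (assumption): the listed tokens rarely cover all probability mass.
  return candidates.reduce((acc, c, i) => acc + c.score * (probs[i] / total), 0);
}
```

Compared with taking the single most likely rating, the weighted form yields a continuous score that separates outputs the judge rates similarly.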

Agent-as-Judge

ProceduralJudge (fixed pipeline, early termination), ReactiveJudge (adaptive routing, LRU state), tool verification (selection 40% / args 30% / result 30%), trajectory efficiency analysis, multi-agent handoff scoring.

See docs/quality/agent-as-judge.md.
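
The tool verification weighting above (selection 40%, args 30%, result 30%) amounts to a weighted sum. A minimal sketch, assuming each component has already been scored in [0, 1]; `ToolVerification` and `toolVerificationScore` are hypothetical names.

```typescript
// Weights taken from the README: selection 40%, args 30%, result 30%.
// Assumption: each component score is in [0, 1].
interface ToolVerification {
  selection: number; // was the right tool chosen?
  args: number;      // were the arguments correct?
  result: number;    // was the result handled correctly?
}

const TOOL_WEIGHTS = { selection: 0.4, args: 0.3, result: 0.3 };

function toolVerificationScore(v: ToolVerification): number {
  const parts: [string, number][] = [
    ["selection", v.selection],
    ["args", v.args],
    ["result", v.result],
  ];
  for (const [name, value] of parts) {
    if (value < 0 || value > 1) throw new Error(`${name} must be in [0, 1]`);
  }
  return (
    TOOL_WEIGHTS.selection * v.selection +
    TOOL_WEIGHTS.args * v.args +
    TOOL_WEIGHTS.result * v.result
  );
}
```

For example, a correct tool choice with half-correct arguments and an unusable result scores 0.4 + 0.15 + 0 = 0.55.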

Quality Pipeline (Hooks)

  • T1 rule-based: tool_correctness, evaluation_latency, task_completion — every invocation, zero cost
  • T2 LLM judge: relevance, coherence, faithfulness, hallucination — sampled, budget-controlled
  • Divergence detection: entropy-based bimodal alerts for relevance, coherence, task_completion
  • Regression detection: post-T2 inline EWMA drift check, emits quality.degradation_confirmed OTel event
  • Meta-evaluation: explanation quality scoring via evaluateExplanationQuality() (R6.2)
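
The EWMA drift check in the regression detection step can be sketched as below. The smoothing factor, tolerance, breach count, and the convention that higher scores are better are all illustrative assumptions; `detectRegression` and its option names are hypothetical. A caller could emit the quality.degradation_confirmed event when `confirmed` is true.

```typescript
// Sketch: EWMA drift check with consecutive-breach tracking.
interface DriftOptions {
  baseline: number;            // expected score level
  alpha: number;               // smoothing factor in (0, 1]; higher reacts faster
  tolerance: number;           // allowed drop below baseline before counting a breach
  consecutiveBreaches: number; // breaches in a row required to confirm a regression
}

interface DriftResult {
  ewma: number;
  breaches: number;
  confirmed: boolean;
}

function detectRegression(scores: number[], opts: DriftOptions): DriftResult {
  let ewma = opts.baseline;
  let breaches = 0;
  let confirmed = false;
  for (const score of scores) {
    ewma = opts.alpha * score + (1 - opts.alpha) * ewma;
    // A breach is a smoothed value below baseline by more than the tolerance;
    // any non-breaching point resets the streak.
    breaches = ewma < opts.baseline - opts.tolerance ? breaches + 1 : 0;
    if (breaches >= opts.consecutiveBreaches) confirmed = true;
  }
  return { ewma, breaches, confirmed };
}
```

Requiring several consecutive breaches keeps a single low score from triggering an alert, at the cost of slower detection.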

Dashboard

React 19 + Vite 8 in dashboard/ (git submodule). Hono API on :3001, Auth0 Universal Login, role-based access via Supabase.

Routes: / (overview), /metrics/:name, /role/:roleName, /correlations, /coverage, /traces/:traceId, /sessions/:sessionId, /agents/:sessionId, /compliance

Deployed as Cloudflare Pages (frontend) + Worker (quality-metrics-api). Data synced from local pipeline via sync-to-kv.ts.

cd dashboard && npm run dev          # :5173 + API :3001
cd dashboard && npm run populate -- --seed

Integrations

| Platform | Method | Status |
|----------|--------|--------|
| Claude Code | Native MCP | Full |
| Cursor / Windsurf / Continue.dev / Cline | MCP config | Full |
| Any OTel agent | OTLP → local JSONL | Full |
| obtool-ingest | OTLP → Cloudflare R2 | Full |
| Langfuse / Phoenix / Datadog / Confident AI | OTLP / HTTP export | Export only |

Complies with OTel GenAI semantic conventions v1.40.0. Supports 15 LLM providers (anthropic, openai, gcp.gemini, gcp.vertex_ai, aws.bedrock, azure.ai.openai, mistral_ai, cohere, groq, ollama, together_ai, fireworks_ai, huggingface, replicate, perplexity).


Development

npm install && npm run build && npm test

See docs/repomix/token-tree.txt for the full file tree with token counts.


Documentation

| Doc | Description |
|-----|-------------|
| docs/CHANGELOG.md | Version index v2.0.1 → v3.0.14 |
| docs/roadmap/README.md | Roadmap, research directions, architecture docs index |
| docs/otel-v2/otel-genai-attribute-reference.md | OTel GenAI attribute reference (v1.40.0) |
| docs/otel-v2/agent-span-hierarchies.md | Agent span model and hierarchy patterns |
| docs/otel-v2/schema-migration.md | JSONL → OTLP migration |
| docs/otel-v3/llm-evaluation-frameworks.md | Langfuse, Phoenix, DeepEval, Datadog comparison |
| docs/quality/llm-as-judge.md | LLM-as-Judge architecture |
| docs/quality/agent-as-judge.md | Agent-as-Judge architecture |
| docs/api-provisioning-flutter-contract.md | Flutter client integration |
| docs/auth/api-key-provisioning.md | End-to-end provisioning flow diagram |
| docs/api-provisioning-security.md | Security properties and threat model |
| docs/reliability/security.md | Security controls |
| docs/hooks-integration.md | Hooks system integration: producer-consumer architecture, type chain, path coupling |
| docs/test-anti-patterns.md | Test anti-patterns: patterns to avoid and shared helpers to use instead |