@luzhuohuan-bd/openclaw-otel-plugin

v0.1.1

Three-signal OpenTelemetry observability plugin for OpenClaw (Traces + Metrics + Logs)

OpenClaw OTel Observability Plugin

A three-signal OpenTelemetry observability plugin: Traces + Metrics + Logs, exported over OTLP/HTTP protobuf.

Architecture

Feishu user message → OpenClaw Gateway → plugin hooks fire → this plugin → three OTel signals exported
                                                                           ├── Traces (span tree)
                                                                           ├── Metrics (counters/histograms)
                                                                           └── Logs (chat logs)
                                                              ↓ OTLP/HTTP protobuf
                                                         OTel Collector (:4318)
                                                              ↓
                                                     File / Jaeger / Grafana / APMPlus

Code Modules

1. OTel SDK initialization (initProviders)

Creates three independent OTel providers, one per signal:

| Provider | Signal | Exporter | Endpoint |
|----------|--------|----------|----------|
| BasicTracerProvider | Traces (spans) | OTLPTraceExporter | {endpoint}/v1/traces |
| MeterProvider | Metrics (Counter/Histogram) | OTLPMetricExporter | {endpoint}/v1/metrics |
| LoggerProvider | Logs (log records) | OTLPLogExporter | {endpoint}/v1/logs |

All three share a single Resource (identifying service.name and host.name), and data is sent in batches over OTLP/HTTP protobuf.
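The endpoint column above can be read as a simple rule: base endpoint plus a fixed per-signal path. The following is a minimal sketch of that rule, not the plugin's code; the `signalUrl` helper and its trailing-slash handling are assumptions made for illustration.

```typescript
// Signals the plugin exports; the path segments follow the table above.
type Signal = "traces" | "metrics" | "logs";

// Hypothetical helper: derive the per-signal OTLP/HTTP URL from the base endpoint.
function signalUrl(endpoint: string, signal: Signal): string {
  // Assumption: trailing slashes are dropped so "http://host:4318/" does not yield "//v1/...".
  const base = endpoint.replace(/\/+$/, "");
  return `${base}/v1/${signal}`;
}

const exampleUrls = {
  traces: signalUrl("http://localhost:4318", "traces"),
  metrics: signalUrl("http://localhost:4318", "metrics"),
  logs: signalUrl("http://localhost:4318/", "logs"),
};
console.log(exampleUrls.traces); // http://localhost:4318/v1/traces
```

Each URL would then be handed to the matching exporter from the table.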

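2. Shared state (globalThis)

OpenClaw's jiti loader loads the plugin module several times (once per registration phase), and each load runs activate(). To avoid creating duplicate providers and stores, the plugin keeps a singleton in globalThis.__openclaw_otel_state__:

  • First load: create the providers, the store, and the metric helpers
  • Later loads: reuse the existing instances and only register hooks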
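The first-load / later-load split can be sketched as a get-or-create lookup on globalThis. This is an illustrative sketch under assumptions: the `SharedState` shape, `getSharedState`, and the `loads` counter are hypothetical stand-ins; only the `__openclaw_otel_state__` key and `activate()` come from the text.

```typescript
// Hypothetical state shape; the real plugin keeps providers, the store and metric helpers here.
interface SharedState {
  createdAt: number;
  loads: number;
}

const STATE_KEY = "__openclaw_otel_state__";

// Return the existing singleton, or create and publish it on the first load.
function getSharedState(create: () => SharedState): SharedState {
  const g = globalThis as unknown as Record<string, SharedState | undefined>;
  let state = g[STATE_KEY];
  if (state === undefined) {
    state = create();
    g[STATE_KEY] = state;
  }
  return state;
}

// Simulates activate() running once per module load.
function activate(): SharedState {
  const state = getSharedState(() => ({ createdAt: Date.now(), loads: 0 }));
  state.loads += 1; // hooks would be registered here on every load
  return state;
}

const firstLoad = activate();
const secondLoad = activate();
console.log(firstLoad === secondLoad, secondLoad.loads); // true 2
```

Because the state lives on globalThis rather than in module scope, a fresh module instance still sees the providers created by an earlier load.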

3. Context Store (ContextStore class)

The core data structure; it carries state across hooks.

sessions:       Map<sessionKey, SessionContext>    ← span-tree state for each session
convToSession:  Map<conversationId, sessionKey>    ← message hooks look up the session by convId
subagents:      Map<childSessionKey, SubagentInfo> ← subagent spans
pendingMsg:     PendingMessage | null              ← message parked until its session exists

A sweep runs every 60 seconds and removes sessions older than the TTL (10 minutes) to prevent memory leaks.

SessionContext structure:

interface SessionContext {
  sessionKey: string;          // unique session identifier
  sessionId: string;           // session UUID
  rootSpan: Span;              // root span (whole interaction lifecycle)
  rootCtx: Context;            // OTel context of the root span
  agentSpan: Span | null;      // agent span (while the agent runs)
  agentCtx: Context | null;    // OTel context of the agent span
  llmSpan: Span | null;        // LLM span (during the model call; wraps tool calls)
  llmCtx: Context | null;      // OTel context of the LLM span
  llmStartTime?: number;       // LLM call start time
  llmInput?: unknown;          // LLM input (stored only when the privacy switch allows it)
  toolSpans: Map<string, ...>; // tracks concurrent tool spans (keyed by toolCallId)
  userInput?: string;          // user input content
  lastOutput?: string;         // last agent output
  createdAt: number;           // creation time (used for TTL cleanup)
}
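The TTL sweep described for the store could look roughly like the sketch below. It keeps only the fields the sweep needs; `ContextStoreSketch`, `StoredSession`, `addSession`, `sweep`, and `startSweeping` are hypothetical names, and the real store presumably also ends any open spans when it expires a session.

```typescript
// Trimmed-down stand-in for SessionContext: only what the sweep needs.
interface StoredSession {
  sessionKey: string;
  createdAt: number;
}

const SESSION_TTL_MS = 10 * 60 * 1000; // 10-minute TTL
const SWEEP_INTERVAL_MS = 60 * 1000; // sweep every 60 seconds

class ContextStoreSketch {
  readonly sessions = new Map<string, StoredSession>();
  readonly convToSession = new Map<string, string>();

  addSession(sessionKey: string, now: number): StoredSession {
    const session: StoredSession = { sessionKey, createdAt: now };
    this.sessions.set(sessionKey, session);
    return session;
  }

  linkConv(conversationId: string, sessionKey: string): void {
    this.convToSession.set(conversationId, sessionKey);
  }

  // Remove sessions older than the TTL, then drop conversation links that point at missing sessions.
  sweep(now: number): string[] {
    const expired: string[] = [];
    this.sessions.forEach((session, key) => {
      if (now - session.createdAt > SESSION_TTL_MS) expired.push(key);
    });
    expired.forEach((key) => {
      this.sessions.delete(key);
    });
    this.convToSession.forEach((key, convId) => {
      if (!this.sessions.has(key)) this.convToSession.delete(convId);
    });
    return expired;
  }

  // Start the periodic sweep; returns a function that stops it (not called in this example).
  startSweeping(): () => void {
    const timer = setInterval(() => {
      this.sweep(Date.now());
    }, SWEEP_INTERVAL_MS);
    return () => clearInterval(timer);
  }
}

const sweepStore = new ContextStoreSketch();
sweepStore.addSession("s1", 0);
sweepStore.linkConv("conv-1", "s1");
sweepStore.addSession("s2", 9 * 60 * 1000);
const removedKeys = sweepStore.sweep(11 * 60 * 1000);
console.log(removedKeys); // [ 's1' ]
```

Passing `now` into `sweep` keeps the expiry logic testable without waiting on a real timer.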

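4. Diagnostic bridge

Subscribes to OpenClaw's internal runtime diagnostic events (which bypass the hook system) and converts them into OTel signals:

| Diagnostic event | Converted to |
|------------------|--------------|
| model.usage (token usage) | Metric counter (split by input/output/cache_read) |
| session.stuck (session is stuck) | Log warning |
| tool.loop (tool caught in a loop) | Log warning |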
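One way to picture the bridge is as a switch over the event type that forwards to a metric sink or a log sink. The sketch below is an assumption-heavy illustration: the event field names (`provider`, `cacheRead`, `ageMs`, and so on), `SignalSinks`, and `bridgeDiagnostic` are hypothetical; only the event names, the token-type labels, and the warning formats are taken from this README.

```typescript
// Hypothetical event shapes; field names are assumptions for illustration.
type DiagnosticEvent =
  | { type: "model.usage"; provider: string; model: string; input: number; output: number; cacheRead: number }
  | { type: "session.stuck"; key: string; state: string; ageMs: number }
  | { type: "tool.loop"; tool: string; detector: string; count: number };

// Minimal stand-ins for an OTel counter and logger.
interface SignalSinks {
  addTokens(value: number, attrs: Record<string, string>): void;
  warn(body: string): void;
}

function bridgeDiagnostic(ev: DiagnosticEvent, sinks: SignalSinks): void {
  switch (ev.type) {
    case "model.usage": {
      const base = { "gen_ai.provider.name": ev.provider, "gen_ai.request.model": ev.model };
      sinks.addTokens(ev.input, { ...base, "gen_ai.token.type": "input" });
      sinks.addTokens(ev.output, { ...base, "gen_ai.token.type": "output" });
      sinks.addTokens(ev.cacheRead, { ...base, "gen_ai.token.type": "cache_read" });
      break;
    }
    case "session.stuck":
      sinks.warn(`Stuck session: ${ev.key} state=${ev.state} age=${ev.ageMs}ms`);
      break;
    case "tool.loop":
      sinks.warn(`Tool loop: ${ev.tool} (${ev.detector}) count=${ev.count}`);
      break;
  }
}

// In-memory sinks so the result can be inspected.
const tokenPoints: Array<{ value: number; attrs: Record<string, string> }> = [];
const warnings: string[] = [];
const memorySinks: SignalSinks = {
  addTokens: (value, attrs) => {
    tokenPoints.push({ value, attrs });
  },
  warn: (body) => {
    warnings.push(body);
  },
};

bridgeDiagnostic(
  { type: "model.usage", provider: "ark-doubao-seed-16", model: "ep-xxx", input: 117905, output: 721, cacheRead: 0 },
  memorySinks,
);
bridgeDiagnostic({ type: "tool.loop", tool: "exec", detector: "repeat", count: 5 }, memorySinks);
console.log(tokenPoints.length, warnings[0]); // 3 Tool loop: exec (repeat) count=5
```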

5. Hook registration

The plugin registers 17 hooks through api.on(), covering the full lifecycle of an OpenClaw agent run.
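As a rough illustration of what registration through api.on() might look like, the sketch below wires two hooks against a fake API. Only the `api.on()` name and the hook names come from this README; the handler signature `(payload, ctx)`, `PluginApiSketch`, and `registerLifecycleHooks` are assumptions, and the real OpenClaw plugin API may differ.

```typescript
// Hypothetical handler and API shapes.
type HookHandler = (payload: unknown, ctx: { sessionKey?: string }) => void;

interface PluginApiSketch {
  on(hookName: string, handler: HookHandler): void;
}

function registerLifecycleHooks(api: PluginApiSketch, trace: string[]): void {
  api.on("before_agent_start", (_payload, ctx) => {
    trace.push(`agent span start (${ctx.sessionKey ?? "no session"})`);
  });
  api.on("agent_end", (_payload, ctx) => {
    trace.push(`agent span end (${ctx.sessionKey ?? "no session"})`);
  });
}

// A fake API that stores handlers so the hooks can be fired by hand.
const registeredHooks = new Map<string, HookHandler>();
const fakeApi: PluginApiSketch = {
  on: (hookName, handler) => {
    registeredHooks.set(hookName, handler);
  },
};

const hookTrace: string[] = [];
registerLifecycleHooks(fakeApi, hookTrace);
registeredHooks.get("before_agent_start")?.({}, { sessionKey: "s1" });
registeredHooks.get("agent_end")?.({}, { sessionKey: "s1" });
console.log(hookTrace);
```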

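6. Determining span parent-child relationships

Three helper functions decide which parent a span attaches to (two are listed here):

llmParentCtx(session)  → agent > root       // LLM spans attach under the agent
toolParentCtx(session) → llm > agent > root  // tool spans attach under the LLM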
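The `>` notation reads as a fallback chain: use the first context that exists. A minimal sketch of that reading follows; `Ctx` is an opaque stand-in for the OTel Context type and `ParentCandidates` is a hypothetical subset of SessionContext.

```typescript
// Opaque stand-in for an OTel Context.
type Ctx = { readonly label: string };

// The three context fields of SessionContext that parent resolution needs.
interface ParentCandidates {
  rootCtx: Ctx;
  agentCtx: Ctx | null;
  llmCtx: Ctx | null;
}

// LLM spans prefer the agent context and fall back to the root.
function llmParentCtx(s: ParentCandidates): Ctx {
  return s.agentCtx ?? s.rootCtx;
}

// Tool spans prefer the LLM context, then the agent, then the root.
function toolParentCtx(s: ParentCandidates): Ctx {
  return s.llmCtx ?? s.agentCtx ?? s.rootCtx;
}

const rootOnly: ParentCandidates = { rootCtx: { label: "root" }, agentCtx: null, llmCtx: null };
const withAgent: ParentCandidates = { ...rootOnly, agentCtx: { label: "agent" } };
const withLlm: ParentCandidates = { ...withAgent, llmCtx: { label: "llm" } };

console.log(llmParentCtx(rootOnly).label); // root
console.log(toolParentCtx(withAgent).label); // agent
console.log(toolParentCtx(withLlm).label); // llm
```

The fallback means a tool span never ends up orphaned if a hook arrives before the LLM span exists; it simply attaches one level higher.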

All Emitted Data

I. Traces (7 span types)

| # | Span name | Created when | Lifetime | Parent | Key attributes |
|---|-----------|--------------|----------|--------|----------------|
| 1 | openclaw.session | Lazily, on the first hook that carries a sessionKey | Long: closed 800 ms after agent_end | None (root) | session.key, gen_ai.session.id, openclaw.input, openclaw.output, session.duration_ms, session.message_count |
| 2 | message_received | message_received hook (if the session does not exist yet, deferred until ensureSession backfills it) | Instant | session | message.from, channel.id, message.content* |
| 3 | agent.{id} | before_agent_start hook | Long: closed at agent_end | session | agent.id, agent.duration_ms, agent.success |
| 4 | llm.{provider}/{model} | Created by the llm_input hook | Long: closed at llm_output | agent | gen_ai.request.model, gen_ai.provider.name, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.usage.total_tokens, gen_ai.usage.cache_read_input_tokens, gen_ai.usage.cache_creation_input_tokens, llm.duration_ms, gen_ai.input_messages, gen_ai.output_messages |
| 5 | tool.{toolName} | Created by the before_tool_call hook | Medium: closed at after_tool_call | llm | gen_ai.tool.name, gen_ai.tool.call.id, tool.input, tool.output, error.message |
| 6 | compaction | after_compaction hook | Instant | session | compaction.message_count, compaction.compacted_count, compaction.token_count |
| 7 | subagent.{agentId} | Created at subagent_spawned, closed at subagent_ended | Long | Parent session | subagent.mode, subagent.run_id, subagent.outcome, subagent.reason, error.message |

Attributes marked with * are gated by the allowUserDetailInfoReport privacy switch; content is not reported by default.

Span tree hierarchy:

openclaw.session (ROOT, SERVER)
├── message_received (INTERNAL)
└── agent.main (INTERNAL)
    └── llm.provider/model (CLIENT)          ← the LLM span wraps tool spans
        ├── tool.feishu_calendar_event (CLIENT)
        └── tool.exec (CLIENT)

II. Metrics (13 instruments)

Counters (cumulative counts):

| # | Metric name | Recorded when | Dimension labels |
|---|-------------|---------------|------------------|
| 1 | openclaw.sessions.total | +1 when a session is created | none |
| 2 | openclaw.messages.received.total | +1 on the message_received hook | channel.id |
| 3 | openclaw.messages.sent.total | +1 on the message_sent hook | channel.id, success |
| 4 | gen_ai.client.operation.count | +1 on the llm_output hook | gen_ai.provider.name, gen_ai.request.model |
| 5 | gen_ai.client.token.usage | On the diagnostic model.usage event | gen_ai.provider.name, gen_ai.request.model, gen_ai.token.type (input/output/cache_read) |
| 6 | openclaw.tool_calls.total | +1 on the after_tool_call hook | tool.name |
| 7 | openclaw.tool_calls.errors.total | +1 on after_tool_call when an error is present | tool.name |
| 8 | openclaw.compactions.total | +1 on the after_compaction hook | none |
| 9 | openclaw.subagents.total | +1 on the subagent_spawned hook | none |

Histograms (latency distributions):

| # | Metric name | Recorded when | Dimension labels |
|---|-------------|---------------|------------------|
| 10 | gen_ai.client.operation.duration | llm_output hook | gen_ai.provider.name, gen_ai.request.model |
| 11 | openclaw.tool_call.duration | after_tool_call hook | tool.name |
| 12 | openclaw.session.duration | When the session closes | none |

Gauge (instantaneous value):

| # | Metric name | Collection method | Description |
|---|-------------|-------------------|-------------|
| 13 | openclaw.sessions.active | Read every 15 s through the PeriodicExportingMetricReader callback | Number of currently active sessions |
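To make the tool-call rows concrete, here is a sketch of what the after_tool_call hook would record: one count, an optional error count, and one duration sample. `InMemoryMetrics` is a hypothetical stand-in for OTel counters and histograms, `recordToolCall` is an illustrative name, and the millisecond unit for the duration is an assumption.

```typescript
interface MetricPoint {
  metric: string;
  value: number;
  attrs: Record<string, string>;
}

// Stand-in for a MeterProvider: stores every data point so totals can be checked.
class InMemoryMetrics {
  readonly points: MetricPoint[] = [];

  record(metric: string, value: number, attrs: Record<string, string>): void {
    this.points.push({ metric, value, attrs });
  }

  total(metric: string): number {
    return this.points
      .filter((p) => p.metric === metric)
      .reduce((acc, p) => acc + p.value, 0);
  }
}

// What the after_tool_call hook records, per the tables above.
function recordToolCall(m: InMemoryMetrics, toolName: string, durationMs: number, error?: string): void {
  const attrs = { "tool.name": toolName };
  m.record("openclaw.tool_calls.total", 1, attrs);
  if (error !== undefined) {
    m.record("openclaw.tool_calls.errors.total", 1, attrs);
  }
  m.record("openclaw.tool_call.duration", durationMs, attrs); // unit assumed to be ms
}

const toolMetrics = new InMemoryMetrics();
recordToolCall(toolMetrics, "feishu_calendar_event", 420);
recordToolCall(toolMetrics, "exec", 35, "exit code 1");
console.log(
  toolMetrics.total("openclaw.tool_calls.total"),
  toolMetrics.total("openclaw.tool_calls.errors.total"),
); // 2 1
```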

III. Logs (6 log types)

| # | Trigger | Severity | Content |
|---|---------|----------|---------|
| 1 | gateway_start hook | INFO | "OpenClaw gateway started on port {port}" |
| 2 | gateway_stop hook | INFO | "OpenClaw gateway stopping: {reason}" |
| 3 | message_received hook | INFO | User message content (or "User message (N chars)", subject to the privacy switch) |
| 4 | before_message_write hook (sync) | INFO | Every message written to the session transcript (user/assistant/tool roles), subject to the privacy switch |
| 5 | diagnostic session.stuck event | WARN | "Stuck session: {key} state={state} age={age}ms" |
| 6 | diagnostic tool.loop event | WARN | "Tool loop: {tool} ({detector}) count={count}" |
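Row 3 suggests that the log body is either the raw message or a length-only placeholder, depending on allowUserDetailInfoReport. The sketch below shows one reading of that rule; `messageLogBody` is a hypothetical helper, and counting N with `String.length` (UTF-16 code units) is an assumption.

```typescript
// Choose the log body for a received message based on the privacy switch.
function messageLogBody(content: string, allowUserDetailInfoReport: boolean): string {
  return allowUserDetailInfoReport
    ? content
    : `User message (${content.length} chars)`;
}

console.log(messageLogBody("hello world", true)); // hello world
console.log(messageLogBody("hello world", false)); // User message (11 chars)
```

With the default setting (false), only the message length leaves the process.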


Registered Hooks

| # | Hook name | Context type | Has sessionKey | What this plugin does |
|---|-----------|--------------|:---:|-----------------------|
| 1 | gateway_start | GatewayContext | - | Writes a log |
| 2 | gateway_stop | GatewayContext | - | Flushes and shuts down all providers |
| 3 | session_start | SessionContext | ✓ | Lazily creates the session (if it does not exist) |
| 4 | session_end | SessionContext | ✓ | Closes all spans, flushes |
| 5 | message_received | MessageContext | ✗ | Parks pendingMsg or creates the span directly; records metric + log |
| 6 | message_sending | MessageContext | ✗ | Captures the last output content |
| 7 | message_sent | MessageContext | ✗ | Creates a span; records metric |
| 8 | before_agent_start | AgentContext | ✓ | Creates the agent span |
| 9 | agent_end | AgentContext | ✓ | Closes the agent span; closes the root span and flushes after 800 ms |
| 10 | llm_input | AgentContext | ✓ | Creates the LLM span (long-lived) |
| 11 | llm_output | AgentContext | ✓ | Ends the LLM span (writes token data); records metric |
| 12 | before_tool_call | ToolContext | ✓ | Creates the tool span (parent = LLM) |
| 13 | after_tool_call | ToolContext | ✓ | Ends the tool span; records metric |
| 14 | tool_result_persist | ToolContext (sync) | ✓ | Creates a short span (child of the tool span) |
| 15 | after_compaction | AgentContext | ✓ | Creates the compaction span; records metric |
| 16 | subagent_spawned | SubagentContext | - | Creates the subagent span; records metric |
| 17 | subagent_ended | SubagentContext | - | Ends the subagent span |
| - | before_message_write | sync | - | Writes a log (chat content) |


Data-Flow Timeline of One Complete Interaction

Using the user question "What meetings do I have tomorrow?" as an example:

User sends the Feishu message "What meetings do I have tomorrow?"
    │
    ▼
① message_received hook (MessageContext, no sessionKey)
   → Metric: messages.received +1
   → Log: user message content
   → Session does not exist yet, so park pendingMsg = { from, content, conversationId, time }
    │
    ▼
② before_agent_start hook (AgentContext, has sessionKey)
   → ensureSession() lazily creates the session:
       → Span: openclaw.session starts (ROOT)
       → Span: message_received backfilled (from pendingMsg, closed immediately)
       → Metric: sessions.total +1
       → linkConv(conversationId → sessionKey) links the two
   → Span: agent.main starts (child of ROOT)
    │
    ▼
③ llm_input hook (AgentContext)
   → Span: llm.ark-doubao-seed-16/ep-xxx starts (child of agent)
   → Records llmStartTime and the prompt content
    │
    ▼
④ Model returns a tool_call instruction (handled inside OpenClaw)
    │
    ▼
⑤ before_tool_call hook (ToolContext)
   → Span: tool.feishu_calendar_event starts (child of LLM ← the key hierarchy!)
    │
    ▼
⑥ Tool execution completes (Feishu API calendar call)
    │
    ▼
⑦ after_tool_call hook (ToolContext)
   → Span: tool.feishu_calendar_event ends
   → Metric: tool_calls +1, tool_call.duration recorded
    │
    ▼
⑧ agent_end hook (AgentContext)
   → Span: agent.main ends (writes duration_ms, success)
   → Starts an 800 ms setTimeout to wait for llm_output
    │
    ▼
⑨ llm_output hook (AgentContext, fires after agent_end)
   → Span: llm ends (writes token data: input=117905, output=721)
   → Metric: llm_calls +1, llm_duration recorded
    │
    ▼
⑩ before_message_write hook (each time a message is written to the transcript, sync)
   → Log: { role: "assistant", content: "You have 3 meetings tomorrow..." }
    │
    ▼
⑪ 800 ms setTimeout fires
   → Span: openclaw.session ends (writes input/output/duration)
   → Metric: session.duration recorded
   → forceFlush() flushes all buffered spans
    │
    ▼
⑫ BatchSpanProcessor sends the batch to the OTel Collector
   → traces.jsonl: 5 spans (session, message_received, agent, llm, tool)
   → metrics.jsonl: counters + histograms
   → logs.jsonl: user message + assistant reply
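Steps ① and ② hinge on parking the message until a session exists and then backfilling its span. The sketch below models that handoff with span names in a plain array; `LazySessionStore`, `LazySession`, and `onMessageReceived` are hypothetical names, while `pendingMsg`, `ensureSession`, and `convToSession` follow the names used in this README.

```typescript
interface PendingMessage {
  from: string;
  content: string;
  conversationId: string;
  time: number;
}

// Span names stand in for real spans so the order can be inspected.
interface LazySession {
  sessionKey: string;
  spans: string[];
}

class LazySessionStore {
  readonly sessions = new Map<string, LazySession>();
  readonly convToSession = new Map<string, string>();
  pendingMsg: PendingMessage | null = null;

  // message_received has no sessionKey: record the span if the conversation
  // is already linked to a session, otherwise park the message.
  onMessageReceived(msg: PendingMessage): void {
    const key = this.convToSession.get(msg.conversationId);
    const session = key === undefined ? undefined : this.sessions.get(key);
    if (session !== undefined) {
      session.spans.push("message_received");
    } else {
      this.pendingMsg = msg;
    }
  }

  // Called by any hook that carries a sessionKey.
  ensureSession(sessionKey: string): LazySession {
    let session = this.sessions.get(sessionKey);
    if (session === undefined) {
      session = { sessionKey, spans: ["openclaw.session"] };
      this.sessions.set(sessionKey, session);
      const pending = this.pendingMsg;
      if (pending !== null) {
        session.spans.push("message_received"); // backfilled from the parked message
        this.convToSession.set(pending.conversationId, sessionKey);
        this.pendingMsg = null;
      }
    }
    return session;
  }
}

const lazyStore = new LazySessionStore();
lazyStore.onMessageReceived({ from: "user-1", content: "hi", conversationId: "conv-1", time: 0 });
const createdSession = lazyStore.ensureSession("session-1");
createdSession.spans.push("agent.main");
console.log(createdSession.spans); // [ 'openclaw.session', 'message_received', 'agent.main' ]
```

A second message on the same conversation then takes the direct path, because `convToSession` already resolves to the session.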

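Key Design Decisions

| Decision | Choice | Reason |
|----------|--------|--------|
| LLM span lifetime | Long-lived (created at llm_input, closed at llm_output) | Tool spans attach under the LLM span, so it must stay open until tool execution finishes |
| Tool span parent | LLM > Agent > Root (toolParentCtx) | Matches the Volcengine APMPlus trace view: llm → tool hierarchy |
| Session creation | Lazy (on the first sessionKey seen) | message_received has no sessionKey, so it cannot be relied on to create the session |
| message_received span | Park in pendingMsg, backfill in ensureSession | Works around hook ordering: message_received fires before the session is created |
| agent_end → root close | 800 ms setTimeout delay | llm_output fires after agent_end, so the root span must wait for it to write token data |
| globalThis singleton | All jiti loads share the providers and store | OpenClaw loads the same module several times; sharing avoids duplicate instances and split traceIds |
| conversationId mapping | convToSession Map | MessageContext has no sessionKey, so the session is found indirectly via convId |
| TTL cleanup | 60 s interval, 10 min TTL | Prevents session leaks (in DM mode session_end may never fire) |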
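The agent_end → root close decision can be illustrated with an injectable scheduler, which makes the ordering visible without waiting on a real timer. This is a sketch under assumptions: `ClosableSession`, `Scheduler`, `onAgentEnd`, and `onLlmOutput` are hypothetical names, and the real plugin would end the root span and flush where the comment indicates.

```typescript
type Scheduler = (fn: () => void, delayMs: number) => void;

interface ClosableSession {
  llmTokens: { input: number; output: number } | null;
  closed: boolean;
  closedWithTokens: boolean;
}

const ROOT_CLOSE_DELAY_MS = 800;

// Production-style scheduler (defined for reference, not used below).
const timerScheduler: Scheduler = (fn, delayMs) => {
  setTimeout(fn, delayMs);
};

// agent_end: defer the root close so a late llm_output can still write token data.
function onAgentEnd(session: ClosableSession, schedule: Scheduler): void {
  schedule(() => {
    session.closedWithTokens = session.llmTokens !== null;
    session.closed = true; // the real plugin would end the root span and flush here
  }, ROOT_CLOSE_DELAY_MS);
}

// llm_output: arrives after agent_end in the observed hook order.
function onLlmOutput(session: ClosableSession, input: number, output: number): void {
  session.llmTokens = { input, output };
}

// Manual scheduler: queue callbacks and run them when we choose.
const taskQueue: Array<{ fn: () => void; delayMs: number }> = [];
const manualScheduler: Scheduler = (fn, delayMs) => {
  taskQueue.push({ fn, delayMs });
};

const demoSession: ClosableSession = { llmTokens: null, closed: false, closedWithTokens: false };
onAgentEnd(demoSession, manualScheduler); // step 8
onLlmOutput(demoSession, 117905, 721); // step 9 lands inside the delay window
taskQueue.forEach((task) => task.fn()); // step 11: the timer fires
console.log(demoSession.closed, demoSession.closedWithTokens); // true true
```

Closing the root span immediately at agent_end would have produced `closedWithTokens === false` in this ordering, which is the problem the delay avoids.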


Configuration

In ~/.openclaw/openclaw.json:

{
  "plugins": {
    "allow": ["openclaw-otel-plugin"],
    "entries": {
      "openclaw-otel-plugin": {
        "enabled": true,
        "config": {
          "endpoint": "http://localhost:4318",
          "debug": true,
          "traces": true,
          "metrics": true,
          "logs": true,
          "allowUserDetailInfoReport": true,
          "exportIntervalMillis": 15000
        }
      }
    }
  }
}

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| endpoint | string | http://localhost:4318 | OTLP Collector address |
| headers | object | {} | Custom HTTP headers (for example Authorization) |
| serviceName | string | openclaw-gateway | OTel resource service.name |
| debug | boolean | false | Enables debug logs prefixed with [otel] |
| traces | boolean | true | Enables trace export |
| metrics | boolean | true | Enables metric export |
| logs | boolean | true | Enables log export |
| exportIntervalMillis | number | 15000 | Metrics export interval (ms) |
| allowUserDetailInfoReport | boolean | false | Reports message content to spans/logs (privacy control) |
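The defaults in the table can be applied by merging a partial user config over a defaults object. The sketch below shows that merge; `OtelPluginConfig`, `DEFAULT_CONFIG`, and `resolveConfig` are hypothetical names, and the plugin's actual validation logic is not shown in this README.

```typescript
interface OtelPluginConfig {
  endpoint: string;
  headers: Record<string, string>;
  serviceName: string;
  debug: boolean;
  traces: boolean;
  metrics: boolean;
  logs: boolean;
  exportIntervalMillis: number;
  allowUserDetailInfoReport: boolean;
}

// Defaults copied from the configuration table.
const DEFAULT_CONFIG: OtelPluginConfig = {
  endpoint: "http://localhost:4318",
  headers: {},
  serviceName: "openclaw-gateway",
  debug: false,
  traces: true,
  metrics: true,
  logs: true,
  exportIntervalMillis: 15000,
  allowUserDetailInfoReport: false,
};

// Merge user-supplied fields over the defaults; headers are merged key by key.
function resolveConfig(user: Partial<OtelPluginConfig> = {}): OtelPluginConfig {
  return {
    ...DEFAULT_CONFIG,
    ...user,
    headers: { ...DEFAULT_CONFIG.headers, ...(user.headers ?? {}) },
  };
}

const resolved = resolveConfig({ endpoint: "http://collector:4318", debug: true });
console.log(resolved.serviceName, resolved.allowUserDetailInfoReport); // openclaw-gateway false
```

Keeping allowUserDetailInfoReport false unless it is set explicitly matches the privacy default described in the table.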