@bryel/trace-sdk

v0.3.0

Published

19 days ago

TypeScript LLM-tracing SDK — drop-in for @raindrop-ai/ai-sdk; ships traces/events/signals to a backend you own.


@bryel/trace-sdk

A TypeScript LLM-tracing SDK built as a drop-in for @raindrop-ai/ai-sdk. It emits the same traces / events / signals / identify wire protocol, so you can ship LLM observability to a backend you own instead of Raindrop.ai.

The public API is a superset of the vendor's: createTraceSDK (aliased as createRaindropAISDK), wrap(), events.*, traces.*, signals.*, users.*, eventMetadata / eventMetadataFromChatRequest, plus all the vendor's exported types. Three entry points — . (node), ./browser, ./workers — differing only in async-context backend.

Point it at your backend

import { createTraceSDK } from "@bryel/trace-sdk"          // or "@bryel/trace-sdk/browser"

const sdk = createTraceSDK({
  writeKey: "your-key",
  endpoint: "https://your-ingest.example.com/v1/",         // defaults to https://api.raindrop.ai/v1/
})

Wire endpoints (POST {endpoint}…): traces (OTLP/HTTP JSON), events/track_partial, signals/track, users/identify. Auth: Authorization: Bearer <writeKey>.
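On the receiving side, a self-hosted ingest has to accept those four POST routes and check the bearer token. The following is a minimal sketch of that routing decision, not part of this package: `routeIngest`, `IngestRequest`, and the status codes chosen are hypothetical, and it assumes each path is appended directly to an `endpoint` that ends in `/`.

```typescript
// Hypothetical helper for a self-hosted ingest; not exported by @bryel/trace-sdk.
const INGEST_PATHS = [
  "traces",               // OTLP/HTTP JSON
  "events/track_partial",
  "signals/track",
  "users/identify",
] as const;
type IngestPath = (typeof INGEST_PATHS)[number];

interface IngestRequest {
  method: string;
  url: string;            // full request URL
  authorization?: string; // raw Authorization header value
}

type IngestDecision =
  | { status: 200; path: IngestPath }
  | { status: 401 | 404 | 405 };

function routeIngest(req: IngestRequest, endpoint: string, writeKey: string): IngestDecision {
  if (req.method !== "POST") return { status: 405 };
  // The SDK authenticates with "Authorization: Bearer <writeKey>".
  if (req.authorization !== `Bearer ${writeKey}`) return { status: 401 };
  // Assumption: paths are joined straight onto the endpoint ("…/v1/" + "traces").
  const base = endpoint.endsWith("/") ? endpoint : endpoint + "/";
  if (!req.url.startsWith(base)) return { status: 404 };
  const rest = req.url.slice(base.length);
  const path = INGEST_PATHS.find((p) => p === rest);
  return path ? { status: 200, path } : { status: 404 };
}
```

A real server would then parse the JSON body for the matched path and persist it; that part depends on your storage and is left out here.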

Verified fidelity

Correctness is enforced by differential tests against the real vendor: each scenario drives @raindrop-ai/ai-sdk against an in-process mock ingest, captures its HTTP, and asserts our output matches after normalizing volatile/identity fields (ids, timestamps, $context, service identity).
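The normalize-then-compare step can be sketched roughly as below. This is an illustration of the idea only: `normalize`, `sameOnTheWire`, and the key names in `VOLATILE_KEYS` are assumptions, and the actual test suite may normalize a different set of fields.

```typescript
// Assumed list of volatile keys, for illustration; the real suite's list may differ.
const VOLATILE_KEYS = new Set(["traceId", "spanId", "event_id", "timestamp", "$context"]);

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Replace the value under every volatile key with a fixed placeholder, recursively.
function normalize(value: Json): Json {
  if (Array.isArray(value)) return value.map(normalize);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      out[key] = VOLATILE_KEYS.has(key) ? "<normalized>" : normalize(value[key]);
    }
    return out;
  }
  return value;
}

// Compare serialized forms, so key order and nesting still have to match exactly.
function sameOnTheWire(ours: Json, vendor: Json): boolean {
  return JSON.stringify(normalize(ours)) === JSON.stringify(normalize(vendor));
}
```

Comparing the serialized strings rather than doing a structural deep-equal keeps the check sensitive to key order, which is closer to a byte-level comparison.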

Byte-identical to the vendor after normalization (test/golden/parity.test.ts):

traces — OTLP span shape, eventId span attribute, batching
events/track_partial — event/ai_data/$context shape, is_pending, user-id-required drop behavior
signals/track — array body, snake_case
users/identify — array body

wrap() auto-trace span tree (test/golden/wrapTraces.parity.test.ts): we inject a minimal OpenTelemetry-compatible Tracer into the AI SDK's experimental_telemetry.tracer, so the AI SDK nests spans itself. Verified byte-identical to the vendor on:

Span structure — ai.generateText root → ai.generateText.doGenerate / ai.toolCall children (flat under root), one shared trace, children-first ordering
Canonical telemetry — model id/provider, operation.name/ai.operationId, response text, finish reason, token counts, gen_ai.*, the ai.telemetry.metadata.raindrop.eventId stamp on every span
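To make the tracer-injection idea concrete, here is a small recording tracer that covers a structural subset of the OpenTelemetry `Tracer`/`Span` interfaces. It is an illustrative stand-in, not the tracer this SDK injects: `RecordingTracer`, `RecordingSpan`, and `FinishedSpan` are hypothetical names, and a tracer passed to `experimental_telemetry.tracer` would also need the rest of the span surface (`setStatus`, `recordException`, `addEvent`, `spanContext`, and so on). It shows one way children-first ordering can arise: spans are recorded when they end, and children end before their parent.

```typescript
type AttrValue = string | number | boolean | string[] | number[] | boolean[];

interface FinishedSpan {
  name: string;
  attributes: Record<string, AttrValue>;
}

class RecordingSpan {
  readonly attributes: Record<string, AttrValue> = {};
  private ended = false;
  constructor(readonly name: string, private readonly sink: FinishedSpan[]) {}

  setAttribute(key: string, value: AttrValue): this {
    this.attributes[key] = value;
    return this;
  }
  setAttributes(attrs: Record<string, AttrValue>): this {
    Object.assign(this.attributes, attrs);
    return this;
  }
  isRecording(): boolean {
    return !this.ended;
  }
  end(): void {
    if (this.ended) return; // ending twice records only once
    this.ended = true;
    this.sink.push({ name: this.name, attributes: { ...this.attributes } });
  }
}

class RecordingTracer {
  readonly finished: FinishedSpan[] = [];

  startSpan(name: string): RecordingSpan {
    return new RecordingSpan(name, this.finished);
  }

  // Accepts (name, fn), (name, options, fn) and (name, options, context, fn);
  // the context argument is ignored in this sketch.
  startActiveSpan<T>(name: string, ...args: unknown[]): T {
    const fn = args[args.length - 1] as (span: RecordingSpan) => T;
    const options =
      args.length > 1 ? (args[0] as { attributes?: Record<string, AttrValue> }) : undefined;
    const span = this.startSpan(name);
    if (options?.attributes) span.setAttributes(options.attributes);
    return fn(span);
  }
}

// Usage: children end (and are recorded) before the root.
const tracer = new RecordingTracer();
tracer.startActiveSpan("ai.generateText", (root: RecordingSpan) => {
  tracer.startActiveSpan("ai.generateText.doGenerate", (s: RecordingSpan) => s.end());
  tracer.startActiveSpan("ai.toolCall", (s: RecordingSpan) => s.end());
  root.end();
});
```

After the usage snippet runs, `tracer.finished` lists the two child spans first and the `ai.generateText` root last.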

Fidelity boundary (known differences from the vendor)

These are deliberate, documented limits — not bugs:

| Area | Behavior |
|---|---|
| wrap() trace attribute set | We forward live, correct ai@6 native telemetry (ai.usage.inputTokens/outputTokens/inputTokenDetails.*, ai.settings.maxRetries, ai.prompt as {prompt}). The vendor instead hand-reconstructs spans with frozen legacy keys (ai.usage.promptTokens/completionTokens, ai.response.toolCalls, ai.toolCall.count, ai.prompt as a message array). Same underlying data; we chose live telemetry over reproducing deprecated keys. |
| selfDiagnostics | The __raindrop_report tool is injected into the wrapped call's tools when enabled (off by default). Tool-call → signal conversion is wired through signals. |
| nativeTelemetry, autoAttachment | Accepted in WrapAISDKOptions for type compatibility; currently no-ops. |
| traces.createSpan | Honors name/attributes/error; input/output/durationMs/startTime are not yet applied (zero-duration one-shot span). FramerStudio uses startSpan/endSpan, not createSpan. |
| Streaming output capture | streamText/streamObject event output is not buffered from the stream (the stream is never drained by telemetry). Spans for streaming calls still emit. |
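If your backend has to read spans from both this SDK and the vendor, the token-count difference in the first row can be absorbed on the reading side. The helper below is a hypothetical sketch, not part of this package: `readTokenUsage` is an invented name, and it assumes span attributes have already been flattened into a plain key/value object rather than the OTLP JSON attribute array.

```typescript
type SpanAttributes = Record<string, unknown>;

// Prefer the live keys this SDK forwards; fall back to the vendor's legacy keys.
function readTokenUsage(attrs: SpanAttributes): { input?: number; output?: number } {
  const pick = (...keys: string[]): number | undefined => {
    for (const key of keys) {
      const value = attrs[key];
      if (typeof value === "number") return value;
    }
    return undefined;
  };
  return {
    input: pick("ai.usage.inputTokens", "ai.usage.promptTokens"),
    output: pick("ai.usage.outputTokens", "ai.usage.completionTokens"),
  };
}
```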

Eval export (Harbor format)

Export agent eval runs as Harbor-format results (ATIF trajectory + CTRF checks + reward), shipped to a Bryel sink. The trajectory is built from the SDK's captured traces; scores, cost, and tokens come from your eval harness (e.g. FramerStudio evals2 AgentEvalRunResult).

const sdk = createTraceSDK({
  endpoint: mainTraceEndpoint,                       // your main traces go here
  evals: { sink: { endpoint: BRYEL_URL, writeKey: BRYEL_KEY } },  // Harbor results go here
})

const job = sdk.evals.startJob({ name: "framer-bench", dataset: "framer-102" })
// per eval case: run the agent under a known eventId, then:
const trial = await job.exportTrial(agentEvalRunResult, { eventId })  // → POST /v1/trials
await job.finish(reportSummary)                                       // → POST /v1/jobs

startJob enables trajectory recording; exportTrial builds + ships one Harbor trial bundle (ATIF + CTRF + reward) and evicts that run's spans; finish ships the job aggregate and stops recording.
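The per-case loop implied above might look like the sketch below. It is written against a minimal structural interface instead of the SDK's real types: `EvalJobLike`, `EvalCase`, `exportAll`, `runCase`, `summarize`, and the `eval-<id>` event id scheme are all hypothetical. Only the `exportTrial(result, { eventId })` and `finish(summary)` call shapes come from the example above.

```typescript
// Minimal structural view of the object returned by sdk.evals.startJob(...).
interface EvalJobLike<Result, Summary> {
  exportTrial(result: Result, opts: { eventId: string }): Promise<unknown>;
  finish(summary: Summary): Promise<unknown>;
}

interface EvalCase {
  id: string;
}

async function exportAll<Result, Summary>(
  job: EvalJobLike<Result, Summary>,
  cases: EvalCase[],
  runCase: (c: EvalCase, eventId: string) => Promise<Result>, // runs the agent under eventId
  summarize: (results: Result[]) => Summary,
): Promise<number> {
  const results: Result[] = [];
  try {
    for (const c of cases) {
      const eventId = `eval-${c.id}`; // assumed id scheme; any known, unique id works
      const result = await runCase(c, eventId);
      await job.exportTrial(result, { eventId }); // one trial bundle per case
      results.push(result);
    }
  } finally {
    // finish() stops recording, so call it even if a case throws.
    await job.finish(summarize(results));
  }
  return results.length;
}
```

Cases run sequentially here to keep the sketch simple; whether trials can be exported concurrently is not something this README states.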

Develop

npm install
npm test          # vitest — full differential + parity suite
npm run typecheck # tsc --noEmit
npm run build     # tsup → dist/ (esm + cjs + d.ts, 3 entries)

Status

The SDK is feature-complete and verified as a drop-in for the manual API surface and the wrap() event + auto-trace-structure paths. Not yet done: the backend ingest server (the service implementing the 4 endpoints + persistence) and the FramerStudio cutover — both deferred by design.
