@bryel/trace-sdk
v0.3.0
TypeScript LLM-tracing SDK — drop-in for @raindrop-ai/ai-sdk; ships traces/events/signals to a backend you own.
@bryel/trace-sdk
A TypeScript LLM-tracing SDK built as a drop-in for @raindrop-ai/[email protected]. It emits the same traces / events / signals / identify wire protocol so you can ship LLM observability to a backend you own instead of Raindrop.ai.
The public API is a superset of the vendor's: createTraceSDK (aliased as createRaindropAISDK), wrap(), events.*, traces.*, signals.*, users.*, eventMetadata / eventMetadataFromChatRequest, plus all the vendor's exported types. Three entry points — . (node), ./browser, ./workers — differing only in async-context backend.
Point it at your backend
```ts
import { createTraceSDK } from "@bryel/trace-sdk" // or "@bryel/trace-sdk/browser"

const sdk = createTraceSDK({
  writeKey: "your-key",
  endpoint: "https://your-ingest.example.com/v1/", // defaults to https://api.raindrop.ai/v1/
})
```

Wire endpoints (POST {endpoint}…): traces (OTLP/HTTP JSON), events/track_partial, signals/track, users/identify. Auth: Authorization: Bearer <writeKey>.
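For anyone implementing or debugging the receiving side, the sketch below shows how a request to one of the four wire endpoints could be assembled from the configured endpoint and write key. `buildIngestRequest`, `WirePath`, and `IngestRequest` are hypothetical names used for illustration and are not part of the SDK's API; the `Content-Type` header is an assumption.

```ts
// Hypothetical helper for illustration; not exported by @bryel/trace-sdk.
type WirePath = "traces" | "events/track_partial" | "signals/track" | "users/identify";

interface IngestRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

function buildIngestRequest(endpoint: string, path: WirePath, writeKey: string): IngestRequest {
  // Normalize the trailing slash so ".../v1" and ".../v1/" resolve to the same URL.
  const base = endpoint.endsWith("/") ? endpoint : endpoint + "/";
  return {
    url: base + path,
    method: "POST",
    headers: {
      Authorization: `Bearer ${writeKey}`,
      "Content-Type": "application/json", // assumption: every endpoint takes a JSON body
    },
  };
}

const req = buildIngestRequest("https://your-ingest.example.com/v1/", "signals/track", "your-key");
console.log(req.url, req.headers["Authorization"]);
```

A backend that accepts these four paths and validates the bearer token is what the SDK expects to find at `endpoint`.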
Verified fidelity
Correctness is enforced by differential tests against the real vendor: each scenario drives @raindrop-ai/[email protected] against an in-process mock ingest, captures its HTTP, and asserts our output matches after normalizing volatile/identity fields (ids, timestamps, $context, service identity).
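The normalize-then-compare step can be pictured roughly as follows. This is a minimal sketch, not the suite's actual code: the `VOLATILE_KEYS` list, the placeholder value, and the sample payload fields are assumptions chosen for illustration.

```ts
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Assumed set of volatile keys; the real suite's list is not shown in this README.
const VOLATILE_KEYS = new Set(["id", "event_id", "timestamp", "$context"]);

// Replace volatile values with a fixed placeholder and sort object keys,
// so two captures differ only where stable content differs.
function normalize(value: Json): Json {
  if (Array.isArray(value)) return value.map(normalize);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    for (const [key, child] of entries) {
      out[key] = VOLATILE_KEYS.has(key) ? "<normalized>" : normalize(child);
    }
    return out;
  }
  return value;
}

function sameWire(vendorBody: Json, ourBody: Json): boolean {
  return JSON.stringify(normalize(vendorBody)) === JSON.stringify(normalize(ourBody));
}

// Hypothetical captures: same stable content, different ids and timestamps.
const vendorCapture: Json = { event: "chat", event_id: "a1", timestamp: "2024-01-01T00:00:00Z", is_pending: false };
const ourCapture: Json = { is_pending: false, event: "chat", event_id: "b2", timestamp: "2024-01-02T00:00:00Z" };

console.log(sameWire(vendorCapture, ourCapture));
```

Replacing volatile values with a placeholder, instead of deleting the keys, keeps the comparison sensitive to a field that is missing altogether.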
Byte-identical to the vendor (test/golden/parity.test.ts):
- traces: OTLP span shape, eventId span attribute, batching
- events/track_partial: event / ai_data / $context shape, is_pending, user-id-required drop behavior
- signals/track: array body, snake_case
- users/identify: array body
wrap() auto-trace span tree (test/golden/wrapTraces.parity.test.ts): we inject a minimal OpenTelemetry-compatible Tracer into the AI SDK's experimental_telemetry.tracer, so the AI SDK nests spans itself. Verified byte-identical to the vendor on:
- Span structure: ai.generateText root → ai.generateText.doGenerate / ai.toolCall children (flat under root), one shared trace, children-first ordering
- Canonical telemetry: model id/provider, operation.name / ai.operationId, response text, finish reason, token counts, gen_ai.*, the ai.telemetry.metadata.raindrop.eventId stamp on every span
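To illustrate why an injected tracer yields a flat, single-trace tree with children emitted first, here is a minimal recording tracer. It is a sketch under stated assumptions: `RecordingTracer`, `MiniSpan`, and `RecordedSpan` are simplified local stand-ins rather than the SDK's tracer or the full OpenTelemetry interfaces, and a synchronous stack stands in for the async-context backend a real entry point would need.

```ts
// Simplified stand-ins for the OpenTelemetry span/tracer shapes (assumption:
// the real interfaces carry more methods than shown here).
interface RecordedSpan {
  name: string;
  traceId: string;
  spanId: string;
  parentSpanId: string | undefined;
  attributes: Record<string, unknown>;
}

interface MiniSpan {
  setAttribute(key: string, value: unknown): MiniSpan;
  end(): void;
}

class RecordingTracer {
  readonly finished: RecordedSpan[] = []; // spans in the order they ended
  private readonly stack: RecordedSpan[] = []; // currently active spans
  private nextId = 1;

  startActiveSpan<T>(name: string, fn: (span: MiniSpan) => T): T {
    const parent = this.stack[this.stack.length - 1];
    const record: RecordedSpan = {
      name,
      traceId: parent ? parent.traceId : `trace-${this.nextId}`, // children inherit the trace
      spanId: `span-${this.nextId++}`,
      parentSpanId: parent ? parent.spanId : undefined,
      attributes: {},
    };
    const span: MiniSpan = {
      setAttribute: (key, value) => {
        record.attributes[key] = value;
        return span;
      },
      end: () => {
        this.finished.push(record); // a span is emitted when it ends
      },
    };
    this.stack.push(record);
    try {
      return fn(span);
    } finally {
      this.stack.pop();
    }
  }
}

// The caller (here, a stand-in for the AI SDK) nests spans itself.
const tracer = new RecordingTracer();
tracer.startActiveSpan("ai.generateText", (root) => {
  tracer.startActiveSpan("ai.generateText.doGenerate", (child) => child.end());
  tracer.startActiveSpan("ai.toolCall", (child) => child.end());
  root.end();
});

console.log(tracer.finished.map((s) => s.name));
```

Because each span is recorded when it ends, the two children appear before the root, and both carry the root's span id as their parent.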
Fidelity boundary (known differences from the vendor)
These are deliberate, documented limits — not bugs:
| Area | Behavior |
|---|---|
| wrap() trace attribute set | We forward live, correct ai@6 native telemetry (ai.usage.inputTokens/outputTokens/inputTokenDetails.*, ai.settings.maxRetries, ai.prompt as {prompt}). The vendor instead hand-reconstructs spans with frozen legacy keys (ai.usage.promptTokens/completionTokens, ai.response.toolCalls, ai.toolCall.count, ai.prompt as a message array). Same underlying data; we chose live telemetry over reproducing deprecated keys. |
| selfDiagnostics | The __raindrop_report tool is injected into the wrapped call's tools when enabled (off by default). Tool-call → signal conversion is wired through signals. |
| nativeTelemetry, autoAttachment | Accepted in WrapAISDKOptions for type compatibility; currently no-ops. |
| traces.createSpan | Honors name/attributes/error; input/output/durationMs/startTime are not yet applied (zero-duration one-shot span). FramerStudio uses startSpan/endSpan, not createSpan. |
| Streaming output capture | streamText/streamObject event output is not buffered from the stream (the stream is never drained by telemetry). Spans for streaming calls still emit. |
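If a downstream consumer still queries the legacy usage keys, a small backend-side shim could copy the live values under the old names. This is a hypothetical sketch: `withLegacyUsageKeys` is not part of the SDK, and pairing inputTokens with promptTokens and outputTokens with completionTokens is an assumption based on the key names in the table above.

```ts
// Assumed live-to-legacy key pairs; helper and mapping are illustrative only.
const LIVE_TO_LEGACY: Record<string, string> = {
  "ai.usage.inputTokens": "ai.usage.promptTokens",
  "ai.usage.outputTokens": "ai.usage.completionTokens",
};

// Returns a copy of the span attributes with legacy aliases added.
// Existing legacy keys are never overwritten.
function withLegacyUsageKeys(attrs: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = { ...attrs };
  for (const [live, legacy] of Object.entries(LIVE_TO_LEGACY)) {
    if (live in attrs && !(legacy in attrs)) {
      out[legacy] = attrs[live];
    }
  }
  return out;
}

const aliased = withLegacyUsageKeys({
  "ai.usage.inputTokens": 12,
  "ai.usage.outputTokens": 34,
});
console.log(aliased);
```

Keeping the aliasing in the backend leaves the SDK's emitted spans unchanged.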
Eval export (Harbor format)
Export agent eval runs as Harbor-format results (ATIF trajectory + CTRF checks + reward), shipped to a Bryel sink. The trajectory is built from the SDK's captured traces; scores/cost/tokens come from your eval harness (e.g. FramerStudio evals2 AgentEvalRunResult).
```ts
const sdk = createTraceSDK({
  endpoint: mainTraceEndpoint, // your main traces go here
  evals: { sink: { endpoint: BRYEL_URL, writeKey: BRYEL_KEY } }, // Harbor results go here
})

const job = sdk.evals.startJob({ name: "framer-bench", dataset: "framer-102" })

// per eval case: run the agent under a known eventId, then:
const trial = await job.exportTrial(agentEvalRunResult, { eventId }) // → POST /v1/trials

await job.finish(reportSummary) // → POST /v1/jobs
```

startJob enables trajectory recording; exportTrial builds and ships one Harbor trial bundle (ATIF + CTRF + reward) and evicts that run's spans; finish ships the job aggregate and stops recording.
Develop
```sh
npm install
npm test           # vitest: full differential + parity suite
npm run typecheck  # tsc --noEmit
npm run build      # tsup → dist/ (esm + cjs + d.ts, 3 entries)
```

Status
The SDK is feature-complete and verified as a drop-in for the manual API surface and the wrap() event + auto-trace-structure paths. Not yet done: the backend ingest server (the service implementing the 4 endpoints + persistence) and the FramerStudio cutover — both deferred by design.
