@runtime-judgement/vercel-ai

v0.1.0

Published

19 days ago

Vercel AI SDK → Runtime Judgement bridge — one-line model wrapper that auto-instruments AI calls and sends traces for attribution. Zero breaking changes to your existing AI SDK usage.

rossamac01

vercel-ai ai-sdk runtime-judgement observability agents llm attribution otel opentelemetry

Vercel AI SDK wrapper for Runtime Judgement — one line of code to instrument your AI calls and send traces for failure attribution and regression gating.

Installation

npm install @runtime-judgement/vercel-ai ai
# or
pnpm add @runtime-judgement/vercel-ai ai

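ai (the Vercel AI SDK) is a peer dependency, so you likely already have it installed. The examples below also import a provider package, @ai-sdk/openai, which is installed separately.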

Quick start

import { rj } from "@runtime-judgement/vercel-ai"
import { openai } from "@ai-sdk/openai"
import { generateText } from "ai"

// Wrap once at startup — drop-in replacement for any LanguageModelV1
const model = await rj(openai("gpt-4o"), {
  apiKey: process.env.RJ_API_KEY!,
})

// Use exactly like a normal model — no other changes needed
const result = await generateText({ model, prompt: "Summarise this article..." })

Synchronous variant

If you prefer not to await the wrapper (e.g. in module initialisation code), use rjSync and pass in wrapLanguageModel from ai directly:

import { rjSync } from "@runtime-judgement/vercel-ai"
import { wrapLanguageModel } from "ai"
import { openai } from "@ai-sdk/openai"

const model = rjSync(openai("gpt-4o"), { apiKey: process.env.RJ_API_KEY! }, wrapLanguageModel)

Linking to a snapshot suite

Pass suiteId to automatically link every trace to a Runtime Judgement snapshot suite. When a trace arrives, RJ can auto-run the suite and flag regressions:

const model = await rj(openai("gpt-4o"), {
  apiKey: process.env.RJ_API_KEY!,
  suiteId: "01HZSUITE_ID_FROM_RJ_UI",
})

Configuration options

| Option | Type | Default | Description |
|---|---|---|---|
| apiKey | string | required | Runtime Judgement API key. Generate at /app/settings/api-keys. |
| endpoint | string | https://runtime-judgement.app/api/ingest/otlp/v1/traces | OTLP ingest endpoint. Override for self-hosted deployments. |
| suiteId | string | — | Snapshot suite ID to link traces to. Enables auto-regression gating. |
| disabled | boolean | false | Set true to skip all trace sending (e.g. in unit tests or local dev). The model still works normally. |
| maxPromptChars | number | 1000 | Maximum prompt characters captured per call. Longer prompts are truncated. |
| maxCompletionChars | number | 2000 | Maximum completion characters captured per call. Longer completions are truncated. |
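
To make the defaults concrete, here is a minimal sketch of how the documented options could be resolved. The names RjOptionsSketch and resolveOptions are hypothetical and only mirror the table; the package's real exported types may differ.

```typescript
// Hypothetical option shape mirroring the table above; the package's real
// exported types may be named or structured differently.
interface RjOptionsSketch {
  apiKey: string;
  endpoint?: string;
  suiteId?: string;
  disabled?: boolean;
  maxPromptChars?: number;
  maxCompletionChars?: number;
}

const DEFAULT_ENDPOINT =
  "https://runtime-judgement.app/api/ingest/otlp/v1/traces";

// Fill in the documented default for every option the caller omitted.
// `??` (not `||`) is used so that an explicit 0 or false is respected.
function resolveOptions(opts: RjOptionsSketch) {
  return {
    apiKey: opts.apiKey,
    endpoint: opts.endpoint ?? DEFAULT_ENDPOINT,
    suiteId: opts.suiteId,
    disabled: opts.disabled ?? false,
    maxPromptChars: opts.maxPromptChars ?? 1000,
    maxCompletionChars: opts.maxCompletionChars ?? 2000,
  };
}

const resolved = resolveOptions({ apiKey: "rj_example_key", maxPromptChars: 0 });
console.log(resolved);
```

Note that maxPromptChars: 0 survives resolution, which matters because 0 is a meaningful setting (no prompt text captured).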

What data is captured

Each doGenerate (non-streaming) call sends one OTLP span with:

gen_ai.system — inferred from the model name (e.g. "openai" for GPT models)
gen_ai.request.model — the model ID string (e.g. "gpt-4o")
gen_ai.usage.prompt_tokens — token count from the SDK response
gen_ai.usage.completion_tokens — token count from the SDK response
gen_ai.response.finish_reasons — e.g. ["stop"], ["length"]
gen_ai.prompt — first maxPromptChars chars of the prompt (default 1000)
gen_ai.output.value — first maxCompletionChars chars of the completion (default 2000)
Start and end timestamps, as nanoseconds since the UNIX epoch
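
As a rough illustration of the span described above, the sketch below assembles an OTLP/JSON payload carrying the listed attributes. The helper names (CallRecord, buildSpanPayload), the span name, the scope name, and the random ID generation are assumptions made for illustration; the payload the package actually sends may differ in detail.

```typescript
// Hypothetical record of one non-streaming call; field names are illustrative.
interface CallRecord {
  system: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  finishReason: string;
  prompt: string;
  completion: string;
  startMs: number; // integer epoch milliseconds, e.g. Date.now()
  endMs: number;
}

// OTLP/JSON encodes 64-bit integers as decimal strings.
const str = (key: string, v: string) => ({ key, value: { stringValue: v } });
const int = (key: string, v: number) => ({ key, value: { intValue: String(v) } });

// Integer milliseconds to a nanosecond string. Appending six zeros avoids
// the precision loss of multiplying past Number.MAX_SAFE_INTEGER.
const toNanos = (ms: number) => `${Math.floor(ms)}000000`;

// Random lowercase hex (OTLP trace IDs are 32 hex chars, span IDs 16).
function randomHex(len: number): string {
  let out = "";
  for (let i = 0; i < len; i++) out += Math.floor(Math.random() * 16).toString(16);
  return out;
}

function buildSpanPayload(call: CallRecord) {
  const span = {
    traceId: randomHex(32),
    spanId: randomHex(16),
    name: "gen_ai.generate", // assumed span name
    startTimeUnixNano: toNanos(call.startMs),
    endTimeUnixNano: toNanos(call.endMs),
    attributes: [
      str("gen_ai.system", call.system),
      str("gen_ai.request.model", call.model),
      int("gen_ai.usage.prompt_tokens", call.promptTokens),
      int("gen_ai.usage.completion_tokens", call.completionTokens),
      {
        key: "gen_ai.response.finish_reasons",
        value: { arrayValue: { values: [{ stringValue: call.finishReason }] } },
      },
      str("gen_ai.prompt", call.prompt),
      str("gen_ai.output.value", call.completion),
    ],
  };
  // Standard OTLP/JSON nesting: resourceSpans -> scopeSpans -> spans.
  return {
    resourceSpans: [
      { resource: { attributes: [] }, scopeSpans: [{ scope: { name: "sketch" }, spans: [span] }] },
    ],
  };
}

const payload = buildSpanPayload({
  system: "openai", model: "gpt-4o", promptTokens: 12, completionTokens: 34,
  finishReason: "stop", prompt: "Summarise this article...", completion: "A summary.",
  startMs: 1700000000000, endMs: 1700000000850,
});
console.log(JSON.stringify(payload, null, 2));
```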

What is NOT captured

Tool call arguments are not currently captured, so any secrets or PII they contain are not transmitted. A redactFields option for fine-grained field redaction is planned for a follow-up sprint.
Streaming calls are not instrumented: only doGenerate (non-streaming) is currently covered. doStream support is planned.
Prompt text beyond maxPromptChars (default 1000) is truncated and not captured.
Completion text beyond maxCompletionChars (default 2000) is truncated and not captured.

Privacy note

Prompt and completion text is captured and transmitted to the Runtime Judgement ingest endpoint. By default:

Prompts are truncated to 1000 characters
Completions are truncated to 2000 characters

You can reduce these limits further (maxPromptChars: 0 captures no prompt text at all), or set disabled: true in environments where you do not want any data transmitted.
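
The truncation behaviour can be pictured with a small sketch. The function name captureText is hypothetical, and counting by UTF-16 code units (what String.prototype.slice does) is an assumption; the package may count characters differently.

```typescript
// Keep at most `maxChars` leading characters; a limit of 0 (or less)
// captures nothing at all.
function captureText(text: string, maxChars: number): string {
  if (maxChars <= 0) return "";
  return text.slice(0, maxChars);
}

const longPrompt = new Array(1501).join("a"); // 1500 characters

console.log(captureText(longPrompt, 1000).length); // 1000
console.log(captureText(longPrompt, 0).length);    // 0
console.log(captureText("short prompt", 1000));    // "short prompt"
```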

Data is transmitted over HTTPS with Bearer token authentication. Runtime Judgement does not sell or share trace data. See our privacy policy for details.
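
For orientation, an ingest request with Bearer authentication would look roughly like the sketch below. buildIngestRequest is a hypothetical helper, and the Content-Type value assumes the OTLP/HTTP JSON encoding; only the endpoint URL and the use of a Bearer token come from this README.

```typescript
// Build the URL and fetch-style init object for one trace upload.
function buildIngestRequest(endpoint: string, apiKey: string, payload: unknown) {
  return {
    url: endpoint,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(payload),
    },
  };
}

const req = buildIngestRequest(
  "https://runtime-judgement.app/api/ingest/otlp/v1/traces",
  "rj_example_key", // placeholder, not a real key
  { resourceSpans: [] },
);
console.log(req.url, req.init.headers.Authorization);
```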

How it works

rj() wraps your model using the Vercel AI SDK's wrapLanguageModel API with a custom middleware. The middleware:

Calls the real model and returns its result without waiting for telemetry, so the trace upload adds no network round trip to your call
Captures timing, prompt, completion, and token usage from the result
Builds an OTLP JSON payload following the gen_ai.* semantic conventions
POSTs the payload to the Runtime Judgement ingest endpoint in the background (fire-and-forget)
Swallows any errors from the POST so telemetry failures never interrupt your agent
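
The steps above can be sketched as a fire-and-forget wrapper. This is not the package's source: generateWithTelemetry, buildPayload, and send are hypothetical names, and the real middleware plugs into wrapLanguageModel rather than taking plain callbacks.

```typescript
type Send = (payload: unknown) => Promise<void>;

async function generateWithTelemetry<T>(
  doGenerate: () => Promise<T>,
  buildPayload: (result: T, startMs: number, endMs: number) => unknown,
  send: Send,
): Promise<T> {
  const startMs = Date.now();
  const result = await doGenerate(); // model errors propagate to the caller
  const endMs = Date.now();

  try {
    // Not awaited: the caller gets `result` without waiting for the POST.
    // The .catch swallows asynchronous failures (network errors, non-2xx).
    void send(buildPayload(result, startMs, endMs)).catch(() => {});
  } catch {
    // Swallow synchronous failures from building or starting the send.
  }
  return result;
}

// Usage with a fake model and an ingest endpoint that always fails.
generateWithTelemetry(
  async () => ({ text: "hello" }),
  (result, startMs, endMs) => ({ result, startMs, endMs }),
  async () => { throw new Error("ingest unreachable"); },
).then((result) => console.log(result.text));
```

Even though the send callback rejects, the call still resolves with the model result, which is the property the last step describes.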

TypeScript support

This package ships full TypeScript definitions. The rj() wrapper preserves the generic type of the original model, so the returned model is typed identically to the input.

Testing your instrumentation

Set disabled: true in your test configuration to prevent calls to the Runtime Judgement endpoint:

const model = await rj(openai("gpt-4o"), {
  apiKey: "test-key",
  disabled: process.env.NODE_ENV === "test",
})

Or mock fetch in your test suite to capture the OTLP payloads.
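
A minimal sketch of the fetch-mocking approach follows, assuming the wrapper sends traces through the global fetch. installFetchSpy and exampleTest are hypothetical helpers, not part of the package, and the direct fetch call inside exampleTest stands in for a real model call made through the wrapped model.

```typescript
interface CapturedCall {
  url: string;
  body: unknown;
}

// Replace global fetch with a stub that records each call and returns a
// canned success response. Call restore() when the test finishes.
function installFetchSpy() {
  const g = globalThis as any;
  const original = g.fetch;
  const calls: CapturedCall[] = [];
  g.fetch = async (url: unknown, init?: { body?: unknown }) => {
    const raw = init?.body;
    calls.push({
      url: String(url),
      body: typeof raw === "string" ? JSON.parse(raw) : raw,
    });
    return { ok: true, status: 200, json: async () => ({}), text: async () => "" };
  };
  return { calls, restore: () => { g.fetch = original; } };
}

// Shape of a test body; invoke it from your test runner of choice.
async function exampleTest(): Promise<CapturedCall[]> {
  const spy = installFetchSpy();
  try {
    // In a real test, call generateText with the wrapped model here, then
    // give the background POST a moment to fire before asserting.
    await (globalThis as any).fetch("https://example.test/ingest", {
      method: "POST",
      body: JSON.stringify({ resourceSpans: [] }),
    });
    return spy.calls;
  } finally {
    spy.restore();
  }
}
```

Because the upload is fire-and-forget, a real test may need to await a short timer or flush pending promises before inspecting the captured calls.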
