@reaatech/otel-genai-semconv-vertexai
v0.1.0
Vertex AI SDK instrumentation with OTel GenAI semantic conventions
Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.
Transparent instrumentation for the Google Generative Language (Vertex AI) SDK. Wraps model.generateContent() to emit OpenTelemetry GenAI semantic convention spans with GCP project/location metadata, generation config attributes, candidate events, and cost tracking for Gemini models.
Installation
npm install @reaatech/otel-genai-semconv-vertexai
# or
pnpm add @reaatech/otel-genai-semconv-vertexai
Feature Overview
- Zero-config instrumentation: call instrument(model) once, and every generateContent() call is traced
- GCP metadata: automatically attaches gcp.project_id and gcp.location when configured
- Generation config mapping: temperature, topP, topK, maxOutputTokens, stopSequences, and more mapped to OTel attributes
- Candidate events: each response candidate emits a gen_ai.choice event with text content and finish reason
- System instruction tracking: system instructions are captured as gen_ai.system.message events
- Double-instrumentation guard: calling instrument() twice is a safe no-op
- Lifecycle hooks: onStart and onEnd callbacks for custom span attributes
- Safe uninstrument: restores the original generateContent() method
- Dual ESM/CJS output: works with import and require
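The double-instrumentation guard and safe uninstrument behavior listed above can be pictured with a small, self-contained sketch. This is not the package's actual implementation: `FakeModel`, `wrapGenerateContent`, `unwrapGenerateContent`, and the `ORIGINAL` symbol are hypothetical names used only to illustrate one common way to patch and restore a method.

```typescript
// Hypothetical sketch of method wrapping with a double-instrumentation guard.
// None of these names come from the package; they only illustrate the idea.
type GenerateFn = (request: unknown) => Promise<unknown>;

interface FakeModel {
  generateContent: GenerateFn;
}

// Symbol key under which the original method is stashed on the model.
const ORIGINAL = Symbol("originalGenerateContent");

type Patched = FakeModel & { [ORIGINAL]?: GenerateFn };

function wrapGenerateContent(model: FakeModel, onCall: () => void): void {
  const target = model as Patched;
  if (target[ORIGINAL]) return; // already wrapped: second call is a no-op
  const original = target.generateContent;
  target[ORIGINAL] = original;
  target.generateContent = async (request: unknown) => {
    onCall(); // a real instrumentation would start a span here
    return original.call(model, request);
  };
}

function unwrapGenerateContent(model: FakeModel): void {
  const target = model as Patched;
  const original = target[ORIGINAL];
  if (!original) return; // never wrapped: nothing to restore
  target.generateContent = original;
  delete target[ORIGINAL];
}
```

Stashing the original function on the model itself means the guard needs no external registry, and restoring is a plain reassignment.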
Quick Start
import { VertexAIInstrumentation } from "@reaatech/otel-genai-semconv-vertexai";
const instrumentation = new VertexAIInstrumentation({
trackCosts: true,
projectId: "my-gcp-project",
location: "us-central1",
});
// `model` is a generative model instance obtained from the Vertex AI SDK (created elsewhere).
instrumentation.instrument(model);
const response = await model.generateContent({
contents: [{ role: "user", parts: [{ text: "What is OpenTelemetry?" }] }],
});
// Each call now emits OTel spans with gen_ai.* attributes
Captured Attributes
Request Attributes
| Attribute | Source | Description |
|-----------|--------|-------------|
| gen_ai.request.model | Model name | Model identifier |
| gen_ai.request.temperature | generationConfig.temperature | Sampling temperature |
| gen_ai.request.top_p | generationConfig.topP | Top-p sampling |
| gen_ai.request.top_k | generationConfig.topK | Top-k sampling |
| gen_ai.request.max_tokens | generationConfig.maxOutputTokens | Max output tokens |
| gen_ai.request.stop_sequences | generationConfig.stopSequences | Stop sequences |
| gen_ai.request.candidates_per_prompt | generationConfig.candidateCount | Number of candidates |
| gen_ai.request.presence_penalty | generationConfig.presencePenalty | Presence penalty |
| gen_ai.request.frequency_penalty | generationConfig.frequencyPenalty | Frequency penalty |
| gen_ai.request.tool_names | request.tools[].functionDeclarations[].name | Tool names |
| gen_ai.provider.name | Hardcoded | Always "gcp.vertex_ai" |
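To make the table concrete, here is a minimal sketch of how a generation config could be turned into these attributes. It is an illustration only, not the package's `mapVertexAIRequest`: the names `GenerationConfigSketch` and `sketchRequestAttributes` are hypothetical, and omitting unset fields is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch: map a generationConfig object to gen_ai.request.* attributes.
// The attribute names come from the table above; everything else is illustrative.
interface GenerationConfigSketch {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
  stopSequences?: string[];
  candidateCount?: number;
  presencePenalty?: number;
  frequencyPenalty?: number;
}

type AttributeValue = string | number | string[];

function sketchRequestAttributes(
  modelName: string,
  config: GenerationConfigSketch = {},
): Record<string, AttributeValue> {
  const attrs: Record<string, AttributeValue> = {
    "gen_ai.provider.name": "gcp.vertex_ai",
    "gen_ai.request.model": modelName,
  };
  // Pairs of [attribute name, value]; undefined values are skipped (assumption).
  const optional: Array<[string, AttributeValue | undefined]> = [
    ["gen_ai.request.temperature", config.temperature],
    ["gen_ai.request.top_p", config.topP],
    ["gen_ai.request.top_k", config.topK],
    ["gen_ai.request.max_tokens", config.maxOutputTokens],
    ["gen_ai.request.stop_sequences", config.stopSequences],
    ["gen_ai.request.candidates_per_prompt", config.candidateCount],
    ["gen_ai.request.presence_penalty", config.presencePenalty],
    ["gen_ai.request.frequency_penalty", config.frequencyPenalty],
  ];
  for (const [key, value] of optional) {
    if (value !== undefined) attrs[key] = value;
  }
  return attrs;
}
```

For example, calling `sketchRequestAttributes("gemini-pro", { temperature: 0.2 })` yields the provider name, the model, and the temperature, with no keys for the unset fields.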
GCP Metadata (when configured)
| Attribute | Source | Description |
|-----------|--------|-------------|
| gcp.project_id | config.projectId | GCP project identifier |
| gcp.location | config.location | GCP region |
Response Attributes
| Attribute | Source | Description |
|-----------|--------|-------------|
| gen_ai.response.model | response.modelVersion | Model version used |
| gen_ai.response.finish_reasons | candidates[].finishReason (mapped) | Mapped to OTel finish reasons |
| gen_ai.usage.input_tokens | usageMetadata.promptTokenCount | Input token count |
| gen_ai.usage.output_tokens | usageMetadata.candidatesTokenCount | Output token count |
Finish Reason Mapping
Vertex AI's finishReason values are mapped to OTel:
| Vertex AI | OTel |
|-----------|------|
| STOP | stop |
| MAX_TOKENS | length |
| SAFETY | content_filter |
| RECITATION | content_filter |
| OTHER | unknown |
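The mapping table translates directly into a lookup. The sketch below is illustrative only: `sketchMapFinishReason` is a hypothetical name, and falling back to "unknown" for missing or unlisted values is an assumption, since the table does not say how such values are handled.

```typescript
// Hypothetical sketch of the finish reason lookup from the table above.
const FINISH_REASON_MAP = new Map<string, string>([
  ["STOP", "stop"],
  ["MAX_TOKENS", "length"],
  ["SAFETY", "content_filter"],
  ["RECITATION", "content_filter"],
  ["OTHER", "unknown"],
]);

function sketchMapFinishReason(reason: string | undefined): string {
  if (reason === undefined) return "unknown"; // assumption: missing reason is "unknown"
  // Assumption: values not listed in the table also fall back to "unknown".
  return FINISH_REASON_MAP.get(reason) ?? "unknown";
}
```

A `Map` is used instead of a plain object so that inherited keys such as "constructor" can never match by accident.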
Cost Attributes (when trackCosts: true)
| Attribute | Description |
|-----------|-------------|
| llm.cost.total | Total cost in USD |
| llm.cost.input | Input token cost |
| llm.cost.output | Output token cost |
| llm.cost.currency | Currency code (always "USD") |
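As a rough illustration of how these values relate, the sketch below derives them from token counts and per-token prices. The `PricingSketch` shape (USD per one million tokens) and the function name are assumptions; the package's real `PricingInfo` type and calculation may differ, and the prices in the usage note are placeholders, not real Gemini pricing.

```typescript
// Hypothetical pricing shape: USD per one million tokens.
// The package's actual PricingInfo type may look different.
interface PricingSketch {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface CostAttributesSketch {
  "llm.cost.input": number;
  "llm.cost.output": number;
  "llm.cost.total": number;
  "llm.cost.currency": string;
}

function sketchCostAttributes(
  inputTokens: number,
  outputTokens: number,
  pricing: PricingSketch,
): CostAttributesSketch {
  // Cost per side = (tokens / 1,000,000) * price per million tokens.
  const input = (inputTokens / 1_000_000) * pricing.inputPerMillion;
  const output = (outputTokens / 1_000_000) * pricing.outputPerMillion;
  return {
    "llm.cost.input": input,
    "llm.cost.output": output,
    "llm.cost.total": input + output,
    "llm.cost.currency": "USD",
  };
}
```

With placeholder prices of 1 USD per million input tokens and 2 USD per million output tokens, 2,000,000 input tokens and 500,000 output tokens give an input cost of 2, an output cost of 1, and a total of 3.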
Span Events
| Event | When |
|-------|------|
| gen_ai.system.message | System instruction in the request |
| gen_ai.user.message | User content parts in the request |
| gen_ai.assistant.message | Assistant content parts |
| gen_ai.choice | Each candidate (with index, finish_reason, text content) |
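The request-side events in this table can be thought of as a function of the system instruction and each content entry's role. The sketch below is a hypothetical illustration, not the package's code: `ContentSketch`, `sketchEventNameForRole`, and `sketchRequestEvents` are invented names, and treating both "model" and "assistant" as assistant content (with everything else treated as user content) is an assumption.

```typescript
// Hypothetical sketch: choose a span event name for each request message.
type EventName =
  | "gen_ai.system.message"
  | "gen_ai.user.message"
  | "gen_ai.assistant.message";

interface ContentSketch {
  role?: string;
  parts: Array<{ text?: string }>;
}

interface EventSketch {
  name: EventName;
  content: string;
}

function sketchEventNameForRole(role: string | undefined): EventName {
  // Assumption: "model" and "assistant" both count as assistant content;
  // any other role (or none) is treated as user content.
  return role === "model" || role === "assistant"
    ? "gen_ai.assistant.message"
    : "gen_ai.user.message";
}

function sketchRequestEvents(
  contents: ContentSketch[],
  systemInstruction?: string,
): EventSketch[] {
  const events: EventSketch[] = [];
  if (systemInstruction !== undefined) {
    events.push({ name: "gen_ai.system.message", content: systemInstruction });
  }
  for (const entry of contents) {
    // Concatenate text parts; non-text parts are ignored in this sketch.
    const text = entry.parts.map((part) => part.text ?? "").join("");
    events.push({ name: sketchEventNameForRole(entry.role), content: text });
  }
  return events;
}
```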
API Reference
VertexAIInstrumentation (class)
Constructor
new VertexAIInstrumentation({
captureRequestHeaders?: boolean;
captureResponseHeaders?: boolean;
trackCosts?: boolean;
pricing?: Record<string, PricingInfo>;
projectId?: string;
location?: string;
onStart?: (span: Span, request: GenerateContentRequest) => void;
onEnd?: (span: Span, response: GenerateContentResponse) => void;
})
Methods
| Method | Description |
|--------|-------------|
| instrument(model) | Wrap model.generateContent() with instrumentation |
| uninstrument(model) | Restore the original generateContent() method |
VertexAITokenCounter (class)
Character-based token estimation for Vertex AI models:
const counter = new VertexAITokenCounter();
counter.countTokens("Hello, world!", "gemini-pro");
counter.countContentsTokens(contents, "gemini-pro");
counter.clearCache();
Attribute Mappers
import { mapVertexAIRequest, mapVertexAIResponse, mapVertexAIError } from "@reaatech/otel-genai-semconv-vertexai";
const requestAttrs = mapVertexAIRequest(request, "gemini-pro");
const responseAttrs = mapVertexAIResponse(response);
const errorAttrs = mapVertexAIError(apiError);
Configuration
GCP Project and Location
new VertexAIInstrumentation({
projectId: "my-gcp-project",
location: "us-central1",
}).instrument(model);
Lifecycle Hooks
new VertexAIInstrumentation({
onStart: (span, request) => {
span.setAttribute("vertexai.candidate_count", request.generationConfig?.candidateCount ?? 1);
},
onEnd: (span, response) => {
span.setAttribute("vertexai.model_version", response.modelVersion ?? "unknown");
},
}).instrument(model);
Usage Patterns
String Input (Auto-Normalized)
// The instrumentation automatically normalizes string input:
const response = await model.generateContent("What is OpenTelemetry?");
// Internally converted to { contents: [{ role: "user", parts: [{ text: "..." }] }] }
Multi-Turn Conversation
const response = await model.generateContent({
contents: [
{ role: "user", parts: [{ text: "What is OpenTelemetry?" }] },
{ role: "model", parts: [{ text: "OpenTelemetry is..." }] }, // Vertex AI uses the "model" role for assistant turns
{ role: "user", parts: [{ text: "Tell me more about tracing." }] },
],
});
// Each message emits the appropriate gen_ai.*.message event
Related Packages
- @reaatech/otel-genai-semconv-core: Core types and constants
- @reaatech/otel-genai-semconv-instrumentation: Instrumentation framework
- @reaatech/otel-genai-semconv-utils: Cost calculator and token counter
