@reaatech/otel-genai-semconv-vertexai

v0.1.0

Published

a month ago

Vertex AI SDK instrumentation with OTel GenAI semantic conventions

reaatech

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Transparent instrumentation for the Google Generative Language (Vertex AI) SDK. Wraps model.generateContent() to emit OpenTelemetry GenAI semantic convention spans with GCP project/location metadata, generation config attributes, candidate events, and cost tracking for Gemini models.

Installation

npm install @reaatech/otel-genai-semconv-vertexai
# or
pnpm add @reaatech/otel-genai-semconv-vertexai

Feature Overview

- Zero-config instrumentation: call instrument(model) once and every subsequent generateContent() call is traced
- GCP metadata: automatically attaches gcp.project_id and gcp.location when configured
- Generation config mapping: temperature, topP, topK, maxOutputTokens, stopSequences, and more are mapped to OTel attributes
- Candidate events: each response candidate emits a gen_ai.choice event with text content and finish reason
- System instruction tracking: system instructions are captured as gen_ai.system.message events
- Double-instrumentation guard: calling instrument() twice on the same model is a safe no-op
- Lifecycle hooks: onStart and onEnd callbacks for custom span attributes
- Safe uninstrument: restores the original generateContent() method
- Dual ESM/CJS output: works with both import and require
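
To make the double-instrumentation guard and the safe uninstrument behavior concrete, here is a minimal stand-alone sketch of the general wrap-and-restore technique. It is not the package's actual implementation: `ModelLike`, `sketchInstrument`, `sketchUninstrument`, and the `onCall` callback are hypothetical names used only for illustration.

```typescript
// Minimal stand-in for an SDK model; the real SDK type is richer.
interface ModelLike {
  generateContent(request: unknown): Promise<unknown>;
}

type GenerateFn = ModelLike["generateContent"];

// Originals are keyed by model instance so they can be restored later.
const originals = new WeakMap<ModelLike, GenerateFn>();

function sketchInstrument(model: ModelLike, onCall: () => void): void {
  // Guard: a model that is already wrapped is left untouched.
  if (originals.has(model)) return;
  const original = model.generateContent;
  originals.set(model, original);
  model.generateContent = async function (this: ModelLike, request: unknown) {
    onCall(); // a real implementation would start a span here
    return original.call(this, request);
  };
}

function sketchUninstrument(model: ModelLike): void {
  const original = originals.get(model);
  if (!original) return; // never instrumented: nothing to restore
  model.generateContent = original;
  originals.delete(model);
}
```

Keying the saved originals by instance in a WeakMap means the guard and the restore step need no marker property on the model itself, and the entry is released when the model is garbage collected.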

Quick Start

import { VertexAIInstrumentation } from "@reaatech/otel-genai-semconv-vertexai";

const instrumentation = new VertexAIInstrumentation({
  trackCosts: true,
  projectId: "my-gcp-project",
  location: "us-central1",
});

// `model` is a generative model instance you have already created with the Vertex AI SDK.
instrumentation.instrument(model);

const response = await model.generateContent({
  contents: [{ role: "user", parts: [{ text: "What is OpenTelemetry?" }] }],
});
// Each call now emits OTel spans with gen_ai.* attributes

Captured Attributes

Request Attributes

| Attribute | Source | Description |
|-----------|--------|-------------|
| gen_ai.request.model | Model name | Model identifier |
| gen_ai.request.temperature | generationConfig.temperature | Sampling temperature |
| gen_ai.request.top_p | generationConfig.topP | Top-p sampling |
| gen_ai.request.top_k | generationConfig.topK | Top-k sampling |
| gen_ai.request.max_tokens | generationConfig.maxOutputTokens | Max output tokens |
| gen_ai.request.stop_sequences | generationConfig.stopSequences | Stop sequences |
| gen_ai.request.candidates_per_prompt | generationConfig.candidateCount | Number of candidates |
| gen_ai.request.presence_penalty | generationConfig.presencePenalty | Presence penalty |
| gen_ai.request.frequency_penalty | generationConfig.frequencyPenalty | Frequency penalty |
| gen_ai.request.tool_names | request.tools[].functionDeclarations[].name | Tool names |
| gen_ai.provider.name | hardcoded | "gcp.vertex_ai" |
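
The table can be read as a field-by-field translation from generationConfig to span attributes. The sketch below illustrates that translation for a subset of the rows. It is a stand-alone approximation, not the package's mapVertexAIRequest: `GenerationConfigLike` and `sketchRequestAttributes` are hypothetical names, and skipping unset fields is an assumption.

```typescript
// Hypothetical, trimmed-down view of the SDK's generation config.
interface GenerationConfigLike {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
  stopSequences?: string[];
  candidateCount?: number;
}

type SketchAttributeValue = string | number | string[];

// Config field -> attribute name, taken from the table above.
const REQUEST_FIELD_MAP: Array<[keyof GenerationConfigLike, string]> = [
  ["temperature", "gen_ai.request.temperature"],
  ["topP", "gen_ai.request.top_p"],
  ["topK", "gen_ai.request.top_k"],
  ["maxOutputTokens", "gen_ai.request.max_tokens"],
  ["stopSequences", "gen_ai.request.stop_sequences"],
  ["candidateCount", "gen_ai.request.candidates_per_prompt"],
];

function sketchRequestAttributes(
  modelName: string,
  config: GenerationConfigLike = {},
): Record<string, SketchAttributeValue> {
  const attrs: Record<string, SketchAttributeValue> = {
    "gen_ai.provider.name": "gcp.vertex_ai",
    "gen_ai.request.model": modelName,
  };
  for (const [field, attribute] of REQUEST_FIELD_MAP) {
    const value = config[field];
    // Assumption: fields the caller did not set produce no attribute.
    if (value !== undefined) attrs[attribute] = value;
  }
  return attrs;
}

// Example: only temperature and max_tokens appear alongside the two fixed attributes.
const exampleRequestAttrs = sketchRequestAttributes("gemini-pro", {
  temperature: 0.2,
  maxOutputTokens: 256,
});
```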

GCP Metadata (when configured)

| Attribute | Source | Description |
|-----------|--------|-------------|
| gcp.project_id | config.projectId | GCP project identifier |
| gcp.location | config.location | GCP region |

Response Attributes

| Attribute | Source | Description |
|-----------|--------|-------------|
| gen_ai.response.model | response.modelVersion | Model version used |
| gen_ai.response.finish_reasons | candidates[].finishReason (mapped) | Mapped to OTel finish reasons |
| gen_ai.usage.input_tokens | usageMetadata.promptTokenCount | Input token count |
| gen_ai.usage.output_tokens | usageMetadata.candidatesTokenCount | Output token count |

Finish Reason Mapping

Vertex AI's finishReason values are mapped to OTel:

| Vertex AI | OTel |
|-----------|------|
| STOP | stop |
| MAX_TOKENS | length |
| SAFETY | content_filter |
| RECITATION | content_filter |
| OTHER | unknown |
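
Expressed as code, the mapping is a plain lookup. The following is a minimal sketch with a hypothetical function name, `sketchMapFinishReason`. The fallback to "unknown" for values that are missing or not listed in the table is an assumption, not documented behavior.

```typescript
// Vertex AI finishReason -> OTel finish reason, as listed in the table above.
const FINISH_REASON_TABLE = new Map<string, string>([
  ["STOP", "stop"],
  ["MAX_TOKENS", "length"],
  ["SAFETY", "content_filter"],
  ["RECITATION", "content_filter"],
  ["OTHER", "unknown"],
]);

function sketchMapFinishReason(reason: string | undefined): string {
  // Assumption: missing or unlisted values fall back to "unknown".
  if (reason === undefined) return "unknown";
  return FINISH_REASON_TABLE.get(reason) ?? "unknown";
}

// Example: one mapped value per candidate.
const exampleFinishReasons = ["STOP", "MAX_TOKENS", "SAFETY"].map(sketchMapFinishReason);
```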

Cost Attributes (when `trackCosts: true`)

| Attribute | Description |
|-----------|-------------|
| llm.cost.total | Total cost in USD |
| llm.cost.input | Input token cost |
| llm.cost.output | Output token cost |
| llm.cost.currency | Currency code (always "USD") |
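
As a rough illustration of how these four attributes relate to each other, the sketch below derives them from token counts and per-token prices. The shape of the package's `PricingInfo` type is not shown in this README, so `SketchPricing` and its per-million-token fields are hypothetical, as are the prices in the example.

```typescript
// Hypothetical pricing shape; the package's PricingInfo may differ.
interface SketchPricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

function sketchCostAttributes(
  inputTokens: number,
  outputTokens: number,
  pricing: SketchPricing,
): Record<string, number | string> {
  const inputCost = (inputTokens / 1e6) * pricing.inputPerMillionUsd;
  const outputCost = (outputTokens / 1e6) * pricing.outputPerMillionUsd;
  return {
    "llm.cost.input": inputCost,
    "llm.cost.output": outputCost,
    "llm.cost.total": inputCost + outputCost,
    "llm.cost.currency": "USD",
  };
}

// Example with made-up prices: 1,000,000 input tokens at $0.50 per million
// plus 500,000 output tokens at $1.50 per million gives a total of 1.25.
const exampleCost = sketchCostAttributes(1_000_000, 500_000, {
  inputPerMillionUsd: 0.5,
  outputPerMillionUsd: 1.5,
});
```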

Span Events

| Event | When |
|-------|------|
| gen_ai.system.message | System instruction in the request |
| gen_ai.user.message | User content parts in the request |
| gen_ai.assistant.message | Assistant content parts |
| gen_ai.choice | Each candidate (with index, finish_reason, text content) |
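
To show what "one gen_ai.choice event per candidate" could look like in data terms, here is a stand-alone sketch that turns a candidates array into event records. The types, the function name `sketchChoiceEvents`, and the exact attribute keys (`index`, `finish_reason`, `content`) are assumptions for illustration. The package's real event payload may be shaped differently, and this sketch passes finishReason through unmapped for brevity.

```typescript
// Hypothetical, trimmed-down candidate shape.
interface CandidateLike {
  index?: number;
  finishReason?: string;
  content?: { parts?: Array<{ text?: string }> };
}

interface SketchChoiceEvent {
  name: string;
  attributes: { index: number; finish_reason: string; content: string };
}

function sketchChoiceEvents(candidates: CandidateLike[]): SketchChoiceEvent[] {
  return candidates.map((candidate, position) => ({
    name: "gen_ai.choice",
    attributes: {
      // Assumption: fall back to array position when the candidate has no index.
      index: candidate.index ?? position,
      finish_reason: candidate.finishReason ?? "unknown",
      // Concatenate the text of all parts; non-text parts contribute nothing.
      content: (candidate.content?.parts ?? []).map((part) => part.text ?? "").join(""),
    },
  }));
}

// Example: two candidates produce two events.
const exampleChoiceEvents = sketchChoiceEvents([
  { finishReason: "STOP", content: { parts: [{ text: "Hello" }, { text: ", world" }] } },
  { finishReason: "MAX_TOKENS", content: { parts: [{ text: "Hi" }] } },
]);
```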

API Reference

`VertexAIInstrumentation` (class)

Constructor

new VertexAIInstrumentation({
  captureRequestHeaders?: boolean;
  captureResponseHeaders?: boolean;
  trackCosts?: boolean;
  pricing?: Record<string, PricingInfo>;
  projectId?: string;
  location?: string;
  onStart?: (span: Span, request: GenerateContentRequest) => void;
  onEnd?: (span: Span, response: GenerateContentResponse) => void;
})

Methods

| Method | Description |
|--------|-------------|
| instrument(model) | Wrap model.generateContent() with instrumentation |
| uninstrument(model) | Restore the original generateContent() method |

`VertexAITokenCounter` (class)

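Character-based token estimation for Vertex AI models. Counts are approximations derived from text length, not exact tokenizer output:

import { VertexAITokenCounter } from "@reaatech/otel-genai-semconv-vertexai";

const counter = new VertexAITokenCounter();
counter.countTokens("Hello, world!", "gemini-pro");  // estimate for a single string
counter.countContentsTokens(contents, "gemini-pro"); // estimate for a contents array
counter.clearCache();                                // reset the counter's internal cache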
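
For intuition about what character-based estimation means, the sketch below divides text length by a fixed characters-per-token ratio and caches the result. It is not the package's algorithm: `SketchTokenCounter` is a hypothetical class, and the ratio of 4 characters per token is a common rule of thumb assumed here, not a value taken from the package.

```typescript
class SketchTokenCounter {
  private readonly cache = new Map<string, number>();
  private readonly charsPerToken: number;

  constructor(charsPerToken: number = 4) {
    // Assumption: roughly 4 characters per token.
    this.charsPerToken = charsPerToken;
  }

  countTokens(text: string): number {
    const cached = this.cache.get(text);
    if (cached !== undefined) return cached;
    const estimate = Math.ceil(text.length / this.charsPerToken);
    this.cache.set(text, estimate);
    return estimate;
  }

  clearCache(): void {
    this.cache.clear();
  }
}

// Example: "Hello, world!" has 13 characters, so the estimate is ceil(13 / 4) = 4.
const sketchCounter = new SketchTokenCounter();
const exampleEstimate = sketchCounter.countTokens("Hello, world!");
```

An estimate like this is cheap enough to compute on every call, but it can drift noticeably from the provider's reported usage, so the usageMetadata counts remain the more reliable figure when a response is available.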

Attribute Mappers

import { mapVertexAIRequest, mapVertexAIResponse, mapVertexAIError } from "@reaatech/otel-genai-semconv-vertexai";

const requestAttrs = mapVertexAIRequest(request, "gemini-pro");
const responseAttrs = mapVertexAIResponse(response);
const errorAttrs = mapVertexAIError(apiError);

Configuration

GCP Project and Location

new VertexAIInstrumentation({
  projectId: "my-gcp-project",
  location: "us-central1",
}).instrument(model);

Lifecycle Hooks

new VertexAIInstrumentation({
  onStart: (span, request) => {
    span.setAttribute("vertexai.candidate_count", request.generationConfig?.candidateCount ?? 1);
  },
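  onEnd: (span, response) => {
    // modelVersion may be absent on some responses; skip the attribute in that case.
    if (response.modelVersion) {
      span.setAttribute("vertexai.model_version", response.modelVersion);
    }
  },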
}).instrument(model);

Usage Patterns

String Input (Auto-Normalized)

// The instrumentation automatically normalizes string input:
const response = await model.generateContent("What is OpenTelemetry?");
// Internally converted to { contents: [{ role: "user", parts: [{ text: "..." }] }] }
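
The normalization described in the comment above can be sketched as a small pure function. This is an illustration under assumptions, not the package's code: `RequestLike` is a trimmed-down stand-in for the SDK's request type and `sketchNormalizeRequest` is a hypothetical name.

```typescript
interface PartLike {
  text: string;
}

interface ContentLike {
  role: string;
  parts: PartLike[];
}

interface RequestLike {
  contents: ContentLike[];
}

function sketchNormalizeRequest(input: string | RequestLike): RequestLike {
  // A bare string becomes a single user turn with one text part.
  if (typeof input === "string") {
    return { contents: [{ role: "user", parts: [{ text: input }] }] };
  }
  // Structured requests pass through unchanged.
  return input;
}

// Example: a bare prompt string wrapped into the structured form.
const exampleNormalized = sketchNormalizeRequest("What is OpenTelemetry?");
```

Normalizing up front lets the rest of the instrumentation work against a single request shape when emitting message events.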

Multi-Turn Conversation

const response = await model.generateContent({
  contents: [
    { role: "user", parts: [{ text: "What is OpenTelemetry?" }] },
    // Vertex AI uses the role "model" (not "assistant") for model turns.
    { role: "model", parts: [{ text: "OpenTelemetry is..." }] },
    { role: "user", parts: [{ text: "Tell me more about tracing." }] },
  ],
});
// Each message emits the appropriate gen_ai.*.message event

Related Packages

- @reaatech/otel-genai-semconv-core: Core types and constants
- @reaatech/otel-genai-semconv-instrumentation: Instrumentation framework
- @reaatech/otel-genai-semconv-utils: Cost calculator and token counter

License

MIT
