# @drift_rail/sdk

v2.2.0
AI Safety & Observability Platform - Monitor, classify, and audit every LLM interaction.
## Installation

```bash
npm install @drift_rail/sdk
# or
yarn add @drift_rail/sdk
# or
pnpm add @drift_rail/sdk
```

## Quick Start
```typescript
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({
  apiKey: 'dr_live_...',
  appId: 'my-app'
});

// Log an LLM interaction
const response = await client.ingest({
  model: 'gpt-4o',
  provider: 'openai',
  input: { prompt: 'What is the capital of France?' },
  output: { text: 'The capital of France is Paris.' }
});

console.log(`Event ID: ${response.event_id}`);
```

## Enterprise Features
Use `DriftRailEnterprise` for advanced features:

```typescript
import { DriftRailEnterprise } from '@drift_rail/sdk';

const enterprise = new DriftRailEnterprise({
  apiKey: 'dr_live_...',
  appId: 'my-app'
});
```

### Distributed Tracing
Track complex LLM workflows with traces and spans:
```typescript
// Start a trace for a user request
const trace = await enterprise.startTrace({
  app_id: 'my-app',
  name: 'chat-completion',
  user_id: 'user-123',
  session_id: 'session-abc',
  metadata: { feature: 'customer-support' }
});

// Create spans for each step
const retrievalSpan = await enterprise.startSpan({
  trace_id: trace.trace_id,
  name: 'vector-search',
  span_type: 'retrieval',
  metadata: { index: 'knowledge-base' }
});

// ... do retrieval work ...

await enterprise.endSpan(retrievalSpan.span_id, {
  status: 'completed',
  output: { documents: 5 }
});

// Create an LLM span
const llmSpan = await enterprise.startSpan({
  trace_id: trace.trace_id,
  name: 'gpt-4-completion',
  span_type: 'llm',
  model: 'gpt-4o',
  provider: 'openai',
  input: { prompt: '...' }
});

// ... call the LLM ...

await enterprise.endSpan(llmSpan.span_id, {
  status: 'completed',
  output: { response: '...' },
  tokens_in: 150,
  tokens_out: 200,
  cost_usd: 0.0045
});

// End the trace
await enterprise.endTrace(trace.trace_id, 'completed');

// Get the full trace with all spans
const { trace: fullTrace, spans } = await enterprise.getTrace(trace.trace_id);
```

### Prompt Management
Version, deploy, and manage prompts across environments:
```typescript
// Create a prompt
const prompt = await enterprise.createPrompt({
  name: 'customer-support-v1',
  description: 'Main customer support prompt',
  content: 'You are a helpful customer support agent for {{company}}...',
  variables: ['company', 'product'],
  tags: ['support', 'production']
});

// Create a new version
const version = await enterprise.createPromptVersion(prompt.prompt_id, {
  content: 'You are an expert customer support agent for {{company}}...',
  variables: ['company', 'product', 'tone'],
  commit_message: 'Added tone variable for customization'
});

// Deploy to staging
await enterprise.deployPromptVersion({
  version_id: version.version_id,
  environment: 'staging'
});

// Get the deployed prompt for an environment
const { prompt: deployed, version: deployedVersion } = await enterprise.getDeployedPrompt(
  prompt.prompt_id,
  'prod'
);

// Roll back if needed
await enterprise.rollbackPrompt(prompt.prompt_id, 'prod', 'previous-version-id');
```

### Evaluation Framework
Test and score your LLM outputs systematically:
```typescript
// Create a dataset
const dataset = await enterprise.createDataset({
  name: 'qa-golden-set',
  description: 'Golden test cases for QA',
  schema_type: 'qa',
  tags: ['qa', 'regression']
});

// Add test items
await enterprise.addDatasetItems(dataset.dataset_id, [
  {
    input: { question: 'What is 2+2?' },
    expected_output: { answer: '4' },
    metadata: { difficulty: 'easy' }
  },
  {
    input: { question: 'Explain quantum computing' },
    expected_output: { answer: 'Quantum computing uses...' },
    metadata: { difficulty: 'hard' }
  }
]);

// Run an evaluation
const run = await enterprise.createEvalRun({
  dataset_id: dataset.dataset_id,
  name: 'gpt-4o-eval-run',
  model: 'gpt-4o',
  evaluators: [
    { name: 'exact_match', type: 'exact_match' },
    { name: 'semantic_similarity', type: 'llm_judge', config: { model: 'gpt-4o-mini' } }
  ]
});

// Submit results as you evaluate
await enterprise.submitEvalResult(run.run_id, {
  item_id: 'item-123',
  output: { answer: '4' },
  scores: {
    exact_match: { score: 1.0, passed: true },
    semantic_similarity: { score: 0.95, reason: 'Correct answer', passed: true }
  },
  latency_ms: 450
});

// Get the full results
const { run: completedRun, results } = await enterprise.getEvalRun(run.run_id);
console.log(`Pass rate: ${completedRun.passed_items}/${completedRun.total_items}`);
```

### Semantic Caching
Cache LLM responses based on semantic similarity:
```typescript
// Configure cache settings
await enterprise.updateCacheSettings({
  is_enabled: true,
  similarity_threshold: 0.92,
  ttl_seconds: 3600,
  max_entries: 10000,
  embedding_model: 'text-embedding-3-small'
});

// Check the cache before calling the LLM
const cacheResult = await enterprise.cacheLookup({
  input: 'What is the capital of France?',
  model: 'gpt-4o'
});

if (cacheResult.hit) {
  console.log('Cache hit!', cacheResult.output);
  return cacheResult.output;
}

// Call the LLM and store the result
const llmResponse = await callLLM(prompt);
await enterprise.cacheStore({
  input: 'What is the capital of France?',
  output: llmResponse,
  model: 'gpt-4o',
  provider: 'openai',
  metadata: { tokens: 50 }
});

// Get cache stats
const stats = await enterprise.getCacheStats();
console.log(`Hit rate: ${stats.hit_rate}%, Tokens saved: ${stats.total_tokens_saved}`);

// Clear the cache if needed
await enterprise.clearCache({ model: 'gpt-4o' });
```

### Agent Simulation
Test AI agents with simulated user interactions:
```typescript
// Create a simulation scenario
const simulation = await enterprise.createSimulation({
  name: 'refund-request-flow',
  scenario: 'User wants to request a refund for a defective product',
  description: 'Tests the refund handling capability',
  persona: {
    name: 'Frustrated Customer',
    description: 'A customer who received a broken item',
    traits: ['impatient', 'direct'],
    goals: ['get refund', 'express frustration']
  },
  success_criteria: [
    { name: 'refund_offered', description: 'Agent offers refund option' },
    { name: 'empathy_shown', description: 'Agent acknowledges frustration' },
    { name: 'resolution_reached', description: 'Conversation ends with resolution' }
  ],
  max_turns: 10,
  model: 'gpt-4o',
  tags: ['refund', 'customer-service']
});

// Run the simulation
const run = await enterprise.runSimulation(simulation.simulation_id, {
  max_turns: 8,
  model: 'gpt-4o'
});

// Add turns as the simulation progresses
await enterprise.addSimulationTurn(run.run_id, {
  turn_number: 1,
  role: 'user',
  content: 'I want a refund! This product is broken!',
  latency_ms: 0
});

await enterprise.addSimulationTurn(run.run_id, {
  turn_number: 2,
  role: 'assistant',
  content: 'I understand your frustration...',
  latency_ms: 450,
  tokens_in: 50,
  tokens_out: 80
});

// Complete the run with results
await enterprise.completeSimulationRun(run.run_id, {
  success: true,
  criteria_results: [
    { name: 'refund_offered', passed: true },
    { name: 'empathy_shown', passed: true, reason: 'Agent acknowledged frustration' },
    { name: 'resolution_reached', passed: true }
  ],
  summary: 'Agent successfully handled refund request with empathy'
});

// Get simulation stats
const stats = await enterprise.getSimulationStats();
console.log(`Success rate: ${stats.success_rate}%`);
```

## Inline Guardrails
Block dangerous outputs BEFORE they reach users:
```typescript
import { DriftRail } from '@drift_rail/sdk';

const client = new DriftRail({ apiKey: '...', appId: 'my-app' });

// Get a response from your LLM
const llmResponse = await yourLLMCall(userPrompt);

// Guard it before returning it to the user
const result = await client.guard({
  output: llmResponse,
  input: userPrompt,
  mode: 'strict' // or 'permissive'
});

if (result.allowed) {
  // Safe to return (may be redacted if PII was found)
  return result.output;
} else {
  // Content was blocked
  console.log('Blocked:', result.triggered.map(t => t.reason));
  return "Sorry, I can't help with that.";
}
```

### Guard Modes
- `strict` (default): blocks on medium+ risk (PII, moderate toxicity, prompt injection)
- `permissive`: blocks only on high risk (severe toxicity, high-risk injection)
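The two modes boil down to a blocking threshold. As an illustrative sketch only (this is not the SDK's internal logic, and `wouldBlock` is a hypothetical helper), the decision table might look like:

```typescript
type GuardMode = 'strict' | 'permissive';
type Risk = 'low' | 'medium' | 'high';

// strict blocks at medium risk and up; permissive only at high risk.
function wouldBlock(mode: GuardMode, risk: Risk): boolean {
  const threshold: Risk = mode === 'strict' ? 'medium' : 'high';
  const order: Risk[] = ['low', 'medium', 'high'];
  return order.indexOf(risk) >= order.indexOf(threshold);
}
```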
### Fail-Open vs Fail-Closed

```typescript
import { DriftRail, GuardBlockedError } from '@drift_rail/sdk';

// Fail-open (default): if DriftRail is unavailable, content is allowed through
const client = new DriftRail({ apiKey: '...', appId: '...', guardMode: 'fail_open' });

// Fail-closed: if DriftRail is unavailable, guard() throws
const strictClient = new DriftRail({ apiKey: '...', appId: '...', guardMode: 'fail_closed' });

try {
  const result = await strictClient.guard({ output: llmResponse });
} catch (e) {
  if (e instanceof GuardBlockedError) {
    console.log('Blocked:', e.result.triggered);
  }
}
```

### Guard Response
```typescript
const result = await client.guard({ output: '...' });

result.allowed        // boolean - true if content can be shown to the user
result.action         // 'allow' | 'block' | 'redact' | 'warn'
result.output         // original or redacted content
result.triggered      // array of triggered guardrails/classifications
result.classification // AI classification details (risk_score, pii, toxicity, etc.)
result.latency_ms     // processing time
result.fallback       // true if classification failed (fail-open)
```

## Fire-and-Forget (Non-blocking)
```typescript
// Won't block your code
client.ingestAsync({
  model: 'gpt-4o',
  provider: 'openai',
  input: { prompt: '...' },
  output: { text: '...' }
});
```

## Streaming / SSE Integration
When logging streaming LLM responses, always log before closing the stream:
```typescript
// ✅ Correct - log before close
async function handleStream(controller: ReadableStreamDefaultController) {
  let fullResponse = '';
  const startTime = Date.now();

  for await (const chunk of llmStream) {
    controller.enqueue(chunk);
    fullResponse += chunk;
  }

  // Log BEFORE closing - use await in serverless!
  await client.ingest({
    model: 'gpt-4o',
    provider: 'openai',
    input: { prompt: userMessage },
    output: { text: fullResponse },
    metadata: { latencyMs: Date.now() - startTime }
  });

  controller.close(); // Close AFTER logging
}
```

```typescript
// ❌ Wrong - logging after close may not execute
controller.close();
await client.ingest({...}); // Too late!
```

**Serverless warning:** always use `await client.ingest()` in Vercel/Netlify/Lambda. `ingestAsync()` causes race conditions where the function terminates before the request completes.
See Serverless Deployment Guide for full examples with Next.js, Netlify Functions, AWS Lambda, and Cloudflare Workers.
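The log-before-close pattern generalizes to a small helper. Everything below is a sketch: `emit`, `logEvent`, and `close` are placeholders for your own stream controller and `client.ingest()` call, not SDK APIs.

```typescript
// Collect a streamed response, forward each chunk to the consumer, and
// await a logging callback BEFORE signalling completion, so serverless
// runtimes don't terminate the function with the log request in flight.
async function streamWithLogging(
  chunks: AsyncIterable<string>,
  emit: (chunk: string) => void,
  logEvent: (fullText: string) => Promise<void>,
  close: () => void
): Promise<void> {
  let full = '';
  for await (const chunk of chunks) {
    emit(chunk);
    full += chunk;
  }
  await logEvent(full); // awaited before close
  close();
}
```

In practice `logEvent` would be something like `(text) => client.ingest({ model, provider, input, output: { text } })`.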
## With Metadata

```typescript
const start = Date.now();
// ... your LLM call ...
const latencyMs = Date.now() - start;

await client.ingest({
  model: 'gpt-4o',
  provider: 'openai',
  input: { prompt: '...' },
  output: { text: '...' },
  metadata: {
    latencyMs,
    tokensIn: 50,
    tokensOut: 150,
    temperature: 0.7
  }
});
```

## With RAG Sources
```typescript
await client.ingest({
  model: 'gpt-4o',
  provider: 'openai',
  input: {
    prompt: 'What does our refund policy say?',
    retrievedSources: [
      { id: 'doc-123', content: 'Refunds are available within 30 days...' },
      { id: 'doc-456', content: 'Contact support for refund requests...' }
    ]
  },
  output: { text: 'According to our policy, refunds are available within 30 days...' }
});
```

## OpenAI Chat Completions Helper
```typescript
// Convenience method for chat completions
await client.logChatCompletion({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  response: 'Hi there! How can I help you today?',
  latencyMs: 420,
  tokensIn: 25,
  tokensOut: 12
});
```

## Fail-Open Architecture
By default, the SDK fails open - errors are captured but won't crash your app:
```typescript
const client = new DriftRail({
  apiKey: '...',
  appId: '...',
  failOpen: true // Default
});

// Even if DriftRail is down, this won't throw
const response = await client.ingest({...});
if (!response.success) {
  console.warn(`DriftRail warning: ${response.error}`);
}
```

Set `failOpen: false` to throw exceptions on errors.
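The fail-open contract described above - capture the error and return a result object instead of throwing - can be sketched generically. `failOpenCall` below is an illustrative helper, not part of the SDK.

```typescript
interface SafeResult<T> {
  success: boolean;
  value?: T;
  error?: string;
}

// Run an async operation; on failure, capture the error instead of
// throwing, mirroring the SDK's fail-open behaviour.
async function failOpenCall<T>(op: () => Promise<T>): Promise<SafeResult<T>> {
  try {
    return { success: true, value: await op() };
  } catch (e) {
    return { success: false, error: e instanceof Error ? e.message : String(e) };
  }
}
```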
## Configuration

```typescript
const client = new DriftRail({
  apiKey: 'dr_live_...',   // Required: your API key
  appId: 'my-app',         // Required: your app identifier
  baseUrl: 'https://...',  // Optional: custom API URL
  timeout: 30000,          // Optional: request timeout (ms)
  failOpen: true,          // Optional: fail silently on errors (default: true)
  guardMode: 'fail_open'   // Optional: 'fail_open' or 'fail_closed' for guard()
});
```

## Serverless Deployment
When deploying to Vercel, Netlify, AWS Lambda, or Cloudflare Workers, always use `await client.ingest()` instead of `ingestAsync()`. Fire-and-forget patterns cause race conditions in serverless environments: the function can terminate before the HTTP request completes.
```typescript
// ✅ Serverless (Vercel, Netlify, Lambda)
await client.ingest({...});

// ❌ Will lose events in serverless
client.ingestAsync({...});
```

See the Serverless Deployment Guide for platform-specific examples.
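One way to make the awaited pattern hard to forget is to wrap your handler so logging always completes before the response is returned. This is only a sketch; `withAwaitedLogging` and its parameters are hypothetical, with `log` standing in for a `client.ingest()` call.

```typescript
// Wrap a request handler so the logging callback is awaited before the
// result is returned - the safe ordering for serverless runtimes.
function withAwaitedLogging<Req, Res>(
  handler: (req: Req) => Promise<Res>,
  log: (req: Req, res: Res) => Promise<void>
): (req: Req) => Promise<Res> {
  return async (req: Req) => {
    const res = await handler(req);
    await log(req, res); // completes before the function returns
    return res;
  };
}
```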
## TypeScript Support

Full TypeScript support with exported types:

```typescript
import type {
  IngestParams,
  IngestResponse,
  Provider,
  InputPayload,
  OutputPayload
} from '@drift_rail/sdk';
```

## License
MIT
