@wardenai/sdk

v0.1.3

Published

17 days ago

Warden SDK — LLM gateway client with circuit breaker fail-open

Vulnerabilities: 0 High, 0 Medium, 0 Low

amarinder

@wardenai/sdk

The control layer between your code and every LLM.

Warden intercepts, attributes, and governs every AI API call in your stack — with one line of code.

Cost attribution — know exactly what every feature, customer, and team spends on AI
Zero migration — drop-in replacement for OpenAI, Anthropic, and Gemini SDKs
Production-grade reliability — built-in circuit breaker, automatic failover, zero downtime
Real-time control — budgets, policies, and guardrails from a single dashboard

npm install @wardenai/sdk

10-Second Setup

// Before
import OpenAI from 'openai';
const client = new OpenAI();

// After
import { OpenAI } from '@wardenai/sdk';
const client = new OpenAI({
  apiKey: 'warden_live_...',
  baseURL: 'https://api.wardenai.dev/v1/chat',
});

That's it. Every call now flows through Warden. Same API. Same types. Same behavior.

Drop-in Replacement

Warden wraps the official SDKs you already use. No new interfaces. No abstraction layers. No migration.

// Your existing code doesn't change
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});

The response is identical. The types are identical. The only difference: Warden now tracks cost, latency, and attribution — automatically.
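
As a rough illustration of how a wrapper can record latency without changing what the caller sees, the sketch below wraps any async function and returns one with the same signature. It is not Warden's implementation; `instrument` and `CallRecord` are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch: `instrument` and `CallRecord` are illustrative names,
// not part of @wardenai/sdk.
interface CallRecord {
  latencyMs: number;
  ok: boolean;
}

function instrument<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  record: (r: CallRecord) => void,
  now: () => number = Date.now, // injectable clock, useful in tests
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const start = now();
    try {
      const result = await fn(...args);
      record({ latencyMs: now() - start, ok: true });
      return result; // the caller receives the untouched result
    } catch (err) {
      record({ latencyMs: now() - start, ok: false });
      throw err; // errors propagate exactly as they did before wrapping
    }
  };
}
```

Because the wrapped function has the same parameter and return types as the original, existing call sites compile and behave the same way.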

Multi-Provider Support

OpenAI (WardenClient)

import { WardenClient } from '@wardenai/sdk';

const warden = new WardenClient({
  apiKey: 'warden_live_...',
  gatewayUrl: 'https://api.wardenai.dev',
});

const response = await warden.chat.completions.create(
  { model: 'gpt-4o', messages: [{ role: 'user', content: 'Hello!' }] },
  { feature: 'chat', customerId: 'cust_123' }
);

OpenAI Drop-in

import { OpenAI } from '@wardenai/sdk';

const openai = new OpenAI({
  apiKey: 'warden_live_...',
  baseURL: 'https://api.wardenai.dev/v1/chat',
});

// Identical to the official OpenAI SDK
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});

Anthropic

import Anthropic from '@anthropic-ai/sdk';
import { wrapAnthropic } from '@wardenai/sdk';

const client = wrapAnthropic(new Anthropic(), {
  wardenApiKey: 'warden_live_...',
  gatewayUrl: 'https://api.wardenai.dev',
  feature: 'chat',
});

const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }],
});

Google Gemini

import { GoogleGenerativeAI } from '@google/generative-ai';
import { wrapGemini } from '@wardenai/sdk';

const genAI = wrapGemini(new GoogleGenerativeAI('your-google-key'), {
  wardenApiKey: 'warden_live_...',
  gatewayUrl: 'https://api.wardenai.dev',
  feature: 'chat',
});

const model = genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
const result = await model.generateContent('Hello!');

Attribution

Tag every request. Know exactly where your AI spend goes.

const response = await warden.chat.completions.create(
  {
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Find products...' }],
  },
  {
    feature: 'search',
    customerId: 'cust_456',
    team: 'growth',
    version: '2.1.0',
  }
);

| Tag | Required | Description |
|-----|----------|-------------|
| feature | Yes | Product area, e.g. "chat", "search", "agents" |
| customerId | No | Per-customer cost tracking |
| team | No | Engineering team responsible |
| version | No | Application version for spend-by-release |

Although feature is marked as required, a request without it is still sent and is recorded as "untagged". Requests are never blocked.
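
The fallback described above can be pictured as a small normalization step. This is a minimal sketch under that assumption; `AttributionTags` and `normalizeTags` are hypothetical names, and the SDK's internal handling may differ.

```typescript
// Hypothetical sketch of tag handling; names are illustrative,
// not the SDK's internals.
interface AttributionTags {
  feature?: string;
  customerId?: string;
  team?: string;
  version?: string;
}

// Apply the documented fallback so a missing feature tag never blocks a request.
function normalizeTags(
  tags: AttributionTags = {},
): AttributionTags & { feature: string } {
  const feature = tags.feature?.trim();
  return { ...tags, feature: feature ? feature : 'untagged' };
}
```

With this shape, `normalizeTags({ team: 'growth' })` keeps the team tag and fills in `feature: 'untagged'`.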

Reliability

Warden is designed to be invisible when things go wrong.

Circuit breaker: if the gateway is unreachable, the SDK calls the LLM provider directly, so a gateway outage does not surface as an error in your application.

| Parameter | Default |
|-----------|---------|
| Failure threshold | 3 consecutive failures in 10s |
| Recovery probe | After 30 seconds |
| Local event queue | Up to 500 events |
| Recovery | Automatic flush on reconnect |
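
To make the defaults in the table concrete, here is a minimal sketch of a three-state breaker driven by those numbers. It is an illustration only, not the SDK's code: the `CircuitBreaker` class and its method names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```typescript
// Hypothetical sketch of a fail-open circuit breaker; not the SDK's actual code.
type CircuitState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private failures: number[] = []; // timestamps of consecutive failures
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly windowMs: number;
  private readonly recoveryTimeoutMs: number;

  constructor(failureThreshold = 3, windowMs = 10_000, recoveryTimeoutMs = 30_000) {
    this.failureThreshold = failureThreshold;
    this.windowMs = windowMs;
    this.recoveryTimeoutMs = recoveryTimeoutMs;
  }

  getState(now: number): CircuitState {
    if (this.openedAt === null) return 'closed';
    return now - this.openedAt >= this.recoveryTimeoutMs ? 'half-open' : 'open';
  }

  // true: route the call through the gateway (normal traffic or a recovery probe).
  // false: fail open and call the provider directly.
  shouldUseGateway(now: number): boolean {
    return this.getState(now) !== 'open';
  }

  recordSuccess(): void {
    this.failures = [];
    this.openedAt = null; // a successful call or probe closes the circuit
  }

  recordFailure(now: number): void {
    if (this.openedAt !== null) {
      this.openedAt = now; // failed probe: restart the recovery timer
      return;
    }
    // Keep only failures inside the sliding window, then add this one.
    this.failures = this.failures.filter((t) => now - t < this.windowMs);
    this.failures.push(now);
    if (this.failures.length >= this.failureThreshold) {
      this.openedAt = now;
      this.failures = [];
    }
  }
}
```

A caller would check `shouldUseGateway(Date.now())` before each request and fall back to the provider's own endpoint when it returns false.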

// Health check
warden.getCircuitState();    // 'closed' | 'open' | 'half-open'
warden.getBypassQueueSize(); // number of queued events

// Flush before process exit
await warden.flush();
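
The local event queue behind `getBypassQueueSize()` and `flush()` can be thought of as a bounded buffer. The sketch below is a simplified, synchronous illustration: `BypassQueue` is a hypothetical name, and dropping the oldest event on overflow is an assumption, since the behavior at the 500-event limit is not documented here.

```typescript
// Hypothetical sketch of a bounded bypass queue.
// Assumption: when full, the oldest event is discarded.
class BypassQueue<T> {
  private items: T[] = [];
  private readonly maxSize: number;

  constructor(maxSize = 500) {
    this.maxSize = maxSize;
  }

  get size(): number {
    return this.items.length;
  }

  enqueue(event: T): void {
    if (this.items.length >= this.maxSize) {
      this.items.shift(); // assumption: discard the oldest event
    }
    this.items.push(event);
  }

  // Hands every queued event to `send` in arrival order, empties the queue,
  // and returns the number of events delivered.
  flush(send: (event: T) => void): number {
    const batch = this.items;
    this.items = [];
    batch.forEach((e) => send(e));
    return batch.length;
  }
}
```

The real `flush()` is awaited, so it presumably performs network I/O; this version only shows the bookkeeping.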

The SDK fails open. Always. Your LLM calls never depend on Warden being available.

Architecture

Your Application
      ↓
  Warden SDK          ← attribution tags injected here
      ↓
  Warden Gateway      ← cost calculated, policies evaluated
      ↓
  LLM Provider        ← OpenAI / Anthropic / Gemini
      ↓
  Control Center      ← real-time dashboards, alerts, controls

Warden never stores prompts or responses. Only metadata flows through the control layer.
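
To illustrate what "only metadata" could mean in practice, the sketch below reduces an OpenAI-style chat completion to a usage record that carries no prompt or response text. The `UsageEvent` shape and `toUsageEvent` function are assumptions made for this example, not Warden's wire format.

```typescript
// Hypothetical sketch: keep metadata, drop content.
// `UsageEvent` is an assumed shape, not Warden's actual event format.
interface ChatCompletionLike {
  model: string;
  usage?: { prompt_tokens: number; completion_tokens: number };
  choices: { message: { content: string | null } }[];
}

interface UsageEvent {
  model: string;
  promptTokens: number;
  completionTokens: number;
  latencyMs: number;
  feature: string;
}

function toUsageEvent(
  response: ChatCompletionLike,
  feature: string,
  latencyMs: number,
): UsageEvent {
  // Only counts and identifiers are copied; `choices` is never read.
  return {
    model: response.model,
    promptTokens: response.usage?.prompt_tokens ?? 0,
    completionTokens: response.usage?.completion_tokens ?? 0,
    latencyMs,
    feature,
  };
}
```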

Why Warden Exists

AI costs are exploding. Teams have no idea which features, customers, or models are driving spend — until the invoice arrives.

No attribution — you can't optimize what you can't see
No control — a single runaway feature can burn through your budget overnight
No governance — finance asks "what are we spending on AI?" and engineering guesses

Warden fixes this at the infrastructure level, not with dashboards bolted on after the fact.

Control Center

Every request tracked through the SDK appears in the Warden Control Center within 30 seconds.

Cost breakdown by feature, customer, team, model
Budget utilization and alerts
Anomaly detection
Coming: policy enforcement, spend limits, model governance
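
A cost breakdown by feature, customer, team, or model amounts to grouping spend events by one tag and summing. The sketch below shows that idea; `SpendEvent` and `costBy` are hypothetical names, and any figures passed to it in an example would be made up rather than real Control Center output.

```typescript
// Hypothetical sketch of a cost breakdown; the event shape is an assumption.
interface SpendEvent {
  feature: string;
  customerId?: string;
  team?: string;
  model: string;
  costUsd: number;
}

type Dimension = 'feature' | 'customerId' | 'team' | 'model';

// Sum cost per distinct value of the chosen dimension.
// Events missing an optional tag are grouped under "unknown".
function costBy(events: SpendEvent[], dimension: Dimension): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    const key = e[dimension] ?? 'unknown';
    totals.set(key, (totals.get(key) ?? 0) + e.costUsd);
  }
  return totals;
}
```

Calling `costBy(events, 'team')` on the same events gives the per-team view without any change to how the events were recorded.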

Configuration

const warden = new WardenClient({
  apiKey: 'warden_live_...',                  // Required
  gatewayUrl: 'https://api.wardenai.dev',     // Required
  debug: false,                                // Optional — enable debug logging
  circuitBreaker: {                            // Optional — override defaults
    failureThreshold: 3,
    windowMs: 10_000,
    recoveryTimeoutMs: 30_000,
    maxBypassQueueSize: 500,
  },
});
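
In practice the API key is usually kept out of source code. The sketch below builds the same options object from environment variables and merges any circuit breaker overrides onto the defaults listed above. The variable names `WARDEN_API_KEY`, `WARDEN_GATEWAY_URL`, and `WARDEN_DEBUG` and the `loadConfig` helper are hypothetical; the SDK does not document them.

```typescript
// Hypothetical sketch: the env variable names and loadConfig are assumptions.
interface CircuitBreakerOptions {
  failureThreshold: number;
  windowMs: number;
  recoveryTimeoutMs: number;
  maxBypassQueueSize: number;
}

interface WardenConfig {
  apiKey: string;
  gatewayUrl: string;
  debug: boolean;
  circuitBreaker: CircuitBreakerOptions;
}

// Defaults copied from the Configuration example.
const DEFAULT_BREAKER: CircuitBreakerOptions = {
  failureThreshold: 3,
  windowMs: 10_000,
  recoveryTimeoutMs: 30_000,
  maxBypassQueueSize: 500,
};

function loadConfig(
  env: Record<string, string | undefined>,
  overrides: Partial<CircuitBreakerOptions> = {},
): WardenConfig {
  const apiKey = env['WARDEN_API_KEY'];
  const gatewayUrl = env['WARDEN_GATEWAY_URL'];
  if (!apiKey) throw new Error('WARDEN_API_KEY is required');
  if (!gatewayUrl) throw new Error('WARDEN_GATEWAY_URL is required');
  return {
    apiKey,
    gatewayUrl,
    debug: env['WARDEN_DEBUG'] === 'true',
    circuitBreaker: { ...DEFAULT_BREAKER, ...overrides },
  };
}
```

In a Node.js process the result of `loadConfig(process.env)` could then be passed to `new WardenClient(...)`.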

Roadmap

Warden is building the complete AI control layer.

| Phase | Status | What it does |
|-------|--------|-------------|
| Visibility | Live | Cost tracking, attribution, dashboards |
| Control | Building | Budget enforcement, policy engine, guardrails |
| Optimization | Planned | Smart model routing, auto-pacing, cost-aware decisions |

The SDK you install today gets smarter with every release. No migration required.

Requirements

Node.js 18+
TypeScript 5+ (optional)

Links

wardenai.dev: Product
control.wardenai.dev: Control Center
GitHub: Source
