@wardenai/sdk
v0.1.3
Warden SDK — LLM gateway client with circuit breaker fail-open
The control layer between your code and every LLM.
Warden intercepts, attributes, and governs every AI API call in your stack — with one line of code.
- Cost attribution — know exactly what every feature, customer, and team spends on AI
- Zero migration — drop-in replacement for OpenAI, Anthropic, and Gemini SDKs
- Production-grade reliability — built-in circuit breaker, automatic failover, zero downtime
- Real-time control — budgets, policies, and guardrails from a single dashboard
npm install @wardenai/sdk
10-Second Setup
// Before
import OpenAI from 'openai';
const client = new OpenAI();
// After
import { OpenAI } from '@wardenai/sdk';
const client = new OpenAI({
  apiKey: 'warden_live_...',
  baseURL: 'https://api.wardenai.dev/v1/chat',
});
That's it. Every call now flows through Warden. Same API. Same types. Same behavior.
Drop-in Replacement
Warden wraps the official SDKs you already use. No new interfaces. No abstraction layers. No migration.
// Your existing code doesn't change
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
The response is identical. The types are identical. The only difference: Warden now tracks cost, latency, and attribution automatically.
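To illustrate how a wrapper can observe a call without changing what the caller receives, here is a minimal sketch. It is not Warden's implementation: `withTiming`, `CallRecord`, and the `onRecord` callback are hypothetical names introduced only for this example.

```typescript
// Hypothetical record of one observed call; not a Warden type.
interface CallRecord {
  latencyMs: number;
  ok: boolean;
}

// Wraps an async function so the caller gets the same result (or the same
// error), while a record of the call is reported on the side.
function withTiming<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  onRecord: (record: CallRecord) => void,
  now: () => number = Date.now, // injectable clock, useful in tests
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const start = now();
    try {
      const result = await fn(...args);
      onRecord({ latencyMs: now() - start, ok: true });
      return result; // returned untouched
    } catch (err) {
      onRecord({ latencyMs: now() - start, ok: false });
      throw err; // rethrown untouched
    }
  };
}
```

The wrapped function keeps the argument and return types of the original, which is the property that lets existing call sites compile unchanged.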
Multi-Provider Support
OpenAI (WardenClient)
import { WardenClient } from '@wardenai/sdk';
const warden = new WardenClient({
  apiKey: 'warden_live_...',
  gatewayUrl: 'https://api.wardenai.dev',
});
const response = await warden.chat.completions.create(
  { model: 'gpt-4o', messages: [{ role: 'user', content: 'Hello!' }] },
  { feature: 'chat', customerId: 'cust_123' }
);
OpenAI Drop-in
import { OpenAI } from '@wardenai/sdk';
const openai = new OpenAI({
  apiKey: 'warden_live_...',
  baseURL: 'https://api.wardenai.dev/v1/chat',
});
// Identical to the official OpenAI SDK
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
Anthropic
import Anthropic from '@anthropic-ai/sdk';
import { wrapAnthropic } from '@wardenai/sdk';
const client = wrapAnthropic(new Anthropic(), {
  wardenApiKey: 'warden_live_...',
  gatewayUrl: 'https://api.wardenai.dev',
  feature: 'chat',
});
const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }],
});
Google Gemini
import { GoogleGenerativeAI } from '@google/generative-ai';
import { wrapGemini } from '@wardenai/sdk';
const genAI = wrapGemini(new GoogleGenerativeAI('your-google-key'), {
  wardenApiKey: 'warden_live_...',
  gatewayUrl: 'https://api.wardenai.dev',
  feature: 'chat',
});
const model = genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
const result = await model.generateContent('Hello!');
Attribution
Tag every request. Know exactly where your AI spend goes.
const response = await warden.chat.completions.create(
  {
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Find products...' }],
  },
  {
    feature: 'search',
    customerId: 'cust_456',
    team: 'growth',
    version: '2.1.0',
  }
);
| Tag | Required | Description |
|-----|----------|-------------|
| feature | Yes | Product area — "chat", "search", "agents" |
| customerId | No | Per-customer cost tracking |
| team | No | Engineering team responsible |
| version | No | Application version for spend-by-release |
Untagged requests are recorded as "untagged". Requests are never blocked.
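The fallback rule above can be sketched as a small normalization step. This is an illustration, not the SDK's internal code: the `AttributionTags` interface name and the `normalizeTags` helper are hypothetical, and only the four tag names and the "untagged" fallback come from this README.

```typescript
// Tag names taken from the table above; the interface name is hypothetical.
interface AttributionTags {
  feature?: string;
  customerId?: string;
  team?: string;
  version?: string;
}

// Drops empty values and applies the "untagged" fallback for `feature`,
// so a request with missing tags is still recorded rather than rejected.
function normalizeTags(tags: AttributionTags = {}): AttributionTags & { feature: string } {
  const clean: AttributionTags = {};
  for (const key of ['feature', 'customerId', 'team', 'version'] as const) {
    const value = tags[key]?.trim();
    if (value) clean[key] = value;
  }
  return { ...clean, feature: clean.feature ?? 'untagged' };
}
```

For example, `normalizeTags({ customerId: 'cust_456' })` yields a `feature` of `'untagged'` while keeping the customer ID.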
Reliability
Warden is designed to be invisible when things go wrong.
Circuit breaker: if the gateway is unreachable, the SDK calls the LLM provider directly, so a gateway outage does not surface as an error in your application.
| Parameter | Default |
|-----------|---------|
| Failure threshold | 3 consecutive failures in 10s |
| Recovery probe | After 30 seconds |
| Local event queue | Up to 500 events |
| Recovery | Automatic flush on reconnect |
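The defaults in the table above describe a conventional three-state circuit breaker. The sketch below shows one way such a state machine could work, assuming the threshold counts consecutive failures inside a sliding window. The `CircuitBreaker` class and its method names are hypothetical and are not the SDK's internals; only the state names and the default values come from this README.

```typescript
type CircuitState = 'closed' | 'open' | 'half-open';

interface BreakerOptions {
  failureThreshold: number;  // consecutive failures needed to open the circuit
  windowMs: number;          // window in which those failures must occur
  recoveryTimeoutMs: number; // wait before a probe request is allowed
}

class CircuitBreaker {
  private state: CircuitState = 'closed';
  private failures: number[] = []; // timestamps of recent consecutive failures
  private openedAt = 0;
  private readonly opts: BreakerOptions;
  private readonly now: () => number;

  constructor(
    opts: BreakerOptions = { failureThreshold: 3, windowMs: 10_000, recoveryTimeoutMs: 30_000 },
    now: () => number = Date.now, // injectable clock, useful in tests
  ) {
    this.opts = opts;
    this.now = now;
  }

  getState(): CircuitState {
    // An open circuit becomes half-open once the recovery timeout has passed.
    if (this.state === 'open' && this.now() - this.openedAt >= this.opts.recoveryTimeoutMs) {
      this.state = 'half-open';
    }
    return this.state;
  }

  // True when the gateway should be tried; false means call the provider directly.
  shouldTryGateway(): boolean {
    return this.getState() !== 'open';
  }

  recordSuccess(): void {
    this.failures = [];
    this.state = 'closed';
  }

  recordFailure(): void {
    const t = this.now();
    if (this.getState() === 'half-open') {
      // The probe failed, so reopen and restart the recovery timer.
      this.state = 'open';
      this.openedAt = t;
      return;
    }
    // Keep only failures that are still inside the window, then add this one.
    this.failures = this.failures.filter((f) => t - f < this.opts.windowMs);
    this.failures.push(t);
    if (this.failures.length >= this.opts.failureThreshold) {
      this.state = 'open';
      this.openedAt = t;
      this.failures = [];
    }
  }
}
```

A caller would check `shouldTryGateway()` before each request and report the outcome with `recordSuccess()` or `recordFailure()`; while the circuit is open, the request goes straight to the provider.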
// Health check
warden.getCircuitState(); // 'closed' | 'open' | 'half-open'
warden.getBypassQueueSize(); // number of queued events
// Flush before process exit
await warden.flush();
The SDK fails open. Always. Your LLM calls never depend on Warden being available.
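While the gateway is bypassed, usage events have to wait somewhere until they can be delivered. The following is a minimal sketch of a bounded queue with a retry-safe flush. `BypassQueue`, `UsageEvent`, and the `send` callback are hypothetical names, and dropping the oldest event when the queue is full is an assumption: this README states the 500-event cap but not the eviction policy.

```typescript
// Hypothetical event shape for illustration: metadata only.
interface UsageEvent {
  feature: string;
  model: string;
  timestamp: number;
}

class BypassQueue {
  private events: UsageEvent[] = [];
  private readonly maxSize: number;

  constructor(maxSize = 500) {
    this.maxSize = maxSize;
  }

  // Adds an event; when the queue is full, the oldest event is dropped
  // (assumed policy, see the note above).
  push(event: UsageEvent): void {
    if (this.events.length >= this.maxSize) this.events.shift();
    this.events.push(event);
  }

  size(): number {
    return this.events.length;
  }

  // Sends all queued events in order and returns how many were delivered.
  // If sending fails, the batch is put back so a later flush can retry.
  async flush(send: (batch: UsageEvent[]) => Promise<void>): Promise<number> {
    if (this.events.length === 0) return 0;
    const batch = this.events;
    this.events = [];
    try {
      await send(batch);
      return batch.length;
    } catch {
      // Restore the batch ahead of anything queued meanwhile, keeping the cap.
      this.events = [...batch, ...this.events].slice(-this.maxSize);
      return 0;
    }
  }
}
```

In a long-running service a flush like this would typically run on reconnect and once more before the process exits, so queued events are not lost on shutdown.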
Architecture
Your Application
↓
Warden SDK ← attribution tags injected here
↓
Warden Gateway ← cost calculated, policies evaluated
↓
LLM Provider ← OpenAI / Anthropic / Gemini
↓
Control Center ← real-time dashboards, alerts, controls
Warden never stores prompts or responses. Only metadata flows through the control layer.
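To make the metadata-only claim concrete, the sketch below derives a metadata record from a request and response without copying any message text. The `RequestMetadata` shape and the `toMetadata` helper are hypothetical and do not describe Warden's actual wire format; the `usage.prompt_tokens` and `usage.completion_tokens` fields follow the OpenAI chat completion response format.

```typescript
// Simplified request and response shapes, reduced to what the sketch needs.
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
}

interface ChatResponse {
  usage?: { prompt_tokens: number; completion_tokens: number };
  choices: { message: { content: string } }[];
}

// Hypothetical metadata record: counts and labels, never message content.
interface RequestMetadata {
  model: string;
  feature: string;
  promptTokens: number;
  completionTokens: number;
  latencyMs: number;
}

// Builds the record by reading only the model name and token counts.
// Neither `messages` nor `choices` is touched.
function toMetadata(
  req: ChatRequest,
  res: ChatResponse,
  feature: string,
  latencyMs: number,
): RequestMetadata {
  return {
    model: req.model,
    feature,
    promptTokens: res.usage?.prompt_tokens ?? 0,
    completionTokens: res.usage?.completion_tokens ?? 0,
    latencyMs,
  };
}
```

Token counts and the model name are enough to compute cost, which is why prompt and response text never needs to be part of the record.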
Why Warden Exists
AI costs are exploding. Teams have no idea which features, customers, or models are driving spend — until the invoice arrives.
- No attribution — you can't optimize what you can't see
- No control — a single runaway feature can burn through your budget overnight
- No governance — finance asks "what are we spending on AI?" and engineering guesses
Warden fixes this at the infrastructure level, not with dashboards bolted on after the fact.
Control Center
Every request tracked through the SDK appears in the Warden Control Center within 30 seconds.
- Cost breakdown by feature, customer, team, model
- Budget utilization and alerts
- Anomaly detection
- Coming: policy enforcement, spend limits, model governance
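The cost breakdown in the list above amounts to grouping tagged cost events by one dimension and summing. Here is a minimal sketch of that aggregation, assuming a hypothetical `CostEvent` shape with cost stored as integer micro-dollars to avoid floating-point drift; it does not describe how the Control Center computes its figures.

```typescript
// Hypothetical event shape; cost is in millionths of a US dollar.
interface CostEvent {
  feature: string;
  customerId?: string;
  team?: string;
  model: string;
  costMicroUsd: number;
}

type Dimension = 'feature' | 'customerId' | 'team' | 'model';

// Sums cost per value of the chosen dimension. Events that lack the
// dimension are grouped under "untagged".
function breakdownBy(events: CostEvent[], dim: Dimension): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    const key = e[dim] ?? 'untagged';
    totals.set(key, (totals.get(key) ?? 0) + e.costMicroUsd);
  }
  return totals;
}
```

Calling `breakdownBy(events, 'team')` and `breakdownBy(events, 'model')` on the same events gives two views of the same spend, which is what makes per-request tagging useful.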
Configuration
const warden = new WardenClient({
  apiKey: 'warden_live_...',              // Required
  gatewayUrl: 'https://api.wardenai.dev', // Required
  debug: false,                           // Optional: enable debug logging
  circuitBreaker: {                       // Optional: override defaults
    failureThreshold: 3,
    windowMs: 10_000,
    recoveryTimeoutMs: 30_000,
    maxBypassQueueSize: 500,
  },
});
Roadmap
Warden is building the complete AI control layer.
| Phase | Status | What it does |
|-------|--------|-------------|
| Visibility | Live | Cost tracking, attribution, dashboards |
| Control | Building | Budget enforcement, policy engine, guardrails |
| Optimization | Planned | Smart model routing, auto-pacing, cost-aware decisions |
The SDK you install today gets smarter with every release. No migration required.
Requirements
- Node.js 18+
- TypeScript 5+ (optional)
Links
- wardenai.dev: Product
- control.wardenai.dev: Control Center
- GitHub: Source
