optimisely-smart-llm-router
v1.0.1
Intelligent LLM routing SDK that optimizes costs and provides a unified interface for all major AI providers
Optimisely Smart LLM Router SDK
Production-ready multi-LLM architecture with intelligent cost optimization. Cut your AI API costs by 50% with zero config. One SDK for 20+ LLMs.
Why Smart LLM Router?
The Problem: AI companies waste thousands per month on LLM APIs:
- ❌ Paying for expensive models when cheaper ones work fine
- ❌ No caching = paying for duplicate requests
- ❌ Managing 6+ different SDKs and pricing models
- ❌ No visibility into what's driving costs
The Solution: Optimisely Smart Router cuts costs in half automatically:
- ✅ Intelligent routing selects cheapest model that meets requirements
- ✅ 70% cache hit rate = 70% of requests cost $0
- ✅ One unified SDK replaces all provider SDKs
- ✅ Real-time analytics show exactly where money goes
Real Results: YC-backed startup reduced monthly LLM costs from $12,000 → $5,800 (52% savings)
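The core routing idea — pick the cheapest model that still meets the request's quality bar — can be sketched in a few lines. This is a minimal illustration, not the SDK's actual algorithm; the model names, prices, and quality scores below are hypothetical placeholders.

```typescript
// Hypothetical per-model pricing and quality scores (illustrative only).
interface ModelInfo {
  name: string;
  costPer1kTokens: number; // USD per 1k tokens
  quality: number;         // 0..1, higher is better
}

const models: ModelInfo[] = [
  { name: 'small-model',  costPer1kTokens: 0.0005, quality: 0.6 },
  { name: 'medium-model', costPer1kTokens: 0.003,  quality: 0.8 },
  { name: 'large-model',  costPer1kTokens: 0.03,   quality: 0.95 },
];

// Return the cheapest model whose quality meets the requirement,
// or undefined if nothing qualifies.
function pickCheapestModel(
  candidates: ModelInfo[],
  minQuality: number
): ModelInfo | undefined {
  return candidates
    .filter((m) => m.quality >= minQuality)
    .sort((a, b) => a.costPer1kTokens - b.costPer1kTokens)[0];
}
```

With these sample numbers, a simple prompt (minQuality 0.5) routes to small-model, while a demanding one (minQuality 0.9) falls through to large-model.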
Key Features
💰 Smart Cost Optimization
- Automatic model selection based on real-time pricing across 20+ models
- Semantic caching with 70%+ hit rates eliminates redundant API calls
- Zero-config optimization - just install and save money
- Cost analytics dashboard tracks spend by provider, model, and query type
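The caching behaviour can be approximated with a simple keyed TTL cache. The sketch below uses exact-match keys for clarity; a semantic cache additionally matches paraphrased queries via embedding similarity, which is omitted here. The class and method names are hypothetical, not the SDK's API.

```typescript
// Minimal TTL cache keyed on the exact prompt text (semantic matching omitted).
class ResponseCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  get(prompt: string): string | undefined {
    const entry = this.store.get(prompt);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(prompt); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(prompt: string, value: string): void {
    this.store.set(prompt, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```

Every hit in such a cache is a request that never reaches a paid API, which is where the "cached requests cost $0" claim comes from.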
🏗️ Production-Ready Architecture
- Multi-LLM support: Claude, GPT, Gemini, Grok, Llama, Mistral, and 14+ more
- Unified interface: Same code works with all providers
- Enterprise reliability: 99.99% uptime with automatic failover
- Type-safe: Full TypeScript support with comprehensive types
📊 Developer Experience
- One-line installation with npm/yarn
- 5-minute integration - minimal code changes required
- Real-time monitoring: Latency, costs, cache hits, error rates
- Production-tested: Battle-tested routing algorithm (<10ms decision time)
Prerequisites
⚠️ IMPORTANT: API Key Required
Before using this SDK, you must:
- Get a FREE Optimisely API key: Visit https://optimisely.ai/developer-portal
- Register and get your API key (Free tier: 10M requests/month)
- Have LLM provider keys (Anthropic, OpenAI, Google, etc.)
Installation
npm install optimisely-smart-llm-router
Quick Start
Step 1: Get Your API Key
Visit https://optimisely.ai/developer-portal to get your FREE Optimisely API key.
Step 2: Install & Configure
import { SmartLLMRouter } from 'optimisely-smart-llm-router';

// Initialize with your Optimisely API key (from developer portal)
const router = new SmartLLMRouter({
  apiKey: process.env.OPTIMISELY_API_KEY // Get from https://optimisely.ai/developer-portal
});

// Configure your LLM provider keys (one-time setup)
await router.configure({
  providers: {
    claude: process.env.ANTHROPIC_API_KEY,
    openai: process.env.OPENAI_API_KEY,
    gemini: process.env.GOOGLE_API_KEY,
    // Add more providers as needed
  }
});

Step 3: Start Saving Money
// That's it! The router automatically optimizes costs
const response = await router.chat({
  messages: [
    { role: 'user', content: 'Explain quantum computing' }
  ],
  strategy: 'cost-optimized' // Automatically routes to the cheapest model
});

console.log(response.content);
console.log('💰 Cost saved: $' + response.savings);
console.log('📊 Provider used: ' + response.provider);
console.log('⚡ Latency: ' + response.latency + 'ms');

You're done! The router will now:
- ✅ Automatically select the cheapest model for each request
- ✅ Cache responses to eliminate duplicate API calls
- ✅ Track costs and savings in real-time
- ✅ Failover to backup models if one goes down
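Failover of this kind is typically an ordered retry loop over backup models. A minimal sketch, assuming a caller-supplied `callModel` function (hypothetical, not part of the SDK's surface):

```typescript
// Try each model in order; return the first successful response.
// If every model fails, rethrow the last error.
async function withFailover(
  modelOrder: string[],
  callModel: (model: string) => Promise<string>
): Promise<string> {
  let lastError: unknown;
  for (const model of modelOrder) {
    try {
      return await callModel(model); // success: stop here
    } catch (err) {
      lastError = err; // remember the failure, fall through to the next model
    }
  }
  throw lastError ?? new Error('no models configured');
}
```

A production router would also add per-model timeouts and backoff, but the control flow is the same.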
Routing Strategies
Cost-Optimized (Default)
Prioritizes the most cost-effective models while maintaining quality.
const response = await router.chat({
  messages: [...],
  strategy: 'cost-optimized'
});

Performance
Prioritizes speed and quality over cost.
const response = await router.chat({
  messages: [...],
  strategy: 'performance'
});

Balanced
Balances cost and performance.
const response = await router.chat({
  messages: [...],
  strategy: 'balanced'
});

Advanced Configuration
const response = await router.chat({
  messages: [
    { role: 'system', content: 'You are a helpful assistant' },
    { role: 'user', content: 'Write a complex algorithm' }
  ],
  routing: {
    strategy: 'balanced',
    preferredProviders: ['claude', 'gpt'],
    maxCost: 0.05,
    minPerformance: 0.8
  },
  cache: {
    enabled: true,
    ttl: 3600,
    semanticSimilarity: true
  }
});

Streaming
const stream = await router.chatStream({
  messages: [{ role: 'user', content: 'Write a story' }],
  strategy: 'cost-optimized'
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}

Analytics
// Get usage metrics
const metrics = await router.getMetrics({
  timeRange: 'last-30-days'
});

console.log('Total cost: $' + metrics.totalCost);
console.log('Total savings: $' + metrics.totalSavings);
console.log('Cache hit rate: ' + metrics.cacheHitRate + '%');

// Get detailed analytics
const analytics = await router.analytics.getSummary();

console.log('Requests this month:', analytics.totalRequests);
console.log('Cost breakdown by provider:', analytics.costByProvider);
console.log('Average latency:', analytics.avgLatency + 'ms');
console.log('Most used models:', analytics.topModels);

Supported Providers
- Anthropic: Claude Sonnet 4.5, Opus 4, Haiku 3.5
- OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5
- Google: Gemini 2.0 Flash, Gemini Pro, Gemini Ultra
- xAI: Grok-2, Grok-1.5
- Meta: Llama 3.3, Llama 3.2
- Mistral AI: Mistral Large, Mistral Medium
- And 14+ more providers...
Configuration Options
interface SmartLLMRouterConfig {
  apiKey: string;                    // Your Optimisely API key
  defaultStrategy?: RoutingStrategy; // 'cost-optimized' | 'performance' | 'balanced'
  cache?: {
    enabled?: boolean;
    ttl?: number;
    maxSize?: number;
  };
  failover?: {
    enabled?: boolean;
    maxRetries?: number;
  };
  analytics?: {
    enabled?: boolean;
    trackTokens?: boolean;
  };
}

API Reference
SmartLLMRouter
Constructor
new SmartLLMRouter(config: SmartLLMRouterConfig)
Methods
- configure(providers: ProviderConfig): Promise<void> - Configure LLM provider API keys
- chat(options: ChatOptions): Promise<ChatResponse> - Send a chat request
- chatStream(options: ChatOptions): AsyncIterator<ChatChunk> - Stream a chat response
- getMetrics(options?: MetricsOptions): Promise<Metrics> - Get usage metrics
- analytics.getSummary(): Promise<AnalyticsSummary> - Get detailed analytics
Cost Savings Example
Without Smart Router
// Direct GPT-4 Turbo usage
Total monthly cost: $750

With Smart Router
// Optimized routing + caching
Total monthly cost: $375
Monthly savings: $375 (50%)

License
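The headline figure is easy to sanity-check as arithmetic: effective cost is the baseline, discounted once by the fraction of requests served from cache (which cost nothing) and again by the cheaper-model discount on the uncached remainder. The function and the specific rates below are illustrative assumptions, not measured values.

```typescript
// Effective monthly cost under caching + cheaper-model routing.
function effectiveCost(
  baseline: number,       // cost with no optimization, e.g. 750 (USD)
  cacheHitRate: number,   // fraction of requests served from cache at $0
  routingDiscount: number // fractional cost reduction on uncached requests
): number {
  return baseline * (1 - cacheHitRate) * (1 - routingDiscount);
}

// e.g. effectiveCost(750, 0.375, 0.2) ≈ 375, i.e. 50% savings
```

Different hit-rate/discount combinations yield the same total; the two levers multiply rather than add.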
MIT
Support
- Documentation: https://docs.optimisely.ai/smart-llm-router
- Issues: https://github.com/optimisely/smart-llm-router/issues
- Email: [email protected]
