optimisely-smart-llm-router
v1.0.1
Intelligent LLM routing SDK that optimizes costs and provides a unified interface for all major AI providers
Optimisely Smart LLM Router SDK
Production-ready multi-LLM architecture with intelligent cost optimization. Cut your AI API costs by 50% with zero config. One SDK for 20+ LLMs.
Why Smart LLM Router?
The Problem: AI companies waste thousands per month on LLM APIs:
- ❌ Paying for expensive models when cheaper ones work fine
- ❌ No caching = paying for duplicate requests
- ❌ Managing 6+ different SDKs and pricing models
- ❌ No visibility into what's driving costs
The Solution: Optimisely Smart Router cuts costs in half automatically:
- ✅ Intelligent routing selects cheapest model that meets requirements
- ✅ 70% cache hit rate = 70% of requests cost $0
- ✅ One unified SDK replaces all provider SDKs
- ✅ Real-time analytics show exactly where money goes
Real Results: YC-backed startup reduced monthly LLM costs from $12,000 → $5,800 (52% savings)
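The core routing idea — pick the cheapest model that still meets the request's quality bar — can be sketched in a few lines. This is a minimal illustration, not the SDK's actual algorithm; the model names, prices, and quality scores below are hypothetical placeholders.

```typescript
// Hypothetical per-model pricing and quality scores (illustrative only).
interface ModelInfo {
  name: string;
  costPer1kTokens: number; // USD per 1k tokens
  quality: number;         // 0..1, higher is better
}

const models: ModelInfo[] = [
  { name: 'small-model',  costPer1kTokens: 0.0005, quality: 0.6 },
  { name: 'medium-model', costPer1kTokens: 0.003,  quality: 0.8 },
  { name: 'large-model',  costPer1kTokens: 0.03,   quality: 0.95 },
];

// Return the cheapest model whose quality meets the requirement,
// or undefined if nothing qualifies.
function pickCheapestModel(
  candidates: ModelInfo[],
  minQuality: number
): ModelInfo | undefined {
  return candidates
    .filter((m) => m.quality >= minQuality)
    .sort((a, b) => a.costPer1kTokens - b.costPer1kTokens)[0];
}
```

With these sample numbers, a simple prompt (minQuality 0.5) routes to small-model, while a demanding one (minQuality 0.9) falls through to large-model.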
Key Features
💰 Smart Cost Optimization
- Automatic model selection based on real-time pricing across 20+ models
- Semantic caching with 70%+ hit rates eliminates redundant API calls
- Zero-config optimization - just install and save money
- Cost analytics dashboard tracks spend by provider, model, and query type
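The caching behaviour can be approximated with a simple keyed TTL cache. The sketch below uses exact-match keys for clarity; a semantic cache additionally matches paraphrased queries via embedding similarity, which is omitted here. The class and method names are hypothetical, not the SDK's API.

```typescript
// Minimal TTL cache keyed on the exact prompt text (semantic matching omitted).
class ResponseCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  get(prompt: string): string | undefined {
    const entry = this.store.get(prompt);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(prompt); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(prompt: string, value: string): void {
    this.store.set(prompt, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```

Every hit in such a cache is a request that never reaches a paid API, which is where the "cached requests cost $0" claim comes from.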
🏗️ Production-Ready Architecture
- Multi-LLM support: Claude, GPT, Gemini, Grok, Llama, Mistral, and 14+ more
- Unified interface: Same code works with all providers
- Enterprise reliability: 99.99% uptime with automatic failover
- Type-safe: Full TypeScript support with comprehensive types
📊 Developer Experience
- One-line installation with npm/yarn
- 5-minute integration - minimal code changes required
- Real-time monitoring: Latency, costs, cache hits, error rates
- Production-tested: Battle-tested routing algorithm (<10ms decision time)
Prerequisites
⚠️ IMPORTANT: API Key Required
Before using this SDK, you must:
- Get a FREE Optimisely API key: Visit https://optimisely.ai/developer-portal
- Register and get your API key (Free tier: 10M requests/month)
- Have LLM provider keys (Anthropic, OpenAI, Google, etc.)
Installation
npm install optimisely-smart-llm-router
Quick Start
Step 1: Get Your API Key
Visit https://optimisely.ai/developer-portal to get your FREE Optimisely API key.
Step 2: Install & Configure
import { SmartLLMRouter } from 'optimisely-smart-llm-router';

// Initialize with your Optimisely API key (from developer portal)
const router = new SmartLLMRouter({
  apiKey: process.env.OPTIMISELY_API_KEY // Get from https://optimisely.ai/developer-portal
});

// Configure your LLM provider keys (one-time setup)
await router.configure({
  providers: {
    claude: process.env.ANTHROPIC_API_KEY,
    openai: process.env.OPENAI_API_KEY,
    gemini: process.env.GOOGLE_API_KEY,
    // Add more providers as needed
  }
});

Step 3: Start Saving Money
// That's it! The router automatically optimizes costs
const response = await router.chat({
  messages: [
    { role: 'user', content: 'Explain quantum computing' }
  ],
  strategy: 'cost-optimized' // Automatically routes to the cheapest model
});

console.log(response.content);
console.log('💰 Cost saved: $' + response.savings);
console.log('📊 Provider used: ' + response.provider);
console.log('⚡ Latency: ' + response.latency + 'ms');

You're done! The router will now:
- ✅ Automatically select the cheapest model for each request
- ✅ Cache responses to eliminate duplicate API calls
- ✅ Track costs and savings in real-time
- ✅ Failover to backup models if one goes down
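Failover of this kind is typically an ordered retry loop over backup models. A minimal sketch, assuming a caller-supplied `callModel` function (hypothetical, not part of the SDK's surface):

```typescript
// Try each model in order; return the first successful response.
// If every model fails, rethrow the last error.
async function withFailover(
  modelOrder: string[],
  callModel: (model: string) => Promise<string>
): Promise<string> {
  let lastError: unknown;
  for (const model of modelOrder) {
    try {
      return await callModel(model); // success: stop here
    } catch (err) {
      lastError = err; // remember the failure, fall through to the next model
    }
  }
  throw lastError ?? new Error('no models configured');
}
```

A production router would also add per-model timeouts and backoff, but the control flow is the same.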
Routing Strategies
Cost-Optimized (Default)
Prioritizes the most cost-effective models while maintaining quality.
const response = await router.chat({
  messages: [...],
  strategy: 'cost-optimized'
});

Performance
Prioritizes speed and quality over cost.
const response = await router.chat({
  messages: [...],
  strategy: 'performance'
});

Balanced
Balances cost and performance.
const response = await router.chat({
  messages: [...],
  strategy: 'balanced'
});

Advanced Configuration
const response = await router.chat({
  messages: [
    { role: 'system', content: 'You are a helpful assistant' },
    { role: 'user', content: 'Write a complex algorithm' }
  ],
  routing: {
    strategy: 'balanced',
    preferredProviders: ['claude', 'gpt'],
    maxCost: 0.05,
    minPerformance: 0.8
  },
  cache: {
    enabled: true,
    ttl: 3600,
    semanticSimilarity: true
  }
});

Streaming
const stream = await router.chatStream({
  messages: [{ role: 'user', content: 'Write a story' }],
  strategy: 'cost-optimized'
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}

Analytics
// Get usage metrics
const metrics = await router.getMetrics({
  timeRange: 'last-30-days'
});

console.log('Total cost: $' + metrics.totalCost);
console.log('Total savings: $' + metrics.totalSavings);
console.log('Cache hit rate: ' + metrics.cacheHitRate + '%');

// Get detailed analytics
const analytics = await router.analytics.getSummary();

console.log('Requests this month:', analytics.totalRequests);
console.log('Cost breakdown by provider:', analytics.costByProvider);
console.log('Average latency:', analytics.avgLatency + 'ms');
console.log('Most used models:', analytics.topModels);

Supported Providers
- Anthropic: Claude Sonnet 4.5, Opus 4, Haiku 3.5
- OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5
- Google: Gemini 2.0 Flash, Gemini Pro, Gemini Ultra
- xAI: Grok-2, Grok-1.5
- Meta: Llama 3.3, Llama 3.2
- Mistral AI: Mistral Large, Mistral Medium
- And 14+ more providers...
Configuration Options
interface SmartLLMRouterConfig {
  apiKey: string;                    // Your Optimisely API key
  defaultStrategy?: RoutingStrategy; // 'cost-optimized' | 'performance' | 'balanced'
  cache?: {
    enabled?: boolean;
    ttl?: number;
    maxSize?: number;
  };
  failover?: {
    enabled?: boolean;
    maxRetries?: number;
  };
  analytics?: {
    enabled?: boolean;
    trackTokens?: boolean;
  };
}

API Reference
SmartLLMRouter
Constructor
new SmartLLMRouter(config: SmartLLMRouterConfig)
Methods
- configure(providers: ProviderConfig): Promise<void> - Configure LLM provider API keys
- chat(options: ChatOptions): Promise<ChatResponse> - Send a chat request
- chatStream(options: ChatOptions): AsyncIterator<ChatChunk> - Stream a chat response
- getMetrics(options?: MetricsOptions): Promise<Metrics> - Get usage metrics
- analytics.getSummary(): Promise<AnalyticsSummary> - Get detailed analytics
Cost Savings Example
Without Smart Router
// Direct GPT-4 Turbo usage
Total monthly cost: $750

With Smart Router
// Optimized routing + caching
Total monthly cost: $375
Monthly savings: $375 (50%)

License
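The headline figure is easy to sanity-check as arithmetic: effective cost is the baseline, discounted once by the fraction of requests served from cache (which cost nothing) and again by the cheaper-model discount on the uncached remainder. The function and the specific rates below are illustrative assumptions, not measured values.

```typescript
// Effective monthly cost under caching + cheaper-model routing.
function effectiveCost(
  baseline: number,       // cost with no optimization, e.g. 750 (USD)
  cacheHitRate: number,   // fraction of requests served from cache at $0
  routingDiscount: number // fractional cost reduction on uncached requests
): number {
  return baseline * (1 - cacheHitRate) * (1 - routingDiscount);
}

// e.g. effectiveCost(750, 0.375, 0.2) ≈ 375, i.e. 50% savings
```

Different hit-rate/discount combinations yield the same total; the two levers multiply rather than add.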
MIT
Support
- Documentation: https://docs.optimisely.ai/smart-llm-router
- Issues: https://github.com/optimisely/smart-llm-router/issues
- Email: [email protected]
