@bernierllc/ai-provider-router
v1.0.1
Intelligent AI provider routing service with load balancing, cost optimization, and automatic failover
Installation
npm install @bernierllc/ai-provider-router
Features
- Multiple Routing Strategies: Round-robin, least-cost, fastest-response, load-balanced, quality-based, priority
- Automatic Failover: Seamlessly switch to alternate providers when one fails
- Cost Tracking: Track token usage and costs across providers
- Load Balancing: Distribute requests based on configurable weights
- Quota Management: Track and enforce usage limits
- Health Monitoring: Real-time provider health status
Usage
import { AIProviderRouter } from '@bernierllc/ai-provider-router';

const router = new AIProviderRouter({
  strategy: 'load-balanced',
  enableFailover: true,
  enableCostOptimization: true,
  providers: [
    {
      name: 'openai',
      priority: 10,
      weight: 2,
      enabled: true,
      costPer1kTokens: 0.002,
    },
    {
      name: 'anthropic',
      priority: 9,
      weight: 1,
      enabled: true,
      costPer1kTokens: 0.003,
    },
  ],
});

// Initialize (loads provider clients)
await router.initialize();

// Route a request
const result = await router.route({
  prompt: 'Write a haiku about coding',
  maxTokens: 100,
  temperature: 0.7,
});

if (result.success) {
  console.log(`Response from ${result.provider}:`, result.response?.content);
  console.log(`Cost: $${result.metadata.cost.toFixed(4)}`);
}
Routing Strategies
Round-Robin
Distributes requests evenly across available providers.
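As a rough illustration of the idea (not the package's internal code), round-robin selection can be sketched as a cursor that cycles through the enabled providers. `RoundRobinProvider` and `createRoundRobin` are hypothetical names introduced only for this example.

```typescript
// Hypothetical sketch; these names are not part of @bernierllc/ai-provider-router.
interface RoundRobinProvider {
  name: string;
  enabled: boolean;
}

// Returns a function that yields the next enabled provider name on each call.
function createRoundRobin(providers: RoundRobinProvider[]): () => string | undefined {
  let cursor = 0;
  return () => {
    const active = providers.filter((p) => p.enabled);
    if (active.length === 0) return undefined;
    const chosen = active[cursor % active.length];
    cursor += 1;
    return chosen?.name;
  };
}

const nextProvider = createRoundRobin([
  { name: 'openai', enabled: true },
  { name: 'anthropic', enabled: true },
]);

// Four consecutive picks alternate between the two providers.
const rrPicks = [nextProvider(), nextProvider(), nextProvider(), nextProvider()];
console.log(rrPicks.join(', '));
```

With the router itself, the strategy is selected through configuration: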
const router = new AIProviderRouter({
  strategy: 'round-robin',
  providers: [...],
});
Least-Cost
Selects the cheapest provider based on cost per 1K tokens.
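To make the arithmetic concrete, here is a minimal sketch of picking the cheapest provider and estimating a request's cost as `tokens / 1000 * costPer1kTokens`. The helper names are hypothetical, and the prices are simply the illustrative values used in this README, not real provider pricing.

```typescript
// Hypothetical sketch; not the package's implementation.
interface PricedProvider {
  name: string;
  costPer1kTokens: number;
}

// Returns the provider with the lowest cost per 1K tokens, if any.
function cheapestProvider(providers: PricedProvider[]): PricedProvider | undefined {
  return providers.reduce<PricedProvider | undefined>(
    (best, p) => (best === undefined || p.costPer1kTokens < best.costPer1kTokens ? p : best),
    undefined,
  );
}

// Cost of a request given its token count and a per-1K-token price.
function estimateCost(tokens: number, costPer1kTokens: number): number {
  return (tokens / 1000) * costPer1kTokens;
}

const pricedProviders: PricedProvider[] = [
  { name: 'openai', costPer1kTokens: 0.002 },
  { name: 'anthropic', costPer1kTokens: 0.003 },
];

const cheapest = cheapestProvider(pricedProviders);
const estimatedCost = cheapest ? estimateCost(1500, cheapest.costPer1kTokens) : 0;
console.log(cheapest?.name, estimatedCost.toFixed(4));
```

The corresponding router configuration: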
const router = new AIProviderRouter({
  strategy: 'least-cost',
  enableCostOptimization: true,
  providers: [
    { name: 'openai', costPer1kTokens: 0.002, ... },
    { name: 'anthropic', costPer1kTokens: 0.003, ... },
  ],
});
Fastest-Response
Selects the provider with the lowest average latency.
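One plausible way to picture this is a tracker that keeps a running latency average per provider and returns the lowest. The sketch below assumes a plain arithmetic mean over all samples; the package may use a different window or weighting, and `LatencyTracker` is a hypothetical name.

```typescript
// Hypothetical sketch; assumes a simple mean over all recorded samples.
class LatencyTracker {
  private readonly samples: Record<string, { sumMs: number; count: number }> = {};

  // Record one observed latency for a provider.
  record(provider: string, latencyMs: number): void {
    const entry = this.samples[provider] ?? { sumMs: 0, count: 0 };
    entry.sumMs += latencyMs;
    entry.count += 1;
    this.samples[provider] = entry;
  }

  // Name of the provider with the lowest average latency, if any samples exist.
  fastest(): string | undefined {
    let bestName: string | undefined;
    let bestAvg = Infinity;
    for (const provider of Object.keys(this.samples)) {
      const entry = this.samples[provider];
      if (!entry) continue;
      const avg = entry.sumMs / entry.count;
      if (avg < bestAvg) {
        bestAvg = avg;
        bestName = provider;
      }
    }
    return bestName;
  }
}

const latencyTracker = new LatencyTracker();
latencyTracker.record('openai', 420);
latencyTracker.record('openai', 380); // openai averages 400 ms
latencyTracker.record('anthropic', 300);
latencyTracker.record('anthropic', 350); // anthropic averages 325 ms
const fastestProvider = latencyTracker.fastest();
console.log(fastestProvider);
```

The corresponding router configuration: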
const router = new AIProviderRouter({
  strategy: 'fastest-response',
  providers: [...],
});
Load-Balanced
Distributes requests based on configured weights.
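Weighted distribution is commonly implemented by drawing a number in `[0, totalWeight)` and walking the provider list until the draw is used up. The sketch below shows that generic technique, which is an assumption about how weights could be applied rather than a description of the package internals. `pickWeighted` and its injectable `random` parameter are hypothetical.

```typescript
// Hypothetical sketch of weighted selection; not the package's implementation.
interface WeightedProvider {
  name: string;
  weight: number;
}

// Picks a provider with probability proportional to its weight.
// `random` must return a value in [0, 1); it is injectable to allow deterministic checks.
function pickWeighted(
  providers: WeightedProvider[],
  random: () => number = Math.random,
): string | undefined {
  const totalWeight = providers.reduce((sum, p) => sum + p.weight, 0);
  if (totalWeight <= 0) return undefined;
  let threshold = random() * totalWeight;
  for (const p of providers) {
    threshold -= p.weight;
    if (threshold < 0) return p.name;
  }
  // Guard against floating-point edge cases at the upper boundary.
  return providers[providers.length - 1]?.name;
}

const weightedProviders: WeightedProvider[] = [
  { name: 'openai', weight: 2 },
  { name: 'anthropic', weight: 1 },
];

// Evenly spaced draws stand in for a random source so the split is easy to verify.
const weightedCounts: Record<string, number> = { openai: 0, anthropic: 0 };
for (let i = 0; i < 300; i += 1) {
  const picked = pickWeighted(weightedProviders, () => (i + 0.5) / 300);
  if (picked !== undefined) {
    weightedCounts[picked] = (weightedCounts[picked] ?? 0) + 1;
  }
}
console.log(weightedCounts);
```

With weights of 2 and 1, the first provider receives about two thirds of the traffic, which matches the configuration that follows: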
const router = new AIProviderRouter({
  strategy: 'load-balanced',
  providers: [
    { name: 'openai', weight: 2, ... }, // Gets ~67% of requests
    { name: 'anthropic', weight: 1, ... }, // Gets ~33% of requests
  ],
});
Quality-Based
Selects the provider with the highest success rate.
const router = new AIProviderRouter({
  strategy: 'quality-based',
  providers: [...],
});
Priority
Selects the highest priority provider.
const router = new AIProviderRouter({
  strategy: 'priority',
  providers: [
    { name: 'openai', priority: 10, ... }, // Primary
    { name: 'anthropic', priority: 5, ... }, // Fallback
  ],
});
Request Preferences
Override the routing strategy per-request:
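Conceptually, preferences narrow or reorder the candidate list before a strategy runs. The following sketch is one way that could work, under two labeled assumptions: an excluded provider is never used even if it is also the preferred one, and the preferred provider is moved to the front rather than being the only candidate. `candidateProviders` and `RoutePreferencesSketch` are hypothetical names.

```typescript
// Hypothetical sketch; precedence rules here are assumptions, not documented behavior.
interface RoutePreferencesSketch {
  preferredProvider?: string;
  excludeProviders?: string[];
}

// Returns provider names in the order they would be tried.
function candidateProviders(
  available: string[],
  prefs: RoutePreferencesSketch = {},
): string[] {
  const excluded = prefs.excludeProviders ?? [];
  const allowed = available.filter((n) => excluded.indexOf(n) === -1);
  const preferred = prefs.preferredProvider;
  if (preferred !== undefined && allowed.indexOf(preferred) !== -1) {
    // Move the preferred provider to the front; keep the rest as fallbacks.
    return [preferred].concat(allowed.filter((n) => n !== preferred));
  }
  return allowed;
}

const configuredNames = ['openai', 'anthropic'];
const preferredOrder = candidateProviders(configuredNames, { preferredProvider: 'anthropic' });
const afterExclusion = candidateProviders(configuredNames, { excludeProviders: ['openai'] });
console.log(preferredOrder, afterExclusion);
```

The preference fields supported by `route()` are shown in the calls that follow: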
// Force a specific provider
await router.route({
  prompt: '...',
  maxTokens: 100,
  preferences: { preferredProvider: 'anthropic' },
});

// Exclude providers
await router.route({
  prompt: '...',
  maxTokens: 100,
  preferences: { excludeProviders: ['openai'] },
});

// Override strategy for this request
await router.route({
  prompt: '...',
  maxTokens: 100,
  preferences: { prioritizeCost: true }, // Use least-cost strategy
});
Failover
When a provider fails, the router automatically tries alternate providers:
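In outline, failover is a loop that tries providers in order and moves on when one throws. The sketch below assumes providers are tried in descending `priority` and that `failoverAttempts` counts the failures that preceded the final result; both are assumptions for illustration, and `routeWithFailover` is a hypothetical function.

```typescript
// Hypothetical sketch; ordering and counting semantics are assumptions.
interface FailoverTarget {
  name: string;
  priority: number;
  call: (prompt: string) => Promise<string>;
}

interface FailoverOutcome {
  success: boolean;
  provider?: string;
  content?: string;
  failoverAttempts: number;
}

// Tries each target in descending priority until one resolves.
async function routeWithFailover(
  targets: FailoverTarget[],
  prompt: string,
): Promise<FailoverOutcome> {
  const ordered = targets.slice().sort((a, b) => b.priority - a.priority);
  let failoverAttempts = 0;
  for (const target of ordered) {
    try {
      const content = await target.call(prompt);
      return { success: true, provider: target.name, content, failoverAttempts };
    } catch {
      // This provider failed; count it and fall through to the next one.
      failoverAttempts += 1;
    }
  }
  return { success: false, failoverAttempts };
}

// Simulated providers: the primary always fails, the fallback echoes the prompt.
const failoverDemo = routeWithFailover(
  [
    { name: 'openai', priority: 10, call: async () => { throw new Error('simulated outage'); } },
    { name: 'anthropic', priority: 5, call: async (prompt) => `echo: ${prompt}` },
  ],
  'ping',
);
failoverDemo.then((outcome) => console.log(outcome.provider, outcome.failoverAttempts));
```

With the router, the same behavior is enabled through configuration, and the attempt count is reported in the result metadata: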
const router = new AIProviderRouter({
  enableFailover: true, // Default: true
  providers: [
    { name: 'openai', priority: 10, ... },
    { name: 'anthropic', priority: 5, ... },
  ],
});

const result = await router.route(request);
console.log(`Failover attempts: ${result.metadata.failoverAttempts}`);
console.log(`Alternatives tried: ${result.metadata.alternativesConsidered}`);
Cost Tracking
Track costs across providers:
const router = new AIProviderRouter({
  enableCostOptimization: true,
  providers: [...],
});

// After some requests
const totalCost = router.getTotalCost();
const breakdown = router.getCostBreakdown();
console.log(`Total cost: $${totalCost.toFixed(4)}`);
console.log('By provider:', breakdown);
Quota Management
Enforce usage limits:
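To show what enforcing hourly and daily token limits can look like, here is a minimal standalone tracker. It assumes a request is refused when it would push either counter past its limit and that resets are triggered explicitly by the caller; the package's actual accounting may differ. `QuotaTracker` and `QuotaLimitSketch` are hypothetical names.

```typescript
// Hypothetical sketch; refusal and reset semantics are assumptions.
interface QuotaLimitSketch {
  daily?: number;
  hourly?: number;
}

class QuotaTracker {
  private readonly limit: QuotaLimitSketch;
  private usedThisHour = 0;
  private usedToday = 0;

  constructor(limit: QuotaLimitSketch) {
    this.limit = limit;
  }

  // True if spending `tokens` would stay within both limits.
  canSpend(tokens: number): boolean {
    const hourlyOk =
      this.limit.hourly === undefined || this.usedThisHour + tokens <= this.limit.hourly;
    const dailyOk =
      this.limit.daily === undefined || this.usedToday + tokens <= this.limit.daily;
    return hourlyOk && dailyOk;
  }

  // Records usage if allowed; returns whether the spend was accepted.
  spend(tokens: number): boolean {
    if (!this.canSpend(tokens)) return false;
    this.usedThisHour += tokens;
    this.usedToday += tokens;
    return true;
  }

  resetHourly(): void {
    this.usedThisHour = 0;
  }

  resetAll(): void {
    this.usedThisHour = 0;
    this.usedToday = 0;
  }
}

const openaiQuota = new QuotaTracker({ daily: 100000, hourly: 10000 });
const firstSpend = openaiQuota.spend(8000); // fits within the hourly budget
const secondSpend = openaiQuota.spend(5000); // would exceed 10K this hour, so it is refused
openaiQuota.resetHourly();
const thirdSpend = openaiQuota.spend(5000); // accepted once the hourly counter is cleared
console.log(firstSpend, secondSpend, thirdSpend);
```

Limits are declared per provider in the router configuration: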
const router = new AIProviderRouter({
  providers: [
    {
      name: 'openai',
      enabled: true,
      quotaLimit: {
        daily: 100000, // 100K tokens per day
        hourly: 10000, // 10K tokens per hour
      },
    },
  ],
});

// Reset quotas (e.g., at start of new day)
router.resetQuota('openai');
router.resetAllQuotas();
Health Monitoring
Check provider health status:
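The three status values suggest an aggregate derived from per-provider availability. The sketch below uses an assumed rule (all providers up is `healthy`, none up is `unhealthy`, anything in between is `degraded`); the thresholds the package actually applies are not documented here, and `summarizeHealth` is a hypothetical helper.

```typescript
// Hypothetical sketch; the aggregation rule is an assumption.
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy';

interface HealthSummary {
  status: HealthStatus;
  availableProviders: number;
  totalProviders: number;
}

// Derives an overall status from a map of provider name to availability.
function summarizeHealth(availability: Record<string, boolean>): HealthSummary {
  const names = Object.keys(availability);
  const totalProviders = names.length;
  const availableProviders = names.filter((n) => availability[n] === true).length;
  let status: HealthStatus = 'degraded';
  if (availableProviders === 0) {
    status = 'unhealthy';
  } else if (availableProviders === totalProviders) {
    status = 'healthy';
  }
  return { status, availableProviders, totalProviders };
}

const partialOutage = summarizeHealth({ openai: true, anthropic: false });
console.log(`${partialOutage.status}: ${partialOutage.availableProviders}/${partialOutage.totalProviders}`);
```

The router exposes the equivalent information through `getHealth()`: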
const health = router.getHealth();
console.log(`Status: ${health.status}`); // 'healthy', 'degraded', or 'unhealthy'
console.log(`Available: ${health.availableProviders}/${health.totalProviders}`);

// Per-provider health
for (const [name, info] of Object.entries(health.providers)) {
  console.log(`${name}: ${info.available ? 'up' : 'down'}, success rate: ${info.successRate}`);
}
Metrics
Access detailed provider metrics:
const metrics = router.getProviderMetrics('openai');
console.log(`Success rate: ${metrics?.successRate}`);
console.log(`Avg latency: ${metrics?.averageLatency}ms`);
console.log(`Total requests: ${metrics?.totalRequests}`);
console.log(`Failed requests: ${metrics?.failedRequests}`);
Custom Providers
Register custom AI providers:
const customClient = {
  complete: async (request) => {
    // Your custom implementation
    return { success: true, data: { content: '...', model: '...', tokensUsed: 100 } };
  },
};

router.registerProvider('custom', customClient, {
  priority: 20,
  costPer1kTokens: 0.001,
});
API Reference
RouterConfig
| Property | Type | Default | Description |
|----------|------|---------|-------------|
| strategy | RoutingStrategy | 'round-robin' | Default routing strategy |
| enableFailover | boolean | true | Enable automatic failover |
| enableCostOptimization | boolean | false | Enable cost tracking |
| enableLoadBalancing | boolean | false | Enable load balancing |
| providers | ProviderConfig[] | - | List of provider configurations |
| defaultTimeoutMs | number | 30000 | Default request timeout in milliseconds |
| maxRetries | number | 3 | Maximum retry attempts |
ProviderConfig
| Property | Type | Description |
|----------|------|-------------|
| name | string | Provider name ('openai', 'anthropic', or custom) |
| priority | number | Priority for failover (higher = more preferred) |
| weight | number | Weight for load balancing |
| enabled | boolean | Whether provider is enabled |
| costPer1kTokens | number | Cost per 1K tokens |
| quotaLimit | QuotaLimit | Usage limits |
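For readers who prefer types to tables, the two tables above can be transcribed roughly as follows. This is a sketch derived from this README alone: which fields are optional is an assumption, the strategy names are taken from the feature list, and `withDefaults` is a hypothetical helper that applies the documented defaults. The package's published typings are the authoritative source.

```typescript
// Hypothetical transcription of the tables above; optionality is assumed.
type RoutingStrategy =
  | 'round-robin'
  | 'least-cost'
  | 'fastest-response'
  | 'load-balanced'
  | 'quality-based'
  | 'priority';

interface QuotaLimit {
  daily?: number; // tokens per day
  hourly?: number; // tokens per hour
}

interface ProviderConfig {
  name: string;
  priority?: number;
  weight?: number;
  enabled?: boolean;
  costPer1kTokens?: number;
  quotaLimit?: QuotaLimit;
}

interface RouterConfig {
  strategy?: RoutingStrategy;
  enableFailover?: boolean;
  enableCostOptimization?: boolean;
  enableLoadBalancing?: boolean;
  providers: ProviderConfig[];
  defaultTimeoutMs?: number;
  maxRetries?: number;
}

// Fills in the defaults listed in the RouterConfig table.
function withDefaults(config: RouterConfig): Required<RouterConfig> {
  return {
    strategy: config.strategy ?? 'round-robin',
    enableFailover: config.enableFailover ?? true,
    enableCostOptimization: config.enableCostOptimization ?? false,
    enableLoadBalancing: config.enableLoadBalancing ?? false,
    providers: config.providers,
    defaultTimeoutMs: config.defaultTimeoutMs ?? 30000,
    maxRetries: config.maxRetries ?? 3,
  };
}

const resolvedConfig = withDefaults({ providers: [{ name: 'openai', enabled: true }] });
console.log(resolvedConfig.strategy, resolvedConfig.defaultTimeoutMs);
```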
Integration Status
- Logger: planned
- Docs-Suite: ready (typedoc)
- NeverHub: planned
License
Copyright (c) 2025 Bernier LLC. All rights reserved.
