@ai-orchestration/core v0.4.0
Modular AI orchestration framework for multiple LLM providers
# AI Orchestration Framework
A modular and extensible framework for orchestrating multiple AI/LLM providers in a consistent, configurable way.

> 📦 This is an npm package: API keys must be configured in the project that uses this package (via environment variables or a .env file in that project), not in the package itself.
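For example, a consuming project can fail fast at startup if a key is missing. This is an illustrative snippet, not part of the package; the variable names are just examples:

```typescript
// Illustrative: validate required provider keys in YOUR project at startup.
const REQUIRED_KEYS = ['GROQ_API_KEY', 'OPENROUTER_API_KEY'];

function loadApiKeys(): Record<string, string> {
  const keys: Record<string, string> = {};
  for (const name of REQUIRED_KEYS) {
    const value = process.env[name];
    if (!value) {
      // Surface a clear error instead of failing later inside a request
      throw new Error(`Missing environment variable: ${name}`);
    }
    keys[name] = value;
  }
  return keys;
}
```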
## Features
- 🔌 Plugin-based architecture: Add new providers or strategies without modifying the core
- 🎯 Multiple selection strategies: Round-robin, priority, fallback, weighted, health-aware
- 🌊 Native streaming: Full support for streaming responses using ReadableStream
- 🔄 Automatic fallback: Automatically tries multiple providers if one fails
- 💚 Health checks: Provider health monitoring with latency metrics
- 📦 Runtime agnostic: Compatible with Node.js and Bun
- 🎨 Declarative API: Simple configuration via JSON/JS objects
- 🔒 Type-safe: Fully typed with TypeScript
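The automatic-fallback behavior can be pictured with a minimal standalone sketch. The `TryProvider` shape below is hypothetical, not the package's actual API:

```typescript
// Each provider is modeled as a function that either answers or throws.
type TryProvider = (prompt: string) => Promise<string>;

// Try providers in order; return the first successful response.
async function chatWithFallback(providers: TryProvider[], prompt: string): Promise<string> {
  let lastError: unknown;
  for (const tryChat of providers) {
    try {
      return await tryChat(prompt);
    } catch (err) {
      lastError = err; // remember the failure, move on to the next provider
    }
  }
  throw new Error(`All providers failed: ${String(lastError)}`);
}
```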
## Installation

```bash
npm install @ai-orchestration/core
```

### Module System Compatibility

This package supports both ESM (ECMAScript Modules) and CommonJS, so you can use it in any Node.js project.

ESM projects (recommended):

```js
import { createOrchestrator } from '@ai-orchestration/core';
```

CommonJS projects:

```js
const { createOrchestrator } = require('@ai-orchestration/core');
```

The package automatically exports the correct format based on your project's module system.
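Dual ESM/CommonJS publishing is usually driven by a conditional `exports` map in package.json. An illustrative shape (not necessarily this package's exact file or file names) looks like:

```json
{
  "name": "@ai-orchestration/core",
  "type": "module",
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js",
      "require": "./dist/index.cjs"
    }
  }
}
```

Node resolves the `import` entry for ESM consumers and the `require` entry for CommonJS consumers.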
## Quick Start

### Basic Usage

```ts
import { createOrchestrator } from '@ai-orchestration/core';

// API keys should come from environment variables configured in YOUR project,
// e.g. `export GROQ_API_KEY="your-key"` or dotenv in your project.
const orchestrator = createOrchestrator({
  providers: [
    {
      id: 'groq-1',
      type: 'groq',
      apiKey: process.env.GROQ_API_KEY!, // Configure this variable in your project
      model: 'llama-3.3-70b-versatile',
    },
    {
      id: 'openrouter-1',
      type: 'openrouter',
      apiKey: process.env.OPENROUTER_API_KEY!,
      model: 'openai/gpt-3.5-turbo',
    },
  ],
  strategy: {
    type: 'round-robin',
  },
});

// Simple chat
const response = await orchestrator.chat([
  { role: 'user', content: 'Hello, world!' },
]);
console.log(response.content);

// Streaming chat
const stream = await orchestrator.chatStream([
  { role: 'user', content: 'Tell me a story' },
]);
const reader = stream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(value.content);
}
```

### Programmatic Usage
```ts
import {
  Orchestrator,
  RoundRobinStrategy,
  GroqProvider,
  OpenRouterProvider,
} from '@ai-orchestration/core';

// Create a strategy
const strategy = new RoundRobinStrategy();

// Create the orchestrator
const orchestrator = new Orchestrator(strategy);

// Register providers
// API keys should come from environment variables configured in YOUR project
orchestrator.registerProvider(
  new GroqProvider({
    id: 'groq-1',
    apiKey: process.env.GROQ_API_KEY!,
  })
);
orchestrator.registerProvider(
  new OpenRouterProvider({
    id: 'openrouter-1',
    apiKey: process.env.OPENROUTER_API_KEY!,
  })
);

// Use it
const response = await orchestrator.chat([
  { role: 'user', content: 'Hello!' },
]);
```

## Selection Strategies
### Round-Robin

Cycles through providers in order:

```ts
{
  strategy: {
    type: 'round-robin',
  },
}
```

### Priority
Selects providers by priority (lower number = higher priority):

```ts
{
  strategy: {
    type: 'priority',
    priorities: {
      'groq-1': 1,
      'openrouter-1': 2,
      'gemini-1': 3,
    },
  },
}
```

### Fallback
Tries providers in order until one succeeds:

```ts
{
  strategy: {
    type: 'fallback',
    order: ['groq-1', 'openrouter-1', 'gemini-1'],
  },
}
```

### Weighted
Distributes selection across providers according to their weights (useful for load balancing):

```ts
{
  strategy: {
    type: 'weighted',
    weights: {
      'groq-1': 0.7,
      'openrouter-1': 0.3,
    },
  },
}
```

### Weighted Cost-Aware
Factors cost per token into the weighting:

```ts
{
  strategy: {
    type: 'weighted',
    costAware: true,
    weights: {
      'groq-1': 1.0,
      'openrouter-1': 1.0,
    },
  },
}
```

### Health-Aware
Selects based on health metrics (latency, success rate):

```ts
{
  strategy: {
    type: 'health-aware',
    preferLowLatency: true,
    minHealthScore: 0.5,
  },
}
```

## Supported Providers
### Groq

```ts
{
  id: 'groq-1',
  type: 'groq',
  apiKey: 'your-api-key',
  model: 'llama-3.3-70b-versatile', // optional, default
  baseURL: 'https://api.groq.com/openai/v1', // optional
}
```

### OpenRouter
```ts
{
  id: 'openrouter-1',
  type: 'openrouter',
  apiKey: 'your-api-key',
  model: 'openai/gpt-3.5-turbo', // optional
  baseURL: 'https://openrouter.ai/api/v1', // optional
}
```

### Google Gemini
```ts
{
  id: 'gemini-1',
  type: 'gemini',
  apiKey: 'your-api-key',
  model: 'gemini-pro', // optional
  baseURL: 'https://generativelanguage.googleapis.com/v1beta', // optional
}
```

### Cerebras
Cerebras Inference API (OpenAI-compatible). Documentation: inference-docs.cerebras.ai

```ts
{
  id: 'cerebras-1',
  type: 'cerebras',
  apiKey: 'your-api-key', // Get one at: https://inference-docs.cerebras.ai
  model: 'llama-3.3-70b', // optional, default
  baseURL: 'https://api.cerebras.ai/v1', // optional
}
```

Note: The Cerebras API requires a User-Agent header to avoid CloudFront blocking. The package sets this header automatically.
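If you ever call the endpoint directly (outside this package), the header has to be set explicitly. A hedged sketch of building such a request, assuming the standard OpenAI-compatible `/chat/completions` path and an arbitrary example User-Agent value:

```typescript
// Illustrative request builder for a direct Cerebras call.
// The package normally handles this for you; 'my-app/1.0' is a placeholder.
function buildCerebrasRequest(apiKey: string, model: string, content: string) {
  return {
    url: 'https://api.cerebras.ai/v1/chat/completions',
    init: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
        'User-Agent': 'my-app/1.0', // required to avoid CloudFront blocking
      },
      body: JSON.stringify({ model, messages: [{ role: 'user', content }] }),
    },
  };
}
```

The returned object can be passed to `fetch(req.url, req.init)`.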
### Local Models

For local models that expose an OpenAI-compatible API:

```ts
{
  id: 'local-1',
  type: 'local',
  baseURL: 'http://localhost:8000',
  model: 'local-model', // optional
  apiKey: 'optional-key', // optional
}
```

## Advanced Configuration
### Retry and Timeout Configuration

```ts
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  maxRetries: 3, // Maximum retry attempts (default: number of providers)
  requestTimeout: 30000, // Global timeout in milliseconds (default: 30000)
  retryDelay: 'exponential', // or a number in milliseconds (default: 1000)
});
```

### Circuit Breaker
Automatically disables a provider after consecutive failures:

```ts
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  circuitBreaker: {
    enabled: true,
    failureThreshold: 5, // Open the circuit after 5 consecutive failures
    resetTimeout: 60000, // Try the provider again after 60 seconds
  },
});
```

### Health Checks
Enhanced health check configuration:

```ts
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  healthCheck: {
    enabled: true,
    interval: 60000, // Check every 60 seconds
    timeout: 5000, // Health check timeout (default: 5000ms)
    maxConsecutiveFailures: 3, // Mark unhealthy after 3 failures (default: 3)
    latencyThreshold: 10000, // Max latency in ms (default: 10000ms)
  },
  // Legacy format still supported:
  // enableHealthChecks: true,
  // healthCheckInterval: 60000,
});
```

Or check health manually:

```ts
const health = await provider.checkHealth();
console.log(health.healthy, health.latency);
```

## Chat Options
```ts
const response = await orchestrator.chat(messages, {
  temperature: 0.7,
  maxTokens: 1000,
  topP: 0.9,
  topK: 40,
  stopSequences: ['\n\n'],
  responseLanguage: 'es', // Force response in Spanish
  frequencyPenalty: 0.5, // Reduce repetition
  presencePenalty: 0.3, // Encourage new topics
  seed: 42, // For reproducible outputs
  timeout: 30000, // Request timeout in milliseconds
  user: 'user-123', // User identifier for tracking
});
```

### Available Chat Options

- `temperature`: Controls randomness (0.0 to 2.0)
- `maxTokens`: Maximum tokens in the response
- `topP`: Nucleus sampling threshold
- `topK`: Top-K sampling
- `stopSequences`: Stop generation on these sequences
- `responseLanguage`: Force the response language (see below)
- `frequencyPenalty`: Penalize frequent tokens (-2.0 to 2.0)
- `presencePenalty`: Penalize tokens already present (-2.0 to 2.0)
- `seed`: Seed for reproducible outputs
- `timeout`: Request timeout in milliseconds (overrides the global timeout)
- `user`: User identifier for tracking/rate limiting
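Most of these options correspond directly to the snake_case parameters of OpenAI-compatible APIs. A hypothetical mapping helper (not the package's actual code) makes the correspondence concrete:

```typescript
// Subset of the chat options above, for illustration.
interface ChatOptions {
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  stopSequences?: string[];
  frequencyPenalty?: number;
  presencePenalty?: number;
  seed?: number;
  user?: string;
}

// Map camelCase framework options to snake_case request parameters.
function toOpenAIParams(options: ChatOptions): Record<string, unknown> {
  return {
    temperature: options.temperature,
    max_tokens: options.maxTokens,
    top_p: options.topP,
    stop: options.stopSequences,
    frequency_penalty: options.frequencyPenalty,
    presence_penalty: options.presencePenalty,
    seed: options.seed,
    user: options.user,
  };
}
```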
### Forcing Response Language

You can force the AI to respond in a specific language using the `responseLanguage` option:

```ts
// Using ISO 639-1 language codes
const response = await orchestrator.chat(messages, {
  responseLanguage: 'es', // Spanish
  // or 'en', 'fr', 'de', 'it', 'pt', 'ja', 'zh', 'ru', etc.
});

// Using full language names
const response2 = await orchestrator.chat(messages, {
  responseLanguage: 'spanish', // Also works
  // or 'english', 'french', 'german', 'italian', etc.
});
```

How it works: When `responseLanguage` is specified, the framework automatically prepends a system message instructing the model to respond in that language. If you already have a system message, the language instruction is prepended to it.

Supported languages: Spanish, English, French, German, Italian, Portuguese, Japanese, Chinese, Russian, Korean, Arabic, Hindi, Dutch, Polish, Swedish, Turkish (and more via ISO 639-1 codes).
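The prepending behavior described above can be sketched with a standalone helper. This is an illustration of the mechanism, not the package's internal implementation, and the instruction wording is an assumption:

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Prepend a language instruction: merge into an existing system message
// if there is one, otherwise add a new system message up front.
function applyResponseLanguage(messages: ChatMessage[], language: string): ChatMessage[] {
  const instruction = `Respond only in ${language}.`;
  const [first, ...rest] = messages;
  if (first && first.role === 'system') {
    return [{ role: 'system', content: `${instruction} ${first.content}` }, ...rest];
  }
  return [{ role: 'system', content: instruction }, ...messages];
}
```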
## Metrics and Analytics

Track provider usage, costs, and strategy effectiveness:

```ts
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  enableMetrics: true, // Enabled by default
  onMetricsEvent: (event) => {
    // Optional: real-time event tracking
    console.log('Event:', event.type, event.providerId);
  },
});

// Make some requests...

// Get overall metrics
const metrics = orchestrator.getMetrics().getOrchestratorMetrics();
console.log('Total Requests:', metrics.totalRequests);
console.log('Total Cost:', metrics.totalCost);
console.log('Error Rate:', metrics.errorRate);

// Get provider-specific metrics
const providerMetrics = orchestrator.getMetrics().getProviderMetrics('groq-1');
console.log('Provider Requests:', providerMetrics?.totalRequests);
console.log('Provider Cost:', providerMetrics?.totalCost);
console.log('Success Rate:', providerMetrics?.successfulRequests / providerMetrics?.totalRequests);

// Get strategy metrics
const strategyMetrics = orchestrator.getMetrics().getStrategyMetrics();
console.log('Selections by Provider:', strategyMetrics.selectionsByProvider);
console.log('Average Selection Time:', strategyMetrics.averageSelectionTime);
```

### Available Metrics
- Provider Metrics: Requests, success/failure rates, latency, token usage, costs
- Strategy Metrics: Selection counts, distribution, selection time
- Overall Metrics: Total requests, costs, error rates, requests per minute
- Request History: Detailed history with filtering options
See examples/metrics.ts for a complete example.
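To make the aggregate figures concrete, here is a small standalone sketch of how totals and error rate can be derived from a request history. The `RequestRecord` shape is hypothetical, not the package's internal type:

```typescript
// Hypothetical per-request record, for illustration only.
interface RequestRecord {
  providerId: string;
  success: boolean;
  cost: number;
  latencyMs: number;
}

// Derive aggregate metrics from the history.
function summarize(history: RequestRecord[]) {
  const total = history.length;
  const failures = history.filter((r) => !r.success).length;
  return {
    totalRequests: total,
    totalCost: history.reduce((sum, r) => sum + r.cost, 0),
    errorRate: total === 0 ? 0 : failures / total, // guard against division by zero
  };
}
```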
## Extensibility

### Adding a New Provider

```ts
import { BaseProvider } from '@ai-orchestration/core';
import type {
  ChatMessage,
  ChatOptions,
  ChatResponse,
  ChatChunk,
  ProviderHealth,
  ProviderMetadata,
} from '@ai-orchestration/core';

export class CustomProvider extends BaseProvider {
  readonly id: string;
  readonly metadata: ProviderMetadata;

  constructor(config: CustomConfig) {
    super();
    this.id = config.id;
    this.metadata = {
      id: this.id,
      name: 'Custom Provider',
    };
  }

  async checkHealth(): Promise<ProviderHealth> {
    // Implement the health check
  }

  async chat(messages: ChatMessage[], options?: ChatOptions): Promise<ChatResponse> {
    // Implement chat
  }

  async chatStream(messages: ChatMessage[], options?: ChatOptions): Promise<ReadableStream<ChatChunk>> {
    // Implement streaming
  }

  protected formatMessages(messages: ChatMessage[]): unknown {
    // Convert the standard format to the provider format
  }

  protected parseResponse(response: unknown): ChatResponse {
    // Convert the provider response to the standard format
  }

  protected parseStream(stream: ReadableStream<unknown>): ReadableStream<ChatChunk> {
    // Convert the provider stream to the standard format
  }
}
```

### Adding a New Strategy
```ts
import { BaseStrategy } from '@ai-orchestration/core';
import type { AIService, SelectionContext } from '@ai-orchestration/core';

export class CustomStrategy extends BaseStrategy {
  async select(
    providers: AIService[],
    context?: SelectionContext
  ): Promise<AIService | null> {
    // Implement the selection logic
    return providers[0];
  }

  update?(provider: AIService, success: boolean, metadata?: unknown): void {
    // Optional: update internal state
  }
}
```

## Architecture
```
src/
├── core/
│   ├── interfaces.ts      # Main interfaces
│   ├── types.ts           # Shared types
│   ├── orchestrator.ts    # Orchestrator core
│   └── errors.ts          # Custom error classes
├── providers/
│   ├── base.ts            # Base class for providers
│   ├── groq.ts
│   ├── openrouter.ts
│   ├── gemini.ts
│   ├── cerebras.ts
│   └── local.ts
├── strategies/
│   ├── base.ts            # Base class for strategies
│   ├── round-robin.ts
│   ├── priority.ts
│   ├── fallback.ts
│   ├── weighted.ts
│   └── health-aware.ts
├── factory/
│   └── index.ts           # Factory for declarative creation
└── index.ts               # Main entry point
```

### Design Principles
- Single Responsibility: Each class has a single responsibility
- Open/Closed Principle: Extensible without modifying the core
- Plugin-based Architecture: Providers and strategies are plugins
- Composition over Inheritance: Preference for composition
- Configuration over Hard-coding: Declarative configuration
- Declarative APIs: Simple and expressive APIs
## Development

### Setup

```bash
# Install dependencies
npm install

# Build
npm run build

# Development with watch
npm run dev

# Type checking
npm run typecheck

# Tests
npm test
```

### Testing
#### Quick Test (No API Keys Required)

Test the framework with mock providers, no API keys needed:

```bash
npm run test:mock
```

#### Test with Real Providers

Note: The @ai-orchestration/core package does not include .env files. Environment variables must be configured in your project or in the examples.

1. Set environment variables:

```bash
export GROQ_API_KEY="your-key"
export OPENROUTER_API_KEY="your-key"
export GEMINI_API_KEY="your-key"
export CEREBRAS_API_KEY="your-key"
```

2. Run the tests:

```bash
npm run test:local
```

## Local Development in Other Projects
### Method 1: npm link (Recommended)

```bash
# In this directory (ai-orchestration)
npm run link

# In your other project
npm link @ai-orchestration/core
```

Now you can import normally:

```ts
import { createOrchestrator } from '@ai-orchestration/core';
```

### Method 2: npm pack

```bash
# In this directory
npm run pack:local

# In your other project
npm install ./@ai-orchestration-core-0.1.0.tgz
```

## Requirements
- Node.js: >= 18.0.0 (for native ReadableStream and test runner)
- TypeScript: 5.3+ (already included in devDependencies)
## Examples

See the examples/ directory for more code examples:

- `basic.ts`: Basic usage example
- `strategies.ts`: Strategy examples
- `test-local.ts`: Testing with real providers
- `test-mock.ts`: Testing with mock providers
- `chat-app/`: Full chat application example
## License

MIT

## Contributing

See CONTRIBUTING.md for guidelines on contributing to this project.

## Related Documentation

- ARCHITECTURE.md: Detailed architecture documentation
- CHANGELOG.md: Version history and changes
