LLM Orchestra
Unified Observability & Orchestration SDK for Multi-Model AI Applications
The Problem
Building production LLM applications is painful:
- Multi-model chaos - Switching between Claude, GPT-4, and Gemini requires different SDKs, error handling, and retry logic
- Blind spots - No unified view of costs, latency, token usage across providers
- Debugging nightmares - Tracing a request through chains, agents, and tool calls is nearly impossible
- Cost explosions - No visibility into which prompts/models are eating your budget
The Solution
LLM Orchestra provides a unified layer for orchestrating and observing multi-model AI applications.
import { Orchestra } from 'llm-orchestra';
const orchestra = new Orchestra({
providers: ['anthropic', 'openai', 'google'],
observability: {
tracing: true,
metrics: true,
costTracking: true
}
});
// Unified interface - same code, any model
const response = await orchestra.complete({
model: 'claude-3-opus', // or 'gpt-4', 'gemini-pro'
messages: [{ role: 'user', content: 'Hello!' }],
fallback: ['gpt-4-turbo', 'gemini-pro'], // Automatic failover
tags: ['production', 'chat-feature'] // For cost allocation
});
// Full observability out of the box
console.log(response.meta);
// {
// latency: 1234,
// tokens: { input: 10, output: 50 },
// cost: 0.0023,
// traceId: 'abc-123',
// model: 'claude-3-opus',
// provider: 'anthropic'
// }
Key Features
Unified Multi-Model Interface
- Single SDK for Claude, GPT-4, Gemini, Mistral, Cohere, Azure OpenAI, Llama, and more
- Automatic failover with configurable fallback chains
- Load balancing across providers and API keys
- Semantic caching to reduce costs and latency (failover and caching config sketched below)
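A rough configuration sketch for the failover and caching features above. The `providers` map and per-request `fallback` appear elsewhere in this README; the `cache` option name is an assumption, so check the package docs for the exact shape:
import { Orchestra } from 'llm-orchestra';
// `cache` is an assumed option name; `providers` and `fallback` are
// taken from the documented examples in this README.
const orchestra = new Orchestra({
  providers: {
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
    openai: { apiKey: process.env.OPENAI_API_KEY },
  },
  cache: { semantic: true, ttlSeconds: 3600 }, // assumed option
});
// Failover is per-request: if claude-3-opus fails, the request is
// retried against the fallback chain in order.
const response = await orchestra.complete({
  model: 'claude-3-opus',
  messages: [{ role: 'user', content: 'Hello!' }],
  fallback: ['gpt-4-turbo', 'gemini-pro'],
});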
Production-Grade Observability
- Distributed tracing - Follow requests through chains, agents, and tools
- Real-time metrics - Latency, throughput, error rates per model/prompt
- Cost tracking - Per-request, per-feature, per-team cost allocation (see the sketch after this list)
- Prompt versioning - Track which prompts are deployed where
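Because every response carries `meta.cost` and requests accept `tags` (both shown in the first example), simple per-feature cost allocation can be built on the documented surface alone. A minimal sketch, assuming `orchestra` is initialized as in the Quick Start:
// Accumulate spend per tag using only documented fields:
// `tags` on the request and `meta.cost` on the response.
const costByTag = new Map<string, number>();

async function trackedComplete(tag: string, prompt: string) {
  const res = await orchestra.complete({
    model: 'claude-3-haiku',
    messages: [{ role: 'user', content: prompt }],
    tags: [tag],
  });
  costByTag.set(tag, (costByTag.get(tag) ?? 0) + res.meta.cost);
  return res;
}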
Agent Orchestration
- Multi-agent coordination - Built-in patterns for agent collaboration (a tracing-based sketch follows this list)
- Tool call tracing - See exactly what tools agents used and why
- Conversation memory - Pluggable memory backends with observability
- Workflow engine - Define complex agent workflows with monitoring
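The workflow engine's API isn't shown in this README, but a basic coordination pattern can already be written with the documented `trace` helper. A sketch of an illustrative planner/executor split, assuming `task` holds the user's request:
import { trace } from 'llm-orchestra';
// Illustrative pattern: a fast "planner" model drafts steps and a
// stronger "executor" model carries them out, under a single trace.
const answer = await trace('plan-and-execute', async (span) => {
  const plan = await orchestra.complete({
    model: 'claude-3-haiku',
    messages: [{ role: 'user', content: `Plan the steps for: ${task}` }],
    tags: ['planner'],
  });
  span.addEvent('plan-ready', { plan: plan.content });
  return orchestra.complete({
    model: 'claude-3-opus',
    messages: [{ role: 'user', content: `Execute this plan:\n${plan.content}` }],
    tags: ['executor'],
  });
});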
Developer Experience
- TypeScript & Python SDKs - First-class support for both
- OpenTelemetry native - Export to any OTEL-compatible backend (sketch below)
- Self-hosted or cloud - Run the dashboard locally or use our cloud
- Framework integrations - LangChain (guide), LlamaIndex, Vercel AI SDK (guide)
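The exact OTEL wiring isn't documented in this README, so treat the following as a sketch: `otlpEndpoint` is an assumed option name, and since the SDK is OpenTelemetry-native, the standard `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable may also be honored - check the docs for the confirmed API.
// Sketch only - `otlpEndpoint` is an assumed option name, not
// confirmed API. Point it at any OTLP-compatible collector.
const orchestra = new Orchestra({
  providers: { anthropic: { apiKey: process.env.ANTHROPIC_API_KEY } },
  observability: {
    tracing: true,
    otlpEndpoint: 'http://localhost:4318/v1/traces', // assumed
  },
});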
Quick Start
Installation
# TypeScript/Node.js
npm install llm-orchestra
# Python
pip install llm-orchestra
Basic Usage
import { Orchestra } from 'llm-orchestra';
// Initialize with your API keys
const orchestra = new Orchestra({
providers: {
anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
openai: { apiKey: process.env.OPENAI_API_KEY },
}
});
// Make requests with full observability
const result = await orchestra.complete({
model: 'claude-3-sonnet',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Explain quantum computing in simple terms.' }
]
});
With Tracing
import { Orchestra, trace } from 'llm-orchestra';
// Automatic tracing for complex flows
const result = await trace('user-question-flow', async (span) => {
// Step 1: Classify intent
const intent = await orchestra.complete({
model: 'claude-3-haiku',
messages: [{ role: 'user', content: userQuestion }],
tags: ['intent-classification']
});
span.addEvent('intent-classified', { intent: intent.content });
// Step 2: Route to appropriate model
const response = await orchestra.complete({
model: intent.content === 'complex' ? 'claude-3-opus' : 'claude-3-sonnet',
messages: [...],
tags: ['response-generation']
});
return response;
});
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Your Application │
└─────────────────────────────┬───────────────────────────────────┘
│
┌─────────────────────────────▼───────────────────────────────────┐
│ LLM Orchestra SDK │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Routing │ │ Caching │ │ Tracing │ │
│ │ Engine │ │ Layer │ │ Context │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
│ ┌──────▼────────────────▼────────────────▼──────┐ │
│ │ Provider Adapters │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │Anthropic│ │ OpenAI │ │ Google │ • • • │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────┬───────────────────────────────────┘
│
┌─────────────────────────────▼───────────────────────────────────┐
│ Observability Backend │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Traces │ │ Metrics │ │ Costs │ │
│ │ Store │ │ Store │ │ Tracker │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Roadmap
Phase 1: Core SDK (Q1 2026) ✅
- [x] Unified provider interface (Claude, GPT-4, Gemini)
- [x] Basic tracing and cost tracking
- [x] TypeScript SDK
- [x] Local dashboard
Phase 2: Production Features (Q2 2026) ✅
- [x] Python SDK
- [x] Semantic caching
- [x] Automatic failover and retries
- [x] OpenTelemetry export
Phase 3: Agent Orchestration (Q3 2026) ✅
- [x] Multi-agent coordination primitives
- [x] Tool call tracing
- [x] Workflow engine
- [x] Memory backends
Phase 4: Enterprise (Q4 2026) ✅
- [x] Cloud dashboard (v0.3.0)
- [x] Team management (v0.3.0)
- [x] RBAC and audit logs (v0.3.0)
Phase 5: Security Hardening (2026+) ✅
- [x] Security scanning (CodeQL, dependency scanning, secret detection)
- [x] Encryption at rest (self-hosted PostgreSQL)
- [x] Azure AD SSO/OIDC integration
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Development Setup
# Clone the repo
git clone https://github.com/MegaPhoenix92/llm-orchestra.git
cd llm-orchestra
# Install dependencies
npm install
# Run tests
npm test
# Start local dashboard
npm run dashboard
Cloud Dashboard Development (with Docker)
The cloud dashboard requires PostgreSQL. We provide a Docker setup for local development:
# Clone the repo
git clone https://github.com/MegaPhoenix92/llm-orchestra.git
cd llm-orchestra
# Start PostgreSQL in Docker
docker-compose up -d
# Copy environment template
cp packages/dashboard/.env.example packages/dashboard/.env
# Install dependencies
npm install
# Push database schema
npm run db:push -w llm-orchestra-dashboard
# Run all tests
npm test
# Start the cloud dashboard
npm run dev -w llm-orchestra-dashboard
Docker Services
| Service | Port | Description |
|---------|------|-------------|
| PostgreSQL | 5436 | Database for the dashboard (mapped from the container's 5432) |
Environment Variables
Copy .env.example to .env and configure:
| Variable | Description | Default |
|----------|-------------|---------|
| DATABASE_URL | PostgreSQL connection string | postgresql://orchestra:orchestra_dev@localhost:5436/llm_orchestra |
| JWT_SECRET | Secret for JWT tokens | (required) |
| ENCRYPTION_KEY | Key for encrypting secrets at rest | (optional) |
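Putting the table together, a minimal packages/dashboard/.env for local development might look like this (the JWT secret is a placeholder - generate your own random value):
DATABASE_URL=postgresql://orchestra:orchestra_dev@localhost:5436/llm_orchestra
JWT_SECRET=replace-with-a-long-random-string
# Optional - set to enable encryption of secrets at rest
# ENCRYPTION_KEY=replace-with-your-encryption-key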
Useful Commands
# Start database
docker-compose up -d
# Stop database
docker-compose down
# Reset database (delete all data)
docker-compose down -v && docker-compose up -d
# View database logs
docker-compose logs -f postgres
# Connect to database
docker exec -it llm-orchestra-db psql -U orchestra -d llm_orchestra
Why LLM Orchestra?
vs. LangSmith
- Open source - Self-host, no vendor lock-in
- Multi-model native - Not just OpenAI-focused
- Simpler integration - Works without LangChain
vs. Helicone
- SDK-first - Not just a proxy, full orchestration
- Agent support - Built for multi-agent systems
- Local-first - Run everything locally for development
vs. Building In-House
- Battle-tested - Patterns from production systems
- Time savings - Months of work, ready to use
- Community - Shared learnings and improvements
About
Built by TROZLAN. We're building AI-powered enterprise solutions, including multi-agent orchestration and MCP infrastructure.
LLM Orchestra was born out of our experience building production AI systems that coordinate multiple models and agents.
License
MIT License - See LICENSE for details.
Star this repo if you're interested in better LLM observability!
