LLM Orchestra
Unified Observability & Orchestration SDK for Multi-Model AI Applications
The Problem
Building production LLM applications is painful:
- Multi-model chaos - Switching between Claude, GPT-4, and Gemini requires different SDKs, error handling, and retry logic
- Blind spots - No unified view of costs, latency, token usage across providers
- Debugging nightmares - Tracing a request through chains, agents, and tool calls is nearly impossible
- Cost explosions - No visibility into which prompts/models are eating your budget
The Solution
LLM Orchestra provides a unified layer for orchestrating and observing multi-model AI applications.
import { Orchestra } from 'llm-orchestra';
const orchestra = new Orchestra({
providers: ['anthropic', 'openai', 'google'],
observability: {
tracing: true,
metrics: true,
costTracking: true
}
});
// Unified interface - same code, any model
const response = await orchestra.complete({
model: 'claude-3-opus', // or 'gpt-4', 'gemini-pro'
messages: [{ role: 'user', content: 'Hello!' }],
fallback: ['gpt-4-turbo', 'gemini-pro'], // Automatic failover
tags: ['production', 'chat-feature'] // For cost allocation
});
// Full observability out of the box
console.log(response.meta);
// {
// latency: 1234,
// tokens: { input: 10, output: 50 },
// cost: 0.0023,
// traceId: 'abc-123',
// model: 'claude-3-opus',
// provider: 'anthropic'
// }
Key Features
Unified Multi-Model Interface
- Single SDK for Claude, GPT-4, Gemini, Mistral, Cohere, Azure OpenAI, Llama, and more
- Automatic failover with configurable fallback chains
- Load balancing across providers and API keys
- Semantic caching to reduce costs and latency (failover and caching config sketched below)
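A rough configuration sketch for the failover and caching features above. The `providers` map and per-request `fallback` appear elsewhere in this README; the `cache` option name is an assumption, so check the package docs for the exact shape:
import { Orchestra } from 'llm-orchestra';
// `cache` is an assumed option name; `providers` and `fallback` are
// taken from the documented examples in this README.
const orchestra = new Orchestra({
  providers: {
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
    openai: { apiKey: process.env.OPENAI_API_KEY },
  },
  cache: { semantic: true, ttlSeconds: 3600 }, // assumed option
});
// Failover is per-request: if claude-3-opus fails, the request is
// retried against the fallback chain in order.
const response = await orchestra.complete({
  model: 'claude-3-opus',
  messages: [{ role: 'user', content: 'Hello!' }],
  fallback: ['gpt-4-turbo', 'gemini-pro'],
});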
Production-Grade Observability
- Distributed tracing - Follow requests through chains, agents, and tools
- Real-time metrics - Latency, throughput, error rates per model/prompt
- Cost tracking - Per-request, per-feature, per-team cost allocation (see the sketch after this list)
- Prompt versioning - Track which prompts are deployed where
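Because every response carries `meta.cost` and requests accept `tags` (both shown in the first example), simple per-feature cost allocation can be built on the documented surface alone. A minimal sketch, assuming `orchestra` is initialized as in the Quick Start:
// Accumulate spend per tag using only documented fields:
// `tags` on the request and `meta.cost` on the response.
const costByTag = new Map<string, number>();

async function trackedComplete(tag: string, prompt: string) {
  const res = await orchestra.complete({
    model: 'claude-3-haiku',
    messages: [{ role: 'user', content: prompt }],
    tags: [tag],
  });
  costByTag.set(tag, (costByTag.get(tag) ?? 0) + res.meta.cost);
  return res;
}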
Agent Orchestration
- Multi-agent coordination - Built-in patterns for agent collaboration (a tracing-based sketch follows this list)
- Tool call tracing - See exactly what tools agents used and why
- Conversation memory - Pluggable memory backends with observability
- Workflow engine - Define complex agent workflows with monitoring
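The workflow engine's API isn't shown in this README, but a basic coordination pattern can already be written with the documented `trace` helper. A sketch of an illustrative planner/executor split, assuming `task` holds the user's request:
import { trace } from 'llm-orchestra';
// Illustrative pattern: a fast "planner" model drafts steps and a
// stronger "executor" model carries them out, under a single trace.
const answer = await trace('plan-and-execute', async (span) => {
  const plan = await orchestra.complete({
    model: 'claude-3-haiku',
    messages: [{ role: 'user', content: `Plan the steps for: ${task}` }],
    tags: ['planner'],
  });
  span.addEvent('plan-ready', { plan: plan.content });
  return orchestra.complete({
    model: 'claude-3-opus',
    messages: [{ role: 'user', content: `Execute this plan:\n${plan.content}` }],
    tags: ['executor'],
  });
});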
Developer Experience
- TypeScript & Python SDKs - First-class support for both
- OpenTelemetry native - Export to any OTEL-compatible backend (sketch below)
- Self-hosted or cloud - Run the dashboard locally or use our cloud
- Framework integrations - LangChain (guide), LlamaIndex, Vercel AI SDK (guide)
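The exact OTEL wiring isn't documented in this README, so treat the following as a sketch: `otlpEndpoint` is an assumed option name, and since the SDK is OpenTelemetry-native, the standard `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable may also be honored - check the docs for the confirmed API.
// Sketch only - `otlpEndpoint` is an assumed option name, not
// confirmed API. Point it at any OTLP-compatible collector.
const orchestra = new Orchestra({
  providers: { anthropic: { apiKey: process.env.ANTHROPIC_API_KEY } },
  observability: {
    tracing: true,
    otlpEndpoint: 'http://localhost:4318/v1/traces', // assumed
  },
});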
Quick Start
Installation
# TypeScript/Node.js
npm install llm-orchestra
# Python
pip install llm-orchestra
Basic Usage
import { Orchestra } from 'llm-orchestra';
// Initialize with your API keys
const orchestra = new Orchestra({
providers: {
anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
openai: { apiKey: process.env.OPENAI_API_KEY },
}
});
// Make requests with full observability
const result = await orchestra.complete({
model: 'claude-3-sonnet',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Explain quantum computing in simple terms.' }
]
});
With Tracing
import { Orchestra, trace } from 'llm-orchestra';
// Automatic tracing for complex flows
const result = await trace('user-question-flow', async (span) => {
// Step 1: Classify intent
const intent = await orchestra.complete({
model: 'claude-3-haiku',
messages: [{ role: 'user', content: userQuestion }],
tags: ['intent-classification']
});
span.addEvent('intent-classified', { intent: intent.content });
// Step 2: Route to appropriate model
const response = await orchestra.complete({
model: intent.content === 'complex' ? 'claude-3-opus' : 'claude-3-sonnet',
messages: [...],
tags: ['response-generation']
});
return response;
});
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Your Application │
└─────────────────────────────┬───────────────────────────────────┘
│
┌─────────────────────────────▼───────────────────────────────────┐
│ LLM Orchestra SDK │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Routing │ │ Caching │ │ Tracing │ │
│ │ Engine │ │ Layer │ │ Context │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
│ ┌──────▼────────────────▼────────────────▼──────┐ │
│ │ Provider Adapters │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │Anthropic│ │ OpenAI │ │ Google │ • • • │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────┬───────────────────────────────────┘
│
┌─────────────────────────────▼───────────────────────────────────┐
│ Observability Backend │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Traces │ │ Metrics │ │ Costs │ │
│ │ Store │ │ Store │ │ Tracker │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Roadmap
Phase 1: Core SDK (Q1 2026) ✅
- [x] Unified provider interface (Claude, GPT-4, Gemini)
- [x] Basic tracing and cost tracking
- [x] TypeScript SDK
- [x] Local dashboard
Phase 2: Production Features (Q2 2026) ✅
- [x] Python SDK
- [x] Semantic caching
- [x] Automatic failover and retries
- [x] OpenTelemetry export
Phase 3: Agent Orchestration (Q3 2026) ✅
- [x] Multi-agent coordination primitives
- [x] Tool call tracing
- [x] Workflow engine
- [x] Memory backends
Phase 4: Enterprise (Q4 2026) ✅
- [x] Cloud dashboard (v0.3.0)
- [x] Team management (v0.3.0)
- [x] RBAC and audit logs (v0.3.0)
Phase 5: Security Hardening (2026+) ✅
- [x] Security scanning (CodeQL, dependency scanning, secret detection)
- [x] Encryption at rest (self-hosted PostgreSQL)
- [x] Azure AD SSO/OIDC integration
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Development Setup
# Clone the repo
git clone https://github.com/MegaPhoenix92/llm-orchestra.git
cd llm-orchestra
# Install dependencies
npm install
# Run tests
npm test
# Start local dashboard
npm run dashboard
Cloud Dashboard Development (with Docker)
The cloud dashboard requires PostgreSQL. We provide a Docker setup for local development:
# Clone the repo
git clone https://github.com/MegaPhoenix92/llm-orchestra.git
cd llm-orchestra
# Start PostgreSQL in Docker
docker-compose up -d
# Copy environment template
cp packages/dashboard/.env.example packages/dashboard/.env
# Install dependencies
npm install
# Push database schema
npm run db:push -w llm-orchestra-dashboard
# Run all tests
npm test
# Start the cloud dashboard
npm run dev -w llm-orchestra-dashboard
Docker Services
| Service | Port | Description |
|---------|------|-------------|
| PostgreSQL | 5436 | Database for the dashboard (mapped from the container's 5432) |
Environment Variables
Copy .env.example to .env and configure:
| Variable | Description | Default |
|----------|-------------|---------|
| DATABASE_URL | PostgreSQL connection string | postgresql://orchestra:orchestra_dev@localhost:5436/llm_orchestra |
| JWT_SECRET | Secret for JWT tokens | (required) |
| ENCRYPTION_KEY | Key for encrypting secrets at rest | (optional) |
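Putting the table together, a minimal packages/dashboard/.env for local development might look like this (the JWT secret is a placeholder - generate your own random value):
DATABASE_URL=postgresql://orchestra:orchestra_dev@localhost:5436/llm_orchestra
JWT_SECRET=replace-with-a-long-random-string
# Optional - set to enable encryption of secrets at rest
# ENCRYPTION_KEY=replace-with-your-encryption-key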
Useful Commands
# Start database
docker-compose up -d
# Stop database
docker-compose down
# Reset database (delete all data)
docker-compose down -v && docker-compose up -d
# View database logs
docker-compose logs -f postgres
# Connect to database
docker exec -it llm-orchestra-db psql -U orchestra -d llm_orchestra
Why LLM Orchestra?
vs. LangSmith
- Open source - Self-host, no vendor lock-in
- Multi-model native - Not just OpenAI-focused
- Simpler integration - Works without LangChain
vs. Helicone
- SDK-first - Not just a proxy, full orchestration
- Agent support - Built for multi-agent systems
- Local-first - Run everything locally for development
vs. Building In-House
- Battle-tested - Patterns from production systems
- Time savings - Months of work, ready to use
- Community - Shared learnings and improvements
About
Built by TROZLAN. We're building AI-powered enterprise solutions, including multi-agent orchestration and MCP infrastructure.
LLM Orchestra was born out of our experience building production AI systems that coordinate multiple models and agents.
License
MIT License - See LICENSE for details.
Star this repo if you're interested in better LLM observability!
