@atom8ai/unify-llm
v1.0.3
Published
A TypeScript SDK for multi-provider LLM orchestration with tool calling, prompt caching, streaming, routing, and cost tracking across OpenAI, Anthropic, Gemini, and Ollama.
Maintainers
Keywords
Readme
Unify LLM: TypeScript SDK for Multi-Provider LLM Orchestration
Unify LLM is a TypeScript SDK for building AI applications across OpenAI, Anthropic, Gemini, and Ollama with one consistent API. Use it when you need provider-agnostic text generation, tool calling, structured outputs, prompt caching, streaming, middleware, cost tracking, routing, and hallucination interception without rewriting your app for every model vendor.
Unify LLM is designed for developers who want a TypeScript-first LLM SDK with direct control over providers, middleware, and runtime behavior. It normalizes provider differences such as OpenAI tool_calls, Anthropic tool_use, and Gemini functionCall, while still leaving room for provider-specific options when you need them.
Table of contents
- Why use Unify LLM?
- Core features
- Installation
- Quickstart
- Tool calling example
- Middleware, routing, and safety
- Supported providers
- Benchmarks and quality signals
- Use cases
- Unify LLM vs other TypeScript AI SDKs
- Plain-English naming guide
- Examples and repository guide
- FAQ
- Contributing
- License
Why use Unify LLM?
Most multi-provider AI projects hit the same friction points:
- different request and response shapes for each provider
- inconsistent tool calling formats
- one-off streaming adapters scattered across the codebase
- duplicate cost tracking and retry logic
- growing need for routing, safety, and local-model support
Unify LLM gives you a single client and middleware pipeline so your application logic stays stable while you switch models, add failover, or experiment with routing.
Best fit for
- TypeScript AI apps that need OpenAI, Anthropic, Gemini, and Ollama behind one SDK
- AI agents that rely on tool calling and structured outputs
- LLM gateways that need retries, rate limiting, routing, or cost controls
- safety-aware systems that want response anomaly detection or stream interception
- teams comparing providers without rewriting business logic for each API
Core features
- Unified multi-provider API for OpenAI, Anthropic, Gemini, Ollama, and related integrations
- Universal tool calling with a single schema shape across supported providers
- Structured outputs using JSON schema-style contracts
- Streaming support for incremental generation and stream middleware
- Prompt caching support where providers expose native caching controls
- Middleware pipeline for retry, caching, rate limiting, cost tracking, and safety
- Routing primitives for cost/latency/quality-aware or drift-aware model selection
- Hallucination interception for response anomaly detection and early stream aborts
- TypeScript-first developer experience with exported types, examples, and benchmark utilities
Installation
npm install @atom8ai/unify-llmRequires Node.js 20+.
If you want to run examples locally, configure the provider API keys you actually use. For local-only workflows with Ollama, point your runtime at http://localhost:11434.
Quickstart
This is the fastest way to send one prompt through a unified TypeScript interface.
import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';
const client = new UnifyClient()
.registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));
const response = await client.generate('openai', {
model: 'gpt-4o',
messages: [
{ role: 'user', content: 'Explain prompt caching in one sentence.' },
],
});
console.log(response.content);If you later switch from OpenAI to Anthropic or Gemini, your app can keep the same high-level flow while only changing provider registration and model selection.
Tool calling example
Unify LLM is especially useful when you want one tool definition that works across multiple LLM providers.
import { UnifyClient, OpenAIProvider } from '@atom8ai/unify-llm';
const client = new UnifyClient()
.registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!));
const getWeatherTool = {
name: 'getWeather',
description: 'Get the current weather for a city.',
schema: {
type: 'object',
properties: {
city: { type: 'string' },
},
required: ['city'],
},
execute: async ({ city }: { city: string }) => {
return { city, forecast: 'Rain', temperatureF: 52 };
},
};
const result = await client.generate('openai', {
model: 'gpt-4o',
messages: [
{ role: 'user', content: 'Should I bring an umbrella in Seattle today?' },
],
tools: [getWeatherTool],
autoExecute: true,
});
console.log(result.content);Middleware, routing, and safety
The middleware layer is where Unify LLM becomes more than a thin API wrapper.
import {
CacheMiddleware,
CostTrackerMiddleware,
RetryMiddleware,
UnifyClient,
OpenAIProvider,
createHallucinationGuard,
} from '@atom8ai/unify-llm';
const costTracker = new CostTrackerMiddleware();
const client = new UnifyClient()
.registerProvider(new OpenAIProvider(process.env.OPENAI_API_KEY!))
.use(new CacheMiddleware())
.use(new RetryMiddleware({ maxRetries: 3, baseDelayMs: 1000 }))
.use(costTracker)
.use(createHallucinationGuard({ alpha: 1.2, tau: 2, chunkSize: 6 }));
const response = await client.generate('openai', {
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Summarize the benefits of JSON schema.' }],
});
console.log(response.content);
console.log('Total cost:', costTracker.getTotalCost());Routing and orchestration highlights
Unify LLM includes advanced and experimental routing primitives for teams exploring:
- cost/latency/quality tradeoffs with
CostLatencyQualityRouter - Gaussian-process utility routing with
BayesianUtilityRouter - topological drift monitoring with
TopologicalDriftRouter - complexity-threshold routing with
ComplexityThresholdRouter - failover-capable orchestration with
SelfHealingGateway
These are useful when you want a single TypeScript SDK to act like a lightweight LLM gateway, multi-model router, or AI orchestration layer.
Supported providers
| Provider | Typical models | Tool calling | Streaming | Vision | Prompt caching |
| --- | --- | --- | --- | --- | --- |
| OpenAI | gpt-4o, gpt-4o-mini, o1 | ✅ | ✅ | ✅ | N/A |
| Anthropic | claude-3-7-sonnet, claude-3-opus | ✅ | ✅ | ✅ | ✅ |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro | ✅ | ✅ | ✅ | ✅ |
| Ollama | llama3.3, mistral, phi4 | ✅ | ✅ | ✅ | N/A |
Benchmarks and quality signals
Unify LLM ships with a reproducible benchmark harness in benchmarks/run.ts and evaluation helpers in evaluation/.
What the benchmark covers
- hallucination guard accuracy
- stream abort latency
- cost savings from model routing
- scaling behavior for micro-batched async workloads
Current checked-in synthetic snapshot
From the current local benchmark artifact in benchmarks/latest.json:
- hallucination guard accuracy:
100% - guardian p95 abort latency:
19.23 ms - cost savings vs always-frontier baseline:
43.69% - scaling smoke test:
10,000micro-batched iterations completed
CI regression thresholds
The benchmark assertion step currently enforces these regression guards:
- guardian accuracy ≥ 95%
- guardian p95 abort latency ≤ 50 ms
- pareto cost savings ≥ 20%
These are synthetic benchmark thresholds, not universal production guarantees. They are most useful for catching regressions in routing and safety logic over time.
Test coverage signal
The latest local test run completed with 33 passing test files, 230 passing tests, and 1 skipped file, which helps keep README claims grounded in code that is actually exercised.
Use cases
Developers usually land on Unify LLM through one of these intents:
Build one app across multiple LLM providers
Use one TypeScript client to talk to OpenAI, Anthropic, Gemini, and Ollama while keeping your application code stable.
Add tool calling to AI agents
Define tools once, keep schemas predictable, and avoid provider-specific tool payload drift.
Create a lightweight TypeScript LLM gateway
Combine middleware, routing, and failover so your app can make model decisions without adopting a heavier agent framework.
Add safety checks to streaming AI output
Use createHallucinationGuard to monitor semantic drift, annotate provider metadata, and stop unstable streams early.
Run local and hosted models side by side
Use Ollama for local experimentation and hosted providers for production paths or fallbacks.
Unify LLM vs other TypeScript AI SDKs
This section exists for real developer intent: many users are actively searching for a Vercel AI SDK alternative, LangChain.js alternative, or a more focused multi-provider LLM SDK for TypeScript.
| Tool | Best when you want | Tradeoff | | --- | --- | --- | | Unify LLM | One API for multiple providers, middleware, routing, tool calling, and safety primitives | Smaller ecosystem than the largest framework players | | Vercel AI SDK | Tight UI integration for web apps, especially React/Next.js streaming experiences | Less centered on experimental routing and safety middleware primitives | | LangChain.js | Large ecosystem of chains, integrations, and agent abstractions | Heavier abstraction layer if you mainly want direct provider control |
When Unify LLM is a strong choice
- you want a TypeScript SDK for OpenAI, Anthropic, Gemini, and Ollama
- you care about middleware, tool calling, and provider normalization
- you want routing and safety controls without building them all from scratch
- you prefer direct programmatic control over a large framework stack
Plain-English naming guide
Some modules still keep research-style or legacy names for backward compatibility. For new code, prefer the clearer aliases below.
| Internal name | Preferred public name | Meaning |
| --- | --- | --- |
| createSemanticMomentumGuardian | createHallucinationGuard | Hallucination and drift guard |
| HallucinationInterceptionAlgorithm | ResponseAnomalyDetector | Response anomaly detector |
| ParetoNavigatorRouter | CostLatencyQualityRouter | Cost/latency/quality router |
| PrimRouter | TopologicalDriftRouter | Topological drift router |
| VonNeumannRouter | BayesianUtilityRouter | Bayesian utility router |
| AstralDysonRouter | ComplexityThresholdRouter | Prompt complexity router |
| semanticFingerprintEngine.ts | semanticFingerprint.ts | Semantic fingerprint helpers |
| topologyPersistence.ts | topologyDrift.ts | Topology drift helpers |
| loopRiskEngine.ts | executionLoopRisk.ts | Execution loop risk helpers |
Examples and repository guide
Useful starting points in this repository:
examples/basic.ts- base client with cache and cost trackingexamples/orchestration.ts- retrieval, prompt templates, and structured parsingexamples/paretoNavigator.ts- multi-objective routing exampleexamples/primRouter.ts- topological drift routing exampleexamples/hallucinationGuard.ts- non-streaming and streaming guard usagebenchmarks/run.ts- local benchmark harnessCONTRIBUTING.md- contributor setup and expectations
FAQ
What is Unify LLM in one sentence?
Unify LLM is a TypeScript SDK that lets you build AI applications across multiple LLM providers with one API for generation, tool calling, middleware, routing, and safety.
Is Unify LLM a LangChain.js replacement?
It can be, depending on your goals. If you want a lighter TypeScript abstraction with direct provider control, middleware, and routing primitives, Unify LLM is a strong option.
Is Unify LLM a Vercel AI SDK alternative?
Yes. If your priority is provider normalization, routing, and middleware rather than UI-focused web framework helpers, Unify LLM is a reasonable alternative.
Can I use local models?
Yes. Unify LLM includes an OllamaProvider, which is useful for local inference, offline experiments, and hybrid local/hosted setups.
Does Unify LLM support structured outputs?
Yes. You can define JSON schema-style response shapes and use them for more predictable parsing and downstream automation.
Does Unify LLM include hallucination protection?
It includes createHallucinationGuard, which monitors semantic drift and can annotate or abort unstable response streams. You should still add normal application-level validation and domain-specific safety checks in production.
Contributing
Contributions are welcome.
- Open an issue for bugs, provider support requests, or documentation gaps
- Include tests when you change routing, middleware, or core request handling
- Include benchmark notes when your change affects performance, safety, or routing behavior
- Start with
CONTRIBUTING.md
License
Released under the MIT License.
