@unified-llm/core
v0.6.9
Unified LLM interface (in-memory).
@unified-llm/core
A simple way to manipulate multiple LLMs (OpenAI, Anthropic, Google Gemini, DeepSeek, Azure OpenAI, Ollama) with unified interface.
Why this matters:
- One interface for many LLMs: swap providers without changing app code.
- Event-based streaming API: start → text_delta* → stop → error.
- Same field for display: use `response.text` for both chat and stream.
- Clean streaming with tools: providers execute tool calls mid-stream; you only receive text.
- Power when you need it: access provider-native payloads via `rawResponse` on the final chunk.
Features
- 🤖 Multi-Provider Support - OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Azure OpenAI, Ollama
- ⚡ Event-Based Streaming API - Unified `start`/`text_delta`/`stop`/`error` events across providers
- 🔧 Function Calling - Execute local functions and integrate external tools
- 📊 Structured Output - Guaranteed JSON schema compliance across all providers
- 💬 Conversation Persistence - SQLite-based chat history and thread management
- 🏠 Local LLM Support - Run models locally with Ollama's OpenAI-compatible API
Installation
npm install @unified-llm/core
Simplest way to chat with multiple LLMs
One tiny interface, many providers. Change only the provider and model.
import { LLMClient } from '@unified-llm/core';
const providers = [
{ provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY },
{ provider: 'anthropic', model: 'claude-3-haiku-20240307', apiKey: process.env.ANTHROPIC_API_KEY },
{ provider: 'google', model: 'gemini-2.5-flash', apiKey: process.env.GOOGLE_API_KEY },
{ provider: 'deepseek', model: 'deepseek-chat', apiKey: process.env.DEEPSEEK_API_KEY },
// Azure and Ollama have dedicated sections below
];
for (const cfg of providers.filter(p => p.apiKey)) {
const client = new LLMClient(cfg as any);
const res = await client.chat({
messages: [{ role: 'user', content: 'Give me one fun fact.' }]
});
console.log(cfg.provider, '→', res.text);
}
Streaming Responses
Streaming uses a unified, event-based API across providers: instead of provider-specific chunk shapes, you receive structured events with an `eventType` and an optional `delta`. This replaces the legacy streaming interface and is not backward compatible. See docs/streaming-unification.md for the full spec.
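Conceptually, the event model reduces to a fold over `text_delta` events. The types below are simplified stand-ins for illustration only, not the library's exports:

```typescript
// Simplified stand-ins for the unified stream events (illustration only;
// the real types live in @unified-llm/core).
type StreamEvent =
  | { eventType: 'start' }
  | { eventType: 'text_delta'; delta: { type: 'text'; text: string }; text: string }
  | { eventType: 'stop'; text: string }
  | { eventType: 'error'; delta?: unknown };

// Fold a finished event sequence into the final text.
function collectText(events: StreamEvent[]): string {
  let acc = '';
  for (const ev of events) {
    if (ev.eventType === 'text_delta') acc += ev.delta.text;
    if (ev.eventType === 'stop') acc = ev.text; // stop carries the full accumulated text
  }
  return acc;
}

const events: StreamEvent[] = [
  { eventType: 'start' },
  { eventType: 'text_delta', delta: { type: 'text', text: 'Hel' }, text: 'Hel' },
  { eventType: 'text_delta', delta: { type: 'text', text: 'lo' }, text: 'Hello' },
  { eventType: 'stop', text: 'Hello' },
];

console.log(collectText(events)); // 'Hello'
```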
Basic Streaming Example
import { LLMClient } from '@unified-llm/core';
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
systemPrompt: 'You are a helpful assistant that answers questions in Japanese.',
});
const stream = await client.stream({
messages: [
{
id: '1',
role: 'user',
content: 'What are some recommended tourist spots in Osaka?',
createdAt: new Date()
},
],
});
let acc = '';
for await (const ev of stream) {
switch (ev.eventType) {
case 'start':
// initialize UI state if needed
break;
case 'text_delta':
// ev.delta?.text is the incremental piece; ev.text is the accumulator
process.stdout.write(ev.delta?.text ?? '');
acc = ev.text;
break;
case 'stop':
console.log('\nComplete response:', ev.text);
// ev.rawResponse contains provider-native final response (or stream data)
break;
case 'error':
console.error('Stream error:', ev.delta);
break;
}
}
Function Calling
The defineTool helper provides type safety for tool definitions, automatically inferring argument and return types from the handler function:
import { LLMClient } from '@unified-llm/core';
import { defineTool } from '@unified-llm/core/tools';
import fs from 'fs/promises';
// Let AI read and analyze any file
const readFile = defineTool({
type: 'function',
function: {
name: 'readFile',
description: 'Read any text file',
parameters: {
type: 'object',
properties: {
filename: { type: 'string', description: 'Name of file to read' }
},
required: ['filename']
}
},
handler: async (args: { filename: string }) => {
const content = await fs.readFile(args.filename, 'utf8');
return content;
}
});
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
tools: [readFile]
});
// Create a sample log file for demo
await fs.writeFile('app.log', `
[ERROR] 2024-01-15 Database connection timeout
[WARN] 2024-01-15 Memory usage at 89%
[ERROR] 2024-01-15 Failed to authenticate user rhyizm
[ERROR] 2024-01-15 Database connection timeout
[INFO] 2024-01-15 Server restarted
`);
// Ask AI to analyze the log file
const response = await client.chat({
messages: [{
role: 'user',
content: "Read app.log and tell me what's wrong with my application",
createdAt: new Date()
}]
});
console.log(response.message.content);
// AI will read the actual file and give you insights about the errors!
Using tools from an MCP server
You can pull tools from an MCP server and pass them directly to LLMClient as function-calling tools. The MCP tool inputSchema maps cleanly to OpenAI/Responses function parameters.
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamable-http.js';
import { LLMClient } from '@unified-llm/core';
const MCP_SERVER_URL = process.env.MCP_SERVER_URL!; // e.g. http://localhost:3000
async function main() {
// Connect to MCP server (Streamable HTTP example)
const transport = new StreamableHTTPClientTransport(new URL('/mcp', MCP_SERVER_URL));
const mcpClient = new Client(
{ name: 'local-mcp-responses-client', version: '1.0.0' },
{ capabilities: {} }
);
await mcpClient.connect(transport);
// Fetch tools from MCP and adapt to function-calling tools
const toolsList = await mcpClient.listTools();
const tools = toolsList.tools.map((tool) => ({
type: 'function' as const,
function: {
name: tool.name,
description: tool.description,
parameters: tool.inputSchema,
},
}));
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
tools,
});
const res = await client.chat({
messages: [{
role: 'user',
content: 'Use available tools to help me',
createdAt: new Date(),
}],
});
console.log(res.message.content);
}
main().catch(console.error);
Streaming with Function Calls
During streaming, tool calls are handled provider-side for you. When a model requests tool input mid-stream, the provider accumulates the tool call, executes your registered tool handlers, and continues streaming the final assistant text. You only observe text events: start → text_delta* → stop.
import { LLMClient } from '@unified-llm/core';
import { defineTool } from '@unified-llm/core/tools';
const getWeather = defineTool({
type: 'function',
function: {
name: 'getWeather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' }
},
required: ['location']
}
},
handler: async (args: { location: string }) => {
return `Weather in ${args.location}: Sunny, 27°C`;
}
});
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
tools: [getWeather]
});
const stream = await client.stream({
messages: [{
id: '1',
role: 'user',
content: "What's the weather like in Tokyo?",
createdAt: new Date()
}]
});
for await (const ev of stream) {
if (ev.eventType === 'text_delta') {
process.stdout.write(ev.delta?.text ?? '');
}
if (ev.eventType === 'stop') {
console.log('\nFinal text:', ev.text);
}
}
Structured Output
Structured Output ensures that AI responses follow a specific JSON schema format across all supported providers. This is particularly useful for applications that need to parse and process AI responses programmatically.
Basic Structured Output
import { LLMClient, ResponseFormat } from '@unified-llm/core';
// Define the expected response structure
const weatherFormat = new ResponseFormat({
name: 'weather_info',
description: 'Weather information for a location',
schema: {
type: 'object',
properties: {
location: { type: 'string' },
temperature: { type: 'number' },
condition: { type: 'string' },
humidity: { type: 'number' }
},
required: ['location', 'temperature', 'condition']
}
});
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-2024-08-06', // Structured output requires specific models
apiKey: process.env.OPENAI_API_KEY
});
const response = await client.chat({
messages: [{
role: 'user',
content: 'What is the weather like in Tokyo today?'
}],
generationConfig: {
responseFormat: weatherFormat
}
});
// Response will be guaranteed to follow the schema
console.log(JSON.parse(response.message.content[0].text));
// Output: { "location": "Tokyo", "temperature": 25, "condition": "Sunny", "humidity": 60 }
Multi-Provider Structured Output
The same ResponseFormat works across all providers with automatic conversion:
// Works with OpenAI (uses json_schema format internally)
const openaiClient = new LLMClient({
provider: 'openai',
model: 'gpt-4o-2024-08-06',
apiKey: process.env.OPENAI_API_KEY
});
// Works with Google Gemini (uses responseSchema format internally)
const geminiClient = new LLMClient({
provider: 'google',
model: 'gemini-1.5-pro',
apiKey: process.env.GOOGLE_API_KEY
});
// Works with Anthropic (uses prompt engineering internally)
const claudeClient = new LLMClient({
provider: 'anthropic',
model: 'claude-3-5-sonnet-latest',
apiKey: process.env.ANTHROPIC_API_KEY
});
const userInfoFormat = new ResponseFormat({
name: 'user_profile',
schema: {
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
email: { type: 'string' },
interests: {
type: 'array',
items: { type: 'string' }
}
},
required: ['name', 'age', 'email']
}
});
const request = {
messages: [{ role: 'user', content: 'Create a sample user profile' }],
generationConfig: { responseFormat: userInfoFormat }
};
// All three will return structured JSON in the same format
const openaiResponse = await openaiClient.chat(request);
const geminiResponse = await geminiClient.chat(request);
const claudeResponse = await claudeClient.chat(request);
Pre-built Response Format Templates
The library provides convenient templates for common structured output patterns:
import { ResponseFormats } from '@unified-llm/core';
// Key-value extraction
const contactFormat = ResponseFormats.keyValue(['name', 'email', 'phone']);
const contactResponse = await client.chat({
messages: [{
role: 'user',
content: 'Extract contact info: John Doe, [email protected], 555-1234'
}],
generationConfig: { responseFormat: contactFormat }
});
// Classification with confidence scores
const sentimentFormat = ResponseFormats.classification(['positive', 'negative', 'neutral']);
const sentimentResponse = await client.chat({
messages: [{
role: 'user',
content: 'Analyze sentiment: "I absolutely love this new feature!"'
}],
generationConfig: { responseFormat: sentimentFormat }
});
// Returns: { "category": "positive", "confidence": 0.95 }
// List responses
const taskFormat = ResponseFormats.list({
type: 'object',
properties: {
task: { type: 'string' },
priority: { type: 'string', enum: ['high', 'medium', 'low'] },
deadline: { type: 'string' }
}
});
const taskResponse = await client.chat({
messages: [{
role: 'user',
content: 'Create a task list for launching a mobile app'
}],
generationConfig: { responseFormat: taskFormat }
});
// Returns: { "items": [{ "task": "Design UI", "priority": "high", "deadline": "2024-02-01" }, ...] }
Complex Nested Schemas
const productReviewFormat = new ResponseFormat({
name: 'product_review',
schema: {
type: 'object',
properties: {
rating: { type: 'number', minimum: 1, maximum: 5 },
summary: { type: 'string' },
pros: {
type: 'array',
items: { type: 'string' }
},
cons: {
type: 'array',
items: { type: 'string' }
},
recommendation: {
type: 'object',
properties: {
wouldRecommend: { type: 'boolean' },
targetAudience: { type: 'string' },
alternatives: {
type: 'array',
items: { type: 'string' }
}
}
}
},
required: ['rating', 'summary', 'pros', 'cons', 'recommendation']
}
});
const reviewResponse = await client.chat({
messages: [{
role: 'user',
content: 'Review this smartphone: iPhone 15 Pro - great camera, expensive, good battery life'
}],
generationConfig: { responseFormat: productReviewFormat }
});
Provider-Specific Notes
- OpenAI: Supports native structured outputs with `gpt-4o-2024-08-06` and newer models
- Google Gemini: Uses `responseMimeType: 'application/json'` with `responseSchema`
- Anthropic: Uses prompt engineering to request JSON format responses
- DeepSeek: Similar to OpenAI, supports JSON mode
The ResponseFormat class automatically handles the conversion to each provider's specific format, ensuring consistent behavior across all supported LLMs.
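As an illustration of that conversion, a single schema roughly maps to the two native payload shapes below. Field names follow each provider's public API; this is a sketch for orientation, not the library's internal code:

```typescript
// One JSON schema, expressed once.
const schema = {
  type: 'object',
  properties: { city: { type: 'string' }, tempC: { type: 'number' } },
  required: ['city', 'tempC'],
};

// OpenAI: response_format with a named json_schema.
const openaiPayload = {
  response_format: {
    type: 'json_schema',
    json_schema: { name: 'weather', schema, strict: true },
  },
};

// Google Gemini: JSON MIME type plus responseSchema in generationConfig.
const geminiPayload = {
  generationConfig: {
    responseMimeType: 'application/json',
    responseSchema: schema,
  },
};

console.log(openaiPayload.response_format.type);              // 'json_schema'
console.log(geminiPayload.generationConfig.responseMimeType); // 'application/json'
```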
Multi-Provider Example
import { LLMClient } from '@unified-llm/core';
// Create LLM clients for different providers
const gpt = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
systemPrompt: 'You are a helpful assistant that answers concisely.'
});
const claude = new LLMClient({
provider: 'anthropic',
model: 'claude-3-haiku-20240307',
apiKey: process.env.ANTHROPIC_API_KEY,
systemPrompt: 'You are a thoughtful assistant that provides detailed explanations.'
});
const gemini = new LLMClient({
provider: 'google',
model: 'gemini-2.0-flash',
apiKey: process.env.GOOGLE_API_KEY,
systemPrompt: 'You are a creative assistant that thinks outside the box.'
});
const deepseek = new LLMClient({
provider: 'deepseek',
model: 'deepseek-chat',
apiKey: process.env.DEEPSEEK_API_KEY,
systemPrompt: 'You are a technical assistant specialized in coding.'
});
// Use the unified chat interface
const request = {
messages: [{
id: '1',
role: 'user',
content: 'What are your thoughts on AI?',
createdAt: new Date()
}]
};
// Each provider will respond according to their system prompt
const gptResponse = await gpt.chat(request);
const claudeResponse = await claude.chat(request);
const geminiResponse = await gemini.chat(request);
const deepseekResponse = await deepseek.chat(request);
Multi-Provider Streaming
Streaming works consistently across all supported providers using the unified event model:
const providers = [
{ name: 'OpenAI', provider: 'openai', model: 'gpt-4o-mini' },
{ name: 'Claude', provider: 'anthropic', model: 'claude-3-haiku-20240307' },
{ name: 'Gemini', provider: 'google', model: 'gemini-2.0-flash' },
{ name: 'DeepSeek', provider: 'deepseek', model: 'deepseek-chat' }
];
for (const config of providers) {
const client = new LLMClient({
provider: config.provider as any,
model: config.model,
apiKey: process.env[`${config.provider.toUpperCase()}_API_KEY`]
});
console.log(`\n--- ${config.name} Response ---`);
const stream = await client.stream({
messages: [{
id: '1',
role: 'user',
content: 'Tell me a short story about AI.',
createdAt: new Date()
}]
});
for await (const ev of stream) {
if (ev.eventType === 'text_delta') {
process.stdout.write(ev.delta?.text ?? '');
}
}
}
Unified Response Format
All providers return responses in a consistent format, making it easy to switch between different LLMs:
Chat Response Format
{
id: "chatcmpl-Blub8EgOvVaP7c3lxzmVF4TJpVCun",
model: "gpt-4o-mini",
provider: "openai",
message: {
id: "msg_1750758679093_r9hqdhfzh",
role: "assistant",
content: [
{
type: "text",
text: "The author of this project is rhyizm."
}
],
createdAt: "2025-06-24T09:51:19.093Z"
},
usage: {
inputTokens: 72,
outputTokens: 10,
totalTokens: 82
},
finish_reason: "stop",
createdAt: "2025-06-24T09:51:18.000Z",
rawResponse: {
/* Original response from the provider (as returned by OpenAI, Anthropic, Google, DeepSeek, etc.) */
}
}
Stream Response Format
Each streaming event mirrors UnifiedChatResponse with a few additions:
{
id: "chatcmpl-example",
model: "gpt-4o-mini",
provider: "openai",
message: {
id: "msg_example",
role: "assistant",
content: [
{
type: "text",
text: "Chunk of text..."
}
],
createdAt: "2025-01-01T00:00:00.000Z"
},
// createdAt is optional in streaming events
eventType: "text_delta", // one of: start | text_delta | stop | error
outputIndex: 0,
delta: { type: "text", text: "Chunk of text..." }
}
On the final `stop` event:
- `finish_reason` may be present (e.g., "stop", "length").
- `usage` may be present when available.
- `rawResponse` contains the provider-native final result. For streaming providers that don't return a single object, it contains native stream data (e.g., an array of SSE chunks). For Gemini, it includes both `{ stream, response }`.
Key benefits:
- Consistent structure across all providers (OpenAI, Anthropic, Google, DeepSeek, Azure)
- Event-based streaming with `eventType` and `delta` for incremental text
- Unified usage tracking when the provider reports it
- Provider identification to know which service generated the response
- Raw response access on the final chunk for provider-specific features
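A small helper can pull the optional fields off the final `stop` event defensively. This is a sketch against the stream format shown above; the type and sample event are illustrative, not library exports:

```typescript
// Minimal shape of a final stop event (optional fields per the notes above).
interface StopEvent {
  eventType: 'stop';
  text: string;
  finish_reason?: string;
  usage?: { inputTokens: number; outputTokens: number; totalTokens: number };
  rawResponse?: unknown;
}

// Summarize the fields that may or may not be present on the stop event.
function summarizeStop(ev: StopEvent): string {
  const reason = ev.finish_reason ?? 'unknown';
  const tokens = ev.usage?.totalTokens ?? 0;
  return `finished=${reason} tokens=${tokens} chars=${ev.text.length}`;
}

const ev: StopEvent = {
  eventType: 'stop',
  text: 'Hello from the model.',
  finish_reason: 'stop',
  usage: { inputTokens: 12, outputTokens: 6, totalTokens: 18 },
};

console.log(summarizeStop(ev)); // 'finished=stop tokens=18 chars=21'
```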
Persistent LLM Client Configuration
import { LLMClient } from '@unified-llm/core';
// Save LLM client configuration
const savedClientId = await LLMClient.save({
name: 'My AI Assistant',
provider: 'openai',
model: 'gpt-4o-mini',
systemPrompt: 'You are a helpful coding assistant.',
tags: ['development', 'coding'],
isActive: true
});
// Load saved LLM client
const client = await LLMClient.fromSaved(
savedClientId,
process.env.OPENAI_API_KEY
);
// List all saved LLM clients
const clients = await LLMClient.list({
provider: 'openai',
includeInactive: false
});
Azure OpenAI Example
Azure OpenAI requires a different initialization pattern compared to other providers:
import { AzureOpenAIProvider } from '@unified-llm/core/providers/azure';
// Azure OpenAI uses a different constructor pattern
// First parameter: Azure-specific configuration
// Second parameter: Base options (apiKey, tools, etc.)
const azureOpenAI = new AzureOpenAIProvider(
{
endpoint: process.env.AZURE_OPENAI_ENDPOINT!, // Azure resource endpoint
deployment: process.env.AZURE_OPENAI_DEPLOYMENT!, // Model deployment name
apiVersion: '2024-10-21', // Optional, defaults to 'preview'
useV1: true // Use /openai/v1 endpoint format
},
{
apiKey: process.env.AZURE_OPENAI_KEY!,
tools: [] // Optional tools
}
);
// Compare with standard LLMClient initialization:
// const client = new LLMClient({
// provider: 'openai',
// model: 'gpt-4o-mini',
// apiKey: process.env.OPENAI_API_KEY
// });
// Use it like any other provider
const response = await azureOpenAI.chat({
messages: [{
id: '1',
role: 'user',
content: 'Hello from Azure!',
createdAt: new Date()
}]
});
Ollama & OpenAI-Compatible APIs
Ollama (Local LLM) Example
Ollama exposes an OpenAI-compatible API, letting you run large language models locally. Use `provider: 'ollama'` to connect to it.
Prerequisites
- Install Ollama from ollama.ai
- Pull a model: `ollama pull llama3` (or any other model)
- Start the Ollama server (usually runs automatically at `http://localhost:11434`)
Basic Usage
import { LLMClient } from '@unified-llm/core';
// Ollama configuration - no API key required
const ollama = new LLMClient({
provider: 'ollama',
model: 'llama3', // or 'mistral', 'codellama', etc.
baseURL: 'http://localhost:11434/v1', // Optional - this is the default
systemPrompt: 'You are a helpful assistant running locally.'
});
// Use it just like any other provider
const response = await ollama.chat({
messages: [{
id: '1',
role: 'user',
content: 'Explain quantum computing in simple terms.',
createdAt: new Date()
}]
});
console.log(response.message.content);
Remote Ollama Server
If you're running Ollama on a different machine or port:
const remoteOllama = new LLMClient({
provider: 'ollama',
model: 'llama3',
baseURL: 'http://your-server:11434/v1' // Replace with your server address
});
Available Models
Popular models you can use with Ollama:
- `llama3` - Meta's Llama 3
- `mistral` - Mistral AI's models
- `codellama` - Code-focused Llama variant
- `phi` - Microsoft's Phi models
- `gemma` - Google's Gemma models
- `mixtral` - Mixture-of-experts model
Check available models with: ollama list
Why Use Ollama Provider?
While Ollama is OpenAI-compatible and could work with provider: 'openai', using provider: 'ollama' offers:
- Clearer intent - Makes it obvious you're using a local model
- No API key required - Ollama doesn't need authentication
- Future compatibility - If we add Ollama-specific features, your code won't need changes
Running Examples
You can run TypeScript examples directly with tsx:
# Add tsx to your project
npm install --save-dev tsx
# Run the example
npx tsx example.ts
Example file (example.ts):
import { LLMClient } from '@unified-llm/core';
async function runOllamaExample() {
const ollamaClient = new LLMClient({
provider: 'ollama',
model: 'llama3',
baseURL: 'http://localhost:11434/v1'
});
const response = await ollamaClient.chat({
messages: [{
id: '1',
role: 'user',
content: 'Hello, introduce yourself in Japanese.',
createdAt: new Date()
}]
});
console.log('Ollama Response:', JSON.stringify(response, null, 2));
}
runOllamaExample().catch(console.error);
Environment Variables
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GOOGLE_API_KEY=your-google-key
DEEPSEEK_API_KEY=your-deepseek-key
AZURE_OPENAI_KEY=your-azure-key # For Azure OpenAI
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com # For Azure OpenAI
AZURE_OPENAI_DEPLOYMENT=your-deployment-name # For Azure OpenAI
UNIFIED_LLM_DB_PATH=./chat-history.db # Optional custom DB path
Supported Providers
| Provider | Models | Features |
|----------|--------|----------|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5 | Function calling, streaming, vision, structured output |
| Anthropic | Claude 3.5 (Sonnet), Claude 3 (Opus, Sonnet, Haiku) | Tool use, streaming, long context, structured output |
| Google | Gemini 2.0 Flash, Gemini 1.5 Pro/Flash | Function calling, multimodal, structured output |
| DeepSeek | DeepSeek-Chat, DeepSeek-Coder | Function calling, streaming, code generation, structured output |
| Azure OpenAI | GPT-4o, GPT-4, GPT-3.5 (via Azure deployments) | Function calling, streaming, structured output |
| Ollama | Llama 3, Mistral, CodeLlama, Phi, Gemma, Mixtral, etc. | Local execution, OpenAI-compatible API, no API key required |
API Methods
Core Methods
// Main chat method - returns complete response
await client.chat(request: UnifiedChatRequest)
// Streaming responses (returns async generator)
for await (const chunk of client.stream(request)) {
console.log(chunk);
}
Note: Persistence methods are experimental and may be removed or moved to a separate package in the future.
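Neither method above is shown with error handling; a generic retry helper works for either. This wrapper is purely illustrative and not part of the package; the attempt count and delays are arbitrary:

```typescript
// Retry an async call with exponential backoff. Works for chat(), stream(),
// or any other promise-returning call.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Exponential backoff: 250ms, 500ms, 1000ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}

// Usage (hypothetical): const res = await withRetry(() => client.chat(request));
```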
Persistence Methods
// Save LLM client configuration
await LLMClient.save(config: LLMClientConfig)
// Load saved LLM client
await LLMClient.fromSaved(id: string, apiKey?: string)
// Get saved configuration
await LLMClient.getConfig(id: string)
// List saved LLM clients
await LLMClient.list(options?: { provider?: string, includeInactive?: boolean })
// Update configuration
await LLMClient.update(id: string, updates: Partial<LLMClientConfig>)
// Soft delete LLM client
await LLMClient.delete(id: string)
Requirements
- Node.js 20 or higher
- TypeScript 5.4.5 or higher (for development)
License
MIT - see LICENSE for details.
