@unified-llm/core
A simple way to work with multiple LLMs (OpenAI, Anthropic, Google Gemini, DeepSeek, Azure OpenAI, Ollama) through a unified interface.
Why this matters:
- One interface for many LLMs: swap providers without changing app code.
- Event-based streaming API: start → text_delta* → stop → error.
- Same field for display: use `response.text` for both chat and stream (see the sketch after this list).
- Clean streaming with tools: providers execute tool calls mid-stream; you only receive text.
- Power when you need it: access provider-native payloads via `rawResponse` on the final chunk.
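A quick sketch of the "same field for display" point. It assumes a `client` constructed as in the examples below; only the field access is the point here:

```ts
// Non-streaming: the complete reply is available on response.text.
const response = await client.chat({
  messages: [{ role: 'user', content: 'Say hello.' }]
});
console.log(response.text);

// Streaming: the final 'stop' event carries the same accumulated text field.
const stream = await client.stream({
  messages: [{ role: 'user', content: 'Say hello.' }]
});
for await (const ev of stream) {
  if (ev.eventType === 'stop') console.log(ev.text);
}
```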
Features
- 🤖 Multi-Provider Support - OpenAI, Anthropic Claude, Google Gemini, DeepSeek, Azure OpenAI, Ollama
- ⚡ Event-Based Streaming API - Unified `start`/`text_delta`/`stop`/`error` events across providers
- 🔧 Function Calling - Execute local functions and integrate external tools
- 📊 Structured Output - Guaranteed JSON schema compliance across all providers
- 💬 Conversation Persistence - SQLite-based chat history and thread management
- 🏠 Local LLM Support - Run models locally with Ollama's OpenAI-compatible API
Installation
npm install @unified-llm/core
Simplest way to chat with multiple LLMs
One tiny interface, many providers. Change only the provider and model.
import { LLMClient } from '@unified-llm/core';
const providers = [
{ provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY },
{ provider: 'anthropic', model: 'claude-3-haiku-20240307', apiKey: process.env.ANTHROPIC_API_KEY },
{ provider: 'google', model: 'gemini-2.5-flash', apiKey: process.env.GOOGLE_API_KEY },
{ provider: 'deepseek', model: 'deepseek-chat', apiKey: process.env.DEEPSEEK_API_KEY },
// Azure and Ollama have dedicated sections below
];
for (const cfg of providers.filter(p => p.apiKey)) {
const client = new LLMClient(cfg as any);
const res = await client.chat({
messages: [{ role: 'user', content: 'Give me one fun fact.' }]
});
console.log(cfg.provider, '→', res.text);
}
Streaming Responses
Streaming uses an event-based API across providers. Instead of provider-specific chunk shapes, you receive structured events with an `eventType` and an optional `delta`. This replaces the legacy streaming interface and is not backward compatible. See docs/streaming-unification.md for the full spec.
Basic Streaming Example
import { LLMClient } from '@unified-llm/core';
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
systemPrompt: 'You are a helpful assistant that answers questions in Japanese.',
});
const stream = await client.stream({
messages: [
{
id: '1',
role: 'user',
content: 'What are some recommended tourist spots in Osaka?',
createdAt: new Date()
},
],
});
let acc = '';
for await (const ev of stream) {
switch (ev.eventType) {
case 'start':
// initialize UI state if needed
break;
case 'text_delta':
// ev.delta?.text is the incremental piece; ev.text is the accumulator
process.stdout.write(ev.delta?.text ?? '');
acc = ev.text;
break;
case 'stop':
console.log('\nComplete response:', ev.text);
// ev.rawResponse contains provider-native final response (or stream data)
break;
case 'error':
console.error('Stream error:', ev.delta);
break;
}
}
Function Calling
The `defineTool` helper provides type safety for tool definitions, automatically inferring argument and return types from the handler function:
import { LLMClient } from '@unified-llm/core';
import { defineTool } from '@unified-llm/core/tools';
import fs from 'fs/promises';
// Let AI read and analyze any file
const readFile = defineTool({
type: 'function',
function: {
name: 'readFile',
description: 'Read any text file',
parameters: {
type: 'object',
properties: {
filename: { type: 'string', description: 'Name of file to read' }
},
required: ['filename']
}
},
handler: async (args: { filename: string }) => {
const content = await fs.readFile(args.filename, 'utf8');
return content;
}
});
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
tools: [readFile]
});
// Create a sample log file for demo
await fs.writeFile('app.log', `
[ERROR] 2024-01-15 Database connection timeout
[WARN] 2024-01-15 Memory usage at 89%
[ERROR] 2024-01-15 Failed to authenticate user rhyizm
[ERROR] 2024-01-15 Database connection timeout
[INFO] 2024-01-15 Server restarted
`);
// Ask AI to analyze the log file
const response = await client.chat({
messages: [{
role: 'user',
content: "Read app.log and tell me what's wrong with my application",
createdAt: new Date()
}]
});
console.log(response.message.content);
// AI will read the actual file and give you insights about the errors!
Streaming with Function Calls
During streaming, tool calls are handled provider-side for you. When a model requests tool input mid-stream, the provider accumulates the tool call, executes your registered tool handlers, and continues streaming the final assistant text. You only observe text events: start → text_delta* → stop.
import { LLMClient } from '@unified-llm/core';
import { defineTool } from '@unified-llm/core/tools';
const getWeather = defineTool({
type: 'function',
function: {
name: 'getWeather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' }
},
required: ['location']
}
},
handler: async (args: { location: string }) => {
return `Weather in ${args.location}: Sunny, 27°C`;
}
});
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
tools: [getWeather]
});
const stream = await client.stream({
messages: [{
id: '1',
role: 'user',
content: "What's the weather like in Tokyo?",
createdAt: new Date()
}]
});
for await (const ev of stream) {
if (ev.eventType === 'text_delta') {
process.stdout.write(ev.delta?.text ?? '');
}
if (ev.eventType === 'stop') {
console.log('\nFinal text:', ev.text);
}
}
Structured Output
Structured Output ensures that AI responses follow a specific JSON schema format across all supported providers. This is particularly useful for applications that need to parse and process AI responses programmatically.
Basic Structured Output
import { LLMClient, ResponseFormat } from '@unified-llm/core';
// Define the expected response structure
const weatherFormat = new ResponseFormat({
name: 'weather_info',
description: 'Weather information for a location',
schema: {
type: 'object',
properties: {
location: { type: 'string' },
temperature: { type: 'number' },
condition: { type: 'string' },
humidity: { type: 'number' }
},
required: ['location', 'temperature', 'condition']
}
});
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o-2024-08-06', // Structured output requires specific models
apiKey: process.env.OPENAI_API_KEY
});
const response = await client.chat({
messages: [{
role: 'user',
content: 'What is the weather like in Tokyo today?'
}],
generationConfig: {
responseFormat: weatherFormat
}
});
// Response will be guaranteed to follow the schema
console.log(JSON.parse(response.message.content[0].text));
// Output: { "location": "Tokyo", "temperature": 25, "condition": "Sunny", "humidity": 60 }
Multi-Provider Structured Output
The same ResponseFormat works across all providers with automatic conversion:
// Works with OpenAI (uses json_schema format internally)
const openaiClient = new LLMClient({
provider: 'openai',
model: 'gpt-4o-2024-08-06',
apiKey: process.env.OPENAI_API_KEY
});
// Works with Google Gemini (uses responseSchema format internally)
const geminiClient = new LLMClient({
provider: 'google',
model: 'gemini-1.5-pro',
apiKey: process.env.GOOGLE_API_KEY
});
// Works with Anthropic (uses prompt engineering internally)
const claudeClient = new LLMClient({
provider: 'anthropic',
model: 'claude-3-5-sonnet-latest',
apiKey: process.env.ANTHROPIC_API_KEY
});
const userInfoFormat = new ResponseFormat({
name: 'user_profile',
schema: {
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
email: { type: 'string' },
interests: {
type: 'array',
items: { type: 'string' }
}
},
required: ['name', 'age', 'email']
}
});
const request = {
messages: [{ role: 'user', content: 'Create a sample user profile' }],
generationConfig: { responseFormat: userInfoFormat }
};
// All three will return structured JSON in the same format
const openaiResponse = await openaiClient.chat(request);
const geminiResponse = await geminiClient.chat(request);
const claudeResponse = await claudeClient.chat(request);
Pre-built Response Format Templates
The library provides convenient templates for common structured output patterns:
import { ResponseFormats } from '@unified-llm/core';
// Key-value extraction
const contactFormat = ResponseFormats.keyValue(['name', 'email', 'phone']);
const contactResponse = await client.chat({
messages: [{
role: 'user',
content: 'Extract contact info: John Doe, [email protected], 555-1234'
}],
generationConfig: { responseFormat: contactFormat }
});
// Classification with confidence scores
const sentimentFormat = ResponseFormats.classification(['positive', 'negative', 'neutral']);
const sentimentResponse = await client.chat({
messages: [{
role: 'user',
content: 'Analyze sentiment: "I absolutely love this new feature!"'
}],
generationConfig: { responseFormat: sentimentFormat }
});
// Returns: { "category": "positive", "confidence": 0.95 }
// List responses
const taskFormat = ResponseFormats.list({
type: 'object',
properties: {
task: { type: 'string' },
priority: { type: 'string', enum: ['high', 'medium', 'low'] },
deadline: { type: 'string' }
}
});
const taskResponse = await client.chat({
messages: [{
role: 'user',
content: 'Create a task list for launching a mobile app'
}],
generationConfig: { responseFormat: taskFormat }
});
// Returns: { "items": [{ "task": "Design UI", "priority": "high", "deadline": "2024-02-01" }, ...] }
Complex Nested Schemas
const productReviewFormat = new ResponseFormat({
name: 'product_review',
schema: {
type: 'object',
properties: {
rating: { type: 'number', minimum: 1, maximum: 5 },
summary: { type: 'string' },
pros: {
type: 'array',
items: { type: 'string' }
},
cons: {
type: 'array',
items: { type: 'string' }
},
recommendation: {
type: 'object',
properties: {
wouldRecommend: { type: 'boolean' },
targetAudience: { type: 'string' },
alternatives: {
type: 'array',
items: { type: 'string' }
}
}
}
},
required: ['rating', 'summary', 'pros', 'cons', 'recommendation']
}
});
const reviewResponse = await client.chat({
messages: [{
role: 'user',
content: 'Review this smartphone: iPhone 15 Pro - great camera, expensive, good battery life'
}],
generationConfig: { responseFormat: productReviewFormat }
});
Provider-Specific Notes
- OpenAI: Supports native structured outputs with `gpt-4o-2024-08-06` and newer models
- Google Gemini: Uses `responseMimeType: 'application/json'` with `responseSchema`
- Anthropic: Uses prompt engineering to request JSON format responses
- DeepSeek: Similar to OpenAI, supports JSON mode
The ResponseFormat class automatically handles the conversion to each provider's specific format, ensuring consistent behavior across all supported LLMs.
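Because the parsed payload has the same shape on every provider, downstream code can stay provider-agnostic. A minimal sketch (the helper and interface names here are ours, not part of the library, and it assumes the structured JSON arrives as the first text part, as in the Basic Structured Output example above):

```ts
interface UserProfile {
  name: string;
  age: number;
  email: string;
  interests?: string[];
}

// Extract and parse the structured JSON from a unified chat response.
function parseStructured<T>(response: {
  message: { content: Array<{ type: string; text?: string }> };
}): T {
  const text = response.message.content.find((part) => part.type === 'text')?.text ?? '';
  return JSON.parse(text) as T;
}

const profile = parseStructured<UserProfile>(openaiResponse);
console.log(profile.name, profile.email);
```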
Multi-Provider Example
import { LLMClient } from '@unified-llm/core';
// Create LLM clients for different providers
const gpt = new LLMClient({
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY,
systemPrompt: 'You are a helpful assistant that answers concisely.'
});
const claude = new LLMClient({
provider: 'anthropic',
model: 'claude-3-haiku-20240307',
apiKey: process.env.ANTHROPIC_API_KEY,
systemPrompt: 'You are a thoughtful assistant that provides detailed explanations.'
});
const gemini = new LLMClient({
provider: 'google',
model: 'gemini-2.0-flash',
apiKey: process.env.GOOGLE_API_KEY,
systemPrompt: 'You are a creative assistant that thinks outside the box.'
});
const deepseek = new LLMClient({
provider: 'deepseek',
model: 'deepseek-chat',
apiKey: process.env.DEEPSEEK_API_KEY,
systemPrompt: 'You are a technical assistant specialized in coding.'
});
// Use the unified chat interface
const request = {
messages: [{
id: '1',
role: 'user',
content: 'What are your thoughts on AI?',
createdAt: new Date()
}]
};
// Each provider will respond according to their system prompt
const gptResponse = await gpt.chat(request);
const claudeResponse = await claude.chat(request);
const geminiResponse = await gemini.chat(request);
const deepseekResponse = await deepseek.chat(request);
Multi-Provider Streaming
Streaming works consistently across all supported providers using the unified event model:
const providers = [
{ name: 'OpenAI', provider: 'openai', model: 'gpt-4o-mini' },
{ name: 'Claude', provider: 'anthropic', model: 'claude-3-haiku-20240307' },
{ name: 'Gemini', provider: 'google', model: 'gemini-2.0-flash' },
{ name: 'DeepSeek', provider: 'deepseek', model: 'deepseek-chat' }
];
for (const config of providers) {
const client = new LLMClient({
provider: config.provider as any,
model: config.model,
apiKey: process.env[`${config.provider.toUpperCase()}_API_KEY`]
});
console.log(`\n--- ${config.name} Response ---`);
const stream = await client.stream({
messages: [{
id: '1',
role: 'user',
content: 'Tell me a short story about AI.',
createdAt: new Date()
}]
});
for await (const ev of stream) {
if (ev.eventType === 'text_delta') {
process.stdout.write(ev.delta?.text ?? '');
}
}
}
Unified Response Format
All providers return responses in a consistent format, making it easy to switch between different LLMs:
Chat Response Format
{
id: "chatcmpl-Blub8EgOvVaP7c3lxzmVF4TJpVCun",
model: "gpt-4o-mini",
provider: "openai",
message: {
id: "msg_1750758679093_r9hqdhfzh",
role: "assistant",
content: [
{
type: "text",
text: "The author of this project is rhyizm."
}
],
createdAt: "2025-06-24T09:51:19.093Z"
},
usage: {
inputTokens: 72,
outputTokens: 10,
totalTokens: 82
},
finish_reason: "stop",
createdAt: "2025-06-24T09:51:18.000Z",
rawResponse: {
/* Original response from the provider (as returned by OpenAI, Anthropic, Google, DeepSeek, etc.) */
}
}
Stream Response Format
Each streaming event mirrors UnifiedChatResponse with a few additions:
{
id: "chatcmpl-example",
model: "gpt-4o-mini",
provider: "openai",
message: {
id: "msg_example",
role: "assistant",
content: [
{
type: "text",
text: "Chunk of text..."
}
],
createdAt: "2025-01-01T00:00:00.000Z"
},
// createdAt is optional in streaming events
eventType: "text_delta", // one of: start | text_delta | stop | error
outputIndex: 0,
delta: { type: "text", text: "Chunk of text..." }
}
On the final `stop` event:
- `finish_reason` may be present (e.g., "stop", "length").
- `usage` may be present when available.
- `rawResponse` contains the provider-native final result. For streaming providers that don't return a single object, it contains native stream data (e.g., an array of SSE chunks). For Gemini, it includes both `{ stream, response }`.
Key benefits:
- Consistent structure across all providers (OpenAI, Anthropic, Google, DeepSeek, Azure)
- Event-based streaming with `eventType` and `delta` for incremental text
- Unified usage tracking when the provider reports it
- Provider identification to know which service generated the response
- Raw response access on the final chunk for provider-specific features
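As a small illustration, one formatter can handle any provider's chat response because the shape is unified. This is a sketch: `describeResponse` is our own helper, and the field names are taken from the Chat Response Format shown above.

```ts
// Render any provider's chat response in a single, provider-agnostic way.
function describeResponse(res: {
  provider: string;
  model: string;
  message: { content: Array<{ type: string; text?: string }> };
  usage?: { totalTokens?: number };
}): string {
  const text = res.message.content
    .filter((part) => part.type === 'text')
    .map((part) => part.text ?? '')
    .join('');
  return `[${res.provider}/${res.model}] ${text} (${res.usage?.totalTokens ?? '?'} tokens)`;
}

// Example: console.log(describeResponse(await gpt.chat(request)));
```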
Persistent LLM Client Configuration
import { LLMClient } from '@unified-llm/core';
// Save LLM client configuration
const savedClientId = await LLMClient.save({
name: 'My AI Assistant',
provider: 'openai',
model: 'gpt-4o-mini',
systemPrompt: 'You are a helpful coding assistant.',
tags: ['development', 'coding'],
isActive: true
});
// Load saved LLM client
const client = await LLMClient.fromSaved(
savedClientId,
process.env.OPENAI_API_KEY
);
// List all saved LLM clients
const clients = await LLMClient.list({
provider: 'openai',
includeInactive: false
});
Azure OpenAI Example
Azure OpenAI requires a different initialization pattern compared to other providers:
import { AzureOpenAIProvider } from '@unified-llm/core/providers/azure';
// Azure OpenAI uses a different constructor pattern
// First parameter: Azure-specific configuration
// Second parameter: Base options (apiKey, tools, etc.)
const azureOpenAI = new AzureOpenAIProvider(
{
endpoint: process.env.AZURE_OPENAI_ENDPOINT!, // Azure resource endpoint
deployment: process.env.AZURE_OPENAI_DEPLOYMENT!, // Model deployment name
apiVersion: '2024-10-21', // Optional, defaults to 'preview'
useV1: true // Use /openai/v1 endpoint format
},
{
apiKey: process.env.AZURE_OPENAI_KEY!,
tools: [] // Optional tools
}
);
// Compare with standard LLMClient initialization:
// const client = new LLMClient({
// provider: 'openai',
// model: 'gpt-4o-mini',
// apiKey: process.env.OPENAI_API_KEY
// });
// Use it like any other provider
const response = await azureOpenAI.chat({
messages: [{
id: '1',
role: 'user',
content: 'Hello from Azure!',
createdAt: new Date()
}]
});
Ollama & OpenAI-Compatible APIs
Ollama (Local LLM) Example
Ollama provides an OpenAI-compatible API, allowing you to run large language models locally. You can use either `provider: 'ollama'` or `provider: 'openai'` with a custom `baseURL` (see "Why Use Ollama Provider?" below).
Prerequisites
- Install Ollama from ollama.ai
- Pull a model: `ollama pull llama3` (or any other model)
- Start the Ollama server (it usually runs automatically at `http://localhost:11434`)
Basic Usage
import { LLMClient } from '@unified-llm/core';
// Ollama configuration - no API key required
const ollama = new LLMClient({
provider: 'ollama',
model: 'llama3', // or 'mistral', 'codellama', etc.
baseURL: 'http://localhost:11434/v1', // Optional - this is the default
systemPrompt: 'You are a helpful assistant running locally.'
});
// Use it just like any other provider
const response = await ollama.chat({
messages: [{
id: '1',
role: 'user',
content: 'Explain quantum computing in simple terms.',
createdAt: new Date()
}]
});
console.log(response.message.content);
Remote Ollama Server
If you're running Ollama on a different machine or port:
const remoteOllama = new LLMClient({
provider: 'ollama',
model: 'llama3',
baseURL: 'http://your-server:11434/v1' // Replace with your server address
});
Available Models
Popular models you can use with Ollama:
- `llama3` - Meta's Llama 3
- `mistral` - Mistral AI's models
- `codellama` - Code-focused Llama variant
- `phi` - Microsoft's Phi models
- `gemma` - Google's Gemma models
- `mixtral` - Mixture-of-experts model
Check available models with: `ollama list`
Why Use Ollama Provider?
While Ollama is OpenAI-compatible and could work with `provider: 'openai'`, using `provider: 'ollama'` offers:
- Clearer intent - Makes it obvious you're using a local model
- No API key required - Ollama doesn't need authentication
- Future compatibility - If we add Ollama-specific features, your code won't need changes
Running Examples
You can run TypeScript examples directly with tsx:
# Add tsx to your project
npm install --save-dev tsx
# Run the example
npx tsx example.ts
Example file (example.ts):
import { LLMClient } from '@unified-llm/core';
async function runOllamaExample() {
const ollamaClient = new LLMClient({
provider: 'ollama',
model: 'llama3',
baseURL: 'http://localhost:11434/v1'
});
const response = await ollamaClient.chat({
messages: [{
id: '1',
role: 'user',
content: 'Hello, introduce yourself in Japanese.',
createdAt: new Date()
}]
});
console.log('Ollama Response:', JSON.stringify(response, null, 2));
}
runOllamaExample().catch(console.error);
Environment Variables
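The examples in this README read API keys from `process.env`. For local development you can load them from a `.env` file with a loader such as dotenv (an optional convenience, not a dependency of `@unified-llm/core`):

```ts
// Populate process.env from a local .env file before constructing any clients.
import 'dotenv/config';
```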
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GOOGLE_API_KEY=your-google-key
DEEPSEEK_API_KEY=your-deepseek-key
AZURE_OPENAI_KEY=your-azure-key # For Azure OpenAI
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com # For Azure OpenAI
AZURE_OPENAI_DEPLOYMENT=your-deployment-name # For Azure OpenAI
UNIFIED_LLM_DB_PATH=./chat-history.db # Optional custom DB path
Supported Providers
| Provider | Models | Features |
|----------|--------|----------|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5 | Function calling, streaming, vision, structured output |
| Anthropic | Claude 3.5 (Sonnet), Claude 3 (Opus, Sonnet, Haiku) | Tool use, streaming, long context, structured output |
| Google | Gemini 2.0 Flash, Gemini 1.5 Pro/Flash | Function calling, multimodal, structured output |
| DeepSeek | DeepSeek-Chat, DeepSeek-Coder | Function calling, streaming, code generation, structured output |
| Azure OpenAI | GPT-4o, GPT-4, GPT-3.5 (via Azure deployments) | Function calling, streaming, structured output |
| Ollama | Llama 3, Mistral, CodeLlama, Phi, Gemma, Mixtral, etc. | Local execution, OpenAI-compatible API, no API key required |
API Methods
Core Methods
// Main chat method - returns complete response
await client.chat(request: UnifiedChatRequest)
// Streaming responses (returns async generator)
for await (const chunk of client.stream(request)) {
console.log(chunk);
}
Note: Persistence methods are experimental and may be removed or moved to a separate package in the future.
Persistence Methods
// Save LLM client configuration
await LLMClient.save(config: LLMClientConfig)
// Load saved LLM client
await LLMClient.fromSaved(id: string, apiKey?: string)
// Get saved configuration
await LLMClient.getConfig(id: string)
// List saved LLM clients
await LLMClient.list(options?: { provider?: string, includeInactive?: boolean })
// Update configuration
await LLMClient.update(id: string, updates: Partial<LLMClientConfig>)
// Soft delete LLM client
await LLMClient.delete(id: string)
Requirements
- Node.js 20 or higher
- TypeScript 5.4.5 or higher (for development)
License
MIT - see LICENSE for details.
