
@mate-academy/llm-gateway

v7.2.1

Published

A gateway package for LLM services.

Readme

LLM Gateway

A standardized interface for interacting with various Large Language Model (LLM) providers.

Overview

The @mate-academy/llm-gateway package provides a unified way to work with different LLM providers like OpenAI and Google Generative AI. It abstracts the implementation details of each provider, allowing you to easily switch between them without changing your application code.

Installation

npm install @mate-academy/llm-gateway

Features

  • Support for multiple LLM providers (OpenAI, Google Generative AI)
  • Structured Output: Type-safe JSON responses with schema validation
  • Extensible Schema Architecture: Driver pattern with provider-specific adapters
  • LLM Metrics & Reporting: Comprehensive metrics collection with automatic cost calculation
  • Cost Tracking: Real-time cost calculation for all LLM operations with flexible pricing functions supporting text and audio tokens, including cached token discounts and reasoning tokens
  • Cached Token Support: Automatic detection and separate pricing for cached tokens (prompt caching) with provider-specific discount rates
  • Instance Caching: Intelligent SDK instance pooling with LRU eviction to prevent memory leaks
  • Standardized completion service interface
  • Standardized assistance service interface with file handling and direct chat
  • Speech-to-text transcription capabilities
  • Text-to-speech generation capabilities
  • Abort signal support for cancelling long-running operations
  • Lightweight token counting using character-based approximation (no heavyweight tokenizer dependencies)
  • Type-safe prompt templates with dynamic replacements
  • Factory pattern for easy provider selection
  • Consistent logging across all providers
  • Comprehensive testing suite with integration tests and shared test utilities
  • Full backward compatibility
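
To make the cost model above concrete, here is a rough sketch of per-token pricing with a cached-token discount. All rates and the discount factor are invented for illustration; the package computes real costs from per-model pricing functions.

```typescript
// Hypothetical pricing sketch — the rates and discount below are made
// up for illustration; the gateway uses real per-model pricing.
function estimateCost(opts: {
  inputTokens: number;   // regular (uncached) input tokens
  cachedTokens: number;  // input tokens served from prompt cache
  outputTokens: number;
  inputRatePerM: number;  // USD per 1M input tokens
  outputRatePerM: number; // USD per 1M output tokens
  cachedDiscount: number; // e.g. 0.5 → cached tokens cost 50% of the input rate
}): number {
  const { inputTokens, cachedTokens, outputTokens,
          inputRatePerM, outputRatePerM, cachedDiscount } = opts;
  const inputCost =
    (inputTokens * inputRatePerM +
     cachedTokens * inputRatePerM * cachedDiscount) / 1_000_000;
  const outputCost = (outputTokens * outputRatePerM) / 1_000_000;
  return inputCost + outputCost;
}
```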

Usage

Logger Interface

The package accepts an optional logger implementing LLMLoggerInterface. Most logging libraries are compatible (@mate-academy/logger, winston, pino, etc.). If no logger is provided, logging is disabled.

// Option 1: use @mate-academy/logger (recommended)
import { logger } from '@mate-academy/logger';

// Option 2: a minimal console-backed logger
const consoleLogger = {
  info: (msg, meta) => console.log(msg, meta),
  error: (msg, meta) => console.error(msg, meta),
  warn: (msg, meta) => console.warn(msg, meta),
  child: (context) => consoleLogger,
};

Available Metrics:

The reporter automatically collects the following metrics:

interface LLMMetrics {
  // Provider & Model Information
  provider: 'OpenAI' | 'GoogleGenerativeAI';
  purpose: 'completion' | 'assistance' | 'speech_to_text' | 'text_to_speech';
  model: string | null;
  modelConfig: Record<string, any> | null;

  // Method Context
  method: string; // e.g., 'sendMessage', 'assistInChat', 'transcribe'
  status: 'success' | 'error' | 'cancelled';

  // Token Usage
  tokens: {
    input: number;  // Regular input tokens (excludes cached tokens)
    output: number; // Output tokens (includes reasoning tokens for o1/o3 models)
    total: number;  // Total tokens (input + output)
  };

  // Cost Tracking (automatically calculated with cached token discounts)
  costs: {
    input: number;   // Cost for input tokens (text + audio), with cached token discounts applied
    output: number;  // Cost for output tokens (text + audio + reasoning tokens)
    total: number;   // Total cost
    currency: string; // 'USD'
  };
}
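
A reporter consumes these metrics objects. As a hypothetical sketch (the formatting below and the idea of flattening metrics into a log line are illustrations, not the package's API — consult LLMReporterInterface for the actual contract), a reporter might render a metrics object like this:

```typescript
// Local structural type for illustration only — mirrors a subset of
// the LLMMetrics fields documented above.
type MetricsLike = {
  provider: string;
  method: string;
  status: string;
  tokens: { total: number };
  costs: { total: number; currency: string };
};

// Flatten a metrics object into a single log line.
function formatMetricsLine(m: MetricsLike): string {
  return `${m.provider}.${m.method}: ${m.status}, ` +
    `${m.tokens.total} tokens, ${m.costs.total.toFixed(4)} ${m.costs.currency}`;
}
```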

TypeScript Behavior:

The reporterContext parameter behavior depends on your reporter configuration:

// Case 1: Reporter with defined context type - reporterContext is REQUIRED
const reporterWithContext: LLMReporterInterface<{ userId: string }> = new MyReporter();

// Case 2: Reporter without context - reporterContext is OPTIONAL
const reporterWithoutContext: LLMReporterInterface<undefined> = new SimpleReporter();

// Case 3: No reporter - reporterContext is OPTIONAL
const service = LLMServiceFactory.getCompletionService({
  provider, options
  // no reporter
});

Basic Setup

import {
  LLMProviders,
  LLMServiceFactory,
  LLMCompletionService,
  LLMAssistanceService,
  LLMSpeechToTextService,
  LLMTextToSpeechService,
  LLMRoles,
  LLMMessageContentType,
  LLMUploadFileMimeTypes,
  LLMCompletionMessage,
  LLMModel,
  LLMSchema,
  createPromptTemplate,
  LLMLoggerInterface,
  LLMReporterInterface,
} from '@mate-academy/llm-gateway';

// Define provider options
const llmProviderOptions = LLMServiceFactory.resolveProviderOptions(
  LLMProviders.OpenAI, // chosen provider
  {
    [LLMProviders.OpenAI]: {
      apiKey: 'your-openai-api-key',
      organization: 'your-organization-id', // optional
      baseURL: 'https://api.openai.com/v1', // optional
    },
    [LLMProviders.GoogleGenerativeAI]: {
      apiKey: 'your-google-ai-api-key',
    },
  },
);

// Get completion service
const completionService = LLMServiceFactory.getCompletionService({
  provider: LLMProviders.OpenAI,
  options: llmProviderOptions, // optional, but recommended - otherwise must be set later via service.setOptions()
  logger, // optional - omit for no logging
  reporter, // optional - omit for no metrics collection
});

// Get assistance service
const assistanceService = LLMServiceFactory.getAssistanceService({
  provider: LLMProviders.OpenAI,
  options: llmProviderOptions, // optional, but recommended - otherwise must be set later via service.setOptions()
  logger, // optional - omit for no logging
  reporter, // optional - omit for no metrics collection
});

// Get speech-to-text service
const speechToTextService = LLMServiceFactory.getSpeechToTextService({
  provider: LLMProviders.OpenAI,
  options: llmProviderOptions, // optional, but recommended - otherwise must be set later via service.setOptions()
  logger, // optional - omit for no logging
  reporter, // optional - omit for no metrics collection
});

// Get text-to-speech service
const textToSpeechService = LLMServiceFactory.getTextToSpeechService({
  provider: LLMProviders.OpenAI,
  options: llmProviderOptions, // optional, but recommended - otherwise must be set later via service.setOptions()
  logger, // optional - omit for no logging
  reporter, // optional - omit for no metrics collection
});

Real-world Example

Here's how you might integrate the LLM Gateway in a use case class:

import {
  LLMProviders,
  LLMServiceFactory,
  LLMCompletionService,
  LLMAssistanceService,
  LLMSpeechToTextService,
  LLMTextToSpeechService,
  LLMRoles,
  LLMMessageContentType,
} from '@mate-academy/llm-gateway';

class MyUseCase {
  private llmCompletionService: LLMCompletionService<LLMProviders>;
  private llmAssistanceService: LLMAssistanceService<LLMProviders>;
  private llmSpeechToTextService: LLMSpeechToTextService<LLMProviders>;
  private llmTextToSpeechService: LLMTextToSpeechService<LLMProviders>;

  constructor(logger, reporter, config) {
    const llmProviderOptions = LLMServiceFactory.resolveProviderOptions(
      config.llmProvider,
      {
        [LLMProviders.OpenAI]: {
          apiKey: config.openAIApiKey,
          organization: config.openAIOrgId,
          baseURL: config.openAIBaseUrl,
        },
        [LLMProviders.GoogleGenerativeAI]: {
          apiKey: config.googleAIApiKey,
        },
      },
    );

    this.llmCompletionService = LLMServiceFactory.getCompletionService({
      provider: config.llmProvider,
      options: llmProviderOptions,
      logger,
      reporter,
    });

    this.llmAssistanceService = LLMServiceFactory.getAssistanceService({
      provider: config.llmProvider,
      options: llmProviderOptions,
      logger,
      reporter,
    });

    this.llmSpeechToTextService = LLMServiceFactory.getSpeechToTextService({
      provider: config.llmProvider,
      options: llmProviderOptions,
      logger,
      reporter,
    });

    this.llmTextToSpeechService = LLMServiceFactory.getTextToSpeechService({
      provider: config.llmProvider,
      options: llmProviderOptions,
      logger,
      reporter,
    });
  }

  async processRequest(prompt) {
    // Use completion service
    const completion = await this.llmCompletionService.sendMessage({
      message: {
        role: LLMRoles.User,
        content: [
          {
            type: LLMMessageContentType.TEXT,
            text: prompt,
          }
        ],
      },
      model: this.getPreferredModel(),
    });

    return completion;
  }

  async transcribeAudio(audioPath) {
    // Use speech-to-text service
    const transcription = await this.llmSpeechToTextService.transcribe({
      pathToAudio: audioPath,
      model: this.getPreferredModel(),
    });

    return transcription;
  }

  async generateSpeech(text) {
    // Use text-to-speech service
    const speech = await this.llmTextToSpeechService.createSpeech({
      text: text,
      model: this.getPreferredModel(),
      speechOptions: {
        voice: 'alloy', // OpenAI voice option
        response_format: 'mp3',
      },
    });

    return speech;
  }

  private getPreferredModel() {
    // Get the appropriate model from the service's available models
    const models = Object.values(this.llmCompletionService.models);
    return models[0]; // Use the first available model
  }
}

Usage Examples

Basic Text Completion

const result = await completionService.sendMessage({
  message: {
    role: LLMRoles.User,
    content: [
      {
        type: LLMMessageContentType.TEXT,
        text: 'Hello, how are you?',
      }
    ],
  },
  model: preferredModel,
  reporterContext: { // required if reporter is configured with context type
    userId: 12345,
    feature: 'chat',
    environment: 'production',
  },
});

console.log(result.text); // AI response

Chat-based Assistance

// Create a chat session
const chat = await assistanceService.createChat({
  model: preferredModel,
  instructions: 'You are a helpful coding assistant.',
});

// Send messages directly to the chat
const result = await assistanceService.assistInChat({
  model: preferredModel,
  message: {
    role: LLMRoles.User,
    content: [{
      type: LLMMessageContentType.TEXT,
      text: 'Help me understand React hooks',
    }],
  },
  chatId: chat.chatId,
  reporterContext: { // required if reporter is configured with context type
    userId: 12345,
    feature: 'coding_assistance',
    environment: 'production',
  },
});

console.log(result.text); // Assistant response

Direct Image Analysis

// Send image directly in message content (OpenAI)
const result = await assistanceService.assistInChat({
  model: preferredModel,
  message: {
    role: LLMRoles.User,
    content: [
      {
        type: LLMMessageContentType.TEXT,
        text: 'What do you see in this image?',
      },
      {
        type: LLMMessageContentType.IMAGE_URL,
        image_url: {
          url: 'https://example.com/image.png',
          detail: 'high',
        },
      },
    ],
  },
  chatId: chat.chatId,
});

console.log(result.text); // Image analysis response

Structured Output

The LLM Gateway supports structured output, allowing you to request type-safe, validated JSON responses from LLM providers. Built on Zod v4 with native JSON Schema generation for optimal performance.

Basic Structured Output

import { LLMSchema } from '@mate-academy/llm-gateway';

// Define your output schema
const personSchema = LLMSchema.object({
  name: LLMSchema.string(),
  age: LLMSchema.number().min(1),
  email: LLMSchema.string().email(),
  isActive: LLMSchema.boolean(),
});

// Request structured output
const result = await completionService.sendMessage({
  message: {
    role: LLMRoles.User,
    content: [{
      type: LLMMessageContentType.TEXT,
      text: 'Create a person profile for John Smith, 30 years old, email [email protected]'
    }]
  },
  model: preferredModel,
  schema: personSchema  // Add schema for structured output
});

// Type-safe access to structured data
if (result.data) {
  console.log(result.data.name);        // string
  console.log(result.data.age);         // number
  console.log(result.data.email);       // string
  console.log(result.data.isActive);    // boolean
}

// Always available as text fallback
console.log(result.text);

Complex Schema Example

const analysisSchema = LLMSchema.object({
  sentiment: LLMSchema.enum(['positive', 'negative', 'neutral']),
  confidence: LLMSchema.number().min(0).max(1),
  keywords: LLMSchema.array(LLMSchema.string()),
  summary: LLMSchema.string(),
  metadata: LLMSchema.object({
    processedAt: LLMSchema.string(),
    modelVersion: LLMSchema.string(),
  }),
});

const result = await completionService.sendMessage({
  message: {
    role: LLMRoles.User,
    content: [{
      type: LLMMessageContentType.TEXT,
      text: 'Analyze this text: "I love this new feature!"'
    }]
  },
  model: preferredModel,
  schema: analysisSchema
});

// Fully type-safe access
if (result.data) {
  console.log(result.data.sentiment);         // 'positive' | 'negative' | 'neutral'
  console.log(result.data.confidence);        // number (0-1)
  console.log(result.data.keywords);          // string[]
  console.log(result.data.summary);           // string
  console.log(result.data.metadata.processedAt); // string
}

Schema API Reference

The LLMSchema builder provides a fluent API for defining output schemas:

Basic Types:

  • LLMSchema.string() - String values
  • LLMSchema.number() - Numeric values
  • LLMSchema.boolean() - Boolean values
  • LLMSchema.array(itemSchema) - Arrays of items
  • LLMSchema.object({ ... }) - Object with specified properties
  • LLMSchema.enum(['option1', 'option2']) - Enumerated values
  • LLMSchema.literal('exact_value') - Exact literal values

Modifiers:

  • .optional() - Makes field optional and nullable (compatible with all providers)
  • .nullable() - Allows null values
  • .default(value) - Sets default value
  • .describe(text) - Adds description

String Modifiers:

  • .min(length) - Minimum string length
  • .max(length) - Maximum string length
  • .email() - Email validation
  • .url() - URL validation
  • .uuid() - UUID validation

Number Modifiers:

  • .int() - Integer values only
  • .positive() - Positive numbers (Note: Use .min(1) for better OpenAI compatibility)
  • .negative() - Negative numbers
  • .min(value) - Minimum value
  • .max(value) - Maximum value

Error Handling

const result = await completionService.sendMessage({
  message: { /* ... */ },
  model: preferredModel,
  schema: mySchema
});

if (result.data) {
  // Successfully parsed structured output
  console.log('Structured data:', result.data);
} else if (result.parseError) {
  // Parsing failed, but text is still available
  console.error('Parse error:', result.parseError);
  console.log('Raw text:', result.text);
} else if (result.error) {
  // Request failed entirely
  console.error('Request error:', result.error);
}

Provider Support

  • OpenAI: Uses native response_format with json_schema for optimal performance
  • Google Generative AI: Uses native responseSchema parameter for structured output
  • Backward Compatibility: All existing code continues to work unchanged

Provider Compatibility Notes

OpenAI Structured Output:

  • The .optional() modifier now automatically converts to required+nullable for OpenAI compatibility
  • Use .min(1) instead of .positive() for better compatibility
  • Nested objects and arrays are fully supported

Google Generative AI:

  • Supports all schema types and modifiers
  • Handles optional fields natively
  • More flexible with schema variations

Best Practices for Cross-Provider Compatibility:

  • Use .optional() for optional fields (automatically handled for all providers)
  • Use .min() and .max() instead of .positive(), .negative()
  • Both providers now have unified schema handling

Schema Architecture

The package uses a driver pattern for schema conversion, automatically adapting schemas to each provider's specific format while maintaining a unified API.

Schema adapters are automatically registered when providers are imported, so no manual configuration is needed. For detailed architecture information and extending with new providers, see the Developer Guide.

Model Configuration

Each model comes with default configuration values that can be customized for your specific needs.

Accessing and Customizing Models

// Get available models from the service
const models = completionService.models;

// Get a specific model
const model = models[OpenAIModelNames.GPT_4_1];

// Customize model configuration
const customModel = {
  ...model,
  config: {
    ...model.config,
    temperature: 0.8,  // Override temperature (0-2, controls randomness)
    top_p: 0.9,       // Override top_p (nucleus sampling)
  }
};

// Use customized model in requests
const result = await completionService.sendMessage({
  message: {
    role: LLMRoles.User,
    content: [{ type: LLMMessageContentType.TEXT, text: 'Hello!' }]
  },
  model: customModel,
});

Model-Specific Configuration

GPT-5 Models have special configuration options:

const gpt5Model = models[OpenAIModelNames.GPT_5];

const customGPT5Model = {
  ...gpt5Model,
  config: {
    ...gpt5Model.config,
    temperature: 1,  // Note: GPT-5 only supports temperature=1
    reasoning_effort: 'high',  // 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
    verbosity: 'low',  // 'low' | 'medium' | 'high'
  }
};

// GPT-5.1 and GPT-5.2 support 'none' reasoning_effort
const gpt51Model = models[OpenAIModelNames.GPT_5_1];

const customGPT51Model = {
  ...gpt51Model,
  config: {
    ...gpt51Model.config,
    temperature: 1,  // Note: GPT-5.1 only supports temperature=1
    reasoning_effort: 'none',  // 'none' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
    verbosity: 'low',  // 'low' | 'medium' | 'high'
  }
};

Note: All GPT-5 models require temperature: 1; it cannot be changed. GPT-5, GPT-5-MINI, and GPT-5-NANO support reasoning_effort values of 'minimal', 'low', 'medium', 'high', or 'xhigh'. GPT-5.1, GPT-5.2, and GPT-5.4 additionally support 'none'.

Gemini 3.x thinking models (gemini-3-flash-preview, gemini-3.1-pro-preview) have special configuration options:

import { ThinkingLevel } from '@google/genai';

const gemini31Model = models[GoogleGenerativeAIModelNames.GEMINI_3_1_PRO_PREVIEW];

const customGemini31Model = {
  ...gemini31Model,
  config: {
    ...gemini31Model.config,
    temperature: 1,  // Note: Gemini 3.x thinking models only support temperature=1
    thinkingConfig: {
      thinkingLevel: ThinkingLevel.HIGH,  // LOW | HIGH | THINKING_LEVEL_UNSPECIFIED
    },
  }
};

Note: Gemini 3.x thinking models require temperature: 1; it cannot be changed. The thinkingConfig setting controls the model's reasoning depth.

File-based Assistance

// Upload files for context
const uploadedFile = await assistanceService.uploadFile({
  name: 'document.txt',
  path: '/path/to/document.txt',
  mimeType: LLMUploadFileMimeTypes.PLAIN_TEXT,
});

// Create file storage with instructions
const storage = await assistanceService.createFileStorage({
  uploadedFiles: [uploadedFile],
  model: preferredModel,
  instructions: 'Help me analyze this document',  // Optional context for the storage
});

// Create chat with file context
const chat = await assistanceService.createChat({
  storageId: storage.storageId,
  model: preferredModel,
  history: [],
  files: [uploadedFile],
  instructions: 'Answer questions about the uploaded document',  // Chat instructions
});

// Send messages in the chat
const result = await assistanceService.assistInChat({
  model: preferredModel,
  message: {
    role: LLMRoles.User,
    content: [{
      type: LLMMessageContentType.TEXT,
      text: 'What are the key points in this document?',
    }],
  },
  chatId: chat.chatId,
  storageId: storage.storageId,
});

Speech-to-Text Transcription

const transcription = await speechToTextService.transcribe({
  pathToAudio: '/path/to/audio.mp3',
  mimeType: LLMUploadFileMimeTypes.AUDIO_MP3, // Optional, but recommended
  model: preferredModel,
  instructions: 'Transcribe the audio file', // Optional: custom prompt for transcription context
  reporterContext: { // required if reporter is configured with context type
    userId: 12345,
    feature: 'audio_transcription',
    environment: 'production',
  },
});

console.log(transcription.text); // Transcribed text

Text-to-Speech Generation

const speech = await textToSpeechService.createSpeech({
  text: 'Hello, this will be converted to speech',
  model: preferredModel,
  speechOptions: {
    voice: 'alloy', // OpenAI voice option
    response_format: 'mp3',
  },
});

// Save audio buffer to file
fs.writeFileSync('output.mp3', speech.audio);

Token Counting for Context Management

The LLM Gateway provides token counting functionality to help you manage context and optimize API usage. Token counting uses a lightweight character-based approximation (approximately 4 characters per token) to avoid heavyweight tokenizer dependencies.

Note: Token counts are approximate and may differ from actual provider token usage. For precise token counts, refer to the usage metrics returned in API responses.
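
The character-based approximation can be sketched in one line (the exact rounding the package applies is an assumption):

```typescript
// ~4 characters per token; Math.ceil rounding is an assumption —
// the package may round differently.
const approxTokenCount = (text: string): number =>
  Math.ceil(text.length / 4);

console.log(approxTokenCount('Hello, how are you?')); // 19 chars
```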

// Count tokens in messages before sending to optimize context usage
const messages = [
  {
    role: LLMRoles.User,
    content: [
      {
        type: LLMMessageContentType.TEXT,
        text: 'Analyze this document and provide insights.',
      },
      {
        type: LLMMessageContentType.TEXT,
        text: longDocumentContent, // Large text content
      },
      // ... potentially more content including images
    ],
  }
];

const tokenCount = await assistanceService.countTokens(messages, preferredModel);

console.log(`Total tokens: ${tokenCount}`);

// Make intelligent decisions based on token count
if (tokenCount > 50000) {
  // Use file upload approach for large content
  const uploadedFile = await assistanceService.uploadFile({
    name: 'document.txt',
    path: '/path/to/document.txt',
    mimeType: LLMUploadFileMimeTypes.PLAIN_TEXT,
  });

  const storage = await assistanceService.createFileStorage({
    uploadedFiles: [uploadedFile],
    model: preferredModel,
  });

  const chat = await assistanceService.createChat({
    storageId: storage.storageId,
    model: preferredModel,
    instructions: 'Analyze the uploaded document',
  });
} else {
  // Send content directly in messages
  const result = await assistanceService.assistInChat({
    model: preferredModel,
    message: messages[0],
    chatId: existingChatId,
  });
}

Abort Signal Support

All operations support abort signals for cancellation:

const controller = new AbortController();

// Cancel after 30 seconds
setTimeout(() => controller.abort(), 30000);

const result = await completionService.sendMessage({
  message: {
    role: LLMRoles.User,
    content: [{ type: LLMMessageContentType.TEXT, text: 'Long request...' }],
  },
  model: preferredModel,
  abortSignal: controller.signal,
});

if (result.error) {
  console.error('Operation failed or was cancelled:', result.error);
} else {
  console.log('Success:', result.text);
}

API Reference

LLMProviders

Enum of supported LLM providers:

enum LLMProviders {
  OpenAI = 'OpenAI',
  GoogleGenerativeAI = 'GoogleGenerativeAI',
  // other providers may be added in the future
}

LLMPurposes

Enum of supported service purposes:

enum LLMPurposes {
  Completion = 'completion',
  Assistance = 'assistance',
  SpeechToText = 'speech_to_text',
  TextToSpeech = 'text_to_speech',
}

LLMServiceFactory

Factory class for creating LLM service instances.

Methods

  • resolveProviderOptions(provider, optionsMap): Resolves the options for the specified provider
  • getCompletionService({ provider, options, logger, reporter }): Creates a completion service instance
  • getAssistanceService({ provider, options, logger, reporter }): Creates an assistance service instance
  • getSpeechToTextService({ provider, options, logger, reporter }): Creates a speech-to-text service instance
  • getTextToSpeechService({ provider, options, logger, reporter }): Creates a text-to-speech service instance

Service Instance Methods

All service instances provide the following common methods:

  • setOptions(options): Update the provider options (e.g., API key, base URL) for an existing service instance
  • getOptions(): Retrieve the current provider options
  • clearInstanceCache(): Clear the internal SDK instance cache (rarely needed)

Instance Caching:

The gateway implements intelligent SDK instance caching to prevent memory leaks when switching between different credentials:

// Create service with initial credentials
const service = LLMServiceFactory.getCompletionService({
  provider: LLMProviders.OpenAI,
  options: { apiKey: 'key-1' },
});

// First call creates and caches SDK instance for 'key-1'
await service.sendMessage({ /* ... */ });

// Update to different credentials
service.setOptions({ apiKey: 'key-2' });

// Second call creates and caches SDK instance for 'key-2'
await service.sendMessage({ /* ... */ });

// Switch back to original credentials
service.setOptions({ apiKey: 'key-1' });

// Reuses cached SDK instance for 'key-1' (no recreation needed)
await service.sendMessage({ /* ... */ });

The cache uses LRU (Least Recently Used) eviction and maintains up to 10 instances per service. This is particularly useful when:

  • Using the same service with multiple products/features that have different API keys
  • Implementing multi-tenant systems where credentials change frequently
  • Running tests with different credential sets
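
The eviction policy described above can be illustrated with a minimal LRU cache built on Map insertion order. This is illustrative only; the package's internal cache may be implemented differently.

```typescript
// Minimal LRU cache sketch illustrating the policy described above
// (up to 10 instances per service). Not the package's actual code.
class LRUCache<K, V> {
  private map = new Map<K, V>();

  constructor(private maxSize = 10) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Re-insert to mark the entry as most recently used
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.maxSize) {
      // Evict the least recently used entry (first in insertion order)
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }
}
```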

LLMCompletionService

Interface for text completion services.

Methods

  • sendMessage(options): Send a message to the LLM and get a completion response
    • message: The message to send
    • model: The LLM model to use (with optional config overrides)
    • history: Optional conversation history
    • instructions: Optional system instructions to guide the model's behavior
    • schema: Optional schema for structured output
    • reporterContext: Context data passed to reporter for metrics tracking (required if reporter is configured with a defined context type)
    • abortSignal: Optional abort signal for cancellation

LLMAssistanceService

Interface for chat/assistance services with file handling capabilities.

Methods

  • uploadFile(fileOptions): Upload a file to the LLM service
  • getFile(path): Retrieve the file information for a previously uploaded file by its path
  • deleteFile(fileId): Delete a file from the LLM service
  • createFileStorage(options): Create a file storage for organizing files
    • uploadedFiles: Array of previously uploaded files
    • model: The LLM model to use
    • instructions: Optional instructions for how to use the stored files
  • deleteFileStorage(fileStorageId): Delete a file storage
  • createChat(options): Create a new chat/conversation
    • model: The LLM model to use
    • instructions: Optional initial instructions for the conversation
    • history: Optional conversation history
    • files: Optional array of uploaded files
    • storageId: Optional storage ID to use
  • deleteChat(chatId): Delete a chat
  • assistInChat(options): Send a message in an existing chat and get a response
    • model: The LLM model to use (required)
    • message: The message to send
    • chatId: The ID of the chat to send the message to
    • storageId: Optional storage ID to use for file context
    • schema: Optional schema for structured output (supports type-safe JSON responses)
    • reporterContext: Context data passed to reporter for metrics tracking (required if reporter is configured with a defined context type)
  • countTokens(messages, model): Count tokens in messages for context management

LLMSpeechToTextService

Interface for converting speech audio to text.

Methods

  • transcribe(options): Convert audio file to text transcription
    • pathToAudio: Path to the audio file
    • mimeType: Optional MIME type of the audio file
    • model: The LLM model to use
    • instructions: Optional custom transcription prompt or context
    • reporterContext: Context data passed to reporter for metrics tracking (required if reporter is configured with a defined context type)

LLMTextToSpeechService

Interface for converting text to speech audio.

Methods

  • createSpeech(options): Convert text to speech audio file
    • text: Text to convert to speech
    • model: The LLM model to use
    • instructions: Optional instructions for speech generation
    • speechOptions: Provider-specific speech configuration
    • reporterContext: Context data passed to reporter for metrics tracking (required if reporter is configured with a defined context type)

Prompt Builder

The LLM Gateway includes a powerful prompt template system that provides type-safe string templates with dynamic replacements and conditional sections. This allows you to create reusable prompt templates with placeholders that can be replaced with actual values at runtime.

Features

  • Type Safety: Automatic extraction and validation of placeholder keys from template strings
  • Dynamic Replacements: Replace placeholders like {{variableName}} with actual values
  • Conditional Sections: Show or hide content based on variable values using {{#condition}}...{{/condition}} syntax
  • Nested Conditionals: Support for nested conditional sections for complex logic
  • Template Reusability: Create templates once and use them multiple times with different values
  • Zero Runtime Dependencies: Pure TypeScript utility functions

Basic Usage

import { createPromptTemplate } from '@mate-academy/llm-gateway';

// Create a prompt template with placeholders
const welcomePrompt = createPromptTemplate(`
  Generate a welcome message for a user who has just started their auto tech check attempt on {{topicTitle}}.
  The user's experience level is {{experienceLevel}} and they prefer {{learningStyle}} learning.
`);

// Use the template with actual values
const instruction = welcomePrompt({
  topicTitle: 'JavaScript Basics',
  experienceLevel: 'beginner',
  learningStyle: 'hands-on',
});

// Result: "Generate a welcome message for a user who has just started their auto tech check attempt on JavaScript Basics. The user's experience level is beginner and they prefer hands-on learning."
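
Placeholder substitution of this kind can be sketched with a regex replace. This is an illustrative reimplementation, not the package's actual createPromptTemplate:

```typescript
// Illustrative sketch of {{placeholder}} substitution — not the
// package's actual implementation, which also adds type safety.
function renderTemplate(
  template: string,
  values: Record<string, string>,
): string {
  // Replace each {{key}} with its value; unknown keys become ''.
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => values[key] ?? '');
}

console.log(renderTemplate('Hello, {{name}}!', { name: 'Ada' }));
```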

Conditional Sections

Conditional sections allow you to show or hide parts of the template based on variable values:

// Template with conditional sections
const coursePrompt = createPromptTemplate(`
  Generate a lesson plan for {{topicTitle}}.
  {{#hasPrerequisites}}
  Prerequisites: {{prerequisites}}
  {{/hasPrerequisites}}

  {{#includeExercises}}
  Include practical exercises and code examples.
  {{/includeExercises}}

  {{#difficultyLevel}}
  Adjust content for {{difficultyLevel}} level students.
  {{/difficultyLevel}}
`);

// Usage with all sections visible
const fullLesson = coursePrompt({
  topicTitle: 'React Hooks',
  hasPrerequisites: true,
  prerequisites: 'Basic React knowledge',
  includeExercises: true,
  difficultyLevel: 'intermediate'
});

// Usage with some sections hidden
const basicLesson = coursePrompt({
  topicTitle: 'React Hooks',
  hasPrerequisites: false,
  prerequisites: '',
  includeExercises: false,
  difficultyLevel: 'beginner'
});

Negative Conditional Sections

Negative conditions allow you to show content when a value is falsy:

const feedbackPrompt = createPromptTemplate(`
  Analyze the {{language}} code submission.
  {{#passed}}
  Great job! The tests passed successfully.
  {{/passed}}
  {{#!passed}}
  The tests did not pass. Please review the following issues:
  {{errors}}
  {{/passed}}

  {{#!skipSuggestions}}
  Here are some suggestions for improvement:
  - Consider refactoring for better readability
  - Add more comprehensive error handling
  {{/skipSuggestions}}
`);

// When tests pass
const successResult = feedbackPrompt({
  language: 'JavaScript',
  passed: true,
  errors: '',
  skipSuggestions: false
});
// Result: Shows success message and suggestions

// When tests fail
const failureResult = feedbackPrompt({
  language: 'Python',
  passed: false,
  errors: 'TypeError on line 15',
  skipSuggestions: false
});
// Result: Shows failure message with errors and suggestions

Nested Conditional Sections

You can nest conditional sections for more complex logic:

const reviewPrompt = createPromptTemplate(`
  Review the {{language}} code for {{focusArea}}.
  {{#includeMetrics}}
  Provide performance metrics.
  {{#includeDetailed}}
  Include detailed benchmark analysis and memory usage patterns.
  {{/includeDetailed}}
  {{/includeMetrics}}

  {{#suggestImprovements}}
  Suggest specific improvements for better {{improvementFocus}}.
  {{/suggestImprovements}}
`);

const detailedReview = reviewPrompt({
  language: 'TypeScript',
  focusArea: 'performance',
  includeMetrics: true,
  includeDetailed: true,
  suggestImprovements: true,
  improvementFocus: 'scalability'
});

Advanced Usage

// Template without placeholders (no parameters required)
const staticPrompt = createPromptTemplate(`
  Please analyze the provided code and suggest improvements.
`);
const staticInstruction = staticPrompt(); // No parameters needed

// Template with multiple placeholders and conditional sections
const codeReviewPrompt = createPromptTemplate(`
  Review the {{language}} code below for {{focusArea}}.
  {{#includeCriteria}}
  Pay special attention to {{criteria}} and provide {{outputFormat}} feedback.
  {{/includeCriteria}}

  {{#includeCode}}
  Code:
  {{codeSnippet}}
  {{/includeCode}}

  {{#provideExamples}}
  Include examples of best practices for {{language}}.
  {{/provideExamples}}
`);

const reviewInstruction = codeReviewPrompt({
  language: 'TypeScript',
  focusArea: 'performance optimization',
  includeCriteria: true,
  criteria: 'algorithmic efficiency and memory usage',
  outputFormat: 'structured',
  includeCode: true,
  codeSnippet: 'function example() { /* code here */ }',
  provideExamples: false,
});

Integration with LLM Services

import {
  createPromptTemplate,
  LLMServiceFactory,
  LLMProviders,
  LLMRoles,
  LLMMessageContentType,
} from '@mate-academy/llm-gateway';

// Define reusable prompt templates
const PROMPTS = {
  codeExplanation: createPromptTemplate(`
    Explain the following {{language}} code in simple terms for a {{level}} developer:
    {{#includeContext}}
    Context: {{context}}
    {{/includeContext}}

    {{code}}

    {{#includeExamples}}
    Provide practical examples of how this code would be used.
    {{/includeExamples}}
  `),

  bugFinding: createPromptTemplate(`
    Find potential bugs in this {{language}} code and suggest fixes:
    {{#focusArea}}
    Focus specifically on {{focusArea}} issues.
    {{/focusArea}}

    {{code}}

    {{#includeSeverity}}
    Rate the severity of each issue from 1-5.
    {{/includeSeverity}}
  `),

  optimization: createPromptTemplate(`
    Optimize the following code for {{optimizationType}}:
    {{code}}

    {{#includeMetrics}}
    Provide before/after performance metrics.
    {{/includeMetrics}}

    {{#includeAlternatives}}
    Suggest alternative approaches and explain trade-offs.
    {{/includeAlternatives}}
  `),
};

// Use with a completion service (`completionService` and `preferredModel`
// are assumed to be created elsewhere, e.g. via LLMServiceFactory)
async function explainCode(code: string, language: string, level: string, includeExamples = false) {
  const prompt = PROMPTS.codeExplanation({
    code,
    language,
    level,
    includeContext: false,
    includeExamples,
  });

  return await completionService.sendMessage({
    message: {
      role: LLMRoles.User,
      content: [{ type: LLMMessageContentType.TEXT, text: prompt }],
    },
    model: preferredModel,
  });
}

Type Safety Features

The prompt builder provides compile-time type checking for template placeholders and conditional sections:

// This will show TypeScript errors for missing or incorrect parameters
const template = createPromptTemplate(`
  Hello {{name}}, welcome to {{platform}}!
  {{#showBonus}}You have a bonus: {{bonusAmount}}{{/showBonus}}
`);

// ✅ Correct usage with all required variables
template({
  name: 'John',
  platform: 'LLM Gateway',
  showBonus: true,
  bonusAmount: '$50'
});

// ✅ Correct usage with conditional section hidden
template({
  name: 'John',
  platform: 'LLM Gateway',
  showBonus: false
  // bonusAmount is not required when showBonus is false
});

// ❌ TypeScript error: missing required parameter 'platform'
template({ name: 'John', showBonus: false });

// ❌ TypeScript error: unknown parameter 'age'
template({
  name: 'John',
  platform: 'LLM Gateway',
  showBonus: false,
  age: 25
});

API Reference

createPromptTemplate<T extends string>(template: T)

Creates a prompt template function from a template string.

  • Parameters:
    • template: T - The template string with placeholders and conditional sections
  • Returns: A function that accepts replacement values and returns the processed string
  • Type Safety: Automatically extracts placeholder names and conditional section names from the template string for type checking

Template Syntax

Variable Placeholders:

  • Placeholders must be enclosed in double curly braces: {{variableName}}
  • Whitespace around variable names is ignored: {{ variableName }} works the same as {{variableName}}
  • Variable names can contain letters, numbers, and underscores
  • Replacement values can be strings, numbers, or booleans (automatically converted to strings)

Conditional Sections:

  • Positive conditional sections use the syntax: {{#conditionName}}content{{/conditionName}}
    • The section content is included only if the condition variable is truthy
  • Negative conditional sections use the syntax: {{#!conditionName}}content{{/conditionName}}
    • The section content is included only if the condition variable is falsy
  • Truthy values: true, non-empty strings, non-zero numbers
  • Falsy values: false, empty strings, 0, null, undefined
  • Conditional variables are optional in the type system when used only as conditions
  • Variables used both as conditions and values are required in the type system

Nested Conditionals:

  • Conditional sections can be nested for complex logic
  • Inner sections are processed only if outer sections are visible
  • Variables inside nested sections follow the same truthy/falsy rules
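For intuition, the substitution rules above can be sketched as a tiny renderer. This is illustrative only — it has none of the package's compile-time type checking, and the real implementation may differ:

```typescript
// Minimal sketch of the template syntax described above (illustrative only).
type Values = Record<string, string | number | boolean | null | undefined>;

function render(template: string, values: Values): string {
  let out = template;
  // Resolve conditional sections until none remain. Because each pass strips
  // a section's delimiters, nested sections are handled on later passes.
  const section = /\{\{#(!?)(\w+)\}\}([\s\S]*?)\{\{\/\2\}\}/;
  let m: RegExpExecArray | null;
  while ((m = section.exec(out)) !== null) {
    const [whole, negate, name, body] = m;
    // false, '', 0, null, and undefined all hide a positive section
    const truthy = Boolean(values[name]);
    const keep = negate === '!' ? !truthy : truthy;
    out = out.slice(0, m.index) + (keep ? body : '') + out.slice(m.index + whole.length);
  }
  // Replace {{ variable }} placeholders, ignoring surrounding whitespace.
  return out.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, name: string) => String(values[name] ?? ''));
}

const greeting =
  'Hi {{ name }}.{{#vip}} Thanks for subscribing!{{/vip}}{{#!vip}} Consider upgrading.{{/vip}}';

console.log(render(greeting, { name: 'Ann', vip: true }));  // "Hi Ann. Thanks for subscribing!"
console.log(render(greeting, { name: 'Bo', vip: false }));  // "Hi Bo. Consider upgrading."
```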

Supported Providers

OpenAI

Supports all service types: completion, assistance, speech-to-text, and text-to-speech APIs. For more information, see OpenAI API documentation.

Available Models:

| Model | Purpose | Max Input | Max Output | Notes |
|-------|---------|-----------|------------|-------|
| gpt-4.1 | Completion, Assistance | 1M+ | 32K | Extended context window |
| gpt-4.1-mini | Completion, Assistance | 1M+ | 32K | Cost-effective extended context |
| gpt-4.1-nano | Completion, Assistance | 1M+ | 32K | Fastest, most cost-efficient GPT-4.1 |
| gpt-5 | Completion, Assistance | 400K | 128K | Advanced reasoning, requires temperature=1 |
| gpt-5.1 | Completion, Assistance | 400K | 128K | Advanced reasoning with 'none' reasoning_effort support, requires temperature=1 |
| gpt-5.2 | Completion, Assistance | 400K | 128K | Advanced reasoning with 'none' reasoning_effort support, requires temperature=1 |
| gpt-5.4 | Completion, Assistance | 1M+ | 128K | Flagship model, supports 'none'/'xhigh' reasoning_effort, requires temperature=1 |
| gpt-5-mini | Completion, Assistance | 400K | 128K | Smaller GPT-5 variant, requires temperature=1 |
| gpt-5-nano | Completion, Assistance | 400K | 128K | Fastest GPT-5 variant, requires temperature=1 |
| gpt-4o-transcribe | Speech-to-Text | 16K | 2K | Optimized for transcription |
| gpt-4o-mini-transcribe | Speech-to-Text | 16K | 2K | Cost-effective transcription |
| tts-1 | Text-to-Speech | 2K | - | Standard TTS model |
| gpt-4o-mini-tts | Text-to-Speech | 2K | - | Alternative TTS model |

Google Generative AI

Supports completion, assistance, speech-to-text, and text-to-speech APIs through Google's Generative AI models. For more information, see Google Generative AI documentation.

Available Models:

| Model | Purpose | Max Input | Max Output | Notes |
|-------|---------|-----------|------------|-------|
| gemini-2.5-flash | Completion, Assistance, Speech-to-Text | 1M+ | 65K | Fast, cost-effective, supports caching for long context |
| gemini-2.5-flash-lite | Completion, Assistance, Speech-to-Text | 1M+ | 65K | Fastest, most cost-effective, supports caching |
| gemini-2.5-pro | Completion, Assistance, Speech-to-Text | 1M+ | 65K | Advanced reasoning, supports caching for long context |
| gemini-3-flash-preview | Completion, Assistance, Speech-to-Text | 1M+ | 65K | Thinking model, requires temperature=1, configurable thinking levels |
| gemini-3.1-flash-lite-preview | Completion, Assistance, Speech-to-Text | 1M+ | 65K | Most cost-efficient Gemini 3.x model |
| gemini-3.1-pro-preview | Completion, Assistance, Speech-to-Text | 1M+ | 65K | Advanced reasoning with thinking capabilities, requires temperature=1 |
| gemini-2.5-flash-preview-tts | Text-to-Speech | 8K | 16K | Preview TTS model with flash performance |
| gemini-2.5-pro-preview-tts | Text-to-Speech | 8K | 16K | Preview TTS model with pro capabilities |

Note: Google Generative AI models support context caching for content longer than 32,768 tokens, which can significantly reduce costs for repeated queries on the same large context.

Gemini 3.x Thinking Model Configuration:

  • Applies to: gemini-3-flash-preview, gemini-3.1-pro-preview
  • Requires temperature: 1 (cannot be changed)
  • Supports thinkingConfig with thinkingLevel property (LOW, HIGH, THINKING_LEVEL_UNSPECIFIED)
  • Default thinking level is LOW

Testing

The LLM Gateway includes a comprehensive testing suite with both unit and integration tests.

Test Structure

The package includes:

  • Unit tests for core functionality (src/tests/unit/)
    • Schema builder and validation tests
    • Prompt template builder tests
    • Utility function tests
  • Integration tests for all service types (src/tests/integration/)
    • Real API integration tests with live providers
    • Service-specific functionality tests
    • Cross-provider compatibility tests
  • Shared test utilities (src/tests/integration/shared/)
    • Reusable test patterns for common functionality
    • Service initialization tests
    • Logger integration tests
    • Abort signal handling tests
    • Reporter integration tests
    • Instance caching tests
  • Test helpers for common testing utilities (src/tests/helpers.ts)
  • Mock implementations for testing environments
  • Audio test files for speech-to-text testing

Running Tests

# Run all tests with logging
npm test

# Run tests silently (without logs)
npm run test:silent

# Run only integration tests with verbose output
npm run test:integration

# Run specific test file
npm test -- LLMSchema.test.ts

# Run tests matching a pattern
npm test -- --testNamePattern="should handle basic schemas"

# Run interactive prepublish test selector
npm run prepublish-tests

# Run with specific environment variables
ENABLE_LOGGING=true npm test

Test Configuration

Integration tests require API keys for the respective providers:

# Required environment variables for OpenAI tests
OPENAI_SECRET_API_KEY=your_openai_api_key
OPENAI_ORG_ID=your_organization_id  # optional
OPENAI_BASE_URL=https://api.openai.com/v1  # optional

# Required environment variables for Google AI tests
GOOGLE_GENERATIVE_AI_API_KEY=your_google_ai_api_key

Test Coverage

The integration tests cover:

  1. Common Service Tests (via shared utilities):

    • Service initialization with and without options
    • Instance caching and credential switching
    • Logger integration and error handling
    • Reporter metrics collection (success/abort/fail)
    • Abort signal handling for all operations
  2. Completion Service Tests:

    • Basic message sending and responses
    • Message history handling
    • Custom instructions
    • Image analysis
    • Structured output with schemas
    • Error handling for invalid inputs
    • Model-specific functionality
  3. Assistance Service Tests:

    • File upload and storage creation
    • Chat creation and management
    • Direct chat interactions
    • File-based conversations
    • Direct image handling in messages
    • Multiple abort scenarios (file upload, storage, chat)
  4. Speech-to-Text Service Tests:

    • Audio file transcription
    • Multiple audio format support (MP3, WAV, WEBM, OGG)
    • Error handling for invalid files
    • Model-specific transcription quality
  5. Text-to-Speech Service Tests:

    • Text-to-audio conversion
    • Voice selection options
    • Audio format configuration
    • Instructions for speech generation
    • Error handling

Test Helpers

The package provides several test utilities:

resolveTestConfig Function

The resolveTestConfig function creates standardized test configurations for all supported providers based on the service purpose:

import { resolveTestConfig } from '@mate-academy/llm-gateway/tests/helpers';

// Get test config for completion services
const testConfig = resolveTestConfig(LLMPurposes.Completion);

// testConfig contains configuration for all providers:
// {
//   [LLMProviders.OpenAI]: {
//     provider: LLMProviders.OpenAI,
//     availableModels: {...}, // Models available for completion
//     clientOptions: { apiKey: process.env.OPENAI_SECRET_API_KEY, ... },
//     requireCredentials: () => void, // Throws if credentials missing
//     isEnabled: true
//   },
//   [LLMProviders.GoogleGenerativeAI]: {
//     provider: LLMProviders.GoogleGenerativeAI,
//     availableModels: {...}, // Models available for completion
//     clientOptions: { apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY },
//     requireCredentials: () => void, // Throws if credentials missing
//     isEnabled: true
//   }
// }

// Use in tests to iterate over all providers
Object.values(testConfig).forEach((config) => {
  const { provider, clientOptions, availableModels, requireCredentials } = config;

  describe(`${provider} Provider`, () => {
    beforeAll(() => {
      requireCredentials(); // Ensures API keys are present
    });

    it('should create service', () => {
      const service = LLMServiceFactory.getCompletionService({
        provider,
        options: clientOptions,
        logger: mockLogger, // Optional
      });
      expect(service).toBeDefined();
    });
  });
});

Mock Logger

// The package does not export a ready-made mock logger — when testing
// your integration, create your own:
const mockLogger = {
  info: jest.fn(),
  error: jest.fn(),
  warn: jest.fn(),
  child: jest.fn(() => mockLogger),
};

Type Guards

// Type guards for test assertions
if ('text' in result && result.text) {
  expect(result.text).toContain('expected content');
}

if ('error' in result && result.error) {
  expect(result.error).toBeDefined();
}

Writing Custom Tests

Example of writing a custom integration test using resolveTestConfig:

import {
  describe,
  it,
  expect,
  beforeAll,
} from '@jest/globals';
import {
  LLMServiceFactory,
  LLMPurposes,
  LLMRoles,
  LLMMessageContentType,
  resolveTestConfig,
} from '@mate-academy/llm-gateway';

// Create your own mock logger
const mockLogger = {
  info: jest.fn(),
  error: jest.fn(),
  warn: jest.fn(),
  child: jest.fn(() => mockLogger),
};

describe('Custom LLM Integration Test', () => {
  // Use resolveTestConfig for consistent test configuration
  const testConfig = resolveTestConfig(LLMPurposes.Completion);

  // Test all enabled providers
  Object.values(testConfig).forEach((config) => {
    const { provider, clientOptions, requireCredentials } = config;

    describe(`${provider} Provider`, () => {
      let service;

      beforeAll(() => {
        requireCredentials(); // Validates API keys are present

        service = LLMServiceFactory.getCompletionService({
          provider,
          options: clientOptions,
          logger: mockLogger, // Optional - can be omitted
        });
      });

      it('should process custom request', async () => {
        // Your custom test logic here
        const result = await service.sendMessage({
          message: {
            role: LLMRoles.User,
            content: [{ type: LLMMessageContentType.TEXT, text: 'Test message' }],
          },
          model: Object.values(config.availableModels)[0], // Use first available model
        });

        // Use type guards for assertions
        if ('text' in result && result.text) {
          expect(result.text).toBeDefined();
          expect(typeof result.text).toBe('string');
        } else if ('error' in result && result.error) {
          throw result.error;
        }
      });
    });
  });
});

Developer Guide

Pricing Model Architecture

The LLM Gateway uses a flexible function-based pricing model that supports different token types and caching:

interface LLMModelPricing {
  getPriceForTextInput: (tokens: number) => number;   // Price for regular text input tokens
  getPriceForTextOutput: (tokens: number) => number;  // Price for text output tokens
  getPriceForAudioInput: (tokens: number) => number;  // Price for regular audio input tokens
  getPriceForAudioOutput: (tokens: number) => number; // Price for audio output tokens

  // Optional: Cached token pricing (prompt caching)
  getPriceForCachedTextInput?: (tokens: number) => number;  // Discounted price for cached text tokens
  getPriceForCachedAudioInput?: (tokens: number) => number; // Discounted price for cached audio tokens

  currency: string; // Currency code (e.g., 'USD')
}

This architecture allows:

  • Dynamic Pricing: Support for tiered pricing based on token count
  • Multi-Modal Support: Separate pricing for text and audio tokens
  • Provider Flexibility: Each provider can implement custom pricing logic
  • Cached Token Discounts: Automatic detection and separate pricing for cached tokens with provider-specific discount rates
  • Reasoning Token Support: Proper cost calculation for reasoning tokens in o1/o3 models
  • Character-based Pricing: Some models (e.g., OpenAI TTS) use character count instead of tokens for input pricing

Cached Token Support:

  • OpenAI: 50% discount for cached tokens (GPT-4 models), 10% discount (GPT-5 models), 25% discount (GPT-4.1 models)
  • Google Gemini: 90% discount for cached tokens (all models with context caching)
  • Cached tokens are automatically detected and priced separately
  • If cached pricing is not defined, falls back to regular pricing

Important Notes:

  • Cached tokens are tracked separately and receive automatic discount rates
  • Reasoning tokens (o1/o3 models) are charged as output tokens
  • Text-to-Speech models may use character-based pricing for input (not token-based)
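As a sketch of how these pieces combine, here is a pricing object following the LLMModelPricing shape above, used to price a single request. All rates are invented for illustration (not any real model's prices), and the 50% cached discount simply mirrors the OpenAI example above:

```typescript
// Illustrative pricing object following the LLMModelPricing interface above.
// Rates are invented for the example, not any real model's prices.
const pricing = {
  getPriceForTextInput: (tokens: number) => (2.0 * tokens) / 1_000_000,  // $2.00 per 1M tokens
  getPriceForTextOutput: (tokens: number) => (8.0 * tokens) / 1_000_000, // $8.00 per 1M tokens
  getPriceForAudioInput: (_tokens: number) => 0,
  getPriceForAudioOutput: (_tokens: number) => 0,
  // Cached input tokens at a 50% discount
  getPriceForCachedTextInput: (tokens: number) => (1.0 * tokens) / 1_000_000,
  currency: 'USD',
};

// A request with 70K fresh input tokens, 30K cached input tokens,
// and 5K output tokens:
const cost =
  pricing.getPriceForTextInput(70_000) +
  // Fall back to regular input pricing when cached pricing is not defined.
  (pricing.getPriceForCachedTextInput?.(30_000) ?? pricing.getPriceForTextInput(30_000)) +
  pricing.getPriceForTextOutput(5_000);

console.log(cost.toFixed(2)); // 0.14 + 0.03 + 0.04 ≈ 0.21 USD
```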

Architecture Overview

The LLM Gateway uses several architectural patterns to provide a clean, extensible interface for multiple LLM providers:

Core Architecture Components

Service Layer:

  • Abstract Base Services: LLMBaseService provides common functionality for all services
  • Purpose-Specific Services: Separate services for Completion, Assistance, Speech-to-Text, and Text-to-Speech
  • Provider Implementations: Each provider extends abstract services with specific implementations
  • Instance Caching: Automatic SDK instance pooling with LRU eviction to prevent memory leaks

Instance Caching Architecture:

  • Shared Global Cache: All service types share a single static cache to optimize memory usage
  • Provider-based Caching: SDK instances are cached by provider and credential hash (SHA256)
  • Cross-Purpose Sharing: Different purposes (Completion, Assistance, etc.) share instances for the same provider and credentials
  • LRU Eviction: Maintains up to 10 instances globally using Least Recently Used eviction
  • Automatic Reuse: Instances are automatically reused when switching back to previously used credentials
  • Memory Safety: Prevents memory leaks when credentials change frequently (e.g., multi-tenant scenarios)
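The caching pattern described above can be sketched with a Map, whose insertion order doubles as recency order. This is illustrative only, not the package's actual code:

```typescript
import { createHash } from 'node:crypto';

// Illustrative sketch of provider/credential-keyed instance caching with
// LRU eviction, as described above (not the package's implementation).
const MAX_INSTANCES = 10;
const cache = new Map<string, object>();

function cacheKey(provider: string, credentials: object): string {
  const hash = createHash('sha256')
    .update(JSON.stringify(credentials))
    .digest('hex');
  return `${provider}:${hash}`;
}

function getInstance(
  provider: string,
  credentials: object,
  build: () => object,
): object {
  const key = cacheKey(provider, credentials);
  const existing = cache.get(key);
  if (existing) {
    // Refresh recency: delete + re-set moves the key to the end of the Map.
    cache.delete(key);
    cache.set(key, existing);
    return existing;
  }
  const instance = build();
  cache.set(key, instance);
  if (cache.size > MAX_INSTANCES) {
    // Evict the least recently used entry (first key in insertion order).
    const oldest = cache.keys().next().value as string;
    cache.delete(oldest);
  }
  return instance;
}
```

Switching back to previously used credentials then reuses the cached SDK instance instead of constructing a new one, which is what keeps memory bounded in multi-tenant scenarios.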

Metrics & Reporting:

  • LLMReporterInterface<ReporterContext>: Type-safe reporter interface for metrics collection
  • Automatic Cost Calculation: Built-in cost tracking using flexible pricing functions for text and audio tokens
  • Timer Integration: Automatic duration tracking via initMetricsWriter() pattern
  • Token Usage Tracking: Separate tracking for text and audio tokens (input/output)

Logging Infrastructure:

  • LLMLoggerInterface: Standard logging interface compatible with major logging libraries
  • Child Logger Support: Context propagation through service hierarchies

Schema Architecture Overview

The LLM Gateway uses a driver pattern for schema conversion, providing clean separation between core schema logic and provider-specific implementations.

Core Components:

  • LLMSchema: Core schema builder with unified API
  • SchemaAdapterInterface: Simple contract for provider-specific schema converters
  • SchemaAdapterRegistry: Type-safe registry ensuring all providers are handled
  • Provider Adapters: Convert generic JSON Schema to provider-specific formats

How It Works:

// The schema uses a unified API regardless of provider
const schema = LLMSchema.object({
  name: LLMSchema.string(),
  age: LLMSchema.number().min(1),
});

// Internally, services call _toProviderSchema() which automatically
// converts to the correct provider-specific format:
schema._toProviderSchema(LLMProviders.OpenAI);           // → OpenAI JSON Schema format
schema._toProviderSchema(LLMProviders.GoogleGenerativeAI); // → Google Type-based format

Provider Schema Adapters:

Each provider has its own schema adapter located in src/providers/{Provider}/schemas/:

  • OpenAISchemaAdapter: Converts to OpenAI's JSON Schema format, handles strict mode requirements
  • GoogleSchemaAdapter: Converts to Google's Type-based schema format using their Type enum

Type-Safe Registry:

Schema adapters are managed through a centralized, type-safe registry that ensures compile-time safety:

// src/utilities/schema/SchemaAdapterRegistry.ts
export const SCHEMA_ADAPTER_REGISTRY = {
  OpenAI: new OpenAISchemaAdapter(),
  GoogleGenerativeAI: new GoogleSchemaAdapter(),
} as const satisfies Record<LLMProviders, SchemaAdapterInterface | null>;

Adding a New Provider

To add support for a new LLM provider, follow these steps:

1. Create Provider Directory Structure

Create a new directory in src/providers with your provider name, following the established pattern:

src/providers/YourProvider/
├── index.ts                    # Entry point for provider exports
├── YourProvider.constants.ts   # Provider-specific constants
├── YourProvider.entity.ts      # Provider-specific entity
├── YourProvider.typedefs.ts    # TypeScript definitions
├── YourProviderService.factory.ts # Factory for your provider's services
├── schemas/                    # Schema conversion adapters
│   └── YourProviderSchemaAdapter.ts
└── services/                   # Provider service implementations
    ├── index.ts
    ├── YourProviderCompletionService.ts
    └── YourProviderAssistanceService.ts

2. Add Provider to LLM Providers Enum

Update the LLM providers enum in src/LLMService.typedefs.ts:

export enum LLMProviders {
  OpenAI = 'OpenAI',
  GoogleGenerativeAI = 'GoogleGenerativeAI',
  YourProvider = 'YourProvider',
}

3. Define Provider-Specific Types

Create type definitions in src/providers/YourProvider/YourProvider.typedefs.ts:

// Define model names as an enum for type safety
export enum YourProviderModelNames {
  MODEL_ONE = 'model-one',
  MODEL_TWO = 'model-extended',
}

// Define message roles if applicable
export enum YourProviderRoles {
  User = 'user',
  Assistant = 'assistant',
  System = 'system',
}

// Add any other provider-specific enums or interfaces

Then ensure your provider is properly integrated in the main type system by updating the necessary type mappings in src/LLMService.typedefs.ts:

// Add import for your provider's types
import { type YourProviderModelNames } from './providers/YourProvider/YourProvider.typedefs';


// Update LLMInstances type mapping
export type LLMInstances = {
  // ...existing code...
  [LLMProviders.YourProvider]: YourProviderClient; // Your provider's client type
};

// Update LLMInstanceOptions type mapping
export type LLMInstanceOptions = {
  // ...existing code...
  [LLMProviders.YourProvider]: {
    apiKey: string;
    // Add other provider-specific options
  };
};

// Update LLMModelName type mapping
export type LLMModelName = {
  // ...existing code...
  [LLMProviders.YourProvider]: YourProviderModelNames;
};

4. Create Schema Adapter

Implement a schema adapter in src/providers/YourProvider/schemas/YourProviderSchemaAdapter.ts:

import type { SchemaAdapterInterface } from '@/utilities/schema/SchemaAdapterInterface';

export class YourProviderSchemaAdapter implements SchemaAdapterInterface {
  convertSchema(jsonSchema: any): any {
    // Convert JSON Schema to your provider's specific format
    // Example: transform to provider-specific schema structure
    return this.transformToYourProviderFormat(jsonSchema);
  }

  private transformToYourProviderFormat(jsonSchema: any): any {
    // Implement provider-specific schema transformation logic
    // Handle objects, arrays, strings, numbers, etc.
    // Return the schema in your provider's expected format

    if (jsonSchema.type === 'object') {
      // Handle object schemas
      return {
        // Your provider's object schema format
      };
    }

    // Handle other schema types...
    return jsonSchema;
  }
}

5. Add Adapter to Schema Registry

Update the schema adapter registry in src/utilities/schema/SchemaAdapterRegistry.ts:

import { YourProviderSchemaAdapter } from '@/providers/YourProvider/schemas/YourProviderSchemaAdapter';

export const SCHEMA_ADAPTER_REGISTRY = {
  OpenAI: new OpenAISchemaAdapter(),
  GoogleGenerativeAI: new GoogleSchemaAdapter(),
  YourProvider: new YourProviderSchemaAdapter(), // Add your adapter here
  // TypeScript will enforce that ALL providers have adapters
} as const satisfies Record<LLMProviders, SchemaAdapterInterface | null>;

If your provider doesn't support structured output, set it to null:

YourProvider: null, // Provider doesn't support structured output

6. Implement Provider Constants

Define constants in src/providers/YourProvider/YourProvider.constants.ts:

import {
  type LLMProviderAvailableModels,
  type LLMProviderModelsByPurpose,
  type LLMProviders,
  LLMPurposes,
  type LLMServiceBuilder,
} from '@/LLMService.typedefs';
import { YourProviderModelNames } from './YourProvider.typedefs';
import {
  YourProviderAssistanceService,
  YourProviderCompletionService,
  YourProviderSpeechToTextService,
  YourProviderTextToSpeechService,
} from './services';
import { pick } from '@/utilities/functional.utils';

// Define available models with their capabilities, configurations and pricing
const YOUR_PROVIDER_AVAILABLE_MODELS = {
  [YourProviderModelNames.MODEL_ONE]: {
    name: YourProviderModelNames.MODEL_ONE,
    limits: {
      maxInputTokens: 8_000,
      maxOutputTokens: 2_000,
    },
    config: {
      temperature: 0.2,
    },
    pricing: {
      getPriceForTextInput: (tokens) => 0.5 * tokens / 1_000_000,  // Cost per million text input tokens
      getPriceForTextOutput: (tokens) => 1.5 * tokens / 1_000_000, // Cost per million text output tokens
      getPriceForAudioInput: (tokens) => 0, // Cost per million audio input tokens
      getPriceForAudioOutput: (tokens) => 0, // Cost per million audio output tokens
      currency: 'USD' as const,
    },
  },
  [YourProviderModelNames.MODEL_TWO]: {
    name: YourProviderModelNames.MODEL_TWO,
    limits: {
      maxInputTokens: 16_000,
      maxOutputTokens: 4_000,
    },
    config: {
      temperature: 0.2,
    },
    pricing: {
      getPriceForTextInput: (tokens) => 1 * tokens / 1_000_000,   // Cost per million text input tokens
      getPriceForTextOutput: (tokens) => 3 * tokens / 1_000_000,  // Cost per million text output tokens
      getPriceForAudioInput: (tokens) => 0, // Cost per million audio input tokens
      getPriceForAudioOutput: (tokens) => 0, // Cost per million audio output tokens
      currency: 'USD' as const,
    },
  },
} as const satisfies LLMProviderAvailableModels<
  LLMProviders.YourProvider
>;

// Specify which models are available for each purpose
export const YOUR_PROVIDER_MODELS = {
  [LLMPurposes.Completion]: pick(
    YOUR_PROVIDER_AVAILABLE_MODELS,
    [
      YourProviderModelNames.MODEL_ONE,
      YourProviderModelNames.MODEL_TWO,
    ],
  ),
  [LLMPurposes.Assistance]: pick(
    YOUR_PROVIDER_AVAILABLE_MODELS,
    [
      YourProviderModelNames.MODEL_TWO, // Only MODEL_TWO supports assistance
    ],
  ),
  [LLMPurposes.SpeechToText]: pick(
    YOUR_PROVIDER_AVAILABLE_MODELS,
    [
      YourProviderModelNames.MODEL_ONE, // Speech-to-text capable model
    ],
  ),
  [LLMPurposes.TextToSpeech]: pick(
    YOUR_PROVIDER_AVAILABLE_MODELS,
    [
      YourProviderModelNames.MODEL_ONE, // Text-to-speech capable model
    ],
  ),
} as const satisfies LLMProviderModelsByPurpose<
  LLMPurposes,
  LLMProviders.YourProvider
>;

// Define service builders for each LLM purpose
export const YOUR_PROVIDER_SERVICE_BUILDERS = {
  [LLMPurposes.Completion]: (logger, reporter, options) => (
    new YourProviderCompletionService(logger, reporter, options)
  ),
  [LLMPurposes.Assistance]: (logger, reporter, options) => (
    new YourProviderAssistanceService(logger, reporter, options)
  ),
  [LLMPurposes.SpeechToText]: (logger, reporter, options) => (
    new YourProviderSpeechToTextService(logger, reporter, options)
  ),
  [LLMPurposes.TextToSpeech]: (logger, reporter, options) => (
    new YourProviderTextToSpeechService(logger, reporter, options)
  ),
} as const satisfies {
  [purpose in LLMPurposes]: (
    LLMServiceBuilder<LLMProviders.YourProvider, purpose> | null
  )
};

7. Implement Provider Entity (if needed)

Create the entity class in src/providers/YourProvider/YourProvider.entity.ts:

export class YourProviderEntity {
  // Implement prov