@x12i/ai-provider-openai
v3.2.1
OpenAI provider implementation for LLM providers
@x12i/ai-provider-openai
OpenAI provider implementation for the AI providers ecosystem. This package provides a standardized interface for interacting with OpenAI's GPT models (GPT-4, GPT-3.5, etc.) through the AIProviderInterface.
Installation
npm install @x12i/ai-provider-openai
Configuration
The provider requires an OpenAI API key. You can also configure a custom base URL and organization ID.
⚠️ IMPORTANT: Model Selection Architecture
This provider follows a strict architectural pattern where models MUST be provided in each request's config.model. The provider does NOT support default models or hardcoded fallbacks. This ensures:
- Full control over model selection from the application layer
- Ability to switch models without code changes
- Provider-agnostic model configuration
- Support for dynamic model selection and routing
Default models should be configured in the Gateway, not in providers.
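As a rough illustration of keeping defaults outside the provider, the sketch below resolves a model in the calling layer before the request is built. The `resolveModel` helper and the `GatewayDefaults` shape are hypothetical and not part of this package; only the idea that the result ends up in config.model comes from this README.

```typescript
// Hypothetical gateway-side helper: NOT part of @x12i/ai-provider-openai.
interface GatewayDefaults {
  defaultModel: string;            // fallback chosen by the gateway
  routes?: Record<string, string>; // optional tag -> model routing table
}

function resolveModel(
  requestedModel: string | undefined,
  tags: string[],
  defaults: GatewayDefaults,
): string {
  // An explicit model on the request always wins.
  if (requestedModel) return requestedModel;
  // Otherwise use the first tag that has a routing entry.
  for (const tag of tags) {
    const routed = defaults.routes?.[tag];
    if (routed) return routed;
  }
  // Finally fall back to the gateway-level default.
  return defaults.defaultModel;
}

const defaults: GatewayDefaults = {
  defaultModel: 'gpt-4-turbo',
  routes: { 'batch-job': 'gpt-5-nano' },
};

// The resolved value is what would be passed as config.model.
const resolvedModel = resolveModel(undefined, ['production', 'batch-job'], defaults);
```

With this arrangement the provider never needs a fallback of its own: switching models is a change to the routing table, not to provider code.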
⚠️ IMPORTANT: GPT-5 Organization Requirement
GPT-5 models (gpt-5-mini, gpt-5-nano, gpt-5.2, etc.) require an organization header to be configured. If you use GPT-5 models without setting an organization, the provider will throw a clear error:
GPT-5 models (like 'gpt-5-mini') require an organization header.
Please set OPENAI_ORGANIZATION environment variable or configure organization in provider config.
This is required by OpenAI for GPT-5 model access.
To use GPT-5 models, you MUST set the organization:
import { OpenAIProvider } from '@x12i/ai-provider-openai';
// ✅ Correct: GPT-5 models require organization
const provider = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
organization: process.env.OPENAI_ORGANIZATION!, // REQUIRED for GPT-5 models
});
// ❌ Incorrect: This will fail for GPT-5 models
const providerWithoutOrg = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
// organization: undefined - missing!
});
Environment Variables:
export OPENAI_API_KEY="your-api-key"
export OPENAI_ORGANIZATION="org-your-org-id" # REQUIRED for GPT-5 models
GPT-4 and other models do NOT require organization - it's optional for backward compatibility. Only GPT-5 models enforce this requirement.
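An application may prefer to catch a missing organization before constructing the provider. The sketch below is a hypothetical pre-flight check that mirrors the rule described above; the provider's actual detection logic is not shown in this README, so the "gpt-5" name-prefix test and both helper names are assumptions.

```typescript
// Hypothetical pre-flight check: NOT the provider's own implementation.
// Assumption: GPT-5 family models can be recognised by a "gpt-5" name prefix.
function requiresOrganization(model: string): boolean {
  return model.toLowerCase().startsWith('gpt-5');
}

function assertOrganization(model: string, organization?: string): void {
  if (requiresOrganization(model) && !organization) {
    throw new Error(
      `Model '${model}' requires an organization; set OPENAI_ORGANIZATION first.`,
    );
  }
}

// Passes: GPT-4 family models do not need an organization.
assertOrganization('gpt-4-turbo');
// Passes: organization supplied for a GPT-5 model.
assertOrganization('gpt-5-mini', 'org-your-org-id');
```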
import { OpenAIProvider } from '@x12i/ai-provider-openai';
const provider = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
baseURL: 'https://api.openai.com/v1', // Optional, for custom endpoints
organization: 'org-xxx', // Optional for GPT-4, REQUIRED for GPT-5
});
Provider Capabilities
The provider supports querying its capabilities:
const capabilities = provider.getCapabilities();
console.log(capabilities);
// {
// supportsSync: true,
// supportsStreaming: true,
// supportsBatch: false
// }
- supportsSync: Always true - synchronous aiCall is supported
- supportsStreaming: true - streaming via aiCallStream is supported
- supportsBatch: false - OpenAI's batch API is file-based, not request-based (use Batch API methods below)
Usage
Basic Synchronous Call
import { OpenAIProvider } from '@x12i/ai-provider-openai';
const provider = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
});
const response = await provider.aiCall({
instructions: 'Summarize the following text in one sentence.',
inputData: 'This is a long article about artificial intelligence...',
config: {
model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
temperature: 0.7,
maxTokens: 100,
},
});
console.log(response.output); // The summarized text
Streaming Calls
The provider supports streaming for real-time token-by-token responses:
const provider = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
});
// Check if streaming is supported
if (provider.getCapabilities().supportsStreaming) {
let fullText = '';
for await (const chunk of provider.aiCallStream({
instructions: 'Write a short story about a robot.',
inputData: undefined,
config: {
model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
temperature: 0.8,
maxTokens: 200,
},
})) {
// Process each chunk as it arrives
if (chunk.deltaText) {
process.stdout.write(chunk.deltaText);
fullText += chunk.deltaText;
}
// Check if this is the final chunk
if (chunk.done) {
console.log('\n\nStreaming completed!');
console.log('Full text:', fullText);
}
}
}
Reasoning Models Support
OpenAI's reasoning models (o1, o1-mini, o3-mini, o1-pro) expose their internal "thinking process" as reasoning content that streams separately from the final answer.
Features
- ✅ Stream reasoning content incrementally
- ✅ Access complete reasoning in final chunk
- ✅ Support for multi-item responses
- ✅ Automatic buffer management
- ✅ Comprehensive logging via logxer
Streaming Example
import { OpenAIProvider } from '@x12i/ai-provider-openai';
const provider = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
});
for await (const chunk of provider.aiCallStream({
instructions: 'Solve this problem with detailed reasoning',
inputData: 'What is 42 × 37?',
config: {
model: 'o3-mini',
reasoningEffort: 'high', // 'low' | 'medium' | 'high'
},
})) {
// Stream reasoning (thinking process)
if (chunk.metadata?.reasoningDelta) {
process.stderr.write(chunk.metadata.reasoningDelta);
}
// Stream final answer
if (chunk.deltaText) {
process.stdout.write(chunk.deltaText);
}
// Access complete reasoning in final chunk
if (chunk.done && chunk.metadata?.reasoningAvailable) {
console.log('\n\n=== Complete Reasoning ===');
console.log(chunk.metadata.reasoning);
console.log('\n=== Final Answer ===');
console.log(chunk.deltaOutput);
}
}
Synchronous Example
const result = await provider.aiCall({
instructions: 'Explain your reasoning',
config: { model: 'o1-mini' },
});
if (result.metadata?.reasoningAvailable) {
console.log('Reasoning:', result.metadata.reasoning);
}
console.log('Answer:', result.output);
Important Notes
- Reasoning can be lengthy: o1/o3 models may generate thousands of tokens
- Reasoning streams first: Typically arrives before the final answer
- Only o-series models: Regular models (gpt-4, gpt-3.5) don't expose reasoning
- Cost consideration: Reasoning tokens count toward usage limits
- Logging: All reasoning events are logged via logxer for debugging
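Because reasoning and answer text arrive interleaved across many chunks, it can help to keep the accumulation logic in one pure function. The sketch below folds chunks into a running state; `ReasoningChunk` is a local stand-in that only models the fields used in the streaming example above, and the sample chunks are simulated rather than real provider output.

```typescript
// Local stand-in for the chunk fields used in the streaming example above.
interface ReasoningChunk {
  deltaText?: string;
  done?: boolean;
  metadata?: { reasoningDelta?: string };
}

interface StreamState {
  reasoning: string;
  answer: string;
  finished: boolean;
}

// Pure reducer: fold one chunk into the accumulated state.
function applyChunk(state: StreamState, chunk: ReasoningChunk): StreamState {
  return {
    reasoning: state.reasoning + (chunk.metadata?.reasoningDelta ?? ''),
    answer: state.answer + (chunk.deltaText ?? ''),
    finished: state.finished || chunk.done === true,
  };
}

// Simulated chunks; a real stream would come from provider.aiCallStream(...).
const sampleChunks: ReasoningChunk[] = [
  { metadata: { reasoningDelta: '42 x 37 = 42 x 40 - 42 x 3. ' } },
  { metadata: { reasoningDelta: '1680 - 126 = 1554.' } },
  { deltaText: 'The answer is ' },
  { deltaText: '1554.', done: true },
];

const finalState = sampleChunks.reduce(applyChunk, {
  reasoning: '',
  answer: '',
  finished: false,
});
```

Inside a real `for await` loop the same reducer could be applied chunk by chunk, which keeps display code (writing to stdout or stderr) separate from accumulation.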
Architecture
Reasoning Flow:
├── SSE Events (response.content_part.delta/done)
├── Event Handlers (event-handlers.ts)
├── Accumulation (utils/accumulator.ts)
├── Logging (utils/logger.ts via logxer)
└── Final Chunk (includes complete reasoning)
With Structured Data
const response = await provider.aiCall({
instructions: 'Analyze the following data and return a JSON object with a summary.',
inputData: {
items: ['apple', 'banana', 'cherry'],
count: 3,
},
config: {
model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
temperature: 0.3,
maxTokens: 200,
},
});
// response.output will be parsed JSON if the response is valid JSON
console.log(response.output); // { summary: "..." }
With Tags and Trace ID
const response = await provider.aiCall({
instructions: 'Process this data',
inputData: someData,
config: {
model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
maxTokens: 500,
},
tags: ['production', 'batch-job'],
traceId: 'trace-12345',
});
OpenAI Batch API (File-Based)
The provider includes OpenAI-specific Batch API methods for cost-effective batch processing. These are separate from the standard interface batch methods and use OpenAI's file-based batch system.
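The file-based flow exchanges NDJSON (one JSON object per line) in both directions, so a pair of small helpers can keep serialization and parsing consistent. This is a minimal sketch: `BatchRequestLine`, `toNdjson`, `parseNdjson`, and `findDuplicateIds` are hypothetical names that do not come from this package, and the duplicate check assumes that each custom_id should be unique within a batch.

```typescript
// Hypothetical shape of one line in a batch input file.
interface BatchRequestLine {
  custom_id: string;
  method: 'POST';
  url: string;
  body: Record<string, unknown>;
}

// Serialize requests as NDJSON: one JSON object per line, newline-terminated.
function toNdjson(lines: BatchRequestLine[]): string {
  return lines.map((line) => JSON.stringify(line)).join('\n') + '\n';
}

// Parse NDJSON back into objects, skipping blank lines.
function parseNdjson<T>(text: string): T[] {
  return text
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as T);
}

// Report custom_id values that appear more than once.
function findDuplicateIds(lines: BatchRequestLine[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const { custom_id } of lines) {
    if (seen.has(custom_id)) dupes.add(custom_id);
    seen.add(custom_id);
  }
  return Array.from(dupes);
}

const sampleLines: BatchRequestLine[] = [
  { custom_id: 'req-1', method: 'POST', url: '/v1/responses', body: { model: 'gpt-5-nano' } },
  { custom_id: 'req-2', method: 'POST', url: '/v1/responses', body: { model: 'gpt-5-nano' } },
];
const ndjsonText = toNdjson(sampleLines);
```

The resulting string can be wrapped in a Buffer for upload, and `parseNdjson` applies equally to the downloaded results file.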
Upload Batch File
import { OpenAIProvider } from '@x12i/ai-provider-openai';
import type { BatchFileUploadRequest } from '@x12i/ai-provider-openai';
const provider = new OpenAIProvider({
apiKey: process.env.OPENAI_API_KEY!,
});
// Prepare batch file (NDJSON format)
// IMPORTANT: Use '/v1/responses' for Responses API (recommended)
// The url in each request must match the endpoint used in createBatchJob()
const batchData = [
  {
    custom_id: 'req-1',
    method: 'POST',
    url: '/v1/responses', // Use Responses API endpoint
    body: {
      model: 'gpt-5-nano',
      // The Responses API takes input, not the Chat Completions messages field
      input: [{ role: 'user', content: 'Hello 1' }],
      max_output_tokens: 100,
    },
  },
  {
    custom_id: 'req-2',
    method: 'POST',
    url: '/v1/responses', // Use Responses API endpoint
    body: {
      model: 'gpt-5-nano',
      input: [{ role: 'user', content: 'Hello 2' }],
      max_output_tokens: 100,
    },
  },
];
const fileBuffer = Buffer.from(batchData.map(r => JSON.stringify(r)).join('\n'));
// Upload file
const fileResponse = await provider.uploadBatchFile({
file: fileBuffer,
filename: 'batch-requests.jsonl',
purpose: 'batch',
});
console.log('File ID:', fileResponse.id);
Create Batch Job
import type { BatchJobCreateRequest } from '@x12i/ai-provider-openai';
// IMPORTANT: Default endpoint is '/v1/responses' (Responses API)
// This matches the url in your batch file and is consistent with aiCall() and aiCallStream()
// Legacy '/v1/chat/completions' is still supported but not recommended
const batchJob = await provider.createBatchJob({
input_file_id: fileResponse.id,
endpoint: '/v1/responses', // Optional: defaults to '/v1/responses' if omitted
completion_window: '24h', // Optional, default is 24h
});
console.log('Batch Job ID:', batchJob.id);
console.log('Status:', batchJob.status); // 'validating' | 'in_progress' | etc.
Retrieve Batch Job Status
import type { BatchJobRetrieveRequest } from '@x12i/ai-provider-openai';
const status = await provider.retrieveBatchJob({
batchId: batchJob.id,
});
console.log('Current status:', status.status);
console.log('Completed:', status.request_counts.completed);
console.log('Failed:', status.request_counts.failed);
Download Batch Results
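Results only exist once the job has reached a terminal status, so a download is usually preceded by a polling loop. The sketch below is one possible shape for that loop. The terminal status names are taken from OpenAI's Batch API documentation rather than from this README, and `waitForBatch`, `isTerminalStatus`, and `pollDelayMs` are hypothetical helpers, not methods of this provider.

```typescript
// Statuses after which a batch job will not change any further
// (assumption: names as listed in OpenAI's Batch API documentation).
const TERMINAL_STATUSES = new Set(['completed', 'failed', 'expired', 'cancelled']);

function isTerminalStatus(status: string): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Exponential backoff capped at maxMs, so long jobs are not polled too often.
function pollDelayMs(attempt: number, baseMs = 5000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// `retrieve` is injected so the loop stays independent of the provider;
// in practice it could wrap provider.retrieveBatchJob({ batchId }).
async function waitForBatch<T extends { status: string }>(
  retrieve: () => Promise<T>,
  maxAttempts = 20,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await retrieve();
    if (isTerminalStatus(job.status)) return job;
    await new Promise<void>((resolve) => setTimeout(resolve, pollDelayMs(attempt)));
  }
  throw new Error(`Batch job did not finish after ${maxAttempts} polls`);
}
```

A caller would still need to check that the returned status is 'completed' before reading output_file_id, since the loop also returns for failed, expired, and cancelled jobs.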
import type { FileContentRequest } from '@x12i/ai-provider-openai';
// Once batch is completed, download results
if (status.status === 'completed' && status.output_file_id) {
const results = await provider.downloadFileContent({
fileId: status.output_file_id,
});
// Parse NDJSON results
const responses = results.content
.split('\n')
.filter(line => line.trim())
.map(line => JSON.parse(line));
console.log('Batch results:', responses);
}
Response Structure
Synchronous Response
The provider returns a standardized AIResponse object:
{
output: unknown; // The main result (text or parsed JSON)
rawText?: string; // Raw text response from the API
metadata: { // Execution metadata
provider: 'openai';
model?: string;
providerRequestId?: string;
durationMs?: number;
normalizedConfig?: Record<string, any>; // The normalized config parameters (config is metadata)
};
normalization?: { // Optional: Only present when normalization occurred (separate from metadata)
messagesNormalized?: boolean; // Whether messages were normalized to Responses API format
configNormalized?: boolean; // Whether config parameters were normalized
normalizedRequest?: Record<string, any>; // ⭐ THE EXACT REQUEST BODY SENT TO OPENAI API
warnings?: string[]; // Warnings about parameter modifications (e.g., unsupported params removed)
};
usage?: { // Token usage information
inputTokens?: number;
outputTokens?: number;
totalTokens?: number;
};
}
Normalization Information
The normalization field is a top-level field (separate from metadata) that is optional and only present when normalization occurs. This happens in two cases:
- Messages Normalization: When the provider converts your request format (either gateway messages array or legacy instructions + inputData) to OpenAI's Responses API format (instructions + input). This always happens for Responses API compatibility.
- Config Normalization: When the provider normalizes configuration parameters for model-specific requirements:
  - Parameter name conversions (e.g., maxTokens → max_output_tokens for Responses API models)
  - Removal of unsupported parameters for specific models
  - Parameter value adjustments based on model capabilities
Normalization Fields:
- metadata.normalizedConfig: Record<string, any> - The normalized config parameters (config is metadata, so it's under metadata)
- normalization.messagesNormalized: boolean - Whether messages were normalized to Responses API format
- normalization.configNormalized: boolean - Whether config parameters were normalized
- normalization.normalizedRequest: Record<string, any> - ⭐ THE EXACT REQUEST BODY SENT TO OPENAI API - This is the complete normalized request that was actually sent to OpenAI's /v1/responses endpoint. It includes:
  - The normalized messages (instructions + input fields)
  - All normalized config parameters (max_output_tokens, stop, reasoning, etc.)
  - The model field
  - This is what you need to see exactly what data was sent to OpenAI
- normalization.warnings: string[] - Warnings about parameter modifications (e.g., unsupported params removed)
🔍 Accessing the Exact Request Sent to OpenAI:
To see the exact request body that was sent to OpenAI's API, access response.normalization.normalizedRequest:
const response = await provider.aiCall({
instructions: 'Your instructions',
inputData: 'Your input',
config: { model: 'gpt-5-nano', maxTokens: 500 }
});
// Get the exact request sent to OpenAI
const exactRequestSent = response.normalization?.normalizedRequest;
console.log('Exact request sent to OpenAI:', JSON.stringify(exactRequestSent, null, 2));
Example: GPT-5 Model with Normalization
const response = await provider.aiCall({
instructions: 'Summarize this text',
inputData: 'Long text here...',
config: {
model: 'gpt-5-nano', // Responses API model
maxTokens: 500, // Will be normalized to max_output_tokens
temperature: 0.8, // Will be omitted (not supported by gpt-5-nano)
},
});
// The normalized config parameters (under metadata, since config is metadata)
if (response.metadata.normalizedConfig) {
console.log('Normalized config:', response.metadata.normalizedConfig);
// Example output:
// {
// max_output_tokens: 500,
// stop: undefined // (if provided)
// }
}
// Check if normalization occurred
if (response.normalization) {
console.log('Messages normalized:', response.normalization.messagesNormalized);
console.log('Config normalized:', response.normalization.configNormalized);
// The complete normalized request body sent to OpenAI API
console.log('Normalized request:', response.normalization.normalizedRequest);
// Example output:
// {
// model: 'gpt-5-nano',
// instructions: 'Summarize this text',
// input: 'Long text here...',
// max_output_tokens: 500,
// stop: ['\n\n'] // (if provided)
// // Note: temperature is omitted (not supported by gpt-5-nano)
// }
// Check for warnings about parameter modifications
if (response.normalization.warnings) {
console.log('Normalization warnings:', response.normalization.warnings);
// Example: ["Omitted temperature=0.8 (model uses fixed default 1, parameter not accepted)."]
}
}
Example: GPT-5.2 with Reasoning Effort
const response = await provider.aiCall({
instructions: 'Solve this math problem',
inputData: 'What is 2 + 2?',
config: {
model: 'gpt-5.2',
maxTokens: 1000,
reasoningEffort: 'high', // Normalized to reasoning: { effort: 'high' }
temperature: 0.7, // Omitted (only allowed when reasoning.effort is 'none')
},
});
if (response.normalization?.normalizedRequest) {
console.log('Exact request sent to OpenAI:',
JSON.stringify(response.normalization.normalizedRequest, null, 2));
// Output:
// {
// "model": "gpt-5.2",
// "instructions": "Solve this math problem",
// "input": "What is 2 + 2?",
// "max_output_tokens": 1000,
// "reasoning": {
// "effort": "high"
// },
// "reasoning_effort": "high"
// // Note: temperature is omitted
// }
}
Note: The normalization field (top-level, separate from metadata) is only included when normalization actually occurs. If your request already matches the API format exactly, this field will be undefined.
Streaming Chunks
When using aiCallStream, you receive AIStreamChunk objects:
{
deltaText?: string; // Incremental text delta
deltaOutput?: unknown; // Incremental structured output (optional)
done?: boolean; // True on final chunk
metadata?: { // Partial metadata per chunk
provider?: string;
model?: string;
providerRequestId?: string;
normalizedConfig?: Record<string, any>; // The normalized config parameters (only in final chunk)
usage?: { // Token usage (only in final chunk)
inputTokens?: number;
outputTokens?: number;
totalTokens?: number;
};
};
normalization?: { // Optional: Only present in final chunk when normalization occurred (separate from metadata)
messagesNormalized?: boolean;
configNormalized?: boolean;
normalizedRequest?: Record<string, any>; // ⭐ THE EXACT REQUEST BODY SENT TO OPENAI API
warnings?: string[];
};
}
Note: Normalization (including normalizedRequest) is only included in the final chunk (when done: true) of a streaming response, and only when normalization actually occurred.
Normalization Details
This provider automatically normalizes requests to match OpenAI's Responses API format and model-specific capabilities. Understanding this normalization is crucial for debugging and ensuring your requests work correctly with modern OpenAI models like GPT-5.
What Gets Normalized
1. Messages Format Normalization
Input formats accepted:
- Gateway format: messages array with { role: "system" | "user", content: string }
- Legacy format: instructions + inputData
Output format (always):
- Responses API format: instructions + input
  - First system message → instructions
  - Optional middle system message → developer role in input array
  - Last user message → user role in input (or plain string if no context)
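The mapping rules above can be sketched as a small pure function. This is an illustration under stated assumptions, not the provider's actual code: `toResponsesShape` and its types are hypothetical, user messages before the last one are simply dropped, and any system message after the first is treated as a "middle" system message.

```typescript
type GatewayMessage = { role: 'system' | 'user'; content: string };
type InputItem = { role: 'developer' | 'user'; content: string };

interface ResponsesShape {
  instructions?: string;
  input: string | InputItem[];
}

// Hypothetical sketch of the gateway -> Responses API mapping described above.
function toResponsesShape(messages: GatewayMessage[]): ResponsesShape {
  const systems = messages.filter((m) => m.role === 'system');
  const users = messages.filter((m) => m.role === 'user');
  const lastUser = users[users.length - 1];
  if (!lastUser) throw new Error('At least one user message is required');

  // With no extra context, the input stays a plain string.
  const [firstSystem, ...extraSystems] = systems;
  const result: ResponsesShape = { input: lastUser.content };
  if (firstSystem) result.instructions = firstSystem.content;

  // Additional system messages become developer items ahead of the user turn.
  if (extraSystems.length > 0) {
    result.input = [
      ...extraSystems.map((m): InputItem => ({ role: 'developer', content: m.content })),
      { role: 'user', content: lastUser.content },
    ];
  }
  return result;
}

const simple = toResponsesShape([
  { role: 'system', content: 'You are a helpful assistant' },
  { role: 'user', content: 'Hello' },
]);
```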
Example transformation:
// Input (gateway format)
{
messages: [
{ role: "system", content: "You are a helpful assistant" },
{ role: "user", content: "Hello" }
],
config: { model: "gpt-5-nano" }
}
// Normalized request sent to OpenAI
{
model: "gpt-5-nano",
instructions: "You are a helpful assistant",
input: "Hello"
}
2. Config Parameter Normalization
The provider normalizes configuration parameters based on the model's capabilities:
Token Parameter Mapping:
- maxTokens → max_output_tokens (for Responses API models like GPT-5)
- maxTokens → max_tokens (for Chat Completions API models like GPT-4)
- Also accepts: maxOutputTokens, max_output_tokens, max_tokens directly
Parameter Removal:
- Unsupported parameters are omitted (not sent to API)
- Examples:
  - temperature omitted for GPT-5-nano/mini (fixed at 1.0)
  - temperature omitted for GPT-5.2 when reasoning.effort !== 'none'
  - top_p omitted for models that don't support it
  - frequencyPenalty / presencePenalty omitted for GPT-5 models
Reasoning Effort Normalization:
- reasoningEffort: 'high' → reasoning: { effort: 'high' } + reasoning_effort: 'high'
- reasoning_effort: 'medium' → reasoning: { effort: 'medium' } + reasoning_effort: 'medium'
- reasoning: { effort: 'low' } → kept as-is + reasoning_effort: 'low'
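To make the rules above concrete, here is a simplified sketch of a config normalizer. It is not the provider's implementation: `normalizeConfig` is a hypothetical function, model families are detected by name pattern (an assumption), only a subset of parameters is handled, and the warning text is abbreviated compared with the provider's real messages.

```typescript
interface RequestConfig {
  model: string;
  maxTokens?: number;
  temperature?: number;
  topP?: number;
  reasoningEffort?: string;
}

interface Normalized {
  request: Record<string, unknown>;
  warnings: string[];
}

// Assumption: model families are detected by name pattern.
const isGpt5Small = (m: string): boolean => /^gpt-5-(nano|mini)/.test(m);
const isGpt52 = (m: string): boolean => m.startsWith('gpt-5.2');

function normalizeConfig(config: RequestConfig): Normalized {
  const { model, maxTokens, temperature, topP, reasoningEffort } = config;
  const request: Record<string, unknown> = { model };
  const warnings: string[] = [];
  const responsesApi = isGpt5Small(model) || isGpt52(model);

  // Token limit: the parameter name depends on the target API.
  if (maxTokens !== undefined) {
    request[responsesApi ? 'max_output_tokens' : 'max_tokens'] = maxTokens;
  }

  // Reasoning effort is only forwarded for GPT-5.2 in this sketch.
  const effort = isGpt52(model) ? reasoningEffort : undefined;
  if (effort !== undefined) {
    request['reasoning'] = { effort };
    request['reasoning_effort'] = effort;
  }

  // Sampling parameters: dropped for nano/mini, and for GPT-5.2 unless effort is 'none'.
  const samplingAllowed = !responsesApi || (isGpt52(model) && effort === 'none');
  const sampling: Array<[string, number | undefined]> = [
    ['temperature', temperature],
    ['top_p', topP],
  ];
  for (const [name, value] of sampling) {
    if (value === undefined) continue;
    if (samplingAllowed) request[name] = value;
    else warnings.push(`Omitted ${name}=${value} (not accepted for ${model}).`);
  }
  return { request, warnings };
}

const nano = normalizeConfig({ model: 'gpt-5-nano', maxTokens: 500, temperature: 0.8, topP: 0.9 });
```

Returning warnings alongside the request, rather than throwing, matches the behaviour described in this README: unsupported parameters are dropped and reported instead of failing the call.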
Complete Example: GPT-5 Request Flow
What you send:
const response = await provider.aiCall({
instructions: 'Analyze this data',
inputData: { items: [1, 2, 3] },
config: {
model: 'gpt-5-nano',
maxTokens: 500,
temperature: 0.8, // Will be omitted
topP: 0.9, // Will be omitted
stop: ['\n\n'],
},
});
What gets sent to OpenAI API (from response.normalization.normalizedRequest):
{
"model": "gpt-5-nano",
"instructions": "Analyze this data",
"input": "{\n \"items\": [1, 2, 3]\n}",
"max_output_tokens": 500,
"stop": ["\n\n"]
// Note: temperature and topP are omitted (not supported by gpt-5-nano)
}
Warnings (from response.normalization.warnings):
[
"Omitted temperature=0.8 (model uses fixed default 1, parameter not accepted).",
"Omitted top_p=0.9 (not supported for model / reasoning setting)."
]
Accessing Normalization Information
For synchronous calls:
const response = await provider.aiCall({ /* ... */ });
// The normalized config parameters (under metadata, since config is metadata)
if (response.metadata.normalizedConfig) {
const normalizedConfig = response.metadata.normalizedConfig;
console.log('Normalized config:', normalizedConfig);
}
// Check if normalization occurred
if (response.normalization) {
// The complete request body sent to OpenAI
const exactRequestSent = response.normalization.normalizedRequest;
console.log('Exact request:', JSON.stringify(exactRequestSent, null, 2));
// Any warnings about parameter modifications
const warnings = response.normalization.warnings;
}
For streaming calls:
for await (const chunk of provider.aiCallStream({ /* ... */ })) {
if (chunk.done && chunk.normalization) {
// Normalization info only in final chunk
const exactRequestSent = chunk.normalization.normalizedRequest;
console.log('Exact request:', JSON.stringify(exactRequestSent, null, 2));
}
}
Model-Specific Normalization Rules
GPT-5-nano / GPT-5-mini:
- Uses Responses API (max_output_tokens)
- temperature: Fixed at 1.0 (parameter omitted)
- top_p: Not supported (omitted)
- reasoningEffort: Not supported (omitted)
- frequencyPenalty / presencePenalty: Not supported (omitted)
- stop: Supported
GPT-5.2 / GPT-5.2-pro:
- Uses Responses API (max_output_tokens)
- reasoningEffort: Supported (none, minimal, low, medium, high)
- temperature: Only allowed when reasoning.effort === 'none' (otherwise omitted)
- top_p: Only allowed when reasoning.effort === 'none' (otherwise omitted)
- frequencyPenalty / presencePenalty: Not supported (omitted)
- stop: Supported
GPT-4-turbo (Chat Completions API):
- Uses Chat Completions API (max_tokens)
- All standard parameters supported (temperature, top_p, frequencyPenalty, presencePenalty, stop)
Debugging Normalization
To see exactly what was sent to OpenAI, always check response.normalization.normalizedRequest. This shows:
- The exact API endpoint format (Responses API vs Chat Completions)
- Which parameters were included/omitted
- The normalized parameter names (max_output_tokens vs max_tokens)
- Any warnings about unsupported parameters
Error Handling
The provider includes custom error types for better error handling:
import { OpenAIProviderError, OpenAIAPIError } from '@x12i/ai-provider-openai';
try {
const response = await provider.aiCall(request);
} catch (error) {
if (error instanceof OpenAIAPIError) {
console.error('API Error:', error.statusCode, error.message);
} else if (error instanceof OpenAIProviderError) {
console.error('Provider Error:', error.message);
}
}
Supported Models
The provider supports all OpenAI chat completion models. The model MUST be specified in each request's config.model - there is no default model.
Supported models include:
- gpt-4-turbo
- gpt-4
- gpt-4-32k
- gpt-3.5-turbo
- gpt-4o
- gpt-4o-mini
- And other OpenAI chat models
Refer to OpenAI's model documentation for the complete list.
Configuration Options
Request Configuration
The AIRequestConfig interface supports the following options:
- maxTokens?: number - Maximum tokens to generate
- temperature?: number - Sampling temperature (0-2)
- topP?: number - Nucleus sampling parameter
- timeoutMs?: number - Request timeout in milliseconds
- Additional provider-specific options can be passed via index signature
Model Selection
The model MUST be specified in each request's config.model - it is required, not optional. The provider will throw an error if the model is not provided.
const response = await provider.aiCall({
instructions: '...',
inputData: '...',
config: {
model: 'gpt-4', // REQUIRED: Model must be provided
maxTokens: 500,
},
});
How It Works
The provider converts the task-style interface (instructions + inputData) into OpenAI's chat format:
- instructions → System message
- inputData → User message (serialized to JSON string if object/array)
- model → Must be provided in config.model (required, no defaults)
- Response is returned as raw text exactly as received from the API (no parsing or manipulation)
For streaming:
- Tokens are streamed as they're generated
- Each chunk contains incremental text deltas
- Final chunk includes the complete accumulated output
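The instructions/inputData mapping described in this section might look roughly like the sketch below. `toChatMessages` is a hypothetical name, not an export of this package; the two-space JSON indentation and the choice to skip the user message when inputData is undefined are assumptions made for illustration.

```typescript
type ChatMessage = { role: 'system' | 'user'; content: string };

// Hypothetical sketch of the task-style -> chat-style conversion.
function toChatMessages(instructions: string, inputData?: unknown): ChatMessage[] {
  // instructions always become the system message.
  const messages: ChatMessage[] = [{ role: 'system', content: instructions }];
  if (inputData !== undefined) {
    // Strings pass through unchanged; objects and arrays are serialized to JSON.
    const content =
      typeof inputData === 'string' ? inputData : JSON.stringify(inputData, null, 2);
    messages.push({ role: 'user', content });
  }
  return messages;
}

const structured = toChatMessages('Analyze the following data', {
  items: ['apple', 'banana', 'cherry'],
  count: 3,
});
```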
API Rate Limits
OpenAI has rate limits based on your account tier. Refer to OpenAI's documentation for details.
Cost Calculation
Token usage is included in the response, but cost calculation should be handled by the consuming application or router. Token costs vary by model:
- GPT-4 Turbo: $0.01/1K input tokens, $0.03/1K output tokens
- GPT-4: $0.03/1K input tokens, $0.06/1K output tokens
- GPT-3.5 Turbo: $0.0015/1K input tokens, $0.002/1K output tokens
Refer to OpenAI's pricing page for current rates.
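Since cost calculation is left to the consuming application, a minimal estimator built on the usage object could look like the sketch below. `estimateCostUsd` and the `PRICES` table are hypothetical; the rates are copied from the list above and may be out of date, so treat them as placeholders to be replaced with current pricing.

```typescript
interface Usage {
  inputTokens?: number;
  outputTokens?: number;
}

interface Price {
  inputPer1K: number;  // USD per 1K input tokens
  outputPer1K: number; // USD per 1K output tokens
}

// Rates copied from the list above; verify against current pricing before use.
const PRICES: Record<string, Price> = {
  'gpt-4-turbo': { inputPer1K: 0.01, outputPer1K: 0.03 },
  'gpt-4': { inputPer1K: 0.03, outputPer1K: 0.06 },
  'gpt-3.5-turbo': { inputPer1K: 0.0015, outputPer1K: 0.002 },
};

// Returns undefined for models without a known price instead of guessing.
function estimateCostUsd(model: string, usage: Usage): number | undefined {
  const price = PRICES[model];
  if (!price) return undefined;
  const input = ((usage.inputTokens ?? 0) / 1000) * price.inputPer1K;
  const output = ((usage.outputTokens ?? 0) / 1000) * price.outputPer1K;
  return input + output;
}

// 2,000 input tokens and 500 output tokens on GPT-4 Turbo.
const estimate = estimateCostUsd('gpt-4-turbo', { inputTokens: 2000, outputTokens: 500 });
```

In practice the model name would come from response.metadata.model and the token counts from response.usage.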
Batch API Types
The provider exports types for Batch API operations:
import type {
BatchFileUploadRequest,
BatchFileUploadResponse,
BatchJobCreateRequest,
BatchJobCreateResponse,
BatchJobRetrieveRequest,
FileContentRequest,
FileContentResponse,
} from '@x12i/ai-provider-openai';
Links
License
MIT
