@x12i/ai-provider-openai

v3.2.1

Published

a month ago

OpenAI provider implementation for LLM providers

x12i

openai llm provider gpt x12i

OpenAI provider implementation for the AI providers ecosystem. This package provides a standardized interface for interacting with OpenAI's GPT models (GPT-4, GPT-3.5, etc.) through the AIProviderInterface.

Installation

npm install @x12i/ai-provider-openai

Configuration

The provider requires an OpenAI API key. You can also configure a custom base URL and organization ID.

⚠️ IMPORTANT: Model Selection Architecture

This provider follows a strict architectural pattern where models MUST be provided in each request's config.model. The provider does NOT support default models or hardcoded fallbacks. This ensures:

Full control over model selection from the application layer
Ability to switch models without code changes
Provider-agnostic model configuration
Support for dynamic model selection and routing

Default models should be configured in the Gateway, not in providers.
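One way to honor this rule on the gateway side is to resolve the model before a request reaches the provider. The sketch below is a hypothetical helper: `withResolvedModel`, `SketchRequestConfig`, and `GatewayDefaults` are illustrative names and are not part of this package. It fills in `config.model` from a gateway-level default and fails early when neither is set.

```typescript
// Hypothetical gateway-side helper; none of these names come from the package.
interface SketchRequestConfig {
  model?: string;
  [key: string]: unknown;
}

interface GatewayDefaults {
  defaultModel?: string;
}

// Returns a copy of the config in which `model` is guaranteed to be set.
function withResolvedModel(
  config: SketchRequestConfig,
  defaults: GatewayDefaults,
): SketchRequestConfig & { model: string } {
  const model = config.model ?? defaults.defaultModel;
  if (!model) {
    throw new Error('No model in config.model and no gateway default configured');
  }
  return { ...config, model };
}

const resolvedConfig = withResolvedModel({ maxTokens: 100 }, { defaultModel: 'gpt-4-turbo' });
console.log(resolvedConfig.model);
```

The resolved config can then be passed as `config` to `aiCall` or `aiCallStream`, so the provider itself never has to guess a model.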

⚠️ IMPORTANT: GPT-5 Organization Requirement

GPT-5 models (gpt-5-mini, gpt-5-nano, gpt-5.2, etc.) require an organization header to be configured. If you use GPT-5 models without setting an organization, the provider will throw a clear error:

GPT-5 models (like 'gpt-5-mini') require an organization header.
Please set OPENAI_ORGANIZATION environment variable or configure organization in provider config.
This is required by OpenAI for GPT-5 model access.

To use GPT-5 models, you MUST set the organization:

import { OpenAIProvider } from '@x12i/ai-provider-openai';

// ✅ Correct: GPT-5 models require organization
const provider = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  organization: process.env.OPENAI_ORGANIZATION!, // REQUIRED for GPT-5 models
});

// ❌ Incorrect: This will fail for GPT-5 models
const providerWithoutOrg = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  // organization: undefined - missing!
});

Environment Variables:

export OPENAI_API_KEY="your-api-key"
export OPENAI_ORGANIZATION="org-your-org-id"  # REQUIRED for GPT-5 models

GPT-4 and other models do NOT require organization - it's optional for backward compatibility. Only GPT-5 models enforce this requirement.

import { OpenAIProvider } from '@x12i/ai-provider-openai';

const provider = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1', // Optional, for custom endpoints
  organization: 'org-xxx', // Optional for GPT-4, REQUIRED for GPT-5
});
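To catch the GPT-5 organization requirement before a request is made, an application could run a pre-flight check like the one below. This is a sketch under assumptions: the provider's own detection logic is not documented here, so identifying GPT-5 models by the `gpt-5` name prefix is a guess, and `checkOrganization` and `SketchProviderOptions` are hypothetical names.

```typescript
// Illustrative pre-flight check; not the provider's actual implementation.
interface SketchProviderOptions {
  apiKey: string;
  baseURL?: string;
  organization?: string;
}

// Assumption: GPT-5 family models are recognizable by the 'gpt-5' name prefix.
function isGpt5Model(model: string): boolean {
  return model.toLowerCase().startsWith('gpt-5');
}

// Returns an error message when the combination would be rejected, otherwise null.
function checkOrganization(model: string, options: SketchProviderOptions): string | null {
  if (isGpt5Model(model) && !options.organization) {
    return `GPT-5 models (like '${model}') require an organization header.`;
  }
  return null;
}

console.log(checkOrganization('gpt-5-mini', { apiKey: 'sk-test' }));
console.log(checkOrganization('gpt-4-turbo', { apiKey: 'sk-test' }));
```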

Provider Capabilities

The provider supports querying its capabilities:

const capabilities = provider.getCapabilities();
console.log(capabilities);
// {
//   supportsSync: true,
//   supportsStreaming: true,
//   supportsBatch: false
// }

supportsSync: Always true - synchronous aiCall is supported
supportsStreaming: true - streaming via aiCallStream is supported
supportsBatch: false - OpenAI's batch API is file-based, not request-based (use Batch API methods below)
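Calling code can branch on these flags instead of assuming a feature exists. The following is a minimal sketch; `chooseCallMode` and `SketchCapabilities` are hypothetical names, and the capability shape is copied from the output shown above.

```typescript
// Capability shape as printed by getCapabilities() above.
interface SketchCapabilities {
  supportsSync: boolean;
  supportsStreaming: boolean;
  supportsBatch: boolean;
}

type CallMode = 'stream' | 'sync';

// Picks streaming when requested and available, otherwise falls back to whatever is supported.
function chooseCallMode(caps: SketchCapabilities, preferStreaming: boolean): CallMode {
  if (preferStreaming && caps.supportsStreaming) return 'stream';
  if (caps.supportsSync) return 'sync';
  if (caps.supportsStreaming) return 'stream';
  throw new Error('Provider supports neither streaming nor synchronous calls');
}

const openAiCaps: SketchCapabilities = { supportsSync: true, supportsStreaming: true, supportsBatch: false };
console.log(chooseCallMode(openAiCaps, true));
```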

Usage

Basic Synchronous Call

import { OpenAIProvider } from '@x12i/ai-provider-openai';

const provider = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
});

const response = await provider.aiCall({
  instructions: 'Summarize the following text in one sentence.',
  inputData: 'This is a long article about artificial intelligence...',
  config: {
    model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
    temperature: 0.7,
    maxTokens: 100,
  },
});

console.log(response.output); // The summarized text

Streaming Calls

The provider supports streaming for real-time token-by-token responses:

const provider = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
});

// Check if streaming is supported
if (provider.getCapabilities().supportsStreaming) {
  let fullText = '';
  
  for await (const chunk of provider.aiCallStream({
    instructions: 'Write a short story about a robot.',
    inputData: undefined,
    config: {
      model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
      temperature: 0.8,
      maxTokens: 200,
    },
  })) {
    // Process each chunk as it arrives
    if (chunk.deltaText) {
      process.stdout.write(chunk.deltaText);
      fullText += chunk.deltaText;
    }
    
    // Check if this is the final chunk
    if (chunk.done) {
      console.log('\n\nStreaming completed!');
      console.log('Full text:', fullText);
    }
  }
}

Reasoning Models Support

OpenAI's reasoning models (o1, o1-mini, o3-mini, o1-pro) expose their internal "thinking process" as reasoning content that streams separately from the final answer.

Features

✅ Stream reasoning content incrementally
✅ Access complete reasoning in final chunk
✅ Support for multi-item responses
✅ Automatic buffer management
✅ Comprehensive logging via logxer

Streaming Example

import { OpenAIProvider } from '@x12i/ai-provider-openai';

const provider = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
});

for await (const chunk of provider.aiCallStream({
  instructions: 'Solve this problem with detailed reasoning',
  inputData: 'What is 42 × 37?',
  config: {
    model: 'o3-mini',
    reasoningEffort: 'high', // 'low' | 'medium' | 'high'
  },
})) {
  // Stream reasoning (thinking process)
  if (chunk.metadata?.reasoningDelta) {
    process.stderr.write(chunk.metadata.reasoningDelta);
  }

  // Stream final answer
  if (chunk.deltaText) {
    process.stdout.write(chunk.deltaText);
  }

  // Access complete reasoning in final chunk
  if (chunk.done && chunk.metadata?.reasoningAvailable) {
    console.log('\n\n=== Complete Reasoning ===');
    console.log(chunk.metadata.reasoning);
    console.log('\n=== Final Answer ===');
    console.log(chunk.deltaOutput);
  }
}

Synchronous Example

const result = await provider.aiCall({
  instructions: 'Explain your reasoning',
  config: { model: 'o1-mini' },
});

if (result.metadata?.reasoningAvailable) {
  console.log('Reasoning:', result.metadata.reasoning);
}
console.log('Answer:', result.output);

Important Notes

Reasoning can be lengthy: o1/o3 models may generate thousands of tokens
Reasoning streams first: Typically arrives before the final answer
Only o-series models: Regular models (gpt-4, gpt-3.5) don't expose reasoning
Cost consideration: Reasoning tokens count toward usage limits
Logging: All reasoning events are logged via logxer for debugging

Architecture

Reasoning Flow:
├── SSE Events (response.content_part.delta/done)
├── Event Handlers (event-handlers.ts)
├── Accumulation (utils/accumulator.ts)
├── Logging (utils/logger.ts via logxer)
└── Final Chunk (includes complete reasoning)
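The accumulation step in this flow can be pictured as a fold over the incoming chunks. The sketch below is not the package's `utils/accumulator.ts`; it is an illustrative reducer over a reduced chunk shape (only `deltaText`, `done`, and `metadata.reasoningDelta`, as used in the streaming example), and `accumulateChunks` is a hypothetical name.

```typescript
// Reduced chunk shape: only the fields used in the streaming example.
interface SketchChunk {
  deltaText?: string;
  done?: boolean;
  metadata?: { reasoningDelta?: string };
}

interface Accumulated {
  reasoning: string;
  answer: string;
  done: boolean;
}

// Folds chunks into separate reasoning and answer buffers, in arrival order.
function accumulateChunks(chunks: SketchChunk[]): Accumulated {
  const state: Accumulated = { reasoning: '', answer: '', done: false };
  for (const chunk of chunks) {
    if (chunk.metadata?.reasoningDelta) state.reasoning += chunk.metadata.reasoningDelta;
    if (chunk.deltaText) state.answer += chunk.deltaText;
    if (chunk.done) state.done = true;
  }
  return state;
}

const accumulated = accumulateChunks([
  { metadata: { reasoningDelta: '42 x 37 = 42 x 40 ' } },
  { metadata: { reasoningDelta: '- 42 x 3' } },
  { deltaText: '15' },
  { deltaText: '54', done: true },
]);
console.log(accumulated);
```

Keeping the two buffers separate mirrors the streaming example, where reasoning goes to stderr and the answer goes to stdout.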

With Structured Data

const response = await provider.aiCall({
  instructions: 'Analyze the following data and return a JSON object with a summary.',
  inputData: {
    items: ['apple', 'banana', 'cherry'],
    count: 3,
  },
  config: {
    model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
    temperature: 0.3,
    maxTokens: 200,
  },
});

// response.output will be parsed JSON if the response is valid JSON
console.log(response.output); // { summary: "..." }
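The two conversions involved here (serializing structured `inputData` on the way in, and parsing JSON on the way out) can be sketched as follows. The function names are illustrative, and the exact rules the provider applies are not spelled out in this README; for instance, whether bare JSON scalars such as `42` are parsed is an assumption of this sketch.

```typescript
// Objects and arrays become a JSON string; strings pass through unchanged.
function serializeInput(inputData: unknown): string | undefined {
  if (inputData === undefined || inputData === null) return undefined;
  return typeof inputData === 'string' ? inputData : JSON.stringify(inputData);
}

// Returns the parsed value when the text is valid JSON, otherwise the text itself.
function parseOutput(rawText: string): unknown {
  try {
    return JSON.parse(rawText);
  } catch {
    return rawText;
  }
}

console.log(serializeInput({ items: ['apple', 'banana', 'cherry'], count: 3 }));
console.log(parseOutput('{"summary":"three fruits"}'));
console.log(parseOutput('Three fruits were listed.'));
```

Because models do not always return strict JSON, callers that need an object should still check the type of `response.output` before using it.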

With Tags and Trace ID

const response = await provider.aiCall({
  instructions: 'Process this data',
  inputData: someData,
  config: {
    model: 'gpt-4-turbo', // REQUIRED: Model must be provided in config.model
    maxTokens: 500,
  },
  tags: ['production', 'batch-job'],
  traceId: 'trace-12345',
});

OpenAI Batch API (File-Based)

The provider includes OpenAI-specific Batch API methods for cost-effective batch processing. These are separate from the standard interface batch methods and use OpenAI's file-based batch system.

Upload Batch File

import { OpenAIProvider } from '@x12i/ai-provider-openai';
import type { BatchFileUploadRequest } from '@x12i/ai-provider-openai';

const provider = new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY!,
});

// Prepare batch file (NDJSON format)
// IMPORTANT: Use '/v1/responses' for Responses API (recommended)
// The url in each request must match the endpoint used in createBatchJob()
const batchData = [
  {
    custom_id: 'req-1',
    method: 'POST',
    url: '/v1/responses',  // Use Responses API endpoint
    body: {
      model: 'gpt-5-nano',
      input: 'Hello 1',  // Responses API bodies take `input`, not `messages`
      max_output_tokens: 100
    }
  },
  {
    custom_id: 'req-2',
    method: 'POST',
    url: '/v1/responses',  // Use Responses API endpoint
    body: {
      model: 'gpt-5-nano',
      input: 'Hello 2',
      max_output_tokens: 100
    }
  },
];
const fileBuffer = Buffer.from(batchData.map(r => JSON.stringify(r)).join('\n'));

// Upload file
const fileResponse = await provider.uploadBatchFile({
  file: fileBuffer,
  filename: 'batch-requests.jsonl',
  purpose: 'batch',
});

console.log('File ID:', fileResponse.id);

Create Batch Job

import type { BatchJobCreateRequest } from '@x12i/ai-provider-openai';

// IMPORTANT: Default endpoint is '/v1/responses' (Responses API)
// This matches the url in your batch file and is consistent with aiCall() and aiCallStream()
// Legacy '/v1/chat/completions' is still supported but not recommended
const batchJob = await provider.createBatchJob({
  input_file_id: fileResponse.id,
  endpoint: '/v1/responses',  // Optional: defaults to '/v1/responses' if omitted
  completion_window: '24h', // Optional, default is 24h
});

console.log('Batch Job ID:', batchJob.id);
console.log('Status:', batchJob.status); // 'validating' | 'in_progress' | etc.

Retrieve Batch Job Status

import type { BatchJobRetrieveRequest } from '@x12i/ai-provider-openai';

const status = await provider.retrieveBatchJob({
  batchId: batchJob.id,
});

console.log('Current status:', status.status);
console.log('Completed:', status.request_counts.completed);
console.log('Failed:', status.request_counts.failed);
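Batch jobs run asynchronously, so callers usually poll until the job reaches a final state. The sketch below assumes the status values listed in OpenAI's Batch API documentation at the time of writing; `isTerminal` and `waitForBatch` are hypothetical helpers, and the `retrieve` callback stands in for a call to `provider.retrieveBatchJob`.

```typescript
// Status values as documented for OpenAI's Batch API (an assumption of this sketch).
type BatchStatus =
  | 'validating' | 'failed' | 'in_progress' | 'finalizing'
  | 'completed' | 'expired' | 'cancelling' | 'cancelled';

const TERMINAL_STATUSES: ReadonlyArray<BatchStatus> = ['completed', 'failed', 'expired', 'cancelled'];

// True once the job can no longer change state.
function isTerminal(batchStatus: BatchStatus): boolean {
  return TERMINAL_STATUSES.includes(batchStatus);
}

// Polls `retrieve` until a terminal status is seen or the attempt budget runs out.
async function waitForBatch(
  retrieve: () => Promise<{ status: BatchStatus }>,
  intervalMs = 30_000,
  maxAttempts = 120,
): Promise<BatchStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await retrieve();
    if (isTerminal(job.status)) return job.status;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Batch did not finish after ${maxAttempts} polls`);
}
```

With the provider from the examples above, usage might look like `await waitForBatch(() => provider.retrieveBatchJob({ batchId: batchJob.id }))`, assuming the returned object's `status` field uses these values.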

Download Batch Results

import type { FileContentRequest } from '@x12i/ai-provider-openai';

// Once batch is completed, download results
if (status.status === 'completed' && status.output_file_id) {
  const results = await provider.downloadFileContent({
    fileId: status.output_file_id,
  });
  
  // Parse NDJSON results
  const responses = results.content
    .split('\n')
    .filter(line => line.trim())
    .map(line => JSON.parse(line));
  
  console.log('Batch results:', responses);
}
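Result lines are matched back to requests through `custom_id`, since OpenAI's documentation notes that output order may differ from input order. The helper below is a sketch: `indexByCustomId` is a hypothetical name, and the line shape (`custom_id`, `response.status_code`, `response.body`, `error`) follows OpenAI's Batch API documentation rather than anything this package guarantees.

```typescript
// Assumed shape of one output line, per OpenAI's Batch API documentation.
interface BatchOutputLine {
  custom_id: string;
  response: { status_code: number; body: unknown } | null;
  error: { code: string; message: string } | null;
}

// Parses NDJSON text and indexes each line by its custom_id.
function indexByCustomId(ndjson: string): Map<string, BatchOutputLine> {
  const byId = new Map<string, BatchOutputLine>();
  for (const line of ndjson.split('\n')) {
    if (!line.trim()) continue; // skip blank lines, including a trailing newline
    const parsed = JSON.parse(line) as BatchOutputLine;
    byId.set(parsed.custom_id, parsed);
  }
  return byId;
}

// Sample data with the lines deliberately out of order.
const sampleNdjson = [
  JSON.stringify({ custom_id: 'req-2', response: { status_code: 200, body: { ok: true } }, error: null }),
  JSON.stringify({ custom_id: 'req-1', response: { status_code: 200, body: { ok: true } }, error: null }),
].join('\n') + '\n';

const indexedResults = indexByCustomId(sampleNdjson);
console.log(indexedResults.get('req-1')?.response?.status_code);
```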

Response Structure

Synchronous Response

The provider returns a standardized AIResponse object:

{
  output: unknown;              // The main result (text or parsed JSON)
  rawText?: string;             // Raw text response from the API
  metadata: {                   // Execution metadata
    provider: 'openai';
    model?: string;
    providerRequestId?: string;
    durationMs?: number;
    normalizedConfig?: Record<string, any>;  // The normalized config parameters (config is metadata)
  };
  normalization?: {             // Optional: Only present when normalization occurred (separate from metadata)
    messagesNormalized?: boolean;  // Whether messages were normalized to Responses API format
    configNormalized?: boolean;   // Whether config parameters were normalized
    normalizedRequest?: Record<string, any>;  // ⭐ THE EXACT REQUEST BODY SENT TO OPENAI API
    warnings?: string[];           // Warnings about parameter modifications (e.g., unsupported params removed)
  };
  usage?: {                     // Token usage information
    inputTokens?: number;
    outputTokens?: number;
    totalTokens?: number;
  };
}

Normalization Information

The normalization field is a top-level field (separate from metadata) that is optional and only present when normalization occurs. This happens when:

Messages Normalization: When the provider converts your request format (either gateway messages array or legacy instructions + inputData) to OpenAI's Responses API format (instructions + input). This always happens for Responses API compatibility.
Config Normalization: When the provider normalizes configuration parameters for model-specific requirements:
- Parameter name conversions (e.g., maxTokens → max_output_tokens for Responses API models)
- Removal of unsupported parameters for specific models
- Parameter value adjustments based on model capabilities

Normalization Fields:

metadata.normalizedConfig: Record<string, any> - The normalized config parameters (config is metadata, so it's under metadata)
normalization.messagesNormalized: boolean - Whether messages were normalized to Responses API format
normalization.configNormalized: boolean - Whether config parameters were normalized
normalization.normalizedRequest: Record<string, any> - ⭐ THE EXACT REQUEST BODY SENT TO OPENAI API - The complete normalized request that was actually sent to OpenAI's /v1/responses endpoint. Inspect it to see exactly what data was sent to OpenAI. It includes:
- The normalized messages (instructions + input fields)
- All normalized config parameters (max_output_tokens, stop, reasoning, etc.)
- The model field
normalization.warnings: string[] - Warnings about parameter modifications (e.g., unsupported params removed)

🔍 Accessing the Exact Request Sent to OpenAI:

To see the exact request body that was sent to OpenAI's API, access response.normalization.normalizedRequest:

const response = await provider.aiCall({
  instructions: 'Your instructions',
  inputData: 'Your input',
  config: { model: 'gpt-5-nano', maxTokens: 500 }
});

// Get the exact request sent to OpenAI
const exactRequestSent = response.normalization?.normalizedRequest;
console.log('Exact request sent to OpenAI:', JSON.stringify(exactRequestSent, null, 2));

Example: GPT-5 Model with Normalization

const response = await provider.aiCall({
  instructions: 'Summarize this text',
  inputData: 'Long text here...',
  config: {
    model: 'gpt-5-nano',  // Responses API model
    maxTokens: 500,        // Will be normalized to max_output_tokens
    temperature: 0.8,      // Will be omitted (not supported by gpt-5-nano)
  },
});

// The normalized config parameters (under metadata, since config is metadata)
if (response.metadata.normalizedConfig) {
  console.log('Normalized config:', response.metadata.normalizedConfig);
  // Example output:
  // {
  //   max_output_tokens: 500
  //   // stop would also appear here if it had been provided
  // }
}

// Check if normalization occurred
if (response.normalization) {
  console.log('Messages normalized:', response.normalization.messagesNormalized);
  console.log('Config normalized:', response.normalization.configNormalized);
  
  // The complete normalized request body sent to OpenAI API
  console.log('Normalized request:', response.normalization.normalizedRequest);
  // Example output:
  // {
  //   model: 'gpt-5-nano',
  //   instructions: 'Summarize this text',
  //   input: 'Long text here...',
  //   max_output_tokens: 500,
  //   // stop: ['\n\n'] would also appear here if it had been provided
  //   // Note: temperature is omitted (not supported by gpt-5-nano)
  // }
  
  // Check for warnings about parameter modifications
  if (response.normalization.warnings) {
    console.log('Normalization warnings:', response.normalization.warnings);
    // Example: ["Omitted temperature=0.8 (model uses fixed default 1, parameter not accepted)."]
  }
}

Example: GPT-5.2 with Reasoning Effort

const response = await provider.aiCall({
  instructions: 'Solve this math problem',
  inputData: 'What is 2 + 2?',
  config: {
    model: 'gpt-5.2',
    maxTokens: 1000,
    reasoningEffort: 'high',  // Normalized to reasoning: { effort: 'high' }
    temperature: 0.7,         // Omitted (only allowed when reasoning.effort is 'none')
  },
});

if (response.normalization?.normalizedRequest) {
  console.log('Exact request sent to OpenAI:', 
    JSON.stringify(response.normalization.normalizedRequest, null, 2));
  // Output:
  // {
  //   "model": "gpt-5.2",
  //   "instructions": "Solve this math problem",
  //   "input": "What is 2 + 2?",
  //   "max_output_tokens": 1000,
  //   "reasoning": {
  //     "effort": "high"
  //   },
  //   "reasoning_effort": "high"
  //   // Note: temperature is omitted
  // }
}

Note: The normalization field (top-level, separate from metadata) is only included when normalization actually occurs. If your request already matches the API format exactly, this field will be undefined.

Streaming Chunks

When using aiCallStream, you receive AIStreamChunk objects:

{
  deltaText?: string;           // Incremental text delta
  deltaOutput?: unknown;        // Incremental structured output (optional)
  done?: boolean;               // True on final chunk
  metadata?: {                  // Partial metadata per chunk
    provider?: string;
    model?: string;
    providerRequestId?: string;
    normalizedConfig?: Record<string, any>;  // The normalized config parameters (only in final chunk)
    usage?: {                   // Token usage (only in final chunk)
      inputTokens?: number;
      outputTokens?: number;
      totalTokens?: number;
    };
  };
  normalization?: {             // Optional: Only present in final chunk when normalization occurred (separate from metadata)
    messagesNormalized?: boolean;
    configNormalized?: boolean;
    normalizedRequest?: Record<string, any>;  // ⭐ THE EXACT REQUEST BODY SENT TO OPENAI API
    warnings?: string[];
  };
}

Note: Normalization (including normalizedRequest) is only included in the final chunk (when done: true) of a streaming response, and only when normalization actually occurred.

Normalization Details

This provider automatically normalizes requests to match OpenAI's Responses API format and model-specific capabilities. Understanding this normalization is crucial for debugging and ensuring your requests work correctly with modern OpenAI models like GPT-5.

What Gets Normalized

1. Messages Format Normalization

Input formats accepted:

Gateway format: messages array with { role: "system" | "user", content: string }
Legacy format: instructions + inputData

Output format (always):

Responses API format: instructions + input
- First system message → instructions
- Optional middle system message → developer role in input array
- Last user message → user role in input (or plain string if no context)

Example transformation:

// Input (gateway format)
{
  messages: [
    { role: "system", content: "You are a helpful assistant" },
    { role: "user", content: "Hello" }
  ],
  config: { model: "gpt-5-nano" }
}

// Normalized request sent to OpenAI
{
  model: "gpt-5-nano",
  instructions: "You are a helpful assistant",
  input: "Hello"
}
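The mapping rules listed above can be sketched as a small pure function. This is an illustration, not the provider's code: `normalizeMessages` and the interface names are hypothetical, and because the README does not say how earlier user messages are handled, this sketch simply ignores them.

```typescript
interface GatewayMessage {
  role: 'system' | 'user';
  content: string;
}

interface InputItem {
  role: 'developer' | 'user';
  content: string;
}

interface NormalizedMessages {
  instructions?: string;
  input: string | InputItem[];
}

// First system message -> instructions; later system messages -> developer items;
// last user message -> plain string, or a user item when developer context exists.
function normalizeMessages(messages: GatewayMessage[]): NormalizedMessages {
  const systems = messages.filter((m) => m.role === 'system');
  const users = messages.filter((m) => m.role === 'user');
  const lastUser = users[users.length - 1];
  if (!lastUser) throw new Error('At least one user message is required');

  const [firstSystem, ...otherSystems] = systems;
  const developerItems: InputItem[] = otherSystems.map((m) => ({ role: 'developer', content: m.content }));
  const userItem: InputItem = { role: 'user', content: lastUser.content };
  const input: string | InputItem[] =
    developerItems.length === 0 ? lastUser.content : [...developerItems, userItem];

  const normalized: NormalizedMessages = { input };
  if (firstSystem) normalized.instructions = firstSystem.content;
  return normalized;
}

console.log(normalizeMessages([
  { role: 'system', content: 'You are a helpful assistant' },
  { role: 'user', content: 'Hello' },
]));
```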

2. Config Parameter Normalization

The provider normalizes configuration parameters based on the model's capabilities:

Token Parameter Mapping:

maxTokens → max_output_tokens (for Responses API models like GPT-5)
maxTokens → max_tokens (for Chat Completions API models like GPT-4)
Also accepts: maxOutputTokens, max_output_tokens, max_tokens directly

Parameter Removal:

Unsupported parameters are omitted (not sent to API)
Examples:
- temperature omitted for GPT-5-nano/mini (fixed at 1.0)
- temperature omitted for GPT-5.2 when reasoning.effort !== 'none'
- top_p omitted for models that don't support it
- frequencyPenalty/presencePenalty omitted for GPT-5 models

Reasoning Effort Normalization:

reasoningEffort: 'high' → reasoning: { effort: 'high' } + reasoning_effort: 'high'
reasoning_effort: 'medium' → reasoning: { effort: 'medium' } + reasoning_effort: 'medium'
reasoning: { effort: 'low' } → kept as-is + reasoning_effort: 'low'
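All three spellings end up as the same pair of fields. A minimal sketch of that mapping is shown below; `normalizeReasoning` is a hypothetical name, and the precedence used when more than one spelling is supplied is an assumption of this sketch rather than documented behavior.

```typescript
type Effort = 'none' | 'minimal' | 'low' | 'medium' | 'high';

interface ReasoningInput {
  reasoningEffort?: Effort;
  reasoning_effort?: Effort;
  reasoning?: { effort?: Effort };
}

interface ReasoningOutput {
  reasoning?: { effort: Effort };
  reasoning_effort?: Effort;
}

// Collapses the accepted spellings into reasoning.effort plus reasoning_effort.
function normalizeReasoning(config: ReasoningInput): ReasoningOutput {
  // Assumed precedence: camelCase first, then snake_case, then the nested object.
  const effort = config.reasoningEffort ?? config.reasoning_effort ?? config.reasoning?.effort;
  if (effort === undefined) return {};
  return { reasoning: { effort }, reasoning_effort: effort };
}

console.log(normalizeReasoning({ reasoningEffort: 'high' }));
console.log(normalizeReasoning({ reasoning: { effort: 'low' } }));
```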

Complete Example: GPT-5 Request Flow

What you send:

const response = await provider.aiCall({
  instructions: 'Analyze this data',
  inputData: { items: [1, 2, 3] },
  config: {
    model: 'gpt-5-nano',
    maxTokens: 500,
    temperature: 0.8,      // Will be omitted
    topP: 0.9,             // Will be omitted
    stop: ['\n\n'],
  },
});

What gets sent to OpenAI API (from response.normalization.normalizedRequest):

{
  "model": "gpt-5-nano",
  "instructions": "Analyze this data",
  "input": "{\n  \"items\": [1, 2, 3]\n}",
  "max_output_tokens": 500,
  "stop": ["\n\n"]
  // Note: temperature and topP are omitted (not supported by gpt-5-nano)
}

Warnings (from response.normalization.warnings):

[
  "Omitted temperature=0.8 (model uses fixed default 1, parameter not accepted).",
  "Omitted top_p=0.9 (not supported for model / reasoning setting)."
]

Accessing Normalization Information

For synchronous calls:

const response = await provider.aiCall({ /* ... */ });

// The normalized config parameters (under metadata, since config is metadata)
if (response.metadata.normalizedConfig) {
  const normalizedConfig = response.metadata.normalizedConfig;
  console.log('Normalized config:', normalizedConfig);
}

// Check if normalization occurred
if (response.normalization) {
  // The complete request body sent to OpenAI
  const exactRequestSent = response.normalization.normalizedRequest;
  console.log('Exact request:', JSON.stringify(exactRequestSent, null, 2));
  
  // Any warnings about parameter modifications
  const warnings = response.normalization.warnings;
}

For streaming calls:

for await (const chunk of provider.aiCallStream({ /* ... */ })) {
  if (chunk.done && chunk.normalization) {
    // Normalization info only in final chunk
    const exactRequestSent = chunk.normalization.normalizedRequest;
    console.log('Exact request:', JSON.stringify(exactRequestSent, null, 2));
  }
}

Model-Specific Normalization Rules

GPT-5-nano / GPT-5-mini:

Uses Responses API (max_output_tokens)
temperature: Fixed at 1.0 (parameter omitted)
top_p: Not supported (omitted)
reasoningEffort: Not supported (omitted)
frequencyPenalty/presencePenalty: Not supported (omitted)
stop: Supported

GPT-5.2 / GPT-5.2-pro:

Uses Responses API (max_output_tokens)
reasoningEffort: Supported (none, minimal, low, medium, high)
temperature: Only allowed when reasoning.effort === 'none' (otherwise omitted)
top_p: Only allowed when reasoning.effort === 'none' (otherwise omitted)
frequencyPenalty/presencePenalty: Not supported (omitted)
stop: Supported

GPT-4-turbo (Chat Completions API):

Uses Chat Completions API (max_tokens)
All standard parameters supported (temperature, top_p, frequencyPenalty, presencePenalty, stop)
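The rules for these three families can be condensed into one function. What follows is a sketch under several assumptions: model families are detected by name prefix, warning texts are simplified, the penalty parameters and the duplicate `reasoning_effort` field are left out for brevity, and dropping `reasoningEffort` for Chat Completions models is a guess. None of the names (`familyOf`, `normalizeConfig`, `SketchConfig`, `SketchRequest`) come from the package.

```typescript
// Sketch only: covers the three families listed above and nothing else.
type ModelFamily = 'gpt5-small' | 'gpt5.2' | 'chat';

interface SketchConfig {
  model: string;
  maxTokens?: number;
  temperature?: number;
  topP?: number;
  reasoningEffort?: string;
  stop?: string[];
}

interface SketchRequest {
  model: string;
  max_tokens?: number;
  max_output_tokens?: number;
  temperature?: number;
  top_p?: number;
  stop?: string[];
  reasoning?: { effort: string };
}

// Assumption: families are recognizable by name prefix.
function familyOf(model: string): ModelFamily {
  if (model.startsWith('gpt-5.2')) return 'gpt5.2';
  if (model.startsWith('gpt-5-nano') || model.startsWith('gpt-5-mini')) return 'gpt5-small';
  if (model.startsWith('gpt-4') || model.startsWith('gpt-3.5')) return 'chat';
  throw new Error(`Model '${model}' is not covered by this sketch`);
}

function normalizeConfig(config: SketchConfig): { request: SketchRequest; warnings: string[] } {
  const family = familyOf(config.model);
  const request: SketchRequest = { model: config.model };
  const warnings: string[] = [];

  // Token limit: the parameter name depends on the target API.
  if (config.maxTokens !== undefined) {
    if (family === 'chat') request.max_tokens = config.maxTokens;
    else request.max_output_tokens = config.maxTokens;
  }
  if (config.stop !== undefined) request.stop = config.stop;

  // Reasoning effort is only forwarded for the GPT-5.2 family in this sketch.
  if (config.reasoningEffort !== undefined) {
    if (family === 'gpt5.2') request.reasoning = { effort: config.reasoningEffort };
    else warnings.push(`Omitted reasoningEffort=${config.reasoningEffort}`);
  }

  // Sampling parameters: kept for chat models, dropped for nano/mini,
  // and kept for GPT-5.2 only when reasoning effort is 'none'.
  const samplingAllowed =
    family === 'chat' || (family === 'gpt5.2' && config.reasoningEffort === 'none');
  if (config.temperature !== undefined) {
    if (samplingAllowed) request.temperature = config.temperature;
    else warnings.push(`Omitted temperature=${config.temperature}`);
  }
  if (config.topP !== undefined) {
    if (samplingAllowed) request.top_p = config.topP;
    else warnings.push(`Omitted top_p=${config.topP}`);
  }
  return { request, warnings };
}

console.log(normalizeConfig({ model: 'gpt-5-nano', maxTokens: 500, temperature: 0.8, topP: 0.9 }));
```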

Debugging Normalization

To see exactly what was sent to OpenAI, always check response.normalization.normalizedRequest alongside response.normalization.warnings. Together they show:

The exact API endpoint format (Responses API vs Chat Completions)
Which parameters were included/omitted
The normalized parameter names (max_output_tokens vs max_tokens)
Any warnings about unsupported parameters

Error Handling

The provider includes custom error types for better error handling:

import { OpenAIProviderError, OpenAIAPIError } from '@x12i/ai-provider-openai';

try {
  const response = await provider.aiCall(request);
} catch (error) {
  if (error instanceof OpenAIAPIError) {
    console.error('API Error:', error.statusCode, error.message);
  } else if (error instanceof OpenAIProviderError) {
    console.error('Provider Error:', error.message);
  }
}
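A common follow-up is deciding whether a failed call is worth retrying. The sketch below assumes only what the example above shows, namely that API errors carry a numeric `statusCode`. Treating 429 and 5xx responses as retryable is a general HTTP convention and not behavior documented by this package; `isRetryable` and `backoffDelayMs` are hypothetical helpers.

```typescript
// Assumption: API errors expose a numeric `statusCode`, as in the example above.
function isRetryable(error: unknown): boolean {
  if (typeof error !== 'object' || error === null) return false;
  const statusCode = (error as { statusCode?: unknown }).statusCode;
  if (typeof statusCode !== 'number') return false;
  // Rate limits and server-side failures are usually transient.
  return statusCode === 429 || statusCode >= 500;
}

// Exponential backoff: baseMs, 2x baseMs, 4x baseMs, ... capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

console.log(isRetryable({ statusCode: 429 }), backoffDelayMs(0));
console.log(isRetryable({ statusCode: 400 }), backoffDelayMs(3));
```

Errors without a status code, such as configuration problems raised before any request is sent, are treated as non-retryable here because repeating the call would not change the outcome.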

Supported Models

The provider supports OpenAI's chat models, along with the GPT-5 and o-series models described in the sections above. The model MUST be specified in each request's config.model; there is no default model.

Supported models include:

gpt-4-turbo
gpt-4
gpt-4-32k
gpt-3.5-turbo
gpt-4o
gpt-4o-mini
And other OpenAI chat models

Refer to OpenAI's model documentation for the complete list.

Configuration Options

Request Configuration

The AIRequestConfig interface supports the following options:

maxTokens?: number - Maximum tokens to generate
temperature?: number - Sampling temperature (0-2)
topP?: number - Nucleus sampling parameter
timeoutMs?: number - Request timeout in milliseconds
Additional provider-specific options can be passed via index signature

Model Selection

The model MUST be specified in each request's config.model - it is required, not optional. The provider will throw an error if the model is not provided.

const response = await provider.aiCall({
  instructions: '...',
  inputData: '...',
  config: {
    model: 'gpt-4', // REQUIRED: Model must be provided
    maxTokens: 500,
  },
});

How It Works

The provider converts the task-style interface (instructions + inputData) into OpenAI's chat format:

instructions → System message
inputData → User message (serialized to JSON string if object/array)
model → Must be provided in config.model (required, no defaults)
The raw text is returned in rawText exactly as received from the API; output holds the same text, or the parsed value when that text is valid JSON

For streaming:

Tokens are streamed as they're generated
Each chunk contains incremental text deltas
Final chunk includes the complete accumulated output

API Rate Limits

OpenAI has rate limits based on your account tier. Refer to OpenAI's documentation for details.

Cost Calculation

Token usage is included in the response, but cost calculation should be handled by the consuming application or router. Token costs vary by model:

GPT-4 Turbo: $0.01/1K input tokens, $0.03/1K output tokens
GPT-4: $0.03/1K input tokens, $0.06/1K output tokens
GPT-3.5 Turbo: $0.0015/1K input tokens, $0.002/1K output tokens

Refer to OpenAI's pricing page for current rates.
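Since cost calculation is left to the consuming application, a minimal estimator over the `usage` field might look like the sketch below. The rate table simply copies the figures listed above, which may be out of date; `estimateCostUsd`, `SketchUsage`, and `ModelRates` are hypothetical names.

```typescript
interface SketchUsage {
  inputTokens?: number;
  outputTokens?: number;
}

interface ModelRates {
  inputPer1K: number;
  outputPer1K: number;
}

// Rates copied from the list above (USD per 1K tokens); check current pricing before relying on them.
const RATES_USD: Record<string, ModelRates> = {
  'gpt-4-turbo': { inputPer1K: 0.01, outputPer1K: 0.03 },
  'gpt-4': { inputPer1K: 0.03, outputPer1K: 0.06 },
  'gpt-3.5-turbo': { inputPer1K: 0.0015, outputPer1K: 0.002 },
};

// Returns the estimated cost in USD, or undefined when the model has no known rate.
function estimateCostUsd(model: string, usage: SketchUsage): number | undefined {
  const rates = RATES_USD[model];
  if (!rates) return undefined;
  const inputCost = ((usage.inputTokens ?? 0) / 1000) * rates.inputPer1K;
  const outputCost = ((usage.outputTokens ?? 0) / 1000) * rates.outputPer1K;
  return inputCost + outputCost;
}

console.log(estimateCostUsd('gpt-4-turbo', { inputTokens: 1000, outputTokens: 500 }));
```

For the call above, the arithmetic is 1,000 input tokens at $0.01 per 1K plus 500 output tokens at $0.03 per 1K, which comes to roughly $0.025.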

Batch API Types

The provider exports types for Batch API operations:

import type {
  BatchFileUploadRequest,
  BatchFileUploadResponse,
  BatchJobCreateRequest,
  BatchJobCreateResponse,
  BatchJobRetrieveRequest,
  FileContentRequest,
  FileContentResponse,
} from '@x12i/ai-provider-openai';

License

MIT
