# @bernierllc/openai-client

v1.2.2

Type-safe OpenAI API client with automatic rate limiting, retry logic, streaming support, and cost tracking.
## Installation

```bash
npm install @bernierllc/openai-client
```

## Features
- Type-safe API: Full TypeScript support with strict typing
- Automatic Retries: Built-in retry logic with exponential backoff
- Rate Limiting: Configurable request rate limiting
- Cost Tracking: Automatic tracking of API usage and costs
- Streaming Support: Stream completions with real-time chunks
- Multiple Models: Support for GPT-4, GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo
- Usage Statistics: Track tokens, costs, and requests by model
- Logger Integration: Uses @bernierllc/logger for structured logging
## Usage

### Basic Completion
```typescript
import { OpenAIClient, OpenAIModel } from '@bernierllc/openai-client';

const client = new OpenAIClient({
  apiKey: process.env.OPENAI_API_KEY
});

const result = await client.complete('Explain quantum computing in simple terms', {
  model: OpenAIModel.GPT_4O,
  maxTokens: 500
});

if (result.success) {
  console.log(result.content);
  console.log('Cost:', result.cost?.totalCost);
  console.log('Tokens:', result.usage?.totalTokens);
}
```

### Streaming Response
```typescript
const result = await client.stream(
  'Write a short story about a robot',
  (chunk) => {
    process.stdout.write(chunk);
  },
  {
    model: OpenAIModel.GPT_4O_MINI,
    maxTokens: 1000
  }
);

console.log('\nStreaming complete!');
console.log('Total cost:', result.cost?.totalCost);
```

### With System Prompt
```typescript
const result = await client.complete(
  'What is the capital of France?',
  {
    systemPrompt: 'You are a helpful geography teacher. Keep answers brief.',
    model: OpenAIModel.GPT_35_TURBO,
    temperature: 0.7
  }
);
```

### Advanced Configuration
```typescript
const client = new OpenAIClient({
  apiKey: process.env.OPENAI_API_KEY,
  organization: 'org-123',
  defaultModel: OpenAIModel.GPT_4O,
  maxRetries: 5,
  rateLimit: {
    requestsPerMinute: 60,
    tokensPerMinute: 90000
  },
  enableLogging: true
});
```

### Cost Tracking
```typescript
// Make several requests
await client.complete('Question 1');
await client.complete('Question 2');
await client.complete('Question 3');

// Get usage statistics
const stats = client.getUsage();
console.log('Total Requests:', stats.totalRequests);
console.log('Total Tokens:', stats.totalTokens);
console.log('Total Cost: $', stats.totalCost.toFixed(4));
console.log('Requests by Model:', stats.requestsByModel);

// Reset statistics
client.resetUsage();
```

### Rate Limiting
```typescript
const client = new OpenAIClient({
  apiKey: process.env.OPENAI_API_KEY,
  rateLimit: {
    requestsPerMinute: 10,
    tokensPerMinute: 10000
  }
});

// Check rate limit status
const status = client.getRateLimitStatus();
console.log('Requests remaining:', status?.requestsRemaining);
console.log('Reset time:', status?.resetTime);
```

### Disable Rate Limiting
```typescript
const client = new OpenAIClient({
  apiKey: process.env.OPENAI_API_KEY,
  rateLimit: null // Disable rate limiting
});
```

## API Reference
### OpenAIClient

#### Constructor
```typescript
new OpenAIClient(config: OpenAIClientConfig)
```

**Config Options:**

- `apiKey` (required): OpenAI API key
- `organization` (optional): OpenAI organization ID
- `defaultModel` (optional): Default model to use (default: `GPT_4O_MINI`)
- `maxRetries` (optional): Maximum retry attempts (default: 3)
- `rateLimit` (optional): Rate limiting configuration, or `null` to disable
- `enableLogging` (optional): Enable logging (default: true)
#### Methods
##### `complete(prompt: string, options?: CompletionOptions): Promise<CompletionResult>`

Send a completion request to OpenAI.

**Options:**

- `model`: OpenAI model to use
- `maxTokens`: Maximum tokens to generate (default: 1024)
- `temperature`: Sampling temperature (0-2)
- `topP`: Nucleus sampling parameter
- `frequencyPenalty`: Frequency penalty (-2 to 2)
- `presencePenalty`: Presence penalty (-2 to 2)
- `stopSequences`: Array of stop sequences
- `systemPrompt`: System message for the conversation
- `user`: Unique user identifier for abuse monitoring

**Returns:**

- `success`: Whether the request succeeded
- `content`: Generated text content
- `usage`: Token usage statistics
- `cost`: Cost breakdown (input, output, total)
- `model`: Model used
- `finishReason`: Why generation stopped
- `error`: Error message if the request failed
##### `stream(prompt: string, onChunk: StreamChunkCallback, options?: CompletionOptions): Promise<CompletionResult>`

Stream a completion response with real-time chunks.

**Parameters:**

- `prompt`: The prompt text
- `onChunk`: Callback invoked for each chunk
- `options`: Same as `complete()` options
##### `getUsage(): UsageStats`

Get current usage statistics.

**Returns:**

- `totalRequests`: Total number of requests
- `totalInputTokens`: Total input tokens used
- `totalOutputTokens`: Total output tokens used
- `totalTokens`: Total tokens (input + output)
- `totalCost`: Total cost in USD
- `requestsByModel`: Request count per model
##### `getRateLimitStatus(): RateLimitStatus | null`

Get the current rate limit status. Returns `null` if rate limiting is disabled.

**Returns:**

- `requestsRemaining`: Requests remaining in the current window
- `tokensRemaining`: Tokens remaining (currently not tracked)
- `resetTime`: When the rate limit resets
##### `resetUsage(): void`

Reset all usage statistics.
## Models

```typescript
enum OpenAIModel {
  GPT_4 = 'gpt-4',
  GPT_4_TURBO = 'gpt-4-turbo-preview',
  GPT_4O = 'gpt-4o',
  GPT_4O_MINI = 'gpt-4o-mini',
  GPT_35_TURBO = 'gpt-3.5-turbo',
  GPT_35_TURBO_16K = 'gpt-3.5-turbo-16k'
}
```

### Pricing (per 1M tokens)
| Model | Input | Output |
|-------|-------|--------|
| GPT-4 | $30.00 | $60.00 |
| GPT-4 Turbo | $10.00 | $30.00 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o Mini | $0.15 | $0.60 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
| GPT-3.5 Turbo 16K | $3.00 | $4.00 |

*Pricing as of January 2025. Check OpenAI's pricing page for current rates.*
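The cost fields the client returns follow directly from these per-token rates. As a rough sketch (the pricing map and `estimateCost` helper below are hand-copied from the table for illustration; they are not part of the package's API):

```typescript
// Rates copied from the January 2025 table above, in USD per 1M tokens.
const PRICE_PER_MILLION: Record<string, { input: number; output: number }> = {
  'gpt-4': { input: 30.0, output: 60.0 },
  'gpt-4-turbo-preview': { input: 10.0, output: 30.0 },
  'gpt-4o': { input: 2.5, output: 10.0 },
  'gpt-4o-mini': { input: 0.15, output: 0.6 },
  'gpt-3.5-turbo': { input: 0.5, output: 1.5 },
  'gpt-3.5-turbo-16k': { input: 3.0, output: 4.0 },
};

// Estimate the USD cost of one request from its token counts.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const rate = PRICE_PER_MILLION[model];
  if (!rate) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
}

// 1,000 input tokens + 500 output tokens on GPT-4o:
console.log(estimateCost('gpt-4o', 1000, 500)); // 0.0075
```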
## Error Handling

The client automatically retries failed requests with exponential backoff. All methods return a `CompletionResult` with a `success` flag:

```typescript
const result = await client.complete('Test prompt');

if (result.success) {
  console.log('Success:', result.content);
} else {
  console.error('Error:', result.error);
}
```

## Integration Status
### Logger Integration

**Status:** Integrated

**Justification:** This package uses @bernierllc/logger for structured logging of all AI operations. Logs include request/response metadata (model, tokens, cost), retry attempts and backoff delays, rate limit events, and error details to help with debugging and cost monitoring.

**Pattern:** Direct integration; logger is a required dependency of this package.
### NeverHub Integration

**Status:** Not applicable

**Justification:** This is a core AI client package that provides OpenAI API connectivity. It does not participate in service discovery, event publishing, or service mesh operations. AI clients are infrastructure utilities that do not require service registration or discovery.

**Pattern:** Core utility; no service mesh integration needed. Service-level packages that use this client can integrate with NeverHub if needed.
### Retry Policy Integration
Uses @bernierllc/retry-policy for intelligent retry logic with:
- Exponential backoff
- Jitter to prevent thundering herd
- Configurable retry attempts
- Automatic error classification
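The exact schedule @bernierllc/retry-policy uses is not documented here, but exponential backoff with full jitter is typically computed like this (the function name and constants below are illustrative, not the package's actual API or defaults):

```typescript
// Illustrative delay schedule: base * 2^attempt, capped, with full jitter.
// Full jitter picks a uniform delay in [0, cappedExponential), which spreads
// simultaneous retries out and avoids the thundering-herd problem.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const cappedExponential = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * cappedExponential;
}

// Attempt 0 waits up to 500 ms, attempt 1 up to 1 s, attempt 2 up to 2 s, ...
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`attempt ${attempt}: up to ${Math.min(30_000, 500 * 2 ** attempt)} ms`);
}
```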
## See Also
- @bernierllc/anthropic-client - Anthropic Claude API client
- @bernierllc/logger - Structured logging
- @bernierllc/retry-policy - Retry logic utilities
## License
Copyright (c) 2025 Bernier LLC. All rights reserved.
