Zaguán TypeScript SDK
Official Zaguán SDK for TypeScript - The easiest way to integrate with Zaguán CoreX, an enterprise-grade AI gateway that provides unified access to 15+ AI providers and 500+ models through a single, OpenAI-compatible API.
What's New in v1.4.1
🎉 Universal extra_body Support + Responses API - 100% API Coverage
This release implements universal extra_body support across all 64+ endpoints and adds the Responses API for stateful agent conversations. Aligns with Zaguán CoreX v0.42.0-beta7.
New Features
- Responses API - Stateful agent conversations with conversation state management
  - createResponse() method for stateful interactions (see the sketch after the example use cases below)
  - Support for store and previous_response_id parameters
  - Full extra_body support for provider-specific features
- Universal Provider-Specific Parameters - All 17 request types now support extraBody
  - Responses, Embeddings, Images, Audio, Speech, Assistants, Messages, Runs, Batches, Fine-tuning, Vector Stores, Moderations, and Anthropic Messages
  - Enables advanced features like Google Gemini reasoning, Perplexity search, Alibaba Qwen thinking, and DeepSeek reasoning
- Example Use Cases:
// Google Gemini Reasoning
await client.chat({
  model: "gemini/gemini-2.0-flash-thinking-exp",
  messages: [...],
  extraBody: { reasoning_effort: "high" }
});

// Perplexity Search
await client.chat({
  model: "perplexity/sonar-pro",
  messages: [...],
  extraBody: {
    search_domain_filter: ["arxiv.org"],
    return_citations: true
  }
});
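The Responses API itself can be called along these lines. This is a minimal sketch only: the input field in the request and the id field read from the first response follow the OpenAI-style Responses convention and are assumptions here; consult the ResponsesRequest and ResponsesResponse types in the SDK for the exact shape.
// Stateful conversation via createResponse() (sketch; field names are assumptions)
const first = await client.createResponse({
  model: 'openai/gpt-4o-mini',
  input: 'Remember that my favorite color is teal.', // assumed request field name
  store: true, // persist conversation state on the server
});

// Continue the conversation by pointing at the stored response
const followUp = await client.createResponse({
  model: 'openai/gpt-4o-mini',
  input: 'What is my favorite color?', // assumed request field name
  previous_response_id: first.id, // assumed response field
  store: true,
});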
Statistics
- 1 new method (createResponse)
- 2 new types (ResponsesRequest, ResponsesResponse)
- 17 request types with extraBody support
- 100% Test Pass Rate (55/55 tests)
- Zero breaking changes
Previous Releases
v1.4.0 - Vector Stores, Files, and Complete Threads/Runs API
v1.3.0 - Anthropic Messages API & Helper Methods
v1.2.0 - Full OpenAI API coverage (audio, images, embeddings, batches, assistants, fine-tuning)
See CHANGELOG.md for full details.
Why Zaguán?
Zaguán CoreX eliminates vendor lock-in and optimizes costs while unlocking advanced capabilities:
- Multi-Provider Abstraction: Access OpenAI, Anthropic, Google, Alibaba, DeepSeek, Groq, Perplexity, xAI, Mistral, Cohere, and more through one API
- Cost Optimization: 40-60% cost reduction through smart routing and provider arbitrage
- Advanced Features: Reasoning control, multimodal AI, real-time data, long context windows
- Enterprise Performance: 2-3x faster responses, 5,000+ concurrent connections
- Zero Vendor Lock-in: Switch providers by changing model name only
Getting Started
- Register for an account at zaguanai.com
- Select a tier that fits your needs
- Obtain your API key from your account dashboard
- Choose an API endpoint:
- https://api.zaguanai.com/ - Main endpoint proxied through Cloudflare (recommended)
- https://api-eu-fi-01.zaguanai.com/ - Direct connection for lower latency
Installation
npm install @zaguan_ai/sdk
Quick Start
import { ZaguanClient } from '@zaguan_ai/sdk';
// Initialize the client with your API key
const client = new ZaguanClient({
baseUrl: 'https://api.zaguanai.com/', // or https://api-eu-fi-01.zaguanai.com/
apiKey: 'your-api-key-from-zaguanai.com',
});
// Simple chat completion
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello, world!' }],
});
console.log(response.choices[0].message.content);
Streaming Responses
For real-time responses, use the streaming API:
// Streaming chat completion
for await (const chunk of client.chatStream({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Tell me a story' }],
})) {
if (chunk.choices[0]?.delta?.content) {
process.stdout.write(chunk.choices[0].delta.content);
}
}
Multi-Provider Access
Access any of the 15+ supported AI providers with a simple model name change:
// OpenAI
const openaiResponse = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello!' }],
});
// Anthropic
const anthropicResponse = await client.chat({
model: 'anthropic/claude-3-5-sonnet',
messages: [{ role: 'user', content: 'Hello!' }],
});
// Google Gemini
const googleResponse = await client.chat({
model: 'google/gemini-2.0-flash',
messages: [{ role: 'user', content: 'Hello!' }],
});
Advanced Features
Provider-Specific Parameters
Access advanced features of each provider:
const response = await client.chat({
model: 'google/gemini-2.5-pro',
messages: [{ role: 'user', content: 'Solve this complex problem...' }],
provider_specific_params: {
reasoning_effort: 'high',
thinking_budget: 10000,
},
});
Function Calling
Use tools and functions with any provider:
const tools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather information for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string' }
},
required: ['location']
}
}
}];
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: "What's the weather in Paris?" }],
tools: tools,
tool_choice: 'auto'
});
API Reference
ZaguanClient
Constructor
new ZaguanClient(config: ZaguanConfig)
Configuration Options:
- baseUrl: Your Zaguán CoreX instance URL (https://api.zaguanai.com/ or https://api-eu-fi-01.zaguanai.com/)
- apiKey: Your API key obtained from zaguanai.com
- timeoutMs: Optional timeout for requests (default: no timeout)
- fetch: Optional custom fetch implementation
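For example, a client configured with a request timeout and a custom fetch implementation - a minimal sketch using only the options listed above (the environment variable name is illustrative):
import { ZaguanClient } from '@zaguan_ai/sdk';

const client = new ZaguanClient({
  baseUrl: 'https://api.zaguanai.com/',
  apiKey: process.env.ZAGUAN_API_KEY ?? '', // illustrative env var name
  timeoutMs: 30_000, // fail requests that take longer than 30 seconds
  fetch: globalThis.fetch, // or any custom fetch-compatible implementation
});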
Core Methods
- chat(request: ChatRequest, options?: RequestOptions): Promise<ChatResponse>
- chatStream(request: ChatRequest, options?: RequestOptions): AsyncIterable<ChatChunk>
- listModels(options?: RequestOptions): Promise<ModelInfo[]>
- getCapabilities(options?: RequestOptions): Promise<ModelCapabilities[]>
- getCapabilitiesWithFilter(filter, options?: RequestOptions): Promise<ModelCapabilities[]>
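The discovery methods can be used as follows. A minimal sketch: the fields on ModelInfo and ModelCapabilities are not documented here, so the example only logs counts and raw entries.
// List available models and capabilities (sketch)
const models = await client.listModels();
console.log(`Models available: ${models.length}`);

const capabilities = await client.getCapabilities();
console.log('First capability entry:', capabilities[0]);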
Credits Methods (when credits system is enabled)
- getCreditsBalance(options?: RequestOptions): Promise<CreditsBalance>
- getCreditsHistory(options?: CreditsHistoryOptions, requestOptions?: RequestOptions): Promise<CreditsHistory>
- getCreditsStats(options?: CreditsStatsOptions, requestOptions?: RequestOptions): Promise<CreditsStats>
Features
- 🎯 OpenAI Compatibility: Drop-in replacement for OpenAI SDK with familiar interfaces
- 🔌 Multi-Provider Support: Unified access to 15+ AI providers through a single API
- ⚡ Production Ready: Built-in timeouts, retries, and streaming support
- 📘 Type Definitions: Comprehensive TypeScript definitions for all API surfaces
- 🛡️ Error Handling: Structured error types for better error handling
- 🔄 Streaming: Async iterable interface for real-time responses
- 🔐 Secure: Bearer token authentication and request ID tracking
Credits Management
When the credits system is enabled on your Zaguán instance, you can monitor usage and track costs:
// Check your credits balance
const balance = await client.getCreditsBalance();
console.log(`Credits remaining: ${balance.credits_remaining}`);
console.log(`Tier: ${balance.tier}`);
console.log(`Bands: ${balance.bands.join(', ')}`);
// Get usage history
const history = await client.getCreditsHistory({
page: 1,
page_size: 10,
model: 'openai/gpt-4o-mini', // Optional filter
});
// Get usage statistics
const stats = await client.getCreditsStats({
start_date: '2024-01-01T00:00:00Z',
end_date: '2024-12-31T23:59:59Z',
group_by: 'day',
});
See examples/credits-usage.ts for a complete example.
Function Calling
Use tools and functions with any provider that supports them:
const tools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather information for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' },
unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
},
required: ['location']
}
}
}];
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: "What's the weather in Paris?" }],
tools,
tool_choice: 'auto'
});
// Handle tool calls
if (response.choices[0].message.tool_calls) {
for (const toolCall of response.choices[0].message.tool_calls) {
const result = executeFunction(toolCall.function.name, toolCall.function.arguments);
// Send result back to model...
}
}
See examples/function-calling.ts for a complete example.
Vision and Multimodal
Analyze images with vision-capable models:
const response = await client.chat({
model: 'openai/gpt-4o',
messages: [{
role: 'user',
content: [
{ type: 'text', text: "What's in this image?" },
{
type: 'image_url',
image_url: {
url: 'https://example.com/image.jpg',
detail: 'high' // 'low', 'high', or 'auto'
}
}
]
}]
});
Supports both URL and base64-encoded images. See examples/vision-multimodal.ts for more examples.
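For a base64-encoded image, a data URL can be passed in place of the remote URL. A minimal sketch, assuming the image bytes are already base64-encoded in base64String:
// Vision request with a base64-encoded image (sketch)
const base64Response = await client.chat({
  model: 'openai/gpt-4o',
  messages: [{
    role: 'user',
    content: [
      { type: 'text', text: 'Describe this image.' },
      {
        type: 'image_url',
        // base64String is assumed to hold the base64-encoded image data
        image_url: { url: `data:image/jpeg;base64,${base64String}` },
      },
    ],
  }],
});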
Provider-Specific Features
Access advanced features of each provider through provider_specific_params:
Google Gemini Reasoning
const response = await client.chat({
model: 'google/gemini-2.0-flash-thinking-exp',
messages: [{ role: 'user', content: 'Solve this complex problem...' }],
provider_specific_params: {
reasoning_effort: 'high', // 'none', 'low', 'medium', 'high'
thinking_budget: 10000,
include_thinking: true
}
});
Anthropic Extended Thinking
const response = await client.chat({
model: 'anthropic/claude-3-5-sonnet',
messages: [{ role: 'user', content: 'Complex reasoning task...' }],
provider_specific_params: {
thinking: {
type: 'enabled',
budget_tokens: 5000
}
}
});
Perplexity Search
const response = await client.chat({
model: 'perplexity/sonar-reasoning',
messages: [{ role: 'user', content: 'Latest AI news?' }],
provider_specific_params: {
search_domain_filter: ['arxiv.org'],
return_citations: true,
search_recency_filter: 'month'
}
});
DeepSeek Thinking Control
const response = await client.chat({
model: 'deepseek/deepseek-chat',
messages: [{ role: 'user', content: 'Explain quantum physics' }],
thinking: false // Disable <think> tags
});
OpenAI Reasoning Models
const response = await client.chat({
model: 'openai/o1',
messages: [{ role: 'user', content: 'Complex problem...' }],
reasoning_effort: 'high' // 'minimal', 'low', 'medium', 'high'
});
See examples/provider-specific.ts for comprehensive examples of all providers.
Reasoning Tokens
Many providers expose reasoning/thinking tokens in the usage details:
const response = await client.chat({
model: 'openai/o1-mini',
messages: [{ role: 'user', content: 'Solve this...' }]
});
console.log(`Total tokens: ${response.usage.total_tokens}`);
if (response.usage.completion_tokens_details?.reasoning_tokens) {
console.log(`Reasoning tokens: ${response.usage.completion_tokens_details.reasoning_tokens}`);
}
Providers with reasoning token support:
- ✅ OpenAI (o1, o3, o1-mini)
- ✅ Google Gemini (with reasoning_effort)
- ✅ Anthropic Claude (with extended thinking)
- ✅ DeepSeek (R1, Reasoner)
- ✅ Alibaba Qwen (QwQ)
- ⚠️ Perplexity (uses <think> tags in content, not usage details)
Request Cancellation
Use AbortController to cancel requests:
const controller = new AbortController();
// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000);
try {
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Long task...' }]
}, {
signal: controller.signal
});
} catch (error) {
if (error instanceof Error && error.message.includes('aborted')) {
console.log('Request was cancelled');
}
}
Advanced Features
Audio Processing
// Transcribe audio to text
const transcription = await client.transcribeAudio({
file: audioBlob,
model: 'openai/whisper-1',
language: 'en',
response_format: 'verbose_json',
timestamp_granularities: ['word', 'segment'],
});
// Translate audio to English
const translation = await client.translateAudio({
file: audioBlob,
model: 'openai/whisper-1',
});
// Generate speech from text
const audioBuffer = await client.generateSpeech({
model: 'openai/tts-1',
input: 'Hello, world!',
voice: 'alloy',
response_format: 'mp3',
});
Image Generation
// Generate images
const images = await client.generateImage({
prompt: 'A futuristic city at sunset',
model: 'openai/dall-e-3',
quality: 'hd',
size: '1024x1024',
});
// Edit images
const edited = await client.editImage({
image: imageBlob,
prompt: 'Add a rainbow',
size: '1024x1024',
});
// Create variations
const variations = await client.createImageVariation({
image: imageBlob,
n: 2,
});
Text Embeddings
const embeddings = await client.createEmbeddings({
input: ['Text to embed', 'Another text'],
model: 'openai/text-embedding-3-small',
});
Batch Processing
// Create batch job
const batch = await client.createBatch({
input_file_id: 'file-abc123',
endpoint: '/v1/chat/completions',
completion_window: '24h',
});
// Check status
const status = await client.retrieveBatch(batch.id);
// List batches
const batches = await client.listBatches();
// Cancel batch
await client.cancelBatch(batch.id);
Assistants API
// Create assistant
const assistant = await client.createAssistant({
model: 'openai/gpt-4o-mini',
name: 'Math Tutor',
instructions: 'You are a helpful math tutor.',
tools: [{ type: 'code_interpreter' }],
});
// Create thread
const thread = await client.createThread({
messages: [{ role: 'user', content: 'Help me solve 2x + 5 = 15' }],
});
// Create run
const run = await client.createRun(thread.id, {
assistant_id: assistant.id,
});
// Check run status
const runStatus = await client.retrieveRun(thread.id, run.id);
Fine-Tuning
// Create fine-tuning job
const job = await client.createFineTuningJob({
training_file: 'file-abc123',
model: 'openai/gpt-4o-mini-2024-07-18',
hyperparameters: { n_epochs: 3 },
});
// List jobs
const jobs = await client.listFineTuningJobs();
// Get job details
const jobDetails = await client.retrieveFineTuningJob(job.id);
// List events
const events = await client.listFineTuningEvents(job.id);
Content Moderation
const moderation = await client.createModeration({
input: 'Text to moderate',
model: 'text-moderation-latest',
});
console.log('Flagged:', moderation.results[0].flagged);
console.log('Categories:', moderation.results[0].categories);
Retry Logic with Exponential Backoff
const client = new ZaguanClient({
baseUrl: 'https://api.zaguanai.com/',
apiKey: 'your-api-key',
retry: {
maxRetries: 3,
initialDelayMs: 1000,
maxDelayMs: 10000,
backoffMultiplier: 2,
retryableStatusCodes: [408, 429, 500, 502, 503, 504],
},
});
Logging and Observability
const client = new ZaguanClient({
baseUrl: 'https://api.zaguanai.com/',
apiKey: 'your-api-key',
onLog: (event) => {
switch (event.type) {
case 'request_start':
console.log(`Starting ${event.method} ${event.url}`);
break;
case 'request_end':
console.log(`Completed in ${event.latencyMs}ms`);
break;
case 'request_error':
console.error(`Failed: ${event.error.message}`);
break;
case 'retry_attempt':
console.log(`Retry ${event.attempt}/${event.maxRetries}`);
break;
}
},
});
Streaming Message Reconstruction
const chunks = [];
for await (const chunk of client.chatStream({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello!' }],
})) {
chunks.push(chunk);
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
// Reconstruct complete message from chunks
const complete = ZaguanClient.reconstructMessageFromChunks(chunks);
console.log('Complete message:', complete.choices[0].message.content);
Anthropic Messages API
The SDK provides native support for Anthropic's Messages API, which is the recommended way to access Anthropic-specific features like extended thinking:
Basic Messages Request
const response = await client.messages({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [
{
role: 'user',
content: 'Explain quantum entanglement',
},
],
});
console.log(response.content[0].text);
Extended Thinking (Beta)
const response = await client.messages({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 4096,
messages: [
{
role: 'user',
content: 'Solve this complex problem step by step...',
},
],
thinking: {
type: 'enabled',
budget_tokens: 5000, // 1000-10000
},
});
// Access thinking and response separately
for (const block of response.content) {
if (block.type === 'thinking') {
console.log('Thinking:', block.thinking);
console.log('Signature:', block.signature);
} else if (block.type === 'text') {
console.log('Response:', block.text);
}
}
Streaming Messages
for await (const chunk of client.messagesStream({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Tell me a story' }],
})) {
if (chunk.type === 'content_block_delta' && chunk.delta?.text) {
process.stdout.write(chunk.delta.text);
}
}
Token Counting
const count = await client.countTokens({
model: 'claude-3-5-sonnet-20241022',
messages: [{ role: 'user', content: 'Hello, world!' }],
});
console.log(`Input tokens: ${count.input_tokens}`);
Batch Processing
// Create a batch
const batch = await client.createMessagesBatch([
{
custom_id: 'request-1',
params: {
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello 1' }],
},
},
{
custom_id: 'request-2',
params: {
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello 2' }],
},
},
]);
// Check batch status
const status = await client.getMessagesBatch(batch.id);
console.log(`Status: ${status.processing_status}`);
// Get results when complete
if (status.processing_status === 'ended') {
const results = await client.getMessagesBatchResults(batch.id);
// Process JSONL stream...
}
Helper Methods
The SDK provides utility methods for common tasks:
Extract Perplexity Thinking
Perplexity embeds reasoning in <think> tags within the content:
const response = await client.chat({
model: 'perplexity/sonar-reasoning',
messages: [{ role: 'user', content: 'Analyze this problem...' }],
});
const content = response.choices[0].message.content;
const { thinking, response: cleanResponse } =
ZaguanClient.extractPerplexityThinking(content);
console.log('Thinking:', thinking);
console.log('Response:', cleanResponse);
Check for Reasoning Tokens
const response = await client.chat({
model: 'openai/o1-mini',
messages: [{ role: 'user', content: 'Solve this...' }],
});
if (ZaguanClient.hasReasoningTokens(response.usage)) {
const reasoningTokens =
response.usage.completion_tokens_details.reasoning_tokens;
console.log(`Used ${reasoningTokens} reasoning tokens`);
}
Error Handling
The SDK provides structured error types:
import { APIError, InsufficientCreditsError, RateLimitError, BandAccessDeniedError } from '@zaguan_ai/sdk';
try {
const response = await client.chat({...});
} catch (error) {
if (error instanceof InsufficientCreditsError) {
console.error(`Need ${error.creditsRequired}, have ${error.creditsRemaining}`);
console.error(`Resets on: ${error.resetDate}`);
} else if (error instanceof RateLimitError) {
console.error(`Rate limited. Retry after ${error.retryAfter} seconds`);
} else if (error instanceof BandAccessDeniedError) {
console.error(`Access denied to band ${error.band}`);
console.error(`Requires ${error.requiredTier}, you have ${error.currentTier}`);
} else if (error instanceof APIError) {
console.error(`API Error ${error.statusCode}: ${error.message}`);
console.error(`Request ID: ${error.requestId}`);
}
}
Supported Providers
Zaguán CoreX supports 15+ AI providers with 500+ models:
| Provider | Key Models | Capabilities |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3 | Vision, audio, reasoning, function calling |
| Google Gemini | Gemini 2.0 Flash, Gemini 2.5 Pro | 2M context, advanced reasoning |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | Extended thinking, citations |
| Alibaba Qwen | Qwen 2.5, QwQ | Advanced reasoning, multilingual |
| DeepSeek | DeepSeek V3, DeepSeek R1 | Cost-effective reasoning |
| Groq | Llama 3, Mixtral | Ultra-fast inference |
| Perplexity | Sonar, Sonar Reasoning | Real-time web search |
| xAI | Grok 2, Grok 2 Vision | Real-time data |
| Mistral | Mistral Large, Mixtral | Open models, multilingual |
| + More | 500+ models | Specialized capabilities |
Contributing
We welcome contributions! Please see our contributing guide for details.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a pull request
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Support
For support, please open an issue on GitHub.
