Zaguán TypeScript SDK
Official Zaguán SDK for TypeScript - The easiest way to integrate with Zaguán CoreX, an enterprise-grade AI gateway that provides unified access to 15+ AI providers and 500+ models through a single, OpenAI-compatible API.
What's New in v1.4.1
🎉 Universal extra_body Support + Responses API - 100% API Coverage
This release implements universal extra_body support across all 64+ endpoints and adds the Responses API for stateful agent conversations. Aligns with Zaguán CoreX v0.42.0-beta7.
New Features
- Responses API - Stateful agent conversations with conversation state management
  - createResponse() method for stateful interactions (see the sketch after the example use cases below)
  - Support for store and previous_response_id parameters
  - Full extra_body support for provider-specific features
- Universal Provider-Specific Parameters - All 17 request types now support extraBody
  - Responses, Embeddings, Images, Audio, Speech, Assistants, Messages, Runs, Batches, Fine-tuning, Vector Stores, Moderations, and Anthropic Messages
  - Enables advanced features like Google Gemini reasoning, Perplexity search, Alibaba Qwen thinking, and DeepSeek reasoning
- Example Use Cases:
// Google Gemini Reasoning
await client.chat({
  model: "gemini/gemini-2.0-flash-thinking-exp",
  messages: [...],
  extraBody: { reasoning_effort: "high" }
});

// Perplexity Search
await client.chat({
  model: "perplexity/sonar-pro",
  messages: [...],
  extraBody: {
    search_domain_filter: ["arxiv.org"],
    return_citations: true
  }
});
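The Responses API itself can be called along these lines. This is a minimal sketch only: the input field in the request and the id field read from the first response follow the OpenAI-style Responses convention and are assumptions here; consult the ResponsesRequest and ResponsesResponse types in the SDK for the exact shape.
// Stateful conversation via createResponse() (sketch; field names are assumptions)
const first = await client.createResponse({
  model: 'openai/gpt-4o-mini',
  input: 'Remember that my favorite color is teal.', // assumed request field name
  store: true, // persist conversation state on the server
});

// Continue the conversation by pointing at the stored response
const followUp = await client.createResponse({
  model: 'openai/gpt-4o-mini',
  input: 'What is my favorite color?', // assumed request field name
  previous_response_id: first.id, // assumed response field
  store: true,
});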
Statistics
- 1 new method (createResponse)
- 2 new types (ResponsesRequest, ResponsesResponse)
- 17 request types with extraBody support
- 100% Test Pass Rate (55/55 tests)
- Zero breaking changes
Previous Releases
v1.4.0 - Vector Stores, Files, and Complete Threads/Runs API
v1.3.0 - Anthropic Messages API & Helper Methods
v1.2.0 - Full OpenAI API coverage (audio, images, embeddings, batches, assistants, fine-tuning)
See CHANGELOG.md for full details.
Why Zaguán?
Zaguán CoreX eliminates vendor lock-in and optimizes costs while unlocking advanced capabilities:
- Multi-Provider Abstraction: Access OpenAI, Anthropic, Google, Alibaba, DeepSeek, Groq, Perplexity, xAI, Mistral, Cohere, and more through one API
- Cost Optimization: 40-60% cost reduction through smart routing and provider arbitrage
- Advanced Features: Reasoning control, multimodal AI, real-time data, long context windows
- Enterprise Performance: 2-3x faster responses, 5,000+ concurrent connections
- Zero Vendor Lock-in: Switch providers by changing model name only
Getting Started
- Register for an account at zaguanai.com
- Select a tier that fits your needs
- Obtain your API key from your account dashboard
- Choose an API endpoint:
- https://api.zaguanai.com/ - Main endpoint proxied through Cloudflare (recommended)
- https://api-eu-fi-01.zaguanai.com/ - Direct connection for lower latency
Installation
npm install @zaguan_ai/sdk
Quick Start
import { ZaguanClient } from '@zaguan_ai/sdk';
// Initialize the client with your API key
const client = new ZaguanClient({
baseUrl: 'https://api.zaguanai.com/', // or https://api-eu-fi-01.zaguanai.com/
apiKey: 'your-api-key-from-zaguanai.com',
});
// Simple chat completion
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello, world!' }],
});
console.log(response.choices[0].message.content);
Streaming Responses
For real-time responses, use the streaming API:
// Streaming chat completion
for await (const chunk of client.chatStream({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Tell me a story' }],
})) {
if (chunk.choices[0]?.delta?.content) {
process.stdout.write(chunk.choices[0].delta.content);
}
}
Multi-Provider Access
Access any of the 15+ supported AI providers with a simple model name change:
// OpenAI
const openaiResponse = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello!' }],
});
// Anthropic
const anthropicResponse = await client.chat({
model: 'anthropic/claude-3-5-sonnet',
messages: [{ role: 'user', content: 'Hello!' }],
});
// Google Gemini
const googleResponse = await client.chat({
model: 'google/gemini-2.0-flash',
messages: [{ role: 'user', content: 'Hello!' }],
});
Advanced Features
Provider-Specific Parameters
Access advanced features of each provider:
const response = await client.chat({
model: 'google/gemini-2.5-pro',
messages: [{ role: 'user', content: 'Solve this complex problem...' }],
provider_specific_params: {
reasoning_effort: 'high',
thinking_budget: 10000,
},
});
Function Calling
Use tools and functions with any provider:
const tools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather information for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string' }
},
required: ['location']
}
}
}];
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: "What's the weather in Paris?" }],
tools: tools,
tool_choice: 'auto'
});
API Reference
ZaguanClient
Constructor
new ZaguanClient(config: ZaguanConfig)
Configuration Options:
- baseUrl: Your Zaguán CoreX instance URL (https://api.zaguanai.com/ or https://api-eu-fi-01.zaguanai.com/)
- apiKey: Your API key obtained from zaguanai.com
- timeoutMs: Optional timeout for requests (default: no timeout)
- fetch: Optional custom fetch implementation
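For example, a client configured with a request timeout and a custom fetch implementation - a minimal sketch using only the options listed above (the environment variable name is illustrative):
import { ZaguanClient } from '@zaguan_ai/sdk';

const client = new ZaguanClient({
  baseUrl: 'https://api.zaguanai.com/',
  apiKey: process.env.ZAGUAN_API_KEY ?? '', // illustrative env var name
  timeoutMs: 30_000, // fail requests that take longer than 30 seconds
  fetch: globalThis.fetch, // or any custom fetch-compatible implementation
});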
Core Methods
- chat(request: ChatRequest, options?: RequestOptions): Promise<ChatResponse>
- chatStream(request: ChatRequest, options?: RequestOptions): AsyncIterable<ChatChunk>
- listModels(options?: RequestOptions): Promise<ModelInfo[]>
- getCapabilities(options?: RequestOptions): Promise<ModelCapabilities[]>
- getCapabilitiesWithFilter(filter, options?: RequestOptions): Promise<ModelCapabilities[]>
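The discovery methods can be used as follows. A minimal sketch: the fields on ModelInfo and ModelCapabilities are not documented here, so the example only logs counts and raw entries.
// List available models and capabilities (sketch)
const models = await client.listModels();
console.log(`Models available: ${models.length}`);

const capabilities = await client.getCapabilities();
console.log('First capability entry:', capabilities[0]);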
Credits Methods (when credits system is enabled)
- getCreditsBalance(options?: RequestOptions): Promise<CreditsBalance>
- getCreditsHistory(options?: CreditsHistoryOptions, requestOptions?: RequestOptions): Promise<CreditsHistory>
- getCreditsStats(options?: CreditsStatsOptions, requestOptions?: RequestOptions): Promise<CreditsStats>
Features
- 🎯 OpenAI Compatibility: Drop-in replacement for OpenAI SDK with familiar interfaces
- 🔌 Multi-Provider Support: Unified access to 15+ AI providers through a single API
- ⚡ Production Ready: Built-in timeouts, retries, and streaming support
- 📘 Type Definitions: Comprehensive TypeScript definitions for all API surfaces
- 🛡️ Error Handling: Structured error types for better error handling
- 🔄 Streaming: Async iterable interface for real-time responses
- 🔐 Secure: Bearer token authentication and request ID tracking
Credits Management
When the credits system is enabled on your Zaguán instance, you can monitor usage and track costs:
// Check your credits balance
const balance = await client.getCreditsBalance();
console.log(`Credits remaining: ${balance.credits_remaining}`);
console.log(`Tier: ${balance.tier}`);
console.log(`Bands: ${balance.bands.join(', ')}`);
// Get usage history
const history = await client.getCreditsHistory({
page: 1,
page_size: 10,
model: 'openai/gpt-4o-mini', // Optional filter
});
// Get usage statistics
const stats = await client.getCreditsStats({
start_date: '2024-01-01T00:00:00Z',
end_date: '2024-12-31T23:59:59Z',
group_by: 'day',
});
See examples/credits-usage.ts for a complete example.
Function Calling
Use tools and functions with any provider that supports them:
const tools = [{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather information for a location',
parameters: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' },
unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
},
required: ['location']
}
}
}];
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: "What's the weather in Paris?" }],
tools,
tool_choice: 'auto'
});
// Handle tool calls
if (response.choices[0].message.tool_calls) {
for (const toolCall of response.choices[0].message.tool_calls) {
const result = executeFunction(toolCall.function.name, toolCall.function.arguments);
// Send result back to model...
}
}
See examples/function-calling.ts for a complete example.
Vision and Multimodal
Analyze images with vision-capable models:
const response = await client.chat({
model: 'openai/gpt-4o',
messages: [{
role: 'user',
content: [
{ type: 'text', text: "What's in this image?" },
{
type: 'image_url',
image_url: {
url: 'https://example.com/image.jpg',
detail: 'high' // 'low', 'high', or 'auto'
}
}
]
}]
});
Supports both URL and base64-encoded images. See examples/vision-multimodal.ts for more examples.
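For a base64-encoded image, a data URL can be passed in place of the remote URL. A minimal sketch, assuming the image bytes are already base64-encoded in base64String:
// Vision request with a base64-encoded image (sketch)
const base64Response = await client.chat({
  model: 'openai/gpt-4o',
  messages: [{
    role: 'user',
    content: [
      { type: 'text', text: 'Describe this image.' },
      {
        type: 'image_url',
        // base64String is assumed to hold the base64-encoded image data
        image_url: { url: `data:image/jpeg;base64,${base64String}` },
      },
    ],
  }],
});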
Provider-Specific Features
Access advanced features of each provider through provider_specific_params:
Google Gemini Reasoning
const response = await client.chat({
model: 'google/gemini-2.0-flash-thinking-exp',
messages: [{ role: 'user', content: 'Solve this complex problem...' }],
provider_specific_params: {
reasoning_effort: 'high', // 'none', 'low', 'medium', 'high'
thinking_budget: 10000,
include_thinking: true
}
});
Anthropic Extended Thinking
const response = await client.chat({
model: 'anthropic/claude-3-5-sonnet',
messages: [{ role: 'user', content: 'Complex reasoning task...' }],
provider_specific_params: {
thinking: {
type: 'enabled',
budget_tokens: 5000
}
}
});
Perplexity Search
const response = await client.chat({
model: 'perplexity/sonar-reasoning',
messages: [{ role: 'user', content: 'Latest AI news?' }],
provider_specific_params: {
search_domain_filter: ['arxiv.org'],
return_citations: true,
search_recency_filter: 'month'
}
});
DeepSeek Thinking Control
const response = await client.chat({
model: 'deepseek/deepseek-chat',
messages: [{ role: 'user', content: 'Explain quantum physics' }],
thinking: false // Disable <think> tags
});
OpenAI Reasoning Models
const response = await client.chat({
model: 'openai/o1',
messages: [{ role: 'user', content: 'Complex problem...' }],
reasoning_effort: 'high' // 'minimal', 'low', 'medium', 'high'
});
See examples/provider-specific.ts for comprehensive examples of all providers.
Reasoning Tokens
Many providers expose reasoning/thinking tokens in the usage details:
const response = await client.chat({
model: 'openai/o1-mini',
messages: [{ role: 'user', content: 'Solve this...' }]
});
console.log(`Total tokens: ${response.usage.total_tokens}`);
if (response.usage.completion_tokens_details?.reasoning_tokens) {
console.log(`Reasoning tokens: ${response.usage.completion_tokens_details.reasoning_tokens}`);
}
Providers with reasoning token support:
- ✅ OpenAI (o1, o3, o1-mini)
- ✅ Google Gemini (with reasoning_effort)
- ✅ Anthropic Claude (with extended thinking)
- ✅ DeepSeek (R1, Reasoner)
- ✅ Alibaba Qwen (QwQ)
- ⚠️ Perplexity (uses <think> tags in content, not usage details)
Request Cancellation
Use AbortController to cancel requests:
const controller = new AbortController();
// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000);
try {
const response = await client.chat({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Long task...' }]
}, {
signal: controller.signal
});
} catch (error) {
if (error instanceof Error && error.message.includes('aborted')) {
console.log('Request was cancelled');
}
}
Advanced Features
Audio Processing
// Transcribe audio to text
const transcription = await client.transcribeAudio({
file: audioBlob,
model: 'openai/whisper-1',
language: 'en',
response_format: 'verbose_json',
timestamp_granularities: ['word', 'segment'],
});
// Translate audio to English
const translation = await client.translateAudio({
file: audioBlob,
model: 'openai/whisper-1',
});
// Generate speech from text
const audioBuffer = await client.generateSpeech({
model: 'openai/tts-1',
input: 'Hello, world!',
voice: 'alloy',
response_format: 'mp3',
});
Image Generation
// Generate images
const images = await client.generateImage({
prompt: 'A futuristic city at sunset',
model: 'openai/dall-e-3',
quality: 'hd',
size: '1024x1024',
});
// Edit images
const edited = await client.editImage({
image: imageBlob,
prompt: 'Add a rainbow',
size: '1024x1024',
});
// Create variations
const variations = await client.createImageVariation({
image: imageBlob,
n: 2,
});
Text Embeddings
const embeddings = await client.createEmbeddings({
input: ['Text to embed', 'Another text'],
model: 'openai/text-embedding-3-small',
});
Batch Processing
// Create batch job
const batch = await client.createBatch({
input_file_id: 'file-abc123',
endpoint: '/v1/chat/completions',
completion_window: '24h',
});
// Check status
const status = await client.retrieveBatch(batch.id);
// List batches
const batches = await client.listBatches();
// Cancel batch
await client.cancelBatch(batch.id);
Assistants API
// Create assistant
const assistant = await client.createAssistant({
model: 'openai/gpt-4o-mini',
name: 'Math Tutor',
instructions: 'You are a helpful math tutor.',
tools: [{ type: 'code_interpreter' }],
});
// Create thread
const thread = await client.createThread({
messages: [{ role: 'user', content: 'Help me solve 2x + 5 = 15' }],
});
// Create run
const run = await client.createRun(thread.id, {
assistant_id: assistant.id,
});
// Check run status
const runStatus = await client.retrieveRun(thread.id, run.id);
Fine-Tuning
// Create fine-tuning job
const job = await client.createFineTuningJob({
training_file: 'file-abc123',
model: 'openai/gpt-4o-mini-2024-07-18',
hyperparameters: { n_epochs: 3 },
});
// List jobs
const jobs = await client.listFineTuningJobs();
// Get job details
const jobDetails = await client.retrieveFineTuningJob(job.id);
// List events
const events = await client.listFineTuningEvents(job.id);
Content Moderation
const moderation = await client.createModeration({
input: 'Text to moderate',
model: 'text-moderation-latest',
});
console.log('Flagged:', moderation.results[0].flagged);
console.log('Categories:', moderation.results[0].categories);
Retry Logic with Exponential Backoff
const client = new ZaguanClient({
baseUrl: 'https://api.zaguanai.com/',
apiKey: 'your-api-key',
retry: {
maxRetries: 3,
initialDelayMs: 1000,
maxDelayMs: 10000,
backoffMultiplier: 2,
retryableStatusCodes: [408, 429, 500, 502, 503, 504],
},
});
Logging and Observability
const client = new ZaguanClient({
baseUrl: 'https://api.zaguanai.com/',
apiKey: 'your-api-key',
onLog: (event) => {
switch (event.type) {
case 'request_start':
console.log(`Starting ${event.method} ${event.url}`);
break;
case 'request_end':
console.log(`Completed in ${event.latencyMs}ms`);
break;
case 'request_error':
console.error(`Failed: ${event.error.message}`);
break;
case 'retry_attempt':
console.log(`Retry ${event.attempt}/${event.maxRetries}`);
break;
}
},
});
Streaming Message Reconstruction
const chunks = [];
for await (const chunk of client.chatStream({
model: 'openai/gpt-4o-mini',
messages: [{ role: 'user', content: 'Hello!' }],
})) {
chunks.push(chunk);
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
// Reconstruct complete message from chunks
const complete = ZaguanClient.reconstructMessageFromChunks(chunks);
console.log('Complete message:', complete.choices[0].message.content);
Anthropic Messages API
The SDK provides native support for Anthropic's Messages API, which is the recommended way to access Anthropic-specific features like extended thinking:
Basic Messages Request
const response = await client.messages({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [
{
role: 'user',
content: 'Explain quantum entanglement',
},
],
});
console.log(response.content[0].text);
Extended Thinking (Beta)
const response = await client.messages({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 4096,
messages: [
{
role: 'user',
content: 'Solve this complex problem step by step...',
},
],
thinking: {
type: 'enabled',
budget_tokens: 5000, // 1000-10000
},
});
// Access thinking and response separately
for (const block of response.content) {
if (block.type === 'thinking') {
console.log('Thinking:', block.thinking);
console.log('Signature:', block.signature);
} else if (block.type === 'text') {
console.log('Response:', block.text);
}
}
Streaming Messages
for await (const chunk of client.messagesStream({
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Tell me a story' }],
})) {
if (chunk.type === 'content_block_delta' && chunk.delta?.text) {
process.stdout.write(chunk.delta.text);
}
}
Token Counting
const count = await client.countTokens({
model: 'claude-3-5-sonnet-20241022',
messages: [{ role: 'user', content: 'Hello, world!' }],
});
console.log(`Input tokens: ${count.input_tokens}`);
Batch Processing
// Create a batch
const batch = await client.createMessagesBatch([
{
custom_id: 'request-1',
params: {
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello 1' }],
},
},
{
custom_id: 'request-2',
params: {
model: 'claude-3-5-sonnet-20241022',
max_tokens: 1024,
messages: [{ role: 'user', content: 'Hello 2' }],
},
},
]);
// Check batch status
const status = await client.getMessagesBatch(batch.id);
console.log(`Status: ${status.processing_status}`);
// Get results when complete
if (status.processing_status === 'ended') {
const results = await client.getMessagesBatchResults(batch.id);
// Process JSONL stream...
}
Helper Methods
The SDK provides utility methods for common tasks:
Extract Perplexity Thinking
Perplexity embeds reasoning in <think> tags within the content:
const response = await client.chat({
model: 'perplexity/sonar-reasoning',
messages: [{ role: 'user', content: 'Analyze this problem...' }],
});
const content = response.choices[0].message.content;
const { thinking, response: cleanResponse } =
ZaguanClient.extractPerplexityThinking(content);
console.log('Thinking:', thinking);
console.log('Response:', cleanResponse);
Check for Reasoning Tokens
const response = await client.chat({
model: 'openai/o1-mini',
messages: [{ role: 'user', content: 'Solve this...' }],
});
if (ZaguanClient.hasReasoningTokens(response.usage)) {
const reasoningTokens =
response.usage.completion_tokens_details.reasoning_tokens;
console.log(`Used ${reasoningTokens} reasoning tokens`);
}
Error Handling
The SDK provides structured error types:
import { APIError, InsufficientCreditsError, RateLimitError, BandAccessDeniedError } from '@zaguan_ai/sdk';
try {
const response = await client.chat({...});
} catch (error) {
if (error instanceof InsufficientCreditsError) {
console.error(`Need ${error.creditsRequired}, have ${error.creditsRemaining}`);
console.error(`Resets on: ${error.resetDate}`);
} else if (error instanceof RateLimitError) {
console.error(`Rate limited. Retry after ${error.retryAfter} seconds`);
} else if (error instanceof BandAccessDeniedError) {
console.error(`Access denied to band ${error.band}`);
console.error(`Requires ${error.requiredTier}, you have ${error.currentTier}`);
} else if (error instanceof APIError) {
console.error(`API Error ${error.statusCode}: ${error.message}`);
console.error(`Request ID: ${error.requestId}`);
}
}
Supported Providers
Zaguán CoreX supports 15+ AI providers with 500+ models:
| Provider | Key Models | Capabilities |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3 | Vision, audio, reasoning, function calling |
| Google Gemini | Gemini 2.0 Flash, Gemini 2.5 Pro | 2M context, advanced reasoning |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | Extended thinking, citations |
| Alibaba Qwen | Qwen 2.5, QwQ | Advanced reasoning, multilingual |
| DeepSeek | DeepSeek V3, DeepSeek R1 | Cost-effective reasoning |
| Groq | Llama 3, Mixtral | Ultra-fast inference |
| Perplexity | Sonar, Sonar Reasoning | Real-time web search |
| xAI | Grok 2, Grok 2 Vision | Real-time data |
| Mistral | Mistral Large, Mixtral | Open models, multilingual |
| + More | 500+ models | Specialized capabilities |
Contributing
We welcome contributions! Please see our contributing guide for details.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a pull request
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Support
For support, please open an issue on GitHub.
