zelai-cloud-sdk
v1.12.0
Official ZelAI SDK - Multimodal AI for autonomous image/video generation, vision analysis, agentic LLM workflows, STT speech-to-text, and TTS text-to-speech
Official TypeScript/JavaScript SDK for the ZelStudio.com Cloud AI Generation API
Generate images, videos, and text using state-of-the-art AI models through a simple and intuitive API.
🤖 New: AI agents can now discover and use this SDK automatically via skill.md. See AI Agent Integration.
Features
- Text-to-Image - Generate stunning images from text prompts
- Image-to-Image - Edit and transform existing images
- Dual-Image Editing - Face restoration, character consistency, and image merging with two source images
- AI Image Upscale - Upscale images 2-4x using AI
- Image-to-Video - Create videos from static images
- LLM Text Generation - Generate text with context, memory, and JSON support
- LLM Streaming - Real-time token-by-token streaming with SSE and WebSocket
- Image Vision - Analyze images with LLM for structured data extraction
- STT Speech-to-Text - Audio transcription with streaming and multi-language support
- TTS Text-to-Speech - Voice synthesis with voice models, cloning, realtime mode, and streaming
- 14 Style Presets - Realistic, anime, manga, watercolor, cinematic, and more
- 7 Format Presets - Portrait, landscape, profile, story, post, smartphone, banner
- Built-in Watermarking - Apply custom watermarks to generated content
- WebSocket Support - Real-time generation with progress updates
- CDN Operations - Format conversion, resizing, frame extraction
- Full TypeScript Support - Comprehensive type definitions
🤖 For AI Agents and Tools:
- OpenAI-Compatible API - Drop-in /v1/chat/completions endpoint
- AI Agent Integration - Enable any AI agent (Claude, GPT, etc.) to discover and use the SDK via skill.md
Installation
npm install zelai-cloud-sdk
Quick Start
import { createClient, STYLES, FORMATS, TTS_VOICES } from 'zelai-cloud-sdk';

// Initialize client
const client = createClient('zelai_pk_your_api_key_here');

// Generate an image
const image = await client.generateImage({
  prompt: 'a futuristic city at sunset with flying cars',
  style: STYLES.cine.id,
  format: FORMATS.landscape.id
});
console.log('Image ID:', image.imageId);

// Generate text
const text = await client.generateText({
  prompt: 'Explain quantum computing in simple terms',
  system: 'You are a helpful science teacher'
});
console.log(text.response);

// Stream text in real-time
const controller = client.generateTextStream({
  prompt: 'Write a short story about AI',
  onChunk: (chunk) => process.stdout.write(chunk),
  onComplete: (result) => console.log(`\nTokens: ${result.totalTokens}`)
});
await controller.done;

// Dual-image editing (merge, blend, mix two images)
const result = await client.editImage('image-1-id', {
  imageId2: 'image-2-id',
  prompt: 'make an image with both subjects'
});
console.log('Merged Image ID:', result.imageId);

// Text-to-Speech
const speech = await client.generateSpeech({
  text: 'Hello, how can I help you today?',
  voice: TTS_VOICES.PAUL
});
console.log(`Duration: ${speech.duration}s`);

// Text-to-Speech with realtime mode (low-latency)
const realtimeSpeech = await client.generateSpeech({
  text: 'Fast response with realtime mode.',
  voice: TTS_VOICES.ALICE,
  realtime: true
});

// Speech-to-Text
const transcript = await client.transcribeAudio({
  audio: audioBase64, // base64-encoded audio you supply; not defined in this snippet
  language: 'en'
});
console.log(transcript.text);
Documentation
Full documentation is available in the Wiki.
| Guide | Description |
|-------|-------------|
| Getting Started | Installation, API key, initialization |
| Image Generation | Text-to-image, editing, upscaling, styles & formats |
| Video Generation | Image-to-video creation |
| LLM & Streaming | Text generation, streaming, OpenAI-compatible API |
| STT Speech-to-Text | Audio transcription, streaming, multi-language |
| TTS Text-to-Speech | Voice synthesis, cloning, realtime mode, streaming |
| CDN Operations | Downloads, watermarks, format conversion |
| WebSocket API | Real-time generation with progress updates |
| API Reference | Complete endpoint documentation |
| Examples | Full code examples |
| Troubleshooting | Common issues, debug mode, best practices |
| AI Agent Integration | Enable AI agents to use the SDK |
OpenAI Compatibility
Use with OpenAI client libraries:
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'zelai_pk_your_api_key_here',
  baseURL: 'https://api.zelstudio.com/v1'
});

const completion = await client.chat.completions.create({
  model: 'default',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true
});

for await (const chunk of completion) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
Available Styles
| Style | Description |
|-------|-------------|
| raw | Unprocessed, natural look |
| realistic | Photo-realistic |
| cine | Cinematic, film-like |
| portrait | Optimized for portraits |
| anime | Japanese anime style |
| manga | Japanese manga style |
| watercolor | Watercolor painting |
| paint | Oil/acrylic painting |
| comicbook | Western comic style |
See Image Generation for all 14 styles.
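If style names come from user input, it can help to validate them before passing them to the SDK. The sketch below is not part of the SDK: `resolveStyle` and `LISTED_STYLE_IDS` are hypothetical names, and it assumes the style IDs match the names in the table above (only the nine listed styles are covered, not all 14).

```typescript
// Style names copied from the table above. Assumption: these match the SDK's style IDs.
const LISTED_STYLE_IDS = [
  'raw', 'realistic', 'cine', 'portrait', 'anime',
  'manga', 'watercolor', 'paint', 'comicbook',
] as const;

type ListedStyleId = (typeof LISTED_STYLE_IDS)[number];

// Hypothetical helper (not an SDK function): normalize the input and
// fall back to a default when it is not one of the listed names.
function resolveStyle(
  input: string,
  fallback: ListedStyleId = 'realistic'
): ListedStyleId {
  const normalized = input.trim().toLowerCase();
  const match = LISTED_STYLE_IDS.find((id) => id === normalized);
  return match ?? fallback;
}

console.log(resolveStyle(' Anime '));   // anime
console.log(resolveStyle('vaporwave')); // realistic
```

The fallback keeps a request usable when the input is unrecognized; throwing an error instead would be the stricter choice if silent substitution is undesirable.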
Testing
The SDK includes comprehensive test suites covering REST, WebSocket, and OpenAI-compatible endpoints.
# Run all tests
npm test
# Run specific test suites
npm run test:rest # REST API tests (25 tests)
npm run test:ws # WebSocket tests (38 tests)
npm run test:openai # OpenAI-compatible tests (15 tests)
npm run test:stt # STT speech-to-text tests
npm run test:tts # TTS text-to-speech tests
See tests/README.md for detailed test documentation.
Changelog
See CHANGELOG.md for version history and release notes.
License
MIT License - see LICENSE for details.
Support
- Documentation: Wiki
- Issues: GitHub Issues
- Email: [email protected]
