VoiceAI SDK
🎙️ The official Node.js/TypeScript SDK for SLNG.AI - Simple, powerful voice AI for developers.
Quick Start
npm install voiceai-sdk
import { VoiceAI, tts, stt, llm } from 'voiceai-sdk';
// Initialize once in your app
new VoiceAI({
apiKey: 'your-api-key' // Get yours at https://slng.ai/signup
});
// Text to Speech
const audio = await tts.synthesize('Hello world', 'orpheus');
// Speech to Text
const transcript = await stt.transcribe(audioFile, 'whisper-v3');
// LLM Completion
const response = await llm.complete('What is the meaning of life?', 'llama-4-scout');
Why SLNG.AI?
- 🚀 All Voice AI in One Place - TTS, STT, and LLMs through a single API
- 🎯 Best Models - Access to Orpheus, ElevenLabs, Whisper, and more
- 💳 Simple Pricing - Pay-as-you-go with transparent credit system
- 👩‍💻 Developer First - Clean API, great docs, responsive founders
- ⚡ Fast Integration - Get started in minutes, not hours
- 🌍 Multi-language - Support for 29+ languages across models
Installation
npm install voiceai-sdk
# or
yarn add voiceai-sdk
# or
pnpm add voiceai-sdk
Authentication
Get your API key at https://slng.ai/signup
import { VoiceAI } from 'voiceai-sdk';
new VoiceAI({
apiKey: process.env.VOICEAI_API_KEY,
timeout: 60000 // Optional: custom timeout in ms (default: 30000)
});
Text-to-Speech (TTS)
Simple Usage
import { tts } from 'voiceai-sdk';
// Quick synthesis with model name
const audio = await tts.synthesize('Hello world', 'orpheus');
// Use convenience methods
const audio = await tts.orpheus('Hello world');
const audio = await tts.vui('Hello world');
const audio = await tts.koroko('Hello world');
// Orpheus Indic for Indian languages (Mumbai region - low latency)
const audio = await tts.orpheusIndic('नमस्ते', { language: 'hi' });
const audio = await tts.orpheusIndic('வணக்கம்', { language: 'ta' });
Advanced Options
// With voice and language options
const audio = await tts.orpheus('Bonjour le monde', {
voice: 'pierre',
language: 'fr',
stream: false
});
// ElevenLabs models
const audio = await tts.elevenlabs.multiV2('Hello world', {
voice: 'Rachel',
language: 'en',
stability: 0.5,
similarity_boost: 0.75
});
// Voice cloning with XTTS
const audio = await tts.xtts('Hello world', {
speakerVoice: 'base64_encoded_audio', // 6+ seconds of reference audio
language: 'en'
});
// Voice cloning with MARS6
const audio = await tts.mars6('Hello world', {
audioRef: 'base64_encoded_audio',
language: 'en-us',
refText: 'Reference transcript' // Optional but recommended
});
Streaming
const stream = await tts.synthesize('Long text...', 'orpheus', {
stream: true
});
// Handle streaming response
for await (const chunk of stream) {
// Process audio chunks
}
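What each streamed chunk contains depends on the model and runtime, but assuming the stream yields binary audio data (e.g. Buffer or Uint8Array chunks in Node.js), one way to persist it is to concatenate the chunks and write a single file - a rough sketch, not an official recipe:
import fs from 'node:fs';
import { tts } from 'voiceai-sdk';
const stream = await tts.synthesize('Long text...', 'orpheus', { stream: true });
// Collect binary chunks and write them out (adjust the extension to the model's audio format)
const chunks = [];
for await (const chunk of stream) {
  chunks.push(Buffer.from(chunk));
}
fs.writeFileSync('output.wav', Buffer.concat(chunks));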
Available Models
console.log(tts.models);
// ['vui', 'orpheus', 'orpheus-indic', 'koroko', 'xtts-v2', 'mars6', 'elevenlabs/multi-v2', ...]
// Get voices for a model
const voices = tts.getVoices('orpheus');
// ['tara', 'leah', 'jess', 'leo', 'dan', ...]
// Get supported languages
const languages = tts.getLanguages('orpheus');
// ['en', 'fr', 'de', 'ko', 'zh', 'es', 'it', 'hi']
// Orpheus Indic supports 8 major Indian languages
const indicLanguages = tts.getLanguages('orpheus-indic');
// ['hi', 'ta', 'te', 'bn', 'mr', 'gu', 'kn', 'ml']
// Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam
Speech-to-Text (STT)
Basic Transcription
import { stt } from 'voiceai-sdk';
// Transcribe audio file
const result = await stt.transcribe(audioFile, 'whisper-v3');
console.log(result.text);
// Convenience methods
const result = await stt.whisper(audioFile);
const result = await stt.kyutai(audioFile, { language: 'fr' });
With Options
// Whisper with options
const result = await stt.whisper(audioFile, {
language: 'es',
timestamps: true,
diarization: true
});
// Kyutai - optimized for French and English (Mumbai region)
const result = await stt.kyutai(audioFile, {
language: 'fr', // 'en' or 'fr' only
timestamps: true
});
// Access segments with timestamps
result.segments?.forEach(segment => {
console.log(`[${segment.start}-${segment.end}]: ${segment.text}`);
});
Supported Input Types
// File object (browser)
const file = document.getElementById('audio-input').files[0];
const result = await stt.whisper(file);
// Blob
const blob = new Blob([audioData], { type: 'audio/wav' });
const result = await stt.whisper(blob);
// ArrayBuffer
const buffer = await fetch('audio.mp3').then(r => r.arrayBuffer());
const result = await stt.whisper(buffer);
// Base64 string
const base64Audio = 'data:audio/wav;base64,...';
const result = await stt.whisper(base64Audio);
Language Models (LLM)
Simple Completion
import { llm } from 'voiceai-sdk';
// Single prompt
const result = await llm.complete('Explain quantum computing', 'llama-4-scout');
console.log(result.content);
// Convenience method
const result = await llm.llamaScout('What is the speed of light?');
Chat Format
const messages = [
{ role: 'system', content: 'You are a helpful assistant' },
{ role: 'user', content: 'What is the capital of France?' }
];
const result = await llm.llamaScout(messages, {
temperature: 0.7,
maxTokens: 500
});
Streaming Responses
const stream = await llm.llamaScout('Write a story...', {
stream: true
});
for await (const chunk of stream) {
process.stdout.write(chunk);
}
Handling Cold Starts
Some models may take 60-90 seconds to start up on first use. The SDK handles this automatically with:
- Smart timeouts: Models known to be slow get longer timeouts
- Clear messages: Timeout errors explain cold starts
- Warmup utilities: Pre-warm models before use
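If a request still times out while a model is spinning up, retrying once after a short delay usually succeeds. retryOnColdStart below is not part of the SDK - just a small wrapper sketch built on the timeout error message shown in Error Handling:
import { tts } from 'voiceai-sdk';
// Hypothetical helper: retry once when a request times out (likely a cold start)
async function retryOnColdStart(fn, delayMs = 10000) {
  try {
    return await fn();
  } catch (error) {
    if (error.message.includes('timed out')) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
      return fn();
    }
    throw error;
  }
}
const audio = await retryOnColdStart(() => tts.synthesize('Hello world', 'orpheus'));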
Pre-warming Models
import { warmup } from 'voiceai-sdk';
// Warm up a single model
await warmup.tts('orpheus');
await warmup.stt('whisper-v3');
await warmup.llm('llama-4-scout');
// Warm up multiple models in parallel
await warmup.multiple([
{ type: 'tts', model: 'orpheus' },
{ type: 'stt', model: 'whisper-v3' },
{ type: 'llm', model: 'llama-4-scout' }
]);
Custom Timeouts
// Global timeout for all requests
new VoiceAI({
apiKey: 'your-key',
timeout: 120000 // 2 minutes
});
// The SDK automatically uses longer timeouts for known slow models:
// - Orpheus: 90s
// - Orpheus Indic: 90s
// - XTTS-v2: 90s
// - Whisper-v3: 120s
// - MARS6: 90s
Error Handling
try {
const audio = await tts.synthesize('Hello', 'orpheus');
} catch (error) {
if (error.message.includes('Authentication failed')) {
// Invalid API key
} else if (error.message.includes('Insufficient credits')) {
// Need more credits
} else if (error.message.includes('Rate limit')) {
// Too many requests
} else if (error.message.includes('timed out')) {
// Model may be cold starting - retry in a moment
}
}
Examples
Build a Voice Assistant
import { VoiceAI, tts, stt, llm } from 'voiceai-sdk';
new VoiceAI({ apiKey: process.env.VOICEAI_API_KEY });
async function voiceAssistant(audioInput) {
// 1. Transcribe user's speech
const transcript = await stt.whisper(audioInput);
// 2. Generate AI response
const response = await llm.llamaScout(transcript.text);
// 3. Convert response to speech
const audio = await tts.orpheus(response.content, {
voice: 'tara',
language: 'en'
});
return audio;
}
Multilingual TTS
const languages = {
en: 'Hello world',
fr: 'Bonjour le monde',
de: 'Hallo Welt',
es: 'Hola mundo'
};
for (const [lang, text] of Object.entries(languages)) {
const audio = await tts.orpheus(text, {
language: lang,
voice: getVoiceForLanguage(lang)
});
// Save or play audio
}
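getVoiceForLanguage above is not an SDK function - it stands in for whatever voice-selection logic your app needs. A minimal sketch using voice names that appear elsewhere in this README; check tts.getVoices('orpheus') for the real list:
// Hypothetical helper - map a language code to a voice name
function getVoiceForLanguage(lang) {
  const voiceByLanguage = { en: 'tara', fr: 'pierre' };
  return voiceByLanguage[lang] ?? 'tara'; // fall back to a default voice
}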
Voice Cloning
// Clone voice with XTTS-v2
const referenceAudio = await loadAudioAsBase64('speaker.wav');
const clonedSpeech = await tts.xtts('This is my cloned voice', {
speakerVoice: referenceAudio,
language: 'en'
});
// Clone with MARS6 (supports prosody)
const clonedWithProsody = await tts.mars6('Excited speech!', {
audioRef: referenceAudio,
refText: 'This is how I normally speak',
language: 'en-us',
temperature: 0.8
});
TypeScript Support
Full TypeScript support with exported types:
import {
VoiceAI,
TTSOptions,
TTSResult,
STTOptions,
STTResult,
LLMMessage,
LLMOptions,
LLMResult
} from 'voiceai-sdk';
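These types can annotate your own variables and wrappers. A small sketch, assuming the exported types carry the same fields used in the examples above:
import { tts, llm } from 'voiceai-sdk';
import type { TTSOptions, LLMMessage } from 'voiceai-sdk';
// Hypothetical usage - field names mirror the options shown earlier in this README
const ttsOptions: TTSOptions = { voice: 'tara', language: 'en' };
const messages: LLMMessage[] = [
  { role: 'system', content: 'You are a helpful assistant' },
  { role: 'user', content: 'Summarize this transcript.' }
];
const audio = await tts.orpheus('Hello world', ttsOptions);
const reply = await llm.llamaScout(messages);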
Need Help?
- Documentation: https://slng.ai/docs
- Dashboard: https://slng.ai/dashboard
- Pricing: https://slng.ai/pricing
Feedback & Support
We're building this for developers like you. Your feedback matters!
- 📧 Email founders: [email protected]
- 🤖 Request models: Need a specific model? Just ask!
- 🐛 Report issues: [email protected]
- 💬 Discord: Coming soon!
Contributing
We welcome contributions! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
- Request new models
License
MIT © SLNG.AI
Built with ❤️ by the SLNG.AI team. Making voice AI simple for developers everywhere.
