saym
v1.0.1
Say iMproved - Advanced text-to-speech CLI with custom voice models
saym - Say iMproved
A powerful text-to-speech command-line tool that extends the traditional say command with advanced voice synthesis capabilities using multiple AI-powered TTS providers. Create custom voice models from your own voice and speak in multiple languages with natural-sounding output.
Live Demo (with Audio)
Here's a video example of saym in action with audio output:
📹️ saym usage example (click to play with sound)
This video demonstrates saym reading its own command description using ElevenLabs' high-quality voice synthesis engine. Turn on your audio to hear the synthesized speech!
Features
- 🎯 High-Quality Synthesis: Leverage advanced AI voice synthesis from multiple providers
- 💬 Simple CLI Interface: Easy-to-use command-line interface similar to the native say command
- 🔊 Audio Output Options: Save to file or play directly through speakers
- 🎛️ Voice Customization: High-quality voice synthesis with provider-optimized settings
- 🔄 Multiple Providers: Support for multiple industry-leading TTS APIs with easy provider switching
Installation
Quick Start with npx (No Installation Required)
# Use directly with npx
npx saym "Hello world"
# Always use the latest version
npx saym@latest "Hello world"
# Use without downloading (requires cache)
npx --no-install saym "Hello world"
# Set up your API key first (at least one is required)
export ELEVENLABS_API_KEY="your-api-key"
# or
export CARTESIA_API_KEY="your-api-key"
# or
export RESEMBLE_API_KEY="your-api-key"
Understanding npx behavior
- First run: Downloads the package and caches it
- Subsequent runs: Uses cached version (fast)
- @latest: Always checks for and uses the latest version
- --no-install: Only runs if already in cache (no download)
Global Installation
# Install globally via npm
npm install -g saym
# Now use the saym command directly
saym "Hello world"
Local Development
# Clone the repository
git clone https://github.com/yourusername/saym.git
cd saym
# Install dependencies
npm install
# Build the project
npm run build
# Set up your API keys (at least one is required)
export ELEVENLABS_API_KEY="your-elevenlabs-api-key"
export CARTESIA_API_KEY="your-cartesia-api-key"
export RESEMBLE_API_KEY="your-resemble-api-key"
# For XTTS v2 (optional)
export XTTS_SERVER_URL="http://localhost:8020" # Optional, defaults to localhost:8020
Usage
Basic Usage
# Speak text using default voice
saym "Hello, world!"
# Use a specific voice model by ID or name
saym -v "voice-id-or-name" "This is my custom voice"
# Specify language for accurate pronunciation (important for non-English)
saym -v "voice-id" -l ja "今日は良い天気ですね"
# Use Cartesia provider instead of default ElevenLabs
saym -p cartesia -v "12345678-abcd-efgh-ijkl-9876543210ab" "Hello from Cartesia!"
# Use XTTS v2 provider (requires XTTS server running)
saym -p xtts -v "voice.wav" "Hello from XTTS v2!"
# Use Resemble AI provider (voice cloning and emotion control)
saym -p resemble -v "voice-uuid" "Hello from Resemble AI!"
# Read from file
saym -f input.txt
# Save output to file
saym -o output.mp3 "Save this speech to a file"
# Stream audio for real-time playback
saym -s "Stream this text as it's being synthesized"
Voice Management
# List available voices for current provider (owned voices only)
saym voices
# List voices from a specific provider
saym voices -p cartesia
saym voices -p xtts
saym voices -p resemble
# List all public voices (including pre-made voices)
saym voices --all
Language Support
# Japanese with ElevenLabs (recommended for accurate pronunciation)
saym -v "japanese-voice-id" -l ja "今日は良い天気ですね"
# Spanish with explicit language
saym -v "spanish-voice-id" -l es "Hola, ¿cómo estás?"
# Cartesia (automatic language detection)
saym -p cartesia -v "voice-id" "今日は"
Advanced Options
# Use different audio format
saym --format wav -o output.wav "Save as WAV file"
# Configuration management
saym config # Show current configuration
saym use elevenlabs # Switch to ElevenLabs (simple!)
saym use cartesia # Switch to Cartesia (simple!)
saym use xtts # Switch to XTTS v2 (simple!)
saym voice <voice-id> # Set default voice for current provider
saym voice <voice-id> -p cartesia # Set default voice for specific provider
# Advanced configuration (for power users)
saym config provider elevenlabs # Alternative way to set provider
saym config voice <voice-id> # Alternative way to set voice
saym config reset # Reset to defaults
# List supported providers
saym providers
Using Custom Voice Models
To create custom voice models, use the respective web interfaces:
- ElevenLabs: Visit ElevenLabs Voice Lab to create and train custom voices
- Cartesia: Visit Cartesia to access voice cloning features
- XTTS v2: Use your own voice samples (.wav files) directly with the XTTS server
Once you have created a custom voice through these services, you can use it with saym:
# Use your custom voice ID
saym --voice <your-voice-id> "Hello, this is my voice!"
# Or set as default (see Configuration section below for details)
saym config set-default-voice elevenlabs <your-voice-id>
saym "Now using my voice by default!"
Configuration
Create a .saymrc file in your home directory for default settings:
{
  "defaultVoice": "global-fallback-voice-id",
  "defaultLanguage": "en",
  "outputFormat": "mp3",
  "ttsProvider": "elevenlabs",
  "providers": {
    "elevenlabs": {
      "apiKey": "optional-if-not-in-env",
      "defaultVoice": "elevenlabs-specific-voice-id"
    },
    "cartesia": {
      "apiKey": "optional-if-not-in-env",
      "defaultVoice": "cartesia-specific-voice-id"
    },
    "xtts": {
      "apiKey": "optional-if-not-in-env",
      "serverUrl": "http://localhost:8020",
      "defaultVoice": "voice.wav"
    }
  }
}
Setting Up Default Provider and Voices
saym uses a priority system for selecting voices:
- Command line voice (-v voice-id) - highest priority
- Provider-specific default voice - per-provider defaults
- Global default voice - fallback for all providers
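The priority order above can be sketched as a small shell function. This is only an illustration of the lookup order, not saym's actual implementation: `resolve_voice` is a hypothetical helper, and returning a failure status when no voice is set at any level is an assumption, since the behavior in that case is not documented here.

```shell
# Hypothetical helper (not a saym command): print the first non-empty
# candidate, in the documented priority order.
#   $1: voice passed on the command line with -v (may be empty)
#   $2: provider-specific default voice (may be empty)
#   $3: global default voice (may be empty)
resolve_voice() {
  for candidate in "${1-}" "${2-}" "${3-}"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1  # assumption: nothing configured at any level is treated as an error
}

# No -v given, so the provider-specific default wins over the global one.
resolve_voice "" "cartesia-specific-voice-id" "global-fallback-voice-id"
# prints: cartesia-specific-voice-id
```

The voice IDs in the example are the placeholder values from the .saymrc sample above.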
Step 1: Choose Your Default TTS Provider
# Set ElevenLabs as default (high quality, more expensive)
saym use elevenlabs
# Or set Cartesia as default (ultra-low latency, cost-effective)
saym use cartesia
# Or set XTTS v2 as default (self-hosted, no API costs)
saym use xtts
Step 2: Find Available Voices
# List voices for your default provider
saym voices
# List voices for a specific provider
saym voices -p elevenlabs
saym voices -p cartesia
saym voices -p xtts
# List ALL voices (including public ones)
saym voices --all
saym voices -p cartesia --all
Step 3: Set Provider-Specific Default Voices
# Set default voice for current provider
saym voice "abc123def456ghi789"
# Or set for specific provider
saym voice "12345678-abcd-efgh-ijkl-9876543210ab" -p cartesia
saym voice "abc123def456ghi789" -p elevenlabs
saym voice "voice.wav" -p xtts
# Optional: Set global fallback voice (advanced)
saym config set defaultVoice "some-voice-id"
Step 4: Test Your Configuration
# Use default provider and its default voice
saym "Hello world"
# Use specific provider with its default voice
saym -p elevenlabs "Hello from ElevenLabs"
saym -p cartesia "Hello from Cartesia"
# Override with specific voice
saym -p elevenlabs -v "different-voice-id" "Hello with specific voice"
Step 5: View Your Configuration
# Show all current settings
saym config
# Show supported providers
saym providers
Quick Setup Examples
For ElevenLabs users (super simple!):
# 1. Switch to ElevenLabs
saym use elevenlabs
# 2. Find your preferred voice
saym voices
# 3. Set it as default
saym voice "your-voice-id"
# 4. Test
saym "This uses my ElevenLabs default voice"
For Cartesia users (super simple!):
# 1. Switch to Cartesia
saym use cartesia
# 2. Find your preferred voice (owned voices only by default)
saym voices
# 3. Set it as default
saym voice "your-voice-id"
# 4. Test
saym "This uses my Cartesia default voice"
For XTTS v2 users (self-hosted):
# 1. Switch to XTTS v2
saym use xtts
# 2. List available voice files
saym voices
# 3. Set your voice file as default
saym voice "voice.wav"
# 4. Test
saym "This uses my XTTS v2 voice"
For users with both ElevenLabs and Cartesia:
# Set default provider
saym use cartesia
# Set default voices for both providers
saym voice "cartesia-voice-id" # For current (cartesia)
saym voice "elevenlabs-voice-id" -p elevenlabs # For elevenlabs
# Now you can easily switch:
saym "Uses Cartesia (default provider)"
saym -p elevenlabs "Uses ElevenLabs with its default voice"
Requirements
- Node.js 18+ or Deno
- At least one TTS provider:
  - ElevenLabs API account and API key, OR
  - Cartesia API account and API key, OR
  - Resemble AI API account and API key, OR
  - XTTS v2 server running locally or remotely
- FFmpeg (for audio format conversions)
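A quick way to check these requirements before a first run is a small preflight script. The sketch below is not part of saym: `node_version_ok` and `configured_providers` are hypothetical helpers, and the provider check looks only at environment variables. Keys stored in .saymrc are not detected, and because XTTS_SERVER_URL is optional (it defaults to localhost:8020), a running XTTS server may go unreported.

```shell
# Hypothetical preflight helpers; none of these are saym commands.

# Succeeds when a Node.js version string such as "v18.17.0" is 18 or newer.
node_version_ok() {
  version="${1#v}"        # drop the leading "v"
  major="${version%%.*}"  # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # empty or not a plain number
  esac
  [ "$major" -ge 18 ]
}

# Prints the providers whose environment variable is set, one per line,
# without echoing any key values.
configured_providers() {
  [ -n "${ELEVENLABS_API_KEY-}" ] && echo elevenlabs
  [ -n "${CARTESIA_API_KEY-}" ] && echo cartesia
  [ -n "${RESEMBLE_API_KEY-}" ] && echo resemble
  [ -n "${XTTS_SERVER_URL-}" ] && echo xtts
  return 0
}

if command -v node >/dev/null 2>&1 && node_version_ok "$(node --version)"; then
  echo "Node.js: ok"
else
  echo "Node.js: version 18+ not found"
fi

command -v ffmpeg >/dev/null 2>&1 || echo "FFmpeg: not found on PATH"

echo "Providers configured via environment:"
configured_providers
```

Reporting provider names rather than the variables themselves keeps API keys out of terminal scrollback and logs.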
API Key Setup
You can use ElevenLabs, Cartesia, Resemble AI, or XTTS v2 (or all). Here's how to set up each:
ElevenLabs Setup
1. Create an ElevenLabs Account
- Visit ElevenLabs and click "Sign Up"
- Create an account using email or Google/GitHub authentication
- Choose a subscription plan (Free tier available with limited usage)
2. Generate ElevenLabs API Key
- Log in to your ElevenLabs dashboard
- Click on your profile icon (top right) → "Profile + API Key"
- In the API section, click "Generate API Key"
- Copy the generated API key immediately (it won't be shown again)
Cartesia Setup
1. Create a Cartesia Account
- Visit Cartesia and sign up for access
- Create an account and get API access
- Cartesia offers ultra-low latency TTS with their Sonic models
2. Generate Cartesia API Key
- Log in to your Cartesia dashboard
- Navigate to API keys section
- Generate and copy your API key
Resemble AI Setup
1. Create a Resemble AI Account
- Visit Resemble AI and sign up
- Create an account to access voice cloning and synthesis features
- Resemble AI offers advanced voice cloning with emotion control
2. Generate Resemble AI API Key
- Log in to your Resemble AI dashboard
- Navigate to Settings → API Keys
- Click "Create New API Key"
- Copy the generated API key
XTTS v2 Setup
XTTS v2 is a self-hosted TTS system with voice cloning capabilities.
Please follow the 📖 XTTS v2 Setup Guide for complete installation and configuration instructions.
3. Verify API Keys
Test your API key setup:
# Check if environment variables are set
echo $ELEVENLABS_API_KEY
echo $CARTESIA_API_KEY
echo $RESEMBLE_API_KEY
echo $XTTS_SERVER_URL
# Test with saym (ElevenLabs)
saym voices
# Test with saym (Cartesia)
saym voices -p cartesia
# Test with saym (Resemble AI)
saym voices -p resemble
# Test with saym (XTTS v2)
saym voices -p xtts
License
MIT License - see LICENSE file for details
