koko-tts
v0.2.1
The friendly TTS CLI - Just run 'koko' for instant text-to-speech magic!
Koko TTS - Kokoro Text-to-Speech CLI
A simple, powerful command-line tool for text-to-speech generation using the Kokoro TTS engine. Convert text to natural-sounding speech with 28+ professional voices.
Features
- 28 Professional Voices with quality grades (American & British English)
- Interactive Mode - Just run koko for a clean, guided experience
- File Input - Process text files via CLI or interactive mode
- Voice Control - Choose speed, temperature, and voice
- Zero Config - Works out of the box
- Streaming - Real-time generation for long texts
- Multiple Formats - WAV and PCM output
- Auto-Chunking - Bypass 25-second limit with automatic text splitting
- Audio Stitching - Chunks automatically combined into single files
- Auto-Cleanup - Temp files cleaned automatically
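The auto-chunking feature listed above splits long text before generation. How koko-tts performs the split is not documented here, so the following is only a minimal sketch, assuming sentence-boundary splitting under a character budget. The function name chunkText and the maxChars parameter are hypothetical and not part of the koko-tts API.

```typescript
// Hypothetical sketch, not the koko-tts implementation.
// Splits text into chunks of at most maxChars characters, breaking only
// between sentences. A single sentence longer than maxChars is kept whole,
// and punctuation at the very start of the text is ignored.
function chunkText(text: string, maxChars: number): string[] {
  // A "sentence" here is a run of non-terminator characters followed by
  // any terminators (. ! ?) and trailing whitespace.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the budget.
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}
```

Each returned chunk could then be generated separately and the resulting audio combined, which is the behavior the Audio Stitching feature describes.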
Quick Start
Interactive Mode (Easiest)
# Using npx (no installation needed)
npx koko-tts@latest
# Or install globally first
npm install -g koko-tts
koko
# Clean, guided interface:
# Koko TTS
# Simple text-to-speech generation
#
# What would you like to do?
# ❯ Generate speech
#   Browse voices
#   Exit
Command Line (For Scripts & Automation)
# Generate speech instantly
npx koko-tts@latest generate "Hello, this is Koko TTS!"
# With specific voice
npx koko-tts@latest generate "Welcome to Koko!" --voice af_heart
# From a text file
npx koko-tts@latest generate --file story.txt --voice bf_emma
Using Nix (Recommended for Development)
# Clone and enter development environment
git clone https://github.com/piotutic/koko-tts
cd koko-tts
nix develop
# Build and use
npm run build
koko generate "Hello from Nix!"
Usage
Basic Commands
# Simple generation (uses default voice af_sarah)
koko generate "Your text here"
# Choose a specific voice
koko generate "Hello world" --voice af_heart
# Read from file
koko generate --file input.txt --output audiobook.wav
# Adjust speaking speed and expressiveness
koko generate "Custom speech" --speed 0.8 --temperature 0.9
# Quiet mode (minimal output)
koko generate "Silent generation" --quiet
List Available Voices
# Show all voices
koko voices
# Filter by category
koko voices --category recommended
koko voices --category american
koko voices --category british
# JSON output for scripting
koko voices --json
Interactive Mode
# Launch interactive interface (default when no arguments)
koko
# Or explicitly
koko interactive
# Interactive features:
# - Choose between typing text or loading from file
# - Smart voice selection with defaults
# - File browser with validation
# - Custom filename or smart auto-naming
# - Clean, professional interface
Voice Options
Recommended Voices (Highest Quality)
| Voice ID | Description | Language | Gender |
| ------------ | ---------------------- | ---------- | ------ |
| af_heart | Warm, expressive ⭐ | US English | Female |
| af_bella | Clear, professional ⭐ | US English | Female |
| bf_emma | Elegant, refined ⭐ | UK English | Female |
| am_michael | Smooth, versatile | US English | Male |
| bm_george | Distinguished, clear | UK English | Male |
⭐ = Top quality voices
Voice Categories
- American Female: af_heart, af_bella, af_sarah (default), af_nicole, af_kore
- American Male: am_michael, am_fenrir, am_puck, am_echo, am_eric
- British Female: bf_emma, bf_isabella, bf_alice, bf_lily
- British Male: bm_george, bm_fable, bm_lewis, bm_daniel
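The voice IDs listed above appear to follow a two-letter prefix pattern: a or b for American or British, then f or m for female or male, then an underscore and a name. A minimal sketch of decoding that pattern follows. The describeVoice helper is hypothetical and not part of koko-tts, and the pattern is inferred only from the IDs in this list.

```typescript
// Hypothetical helper, not part of koko-tts.
// Decodes a voice ID such as "af_heart" based on the prefix pattern
// inferred from the voice categories list.
type Accent = "American" | "British";
type Gender = "Female" | "Male";

interface VoiceInfo {
  accent: Accent;
  gender: Gender;
  name: string;
}

function describeVoice(voiceId: string): VoiceInfo | null {
  // Expected shape: <a|b><f|m>_<lowercase name>
  if (!/^[ab][fm]_[a-z]+$/.test(voiceId)) return null;
  return {
    accent: voiceId.charAt(0) === "a" ? "American" : "British",
    gender: voiceId.charAt(1) === "f" ? "Female" : "Male",
    name: voiceId.slice(3),
  };
}
```

For example, describeVoice("bf_emma") yields a British female voice named emma, while an ID that does not match the pattern yields null.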
Options
| Option | Description | Default | Range |
| --------------- | ----------------- | ------------ | -------------- |
| --voice | Voice to use | af_sarah | See voice list |
| --speed | Speaking speed | 1.0 | 0.5 - 2.0 |
| --temperature | Expressiveness | 0.7 | 0.1 - 1.0 |
| --output | Output filename | output.wav | Any path |
| --file | Input text file | - | Any .txt file |
| --quiet | Minimal output | false | Boolean |
| --streaming | Stream long texts | false | Boolean |
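When calling koko from a script, it can help to check numeric options against the ranges in the table above before spawning the process. The following is a minimal sketch under that assumption. The validateOptions function and the GenerateOptions type are hypothetical names, and how the CLI itself reacts to out-of-range values is not stated in this README.

```typescript
// Hypothetical pre-flight check, not part of koko-tts.
interface GenerateOptions {
  speed?: number;
  temperature?: number;
}

// Ranges and defaults copied from the options table.
const RANGES = {
  speed: { min: 0.5, max: 2.0, fallback: 1.0 },
  temperature: { min: 0.1, max: 1.0, fallback: 0.7 },
} as const;

// Returns a list of human-readable problems; an empty list means the
// options fall inside the documented ranges.
function validateOptions(options: GenerateOptions): string[] {
  const errors: string[] = [];
  for (const key of ["speed", "temperature"] as const) {
    const { min, max, fallback } = RANGES[key];
    const value: number = options[key] ?? fallback;
    if (!Number.isFinite(value) || value < min || value > max) {
      errors.push(`--${key} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return errors;
}
```

Omitted options fall back to the documented defaults, so validateOptions({}) reports no problems.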
Examples
Basic Text Generation
# Simple generation
koko generate "Welcome to Koko TTS!"
# Professional presentation voice
koko generate "Good morning everyone" --voice af_bella --speed 0.9
# Storytelling with British accent
koko generate "Once upon a time..." --voice bf_emma --temperature 0.8
File Processing
# Command line file processing
koko generate --file chapter1.txt --voice am_michael --output chapter1.wav
# Interactive file processing
koko
# Choose "Generate speech"
# Choose "Load from file"
# Enter file path with validation
# Batch process with streaming (for long files)
koko generate --file novel.txt --streaming --output audiobook.wav
Interactive Mode Workflow
# Start interactive mode
koko
# 1. Main Menu
# What would you like to do?
# ❯ Generate speech
#   Browse voices
#   Exit
# 2. Input Method (when generating speech)
# How would you like to provide text?
# ❯ Type text manually
#   Load from file
# 3. File Input (if file selected)
# Enter file path: story.txt
# ✅ Loaded 1,240 characters from file
# Preview: "Once upon a time in a distant galaxy..."
# 4. Voice Selection
# Use default voice (Sarah)? (Y/n)
# Select voice: Heart (Female) / Michael (Male) / etc.
# 5. Filename Customization
# Output filename: (koko_20250917T143022.wav)
# Press Enter for default or type custom name: my-presentation
# → Uses: my-presentation.wav
# 6. Generation
# Generating: my-presentation.wav
# ✅ Success! Generated: my-presentation.wav
Voice Comparison
# Test the same text with different voices
koko generate "Voice test" --voice af_heart --output heart.wav
koko generate "Voice test" --voice bf_emma --output emma.wav
koko generate "Voice test" --voice am_michael --output michael.wav
Filename Examples
# Interactive mode filename options:
# Press Enter → koko_20250917T143022.wav (auto-timestamp)
# Type "presentation" → presentation.wav (auto .wav)
# Type "chapter1.wav" → chapter1.wav (keeps extension)
# Type "audio-notes" → audio-notes.wav (auto .wav)
# Command line (unchanged)
koko generate "text" --output my-custom-name.wav
Advanced Usage
Configuration
# Initialize configuration file
koko config --init
# Use custom configuration
koko generate "Text" --config my-settings.yml
# Save current settings as preset
koko generate "Test" --voice af_heart --save-config my-preset.yml
Audio Stitching & Chunks
# Default: Chunks are automatically stitched into single file
koko generate --file long-text.txt
# → Creates: combined-audio.wav
# Keep individual chunks AND combined file
koko generate --file long-text.txt --keep-chunks
# → Creates: combined-audio.wav + individual chunk files
# Disable stitching (legacy behavior)
koko generate --file long-text.txt --no-stitch
# → Creates: audio_001.wav, audio_002.wav, audio_003.wav...
Cleanup
# Clean temp files older than 24 hours (default)
koko cleanup
# Custom cleanup age
koko cleanup --max-age 48
# Verbose cleanup output
koko cleanup --verbose
Performance & Quality
# High quality (slower)
koko generate "Text" --dtype fp32
# Balanced quality/speed (default)
koko generate "Text" --dtype q8
# Fast generation (lower quality)
koko generate "Text" --dtype q4
Performance Tips
- First Run: Downloads the model (~100MB); subsequent runs are much faster
- Voice Selection: af_heart and af_bella provide the best quality
- Speed Settings: 0.8-0.9 for presentations, 1.0-1.2 for casual content
- Long Texts: Use --streaming for files over 500 characters
- File Format: WAV provides best compatibility
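The tips above, together with the --dtype levels under Performance & Quality, can be folded into a small argument builder for scripted use. This is a sketch only: buildGenerateArgs, the Quality type, and the quality labels are hypothetical names, while the flags and the 500-character threshold come from this README.

```typescript
// Hypothetical argument builder, not part of koko-tts.
type Quality = "high" | "balanced" | "fast";

// Mapping from the Performance & Quality examples:
// fp32 = high quality (slower), q8 = balanced (default), q4 = fast.
const DTYPE_FOR: Record<Quality, string> = {
  high: "fp32",
  balanced: "q8",
  fast: "q4",
};

// Character count above which the Long Texts tip suggests --streaming.
const STREAMING_THRESHOLD = 500;

function buildGenerateArgs(text: string, quality: Quality = "balanced"): string[] {
  const args = ["generate", text, "--dtype", DTYPE_FOR[quality]];
  if (text.length > STREAMING_THRESHOLD) args.push("--streaming");
  return args;
}
```

The resulting array could be passed to Node's child_process.spawn with "koko" as the command, which avoids shell-quoting issues for text containing quotes or spaces.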
Organized Directory Structure
Koko TTS automatically organizes all files in a .koko-tts/ directory:
.koko-tts/
├── config/              # Configuration files
├── cache/               # Audio cache for faster re-generation
├── outputs/             # Generated audio files
│   ├── YYYY-MM-DD/      # CLI outputs by date
│   └── interactive/     # Interactive mode outputs
│       └── YYYY-MM-DD/
└── temp/                # Temporary files (auto-cleaned)
Benefits:
- Clean workspace (no scattered output files)
- Easy cleanup (delete entire .koko-tts/ folder)
- Organized by date and generation mode
- Add to .gitignore: echo ".koko-tts/" >> .gitignore
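Scripts that post-process generated audio need to know where a given run's files land. Based on the directory layout above, the output folder for a date and mode could be computed as in the sketch below. The outputDir function is hypothetical, and whether the tool uses local time or UTC for the date folder is an assumption (UTC is used here).

```typescript
// Hypothetical helper, not part of koko-tts.
type Mode = "cli" | "interactive";

function outputDir(mode: Mode, date: Date, root: string = ".koko-tts"): string {
  // YYYY-MM-DD taken from the ISO timestamp, i.e. in UTC (assumption).
  const day = date.toISOString().slice(0, 10);
  // CLI outputs sit directly under outputs/, interactive ones under
  // outputs/interactive/, as shown in the directory tree.
  const parts = mode === "interactive"
    ? [root, "outputs", "interactive", day]
    : [root, "outputs", day];
  return parts.join("/");
}
```

The forward-slash join keeps the sketch dependency-free; on Windows, Node's path.join would be the more portable choice.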
Troubleshooting
Common Issues
Model Download Fails
# Check internet connection and retry
koko generate "test" --verbose
Audio File Issues
# Verify output file was created
ls -la output.wav
# Test audio playback (Linux)
ffplay output.wav
# or
aplay output.wav
Permission Denied
# Ensure CLI has execute permissions
npm run build # This sets permissions automatically
Development
Project Structure
src/
├── cli.ts         # Main CLI application
├── tts-engine.ts  # TTS engine wrapper
├── voices.ts      # Voice configurations
├── types.ts       # TypeScript definitions
└── utils.ts       # Utility functions
Build Commands
# Build TypeScript
npm run build
# Type checking
npm run type-check
# Clean build files
npm run clean
# Development mode
npm run cli -- generate "dev test"
Adding New Voices
Edit src/voices.ts to add new voice configurations with metadata.
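The actual shape of the entries in src/voices.ts is not shown in this README. As a rough illustration only, a voice entry carrying the metadata columns from the Recommended Voices table might look like the sketch below. The VoiceConfig interface, its field names, and the filterVoices helper are all hypothetical; check the real file before copying anything.

```typescript
// Hypothetical shape; the real definitions in src/voices.ts may differ.
interface VoiceConfig {
  id: string;
  description: string;
  language: "US English" | "UK English";
  gender: "Female" | "Male";
  recommended: boolean;
}

// Two sample entries using values from the Recommended Voices table.
const sampleVoices: VoiceConfig[] = [
  { id: "af_heart", description: "Warm, expressive", language: "US English", gender: "Female", recommended: true },
  { id: "bm_george", description: "Distinguished, clear", language: "UK English", gender: "Male", recommended: true },
];

// Returns only the voices for one language.
function filterVoices(all: VoiceConfig[], language: VoiceConfig["language"]): VoiceConfig[] {
  return all.filter((voice) => voice.language === language);
}
```

Typing the language and gender fields as string unions lets the compiler catch a misspelled value when a new voice is added.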
License
Apache License 2.0 - see LICENSE file for details.
Acknowledgments
- Kokoro TTS - Original model by hexgrad
- kokoro-js - JavaScript implementation by Xenova
- Transformers.js - Machine learning in JavaScript
Links
Simple, powerful text-to-speech: Just run koko for interactive mode or koko generate "your text" for command line
