@okw/stt
v1.0.0
Text to Speech CLI using ElevenLabs API
stt - Text to Speech CLI 🎤
Give your AI agent a voice using the ElevenLabs API. Audio is streamed and played directly from your terminal.
Why?
AI coding agents can use this CLI to speak to you: to announce completions, read errors aloud, or provide audio feedback while you work.
Requirements
- Node.js 18+
- mpv for audio playback
- ElevenLabs API key
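The requirements above could be checked before the first run with a small preflight script. This is a minimal sketch, not part of `@okw/stt`: the function names `have_cmd` and `stt_preflight` are hypothetical, and it assumes the API key is supplied through the `ELEVENLABS_API_KEY` environment variable.

```shell
#!/bin/sh
# Hypothetical preflight helper; not part of @okw/stt.

# Succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints one line per unmet requirement and returns 1 if any is unmet.
stt_preflight() {
  missing=0
  if have_cmd node; then
    # process.versions.node looks like "18.19.0"; keep the major part.
    major=$(node -p 'process.versions.node.split(".")[0]')
    if [ "$major" -lt 18 ]; then
      echo "unmet: Node.js 18+ (found $major)"
      missing=1
    fi
  else
    echo "unmet: Node.js 18+ (node not found)"
    missing=1
  fi
  if ! have_cmd mpv; then
    echo "unmet: mpv"
    missing=1
  fi
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "unmet: ELEVENLABS_API_KEY is not set"
    missing=1
  fi
  return "$missing"
}
```

Calling `stt_preflight && echo "ready"` prints `ready` only when all three requirements are met.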
# macOS
brew install mpv
Setup
Set your ElevenLabs API key:
export ELEVENLABS_API_KEY=your_api_key_here
Usage
# Basic usage (recommended)
npx @okw/stt "Hello, world!"
# Save to file instead of playing
npx @okw/stt -o output.mp3 "Hello, world!"
# Use a different voice
npx @okw/stt -v CwhRBWXzGAHq8TQ4Fs17 "Hello, world!"
# Adjust speech speed
npx @okw/stt --speed 1.5 "Faster speech"
# Use audio tags (v3 model)
npx @okw/stt "I'm so excited! [laughs] This is amazing!"
Global Installation
If you use it frequently, install globally:
npm install -g @okw/stt
stt "Hello, world!"
Give Your AI Agent a Voice
Add this to your AGENTS.md to let your AI agent speak:
## Voice Output
You can speak to the user using: `npx @okw/stt "Your message"`
Use sparingly for:
- Task completion announcements
- Critical errors
- When user explicitly asks for voice output
Keep messages to 1-2 sentences. Don't read everything aloud.
Options
| Option | Description | Default |
| --------------------- | ------------------------------- | ---------------------- |
| -v, --voice <id> | Voice ID | UgBBYS2sOqTuMpoF3BR0 |
| -m, --model <id> | Model ID | eleven_v3 |
| -s, --stability <n> | Voice stability (0-1) | 0.5 |
| -b, --boost <n> | Similarity boost (0-1) | 0.75 |
| --speed <n> | Speech speed (0.5-2.0) | 1.0 |
| -o, --output <file> | Save to file instead of playing | - |
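When the CLI is called from a script, the numeric ranges in the table can be checked before any request is made. The sketch below is an assumption-laden illustration, not part of `@okw/stt`: `in_range` and `speak_checked` are hypothetical names, and `STT_CMD` is an assumed override variable that lets the speaker command be swapped out. Only the `-s`, `-b`, and `--speed` flags come from the table.

```shell
# Hypothetical helpers; not part of @okw/stt.

# Succeeds if $1 is a plain decimal number within [$2, $3].
# awk is used because POSIX sh has no floating-point comparison.
in_range() {
  awk -v x="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    # Reject anything that is not digits with an optional fractional part.
    if (x !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
    exit !(x + 0 >= lo + 0 && x + 0 <= hi + 0)
  }'
}

# Validates stability, boost, and speed against the documented ranges,
# then speaks the message.
# Usage: speak_checked STABILITY BOOST SPEED MESSAGE
speak_checked() {
  in_range "$1" 0 1     || { echo "stability must be 0-1" >&2; return 2; }
  in_range "$2" 0 1     || { echo "boost must be 0-1" >&2; return 2; }
  in_range "$3" 0.5 2.0 || { echo "speed must be 0.5-2.0" >&2; return 2; }
  # STT_CMD is an assumed override; by default the published CLI is used.
  ${STT_CMD:-npx @okw/stt} -s "$1" -b "$2" --speed "$3" "$4"
}
```

For example, `speak_checked 0.5 0.75 1.5 "Build finished"` passes the values through, while `speak_checked 0.5 0.75 3 "Build finished"` stops with an error before the CLI is invoked.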
Available Voices
Run this to list available voices:
curl -s "https://api.elevenlabs.io/v1/voices" \
-H "xi-api-key: $ELEVENLABS_API_KEY" | \
jq -r '.voices[] | "\(.voice_id) | \(.name)"'
Popular premade voices:
| ID | Name |
| ---------------------- | --------------------------- |
| CwhRBWXzGAHq8TQ4Fs17 | Roger - Laid-Back, Casual |
| EXAVITQu4vr4xnSDxMaL | Sarah - Mature, Reassuring |
| IKne3meq5aSn9XLyUdCD | Charlie - Deep, Confident |
| onwK4e9ZLuTAKqWW03F9 | Daniel - Steady Broadcaster |
| pFZP5JQG7iQjIQuC4Bku | Lily - Velvety Actress |
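Voice IDs are hard to remember, so a script might wrap the table above in a lookup by first name. This is only a sketch: `voice_id` is a hypothetical helper, not something the CLI provides, and it covers just the five voices listed here.

```shell
# Hypothetical helper; not part of @okw/stt.
# Maps a premade voice's first name (case-insensitive) to its ID from the table.
voice_id() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    roger)   echo "CwhRBWXzGAHq8TQ4Fs17" ;;
    sarah)   echo "EXAVITQu4vr4xnSDxMaL" ;;
    charlie) echo "IKne3meq5aSn9XLyUdCD" ;;
    daniel)  echo "onwK4e9ZLuTAKqWW03F9" ;;
    lily)    echo "pFZP5JQG7iQjIQuC4Bku" ;;
    *)       echo "unknown voice: $1" >&2; return 1 ;;
  esac
}
```

It can then be combined with the `-v` option, for example `npx @okw/stt -v "$(voice_id sarah)" "Hello, world!"`.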
Models
| Model | Description |
| ------------------------ | ------------------------------------ |
| eleven_v3 | Most expressive, supports audio tags |
| eleven_turbo_v2_5 | Low latency, good quality |
| eleven_flash_v2_5 | Ultra-low latency (<75ms) |
| eleven_multilingual_v2 | Best for non-English |
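For short status announcements, latency may matter more than expressiveness, so a lower-latency model from the table can be a reasonable choice. The wrapper below is a sketch under that assumption: `say_result`, `STT_CMD`, and `STT_MODEL` are hypothetical names, not features of the CLI, and only the `-m` flag and the model IDs come from this README.

```shell
# Hypothetical wrapper; not part of @okw/stt.
# Runs the given command, then announces success or failure.
# STT_CMD and STT_MODEL are assumed overrides, not options of the CLI itself.
say_result() {
  # Capture the exit status in a way that also works under `set -e`.
  if "$@"; then status=0; else status=$?; fi
  if [ "$status" -eq 0 ]; then
    msg="Command finished successfully."
  else
    msg="Command failed with exit code $status."
  fi
  ${STT_CMD:-npx @okw/stt} -m "${STT_MODEL:-eleven_flash_v2_5}" "$msg"
  # Preserve the wrapped command's exit status for the caller.
  return "$status"
}
```

A call such as `say_result npm test` runs the tests, speaks a one-sentence result, and returns the original exit code so it still works in pipelines and CI scripts.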
License
MIT
