pi-xai-tts
v0.0.3
A pi extension to read aloud the last AI assistant output using xAI TTS
A pi extension for voice interaction: speech-to-text input via microphone, and text-to-speech playback of assistant responses — powered by xAI.
Installation
Install via npm:
$ pi install npm:pi-xai-tts
Or install from git:
$ pi install git:github.com/richardanaya/pi-xai-tts
Configuration
Create a JSON file at ~/.pi/xai-tts.json with your API key:
{
  "xaiApiKey": "your-api-key-here",
  "voice": "leo"
}
Replace your-api-key-here with your actual xAI API key. You can get one at https://console.x.ai/
Optional Settings
voice: The voice to use for speech synthesis. Options:
- leo (default): Authoritative, strong
- eve: Energetic, upbeat
- ara: Warm, friendly
- rex: Confident, clear
- sal: Smooth, balanced
language: BCP-47 language code (e.g., en, zh, pt-BR). Defaults to en.
speed: Playback speed multiplier (e.g., 0.5 for half speed, 1.5 for 1.5x speed, 2.0 for double speed). Defaults to 1.0 (normal speed). Range: 0.5 to 2.0.
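The defaults and ranges described above can be sketched as a small normalization step. This is an illustrative sketch only, not the extension's actual code: the names `XaiTtsConfig` and `resolveConfig` are hypothetical, and falling back to `leo` for an unknown voice and clamping `speed` into range (rather than rejecting the file) are assumptions.

```typescript
// Hypothetical sketch: apply the documented defaults to a parsed config object.
type Voice = "leo" | "eve" | "ara" | "rex" | "sal";

interface XaiTtsConfig {
  xaiApiKey: string;
  voice: Voice;
  language: string;
  speed: number;
}

const VOICES: readonly Voice[] = ["leo", "eve", "ara", "rex", "sal"];

function resolveConfig(raw: Record<string, unknown>): XaiTtsConfig {
  const key = raw["xaiApiKey"];
  if (typeof key !== "string" || key.length === 0) {
    throw new Error("xaiApiKey is required in ~/.pi/xai-tts.json");
  }

  // Unknown or missing voice falls back to the documented default (assumed behaviour).
  const rawVoice = raw["voice"];
  const voice: Voice = VOICES.find((v) => v === rawVoice) ?? "leo";

  // Language defaults to "en" when absent or empty.
  const rawLanguage = raw["language"];
  const language =
    typeof rawLanguage === "string" && rawLanguage.length > 0 ? rawLanguage : "en";

  // Speed defaults to 1.0 and is clamped to the documented 0.5..2.0 range (assumed behaviour).
  const rawSpeed = raw["speed"];
  const speedIn =
    typeof rawSpeed === "number" && Number.isFinite(rawSpeed) ? rawSpeed : 1.0;
  const speed = Math.min(2.0, Math.max(0.5, speedIn));

  return { xaiApiKey: key, voice, language, speed };
}
```

In practice the `raw` argument would come from `JSON.parse` applied to the contents of ~/.pi/xai-tts.json.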
Usage
After the AI responds to your message, type /listen to hear the last assistant message read aloud.
To stop playback early, type /listen-stop.
Auto-Listen
To automatically hear every assistant response without typing /listen each time:
/auto-listen-on
To disable automatic playback:
/auto-listen-off
Voice Input (Speech-to-Text)
Press F12 to start recording your voice. A mic widget appears above the editor while recording. Press F12 again to stop — the audio is transcribed via xAI and sent as your prompt.
F12 also stops playback. If the assistant is currently speaking, pressing F12 will stop the audio immediately.
Accent / Dialect Mode
To make the AI speak with a specific accent or dialect (affecting both text responses and TTS output):
# Make the AI talk like a pirate
/add-accent talk like a pirate
# Remove the accent
/remove-accent
The accent is persisted in your config file and injected into the system prompt before every agent turn, so the AI writes in character for all responses. This makes TTS output sound natural and consistent.
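Conceptually, injecting the accent into the system prompt amounts to appending an instruction when one is set. The sketch below is illustrative only: `withAccent` is a hypothetical helper, and the wording of the appended instruction is an assumption, since this README does not show the extension's real prompt text.

```typescript
// Hypothetical sketch: append a persisted accent instruction to the system prompt.
function withAccent(systemPrompt: string, accent: string | undefined): string {
  const trimmed = accent?.trim();
  // No accent configured (or /remove-accent was used): leave the prompt untouched.
  if (!trimmed) return systemPrompt;
  // The instruction wording here is an assumption, not the extension's actual text.
  return `${systemPrompt}\n\nSpeaking style: ${trimmed}. Write every response in this style.`;
}
```

Because the text response itself is written in character, the TTS step can read it verbatim without any separate accent handling.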
Examples
# Ask pi something
> What is the capital of France?
[pi responds with "The capital of France is Paris..."]
# Listen to the response
/listen
# The response will be read aloud using xAI TTS
# Stop playback early if needed
/listen-stop
# Press F12, speak, then press F12 again to transcribe and send
# Enable pirate speak for all future responses
/add-accent talk like a pirate
# Remove it later
/remove-accent
# Auto-play every assistant response
/auto-listen-on
# Return to manual playback
/auto-listen-off
Requirements
Playback (TTS)
Requires FFmpeg to be installed, specifically the ffplay command.
- macOS: brew install ffmpeg
- Ubuntu/Debian: sudo apt-get install ffmpeg
- Fedora: sudo dnf install ffmpeg
- Windows: Download from https://ffmpeg.org/download.html and add to PATH
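To illustrate why ffplay specifically is needed, here is a minimal sketch of an argument list for headless playback of a downloaded audio file. `ffplayArgs` is a hypothetical helper and the exact flags the extension passes are not documented here; `-nodisp`, `-autoexit`, and `-loglevel` are standard ffplay options.

```typescript
// Hypothetical sketch: arguments for playing an audio file with ffplay, without a window.
function ffplayArgs(audioPath: string): string[] {
  return [
    "-nodisp",            // do not open a video/visualization window
    "-autoexit",          // exit when playback finishes
    "-loglevel", "quiet", // suppress console output
    audioPath,
  ];
}
```

A caller would typically pass these to a child process running `ffplay`; stopping playback early (as /listen-stop does) could then be done by terminating that process, though that mechanism is an assumption.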
Voice Input (STT)
Requires one of the following audio recording tools:
- sox (preferred, most portable)
  - macOS: brew install sox
  - Ubuntu/Debian: sudo apt-get install sox libsox-fmt-all
  - Fedora: sudo dnf install sox
- arecord (Linux ALSA, usually pre-installed)
- ffmpeg (also used for playback)
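The "one of the following" requirement can be pictured as picking the first available tool from a preference list. This is a sketch under assumptions: `pickRecorder` is a hypothetical helper, and while sox is stated as preferred, the relative order of arecord and ffmpeg simply follows the list above.

```typescript
// Hypothetical sketch: choose the first installed recording tool in preference order.
type Recorder = "sox" | "arecord" | "ffmpeg";

// sox is documented as preferred; the order of the remaining two is an assumption.
const PREFERENCE: readonly Recorder[] = ["sox", "arecord", "ffmpeg"];

function pickRecorder(
  isInstalled: (tool: Recorder) => boolean,
): Recorder | undefined {
  // Returns undefined when no supported recorder is found.
  return PREFERENCE.find((tool) => isInstalled(tool));
}
```

The `isInstalled` callback is left to the caller, for example a check that the command exists on PATH.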
API Reference
This extension uses xAI's audio APIs:
Text-to-Speech
- Endpoint: POST https://api.x.ai/v1/tts
- Docs: https://docs.x.ai/developers/model-capabilities/audio/text-to-speech
Speech-to-Text
- Endpoint: POST https://api.x.ai/v1/audio/transcriptions
- Docs: https://docs.x.ai/developers/model-capabilities/audio/speech-to-text
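As a rough illustration of calling the text-to-speech endpoint, the sketch below assembles the request pieces without sending anything. It assumes bearer-token authentication with the key from the config file and a JSON request body; `buildTtsRequest` and `HttpRequestSpec` are hypothetical names, and the payload field names are deliberately left to the caller because they are defined in the xAI docs linked above, not in this README.

```typescript
// Hypothetical sketch: describe a POST to the xAI TTS endpoint (nothing is sent here).
const XAI_TTS_URL = "https://api.x.ai/v1/tts";

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildTtsRequest(
  apiKey: string,
  payload: Record<string, unknown>, // field names come from the xAI docs (not assumed here)
): HttpRequestSpec {
  return {
    url: XAI_TTS_URL,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,    // assumed: standard bearer auth
      "Content-Type": "application/json",   // assumed: JSON request body
    },
    body: JSON.stringify(payload),
  };
}
```

The resulting object maps directly onto the options of an HTTP client such as `fetch`. The transcription endpoint takes an audio upload and would need a different body encoding.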
