pi-xai-tts

v0.0.3

Published

2 months ago

A pi extension to read aloud the last AI assistant output using xAI TTS

A pi extension for voice interaction, powered by xAI: speech-to-text input via microphone and text-to-speech playback of assistant responses.

Installation

Install via npm:

$ pi install npm:pi-xai-tts

Or install from git:

$ pi install git:github.com/richardanaya/pi-xai-tts

Configuration

Create a JSON file at ~/.pi/xai-tts.json with your API key:

{
  "xaiApiKey": "your-api-key-here",
  "voice": "leo"
}

Replace your-api-key-here with your actual xAI API key, which you can get at https://console.x.ai/. Only xaiApiKey is required; voice is one of the optional settings described below.

Optional Settings

- voice: The voice to use for speech synthesis. Options:
  - leo (default) - Authoritative, strong
  - eve - Energetic, upbeat
  - ara - Warm, friendly
  - rex - Confident, clear
  - sal - Smooth, balanced
- language: BCP-47 language tag (e.g., en, zh, pt-BR). Defaults to en.
- speed: Playback speed multiplier, from 0.5 (half speed) to 2.0 (double speed). Defaults to 1.0 (normal speed).
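
The settings above can be checked before use. The following is a minimal Python sketch, not the extension's actual code: normalize_config and load_config are hypothetical helper names, and rejecting out-of-range values (rather than clamping them) is an assumption. Only the file path, the keys, the defaults, and the 0.5 to 2.0 range come from this README.

```python
import json
from pathlib import Path
from typing import Optional

# Values taken from the settings documented in this README.
VOICES = {"leo", "eve", "ara", "rex", "sal"}
DEFAULTS = {"voice": "leo", "language": "en", "speed": 1.0}
SPEED_MIN, SPEED_MAX = 0.5, 2.0


def normalize_config(raw: dict) -> dict:
    """Apply the documented defaults and reject invalid values.

    Hypothetical helper: the extension may validate differently.
    """
    if not raw.get("xaiApiKey"):
        raise ValueError("xaiApiKey is required")
    config = {**DEFAULTS, **raw}
    if config["voice"] not in VOICES:
        raise ValueError(f"unknown voice: {config['voice']!r}")
    speed = config["speed"]
    if not isinstance(speed, (int, float)) or not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(f"speed must be a number between {SPEED_MIN} and {SPEED_MAX}")
    return config


def load_config(path: Optional[Path] = None) -> dict:
    """Read the JSON config file (default: ~/.pi/xai-tts.json) and normalize it."""
    if path is None:
        path = Path.home() / ".pi" / "xai-tts.json"
    return normalize_config(json.loads(path.read_text(encoding="utf-8")))
```

Calling normalize_config({"xaiApiKey": "k", "speed": 1.5}) fills in voice "leo" and language "en" and keeps the speed of 1.5, while a speed of 3.0 raises ValueError.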

Usage

After the AI responds to your message, type /listen to hear the last assistant message read aloud.

To stop playback early, type /listen-stop.

Auto-Listen

To automatically hear every assistant response without typing /listen each time:

/auto-listen-on

To disable automatic playback:

/auto-listen-off

Voice Input (Speech-to-Text)

Press F12 to start recording your voice. A mic widget appears above the editor while recording. Press F12 again to stop; the audio is then transcribed via xAI and sent as your prompt.

F12 also stops playback: if the assistant is currently speaking, pressing F12 stops the audio immediately.
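
The F12 behavior described above amounts to a small state machine. Here is a hedged Python sketch of it. State, on_f12, and the action strings are hypothetical names for illustration, and the sketch assumes that a press which stops playback does not also start a recording, which this README does not specify.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    SPEAKING = "speaking"


def on_f12(state: State) -> tuple[State, str]:
    """Return (next_state, action) for a single F12 press."""
    if state is State.SPEAKING:
        # Assumption: stopping playback returns to idle without recording.
        return State.IDLE, "stop_playback"
    if state is State.RECORDING:
        # Second press: stop recording, transcribe, and send as the prompt.
        return State.IDLE, "transcribe_and_send"
    # First press while idle: start capturing microphone audio.
    return State.RECORDING, "start_recording"
```

Keeping the transition logic in one pure function like this makes the key binding easy to test without a microphone or speaker.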

Accent / Dialect Mode

To make the AI speak with a specific accent or dialect (affecting both text responses and TTS output):

# Make the AI talk like a pirate
/add-accent talk like a pirate

# Remove the accent
/remove-accent

The accent is persisted in your config file and injected into the system prompt before every agent turn, so the AI writes in character for all responses. Because the text itself is already written in character, the spoken output stays consistent with it.
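
To illustrate the idea of injecting the accent into the system prompt, here is a minimal Python sketch. build_system_prompt is a hypothetical function, and the "Speaking style:" wording is an assumption; the text the extension actually injects may differ.

```python
from typing import Optional


def build_system_prompt(base_prompt: str, accent: Optional[str]) -> str:
    """Append the stored accent instruction to the system prompt, if one is set.

    Hypothetical helper: the injected wording is illustrative only.
    """
    if not accent or not accent.strip():
        # No accent configured (e.g. after /remove-accent): leave the prompt as-is.
        return base_prompt
    return f"{base_prompt}\n\nSpeaking style: {accent.strip()}"
```

For example, build_system_prompt("You are a helpful assistant.", "talk like a pirate") returns the base prompt followed by a blank line and "Speaking style: talk like a pirate".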

Examples

# Ask pi something
> What is the capital of France?

[pi responds with "The capital of France is Paris..."]

# Listen to the response
/listen

# The response will be read aloud using xAI TTS

# Stop playback early if needed
/listen-stop

# Press F12, speak, then press F12 again to transcribe and send

# Enable pirate speak for all future responses
/add-accent talk like a pirate

# Remove it later
/remove-accent

# Auto-play every assistant response
/auto-listen-on

# Return to manual playback
/auto-listen-off

Requirements

Playback (TTS)

Requires FFmpeg to be installed, specifically the ffplay command.

- macOS: brew install ffmpeg
- Ubuntu/Debian: sudo apt-get install ffmpeg
- Fedora: sudo dnf install ffmpeg
- Windows: Download from https://ffmpeg.org/download.html and add to PATH
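
To confirm that ffplay is reachable after installation, a quick check like the following Python sketch can help. find_ffplay and ffplay_command are hypothetical helpers, and the flags shown (-nodisp, -autoexit, -loglevel quiet) are one common way to play audio without a window; they are an assumption, not necessarily what the extension passes.

```python
import shutil
from typing import Optional


def find_ffplay() -> Optional[str]:
    """Return the full path to ffplay, or None if it is not on PATH."""
    return shutil.which("ffplay")


def ffplay_command(audio_path: str) -> list[str]:
    """Build a headless ffplay invocation: no window, exit when playback ends."""
    return ["ffplay", "-nodisp", "-autoexit", "-loglevel", "quiet", audio_path]


if __name__ == "__main__":
    path = find_ffplay()
    print(f"ffplay found at {path}" if path else "ffplay not found; install FFmpeg")
```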

Voice Input (STT)

Requires one of the following audio recording tools:

- sox (preferred, most portable)
  - macOS: brew install sox
  - Ubuntu/Debian: sudo apt-get install sox libsox-fmt-all
  - Fedora: sudo dnf install sox
- arecord (Linux ALSA, usually pre-installed)
- ffmpeg (also used for playback)
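
Choosing among these tools can be sketched as a first-match lookup on PATH. The Python below is illustrative only: pick_recorder is a hypothetical name, and while sox is documented as preferred, trying arecord before ffmpeg simply follows the order of the list above and is an assumption.

```python
import shutil
from typing import Callable, Optional

# Preference order: sox first (documented as preferred); the rest is assumed.
RECORDERS = ("sox", "arecord", "ffmpeg")


def pick_recorder(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the name of the first recording tool found on PATH, or None."""
    for name in RECORDERS:
        if which(name):
            return name
    return None
```

The which parameter exists only so the lookup can be replaced in tests; by default it uses shutil.which.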

API Reference

This extension uses xAI's audio APIs:

- Text-to-Speech
  - Endpoint: POST https://api.x.ai/v1/tts
  - Docs: https://docs.x.ai/developers/model-capabilities/audio/text-to-speech
- Speech-to-Text
  - Endpoint: POST https://api.x.ai/v1/audio/transcriptions
  - Docs: https://docs.x.ai/developers/model-capabilities/audio/speech-to-text
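
As a rough illustration of calling the Text-to-Speech endpoint listed above, the Python sketch below builds (but does not send) a request. Only the URL and HTTP method come from this README. The Bearer authorization header and the JSON field names (text, voice_id, language) are assumptions and should be checked against the xAI docs; build_tts_request is a hypothetical helper.

```python
import json
import urllib.request

TTS_URL = "https://api.x.ai/v1/tts"  # endpoint listed in this README


def build_tts_request(
    api_key: str, text: str, voice: str = "leo", language: str = "en"
) -> urllib.request.Request:
    """Build a POST request for the TTS endpoint without sending it."""
    # ASSUMPTION: these body field names are illustrative; verify in the xAI docs.
    body = {"text": text, "voice_id": voice, "language": language}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # ASSUMPTION: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the response body would then need to be written to a file before ffplay could play it.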
