@arvoretech/pi-kokoro-tts

v1.1.0

Published

4 hours ago

PI extension that speaks the assistant's responses out loud using a Kokoro-FastAPI text-to-speech endpoint

pi extension text-to-speech tts kokoro voice

Pairs with @arvoretech/pi-elevenlabs-stt to enable a full voice loop: speak to pi (STT), pi answers in text, and this extension reads the answer back to you (TTS).

What it does

Registers a keyboard shortcut that toggles voice mode. While voice mode is on, every final assistant response is streamed to the Kokoro endpoint (POST /v1/audio/speech) and played through ffplay as it arrives.

Toggle voice mode: press the shortcut (default ctrl+super+s on macOS, ctrl+alt+s elsewhere). The footer shows 🔊 voice on while enabled.
Audio is streamed as raw PCM (24 kHz, mono) directly into ffplay, so playback starts before the full response has been synthesized.
A new response interrupts any playback already in progress.
The voice-mode state is persisted in the session and restored on --resume.
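
As a rough illustration of the streaming request described above, the following sketch builds the POST to /v1/audio/speech and the ffplay arguments for raw PCM. It is not the extension's actual code: `SpeechSettings`, `buildSpeechRequest`, and `ffplayArgs` are hypothetical names, the JSON field names assume the OpenAI-style speech API, and 16-bit little-endian samples are an assumption (the README only states 24 kHz mono).

```typescript
// Hypothetical settings shape; the extension's real internals are not shown in this README.
interface SpeechSettings {
  baseUrl: string; // e.g. the value of KOKORO_TTS_URL, without trailing slash
  apiKey?: string;
  voice: string;
  model: string;
  speed: number;
}

interface SpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the speech request. Field names follow the OpenAI-style speech API
// and are assumptions; check them against your endpoint.
function buildSpeechRequest(s: SpeechSettings, text: string): SpeechRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (s.apiKey) headers["X-API-Key"] = s.apiKey;
  return {
    url: `${s.baseUrl}/audio/speech`,
    method: "POST",
    headers,
    body: JSON.stringify({
      model: s.model,
      input: text,
      voice: s.voice,
      speed: s.speed,
      response_format: "pcm",
    }),
  };
}

// Arguments that tell ffplay to read raw PCM from stdin without opening a window.
// Assumes signed 16-bit little-endian samples; no channel option is passed,
// relying on the raw PCM demuxer's mono default.
function ffplayArgs(sampleRate = 24000): string[] {
  return [
    "-loglevel", "quiet",
    "-nodisp",
    "-autoexit",
    "-f", "s16le",
    "-ar", String(sampleRate),
    "-i", "pipe:0",
  ];
}
```

In a Node-based setup, one would typically send the request with `fetch`, spawn `ffplay` with these arguments, and write each chunk of the response body to the child's stdin as it arrives. Killing that child process before starting a new one is one way to get the "new response interrupts playback" behavior.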

Commands

| Command | Description |
|---------|-------------|
| /voice | Toggle voice mode on/off. |
| /voice-select | Select the Kokoro voice (e.g. pf_dora, pm_alex, af_heart). |
| /say [text] | Speak the given text. With no argument, repeats the last spoken response. |
| /tts-stop | Stop the current playback. |

Requirements

ffplay on PATH (ships with ffmpeg; used to play the audio stream).
A reachable Kokoro-FastAPI endpoint (see configuration).

Configuration

| Env var | Default | Description |
|---------|---------|-------------|
| KOKORO_TTS_URL | https://tts.arvore.com.br/v1 | Base URL of the Kokoro-FastAPI OpenAI-compatible API (without trailing slash). |
| KOKORO_TTS_API_KEY | falls back to ARVORE_TTS_API_KEY | API key sent as the X-API-Key header. Required by the Arvore Kokoro gateway. |
| KOKORO_TTS_VOICE | pf_dora | Voice name. pf_dora / pm_alex / pm_santa are the Brazilian Portuguese voices. Combinations like pf_dora+af_heart are supported by Kokoro. |
| KOKORO_TTS_MODEL | kokoro | Model name sent in the request. |
| KOKORO_TTS_SPEED | 1 | Speaking speed multiplier (0.25–4). |
| KOKORO_TTS_STREAMING | true | Stream audio in chunks as the response arrives (low latency). Set to false/0/off/no to synthesize and play only the final response. |
| KOKORO_TTS_SHORTCUT | ctrl+super+s (macOS), ctrl+alt+s (other) | Shortcut that toggles voice mode. On macOS, super is the Cmd key. |
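
To make the defaults and fallbacks in the table concrete, here is a minimal sketch of how such variables could be resolved. `loadConfig`, `parseStreaming`, `parseSpeed`, and the `KokoroConfig` shape are hypothetical names, not the extension's API. Clamping the speed to 0.25–4, stripping a trailing slash from the URL, and matching the "off" values case-insensitively are assumptions added for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical resolved-configuration shape.
interface KokoroConfig {
  url: string;
  apiKey: string | undefined;
  voice: string;
  model: string;
  speed: number;
  streaming: boolean;
  shortcut: string;
}

// "false", "0", "off" and "no" disable streaming; unset or anything else keeps it on.
function parseStreaming(raw: string | undefined): boolean {
  if (raw === undefined) return true;
  return !["false", "0", "off", "no"].includes(raw.trim().toLowerCase());
}

// Fall back to 1 on missing or non-numeric input, then clamp to the documented range.
function parseSpeed(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return 1;
  const n = Number(raw);
  if (!Number.isFinite(n)) return 1;
  return Math.min(4, Math.max(0.25, n));
}

// `platform` is expected to look like Node's process.platform ("darwin" on macOS).
function loadConfig(env: Env, platform: string): KokoroConfig {
  return {
    url: (env.KOKORO_TTS_URL ?? "https://tts.arvore.com.br/v1").replace(/\/+$/, ""),
    apiKey: env.KOKORO_TTS_API_KEY ?? env.ARVORE_TTS_API_KEY,
    voice: env.KOKORO_TTS_VOICE ?? "pf_dora",
    model: env.KOKORO_TTS_MODEL ?? "kokoro",
    speed: parseSpeed(env.KOKORO_TTS_SPEED),
    streaming: parseStreaming(env.KOKORO_TTS_STREAMING),
    shortcut: env.KOKORO_TTS_SHORTCUT ?? (platform === "darwin" ? "ctrl+super+s" : "ctrl+alt+s"),
  };
}
```

Under Node this would be called as `loadConfig(process.env, process.platform)`. Passing the environment in as a parameter keeps the function easy to test.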

Notes

Markdown is stripped before synthesis: code blocks, links, headings, and URLs are removed or simplified so the speech sounds natural.
Responses are truncated to 4000 characters per utterance.
The shortcut and the footer status require interactive (TUI) mode.
/voice-select opens an interactive picker (TUI only) listing the available PT/EN voices fetched from the endpoint, with the current voice marked. The selected voice is persisted in the session and restored on --resume, the same way the voice-mode state is. Voice mode itself is toggled with /voice or the shortcut.
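
The Markdown stripping and the 4000-character limit mentioned in these notes could be approximated as follows. This is only a sketch under assumptions: the regular expressions and the names `stripMarkdown` and `toUtterance` are illustrative, and the extension's actual rules may differ (for example, it may summarize code blocks rather than drop them).

```typescript
const MAX_UTTERANCE_CHARS = 4000;

// Reduce Markdown to plain text that reads naturally when spoken.
function stripMarkdown(md: string): string {
  return md
    .replace(/```[\s\S]*?```/g, " ")            // drop fenced code blocks
    .replace(/`([^`]*)`/g, "$1")                 // unwrap inline code
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")   // keep link text, drop the target
    .replace(/https?:\/\/\S+/g, "")              // drop bare URLs
    .replace(/^#{1,6}\s+/gm, "")                 // remove heading markers
    .replace(/\*+/g, "")                         // remove asterisk emphasis markers
    .replace(/\s+/g, " ")                        // collapse whitespace
    .trim();
}

// Strip Markdown, then cut the result to the per-utterance limit.
function toUtterance(md: string): string {
  return stripMarkdown(md).slice(0, MAX_UTTERANCE_CHARS);
}
```

Underscores are deliberately left alone here so that identifiers such as pf_dora survive intact.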
