pi-voice
v0.4.0
Voice interface for pi coding agent
pi-voice
Headless voice interface for the Pi Coding Agent. Hold a key, speak, and pi executes your instructions with voice feedback.
Demo using the ElevenLabs provider (make sure your audio is unmuted)
https://github.com/user-attachments/assets/76adb941-83cf-4394-b8d2-f6d73a1df8bc
Installation
npm i -g pi-voice
# or
bun i -g pi-voice
Usage
pi-voice is a daemon-style application: once started, it keeps running in the background, and you talk to the agent with a push-to-talk key.
pi-voice start # start the daemon in the background
pi-voice status # show state, PID, and uptime
pi-voice stop # stop the daemon
The push-to-talk trigger defaults to Cmd+Shift+I (macOS) / Win+Shift+I (Windows). Hold the key to record, release to send.
Settings
pi agent configuration
pi-voice launches a Pi agent session whose working directory is the directory where pi-voice start was executed. This means all standard pi configuration works as-is:
- AGENTS.md: walked up from cwd to the filesystem root
- .pi/settings.json: project-level settings
- .pi/skills/, .pi/extensions/, .pi/prompts/: project-level resources
- ~/.pi/agent/: global settings, skills, extensions, prompts, and models
- and more
Refer to the Pi documentation for details on these settings.
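To make the AGENTS.md walk-up concrete, here is a minimal sketch of which locations such a lookup would visit. It is not Pi's actual loader: the function name agentsFileCandidates is hypothetical, it assumes POSIX-style absolute paths, and it only lists candidate paths without reading anything from disk.

```typescript
// Hypothetical helper (not part of pi or pi-voice): lists every location an
// AGENTS.md lookup would visit when walking up from `cwd` to the filesystem
// root. Assumes a POSIX-style absolute path.
function agentsFileCandidates(cwd: string): string[] {
  // Drop empty segments so "/home/me/app/" and "/home/me/app" behave alike.
  const segments = cwd.split("/").filter((s) => s.length > 0);
  const candidates: string[] = [];
  // Start at the deepest directory and remove one segment per step.
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = "/" + segments.slice(0, depth).join("/");
    candidates.push(dir === "/" ? "/AGENTS.md" : `${dir}/AGENTS.md`);
  }
  return candidates;
}
```

In this sketch, starting pi-voice in /home/me/app yields /home/me/app/AGENTS.md first and /AGENTS.md last. How Pi combines the files it finds is covered by the Pi documentation.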
pi-voice configuration
You can configure pi-voice in .pi/pi-voice.json:
{
  "key": "ctrl+t",
  "provider": "local"
}
| Key | Description |
| --- | --- |
| key | Push-to-talk shortcut. Combine modifiers (ctrl, shift, alt/opt, meta/cmd) and a main key with +. Examples: "ctrl+t", "alt+space", "ctrl+shift+r". Default: "meta+shift+i". |
| provider | Speech provider for STT & TTS. "local", "gemini" (Vertex AI or Gemini API), "openai", or "elevenlabs". Default: "local". |
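The key format in the table can be illustrated with a small parser. This is only a sketch of the documented syntax (modifiers and one main key joined with +, with opt and cmd as aliases for alt and meta); the names parseKeyCombo, toModifier, and KeyCombo are hypothetical, and pi-voice's real parsing may differ, for instance in case handling or accepted key names.

```typescript
// Canonical modifier names; "opt" and "cmd" are the aliases from the table.
type Modifier = "ctrl" | "shift" | "alt" | "meta";

interface KeyCombo {
  modifiers: Modifier[];
  key: string;
}

// Maps a token to its canonical modifier, or undefined for a main key.
function toModifier(part: string): Modifier | undefined {
  switch (part) {
    case "ctrl":
      return "ctrl";
    case "shift":
      return "shift";
    case "alt":
    case "opt":
      return "alt";
    case "meta":
    case "cmd":
      return "meta";
    default:
      return undefined;
  }
}

// Hypothetical parser: splits on "+", normalizes modifier aliases, and
// requires exactly one main key. Lowercasing the input is an assumption.
function parseKeyCombo(combo: string): KeyCombo {
  const modifiers: Modifier[] = [];
  let key: string | undefined;
  for (const rawPart of combo.split("+")) {
    const part = rawPart.trim().toLowerCase();
    const modifier = toModifier(part);
    if (modifier !== undefined) {
      // Ignore repeated modifiers such as "ctrl+ctrl+t".
      if (modifiers.indexOf(modifier) === -1) modifiers.push(modifier);
    } else if (part !== "" && key === undefined) {
      key = part;
    } else {
      throw new Error(`Invalid shortcut "${combo}": expected modifiers plus one main key`);
    }
  }
  if (key === undefined) {
    throw new Error(`Invalid shortcut "${combo}": missing main key`);
  }
  return { modifiers, key };
}
```

With this sketch, "meta+shift+i" and "cmd+shift+i" parse to the same combination, while a value with no main key, such as "ctrl+shift", is rejected.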
Environment variables
| Provider | Required variables |
| --- | --- |
| local | None (model is auto-downloaded on first launch). Optional: WHISPER_MODEL_PATH (custom model path), WHISPER_MODEL (model name, default medium-q5_0), SAY_VOICE (macOS say voice name, e.g. "Kyoko"). |
| gemini | Vertex AI: GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION (optional, default us-central1). Gemini API: GEMINI_API_KEY or GOOGLE_API_KEY. If GOOGLE_CLOUD_PROJECT is set, Vertex AI is used; set GOOGLE_GENAI_USE_VERTEXAI=false to force API key mode. |
| openai | OPENAI_API_KEY |
| elevenlabs | ELEVENLABS_API_KEY. Optional: ELEVENLABS_VOICE_ID (TTS voice, default CwhRBWXzGAHq8TQ4Fs17), ELEVENLABS_TTS_MODEL (default eleven_flash_v2_5). |
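The gemini row packs several rules into one cell, so a short sketch of the selection logic may help. It follows the table as written, but the function and type names are hypothetical, and two details are assumptions rather than documented behavior: the comparison against the exact string "false", and checking GEMINI_API_KEY before GOOGLE_API_KEY when both are set.

```typescript
type GeminiAuth =
  | { mode: "vertex"; project: string; location: string }
  | { mode: "api-key"; apiKey: string };

// Hypothetical resolver mirroring the "gemini" row of the table above.
function resolveGeminiAuth(env: Record<string, string | undefined>): GeminiAuth {
  const project = env["GOOGLE_CLOUD_PROJECT"];
  // Assumption: only the exact string "false" forces API key mode.
  const vertexDisabled = env["GOOGLE_GENAI_USE_VERTEXAI"] === "false";
  if (project && !vertexDisabled) {
    return {
      mode: "vertex",
      project,
      // GOOGLE_CLOUD_LOCATION is optional and defaults to us-central1.
      location: env["GOOGLE_CLOUD_LOCATION"] || "us-central1",
    };
  }
  // Assumption: GEMINI_API_KEY is checked first; the README does not state
  // which key wins when both are set.
  const apiKey = env["GEMINI_API_KEY"] || env["GOOGLE_API_KEY"];
  if (apiKey) {
    return { mode: "api-key", apiKey };
  }
  throw new Error("Set GOOGLE_CLOUD_PROJECT, GEMINI_API_KEY, or GOOGLE_API_KEY");
}
```

Passing process.env to a function like this before running pi-voice start is one way to check which mode a given shell environment would select.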
Logging
The daemon writes structured JSON logs to both the console and a log file. The default log file path is $XDG_CONFIG_HOME/pi-voice/daemon.log (falls back to ~/.config/pi-voice/daemon.log).
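The resulting lookup order, including the PI_VOICE_LOG_PATH override, can be summarized in a small sketch. The function name resolveLogPath is hypothetical, and treating an empty variable as unset is an assumption; the daemon's own logic may differ in such edge cases.

```typescript
// Hypothetical helper: picks the daemon log path from environment variables.
// `home` is the user's home directory (what "~" expands to).
function resolveLogPath(
  env: Record<string, string | undefined>,
  home: string,
): string {
  // An explicit override wins over the default location.
  const override = env["PI_VOICE_LOG_PATH"];
  if (override) return override;
  // Otherwise use $XDG_CONFIG_HOME, falling back to ~/.config.
  const configHome = env["XDG_CONFIG_HOME"] || `${home}/.config`;
  return `${configHome}/pi-voice/daemon.log`;
}
```

With no variables set and a home directory of /home/me, this returns /home/me/.config/pi-voice/daemon.log.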
To override the log file path:
export PI_VOICE_LOG_PATH=/path/to/custom.log
Whisper model (local provider)
The local provider uses Whisper for STT and the macOS say command for TTS. On first launch, a ggml-format Whisper model (medium-q5_0, ~514 MB) is automatically downloaded to ~/.pi-agent/whisper/ and cached for subsequent runs.
To use a different model, set WHISPER_MODEL:
export WHISPER_MODEL=base # smaller & faster
Or point to your own model file directly:
export WHISPER_MODEL_PATH=/path/to/ggml-custom.bin
Contributing
See CONTRIBUTING.md for development setup, build commands, and release workflow.
