@arvoretech/pi-elevenlabs-stt
v1.0.2
PI extension for push-to-talk speech-to-text using the ElevenLabs Scribe API
What it does
Registers a keyboard shortcut that toggles microphone recording. Press once to start recording from the selected microphone (the system default unless you choose another); press again to stop and transcribe. The resulting text is appended to the editor input.
- First press: starts recording from the selected microphone via `ffmpeg` (shows `🎙 recording` in the footer).
- Second press: stops recording, sends the audio to ElevenLabs Scribe, and inserts the transcript into the editor (shows `⏳ transcribing…` while waiting).
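The two-press behaviour can be read as a small state machine. The sketch below is illustrative only: the state names, `onShortcut`, and `onTranscript` are hypothetical and are not taken from the extension's source. Ignoring presses while a transcription is in flight, and inserting a separating space before the transcript, are both assumptions.

```typescript
// Hypothetical state names; the extension's real internals are not shown in this README.
type SttState = "idle" | "recording" | "transcribing";

interface Transition {
  next: SttState;
  footer: string; // status text shown in the footer after the press
}

// One shortcut press: idle -> recording, recording -> transcribing.
// A press while transcribing is ignored (an assumption, not documented behaviour).
function onShortcut(state: SttState): Transition {
  switch (state) {
    case "idle":
      return { next: "recording", footer: "🎙 recording" };
    case "recording":
      return { next: "transcribing", footer: "⏳ transcribing…" };
    case "transcribing":
      return { next: "transcribing", footer: "⏳ transcribing…" };
  }
}

// When the transcript arrives, append it to the editor text and return to idle.
// Adding a space between existing text and the transcript is an assumption.
function onTranscript(
  editorText: string,
  transcript: string
): { next: SttState; editorText: string } {
  const separator =
    editorText.length > 0 && !editorText.endsWith(" ") ? " " : "";
  return { next: "idle", editorText: editorText + separator + transcript };
}
```

Keeping the transition logic free of I/O like this makes the toggle easy to reason about separately from the `ffmpeg` process and the network call.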
Commands
| Command | Description |
|---------|-------------|
| /stt-devices | List available microphone inputs and select one. The active device is marked with ✓; the system default is marked (default). |
| /stt-device-clear | Reset back to the system default microphone. |
The selected device is persisted in the session and restored on --resume.
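As a rough illustration of how the `/stt-devices` listing marks entries, the sketch below renders one line per device. The `InputDevice` shape and `formatDeviceList` are hypothetical names introduced for this example. Treating the system default as the active device when nothing is selected is an assumption, as is the exact line layout.

```typescript
// Hypothetical device shape; the real extension may track different fields.
interface InputDevice {
  id: string;
  name: string;
  isDefault: boolean;
}

// Render one line per device: "✓" marks the active device,
// "(default)" marks the system default.
function formatDeviceList(
  devices: InputDevice[],
  selectedId: string | null
): string[] {
  return devices.map((d) => {
    // With no explicit selection, the system default counts as active (assumption).
    const active = selectedId === null ? d.isDefault : d.id === selectedId;
    const mark = active ? "✓" : " ";
    const suffix = d.isDefault ? " (default)" : "";
    return `${mark} ${d.name}${suffix}`;
  });
}
```

Passing `null` as `selectedId` models the state after `/stt-device-clear`.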
Requirements
- `ffmpeg` on `PATH` (used to capture microphone audio).
- An ElevenLabs API key exported as `ELEVENLABS_API_KEY`.
The extension auto-detects the audio input backend:
| Platform | Backend |
|----------|---------|
| macOS | avfoundation |
| Linux (PulseAudio / PipeWire) | pulse |
| Linux (ALSA only) | alsa |
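A minimal sketch of how the table above could map to an `ffmpeg` invocation. `detectBackend` and `captureArgs` are hypothetical helpers, not the extension's API. How the extension actually detects a PulseAudio/PipeWire server is not documented here, so that check is passed in as a boolean. The mono, 16 kHz output settings are assumptions chosen for illustration.

```typescript
type Backend = "avfoundation" | "pulse" | "alsa";

// `platform` uses Node's process.platform values ("darwin", "linux", ...).
// `hasPulseServer` stands in for whatever detection the extension performs.
function detectBackend(
  platform: string,
  hasPulseServer: boolean
): Backend | null {
  if (platform === "darwin") return "avfoundation";
  if (platform === "linux") return hasPulseServer ? "pulse" : "alsa";
  return null; // no backend listed in the table for other platforms
}

// Build ffmpeg arguments for audio-only capture into `outFile`.
function captureArgs(
  backend: Backend,
  device: string | null,
  outFile: string
): string[] {
  const name = device ?? "default";
  // avfoundation inputs are "[video]:[audio]"; a leading colon means audio only.
  const input = backend === "avfoundation" ? `:${name}` : name;
  // -ac 1 / -ar 16000 (mono, 16 kHz) are illustrative choices; -y overwrites outFile.
  return ["-f", backend, "-i", input, "-ac", "1", "-ar", "16000", "-y", outFile];
}
```

For example, `captureArgs("pulse", null, "/tmp/rec.wav")` yields the arguments for `ffmpeg -f pulse -i default -ac 1 -ar 16000 -y /tmp/rec.wav`.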
Configuration
| Env var | Default | Description |
|---------|---------|-------------|
| ELEVENLABS_API_KEY | — | Required. ElevenLabs API key used for transcription. |
| ELEVENLABS_STT_SHORTCUT | ctrl+alt+t (Linux/Windows), ctrl+alt+r (macOS) | Keyboard shortcut that toggles recording. On macOS, alt is the Option key. Avoid super (Cmd) — terminals usually don't forward it to pi. |
Some terminals and window managers reserve `ctrl+alt+t` / `ctrl+cmd+t`. If the shortcut doesn't reach pi, set `ELEVENLABS_STT_SHORTCUT` to another combination (e.g. `alt+t`, `ctrl+alt+r`).
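The defaulting rule from the configuration table can be sketched as follows. `resolveShortcut` is a hypothetical helper, and treating an empty or whitespace-only value as unset is an assumption.

```typescript
// `env` mirrors process.env; `platform` mirrors process.platform.
function resolveShortcut(
  env: Record<string, string | undefined>,
  platform: string
): string {
  const configured = env["ELEVENLABS_STT_SHORTCUT"]?.trim();
  // An empty or whitespace-only value is treated as unset (assumption).
  if (configured) return configured;
  // Defaults from the table: ctrl+alt+r on macOS, ctrl+alt+t elsewhere.
  return platform === "darwin" ? "ctrl+alt+r" : "ctrl+alt+t";
}
```

With `ELEVENLABS_STT_SHORTCUT=alt+t` exported in the shell that launches pi, this sketch returns `alt+t` on every platform.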
Privacy
Recorded microphone audio is sent to the ElevenLabs API for transcription. Recordings are written to a temporary file and deleted immediately after transcription (and on session shutdown). Nothing else is stored.
Notes
- Recordings shorter than 400ms are ignored.
- Uses the `scribe_v2` model with automatic language detection.
- Requires interactive (TUI) mode for the shortcut and editor insertion.
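To make the first two notes concrete, here is a small sketch of the duration guard and the request fields. `shouldTranscribe` and `scribeFormFields` are hypothetical names. The `model_id` field name, and the idea that leaving out `language_code` triggers automatic detection, reflect the ElevenLabs speech-to-text API as understood at the time of writing; treat both as assumptions and check them against the current API reference.

```typescript
const MIN_RECORDING_MS = 400;
const SCRIBE_MODEL_ID = "scribe_v2";

// Recordings shorter than 400 ms are ignored; 400 ms or longer are transcribed.
function shouldTranscribe(startedAtMs: number, stoppedAtMs: number): boolean {
  return stoppedAtMs - startedAtMs >= MIN_RECORDING_MS;
}

// Text fields for the multipart request; the audio file is attached separately.
// No language_code is set, so the service is left to detect the language (assumption).
function scribeFormFields(): Record<string, string> {
  return { model_id: SCRIBE_MODEL_ID };
}
```

A guard like this avoids spending an API call on an accidental double press of the shortcut.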
