opencode-voice
v0.1.4
Speech-to-text plugin for OpenCode — voice input with Deepgram, Groq, and OpenAI Whisper
opencode-voice
Speech-to-text plugin for OpenCode — speak into your microphone and see your words appear in the prompt field in real-time.
Features
- 🎙️ Real-time transcription: words appear as you speak
- 🔌 3 providers: Deepgram (streaming), Groq Whisper (fast), OpenAI Whisper
- 🖥️ Cross-platform: macOS, Linux, Windows
- ⌨️ Simple toggle: `Ctrl+Shift+V` to start/stop
- 👁️ Live preview: interim text shown in overlay, final text inserted in prompt
- ✏️ Review before send: text goes to prompt for editing, never auto-sent
Requirements
A recording tool must be installed on your system:
| Platform | Recommended | Install |
| -------- | ----------- | ---------------------------------------------------- |
| macOS | SoX | brew install sox |
| Linux | SoX | sudo apt install sox |
| Windows | SoX | winget install sox or choco install sox.portable |
Fallback: FFmpeg is also supported on macOS and Linux.
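
The preference order described above (SoX first, FFmpeg as a fallback on macOS and Linux only) can be sketched as a small selection function. This is a minimal illustration, not the plugin's actual code: `Platform`, `Recorder`, and `pickRecorder` are hypothetical names, and how installed tools are detected is left out.

```typescript
// Hypothetical names for illustration; not part of the plugin's API.
type Platform = "darwin" | "linux" | "win32";
type Recorder = "sox" | "ffmpeg";

// `installed` is the set of tools found on PATH (detection is out of scope here).
function pickRecorder(platform: Platform, installed: ReadonlySet<Recorder>): Recorder {
  // SoX is the recommended tool on every platform.
  if (installed.has("sox")) return "sox";
  // FFmpeg is documented as a fallback on macOS and Linux only.
  if (platform !== "win32" && installed.has("ffmpeg")) return "ffmpeg";
  // Mirrors the error message listed under Troubleshooting.
  throw new Error("No recording tool found");
}
```

Under these assumptions, a Linux machine with only FFmpeg installed resolves to `"ffmpeg"`, while a Windows machine with only FFmpeg raises the error.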
Installation
Add to your OpenCode config (~/.config/opencode/opencode.json):

    {
      "plugin": [["opencode-voice", { "provider": "deepgram" }]]
    }

Or without options (configure via environment variables only):

    {
      "plugin": ["opencode-voice"]
    }

Configuration
Plugin Options (in opencode.json)
| Option | Type | Default | Description |
| ----------------- | ---------------------------------------------- | ---------------------------- | ----------------------------------- |
| provider | "deepgram" \| "groq" \| "openai-whisper" | none | STT provider to use |
| language | string | auto-detect | Language code (e.g. "en", "fr") |
| chunkDurationMs | number | 5000 (Groq) / 10000 (OpenAI) | Chunk duration in milliseconds for HTTP providers (Groq, OpenAI Whisper) |
Environment Variables
| Variable | Description |
| ------------------------- | -------------------------------------------------------- |
| OPENCODE_VOICE_PROVIDER | Override provider (takes precedence over plugin options) |
| OPENCODE_VOICE_LANGUAGE | Override language |
| DEEPGRAM_API_KEY | Deepgram API key |
| GROQ_API_KEY | Groq API key |
| OPENAI_API_KEY | OpenAI API key |
Security: API keys are read from environment variables only — never stored in config files.
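
To make the precedence rules above concrete, here is a hedged sketch of how the two configuration sources could be merged. It is not the plugin's implementation: `resolveConfig`, `PluginOptions`, and `Env` are hypothetical names, and two behaviors are assumptions (an unrecognized `OPENCODE_VOICE_PROVIDER` value falls back to the plugin option, and an empty `OPENCODE_VOICE_LANGUAGE` is treated as unset).

```typescript
// Illustrative sketch only; names below are hypothetical, not the plugin's API.
type Provider = "deepgram" | "groq" | "openai-whisper";

interface PluginOptions {
  provider?: Provider;
  language?: string;
  chunkDurationMs?: number;
}

type Env = Record<string, string | undefined>;

// API key variable per provider, as listed in the Environment Variables table.
const API_KEY_VARS: Record<Provider, string> = {
  deepgram: "DEEPGRAM_API_KEY",
  groq: "GROQ_API_KEY",
  "openai-whisper": "OPENAI_API_KEY",
};

function isProvider(value: string | undefined): value is Provider {
  return value === "deepgram" || value === "groq" || value === "openai-whisper";
}

function resolveConfig(options: PluginOptions, env: Env) {
  // OPENCODE_VOICE_PROVIDER takes precedence over the plugin option.
  // Assumption: an unrecognized value is ignored rather than rejected.
  const envProvider = env["OPENCODE_VOICE_PROVIDER"];
  const provider = isProvider(envProvider) ? envProvider : options.provider;
  if (!provider) throw new Error("No STT provider configured");

  // API keys come from the environment only, never from the config file.
  const keyVar = API_KEY_VARS[provider];
  const apiKey = env[keyVar];
  if (!apiKey) throw new Error(`Missing ${keyVar}`);

  // Documented chunk defaults: 5000 ms (Groq), 10000 ms (OpenAI).
  // Deepgram streams over WebSocket, so no chunk duration applies.
  const defaultChunkMs =
    provider === "groq" ? 5000 : provider === "openai-whisper" ? 10000 : undefined;

  return {
    provider,
    apiKey,
    // undefined here means "auto-detect".
    language: env["OPENCODE_VOICE_LANGUAGE"] || options.language,
    chunkDurationMs: options.chunkDurationMs ?? defaultChunkMs,
  };
}
```

Passing the environment in as a parameter (for example `resolveConfig(options, process.env)` in Node) keeps the function easy to test without touching real API keys.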
Usage
- Press `Ctrl+Shift+V` to start recording
- Speak; you'll see a `● Recording...` indicator with live preview
- Press `Ctrl+Shift+V` again to stop
- The transcribed text appears in the prompt field
- Review/edit, then press Enter to send
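
The steps above amount to a two-state toggle. The sketch below models that flow under stated assumptions: `VoiceSession`, `toggle`, and `onTranscript` are hypothetical names, and final segments are collected and placed in the prompt when recording stops (the plugin may insert them earlier).

```typescript
// Hypothetical sketch; class and method names are illustrative only.
type State = "idle" | "recording";

class VoiceSession {
  state: State = "idle";
  overlay = ""; // interim text shown while recording
  prompt = "";  // text the user reviews and sends manually

  private finals: string[] = [];

  // Bound to Ctrl+Shift+V: the same key starts and stops recording.
  toggle(): void {
    if (this.state === "idle") {
      this.state = "recording";
      this.overlay = "● Recording...";
      this.finals = [];
    } else {
      this.state = "idle";
      this.overlay = "";
      // Final text goes to the prompt for editing; nothing is sent automatically.
      const text = this.finals.join(" ");
      if (text) this.prompt = this.prompt ? `${this.prompt} ${text}` : text;
    }
  }

  // Called with each transcript from the provider; `isFinal` marks a settled segment.
  onTranscript(text: string, isFinal: boolean): void {
    if (this.state !== "recording") return;
    if (isFinal) this.finals.push(text);
    else this.overlay = `● Recording... ${text}`;
  }
}
```

Keeping interim text in the overlay and only committing final segments means the prompt never contains words the provider later revises.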
Providers
| Provider | Protocol | Latency | Interim Results | Best For |
| ------------------ | --------- | ------- | ------------------ | ------------------- |
| Deepgram | WebSocket | ~100ms | ✅ Yes | Real-time streaming |
| Groq | HTTP | ~200ms | ❌ No (5s chunks) | Speed + cost |
| OpenAI Whisper | HTTP | ~500ms | ❌ No (10s chunks) | Accuracy |
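
For the HTTP providers, audio is sent in fixed-duration chunks rather than streamed. As a rough worked example of what those durations mean in bytes, the sketch below assumes 16 kHz, 16-bit, mono PCM. That audio format is an assumption for illustration and is not stated in this README; `chunkSizeBytes` and `splitIntoChunks` are hypothetical helpers.

```typescript
// Assumed audio format (not stated in this README): 16 kHz, 16-bit, mono PCM.
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2;
const CHANNELS = 1;

// Bytes of raw PCM that make up one chunk of the given duration.
function chunkSizeBytes(chunkDurationMs: number): number {
  return Math.floor((SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunkDurationMs) / 1000);
}

// Split a recorded buffer into fixed-duration chunks; the last one may be shorter.
function splitIntoChunks(pcm: Uint8Array, chunkDurationMs: number): Uint8Array[] {
  const size = chunkSizeBytes(chunkDurationMs);
  if (size <= 0) throw new Error("chunkDurationMs must be positive");
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += size) {
    // subarray clamps the end index, so the final chunk is simply shorter.
    chunks.push(pcm.subarray(offset, offset + size));
  }
  return chunks;
}
```

With this assumed format, the 5000 ms Groq default works out to 160000 bytes per chunk and the 10000 ms OpenAI default to 320000 bytes. Shorter chunks reduce the wait before text appears but increase the number of HTTP requests.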
Deepgram (Recommended)
Best real-time experience. Uses WebSocket streaming with interim results.

    ["opencode-voice", { "provider": "deepgram" }]

Get a free API key at deepgram.com.
Groq Whisper
Ultra-fast HTTP transcription (189x realtime). Good balance of speed and cost.

    ["opencode-voice", { "provider": "groq" }]

Get a free API key at console.groq.com.
OpenAI Whisper
Most widely used. Requires an OpenAI API key.

    ["opencode-voice", { "provider": "openai-whisper" }]

Troubleshooting
"No recording tool found"
Install SoX for your platform (see Requirements above).
"Invalid API key"
Check that your API key environment variable is set correctly:

    echo $DEEPGRAM_API_KEY # or GROQ_API_KEY / OPENAI_API_KEY

No microphone input
- Check that your microphone is connected and set as the default input device
- On Linux, ensure PulseAudio is running: `pulseaudio --check`
- On macOS, grant microphone permissions to your terminal app
Text appears in wrong position
The transcribed text is inserted at the cursor position in the prompt. If you've typed text before recording, the transcription will be appended after it.
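
The insertion behavior described above can be illustrated with a small string helper. This is a sketch, not the plugin's code: `insertAtCursor` is a hypothetical function, and the separating space it adds after existing text is an assumption.

```typescript
// Hypothetical helper for illustration; not the plugin's API.
function insertAtCursor(
  prompt: string,
  cursor: number,
  transcript: string,
): { text: string; cursor: number } {
  // Clamp the cursor so out-of-range positions fall back to the start or end.
  const pos = Math.max(0, Math.min(cursor, prompt.length));
  const before = prompt.slice(0, pos);
  const after = prompt.slice(pos);
  // Assumption: add one space unless the preceding text is empty or ends in whitespace.
  const sep = before.length > 0 && !/\s$/.test(before) ? " " : "";
  return {
    text: before + sep + transcript + after,
    cursor: pos + sep.length + transcript.length,
  };
}
```

For example, `insertAtCursor("fix the bug", 11, "in parser")` yields the text `"fix the bug in parser"` with the cursor at position 21. If the cursor was left in the middle of existing text, the transcript lands there, which is the usual cause of text appearing in an unexpected place.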
Roadmap (v2)
- Google Cloud Speech (gRPC streaming)
- Local Whisper (whisper.cpp, no API key needed)
- OpenAI Realtime API (WebSocket, ultra-low latency)
License
MIT
