claude-voice

v1.5.11

Published

8 days ago

Voice interface extension for Claude Code - TTS, STT, and wake word detection


enescyc

claude claude-code voice tts stt speech ai extension jarvis wake-word

Claude Voice Extension

The hands-free voice interface for Claude Code. Say "Hey Jarvis" and start coding with your voice — no buttons, no typing.

npm install -g claude-voice

That's it. The extension auto-installs hooks, downloads speech models, and starts with Claude Code. Say "Hey Jarvis" and speak your command.

What Gets Installed

The postinstall script sets up everything automatically:

Default configuration (~/.claude-voice/config.json)
Claude Code hooks (session-start, stop, post-tool-use, notification)
Claude Code plugin (voice skill)
Wake word model (openWakeWord or Sherpa-ONNX KWS)
STT model — Whisper Small (~488MB) for high-accuracy speech recognition
TTS voice — Piper neural voice (en_US-joe-medium, ~50MB)
Audio tools check (sox, afplay, etc.)

Total download: ~1GB on first install. All models run locally — no API keys needed.

Why Claude Voice?

Wake word activation: Say "Hey Jarvis" from across the room. No keyboard, no buttons to hold; your voice alone triggers Claude.

Hear every response: Claude doesn't just respond in text; it speaks back to you. Step away from the screen and still follow along.

Works offline: Local STT (Whisper Small via Sherpa-ONNX) and local TTS (Piper neural voice) mean no API calls, no network round-trips, and no usage cost. Your voice data never leaves your machine.

5 TTS providers, 3 STT providers: Choose the quality/cost/privacy balance that works for you, from free local voices to premium OpenAI and ElevenLabs.

Custom wake words: Train your own Picovoice keyword model, drop the .ppn file in ~/.claude-voice/models/, and select it with the picovoice wake word provider. Say "Hey Claude" instead of "Hey Jarvis".

Tool announcements: Hear "Build completed", "Tests passed", "File written" while you're making coffee. Know what Claude is doing without watching the screen.

Multi-language: Speak in English, Turkish, German, French, Spanish, Japanese, Chinese, and more. Switch languages with a single command: claude-voice config set stt.language=tr

How It Works

You speak → "Hey Jarvis..." → Wake word detected → STT transcribes → Claude Code receives
Claude responds → Hook captures → TTS speaks aloud → You hear the response

The extension integrates via Claude Code hooks: auto-start on session, speak responses, announce tool completions, and voice alerts for permission prompts.

Providers

| | Local (Free) | Cloud |
|---|---|---|
| TTS | Piper (default), macOS Say, espeak | OpenAI, ElevenLabs |
| STT | Sherpa-ONNX Whisper Small (default) | OpenAI Whisper |
| Wake Word | openWakeWord (default), Sherpa-ONNX | Picovoice |

Quick presets:

claude-voice setup              # Interactive setup wizard
claude-voice openai             # Cloud TTS + STT (requires API key)
claude-voice local --download   # Piper TTS + larger Whisper model (offline)

Wake Word

Default: openWakeWord

Works out of the box. Say "Hey Jarvis" to activate.

Picovoice (best accuracy)

Picovoice offers 97%+ keyword detection accuracy with custom wake words.

claude-voice picovoice --key YOUR_ACCESS_KEY

Get a free access key at console.picovoice.ai.

Custom Wake Words (Picovoice)

Train a custom keyword at Picovoice Console, download the .ppn file, and place it in ~/.claude-voice/models/:

# Example: use "Hey Claude" as wake word
cp hey-claude_en_mac_v3_0_0.ppn ~/.claude-voice/models/hey-claude.ppn
claude-voice config set wakeWord.provider=picovoice
claude-voice config set wakeWord.keyword=hey-claude
claude-voice restart

The detector searches for <keyword>.ppn in ~/.claude-voice/models/ first, then falls back to Picovoice built-in keywords (jarvis, alexa, computer, etc.).
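
The lookup order described above can be sketched as a small shell function. This is an illustration under assumptions, not the extension's actual code: the function name `resolve_wake_keyword` and its `custom:`/`builtin:` output format are hypothetical, and the built-in list contains only the three keywords named above.

```shell
# Hypothetical sketch of the keyword lookup order; not the extension's real code.
# Usage: resolve_wake_keyword <keyword> [models_dir]
resolve_wake_keyword() {
  keyword=$1
  models_dir=${2:-$HOME/.claude-voice/models}

  # 1. A custom model file in the models directory wins.
  if [ -f "$models_dir/$keyword.ppn" ]; then
    printf 'custom:%s\n' "$models_dir/$keyword.ppn"
    return 0
  fi

  # 2. Otherwise fall back to a built-in keyword (partial list, taken from the text).
  case $keyword in
    jarvis|alexa|computer)
      printf 'builtin:%s\n' "$keyword"
      return 0
      ;;
  esac

  # 3. Neither a custom file nor a known built-in keyword.
  printf 'unknown keyword: %s\n' "$keyword" >&2
  return 1
}
```

Under this sketch, a file named hey-claude.ppn in the models directory takes precedence for the keyword hey-claude, while jarvis resolves to the built-in keyword when no jarvis.ppn exists.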

Configuration

claude-voice config                         # View all
claude-voice config set tts.provider=openai # Set value
claude-voice config set stt.language=tr     # Change language
claude-voice config edit                    # Open in editor

Config file: ~/.claude-voice/config.json

| Option | Default | Description |
|--------|---------|-------------|
| tts.provider | piper | piper, macos-say, openai, elevenlabs, espeak, disabled |
| tts.autoSpeak | false | Auto-speak Claude responses |
| tts.maxSpeechLength | 5000 | Max characters to speak |
| stt.provider | sherpa-onnx | sherpa-onnx, openai, whisper-local, disabled |
| stt.sherpaOnnx.model | whisper-small | whisper-tiny (75MB), whisper-base (142MB), whisper-small (488MB) |
| stt.language | en | Language code (en, tr, de, fr, es, ja, zh...) |
| wakeWord.enabled | true | Enable wake word detection |
| wakeWord.provider | openwakeword | openwakeword, sherpa-onnx, picovoice |
| wakeWord.keyword | jarvis | Wake word (or custom .ppn filename without extension) |
| wakeWord.sensitivity | 0.5 | Detection sensitivity (0.0-1.0) |
| voiceOutput.enabled | false | TTS-friendly response formatting |
| toolTTS.enabled | false | Announce tool completions |
| recording.silenceThreshold | 3500 | Silence duration to stop recording (ms) |
| recording.maxDuration | 60000 | Max recording length (ms) |
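
When several options need changing at once, the `config set key=value` form shown earlier can be wrapped in a loop. The following is a minimal sketch: the `apply_settings` helper and the `CLAUDE_VOICE_BIN` override variable are hypothetical conveniences, and only the documented `config set` subcommand is assumed to exist.

```shell
# Hypothetical helper: apply several key=value settings in one call.
# CLAUDE_VOICE_BIN is an assumed override for the binary name, not a real option.
apply_settings() {
  bin=${CLAUDE_VOICE_BIN:-claude-voice}
  for setting in "$@"; do
    case $setting in
      ?*=*)
        # Looks like key=value: pass it through unchanged.
        "$bin" config set "$setting" || return 1
        ;;
      *)
        printf 'skipping malformed setting: %s\n' "$setting" >&2
        ;;
    esac
  done
}

# Example (not run here):
#   apply_settings tts.provider=openai stt.language=tr
```

The helper only checks that each argument has the `key=value` shape; whether a given key or value is accepted is left to claude-voice itself.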

CLI Commands

# Daemon
claude-voice start
claude-voice stop
claude-voice restart
claude-voice status

# Setup & Diagnostics
claude-voice setup                # Interactive wizard
claude-voice doctor               # Diagnose issues

# Provider Presets
claude-voice openai               # Cloud TTS + STT
claude-voice local --download     # Local with model downloads
claude-voice picovoice --key KEY  # Best wake word accuracy

# Models & Voices
claude-voice model list                     # List STT models (whisper-tiny/base/small)
claude-voice model download <id>            # Download an STT model
claude-voice voice list                     # List Piper TTS voices
claude-voice voice download <id>            # Download a Piper TTS voice

# Wake Word
claude-voice openwakeword --install         # Better wake word detection

# Testing
claude-voice test-tts "Hello"
claude-voice test-stt recording.wav

# Utilities
claude-voice logs -f              # Follow daemon logs
claude-voice devices              # List audio devices

Platform Support

| | macOS | Linux |
|---|---|---|
| TTS | Piper, Say, OpenAI, ElevenLabs | Piper, espeak, OpenAI, ElevenLabs |
| STT | Sherpa-ONNX, OpenAI | Sherpa-ONNX, OpenAI |
| Wake Word | openWakeWord, Sherpa-ONNX, Picovoice | openWakeWord, Sherpa-ONNX, Picovoice |

Requires: Node.js 18+, microphone access. Python 3 recommended (for Piper TTS and openWakeWord).
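
A quick preflight check for these requirements might look like the sketch below. The helper names `node_version_ok` and `preflight` are hypothetical and not part of claude-voice; the only facts taken from the text are the Node.js 18+ minimum and the Python 3 recommendation.

```shell
# Hypothetical preflight helpers; not part of claude-voice.

# Succeeds if a version string such as "v18.17.0" has a major version >= 18.
node_version_ok() {
  major=${1#v}        # drop the leading "v"
  major=${major%%.*}  # keep the text before the first dot
  case $major in
    ''|*[!0-9]*) return 1 ;;  # empty or not a number
  esac
  [ "$major" -ge 18 ]
}

preflight() {
  if ! command -v node >/dev/null 2>&1; then
    echo "Node.js not found (18+ required)"
    return 1
  fi
  if ! node_version_ok "$(node --version)"; then
    echo "Node.js $(node --version) is too old (18+ required)"
    return 1
  fi
  # Python 3 is only recommended, so its absence is a warning, not a failure.
  command -v python3 >/dev/null 2>&1 ||
    echo "python3 not found (recommended for Piper TTS and openWakeWord)"
  echo "preflight passed"
}
```

Running `preflight` before `npm install -g claude-voice` surfaces a missing or outdated runtime early; microphone access is not checked here.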

Troubleshooting

claude-voice doctor               # Auto-diagnose and fix issues
claude-voice logs                 # Check daemon logs
claude-voice start -f             # Run in foreground for debugging

Wake word not detecting? Run claude-voice openwakeword --install for better accuracy, or switch to Picovoice for 97%+ detection rate.

STT not working? Check that the model is installed with claude-voice model list. If it is missing, download it again with claude-voice model download whisper-small.
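
When reporting a problem, it can help to capture the output of the diagnostic commands in one file. The sketch below is an assumption-laden convenience: `collect_diagnostics`, the default report filename, and the `CLAUDE_VOICE_BIN` override are hypothetical, and only the documented `status`, `doctor`, `devices`, and `model list` subcommands are invoked.

```shell
# Hypothetical helper: gather output of the documented diagnostic commands
# into one report file. CLAUDE_VOICE_BIN is an assumed override, not a real option.
collect_diagnostics() {
  bin=${CLAUDE_VOICE_BIN:-claude-voice}
  report=${1:-claude-voice-report.txt}
  : > "$report"   # start with an empty file
  for args in "status" "doctor" "devices" "model list"; do
    printf '== claude-voice %s ==\n' "$args" >> "$report"
    # $args is intentionally unquoted so "model list" splits into two words.
    "$bin" $args >> "$report" 2>&1 ||
      printf '(exit status %s)\n' "$?" >> "$report"
  done
}

# Example (not run here):
#   collect_diagnostics ~/claude-voice-report.txt
```

Note that `doctor` is described as fixing issues as well as diagnosing them, so this helper may change local state rather than only reading it.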

License

Apache License 2.0
