@whiteboxideas/react-native-voice
v0.1.0
Published
A modular voice command library for React Native apps — STT, LLM tool-calling, TTS pipeline with pluggable providers.
Maintainers
Readme
react-native-voice
A modular voice command library for React Native apps — STT, LLM tool-calling, TTS pipeline with pluggable providers.
Features
- Full voice pipeline — Record → Transcribe → LLM → Execute → Speak, all in one press-and-hold interaction
- Pluggable providers — Swap STT, LLM, and TTS implementations without changing app code
- Built-in providers — Whisper STT, OpenAI LLM, Anthropic LLM, Expo Speech TTS
- Command registry — Declarative command definitions with typed parameters, categories, and confirmation support
- Ready-to-use UI —
VoiceButton(FAB),VoiceBubble(status overlay), animatedRecordingIndicator - React hooks —
useVoiceCommand,useAudioRecording,useAudioPlaybackfor custom UIs - Secure API key management —
expo-secure-storewith env variable fallback - Tree-shakeable — Subpath exports so you only bundle what you use
Architecture
┌─────────────┐ ┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│ Recording │────▶│ STT Provider │────▶│ LLM Provider │────▶│ Command │
│ (expo-av) │ │ (Whisper) │ │ (OpenAI / │ │ Executor │
│ │ │ │ │ Anthropic) │ │ │
└─────────────┘ └──────────────┘ └───────────────┘ └──────┬───────┘
│
▼
┌──────────────┐
│ TTS Provider │
│ (Expo Speech)│
└──────────────┘
Pipeline stages: idle → recording → transcribing → understanding → executing → speaking → doneInstallation
npm install react-native-voice
# or
yarn add react-native-voicePeer dependencies (required)
npx expo install expo-avOptional dependencies
Install based on which features you use:
| Dependency | Required for |
|---|---|
| expo-haptics | Haptic feedback in VoiceButton |
| react-native-reanimated | Animations in VoiceBubble and RecordingIndicator |
| expo-speech | ExpoSpeechTTSProvider |
| expo-secure-store | createExpoApiKeyResolver (secure key storage) |
| expo-constants | createExpoApiKeyResolver (env variable fallback) |
| expo-file-system | File existence check in WhisperSTTProvider |
# Install all optional deps at once
npx expo install expo-haptics react-native-reanimated expo-speech expo-secure-store expo-constants expo-file-systemProvider-specific packages
# OpenAI (Whisper STT + OpenAI LLM)
# No extra packages — uses fetch against the OpenAI REST API
# Anthropic (Claude LLM)
# No extra packages — uses fetch against the Anthropic REST API
# Expo Speech (TTS)
npx expo install expo-speechQuick Start
1. Define your commands
import type { CommandDefinition } from 'react-native-voice';
const commands: CommandDefinition[] = [
{
name: 'set_theme',
description: 'Change the app theme to light or dark mode',
category: 'settings',
parameters: [
{
name: 'theme',
type: 'string',
description: 'The theme to set',
required: true,
enum: ['light', 'dark'],
},
],
handler: async (params) => {
setTheme(params.theme as string);
return { success: true, message: `Theme set to ${params.theme}` };
},
},
{
name: 'navigate',
description: 'Navigate to a screen in the app',
category: 'navigation',
parameters: [
{
name: 'screen',
type: 'string',
description: 'The screen name to navigate to',
required: true,
},
],
handler: async (params) => {
router.push(params.screen as string);
return { success: true, message: `Navigated to ${params.screen}` };
},
},
];2. Set up providers and registry
import {
CommandRegistry,
createExpoApiKeyResolver,
} from 'react-native-voice';
import { WhisperSTTProvider, OpenAILLMProvider } from 'react-native-voice/providers/openai';
import { ExpoSpeechTTSProvider } from 'react-native-voice/providers/expo-speech';
const resolveApiKey = createExpoApiKeyResolver();
const registry = new CommandRegistry();
registry.registerAll(commands);
const pipelineConfig = {
sttProvider: new WhisperSTTProvider(resolveApiKey),
llmProvider: new OpenAILLMProvider(resolveApiKey), // defaults to gpt-4o-mini
ttsProvider: new ExpoSpeechTTSProvider(),
ttsEnabled: true,
};3. Wrap your app with VoiceProvider
import { VoiceProvider } from 'react-native-voice';
export default function App() {
return (
<VoiceProvider config={pipelineConfig} registry={registry}>
<YourApp />
</VoiceProvider>
);
}4. Add the voice UI
import { VoiceButton, VoiceBubble } from 'react-native-voice/ui';
function YourApp() {
return (
<View style={{ flex: 1 }}>
{/* Your app content */}
<VoiceBubble />
<VoiceButton />
</View>
);
}Press and hold the mic button to record, release to process. The bubble shows pipeline progress, the transcript, and command results.
API Reference
Subpath Exports
| Import path | Contents |
|---|---|
| react-native-voice | Core classes, hooks, types, VoiceProvider, createExpoApiKeyResolver |
| react-native-voice/ui | VoiceButton, VoiceBubble, RecordingIndicator |
| react-native-voice/providers/openai | WhisperSTTProvider, OpenAILLMProvider |
| react-native-voice/providers/anthropic | AnthropicLLMProvider |
| react-native-voice/providers/expo-speech | ExpoSpeechTTSProvider |
Core
CommandRegistry
Manages command definitions. Passed to VoiceProvider and PipelineOrchestrator.
const registry = new CommandRegistry();
registry.register(command); // Register a single command
registry.registerAll(commands); // Register an array of commands
registry.get('command_name'); // Get a command by name
registry.getAll(); // Get all registered commands
registry.getByCategory('nav'); // Get commands in a category
registry.toPromptDescription(); // Serialize commands for LLM system promptPipelineOrchestrator
Runs the full STT → LLM → Execute → TTS pipeline. Used internally by useVoiceCommand, but available for custom integrations.
const orchestrator = new PipelineOrchestrator(pipelineConfig, registry);
const result = await orchestrator.process(audioUri, (stage, detail) => {
console.log(stage, detail); // 'transcribing', 'understanding', etc.
});
// Update config at runtime
orchestrator.updateConfig({ ttsEnabled: false });Returns: PipelineResult
interface PipelineResult {
success: boolean;
transcript: string;
actions: LLMAction[];
commandResults: CommandResult[];
error?: string;
durationMs: number;
}CommandExecutor
Executes LLM-parsed actions against registered commands. Handles confirmation flow.
const executor = new CommandExecutor(registry, onConfirmation);
const results = await executor.execute(actions);PromptBuilder
Builds the LLM system prompt from the command registry. Automatically serializes command names, descriptions, parameters, and expected JSON response format.
const builder = new PromptBuilder();
const systemPrompt = builder.build(registry, optionalContext);Hooks
useVoiceCommand(options)
High-level hook that orchestrates the full voice pipeline. Used internally by VoiceProvider.
const {
startListening, // () => Promise<void> — start recording
stopListening, // () => Promise<void> — stop recording and process
dismiss, // () => void — reset to idle
isListening, // boolean — currently recording
isProcessing, // boolean — pipeline is running (post-recording)
stage, // PipelineStage — current pipeline stage
lastResult, // PipelineResult | null — last pipeline result
recordingUri, // string | null — URI of last recording (kept on error)
error, // string | null — error message
} = useVoiceCommand({ config, registry });useAudioRecording()
Low-level hook for managing microphone recording via expo-av.
const {
startRecording, // () => Promise<void>
stopRecording, // () => Promise<AudioRecordingResult | null>
isRecording, // boolean
recordingState, // RecordingState: 'idle' | 'requesting_permission' | 'recording' | 'stopping'
error, // string | null
} = useAudioRecording();stopRecording returns { uri: string; durationMs: number } or null if no recording was captured.
useAudioPlayback(uri)
Hook for audio playback (used by VoiceBubble for error replay).
const {
play, // () => Promise<void>
pause, // () => Promise<void>
seekTo, // (seconds: number) => Promise<void>
isPlaying, // boolean
isLoaded, // boolean
currentTime, // number (seconds)
duration, // number (seconds)
} = useAudioPlayback(uri); // pass null to unloadComponents
VoiceProvider / useVoice()
React context provider that wraps useVoiceCommand and makes voice state available to child components.
<VoiceProvider config={pipelineConfig} registry={registry}>
{children}
</VoiceProvider>Access the context with useVoice():
const {
startListening, stopListening, dismiss,
isListening, isProcessing, stage,
lastResult, recordingUri, error,
registry,
} = useVoice();VoiceButton
Floating action button with press-and-hold to record. Uses expo-haptics for feedback and renders a built-in mic icon.
import { VoiceButton } from 'react-native-voice/ui';
<VoiceButton
size={60} // Button diameter (default: 60)
color="#4A90D9" // Idle color (default: '#4A90D9')
activeColor="#E53935" // Recording color (default: '#E53935')
bottom={90} // Position from bottom (default: 90)
right={20} // Position from right (default: 20)
/>Must be used inside a VoiceProvider. Automatically disables during pipeline processing.
VoiceBubble
Animated status overlay that shows pipeline stage, transcript, command results, errors, and audio playback controls on error.
import { VoiceBubble } from 'react-native-voice/ui';
<VoiceBubble
backgroundColor="#1E1E1E" // Bubble background (default: '#1E1E1E')
textColor="#F0F0F0" // Text color (default: '#F0F0F0')
accentColor="#6AB0FF" // Accent/highlight color (default: '#6AB0FF')
/>Features:
- Shows current pipeline stage with animated indicator
- Displays transcript ("You said: ...")
- Shows command results (green for success, red for failure)
- On error: audio playback controls + re-record button
- Auto-dismisses 3 seconds after reaching "done" stage
- Tap backdrop to dismiss
Requires react-native-reanimated.
RecordingIndicator
Animated pulsing ring shown around the voice button during recording.
import { RecordingIndicator } from 'react-native-voice/ui';
<RecordingIndicator
isRecording={true} // Controls animation (default: false)
color="#E53935" // Pulse color (default: '#E53935')
size={80} // Ring diameter (default: 80)
/>Requires react-native-reanimated.
Providers
All providers accept an ApiKeyResolver for key management. Keys are resolved lazily on each API call.
WhisperSTTProvider
OpenAI Whisper speech-to-text. Sends audio as audio/m4a to the Whisper API.
import { WhisperSTTProvider } from 'react-native-voice/providers/openai';
const stt = new WhisperSTTProvider(resolveApiKey);
const transcript = await stt.transcribe(audioUri);Resolves the 'openai' provider key.
OpenAILLMProvider
OpenAI chat completions with JSON response mode.
import { OpenAILLMProvider } from 'react-native-voice/providers/openai';
const llm = new OpenAILLMProvider(resolveApiKey); // defaults to gpt-4o-mini
const llm = new OpenAILLMProvider(resolveApiKey, 'gpt-4o'); // specify model
const response = await llm.complete(systemPrompt, userMessage);Resolves the 'openai' provider key.
AnthropicLLMProvider
Anthropic Claude chat completions.
import { AnthropicLLMProvider } from 'react-native-voice/providers/anthropic';
const llm = new AnthropicLLMProvider(resolveApiKey); // defaults to claude-sonnet-4-6
const llm = new AnthropicLLMProvider(resolveApiKey, 'claude-haiku-4-5'); // specify model
const response = await llm.complete(systemPrompt, userMessage);Resolves the 'anthropic' provider key.
ExpoSpeechTTSProvider
Text-to-speech using expo-speech. No API key needed.
import { ExpoSpeechTTSProvider } from 'react-native-voice/providers/expo-speech';
const tts = new ExpoSpeechTTSProvider();
await tts.speak('Hello!');
tts.stop();Utils
createExpoApiKeyResolver()
Factory that returns an ApiKeyResolver function. Checks two sources in order:
expo-secure-store— looks for keyvoice_ux_api_key_{provider}(e.g.voice_ux_api_key_openai)- Environment variable — looks for
EXPO_PUBLIC_OPENAI_API_KEYorEXPO_PUBLIC_ANTHROPIC_API_KEYviaexpo-constants
import { createExpoApiKeyResolver } from 'react-native-voice';
const resolveApiKey = createExpoApiKeyResolver();
// Use with providers
const stt = new WhisperSTTProvider(resolveApiKey);
const llm = new OpenAILLMProvider(resolveApiKey);Dependencies (expo-secure-store, expo-constants) are dynamically imported — the library won't crash if they're not installed, it will just skip that source.
Defining Commands
Commands are defined as CommandDefinition objects and registered with a CommandRegistry:
interface CommandDefinition {
name: string; // Unique identifier (used by LLM in actions)
description: string; // Natural language description (sent to LLM)
category: string; // Grouping key
parameters?: CommandParameter[]; // Typed parameter definitions
requiresConfirmation?: boolean; // If true, triggers onConfirmation callback before executing
handler: (params: Record<string, unknown>) => CommandResult | Promise<CommandResult>;
}
interface CommandParameter {
name: string;
type: 'string' | 'number' | 'boolean';
description: string;
required?: boolean;
enum?: string[]; // Constrain values (sent to LLM)
}
interface CommandResult {
success: boolean;
message: string; // Displayed in VoiceBubble and spoken by TTS
data?: unknown;
}Confirmation support
For destructive or sensitive commands, set requiresConfirmation: true and provide an onConfirmation callback in the pipeline config:
const commands: CommandDefinition[] = [
{
name: 'delete_item',
description: 'Delete an item permanently',
category: 'data',
requiresConfirmation: true,
parameters: [{ name: 'id', type: 'string', description: 'Item ID', required: true }],
handler: async (params) => {
await deleteItem(params.id as string);
return { success: true, message: 'Item deleted' };
},
},
];
const pipelineConfig: PipelineConfig = {
sttProvider: new WhisperSTTProvider(resolveApiKey),
llmProvider: new OpenAILLMProvider(resolveApiKey),
onConfirmation: async (action, command) => {
// Show a confirmation dialog, return true/false
return await showConfirmDialog(`Execute ${command.name}?`);
},
};API Key Management
Option 1: Environment variables (development)
Add keys to your .env or app.config.js:
# .env
EXPO_PUBLIC_OPENAI_API_KEY=sk-...
EXPO_PUBLIC_ANTHROPIC_API_KEY=sk-ant-...Option 2: Secure Store (production)
Store keys at runtime using expo-secure-store:
import * as SecureStore from 'expo-secure-store';
// Save a key (e.g., from a settings screen)
await SecureStore.setItemAsync('voice_ux_api_key_openai', 'sk-...');
await SecureStore.setItemAsync('voice_ux_api_key_anthropic', 'sk-ant-...');createExpoApiKeyResolver() checks Secure Store first, then falls back to env variables.
Option 3: Custom resolver
Implement your own ApiKeyResolver for any key source (remote config, auth token exchange, etc.):
type ApiKeyResolver = (provider: string) => Promise<string | null>;
const customResolver: ApiKeyResolver = async (provider) => {
const response = await fetch(`https://your-api.com/keys/${provider}`);
const { key } = await response.json();
return key;
};
const stt = new WhisperSTTProvider(customResolver);Custom Providers
Implement the provider interfaces to bring your own STT, LLM, or TTS:
interface STTProvider {
transcribe(audioUri: string): Promise<string>;
}
interface LLMProvider {
complete(systemPrompt: string, userMessage: string): Promise<string>;
}
interface TTSProvider {
speak(text: string): Promise<void>;
stop(): void;
}Example — custom STT provider:
class MySTTProvider implements STTProvider {
async transcribe(audioUri: string): Promise<string> {
// Send audio to your own API
const formData = new FormData();
formData.append('audio', { uri: audioUri, type: 'audio/m4a', name: 'audio.m4a' } as any);
const res = await fetch('https://my-api.com/transcribe', { method: 'POST', body: formData });
const data = await res.json();
return data.text;
}
}Pipeline Stages
The PipelineStage type tracks where the pipeline is in its lifecycle:
| Stage | Description |
|---|---|
| idle | No activity |
| recording | Microphone is active |
| transcribing | Audio is being sent to the STT provider |
| understanding | Transcript is being processed by the LLM |
| executing | Commands are being executed |
| speaking | TTS is speaking the response |
| done | Pipeline completed successfully |
| error | An error occurred at any stage |
Listen for stage changes via useVoiceCommand, useVoice(), or the onStage callback on PipelineOrchestrator.process().
Types
All types are exported from the main entry point:
import type {
// Commands
CommandDefinition,
CommandParameter,
CommandResult,
// Pipeline
LLMAction,
LLMResponse,
PipelineStage,
PipelineConfig,
PipelineResult,
PipelineStageListener,
ConfirmationCallback,
// Providers
ApiKeyResolver,
STTProvider,
LLMProvider,
TTSProvider,
// Audio
RecordingState,
AudioRecordingConfig,
AudioRecordingResult,
} from 'react-native-voice';Example App
The example/ directory contains a full Expo SDK 54 app demonstrating:
- Command registration and categories
- Provider setup with
createExpoApiKeyResolver VoiceButton+VoiceBubbleintegration- Tab-based navigation with expo-router
To run the example:
cd example
yarn install
npx expo startLicense
MIT
