@shuttle_space/tauri-plugin-stt-api
v0.1.1
Speech-to-text recognition API for Tauri with multi-language support
Tauri Plugin STT (Speech-to-Text)
Cross-platform speech recognition plugin for Tauri 2.x applications. Provides real-time speech-to-text functionality for desktop (Windows, macOS, Linux) and mobile (iOS, Android).
Features
- 🎤 Real-time Speech Recognition - Convert speech to text with low latency
- 📱 Cross-platform Support - iOS, Android, macOS, Windows, Linux
- 🌐 Multi-language Support - 9 languages with automatic model download
- 📝 Interim Results - Get partial transcriptions while speaking
- 🔄 Continuous Mode - Auto-restart recognition after each utterance
- 🔐 Permission Handling - Request and check microphone/speech permissions
- 📥 Auto Model Download - Vosk models are downloaded automatically on first use
Platform Support
| Platform | Status  | API Used                              | Model Download |
| -------- | ------- | ------------------------------------- | -------------- |
| iOS      | ✅ Full | SFSpeechRecognizer (Speech framework) | Not required   |
| Android  | ✅ Full | SpeechRecognizer API                  | Not required   |
| macOS    | ✅ Full | Vosk (offline speech recognition)     | Automatic      |
| Windows  | ✅ Full | Vosk (offline speech recognition)     | Automatic      |
| Linux    | ✅ Full | Vosk (offline speech recognition)     | Automatic      |
Supported Languages (Desktop)
| Language   | Code  | Model Size |
| ---------- | ----- | ---------- |
| English    | en-US | 40 MB      |
| Portuguese | pt-BR | 31 MB      |
| Spanish    | es-ES | 39 MB      |
| French     | fr-FR | 41 MB      |
| German     | de-DE | 45 MB      |
| Russian    | ru-RU | 45 MB      |
| Chinese    | zh-CN | 43 MB      |
| Japanese   | ja-JP | 48 MB      |
| Italian    | it-IT | 39 MB      |
Models are downloaded automatically from alphacephei.com/vosk/models when you first use a language.
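On desktop, a language code outside this table will typically fail with a `LANGUAGE_NOT_SUPPORTED` error. A small, purely illustrative helper to validate a code before calling `startListening()` (the list mirrors the table above; the function name is our own):

```typescript
// Desktop language codes, taken from the table above.
const DESKTOP_LANGUAGES: string[] = [
  "en-US", "pt-BR", "es-ES", "fr-FR", "de-DE",
  "ru-RU", "zh-CN", "ja-JP", "it-IT",
];

// Illustrative guard: check a code against the desktop model list
// before starting recognition, to fail fast in the UI.
function isDesktopLanguage(code: string): boolean {
  return DESKTOP_LANGUAGES.includes(code);
}
```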
Installation
Rust
Add the plugin to your Cargo.toml:
```toml
[dependencies]
tauri-plugin-stt = "0.1"
```

TypeScript
Install the JavaScript guest bindings:
```sh
npm install tauri-plugin-stt-api
# or
yarn add tauri-plugin-stt-api
# or
pnpm add tauri-plugin-stt-api
```

Setup
Register Plugin
In your Tauri app setup:
```rust
fn main() {
    tauri::Builder::default()
        .plugin(tauri_plugin_stt::init())
        .run(tauri::generate_context!())
        .expect("error while running application");
}
```

Permissions
Add permissions to your capabilities/default.json:
```json
{
  "permissions": ["stt:default"]
}
```

For granular permissions, you can specify individual commands:
```json
{
  "permissions": [
    "stt:allow-is-available",
    "stt:allow-get-supported-languages",
    "stt:allow-check-permission",
    "stt:allow-request-permission",
    "stt:allow-start-listening",
    "stt:allow-stop-listening",
    "stt:allow-register-listener",
    "stt:allow-remove-listener"
  ]
}
```

iOS Configuration (Required)
For iOS apps, you must create an Info.plist file in your src-tauri directory with permission descriptions:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>NSMicrophoneUsageDescription</key>
  <string>This app needs access to the microphone for speech recognition.</string>
  <key>NSSpeechRecognitionUsageDescription</key>
  <string>This app needs access to speech recognition to convert your voice to text.</string>
</dict>
</plist>
```

Then reference it in your tauri.conf.json:
```json
{
  "bundle": {
    "iOS": {
      "infoPlist": "Info.plist"
    }
  }
}
```

Note: Without these permission descriptions, the app will crash when requesting permissions on iOS.
Android Configuration
Android permissions are automatically included from the plugin's AndroidManifest.xml. No additional configuration needed.
Vosk Library (Desktop Only)
Desktop builds auto-download and link Vosk runtime binaries during build:
- macOS (`x86_64`, `aarch64`)
- Linux (`x86_64`)
- Windows (`x86`, `x86_64`)

No manual `libvosk` install is required for local development.
If you need to use your own Vosk runtime location (for offline/intranet/CI):
```sh
export TAURI_STT_VOSK_LIB_DIR=/absolute/path/to/libvosk-directory
```

`TAURI_STT_VOSK_LIB_DIR` should contain:

- macOS: `libvosk.dylib`
- Linux: `libvosk.so`
- Windows: `libvosk.lib` and `libvosk.dll`
Usage
TypeScript API
```ts
import {
  isAvailable,
  getSupportedLanguages,
  startListening,
  stopListening,
  onResult,
  onStateChange,
  onError,
} from "tauri-plugin-stt-api";

// Check if STT is available
const result = await isAvailable();

// Get supported languages (with installed status)
const languages = await getSupportedLanguages();

// Listen for results
const resultListener = await onResult((result) => {
  console.log("Recognized:", result.transcript, result.isFinal);
});

// Listen for download progress (when model is being downloaded)
import { listen } from "@tauri-apps/api/event";

const downloadListener = await listen<{
  status: string;
  model: string;
  progress: number;
}>("stt://download-progress", (event) => {
  console.log(`${event.payload.status}: ${event.payload.progress}%`);
});

// Start listening
await startListening({
  language: "en-US",
  interimResults: true,
  continuous: true,
  // maxDuration and onDevice are supported by the guest SDK
});

// Stop listening
await stopListening();
```

Configuration Options
```ts
interface ListenConfig {
  language?: string; // Language code (e.g., "en-US", "pt-BR")
  modelPath?: string; // Desktop/Android offline: local Vosk model dir or .zip path
  interimResults?: boolean; // Return partial results while speaking
  continuous?: boolean; // Continue listening after utterance ends
  maxDuration?: number; // Max listening duration in milliseconds (0 = unlimited)
  onDevice?: boolean; // Prefer on-device recognition (iOS)
}
```

Event Listeners
```ts
// Listen for results
const unlistenResult = await onResult((result) => {
  console.log(result.transcript, result.isFinal);
});

// Listen for state changes
const unlistenState = await onStateChange((event) => {
  console.log("State:", event.state); // "idle" | "listening" | "processing"
});

// Listen for errors
const unlistenError = await onError((error) => {
  console.error(`[${error.code}] ${error.message}`);
});

// Clean up listeners
unlistenResult();
unlistenState();
unlistenError();
```

Events
| Event | Payload | Description |
| ------------------------- | -------------------------------------- | ---------------------------------------- |
| stt://result | { transcript, isFinal, confidence? } | Recognition result |
| stt://state-change | { state } | State change (idle/listening/processing) |
| stt://error | { code, message, details? } | Error event |
| stt://download-progress | { status, model, progress } | Model download progress |
API Reference
startListening(config?: ListenConfig): Promise<void>
Start speech recognition.
Config Options:
- `language`: Language code (e.g., "en-US", "pt-BR")
- `modelPath`: Custom offline model path.
  - Desktop/Android: supports a Vosk directory or `.zip`
  - iOS: treated as an "offline requested" signal (uses Apple on-device recognition; path content is not read)
- `interimResults`: Return partial results (default: `false`)
- `continuous`: Continue listening after utterance ends (default: `false`)
- `maxDuration`: Max listening duration in ms (0 = unlimited)
- `onDevice`: Use on-device recognition (iOS only, default: `false`)
Bundle Local Vosk Model (Desktop/Android Offline)
If you want to ship a model inside your app package (instead of runtime download), add the model zip to Tauri resources and pass the resolved path to modelPath.
src-tauri/tauri.conf.json:
```json
{
  "bundle": {
    "resources": ["../vosk-model-small-cn-0.22.zip"]
  }
}
```

App side:
```ts
import { resolveResource } from "@tauri-apps/api/path";
import { startListening } from "tauri-plugin-stt-api";

const modelZip = await resolveResource("vosk-model-small-cn-0.22.zip");

await startListening({
  language: "zh-CN",
  modelPath: modelZip,
  interimResults: true,
});
```

The plugin will automatically extract the zip to app data and load it.
stopListening(): Promise<void>
Stop current speech recognition session.
isAvailable(): Promise<AvailabilityResponse>
Check if STT is available on the device.
Returns:
- `available`: Whether STT is available
- `reason`: Optional reason if unavailable
getSupportedLanguages(): Promise<SupportedLanguagesResponse>
Get list of supported languages.
Returns: Array of languages with:
- `code`: Language code (e.g., "en-US")
- `name`: Display name
- `installed`: Whether the model is installed (desktop only)
checkPermission(): Promise<PermissionResponse>
Check current permission status.
Returns:
- `microphone`: "granted" | "denied" | "unknown"
- `speechRecognition`: "granted" | "denied" | "unknown"
requestPermission(): Promise<PermissionResponse>
Request microphone and speech recognition permissions.
Returns: Same as checkPermission()
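A common pattern is to check first and only prompt when the status is still undecided. The field names and status strings below come from the `PermissionResponse` documented above; the two helper functions are illustrative, and the assumption that "unknown" means "not yet asked" (versus "denied", which usually requires sending the user to system settings) is ours:

```typescript
// PermissionResponse shape from checkPermission()/requestPermission().
type PermissionStatus = "granted" | "denied" | "unknown";
interface PermissionResponse {
  microphone: PermissionStatus;
  speechRecognition: PermissionStatus;
}

// Illustrative gate: prompting is only useful while a status is "unknown";
// a "denied" status typically needs a settings deep-link instead.
function shouldPrompt(p: PermissionResponse): boolean {
  return p.microphone === "unknown" || p.speechRecognition === "unknown";
}

function isFullyGranted(p: PermissionResponse): boolean {
  return p.microphone === "granted" && p.speechRecognition === "granted";
}
```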
onResult(handler: (result: RecognitionResult) => void): Promise<UnlistenFn>
Listen for recognition results.
Result:
- `transcript`: Recognized text
- `isFinal`: Whether this is a final result
- `confidence`: Confidence score (0.0-1.0, if available)
onStateChange(handler: (event: StateChangeEvent) => void): Promise<UnlistenFn>
Listen for state changes.
States: "idle", "listening", "processing"
onError(handler: (error: SttError) => void): Promise<UnlistenFn>
Listen for errors.
Error Codes:
- `NOT_AVAILABLE`: STT not available on device
- `PERMISSION_DENIED`: Microphone permission denied
- `SPEECH_PERMISSION_DENIED`: Speech recognition permission denied
- `NETWORK_ERROR`: Network error (server-based recognition)
- `AUDIO_ERROR`: Audio capture error
- `TIMEOUT`: Recognition timeout
- `NO_SPEECH`: No speech detected
- `LANGUAGE_NOT_SUPPORTED`: Requested language not supported
- `CANCELLED`: Recognition cancelled by user
- `ALREADY_LISTENING`: Already in listening state
- `NOT_LISTENING`: Not currently listening
- `BUSY`: Recognizer busy
- `UNKNOWN`: Unknown error
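In continuous-recognition UIs it helps to separate transient errors (retry silently) from fatal ones (surface to the user). Which codes merit a retry is an app-level decision; the grouping below is illustrative only, with the code strings taken from the list above:

```typescript
// Error codes from the list above.
type SttErrorCode =
  | "NOT_AVAILABLE" | "PERMISSION_DENIED" | "SPEECH_PERMISSION_DENIED"
  | "NETWORK_ERROR" | "AUDIO_ERROR" | "TIMEOUT" | "NO_SPEECH"
  | "LANGUAGE_NOT_SUPPORTED" | "CANCELLED" | "ALREADY_LISTENING"
  | "NOT_LISTENING" | "BUSY" | "UNKNOWN";

// Illustrative triage: true when simply restarting recognition is
// reasonable; permission and availability errors need user action.
function isRetryable(code: SttErrorCode): boolean {
  switch (code) {
    case "TIMEOUT":
    case "NO_SPEECH":
    case "NETWORK_ERROR":
    case "BUSY":
      return true;
    default:
      return false;
  }
}
```

Such a helper could sit inside an `onError` handler, calling `startListening()` again for retryable codes and showing a dialog otherwise.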
Building
Without STT (Default)
```sh
npm run dev
```

With STT

```sh
npm run dev -- --features stt
# or
npm run dev:stt
```

Troubleshooting
Desktop: "library 'vosk' not found"
Solution:
- Use the default auto-download flow on desktop (macOS/Linux/Windows supported targets).
- Or set `TAURI_STT_VOSK_LIB_DIR` to a directory containing the Vosk runtime library files.

```sh
# Example
export TAURI_STT_VOSK_LIB_DIR=/absolute/path/to/libvosk-directory
# Verify expected files exist
ls "$TAURI_STT_VOSK_LIB_DIR"
```

Desktop: "Model not found" or automatic download fails
Background: Vosk models are downloaded automatically on first use of each language, so a blocked or failed download leaves the model missing.
Solution:
- Ensure internet connectivity
- Check app data directory:
~/.local/share/tauri-plugin-stt/models/(Linux/macOS) or%APPDATA%/tauri-plugin-stt/models/(Windows) - Manual download: Download from alphacephei.com/vosk/models and extract to models directory
- Model naming: Ensure folder name matches expected pattern (e.g.,
vosk-model-small-en-us-0.15)
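The manual steps above can be scripted. A sketch assuming the Linux/macOS data directory and the small English model; the exact zip filename and URL should be verified on alphacephei.com/vosk/models before use:

```shell
# Manual model install sketch (Linux/macOS path from above).
MODELS_DIR="$HOME/.local/share/tauri-plugin-stt/models"
MODEL="vosk-model-small-en-us-0.15"

mkdir -p "$MODELS_DIR"
cd "$MODELS_DIR"
# Download and extract; unzip creates a folder named after the model
curl -LO "https://alphacephei.com/vosk/models/${MODEL}.zip"
unzip "${MODEL}.zip"
ls "$MODELS_DIR/$MODEL"
```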
Mobile: "Speech recognition not available"
iOS Solution:
- Ensure iOS 10+ (speech recognition requires iOS 10+)
- Check Settings → Privacy → Speech Recognition → Enable for your app
- For on-device recognition, iOS 13+ is required
Android Solution:
- Install Google app (provides speech recognition service)
- Check Settings → Apps → Default apps → Digital assistant app
- Ensure internet connectivity for server-based recognition
Permission denied errors
Solution: Call requestPermission() before startListening()
```ts
const perm = await requestPermission();
if (perm.microphone !== "granted") {
  console.error("Microphone permission required");
  return;
}
await startListening();
```

No audio input detected
Checklist:
- ✅ Microphone is working in other apps
- ✅ Correct microphone selected in system settings
- ✅ Microphone not muted (hardware or software)
- ✅ App has microphone permission
- ✅ No other app is using the microphone exclusively
Interim results not showing
Note: Interim results availability varies by platform:
- iOS/Android: Full support
- Desktop (Vosk): Partial support (depends on model)
```ts
await startListening({
  interimResults: true, // Enable interim results
  continuous: true, // Keep listening
});
```

Recognition accuracy is low
Tips:
- Use correct language code for your accent (e.g., "en-GB" vs "en-US")
- Speak clearly and avoid background noise
- On iOS, download enhanced voices in Settings → Accessibility → Spoken Content
- Desktop: Use larger Vosk models for better accuracy (at cost of size)
"ALREADY_LISTENING" error
Solution: Stop current session before starting a new one:
```ts
try {
  await stopListening();
} catch (e) {
  // Ignore if not listening
}
await startListening();
```

Download progress events not firing
Note: Download progress events are only for desktop (Vosk models). Mobile uses native speech recognition without downloads.
```ts
import { listen } from "@tauri-apps/api/event";

const unlisten = await listen("stt://download-progress", (event) => {
  console.log(`${event.payload.status}: ${event.payload.progress}%`);
});
```

Examples
See the examples/stt-example directory for a complete working demo with React + Material UI, featuring:
- Real-time transcription with interim results
- Language selection
- Permission handling
- Error handling with visual feedback
- Download progress monitoring
- Results history
Platform-Specific Notes
iOS
- Requires iOS 10+ for basic speech recognition
- iOS 13+ required for on-device recognition (`onDevice: true`)
- Must add `NSSpeechRecognitionUsageDescription` to Info.plist
- Must add `NSMicrophoneUsageDescription` to Info.plist
- Offline mode uses Apple's on-device recognizer (no custom Vosk model loading on iOS)
- If `modelPath` is passed on iOS, the plugin forces on-device mode and fails if the locale/device does not support offline recognition
Android
- Requires Android API 23+ (Android 6.0+)
- Google app must be installed for speech recognition
- Internet required for server-based recognition
- Must request `RECORD_AUDIO` permission in AndroidManifest.xml
- Runtime permission flow uses Tauri `PermissionState` (`requestPermission()` then `startListening()`)
- Offline-first mode: when `modelPath` is passed, Android uses a local Vosk model (zip auto-extract supported)
- Fallback mode: when `modelPath` is not passed, Android uses the system `SpeechRecognizer`
Desktop (Windows, macOS, Linux)
- Uses the Vosk runtime library, downloaded and linked automatically at build time (see the Vosk Library section)
- Models downloaded automatically (40-50 MB per language)
- Fully offline after model download
- Models stored in app data directory
License
MIT
