tauri-plugin-stt-api

v0.1.0

Speech-to-text recognition API for Tauri with multi-language support

Tauri Plugin STT (Speech-to-Text)

Cross-platform speech recognition plugin for Tauri 2.x applications. Provides real-time speech-to-text functionality for desktop (Windows, macOS, Linux) and mobile (iOS, Android).

Features

  • 🎤 Real-time Speech Recognition - Convert speech to text with low latency
  • 📱 Cross-platform Support - iOS, Android, macOS, Windows, Linux
  • 🌐 Multi-language Support - 9 languages with automatic model download
  • 📝 Interim Results - Get partial transcriptions while speaking
  • 🔄 Continuous Mode - Auto-restart recognition after each utterance
  • 🔐 Permission Handling - Request and check microphone/speech permissions
  • 📥 Auto Model Download - Vosk models are downloaded automatically on first use

Platform Support

| Platform | Status  | API Used                              | Model Download |
| -------- | ------- | ------------------------------------- | -------------- |
| iOS      | ✅ Full | SFSpeechRecognizer (Speech framework) | Not required   |
| Android  | ✅ Full | SpeechRecognizer API                  | Not required   |
| macOS    | ✅ Full | Vosk (offline speech recognition)     | Automatic      |
| Windows  | ✅ Full | Vosk (offline speech recognition)     | Automatic      |
| Linux    | ✅ Full | Vosk (offline speech recognition)     | Automatic      |

Supported Languages (Desktop)

| Language   | Code  | Model Size |
| ---------- | ----- | ---------- |
| English    | en-US | 40 MB      |
| Portuguese | pt-BR | 31 MB      |
| Spanish    | es-ES | 39 MB      |
| French     | fr-FR | 41 MB      |
| German     | de-DE | 45 MB      |
| Russian    | ru-RU | 45 MB      |
| Chinese    | zh-CN | 43 MB      |
| Japanese   | ja-JP | 48 MB      |
| Italian    | it-IT | 39 MB      |

Models are downloaded automatically from alphacephei.com/vosk/models when you first use a language.
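To estimate how much disk space and bandwidth a set of desktop languages will need, the sizes from the table can be summed. This is a minimal sketch: `MODEL_SIZES_MB` and `totalModelSizeMB` are illustrative names, not part of the plugin API, and the sizes are copied from the table, so they are approximate.

```typescript
// Approximate Vosk model sizes in MB, copied from the Supported Languages table.
const MODEL_SIZES_MB: Record<string, number> = {
  "en-US": 40,
  "pt-BR": 31,
  "es-ES": 39,
  "fr-FR": 41,
  "de-DE": 45,
  "ru-RU": 45,
  "zh-CN": 43,
  "ja-JP": 48,
  "it-IT": 39,
};

// Hypothetical helper: sums the download size for the given language codes.
// Duplicate codes are counted once; unknown codes throw so typos surface early.
function totalModelSizeMB(codes: string[]): number {
  const seen: Record<string, boolean> = {};
  let total = 0;
  for (const code of codes) {
    if (seen[code]) continue;
    seen[code] = true;
    const size = MODEL_SIZES_MB[code];
    if (size === undefined) {
      throw new Error(`Unknown language code: ${code}`);
    }
    total += size;
  }
  return total;
}
```

For example, `totalModelSizeMB(["en-US", "pt-BR"])` adds the English and Portuguese entries from the table.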

Installation

Rust

Add the plugin to your Cargo.toml:

[dependencies]
tauri-plugin-stt = "0.1"

TypeScript

Install the JavaScript guest bindings:

npm install tauri-plugin-stt-api
# or
yarn add tauri-plugin-stt-api
# or
pnpm add tauri-plugin-stt-api

Setup

Register Plugin

In your Tauri app setup:

fn main() {
    tauri::Builder::default()
        .plugin(tauri_plugin_stt::init())
        .run(tauri::generate_context!())
        .expect("error while running application");
}

Permissions

Add permissions to your capabilities/default.json:

{
  "permissions": ["stt:default"]
}

For granular permissions, you can specify individual commands:

{
  "permissions": [
    "stt:allow-is-available",
    "stt:allow-get-supported-languages",
    "stt:allow-check-permission",
    "stt:allow-request-permission",
    "stt:allow-start-listening",
    "stt:allow-stop-listening",
    "stt:allow-register-listener",
    "stt:allow-remove-listener"
  ]
}

Vosk Library (Desktop Only)

The Vosk runtime library must be installed on your system:

macOS

# Download and install libvosk
curl -LO https://github.com/alphacep/vosk-api/releases/download/v0.3.42/vosk-osx-0.3.42.zip
unzip vosk-osx-0.3.42.zip
sudo cp vosk-osx-0.3.42/libvosk.dylib /usr/local/lib/

Linux

wget https://github.com/alphacep/vosk-api/releases/download/v0.3.42/vosk-linux-x86_64-0.3.42.zip
unzip vosk-linux-x86_64-0.3.42.zip
sudo cp vosk-linux-x86_64-0.3.42/libvosk.so /usr/local/lib/
sudo ldconfig

Windows

Download from GitHub Releases and add to PATH.

Usage

TypeScript API

import {
  isAvailable,
  getSupportedLanguages,
  startListening,
  stopListening,
  onResult,
  onStateChange,
  onError,
} from "tauri-plugin-stt-api";

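// Check if STT is available (reason is set when it is not)
const availability = await isAvailable();
if (!availability.available) console.warn("STT unavailable:", availability.reason);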

// Get supported languages (with installed status)
const languages = await getSupportedLanguages();

// Listen for results
const resultListener = await onResult(result => {
  console.log("Recognized:", result.transcript, result.isFinal);
});

// Listen for download progress (when model is being downloaded)
import { listen } from "@tauri-apps/api/event";
const downloadListener = await listen<{
  status: string;
  model: string;
  progress: number;
}>("stt://download-progress", event => {
  console.log(`${event.payload.status}: ${event.payload.progress}%`);
});

// Start listening
await startListening({
  language: "en-US",
  interimResults: true,
  continuous: true,
  // maxDuration and onDevice are supported by the guest SDK
});

// Stop listening
await stopListening();

Configuration Options

interface ListenConfig {
  language?: string; // Language code (e.g., "en-US", "pt-BR")
  interimResults?: boolean; // Return partial results while speaking
  continuous?: boolean; // Continue listening after utterance ends
  maxDuration?: number; // Max listening duration in milliseconds (0 = unlimited)
  onDevice?: boolean; // Prefer on-device recognition (iOS)
}
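Because every field is optional, it can help to normalize a config before passing it on. The sketch below is one way to do that. `normalizeListenConfig` is a hypothetical helper, not part of the plugin. The `false` defaults follow the API Reference, while the `"en-US"` language default is an assumption, since the README does not document one.

```typescript
// Local copy of the ListenConfig shape shown above.
interface ListenConfig {
  language?: string;
  interimResults?: boolean;
  continuous?: boolean;
  maxDuration?: number;
  onDevice?: boolean;
}

// Hypothetical helper: fills in defaults and sanitizes maxDuration.
function normalizeListenConfig(config: ListenConfig = {}): Required<ListenConfig> {
  const raw = config.maxDuration ?? 0;
  // Treat negative or non-finite durations as "unlimited" (0).
  const maxDuration = isFinite(raw) && raw > 0 ? Math.floor(raw) : 0;
  return {
    language: config.language ?? "en-US", // assumed default, not documented
    interimResults: config.interimResults ?? false,
    continuous: config.continuous ?? false,
    maxDuration,
    onDevice: config.onDevice ?? false,
  };
}
```

The result could then be passed straight to `startListening(normalizeListenConfig({ continuous: true }))`.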

Event Listeners

// Listen for results
const unlistenResult = await onResult(result => {
  console.log(result.transcript, result.isFinal);
});

// Listen for state changes
const unlistenState = await onStateChange(event => {
  console.log("State:", event.state); // "idle" | "listening" | "processing"
});

// Listen for errors
const unlistenError = await onError(error => {
  console.error(`[${error.code}] ${error.message}`);
});

// Clean up listeners
unlistenResult();
unlistenState();
unlistenError();

Events

| Event                   | Payload                              | Description                              |
| ----------------------- | ------------------------------------ | ---------------------------------------- |
| stt://result            | { transcript, isFinal, confidence? } | Recognition result                       |
| stt://state-change      | { state }                            | State change (idle/listening/processing) |
| stt://error             | { code, message, details? }          | Error event                              |
| stt://download-progress | { status, model, progress }          | Model download progress                  |
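The download-progress payload can be given a named type and turned into a status line for the UI. This is a sketch under assumptions: the field types match the `listen` call in the usage example, `progress` is treated as a percentage as that example prints it, and `formatDownloadProgress` is a hypothetical helper rather than something the plugin exports.

```typescript
// Payload shape for stt://download-progress, as used in the usage example.
interface DownloadProgressPayload {
  status: string;
  model: string;
  progress: number; // assumed to be a percentage
}

// Hypothetical helper: turns a progress payload into a one-line status string.
function formatDownloadProgress(p: DownloadProgressPayload): string {
  // Clamp and round in case a payload falls outside the 0-100 range.
  const percent = Math.min(100, Math.max(0, Math.round(p.progress)));
  return `${p.model}: ${p.status} (${percent}%)`;
}
```

Inside a `listen<DownloadProgressPayload>("stt://download-progress", ...)` handler, this would be called with `event.payload`.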

API Reference

startListening(config?: ListenConfig): Promise<void>

Start speech recognition.

Config Options:

  • language: Language code (e.g., "en-US", "pt-BR")
  • interimResults: Return partial results (default: false)
  • continuous: Continue listening after utterance ends (default: false)
  • maxDuration: Max listening duration in ms (0 = unlimited)
  • onDevice: Use on-device recognition (iOS only, default: false)

stopListening(): Promise<void>

Stop current speech recognition session.

isAvailable(): Promise<AvailabilityResponse>

Check if STT is available on the device.

Returns:

  • available: Whether STT is available
  • reason: Optional reason if unavailable

getSupportedLanguages(): Promise<SupportedLanguagesResponse>

Get list of supported languages.

Returns: Array of languages with:

  • code: Language code (e.g., "en-US")
  • name: Display name
  • installed: Whether model is installed (desktop only)
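A common use of this list is choosing a recognition language that matches the user's locale. The sketch below assumes the entries have been extracted into a plain array with the fields listed above (the exact shape of `SupportedLanguagesResponse` is not shown in this README). `LanguageInfo` and `pickLanguage` are illustrative names.

```typescript
// Shape of one entry, following the fields listed above.
interface LanguageInfo {
  code: string;
  name: string;
  installed?: boolean;
}

// Hypothetical helper: picks the best match for a preferred locale.
// Order: exact code, then same primary subtag (e.g. "pt"), then the fallback.
function pickLanguage(
  preferred: string,
  languages: LanguageInfo[],
  fallback: string = "en-US"
): string {
  const wanted = preferred.toLowerCase();
  const primary = wanted.split("-")[0];
  let partial: string | undefined;
  for (const lang of languages) {
    const code = lang.code.toLowerCase();
    if (code === wanted) return lang.code;
    if (partial === undefined && code.split("-")[0] === primary) {
      partial = lang.code;
    }
  }
  return partial ?? fallback;
}
```

The returned code can be used as the `language` option of `startListening`.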

checkPermission(): Promise<PermissionResponse>

Check current permission status.

Returns:

  • microphone: "granted" | "denied" | "unknown"
  • speechRecognition: "granted" | "denied" | "unknown"

requestPermission(): Promise<PermissionResponse>

Request microphone and speech recognition permissions.

Returns: Same as checkPermission()
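The two permission fields can be reduced to a single decision about what the app should do next. This is a minimal sketch: the type names and `nextPermissionStep` are hypothetical, and treating "denied" as something the user must fix in system settings is an assumption that may not hold on every platform.

```typescript
type SttPermissionState = "granted" | "denied" | "unknown";

// Mirrors the fields documented for PermissionResponse.
interface PermissionSnapshot {
  microphone: SttPermissionState;
  speechRecognition: SttPermissionState;
}

type NextStep = "ready" | "request" | "blocked";

// Hypothetical helper: decides what to do based on a permission response.
function nextPermissionStep(perm: PermissionSnapshot): NextStep {
  const states: SttPermissionState[] = [perm.microphone, perm.speechRecognition];
  // Assumption: a denied permission has to be re-enabled in system settings.
  if (states.indexOf("denied") !== -1) return "blocked";
  // Anything still unknown has not been asked for yet.
  if (states.indexOf("unknown") !== -1) return "request";
  return "ready";
}
```

An app might call `checkPermission()`, pass the result to this helper, and only call `requestPermission()` when it returns `"request"`.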

onResult(handler: (result: RecognitionResult) => void): Promise<UnlistenFn>

Listen for recognition results.

Result:

  • transcript: Recognized text
  • isFinal: Whether this is a final result
  • confidence: Confidence score (0.0-1.0, if available)
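When interim results are enabled, a handler usually needs to keep finalized text separate from the in-progress text. The sketch below shows one way to do that. `TranscriptBuffer` is a hypothetical helper, and it assumes each interim result carries the full partial transcript of the current utterance rather than only the newly recognized words.

```typescript
// Local copy of the result fields listed above.
interface RecognitionResult {
  transcript: string;
  isFinal: boolean;
  confidence?: number;
}

// Hypothetical helper: keeps finalized text and the latest interim text apart.
class TranscriptBuffer {
  private finals: string[] = [];
  private interim = "";

  push(result: RecognitionResult): void {
    const text = result.transcript.trim();
    if (result.isFinal) {
      if (text.length > 0) this.finals.push(text);
      this.interim = ""; // the utterance is complete
    } else {
      this.interim = text; // replace the previous interim text, never append
    }
  }

  // Finalized text followed by the in-progress interim text, if any.
  text(): string {
    const parts = this.interim ? this.finals.concat(this.interim) : this.finals;
    return parts.join(" ");
  }
}
```

A handler could then be registered as `onResult(result => buffer.push(result))`, with `buffer.text()` rendered after each call.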

onStateChange(handler: (event: StateChangeEvent) => void): Promise<UnlistenFn>

Listen for state changes.

States: "idle", "listening", "processing"

onError(handler: (error: SttError) => void): Promise<UnlistenFn>

Listen for errors.

Error Codes:

  • NOT_AVAILABLE: STT not available on device
  • PERMISSION_DENIED: Microphone permission denied
  • SPEECH_PERMISSION_DENIED: Speech recognition permission denied
  • NETWORK_ERROR: Network error (server-based recognition)
  • AUDIO_ERROR: Audio capture error
  • TIMEOUT: Recognition timeout
  • NO_SPEECH: No speech detected
  • LANGUAGE_NOT_SUPPORTED: Requested language not supported
  • CANCELLED: Recognition cancelled by user
  • ALREADY_LISTENING: Already in listening state
  • NOT_LISTENING: Not currently listening
  • BUSY: Recognizer busy
  • UNKNOWN: Unknown error
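An error handler often branches on these codes. The following sketch groups them into coarse actions. The grouping is an assumption made for illustration, not behaviour defined by the plugin, and `actionForError` and `ErrorAction` are hypothetical names.

```typescript
type ErrorAction = "retry" | "request-permission" | "ignore" | "report";

// Hypothetical helper: maps an error code from the list above to a coarse action.
function actionForError(code: string): ErrorAction {
  switch (code) {
    case "PERMISSION_DENIED":
    case "SPEECH_PERMISSION_DENIED":
      return "request-permission";
    case "NETWORK_ERROR":
    case "TIMEOUT":
    case "NO_SPEECH":
    case "BUSY":
      return "retry"; // assumed to be transient
    case "CANCELLED":
    case "ALREADY_LISTENING":
    case "NOT_LISTENING":
      return "ignore"; // assumed to be harmless state mismatches
    default:
      // NOT_AVAILABLE, AUDIO_ERROR, LANGUAGE_NOT_SUPPORTED, UNKNOWN, and any new codes
      return "report";
  }
}
```

Inside an `onError` handler this would be called as `actionForError(error.code)`.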

Building

Without STT (Default)

npm run dev

With STT

npm run dev -- --features stt
# or
npm run dev:stt

Troubleshooting

Desktop: "library 'vosk' not found"

Solution: Install the Vosk library as described in the Vosk Library section.

# macOS
ls /usr/local/lib/libvosk.dylib  # Should exist

# Linux
ldconfig -p | grep vosk  # Should show libvosk.so

# Windows
where vosk.dll  # Should be in PATH

Desktop: "Model not found" or automatic download fails

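Problem: Vosk models are downloaded automatically the first time each language is used, so this error usually means the download did not complete or the model folder is missing or misnamed.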

Solution:

  1. Ensure internet connectivity
  2. Check app data directory: ~/.local/share/tauri-plugin-stt/models/ (Linux/macOS) or %APPDATA%/tauri-plugin-stt/models/ (Windows)
  3. Manual download: Download from alphacephei.com/vosk/models and extract to models directory
  4. Model naming: Ensure folder name matches expected pattern (e.g., vosk-model-small-en-us-0.15)

Mobile: "Speech recognition not available"

iOS Solution:

  1. Ensure the device runs iOS 10 or later, which speech recognition requires
  2. Check Settings → Privacy → Speech Recognition → Enable for your app
  3. For on-device recognition, iOS 13+ is required

Android Solution:

  1. Install Google app (provides speech recognition service)
  2. Check Settings → Apps → Default apps → Digital assistant app
  3. Ensure internet connectivity for server-based recognition

Permission denied errors

Solution: Call requestPermission() before startListening()

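import { requestPermission, startListening } from "tauri-plugin-stt-api";

// Wrapped in a function so the early return is valid
async function startWithPermission(): Promise<void> {
  const perm = await requestPermission();
  if (perm.microphone !== "granted") {
    console.error("Microphone permission required");
    return;
  }
  await startListening();
}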

No audio input detected

Checklist:

  • ✅ Microphone is working in other apps
  • ✅ Correct microphone selected in system settings
  • ✅ Microphone not muted (hardware or software)
  • ✅ App has microphone permission
  • ✅ No other app is using the microphone exclusively

Interim results not showing

Note: Interim results availability varies by platform:

  • iOS/Android: Full support
  • Desktop (Vosk): Partial support (depends on model)

await startListening({
  interimResults: true, // Enable interim results
  continuous: true, // Keep listening
});

Recognition accuracy is low

Tips:

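  • Use the language code that matches your accent where the platform offers one (e.g., "en-GB" vs "en-US" on mobile); desktop is limited to the codes listed under Supported Languages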
  • Speak clearly and avoid background noise
  • On iOS, download enhanced voices in Settings → Accessibility → Spoken Content
  • Desktop: Use larger Vosk models for better accuracy (at cost of size)

"ALREADY_LISTENING" error

Solution: Stop current session before starting a new one:

try {
  await stopListening();
} catch (e) {
  // Ignore if not listening
}
await startListening();

Download progress events not firing

Note: Download progress events are only for desktop (Vosk models). Mobile uses native speech recognition without downloads.

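import { listen } from "@tauri-apps/api/event";

// Typing the payload lets TypeScript check the property accesses below
const unlisten = await listen<{ status: string; model: string; progress: number }>(
  "stt://download-progress",
  event => {
    console.log(`${event.payload.status}: ${event.payload.progress}%`);
  }
);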

Examples

See the examples/stt-example directory for a complete working demo with React + Material UI, featuring:

  • Real-time transcription with interim results
  • Language selection
  • Permission handling
  • Error handling with visual feedback
  • Download progress monitoring
  • Results history

Platform-Specific Notes

iOS

  • Requires iOS 10+ for basic speech recognition
  • iOS 13+ required for on-device recognition (onDevice: true)
  • Must add NSSpeechRecognitionUsageDescription to Info.plist
  • Must add NSMicrophoneUsageDescription to Info.plist

Android

  • Requires Android API 23+ (Android 6.0+)
  • Google app must be installed for speech recognition
  • Internet required for server-based recognition
  • Must request RECORD_AUDIO permission in AndroidManifest.xml

Desktop (Windows, macOS, Linux)

  • Requires Vosk library installation (see Vosk Library section)
  • Models downloaded automatically (roughly 31-48 MB per language; see Supported Languages)
  • Fully offline after model download
  • Models stored in app data directory

License

MIT