tauri-plugin-stt-api

v0.2.0

Speech-to-text recognition API for Tauri with multi-language support

Downloads

20,026

Tauri Plugin STT (Speech-to-Text)

Cross-platform speech recognition plugin for Tauri 2.x. Desktop targets (Windows, macOS, Linux) use whisper.cpp via whisper-rs; mobile targets delegate to the native OS engines (SFSpeechRecognizer on iOS, SpeechRecognizer on Android).

Highlights

  • One model, 99 languages — Whisper is multilingual; users download a single GGML model and it works for English, Portuguese, Mandarin, …
  • No native runtime to ship: whisper-rs builds whisper.cpp statically; there is no libvosk.so / .dylib to install separately.
  • Explicit model lifecycle — the host app controls when (and whether) a model is downloaded. start_listening returns ModelNotInstalled instead of silently pulling hundreds of MB.
  • Hardware acceleration — opt-in metal / cuda / vulkan features map straight to the matching whisper.cpp backend.

Platform Matrix

| Platform | Engine                                   | Model |
| -------- | ---------------------------------------- | ----- |
| iOS      | SFSpeechRecognizer (Speech.framework)    | OS    |
| Android  | SpeechRecognizer                         | OS    |
| macOS    | whisper.cpp via whisper-rs (Metal opt.)  | GGML  |
| Windows  | whisper.cpp via whisper-rs (CUDA opt.)   | GGML  |
| Linux    | whisper.cpp via whisper-rs (Vulkan opt.) | GGML  |

Installation

[dependencies]
tauri-plugin-stt = { version = "0.2", features = ["metal"] } # macOS
# or "cuda" / "vulkan" — omit for plain CPU inference

Register the plugin, which provides the four model-management commands (list_models, install_model, remove_model, set_active_model):

fn main() {
    tauri::Builder::default()
        // Register the speech-to-text plugin with the app.
        .plugin(tauri_plugin_stt::init())
        .run(tauri::generate_context!())
        .expect("error while running tauri application");
}

Capability:

{ "permissions": ["stt:default"] }

Model Catalogue

| id       | display  | size   | tier          |
| -------- | -------- | ------ | ------------- |
| tiny     | Tiny     | 75 MB  | fastest       |
| base     | Base     | 142 MB | balanced ⭐   |
| small    | Small    | 466 MB | accurate      |
| medium   | Medium   | 1.5 GB | very accurate |
| large-v3 | Large v3 | 3.0 GB | most accurate |

Files are fetched from https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-<id>.bin and stored under <app_data_dir>/whisper-models/. The active selection is persisted to whisper-models/active.txt.
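The URL and storage layout described above can be sketched in a few lines of Rust. This is an illustration, not the plugin's own code: `ModelInfo`, `CATALOGUE`, `find_model`, `model_url`, `model_path` and `active_marker_path` are hypothetical names, and the assumption that the local file keeps the remote `ggml-<id>.bin` name is not confirmed by this README.

```rust
use std::path::{Path, PathBuf};

/// One row of the catalogue table. Sizes are the approximate figures
/// quoted in the table, expressed in megabytes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ModelInfo {
    pub id: &'static str,
    pub display: &'static str,
    pub approx_size_mb: u32,
}

pub static CATALOGUE: [ModelInfo; 5] = [
    ModelInfo { id: "tiny", display: "Tiny", approx_size_mb: 75 },
    ModelInfo { id: "base", display: "Base", approx_size_mb: 142 },
    ModelInfo { id: "small", display: "Small", approx_size_mb: 466 },
    ModelInfo { id: "medium", display: "Medium", approx_size_mb: 1_500 },
    ModelInfo { id: "large-v3", display: "Large v3", approx_size_mb: 3_000 },
];

const BASE_URL: &str = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main";

/// Look up a catalogue entry by its id.
pub fn find_model(id: &str) -> Option<&'static ModelInfo> {
    CATALOGUE.iter().find(|m| m.id == id)
}

/// Download URL following the `ggml-<id>.bin` pattern.
pub fn model_url(id: &str) -> String {
    format!("{}/ggml-{}.bin", BASE_URL, id)
}

/// Local path under `<app_data_dir>/whisper-models/`.
/// ASSUMPTION: the local file name mirrors the remote one.
pub fn model_path(app_data_dir: &Path, id: &str) -> PathBuf {
    app_data_dir.join("whisper-models").join(format!("ggml-{}.bin", id))
}

/// Location of the file that persists the active selection.
pub fn active_marker_path(app_data_dir: &Path) -> PathBuf {
    app_data_dir.join("whisper-models").join("active.txt")
}
```

Calling `model_url("base")` yields the `ggml-base.bin` URL on Hugging Face, which a host app could use to show the user what will be downloaded before calling install_model.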

Commands

  • list_models() returns { models, active, total_disk_bytes }
  • install_model(id) — downloads the model, emits stt://download-progress
  • remove_model(id) — deletes the file and clears the active marker if needed
  • set_active_model(id) — picks which installed model start_listening loads
  • start_listening({ language?, max_duration? }) — push-to-talk session
  • stop_listening() — runs Whisper over the captured audio and emits one final result
  • is_available() — reports available: true only when a model is installed
  • get_supported_languages() — curated list of UI-facing locales
  • check_permission() / request_permission() — microphone permission helpers
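To make the explicit model lifecycle concrete, here is a toy in-memory model of how the commands above relate to each other. It is a sketch under stated assumptions, not the plugin's implementation: `ModelRegistry` and `SttError` are hypothetical types, only the `ModelNotInstalled` name comes from this README, and the rule that start_listening fails when no active model is set is an assumption.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
pub enum SttError {
    /// Named after the error the README says start_listening returns.
    ModelNotInstalled,
}

/// Hypothetical stand-in for the plugin's model state.
#[derive(Default)]
pub struct ModelRegistry {
    installed: BTreeSet<String>,
    active: Option<String>,
}

impl ModelRegistry {
    /// Stands in for install_model: the real command downloads a file.
    pub fn install_model(&mut self, id: &str) {
        self.installed.insert(id.to_string());
    }

    /// Removes the model and clears the active marker if it pointed at it.
    pub fn remove_model(&mut self, id: &str) {
        self.installed.remove(id);
        if self.active.as_deref() == Some(id) {
            self.active = None;
        }
    }

    /// Only an installed model can become the active one.
    pub fn set_active_model(&mut self, id: &str) -> Result<(), SttError> {
        if !self.installed.contains(id) {
            return Err(SttError::ModelNotInstalled);
        }
        self.active = Some(id.to_string());
        Ok(())
    }

    /// Mirrors is_available: true only when a model is installed.
    pub fn is_available(&self) -> bool {
        !self.installed.is_empty()
    }

    /// ASSUMPTION: listening requires an active model and never
    /// triggers a download on its own.
    pub fn start_listening(&self) -> Result<(), SttError> {
        match self.active {
            Some(_) => Ok(()),
            None => Err(SttError::ModelNotInstalled),
        }
    }
}
```

The point of the design is that a failed start_listening is cheap and recoverable: the host app can catch the error, offer the download, then retry.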

Events

  • stt://download-progress, payload { status, modelId, model, progress, downloaded?, total? }
  • stt://result, payload { transcript, isFinal, confidence }
  • stt://error, payload { code, message }
  • plugin:stt:result — same payload as stt://result (legacy listener channel)
  • plugin:stt:stateChange, payload { state, isAvailable, language }
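The channels and payloads listed above can be modelled as one Rust enum, which may help when writing a typed dispatcher. This is a hypothetical sketch: `SttEvent` is not a type exported by the plugin, every field type is an assumption (the README gives names only), and the `model` field of the download event is left out because its shape is not stated.

```rust
/// Hypothetical typed view of the events listed in this README.
/// All field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum SttEvent {
    DownloadProgress {
        status: String,
        model_id: String,
        progress: f64,
        downloaded: Option<u64>,
        total: Option<u64>,
    },
    Result { transcript: String, is_final: bool, confidence: f32 },
    Error { code: String, message: String },
    StateChange { state: String, is_available: bool, language: String },
}

impl SttEvent {
    /// Primary channel name for each event, as listed in the README.
    pub fn channel(&self) -> &'static str {
        match self {
            SttEvent::DownloadProgress { .. } => "stt://download-progress",
            SttEvent::Result { .. } => "stt://result",
            SttEvent::Error { .. } => "stt://error",
            SttEvent::StateChange { .. } => "plugin:stt:stateChange",
        }
    }

    /// Only the result event is mirrored on a legacy listener channel.
    pub fn legacy_channel(&self) -> Option<&'static str> {
        match self {
            SttEvent::Result { .. } => Some("plugin:stt:result"),
            _ => None,
        }
    }
}
```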

Behaviour Notes

  • Whisper is not a streaming recogniser. The plugin buffers audio while recording and runs a single pass on stop_listening. UX is push-to-talk.
  • Audio is captured at the device default rate, downmixed to mono, then resampled to 16 kHz by nearest-neighbour sample selection. Whisper is robust enough that a high-quality resampler buys nothing measurable.
  • Inference uses min(available_parallelism(), 4) threads — beyond that whisper.cpp shows diminishing returns and we want headroom for the UI.
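The audio preparation and thread cap described in these notes can be sketched as follows. This is an illustration under assumptions, not the plugin's source: the function names are hypothetical, the downmix is assumed to be a plain per-frame average, and whether the plugin rounds or truncates when picking the nearest sample is not stated (this sketch rounds).

```rust
use std::thread;

/// Average interleaved channels into a single mono channel.
/// ASSUMPTION: plain averaging; the plugin may weight channels differently.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    if channels <= 1 {
        return interleaved.to_vec();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Nearest-neighbour resampling: each output sample copies the input
/// sample closest to its position on the input timeline.
pub fn resample_nearest(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    let (from, to) = (from_hz as u64, to_hz as u64);
    let out_len = (input.len() as u64 * to / from) as usize;
    (0..out_len)
        .map(|i| {
            // Rounded source index, clamped to the last valid sample.
            let src = ((i as u64 * from + to / 2) / to) as usize;
            input[src.min(input.len() - 1)]
        })
        .collect()
}

/// min(available_parallelism(), 4), falling back to 1 if the
/// core count cannot be queried.
pub fn inference_threads() -> usize {
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    cores.min(4)
}
```

For a 48 kHz stereo capture, a host-side equivalent would call `downmix_to_mono(&samples, 2)` and then `resample_nearest(&mono, 48_000, 16_000)`, which keeps every third sample.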

Mobile

The mobile bridges expose the same JS API surface but list_models returns an empty list and install_model / remove_model / set_active_model are no-ops: the OS engine has no downloadable model concept. Use is_available to gate UI; on iOS / Android it reflects actual recognizer availability.

License

MIT.