
react-native-sherpa-onnx v0.3.9

Offline speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD with sherpa-onnx for React Native


react-native-sherpa-onnx

React Native SDK for sherpa-onnx – offline and streaming speech processing


⚠️ SDK 0.3.0 – Breaking changes from 0.2.0
Since the last release I have restructured and improved the SDK significantly: full iOS support, smoother behaviour, fewer failure points, and a much smaller footprint (~95% size reduction). As a result, the logic and the public API have changed. If you are upgrading from 0.2.x, please follow the Breaking changes (upgrading to 0.3.0) section and the updated API documentation.

A React Native TurboModule that provides offline and streaming speech processing capabilities using sherpa-onnx. The SDK aims to support all functionalities that sherpa-onnx offers, including offline and online (streaming) speech-to-text, text-to-speech (batch and streaming), speaker diarization, speech enhancement, source separation, and VAD (Voice Activity Detection).

Installation

npm install react-native-sherpa-onnx

If your project uses Yarn (v3+) or Plug'n'Play, configure Yarn to use the Node Modules linker to avoid postinstall issues:

# .yarnrc.yml
nodeLinker: node-modules

Alternatively, set the environment variable during install:

YARN_NODE_LINKER=node-modules yarn install

Android

No additional setup required. The library automatically handles native dependencies via Gradle. For execution provider support (CPU, NNAPI, XNNPACK, QNN) and optional QNN setup, see Execution provider support. For building Android native libs yourself, see sherpa-onnx-prebuilt.

iOS

The sherpa-onnx XCFramework is not shipped in the repo or on npm (~80 MB). It is downloaded automatically when you run pod install; no manual steps are required. The version used is pinned in third_party/sherpa-onnx-prebuilt/IOS_RELEASE_TAG (format: sherpa-onnx-ios-vX.Y.Z, or sherpa-onnx-ios-vX.Y.Z-N with an optional build number), and the archive is fetched from GitHub Releases.
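For reference, the pinned tag format can be validated with a small parser. This is an illustrative sketch only; the tag format comes from this README, but the function itself is not part of the SDK:

```typescript
// Hypothetical helper: parse a pinned iOS release tag such as
// "sherpa-onnx-ios-v1.12.31" or "sherpa-onnx-ios-v1.12.31-2".
interface IosReleaseTag {
  version: string; // X.Y.Z
  build?: number;  // optional -N build number suffix
}

function parseIosReleaseTag(tag: string): IosReleaseTag | null {
  const m = /^sherpa-onnx-ios-v(\d+\.\d+\.\d+)(?:-(\d+))?$/.exec(tag.trim());
  if (!m) return null; // not a recognized tag
  return {
    version: m[1],
    build: m[2] !== undefined ? Number(m[2]) : undefined,
  };
}
```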

Setup

cd your-app/ios
bundle install
bundle exec pod install

The podspec runs scripts/setup-ios-framework.sh, which downloads the XCFramework (and, if needed, libarchive sources) so the Pod builds correctly. Libarchive is compiled from source as part of the Pod; its version is pinned in third_party/libarchive_prebuilt/IOS_RELEASE_TAG.

Building the iOS framework

To build the sherpa-onnx iOS XCFramework yourself (e.g. custom version or patches), see third_party/sherpa-onnx-prebuilt/README.md and the Framework - Sherpa-Onnx (iOS) Release workflow.

Model download (optional)

If you use the download manager to fetch models at runtime, add the following to your AppDelegate so background downloads can finish when the app is in the background or after it was terminated. Without it, downloads only work reliably while the app is in the foreground.

  • Swift (RN 0.77+): In your bridging header add #import <RNBackgroundDownloader.h>. In AppDelegate.swift, implement:
    func application(_ application: UIApplication, handleEventsForBackgroundURLSession identifier: String, completionHandler: @escaping () -> Void) {
      RNBackgroundDownloader.setCompletionHandlerWithIdentifier(identifier, completionHandler: completionHandler)
    }
  • Objective-C: In AppDelegate.m add #import <RNBackgroundDownloader.h> and the application:handleEventsForBackgroundURLSession:completionHandler: implementation that calls [RNBackgroundDownloader setCompletionHandlerWithIdentifier:identifier completionHandler:completionHandler].

Full step-by-step: Download manager – Setup (iOS & Android). Expo users can use the library’s config plugin to apply this automatically.

Android: Foreground service permissions (Play Console), visible download notifications, and POST_NOTIFICATIONS (API 33+) are covered in Download manager – Android: foreground service & notifications.


Bundled sherpa-onnx version

| Platform | Version |
|----------|---------|
| Android  | 1.12.31 |
| iOS      | 1.12.31 |

Feature Support

| Feature | Status | Docs | Notes |
|---------|--------|------|-------|
| Offline Speech-to-Text | ✅ Supported | STT | No internet required; multiple model types (Zipformer, Paraformer, Whisper, etc.). See Supported Model Types. |
| Online (streaming) Speech-to-Text | ✅ Supported | Streaming STT | Real-time recognition from microphone or stream; partial results, endpoint detection. Use streaming-capable models (e.g. transducer, paraformer). |
| Live capture API | ✅ Supported | PCM live stream | Native microphone capture with resampling for live transcription (use with streaming STT). |
| Text-to-Speech | ✅ Supported | TTS | Multiple model types (VITS, Matcha, Kokoro, etc.). See Supported Model Types. |
| Streaming Text-to-Speech | ✅ Supported | Streaming TTS | Incremental speech generation for low time-to-first-byte and playback while generating. |
| Execution providers (CPU, NNAPI, XNNPACK, Core ML, QNN) | ✅ Supported | Execution providers | CPU default; optional accelerators per platform. |
| Play Asset Delivery (PAD) | ✅ Supported | Model setup | Android only. Archives: Extraction API. |
| Automatic model type detection | ✅ Supported | Model detection | detectSttModel() and detectTtsModel() for a path. |
| Model quantization | ✅ Supported | Model setup | Automatic detection and preference for quantized (int8) models. |
| Flexible model loading | ✅ Supported | Model setup | Asset models, file system models, or auto-detection. |
| TypeScript | ✅ Supported | — | Full type definitions included. |
| Speech Enhancement | ❌ Not yet supported | Enhancement | Scheduled for release 0.4.0 |
| Speaker Diarization | ❌ Not yet supported | Diarization | Scheduled for release 0.5.0 |
| Source Separation | ❌ Not yet supported | Separation | Scheduled for release 0.6.0 |
| VAD (Voice Activity Detection) | ❌ Not yet supported | VAD | Scheduled for release 0.7.0 |

Platform Support Status

| Platform | Status | Notes |
|----------|--------|-------|
| Android | ✅ Production Ready | CI/CD automated, multiple models supported |
| iOS | ✅ Production Ready | CI/CD automated, multiple models supported |

Known issues

Supported Model Types

Speech-to-Text (STT) Models

| Model Type | modelType Value | Description | Download Links |
|------------|-----------------|-------------|----------------|
| Auto Detect | 'auto' | Automatically detects model layout/type from files in the model folder and picks the best supported STT type. | n/a |
| Zipformer/Transducer | 'transducer' | Encoder–decoder–joiner (e.g. icefall). Good balance of speed and accuracy. Folder name should contain zipformer or transducer for auto-detection. | Download |
| LSTM Transducer | 'transducer' | Same layout as Zipformer (encoder–decoder–joiner). LSTM-based streaming ASR; detected as transducer. Folder name may contain lstm. | Download |
| Paraformer | 'paraformer' | Single-model non-autoregressive ASR; fast and accurate. Detected by model.onnx; no folder token required. | Download |
| NeMo CTC | 'nemo_ctc' | NeMo CTC; good for English and streaming. Folder name should contain nemo or parakeet. | Download |
| Whisper | 'whisper' | Multilingual, encoder–decoder; strong zero-shot. Detected by encoder+decoder (no joiner); folder token optional. | Download |
| WeNet CTC | 'wenet_ctc' | CTC from WeNet; compact. Folder name should contain wenet. | Download |
| SenseVoice | 'sense_voice' | Multilingual with emotion/punctuation. Folder name should contain sense or sensevoice. | Download |
| FunASR Nano | 'funasr_nano' | Lightweight LLM-based ASR. Folder name should contain funasr or funasr-nano. | Download |
| Moonshine (v1) | 'moonshine' | Four-part streaming-capable ASR (preprocess, encode, uncached/cached decode). Folder name should contain moonshine. | Download |
| Moonshine (v2) | 'moonshine_v2' | Two-part Moonshine (encoder + merged decoder); .onnx or .ort. Folder name should contain moonshine (v2 preferred if both layouts present). | Download |
| Fire Red ASR | 'fire_red_asr' | Fire Red encoder–decoder ASR. Folder name should contain fire_red or fire-red. | Download |
| Dolphin | 'dolphin' | Single-model CTC. Folder name should contain dolphin. | Download |
| Canary | 'canary' | NeMo Canary multilingual. Folder name should contain canary. | Download |
| Omnilingual | 'omnilingual' | Omnilingual CTC. Folder name should contain omnilingual. | Download |
| MedASR | 'medasr' | Medical ASR CTC. Folder name should contain medasr. | Download |
| Telespeech CTC | 'telespeech_ctc' | Telespeech CTC. Folder name should contain telespeech. | Download |
| Tone CTC (t-one) | 'tone_ctc' | Lightweight streaming CTC (e.g. t-one). Folder name should contain t-one, t_one, or tone (as word). | Download |
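The folder-name conventions in the table above can be illustrated with a small heuristic. This is a partial sketch for illustration only, not the SDK's actual detector: it covers only the types with documented folder tokens, and types detected from the file layout (e.g. Paraformer, Whisper) fall through to null:

```typescript
// Illustrative sketch of folder-token matching from the STT model table.
// Token lists come from the "Folder name should contain …" notes; the
// ordering and matching strategy here are assumptions.
type SttModelType =
  | 'transducer' | 'nemo_ctc' | 'wenet_ctc' | 'sense_voice' | 'funasr_nano'
  | 'moonshine' | 'fire_red_asr' | 'dolphin' | 'canary' | 'omnilingual'
  | 'medasr' | 'telespeech_ctc';

const FOLDER_TOKENS: Array<[SttModelType, string[]]> = [
  ['transducer',     ['zipformer', 'transducer', 'lstm']],
  ['nemo_ctc',       ['nemo', 'parakeet']],
  ['wenet_ctc',      ['wenet']],
  ['sense_voice',    ['sense', 'sensevoice']],
  ['funasr_nano',    ['funasr']],
  ['moonshine',      ['moonshine']],
  ['fire_red_asr',   ['fire_red', 'fire-red']],
  ['dolphin',        ['dolphin']],
  ['canary',         ['canary']],
  ['omnilingual',    ['omnilingual']],
  ['medasr',         ['medasr']],
  ['telespeech_ctc', ['telespeech']],
];

function guessSttTypeFromFolder(folderName: string): SttModelType | null {
  const name = folderName.toLowerCase();
  for (const [type, tokens] of FOLDER_TOKENS) {
    if (tokens.some((t) => name.includes(t))) return type;
  }
  // No token match: fall back to file-layout detection (not shown).
  return null;
}
```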

For real-time (streaming) recognition from a microphone or audio stream, use streaming-capable model types: transducer, paraformer, zipformer2_ctc, nemo_ctc, or tone_ctc. See Streaming (Online) Speech-to-Text.
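Restating that constraint as code, here is a trivial guard based on the list above (the helper name is hypothetical, not an SDK API):

```typescript
// Streaming-capable STT model types, per the paragraph above.
const STREAMING_STT_TYPES = new Set([
  'transducer', 'paraformer', 'zipformer2_ctc', 'nemo_ctc', 'tone_ctc',
]);

// Hypothetical guard: check a modelType value before starting a live stream.
function isStreamingCapable(modelType: string): boolean {
  return STREAMING_STT_TYPES.has(modelType);
}
```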

Text-to-Speech (TTS) Models

| Model Type | modelType Value | Description | Download Links |
|------------|-----------------|-------------|----------------|
| Auto Detect | 'auto' | Automatically detects the TTS model layout from files in the model folder and selects the matching supported type. | n/a |
| VITS | 'vits' | Fast, high-quality TTS (Piper, Coqui, MeloTTS, MMS). Folder name should contain vits if used with other voice models. | Download |
| Matcha | 'matcha' | High-quality acoustic model + vocoder. Detected by acoustic_model + vocoder; no folder token required. | Download |
| Kokoro | 'kokoro' | Multi-speaker, multi-language. Folder name should contain kokoro (not kitten) for auto-detection. | Download |
| KittenTTS | 'kitten' | Lightweight, multi-speaker. Folder name should contain kitten (not kokoro) for auto-detection. | Download |
| Zipvoice | 'zipvoice' | Standard TTS with sid. Voice cloning (reference audio + referenceText): batch via generateSpeech only; streaming TTS does not support reference audio for Zipvoice. Default numSteps when omitted is 5 on Android and iOS (matches sherpa-onnx GenerationConfig / Kotlin helper). Cloning is supported on Android & iOS. Encoder + decoder + vocoder. | Download |
| Pocket | 'pocket' | Flow-matching TTS. Voice cloning on Android: batch and streaming TTS. iOS: cloning is experimental. Detected by lm_flow, lm_main, text_conditioner, vocab/token_scores. | Download |
| Supertonic | 'supertonic' | Lightning-fast, on-device text-to-speech system designed for extreme performance with minimal computational overhead. | Download |
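The detection cues in the TTS table can be illustrated the same way. This sketch is an assumption, not the SDK's detector: it encodes only the file-layout cues for Pocket and Matcha and the folder tokens for VITS/Kokoro/KittenTTS, in a priority order I chose for illustration:

```typescript
// Illustrative TTS layout detection, based on the cues in the table above.
// guessTtsType and its priority order are hypothetical.
function guessTtsType(folderName: string, files: string[]): string | null {
  const names = files.map((f) => f.toLowerCase());
  const has = (token: string) => names.some((n) => n.includes(token));

  // Pocket: detected by its lm_flow / lm_main / text_conditioner parts.
  if (has('lm_flow') && has('lm_main') && has('text_conditioner')) return 'pocket';
  // Matcha: detected by acoustic_model + vocoder; no folder token required.
  if (has('acoustic_model') && has('vocoder')) return 'matcha';

  // Remaining types rely on folder-name tokens.
  const folder = folderName.toLowerCase();
  if (folder.includes('kokoro')) return 'kokoro';
  if (folder.includes('kitten')) return 'kitten';
  if (folder.includes('vits')) return 'vits';
  return null; // unknown layout
}
```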

For streaming TTS (incremental generation, low latency), use createStreamingTTS() with supported model types. See Streaming Text-to-Speech.

Documentation

Note: For when to use listAssetModels() vs listModelsAtPath() and how to combine bundled and PAD/file-based models, see Model Setup.

Requirements

  • React Native >= 0.70
  • Android API 24+ (Android 7.0+)
  • iOS 13.0+

Example Apps

We provide example applications to help you get started with react-native-sherpa-onnx:

Example App (Audio to Text)

The example app included in this repository demonstrates audio-to-text transcription, text-to-speech, and streaming features. It includes:

  • Multiple model type support (Zipformer, Paraformer, NeMo CTC, Whisper, WeNet CTC, SenseVoice, FunASR Nano, Moonshine, and more)
  • Model selection and configuration
  • Offline audio file transcription
  • Online (streaming) STT – live transcription from the microphone with partial results
  • Streaming TTS – incremental speech generation and playback
  • Test audio files for different languages

Getting started:

cd example
yarn install
yarn android  # or yarn ios

Video to Text Comparison App

A comprehensive comparison app that demonstrates video-to-text transcription using react-native-sherpa-onnx alongside other speech-to-text solutions:

Repository: mobile-videototext-comparison

Features:

  • Video to audio conversion (using native APIs)
  • Audio to text transcription
  • Video to text (video --> WAV --> text)
  • Comparison between different STT providers
  • Performance benchmarking

This app showcases how to integrate react-native-sherpa-onnx into a real-world application that processes video files and converts them to text.

Contributing

License

MIT

Third-Party Libraries

This SDK includes the following open source components:

Full license texts are available in the THIRD_PARTY_LICENSES directory.

LGPL Notice

This SDK includes LGPL-licensed components such as FFmpeg and Shine.
Applications using this SDK must ensure compliance with LGPL requirements when distributing binaries.

FFmpeg source code can be obtained at: https://ffmpeg.org

Qualcomm QNN Support

This SDK supports optional integration with Qualcomm AI Runtime (QNN).

QNN is proprietary software provided by Qualcomm and is not included in this SDK.
To use QNN acceleration, users must obtain and include the required QNN libraries separately and comply with Qualcomm's license terms:

https://softwarecenter.qualcomm.com/

Responsibility

By using this SDK, you are responsible for complying with all third-party licenses included in this project.


Made with create-react-native-library