react-native-sherpa-onnx
v0.3.9
React Native SDK for sherpa-onnx – offline and streaming speech processing
⚠️ SDK 0.3.0 – Breaking changes from 0.2.0
Since the last release I have restructured and improved the SDK significantly: full iOS support, smoother behaviour, fewer failure points, and a much smaller footprint (~95% size reduction). As a result, internal logic and the public API have changed. If you are upgrading from 0.2.x, please follow the Breaking changes (upgrading to 0.3.0) section and the updated API documentation.
A React Native TurboModule that provides offline and streaming speech processing capabilities using sherpa-onnx. The SDK aims to support all functionalities that sherpa-onnx offers, including offline and online (streaming) speech-to-text, text-to-speech (batch and streaming), speaker diarization, speech enhancement, source separation, and VAD (Voice Activity Detection).
Installation
```sh
npm install react-native-sherpa-onnx
```

If your project uses Yarn (v3+) or Plug'n'Play, configure Yarn to use the Node Modules linker to avoid postinstall issues:

```yaml
# .yarnrc.yml
nodeLinker: node-modules
```

Alternatively, set the environment variable during install:

```sh
YARN_NODE_LINKER=node-modules yarn install
```

Android
No additional setup required. The library automatically handles native dependencies via Gradle. For execution provider support (CPU, NNAPI, XNNPACK, QNN) and optional QNN setup, see Execution provider support. For building Android native libs yourself, see sherpa-onnx-prebuilt.
iOS
The sherpa-onnx XCFramework is not shipped in the repo or npm (size ~80MB). It is downloaded automatically when you run pod install; no manual steps are required. The version used is pinned in third_party/sherpa-onnx-prebuilt/IOS_RELEASE_TAG (format: sherpa-onnx-ios-vX.Y.Z or sherpa-onnx-ios-vX.Y.Z-N with optional build number) and the archive is fetched from GitHub Releases.
Setup
```sh
cd your-app/ios
bundle install
bundle exec pod install
```

The podspec runs scripts/setup-ios-framework.sh, which downloads the XCFramework (and, if needed, libarchive sources) so the Pod builds correctly. Libarchive is compiled from source as part of the Pod; its version is pinned in third_party/libarchive_prebuilt/IOS_RELEASE_TAG.
Building the iOS framework
To build the sherpa-onnx iOS XCFramework yourself (e.g. custom version or patches), see third_party/sherpa-onnx-prebuilt/README.md and the Framework - Sherpa-Onnx (iOS) Release workflow.
Model download (optional)
If you use the download manager to fetch models at runtime, add the following to your AppDelegate so background downloads can finish when the app is in the background or after it was terminated. Without it, downloads only work reliably while the app is in the foreground.
- Swift (RN 0.77+): In your bridging header, add `#import <RNBackgroundDownloader.h>`. In AppDelegate.swift, implement:

```swift
func application(_ application: UIApplication,
                 handleEventsForBackgroundURLSession identifier: String,
                 completionHandler: @escaping () -> Void) {
  RNBackgroundDownloader.setCompletionHandlerWithIdentifier(identifier, completionHandler: completionHandler)
}
```

- Objective-C: In AppDelegate.m, add `#import <RNBackgroundDownloader.h>` and implement `application:handleEventsForBackgroundURLSession:completionHandler:` so that it calls `[RNBackgroundDownloader setCompletionHandlerWithIdentifier:identifier completionHandler:completionHandler]`.
Full step-by-step: Download manager – Setup (iOS & Android). Expo users can use the library’s config plugin to apply this automatically.
Android: Foreground service permissions (Play Console), visible download notifications, and POST_NOTIFICATIONS (API 33+) are covered in Download manager – Android: foreground service & notifications.
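With the native setup above in place, runtime model downloads are driven from JavaScript. The sketch below is entirely illustrative: the download manager's real function names, options, and progress payloads are documented in the linked guide and may differ from what is assumed here.

```typescript
// Illustrative shapes only; the SDK's actual download manager API may differ.
interface DownloadProgress {
  bytesWritten: number;
  totalBytes: number;
}

interface DownloadManager {
  // Resolves with the path of the downloaded file.
  download(
    url: string,
    destDir: string,
    onProgress: (p: DownloadProgress) => void,
  ): Promise<string>;
}

// Fetch a model archive once and report coarse progress percentages,
// e.g. to drive a progress bar in the UI.
async function fetchModel(
  dm: DownloadManager,
  url: string,
  destDir: string,
  report: (percent: number) => void,
): Promise<string> {
  return dm.download(url, destDir, ({ bytesWritten, totalBytes }) => {
    if (totalBytes > 0) report(Math.round((100 * bytesWritten) / totalBytes));
  });
}
```

After downloading, archives can be unpacked with the Extraction API and the resulting folder passed to model loading (see Model Setup).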
Table of contents
- Bundled sherpa-onnx version
- Installation
- Feature Support
- Platform Support Status
- Known issues
- Supported Model Types
- Documentation
- Requirements
- Breaking changes (upgrading to 0.3.0)
- Example Apps
- Contributing
- License
Bundled sherpa-onnx version
| Platform | Version |
|----------|---------|
| Android  | 1.12.31 |
| iOS      | 1.12.31 |
Feature Support
| Feature | Status | Docs | Notes |
|---------|--------|------|-------|
| Offline Speech-to-Text | ✅ Supported | STT | No internet required; multiple model types (Zipformer, Paraformer, Whisper, etc.). See Supported Model Types. |
| Online (streaming) Speech-to-Text | ✅ Supported | Streaming STT | Real-time recognition from microphone or stream; partial results, endpoint detection. Use streaming-capable models (e.g. transducer, paraformer). |
| Live capture API | ✅ Supported | PCM live stream | Native microphone capture with resampling for live transcription (use with streaming STT). |
| Text-to-Speech | ✅ Supported | TTS | Multiple model types (VITS, Matcha, Kokoro, etc.). See Supported Model Types. |
| Streaming Text-to-Speech | ✅ Supported | Streaming TTS | Incremental speech generation for low time-to-first-byte and playback while generating. |
| Execution providers (CPU, NNAPI, XNNPACK, Core ML, QNN) | ✅ Supported | Execution providers | CPU default; optional accelerators per platform. |
| Play Asset Delivery (PAD) | ✅ Supported | Model setup | Android only. Archives: Extraction API. |
| Automatic Model type detection | ✅ Supported | Model detection | detectSttModel() and detectTtsModel() for a path. |
| Model quantization | ✅ Supported | Model setup | Automatic detection and preference for quantized (int8) models. |
| Flexible model loading | ✅ Supported | Model setup | Asset models, file system models, or auto-detection. |
| TypeScript | ✅ Supported | — | Full type definitions included. |
| Speech Enhancement | ❌ Not yet supported | Enhancement | Scheduled for release 0.4.0 |
| Speaker Diarization | ❌ Not yet supported | Diarization | Scheduled for release 0.5.0 |
| Source Separation | ❌ Not yet supported | Separation | Scheduled for release 0.6.0 |
| VAD (Voice Activity Detection) | ❌ Not yet supported | VAD | Scheduled for release 0.7.0 |
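As a rough picture of how the offline STT feature above fits together, here is a hedged sketch. Only detectSttModel() and the 'auto' model type are named in this README; the engine factory, transcribeFile, and release names below are illustrative assumptions, not the SDK's documented API.

```typescript
// Hypothetical shapes for illustration; consult the STT docs for real names.
type SttModelType = 'auto' | 'transducer' | 'paraformer' | 'whisper';

interface SttEngine {
  transcribeFile(path: string): Promise<{ text: string }>;
  release(): Promise<void>;
}

// Typical offline flow: point the SDK at a model folder, let it detect the
// model type ('auto'), transcribe one file, then free native resources.
async function transcribeOnce(
  createEngine: (opts: { modelDir: string; modelType: SttModelType }) => Promise<SttEngine>,
  modelDir: string,
  wavPath: string,
): Promise<string> {
  const engine = await createEngine({ modelDir, modelType: 'auto' });
  try {
    const result = await engine.transcribeFile(wavPath);
    return result.text;
  } finally {
    await engine.release(); // native models hold significant memory
  }
}
```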
Platform Support Status
| Platform | Status | Notes |
|----------|--------|-------|
| Android  | ✅ Production Ready | CI/CD automated, multiple models supported |
| iOS      | ✅ Production Ready | CI/CD automated, multiple models supported |
Known issues
- Pocket TTS voice cloning: supported on Android; experimental on iOS. Known quirks: heuristic end-of-speech (EOS) detection, and output drift between iOS and Android (length/quality). This is not specific to React Native. Full notes: investigation doc.
Supported Model Types
Speech-to-Text (STT) Models
| Model Type | modelType Value | Description | Download Links |
| ------------------------ | ----------------- | ---------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ |
| Auto Detect | 'auto' | Automatically detects model layout/type from files in the model folder and picks the best supported STT type. | n/a |
| Zipformer/Transducer | 'transducer' | Encoder–decoder–joiner (e.g. icefall). Good balance of speed and accuracy. Folder name should contain zipformer or transducer for auto-detection. | Download |
| LSTM Transducer | 'transducer' | Same layout as Zipformer (encoder–decoder–joiner). LSTM-based streaming ASR; detected as transducer. Folder name may contain lstm. | Download |
| Paraformer | 'paraformer' | Single-model non-autoregressive ASR; fast and accurate. Detected by model.onnx; no folder token required. | Download |
| NeMo CTC | 'nemo_ctc' | NeMo CTC; good for English and streaming. Folder name should contain nemo or parakeet. | Download |
| Whisper | 'whisper' | Multilingual, encoder–decoder; strong zero-shot. Detected by encoder+decoder (no joiner); folder token optional. | Download |
| WeNet CTC | 'wenet_ctc' | CTC from WeNet; compact. Folder name should contain wenet. | Download |
| SenseVoice | 'sense_voice' | Multilingual with emotion/punctuation. Folder name should contain sense or sensevoice. | Download |
| FunASR Nano | 'funasr_nano' | Lightweight LLM-based ASR. Folder name should contain funasr or funasr-nano. | Download |
| Moonshine (v1) | 'moonshine' | Four-part streaming-capable ASR (preprocess, encode, uncached/cached decode). Folder name should contain moonshine. | Download |
| Moonshine (v2) | 'moonshine_v2' | Two-part Moonshine (encoder + merged decoder); .onnx or .ort. Folder name should contain moonshine (v2 preferred if both layouts present). | Download |
| Fire Red ASR | 'fire_red_asr' | Fire Red encoder–decoder ASR. Folder name should contain fire_red or fire-red. | Download |
| Dolphin | 'dolphin' | Single-model CTC. Folder name should contain dolphin. | Download |
| Canary | 'canary' | NeMo Canary multilingual. Folder name should contain canary. | Download |
| Omnilingual | 'omnilingual' | Omnilingual CTC. Folder name should contain omnilingual. | Download |
| MedASR | 'medasr' | Medical ASR CTC. Folder name should contain medasr. | Download |
| Telespeech CTC | 'telespeech_ctc' | Telespeech CTC. Folder name should contain telespeech. | Download |
| Tone CTC (t-one) | 'tone_ctc' | Lightweight streaming CTC (e.g. t-one). Folder name should contain t-one, t_one, or tone (as word). | Download |
For real-time (streaming) recognition from a microphone or audio stream, use streaming-capable model types: transducer, paraformer, zipformer2_ctc, nemo_ctc, or tone_ctc. See Streaming (Online) Speech-to-Text.
Text-to-Speech (TTS) Models
| Model Type | modelType Value | Description | Download Links |
| ---------------- | ----------------- | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
| Auto Detect | 'auto' | Automatically detects the TTS model layout from files in the model folder and selects the matching supported type. | n/a |
| VITS | 'vits' | Fast, high-quality TTS (Piper, Coqui, MeloTTS, MMS). Folder name should contain vits if used with other voice models. | Download |
| Matcha | 'matcha' | High-quality acoustic model + vocoder. Detected by acoustic_model + vocoder; no folder token required. | Download |
| Kokoro | 'kokoro' | Multi-speaker, multi-language. Folder name should contain kokoro (not kitten) for auto-detection. | Download |
| KittenTTS | 'kitten' | Lightweight, multi-speaker. Folder name should contain kitten (not kokoro) for auto-detection. | Download |
| Zipvoice | 'zipvoice' | Standard TTS with sid. Voice cloning (reference audio + referenceText): batch via generateSpeech only—streaming TTS does not support reference audio for Zipvoice. Default numSteps when omitted is 5 on Android and iOS (matches sherpa-onnx GenerationConfig / Kotlin helper). Cloning is supported on Android & iOS. Encoder + decoder + vocoder. | Download |
| Pocket | 'pocket' | Flow-matching TTS. Voice cloning on Android: batch and streaming TTS. iOS: cloning is experimental. Detected by lm_flow, lm_main, text_conditioner, vocab/token_scores. | Download |
| Supertonic | 'supertonic' | Lightning-fast, on-device text-to-speech system designed for extreme performance with minimal computational overhead. | Download |
For streaming TTS (incremental generation, low latency), use createStreamingTTS() with supported model types. See Streaming Text-to-Speech.
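The incremental-generation idea can be sketched like this. createStreamingTTS() is named in this README, but the option and callback shapes below are assumptions for illustration; the Streaming TTS docs define the real API.

```typescript
// Assumed chunk shape; the SDK's actual payload may differ.
interface StreamingTtsChunk {
  samples: Float32Array;
  sampleRate: number;
}

interface StreamingTts {
  generate(text: string, onChunk: (c: StreamingTtsChunk) => void): Promise<void>;
  release(): Promise<void>;
}

// Hand each chunk to an audio player as soon as it arrives, so playback
// starts on the first chunk instead of waiting for the full utterance
// (this is what keeps time-to-first-byte low).
async function synthesizeIncrementally(
  tts: StreamingTts,
  text: string,
  play: (c: StreamingTtsChunk) => void,
): Promise<number> {
  let totalSamples = 0;
  await tts.generate(text, (chunk) => {
    totalSamples += chunk.samples.length;
    play(chunk);
  });
  return totalSamples;
}
```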
Documentation
- Known issues – SDK-facing notes (e.g. Pocket TTS cloning / cross-platform behavior)
- Speech-to-Text (STT) – Offline transcription (file or samples)
- Streaming (Online) Speech-to-Text – Real-time recognition, partial results, endpoint detection
- PCM Live Stream – Native microphone capture with resampling for live transcription (use with streaming STT)
- Text-to-Speech (TTS) – Offline and streaming generation
- Streaming Text-to-Speech – Incremental TTS (createStreamingTTS)
- Execution provider support (QNN, NNAPI, XNNPACK, Core ML) – Checking and using acceleration backends
- Voice Activity Detection (VAD)
- Speaker Diarization
- Speech Enhancement
- Source Separation
- Model Setup – Bundled assets, Play Asset Delivery (PAD), model discovery APIs, and troubleshooting
- Model Download Manager
- Extraction API
- Disable FFMPEG
- Disable LIBARCHIVE
Note: For when to use listAssetModels() vs listModelsAtPath() and how to combine bundled and PAD/file-based models, see Model Setup.
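Combining the two discovery calls can look roughly like this. listAssetModels(), listModelsAtPath(), and detectSttModel() are named in this README, but their return shapes below are assumptions; Model Setup documents the real signatures.

```typescript
// Assumed return shapes for the discovery APIs named in this README.
interface ModelApi {
  listAssetModels(): Promise<string[]>;              // models bundled in the app
  listModelsAtPath(dir: string): Promise<string[]>;  // downloaded / PAD models
  detectSttModel(dir: string): Promise<{ modelType: string } | null>;
}

// Merge bundled and file-system models, keeping only folders the SDK
// recognizes as a supported STT layout.
async function findUsableSttModels(api: ModelApi, downloadDir: string): Promise<string[]> {
  const candidates = [
    ...(await api.listAssetModels()),
    ...(await api.listModelsAtPath(downloadDir)),
  ];
  const usable: string[] = [];
  for (const dir of candidates) {
    if (await api.detectSttModel(dir)) usable.push(dir);
  }
  return usable;
}
```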
Requirements
- React Native >= 0.70
- Android API 24+ (Android 7.0+)
- iOS 13.0+
Example Apps
We provide example applications to help you get started with react-native-sherpa-onnx:
Example App (Audio to Text)
The example app included in this repository demonstrates audio-to-text transcription, text-to-speech, and streaming features. It includes:
- Multiple model type support (Zipformer, Paraformer, NeMo CTC, Whisper, WeNet CTC, SenseVoice, FunASR Nano, Moonshine, and more)
- Model selection and configuration
- Offline audio file transcription
- Online (streaming) STT – live transcription from the microphone with partial results
- Streaming TTS – incremental speech generation and playback
- Test audio files for different languages
Getting started:
```sh
cd example
yarn install
yarn android # or yarn ios
```

Video to Text Comparison App
A comprehensive comparison app that demonstrates video-to-text transcription using react-native-sherpa-onnx alongside other speech-to-text solutions:
Repository: mobile-videototext-comparison
Features:
- Video to audio conversion (using native APIs)
- Audio to text transcription
- Video to text (video --> WAV --> text)
- Comparison between different STT providers
- Performance benchmarking
This app showcases how to integrate react-native-sherpa-onnx into a real-world application that processes video files and converts them to text.
Contributing
License
MIT
Third-Party Libraries
This SDK includes the following open source components:
sherpa-onnx (Apache License 2.0): https://github.com/k2-fsa/sherpa-onnx
ONNX Runtime (MIT License): https://github.com/microsoft/onnxruntime
FFmpeg (LGPL v2.1): https://ffmpeg.org
Shine MP3 Encoder (LGPL): https://github.com/toots/shine
Opus Codec (BSD License): https://opus-codec.org
Zstandard (zstd) (BSD License): https://github.com/facebook/zstd
libarchive (BSD License): https://github.com/libarchive/libarchive
Full license texts are available in the THIRD_PARTY_LICENSES directory.
LGPL Notice
This SDK includes LGPL-licensed components such as FFmpeg and Shine.
Applications using this SDK must ensure compliance with LGPL requirements when distributing binaries.
FFmpeg source code can be obtained at: https://ffmpeg.org
Qualcomm QNN Support
This SDK supports optional integration with Qualcomm AI Runtime (QNN).
QNN is proprietary software provided by Qualcomm and is not included in this SDK.
To use QNN acceleration, users must obtain and include the required QNN libraries separately and comply with Qualcomm's license terms:
https://softwarecenter.qualcomm.com/
Responsibility
By using this SDK, you are responsible for complying with all third-party licenses included in this project.
Made with create-react-native-library
