@siteed/sherpa-onnx.rn
v1.1.0
Published
React Native wrapper for sherpa-onnx TTS and STT capabilities
Maintainers
Readme
@siteed/sherpa-onnx.rn
React Native wrapper for sherpa-onnx providing Text-to-Speech (TTS) and Speech-to-Text (STT) capabilities for mobile applications.
Status: 🚧 In Development
✅ Native Integration Testing: Complete framework with 100% test success rate ✅ Model Management: Lightweight strategy for CI to production environments ✅ Cross-Platform: Android and iOS support validated ✅ Web/WASM: All features work in the browser via WebAssembly ⏳ Architecture Support: Old/New React Native architecture compatibility planned
Features
- Text-to-speech (TTS) — VITS, Kokoro, Matcha models
- Speech-to-text (STT) — offline and streaming recognition
- Voice activity detection (VAD), keyword spotting (KWS)
- Speaker identification, language identification, audio tagging
- Speaker diarization, punctuation restoration, speech denoising
- Support for both old and new React Native architectures
- Pre-built native libraries for Android and iOS
- Web/WASM support — all features work in the browser via a combined WebAssembly build
Installation
npm install @siteed/sherpa-onnx.rn
# or
yarn add @siteed/sherpa-onnx.rnLinking
This package is a native module and requires proper linking. For React Native 0.60 and above, linking should happen automatically through auto-linking.
If you're using Expo, you'll need to use a custom development client or eject to a bare workflow:
npx expo prebuildiOS
For iOS, run pod install:
cd ios && pod installAndroid
Android integration is handled automatically by the package.
Web
Web works zero-config — the WASM inference engine (~12.6 MB, similar to Android .so files) loads automatically from jsDelivr CDN and is browser-cached after first load.
import { loadWasmModule, isSherpaOnnxReady } from '@siteed/sherpa-onnx.rn';
import { Platform } from 'react-native';
if (Platform.OS === 'web') {
// Fire-and-forget — UI renders immediately, WASM loads in background
loadWasmModule({
onProgress: (event) => {
console.log(`[WASM] ${event.phase} (${event.loaded}/${event.total})`);
},
});
}
// Later: await readiness with waitForReady(), or poll with isSherpaOnnxReady()
// Services like ASR.initialize() internally await WASM — no manual gating needed.Self-hosting (optional): copy files from node_modules/@siteed/sherpa-onnx.rn/wasm/ and call configureSherpaOnnx({ wasmBasePath: '/your/wasm/path/' }).
See Getting Started: Web for the full guide including onProgress, loading indicators, and model configuration.
Usage
Text-to-Speech (TTS)
import SherpaOnnx from '@siteed/sherpa-onnx.rn';
import type { TtsModelConfig } from '@siteed/sherpa-onnx.rn';
// Initialize TTS with a Kokoro model
const config: TtsModelConfig = {
modelDir: `${FileSystem.documentDirectory}models/kokoro-en-v0_19/`,
ttsModelType: 'kokoro',
modelFile: 'model.onnx',
tokensFile: 'tokens.txt',
voicesFile: 'voices.bin',
dataDir: `${FileSystem.documentDirectory}models/kokoro-en-v0_19/espeak-ng-data`,
};
const initResult = await SherpaOnnx.TTS.initialize(config);
console.log(`TTS ready: ${initResult.numSpeakers} speakers, ${initResult.sampleRate}Hz`);
// Generate speech
const result = await SherpaOnnx.TTS.generateSpeech('Hello world', {
speakerId: 0,
speakingRate: 1.0,
playAudio: false,
});
console.log('Audio file:', result.filePath);
// Clean up
await SherpaOnnx.TTS.release();Speech-to-Text (Offline)
import SherpaOnnx from '@siteed/sherpa-onnx.rn';
import type { AsrModelConfig } from '@siteed/sherpa-onnx.rn';
const config: AsrModelConfig = {
modelDir: `${FileSystem.documentDirectory}models/sherpa-onnx-whisper-tiny.en/`,
modelType: 'whisper',
streaming: false,
numThreads: 2,
decodingMethod: 'greedy_search',
modelFiles: {
encoder: 'tiny.en-encoder.int8.onnx',
decoder: 'tiny.en-decoder.int8.onnx',
tokens: 'tiny.en-tokens.txt',
},
};
await SherpaOnnx.ASR.initialize(config);
const result = await SherpaOnnx.ASR.recognizeFromFile('/path/to/audio.wav');
console.log('Recognized text:', result.text);
await SherpaOnnx.ASR.release();Streaming Recognition
import SherpaOnnx from '@siteed/sherpa-onnx.rn';
import type { AsrModelConfig } from '@siteed/sherpa-onnx.rn';
const config: AsrModelConfig = {
modelDir: `${FileSystem.documentDirectory}models/streaming-zipformer/`,
modelType: 'zipformer',
streaming: true,
numThreads: 2,
modelFiles: {
encoder: 'encoder-epoch-99-avg-1.int8.onnx',
decoder: 'decoder-epoch-99-avg-1.onnx',
joiner: 'joiner-epoch-99-avg-1.int8.onnx',
tokens: 'tokens.txt',
},
};
await SherpaOnnx.ASR.initialize(config);
await SherpaOnnx.ASR.createOnlineStream();
// Feed audio chunks from microphone
const samples = [...]; // Float32 PCM samples
await SherpaOnnx.ASR.acceptWaveform(16000, samples);
// Get partial result
const result = await SherpaOnnx.ASR.getResult();
console.log('Partial:', result.text);
// Check for endpoint (sentence boundary)
const { isEndpoint } = await SherpaOnnx.ASR.isEndpoint();
if (isEndpoint) {
await SherpaOnnx.ASR.resetStream();
}
await SherpaOnnx.ASR.release();Models
You need to provide Sherpa-ONNX model files to use this library. You can download pre-trained models from the Sherpa-ONNX repository or train your own.
The most common model types are:
- Transducer models (encoder, decoder, joiner)
- CTC models (single model file)
- Paraformer models (encoder, decoder)
Building from Source
If you need to build the native libraries yourself:
# Clone the repository
git clone https://github.com/yourusername/yourrepo.git
cd yourrepo/packages/sherpa-onnx.rn
# Set up dependencies
./setup.sh
# Build for all platforms
./build-all.shAPI Reference
Services
All services are accessed via the SherpaOnnx default export:
SherpaOnnx.TTS— Text-to-Speech (initialize,generateSpeech,stopSpeech,release)SherpaOnnx.ASR— Speech-to-Text (initialize,recognizeFromFile,recognizeFromSamples,createOnlineStream,acceptWaveform,getResult,isEndpoint,resetStream,release)SherpaOnnx.VAD— Voice Activity DetectionSherpaOnnx.SpeakerId— Speaker IdentificationSherpaOnnx.KWS— Keyword SpottingSherpaOnnx.AudioTagging— Audio Event ClassificationSherpaOnnx.LanguageId— Spoken Language IdentificationSherpaOnnx.Punctuation— Punctuation RestorationSherpaOnnx.Diarization— Speaker DiarizationSherpaOnnx.Archive— Model archive extraction (extractTarBz2)SherpaOnnx.validateLibraryLoaded()— Check native library status
TtsModelConfig
interface TtsModelConfig {
modelDir: string; // Directory containing model files
ttsModelType: 'vits' | 'kokoro' | 'matcha';
modelFile: string; // Primary model file (e.g., model.onnx)
tokensFile: string; // Tokens file (e.g., tokens.txt)
voicesFile?: string; // Voices file for Kokoro models
vocoderFile?: string; // Vocoder file for Matcha models
lexiconFile?: string; // Lexicon file
dataDir?: string; // Data directory (espeak-ng-data)
dictDir?: string; // Dictionary directory
lang?: string; // Language code for multi-lingual Kokoro
ruleFstsFile?: string; // Rule FSTs file paths
ruleFarsFile?: string; // Rule FARs file paths
numThreads?: number; // Default: 1
debug?: boolean; // Default: false
provider?: 'cpu' | 'gpu'; // Default: 'cpu'
}Troubleshooting
Android
If you're having issues with the Android build, check that:
- The Kotlin standard library is properly included in your app's
build.gradle - JNI libraries are properly included in the build
iOS
For iOS, ensure that:
- The necessary static libraries are included in your Xcode project
- The pod installation completed successfully
Documentation
- Web/WASM Guide - How to deploy on web: setup, model customization, feature selection
- WASM Build Guide - Building the WASM binary from source
- Testing Framework - Native integration testing documentation
Development Status
✅ Completed
- Native integration testing framework (12 tests, 100% success rate)
- Model management strategy for testing environments
- Android and iOS native library integration validated
- Web/WASM support for all features
- Cross-platform documentation structure
⏳ In Progress / Planned
- React Native architecture-specific validation (Old vs New Architecture)
Contributing
See the testing framework in docs/testing/ for validation methodology and development workflow.
License
MIT
