capacitor-offline-speech-recognition

v3.0.0

Published

7 months ago

gaudravi09

capacitor plugin native

Capacitor Offline Speech Recognition Plugin

A Capacitor plugin that provides offline speech-to-text functionality for Android and iOS platforms. The plugin uses the Vosk engine on both platforms for fully offline recognition with downloadable language models.

Maintainers

| Maintainer | GitHub | Social |
| ---------- | ------ | ------ |
| Ravi Gaud | GaudRavi09 | - |

Maintenance Status: Actively Maintained

Installation

To use npm

npm install capacitor-offline-speech-recognition

To use yarn

yarn add capacitor-offline-speech-recognition

Sync native files

npx cap sync

Platform Support

✅ Android - Full offline support with Vosk models for 15+ languages
✅ iOS - Full offline support with Vosk models (uses Vosk instead of Apple's Speech framework)
❌ Web - Not supported (requires offline model files)
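
Because the web platform is not supported, an app that also ships a web build may want to guard plugin calls. The following is a minimal sketch, not part of the plugin: `isSpeechRecognitionAvailable` is a hypothetical helper name, and in a real app the platform string would typically come from `Capacitor.getPlatform()` in `@capacitor/core`.

```typescript
// Hypothetical helper; the plugin does not export this function.
// Returns true only for the platforms this README lists as supported.
function isSpeechRecognitionAvailable(platform: string): boolean {
  return platform === 'android' || platform === 'ios';
}

// In an app the argument would typically be Capacitor.getPlatform().
const platforms = ['android', 'ios', 'web'];
for (const p of platforms) {
  console.log(`${p}: ${isSpeechRecognitionAvailable(p)}`);
}
```

Checking once at startup and hiding the speech UI on unsupported platforms avoids calling native methods that have no web implementation.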

System Requirements

Android

Minimum SDK: API level 24 (Android 7.0)
Target SDK: API level 34 (Android 14)
Storage: ~50MB per language model
RAM: Minimum 2GB recommended for optimal performance

iOS

Minimum iOS: 12.0
Target iOS: 17.0+
Storage: ~50MB per language model
RAM: Minimum 2GB recommended for optimal performance

Dependencies

Capacitor: ^5.0.0
Android: Vosk Android SDK 0.3.70
iOS: Vosk iOS static library (bundled xcframework), Accelerate, AVFoundation, AudioToolbox, and SSZipArchive

iOS

Minimum iOS Requirements

The plugin requires the following minimum iOS versions:

Minimum iOS: 12.0
Target iOS: 17.0
Deployment Target: 12.0

Permissions

Because the iOS implementation uses Vosk rather than Apple's Speech framework, the only usage description required in your app's Info.plist is the one for the microphone:

NSMicrophoneUsageDescription (Privacy - Microphone Usage Description)

iOS Setup

Add the following permission to your iOS app's Info.plist file:

<key>NSMicrophoneUsageDescription</key>
<string>This app needs access to microphone for speech recognition functionality.</string>

CocoaPods integration: after installing this plugin, run the following to integrate the bundled libvosk.xcframework and dependencies (SSZipArchive, Accelerate, AVFoundation, AudioToolbox):

cd ios/App
pod install

The plugin downloads and unzips Vosk language models on-demand into the app’s Documents directory.

Android

Minimum SDK Requirements

The plugin requires the following minimum SDK versions:

Minimum SDK: API level 24 (Android 7.0)
Target SDK: API level 34 (Android 14)
Compile SDK: API level 34 (Android 14)

Permissions

The plugin automatically includes the required permissions in its AndroidManifest.xml:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />

Permission Details:

RECORD_AUDIO - Required for speech recognition
INTERNET - Required for model downloads from alphacephei.com
ACCESS_NETWORK_STATE - Required for connectivity checks
READ_EXTERNAL_STORAGE - Required for reading downloaded model files
WRITE_EXTERNAL_STORAGE - Required for storing downloaded model files

Android Setup

No additional configuration required. The plugin handles all permissions automatically through Capacitor's permission system.

Note: On Android 13+ (API level 33+), READ_EXTERNAL_STORAGE and WRITE_EXTERNAL_STORAGE have no effect for apps that target that API level, and app-specific storage does not require them.

Example

import { OfflineSpeechRecognition } from 'capacitor-offline-speech-recognition';

// Get supported languages
const languages = await OfflineSpeechRecognition.getSupportedLanguages();
console.log('Supported languages:', languages.languages);

// Download a language model
const downloadListener = await OfflineSpeechRecognition.addListener('downloadProgress', (progress) => {
  console.log(`Download progress: ${progress.progress}% - ${progress.message}`);
});

const downloadResult = await OfflineSpeechRecognition.downloadLanguageModel({ 
  language: 'en-us' 
});
console.log('Download result:', downloadResult);

// Remove download listener
await downloadListener.remove();

// Start speech recognition
const recognitionListener = await OfflineSpeechRecognition.addListener('recognitionResult', (result) => {
  console.log(`Recognized: ${result.text} (Final: ${result.isFinal})`);
});

await OfflineSpeechRecognition.startRecognition({ language: 'en-us' });

// Stop recognition
await OfflineSpeechRecognition.stopRecognition();

// Remove recognition listener
await recognitionListener.remove();
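
The example above stops recognition immediately after starting it. In practice an app usually waits for a result first. The sketch below wraps that flow in a promise. It rests on two assumptions: that the plugin emits a `recognitionResult` event with `isFinal: true` when an utterance ends (this README does not spell out the timing), and that `SpeechPlugin` is only a local structural stand-in for the documented API, so that the helper can be exercised without the native plugin. `recognizeOnce` is a hypothetical name.

```typescript
// Local stand-ins mirroring the shapes documented in this README.
interface RecognitionResult { text: string; isFinal: boolean; language: string; }
interface ListenerHandle { remove: () => void | Promise<void>; }
interface SpeechPlugin {
  addListener(
    eventName: 'recognitionResult',
    listenerFunc: (result: RecognitionResult) => void,
  ): Promise<ListenerHandle>;
  startRecognition(options?: { language?: string }): Promise<void>;
  stopRecognition(): Promise<void>;
}

// Hypothetical helper: resolves with the text of the first final result.
async function recognizeOnce(plugin: SpeechPlugin, language = 'en-us'): Promise<string> {
  let resolveFinal: (text: string) => void = () => {};
  const finalText = new Promise<string>((resolve) => { resolveFinal = resolve; });

  // Register the listener before starting so that no result is missed.
  const handle = await plugin.addListener('recognitionResult', (result) => {
    if (result.isFinal) resolveFinal(result.text);
  });

  try {
    await plugin.startRecognition({ language });
    return await finalText;
  } finally {
    // Always stop recognition and detach the listener, even on errors.
    await plugin.stopRecognition();
    await handle.remove();
  }
}
```

In an app, the first argument would be the imported `OfflineSpeechRecognition` object. A production version would probably also add a timeout so the promise cannot wait indefinitely.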

API

getSupportedLanguages()

getSupportedLanguages() => Promise<{ languages: Language[]; }>

Get all supported languages for speech recognition

Returns: Promise<{ languages: Language[]; }>

getDownloadedLanguageModels()

getDownloadedLanguageModels() => Promise<{ models: DownloadedModel[]; }>

Get all downloaded language models on the device

Returns: Promise<{ models: DownloadedModel[]; }>

downloadLanguageModel(...)

downloadLanguageModel(options: { language: string; }) => Promise<{ success: boolean; language: string; modelName?: string; message?: string; }>

Download a language model for offline use

| Param | Type | Description |
| ----- | ---- | ----------- |
| options | { language: string; } | Language code to download |

Returns: Promise<{ success: boolean; language: string; modelName?: string; message?: string; }>
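
A common pattern is to download a model only when it is not already on the device. The sketch below combines `getDownloadedLanguageModels()` and `downloadLanguageModel()` for that purpose. It assumes that `DownloadedModel.language` holds the same code that is passed to `downloadLanguageModel` (for example `'en-us'`), which this README does not state explicitly. `ModelPlugin` is a local stand-in for the documented API and `ensureLanguageModel` is a hypothetical helper name.

```typescript
// Local stand-ins mirroring the shapes documented in this README.
interface DownloadedModel { language: string; name: string; path: string; size: number; }
interface DownloadResult { success: boolean; language: string; modelName?: string; message?: string; }
interface ModelPlugin {
  getDownloadedLanguageModels(): Promise<{ models: DownloadedModel[] }>;
  downloadLanguageModel(options: { language: string }): Promise<DownloadResult>;
}

// Hypothetical helper: returns true if a download was performed,
// false if the model was already present.
async function ensureLanguageModel(plugin: ModelPlugin, language: string): Promise<boolean> {
  const { models } = await plugin.getDownloadedLanguageModels();
  // Assumption: model.language uses the same code as the download option.
  if (models.some((model) => model.language === language)) return false;

  const result = await plugin.downloadLanguageModel({ language });
  if (!result.success) {
    throw new Error(result.message ?? `Download failed for ${language}`);
  }
  return true;
}
```

Calling this before `startRecognition` avoids re-downloading a model of roughly 50MB on every launch.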

deleteLanguageModel(...)

deleteLanguageModel(options: { language: string; }) => Promise<{ success: boolean; language: string; modelName?: string; message?: string; }>

Delete a downloaded language model from the device

| Param | Type | Description |
| ----- | ---- | ----------- |
| options | { language: string; } | Language code of the model to delete |

Returns: Promise<{ success: boolean; language: string; modelName?: string; message?: string; }>
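
Since each model takes roughly 50MB, an app may want to remove models the user no longer needs. The following sketch deletes every downloaded model except those on a keep list. `CleanupPlugin` is a local stand-in for the documented API and `pruneLanguageModels` is a hypothetical helper name. It assumes `DownloadedModel.language` carries the code that `deleteLanguageModel` expects.

```typescript
// Local stand-ins mirroring the shapes documented in this README.
interface DownloadedModel { language: string; name: string; path: string; size: number; }
interface DeleteResult { success: boolean; language: string; modelName?: string; message?: string; }
interface CleanupPlugin {
  getDownloadedLanguageModels(): Promise<{ models: DownloadedModel[] }>;
  deleteLanguageModel(options: { language: string }): Promise<DeleteResult>;
}

// Hypothetical helper: deletes every downloaded model whose language is not
// in `keep` and returns the language codes that were actually removed.
async function pruneLanguageModels(plugin: CleanupPlugin, keep: string[]): Promise<string[]> {
  const { models } = await plugin.getDownloadedLanguageModels();
  const removed: string[] = [];
  for (const model of models) {
    if (keep.indexOf(model.language) !== -1) continue;
    const result = await plugin.deleteLanguageModel({ language: model.language });
    if (result.success) removed.push(model.language);
  }
  return removed;
}
```

Deleting sequentially keeps the sketch simple. Models whose deletion reports `success: false` are left out of the returned list so the caller can retry or report them.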

startRecognition(...)

startRecognition(options?: { language?: string | undefined; } | undefined) => Promise<void>

Start speech recognition

| Param | Type | Description |
| ----- | ---- | ----------- |
| options | { language?: string; } | Language code for recognition (defaults to 'en-us') |

stopRecognition()

stopRecognition() => Promise<void>

Stop speech recognition

addListener('downloadProgress', ...)

addListener(eventName: 'downloadProgress', listenerFunc: (progress: DownloadProgress) => void) => Promise<{ remove: () => void; }>

Add listener for download progress updates

| Param | Type |
| ----- | ---- |
| eventName | 'downloadProgress' |
| listenerFunc | (progress: DownloadProgress) => void |

Returns: Promise<{ remove: () => void; }>
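
Progress events are only delivered to listeners that exist when the download starts, so the listener should be registered first and removed afterwards. A minimal sketch of a wrapper that enforces this order is shown here. `ProgressPlugin` is a local stand-in for the documented API and `downloadWithProgress` is a hypothetical helper name.

```typescript
// Local stand-ins mirroring the shapes documented in this README.
interface DownloadProgress { progress: number; message: string; }
interface DownloadResult { success: boolean; language: string; modelName?: string; message?: string; }
interface ListenerHandle { remove: () => void | Promise<void>; }
interface ProgressPlugin {
  addListener(
    eventName: 'downloadProgress',
    listenerFunc: (progress: DownloadProgress) => void,
  ): Promise<ListenerHandle>;
  downloadLanguageModel(options: { language: string }): Promise<DownloadResult>;
}

// Hypothetical helper: attaches the listener, runs the download,
// and always detaches the listener afterwards.
async function downloadWithProgress(
  plugin: ProgressPlugin,
  language: string,
  onProgress: (progress: DownloadProgress) => void,
): Promise<DownloadResult> {
  const handle = await plugin.addListener('downloadProgress', onProgress);
  try {
    return await plugin.downloadLanguageModel({ language });
  } finally {
    await handle.remove();
  }
}
```

The `finally` block means the listener is removed even if the download rejects, which prevents stale callbacks from accumulating across retries.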

addListener('recognitionResult', ...)

addListener(eventName: 'recognitionResult', listenerFunc: (result: RecognitionResult) => void) => Promise<{ remove: () => void; }>

Add listener for recognition results

| Param | Type |
| ----- | ---- |
| eventName | 'recognitionResult' |
| listenerFunc | (result: RecognitionResult) => void |

Returns: Promise<{ remove: () => void; }>

removeAllListeners()

removeAllListeners() => Promise<void>

Remove all listeners

Interfaces

Language

| Prop | Type |
| ---- | ---- |
| code | string |
| name | string |
| modelFile | string |

DownloadedModel

| Prop | Type |
| ---- | ---- |
| language | string |
| name | string |
| path | string |
| size | number |
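
The `size` field can be used to show how much storage the downloaded models occupy. The sketch below sums it across models. It assumes `size` is reported in bytes, which this README does not state, so verify the unit on a device before relying on it. The sample entries and the name `totalModelSizeMB` are hypothetical.

```typescript
// Local stand-in mirroring the DownloadedModel shape documented above.
interface DownloadedModel { language: string; name: string; path: string; size: number; }

// Assumption: `size` is in bytes; the README does not state the unit.
// Returns the combined size in MB, rounded to one decimal place.
function totalModelSizeMB(models: DownloadedModel[]): number {
  const bytes = models.reduce((sum, model) => sum + model.size, 0);
  return Math.round((bytes / (1024 * 1024)) * 10) / 10;
}

// Hypothetical sample data for illustration only.
const sample: DownloadedModel[] = [
  { language: 'en-us', name: 'example-model-en', path: '/example/en', size: 50 * 1024 * 1024 },
  { language: 'de', name: 'example-model-de', path: '/example/de', size: 45 * 1024 * 1024 },
];
console.log(totalModelSizeMB(sample)); // 95
```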

DownloadProgress

| Prop | Type |
| ---- | ---- |
| progress | number |
| message | string |

RecognitionResult

| Prop | Type |
| ---- | ---- |
| text | string |
| isFinal | boolean |
| language | string |
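
The `isFinal` flag lets an app distinguish interim text from completed utterances. The sketch below keeps a running transcript from a stream of results. It assumes that each non-final result replaces the previous interim text for the current utterance and that a final result contains the whole utterance. That is how Vosk partial results usually behave, but this README does not confirm it for the plugin. `TranscriptBuffer` is a hypothetical class name.

```typescript
// Local stand-in mirroring the RecognitionResult shape documented above.
interface RecognitionResult { text: string; isFinal: boolean; language: string; }

// Hypothetical helper: collects final utterances and the latest interim text.
class TranscriptBuffer {
  private finals: string[] = [];
  private partial = '';

  push(result: RecognitionResult): void {
    const text = result.text.trim();
    if (result.isFinal) {
      // A final result closes the utterance and clears the interim text.
      if (text.length > 0) this.finals.push(text);
      this.partial = '';
    } else {
      // Assumption: an interim result supersedes the previous one.
      this.partial = text;
    }
  }

  // Completed utterances followed by the current interim text, if any.
  text(): string {
    return this.finals.concat(this.partial).filter((s) => s.length > 0).join(' ');
  }
}

const buffer = new TranscriptBuffer();
buffer.push({ text: 'hello wor', isFinal: false, language: 'en-us' });
buffer.push({ text: 'hello world', isFinal: true, language: 'en-us' });
buffer.push({ text: 'how are', isFinal: false, language: 'en-us' });
console.log(buffer.text()); // "hello world how are"
```

Inside a `recognitionResult` listener, calling `buffer.push(result)` and re-rendering `buffer.text()` gives a live transcript.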

Supported Languages

Android (Vosk Models)

English (US) - en-us
German - de
French - fr
Spanish - es
Portuguese - pt
Chinese - zh
Russian - ru
Turkish - tr
Vietnamese - vi
Italian - it
Hindi - hi
Gujarati - gu
Telugu - te
Japanese - ja
Korean - ko

iOS (Vosk Models)

English (US) - en-us
German - de
French - fr
Spanish - es
Portuguese - pt
Chinese - zh
Russian - ru
Turkish - tr
Vietnamese - vi
Italian - it
Hindi - hi
Gujarati - gu
Telugu - te
Japanese - ja
Korean - ko
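
Device locales such as `en-US` or `pt_BR` do not always match the codes listed above. The following sketch maps a locale string onto one of those codes. It is a convenience layer and not part of the plugin. `resolveLanguageCode` is a hypothetical helper, and falling back to a regional variant (for example `en-GB` to `en-us`) is a design choice of this sketch.

```typescript
// Language codes as listed in this README.
const SUPPORTED_CODES: string[] = [
  'en-us', 'de', 'fr', 'es', 'pt', 'zh', 'ru', 'tr',
  'vi', 'it', 'hi', 'gu', 'te', 'ja', 'ko',
];

// Hypothetical helper: returns the best matching supported code, or undefined.
function resolveLanguageCode(
  locale: string,
  supported: string[] = SUPPORTED_CODES,
): string | undefined {
  const normalized = locale.trim().toLowerCase().replace('_', '-');
  // 1. Exact match, e.g. 'en-US' -> 'en-us'.
  if (supported.indexOf(normalized) !== -1) return normalized;
  // 2. Base language match, e.g. 'de-DE' -> 'de'.
  const base = normalized.split('-')[0] ?? normalized;
  if (supported.indexOf(base) !== -1) return base;
  // 3. Regional variant of the same base language, e.g. 'en' -> 'en-us'.
  const variants = supported.filter((code) => code.split('-')[0] === base);
  return variants.length > 0 ? variants[0] : undefined;
}

console.log(resolveLanguageCode('en-US')); // "en-us"
console.log(resolveLanguageCode('pt_BR')); // "pt"
console.log(resolveLanguageCode('sv'));    // undefined
```

At runtime the `supported` argument could instead be filled from the `code` values returned by `getSupportedLanguages()`, so the list does not need to be hard-coded.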

Platform Differences

| Feature | Android (Vosk) | iOS (Vosk) |
|---------|----------------|------------|
| Models | Downloaded Vosk models (50MB+ each) | Downloaded Vosk models (50MB+ each) |
| Offline | True offline (all languages) | True offline (all languages) |
| Download | Real model downloads from alphacephei.com | Real model downloads from alphacephei.com |
| Languages | 15+ Vosk models | 15+ Vosk models |
| Storage | App storage (cache/Documents) | App storage (Documents) |

Permission Handling

iOS

Only microphone permission is required. The plugin requests it when starting recognition.

Android

The plugin uses Capacitor's permission system:

Automatically requests permissions when needed
Permissions requested before starting recognition or downloading models
Handles permission granted/denied scenarios
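
On the JavaScript side, an app still has to react when the user denies a permission. The sketch below assumes that `startRecognition` rejects its promise in that case. This README does not document the exact error behaviour, so treat the sketch as an illustration and check what the plugin actually returns on a device. `RecognitionStarter` is a local stand-in for the documented API and `tryStartRecognition` is a hypothetical helper name.

```typescript
// Local stand-in for the documented startRecognition signature.
interface RecognitionStarter {
  startRecognition(options?: { language?: string }): Promise<void>;
}

type StartOutcome = { started: true } | { started: false; reason: string };

// Hypothetical helper: converts a rejected start into a value the UI can show.
// Assumption: the plugin rejects when the microphone permission is denied.
async function tryStartRecognition(
  plugin: RecognitionStarter,
  language = 'en-us',
): Promise<StartOutcome> {
  try {
    await plugin.startRecognition({ language });
    return { started: true };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { started: false, reason };
  }
}
```

Returning a result object instead of throwing lets the calling code show a prompt that explains how to enable the microphone in the system settings.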

Troubleshooting

iOS Issues

Permission denied: Check NSMicrophoneUsageDescription exists in Info.plist
No progress updates: Ensure listener is registered before calling downloadLanguageModel
Linker errors: Run pod install after plugin updates; ensure Pods include Accelerate, AVFoundation, AudioToolbox, SSZipArchive
Model verification failed: Re-download the model; extraction may have failed or been interrupted

Android Issues

Permission denied: Check if user manually denied permissions
Model download fails: Check internet permission and connectivity
Recognition fails: Check microphone permission
Build errors: Ensure minimum SDK 24 and target SDK 34
Vosk model loading fails: Check if model files are corrupted or incomplete

Common Issues

Models not downloading: Check internet connection and storage permissions
Recognition not working: Ensure microphone permission is granted
Language not supported: Check that the language code is returned by getSupportedLanguages() and that its model has been downloaded to the device
App crashes on older devices: Ensure device meets minimum requirements (Android 7.0+, iOS 12.0+)
Storage issues: Ensure device has sufficient storage for model downloads (50MB+ per model)

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
