@capgo/capacitor-speech-recognition

v8.1.0

Capacitor plugin for comprehensive on-device speech recognition with live partial results.

Natural, low-latency speech recognition for Capacitor apps with parity across iOS and Android, streaming partial results, and permission helpers baked in.

Why this plugin?

This package starts from the excellent capacitor-community/speech-recognition plugin, but folds in the most requested pull requests from that repo (punctuation support, segmented sessions, crash fixes) and keeps them maintained under the Capgo umbrella. You get the familiar API plus:

  • Merged community PRs – punctuation toggles on iOS (PR #74), segmented results & silence handling on Android (PR #104), and the recognitionRequest safety fix (PR #105) ship out-of-the-box.
  • 🚀 New Capgo features – configurable silence windows, streaming segment listeners, consistent permission helpers, and a refreshed example app.
  • 🛠️ Active maintenance – same conventions as all Capgo plugins (SPM, Podspec, workflows, example app) so it tracks Capacitor major versions without bit-rot.
  • 📦 Drop-in migration – TypeScript definitions remain compatible with the community plugin while exposing the extra options (addPunctuation, allowForSilence, segmentResults, etc.).

Documentation

The most complete documentation is available at https://capgo.app/docs/plugins/speech-recognition/

Compatibility

| Plugin version | Capacitor compatibility | Maintained |
| -------------- | ----------------------- | ---------- |
| v8.*.* | v8.*.* | ✅ |
| v7.*.* | v7.*.* | On demand |
| v6.*.* | v6.*.* | ❌ |
| v5.*.* | v5.*.* | ❌ |

Note: The major version of this plugin follows the major version of Capacitor. Use the version that matches your Capacitor installation (e.g., plugin v8 for Capacitor 8). Only the latest major version is actively maintained.

Install

bun add @capgo/capacitor-speech-recognition
bunx cap sync

Usage

import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';

await SpeechRecognition.requestPermissions();

const { available } = await SpeechRecognition.available();
if (!available) {
  console.warn('Speech recognition is not supported on this device.');
}

const partialListener = await SpeechRecognition.addListener('partialResults', (event) => {
  console.log('Partial:', event.matches?.[0]);
});

await SpeechRecognition.start({
  language: 'en-US',
  maxResults: 3,
  partialResults: true,
});

// Later, when you want to stop listening
await SpeechRecognition.stop();
await partialListener.remove();

On-device recognition mode

This plugin now supports an opt-in on-device recognition path behind the explicit useOnDeviceRecognition flag.

What it is

The default path keeps the long-standing recognizer flow for backward compatibility. useOnDeviceRecognition switches to a newer local speech pipeline when the platform supports it:

  • On iOS 26+, it uses Apple's SpeechAnalyzer / SpeechTranscriber stack.
  • On recent Android versions, it uses the on-device SpeechRecognizer path.

Why you might want it

  • Better alignment with the latest native speech APIs.
  • Improved on-device model handling on supported platforms.
  • A cleaner rollout path if you want to adopt newer speech stacks without changing every user immediately.

Why it is opt-in

Even when a new stack is technically available, changing recognition behavior silently can affect:

  • transcript wording
  • punctuation behavior
  • partial-result timing
  • product metrics and user expectations

That is why the plugin keeps the legacy recognizer by default and requires an explicit flag for the new path.

Recommended rollout

  1. Check generic speech support with available().
  2. Check the on-device path with isOnDeviceRecognitionAvailable().
  3. Enable useOnDeviceRecognition only when that second check returns true.
  4. Roll it out gradually if your app depends on stable transcripts or analytics.

Example

import { SpeechRecognition } from '@capgo/capacitor-speech-recognition';

await SpeechRecognition.requestPermissions();

const { available } = await SpeechRecognition.available();
if (!available) {
  throw new Error('Speech recognition is not available on this device.');
}

const { available: onDeviceRecognitionAvailable } =
  await SpeechRecognition.isOnDeviceRecognitionAvailable({
    language: 'en-US',
  });

await SpeechRecognition.start({
  language: 'en-US',
  partialResults: true,
  useOnDeviceRecognition: onDeviceRecognitionAvailable,
});
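
Step 4 of the recommended rollout (gradual rollout) is left to the app. One possible sketch, assuming you have a stable per-user identifier and a rollout percentage that your app controls. Both `rolloutBucket` and `shouldUseOnDevice` are hypothetical helpers and not part of the plugin API.

```typescript
// Hypothetical helper: not part of the plugin API.
// Maps a stable user id to a bucket in [0, 100) with an FNV-1a-style hash
// over UTF-16 code units, so the same user always gets the same decision.
function rolloutBucket(userId: string): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0) % 100;
}

// Enable the on-device path only when the capability check passed
// AND the user falls inside the rollout percentage.
function shouldUseOnDevice(
  onDeviceAvailable: boolean,
  userId: string,
  rolloutPercent: number,
): boolean {
  return onDeviceAvailable && rolloutBucket(userId) < rolloutPercent;
}
```

The result of `shouldUseOnDevice(...)` could then be passed as `useOnDeviceRecognition` in place of the raw availability flag used in the example above.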

When not to use it yet

Stay on the default path if:

  • you need unchanged behavior for existing users
  • you have not validated transcripts for your target locale
  • you want identical production behavior across older and newer OS versions

Platform notes

  • iOS uses the newer on-device path only on iOS 26+ and only for locales Apple exposes through the newer speech stack.
  • Android uses the on-device recognizer only in inline mode. popup: true keeps using the system dialog and is not compatible with useOnDeviceRecognition.
  • On Android, a supported on-device language may require a model download before recognition can begin.
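
Because `popup: true` is documented as incompatible with `useOnDeviceRecognition`, an app may want to resolve the conflict before calling `start()`. A minimal sketch, assuming the app prefers to keep the popup and drop the on-device request. `resolveStartOptions` and the local `StartOptions` interface are illustrative and not part of the plugin.

```typescript
// Local subset of the documented SpeechRecognitionStartOptions.
interface StartOptions {
  language?: string;
  popup?: boolean;
  partialResults?: boolean;
  useOnDeviceRecognition?: boolean;
}

// Hypothetical helper: keeps the on-device flag only when it was requested,
// the capability check passed, and the Android system dialog is not in use.
function resolveStartOptions(
  requested: StartOptions,
  onDeviceAvailable: boolean,
): StartOptions {
  const wantsOnDevice = requested.useOnDeviceRecognition === true;
  const usesPopup = requested.popup === true;
  return {
    ...requested,
    useOnDeviceRecognition: wantsOnDevice && onDeviceAvailable && !usesPopup,
  };
}
```

Preferring the popup here is a design choice of the sketch; an app could just as well drop `popup` and keep the on-device path.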

Push-to-talk and session events

This plugin also supports a push-to-talk oriented flow built around three APIs:

  • setPTTState({ held }) lets your UI tell the plugin when the button is pressed or released.
  • forceStop() stops the active session immediately and emits the last cached partial result with forced: true when available.
  • getLastPartialResult() lets you read back the latest cached transcript at any point.

continuousPTT is the experimental cross-platform mode that keeps a held push-to-talk session alive by restarting recognition as speech segments finalize. Android and iOS both support this restart flow for inline/native recognition.
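
As a rough illustration, the three APIs could be wired to a hold-to-talk button as follows. The `PttPlugin` interface is a local subset of the documented signatures so the sketch type-checks on its own, `createPttController` is a hypothetical helper, and the call order (state first, then start or stop) is an assumption rather than documented behavior.

```typescript
// Local subset of the documented plugin surface. The real plugin object
// is expected to satisfy it structurally.
interface PttPlugin {
  start(options?: {
    language?: string;
    partialResults?: boolean;
    continuousPTT?: boolean;
  }): Promise<unknown>;
  setPTTState(options: { held: boolean }): Promise<void>;
  forceStop(options?: { timeout?: number }): Promise<void>;
}

// Hypothetical UI glue: call onPress / onRelease from the button handlers.
function createPttController(plugin: PttPlugin, language: string) {
  return {
    async onPress(): Promise<void> {
      // Tell the plugin the button is held, then open the session.
      await plugin.setPTTState({ held: true });
      await plugin.start({ language, partialResults: true, continuousPTT: true });
    },
    async onRelease(): Promise<void> {
      // Tell the plugin the button was released, then end the session.
      await plugin.setPTTState({ held: false });
      await plugin.forceStop();
    },
  };
}
```

In an app, `createPttController(SpeechRecognition, 'en-US')` would be created once and its two methods attached to the pointer-down and pointer-up events of the button.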

The plugin also emits deterministic session lifecycle events so UIs can react cleanly:

  • listeningState now carries state, sessionId, reason, and optional errorCode in addition to the legacy status.
  • error is emitted for every native recognizer error instead of relying only on promise rejections.
  • readyForNextSession signals when native resources are torn down and the plugin is ready for another start.
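
One way to use these events is to block new `start()` calls until the previous session has been torn down. The sketch below assumes only the documented event names and fields. `SessionGate` is a hypothetical helper, and the local event interfaces are trimmed copies of the documented shapes.

```typescript
// State values copied from the documented ListeningFiniteState alias.
type ListeningFiniteState =
  | 'startingListening'
  | 'started'
  | 'stoppingListening'
  | 'stopped';

// Trimmed local copies of the documented event payloads.
interface ListeningEvent {
  state?: ListeningFiniteState;
  sessionId?: number;
}
interface ReadyEvent {
  sessionId?: number;
}

// Hypothetical helper: tracks whether it is safe to call start() again.
class SessionGate {
  private busy = false;

  // A session that is starting or running holds native resources.
  onListeningState(event: ListeningEvent): void {
    if (event.state === 'startingListening' || event.state === 'started') {
      this.busy = true;
    }
  }

  // Only readyForNextSession releases the gate; 'stopped' alone does not.
  onReadyForNextSession(_event: ReadyEvent): void {
    this.busy = false;
  }

  canStart(): boolean {
    return !this.busy;
  }
}
```

The two handlers would be registered through `addListener('listeningState', ...)` and `addListener('readyForNextSession', ...)`, and the UI would check `canStart()` before enabling the record button.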

iOS usage descriptions

Add the following keys to your app Info.plist:

  • NSSpeechRecognitionUsageDescription
  • NSMicrophoneUsageDescription

API

available()

available() => Promise<SpeechRecognitionAvailability>

Checks whether the native speech recognition service is usable on the current device.

Returns: Promise<SpeechRecognitionAvailability>


isOnDeviceRecognitionAvailable(...)

isOnDeviceRecognitionAvailable(options?: Pick<SpeechRecognitionStartOptions, "language"> | undefined) => Promise<SpeechRecognitionAvailability>

Checks whether the platform's newer on-device recognition path is available for the selected locale.

This is the capability check you should use before enabling useOnDeviceRecognition. A true result means the current device, OS version, and locale can use the newer on-device path for that platform.

Returns false when the device only supports the legacy recognizer path.

Platform SDK docs: Speech (iOS), SpeechRecognizer (Android).

| Param | Type |
| ------- | ---- |
| options | Pick<SpeechRecognitionStartOptions, 'language'> |

Returns: Promise<SpeechRecognitionAvailability>


start(...)

start(options?: SpeechRecognitionStartOptions | undefined) => Promise<SpeechRecognitionMatches>

Begins capturing audio and transcribing speech.

When partialResults is true, the returned promise resolves immediately and updates are streamed through the partialResults listener until the session ends.

The default path keeps the legacy recognizer behavior for backward compatibility. Pass useOnDeviceRecognition: true only after checking isOnDeviceRecognitionAvailable().

| Param | Type |
| ------- | ---- |
| options | SpeechRecognitionStartOptions |

Returns: Promise<SpeechRecognitionMatches>


stop()

stop() => Promise<void>

Stops listening and tears down native resources.


forceStop(...)

forceStop(options?: ForceStopOptions | undefined) => Promise<void>

Force stops the current session.

On Android, this first tries a normal stop and then falls back to destroy/recreate after timeout. On iOS, the current session is stopped immediately.

If a partial transcript is cached, it is emitted through the partialResults listener with forced: true.

| Param | Type |
| ------- | ---- |
| options | ForceStopOptions |
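
A possible way to capture the transcript that `forceStop()` flushes is sketched below. It assumes the forced payload reaches the `partialResults` listener before the `forceStop()` promise resolves, which the text above does not guarantee, so it falls back to `getLastPartialResult()`. `forceStopAndRead` and the local interfaces are illustrative and only mirror the documented shapes.

```typescript
// Trimmed local copies of the documented shapes.
interface ForcedPartial {
  matches?: string[];
  forced?: boolean;
}
interface ListenerHandle {
  remove(): Promise<void>;
}
interface StopPlugin {
  addListener(
    eventName: 'partialResults',
    listenerFunc: (event: ForcedPartial) => void,
  ): Promise<ListenerHandle>;
  forceStop(options?: { timeout?: number }): Promise<void>;
  getLastPartialResult(): Promise<{ available: boolean; text?: string }>;
}

// Hypothetical helper: force-stops and returns the best transcript it can find.
async function forceStopAndRead(
  plugin: StopPlugin,
  timeout = 1500, // same value as the documented Android default
): Promise<string | undefined> {
  const captured: { text?: string } = {};
  const handle = await plugin.addListener('partialResults', (event) => {
    // Only keep payloads flagged as coming from forceStop().
    if (event.forced && event.matches && event.matches.length > 0) {
      captured.text = event.matches[0];
    }
  });
  try {
    await plugin.forceStop({ timeout });
  } finally {
    await handle.remove();
  }
  if (captured.text !== undefined) return captured.text;
  // Fallback: read the cached transcript directly.
  const last = await plugin.getLastPartialResult();
  return last.available ? last.text : undefined;
}
```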


getLastPartialResult()

getLastPartialResult() => Promise<LastPartialResult>

Gets the last cached partial transcription result.

Returns: Promise<LastPartialResult>


setPTTState(...)

setPTTState(options: PTTStateOptions) => Promise<void>

Updates the current push-to-talk button state.

Use this together with continuousPTT or with a custom hold-to-talk flow.

| Param | Type |
| ------- | ---- |
| options | PTTStateOptions |


getSupportedLanguages()

getSupportedLanguages() => Promise<SpeechRecognitionLanguages>

Gets the locales supported by the underlying recognizer.

Android 13+ devices no longer expose this list; in that case languages is empty.

Returns: Promise<SpeechRecognitionLanguages>
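
Because the list can legitimately be empty, code that validates a locale against it should treat "empty" as "unknown" rather than "unsupported". A minimal sketch, assuming hyphenated locale tags such as en-US. `pickLanguage` is a hypothetical helper, not part of the plugin.

```typescript
// Hypothetical helper: choose a locale to pass to start().
// An empty list is treated as "unknown", so the preferred locale is
// returned unchanged and the native recognizer gets to decide.
function pickLanguage(
  preferred: string,
  supported: string[],
  fallback = 'en-US',
): string {
  if (supported.length === 0) return preferred;
  const wanted = preferred.toLowerCase();
  // 1. Exact match, ignoring case.
  const exact = supported.find((tag) => tag.toLowerCase() === wanted);
  if (exact !== undefined) return exact;
  // 2. Any locale sharing the base language, e.g. "en" for "en-GB".
  const base = wanted.split('-')[0];
  const sameBase = supported.find(
    (tag) => tag.toLowerCase().split('-')[0] === base,
  );
  // 3. Otherwise use the caller-provided fallback.
  return sameBase !== undefined ? sameBase : fallback;
}
```

The `languages` array from `getSupportedLanguages()` would be passed as `supported`.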


isListening()

isListening() => Promise<SpeechRecognitionListening>

Returns whether the plugin is actively listening for speech.

Returns: Promise<SpeechRecognitionListening>


checkPermissions()

checkPermissions() => Promise<SpeechRecognitionPermissionStatus>

Gets the current permission state.

Returns: Promise<SpeechRecognitionPermissionStatus>


requestPermissions()

requestPermissions() => Promise<SpeechRecognitionPermissionStatus>

Requests the microphone and speech recognition permissions.

Returns: Promise<SpeechRecognitionPermissionStatus>
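
A common pattern is to check first and only prompt when needed. The sketch below assumes the documented `speechRecognition` permission field and `PermissionState` values. `ensureSpeechPermission` is a hypothetical helper, and skipping the request after `'denied'` is a design choice of this sketch, not plugin behavior.

```typescript
// Values copied from the documented PermissionState alias.
type PermissionState = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';

// Local subset of the documented plugin surface.
interface PermissionPlugin {
  checkPermissions(): Promise<{ speechRecognition: PermissionState }>;
  requestPermissions(): Promise<{ speechRecognition: PermissionState }>;
}

// Hypothetical helper: resolves to true only when permission is granted.
async function ensureSpeechPermission(plugin: PermissionPlugin): Promise<boolean> {
  const current = await plugin.checkPermissions();
  if (current.speechRecognition === 'granted') return true;
  // Design choice: do not re-prompt a user who already declined.
  // An app would typically point them to the system settings instead.
  if (current.speechRecognition === 'denied') return false;
  const next = await plugin.requestPermissions();
  return next.speechRecognition === 'granted';
}
```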


getPluginVersion()

getPluginVersion() => Promise<{ version: string; }>

Returns the native plugin version bundled with this package.

Useful when reporting issues to confirm that native and JS versions match.

Returns: Promise<{ version: string; }>


addListener('endOfSegmentedSession', ...)

addListener(eventName: 'endOfSegmentedSession', listenerFunc: () => void) => Promise<PluginListenerHandle>

Listen for segmented session completion events (Android only).

| Param | Type |
| ------------ | ---- |
| eventName | 'endOfSegmentedSession' |
| listenerFunc | () => void |

Returns: Promise<PluginListenerHandle>


addListener('segmentResults', ...)

addListener(eventName: 'segmentResults', listenerFunc: (event: SpeechRecognitionSegmentResultEvent) => void) => Promise<PluginListenerHandle>

Listen for segmented recognition results (Android only).

| Param | Type |
| ------------ | ---- |
| eventName | 'segmentResults' |
| listenerFunc | (event: SpeechRecognitionSegmentResultEvent) => void |

Returns: Promise<PluginListenerHandle>


addListener('partialResults', ...)

addListener(eventName: 'partialResults', listenerFunc: (event: SpeechRecognitionPartialResultEvent) => void) => Promise<PluginListenerHandle>

Listen for partial transcription updates emitted while partialResults is enabled.

| Param | Type |
| ------------ | ---- |
| eventName | 'partialResults' |
| listenerFunc | (event: SpeechRecognitionPartialResultEvent) => void |

Returns: Promise<PluginListenerHandle>


addListener('listeningState', ...)

addListener(eventName: 'listeningState', listenerFunc: (event: SpeechRecognitionListeningEvent) => void) => Promise<PluginListenerHandle>

Listen for changes to the native listening state.

| Param | Type |
| ------------ | ---- |
| eventName | 'listeningState' |
| listenerFunc | (event: SpeechRecognitionListeningEvent) => void |

Returns: Promise<PluginListenerHandle>


addListener('error', ...)

addListener(eventName: 'error', listenerFunc: (event: SpeechRecognitionErrorEvent) => void) => Promise<PluginListenerHandle>

Listen for recognition errors.

| Param | Type |
| ------------ | ---- |
| eventName | 'error' |
| listenerFunc | (event: SpeechRecognitionErrorEvent) => void |

Returns: Promise<PluginListenerHandle>


addListener('readyForNextSession', ...)

addListener(eventName: 'readyForNextSession', listenerFunc: (event: SpeechRecognitionReadyEvent) => void) => Promise<PluginListenerHandle>

Listen for the recognizer becoming ready for another session.

| Param | Type |
| ------------ | ---- |
| eventName | 'readyForNextSession' |
| listenerFunc | (event: SpeechRecognitionReadyEvent) => void |

Returns: Promise<PluginListenerHandle>


removeAllListeners()

removeAllListeners() => Promise<void>

Removes every registered listener.


Interfaces

SpeechRecognitionAvailability

| Prop | Type |
| --------- | ------- |
| available | boolean |

SpeechRecognitionStartOptions

Configure how the recognizer behaves when calling start().

| Prop | Type | Description |
| ---------------------- | ------- | ----------- |
| language | string | Locale identifier such as en-US. When omitted, the device language is used. |
| maxResults | number | Maximum number of final matches returned by native APIs. Defaults to 5. |
| prompt | string | Prompt message shown inside the Android system dialog (ignored on iOS). |
| popup | boolean | When true, Android shows the OS speech dialog instead of running inline recognition. Defaults to false. |
| partialResults | boolean | Emits partial transcription updates through the partialResults listener while audio is captured. |
| addPunctuation | boolean | Enables native punctuation handling where supported (iOS 16+). |
| useOnDeviceRecognition | boolean | Opt in to the platform's newer on-device recognition path when available. On iOS 26+, this uses Apple's SpeechAnalyzer / SpeechTranscriber pipeline. On recent Android versions, this uses the on-device SpeechRecognizer path. It is intentionally opt-in so existing apps keep the legacy flow unless they choose to roll out the new behavior. Use isOnDeviceRecognitionAvailable() before enabling it in production. Platform SDK docs: Speech, SpeechAnalyzer, SpeechTranscriber (iOS); SpeechRecognizer (Android). Defaults to false. |
| allowForSilence | number | Number of milliseconds of silence to allow before splitting the recognition session into segments. Must be greater than zero. Currently supported on Android only. |
| continuousPTT | boolean | EXPERIMENTAL: Keep a PTT session alive across silence by restarting recognition while the button stays held. This restart behavior is implemented for Android inline recognition and iOS native recognition. |

SpeechRecognitionMatches

| Prop | Type |
| ------- | -------- |
| matches | string[] |

ForceStopOptions

Options for forceStop().

| Prop | Type | Description |
| ------- | ------ | ----------- |
| timeout | number | Android only: timeout in milliseconds before forcing stop via destroy/recreate. On iOS, the current session is stopped immediately and this value is ignored. Defaults to 1500. |

LastPartialResult

Result from getLastPartialResult().

| Prop | Type | Description |
| --------- | -------- | ----------- |
| available | boolean | Whether a partial result is currently cached. |
| text | string | The most recent transcript text known to the native recognizer. |
| matches | string[] | All current match alternatives when available. |

PTTStateOptions

Options for setPTTState().

| Prop | Type | Description |
| ---- | ------- | ----------- |
| held | boolean | Whether the PTT button is currently held. |

SpeechRecognitionLanguages

| Prop | Type |
| --------- | -------- |
| languages | string[] |

SpeechRecognitionListening

| Prop | Type |
| --------- | ------- |
| listening | boolean |

SpeechRecognitionPermissionStatus

Permission map returned by checkPermissions and requestPermissions.

On Android the state maps to the RECORD_AUDIO permission. On iOS it combines speech recognition plus microphone permission.

| Prop | Type |
| ----------------- | --------------- |
| speechRecognition | PermissionState |

PluginListenerHandle

| Prop | Type |
| ------ | ------------------- |
| remove | () => Promise<void> |

SpeechRecognitionSegmentResultEvent

Raised whenever a segmented result is produced (Android only).

| Prop | Type |
| ------- | -------- |
| matches | string[] |

SpeechRecognitionPartialResultEvent

Raised whenever a partial transcription is produced.

| Prop | Type | Description |
| --------------- | -------- | ----------- |
| matches | string[] | Current recognition matches when the native recognizer reports them. This can be omitted for forced or accumulated-only payloads. |
| accumulated | string | Accumulated transcription from earlier continuous PTT cycles. |
| accumulatedText | string | Final accumulated text including the current result. |
| isRestarting | boolean | true when the plugin is restarting recognition inside a continuous PTT session. |
| forced | boolean | true when the payload was emitted by forceStop(). |
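
Since any of these fields may be missing on a given payload, a UI usually needs one function that turns an event into display text. The sketch below treats every field as optional and prefers `accumulatedText` when present. That precedence and the space-joined fallback are assumptions of the sketch, and `transcriptForDisplay` is a hypothetical helper.

```typescript
// Local copy of the documented event shape, with every field optional.
interface PartialResultEvent {
  matches?: string[];
  accumulated?: string;
  accumulatedText?: string;
  isRestarting?: boolean;
  forced?: boolean;
}

// Hypothetical helper: returns the text a transcript view could show.
function transcriptForDisplay(event: PartialResultEvent): string {
  // accumulatedText is documented as already including the current result.
  if (event.accumulatedText) return event.accumulatedText;
  // Otherwise join earlier PTT cycles with the best current match.
  const current = (event.matches && event.matches[0]) || '';
  const earlier = event.accumulated || '';
  return [earlier, current].filter((part) => part.length > 0).join(' ');
}
```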

SpeechRecognitionListeningEvent

Raised when the listening state changes.

The original status field is preserved for backward compatibility and is present on the binary started / stopped states.

| Prop | Type | Description |
| --------- | -------------------- | ----------- |
| state | ListeningFiniteState | Finite state of the recognition session. |
| sessionId | number | Unique identifier for the current listening session. |
| reason | ListeningReason | Why this state transition occurred. |
| errorCode | string | Error code when the transition is caused by an error. |
| status | 'started' \| 'stopped' | Backward-compatible binary state used by earlier releases. |

SpeechRecognitionErrorEvent

Raised whenever native recognition reports an error.

| Prop | Type |
| --------- | ------ |
| code | string |
| message | string |
| sessionId | number |

SpeechRecognitionReadyEvent

Emitted after native resources have been torn down and the plugin is ready for another session.

| Prop | Type |
| --------- | ------ |
| sessionId | number |

Type Aliases

Pick

From T, pick a set of properties whose keys are in the union K

{ [P in K]: T[P]; }

PermissionState

'prompt' | 'prompt-with-rationale' | 'granted' | 'denied'

ListeningFiniteState

Finite state values for the recognition session lifecycle.

'startingListening' | 'started' | 'stoppingListening' | 'stopped'

ListeningReason

Why a listening state transition happened.

'userStart' | 'userStop' | 'forceStop' | 'results' | 'silence' | 'error' | 'unknown'
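
Because `ListeningReason` is a closed union, a `switch` with a `never` check lets the compiler flag any reason a future release adds. A small sketch follows. `describeReason` is a hypothetical helper and the labels are placeholder UI strings.

```typescript
// Values copied from the documented ListeningReason alias.
type ListeningReason =
  | 'userStart'
  | 'userStop'
  | 'forceStop'
  | 'results'
  | 'silence'
  | 'error'
  | 'unknown';

// Hypothetical helper: maps a transition reason to a status label.
function describeReason(reason: ListeningReason): string {
  switch (reason) {
    case 'userStart':
      return 'Listening';
    case 'userStop':
      return 'Stopped';
    case 'forceStop':
      return 'Stopped (forced)';
    case 'results':
      return 'Finished';
    case 'silence':
      return 'No speech detected';
    case 'error':
      return 'Recognition failed';
    case 'unknown':
      return 'Stopped';
    default: {
      // Compile-time exhaustiveness check: fails if a case is missing.
      const unreachable: never = reason;
      return unreachable;
    }
  }
}
```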