@superapp_men/speech-to-text
v1.1.1
Real-time speech recognition for SuperApp Partner Apps
Speech recognition for SuperApp partner apps. The partner app starts listening, the user speaks, the partner app stops listening and receives the transcribed text.
Install
npm install @superapp_men/speech-to-text
How It Works
Partner App (iframe)                      SuperApp (Capacitor host)
    |                                     |
    |-- startListening(config) ------>    |-- starts native mic
    |                                     |
    |   (user speaks...)                  |   (Android recognizes speech)
    |                                     |
    |-- stopListening() ------------->    |-- stops native mic
    |                                     |-- waits for final result
    |   <-- final transcript -----------  |

- The partner app controls when to start and stop
- The SuperApp handles native speech recognition via Android's SpeechRecognizer
- When the user pauses, the recognizer captures that segment and silently restarts
- All segments are accumulated into one final transcript returned when you stop
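The accumulation step described above can be pictured with a small sketch. This is only an illustration of the idea, not the SuperApp's actual implementation: `TranscriptAccumulator` and its methods are hypothetical names, and joining segments with a single space is an assumption.

```typescript
// Illustrative only: mimics the "accumulate segments, return one transcript" idea.
class TranscriptAccumulator {
  private readonly segments: string[] = [];

  /** Record the text recognized for one pause-delimited segment. */
  addSegment(text: string): void {
    const trimmed = text.trim();
    // Silent restarts can yield empty segments; skip them.
    if (trimmed.length > 0) this.segments.push(trimmed);
  }

  /** Join every stored segment into the single transcript returned on stop. */
  finish(): string {
    return this.segments.join(" ");
  }
}

const acc = new TranscriptAccumulator();
acc.addSegment("hello there");
acc.addSegment("   "); // an empty segment from a silent restart
acc.addSegment("how are you");
console.log(acc.finish()); // "hello there how are you"
```

From the partner app's point of view none of this is visible: only the joined text arrives in the result of stopListening().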
Quick Start
import {
  SpeechToText,
  RecognitionState,
  Language,
} from "@superapp_men/speech-to-text";

const stt = new SpeechToText({ timeout: 10000, debug: true });

// Request microphone permission before starting
const permission = await stt.requestPermission();
if (permission !== "granted") {
  // A bare `return` is not valid at module top level, so fail explicitly
  throw new Error("Microphone permission was not granted");
}

// Start listening: the mic opens and the user speaks
await stt.startListening({
  language: Language.AR_MA,
  stopMode: "manual",
  partialResults: false,
  popup: false,
  maxAlternatives: 1,
});

// ... user speaks for as long as they want ...

// Stop listening: returns the full transcript
const result = await stt.stopListening();
console.log(result.transcript); // "everything the user said"
Config
await stt.startListening({
  language: "ar-MA", // Language code (default: "en-US")
  stopMode: "manual", // "manual": you call stopListening() when done
  partialResults: false, // false: no interim text, only the final result on stop
  popup: false, // false: no native OS popup, the partner app manages the UI
  maxAlternatives: 1, // Number of alternative transcriptions (default: 1)
  maxDuration: 30000, // Safety timeout in ms (optional, default: none)
});
React Example
import { useEffect, useState, useMemo } from "react";
import {
  SpeechToText,
  RecognitionState,
  Language,
  type RecognitionResult,
} from "@superapp_men/speech-to-text";

function SpeechRecorder() {
  // Bumping instanceKey forces a fresh SpeechToText instance for the next session
  const [instanceKey, setInstanceKey] = useState(0);
  const stt = useMemo(
    () => new SpeechToText({ timeout: 10000, debug: true }),
    [instanceKey]
  );
  const [state, setState] = useState(RecognitionState.IDLE);
  const [transcript, setTranscript] = useState("");
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    const unsubs = [
      stt.on("stateChange", ({ state }) => setState(state)),
      stt.on("result", ({ result }) => setTranscript(result.transcript)),
      stt.on("error", ({ message }) => setError(message)),
      stt.on("listeningStopped", () => setInstanceKey((k) => k + 1)),
    ];
    // Unsubscribe and release the instance when it is replaced or unmounted
    return () => {
      unsubs.forEach((u) => u());
      stt.destroy();
    };
  }, [stt]);

  const isListening =
    state === RecognitionState.LISTENING ||
    state === RecognitionState.STARTING;

  const handleStart = async () => {
    setError(null);
    setTranscript("");
    const p = await stt.requestPermission();
    if (p !== "granted") {
      setError("Permission denied");
      return;
    }
    await stt.startListening({
      language: Language.AR_MA,
      stopMode: "manual",
      partialResults: false,
      popup: false,
    });
  };

  const handleStop = async () => {
    const result = await stt.stopListening();
    if (result) setTranscript(result.transcript);
  };

  return (
    <div>
      <p>State: {state}</p>
      {error && <p style={{ color: "red" }}>{error}</p>}
      <button onClick={handleStart} disabled={isListening}>
        {isListening ? "Listening..." : "Start"}
      </button>
      <button onClick={handleStop} disabled={!isListening}>
        Stop
      </button>
      {transcript && (
        <div>
          <p>Transcript:</p>
          <p dir="auto"><strong>{transcript}</strong></p>
        </div>
      )}
    </div>
  );
}
API
new SpeechToText(config?)
| Option | Type | Default | Description |
| -------- | --------- | ------- | ---------------------- |
| timeout | number | 5000 | Request timeout (ms) |
| debug | boolean | false | Enable console logging |
Methods
| Method | Returns | Description |
| -------------------- | --------------------------- | ----------------------------------- |
| isAvailable() | Promise<boolean> | Can the device do speech recognition? |
| checkPermission() | Promise<PermissionStatus> | Current mic permission status |
| requestPermission()| Promise<PermissionStatus> | Ask the user for mic access |
| getSupportedLanguages() | Promise<string[]> | Available language codes |
| startListening(config) | Promise<void> | Start the mic |
| stopListening() | Promise<RecognitionResult> | Stop the mic, get the transcript |
| getState() | RecognitionState | Current state |
| isListening() | boolean | Quick check |
| on(event, callback)| () => void | Subscribe (returns unsubscribe fn) |
| destroy() | void | Cleanup |
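As a rough illustration of how the availability and permission methods above might be combined before starting a session, here is a minimal sketch. It runs against a locally declared `SpeechPreflight` interface rather than the real class, treats permission statuses as plain strings, and assumes only that "granted" signals success (as in the Quick Start). `preflight` and `PreflightOutcome` are hypothetical names.

```typescript
// Local stand-in for the three methods this sketch relies on.
interface SpeechPreflight {
  isAvailable(): Promise<boolean>;
  checkPermission(): Promise<string>;
  requestPermission(): Promise<string>;
}

type PreflightOutcome = "ready" | "unavailable" | "permission-denied";

// Check availability first, then reuse an existing grant before prompting.
async function preflight(stt: SpeechPreflight): Promise<PreflightOutcome> {
  if (!(await stt.isAvailable())) return "unavailable";
  if ((await stt.checkPermission()) === "granted") return "ready";
  const answer = await stt.requestPermission();
  return answer === "granted" ? "ready" : "permission-denied";
}
```

Checking the current status before requesting avoids prompting users who have already granted access. A real SpeechToText instance should fit the interface structurally if its method signatures match the table above.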
Events
| Event | Payload | Description |
| ------------------ | ------------------------------ | ------------------------ |
| stateChange | { state, previousState } | State changed |
| listeningStarted | { sessionId, config } | Mic opened |
| listeningStopped | { sessionId, duration } | Mic closed |
| result | { result: RecognitionResult }| Final transcript arrived |
| error | { code, message } | Something went wrong |
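Because on() returns an unsubscribe function, several subscriptions can be bundled behind a single cleanup call. The sketch below shows one way to do that. `subscribeAll` and `SpeechEventSource` are hypothetical names declared locally, and payloads are typed as `unknown` because the package's exact payload types are not reproduced here.

```typescript
type Unsubscribe = () => void;

// Local stand-in for anything exposing on(event, callback) => unsubscribe.
interface SpeechEventSource {
  on(event: string, callback: (payload: unknown) => void): Unsubscribe;
}

// Register every handler and return one function that removes them all.
function subscribeAll(
  source: SpeechEventSource,
  handlers: Record<string, (payload: unknown) => void>,
): Unsubscribe {
  const unsubs = Object.keys(handlers).map((event) =>
    source.on(event, handlers[event]),
  );
  let done = false;
  return () => {
    if (done) return; // safe to call more than once
    done = true;
    unsubs.forEach((unsubscribe) => unsubscribe());
  };
}
```

The returned function can be handed directly to a framework cleanup hook, for example as the return value of a React effect.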
RecognitionResult
{
  transcript: string; // The transcribed text
  confidence: number; // Confidence score between 0 and 1
  isFinal: boolean; // Always true
  timestamp: number; // When the result was generated
}
Languages
enum Language {
  EN_US = "en-US",
  ES_ES = "es-ES",
  FR_FR = "fr-FR",
  AR_SA = "ar-SA",
  AR_MA = "ar-MA",
}
You can also pass any language code string (e.g. "zgh-MA").
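Since any code string is accepted, it can help to check a preferred code against the list returned by getSupportedLanguages() before starting. The helper below is a sketch of that check. `pickLanguage` is a hypothetical name, the case-insensitive comparison is an assumption, and the "en-US" fallback simply mirrors the documented default.

```typescript
// Return the first preferred code the device reports as supported,
// or the fallback when none of them match.
function pickLanguage(
  preferred: readonly string[],
  supported: readonly string[],
  fallback: string = "en-US",
): string {
  for (const wanted of preferred) {
    for (const code of supported) {
      // Compare case-insensitively but return the device's own spelling.
      if (code.toLowerCase() === wanted.toLowerCase()) return code;
    }
  }
  return fallback;
}

// In an app, `supported` would come from: await stt.getSupportedLanguages()
const supported = ["en-US", "fr-FR", "ar-MA"];
console.log(pickLanguage(["zgh-MA", "ar-MA"], supported)); // "ar-MA"
```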
States
enum RecognitionState {
  IDLE, STARTING, LISTENING, PROCESSING, STOPPED, ERROR
}
Important Notes
- After each stopListening(), create a new SpeechToText instance for the next session. The native mic needs a fresh acquisition on Android.
- The maxDuration config is a safety timeout: if the user forgets to stop, recognition ends automatically after that time.
- Long speech works: the native recognizer silently restarts on pauses and accumulates all text into one final result.
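Outside React, the "new instance per session" rule can be wrapped in a small helper. The following is a sketch under stated assumptions: `OneShotRecorder`, `SpeechSession`, and `ListenConfig` are hypothetical local names that mirror only the calls used here, and the 30000 ms default for maxDuration is an arbitrary illustrative value.

```typescript
// Local stand-ins for the parts of the package API this sketch relies on.
interface ListenConfig {
  language?: string;
  stopMode?: "manual";
  maxDuration?: number; // safety timeout in ms
}

interface SpeechSession {
  startListening(config: ListenConfig): Promise<void>;
  stopListening(): Promise<{ transcript: string }>;
  destroy(): void;
}

// Hypothetical helper: builds one fresh instance per start/stop cycle.
class OneShotRecorder {
  private current: SpeechSession | null = null;
  private readonly createInstance: () => SpeechSession;
  private readonly maxDurationMs: number;

  constructor(createInstance: () => SpeechSession, maxDurationMs: number = 30000) {
    this.createInstance = createInstance;
    this.maxDurationMs = maxDurationMs;
  }

  async start(language: string): Promise<void> {
    if (this.current) throw new Error("A session is already running");
    const session = this.createInstance();
    this.current = session;
    try {
      await session.startListening({
        language,
        stopMode: "manual",
        maxDuration: this.maxDurationMs,
      });
    } catch (err) {
      // Starting failed: release the instance so the next start() is clean.
      session.destroy();
      this.current = null;
      throw err;
    }
  }

  async stop(): Promise<string> {
    const session = this.current;
    if (!session) throw new Error("No session is running");
    try {
      const result = await session.stopListening();
      return result.transcript;
    } finally {
      // Discard the used instance; the next start() creates a new one.
      session.destroy();
      this.current = null;
    }
  }
}
```

With the real package, the factory would presumably look like `() => new SpeechToText({ timeout: 10000 })`, so each call to start() gets its own instance and every session carries a maxDuration safety net.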
For SuperApp Developers
Import from the /superapp entry point:
import {
  MessageType,
  RecognitionState,
  type SuperAppMessage,
  type StartListeningPayload,
} from "@superapp_men/speech-to-text/superapp";
See SpeechToTextPackageService.ts for the reference implementation.
License
MIT
Support
- Email: [email protected]
