# AppTek Streaming JS SDK
A TypeScript SDK for integrating bidirectional streaming ASR (Automatic Speech Recognition) and Translation with AppTek's services.
## Features
- **JavaScript SDK Documentation**: Read the Type Definition Docs
- **Full GRPC Documentation**: Read the GRPC Docs
- **High-Level SDK**: Easy-to-use `AppTekSDK` class for managing sessions.
- **Microphone Streaming**: Built-in support for capturing and streaming microphone audio.
- **Generic MediaStream Support**: Process audio from `<video>` or `<audio>` elements or any WebRTC stream.
- **Get Available Languages**: Access available languages for live transcription and translation.
## Installation
```bash
npm install @apptek/streaming-js-sdk
```

## Getting Started
### 1. Initialize the SDK
You can initialize the SDK with your proxy URL and license key.
```ts
import { AppTekSDK } from "@apptek/streaming-js-sdk";
// The proxyUrl defaults to "https://accessibility.apptek.com/grpc-proxy" if omitted.
const sdk = new AppTekSDK(
"https://accessibility.apptek.com/grpc-proxy",
"YOUR_APPTEK_LICENSE_KEY"
);
// Validate the license key with the server
await sdk.init();
```

### 2. Get available languages for transcription and translation
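The SDK exposes `getLanguages()` and `getTranslateLanguages()` for this (see the API Reference below). A minimal sketch; the exact shape of the returned `Recognize2AvailableResponse` and `TranslateAvailableResponse` objects is not reproduced here:

```ts
// Fetch the languages available for live transcription.
const recognitionLanguages = await sdk.getLanguages();
console.log("Transcription languages:", recognitionLanguages);

// Fetch the target languages available for translation.
const translationLanguages = await sdk.getTranslateLanguages();
console.log("Translation languages:", translationLanguages);
```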
Use one of the returned language codes as the `langCode` in your stream configuration. Example config:
```ts
const config = {
audioConfiguration: {
sampleType: "INT16",
sampleRateHz: 16000,
channels: 1
},
speechConfiguration: {
langCode: "en-US"
}
};
```

### 3. Stream from Microphone
To start recording from the user's microphone:
```ts
const config = {
audioConfiguration: {
sampleType: "INT16",
sampleRateHz: 16000,
channels: 1
},
speechConfiguration: {
langCode: "en-US"
}
};
await sdk.startMicrophone(config, {
onData: (data) => {
// Handle partial or final transcriptions
if (data.transcription?.stableTranscriptions) {
console.log("Partial:", data.transcription.stableTranscriptions);
}
if (data.segment) {
console.log("Final:", data.segment.text);
}
},
onError: (err) => {
console.error("Streaming Error:", err);
},
onStatusChange: (status) => {
console.log("SDK Status:", status); // Idle, Connecting, Recording, etc.
}
});
```
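To end the session, call `stopMicrophone()` (documented in the API Reference below):

```ts
// Stops the current streaming session and releases the microphone.
sdk.stopMicrophone();
```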
### 4. Stream from Video/Audio Element

You can process audio from an existing `MediaStream` (e.g., from a `<video>` element):
```ts
const videoElement = document.querySelector('video');
const stream = videoElement.captureStream(); // or valid MediaStream
await sdk.startFromStream(stream, config, {
onData: (data) => console.log(data),
onError: (err) => console.error(err)
});
```
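Note that `captureStream()` is not uniformly available; Firefox has historically shipped it as `mozCaptureStream()`. A defensive sketch (the fallback property name is a browser detail, not part of this SDK):

```ts
const videoElement = document.querySelector("video")!;

// Feature-detect captureStream(); fall back to Firefox's prefixed variant.
const capture =
  (videoElement as any).captureStream ?? (videoElement as any).mozCaptureStream;
if (typeof capture !== "function") {
  throw new Error("This browser cannot capture a MediaStream from a media element.");
}
const stream: MediaStream = capture.call(videoElement);
```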
### 5. Waveform Visualization

You can receive raw audio data (`Float32Array`) to draw a real-time waveform.
```ts
await sdk.startMicrophone(config, {
onData: handleTranscription,
onError: handleError,
onMicrophoneData: (audioData) => {
// audioData is a Float32Array (-1.0 to 1.0)
// can be used to visualize audio data
}
});
```
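As an illustration of what `onMicrophoneData` enables, the handler below renders each incoming buffer as a simple oscilloscope trace. The `<canvas>` element and drawing style are illustrative, not part of the SDK:

```ts
const canvas = document.querySelector("canvas")!;
const ctx = canvas.getContext("2d")!;

// Draw one Float32Array chunk (values in -1.0..1.0) across the canvas width.
function drawWaveform(audioData: Float32Array): void {
  const { width, height } = canvas;
  ctx.clearRect(0, 0, width, height);
  ctx.beginPath();
  for (let i = 0; i < audioData.length; i++) {
    const x = (i / audioData.length) * width;
    const y = ((1 - audioData[i]) / 2) * height; // map [-1, 1] to [height, 0]
    if (i === 0) ctx.moveTo(x, y);
    else ctx.lineTo(x, y);
  }
  ctx.stroke();
}
```

Pass `drawWaveform` as the `onMicrophoneData` callback in the snippet above.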
## API Reference

### `AppTekSDK`
#### Constructor
```ts
new AppTekSDK(proxyUrl?: string, licenseKey?: string)
```

- `proxyUrl`: Base URL of the proxy server (default: `https://accessibility.apptek.com/grpc-proxy`).
- `licenseKey`: AppTek license key.
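Since both arguments are optional, a sketch relying on the default proxy URL and passing the license key at `init()` time instead:

```ts
// Uses the default proxy URL; the license key is supplied to init(),
// which returns Promise<boolean> per the signature below.
const sdk = new AppTekSDK();
const ok = await sdk.init("YOUR_APPTEK_LICENSE_KEY");
if (!ok) {
  throw new Error("AppTek license validation failed");
}
```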
#### Methods
- `init(licenseKey?: string): Promise<boolean>`: Validates the license key with the server.
- `startMicrophone(config, callbacks, stream?): Promise<void>`: Starts an audio session.
  - `config`: `Recognize2StreamConfig` object.
  - `callbacks`: Event handlers (`onData`, `onError`, `onStatusChange`, `onMicrophoneData`).
  - `stream`: Optional `MediaStream` to use (overrides the microphone).
- `startFromStream(stream, config, callbacks): Promise<void>`: Alias for `startMicrophone`, specifically for external streams.
- `stopMicrophone(): void`: Stops the current streaming session.
- `getLanguages(): Promise<Recognize2AvailableResponse>`: Fetches available recognition languages.
- `getTranslateLanguages(): Promise<TranslateAvailableResponse>`: Fetches available translation languages.
### Configuration Object
```ts
interface Recognize2StreamConfig {
audioConfiguration: {
sampleType: "INT16";
sampleRateHz: number; // e.g., 16000
channels: 1;
};
speechConfiguration: {
langCode: string; // e.g., "en-US"
model?: string;
};
translateConfigurations?: Array<{
domain: string;
targetLangCode: string; // e.g., "es"
}>;
diarizerConfiguration?: {
enable: boolean;
};
}
```
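For instance, a config that also requests Spanish translation and speaker diarization could look like the following; the `"general"` domain value is an assumption, so check AppTek's documentation for the domains your license supports:

```ts
const config: Recognize2StreamConfig = {
  audioConfiguration: {
    sampleType: "INT16",
    sampleRateHz: 16000,
    channels: 1,
  },
  speechConfiguration: {
    langCode: "en-US",
  },
  translateConfigurations: [
    // "general" is an assumed domain name, not confirmed by these docs.
    { domain: "general", targetLangCode: "es" },
  ],
  diarizerConfiguration: {
    enable: true,
  },
};
```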
## Troubleshooting & Advanced Configuration

### AudioWorklet & CORS / CSP Issues
The SDK uses an AudioWorklet to process microphone audio efficiently. By default, it attempts to load this worklet via an inline Blob URL to make getting started easier. However, this can cause issues in two scenarios:
- **Content Security Policy (CSP)**: Your site blocks `worker-src blob:`.
- **CORS**: In some strict browser environments, loading worklets from cross-origin Blobs can fail.

Symptoms:

- Error: `DOMException: The user aborted a request.`
- Error: `SecurityError: The operation is insecure.`
- Network error loading `blob:...`
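If you control your site's CSP and want the default inline Blob loading to keep working, allowing `blob:` workers is the quick fix. A sketch of the relevant header directive (adapt it to your existing policy):

```text
Content-Security-Policy: worker-src 'self' blob:
```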
**Workaround (Recommended for Production):**
Host the processor file yourself to avoid Blob/CSP issues.
- There is a file named `pcm-processor.js` (or similar audio processor code) in the SDK. You can extract the code from `Microphone.ts` explicitly or serve a static file.
- Host this file on your server (e.g., `/assets/pcm-processor.js`).
- Pass the `workletUrl` to `startMicrophone`:
```ts
const micConfig = {
audioConfiguration: { ... },
speechConfiguration: { ... }
};
// Use your own hosted processor file
await sdk.startMicrophone(micConfig, callbacks, undefined, {
workletUrl: "/assets/pcm-processor.js"
});
```

Note: You may need to access the internal `MicrophoneRecorder` directly if the high-level SDK doesn't expose `workletUrl` in the top-level config yet, or ensure your config object is passed correctly down to the microphone.
