@livefantasia/speechengine-client
v0.7.9-alpha
Node.js client library for LiveFantasia SpeechEngine streaming API
LiveFantasia SpeechEngine Client for Node.js
A powerful Node.js client library for the LiveFantasia SpeechEngine platform, providing real-time speech recognition capabilities through WebSocket streaming.
Features
- 🎤 Real-time Speech Recognition: Stream audio data and receive live transcription results
- 🌐 WebSocket Streaming: Efficient real-time communication with the SpeechEngine API
- 🔄 Multiple Sessions: Support for concurrent streaming sessions
- 🎯 TypeScript Support: Full TypeScript definitions included
- 📊 Session Management: Built-in session lifecycle management and statistics
- 🛠️ Utility Classes: Helper classes like TranscriptionManager for easy result handling
- 🎵 Audio Format Support: 16 kHz, 16-bit, mono WAV audio
- 📤 HTTP Transcription: Support for file-based transcription with real-time SSE updates
- 🌍 Multi-language: Support for multiple languages
- 📝 Comprehensive Examples: Rich set of examples for different use cases
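The audio format listed above (16 kHz, 16-bit, mono WAV) can be checked locally before any audio is streamed. The following is a minimal sketch that reads the standard RIFF/WAVE header; readWavFormat and isSupportedWav are hypothetical helper names, not part of this library, and the check assumes uncompressed PCM (format tag 1).

```typescript
// Hypothetical helpers (not part of @livefantasia/speechengine-client).
interface WavFormat {
  audioFormat: number; // 1 = uncompressed PCM
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Read the "fmt " chunk of a RIFF/WAVE file. Returns null if the bytes are not a WAV file.
function readWavFormat(bytes: Uint8Array): WavFormat | null {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (o: number): string =>
    String.fromCharCode(view.getUint8(o), view.getUint8(o + 1), view.getUint8(o + 2), view.getUint8(o + 3));

  if (bytes.length < 12 || tag(0) !== 'RIFF' || tag(8) !== 'WAVE') return null;

  // Walk the chunk list; "fmt " usually comes first but is not guaranteed to.
  let offset = 12;
  while (offset + 8 <= bytes.length) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    if (id === 'fmt ' && offset + 8 + 16 <= bytes.length) {
      return {
        audioFormat: view.getUint16(offset + 8, true),
        channels: view.getUint16(offset + 10, true),
        sampleRate: view.getUint32(offset + 12, true),
        bitsPerSample: view.getUint16(offset + 22, true),
      };
    }
    offset += 8 + size + (size % 2); // chunks are padded to an even length
  }
  return null;
}

// True only for 16 kHz, 16-bit, mono PCM.
function isSupportedWav(bytes: Uint8Array): boolean {
  const fmt = readWavFormat(bytes);
  return fmt !== null && fmt.audioFormat === 1 && fmt.channels === 1
    && fmt.sampleRate === 16000 && fmt.bitsPerSample === 16;
}
```

Because a Node.js Buffer is a Uint8Array, the result of fs.readFileSync can be passed to isSupportedWav directly.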
Installation
npm install @livefantasia/speechengine-client
Optional Dependencies
For real-time microphone examples, you may want to install one of these packages depending on your platform:
# For Apple Silicon compatibility (recommended)
npm install mic
Note: These packages are only required if you want to run the real-time microphone examples. They are not needed for the core library functionality.
Platform Compatibility Notes
Apple Silicon (M1/M2/M3) Macs:
- ✅ Recommended: Use the mic module for real-time microphone examples
- ❌ Avoid: naudiodon can cause segmentation faults and build failures on ARM architecture
Intel/x86 Systems:
- ✅ Both mic and naudiodon should work
- 💡 Tip: mic is more universally compatible across platforms
CI/CD Environments:
- ⚠️ Important: naudiodon requires native compilation and may fail in containerized environments (Ubuntu, Alpine Linux)
- ✅ Solution: Use mic or exclude audio dependencies from CI builds if not needed
Build Issues with naudiodon:
If you encounter build failures related to naudiodon, this is typically due to:
- Missing system audio libraries (ALSA, PulseAudio on Linux)
- Incompatible architecture (ARM vs x86)
- Missing build tools (node-gyp, Python, C++ compiler)
Recommended approach: Use the real-time-microphone-node-mic.ts example with the mic module for better cross-platform compatibility.
Quick Start
1. Set up environment variables
export SPEECHENGINE_API_KEY="your-api-key-here"
export SPEECHENGINE_BASE_URL="https://api.livefantasia.com"
2. Basic streaming example
import { SpeechEngineWSClient, TranscriptionManager, TranscriptionUpdateMessage } from '@livefantasia/speechengine-client';
import * as fs from 'fs';
async function basicExample() {
  // Initialize the client
  const client = new SpeechEngineWSClient({
    baseUrl: process.env.SPEECHENGINE_BASE_URL!,
    apiKey: process.env.SPEECHENGINE_API_KEY!,
  });
  let session;
  try {
    // Create a streaming session (model is required)
    session = await client.createSession({
      language: 'en-US',
      model: 'general_stt_en_latest',
      wordTimestamp: true,
    });
    // Use TranscriptionManager for easy result handling
    const transcriptionManager = new TranscriptionManager();
    // Set up event handlers
    session.on('sessionReady', () => {
      console.log('Session ready, starting audio stream...');
    });
    session.on('transcriptionUpdate', (message: TranscriptionUpdateMessage) => {
      transcriptionManager.handleTranscriptionUpdate(message);
      console.log('Live transcription:', transcriptionManager.getConcatenatedTranscription());
    });
    session.on('sessionEnd', () => {
      console.log('Final transcription:', transcriptionManager.getFinalTranscription());
    });
    // Connect and start the stream
    await session.connect();
    await session.startStream();
    // Stream audio data
    const audioData = fs.readFileSync('path/to/your/audio.wav');
    await session.sendAudio(audioData);
    await session.endStream(3); // 3-second grace period
  } catch (error) {
    console.error('Error:', error);
  } finally {
    await client.closeAllSessions();
  }
}
basicExample();
Conventions
- All message payloads emitted to your handlers use camelCase, consistent with Node.js conventions.
  - segmentId, text, startMs, endMs, isFinal, utteranceOrder, words[] with word, startMs, endMs.
- Stream start options are provided in camelCase via startStream(options) and are converted internally to the server's snake_case.
- Configure defaults at session creation using SessionConfig camelCase fields.
Stream Start Options
Use startStream(options) to enable word timestamps. VAD configuration is set when creating the session.
await session.startStream({
  wordTimestamp: true,
});
These options are validated locally; invalid values throw a ClientErrorCode.INVALID_PARAMETER error before any network call.
API Reference
SpeechEngineClient
The main client class for interacting with the SpeechEngine API.
Constructor
const client = new SpeechEngineClient(config: SpeechEngineClientConfig);
Configuration Options:
- apiKey: string - Your SpeechEngine API key
- baseUrl: string - Base URL for the API (include scheme, e.g., https://api.livefantasia.com)
- defaultLanguage?: Language - Default language ('en-US' by default)
- connectionTimeoutMs?: number - Connection timeout in milliseconds (default: 10000)
- maxConcurrentSessions?: number - Max concurrent sessions (default: 10)
- debug?: boolean - Enable client debug logging
- logger?: LoggerConfig - Customize logging behavior
Methods
createSession(config: SessionConfig): Promise<StreamingSession>
Creates a new streaming session.
Session Configuration (fields without ? are required):
- language: Language - Language code (e.g., 'en-US', 'es-ES')
- model: string - Model ID for recognition (required)
- wordTimestamp?: boolean - Enable word-level timestamps
- vadMinSilenceDuration?: number - Minimum silence duration in milliseconds (300 to 1500)
- productCode?: string - Product code for billing (default: 'STT_STREAMING')
requestSessionToken(config: SessionConfig): Promise<SessionInitResponse>
Requests a Control Plane token without creating a session.
createSession(config: HttpTranscriptionConfig): Promise<HttpSession>
Creates a new HTTP session.
createSessionFromToken(sessionInitData: SessionInitResponse, config: HttpTranscriptionConfig): HttpSession
Creates a session using a pre-obtained token and URL.
closeSession(sessionId: string): void
Closes a single session.
HttpSession
Represents an active HTTP transcription session.
Methods
transcribe(file: string | Uint8Array | Buffer | Readable, mediaType?: string): AsyncGenerator<HttpTranscriptionEvent, void, unknown>
Transcribe an audio file or data using the HTTP SSE endpoint.
- If file is a string, it's treated as a file path.
- If file is binary data, mediaType is required.
getSessionId(): string
Returns the session ID.
getServiceUrl(): string
Returns the service URL used for this session.
closeAllSessions(): Promise<void>
Closes all active sessions created by this client.
close(): Promise<void>
Closes the client and all active sessions.
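Since transcribe() returns an AsyncGenerator, its events can be consumed with a for await loop. The sketch below illustrates the pattern only: the EventLike shape (isFinal, text) is an assumption, because the fields of HttpTranscriptionEvent are not listed in this README, and fakeTranscribe is a hypothetical stand-in for a real session.transcribe(...) call.

```typescript
// Assumed event shape for illustration; check the package's type definitions
// for the real HttpTranscriptionEvent fields.
interface EventLike {
  isFinal: boolean;
  text: string;
}

// Collect the text of final events from any async event source.
async function collectFinalText(events: AsyncIterable<EventLike>): Promise<string> {
  const parts: string[] = [];
  for await (const event of events) {
    if (event.isFinal) parts.push(event.text);
  }
  return parts.join(' ');
}

// Hypothetical stand-in for session.transcribe(...), for local experimentation only.
async function* fakeTranscribe(): AsyncGenerator<EventLike, void, unknown> {
  yield { isFinal: false, text: 'hello wor' };
  yield { isFinal: true, text: 'hello world' };
  yield { isFinal: true, text: 'how are you' };
}
```

With a real HttpSession, the generator returned by transcribe() would take the place of fakeTranscribe(), with the filtering condition adapted to the actual event fields.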
StreamingSession
Represents an active streaming session.
Events
- sessionReady - Session is ready to receive audio
- transcriptionUpdate - New transcription data received
- sessionEnd - Session ended
- error - Error occurred
Methods
connect(): Promise<void>
Connects to the session (establishes WebSocket connection).
startStream(options?: { wordTimestamp?: boolean }): Promise<void>
Starts the stream; emits sessionReady when streaming can begin.
sendAudio(audioData: Buffer): Promise<void>
Sends audio data while in streaming state.
endStream(graceWaitingTime?: number): Promise<void>
Ends the stream and finalizes transcription, waiting up to graceWaitingTime seconds.
disconnect(): Promise<void>
Disconnects the session and closes the WebSocket.
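For long recordings or live sources, sendAudio can be called repeatedly with small chunks instead of one large buffer. The following sketch splits 16 kHz, 16-bit, mono PCM into fixed-duration chunks and sends them in order. chunkAudio, streamInChunks, and the AudioSink interface are hypothetical names introduced for illustration; the 100 ms default is an arbitrary choice, not a documented requirement of the service.

```typescript
// 16 kHz * 2 bytes per sample / 1000 = 32 bytes of PCM per millisecond.
const BYTES_PER_MS = 32;

// Split a PCM payload into views of at most `chunkMs` milliseconds each (no copying).
function chunkAudio(pcm: Uint8Array, chunkMs = 100): Uint8Array[] {
  if (!Number.isInteger(chunkMs) || chunkMs <= 0) {
    throw new RangeError('chunkMs must be a positive integer');
  }
  const chunkBytes = chunkMs * BYTES_PER_MS;
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += chunkBytes) {
    chunks.push(pcm.subarray(offset, Math.min(offset + chunkBytes, pcm.length)));
  }
  return chunks;
}

// Hypothetical minimal view of a session: only the sendAudio call documented above.
interface AudioSink {
  sendAudio(audioData: Uint8Array): Promise<void>;
}

// Send chunks one at a time, awaiting each call so ordering is preserved.
async function streamInChunks(sink: AudioSink, pcm: Uint8Array, chunkMs = 100): Promise<number> {
  let sent = 0;
  for (const chunk of chunkAudio(pcm, chunkMs)) {
    await sink.sendAudio(chunk);
    sent += 1;
  }
  return sent;
}
```

When pcm is a Node.js Buffer, subarray returns Buffer views, so each chunk matches the Buffer parameter type that sendAudio documents.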
TranscriptionManager
Utility class for managing transcription results.
Methods
handleTranscriptionUpdate(message: TranscriptionUpdateMessage): void
Processes a transcription update message with interim replacement and deduplication.
getConcatenatedTranscription(): string
Gets the assembled transcription (interim + final as appropriate).
getFinalTranscription(): string
Gets the final transcription result.
getSegments(): TranscriptionSegment[]
Gets all transcription segments.
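To make "interim replacement and deduplication" concrete, here is a minimal sketch of one way such assembly can work, keyed by segmentId. It is not the library's implementation: SegmentAssembler and UpdateLike are hypothetical names, the field names are taken from the Conventions section, and the type of segmentId is an assumption.

```typescript
// Hypothetical subset of an update message (field names from the Conventions section).
interface UpdateLike {
  segmentId: string | number; // actual type is an assumption
  text: string;
  isFinal: boolean;
}

class SegmentAssembler {
  // A Map keeps insertion order, so segments stay in first-arrival order.
  private readonly segments = new Map<string | number, UpdateLike>();

  handle(update: UpdateLike): void {
    const existing = this.segments.get(update.segmentId);
    // Deduplication: once a segment is final, ignore late interim updates for it.
    if (existing !== undefined && existing.isFinal && !update.isFinal) return;
    // Interim replacement: a newer update overwrites the earlier text of the same segment.
    this.segments.set(update.segmentId, update);
  }

  // Text of all segments, interim and final.
  concatenated(): string {
    return this.join(() => true);
  }

  // Text of finalized segments only.
  finalOnly(): string {
    return this.join((s) => s.isFinal);
  }

  private join(keep: (s: UpdateLike) => boolean): string {
    return Array.from(this.segments.values())
      .filter(keep)
      .map((s) => s.text.trim())
      .join(' ');
  }
}
```

In application code, TranscriptionManager already provides this behavior; the sketch is only meant to show why repeated interim updates for one segment do not produce repeated text.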
Examples
The library comes with comprehensive examples in the examples/ directory:
Basic Examples
- simple-streaming.ts - Recommended streaming workflow using TranscriptionManager
- minimal-streaming.ts - Direct event handling without utilities
Advanced Examples
- multiple-sessions.ts - Managing multiple concurrent sessions
- error-handling.ts - Comprehensive error handling patterns
Streaming Examples
- real-time-microphone-node-mic.ts - Real-time microphone streaming with mic module (Apple Silicon compatible)
- file-streaming.ts - Streaming audio from files
Running Examples
# Basic streaming example
npx ts-node examples/basic/simple-streaming.ts
# Streaming with VAD options
npx ts-node examples/basic/simple-streaming-vad.ts
# Real-time microphone (Apple Silicon compatible)
npm install mic
npx ts-node examples/streaming/real-time-microphone-node-mic.ts
# Multiple sessions
npx ts-node examples/advanced/multiple-sessions.ts
Apple Silicon Compatibility
The real-time microphone example uses the mic module, which is compatible with Apple Silicon Macs (M1/M2/M3). This avoids the known issues with naudiodon/PortAudio that can cause segmentation faults on ARM-based Macs.
npm install mic
npx ts-node examples/streaming/real-time-microphone-node-mic.ts
Why mic over naudiodon?
Cross-Platform Stability:
- mic works reliably across macOS (Intel & Apple Silicon), Linux, and Windows
- naudiodon has known compatibility issues with ARM architecture and CI/CD environments
Build Reliability:
- mic has fewer native dependencies and simpler build requirements
- naudiodon requires PortAudio and can fail in containerized environments (Docker, CI/CD)
Development Experience:
- mic provides a simpler API for basic microphone access
- Less prone to segmentation faults and memory issues on Apple Silicon
If you encounter build issues with audio dependencies in your CI/CD pipeline, consider excluding them from your production dependencies or using the file-based streaming examples instead.
Supported Languages
- English (en)
- Spanish (es)
- French (fr)
- German (de)
- Italian (it)
- Portuguese (pt)
- And more...
Error Handling
The library provides comprehensive error handling with specific error types:
import { SpeechEngineError } from '@livefantasia/speechengine-client';
try {
  await session.connect();
} catch (error) {
  if (error instanceof SpeechEngineError) {
    console.error('SpeechEngine Error:', error.code, error.message);
    console.error('Category:', error.category);
    console.error('Retryable:', error.retryable);
  }
}
Logging
The client’s logs can be routed into your application’s logger. By default, logs print to the console at info level.
Winston integration example
import { createSpeechEngineClient } from '@livefantasia/speechengine-client';
import winston from 'winston';
const appLogger = winston.createLogger({
  level: 'info',
  transports: [new winston.transports.Console()],
});
const client = createSpeechEngineClient({
  baseUrl: 'https://api.livefantasia.com',
  apiKey: process.env.SPEECHENGINE_API_KEY!,
  logger: {
    level: 'info',
    enableConsole: false,
    customHandler: (entry) => {
      const level = entry.level.toLowerCase();
      const prefix = `${entry.component}${entry.sessionId ? ':' + entry.sessionId : ''}`;
      const message = `${prefix} - ${entry.message}`;
      const meta = entry.data ? { data: entry.data, ts: entry.timestamp.toISOString() } : { ts: entry.timestamp.toISOString() };
      appLogger.log({ level, message, ...meta });
    },
  },
});
Notes:
- Set enableConsole: false to prevent duplicate console output.
- customHandler receives structured entries; you control formatting and routing.
- Sensitive auth data (JWTs and Bearer tokens) is redacted before logging.
Development
Building
npm run build
Testing
npm test
npm run test:coverage
Linting
npm run lint
npm run lint:fix
Type Checking
npm run type-check
Requirements
- Node.js >= 20.0.0
- TypeScript >= 5.1.0 (for development)
License
MIT License - see the LICENSE file for details.
Support
- Documentation: API Documentation
- Issues: GitHub Issues
- Examples: See the examples/ directory for comprehensive usage examples
Contributing
We welcome contributions! Please see our contributing guidelines for more information.
Made with ❤️ by LiveFantasia
