@livefantasia/speechengine-client

v1.0.0-alpha

Node.js client library for LiveFantasia SpeechEngine streaming API

LiveFantasia SpeechEngine Client for Node.js

A powerful Node.js client library for the LiveFantasia SpeechEngine platform, providing real-time speech recognition capabilities through WebSocket streaming.

Features

  • 🎤 Real-time Speech Recognition: Stream audio data and receive live transcription results
  • 🌐 WebSocket Streaming: Efficient real-time communication with the SpeechEngine API
  • 🔄 Multiple Sessions: Support for concurrent streaming sessions
  • 🎯 TypeScript Support: Full TypeScript definitions included
  • 📊 Session Management: Built-in session lifecycle management and statistics
  • 🛠️ Utility Classes: Helper classes like TranscriptionManager for easy result handling
  • 🎵 Audio Format Support: Supports 16 kHz, 16-bit, mono WAV audio
  • 📤 HTTP Transcription: Support for file-based transcription with real-time SSE updates
  • Async Transcription: Support for long-running transcription jobs via URL
  • 🌍 Multi-language: Support for multiple languages
  • 📝 Comprehensive Examples: Rich set of examples for different use cases
  • 🎯 Hotwords Support: Boost recognition accuracy for specific words and phrases

Installation

npm install @livefantasia/speechengine-client

Optional Dependencies

For real-time microphone examples, you may want to install one of these packages depending on your platform:

# For Apple Silicon compatibility (recommended)
npm install mic

Note: These packages are only required if you want to run the real-time microphone examples. They are not needed for the core library functionality.

Platform Compatibility Notes

Apple Silicon (M1/M2/M3) Macs:

  • Recommended: Use mic module for real-time microphone examples
  • Avoid: naudiodon can cause segmentation faults and build failures on ARM architecture

Intel/x86 Systems:

  • ✅ Both mic and naudiodon should work
  • 💡 Tip: mic is more universally compatible across platforms

CI/CD Environments:

  • ⚠️ Important: naudiodon requires native compilation and may fail in containerized environments (Ubuntu, Alpine Linux)
  • Solution: Use mic or exclude audio dependencies from CI builds if not needed

Build Issues with naudiodon: If you encounter build failures related to naudiodon, this is typically due to:

  • Missing system audio libraries (ALSA, PulseAudio on Linux)
  • Incompatible architecture (ARM vs x86)
  • Missing build tools (node-gyp, Python, C++ compiler)

Recommended approach: Use the real-time-microphone-node-mic.ts example with the mic module for better cross-platform compatibility.

Usage

Using SpeechEngineClient

import { SpeechEngineClient } from '@livefantasia/speechengine-client';

const config = {
  apiKey: process.env.SPEECHENGINE_API_KEY!,
  baseUrl: process.env.SPEECHENGINE_BASE_URL!,
};

const client = new SpeechEngineClient(config);

// Create Streaming Session
const session = await client.createStreamingSession({
  language: 'en-US',
  model: 'general_stt_en_latest'
});

// Submit Async Job
const asyncSession = await client.submitAsyncJob('https://example.com/audio.wav', {
  language: 'en-US',
  model: 'general_stt_en_latest'
});

Async Transcription

For long audio files (for example, longer than 1 minute), use an async session. For production usage, prefer callback_url over waitForCompletion(). waitForCompletion() is a convenience polling helper and may wait up to 24 hours depending on job duration. If you use polling, the minimum pollingIntervalMs is 60000 (60 seconds), and you should call cancelPolling() or disconnect() when your app no longer needs the polling lifecycle. Server-side job cancellation is not currently supported by AsyncSession.

const asyncSession = await client.submitAsyncJob('https://example.com/audio.wav', {
  language: 'en-US',
  model: 'general_stt_en_latest'
});

console.log('Job submitted:', asyncSession.getSessionId());

const result = await asyncSession.waitForCompletion({ pollingIntervalMs: 60000 });
console.log('Transcription:', result.text);

To receive a callback when the job completes or fails, provide a callback URL and optional headers:

const asyncSession = await client.submitAsyncJob(process.env.ASYNC_AUDIO_URL!, {
  language: 'en-US',
  model: 'general_stt_en_latest',
  callback_url: process.env.ASYNC_CALLBACK_URL!,
  callback_headers: process.env.ASYNC_CALLBACK_HEADERS
    ? JSON.parse(process.env.ASYNC_CALLBACK_HEADERS)
    : undefined,
  // Optional: Boost recognition for specific words/phrases (English languages only)
  hotwords: ['SpeechEngine', 'LiveFantasia'],
  hotwordContextScore: 1.0, // Neutral boost
});
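The example above passes the result of JSON.parse straight through as callback headers. A minimal sketch of a stricter parser follows, assuming the headers arrive as a JSON object of string values (for instance in ASYNC_CALLBACK_HEADERS). The helper name parseCallbackHeaders is hypothetical and not part of the library.

```typescript
// Hypothetical helper (not part of the library): turn a JSON string of HTTP
// headers into a flat string-to-string map, or undefined when nothing is set.
function parseCallbackHeaders(raw: string | undefined): Record<string, string> | undefined {
  if (raw === undefined || raw.trim() === '') return undefined;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error('Callback headers must be valid JSON');
  }

  // Only a plain JSON object makes sense as a header map.
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('Callback headers must be a JSON object');
  }

  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(parsed)) {
    if (typeof value !== 'string') {
      throw new Error(`Callback header "${name}" must be a string`);
    }
    headers[name] = value;
  }
  return headers;
}
```

The result could then be passed as the callback headers option in place of the inline JSON.parse call, so that a malformed environment variable fails with a readable message before the job is submitted.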

Quick Start

1. Set up environment variables

export SPEECHENGINE_API_KEY="your-api-key-here"
export SPEECHENGINE_BASE_URL="https://api.livefantasia.com"
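Because the examples read both variables with a non-null assertion, a missing value only surfaces later as a confusing runtime failure. The sketch below checks them up front. The function name loadClientConfig and the error messages are hypothetical; only the two variable names and the requirement that the base URL include a scheme come from this README.

```typescript
// Shape of the two required client options used throughout this README.
interface ClientEnvConfig {
  apiKey: string;
  baseUrl: string;
}

// Hypothetical helper (not part of the library): read and validate the two
// environment variables before constructing the client.
function loadClientConfig(env: Record<string, string | undefined>): ClientEnvConfig {
  const apiKey = env['SPEECHENGINE_API_KEY']?.trim();
  const baseUrl = env['SPEECHENGINE_BASE_URL']?.trim();

  if (!apiKey) throw new Error('SPEECHENGINE_API_KEY is not set');
  if (!baseUrl) throw new Error('SPEECHENGINE_BASE_URL is not set');

  // The base URL is documented as needing a scheme, e.g. https://...
  if (!/^https?:\/\//i.test(baseUrl)) {
    throw new Error('SPEECHENGINE_BASE_URL must include a scheme such as https://');
  }
  return { apiKey, baseUrl };
}
```

In an application this would typically be called as loadClientConfig(process.env) and the result passed to the SpeechEngineClient constructor.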

2. Basic streaming example

import { SpeechEngineClient, TranscriptionManager, UtteranceUpdateMessage } from '@livefantasia/speechengine-client';
import * as fs from 'fs';

async function basicExample() {
  // Initialize the client
  const client = new SpeechEngineClient({
    baseUrl: process.env.SPEECHENGINE_BASE_URL!,
    apiKey: process.env.SPEECHENGINE_API_KEY!,
  });

  let session;
  try {
    // Create a streaming session (model is required)
    session = await client.createStreamingSession({
      language: 'en-US',
      model: 'general_stt_en_latest',
      // Optional: Boost recognition for specific words/phrases (English languages only)
      hotwords: ['SpeechEngine', 'LiveFantasia', 'speech-to-text'],
      hotwordContextScore: 1.5, // Moderate boost (range: 0.2-5.0, recommended: 0.7-2.0)
    });

    // Use TranscriptionManager for easy result handling
    const transcriptionManager = new TranscriptionManager();

    // Set up event handlers
    session.on('sessionReady', () => {
      console.log('Session ready, starting audio stream...');
    });

    session.on('utteranceUpdate', (message: UtteranceUpdateMessage) => {
      transcriptionManager.handleUtteranceUpdate(message);
      console.log('Live transcription:', transcriptionManager.getConcatenatedTranscription());
    });

    session.on('sessionEnd', () => {
      console.log('Final transcription:', transcriptionManager.getFinalTranscription());
    });

    // Connect and start the stream
    await session.connect();
    await session.startStream({
      wordTimestamp: true,
    });

    // Stream audio data
    const audioData = fs.readFileSync('path/to/your/audio.wav');
    await session.sendAudio(audioData);
    await session.endStream(3); // 3-second grace period

  } catch (error) {
    console.error('Error:', error);
  } finally {
    await client.closeAllSessions();
  }
}

basicExample();
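The quick start passes hotwords and hotwordContextScore. The limits documented for these fields (English languages only, at most 15 entries, 150 characters total, score between 0.2 and 5.0) can be checked before a session is requested. The following is a minimal sketch, not the library's own validation: the helper name findHotwordProblems is hypothetical, and it assumes that "150 chars total" means the summed length of all entries.

```typescript
// Subset of the session options relevant to hotwords.
interface HotwordSettings {
  language?: string;
  hotwords?: string[];
  hotwordContextScore?: number;
}

// Hypothetical helper (not part of the library): return a list of
// human-readable problems; an empty list means the settings look valid.
function findHotwordProblems(settings: HotwordSettings): string[] {
  const problems: string[] = [];
  const language = settings.language ?? 'en-US'; // documented streaming default
  const hotwords = settings.hotwords ?? [];

  // Hotwords are documented as supported for "en" or "en-*" only.
  if (hotwords.length > 0 && !/^en(-|$)/i.test(language)) {
    problems.push(`hotwords are only supported for English, got "${language}"`);
  }
  if (hotwords.length > 15) {
    problems.push(`too many hotwords: ${hotwords.length} (max 15)`);
  }
  // Assumption: the 150-character limit is the sum of all entry lengths.
  const totalChars = hotwords.reduce((sum, word) => sum + word.length, 0);
  if (totalChars > 150) {
    problems.push(`hotwords too long: ${totalChars} characters (max 150)`);
  }
  const score = settings.hotwordContextScore;
  if (score !== undefined && !(score >= 0.2 && score <= 5.0)) {
    problems.push(`hotwordContextScore ${score} is outside 0.2 to 5.0`);
  }
  return problems;
}
```

Calling this with the quick start values returns an empty list. Scores outside the recommended 0.7 to 2.0 band are still accepted here, since the README describes that band as a recommendation rather than a hard limit.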

Conventions

  • All message payloads emitted to your handlers use camelCase, consistent with Node.js conventions.
    • segmentId, text, startMs, endMs, isFinal, utteranceOrder, words[] with word, startMs, endMs.
    • utteranceOrder is preserved whenever the server sends it and should be used as the stable ordering key for finalized multi-segment delivery.
  • Stream start options are provided in camelCase via startStream(options) and are converted internally to the server’s snake_case.
  • Configure defaults at session creation using SessionConfig camelCase fields.
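When handling events directly instead of through TranscriptionManager, the ordering rule above can be sketched as follows. The interface lists only the fields needed here, and their types (strings, numbers, booleans) are assumptions inferred from the field names. The helper joinFinalSegments is hypothetical and falls back to startMs when any finalized segment lacks utteranceOrder.

```typescript
// Assumed shape: a subset of the camelCase fields listed above.
interface FinalizableSegment {
  text: string;
  startMs: number;
  isFinal: boolean;
  utteranceOrder?: number;
}

// Hypothetical helper (not part of the library): join finalized segments in
// a stable order, preferring utteranceOrder when every segment carries it.
function joinFinalSegments(segments: FinalizableSegment[]): string {
  const finals = segments.filter((segment) => segment.isFinal);
  const allOrdered = finals.every((segment) => segment.utteranceOrder !== undefined);

  // Use one sort key for the whole list so the comparison stays consistent.
  const sortKey = (segment: FinalizableSegment): number =>
    allOrdered ? (segment.utteranceOrder as number) : segment.startMs;

  return finals
    .sort((a, b) => sortKey(a) - sortKey(b))
    .map((segment) => segment.text.trim())
    .filter((text) => text.length > 0)
    .join(' ');
}
```

Interim (non-final) segments are dropped on purpose, since they may still be replaced by later updates.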

Stream Start Options

Use startStream(options) to enable word timestamps. VAD configuration is set when creating the session.

await session.startStream({
  wordTimestamp: true,
});

These options are validated locally; invalid values throw a ClientErrorCode.INVALID_PARAMETER error before any network call.

API Reference

SpeechEngineClient

The main client class for interacting with the SpeechEngine API.

Constructor

const client = new SpeechEngineClient(config: SpeechEngineClientConfig);

Configuration Options:

  • apiKey: string - Your SpeechEngine API key
  • baseUrl: string - Base URL for the API (include scheme, e.g., https://api.livefantasia.com)
  • connectionTimeoutMs?: number - Default timeout in milliseconds for HTTP request establishment and WebSocket connection setup (default: 10000). This covers token requests, async HTTP APIs, and the initial HTTP transcription request unless overridden per HTTP session.
  • maxConcurrentSessions?: number - Max concurrent sessions (default: unlimited)
  • logger?: LoggerConfig - Customize logging behavior (level, enableConsole, enableStructured, customHandler)
  • authType?: 'header' | 'query' - WebSocket auth mode (query is browser-compatible but can expose token values in URL logs/history)

Important Security Note: When using authType: 'query', the authentication token is passed via URL query parameter. This can expose the token in:

  • Server access logs
  • Browser history
  • Proxy logs
  • Network monitoring tools

For production environments, prefer authType: 'header' (the default in Node.js). The query mode is provided for browser compatibility where header-based auth is not supported by the native WebSocket API.

Methods

createStreamingSession(config: Partial<StreamingSessionConfig>): Promise<StreamingSession>

Creates a new real-time WebSocket streaming session.

Session Configuration (required fields are marked):

  • language?: string - Language code (e.g., 'en-US', 'es-ES'). Defaults to 'en-US' if not specified.
  • model: string - Model ID for recognition (required)
  • vadMinSilenceDuration?: number - Minimum silence duration in milliseconds (300 to 1500)
  • productCode?: string - Product code for billing (default: 'STT_STREAMING')
  • hotwords?: string[] - Array of words or phrases to boost recognition accuracy (English languages only: en or en-*, max 15 entries, 150 chars total)
  • hotwordContextScore?: number - Boost sensitivity value from 0.2 to 5.0 (default: 1.0). Higher values increase hotword priority but may cause hallucinations. Recommended range: 0.7–2.0

createHttpSession(config: HttpTranscriptionConfig): Promise<HttpSession>

Creates a new HTTP session for file transcription (SSE). config.model is required.

HTTP timeout behavior:

  • timeout?: number controls only the initial HTTP request/upload handshake for the transcription request.
  • When timeout is omitted, it defaults to 120000 ms (2 minutes) internally.
  • connectionTimeoutMs on the client does not affect HTTP transcription timeout — it only covers WebSocket handshakes and token requests.
  • streamIdleTimeoutMs?: number optionally fails the transcription if no SSE data arrives for the configured interval.
  • Best practice: use an idle timeout instead of a total stream timeout. Start with 120000 ms if your server does not send heartbeats, or 2-3x your heartbeat interval if it does.

submitAsyncJob(audioUrl: string, options?: SubmitJobOptions, config?: Partial<AsyncClientConfig>): Promise<AsyncSession>

Submits an audio URL for async transcription via POST /api/v1/sessions/async and returns a session handle.

Async Job Options:

  • language?: string - Language code for transcription
  • model?: string - Model ID for recognition
  • callbackUrl?: string - Webhook URL for completion notification
  • callbackHeaders?: Record<string, string> - Custom headers for callback request
  • hotwords?: string[] - Array of words or phrases to boost recognition (English languages only: en or en-*)
  • hotwordContextScore?: number - Boost sensitivity value from 0.2 to 5.0

requestSessionToken(config?: Partial<StreamingSessionConfig>): Promise<SessionInitResponse>

Requests a session token from server without creating a session.

createStreamingSessionFromToken(sessionInitData: SessionInitResponse, config: Partial<StreamingSessionConfig>): StreamingSession

Creates a streaming session from an existing token.

createHttpSessionFromToken(sessionInitData: SessionInitResponse, config: Partial<HttpTranscriptionConfig>): HttpSession

Creates an HTTP session from an existing token.

getAsyncSession(sessionId: string): AsyncSession

Retrieves an existing async session by ID for status tracking. This is useful for checking the status of previously submitted jobs. If the session already exists in the active sessions registry, it will be returned; otherwise, a new session instance will be created.

closeSession(sessionId: string): Promise<void>

Closes a single session.

closeAllSessions(): Promise<void>

Closes all active sessions created by this client.

getActiveSessionCount(): number

Returns the number of active sessions currently tracked by the client.

getSession(sessionId: string): BaseSession | undefined

Retrieves an active session instance by ID.

HttpSession

Represents an active HTTP transcription session.

Methods

transcribe(audio: string | Uint8Array, mediaType?: string): AsyncGenerator<HttpTranscriptionEvent, void, unknown>

Transcribe an audio file or data using the HTTP SSE endpoint.

  • If audio is a string, it's treated as a file path (Node.js only).
  • If audio is binary data, mediaType is required (Buffer is supported as Uint8Array).
  • The session's timeout applies to establishing the request only.
  • The optional streamIdleTimeoutMs applies while waiting for the next SSE chunk/event.

getSessionId(): string

Returns the session ID.

getConfig(): HttpSessionConfig

Returns the session configuration used for this HTTP session.

AsyncSession

Represents an asynchronous transcription job.

Methods

waitForCompletion(options?: PollingOptions): Promise<AsyncTranscriptionResult>

Polls for job completion and returns the final result.

  • This is a convenience helper and may block for a long time (up to 24 hours).
  • For production use, prefer callback-based completion via callback_url.
  • pollingIntervalMs has a minimum value of 60000 (60 seconds).
  • waitForCompletion() is single-flight per session; concurrent callers receive the same promise.
  • Call cancelPolling() or disconnect() when polling should stop.

getStatus(): Promise<AsyncJobStatusResponse>

Retrieves the current status of the async job.

cancelPolling(): void

Stops the active polling loop created by waitForCompletion(). Call this method when:

  • Your application is shutting down and you no longer need to wait for results
  • You want to stop polling for a job you've already set up a webhook callback for
  • You need to clean up resources before the polling would naturally complete

Note: This only stops the client-side polling loop. It does not cancel the server-side transcription job.
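Since waitForCompletion() may wait for a very long time, an application may want to give up after its own deadline and stop the polling loop. The sketch below shows one way to do that. The PollableJob interface is a structural stand-in for the two AsyncSession methods it uses, and waitWithDeadline is a hypothetical helper, not part of the library. How the waitForCompletion() promise settles after cancelPolling() is not documented here, so the helper does not rely on it.

```typescript
// Structural stand-in for the two AsyncSession methods used below.
interface PollableJob<T> {
  waitForCompletion(options?: { pollingIntervalMs?: number }): Promise<T>;
  cancelPolling(): void;
}

// Hypothetical helper: wait for the job, but stop client-side polling and
// reject once deadlineMs has elapsed. The server-side job keeps running.
async function waitWithDeadline<T>(
  job: PollableJob<T>,
  deadlineMs: number,
  pollingIntervalMs = 60000,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      job.cancelPolling();
      reject(new Error(`Gave up waiting after ${deadlineMs} ms`));
    }, deadlineMs);
  });

  try {
    // Respect the documented minimum polling interval of 60000 ms.
    const interval = Math.max(pollingIntervalMs, 60000);
    return await Promise.race([job.waitForCompletion({ pollingIntervalMs: interval }), deadline]);
  } finally {
    clearTimeout(timer);
  }
}
```

A call such as waitWithDeadline(asyncSession, 30 * 60 * 1000) would then wait at most 30 minutes. A later getStatus() call can still be used to check on the job.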

disconnect(): Promise<void>

Disconnects the session and cancels any active polling. Equivalent to calling cancelPolling() followed by session cleanup.

getSessionId(): string

Returns the session ID.

StreamingSession

Represents an active streaming session.

Events

  • sessionReady - Session is ready to receive audio
  • utteranceUpdate - New transcription data received
  • sessionEnd - Session ended
  • error - Error occurred

Methods

connect(): Promise<void>

Connects to the session (establishes WebSocket connection).

startStream(options?: { wordTimestamp?: boolean }): Promise<void>

Starts the stream; emits sessionReady when streaming can begin.

sendAudio(audioData: Uint8Array): Promise<boolean>

Sends audio data while in streaming state and returns true when audio is sent. Returns false when called outside the streaming state.
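The supported input is listed under Features as 16 kHz, 16-bit, mono WAV, so it can help to check a file before streaming it. The following sketch reads the fmt chunk of a RIFF/WAVE file and compares it with that format. The helper names readWavFormat and isSupportedWav are hypothetical, and treating "WAV" as uncompressed PCM (format tag 1) is an assumption. The sketch only inspects the header and does not change what is passed to sendAudio.

```typescript
interface WavFormat {
  audioFormat: number; // 1 means uncompressed PCM
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Hypothetical helper: walk the RIFF chunks and decode the "fmt " chunk.
function readWavFormat(wav: Uint8Array): WavFormat {
  const view = new DataView(wav.buffer, wav.byteOffset, wav.byteLength);
  const tag = (offset: number): string =>
    String.fromCharCode(
      view.getUint8(offset), view.getUint8(offset + 1),
      view.getUint8(offset + 2), view.getUint8(offset + 3),
    );

  if (wav.byteLength < 12 || tag(0) !== 'RIFF' || tag(8) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file');
  }

  let offset = 12; // first chunk starts after the 12-byte RIFF header
  while (offset + 8 <= wav.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true); // little-endian
    if (id === 'fmt ') {
      if (size < 16 || offset + 8 + 16 > wav.byteLength) {
        throw new Error('Truncated fmt chunk');
      }
      return {
        audioFormat: view.getUint16(offset + 8, true),
        channels: view.getUint16(offset + 10, true),
        sampleRate: view.getUint32(offset + 12, true),
        bitsPerSample: view.getUint16(offset + 22, true),
      };
    }
    offset += 8 + size + (size % 2); // chunks are padded to an even length
  }
  throw new Error('No fmt chunk found');
}

// Compare against the format listed under Features (PCM is an assumption).
function isSupportedWav(format: WavFormat): boolean {
  return format.audioFormat === 1 && format.channels === 1 &&
    format.sampleRate === 16000 && format.bitsPerSample === 16;
}
```

In Node.js, the Buffer returned by fs.readFileSync is a Uint8Array, so isSupportedWav(readWavFormat(audioData)) can be called on the same value the quick start streams.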

endStream(graceWaitingTime?: number): Promise<void>

Ends the stream and finalizes transcription, waiting up to graceWaitingTime seconds.

disconnect(): Promise<void>

Disconnects the session and closes the WebSocket.

TranscriptionManager

Utility class for managing transcription results.

Methods

handleUtteranceUpdate(message: UtteranceUpdateMessage): void

Processes an utterance update message with interim replacement and deduplication.

getConcatenatedTranscription(): string

Gets the assembled transcription (interim + final as appropriate).

getFinalTranscription(): string

Gets the final transcription result.

getSegments(): TranscriptionSegment[]

Gets all transcription segments.

Examples

The library comes with comprehensive examples in the examples/ directory:

Basic Examples

  • simple-streaming.ts - Recommended streaming workflow using TranscriptionManager
  • minimal-streaming.ts - Direct event handling without utilities

Advanced Examples

  • multiple-sessions.ts - Managing multiple concurrent sessions
  • simple-streaming-file-logging.ts - Streaming with custom structured logging

Streaming Examples

  • real-time-microphone-node-mic.ts - Real-time microphone streaming with mic module (Apple Silicon compatible)

Running Examples

# Basic streaming example
npx ts-node examples/basic/simple-streaming.ts

# Real-time microphone (Apple Silicon compatible)
npm install mic
npx ts-node examples/streaming/real-time-microphone-node-mic.ts

# Multiple sessions
npx ts-node examples/advanced/multiple-sessions.ts

Apple Silicon Compatibility

The real-time microphone example uses the mic module which provides excellent compatibility with Apple Silicon Macs (M1/M2/M3). This avoids the known issues with naudiodon/PortAudio that can cause segmentation faults on ARM-based Macs.

npm install mic
npx ts-node examples/streaming/real-time-microphone-node-mic.ts

Why mic over naudiodon?

Cross-Platform Stability:

  • mic works reliably across macOS (Intel & Apple Silicon), Linux, and Windows
  • naudiodon has known compatibility issues with ARM architecture and CI/CD environments

Build Reliability:

  • mic has fewer native dependencies and simpler build requirements
  • naudiodon requires PortAudio and can fail in containerized environments (Docker, CI/CD)

Development Experience:

  • mic provides a simpler API for basic microphone access
  • Less prone to segmentation faults and memory issues on Apple Silicon

If you encounter build issues with audio dependencies in your CI/CD pipeline, consider excluding them from your production dependencies or using the file-based streaming examples instead.

Supported Languages

  • English (en)
  • Spanish (es)
  • French (fr)
  • German (de)
  • Italian (it)
  • Portuguese (pt)
  • And more...

Error Handling

The library provides comprehensive error handling with specific error types:

import { SpeechEngineError } from '@livefantasia/speechengine-client';

try {
  await session.connect();
} catch (error) {
  if (error instanceof SpeechEngineError) {
    console.error('SpeechEngine Error:', error.code, error.message);
    console.error('Category:', error.category);
    console.error('Retryable:', error.retryable);
  }
}

Error enums are exported as SessionErrorCode and ClientErrorCode from the package root.

Common SessionErrorCode values include:

  • INVALID_DATAPACKET_SIZE: Audio packet size is invalid, including malformed PCM frames or repeated undersized streaming packets.
  • INVALID_REQUEST_FORMAT: Request payload or event sequence is invalid.
  • INVALID_LANGUAGE: Requested language is unsupported.
  • OUT_OF_CREDIT: Session exhausted its available quota.
  • RATE_LIMIT_EXCEEDED: Service-side request or usage throttling was triggered.
  • CAPACITY_EXCEEDED: The platform is temporarily at session capacity.
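The retryable flag shown in the error handling example suggests a simple retry policy for transient conditions such as temporary capacity limits. The sketch below retries an operation with exponential backoff only when the thrown error carries retryable === true. The helper withRetry, its default attempt count, and its delays are hypothetical. That retryable is a boolean is an assumption, and whether a given call is safe to repeat on the same session is not stated in this README.

```typescript
// Minimal structural view of an error that may carry a retryable flag.
interface MaybeRetryable {
  retryable?: boolean;
}

// Hypothetical helper: run `operation`, retrying with exponential backoff
// while the error is marked retryable and attempts remain.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const retryable =
        typeof error === 'object' && error !== null &&
        (error as MaybeRetryable).retryable === true;
      if (!retryable || attempt >= maxAttempts) throw error;

      // Delays grow as baseDelayMs, 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

One plausible use is wrapping session creation, for example withRetry(() => client.createStreamingSession({ model: 'general_stt_en_latest' })). Errors that are not marked retryable are rethrown immediately.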

Logging

The client’s logs can be routed into your application’s logger. By default, logs print to the console at info level.

Winston integration example

import { SpeechEngineClient } from '@livefantasia/speechengine-client';
import winston from 'winston';

const appLogger = winston.createLogger({
  level: 'info',
  transports: [new winston.transports.Console()],
});

const client = new SpeechEngineClient({
  baseUrl: 'https://api.livefantasia.com',
  apiKey: process.env.SPEECHENGINE_API_KEY!,
  logger: {
    level: 'info',
    enableConsole: false,
    customHandler: (entry) => {
      const level = entry.level.toLowerCase();
      const prefix = `${entry.component}${entry.sessionId ? ':' + entry.sessionId : ''}`;
      const message = `${prefix} - ${entry.message}`;
      const meta = entry.data ? { data: entry.data, ts: entry.timestamp.toISOString() } : { ts: entry.timestamp.toISOString() };
      appLogger.log({ level, message, ...meta });
    },
  },
});

Notes:

  • Set enableConsole: false to prevent duplicate console output.
  • customHandler receives structured entries; you control formatting and routing.
  • Sensitive auth data (JWTs and Bearer tokens) is redacted before logging.
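For loggers that expect one JSON object per line, the same entry fields used in the Winston example can be flattened by a small formatter. This is a sketch under assumptions: the LogEntryLike interface mirrors only the fields referenced above (level, component, sessionId, message, data, timestamp), their exact types are inferred from that example, and toJsonLine is a hypothetical helper.

```typescript
// Assumed shape, inferred from the fields used in the Winston example.
interface LogEntryLike {
  level: string;
  component: string;
  sessionId?: string;
  message: string;
  data?: unknown;
  timestamp: Date;
}

// Hypothetical helper: serialize one log entry as a single JSON line,
// omitting optional fields that are not present.
function toJsonLine(entry: LogEntryLike): string {
  return JSON.stringify({
    ts: entry.timestamp.toISOString(),
    level: entry.level.toLowerCase(),
    component: entry.component,
    ...(entry.sessionId ? { sessionId: entry.sessionId } : {}),
    message: entry.message,
    ...(entry.data !== undefined ? { data: entry.data } : {}),
  });
}
```

A customHandler could then be as short as (entry) => process.stdout.write(toJsonLine(entry) + '\n').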

Development

Building

npm run build

Testing

npm test
npm run test:coverage

Linting

npm run lint
npm run lint:fix

Type Checking

npm run type-check

Requirements

  • Node.js >= 20.0.0
  • TypeScript >= 5.1.0 (for development)

License

MIT License - see the LICENSE file for details.

Contributing

We welcome contributions! Please see our contributing guidelines for more information.


Made with ❤️ by LiveFantasia