@volley/recognition-client-sdk

TypeScript SDK for real-time speech recognition via WebSocket.

Installation

npm install @volley/recognition-client-sdk

Quick Start

import {
  createClientWithBuilder,
  RecognitionProvider,
  DeepgramModel,
  STAGES
} from '@volley/recognition-client-sdk';

// Create client with builder pattern (recommended)
const client = createClientWithBuilder(builder =>
  builder
    .stage(STAGES.STAGING)  // ✨ Simple environment selection using enum
    .provider(RecognitionProvider.DEEPGRAM)
    .model(DeepgramModel.NOVA_2)
    .onTranscript(result => {
      console.log('Final:', result.finalTranscript);
      console.log('Interim:', result.pendingTranscript);
    })
    .onError(error => console.error(error))
);

// Stream audio
await client.connect();
client.sendAudio(pcm16AudioChunk);  // Call repeatedly with audio chunks
await client.stopRecording();       // Wait for final transcript

// Check the actual URL being used
console.log('Connected to:', client.getUrl());
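
If you already have a complete PCM16 buffer, one way to stream it is in roughly 100ms chunks (a sketch reusing the client above; at 16kHz mono that is 1,600 samples per chunk, and it assumes sendAudio accepts an Int16Array view):

const SAMPLES_PER_CHUNK = 1600;  // ~100ms at 16kHz mono

async function streamBuffer(pcm16: Int16Array) {
  await client.connect();
  for (let i = 0; i < pcm16.length; i += SAMPLES_PER_CHUNK) {
    client.sendAudio(pcm16.subarray(i, i + SAMPLES_PER_CHUNK));  // view, no copy
  }
  await client.stopRecording();  // resolves after the final transcript
}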

Alternative: Direct Client Creation

import {
  RealTimeTwoWayWebSocketRecognitionClient,
  RecognitionProvider,
  DeepgramModel,
  Language,
  STAGES
} from '@volley/recognition-client-sdk';

const client = new RealTimeTwoWayWebSocketRecognitionClient({
  stage: STAGES.STAGING,  // ✨ Recommended: Use STAGES enum for type safety
  asrRequestConfig: {
    provider: RecognitionProvider.DEEPGRAM,
    model: DeepgramModel.NOVA_2,
    language: Language.ENGLISH_US
  },
  onTranscript: (result) => console.log(result),
  onError: (error) => console.error(error)
});

// Check the actual URL being used
console.log('Connected to:', client.getUrl());

Configuration

Environment Selection

Recommended: Use the stage parameter with the STAGES enum for automatic environment configuration:

import {
  RecognitionProvider,
  DeepgramModel,
  Language,
  STAGES
} from '@volley/recognition-client-sdk';

builder
  .stage(STAGES.STAGING)  // STAGES.LOCAL | STAGES.DEV | STAGES.STAGING | STAGES.PRODUCTION
  .provider(RecognitionProvider.DEEPGRAM)  // DEEPGRAM, GOOGLE
  .model(DeepgramModel.NOVA_2)              // Provider-specific model enum
  .language(Language.ENGLISH_US)            // Language enum
  .interimResults(true)                     // Enable partial transcripts

Available Stages and URLs:

| Stage | Enum | WebSocket URL |
|-------|------|---------------|
| Local | STAGES.LOCAL | ws://localhost:3101/ws/v1/recognize |
| Development | STAGES.DEV | wss://recognition-service-dev.volley-services.net/ws/v1/recognize |
| Staging | STAGES.STAGING | wss://recognition-service-staging.volley-services.net/ws/v1/recognize |
| Production | STAGES.PRODUCTION | wss://recognition-service.volley-services.net/ws/v1/recognize |

💡 Using the stage parameter automatically constructs the correct URL for each environment.
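
For example, a deployment could select the stage from an environment variable (a hypothetical helper; the STAGE variable and its values are assumptions, not part of the SDK):

import { STAGES } from '@volley/recognition-client-sdk';

// Hypothetical helper: pick a stage from a STAGE environment variable,
// defaulting to STAGING when unset or unrecognized.
function stageFromEnv(value = process.env.STAGE) {
  switch (value) {
    case 'local':      return STAGES.LOCAL;
    case 'dev':        return STAGES.DEV;
    case 'production': return STAGES.PRODUCTION;
    default:           return STAGES.STAGING;
  }
}

builder.stage(stageFromEnv());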

Automatic Connection Retry:

The SDK automatically retries failed connections with sensible defaults; no configuration is needed.

Default behavior (works out of the box):

  • 4 connection attempts (try once, retry 3 times if failed)
  • 200ms delay between retries
  • Handles temporary service unavailability (503)
  • Fast failure (~600ms total on complete failure)
  • Timing: Attempt 1 → FAIL → wait 200ms → Attempt 2 → FAIL → wait 200ms → Attempt 3 → FAIL → wait 200ms → Attempt 4

import {
  RealTimeTwoWayWebSocketRecognitionClient,
  STAGES
} from '@volley/recognition-client-sdk';

// ✅ Automatic retry - no config needed!
const client = new RealTimeTwoWayWebSocketRecognitionClient({
  stage: STAGES.STAGING,
  // connectionRetry works automatically with defaults
});

Optional: Customize retry behavior (only if needed):

const client = new RealTimeTwoWayWebSocketRecognitionClient({
  stage: STAGES.STAGING,
  connectionRetry: {
    maxAttempts: 2,  // Fewer attempts (min: 1, max: 5)
    delayMs: 500     // Longer delay between attempts
  }
});

⚠️ Note: Retry only applies to initial connection establishment. If the connection drops during audio streaming, the SDK will not auto-retry (caller must handle this).
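
One minimal caller-side pattern is to reconnect from the onDisconnected handler (a sketch; it assumes calling connect() again on the same client instance is supported, and uses the isNormalDisconnection helper described under Error Handling):

import { isNormalDisconnection } from '@volley/recognition-client-sdk';

// Sketch: caller-managed reconnect after an unexpected mid-stream drop.
builder.onDisconnected(async (code) => {
  if (isNormalDisconnection(code)) return;  // clean close, nothing to do
  console.warn('Unexpected disconnect, attempting one reconnect:', code);
  try {
    await client.connect();  // connection-retry defaults apply here too
  } catch (err) {
    console.error('Reconnect failed:', err);
  }
});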

Advanced: Custom URL for non-standard endpoints:

builder
  .url('wss://custom-endpoint.example.com/ws/v1/recognize')  // Custom WebSocket URL
  .provider(RecognitionProvider.DEEPGRAM)
  // ... rest of config

💡 Note: If both stage and url are provided, url takes precedence.

Event Handlers

builder
  .onTranscript(result => {})     // Handle transcription results
  .onError(error => {})            // Handle errors
  .onConnected(() => {})           // Connection established
  .onDisconnected((code) => {})   // Connection closed
  .onMetadata(meta => {})          // Timing information

Optional Parameters

builder
  .gameContext({                   // Context for better recognition
    gameId: 'session-123',
    prompt: 'Expected responses: yes, no, maybe'
  })
  .userId('user-123')              // User identification
  .platform('web')                 // Platform identifier
  .logger((level, msg, data) => {})  // Custom logging
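
For example, the logger callback can simply forward to the console (a sketch; the exact level values the SDK emits are not assumed here):

// Sketch: forward SDK logs to the console; level, msg, and data
// are passed through opaquely from the SDK.
builder.logger((level, msg, data) => {
  console.log(`[recognition-sdk] [${level}]`, msg, data ?? '');
});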

API Reference

Client Methods

await client.connect();           // Establish connection
client.sendAudio(chunk);          // Send PCM16 audio
await client.stopRecording();     // End and get final transcript
client.getAudioUtteranceId();     // Get session UUID
client.getUrl();                  // Get actual WebSocket URL being used
client.getState();                // Get current state
client.isConnected();             // Check connection status
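
Putting these together, a typical session looks roughly like this (a sketch; getAudioChunks() is a hypothetical stand-in for your capture pipeline):

// Sketch of a full session lifecycle.
await client.connect();
console.log('State:', client.getState(), '| connected:', client.isConnected());
console.log('Session:', client.getAudioUtteranceId());

for (const chunk of getAudioChunks()) {
  client.sendAudio(chunk);       // PCM16 chunks, sent as they arrive
}

await client.stopRecording();    // resolves once the final transcript arrives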

TranscriptionResult

{
  type: 'Transcription';                   // Message type discriminator
  audioUtteranceId: string;                // Session UUID
  finalTranscript: string;                 // Confirmed text (won't change)
  finalTranscriptConfidence?: number;      // Confidence 0-1 for final transcript
  pendingTranscript?: string;              // In-progress text (may change)
  pendingTranscriptConfidence?: number;    // Confidence 0-1 for pending transcript
  is_finished: boolean;                    // Transcription complete (last message)
  voiceStart?: number;                     // Voice activity start time (ms from stream start)
  voiceDuration?: number;                  // Voice duration (ms)
  voiceEnd?: number;                       // Voice activity end time (ms from stream start)
  startTimestamp?: number;                 // Transcription start timestamp (ms)
  endTimestamp?: number;                   // Transcription end timestamp (ms)
  receivedAtMs?: number;                   // Server receive timestamp (ms since epoch)
  accumulatedAudioTimeMs?: number;         // Total audio duration sent (ms)
}
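
In an onTranscript handler you would typically render pendingTranscript as it changes and treat finalTranscript as settled text, e.g. (a sketch; it assumes finalTranscript carries the full confirmed text so far, and render() is hypothetical UI code):

// Sketch: show confirmed and in-progress text together.
builder.onTranscript(result => {
  const confirmed = result.finalTranscript;          // will not change
  const pending = result.pendingTranscript ?? '';    // may still change
  render(`${confirmed} ${pending}`.trim());

  if (result.is_finished) {
    console.log('Done. Confidence:', result.finalTranscriptConfidence);
  }
});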

Providers

Deepgram

import { RecognitionProvider, DeepgramModel } from '@volley/recognition-client-sdk';

builder
  .provider(RecognitionProvider.DEEPGRAM)
  .model(DeepgramModel.NOVA_2);        // NOVA_2, NOVA_3, FLUX_GENERAL_EN

Google Cloud Speech-to-Text

import { RecognitionProvider, GoogleModel } from '@volley/recognition-client-sdk';

builder
  .provider(RecognitionProvider.GOOGLE)
  .model(GoogleModel.LATEST_SHORT);    // LATEST_SHORT, LATEST_LONG, TELEPHONY, etc.

Available Google models:

  • LATEST_SHORT - Optimized for short audio (< 1 minute)
  • LATEST_LONG - Optimized for long audio (> 1 minute)
  • TELEPHONY - Optimized for phone audio
  • TELEPHONY_SHORT - Short telephony audio
  • MEDICAL_DICTATION - Medical dictation (premium)
  • MEDICAL_CONVERSATION - Medical conversations (premium)

Audio Format

The SDK expects PCM16 audio:

  • Format: Linear PCM (16-bit signed integers)
  • Sample Rate: 16kHz recommended
  • Channels: Mono

Please reach out to the AI team if there are essential reasons to support other formats.
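
If your capture pipeline produces Web Audio Float32Array frames, a common conversion to PCM16 looks like this (a sketch; resampling to 16kHz is assumed to happen upstream):

// Sketch: convert Float32 samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i]));  // clamp to [-1, 1]
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;       // scale to int16 range
  }
  return out;
}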

Error Handling

builder.onError(error => {
  console.error(`Error ${error.code}: ${error.message}`);
});

// Check disconnection type
import { isNormalDisconnection } from '@volley/recognition-client-sdk';

builder.onDisconnected((code, reason) => {
  if (!isNormalDisconnection(code)) {
    console.error('Unexpected disconnect:', code);
  }
});

Troubleshooting

Connection Issues

WebSocket fails to connect

  • Verify the recognition service is running
  • Check the WebSocket URL format: ws:// or wss://
  • Ensure network allows WebSocket connections

Authentication errors

  • Verify audioUtteranceId is provided
  • Check if service requires additional auth headers

Audio Issues

No transcription results

  • Confirm audio format is PCM16, 16kHz, mono
  • Check if audio chunks are being sent (use onAudioSent callback)
  • Verify audio data is not empty or corrupted

Poor transcription quality

  • Try a different model (e.g., NOVA_2 vs. NOVA_3)
  • Adjust language setting to match audio
  • Ensure audio sample rate matches configuration

Performance Issues

High latency

  • Use smaller audio chunks (e.g., 100ms instead of 500ms)
  • Choose a model optimized for real-time (e.g., Deepgram Nova 2)
  • Check network latency to service

Memory issues

  • Call disconnect() when done to clean up resources
  • Avoid keeping multiple client instances active

Publishing

This package uses automated publishing via semantic-release with npm Trusted Publishers (OIDC).

First-Time Setup (One-time)

After the first manual publish, configure npm Trusted Publishers:

  1. Go to https://www.npmjs.com/package/@volley/recognition-client-sdk/access
  2. Click "Add publisher" → Select "GitHub Actions"
  3. Configure:
    • Organization: Volley-Inc
    • Repository: recognition-service
    • Workflow: sdk-release.yml
    • Environment: Leave empty (not required)

How It Works

  • Automated releases: Push to dev branch triggers semantic-release
  • Version bumping: Based on conventional commits (feat/fix/BREAKING CHANGE)
  • No tokens needed: Uses OIDC authentication with npm
  • Provenance: Automatic supply chain attestation
  • Path filtering: Only releases when SDK or libs change

Manual Publishing (Not Recommended)

If needed for testing:

cd packages/client-sdk-ts
npm login --scope=@volley
pnpm build
npm publish --provenance --access public

Contributing

This SDK is part of the Recognition Service monorepo. To contribute:

  1. Make changes to SDK or libs
  2. Test locally with pnpm test
  3. Create PR to dev branch with conventional commit messages (feat:, fix:, etc.)
  4. After merge, automated workflow will publish new version to npm

License

Proprietary