@fciannella/nvidia-asr-client

v0.1.9

NVIDIA ASR Client

Minimal cross-platform wrapper around NVIDIA/Riva streaming ASR WebSocket API with optional client-side silence detection.

Features

  • Works in Node.js and browsers; only Node.js needs the optional dependencies listed under Usage (Node.js)
  • Built-in audio resampling
  • Support for different input formats: f32, pcm_s16, and g711_ulaw (G.711 μ-law)
  • Client-side silence detection to determine when utterances are complete
  • Minimal footprint with no external dependencies in browser
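
The package does not document how its built-in resampling works. Purely as an illustration of the idea, a linear-interpolation resampler could look like the sketch below. The name `resampleLinear` is hypothetical and not part of the package API, and production resamplers usually apply a low-pass filter before downsampling.

```javascript
// Hypothetical helper (not part of @fciannella/nvidia-asr-client):
// resamples mono Float32 audio by linear interpolation.
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    // Blend the two neighbouring input samples
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}

// Example: a 48 kHz ramp reduced to 16 kHz (ratio 3:1)
const ramp48k = Float32Array.from([0, 1, 2, 3, 4, 5]);
const ramp16k = resampleLinear(ramp48k, 48000, 16000);
```

In normal use none of this is needed: the client accepts the source sample rate as the second argument to write().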

Installation

npm install @fciannella/nvidia-asr-client

Usage (Browser)

Modern ES Modules Approach

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <!-- Prevent browser caching during development -->
  <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate">
  <meta http-equiv="Pragma" content="no-cache">
  <meta http-equiv="Expires" content="0">
</head>
<body>
  <div id="transcript"></div>
  <button id="startBtn">Start</button>
  <button id="stopBtn" disabled>Stop</button>

  <script type="module">
    // Dynamic import with cache-busting during development
    const moduleUrl = './node_modules/@fciannella/nvidia-asr-client/dist/index.js?' + Date.now();
    const { NvidiaAsrClient } = await import(moduleUrl);
    
    let asr = null;
    let stopFn = null;
    
    async function startASR() {
      // Setup ASR client
      asr = new NvidiaAsrClient({
        websocketUrl: 'wss://your-riva-endpoint/v1/speech_recognition/streaming_multi',
        languageCode: 'en-US', // Change to 'it-IT', 'es-ES', etc. if supported by server
        silenceTimeout: 1.5,
        closeOnSilence: false,
      });
      
      asr.on('partial', (e) => {
        document.getElementById('transcript').textContent = e.text;
      });
      
      asr.on('final', (e) => {
        document.getElementById('transcript').textContent = e.text;
      });
      
      // Connect and setup WebAudio
      await asr.connect();
      const audioContext = new (window.AudioContext || window.webkitAudioContext)();
      const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
      const source = audioContext.createMediaStreamSource(stream);
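      // ScriptProcessorNode is deprecated in favor of AudioWorklet, but it is
      // still widely supported and keeps this example short
      const processor = audioContext.createScriptProcessor(4096, 1, 1);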
      
      processor.onaudioprocess = (e) => {
        const float32Data = new Float32Array(e.inputBuffer.getChannelData(0));
        asr.write(float32Data, audioContext.sampleRate);
      };
      
      source.connect(processor);
      processor.connect(audioContext.destination);
      
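      // Return cleanup function
      return () => {
        processor.onaudioprocess = null;   // stop feeding audio before teardown
        processor.disconnect();
        source.disconnect();
        stream.getTracks().forEach(track => track.stop());
        audioContext.close();              // release the audio hardware
        asr.finish();                      // signal end-of-audio
        setTimeout(() => asr.end(), 1500); // allow trailing results, then close the socket
      };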
    }
    
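    document.getElementById('startBtn').addEventListener('click', async () => {
      document.getElementById('startBtn').disabled = true;
      document.getElementById('stopBtn').disabled = false;
      try {
        stopFn = await startASR();
      } catch (err) {
        // Connection or microphone permission failed: restore the buttons
        console.error('Failed to start ASR', err);
        document.getElementById('startBtn').disabled = false;
        document.getElementById('stopBtn').disabled = true;
      }
    });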
    
    document.getElementById('stopBtn').addEventListener('click', () => {
      if (stopFn) {
        stopFn();
        stopFn = null;
        document.getElementById('startBtn').disabled = false;
        document.getElementById('stopBtn').disabled = true;
      }
    });
  </script>
</body>
</html>

A complete example is available in examples/browser-example.html.

Notes on Browser Usage

  1. WebSocket Endpoint: Ensure your Riva server accepts WebSocket connections from your web application's origin. Browsers do not apply CORS to WebSockets, but a server or proxy may still reject unexpected Origin headers.
  2. Caching: During development, use cache-busting techniques as shown in the example.
  3. Language Selection: The server must support the language code you specify. Not all deployments support all languages.
  4. Audio Context: Browsers require a secure context (HTTPS or localhost) and user permission for microphone capture, and may keep an AudioContext suspended until a user gesture such as a button click.
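
For the audio context note, one option is to resume the context explicitly inside the click handler. The helper name `ensureAudioRunning` is hypothetical; `state` and `resume()` are standard members of the Web Audio `AudioContext`.

```javascript
// Hypothetical helper: call it from a click handler so the browser treats
// the resume as user-initiated.
async function ensureAudioRunning(audioContext) {
  if (audioContext.state === 'suspended') {
    await audioContext.resume();
  }
  return audioContext.state;
}
```

In the browser example this would be called right after the AudioContext is created, before the processor is wired up.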

Usage (Node.js)

For Node.js usage, you'll need to install the optional dependencies:

npm install ws mic

import { NvidiaAsrClient } from '@fciannella/nvidia-asr-client';
import mic from 'mic';

const SAMPLE_RATE = 16000;

const asr = new NvidiaAsrClient({
  websocketUrl: 'wss://your-riva-endpoint/v1/speech_recognition/streaming_multi',
  languageCode: 'en-US',
  silenceTimeout: 1.5,
  closeOnSilence: false,
});

asr.on('partial', (e) => {
  process.stdout.write(`\r[${e.serverFinal ? 'FINAL' : 'PARTIAL'}] ${e.text}        `);
});

asr.on('final', (e) => {
  console.log(`\n[USER_FINAL] ${e.text}`);
});

asr.on('silence', () => {
  console.log('\n--- silence detected ---');
});

asr.on('error', (err) => console.error('ASR error', err));

(async () => {
  await asr.connect();

  const micInstance = mic({
    rate: String(SAMPLE_RATE),
    channels: '1',
    encoding: 'signed-integer',
    bitwidth: 16,
    endian: 'little',
    fileType: 'raw',
  });

  const stream = micInstance.getAudioStream();
  stream.on('data', (buf) => {
    // convert little-endian Int16 PCM -> Float32 [-1, 1); readInt16LE avoids the
    // RangeError an Int16Array view throws when buf.byteOffset is odd
    const sampleCount = Math.floor(buf.length / 2);
    const float32 = new Float32Array(sampleCount);
    for (let i = 0; i < sampleCount; i++) float32[i] = buf.readInt16LE(i * 2) / 0x8000;
    asr.write(float32, SAMPLE_RATE);
  });

  micInstance.start();

  process.on('SIGINT', () => {
    micInstance.stop();
    asr.finish();
    // give trailing results time to arrive, then close the socket and exit
    setTimeout(() => {
      asr.end();
      process.exit(0);
    }, 1500);
  });
})();

API

Constructor

new NvidiaAsrClient(options: NvidiaAsrOptions)

Options

interface NvidiaAsrOptions {
  websocketUrl?: string;            // Your Riva endpoint URL (typed as optional, but required in practice)
  languageCode?: string;            // Default: 'en-US'
  silenceTimeout?: number;          // Seconds of inactivity before finalizing
  closeOnSilence?: boolean;         // Default: true
  inputFormat?: 'f32' | 'pcm_s16' | 'g711_ulaw'; // Default: 'f32'
  inputSampleRate?: number;         // Default: 16000
  targetSampleRate?: number;        // Default: 16000
}
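
The `inputFormat` values name common sample encodings. The package's own conversion code is not shown in this README, so the sketch below only illustrates what turning `pcm_s16` and `g711_ulaw` data into normalized floats typically involves. All three function names are hypothetical; the μ-law expansion follows the standard G.711 decoding.

```javascript
// Hypothetical helpers, not part of the package API.

// Decode one G.711 mu-law byte to a 16-bit linear sample (standard expansion).
function ulawByteToInt16(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const exponent = (u >> 4) & 0x07;  // 3-bit segment number
  const mantissa = u & 0x0f;         // 4-bit step within the segment
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) ? -magnitude : magnitude;
}

// Convert a Uint8Array of mu-law bytes to Float32 samples in [-1, 1).
function ulawToFloat32(bytes) {
  const out = new Float32Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) out[i] = ulawByteToInt16(bytes[i]) / 0x8000;
  return out;
}

// Convert Int16 samples to Float32 samples in [-1, 1).
function int16ToFloat32(samples) {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) out[i] = samples[i] / 0x8000;
  return out;
}
```

With the client itself, setting `inputFormat` should make this manual step unnecessary.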

Methods

  • connect(): Promise - Opens the WebSocket and sends the configuration packet
  • write(chunk, sampleRate?): void - Sends audio data to the ASR service
  • finish(): void - Signals end-of-audio but keeps the socket open
  • end(): void - Flushes the EOS marker and closes the WebSocket immediately
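
Both usage examples call finish() and then wait about 1.5 seconds before closing, presumably so that trailing results can still arrive. A small helper can capture that pattern. `stopGracefully` and its parameters are hypothetical; only finish() and end() come from the API above.

```javascript
// Hypothetical helper built only on the documented finish() and end() methods.
// `schedule` defaults to setTimeout and can be replaced in tests.
function stopGracefully(client, graceMs = 1500, schedule = setTimeout) {
  client.finish();                              // signal end-of-audio, socket stays open
  return schedule(() => client.end(), graceMs); // close after the grace period
}
```

Calling `stopGracefully(asr)` after stopping audio capture would mirror the cleanup sequence used in the examples.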

Events

  • partial: { text: string, serverFinal: boolean }
  • final: { text: string }
  • silence: Emitted when silence is detected
  • error: Error event
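
How the client decides that silence has occurred is not documented here. One common approach, shown purely as an assumption, is to track the RMS energy of each chunk and report silence once the level has stayed below a threshold for `silenceTimeout` seconds. `SilenceTracker`, its default threshold, and its method name are hypothetical.

```javascript
// Hypothetical sketch; not the package's actual algorithm.
class SilenceTracker {
  constructor({ silenceTimeout = 1.5, threshold = 0.01 } = {}) {
    this.silenceTimeout = silenceTimeout; // seconds of quiet before reporting silence
    this.threshold = threshold;           // RMS level treated as quiet (assumed value)
    this.quietSeconds = 0;
  }

  // Feed one chunk of Float32 samples; returns true once the accumulated
  // quiet time has reached silenceTimeout.
  push(chunk, sampleRate) {
    let sumSquares = 0;
    for (let i = 0; i < chunk.length; i++) sumSquares += chunk[i] * chunk[i];
    const rms = chunk.length ? Math.sqrt(sumSquares / chunk.length) : 0;
    if (rms < this.threshold) {
      this.quietSeconds += chunk.length / sampleRate;
    } else {
      this.quietSeconds = 0; // audible input resets the timer
    }
    return this.quietSeconds >= this.silenceTimeout;
  }
}
```

A fixed RMS threshold is simple but sensitive to microphone gain and background noise, so a real detector may adapt the threshold over time.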

Troubleshooting

Language Support

If specifying a non-English language code (e.g., 'it-IT', 'es-ES') doesn't result in transcription in that language, the issue is likely on the server side:

  1. The server may not have that language model loaded
  2. The server may be configured to ignore client language settings
  3. The specific language may not be supported by your Riva deployment

Contact your Riva server administrator to confirm which languages are available.

Browser Caching

When developing or updating the client, use cache-busting techniques:

  1. Add a timestamp query parameter to the module URL passed to import(), e.g. import(`${moduleUrl}?v=${Date.now()}`)
  2. Use cache control meta tags in your HTML
  3. Run your development server with cache disabled (e.g., http-server -c-1)
  4. Use browser developer tools to clear cache and perform hard reloads
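
For the first technique, a small helper keeps the query string well-formed even when the URL already has parameters. The name `withCacheBust` and the `v` parameter are illustrative choices, not part of the package.

```javascript
// Hypothetical helper: appends a timestamp parameter, using "&" when the
// URL already contains a query string.
function withCacheBust(url, now = Date.now()) {
  const separator = url.includes('?') ? '&' : '?';
  return `${url}${separator}v=${now}`;
}

// Usage inside a <script type="module"> (top-level await):
// const { NvidiaAsrClient } = await import(
//   withCacheBust('./node_modules/@fciannella/nvidia-asr-client/dist/index.js')
// );
```

This is meant for development only; in production a changing URL defeats caching on every page load.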

License

MIT