
@tekyzinc/stt-component

v0.3.3 · Published · 1,191 downloads

STT-Component

A framework-agnostic, browser-first speech-to-text package with real-time streaming transcription and mid-recording Whisper correction, powered by @huggingface/transformers.

Features

  • Streaming transcription -- real-time interim text as you speak
  • Mid-recording Whisper correction -- automatic correction cycles triggered by speech pauses or forced intervals
  • Configurable Whisper models -- tiny, base, small, medium (ONNX via transformers.js)
  • WebGPU + WASM -- GPU-accelerated inference in Chrome/Edge with automatic WASM fallback for Firefox/Safari
  • Event-driven API -- subscribe to transcript, correction, error, and status events
  • Framework-agnostic -- works with React, Vue, Svelte, vanilla JS, or any framework
  • Web Worker inference -- non-blocking model loading and transcription via dedicated worker thread
  • Configurable correction timing -- pause threshold, forced interval, or disable entirely
  • Audio chunking -- configurable chunk length and stride for long-form audio
  • Node.js support -- compatible with Node.js >= 18 via @huggingface/transformers

Quick Start

npm install @tekyzinc/stt-component

import { STTEngine } from '@tekyzinc/stt-component';

const engine = new STTEngine({ model: 'tiny' });

engine.on('transcript', (text) => console.log('Interim:', text));
engine.on('correction', (text) => console.log('Corrected:', text));

await engine.init();
await engine.start();
// ... user speaks ...
const finalText = await engine.stop();

API Reference

STTEngine

The main class. Extends TypedEventEmitter<STTEvents>.
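The package's TypedEventEmitter is not documented in detail here. As an illustration of the pattern (a minimal sketch, not the package's actual implementation), a typed emitter maps each event name to its listener signature so TypeScript can enforce correct callbacks:

```typescript
// Minimal sketch of a typed event emitter (illustrative only; the package's
// TypedEventEmitter may differ). Each event key maps to its payload type.
type Listener<T> = (payload: T) => void;

class MiniTypedEmitter<Events extends Record<string, unknown>> {
  private listeners: { [K in keyof Events]?: Listener<Events[K]>[] } = {};

  on<K extends keyof Events>(event: K, fn: Listener<Events[K]>): void {
    (this.listeners[event] ??= []).push(fn);
  }

  off<K extends keyof Events>(event: K, fn: Listener<Events[K]>): void {
    this.listeners[event] = (this.listeners[event] ?? []).filter((l) => l !== fn);
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const fn of this.listeners[event] ?? []) fn(payload);
  }
}

// Usage mirroring the STTEvents shape described below:
const emitter = new MiniTypedEmitter<{ transcript: string }>();
emitter.on('transcript', (text) => console.log('Interim:', text));
emitter.emit('transcript', 'hello');
```

Because event names and payload types are linked at the type level, subscribing to `transcript` with a callback that expects anything other than a `string` is a compile-time error.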

constructor(config?: STTConfig, workerUrl?: URL)

Creates a new engine instance. All config fields are optional -- sensible defaults are applied.

init(): Promise<void>

Spawns the Web Worker and loads the Whisper model. Emits status events with download progress. Throws on model load failure.

start(): Promise<void>

Requests microphone access and begins recording. Enables mid-recording correction cycles. Engine must be in ready state (call init() first).

stop(): Promise<string>

Stops recording, runs a final Whisper transcription on the full audio, emits the final correction event, and returns the transcribed text.

destroy(): void

Terminates the worker, releases the microphone and AudioContext, removes all event listeners. Call when done with the engine.

getState(): Readonly<STTState>

Returns a snapshot of the current engine state.

notifyPause(): void

Manually signals a speech pause to the correction orchestrator, which may trigger an early correction cycle.

on(event, listener): void

Subscribe to an event. Type-safe -- TypeScript enforces correct callback signatures.

off(event, listener): void

Unsubscribe a specific listener.

Events

| Event | Callback Signature | Description |
|-------|-------------------|-------------|
| transcript | (text: string) => void | Real-time streaming text via Web Speech API (display in italics) |
| correction | (text: string) => void | Whisper-corrected text replacing interim text (display in normal style) |
| error | (error: STTError) => void | Actionable error ({ code: string, message: string }) |
| status | (state: STTState) => void | Engine state changes |
| debug | (message: string) => void | Internal diagnostic logs (Speech API lifecycle, errors, results) |

Error Codes

| Code | When |
|------|------|
| MIC_DENIED | Microphone access denied or unavailable |
| MODEL_LOAD_FAILED | Whisper model download or initialization failed |
| TRANSCRIPTION_FAILED | Whisper inference failed (recording continues) |
| WORKER_ERROR | Web Worker encountered an error |
| STREAMING_ERROR | Web Speech API streaming error |

Engine States (STTStatus)

idle -> loading -> ready -> recording -> processing -> ready

| Status | Meaning |
|--------|---------|
| idle | Engine created but not initialized |
| loading | Model downloading / initializing |
| ready | Model loaded, ready to record |
| recording | Actively capturing audio |
| processing | Running final transcription after stop |
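The happy-path transition diagram above can be encoded as a small validity check. This is an illustrative sketch of the documented state flow, not the engine's internal implementation (error paths are omitted, as they are not specified in the diagram):

```typescript
// Allowed transitions taken from the diagram above (sketch only).
type STTStatus = 'idle' | 'loading' | 'ready' | 'recording' | 'processing';

const transitions: Record<STTStatus, STTStatus[]> = {
  idle: ['loading'],          // init() starts model loading
  loading: ['ready'],         // model finished downloading
  ready: ['recording'],       // start() begins capture
  recording: ['processing'],  // stop() runs final transcription
  processing: ['ready'],      // engine is reusable afterwards
};

function canTransition(from: STTStatus, to: STTStatus): boolean {
  return transitions[from].includes(to);
}
```

For example, `canTransition('ready', 'recording')` is true, while `canTransition('idle', 'recording')` is false -- which is why start() throws if init() has not completed.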

Configuration

All fields are optional. Defaults shown in the table.

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| model | 'tiny' \| 'base' \| 'small' \| 'medium' | 'tiny' | Whisper model size |
| backend | 'webgpu' \| 'wasm' \| 'auto' | 'auto' | Compute backend (auto = WebGPU with WASM fallback) |
| language | string | 'en' | Transcription language |
| dtype | string | 'q4' | Model quantization dtype |
| correction.enabled | boolean | true | Enable mid-recording Whisper correction |
| correction.provider | 'whisper' | 'whisper' | Correction engine provider |
| correction.pauseThreshold | number (ms) | 3000 | Silence duration before triggering correction |
| correction.forcedInterval | number (ms) | 5000 | Maximum interval between forced corrections |
| streaming.enabled | boolean | true | Enable real-time streaming transcript via Web Speech API |
| streaming.provider | 'web-speech-api' | 'web-speech-api' | Streaming provider (Chrome/Edge) |
| chunking.chunkLengthS | number (seconds) | 30 | Chunk length for Whisper processing |
| chunking.strideLengthS | number (seconds) | 5 | Stride length for overlapping chunks |
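Putting the table together, a fully spelled-out configuration might look like this (the option names match the table above; the specific values chosen here are illustrative, not recommendations):

```typescript
import { STTEngine } from '@tekyzinc/stt-component';

const engine = new STTEngine({
  model: 'base',            // larger than the 'tiny' default
  backend: 'auto',          // WebGPU with WASM fallback
  language: 'en',
  dtype: 'q4',
  correction: {
    enabled: true,
    provider: 'whisper',
    pauseThreshold: 2000,   // correct after 2s of silence (default 3000)
    forcedInterval: 8000,   // but at least every 8s (default 5000)
  },
  streaming: {
    enabled: true,
    provider: 'web-speech-api',
  },
  chunking: {
    chunkLengthS: 30,
    strideLengthS: 5,
  },
});
```

Any field left out falls back to its default, so in practice most configs only set `model` and perhaps the `correction` timings.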

STTState

Returned by getState() and emitted with status events.

interface STTState {
  status: STTStatus;
  isModelLoaded: boolean;
  loadProgress: number;       // 0-100
  backend: 'webgpu' | 'wasm' | null;
  error: string | null;
}

Usage Examples

Vanilla JavaScript

<script type="module">
  import { STTEngine } from '@tekyzinc/stt-component';

  const engine = new STTEngine({ model: 'tiny' });
  const output = document.getElementById('output');

  engine.on('correction', (text) => {
    output.textContent = text;
  });

  engine.on('error', (err) => {
    console.error(`[${err.code}] ${err.message}`);
  });

  await engine.init();

  document.getElementById('start').onclick = () => engine.start();
  document.getElementById('stop').onclick = async () => {
    const final = await engine.stop();
    output.textContent = final;
  };
</script>

React Pattern

No React dependency required -- this just shows the integration pattern.

import { useEffect, useRef, useState } from 'react';
import { STTEngine } from '@tekyzinc/stt-component';

function VoiceInput() {
  const engineRef = useRef<STTEngine | null>(null);
  const [text, setText] = useState('');

  useEffect(() => {
    const engine = new STTEngine({ model: 'tiny' });
    engineRef.current = engine;

    engine.on('correction', setText);
    engine.on('error', (err) => console.error(err.code, err.message));

    engine.init();
    return () => engine.destroy();
  }, []);

  return (
    <div>
      <button onClick={() => engineRef.current?.start()}>Record</button>
      <button onClick={() => engineRef.current?.stop()}>Stop</button>
      <p>{text}</p>
    </div>
  );
}

Error Handling

import { STTEngine } from '@tekyzinc/stt-component';

const engine = new STTEngine();

engine.on('error', (err) => {
  switch (err.code) {
    case 'MIC_DENIED':
      alert('Please allow microphone access.');
      break;
    case 'MODEL_LOAD_FAILED':
      console.error('Model failed to load:', err.message);
      break;
    case 'TRANSCRIPTION_FAILED':
      // Non-fatal: recording continues, correction will retry
      console.warn('Transcription error:', err.message);
      break;
  }
});

engine.on('status', (state) => {
  console.log(`Status: ${state.status}, progress: ${state.loadProgress}%`);
});

await engine.init();

Browser Compatibility

| Browser | Backend | Notes |
|---------|---------|-------|
| Chrome 113+ | WebGPU | Full GPU acceleration |
| Edge 113+ | WebGPU | Full GPU acceleration |
| Firefox | WASM | Automatic fallback, slower inference |
| Safari 18+ | WASM | Automatic fallback, slower inference |

When backend is set to 'auto' (default), the engine attempts WebGPU first and falls back to WASM silently.
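The exact probe the engine uses for 'auto' is not specified here; the general idea is WebGPU feature detection, which can be sketched as follows (an assumption about the approach -- a real probe would likely also request a GPU adapter before committing):

```typescript
type Backend = 'webgpu' | 'wasm';

// Sketch of WebGPU feature detection (assumption; not the package's actual logic).
// Pass `navigator` in the browser; any object lacking `gpu` falls back to WASM.
function pickBackend(nav: { gpu?: unknown }): Backend {
  return nav.gpu ? 'webgpu' : 'wasm';
}

// In the browser: pickBackend(navigator)
```

After init() you can confirm which backend was actually selected via `engine.getState().backend`.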

Node.js

Compatible with Node.js >= 18 via @huggingface/transformers. In Node.js, the engine uses the WASM backend (no WebGPU). Audio capture (startCapture) requires browser APIs (navigator.mediaDevices), so in Node.js you would provide pre-recorded audio to the worker directly or use a Node.js audio library for capture.

Exports

The package exports all public types and utilities:

// Main API
import { STTEngine } from '@tekyzinc/stt-component';

// Types
import type {
  STTConfig,
  STTState,
  STTEvents,
  STTError,
  STTModelSize,
  STTBackend,
  STTStatus,
  STTCorrectionProvider,
  STTStreamingProvider,
  STTStreamingConfig,
} from '@tekyzinc/stt-component';

// Utilities (advanced usage)
import {
  DEFAULT_STT_CONFIG,
  resolveConfig,
  TypedEventEmitter,
  WorkerManager,
  CorrectionOrchestrator,
  SpeechStreamingManager,
} from '@tekyzinc/stt-component';

Troubleshooting

Vite: New features not working after upgrading

Symptom: After running npm install @tekyzinc/stt-component@latest, new features (like streaming transcription) don't appear. The old behavior persists despite the upgrade.

Cause: Vite pre-bundles dependencies into node_modules/.vite/deps/ for faster dev server startup. When you upgrade a package, Vite may continue serving the stale cached bundle instead of the updated code. The files in node_modules/@tekyzinc/stt-component/dist/ are correct, but Vite's dev server never reads them — it serves the pre-bundled copy from .vite/deps/.

Fix (immediate):

# Delete the Vite dependency cache
rm -rf node_modules/.vite

# Restart the dev server
npm run dev

Fix (permanent — recommended):

Add this to your vite.config.ts to exclude the package from pre-bundling entirely. Since @tekyzinc/stt-component ships as ESM, pre-bundling is unnecessary:

// vite.config.ts
export default defineConfig({
  optimizeDeps: {
    exclude: ['@tekyzinc/stt-component'],
  },
  // ... rest of your config
});

How to verify: After clearing the cache and restarting, open the browser DevTools Network tab and confirm the module is loaded directly from node_modules/@tekyzinc/stt-component/dist/ rather than from .vite/deps/.

Why this happens: Vite's dependency pre-bundling uses esbuild to convert packages into optimized ESM bundles on first run. This cache is keyed by the package.json lockfile hash, but certain upgrade scenarios (especially with scoped private packages) may not trigger cache invalidation. The result is that npm install updates the source files but Vite keeps serving the old pre-bundled version.

Web Speech API streaming not working

Symptom: correction events fire (Whisper is working) but transcript events never fire (no real-time streaming text).

Check these in order:

  1. Vite cache (most common) — see the section above. If you recently upgraded the package, this is almost certainly the issue.

  2. Browser support — Web Speech API streaming requires Chrome or Edge. Firefox and Safari do not support SpeechRecognition. Check with:

    import { SpeechStreamingManager } from '@tekyzinc/stt-component';
    console.log('Supported:', SpeechStreamingManager.isSupported());

  3. Streaming disabled — Streaming is enabled by default, but verify your config:

    const engine = new STTEngine({
      streaming: { enabled: true },  // default: true
    });

  4. Debug events — Subscribe to debug for internal diagnostics:

    engine.on('debug', (msg) => console.log(msg));

    Look for messages starting with [SSM] (SpeechStreamingManager) — they show whether the Speech API initialized, received results, or encountered errors.

Other bundlers (Webpack, Rollup, esbuild)

If using a bundler other than Vite, similar caching issues can occur:

  • Webpack: Delete .cache/ or the node_modules/.cache directory and restart
  • Turbopack: Delete .next/cache and restart
  • General rule: If an upgrade doesn't seem to take effect, clear your bundler's cache directory and rebuild

License

MIT