voxtral-transcribe-ts

v0.1.3

Minimal TypeScript wrapper for local transcription with Voxtral Mini 4B Realtime in Node.js.

This package targets the ONNX checkpoint:

  • onnx-community/Voxtral-Mini-4B-Realtime-2602-ONNX

It is intentionally small:

  • Node/TS only, no Python
  • thin wrapper around @huggingface/transformers + ONNX Runtime
  • no external audio decoder dependency
  • optional Mistral API transcription backend with no extra dependency

The built-in file loader only supports .wav input so the package can stay lightweight. If you already have PCM samples in memory, use transcribeAudio().

Architecture and multi-target rollout plan: PLAN.md

Install

npm install voxtral-transcribe-ts

Quick Start

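import { VoxtralTranscriber } from "voxtral-transcribe-ts";

// device "cpu" and dtype "q4" are also the documented defaults.
const transcriber = new VoxtralTranscriber({
  device: "cpu",
  dtype: "q4",
});

// The built-in file loader accepts .wav input only.
const result = await transcriber.transcribeFile("./sample.wav");
console.log(result.text);

// Dispose of the transcriber once you are done with it.
await transcriber.dispose();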

By default, the package auto-selects the audio decoder backend for the detected environment:

  • Node/local: InternalWavDecoder
  • Browser: BrowserNativeAudioDecoder

The package ships conditional entry points:

  • package root in Node -> dist/index.node.js
  • package root in browser-aware bundlers -> dist/index.browser.js
  • explicit subpaths:
    • voxtral-transcribe-ts/node
    • voxtral-transcribe-ts/browser

Environment Matrix

| Environment | Package entry | Inference runtime | Default decoder | File input strategy |
|---|---|---|---|---|
| Node / local | voxtral-transcribe-ts or voxtral-transcribe-ts/node | @huggingface/transformers + onnxruntime-node | InternalWavDecoder | wav by default, multiformat via FfmpegDecoder |
| Browser | voxtral-transcribe-ts in browser-aware bundlers or voxtral-transcribe-ts/browser | browser-safe package entry | BrowserNativeAudioDecoder | URL, Blob, File, browser codec support dependent on runtime |
| Server high-perf | voxtral-transcribe-ts/node | @huggingface/transformers + onnxruntime-node | FfmpegDecoder recommended | multiformat through ffmpeg |
| Mistral API | voxtral-transcribe-ts/node | HTTPS API call to Mistral | Mistral-hosted | local path upload or file_url |

Decoder Matrix

| Decoder | Environment | Purpose | Notes |
|---|---|---|---|
| InternalWavDecoder | Node, browser | Minimal fallback | wav only |
| FfmpegDecoder | Node / server | Best multiformat local path | Not available in browser builds |
| BrowserNativeAudioDecoder | Browser | Native client-side decoding | Depends on browser codec support |

You can override the automatic selection with these constructor options:

  • target: "auto" | "node" | "browser"
  • audioDecoderBackend
  • inferenceBackend

Raw Audio

import { transcribeAudio } from "voxtral-transcribe-ts";

const samples = new Float32Array([/* mono PCM samples */]);

const result = await transcribeAudio(samples, {
  sampleRate: 16_000,
});

console.log(result.text);
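
Raw PCM streams and capture APIs often deliver 16-bit signed integers rather than floats. The sketch below converts them before a call to transcribeAudio(). It assumes the package expects floating-point samples normalized to the range [-1, 1], which is a common convention but is not stated in this README. int16ToFloat32 is a hypothetical helper, not a package export.

```typescript
// Hypothetical helper: convert 16-bit signed PCM to normalized floats.
function int16ToFloat32(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    // Dividing by 32768 maps the most negative value (-32768) to exactly -1.0.
    out[i] = pcm[i] / 32768;
  }
  return out;
}

const floats = int16ToFloat32(new Int16Array([0, 16384, -32768]));
console.log(Array.from(floats)); // [0, 0.5, -1]
```

The resulting Float32Array can then be passed as the samples argument, together with the sampleRate the audio was captured at.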

API

new VoxtralTranscriber(options?)

Options:

  • model: defaults to onnx-community/Voxtral-Mini-4B-Realtime-2602-ONNX
  • modelPath: optional local path or pre-provisioned snapshot path used instead of fetching by model id
  • device: defaults to cpu
  • dtype: defaults to q4
  • cacheDir
  • localFilesOnly
  • requireLocalModel: when true, fail instead of attempting a runtime download
  • revision
  • progressCallback
  • target: defaults to auto

await transcriber.load()

Preloads the processor and model.

await transcriber.transcribeFile(path, options?)

Reads a WAV file, downmixes it to mono, resamples it to the model sample rate, and returns:

type VoxtralTranscriptionResult = {
  decoder: string;
  durationMs: number;
  model: string;
  sampleRate: number;
  text: string;
};
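
The downmix and resample steps described above can be sketched as follows. This is not the package's implementation: the function names are hypothetical, the downmix is a plain channel average, and the resampler uses linear interpolation with no anti-aliasing filter, which is simpler than what a production resampler would normally do.

```typescript
// Average interleaved channels (L R L R ...) into a single mono channel.
function downmixToMono(interleaved: Float32Array, channels: number): Float32Array {
  if (!Number.isInteger(channels) || channels < 1) {
    throw new RangeError("channels must be a positive integer");
  }
  const frames = Math.floor(interleaved.length / channels);
  const mono = new Float32Array(frames);
  for (let f = 0; f < frames; f++) {
    let sum = 0;
    for (let c = 0; c < channels; c++) {
      sum += interleaved[f * channels + c];
    }
    mono[f] = sum / channels;
  }
  return mono;
}

// Linear-interpolation resampler (illustrative; no low-pass filtering).
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("sample rates must be positive");
  }
  if (fromRate === toRate || input.length === 0) return input.slice();
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), input.length - 1);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

// Example: stereo 32 kHz input reduced to mono 16 kHz.
const stereo = new Float32Array([1, 0, 0.5, 0.5, 0, 1, 0.25, 0.75]);
const mono16k = resampleLinear(downmixToMono(stereo, 2), 32_000, 16_000);
console.log(mono16k.length); // 2
```

If you preprocess audio yourself like this, pass the result to transcribeAudio() with the matching sampleRate.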

await transcriber.transcribeAudio(samples, options?)

Transcribes mono PCM samples already loaded in memory.

Options:

  • sampleRate: defaults to 16000
  • maxNewTokens
  • skipSpecialTokens: defaults to true

Advanced

The transcriber separates two concerns:

  • inference backend
  • audio decoder backend

The current defaults are:

  • TransformersInferenceBackend
  • InternalWavDecoder in Node
  • BrowserNativeAudioDecoder in browsers

For multiformat local/server decoding, use FfmpegDecoder.

import { FfmpegDecoder, VoxtralTranscriber } from "voxtral-transcribe-ts";

const transcriber = new VoxtralTranscriber({
  audioDecoderBackend: new FfmpegDecoder(),
});

const result = await transcriber.transcribeFile("./sample.mp3");
console.log(result.text);

Browser inputs can be passed as URLs or Blob / File objects when using BrowserNativeAudioDecoder or the default browser auto-selection.

You can also create an instance through createTranscriber(options), which uses the same defaults and target rules as new VoxtralTranscriber(options).

Optional Mistral API Backend

The package also exposes an optional hosted Voxtral transcription backend. This is not local/offline, but it is useful when latency matters more than self-hosting.

It adds no npm dependency and uses the platform fetch / FormData APIs.

import { MistralVoxtralApiTranscriber } from "voxtral-transcribe-ts/node";

const transcriber = new MistralVoxtralApiTranscriber({
  // Optional in Node if process.env.MISTRAL_API_KEY is set.
  apiKey: process.env.MISTRAL_API_KEY,
});

const result = await transcriber.transcribeFile("./sample.mp3", {
  language: "fr",
});

console.log(result.text);

For remote audio, pass the URL directly instead of downloading the file yourself:

import { transcribeFileWithMistral } from "voxtral-transcribe-ts/node";

const result = await transcribeFileWithMistral("https://example.com/audio.wav", {
  apiKey: process.env.MISTRAL_API_KEY,
  language: "fr",
});

API options:

  • model: defaults to voxtral-mini-2602
  • apiKey: defaults to process.env.MISTRAL_API_KEY in the Node transcriber
  • baseUrl: defaults to https://api.mistral.ai/v1
  • language
  • diarize
  • timestampGranularities: segment or word
  • contextBias
  • temperature

Browser builds also expose MistralVoxtralApiTranscriber, but do not put a long-lived Mistral API key in frontend code. Use a short-lived token or proxy if you need this path in a browser.

Enterprise / Artifactory

There are two separate concerns in enterprise environments:

  • npm dependency installation
  • model provisioning

npm install voxtral-transcribe-ts only installs the package and its npm dependencies. It does not download the Voxtral model checkpoint during package installation.

By default, the model may still be fetched later at runtime when the transcriber first loads. In registry-controlled environments such as Artifactory, the recommended setup is:

  • proxy npm dependencies through your internal registry
  • pre-provision the Voxtral model snapshot on disk or in an internal artifact store
  • point the transcriber at that local snapshot
  • require local-only model loading so runtime fails fast instead of reaching out to Hugging Face

import { FfmpegDecoder, VoxtralTranscriber } from "voxtral-transcribe-ts/node";

const transcriber = new VoxtralTranscriber({
  audioDecoderBackend: new FfmpegDecoder(),
  modelPath: "/opt/models/Voxtral-Mini-4B-Realtime-2602-ONNX",
  requireLocalModel: true,
});

With modelPath set, the package treats the model as a local artifact and enables local-only loading for the runtime backend. That is the mode to use for Artifactory + local model deployments.
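
The local-only rule described above can be sketched as a small pure function. This models the documented behavior only; resolveLoadPolicy, ModelOptions, and LoadPolicy are hypothetical names, and how the package combines these options internally is an assumption.

```typescript
// Subset of the documented constructor options relevant to model loading.
interface ModelOptions {
  model?: string;
  modelPath?: string;
  localFilesOnly?: boolean;
  requireLocalModel?: boolean;
}

// Hypothetical result type: where to load from, and whether a fetch is allowed.
interface LoadPolicy {
  source: string;
  allowRemoteFetch: boolean;
}

const DEFAULT_MODEL = "onnx-community/Voxtral-Mini-4B-Realtime-2602-ONNX";

function resolveLoadPolicy(options: ModelOptions = {}): LoadPolicy {
  // Assumption: any of these three options switches loading to local-only.
  const localOnly =
    options.modelPath !== undefined ||
    options.requireLocalModel === true ||
    options.localFilesOnly === true;
  return {
    source: options.modelPath ?? options.model ?? DEFAULT_MODEL,
    allowRemoteFetch: !localOnly,
  };
}

console.log(
  resolveLoadPolicy({
    modelPath: "/opt/models/Voxtral-Mini-4B-Realtime-2602-ONNX",
    requireLocalModel: true,
  }),
);
```

A policy object like this makes the fail-fast behavior easy to assert in tests without touching the network.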

Browser Entry

import { createTranscriber } from "voxtral-transcribe-ts/browser";

const transcriber = createTranscriber({
  target: "browser",
});

Node Entry

import { createTranscriber, FfmpegDecoder } from "voxtral-transcribe-ts/node";

const transcriber = createTranscriber({
  target: "node",
  audioDecoderBackend: new FfmpegDecoder(),
});

WAV Support

The internal WAV decoder supports:

  • PCM 8/16/24/32-bit
  • IEEE float 32-bit
  • mono or multi-channel input, mixed down to mono
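
To check ahead of time whether a file falls inside that list, you can inspect its header. The sketch below reads the fmt chunk of a RIFF/WAVE buffer and applies the constraints above. It is an independent illustration, not the internal decoder: parseWavFormat and isSupportedWav are hypothetical names, and WAVE_FORMAT_EXTENSIBLE headers are deliberately not handled.

```typescript
interface WavFormat {
  audioFormat: number; // 1 = integer PCM, 3 = IEEE float
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

function readAscii(view: DataView, offset: number, length: number): string {
  let text = "";
  for (let i = 0; i < length; i++) {
    text += String.fromCharCode(view.getUint8(offset + i));
  }
  return text;
}

// Walk the RIFF chunks until the "fmt " chunk is found.
function parseWavFormat(buffer: ArrayBuffer): WavFormat {
  const view = new DataView(buffer);
  if (
    view.byteLength < 12 ||
    readAscii(view, 0, 4) !== "RIFF" ||
    readAscii(view, 8, 4) !== "WAVE"
  ) {
    throw new Error("Not a RIFF/WAVE file");
  }
  let offset = 12;
  while (offset + 8 <= view.byteLength) {
    const id = readAscii(view, offset, 4);
    const size = view.getUint32(offset + 4, true);
    if (id === "fmt ") {
      if (size < 16 || offset + 24 > view.byteLength) {
        throw new Error("Truncated fmt chunk");
      }
      return {
        audioFormat: view.getUint16(offset + 8, true),
        channels: view.getUint16(offset + 10, true),
        sampleRate: view.getUint32(offset + 12, true),
        bitsPerSample: view.getUint16(offset + 22, true),
      };
    }
    // Chunk bodies are padded to an even number of bytes.
    offset += 8 + size + (size % 2);
  }
  throw new Error("Missing fmt chunk");
}

// Mirrors the support list above: PCM 8/16/24/32-bit or 32-bit IEEE float.
function isSupportedWav(fmt: WavFormat): boolean {
  const bits = fmt.bitsPerSample;
  const pcm =
    fmt.audioFormat === 1 &&
    (bits === 8 || bits === 16 || bits === 24 || bits === 32);
  const float = fmt.audioFormat === 3 && bits === 32;
  return (pcm || float) && fmt.channels >= 1;
}
```

In Node you could obtain the ArrayBuffer from a file read and route unsupported files to FfmpegDecoder or your own decoding path.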

For mp3, m4a, ogg, or flac, decode audio yourself and call transcribeAudio().

If you want the package to decode those formats for you on local/server, instantiate the transcriber with FfmpegDecoder.

Validation

npm run validate
npm run test:smoke

Benchmark

The repository includes a benchmark harness for comparing voxtral-transcribe-ts against faster-whisper on WER, CER, and real-time factor.

See BENCHMARK.md.
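
For readers unfamiliar with the three metrics, here is a minimal sketch of how they are commonly defined. It is not the benchmark harness itself: the function names are hypothetical, and the text normalization (lowercasing and whitespace splitting) is an assumption that may differ from what BENCHMARK.md describes.

```typescript
// Levenshtein distance over arbitrary token sequences (two-row DP).
function editDistance<T>(ref: readonly T[], hyp: readonly T[]): number {
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[hyp.length];
}

// Assumed normalization: lowercase, split on whitespace.
function tokenizeWords(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
}

// WER = word-level edits / number of reference words.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenizeWords(reference);
  if (ref.length === 0) throw new Error("reference must contain at least one word");
  return editDistance(ref, tokenizeWords(hypothesis)) / ref.length;
}

// CER = character-level edits / number of reference characters.
function charErrorRate(reference: string, hypothesis: string): number {
  const ref = Array.from(reference);
  if (ref.length === 0) throw new Error("reference must not be empty");
  return editDistance(ref, Array.from(hypothesis)) / ref.length;
}

// RTF = processing time / audio duration; below 1 is faster than real time.
function realTimeFactor(processingSeconds: number, audioSeconds: number): number {
  if (audioSeconds <= 0) throw new RangeError("audioSeconds must be positive");
  return processingSeconds / audioSeconds;
}

console.log(wordErrorRate("the cat sat on the mat", "the cat sat on mat"));
console.log(realTimeFactor(2, 10)); // 0.2
```

Lower is better for all three values; WER and CER can exceed 1 when the hypothesis contains many insertions.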

CI / Release

The repository now ships a GitHub Actions workflow in .github/workflows/typescript-ci.yml modeled after graphify.

It does four things:

  • runs npm run validate on Node 20 and 22
  • builds a tarball and installs it into a pristine temp project with npm install
  • verifies the published root, node, and browser exports
  • publishes to npm on tags matching v*

Publish strategy:

  • default: GitHub Actions trusted publishing with id-token: write
  • fallback: if NPM_TOKEN is configured as a repository secret, the workflow uses that token instead

Local pre-publish check:

npm run test:smoke

The smoke test exercises a fresh-machine install path: it packs the library, creates an empty temp project, runs npm install <tarball>, then verifies the installed dependencies and exports from that temp install.

The runtime tests also cover the enterprise local-model path: they check that modelPath together with requireLocalModel is forwarded to the runtime backend as a local-only load, so the runtime can be configured to never fetch the model remotely.

Typical release flow:

npm version patch
git push
git push --tags