# faster-whisper-ts
TypeScript port of SYSTRAN/faster-whisper for Node.js. This repository keeps the native CTranslate2 bridge, audio decoding, mel feature extraction, tokenizer handling, Silero VAD, and end-to-end Whisper transcription in a root-level npm package.
Thanks to the original faster-whisper project for the implementation direction, API model, and baseline behavior this port is built from.
## Status
- Root package is now npm-only.
- Native bridge build works from repo root.
- Current npm package version is `1.2.1`, aligned with the upstream Python `faster-whisper` 1.2.1 baseline for the supported Node.js/Linux CPU target.
- The stable gate is backed by the parity matrix documented below, including local `tiny`+`base` CTranslate2 model runs.
- Current smoke coverage passes on:
  - bridge load/free
  - audio decoding
  - tokenizer behavior
  - feature extraction
  - VAD
  - end-to-end transcription on `tests/data/jfk.flac`
- Transcription timestamp splitting, VAD timestamp restoration, word timestamps, clip timestamps, hotwords, hallucination silence skipping, and language-detection thresholds now follow the upstream logic instead of the earlier POC shortcuts.
- Some broader validation remains intentionally outside the current stable scope:
  - full benchmark matrix
  - broader runtime/platform validation beyond the current Linux CPU path
  - larger model matrix beyond the current `tiny`+`base` pre-stable run
## Requirements
- Node.js 20+
- `npm`
- `cmake`
- a C++17 compiler toolchain
- `ffmpeg` available on `PATH` for Node.js audio decoding, or `FASTER_WHISPER_FFMPEG_PATH` set explicitly
The package builds `whisper_bridge` locally during install. The repository vendors the CTranslate2 headers and shared libraries needed by the current Linux CPU path.

A fresh `npm install` still needs network access for `onnxruntime-node`. The Node.js decoder no longer depends on `ffmpeg-static`; it uses the system `ffmpeg` binary instead.
## Runtime Support
- Supported release target: Node.js 20+ on the current Linux CPU path with system `ffmpeg`
- Kept in code but not yet declared as a stable package target: the browser-oriented FFmpeg WASM fallback in `audio.ts`
- Not yet packaged as a stable release target: GPU runtime variants and prebuilt native binaries
- First npm release decision: install from source at `postinstall`; prebuilt binaries are explicitly deferred
## Installation
Current package name:

```bash
npm install faster-whisper-ts
```

From this repository before publication:

```bash
git clone https://github.com/rhanka/faster-whisper.git
cd faster-whisper
npm ci
npm run build
```

## Usage
```ts
import { WhisperModel } from "faster-whisper-ts";

const model = new WhisperModel("/absolute/path/to/ctranslate2-model", "cpu", 0, "default");
const [segments, info] = await model.transcribe("tests/data/jfk.flac", {
  beamSize: 5,
  vadFilter: true,
});
console.log(info);
for (const segment of segments) {
  console.log(`[${segment.start.toFixed(2)} -> ${segment.end.toFixed(2)}] ${segment.text}`);
}
model.free();
```

The exported surface currently includes: `WhisperModel`, `Tokenizer`, `FeatureExtractor`, `SileroVADModel`, `getSpeechTimestamps`, `decodeAudio`, `padOrTrim`.
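Beyond `WhisperModel`, the standalone helpers can be combined for VAD-only inspection of a file. A minimal sketch, assuming `decodeAudio` resolves to 16 kHz mono samples as a `Float32Array` and `getSpeechTimestamps` returns sample-indexed `{ start, end }` ranges — check the package typings for the exact signatures:

```ts
import { decodeAudio, getSpeechTimestamps } from "faster-whisper-ts";

// Assumption: 16 kHz mono PCM in a Float32Array, matching the Whisper input rate.
const audio = await decodeAudio("tests/data/jfk.flac");

// Assumption: timestamps are expressed in samples at 16 kHz, as in upstream Silero VAD usage.
const speech = await getSpeechTimestamps(audio);
for (const { start, end } of speech) {
  console.log(`speech: ${(start / 16000).toFixed(2)}s -> ${(end / 16000).toFixed(2)}s`);
}
```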
## Current Option Status
The current TypeScript surface now covers the main transcription options needed for Python 1.2.1 parity on the supported Node.js/Linux CPU path.
Validated in the current smoke path (a combined call is sketched after this list):
- base transcription from a local audio file
- `beamSize`
- `vadFilter`
- `wordTimestamps`
- `clipTimestamps`
- `hallucinationSilenceThreshold`
- `hotwords`
- `languageDetectionThreshold`
- `languageDetectionSegments`
- root-level native bridge loading and teardown
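A minimal sketch combining these validated options in one call; the values are illustrative, and the `clipTimestamps` unit (seconds, as in upstream `clip_timestamps`) is an assumption to verify against the typings:

```ts
const [segments, info] = await model.transcribe("tests/data/jfk.flac", {
  beamSize: 5,
  vadFilter: true,
  wordTimestamps: true,
  clipTimestamps: [0, 10], // assumption: [start, end] pairs in seconds
  hallucinationSilenceThreshold: 1.0, // skip probable hallucinations over long silences
  hotwords: "Kennedy", // bias decoding toward expected terms
  languageDetectionThreshold: 0.5,
  languageDetectionSegments: 1,
});
console.log(info);
```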
Still pending beyond this scoped stable release:
- broader multilingual/runtime validation beyond the current smoke path
- wider model validation beyond the current `tiny`+`base` pre-stable run
- benchmark parity and performance documentation
## Current Stable Scope And Limitations
The current npm line is intentionally scoped. It is validated for the current Node.js/Linux CPU path with a pre-stable `tiny` + `base` CTranslate2 model matrix; broader platform/model parity can still be expanded after this stable release.
Documented release limitations:
- supported release target: Node.js 20+ on the current Linux CPU path
- browser/WASM audio fallback is kept in code, but is not declared stable
- GPU runtime variants and prebuilt native binaries are not part of this release
- parity is demonstrated by smoke/regression coverage and CI on the supported path, not yet by a full upstream option-by-option matrix across models and platforms
If you need GPU, browser/WASM, prebuilt binaries, or benchmark-backed performance claims, this npm line is not the final parity target yet.
## Word Timestamps
Word-level timestamp extraction is supported through the native CTranslate2 alignment path:

```ts
const [segments] = await model.transcribe("audio.mp3", {
  wordTimestamps: true,
  hallucinationSilenceThreshold: 1.0,
});
for (const word of segments[0]?.words ?? []) {
  console.log(word.start, word.end, word.word, word.probability);
}
```

## VAD
Silero VAD is packaged at `assets/silero_vad_v6.onnx` and resolved relative to the installed package layout. Enable it during transcription with:

```ts
const [segments] = await model.transcribe("audio.mp3", {
  vadFilter: true,
});
```

If `ffmpeg` is not on `PATH`, set:

```bash
FASTER_WHISPER_FFMPEG_PATH=/absolute/path/to/ffmpeg
```
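In a Node.js process, the same override can also be applied before the first decode call. A sketch, assuming the variable is read when decoding starts (the path shown is an example, not a packaged default):

```ts
// Point the Node.js decoder at an explicit ffmpeg binary before any decoding runs.
process.env.FASTER_WHISPER_FFMPEG_PATH = "/opt/ffmpeg/bin/ffmpeg"; // example path
```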
## Tests

Run the full local validation flow from the repository root:

```bash
npm test
npm run test:pristine-install
```

`npm test` will:
- build the native bridge and TypeScript output,
- ensure the tiny test model is available in `test-models/faster-whisper-tiny` or via `FASTER_WHISPER_TEST_MODEL`,
- run the current smoke suite.
If you already have a converted CTranslate2 model elsewhere, point tests to it:
```bash
FASTER_WHISPER_TEST_MODEL=/absolute/path/to/model npm test
```

`npm run test:pristine-install` adds the release-style smoke check: it packs the library, creates an empty temp project, runs `npm install <tarball>`, then verifies the installed runtime dependencies, native build output, and root exports from that clean install.
## Benchmark And Performance
The npm-only port does not currently ship the old Python benchmark harness. Performance claims from the original repository are therefore not restated here until the Node.js port has its own reproducible benchmark matrix.
The current priority is correctness of the root package, deterministic native builds, and a documented install path for the Linux CPU target.
## Logging
The current package does not yet expose a structured logging API equivalent to the original Python surface. In practice:
- native bridge and CTranslate2 messages may still appear on stderr/stdout,
- smoke tests rely on process output today,
- package-level logging controls remain to be designed if a stable public logger is introduced later.
## Packaging
Useful root-level commands:

```bash
npm run build
npm run lint
npm test
npm run test:parity-matrix
npm run test:pristine-install
npm run pack:check
```

`npm run test:parity-matrix` runs the pre-stable parity matrix. By default it uses the local tiny CTranslate2 test model. For the stable parity gate, run it with at least `tiny` and `base`:
```bash
FASTER_WHISPER_PARITY_MODELS="tiny=/path/to/tiny,base=/path/to/base" npm run build:ts
FASTER_WHISPER_PARITY_MODELS="tiny=/path/to/tiny,base=/path/to/base" npm run test:parity-matrix
```

`npm run pack:check` verifies the publishable tarball from the repo root. The published package is intended to contain only the compiled JS, assets, build scripts, vendored native inputs, and license/readme files required at install time.
Release and publish steps are documented in `RELEASING.md`.
## Going Further
- Use a local converted CTranslate2 Whisper model directory and pass its absolute path to `new WhisperModel(...)`.
- Integrate the package from Node.js first; the browser-oriented fallback path is intentionally not documented as stable yet.
- Keep release expectations scoped to the current Linux CPU path until GPU and prebuilt distribution are validated separately.
## Model Conversion
Model conversion tooling is intentionally not bundled in this npm-only repository. For now, prepare converted Whisper models outside this package using the original faster-whisper / CTranslate2 workflow, then point the Node.js API at the resulting model directory.
The expected runtime inputs remain the converted model weights plus tokenizer/config files such as `tokenizer.json`.
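For illustration, a typical upstream conversion uses the `ct2-transformers-converter` CLI shipped with CTranslate2. This is a sketch of the upstream workflow, not tooling from this package; the model ID, output path, and flags are examples, so check the upstream faster-whisper docs for the current invocation:

```bash
# Install the upstream conversion tooling (Python side, outside this package).
pip install ctranslate2 "transformers[torch]"

# Convert an OpenAI Whisper checkpoint into a CTranslate2 model directory,
# copying the tokenizer/config files the Node.js runtime expects.
ct2-transformers-converter --model openai/whisper-tiny \
  --output_dir /absolute/path/to/ctranslate2-model \
  --copy_files tokenizer.json preprocessor_config.json \
  --quantization float16
```

The resulting directory is what gets passed to `new WhisperModel(...)`.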
## Performance Comparison Notes
This repository does not currently claim benchmark parity with upstream faster-whisper. The first npm release is positioned as a TypeScript/Node packaging and runtime port, not as a reproduced performance study.
## Removed From The npm Port
The following repository surfaces from the original Python-first layout were intentionally removed from this npm-only port:
- Python packaging and import surface
- Python-oriented benchmark harness
- Docker assets tied to the previous Python workflow
These removals are intentional and part of the root-level npm cut-over, not accidental omissions.
## Notes On Parity
This repository is not claiming complete feature parity with upstream faster-whisper yet. The current port is centered on the existing TypeScript implementation that already validates the main inference path. Items still intentionally deferred or not yet fully documented include:
- benchmark reproduction in Node beyond the current smoke coverage
- broader model/runtime validation beyond the tiny-model Linux CPU path
- stable word-timestamp parity
- clip-based seek controls and the remaining deferred transcription options listed above
## License
This repository remains under the MIT license. See `LICENSE`.
