whisperforge
v0.5.5
Published
Fast GPU-accelerated speech-to-text CLI with streaming, quantization, speaker diarization, and multilingual support. Includes model conversion from HuggingFace safetensors.
Quick Links
- Full Documentation: WhisperForge Repository
- Installation: cargo install whisperforge
- Usage Guide: CLI Reference
Quick Start
# Transcribe audio (auto-selects WGPU when compiled in, otherwise CPU)
wforge transcribe -a audio.wav -m tiny_en_converted
# Force CPU backend
wforge transcribe -a audio.wav -m tiny_en_converted --device cpu
# SRT output with speaker diarization
wforge transcribe -a audio.wav -m tiny_en_converted --output-format srt --diarize -o output.srt
# Convert a HuggingFace model and list local models
wforge convert --model-id openai/whisper-tiny.en --output models/tiny_en_converted
wforge list-models
Options
- -a, --audio-file: Input audio (WAV, MP3, FLAC, OGG, M4A)
- -m, --model: Model name [default: tiny_en_converted]
- --output-format: text | srt | json [default: text]
- --device: auto | cpu | wgpu | cuda [default: auto]
- --diarize: Enable speaker diarization
- --decoding-preset: fast | balanced | accurate
For complete options and usage examples, see the full documentation.
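The options above can be combined in a small wrapper script. The following is a minimal sketch, not part of WhisperForge itself: it batch-transcribes every WAV file in a directory to SRT using only the flags listed in this README. The function names build_cmd and transcribe_dir, the DRY_RUN variable, and the choice to write each .srt next to its input are assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: batch-transcribe all *.wav files in a directory to SRT.
# Hypothetical helper script; only the wforge flags come from the README.
set -eu

MODEL="${MODEL:-tiny_en_converted}"   # README default model name
DEVICE="${DEVICE:-auto}"              # auto | cpu | wgpu | cuda
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands, 0 = run wforge

# Print the command line for one input file (preview only, paths unquoted).
build_cmd() {
  audio="$1"
  out="${audio%.*}.srt"               # talk.wav -> talk.srt
  printf 'wforge transcribe -a %s -m %s --device %s --output-format srt --diarize -o %s\n' \
    "$audio" "$MODEL" "$DEVICE" "$out"
}

# Walk a directory and transcribe (or preview) each WAV file in it.
transcribe_dir() {
  dir="$1"
  for f in "$dir"/*.wav; do
    [ -e "$f" ] || continue           # skip the literal pattern when no files match
    if [ "$DRY_RUN" = "1" ]; then
      build_cmd "$f"
    else
      wforge transcribe -a "$f" -m "$MODEL" --device "$DEVICE" \
        --output-format srt --diarize -o "${f%.*}.srt"
    fi
  done
}

# Use the first argument as the input directory, defaulting to the current one.
transcribe_dir "${1:-.}"
```

Saved under a name of your choice (say, batch.sh), running it as DRY_RUN=0 DEVICE=cpu sh batch.sh recordings would invoke wforge once per file. Leaving DRY_RUN unset is a safe way to check the generated commands first.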
See Also
- whisperforge-core: Library
- whisperforge-align: VAD & SRT
- whisperforge-diarize: Speaker diarization
