whisper-fast

v1.0.1

Published

3 months ago

WhisperFast - a high-performance native Node.js module for Whisper audio-to-text transcription


whisper speech-to-text audio-transcription whisper.cpp napi-rs native offline asr speech-recognition whisper-fast audio transcription ai machine-learning

WhisperFast

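A native Node.js transcription library built on Rust + napi-rs + whisper.cpp. This README is organized around two roles, users and maintainers, so you can jump straight to the part that applies to you.

For users

Installation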

npm install whisper-fast

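Note: after installation, the prebuilt native module for the current platform is loaded automatically; no manual compilation is required.

Capabilities

Load a model from a local model file path via WhisperModel
Transcribe from a file or from a Buffer
Segment-level and word-level timestamps
Language selection, translation, thread count, prompt, and decoding parameters
Low-level Whisper debug switches: printProgress, printRealtime, printTimestamps, printSpecial

Note: the library is currently positioned as a local model inference library; callers prepare and manage the model files themselves.

Quick start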

const { WhisperModel } = require('whisper-fast')

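async function main() {
  // Path to a local ggml model file
  const model = new WhisperModel('models/ggml-base.bin')
  // false disables the GPU (switch to true only with a CUDA build)
  await model.load(false)
  try {
    // Transcribe a file; extend options as needed
    const result = await model.transcribeFile('audio.wav', {
      language: 'zh',
      nThreads: 4,
      wordTimestamps: false
    })
    // Main body of the transcription
    console.log(result.text)
  } finally {
    // Always release the model so its memory is not held indefinitely
    if (model.isLoaded()) {
      model.unload()
    }
  }
}

// Report failures instead of leaving an unhandled promise rejection
main().catch((err) => {
  console.error(err)
  process.exitCode = 1
})

Note: in production, replace console.log with writing the result to disk or returning it to your business system.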
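The load / try / finally pattern above can be factored into a small wrapper so every call site releases the model the same way. This is a minimal sketch: `withModel` is a hypothetical helper, not part of whisper-fast, and it relies only on the `load`, `isLoaded`, and `unload` methods listed in this README.

```javascript
// Hypothetical helper (not part of whisper-fast): load the model, run a task,
// and always unload afterwards, even if the task throws.
async function withModel(model, task, useGpu = false) {
  await model.load(useGpu)
  try {
    return await task(model)
  } finally {
    // Mirror the README's guard: only unload a model that is actually loaded
    if (model.isLoaded()) {
      model.unload()
    }
  }
}

// Usage sketch (assumes whisper-fast is installed and both paths exist):
// const { WhisperModel } = require('whisper-fast')
// withModel(new WhisperModel('models/ggml-base.bin'), (m) =>
//   m.transcribeFile('audio.wav', { language: 'zh' })
// ).then((result) => console.log(result.text))
```

Passing the model in as an argument keeps the helper independent of how the model was constructed and makes it easy to exercise with a stub.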

API overview

Class

new WhisperModel(modelPath: string)
load(useGpu?: boolean): Promise<void>
isLoaded(): boolean
unload(): void
transcribeFile(audioPath: string, options?: TranscribeOptions): Promise<TranscriptionResult>
transcribeBuffer(audioBuffer: Buffer, options?: TranscribeOptions): Promise<TranscriptionResult>

Note: transcribeFile suits file-based input; transcribeBuffer suits cases where upstream code has already decoded or sliced the audio.

Utility functions
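When a caller may hold either a path or in-memory audio, the choice between the two methods can be made in one place. The sketch below is illustrative only: `transcribeAny` is a hypothetical name, and this README does not state which byte layout `transcribeBuffer` expects (for example encoded WAV versus raw PCM), so check that before passing a Buffer.

```javascript
// Hypothetical dispatcher (not part of whisper-fast): route the input to
// transcribeFile or transcribeBuffer depending on its type.
function transcribeAny(model, input, options = {}) {
  if (Buffer.isBuffer(input)) {
    // Audio already in memory, e.g. produced by upstream decoding or slicing
    return model.transcribeBuffer(input, options)
  }
  if (typeof input === 'string') {
    // Treat strings as paths to an audio file on disk
    return model.transcribeFile(input, options)
  }
  return Promise.reject(new TypeError('input must be a file path or a Buffer'))
}
```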

getVersion(): string
isValidModelFile(path: string): boolean
debugAudioInfo(path: string): AudioDebugInfo
getAvailableModels(): string[]
getModelSize(modelName: string): number

Note: the utility functions are for runtime probing, model validity checks, and capacity assessment.

TranscribeOptions (common fields)
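One way to use the validity check is as a preflight step before constructing a WhisperModel. The sketch below assumes nothing beyond the `isValidModelFile(path): boolean` signature listed above; `firstValidModel` is a hypothetical helper, and the validator is passed in so the function can be tried without the native module.

```javascript
// Hypothetical helper (not part of whisper-fast): return the first candidate
// path accepted by the validator, or null if none passes.
function firstValidModel(candidatePaths, isValid) {
  for (const candidate of candidatePaths) {
    if (isValid(candidate)) {
      return candidate
    }
  }
  return null
}

// Usage sketch (assumes whisper-fast is installed):
// const { isValidModelFile } = require('whisper-fast')
// const modelPath = firstValidModel(
//   ['models/ggml-small.bin', 'models/ggml-base.bin'],
//   isValidModelFile
// )
// if (modelPath === null) throw new Error('no usable model file found')
```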

language?: string
wordTimestamps?: boolean
translate?: boolean
nThreads?: number
bestOf?: number
beamSize?: number
temperature?: number
noContext?: boolean
suppressNonSpeechTokens?: boolean
maxLen?: number
prompt?: string
printProgress?: boolean
printRealtime?: boolean
printTimestamps?: boolean
printSpecial?: boolean

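Note: the debug switches are for troubleshooting model execution; keep them off by default to reduce console noise.

Example directory (in the repository)

Simplified local usage example: .tmp-node-new-case/transcribe.js
To run: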
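To keep the debug switches off unless someone asks for them, the options object can be built from a quiet baseline. This is a sketch under assumptions: `buildOptions` and `QUIET_DEFAULTS` are hypothetical names, the library's own defaults are not documented here (so the switches are set to false explicitly), and which switches a debug run turns on is an arbitrary choice.

```javascript
// Hypothetical baseline: all four documented debug switches explicitly off.
const QUIET_DEFAULTS = Object.freeze({
  printProgress: false,
  printRealtime: false,
  printTimestamps: false,
  printSpecial: false,
})

// Hypothetical helper: merge caller overrides over the quiet baseline.
// With debug = true, progress and timestamp printing are enabled first;
// explicit overrides still win because they are spread last.
function buildOptions(overrides = {}, debug = false) {
  const debugSwitches = debug ? { printProgress: true, printTimestamps: true } : {}
  return { ...QUIET_DEFAULTS, ...debugSwitches, ...overrides }
}

// Usage sketch:
// model.transcribeFile('audio.wav', buildOptions({ language: 'zh', nThreads: 4 }))
```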

cd .tmp-node-new-case
npm run start

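Note: the example reads a local model from .tmp-node-new-case/models and writes transcript.txt.

Models

The library currently uses local ggml model files and has no built-in automatic download. Prepare a model file first, for example: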

ggml-tiny.bin
ggml-base.bin
ggml-small.bin

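Download them from the whisper.cpp model release page and place them in your model directory. Note: larger models are usually more accurate, but memory usage and processing time also increase.

For maintainers

Development builds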
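Because the model files live in a directory you manage, it can help to list what is present and how large each file is before choosing one. The sketch below uses only Node.js built-ins; `listLocalModels` is a hypothetical helper, and the `ggml-*.bin` pattern simply follows the file names shown above.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper (not part of whisper-fast): list ggml-*.bin files in a
// directory with their on-disk size in bytes, smallest first.
function listLocalModels(modelDir) {
  return fs
    .readdirSync(modelDir)
    .filter((name) => /^ggml-.+\.bin$/.test(name))
    .map((name) => {
      const file = path.join(modelDir, name)
      return { name, file, bytes: fs.statSync(file).size }
    })
    .sort((a, b) => a.bytes - b.bytes)
}

// Usage sketch:
// for (const m of listLocalModels('models')) {
//   console.log(m.name, (m.bytes / 1024 / 1024).toFixed(1) + ' MiB')
// }
```

File size on disk is only a rough proxy for runtime memory use, so treat the ordering as a starting point rather than a measurement.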

npm run build:debug
npm run build
npm run build:cross

Note: build:cross builds cross-platform artifacts and is typically used as a pre-release check.

Release

npm install
npm run build
npm pack --dry-run
npm version patch
npm publish

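Before publishing, check that:

npm whoami shows you are logged in
npm run build succeeds
npm pack --dry-run lists the expected package contents

Note: run npm pack --dry-run first and confirm the package contains only the necessary files before publishing.

License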

MIT
