web-voice-kit

v0.3.0

A fully decoupled browser voice SDK: independent wake-word detection plus speech transcription, with smart auto-stop.

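Voice SDK

🎙️ A fully decoupled browser voice SDK: independent wake-word detection plus speech transcription, which you can combine freely.

✨ New architecture features (v0.3.0+)

  • 🔓 Fully decoupled: wake-word detection and speech transcription are completely independent, and each can be used on its own
  • 🎯 Flexible composition: you decide how the two parts are combined and how they interact
  • ⏱️ Smart timeouts: three auto-stop mechanisms (silence / no speech / maximum duration)
  • 🎨 Multiple usage modes: standalone components, the integrated version, and the original version (backward compatible)
  • 📦 TypeScript: complete type definitions
  • 🚀 Modern build: Vite with ESM/CJS output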

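Core components

1. WakeWordDetectorStandalone

A standalone wake-word detector that runs on a local Vosk model.

2. SpeechTranscriberStandalone

A standalone speech transcriber built on the iFLYTEK (Xunfei) real-time transcription API, with smart auto-stop.

3. VoiceSDKIntegrated

An optional convenience layer that coordinates wake-word detection and transcription automatically.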

Install

Once published to npm:

npm i web-voice-kit
# or
pnpm add web-voice-kit
# or
yarn add web-voice-kit

For local development in this repo, install deps and build:

npm i
npm run build

Quick start

Option 1: Standalone components (recommended) ⭐

import { WakeWordDetectorStandalone, SpeechTranscriberStandalone } from 'web-voice-kit';

// 1. Create the wake-word detector
const detector = new WakeWordDetectorStandalone({
  modelPath: '/path/to/vosk-model.zip'
});
detector.setWakeWords(['小红', '小虹']);

// 2. Create the speech transcriber (with smart auto-stop)
const transcriber = new SpeechTranscriberStandalone({
  appId: 'YOUR_APP_ID',
  apiKey: 'YOUR_API_KEY',
  websocketUrl: 'wss://rtasr.xfyun.cn/v1/ws',
  autoStop: {
    enabled: true,
    silenceTimeoutMs: 3000,      // stop after 3 s of silence
    noSpeechTimeoutMs: 5000,     // stop if no speech is heard within 5 s
    maxDurationMs: 60000         // hard limit of 60 s
  }
});

// 3. Define your own interaction logic
detector.onWake(async () => {
  console.log('Wake word detected!');
  await transcriber.start();
});

transcriber.onResult((result) => {
  console.log('Transcript:', result.transcript);
});

transcriber.onAutoStop((reason) => {
  console.log('Auto-stopped:', reason);
  detector.reset();
});

// 4. Start listening for the wake word
await detector.start();

Option 2: Integrated version

import { VoiceSDKIntegrated } from 'web-voice-kit';

const sdk = new VoiceSDKIntegrated({
  wakeWord: ['小红', '小虹'],
  voskModelPath: '/path/to/vosk-model.zip',
  xunfei: {
    appId: 'YOUR_APP_ID',
    apiKey: 'YOUR_API_KEY',
    websocketUrl: 'wss://rtasr.xfyun.cn/v1/ws',
    autoStop: {
      enabled: true,
      silenceTimeoutMs: 3000
    }
  },
  autoStartTranscriberOnWake: true
}, {
  onWake: () => console.log('Wake word detected!'),
  onTranscript: (text, isFinal) => console.log('Transcript:', text),
  onAutoStop: (reason) => console.log('Stopped:', reason)
});

await sdk.start();

Option 3: Transcription only (no wake word)

import { SpeechTranscriberStandalone } from 'web-voice-kit';

const transcriber = new SpeechTranscriberStandalone({
  appId: 'YOUR_APP_ID',
  apiKey: 'YOUR_API_KEY',
  websocketUrl: 'wss://rtasr.xfyun.cn/v1/ws',
  autoStop: { enabled: true, silenceTimeoutMs: 2000 }
});

transcriber.onResult((result) => {
  console.log(result.transcript);
});

// Start on button click (assumes `button` references an existing button element)
button.onclick = () => transcriber.start();

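🎯 Smart auto-stop

SpeechTranscriberStandalone provides three auto-stop mechanisms:

1. Silence timeout (silenceTimeoutMs)

Once speech has been detected, transcription stops automatically when silence lasts longer than the configured time.

  • Use case: end the session automatically when the user finishes speaking
  • Recommended value: 2000-5000 ms

2. No-speech timeout (noSpeechTimeoutMs)

If no speech activity occurs at all after start, transcription stops automatically.

  • Use case: guard against accidental triggers
  • Recommended value: 3000-8000 ms

3. Maximum duration (maxDurationMs)

Transcription is forced to stop once the maximum duration is exceeded.

  • Use case: prevent a session from holding resources for too long
  • Recommended value: 30000-120000 ms

All three are set through the autoStop option:

autoStop: {
  enabled: true,
  silenceTimeoutMs: 3000,      // stop after 3 s of silence
  noSpeechTimeoutMs: 5000,     // stop if no speech is heard within 5 s
  maxDurationMs: 60000         // hard limit of 60 s
}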
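
The recommended ranges above can be checked before a config is handed to the transcriber. The following is a minimal sketch and not part of web-voice-kit: `checkAutoStopTimeouts`, `AutoStopTimeouts` and `RECOMMENDED_RANGES_MS` are hypothetical names, and the ranges are simply copied from the recommendations in this section.

```typescript
// Hypothetical helper, not a web-voice-kit API.
// Only the three option names come from the README.
interface AutoStopTimeouts {
  silenceTimeoutMs?: number;
  noSpeechTimeoutMs?: number;
  maxDurationMs?: number;
}

// Recommended [min, max] ranges in milliseconds, as listed in this section.
const RECOMMENDED_RANGES_MS: Record<keyof AutoStopTimeouts, [number, number]> = {
  silenceTimeoutMs: [2000, 5000],
  noSpeechTimeoutMs: [3000, 8000],
  maxDurationMs: [30000, 120000],
};

// Returns one warning per configured value that falls outside its range.
// Values that are not set are skipped.
function checkAutoStopTimeouts(config: AutoStopTimeouts): string[] {
  const warnings: string[] = [];
  const keys = Object.keys(RECOMMENDED_RANGES_MS) as (keyof AutoStopTimeouts)[];
  for (const key of keys) {
    const value = config[key];
    if (value === undefined) continue;
    const [min, max] = RECOMMENDED_RANGES_MS[key];
    if (value < min || value > max) {
      warnings.push(`${key}=${value} is outside the recommended ${min}-${max} ms range`);
    }
  }
  return warnings;
}
```

A value outside a range is not necessarily wrong, which is why the sketch returns warnings rather than throwing.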

The settings can also be adjusted at runtime:

transcriber.updateAutoStopConfig({
  silenceTimeoutMs: 5000
});
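
To make the interplay of the three timeouts concrete, here is a small sketch of how such rules could be evaluated. It is an illustration under assumptions, not the library's implementation: `AutoStopTracker`, `markSpeech`, `check` and the reason strings are hypothetical, the order in which the rules are tested is a choice made for this sketch, and timestamps are passed in explicitly so the logic can be exercised without real timers.

```typescript
// Hypothetical illustration; web-voice-kit's internals may differ.
type AutoStopReason = 'silence' | 'no-speech' | 'max-duration';

interface AutoStopConfig {
  enabled: boolean;
  silenceTimeoutMs?: number;
  noSpeechTimeoutMs?: number;
  maxDurationMs?: number;
}

class AutoStopTracker {
  private config: AutoStopConfig;
  private startedAt = 0;
  private lastSpeechAt: number | null = null;

  constructor(config: AutoStopConfig) {
    this.config = config;
  }

  // Call when transcription starts.
  start(now: number): void {
    this.startedAt = now;
    this.lastSpeechAt = null;
  }

  // Call whenever speech activity is observed.
  markSpeech(now: number): void {
    this.lastSpeechAt = now;
  }

  // Merge a partial config, mirroring the runtime update shown above.
  update(partial: Partial<AutoStopConfig>): void {
    this.config = { ...this.config, ...partial };
  }

  // Returns the reason to stop at time `now`, or null to keep going.
  check(now: number): AutoStopReason | null {
    const { enabled, silenceTimeoutMs, noSpeechTimeoutMs, maxDurationMs } = this.config;
    if (!enabled) return null;
    // The hard limit applies regardless of speech activity.
    if (maxDurationMs !== undefined && now - this.startedAt >= maxDurationMs) {
      return 'max-duration';
    }
    if (this.lastSpeechAt === null) {
      // No speech heard yet: only the no-speech timeout can fire.
      if (noSpeechTimeoutMs !== undefined && now - this.startedAt >= noSpeechTimeoutMs) {
        return 'no-speech';
      }
      return null;
    }
    // Speech was heard: measure silence from the last activity.
    if (silenceTimeoutMs !== undefined && now - this.lastSpeechAt >= silenceTimeoutMs) {
      return 'silence';
    }
    return null;
  }
}
```

In a real integration, `check(Date.now())` would be polled from a timer and `markSpeech` called from whatever voice-activity signal is available.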

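📚 Documentation

For the complete usage guide and API documentation, see:

🔧 Model configuration

Wake-word detection requires a Vosk model:

const detector = new WakeWordDetectorStandalone({
  modelPath: '/path/to/vosk-model.zip'  // required
});

Getting a model:

  1. Download one from the Vosk Models page
  2. Host it on your own server or a CDN
  3. Make sure the browser can fetch it (watch out for CORS)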
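
One way to catch a missing file or a CORS misconfiguration early is to probe the model URL before constructing the detector. The sketch below is an assumption-laden illustration: `checkModelReachable` and `HeadFetcher` are hypothetical names, the fetch function is injected so the check can run outside a browser, and it assumes the host answers HEAD requests, which not every server or CDN does.

```typescript
// Hypothetical pre-flight check, not a web-voice-kit API.
// `HeadFetcher` is the small subset of the Fetch API this sketch relies on;
// the browser's global `fetch` fits this shape.
type HeadFetcher = (
  url: string,
  init: { method: 'HEAD' }
) => Promise<{ ok: boolean; status: number }>;

// Resolves to null when the URL looks reachable, otherwise to a short
// description of the problem.
async function checkModelReachable(url: string, fetcher: HeadFetcher): Promise<string | null> {
  try {
    const res = await fetcher(url, { method: 'HEAD' });
    return res.ok ? null : `Model URL responded with HTTP ${res.status}`;
  } catch (err) {
    // In browsers, a request blocked by CORS rejects just like a network failure.
    return `Model URL could not be fetched (network or CORS error): ${String(err)}`;
  }
}
```

In a browser this could be called as `await checkModelReachable('/path/to/vosk-model.zip', fetch)` and the result logged before `new WakeWordDetectorStandalone(...)`.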

Recommended models:

  • Chinese: vosk-model-small-cn-0.22 (about 42 MB)
  • English: vosk-model-small-en-us-0.15 (about 40 MB)

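🆚 Architecture comparison

| Feature | New architecture (standalone components) | Legacy architecture |
|------|-------------------|--------|
| Decoupling | ✅ Fully independent | ❌ Tightly coupled |
| Flexibility | ✅ Free composition | ❌ Fixed flow |
| Auto-stop | ✅ Three mechanisms | ⚠️ Simple timeout |
| State management | ✅ Fine-grained | ⚠️ Coarse-grained |
| Recommendation | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |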

🌐 Browser support

  • ✅ Chrome/Edge (recommended)
  • ✅ Firefox
  • ⚠️ Safari (partial functionality)
  • ❌ IE (not supported)

The browser must support:

  • navigator.mediaDevices.getUserMedia
  • Web Audio API
  • WebSocket
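
The three requirements above can be feature-detected before the SDK is started. This is a minimal sketch with hypothetical names (`missingVoiceFeatures`, `BrowserLike`); the environment is passed in as a parameter so the function can be tried outside a browser, and the `webkitAudioContext` fallback is included on the assumption that older Safari versions should count as having the Web Audio API.

```typescript
// Hypothetical helper, not a web-voice-kit API.
// `BrowserLike` describes only the globals this check looks at.
interface BrowserLike {
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
  AudioContext?: unknown;
  webkitAudioContext?: unknown; // prefixed constructor on older Safari
  WebSocket?: unknown;
}

// Returns the names of required features that the environment lacks.
function missingVoiceFeatures(env: BrowserLike): string[] {
  const missing: string[] = [];
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== 'function') {
    missing.push('navigator.mediaDevices.getUserMedia');
  }
  if (typeof env.AudioContext !== 'function' && typeof env.webkitAudioContext !== 'function') {
    missing.push('Web Audio API');
  }
  if (typeof env.WebSocket !== 'function') {
    missing.push('WebSocket');
  }
  return missing;
}
```

In a page, calling `missingVoiceFeatures(window)` and showing a fallback message when the result is non-empty would be one way to use it. Note that `getUserMedia` is only exposed in secure contexts (HTTPS or localhost), so an empty `navigator.mediaDevices` can also point to an insecure origin.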

🛠️ Development

npm install          # install dependencies
npm run dev          # start the dev server
npm run build        # build for production
npm run preview      # preview the production build

📄 License

MIT

🤝 Contributing

Issues and pull requests are welcome!