@mingto/microsoft-lat

v1.0.5

Published

3 months ago

Microsoft LAT speech-to-text

minto_marketing

LAT Speech-to-Text

@mingto/microsoft-lat

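A real-time speech-to-text library built on Microsoft Azure Cognitive Services, with support for streaming recognition and multiple languages.

Features

🎙️ Streaming recognition: accepts a live audio stream and recognizes speech while recording
🌍 Multi-language support: Chinese, English, and other languages and dialects
🔒 Secure authentication: authenticates with an Azure subscription key
📱 Browser support: optimized for browser environments, with built-in microphone management
🔧 Flexible configuration: custom language, auto-stop parameters, and more
⚡ Real-time feedback: status callbacks for recognition in progress, recognition complete, and session start/end
🛡️ Automatic disposal: instances are destroyed automatically when the page unloads, which avoids memory leaks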

Installation

pnpm add @mingto/microsoft-lat

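Prerequisites

A Microsoft Azure account
A Speech service resource created in the Azure portal, with its subscription key (key) and region (region)
⚠️ Note: do not expose the key directly in front-end code; fetch it through a server-side endpoint instead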
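
One way to follow the note above is to have the page ask your own backend for the configuration at runtime. The sketch below is an assumption-laden illustration, not part of this package: the endpoint `/api/speech-config`, the helper names `parseSpeechConfig` and `loadSpeechConfig`, and the injected `fetchJson` function are all hypothetical. Only the `subscriptionKey` and `region` fields come from this README.

```typescript
// Shape accepted by microsoftLat.config(), per the parameter table in this README.
interface SystemConfig {
  subscriptionKey: string
  region: string
}

// Validates an untrusted JSON payload before it is handed to config().
function parseSpeechConfig(payload: unknown): SystemConfig {
  if (typeof payload !== 'object' || payload === null) {
    throw new Error('speech config must be an object')
  }
  const { subscriptionKey, region } = payload as Record<string, unknown>
  if (typeof subscriptionKey !== 'string' || subscriptionKey.length === 0) {
    throw new Error('speech config is missing subscriptionKey')
  }
  if (typeof region !== 'string' || region.length === 0) {
    throw new Error('speech config is missing region')
  }
  return { subscriptionKey, region }
}

// fetchJson is injected so the sketch does not depend on a particular HTTP client.
// '/api/speech-config' is a hypothetical endpoint on your own backend.
async function loadSpeechConfig(
  fetchJson: (url: string) => Promise<unknown>,
  url = '/api/speech-config',
): Promise<SystemConfig> {
  return parseSpeechConfig(await fetchJson(url))
}
```

In a browser this could be wired up as `microsoftLat.config(await loadSpeechConfig((u) => fetch(u).then((r) => r.json())))`. The key still reaches the browser at runtime, so the endpoint should be protected by your own session or authorization checks.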

Quick Start

import microsoftLat from '@mingto/microsoft-lat'

// 1. Set the system parameters (only needed once)
microsoftLat.config({
  subscriptionKey: 'YOUR_AZURE_SUBSCRIPTION_KEY',
  region: 'southeastasia',
})

// 2. Create a recognizer instance
const speechRecognizer = microsoftLat.create(
  {
    language: 'zh-CN',
  },
  {
    autoControl: true,
    initialDelay: 3500,
    subsequentDelay: 3000,
  }
)

// 3. Subscribe to events
speechRecognizer.on('appResponseText', (text) => {
  console.log('Recognizing:', text)
})

speechRecognizer.on('appResultText', (text) => {
  console.log('Recognized:', text)
})

speechRecognizer.on('appFinish', () => {
  console.log('Recognition finished')
})

// 4. Start recognition
speechRecognizer.start()

// 5. Stop recognition
// speechRecognizer.end()

API

Global configuration

microsoftLat.config(systemConfig)

Sets the system parameters. It only needs to be called once, globally.

microsoftLat.config({
  subscriptionKey: 'YOUR_AZURE_SUBSCRIPTION_KEY',
  region: 'southeastasia',
})

Parameters:

| Parameter | Type | Required | Description |
|------|------|------|------|
| subscriptionKey | string | ✅ | Azure Speech service subscription key |
| region | string | ✅ | Azure service region, e.g. eastasia, southeastasia |

Creating an instance

microsoftLat.create(businessParams?, sectionDelayParams?)

Creates a speech recognition controller instance.

const lat = microsoftLat.create(
  {
    language: 'zh-CN',
  },
  {
    autoControl: true,
    initialDelay: 3500,
    subsequentDelay: 3000,
  }
)

Business parameters businessParams:

| Parameter | Type | Required | Default | Description |
|------|------|------|--------|------|
| language | string | ❌ | 'zh-CN' | Source language code |

Section delay parameters sectionDelayParams:

| Parameter | Type | Required | Default | Description |
|------|------|------|--------|------|
| autoControl | boolean | ❌ | true | Whether recognition is ended automatically |
| initialDelay | number | ❌ | 3500 | Initial delay (milliseconds) |
| subsequentDelay | number | ❌ | 3000 | Subsequent delay (milliseconds) |
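
Because every field of sectionDelayParams is optional, it can help to see the effective values after the documented defaults are filled in. The helper below is a hypothetical sketch for that purpose (for example, to log the settings your app will run with). `resolveDelayParams` and `DEFAULT_DELAYS` are not exports of this package, and the package presumably applies the same defaults internally.

```typescript
interface SectionDelayParams {
  autoControl: boolean
  initialDelay: number // milliseconds
  subsequentDelay: number // milliseconds
}

// Defaults copied from the sectionDelayParams table in this README.
const DEFAULT_DELAYS: SectionDelayParams = {
  autoControl: true,
  initialDelay: 3500,
  subsequentDelay: 3000,
}

// Merges caller overrides over the documented defaults and rejects invalid delays.
function resolveDelayParams(
  overrides: Partial<SectionDelayParams> = {},
): SectionDelayParams {
  const merged: SectionDelayParams = { ...DEFAULT_DELAYS, ...overrides }
  for (const key of ['initialDelay', 'subsequentDelay'] as const) {
    if (!Number.isFinite(merged[key]) || merged[key] < 0) {
      throw new RangeError(`${key} must be a non-negative number of milliseconds`)
    }
  }
  return merged
}
```

For instance, `resolveDelayParams({ autoControl: false })` keeps both delay values at their defaults while switching off automatic ending.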

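Instance methods

lat.start()

Starts speech recognition and enters the standby state.

lat.start()

lat.end()

Stops recording audio. Audio that has already been recorded continues to be processed.

lat.end()

lat.on(eventName, callback)

Subscribes a callback to an event.

lat.on('appResponseText', (text) => {
  console.log('Recognizing:', text)
})

Events

| Event | Fired when | Callback argument | Description |
|--------|----------|----------|------|
| appResponseText | During real-time recognition | string | Returns the text fragment currently being recognized |
| appResultText | When the complete recognition result is available | string | Returns the complete recognized text |
| appFinish | When the application finishes | - | All processors have stopped |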
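
The three events can be combined into a single transcript object. The sketch below is illustrative only: `LatLike` is a minimal structural type written for this example (it assumes `on()` returns the instance, as the chained calls elsewhere in this README suggest), and `collectTranscript` is a hypothetical helper. Whether `appResultText` fires once or several times per session is not stated here, so the helper stores every result it receives.

```typescript
// Callback signatures for the three documented events.
interface LatEvents {
  appResponseText: (text: string) => void
  appResultText: (text: string) => void
  appFinish: () => void
}

// Minimal structural type for an instance; an assumption, not the package's own typings.
interface LatLike {
  on<E extends keyof LatEvents>(event: E, callback: LatEvents[E]): LatLike
}

interface Transcript {
  interim: string // latest partial text from appResponseText
  finals: string[] // texts delivered by appResultText, in arrival order
  finished: boolean // set once appFinish has fired
}

// Subscribes to all three events and keeps the returned object up to date.
function collectTranscript(lat: LatLike): Transcript {
  const state: Transcript = { interim: '', finals: [], finished: false }
  lat
    .on('appResponseText', (text) => {
      state.interim = text
    })
    .on('appResultText', (text) => {
      state.finals.push(text)
      state.interim = '' // the partial text is superseded by the result
    })
    .on('appFinish', () => {
      state.finished = true
    })
  return state
}
```

A caller would create the instance, pass it to `collectTranscript`, call `start()`, and then read `finals` once `finished` is true.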

Usage examples

Basic usage

import microsoftLat from '@mingto/microsoft-lat'

// Configure
microsoftLat.config({
  subscriptionKey: 'YOUR_AZURE_SUBSCRIPTION_KEY',
  region: 'southeastasia',
})

const lat = microsoftLat.create(
  { language: 'zh-CN' },
  { autoControl: false }
)

lat
  .on('appResponseText', (text) => {
    console.log('Interim:', text)
  })
  .on('appResultText', (text) => {
    console.log('Final:', text)
  })
  .on('appFinish', () => {
    console.log('Finished')
  })

lat.start()

// Stop after 5 seconds
setTimeout(() => {
  lat.end()
}, 5000)

English recognition

import microsoftLat from '@mingto/microsoft-lat'

microsoftLat.config({
  subscriptionKey: 'YOUR_AZURE_SUBSCRIPTION_KEY',
  region: 'southeastasia',
})

const lat = microsoftLat.create({ language: 'en-US' })

lat.on('appResultText', (text) => {
  console.log('Recognized:', text)
})

lat.start()

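Using it in a React component

import { useEffect, useRef, useState } from 'react'
import microsoftLat from '@mingto/microsoft-lat'

const SpeechRecognition = () => {
  const [text, setText] = useState('')
  const [isListening, setIsListening] = useState(false)
  // useRef keeps the same object across renders; a plain object literal
  // would be recreated on every render and lose the instance.
  const latRef = useRef(null)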

  useEffect(() => {
    microsoftLat.config({
      subscriptionKey: 'YOUR_AZURE_SUBSCRIPTION_KEY',
      region: 'southeastasia',
    })

    const lat = microsoftLat.create({ language: 'zh-CN' })

    lat
      .on('appResponseText', (text) => setText(text))
      .on('appResultText', (text) => setText(text))
      .on('appFinish', () => setIsListening(false))

    latRef.current = lat

    return () => {
      lat.end()
    }
  }, [])

  const handleStart = () => {
    setIsListening(true)
    latRef.current?.start()
  }

  const handleStop = () => {
    latRef.current?.end()
  }

  return (
    <div>
      <div>{text}</div>
      <button onClick={handleStart} disabled={isListening}>
        Start
      </button>
      <button onClick={handleStop} disabled={!isListening}>
        Stop
      </button>
    </div>
  )
}

export default SpeechRecognition

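Supported languages

Chinese

zh-CN - Chinese (Mandarin, Mainland China)
zh-TW - Chinese (Taiwan)
zh-HK - Chinese (Hong Kong)

English

en-US - English (United States)
en-GB - English (United Kingdom)
en-AU - English (Australia)
en-CA - English (Canada)

Other common languages

ja-JP - Japanese
ko-KR - Korean
fr-FR - French
de-DE - German
es-ES - Spanish
it-IT - Italian
pt-BR - Portuguese (Brazil)
ru-RU - Russian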
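
When the language should follow the user's browser settings, a small lookup against the codes listed above can choose the value for `language`. This is a naive sketch under stated assumptions: `pickLanguage` and `SUPPORTED_LANGUAGES` are hypothetical names, the list contains only the codes from this README (Azure supports more), and matching only looks at the exact tag and then the primary language subtag.

```typescript
// Codes listed in this README; not an exhaustive list of what Azure supports.
const SUPPORTED_LANGUAGES = [
  'zh-CN', 'zh-TW', 'zh-HK',
  'en-US', 'en-GB', 'en-AU', 'en-CA',
  'ja-JP', 'ko-KR', 'fr-FR', 'de-DE', 'es-ES', 'it-IT', 'pt-BR', 'ru-RU',
] as const

type LanguageCode = (typeof SUPPORTED_LANGUAGES)[number]

// Returns the first supported code that matches one of the preferred tags.
function pickLanguage(
  preferred: readonly string[],
  fallback: LanguageCode = 'zh-CN', // the documented default of businessParams.language
): LanguageCode {
  for (const tag of preferred) {
    const wanted = tag.toLowerCase()
    // 1. Exact match, ignoring case.
    const exact = SUPPORTED_LANGUAGES.find((code) => code.toLowerCase() === wanted)
    if (exact) return exact
    // 2. Same primary language, e.g. 'en' or 'en-NZ' falls back to 'en-US'.
    const primary = wanted.split('-')[0]
    const sameLanguage = SUPPORTED_LANGUAGES.find((code) =>
      code.toLowerCase().startsWith(primary + '-'),
    )
    if (sameLanguage) return sameLanguage
  }
  return fallback
}
```

In a browser, the result could be passed along as `microsoftLat.create({ language: pickLanguage(navigator.languages) })`.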

Azure regions

Commonly used Azure Speech service regions:

| Region code | Region name |
|----------|----------|
| eastasia | East Asia (Hong Kong) |
| southeastasia | Southeast Asia (Singapore) |
| japaneast | Japan East |
| japanwest | Japan West |
| koreacentral | Korea Central |
| australiaeast | Australia East |
| centralindia | Central India |

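Notes

⚠️ Security: do not hard-code the Azure key in front-end code; obtain a temporary token through a back-end endpoint instead
⚠️ The user must grant microphone permission; the browser shows a permission prompt on first use
⚠️ Use a stable network connection to avoid recognition latency
⚠️ The instance is destroyed automatically when the page is left or the component unmounts
⚠️ Recognition accuracy and level of support vary by language; see the official Azure documentation
⚠️ Automatic stop (autoControl: true) ends recognition based on silence and is suited to short utterances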
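
Since recognition depends on microphone permission, an app may want to trigger the prompt and report a readable error before calling `start()`. The sketch below is an optional pre-check, not something this package requires or exposes. `describeMicError` and `checkMicrophone` are hypothetical helpers, and the browser call is injected so the code does not assume a DOM environment. The error names are the standard `DOMException` names that `navigator.mediaDevices.getUserMedia()` rejects with.

```typescript
// Only the part of the error object that this sketch reads.
interface NamedError {
  name: string
}

// Maps getUserMedia() rejection names to user-facing hints.
function describeMicError(err: NamedError): string {
  switch (err.name) {
    case 'NotAllowedError':
      return 'Microphone access was denied. Allow it in the site settings and retry.'
    case 'NotFoundError':
      return 'No microphone was found on this device.'
    case 'NotReadableError':
      return 'The microphone is in use by another application or cannot be read.'
    default:
      return `Microphone unavailable (${err.name}).`
  }
}

// Minimal view of a MediaStream: enough to release the device after the check.
interface StreamLike {
  getTracks(): { stop(): void }[]
}

// Resolves to null when the microphone is usable, otherwise to a message.
// In a browser, pass (c) => navigator.mediaDevices.getUserMedia(c).
async function checkMicrophone(
  getUserMedia: (constraints: { audio: true }) => Promise<StreamLike>,
): Promise<string | null> {
  try {
    const stream = await getUserMedia({ audio: true })
    stream.getTracks().forEach((track) => track.stop()) // release the device again
    return null
  } catch (err) {
    const name =
      typeof err === 'object' && err !== null && 'name' in err
        ? String((err as NamedError).name)
        : 'UnknownError'
    return describeMicError({ name })
  }
}
```

If `checkMicrophone` resolves to a message, the app can show it instead of starting recognition.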

License

MIT
