@superbased/winrt-speech
v0.1.0
Offline on-device speech recognition for Windows, powered by WinRT SpeechRecognizer. Native addon consumed by the `superbased` npm package.
Offline on-device speech recognition for Windows via Windows.Media.SpeechRecognition. Published as an optional dependency of the superbased npm package; not intended as a standalone consumer surface.
Requirements
- Windows 10 1903+ or Windows 11 (x64 or arm64)
- Node.js 20+ (prebuilts published for the N-API versions actually in use)
- At least one Windows speech data pack installed (Settings → Time & Language → Speech → Add a language)
- "Online speech recognition" toggle enabled in Settings → Privacy & Security → Speech
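The requirements above can be probed at runtime before a transcription is attempted. The following is a minimal sketch, assuming the probe functions listed in the API section behave as their inline comments describe. `checkReadiness`, its `reason` strings, and the reading of an empty locale list as "no speech data pack installed" are hypothetical and not part of the package. The API object is passed in as a parameter so the helper can be exercised with a stub on any platform.

```javascript
// Hypothetical helper, not part of @superbased/winrt-speech.
// `api` is expected to expose isSupported, listSupportedLocales and
// requestPermission with the shapes shown in the API section.
async function checkReadiness(api, locale) {
  if (!api.isSupported()) {
    return { ready: false, reason: 'unsupported-platform' };
  }

  // Assumption: an empty list means no speech data pack is installed.
  const locales = api.listSupportedLocales();
  if (locales.length === 0) {
    return { ready: false, reason: 'no-speech-data-pack' };
  }
  if (locale && !locales.includes(locale)) {
    return { ready: false, reason: 'locale-not-installed' };
  }

  // requestPermission() resolves to 'authorized' | 'denied' | 'notDetermined'.
  const permission = await api.requestPermission();
  if (permission !== 'authorized') {
    return { ready: false, reason: 'permission-' + permission };
  }
  return { ready: true, reason: null };
}
```

A caller would pass the imported module namespace as `api` and skip transcription when `ready` is false.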
Install
npm install @superbased/winrt-speech
Prebuilts are published ahead of time so `npm install` never falls through to `node-gyp rebuild` on consumer boxes.
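Because the package is consumed as an optional dependency, it may be absent at runtime, for example on a non-Windows host. One way a consumer could load it defensively is sketched below. `loadOptional` is a hypothetical helper, not something the package or the `superbased` package is known to export, and treating every load failure as "absent" is a design assumption.

```javascript
// Hypothetical helper: resolve to the module namespace, or null when the
// optional dependency cannot be loaded for any reason (not installed,
// wrong platform, missing native binary).
async function loadOptional(specifier) {
  try {
    return await import(specifier);
  } catch (err) {
    return null;
  }
}
```

A caller might write `const winrt = await loadOptional('@superbased/winrt-speech')` and fall back to another engine when the result is `null`.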
API
import {
  isSupported,
  listSupportedLocales,
  requestPermission,
  transcribeFile,
} from '@superbased/winrt-speech';
isSupported(); // boolean
listSupportedLocales(); // ['en-US', 'en-GB', ...]
await requestPermission(); // 'authorized' | 'denied' | 'notDetermined'
// File-based transcription — v0.1 ships the binding + permission probe;
// full AudioGraph + FileInputNode plumbing lands in a follow-up. Use
// engine=sherpa on Windows until that patch ships.
const { text, confidence, locale } = await transcribeFile({
  audioPath: 'C:\\path\\to\\audio.wav',
  locale: 'en-US',
  contextualStrings: ['SuperBased'],
});
Why this package exists (vs @nodert-win11/*)
The Phase 4d spike tried @nodert-win11/windows.media.speechrecognition and aborted because:
- Prebuilts target Node 16-18; Node 22's `NODE_MODULE_VERSION=127` is newer.
- Falling through to `node-gyp rebuild` requires Windows SDK + MSBuild, which fails on ~95% of uncustomised dev boxes.
- Upstream `binding.gyp` doesn't emit `/AI <WindowsUnionMetadataDir>`, so the C++/WinRT compiler can't locate WinMD metadata.
@superbased/winrt-speech addresses all three: N-API prebuilds published ahead of time, our own binding.gyp with AdditionalUsingDirectories set to $(WindowsSdkDir)UnionMetadata\$(TargetPlatformVersion), release cadence controlled by us.
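The first failure mode is an ABI mismatch: a prebuilt compiled against a specific `NODE_MODULE_VERSION` only loads in a Node release that reports the same value, whereas Node-API addons are ABI-stable across Node versions. The sketch below illustrates that check. `abiMatches` and the `NODE_ABI` table are hypothetical illustrations, with the table listing the `NODE_MODULE_VERSION` values of the Node 16, 18, 20, and 22 release lines.

```javascript
// NODE_MODULE_VERSION reported by each Node.js major release line.
const NODE_ABI = { 16: 93, 18: 108, 20: 115, 22: 127 };

// Hypothetical helper: true when at least one prebuilt, built for the given
// Node majors, matches the ABI of the running (or supplied) runtime.
function abiMatches(prebuiltNodeMajors, runtimeAbi = Number(process.versions.modules)) {
  return prebuiltNodeMajors.some((major) => NODE_ABI[major] === runtimeAbi);
}
```

With prebuilts for Node 16 and 18 only, `abiMatches([16, 18], 127)` is false, which is the situation described above for Node 22.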
Error shape
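The error codes documented in this section map naturally onto caller-side actions. A minimal sketch of such a mapping follows, assuming thrown errors carry `code` and optionally `winrtStatus` as described here. `describeSpeechError` and its `action` labels are hypothetical and not part of the package.

```javascript
// Hypothetical helper: translate a thrown error into a coarse action label.
function describeSpeechError(err) {
  const code = err && err.code;
  switch (code) {
    case 'unauthorized':
      // Run requestPermission() or enable the toggle in Settings.
      return { action: 'request-permission', status: null };
    case 'recognition_failed':
      // winrtStatus may hold the raw SpeechRecognitionResultStatus int.
      return { action: 'report-recognition-failure', status: err.winrtStatus ?? null };
    case 'winrt_error':
      return { action: 'report-winrt-failure', status: err.winrtStatus ?? null };
    default:
      // Not one of the documented codes: let the caller rethrow.
      return { action: 'rethrow', status: null };
  }
}
```

Keeping the mapping in one pure function lets the calling code stay a single `catch` block that switches on `action`.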
try {
  await transcribeFile({ audioPath: '...' });
} catch (err) {
  // err.code is one of:
  //   'unauthorized' - run requestPermission(), or toggle in Settings
  //   'recognition_failed' - Windows.Media.SpeechRecognition rejected the session
  //   'winrt_error' - generic HRESULT failure from the WinRT layer
  // err.winrtStatus may contain the raw SpeechRecognitionResultStatus int.
}
Building from source
npm install
npm run build
Requires Visual Studio 2022 Build Tools (or Visual Studio with "Desktop development with C++" + "Windows 10/11 SDK"). binding.gyp passes /std:c++17, /await, /EHsc, /permissive-, /Zc:__cplusplus, and includes the Windows SDK UnionMetadata path.
License
MIT. Authored by Gaja AI Private Limited.
