@hyssostech/voskspeech-plugin
v0.1.0-alpha.0
Vosk offline speech recognition plugin for Sketch-thru-Plan. Runs entirely in-browser via WebAssembly using a domain-adapted Kaldi model.
STP Plugin: Vosk Offline Speech
The Sketch-thru-Plan (STP) recognizer can consume transcribed speech produced by different recognizers. To promote code reuse and make it easier to swap recognizers, this functionality is packaged as a plugin that conforms to a well-known interface.
This plugin is implemented based on Vosk WebAssembly, providing offline speech recognition that runs entirely in the browser. No cloud service, API keys, or network connection is required — a domain-adapted Kaldi model optimized for military planning vocabulary is included with the plugin and deployed alongside the application.
Prerequisites
This plugin requires a Vosk speech model to be served as static files alongside your application. A domain-adapted model is included at model/vosk-model-la-domain.zip (~28 MB).
Setting up the model
Extract the model into your application's directory:
# PowerShell (Windows)
Expand-Archive -Path plugins/speech/voskspeech-plugin/model/vosk-model-la-domain.zip -DestinationPath your-app/model
# bash / macOS / Linux
unzip plugins/speech/voskspeech-plugin/model/vosk-model-la-domain.zip -d your-app/model
The resulting model/ directory (~50 MB on disk) contains the acoustic model, decoding graph, and configuration files required by the Vosk WASM runtime.
Note: The page must be served over HTTPS for microphone access. The plugin also requires AudioWorklet support (Chrome 66+, Firefox 76+, Safari 14.1+, Edge 79+) and WebAssembly.
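The requirements above can be verified up front before constructing the recognizer. The helper below is a hedged sketch, not part of the plugin: it takes a window-like object explicitly (an assumption made so the logic can run outside a browser) and reports whether the environment meets the stated requirements.

```javascript
// Sketch of a pre-flight check for the requirements listed above.
// `win` is any window-like object (e.g. the browser's `window`);
// passing it in explicitly is an assumption made for testability.
function canRunVoskPlugin(win) {
  return Boolean(
    win.isSecureContext &&                     // HTTPS (or localhost) for microphone access
    typeof win.WebAssembly === 'object' &&     // WebAssembly runtime
    typeof win.AudioWorklet === 'function' &&  // AudioWorklet support
    win.navigator &&
    win.navigator.mediaDevices &&
    typeof win.navigator.mediaDevices.getUserMedia === 'function' // microphone capture
  );
}
```

In a page this would be called as canRunVoskPlugin(window) before creating the recognizer, so an unsupported browser can be reported to the user instead of failing later.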
Accessing the plugin functionality
The plugin requires two separate scripts: the Vosk WASM runtime and the plugin bundle itself. The Vosk runtime (vosk-browser) is not bundled into the plugin — it is loaded separately so the browser can cache the large WASM module independently.
Copy vosk.js from node_modules/vosk-browser/dist/vosk.js into your app directory, then reference both scripts:
<!-- Vosk WASM runtime (exposes global `Vosk`) — must be loaded first -->
<script type="application/javascript" src="vosk.js"></script>
<!-- Vosk speech plugin (exposes global `StpVS`) -->
<script type="application/javascript" src="stpvoskspeech-bundle-min.js"></script>
The vosk-processor.js AudioWorklet file must also be accessible from your application — it is loaded at runtime by the plugin. By default it is expected in the same directory as the page; use the workletPath constructor parameter to specify an alternate location.
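Because the two scripts must load in the order shown, it can help to fail fast when one is missing. The helper below is a hedged sketch (not part of the plugin): it takes the global object explicitly so the check is testable outside a browser, and reports which of the two documented globals (Vosk, StpVS) are absent.

```javascript
// Sketch: verify both scripts loaded before using the plugin.
// `globals` stands in for the browser's global object (e.g. `window`);
// the global names `Vosk` and `StpVS` are the ones documented above.
function missingSpeechGlobals(globals) {
  const missing = [];
  if (typeof globals.Vosk === 'undefined') missing.push('Vosk');   // vosk.js not loaded
  if (typeof globals.StpVS === 'undefined') missing.push('StpVS'); // plugin bundle not loaded
  return missing; // empty array: both scripts are present
}
```

In a page this would be called as missingSpeechGlobals(window); a non-empty result usually means a script tag is missing or the two scripts were loaded out of order.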
Referencing the plugin
The plugin is built as a UMD library, and is therefore compatible with plain script tags (IIFE), AMD, and CommonJS. An ESM bundle (stpvoskspeech-bundle.esm.js) is also included.
When used in vanilla JavaScript, the exported StpVS global provides access to the plugin types:
const speechreco = new StpVS.VoskSpeechRecognizer();
If the model is served from a non-default location, or the AudioWorklet file is in a different directory, pass the paths to the constructor:
// VoskSpeechRecognizer(modelPath?, sampleRate?, workletPath?)
const speechreco = new StpVS.VoskSpeechRecognizer('./my-models/vosk-model-la-domain', 16000, './lib/vosk-processor.js');
Model loading begins eagerly in the background as soon as the recognizer is created.
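When the paths vary between deployments, the constructor call can be wrapped so configuration lives in one place. This is a hedged sketch: the constructor signature is the one documented above, while the default values and the StpVS parameter (passed in rather than read from the global, for testability) are assumptions.

```javascript
// Sketch: centralize recognizer configuration. The defaults are assumptions
// based on the documentation above; adjust them to your deployment layout.
function createVoskRecognizer(StpVS, options = {}) {
  const {
    modelPath = './model/vosk-model-la-domain', // assumed default model location
    sampleRate = 16000,                         // common Vosk model sample rate
    workletPath = './vosk-processor.js',        // same-directory default per the docs
  } = options;
  // Model loading starts eagerly as soon as the recognizer is constructed.
  return new StpVS.VoskSpeechRecognizer(modelPath, sampleRate, workletPath);
}
```

In the browser this would be called as createVoskRecognizer(StpVS, { modelPath: './my-models/vosk-model-la-domain' }), overriding only the values that differ from the defaults.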
Examples
- The basic sample supports selection of Vosk alongside Azure and AWS speech plugins at runtime via a dropdown control
Building the project
The repository includes a pre-built dist folder that can be used directly for testing. If changes are made to the plugin and there is a need to rebuild, run:
npm install
npm run build
Documentation
See the speech plugins overview for details on the ISpeechRecognizer interface, recognition strategies, and event handling.
