@hyssostech/awsspeech-plugin

v0.1.0-alpha.0

Published

2 months ago

Amazon Transcribe Streaming Speech plugin for Sketch-thru-Plan

pbarthelmess

STP Plugin: Amazon Transcribe Streaming Speech

The Sketch-thru-Plan (STP) recognizer can consume speech transcripts produced by different speech recognizers. To promote code reuse and make it easier to swap recognizers, this functionality is packaged as a plugin that conforms to a well-known interface.

This plugin is built on Amazon Transcribe Streaming (real-time speech-to-text). It implements two different strategies:

Prerequisites

You will need AWS credentials (Access Key ID and Secret Access Key) with permission to use Amazon Transcribe Streaming. Amazon Transcribe is a fully managed service, so there is no resource to create or provision. As long as your credentials carry the required permission, you can start using it immediately.

Obtaining AWS credentials

1. Go to the AWS Console → IAM → Users → Create user
2. Attach the AmazonTranscribeFullAccess managed policy (or a custom policy scoped to transcribe:StartStreamTranscription)
3. Under the user's Security credentials tab, click Create access key
4. Copy the Access Key ID and Secret Access Key
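
For the custom-policy option, the following is a minimal sketch of what a policy scoped to transcribe:StartStreamTranscription might look like, built as a plain object and printed as JSON for pasting into the IAM policy editor. The helper name buildTranscribeStreamingPolicy and the wildcard Resource are illustrative assumptions, not something this plugin ships.

```typescript
// Illustrative types for an IAM policy document (not part of the plugin).
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string;
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Hypothetical helper: returns a policy that allows only the streaming
// transcription action named in the steps above.
function buildTranscribeStreamingPolicy(): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: ["transcribe:StartStreamTranscription"],
        // Assumption: no narrower resource scoping is applied here.
        Resource: "*",
      },
    ],
  };
}

// Pretty-printed JSON, ready to paste into the IAM policy editor.
const policyJson = JSON.stringify(buildTranscribeStreamingPolicy(), null, 2);
console.log(policyJson);
```

If the streaming client in your build connects over WebSocket, the policy may also need the transcribe:StartStreamTranscriptionWebSocket action; treat that as something to verify against the IAM action list rather than as a documented requirement of this plugin.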

These can be passed to the plugin constructor directly, or via querystring parameters in the samples (e.g. ?awskey=AKIA...&awssecret=...&awsregion=us-east-1).
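
As a rough illustration of the querystring route, the sketch below reads the three parameters named above (awskey, awssecret, awsregion) with the standard URLSearchParams API. The helper readAwsConfig and the AwsQueryConfig shape are hypothetical; the samples may read their configuration differently.

```typescript
// Hypothetical shape for the values read from the querystring.
interface AwsQueryConfig {
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
}

// Hypothetical helper: parses a querystring such as window.location.search
// and fails loudly when any of the three parameters is missing.
function readAwsConfig(search: string): AwsQueryConfig {
  const params = new URLSearchParams(search);
  const accessKeyId = params.get("awskey");
  const secretAccessKey = params.get("awssecret");
  const region = params.get("awsregion");
  if (!accessKeyId || !secretAccessKey || !region) {
    throw new Error("Expected awskey, awssecret and awsregion in the querystring");
  }
  return { accessKeyId, secretAccessKey, region };
}

// Made-up credentials; the secret is percent-encoded in the URL.
const queryConfig = readAwsConfig(
  "?awskey=AKIAEXAMPLE&awssecret=abc%2Bdef%2Fghi&awsregion=us-east-1"
);
console.log(queryConfig.region);
```

The returned values can then be handed to the plugin constructor in the same order used elsewhere in this README (access key ID, secret access key, region).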

Important (URL encoding): AWS secret keys often contain + and / characters. When passing the secret in a URL querystring, percent-encode these characters (and any =); otherwise the decoded secret differs from the real one and requests fail with signature errors:
+ → %2B
/ → %2F
= → %3D
You can obtain the encoded value by running this in the browser console:
encodeURIComponent("paste-your-secret-key-here")
Alternatively, hardcode the credentials in the config section of the sample's index.js to avoid encoding issues during local testing.
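
To make the encoding issue concrete, here is a small sketch that builds a sample URL with encodeURIComponent and then shows what happens when the secret is left raw. The helper buildSampleUrl, the localhost address, and the credential values are all made up for illustration.

```typescript
// Hypothetical helper: builds a sample-page URL with every value percent-encoded.
function buildSampleUrl(
  baseUrl: string,
  accessKeyId: string,
  secretAccessKey: string,
  region: string
): string {
  const query = [
    `awskey=${encodeURIComponent(accessKeyId)}`,
    `awssecret=${encodeURIComponent(secretAccessKey)}`,
    `awsregion=${encodeURIComponent(region)}`,
  ].join("&");
  return `${baseUrl}?${query}`;
}

// Made-up secret containing the three problematic characters.
const fakeSecret = "ab+cd/ef=";
const sampleUrl = buildSampleUrl(
  "http://localhost:8080/index.html",
  "AKIAEXAMPLE",
  fakeSecret,
  "us-east-1"
);

// Encoded secret survives a round trip through querystring parsing.
const roundTripped = new URLSearchParams(sampleUrl.split("?")[1]).get("awssecret");

// Raw secret does not: querystring parsing turns "+" into a space.
const mangled = new URLSearchParams("awssecret=" + fakeSecret).get("awssecret");

console.log(sampleUrl);
console.log(roundTripped === fakeSecret, mangled === fakeSecret);
```

The raw + is the usual culprit: form-style querystring parsing reads it as a space, so the secret that reaches the signing code no longer matches the one AWS issued.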

For production browser applications, consider using Amazon Cognito Identity Pools to vend temporary credentials instead of embedding long-lived IAM keys in client-side code. The plugin constructor accepts an optional sessionToken parameter for this purpose.
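
A minimal sketch of how temporary credentials might be prepared for the constructor follows. The README only states that sessionToken is an optional constructor parameter; passing it as the fourth positional argument is an assumption here, as are the helper toRecognizerArgs and the TemporaryCredentials shape (modelled on the credential fields used by the AWS SDK for JavaScript v3). Check the generated docs for the actual signature.

```typescript
// Assumed shape of temporary credentials (e.g. from a Cognito Identity Pool).
interface TemporaryCredentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken: string;
  expiration?: Date;
}

// ASSUMPTION: the constructor takes (accessKeyId, secretAccessKey, region, sessionToken).
// Only the first three positions are confirmed by this README.
type RecognizerArgs = [string, string, string, string];

// Hypothetical helper: rejects expired credentials and orders the arguments.
function toRecognizerArgs(
  creds: TemporaryCredentials,
  region: string,
  now: Date = new Date()
): RecognizerArgs {
  if (creds.expiration && creds.expiration.getTime() <= now.getTime()) {
    throw new Error("Temporary credentials have expired; request fresh ones");
  }
  return [creds.accessKeyId, creds.secretAccessKey, region, creds.sessionToken];
}

// Made-up values for illustration only.
const recognizerArgs = toRecognizerArgs(
  {
    accessKeyId: "ASIAEXAMPLE",
    secretAccessKey: "example-secret",
    sessionToken: "example-session-token",
    expiration: new Date("2030-01-01T00:00:00Z"),
  },
  "us-east-1",
  new Date("2029-12-31T00:00:00Z")
);

// With the plugin loaded, and if the assumption above holds:
// const speechReco = new StpAWS.AwsSpeechRecognizer(...recognizerArgs);
console.log(recognizerArgs.length);
```

Because temporary credentials expire, a long-running page would need to obtain fresh ones and create a new recognizer (or otherwise refresh it) before the expiration time.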

Accessing the plugin functionality

You can get the plugin from npm:

npm install --save @hyssostech/awsspeech-plugin

Or you can embed it directly as a script from jsDelivr. As always, it is recommended that a specific version (for example, @0.1.0-alpha.0) be used rather than @latest, so that breaking changes do not affect existing code.

<script src="https://cdn.jsdelivr.net/npm/@hyssostech/awsspeech-plugin@latest/dist/stpawsspeech-bundle-min.js"></script>

Referencing the plugin

The plugin is built as a UMD library, and is therefore compatible with plain vanilla (IIFE), AMD and CommonJS. Also included is an ESM bundle (stpawsspeech-bundle.esm.js).

When used in vanilla JavaScript, the exported StpAWS global can be used to access the SDK types:

const speechReco = new StpAWS.AwsSpeechRecognizer(accessKeyId, secretAccessKey, region);

In TypeScript, import @hyssostech/awsspeech-plugin after installing via npm:

import * as StpAWS from "@hyssostech/awsspeech-plugin";
const speechReco = new StpAWS.AwsSpeechRecognizer(accessKeyId, secretAccessKey, region);

Or import individual types:

import { AwsSpeechRecognizer } from "@hyssostech/awsspeech-plugin";
const speechReco = new AwsSpeechRecognizer(accessKeyId, secretAccessKey, region);

Examples

The basic sample provides a dropdown for selection of this plugin as the speech recognizer provider.

Building the project

The repository includes a pre-built dist folder that can be used directly for testing. If you change the sample and need to rebuild, run:

npm install
npm run build

Note: Unlike the Azure plugin, the AWS Transcribe Streaming client SDK is bundled into the output (there is no separate CDN script to include). This results in a larger bundle but simplifies usage.

Documentation

Additional documentation can be found in the generated docs folder.
