speechmarkdown-js
v2.3.0
Speech Markdown parser and formatters in TypeScript.
Speech Markdown grammar, parser, and formatters for use with JavaScript.
Supported platforms:
- amazon-alexa
- amazon-polly
- amazon-polly-neural
- apple-avspeechsynthesizer
- google-assistant
- ibm-watson
- microsoft-azure
- microsoft-sapi
- w3c
- samsung-bixby
- elevenlabs
Find the architecture here
Platform-specific SSML notes are tracked in docs/platforms. Use npm run docs:update-voices to refresh the auto-generated voice maps in src/formatters/data when vendor credentials are available.
Quick start
SSML - Amazon Alexa
Convert Speech Markdown to SSML for Amazon Alexa
const smd = require('speechmarkdown-js');
const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'amazon-alexa',
};
const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);
The resulting SSML is:
<speak>
Sample <break time="3s"/> speech <break time="250ms"/> markdown
</speak>
SSML - Google Assistant
Convert Speech Markdown to SSML for Google Assistant
const smd = require('speechmarkdown-js');
const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'google-assistant',
};
const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);
The resulting SSML is:
<speak>
Sample <break time="3s"/> speech <break time="250ms"/> markdown
</speak>
SSML - Microsoft Azure
Convert Speech Markdown to SSML for Microsoft Azure with automatic MSTTS namespace injection
const smd = require('speechmarkdown-js');
const markdown = `(This is exciting news!)[excited:"1.5"] The new features are here.`;
const options = {
  platform: 'microsoft-azure',
};
const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);
The resulting SSML is:
<speak xmlns:mstts="https://www.w3.org/2001/mstts">
<mstts:express-as style="excited" styledegree="1.5">This is exciting news!</mstts:express-as> The new features are here.
</speak>
Azure supports 27 express-as styles, including emotional styles (excited, disappointed, friendly, cheerful, sad, angry, etc.) and scenario-specific styles (newscaster, customerservice, chat, etc.). See Azure platform documentation for complete details.
Plain Text
Convert Speech Markdown to Plain Text
const smd = require('speechmarkdown-js');
const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {};
const speech = new smd.SpeechMarkdown();
const text = speech.toText(markdown, options);
The resulting text is:
Sample speech markdown
More
Options
You can pass options into the constructor:
const smd = require('speechmarkdown-js');
const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'amazon-alexa',
};
const speech = new smd.SpeechMarkdown(options);
const ssml = speech.toSSML(markdown);
Or in the methods toSSML and toText:
const smd = require('speechmarkdown-js');
const markdown = `Sample [3s] speech [250ms] markdown`;
const options = {
  platform: 'amazon-alexa',
};
const speech = new smd.SpeechMarkdown();
const ssml = speech.toSSML(markdown, options);
Available options are:
- platform (string) - Determines the formatter used to render SSML. Valid values are:
  - "amazon-alexa"
  - "amazon-polly"
  - "amazon-polly-neural"
  - "apple-avspeechsynthesizer"
  - "google-assistant"
  - "ibm-watson"
  - "microsoft-azure"
  - "microsoft-sapi"
  - "w3c"
  - "samsung-bixby"
  - "elevenlabs"
- includeFormatterComment (boolean) - Adds an XML comment to the SSML output indicating the formatter used. Default is false.
- includeSpeakTag (boolean) - Determines whether the <speak> tag is rendered in the SSML output. Default is true.
- includeParagraphTag (boolean) - Determines whether the <p> tag is rendered in the SSML output. Default is false.
- preserveEmptyLines (boolean) - Keeps empty lines from the markdown in the SSML output. Default is true.
- escapeXmlSymbols (boolean) - Escapes XML special characters in the text. Currently supported only for amazon-alexa and microsoft-azure. Default is false.
- voices (object) - Defines custom voice names that you can then use in your markdown:
{
  "platform": "amazon-alexa",
  "voices": {
    "Scott": { "voice": { "name": "Brian" } },
    "Sarah": { "voice": { "name": "Kendra" } }
  }
}

{
  "platform": "google-assistant",
  "voices": {
    "Brian": { "voice": { "gender": "male", "variant": 1, "language": "en-US" } },
    "Sarah": { "voice": { "gender": "female", "variant": 3, "language": "en-US" } }
  }
}
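The defaults and valid platform names listed above can be collected into a small helper that builds an options object before it is handed to the library. This is only a sketch: `resolveOptions`, `VALID_PLATFORMS`, and `DOCUMENTED_DEFAULTS` are hypothetical names that are not part of speechmarkdown-js, and the library may well apply its own defaults and validation internally.

```javascript
// Hypothetical helper, NOT part of speechmarkdown-js: fills in the defaults
// documented in this README and rejects platform names that are not listed.
const VALID_PLATFORMS = [
  'amazon-alexa', 'amazon-polly', 'amazon-polly-neural',
  'apple-avspeechsynthesizer', 'google-assistant', 'ibm-watson',
  'microsoft-azure', 'microsoft-sapi', 'w3c', 'samsung-bixby', 'elevenlabs',
];

// Default values exactly as stated in the options list.
const DOCUMENTED_DEFAULTS = {
  includeFormatterComment: false,
  includeSpeakTag: true,
  includeParagraphTag: false,
  preserveEmptyLines: true,
  escapeXmlSymbols: false,
};

function resolveOptions(userOptions = {}) {
  const { platform } = userOptions;
  if (platform !== undefined && !VALID_PLATFORMS.includes(platform)) {
    throw new Error(`Unknown platform: ${platform}`);
  }
  // The later spread wins, so explicit user values override the defaults.
  return { ...DOCUMENTED_DEFAULTS, ...userOptions };
}

const options = resolveOptions({ platform: 'amazon-alexa', includeSpeakTag: false });
```

The resulting object can be passed to the SpeechMarkdown constructor or to toSSML in the same way as the hand-written options objects shown earlier.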
Working on this project?
Grammar
The area where we most need help right now is completing the grammar and formatters.
Short Format
- [x] break
- [x] emphasis - strong
- [x] emphasis - moderate
- [x] emphasis - none
- [x] emphasis - reduced
- [x] ipa
- [x] sub
Short-form examples:
- (pecan)/'pi.kæn/ → <phoneme alphabet="ipa" ph="'pi.kæn">pecan</phoneme>
- (Al){aluminum} → <sub alias="aluminum">Al</sub>
- /ˈdeɪtə/ → <phoneme alphabet="ipa" ph="ˈdeɪtə">ipa</phoneme>
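To make the mapping in the first two short-form examples concrete, here is a toy sketch that reproduces them with two regular expressions. It is not how the library works (speechmarkdown-js uses a real grammar and parser), and `shortFormToSsml` is a hypothetical name introduced only for illustration; the sketch ignores nesting, escaping, and every other part of the syntax.

```javascript
// Toy illustration only, NOT the library's parser: two regular expressions
// that mimic the "sub" and "text plus IPA" short forms shown above.
function shortFormToSsml(markdown) {
  return markdown
    // (text){alias} becomes <sub alias="alias">text</sub>
    .replace(/\(([^)]+)\)\{([^}]+)\}/g, '<sub alias="$2">$1</sub>')
    // (text)/ipa/ becomes <phoneme alphabet="ipa" ph="ipa">text</phoneme>
    .replace(/\(([^)]+)\)\/([^/]+)\//g, '<phoneme alphabet="ipa" ph="$2">$1</phoneme>');
}

const subExample = shortFormToSsml('(Al){aluminum}');
const ipaExample = shortFormToSsml("(pecan)/'pi.kæn/");
```

For real input, use toSSML from the library, which also handles the remaining short and standard formats and the platform-specific differences.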
Standard Format
- [x] address
- [x] audio
- [x] break (time)
- [x] break (strength)
- [x] characters / chars
- [x] date
- [x] defaults (section)
- [x] disappointed
- [x] disappointed (section)
- [x] dj (section)
- [x] emphasis
- [x] excited
- [x] excited (section)
- [x] expletive / bleep
- [x] fraction
- [x] interjection
- [x] ipa
- [x] lang
- [x] lang (section)
- [x] mark
- [x] newscaster (section)
- [x] number
- [x] ordinal
- [x] telephone / phone
- [x] pitch
- [x] rate
- [x] sub
- [x] time
- [x] unit
- [x] voice
- [x] voice (section)
- [x] volume / vol
- [x] whisper
Available scripts
- clean - remove coverage data, Jest cache and transpiled files
- build - perform all build tasks
- build:ts - transpile TypeScript to ES5
- build:browser - create a single file ./dist.browser/speechmarkdown.js for use in the browser
- build:minify - create a single file ./dist.browser/speechmarkdown.min.js for use in the browser
- watch - interactive watch mode to automatically transpile source files
- lint - lint source files and tests
- test - run tests
- test:watch - interactive watch mode to automatically re-run tests
License
Licensed under the MIT License. See the LICENSE file for details.
