simple-speech

v0.3.6

Sane API for web speech

Install

npm install --save simple-speech

Usage


Important: the underlying browser APIs can only be started from a user action

For security reasons, most browsers only allow these APIs to run after a user gesture. Nothing will work if you call library functions at the top level of a script; the call needs to happen inside an event handler:

// you need a user action in order for the browser to allow listening
const startButton = document.querySelector('#start')

Robot speaks 'Hello World':

import { synthesis } from 'simple-speech'

startButton.addEventListener('click', () =>
  synthesis.speak('Hello World')
)

Log what is said into your mic:

import { recognition } from 'simple-speech'

startButton.addEventListener('click', () =>
  recognition
    .listen()
    .then(msg => console.log(`You said: ${msg}`))
)

Repeat what you say in a robot voice:

import { recognition, synthesis } from 'simple-speech'

const listenAndRepeat = () =>
  recognition
    .listen()
    .then(synthesis.speak)
    .then(listenAndRepeat)

startButton.addEventListener('click', listenAndRepeat)
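
The loop above keeps running until the page is closed and has no error handling. As a rough sketch (not part of simple-speech), the same idea can be wrapped with a stop switch. The names createRepeater, listen, speak and onError are made up for this example; listen and speak stand in for recognition.listen and synthesis.speak, and both are assumed to return promises, as the chain above suggests.

```javascript
// Sketch: a stoppable variant of the listen-and-repeat loop.
// `listen` and `speak` are stand-ins for recognition.listen and
// synthesis.speak (assumed to return promises).
function createRepeater(listen, speak, onError = console.error) {
  let running = false

  const loop = async () => {
    while (running) {
      try {
        const msg = await listen()
        if (!running) break // stop() was called while listening
        await speak(msg)
      } catch (err) {
        running = false // leave the loop instead of retrying forever
        onError(err)
      }
    }
  }

  return {
    start() {
      if (running) return Promise.resolve() // ignore repeated clicks
      running = true
      return loop()
    },
    stop() {
      running = false
    },
  }
}
```

In a page, this could be created with createRepeater(() => recognition.listen(), msg => synthesis.speak(msg)), calling start() from one click handler and stop() from another.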

Modules

Synthesis

import { synthesis } from 'simple-speech'

const onClick = () => synthesis.speak('Hello World')

// Need to start from user action
$button.addEventListener('click', onClick)

Selecting voices with some kind of IDE support was an early motivation for this package. Currently, it provides IntelliSense support for narrowing down the voice choice, e.g.:

[Screen recording: ./voices.gif]

On the first use call, I narrow the voice choices to only the ones that match lang: 'fr-FR', so on the next use call, the type system has the info to narrow down the name options to those available for that language.

You can also set both fields in one go, but IntelliSense won't be able to narrow down the name options:

const jacques = synthesis.use({
  lang: 'fr-FR',
  name: 'Jacques'
})
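
The narrowing described above happens in the type system. A rough runtime counterpart is to filter the browser's own voice list by language. The sketch below is not part of simple-speech: voiceNamesFor is a hypothetical helper, and the sample voices are illustrative only, since real names depend on the browser and OS.

```javascript
// Sketch: list the voice names available for one language.
// `voices` is expected to look like the array returned by the browser's
// speechSynthesis.getVoices(): objects with `name` and `lang` fields.
function voiceNamesFor(voices, lang) {
  return voices
    .filter(voice => voice.lang === lang)
    .map(voice => voice.name)
}

// Illustrative sample data only; real names depend on the browser and OS.
const sampleVoices = [
  { name: 'Jacques', lang: 'fr-FR' },
  { name: 'Example French Voice', lang: 'fr-FR' },
  { name: 'Example English Voice', lang: 'en-US' },
]

console.log(voiceNamesFor(sampleVoices, 'fr-FR'))
// → [ 'Jacques', 'Example French Voice' ]
```

In a browser, the input would be speechSynthesis.getVoices(). Some browsers return an empty list until the voiceschanged event has fired, so the call may need to wait for that event.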

Other options available:

synthesis.use({
  volume: 0.8, // 0 to 1
  rate: 2, // 0.1 to 10
  pitch: 1.5, // 0 to 2

  // optionally, you can preload the text to be spoken
  // when calling `.speak()` with no arguments
  text: 'Hello, world!',
})
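
The ranges in the comments above match the limits of the browser's speech synthesis settings. If the values come from user input (sliders, for instance), it may be worth clamping them before passing them to use. This is a minimal sketch; clampSpeechOptions and RANGES are hypothetical names, and whether simple-speech validates these values itself is not something this README states.

```javascript
// Allowed ranges, taken from the option comments above.
const RANGES = {
  volume: [0, 1],
  rate: [0.1, 10],
  pitch: [0, 2],
}

// Returns a copy of `options` with numeric volume/rate/pitch forced
// into range. Other fields (lang, name, text, ...) pass through untouched.
function clampSpeechOptions(options) {
  const result = { ...options }
  for (const [key, [min, max]] of Object.entries(RANGES)) {
    if (typeof result[key] === 'number') {
      result[key] = Math.min(max, Math.max(min, result[key]))
    }
  }
  return result
}

console.log(clampSpeechOptions({ volume: 1.4, rate: 0, pitch: 1.5, text: 'Hello, world!' }))
// → { volume: 1, rate: 0.1, pitch: 1.5, text: 'Hello, world!' }
```

The result could then be passed along as synthesis.use(clampSpeechOptions(userOptions)).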

Recognition

import { recognition } from 'simple-speech'

const onClick = () => recognition.listen().then(console.log)

// Need to start from user action
$button.addEventListener('click', onClick)
// Say 'Hello World' after clicking the button and the console will log it

It also has an observable API. It emits more intuitive events than the underlying browser API:

const sub = recognition.use({ interimResults: true, maxAlternatives: 3 }).subscribe({ next: console.log })
// { tag: 'start', ... }
// { tag: 'audiostart', ... }
// { tag: 'soundstart', ... }
// { tag: 'speechstart', ... }
// { tag: 'interim', alternatives: [{ transcript: 'hello', confidence: 0.8999999761581421 }] }
// { tag: 'interim', alternatives: [{ transcript: 'world', confidence: 0.8763247828138491 }] }
// { tag: 'final', alternatives: [
//     { transcript: 'hello world', confidence: 0.8698675632476807 },
//     { transcript: 'hello wards', confidence: 0 },
//     { transcript: 'hello Ward', confidence: 0 }
// ] }

sub.unsubscribe() // Stops listening if not already finished, and cleans up
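
Since every event carries a tag, an observer can switch on it. The sketch below assumes the event shape shown in the log above (a tag plus an alternatives array on 'interim' and 'final' events); bestTranscript and createTranscriptObserver are hypothetical helpers, not library exports.

```javascript
// Pick the highest-confidence alternative from an 'interim' or 'final' event.
// Returns null for every other tag ('start', 'audiostart', ...).
function bestTranscript(event) {
  if (event.tag !== 'interim' && event.tag !== 'final') return null
  const alternatives = event.alternatives ?? []
  if (alternatives.length === 0) return null
  return alternatives.reduce((best, alt) =>
    alt.confidence > best.confidence ? alt : best
  ).transcript
}

// Observer that keeps the live (interim) text separate from the settled text.
function createTranscriptObserver() {
  const state = { interim: '', final: '' }
  return {
    state,
    next(event) {
      const text = bestTranscript(event)
      if (text === null) return
      if (event.tag === 'interim') {
        state.interim = text
      } else {
        state.final = text
        state.interim = '' // the interim text has been superseded
      }
    },
  }
}
```

The observer could be passed to subscribe in place of { next: console.log }, reading observer.state whenever the UI needs to render.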

It's compatible with rxjs and similar libs:

import * as rx from 'rxjs'
import { recognition } from 'simple-speech'

const result$ = rx.from(recognition)
  .pipe(
    rx.filter(e => e.tag === 'interim' || e.tag === 'final'),
    rx.tap(e =>
      e.tag === 'interim' ? setInterim(e.value) :
      e.tag === 'final' ? setFinal(e.value) :
      null
    )
  )

$button.addEventListener('click', () => result$.subscribe())

Options available:

recognition.use({
  // language to recognize, check intellisense
  // autocomplete for all options
  lang: 'en-US',

  // whether to emit interim results as well (only makes sense when using the observable API)
  interimResults: true,

  // how many alternatives to present on the recognition
  maxAlternatives: 3,

  // uses the underlying 'continuous mode', meaning it will keep
  // emitting transcriptions instead of ending on the first 'final' result
  // (see below)
  continuous: false,
})

Continuous transcription recipe

The underlying API has a 'continuous mode', but it doesn't seem to work that well for me. Instead, with the observable API you can use this:

import { recognition } from 'simple-speech'
import * as rx from 'rxjs'

const transcription$ = rx.from(
  recognition.use({
    interimResults: true,
    maxAlternatives: 1,
    continuous: false,
  })
).pipe(rx.repeat())
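
The recipe relies on repeat() resubscribing each time a recognition session completes, so one short session is chained after another. To show the idea without rxjs, here is a minimal stand-in. repeatForever is a hypothetical helper, and the source is assumed to be any object whose subscribe(observer) returns { unsubscribe() }.

```javascript
// Sketch of the idea behind repeat(): resubscribe whenever the source completes.
function repeatForever(source) {
  return {
    subscribe(observer) {
      let stopped = false
      let current = null

      const run = () => {
        if (stopped) return
        current = source.subscribe({
          next: value => observer.next?.(value),
          error: err => observer.error?.(err),
          // A finished session schedules the next one instead of ending the stream
          complete: () => queueMicrotask(run),
        })
      }
      run()

      return {
        unsubscribe() {
          stopped = true // prevents any further resubscription
          current?.unsubscribe()
        },
      }
    },
  }
}
```

As with the other examples, the recipe's transcription$ still has to be subscribed from a user action, and unsubscribing is what ends the otherwise endless transcription.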