five-min-stt

v1.0.1

5 min STT

A module that lets STT (speech-to-text) services work with a turnaround time of about five minutes, by splitting the media into five-minute segments before sending them for transcription. Refactored from autoEdit2 for use in autoEdit3.

Setup

git clone git@github.com:pietrop/five-min-stt.git
cd five-min-stt
npm install

Usage

Usage in development

See the example usage in src/.

Usage in production

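npm install five-min-stt

// Assumption: the AssemblyAI wrapper this module was developed with is the default export of its package.
const assemblyai = require('@pietrop/assemblyai-node-sdk');
const fiveMinStt = require('five-min-stt');

// Assumptions: the environment variable name and the binary paths are placeholders, adjust them for your setup.
const ApiKey = process.env.ASSEMBLYAI_API_KEY;
const ffmpegBinPath = '/usr/local/bin/ffmpeg';
const ffprobeBinPath = '/usr/local/bin/ffprobe';

const url = 'https://download.ted.com/talks/KateDarling_2018S-950k.mp4';

// Any async function that takes a file path and resolves to a transcript with a `words` list will do.
const sttTranscribeFunction = async (filePath) => {
  return await assemblyai({ ApiKey, filePath });
};

fiveMinStt({ file: url, ffmpegBinPath, ffprobeBinPath, sttTranscribeFunction }).then((resp) => {
  console.log('example usage, fiveMinStt::', JSON.stringify(resp, null, 2));
});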

Optionally, you can specify audioFileOutput to choose where the converted audio file is written:

const audioFileOutput = './KateDarling_2018S-950k.wav';

fiveMinStt({ file: url, audioFileOutput, ffmpegBinPath, ffprobeBinPath, sttTranscribeFunction }).then((resp) => {
  console.log('example usage, fiveMinStt::', JSON.stringify(resp, null, 2));
});

Note that audioFileOutput is optional:

  • If it is not provided, the module creates an audio file in a tmp dir on the system and then deletes it when done.
  • If a name/path for the audio destination is provided, it is the developer's responsibility to decide whether to keep or delete the audio file.
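The two cases above can be sketched as follows. This is a minimal illustration of the described behaviour, not the module's actual code: resolveAudioOutput, cleanUp, the 'five-min-stt-' directory prefix and the audio.wav file name are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: decide where the audio file goes and who cleans it up.
function resolveAudioOutput(audioFileOutput) {
  if (audioFileOutput) {
    // Caller chose the destination, so the caller decides whether to keep it.
    return { audioPath: audioFileOutput, deleteWhenDone: false };
  }
  // No destination given: create a fresh directory under the system tmp dir.
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'five-min-stt-'));
  return { audioPath: path.join(tmpDir, 'audio.wav'), deleteWhenDone: true };
}

// Hypothetical helper: remove the tmp audio file and its directory, if we own them.
function cleanUp({ audioPath, deleteWhenDone }) {
  if (!deleteWhenDone) return; // caller-provided path: leave the file alone
  if (fs.existsSync(audioPath)) fs.unlinkSync(audioPath);
  fs.rmdirSync(path.dirname(audioPath));
}
```

Calling resolveAudioOutput('./talk.wav') leaves cleanup to the caller, while resolveAudioOutput() returns a tmp path that cleanUp removes afterwards.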

Note that if you are using AssemblyAI STT on a free-tier account, there is a limit of one concurrent transcription at a time, after which requests get throttled. Pay-as-you-go accounts have a limit of 32, and exceeding it also results in throttling. A 1-hour file gives 60 min / 5 min = 12 concurrent transcriptions, which fits within that limit. See the table below for more examples.

| hours | minutes | chunk length (min) | concurrent segments |
| ----- | ------- | ------------------ | ------------------- |
| 1     | 60      | 5                  | 12                  |
| 2     | 120     | 5                  | 24                  |
| 3     | 180     | 5                  | 36                  |

A 3-hour file would go over the limit of 32 concurrent transcriptions, and the 4 segments exceeding it would be throttled.
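The arithmetic behind these numbers can be checked with a few lines of JavaScript. This is only a sketch: estimateConcurrency is a hypothetical helper, not part of five-min-stt, and it assumes that a trailing partial segment still counts as one request.

```javascript
// Segment length used by the module, per this README.
const SEGMENT_MINUTES = 5;

// Hypothetical helper: how many segments a file produces, and how many
// of them go over a given concurrency limit.
function estimateConcurrency(durationMinutes, concurrencyLimit) {
  // Assumption: a trailing partial segment still counts as one request.
  const segments = Math.ceil(durationMinutes / SEGMENT_MINUTES);
  const overLimit = Math.max(0, segments - concurrencyLimit);
  return { segments, overLimit };
}

console.log(estimateConcurrency(60, 32)); // { segments: 12, overLimit: 0 }
console.log(estimateConcurrency(180, 32)); // { segments: 36, overLimit: 4 }
```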

System Architecture

  1. Convert the input to an audio file.
  2. Split the audio file into 5-minute segments, if it is longer than 5 minutes.
  3. Send the segments to the STT service.
  4. Re-adjust the results by adding offsets to the word timings, and combine them into one list.
  5. Delete the tmp audio segments.
  6. Return the resulting transcript.
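Step 2 above amounts to working out where each segment starts and how long it lasts. A minimal sketch, assuming durations measured in seconds; planSegments is a hypothetical name, and the module itself may compute this differently (for example by letting ffmpeg do the splitting).

```javascript
// Hypothetical helper: plan 5-minute segments for a file of a given length.
function planSegments(totalSeconds, segmentSeconds = 300) {
  const segments = [];
  for (let start = 0; start < totalSeconds; start += segmentSeconds) {
    segments.push({
      index: segments.length,
      startSeconds: start,
      // The last segment may be shorter than a full 5 minutes.
      durationSeconds: Math.min(segmentSeconds, totalSeconds - start),
    });
  }
  return segments;
}

// A 12-minute file yields two full segments and one 2-minute segment.
console.log(planSegments(720));
```

The startSeconds values are also the offsets needed later in step 4, once converted to whatever unit the STT service uses for word timings.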

Initially developed to work with @pietrop/assemblyai-node-sdk, but it tries not to be opinionated about which STT service you use. It does, however, assume that the result from sttTranscribeFunction has a words attribute containing a list of word objects, each with start and end timecodes and a text attribute.

{
    "words": [
        {
            "end": 440,
            "start": 0,
            "text": "You",
            ...
        },
        ...
    ]

}

Note that the script does not modify the unit of the timings for start and end: if they are in seconds or milliseconds, they stay that way.
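As an illustration of how per-segment results in this shape can be re-adjusted and combined, here is a minimal sketch. combineSegments is a hypothetical function, not the module's implementation, and it assumes the offsets are supplied in the same unit as the word timings (milliseconds in this example).

```javascript
// Hypothetical helper: shift each segment's word timings by that segment's
// offset and flatten everything into one list of words.
function combineSegments(results, offsets) {
  return results.flatMap((result, i) =>
    result.words.map((word) => ({
      ...word, // keep text and any other attributes untouched
      start: word.start + offsets[i],
      end: word.end + offsets[i],
    }))
  );
}

const segmentResults = [
  { words: [{ start: 0, end: 440, text: 'You' }] },
  { words: [{ start: 120, end: 560, text: 'know' }] },
];

// Offsets in milliseconds to match the timings above: 0 and 5 minutes.
const combinedWords = combineSegments(segmentResults, [0, 300000]);
console.log(combinedWords);
```

Because the unit is never converted, a service that reports seconds would need offsets of 0, 300, 600 and so on instead.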

Development env

  • npm > 6.1.0
  • Node 12

The Node version is set in the .nvmrc file for Node Version Manager (nvm):

nvm use

Linting

This repo uses Prettier for linting. If you are using Visual Studio Code, you can add the Prettier - Code formatter extension and configure the editor to do things like format on save.

You can also run the linting via npm scripts:

npm run lint

There is also a pre-commit hook that runs it.

Build

NA

Tests

NA

Deployment

To publish to npm:

npm run publish:public

To do a dry run:

npm run publish:dry:run