@pietrop/serialize-stt-words

v1.0.0

Published

4 years ago

A module to serialize and deserialize words from STT in dpe format into arrays of each attribute.


pietrop

`serialize-stt-words`

As a rough heuristic, take mock8hours.json, an 8-hour transcript that is 9.6MB as a single file.

This is the breakdown of file size for each attribute saved separately.

 58K paragraphEndTimes.json
 59K paragraphStartTimes.json
 93K speakersLit.json
637K textList.json
637K wordEndTimes.json
653K wordStartTimes.json

Each file is well within the 1MB Firebase document limit.
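A quick way to sanity-check an array against that limit is to measure its JSON byte length. This is only a sketch: the JSON size is a rough proxy for what the database actually stores, and `FIRESTORE_LIMIT_BYTES`, `jsonByteLength` and `fitsInOneDocument` are hypothetical names that are not part of this package.

```javascript
// Hypothetical helpers, not part of @pietrop/serialize-stt-words.
// Assumption: the limit is 1 MiB and JSON byte length approximates stored size.
const FIRESTORE_LIMIT_BYTES = 1048576;

// Size in bytes of a value once serialized to UTF-8 encoded JSON.
function jsonByteLength(value) {
  return Buffer.byteLength(JSON.stringify(value), 'utf8');
}

// True when the serialized value stays under the limit.
function fitsInOneDocument(value, limit = FIRESTORE_LIMIT_BYTES) {
  return jsonByteLength(value) < limit;
}

const sampleStartTimes = [0, 0.9, 1.13];
console.log(jsonByteLength(sampleStartTimes)); // 12
console.log(fitsInOneDocument(sampleStartTimes)); // true
```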

Setup

git clone git@github.com:pietrop/serialize-stt-words.git

cd serialize-stt-words

npm install

Usage

The input is a transcript in dpe format, with a list of words and a list of paragraphs:

{
    "words": [
        {
            "text": "Hello",
            "start": 0,
            "end": 0.88
        },
        ...
    ],
    "paragraphs": [
        {
            "speaker": "SPEAKER_B",
            "start": 0,
            "end": 1.24
        },
        ...
    ]
}

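Install the package, then call serializeTranscript on the transcript. It returns one array per attribute, as shown after the call: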

npm install @pietrop/serialize-stt-words

const { serializeTranscript } = require('@pietrop/serialize-stt-words');
const { wordStartTimes, wordEndTimes, textList, paragraphStartTimes, paragraphEndTimes, speakersLit } = serializeTranscript(transcript);

{
    "wordStartTimes": [
        0,
        0.9,
        1.13,
        ...
    ],
    "wordEndTimes": [
        0.88,
        1.12,
        ...
    ],
    "textList": [
        "Media",
        "will",
        ...
    ],
    "paragraphStartTimes": [
        0,
        1.25,
        ...
    ],
    "paragraphEndTimes": [
        1.24,
        4,
        ...
    ],
    "speakersLit": [
        "SPEAKER_B",
        "SPEAKER_A",
        ...
    ]
}
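To make the mapping from input to output concrete, here is a minimal sketch of how such a serialization could be written. `serializeSketch` is a hypothetical re-implementation for illustration only, not the package's source; it assumes every word has `text`, `start` and `end`, and every paragraph has `speaker`, `start` and `end`.

```javascript
// Hypothetical illustration, not the code of @pietrop/serialize-stt-words.
// Splits a dpe-style transcript into one array per attribute.
function serializeSketch(transcript) {
  const { words, paragraphs } = transcript;
  return {
    wordStartTimes: words.map((word) => word.start),
    wordEndTimes: words.map((word) => word.end),
    textList: words.map((word) => word.text),
    paragraphStartTimes: paragraphs.map((paragraph) => paragraph.start),
    paragraphEndTimes: paragraphs.map((paragraph) => paragraph.end),
    speakersLit: paragraphs.map((paragraph) => paragraph.speaker),
  };
}

const transcript = {
  words: [{ text: 'Hello', start: 0, end: 0.88 }],
  paragraphs: [{ speaker: 'SPEAKER_B', start: 0, end: 1.24 }],
};
const serialized = serializeSketch(transcript);
console.log(serialized.textList); // [ 'Hello' ]
console.log(serialized.speakersLit); // [ 'SPEAKER_B' ]
```

Index `i` in each word array refers to the same word, and likewise for the paragraph arrays, which is what makes recombining possible later.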

The idea is that you can save each array separately in a database and recombine them later.
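As a sketch of that idea, the example below stores each array under its own key and reads the arrays back. The `db` object is an in-memory Map standing in for a real database client, and `saveAttributes` and `loadAttributes` are hypothetical helpers, not part of this package.

```javascript
// In-memory stand-in for a database; a real client would replace this Map.
const db = new Map();

// Store each attribute array as its own record, keyed by transcript and name.
function saveAttributes(transcriptId, attributes) {
  for (const [name, values] of Object.entries(attributes)) {
    db.set(`${transcriptId}/${name}`, JSON.stringify(values));
  }
}

// Read the named attribute arrays back into a single object.
function loadAttributes(transcriptId, names) {
  const result = {};
  for (const name of names) {
    result[name] = JSON.parse(db.get(`${transcriptId}/${name}`));
  }
  return result;
}

saveAttributes('demo', { textList: ['Hello'], wordStartTimes: [0], wordEndTimes: [0.88] });
const loaded = loadAttributes('demo', ['textList', 'wordStartTimes', 'wordEndTimes']);
console.log(loaded.textList); // [ 'Hello' ]
```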

const { deserializeTranscript } = require('@pietrop/serialize-stt-words');
const desRes = deserializeTranscript({ wordStartTimes, wordEndTimes, textList, paragraphStartTimes, paragraphEndTimes, speakersLit });
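The reverse step can be sketched the same way. `deserializeSketch` below is a hypothetical illustration, not the package's implementation; it assumes the three word arrays are index-aligned with each other, and likewise the three paragraph arrays.

```javascript
// Hypothetical illustration, not the code of @pietrop/serialize-stt-words.
// Rebuilds words and paragraphs by zipping the per-attribute arrays by index.
function deserializeSketch({
  wordStartTimes,
  wordEndTimes,
  textList,
  paragraphStartTimes,
  paragraphEndTimes,
  speakersLit,
}) {
  const words = textList.map((text, i) => ({
    text,
    start: wordStartTimes[i],
    end: wordEndTimes[i],
  }));
  const paragraphs = speakersLit.map((speaker, i) => ({
    speaker,
    start: paragraphStartTimes[i],
    end: paragraphEndTimes[i],
  }));
  return { words, paragraphs };
}

const restored = deserializeSketch({
  wordStartTimes: [0],
  wordEndTimes: [0.88],
  textList: ['Hello'],
  paragraphStartTimes: [0],
  paragraphEndTimes: [1.24],
  speakersLit: ['SPEAKER_B'],
});
console.log(restored.words[0]); // { text: 'Hello', start: 0, end: 0.88 }
```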

Documentation

There's a docs folder in this repository.

docs/notes contains draft development notes on various aspects of the project. These would generally be converted into either ADRs or guides when ready.

Development env

npm > 6.1.0
Node 12

The Node version is set in the node version manager file, .nvmrc.

nvm use

Tests

npm test

Deployment

npm run publish:public
