mtl-voxy

v1.2.4

Published

8 months ago

Voxy SDK

meditechlabs

MediTechLabs Voxy

Voxy SDK Integration

Overview

Voxy SDK is a real-time speech-to-text transcription SDK built with Socket.io and plain JavaScript. It captures audio from your microphone, transcribes speech in real time, and exports the transcript as a text file. It also reports recording status, mode, microphone connection, and template data so that your UI can reflect the current state.

Features

Real-Time Transcription: Converts spoken words to text instantly.
Dynamic Recording Mode: Detects and updates the recording status in real time.
Data Export: Easily export both transcription and report data as .txt files.
Status Monitoring: Displays current mode, template, and microphone status in the UI.

Prerequisites

A working microphone
A valid API_URL for WebSocket-based speech recognition
A valid email and password for authentication
A unique user identifier (userID)

Installation

npm install mtl-voxy

Explanation

1. Import Required Module:

import useVoxy from 'mtl-voxy';

2. Initializing the SDK Connection:

const voxyInstance = await useVoxy({
  apiUrl: '<API_URL>',
  email: '<EMAIL>',
  password: '<PASSWORD>',
  userID: '123',
  sampleRate: 16000
});

sampleRate: Sets the number of audio samples captured per second (in Hz). It is optional and defaults to 16000.
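To make the effect of sampleRate concrete, the sketch below estimates how much raw audio one second of capture produces. It assumes uncompressed mono PCM with 16-bit samples, which this README does not confirm for the SDK; estimateBytesPerSecond is a hypothetical helper and not part of mtl-voxy.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// Assumption: uncompressed PCM, 16-bit (2-byte) samples, one channel.
function estimateBytesPerSecond(sampleRate, bytesPerSample = 2, channels = 1) {
  if (!Number.isInteger(sampleRate) || sampleRate <= 0) {
    throw new RangeError('sampleRate must be a positive integer');
  }
  return sampleRate * bytesPerSample * channels;
}

console.log(estimateBytesPerSecond(16000)); // 32000
```

Under these assumptions, raising the sample rate increases the data volume proportionally.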

3. Toggling the Recording State:

    recordBtnElement.addEventListener('click', async () => {
        await voxyInstance.toggleRecording();
    });

This code attaches a click event listener to recordBtnElement (typically a button in your HTML), so the callback runs each time the user clicks it.
The toggleRecording() method switches the recording state: if recording is active, it stops; if it is inactive, it starts. The example awaits the call, so any asynchronous work tied to the state change finishes before the handler returns.
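Because toggleRecording() is awaited, a fast double click could start a second toggle before the first one has finished. One way to avoid that is a small guard around the call. This is a minimal sketch: createToggleGuard is a hypothetical helper, not part of mtl-voxy, and it assumes only that toggleRecording() returns a promise.

```javascript
// Hypothetical wrapper, not part of mtl-voxy: ignores calls that arrive
// while a previous toggleRecording() call is still pending.
function createToggleGuard(instance) {
  let pending = false;
  return async function guardedToggle() {
    if (pending) return false; // a toggle is already in flight, skip this one
    pending = true;
    try {
      await instance.toggleRecording();
      return true; // the toggle ran to completion
    } finally {
      pending = false; // always release the guard, even if the toggle throws
    }
  };
}

// Usage in the browser, with voxyInstance and recordBtnElement as above:
// const guardedToggle = createToggleGuard(voxyInstance);
// recordBtnElement.addEventListener('click', guardedToggle);
```

The guard resolves to true when the toggle ran and false when the click was skipped, which can help when debugging button behavior.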

4. Microphone Status:

    voxyInstance.getMicrophoneStatus((microphone) => {
        document.getElementById("microphone").textContent = microphone;
    });

The getMicrophoneStatus() method registers a callback that receives updates about the microphone's state. The example writes each update into the element with the id "microphone".

5. Recording Status:

    voxyInstance.getRecordingStatus((isRecording) => {
        document.getElementById("record-btn").textContent = isRecording ? "Stop Recording" : "Start Recording";
    });

The getRecordingStatus() method sets up a callback that receives a boolean (isRecording) indicating whether recording is active.

6. Mode Status:

    voxyInstance.getModeStatus((mode) => {
        document.getElementById("mode").textContent = mode;
    });

The getModeStatus() method registers a callback that receives the current mode as a parameter.

7. Template Information:

    voxyInstance.getTemplate((template) => {
        document.getElementById("template").textContent = template;
    });

The getTemplate() method registers a callback that receives template-related data.
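Steps 4 to 7 follow the same pattern: each method takes a callback and passes it the latest value. If you prefer to keep these values in one place instead of writing to the DOM in four separate callbacks, the sketch below mirrors them into a single state object. trackStatus and the field names of the state object are hypothetical and not part of mtl-voxy; only the four method names come from the steps above.

```javascript
// Hypothetical helper, not part of mtl-voxy: subscribes to the four status
// methods and mirrors their latest values into one object.
function trackStatus(instance, onChange = () => {}) {
  const state = { microphone: null, isRecording: false, mode: null, template: null };

  // Builds a callback that stores the value under `key` and notifies the listener.
  const update = (key) => (value) => {
    state[key] = value;
    onChange({ ...state }); // pass a copy so listeners cannot mutate the state
  };

  instance.getMicrophoneStatus(update('microphone'));
  instance.getRecordingStatus(update('isRecording'));
  instance.getModeStatus(update('mode'));
  instance.getTemplate(update('template'));

  return state;
}

// Usage in the browser, with voxyInstance as above:
// trackStatus(voxyInstance, (state) => {
//   document.getElementById('mode').textContent = state.mode;
// });
```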

8. Real-Time Transcription:

    voxyInstance.getTranscription((text) => {
        document.getElementById("transcript").value += text;
    });

The getTranscription() method registers a callback that is called each time new transcription data is received. The example appends each piece of text to the transcript element.
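If you want to keep the transcript outside the DOM, for example to export it later, the incoming text can be collected in a small buffer. This is a sketch under the assumption that each callback delivers a new text fragment to be appended, as the `+=` in the example above suggests; createTranscriptBuffer is a hypothetical helper and not part of mtl-voxy.

```javascript
// Hypothetical helper, not part of mtl-voxy: collects transcription
// fragments in arrival order and joins them on demand.
function createTranscriptBuffer() {
  const chunks = [];
  return {
    // Stores a fragment; non-string and empty values are ignored.
    append(text) {
      if (typeof text === 'string' && text.length > 0) chunks.push(text);
    },
    // Returns the full transcript collected so far.
    toString() {
      return chunks.join('');
    },
    // Discards everything collected so far.
    clear() {
      chunks.length = 0;
    },
  };
}

// Usage in the browser, with voxyInstance as above:
// const buffer = createTranscriptBuffer();
// voxyInstance.getTranscription((text) => buffer.append(text));
```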

9. Data Export:

    exportTranscriptionBtnElement.addEventListener('click', () => {
        voxyInstance.exportTranscriptionAsTxt(transcriptElement.value);
    });

    exportReportBtnElement.addEventListener('click', () => {
        voxyInstance.exportReportAsTxt(reportElement.value);
    });

The exportTranscriptionAsTxt() method takes the provided transcript text and generates a downloadable .txt file.
The exportReportAsTxt() method does the same for the report data.

10. Error Handling:

    voxyInstance.errorHandler((error) => {
        console.error(error);
    });

The errorHandler() method registers a callback that receives any errors that occur within the SDK.
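This README does not describe the shape of the error passed to the callback, so a handler that shows messages to users may want to normalize it first. The sketch below turns an Error object, a string, or any other value into a displayable string; describeError is a hypothetical helper and not part of mtl-voxy.

```javascript
// Hypothetical helper, not part of mtl-voxy: converts whatever the SDK
// passes to the error callback into a string suitable for display or logging.
function describeError(error) {
  if (error instanceof Error) return error.message;
  if (typeof error === 'string') return error;
  try {
    const json = JSON.stringify(error);
    // JSON.stringify returns undefined for values such as undefined or functions.
    return json === undefined ? String(error) : json;
  } catch {
    // Circular structures cannot be serialized, so fall back to String().
    return String(error);
  }
}

// Usage in the browser, with voxyInstance as above:
// voxyInstance.errorHandler((error) => {
//   console.error(describeError(error));
// });
```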

`useVoxy` Parameters

| Parameter | Type | Required | Default | Description |
|------------|--------|----------|---------|-------------|
| apiUrl | String | Yes | None | The WebSocket server URL for speech recognition. |
| email | String | Yes | None | The user's email for authentication. |
| password | String | Yes | None | The user's password for authentication. |
| userID | String | Yes | None | Unique identifier for the user. |
| sampleRate | Number | No | 16000 | Defines the number of samples per second in the audio stream. |
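The table can be turned into a small pre-flight check that runs before useVoxy is called, so that a missing option fails early with a clear message. This is a minimal sketch: buildVoxyConfig is a hypothetical helper and not part of mtl-voxy, and the checks reflect only the types, required flags, and default listed in the table.

```javascript
// Hypothetical helper, not part of mtl-voxy: validates the options from the
// parameter table and applies the documented sampleRate default.
const REQUIRED_KEYS = ['apiUrl', 'email', 'password', 'userID'];

function buildVoxyConfig(options) {
  // Every required option must be a non-empty string.
  const missing = REQUIRED_KEYS.filter(
    (key) => typeof options[key] !== 'string' || options[key].trim() === ''
  );
  if (missing.length > 0) {
    throw new TypeError(`Missing required useVoxy options: ${missing.join(', ')}`);
  }

  // sampleRate is optional and defaults to 16000 according to the table.
  const config = { ...options, sampleRate: options.sampleRate ?? 16000 };
  if (typeof config.sampleRate !== 'number' || !(config.sampleRate > 0)) {
    throw new TypeError('sampleRate must be a positive number');
  }
  return config;
}

// Usage, with useVoxy imported as shown in step 1:
// const voxyInstance = await useVoxy(buildVoxyConfig({ apiUrl, email, password, userID }));
```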