mtl-voxy
v1.2.4
Voxy SDK Integration
Overview
Voxy SDK is a real-time speech-to-text transcription system built using Socket.io and plain JavaScript. It captures audio from your microphone, transcribes speech in real time, and exports the transcript as a text file. It also reports recording status, mode, microphone connection, and template data so that your UI can reflect them.
Features
- Real-Time Transcription: Converts spoken words to text instantly.
- Dynamic Recording Mode: Detects and updates the recording status in real time.
- Data Export: Easily export both transcription and report data as `.txt` files.
- Status Monitoring: Displays current mode, template, and microphone status in the UI.
Prerequisites
- A working microphone
- A valid `API_URL` for WebSocket-based speech recognition
- A valid `email` and `password` for authentication
- A unique user identifier (`userID`)
Installation
npm install mtl-voxy
Explanation
1. Import Required Module:
import useVoxy from 'mtl-voxy';
2. Initializing the SDK Connection:
const voxyInstance = await useVoxy({
  apiUrl: '<API_URL>',
  email: '<EMAIL>',
  password: '<PASSWORD>',
  userID: '123',
  sampleRate: 16000
});
- `sampleRate`: Determines the number of audio samples captured per second.
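To get a feel for what `sampleRate` means in practice, the sketch below estimates how much raw audio a given rate produces per second. It assumes uncompressed 16-bit mono PCM, which this README does not confirm, and `bytesPerSecond` is a hypothetical helper rather than part of the SDK.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// Assumption: audio is uncompressed 16-bit (2-byte) mono PCM.
function bytesPerSecond(sampleRate, bytesPerSample = 2, channels = 1) {
  return sampleRate * bytesPerSample * channels;
}

console.log(bytesPerSecond(16000)); // 32000 (bytes per second at the default rate)
console.log(bytesPerSecond(44100)); // 88200
```

Under that assumption, a higher sample rate increases the data sent over the socket proportionally.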
3. Toggling the Recording State:
recordBtnElement.addEventListener('click', async () => {
  await voxyInstance.toggleRecording();
});
- This code attaches a click event listener to the `recordBtnElement` (typically a button in your HTML). When the user clicks this button, the provided callback function is executed.
- The `toggleRecording()` method switches the current recording state. If recording is active, it stops the recording; if it is inactive, it starts the recording. This method manages the underlying logic to properly handle state changes and any related asynchronous operations.
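Because `toggleRecording()` is awaited, a quick double click could start a second toggle before the first one settles. One way to guard against that is sketched below; `makeToggleHandler` is a hypothetical helper, and whether the SDK already protects against overlapping calls is not stated in this README.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// `voxy` is any object exposing the documented async toggleRecording() method.
function makeToggleHandler(voxy) {
  let busy = false; // true while a toggle is still in flight
  return async function onRecordClick() {
    if (busy) return; // ignore clicks until the previous toggle settles
    busy = true;
    try {
      await voxy.toggleRecording();
    } finally {
      busy = false; // release the guard even if the toggle fails
    }
  };
}

// Usage in a page:
// recordBtnElement.addEventListener('click', makeToggleHandler(voxyInstance));
```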
4. Microphone Status:
voxyInstance.getMicrophoneStatus((microphone) => {
  document.getElementById("microphone").textContent = microphone;
});
- The `getMicrophoneStatus()` method listens for updates about the microphone's state.
5. Recording Status:
voxyInstance.getRecordingStatus((isRecording) => {
  document.getElementById("record-btn").textContent = isRecording ? "Stop Recording" : "Start Recording";
});
- The `getRecordingStatus()` method sets up a callback that receives a boolean (`isRecording`) indicating whether recording is active.
6. Mode Status:
voxyInstance.getModeStatus((mode) => {
  document.getElementById("mode").textContent = mode;
});
- The `getModeStatus()` method registers a callback that receives the current mode as a parameter.
7. Template Information:
voxyInstance.getTemplate((template) => {
  document.getElementById("template").textContent = template;
});
- The `getTemplate()` method provides template-related data.
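The four status callbacks from steps 4 to 7 can be registered in one place. The sketch below is one possible arrangement; `bindStatusPanel` and the shape of its `els` argument are hypothetical, and it relies only on the four SDK methods documented above.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// `voxy` exposes the four documented status methods; `els` maps names to
// objects with a writable textContent (DOM elements in a browser).
function bindStatusPanel(voxy, els) {
  voxy.getMicrophoneStatus((microphone) => {
    els.microphone.textContent = microphone;
  });
  voxy.getRecordingStatus((isRecording) => {
    els.recordBtn.textContent = isRecording ? 'Stop Recording' : 'Start Recording';
  });
  voxy.getModeStatus((mode) => {
    els.mode.textContent = mode;
  });
  voxy.getTemplate((template) => {
    els.template.textContent = template;
  });
}

// Usage in a page:
// bindStatusPanel(voxyInstance, {
//   microphone: document.getElementById('microphone'),
//   recordBtn: document.getElementById('record-btn'),
//   mode: document.getElementById('mode'),
//   template: document.getElementById('template'),
// });
```

Passing the elements in, instead of looking them up inside each callback, keeps the DOM lookups in one spot and makes the wiring easy to exercise with plain objects.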
8. Real-Time Transcription:
voxyInstance.getTranscription((text) => {
  document.getElementById("transcript").value += text;
});
- The `getTranscription()` method is used to register a callback function. This function is called every time new transcription data is received.
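If you want the transcript available outside the text area (for example, to export it later), you could collect the chunks in a small buffer. This is a sketch only: `createTranscriptBuffer` is a hypothetical helper, and it assumes each callback delivers only newly recognized text, as the `+=` in the example above suggests.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// Assumption: each callback delivers only the newly recognized text.
function createTranscriptBuffer(onChange = () => {}) {
  const chunks = [];
  return {
    append(text) {
      if (!text) return; // skip empty or undefined updates
      chunks.push(text);
      onChange(chunks.join(''));
    },
    text() {
      return chunks.join('');
    },
    clear() {
      chunks.length = 0;
      onChange('');
    },
  };
}

// Usage in a page:
// const buffer = createTranscriptBuffer((full) => { transcriptElement.value = full; });
// voxyInstance.getTranscription((text) => buffer.append(text));
```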
9. Data Export:
exportTranscriptionBtnElement.addEventListener('click', () => {
  voxyInstance.exportTranscriptionAsTxt(transcriptElement.value);
});
exportReportBtnElement.addEventListener('click', () => {
  voxyInstance.exportReportAsTxt(reportElement.value);
});
- The `exportTranscriptionAsTxt()` method is designed to take the provided text data and convert it into a downloadable .txt file.
- The `exportReportAsTxt()` method takes the report data and generates a downloadable .txt file.
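This README does not say how the export methods treat empty input, so you may want to skip the call when there is nothing to export. A minimal sketch, with `exportTranscriptIfPresent` as a hypothetical wrapper around the documented `exportTranscriptionAsTxt()`:

```javascript
// Hypothetical helper, not part of mtl-voxy.
// Calls the documented exportTranscriptionAsTxt() only when there is text.
function exportTranscriptIfPresent(voxy, text) {
  if (typeof text !== 'string' || text.trim() === '') {
    return false; // nothing to export
  }
  voxy.exportTranscriptionAsTxt(text);
  return true;
}

// Usage in a page:
// exportTranscriptionBtnElement.addEventListener('click', () => {
//   exportTranscriptIfPresent(voxyInstance, transcriptElement.value);
// });
```

The same pattern would apply to `exportReportAsTxt()`.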
10. Error Handling:
voxyInstance.errorHandler((error) => {
  console.error(error);
});
- By calling `errorHandler()` and passing in a callback function, you are setting up a mechanism to capture any errors that occur within the SDK.
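Logging to the console is invisible to end users, so you might also surface the error in the page. The sketch below shows one way to do that; `toErrorMessage` and `bindErrorBanner` are hypothetical helpers, and since this README does not specify the shape of the error value, the code assumes it may be an `Error`, a string, or an object with a `message`.

```javascript
// Hypothetical helpers, not part of mtl-voxy.
// Assumption: the SDK may pass an Error, a string, or another value.
function toErrorMessage(error) {
  if (error instanceof Error) return error.message;
  if (typeof error === 'string') return error;
  if (error && typeof error.message === 'string') return error.message;
  return String(error);
}

function bindErrorBanner(voxy, bannerEl) {
  voxy.errorHandler((error) => {
    console.error(error); // keep the full error for debugging
    bannerEl.textContent = toErrorMessage(error);
  });
}

// Usage in a page:
// bindErrorBanner(voxyInstance, document.getElementById('error-banner'));
```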
useVoxy Parameters
| Parameter | Type | Required | Default | Description |
|--------------------|--------|----------|---------|-------------|
| apiUrl | String | Yes | None | The WebSocket server URL for speech recognition. |
| email | String | Yes | None | The user's email for authentication. |
| password | String | Yes | None | The user's password for authentication. |
| userID | String | Yes | None | Unique identifier for the user. |
| sampleRate | Number | No | 16000 | Defines the number of samples per second in the audio stream. |
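The table above can be turned into a small pre-flight check that fails early when a required option is missing and fills in the documented default. This is a sketch under the table's stated types; `buildVoxyOptions` is a hypothetical helper, and the SDK may perform its own validation.

```javascript
// Hypothetical helper, not part of mtl-voxy.
// Required fields and the default come from the parameter table above.
const REQUIRED_VOXY_OPTIONS = ['apiUrl', 'email', 'password', 'userID'];

function buildVoxyOptions(input) {
  const missing = REQUIRED_VOXY_OPTIONS.filter(
    (key) => typeof input[key] !== 'string' || input[key] === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing required useVoxy options: ${missing.join(', ')}`);
  }
  const sampleRate = input.sampleRate ?? 16000; // documented default
  if (!Number.isFinite(sampleRate) || sampleRate <= 0) {
    throw new Error('sampleRate must be a positive number');
  }
  return { ...input, sampleRate };
}

// Usage:
// const voxyInstance = await useVoxy(buildVoxyOptions({
//   apiUrl: '<API_URL>', email: '<EMAIL>', password: '<PASSWORD>', userID: '123',
// }));
```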
