sofya.transcription
v0.0.19-beta.2
Sofya Transcription
Sofya Transcription is a JavaScript library that provides a robust and flexible solution for real-time audio transcription. It is designed to transcribe audio streams and can be easily integrated into web applications. The library also includes functionality for capturing audio from media elements.
Features
- Real-Time Transcription: Transcribe audio streams in real time with high accuracy.
- Flexible Integration: Seamlessly integrates with your web applications.
- Media Element Audio Capture: Feature to capture audio from media elements like <video> and <audio>.
- Multiple Provider Support: Support for Sofya Compliance and Sofya as Service transcription providers.
- Type-Safe Configuration: TypeScript definitions for provider-specific configurations.
Installation
To install Sofya Transcription, you can use npm:
npm install sofya.transcription
Usage
Here's a basic example of how to use Sofya Transcription in your project:
Import the Library:

    import { MediaElementAudioCapture, SofyaTranscriber } from 'sofya.transcription';

Create a Transcription Service Instance:

    // Using API key connection
    const transcriber = new SofyaTranscriber({
      apiKey: 'YOUR_API_KEY',
      config: {
        language: 'en-US'
      }
    });

    // Or using a specific provider
    const transcriber = new SofyaTranscriber({
      provider: 'sofya_compliance',
      endpoint: 'YOUR_ENDPOINT',
      config: {
        language: 'en-US',
        token: 'YOUR_TOKEN',
        compartmentId: 'YOUR_COMPARTMENT_ID',
        region: 'YOUR_REGION'
      }
    });

Initialize and Start Transcription:

    // Wait for the transcriber to be ready
    transcriber.on('ready', () => {
      // Get media stream
      navigator.mediaDevices.getUserMedia({ audio: true })
        .then(mediaStream => {
          // Start transcription
          transcriber.startTranscription(mediaStream);
        })
        .catch(error => {
          console.error('Error accessing microphone:', error);
        });
    });

Handle Transcription Events:

    transcriber.on('recognizing', (text) => {
      console.log('Recognizing: ' + text);
    });

    transcriber.on('recognized', (text) => {
      console.log('Recognized: ' + text);
    });

    transcriber.on('error', (error) => {
      console.error('Transcription error:', error);
    });

    transcriber.on('stopped', () => {
      console.log('Transcription stopped');
    });

Control Transcription:

    // Pause transcription
    transcriber.pauseTranscription();

    // Resume transcription
    transcriber.resumeTranscription();

    // Stop transcription
    await transcriber.stopTranscription();
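The event handlers above only log each result. As a minimal sketch of how the two text events might be combined into a running transcript, the helper below keeps finalized segments and the current interim text separately. `trackTranscript` and `TranscriberLike` are hypothetical names, not part of sofya.transcription, and the sketch assumes that 'recognizing' delivers interim text for the current utterance while 'recognized' delivers the final text for that utterance; if the library instead delivers cumulative text, the joining logic would need to change.

```typescript
// Structural type covering only the `on` method documented in this README.
interface TranscriberLike {
  on(event: string, callback: (...args: any[]) => void): unknown;
}

// Hypothetical helper (not part of sofya.transcription).
// Assumption: 'recognizing' carries interim text for the current utterance,
// and 'recognized' carries the final text for that utterance.
function trackTranscript(
  transcriber: TranscriberLike,
  onChange: (fullText: string) => void = () => {}
) {
  const finalized: string[] = [];
  let interim = '';

  // Join finalized segments with the current interim text, skipping empties.
  const text = (): string =>
    finalized
      .concat(interim)
      .filter((part) => part.length > 0)
      .join(' ');

  transcriber.on('recognizing', (partial: string) => {
    interim = partial;
    onChange(text());
  });

  transcriber.on('recognized', (finalText: string) => {
    finalized.push(finalText);
    interim = '';
    onChange(text());
  });

  return { text };
}
```

With a real instance this could be called as `trackTranscript(transcriber, (t) => console.log(t))` before starting transcription, so that no early events are missed.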
API
SofyaTranscriber
- constructor(connection: Connection): Creates a new instance of the transcription service with a connection object.
- startTranscription(mediaStream: MediaStream): void: Starts the transcription process with a given MediaStream.
- stopTranscription(): void: Stops the transcription process.
- pauseTranscription(): void: Pauses the transcription process.
- resumeTranscription(): void: Resumes the transcription process.
- on(event: string, callback: Function): this: Registers an event handler for transcription events. Possible events include:
  - recognizing: Fired when transcription is in progress.
  - recognized: Fired when transcription is complete.
  - error: Fired when an error occurs.
  - ready: Fired when the transcription service is ready to start.
  - stopped: Fired when the transcription process is stopped.
  - connected: Fired when the transcription service is connected to the provider.
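Because `on` accepts any string as the event name, a typo such as 'recognised' would go unnoticed. One possible way to guard against that is a small typed wrapper, sketched below. `TranscriberEvents`, `TranscriberLike`, and `bindHandlers` are hypothetical names that the library does not export. The payload types are assumptions inferred from the usage examples in this README (a string for the text events, an Error for 'error', no arguments for 'ready' and 'stopped'); the payload of 'connected' is not documented, so it is assumed here to take no arguments.

```typescript
// Assumed handler signatures, inferred from this README's examples.
// The 'connected' payload is undocumented; no arguments is an assumption.
type TranscriberEvents = {
  recognizing: (text: string) => void;
  recognized: (text: string) => void;
  error: (error: Error) => void;
  ready: () => void;
  stopped: () => void;
  connected: () => void;
};

type EventName = keyof TranscriberEvents;

// Structural type covering only the documented `on` method.
interface TranscriberLike {
  on(event: string, callback: (...args: any[]) => void): unknown;
}

// Hypothetical helper: registers every provided handler through `on`
// and returns the names of the events that were bound.
function bindHandlers(
  transcriber: TranscriberLike,
  handlers: Partial<TranscriberEvents>
): EventName[] {
  const bound: EventName[] = [];
  for (const name of Object.keys(handlers) as EventName[]) {
    const handler = handlers[name];
    if (handler) {
      transcriber.on(name, handler as (...args: any[]) => void);
      bound.push(name);
    }
  }
  return bound;
}
```

A call such as `bindHandlers(transcriber, { recognised: () => {} })` would then fail at compile time, since 'recognised' is not a key of `TranscriberEvents`.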
Connection Types
The SDK supports different connection modes based on the provider:
API Key Connection

    {
      apiKey: string;
      config?: BaseConfig;
    }

Sofya Compliance Provider Connection

    {
      provider: "sofya_compliance";
      endpoint: string;
      config: SofyaComplianceConfig;
    }

Sofya As Service Provider Connection

    {
      provider: "sofya_as_service";
      endpoint: string;
      config: SofyaSpeechConfig;
    }

STT WVAD Provider Connection

    {
      provider: "stt_wvad";
      endpoint: string;
      config: SofyaSpeechConfig;
    }

Configuration Types
BaseConfig

    interface BaseConfig {
      language: string;
    }

SofyaComplianceConfig

    interface SofyaComplianceConfig extends BaseConfig {
      token: string;
      compartmentId: string;
      region: string;
    }

SofyaSpeechConfig

    interface SofyaSpeechConfig extends BaseConfig {}

React Example
    import React from 'react'
    import { SofyaTranscriber } from 'sofya.transcription'

    const App = () => {
      const transcriberRef = React.useRef<SofyaTranscriber | null>(null)
      const [transcription, setTranscription] = React.useState('')
      const transcriptionRef = React.useRef('')

      const getMediaStream = async () => {
        const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
        return stream
      }

      const startTranscription = async () => {
        try {
          const stream = await getMediaStream()

          // Create transcriber with API key connection
          const transcriber = new SofyaTranscriber({
            apiKey: 'your_api_key',
            config: {
              language: 'en-US'
            }
          })
          transcriberRef.current = transcriber

          transcriber.on('ready', () => {
            transcriber.startTranscription(stream)
          })

          transcriber.on('recognizing', (result: string) => {
            transcriptionRef.current = result
            setTranscription(result)
          })

          transcriber.on('recognized', (result: string) => {
            transcriptionRef.current = result
            setTranscription(result)
          })

          transcriber.on('error', (error: Error) => {
            console.error('Transcription error:', error)
          })
        } catch (error) {
          console.error('Error starting transcription:', error)
        }
      }

      const stopTranscription = async () => {
        if (transcriberRef.current) {
          await transcriberRef.current.stopTranscription()
        }
      }

      return (
        <div>
          <button onClick={startTranscription}>Start Transcription</button>
          <button onClick={stopTranscription}>Stop Transcription</button>
          <div>
            <h3>Transcription:</h3>
            <p>{transcription}</p>
          </div>
        </div>
      )
    }

    export default App

License
This project is licensed under the MIT License - see the LICENSE file for details.
