@charivo/stt-transcriber-openai

v0.0.1

Published

9 days ago

OpenAI STT transcriber for Charivo (local/testing)

Downloads

0High
0Medium
0Low

zeikar

@charivo/stt-transcriber-openai

OpenAI Whisper STT transcriber for Charivo (direct API access).

⚠️ Security Warning

This transcriber directly calls OpenAI Whisper API from the client. Only use for development/testing or in trusted environments. For production, use @charivo/stt-transcriber-remote to keep API keys secure on the server.

Installation

pnpm add @charivo/stt-transcriber-openai @charivo/core openai

Usage

Basic Setup

import { createOpenAISTTTranscriber } from "@charivo/stt-transcriber-openai";
import { createSTTManager } from "@charivo/stt-core";

// ⚠️ API key will be visible in client code
const transcriber = createOpenAISTTTranscriber({
  apiKey: "your-openai-api-key", // NOT SECURE
  defaultLanguage: "en"
});

const sttManager = createSTTManager(transcriber);

// Start recording
await sttManager.start();

// Stop and transcribe
const transcription = await sttManager.stop();
console.log("User said:", transcription);

Configuration

const transcriber = createOpenAISTTTranscriber({
  apiKey: "your-api-key",
  defaultModel: "whisper-1",
  defaultLanguage: "en"  // Optional: specify language for better accuracy
});

API Reference

Constructor

new OpenAISTTTranscriber(config: OpenAISTTTranscriberConfig)

Config:

apiKey: string - Your OpenAI API key (required)
defaultModel?: "whisper-1" - Model to use (default: "whisper-1")
defaultLanguage?: string - Default language code (e.g., "en", "es", "fr")

Methods

`startRecording(options?)`

Start recording audio from microphone.

await transcriber.startRecording();

// With language option
await transcriber.startRecording({ language: "es" });

`stopRecording()`

Stop recording and transcribe audio to text.

const transcription = await transcriber.stopRecording();
console.log("User said:", transcription);

`isRecording()`

Check if currently recording.

if (transcriber.isRecording()) {
  console.log("Recording in progress...");
}

Supported Languages

Whisper supports 99+ languages including:

| Language | Code | Language | Code | |----------|------|----------|------| | English | en | Spanish | es | | French | fr | German | de | | Italian | it | Portuguese | pt | | Dutch | nl | Russian | ru | | Chinese | zh | Japanese | ja | | Korean | ko | Arabic | ar |

And many more... If not specified, Whisper will auto-detect the language.

Available Models

whisper-1 - OpenAI's Whisper model for speech recognition

Security Best Practices

❌ Not Recommended (Client-side)

// API key exposed to users!
const transcriber = createOpenAISTTTranscriber({
  apiKey: "sk-..." // Anyone can see this in DevTools
});

✅ Recommended (Server-side)

Use remote transcriber + provider pattern:

Client:

import { createRemoteSTTTranscriber } from "@charivo/stt-transcriber-remote";
import { createSTTManager } from "@charivo/stt-core";

const transcriber = createRemoteSTTTranscriber({
  apiEndpoint: "/api/stt" // No API key here!
});
const sttManager = createSTTManager(transcriber);

Server:

import { createOpenAISTTProvider } from "@charivo/stt-provider-openai";

const provider = createOpenAISTTProvider({
  apiKey: process.env.OPENAI_API_KEY // Secure!
});

export async function POST(request) {
  const formData = await request.formData();
  const audioFile = formData.get('audio') as File;
  
  const audioBlob = new Blob([await audioFile.arrayBuffer()]);
  const transcription = await provider.transcribe(audioBlob);
  
  return Response.json({ transcription });
}

When to Use

Use OpenAI STT Transcriber when:

🧪 Prototyping or testing
🏠 Personal projects
🔒 Running in trusted environment (e.g., Electron app)

Use Remote Transcriber when:

🌐 Production web apps
👥 Multi-user applications
💰 Need to control costs
🔐 Security is important

Pricing

Same as OpenAI Whisper API:

whisper-1: $0.006 per minute (rounded to the nearest second)

Example: 30 seconds of audio = $0.003

Audio Format Support

Supports various audio formats:

MP3
MP4
MPEG
MPGA
M4A
WAV
WEBM

Maximum file size: 25 MB

Error Handling

try {
  await sttManager.start();
  const transcription = await sttManager.stop();
} catch (error) {
  if (error.code === "insufficient_quota") {
    console.error("OpenAI quota exceeded");
  } else if (error.code === "invalid_api_key") {
    console.error("Invalid API key");
  } else if (error.name === "NotAllowedError") {
    console.error("Microphone permission denied");
  } else {
    console.error("Transcription error:", error);
  }
}

Complete Example

import { createOpenAISTTTranscriber } from "@charivo/stt-transcriber-openai";
import { createSTTManager } from "@charivo/stt-core";

const transcriber = createOpenAISTTTranscriber({
  apiKey: process.env.NEXT_PUBLIC_OPENAI_API_KEY!, // ⚠️ Only for testing
  defaultLanguage: "en"
});

const sttManager = createSTTManager(transcriber);

// Connect event listeners
sttManager.setEventEmitter({
  emit: (event, data) => {
    if (event === "stt:start") {
      console.log("Recording started");
    } else if (event === "stt:stop") {
      console.log("Transcription:", data.transcription);
    } else if (event === "stt:error") {
      console.error("Error:", data.error);
    }
  }
});

// Start recording (handled internally by transcriber)
await sttManager.start({ language: "en" });

// Stop and get transcription (transcriber handles recording stop + API call)
const text = await sttManager.stop();
console.log("User said:", text);

Integration with Charivo

import { Charivo } from "@charivo/core";
import { createSTTManager } from "@charivo/stt-core";
import { createOpenAISTTTranscriber } from "@charivo/stt-transcriber-openai";

const charivo = new Charivo();

// Setup STT
const transcriber = createOpenAISTTTranscriber({
  apiKey: process.env.NEXT_PUBLIC_OPENAI_API_KEY!
});
const sttManager = createSTTManager(transcriber);
charivo.attachSTT(sttManager);

// Voice input flow
await sttManager.start();
const userMessage = await sttManager.stop();
await charivo.userSay(userMessage);
// → Character responds with voice and animation

Performance Tips

Specify language: Improves accuracy and speed
Use good audio quality: Clear audio = better transcription
Reduce background noise: Pre-process if possible
Handle errors gracefully: Network issues can happen

Browser Compatibility

Works in browsers that support:

MediaRecorder API
getUserMedia API
Fetch API

Supported browsers:

Chrome/Edge 49+
Firefox 29+
Safari 14.1+
Opera 36+

Related Packages

@charivo/stt-transcriber-remote - Secure client-side transcriber (recommended)
@charivo/stt-provider-openai - Server-side provider
@charivo/stt-core - STT core functionality

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@charivo/stt-transcriber-openai

⚠️ Security Warning

Installation

Usage

Basic Setup

Configuration

API Reference

Constructor

Methods

startRecording(options?)

stopRecording()

isRecording()

Supported Languages

Available Models

Security Best Practices

❌ Not Recommended (Client-side)

✅ Recommended (Server-side)

When to Use

Use OpenAI STT Transcriber when:

Use Remote Transcriber when:

Pricing

Audio Format Support

Error Handling

Complete Example

Integration with Charivo

Performance Tips

Browser Compatibility

Related Packages

License

`startRecording(options?)`

`stopRecording()`

`isRecording()`