
@charivo/stt-provider-openai

v0.0.1 · 48 downloads

OpenAI STT provider for Charivo (server-side)

@charivo/stt-provider-openai

OpenAI Whisper STT (Speech-to-Text) provider for Charivo framework (server-side).

⚠️ Important Security Note

This is a server-side provider that calls the OpenAI Whisper API directly. It should ONLY be used in Node.js/server environments; using it in client-side code will expose your API key.

For client-side usage, use @charivo/stt-transcriber-remote instead.

Architecture

Node.js Server → OpenAISTTProvider → OpenAI Whisper API

Installation

pnpm add @charivo/stt-provider-openai @charivo/core openai

Usage

Server-side Only

import { createOpenAISTTProvider } from "@charivo/stt-provider-openai";

const provider = createOpenAISTTProvider({
  apiKey: process.env.OPENAI_API_KEY!, // Server environment variable
  defaultModel: "whisper-1",
  defaultLanguage: "en"
});

// Transcribe audio data
const transcription = await provider.transcribe(audioBlob);

// With custom options
const transcription2 = await provider.transcribe(audioBlob, {
  language: "es" // Spanish
});

API Endpoint Usage

// Express.js example
import express from 'express';
import multer from 'multer';
import { createOpenAISTTProvider } from "@charivo/stt-provider-openai";

const app = express();
const upload = multer({ storage: multer.memoryStorage() });
const provider = createOpenAISTTProvider({
  apiKey: process.env.OPENAI_API_KEY!
});

app.post('/api/stt', upload.single('audio'), async (req, res) => {
  try {
    if (!req.file) {
      return res.status(400).json({ error: 'No audio file provided' });
    }

    const audioBlob = new Blob([req.file.buffer], { 
      type: req.file.mimetype 
    });
    
    const transcription = await provider.transcribe(audioBlob, {
      language: req.body.language
    });
    
    res.json({ transcription });
  } catch (error) {
    console.error("STT error:", error);
    res.status(500).json({ error: 'Transcription failed' });
  }
});

Next.js API Route Example

// app/api/stt/route.ts
import { NextRequest, NextResponse } from "next/server";
import { createOpenAISTTProvider } from "@charivo/stt-provider-openai";

const provider = createOpenAISTTProvider({
  apiKey: process.env.OPENAI_API_KEY!
});

export async function POST(request: NextRequest) {
  try {
    const formData = await request.formData();
    const audioFile = formData.get('audio') as File;
    const language = formData.get('language') as string | undefined;

    if (!audioFile) {
      return NextResponse.json(
        { error: "No audio file provided" },
        { status: 400 }
      );
    }

    // Convert File to Blob
    const audioBlob = new Blob([await audioFile.arrayBuffer()], {
      type: audioFile.type
    });

    const transcription = await provider.transcribe(audioBlob, {
      language
    });

    return NextResponse.json({ transcription });
  } catch (error) {
    console.error("STT error:", error);
    return NextResponse.json(
      { error: "Failed to transcribe audio" },
      { status: 500 }
    );
  }
}

API Reference

Configuration Options

interface OpenAISTTConfig {
  /** OpenAI API key (required) */
  apiKey: string;
  /** Default OpenAI Whisper model (default: "whisper-1") */
  defaultModel?: "whisper-1";
  /** Default language for transcription (e.g., "en", "es", "fr") */
  defaultLanguage?: string;
  /** Allow browser usage (dangerous - exposes API key) */
  dangerouslyAllowBrowser?: boolean;
}

Available Models

  • whisper-1 - OpenAI's Whisper model for speech recognition

Supported Languages

Whisper supports 99+ languages including:

  • English (en)
  • Spanish (es)
  • French (fr)
  • German (de)
  • Chinese (zh)
  • Japanese (ja)
  • Korean (ko)
  • And many more...

For best results, specify the language if known. If not specified, Whisper will auto-detect.

Methods

transcribe(audio, options?): Promise<string>

Transcribe audio data to text.

// With Blob
const transcription = await provider.transcribe(audioBlob);

// With ArrayBuffer
const transcription = await provider.transcribe(audioBuffer);

// With language option
const transcription = await provider.transcribe(audioBlob, {
  language: "es"
});

Parameters:

  • audio: Blob | ArrayBuffer - Audio data to transcribe
  • options?: STTOptions - Optional transcription options
    • language?: string - Language code (e.g., "en", "es")

Returns: Promise<string> - Transcribed text

Browser Usage (Not Recommended)

⚠️ Security Warning: This provider should NOT be used in the browser, as doing so exposes your API key to users.

Better alternative: Use @charivo/stt-transcriber-remote for client-side usage.

Environment Variables

OPENAI_API_KEY=your_openai_api_key_here
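
Since a missing key only surfaces as an OpenAI API error on the first request, it can help to fail fast at startup instead. Here is a minimal sketch; the `requireEnv` helper is hypothetical, not part of this package:

```typescript
// Hypothetical helper: throw at startup if a required environment
// variable is missing, rather than failing on the first transcription.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// e.g. const provider = createOpenAISTTProvider({ apiKey: requireEnv("OPENAI_API_KEY") });
```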

Error Handling

try {
  const transcription = await provider.transcribe(audioBlob);
} catch (error) {
  console.error("Transcription failed:", error);
  // Handle OpenAI API errors:
  // - Invalid audio format
  // - API key issues
  // - Rate limiting
  // - Network errors
}
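
Rate limits and network errors are often transient, so one common pattern is to retry with exponential backoff. A sketch, where `transcribeFn` stands in for a call to `provider.transcribe` (this wrapper is an assumption, not part of the package's API):

```typescript
// Sketch: retry a transcription call with exponential backoff on
// transient failures (rate limits, network errors).
async function transcribeWithRetry(
  transcribeFn: () => Promise<string>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await transcribeFn();
    } catch (error) {
      lastError = error;
      // Wait 500 ms, 1000 ms, 2000 ms, ... before the next attempt
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

In production you would likely also inspect the error to avoid retrying non-transient failures such as an invalid API key.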

Use Cases

  • API Endpoints: Provide STT service via your server
  • Secure Transcription: Keep API keys on server, expose via HTTP endpoint
  • Language Support: Leverage Whisper's multilingual capabilities
  • Rate Limiting: Control STT usage per user
  • Cost Monitoring: Track STT API usage and costs
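
The "Rate Limiting" use case above can be sketched as a minimal in-memory per-user limiter. The window/limit values and user-id scheme are assumptions; production code would more likely use Redis or an API gateway:

```typescript
// Sketch: sliding-window rate limiter, keyed by user id.
// Kept in memory for illustration only.
class SlidingWindowLimiter {
  private timestamps = new Map<string, number[]>();
  private maxRequests: number;
  private windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  /** Returns true if the user is allowed another request right now. */
  allow(userId: string, now: number = Date.now()): boolean {
    // Keep only requests that are still inside the window
    const recent = (this.timestamps.get(userId) ?? []).filter(
      (t) => now - t < this.windowMs
    );
    if (recent.length >= this.maxRequests) {
      this.timestamps.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.timestamps.set(userId, recent);
    return true;
  }
}
```

A server endpoint would call `limiter.allow(userId)` before invoking `provider.transcribe` and return HTTP 429 when it yields false.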

Complete Example

Server (Next.js API Route)

// app/api/stt/route.ts
import { NextRequest, NextResponse } from "next/server";
import { createOpenAISTTProvider } from "@charivo/stt-provider-openai";

const provider = createOpenAISTTProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  defaultLanguage: "en"
});

export async function POST(request: NextRequest) {
  const formData = await request.formData();
  const audioFile = formData.get('audio') as File;
  const language = formData.get('language') as string | undefined;
  
  const audioBlob = new Blob([await audioFile.arrayBuffer()]);
  const transcription = await provider.transcribe(audioBlob, { language });
  
  return NextResponse.json({ transcription });
}

Client (uses Remote Transcriber)

import { createRemoteSTTTranscriber } from "@charivo/stt-transcriber-remote";
import { createSTTManager } from "@charivo/stt-core";

const transcriber = createRemoteSTTTranscriber({
  apiEndpoint: "/api/stt"
});
const sttManager = createSTTManager(transcriber);

// Start recording
await sttManager.start();

// Stop and get transcription
const text = await sttManager.stop();
console.log("User said:", text);

Pricing (OpenAI Whisper)

  • whisper-1: $0.006 per minute (rounded to the nearest second)

Example: 30 seconds of audio = $0.003
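
The arithmetic above can be wrapped in a small estimator. The rate is hard-coded from the figure quoted here, so verify it against current OpenAI pricing:

```typescript
// Sketch: estimate Whisper cost in USD from audio duration, at
// $0.006/minute with duration rounded to the nearest second.
const WHISPER_USD_PER_MINUTE = 0.006;

function estimateWhisperCostUSD(durationSeconds: number): number {
  const billedSeconds = Math.round(durationSeconds);
  return (billedSeconds / 60) * WHISPER_USD_PER_MINUTE;
}
```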

Audio Format Support

Whisper supports various audio formats:

  • MP3
  • MP4
  • MPEG
  • MPGA
  • M4A
  • WAV
  • WEBM

Maximum file size: 25 MB
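
Rejecting oversized or unsupported uploads before calling the API avoids wasted round-trips. A sketch of such a check; the MIME-type map is an assumption and should be adjusted to match how your uploads are encoded:

```typescript
// Sketch: validate an upload against Whisper's 25 MB limit and the
// formats listed above, returning an error message or null if valid.
const MAX_WHISPER_BYTES = 25 * 1024 * 1024;
const SUPPORTED_MIME_TYPES = new Set([
  "audio/mpeg", // MP3, MPEG, MPGA
  "audio/mp4",  // MP4, M4A
  "audio/wav",
  "audio/webm",
]);

function validateAudioUpload(sizeBytes: number, mimeType: string): string | null {
  if (sizeBytes > MAX_WHISPER_BYTES) {
    return `File too large: ${sizeBytes} bytes (max ${MAX_WHISPER_BYTES})`;
  }
  if (!SUPPORTED_MIME_TYPES.has(mimeType)) {
    return `Unsupported audio type: ${mimeType}`;
  }
  return null; // null means the upload is acceptable
}
```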

Performance Tips

  1. Use appropriate audio quality: Higher quality doesn't always mean better transcription
  2. Specify language: Improves accuracy and speed
  3. Reduce background noise: Pre-process audio for better results
  4. Chunk long audio: Split audio files > 10 minutes for faster processing

Related Packages

  • @charivo/core - Charivo framework core (peer dependency)
  • @charivo/stt-core - STT manager utilities (createSTTManager)
  • @charivo/stt-transcriber-remote - client-side transcriber that calls your server endpoint

License

MIT