@derogab/stt-proxy

v0.3.1

Published

a month ago

A simple and lightweight proxy for seamless integration with multiple STT (Speech-to-Text) providers including Whisper.cpp


derogab

Keywords: STT, speech-to-text, transcription, whisper, whisper.cpp, cloudflare, cloudflare-ai, proxy, gateway

stt-proxy

A simple and lightweight proxy for seamless integration with multiple STT providers including Whisper.cpp and Cloudflare AI.

Features

Multi-provider support: Switch between STT providers with environment variables.
TypeScript support: Full TypeScript definitions included.
Simple API: Single function interface for all providers.
Automatic provider detection: When no provider is forced, the first configured provider is selected in priority order, based on environment variables.

Installation

npm install @derogab/stt-proxy

Quick Start

import { transcribe } from '@derogab/stt-proxy';

const result = await transcribe('/path/to/audio.wav');
console.log(result.text);

Configuration

The package automatically detects which STT provider to use based on your environment variables. Configure one or more providers:

Provider Selection

STT_PROVIDER=cloudflare # Optional, force a specific provider (whisper.cpp, cloudflare)

When STT_PROVIDER is set, that provider is used, and an error is thrown if its credentials are not configured. When it is not set, a provider is selected automatically in the order described under Provider Priority.

Note: PROVIDER is supported as a fallback for backward compatibility when STT_PROVIDER is not set.
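
The precedence between STT_PROVIDER and PROVIDER can be sketched as a small helper. This illustrates the documented behavior and is not the package's internal code: the names resolveForcedProvider and KNOWN_PROVIDERS are hypothetical, and treating an empty value as unset and rejecting unknown names are assumptions made for this example.

```typescript
// Hypothetical helper: mirrors the documented precedence of STT_PROVIDER over PROVIDER.
type ProviderName = 'whisper.cpp' | 'cloudflare';
type Env = Record<string, string | undefined>;

const KNOWN_PROVIDERS: readonly ProviderName[] = ['whisper.cpp', 'cloudflare'];

function resolveForcedProvider(env: Env): ProviderName | undefined {
  // STT_PROVIDER wins; PROVIDER is only consulted when STT_PROVIDER is unset.
  // Assumption: an empty string counts as unset.
  const raw = env['STT_PROVIDER'] || env['PROVIDER'];
  if (!raw) return undefined; // Nothing forced: automatic selection applies.

  const match = KNOWN_PROVIDERS.find((name) => name === raw);
  if (match === undefined) {
    // Assumption: unknown names are rejected rather than ignored.
    throw new Error(`Unknown STT provider: ${raw}`);
  }
  return match;
}
```

In a Node.js program, the helper would typically be called as resolveForcedProvider(process.env).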

Whisper.cpp (Local)

WHISPER_CPP_MODEL_PATH=/path/to/ggml-base.bin # Required, path to your GGML model file

Download models from Hugging Face:

curl -L -o ggml-base.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin

Cloudflare AI

CLOUDFLARE_ACCOUNT_ID=your-account-id # Required
CLOUDFLARE_AUTH_KEY=your-api-token    # Required

Uses the @cf/openai/whisper-large-v3-turbo model.
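
Because both variables are required, a start-up check can report which ones are missing before any audio is sent. The following is a minimal sketch, assuming a hypothetical helper named missingCloudflareVars; only the two variable names come from this README.

```typescript
// Hypothetical preflight check; the variable names are the ones documented above.
type Env = Record<string, string | undefined>;

const CLOUDFLARE_VARS = ['CLOUDFLARE_ACCOUNT_ID', 'CLOUDFLARE_AUTH_KEY'] as const;

function missingCloudflareVars(env: Env): string[] {
  // A variable counts as missing when it is unset or contains only whitespace.
  return CLOUDFLARE_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}
```

Calling missingCloudflareVars(process.env) returns an empty array when the Cloudflare provider is fully configured.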

API Reference

`transcribe(audio: string | Buffer, options?): Promise<TranscribeOutput>`

Transcribes audio to text using the configured STT provider. The package automatically manages provider initialization and cleanup.

Parameters:

audio: Path to audio file (string) or audio Buffer
options (optional): Transcription options

Returns:

A Promise that resolves to an object with a text property containing the transcription.

Options Format:

type TranscribeOptions = {
  language?: string;   // Language code (e.g., 'en', 'es', 'fr')
  translate?: boolean; // Translate to English
};

Output Format:

type TranscribeOutput = {
  text: string;
};

Example:

import fs from 'node:fs';
import { transcribe } from '@derogab/stt-proxy';

// Transcribe from file path
const result1 = await transcribe('/path/to/audio.wav');
console.log(result1.text);

// Transcribe from Buffer
const audioBuffer = fs.readFileSync('/path/to/audio.wav');
const result2 = await transcribe(audioBuffer);
console.log(result2.text);

// With options
const result3 = await transcribe('/path/to/audio.wav', {
  language: 'en',
  translate: false
});
console.log(result3.text);
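
Since transcribe returns a Promise, a failure surfaces as a rejection that the caller can catch. The sketch below transcribes several files in sequence and records a per-file outcome instead of stopping at the first error. The helper transcribeAll and the BatchResult type are hypothetical; the transcription function is passed in as a parameter so that the example does not depend on a configured provider.

```typescript
// Option and output types copied from the API reference.
type TranscribeOptions = { language?: string; translate?: boolean };
type TranscribeOutput = { text: string };

// Only file paths are used here; the package's transcribe also accepts a Buffer.
type TranscribeFn = (audio: string, options?: TranscribeOptions) => Promise<TranscribeOutput>;

type BatchResult =
  | { path: string; ok: true; text: string }
  | { path: string; ok: false; error: string };

async function transcribeAll(
  transcribeFn: TranscribeFn,
  paths: string[],
  options?: TranscribeOptions,
): Promise<BatchResult[]> {
  const results: BatchResult[] = [];
  for (const path of paths) {
    try {
      // Files are processed one at a time, in the order given.
      const { text } = await transcribeFn(path, options);
      results.push({ path, ok: true, text });
    } catch (err) {
      // Record the failure and continue with the next file.
      const error = err instanceof Error ? err.message : String(err);
      results.push({ path, ok: false, error });
    }
  }
  return results;
}
```

With the package installed, the imported transcribe function can be passed as the first argument, for example transcribeAll(transcribe, ['/path/to/a.wav', '/path/to/b.wav'], { language: 'en' }).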

Provider Priority

When the STT_PROVIDER environment variable is set, that provider is used directly.

Otherwise, the package selects providers in the following order:

1. Whisper.cpp (if WHISPER_CPP_MODEL_PATH is set and the file exists)
2. Cloudflare AI (if CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_AUTH_KEY are set)

If no providers are configured, the function throws an error.
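
The automatic order above can be expressed as a short function. This is a sketch of the documented rules only, not the package's actual implementation: detectProvider is a hypothetical name, the error message is made up, and the file check is injected so the logic can be exercised without touching the disk.

```typescript
// Hypothetical sketch of the automatic selection order (no forced provider).
type Env = Record<string, string | undefined>;
type ProviderName = 'whisper.cpp' | 'cloudflare';

function detectProvider(env: Env, fileExists: (path: string) => boolean): ProviderName {
  // 1. Whisper.cpp: the model path must be set and point to an existing file.
  const modelPath = env['WHISPER_CPP_MODEL_PATH'];
  if (modelPath && fileExists(modelPath)) return 'whisper.cpp';

  // 2. Cloudflare AI: both variables must be set.
  if (env['CLOUDFLARE_ACCOUNT_ID'] && env['CLOUDFLARE_AUTH_KEY']) return 'cloudflare';

  // Nothing configured: the README states that an error is thrown in this case.
  throw new Error('No STT provider configured');
}
```

In Node.js, detectProvider(process.env, fs.existsSync) would apply the same rules to the real environment.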

Requirements

FFmpeg: Required for audio conversion (Whisper.cpp only).

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

# Windows (with Chocolatey)
choco install ffmpeg
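
When the Whisper.cpp provider is used, it can help to confirm at start-up that FFmpeg is on the PATH. Below is a minimal sketch using Node's child_process module; the helper isCommandAvailable is hypothetical and not part of this package.

```typescript
import { spawnSync } from 'node:child_process';

// Hypothetical preflight helper: true when the command starts and exits with code 0.
function isCommandAvailable(command: string, args: string[] = ['-version']): boolean {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  // `error` is set when the process could not be spawned (for example, ENOENT).
  return result.error === undefined && result.status === 0;
}

if (!isCommandAvailable('ffmpeg')) {
  console.warn('FFmpeg was not found on the PATH; it is required for the Whisper.cpp provider.');
}
```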

Development

# Install dependencies
npm install

# Build the package
npm run build

# Run tests
npm test

Credits

STT Proxy is made with ♥ by derogab and released under the MIT license.

Tip

If you like this project or directly benefit from it, please consider buying me a coffee:
🔗 bc1qd0qatgz8h62uvnr74utwncc6j5ckfz2v2g4lef
⚡️ [email protected]
💶 Sponsor on GitHub
