krio-stt

v1.0.0

Published

3 months ago

Speech-to-text for Krio and other African languages — powered by Whisper large-v3 via Hugging Face


maybeadev

speech-to-text stt krio whisper huggingface africa sierra-leone transcribe voice

krio-stt

Speech-to-text for Krio and other African languages — powered by Whisper large-v3 via the Hugging Face Inference API.

Works with voice notes from WhatsApp, Telegram, local files, or any audio URL. No extra dependencies — pure Node.js.

Why Whisper large-v3?

Standard STT engines struggle with Krio (Sierra Leonean Creole), Nigerian Pidgin, and heavily accented speech. Whisper large-v3 is trained on a wide range of languages and accents and handles these significantly better than smaller models.

Installation

npm install krio-stt

Or use it locally in a monorepo:

npm install ../krio-stt

Quick Start

const { transcribeUrl, transcribeBuffer, transcribeFile } = require('krio-stt');

Set your Hugging Face API key (free at huggingface.co/settings/tokens):

export HUGGINGFACE_API_KEY=hf_...

Transcribe from a URL

// Public URL: no auth needed
const publicText = await transcribeUrl('https://example.com/audio.ogg');

// Protected URL: pass a Bearer token
const protectedText = await transcribeUrl(mediaUrl, {
  auth: { bearer: process.env.WHAPI_TOKEN },
});

Transcribe from a Buffer

const text = await transcribeBuffer(audioBuffer, {
  contentType: 'audio/ogg',
});

Transcribe from a local file

const text = await transcribeFile('/path/to/voice.ogg');

Auth Options

The auth option in transcribeUrl supports multiple formats:

// Bearer token (WhatsApp / Whapi, most APIs)
{ auth: { bearer: 'TOKEN' } }

// Basic auth
{ auth: { basic: { user: 'username', pass: 'password' } } }

// Arbitrary header
{ auth: { header: { key: 'X-API-Key', value: 'TOKEN' } } }

// Raw Authorization header value
{ auth: 'Bearer TOKEN' }
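
To make the four formats concrete, here is a minimal sketch of how each `auth` shape could translate into HTTP request headers. `buildAuthHeaders` is a hypothetical helper written for illustration; it is not part of krio-stt's public API, and the package's internals may differ.

```javascript
// Hypothetical helper: NOT exported by krio-stt. It only illustrates how each
// documented `auth` shape could map to the headers sent when downloading audio.
function buildAuthHeaders(auth) {
  if (!auth) return {};

  // Raw Authorization header value, e.g. 'Bearer TOKEN'
  if (typeof auth === 'string') return { Authorization: auth };

  // Bearer token
  if (auth.bearer) return { Authorization: `Bearer ${auth.bearer}` };

  // HTTP Basic auth: the credentials are sent as base64("user:pass")
  if (auth.basic) {
    const { user, pass } = auth.basic;
    const encoded = Buffer.from(`${user}:${pass}`).toString('base64');
    return { Authorization: `Basic ${encoded}` };
  }

  // Arbitrary header, e.g. X-API-Key
  if (auth.header) return { [auth.header.key]: auth.header.value };

  throw new TypeError('Unsupported auth option');
}

console.log(buildAuthHeaders({ header: { key: 'X-API-Key', value: 'TOKEN' } }));
```

The string form is checked first because it is the only non-object shape; the remaining branches look at one key each.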

Platform Examples

WhatsApp (via Whapi)

const { transcribeUrl } = require('krio-stt');

// In your webhook handler:
if (message.type === 'audio') {
  const text = await transcribeUrl(message.media.url, {
    auth: { bearer: process.env.WHAPI_TOKEN },
  });
  // text is now the Krio/English transcript — handle it like a typed message
}

Telegram

const { transcribeUrl } = require('krio-stt');

// After calling getFile to resolve file_path:
const url = `https://api.telegram.org/file/bot${BOT_TOKEN}/${file_path}`;
const text = await transcribeUrl(url); // no auth header needed — token is in URL
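
The `getFile` step mentioned in the comment could look roughly like the sketch below. `resolveTelegramFileUrl` is a hypothetical helper, not something krio-stt provides. It assumes Node.js 18+ for the global `fetch`, and takes an optional `fetchImpl` parameter (also an assumption of this sketch) so it can be exercised without network access.

```javascript
// Hypothetical helper, not part of krio-stt. Calls the Telegram Bot API's
// getFile method and builds the download URL from the returned file_path.
async function resolveTelegramFileUrl(botToken, fileId, fetchImpl = fetch) {
  const api = `https://api.telegram.org/bot${botToken}/getFile?file_id=${encodeURIComponent(fileId)}`;
  const res = await fetchImpl(api);
  const body = await res.json();

  // Telegram responses carry an `ok` flag and, on failure, a `description`
  if (!body.ok) {
    throw new Error(`getFile failed: ${body.description || 'unknown error'}`);
  }

  // Same URL shape as the snippet above; the bot token is embedded in the path
  return `https://api.telegram.org/file/bot${botToken}/${body.result.file_path}`;
}
```

For a voice message, the file ID would typically come from `message.voice.file_id`. The resolved URL can then be passed straight to `transcribeUrl`.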

Express file upload

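const express = require('express');
const multer = require('multer'); // assumed upload middleware; provides upload.single() and req.file
const { transcribeBuffer } = require('krio-stt');

const app = express();
// Memory storage keeps the upload in RAM so that req.file.buffer is populated
const upload = multer({ storage: multer.memoryStorage() });

app.post('/transcribe', upload.single('audio'), async (req, res) => {
  const text = await transcribeBuffer(req.file.buffer, {
    contentType: req.file.mimetype,
  });
  res.json({ transcript: text });
});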

API Reference

`transcribeUrl(url, [options])` → `Promise<string>`

| Option | Type | Default | Description |
|---------------|------------------|---------------------------|--------------------------------------|
| `auth` | object or string | none | Auth config for downloading the file |
| `contentType` | string | auto-detected | Override MIME type |
| `apiKey` | string | `HUGGINGFACE_API_KEY` | Hugging Face API key |
| `model` | string | `openai/whisper-large-v3` | Whisper model ID |
| `timeout` | number | `60000` | Request timeout in milliseconds |
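
As a rough illustration of how the defaults in the table combine with caller-supplied options, consider the sketch below. `resolveOptions` and `DEFAULTS` are hypothetical names; how krio-stt merges options internally is not documented here, so treat this as one plausible reading of the table.

```javascript
// Hypothetical illustration of the documented defaults; krio-stt's internals may differ.
const DEFAULTS = {
  model: 'openai/whisper-large-v3',
  timeout: 60000, // milliseconds
};

function resolveOptions(options = {}) {
  // Caller-supplied values win over the defaults
  const resolved = { ...DEFAULTS, ...options };

  // An explicit apiKey takes precedence; otherwise fall back to the environment
  resolved.apiKey = options.apiKey ?? process.env.HUGGINGFACE_API_KEY;

  if (!resolved.apiKey) {
    throw new Error('Set HUGGINGFACE_API_KEY or pass options.apiKey');
  }
  return resolved;
}
```

With this reading, a call such as `transcribeUrl(url, { timeout: 120000 })` would keep the default model and the key from the environment, and only lengthen the timeout.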

`transcribeBuffer(buffer, [options])` → `Promise<string>`

Same options as above except auth (not applicable).

`transcribeFile(filePath, [options])` → `Promise<string>`

Same options as transcribeBuffer. Content-type is inferred from the file extension.
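
Inferring the content type from a file extension can be sketched as below. The mapping and the `guessContentType` name are assumptions made for illustration; the set of extensions krio-stt actually recognises may differ, in which case the `contentType` option is the way to override it.

```javascript
const path = require('path');

// Hypothetical extension-to-MIME table; the list krio-stt ships may differ.
const AUDIO_TYPES = {
  '.ogg': 'audio/ogg',
  '.oga': 'audio/ogg',
  '.mp3': 'audio/mpeg',
  '.wav': 'audio/wav',
  '.flac': 'audio/flac',
  '.m4a': 'audio/mp4',
  '.webm': 'audio/webm',
};

function guessContentType(filePath, fallback = 'application/octet-stream') {
  // Compare case-insensitively so 'NOTE.MP3' and 'note.mp3' behave the same
  const ext = path.extname(filePath).toLowerCase();
  return AUDIO_TYPES[ext] || fallback;
}

console.log(guessContentType('/path/to/voice.ogg'));
```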

Notes

The Hugging Face free tier may have a cold-start delay (~20–30s) if the model hasn't been used recently. The first request after a period of inactivity will take longer.
Audio files must be under 25 MB (Whisper's hard limit).
For production use, consider a paid HF Inference Endpoint to avoid cold starts.
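
The notes above suggest two defensive measures on the calling side: checking the size before uploading, and retrying when a cold start makes the first request fail or time out. A minimal sketch follows. Both helpers are hypothetical and not part of krio-stt, the retry count and delay are arbitrary, and 25 MB is interpreted here as 25 × 1024 × 1024 bytes, which is an assumption.

```javascript
// Hypothetical helpers, not part of krio-stt.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024; // assumes "25 MB" means 25 MiB

function assertUnderLimit(buffer) {
  if (buffer.length >= MAX_AUDIO_BYTES) {
    throw new RangeError(
      `Audio is ${buffer.length} bytes; must be under ${MAX_AUDIO_BYTES}`
    );
  }
}

// Calls `fn`, retrying up to `retries` more times and waiting `delayMs`
// between attempts. Rethrows the last error if every attempt fails.
async function withRetry(fn, { retries = 2, delayMs = 10000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

A call might then look like `assertUnderLimit(audioBuffer)` followed by `await withRetry(() => transcribeBuffer(audioBuffer, { contentType: 'audio/ogg' }))`. A fixed delay keeps the sketch short; exponential backoff would be a reasonable refinement.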

License

MIT
