@aid-on/unisttp

v0.1.1

Published

2 months ago

Unified STT (Speech-to-Text) provider for Cloudflare Workers AI / Groq

aid-on

stt speech-to-text whisper cloudflare groq workers-ai

Why unisttp?

Speech-to-text in edge applications means dealing with multiple providers, each with its own API quirks. unisttp gives you a single, type-safe interface to all of them:

- One spec format: a string such as "cloudflare:whisper-large-v3-turbo" or "groq:whisper-large-v3" selects both provider and model
- Automatic fallback chains: if Cloudflare fails, the chain tries Groq, with no hand-written retry logic
- VAD filtering: filter out silence and noise at the provider level, on models that support it
- Zero runtime dependencies: built purely on fetch, so it runs wherever fetch is available
- Edge-native: built for Cloudflare Workers from day one

Installation

npm install @aid-on/unisttp

Quick Start

import { getSTTProvider } from "@aid-on/unisttp";

// Create a provider using the "provider:model" spec format.
// `env` is the Worker's environment object; `env.AI` is its Workers AI binding.
const provider = getSTTProvider("cloudflare:whisper-large-v3-turbo", {
  cloudflareBinding: env.AI,
});

// Transcribe audio (`audioBuffer` holds the raw audio bytes, e.g. from request.arrayBuffer())
const result = await provider.transcribe(audioBuffer, {
  language: "ja",
  vadFilter: true,
});

console.log(result.text);       // e.g. "こんにちは、世界"
console.log(result.language);   // e.g. "ja"
console.log(result.duration);   // e.g. 2.5

STTSpec Format

The core concept is the STTSpec -- a simple "provider:model" string that uniquely identifies a provider and model combination:

cloudflare:whisper-large-v3-turbo
^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^
provider   model
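To make the format concrete, here is a minimal sketch of how a "provider:model" string could be split and checked. It is an illustration only, not the library's `parseSTTSpec` implementation; `splitSpec`, `SpecParts`, and `KNOWN_PROVIDERS` are hypothetical names introduced for this example.

```typescript
// Hypothetical sketch: NOT the library's actual parsing code.
type SpecProviderName = "cloudflare" | "groq";

interface SpecParts {
  provider: SpecProviderName;
  model: string;
  spec: string;
}

const KNOWN_PROVIDERS: SpecProviderName[] = ["cloudflare", "groq"];

function splitSpec(spec: string): SpecParts {
  // Split on the first colon only; everything after it is the model name.
  const idx = spec.indexOf(":");
  if (idx <= 0 || idx === spec.length - 1) {
    throw new Error(`Invalid spec "${spec}": expected "provider:model"`);
  }
  const provider = spec.slice(0, idx);
  const model = spec.slice(idx + 1);
  if (KNOWN_PROVIDERS.indexOf(provider as SpecProviderName) === -1) {
    throw new Error(`Unknown provider "${provider}"`);
  }
  return { provider: provider as SpecProviderName, model, spec };
}
```

In this sketch a malformed or unknown spec throws; how the real library reports such errors is not stated in this README.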

Available Specs

| Spec | Description | VAD | Languages |
|------|-------------|-----|-----------|
| cloudflare:whisper-large-v3-turbo | Fast, accurate multilingual STT with VAD | Yes | All |
| cloudflare:whisper-large-v3 | High accuracy multilingual STT | Yes | All |
| cloudflare:whisper | Base Whisper model | No | All |
| cloudflare:whisper-tiny-en | Fast English-only STT | No | English |
| groq:whisper-large-v3 | High accuracy STT via Groq API | No | All |
| groq:whisper-large-v3-turbo | Fast STT via Groq API | No | All |
| groq:distil-whisper-large-v3-en | Distilled English STT via Groq | No | English |

API Reference

Core Functions

`getSTTProvider(spec, credentials)`

Create an STT provider instance from a spec string.

import { getSTTProvider } from "@aid-on/unisttp";

// Cloudflare Workers AI
const cfProvider = getSTTProvider("cloudflare:whisper-large-v3-turbo", {
  cloudflareBinding: env.AI,
});

// Groq
const groqProvider = getSTTProvider("groq:whisper-large-v3-turbo", {
  groqApiKey: env.GROQ_API_KEY,
});

`parseSTTSpec(spec)`

Parse a spec string into its components.

import { parseSTTSpec } from "@aid-on/unisttp";

const parsed = parseSTTSpec("cloudflare:whisper-large-v3-turbo");
// => {
//   provider: "cloudflare",
//   model: "whisper-large-v3-turbo",
//   spec: "cloudflare:whisper-large-v3-turbo"
// }

`createSTTSpec(provider, model)`

Create a spec string from provider and model.

import { createSTTSpec } from "@aid-on/unisttp";

const spec = createSTTSpec("groq", "whisper-large-v3");
// => "groq:whisper-large-v3"

`getBestProvider(credentials)`

Automatically select the best available provider based on credentials. Priority: Cloudflare (VAD support) > Groq.

import { getBestProvider } from "@aid-on/unisttp";

const provider = getBestProvider({
  cloudflareBinding: env.AI,
  groqApiKey: env.GROQ_API_KEY,
});
// Returns Cloudflare provider (higher priority due to VAD support)
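The priority rule can be sketched as a simple ordered check over the supplied credentials. This is a hypothetical illustration of the "Cloudflare before Groq" ordering, not the library's code; `SketchCredentials`, `listAvailable`, and `pickBest` are names made up for the example, and returning `null` when nothing is configured is an assumption.

```typescript
// Hypothetical sketch of the documented priority: Cloudflare > Groq.
type PriorityProvider = "cloudflare" | "groq";

interface SketchCredentials {
  cloudflareBinding?: unknown; // stands in for the Workers AI binding
  groqApiKey?: string;
}

// Providers are appended in priority order, so index 0 is the preferred one.
function listAvailable(creds: SketchCredentials): PriorityProvider[] {
  const available: PriorityProvider[] = [];
  if (creds.cloudflareBinding != null) {
    available.push("cloudflare");
  }
  if (typeof creds.groqApiKey === "string" && creds.groqApiKey.length > 0) {
    available.push("groq");
  }
  return available;
}

// Assumption: with no usable credentials this sketch returns null.
function pickBest(creds: SketchCredentials): PriorityProvider | null {
  const available = listAvailable(creds);
  return available.length > 0 ? available[0] : null;
}
```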

`getAvailableProviders(credentials)`

List which providers are available based on the supplied credentials.

import { getAvailableProviders } from "@aid-on/unisttp";

const available = getAvailableProviders({
  cloudflareBinding: env.AI,
  groqApiKey: env.GROQ_API_KEY,
});
// => ["cloudflare", "groq"]

`hasCredentials(provider, credentials)`

Check if credentials are available for a specific provider.

import { hasCredentials } from "@aid-on/unisttp";

hasCredentials("cloudflare", { cloudflareBinding: env.AI }); // true
hasCredentials("groq", { cloudflareBinding: env.AI });       // false

Fallback Chain

`createFallbackChain(options)`

Create a resilient transcription pipeline that automatically falls back to the next provider on failure.

import { createFallbackChain } from "@aid-on/unisttp";

const chain = createFallbackChain({
  specs: [
    "cloudflare:whisper-large-v3-turbo",
    "groq:whisper-large-v3-turbo",
  ],
  credentials: {
    cloudflareBinding: env.AI,
    groqApiKey: env.GROQ_API_KEY,
  },
  onFallback: (error, nextSpec) => {
    console.warn(`Provider failed: ${error.message}, trying ${nextSpec}`);
  },
});

// Transcribe with automatic fallback
const result = await chain.transcribe(audioBuffer, { language: "ja" });

// Access chain metadata
const allProviders = chain.getProviders();
const primary = chain.getPrimary();
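The fallback behaviour described above amounts to trying each provider in order and moving on when one throws. The sketch below shows that pattern in isolation, assuming each provider exposes a `spec` and an async `transcribe`; `FallbackProvider`, `FallbackResult`, and `transcribeWithFallback` are hypothetical names, and the library's internals may differ.

```typescript
// Hypothetical sketch of a fallback loop; not the library's implementation.
interface FallbackResult {
  text: string;
}

interface FallbackProvider {
  spec: string;
  transcribe(audio: ArrayBuffer): Promise<FallbackResult>;
}

async function transcribeWithFallback(
  providers: FallbackProvider[],
  audio: ArrayBuffer,
  onFallback?: (error: Error, nextSpec: string) => void,
): Promise<FallbackResult> {
  let lastError: Error = new Error("No providers configured");
  for (let i = 0; i < providers.length; i++) {
    try {
      // First provider that succeeds wins.
      return await providers[i].transcribe(audio);
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      const next = providers[i + 1];
      // Notify only when there is another provider left to try.
      if (next && onFallback) {
        onFallback(lastError, next.spec);
      }
    }
  }
  // Every provider failed: surface the most recent error.
  throw lastError;
}
```

In this sketch, rethrowing the last error when the whole chain fails is a design choice of the example; the README does not say what the real chain does in that case.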

Model Metadata

`getModelInfo(spec)`

Get detailed metadata about a model.

import { getModelInfo } from "@aid-on/unisttp";

const info = getModelInfo("cloudflare:whisper-large-v3-turbo");
// => {
//   spec: "cloudflare:whisper-large-v3-turbo",
//   provider: "cloudflare",
//   model: "whisper-large-v3-turbo",
//   name: "Whisper Large V3 Turbo",
//   description: "Fast, accurate multilingual STT with VAD support",
//   supportsVAD: true,
//   supportsWordTimestamps: true,
//   languages: []
// }

`getModelsByProvider(provider)`

Get all models for a specific provider.

import { getModelsByProvider } from "@aid-on/unisttp";

const cfModels = getModelsByProvider("cloudflare");
// => [whisper-large-v3-turbo, whisper-large-v3, whisper, whisper-tiny-en]

`getModelsWithVAD()`

Get all models that support Voice Activity Detection filtering.

import { getModelsWithVAD } from "@aid-on/unisttp";

const vadModels = getModelsWithVAD();
// => [cloudflare:whisper-large-v3-turbo, cloudflare:whisper-large-v3]
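Metadata lookups like these can be pictured as filters over a static table. The sketch below rebuilds the Available Specs table as data and filters it; `SketchModel`, `SKETCH_MODELS`, `specsWithVAD`, and `specsForProvider` are hypothetical names, and the library's real data structures are not shown in this README.

```typescript
// Hypothetical sketch: the Available Specs table expressed as data.
interface SketchModel {
  spec: string;
  supportsVAD: boolean;
  englishOnly: boolean;
}

const SKETCH_MODELS: SketchModel[] = [
  { spec: "cloudflare:whisper-large-v3-turbo", supportsVAD: true, englishOnly: false },
  { spec: "cloudflare:whisper-large-v3", supportsVAD: true, englishOnly: false },
  { spec: "cloudflare:whisper", supportsVAD: false, englishOnly: false },
  { spec: "cloudflare:whisper-tiny-en", supportsVAD: false, englishOnly: true },
  { spec: "groq:whisper-large-v3", supportsVAD: false, englishOnly: false },
  { spec: "groq:whisper-large-v3-turbo", supportsVAD: false, englishOnly: false },
  { spec: "groq:distil-whisper-large-v3-en", supportsVAD: false, englishOnly: true },
];

// Specs whose models support VAD filtering.
function specsWithVAD(): string[] {
  return SKETCH_MODELS.filter((m) => m.supportsVAD).map((m) => m.spec);
}

// Specs that belong to one provider, matched by the "provider:" prefix.
function specsForProvider(provider: string): string[] {
  const prefix = provider + ":";
  return SKETCH_MODELS.filter((m) => m.spec.indexOf(prefix) === 0).map((m) => m.spec);
}
```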

`isValidSpec(spec)` / `getAllSpecs()`

Validate specs and list all available specs.

import { isValidSpec, getAllSpecs } from "@aid-on/unisttp";

isValidSpec("cloudflare:whisper-large-v3-turbo"); // true
isValidSpec("invalid:model");                      // false

const allSpecs = getAllSpecs();
// => ["cloudflare:whisper-large-v3-turbo", "cloudflare:whisper-large-v3", ...]

Types

import type {
  ProviderType,        // "cloudflare" | "groq"
  STTSpec,             // "cloudflare:whisper-large-v3-turbo" | ...
  ParsedSTTSpec,       // { provider, model, spec }
  Credentials,         // { cloudflareBinding?, groqApiKey?, ... }
  STTOptions,          // { language?, prompt?, vadFilter?, temperature? }
  STTResult,           // { text, language?, duration?, words? }
  STTProvider,         // { name, model, spec, transcribe() }
  ModelInfo,           // Full model metadata
  FallbackChainOptions // { specs, credentials, onFallback? }
} from "@aid-on/unisttp";

Configuration

STTOptions

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| language | string | auto-detect | ISO 639-1 language code (e.g., "ja", "en") |
| prompt | string | - | Initial prompt to guide transcription |
| vadFilter | boolean | - | Enable VAD to filter non-speech segments |
| temperature | number | - | Sampling temperature (0-1) |
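If you accept these options from untrusted input, it can help to check them against the documented ranges before calling a provider. The following is a minimal sketch under that assumption; `SketchOptions` and `checkOptions` are hypothetical helpers, not part of the package, and the README does not say whether the library performs similar validation itself.

```typescript
// Hypothetical pre-flight check mirroring the STTOptions table above.
interface SketchOptions {
  language?: string;
  prompt?: string;
  vadFilter?: boolean;
  temperature?: number;
}

// Returns a list of human-readable problems; an empty list means the options look fine.
function checkOptions(options: SketchOptions): string[] {
  const problems: string[] = [];
  // ISO 639-1 codes are two lowercase letters, e.g. "ja" or "en".
  if (options.language !== undefined && !/^[a-z]{2}$/.test(options.language)) {
    problems.push("language must be a two-letter ISO 639-1 code");
  }
  // The negated form also rejects NaN.
  if (
    options.temperature !== undefined &&
    !(options.temperature >= 0 && options.temperature <= 1)
  ) {
    problems.push("temperature must be between 0 and 1");
  }
  return problems;
}
```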

Credentials

| Field | Type | Required For |
|-------|------|--------------|
| cloudflareBinding | Ai | Cloudflare Workers AI |
| groqApiKey | string | Groq |
| cloudflareApiKey | string | Cloudflare REST API |
| cloudflareAccountId | string | Cloudflare REST API |

Real-World Example: Cloudflare Worker

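import { createFallbackChain } from "@aid-on/unisttp";

// Bindings this Worker expects; `Ai` is the Workers AI binding type.
interface Env {
  AI: Ai;
  GROQ_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {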
    const audioBuffer = await request.arrayBuffer();

    const chain = createFallbackChain({
      specs: [
        "cloudflare:whisper-large-v3-turbo",
        "groq:whisper-large-v3-turbo",
      ],
      credentials: {
        cloudflareBinding: env.AI,
        groqApiKey: env.GROQ_API_KEY,
      },
      onFallback: (error, nextSpec) => {
        console.warn(`Fallback: ${error.message} -> ${nextSpec}`);
      },
    });

    const result = await chain.transcribe(audioBuffer, {
      language: "ja",
      vadFilter: true,
    });

    return Response.json({
      text: result.text,
      language: result.language,
      duration: result.duration,
      words: result.words,
    });
  },
};

License

MIT (C) Aid-On

Unified STT for the edge. One spec, any provider.
