@dimer47/gladia-sdk

v1.0.1

Published

a month ago

Handcrafted TypeScript SDK for the Gladia API — pre-recorded transcription & real-time live streaming via WebSocket


dimer47

gladia transcription speech-to-text websocket live-streaming diarization translation typescript sdk

🎙️ Gladia SDK — TypeScript Client


Handcrafted TypeScript SDK for the Gladia API. It supports both pre-recorded transcription (REST + automatic polling) and real-time live streaming via WebSocket, with a fully typed event system.


💡 Works everywhere: Node.js, Bun, Deno, and browsers — zero runtime dependencies.

🎉 Features

🎤 Pre-recorded transcription — upload by file or URL, automatic polling with exponential backoff
🔴 Live streaming — typed WebSocket with real-time events (partial, final, speech events)
📤 Upload — multipart (file) or JSON (remote URL)
🌍 Translation, summarization, diarization, sentiment analysis — and 10+ toggleable addons
🔒 PII redaction — personal data masking (GDPR, HIPAA...)
🏷️ 100% typed — TypeScript interfaces for all 54 API schemas
⚡ Dual ESM + CJS — compatible with all bundlers and runtimes
🪶 0 dependencies — only native fetch and WebSocket
🧪 91 unit tests — full coverage without any API key required

📍 Install

npm install @dimer47/gladia-sdk

yarn add @dimer47/gladia-sdk

pnpm add @dimer47/gladia-sdk

🚀 Quick Start

import { GladiaClient } from '@dimer47/gladia-sdk';

const gladia = new GladiaClient({ apiKey: 'gla_xxx' });

🕹️ Usage

📤 Upload

// From a file (Blob, File, Buffer)
const uploaded = await gladia.upload.fromFile(myBlob, 'recording.wav');
console.log(uploaded.audio_url);

// From a remote URL (distinct name so both calls can live in the same scope)
const uploadedFromUrl = await gladia.upload.fromUrl('https://example.com/audio.mp3');
console.log(uploadedFromUrl.audio_url);

🎧 Pre-recorded Transcription

✅ Simple mode (POST + automatic polling)

const result = await gladia.preRecorded.transcribe({
  audio_url: 'https://example.com/audio.mp3',
  diarization: true,
  translation: true,
  translation_config: { target_languages: ['en'] },
  onPoll: (res) => console.log(`⏳ ${res.status}...`),
});

console.log(result.result?.transcription?.full_transcript);
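The automatic polling behind the simple mode can be pictured roughly as follows. This is a minimal sketch of polling with exponential backoff, not the SDK's actual `poll()` implementation: `pollUntilDone`, `PollOptions`, the default delays, and the `'error'` terminal status are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
type JobStatus = 'queued' | 'processing' | 'done' | 'error';
interface Job { id: string; status: JobStatus }

interface PollOptions {
  initialDelayMs?: number; // wait before the second request
  maxDelayMs?: number;     // upper bound for the backoff delay
  factor?: number;         // multiplier applied after every poll
  timeoutMs?: number;      // overall budget before giving up
  onPoll?: (job: Job) => void;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilDone(
  fetchJob: () => Promise<Job>,
  { initialDelayMs = 1000, maxDelayMs = 10000, factor = 2, timeoutMs = 300000, onPoll }: PollOptions = {},
): Promise<Job> {
  const deadline = Date.now() + timeoutMs;
  let delay = initialDelayMs;
  for (;;) {
    const job = await fetchJob();
    onPoll?.(job);
    // Stop as soon as the job reaches a terminal state.
    if (job.status === 'done' || job.status === 'error') return job;
    // Give up if the next wait would exceed the overall budget.
    if (Date.now() + delay > deadline) throw new Error('Polling timed out');
    await sleep(delay);
    // Grow the delay geometrically, capped at maxDelayMs.
    delay = Math.min(delay * factor, maxDelayMs);
  }
}
```

With the defaults assumed here, the waits between requests would be 1 s, 2 s, 4 s, 8 s, then 10 s for every later poll. In this sketch `fetchJob` would wrap something like `() => gladia.preRecorded.get(job.id)`.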

🔧 Granular control

// Create a job
const job = await gladia.preRecorded.create({
  audio_url: 'https://example.com/audio.mp3',
  summarization: true,
  sentiment_analysis: true,
});
console.log(`🆔 Job: ${job.id}`);

// Check status
const status = await gladia.preRecorded.get(job.id);
console.log(`📊 Status: ${status.status}`);

// List transcriptions
const list = await gladia.preRecorded.list({ limit: 10 });

// Download the original audio file
const audioBlob = await gladia.preRecorded.getFile(job.id);

// Delete
await gladia.preRecorded.delete(job.id);

🔴 Live Streaming

const session = await gladia.live.stream({
  encoding: 'wav/pcm',
  sample_rate: 16000,
  language_config: { languages: ['fr'] },
  realtime_processing: {
    translation: true,
    translation_config: { target_languages: ['en'] },
  },
});

// 📝 Listen to final transcriptions
session.on('transcript:final', (msg) => {
  console.log(`🗣️ ${msg.transcription.text}`);
});

// 📝 Listen to partial transcriptions
session.on('transcript:partial', (msg) => {
  process.stdout.write(`... ${msg.transcription.text}\r`);
});

// 🎤 Send audio chunks
session.sendAudio(audioChunk); // ArrayBuffer | Uint8Array | Blob

// ⏹️ Stop and wait for processing to finish
await session.stop();
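When the audio comes from an in-memory buffer rather than a live microphone, it usually has to be cut into small chunks before being passed to `sendAudio`. The helper below is a minimal sketch of that step. `chunkPcm` is a hypothetical name, and the 100 ms chunk duration and 16-bit mono defaults are assumptions, not values prescribed by the SDK.

```typescript
// Split raw PCM audio into fixed-duration chunks.
// Assumes 16-bit (2 bytes per sample) mono audio unless told otherwise.
function chunkPcm(
  pcm: Uint8Array,
  sampleRate: number,
  chunkMs = 100,
  bytesPerSample = 2,
  channels = 1,
): Uint8Array[] {
  // Whole samples per chunk, converted to bytes across all channels.
  const bytesPerChunk = Math.floor((sampleRate * chunkMs) / 1000) * bytesPerSample * channels;
  if (bytesPerChunk <= 0) throw new RangeError('chunk size must be positive');

  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += bytesPerChunk) {
    // subarray returns a view on the same memory, so no audio data is copied.
    chunks.push(pcm.subarray(offset, Math.min(offset + bytesPerChunk, pcm.length)));
  }
  return chunks;
}
```

At 16000 Hz, 16-bit mono, a 100 ms chunk is 3200 bytes. Each element of the returned array could then be handed to `session.sendAudio(chunk)`, ideally paced at roughly real time.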

🔧 Granular live control

// Initialize without opening the WebSocket
const liveSession = await gladia.live.init(
  { encoding: 'wav/pcm', sample_rate: 16000 },
  { region: 'eu-west' },
);
console.log(`🔗 WebSocket URL: ${liveSession.url}`);

// List, retrieve, delete
const sessions = await gladia.live.list({ limit: 5 });
const session = await gladia.live.get('session-id');
const audioBlob = await gladia.live.getFile('session-id');
await gladia.live.delete('session-id');

📦 Available Addons

| Addon | Field | Config |
|-------|-------|--------|
| 🗣️ Diarization | diarization | diarization_config |
| 🌍 Translation | translation | translation_config |
| 📝 Summarization | summarization | summarization_config |
| 💬 Sentiment Analysis | sentiment_analysis | — |
| 🏷️ Named Entity Recognition (NER) | named_entity_recognition | — |
| 📑 Chapterization | chapterization | — |
| 🔒 PII Redaction | pii_redaction | pii_redaction_config |
| 📺 Subtitles | subtitles | subtitles_config |
| 🤖 Audio to LLM | audio_to_llm | audio_to_llm_config |
| ✏️ Custom Spelling | custom_spelling | custom_spelling_config |
| 📊 Structured Data Extraction | structured_data_extraction | structured_data_extraction_config |
| 🔤 Custom Vocabulary | custom_vocabulary | custom_vocabulary_config |
| 🧑 Name Consistency | name_consistency | — |
| 🖥️ Display Mode | display_mode | — |
| 🚫 Moderation | moderation | — |

🧮 API Reference

`GladiaClient`

new GladiaClient(config: GladiaClientConfig)

| Parameter | Type | Description |
|-----------|------|-------------|
| apiKey | string | 🔑 Gladia API key (required) |
| baseUrl | string? | Base URL (default: https://api.gladia.io) |
| WebSocket | unknown? | Custom WebSocket constructor (for Node < 21, pass ws) |

`gladia.upload`

| Method | Description |
|--------|-------------|
| fromFile(blob, filename?, signal?) | 📤 Upload a file (multipart) |
| fromUrl(url, signal?) | 🔗 Upload from a remote URL |

`gladia.preRecorded`

| Method | Description |
|--------|-------------|
| transcribe(options) | ✅ POST + automatic polling until completion |
| create(request, signal?) | 📝 Create a transcription job |
| get(id, signal?) | 🔍 Retrieve a job by ID |
| list(params?, signal?) | 📋 List jobs (paginated) |
| delete(id, signal?) | 🗑️ Delete a job |
| getFile(id, signal?) | 💾 Download the original audio file |
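Most methods above accept an optional `signal` argument. Assuming it is a standard `AbortSignal` (the table does not state the type), a request could be cancelled after a deadline with a small helper like the one below. `timeoutSignal` is a hypothetical name used only for this sketch.

```typescript
// Create an AbortSignal that fires after `ms` milliseconds,
// plus a `clear` function to cancel the timer once the call has settled.
function timeoutSignal(ms: number): { signal: AbortSignal; clear: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  return {
    signal: controller.signal,
    clear: () => clearTimeout(timer),
  };
}
```

A call would then look like `const { signal, clear } = timeoutSignal(10_000);` followed by `await gladia.preRecorded.get(job.id, signal)` inside a `try`, with `clear()` in the `finally` block so the timer does not outlive the request.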

`gladia.live`

| Method | Description |
|--------|-------------|
| stream(options?) | 🔴 Init + open WebSocket → LiveSession |
| init(request?, options?) | 🔧 Init session (returns the WebSocket URL) |
| get(id, signal?) | 🔍 Retrieve a session by ID |
| list(params?, signal?) | 📋 List sessions (paginated) |
| delete(id, signal?) | 🗑️ Delete a session |
| getFile(id, signal?) | 💾 Download the audio recording |

`LiveSession`

| Method / Property | Description |
|-------------------|-------------|
| on(event, listener) | 👂 Listen to a typed event |
| off(event, listener) | 🔇 Remove a listener |
| sendAudio(data) | 🎤 Send an audio chunk |
| stop() | ⏹️ Signal end and wait for processing |
| closed | boolean — WebSocket state |

📡 Available Events

| Event | Type | Description |
|-------|------|-------------|
| transcript:final | LiveTranscriptMessage | Final transcription of an utterance |
| transcript:partial | LiveTranscriptMessage | Partial transcription in progress |
| speech-begin | LiveSpeechBeginMessage | Speech start detected |
| speech-end | LiveSpeechEndMessage | Speech end detected |
| ready | LiveReadyMessage | Session ready to receive audio |
| done | LiveDoneMessage | Processing complete |
| error | LiveErrorMessage | WebSocket error |
| message | LiveBaseMessage | Any raw message (catch-all) |
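A typed event system of this kind generally relies on an event map that ties each event name to its payload type. The sketch below shows the general technique only. `TypedEmitter`, `EventMap`, and the message shapes are hypothetical stand-ins, not the SDK's `LiveSession` or `LiveEventMap`.

```typescript
// Hypothetical message shapes; only the event names come from the table above.
interface TranscriptMessage { transcription: { text: string } }
interface ErrorMessage { message: string }

// Maps each event name to the payload its listeners receive.
interface EventMap {
  'transcript:final': TranscriptMessage;
  'transcript:partial': TranscriptMessage;
  error: ErrorMessage;
}

class TypedEmitter<Events> {
  // Stored loosely typed; the public methods enforce the payload types.
  private listeners = new Map<keyof Events, Array<(payload: any) => void>>();

  on<K extends keyof Events>(event: K, listener: (payload: Events[K]) => void): this {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
    return this;
  }

  off<K extends keyof Events>(event: K, listener: (payload: Events[K]) => void): this {
    const list = this.listeners.get(event) ?? [];
    // filter builds a new array, so removing a listener during emit is safe.
    this.listeners.set(event, list.filter((l) => l !== listener));
    return this;
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const listener of this.listeners.get(event) ?? []) listener(payload);
  }
}
```

With this shape, `emitter.on('transcript:final', (msg) => msg.transcription.text)` type-checks, while registering a listener for an unknown event name or reading a field that the payload lacks is rejected at compile time.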

⚠️ Error Handling

The SDK provides a typed error hierarchy:

GladiaError
├── GladiaApiError          (any HTTP error)
│   ├── BadRequestError     (400)
│   ├── UnauthorizedError   (401)
│   ├── ForbiddenError      (403)
│   ├── NotFoundError       (404)
│   └── UnprocessableEntityError (422)
├── GladiaTimeoutError      (polling timeout / abort)
└── GladiaWebSocketError    (WebSocket error)

import { UnauthorizedError, GladiaTimeoutError } from '@dimer47/gladia-sdk';

try {
  const result = await gladia.preRecorded.transcribe({
    audio_url: '...',
    pollTimeout: 60_000,
  });
  console.log(result.result?.transcription?.full_transcript);
} catch (err) {
  if (err instanceof UnauthorizedError) {
    console.error('🔑 Invalid API key');
  } else if (err instanceof GladiaTimeoutError) {
    console.error('⏱️ Timeout exceeded');
  } else {
    // Rethrow anything this handler does not recognize instead of swallowing it
    throw err;
  }
}
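To illustrate why `instanceof` checks work against such a hierarchy, here is a minimal sketch of how HTTP status codes could be mapped onto error subclasses. The classes are local stand-ins with deliberately different names, and `errorFromStatus` is hypothetical. Nothing here is taken from the SDK's `errors.ts`.

```typescript
// Local stand-ins that loosely mirror the tree above; not the SDK's classes.
class ApiError extends Error {
  readonly status: number;

  constructor(status: number, message: string) {
    super(message);
    this.status = status;
    this.name = new.target.name;
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class BadRequest extends ApiError {}
class Unauthorized extends ApiError {}
class Forbidden extends ApiError {}
class NotFound extends ApiError {}
class UnprocessableEntity extends ApiError {}

// Pick the most specific subclass for a status, falling back to the base class.
function errorFromStatus(status: number, message: string): ApiError {
  switch (status) {
    case 400: return new BadRequest(status, message);
    case 401: return new Unauthorized(status, message);
    case 403: return new Forbidden(status, message);
    case 404: return new NotFound(status, message);
    case 422: return new UnprocessableEntity(status, message);
    default: return new ApiError(status, message);
  }
}
```

Because every subclass extends the base class, a caller can catch one specific case (`err instanceof Unauthorized`) or any HTTP failure at once (`err instanceof ApiError`).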

🌐 Node < 21 Compatibility (WebSocket)

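Node.js < 21 does not have a global WebSocket (Node 21 only exposes one behind the --experimental-websocket flag, and it is enabled by default starting with Node 22). On runtimes without it, use the ws package: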

npm install ws

import WebSocket from 'ws';
import { GladiaClient } from '@dimer47/gladia-sdk';

const gladia = new GladiaClient({
  apiKey: 'gla_xxx',
  WebSocket,
});
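The fallback logic implied by the `WebSocket` option can be sketched as follows: prefer a constructor supplied by the caller, otherwise use the runtime's global one, and fail with a clear message if neither exists. `resolveWebSocket` and `WebSocketCtor` are hypothetical names. The SDK's own resolution logic may differ.

```typescript
// Minimal constructor shape; real WebSocket constructors accept more arguments.
type WebSocketCtor = new (url: string) => unknown;

function resolveWebSocket(custom?: WebSocketCtor): WebSocketCtor {
  // An explicitly supplied constructor always wins.
  if (custom) return custom;

  // Read the global defensively: it is absent on older Node.js versions.
  const native = (globalThis as unknown as { WebSocket?: WebSocketCtor }).WebSocket;
  if (!native) {
    throw new Error('No global WebSocket found; pass one explicitly (for example from the ws package)');
  }
  return native;
}
```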

🧪 Tests

npm test            # Run all 91 tests
npm run test:watch  # Watch mode

Tests use mocks (fetch, WebSocket) — no API key required.

🏗️ Build

npm run build       # ESM + CJS + .d.ts via tsup
npm run typecheck   # TypeScript type checking

📁 Project Structure

gladia-sdk/
├── src/
│   ├── index.ts              # Public re-exports
│   ├── client.ts             # GladiaClient (facade)
│   ├── http.ts               # HttpClient (fetch wrapper)
│   ├── errors.ts             # Typed error hierarchy
│   ├── types/                # 54 TypeScript interfaces
│   │   ├── config.ts         # LanguageConfig, DiarizationConfig, PiiRedactionConfig...
│   │   ├── common.ts         # JobStatus, PaginationParams, GladiaClientConfig
│   │   ├── upload.ts         # UploadResponse, AudioMetadata
│   │   ├── pre-recorded.ts   # PreRecordedRequest (31 fields), responses
│   │   ├── live.ts           # LiveRequest, LiveResponse, LiveRequestParams
│   │   ├── transcription.ts  # Utterance, Word, TranscriptionDTO
│   │   └── addons.ts         # AddonTranslationDTO, SentimentAnalysisEntry...
│   ├── resources/
│   │   ├── upload.ts         # .fromFile(), .fromUrl()
│   │   ├── pre-recorded.ts   # .create(), .get(), .list(), .delete(), .getFile(), .transcribe()
│   │   └── live.ts           # .init(), .get(), .list(), .delete(), .getFile(), .stream()
│   ├── live/
│   │   ├── session.ts        # LiveSession (typed WebSocket)
│   │   └── events.ts         # LiveEventMap (11 event types)
│   └── utils/
│       └── polling.ts        # poll() with exponential backoff
├── tests/                    # 91 unit tests (vitest)
├── docs/
│   └── openapi.yaml          # Gladia OpenAPI specification (source of truth)
├── dist/                     # Build output (ESM + CJS + .d.ts)
├── package.json
├── tsconfig.json
└── tsup.config.ts

🧾 License

MIT