plivo-stream-sdk-node

v1.0.0

Published

2 months ago

A Node.js SDK for handling Plivo real-time media streaming over WebSocket

plivo-sdks

plivo streaming websocket real-time media audio voice

plivo-stream-sdk-node

A Node.js SDK for handling Plivo real-time media streaming over WebSocket. Built on top of ws WebSocketServer.

Installation

npm install plivo-stream-sdk-node
# or
bun install plivo-stream-sdk-node

Quick Start

import express from 'express';
import PlivoWebSocketServer from 'plivo-stream-sdk-node';
import type { StartEvent, MediaEvent, DTMFEvent } from 'plivo-stream-sdk-node';
import * as Plivo from 'plivo';

const app = express();
const PORT = 8000;

// Plivo webhook endpoint - returns XML to initiate streaming
app.get('/stream', (req, res) => {
  const streamUrl = `wss://${req.get('host')}/stream`;
  const plivoResponse = new (Plivo as any).Response();
  plivoResponse.addSpeak('Hello world!');
  const params = {
    contentType: 'audio/x-mulaw;rate=8000',
    keepCallAlive: true,
    bidirectional: true,
  };
  plivoResponse.addStream(streamUrl, params);
  // Serialize once; res.send() computes Content-Length from the actual body
  const xml = plivoResponse.toXML();
  res.header('Connection', 'keep-alive');
  res.header('Keep-Alive', 'timeout=60');
  res.type('application/xml');
  res.send(xml);
});

// Start HTTP server
const server = app.listen(PORT, () => {
  console.log(`Server listening on http://localhost:${PORT}`);
});

// Create PlivoWebSocketServer attached to your HTTP server
const plivoServer = new PlivoWebSocketServer({ server, path: '/stream' });

plivoServer
  .onConnection(async (ws, req) => {
    console.log('New WebSocket connection');
    // Initialize per-connection resources here (e.g., speech-to-text clients)
  })
  .onStart((event: StartEvent, ws) => {
    console.log('Stream started:', event.start.streamId);
    console.log('Call ID:', event.start.callId);
    console.log('Media format:', event.start.mediaFormat);
  })
  .onMedia((event: MediaEvent, ws) => {
    // Get raw audio buffer from the event
    const audioBuffer = event.getRawMedia();
    // Process audio (e.g., send to speech-to-text service)
  })
  .onDtmf((event: DTMFEvent, ws) => {
    console.log('DTMF digit:', event.dtmf.digit);

    // Example: clear audio queue on * press
    if (event.dtmf.digit === '*') {
      plivoServer.clearAudio(ws);
    }
  })
  .onPlayedStream((event) => {
    console.log('Stream played:', event.name);
  })
  .onClearedAudio((event) => {
    console.log('Audio cleared:', event.streamId);
  })
  .onError((error, ws) => {
    console.error('Stream error:', error.message);
  })
  .onClose((ws) => {
    console.log('Connection closed');
  })
  .start(); // Must call .start() to begin accepting connections

API Reference

`PlivoWebSocketServer`

Extends WebSocketServer from the ws package.

Constructor

new PlivoWebSocketServer(options: ServerOptions, callback?: () => void)

Accepts the standard ws ServerOptions. Common options:

- server: HTTP/HTTPS server to attach to
- path: URL path for WebSocket connections (e.g., '/stream')
- port: Port to listen on (if not attaching to an existing server)

Lifecycle Methods

`start(): this`

Start accepting WebSocket connections. Must be called after registering all event handlers.

`close(callback?: () => void): void`

Close the WebSocket server.

Event Registration Methods (Chainable)

Each method returns this, so calls can be chained. Multiple handlers can be registered per event.

| Method | Callback Signature | Description |
| ---------------- | ---------------------------------------- | ----------------------------------------------------------------------------------- |
| onConnection | (ws, request) => void \| Promise<void> | New connection established. Async callbacks are awaited before processing messages. |
| onStart | (event: StartEvent, ws) => void | Stream initialization with call metadata |
| onMedia | (event: MediaEvent, ws) => void | Incoming audio chunk |
| onDtmf | (event: DTMFEvent, ws) => void | DTMF digit received |
| onPlayedStream | (event: PlayedStreamEvent, ws) => void | Audio playback confirmation |
| onClearedAudio | (event: ClearedAudioEvent, ws) => void | Audio queue cleared confirmation |
| onError | (error: Error, ws) => void | Error occurred |
| onClose | (ws) => void | Connection closed |
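
To illustrate how chainable registration with several handlers per event can work, here is a minimal sketch. `HandlerRegistry`, `on`, and `emit` are hypothetical names chosen for this example; they are not the SDK's internals, which this README does not show.

```typescript
// Hypothetical illustration only: not the SDK's actual implementation.
type Handler<T> = (payload: T) => void;

class HandlerRegistry<Events extends Record<string, unknown>> {
  private handlers = new Map<keyof Events, Handler<any>[]>();

  // Register a handler and return `this` so calls can be chained.
  on<K extends keyof Events>(event: K, handler: Handler<Events[K]>): this {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
    return this;
  }

  // Invoke every handler registered for the event, in registration order.
  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const handler of this.handlers.get(event) ?? []) {
      handler(payload);
    }
  }
}

const calls: string[] = [];
const registry = new HandlerRegistry<{ dtmf: { digit: string } }>()
  .on('dtmf', (e) => calls.push(`first:${e.digit}`))
  .on('dtmf', (e) => calls.push(`second:${e.digit}`));

// Both handlers run for the same event.
registry.emit('dtmf', { digit: '5' });
```

Returning `this` from each registration call is what lets the Quick Start chain `.onStart(...).onMedia(...)` on a single server instance.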

Action Methods

`playAudio(ws, contentType, sampleRate, payload)`

Send audio to a specific connection.

// payload can be Buffer, Uint8Array, or ArrayBuffer
plivoServer.playAudio(ws, 'audio/x-mulaw', 8000, audioBuffer);
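
This README does not show what `playAudio` puts on the wire. As a rough sketch, assuming the SDK emits Plivo's `playAudio` stream message (an `event` field plus a `media` object carrying base64 audio), the frame might be assembled as follows. `buildPlayAudioMessage` is a hypothetical helper, and the JSON shape is an assumption rather than something taken from the SDK source.

```typescript
// Hypothetical helper; the message shape is an assumption, not the SDK's code.
type AudioPayload = Buffer | Uint8Array | ArrayBuffer;

function buildPlayAudioMessage(
  contentType: string,
  sampleRate: number,
  payload: AudioPayload,
): string {
  // Normalize the three accepted payload types to a Buffer before encoding.
  const view = payload instanceof ArrayBuffer ? new Uint8Array(payload) : payload;
  const bytes = Buffer.from(view);
  return JSON.stringify({
    event: 'playAudio',
    media: { contentType, sampleRate, payload: bytes.toString('base64') },
  });
}

const message = buildPlayAudioMessage(
  'audio/x-mulaw',
  8000,
  new Uint8Array([0xff, 0x7f]),
);
```

Whatever the exact format, the `contentType` and `sampleRate` passed here should match the audio actually contained in the payload.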

`checkpoint(ws, name)`

Send a checkpoint event to track audio playback progress.

plivoServer.checkpoint(ws, 'greeting-complete');
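
One way to use checkpoints is to record each name when it is sent and resolve it when a `playedStream` event with the same `name` arrives. The sketch below assumes that `PlayedStreamEvent.name` echoes the checkpoint name, which is inferred from the event shapes in this README rather than stated outright. `CheckpointTracker` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper for correlating checkpoints with playedStream events.
class CheckpointTracker {
  private pending = new Map<string, number>(); // checkpoint name -> time sent (ms)

  // Call right after plivoServer.checkpoint(ws, name).
  markSent(name: string, now: number = Date.now()): void {
    this.pending.set(name, now);
  }

  // Call from an onPlayedStream handler with event.name.
  // Returns the elapsed milliseconds, or undefined for an unknown name.
  markPlayed(name: string, now: number = Date.now()): number | undefined {
    const sentAt = this.pending.get(name);
    if (sentAt === undefined) return undefined;
    this.pending.delete(name);
    return now - sentAt;
  }

  // Names that were sent but not yet confirmed as played.
  get outstanding(): string[] {
    return Array.from(this.pending.keys());
  }
}

const tracker = new CheckpointTracker();
tracker.markSent('greeting-complete', 1000);
const elapsed = tracker.markPlayed('greeting-complete', 1750);
```

In a real server a tracker like this would be kept per connection, so that checkpoint names from different calls cannot collide.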

`clearAudio(ws)`

Clear all queued audio for a connection.

plivoServer.clearAudio(ws);

Getter Methods

| Method | Return Type | Description |
| ------------------ | --------------------- | ------------------------------ |
| getStreamId(ws) | string \| undefined | Stream ID for the connection |
| getAccountId(ws) | string \| undefined | Plivo account ID |
| getCallId(ws) | string \| undefined | Call ID |
| getHeaders(ws) | string \| undefined | Extra headers from start event |
| isActive(ws) | boolean | Whether connection is open |

Event Types

StartEvent

{
  event: 'start';
  sequenceNumber: number;
  start: {
    callId: string;      // UUID
    streamId: string;    // UUID
    accountId: string;
    tracks: string[];
    mediaFormat: {
      encoding: string;
      sampleRate: number;
    };
  };
  extra_headers: string;
}

MediaEvent

{
  event: 'media';
  sequenceNumber: number;
  streamId: string;
  media: {
    track: string;
    timestamp: string;
    chunk: number;
    payload: string;  // base64 encoded audio
  };
  extra_headers: string;
  getRawMedia(): Buffer;  // Helper to decode payload
}
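
For streams negotiated as `audio/x-mulaw`, the decoded bytes are 8-bit G.711 mu-law samples, while many speech pipelines expect 16-bit linear PCM. The sketch below shows the standard mu-law expansion. `decodeMuLaw` is an illustrative helper, not part of the SDK, and the base64 step assumes `getRawMedia()` behaves like `Buffer.from(payload, 'base64')`.

```typescript
// Illustrative helper: standard G.711 mu-law expansion to 16-bit linear PCM.
function decodeMuLaw(bytes: Uint8Array): Int16Array {
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    const u = ~bytes[i] & 0xff; // mu-law bytes are transmitted bit-inverted
    const exponent = (u >> 4) & 0x07; // 3-bit segment number
    const mantissa = u & 0x0f; // 4-bit step within the segment
    const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    pcm[i] = u & 0x80 ? -magnitude : magnitude; // top bit carries the sign
  }
  return pcm;
}

// Assumption: this mirrors what getRawMedia() does with media.payload.
const payload = Buffer.from([0xff, 0x7f, 0x00, 0x80]).toString('base64');
const raw = Buffer.from(payload, 'base64');
const samples = decodeMuLaw(raw);
```

Services that accept mu-law input directly can take the raw buffer as-is and skip this conversion.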

DTMFEvent

{
  event: 'dtmf';
  sequenceNumber: number;
  streamId: string;
  dtmf: {
    track: string;
    digit: string;
    timestamp: string;
  };
  extra_headers: string;
}

PlayedStreamEvent

{
  event: 'playedStream';
  sequenceNumber: number;
  streamId: string;
  name: string;
}

ClearedAudioEvent

{
  event: 'clearedAudio';
  sequenceNumber: number;
  streamId: string;
}
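
Because every incoming message carries a literal `event` field, the shapes above work as a TypeScript discriminated union. The following sketch uses trimmed copies of those shapes so that it stays self-contained. `IncomingEvent` and `summarizeFrame` are names invented for this example, and the SDK may parse frames differently.

```typescript
// Trimmed copies of the event shapes above, for a self-contained example.
type IncomingEvent =
  | { event: 'start'; start: { streamId: string; callId: string } }
  | { event: 'media'; streamId: string; media: { payload: string } }
  | { event: 'dtmf'; streamId: string; dtmf: { digit: string } }
  | { event: 'playedStream'; streamId: string; name: string }
  | { event: 'clearedAudio'; streamId: string };

// Summarize a raw WebSocket text frame; TypeScript narrows on `event`.
function summarizeFrame(frame: string): string {
  const msg = JSON.parse(frame) as IncomingEvent;
  switch (msg.event) {
    case 'start':
      return `start ${msg.start.streamId}`;
    case 'media':
      return `media ${Buffer.from(msg.media.payload, 'base64').length} bytes`;
    case 'dtmf':
      return `dtmf ${msg.dtmf.digit}`;
    case 'playedStream':
      return `played ${msg.name}`;
    case 'clearedAudio':
      return `cleared ${msg.streamId}`;
    default:
      return 'unknown event';
  }
}

const summary = summarizeFrame(
  JSON.stringify({ event: 'dtmf', streamId: 's-1', dtmf: { digit: '*' } }),
);
```

Inside each `case`, only the fields belonging to that event type are accessible, which catches mistakes such as reading `msg.dtmf` on a media event at compile time.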

Types Export

import type {
  StartEvent,
  MediaEvent,
  DTMFEvent,
  PlayedStreamEvent,
  ClearedAudioEvent,
  PlayAudioEvent,
  CheckpointEvent,
  ClearAudioEvent,
  IncomingEventEnum,
  OutgoingEventEnum,
} from 'plivo-stream-sdk-node';

Running the Example App

The examples/express-streaming directory contains a complete voice AI example using:

Deepgram - Real-time speech-to-text
OpenAI - Chat completion for responses
ElevenLabs - Text-to-speech

Prerequisites

Node.js 18+ or Bun
A Plivo account with streaming enabled
API keys for Deepgram, OpenAI, and ElevenLabs
A way to expose your local server (e.g., ngrok)

Setup

Navigate to the example directory:

cd examples/express-streaming

Install dependencies:

npm install
# or
bun install

Create a .env file:

PORT=8000

# Deepgram (https://console.deepgram.com)
DEEPGRAM_API_KEY=your_deepgram_api_key
DEEPGRAM_MODEL=nova-2

# OpenAI (https://platform.openai.com)
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4o-mini

# ElevenLabs (https://elevenlabs.io)
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_VOICE_ID=your_voice_id
ELEVENLABS_MODEL_ID=eleven_turbo_v2

Start the server:

npx ts-node server.ts
# or with bun
bun run server.ts

Expose your server using ngrok:

ngrok http 8000

Configure Plivo:
- Go to your Plivo console
- Set your application's Answer URL to: https://your-ngrok-url.ngrok.io/stream

Make a call to your Plivo number and start talking!
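
A missing API key in the `.env` file typically only surfaces once a call is in progress, so it can help to validate the environment at startup. The sketch below is one possible approach. `loadConfig` and `ExampleConfig` are hypothetical names, and treating every key as required is an assumption; the example's `server.ts` may handle configuration differently.

```typescript
// Hypothetical config loader; the key names come from the .env listing above.
interface ExampleConfig {
  port: number;
  deepgramApiKey: string;
  openaiApiKey: string;
  elevenLabsApiKey: string;
  elevenLabsVoiceId: string;
}

function loadConfig(env: Record<string, string | undefined>): ExampleConfig {
  // Fail fast with a clear message when a key is absent or empty.
  const required = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing required environment variable: ${key}`);
    return value;
  };
  return {
    port: Number(env.PORT ?? 8000),
    deepgramApiKey: required('DEEPGRAM_API_KEY'),
    openaiApiKey: required('OPENAI_API_KEY'),
    elevenLabsApiKey: required('ELEVENLABS_API_KEY'),
    elevenLabsVoiceId: required('ELEVENLABS_VOICE_ID'),
  };
}

// Sample values only; in the example app this would be loadConfig(process.env).
const config = loadConfig({
  PORT: '8000',
  DEEPGRAM_API_KEY: 'dg-key',
  OPENAI_API_KEY: 'oa-key',
  ELEVENLABS_API_KEY: 'el-key',
  ELEVENLABS_VOICE_ID: 'voice-id',
});
```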

How It Works

1. When a call comes in, Plivo hits the /stream endpoint
2. The XML response initiates a bidirectional WebSocket stream
3. Audio from the caller is sent to Deepgram for transcription
4. Transcriptions are sent to OpenAI for a response
5. OpenAI's response is converted to speech via ElevenLabs
6. The audio is streamed back to the caller in real-time
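
The transcription, response, and speech steps above can be sketched as a single conversational turn. This is a simplified model that processes one utterance at a time, whereas the example app works on live streams. The `Stages` interface and `handleTurn` are hypothetical stand-ins and do not reflect the real Deepgram, OpenAI, or ElevenLabs client APIs.

```typescript
// Hypothetical stage signatures standing in for the three external services.
interface Stages {
  transcribe(audio: Uint8Array): Promise<string>;
  respond(transcript: string): Promise<string>;
  synthesize(text: string): Promise<Uint8Array>;
}

// Run one turn: caller audio -> transcript -> reply text -> reply audio.
async function handleTurn(
  audio: Uint8Array,
  stages: Stages,
  play: (audio: Uint8Array) => void,
): Promise<void> {
  const transcript = await stages.transcribe(audio);
  if (!transcript.trim()) return; // nothing recognized, so stay silent
  const reply = await stages.respond(transcript);
  play(await stages.synthesize(reply));
}

// Stub stages so the flow can be exercised without any API keys.
const played: Uint8Array[] = [];
const stubStages: Stages = {
  transcribe: async () => 'hello',
  respond: async (t) => `you said ${t}`,
  synthesize: async (text) => new Uint8Array(text.length),
};
const turn = handleTurn(new Uint8Array(160), stubStages, (a) => {
  played.push(a);
});
```

In the real server, the `play` callback would be where `plivoServer.playAudio(ws, ...)` is called with the synthesized audio.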

DTMF Controls

Press * to clear the audio queue (interrupt the AI)

License

MIT
