# @keyframelabs/sdk
The universal, low-level SDK for Keyframe Labs.
## Which package should I use?

- `@keyframelabs/sdk` (high control)
  - You implement the UI, state management, and agent/LLM binding yourself
- `@keyframelabs/elements` (custom UI)
  - You implement the UI; we handle the state and agent/LLM binding (framework-agnostic)
- `@keyframelabs/react` (drop-in)
  - We handle the UI, state, and agent/LLM binding
## Installation

```sh
pnpm add @keyframelabs/sdk
```

## Quick start
### 1. Server-side: create a session

From your backend, use your secret Keyframe API key to create a session.

```js
// POST https://api.keyframelabs.com/v1/session
const response = await fetch('https://api.keyframelabs.com/v1/session', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.KFL_API_KEY}`,
  },
  body: JSON.stringify({
    persona_id: "6efx..." // or persona_slug
  }),
});

// Returns { serverUrl, participantToken }
const session = await response.json();
```
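In practice you'll expose this from an endpoint your frontend can call, so the secret key never reaches the browser. A minimal sketch using Express (assumes Node 18+ for the built-in `fetch`; the `/api/keyframe-session` route name is illustrative):

```js
import express from 'express';

const app = express();

// Hypothetical route: proxies session creation so KFL_API_KEY stays server-side.
app.post('/api/keyframe-session', async (req, res) => {
  const response = await fetch('https://api.keyframelabs.com/v1/session', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.KFL_API_KEY}`,
    },
    body: JSON.stringify({ persona_id: '6efx...' }),
  });
  // Forward { serverUrl, participantToken } to the browser
  res.json(await response.json());
});

app.listen(3000);
```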
### 2. Client-side: connect the client

Pass the session details to the client SDK.
```js
import { createClient } from '@keyframelabs/sdk';

const client = createClient({
  serverUrl: "wss://...",
  participantToken: "A6gB...",
  agentIdentity: "agent",
  onVideoTrack: (track) => {
    // Attach to an HTML video element
    videoElement.srcObject = new MediaStream([track]);
  },
  onAudioTrack: (track) => {
    // Attach to an HTML audio element
    audioElement.srcObject = new MediaStream([track]);
  },
  onAgentStateChange: (state) => {
    console.log('Agent state:', state); // 'listening' | 'speaking'
  },
});

// Connect to the session
await client.connect();

// Send audio (e.g. from your LLM/agent)
client.sendAudio(pcmAudioBytes);

// Signal an interruption (clears pending frames)
client.interrupt();

// Cleanup
client.close();
```

## Architecture
The SDK handles the real-time transport loop between your agent or real-time LLM and the Keyframe Platform, as well as synced audio/video rendering.
```
+-----------------------+                          +-----------------------+
|        Browser        |                          |   Keyframe Platform   |
|                       |                          |                       |
|  +-----------------+  |                          |  +-----------------+  |
|  | Microphone      |  |                          |  | AvatarSession   |  |
|  +-----------------+  |                          |  +-----------------+  |
|           |           |                          |           ^           |
|           v           |                          |           |           |
|  +-----------------+  |        DataStream        |           |           |
|  | Agent / LLM     |  |       (PCM 24 kHz)       |           |           |
|  +-----------------+  | -----------------------> |           |           |
|           |           |                          |           |           |
|           v           |                          |           v           |
|  +-----------------+  |                          |  +-----------------+  |
|  | PersonaSession  |  |                          |  | Inference       |  |
|  +-----------------+  |                          |  +-----------------+  |
|           ^           |          WebRTC          |           |           |
|           |           |     (Audio + Video)      |           v           |
|  +-----------------+  | <----------------------- |  +-----------------+  |
|  | Video Element   |  |                          |  | Video           |  |
|  +-----------------+  |                          |  +-----------------+  |
|                       |                          |                       |
+-----------------------+                          +-----------------------+
```

## Integrating a specific agent or real-time LLM
The SDK is intentionally minimal: it only handles the avatar connection. You bring your own agent or real-time LLM (e.g., Cartesia, ElevenLabs, Gemini, OpenAI).
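As a sketch, suppose your agent exposes an event-emitter interface (the `agent` object and its event names below are hypothetical; substitute your provider's actual API). The only SDK calls involved are `sendAudio` and `interrupt`:

```js
// Hypothetical agent interface; substitute your provider's real API.
agent.on('audio', (pcmChunk) => {
  // Forward the agent's speech to the avatar (24 kHz, 16-bit PCM)
  client.sendAudio(pcmChunk);
});

agent.on('user_started_speaking', () => {
  // Drop queued frames so the avatar stops speaking promptly
  client.interrupt();
});
```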
## API

### createClient(options)

Initializes the WebSocket connection and media managers.
| Option | Type | Default | Description |
| -------------------- | ----------------- | -------- | --------------------------------------------------------- |
| serverUrl | string | Required | The WSS URL returned by the /session API endpoint. |
| participantToken | string | Required | The access token returned by the /session API endpoint. |
| agentIdentity | string | Required | Identity of the agent participant in the room. |
| onVideoTrack | (track) => void | Required | Fired when the WebRTC video track is ready. |
| onAudioTrack | (track) => void | Required | Fired when the WebRTC audio track is ready. |
| onStateChange | (state) => void | — | Fired when session state changes ('disconnected', 'connecting', 'connected', or 'error'). |
| onAgentStateChange | (state) => void | — | Fired when agent playback state changes ('listening' or 'speaking'). |
| onClose | (reason) => void | — | Fired when the session is closed (by server or client). |
| onError | (err) => void | — | Fired on errors. |
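For example, the optional lifecycle callbacks can be used for status display and error reporting. A minimal sketch (the logging shown is illustrative; `serverUrl` and `participantToken` come from the session step above):

```js
const client = createClient({
  serverUrl,
  participantToken,
  agentIdentity: 'agent',
  onVideoTrack: (track) => { videoElement.srcObject = new MediaStream([track]); },
  onAudioTrack: (track) => { audioElement.srcObject = new MediaStream([track]); },
  onStateChange: (state) => console.log('Session state:', state),
  onClose: (reason) => console.warn('Session closed:', reason),
  onError: (err) => console.error('Session error:', err),
});
```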
### PersonaSession

The client instance returned by createClient().

#### Methods
| Method | Signature | Description |
| ------------ | ---------------------------------------------- | ------------------------------------------------ |
| connect | () => Promise<void> | Connect to the session. |
| sendAudio | (pcmData: ArrayBuffer \| Int16Array) => void | Send 24 kHz, 16-bit PCM audio (see the conversion sketch below). |
| interrupt | () => void | Signal an interruption and clear pending frames. |
| setEmotion | (emotion: Emotion) => Promise<void> | Set the avatar's emotional expression. |
| close | () => void | Close the session and release resources. |
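If your audio source produces Float32 samples (as the Web Audio API does), convert them to 16-bit PCM before calling sendAudio. A minimal sketch, assuming the source already runs at 24 kHz:

```js
// Convert Float32 samples in [-1, 1] to 16-bit PCM.
function floatTo16BitPcm(float32) {
  const pcm = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i])); // clamp to [-1, 1]
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

client.sendAudio(floatTo16BitPcm(float32Chunk));
```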
#### Properties
| Property | Type | Description |
| -------- | ---------------------------------------------------------- | ---------------------- |
| state | 'disconnected' \| 'connecting' \| 'connected' \| 'error' | Current session state. |
## Types

### Emotion

Emotion states for avatar expression:

```ts
type Emotion = 'neutral' | 'angry' | 'sad' | 'happy';
```

## Emotion Controls
You can dynamically change the avatar's emotional expression using the setEmotion method:

```js
// Set the avatar to show happiness
await client.setEmotion('happy');
// React to user sentiment
await client.setEmotion('sad');
// Return to neutral
await client.setEmotion('neutral');
```

The avatar will blend its facial expression and demeanor to match the specified emotion. This can be triggered manually or in response to sentiment analysis, LLM output, or other signals from your application.
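For instance, a hypothetical sentiment hook might map labels from your analysis pipeline onto the Emotion type (the sentiment labels below are illustrative):

```js
// Illustrative mapping; the sentiment labels are hypothetical.
const SENTIMENT_TO_EMOTION = {
  positive: 'happy',
  negative: 'sad',
  hostile: 'angry',
};

async function onSentiment(label) {
  // Fall back to 'neutral' for unmapped labels
  await client.setEmotion(SENTIMENT_TO_EMOTION[label] ?? 'neutral');
}
```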
## License
MIT
