# @keyframelabs/sdk
The universal, low-level SDK for Keyframe Labs.
## Which package should I use?

- `@keyframelabs/sdk` (high control)
  - You implement the UI, state management, and agent/LLM binding yourself
- `@keyframelabs/elements` (custom UI)
  - You implement the UI; we handle the state and agent/LLM binding (framework-agnostic)
- `@keyframelabs/react` (drop-in)
  - We handle the UI, state, and agent/LLM binding
## Installation

```sh
pnpm add @keyframelabs/sdk
```

## Quick start
### 1. Server-side: create a session

From your backend, use your secret Keyframe API key to create a session.

```js
// POST https://api.keyframelabs.com/v1/session
const response = await fetch('https://api.keyframelabs.com/v1/session', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.KFL_API_KEY}`,
  },
  body: JSON.stringify({
    persona_id: "6efx..." // or persona_slug
  }),
});

// Returns { serverUrl, participantToken }
const session = await response.json();
```
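In practice you'll expose this from an endpoint your frontend can call, so the secret key never reaches the browser. A minimal sketch using Express (assumes Node 18+ for the built-in `fetch`; the `/api/keyframe-session` route name is illustrative):

```js
import express from 'express';

const app = express();

// Hypothetical route: proxies session creation so KFL_API_KEY stays server-side.
app.post('/api/keyframe-session', async (req, res) => {
  const response = await fetch('https://api.keyframelabs.com/v1/session', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.KFL_API_KEY}`,
    },
    body: JSON.stringify({ persona_id: '6efx...' }),
  });
  // Forward { serverUrl, participantToken } to the browser
  res.json(await response.json());
});

app.listen(3000);
```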
### 2. Client-side: connect the client

Pass the session details to the client SDK.
```js
import { createClient } from '@keyframelabs/sdk';

const client = createClient({
  serverUrl: "wss://...",
  participantToken: "A6gB...",
  agentIdentity: "agent",
  onVideoTrack: (track) => {
    // Attach to an HTML video element
    videoElement.srcObject = new MediaStream([track]);
  },
  onAudioTrack: (track) => {
    // Attach to an HTML audio element
    audioElement.srcObject = new MediaStream([track]);
  },
  onAgentStateChange: (state) => {
    console.log('Agent state:', state); // 'listening' | 'speaking'
  },
});

// Connect to the session
await client.connect();

// Send audio (e.g. from your LLM/agent)
client.sendAudio(pcmAudioBytes);

// Signal an interruption (clears pending frames)
client.interrupt();

// Cleanup
client.close();
```

## Architecture
The SDK handles the real-time transport loop between your agent or real-time LLM and the Keyframe Platform, as well as synced audio/video rendering.
```
+-----------------------+                          +-----------------------+
|        Browser        |                          |   Keyframe Platform   |
|                       |                          |                       |
|  +-----------------+  |                          |  +-----------------+  |
|  | Microphone      |  |                          |  | AvatarSession   |  |
|  +-----------------+  |                          |  +-----------------+  |
|           |           |                          |           ^           |
|           v           |                          |           |           |
|  +-----------------+  |        DataStream        |           |           |
|  | Agent / LLM     |  |       (PCM 24 kHz)       |           |           |
|  +-----------------+  | -----------------------> |           |           |
|           |           |                          |           |           |
|           v           |                          |           v           |
|  +-----------------+  |                          |  +-----------------+  |
|  | PersonaSession  |  |                          |  | Inference       |  |
|  +-----------------+  |                          |  +-----------------+  |
|           ^           |          WebRTC          |           |           |
|           |           |     (Audio + Video)      |           v           |
|  +-----------------+  | <----------------------- |  +-----------------+  |
|  | Video Element   |  |                          |  | Video           |  |
|  +-----------------+  |                          |  +-----------------+  |
|                       |                          |                       |
+-----------------------+                          +-----------------------+
```

## Integrating a specific agent or real-time LLM
The SDK is intentionally minimal: it only handles the avatar connection. You bring your own agent or real-time LLM (e.g., Cartesia, ElevenLabs, Gemini, OpenAI).
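As a sketch, suppose your agent exposes an event-emitter interface (the `agent` object and its event names below are hypothetical; substitute your provider's actual API). The only SDK calls involved are `sendAudio` and `interrupt`:

```js
// Hypothetical agent interface; substitute your provider's real API.
agent.on('audio', (pcmChunk) => {
  // Forward the agent's speech to the avatar (24 kHz, 16-bit PCM)
  client.sendAudio(pcmChunk);
});

agent.on('user_started_speaking', () => {
  // Drop queued frames so the avatar stops speaking promptly
  client.interrupt();
});
```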
## API

### createClient(options)

Initializes the WebSocket connection and media managers.
| Option | Type | Default | Description |
| -------------------- | ----------------- | -------- | --------------------------------------------------------- |
| serverUrl | string | Required | The WSS URL returned by the /session API endpoint. |
| participantToken | string | Required | The access token returned by the /session API endpoint. |
| agentIdentity | string | Required | Identity of the agent participant in the room. |
| onVideoTrack | (track) => void | Required | Fired when the WebRTC video track is ready. |
| onAudioTrack | (track) => void | Required | Fired when the WebRTC audio track is ready. |
| onStateChange | (state) => void | — | Fired when session state changes ('disconnected', 'connecting', 'connected', or 'error'). |
| onAgentStateChange | (state) => void | — | Fired when agent playback state changes ('listening' or 'speaking'). |
| onClose | (reason) => void | — | Fired when the session is closed (by server or client). |
| onError | (err) => void | — | Fired on errors. |
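For example, the optional lifecycle callbacks can be used for status display and error reporting. A minimal sketch (the logging shown is illustrative; `serverUrl` and `participantToken` come from the session step above):

```js
const client = createClient({
  serverUrl,
  participantToken,
  agentIdentity: 'agent',
  onVideoTrack: (track) => { videoElement.srcObject = new MediaStream([track]); },
  onAudioTrack: (track) => { audioElement.srcObject = new MediaStream([track]); },
  onStateChange: (state) => console.log('Session state:', state),
  onClose: (reason) => console.warn('Session closed:', reason),
  onError: (err) => console.error('Session error:', err),
});
```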
### PersonaSession

The client instance returned by createClient().

#### Methods
| Method | Signature | Description |
| ------------ | ---------------------------------------------- | ------------------------------------------------ |
| connect | () => Promise<void> | Connect to the session. |
| sendAudio | (pcmData: ArrayBuffer \| Int16Array) => void | Send 24 kHz, 16-bit PCM audio (see the conversion sketch below). |
| interrupt | () => void | Signal an interruption and clear pending frames. |
| setEmotion | (emotion: Emotion) => Promise<void> | Set the avatar's emotional expression. |
| close | () => void | Close the session and release resources. |
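If your audio source produces Float32 samples (as the Web Audio API does), convert them to 16-bit PCM before calling sendAudio. A minimal sketch, assuming the source already runs at 24 kHz:

```js
// Convert Float32 samples in [-1, 1] to 16-bit PCM.
function floatTo16BitPcm(float32) {
  const pcm = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i])); // clamp to [-1, 1]
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

client.sendAudio(floatTo16BitPcm(float32Chunk));
```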
#### Properties
| Property | Type | Description |
| -------- | ---------------------------------------------------------- | ---------------------- |
| state | 'disconnected' \| 'connecting' \| 'connected' \| 'error' | Current session state. |
## Types

### Emotion

Emotion states for avatar expression:

```ts
type Emotion = 'neutral' | 'angry' | 'sad' | 'happy';
```

## Emotion Controls
You can dynamically change the avatar's emotional expression using the setEmotion method:

```js
// Set the avatar to show happiness
await client.setEmotion('happy');
// React to user sentiment
await client.setEmotion('sad');
// Return to neutral
await client.setEmotion('neutral');
```

The avatar will blend its facial expression and demeanor to match the specified emotion. This can be triggered manually or in response to sentiment analysis, LLM output, or other signals from your application.
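For instance, a hypothetical sentiment hook might map labels from your analysis pipeline onto the Emotion type (the sentiment labels below are illustrative):

```js
// Illustrative mapping; the sentiment labels are hypothetical.
const SENTIMENT_TO_EMOTION = {
  positive: 'happy',
  negative: 'sad',
  hostile: 'angry',
};

async function onSentiment(label) {
  // Fall back to 'neutral' for unmapped labels
  await client.setEmotion(SENTIMENT_TO_EMOTION[label] ?? 'neutral');
}
```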
## License
MIT
