@byteplus/avatar-web-sdk
v1.0.0
Avatar Web SDK
Avatar Web SDK enables conversational digital-avatar interactions in the browser. Your users speak into the microphone, and an AI-driven avatar (rendered via WebRTC) responds in real time with both voice and video.
Package variants
Two builds of the SDK are published — pick the one that matches your deployment region. They share the same API; only the underlying RTC module and registry differ.
| Region | Package | RTC dependency |
| ------- | -------------------------- | ---------------- |
| Global | @byteplus/avatar-web-sdk | @byteplus/rtc |
npm install @byteplus/avatar-web-sdk
Requirements
- Browser with WebRTC, the Web Audio API, and a secure context (HTTPS or localhost)
- Microphone access permission (required for voice interaction)
- Chrome 90+, Safari 14+, Firefox 88+, Edge 90+
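Before initializing the SDK, it can help to check these requirements at runtime so unsupported browsers get a clear message. The sketch below is not part of the SDK: `BrowserEnv`, `MissingFeature`, and `findMissingFeatures` are hypothetical names, and the check only probes standard browser globals (`isSecureContext`, `RTCPeerConnection`, `AudioContext`, `navigator.mediaDevices.getUserMedia`).

```typescript
// Minimal shape of the globals we probe; a browser's `window` satisfies it structurally.
interface BrowserEnv {
  isSecureContext?: boolean;
  RTCPeerConnection?: unknown;
  AudioContext?: unknown;
  webkitAudioContext?: unknown; // older Safari exposes only the prefixed constructor
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

type MissingFeature = "secure-context" | "webrtc" | "web-audio" | "microphone-api";

// Hypothetical helper (not an SDK API): lists the requirements the environment lacks.
function findMissingFeatures(env: BrowserEnv): MissingFeature[] {
  const missing: MissingFeature[] = [];
  if (env.isSecureContext !== true) missing.push("secure-context");
  if (typeof env.RTCPeerConnection !== "function") missing.push("webrtc");
  if (
    typeof env.AudioContext !== "function" &&
    typeof env.webkitAudioContext !== "function"
  ) {
    missing.push("web-audio");
  }
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== "function") {
    missing.push("microphone-api");
  }
  return missing;
}

// In a browser:
// const missing = findMissingFeatures(window as unknown as BrowserEnv);
// if (missing.length > 0) { /* show a fallback message instead of starting a session */ }
```

This only detects whether the APIs exist. The microphone permission prompt itself appears later, when capture actually starts.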
Quick start
import {
  AvatarSDK,
  AvatarSession,
  AvatarSessionConfig,
  RenderMode,
} from "@byteplus/avatar-web-sdk"; // use the package variant for your region
// 1. Initialize the SDK (do this once, treat as a singleton)
const sdk = new AvatarSDK({
  appKey: "YOUR_APP_KEY",
  secretKey: "YOUR_SECRET_KEY",
  stsToken: "OPTIONAL_STS_TOKEN",
  environment: "overseas",
  logEnabled: true,
  logLevel: "info",
});
// 2. Prepare a container for the avatar video
const videoContainer = document.getElementById("avatar-video") as HTMLDivElement;
// 3. Create and start a session
const sessionConfig: AvatarSessionConfig = {
  avatarImageUrl: "https://your-cdn.example/avatar.jpg",
  speaker: "zh_female_qingxin", // TTS voice ID
  userPrompt: "You are a friendly assistant.",
  speechRate: 0, // -50 to 100, default 0
  loudnessRate: 0, // -50 to 100, default 0
  enableWebsearch: false,
};
const session: AvatarSession = sdk.createSession(sessionConfig);
session.setAvatarVideoCanvas(videoContainer, RenderMode.FIT);
session.on("start", () => {
  console.log("Avatar session started");
});
session.on("end", (error?: Error) => {
  if (error) {
    console.error("Session ended with error:", error);
  } else {
    console.log("Session ended normally");
  }
});
await session.start();
// 4. Begin capturing microphone audio for voice interaction
await session.startAudioCapture();
// ... user speaks, avatar responds with voice + video ...
// 5. Stop
await session.stopAudioCapture();
await session.end();
// session is now invalid; create a new one for the next interaction
Session lifecycle
idle ──start()──▶ starting ──▶ started ──end()──▶ ending ──▶ ended
                                                      (cannot be reused)
- A session is single-use. After end() returns, discard it and call sdk.createSession(...) again to start a new conversation.
- start() opens a WebSocket connection and initializes the WebRTC video/audio channel.
- end() sends a clean shutdown to the server and releases microphone / WebRTC resources.
- You may call destroy() as a safety net to force-release resources.
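The lifecycle diagram can be written down as a small transition table, which is handy if your UI needs to track session state or guard against reusing an ended session. This is an illustrative sketch transcribed from the diagram only: `SessionState`, `SessionAction`, `TRANSITIONS`, and `nextState` are hypothetical names, not SDK exports, and the SDK may allow transitions the diagram does not show (for example, ending on an error while starting).

```typescript
type SessionState = "idle" | "starting" | "started" | "ending" | "ended";
// "start" and "end" are the method calls; "started" and "ended" mark their completion.
type SessionAction = "start" | "started" | "end" | "ended";

// Allowed transitions, transcribed from the lifecycle diagram.
const TRANSITIONS: Record<SessionState, Partial<Record<SessionAction, SessionState>>> = {
  idle: { start: "starting" },
  starting: { started: "started" },
  started: { end: "ending" },
  ending: { ended: "ended" },
  ended: {}, // terminal: a session cannot be reused
};

// Returns the next state, or throws if the diagram has no such transition.
function nextState(state: SessionState, action: SessionAction): SessionState {
  const next = TRANSITIONS[state][action];
  if (next === undefined) {
    throw new Error(`Invalid transition: "${action}" while ${state}`);
  }
  return next;
}
```

With this table, `nextState("idle", "start")` yields `"starting"`, while `nextState("ended", "start")` throws, mirroring the rule that an ended session must be replaced by a new one from sdk.createSession(...).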
Events
AvatarSession extends EventEmitter. Subscribe with session.on(eventName, handler):
| Event | Payload | When it fires |
| ------------------- | --------------- | --------------------------------------------- |
| start | — | Session is ready for interaction |
| end | error?: Error | Session ended (either normally or with error) |
| audioCaptureStart | — | Microphone recording began |
| audioCaptureStop | — | Microphone recording stopped |
| audioCaptureFail | error: Error | Microphone access or recording failed |
| audioFrame | frame: AudioFrame | Raw PCM frame from the microphone |
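For audioCaptureFail, a handler can give more specific guidance than a generic alert. The sketch below assumes the SDK passes through the error that the browser's `getUserMedia()` rejected with, keeping its `name` intact; that is an assumption, not documented behavior. `describeCaptureError` is a hypothetical helper, and the names it matches are the standard `DOMException` names for microphone failures.

```typescript
// Hypothetical helper (not an SDK API): map a capture failure to a user-facing message.
// Assumes error.name carries the DOMException name from getUserMedia(); otherwise
// every error falls through to the generic default branch.
function describeCaptureError(error: Error): string {
  switch (error.name) {
    case "NotAllowedError":
      return "Microphone permission was denied. Allow access in the browser's site settings and try again.";
    case "NotFoundError":
      return "No microphone was found. Connect one and try again.";
    case "NotReadableError":
      return "The microphone is in use by another application or could not be read.";
    default:
      return `Microphone capture failed: ${error.message}`;
  }
}

// Usage with a session (showBanner is a placeholder for your own UI code):
// session.on("audioCaptureFail", (error) => showBanner(describeCaptureError(error)));
```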
session.on("audioCaptureFail", (error) => {
  alert("Microphone access denied: " + error.message);
});
Render modes
Passed to setAvatarVideoCanvas(container, mode):
- RenderMode.HIDDEN: fill the container, cropping if necessary (default)
- RenderMode.FIT: fit within the container, preserving aspect ratio
- RenderMode.FILL: stretch to fill the container (may distort)
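To make the difference between the three modes concrete, the sketch below computes where a video frame would land inside a container under each one. It illustrates the geometry only and is not the SDK's implementation: `Mode`, `Rect`, and `layoutVideo` are hypothetical names, the string literals stand in for the RenderMode members, and centering the video in the container is an assumption.

```typescript
type Mode = "HIDDEN" | "FIT" | "FILL"; // stand-ins for the RenderMode members

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: the rectangle a (videoW x videoH) frame occupies in the container.
// Negative x or y means the frame overflows the container and is cropped on that axis.
function layoutVideo(
  mode: Mode,
  containerW: number,
  containerH: number,
  videoW: number,
  videoH: number
): Rect {
  if (mode === "FILL") {
    // Stretch each axis independently; the aspect ratio may change.
    return { x: 0, y: 0, width: containerW, height: containerH };
  }
  const scaleX = containerW / videoW;
  const scaleY = containerH / videoH;
  // HIDDEN covers the container (larger scale); FIT stays inside it (smaller scale).
  const scale = mode === "HIDDEN" ? Math.max(scaleX, scaleY) : Math.min(scaleX, scaleY);
  const width = videoW * scale;
  const height = videoH * scale;
  return { x: (containerW - width) / 2, y: (containerH - height) / 2, width, height };
}
```

For a 100×100 video in a 200×100 container, FIT gives a 100×100 frame at x = 50 with empty bars on both sides, HIDDEN gives a 200×200 frame at y = -50 with the top and bottom cropped, and FILL gives a distorted 200×100 frame.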
