@byteplus/avatar-web-sdk

v1.0.0

Published

2 days ago

Avatar Web SDK enables conversational digital-avatar interactions in the browser. Your users speak into the microphone, and an AI-driven avatar (rendered via WebRTC) responds in real time with both voice and video.

Avatar Web SDK

Package variants

Two builds of the SDK are published — pick the one that matches your deployment region. They share the same API; only the underlying RTC module and registry differ.

| Region | Package                  | RTC dependency |
| ------ | ------------------------ | -------------- |
| Global | @byteplus/avatar-web-sdk | @byteplus/rtc  |

npm install @byteplus/avatar-web-sdk

Requirements

A browser with WebRTC and Web Audio API support, running in a secure context (HTTPS or localhost)
Microphone access permission, required for voice interaction
Chrome 90+, Safari 14+, Firefox 88+, Edge 90+
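
The requirements above can be checked before the SDK is initialized. The following is a minimal sketch that uses only standard browser globals; the helper name findMissingRequirements and the BrowserEnv shape are hypothetical and not part of the SDK.

```typescript
// Minimal shape of the globals we inspect; in a real page, pass `globalThis`.
interface BrowserEnv {
  isSecureContext?: boolean;
  RTCPeerConnection?: unknown;
  AudioContext?: unknown;
  webkitAudioContext?: unknown; // older Safari prefix
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

// Hypothetical helper (not an SDK API): lists missing prerequisites.
// An empty array means every check passed.
function findMissingRequirements(env: BrowserEnv): string[] {
  const missing: string[] = [];
  if (env.isSecureContext !== true) {
    missing.push("secure context (HTTPS or localhost)");
  }
  if (typeof env.RTCPeerConnection !== "function") {
    missing.push("WebRTC");
  }
  if (
    typeof env.AudioContext !== "function" &&
    typeof env.webkitAudioContext !== "function"
  ) {
    missing.push("Web Audio API");
  }
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== "function") {
    missing.push("microphone capture (getUserMedia)");
  }
  return missing;
}
```

In a page you would call findMissingRequirements(globalThis as unknown as BrowserEnv) and show a message if the result is non-empty. This only detects API availability; the microphone permission prompt itself appears later, when capture starts.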

Quick start

import {
  AvatarSDK,
  AvatarSession,
  AvatarSessionConfig,
  RenderMode,
} from "@byteplus/avatar-web-sdk"; // pkg-variant

// 1. Initialize the SDK (do this once, treat as a singleton)
const sdk = new AvatarSDK({
  appKey: "YOUR_APP_KEY",
  secretKey: "YOUR_SECRET_KEY",
  stsToken: "OPTIONAL_STS_TOKEN",
  environment: "overseas",
  logEnabled: true,
  logLevel: "info",
});

// 2. Prepare a container for the avatar video
const videoContainer = document.getElementById("avatar-video") as HTMLDivElement;

// 3. Create and start a session
const sessionConfig: AvatarSessionConfig = {
  avatarImageUrl: "https://your-cdn.example/avatar.jpg",
  speaker: "zh_female_qingxin",    // TTS voice ID
  userPrompt: "You are a friendly assistant.",
  speechRate: 0,                    // -50 to 100, default 0
  loudnessRate: 0,                  // -50 to 100, default 0
  enableWebsearch: false,
};

const session: AvatarSession = sdk.createSession(sessionConfig);

session.setAvatarVideoCanvas(videoContainer, RenderMode.FIT);

session.on("start", () => {
  console.log("Avatar session started");
});

session.on("end", (error?: Error) => {
  if (error) {
    console.error("Session ended with error:", error);
  } else {
    console.log("Session ended normally");
  }
});

await session.start();

// 4. Begin capturing microphone audio for voice interaction
await session.startAudioCapture();

// ... user speaks, avatar responds with voice + video ...

// 5. Stop
await session.stopAudioCapture();
await session.end();
// session is now invalid; create a new one for the next interaction
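
The quick start documents speechRate and loudnessRate as ranging from -50 to 100 with a default of 0. If these values come from user input (a slider, a query string), it may help to clamp them first. A small sketch; clampRate is a hypothetical helper, and how the SDK itself treats out-of-range values is not stated here.

```typescript
// Range taken from the quick-start comments for speechRate / loudnessRate.
const RATE_MIN = -50;
const RATE_MAX = 100;

// Hypothetical helper (not an SDK API): coerce a number into the documented
// range, falling back to the documented default of 0 for NaN or Infinity.
function clampRate(value: number): number {
  if (!Number.isFinite(value)) return 0;
  return Math.min(RATE_MAX, Math.max(RATE_MIN, value));
}
```

For example, speechRate: clampRate(sliderValue) keeps the config inside the documented bounds regardless of what the UI supplies.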

Session lifecycle

 idle ──start()──▶ starting ──▶ started ──end()──▶ ending ──▶ ended
                                                    (cannot be reused)

A session is single-use. After end() returns, discard it and call sdk.createSession(...) again to start a new conversation.
start() opens a WebSocket connection and initializes the WebRTC video/audio channel.
end() sends a clean shutdown to the server and releases microphone / WebRTC resources.
You may call destroy() as a safety net to force-release resources.
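
The single-use rule and the destroy() safety net can be combined in a try/finally wrapper. This is a sketch under assumptions: SessionLike lists only the methods named in this README, the return type of destroy() is a guess, and calling destroy() after a successful end() is assumed to be harmless. withSession is a hypothetical helper, not part of the SDK.

```typescript
// Only the members named in this README; the SDK's real typings may differ.
interface SessionLike {
  start(): Promise<void>;
  end(): Promise<void>;
  destroy(): void | Promise<void>; // assumed signature
}

// Hypothetical helper: run one conversation and always release resources.
async function withSession<T>(
  create: () => SessionLike,
  body: (session: SessionLike) => Promise<T>,
): Promise<T> {
  // Sessions are single-use, so a fresh one is created on every call.
  const session = create();
  try {
    await session.start();
    const result = await body(session);
    await session.end();
    return result;
  } finally {
    // Safety net: runs whether start(), body, or end() succeeded or threw.
    await session.destroy();
  }
}
```

With the real SDK, create would be something like () => sdk.createSession(sessionConfig), and body would start audio capture and wait until the conversation is over.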

Events

AvatarSession extends EventEmitter. Subscribe with session.on(eventName, handler):

| Event             | Payload           | When it fires                                 |
| ----------------- | ----------------- | --------------------------------------------- |
| start             | —                 | Session is ready for interaction              |
| end               | error?: Error     | Session ended (either normally or with error) |
| audioCaptureStart | —                 | Microphone recording began                    |
| audioCaptureStop  | —                 | Microphone recording stopped                  |
| audioCaptureFail  | error: Error      | Microphone access or recording failed         |
| audioFrame        | frame: AudioFrame | Raw PCM frame from the microphone             |

session.on("audioCaptureFail", (error) => {
  // Fires for both denied permission and recording failures
  alert("Microphone capture failed: " + error.message);
});
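
Because start and end are delivered as events, it can be convenient to wrap them in a promise. The sketch below relies only on on(), which is the one subscription method shown in this README; waitForStart and the SessionEvents interface are hypothetical names introduced for illustration.

```typescript
// Only `on` is assumed, since the README shows it; other emitter methods
// such as `off` or `once` may exist but are not relied on here.
interface SessionEvents {
  on(event: string, handler: (...args: any[]) => void): unknown;
}

// Hypothetical helper: resolves when "start" fires, rejects if "end"
// fires first. Once settled, later events have no effect on the promise.
function waitForStart(session: SessionEvents): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    session.on("start", () => resolve());
    session.on("end", (error?: Error) =>
      reject(error ?? new Error("Session ended before it started")),
    );
  });
}
```

Call waitForStart(session) before session.start() so that the handlers are registered in time. Whether awaiting start() alone already guarantees readiness is not specified here, so treat this as an optional belt-and-braces pattern.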

Render modes

Passed to setAvatarVideoCanvas(container, mode):

RenderMode.HIDDEN — fill the container, cropping if necessary (default)
RenderMode.FIT — fit within the container, preserving aspect ratio
RenderMode.FILL — stretch to fill the container (may distort)
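
To make the difference between the three modes concrete, the following sketch computes the size at which a video would be drawn in each case. It is illustrative geometry only, not the SDK's implementation; the string literals stand in for RenderMode.HIDDEN, RenderMode.FIT and RenderMode.FILL, and renderedSize is a hypothetical function.

```typescript
type Size = { width: number; height: number };
type Mode = "hidden" | "fit" | "fill"; // stand-ins for the RenderMode members

// Illustrative only: the drawn size of `video` inside `container`.
function renderedSize(mode: Mode, container: Size, video: Size): Size {
  // "fill" stretches each axis independently, so the aspect ratio may change.
  if (mode === "fill") return { ...container };

  const scaleX = container.width / video.width;
  const scaleY = container.height / video.height;
  // "hidden" uses the larger scale: the container is covered, overflow is cropped.
  // "fit" uses the smaller scale: the whole video is visible, with empty margins.
  const scale =
    mode === "hidden" ? Math.max(scaleX, scaleY) : Math.min(scaleX, scaleY);
  return { width: video.width * scale, height: video.height * scale };
}

const container = { width: 500, height: 500 };
const portraitVideo = { width: 1000, height: 2000 };
renderedSize("fit", container, portraitVideo);    // { width: 250, height: 500 }
renderedSize("hidden", container, portraitVideo); // { width: 500, height: 1000 }
renderedSize("fill", container, portraitVideo);   // { width: 500, height: 500 }
```

For a portrait avatar in a square container, the "hidden" case fills the box and crops the top and bottom, while "fit" leaves empty bars on the left and right.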
