@spatialwalk/avatarkit v1.0.0-beta.91
3D Gaussian Splatting Avatar Rendering SDK
AvatarKit SDK
Real-time virtual avatar rendering SDK for Web, supporting audio-driven animation and high-quality 3D rendering.
🚀 Features
- High-Quality 3D Rendering - GPU-accelerated avatar rendering with automatic backend selection
- Audio-Driven Real-Time Animation - Send audio data, SDK handles animation and rendering
- Multi-Avatar Support - Run multiple avatar instances simultaneously, each with independent state and rendering
- TypeScript Support - Complete type definitions and IntelliSense
- Modular Architecture - Clear component separation, easy to integrate and extend
📦 Installation
```bash
npm install @spatialwalk/avatarkit
```
🚧 Release Gate (Hard Rule)
Releases must pass all gates before publishing. Do not publish via manual ad-hoc commands.
Required gate checks:
```bash
pnpm typecheck
pnpm test
pnpm build
./tools/check_perf_baseline_release_gate.sh
```
If an iteration includes bugfixes, docs/bugfix-history.md must have completed rows (test mapping + red/green evidence).
Hotfix bypass is allowed only for emergencies and must be recorded:
```bash
HOTFIX_BYPASS=1 ./tools/check_perf_baseline_release_gate.sh
```
🧪 Benchmark Demo (Web SDK)
Use the dedicated benchmark demo (independent from vanilla/) for perf/render baseline runs:
```bash
pnpm demo:benchmark
```
🚀 Demo Repository
📌 Quick Start: Check Out Our Demo Repository
We provide complete example code and best practices to help you quickly integrate the SDK.
The demo repository includes:
- ✅ Complete integration examples
- ✅ Usage examples for both SDK mode and Host mode
- ✅ Audio processing examples (PCM16, WAV, MP3, etc.)
- ✅ Vite configuration examples
- ✅ Next.js configuration examples
- ✅ Best practices for common scenarios
👉 View Demo Repository (if the link is not yet available, please contact the team)
🔧 Vite Configuration (Recommended)
If you are using Vite as your build tool, we strongly recommend our Vite plugin, which applies all necessary WASM file configuration automatically so you don't need to set it up manually.
Using the Plugin
Add the plugin to vite.config.ts:
```typescript
import { defineConfig } from 'vite'
import { avatarkitVitePlugin } from '@spatialwalk/avatarkit/vite'

export default defineConfig({
  plugins: [
    avatarkitVitePlugin(), // Just add this line
  ],
})
```
Plugin Features
The plugin automatically handles:
- ✅ Development Server: Automatically sets the correct MIME type (`application/wasm`) for WASM files
- ✅ Build Time: Automatically copies WASM files to the `dist/assets/` directory
- ✅ Cloudflare Pages: Automatically generates a `_headers` file to ensure WASM files use the correct MIME type
- ✅ Vite Configuration: Automatically configures `optimizeDeps`, `assetsInclude`, `assetsInlineLimit`, and other options
Manual Configuration (Without Plugin)
If you don't use the Vite plugin, you need to manually configure the following:
```typescript
// vite.config.ts
import { defineConfig } from 'vite'

export default defineConfig({
  optimizeDeps: {
    exclude: ['@spatialwalk/avatarkit'],
  },
  assetsInclude: ['**/*.wasm'],
  build: {
    assetsInlineLimit: 0,
    rollupOptions: {
      output: {
        assetFileNames: (assetInfo) => {
          if (assetInfo.name?.endsWith('.wasm')) {
            return 'assets/[name][extname]'
          }
          return 'assets/[name]-[hash][extname]'
        },
      },
    },
  },
  plugins: [
    {
      // The development server needs middleware to serve WASM files with the correct MIME type.
      // Note: configureServer is a plugin hook, so it must live inside a plugin object.
      name: 'wasm-content-type',
      configureServer(server) {
        server.middlewares.use((req, res, next) => {
          if (req.url?.endsWith('.wasm')) {
            res.setHeader('Content-Type', 'application/wasm')
          }
          next()
        })
      },
    },
  ],
})
```
🔧 Next.js Configuration
For Next.js projects, use the withAvatarkit wrapper to automatically handle WASM file configuration with webpack.
Using the Plugin
Wrap your Next.js config in next.config.mjs:
```javascript
import { withAvatarkit } from '@spatialwalk/avatarkit/next'

export default withAvatarkit({
  // ...your existing Next.js config
})
```
Plugin Features
The plugin automatically handles:
- ✅ Path Fix: Patches asset path resolution so WASM files are correctly loaded at `/_next/static/chunks/`
- ✅ WASM Copying: Copies `.wasm` files into `static/chunks/` via a custom webpack plugin (client build only)
- ✅ Content-Type Headers: Adds an `application/wasm` response header for `/_next/static/chunks/*.wasm`
- ✅ Config Chaining: Preserves your existing `webpack` and `headers` configurations
🔐 Authentication
All environments require an App ID and Session Token for authentication.
App ID
The App ID is used to identify your application. You can obtain your App ID by:
- For Testing: Use the default test App ID provided in demo repositories (paired with test Session Token, only works with publicly available test avatars like Rohan, Dr.Kellan, Priya, Josh, etc.)
- For Production: Visit the Developer Platform to create your own App and avatars. You will receive your own App ID after creating an App.
Session Token
The Session Token is required for authentication and must be obtained from your SDK provider.
⚠️ Important Notes:
- The Session Token must be valid and not expired
- In production applications, you must manually inject a valid Session Token obtained from your SDK provider
- The default Session Token provided in demo repositories is only for demonstration purposes and can only be used with test avatars
- If you want to create your own avatars and test them, please visit the Developer Platform to create your own App and generate Session Tokens
How to Set Session Token:
```typescript
// Initialize SDK with App ID
await AvatarSDK.initialize('your-app-id', configuration)

// Set Session Token (can be called before or after initialization)
// If called before initialization, the token will be automatically set when you initialize the SDK
AvatarSDK.setSessionToken('your-session-token')

// Get current Session Token
const sessionToken = AvatarSDK.sessionToken
```
Token Management:
- The Session Token can be set at any time using `AvatarSDK.setSessionToken(token)`
- If you set the token before initializing the SDK, it will be automatically applied during initialization
- If you set the token after initialization, it will be applied immediately
- Handle token refresh logic in your application as needed (e.g., when the token expires)
For Production Integration:
- Obtain a valid Session Token from your SDK provider
- Store the token securely (never expose it in client-side code if possible)
- Implement token refresh logic to handle token expiration
- Use `AvatarSDK.setSessionToken(token)` to inject the token programmatically
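The refresh logic can be kept small and testable. This is a hedged sketch, not SDK API: the `fetchToken` backend call, the expiry timestamp, and the 60-second safety margin are all assumptions; in your app you would pass `AvatarSDK.setSessionToken` as the `setToken` callback.

```typescript
// Hedged sketch: token-refresh plumbing around a setSessionToken-style setter.
type TokenSetter = (token: string) => void

// Returns true when the token should be refreshed, given its expiry time (epoch ms).
// The marginMs safety window is an illustrative assumption.
function needsRefresh(expiresAtMs: number, nowMs: number, marginMs = 60_000): boolean {
  return nowMs >= expiresAtMs - marginMs
}

// Fetch a fresh token and inject it (e.g. setToken = AvatarSDK.setSessionToken).
// Returns true if a refresh was performed.
async function ensureFreshToken(
  expiresAtMs: number,
  fetchToken: () => Promise<string>, // your backend call (hypothetical)
  setToken: TokenSetter,
  nowMs: number = Date.now(),
): Promise<boolean> {
  if (!needsRefresh(expiresAtMs, nowMs)) return false
  setToken(await fetchToken())
  return true
}
```

A typical (hypothetical) call site would be `await ensureFreshToken(expiry, () => fetch('/session-token').then(r => r.text()), AvatarSDK.setSessionToken)` before starting a conversation.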
🎯 Quick Start
⚠️ Important: Audio Context Initialization
Before using any audio-related features, you MUST initialize the audio context in a user gesture context (e.g., click, touchstart event handlers). This is required by browser security policies. Calling initializeAudioContext() outside a user gesture will fail.
Basic Usage
```typescript
import {
  AvatarSDK,
  AvatarManager,
  AvatarView,
  Configuration,
  Environment,
  DrivingServiceMode,
  LogLevel
} from '@spatialwalk/avatarkit'

// 1. Initialize SDK
const configuration: Configuration = {
  environment: Environment.cn,
  drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is the default
  // - DrivingServiceMode.sdk: SDK mode - SDK handles network communication
  // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
  logLevel: LogLevel.off, // Optional, 'off' is the default
  // - LogLevel.off: Disable all logs
  // - LogLevel.error: Only error logs
  // - LogLevel.warning: Warning and error logs
  // - LogLevel.all: All logs (info, warning, error)
  audioFormat: { // Optional, default is { channelCount: 1, sampleRate: 16000 }
    channelCount: 1, // Fixed to 1 (mono)
    sampleRate: 16000 // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
  }
  // characterApiBaseUrl: 'https://custom-api.example.com' // Optional, internal debug config, can be ignored
}
await AvatarSDK.initialize('your-app-id', configuration)

// Set Session Token (required for authentication)
// You must obtain a valid Session Token from your SDK provider
// See the Authentication section above for more details
AvatarSDK.setSessionToken('your-session-token')

// 2. Load avatar
const avatarManager = AvatarManager.shared
const avatar = await avatarManager.load('character-id', (progress) => {
  console.log(`Loading progress: ${progress.progress}%`)
})

// 3. Create view (automatically creates Canvas and AvatarController)
// The playback mode is determined by drivingServiceMode in the AvatarSDK configuration
// - DrivingServiceMode.sdk: SDK mode - SDK handles network communication
// - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)

// 4. ⚠️ CRITICAL: Initialize audio context (MUST be called in user gesture context)
// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
// to satisfy browser security policies. Calling it outside a user gesture will fail.
button.addEventListener('click', async () => {
  // Initialize audio context - MUST be in user gesture context
  await avatarView.controller.initializeAudioContext()

  // 5. Start real-time communication (SDK mode only)
  await avatarView.controller.start()

  // 6. Send audio data (SDK mode; must be mono PCM16 matching the configured sample rate)
  // audioData: ArrayBuffer or Uint8Array containing PCM16 audio samples
  // - PCM files: can be read directly as an ArrayBuffer
  // - WAV files: extract the PCM data from the WAV container (may require resampling)
  // - MP3 files: decode first (e.g., using AudioContext.decodeAudioData()), then convert to PCM16
  const audioData = new ArrayBuffer(1024) // Placeholder: replace with actual PCM16 audio data
  avatarView.controller.send(audioData, false) // Send audio data
  avatarView.controller.send(audioData, true) // end=true marks the end of the current conversation round
})
```
Host Mode Example
```typescript
// 1-2. Same as SDK mode (initialize SDK, load avatar)

// 3. Create view with Host mode
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)

// 4. ⚠️ CRITICAL: Initialize audio context (MUST be called in user gesture context)
// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
// to satisfy browser security policies. Calling it outside a user gesture will fail.
button.addEventListener('click', async () => {
  // Initialize audio context - MUST be in user gesture context
  await avatarView.controller.initializeAudioContext()

  // 5. Host Mode Workflow:
  // Send audio data first to get a conversationId, then use it to send animation data
  const conversationId = avatarView.controller.yieldAudioData(audioData, false)
  avatarView.controller.yieldFramesData(animationDataArray, conversationId) // animationDataArray: (Uint8Array | ArrayBuffer)[]
})
```
Complete Examples
This SDK supports two usage modes:
- SDK mode: Real-time audio input with automatic animation data reception
- Host mode: Custom data sources with manual audio/animation data management
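In Host mode, long audio buffers are typically fed in chunks. The sketch below is hedged: the chunk size and the `yieldAudio` callback indirection are illustrative choices, not SDK API; in your app the callback would be `avatarView.controller.yieldAudioData`.

```typescript
// Split a PCM16 byte buffer into fixed-size chunks for streaming.
function chunkPcm16(data: Uint8Array, chunkBytes: number): Uint8Array[] {
  const chunks: Uint8Array[] = []
  for (let offset = 0; offset < data.length; offset += chunkBytes) {
    chunks.push(data.subarray(offset, offset + chunkBytes)) // subarray clamps at the end
  }
  return chunks
}

// Feed chunks to a yieldAudioData-shaped callback, marking the final chunk isLast=true.
// Returns the conversationId reported by the callback, or null for empty input.
function streamChunks(
  data: Uint8Array,
  chunkBytes: number,
  yieldAudio: (chunk: Uint8Array, isLast: boolean) => string,
): string | null {
  let conversationId: string | null = null
  const chunks = chunkPcm16(data, chunkBytes)
  chunks.forEach((chunk, i) => {
    conversationId = yieldAudio(chunk, i === chunks.length - 1)
  })
  return conversationId
}
```

Usage (hypothetical wiring): `streamChunks(pcmBytes, 3200, (c, last) => avatarView.controller.yieldAudioData(c, last))`.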
🏗️ Architecture Overview
Core Components
- AvatarSDK - SDK initialization and management
- AvatarManager - Avatar resource loading and management
- AvatarView - 3D rendering view
- AvatarController - Audio/animation playback controller
Playback Modes
The SDK supports two playback modes, configured in AvatarSDK.initialize():
1. SDK Mode (Default)
- Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarSDK.initialize()`
- SDK handles network communication automatically
- Send audio data via `AvatarController.send()`
- SDK receives animation data from the backend and synchronizes playback
- Best for: Real-time audio input scenarios
2. Host Mode
- Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarSDK.initialize()`
- Host application manages its own network/data fetching
- Host application provides both audio and animation data
- SDK only handles synchronized playback
- Best for: Custom data sources, pre-recorded content, or custom network implementations
Note: The playback mode is determined by `drivingServiceMode` in the `AvatarSDK.initialize()` configuration.
Fallback Mechanism
The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
- SDK Mode Connection Failure: If connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. Audio data can still be sent and will play normally, even though no animation data will be received. This ensures audio playback is not interrupted.
- SDK Mode Server Error: If the server returns an error after connection is established, the SDK automatically enters audio-only mode for that session.
- Host Mode: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode.
- Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
- The fallback mode is interruptible, just like normal playback mode.
- Connection state callbacks (`onConnectionState`) will notify you when the connection fails or times out.
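The fallback behavior above can be surfaced in the UI via the `onConnectionState` callback. A hedged sketch follows: the status strings and UI wiring are illustrative, not SDK output; only the four state values come from the ConnectionState enum documented below.

```typescript
// The four values of the ConnectionState enum, as string literals.
type ConnectionStateValue = 'disconnected' | 'connecting' | 'connected' | 'failed'

// Map a connection state to a user-facing status line (wording is ours).
function describeConnectionState(state: ConnectionStateValue): string {
  switch (state) {
    case 'connecting': return 'Connecting to driving service…'
    case 'connected': return 'Connected: audio + animation'
    case 'failed': return 'Fallback: audio-only (no animation data)'
    case 'disconnected': return 'Disconnected'
  }
}

// Hypothetical wiring in your app:
// avatarView.controller.onConnectionState = (state) => {
//   statusLabel.textContent = describeConnectionState(state)
// }
```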
Data Flow
SDK Mode Flow
```
Audio input (PCM16 mono)
  ↓
AvatarController.send()
  ↓
Backend processing → Animation data
  ↓
SDK synchronizes audio + animation playback
  ↓
GPU rendering → Canvas
```
Host Mode Flow
```
External data source (audio + animation)
  ↓
AvatarController.yieldAudioData(audioChunk) → returns conversationId
AvatarController.yieldFramesData(dataArray, conversationId)
  ↓
SDK synchronizes audio + animation playback
  ↓
GPU rendering → Canvas
```
Audio Format Requirements
⚠️ Important: The SDK requires audio data to be in mono PCM16 format:
- Sample Rate: Configurable via `audioFormat.sampleRate` in SDK initialization (default: 16000 Hz)
  - Supported sample rates: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
  - The configured sample rate is used for both audio recording and playback
- Channels: Mono (single channel) - fixed to 1 channel
- Format: PCM16 (16-bit signed integer, little-endian)
- Byte Order: Little-endian
Audio Data Format:
- Each sample is 2 bytes (16-bit signed integer, little-endian)
- Audio data should be provided as `ArrayBuffer` or `Uint8Array`
- For example, at a 16 kHz sample rate: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
- At a 48 kHz sample rate: 1 second of audio = 48000 samples × 2 bytes = 96000 bytes
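The size arithmetic above generalizes to a one-line helper (a sketch; the function name is ours, not SDK API):

```typescript
// Mono PCM16 size: bytes = seconds × sampleRate × 2 (2 bytes per 16-bit sample).
function pcm16ByteLength(seconds: number, sampleRate: number): number {
  return Math.round(seconds * sampleRate * 2)
}
```

For example, `pcm16ByteLength(1, 16000)` gives 32000 bytes and `pcm16ByteLength(1, 48000)` gives 96000 bytes, matching the figures above.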
Audio Data Source:
The audioData parameter represents raw PCM16 audio samples at the configured sample rate, in mono. Common audio sources include:
- PCM files: Raw PCM16 files can be read directly as `ArrayBuffer` or `Uint8Array` and sent to the SDK (ensure the sample rate matches the configuration)
- WAV files: WAV files contain PCM16 audio data in their data chunk. After extracting the PCM data from the WAV container, it can be sent to the SDK (may require resampling if the sample rate differs)
- MP3 files: MP3 files need to be decoded first (e.g., using `AudioContext.decodeAudioData()` or a decoder library), then converted from the decoded format to PCM16 before sending to the SDK
- Microphone input: Real-time microphone audio needs to be captured and converted to PCM16 at the configured sample rate before sending
- Other audio sources: Any audio source must be converted to mono PCM16 at the configured sample rate before sending
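For microphone input, the capture pipeline (getUserMedia plus an AudioWorklet or ScriptProcessor) delivers Float32 samples that must be converted to PCM16. The capture wiring is omitted here; this hedged sketch shows only the sample conversion step, the same float-to-int16 mapping used in the MP3 example.

```typescript
// Convert Float32 samples (-1.0..1.0) to little-endian PCM16 bytes.
function float32ToPcm16(samples: Float32Array): ArrayBuffer {
  const out = new ArrayBuffer(samples.length * 2)
  const view = new DataView(out)
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])) // clamp to [-1, 1]
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true) // little-endian
  }
  return out
}
```

The resulting `ArrayBuffer` can then be passed to `send()` (SDK mode) or, wrapped in a `Uint8Array`, to `yieldAudioData()` (Host mode).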
Example: Processing WAV and MP3 Files:
```typescript
// WAV file processing
async function processWAVFile(wavFile: File): Promise<ArrayBuffer> {
  const arrayBuffer = await wavFile.arrayBuffer()
  const view = new DataView(arrayBuffer)
  // Check RIFF header ("RIFF" read as a little-endian uint32)
  if (view.getUint32(0, true) !== 0x46464952) { // "RIFF"
    throw new Error('Invalid WAV file')
  }
  // Find the "data" chunk (offset may vary)
  const dataOffset = 44 // Standard WAV header size
  // For non-standard WAV files, you may need to search for the "data" chunk.
  // This is a simplified example - production code should parse chunks properly.
  return arrayBuffer.slice(dataOffset)
}

// MP3 file processing
async function processMP3File(mp3File: File, targetSampleRate: number): Promise<ArrayBuffer> {
  const arrayBuffer = await mp3File.arrayBuffer()
  const audioContext = new AudioContext({ sampleRate: targetSampleRate })
  // Decode MP3 to AudioBuffer
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer.slice(0))
  // Convert AudioBuffer to PCM16 ArrayBuffer
  const length = audioBuffer.length
  const channels = audioBuffer.numberOfChannels
  const pcm16Buffer = new ArrayBuffer(length * 2)
  const pcm16View = new DataView(pcm16Buffer)
  // Mix down to mono if stereo
  const sourceData = channels === 1
    ? audioBuffer.getChannelData(0)
    : new Float32Array(length)
  if (channels > 1) {
    const leftChannel = audioBuffer.getChannelData(0)
    const rightChannel = audioBuffer.getChannelData(1)
    for (let i = 0; i < length; i++) {
      sourceData[i] = (leftChannel[i] + rightChannel[i]) / 2 // Mix to mono
    }
  }
  // Convert float32 (-1.0 to 1.0) to int16 (-32768 to 32767)
  for (let i = 0; i < length; i++) {
    const sample = Math.max(-1, Math.min(1, sourceData[i])) // Clamp
    const int16Sample = sample < 0 ? sample * 0x8000 : sample * 0x7FFF
    pcm16View.setInt16(i * 2, int16Sample, true) // little-endian
  }
  await audioContext.close()
  return pcm16Buffer
}

// Usage example:
// const wavPcmData = await processWAVFile(wavFile)
// avatarView.controller.send(wavPcmData, false)
//
// const mp3PcmData = await processMP3File(mp3File, 16000) // 16kHz
// avatarView.controller.send(mp3PcmData, false)
```
Resampling:
- If your audio source is at a different sample rate, you must resample it to match the configured sample rate before sending it to the SDK
- For high-quality resampling, we recommend the Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
- See the example projects for a resampling implementation
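A hedged resampling sketch along those lines is shown below. It is browser-only (`OfflineAudioContext` does not exist in Node), and the helper that computes the output frame count is the only part asserted here; the function names are ours.

```typescript
// Output frame count when resampling inputFrames from inputRate to targetRate.
function resampledLength(inputFrames: number, inputRate: number, targetRate: number): number {
  return Math.ceil((inputFrames * targetRate) / inputRate)
}

// Resample an AudioBuffer to mono at targetRate via OfflineAudioContext,
// which applies anti-aliasing filtering internally during rendering.
async function resampleToMono(buffer: AudioBuffer, targetRate: number): Promise<Float32Array> {
  const frames = resampledLength(buffer.length, buffer.sampleRate, targetRate)
  const ctx = new OfflineAudioContext(1, frames, targetRate) // 1 channel = mono output
  const source = ctx.createBufferSource()
  source.buffer = buffer
  source.connect(ctx.destination)
  source.start()
  const rendered = await ctx.startRendering()
  return rendered.getChannelData(0) // Float32 samples; convert to PCM16 before sending
}
```

The returned Float32 samples still need the float-to-PCM16 conversion shown in the MP3 example before being sent to the SDK.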
Configuration Example:
```typescript
const configuration: Configuration = {
  environment: Environment.cn,
  audioFormat: {
    channelCount: 1, // Fixed to 1 (mono)
    sampleRate: 48000 // Choose from: 8000, 16000, 22050, 24000, 32000, 44100, 48000
  }
}
```
📚 API Reference
AvatarSDK
The core management class of the SDK, responsible for initialization and global configuration.
```typescript
// Initialize SDK
await AvatarSDK.initialize(appId: string, configuration: Configuration)

// Check initialization status
const isInitialized = AvatarSDK.isInitialized

// Get initialized app ID
const appId = AvatarSDK.appId

// Get configuration
const config = AvatarSDK.configuration

// Set Session Token (required for authentication)
// You must obtain a valid Session Token from your SDK provider
// See the Authentication section for more details
AvatarSDK.setSessionToken('your-session-token')

// Set userId (optional, for telemetry)
AvatarSDK.setUserId('user-id')

// Get sessionToken
const sessionToken = AvatarSDK.sessionToken

// Get userId
const userId = AvatarSDK.userId

// Get SDK version
const version = AvatarSDK.version

// Cleanup resources (must be called when no longer in use)
AvatarSDK.cleanup()
```
AvatarManager
Avatar resource manager, responsible for downloading, caching, and loading avatar data. Use the singleton instance via AvatarManager.shared.
```typescript
// Get singleton instance
const manager = AvatarManager.shared

// Load avatar
const avatar = await manager.load(
  id: string,
  onProgress?: (progress: LoadProgressInfo) => void
)

// Clear cache
manager.clearAll()
```
AvatarView
3D rendering view, responsible for 3D rendering only. Internally automatically creates and manages AvatarController.
```typescript
constructor(avatar: Avatar, container: HTMLElement)
```
Parameters:
- `avatar`: Avatar instance
- `container`: Canvas container element (required)
  - Canvas automatically uses the full size of the container (width and height)
  - Canvas aspect ratio adapts to the container size - set the container size to control the aspect ratio
  - Canvas will be automatically added to the container
  - SDK automatically handles resize events via ResizeObserver
Playback Mode:
- The playback mode is determined by `drivingServiceMode` in the `AvatarSDK.initialize()` configuration
- The playback mode is fixed when creating `AvatarView` and persists throughout its lifecycle
- Cannot be changed after creation
```typescript
// Create view (Canvas is automatically added to container)
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)

// Wait for first frame to render
avatarView.onFirstRendering = () => {
  // First frame rendered
}

// Get or set avatar transform (position and scale)
// Get current transform
const currentTransform = avatarView.transform // { x: number, y: number, scale: number }

// Set transform
avatarView.transform = { x, y, scale }
// - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
// - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
// - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)

// Cleanup resources (must be called before switching avatars)
avatarView.dispose()
```
Avatar Switching Example:
```typescript
// To switch avatars, simply dispose the old view and create a new one
if (currentAvatarView) {
  currentAvatarView.dispose()
}

// Load new avatar
const newAvatar = await avatarManager.load('new-character-id')

// Create new AvatarView
currentAvatarView = new AvatarView(newAvatar, container)

// SDK mode: start connection (will throw an error if not in SDK mode)
await currentAvatarView.controller.start()
```
AvatarController
Audio/animation playback controller, manages synchronized playback of audio and animation. Automatically handles network communication in SDK mode.
Two Usage Patterns:
SDK Mode Methods
```typescript
// ⚠️ CRITICAL: Initialize audio context first (MUST be called in user gesture context)
// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
// to satisfy browser security policies. Calling it outside a user gesture will fail.
// All audio operations (start, send, etc.) require prior initialization.
button.addEventListener('click', async () => {
  // Initialize audio context - MUST be in user gesture context
  await avatarView.controller.initializeAudioContext()

  // Start service
  await avatarView.controller.start()

  // Send audio data (must be mono PCM16 matching the configured sample rate)
  const conversationId = avatarView.controller.send(audioData: ArrayBuffer, end: boolean)
  // Returns: conversationId - Conversation ID for this conversation session
  // end: false (default) - Continue sending audio data for the current conversation
  // end: true - Mark the end of the current conversation round. After end=true, sending
  //   new audio data will interrupt any ongoing playback from the previous round
})

// Close service
avatarView.controller.close()
```
Host Mode Methods
```typescript
// ⚠️ CRITICAL: Initialize audio context first (MUST be called in user gesture context)
// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
// to satisfy browser security policies. Calling it outside a user gesture will fail.
// All audio operations (yieldAudioData, yieldFramesData, etc.) require prior initialization.
button.addEventListener('click', async () => {
  // Initialize audio context - MUST be in user gesture context
  await avatarView.controller.initializeAudioContext()

  // Stream audio chunks (must be mono PCM16 matching the configured sample rate)
  const conversationId = avatarView.controller.yieldAudioData(
    data: Uint8Array, // Audio chunk data (PCM16 format)
    isLast: boolean = false // Whether this is the last chunk
  )
  // Returns: conversationId - Conversation ID for this audio session

  // Stream animation keyframes (requires conversationId from audio data)
  avatarView.controller.yieldFramesData(
    keyframesDataArray: (Uint8Array | ArrayBuffer)[], // Animation keyframes binary data array
    conversationId: string // Conversation ID (required)
  )
})
```
⚠️ Important: Conversation ID (conversationId) Management
SDK Mode:
- `send()` returns a conversationId to distinguish each conversation round
- `end=true` marks the end of a conversation round
Host Mode:
- `yieldAudioData()` returns a conversationId (automatically generated when a new session starts)
- `yieldFramesData()` requires a valid conversationId parameter
- Animation data with a mismatched conversationId will be discarded
- Use `getCurrentConversationId()` to retrieve the currently active conversationId
Common Methods (Both Modes)
```typescript
// Pause playback (from playing state)
avatarView.controller.pause()

// Resume playback (from paused state)
await avatarView.controller.resume()

// Interrupt current playback (stops and clears data)
avatarView.controller.interrupt()

// Clear all data and resources
avatarView.controller.clear()

// Get current conversation ID (for Host mode)
const conversationId = avatarView.controller.getCurrentConversationId()
// Returns: Current conversationId for the active audio session, or null if no active session

// Volume control (affects only the avatar audio player, not system volume)
avatarView.controller.setVolume(0.5) // Set volume to 50% (0.0 to 1.0)
const currentVolume = avatarView.controller.getVolume() // Get current volume (0.0 to 1.0)

// Set event callbacks
avatarView.controller.onConnectionState = (state: ConnectionState) => {} // SDK mode only
avatarView.controller.onConversationState = (state: ConversationState) => {}
avatarView.controller.onError = (error: Error) => {} // Usually AvatarError (includes a code for SDK/server errors)
```
Avatar Transform Methods
```typescript
// Get or set avatar transform (position and scale in canvas)
// Get current transform
const currentTransform = avatarView.transform // { x: number, y: number, scale: number }

// Set transform
avatarView.transform = { x, y, scale }
// - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
// - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
// - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)

// Example:
avatarView.transform = { x: 0, y: 0, scale: 1.0 } // Center, original size
avatarView.transform = { x: 0.5, y: 0, scale: 2.0 } // Right half, double size
```
Important Notes:
- `start()` and `close()` are only available in SDK mode
- `yieldAudioData()` and `yieldFramesData()` are only available in Host mode
- `pause()`, `resume()`, `interrupt()`, `clear()`, `getCurrentConversationId()`, `setVolume()`, and `getVolume()` are available in both modes
- The playback mode is determined when creating `AvatarView` and cannot be changed
🔧 Configuration
Configuration
```typescript
interface Configuration {
  environment: Environment
  drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
  logLevel?: LogLevel // Optional, default is 'off' (no logs)
  audioFormat?: AudioFormat // Optional, default is { channelCount: 1, sampleRate: 16000 }
  characterApiBaseUrl?: string // Optional, internal debug config, can be ignored
}

interface AudioFormat {
  readonly channelCount: 1 // Fixed to 1 (mono)
  readonly sampleRate: number // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz, default: 16000
}
```
LogLevel
Control the verbosity of SDK logs:
```typescript
enum LogLevel {
  off = 'off', // Disable all logs (default)
  error = 'error', // Only error logs
  warning = 'warning', // Warning and error logs
  all = 'all' // All logs (info, warning, error)
}
```
Note: LogLevel.off completely disables all logging, including error logs. Use with caution in production environments.
Description:
- `environment`: Specifies the environment (cn/intl); the SDK automatically uses the corresponding server addresses
- `drivingServiceMode`: Specifies the driving service mode
  - `DrivingServiceMode.sdk` (default): SDK mode - SDK handles network communication automatically
  - `DrivingServiceMode.host`: Host mode - Host application provides audio and animation data
- `logLevel`: Controls the verbosity of SDK logs
  - `LogLevel.off` (default): Disable all logs
  - `LogLevel.error`: Only error logs
  - `LogLevel.warning`: Warning and error logs
  - `LogLevel.all`: All logs (info, warning, error)
- `audioFormat`: Configures the audio sample rate and channel count
  - `channelCount`: Fixed to 1 (mono)
  - `sampleRate`: Audio sample rate in Hz (default: 16000)
    - Supported values: 8000, 16000, 22050, 24000, 32000, 44100, 48000
    - The configured sample rate is used for both audio recording and playback
- `characterApiBaseUrl`: Internal debug config, can be ignored
- `sessionToken`: Required for authentication. Set separately via `AvatarSDK.setSessionToken()`, not in Configuration. See the Authentication section for details
```typescript
enum Environment {
  cn = 'cn', // China region
  intl = 'intl', // International region
}
```
CameraConfig
```typescript
interface CameraConfig {
  position: [number, number, number] // Camera position
  target: [number, number, number] // Camera target
  fov: number // Field of view angle
  near: number // Near clipping plane
  far: number // Far clipping plane
  up?: [number, number, number] // Up direction
  aspect?: number // Aspect ratio
}
```
📊 State Management
ConnectionState
```typescript
enum ConnectionState {
  disconnected = 'disconnected',
  connecting = 'connecting',
  connected = 'connected',
  failed = 'failed'
}
```
ConversationState
```typescript
enum ConversationState {
  idle = 'idle', // Idle state (breathing animation)
  playing = 'playing', // Playing state (active conversation)
  pausing = 'pausing' // Pausing state (paused during playback)
}
```
State Description:
- `idle`: Avatar is in the idle state (breathing animation), waiting for a conversation to start
- `playing`: Avatar is playing conversation content (including during transition animations)
- `pausing`: Avatar playback is paused (e.g., when `end=false` and waiting for more audio data)
Note: During transition animations, the target state is notified immediately:
- When transitioning from `idle` to `playing`, the `playing` state is notified immediately
- When transitioning from `playing` to `idle`, the `idle` state is notified immediately
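A hedged sketch of consuming these states, e.g. to toggle a talk button. The enable/disable policy below is an assumption for illustration; only the three state values come from the ConversationState enum.

```typescript
// The three values of the ConversationState enum, as string literals.
type ConversationStateValue = 'idle' | 'playing' | 'pausing'

// Example policy (ours): allow starting a new recording only when not actively playing.
function canStartRecording(state: ConversationStateValue): boolean {
  return state === 'idle' || state === 'pausing'
}

// Hypothetical wiring in your app:
// avatarView.controller.onConversationState = (state) => {
//   micButton.disabled = !canStartRecording(state)
// }
```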
🎨 Rendering System
The SDK automatically selects the best rendering backend for your browser; no manual configuration is needed.
🚨 Error Handling
AvatarError
The SDK uses custom error types, providing more detailed error information:
```typescript
import { AvatarError } from '@spatialwalk/avatarkit'

try {
  await avatarView.controller.start()
} catch (error) {
  if (error instanceof AvatarError) {
    console.error('SDK Error:', error.message, error.code)
  } else {
    console.error('Unknown error:', error)
  }
}
```
Error Callbacks
```typescript
import { AvatarError } from '@spatialwalk/avatarkit'

avatarView.controller.onError = (error: Error) => {
  if (error instanceof AvatarError) {
    console.error('AvatarController error:', error.message, error.code)
    return
  }
  console.error('AvatarController unknown error:', error)
}
```
In SDK mode, a server MESSAGE_SERVER_ERROR is forwarded to onError as an AvatarError:
- `error.message`: the server-returned error message
- `error.code` mapping:
  - 401 -> `sessionTokenExpired`
  - 400 -> `sessionTokenInvalid`
  - 404 -> `avatarIDUnrecognized`
  - other HTTP statuses -> the original status code string (for example, `"500"`)
🔄 Resource Management
Lifecycle Management
SDK Mode Lifecycle
```typescript
// Initialize
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)
await avatarView.controller.start()

// Use
avatarView.controller.send(audioData, false)

// Cleanup - dispose() automatically cleans up all resources, including connections
avatarView.dispose()
```
Host Mode Lifecycle
```typescript
// Initialize
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)

// Use
const conversationId = avatarView.controller.yieldAudioData(audioChunk, false)
avatarView.controller.yieldFramesData(keyframesDataArray, conversationId)

// Cleanup - dispose() automatically cleans up all resources, including playback data
avatarView.dispose()
```
⚠️ Important Notes:
- `dispose()` automatically cleans up all resources, including:
  - Network connections (SDK mode)
  - Playback data and animation resources (both modes)
  - Render system and canvas elements
  - All event listeners and callbacks
- Not calling `dispose()` properly may cause resource leaks and rendering errors
- If you need to manually close connections or clear playback data before disposing, you can call `avatarView.controller.close()` (SDK mode) or `avatarView.controller.clear()` (both modes) first, but this is not required since `dispose()` handles it automatically
Memory Optimization
- SDK automatically manages memory allocation
- Supports dynamic loading/unloading of avatar and animation resources
🌐 Browser Compatibility
- Chrome/Edge 90+ (WebGPU recommended)
- Firefox 90+ (WebGL)
- Safari 14+ (WebGL)
- Mobile iOS 14+, Android 8+
📝 License
MIT License
🤝 Contributing
Issues and Pull Requests are welcome!
📞 Support
For questions, please contact:
- Email: [email protected]
- Documentation: https://docs.spatialreal.ai
