@unith-ai/core-client
v2.0.80
Core TypeScript SDK for building digital human experiences with Unith AI
Unith Core Client TypeScript SDK
An SDK for building complex digital human experiences in JavaScript/TypeScript that run on Unith AI.
Prerequisite
Before using this library, you need an account on Unith AI, a digital human created in that account, and your API key. You can create an account here in minutes!
Installation
Install the package in your project with your package manager.
npm install @unith-ai/core-client
# or
yarn add @unith-ai/core-client
# or
pnpm add @unith-ai/core-client
Azure Speech SDK (Optional)
If using microphoneProvider: "azure", install the Azure Speech SDK:
npm install microsoft-cognitiveservices-speech-sdk
The SDK is dynamically imported only when the Azure provider is used, so it won't affect bundle size if you use a different provider like eleven_labs.
Usage
This library is designed for use in plain JavaScript applications or to serve as a foundation for framework-specific implementations. Before integrating it, verify if a dedicated library exists for your particular framework. That said, it's compatible with any project built on JavaScript.
Initialize Digital Human
First, initialize the Conversation instance:
const conversation = await Conversation.startDigitalHuman(options);
This will establish a WebSocket connection and initialize the digital human with realtime audio & video streaming capabilities.
Session Configuration
The options passed to startDigitalHuman specify how the session is established:
const conversation = await Conversation.startDigitalHuman({
  orgId: "your-org-id",
  headId: "your-head-id",
  element: document.getElementById("video-container"), // HTML element for video output
  apiKey: "your-api-key",
  allowWakeLock: true,
  username: "test-user",
  ...callbacks,
});
Required Parameters
- orgId - Your organization ID
- headId - The digital human head ID to use
- apiKey - API key for authentication (default: "")
- element - HTML element where the video will be rendered
Optional Parameters
- mode - Conversation mode (default: "default")
- language - Language code for the conversation (default: browser language)
- allowWakeLock - Prevent screen from sleeping during conversation (default: true)
- microphoneProvider ("azure" | "eleven_labs") - Provider for the microphone
- voiceInterruptions (boolean, default: false) - Flag to enable response interruption when a voice signal is recognized
- dataTag (string) - Custom data to be used for analytics
- username (string) - Custom identifier for the user
- microphoneOptions - Callbacks for microphone events
  - onMicrophoneSpeechRecognitionResult ({ transcript: string }) - Called when the microphone recognizes your user's speech. This returns a transcript. The microphone doesn't automatically commit or send the user's text to our AI, so the ideal behavior is to call the .sendMessage method with the transcript.
  - onMicrophonePartialSpeechRecognitionResult () - Called when the microphone recognizes that the user is trying to speak. This doesn't return a transcript. The ideal behavior is to check whether the digital human is currently responding (speaking) and, if so, trigger the stop response method (.stopResponse).
  - onMicrophoneStatusChange ({status}) - Called when the microphone status changes
    - status ("ON" | "OFF" | "PROCESSING") - Shows the current status of the microphone.
  - onMicrophoneError ({ message: string }) - Called with the error message when the microphone has an error.
- elevenLabsOptions (ElevenLabsOptions) - Custom options to customize the behavior of the ElevenLabs STT provider
  - noiseSuppression (Boolean)
  - vadSilenceThresholdSecs (Number)
  - vadThreshold (Number)
  - minSpeechDurationMs (Number)
  - minSilenceDurationMs (Number)
  - disableDynamicSpeechRecognition (Boolean) - Disables ElevenLabs dynamic speech recognition language detection and uses the language specified during digital human creation.
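The recommended microphone behavior described above (send the final transcript yourself, and interrupt the digital human when the user starts talking over it) can be sketched as follows. This is a minimal sketch, not the SDK's own implementation: `ConversationLike` and `createMicrophoneOptions` are hypothetical names introduced for illustration, and the `isSpeaking` flag is assumed to be maintained by your own code, for example from the onSpeakingStart and onSpeakingEnd callbacks.

```typescript
// Stand-in for the parts of the conversation instance used here.
// sendMessage and stopResponse are named in this README; the interface itself is an assumption.
interface ConversationLike {
  sendMessage(message: string): void;
  stopResponse(): Promise<void> | void;
}

// Hypothetical helper that builds the microphoneOptions callbacks.
function createMicrophoneOptions(
  getConversation: () => ConversationLike | undefined,
  isSpeaking: () => boolean,
) {
  return {
    // Final transcript: the microphone does not send it automatically, so forward it.
    onMicrophoneSpeechRecognitionResult: ({ transcript }: { transcript: string }): void => {
      const text = transcript.trim();
      if (text.length > 0) getConversation()?.sendMessage(text);
    },
    // Partial recognition: interrupt only if the digital human is mid-response.
    onMicrophonePartialSpeechRecognitionResult: (): void => {
      if (isSpeaking()) void getConversation()?.stopResponse();
    },
  };
}
```

The conversation is passed as a getter because the options object has to exist before startDigitalHuman resolves; the getter lets the callbacks pick up the instance once it is available.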
Callbacks
Register callbacks to handle various events:
- onConnect ({userId, headInfo, microphoneAccess, sessionId}) - Called when the WebSocket connection is established
  - userId (Boolean) - Unique identifier for the user's session.
  - headInfo (ConnectHeadType) - Object with data about the digital human.
    - name (String) - Digital human head name.
    - phrases (String[]) - Array of phrases set during digital human creation.
    - language (String) - Language code set during digital human creation.
    - avatar (String) - Static image URL for the digital human.
  - microphoneAccess (Boolean) - True if microphone access was granted, false otherwise.
  - sessionId (String) - Unique session identifier for a conversation.
- onDisconnect () - Called when the connection is closed
- onStatusChange ({status}) - Called when connection status changes
  - status ("connecting" | "connected" | "disconnecting" | "disconnected") - Shows the current WebSocket connection status.
- onMessage ({ timestamp, speaker, text, visible }) - Called when websocket receives a message or sends a response.
  - timestamp (Date) - Timestamp when the message was received or sent.
  - sender ("user" | "ai") - Shows who the message came from.
  - text (String) - Message text.
  - visible (Boolean) - Flag you can use to control the visibility of the message. Sometimes a message arrives before the video response starts playing; in such cases this is usually false. Listen to the onSpeakingStart event to change visibility when the video response starts playing.
- onMuteStatusChange - Called when mute status changes
- onSpeakingStart - Called when the digital human starts speaking
- onSpeakingEnd - Called when the digital human finishes speaking
- onSuggestions ({suggestions}) - Invoked when the system generates or updates query suggestions.
  - suggestions (String[]) - A list of suggested query strings.
- onTimeout - Called when the session times out due to inactivity
- onTimeoutWarning - Called before the session times out. This event warns you that the customer's session is about to end. You can call the keepSession method to extend the customer's session.
- onKeepSession - Called when a keep-alive request is processed
- onError - Called when an error occurs
- onStoppingStart - Called immediately when a stop request is initiated
- onStoppingEnd - Called once the current response stops playing
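The interaction between onMessage and onSpeakingStart described above (a message may arrive with visible set to false and should be revealed once the video response starts playing) can be handled with a small store like the one below. This is a sketch under assumptions: `TranscriptLog` and `ChatEntry` are hypothetical names, and only the `text` and `visible` fields of the message payload are used.

```typescript
// Hypothetical transcript store; only the callback names and the
// text/visible fields come from this README.
interface ChatEntry {
  text: string;
  visible: boolean;
}

class TranscriptLog {
  private entries: ChatEntry[] = [];

  // Pass as onMessage: store every message, including ones that are not visible yet.
  onMessage = ({ text, visible }: { text: string; visible: boolean }): void => {
    this.entries.push({ text, visible });
  };

  // Pass as onSpeakingStart: reveal everything received so far.
  onSpeakingStart = (): void => {
    for (const entry of this.entries) entry.visible = true;
  };

  // Texts that should currently be rendered in the chat UI.
  visibleTexts(): string[] {
    return this.entries.filter((entry) => entry.visible).map((entry) => entry.text);
  }
}
```

An instance's `onMessage` and `onSpeakingStart` properties could then be included in the callbacks passed to startDigitalHuman, with the UI rendering `visibleTexts()` after each event.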
Getting Background Video
Retrieve the idle background video URL for use in welcome screens or widget mode:
const videoUrl = await Conversation.getBackgroundVideo({
  orgId: "your-org-id",
  headId: "your-head-id",
});
Instance Methods
startSession()
Start the conversation session and begin audio & video playback:
await conversation.startSession();
This method should be called after user interaction to ensure audio context is properly initialized, especially on mobile browsers.
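One way to satisfy the user-interaction requirement is to defer startSession until the first click on a start button. The helper below is a hedged sketch: `startOnFirstClick`, `StartableSession`, and `ClickTarget` are hypothetical names, and `ClickTarget` is only a structural subset of a DOM element so that the example does not depend on browser typings.

```typescript
// Stand-in for the conversation instance; startSession is named in this README.
interface StartableSession {
  startSession(): Promise<void>;
}

// Structural subset of a DOM element (assumption for illustration).
interface ClickTarget {
  addEventListener(
    type: "click",
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Hypothetical helper: start the session on the first click only.
function startOnFirstClick(target: ClickTarget, session: StartableSession): void {
  target.addEventListener(
    "click",
    () => {
      // Runs inside a user gesture, so the browser can unlock audio playback.
      session.startSession().catch((error: unknown) => {
        console.error("Failed to start session:", error);
      });
    },
    { once: true }, // ask the target to drop the listener after the first click
  );
}
```

In a browser, the target would typically be a button element and the session the object returned by startDigitalHuman.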
sendMessage(message)
Send a text message to the digital human:
conversation.sendMessage("Hello, how are you?");
toggleMicrophone()
Toggles microphone status between ON/OFF.
conversation.toggleMicrophone();
getMicrophoneStatus()
Get the current microphone status: ON, OFF, or PROCESSING.
conversation.getMicrophoneStatus();
keepSession()
Sends a keep-alive event to prevent session timeout. Call this when you receive the onTimeoutWarning event to keep the session from timing out.
conversation.keepSession();
This clears both audio and video queues and returns the digital human to idle state.
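The onTimeoutWarning and keepSession pairing can be wired together so that the session is extended only while the user is still active. The policy below is an assumption for illustration, not SDK behavior: `createTimeoutWarningHandler`, `KeepAliveSession`, and the `maxIdleMs` threshold are hypothetical, and tracking the last activity time is left to your application.

```typescript
// Stand-in for the conversation instance; keepSession is named in this README.
interface KeepAliveSession {
  keepSession(): void;
}

// Hypothetical policy: extend the session only if the user was active recently.
function createTimeoutWarningHandler(
  getSession: () => KeepAliveSession | undefined,
  lastActivityAt: () => number, // epoch milliseconds of the last user action
  maxIdleMs: number,
  now: () => number = Date.now,
): () => void {
  return () => {
    const idleMs = now() - lastActivityAt();
    if (idleMs <= maxIdleMs) getSession()?.keepSession();
    // Otherwise do nothing and let the session time out.
  };
}
```

The returned function could be passed as the onTimeoutWarning callback; the injectable `now` parameter exists only to make the policy easy to test.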
toggleMute()
Toggle the mute status of the audio output:
const volume = await conversation.toggleMute();
console.log("New volume:", volume); // 0 for muted, 1 for unmuted
getUserId()
Get the current user's ID:
const userId = conversation.getUserId();
stopResponse()
Stops the ongoing response and notifies you via two callbacks:
- onStoppingStart() - Called immediately when stop is initiated.
- onStoppingEnd() - Called once the current response is stopped.
await conversation.stopResponse();
endSession()
End the conversation session and clean up resources:
await conversation.endSession();
This closes the WebSocket connection, releases the wake lock, and destroys audio/video outputs.
Error Handling
Always handle errors appropriately:
try {
  const conversation = await Conversation.startDigitalHuman({
    orgId: "your-org-id",
    headId: "your-head-id",
    element: videoElement,
    onError: ({ message, endConversation, type }) => {
      if (type === "toast") {
        // Show toast notification
        showToast(message);
        if (endConversation) {
          // Restart the session
          restartSession();
        }
      } else if (type === "modal") {
        // Show modal dialog
        showModal(message);
      }
    },
  });
} catch (error) {
  console.error("Failed to start digital human:", error);
}
TypeScript Support
Full TypeScript types are included:
Development
Please refer to the README.md file in the root of this repository.
