agora-agent-client-toolkit
v1.1.0
Published
Agora Agent Client Toolkit — real-time voice agent integration for web
Maintainers
Readme
agora-agent-client-toolkit
Framework-agnostic TypeScript SDK for building real-time conversational AI experiences with Agora RTC and RTM.
Install
npm install agora-agent-client-toolkit agora-rtc-sdk-ng
# RTM is optional — only required for sendText, sendImage, interrupt
npm install agora-rtmQuick Start
import AgoraRTC from 'agora-rtc-sdk-ng';
import {
AgoraVoiceAI,
AgoraVoiceAIEvents,
TranscriptHelperMode,
} from 'agora-agent-client-toolkit';
// 1. Create the RTC client
const rtcClient = AgoraRTC.createClient({ mode: 'rtc', codec: 'vp8' });
// 2. Initialize the AI singleton
const ai = await AgoraVoiceAI.init({
rtcEngine: rtcClient,
renderMode: TranscriptHelperMode.WORD,
});
// 3. Subscribe to events
// TRANSCRIPT_UPDATED delivers the complete conversation history — replace, don't append
ai.on(AgoraVoiceAIEvents.TRANSCRIPT_UPDATED, (transcript) => {
console.log('Transcript:', transcript);
});
ai.on(AgoraVoiceAIEvents.AGENT_STATE_CHANGED, (agentUserId, event) => {
console.log('Agent state:', event.state);
});
// 4. Join and publish via the RTC client directly — AgoraVoiceAI does not wrap join/publish
await rtcClient.join('YOUR_APP_ID', 'YOUR_CHANNEL', 'YOUR_TOKEN', null);
const micTrack = await AgoraRTC.createMicrophoneAudioTrack();
await rtcClient.publish([micTrack]);
// 5. Start receiving AI messages
await ai.subscribeMessage('YOUR_CHANNEL');Prerequisites
| Requirement | Version |
|-------------|---------|
| agora-rtc-sdk-ng | ≥ 4.23.4 |
| agora-rtm | ≥ 2.0.0 (optional) |
| TypeScript | ≥ 5.0 |
| Node.js / browser | ES2020+ |
For WORD-mode transcript rendering, set ENABLE_AUDIO_PTS_METADATA before creating the RTC client:
AgoraRTC.setParameter('ENABLE_AUDIO_PTS_METADATA', true);
const rtcClient = AgoraRTC.createClient({ mode: 'rtc', codec: 'vp8' });Agent start parameters
The following parameters must be set when starting the AI agent via the Agora REST API. Missing any will cause events to not fire — no runtime error is thrown.
| Parameter | Required for |
|-----------|-------------|
| advanced_features.enable_rtm: true | AGENT_STATE_CHANGED, MESSAGE_RECEIPT_UPDATED, MESSAGE_ERROR, MESSAGE_SAL_STATUS |
| parameters.data_channel: "rtm" | Same as above |
| parameters.enable_metrics: true | AGENT_METRICS |
| parameters.enable_error_message: true | AGENT_ERROR |
Configuration Reference
AgoraVoiceAIConfig fields:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| rtcEngine | IAgoraRTCClient | Yes | Agora RTC client instance |
| rtmConfig | { rtmEngine: RTMClient } | No | RTM client for text/image messaging and agent interruption |
| renderMode | TranscriptHelperMode | No | TEXT, WORD, CHUNK, or AUTO. If omitted, defaults to AUTO — mode is detected from the first agent message. |
| enableLog | boolean | No | Enable debug logging (default: false) |
| enableAgoraMetrics | boolean | No | Load @agora-js/report for usage metrics (default: false) |
API Reference
AgoraVoiceAI
Static methods
AgoraVoiceAI.init(config: AgoraVoiceAIConfig): Promise<AgoraVoiceAI>
AgoraVoiceAI.getInstance(): AgoraVoiceAI // throws NotFoundError if not yet initializedInstance methods
ai.subscribeMessage(channel: string): void
ai.unsubscribe(): void
ai.chat(agentUserId: string, message: ChatMessageText | ChatMessageImage): Promise<void>
ai.sendText(agentUserId: string, message: ChatMessageText): Promise<void> // requires rtmConfig
ai.sendImage(agentUserId: string, message: ChatMessageImage): Promise<void> // requires rtmConfig
ai.interrupt(agentUserId: string): Promise<void> // requires rtmConfig
ai.destroy(): void
ai.getCfg(): { rtcEngine, renderMode, channel, enableLog }
ai.on(event, handler): void
ai.off(event, handler): voidNote:
AgoraVoiceAIdoes not wrap RTC join/publish. CallrtcClient.join()andrtcClient.publish()directly on your Agora RTC client before callingai.subscribeMessage().
Advanced API
ai.getState(): AgoraVoiceAIState
// Returns a snapshot of SDK state: { initialized, channel, hasRtm, renderMode, listenerCounts }
ai.setMaxListeners(n: number): AgoraVoiceAI
// Set the maximum number of listeners per event before a warning is logged.
// Defaults to 10. Set to 0 to disable the warning.
ai.setLogLevel(level: EventLogLevel): AgoraVoiceAI
// Set the log verbosity for event lifecycle.
// EventLogLevel.NONE (default) | EventLogLevel.ERRORS | EventLogLevel.DEBUG
ai.once(event, handler): AgoraVoiceAI
// Register a one-time handler that is automatically removed after the first invocation.
ai.removeAllEventListeners(): void
// Remove all registered event handlers. Called internally by destroy().Events
TRANSCRIPT_UPDATED
Fires whenever the transcript changes. Delivers the complete conversation history array — not an incremental update. Always re-render your UI from the full array.
ai.on(AgoraVoiceAIEvents.TRANSCRIPT_UPDATED, (transcript: TranscriptHelperItem[]) => {
renderTranscript(transcript); // replace, don't append
});Top-level fields per item: uid, stream_id, turn_id, _time, text, status (TurnStatus), metadata.
Word-level timing is at metadata.words — not at the top level.
AGENT_STATE_CHANGED (requires rtmConfig)
Fires on agent lifecycle transitions.
ai.on(AgoraVoiceAIEvents.AGENT_STATE_CHANGED, (agentUserId, event) => {
// agentUserId: string
// event: { state: AgentState, turnID: string, timestamp: number, reason?: string }
console.log(agentUserId, event.state);
});AgentState values: idle | listening | thinking | speaking | silent
Requires
advanced_features.enable_rtm: trueandparameters.data_channel: "rtm"in the agent start request.
AGENT_INTERRUPTED
Fires when the agent's current turn is interrupted.
ai.on(AgoraVoiceAIEvents.AGENT_INTERRUPTED, (agentUserId, event) => {
// agentUserId: string
// event: { turnID: number, timestamp: number }
});AGENT_METRICS
Fires when a performance metrics message is received from an agent module.
ai.on(AgoraVoiceAIEvents.AGENT_METRICS, (agentUserId, metrics) => {
// agentUserId: string
// metrics: { type: ModuleType, name: string, value: number, timestamp: number }
});
// ModuleType: 'llm' | 'mllm' | 'tts' | 'context' | 'unknown'Requires
parameters.enable_metrics: truein the agent start request.
AGENT_ERROR
Fires when an agent module reports an error.
ai.on(AgoraVoiceAIEvents.AGENT_ERROR, (agentUserId, error) => {
// agentUserId: string
// error: { type: ModuleType, code: number, message: string, timestamp: number }
});Requires
parameters.enable_error_message: truein the agent start request.
MESSAGE_RECEIPT_UPDATED (requires rtmConfig)
Fires when a delivery or read receipt is received for a sent message.
ai.on(AgoraVoiceAIEvents.MESSAGE_RECEIPT_UPDATED, (agentUserId, receipt) => {
// agentUserId: string
// receipt: { moduleType: ModuleType, messageType: ChatMessageType, message: string, turnId: string }
});
// ChatMessageType: 'text' | 'image' | 'unknown'MESSAGE_ERROR (requires rtmConfig)
Fires when a chat message fails to deliver.
ai.on(AgoraVoiceAIEvents.MESSAGE_ERROR, (agentUserId, error) => {
// agentUserId: string
// error: { type: ChatMessageType, code: number, message: string, timestamp: number }
// type = the type of message that errored (ChatMessageType, not ModuleType)
});MESSAGE_SAL_STATUS (requires rtmConfig)
Fires when the Speech Activity Level (SAL) registration status changes.
ai.on(AgoraVoiceAIEvents.MESSAGE_SAL_STATUS, (agentUserId, salStatus) => {
// agentUserId: string
// salStatus: MessageSalStatusData — { status: MessageSalStatus, timestamp: number, ... }
});
// MessageSalStatus: VP_DISABLED | VP_UNREGISTER | VP_REGISTERING | VP_REGISTER_SUCCESS | VP_REGISTER_FAIL | VP_REGISTER_DUPLICATETranscriptHelperMode
| Value | Behavior |
|-------|----------|
| TEXT | Full-sentence updates, lower frequency |
| WORD | Word-by-word updates using PTS metadata (requires ENABLE_AUDIO_PTS_METADATA) |
| CHUNK | Chunked message assembly mode for stream data |
| AUTO | Mode is detected automatically from the first agent message (default when omitted) |
| (omitted) | Same as AUTO — detected automatically from the first agent message |
ChunkedMessageAssembler
Reassembles multi-part stream messages sent in msg_id|part_idx|part_sum|base64_data format. Used internally; also exported for custom stream handling.
const assembler = new ChunkedMessageAssembler();
const result = assembler.assemble(rawStreamMessage); // returns parsed object or null
assembler.clear(); // release cached chunksMetrics
By default, the SDK logs to console.debug. Opt in to Agora metrics:
const ai = await AgoraVoiceAI.init({ rtcEngine, enableAgoraMetrics: true });Advanced: implement IMetricsReporter or use the exported ConsoleMetricsReporter / AgoraMetricsReporter directly.
Error Reference
Thrown Errors
| Error Class | When Thrown | Recovery |
|------------|------------|----------|
| NotInitializedError | getInstance() or getCfg() called before init() | Call await AgoraVoiceAI.init(config) first |
| RTMRequiredError | sendText(), sendImage(), or interrupt() called without RTM | Pass rtmConfig: { rtmEngine } in init() config |
| ConversationalAIError | chat() called with unsupported message type | Check message.messageType is TEXT or IMAGE |
All error classes extend ConversationalAIError, which extends Error. Use instanceof to catch specific error types.
Error Events
| Event | Payload | When Emitted |
|-------|---------|-------------|
| AGENT_ERROR | { type: ModuleType, code: number, message: string, timestamp: number } | Agent pipeline error (LLM, TTS, context). Requires enable_error_message: true in agent config |
| MESSAGE_ERROR | { type: ChatMessageType, code: number, message: string, timestamp: number } | RTM message delivery failure. Requires RTM |
Error Handling Example
import {
AgoraVoiceAI,
AgoraVoiceAIEvents,
RTMRequiredError,
ModuleType,
} from 'agora-agent-client-toolkit';
// Catch thrown errors
try {
await ai.sendText(agentUserId, {
messageType: ChatMessageType.TEXT,
priority: ChatMessagePriority.INTERRUPTED,
responseInterruptable: true,
text: 'Hello',
});
} catch (e) {
if (e instanceof RTMRequiredError) {
console.error('RTM not configured — pass rtmConfig to init()');
}
}
// Listen for error events
ai.on(AgoraVoiceAIEvents.AGENT_ERROR, (agentUserId, error) => {
console.error(`Agent error [${error.type}]: ${error.message}`);
if (error.type === ModuleType.LLM) {
showToast('AI model error — please retry');
}
});Sending an Image
await ai.sendImage(agentUserId, {
messageType: ChatMessageType.IMAGE,
uuid: crypto.randomUUID(),
url: 'https://example.com/image.png',
});Graceful Cleanup
window.addEventListener('beforeunload', () => {
const ai = AgoraVoiceAI.getInstance();
ai.unsubscribe();
ai.destroy();
});Troubleshooting
Events not firing
Symptoms: AGENT_STATE_CHANGED, AGENT_METRICS, or AGENT_ERROR events never fire.
Cause: These events require specific parameters when starting the agent via the Agora REST API. Missing parameters cause events to silently not emit — no runtime error is thrown.
Fix: Ensure your agent start request includes:
| Event | Required Agent Parameter |
|-------|------------------------|
| AGENT_STATE_CHANGED | advanced_features.enable_rtm: true AND parameters.data_channel: "rtm" |
| AGENT_METRICS | parameters.enable_metrics: true |
| AGENT_ERROR | parameters.enable_error_message: true |
See the Agora Conversational AI documentation for the full agent start request format.
Standalone hooks not receiving events
Cause: Standalone hooks (useTranscript, useAgentState, etc.) need access to the AgoraVoiceAI instance. Without a ConversationalAIProvider, they fall back to a single getInstance() attempt which may miss the instance if init() hasn't completed yet.
Fix: Wrap your component tree in ConversationalAIProvider — standalone hooks connect instantly via React context:
<ConversationalAIProvider config={{ channel: 'my-channel' }}>
<TranscriptPanel /> {/* useTranscript() connects via context */}
<StatusBar /> {/* useAgentState() connects via context */}
</ConversationalAIProvider>Transcript not updating in WORD mode
Cause: WORD mode requires PTS (Presentation Timestamp) metadata from the RTC audio stream. This must be enabled before creating the RTC client.
Fix: Call AgoraRTC.setParameter('ENABLE_AUDIO_PTS_METADATA', true) before AgoraRTC.createClient(), or use renderMode: TranscriptHelperMode.TEXT which doesn't require PTS.
Optional Dependencies
| Package | Feature | Fallback behavior |
|---------|---------|-------------------|
| @agora-js/report | enableAgoraMetrics: true | Falls back to console.debug |
