@telnyx/ai-agent-lib v0.3.4
# Telnyx AI Agent Library
A TypeScript/React library for building AI-powered voice conversation applications using Telnyx's WebRTC infrastructure. This library provides a comprehensive set of tools for managing AI agent connections, real-time transcriptions, conversation states, and audio streaming.
## Features
- 🎯 Easy AI Agent Integration - Connect to Telnyx AI agents with minimal setup
- 🎙️ Real-time Transcription - Automatic speech-to-text with live updates
- 🔊 Audio Stream Management - Built-in audio monitoring and playback controls
- ⚛️ React Hooks & Components - Ready-to-use React components and hooks
- 🔄 State Management - Automatic state synchronization using Jotai
- 📱 Connection Management - Robust connection handling with error recovery
- 🎚️ Agent State Tracking - Monitor agent states (listening, speaking, thinking)
- ⏱️ Latency Measurement - Automatic round-trip latency tracking using client-side Voice Activity Detection (VAD)
## Installation
```shell
npm install @telnyx/ai-agent-lib
```

### Peer Dependencies
This library requires React 19.1.1+ as a peer dependency:
```shell
npm install react@^19.1.1 react-dom@^19.1.0
```

## Quick Start
### 1. Wrap your app with the provider
```tsx
import React from 'react';
import { createRoot } from 'react-dom/client';
import { TelnyxAIAgentProvider } from '@telnyx/ai-agent-lib';
import App from './App';

createRoot(document.getElementById('root')!).render(
  <TelnyxAIAgentProvider agentId="your-agent-id">
    <App />
  </TelnyxAIAgentProvider>
);
```

### 2. Use hooks in your components
```tsx
import React, { useEffect, useRef, useState } from 'react';
import {
  useClient,
  useTranscript,
  useConnectionState,
  useConversation,
  useAgentState
} from '@telnyx/ai-agent-lib';

function VoiceChat() {
  const client = useClient();
  const transcript = useTranscript();
  const connectionState = useConnectionState();
  const conversation = useConversation();
  const agentState = useAgentState();
  const audioRef = useRef<HTMLAudioElement>(null);
  const [messageInput, setMessageInput] = useState('');

  // Set up audio playback
  useEffect(() => {
    if (conversation?.call?.remoteStream && audioRef.current) {
      audioRef.current.srcObject = conversation.call.remoteStream;
    }
  }, [conversation]);

  const handleSendMessage = () => {
    if (messageInput.trim()) {
      client.sendConversationMessage(messageInput);
      setMessageInput('');
    }
  };

  const isCallActive = conversation?.call?.state === 'active';

  return (
    <div>
      <h2>Connection: {connectionState}</h2>
      <h3>Agent State: {agentState.state}</h3>
      <button
        onClick={() =>
          client.startConversation({
            callerName: 'Jane Doe',
            customHeaders: [{ name: 'X-Account-Number', value: '123456' }]
          })
        }
        disabled={connectionState !== 'connected'}
      >
        Start Conversation
      </button>
      <button onClick={() => client.endConversation()}>
        End Conversation
      </button>
      {/* Text message input - only available during an active call */}
      {isCallActive && (
        <div style={{ margin: '20px 0' }}>
          <h4>Send Text Message</h4>
          <input
            type="text"
            value={messageInput}
            onChange={(e) => setMessageInput(e.target.value)}
            placeholder="Type a message..."
            onKeyDown={(e) => e.key === 'Enter' && handleSendMessage()}
          />
          <button
            onClick={handleSendMessage}
            disabled={!messageInput.trim()}
          >
            Send Message
          </button>
        </div>
      )}
      <audio ref={audioRef} autoPlay playsInline controls />
      <div>
        <h3>Transcript ({transcript.length} items)</h3>
        {transcript.map((item) => (
          <div key={item.id}>
            <strong>{item.role}:</strong> {item.content}
            <small> - {item.timestamp.toLocaleTimeString()}</small>
          </div>
        ))}
      </div>
    </div>
  );
}
```

## API Reference
### `TelnyxAIAgentProvider` Props
| Prop | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| agentId | string | ✅ | - | Your Telnyx AI agent ID |
| versionId | string | ❌ | "main" | Agent version to use |
| environment | "production" \| "development" | ❌ | "production" | Telnyx environment |
| debug | boolean | ❌ | false | Enable debug logging |
| vad | VADOptions | ❌ | See below | Voice Activity Detection configuration |
### Debug Logging

When `debug: true` is set, the library outputs detailed logs to the console using a timestamped format. This is useful for troubleshooting connection issues, audio stream problems, or understanding the internal state transitions.
```tsx
<TelnyxAIAgentProvider agentId="your-agent-id" debug={true}>
  <App />
</TelnyxAIAgentProvider>
```

Debug logs include:
- Audio stream monitoring events (local and remote)
- Agent state transitions with timing information
- AudioContext state changes
- Volume threshold detection
### Hooks

#### useClient()

Returns the `TelnyxAIAgent` instance for direct API access.
Methods:
- `connect()` - Connect to the Telnyx platform
- `disconnect()` - Disconnect and clean up
- `startConversation(options?)` - Start a new conversation with optional caller metadata and headers
- `endConversation()` - End the current conversation
- `sendConversationMessage(message: string)` - Send a text message during an active conversation
- `setRemoteStream(stream: MediaStream)` - Manually set the remote audio stream for monitoring (useful when `call.remoteStream` is not available)
- `transcript` - Get the current transcript array
`startConversation` options:

| Option | Type | Description |
|--------|------|-------------|
| `destinationNumber` | `string` | Destination number to dial |
| `callerNumber` | `string` | Caller ID number to present |
| `callerName` | `string` | Caller display name |
| `customHeaders` | `{ name: string; value: string }[]` | Custom SIP headers to attach to the call |
Events:
- `agent.connected` - Agent successfully connected
- `agent.disconnected` - Agent disconnected
- `agent.error` - Connection or operational error
- `transcript.item` - New transcript item received
- `conversation.update` - Conversation state updated
- `conversation.agent.state` - Agent state changed
#### useConnectionState()

Returns the current connection state: `"connecting" | "connected" | "disconnected" | "error"`.
#### useTranscript()

Returns an array of `TranscriptItem` objects representing the conversation history.
`TranscriptItem`:

```typescript
{
  id: string;
  role: "user" | "assistant";
  content: string;
  timestamp: Date;
}
```

#### useConversation()
Returns the current conversation notification object containing call information and state.
#### useAgentState()

Returns the current agent state as an `AgentStateData` object. Its `state` field is `"listening" | "speaking" | "thinking"`; the optional latency fields (`userPerceivedLatencyMs`, `greetingLatencyMs`, `thinkingStartedAt`) are described under Data Types.
## Direct Usage (Without React)
You can also use the library without React:
```typescript
import { TelnyxAIAgent } from '@telnyx/ai-agent-lib';

const agent = new TelnyxAIAgent({
  agentId: 'your-agent-id',
  versionId: 'main',
  environment: 'production'
});

// Connect to the platform
await agent.connect();

// Listen for events
agent.on('agent.connected', () => {
  console.log('Agent connected!');
});

agent.on('transcript.item', (item) => {
  console.log(`${item.role}: ${item.content}`);
});

agent.on('conversation.agent.state', (data) => {
  console.log(`Agent is now: ${data.state}`);
});

agent.on('conversation.update', (notification) => {
  if (notification.call?.state === 'active') {
    console.log('Call is now active - you can send messages');
    // Send a text message during the conversation
    agent.sendConversationMessage('Hello, I have a question about your services.');
  }
});

// Start a conversation with optional caller metadata
await agent.startConversation({
  callerNumber: '+15551234567',
  callerName: 'John Doe',
  customHeaders: [
    { name: 'X-User-Plan', value: 'gold' },
    { name: 'X-Session-Id', value: 'session-abc' }
  ],
  audio: { autoGainControl: true, noiseSuppression: true }
});

// Send messages during an active conversation
// Note: this only works while there is an active call
setTimeout(() => {
  agent.sendConversationMessage('Can you help me with my account?');
}, 5000);

// Access the transcript
console.log(agent.transcript);

// Clean up
await agent.disconnect();
```

## Advanced Usage
### Sending Text Messages During Conversations

You can send text messages to the AI agent during an active conversation using the `sendConversationMessage` method. This is useful for providing context, asking questions, or sending information that is easier to type than speak.
```tsx
// React example
function ChatInterface() {
  const client = useClient();
  const conversation = useConversation();
  const [message, setMessage] = useState('');

  const handleSendMessage = () => {
    if (conversation?.call?.state === 'active' && message.trim()) {
      client.sendConversationMessage(message);
      setMessage('');
    }
  };

  return (
    <div>
      {conversation?.call?.state === 'active' && (
        <div>
          <input
            value={message}
            onChange={(e) => setMessage(e.target.value)}
            placeholder="Type a message to the agent..."
            onKeyDown={(e) => e.key === 'Enter' && handleSendMessage()}
          />
          <button onClick={handleSendMessage}>Send</button>
        </div>
      )}
    </div>
  );
}
```

```typescript
// Direct usage example
agent.on('conversation.update', (notification) => {
  if (notification.call?.state === 'active') {
    // Now you can send text messages
    agent.sendConversationMessage('I need help with order #12345');
  }
});
```

Important notes:
- Messages can only be sent during an active conversation (when `call.state === 'active'`)
- The agent receives and processes text messages just like spoken input
- Text messages may appear in the transcript depending on the agent configuration
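The gating rule above can be wrapped in a small buffer so that messages composed before the call reaches the `active` state are not silently dropped. A minimal, self-contained sketch (the `MessageQueue` class is illustrative and not part of the library; in practice the `send` callback would be `client.sendConversationMessage`):

```typescript
// Illustrative sketch: buffer outgoing text messages until the call
// reports an 'active' state, then flush them in order.
type CallState = 'new' | 'trying' | 'ringing' | 'active' | 'hangup' | 'destroy';

class MessageQueue {
  private pending: string[] = [];
  private state: CallState = 'new';

  constructor(private send: (msg: string) => void) {}

  // Call this from a 'conversation.update' handler with the latest call state.
  onCallState(state: CallState): void {
    this.state = state;
    if (state === 'active') this.flush();
  }

  enqueue(msg: string): void {
    if (this.state === 'active') {
      this.send(msg); // safe to send immediately
    } else {
      this.pending.push(msg); // hold until the call becomes active
    }
  }

  private flush(): void {
    while (this.pending.length > 0) this.send(this.pending.shift()!);
  }
}

// Example wiring with a simple log as the sink:
const sent: string[] = [];
const queue = new MessageQueue((m) => sent.push(m));
queue.enqueue('Hello');          // buffered: call not active yet
queue.onCallState('active');     // flushes the buffer
queue.enqueue('Second message'); // sent immediately
```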
### Latency Measurement
The library automatically measures round-trip latency using client-side Voice Activity Detection (VAD). This provides accurate timing from when the user stops speaking until the agent's response audio begins.
How it works:
1. **Local VAD (user's microphone):** Monitors the user's audio stream. After detecting silence following speech, the library records a `thinkingStartedAt` timestamp and transitions to the "thinking" state.
2. **Remote VAD (agent's audio):** Monitors the agent's audio stream. When the audio volume crosses the threshold, the library calculates `userPerceivedLatencyMs` as the time elapsed since the user went silent and transitions to the "speaking" state.
3. **Greeting latency:** For the first agent speech (the greeting), the library calculates `greetingLatencyMs` from when audio stream monitoring started.
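To make the timing concrete, here is a self-contained sketch of the silence-based turn detection described above. The option names mirror the library's VAD configuration, but the `TurnDetector` class itself is illustrative, not the library's actual implementation:

```typescript
// Illustrative state machine for silence-based turn detection.
type AgentState = 'listening' | 'thinking' | 'speaking';

interface VadOptions {
  volumeThreshold: number;     // 0-255 level that counts as speech
  silenceDurationMs: number;   // silence needed before 'thinking'
  minSpeechDurationMs: number; // filters out brief noise spikes
}

class TurnDetector {
  state: AgentState = 'listening';
  lastLatencyMs?: number;
  private speechStart?: number;
  private silenceStart?: number;
  private wentSilentAt?: number;

  constructor(private opts: VadOptions) {}

  // Feed one local (microphone) volume sample with its timestamp in ms.
  localSample(volume: number, now: number): void {
    if (volume >= this.opts.volumeThreshold) {
      this.speechStart ??= now;      // speech began
      this.silenceStart = undefined; // any silence run is broken
    } else if (this.speechStart !== undefined) {
      this.silenceStart ??= now;     // silence run began
      const spokeFor = this.silenceStart - this.speechStart;
      const silentFor = now - this.silenceStart;
      if (spokeFor >= this.opts.minSpeechDurationMs &&
          silentFor >= this.opts.silenceDurationMs) {
        this.state = 'thinking';     // user finished their turn
        this.wentSilentAt = this.silenceStart;
        this.speechStart = undefined;
      }
    }
  }

  // Feed one remote (agent audio) volume sample.
  remoteSample(volume: number, now: number): void {
    if (this.state === 'thinking' && volume >= this.opts.volumeThreshold) {
      this.state = 'speaking';
      // Latency = time from the user going silent to agent audio arriving.
      this.lastLatencyMs = now - (this.wentSilentAt ?? now);
    }
  }
}

// Simulated timeline: user speaks 0-300 ms, silence until 900 ms,
// agent audio arrives at 1100 ms.
const det = new TurnDetector({ volumeThreshold: 10, silenceDurationMs: 500, minSpeechDurationMs: 100 });
for (let t = 0; t <= 300; t += 50) det.localSample(50, t);
for (let t = 350; t <= 900; t += 50) det.localSample(0, t);
det.remoteSample(80, 1100);
```

With these numbers the detector enters "thinking" once 500 ms of silence have elapsed (measured from 350 ms) and reports a 750 ms perceived latency when the agent audio crosses the threshold.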
VAD Configuration:

The VAD behavior can be customized using the `vad` option:

```tsx
// React
<TelnyxAIAgentProvider
  agentId="your-agent-id"
  vad={{
    volumeThreshold: 10,      // 0-255, audio level to detect speech
    silenceDurationMs: 500,   // ms of silence before "thinking" state
    minSpeechDurationMs: 100, // min ms of speech to count as real (filters noise)
    maxLatencyMs: 15000,      // ignore latency above this (optional, filters stale)
  }}
>
```

```typescript
// Direct usage
const agent = new TelnyxAIAgent({
  agentId: 'your-agent-id',
  vad: {
    volumeThreshold: 10,
    silenceDurationMs: 500,
    minSpeechDurationMs: 100,
    maxLatencyMs: 15000,
  },
});
```

VAD Options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| volumeThreshold | number | 10 | Audio level (0-255) to detect speech |
| silenceDurationMs | number | 500 | Silence duration before triggering "thinking" state |
| minSpeechDurationMs | number | 100 | Minimum speech duration to count as real user speech (filters brief noise) |
| maxLatencyMs | number | undefined | Maximum latency to report (values above are ignored as stale) |
Tuning for different scenarios:

```typescript
// Fast-paced conversation (aggressive turn detection)
vad: {
  silenceDurationMs: 500,
  minSpeechDurationMs: 80,
}

// Thoughtful conversation (tolerant of pauses)
vad: {
  silenceDurationMs: 1500,
  minSpeechDurationMs: 150,
}

// Noisy environment
vad: {
  volumeThreshold: 20,
  minSpeechDurationMs: 200,
}
```

Note: Silence-based VAD has inherent tradeoffs. Lower `silenceDurationMs` values detect turn endings faster but may cut off natural pauses ("I need to... think about that"). Higher values are more tolerant but add latency. For production use cases requiring precise turn detection, consider integrating server-side semantic endpointing.
Accessing latency values from the `useAgentState()` hook:

```tsx
const agentState = useAgentState();

// Access latency when the agent starts speaking
useEffect(() => {
  if (agentState.greetingLatencyMs !== undefined) {
    console.log(`Greeting latency: ${agentState.greetingLatencyMs}ms`);
  }
  if (agentState.userPerceivedLatencyMs !== undefined) {
    console.log(`Response latency: ${agentState.userPerceivedLatencyMs}ms`);
  }
  if (agentState.thinkingStartedAt) {
    console.log(`Started thinking at: ${agentState.thinkingStartedAt}`);
  }
}, [agentState]);
```

### Custom Audio Handling
The library automatically handles audio stream monitoring and agent state detection based on audio levels. The audio stream is available through the conversation object:
```tsx
const conversation = useConversation();
const audioStream = conversation?.call?.remoteStream;
```

If `call.remoteStream` is not available, you can manually provide the stream using `setRemoteStream`:
```tsx
const client = useClient();
const conversation = useConversation();

useEffect(() => {
  const call = conversation?.call;
  if (call?.state === 'active') {
    // Get the stream from the peer connection if remoteStream is not available
    const peerConnection = call.peer?.instance;
    const receivers = peerConnection?.getReceivers?.();
    const audioReceiver = receivers?.find(r => r.track?.kind === 'audio');
    if (audioReceiver?.track) {
      const stream = new MediaStream([audioReceiver.track]);
      client.setRemoteStream(stream);
    }
  }
}, [conversation, client]);
```

### Error Handling
```tsx
const client = useClient();

useEffect(() => {
  const handleError = (error: Error) => {
    console.error('Agent error:', error);
    // Handle the error (show a notification, retry the connection, etc.)
  };
  client.on('agent.error', handleError);
  return () => {
    client.off('agent.error', handleError);
  };
}, [client]);
```

### State Persistence
The library uses Jotai for state management, which automatically handles state updates across components. All state is ephemeral and resets when the provider unmounts.
## Events Reference

The `TelnyxAIAgent` class extends `EventEmitter` and provides a comprehensive set of events for monitoring connection status, conversation state, and agent behavior. Subscribe to events with `on()`, `once()`, or `addListener()`, and unsubscribe with `off()` or `removeListener()`.
### Event Types
| Event | Payload | Description |
|-------|---------|-------------|
| agent.connected | - | Emitted when successfully connected to the Telnyx platform |
| agent.disconnected | - | Emitted when disconnected from the Telnyx platform |
| agent.error | Error | Emitted when any operational error occurs |
| transcript.item | TranscriptItem | Emitted when a new transcript item is received |
| conversation.update | INotification | Emitted when conversation state changes |
| conversation.agent.state | AgentStateData | Emitted when agent state changes (listening/speaking/thinking) |
| agent.audio.mute | boolean | Emitted when agent audio is muted or unmuted |
### Data Types

```typescript
// Transcript item representing a message in the conversation
type TranscriptItem = {
  id: string;
  role: "user" | "assistant";
  content: string;
  timestamp: Date;
  attachments?: Array<{ type: "image"; url: string }>;
};

// Agent state with optional latency information
type AgentStateData = {
  state: "speaking" | "listening" | "thinking";
  // Latency in ms from when the user stopped speaking until the agent response began.
  // Only present when state is "speaking"
  userPerceivedLatencyMs?: number;
  // Latency in ms for the initial agent greeting (first speech).
  // Only present on the first "speaking" state
  greetingLatencyMs?: number;
  // UTC timestamp (ISO 8601) when the user stopped speaking and the thinking state began.
  // Only present when state is "thinking"
  thinkingStartedAt?: string;
};
```

### Usage Examples
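Event payloads arrive at runtime, so a defensive check against the `TranscriptItem` shape can be useful before persisting or rendering untrusted data. The `isTranscriptItem` guard below is illustrative and not part of the library (the type is re-declared locally so the sketch is self-contained):

```typescript
// Local re-declaration of the documented TranscriptItem shape.
type TranscriptItem = {
  id: string;
  role: 'user' | 'assistant';
  content: string;
  timestamp: Date;
  attachments?: Array<{ type: 'image'; url: string }>;
};

// Runtime guard: narrows `unknown` to TranscriptItem by checking each field.
function isTranscriptItem(value: unknown): value is TranscriptItem {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' &&
    (v.role === 'user' || v.role === 'assistant') &&
    typeof v.content === 'string' &&
    v.timestamp instanceof Date
  );
}

const good = isTranscriptItem({ id: '1', role: 'user', content: 'hi', timestamp: new Date() });
const bad = isTranscriptItem({ id: 1, role: 'bot' });
```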
#### Connection Events

Monitor the connection lifecycle to handle reconnection logic or update the UI:
```typescript
const agent = new TelnyxAIAgent({ agentId: 'your-agent-id' });

agent.on('agent.connected', () => {
  console.log('Connected to Telnyx platform');
  // Enable UI controls, start a conversation, etc.
});

agent.on('agent.disconnected', () => {
  console.log('Disconnected from Telnyx platform');
  // Disable UI controls, show a reconnection prompt, etc.
});

agent.on('agent.error', (error) => {
  console.error('Agent error:', error.message);
  // Handle the error: show a notification, attempt reconnection, etc.
});
```

#### Agent State Events
Track whether the agent is listening, thinking, or speaking. This is useful for visual feedback such as animated indicators. The library uses client-side Voice Activity Detection (VAD) to detect when the user stops speaking (after a configurable period of silence, 500 ms by default) and when the agent starts responding, providing accurate round-trip latency measurements.
```typescript
agent.on('conversation.agent.state', (data) => {
  console.log(`Agent state: ${data.state}`);

  switch (data.state) {
    case 'listening':
      // Show a listening indicator (e.g., pulsing microphone)
      break;
    case 'thinking':
      // Show a thinking indicator (e.g., loading spinner)
      // thinkingStartedAt contains the UTC timestamp when the user stopped speaking
      if (data.thinkingStartedAt) {
        console.log(`Thinking started at: ${data.thinkingStartedAt}`);
      }
      break;
    case 'speaking':
      // Show a speaking indicator (e.g., animated waveform)
      if (data.greetingLatencyMs !== undefined) {
        console.log(`Greeting latency: ${data.greetingLatencyMs}ms`);
      }
      if (data.userPerceivedLatencyMs !== undefined) {
        console.log(`Response latency: ${data.userPerceivedLatencyMs}ms`);
        // Track latency for analytics or display it to the user
      }
      break;
  }
});
```

#### Transcript Events
Build a real-time chat interface by listening to transcript updates:
```typescript
const conversationHistory: TranscriptItem[] = [];

agent.on('transcript.item', (item) => {
  conversationHistory.push(item);

  // Display the new message
  console.log(`[${item.timestamp.toLocaleTimeString()}] ${item.role}: ${item.content}`);

  // Handle attachments if present
  if (item.attachments?.length) {
    item.attachments.forEach((attachment) => {
      if (attachment.type === 'image') {
        console.log(`Image attachment: ${attachment.url}`);
      }
    });
  }
});
```

#### Conversation Update Events
Monitor call state changes to know when to enable/disable features:
```typescript
agent.on('conversation.update', (notification) => {
  const call = notification.call;
  if (!call) return;

  console.log(`Call state: ${call.state}`);

  switch (call.state) {
    case 'new':
      console.log('Call initiated');
      break;
    case 'trying':
      console.log('Connecting...');
      break;
    case 'ringing':
      console.log('Ringing...');
      break;
    case 'active':
      console.log('Call is active - voice and text messaging enabled');
      // Enable the send-message button, show the active-call UI
      break;
    case 'hangup':
    case 'destroy':
      console.log('Call ended');
      // Clean up the UI, show a call summary
      break;
  }
});
```

#### React Hook Pattern
In React applications, use the hooks with useEffect to manage event subscriptions:
```tsx
import { useEffect, useState } from 'react';
import { useClient, useAgentState, useConnectionState } from '@telnyx/ai-agent-lib';
import type { AgentStateData } from '@telnyx/ai-agent-lib'; // or declare locally per Data Types

function ConversationMonitor() {
  const client = useClient();
  const agentState = useAgentState();
  const connectionState = useConnectionState();
  const [latencyHistory, setLatencyHistory] = useState<number[]>([]);

  // Subscribe to events for additional handling beyond the built-in hooks
  useEffect(() => {
    const handleAgentState = (data: AgentStateData) => {
      if (data.userPerceivedLatencyMs !== undefined) {
        setLatencyHistory(prev => [...prev, data.userPerceivedLatencyMs!]);
      }
    };
    const handleError = (error: Error) => {
      // Custom error handling (e.g., send to an error-tracking service)
      console.error('Agent error:', error);
    };

    client.on('conversation.agent.state', handleAgentState);
    client.on('agent.error', handleError);
    return () => {
      client.off('conversation.agent.state', handleAgentState);
      client.off('agent.error', handleError);
    };
  }, [client]);

  const averageLatency = latencyHistory.length > 0
    ? Math.round(latencyHistory.reduce((a, b) => a + b, 0) / latencyHistory.length)
    : null;

  return (
    <div>
      <p>Connection: {connectionState}</p>
      <p>Agent: {agentState.state}</p>
      {averageLatency !== null && <p>Avg Response Time: {averageLatency}ms</p>}
    </div>
  );
}
```

#### One-Time Event Listeners
Use `once()` for events you only need to handle one time:

```typescript
// Wait for the connection before starting a conversation
agent.once('agent.connected', () => {
  agent.startConversation({ callerName: 'User' });
});

await agent.connect();
```

#### Removing Event Listeners
Clean up listeners when no longer needed:
```typescript
const handleTranscript = (item: TranscriptItem) => {
  console.log(`${item.role}: ${item.content}`);
};

// Add a listener
agent.on('transcript.item', handleTranscript);

// Remove a specific listener
agent.off('transcript.item', handleTranscript);

// Remove all listeners for an event
agent.removeAllListeners('transcript.item');

// Remove all listeners for all events
agent.removeAllListeners();
```

## TypeScript Support
This library is built with TypeScript and provides full type definitions. All hooks and components are fully typed for the best development experience.
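If you wrap the client in your own abstractions, the documented events can be modeled as a TypeScript event map so handler payloads are checked at compile time. The map below mirrors the events table in this README, but the `AgentEventMap` interface and `onTyped` helper are illustrative sketches, not exports of the library:

```typescript
// Local re-declarations of the documented payload types.
type TranscriptItem = { id: string; role: 'user' | 'assistant'; content: string; timestamp: Date };
type AgentStateData = {
  state: 'speaking' | 'listening' | 'thinking';
  userPerceivedLatencyMs?: number;
  greetingLatencyMs?: number;
  thinkingStartedAt?: string;
};

// Event names mapped to their payload types, per the events table.
interface AgentEventMap {
  'agent.connected': void;
  'agent.disconnected': void;
  'agent.error': Error;
  'transcript.item': TranscriptItem;
  'conversation.agent.state': AgentStateData;
  'agent.audio.mute': boolean;
}

// Generic, type-safe subscribe helper over any emitter-like object.
function onTyped<K extends keyof AgentEventMap>(
  emitter: { on(event: string, cb: (payload: any) => void): void },
  event: K,
  cb: (payload: AgentEventMap[K]) => void
): void {
  emitter.on(event, cb);
}

// Demonstration with a stub emitter standing in for the real client.
const handlers: Record<string, (p: any) => void> = {};
const stub = { on: (e: string, cb: (p: any) => void) => { handlers[e] = cb; } };

const states: string[] = [];
onTyped(stub, 'conversation.agent.state', (d) => states.push(d.state));
handlers['conversation.agent.state']({ state: 'speaking' });
```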
## Examples
Check out the example repository for a complete example application demonstrating all features of the library.
## Dependencies

- `@telnyx/webrtc` - Telnyx WebRTC SDK
- `eventemitter3` - Event handling
- `jotai` - State management
- `loglevel` - Lightweight logging with level control
## License
This library is part of the Telnyx ecosystem. Please refer to Telnyx's terms of service and licensing agreements.
## Support
For support, please contact Telnyx support or check the Telnyx documentation.
## Contributing
This library is maintained by Telnyx. For bug reports or feature requests, please contact Telnyx support.
