ttp-agent-sdk
v2.34.10
Comprehensive Voice Agent SDK with Customizable Widget - Real-time audio, WebSocket communication, React components, and extensive customization options
TTP Agent SDK
This repository contains SDKs for integrating with the TTP Agent API:
- Frontend SDK (JavaScript) - Browser-based SDK for web applications
- Backend SDK (Java) - Server-side SDK for phone systems and backend processing
Frontend SDK (JavaScript)
A comprehensive JavaScript SDK for voice interaction with AI agents. Provides real-time audio recording, WebSocket communication, and audio playback with queue management.
Features
- 🎤 Real-time Audio Recording - Uses AudioWorklet for high-quality audio capture
- 🔄 WebSocket Communication - Real-time bidirectional communication with authentication
- 🔊 Audio Playback Queue - Smooth audio playback with queue management
- ⚛️ React Components - Ready-to-use React components
- 🌐 Vanilla JavaScript - Works with any JavaScript framework
- 🎯 Event-driven - Comprehensive event system for all interactions
- 🔒 Multiple Authentication Methods - Support for signed links and direct agent access
- 📱 Responsive Widget - Pre-built UI widget for quick integration
Installation
npm install ttp-agent-sdk
Quick Start
Method 1: Direct Agent ID (Development/Testing)
import { VoiceSDK } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv',
agentId: 'your_agent_id',
appId: 'your_app_id',
voice: 'default',
language: 'en'
});
// Connect and start recording
await voiceSDK.connect();
await voiceSDK.startRecording();
Method 2: Signed Link (Production)
import { VoiceSDK } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv',
// No agentId needed - server validates signed token from URL
});
// Connect using signed URL
await voiceSDK.connect();
Method 3: Pre-built Widget
<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>
<script>
// Classic (non-module) scripts do not allow top-level await,
// so the calls are wrapped in an async function.
(async () => {
  // Get signed URL from your backend first
  const response = await fetch('/api/get-session', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ agentId: 'your_agent_id' })
  });
  const data = await response.json();
  // Use the signed URL directly
  new TTPAgentSDK.TTPChatWidget({
    agentId: 'your_agent_id',
    signedUrl: data.signedUrl
  });
})();
</script>
Or use a function to fetch the signed URL:
<script>
new TTPAgentSDK.TTPChatWidget({
agentId: 'your_agent_id',
signedUrl: async () => {
const response = await fetch('/api/get-session', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ agentId: 'your_agent_id' })
});
const data = await response.json();
return data.signedUrl;
}
});
</script>
React Integration
import React from 'react';
import { VoiceButton } from 'ttp-agent-sdk';
function App() {
return (
<VoiceButton
websocketUrl="wss://speech.talktopc.com/ws/conv"
agentId="your_agent_id"
appId="your_app_id"
onConnected={() => console.log('Connected!')}
onRecordingStarted={() => console.log('Recording...')}
onPlaybackStarted={() => console.log('Playing audio...')}
/>
);
}
API Reference
VoiceSDK
The main SDK class for voice interaction.
Constructor Options
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv', // Required
agentId: 'agent_12345', // Optional - for direct agent access
appId: 'app_67890', // Optional - user's app ID for authentication
ttpId: 'ttp_abc123', // Optional - custom TTP ID (fallback)
voice: 'default', // Optional - voice selection
language: 'en', // Optional - language code
sampleRate: 16000, // Optional - audio sample rate
autoReconnect: true // Optional - auto-reconnect on disconnect
});
Methods
- connect() - Connect to the voice server
- disconnect() - Disconnect from the voice server
- startRecording() - Start voice recording
- stopRecording() - Stop voice recording
- toggleRecording() - Toggle recording state
- destroy() - Clean up resources
Events
- connected - WebSocket connected
- disconnected - WebSocket disconnected
- recordingStarted - Recording started
- recordingStopped - Recording stopped
- playbackStarted - Audio playback started
- playbackStopped - Audio playback stopped
- error - Error occurred
- message - Received message from server
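A minimal sketch of wiring these events. It assumes the SDK exposes the same `on(event, handler)` method that the screenshot examples in this README use, and that the `message` event delivers an already-parsed object shaped like the entries under Message Format. `routeMessage`, `attachHandlers`, and `StubEmitter` are hypothetical names; the stub stands in for a real `VoiceSDK` instance so the snippet runs on its own.

```javascript
// Hypothetical helper: turn an incoming message into a short description.
// The `type` values are taken from the Message Format section.
function routeMessage(message) {
  switch (message && message.type) {
    case 'agent_response':
      return `agent: ${message.agent_response}`;
    case 'user_transcript':
      return `user: ${message.user_transcription}`;
    case 'barge_in':
    case 'stop_playing':
      return 'stop playback';
    default:
      return 'ignored';
  }
}

// Hypothetical helper: register listeners on any object with on(event, handler).
function attachHandlers(sdk, log) {
  sdk.on('connected', () => log('connected'));
  sdk.on('disconnected', () => log('disconnected'));
  sdk.on('error', () => log('error'));
  sdk.on('message', (message) => log(routeMessage(message)));
}

// Stand-in for a VoiceSDK instance (assumption: only on/emit semantics matter here).
class StubEmitter {
  constructor() {
    this.handlers = {};
  }
  on(event, handler) {
    if (!this.handlers[event]) this.handlers[event] = [];
    this.handlers[event].push(handler);
  }
  emit(event, payload) {
    (this.handlers[event] || []).forEach((handler) => handler(payload));
  }
}

const stub = new StubEmitter();
const lines = [];
attachHandlers(stub, (line) => lines.push(line));
stub.emit('connected');
stub.emit('message', { type: 'agent_response', agent_response: 'Hello! How can I help you?' });
console.log(lines.join('\n'));
```

With a real instance, the same `attachHandlers(voiceSDK, console.log)` call would be made before `voiceSDK.connect()` so that no early events are missed.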
VoiceButton (React)
A React component that provides a voice interaction button.
Props
<VoiceButton
websocketUrl="wss://speech.talktopc.com/ws/conv"
agentId="agent_12345"
appId="app_67890"
voice="default"
language="en"
autoReconnect={true}
onConnected={() => {}}
onDisconnected={() => {}}
onRecordingStarted={() => {}}
onRecordingStopped={() => {}}
onPlaybackStarted={() => {}}
onPlaybackStopped={() => {}}
onError={(error) => {}}
onMessage={(message) => {}}
/>
AgentWidget (Vanilla JS)
A pre-built widget for quick integration.
Configuration
TTPAgentSDK.AgentWidget.init({
agentId: 'your_agent_id', // Required
getSessionUrl: 'https://your-api.com/get-session', // Required - URL or function
variables: { // Optional - dynamic variables
userName: 'John Doe',
page: 'homepage'
},
position: 'bottom-right', // Optional - widget position
primaryColor: '#4F46E5' // Optional - theme color
});
Authentication Methods
1. Direct Agent ID (Unsecured - Development)
Use Case: Development, testing, or internal applications.
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv',
agentId: 'agent_12345', // Visible in network traffic
appId: 'app_67890'
});
Security Risk: Agent ID is visible in network traffic.
2. Signed Link (Secured - Production)
Use Case: Production applications where security is critical.
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.bidme.co.il/ws/conv?signed_token=eyJ...'
// No agentId needed - server validates signed token
});
Benefits: Secure, cost-controlled, and production-ready.
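One way to keep the development-only configuration out of production is a small guard that classifies a config before it is passed to `VoiceSDK`. The sketch below is not part of the SDK: `detectAuthMode` and `assertProductionSafe` are hypothetical helpers, and they assume the signed token travels in the `signed_token` query parameter of `websocketUrl`, as in the signed link example above.

```javascript
// Hypothetical guard: classify a VoiceSDK config by authentication method.
// Assumption: signed links carry a `signed_token` query parameter.
function detectAuthMode(config) {
  const url = new URL(config.websocketUrl);
  if (url.searchParams.has('signed_token')) return 'signed';
  if (config.agentId) return 'direct';
  return 'unknown';
}

// Hypothetical guard: refuse anything other than a signed link.
function assertProductionSafe(config) {
  const mode = detectAuthMode(config);
  if (mode !== 'signed') {
    throw new Error(`Expected a signed link in production, got "${mode}" authentication`);
  }
  return config;
}

const devConfig = {
  websocketUrl: 'wss://speech.talktopc.com/ws/conv',
  agentId: 'agent_12345',
  appId: 'app_67890'
};
console.log(detectAuthMode(devConfig));
```

A config that relies on a token supplied some other way would be reported as `unknown` by this check, so treat it as a starting point rather than a complete policy.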
Message Format
Outgoing Messages
// Hello message (sent on connection)
{
t: "hello",
appId: "app_67890" // or ttpId for fallback
}
// Start continuous mode
{
t: "start_continuous_mode",
ttpId: "sdk_abc123_1234567890"
}
// Stop continuous mode
{
t: "stop_continuous_mode",
ttpId: "sdk_abc123_1234567890"
}
Incoming Messages
// Text response
{
type: "agent_response",
agent_response: "Hello! How can I help you?"
}
// User transcript
{
type: "user_transcript",
user_transcription: "Hello there"
}
// Barge-in detection
{
type: "barge_in",
message: "User interrupted"
}
// Stop playing request
{
type: "stop_playing",
message: "Stop all audio"
}
Capture Screenshot Tool (capture_screen)
The capture_screen tool allows AI agents to capture screenshots of the browser page during conversations. It uses html2canvas to render the DOM as an image.
Capturing the Entire Screen
When capturing the entire screen, the tool has two modes:
- Viewport only (default) - Captures only the visible portion of the page (document.body)
- Full scrollable page - When fullPage: true, captures the entire page including all content below the fold
Example: Capture entire visible viewport
// The agent can call this tool with:
{
tool: "capture_screen",
params: {
format: "jpeg", // 'png' or 'jpeg' (jpeg is smaller)
quality: 0.85, // JPEG quality 0-1 (only for jpeg)
scale: 1, // Resolution scale (1 = normal, 2 = retina)
maxWidth: 1280, // Max width in pixels (auto-resizes if larger)
maxHeight: 1280 // Max height in pixels (auto-resizes if larger)
}
}
Example: Capture full scrollable page
// To capture the entire page including content below the fold:
{
tool: "capture_screen",
params: {
fullPage: true, // Capture entire scrollable page
format: "jpeg",
quality: 0.85,
maxWidth: 1920, // Higher limits for full page
maxHeight: 5000 // Accommodate long pages
}
}
How Full Page Capture Works:
When fullPage: true is set (and no selector is provided), the tool:
- Sets windowHeight and height to document.documentElement.scrollHeight (total page height)
- Sets y: 0 to start from the top
- Captures the entire scrollable content, not just the visible viewport
- Produces an image whose height equals the full scrollable height of the page (before any maxWidth/maxHeight resizing)
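The steps above can be sketched as a function that builds the options object handed to html2canvas. This illustrates the described behavior and is not the SDK's actual source: `buildCaptureOptions` is a hypothetical name, while `windowHeight`, `height`, `y`, `scale`, and `useCORS` are html2canvas configuration options. The document is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Hypothetical sketch: derive html2canvas options from capture_screen params.
function buildCaptureOptions(params, doc) {
  const options = {
    useCORS: true, // attempt to load cross-origin images via CORS
    scale: params.scale === undefined ? 1 : params.scale
  };
  // Full page mode only applies when no selector is given.
  if (params.fullPage && !params.selector) {
    const fullHeight = doc.documentElement.scrollHeight;
    options.windowHeight = fullHeight; // lay the page out at its full height
    options.height = fullHeight;       // make the canvas as tall as the page
    options.y = 0;                     // start rendering from the top
  }
  return options;
}

// Fake document with a 4200px tall page, standing in for `document`.
const fakeDocument = { documentElement: { scrollHeight: 4200 } };
console.log(buildCaptureOptions({ fullPage: true }, fakeDocument));

// In a browser the result would be used roughly like this:
// html2canvas(document.body, buildCaptureOptions({ fullPage: true }, document));
```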
Example: Capture specific element
// Capture a specific element using CSS selector:
{
tool: "capture_screen",
params: {
selector: "#my-element", // CSS selector (e.g., "#header", ".content", "main")
format: "png", // PNG preserves transparency
scale: 2 // 2x resolution for crisp screenshots
}
}
Tool Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| selector | string | null | CSS selector for specific element. If not provided, captures entire page. |
| format | string | 'jpeg' | Output format: 'png' or 'jpeg'. JPEG is smaller, PNG supports transparency. |
| quality | number | 0.85 | JPEG quality (0-1). Only used when format is 'jpeg'. |
| scale | number | 1 | Resolution scale factor. 1 = normal, 2 = retina/2x resolution. |
| maxWidth | number | 1280 | Maximum width in pixels. Image will be resized if larger (maintains aspect ratio). |
| maxHeight | number | 1280 | Maximum height in pixels. Image will be resized if larger (maintains aspect ratio). |
| fullPage | boolean | false | When true and no selector provided, captures entire scrollable page (not just viewport). |
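The defaults in this table can be expressed as a small normalization step. The sketch below is an illustration only: `CAPTURE_DEFAULTS` and `applyCaptureDefaults` are hypothetical names, and the validation (rejecting unknown formats, clamping `quality` to 0-1) is an assumption about sensible handling rather than documented SDK behavior.

```javascript
// Defaults copied from the Tool Parameters table.
const CAPTURE_DEFAULTS = Object.freeze({
  selector: null,
  format: 'jpeg',
  quality: 0.85,
  scale: 1,
  maxWidth: 1280,
  maxHeight: 1280,
  fullPage: false
});

// Hypothetical helper: merge caller params over the defaults and sanity-check them.
function applyCaptureDefaults(params = {}) {
  const merged = { ...CAPTURE_DEFAULTS, ...params };
  if (merged.format !== 'png' && merged.format !== 'jpeg') {
    throw new RangeError(`format must be 'png' or 'jpeg', got ${merged.format}`);
  }
  // Assumption: out-of-range quality values are clamped rather than rejected.
  merged.quality = Math.min(1, Math.max(0, merged.quality));
  return merged;
}

console.log(applyCaptureDefaults({ selector: '#header', format: 'png' }));
```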
Return Value
The tool returns a result object with:
{
image: "base64_encoded_image_data", // Base64 string (without data URI prefix)
mimeType: "image/jpeg", // MIME type: "image/jpeg" or "image/png"
width: 1280, // Final image width in pixels
height: 720, // Final image height in pixels
captureTimeMs: 234, // Time taken to capture in milliseconds
selector: "body" // Element that was captured (or selector used)
}
Events
The SDK emits events when screenshots are captured:
voiceSDK.on('screenshotCaptured', (data) => {
console.log(`Screenshot captured: ${data.width}x${data.height}, ${data.sizeKB}KB`);
console.log(`Element: ${data.selector}`);
});
voiceSDK.on('screenshotError', (error) => {
console.error('Screenshot failed:', error.error);
});
How It Works Internally
The tool uses html2canvas to render the DOM:
- Target Selection: If selector is provided, captures that element. Otherwise captures document.body.
- Full Page Mode: When fullPage: true and no selector:
  - Sets canvas height to document.documentElement.scrollHeight
  - Captures from y: 0 to capture entire scrollable area
- Rendering: html2canvas renders the DOM to a canvas element
- Resizing: If dimensions exceed maxWidth/maxHeight, image is resized maintaining aspect ratio
- Encoding: Canvas is converted to base64 (JPEG or PNG format)
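The resizing and encoding steps come down to a little arithmetic and string handling. The sketch below assumes a uniform scale factor is used to preserve the aspect ratio and that the base64 payload is obtained by stripping the prefix from a data URL (such as one returned by `canvas.toDataURL`). `fitWithin`, `mimeTypeFor`, and `stripDataUriPrefix` are hypothetical names, not SDK functions.

```javascript
// Scale (width, height) down so it fits inside maxWidth x maxHeight.
// The same factor is applied to both sides, so the aspect ratio is preserved.
// Images that already fit are left unchanged (factor capped at 1).
function fitWithin(width, height, maxWidth, maxHeight) {
  const factor = Math.min(1, maxWidth / width, maxHeight / height);
  return {
    width: Math.round(width * factor),
    height: Math.round(height * factor)
  };
}

// Map the `format` parameter to the MIME type reported in the result.
function mimeTypeFor(format) {
  return format === 'png' ? 'image/png' : 'image/jpeg';
}

// Drop the "data:<mime>;base64," prefix, leaving only the base64 payload.
function stripDataUriPrefix(dataUrl) {
  const commaIndex = dataUrl.indexOf(',');
  return commaIndex === -1 ? dataUrl : dataUrl.slice(commaIndex + 1);
}

console.log(fitWithin(1920, 1080, 1280, 1280));
console.log(mimeTypeFor('jpeg'));
```

For a 1920x1080 capture with the default limits, the width is the binding constraint, which gives the 1280x720 dimensions shown in the Return Value example.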
Important Notes
- Full Page Capture: When fullPage: true is set, the tool captures the entire scrollable height (document.documentElement.scrollHeight), not just the visible viewport. This includes all content below the fold.
- Performance: Full page captures may take longer, especially for very long pages. Consider using maxHeight to limit the capture size.
- Image Size: Screenshots are automatically resized if they exceed maxWidth or maxHeight while maintaining aspect ratio.
- Cross-Origin: The tool handles cross-origin images when possible (useCORS: true).
- Browser Compatibility: Requires modern browsers with Canvas API support (Chrome 66+, Firefox 60+, Safari 11.1+, Edge 79+).
Scroll Tool (scroll_to_element)
The scroll_to_element tool allows AI agents to scroll the page in three different ways: scrolling to a specific element, relative scrolling (up/down), or scrolling to an absolute position.
Scroll Modes
The tool supports three scroll modes with the following priority order:
- Element Scrolling (highest priority) - Scroll to a specific element by CSS selector
- Relative Scrolling - Scroll up or down by a specified number of pixels
- Absolute Position Scrolling - Scroll to specific x/y coordinates
1. Scroll to Element
Scroll to a specific element on the page using a CSS selector.
Example: Scroll to element
// The agent can call this tool with:
{
tool: "scroll_to_element",
params: {
selector: "#contact-section", // CSS selector
position: "center", // 'center', 'top', or 'bottom'
behavior: "smooth" // 'smooth' or 'instant'
}
}
Example: Scroll element to top
{
tool: "scroll_to_element",
params: {
selector: ".header",
position: "top", // Scrolls element to top of viewport
behavior: "smooth"
}
}
2. Relative Scrolling (Scroll Up/Down)
Scroll the page up or down by a specified number of pixels. This is useful for incremental scrolling or scrolling by viewport height.
Example: Scroll down (default 500px)
{
tool: "scroll_to_element",
params: {
direction: "down",
amount: 500, // Pixels to scroll (default: 500)
behavior: "smooth"
}
}
Example: Scroll up (custom amount)
{
tool: "scroll_to_element",
params: {
direction: "up",
amount: 500, // Scroll up 500 pixels
behavior: "smooth"
}
}
Example: Scroll down by viewport height
{
tool: "scroll_to_element",
params: {
direction: "down",
amount: window.innerHeight, // Scroll one viewport height
behavior: "smooth"
}
}
Features:
- Automatically prevents scrolling beyond page boundaries (won't scroll below 0 or above max scroll)
- Returns atTop and atBottom flags to indicate if scroll limits were reached
- Respects smooth/instant scrolling behavior
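The boundary handling described above can be illustrated with a pure function that computes the clamped target position and the two flags. This is a sketch under assumptions, not the SDK's implementation: `computeRelativeScroll` is a hypothetical name, and `maxY` is assumed to be the largest reachable scroll offset (in a browser, roughly `document.documentElement.scrollHeight - window.innerHeight`).

```javascript
// Hypothetical sketch: compute where a relative scroll would end up.
function computeRelativeScroll({ currentY, direction, amount = 500, maxY }) {
  const delta = direction === 'up' ? -amount : amount;
  // Clamp so the target never goes below 0 or past the last scrollable offset.
  const targetY = Math.min(Math.max(currentY + delta, 0), maxY);
  return {
    scrollType: 'relative',
    direction,
    amount,
    scrolledFrom: { y: currentY },
    scrolledTo: { y: targetY },
    atTop: targetY <= 0,
    atBottom: targetY >= maxY
  };
}

console.log(computeRelativeScroll({ currentY: 500, direction: 'down', amount: 300, maxY: 3000 }));
```

In a browser, the computed `scrolledTo.y` would then be passed to something like `window.scrollTo({ top: targetY, behavior })`.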
3. Absolute Position Scrolling
Scroll to specific x/y coordinates on the page.
Example: Scroll to absolute position
{
tool: "scroll_to_element",
params: {
x: 0, // Horizontal position (optional)
y: 1000, // Vertical position
behavior: "smooth"
}
}
Example: Scroll to top of page
{
tool: "scroll_to_element",
params: {
x: 0,
y: 0,
behavior: "instant" // Instant scroll to top
}
}
Tool Parameters
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| selector | string | No* | - | CSS selector for element to scroll to. Takes highest priority if provided. |
| position | string | No | 'center' | Position for element scrolling: 'center', 'top', or 'bottom'. Only used with selector. |
| direction | string | No* | - | Direction for relative scrolling: 'up' or 'down'. Only used if no selector provided. |
| amount | number | No | 500 | Pixels to scroll for relative scrolling. Only used with direction. |
| x | number | No* | - | Horizontal scroll position. Only used if no selector or direction provided. |
| y | number | No* | - | Vertical scroll position. Only used if no selector or direction provided. |
| behavior | string | No | 'smooth' | Scroll behavior: 'smooth' or 'instant'. |
* At least one of selector, direction, or x/y must be provided.
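The constraints in this table can be checked before a tool call is issued. The sketch below is a hypothetical validator (`validateScrollParams` is not an SDK function) that returns a list of problems; the allowed values are copied from the table, and the error wording is invented for illustration.

```javascript
// Hypothetical validator for scroll_to_element params.
// Returns an array of human-readable problems; an empty array means valid.
function validateScrollParams(params = {}) {
  const errors = [];
  const hasSelector = typeof params.selector === 'string' && params.selector.length > 0;
  const hasDirection = params.direction !== undefined;
  const hasPosition = params.x !== undefined || params.y !== undefined;

  if (!hasSelector && !hasDirection && !hasPosition) {
    errors.push('Provide at least one of selector, direction, or x/y');
  }
  if (hasDirection && !['up', 'down'].includes(params.direction)) {
    errors.push("direction must be 'up' or 'down'");
  }
  if (params.position !== undefined && !['center', 'top', 'bottom'].includes(params.position)) {
    errors.push("position must be 'center', 'top', or 'bottom'");
  }
  if (params.behavior !== undefined && !['smooth', 'instant'].includes(params.behavior)) {
    errors.push("behavior must be 'smooth' or 'instant'");
  }
  return errors;
}

console.log(validateScrollParams({ behavior: 'smooth' }));
console.log(validateScrollParams({ selector: '#pricing', position: 'top' }));
```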
Return Values
The tool returns different result objects depending on the scroll mode:
Element Scrolling Result:
{
success: true,
scrollType: "element",
selector: "#contact-section",
elementPosition: {
top: 1200, // Element's position after scroll
left: 0
}
}
Relative Scrolling Result:
{
success: true,
scrollType: "relative",
direction: "down",
amount: 300,
scrolledFrom: { y: 500 }, // Starting scroll position
scrolledTo: { y: 800 }, // Final scroll position
atTop: false, // True if scrolled to top
atBottom: false // True if scrolled to bottom
}
Absolute Position Scrolling Result:
{
success: true,
scrollType: "position",
scrolledTo: {
x: 0,
y: 1000
}
}
Error Result:
{
success: false,
error: "Element not found: #missing-element"
}
Priority Order
The tool checks parameters in this order:
- selector - If provided, scrolls to element (ignores direction and x/y)
- direction - If no selector, checks for direction: 'up' or 'down' (ignores x/y)
- x/y - If no selector or direction, uses absolute position scrolling
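This priority order can be written as a short chain of checks. The function below is a sketch of the described logic, with `resolveScrollMode` as a hypothetical name; the returned strings reuse the `scrollType` values from the Return Values section.

```javascript
// Hypothetical sketch: decide which scroll mode a set of params selects.
function resolveScrollMode(params = {}) {
  if (params.selector) return 'element'; // highest priority
  if (params.direction === 'up' || params.direction === 'down') return 'relative';
  if (params.x !== undefined || params.y !== undefined) return 'position';
  return null; // nothing usable was provided
}

// selector wins even when direction and y are also present.
console.log(resolveScrollMode({ selector: '#pricing', direction: 'down', y: 0 }));
console.log(resolveScrollMode({ direction: 'up', y: 1000 }));
console.log(resolveScrollMode({ y: 0 }));
```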
Use Cases
Scroll to specific section:
{ selector: "#pricing", position: "top" }
Scroll down to see more content:
{ direction: "down", amount: 500 }
Scroll up to previous content:
{ direction: "up", amount: 300 }
Scroll to top of page:
{ y: 0, behavior: "instant" }
Scroll to bottom of page:
{ y: document.documentElement.scrollHeight, behavior: "smooth" }
Important Notes
- Boundary Protection: Relative scrolling automatically prevents scrolling beyond page boundaries
- Backward Compatible: Existing code using selector or x/y continues to work unchanged
- Smooth Scrolling: Default behavior is smooth scrolling (500ms wait time). Use behavior: "instant" for immediate scrolling (100ms wait time)
- Element Not Found: If selector doesn't match any element, returns error without scrolling
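Because a failed lookup comes back as a result object rather than an exception, callers need to check `success` before reading the mode-specific fields. The sketch below shows one way to do that; `describeScrollResult` is a hypothetical helper, and the field names follow the result shapes listed under Return Values.

```javascript
// Hypothetical helper: summarize any scroll_to_element result as one line.
function describeScrollResult(result) {
  if (!result.success) {
    return `Scroll failed: ${result.error}`;
  }
  switch (result.scrollType) {
    case 'element':
      return `Scrolled to ${result.selector}`;
    case 'relative': {
      const edge = result.atTop ? ' (reached top)' : result.atBottom ? ' (reached bottom)' : '';
      return `Scrolled ${result.direction} to y=${result.scrolledTo.y}${edge}`;
    }
    case 'position':
      return `Scrolled to (${result.scrolledTo.x}, ${result.scrolledTo.y})`;
    default:
      return 'Unknown scroll result';
  }
}

console.log(describeScrollResult({ success: false, error: 'Element not found: #missing-element' }));
```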
Examples
See the examples/ directory for complete usage examples:
- test-text-chat.html - TTP Chat Widget with customizable settings
- test-signed-link.html - Widget with signed link authentication
- react-example.jsx - React component usage
- vanilla-example.html - Vanilla JavaScript usage
Development
# Install dependencies
npm install
# Start development server
npm run dev
# Build for production
npm run build
# Run tests
npm test
Browser Support
- Chrome 66+
- Firefox 60+
- Safari 11.1+
- Edge 79+
License
MIT
Backend SDK (Java)
For server-side applications, phone system integration, or backend processing, see the Java SDK documentation.
Key Features:
- ✅ Format negotiation (Protocol v2)
- ✅ Raw audio pass-through (PCMU/PCMA for phone systems)
- ✅ No audio decoding (perfect for forwarding to phone systems)
- ✅ Event-driven API
Quick Start:
VoiceSDKConfig config = new VoiceSDKConfig();
config.setWebsocketUrl("wss://speech.talktopc.com/ws/conv?agentId=xxx&appId=yyy");
config.setOutputEncoding("pcmu"); // For phone systems
config.setOutputSampleRate(8000);
VoiceSDK sdk = new VoiceSDK(config);
sdk.onAudioData(audioData -> {
// Forward raw PCMU to phone system
phoneSystem.sendAudio(audioData);
});
sdk.connect();
See java-sdk/README.md for full documentation.
Support
For support and questions, please open an issue on GitHub or contact our support team.
