@kalpalabs/voiceagent-sdk
v0.0.2
Kalpalabs VoiceAgent TypeScript SDK
Call control
import { VoiceAgent } from '@kalpalabs/voiceagent-sdk';
const agent = new VoiceAgent('<client-api-key>');
// Automatically connects your microphone and audio output
// in the browser via WebSockets.
// Provide only the parameters you want to override for this call;
// the rest fall back to the defaults listed in the "Agent parameters" section below.
await agent.start({
"llm": {
"model": "llama-3.1-8b",
"system_prompt": "You are a helpful agent",
},
"tts": {
"voice_name": "dan"
},
});
// Mute and unmute the user's microphone
agent.isMuted(); // false
agent.setMuted(true);
agent.isMuted(); // true
// say(message: string, endCallAfterSpoken?: boolean) makes the assistant speak
// the given message and, if requested, gracefully ends the call afterwards
agent.say("Our time's up, goodbye!", true);
// stop session
agent.stop();
Agent parameters
Full list of supported params and their default values:
"params": {
"llm": {
"model": "llama-3.3-70b",
"max_output_tokens": 512,
"system_prompt": "You are a helpful agent",
"temperature": 0.5,
"top_p": 0.9,
},
"tts": {
"max_output_tokens": 2048,
"voice_name": "tara"
},
"vad": {
"threshold": 0.6,
"min_silence_duration_ms": 500,
"speech_pad_ms": 500
}
}
Currently only openai/whisper-large-v3-turbo is supported for STT, and canopylabs/orpheus-tts-0.1-finetune-prod for TTS.
For the LLM, model can be one of the following:
"llama-3.1-8b", "llama-3.3-70b", "qwen-3-32b", "llama-4-scout-17b-16e-instruct", "llama-4-maverick-17b-128e-instruct"
For TTS, voice_name can take one of the following options:
"tara", "leah", "jess", "leo", "dan", "mia", "zac", "zoe"
Events
You can listen to the following events that the agent emits and act on them:
agent.on('speech-start', () => {
console.log('Assistant speech has started');
});
agent.on('speech-end', () => {
console.log('Assistant speech has ended');
});
agent.on('call-start', () => {
console.log('Call has started');
});
agent.on('call-end', () => {
console.log('Call has stopped');
});
// Function calls and transcripts will be sent via messages
agent.on('message', (message) => {
console.log(message);
});
agent.on('error', (e) => {
console.error(e);
});
Full list of Kalpalabs -> Client messages
These are the messages you can handle in your agent.on('message') event handler:
- Call start message contains the conversation_id of the current call:
{
"type": "call_start",
"conversation_id": "<conversation_id>"
}- Transcript message (both user and assistant):
{
"type": "transcript",
"transcript": "<transcript>",
"role": "user|assistant",
}- Transcript update message (assistant partial transcript):
{
"type": "transcript_update",
"transcript": "<partial_transcript>",
"role": "assistant",
"request_id": "<request_id>"
}
- Speech start message:
{
"type": "speech_start",
"role": "user|assistant"
}- Response finished message when current "turn" of assistant speaking is finished:
{
"type": "response_finished"
}
- Latency message, containing the TTFB latency (in ms) of the current conversation "turn":
{
"type": "latency",
"latency": 500
}
- Error message:
{
"type": "error",
"message": "<error_message>"
}
- Disconnect message, sent when the server is about to disconnect the websocket:
{
"type": "disconnect",
"reason": "<disconnection_reason>"
}