@nishanth.kannan1/voicepipe

v0.1.1

Published

12 days ago

VoicePipe Client SDK — build voice AI agents with one API key. Sub-800ms latency, 9 Indian languages.

nishanth.kannan1

voice agent AI realtime voicepipe tts stt

VoicePipe SDK

Build real-time voice AI agents with one API key. Sub-800ms response latency, 9 Indian languages.

Install

npm install @nishanth.kannan1/voicepipe

Quick Start

import VoicePipe from '@nishanth.kannan1/voicepipe';

const agent = VoicePipe({
  apiKey: 'vp_your_key_here',
  systemPrompt: 'You are a customer support agent for Acme Corp.',
  language: 'en',
});

agent.on('turn', (turn) => {
  console.log(`User: ${turn.user_text}`);
  console.log(`Agent: ${turn.agent_text}`);
  console.log(`Latency: ${turn.latency.total_ms}ms`);
});

agent.start();

That's it. The SDK handles microphone capture, audio streaming, AI processing, and speaker playback. You only need your VoicePipe API key — no Deepgram, Groq, or Cartesia keys required.

Get Your API Key

1. Sign up at voicepipe.io
2. Go to API Keys in the console
3. Generate a key. It looks like vp_xxxxxxxxxxxx

The free plan includes 100 minutes per month. No credit card required.

Configuration

const agent = VoicePipe({
  apiKey: 'vp_xxx',                // required — your VoicePipe API key
  systemPrompt: 'You are ...',     // what the agent should do
  language: 'hi',                  // language code (default: 'en')
  apiUrl: 'https://api.voicepipe.io', // API server (default)
});

Supported Languages

| Code | Language | Code | Language |
|------|-----------|------|-----------|
| en | English | ml | Malayalam |
| hi | Hindi | bn | Bengali |
| ta | Tamil | gu | Gujarati |
| te | Telugu | mr | Marathi |
| kn | Kannada | | |
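
A config mistake such as an unsupported language code is easier to catch before the agent starts. The following is a minimal sketch of such a check, not part of the SDK: `checkConfig` and `SUPPORTED_LANGUAGES` are hypothetical names, and the rule that every key starts with `vp_` is an assumption based on the example key format shown above.

```javascript
// Language codes copied from the Supported Languages table.
const SUPPORTED_LANGUAGES = ['en', 'hi', 'ta', 'te', 'kn', 'ml', 'bn', 'gu', 'mr'];

// Hypothetical helper, not part of the SDK: returns a list of problems
// found in a config object before it is passed to VoicePipe().
function checkConfig(config) {
  const problems = [];

  // Assumption: keys always carry the "vp_" prefix seen in the example key.
  if (typeof config.apiKey !== 'string' || !config.apiKey.startsWith('vp_')) {
    problems.push('apiKey should be a string starting with "vp_"');
  }

  // language is optional; the documented default is 'en'.
  const language = config.language ?? 'en';
  if (!SUPPORTED_LANGUAGES.includes(language)) {
    problems.push(`unsupported language code: ${language}`);
  }

  return problems;
}
```

An empty array means the config passed both checks, so a caller could log the returned problems and skip creating the agent when the array is non-empty.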

Events

// Connection ready
agent.on('ready', (data) => {
  console.log('Connected:', data.session_id);
});

// User speech transcribed
agent.on('transcript', (data) => {
  console.log(data.text, data.is_final);
});

// Agent response text
agent.on('agent_text', (data) => {
  console.log(data.text, data.is_complete);
});

// Full turn complete with metrics
agent.on('turn', (turn) => {
  console.log('User:', turn.user_text);
  console.log('Agent:', turn.agent_text);
  console.log('Latency:', turn.latency.total_ms, 'ms');
  console.log('Under 800ms:', turn.latency.under_800ms);
});

// Error
agent.on('error', (data) => {
  console.error(data.error);
});

// Session ended
agent.on('stopped', (data) => {
  console.log('Turns:', data.turn_count);
});
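
The `turn` payload carries per-turn latency, so session-level figures can be derived on the client. The sketch below is one way to do that, assuming only the `latency.total_ms` and `latency.under_800ms` fields shown above; `createLatencyTracker` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper, not part of the SDK: accumulates latency figures
// from 'turn' event payloads.
function createLatencyTracker() {
  let turns = 0;
  let totalMs = 0;
  let underTarget = 0;

  return {
    // Usable directly as a callback: agent.on('turn', tracker.record)
    record(turn) {
      turns += 1;
      totalMs += turn.latency.total_ms;
      if (turn.latency.under_800ms) underTarget += 1;
    },
    // Average latency and the share of turns under the 800 ms target.
    summary() {
      return {
        turns,
        averageMs: turns === 0 ? 0 : totalMs / turns,
        underTargetRatio: turns === 0 ? 0 : underTarget / turns,
      };
    },
  };
}
```

Because `record` closes over its counters instead of relying on `this`, it can be passed to `agent.on` without binding.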

React Integration

import { useEffect, useRef, useState } from 'react';
import VoicePipe from '@nishanth.kannan1/voicepipe';

function VoiceAgent() {
  const agentRef = useRef(null);
  const [status, setStatus] = useState('idle');

  useEffect(() => {
    const agent = VoicePipe({
      apiKey: process.env.NEXT_PUBLIC_VOICEPIPE_KEY,
      systemPrompt: 'You are a support agent.',
    });

    agent.on('ready', () => setStatus('listening'));
    agent.on('turn', (turn) => {
      // update your UI with the conversation
    });
    agent.on('stopped', () => setStatus('idle'));

    agentRef.current = agent;
    return () => agent.stop();
  }, []);

  return (
    <div>
      <p>Status: {status}</p>
      <button onClick={() => agentRef.current?.start()}>
        Start Conversation
      </button>
      <button onClick={() => agentRef.current?.stop()}>
        Stop
      </button>
    </div>
  );
}

Methods

| Method | Description |
|--------|-------------|
| agent.start() | Start the voice agent (requests mic permission) |
| agent.stop() | Stop and clean up |
| agent.on(event, callback) | Register event listener |
| agent.getSessionId() | Get current session ID |
| agent.getTurnCount() | Get number of completed turns |

How It Works

When you call agent.start(), the following happens:

1. The SDK creates a session on VoicePipe's servers using your API key
2. The SDK connects via WebSocket (handled internally)
3. The SDK captures microphone audio from the browser
4. The SDK streams audio to VoicePipe for processing
5. VoicePipe runs Speech-to-Text → LLM → Text-to-Speech on its own infrastructure
6. Agent audio streams back and plays through the speaker
7. Events fire with transcripts, responses, latency, and cost data

You never deal with WebSocket connections, audio encoding, or AI providers. The SDK and VoicePipe's servers handle everything.
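
Since application code only sees this pipeline through events, handlers can be exercised without a microphone by swapping in a stand-in object. The sketch below is a hypothetical test double, not something the SDK provides: `createFakeAgent` and the `'fake-session'` ID are invented for illustration, and it emits events synchronously, which the real SDK presumably does not.

```javascript
// Hypothetical test double, not part of the SDK. It mimics the event
// names and payload fields shown in this README so handlers can be
// exercised without a microphone or network access.
function createFakeAgent(scriptedTurns) {
  const listeners = {};
  let turnCount = 0;
  let running = false;

  function emit(event, data) {
    (listeners[event] || []).forEach((callback) => callback(data));
  }

  return {
    on(event, callback) {
      (listeners[event] = listeners[event] || []).push(callback);
    },
    // Emits 'ready', then one 'turn' per scripted turn, all synchronously.
    start() {
      running = true;
      emit('ready', { session_id: 'fake-session' });
      for (const turn of scriptedTurns) {
        turnCount += 1;
        emit('turn', turn);
      }
    },
    // Emits 'stopped' once; calling stop() before start() does nothing.
    stop() {
      if (!running) return;
      running = false;
      emit('stopped', { turn_count: turnCount });
    },
    getSessionId: () => 'fake-session',
    getTurnCount: () => turnCount,
  };
}
```

A UI component that accepts an agent as a parameter can then be driven by `createFakeAgent([...])` in unit tests and by `VoicePipe({...})` in production.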

Requirements

- A browser with microphone access (Chrome, Firefox, Safari, or Edge)
- A page served from localhost or an HTTPS origin (microphone access requires a secure context)
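
Both requirements can be checked before calling agent.start(), so the page can show a helpful message instead of failing on the permission request. This is a minimal sketch using the standard `isSecureContext` and `navigator.mediaDevices.getUserMedia` browser APIs; `canCaptureMicrophone` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical pre-flight check, not part of the SDK. Pass the browser's
// global object (window) as env; it is a parameter so the function can
// also be exercised outside a browser.
function canCaptureMicrophone(env) {
  // True on HTTPS origins and on localhost.
  const hasSecureContext = env.isSecureContext === true;
  // navigator.mediaDevices is undefined in insecure contexts.
  const hasGetUserMedia =
    typeof env.navigator?.mediaDevices?.getUserMedia === 'function';
  return hasSecureContext && hasGetUserMedia;
}
```

In a page, a call such as `canCaptureMicrophone(window)` returning false would be the cue to disable the start button. It does not tell you whether the user will grant permission, only whether the request can be made at all.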

License

MIT