talkima

v0.1.0

Published

a month ago

Production voice AI framework for Node.js — Worker-thread-isolated, real-time STT→LLM→TTS pipelines

sandyyy

voice ai realtime stt tts llm webrtc vad worker-threads voice-agent agent

talkima

Production voice AI framework for Node.js.

Wire together Speech-to-Text → LLM → Text-to-Speech in a single fluent chain. The framework ships with worker-thread isolation, real-time barge-in interruption, voice activity detection (VAD), session history, and plugin hooks.

npm install talkima

Requires Node.js 18+.

Quickstart

import 'dotenv/config';
import { talkima, deepgram, groq, elevenlabs, energyVad } from 'talkima';

const server = talkima()
  .transport('websocket', { port: 8080 })
  .stt(deepgram({ apiKey: process.env.DEEPGRAM_API_KEY! }))
  .llm(groq({
    apiKey:       process.env.GROQ_API_KEY!,
    model:        'llama-3.3-70b-versatile',
    systemPrompt: 'You are Aria, a concise voice assistant.',
    maxTokens:    150,
  }))
  .tts(elevenlabs({
    apiKey:  process.env.ELEVENLABS_API_KEY!,
    voiceId: 'EXAVITQu4vr4xnSDxMaL',
  }))
  .vad(energyVad({ threshold: 0.04 }))
  .build();

await server.listen();
// Voice agent is live at ws://localhost:8080
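A client has to send PCM audio over the WebSocket, but this README does not state the exact wire format (sample rate, bit depth, framing). The sketch below assumes 16-bit signed PCM and shows how a browser client might convert the Float32 samples produced by the Web Audio API before sending them. `floatToPcm16` is a hypothetical helper, not part of talkima.

```typescript
// Hypothetical client-side helper; assumes the server accepts 16-bit signed PCM.
function floatToPcm16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  samples.forEach((value, i) => {
    // Clamp to [-1, 1] so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, value));
    // Negative samples scale by 32768 and positive by 32767 to cover the int16 range.
    out[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  });
  return out;
}
```

The resulting `Int16Array` can be passed to `WebSocket.send()` via its `buffer` property. Check the transport's expected sample rate before relying on this.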

Pipeline

Browser/App
    │  WebSocket (PCM audio)
    ▼
Transport Layer
    │
    ▼
VAD  ──────────────── barge-in interrupt
    │
    ▼
STT (Deepgram / AssemblyAI)
    │  transcript
    ▼
LLM (Groq / OpenAI / Anthropic / Ollama)
    │  streamed text sentences
    ▼
TTS (ElevenLabs / Cartesia)
    │  PCM audio chunks
    ▼
Playout Scheduler ──► WebSocket → Client

Each session runs in its own worker thread, so an uncaught error in one session does not take down the others.
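The diagram shows the LLM handing "streamed text sentences" to TTS. How talkima splits the token stream is not documented here. As an illustration of the idea, the sketch below buffers streamed tokens and emits a sentence once terminal punctuation is followed by whitespace. `SentenceChunker` is a hypothetical name, not part of the library.

```typescript
// Hypothetical helper illustrating sentence-level chunking of an LLM token stream.
class SentenceChunker {
  private buffer = '';

  // Feed one streamed token; returns any sentences completed by it.
  push(token: string): string[] {
    this.buffer += token;
    const sentences: string[] = [];
    // A sentence ends at ., ! or ? followed by whitespace.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.buffer.slice(start, end).trim());
      start = end;
    }
    // Keep the unfinished tail for the next token.
    this.buffer = this.buffer.slice(start);
    return sentences;
  }

  // Call when the LLM stream ends to emit whatever is left.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = '';
    return rest ? [rest] : [];
  }
}
```

Sending whole sentences lets TTS start speaking before the LLM has finished its reply. This simple rule leaves decimals such as "3.5" intact but splits after abbreviations like "Dr.", so a production splitter would need more care.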

STT Providers

| Provider   | Factory      |
|------------|--------------|
| Deepgram   | deepgram()   |
| AssemblyAI | assemblyai() |

.stt(deepgram({ apiKey: '...', model: 'nova-2', language: 'en-US' }))

LLM Providers

| Provider                  | Factory        |
|---------------------------|----------------|
| Groq (fastest)            | groq()         |
| OpenAI                    | openai()       |
| Anthropic (Claude)        | anthropic()    |
| Any OpenAI-compatible URL | openaiCompat() |

// Groq
.llm(groq({ apiKey: '...', model: 'llama-3.3-70b-versatile' }))

// OpenAI
.llm(openai({ apiKey: '...', model: 'gpt-4o-mini' }))

// Ollama (local)
.llm(openaiCompat({ baseUrl: 'http://localhost:11434/v1', apiKey: 'ollama', model: 'llama3' }))

TTS Providers

| Provider   | Factory      |
|------------|--------------|
| ElevenLabs | elevenlabs() |
| Cartesia   | cartesia()   |

.tts(elevenlabs({ apiKey: '...', voiceId: 'EXAVITQu4vr4xnSDxMaL' }))

VAD Options

| VAD          | Factory     | Notes                              |
|--------------|-------------|------------------------------------|
| Energy (RMS) | energyVad() | Built-in, zero dependencies        |
| Silero ONNX  | silero()    | Most accurate, requires model file |
| DTX (WebRTC) | dtxVad()    | WebRTC-based, no model file needed |

.vad(energyVad({ threshold: 0.04 }))
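To give a feel for what an energy (RMS) detector does, here is a minimal sketch. It assumes the threshold is compared against the RMS of samples normalized to [-1, 1]; the README does not say how `energyVad()` scales its input, so treat `rms` and `isSpeech` as hypothetical illustrations rather than the library's implementation.

```typescript
// Hypothetical illustration of RMS-based voice activity detection.
// Assumption: 16-bit PCM input, threshold expressed on a normalized [0, 1] scale.
function rms(frame: Int16Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  frame.forEach((sample) => {
    const x = sample / 32768; // normalize int16 to [-1, 1)
    sumSquares += x * x;
  });
  return Math.sqrt(sumSquares / frame.length);
}

// A frame counts as speech when its energy reaches the threshold.
function isSpeech(frame: Int16Array, threshold = 0.04): boolean {
  return rms(frame) >= threshold;
}
```

A lower threshold picks up quieter speech but also more background noise. Real detectors usually add a hangover period so that short pauses between words do not end the utterance.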

LLM Tools (Function Calling)

.llm(groq({
  apiKey: '...',
  model: 'llama-3.3-70b-versatile',
  systemPrompt: 'You are a weather assistant.',
  tools: [
    {
      name: 'get_weather',
      description: 'Get current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
      handler: async (args) => `It's sunny in ${args.city}.`,
    },
  ],
}))
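Models occasionally call a tool with missing or mistyped arguments, and this README does not say whether talkima validates them against `parameters` before invoking `handler`. A handler can defend itself with a small check like the sketch below. `checkToolArgs` and `ToolParameters` are hypothetical names, and only the `required` list and primitive `type` values are examined.

```typescript
// Hypothetical helper; mirrors the shape of the `parameters` object shown above.
interface ToolParameters {
  type: 'object';
  properties: Record<string, { type: string }>;
  required?: string[];
}

// Returns a list of problems; an empty list means the arguments look usable.
function checkToolArgs(schema: ToolParameters, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (args[key] === undefined) problems.push(`missing required argument "${key}"`);
  }
  for (const key of Object.keys(args)) {
    const expected = schema.properties[key]?.type;
    // Only primitive JSON Schema types map directly onto typeof.
    const primitive = expected === 'string' || expected === 'number' || expected === 'boolean';
    if (primitive && args[key] !== undefined && typeof args[key] !== expected) {
      problems.push(`argument "${key}" should be a ${expected}`);
    }
  }
  return problems;
}
```

Inside a handler, a non-empty result can be returned to the model as the tool output so that it can correct the call instead of the handler throwing.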

Session Persistence (Redis)

import { redis } from 'talkima';

.store(redis({ url: process.env.REDIS_URL!, ttlSeconds: 3600 }))

Conversation history is preserved across reconnects and server restarts.

Plugin Hooks

Intercept and transform data at any pipeline stage:

import type { Plugin } from 'talkima';

const myPlugin: Plugin = {
  name: 'my-plugin',
  afterSTT:  async (chunk, sid) => { console.log(chunk.text); return chunk; },
  beforeTTS: async (text,  sid) => text.replace(/\bAI\b/g, 'A.I.'),
};

talkima()
  .plugin(myPlugin)
  // ...transport, STT, LLM, and TTS configuration as in the Quickstart
  .build();

Available hooks: beforeSTT, afterSTT, beforeLLM, afterLLM, beforeTTS, afterTTS.
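Hooks are easier to test when the transformation itself lives in a pure function. As a sketch, the helper below extends the `beforeTTS` example above by also stripping markdown markers that a TTS engine might otherwise read aloud. `cleanForSpeech` is a hypothetical name, and the choice of characters to strip is an assumption.

```typescript
// Hypothetical pure helper that a beforeTTS hook could delegate to.
function cleanForSpeech(text: string): string {
  return text
    .replace(/[*`#]+/g, '')      // drop markdown emphasis, code, and heading markers
    .replace(/\bAI\b/g, 'A.I.')  // same substitution as the README example
    .replace(/\s+/g, ' ')        // collapse whitespace left behind by the removals
    .trim();
}
```

A plugin would then define `beforeTTS: async (text, sid) => cleanForSpeech(text)`, keeping the hook itself a one-liner.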

Transports

| Transport | Usage                                                 |
|-----------|-------------------------------------------------------|
| WebSocket | .transport('websocket', { port: 8080 })               |
| Daily     | .transport('daily', { roomUrl: '...', token: '...' }) |
| Twilio    | .transport('twilio', { accountSid: '...', ... })      |

Telemetry

.telemetry({ enabled: true, serviceName: 'my-bot', endpoint: 'http://localhost:4318' })

Exports OpenTelemetry traces and talkima.e2e latency histograms to any OTLP backend (Jaeger, Grafana Tempo, Datadog…). Port 4318 in the example is the default OTLP/HTTP port.

Environment Variables

DEEPGRAM_API_KEY=
GROQ_API_KEY=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=EXAVITQu4vr4xnSDxMaL
# optional
REDIS_URL=redis://localhost:6379
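The Quickstart reads these variables with TypeScript's non-null assertion (`!`), which silences the compiler but does nothing at runtime. A startup check along the following lines can fail fast with a clear message. `missingEnv` and `REQUIRED_ENV` are hypothetical names; the variable list is taken from the block above, with `REDIS_URL` left out because it is optional.

```typescript
// Hypothetical startup check; names come from the environment list above.
const REQUIRED_ENV = ['DEEPGRAM_API_KEY', 'GROQ_API_KEY', 'ELEVENLABS_API_KEY'];

// Returns the names of required variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_ENV,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}
```

Calling `missingEnv(process.env)` before building the server and throwing when the result is non-empty turns a confusing provider authentication error into an immediate, readable one.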
