talkima
v0.1.0
Production voice AI framework for Node.js — Worker-thread-isolated, real-time STT→LLM→TTS pipelines
talkima
Production voice AI framework for Node.js.
Wire together Speech-to-Text → LLM → Text-to-Speech in a single fluent chain. Worker-thread isolation, real-time barge-in interruption, VAD, session history, and plugin hooks — batteries included.
npm install talkima
Requires Node.js 18+.
Quickstart
import 'dotenv/config';
import { talkima, deepgram, groq, elevenlabs, energyVad } from 'talkima';

const server = talkima()
  .transport('websocket', { port: 8080 })
  .stt(deepgram({ apiKey: process.env.DEEPGRAM_API_KEY! }))
  .llm(groq({
    apiKey: process.env.GROQ_API_KEY!,
    model: 'llama-3.3-70b-versatile',
    systemPrompt: 'You are Aria, a concise voice assistant.',
    maxTokens: 150,
  }))
  .tts(elevenlabs({
    apiKey: process.env.ELEVENLABS_API_KEY!,
    voiceId: 'EXAVITQu4vr4xnSDxMaL',
  }))
  .vad(energyVad({ threshold: 0.04 }))
  .build();

await server.listen();
// Voice agent is live at ws://localhost:8080
Pipeline
Browser/App
│ WebSocket (PCM audio)
▼
Transport Layer
│
▼
VAD ──────────────── barge-in interrupt
│
▼
STT (Deepgram / AssemblyAI)
│ transcript
▼
LLM (Groq / OpenAI / Anthropic / Ollama)
│ streamed text sentences
▼
TTS (ElevenLabs / Cartesia)
│ PCM audio chunks
▼
Playout Scheduler ──► WebSocket → Client
Each session runs in an isolated worker thread. A crash in one session never touches another.
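The "streamed text sentences" hop between the LLM and TTS stages can be pictured as a small buffer that accumulates streamed tokens and releases complete sentences. The sketch below is illustrative only: `SentenceBuffer` and its methods are hypothetical names, not part of talkima's API, and the punctuation rule is a simplifying assumption.

```typescript
// Hypothetical helper, not part of talkima. Buffers streamed LLM tokens and
// emits whole sentences so a TTS stage can start speaking before the reply ends.
class SentenceBuffer {
  private pending = "";

  // Append one streamed token; return any sentences completed by it.
  push(token: string): string[] {
    this.pending += token;
    const sentences: string[] = [];
    // Assumption: a sentence ends at ., ! or ? followed by whitespace.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Call when the LLM stream ends to release any trailing text.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest ? [rest] : [];
  }
}

const buffer = new SentenceBuffer();
const spoken: string[] = [];
for (const token of ["Hi", " there", ". ", "How can", " I help", "?"]) {
  for (const sentence of buffer.push(token)) spoken.push(sentence);
}
for (const sentence of buffer.flush()) spoken.push(sentence);
console.log(spoken); // two entries: "Hi there." and "How can I help?"
```

Emitting sentence by sentence rather than waiting for the full reply is what lets audio start early, and it also gives a barge-in a natural point at which to stop.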
STT Providers
| Provider | Factory |
|-------------|----------------|
| Deepgram | deepgram() |
| AssemblyAI | assemblyai() |
.stt(deepgram({ apiKey: '...', model: 'nova-2', language: 'en-US' }))
LLM Providers
| Provider | Factory |
|-----------------------------|-------------------|
| Groq (fastest) | groq() |
| OpenAI | openai() |
| Anthropic (Claude) | anthropic() |
| Any OpenAI-compatible URL | openaiCompat() |
// Groq
.llm(groq({ apiKey: '...', model: 'llama-3.3-70b-versatile' }))
// OpenAI
.llm(openai({ apiKey: '...', model: 'gpt-4o-mini' }))
// Ollama (local)
.llm(openaiCompat({ baseUrl: 'http://localhost:11434/v1', apiKey: 'ollama', model: 'llama3' }))
TTS Providers
| Provider | Factory |
|-------------|----------------|
| ElevenLabs | elevenlabs() |
| Cartesia | cartesia() |
.tts(elevenlabs({ apiKey: '...', voiceId: 'EXAVITQu4vr4xnSDxMaL' }))
VAD Options
| VAD | Factory | Notes |
|---------------|----------------|------------------------------------|
| Energy (RMS) | energyVad() | Built-in, zero dependencies |
| Silero ONNX | silero() | Most accurate, requires model file |
| DTX (WebRTC) | dtxVad() | WebRTC-based, no model file needed |
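To illustrate the Energy (RMS) row: an energy detector typically computes the root-mean-square of each audio frame and compares it with a threshold. The sketch below assumes Float32 samples normalized to [-1, 1]. `frameRms` and `isSpeech` are hypothetical helpers, and talkima's actual `energyVad()` may differ, for example by smoothing across frames.

```typescript
// Hypothetical helpers: a minimal RMS energy gate, not talkima's implementation.
// Assumes Float32 samples normalized to the range [-1, 1].
function frameRms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  return Math.sqrt(sumSquares / frame.length);
}

// A frame counts as speech when its RMS energy reaches the threshold.
function isSpeech(frame: Float32Array, threshold = 0.04): boolean {
  return frameRms(frame) >= threshold;
}

// 20 ms frames at 16 kHz (320 samples): near-silence vs. a 440 Hz tone.
const quiet = new Float32Array(320).fill(0.001);
const tone = new Float32Array(320);
for (let i = 0; i < tone.length; i++) {
  tone[i] = 0.5 * Math.sin((2 * Math.PI * 440 * i) / 16000);
}

console.log(isSpeech(quiet)); // false
console.log(isSpeech(tone)); // true
```

A lower threshold reacts to quieter speech but also to background noise, which is the usual trade-off when tuning a value such as 0.04.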
.vad(energyVad({ threshold: 0.04 }))
LLM Tools (Function Calling)
.llm(groq({
  apiKey: '...',
  model: 'llama-3.3-70b-versatile',
  systemPrompt: 'You are a weather assistant.',
  tools: [
    {
      name: 'get_weather',
      description: 'Get current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
      handler: async (args) => `It's sunny in ${args.city}.`,
    },
  ],
}))
Session Persistence (Redis)
import { redis } from 'talkima';
.store(redis({ url: process.env.REDIS_URL!, ttlSeconds: 3600 }))
Conversation history is preserved across reconnects and server restarts.
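A rough picture of TTL-based session history, assuming the store keeps each session's messages under its own key and expires them `ttlSeconds` after the last write. The in-memory `MemoryStore` below is a hypothetical stand-in for Redis, and its `append` and `load` methods are illustrative names rather than talkima's store interface.

```typescript
// Hypothetical in-memory stand-in for a Redis-backed session store.
// Not talkima's store interface; the method names are illustrative.
type Message = { role: "user" | "assistant"; text: string };
type Entry = { history: Message[]; expiresAt: number };

class MemoryStore {
  private entries = new Map<string, Entry>();
  private ttlSeconds: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlSeconds: number, now: () => number = Date.now) {
    this.ttlSeconds = ttlSeconds;
    this.now = now;
  }

  // Append a message and refresh the session's expiry.
  append(sessionId: string, message: Message): void {
    const history = this.load(sessionId);
    history.push(message);
    const expiresAt = this.now() + this.ttlSeconds * 1000;
    this.entries.set(sessionId, { history, expiresAt });
  }

  // Return a copy of the history, or an empty list once the TTL has passed.
  load(sessionId: string): Message[] {
    const entry = this.entries.get(sessionId);
    if (!entry || entry.expiresAt <= this.now()) {
      this.entries.delete(sessionId);
      return [];
    }
    return entry.history.slice();
  }
}

let clock = 0; // fake clock in milliseconds
const store = new MemoryStore(3600, () => clock);
store.append("session-1", { role: "user", text: "What's the weather?" });
store.append("session-1", { role: "assistant", text: "Sunny." });

clock = 1800 * 1000; // a reconnect 30 minutes later still sees both turns
const afterReconnect = store.load("session-1").length;
console.log(afterReconnect); // 2

clock = 3601 * 1000; // past the one-hour TTL, the history is gone
const afterExpiry = store.load("session-1").length;
console.log(afterExpiry); // 0
```

With Redis the same behaviour would normally come from key expiry on the server side, so history also survives a restart of the Node.js process as long as the TTL has not elapsed.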
Plugin Hooks
Intercept and transform data at any pipeline stage:
import type { Plugin } from 'talkima';

const myPlugin: Plugin = {
  name: 'my-plugin',
  afterSTT: async (chunk, sid) => { console.log(chunk.text); return chunk; },
  beforeTTS: async (text, sid) => text.replace(/\bAI\b/g, 'A.I.'),
};

talkima().plugin(myPlugin)...build();
Available hooks: beforeSTT, afterSTT, beforeLLM, afterLLM, beforeTTS, afterTTS.
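One way to picture how hooks compose, assuming plugins run in registration order and each hook receives the previous hook's return value. `runBeforeTTS` and the reduced `TextPlugin` type are hypothetical; the README does not specify talkima's ordering or error-handling rules.

```typescript
// Hypothetical sketch of hook chaining; not talkima's internal dispatcher.
// Only beforeTTS is modelled, using the (text, sid) shape from the plugin example.
type TextPlugin = {
  name: string;
  beforeTTS?: (text: string, sid: string) => Promise<string>;
};

// Run each plugin's beforeTTS hook in registration order, feeding the
// output of one hook into the next. Plugins without the hook are skipped.
async function runBeforeTTS(
  plugins: TextPlugin[],
  text: string,
  sid: string
): Promise<string> {
  let current = text;
  for (const plugin of plugins) {
    if (plugin.beforeTTS) {
      current = await plugin.beforeTTS(current, sid);
    }
  }
  return current;
}

const spellOut: TextPlugin = {
  name: "spell-out",
  beforeTTS: async (text) => text.replace(/\bAI\b/g, "A.I."),
};
const logger: TextPlugin = { name: "logger" }; // defines no beforeTTS hook
const trimmer: TextPlugin = {
  name: "trimmer",
  beforeTTS: async (text) => text.trim(),
};

const result = runBeforeTTS([spellOut, logger, trimmer], "  AI is listening. ", "session-1");
result.then((text) => console.log(JSON.stringify(text))); // "A.I. is listening."
```

Under this assumption, registration order matters: a plugin that rewrites text should be registered before one that measures or logs the final string.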
Transports
| Transport | Usage |
|-------------|-----------------------------------------------------|
| WebSocket | .transport('websocket', { port: 8080 }) |
| Daily | .transport('daily', { roomUrl: '...', token: '...' }) |
| Twilio | .transport('twilio', { accountSid: '...', ... }) |
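Clients of the WebSocket transport send PCM audio, while browser capture APIs usually produce Float32 samples. The helper below sketches the conversion a client might need, assuming the server expects 16-bit signed little-endian PCM. The README does not state the sample format or rate, so treat both as assumptions; `floatTo16BitPcm` is a hypothetical name.

```typescript
// Hypothetical client-side helper. The 16-bit little-endian format is an
// assumption, since the README does not specify talkima's wire format.
function floatTo16BitPcm(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Scale negatives by 32768 and positives by 32767 to fill the int16 range.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true); // true = little-endian
  }
  return buffer;
}

// Decode the bytes again to check the round trip.
const encoded = new DataView(floatTo16BitPcm(new Float32Array([0, 0.5, -1, 2])));
const decoded: number[] = [];
for (let i = 0; i < 4; i++) decoded.push(encoded.getInt16(i * 2, true));
console.log(decoded); // 0, 16384, -32768, 32767
```

In a browser, the returned `ArrayBuffer` can be passed directly to `WebSocket.send()`.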
Telemetry
.telemetry({ enabled: true, serviceName: 'my-bot', endpoint: 'http://localhost:4318' })
Exports OpenTelemetry traces and talkima.e2e latency histograms to any OTLP backend (Jaeger, Grafana Tempo, Datadog…).
Environment Variables
DEEPGRAM_API_KEY=
GROQ_API_KEY=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=EXAVITQu4vr4xnSDxMaL
# optional
REDIS_URL=redis://localhost:6379
Links
- Usage Guide — full API reference with all options
- Publishing Guide
- Issues
License
MIT © Sandeep Singh
