@create-voice-agent/hume

v0.1.0

Hume AI Text-to-Speech integration for voice agents

@create-voice-agent/hume 🎭

Hume AI Text-to-Speech integration for create-voice-agent.

This package provides emotionally expressive voice synthesis using Hume AI's streaming TTS API.

Installation

npm install @create-voice-agent/hume
# or
pnpm add @create-voice-agent/hume

Quick Start

import { createVoiceAgent } from "create-voice-agent";
import { AssemblyAISpeechToText } from "@create-voice-agent/assemblyai";
import { HumeTextToSpeech } from "@create-voice-agent/hume";
import { ChatOpenAI } from "@langchain/openai";

const voiceAgent = createVoiceAgent({
  model: new ChatOpenAI({ model: "gpt-4o" }),
  
  stt: new AssemblyAISpeechToText({ /* ... */ }),
  
  tts: new HumeTextToSpeech({
    apiKey: process.env.HUME_API_KEY!,
  }),
});

API Reference

HumeTextToSpeech

Streaming Text-to-Speech model using Hume AI's WebSocket API with instant mode for low latency.

import { HumeTextToSpeech } from "@create-voice-agent/hume";

const tts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  
  // Optional configuration
  voiceName: "Ava Song",
  voiceProvider: "HUME_AI",
  outputSampleRate: 16000,
  
  // Callbacks
  onAudioComplete: () => console.log("Finished speaking"),
  onInterrupt: () => console.log("Speech interrupted"),
});

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| apiKey | string | required | Hume AI API key |
| voiceName | string | "Ava Song" | Name of the voice to use |
| voiceProvider | "HUME_AI" \| "CUSTOM_VOICE" | "HUME_AI" | Voice provider |
| outputSampleRate | number | 16000 | Output audio sample rate (Hz) |
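
The defaults in the table above can be written down as a small sketch. The names `HumeTtsOptionsSketch` and `resolveOptions` are hypothetical and used only for illustration; this README does not state the package's exported type names, and this is not the package's own implementation.

```typescript
// Hypothetical type mirroring the options table; not the package's real export.
type VoiceProvider = "HUME_AI" | "CUSTOM_VOICE";

interface HumeTtsOptionsSketch {
  apiKey: string;
  voiceName?: string;
  voiceProvider?: VoiceProvider;
  outputSampleRate?: number;
}

// Fill in the documented default for every option the caller omits.
function resolveOptions(
  options: HumeTtsOptionsSketch
): Required<HumeTtsOptionsSketch> {
  if (!options.apiKey) {
    throw new Error("apiKey is required");
  }
  return {
    apiKey: options.apiKey,
    voiceName: options.voiceName ?? "Ava Song",
    voiceProvider: options.voiceProvider ?? "HUME_AI",
    outputSampleRate: options.outputSampleRate ?? 16000,
  };
}
```

With this sketch, `resolveOptions({ apiKey: "..." })` yields the "Ava Song" voice from the `HUME_AI` provider at 16000 Hz, matching the defaults listed in the table.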

Available Voices

Hume AI provides a variety of expressive voices. Here are some of the built-in options:

| Voice Name | Description |
|------------|-------------|
| Ava Song | Default voice, warm and expressive |
| Kora | Friendly and conversational |
| Dacher | Calm and professional |
| Aura | Gentle and soothing |
| Finn | Energetic and upbeat |

To get the full list of available voices, use the Hume API:

const response = await fetch("https://api.hume.ai/v0/tts/voices", {
  headers: { "X-Hume-Api-Key": process.env.HUME_API_KEY! },
});
const voices = await response.json();
console.log(voices);

Voice Providers

| Provider | Description |
|----------|-------------|
| HUME_AI | Built-in Hume AI voices (default) |
| CUSTOM_VOICE | Your custom cloned voices |

Instance Methods

interrupt()

Interrupt the current speech generation. Useful for barge-in handling.

// User started speaking - stop the agent
tts.interrupt();

speak(text: string): ReadableStream<Buffer>

Generate speech directly without going through the voice pipeline. Returns a ReadableStream of PCM audio buffers.

This is useful for:

  • Initial greetings when a call starts
  • System announcements that bypass the agent
  • One-off speech synthesis outside of conversations

const tts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  voiceName: "Kora",
});

// Generate and play a greeting (audioOutput stands for your application's audio sink)
const audioStream = tts.speak("Hello! I'm here to help. What's on your mind?");

for await (const chunk of audioStream) {
  // Send to audio output (speakers, WebRTC, etc.)
  audioOutput.write(chunk);
}

The speak() method opens a dedicated WebSocket connection and uses the same voice configuration as the main TTS pipeline.
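
If you would rather buffer a short utterance than play it chunk by chunk, the stream can be collected into a single byte array. The helper below is a minimal sketch; `collectAudio` is a hypothetical name, not part of this package. It assumes the stream returned by speak() can be consumed with `for await`, as in the loop above.

```typescript
// Hypothetical helper: concatenate every chunk of an async-iterable audio
// stream into one contiguous byte array.
async function collectAudio(
  stream: AsyncIterable<Uint8Array>
): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let totalLength = 0;

  // Drain the stream, remembering each chunk and the running byte count.
  for await (const chunk of stream) {
    chunks.push(chunk);
    totalLength += chunk.byteLength;
  }

  // Copy the chunks back to back into a single buffer.
  const combined = new Uint8Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    combined.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return combined;
}
```

Buffering delays playback until synthesis has finished, so this approach suits caching or saving short prompts rather than live conversation.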

Callbacks

onAudioComplete

Called when speech generation finishes and the WebSocket closes.

const tts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  onAudioComplete: () => {
    console.log("Agent finished speaking");
  },
});

onInterrupt

Called when speech is interrupted (e.g., by barge-in).

const tts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  onInterrupt: () => {
    console.log("Speech was interrupted");
  },
});

Features

Instant Mode

This integration uses Hume's instant mode for the lowest possible latency. Audio starts streaming as soon as text is received, making it ideal for real-time conversational AI.

Automatic Resampling

Hume outputs audio at 48kHz. This integration automatically resamples to your target sample rate (default: 16kHz) using linear interpolation.

// Output at 8kHz for telephony
const telephonyTts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  outputSampleRate: 8000,
});

// Output at 24kHz for higher quality
const highQualityTts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  outputSampleRate: 24000,
});
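
To illustrate what linear-interpolation resampling does, here is a minimal sketch that operates on mono 16-bit samples. The function name `resampleLinear` is hypothetical, and the package's internal resampler may differ in its details. Note that plain linear interpolation applies no low-pass filter, so downsampling this way can introduce aliasing.

```typescript
// Resample mono 16-bit PCM samples from one rate to another by linearly
// interpolating between the two nearest input samples.
function resampleLinear(
  input: Int16Array,
  fromRate: number,
  toRate: number
): Int16Array {
  if (fromRate === toRate || input.length === 0) {
    return input.slice();
  }

  const outputLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Int16Array(outputLength);
  const step = fromRate / toRate; // input samples advanced per output sample

  for (let i = 0; i < outputLength; i++) {
    const position = i * step;
    const left = Math.floor(position);
    // Clamp so the last output samples reuse the final input sample.
    const right = Math.min(left + 1, input.length - 1);
    const fraction = position - left;
    output[i] = Math.round(
      input[left] * (1 - fraction) + input[right] * fraction
    );
  }
  return output;
}
```

Going from 48000 Hz to 16000 Hz, this keeps every third input sample, because each output position lands exactly on an input sample. Other ratios blend neighbouring samples.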

PCM Output

Audio is output as raw PCM (16-bit signed, little-endian, mono) for easy integration with audio pipelines.
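
Because the output is headerless PCM, saving it as a playable file means prepending a container header yourself. The sketch below builds a standard 44-byte WAV header for 16-bit mono audio and computes the clip duration from the byte length. Both function names are hypothetical helpers, not part of this package.

```typescript
// Build a 44-byte RIFF/WAVE header for raw 16-bit signed little-endian mono PCM.
function buildWavHeader(pcmByteLength: number, sampleRate: number): Uint8Array {
  const channels = 1;
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;

  const header = new Uint8Array(44);
  const view = new DataView(header.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      header[offset + i] = text.charCodeAt(i);
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcmByteLength, true); // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // bytes per second
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bytesPerSample * 8, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, pcmByteLength, true);
  return header;
}

// Duration of a 16-bit mono PCM clip, derived from its size in bytes.
function pcmDurationSeconds(pcmByteLength: number, sampleRate: number): number {
  return pcmByteLength / (sampleRate * 2);
}
```

Writing the header followed by the PCM bytes produces a file most audio players can open. The `sampleRate` passed here should match the `outputSampleRate` the TTS instance was configured with.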

Custom Voices

To use a custom cloned voice:

const tts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  voiceName: "my-custom-voice",
  voiceProvider: "CUSTOM_VOICE",
});

See Hume's voice cloning documentation for creating custom voices.

Complete Example

import { createVoiceAgent, createThinkingFillerMiddleware } from "create-voice-agent";
import { AssemblyAISpeechToText } from "@create-voice-agent/assemblyai";
import { HumeTextToSpeech } from "@create-voice-agent/hume";
import { ChatOpenAI } from "@langchain/openai";

const tts = new HumeTextToSpeech({
  apiKey: process.env.HUME_API_KEY!,
  voiceName: "Kora",
  outputSampleRate: 16000,
  
  onAudioComplete: () => console.log("Agent finished speaking"),
});

const stt = new AssemblyAISpeechToText({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
  onSpeechStart: () => {
    // Barge-in: user started speaking, interrupt the agent
    tts.interrupt();
  },
});

const voiceAgent = createVoiceAgent({
  model: new ChatOpenAI({ model: "gpt-4o" }),
  prompt: "You are an empathetic voice assistant. Respond with warmth and understanding.",
  
  stt,
  tts,
  
  middleware: [
    createThinkingFillerMiddleware({ thresholdMs: 1000 }),
  ],
});

// Process audio streams (audioInputStream stands for your application's incoming audio)
const audioOutput = voiceAgent.process(audioInputStream);

License

MIT