glove-voice

v2.0.5

Voice pipeline for Glove agent framework

Voice pipeline for the Glove agent framework. Add real-time voice interaction to any Glove app — speak to your agent, hear it respond.

Architecture

Mic → VAD → STTAdapter → Glove Agent → TTSAdapter → Speaker

Install

pnpm add glove-voice

Quick Start (with ElevenLabs)

1. Server token routes (Next.js)

// app/api/voice/stt-token/route.ts
import { createVoiceTokenHandler } from "glove-next";
export const GET = createVoiceTokenHandler({ provider: "elevenlabs", type: "stt" });

// app/api/voice/tts-token/route.ts
import { createVoiceTokenHandler } from "glove-next";
export const GET = createVoiceTokenHandler({ provider: "elevenlabs", type: "tts" });

Set ELEVENLABS_API_KEY in .env.local.

2. Client adapter setup

import { createElevenLabsAdapters } from "glove-voice";

const { stt, createTTS } = createElevenLabsAdapters({
  getSTTToken: () => fetch("/api/voice/stt-token").then(r => r.json()).then(d => d.token),
  getTTSToken: () => fetch("/api/voice/tts-token").then(r => r.json()).then(d => d.token),
  voiceId: "JBFqnCBsd6RMkjVDRZzb",
});
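
The token getters above read `d.token` straight from the response, so a failed request would hand `undefined` to the adapter. A slightly more defensive getter could validate the body first. This is a minimal sketch: `extractToken` is a hypothetical helper, not part of glove-voice, and it only assumes the `{ token }` response shape used above.

```typescript
// Hypothetical helper (not part of glove-voice): validates the JSON body
// returned by a token route before the token reaches an adapter.
function extractToken(body: unknown): string {
  if (typeof body !== "object" || body === null) {
    throw new Error("Token response was not a JSON object");
  }
  const token = (body as { token?: unknown }).token;
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("Token response did not include a token string");
  }
  return token;
}

// Possible use in the browser (shown as a comment because it needs a running route):
//   getSTTToken: () => fetch("/api/voice/stt-token").then(r => r.json()).then(extractToken)
```

Throwing early keeps a misconfigured route or a missing API key from surfacing later as an opaque WebSocket authentication failure.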

3. Create voice instance

import { GloveVoice } from "glove-voice";

const voice = new GloveVoice(gloveRunnable, { stt, createTTS });
voice.on("mode", (mode) => console.log(mode)); // idle → listening → thinking → speaking
await voice.start();

4. React hook (optional)

import { useGloveVoice } from "glove-react/voice";

const voice = useGloveVoice({ runnable, voice: { stt, createTTS } });
// voice.mode, voice.transcript, voice.start(), voice.stop(), voice.interrupt()

Push-to-Talk (React)

useGlovePTT provides a high-level push-to-talk hook with click-vs-hold detection, hotkey support, and a minimum recording duration:

import { useGlovePTT } from "glove-react/voice";

const ptt = useGlovePTT(voice, {
  holdThresholdMs: 300,   // hold > 300ms = PTT, shorter = toggle
  minDurationMs: 600,     // minimum recording duration
  hotkey: " ",            // spacebar
});
// ptt.active, ptt.onPointerDown, ptt.onPointerUp
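
To make the two timing options concrete, here is a rough sketch of the click-vs-hold decision described in the comments above. `classifyPress` and `meetsMinDuration` are hypothetical helpers written for illustration, not the hook's internals, and what the hook does with a recording shorter than `minDurationMs` is not specified here.

```typescript
type PressKind = "hold" | "click";

interface PTTTiming {
  holdThresholdMs: number; // presses longer than this count as push-to-talk
  minDurationMs: number;   // minimum recording duration
}

// Hypothetical: a press held longer than the threshold is push-to-talk,
// anything shorter is treated as a toggle click.
function classifyPress(pressedForMs: number, timing: PTTTiming): PressKind {
  return pressedForMs > timing.holdThresholdMs ? "hold" : "click";
}

// Hypothetical: reports whether a recording reaches the minimum duration.
function meetsMinDuration(recordedMs: number, timing: PTTTiming): boolean {
  return recordedMs >= timing.minDurationMs;
}

const timing: PTTTiming = { holdThresholdMs: 300, minDurationMs: 600 };
console.log(classifyPress(120, timing));    // "click"
console.log(classifyPress(450, timing));    // "hold"
console.log(meetsMinDuration(450, timing)); // false
```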

Or use the headless VoicePTTButton component:

import { VoicePTTButton } from "glove-react/voice";

<VoicePTTButton ptt={ptt}>
  {({ active, handlers }) => (
    <button {...handlers}>{active ? "Recording..." : "Hold to talk"}</button>
  )}
</VoicePTTButton>

Config options

| Option | Type | Description |
|--------|------|-------------|
| startMuted | boolean | Start the pipeline with mic muted (useful for manual mode) |
| turnMode | "vad" \| "manual" | VAD for hands-free, manual for push-to-talk |

Turn Modes

| Mode | Behavior |
|------|----------|
| "vad" (default) | Hands-free. VAD auto-detects speech boundaries + barge-in |
| "manual" | Push-to-talk. Call commitTurn() to end user's turn |

Voice Activity Detection

Built-in VAD — Energy-based, zero dependencies:

// Used automatically when no custom VAD is provided
const voice = new GloveVoice(glove, { stt, createTTS });
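
The README only says the built-in detector is energy-based, so the following is a rough sketch of that general technique rather than glove-voice's implementation. `EnergyVAD`, its default threshold, and the hangover frame count are all hypothetical.

```typescript
// Root-mean-square energy of one frame of float samples in [-1, 1].
function rmsEnergy(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) sumSquares += frame[i] * frame[i];
  return Math.sqrt(sumSquares / frame.length);
}

// Hypothetical detector: speech starts on the first loud frame and ends
// after `hangoverFrames` consecutive quiet frames.
class EnergyVAD {
  private speaking = false;
  private quietFrames = 0;
  private readonly threshold: number;
  private readonly hangoverFrames: number;

  constructor(threshold = 0.02, hangoverFrames = 3) {
    this.threshold = threshold;
    this.hangoverFrames = hangoverFrames;
  }

  // Returns "start" or "end" when the speech state flips, otherwise null.
  process(frame: Float32Array): "start" | "end" | null {
    if (rmsEnergy(frame) >= this.threshold) {
      this.quietFrames = 0;
      if (!this.speaking) {
        this.speaking = true;
        return "start";
      }
      return null;
    }
    if (this.speaking) {
      this.quietFrames++;
      if (this.quietFrames >= this.hangoverFrames) {
        this.speaking = false;
        this.quietFrames = 0;
        return "end";
      }
    }
    return null;
  }
}
```

The hangover keeps short pauses between words from ending the turn, at the cost of a small delay before the end of speech is reported.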

SileroVAD — ML-based (ONNX Runtime WASM), more accurate:

// IMPORTANT: Use dynamic import to avoid pulling WASM into SSR bundle
const { SileroVADAdapter } = await import("glove-voice/silero-vad");
const vad = new SileroVADAdapter({
  positiveSpeechThreshold: 0.5,
  negativeSpeechThreshold: 0.35,
  wasm: { type: "cdn" },
});
await vad.init();

const voice = new GloveVoice(glove, { stt, createTTS, vad });
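
Two thresholds like these are commonly used as hysteresis: a frame has to score at or above the positive threshold to start speech, and has to fall below the negative threshold to end it. The sketch below illustrates that idea only. `speechStates` is hypothetical, the adapter's actual decision logic is not described here, and real detectors usually add a grace period before ending speech.

```typescript
interface HysteresisOptions {
  positiveSpeechThreshold: number;
  negativeSpeechThreshold: number;
}

// Hypothetical: maps per-frame speech probabilities to a speaking/not-speaking
// state using two-threshold hysteresis.
function speechStates(probabilities: number[], options: HysteresisOptions): boolean[] {
  let speaking = false;
  return probabilities.map((p) => {
    if (!speaking && p >= options.positiveSpeechThreshold) {
      speaking = true;
    } else if (speaking && p < options.negativeSpeechThreshold) {
      speaking = false;
    }
    return speaking;
  });
}

const states = speechStates([0.1, 0.6, 0.4, 0.3, 0.45, 0.7], {
  positiveSpeechThreshold: 0.5,
  negativeSpeechThreshold: 0.35,
});
console.log(states); // [false, true, true, false, false, true]
```

Note how the 0.4 frame stays in speech while the 0.45 frame does not restart it: the gap between the thresholds prevents flicker around a single cutoff.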

Security

API keys never leave your server. Adapters use short-lived, single-use tokens:

  1. Your server generates a token using the provider's API
  2. Token is passed to the browser
  3. Browser uses token to authenticate with STT/TTS WebSockets

Token handlers: createVoiceTokenHandler from glove-next supports ElevenLabs, Deepgram, and Cartesia.

Adapter Contracts

All adapters implement typed EventEmitter interfaces. Build your own by implementing:

  • STTAdapter — Streaming speech-to-text
  • TTSAdapter — Streaming text-to-speech
  • VADAdapter — Voice activity detection
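
As a rough idea of what a typed EventEmitter contract can look like, here is a self-contained sketch. Everything in it is hypothetical: the event names (`speech_start`, `speech_end`), the `TypedEmitter` base, and `FakeVAD` are stand-ins for illustration, and the real interfaces should be taken from the package's type definitions.

```typescript
type Listener<Args extends unknown[]> = (...args: Args) => void;

// Hypothetical event map: maps each event name to its listener arguments.
type VADEvents = {
  speech_start: [];
  speech_end: [durationMs: number];
};

// Minimal typed emitter, a stand-in for whatever base the library uses.
class TypedEmitter<Events extends Record<string, unknown[]>> {
  private listeners: Record<string, Listener<unknown[]>[]> = {};

  on<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    const key = String(event);
    if (!this.listeners[key]) this.listeners[key] = [];
    this.listeners[key].push(listener as unknown as Listener<unknown[]>);
  }

  protected emit<K extends keyof Events>(event: K, ...args: Events[K]): void {
    const registered = this.listeners[String(event)] || [];
    for (let i = 0; i < registered.length; i++) registered[i](...args);
  }
}

// Hypothetical adapter: turns per-frame speech decisions into events.
class FakeVAD extends TypedEmitter<VADEvents> {
  private speechStartedAt: number | null = null;

  process(isSpeech: boolean, timestampMs: number): void {
    if (isSpeech && this.speechStartedAt === null) {
      this.speechStartedAt = timestampMs;
      this.emit("speech_start");
    } else if (!isSpeech && this.speechStartedAt !== null) {
      const durationMs = timestampMs - this.speechStartedAt;
      this.speechStartedAt = null;
      this.emit("speech_end", durationMs);
    }
  }
}
```

With the event map in place, `vad.on("speech_end", (durationMs) => ...)` gets its argument type from `VADEvents`, and a misspelled event name fails at compile time.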

Exports

| Entry Point | Exports | Browser-safe |
|-------------|---------|--------------|
| glove-voice | GloveVoice, adapters, AudioCapture, AudioPlayer, VAD | Yes |
| glove-voice/server | Token generators (createElevenLabsSTTToken, etc.) | No (server only) |
| glove-voice/silero-vad | SileroVADAdapter | Yes (WASM) |

React voice bindings are exported from glove-react/voice:

| Export | Description |
|--------|-------------|
| useGloveVoice | Core voice hook: mode, transcript, start/stop/interrupt |
| useGlovePTT | Push-to-talk with click-vs-hold, hotkey, min-duration |
| VoicePTTButton | Headless PTT button component with render prop |

Framework Integration Notes

Next.js:

// next.config.ts
export default {
  transpilePackages: ["glove-voice"],
};

Build warnings from onnxruntime-web are expected and harmless.

Gotchas:

  • glove-voice/silero-vad must be dynamically imported — never import at module level in SSR
  • createTTS must be a factory function (called per turn), not a single instance
  • All adapters assume 16kHz mono PCM audio
  • ElevenLabs TTS sessions time out after about 20s of inactivity. GloveVoice handles this by closing the TTS session after each model response and opening a fresh one when the next text arrives
  • Barge-in protection for mutation-critical tools requires unAbortable: true on the tool. A pending pushAndWait resolver only suppresses the voice barge-in trigger; it does not prevent the tool from being aborted by other sources
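
For the PCM requirement above, audio captured elsewhere (for example, float samples from the Web Audio API) may need converting before it reaches a custom adapter. The helpers below are a minimal sketch: `floatToInt16` and `downmixToMono` are hypothetical, the 16-bit sample depth is an assumption (only the sample rate and channel count are stated here), and resampling to 16kHz is not covered.

```typescript
// Hypothetical: converts float samples in [-1, 1] to 16-bit signed PCM,
// clamping out-of-range values first.
function floatToInt16(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = clamped < 0 ? Math.round(clamped * 0x8000) : Math.round(clamped * 0x7fff);
  }
  return pcm;
}

// Hypothetical: averages two channels into one mono channel.
function downmixToMono(left: Float32Array, right: Float32Array): Float32Array {
  const length = Math.min(left.length, right.length);
  const mono = new Float32Array(length);
  for (let i = 0; i < length; i++) mono[i] = (left[i] + right[i]) / 2;
  return mono;
}
```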

License

MIT