voipi

v0.0.12

<p align="center"> <a href="https://voipi.vercel.app/"><img src="logo.svg" alt="voipi" width="128" height="128"></a> </p>

Downloads

485

Readme

Give your apps, CLIs, and agents a voice. VoiPi is a universal, zero-dependency, free text-to-speech library for JavaScript.

  • Pure JS with zero dependencies: less than 100 kB total install size and 10 kB for bundled providers
  • No API keys required
  • Multiple providers: Browser TTS, macOS, Edge TTS, Google TTS, Piper, eSpeak NG
  • Auto fallback: Picks the best available provider per platform
  • Auto language detection: Detects script (Arabic, Farsi, CJK, Cyrillic, etc.) and Latin-script languages (French, Spanish, German, Portuguese, etc.) — picks the best voice automatically
  • MCP Server: Give AI agents a voice — auto install with Claude Code, Codex, Cursor, Windsurf, OpenCode and Pi.

CLI

You can use voipi directly with npx/pnpx/bunx.

# Speak text (auto-selects best available provider)
npx voipi 'The quick brown fox jumps over the lazy dog'
npx voipi speak 'Hello world'

# Choose a specific voice and speed
npx voipi 'Hi' -v en-US-BrianNeural -r 1.5

# Save to file instead of playing
npx voipi speak 'Hi' -o hello.mp3

# Use a specific provider
npx voipi 'Bonjour le monde' -p edge-tts -v fr-FR-DeniseNeural

# List available voices
npx voipi voices

# List voices for a specific provider
npx voipi voices -p edge-tts

# Start MCP server (stdio transport)
npx voipi mcp

MCP Server

VoiPi includes a built-in MCP server that exposes text-to-speech tools over the stdio transport. This lets AI agents and LLM clients speak text, save audio files, and list voices.

Auto-install to all detected agents:

npx voipi@latest mcp --install

Programmatic Usage

VoiPi automatically picks the best available provider, falling back through the chain macOS → Edge TTS → Google TTS → Piper → eSpeak NG:

import { VoiPi } from "voipi";

const voice = new VoiPi();

// Speak text
await voice.speak("Hello world!");

// With a prioritized voice list (first available wins)
await voice.speak("Hello!", { voice: ["Samantha", "en-US-AriaNeural"], rate: 1.5 });

// Save to file
await voice.save("Hello!", "output.mp3");

// Get audio data with duration
const audio = await voice.toAudio("Hello world!");
console.log(`Duration: ${audio.duration}s`);

// List available voices
const voices = await voice.listVoices();
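
The fallback behaviour described above can be pictured as trying providers in order until one works. The following is a minimal sketch of that idea, not VoiPi's actual implementation: `SketchProvider`, `isAvailable`, and `speakWithFallback` are hypothetical names, and the providers here return a string instead of playing audio.

```typescript
// Hypothetical provider shape for illustration; VoiPi's real interfaces may differ.
interface SketchProvider {
  name: string;
  isAvailable(): boolean;
  speak(text: string): string; // returns a description instead of playing audio
}

// Try each provider in order and use the first one that is available and does not throw.
function speakWithFallback(chain: SketchProvider[], text: string): string {
  const errors: string[] = [];
  for (const provider of chain) {
    if (!provider.isAvailable()) {
      errors.push(`${provider.name}: not available`);
      continue;
    }
    try {
      return provider.speak(text);
    } catch (err) {
      errors.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`No provider could speak the text (${errors.join("; ")})`);
}

const chain: SketchProvider[] = [
  { name: "macos", isAvailable: () => false, speak: (t) => `macos spoke: ${t}` },
  { name: "edge-tts", isAvailable: () => true, speak: () => { throw new Error("network down"); } },
  { name: "espeak-ng", isAvailable: () => true, speak: (t) => `espeak-ng spoke: ${t}` },
];

console.log(speakWithFallback(chain, "Hello world!")); // "espeak-ng spoke: Hello world!"
```

Collecting the per-provider errors makes the final failure message useful when nothing in the chain works.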

You can also provide a custom provider chain using names, [name, options] tuples, or factory functions:

import { VoiPi, MacOS, EdgeTTS } from "voipi";

// Using provider names
const voice = new VoiPi({
  providers: ["edge-tts", "macos"],
});

// Using [name, options] tuples for provider configuration
const voice2 = new VoiPi({
  providers: [["edge-tts", { voice: "en-US-GuyNeural" }], "macos"],
});

// Using factory functions for full control
const voice3 = new VoiPi({
  providers: [() => new EdgeTTS({ voice: "en-US-GuyNeural" }), () => new MacOS()],
});
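
One way to support three entry forms in a single chain is to normalize each entry into a factory function first. The sketch below shows that pattern under stated assumptions: `ChainEntry`, `SketchProvider`, and `normalizeChain` are hypothetical names, and the code does not reflect how VoiPi resolves providers internally.

```typescript
// Hypothetical types; the option shape and provider object are assumptions for illustration.
type Options = Record<string, unknown>;
interface SketchProvider {
  name: string;
  options: Options;
}
type Factory = () => SketchProvider;
type ChainEntry = string | [string, Options] | Factory;

// Turn every accepted entry form into a factory so later code handles one shape only.
function normalizeChain(entries: ChainEntry[]): Factory[] {
  return entries.map((entry): Factory => {
    if (typeof entry === "function") return entry; // already a factory
    if (typeof entry === "string") return () => ({ name: entry, options: {} });
    const [name, options] = entry; // [name, options] tuple
    return () => ({ name, options });
  });
}

const factories = normalizeChain([
  "macos",
  ["edge-tts", { voice: "en-US-GuyNeural" }],
  () => ({ name: "custom", options: { rate: 1.2 } }),
]);

for (const make of factories) {
  const provider = make();
  console.log(provider.name, provider.options);
}
```

Because every entry becomes a factory, a provider is only constructed when the chain actually reaches it.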

Language Detection

VoiPi automatically detects the language of input text and selects an appropriate voice. This works across all providers — no manual voice selection needed for non-English text:

await voice.speak("سلام دنیا"); // Farsi → picks a Farsi voice
await voice.speak("مرحبا بالعالم"); // Arabic → picks an Arabic voice
await voice.speak("こんにちは"); // Japanese → picks a Japanese voice
await voice.speak("你好世界"); // Chinese → picks a Chinese voice
await voice.speak("L'éducation française est très appréciée"); // French → picks a French voice
await voice.speak("Straßenbahn und Gemütlichkeit"); // German → picks a German voice
await voice.speak("¿Cómo estás?"); // Spanish → picks a Spanish voice

VoiPi detects 30+ languages: those written in distinctive scripts (Arabic, Farsi, Urdu, CJK, Cyrillic, Devanagari, etc.) and Latin-script languages identified through diacritic analysis (French, Spanish, German, Portuguese, Turkish, Polish, Czech, Romanian, Vietnamese, and more). You can also use the detection utility directly:

import { detectLanguage } from "voipi";

detectLanguage("سلام دنیا"); // "fa"
detectLanguage("Hello world"); // "en"
detectLanguage("こんにちは世界"); // "ja"
detectLanguage("L'éducation française"); // "fr"
detectLanguage("Straßenbahn"); // "de"
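
To give a feel for how script and diacritic analysis can work, here is a deliberately crude sketch. It is not VoiPi's detector: `guessLanguage`, the Unicode ranges chosen, the tiny hint table, and the fallbacks (for example, treating all Cyrillic as Russian) are assumptions made for illustration.

```typescript
// Crude illustration of script- and diacritic-based language guessing.
// The ranges, hints, and fallbacks are assumptions, not VoiPi's actual rules.
function inRange(code: number, lo: number, hi: number): boolean {
  return code >= lo && code <= hi;
}

// Letters used in Persian but not in standard Arabic: pe, che, zhe, gaf, Farsi yeh.
// Simplification: Urdu shares some of these, so it would be misreported as "fa".
const PERSIAN_ONLY = [0x067e, 0x0686, 0x0698, 0x06af, 0x06cc];

// Characters that hint at one Latin-script language (deliberately tiny table).
const DIACRITIC_HINTS: Array<[string, string]> = [
  ["ß", "de"],
  ["ñ¿¡", "es"],
  ["ãõ", "pt"],
  ["ğı", "tr"],
];

function guessLanguage(text: string): string {
  let kana = false, han = false, arabic = false, persian = false, cyrillic = false;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (inRange(code, 0x3040, 0x30ff)) kana = true; // Hiragana + Katakana
    else if (inRange(code, 0x4e00, 0x9fff)) han = true; // CJK Unified Ideographs
    else if (inRange(code, 0x0600, 0x06ff)) {
      arabic = true;
      if (PERSIAN_ONLY.indexOf(code) !== -1) persian = true;
    } else if (inRange(code, 0x0400, 0x04ff)) cyrillic = true;
  }
  // Kana is checked before Han because Japanese mixes both.
  if (kana) return "ja";
  if (han) return "zh";
  if (arabic) return persian ? "fa" : "ar";
  if (cyrillic) return "ru"; // simplification: Cyrillic covers many languages

  const lower = text.toLowerCase();
  for (const [chars, lang] of DIACRITIC_HINTS) {
    for (let i = 0; i < chars.length; i++) {
      if (lower.indexOf(chars.charAt(i)) !== -1) return lang;
    }
  }
  return "en"; // default when nothing else matches
}

console.log(guessLanguage("こんにちは世界")); // "ja"
console.log(guessLanguage("Straßenbahn")); // "de"
console.log(guessLanguage("¿Cómo estás?")); // "es"
console.log(guessLanguage("Hello world")); // "en"
```

A real detector needs far larger tables and some scoring, since many diacritics are shared between languages.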

Duration Estimation

Estimate playback duration before or after synthesis:

import { VoiPi, estimateSpeechDuration, getAudioDuration } from "voipi";

const voice = new VoiPi();

// Pre-synthesis: estimate from text (~150 WPM heuristic)
const seconds = estimateSpeechDuration("Hello world!", 1.0);

// Post-synthesis: parse actual audio buffer (WAV/AIFF exact, MP3 estimated)
const audio = await voice.toAudio("Hello world!"); // duration auto-populated
console.log(audio.duration); // seconds
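
The pre-synthesis estimate is described as a roughly 150 words-per-minute heuristic. A minimal sketch of such a heuristic follows, assuming whitespace-separated words and a linear rate multiplier; `estimateSeconds` is a hypothetical helper and may not match what `estimateSpeechDuration` does internally.

```typescript
// Rough pre-synthesis estimate: word count at a fixed speaking pace, scaled by rate.
// The 150 WPM default follows the heuristic mentioned above; the rest is an assumption.
function estimateSeconds(text: string, rate = 1, wordsPerMinute = 150): number {
  if (rate <= 0) throw new RangeError("rate must be positive");
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0).length;
  return (words * 60) / (wordsPerMinute * rate);
}

console.log(estimateSeconds("one two three four five")); // 2
console.log(estimateSeconds("one two three four five", 2)); // 1
```

Counting whitespace-separated words does not work for languages written without spaces, such as Chinese or Japanese, so a real estimator would need a different unit there.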

Cancellation

Pass an AbortSignal to cancel synthesis, playback, downloads, and subprocesses:

const ctrl = new AbortController();
setTimeout(() => ctrl.abort(), 500);
await voice.speak("This will be cut off…", { signal: ctrl.signal });
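
The same AbortSignal pattern can be wrapped in a helper that enforces a time limit and reports how the call ended. This is a sketch under stated assumptions: `fakeSpeak` is a stand-in for a cancellable speech call, `runWithTimeout` is a hypothetical helper, and it assumes the cancelled call rejects its promise, which the text above does not confirm for VoiPi.

```typescript
// Stand-in for a cancellable speech call; VoiPi's real behaviour on abort may differ.
function fakeSpeak(durationMs: number, signal: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, durationMs);
    function onAbort(): void {
      clearTimeout(timer);
      reject(new Error("aborted"));
    }
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Run a cancellable task with a time limit and report how it ended.
async function runWithTimeout(
  task: (signal: AbortSignal) => Promise<void>,
  limitMs: number,
): Promise<"finished" | "cancelled"> {
  const ctrl = new AbortController();
  const timer = setTimeout(() => ctrl.abort(), limitMs);
  try {
    await task(ctrl.signal);
    return "finished";
  } catch (err) {
    if (ctrl.signal.aborted) return "cancelled";
    throw err; // unrelated failure, let the caller see it
  } finally {
    clearTimeout(timer);
  }
}

runWithTimeout((signal) => fakeSpeak(1000, signal), 50).then((result) => {
  console.log(result); // "cancelled"
});
```

Clearing the timer in `finally` keeps a finished call from leaving a pending timeout behind.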

Providers

macOS

Uses the native say command. Only available on macOS.

import { MacOS } from "voipi/macos";

const voice = new MacOS({ voice: "Samantha", rate: 1.2 });
await voice.speak("Hello world!");

// Override defaults per call
await voice.speak("Hello!", { voice: "Daniel", rate: 1.5 });

Edge TTS

Cross-platform online TTS using Microsoft Edge's neural speech service. 322+ voices with configurable rate, pitch, and volume.

import { EdgeTTS } from "voipi/edge-tts";

const voice = new EdgeTTS({ voice: "en-US-AriaNeural" });
await voice.speak("Hello world!");

// List all available voices
const voices = await voice.listVoices();

Google TTS

Cross-platform online TTS using Google Translate's speech endpoint. 55+ languages, zero config.

import { GoogleTTS } from "voipi/google-tts";

const voice = new GoogleTTS({ voice: "en" });
await voice.speak("Hello world!");

// Different language
const fr = new GoogleTTS({ voice: "fr" });
await fr.speak("Bonjour le monde!");

Piper

Local neural TTS powered by Piper. 40+ languages, fully offline after first download. Uses an existing piper install if found in PATH, otherwise auto-installs a standalone binary (Linux x86_64/aarch64) or pip venv (macOS/Windows). Voice models (ONNX) are downloaded on demand from HuggingFace and cached locally.

import { Piper } from "voipi/piper";

const voice = new Piper();
await voice.speak("Hello world!");

// Custom voice, speed, and speaker
const voice2 = new Piper({ voice: "en_US-lessac-medium", lengthScale: 0.8, speaker: 0 });
await voice2.speak("Hello!");

// List all available voices
const voices = await voice.listVoices();
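
The `lengthScale: 0.8` option above is the speed control. In Piper, the length scale stretches phoneme durations, so values below 1 speak faster and values above 1 speak slower. A minimal sketch of converting a speed multiplier into a length scale, assuming a simple inverse relationship; `rateToLengthScale` is a hypothetical helper, not part of VoiPi:

```typescript
// Convert a "times faster" multiplier into a Piper-style length scale.
// Assumes a plain inverse relationship: 1.25x speed corresponds to a scale of 0.8.
function rateToLengthScale(rate: number): number {
  if (rate <= 0) throw new RangeError("rate must be positive");
  return 1 / rate;
}

console.log(rateToLengthScale(1.25)); // 0.8
console.log(rateToLengthScale(0.5)); // 2
```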

eSpeak NG

Local TTS using the eSpeak NG speech synthesizer. Requires espeak-ng to be installed on the system (available in KDE, etc.). Supports 100+ languages with formant-based synthesis.

Note: eSpeak NG produces robotic-sounding output. For natural-sounding voices, prefer Piper, which uses neural TTS.

import { EspeakNG } from "voipi/espeak-ng";

const voice = await EspeakNG.create();
await voice.speak("Hello world!");

// Custom voice and speed
const voice2 = await EspeakNG.create({ voice: "en-us+f3", rate: 1.2 });
await voice2.speak("Hello!");

// List all available voices
const voices = await voice.listVoices();

Browser TTS

Uses the Web Speech API (speechSynthesis). Works in browsers only — speaks directly without producing audio files.

import { BrowserTTS } from "voipi/browser";

const voice = new BrowserTTS();
await voice.speak("Hello world!");

// Pick a specific voice
await voice.speak("Hello!", { voice: "Google US English", rate: 1.2 });

// List available voices (varies by browser/OS)
const voices = await voice.listVoices();

Note: Browser TTS plays audio directly and does not support save() or raw audio export.

Pi Extension

VoiPi also ships with a pi package that adds TTS tools and commands to pi.

pi install git:github.com/pithings/voipi

See packages/pi/README.md for usage details.

License

Published under the MIT license 💛.