openvoiceui

v2026.6.9

Voice-powered AI assistant platform — connect any LLM, any TTS, with a live web canvas, music generation, and agent orchestration

Watch the demo -- see voice-to-canvas in action


Install

Prerequisite: Docker must be installed and running for all install methods.

Pinokio (one-click)

Download Pinokio if you don't have it, then search "OpenVoiceUI" in the app store and click Install.

npm

npx openvoiceui setup     # interactive wizard — walks you through API keys + builds Docker images
npx openvoiceui start     # starts everything

Docker

git clone https://github.com/MCERQUA/OpenVoiceUI.git
cd OpenVoiceUI
cp .env.example .env        # edit with your API keys
docker compose up

Open localhost:5001 and start talking.


What is OpenVoiceUI?

OpenVoiceUI is a hands-free, AI-controlled computer. You talk — it builds. Live web apps, dashboards, games, full websites — rendered in real time while you watch. No mouse, no keyboard, no typing prompts into a chat box.

It runs on OpenClaw and works with any LLM. The AI agent can build and display apps mid-conversation, switch between projects with a voice command, generate music on the fly, delegate work to parallel sub-agents, and remember everything across sessions. It uses any Claude Code or OpenClaw skill — and the community can build and share more through the plugin system.

Self-hosted. Your hardware, your data. MIT licensed, forever free.

Core Features

  • Hands-Free AI Computer — Talk and watch it work. The AI builds apps, switches between projects, runs tasks, and displays results on a live visual canvas — all without touching a mouse or keyboard.
  • Live Canvas — AI renders real HTML pages mid-conversation: dashboards, tools, galleries, reports, full web apps. Not text responses — real interactive pages you can use.
  • AI Music Generation — Generate songs on the fly with your voice using Suno. Full music player with playlist management built in.
  • Custom Animated Interface — Choose from animated face modes (eye-face avatar, reactive halo-smoke orb) or install community-built faces through plugins. Build your own — the face system is fully extensible.
  • Sub-Agents — Delegate multiple tasks to parallel AI workers simultaneously and get results back.
  • Long-Term Memory — Optional context engine plugin curates knowledge every turn. Persists across sessions in human-readable markdown.
  • Desktop OS Interface — Themed desktop environment with window management (Windows XP, macOS, Ubuntu, Win95, Win 3.1).
  • Admin Dashboard — Mobile-responsive. Agent profiles, provider config, workspace file browser, plugin management, system health. Everything editable live.
  • Self-Hosted — Your hardware, your data. No vendor lock-in, no monthly fees.

And More

  • Image generation (FLUX.1, Stable Diffusion 3.5)
  • Video creation (Remotion Studio)
  • Voice cloning (Qwen3-TTS via fal.ai)
  • Cron jobs for scheduled automation
  • File explorer with drag-and-drop
  • Agent profiles — switch personas, voices, and LLM providers from the admin panel

Install Details

Option 1: Pinokio (one-click)

  1. Install Pinokio if you don't have it
  2. Search "OpenVoiceUI" in the Pinokio app store
  3. Click Install, then Start

Pinokio handles Docker, dependencies, and configuration automatically.

Option 2: npm

Requires Node.js 20+, Python 3.10+, and Docker.

npx openvoiceui setup     # interactive wizard — configures LLM, TTS, API keys, builds Docker images
npx openvoiceui start     # starts OpenClaw gateway + Supertonic TTS + voice UI

The setup wizard walks you through choosing an LLM provider, TTS provider, and entering API keys. Configuration is saved to .env and openclaw-data/.

npx openvoiceui stop      # stop all services
npx openvoiceui status    # check what's running
npx openvoiceui logs      # tail service logs

Option 3: Docker

Requires Docker and Docker Compose.

git clone https://github.com/MCERQUA/OpenVoiceUI.git
cd OpenVoiceUI
cp .env.example .env

Edit .env with your API keys. An LLM provider key is required; a TTS key is optional. Then:

docker compose up -d

This starts three containers:

| Container | Port | Purpose |
|-----------|------|---------|
| openclaw | 18791 | LLM gateway — routes to your chosen LLM provider |
| supertonic | (internal) | Free local TTS — no API key needed |
| openvoiceui | 5001 | Voice UI + Canvas + Admin dashboard |

Open http://localhost:5001 to use the voice interface, or http://localhost:5001/admin for the admin dashboard.
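
If the page does not load, it can help to confirm that the published ports are accepting connections. The following is a minimal sketch, not part of OpenVoiceUI: it assumes ports 18791 and 5001 from the container table are published on the local machine, and the helper names `port_is_open` and `check_services` are made up for illustration.

```python
import socket

# Ports taken from the container table; assumed to be published on the host.
PUBLISHED_PORTS = {"openclaw": 18791, "openvoiceui": 5001}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "127.0.0.1") -> dict:
    """Map each container name to whether its port accepts TCP connections."""
    return {name: port_is_open(host, port) for name, port in PUBLISHED_PORTS.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name}: {'reachable' if is_up else 'not reachable'}")
```

An open port only shows that something is listening. It does not prove the service behind it is healthy, so docker compose logs remains the better place to look when a container is misbehaving.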

To stop: docker compose down

Option 4: VPS / Production

For running on an Ubuntu server with nginx and systemd:

git clone https://github.com/MCERQUA/OpenVoiceUI.git
cd OpenVoiceUI
cp .env.example .env               # edit with your API keys
sudo bash deploy/setup-sudo.sh     # creates dirs, installs systemd service
bash deploy/setup-nginx.sh         # generates nginx config (edit domain)

See deploy/ for the full production setup including SSL, nginx reverse proxy, and systemd service files.


Configuration

All configuration is in .env. Copy .env.example to .env and fill in your values.

Required:

  • An LLM provider API key (OpenAI, Anthropic, Groq, Z.AI, or any OpenClaw-compatible provider)
  • CLAWDBOT_AUTH_TOKEN — set during npx openvoiceui setup or in OpenClaw's setup wizard

Optional but recommended:

  • GROQ_API_KEY — enables Groq Orpheus TTS (fast, high quality, free tier)
  • SUNO_API_KEY — enables AI music generation
  • CLERK_PUBLISHABLE_KEY — enables login/auth (for multi-user or public deployments)

See .env.example for all available options with descriptions.
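
The required and optional settings above can be checked before starting the stack. This is a rough sketch under stated assumptions, not the project's own validation: `CLAWDBOT_AUTH_TOKEN`, `GROQ_API_KEY`, `SUNO_API_KEY`, and `CLERK_PUBLISHABLE_KEY` come from this README, while `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are assumed names for LLM provider keys (check .env.example for the real ones). The parser is deliberately simple and ignores inline comments and multi-line values.

```python
REQUIRED = ("CLAWDBOT_AUTH_TOKEN",)

# ASSUMPTION: hypothetical variable names for LLM provider keys.
LLM_KEY_CANDIDATES = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")

# Optional keys and the feature each one enables, as listed in this README.
OPTIONAL_FEATURES = {
    "GROQ_API_KEY": "Groq Orpheus TTS",
    "SUNO_API_KEY": "AI music generation",
    "CLERK_PUBLISHABLE_KEY": "login/auth",
}


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env: dict) -> list:
    """List required settings that are absent or empty."""
    problems = [name for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in LLM_KEY_CANDIDATES):
        problems.append("an LLM provider API key")
    return problems


def disabled_features(env: dict) -> list:
    """List optional features whose key is absent or empty."""
    return [feat for key, feat in OPTIONAL_FEATURES.items() if not env.get(key)]


if __name__ == "__main__":
    sample = "CLAWDBOT_AUTH_TOKEN=abc\nOPENAI_API_KEY=sk-test\n"
    env = parse_env(sample)
    print("missing:", missing_settings(env))
    print("disabled:", disabled_features(env))
```

To check a real file, pass its contents to `parse_env`, for example `parse_env(open(".env").read())`.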


Works With Any Provider

LLM

| Provider | Status |
|----------|--------|
| OpenClaw Gateway | Built-in — routes to OpenAI, Anthropic, Groq, Z.AI, and more |
| Z.AI (GLM-5-turbo) | Built-in |
| Groq (Llama, Qwen) | Via OpenClaw |
| Google Gemini | Via OpenClaw |
| MiniMax | Via OpenClaw |
| Ollama (local) | Via adapter |
| Any LLM | Drop-in gateway plugin |

Text-to-Speech

| Provider | Status |
|----------|--------|
| Supertonic (local) | Free, ships with Docker setup |
| Groq Orpheus | Fast cloud TTS, free tier |
| Resemble AI | Premium cloned voices |
| Qwen3-TTS (fal.ai) | Voice cloning |
| Hume EVI | Emotion-aware |
| ElevenLabs | High quality, many voices |

Speech-to-Text

| Provider | Status |
|----------|--------|
| Web Speech API | Free, browser-native (default) |
| Deepgram | Streaming, accurate |
| Groq Whisper | Fast cloud transcription |


Admin Dashboard

Access at localhost:5001/admin. Mobile-responsive.

  • Profiles — View and activate agent personas
  • Agent Editor — Edit name, voice, LLM provider, system prompt, features, and agent workspace files across four tabs: Profile, System Prompt, Features, Agent Files
  • Plugins — Install and manage face packs, gateways, and extensions
  • Canvas Pages — Toggle public/private, lock pages, delete with archive
  • Workspace Files — Browse and edit agent workspace. Audio playback, image preview built in.
  • Music (Suno) — View all generated songs, play inline, archive tracks
  • Provider Config — Select LLM, TTS, STT providers. Saves to active profile.
  • Health and Stats — CPU, RAM, disk, gateway status, session reset
  • Connector Tests — 12 automated endpoint diagnostics

Use Cases

Small Business — AI receptionist, appointment scheduler, report builder. Talk to your AI and get a live dashboard of today's leads, reviews, and tasks.

Digital Agencies — Deploy custom AI assistants per client. Multi-tenant ready. Each client gets their own voice-powered workspace.

Developers — Fork it, extend it, deploy it anywhere. MIT licensed. Build custom plugins, gateway adapters, and canvas pages on top of a voice-first platform.


How It's Different

| | OpenVoiceUI | Typical Voice AI |
|---|---|---|
| Source | Open source (MIT) | Closed source |
| Canvas UI | Live HTML rendering | Text/audio only |
| Skills | Any Claude Code or OpenClaw skill | API endpoints |
| Music | AI music generation (Suno) | None |
| Memory | Plugin-based long-term context | Session only |
| Admin | Full dashboard, mobile-ready | Config files |
| Plugins | Community face packs, pages, workflows | None |
| Hosting | Self-hosted, your data | Vendor cloud only |
| Pricing | Free forever | Per-minute billing |


Tech Stack

| Layer | Technology |
|-------|-----------|
| Backend | Python / Flask |
| Frontend | Vanilla JS (ES modules, no framework) |
| Canvas | Fullscreen iframe + SSE |
| STT | Web Speech API, Deepgram, Groq Whisper |
| TTS | Supertonic, Groq Orpheus, Resemble, Qwen3-TTS |
| LLM | Any provider via OpenClaw gateway |
| Memory | Context engine plugin (markdown knowledge base) |
| Auth | Clerk (optional) |
| Deploy | npm, Docker, Pinokio, VPS/systemd |
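
For readers unfamiliar with SSE (Server-Sent Events), the Canvas row refers to a one-way stream of text messages from server to browser. The sketch below shows only the standard wire format of one message. The event name `canvas-update` and the payload field `page` are invented for illustration; this README does not describe the events OpenVoiceUI actually sends.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Encode one SSE message: optional 'event:' line, a 'data:' line, then a blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical event name and payload, for illustration only.
    print(format_sse({"page": "dashboard.html"}, event="canvas-update"), end="")
```

In a Flask app, strings like these are typically yielded from a generator and returned in a response with the `text/event-stream` MIME type, and the browser reads them with `EventSource`.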


Plugins

OpenVoiceUI has a plugin system for community-built extensions. Plugins can include animated face packs, canvas pages, workflow dashboards, gateway adapters, or any combination.

| Plugin | Type | Description |
|--------|------|-------------|
| BHB Animated Characters | Face Pack | Animated BigHead Billionaires character avatars with lip-sync, mood expressions, and show lore. By BHaleyart |
| Hermes Agent | Gateway | Self-improving AI agent (Hermes v0.13.0 / nousresearch/hermes-agent:v2026.5.7, pegged — never :latest) with auto-generated skills, deep memory search, autonomous tasks, multi-agent Kanban, goal-locking, video analysis, voice cloning. Adds OpenClaw+Hermes hybrid and Hermes-only modes |
| SEO Platform | Canvas Page | Full SEO dashboard powered by DataForSEO — keyword research, rank tracking, backlink analysis, site audits, AI visibility, and local SEO |
| Twenty CRM | Canvas Page | Connect to a Twenty CRM instance for contact, company, deal, and task management with embedded CRM view and setup wizard |

Build your own. Face packs, canvas pages, workflow dashboards, gateway adapters (template), or STT/TTS adapters (template). See the plugins repo for submission guidelines.


Documentation

Contributing

We welcome contributions — especially plugins. Build a face pack, a canvas page, a workflow dashboard, or a full extension and submit it to the plugins repo. See CONTRIBUTING.md for code contribution guidelines and openvoiceui.com for full documentation.

License

MIT