@khimaros/pi-omni

v0.14.0

pi.dev extension: push-to-talk voice mode (STT + TTS via OpenAI-compatible endpoint)

pi-omni

push-to-talk voice extension for pi.dev: wires a local OpenAI-compatible STT/LLM/TTS stack (e.g. llama-swap serving whisper.cpp + llama.cpp + a TTS model) into a pi agent session, plus an optional browser UI.

getting started

prerequisites:

  • node.js 20+
  • a working pi installation
  • arecord (alsa-utils) and a speaker command that reads WAV from stdin (aplay, paplay, ffplay -nodisp -autoexit -, …)
  • an OpenAI-compatible endpoint exposing STT, LLM, and TTS

install as a pi extension:

pi install npm:@khimaros/pi-omni

control voice mode from the pi tui:

> /omni              # push-to-talk: tap to record, VAD or re-tap to stop
> /omni-live         # continuous conversation: record → STT → LLM → TTS loop
> /omni-cancel       # cancel any active recording / TTS / chat loop
> /omni-setup        # configure endpoint, models, mic, speaker (re-run anytime)
> /omni-test [text]  # TTS round-trip diagnostic

control the web UI from the pi tui:

> /omni-web start    # start the web server
> /omni-web status   # view server status
> /omni-web open     # open the web UI in browser
> /omni-web stop     # stop the web server

or auto-start when pi launches (terminated when pi exits):

pi --omni-live       # continuous voice on launch
pi --omni-web        # web server on launch

run the web server standalone (no pi tui):

PI_VOICE_LLM_MODEL=qwen3-32b npx @khimaros/pi-omni

or install globally:

npm install -g @khimaros/pi-omni
PI_VOICE_LLM_MODEL=qwen3-32b pi-omni-web

then open http://127.0.0.1:4962.

Once loaded, the web UI automatically tracks active sessions. Use the sessions menu in the top-right corner to list recent sessions, switch between them, or start a new one. After a WebSocket disconnect or a page reload, the UI resumes the active session based on the URL hash.

pwa installation

the web UI is a Progressive Web App (PWA). you can "install" it to your home screen or desktop for a native-like experience:

  1. open the URL in a supported browser (Chrome, Safari, Edge).
  2. look for the "Install" icon in the address bar or select "Add to Home Screen" from the browser menu.
  3. the app appears on your device with a waveform icon.

from a source checkout

make            # install deps + build
make test       # run tests
make wasm       # rebuild wasm/apm (after touching wasm/apm/src/)

configuration

the first run of /omni triggers /omni-setup automatically. it walks through endpoint, models, mic, speaker, and an end-to-end round-trip test, then saves the result to ~/.pi/extensions/omni.json. re-run /omni-setup anytime to reconfigure.

env vars override the saved file:

| variable | default | purpose |
| --- | --- | --- |
| PI_VOICE_BASE_URL | http://localhost:8080/v1 | OpenAI-compatible endpoint |
| PI_VOICE_API_KEY | sk-no-key | llama-swap usually ignores it |
| PI_VOICE_STT_MODEL | whisper-1 | as exposed by your server |
| PI_VOICE_TTS_MODEL | tts-1 | |
| PI_VOICE_TTS_VOICE | alloy | |
| PI_VOICE_LLM_MODEL | (none) | required for standalone pi-omni-web |
| PI_VOICE_MIC_DEVICE | (default ALSA) | passed to arecord -D |
| PI_VOICE_SPEAKER_CMD | aplay -q ... | reads WAV from stdin |
| PI_VOICE_AEC_ENABLED | false | acoustic echo cancellation (WebRTC AEC3 WASM) |
| PI_VOICE_AEC_DELAY_MS | 200 | expected speaker→mic round-trip |
| PI_VOICE_BARGE_IN | false | keep mic open during TTS, cut in on speech |
| PI_VOICE_BARGE_IN_MIN_MS | 300 | minimum speech duration to count as barge-in |
| PI_VOICE_WEB_HOST | 127.0.0.1 | http bind address for the web server |
| PI_VOICE_WEB_PORT | 4962 | http port for the web server |
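the precedence described above (env var, then saved file, then default) can be sketched roughly as follows. this is an illustration only, not the package's actual config.ts: the `VoiceConfig` field names, `resolveConfig`, and `parsePort` are hypothetical, and only a handful of the variables from the table are shown.

```typescript
// Hypothetical subset of the config; field names are assumptions, not the real schema.
interface VoiceConfig {
  baseUrl: string;
  sttModel: string;
  ttsModel: string;
  ttsVoice: string;
  webPort: number;
}

type Env = Record<string, string | undefined>;

// Defaults copied from the table above.
const DEFAULTS: VoiceConfig = {
  baseUrl: "http://localhost:8080/v1",
  sttModel: "whisper-1",
  ttsModel: "tts-1",
  ttsVoice: "alloy",
  webPort: 4962,
};

// Returns a valid TCP port, or undefined so the next source in the chain is used.
function parsePort(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const port = Number(raw);
  return Number.isInteger(port) && port >= 1 && port <= 65535 ? port : undefined;
}

// Env var wins, then the saved file, then the built-in default.
function resolveConfig(saved: Partial<VoiceConfig>, env: Env): VoiceConfig {
  return {
    baseUrl: env["PI_VOICE_BASE_URL"] ?? saved.baseUrl ?? DEFAULTS.baseUrl,
    sttModel: env["PI_VOICE_STT_MODEL"] ?? saved.sttModel ?? DEFAULTS.sttModel,
    ttsModel: env["PI_VOICE_TTS_MODEL"] ?? saved.ttsModel ?? DEFAULTS.ttsModel,
    ttsVoice: env["PI_VOICE_TTS_VOICE"] ?? saved.ttsVoice ?? DEFAULTS.ttsVoice,
    webPort: parsePort(env["PI_VOICE_WEB_PORT"]) ?? saved.webPort ?? DEFAULTS.webPort,
  };
}
```

in a real process the second argument would be `process.env`, and `saved` would be the parsed contents of ~/.pi/extensions/omni.json. note that this sketch treats an empty-string env var as set; the actual implementation may differ.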

cli flags for the pi extension:

| flag | env var | config key | effect |
| --- | --- | --- | --- |
| --omni-live | PI_OMNI_AUTO_LIVE=true | autoStartLive | start continuous voice on launch |
| --omni-web | PI_OMNI_AUTO_WEB=true | autoStartWeb | start web server on launch |

cli flags for standalone pi-omni-web:

| flag | purpose |
| --- | --- |
| --listen <host:port> | http bind address; takes precedence over env vars |
| -h, --help | usage |
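as a rough sketch of how `--listen` could take precedence over PI_VOICE_WEB_HOST and PI_VOICE_WEB_PORT: the helper names `parseListen` and `resolveListen` are hypothetical, and the real pi-omni-web argument parsing may work differently (for example, it may also accept `--listen=host:port`, which this sketch does not handle).

```typescript
interface ListenAddress {
  host: string;
  port: number;
}

// Splits "<host:port>" on the last colon so hosts such as "[::1]" stay intact.
function parseListen(value: string): ListenAddress {
  const idx = value.lastIndexOf(":");
  if (idx <= 0) throw new Error(`expected <host:port>, got "${value}"`);
  const host = value.slice(0, idx);
  const port = Number(value.slice(idx + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port in "${value}"`);
  }
  return { host, port };
}

// The flag wins; otherwise fall back to the env vars, then to the documented defaults.
function resolveListen(
  argv: string[],
  env: Record<string, string | undefined>,
): ListenAddress {
  const i = argv.indexOf("--listen");
  if (i !== -1) {
    const value = argv[i + 1];
    if (value === undefined) throw new Error("--listen requires <host:port>");
    return parseListen(value);
  }
  const host = env["PI_VOICE_WEB_HOST"] ?? "127.0.0.1";
  const port = env["PI_VOICE_WEB_PORT"] ?? "4962";
  return parseListen(`${host}:${port}`);
}
```

a caller would typically pass `process.argv.slice(2)` and `process.env`.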

echo cancellation & barge-in

set aecEnabled: true and bargeInEnabled: true (via /omni-setup or env) to keep the mic open during TTS so you can interrupt by speaking. without AEC, only enable barge-in when using headphones; otherwise speaker output feeds back into the mic and the bot interrupts itself.
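to illustrate the role of a minimum speech duration (PI_VOICE_BARGE_IN_MIN_MS, default 300), here is a minimal sketch of a gate that only signals barge-in after enough uninterrupted speech. it assumes per-frame speech/no-speech decisions from a VAD and assumes the duration must be continuous; `createBargeInGate` is a hypothetical helper, not the package's implementation.

```typescript
// Tracks continuous speech and reports when it has lasted at least minMs.
function createBargeInGate(minMs = 300) {
  let speechMs = 0;
  return {
    // Feed one VAD decision per audio frame; returns true once barge-in should fire.
    push(isSpeech: boolean, frameMs: number): boolean {
      // Any non-speech frame resets the run, so short blips never accumulate.
      speechMs = isSpeech ? speechMs + frameMs : 0;
      return speechMs >= minMs;
    },
    // Call after TTS has been cut off so the next utterance starts from zero.
    reset(): void {
      speechMs = 0;
    },
  };
}
```

with 100 ms frames and the default threshold, the gate fires on the third consecutive speech frame. a shorter threshold reacts faster but is more likely to trigger on coughs or residual echo.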

the AEC is a Rust port of WebRTC AEC3 compiled to WASM, depended on as a file: package at wasm/apm/pkg/. rebuild after touching wasm/apm/src/:

make wasm
# or directly:
cd wasm/apm && wasm-pack build --target nodejs --release

build deps: rustup (e.g. via mise use -g rust@latest) with the wasm32-unknown-unknown target, plus wasm-pack.

roadmap

see ROADMAP.md for implemented and planned features.

architecture

src/
  extension/   pi extension entry (commands, shortcuts, event handlers)
  server/      HTTP + WS server hosting the browser client
  bin/         standalone executables (pi-omni-web)
  audio/       mic, STT, TTS, VAD, AEC, sentence chunker, sanitizer
  config.ts    shared config + env-var overrides
public/        browser client (no build step)
wasm/apm/      WebRTC AEC3 → WASM (rust)
test/          node --test files

development

make            # install deps + build (tsc)
make test       # build then run node --test
make lint       # type-check (tsc --noEmit)
make precommit  # lint + test
make install    # install globally from this checkout
make update     # npm update
make wasm       # rebuild wasm/apm
make pack       # npm pack into build/
make publish    # npm publish --access public
make clean      # rm -rf dist build

known limits

  • sentence chunking is naive (split on .!?\n); abbreviations like "e.g." will split early.
  • manual barge-in via /omni re-tap works without AEC; automatic barge-in needs AEC enabled or headphones.
  • if the pi extension bus doesn't forward message_update, TTS waits for turn_end — still works, just less interactive.
  • barge-in cuts off TTS instantly but the LLM keeps generating in the background until it finishes; its output is discarded.
  • standalone pi-omni-web requires PI_VOICE_LLM_MODEL; the pi extension path doesn't (pi owns the LLM).
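
the first limit above is easy to reproduce. the following sketch splits after every `.`, `!`, `?`, or newline, which is the rule the list describes; `chunkSentences` is a hypothetical stand-in, and the real chunker in src/audio/ may be written differently (for example, as a streaming chunker).

```typescript
// Naive chunker: break after each ".", "!", "?", or newline, then drop empty pieces.
function chunkSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?\n])/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// "e.g." contains two periods, so it is cut in the middle of the abbreviation.
console.log(chunkSentences("See e.g. this. Done!"));
// prints [ 'See e.', 'g.', 'this.', 'Done!' ]
```

each fragment would be sent to TTS separately, so the early split shows up as an unnatural pause inside the abbreviation.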