pi-voice-input

v0.2.10

Press Ctrl+Shift+R to dictate prompts into Pi using VolcEngine ASR

pi Voice Input

A publishable, pure TypeScript pi extension for Linux and macOS voice dictation into pi's editor.

  • Press Ctrl+Shift+R once to start recording.
  • Press Ctrl+Shift+R again to stop.
  • The extension sends the audio to VolcEngine WebSocket ASR.
  • The recognized text is inserted into pi's editor without submitting.

Current scope:

  • Linux uses pw-record from PipeWire tools or arecord from alsa-utils.
  • macOS uses afrecord when present, otherwise ffmpeg with AVFoundation.
  • A VolcEngine Speech API key is required.
  • This is not a local/offline ASR engine.

The provider layer is intended to be extensible. The current version supports only VolcEngine WebSocket ASR.

No Python, uv, or upload service is required for normal shortcut usage. On macOS systems without afrecord, install ffmpeg for recording.

Architecture

pi extension: extensions/index.ts → extensions/voice-input.ts
  ├─ registers Ctrl+Shift+R and /voice commands
  ├─ starts/stops a local recorder process
  │    ├─ Linux preferred: pw-record
  │    ├─ Linux fallback: arecord
  │    └─ macOS: afrecord, or ffmpeg/AVFoundation fallback
  ├─ records a temporary 16 kHz mono 16-bit WAV
  ├─ parses the WAV container in TypeScript and extracts raw PCM
  ├─ sends PCM frames to the configured ASR provider via ws
  │    └─ current provider: VolcEngine /api/v3/sauc/bigmodel_nostream
  ├─ optionally post-processes raw ASR text with a configured pi model
  │    └─ default: disabled; set polishModel to enable it
  └─ pastes the final transcript into pi's editor
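
The WAV parsing step in the diagram above can be sketched as follows. This is a minimal illustration, not the extension's actual code: `parseWav`, `readTag`, and `WavPcm` are hypothetical names, and the sketch assumes an uncompressed PCM file whose `fmt ` chunk precedes its `data` chunk.

```typescript
// Minimal RIFF/WAVE reader: walks the chunk list, reads "fmt ", returns the "data" payload.
// All names here are illustrative assumptions, not identifiers from the package.
interface WavPcm {
  sampleRate: number;
  channels: number;
  bitsPerSample: number;
  pcm: Uint8Array; // raw little-endian PCM samples
}

function readTag(view: DataView, offset: number): string {
  return String.fromCharCode(
    view.getUint8(offset),
    view.getUint8(offset + 1),
    view.getUint8(offset + 2),
    view.getUint8(offset + 3),
  );
}

function parseWav(bytes: Uint8Array): WavPcm {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (bytes.byteLength < 12 || readTag(view, 0) !== "RIFF" || readTag(view, 8) !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }
  let fmt: { sampleRate: number; channels: number; bitsPerSample: number } | undefined;
  let offset = 12;
  while (offset + 8 <= bytes.byteLength) {
    const id = readTag(view, offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === "fmt ") {
      if (body + 16 > bytes.byteLength) throw new Error("truncated fmt chunk");
      if (view.getUint16(body, true) !== 1) {
        throw new Error("only uncompressed PCM is handled in this sketch");
      }
      fmt = {
        channels: view.getUint16(body + 2, true),
        sampleRate: view.getUint32(body + 4, true),
        bitsPerSample: view.getUint16(body + 14, true),
      };
    } else if (id === "data") {
      if (!fmt) throw new Error("data chunk before fmt chunk");
      // A recorder that was stopped mid-stream may leave a stale size; clamp to the bytes present.
      const end = Math.min(body + size, bytes.byteLength);
      return { ...fmt, pcm: bytes.subarray(body, end) };
    }
    offset = body + size + (size % 2); // chunks are padded to an even length
  }
  throw new Error("no data chunk found");
}
```

A caller would typically check that the result reports 16000 Hz, 1 channel, and 16 bits per sample before sending `pcm` to a provider.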

Runtime package dependency:

  • ws

System dependency, one of:

  • Linux: pw-record from PipeWire tools, preferred
  • Linux: arecord from alsa-utils, fallback
  • macOS: afrecord when present, or ffmpeg from Homebrew (brew install ffmpeg) as the AVFoundation fallback

On macOS, grant Terminal, ffmpeg, or your pi host app microphone permission when prompted. If macOS has previously denied microphone access, enable it in System Settings → Privacy & Security → Microphone.
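
The recorder preference order described above could be expressed roughly like this. It is a sketch under stated assumptions: `pickRecorder` and the `isAvailable` callback are hypothetical, and the platform strings follow Node's `process.platform` values (`linux`, `darwin`). How the real extension detects installed tools is not documented here.

```typescript
type Recorder = "pw-record" | "arecord" | "afrecord" | "ffmpeg";

// Preference order per platform, mirroring the dependency list above.
const RECORDER_PREFERENCE: Record<string, Recorder[]> = {
  linux: ["pw-record", "arecord"],
  darwin: ["afrecord", "ffmpeg"],
};

// `isAvailable` is a hypothetical callback, for example a PATH lookup.
function pickRecorder(platform: string, isAvailable: (cmd: string) => boolean): Recorder {
  const candidates = RECORDER_PREFERENCE[platform];
  if (!candidates) throw new Error(`unsupported platform: ${platform}`);
  for (const cmd of candidates) {
    if (isAvailable(cmd)) return cmd;
  }
  throw new Error(`no recorder found; install one of: ${candidates.join(", ")}`);
}
```

Passing the availability check in as a callback keeps the selection logic testable without touching the real system.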

Install / Update

Install the published package with pi:

pi install npm:pi-voice-input

Update to the latest published version:

pi update npm:pi-voice-input

If pi is already running, restart pi after installing or updating. /reload may not replace code that was already loaded by the current pi process.

Providers

The extension is structured around a provider boundary: recording, editor insertion, and command handling are generic; ASR transport/protocol logic is provider-specific.

Currently implemented provider:

  • VolcEngine WebSocket ASR (bigmodel_nostream)

Planned provider direction:

  • add more ASR providers without changing the shortcut/user workflow
  • keep provider credentials and options isolated in config
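
One way to picture the provider boundary is a small interface that the generic recording and editor code calls without knowing the vendor protocol. Everything below is a hypothetical sketch: `AsrProvider`, `registerProvider`, `getProvider`, and the stand-in `fakeProvider` are illustrative names, not the package's actual API.

```typescript
// Hypothetical provider contract: raw PCM in, transcript out.
interface AsrProvider {
  readonly name: string;
  transcribe(pcm: Uint8Array, sampleRate: number): Promise<string>;
}

const providers = new Map<string, AsrProvider>();

function registerProvider(provider: AsrProvider): void {
  if (providers.has(provider.name)) throw new Error(`duplicate provider: ${provider.name}`);
  providers.set(provider.name, provider);
}

function getProvider(name: string): AsrProvider {
  const provider = providers.get(name);
  if (!provider) throw new Error(`unknown provider: ${name}`);
  return provider;
}

// Stand-in provider for illustration only; a real one would speak the vendor's WebSocket protocol.
const fakeProvider: AsrProvider = {
  name: "fake",
  async transcribe(pcm, sampleRate) {
    const seconds = pcm.byteLength / 2 / sampleRate; // assumes 16-bit mono
    return `[${seconds.toFixed(1)}s of audio]`;
  },
};

registerProvider(fakeProvider);
```

With this shape, adding a provider means registering one more object; the shortcut and slash-command code would only ever call `getProvider(name).transcribe(...)`.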

Configure

All plugin settings live in one JSON file:

~/.pi/agent/voice-input.config.json

Package-local and project-local env files are not read.

Create or normalize the file from inside pi:

/voice init

Then set the VolcEngine Speech API key:

/voice key

The key URL is also shown inside pi when the key is missing, when you run /voice key, and in /voice help:

https://console.volcengine.com/speech/new/setting/apikeys?projectName=default

The config file is plain JSON and can be edited directly:

{
  "volcApiKey": "",
  "polishModel": ""
}

polishModel is disabled by default. Set it to any model shown by pi --list-models to enable transcript polish. If polishing fails, the raw ASR transcript is inserted instead.
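
For illustration, normalizing this file and deriving the non-secret view could look like the sketch below. `normalizeConfig` and `describeConfig` are hypothetical helpers; falling back to defaults on malformed JSON is an assumption of this sketch, not documented behavior of `/voice init`.

```typescript
interface VoiceConfig {
  volcApiKey: string;
  polishModel: string;
}

// Parses the config text and coerces it to the two documented keys.
// Assumption: unreadable or malformed input falls back to empty defaults.
function normalizeConfig(jsonText: string): VoiceConfig {
  let raw: unknown = {};
  try {
    raw = JSON.parse(jsonText);
  } catch {
    raw = {};
  }
  const obj = typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const asString = (value: unknown): string => (typeof value === "string" ? value.trim() : "");
  return { volcApiKey: asString(obj.volcApiKey), polishModel: asString(obj.polishModel) };
}

// Non-secret summary: reports whether a key exists without echoing it.
function describeConfig(cfg: VoiceConfig): {
  apiKeyDetected: boolean;
  polishEnabled: boolean;
  polishModel: string;
} {
  return {
    apiKeyDetected: cfg.volcApiKey.length > 0,
    polishEnabled: cfg.polishModel.length > 0,
    polishModel: cfg.polishModel,
  };
}
```

An empty `polishModel` maps to `polishEnabled: false`, matching the disabled-by-default behavior described above.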

Verify the effective non-secret config:

/voice config

Usage

Shortcut:

Ctrl+Shift+R

Slash commands:

/voice start    # start recording
/voice stop     # stop, transcribe, insert text
/voice toggle   # start if idle, stop if recording
/voice cancel   # stop recording and discard local audio without transcribing
/voice status   # show recorder state
/voice config   # show effective non-secret config and whether API key is detected
/voice init     # create or normalize ~/.pi/agent/voice-input.config.json
/voice key      # prompt for and save the current provider API key
/voice help     # show setup help, including the explicit VolcEngine API key URL
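
The recording commands above behave like a two-state machine. The sketch below is a simplified model, not the extension's implementation: `step`, the state names, and the action names are hypothetical, and treating `stop` or `cancel` while idle as a no-op is an assumption.

```typescript
type RecorderState = "idle" | "recording";
type VoiceCommand = "start" | "stop" | "toggle" | "cancel";
type VoiceAction = "begin-recording" | "transcribe" | "discard" | "none";

// Returns the next state and the side effect the caller should perform.
function step(
  state: RecorderState,
  cmd: VoiceCommand,
): { next: RecorderState; action: VoiceAction } {
  // toggle resolves to start when idle and stop when recording
  const effective = cmd === "toggle" ? (state === "idle" ? "start" : "stop") : cmd;
  if (state === "idle") {
    return effective === "start"
      ? { next: "recording", action: "begin-recording" }
      : { next: "idle", action: "none" }; // assumption: nothing to stop or cancel
  }
  switch (effective) {
    case "stop":
      return { next: "idle", action: "transcribe" };
    case "cancel":
      return { next: "idle", action: "discard" };
    default:
      return { next: "recording", action: "none" }; // start while already recording
  }
}
```

Under this model the Ctrl+Shift+R shortcut is simply `step(state, "toggle")`.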

Notes

  • The extension uses post-recording WebSocket ASR: it records locally to a per-run temporary WAV, sends the stopped recording in chunks, then deletes the temporary audio. It is optimized for fast voice input, not live subtitles.
  • The default ASR segment size is intentionally larger than realtime packet sizes because this workflow sends already-recorded audio.
  • The transcript is inserted into the editor only; it is not submitted automatically.
  • Recorder stdout/stderr is not logged to disk, to avoid retaining potentially sensitive runtime data.
  • On startup, legacy ~/.pi/agent/voice-input/recordings and ~/.pi/agent/voice-input/logs artifacts are cleaned up when they are not part of an active recording.
  • When polishModel is set, polishing uses the unsent editor draft and recent session messages as context, but outputs only the refined voice text to insert at the current cursor. The polish step must not reconstruct the full draft; the final text is pasted without replacing existing editor content.
  • While recording, the status line shows ● Mic on: [device name] — press Ctrl+Shift+R again to stop/transcribe in the current theme accent color; no separate popup is shown when recording starts.
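
Sending an already-stopped recording in chunks, as the first two notes describe, amounts to slicing the PCM buffer into fixed-duration segments. The sketch below assumes the 16 kHz mono 16-bit format from the Architecture section; `segmentBytes`, `chunkPcm`, and the 200 ms figure in the usage note are illustrative assumptions, not the extension's actual default.

```typescript
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2; // 16-bit mono

interface PcmSegment {
  data: Uint8Array;
  isLast: boolean; // lets the sender mark the final frame
}

// Byte size of one segment, aligned to whole samples.
function segmentBytes(segmentMs: number): number {
  return Math.floor((SAMPLE_RATE_HZ * segmentMs) / 1000) * BYTES_PER_SAMPLE;
}

// Splits recorded PCM into segments; an empty buffer yields no segments.
function chunkPcm(pcm: Uint8Array, segmentMs: number): PcmSegment[] {
  const size = segmentBytes(segmentMs);
  if (size <= 0) throw new Error("segmentMs is too small");
  const segments: PcmSegment[] = [];
  for (let offset = 0; offset < pcm.byteLength; offset += size) {
    const end = Math.min(offset + size, pcm.byteLength);
    segments.push({ data: pcm.subarray(offset, end), isLast: end === pcm.byteLength });
  }
  return segments;
}
```

For example, half a second of audio (16000 bytes) split with a hypothetical 200 ms segment gives segments of 6400, 6400, and 3200 bytes. Because the audio is already on disk, a larger segment mainly reduces the number of WebSocket frames and does not add latency the way it would in live streaming.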

Development

See CONTRIBUTING.md for contribution guidelines, validation commands, and pull request expectations.

Clone the repo and install dependencies:

git clone [email protected]:tr-nc/pi-voice-input.git
cd pi-voice-input
npm install

Run directly from the package checkout:

pi -e .

Or install the local checkout while developing:

pi install .

After changing the extension while pi is open, run:

/reload

Roadmap

See ROADMAP.md for planned user-visible work.

Links

  • API key settings: https://console.volcengine.com/speech/new/setting/apikeys?projectName=default
  • ASR product page: https://www.volcengine.com/product/asr
  • WebSocket ASR docs: https://www.volcengine.com/docs/6561/1354869?lang=zh