
mcp-listen

v0.1.2

Published

Give your AI agents the ability to listen. Microphone capture and speech-to-text tools for MCP-compatible agents.


Tools

| Tool | Description |
| ------ | ------------- |
| list_audio_devices | List available microphone input devices |
| capture_audio | Record audio from the microphone and save as WAV |
| voice_query | Capture, transcribe (whisper.cpp), and query a local LLM (Ollama) |

Quick Start

Claude Code

claude mcp add mcp-listen npx mcp-listen

Claude Desktop / ChatGPT Desktop / Cursor / Windsurf / VS Code

Add to your MCP configuration:

{
  "mcpServers": {
    "mcp-listen": {
      "command": "npx",
      "args": ["-y", "mcp-listen"]
    }
  }
}

Compatible with Claude Desktop, ChatGPT Desktop, Cursor, GitHub Copilot, Windsurf, VS Code, Gemini, Zed, and any MCP-compatible client.

Global Install

npm install -g mcp-listen

Requirements

For list_audio_devices and capture_audio:

  • Node.js 18+
  • A microphone

For voice_query (optional):

  • A Whisper GGML model file (see Whisper Model Setup)
  • Ollama running locally (see Ollama Setup)

Tool Reference

list_audio_devices

Returns a JSON array of available audio input devices.

Parameters: None

Example response:

[
  { "index": 3, "name": "Microphone (Creative Live! Cam)", "isDefault": true, "maxInputChannels": 2, "defaultSampleRate": 48000 },
  { "index": 4, "name": "Microphone Array (Intel)", "isDefault": false, "maxInputChannels": 2, "defaultSampleRate": 48000 }
]
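A client will typically want the system default device from this list, or the first entry if none is flagged. A minimal sketch (the `pick_device` helper is hypothetical, not part of mcp-listen):

```python
import json

def pick_device(devices_json: str) -> int:
    """Pick the default input device from a list_audio_devices response,
    falling back to the first entry if none is marked as default."""
    devices = json.loads(devices_json)
    for dev in devices:
        if dev.get("isDefault"):
            return dev["index"]
    return devices[0]["index"]

# Using the example response above:
example = '''[
  {"index": 3, "name": "Microphone (Creative Live! Cam)", "isDefault": true,
   "maxInputChannels": 2, "defaultSampleRate": 48000},
  {"index": 4, "name": "Microphone Array (Intel)", "isDefault": false,
   "maxInputChannels": 2, "defaultSampleRate": 48000}
]'''
print(pick_device(example))  # -> 3
```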

capture_audio

Records audio from the microphone and saves it as a WAV file.

Parameters:

| Parameter | Type | Default | Description |
| ---------- | ------ | --------- | ------------- |
| duration_ms | number | 5000 | Recording duration in milliseconds (100-30000) |
| device | number | system default | Device index from list_audio_devices |

Example response:

{
  "path": "/tmp/mcp-listen-1712345678901.wav",
  "duration_ms": 5000,
  "sample_rate": 16000,
  "channels": 1,
  "size_bytes": 160044
}
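The numbers in this response are self-consistent: 16-bit mono PCM at 16 kHz is 2 bytes per sample, so 5000 ms yields 80,000 samples (160,000 bytes) plus the canonical 44-byte WAV header. A quick check (the `expected_wav_size` helper is illustrative, not part of the package):

```python
def expected_wav_size(duration_ms: int, sample_rate: int, channels: int,
                      bits_per_sample: int = 16, header_bytes: int = 44) -> int:
    # PCM payload plus the canonical 44-byte RIFF/WAVE header
    samples = sample_rate * duration_ms // 1000
    return header_bytes + samples * channels * (bits_per_sample // 8)

print(expected_wav_size(5000, 16000, 1))  # -> 160044, matching size_bytes above
```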

voice_query

Full voice pipeline: capture audio, transcribe with whisper.cpp, send to Ollama, return the response. Entirely offline.

Parameters:

| Parameter | Type | Default | Description |
| ----------- | ------ | --------- | ------------- |
| duration_ms | number | 5000 | Recording duration in milliseconds (100-30000) |
| device | number | system default | Device index from list_audio_devices |
| whisper_model | string | ggml-base.en.bin | Path or filename of Whisper GGML model |
| language | string | en | Language code for transcription |
| model | string | llama3.2 | Ollama model name |
| prompt | string | You are a helpful assistant. | System prompt for the LLM |

Example response:

{
  "transcription": "What is the default port for PostgreSQL?",
  "response": "PostgreSQL runs on port 5432 by default.",
  "model": "llama3.2"
}

How It Works

mcp-listen uses decibri for cross-platform microphone capture. No ffmpeg, no SoX, no system audio tools required. Pre-built native binaries with zero setup.

Audio is captured as 16-bit PCM at 16kHz mono, the standard format for speech-to-text engines.
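You can reproduce that exact format with Python's standard `wave` module to see what the capture files look like on disk (this harness is illustrative, not how mcp-listen itself writes files):

```python
import os
import tempfile
import wave

# Write one second of 16-bit PCM silence at 16 kHz mono -- the capture format
# described above -- then read the header back to confirm the parameters.
path = os.path.join(tempfile.gettempdir(), "mcp-listen-format-check.wav")
with wave.open(path, "wb") as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 16-bit PCM = 2 bytes per sample
    w.setframerate(16000)  # 16 kHz
    w.writeframes(b"\x00\x00" * 16000)  # one second of silence

with wave.open(path, "rb") as r:
    params = (r.getnchannels(), r.getsampwidth() * 8, r.getframerate(), r.getnframes())
print(params)  # -> (1, 16, 16000, 16000)
os.remove(path)
```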

The voice_query tool replicates the pipeline from voxagent: capture audio, transcribe locally with whisper.cpp, and send to a local Ollama LLM. Fully offline, nothing leaves your machine.

Whisper Model Setup

The voice_query tool requires a Whisper GGML model file. Download one:

Linux / macOS:

mkdir -p ~/.mcp-listen/models
curl -L -o ~/.mcp-listen/models/ggml-base.en.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

Windows (PowerShell):

mkdir "$env:USERPROFILE\.mcp-listen\models" -Force
Invoke-WebRequest -Uri "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" -OutFile "$env:USERPROFILE\.mcp-listen\models\ggml-base.en.bin"

The model is ~150MB and downloads once. You can also set the WHISPER_MODEL_PATH environment variable to a custom directory.
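Putting the two locations together, model lookup presumably checks the `WHISPER_MODEL_PATH` directory first and falls back to `~/.mcp-listen/models`. A sketch of that resolution order (the `resolve_model` helper and the exact precedence are assumptions, not mcp-listen's actual code):

```python
import os
from pathlib import Path

DEFAULT_MODEL = "ggml-base.en.bin"

def resolve_model(name: str = DEFAULT_MODEL) -> Path:
    """Find a Whisper GGML model file.

    Assumed lookup order: the WHISPER_MODEL_PATH directory (if set),
    then the default ~/.mcp-listen/models directory.
    """
    candidates = []
    custom = os.environ.get("WHISPER_MODEL_PATH")
    if custom:
        candidates.append(Path(custom) / name)
    candidates.append(Path.home() / ".mcp-listen" / "models" / name)
    for p in candidates:
        if p.is_file():
            return p
    raise FileNotFoundError(f"Whisper model {name!r} not found; see Whisper Model Setup")
```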

Ollama Setup

  1. Install Ollama from https://ollama.com
  2. Pull a model: ollama pull llama3.2
  3. Ensure Ollama is running: ollama serve
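For reference, one plausible shape of the request that voice_query could send, using Ollama's `/api/chat` endpoint with the tool's default `model` and `prompt` parameters (how mcp-listen actually calls Ollama may differ; `build_ollama_request` is a hypothetical helper):

```python
import json

def build_ollama_request(transcription: str,
                         model: str = "llama3.2",
                         prompt: str = "You are a helpful assistant.") -> dict:
    # Request body for POST http://localhost:11434/api/chat.
    # The system prompt carries voice_query's `prompt` parameter;
    # the transcribed speech becomes the user message.
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": transcription},
        ],
    }

req = build_ollama_request("What is the default port for PostgreSQL?")
print(json.dumps(req, indent=2))
```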

Known Limitations

  1. Fixed recording duration. You specify how long to record. There is no "stop when I stop talking" mode yet.
  2. voice_query requires Ollama running. If Ollama isn't running, the tool returns a clear error message.
  3. Whisper model is not downloaded automatically. The first voice_query call requires a pre-downloaded model (~150MB); see Whisper Model Setup.
  4. No streaming. MCP's request/response pattern means the entire recording is captured, then transcribed, then sent to the LLM. No real-time partial results.
  5. Temp files. capture_audio writes WAV files to the system temp directory. They are not automatically cleaned up. voice_query cleans up after itself.
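Since capture_audio leaves its WAV files behind, a periodic sweep of the temp directory keeps them from piling up. A sketch based on the `mcp-listen-<timestamp>.wav` naming shown in the capture_audio example (the `clean_stale_captures` helper is not part of the package):

```python
import glob
import os
import tempfile
import time

def clean_stale_captures(max_age_s: float = 3600) -> int:
    """Delete mcp-listen-*.wav files older than max_age_s from the temp dir.

    Returns the number of files removed. capture_audio does not clean up
    after itself, so something like this can run periodically.
    """
    removed = 0
    now = time.time()
    for path in glob.glob(os.path.join(tempfile.gettempdir(), "mcp-listen-*.wav")):
        if now - os.path.getmtime(path) > max_age_s:
            os.remove(path)
            removed += 1
    return removed
```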

Troubleshooting

Windows: "Error opening microphone" Windows may block microphone access by default. Go to Settings > Privacy & security > Microphone and ensure microphone access is enabled for desktop apps.

Ollama: "Ollama is not running" Some Ollama installations start as a background service automatically. If you see this error, run ollama serve manually or check that the Ollama service is running.

Whisper: "model not found" The whisper model file must be downloaded before first use. See Whisper Model Setup for instructions.

Powered By

  • decibri: Cross-platform microphone capture for Node.js
  • voxagent: Voice-powered terminal agent (inspiration for the voice_query pipeline)

License

Apache-2.0. See LICENSE for details.

Copyright 2026 Analytics in Motion