autoglm-asr-mcp

v0.2.2

MCP server for AutoGLM ASR speech-to-text transcription with long-audio chunking and context passing

AutoGLM ASR MCP Server

MCP server for high-quality speech-to-text transcription using Zhipu AutoGLM ASR.

An agent-oriented speech-to-text MCP service that supports long-audio chunking, context passing, and timestamped segments.

For AI-oriented setup details, see AI_SETUP_GUIDE.md.

For AI Agents (TL;DR)

  • Type: MCP Server
  • Domain: ASR / speech-to-text / transcription
  • Input: local audio file path
  • Output: full transcript text + timestamp segments
  • Best for: meeting notes, call analysis, subtitle drafts, voice memo transcription
  • Supported audio formats: mp3, wav, m4a, flac, ogg, webm
  • Core tools: transcribe_audio, get_audio_info

What It Does

  • Transcribes short and long audio files with automatic chunking.
  • Uses context-aware modes to balance speed and quality.
  • Returns readable full text and segment-level timestamps.
  • Runs over stdio as an MCP server for coding assistants.

Tool Index

| Tool | Purpose | Required Args | Optional Args | Returns |
|------|---------|---------------|---------------|---------|
| transcribe_audio | Transcribe audio to text | audio_path | context_mode, max_concurrency | Full transcript and time-aligned segments |
| get_audio_info | Inspect audio before transcription | audio_path | None | Duration, format, channels, sample rate, estimated chunks |

Features

  • Fast long-audio transcription with sliding-window concurrency.
  • Better accuracy through chunk-to-chunk context passing.
  • Automatic splitting for long inputs (API limit friendly).
  • Zero-install runtime with npx.
  • Works with common MCP clients.

Installation

Prerequisites

ffmpeg must be installed:

# macOS
brew install ffmpeg

# Ubuntu/Debian
apt install ffmpeg

# Windows
choco install ffmpeg

Get your API key from Zhipu AI Open Platform.

NPX (Recommended)

npx autoglm-asr-mcp

Quick Start

Add this MCP server to your client config and set AUTOGLM_ASR_API_KEY.

{
  "mcpServers": {
    "autoglm-asr": {
      "command": "npx",
      "args": ["-y", "autoglm-asr-mcp"],
      "env": {
        "AUTOGLM_ASR_API_KEY": "your-api-key"
      }
    }
  }
}

Compatibility

  • Claude Desktop / Claude Code
  • Cursor
  • Windsurf
  • VS Code MCP
  • Other MCP-compatible clients

Tools

transcribe_audio

Transcribe an audio file into text with timing segments.

Arguments:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| audio_path | string | Yes | Absolute path to the audio file |
| context_mode | string | No | sliding (default), none (fastest), full_serial (best quality, slower) |
| max_concurrency | integer | No | Max parallel requests, range 1-20, default 5 |

Returns:

  • Full transcription text
  • Timestamped segment list
  • Basic run stats (chunks, mode, elapsed time)

Common errors:

  • File not found or unreadable path
  • Unsupported format or broken audio stream
  • Missing/invalid API key
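
An MCP client normally builds the tool call for you, but it can help to see what goes over the wire. The sketch below assembles a JSON-RPC 2.0 `tools/call` message for transcribe_audio using the arguments documented above. The helper name `build_transcribe_request`, the example path, and the client-side validation are assumptions made for illustration; how the server itself validates arguments is not described in this README.

```python
import json

# Context modes listed in the arguments table above.
CONTEXT_MODES = {"sliding", "none", "full_serial"}


def build_transcribe_request(audio_path, context_mode="sliding",
                             max_concurrency=5, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` message for transcribe_audio.

    Hypothetical helper: the validation mirrors the documented ranges,
    it is not copied from the server.
    """
    if context_mode not in CONTEXT_MODES:
        raise ValueError(f"unknown context_mode: {context_mode!r}")
    if not 1 <= max_concurrency <= 20:
        raise ValueError("max_concurrency must be in the range 1-20")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "transcribe_audio",
            "arguments": {
                "audio_path": audio_path,
                "context_mode": context_mode,
                "max_concurrency": max_concurrency,
            },
        },
    }


if __name__ == "__main__":
    # Example path only; replace with an absolute path on your machine.
    message = build_transcribe_request("/home/me/recordings/meeting.mp3")
    print(json.dumps(message, indent=2))
```

Checking the mode and concurrency range before sending keeps obvious mistakes out of the request, though the server's response remains the authority on what is accepted.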

get_audio_info

Inspect an audio file before transcription.

Arguments:

| Name | Type | Required | Description |
|------|------|----------|-------------|
| audio_path | string | Yes | Absolute path to the audio file |

Returns:

  • Duration
  • Format
  • Sample rate
  • Channels
  • Estimated chunks
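
As a rough guide to the "Estimated chunks" value, the sketch below divides the duration by the maximum chunk length and rounds up. It assumes fixed-size, non-overlapping chunks and uses the documented default of 25 seconds; the server may split differently (for example at silences), so treat the result as an approximation. The function name `estimate_chunks` is hypothetical.

```python
import math


def estimate_chunks(duration_seconds, max_chunk_seconds=25.0):
    """Approximate chunk count, assuming fixed-size non-overlapping chunks."""
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be positive")
    if duration_seconds <= 0:
        return 0
    return math.ceil(duration_seconds / max_chunk_seconds)


if __name__ == "__main__":
    # A one-hour recording at the default chunk length.
    print(estimate_chunks(3600))
```

Under these assumptions a one-hour recording yields 144 chunks, which is worth knowing before choosing a concurrency level.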

Context Modes

| Mode | Speed | Quality | Description |
|------|-------|---------|-------------|
| sliding | Fast | High | First chunk initializes context, later chunks run in parallel with context |
| none | Fastest | Medium | Chunks run independently in parallel |
| full_serial | Slow | Best | All chunks transcribed sequentially with full context chain |
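
To make the trade-off between the three modes concrete, here is a minimal scheduling sketch based only on the descriptions in the table above. It is not the server's implementation: `fake_transcribe` stands in for a real ASR request, `run_mode` is a hypothetical name, and the choice to pass the first chunk's text as the context in sliding mode is an assumption. Truncating context to AUTOGLM_ASR_CONTEXT_MAX_CHARS is left out for brevity.

```python
import asyncio


async def fake_transcribe(chunk, context=""):
    """Stand-in for one ASR request (hypothetical, no network call)."""
    await asyncio.sleep(0)
    return f"text({chunk})"


async def run_mode(chunks, mode="sliding", max_concurrency=5,
                   transcribe=fake_transcribe):
    """Transcribe chunks in order, scheduling them according to `mode`."""
    if not chunks:
        return []
    limit = asyncio.Semaphore(max_concurrency)

    async def bounded(chunk, context):
        # Never run more than max_concurrency requests at once.
        async with limit:
            return await transcribe(chunk, context)

    if mode == "none":
        # Every chunk is independent, so all of them run in parallel.
        return list(await asyncio.gather(*(bounded(c, "") for c in chunks)))
    if mode == "sliding":
        # The first chunk runs alone; its text seeds the remaining chunks.
        first = await bounded(chunks[0], "")
        rest = await asyncio.gather(*(bounded(c, first) for c in chunks[1:]))
        return [first, *rest]
    if mode == "full_serial":
        # Each chunk waits for all earlier text, so nothing runs in parallel.
        results, context = [], ""
        for chunk in chunks:
            text = await bounded(chunk, context)
            results.append(text)
            context = f"{context} {text}".strip()
        return results
    raise ValueError(f"unknown context mode: {mode}")


if __name__ == "__main__":
    print(asyncio.run(run_mode(["c0", "c1", "c2"], mode="sliding")))
```

The sketch shows why sliding sits between the other two: it pays for one serial request up front, then regains parallelism, while full_serial gives up parallelism entirely in exchange for the longest context.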

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| AUTOGLM_ASR_API_KEY | required | Your Zhipu API key |
| AUTOGLM_ASR_API_BASE | https://open.bigmodel.cn/api/paas/v4/audio/transcriptions | API endpoint |
| AUTOGLM_ASR_MODEL | glm-asr-2512 | ASR model name |
| AUTOGLM_ASR_MAX_CHUNK_DURATION | 25 | Max chunk duration (seconds) |
| AUTOGLM_ASR_MAX_CONCURRENCY | 5 | Default concurrency |
| AUTOGLM_ASR_CONTEXT_MAX_CHARS | 2000 | Max context size passed between chunks |
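
If you wrap the server in your own launcher script, it may help to resolve these variables the same way the table describes. The sketch below reads each variable with its documented default and fails early when the API key is missing. The `AsrConfig` and `load_config` names are hypothetical, and parsing the numeric values as integers is an assumption; the server's own parsing is not documented here.

```python
import os
from dataclasses import dataclass

DEFAULT_API_BASE = "https://open.bigmodel.cn/api/paas/v4/audio/transcriptions"


@dataclass(frozen=True)
class AsrConfig:
    api_key: str
    api_base: str
    model: str
    max_chunk_duration: int
    max_concurrency: int
    context_max_chars: int


def load_config(env=None):
    """Resolve settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("AUTOGLM_ASR_API_KEY", "")
    if not api_key:
        raise RuntimeError("AUTOGLM_ASR_API_KEY is required")
    return AsrConfig(
        api_key=api_key,
        api_base=env.get("AUTOGLM_ASR_API_BASE", DEFAULT_API_BASE),
        model=env.get("AUTOGLM_ASR_MODEL", "glm-asr-2512"),
        # Assumption: numeric settings are whole numbers.
        max_chunk_duration=int(env.get("AUTOGLM_ASR_MAX_CHUNK_DURATION", "25")),
        max_concurrency=int(env.get("AUTOGLM_ASR_MAX_CONCURRENCY", "5")),
        context_max_chars=int(env.get("AUTOGLM_ASR_CONTEXT_MAX_CHARS", "2000")),
    )


if __name__ == "__main__":
    # Pass an explicit mapping so the example runs without a real key.
    print(load_config({"AUTOGLM_ASR_API_KEY": "your-api-key"}))
```

Accepting a plain mapping instead of reading `os.environ` directly makes the function easy to exercise without touching the real environment.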

Use Cases

  • Meeting recording to editable transcript
  • Customer support call transcription
  • Podcast/video subtitle draft generation
  • Voice memo indexing and search

Limitations

  • Requires local file path input (not remote URL input).
  • Audio quality strongly affects transcription quality.
  • Heavy background noise or overlapping speakers can reduce accuracy.

Troubleshooting

  • ffmpeg not found: install ffmpeg and retry.
  • File not found: pass the absolute path of an existing file.
  • API errors: verify AUTOGLM_ASR_API_KEY and account quota.
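
The checks above can be run before calling the server. The following is a sketch of such a preflight, assuming ffmpeg is looked up on PATH and that a file extension check is a reasonable proxy for the supported formats; the server presumably inspects the audio stream itself, so this only catches the obvious cases. `preflight` is a hypothetical helper, not part of the package.

```python
import os
import shutil

# Formats listed in the TL;DR section of this README.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm"}


def preflight(audio_path, env=None):
    """Return a list of human-readable problems (empty means no issue found)."""
    env = os.environ if env is None else env
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not os.path.isabs(audio_path):
        problems.append("audio_path is not absolute")
    elif not os.path.isfile(audio_path):
        problems.append("audio file does not exist")
    extension = os.path.splitext(audio_path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if not env.get("AUTOGLM_ASR_API_KEY"):
        problems.append("AUTOGLM_ASR_API_KEY is not set")
    return problems


if __name__ == "__main__":
    # Example path only; replace with your own file.
    for problem in preflight("/home/me/recordings/meeting.mp3"):
        print("-", problem)
```

A passing preflight does not guarantee a successful transcription, since an invalid key or exhausted quota only shows up once the API is called.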

Keywords

mcp, model-context-protocol, asr, speech-to-text, transcription, autoglm, zhipu, chinese-asr, audio-transcription, meeting-transcript, subtitle-generation, voice-to-text, agent-tools, llm-tools, coding-agent

License

MIT