recmp3-cli

v1.0.0

Published

17 days ago

Record audio, transcribe with AI, output developer-ready prompts — for humans and AI agents, from a single terminal command.

aedneth

cli audio recording transcription whisper groq developer-tools vibecoding mcp ai-agent agent-native

recmp3-cli

Record audio from any terminal, transcribe with Groq Whisper, get developer-ready output. A first-class tool for both humans and terminal AI agents — every interactive flow has a fully non-interactive, JSON-emitting equivalent, plus a built-in MCP server.

recmp3 record --name "my standup"
recmp3 transcribe standup.wav --template claude-code | pbcopy

What it does

Records audio with pause/resume using an Ink TUI (runs in your current terminal — no popup windows)
Transcribes via Groq whisper-large-v3-turbo (or OpenAI Whisper)
Formats output with 7 developer templates: claude-code, prd, bug, meeting-notes, todo, commit-message, raw
Cross-platform: Linux (PulseAudio/PipeWire), macOS (AVFoundation), Windows (DirectShow)
Agent-native: global --json envelopes, --yes, deterministic exit codes, stdin/stdout piping, a discoverable manifest, and an MCP server — see Agent & scripting use
Local option: --provider local-whisper transcribes on-device via whisper.cpp (no upload)

Requirements

Node.js ≥ 20
ffmpeg ≥ 4.4 (sudo apt install ffmpeg / brew install ffmpeg)
A Groq API key (free tier works) — get one at console.groq.com

Installation

npm install -g recmp3-cli

Or build from source:

git clone https://github.com/aedneth/recmp3-cli
cd recmp3-cli
npm install && npm run build && npm link

Then set your API key:

echo 'export GROQ_API_KEY=gsk_...' >> ~/.bashrc
source ~/.bashrc

Verify the install:

recmp3 doctor

Commands

`recmp3 record`

recmp3 record [options]

Options:
  -n, --name <name>        Output filename stem (e.g. "my-idea")
  -o, --out <dir>          Output directory
  -t, --transcribe         Transcribe immediately after recording
  --mp3                    Save as MP3 instead of WAV
  --provider <name>        Override provider (groq, openai, local-whisper)
  --lang <code>            Force language code (e.g. es, en)
  --source <id>            Audio source id, or "auto" for the best physical mic
  --duration <seconds>     Headless: record N seconds then stop (no TUI)
  --no-tui                 Force headless capture (record until Ctrl+C)
  --no-copy / --no-print   Don't copy / don't print the transcript
  -y, --yes                Skip upload consent prompt

Tip: recmp3 record --source auto selects your real microphone automatically, skipping the generic default device and system-audio .monitor sources. Run recmp3 sources to list every device and see which one is marked (recommended).

Controls while recording:

p or Space — pause / resume
s or Enter — save and finish
c or Escape — cancel (discard recording)
Ctrl+C — cancel

`recmp3 transcribe <file>`

Transcribe an existing audio file. Outputs transcript text to stdout (pipeable).

recmp3 transcribe meeting.wav --template prd > meeting-prd.md
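
To process several recordings, the same command can be looped. This is a minimal sketch: `notes_path` and `transcribe_all` are hypothetical helper names, not part of recmp3, and `--yes` is included on the assumption that the upload consent prompt would otherwise interrupt the loop.

```shell
# Derive the output path for a recording: meeting.wav -> meeting.md
# (assumes the file name has an extension).
notes_path() {
  printf '%s.md\n' "${1%.*}"
}

# transcribe_all TEMPLATE FILE...
# Runs `recmp3 transcribe` once per file and writes each result next to
# its source. Stops at the first failure and returns that exit code.
transcribe_all() {
  template=$1
  shift
  for f in "$@"; do
    recmp3 transcribe "$f" --template "$template" --yes > "$(notes_path "$f")" || return $?
  done
}
```

For example, `transcribe_all meeting-notes ./*.wav` would write one `.md` file per `.wav` file. If a run fails, the output file for that recording is left empty, so check the return code before trusting the results.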

`recmp3 prompt <file>`

Apply a developer template to a transcript or text file. No network call — purely deterministic formatting.

recmp3 prompt transcript.txt --template claude-code
recmp3 prompt transcript.txt --list-templates

Available templates: raw, claude-code, prd, bug, meeting-notes, todo, commit-message

`recmp3 sources`

List available audio input devices for the current platform.

recmp3 sources
recmp3 sources --json

`recmp3 doctor`

Run 8 system checks: Node version, platform support, ffmpeg version, audio backend, config file, API key, provider connectivity, and recordings directory.

`recmp3 config`

recmp3 config init          # Setup (interactive, or flag-driven: --provider/--lang/--outdir/--key)
recmp3 config show          # Display current config (API key redacted)
recmp3 config path          # Print config file path
recmp3 config set <k> <v>   # Set a config key
recmp3 config set-key groq --key gsk_...   # Store an API key in the OS keychain

Agent & scripting use

Every command is usable by AI agents (Claude Code, Codex, Gemini CLI, …) and shell scripts with no TTY and no prompts. See docs/AGENTS.md for the full reference.

# Stable JSON envelope on stdout; chatter on stderr
recmp3 transcribe meeting.wav --json --yes | jq -r .data.text

# Compose via pipes: transcribe → template
recmp3 transcribe meeting.wav --json --yes | jq -r .data.text | recmp3 prompt - --template prd

# Headless recording (no Ink TUI)
recmp3 record --duration 5 --json --yes

# Discover the command/tool surface
recmp3 manifest --json

Exit codes: 0 success · 1 unknown · 2 config · 3 audio/ffmpeg · 4 transcription · 5 network · 6 local-whisper · 7 input · 130 user abort.
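
A script can branch on these codes. The sketch below maps each documented code to its label and reports failures on stderr; `recmp3_exit_label` and `transcribe_or_report` are hypothetical helper names introduced for illustration only.

```shell
# Map a recmp3 exit code to the label listed above.
recmp3_exit_label() {
  case "$1" in
    0)   echo "success" ;;
    1)   echo "unknown" ;;
    2)   echo "config" ;;
    3)   echo "audio/ffmpeg" ;;
    4)   echo "transcription" ;;
    5)   echo "network" ;;
    6)   echo "local-whisper" ;;
    7)   echo "input" ;;
    130) echo "user abort" ;;
    *)   echo "unrecognized ($1)" ;;
  esac
}

# Transcribe one file; on failure, print the failure category to stderr
# and pass the original exit code through to the caller.
transcribe_or_report() {
  status=0
  recmp3 transcribe "$1" --json --yes || status=$?
  if [ "$status" -ne 0 ]; then
    echo "recmp3 failed: $(recmp3_exit_label "$status")" >&2
  fi
  return "$status"
}
```

A caller could then retry only on selected categories, for example when `transcribe_or_report clip.wav` returns 5 (network).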

MCP server

recmp3 ships a Model Context Protocol server over stdio. Register it with any MCP client:

{ "mcpServers": { "recmp3": { "command": "recmp3", "args": ["mcp"] } } }

Tools: recmp3_transcribe, recmp3_prompt, recmp3_sources, recmp3_doctor, recmp3_config_show, recmp3_record, recmp3_manifest.

Local, no-upload transcription

export RECMP3_WHISPER_BIN=/usr/local/bin/whisper-cli   # whisper.cpp binary
export RECMP3_WHISPER_MODEL=/models/ggml-base.en.bin
recmp3 transcribe clip.wav --provider local-whisper --json
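
Before a long local run, it can help to confirm that both paths resolve. The check below is a sketch with a hypothetical function name, `check_local_whisper`; it only tests the two environment variables shown above and makes no claim about what recmp3 or recmp3 doctor verify internally.

```shell
# Return 0 only if the whisper.cpp binary is executable and the model
# file is readable; otherwise explain the problem on stderr.
check_local_whisper() {
  bin=${RECMP3_WHISPER_BIN:-}
  model=${RECMP3_WHISPER_MODEL:-}
  if [ -z "$bin" ] || [ ! -x "$bin" ]; then
    echo "RECMP3_WHISPER_BIN is unset or not executable: '$bin'" >&2
    return 1
  fi
  if [ -z "$model" ] || [ ! -r "$model" ]; then
    echo "RECMP3_WHISPER_MODEL is unset or not readable: '$model'" >&2
    return 1
  fi
}
```

Typical use: `check_local_whisper && recmp3 transcribe clip.wav --provider local-whisper --json`.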

Use Cases

Transcribe Instagram reels, TikToks, or any system audio (`recwatch`)

Capture the audio your speakers are playing — useful for transcribing reels, TikToks, YouTube clips, podcasts, or any video in your browser without re-recording with the mic. Linux + PulseAudio/PipeWire only; the source name below is the monitor of your default output sink (find yours via recmp3 sources or pactl list short sources).

alias recwatch='RECMP3_SOURCE="alsa_output.platform-avs_hdaudio.0.stereo-fallback.monitor" recmp3 record --transcribe -y'

Add the alias above to ~/.bashrc and source ~/.bashrc.
Open the Instagram reel / TikTok / video in your browser.
Run recwatch — it starts recording your system audio output.
Watch the video; the audio plays through your speakers and is captured.
Press s to stop → auto-transcribes via Groq Whisper → transcript prints to terminal and is copied to your clipboard.
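
The source name in the alias is specific to one machine. As a sketch, the monitor name can be derived from the default sink instead of being hard-coded. This assumes the usual PulseAudio/PipeWire convention that a sink's monitor source is named `<sink>.monitor`, and that your `pactl` supports `get-default-sink` (PulseAudio 15 or later, or pipewire-pulse). Both function names are hypothetical.

```shell
# Build the monitor source name for a given sink name.
monitor_source_for() {
  printf '%s.monitor\n' "$1"
}

# Ask the sound server for the default output sink and print its monitor.
default_monitor_source() {
  sink=$(pactl get-default-sink) || return 1
  monitor_source_for "$sink"
}
```

With these defined, the recording could be started as `RECMP3_SOURCE="$(default_monitor_source)" recmp3 record --transcribe -y`. If the derived name does not appear in recmp3 sources, fall back to copying the exact name from that list.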

Configuration

Config file location:

Linux: ~/.config/recmp3/config.json
macOS: ~/Library/Preferences/recmp3/config.json
Windows: %APPDATA%\recmp3\config.json

Environment variables override config file values:

| Variable | Effect |
|---|---|
| GROQ_API_KEY | Groq API key |
| OPENAI_API_KEY | OpenAI API key |
| RECMP3_PROVIDER | groq or openai |
| RECMP3_MODEL | Override transcription model |
| RECMP3_SOURCE | Default audio source |
| RECMP3_FFMPEG_PATH | Path to ffmpeg binary |
| RECMP3_OUTDIR | Default recordings output directory |
| RECMP3_LANG | Default language hint (e.g. es, en) |
| RECMP3_WHISPER_BIN | Path to a whisper.cpp binary (for local-whisper) |
| RECMP3_WHISPER_MODEL | Path to a GGML model file (for local-whisper) |
| RECMP3_JSON | 1 to always emit JSON envelopes |
| RECMP3_YES | 1 to skip all prompts |
| RECMP3_QUIET | 1 to suppress stderr chatter |
| RECMP3_SKIP_CONSENT | 1 to skip upload consent prompt |
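
Because environment variables take precedence over the config file, a setting can be changed for one invocation by prefixing the command, leaving the config file untouched. A small sketch, where `transcribe_spanish_json` is a hypothetical wrapper name:

```shell
# Force a Spanish language hint, JSON output, and no prompts for this
# one call only; the stored config file is not modified.
transcribe_spanish_json() {
  RECMP3_LANG=es RECMP3_JSON=1 RECMP3_YES=1 recmp3 transcribe "$1"
}
```

The same prefix form works directly on the command line, for example `RECMP3_LANG=es recmp3 transcribe clip.wav`.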

Providers

| Provider | Default model | Max file size | Upload |
|---|---|---|---|
| Groq | whisper-large-v3-turbo | 25 MB | yes |
| OpenAI | whisper-1 | 25 MB | yes |
| local-whisper | whisper.cpp GGML model | unlimited | no (on-device) |

Audio is captured as 16 kHz mono WAV (~1 MB/min), so the 25 MB limit covers ~25 minutes per recording. Longer recordings are chunked automatically.
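
The arithmetic behind that estimate can be sketched as a ceiling division. This is only an illustration of the numbers above, not a description of how recmp3 actually splits audio; `estimate_chunks` is a hypothetical helper. The size per minute is a parameter because it depends on the sample format (for example, uncompressed 16-bit PCM at 16 kHz mono is about 1.9 MB/min).

```shell
# estimate_chunks DURATION_MIN [MB_PER_MIN] [LIMIT_MB]
# Prints how many uploads a recording would need if each upload may not
# exceed LIMIT_MB. Integer arithmetic only.
estimate_chunks() {
  duration_min=$1
  mb_per_min=${2:-1}   # approximate figure quoted above
  limit_mb=${3:-25}    # provider upload limit from the table
  size_mb=$((duration_min * mb_per_min))
  # Ceiling division: round up to a whole number of chunks.
  echo $(( (size_mb + limit_mb - 1) / limit_mb ))
}
```

With the defaults, `estimate_chunks 25` prints 1 and `estimate_chunks 60` prints 3.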

Roadmap

| Version | Theme | Status |
|---|---|---|
| v0.1.0 | Core TUI recorder + Groq/OpenAI transcription | ✅ shipped |
| v0.2.0 | Agent-native: --json, MCP server, local Whisper, keychain | ✅ shipped |
| v1.0.0 | Source auto-detection, graceful MCP shutdown, stable CLI + exit-code contract | ✅ shipped |
| v1.1.0 | Streaming transcription, real-time waveform display | planned |
| v1.2.0 | Multi-segment smart chunking, speaker diarization | planned |
| v2.0.0 | Plugin SDK for custom providers and templates | planned |

Development

npm run dev           # Run with tsx (no build step)
npm run build         # Build to dist/
npm run typecheck     # TypeScript check
npm run lint          # Biome lint
npm test              # Run test suite (vitest)
npm run test:watch    # Watch mode

See CONTRIBUTING.md for the full contribution guide.

License

recmp3-cli is dual-licensed:

AGPL-3.0 — free for personal use and open source projects
Commercial license — required for proprietary/commercial use

Contact [email protected] to purchase a commercial license.
