@krasnoperov/transcribe v1.0.5

CLI tool for audio/video transcription with speaker diarization, AI summarization, and infographic generation

Downloads: 573

Transcribe

AI transcription skill for Claude Code - Transform audio/video recordings into transcripts with speaker diarization, AI-powered summaries, and visual infographics.

This skill provides a complete pipeline for processing recordings:

  • Transcription - Convert audio/video to VTT format with speaker identification (OpenAI Whisper)
  • Summarization - Generate structured markdown summaries (OpenAI GPT-5.1)
  • Infographics - Create visual summaries from text (Google Gemini)
  • All-in-one - Process video → transcript → summary → infographic in one command

See skills/transcribe/SKILL.md for complete usage guide.

Use in Claude Code

This is a Claude Code skill. Install it from the marketplace:

/plugin marketplace add krasnoperov/claude-plugins
/plugin install transcribe@krasnoperov-plugins

Once installed, use the /transcribe skill in your conversations:

/transcribe transcribe meeting.mp4 to VTT with speaker diarization
/transcribe summarize this transcript into key points
/transcribe create an infographic from this summary

Command Line Usage

You can also use this package directly via npx:

export OPENAI_API_KEY="your-openai-key"
export GOOGLE_AI_STUDIO_KEY="your-google-key"

# Transcribe audio/video
npx -y @krasnoperov/transcribe@latest transcribe meeting.mp4 -o transcript.vtt

# Generate summary
npx -y @krasnoperov/transcribe@latest summarize transcript.vtt -o summary.md

# Create infographic
npx -y @krasnoperov/transcribe@latest infographic summary.md -o visual.png

# All-in-one pipeline
npx -y @krasnoperov/transcribe@latest process recording.mp4 --output-dir ./output
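
Because the commands above read OPENAI_API_KEY and GOOGLE_AI_STUDIO_KEY from the environment, a small guard can fail fast before a run starts. The following is a minimal sketch and not part of the package: require_keys is a hypothetical helper, and the mapping of operations to keys is an assumption inferred from the providers named in this README (OpenAI for transcription and summaries, Google Gemini for infographics).

```shell
# require_keys <operation>: succeed only when the keys assumed for that
# operation are set. Hypothetical helper; the key mapping is an assumption.
require_keys() {
  case "$1" in
    transcribe|summarize) needed="OPENAI_API_KEY" ;;
    infographic)          needed="GOOGLE_AI_STUDIO_KEY" ;;
    process)              needed="OPENAI_API_KEY GOOGLE_AI_STUDIO_KEY" ;;
    *) echo "unknown operation: $1" >&2; return 2 ;;
  esac
  for name in $needed; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      return 1
    fi
  done
}
```

For example, `require_keys process && npx -y @krasnoperov/transcribe@latest process recording.mp4 --output-dir ./output` would only start the pipeline when both variables are set.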

Get your API keys from OpenAI (OPENAI_API_KEY) and Google AI Studio (GOOGLE_AI_STUDIO_KEY).

Core Operations

transcribe <input>           Audio/Video → VTT transcript with speakers
summarize <input>            Text/VTT → Markdown summary
infographic <input>          Text → Visual infographic image
process <input>              All-in-one: video → transcript → summary → infographic

These operations can be used individually or chained together.
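
As an illustration of chaining, the sketch below runs the three individual operations in sequence and stops at the first failure. It uses only the commands and the -o flag documented in this README; run_step, chain, and the DRY_RUN switch are hypothetical helpers added for the example, and the output file names are arbitrary choices.

```shell
# Hypothetical helpers for illustration; only the npx commands come from this README.
CLI="npx -y @krasnoperov/transcribe@latest"

# run_step: print the command when DRY_RUN=1, otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# chain <recording> <output-dir>: each step feeds the next; stop on first failure.
chain() {
  input="$1"
  outdir="$2"
  run_step mkdir -p "$outdir" &&
    run_step $CLI transcribe "$input" -o "$outdir/transcript.vtt" &&
    run_step $CLI summarize "$outdir/transcript.vtt" -o "$outdir/summary.md" &&
    run_step $CLI infographic "$outdir/summary.md" -o "$outdir/visual.png"
}
```

Running `DRY_RUN=1 chain recording.mp4 ./output` prints the four commands without calling any API, which is a cheap way to check paths before a real run.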

Examples

See the skills/transcribe/examples/ directory:

  1. 01-basic-workflow.sh - Step-by-step transcription pipeline
  2. 02-all-in-one.sh - Single command processing

Transcription with Speaker Diarization

npx -y @krasnoperov/transcribe@latest transcribe podcast.mp3 \
  --language es \
  --model gpt-4o-transcribe-diarize \
  -o podcast.vtt

Output (VTT with speaker tags):

WEBVTT

00:00:00.000 --> 00:00:02.450
<v A>Welcome to the podcast...

00:00:02.850 --> 00:00:08.200
<v B>Thanks for having me...
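
The speaker tags in this output are easy to post-process with standard tools. As a rough sketch, the function below counts cues per speaker; count_speakers is a hypothetical helper, and it assumes each cue's text starts with a voice tag at the beginning of the line, as in the sample above.

```shell
# count_speakers <file.vtt>: print "<speaker>: <cue count>", sorted by speaker.
# Assumes every cue line begins with a <v NAME> voice tag, as in the sample output.
count_speakers() {
  awk '
    match($0, /^<v [^>]+>/) {
      # Drop the leading "<v " (3 chars) and the trailing ">" (1 char).
      name = substr($0, RSTART + 3, RLENGTH - 4)
      count[name]++
    }
    END { for (name in count) print name ": " count[name] }
  ' "$1" | sort
}
```

For the two-cue sample above, `count_speakers podcast.vtt` prints `A: 1` and `B: 1` on separate lines.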

Custom Summarization

npx -y @krasnoperov/transcribe@latest summarize transcript.vtt \
  --prompt "Focus on action items and decisions" \
  -o summary.md

Styled Infographic

npx -y @krasnoperov/transcribe@latest infographic summary.md \
  --style "modern minimal corporate" \
  -o infographic.png

Options

Transcribe

--model <model>          gpt-4o-transcribe-diarize (default), gpt-4o-transcribe, whisper-1
--language <lang>        Language code (en, es, ru, de, etc.)
-o, --output <file>      Output VTT file

Summarize

--prompt <text>          Custom summarization instructions
-o, --output <file>      Output markdown file

Infographic

--style <text>           Style instructions for visual
--reference <image>      Reference image for style
-o, --output <file>      Output image file

Process (All-in-one)

--output-dir <dir>       Output directory for all files
--language <lang>        Language for transcription
--model <model>          Transcription model
--style <text>           Style for infographic
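
The process options above lend themselves to batch runs. The sketch below loops over the .mp4 files in a directory and gives each recording its own output directory. It uses only the documented process, --output-dir, and --language flags; process_all and the RUNNER variable are hypothetical, and "en" is just an example default.

```shell
# process_all <recordings-dir> <output-root> [language]
# Hypothetical helper: set RUNNER=echo to print the commands instead of running them.
process_all() {
  src="$1"
  dest="$2"
  lang="${3:-en}"
  for file in "$src"/*.mp4; do
    [ -e "$file" ] || continue            # skip when the glob matched nothing
    name="$(basename "$file" .mp4)"       # meeting.mp4 -> meeting
    ${RUNNER:-} npx -y @krasnoperov/transcribe@latest process "$file" \
      --output-dir "$dest/$name" --language "$lang"
  done
}
```

Setting RUNNER=echo before calling `process_all ./recordings ./output` lists the commands that would run, one per recording.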

Requirements

  • Node.js >= 18.0.0
  • ffmpeg (for audio extraction)

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg
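
To confirm both requirements before a first run, a pre-flight check along these lines may help. This is a sketch, not something the package ships: node_version_ok and preflight are hypothetical names, and the check relies on node --version printing a string such as v20.11.1.

```shell
# node_version_ok <version>: succeed when the major version is at least 18.
node_version_ok() {
  major="${1#v}"        # strip the leading "v" from e.g. "v20.11.1"
  major="${major%%.*}"  # keep only the major component
  [ "$major" -ge 18 ]
}

# preflight: check that ffmpeg and a recent enough Node.js are available.
preflight() {
  command -v ffmpeg >/dev/null 2>&1 || { echo "ffmpeg not found on PATH" >&2; return 1; }
  command -v node >/dev/null 2>&1 || { echo "node not found on PATH" >&2; return 1; }
  node_version_ok "$(node --version)" || { echo "Node.js >= 18.0.0 required" >&2; return 1; }
}
```

Calling `preflight && echo "ready"` prints "ready" only when both tools are present and Node.js meets the minimum version.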

Development

npm run build      # Build TypeScript
npm run typecheck  # Type checking
npm run test       # Run tests
npm run dev        # Dev mode with type stripping

License

MIT License - Copyright (c) 2025 Aleksei Krasnoperov

See LICENSE file for details.