youtube2md

v1.0.1

Published

4 months ago

Convert a YouTube video to a Markdown summary file

sunghyo

youtube markdown summary transcript openai gpt

youtube2md

Convert any YouTube video into a structured Markdown summary — with chapter detection, clickable timestamp links, and key takeaways.

Two modes:

- Standalone: fetch the transcript, summarize it with OpenAI, and write a Markdown file.
- Extract-only: fetch the transcript only and output JSON for external processing. No API key is required when captions are available.

Install

npm install -g youtube2md

Or use without installing:

OPENAI_API_KEY=sk-... npx youtube2md --url https://youtu.be/VIDEO_ID

Usage

# Basic usage
youtube2md --url https://www.youtube.com/watch?v=VIDEO_ID

# Custom output path
youtube2md --url https://youtu.be/VIDEO_ID --out ./notes/video.md

# Custom output directory (files saved as <video_id>.md)
youtube2md --url https://youtu.be/VIDEO_ID --out-dir ./output

# Set summary language
youtube2md --url https://youtu.be/VIDEO_ID --lang Korean

# Use a specific model
youtube2md --url https://youtu.be/VIDEO_ID --model gpt-5-mini

# Extract transcript only (no API key required if captions are available)
youtube2md --url https://youtu.be/VIDEO_ID --extract-only

# Machine-readable JSON output from full pipeline
youtube2md --url https://youtu.be/VIDEO_ID --json --stdout

Output is saved to ./summaries/<video_id>.md (full pipeline) or ./summaries/<video_id>.json (extract-only) by default.
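
The default-path rule above can be sketched as a small helper. This is an illustration only, not code from the package: `resolveOutputPath` and `OutputOptions` are hypothetical names, and two behaviours are assumptions the README does not state: that `--out` wins over `--out-dir`, and that `--out-dir` combined with `--extract-only` yields a `.json` file.

```typescript
// Hypothetical helper; not taken from the youtube2md source.
interface OutputOptions {
  out?: string;         // value of --out, if given
  outDir?: string;      // value of --out-dir, if given
  extractOnly: boolean; // true when --extract-only is set
}

function resolveOutputPath(videoId: string, opts: OutputOptions): string {
  // Assumption: an explicit --out path takes precedence over --out-dir.
  if (opts.out) return opts.out;

  // Documented default directory is ./summaries; strip trailing slashes.
  const dir = (opts.outDir ?? "./summaries").replace(/\/+$/, "");

  // Documented defaults: <video_id>.md for the full pipeline,
  // <video_id>.json for extract-only.
  const ext = opts.extractOnly ? "json" : "md";
  return `${dir}/${videoId}.${ext}`;
}
```

Under these assumptions, calling the helper with no options for video `VIDEO_ID` returns `./summaries/VIDEO_ID.md`.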

Options

| Option | Description |
|---|---|
| `--url <youtube_url>` | YouTube video URL (required) |
| `--model <model>` | OpenAI model to use (default: gpt-5-mini). Overrides OPENAI_MODEL env var. |
| `--lang <language>` | Summary output language (default: same as transcript language) |
| `--out <path>` | Output file path |
| `--out-dir <dir>` | Output directory; file is named `<video_id>.md` (default: ./summaries) |
| `--extract-only` | Skip summarization, output raw transcript as JSON |
| `--json` | Emit result as JSON (success and errors); progress logs go to stderr |
| `--stdout` | Write output to stdout instead of a file |
| `--help` | Show help |
| `--version` | Show version |

Requirements

- Node.js 18+
- OpenAI API key: required for summarization; not required for --extract-only when captions are available

Environment variables

| Variable | Description |
|---|---|
| OPENAI_API_KEY | Required for summarization. Not needed in --extract-only mode if captions are available. |
| OPENAI_MODEL | Optional. Fallback model if --model is not passed (default: gpt-5-mini). |

Set your API key before running:

export OPENAI_API_KEY=sk-...
youtube2md --url https://youtu.be/VIDEO_ID

Or create a .env file in your working directory:

OPENAI_API_KEY=sk-...
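
The model precedence described in the environment variables table (`--model`, then `OPENAI_MODEL`, then the default) can be sketched as follows. `resolveModel` is a hypothetical name, and treating an empty string as "not set" is an assumption.

```typescript
// Hypothetical helper illustrating the documented precedence.
const DEFAULT_MODEL = "gpt-5-mini";

function resolveModel(
  cliModel: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // --model wins, then OPENAI_MODEL, then the documented default.
  // Assumption: empty strings are treated the same as unset values.
  return cliModel || env["OPENAI_MODEL"] || DEFAULT_MODEL;
}
```

In a Node.js program the second argument would typically be `process.env`.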

Output format

Markdown (full pipeline)

# Video Title

> [Watch on YouTube](https://youtu.be/VIDEO_ID) | Duration: 12:34 | Published: 2024-01-01

## Summary

One paragraph overview of the video content.

## Chapters

### [0:00] Introduction

[▶ 0:00](https://youtu.be/VIDEO_ID?t=0)

- First key point from this section.
- Second key point from this section.

## Key Takeaways

- Key point 1
- Key point 2
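
The clickable timestamp links in the sample above follow a simple pattern: a display label plus a `?t=<seconds>` query on the short URL. A minimal sketch of how such links could be built is shown here; `formatTimestamp` and `timestampLink` are hypothetical names, and the `H:MM:SS` form for videos of an hour or longer is an assumption, since the sample only shows `M:SS`.

```typescript
// Hypothetical helpers; names are illustrative only.
function formatTimestamp(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = String(s % 60).padStart(2, "0");
  // Assumption: an hour or more renders as H:MM:SS, otherwise M:SS.
  return hours > 0
    ? `${hours}:${String(minutes).padStart(2, "0")}:${seconds}`
    : `${minutes}:${seconds}`;
}

function timestampLink(videoId: string, startSeconds: number): string {
  // YouTube's t parameter takes whole seconds.
  const t = Math.max(0, Math.floor(startSeconds));
  return `[▶ ${formatTimestamp(t)}](https://youtu.be/${videoId}?t=${t})`;
}
```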

JSON (extract-only)

{
  "ok": true,
  "videoId": "VIDEO_ID",
  "metadata": {
    "videoId": "VIDEO_ID",
    "title": "Video Title",
    "duration": "12:34",
    "publishDate": "2024-01-01",
    "nativeChapters": []
  },
  "segments": [
    { "text": "Hello world", "startSeconds": 0.0, "durationSeconds": 2.4 }
  ]
}
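
For external processing, the extract-only JSON can be given a type. The field names below mirror the sample above; the type and function names are hypothetical, and the element shape of `nativeChapters` is left as `unknown` because the sample only shows an empty array.

```typescript
// Field names mirror the sample; the type names are hypothetical.
interface TranscriptSegment {
  text: string;
  startSeconds: number;
  durationSeconds: number;
}

interface ExtractResult {
  ok: true;
  videoId: string;
  metadata: {
    videoId: string;
    title: string;
    duration: string;
    publishDate: string;
    nativeChapters: unknown[]; // element shape is not shown in the sample
  };
  segments: TranscriptSegment[];
}

// Join all segment texts into one plain-text transcript.
function transcriptText(result: ExtractResult): string {
  return result.segments.map((s) => s.text.trim()).join(" ");
}

// End of the last caption segment, in seconds.
function transcriptEndSeconds(result: ExtractResult): number {
  return result.segments.reduce(
    (end, s) => Math.max(end, s.startSeconds + s.durationSeconds),
    0,
  );
}
```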

JSON (full pipeline with `--json`)

{
  "ok": true,
  "videoId": "VIDEO_ID",
  "metadata": { "..." },
  "outputPath": "/path/to/summaries/VIDEO_ID.md"
}

Error JSON

All errors emit a structured object when --json is active:

{ "ok": false, "code": "E_TRANSCRIPT_UNAVAILABLE", "message": "..." }

| Code | Cause |
|---|---|
| E_TRANSCRIPT_UNAVAILABLE | No captions found and Whisper fallback unavailable |
| E_OPENAI_AUTH | OPENAI_API_KEY missing or invalid |
| E_OPENAI_RATE_LIMIT | OpenAI rate limit hit |
| E_WHISPER_FAILED | Whisper STT transcription failed |
| E_NETWORK | Network or YouTube access error |
| E_WRITE_FAILED | Could not write output file |
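
A caller consuming `--json` output can branch on the `ok` field. The sketch below models the result as a discriminated union using the error codes from the table; the type names, `parseCliResult`, and `shouldRetry` are hypothetical, and which codes are worth retrying is an assumption, not something the tool defines.

```typescript
// Hypothetical types and helpers for consumers of --json output.
type ErrorCode =
  | "E_TRANSCRIPT_UNAVAILABLE"
  | "E_OPENAI_AUTH"
  | "E_OPENAI_RATE_LIMIT"
  | "E_WHISPER_FAILED"
  | "E_NETWORK"
  | "E_WRITE_FAILED";

interface CliFailure { ok: false; code: ErrorCode; message: string }
interface CliSuccess { ok: true; videoId: string }
type CliResult = CliSuccess | CliFailure;

// Assumption: only transient failures are worth retrying.
const RETRYABLE: ReadonlySet<ErrorCode> = new Set<ErrorCode>([
  "E_OPENAI_RATE_LIMIT",
  "E_NETWORK",
]);

function parseCliResult(stdout: string): CliResult {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null || !("ok" in parsed)) {
    throw new Error('Unexpected output: missing "ok" field');
  }
  // The remaining fields are trusted here; a stricter check could validate them.
  return parsed as CliResult;
}

function shouldRetry(result: CliResult): boolean {
  return !result.ok && RETRYABLE.has(result.code);
}
```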

Transcript strategy

The tool tries these methods in order:

1. YouTube captions via Android Innertube: uses caption tracks directly (supports json3 and XML timedtext formats).
2. youtube-transcript fallback: retries with an alternate parser path.
3. OpenAI Whisper STT fallback: downloads audio and transcribes it (requires OPENAI_API_KEY; audio must be under 25 MB). Skipped in --extract-only mode when no API key is set.
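
The ordered fallback described above can be sketched as a loop over strategies. This is an illustration of the pattern, not the code in `src/youtube.ts`: `TranscriptStrategy`, `fetchWithFallback`, and the `enabled` hook are hypothetical, and treating an empty transcript as a miss is an assumption.

```typescript
// Hypothetical sketch of an ordered fallback chain.
interface Segment { text: string; startSeconds: number; durationSeconds: number }

interface TranscriptStrategy {
  name: string;
  // Return false to skip the strategy (e.g. Whisper without an API key).
  enabled: () => boolean;
  fetch: (videoId: string) => Promise<Segment[]>;
}

async function fetchWithFallback(
  videoId: string,
  strategies: TranscriptStrategy[],
): Promise<{ source: string; segments: Segment[] }> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    if (!strategy.enabled()) {
      failures.push(`${strategy.name}: skipped`);
      continue;
    }
    try {
      const segments = await strategy.fetch(videoId);
      // Assumption: an empty transcript counts as a miss.
      if (segments.length > 0) return { source: strategy.name, segments };
      failures.push(`${strategy.name}: empty`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${strategy.name}: ${reason}`);
    }
  }
  // Every strategy failed or was skipped.
  throw new Error(`E_TRANSCRIPT_UNAVAILABLE (${failures.join("; ")})`);
}
```

Collecting the per-strategy failure reasons keeps the final error message useful when every method fails.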

Summary process

Summarization runs in two modes based on transcript token count (using tiktoken with model-aware encoding):

1. Normalize transcript: convert each segment to `[MM:SS] spoken text`.
2. Count tokens: compute transcript size with tiktoken (fallback to o200k_base).
3. Choose mode:
   - Single-pass when total tokens are <= 5000
   - Chunked when total tokens are > 5000
4. Single-pass mode: one GPT request with metadata, native YouTube chapters, and full transcript.
5. Chunked mode:
   - Split into chunks targeting 5000 tokens.
   - Merge tiny final chunk (< 25% of limit) into the previous chunk.
   - Summarize each chunk in parallel (up to 4 concurrent jobs).
   - Combine chapters locally (chronological sort + dedupe).
   - One final GPT request for full-video summary and takeaways.
6. Validate + normalize: require non-empty summary, chapters, takeaways; sort and deduplicate.
7. Render Markdown: convert to final output.
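
Two parts of the chunked mode above lend themselves to a short sketch: running chunk summaries with a concurrency cap, and combining the resulting chapters locally. The code below is an illustration under stated assumptions, not the code in `src/summarizer.ts`: `mapWithConcurrency`, `mergeChapters`, and the `Chapter` shape are hypothetical, and treating two chapters with the same start time as duplicates is an assumption.

```typescript
// Hypothetical sketch: bounded-concurrency map plus a local chapter merge.
interface Chapter { startSeconds: number; title: string; points: string[] }

async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;
  // Each runner claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }
  const runnerCount = Math.max(0, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // same order as the input
}

function mergeChapters(perChunk: Chapter[][]): Chapter[] {
  const all = ([] as Chapter[]).concat(...perChunk);
  all.sort((a, b) => a.startSeconds - b.startSeconds);
  const seen = new Set<number>();
  // Assumption: chapters sharing a start time are duplicates; keep the first.
  return all.filter((chapter) => {
    if (seen.has(chapter.startSeconds)) return false;
    seen.add(chapter.startSeconds);
    return true;
  });
}
```

With a limit of 4, at most four `worker` calls are in flight at once, which matches the "up to 4 concurrent jobs" step.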

Token thresholds

Defined in src/summarizer.ts:

| Constant | Value | Meaning |
|---|---|---|
| SINGLE_PASS_TOKEN_LIMIT | 5000 | Use single-pass at or below this |
| CHUNK_TOKEN_LIMIT | 5000 | Target tokens per chunk |
| MIN_LAST_CHUNK_RATIO | 0.25 | Merge final chunk if smaller than 25% of limit |
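
To make the thresholds concrete, here is a minimal sketch of mode selection and chunk planning using the three constants. The constant values come from the table; `chooseMode`, `planChunks`, and the greedy line-packing strategy are assumptions for illustration, and the real splitting logic in `src/summarizer.ts` may differ.

```typescript
// Values come from the table; the functions are a hypothetical sketch.
const SINGLE_PASS_TOKEN_LIMIT = 5000;
const CHUNK_TOKEN_LIMIT = 5000;
const MIN_LAST_CHUNK_RATIO = 0.25;

function chooseMode(totalTokens: number): "single-pass" | "chunked" {
  return totalTokens <= SINGLE_PASS_TOKEN_LIMIT ? "single-pass" : "chunked";
}

interface Chunk { lines: number[]; tokens: number }

// Input: the token count of each normalized transcript line.
// Output: chunks holding line indices and their total token count.
function planChunks(lineTokens: number[]): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { lines: [], tokens: 0 };
  lineTokens.forEach((tokens, index) => {
    // Assumption: greedy packing; close a chunk before it would exceed the target.
    if (current.lines.length > 0 && current.tokens + tokens > CHUNK_TOKEN_LIMIT) {
      chunks.push(current);
      current = { lines: [], tokens: 0 };
    }
    current.lines.push(index);
    current.tokens += tokens;
  });
  if (current.lines.length > 0) chunks.push(current);

  // Merge a tiny final chunk (< 25% of the limit) into the previous one.
  const last = chunks[chunks.length - 1];
  const previous = chunks[chunks.length - 2];
  if (last && previous && last.tokens < CHUNK_TOKEN_LIMIT * MIN_LAST_CHUNK_RATIO) {
    previous.lines.push(...last.lines);
    previous.tokens += last.tokens;
    chunks.pop();
  }
  return chunks;
}
```

For example, eleven lines of 1000 tokens each pack into chunks of 5000, 5000, and 1000 tokens; the last is below 1250 (25% of 5000), so it merges into the second, giving two chunks of 5000 and 6000 tokens. A merged chunk may therefore exceed the target size.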

Development

# Clone and install
git clone https://github.com/sunghyo/youtube2md
cd youtube2md
npm install

# Set your API key
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

# Run without building
npx tsx src/index.ts --url https://www.youtube.com/watch?v=VIDEO_ID

Build

npm run build   # Compile TypeScript to dist/
npm run dev     # Run directly with tsx (no build needed)
npm run clean   # Remove dist/

Project structure

src/
├── index.ts       # Entry point — pipeline branching and orchestration
├── cli.ts         # CLI argument parsing (Commander)
├── youtube.ts     # Metadata fetch + transcript fetch with fallback
├── summarizer.ts  # OpenAI Responses API prompting + JSON parsing
├── markdown.ts    # Markdown generation + file writing
└── types.ts       # Shared interfaces, AppError, error codes
summaries/         # Default output directory

Attribution

This project was built with AI assistance from Claude (Anthropic) and Codex (OpenAI).
