@tanush-yadav/youtube-transcript-mcp

v1.0.0

Published

7 months ago

Model Context Protocol server for extracting YouTube video transcripts. Works with Claude Desktop and other MCP-compatible clients.

Vulnerabilities: 0 High, 0 Medium, 0 Low

tanush-yadav

mcp youtube transcript modelcontextprotocol claude llm ai youtube-api captions subtitles video-transcript youtubei.js

YouTube Transcript MCP Server

A Model Context Protocol (MCP) server that lets Large Language Models (LLMs) extract transcripts from YouTube videos. It is built on the youtubei.js library and supports timestamps, transcript metadata, and export to text files.

✨ Features

🎥 Extract transcripts from any YouTube video with captions
⏱️ Timestamp support - Get transcripts with or without timestamps
📊 Rich metadata - Word count, duration, segment count, and more
💾 Export to files - Save transcripts as text files
🔧 Flexible input - Accepts full URLs, short URLs, or just video IDs
⚡ High reliability - Uses YouTube's internal API via youtubei.js
🚀 No API key required - Works out of the box
🛡️ Error handling - Clear, actionable error messages

📦 Installation

As an MCP Server for Claude Desktop

# Clone the repository
git clone https://github.com/yourusername/youtube-transcript-mcp.git
cd youtube-transcript-mcp

# Install dependencies
npm install

As an npm Package

npm install youtube-transcript-mcp

Or using yarn:

yarn add youtube-transcript-mcp

🚀 Quick Start

Configuration for Claude Desktop

Add the server to your Claude Desktop configuration:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "youtube-transcript": {
      "command": "node",
      "args": ["/absolute/path/to/youtube-transcript-mcp/index.js"]
    }
  }
}

Or, if the package is installed via npm, run it through npx:

{
  "mcpServers": {
    "youtube-transcript": {
      "command": "npx",
      "args": ["youtube-transcript-mcp"]
    }
  }
}
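If your configuration file already lists other servers, the new entry has to be merged into the existing `mcpServers` object rather than pasted over it. A minimal sketch of that merge follows; the helper name `addTranscriptServer` and the `filesystem` entry are hypothetical and only illustrate the shape of the file.

```javascript
// Hypothetical helper: returns a copy of an existing Claude Desktop config
// with the youtube-transcript entry added, leaving other servers untouched.
function addTranscriptServer(config, indexPath) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      'youtube-transcript': { command: 'node', args: [indexPath] },
    },
  }
}

// Example: a config that already contains another (made-up) server entry.
const existingConfig = {
  mcpServers: { filesystem: { command: 'npx', args: ['some-other-server'] } },
}

const mergedConfig = addTranscriptServer(
  existingConfig,
  '/absolute/path/to/youtube-transcript-mcp/index.js'
)
console.log(JSON.stringify(mergedConfig, null, 2))
```

The function returns a new object instead of mutating its input, so the original configuration can be kept as a backup before the file is rewritten.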

🛠️ Available Tools

1. `get_transcript`

Extract transcript from a YouTube video with optional timestamps.

Parameters:

url (string, required): YouTube video URL or video ID
include_timestamps (boolean, optional): Include timestamps in output (default: false)

Example Request:

{
  "name": "get_transcript",
  "arguments": {
    "url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "include_timestamps": true
  }
}

Example Output with timestamps:

[00:00] We're no strangers to love
[00:04] You know the rules and so do I
[00:08] A full commitment's what I'm thinking of
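To illustrate how output like the lines above could be produced, here is a rough sketch that renders caption segments with or without `[MM:SS]` prefixes. The segment shape (`startMs`, `text`) and both function names are assumptions made for this example, not the server's actual internals.

```javascript
// Assumed segment shape: { startMs: number, text: string }.
function formatTimestamp(startMs) {
  const totalSeconds = Math.floor(startMs / 1000)
  const minutes = Math.floor(totalSeconds / 60)
  const seconds = totalSeconds % 60
  return `[${String(minutes).padStart(2, '0')}:${String(seconds).padStart(2, '0')}]`
}

// With timestamps: one segment per line. Without: a single block of text.
function renderTranscript(segments, includeTimestamps = false) {
  return segments
    .map((s) => (includeTimestamps ? `${formatTimestamp(s.startMs)} ${s.text}` : s.text))
    .join(includeTimestamps ? '\n' : ' ')
}

const sampleSegments = [
  { startMs: 0, text: "We're no strangers to love" },
  { startMs: 4000, text: 'You know the rules and so do I' },
  { startMs: 8000, text: "A full commitment's what I'm thinking of" },
]

console.log(renderTranscript(sampleSegments, true))
// [00:00] We're no strangers to love
// [00:04] You know the rules and so do I
// [00:08] A full commitment's what I'm thinking of
```

In this sketch the minutes field simply keeps counting past 59 for videos longer than an hour; how the server formats such timestamps is not specified here.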

2. `get_transcript_with_metadata`

Extract transcript along with comprehensive metadata.

Parameters:

url (string, required): YouTube video URL or video ID

Example Response:

{
  "metadata": {
    "video_id": "dQw4w9WgXcQ",
    "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "word_count": 251,
    "segment_count": 42,
    "duration": "3:32",
    "duration_seconds": 212,
    "language": "en",
    "is_auto_generated": false
  },
  "transcript": "Never gonna give you up...",
  "full_transcript_length": 1234
}
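The numeric fields in this response can all be derived from the caption segments themselves. The sketch below shows one way to compute them, assuming each segment carries `startMs`, `durationMs`, and `text`; that shape and the function name `summarizeSegments` are hypothetical.

```javascript
// Assumed segment shape: { startMs: number, durationMs: number, text: string }.
function summarizeSegments(segments) {
  const transcript = segments.map((s) => s.text).join(' ')
  // Split on whitespace and drop empty strings so blank segments count as zero words.
  const wordCount = transcript.split(/\s+/).filter(Boolean).length
  // Duration is taken as the end of the last segment, rounded to whole seconds.
  const last = segments[segments.length - 1]
  const durationSeconds = last ? Math.round((last.startMs + last.durationMs) / 1000) : 0
  const minutes = Math.floor(durationSeconds / 60)
  const seconds = String(durationSeconds % 60).padStart(2, '0')
  return {
    word_count: wordCount,
    segment_count: segments.length,
    duration: `${minutes}:${seconds}`,
    duration_seconds: durationSeconds,
  }
}

const metadataSample = summarizeSegments([
  { startMs: 0, durationMs: 4000, text: "We're no strangers to love" },
  { startMs: 208000, durationMs: 4000, text: 'Never gonna give you up' },
])
console.log(metadataSample)
```

Fields such as `language` and `is_auto_generated` describe the caption track rather than its text, so they cannot be computed this way and would have to come from YouTube's own track information.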

3. `save_transcript`

Save transcript to text file(s) on the local filesystem.

Parameters:

url (string, required): YouTube video URL or video ID
filename (string, required): Base filename (without extension)
with_timestamps (boolean, optional): Save version with timestamps (default: true)

Example:

{
  "name": "save_transcript",
  "arguments": {
    "url": "https://youtu.be/dQw4w9WgXcQ",
    "filename": "rickroll_transcript",
    "with_timestamps": true
  }
}

Creates files:

rickroll_transcript_clean.txt - Plain text transcript
rickroll_transcript_with_timestamps.txt - Transcript with timestamps (if enabled)
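The naming scheme above can be expressed as a small pure function that maps a base filename to the files to write. This is only a sketch of the convention described here; `planTranscriptFiles` is a hypothetical name and the server's real implementation may differ.

```javascript
// Returns the list of files to write for a given base filename.
// The clean file is always produced; the timestamped one only when requested.
function planTranscriptFiles(filename, cleanText, timestampedText, withTimestamps = true) {
  const files = [{ path: `${filename}_clean.txt`, contents: cleanText }]
  if (withTimestamps) {
    files.push({ path: `${filename}_with_timestamps.txt`, contents: timestampedText })
  }
  return files
}

const plannedFiles = planTranscriptFiles(
  'rickroll_transcript',
  "We're no strangers to love",
  "[00:00] We're no strangers to love"
)
console.log(plannedFiles.map((f) => f.path))
// [ 'rickroll_transcript_clean.txt', 'rickroll_transcript_with_timestamps.txt' ]
```

Each entry could then be written to disk with `writeFile` from `node:fs/promises`; keeping the planning step separate makes the naming rule easy to test without touching the filesystem.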

💻 Programmatic Usage

As an MCP Client

import { Client } from '@modelcontextprotocol/sdk/client/index.js'
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js'

// Initialize transport
const transport = new StdioClientTransport({
  command: 'node',
  args: ['/path/to/youtube-transcript-mcp/index.js'],
})

// Create client
const client = new Client({
  name: 'youtube-transcript-client',
  version: '1.0.0',
})

// Connect and use
await client.connect(transport)

// Get transcript with timestamps
const result = await client.callTool({
  name: 'get_transcript',
  arguments: {
    url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
    include_timestamps: true,
  },
})

console.log(result.content[0].text)
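Reading `result.content[0].text` directly assumes the call succeeded and returned exactly one text item. MCP tool results carry a `content` array and an optional `isError` flag, so a slightly more defensive reader might look like the sketch below; the helper name `textFromToolResult` and the sample error message are made up for illustration.

```javascript
// Collects all text items from an MCP tool result and throws if the
// server flagged the call as failed via `isError`.
function textFromToolResult(result) {
  const text = (result.content ?? [])
    .filter((item) => item.type === 'text')
    .map((item) => item.text)
    .join('\n')
  if (result.isError) {
    throw new Error(text || 'Tool call failed')
  }
  return text
}

// Example with hand-written result objects (no server needed).
const okResult = { content: [{ type: 'text', text: "[00:00] We're no strangers to love" }] }
console.log(textFromToolResult(okResult))

const failedResult = { isError: true, content: [{ type: 'text', text: 'No transcript available' }] }
try {
  textFromToolResult(failedResult)
} catch (err) {
  console.error('Tool error:', err.message)
}
```

In the client example above, this would replace the final `console.log` line with `console.log(textFromToolResult(result))`.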

Direct Module Usage

// Coming soon: direct module import is not supported yet; this shows the planned usage
import { YouTubeTranscriptExtractor } from 'youtube-transcript-mcp'

const extractor = new YouTubeTranscriptExtractor()
const transcript = await extractor.getTranscript('dQw4w9WgXcQ')
console.log(transcript)

🌐 Supported URL Formats

The server accepts various YouTube URL formats:

✅ Standard: https://www.youtube.com/watch?v=VIDEO_ID
✅ Short: https://youtu.be/VIDEO_ID
✅ Embed: https://www.youtube.com/embed/VIDEO_ID
✅ Mobile: https://m.youtube.com/watch?v=VIDEO_ID
✅ Shorts: https://www.youtube.com/shorts/VIDEO_ID
✅ With timestamps: https://youtube.com/watch?v=VIDEO_ID&t=123
✅ With playlist: https://youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID
✅ Just video ID: dQw4w9WgXcQ
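As an illustration of how these formats can be normalized to a single video ID, here is a minimal sketch built on the standard `URL` class. It assumes video IDs are 11 characters drawn from letters, digits, `-`, and `_`; the function name `extractVideoId` is hypothetical and the server's own parser may behave differently.

```javascript
// Assumption: YouTube video IDs are 11 characters of [A-Za-z0-9_-].
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/

// Returns the video ID for any of the formats listed above, or null.
function extractVideoId(input) {
  const value = input.trim()
  if (VIDEO_ID_PATTERN.test(value)) return value // bare video ID

  let url
  try {
    url = new URL(value)
  } catch {
    return null // not a URL and not a bare ID
  }

  // Treat www. and m. hosts the same as the bare domain.
  const host = url.hostname.replace(/^(www\.|m\.)/, '')
  let candidate = null
  if (host === 'youtu.be') {
    candidate = url.pathname.slice(1)
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      // Extra parameters such as t= or list= are ignored.
      candidate = url.searchParams.get('v')
    } else {
      const match = url.pathname.match(/^\/(embed|shorts)\/([^/]+)/)
      candidate = match ? match[2] : null
    }
  }
  return candidate && VIDEO_ID_PATTERN.test(candidate) ? candidate : null
}

console.log(extractVideoId('https://youtu.be/dQw4w9WgXcQ')) // dQw4w9WgXcQ
console.log(extractVideoId('https://youtube.com/watch?v=dQw4w9WgXcQ&t=123')) // dQw4w9WgXcQ
console.log(extractVideoId('https://example.com/watch?v=dQw4w9WgXcQ')) // null
```

Validating the candidate against the ID pattern at the end means malformed or unrelated URLs yield `null` instead of being passed on to YouTube.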

📝 Usage Examples with Claude

Once configured, you can ask Claude:

"Get the transcript from https://www.youtube.com/watch?v=dQw4w9WgXcQ"

"Extract the YouTube transcript with timestamps from video ID abc123"

"Save the transcript from this video to a file: [URL]"

"Get detailed metadata and transcript from: [URL]"

"Summarize this YouTube video: [URL]" (Claude will fetch and summarize)

🔧 Development

Running Tests

npm test

Building from Source

git clone https://github.com/yourusername/youtube-transcript-mcp.git
cd youtube-transcript-mcp
npm install
npm run build

Development Mode

npm run dev

Testing the MCP Server

Create a test file test-client.js:

import { Client } from '@modelcontextprotocol/sdk/client/index.js'
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js'

const transport = new StdioClientTransport({
  command: 'node',
  args: ['./index.js'],
})

const client = new Client({
  name: 'test-client',
  version: '1.0.0',
})

await client.connect(transport)

// List available tools
const tools = await client.listTools()
console.log('Available tools:', tools)

// Test transcript extraction
const result = await client.callTool({
  name: 'get_transcript',
  arguments: {
    url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
  },
})

console.log('Transcript:', result.content[0].text)
await transport.close()

🐛 Troubleshooting

Common Issues

"No transcript available"
- ✓ Ensure the video has captions/subtitles available
- ✓ Check if the video is public and not age-restricted
- ✓ Some live streams may not have transcripts available

Connection errors
- ✓ Verify your internet connection
- ✓ Check if YouTube is accessible in your region
- ✓ Ensure Node.js version is 18.0 or higher

MCP server not found in Claude
- ✓ Verify the path in your Claude configuration is absolute
- ✓ Ensure Node.js is properly installed and in PATH
- ✓ Restart Claude Desktop after configuration changes

Permission errors when saving files
- ✓ Ensure write permissions in the target directory
- ✓ Check disk space availability
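
The Node.js version requirement mentioned above is easy to verify from a script. A small sketch follows, using `process.versions.node` (the running version without the leading `v`); the helper name `meetsMinimumNode` is hypothetical.

```javascript
// Returns true when the given Node.js version string (e.g. "18.19.0")
// has a major version at or above the required minimum.
function meetsMinimumNode(version, minimumMajor = 18) {
  const major = Number.parseInt(version.split('.')[0], 10)
  return Number.isInteger(major) && major >= minimumMajor
}

console.log(
  meetsMinimumNode(process.versions.node)
    ? `Node.js ${process.versions.node} is supported`
    : `Node.js ${process.versions.node} is too old; 18.0 or higher is required`
)
```

Running `node --version` in the same shell that launches Claude Desktop gives the same information and also confirms that Node.js is on the PATH.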

Debug Mode

Enable debug logging by setting the environment variable:

DEBUG=youtube-transcript-mcp node index.js

📊 Performance

Average transcript extraction time: 1-3 seconds
Memory usage: ~50MB
Supports videos up to 12+ hours in length
Handles 1000+ segments efficiently

🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

Development Guidelines

Follow existing code style
Add tests for new features
Update documentation as needed
Ensure all tests pass before submitting PR

📄 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

youtubei.js - Excellent YouTube API implementation
Model Context Protocol - MCP SDK and specification
Anthropic - For creating Claude and MCP

📈 Roadmap

[ ] Support for multiple language transcripts
[ ] Batch processing for multiple videos
[ ] Transcript translation capabilities
[ ] Export to SRT/VTT subtitle formats
[ ] Caching for improved performance
[ ] Support for playlist extraction
[ ] Real-time transcript streaming
[ ] Custom formatting options

💬 Support

For issues, questions, or suggestions:

📝 Changelog

[1.0.0] - 2024-01-03

🎉 Initial release
✨ Transcript extraction with youtubei.js
⏱️ Timestamp support
📊 Metadata extraction
💾 File saving capability
🔧 MCP protocol implementation

Made with ❤️ by the Open Source Community

Star ⭐ this repo if you find it useful!
