@hallelx/youtube-transcript
v0.2.0
Fetch YouTube video transcripts and subtitles (manual and auto-generated) from Node.js, Bun, and Deno. TypeScript port of youtube-transcript-api.
Youtube Transcript Ts — @hallelx/youtube-transcript
Fetch transcripts and subtitles from YouTube videos. Works with both manually created captions and auto-generated transcripts. Supports translation and multiple output formats (JSON, text, SRT, WebVTT, pretty).
This is a faithful TypeScript port of the excellent Python library
youtube-transcript-api
by jdepoix. It uses the same internal
youtubei/v1/player endpoint, so it does not scrape the YouTube web page
DOM and is much more resilient than HTML-scraping alternatives.
Runs on Node.js (>=18), Bun, and Deno (with a custom fetchFn on
Deno when using proxies). Zero runtime dependencies in the common path.
Installation
npm install @hallelx/youtube-transcript
# or
bun add @hallelx/youtube-transcript
# or
pnpm add @hallelx/youtube-transcript
[!IMPORTANT] Deploying to Vercel, AWS Lambda, or Cloudflare Workers? YouTube often blocks transcript requests from datacenter IP addresses. Read the Deploying on serverless platforms section before starting.
Quick start
import { YouTubeTranscriptApi } from '@hallelx/youtube-transcript';
const api = new YouTubeTranscriptApi();
const transcript = await api.fetch('arj7oStGLkU');
for (const snippet of transcript) {
console.log(`[${snippet.start}s] ${snippet.text}`);
}
fetch(videoId, options?) returns a FetchedTranscript containing snippets,
the language, and metadata. The default language is English (en); pass
languages to specify a priority list:
const transcript = await api.fetch('arj7oStGLkU', {
languages: ['de', 'en'], // try German first, fall back to English
});
Listing available transcripts
const list = await api.list('arj7oStGLkU');
for (const transcript of list) {
console.log(transcript.languageCode, transcript.language, transcript.isGenerated);
}
// Find a specific kind:
const manual = list.findManuallyCreatedTranscript(['en']);
const generated = list.findGeneratedTranscript(['en']);
const fetched = await manual.fetch();
Translation
const list = await api.list('arj7oStGLkU');
const en = list.findTranscript(['en']);
if (en.isTranslatable) {
const french = en.translate('fr');
const fetched = await french.fetch();
console.log(fetched.snippets);
}
Output formatters
import {
YouTubeTranscriptApi,
JSONFormatter,
SRTFormatter,
WebVTTFormatter,
TextFormatter,
} from '@hallelx/youtube-transcript';
const transcript = await new YouTubeTranscriptApi().fetch('arj7oStGLkU');
console.log(new JSONFormatter().formatTranscript(transcript, { indent: 2 }));
console.log(new SRTFormatter().formatTranscript(transcript));
console.log(new WebVTTFormatter().formatTranscript(transcript));
console.log(new TextFormatter().formatTranscript(transcript));
Preserving HTML formatting
By default, all HTML tags are stripped from snippet text. To preserve a small
whitelist of formatting tags (<strong>, <em>, <b>, <i>, <mark>,
<small>, <del>, <ins>, <sub>, <sup>), pass preserveFormatting: true:
const transcript = await api.fetch('arj7oStGLkU', { preserveFormatting: true });
CLI
The package ships a youtube-transcript binary:
youtube-transcript --list-transcripts arj7oStGLkU
youtube-transcript --languages en --format srt arj7oStGLkU
youtube-transcript --languages de en --format json arj7oStGLkU dQw4w9WgXcQ
youtube-transcript --translate fr arj7oStGLkU
Run youtube-transcript --help for the full list of options.
Deploying on serverless platforms
YouTube tightly restricts access to its transcript endpoints from datacenter IP addresses (Vercel, AWS, Cloudflare, etc.). While it may work locally, you will often encounter RequestBlocked or IpBlocked errors in production.
YouTube serves transcripts from two main internal endpoints. Starting in late 2024, they tightened enforcement on the timedtext endpoint, which now heavily penalizes datacenter IP reputations while continuing to serve residential and mobile IPs. This means serverless functions and cloud hosting providers are blocked by default.
For a deep dive into the technical details and current community reports, see the umbrella issue (#1).
Platform compatibility
| Platform                  | Works out of the box? | Recommended strategy             |
|---------------------------|-----------------------|----------------------------------|
| Local dev (home internet) | Yes                   | No proxy needed                  |
| Vercel serverless         | No                    | WebshareProxyConfig or fallback  |
| AWS Lambda                | No                    | WebshareProxyConfig or fallback  |
| Cloudflare Workers        | No                    | Custom fetchFn + external relay  |
| Netlify Functions         | No                    | WebshareProxyConfig or fallback  |
| Render web service        | Partial               | Long-lived IP, ~70-90% success   |
| Railway                   | Partial               | Similar to Render                |
| Fly.io                    | Partial               | Depends on region                |
| Self-hosted (residential) | Yes                   | No proxy needed                  |
As of April 2026. YouTube enforcement changes frequently — please report regressions in the umbrella issue.
Strategy #1: Webshare residential proxies (Recommended)
Residential proxies use IP addresses assigned to home internet connections, which have a much higher reputation than datacenter IPs.
- Sign up: Create an account at webshare.io and purchase a Residential plan (do NOT use "Proxy Server", "Static Residential", or the free tier).
- Environment Variables: Add WEBSHARE_PROXY_USERNAME and WEBSHARE_PROXY_PASSWORD to your platform's dashboard (e.g., Vercel Project Settings > Environment Variables).
- Install Dependencies: If using Node.js, ensure undici is installed as a production dependency: npm install undici.
- Implementation:
import { YouTubeTranscriptApi, WebshareProxyConfig } from '@hallelx/youtube-transcript';
const api = new YouTubeTranscriptApi({
proxyConfig: process.env.WEBSHARE_PROXY_USERNAME
? new WebshareProxyConfig({
proxyUsername: process.env.WEBSHARE_PROXY_USERNAME,
proxyPassword: process.env.WEBSHARE_PROXY_PASSWORD,
})
: undefined,
});
Cost: ~$6/month. Success Rate: ~99%.
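The implementation above checks only the username variable, so a deployment with the password missing would still construct a proxy config. A minimal sketch of a stricter check follows. The helper name readWebshareCredentials and the WebshareCredentials interface are illustrative and not part of the package; only the two environment variable names and the proxyUsername / proxyPassword fields come from the steps above.

```typescript
// Hypothetical helper, not part of @hallelx/youtube-transcript.
interface WebshareCredentials {
  proxyUsername: string;
  proxyPassword: string;
}

function readWebshareCredentials(
  env: Record<string, string | undefined>,
): WebshareCredentials | undefined {
  const proxyUsername = env.WEBSHARE_PROXY_USERNAME;
  const proxyPassword = env.WEBSHARE_PROXY_PASSWORD;
  // Require both values; missing or empty strings count as "not configured".
  if (!proxyUsername || !proxyPassword) return undefined;
  return { proxyUsername, proxyPassword };
}
```

One way to use it: call readWebshareCredentials(process.env) once at startup and pass new WebshareProxyConfig(creds) as proxyConfig only when the result is defined, so local development keeps working without a proxy.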
Strategy #2: Generic proxy / custom fetchFn
If you already have a proxy provider (Bright Data, Oxylabs, etc.) or are on a platform like Cloudflare Workers where undici is unavailable, use GenericProxyConfig or a custom fetchFn.
import { YouTubeTranscriptApi, GenericProxyConfig } from '@hallelx/youtube-transcript';
// Using a generic proxy (user, pass, and proxy.example.com are placeholders)
const api = new YouTubeTranscriptApi({
  proxyConfig: new GenericProxyConfig({
    httpUrl: 'http://user:pass@proxy.example.com:8080',
    httpsUrl: 'https://user:pass@proxy.example.com:8080',
  }),
});
// Using a custom fetch (e.g., for a relay or specialized client)
const relayApi = new YouTubeTranscriptApi({
  fetchFn: (url, init) => {
    return fetch(`https://my-proxy-relay.com?url=${encodeURIComponent(url.toString())}`, init);
  },
});
Strategy #3: Free CORS proxy fallback
You can use a public CORS proxy as a last resort.
[!WARNING] This is not production-grade. Free CORS proxies have no SLA, log your signed URLs, and can rate-limit or disappear at any time. Fine for side projects; use a real proxy for production.
const api = new YouTubeTranscriptApi({
fetchFn: (url, init) => {
return fetch(`https://api.corsproxy.io/?url=${encodeURIComponent(url.toString())}`, init);
},
});
Strategy #4: Fallback to another service
A robust production implementation should catch RequestBlocked and fall back to an external transcription service (e.g., AssemblyAI, Deepgram) which can also handle videos where transcripts are truly disabled.
import { YouTubeTranscriptApi, RequestBlocked } from '@hallelx/youtube-transcript';
const api = new YouTubeTranscriptApi();
// fetchAlternativeService is a placeholder for your own integration
async function getTranscript(videoId: string) {
  try {
    return await api.fetch(videoId);
  } catch (err) {
    if (err instanceof RequestBlocked) {
      // Fallback to AssemblyAI / Deepgram / etc.
      return fetchAlternativeService(videoId);
    }
    throw err;
  }
}
Local development notes
Local development usually works without any configuration because your ISP provides a residential IP. If you hit blocks locally, ensure you are not on a VPN or corporate network. If you must use a VPN, configure the library with a proxy as shown above.
Working around IP bans (proxies)
YouTube blocks IPs that make too many requests, especially from cloud providers. The library exposes two proxy configurations:
Generic HTTP/HTTPS proxy
import { YouTubeTranscriptApi, GenericProxyConfig } from '@hallelx/youtube-transcript';
const api = new YouTubeTranscriptApi({
proxyConfig: new GenericProxyConfig({
httpUrl: 'http://user:pass@proxy.example.com:8080',
httpsUrl: 'http://user:pass@proxy.example.com:8080',
}),
});
Webshare rotating residential proxies (recommended)
import { YouTubeTranscriptApi, WebshareProxyConfig } from '@hallelx/youtube-transcript';
const api = new YouTubeTranscriptApi({
proxyConfig: new WebshareProxyConfig({
proxyUsername: 'your-webshare-username',
proxyPassword: 'your-webshare-password',
}),
});
Runtime support for proxies
- Node.js: requires the optional peer dependency undici. Install it once with npm install undici. The library lazy-loads it only when a proxy is in use.
- Bun: uses Bun's native fetch({ proxy }) option, so no extra deps are needed.
- Deno: pass a custom fetchFn configured with Deno.createHttpClient.
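If the same code is deployed to more than one runtime, it can help to log which of the three cases above applies at startup. The sketch below is one way to do that. It is not how the library detects its runtime; the names detectRuntime and proxyRequirement are illustrative, and the advice for the 'unknown' case is an assumption.

```typescript
type Runtime = 'bun' | 'deno' | 'node' | 'unknown';

function detectRuntime(): Runtime {
  const g = globalThis as unknown as Record<string, unknown>;
  // Check Bun and Deno first: both can expose a Node-compatible `process` global.
  if (typeof g.Bun !== 'undefined') return 'bun';
  if (typeof g.Deno !== 'undefined') return 'deno';
  const proc = g.process as { versions?: { node?: string } } | undefined;
  if (proc?.versions?.node) return 'node';
  return 'unknown';
}

// Summary of the proxy requirements listed above, keyed by runtime.
const proxyRequirement: Record<Runtime, string> = {
  node: 'install the optional peer dependency undici',
  bun: 'no extra dependencies (native fetch proxy option)',
  deno: 'pass a custom fetchFn built with Deno.createHttpClient',
  unknown: 'pass a custom fetchFn', // assumption: safest default elsewhere
};

console.log(`Proxy support on ${detectRuntime()}: ${proxyRequirement[detectRuntime()]}`);
```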
Custom fetchFn
For full control (custom HTTPS agents, retries, telemetry), inject your own fetch implementation:
const api = new YouTubeTranscriptApi({
fetchFn: async (input, init) => {
// wrap the global fetch, plug in middleware, etc.
return fetch(input, init);
},
});
Partial blocking (transcript fallback)
Sometimes YouTube allows the initial metadata discovery (the list() call) but
blocks the final transcript fetch. For this case, you can provide a
transcriptFetchFallback to route only the final step through a proxy or relay:
const api = new YouTubeTranscriptApi({
transcriptFetchFallback: async (signedUrl, videoId) => {
// signedUrl contains a temporary auth token. Your proxy will see this.
const res = await fetch(
`https://your-proxy.io/?url=${encodeURIComponent(signedUrl)}`,
);
return res.ok ? res : null;
},
});
// fetch() will now use the fallback automatically if the primary request is blocked
const transcript = await api.fetch('arj7oStGLkU');
Note: Free public CORS proxies have no SLA and are often unreliable for production use. For serious workloads, it is recommended to use a residential proxy provider or a dedicated transcription service.
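Because the signed URL carries a temporary auth token, you may want to avoid writing it to your own logs when debugging a fallback. The sketch below redacts every query parameter value before logging. The helper name redactSignedUrl and the example URL are illustrative; the actual parameter names in YouTube's signed URLs are not assumed here, which is why all values are masked.

```typescript
// Hypothetical logging helper, not part of @hallelx/youtube-transcript.
function redactSignedUrl(signedUrl: string): string {
  const url = new URL(signedUrl);
  // Copy the keys first so the parameters are not mutated while iterating.
  for (const key of [...url.searchParams.keys()]) {
    url.searchParams.set(key, 'REDACTED');
  }
  return url.toString();
}

// Example with a made-up URL; host, path, and parameters are placeholders.
console.log(redactSignedUrl('https://example.com/api/timedtext?v=abc&signature=xyz'));
```

Inside transcriptFetchFallback, log redactSignedUrl(signedUrl) and pass the original signedUrl to the relay.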
Error handling
All exceptions extend YouTubeTranscriptApiException. The library uses a
hierarchical error structure so you can catch broad categories of failure or
specific edge cases:
- YouTubeTranscriptApiException: base class
  - CookieError: any cookie-related failure
    - CookiePathInvalid
    - CookieInvalid
  - CouldNotRetrieveTranscript: catch this for any fetch failure
    - YouTubeDataUnparsable: YouTube response shape changed
    - YouTubeRequestFailed: network-level error
    - VideoUnplayable: region-locked or copyright strike
    - VideoUnavailable: video deleted or private
    - InvalidVideoId: passed a URL instead of an ID
    - RequestBlocked: catch this to handle all IP blocks
      - IpBlocked: specifically HTTP 429 or reCAPTCHA
    - TranscriptsDisabled: no captions on this video
    - AgeRestricted: requires sign-in
    - NotTranslatable
    - TranslationLanguageNotAvailable
    - FailedToCreateConsentCookie
    - NoTranscriptFound: requested language doesn't exist
    - PoTokenRequired
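Since the errors form a class hierarchy, an instanceof check against a parent also matches its subclasses. The sketch below illustrates this with stand-in classes that mirror a slice of the hierarchy, so it runs without the package installed. The classifyFailure function and its return labels are illustrative; in real code, import the classes of the same name from the package instead of declaring them.

```typescript
// Stand-in classes mirroring part of the hierarchy, for illustration only.
class YouTubeTranscriptApiException extends Error {}
class CouldNotRetrieveTranscript extends YouTubeTranscriptApiException {}
class RequestBlocked extends CouldNotRetrieveTranscript {}
class IpBlocked extends RequestBlocked {}
class TranscriptsDisabled extends CouldNotRetrieveTranscript {}
class NoTranscriptFound extends CouldNotRetrieveTranscript {}

type FailureKind = 'blocked' | 'no-transcript' | 'other-fetch-failure' | 'unrelated';

function classifyFailure(err: unknown): FailureKind {
  // Specific categories first; the broad parent check comes last.
  if (err instanceof RequestBlocked) return 'blocked'; // also matches IpBlocked
  if (err instanceof TranscriptsDisabled || err instanceof NoTranscriptFound) {
    return 'no-transcript';
  }
  if (err instanceof CouldNotRetrieveTranscript) return 'other-fetch-failure';
  return 'unrelated';
}

console.log(classifyFailure(new IpBlocked('rate limited')));
```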
import {
YouTubeTranscriptApi,
TranscriptsDisabled,
NoTranscriptFound,
} from '@hallelx/youtube-transcript';
try {
const transcript = await new YouTubeTranscriptApi().fetch('xxx');
} catch (err) {
if (err instanceof TranscriptsDisabled) {
console.log('No captions on this video');
} else if (err instanceof NoTranscriptFound) {
console.log('No transcript in the requested language');
} else {
throw err;
}
}
Catching errors correctly
When handling IP blocks, always catch the parent RequestBlocked class rather
than the specific IpBlocked subclass. YouTube often uses "bot detection"
mechanisms that throw RequestBlocked directly; if you only catch IpBlocked,
your error handler will be skipped.
// ✅ CORRECT: Catches both 429s and bot-detection blocks
try {
const transcript = await api.fetch(videoId);
} catch (err) {
if (err instanceof RequestBlocked) {
// fall back to a proxy or different provider
}
}
// ❌ WRONG: Misses bot-detection cases
try {
const transcript = await api.fetch(videoId);
} catch (err) {
if (err instanceof IpBlocked) {
// this block will NOT run if YouTube returns REASON_BOT_DETECTED
}
}
Common patterns
Pattern 1: Retry-worthy errors
Use this for transient issues where a different IP or a retry might succeed.
try {
return await api.fetch(videoId);
} catch (err) {
if (err instanceof RequestBlocked) {
// IP block or bot detection — fall back to proxy or another provider
return fetchWithProxy(videoId);
}
throw err;
}
Pattern 2: Permanent failures
Use this to distinguish between videos that will never have transcripts and those that failed for transient reasons.
try {
return await api.fetch(videoId);
} catch (err) {
if (err instanceof TranscriptsDisabled || err instanceof NoTranscriptFound) {
// No captions, or none in the requested language; retrying will not help
return transcribeWithAssemblyAI(videoId);
}
if (err instanceof RequestBlocked) {
// Transient — retry or fall back
return fetchWithProxy(videoId);
}
throw err;
}
API surface
class YouTubeTranscriptApi {
constructor(options?: {
proxyConfig?: ProxyConfig;
fetchFn?: typeof fetch;
transcriptFetchFallback?: (signedUrl: string, videoId: string) => Promise<Response | null>;
});
fetch(videoId: string, options?: { languages?: string[]; preserveFormatting?: boolean }): Promise<FetchedTranscript>;
list(videoId: string): Promise<TranscriptList>;
}
class TranscriptList implements Iterable<Transcript> {
videoId: string;
findTranscript(languageCodes: Iterable<string>): Transcript;
findManuallyCreatedTranscript(languageCodes: Iterable<string>): Transcript;
findGeneratedTranscript(languageCodes: Iterable<string>): Transcript;
}
class Transcript {
videoId: string;
language: string;
languageCode: string;
isGenerated: boolean;
isTranslatable: boolean;
translationLanguages: readonly TranslationLanguage[];
fetch(options?: { preserveFormatting?: boolean }): Promise<FetchedTranscript>;
translate(languageCode: string): Transcript;
}
class FetchedTranscript implements Iterable<FetchedTranscriptSnippet> {
snippets: FetchedTranscriptSnippet[];
videoId: string;
language: string;
languageCode: string;
isGenerated: boolean;
toRawData(): Array<{ text: string; start: number; duration: number }>;
}
interface FetchedTranscriptSnippet {
text: string;
start: number; // seconds
duration: number; // seconds
}
Differences from the Python library
- The pretty formatter uses JSON.stringify(data, null, 2) instead of Python's pprint. The output is intended for human reading and the structure is the same.
- WebshareProxyConfig percent-encodes the username and password when building the proxy URL (Python relies on the requests library to handle this).
- The constructor takes fetchFn?: typeof fetch rather than a requests.Session instance.
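To see why percent-encoding the credentials matters, consider a password containing ':' or '@', which would otherwise be read as URL delimiters. The sketch below shows the general technique with encodeURIComponent. It is not the library's actual implementation; buildProxyUrl, the ProxyEndpoint shape, and the example host are illustrative assumptions.

```typescript
// Hypothetical illustration of percent-encoding proxy credentials.
interface ProxyEndpoint {
  username: string;
  password: string;
  host: string; // placeholder proxy host
  port: number;
}

function buildProxyUrl({ username, password, host, port }: ProxyEndpoint): string {
  // encodeURIComponent escapes ':', '@', '/', '#', and other reserved characters,
  // so they cannot be mistaken for URL delimiters.
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `http://${user}:${pass}@${host}:${port}`;
}

console.log(
  buildProxyUrl({
    username: 'user@corp',
    password: 'p:ss/w#rd',
    host: 'proxy.example.com',
    port: 8080,
  }),
);
// → http://user%40corp:p%3Ass%2Fw%23rd@proxy.example.com:8080
```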
License
MIT. This package is a port of
youtube-transcript-api
by jdepoix, also MIT-licensed. Please consider supporting the upstream project.
