imdb-watchlist-scraper
v2.0.6
Scrape IMDb ratings list
IMDb Watchlist Scraper
Overview
This library provides a WatchlistScraper class for scraping IMDb watchlist data, including rating IDs and usernames, using the IMDb user ID.
Class Initialization
Parameters
- userId (string, required):
  IMDb user ID string, formatted like 'ur125655832'.
- headless (boolean, optional):
  Default: true.
  Determines whether the browser runs in headless mode. Set to false for testing purposes.
- timeoutInMs (number, optional):
  Default: 3 minutes (180000 ms).
  The script timeout duration. If a timeout occurs, the watchlistGrabIds() function will console.error() and return null.
- userAgent (string, optional):
  Default: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36".
  Refer to Playwright documentation for user agent formatting details. This is used in Playwright's browser.newContext().
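As an illustration of the parameters above, the following TypeScript sketch models the constructor options and fills in the documented defaults. The `ScraperOptions` interface and the `withDefaults` and `looksLikeUserId` helpers are hypothetical names introduced here and are not part of the library. The "ur followed by digits" pattern is an assumption inferred from the sample ID, not a documented rule.

```typescript
// Hypothetical shape of the constructor options, based on the parameter list above.
interface ScraperOptions {
  userId: string;
  headless?: boolean;
  timeoutInMs?: number;
  userAgent?: string;
}

// Default user agent string as documented in the parameter list.
const DEFAULT_USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36";

// Assumption: user IDs are "ur" followed by digits, as in "ur125655832".
function looksLikeUserId(userId: string): boolean {
  return /^ur\d+$/.test(userId);
}

// Fill in the documented defaults for any option the caller omitted.
function withDefaults(options: ScraperOptions): Required<ScraperOptions> {
  return {
    userId: options.userId,
    headless: options.headless ?? true,
    timeoutInMs: options.timeoutInMs ?? 3 * 60 * 1000, // 3 minutes
    userAgent: options.userAgent ?? DEFAULT_USER_AGENT,
  };
}

const resolved = withDefaults({ userId: "ur125655832", headless: false });
console.log(resolved.timeoutInMs); // 180000
```

Checking the ID shape before constructing the scraper may save launching a browser for an obviously malformed input, though the library itself may accept or reject IDs differently.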
Usage
Main Function: watchlistGrabIds()
Description
watchlistGrabIds() opens a Playwright browser, navigates to the user's watchlist, and scrapes all of the IDs in that watchlist.
Return Value
On success, the function returns an object:
{
  idArr: string[], // Array of rating IDs
  username: string | null // IMDb username, or null if unavailable
}
Error cases
The function throws an error explicitly in the following cases:
- The target watchlist is private
- Parsing time exceeds the time set in timeoutInMs
- Parsing completes but the returned idArr is empty
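Because the function throws in the cases listed above, callers may prefer to turn the exception into a plain value. The sketch below shows one possible way to do that. `ScrapeResult`, `HasGrabIds`, `Outcome`, and `grabSafely` are hypothetical names introduced for illustration, not library exports. The wrapper accepts any object with a `watchlistGrabIds()` method, so it can be tried with a stand-in instead of a real browser session. It also treats a `null` return as a failure, since the `timeoutInMs` description mentions that outcome.

```typescript
// Result shape as described under "Return Value".
interface ScrapeResult {
  idArr: string[];
  username: string | null;
}

// Hypothetical minimal view of the scraper: anything exposing watchlistGrabIds().
interface HasGrabIds {
  watchlistGrabIds(): Promise<ScrapeResult | null>;
}

// Either the scraped data or a human-readable failure reason.
type Outcome =
  | { kind: "success"; result: ScrapeResult }
  | { kind: "failure"; reason: string };

async function grabSafely(scraper: HasGrabIds): Promise<Outcome> {
  try {
    const result = await scraper.watchlistGrabIds();
    if (result === null) {
      return { kind: "failure", reason: "no result returned" };
    }
    return { kind: "success", result };
  } catch (error) {
    // Thrown values are not guaranteed to be Error instances.
    const reason = error instanceof Error ? error.message : String(error);
    return { kind: "failure", reason };
  }
}

// Stand-in scrapers for demonstration only; no browser is launched.
const fakeSuccess: HasGrabIds = {
  watchlistGrabIds: async () => ({ idArr: ["id-1", "id-2"], username: null }),
};
const fakeFailure: HasGrabIds = {
  watchlistGrabIds: async () => {
    throw new Error("watchlist is private");
  },
};

grabSafely(fakeSuccess).then((outcome) => console.log(outcome.kind));
grabSafely(fakeFailure).then((outcome) => console.log(outcome.kind));
```

With a real scraper instance, the same call would be `await grabSafely(scraper)`, after which the `kind` field can be checked instead of wrapping each call in try/catch.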
Example
Scraping All Rating IDs and Username by userId
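// Assumes WatchlistScraper is a named export of the package; adjust if it is a default export.
import { WatchlistScraper } from "imdb-watchlist-scraper";

const scraper = new WatchlistScraper({ userId: "ur125655832" });
try {
  const scrapingResults = await scraper.watchlistGrabIds();
  // username can be null
  const username = scrapingResults.username;
  if (username) {
    console.log(username);
  }
  // idArr is never null and always has at least 1 element
  const idArr = scrapingResults.idArr;
  console.log(idArr);
} catch (error) {
  // handle error (private watchlist, timeout, or empty result)
}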