coursera-scraper
v1.0.5
Unofficial Coursera scraper/downloader CLI for Node.js.
⚠️ Disclaimer: This project is not affiliated with, endorsed by, or sponsored by Coursera.
It uses a locally saved Coursera session, validates coursera.org/learn/... URLs, discovers course/module/lesson links, extracts video or reading content, and writes the results to local downloads/ folders. Internally, it uses Playwright for browser-driven login, navigation, and response interception.
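As a rough illustration of the URL validation step, the sketch below accepts only `https` links on `coursera.org` or `www.coursera.org` whose path starts with `/learn/<slug>`. The function name `parseCourseSlug`, the slug pattern, and the exact host list are assumptions made for this example; the package's real validation rules may differ.

```typescript
// Hypothetical helper, not taken from the coursera-scraper source.
// Returns the course slug for an accepted URL, or null for anything else.
function parseCourseSlug(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable absolute URL
  }
  // Assumption: only https links are accepted.
  if (url.protocol !== "https:") return null;
  // Assumption: only the bare and www hosts count as Coursera.
  const host = url.hostname.toLowerCase();
  if (host !== "coursera.org" && host !== "www.coursera.org") return null;
  // The path must begin with /learn/<slug>, optionally followed by more segments.
  const match = /^\/learn\/([A-Za-z0-9][A-Za-z0-9-]*)(?:\/|$)/.exec(url.pathname);
  return match?.[1] ?? null;
}

console.log(parseCourseSlug("https://www.coursera.org/learn/course-slug/home/welcome"));
```

Parsing with the built-in `URL` class, rather than matching the raw string, means the host and path are checked separately, so a look-alike host with `/learn/` in its path is rejected.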
Requirements
- Node.js 20+
- Google Chrome installed for Playwright's channel: "chrome" launch mode
Install
Install from npm
- Install the CLI globally:
npm install -g coursera-scraper
- Install the Chrome browser that Playwright will launch:
npx -y playwright install chrome
- Start the CLI:
coursera-dl
Install from source
- Install project dependencies:
npm install
- Install the Chrome browser that Playwright will launch:
npx playwright install chrome
- Build the TypeScript sources:
npm run build
- Start the CLI:
npm run cli
First run
- Run the CLI:
coursera-dl
- Choose the authentication flow when prompted, complete the login in the opened Chrome window, and wait for the CLI to confirm that your local session was saved.
- After login, run the CLI again and paste a course URL such as:
https://www.coursera.org/learn/course-slug/home/welcome
- The downloader will save output under the local downloads/ folder.
Usage
Interactive CLI:
coursera-dl
Interactive CLI from a source checkout:
npm run cli
Direct download entry point from a source checkout:
npm run download "https://www.coursera.org/learn/course-slug/home/welcome"
Queue commands:
coursera-dl queue add "https://www.coursera.org/learn/course-slug/home/welcome" --concurrency 3
coursera-dl queue list
coursera-dl queue run
coursera-dl queue remove QUEUE_ITEM_ID
coursera-dl queue retry-failed
The persistent queue is stored in ~/.coursera-scraper/queue.json, so queued links survive restarts.
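To show how a file-backed queue can survive restarts, here is a minimal sketch of loading and saving a JSON queue file. The `QueueItem` shape, the function names, and the write-then-rename step are assumptions for illustration; the actual schema of `queue.json` is not documented in this README. The demo writes to a temporary directory, not to `~/.coursera-scraper`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical item shape; the real queue.json schema may differ.
interface QueueItem {
  id: string;
  url: string;
  status: "pending" | "done" | "failed";
}

// Read the queue, treating a missing file as an empty queue.
function loadQueue(file: string): QueueItem[] {
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, "utf8")) as QueueItem[];
}

// Write to a temporary file first, then rename it over the target,
// so an interrupted write does not leave a truncated queue file.
function saveQueue(file: string, items: QueueItem[]): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(items, null, 2), "utf8");
  fs.renameSync(tmp, file);
}

// Demo against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "queue-demo-"));
const demoFile = path.join(demoDir, "queue.json");
saveQueue(demoFile, [
  { id: "1", url: "https://www.coursera.org/learn/course-slug/home/welcome", status: "pending" },
]);
// Reading the file back stands in for a fresh process after a restart.
const restored = loadQueue(demoFile);
console.log(restored.length, restored[0]?.status);
```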
Security posture
- Session state is stored outside the repository in ~/.coursera-scraper/auth.json.
- Downloaded filenames and folders are sanitized before being written to disk.
- Downloads are restricted to https:// URLs; localhost and private-network targets are blocked, redirects are capped, and a 2 GB per-file limit is enforced.
- Parallel downloads are capped to reduce accidental rate spikes.
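Two of the checks listed above, filename sanitization and download target filtering, can be sketched roughly as follows. The function names, the character set, the 200-character cap, and the list of blocked address ranges are all assumptions for this example and are not taken from the package source. The sketch inspects only the literal hostname and does not resolve DNS, enforce the redirect cap, or apply the size limit.

```typescript
import { isIP } from "node:net";

// Hypothetical helper: make a single path segment safe to write to disk.
function sanitizeFileName(name: string): string {
  const cleaned = name
    .replace(/[<>:"/\\|?*\u0000-\u001f]/g, "_") // separators, reserved and control characters
    .replace(/^\.+/, "") // no leading dots, so ".." cannot start the name
    .trim();
  // Assumed length cap and fallback name.
  return cleaned.length > 0 ? cleaned.slice(0, 200) : "untitled";
}

// Hypothetical check: https only, no localhost, no private or loopback IP literals.
function isAllowedDownloadUrl(input: string): boolean {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return false;
  }
  if (url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  // IPv6 literals appear in brackets; this sketch simply rejects all of them.
  if (host.startsWith("[")) return false;

  if (isIP(host) === 4) {
    const parts = host.split(".").map(Number);
    const a = parts[0] ?? 0;
    const b = parts[1] ?? 0;
    if (a === 0 || a === 10 || a === 127) return false; // "this network", private, loopback
    if (a === 169 && b === 254) return false; // link-local
    if (a === 172 && b >= 16 && b <= 31) return false; // private 172.16.0.0/12
    if (a === 192 && b === 168) return false; // private 192.168.0.0/16
  }
  return true;
}

console.log(sanitizeFileName("a/b:c?.mp4"));
console.log(isAllowedDownloadUrl("https://192.168.1.10/video.mp4"));
```

A hostname that merely resolves to a private address would pass this literal check, so a fuller implementation would also need to validate the resolved address at connection time.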
Responsible use
- Only access content you are enrolled in and allowed to access.
- Do not commit auth.json, screenshots, course exports, or debug dumps.
- Check SECURITY.md before opening issues.
Open source caveats
The repository is technically safer after hardening, but publishing a public downloader for a proprietary learning platform can still carry policy, copyright, and trademark risk. This README does not hide this fact. Before making the repository public, review:
- Coursera Terms of Use
- any local copyright exceptions or fair-use assumptions you are relying on
- whether the project name and README wording imply affiliation
Development
npm run scan:sensitive
npm run lint
npm run build
npm run audit:prod
License
MIT. See LICENSE.
