phetscraper
v3.1.1
Scraper and exporter of PHET simulations
PhET Simulations scraper
This scraper creates offline versions, in ZIM format, of PhET science and math simulations.
Requirements
It requires Node.js version 16 or higher.
Quick Start
npm i && phet2zim
The above will eventually output a ZIM file to output/
Command line arguments
See phet2zim --help for details.
phet2zim --output generates ZIM files in a specific folder.
phet2zim --output myFolder
--withoutLanguageVariants excludes languages with a country variant. For example, en_CA will not be present in the ZIM file when this argument is used.
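As a rough illustration of the filtering described for --withoutLanguageVariants, the sketch below treats any language code containing an underscore (such as en_CA) as a country variant and drops it. The function names are hypothetical and the underscore rule is an assumption; the scraper's actual logic may differ.

```typescript
// Hypothetical helpers, not taken from the phet2zim source.
// Assumption: a country variant is any code with a region suffix
// after an underscore, e.g. "en_CA" or "pt_BR".
function isCountryVariant(languageCode: string): boolean {
  return languageCode.indexOf("_") !== -1;
}

// Keep only base languages, mimicking what --withoutLanguageVariants
// is described to do.
function withoutLanguageVariants(languageCodes: string[]): string[] {
  return languageCodes.filter((code) => !isCountryVariant(code));
}

// "en_CA" and "pt_BR" are dropped; "en" and "fr" remain.
const kept = withoutLanguageVariants(["en", "en_CA", "fr", "pt_BR"]);
console.log(kept);
```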
--subjects restricts the download to specific subjects. Pass the values as a comma-separated list. A sample of valid subjects:
physics, biology, earth-science, motion, sound-and-waves, work-energy-and-power, heat-and-thermodynamics, quantum-phenomena
Available only on GET step:
--withoutLanguageVariants ...
Available on GET and EXPORT steps only:
--includeLanguages 'lang_1,lang_2,lang_3' ...
--excludeLanguages 'lang_1,lang_2,lang_3' ...
--subjects 'math,physics' ...
Available on EXPORT step only:
# Skip ZIM files for individual languages
--mulOnly
# Create a ZIM file with all languages
--createMul
Example:
phet2zim --includeLanguages en,ru,fr
Config
Another way to configure behaviour is through environment variables. Sample .env file (with default values):
# requests per second, affects GET step only
PHET_RPS=8
# async workers on TRANSFORM step (keep it equal to number of CPU cores)
PHET_WORKERS=10
# number of retries on GET step (delay grows with exponential backoff)
PHET_RETRIES=5
# display verbose errors
PHET_VERBOSE_ERRORS=false
About
This project achieves multiple things:
- Download PhET content
- Generate an Index for said content
- Generate ZIM file(s) containing content and index
Things this project does not yet do, but should:
- Generate Android APK
Usage
The functionality is split into 5 npm scripts:
- npm run setup - deletes state from previous runs
- npm run get - downloads PhET simulations in specified languages
- npm run transform - prepares the content and media files
- npm run export - generates ZIM file(s)
- npm start - runs all of the above in sequence
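The get step is the one governed by PHET_RPS and PHET_RETRIES in the Config section. The sketch below shows, under stated assumptions, what those two settings could translate to: the spacing between requests implied by a requests-per-second limit, and a doubling delay schedule for retries. The 1000 ms base delay and the function names are hypothetical; the README only says the delay grows with exponential backoff.

```typescript
// Defaults mirror the sample .env file shown under Config.
const PHET_RPS = 8;
const PHET_RETRIES = 5;
// Assumption: the base delay is illustrative, not taken from the phet2zim source.
const BASE_DELAY_MS = 1000;

// Minimum spacing between requests needed to stay at or under `rps` requests per second.
function requestIntervalMs(rps: number): number {
  return 1000 / rps;
}

// Delay before each retry, doubling every time (exponential backoff).
function backoffSchedule(retries: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(baseDelayMs * Math.pow(2, attempt));
  }
  return delays;
}

console.log(requestIntervalMs(PHET_RPS)); // 125 (milliseconds between requests)
console.log(backoffSchedule(PHET_RETRIES, BASE_DELAY_MS)); // 1000, 2000, 4000, 8000, 16000
```

With these illustrative numbers, five retries would wait 31 seconds in total before giving up, which is why raising PHET_RETRIES lengthens worst-case run time quickly.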
The steps get, transform and export have their own output directories:
- get outputs HTML and PNG files to state/get
- transform outputs intermediate files to state/transform
- export outputs HTML and PNG files to state/export and ZIM file(s) to output/ (by default, unless customized with --output)
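The directory layout above can be summarised as a small lookup. This is only a sketch of the layout as described here: the type, constant, and function names are hypothetical, and the second parameter stands in for whatever folder is passed with --output.

```typescript
type Step = "get" | "transform" | "export";

// Working directories as listed above.
const STATE_DIRS: Record<Step, string> = {
  get: "state/get",
  transform: "state/transform",
  export: "state/export",
};

// Directories a step writes to. Only export also writes ZIM file(s),
// to output/ unless a custom folder is given (as with --output).
function outputDirsFor(step: Step, zimOutputDir: string = "output"): string[] {
  const dirs = [STATE_DIRS[step]];
  if (step === "export") {
    dirs.push(zimOutputDir);
  }
  return dirs;
}

console.log(outputDirsFor("get"));
console.log(outputDirsFor("export", "myFolder"));
```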

