@tricoteuses/senat
v2.20.10
Handle French Sénat's open data
Tricoteuses-Senat
Retrieve, clean up & handle French Sénat's open data
Requirements
- Node >= 22
Installation
git clone https://git.tricoteuses.fr/logiciels/tricoteuses-senat
cd tricoteuses-senat/
Create a .env file to set the PostgreSQL database information and other configuration variables (you can use example.env as a template). Then:
npm install
Database creation (not needed if downloading with the Docker image)
Using Docker
docker run --name local-postgres -d -p 5432:5432 -e POSTGRES_PASSWORD=$YOUR_CUSTOM_DB_PASSWORD postgres
Download data
Create a folder where the data will be downloaded and run the following command to download the data and convert it into JSON files.
mkdir ../senat-data/
# Available values for the optional `categories` parameter: All, Ameli, Debats, DosLeg, Questions, Sens
npm run data:download ../senat-data -- [--categories All]
Data from other sources is also available:
# Retrieval of textes and rapports from Sénat's website
# Available values for the optional `formats` parameter: xml, html, pdf
# Available values for the optional `types` parameter: textes, rapports
npm run data:retrieve_documents ../senat-data -- --fromSession 2022 [--formats xml pdf] [--types textes]
# Retrieval & parsing (textes in xml format only for now)
npm run data:retrieve_documents ../senat-data -- --fromSession 2022 --parseDocuments
# Parsing only
npm run data:parse_textes_lois ../senat-data
# Retrieval (& parsing) of agenda from Sénat's website
npm run data:retrieve_agenda ../senat-data -- --fromSession 2022 [--parseAgenda]
# Retrieval (& parsing) of comptes-rendus de séance from Sénat's data
npm run data:retrieve_cr_seance ../senat-data -- [--parseDebats] [--keepDir]
# Retrieval (& parsing) of comptes-rendus de commissions from Sénat's website
npm run data:retrieve_cr_commission ../senat-data -- [--parseDebats] [--keepDir]
# Retrieval of sénateurs' pictures from Sénat's website
npm run data:retrieve_senateurs_photos ../senat-data
Data download using Docker
A Docker image that downloads and converts the data all at once is available. Build it locally or run it from the container registry.
Use the environment variables FROM_SESSION and CATEGORIES if needed.
docker run --pull always --name tricoteuses-senat -v ../senat-data:/app/senat-data -d git.tricoteuses.fr/logiciels/tricoteuses-senat:latest
Using the data
Once the data is downloaded, you can use loaders to retrieve it. To use loaders in your project, you can install the @tricoteuses/senat package, and import the iterator functions that you need.
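Because each loader is consumed with a plain `for...of` loop and yields objects that wrap a record in an `item` property, ordinary iterator helpers can be layered on top. The following is a minimal sketch of that idea, not part of the package: `Loaded`, `QuestionLike`, `fakeLoader` and `takeItems` are hypothetical names, and `fakeLoader` stands in for a real loader so the snippet runs without any downloaded data.

```typescript
// Assumed shape of what a loader yields: the record sits under `item`.
interface Loaded<T> {
  item: T
}

// Hypothetical record type: only the `id` field is known from the README usage.
interface QuestionLike {
  id: string | number
}

// Stand-in for a real loader such as iterLoadSenatQuestions (assumption: a
// synchronous iterable, since the README consumes it with `for...of`).
function* fakeLoader(): Generator<Loaded<QuestionLike>> {
  yield { item: { id: "Q1" } }
  yield { item: { id: "Q2" } }
  yield { item: { id: "Q3" } }
}

// Collect at most `limit` records, stopping the iteration early.
function takeItems<T>(source: Iterable<Loaded<T>>, limit: number): T[] {
  const out: T[] = []
  if (limit <= 0) return out
  for (const { item } of source) {
    out.push(item)
    if (out.length >= limit) break
  }
  return out
}

const firstTwo = takeItems(fakeLoader(), 2)
console.log(firstTwo.map((question) => question.id)) // [ 'Q1', 'Q2' ]
```

Stopping early like this is handy for a quick look at a large dataset, since the remaining files are never read. With the real package, installation and import look like this: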
npm install @tricoteuses/senat

import { iterLoadSenatQuestions } from "@tricoteuses/senat/loaders"
// Pass data directory and legislature as arguments
for (const { item: question } of iterLoadSenatQuestions("../senat-data", 17)) {
  console.log(question.id)
}
Generation of raw types from SQL schema (for contributors only)
npm run data:generate_schemas ../senat-data
Publishing
To publish a new version of this package onto npm, bump the package version and publish.
# Increment version and create a new Git tag automatically
npm version patch # +0.0.1 → small fixes
npm version minor # +0.1.0 → new features
npm version major # +1.0.0 → breaking changes
npx tsc
npm publish
The Docker image is built automatically by a CI workflow when you push the tag to the remote repository.
git push --tags
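The three increments listed in the comments above follow semantic versioning: a bump resets every lower-order component to zero. As an illustration only, here is that arithmetic in a small sketch. `bumpVersion` is a hypothetical helper, not npm's implementation, and it assumes plain MAJOR.MINOR.PATCH versions without prerelease or build suffixes.

```typescript
type Increment = "patch" | "minor" | "major"

// Compute the next version for a plain MAJOR.MINOR.PATCH string.
function bumpVersion(version: string, increment: Increment): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version)
  if (!match) {
    throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`)
  }
  const major = Number(match[1])
  const minor = Number(match[2])
  const patch = Number(match[3])
  switch (increment) {
    case "major":
      return `${major + 1}.0.0` // breaking changes: minor and patch reset
    case "minor":
      return `${major}.${minor + 1}.0` // new features: patch resets
    case "patch":
      return `${major}.${minor}.${patch + 1}` // small fixes
  }
}

console.log(bumpVersion("2.20.10", "patch")) // 2.20.11
console.log(bumpVersion("2.20.10", "minor")) // 2.21.0
console.log(bumpVersion("2.20.10", "major")) // 3.0.0
```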