@tricoteuses/assemblee

v3.2.16

Published

10 days ago

Retrieve, clean up & handle French Assemblée nationale's open data

Downloads

1,856


Assemblée nationale France open data Parliament

Tricoteuses-Assemblee

Tricoteuses-Assemblee is free and open source software.

Documentation

Architecture
Browser Usage - Using this package in browser/Vite projects

Installation

git clone https://git.tricoteuses.fr/logiciels/tricoteuses-assemblee
cd tricoteuses-assemblee/

npm install

Download and clean data

Basic usage

Create a directory to store the data, then run the following command to download, reorganize and clean the data.

mkdir ../assemblee-data/
npm run data:download ../assemblee-data

Available Commands

npm run data:download <dir>: Download, reorganize, and clean data
npm run data:retrieve_open_data <dir>: Download raw data files.
npm run data:reorganize_data <dir>: Reorganize raw files by entity.
npm run data:clean_data <dir>: Clean and validate reorganized files.
npm run data:retrieve_deputes_photos <dir>: Retrieve députés' photos from the Assemblée nationale's website.
npm run data:retrieve_senateurs_photos <dir>: Retrieve sénateurs' photos from the Assemblée nationale's website.
npm run data:retrieve_documents <dir>: Retrieve legislative documents from the Assemblée nationale's website.
npm run data:retrieve_pending_amendements <dir>: Retrieve pending amendments (those still waiting to be processed by the Assemblée's services) from the Assemblée nationale's website.

Notes:

Reorganized files (generated by the data:reorganize_data command) are also available in Tricoteuses / Data / Données brutes de l'Assemblée. They are updated on a regular basis.
Split & cleaned files (generated by the data:clean_data command) are also available in Tricoteuses / Data / Données nettoyées de l'Assemblée with the _nettoye suffix. They are updated on a regular basis.

Filtering Options

Downloading and cleaning all the data takes a long time and a lot of disk space. To reduce the load, you can restrict the run to the types of data you need.

Examples:

# Only download amendments
npm run data:download ../assemblee-data -- -k Amendements

# Only process 16th and 17th legislatures
npm run data:download ../assemblee-data -- -l 16 -l 17

# Retrieve comptes rendus de séance and de commission for one legislature
npm run data:retrieve_open_data ../assemblee-data -- --categories ComptesRendus --legislature 17 --fetchCrCommissions

Common Options

--categories or -k <name>: Filter by dataset category (available values: ActeursEtOrganes, Agendas, Amendements, DossiersLegislatifs, Photos, Scrutins, Questions, ComptesRendus)
--legislature or -l <number>: Specify one or more legislatures to process (e.g., -l 15 -l 16)
--dataDir <path> (mandatory): Path to the working directory where all data is stored
--silent or -s: Disable logging
--verbose or -v: Enable verbose logging
--fetch or -f: Force re-download of data even if already present
--commit or -c: Automatically commit cleaned data
--pull or -p: Pull repositories before starting
--clone or -C <url>: Clone Git repositories from a remote group or organization
--remote or -r <name>: Push commits to specified Git remote(s)
--keepDir: Keep Dir (Implement before cleaning data)
--only-recent <number>: If files are already present, skip files older than the specified number of days and skip old legislatures (e.g., --only-recent 30)

If you use such options, use them in all subsequent commands too (data:reorganize_data and data:clean_data).
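One way to keep the filters identical across steps is to build the flag list once and reuse it for every command. The sketch below is only an illustration: `Filters`, `filterFlags` and `stepCommands` are hypothetical names, not part of the package, and it assumes that the three individual steps run in the order retrieve, reorganize, clean. It prints the command lines rather than executing them.

```typescript
// Hypothetical helper, not part of @tricoteuses/assemblee.
interface Filters {
  category?: string // value for -k, e.g. "Amendements"
  legislatures?: number[] // values for repeated -l flags, e.g. [16, 17]
}

// Turn the filters into the -k / -l flags described above.
function filterFlags(filters: Filters): string[] {
  const flags: string[] = []
  if (filters.category !== undefined) flags.push("-k", filters.category)
  for (const legislature of filters.legislatures ?? []) {
    flags.push("-l", String(legislature))
  }
  return flags
}

// Assumed order of the individual steps.
const STEPS = ["data:retrieve_open_data", "data:reorganize_data", "data:clean_data"]

// Build one command line per step, all sharing the same filter flags.
function stepCommands(dataDir: string, filters: Filters): string[] {
  const flags = filterFlags(filters)
  return STEPS.map((step) => {
    const parts = ["npm", "run", step, dataDir]
    // npm forwards everything after "--" to the script.
    if (flags.length > 0) parts.push("--", ...flags)
    return parts.join(" ")
  })
}

for (const command of stepCommands("../assemblee-data", { legislatures: [16, 17] })) {
  console.log(command)
}
```

Because the flags are computed in a single place, a later change of category or legislature applies to every step at once.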

Options for Data Retrieval

With data:retrieve_open_data, use --categories=ComptesRendus to retrieve comptes rendus de séance for the selected legislature(s).

Note:

Comptes rendus and videos of commissions are retrieved when cleaning Agendas data.

Options for Cleaning Data

--dataset or -d <name>: Clean a specific dataset only
--fetchCrCommissions: Retrieve and parse CR commissions
--fetchVideos: Retrieve videos
--fetchDocuments: Retrieve documents
--no-reset-after-commit: Skip Git reset after committing (useful to preserve local changes)
--no-validate or -V: Skip schema validation during cleaning
--parseDocuments: Parse documents into cleaned JSON

Note:

Use --categories=Agendas together with --fetchCrCommissions to retrieve comptes rendus de commission.
Use --categories=Agendas together with --fetchVideos to retrieve videos of commissions.

Options for Retrieving Documents

--full or -f: Retrieve all documents, even those already downloaded
--document-type or -T <type>: Restrict to specific document types (e.g., PION)

Download using Docker

A Docker image that downloads and cleans the data all at once is available. Build it locally or run it from the container registry. Use the environment variables LEGISLATURE and CATEGORIES if needed.

docker run --pull always --name tricoteuses-assemblee -v ../assemblee-data:/app/assemblee-data -e LEGISLATURE=17 -d git.tricoteuses.fr/logiciels/tricoteuses-assemblee:latest

Using the data

Once the data is downloaded and cleaned, you can read it with loaders. To use them in your project, install the @tricoteuses/assemblee package and import the iterator functions you need.

npm install @tricoteuses/assemblee

import {
  iterLoadAssembleeActeurs,
  iterLoadAssembleeOrganes,
  iterLoadAssembleeReunions,
  iterLoadAssembleeScrutins,
  iterLoadAssembleeDocuments,
  iterLoadAssembleeDossiersParlementaires,
  iterLoadAssembleeAmendements,
  iterLoadAssembleeQuestions,
  iterLoadAssembleeComptesRendus,
} from "@tricoteuses/assemblee/loaders"

// Pass data directory and legislature as arguments
for (const { acteur } of iterLoadAssembleeActeurs("../assemblee-data", 17)) {
  console.log(acteur.uid)
}
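Since the loaders are consumed with for...of, they can be combined with ordinary iterable helpers. The sketch below shows a hypothetical `take` helper (not provided by the package) that stops after a few items, which can be handy for inspecting a large dataset without reading all of it. To stay self-contained it runs on a stand-in generator with placeholder uids; in a real project the stand-in would be replaced by a call such as iterLoadAssembleeActeurs("../assemblee-data", 17).

```typescript
// Hypothetical helper: lazily yield at most `limit` items from any iterable.
function* take<T>(items: Iterable<T>, limit: number): Generator<T> {
  if (limit <= 0) return
  let count = 0
  for (const item of items) {
    yield item
    count += 1
    // Returning here also closes the underlying iterator.
    if (count >= limit) return
  }
}

// Stand-in for a loader; the uids are made-up placeholders, not real data.
function* fakeActeurs(): Generator<{ acteur: { uid: string } }> {
  for (let i = 1; ; i++) {
    yield { acteur: { uid: `PLACEHOLDER${i}` } }
  }
}

// Collect the uids of the first three items only.
const sample: string[] = []
for (const { acteur } of take(fakeActeurs(), 3)) {
  sample.push(acteur.uid)
}
console.log(sample)
```

The helper never asks the source for more items than requested, so it works even on the endless stand-in generator above.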
