
@tricoteuses/senat

v3.1.8

Handle French Sénat's open data

Downloads

2,688

Readme

Tricoteuses-Senat

Retrieve, clean up & handle French Sénat's open data

Requirements

  • Node >= 22

Installation

git clone https://git.tricoteuses.fr/logiciels/tricoteuses-senat
cd tricoteuses-senat/

Create a .env file to set the PostgreSQL database connection information and other configuration variables (you can use example.env as a template), then install the dependencies:
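For reference, a minimal .env could look like the following sketch. The variable names match the PostgreSQL settings documented further down in this README; the values are placeholders, and example.env has the full list:

```shell
# .env: PostgreSQL connection settings (placeholder values)
DB_HOST="localhost"
DB_PORT=5432
DB_USER="postgres"
DB_PASSWORD="PASSWORD"
DB_NAME="canutes"
```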

npm install

Database creation (not needed if you download the data with the Docker image)

Using Docker

docker run --name local-postgres -d -p 5432:5432 -e POSTGRES_PASSWORD=$YOUR_CUSTOM_DB_PASSWORD postgres

Download data

Basic usage

Create a folder where the data will be downloaded, then run the following commands to download the data and convert it into JSON files.

mkdir ../senat-data/

npm run data:download ../senat-data

Available Commands

  • npm run data:download <dir>: Download data and convert it to JSON
  • npm run data:retrieve_documents <dir>: Retrieve textes and rapports from the Sénat's website
  • npm run data:retrieve_agenda <dir>: Retrieve the agenda from the Sénat's website
  • npm run data:retrieve_cr_seance <dir>: Retrieve comptes-rendus de séance from the Sénat's data
  • npm run data:retrieve_cr_commission <dir>: Retrieve comptes-rendus de commissions from the Sénat's website
  • npm run data:retrieve_senateurs_photos <dir>: Retrieve sénateurs' pictures from the Sénat's website

Filtering Options

Downloading all the data takes a long time and a lot of disk space. You can restrict the type of data to retrieve to reduce the load.

Examples:

# Only download amendments
npm run data:download ../senat-data -- -k Ameli

# Only process data from session 2023 onwards
npm run data:download ../senat-data -- --fromSession 2023

Common Options

  • --categories or -k <name>: Filter by dataset category (available options: All, Ameli, Debats, DosLeg, Questions, Sens)
  • --fromSession <year>: Session year to retrieve data from (default: 2022)
  • --dataDir <path>: Path to the working directory where all data is stored (required)
  • --silent or -s: Disable logging
  • --verbose or -v: Enable verbose logging
  • --commit or -c: Automatically commit converted data
  • --pull or -p: Pull repositories before starting
  • --clone or -C <url>: Clone Git repositories from a remote group or organization
  • --remote or -r <name>: Push commits to specified Git remote(s)
  • --keepDir: Keep directories when cleaning data
  • --only-recent <days>: Retrieve only documents created within the last N days

Options for Retrieving Documents

  • --formats <format>: Specify document formats to retrieve (options: xml, html, pdf)
  • --types <type>: Specify document types to retrieve (options: textes, rapports)
  • --parseDocuments: Parse documents after retrieval
  • --parseAgenda: Parse agenda after retrieval
  • --parseDebats: Parse comptes-rendus after retrieval

Examples

# Retrieval of textes and rapports in specific formats
npm run data:retrieve_documents ../senat-data -- --fromSession 2022 --formats xml pdf --types textes

# Retrieval & parsing (textes in xml format only for now)
npm run data:retrieve_documents ../senat-data -- --fromSession 2022 --parseDocuments

# Retrieval & parsing of agenda
npm run data:retrieve_agenda ../senat-data -- --fromSession 2022 --parseAgenda

# Retrieval & parsing of comptes-rendus de séance
npm run data:retrieve_cr_seance ../senat-data -- --parseDebats --keepDir

# Retrieval & parsing of comptes-rendus de commissions
npm run data:retrieve_cr_commission ../senat-data -- --parseDebats --keepDir

Data download using Docker

A Docker image that downloads and converts the data in one step is available. Build it locally or run it from the container registry. Set the environment variables FROM_SESSION and CATEGORIES if needed. Note that Docker requires an absolute host path for the -v bind mount:

docker run --pull always --name tricoteuses-senat -v "$(pwd)/../senat-data:/app/senat-data" -d git.tricoteuses.fr/logiciels/tricoteuses-senat:latest

Using the data

Once the data is downloaded, you can use loaders to retrieve it. To use loaders in your project, you can install the @tricoteuses/senat package, and import the iterator functions that you need.

npm install @tricoteuses/senat

import { iterLoadSenatQuestions } from "@tricoteuses/senat/loaders"

// Pass the data directory and legislature number as arguments
for (const { item: question } of iterLoadSenatQuestions("../senat-data", 17)) {
  console.log(question.id)
}

Generation of raw types from SQL schema (for contributors only)

npm run data:generate_schemas ../senat-data

PostgreSQL import modes

retrieve_open_data.ts supports two import modes:

  • default mode: direct import in a single database
  • --incremental: separate staging database plus postgres_fdw bridge into the target database

The default mode is the simplest and is the one the Docker image uses. The incremental mode is intended for replicated environments where heavy transient work should stay outside the target database.

Incremental PostgreSQL import architecture

When --incremental is enabled, the open-data SQL import is designed to keep heavy transient work out of the replicated target database.

                       +-----------------------------------+
                       |  Source Open Data Senat           |
                       |  ZIP -> SQL dumps                 |
                       +----------------+------------------+
                                        |
                                        v
                       +-----------------------------------+
                       |  PostgreSQL staging instance      |
                       |  non-replicated                   |
                       |  database: senat_staging          |
                       |                                   |
                       |  schemas:                         |
                       |  - ameli_staging                  |
                       |  - debats_staging                 |
                       |  - dosleg_staging                 |
                       |  - questions_staging              |
                       |  - sens_staging                   |
                       |                                   |
                       |  heavy operations:                |
                       |  - raw dump import                |
                       |  - table renaming/prefixing       |
                       |  - staging indexes                |
                       +----------------+------------------+
                                        |
                             postgres_fdw (read-only bridge)
                                        |
                                        v
+----------------------------------------------------------------------------+
| PostgreSQL target instance, replicated                                     |
| database: canutes                                                           |
|                                                                            |
| preserved schemas: assemblee, legifrance, ...                              |
| updated schema: senat                                                      |
|                                                                            |
| lightweight temporary FDW schemas:                                         |
| - ameli_staging                                                            |
| - debats_staging                                                           |
| - dosleg_staging                                                           |
| - questions_staging                                                        |
| - sens_staging                                                             |
|                                                                            |
| final operations only:                                                     |
| - read staging data through postgres_fdw                                   |
| - incremental merge into canutes.senat                                     |
| - optional schema alignment when source structure changes                  |
+----------------------------------------------------------------------------+

At the end of an incremental import:

  • the target database is canutes
  • the target schema is senat
  • other schemas in canutes are left untouched
  • bulk transient data stays in the staging database
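As an illustration, the final merge into the target schema can be sketched in SQL as follows. Table and column names here are hypothetical (the real tables come from the Sénat dumps); the point is that the target database only reads staging rows through a postgres_fdw foreign table and upserts them locally:

```sql
-- Hypothetical sketch: questions_staging.questions is a foreign table
-- served through postgres_fdw; senat.questions is the real table in
-- the replicated canutes database.
INSERT INTO senat.questions (id, session, texte)
SELECT id, session, texte
FROM questions_staging.questions
ON CONFLICT (id) DO UPDATE
  SET session = EXCLUDED.session,
      texte   = EXCLUDED.texte;
```

Only the rows that actually change generate replicated writes; the bulk dump import never touches the replicated instance.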

Direct single-database mode

Without --incremental, the script works directly in the configured target database:

  • dumps are imported into temporary *_staging schemas in that same database
  • the final tables are merged into schema senat
  • the temporary staging schemas are dropped at the end

This mode is appropriate for:

  • local development
  • disposable databases
  • the Docker image workflow
  • environments where replication cost is not a concern

Preparing postgres_fdw and database rights

This section only applies when using --incremental.

The target database must be able to connect to the separate staging database with postgres_fdw.

Environment variables

Set the target connection in .env. Add the staging connection only if you use --incremental:

# Target database
DB_HOST="localhost"
DB_PORT=5432
DB_USER="postgres"
DB_PASSWORD="PASSWORD"
DB_NAME="canutes"

# Separate non-replicated staging database or instance, only for --incremental
STAGING_DB_HOST="localhost"
STAGING_DB_PORT=5433
STAGING_DB_USER="postgres"
STAGING_DB_PASSWORD="PASSWORD"
STAGING_DB_NAME="senat_staging"

Typical commands

Default direct mode:

npm run data:retrieve_open_data -- ../senat-data --all

Incremental mode with separate staging database:

npm run data:retrieve_open_data -- ../senat-data --all --incremental

Target database prerequisites

On canutes, the import role must be able to:

  • connect to the database
  • create the postgres_fdw extension, or reuse it if already installed
  • create and drop FDW servers and user mappings
  • create and drop temporary *_staging schemas used for foreign tables
  • create, alter and drop objects inside schema senat

Typical one-time setup on the target database:

CREATE EXTENSION IF NOT EXISTS postgres_fdw;
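Beyond the extension, the FDW objects the import relies on can be sketched as follows. The server name and connection options here are illustrative and simply mirror the STAGING_DB_* variables from .env; the script creates and drops these objects itself:

```sql
-- Illustrative names; connection options mirror the STAGING_DB_* settings
CREATE SERVER IF NOT EXISTS staging_srv
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'localhost', port '5433', dbname 'senat_staging');

CREATE USER MAPPING IF NOT EXISTS FOR CURRENT_USER
  SERVER staging_srv
  OPTIONS (user 'postgres', password 'PASSWORD');

-- Expose a staging schema as read-only foreign tables
CREATE SCHEMA IF NOT EXISTS questions_staging;
IMPORT FOREIGN SCHEMA questions_staging
  FROM SERVER staging_srv INTO questions_staging;
```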

Staging database prerequisites

The staging PostgreSQL instance should ideally be outside the replicated cluster.

The staging role must be able to:

  • recreate the senat_staging database
  • create schemas and import dumps there
  • create the technical staging indexes

The target instance must also be allowed to connect to the staging instance over the network. In practice:

  • open the staging host and port from the target server
  • allow the staging user in pg_hba.conf
  • ensure the staging credentials used in .env are valid
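For example, a pg_hba.conf entry on the staging instance could look like this (the target server's address and the authentication method are site-specific):

```
# pg_hba.conf on the staging instance (example values)
# TYPE  DATABASE        USER      ADDRESS         METHOD
host    senat_staging   postgres  192.0.2.10/32   scram-sha-256
```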

Operational note

The replicated target still sees a small amount of temporary DDL for FDW objects, but the large dump imports and staging tables remain outside the replicated database. Most replicated volume should therefore come from the real incremental changes applied to canutes.senat.

Validation of prefixed SQL imports

After importing datasets with prefixed tables in schema senat, you can verify that the expected renamed tables exist and still match the generated definitions:

npm run data:validate_prefixed_tables -- --categories All

Publishing

To publish a new version of this package to npm, bump the package version, build, and publish:

# Increment version and create a new Git tag automatically
npm version patch   # +0.0.1 → small fixes
npm version minor   # +0.1.0 → new features
npm version major   # +1.0.0 → breaking changes
npx tsc
npm publish

The Docker image is built automatically by a CI workflow when you push the tag to the remote repository:

git push --tags