
prometheus-sql-adapter

v0.3.2 · Published

prometheus remote write adapter for postgresql

Downloads: 3

Prometheus SQL Adapter

Adapter to connect Prometheus' remote write endpoint to a PostgreSQL server, preferably running TimescaleDB. It caches the labels for each timeseries to reduce writes, linking them to samples by metric fingerprint.

This adapter was inspired by the Timescale PostgreSQL adapter and maintains a compatible schema, so queries can be used with either. Because this adapter does not require the pg_prometheus extension, it is compatible with Aurora PostgreSQL, Azure Database for PostgreSQL, and other managed PostgreSQL services.

While it is possible to use this adapter and most of the schema without TimescaleDB, pruning older data becomes difficult, compression is unavailable, and queries will be slower. If you can use TimescaleDB, please do.

Features

  • query compatible with Timescale's official pg_prometheus schema
  • hashed & cached label IDs
  • samples in compressed hypertable
  • uses Go's SQL package
  • uses bulk copy for samples
  • included Grafana dashboards: Kubernetes cluster, schema metadata
  • plenty of queries: alerts, schema metadata
  • does not require pg_prometheus extension
  • does not require superuser or extension privileges
  • does not use printf to build SQL queries

Status

Badges: pipeline status, Renovate, MIT license, open/closed bug and issue counts.

Getting Started

  • run TimescaleDB somewhere, like a Kubernetes pod or Timescale Cloud
  • set the PG* environment variables for your connection info (PGHOST, PGPORT, PGUSER, PGPASSWORD, PGDATABASE)
  • create a database
  • run ./scripts/schema-create.sh [license-level] [retain-live] [retain-total]
  • create a role for each set of adapters to write
  • run ./scripts/schema-grant.sh [role-name] adapter
  • create a role for each Grafana instance to read
  • run ./scripts/schema-grant.sh [role-name] grafana
  • create a role for each human user to read
  • run ./scripts/schema-grant.sh [role-name] human

The schema scripts are idempotent and safe to run repeatedly, including schema-create.sh.

Non-breaking upgrades can be performed by running the schema scripts again, in the same order.

Schema

This adapter uses a schema that is compatible with the Timescale pg_prometheus adapter, but does not require the pg_prometheus extension or SUPERUSER/extension privileges.

Tables

The metric labels and samples are separated into two data tables and a joining view, linked by a label ID (lid). The resulting schema can be described as:

\d+ metric_labels

                                         Table "public.metric_labels"
 Column |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 lid    | uuid                        |           | not null |         | plain    |              |
 time   | timestamp without time zone |           | not null |         | plain    |              |
 labels | jsonb                       |           | not null |         | extended |              |
Indexes:
    "metric_labels_lid" UNIQUE, btree (lid)
    "metric_labels_labels" gin (labels)

\d+ metric_samples

                                         Table "public.metric_samples"
 Column |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+--------------+-------------
 time   | timestamp without time zone |           | not null |         | plain    |              |
 name   | text                        |           | not null |         | extended |              |
 lid    | uuid                        |           | not null |         | plain    |              |
 value  | double precision            |           | not null |         | plain    |              |
Indexes:
    "metric_samples_name_lid_time" btree (name, lid, "time" DESC)
    "metric_samples_time_idx" btree ("time" DESC)

\d+ metrics
                                     View "public.metrics"
 Column |            Type             | Collation | Nullable | Default | Storage  | Description 
--------+-----------------------------+-----------+----------+---------+----------+-------------
 time   | timestamp without time zone |           |          |         | plain    | 
 name   | text                        |           |          |         | extended | 
 lid    | uuid                        |           |          |         | plain    | 
 value  | double precision            |           |          |         | plain    | 
 labels | jsonb                       |           |          |         | extended | 
View definition:
 SELECT s."time",
    l.labels ->> '__name__'::text AS name,
    s.lid,
    s.value,
    l.labels
   FROM metric_labels l
     JOIN metric_samples s ON s.lid = l.lid
  WHERE s."time" > (now() - '06:00:00'::interval);

The metrics view makes this compatible with the original pg_prometheus schema and the v0.1 schema (which featured a single metrics table with both value and labels).

The metrics view can also enforce a maximum time range (the definition above keeps the most recent six hours) to limit the amount of raw data that can be fetched at once. Deduplication and aggregation typically need query context to determine the correct operators, so they must happen later, in the queries themselves.
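
The two-table-plus-view layout can be sketched with a small self-contained example. This is a toy approximation using SQLite as a stand-in for PostgreSQL/TimescaleDB; the real view also derives the name from the labels JSONB and applies the time-range predicate shown above:

```python
import json
import sqlite3

# In-memory stand-in for the real schema: labels and samples live in
# separate tables, joined by lid through a view.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric_labels (lid TEXT PRIMARY KEY, labels TEXT NOT NULL);
CREATE TABLE metric_samples (time TEXT, name TEXT, lid TEXT, value REAL);
CREATE VIEW metrics AS
  SELECT s.time, s.name, s.lid, s.value, l.labels
    FROM metric_labels l JOIN metric_samples s ON s.lid = l.lid;
""")

# One label set, written once; many cheap samples reference it by lid.
conn.execute("INSERT INTO metric_labels VALUES (?, ?)",
             ("lid-1", json.dumps({"__name__": "up", "job": "node"})))
conn.execute("INSERT INTO metric_samples VALUES (?, ?, ?, ?)",
             ("2024-01-01T00:00:00", "up", "lid-1", 1.0))

# Queries go through the view and see samples and labels together.
rows = conn.execute("SELECT name, value, labels FROM metrics").fetchall()
```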

Samples are linked to their labels using the metric's hashed fingerprint, or label ID (lid). The fingerprint is provided by the Prometheus SDK and uses the 64-bit FNV-1a hash, which is stored in a UUID column. Each adapter maintains an LRU cache of recently written label sets, keyed by lid, and avoids re-INSERTing previously seen label sets.

Using the metric's fingerprint provides a short, deterministic identifier for each label set, or timeseries. The adapters do not need to coordinate and can safely write in parallel, using an ON CONFLICT clause to skip or update existing label sets. While a numeric counter might be shorter than the current hash-as-UUID, it would require coordination between the adapters or within the database.
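
The fingerprint-and-cache scheme can be sketched in Python. This is illustrative only: the adapter uses the Prometheus Go SDK's fingerprint, whose exact input byte layout differs, so the lids below are not byte-for-byte identical to the real ones:

```python
from collections import OrderedDict
import uuid

FNV64_OFFSET = 0xcbf29ce484222325
FNV64_PRIME = 0x100000001b3

def fnv1a_64(data: bytes) -> int:
    """Plain 64-bit FNV-1a hash."""
    h = FNV64_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h

def label_lid(labels: dict) -> uuid.UUID:
    """Hash a label set into a deterministic label ID (lid).

    Sorting by label name makes the fingerprint independent of insertion
    order; the 64-bit hash is zero-padded into a 128-bit UUID.
    """
    buf = b"".join(
        k.encode() + b"\xff" + v.encode() + b"\xff"
        for k, v in sorted(labels.items())
    )
    return uuid.UUID(int=fnv1a_64(buf))

class LabelCache:
    """LRU cache of recently written lids, used to skip re-INSERTs."""

    def __init__(self, size: int = 1024) -> None:
        self.size = size
        self.seen: "OrderedDict[uuid.UUID, None]" = OrderedDict()

    def should_write(self, lid: uuid.UUID) -> bool:
        if lid in self.seen:
            self.seen.move_to_end(lid)
            return False  # written recently; skip the INSERT
        self.seen[lid] = None
        if len(self.seen) > self.size:
            self.seen.popitem(last=False)  # evict least recently used
        return True

labels = {"__name__": "up", "job": "node", "instance": "host:9100"}
lid = label_lid(labels)
cache = LabelCache()
# First sighting triggers an INSERT; an immediate repeat is skipped.
first, second = cache.should_write(lid), cache.should_write(lid)
```

Note that the cache is only an optimization: in the database, concurrent writers rely on the ON CONFLICT clause, so a cache miss is never incorrect, only slightly slower.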

Each label set in metric_labels carries the metric's name under the __name__ key, so there is no name column in that table. To search by name, replace WHERE name = 'foo' clauses with the JSON field access labels->>'__name__' = 'foo', which will hit the metric_labels_name_lid index. While full table scans were possible in the test cluster, which had 34k labels weighing 95MB, missing that index may become costly for larger clusters. The metric_labels table holds one row per unique timeseries, so its total size grows with the number of distinct label sets being written.
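
The name-lookup rewrite can be demonstrated with a runnable sketch. SQLite's json_extract is used here as a stand-in for PostgreSQL's labels->>'__name__' operator, and the table is simplified to the two columns the query touches:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metric_labels (lid TEXT PRIMARY KEY, labels TEXT NOT NULL)")
conn.execute(
    "INSERT INTO metric_labels VALUES (?, ?)",
    ("lid-1", json.dumps({"__name__": "up", "job": "node"})),
)
conn.execute(
    "INSERT INTO metric_labels VALUES (?, ?)",
    ("lid-2", json.dumps({"__name__": "node_load1", "job": "node"})),
)

# There is no name column: the metric name lives under the __name__ key.
# In PostgreSQL this predicate would be: WHERE labels->>'__name__' = 'up'
rows = conn.execute(
    "SELECT lid FROM metric_labels"
    " WHERE json_extract(labels, '$.__name__') = 'up'"
).fetchall()
```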