
pg-dump-parser

Parses PostgreSQL dump files into an array of schema objects.

Motivation

The idea behind pg-dump-parser is to split the dump file into a series of files, where each file is a top-level schema object (e.g., a table or a view). The same file contains all of the schema objects associated with the top-level object (e.g., comments, indexes). This makes the database schema easier to use as a reference and easier to check into version control.

The desired end result is something like this (see recipes for a script that does this):

generated-schema
├── extensions
│   ├── citext.sql
│   └── vector.sql
├── functions
│   ├── public.add_two_numbers.sql
│   └── public.notify_foo_insert.sql
├── materialized-views
│   ├── public.project_total_earnings.sql
│   └── public.user_account_total_earnings.sql
├── tables
│   ├── public.accounting_platform_account.sql
│   └── public.workspace_workspace_group_history.sql
└── types
    ├── public.accounting_platform.sql
    └── public.workspace_type.sql

where each file contains the SQL for the schema object.
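For example, given the sample objects shown under Usage below, the recipe at the end of this README would produce a tables/public.bar.sql along these lines:

CREATE TABLE public.bar (
    id integer NOT NULL,
    uid text NOT NULL,
    foo_id integer
);

ALTER TABLE public.bar OWNER TO postgres;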

Usage

import { readFile } from 'node:fs/promises';
import { parsePgDump } from 'pg-dump-parser';

const dump = await readFile('dump.sql', 'utf8');

const schemaObjects = parsePgDump(dump);

for (const schemaObject of schemaObjects) {
  console.log(schemaObject);
}

[!NOTE] The expected input is a PostgreSQL dump file created with pg_dump --schema-only.
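For reference, such a dump can be produced like so (the connection string is a placeholder):

pg_dump --schema-only postgres://postgres@localhost:5432/my_database > dump.sql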

The output is an array of objects, each containing a schema object from the dump file together with its corresponding header, e.g.,

[
  {
    "header": {
        "Name": "bar",
        "Owner": "postgres",
        "Schema": "public",
        "Type": "TABLE"
    },
    "sql": "CREATE TABLE public.bar (\n    id integer NOT NULL,\n    uid text NOT NULL,\n    foo_id integer\n);"
  },
  {
    "header": {
        "Name": "bar",
        "Owner": "postgres",
        "Schema": "public",
        "Type": "TABLE"
    },
    "sql": "ALTER TABLE public.bar OWNER TO postgres;"
  },
  {
    "header": {
        "Name": "bar_id_seq",
        "Owner": "postgres",
        "Schema": "public",
        "Type": "SEQUENCE"
    },
    "sql": "ALTER TABLE public.bar ALTER COLUMN id ADD GENERATED ALWAYS AS IDENTITY (\n    SEQUENCE NAME public.bar_id_seq\n    START WITH 1\n    INCREMENT BY 1\n    NO MINVALUE\n    NO MAXVALUE\n    CACHE 1\n);"
  }
]
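Since every entry carries its header, downstream processing can key off the header fields. As a minimal sketch (using only the output shape shown above), this collects the SQL of every object whose header type is TABLE, which, as the example above shows, includes the associated ALTER TABLE statements:

const tableSql = schemaObjects
  .filter((schemaObject) => schemaObject.header.Type === 'TABLE')
  .map((schemaObject) => schemaObject.sql);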

Grouping schema objects

groupSchemaObjects is an opinionated utility that assigns an object to a scope, i.e., the top-level schema object that owns it.

import { readFile } from 'node:fs/promises';
import { groupSchemaObjects, parsePgDump } from 'pg-dump-parser';

const dump = await readFile('dump.sql', 'utf8');

const schemaObjects = parsePgDump(dump);

const schemaObjectScope = groupSchemaObjects(
  schemaObjects,
  {
    header: {
      Name: 'TABLE foo',
      Owner: 'postgres',
      Schema: 'public',
      Type: 'COMMENT',
    },
    sql: "COMMENT ON TABLE public.foo IS 'Table comment x';",
  }
);

schemaObjectScope now describes the schema object that owns the comment, e.g.,

{
  name: 'foo',
  schema: 'public',
  type: 'TABLE',
}
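One common use for the scope, sketched here as a hypothetical one-liner (the recipe at the end of this README does this more carefully, handling quoted identifiers), is deriving a per-object file name:

const fileName = `${schemaObjectScope.schema}.${schemaObjectScope.name}.sql`;
// => 'public.foo.sql'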

[!WARNING] The implementation behind groupSchemaObjects is super scrappy. It relies on a lot of pattern matching. Use at your own risk.

Sorting schema objects

The library provides utilities to sort schema objects for better readability and consistency:

import { readFile } from 'node:fs/promises';
import { parsePgDump, sortSchemaObjects, sortSchemaObjectsByScope, groupAndSortSchemaObjects } from 'pg-dump-parser';

const dump = await readFile('dump.sql', 'utf8');
const schemaObjects = parsePgDump(dump);

// Sort all schema objects by type, then by name
const sorted = sortSchemaObjects(schemaObjects);

// Sort objects while preserving their grouping by scope (table, view, etc.)
const sortedByScope = sortSchemaObjectsByScope(schemaObjects);

// Group objects by their scope and sort within each group
const grouped = groupAndSortSchemaObjects(schemaObjects);

The sorting applies the following rules:

  • Type ordering: Extensions → Schemas → Types → Functions/Procedures → Tables → Constraints → Indexes → Views → Triggers → Comments → etc.
  • Constraints: Sorted by type (PRIMARY KEY → UNIQUE → FOREIGN KEY → CHECK → others)
  • Indexes: Sorted alphabetically by name within the same table
  • Comments: Sorted by target type, then by target name
  • Columns: Preserved in their original order from the CREATE TABLE statement
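Taken together, these rules make the output stable across dumps. As a minimal sketch (assuming only the sql field shown earlier), the sorted objects can be reassembled into a single, consistently ordered dump that diffs cleanly in version control:

const normalizedDump = sortSchemaObjects(schemaObjects)
  .map((schemaObject) => schemaObject.sql)
  .join('\n\n');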

Recipes

I intentionally did not build a script like this into the library, because a lot of it (how you dump the schema, how you group the schema objects, etc.) is subjective. However, this is a version that we are using in production.

import fs from 'node:fs/promises';
import path from 'node:path';
import {
  parsePgDump,
  SchemaObjectScope,
  scopeSchemaObject,
  sortSchemaObjectsByScope,
} from 'pg-dump-parser';
import { default as yargs } from 'yargs';
import { $ } from 'zx';

const formatFileName = (schemaObjectScope: SchemaObjectScope) => {
  const name = schemaObjectScope.name.startsWith('"')
    ? schemaObjectScope.name.slice(1, -1)
    : schemaObjectScope.name;

  if (schemaObjectScope.schema) {
    return `${schemaObjectScope.schema}.${name}.sql`;
  }

  return `${name}.sql`;
};

const argv = await yargs(process.argv.slice(2))
  .options({
    'output-path': {
      demand: true,
      type: 'string',
    },
    'postgres-dsn': {
      demand: true,
      type: 'string',
    },
  })
  .strict()
  .parse();

const dump = await $`pg_dump --schema-only ${argv['postgres-dsn']}`;

const schemaObjects = parsePgDump(dump.stdout);

// Sort the schema objects for consistent output
const sortedSchemaObjects = sortSchemaObjectsByScope(schemaObjects);

// Remove any previous output so deleted schema objects do not linger.
// fs.rm with force does not throw if the path does not exist.
await fs.rm(argv['output-path'], {
  force: true,
  recursive: true,
});

await fs.mkdir(argv['output-path']);

const files: Record<string, string[]> = {};

for (const schemaObject of sortedSchemaObjects) {
  const schemaObjectScope = scopeSchemaObject(sortedSchemaObjects, schemaObject);

  if (!schemaObjectScope) {
    continue;
  }

  const file = path.join(
    argv['output-path'],
    // MATERIALIZED VIEW => materialized-views
    schemaObjectScope.type.toLowerCase().replace(' ', '-') + 's',
    formatFileName(schemaObjectScope),
  );

  files[file] ??= [];
  files[file].push(schemaObject.sql);
}

for (const [filePath, content] of Object.entries(files)) {
  const directory = path.dirname(filePath);

  await fs.mkdir(directory, { recursive: true });

  await fs.appendFile(filePath, content.join('\n\n') + '\n');
}
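Saved as, say, dump-schema.ts (the file name is illustrative), the script can be run with a TypeScript runner such as tsx:

npx tsx dump-schema.ts --output-path ./generated-schema --postgres-dsn postgres://postgres@localhost:5432/my_database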

Alternatives

  • https://github.com/omniti-labs/pg_extractor
    • Prior to writing pg-dump-parser, I used this tool to extract the schema. It works well, but it is slow: it was taking a whole minute to parse our dump file. We needed something with equivalent functionality that is faster. pg-dump-parser processes the same dump in a few seconds.