

Dumpduck - import postgres .dump files into duckdb

Status: Alpha - this works for small .dumps but isn't performant for large databases yet. But I architected it so it should be quite possible to optimize.

Example usage:

  • Read from a local SQL file and import into a local DuckDB file: pnpx dumpduck backup.sql out.duckdb
  • Stream from a remote .dump file, translate with pg_restore on the fly, and stream into MotherDuck. This takes low RAM and low disk space, perfect for GitHub Actions cronjobs with huge backups: pnpx dumpduck https://example.com/backup.dump md:
  • Read from stdin: cat backup.dump | pnpx dumpduck :stdin: out.duckdb

Inspired by https://github.com/NINAnor/pg_dedump

The workflow is basically:

  • We are handed a file path, an http(s) URL, or a stream such as stdin. We convert all of these to a stream of bytes, for consistency.
  • Then, we detect the format for this byte stream. If it is the custom binary format from eg pg_dump --format='custom', then we pipe it through pg_restore to convert it back to plain SQL. If you need to do this, then of course you will need pg_restore available on your system.
  • Spin up a temporary, in-memory postgres instance using the awesome pglite project (a port of postgres to a ~3 MB WASM file), and expose it as a local server.
  • Connect to the given duckdb database, and then ATTACH 'host=0.0.0.0 port=9077 sslmode=disable' AS pg (TYPE postgres); to that postgres connection.
  • Take the stream of raw SQL text and do very rough parsing on it, splitting it into individual statements.
  • Take the stream of SQL statements and process them one at a time:
    • The first section of SQL statements in a dump are e.g. CREATE TABLE and CREATE TYPE statements. Pass each of these through to pglite.
    • As soon as we hit the first COPY statement, that means we have moved onto the data section of the dump file. Pause for a moment, and in the duckdb instance do COPY FROM DATABASE pg TO <current db>. This leverages the type mapping of the duckdb postgres extension to copy over all the finished types, tables, and views from pglite into duckdb. There is no data in the pglite tables yet, so this does no data copying.
    • Now that the duckdb database is ready to accept data, we process each of the COPY statements. For each, we intercept the rows of data and batch them into chunks of 1k lines. We load each batch into pglite using its native COPY syntax. This guarantees that any weird CSV formatting in the dump file is actually read by postgres, the program that is designed to handle it, instead of trying to convert it to something that duckdb understands (I tried this and it was hard). After loading the batch into pglite, from duckdb we INSERT INTO <dst> SELECT * FROM pg.<dst>; to move it over, and then TRUNCATE; VACUUM; the pglite table to reclaim space.
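As a concrete illustration of the format-detection step: pg_dump's custom format begins with the 5-byte magic string "PGDMP", while a plain-SQL dump is just text, so peeking at the first bytes of the stream is enough to decide whether to pipe through pg_restore. A minimal sketch (the function name is illustrative, not dumpduck's actual API):

```typescript
// Sketch of the format-detection step. pg_dump's custom archive format
// starts with the magic bytes "PGDMP"; a plain-SQL dump does not. We peek
// at the first bytes of the input stream and compare.
function isCustomFormat(head: Uint8Array): boolean {
  const magic = [0x50, 0x47, 0x44, 0x4d, 0x50]; // "PGDMP"
  return magic.every((b, i) => head[i] === b);
}
```

In the streaming setup described above, only these first few bytes need to be buffered before the rest of the stream is forwarded either to pg_restore or straight to the SQL parser.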
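The "very rough parsing" step can be sketched as a toy splitter that treats the data lines of a COPY ... FROM stdin block (terminated by \.) as part of that COPY statement, and otherwise splits on line-final semicolons. Like the real thing it is deliberately rough (no dollar-quoting handling, etc.); the function name is illustrative, not dumpduck's actual API:

```typescript
// Toy version of the rough statement splitter. It buffers lines until a
// statement ends: either a line-final ";" outside COPY data, or the "\."
// terminator that ends a COPY ... FROM stdin data section.
function* splitStatements(sql: string): Generator<string> {
  const lines = sql.split("\n");
  let buf: string[] = [];
  let inCopyData = false;
  for (const line of lines) {
    buf.push(line);
    if (inCopyData) {
      if (line === "\\.") { // end of the COPY data section
        yield buf.join("\n");
        buf = [];
        inCopyData = false;
      }
    } else if (/^COPY .* FROM stdin;$/i.test(line.trim())) {
      inCopyData = true; // raw data rows follow until "\."
    } else if (line.trimEnd().endsWith(";")) {
      yield buf.join("\n");
      buf = [];
    }
  }
  if (buf.join("\n").trim()) yield buf.join("\n"); // trailing fragment
}
```

Because the splitter is a generator over a line stream, it fits the one-pass streaming design: each yielded statement can be dispatched to pglite (or intercepted, for COPY) without holding the rest of the dump in memory.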

Note how this entire thing is very carefully designed to be streaming. We only read through the input stream once, and we never materialize large chunks of data. This means that you can do pnpx dumpduck https://example.com/backup.dump md: to do backups on tiny GitHub Actions runners that have low RAM and disk space, but great network connectivity.
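The 1k-line batching is what keeps memory bounded on those tiny runners: rows are grouped and flushed as they arrive, and the full data section is never materialized. A minimal sketch of that step (names and the default chunk size are illustrative, mirroring the 1k figure above):

```typescript
// Sketch of the batching step: group incoming COPY data rows into chunks
// of at most `size` lines, so each chunk can be loaded into pglite via its
// COPY syntax and then flushed into duckdb before the next chunk arrives.
function* batchRows(rows: Iterable<string>, size = 1000): Generator<string[]> {
  let batch: string[] = [];
  for (const row of rows) {
    batch.push(row);
    if (batch.length >= size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length) yield batch; // flush the final partial batch
}
```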

Limitations and Problems

Due to https://github.com/duckdb/duckdb-postgres/issues/429, if your postgres dump has any custom datatypes, then these will get improperly copied over to the duckdb file. You will be able to use the duckdb file fine, but any schema.sql files generated from a subsequent EXPORT statement will be garbled. If you have custom datatypes in postgres AND need to EXPORT and reIMPORT the duckdb database, you will be affected. Otherwise you should be fine.