npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

pg-turbo

v0.1.4

Published

Resilient PostgreSQL dump & restore with direct COPY protocol

Readme

pg-turbo

CI codecov npm version npm downloads License: AGPL-3.0

Resilient PostgreSQL dump & restore CLI for large databases over flaky connections.

Uses the PostgreSQL COPY protocol directly with chunked streaming, parallel workers, and per-chunk retry. Designed for RDS over VPN, remote servers, and other scenarios where standard pg_dump/pg_restore fail on large schemas.

Why pg-turbo?

| Problem with pg_dump | pg-turbo solution | |---------------------|----------------------| | One large table fails = restart entire dump | Per-chunk retry -- only the failed 250MB chunk retries | | Can't parallelize within a single table | Sub-table chunking splits large tables across workers | | No resume after interruption | .done markers let you pick up where you left off | | Connection drops after hours of dumping | TCP keepalive tuning detects drops in ~60s, auto-retry | | Slow on remote databases | Direct COPY protocol, no pg_dump process overhead per table |

Quick Start

# Install globally
npm install -g pg-turbo

# Dump a database
pg-turbo dump \
    -d "postgresql://user:pass@host:5432/mydb" \
    --output ./mydb_dump \
    -j 4

# Restore to another database
pg-turbo restore \
    -d "postgresql://user:pass@host:5432/target_db" \
    --input ./mydb_dump \
    -j 4

Requirements

  • Node.js >= 20
  • pg_dump and pg_restore (for DDL only -- data uses direct COPY)

Features

  • Chunked COPY streaming -- large tables split by PK range or ctid range for parallel transfer
  • Volume-balanced chunks -- samples row sizes to create equal-byte chunks (handles skewed data)
  • Per-chunk retry with exponential backoff + jitter
  • Resume support -- skip completed chunks on re-run
  • Snapshot consistency -- all workers see identical data via pg_export_snapshot()
  • Streaming compression -- zstd (default) or lz4, never buffers full table in memory
  • Live progress dashboard -- per-worker status, overall speed, ETA
  • Connection resilience -- TCP keepalive tuning, auto-reconnect on failure
  • Archive packaging -- optional .pgt single-file format (tar + zstd)
  • Connection string cleaning -- strips GUI params (statusColor, env, etc.) from tools like TablePlus

Usage

Dump

# Dump entire database with 4 parallel workers
pg-turbo dump \
    -d "postgresql://user:pass@host:5432/mydb" \
    --output ./mydb_dump \
    -j 4

# Dump specific schema, force chunking at 256MB
pg-turbo dump \
    -d "postgresql://user:pass@host:5432/mydb" \
    --output ./mydb_dump \
    -n public \
    -j 4 \
    --split-threshold 256MB

# Dump from read replica (skip snapshot)
pg-turbo dump \
    -d "postgresql://readonly:pass@replica:5432/mydb" \
    --output ./mydb_dump \
    -j 8 \
    --no-snapshot

# Preview what would be dumped
pg-turbo dump \
    -d "postgresql://user:pass@host:5432/mydb" \
    --output ./mydb_dump \
    --dry-run

Restore

# Restore to target database
pg-turbo restore \
    -d "postgresql://user:pass@host:5432/target_db" \
    --input ./mydb_dump \
    -j 4

# Clean restore (drop + recreate schemas first)
pg-turbo restore \
    -d "postgresql://user:pass@host:5432/target_db" \
    --input ./mydb_dump \
    -j 4 -c

# Restore only data (schema already exists)
pg-turbo restore \
    -d "postgresql://user:pass@host:5432/target_db" \
    --input ./mydb_dump \
    -j 4 -a

# Restore a single table
pg-turbo restore \
    -d "postgresql://user:pass@host:5432/target_db" \
    --input ./mydb_dump \
    -t users

# Restore from .pgt archive
pg-turbo restore \
    -d "postgresql://user:pass@host:5432/target_db" \
    --input ./mydb_dump.pgt \
    -j 4

Options

dump

| Option | Description | Default | |--------|-------------|---------| | -d, --dbname | PostgreSQL connection string | required | | --output | Output directory | required | | -n, --schema | Dump only this schema | all user schemas | | -j, --jobs | Parallel workers | 4 | | --split-threshold | Chunk tables larger than this | 1GB | | --max-chunks-per-table | Max chunks per table | 32 | | --retries | Max retries per chunk | 5 | | --retry-delay | Base retry delay (seconds) | 5 | | --compression | zstd or lz4 | zstd | | --no-snapshot | Skip snapshot (for read replicas) | | | --no-archive | Skip .pgt archive packaging | | | --dry-run | Preview without dumping | |

restore

| Option | Description | Default | |--------|-------------|---------| | -d, --dbname | PostgreSQL connection string | required | | --input | Input directory or .pgt file | required | | -n, --schema | Restore only this schema | all | | -t, --table | Restore a single table | all | | -j, --jobs | Parallel workers | 4 | | -c, --clean | Drop + recreate schemas first | | | -a, --data-only | Skip DDL, restore only data | | | --retries | Max retries per chunk | 5 | | --retry-delay | Base retry delay (seconds) | 5 | | --dry-run | Preview without restoring | |

Passthrough: args after -- are forwarded to pg_dump/pg_restore for DDL operations.

How It Works

Architecture

Coordinator Connection
  BEGIN REPEATABLE READ
  pg_export_snapshot() --> snapshot_id
  (holds transaction open for consistency)
       |
       +-- Worker 1: SET TRANSACTION SNAPSHOT
       |   COPY (SELECT * FROM big_table WHERE id BETWEEN 0 AND 25000) TO STDOUT
       |     --> zstd compress --> chunk_0000.copy.zst
       |
       +-- Worker 2: SET TRANSACTION SNAPSHOT
       |   COPY (SELECT * FROM big_table WHERE id BETWEEN 25001 AND 50000) TO STDOUT
       |     --> zstd compress --> chunk_0001.copy.zst
       |
       +-- Worker 3: SET TRANSACTION SNAPSHOT
       |   COPY small_table TO STDOUT
       |     --> zstd compress --> chunk_0000.copy.zst
       |
       +-- Worker 4: ...

Chunking Strategies

| Strategy | When | How | |----------|------|-----| | pk_range | Table > threshold, has integer PK | WHERE id BETWEEN start AND end | | ctid_range | Table > threshold, no integer PK, PG 14+ | WHERE ctid >= '(page,0)'::tid | | none | Table < threshold | Single COPY table TO STDOUT |

Large tables are split into volume-balanced chunks by sampling row sizes at evenly-spaced PK points. This handles skewed data distributions where some rows are much larger than others.

Resume

Dump: Each completed chunk writes a .done marker file. Re-running the dump skips chunks that already have markers.

Restore: Completed chunks are tracked in a _pg_turbo._progress table in the target database, committed atomically with the data. Survives crashes.

Output Format

dump_dir/
  manifest.json                    # Metadata: tables, chunks, sequences, options
  _schema_ddl.dump                 # pg_dump custom format (DDL only)
  data/
    public.users/
      chunk_0000.copy.zst          # COPY text format, zstd compressed
      chunk_0000.copy.zst.done     # Resume marker
      chunk_0001.copy.zst
      chunk_0001.copy.zst.done
    public.orders/
      chunk_0000.copy.zst
      chunk_0000.copy.zst.done

Benchmarks

On a 144MB test database (500K rows, local Docker):

| Method | Time | vs pg_dump | |--------|------|-----------| | pg_dump -Fc (single) | 2.1s | baseline | | pg_dump -Fd -j4 | 1.6s | 1.3x | | pg-turbo -j1 | 1.0s | 2.1x | | pg-turbo -j4 | 0.8s | 2.6x |

The speed advantage comes from direct COPY protocol (no per-table pg_dump process spawning) and overlapping DDL dump with data dump. On remote databases over VPN, the advantage is larger due to connection reuse and chunked retry.

Development

npm install          # Install dependencies
npm run typecheck    # TypeScript strict mode check
npm test             # Unit tests (vitest)
npm run test:integration  # Integration tests (requires Docker)
npm run bench        # Performance benchmarks
npm run lint         # Lint (biome)
npm run format       # Format (biome)

Project Structure

src/
  cli/          # CLI concerns (args, orchestration, terminal UI)
  core/         # Reusable logic (connection, streaming, chunking, schema)
  types/        # Shared TypeScript type definitions
tests/
  unit/         # Vitest unit tests
  integration/  # Docker-based end-to-end tests
bench/          # Performance benchmarks

License

AGPL-3.0