duphunt

v0.1.0

Find duplicate files by content (SHA-256) — zero install, cross-platform. npx duphunt . — no brew/apt. Zero dependencies.

Find duplicate files by content — anywhere, with nothing to install. The great duplicate finders (fdupes, jdupes, rdfind, fclones) are native binaries you have to brew/apt/cargo install first — which you can't always do on a locked-down box, a colleague's laptop, a CI runner, or a container. duphunt runs the moment you have Node or Python: npx duphunt . or pip install duphunt. Zero dependencies, no network.

$ npx duphunt ~/Downloads

2 duplicate group(s), 5 files, 8.1 MB reclaimable

  4.1 MB × 2   4.1 MB reclaimable
    /Users/me/Downloads/invoice.pdf
    /Users/me/Downloads/invoice (1).pdf

  2.0 MB × 3   4.0 MB reclaimable
    /Users/me/Downloads/clip.mp4
    /Users/me/Downloads/clip-copy.mp4
    /Users/me/Downloads/old/clip.mp4

Groups are sorted biggest-waste-first, so the files worth deleting are at the top.

How it works

  1. Group by size. Two files of different sizes can't be identical, so files with a unique size are never even read.
  2. Hash the collisions. Within each size group, each file is SHA-256 hashed (streamed in 64 KB chunks, so multi-GB files don't blow up memory).
  3. Report identical content. Files with the same hash are true byte-for-byte duplicates, grouped and ranked by reclaimable space.

It reports — it never deletes. You decide what to remove.
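The three steps above can be sketched in Python roughly as follows. This is an illustrative sketch, not duphunt's actual source: the names `sha256_of` and `find_duplicates` are hypothetical, and the `wasted` formula (size times the number of extra copies) is an assumption inferred from the sample output.

```python
import hashlib
import os
from collections import defaultdict

CHUNK_SIZE = 64 * 1024  # 64 KB, the chunk size mentioned in step 2


def sha256_of(path):
    """Stream a file through SHA-256 so large files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths, min_size=1):
    """Return duplicate groups, largest reclaimable space first (hypothetical helper)."""
    # Step 1: group by size; a file with a unique size cannot have a duplicate.
    by_size = defaultdict(list)
    for path in paths:
        size = os.path.getsize(path)
        if size >= min_size:
            by_size[size].append(path)

    groups = []
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue  # unique size: the file is never opened
        # Step 2: hash only the files that collide on size.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        # Step 3: files sharing a hash form one duplicate group.
        for file_hash, same_hash in by_hash.items():
            if len(same_hash) > 1:
                groups.append({
                    "hash": file_hash,
                    "size": size,
                    "count": len(same_hash),
                    # Assumption: every copy beyond the first is reclaimable.
                    "wasted": size * (len(same_hash) - 1),
                    "paths": sorted(same_hash),
                })
    groups.sort(key=lambda group: group["wasted"], reverse=True)
    return groups
```

Grouping by size first means most files are only inspected with a cheap size lookup, and hashing is reserved for the candidates that could actually match.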

Usage

duphunt                        # scan the current directory
duphunt ~/Downloads ~/Desktop  # scan several roots at once
duphunt a.jpg b.jpg c.jpg      # or just compare specific files
duphunt . --json               # machine-readable
duphunt . --min-size 1048576   # ignore files under 1 MB
duphunt . --exit-code          # exit 1 if any duplicates exist (CI gate)

Options

| Flag | Effect |
|------|--------|
| --json | Emit { groups, summary } as JSON (raw byte sizes, full paths) |
| --quiet | Print only the one-line summary |
| --min-size <n> | Ignore files smaller than n bytes (default 1, which skips empty files) |
| --follow | Follow symlinks (default: skip them, to avoid loops and double-counting) |
| --exit-code | Exit 1 when duplicates are found (for CI gates) |
| -v, --version | Print version |
| -h, --help | Show help |

Notes

  • Empty files are skipped by default (they all hash alike and are rarely what you mean); pass --min-size 0 to include them.
  • Symlinks are skipped unless --follow is passed, so a symlinked tree won't be double-counted or loop forever.
  • Each real path is counted once. Repeated or overlapping roots and symlink aliases (even under --follow) are de-duplicated by real path, so they never inflate the results. Hard links have distinct real paths, so genuine hard links still surface as duplicates.
  • Same tool, two builds. The Node and Python builds hash with SHA-256 and produce identical results — use whichever your environment already has.
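De-duplication by real path, as described in the notes above, might look something like this in Python. It is a minimal sketch under the assumption that "real path" means the symlink-resolved absolute path; `unique_by_real_path` is a hypothetical name, not part of duphunt.

```python
import os


def unique_by_real_path(paths):
    """Keep the first occurrence of each real path, preserving input order."""
    seen = set()
    unique = []
    for path in paths:
        # realpath resolves symlinks and "." / ".." segments, so aliases of
        # the same file collapse to one entry. Hard links keep distinct
        # real paths and therefore stay separate.
        real = os.path.realpath(path)
        if real not in seen:
            seen.add(real)
            unique.append(real)
    return unique
```

Because a hard link is a second directory entry rather than an alias, it survives this filter and can still be reported as a duplicate of the original.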

--json shape

{
  "groups": [
    { "hash": "9f86d0…", "size": 4300000, "count": 2, "wasted": 4300000,
      "paths": ["/a/invoice.pdf", "/b/invoice (1).pdf"] }
  ],
  "summary": { "groups": 1, "files": 2, "wasted": 4300000 }
}
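As one way to consume this shape, the sketch below parses a report and lists every path except the first in each group. The helper name `removal_candidates` and the keep-the-first policy are assumptions made for illustration; duphunt itself never deletes anything.

```python
import json


def removal_candidates(report_text):
    """Return every path except the first in each duplicate group."""
    report = json.loads(report_text)
    candidates = []
    for group in report["groups"]:
        # Keeping the first path is an arbitrary policy chosen for this sketch.
        _keep, *extras = group["paths"]
        candidates.extend(extras)
    return candidates


# Sample report copied from the shape shown above.
sample = """
{
  "groups": [
    { "hash": "9f86d0…", "size": 4300000, "count": 2, "wasted": 4300000,
      "paths": ["/a/invoice.pdf", "/b/invoice (1).pdf"] }
  ],
  "summary": { "groups": 1, "files": 2, "wasted": 4300000 }
}
"""

print(removal_candidates(sample))  # ['/b/invoice (1).pdf']
```

Review the resulting list before acting on it, since which copy to keep is a decision the report leaves to you.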

Exit codes

| Code | Meaning |
|------|---------|
| 0 | success (the default, even when duplicates are found) |
| 1 | duplicates found and --exit-code was passed |
| 2 | error (bad option, missing path) |

By default duphunt is a viewer and exits 0; add --exit-code to gate a pipeline on it.
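A pipeline step could interpret these codes along the following lines. This is a hedged sketch: `run_gate` and `EXIT_MEANINGS` are hypothetical names, and the commented invocation assumes duphunt is reachable through npx on the machine running the script.

```python
import subprocess
import sys

# Meanings taken from the exit-code table above.
EXIT_MEANINGS = {
    0: "success",
    1: "duplicates found and --exit-code was passed",
    2: "error (bad option, missing path)",
}


def run_gate(command):
    """Run a command and return (exit code, documented meaning)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    meaning = EXIT_MEANINGS.get(completed.returncode, "undocumented exit code")
    return completed.returncode, meaning


# Example use in a CI script (assumes npx can resolve duphunt):
#   code, meaning = run_gate(["npx", "duphunt", ".", "--exit-code"])
#   print(f"duphunt exited {code}: {meaning}")
#   sys.exit(code)
```

Propagating the exit code unchanged lets the surrounding pipeline distinguish "duplicates found" (1) from a genuine failure (2).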

License

MIT