reddit-crawler

v0.0.1

crawl all submissions in a subreddit

npm install reddit-crawler

Iterate over all submissions in a subreddit.

  • Uses Reddit's CloudSearch API.
  • Auto-renews OAuth access token as it crawls.

Usage

Until Node gets native async iterators, the crawler approximates one with its method next(): Promise<array | null>.

If the promise resolves to a falsy value (null), the crawler has finished crawling the subreddit.

The array of submissions may be empty. The crawler will expand its search interval until it finds results, attempting to hover around 50-99 results per request.

const makeCrawler = require('reddit-crawler')

const creds = {
    username: 'foo',
    password: 'secret',
    appId: 'xxx',
    appSecret: 'yyy',
}

async function work() {
    const crawler = makeCrawler('webdev', {
        creds,
        userAgent: 'my-crawler:0.0.1 (by /u/foo)',
    })

    while (true) {
        const submissions = await crawler.next()

        if (!submissions) {
            console.log('end of subreddit')
            break
        }

        for (const sub of submissions) {
            await processSubmission(sub)
        }
    }
}

function processSubmission(sub) {
    console.log(`title: "${sub.title}"`)
}

work().catch(console.error)
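
On Node versions that support async generators, the next() loop above can be wrapped so that callers use for await instead. The following is a minimal sketch, not part of the package: iterate is a hypothetical helper, and it relies only on the documented contract that next() resolves to an array of submissions or to a falsy value when the crawl is done.

```javascript
// Hypothetical helper (not part of reddit-crawler): adapts any object with
// next(): Promise<array | null> into an async iterable of single items.
async function* iterate(crawler) {
    while (true) {
        const batch = await crawler.next()
        // A falsy result means the crawler is done.
        if (!batch) return
        // Empty batches are allowed; they simply yield nothing.
        for (const item of batch) {
            yield item
        }
    }
}
```

With this helper, the body of work() could become: for await (const sub of iterate(crawler)) { await processSubmission(sub) }.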

The credentials belong to a Reddit app and the user that owns it. Giving the crawler your credentials lets it renew its access token as it crawls.

The access token expires after one hour, but crawling a large subreddit can take longer than that because the crawler respects Reddit's rate limit.

Reddit's API requires a user-agent string (see https://github.com/reddit/reddit/wiki/API) in this format:

<platform>:<app ID>:<version string> (by /u/<reddit username>)
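
As an illustration of that format, here is a small sketch that assembles the string from its parts. buildUserAgent and its parameter names are hypothetical and not provided by the package; the crawler itself just takes the finished string in its userAgent option.

```javascript
// Hypothetical helper: builds a user-agent string in the
// <platform>:<app ID>:<version string> (by /u/<reddit username>) format.
function buildUserAgent({ platform, appId, version, username }) {
    const fields = { platform, appId, version, username }
    for (const [name, value] of Object.entries(fields)) {
        if (!value) throw new Error(`missing user-agent field: ${name}`)
    }
    return `${platform}:${appId}:${version} (by /u/${username})`
}

const userAgent = buildUserAgent({
    platform: 'node',
    appId: 'my-crawler',
    version: '0.0.1',
    username: 'foo',
})
console.log(userAgent) // node:my-crawler:0.0.1 (by /u/foo)
```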

Options

const Duration = require('reddit-crawler/duration')

const crawler = makeCrawler('webdev', {
    // Required
    creds,
    userAgent: 'my-crawler:0.0.1 (by /u/foo)',
    // Optional (here are the defaults)
    initInterval: Duration.ofMinutes(15),
    minInterval: Duration.ofMinutes(10),
    maxInterval: Duration.ofDays(365),
    initMax: new Date(),
})

  • initInterval: the crawler starts off requesting submissions created within this span of time.
  • minInterval/maxInterval: the crawler shrinks/grows its interval to hover around 50-99 results per request, never going below the minimum or above the maximum.
  • maxInterval also tells the crawler when to give up: if the crawler has grown to its max interval and still finds no results, it assumes that there are no more submissions.
  • initMax: the crawler starts at the initMax date and crawls backwards into the past. Useful when you want to resume progress without re-crawling the top N submissions.
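
The shrink/grow behaviour described above can be pictured with the following sketch. It is not the crawler's actual code: the names nextInterval and shouldGiveUp, the doubling and halving factors, and the use of plain milliseconds in place of Duration are all assumptions made for illustration.

```javascript
const MINUTE = 60 * 1000

// Hypothetical sketch: pick the next search interval (in ms) from the number
// of results the previous request returned, aiming for 50-99 results.
function nextInterval(currentMs, resultCount, { minMs, maxMs }) {
    let next = currentMs
    if (resultCount < 50) {
        next = currentMs * 2 // too few results: widen the window (assumed factor)
    } else if (resultCount > 99) {
        next = currentMs / 2 // too many results: narrow the window (assumed factor)
    }
    // Never go below the minimum or above the maximum.
    return Math.min(maxMs, Math.max(minMs, next))
}

// Hypothetical stop condition: already at the max interval and still no results.
function shouldGiveUp(currentMs, resultCount, { maxMs }) {
    return resultCount === 0 && currentMs >= maxMs
}
```

Under these assumptions, an empty 15-minute window would be followed by a 30-minute one, and so on until results appear or the maximum interval is reached.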

Notes

  • Reddit asks that you hit its API no more than once per second. The crawler has a rudimentary, built-in sleep after each cloud-search request.
  • Set DEBUG=reddit-crawler to see debug logging.
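
For readers curious what a sleep-after-each-request pattern looks like, here is a minimal sketch. It only illustrates the idea; sleep and runThrottled are hypothetical names, and the crawler's own implementation may differ.

```javascript
// Resolves after `ms` milliseconds.
function sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms))
}

// Hypothetical helper: runs async request functions one at a time and
// pauses `delayMs` after each, so that by default no more than roughly
// one request is made per second.
async function runThrottled(requests, delayMs = 1000) {
    const results = []
    for (const request of requests) {
        results.push(await request())
        await sleep(delayMs)
    }
    return results
}
```

Because the pause comes after each request completes, the effective rate is always at or below one request per delayMs, regardless of how long each request takes.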