crazy-crawler

v0.3.0

crawl everything by this

crazyCrawler 2.0

crawl everything by this

Usage

Install

npm install crazy-crawler

Require

const { CrazyCrawler, Task, TaskChain } = require('crazy-crawler')

const crawler = new CrazyCrawler({ maxTask: 5, sleep: 100 })

crawler.on('done', () => {
	// the crawler has finished working
})

Create a crawler task

const task = new Task({
	name: 'example',
	url: 'http://www.baidu.com',
	handler: function(response) {
		// response is axios response
		console.log(response.data) //data of axios response
		console.log(response.task.name) // example
	},
	errorHandler: function(err) {
		// when error occurs in HTTP request this handler will be invoked
	}
})
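As a rough illustration of what a handler might do with the response, the sketch below extracts a page title from `response.data`. The `extractTitle` helper and the `fakeResponse` object are hypothetical stand-ins invented for this example; they are not part of crazy-crawler or axios, and the object only mimics the `data` and `task` fields mentioned above so the code runs without an HTTP request.

```javascript
// Hypothetical helper (not part of crazy-crawler): pull the <title> text out of an HTML string.
function extractTitle(html) {
	const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html)
	return match ? match[1].trim() : null
}

// A handler in the shape the README describes: it receives the axios response.
function handler(response) {
	return { task: response.task.name, title: extractTitle(response.data) }
}

// Stand-in for a real axios response, so the sketch runs without any network access.
const fakeResponse = {
	data: '<html><head><title> Example Domain </title></head></html>',
	task: { name: 'example' }
}

console.log(handler(fakeResponse)) // { task: 'example', title: 'Example Domain' }
```

A function like `handler` above could be passed as the `handler` option of a Task; for real pages a proper HTML parser is usually a safer choice than a regular expression.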

Queue the task to the crawler and run

crawler.queueTask(task).run() // run crawler

Create a crawler taskChain

const taskChain = new TaskChain()

const firstTask = new Task({
	name: 'first',
	url: 'http://www.baidu.com',
	beforeTask: function({ lastTask, task, state }) {
		// if this task is the first task of the task chain,
		// lastTask will be undefined
		// task is the task about to be executed
		// state is a property of the taskChain
		// shared by every task in the chain
		console.log(task.name) // first
		state.firstStatus = 'before'
	},
	afterTask: function({ task, state, response }) {
		// response is axios response and the same as response in handler
		console.log(state.firstStatus) // before
		state.firstStatus = 'finish'
	}
})

const secondTask = new Task({
	name: 'second',
	url: 'http://www.baidu.com',
	beforeTask: function({ lastTask, task, state }) {
		console.log(lastTask.name) // first
		console.log(state.firstStatus) // finish
	},
	afterTask: function({ task, state, response }) {
		// response is axios response and the same as response in handler
		console.log(response.task.name) // second
	}
})

taskChain.queue([firstTask, secondTask])

Queue the taskChain to the crawler and run

crawler.queueTask(taskChain).run() // run crawler
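To make the flow of `lastTask` and `state` through a chain easier to follow, here is a minimal synchronous sketch of the hook order described above. It is not crazy-crawler's implementation: `runChain` and `fakeRequest` are hypothetical names, the tasks are plain objects rather than `Task` instances, and the real library performs asynchronous HTTP requests between the two hooks.

```javascript
// Hypothetical stand-in for a chain runner (not the library's code).
function runChain(tasks, fakeRequest) {
	const state = {} // shared by every task in the chain
	let lastTask // stays undefined for the first task
	for (const task of tasks) {
		if (task.beforeTask) task.beforeTask({ lastTask, task, state })
		const response = fakeRequest(task) // the real crawler would issue an HTTP request here
		if (task.afterTask) task.afterTask({ task, state, response })
		lastTask = task
	}
	return state
}

const log = []
const first = {
	name: 'first',
	beforeTask({ lastTask, state }) {
		log.push(lastTask === undefined ? 'no previous task' : lastTask.name)
		state.firstStatus = 'before'
	},
	afterTask({ state }) {
		state.firstStatus = 'finish'
	}
}
const second = {
	name: 'second',
	beforeTask({ lastTask, state }) {
		log.push(`${lastTask.name}:${state.firstStatus}`)
	}
}

const finalState = runChain([first, second], task => ({ data: '', task }))
console.log(log) // [ 'no previous task', 'first:finish' ]
console.log(finalState) // { firstStatus: 'finish' }
```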

examples

repeat task

  • example 1
const crawler = new CrazyCrawler({ maxTask: 5, sleep: 100 })
let counter = 0

crawler.on('done', () => {
	console.log(counter) // 3
})

const repeatTask = new Task({
	name: 'repeat',
	url: 'http://example.com',
	handler: function(response) {
		counter++
	},
	repeat: true,
	limit: 3
})

crawler.queueTask(repeatTask).run()
  • example 2
const crawler = new CrazyCrawler({ maxTask: 5, sleep: 100 })
let counter = 0

crawler.on('done', () => {
	console.log(counter) // 4
})

const repeatTask = new Task({
	name: 'repeat',
	url: 'http://example.com',
	handler: function(response) {
		counter++
	},
	repeat: true,
	limit: 2
})

const repeatTask1 = new Task({
	name: 'repeat',
	url: 'http://example.com',
	handler: function(response) {
		counter++
	},
	repeat: true,
	limit: 2
})

crawler
	.queueTask(repeatTask)
	.queueTask(repeatTask1)
	.run()

functional task

  • example 3
const crawler = new CrazyCrawler({ maxTask: 5, sleep: 100 })
let counter = 0

crawler.on('done', () => {
	console.log(counter) // 2
})

const functionalTask = new Task({
	name: 'functional',
	baseUrl: 'http://example.com/:id',
	paramSetters: {
		id: function(counter) {
			return counter + 123
			// url will be http://example.com/123 http://example.com/124 ...
		}
	},
	handler: function(response) {
		counter++
	},
	functional: true,
	limit: 2
})

crawler.queueTask(functionalTask).run()
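The URL generation in example 3 can be pictured with the sketch below. This is only an assumption about the behaviour, inferred from the comment in the example: `buildUrl` is a hypothetical helper, the `:name` placeholder syntax is taken from the `baseUrl` shown above, and the counter is assumed to start at 0. The library's actual matching is configurable through `baseUrlPattern` and may differ.

```javascript
// Hypothetical helper (not part of crazy-crawler): replace each `:name`
// placeholder in baseUrl with the value returned by the matching setter.
function buildUrl(baseUrl, paramSetters, counter) {
	return baseUrl.replace(/:([A-Za-z_]\w*)/g, (placeholder, name) => {
		const setter = paramSetters[name]
		// Leave the text untouched when no setter exists for this name.
		return setter ? String(setter(counter)) : placeholder
	})
}

const paramSetters = { id: counter => counter + 123 }

// Assumption: the counter starts at 0 and increases by 1 for each generated task.
const urls = [0, 1].map(counter => buildUrl('http://example.com/:id', paramSetters, counter))
console.log(urls) // [ 'http://example.com/123', 'http://example.com/124' ]
```

The `://` after the scheme and a numeric port such as `:8080` are not treated as placeholders, because the pattern requires a letter or underscore right after the colon.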

functional and repeat task

  • example 4
const crawler = new CrazyCrawler({ maxTask: 5, sleep: 100 })
let counter = 0

crawler.on('done', () => {
	console.log(counter) // 4
})

const functionalTask = new Task({
	name: 'functional',
	baseUrl: 'http://example.com/:id',
	paramSetters: {
		id: function(counter) {
			return counter + 123
			// url will be http://example.com/123 http://example.com/124 ...
		}
	},
	handler: function(response) {
		counter++
	},
	functional: true,
	limit: 2
})

const repeatTask = new Task({
	name: 'repeat',
	url: 'http://example.com',
	handler: function(response) {
		counter++
	},
	repeat: true,
	limit: 2
})

crawler
	.queueTask(functionalTask)
	.queueTask(repeatTask)
	.run()

API

CrazyCrawler

CrazyCrawler.constructor({ maxTask, sleep })

  • maxTask: maximum number of tasks the downloader executes at the same time
  • sleep: delay between tasks

CrazyCrawler.queueTask(task: Task | TaskChain)

  • add a task or taskChain to the crawler

CrazyCrawler.run()

  • run the crawler

events

done

  • the 'done' event is emitted when the crawler finishes working
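One common way to produce such an event is to count outstanding work and emit once the count reaches zero. The sketch below shows that pattern with Node's built-in `EventEmitter`. It is an illustration only: `TinyQueue`, `add` and `finishOne` are hypothetical names, and the README does not say how crazy-crawler tracks its tasks internally.

```javascript
const { EventEmitter } = require('events')

// Hypothetical stand-in for the crawler's bookkeeping (not the library's code).
class TinyQueue extends EventEmitter {
	constructor() {
		super()
		this.pending = 0 // number of tasks that have not finished yet
	}
	add() {
		this.pending++
	}
	finishOne() {
		this.pending--
		// Emit 'done' only when the last outstanding task has finished.
		if (this.pending === 0) this.emit('done')
	}
}

const queue = new TinyQueue()
let doneCount = 0
queue.on('done', () => { doneCount++ })

queue.add()
queue.add()
queue.finishOne() // one task still pending, nothing emitted
queue.finishOne() // last task finished, 'done' is emitted
console.log(doneCount) // 1
```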

Task

Task.constructor({...options})

basic options

  • name: the name of the task
  • url: the target URL
  • method: HTTP method, defaults to 'get'
  • data: request data, only used when method is post; can be a plain object or a string
  • headers: can be a plain object or a string
  • cookies: cookie object; used only if headers has no 'Cookie' property
  • axiosOptions: any options supported by axios, including url, method, data and headers
  • handler: handles the response on success; its parameter is the axios response
  • errorHandler: handles any error that occurs during the axios request
  • fakeIP: adds 'X-Forword-For' and 'CLIENT_IP' headers with a random IP
  • repeat: specifies that the task is a repeat task
  • limit: used when the task is repeat or functional; a number or a function
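The interaction between `headers` and `cookies` can be sketched as follows. This is an assumption about the behaviour based on the option descriptions above, not the library's code: `applyCookies` is a hypothetical helper, it only handles the plain-object form of `headers`, and the case-insensitive check for an existing Cookie header is a choice made for this example.

```javascript
// Hypothetical helper (not part of crazy-crawler): serialize a cookie object
// into a Cookie header, but only when the headers do not already carry one.
function applyCookies(headers = {}, cookies = {}) {
	const hasCookieHeader = Object.keys(headers).some(key => key.toLowerCase() === 'cookie')
	const pairs = Object.entries(cookies).map(([name, value]) => `${name}=${value}`)
	if (hasCookieHeader || pairs.length === 0) return { ...headers }
	return { ...headers, Cookie: pairs.join('; ') }
}

console.log(applyCookies({ Accept: 'text/html' }, { sid: 'abc', lang: 'en' }))
// { Accept: 'text/html', Cookie: 'sid=abc; lang=en' }

console.log(applyCookies({ Cookie: 'sid=override' }, { sid: 'abc' }))
// { Cookie: 'sid=override' }
```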

functional task options

  • functional: specifies that the task is functional
  • baseUrl: the url is generated from baseUrl
  • baseData: the data is generated from baseData
  • paramSetters: specifies the properties to be generated into the url and data
  • baseUrlPattern: how to find the places to be replaced with generated params

options for tasks in a chain

  • inChain: specifies that the task is working in a chain
  • beforeTask: invoked before the axios request; you can modify the task here
  • afterTask: invoked after the axios request; you can store useful data here for later use in the chain

Task.exec()

run the task

Task.checkLimit()

check whether the task is over its limit
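Since `limit` may be a number or a function, a limit check could look roughly like the sketch below. The helper name `isOverLimit`, its signature, and the argument passed to a function-valued limit are all assumptions made for illustration; the README does not document how the library calls a limit function.

```javascript
// Hypothetical helper (not part of crazy-crawler): `limit` may be a number of
// runs or a predicate that receives the number of runs so far.
function isOverLimit(limit, runCount) {
	if (typeof limit === 'function') return Boolean(limit(runCount))
	if (typeof limit === 'number') return runCount >= limit
	return false // no limit configured
}

// Simulate a repeat task with limit: 3, as in example 1.
let counter = 0
while (!isOverLimit(3, counter)) counter++
console.log(counter) // 3

// A function-valued limit, here stopping once five runs have happened.
let runs = 0
while (!isOverLimit(count => count >= 5, runs)) runs++
console.log(runs) // 5
```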

Task.copy()

return a task whose copyFrom property is set to this task

Task.repeatTask()

return a new task like this one

Task.generateTask()

if the task is functional, return a task generated from its functional options; otherwise return this.copy()

TaskChain

TaskChain.constructor({ repeat, functional, limit })

  • repeat: specifies that this task chain is a repeat chain
  • limit: number of times to repeat; does not work with functional
  • functional: specifies that this task chain is functional

TaskChain.queue(task)

queue tasks to execute in the chain; they run in the order they were queued

TaskChain.toTask()

convert the task chain to a Task

TaskChain.checkLimit()

if repeat, this checks whether the chain is over its limit; if functional, this invokes checkLimit on every task in the chain

TaskChain.generateTaskChain()

if functional, this invokes generateTask on every task in the chain, pushes the results to a new TaskChain, and returns the new chain

TaskChain.repeatTaskChain()

if repeat, this will return a new TaskChain based on this taskChain

Pull requests are welcome.

License

MIT