command-scraper

v1.0.3

A web image scraper with user approval process in the command line.

Downloads

12

Readme

Command Scraper

A command line approval based web scraper.

A side note about web scraping: always check a website's Terms of Service before you attempt to scrape its content. The owner may not want their site and bandwidth touched by scraping, so be mindful and respect people's wishes. If unsure, drop the website owner an email :)

What it does

The command-scraper searches a specified website for images and shows them to the user in a web browser window, one by one. The user approves or rejects each image in the terminal, and each approval triggers an action on that image (such as saving it to a database or file system).
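The one-by-one approval flow described above can be sketched roughly as follows. This is an illustration only, not the package's code: `reviewImages`, `ask`, and `onApprove` are hypothetical names, and the sketch assumes that an answer of `y` or `yes` means approval.

```javascript
// Hypothetical sketch: none of these names come from command-scraper.
// `ask` is any function that returns (or resolves to) the user's answer for a URL.
// `onApprove` is the action to run for each approved image.
async function reviewImages(imageUrls, ask, onApprove) {
  const approved = [];
  for (const url of imageUrls) {
    // Wait for a decision before moving on, so images are reviewed one at a time.
    const answer = String(await ask(url)).trim().toLowerCase();
    if (answer === 'y' || answer === 'yes') {
      approved.push(url);
      await onApprove(url);
    }
  }
  return approved;
}
```

In a terminal, `ask` could wrap `rl.question` from Node's built-in `readline` module; passing it in as a parameter keeps the loop easy to try out without a real prompt.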

The app creates a directory called public if one doesn't exist and serves static files from it.

While the user is scraping, images are saved temporarily to a directory called temp inside the public directory. The contents of this directory are emptied automatically once all images have been scraped and reviewed.

On approval of an image, the package will request the image from the temporary directory and write the same image to a user-specified permanent directory.

After each successful write of an approved image, a callback function is called with the permanent image URL, so that tasks such as DB updates can be carried out alongside the image write.
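The directory lifecycle described above (create public and temp, copy approved images to a permanent directory, call back, then empty temp) can be sketched with Node's built-in `fs` module. This is a minimal sketch under stated assumptions, not the package's implementation: the helper names `ensureDirs`, `approveImage`, and `emptyTemp` are hypothetical, and the callback here receives a file path rather than a URL.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helpers; names and layout are assumptions, not the package's API.

// Create <root>/public and <root>/public/temp if they are missing.
function ensureDirs(root) {
  const tempDir = path.join(root, 'public', 'temp');
  fs.mkdirSync(tempDir, { recursive: true });
  return tempDir;
}

// Copy an approved image from temp into the permanent directory,
// then hand the new location to the callback (e.g. for a DB update).
function approveImage(tempFile, permanentDir, onSaved) {
  fs.mkdirSync(permanentDir, { recursive: true });
  const target = path.join(permanentDir, path.basename(tempFile));
  fs.copyFileSync(tempFile, target);
  onSaved(target);
  return target;
}

// Empty the temp directory once every image has been reviewed.
function emptyTemp(tempDir) {
  for (const name of fs.readdirSync(tempDir)) {
    fs.rmSync(path.join(tempDir, name), { recursive: true, force: true });
  }
}
```

Calling the callback only after `copyFileSync` returns means a failed write throws before any follow-up task runs, which matches the "on each successful write" wording.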

Note: currently the app is best run in a local environment. Updates are coming to allow integration with a remote system.

Install

npm i command-scraper

The function

Example usage

To run the command-scraper in a Node project, require the package in a script and call it. There's no need to set up a server or socket info, as all of this is done inside the package:

Then simply:

node scraper.js

To do

  • Allow the process to be run on a remote system (such as Heroku)
  • Make image selectors more flexible (i.e. not limited to parent/child searches)
  • Enable selecting and parsing of more than just images
  • Update so the scraper can be run from the CLI rather than only embedded in a function