bontan

v1.0.0

Utility scripts for scraping

 _                 _              
| |__   ___  _ __ | |_ __ _ _ __  
| '_ \ / _ \| '_ \| __/ _` | '_ \ 
| |_) | (_) | | | | || (_| | | | |
|_.__/ \___/|_| |_|\__\__,_|_| |_|

Bontan is a simple scraper with specialized behavior for some sites (like Wikipedia) and smart fallbacks for others. This repository also includes other scraping utilities.

  • bontan: Attempt to scrape all useful text content on the page.
  • kinkan: Like bontan, but as a summary: typically the page title, the first image, and the first paragraph. ("kinkan" is Japanese for "kumquat".)
  • sudachi: Simple utility to scrape using CSS selectors.
  • mikan: A simple clone of import.io. Prints each item in the biggest list on the page.

Examples

First install it:

npm install -g bontan

bontan

For a full text version of the page:

bontan 'https://en.wikipedia.org/wiki/Pomelo'

Images will have their src printed.

kinkan

Like Bontan, but with less output - typically it will use the page title, the first image, and the first p element.

kinkan 'https://en.wikipedia.org/wiki/Pomelo'

Pomelo - Wikipedia, the free encyclopedia
/wikipedia/commons/thumb/1/1c/Citrus_grandis_-_Honey_White.jpg/220px-Citrus_grandis_-_Honey_White.jpg
Citrus maxima (or Citrus grandis), (Common names: shaddick,[1] pomelo, pummelo, pommelo, pamplemousse, or shaddok) is a natural (non-hybrid) citrus fruit, with the appearance of a big grapefruit, native to South and Southeast Asia.
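To make the "title, first image, first p element" behavior concrete, here is a minimal sketch in Python using only the standard library. It is not kinkan's actual implementation; the names `SummaryParser` and `summarize` are hypothetical, and the real tool's site-specific rules and fallbacks are not modeled.

```python
from html.parser import HTMLParser


class SummaryParser(HTMLParser):
    """Collects the page title, the first image src, and the first <p> text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.first_image = None
        self.first_paragraph = ""
        self._in_title = False
        self._in_first_p = False
        self._p_done = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img" and self.first_image is None:
            # Keep only the first image that actually has a src attribute.
            self.first_image = dict(attrs).get("src")
        elif tag == "p" and not self._p_done:
            self._in_first_p = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_first_p:
            self._in_first_p = False
            self._p_done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_first_p:
            self.first_paragraph += data


def summarize(markup):
    """Return [title, first image src, first paragraph text] (hypothetical helper)."""
    parser = SummaryParser()
    parser.feed(markup)
    parser.close()
    return [parser.title.strip(), parser.first_image, parser.first_paragraph.strip()]
```

Calling `summarize` on a page's HTML string returns the three values in the same order as the sample output above.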

sudachi

For grabbing things by CSS selectors. It uses a virtual DOM (domino), which makes it comparatively fast but unable to handle content generated by JavaScript after page load. Let's try getting all the h3 elements:

sudachi 'https://en.wikipedia.org/wiki/Pomelo' h3

Possible non-hybrid pomelos[edit]
Hybrids[edit]
Personal tools
Namespaces
etc.

You can pass -r to return innerHTML instead of textContent.

sudachi -r 'https://en.wikipedia.org/wiki/Pomelo' h3

<span class="mw-headline" id="Possible_non-hybrid_pomelos">Possible non-hybrid pomelos</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Pomelo&amp;action=edit&amp;section=5" title="Edit section: Possible non-hybrid pomelos">edit</a><span class="mw-editsection-bracket">]</span></span>
<span class="mw-headline" id="Hybrids">Hybrids</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Pomelo&amp;action=edit&amp;section=6" title="Edit section: Hybrids">edit</a><span class="mw-editsection-bracket">]</span></span>
Personal tools
Namespaces
etc.
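The difference between textContent and innerHTML can be sketched in a few lines of Python. This is only an illustration, not how sudachi works internally: it handles bare tag-name selectors such as h3 rather than full CSS selectors, and the names `TagCollector` and `select` are hypothetical.

```python
from html import unescape
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects every element with a given tag name, as text or as inner HTML."""

    def __init__(self, tag, raw=False):
        # convert_charrefs=False lets raw mode keep entities such as &amp; intact.
        super().__init__(convert_charrefs=False)
        self.tag = tag
        self.raw = raw
        self.results = []
        self._depth = 0   # nesting depth inside a matching element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if self._depth and self.raw:
            self._parts.append(self.get_starttag_text())
        if tag == self.tag:
            self._depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag.
        if self._depth and self.raw:
            self._parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == self.tag and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.results.append("".join(self._parts))
                self._parts = []
                return
        if self._depth and self.raw:
            self._parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)

    def handle_entityref(self, name):
        if self._depth:
            self._parts.append(f"&{name};" if self.raw else unescape(f"&{name};"))

    def handle_charref(self, name):
        if self._depth:
            self._parts.append(f"&#{name};" if self.raw else unescape(f"&#{name};"))


def select(markup, tag, raw=False):
    """Return the text (or inner HTML when raw=True) of each matching element."""
    collector = TagCollector(tag, raw=raw)
    collector.feed(markup)
    collector.close()
    return collector.results
```

Here `raw=True` plays the role of the -r flag: nested tags and entities are kept as written instead of being flattened to text.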

mikan

This attempts to replicate some of the magic of import.io using a simple trick - usually, the most interesting list on a page is the longest one. Here's what happens when you point it at Stack Overflow:

mikan 'http://stackoverflow.com'
   0 votes   1 answer   4 views     Unique index not working  ruby-on-rails unique-constraint database-indexes   answered 1 min ago Thong Kuah 1,960   
   0 votes   0 answers   2 views     fprintf giving me a blank .txt file in MATLAB  matlab   asked 1 min ago physicist82 1   
   0 votes   0 answers   11 views     Node.js / Inheritance of variables and modules  javascript node.js inheritance   modified 1 min ago MiddleWare 138   
   5 votes   2 answers   30 views     Global Events in Angular 2  angular2   modified 1 min ago pixelbits 14.5k   
   0 votes   1 answer   5 views     Running “mvn test site” giving [ERROR] Failed to execute goal org.apache.maven.plugins:maven-site-plugin:3.3:site (default-site) on project  maven selenium xslt   modified 1 min ago Tunaki 29k   
 etc.

Or Hacker News:

mikan 'http://news.ycombinator.com'
 1.      The Trouble with the TPP, Day 5: Rights Holders “Shall” vs. Users “May” (michaelgeist.ca)
 46 points by walterbell 3 hours ago  | discuss

 2.      Tesla Model S can now park itself (techcrunch.com)
 151 points by prostoalex 6 hours ago  | 89 comments

 3.      Nvidia GPUs can break Chrome's incognito mode (charliehorse55.wordpress.com)
 374 points by charliehorse55 11 hours ago  | 120 comments

 etc.
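The "longest list" trick can be sketched as follows. This is one possible reading of the heuristic, not mikan's actual code: it treats any parent whose children mostly share one tag as a "list", picks the parent with the most such children, and prints nothing itself. The names `Node`, `TreeBuilder`, and `longest_list` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    def __init__(self, tag):
        self.tag = tag
        self.children = []  # Node objects and text strings, in document order

    def elements(self):
        return [c for c in self.children if isinstance(c, Node)]

    def text_content(self):
        return "".join(c if isinstance(c, str) else c.text_content()
                       for c in self.children)


class TreeBuilder(HTMLParser):
    """Builds a very small element tree; enough for counting siblings."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self._stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_startendtag(self, tag, attrs):
        self._stack[-1].children.append(Node(tag))

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this tag; ignore stray end tags.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].tag == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        self._stack[-1].children.append(data)


def all_nodes(node):
    yield node
    for child in node.elements():
        yield from all_nodes(child)


def longest_list(markup):
    """Return the text of each item in the largest group of same-tag siblings."""
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    best_items = []
    for node in all_nodes(builder.root):
        counts = Counter(child.tag for child in node.elements())
        if not counts:
            continue
        tag, _ = counts.most_common(1)[0]
        items = [c for c in node.elements() if c.tag == tag]
        if len(items) > len(best_items):
            best_items = items
    # Collapse runs of whitespace so each item fits on one line.
    return [" ".join(item.text_content().split()) for item in best_items]
```

On a page with a two-item navigation list and a longer list of posts, `longest_list` returns the posts, which is the behavior the examples above rely on.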

License

WTFPL, do as you please.

-POLM