@easyscrape/core

v0.0.2

EasyScrape is a Node.js module designed to be integrated into your web scraping project. You describe the information you want as a JSON object and get organized data back from the web, much as a REST API would give you.

EasyScrapeCore

PROJECT IN DEVELOPMENT!

DO NOT USE IT IN PRODUCTION.

EasyScrape is a mediator that makes web scraping with JavaScript and TypeScript easier. With EasyScrape you can extract data from a website as if it were an API. This package is the core of all middlewares based on the EasyScrape scraping method, and this document explains how to build your own middleware on top of it. If you only want to use EasyScrape, this is not the documentation you need; pick the EasyScrape implementation from the list below that fits your requirements.

EasyScrape For:

  • Cheerio: scrapes HTML documents.
  • Puppeteer: scrapes or controls a browser such as Chromium or any other browser supported by Puppeteer (coming soon).

Documentation Links

How can EasyScrape help you?

EasyScrape can scrape a page and give you exactly the information you want, in the shape you need.

Installation

Use one of these commands to install the EasyScrape module in your project.

# if you use npm
npm install @easyscrape/core

# or yarn
yarn add @easyscrape/core

How can I use it?

It's very easy! Just import the Node.js module that implements EasyScrapeCore for your favorite scraping library, like this:

// Example using EasyScrape for Cheerio
const EasyScrape = require('@easyscrape/cheerio');

Then load your HTML with the module you chose for your project. Suppose you have HTML like this:

<nav id="ShoppingList">
    <ul id="fruits">
        <li class="apple">Apple</li>
        <li class="orange">Orange</li>
        <li class="pear">Pear</li>
    </ul>
    <ul id="meats">
        <li class="pork">Pork meat</li>
        <li class="beef">Beef</li>
        <li class="chicken">Chicken</li>
    </ul>
</nav>

If you are using Cheerio, you can do this.

let $ = EasyScrape.load('<nav id="ShoppingList">...</nav>');

let data = $({
    fruits: {
        _each_: '#fruits li', // For each "li" element inside the element with id "fruits"...
        _text: true // ...get its inner text
    },
    meats: {
        _each_: '#meats li', // For each "li" element inside the element with id "meats"...
        _text: true // ...get its inner text
    }
});

The variable "data" contains:

{
    fruits: [
        'Apple',
        'Orange',
        'Pear'
    ],
    meats: [
        'Pork meat',
        'Beef',
        'Chicken'
    ]
}
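
Because the result is a plain object, it can be typed and consumed like any JSON API response. The sketch below is illustrative only: the `ShoppingData` interface and the `countItems` helper are hypothetical names that are not part of EasyScrape, and the object literal simply repeats the result shown above instead of calling the library.

```typescript
// Hypothetical shape of the result shown above; not exported by EasyScrape.
interface ShoppingData {
    fruits: string[];
    meats: string[];
}

// Stand-in for the value returned by $({...}) in the example above.
const shoppingData: ShoppingData = {
    fruits: ['Apple', 'Orange', 'Pear'],
    meats: ['Pork meat', 'Beef', 'Chicken']
};

// Hypothetical helper: count the scraped items across both lists.
function countItems(result: ShoppingData): number {
    return result.fruits.length + result.meats.length;
}

// Merge both lists into one, fruits first.
const allItems: string[] = shoppingData.fruits.concat(shoppingData.meats);
const totalItems: number = countItems(shoppingData); // 6
```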

How can I create my own EasyScrape Middleware?

It's very easy to create your own implementation. Follow the steps below.

Step 0: Preparations

  • THE DOCUMENTATION IS UNDER DEVELOPMENT RIGHT NOW!

  • Remember that the documentation website (coming soon) is your friend! If you don't know what a method is for, or you need an example, look in the technical documentation. This is only a quickstart guide.
  • If you use Visual Studio Code, you can see the technical documentation for each method by hovering over the method name.
  • You can read the JSDoc comments on each method or class.
  • Another way to help yourself is to read the Cheerio implementation.
  • Let's start!

Step 1: Installation

Install EasyScrapeCore in your Project.

Step 2: Main File

Create a main file for your implementation, using the following structure.

// File: ./MyFirstMiddlewareEasyScrape.ts
import MyMiddlewareESQueriesManager from './MyMiddlewareESQueriesManager';
import {AbstractEasyScrapeMiddleware, IESObject, IESQuery} from '@easyscrape/core';

class MyFirstMiddlewareEasyScrape extends AbstractEasyScrapeMiddleware{
    /**
     * Middleware Information
     */
    SupportFor = {
        LibraryName: 'Cheerio', // Name of the library that your middleware uses
        PackageName: 'cheerio' // NPM package name
    };

    /** 
     * Your Middleware Queries Manager
     */
    protected QueriesManager: MyMiddlewareESQueriesManager = new MyMiddlewareESQueriesManager(this);

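    /** 
     * This method tells EasyScrape whether it can manage the given data
     */
    canICollect($: any): boolean {
        // Replace this placeholder with code that returns true or false
        // depending on whether your middleware can scrape the current node
        throw new Error('canICollect is not implemented yet');
    }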

    /** 
     * Your middleware's load method
     * It is expected to return a function with one parameter of the accepted types
     */
    load($: any){
        return (query: IESObject|IESQuery|string) => this.collect($, query);
    }
}
// The next line is very important: it exports the module as a single instance, which avoids creating the same middleware more than once.
export default new MyFirstMiddlewareEasyScrape();
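
The load method above follows a common closure pattern: it captures the loaded document once and returns a function that accepts queries. The following is a minimal, self-contained sketch of that pattern only. `SketchMiddleware`, `SketchDocument`, and the toy `collect` body are hypothetical stand-ins written for illustration; they do not reproduce AbstractEasyScrapeMiddleware or its real collect logic.

```typescript
// Hypothetical stand-in for a loaded document: selector -> matched texts.
type SketchDocument = Record<string, string[]>;

class SketchMiddleware {
    // Returns true only when the value looks like something this sketch can scrape.
    canICollect(doc: unknown): doc is SketchDocument {
        return typeof doc === 'object' && doc !== null && !Array.isArray(doc);
    }

    // Toy collect: looks the selector up in the in-memory document.
    protected collect(doc: SketchDocument, selector: string): string[] {
        return doc[selector] ?? [];
    }

    // Captures the document once and returns a reusable query function.
    load(doc: unknown): (selector: string) => string[] {
        if (!this.canICollect(doc)) {
            throw new TypeError('Unsupported document');
        }
        const loaded: SketchDocument = doc;
        return (selector: string) => this.collect(loaded, selector);
    }
}

// A single shared instance, mirroring the "export default new ..." line above.
const sketchMiddleware = new SketchMiddleware();

const runQuery = sketchMiddleware.load({ '#fruits li': ['Apple', 'Orange', 'Pear'] });
const fruitTexts = runQuery('#fruits li'); // ['Apple', 'Orange', 'Pear']
const meatTexts = runQuery('#meats li');   // [] because the selector is unknown
```

Returning a function from load keeps the document out of every later call, which is why the usage example can write `$({...})` without passing the HTML again.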

Step 3: Middleware's Queries Manager

The queries manager is a class that contains the instructions for handling every query the user can write. The "IESQueriesManager" interface gives you the basic queries and their information, but you can create any others you need as long as you follow these requirements:

  • All query names must begin with the prefix "_".
  • Use "$" as a wildcard in the query name to let the user customize the query.
  • You can use as many wildcards as you need.

For example: "_select$" handles statements such as "_selectFood", "_selectAllListsElements" or "_select".

// File: ./MyMiddlewareESQueriesManager.ts
import {
    AbstractESQueriesManager, 
    ESQueriesManagerUtils,
    IESQueriesManager, 
    ESFilterHandle
} from '@easyscrape/core';

class MyMiddlewareESQueriesManager 
extends AbstractESQueriesManager // defines the default EasyScrape methods; override them if you need to
implements IESQueriesManager // tells you which methods you need to create
{
    // Your methods here
}
export default MyMiddlewareESQueriesManager;
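
The wildcard rule described in this step can be sketched as a small pattern matcher. This is not EasyScrape's own implementation: `patternToRegExp` and `matchQueryName` are hypothetical helpers, and the sketch assumes that each "$" stands for zero or more characters chosen by the user, which is consistent with "_select$" also handling "_select".

```typescript
// Hypothetical helper: turn a query-name pattern such as "_select$" into a RegExp.
// Assumption: each "$" matches zero or more characters.
function patternToRegExp(pattern: string): RegExp {
    if (pattern.charAt(0) !== '_') {
        throw new Error('Query names must start with "_": ' + pattern);
    }
    const source = pattern
        .split('$')
        // Escape regex metacharacters in the literal parts of the pattern.
        .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
        // Each wildcard becomes a capture group holding the user's text.
        .join('(.*)');
    return new RegExp('^' + source + '$');
}

// Returns the text captured by each wildcard, or null when the name does not match.
function matchQueryName(pattern: string, queryName: string): string[] | null {
    const match = patternToRegExp(pattern).exec(queryName);
    return match ? match.slice(1) : null;
}

const selectFood = matchQueryName('_select$', '_selectFood'); // ['Food']
const selectBare = matchQueryName('_select$', '_select');     // ['']
const notSelect = matchQueryName('_select$', '_text');        // null
```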

Step 4: Build and Share your Middleware

Export your package by adding the following to your package.json. Please name your package @easyscrape/ followed by your middleware name, like this:

{
    "name": "@easyscrape/mymiddleware",
    "version": "1.0.0",
    "main": "./MyFirstMiddlewareEasyScrape.js",
    // ...
}
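
If you want to check the naming convention before publishing, a small test like the one below could help. It is a hypothetical sketch, not something EasyScrape provides, and it assumes that the middleware name consists of lowercase letters, digits, ".", "_" or "-".

```typescript
// Hypothetical check, not part of EasyScrape: verifies the "@easyscrape/<name>" convention.
// Assumption: the middleware name uses lowercase letters, digits, ".", "_" or "-".
const EASYSCRAPE_NAME = /^@easyscrape\/[a-z0-9][a-z0-9._-]*$/;

function followsNamingConvention(packageName: string): boolean {
    return EASYSCRAPE_NAME.test(packageName);
}

const scopedOk = followsNamingConvention('@easyscrape/mymiddleware'); // true
const unscoped = followsNamingConvention('mymiddleware');             // false
```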

Remember to add your middleware's name to the list in this repository so that everyone can find it and use it.