epg-grabber

v0.36.1

Node.js CLI tool for grabbing EPG from different sites

Downloads

5,465

Readme

EPG Grabber

Node.js CLI tool for grabbing EPG from different websites.

Installation

npm install -g epg-grabber

Quick Start

epg-grabber --config=example.com.config.js

example.com.config.js

module.exports = {
  site: 'example.com',
  channels: 'example.com.channels.xml',
  url: function (context) {
    const { date, channel } = context

    return `https://api.example.com/${date.format('YYYY-MM-DD')}/channel/${channel.site_id}`
  },
  parser: function (context) {
    const programs = JSON.parse(context.content)

    return programs.map(program => {
      return {
        title: program.title,
        start: program.start,
        stop: program.stop
      }
    })
  }
}

example.com.channels.xml

<?xml version="1.0" ?>
<channels site="example.com">
  <channel site_id="cnn-23" xmltv_id="CNN.us">CNN</channel>
</channels>

Example Output

<tv>
  <channel id="CNN.us">
    <display-name>CNN</display-name>
    <url>https://example.com</url>
  </channel>
  <programme start="20211116040000 +0000" stop="20211116050000 +0000" channel="CNN.us">
    <title lang="en">News at 10PM</title>
  </programme>
  <!-- ... -->
</tv>

CLI

epg-grabber --config=example.com.config.js

Arguments:

  • -c, --config: path to config file
  • -o, --output: path to output file or path template (example: guides/{site}.{lang}.xml; default: guide.xml)
  • --channels: path to list of channels; you can also use a wildcard to specify the path to multiple files at once (example: example.com_*.channels.xml)
  • --lang: set default language for all programs (default: en)
  • --days: number of days for which to grab the program (default: 1)
  • --delay: delay between requests in milliseconds (default: 3000)
  • --timeout: set a timeout for each request in milliseconds (default: 5000)
  • --max-connections: set a limit on the number of concurrent requests per site (default: 1)
  • --cache-ttl: maximum time for storing each request in milliseconds (default: 0)
  • --gzip: compress the output (default: false)
  • --debug: enable debug mode (default: false)
  • --curl: display the current request as a cURL command (default: false)
  • --log: path to log file (optional)
  • --log-level: set the log level (default: info)
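
To make the path template accepted by --output more concrete, here is a minimal sketch of how placeholders such as {site} and {lang} could be expanded. The helper expandOutputPath is hypothetical and is not part of epg-grabber; it only illustrates the substitution idea, and the tool's actual behavior may differ.

```javascript
// Hypothetical helper, not part of epg-grabber: replaces {name} placeholders
// with values from `vars` and leaves unknown placeholders untouched.
function expandOutputPath(template, vars) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
  )
}

console.log(expandOutputPath('guides/{site}.{lang}.xml', { site: 'example.com', lang: 'en' }))
// prints "guides/example.com.en.xml"
```

With a template like this, each site and language combination would end up in its own guide file instead of a single guide.xml.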

Site Config

module.exports = {
  site: 'example.com', // site domain name (required)
  output: 'example.com.guide.xml', // path to output file or path template (example: 'guides/{site}.{lang}.xml'; default: 'guide.xml')
  channels: 'example.com.channels.xml', // path to list of channels; you can also use an array to specify the path to multiple files at once (example: ['channels1.xml', 'channels2.xml']; required)
  lang: 'fr', // default language for all programs (default: 'en')
  days: 3, // number of days for which to grab the program (default: 1)
  delay: 5000, // delay between requests in milliseconds (default: 3000)
  maxConnections: 200, // limit on the number of concurrent requests (default: 1)

  request: { // request options (details: https://github.com/axios/axios#request-config)

    method: 'GET',
    timeout: 5000,
    proxy: {
      protocol: 'https',
      host: '127.0.0.1',
      port: 9000,
      auth: {
        username: 'mikeymike',
        password: 'rapunz3l'
      }
    },
    cache: { // cache options (details: https://axios-cache-interceptor.js.org/#/pages/per-request-configuration)
      ttl: 60 * 1000 // 60s
    },

    /**
     * @param {object} context
     *
     * @return {object} The function should return headers for each request (optional)
     */
    headers: function(context) {
      return {
        'User-Agent':
          'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36 Edg/79.0.309.71'
      }
    },

    /**
     * @param {object} context
     *
     * @return {object} The function should return data for each request (optional)
     */
    data: function(context) {
      const { channel, date } = context

      return {
        channels: [channel.site_id],
        dateStart: date.format('YYYY-MM-DDT00:00:00-00:00'),
        dateEnd: date.add(1, 'd').format('YYYY-MM-DDT00:00:00-00:00')
      }
    }
  },

  /**
   * @param {object} context
   *
   * @return {string} The function should return URL of the program page for the channel
   */
  url: function (context) {
    return `https://example.com/${context.date.format('YYYY-MM-DD')}/channel/${context.channel.site_id}.html`
  },

  /**
   * @param {object} context
   *
   * @return {string} The function should return URL of the channel logo (optional)
   */
  logo: function (context) {
    return `https://example.com/logos/${context.channel.site_id}.png`
  },

  /**
   * @param {object} context
   *
   * @return {array} The function should return an array of programs with their descriptions
   */
  parser: function (context) {

    // content parsing...

    return [
      {
        title, // program title (required)
        start, // start time of the program (required)
        stop, // end time of the program (required)
        sub_title, // program sub-title (optional)
        description, // description of the program (optional)
        category, // type of program (optional)
        season, // season number (optional)
        episode, // episode number (optional)
        date, // the date the programme or film was finished (optional)
        icon, // image associated with the program (optional)
        rating, // program rating (optional)
        director, // the name of director (optional)
        actor, // the name of actor (optional)
        writer, // the name of writer (optional)
        adapter, // the name of adapter (optional)
        producer, // the name of producer (optional)
        composer, // the name of composer (optional)
        editor, // the name of editor (optional)
        presenter, // the name of presenter (optional)
        commentator, // the name of commentator (optional)
        guest // the name of guest (optional)
      },
      // ...
    ]
  }
}
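
As a rough illustration of the parser contract described in the config above, the sketch below maps a JSON response to program objects and fills in two of the optional fields. The response shape (items, name, begin, end, summary, genre) is hypothetical and not taken from any real site; adapt it to whatever the target site returns.

```javascript
// Sketch only: the response shape used here is an assumed example,
// not the API of a real site.
const config = {
  site: 'example.com',
  channels: 'example.com.channels.xml',
  url(context) {
    const { date, channel } = context
    return `https://api.example.com/${date.format('YYYY-MM-DD')}/channel/${channel.site_id}`
  },
  parser(context) {
    // Return an empty list for empty or malformed responses
    // instead of throwing.
    let data
    try {
      data = JSON.parse(context.content)
    } catch (err) {
      return []
    }
    const items = data && Array.isArray(data.items) ? data.items : []

    return items
      // Skip entries missing any of the three required fields.
      .filter(item => item.name && item.begin && item.end)
      .map(item => ({
        title: item.name,
        start: item.begin,
        stop: item.end,
        description: item.summary, // optional
        category: item.genre // optional
      }))
  }
}

module.exports = config
```

Filtering before mapping keeps incomplete entries out of the result, so every returned program carries the required title, start and stop values.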

Context Object

From each function in the site config you can access a context object containing the following data:

  • channel: The object describing the current channel (xmltv_id, site_id, name, lang)
  • date: The 'dayjs' instance with the requested date
  • content: The response data as a String
  • buffer: The response data as an ArrayBuffer
  • headers: The response headers
  • request: The request config
  • cached: A boolean to check whether this request was cached or not
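
The fields listed above can be combined as in the following sketch, which reads channel and date when building the URL and content and cached when parsing. The URL layout, the fallback to 'en' and the JSON array response are assumptions made for illustration only.

```javascript
// Sketch only: the URL layout and JSON shape are assumed for illustration.
const siteConfig = {
  site: 'example.com',
  channels: 'example.com.channels.xml',
  url({ channel, date }) {
    // `date` is a dayjs instance; `channel` carries xmltv_id, site_id, name and lang.
    const lang = channel.lang || 'en'
    return `https://example.com/${lang}/${date.format('YYYY-MM-DD')}/${channel.site_id}.json`
  },
  parser({ content, channel, cached }) {
    // `cached` tells whether the response came from the cache.
    if (cached) {
      console.log(`Using cached response for ${channel.xmltv_id}`)
    }
    if (!content) return []

    // Keep only the three required fields from each entry.
    return JSON.parse(content).map(({ title, start, stop }) => ({ title, start, stop }))
  }
}

module.exports = siteConfig
```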

Channels List

<?xml version="1.0" ?>
<channels site="example.com">
  <channel site_id="cnn-23" xmltv_id="CNN.us">CNN</channel>
  <!-- ... -->
</channels>

You can also specify the language, site and logo for each channel individually, like so:

<channel
  site="example.com"
  site_id="france-24"
  xmltv_id="France24.fr"
  lang="fr"
  logo="https://example.com/france24.png"
>France 24</channel>

How to use a SOCKS proxy?

First, you need to install socks-proxy-agent:

npm install socks-proxy-agent

Then you can use it to create an agent that acts as a SOCKS proxy. Here is an example of how to do it with the Tor SOCKS proxy:

const { SocksProxyAgent } = require('socks-proxy-agent')

const torProxyAgent = new SocksProxyAgent('socks://localhost:9050')

module.exports = {
  site: 'example.com',
  url: 'https://example.com/epg.json',
  request: {
    httpsAgent: torProxyAgent,
    httpAgent: torProxyAgent
  },
  parser(context) {
    // ...
  }
}

Contribution

If you find a bug or want to contribute to the code or documentation, you can help by submitting an issue or a pull request.

License

MIT