html-select

v2.3.24

match a tokenized html stream with css selectors

Downloads

60,998

html-select

match and splice a tokenized html stream with css selectors

example

readable stream

Given a tokenized stream from html-tokenize, this program will print the dt tags matching the selector 'ul > li dt':

var select = require('html-select');
var tokenize = require('html-tokenize');
var fs = require('fs');

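// Fire the callback once for every element that matches the selector.
var s = select('ul > li dt', function (e) {
    console.log('*** MATCH ***');
    // Read the matched element's rows, including its opening and closing tags.
    e.createReadStream().on('data', function (row) {
        console.log([ row[0], row[1].toString() ]);
    });
});
fs.createReadStream(__dirname + '/page.html').pipe(tokenize()).pipe(s);
s.resume();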

The s.resume() call is necessary to put the stream into flowing mode, since we aren't doing anything with the output of s.
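
To see what flowing mode means without html-select installed, the sketch below uses Node's built-in stream.PassThrough as a stand-in for s. That substitution is an assumption made for illustration; the real s comes from select(). The readableFlowing property reports whether a readable stream is currently flowing.

```javascript
var PassThrough = require('stream').PassThrough;

// Stand-in for the selector stream `s`; this is not html-select itself.
var standIn = new PassThrough({ objectMode: true });

// No 'data' listener, no pipe() and no resume() yet, so nothing is flowing.
var before = standIn.readableFlowing;

// resume() switches the stream into flowing mode; rows nobody reads are dropped.
standIn.resume();
var after = standIn.readableFlowing;

console.log(before, after);
```

Without that call, a stream whose output is never consumed can stop accepting input once its internal buffer fills up.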

Now this html input:

<html>
  <head>
    <title>presentation examples</title>
  </head>
  <body>
    <h1>hello there!</h1>
    <p>
      This presentation contains these examples:
    </p>
    
    <ul>
      <li>
        <dt>browserify</dt>
        <dd>node-style <code>require()</code> in the browser</dd>
      </li>
      
      <li>
        <dt>streams</dt>
        <dd>shuffle data around with backpressure</dd>
      </li>
      
      <li>
        <dt>ndarray</dt>
        <dd>n-dimensional matricies on top of typed arrays</dd>
      </li>
      
      <li>
        <dt>music</dt>
        <dd>make music with code</dd>
      </li>
      
      <li>
        <dt>voxeljs</dt>
        <dd>make minecraft-style games in webgl</dd>
      </li>
      
      <li>
        <dt>trumpet</dt>
        <dd>transform html with css selectors and streams</dd>
      </li>
    </ul>
  </body>
</html>

gives this output:

*** MATCH ***
[ 'open', '<dt>' ]
[ 'text', 'browserify' ]
[ 'close', '</dt>' ]
*** MATCH ***
[ 'open', '<dt>' ]
[ 'text', 'streams' ]
[ 'close', '</dt>' ]
*** MATCH ***
[ 'open', '<dt>' ]
[ 'text', 'ndarray' ]
[ 'close', '</dt>' ]
*** MATCH ***
[ 'open', '<dt>' ]
[ 'text', 'music' ]
[ 'close', '</dt>' ]
*** MATCH ***
[ 'open', '<dt>' ]
[ 'text', 'voxeljs' ]
[ 'close', '</dt>' ]
*** MATCH ***
[ 'open', '<dt>' ]
[ 'text', 'trumpet' ]
[ 'close', '</dt>' ]

transform

Using the same html file from the previous example, this script converts everything inside dt elements to uppercase:

var select = require('html-select');
var tokenize = require('html-tokenize');
var through = require('through2');
var fs = require('fs');

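var s = select('dt', function (e) {
    // Uppercase the html of every row that passes through.
    var tr = through.obj(function (row, buf, next) {
        this.push([ row[0], String(row[1]).toUpperCase() ]);
        next();
    });
    // Read the element's existing rows into tr and write tr's output back in their place.
    tr.pipe(e.createStream()).pipe(tr);
});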

fs.createReadStream(__dirname + '/page.html')
    .pipe(tokenize())
    .pipe(s)
    .pipe(through.obj(function (row, buf, next) {
        this.push(row[1]);
        next();
    }))
    .pipe(process.stdout)
;

Running the transform program yields this html output:

<html>
  <head>
    <title>presentation examples</title>
  </head>
  <body>
    <h1>hello there!</h1>
    <p>
      This presentation contains these examples:
    </p>
    
    <ul>
      <li>
        <DT>BROWSERIFY</DT>
        <dd>node-style <code>require()</code> in the browser</dd>
      </li>
      
      <li>
        <DT>STREAMS</DT>
        <dd>shuffle data around with backpressure</dd>
      </li>
      
      <li>
        <DT>NDARRAY</DT>
        <dd>n-dimensional matricies on top of typed arrays</dd>
      </li>
      
      <li>
        <DT>MUSIC</DT>
        <dd>make music with code</dd>
      </li>
      
      <li>
        <DT>VOXELJS</DT>
        <dd>make minecraft-style games in webgl</dd>
      </li>
      
      <li>
        <DT>TRUMPET</DT>
        <dd>transform html with css selectors and streams</dd>
      </li>
    </ul>
  </body>
</html>
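
The row-level work in the transform script can be tried without html-select installed. The sketch below is an approximation under stated assumptions: it swaps through2 for Node's built-in stream.Transform, feeds in hand-written rows shaped like the [ type, html ] pairs printed in the first example, and uses a hypothetical helper named upperRow that is not part of any library.

```javascript
var Transform = require('stream').Transform;

// Hypothetical helper: uppercase the html of one [ type, html ] row.
function upperRow(row) {
    return [ row[0], String(row[1]).toUpperCase() ];
}

// Built-in counterpart of the through.obj(...) transform used in the script.
var tr = new Transform({
    objectMode: true,
    transform: function (row, enc, next) {
        this.push(upperRow(row));
        next();
    }
});

tr.on('data', function (row) {
    console.log(row);
});

// Hand-written rows standing in for one matched dt element.
tr.write([ 'open', Buffer.from('<dt>') ]);
tr.write([ 'text', Buffer.from('streams') ]);
tr.write([ 'close', Buffer.from('</dt>') ]);
tr.end();
```

Because the html of every row is uppercased, including the rows for the opening and closing tags, the tag names change as well, which is why the output shows DT rather than dt.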

methods

var select = require('html-select')

var sel = select(selector, cb)

Create a new html selector transform stream sel.

sel expects tokenized html objects as input and writes tokenized html objects as output.
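
As a rough illustration of what a tokenized html object looks like, the example output earlier prints each row as a two-element array: a type ('open', 'text' or 'close') followed by the html for that token, which the examples convert with toString(). The sketch below builds such rows by hand, assuming the second element is a Buffer, and joins them back into html. The rowsToHtml helper is hypothetical and not part of html-select; real rows would come from html-tokenize.

```javascript
// Hypothetical helper: concatenate the html part of each row.
function rowsToHtml(rows) {
    return rows.map(function (row) {
        return String(row[1]);
    }).join('');
}

// Hand-written rows in the [ type, html ] shape shown in the example output.
var rows = [
    [ 'open', Buffer.from('<dt>') ],
    [ 'text', Buffer.from('browserify') ],
    [ 'close', Buffer.from('</dt>') ]
];

var html = rowsToHtml(rows);
console.log(html);
```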

If selector and cb are given, sel.select(selector, cb) is called automatically.

sel.select(selector, cb)

Register a callback cb(elem) to fire whenever the css selector string matches.

elem.createReadStream(opts)

Create a readable object mode stream at the selector. The readable stream contains all the matching tokenized html objects including the element that matched and its closing tag.

If opts.inner is true, only read the inner content. Otherwise read the outer content.
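
One way to picture the difference between inner and outer content is on a hand-written list of rows for a single matched element. This is only a model of the behavior described above, built around a hypothetical innerRows helper; it does not call html-select.

```javascript
// Rows for one matched element, shaped like the example output: [ type, html ].
// This full list corresponds to the outer content.
var outer = [
    [ 'open', '<dt>' ],
    [ 'text', 'trumpet' ],
    [ 'close', '</dt>' ]
];

// Hypothetical helper: drop the element's own opening and closing tags.
function innerRows(rows) {
    return rows.slice(1, -1);
}

var inner = innerRows(outer);
console.log(inner);
```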

elem.createWriteStream(opts)

Create a writable object mode stream at the selector. The writable stream writes into the document stream at the selector, replacing the existing content.

If opts.inner is true, only write to the inner content. Otherwise write to the outer content.

elem.createStream(opts)

Create a duplex object mode stream at the selector. The writable side will write into the document stream at the selector, replacing the existing content. The readable side contains the existing content.

If opts.inner is true, only read and write to the inner content. Otherwise read and write to the outer content.

elem.setAttribute(key, value)

Set an attribute named by key to value.

If value is true, the attribute will appear without an equal sign in the markup.
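
The effect of a true value on the markup can be sketched with a small serializer. The renderOpenTag function below is hypothetical and written only for illustration; it is not html-select's own code, and it skips attribute escaping for brevity.

```javascript
// Hypothetical serializer for an opening tag.
function renderOpenTag(name, attrs) {
    var parts = Object.keys(attrs).map(function (key) {
        var value = attrs[key];
        // A value of true renders as a bare attribute with no equal sign.
        return value === true ? key : key + '="' + value + '"';
    });
    return '<' + [ name ].concat(parts).join(' ') + '>';
}

var tag = renderOpenTag('input', { type: 'checkbox', checked: true });
console.log(tag);
```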

elem.removeAttribute(key)

Remove an attribute named by key.

elem.getAttribute(key)

Return an object with a single attribute value named by key.

elem.getAttributes()

Return an object with all attributes.

properties

elem.name

The string name of the tag.

events

elem.on('close', function () {})

When a matched element is closed for reading and writing, this event fires.

usage

usage: html-select SELECTOR OPTIONS

  Given a newline-separated json stream of html tokenize output on stdin,
  print content below matching html tokens as json on stdout.

OPTIONS are:

  -r, --raw   Instead of printing html token data as json, print the html
              directly.
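
The input format and the effect of --raw can be approximated in a few lines. This sketch assumes each input line is a JSON array of the form [type, html], matching the rows shown in the examples above. The ndjsonToRaw helper is hypothetical, is not the command's implementation, and performs no selector matching.

```javascript
// Hypothetical helper: turn newline-separated JSON token rows into raw html.
function ndjsonToRaw(text) {
    return text.split('\n')
        .filter(function (line) { return line.trim() !== ''; })
        .map(function (line) { return JSON.parse(line)[1]; })
        .join('');
}

// Assumed input shape: one JSON array per line.
var input = [
    '["open","<dt>"]',
    '["text","music"]',
    '["close","</dt>"]'
].join('\n') + '\n';

var raw = ndjsonToRaw(input);
console.log(raw);
```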

supported css selectors

Internally html-select uses cssauron, so the supported selectors are the ones cssauron supports.

install

With npm do:

npm install html-select

to get the library or

npm install -g html-select

to get the command-line program.

license

MIT