htmltoolbox

v1.2.3

HTMLToolbox

Search and Modification Engine for HTML Documents

HTMLToolbox is a set of tools for working with HTML documents. It does not rely on a browser, so it also works in a Node.js environment. It can:

  • Convert an HTML document to its rendered text, staying as close as possible to what a browser renders. For example, <b>HTML</b>Toolbox is converted to HTMLToolbox, and <input value='HTML'>Toolbox is rendered as HTML Toolbox. Less common tags such as <q> are also supported: <q>HTML</q>Toolbox is rendered as "HTML"Toolbox. Style sheets and scripts are detected automatically and excluded from the rendered text.

  • Search the text version using strings or regular expressions, and modify the matches. A search can also simply traverse the document tree. Modifications can be applied in batch or in a loop. Inside the loop you can inspect tag names, classes, and other attribute values using JavaScript, so you are not limited to, for example, CSS queries. The available modification functions are:

      • Remove: remove the matches.
      • Insert: insert HTML before or after the matches.
      • Replace: replace the matches; regular expression $ notation is also supported.
      • Wrap: wrap the matches in any kind of HTML, not just a simple tag.
      • Change Tags: change the tags of the matches.
      • Set Attributes: change the HTML attributes of the matches.

  • Minify or prettify and indent the HTML document, and automatically fix simple errors.

  • Because HTMLToolbox does not use a browser, it is not exposed to script injection while processing documents.

  • It is customizable: every aspect of it can be modified, all the predefined tags can be changed, and new ones can be added on the fly. There are also a number of configuration options.

  • HTMLToolbox is well tested, well documented, and fast.
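To make the "rendered text" idea from the first bullet concrete, here is a deliberately naive sketch in plain JavaScript. It is not how HTMLToolbox works (the library parses the document into a tree), and the function name naiveRenderedText is hypothetical. It only reproduces three behaviors described above: inline tags disappear, <q> becomes quotation marks, and scripts and style sheets are dropped. It does not handle cases such as <input value='HTML'>.

```javascript
// Hypothetical illustration only: a regex-based approximation of
// "rendered text". HTMLToolbox itself uses a real parser.
function naiveRenderedText(html) {
  return html
    // Drop <script> and <style> elements together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Render <q> and </q> as quotation marks.
    .replace(/<\/?q\b[^>]*>/gi, '"')
    // Remove every remaining tag, keeping the text between tags.
    .replace(/<[^>]+>/g, "");
}

console.log(naiveRenderedText("<b>HTML</b>Toolbox")); // HTMLToolbox
console.log(naiveRenderedText("<q>HTML</q>Toolbox")); // "HTML"Toolbox
```

A regex approach like this breaks on malformed or nested markup, which is one reason to use a tree-based tool instead.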

Installation

The latest source code of HTMLToolbox can be found at https://github.com/arashkazemi/htmltoolbox

To use it in other Node.js projects, install HTMLToolbox from the public npm registry:

    npm install htmltoolbox  

and then import it using

    const HTMLToolbox = require("htmltoolbox");

To use in a webpage, download the source code and extract it. The minified script itself is available in the /dist directory.

Alternatively, it is available via the unpkg CDN and can be included in HTML files using

    <script src="https://unpkg.com/htmltoolbox/dist/htmltoolbox.min.js"></script>

Usage

See the documentation homepage and also the class documentation for all the available methods and many more examples.

Here are a few explained examples:

Replace Example

To replace all occurrences of a regex or string in an HTML document, first create an instance of HTMLToolbox for the document string:

    let doc = "<!DOCTYPE html>this is a tesss<b><b></b></b>sssst!</div>";
    let htb = new HTMLToolbox(doc);

and then

    htb.replaceAll(/te(s+)t/gm, "<div>document</div>");

and to get the result:

    console.log(htb.getHTML(null));

which would be:

    <!DOCTYPE html>this is a <div>document</div>!

If you want fine control over each occurrence, you can iterate over the results:

    for (const val of htb.search(/te(s+)t/gm)) {
        // replace() applies to the match currently being iterated
        htb.replace("<div>document</div>");
    }

The val object contains the match along with its start and end positions in the input string. It also contains a start node and an end node in the parsed HTML tree, so you can traverse the tree and replace accordingly.
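The shape of such a result object can be sketched in plain JavaScript, without the library. The helper searchText and the field names start and end are hypothetical (the README only confirms val.match), and the tree nodes are left out because they depend on HTMLToolbox's parser.

```javascript
// Hypothetical helper: yields one result object per regex match in plain text.
// The field names `start` and `end` are assumptions made for illustration.
function* searchText(text, regex) {
  // matchAll requires a regex with the global (g) flag.
  for (const match of text.matchAll(regex)) {
    const start = match.index;            // offset of the first matched character
    const end = start + match[0].length;  // offset just past the last matched character
    yield { match, start, end };
  }
}

const results = [...searchText("a test and a tessst", /te(s+)t/gm)];
for (const { match, start, end } of results) {
  // match[0] is the whole match, match[1] is the first capture group
  console.log(match[0], match[1], start, end);
}
// prints "test s 2 6" and then "tessst sss 13 19"
```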

To get the output, use getHTML. It takes the indentation string as its argument, which defaults to \t; passing null disables indentation.

    htb.getHTML(null);

To get the flattened text use:

    htb.getText();

which in this case would be:

    this is a document.

To use regex match groups:

    let doc = "<div>1 and 2 and 3 and 4</div>";
    let htb = new HTMLToolbox(doc);

    for (const val of htb.search(/(\d) and/gm)) {
        const m = val.match;  // m[1] holds the first capture group
        htb.replace(`${m[1]} or`);
    }

    console.log(htb.getHTML(null));

and the result would be:

    <!DOCTYPE html><div>1 or 2 or 3 or 4</div>

If you want to replace all occurrences in a single call, use replaceAll, as in the first replace example above.

Note that an HTMLToolbox instance is created for a document so that consecutive changes can be applied before the result is retrieved.
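The feature list notes that replace also supports the regular expression $ notation. The sketch below shows what that notation does in standard JavaScript String.prototype.replace on a plain string. That HTMLToolbox's replaceAll accepts the same $1 syntax in its replacement string is an assumption based on that note and has not been verified here.

```javascript
// Plain JavaScript, no HTMLToolbox involved: $1 in the replacement string
// refers to the first capture group of each match.
const text = "1 and 2 and 3 and 4";
const replaced = text.replace(/(\d) and/gm, "$1 or");

console.log(replaced); // prints "1 or 2 or 3 or 4"

// Assumed, unverified equivalent with the library:
//   htb.replaceAll(/(\d) and/gm, "$1 or");
```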

Wrap Example

The wrap function receives the wrap envelope as a string and recognizes <!/> as a placeholder for the nodes being wrapped. The envelope string is parsed as an HTML fragment before it is applied, so errors in it won't break the overall document structure, but they may lead to unexpected results.

    let doc = "<div>1 and 2 and 3 and 4</div>";
    let htb = new HTMLToolbox(doc);

    htb.wrapAll(/\d/gm, "<span>Number <!/></span>");

or, for fine control over each occurrence:

    for (const val of htb.search(/\d/gm)) {
        htb.wrap(`<span>Number <!/></span>`);
    }

    console.log(htb.getHTML(null));

and the result would be:

    <div><span>Number 1</span> and <span>Number  2</span> and <span>Number  3</span> and <span>Number  4</span></div>
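The role of the <!/> placeholder can be illustrated with a small string-level sketch in plain JavaScript. The helper wrapMatches is hypothetical and works on flat text only; HTMLToolbox operates on parsed nodes, so this shows the idea of the envelope, not the library's implementation.

```javascript
const PLACEHOLDER = "<!/>";

// Hypothetical helper: wraps every regex match in `text` with `envelope`,
// substituting the matched text for the first placeholder in the envelope.
function wrapMatches(text, regex, envelope) {
  return text.replace(regex, (matched) =>
    // A function replacer avoids special handling of "$" in the matched text.
    envelope.replace(PLACEHOLDER, () => matched)
  );
}

const wrapped = wrapMatches(
  "1 and 2 and 3 and 4",
  /\d/gm,
  "<span>Number <!/></span>"
);
console.log(wrapped);
// prints "<span>Number 1</span> and <span>Number 2</span> and <span>Number 3</span> and <span>Number 4</span>"
```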

See the class documentation for all other methods, features, and also the details of the data structures.


Copyright (C) 2023-2024 Arash Kazemi. All rights reserved.

The HTMLToolbox project is subject to the terms of the BSD-2-Clause License. See the LICENSE file for more details.