@mrspartak/html-ai-ready

v1.1.0

HTML AI Ready

Raw HTML is usable by AI, but it contains a huge amount of token noise, which inflates both cost and latency. One approach is to strip all the HTML tags and keep only the text, but that still leaves useless content such as the bodies of style and script tags. Another, and actually really good, approach is to use an HTML -> Markdown converter. But we can still gain a bit of performance and reduce tokens further, because AI does not really care about proper Markdown, whitespace, etc.
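To make the problem with plain tag stripping concrete, here is a minimal sketch. It is not how this library works internally; `stripTagsNaive` and `stripTagsAndNoise` are hypothetical helper names used only for illustration, and the regex-based matching is an assumption that is good enough for a demo but not for arbitrary real-world HTML.

```typescript
// Naive approach: remove anything that looks like a tag and keep the rest.
// Illustration only; this is NOT the library's implementation.
function stripTagsNaive(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ") // drop the tags, keep whatever sits between them
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

// Slightly better: remove <script> and <style> elements together with their
// contents before stripping the remaining tags.
function stripTagsAndNoise(html: string): string {
  const withoutNoise = html.replace(
    /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi,
    " "
  );
  return stripTagsNaive(withoutNoise);
}

const sample =
  "<html><head><style>p { color: red; }</style></head>" +
  "<body><p>Hello, world!</p><script>console.log('tracking');</script></body></html>";

// CSS and JavaScript source leak into the text:
console.log(stripTagsNaive(sample));
// → p { color: red; } Hello, world! console.log('tracking');

// Only the readable content is left:
console.log(stripTagsAndNoise(sample));
// → Hello, world!
```

The first call shows why stripping tags alone is not enough: the CSS rule and the script body end up in the text and cost tokens without adding context.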

This library is experimental. In my personal tests it produced answers of the same quality as plain HTML or Markdown, while using far fewer tokens and running a bit faster.

There is also a native version of the library, html-ai-ready-native, which is faster and smaller in size but still in beta.

Installation

# yarn
yarn add @mrspartak/html-ai-ready
# npm
npm i @mrspartak/html-ai-ready
# pnpm
pnpm add @mrspartak/html-ai-ready
# bun
bun add @mrspartak/html-ai-ready

Usage

import { htmlToAiReady, PRESET_QUALITY } from "@mrspartak/html-ai-ready";

const html = "<p>Hello, world!</p>";
const aiReady = htmlToAiReady(html, PRESET_QUALITY);

console.log(aiReady);

Benchmark

The main goals of this package are to be fast, to produce the smallest output in terms of tokens, and to still retain enough context to answer questions. It is compared here against a couple of other methods I have come across so far.

Output Size Comparison

pnpm benchmark

When comparing the output size across all tested pages (average percentage of original HTML size):

| Method                 | Average Size (% of original) |
| ---------------------- | ---------------------------- |
| HTML_TO_AI_FAST        | 24.69%                       |
| HTML_TO_AI_QUALITY     | 7.76%                        |
| HTML_TO_AI_NATIVE      | 12.19%                       |
| NODE_HTML_MARKDOWN     | 13.88%                       |
| CHEERIO_QUALITY_PARSED | 19.31%                       |
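For reference, a "% of original" figure like the ones above could be computed roughly as follows. This is a sketch under assumptions: the benchmark's actual code is not shown here, and whether it counts UTF-8 bytes or characters is not stated, so `utf8Size`, `sizePercent` and `averageSizePercent` are hypothetical helpers that assume bytes.

```typescript
// Size of a string in UTF-8 bytes (assumption: the benchmark may count
// characters instead; the source does not say).
function utf8Size(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Converted output size as a percentage of the original HTML size.
function sizePercent(originalHtml: string, converted: string): number {
  const original = utf8Size(originalHtml);
  if (original === 0) return 0; // avoid dividing by zero for empty pages
  return (utf8Size(converted) / original) * 100;
}

// Average of the per-page percentages over [originalHtml, converted] pairs.
function averageSizePercent(pairs: Array<[string, string]>): number {
  if (pairs.length === 0) return 0;
  const sum = pairs.reduce(
    (acc, [html, out]) => acc + sizePercent(html, out),
    0
  );
  return sum / pairs.length;
}

const pages: Array<[string, string]> = [
  ["<p>Hello</p>", "Hello"], // 5 of 12 bytes
  ["<b>ab</b>", "ab"], // 2 of 9 bytes
];

console.log(averageSizePercent(pages).toFixed(2) + "%");
```

Note that this averages the per-page percentages, which is not the same as dividing total output size by total input size; the table header suggests the former, but that is my reading rather than something the benchmark documents.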

Performance Benchmark

Performance comparison across all pages combined:

| Method                | Operations/sec | Mean time (ms) | Comparison                |
| --------------------- | -------------- | -------------- | ------------------------- |
| htmlToAiReady NATIVE  | 73.31          | 13.63          | Fastest                   |
| htmlToAiReady FAST    | 28.76          | 34.77          | 2.55x slower than NATIVE  |
| htmlToAiReady QUALITY | 15.36          | 65.09          | 4.83x slower than NATIVE  |
| cheerioParse          | 7.31           | 136.76         | 9.93x slower than NATIVE  |
| node-html-markdown    | 6.30           | 158.77         | 11.54x slower than NATIVE |
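If you want to reproduce numbers of this shape for your own converter, a very small timing harness is enough to get started. The sketch below is not the benchmark behind `pnpm benchmark`; `bench` and `timesSlower` are hypothetical helpers, and a real benchmark library would add more careful warm-up and statistics.

```typescript
interface BenchResult {
  meanMs: number; // mean wall-clock time per call, in milliseconds
  opsPerSec: number; // calls per second, derived from meanMs
}

// Time `fn` over a fixed number of iterations after a single warm-up call.
function bench(fn: () => void, iterations = 50): BenchResult {
  fn(); // warm-up, so one-off setup cost does not dominate
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const meanMs = (performance.now() - start) / iterations;
  return { meanMs, opsPerSec: meanMs > 0 ? 1000 / meanMs : Infinity };
}

// How many times slower `other` is than `baseline`, based on mean time.
function timesSlower(baseline: BenchResult, other: BenchResult): number {
  return other.meanMs / baseline.meanMs;
}

// Stand-in workload; replace it with a call to the converter under test.
const result = bench(() => {
  let text = "";
  for (let i = 0; i < 1000; i++) text += "<p>" + i + "</p>";
  text.replace(/<[^>]*>/g, " ");
});

console.log(result.meanMs.toFixed(3) + " ms/op, " + result.opsPerSec.toFixed(0) + " ops/s");
```

With the mean times from the table, `timesSlower` gives 34.77 / 13.63 ≈ 2.55 for FAST against NATIVE, in line with that row.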

AI Response Quality and Token Usage

# don't forget to add OPENAI_API_KEY to .env file first
pnpm aiq

To test real-world effectiveness, we used 3 HTML pages as context for the AI and asked questions with deterministic answers. The results show accuracy, token usage, and AI response times:

| Method                   | Accuracy       | Avg Tokens | Avg Response Time |
| ------------------------ | -------------- | ---------- | ----------------- |
| htmlToAiReadyTextQuality | 15/20 (75.00%) | 10,759     | 618.75ms          |
| cherioText               | 15/20 (75.00%) | 12,931     | 5,377.55ms        |
| nodeHtmlMarkdownText     | 14/20 (70.00%) | 27,389     | 2,099.20ms        |

As shown in the benchmarks, the QUALITY preset matches Cheerio's accuracy while using fewer tokens, and the AI responses come back significantly faster. Among the non-native presets, FAST offers the best conversion speed, while QUALITY produces the smallest output without losing accuracy relative to the other methods tested, giving you options depending on your priority.
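To put the token numbers in relative terms, the arithmetic can be written out as a short script. The row values are copied from the table above; `accuracyPercent` and `tokenSavingsPercent` are hypothetical helpers for this illustration and are not part of the package.

```typescript
interface QualityRow {
  method: string;
  correct: number; // correctly answered questions
  total: number; // questions asked
  avgTokens: number;
}

// Values copied from the table above.
const rows: QualityRow[] = [
  { method: "htmlToAiReadyTextQuality", correct: 15, total: 20, avgTokens: 10759 },
  { method: "cherioText", correct: 15, total: 20, avgTokens: 12931 },
  { method: "nodeHtmlMarkdownText", correct: 14, total: 20, avgTokens: 27389 },
];

function accuracyPercent(row: QualityRow): number {
  return (row.correct / row.total) * 100;
}

// Percentage of tokens saved by `candidate` relative to `baseline`.
function tokenSavingsPercent(baseline: QualityRow, candidate: QualityRow): number {
  return (1 - candidate.avgTokens / baseline.avgTokens) * 100;
}

const [quality, cheerio, markdown] = rows;

console.log(accuracyPercent(quality).toFixed(2) + "% accuracy");
// → 75.00% accuracy
console.log(tokenSavingsPercent(cheerio, quality).toFixed(1) + "% fewer tokens than cherioText");
// → 16.8% fewer tokens than cherioText
console.log(tokenSavingsPercent(markdown, quality).toFixed(1) + "% fewer tokens than nodeHtmlMarkdownText");
// → 60.7% fewer tokens than nodeHtmlMarkdownText
```

Keep in mind that these figures come from 20 questions over 3 pages, so they are indicative rather than statistically robust.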

Some website statistics

To determine which tags to strip, I first gathered the tags that add no useful context for AI, such as style, head, iframe, etc. But stripping tags is a costly operation, so I wanted to know whether stripping each of them actually makes a difference. So I gathered a list of ~800 random websites, then crawled and parsed them. Here are some details:

Page Size Statistics

| Metric  | Value            |
| ------- | ---------------- |
| Minimum | 242 bytes        |
| Maximum | 11,647,892 bytes |
| Average | 517,804 bytes    |
| Median  | 346,929 bytes    |

Crawl Timing Statistics

| Metric  | Value    |
| ------- | -------- |
| Average | 3,202 ms |
| Median  | 2,504 ms |

Element Size Analysis

Ordered by % of Page, descending:

| Element  | Average Size (bytes) | % of Page | % of Body | Total Size (bytes) |
| -------- | -------------------- | --------- | --------- | ------------------ |
| body     | 406,357              | 76.11%    | -         | 349,061,078        |
| head     | 110,629              | 23.55%    | 73.40%    | 95,030,079         |
| links    | 98,256               | 19.18%    | 24.71%    | 84,401,659         |
| svgs     | 83,426               | 12.76%    | 15.38%    | 71,662,569         |
| nav      | 103,673              | 12.17%    | 15.00%    | 89,055,316         |
| script   | 73,849               | 10.84%    | 14.09%    | 63,435,956         |
| images   | 31,274               | 6.18%     | 8.17%     | 26,864,083         |
| footer   | 17,350               | 4.21%     | 5.81%     | 14,903,361         |
| style    | 16,031               | 3.82%     | 4.80%     | 13,770,735         |
| forms    | 14,827               | 3.29%     | 4.33%     | 12,736,770         |
| button   | 12,395               | 2.54%     | 3.47%     | 10,647,352         |
| comments | 7,163                | 1.67%     | 2.28%     | 6,152,769          |
| aside    | 6,539                | 1.04%     | 1.19%     | 5,617,229          |
| noscript | 1,663                | 0.32%     | 0.44%     | 1,428,938          |
| iframe   | 902                  | 0.27%     | 0.40%     | 775,052            |
| video    | 74                   | 0.03%     | 0.03%     | 63,815             |
| canvas   | 4                    | 0.00%     | 0.00%     | 3,149              |
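A per-element share like the ones in this table could be approximated for a single page as sketched below. How the original crawl measured element sizes is not described here, so this is only an assumption-laden illustration: `elementShare` is a hypothetical helper, it uses a regex instead of a real HTML parser, and it does not handle nested elements of the same tag correctly.

```typescript
// Size of a string in UTF-8 bytes.
function pageBytes(text: string): number {
  return new TextEncoder().encode(text).length;
}

interface ElementShare {
  bytes: number; // total bytes taken by all <tag>...</tag> occurrences
  percentOfPage: number; // the same, as a percentage of the whole page
}

// Rough estimate of how much of a page a given element type takes up.
// `tag` is assumed to be a plain tag name such as "script" or "nav".
function elementShare(html: string, tag: string): ElementShare {
  const pattern = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?<\\/${tag}\\s*>`, "gi");
  const matches = html.match(pattern) || [];
  const bytes = matches.reduce((sum, m) => sum + pageBytes(m), 0);
  const total = pageBytes(html);
  return { bytes, percentOfPage: total === 0 ? 0 : (bytes / total) * 100 };
}

const page =
  "<html><head><title>T</title></head><body><nav>menu</nav>" +
  "<p>Hi</p><script>var a=1;</script></body></html>";

for (const tag of ["head", "nav", "script", "video"]) {
  const share = elementShare(page, tag);
  console.log(tag + ": " + share.bytes + " bytes, " + share.percentOfPage.toFixed(2) + "% of page");
}
```

Running something like this over a crawl and averaging per element is one way to decide whether stripping a given tag is worth its cost; elements with a negligible share, such as video or canvas in the table, are unlikely to pay off.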

Kudos

This project wouldn't be possible without the valuable contributions and support from:

Contributing

I welcome contributions from the community! Whether it's improving the documentation, adding new features, or reporting bugs, please feel free to make a pull request or open an issue.

License

This project is licensed under the MIT License - see the LICENSE file for details.