
html-to-markdown-node

v2.14.2

High-performance HTML to Markdown converter - Node.js native bindings

npm package: html-to-markdown-node (this README). Use html-to-markdown-wasm for the portable WASM build.

Native Node.js and Bun bindings for html-to-markdown using NAPI-RS v3.

Built on the shared Rust engine that powers the Python wheels, Ruby gem, PHP extension, WebAssembly package, and CLI – ensuring identical Markdown output across every language target.

High-performance HTML to Markdown conversion using native Rust code compiled to platform-specific binaries.

Performance

Native NAPI-RS bindings deliver the fastest HTML to Markdown conversion available in JavaScript.

Benchmark Results (Apple M4)

| Document Type          | ops/sec | Notes              |
| ---------------------- | ------- | ------------------ |
| Small (5 paragraphs)   | 86,233  | Simple documents   |
| Medium (25 paragraphs) | 18,979  | Nested formatting  |
| Large (100 paragraphs) | 4,907   | Complex structures |
| Tables (20 tables)     | 5,003   | Table processing   |
| Lists (500 items)      | 1,819   | Nested lists       |
| Wikipedia (129KB)      | 1,125   | Real-world content |
| Wikipedia (653KB)      | 156     | Large documents    |

Average: ~16,889 ops/sec (arithmetic mean of the seven rows above).

Comparison

  • vs WASM: ~1.17× faster (native has zero startup time, direct memory access)
  • vs Python: ~7.4× faster (avoids FFI overhead)
  • Best for: Node.js and Bun server-side applications requiring maximum throughput

Benchmark Fixtures (Apple M4)

task bench:bindings feeds identical Wikipedia + hOCR fixtures into every binding. Node keeps pace with the Rust CLI across the board:

| Document             | Size   | ops/sec (Node) |
| -------------------- | ------ | -------------- |
| Lists (Timeline)     | 129 KB | 1,308          |
| Tables (Countries)   | 360 KB | 331            |
| Medium (Python)      | 657 KB | 150            |
| Large (Rust)         | 567 KB | 163            |
| Small (Intro)        | 463 KB | 208            |
| hOCR German PDF      | 44 KB  | 2,944          |
| hOCR Invoice         | 4 KB   | 27,326         |
| hOCR Embedded Tables | 37 KB  | 3,475          |

Run task bench:bindings -- --language node locally to regenerate these numbers.
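
For a quick sanity check outside the project's benchmark task, an ops/sec figure can be approximated with a small timing loop. The following is a minimal sketch, not the harness that produced the numbers above: `measureOpsPerSec` is a hypothetical helper, and a naive tag-stripping function stands in for the converter so the snippet runs without the native module.

```javascript
// Rough ops/sec estimate: call fn(input) repeatedly for a fixed time window.
// `measureOpsPerSec` is a hypothetical helper, not part of html-to-markdown-node.
function measureOpsPerSec(fn, input, durationMs = 500) {
  // Warm-up so JIT compilation does not dominate the measurement.
  for (let i = 0; i < 100; i++) fn(input);

  let ops = 0;
  let elapsed = 0;
  const start = performance.now();
  while (elapsed < durationMs) {
    fn(input);
    ops++;
    elapsed = performance.now() - start;
  }
  return ops / (elapsed / 1000);
}

// Stand-in workload (illustration only). In a real check, pass `convert`
// imported from 'html-to-markdown-node' instead.
const stripTagsNaively = (html) => html.replace(/<[^>]+>/g, '');

const sampleHtml = '<p>Hello <strong>world</strong></p>'.repeat(5);
const opsPerSec = measureOpsPerSec(stripTagsNaively, sampleHtml, 200);
console.log(`${Math.round(opsPerSec)} ops/sec`);
```

Results from a loop like this vary with hardware, input, and Node.js version, so they are only comparable within a single machine and run.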

Installation

Node.js

npm install html-to-markdown-node
# or
yarn add html-to-markdown-node
# or
pnpm add html-to-markdown-node

Bun

bun add html-to-markdown-node

Usage

Basic Conversion

import { convert } from 'html-to-markdown-node';

const html = '<h1>Hello World</h1><p>This is <strong>fast</strong>!</p>';
const markdown = convert(html);
console.log(markdown);
// # Hello World
//
// This is **fast**!

With Options

import { convert } from 'html-to-markdown-node';

const html = '<h1>Hello World</h1><p>This is <strong>fast</strong>!</p>';

const markdown = convert(html, {
  headingStyle: 'Atx',
  codeBlockStyle: 'Backticks',
  listIndentWidth: 2,
  bullets: '-',
  wrap: true,
  wrapWidth: 80
});

Preserve Complex HTML (NEW in v2.5)

import { convert } from 'html-to-markdown-node';

const html = `
<h1>Report</h1>
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>Foo</td><td>Bar</td></tr>
</table>
`;

const markdown = convert(html, {
  preserveTags: ['table'] // Keep tables as HTML
});
// # Report
//
// <table>
//   <tr><th>Name</th><th>Value</th></tr>
//   <tr><td>Foo</td><td>Bar</td></tr>
// </table>

TypeScript

Full TypeScript definitions included:

import { convert, convertWithInlineImages, type JsConversionOptions } from 'html-to-markdown-node';

const options: JsConversionOptions = {
  headingStyle: 'Atx',
  codeBlockStyle: 'Backticks',
  listIndentWidth: 2,
  bullets: '-',
  wrap: true,
  wrapWidth: 80
};

const markdown = convert('<h1>Hello</h1>', options);

Reusing Parsed Options

Avoid re-parsing the same options object on every call (benchmarks, tight render loops) by creating a reusable handle:

import {
  createConversionOptionsHandle,
  convertWithOptionsHandle,
} from 'html-to-markdown-node';

const handle = createConversionOptionsHandle({ hocrSpatialTables: false });
const markdown = convertWithOptionsHandle('<h1>Handles</h1>', handle);
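
The benefit shows up when one handle serves many conversions. Below is a minimal sketch of that pattern; `convertAll` is a hypothetical helper, and the binding functions are passed in as a `bindings` object (an assumption made so the snippet runs with a stub instead of the native module). In real code, `bindings` could be the module object from `import * as bindings from 'html-to-markdown-node'`.

```javascript
// Convert many documents with a single parsed-options handle.
// `bindings` must expose createConversionOptionsHandle and
// convertWithOptionsHandle, the two functions shown above.
function convertAll(documents, bindings, options = {}) {
  // Parse the options once, outside the loop.
  const handle = bindings.createConversionOptionsHandle(options);
  return documents.map((html) => bindings.convertWithOptionsHandle(html, handle));
}

// Stub bindings for illustration only; they do not convert anything.
let handlesCreated = 0;
const stubBindings = {
  createConversionOptionsHandle(options) {
    handlesCreated++;
    return { options };
  },
  convertWithOptionsHandle(html, handle) {
    return `converted ${html.length} chars`;
  },
};

const results = convertAll(['<h1>A</h1>', '<p>B</p>'], stubBindings, {
  headingStyle: 'Atx',
});
console.log(results.length, handlesCreated); // two documents, one handle
```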

Zero-Copy Buffer Input

Skip the intermediate UTF-16 string allocation by feeding Buffer/Uint8Array inputs directly—handy for benchmark harnesses or when you already have raw bytes:

import {
  convertBuffer,
  convertInlineImagesBuffer,
  convertBufferWithOptionsHandle,
  createConversionOptionsHandle,
} from 'html-to-markdown-node';
import { readFileSync } from 'node:fs';

const html = readFileSync('fixtures/lists.html'); // Buffer
const markdown = convertBuffer(html);

const handle = createConversionOptionsHandle({ headingStyle: 'Atx' });
const markdownFromHandle = convertBufferWithOptionsHandle(html, handle);

// Inline images work too:
const extraction = convertInlineImagesBuffer(html, null, {
  maxDecodedSizeBytes: 5 * 1024 * 1024,
});
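
When a code path receives both strings and raw bytes, a small dispatcher can route each input to the matching entry point. This is a sketch under stated assumptions: `convertAnyInput` is a hypothetical helper, and `bindings` stands for an object exposing `convert` and `convertBuffer` (stubbed here so the snippet runs without the native module).

```javascript
// Route strings to convert() and byte inputs to convertBuffer().
function convertAnyInput(input, bindings) {
  if (typeof input === 'string') return bindings.convert(input);
  // Buffer is a subclass of Uint8Array, so this check covers both.
  if (input instanceof Uint8Array) return bindings.convertBuffer(input);
  throw new TypeError('Expected a string, Buffer, or Uint8Array');
}

// Stub bindings for illustration only; they report what they received.
const stubBindings = {
  convert: (html) => `string:${html.length}`,
  convertBuffer: (bytes) => `bytes:${bytes.length}`,
};

const text = '<p>é</p>';
console.log(convertAnyInput(text, stubBindings));              // string:8
console.log(convertAnyInput(Buffer.from(text), stubBindings)); // bytes:9
```

The two lengths differ because `é` is one UTF-16 code unit in a string but two bytes in UTF-8.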

Inline Images

Extract and decode inline images (data URIs, SVG):

import { writeFileSync } from 'node:fs';
import { convertWithInlineImages } from 'html-to-markdown-node';

const html = '<img src="..." alt="Logo">';

const result = convertWithInlineImages(html, null, {
  maxDecodedSizeBytes: 5 * 1024 * 1024, // 5MB
  inferDimensions: true,
  filenamePrefix: 'img_',
  captureSvg: true
});

console.log(result.markdown);
console.log(`Extracted ${result.inlineImages.length} images`);

for (const img of result.inlineImages) {
  console.log(`${img.filename}: ${img.format}, ${img.data.length} bytes`);
  // Save image data to disk (ES module import; require() is not defined here)
  writeFileSync(img.filename, img.data);
}
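
When the HTML comes from an untrusted source, it may be worth writing the extracted images into a dedicated directory rather than the current working directory. The sketch below assumes only the `filename` and `data` fields used in the loop above; `saveInlineImages` is a hypothetical helper and the image records are fake.

```javascript
import { mkdirSync, writeFileSync } from 'node:fs';
import { basename, join } from 'node:path';

// Write extracted images into outDir and return the paths that were written.
function saveInlineImages(images, outDir) {
  mkdirSync(outDir, { recursive: true });
  return images.map((img) => {
    // basename() drops directory components from the suggested filename.
    const target = join(outDir, basename(img.filename));
    writeFileSync(target, img.data);
    return target;
  });
}

// Fake image records for illustration only.
const outDir = join(process.env.TMPDIR || '/tmp', 'inline-images-sketch');
const written = saveInlineImages(
  [
    { filename: 'img_0.png', data: Buffer.from([0x89, 0x50, 0x4e, 0x47]) },
    { filename: '../escape.svg', data: Buffer.from('<svg/>') },
  ],
  outDir,
);
console.log(written);
```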

Supported Platforms

Pre-built native binaries are provided for:

| Platform | Architectures                                       |
| -------- | --------------------------------------------------- |
| macOS    | x64 (Intel), ARM64 (Apple Silicon)                  |
| Linux    | x64 (glibc/musl), ARM64 (glibc/musl), ARMv7 (glibc) |
| Windows  | x64, ARM64                                          |
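
To check a host against this table before deploying, the platform and architecture reported by Node.js can be compared with the listed pairs. This is a rough sketch: `hasPrebuiltBinary` is a hypothetical helper, it assumes ARMv7 is reported as `process.arch === 'arm'`, and it does not distinguish glibc from musl.

```javascript
// Platform/architecture pairs from the table above, keyed by the values
// Node.js reports in process.platform and process.arch.
const PREBUILT = new Map([
  ['darwin', ['x64', 'arm64']],
  ['linux', ['x64', 'arm64', 'arm']], // 'arm' assumed to mean ARMv7
  ['win32', ['x64', 'arm64']],
]);

function hasPrebuiltBinary(platform = process.platform, arch = process.arch) {
  return (PREBUILT.get(platform) ?? []).includes(arch);
}

console.log(hasPrebuiltBinary('darwin', 'arm64')); // true
console.log(hasPrebuiltBinary('freebsd', 'x64'));  // false
console.log(hasPrebuiltBinary());                  // result for the current host
```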

Runtime Compatibility

  • ✅ Node.js 18+ (LTS)
  • ✅ Bun 1.0+ (full NAPI-RS support)
  • ❌ Deno (use html-to-markdown-wasm instead)

When to Use

Choose html-to-markdown-node when:

  • ✅ Running in Node.js or Bun
  • ✅ Maximum performance is required
  • ✅ Server-side conversion at scale

Use html-to-markdown-wasm for:

  • 🌐 Browser/client-side conversion
  • 🦕 Deno runtime
  • ☁️ Edge runtimes (Cloudflare Workers, Deno Deploy)
  • 📦 Universal packages
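
A library that targets several runtimes could make this choice at load time. The following is a minimal sketch, assuming simple global-based detection; `pickPackage` is a hypothetical helper and the heuristics are not part of either package.

```javascript
// Pick a package name based on the runtime, following the lists above.
function pickPackage(globals = globalThis) {
  // Deno exposes a `Deno` global; check it first because Deno can also
  // provide a Node-compatible `process` object.
  if (typeof globals.Deno !== 'undefined') return 'html-to-markdown-wasm';

  const versions = globals.process?.versions;
  // Node.js sets process.versions.node; Bun sets process.versions.bun.
  if (versions?.node || versions?.bun) return 'html-to-markdown-node';

  // Browsers and edge runtimes fall through to the portable build.
  return 'html-to-markdown-wasm';
}

console.log(pickPackage());                                          // depends on the host runtime
console.log(pickPackage({ Deno: {} }));                              // html-to-markdown-wasm
console.log(pickPackage({ process: { versions: { bun: '1.1.0' } } })); // html-to-markdown-node
```

The returned name could then be passed to a dynamic `import()`. Edge runtimes with Node compatibility layers may need an extra check, since some of them also populate `process.versions`.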

Configuration Options

See ConversionOptions for all available options including:

  • Heading styles (ATX, underlined, ATX closed)
  • Code block styles (indented, backticks, tildes)
  • List formatting (indent width, bullet characters)
  • Text escaping and formatting
  • Tag preservation (preserveTags) and stripping (stripTags)
  • Preprocessing for web scraping
  • hOCR table extraction
  • And more...

Examples

Preserving HTML Tags

Keep specific HTML tags in their original form instead of converting to Markdown:

import { convert } from 'html-to-markdown-node';

const html = `
<p>Before table</p>
<table class="data">
    <tr><th>Name</th><th>Value</th></tr>
    <tr><td>Item 1</td><td>100</td></tr>
</table>
<p>After table</p>
`;

const markdown = convert(html, {
  preserveTags: ['table']
});

// Result includes the table as HTML:
// "Before table\n\n<table class=\"data\">...</table>\n\nAfter table\n"

Combine with stripTags for fine-grained control:

const markdown = convert(html, {
  preserveTags: ['table', 'form'],  // Keep these as HTML
  stripTags: ['script', 'style']    // Remove these entirely
});

Web Scraping

// ES module syntax: top-level await is not available in CommonJS
import { convert } from 'html-to-markdown-node';

const scrapedHtml = await fetch('https://example.com').then(r => r.text());

const markdown = convert(scrapedHtml, {
  preprocessing: {
    enabled: true,
    preset: 'Aggressive',
    removeNavigation: true,
    removeForms: true
  },
  headingStyle: 'Atx',
  codeBlockStyle: 'Backticks'
});
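
In a real scraper it usually helps to check the HTTP status before converting, so that an error page is not turned into Markdown by accident. Here is a sketch under stated assumptions: `scrapeToMarkdown` is a hypothetical helper, and both `convert` and `fetchImpl` are passed in so the snippet runs with stubs, without the native module or network access.

```javascript
// Fetch a page and convert it, failing loudly on HTTP errors.
async function scrapeToMarkdown(url, convert, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status} for ${url}`);
  }
  const html = await response.text();
  // Same options as the web scraping example above.
  return convert(html, {
    preprocessing: {
      enabled: true,
      preset: 'Aggressive',
      removeNavigation: true,
      removeForms: true,
    },
    headingStyle: 'Atx',
    codeBlockStyle: 'Backticks',
  });
}

// Stubs for illustration only: a fake response and a naive tag stripper.
const fakeFetch = async () => ({
  ok: true,
  status: 200,
  text: async () => '<h1>Hi</h1>',
});
const fakeConvert = (html) => html.replace(/<[^>]+>/g, '');

scrapeToMarkdown('https://example.com', fakeConvert, fakeFetch).then((md) =>
  console.log(md),
);
```

In production code, pass `convert` from html-to-markdown-node and omit the third argument to use the global `fetch`.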

hOCR Document Processing

const { convert } = require('html-to-markdown-node');
const fs = require('fs');

// OCR output from Tesseract in hOCR format
const hocrHtml = fs.readFileSync('scan.hocr', 'utf8');

// Automatically detects hOCR and reconstructs tables
const markdown = convert(hocrHtml, {
  hocrSpatialTables: true  // Enable spatial table reconstruction
});

License

MIT