@kreuzberg/html-to-markdown-node

v2.19.5

High-performance HTML to Markdown converter - Node.js native bindings

npm package: @kreuzberg/html-to-markdown-node (this README). Use @kreuzberg/html-to-markdown-wasm for the portable WASM build.

Native Node.js and Bun bindings for html-to-markdown using NAPI-RS v3.

Built on the shared Rust engine that powers the Python wheels, Ruby gem, PHP extension, WebAssembly package, and CLI – ensuring identical Markdown output across every language target.

High-performance HTML to Markdown conversion using native Rust code compiled to platform-specific binaries.

Migration Guide (v2.18.x → v2.19.0)

⚠️ BREAKING CHANGE: Package Namespace Update

In v2.19.0, the npm package namespace changed from html-to-markdown-node to @kreuzberg/html-to-markdown-node to reflect the new Kreuzberg.dev organization.

Install Updated Package

Before (v2.18.x):

npm install html-to-markdown-node

After (v2.19.0+):

npm install @kreuzberg/html-to-markdown-node

Update Import Statements

Before:

import { convert } from 'html-to-markdown-node';

After:

import { convert } from '@kreuzberg/html-to-markdown-node';

Summary of Changes

  • Package renamed from html-to-markdown-node to @kreuzberg/html-to-markdown-node
  • All APIs remain identical
  • Full backward compatibility after updating package name and imports
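Because only the package name changes, the import update can be automated with a plain text rewrite. The following is a minimal sketch, not a tool shipped with the package: `migrateSpecifiers` is a hypothetical helper that rewrites the old specifier only when it is the entire quoted string, so specifiers that are already scoped are left alone.

```javascript
// Hypothetical helper (not part of the package): rewrites the old
// unscoped specifier 'html-to-markdown-node' to the scoped name.
const NEW_NAME = '@kreuzberg/html-to-markdown-node';

function migrateSpecifiers(source) {
  // The old name must directly follow an opening quote and be followed by
  // the same closing quote, so '@kreuzberg/html-to-markdown-node' never matches.
  return source.replace(
    /(['"])html-to-markdown-node\1/g,
    (_match, quote) => `${quote}${NEW_NAME}${quote}`
  );
}

const legacyLine = "import { convert } from 'html-to-markdown-node';";
console.log(migrateSpecifiers(legacyLine));
// import { convert } from '@kreuzberg/html-to-markdown-node';
```

Running the function twice over the same text is safe, since the second pass finds nothing to replace.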

Performance

Native NAPI-RS bindings deliver the fastest HTML to Markdown conversion available in JavaScript.

Benchmark Results (Apple M4)

| Document Type              | ops/sec    | Notes              |
| -------------------------- | ---------- | ------------------ |
| Small (5 paragraphs)       | 86,233     | Simple documents   |
| Medium (25 paragraphs)     | 18,979     | Nested formatting  |
| Large (100 paragraphs)     | 4,907      | Complex structures |
| Tables (20 tables)         | 5,003      | Table processing   |
| Lists (500 items)          | 1,819      | Nested lists       |
| Wikipedia (129KB)          | 1,125      | Real-world content |
| Wikipedia (653KB)          | 156        | Large documents    |

Average: ~18,162 ops/sec across varied workloads.
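Operations per second are easier to compare across document sizes when converted to data throughput. As a rough illustration, the sketch below multiplies size by ops/sec for the two Wikipedia rows, the only rows in the table that state a size. It assumes 1 KB = 1000 bytes and 1 MB = 1000 KB, which the README does not specify, and `throughputMBps` is a hypothetical helper name.

```javascript
// Rows copied from the benchmark table: [label, size in KB, ops/sec].
const wikipediaRows = [
  ['Wikipedia (129KB)', 129, 1125],
  ['Wikipedia (653KB)', 653, 156],
];

// Assumption: decimal units (1 MB = 1000 KB); the source does not say.
function throughputMBps(sizeKB, opsPerSec) {
  return (sizeKB * opsPerSec) / 1000;
}

for (const [label, sizeKB, opsPerSec] of wikipediaRows) {
  console.log(`${label}: ~${throughputMBps(sizeKB, opsPerSec).toFixed(1)} MB/s`);
}
// Wikipedia (129KB): ~145.1 MB/s
// Wikipedia (653KB): ~101.9 MB/s
```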

Comparison

  • vs WASM: ~1.17× faster (native has zero startup time, direct memory access)
  • vs Python: ~7.4× faster (avoids FFI overhead)
  • Best for: Node.js and Bun server-side applications requiring maximum throughput

Benchmark Fixtures (Apple M4)

The shared benchmark harness lives in tools/benchmark-harness. Node keeps pace with the Rust CLI across the board:

| Document               | Size   | ops/sec (Node) |
| ---------------------- | ------ | -------------- |
| Lists (Timeline)       | 129 KB | 3,137          |
| Tables (Countries)     | 360 KB | 932            |
| Medium (Python)        | 657 KB | 460            |
| Large (Rust)           | 567 KB | 554            |
| Small (Intro)          | 463 KB | 627            |
| hOCR German PDF        | 44 KB  | 8,724          |
| hOCR Invoice           | 4 KB   | 96,138         |
| hOCR Embedded Tables   | 37 KB  | 9,591          |

Run task bench:harness -- --frameworks node to regenerate these numbers.
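For a quick local sanity check without the full harness, an ops/sec figure can be approximated with a simple timing loop. This is only a sketch and not how the project's harness measures: `measureOpsPerSec` is a hypothetical helper, and the workload here is a stand-in regex replace so the snippet runs without the native binding.

```javascript
// Minimal timing loop: call fn repeatedly for roughly durationMs
// and report completed calls per second.
function measureOpsPerSec(fn, durationMs = 200) {
  const start = performance.now();
  let iterations = 0;
  let elapsed = 0;
  do {
    fn();
    iterations += 1;
    elapsed = performance.now() - start;
  } while (elapsed < durationMs);
  return iterations / (elapsed / 1000);
}

// Stand-in workload; swap in () => convert(html) to time the real converter.
const sampleHtml = '<p>' + 'word '.repeat(200) + '</p>';
const opsPerSec = measureOpsPerSec(() => sampleHtml.replace(/<[^>]+>/g, ''));
console.log(`${Math.round(opsPerSec)} ops/sec`);
```

A loop like this ignores warm-up and garbage-collection noise, so treat its numbers as indicative only.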

Installation

Node.js

npm install @kreuzberg/html-to-markdown-node
# or
yarn add @kreuzberg/html-to-markdown-node
# or
pnpm add @kreuzberg/html-to-markdown-node

Bun

bun add @kreuzberg/html-to-markdown-node

Usage

Basic Conversion

import { convert } from '@kreuzberg/html-to-markdown-node';

const html = '<h1>Hello World</h1><p>This is <strong>fast</strong>!</p>';
const markdown = convert(html);
console.log(markdown);
// # Hello World
//
// This is **fast**!

With Options

import { convert } from '@kreuzberg/html-to-markdown-node';

const html = '<h1>Hello World</h1><p>This is <strong>fast</strong>!</p>';

const markdown = convert(html, {
  headingStyle: 'Atx',
  codeBlockStyle: 'Backticks',
  listIndentWidth: 2,
  bullets: '-',
  wrap: true,
  wrapWidth: 80
});

Preserve Complex HTML (NEW in v2.5)

import { convert } from '@kreuzberg/html-to-markdown-node';

const html = `
<h1>Report</h1>
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>Foo</td><td>Bar</td></tr>
</table>
`;

const markdown = convert(html, {
  preserveTags: ['table'] // Keep tables as HTML
});
// # Report
//
// <table>
//   <tr><th>Name</th><th>Value</th></tr>
//   <tr><td>Foo</td><td>Bar</td></tr>
// </table>

TypeScript

Full TypeScript definitions included:

import { convert, convertWithInlineImages, type JsConversionOptions } from '@kreuzberg/html-to-markdown-node';

const options: JsConversionOptions = {
  headingStyle: 'Atx',
  codeBlockStyle: 'Backticks',
  listIndentWidth: 2,
  bullets: '-',
  wrap: true,
  wrapWidth: 80
};

const markdown = convert('<h1>Hello</h1>', options);

Reusing Parsed Options

Avoid re-parsing the same options object on every call (benchmarks, tight render loops) by creating a reusable handle:

import {
  createConversionOptionsHandle,
  convertWithOptionsHandle,
} from '@kreuzberg/html-to-markdown-node';

const handle = createConversionOptionsHandle({ hocrSpatialTables: false });
const markdown = convertWithOptionsHandle('<h1>Handles</h1>', handle);
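The pattern pays off when one handle serves many documents. The sketch below shows the shape of such a batch helper. `makeBatchConverter` is a hypothetical name, and the two package functions are passed in as parameters, with counting stand-ins used here, so the snippet runs without the native binding. In real code you would pass `createConversionOptionsHandle` and `convertWithOptionsHandle`.

```javascript
// Creates the options handle once, then reuses it for every document.
function makeBatchConverter(createHandle, convertWithHandle, options) {
  const handle = createHandle(options); // options are parsed a single time
  return (documents) => documents.map((html) => convertWithHandle(html, handle));
}

// Stand-ins for the package functions; they only record what happened.
let handlesCreated = 0;
const fakeCreateHandle = (options) => {
  handlesCreated += 1;
  return { options };
};
const fakeConvertWithHandle = (html, handle) =>
  `${handle.options.headingStyle}:${html.length}`;

const convertBatch = makeBatchConverter(fakeCreateHandle, fakeConvertWithHandle, {
  headingStyle: 'Atx',
});
const outputs = convertBatch(['<h1>a</h1>', '<h1>bb</h1>', '<h1>ccc</h1>']);
console.log(outputs.length, handlesCreated); // 3 1
```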

Zero-Copy Buffer Input

Skip the intermediate UTF-16 string allocation by passing Buffer/Uint8Array inputs directly. This is handy for benchmark harnesses or when you already have raw bytes:

import {
  convertBuffer,
  convertInlineImagesBuffer,
  convertBufferWithOptionsHandle,
  createConversionOptionsHandle,
} from '@kreuzberg/html-to-markdown-node';
import { readFileSync } from 'node:fs';

const html = readFileSync('fixtures/lists.html'); // Buffer
const markdown = convertBuffer(html);

const handle = createConversionOptionsHandle({ headingStyle: 'Atx' });
const markdownFromHandle = convertBufferWithOptionsHandle(html, handle);

// Inline images work too:
const extraction = convertInlineImagesBuffer(html, null, {
  maxDecodedSizeBytes: 5 * 1024 * 1024,
});
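To see why the string step costs something, recall that a JavaScript string holds UTF-16 code units while HTML on disk or on the wire is usually UTF-8 bytes. The short sketch below uses only the Node.js `Buffer` API and illustrative sample text. It shows the two representations side by side and the decode step that a byte-accepting function lets the caller skip.

```javascript
// Sample text with non-ASCII characters, where the two encodings differ.
const sampleText = '<p>Grüße, 世界</p>';

// Raw UTF-8 bytes, as readFileSync(path) returns when no encoding is given.
const sampleBytes = Buffer.from(sampleText, 'utf8');

console.log(sampleText.length);      // length in UTF-16 code units
console.log(sampleBytes.byteLength); // length in UTF-8 bytes

// Decoding the bytes allocates a new string; this is the step that can
// be avoided when the bytes are handed to the converter directly.
const decodedText = sampleBytes.toString('utf8');
console.log(decodedText === sampleText); // true
```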

Inline Images

Extract and decode inline images (data URIs, SVG):

import { writeFileSync } from 'node:fs';
import { convertWithInlineImages } from '@kreuzberg/html-to-markdown-node';

const html = '<img src="data:image/png;base64,iVBORw0..." alt="Logo">';

const result = convertWithInlineImages(html, null, {
  maxDecodedSizeBytes: 5 * 1024 * 1024, // 5MB
  inferDimensions: true,
  filenamePrefix: 'img_',
  captureSvg: true
});

console.log(result.markdown);
console.log(`Extracted ${result.inlineImages.length} images`);

for (const img of result.inlineImages) {
  console.log(`${img.filename}: ${img.format}, ${img.data.length} bytes`);
  // Save image data to disk
  writeFileSync(img.filename, img.data);
}

Supported Platforms

Pre-built native binaries are provided for:

| Platform    | Architectures                                       |
| ----------- | --------------------------------------------------- |
| macOS       | x64 (Intel), ARM64 (Apple Silicon)                  |
| Linux       | x64 (glibc/musl), ARM64 (glibc/musl), ARMv7 (glibc) |
| Windows     | x64, ARM64                                          |

Runtime Compatibility

  • ✅ Node.js 18+ (LTS)
  • ✅ Bun 1.0+ (full NAPI-RS support)
  • ❌ Deno (use @kreuzberg/html-to-markdown-wasm instead)

When to Use

Choose @kreuzberg/html-to-markdown-node when:

  • ✅ Running in Node.js or Bun
  • ✅ Maximum performance is required
  • ✅ Server-side conversion at scale

Use @kreuzberg/html-to-markdown-wasm for:

  • 🌐 Browser/client-side conversion
  • 🦕 Deno runtime
  • ☁️ Edge runtimes (Cloudflare Workers, Deno Deploy)
  • 📦 Universal packages
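For code that must run in more than one environment, the choice between the two packages can be made at runtime. The following is a minimal sketch of that decision using common runtime globals. `pickPackage` is a hypothetical helper, and the detection heuristics are assumptions that may need adjusting for edge runtimes that emulate Node.js globals.

```javascript
// Returns the package name suited to the current runtime.
// `globals` defaults to globalThis; it is a parameter only so the
// function can be exercised with fake runtimes.
function pickPackage(globals = globalThis) {
  // Check Deno first, since it can also expose a Node-style `process`.
  if (typeof globals.Deno !== 'undefined') {
    return '@kreuzberg/html-to-markdown-wasm';
  }
  if (typeof globals.Bun !== 'undefined' || globals.process?.versions?.node) {
    return '@kreuzberg/html-to-markdown-node'; // Node.js or Bun
  }
  return '@kreuzberg/html-to-markdown-wasm'; // browsers and edge runtimes
}

console.log(pickPackage());
```

The returned name could then be handed to a dynamic `import()`, keeping the native binding out of bundles that target browsers.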

Configuration Options

See ConversionOptions for all available options including:

  • Heading styles (ATX, underlined, ATX closed)
  • Code block styles (indented, backticks, tildes)
  • List formatting (indent width, bullet characters)
  • Text escaping and formatting
  • Tag preservation (preserveTags) and stripping (stripTags)
  • Preprocessing for web scraping
  • hOCR table extraction
  • And more...

Examples

Preserving HTML Tags

Keep specific HTML tags in their original form instead of converting to Markdown:

import { convert } from '@kreuzberg/html-to-markdown-node';

const html = `
<p>Before table</p>
<table class="data">
    <tr><th>Name</th><th>Value</th></tr>
    <tr><td>Item 1</td><td>100</td></tr>
</table>
<p>After table</p>
`;

const markdown = convert(html, {
  preserveTags: ['table']
});

// Result includes the table as HTML:
// "Before table\n\n<table class=\"data\">...</table>\n\nAfter table\n"

Combine with stripTags for fine-grained control:

const markdown = convert(html, {
  preserveTags: ['table', 'form'],  // Keep these as HTML
  stripTags: ['script', 'style']    // Remove these entirely
});

Web Scraping

// ES module import: the top-level await below is not allowed in CommonJS files
import { convert } from '@kreuzberg/html-to-markdown-node';

const scrapedHtml = await fetch('https://example.com').then(r => r.text());

const markdown = convert(scrapedHtml, {
  preprocessing: {
    enabled: true,
    preset: 'Aggressive',
    removeNavigation: true,
    removeForms: true
  },
  headingStyle: 'Atx',
  codeBlockStyle: 'Backticks'
});
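In a real scraper the request can fail, and converting an error page is rarely what you want. The sketch below wraps the fetch-and-convert step with an HTTP status check. `fetchAsMarkdown` is a hypothetical helper, and both `convert` and `fetchImpl` are passed in so the snippet can run offline with stand-ins. In real code, pass `convert` from the package and leave `fetchImpl` at its default.

```javascript
// Fetches a page and converts it, failing loudly on HTTP errors.
async function fetchAsMarkdown(url, convert, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  const html = await response.text();
  // Options mirror the ones used in the web scraping example.
  return convert(html, {
    preprocessing: { enabled: true, preset: 'Aggressive' },
    headingStyle: 'Atx',
  });
}

// Usage with the real converter (requires the package to be installed):
//   const markdown = await fetchAsMarkdown('https://example.com', convert);
```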

hOCR Document Processing

const { convert } = require('@kreuzberg/html-to-markdown-node');
const fs = require('fs');

// OCR output from Tesseract in hOCR format
const hocrHtml = fs.readFileSync('scan.hocr', 'utf8');

// Automatically detects hOCR and reconstructs tables
const markdown = convert(hocrHtml, {
  hocrSpatialTables: true  // Enable spatial table reconstruction
});

License

MIT