@kreuzberg/html-to-markdown

v3.1.0

High-performance HTML to Markdown converter for TypeScript/Node.js with a Rust core.

Readme

html-to-markdown

High-performance HTML to Markdown converter for Node.js and Bun with full TypeScript support. This package wraps native @kreuzberg/html-to-markdown-node bindings and provides a type-safe API.

Installation

Requires Node.js 18+ or Bun. The package ships native bindings for performance. Install it with your package manager of choice:

npm:

npm install @kreuzberg/html-to-markdown

pnpm:

pnpm add @kreuzberg/html-to-markdown

yarn:

yarn add @kreuzberg/html-to-markdown

bun:

bun add @kreuzberg/html-to-markdown

Alternatively, use the WebAssembly version for browser/edge environments:

npm install @kreuzberg/html-to-markdown-wasm

Performance Snapshot

Apple M4 • Real Wikipedia documents • convert() (TypeScript (Node.js))

| Document | Size | Latency | Throughput |
| -------- | ---- | ------- | ---------- |
| Lists (Timeline) | 129KB | 0.58ms | 222 MB/s |
| Tables (Countries) | 360KB | 1.89ms | 190 MB/s |
| Mixed (Python wiki) | 656KB | 4.21ms | 156 MB/s |
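
The throughput column follows from dividing document size by latency. The sketch below reproduces that arithmetic. It assumes decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes), and `throughputMBps` is a hypothetical helper written for this illustration, not part of the package.

```typescript
// Hypothetical helper (not part of the package): throughput from size and latency.
// Assumes decimal units: 1 KB = 1,000 bytes and 1 MB = 1,000,000 bytes.
function throughputMBps(sizeKB: number, latencyMs: number): number {
  const bytes = sizeKB * 1000;
  const seconds = latencyMs / 1000;
  return bytes / seconds / 1000000;
}

// Rows from the table above: [document, size in KB, latency in ms].
const benchmarkRows: Array<[string, number, number]> = [
  ['Lists (Timeline)', 129, 0.58],
  ['Tables (Countries)', 360, 1.89],
  ['Mixed (Python wiki)', 656, 4.21],
];

// Round to whole MB/s, as the table does.
const roundedThroughput = benchmarkRows.map(
  ([, sizeKB, latencyMs]) => Math.round(throughputMBps(sizeKB, latencyMs)),
);

benchmarkRows.forEach(([name], i) => {
  console.log(`${name}: ${roundedThroughput[i]} MB/s`);
});
```

Under these unit assumptions the rounded values match the published column (222, 190, and 156 MB/s).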

Quick Start

Basic conversion:

import { convert } from '@kreuzberg/html-to-markdown';

const result = convert('<h1>Hello World</h1>');
const markdown: string = result.content;
console.log(markdown); // # Hello World

With conversion options:

import { convert, type ConversionOptions } from '@kreuzberg/html-to-markdown';

const options: ConversionOptions = {
  headingStyle: 'atx',
  listIndentWidth: 2,
  wrap: true,
};

const result = convert('<h1>Title</h1><p>Content</p>', options);
const markdown = result.content;

API Reference

Core Function

convert(html: string, options?: ConversionOptions, visitor?: Visitor): ConversionResult

Converts HTML to Markdown. Returns a ConversionResult object with all results in a single call.

import { convert } from '@kreuzberg/html-to-markdown';

const html = '<h1>Hello World</h1>';
const result = convert(html);
const markdown  = result.content;    // Converted Markdown string
const metadata  = result.metadata;   // Metadata (when extractMetadata: true)
const tables    = result.tables;     // Structured table data (when extractTables: true)
const document  = result.document;   // Document-level info
const images    = result.images;     // Extracted images
const warnings  = result.warnings;   // Any conversion warnings

Options

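ConversionOptions – Key configuration fields. Names are listed here in snake_case; the TypeScript examples in this README pass them in camelCase (for example headingStyle, outputFormat, extractMetadata):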

  • heading_style: Heading format ("underlined" | "atx" | "atx_closed") — default: "underlined"
  • list_indent_width: Spaces per indent level — default: 2
  • bullets: Bullet characters cycle — default: "*+-"
  • wrap: Enable text wrapping — default: false
  • wrap_width: Wrap at column — default: 80
  • code_language: Default fenced code block language — default: none
  • extract_metadata: Enable metadata extraction into result.metadata — default: false
  • extract_tables: Enable structured table extraction into result.tables — default: false
  • output_format: Output markup format ("markdown" | "djot" | "plain") — default: "markdown"
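
To make the three heading_style values concrete, here is a small self-contained sketch of what each style looks like. It follows common Markdown conventions rather than the library's internals. `renderHeading` is a hypothetical helper, and the fallback to ATX for underlined headings below level 2 is an assumption of this sketch.

```typescript
// Illustrative only: mirrors common Markdown conventions, not the library's code.
type HeadingStyle = 'underlined' | 'atx' | 'atx_closed';

function renderHeading(text: string, level: number, style: HeadingStyle): string {
  const hashes = '#'.repeat(level);
  if (style === 'atx_closed') {
    return `${hashes} ${text} ${hashes}`;
  }
  // Underlined (setext) headings only exist for levels 1 and 2.
  // Falling back to ATX for deeper levels is an assumption of this sketch.
  if (style === 'underlined' && level <= 2) {
    const underline = (level === 1 ? '=' : '-').repeat(text.length);
    return `${text}\n${underline}`;
  }
  return `${hashes} ${text}`;
}

console.log(renderHeading('Title', 1, 'atx'));        // # Title
console.log(renderHeading('Title', 1, 'atx_closed')); // # Title #
console.log(renderHeading('Title', 2, 'underlined')); // Title, then a line of five dashes
```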

Djot Output Format

The library supports converting HTML to Djot, a lightweight markup language similar to Markdown but with a different syntax for some elements. Set output_format to "djot" to use this format.

Syntax Differences

| Element | Markdown | Djot |
|---------|----------|------|
| Strong | **text** | *text* |
| Emphasis | *text* | _text_ |
| Strikethrough | ~~text~~ | {-text-} |
| Inserted/Added | N/A | {+text+} |
| Highlighted | N/A | {=text=} |
| Subscript | N/A | ~text~ |
| Superscript | N/A | ^text^ |
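
The table above can also be read as a lookup from element kind and output format to a pair of delimiters. The sketch below encodes it that way. The names `InlineKind`, `delimiters`, and `wrapInline` are hypothetical, and returning the bare text where Markdown has no native syntax ("N/A") is an assumption of this sketch, not documented library behavior.

```typescript
type MarkupFormat = 'markdown' | 'djot';
type InlineKind =
  | 'strong' | 'emphasis' | 'strikethrough'
  | 'inserted' | 'highlighted' | 'subscript' | 'superscript';

// Opening/closing delimiters copied from the table; null stands for "N/A".
const delimiters: Record<InlineKind, Record<MarkupFormat, [string, string] | null>> = {
  strong:        { markdown: ['**', '**'], djot: ['*', '*'] },
  emphasis:      { markdown: ['*', '*'],   djot: ['_', '_'] },
  strikethrough: { markdown: ['~~', '~~'], djot: ['{-', '-}'] },
  inserted:      { markdown: null,         djot: ['{+', '+}'] },
  highlighted:   { markdown: null,         djot: ['{=', '=}'] },
  subscript:     { markdown: null,         djot: ['~', '~'] },
  superscript:   { markdown: null,         djot: ['^', '^'] },
};

function wrapInline(kind: InlineKind, text: string, format: MarkupFormat): string {
  const pair = delimiters[kind][format];
  // Assumption: with no native syntax, fall back to the unmarked text.
  return pair ? `${pair[0]}${text}${pair[1]}` : text;
}

console.log(wrapInline('strong', 'bold', 'markdown')); // **bold**
console.log(wrapInline('strong', 'bold', 'djot'));     // *bold*
console.log(wrapInline('inserted', 'new', 'djot'));    // {+new+}
```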

Example Usage

import { convert } from '@kreuzberg/html-to-markdown';

const html = "<p>This is <strong>bold</strong> and <em>italic</em> text.</p>";

// Default Markdown output (convert() returns a ConversionResult; the text is in .content)
const markdown = convert(html).content;
// Result: "This is **bold** and *italic* text."

// Djot output
const djot = convert(html, { outputFormat: 'djot' }).content;
// Result: "This is *bold* and _italic_ text."

Djot's extended syntax allows you to express more semantic meaning in lightweight text, making it useful for documents that require strikethrough, insertion tracking, or mathematical notation.

Plain Text Output

Set output_format to "plain" to strip all markup and return only visible text. This bypasses the Markdown conversion pipeline entirely for maximum speed.

import { convert } from '@kreuzberg/html-to-markdown';

const html = "<h1>Title</h1><p>This is <strong>bold</strong> and <em>italic</em> text.</p>";

const plain = convert(html, { outputFormat: 'plain' }).content;
// Result: "Title\n\nThis is bold and italic text."

Plain text mode is useful for search indexing, text extraction, and feeding content to LLMs.

Metadata Extraction

The metadata extraction feature enables comprehensive document analysis during conversion. Extract document properties, headers, links, images, and structured data in a single pass — all via the standard convert() function.

Use Cases:

  • SEO analysis – Extract title, description, Open Graph tags, Twitter cards
  • Table of contents generation – Build structured outlines from heading hierarchy
  • Content migration – Document all external links and resources
  • Accessibility audits – Check for images without alt text, empty links, invalid heading hierarchy
  • Link validation – Classify and validate anchor, internal, external, email, and phone links

Zero Overhead When Disabled: Metadata is collected during the HTML parsing pass, so enabling it adds negligible overhead and leaving it off costs nothing. Pass extract_metadata: true in ConversionOptions (extractMetadata: true in the TypeScript examples) to enable it; the result is available at result.metadata.

Example: Quick Start

import { convert } from '@kreuzberg/html-to-markdown';

const html = '<h1>Article</h1><img src="test.jpg" alt="test">';
const result = convert(html, { extractMetadata: true });

console.log(result.content);                      // Converted Markdown
console.log(result.metadata?.document?.title);    // Document title
console.log(result.metadata?.headers);            // All h1-h6 elements
console.log(result.metadata?.links);              // All hyperlinks
console.log(result.metadata?.images);             // All images with alt text
console.log(result.metadata?.structuredData);     // JSON-LD, Microdata, RDFa
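
As an illustration of the table-of-contents use case, the sketch below turns a list of headings into a nested Markdown outline. The `HeaderEntry` shape (`level`, `text`) and the `buildToc` helper are assumptions made for this example; the actual entries in result.metadata.headers may be shaped differently, so check the package's type definitions before adapting it.

```typescript
// Assumed shape for illustration; real entries in result.metadata.headers may differ.
interface HeaderEntry {
  level: number; // 1 for h1 through 6 for h6
  text: string;
}

// Build a nested bullet outline, indenting two spaces per level
// below the shallowest heading found.
function buildToc(headers: HeaderEntry[]): string {
  if (headers.length === 0) return '';
  const top = Math.min(...headers.map((h) => h.level));
  return headers
    .map((h) => `${'  '.repeat(h.level - top)}- ${h.text}`)
    .join('\n');
}

const toc = buildToc([
  { level: 1, text: 'Article' },
  { level: 2, text: 'Background' },
  { level: 3, text: 'Details' },
  { level: 2, text: 'Results' },
]);

console.log(toc);
```

Indenting relative to the shallowest heading keeps the outline flush left even when a document starts at h2.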

Visitor Pattern

The visitor pattern enables custom HTML→Markdown conversion logic by providing callbacks for specific HTML elements during traversal. Pass a visitor as the third argument to convert().

Use Cases:

  • Custom Markdown dialects – Convert to Obsidian, Notion, or other flavors
  • Content filtering – Remove tracking pixels, ads, or unwanted elements
  • URL rewriting – Rewrite CDN URLs, add query parameters, validate links
  • Accessibility validation – Check alt text, heading hierarchy, link text
  • Analytics – Track element usage, link destinations, image sources

Supported Visitor Methods: 40+ callbacks for text, inline elements, links, images, headings, lists, blocks, and tables.

Example: Quick Start

import { convert, type Visitor, type NodeContext, type VisitResult } from '@kreuzberg/html-to-markdown';

const visitor: Visitor = {
  visitLink(ctx: NodeContext, href: string, text: string, title?: string): VisitResult {
    // Rewrite CDN URLs
    if (href.startsWith('https://old-cdn.com')) {
      href = href.replace('https://old-cdn.com', 'https://new-cdn.com');
    }
    return { type: 'custom', output: `[${text}](${href})` };
  },

  visitImage(ctx: NodeContext, src: string, alt?: string, title?: string): VisitResult {
    // Skip tracking pixels
    if (src.includes('tracking')) {
      return { type: 'skip' };
    }
    return { type: 'continue' };
  },
};

const html = '<a href="https://old-cdn.com/file.pdf">Download</a>';
const result = convert(html, {}, visitor);
const markdown = result.content;
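
The three result types used above ('custom', 'skip', 'continue') suggest how a converter might act on a visitor's return value. The sketch below models that with local stand-in types so it runs without the package. The semantics in `resolveOutput` are an assumption inferred from the example, not the library's documented behavior, and `rewriteLink` and `filterImage` are hypothetical standalone versions of the callbacks.

```typescript
// Local stand-ins for the package's types, modelled on the example above.
type SketchVisitResult =
  | { type: 'continue' }
  | { type: 'skip' }
  | { type: 'custom'; output: string };

// Assumed semantics: 'continue' keeps the default rendering,
// 'skip' drops the element, 'custom' substitutes the visitor's output.
function resolveOutput(result: SketchVisitResult, defaultOutput: string): string {
  switch (result.type) {
    case 'continue':
      return defaultOutput;
    case 'skip':
      return '';
    case 'custom':
      return result.output;
  }
}

// Same rewrite rule as visitLink above, as a standalone function.
function rewriteLink(href: string, text: string): SketchVisitResult {
  const rewritten = href.startsWith('https://old-cdn.com')
    ? href.replace('https://old-cdn.com', 'https://new-cdn.com')
    : href;
  return { type: 'custom', output: `[${text}](${rewritten})` };
}

// Same filter rule as visitImage above.
function filterImage(src: string): SketchVisitResult {
  return src.includes('tracking') ? { type: 'skip' } : { type: 'continue' };
}

const linkOutput = resolveOutput(
  rewriteLink('https://old-cdn.com/file.pdf', 'Download'),
  '[Download](https://old-cdn.com/file.pdf)',
);
const pixelOutput = resolveOutput(filterImage('https://example.com/tracking.gif'), '![](pixel)');
const photoOutput = resolveOutput(filterImage('https://example.com/photo.jpg'), '![photo](photo.jpg)');

console.log(linkOutput); // [Download](https://new-cdn.com/file.pdf)
```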

Contributing

We welcome contributions! Please see our Contributing Guide for details on:

  • Setting up the development environment
  • Running tests locally
  • Submitting pull requests
  • Reporting issues

All contributions must follow our code quality standards (enforced via pre-commit hooks):

  • Proper test coverage (Rust 95%+, language bindings 80%+)
  • Formatting and linting checks
  • Documentation for public APIs

License

MIT License – see LICENSE.

Support

If you find this library useful, consider sponsoring the project.

Have questions or run into issues? We're here to help: