llmfood

v1.0.2

Generate LLM-friendly Markdown from Docusaurus HTML builds, implementing the llms.txt convention.

Overview

llmfood converts a Docusaurus static HTML build into clean Markdown files optimized for LLM consumption. It:

  1. Discovers all pages in a Docusaurus build directory
  2. Resolves client-side content that doesn't exist in static HTML (GitHub code references, remote content, mermaid diagrams)
  3. Converts each HTML page to Markdown, stripping Docusaurus chrome (breadcrumbs, pagination, TOC, footers)
  4. Generates llms.txt — a structured index linking to all converted .md files
  5. Generates custom files — aggregated Markdown files matching URL patterns (e.g., llms-full.txt)
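To make step 4 more concrete, here is a minimal sketch of how a structured index could be assembled in the llms.txt style (an H1 title, a blockquote summary, then H2 sections of links). The function name `buildLlmsIndex`, the page object shape, and the `.md` URL layout are assumptions for illustration, not llmfood's internals. The `sectionOrder` and `sectionLabels` parameters mirror the options of the same name in the configuration.

```javascript
// Hypothetical sketch: group pages by their first URL segment and emit an
// llms.txt-style index. This is not llmfood's actual implementation.
function buildLlmsIndex({
  siteTitle,
  siteDescription,
  baseUrl,
  pages, // assumed shape: [{ urlPath: "/guides/install", title: "Install" }]
  sectionOrder = [],
  sectionLabels = {},
}) {
  const sections = new Map();
  for (const page of pages) {
    // "/guides/install" -> "guides"
    const section = page.urlPath.split("/").filter(Boolean)[0] ?? "root";
    if (!sections.has(section)) sections.set(section, []);
    sections.get(section).push(page);
  }

  // Sections named in sectionOrder come first; the rest follow alphabetically.
  const rank = (name) => {
    const i = sectionOrder.indexOf(name);
    return i === -1 ? sectionOrder.length : i;
  };
  const names = [...sections.keys()].sort(
    (a, b) => rank(a) - rank(b) || a.localeCompare(b)
  );

  const lines = [`# ${siteTitle}`, ""];
  if (siteDescription) lines.push(`> ${siteDescription}`, "");
  for (const name of names) {
    lines.push(`## ${sectionLabels[name] ?? name}`, "");
    for (const page of sections.get(name)) {
      // Assumed link layout: each page's Markdown lives at <urlPath>.md
      lines.push(`- [${page.title}](${baseUrl}${page.urlPath}.md)`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```

Sections without an entry in `sectionLabels` fall back to their raw URL segment in this sketch.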

Installation

npm install llmfood
# or
bun add llmfood

Usage

Docusaurus Plugin (recommended)

Add llmfood as a Docusaurus plugin for zero-config integration. It runs automatically after docusaurus build:

// docusaurus.config.js
module.exports = {
  plugins: [
    [
      "llmfood/docusaurus",
      {
        sectionOrder: ["guides", "api", "concepts"],
        sectionLabels: { guides: "Guides", api: "API Reference" },
        customFiles: [
          {
            filename: "llms-full.txt",
            title: "Full Documentation",
            description: "Complete documentation in a single file",
            includePatterns: [/.*/],
          },
        ],
      },
    ],
  ],
};

The plugin automatically derives baseUrl, buildDir, siteTitle, and siteDescription from your Docusaurus config. It also sets docsDir to {siteDir}/docs by default, enabling source file scanning for mermaid diagrams and remote content resolution.

Standalone

import { generateLlmsMarkdown } from "llmfood";

await generateLlmsMarkdown({
  baseUrl: "https://docs.example.com",
  buildDir: "./build",
  siteTitle: "My Docs",
  siteDescription: "Documentation for my project",
  docsDir: "./docs", // optional: enables source file scanning
  sectionOrder: ["guides", "api", "concepts"],
  sectionLabels: { guides: "Guides", api: "API Reference" },
  ignorePatterns: [/\/blog\//],
  customFiles: [
    {
      filename: "llms-full.txt",
      title: "Full Documentation",
      description: "Complete documentation in a single file",
      includePatterns: [/.*/],
    },
  ],
});

Standalone HTML to Markdown

You can also use the converter directly:

import { htmlToMarkdown } from "llmfood";

const markdown = htmlToMarkdown(docusaurusHtmlString);

Content Resolution

Some Docusaurus plugins render content client-side, so the static HTML contains placeholders instead of real content. When docsDir is set, llmfood scans MDX source files and resolves these automatically:

| Pattern | Source detection | Resolution |
| --- | --- | --- |
| GitHub code references | CodeBlock JSX, fenced ```lang reference, and children/src/srcUrl/source attributes | Fetches code from raw.githubusercontent.com with line ranges |
| Remote content | url="..." or url={expr} in MDX | Fetches remote markdown (JSX expressions via resolveRemoteUrl) |
| Mermaid diagrams | ```mermaid blocks in MDX | Injects mermaid source into HTML (client-side renders leave none) |
| YouTube embeds | <iframe> with YouTube URL in HTML | Converts to [title](youtube-url) markdown link |
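For remote content referenced through a JSX expression, a `resolveRemoteUrl` callback has to turn the expression into a URL to fetch. The sketch below assumes the callback receives the expression as source text such as `getBenchmarkURL("latency")`. That input format, the base URL, and the choice to throw on unknown expressions are all assumptions for illustration and may not match how llmfood calls the hook.

```javascript
// Hypothetical resolver: maps the text of a JSX expression such as
// getBenchmarkURL("latency") to a concrete URL.
// The expression format and the base URL are assumptions for illustration.
const BENCHMARK_BASE = "https://example.com/benchmarks";

function resolveRemoteUrl(expr) {
  const match = /^getBenchmarkURL\(\s*["']([\w-]+)["']\s*\)$/.exec(expr.trim());
  if (!match) {
    // Failing loudly is a design choice of this sketch, not documented behavior.
    throw new Error(`Unrecognized remote content expression: ${expr}`);
  }
  return `${BENCHMARK_BASE}/${match[1]}.md`;
}
```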

Source scanning also resolves imported MDX snippets (import Foo from "./_snippet.mdx"), substitutes ${props.x} expressions using caller prop values, and matches files by frontmatter id when the slug differs from the filename.
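As an illustration of the `${props.x}` substitution described above, the following sketch replaces each placeholder with the matching caller prop value. The helper name `substituteProps` is hypothetical, and leaving unknown props untouched is an assumption of this sketch rather than documented llmfood behavior.

```javascript
// Hypothetical sketch of ${props.x} substitution in a snippet's source text.
// Placeholders with no matching prop are left as-is.
function substituteProps(source, props) {
  return source.replace(/\$\{props\.([A-Za-z_$][\w$]*)\}/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(props, name)
      ? String(props[name])
      : whole
  );
}
```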

All external fetches run in parallel with a concurrency limit of 6.
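A concurrency limit like this is commonly implemented with a small pool of workers that pull from a shared queue. The sketch below shows that general technique; `mapWithConcurrency` is a hypothetical helper and is not taken from llmfood's source.

```javascript
// Run an async mapper over items with at most `limit` calls in flight.
// Results keep the order of the input array.
async function mapWithConcurrency(items, limit, mapper) {
  const results = new Array(items.length);
  let next = 0;

  async function worker() {
    // Each worker claims the next unprocessed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await mapper(items[index], index);
    }
  }

  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

With a limit of 6, a call such as `mapWithConcurrency(urls, 6, (url) => fetch(url).then((r) => r.text()))` would keep no more than six requests open at once. If any mapper call rejects, the returned promise rejects as well.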

API

generateLlmsMarkdown(config)

Processes an entire Docusaurus build and generates llms.txt plus any custom files.

LlmfoodConfig

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| baseUrl | string | Yes | Base URL for generated links (e.g., https://docs.example.com) |
| buildDir | string | Yes | Path to the Docusaurus build output directory |
| customFiles | CustomLlmFile[] | No | Custom aggregated output files to generate |
| docsDir | string | No | Path to docs source directory (enables mermaid + remote content resolution) |
| ignorePatterns | RegExp[] | No | URL patterns to exclude (root / is always excluded) |
| postProcessHtml | (html, context) => string | No | Hook to transform HTML before markdown conversion |
| postProcessMarkdown | (md, context) => string | No | Hook to transform markdown after conversion |
| resolveRemoteUrl | (expr) => string | No | Resolve JSX expressions (e.g., getBenchmarkURL(...)) to fetch URLs |
| rootContent | string | No | Additional content to include at the top of llms.txt |
| sectionLabels | Record<string, string> | No | Custom display labels for URL sections |
| sectionOrder | string[] | No | Ordering for sections in llms.txt |
| siteDescription | string | No | Site description shown in llms.txt |
| siteTitle | string | No | Site title shown in llms.txt |
| verbose | boolean | No | Log individual skipped pages with reasons |

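Both hooks (postProcessHtml and postProcessMarkdown) receive a ProcessContext of the form { urlPath: string } as their second argument and may return either a string or a Promise that resolves to one.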
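A pair of hooks might look like the sketch below. The bodies are illustrative only: the script-stripping regex is deliberately naive, and the `https://docs.example.com` prefix is a placeholder. The functions would be passed as the `postProcessHtml` and `postProcessMarkdown` config properties.

```javascript
// Hypothetical hooks. Each receives a context object carrying the page's urlPath.

// Remove inline <script> elements before conversion.
// A regex is adequate for a sketch but is not a robust HTML parser.
function postProcessHtml(html, context) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "");
}

// Append a source line so readers of the Markdown can trace it back to the page.
// Declared async to show that a hook may return a Promise.
async function postProcessMarkdown(md, context) {
  return `${md.trimEnd()}\n\nSource: https://docs.example.com${context.urlPath}\n`;
}
```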

CustomLlmFile

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| filename | string | Yes | Output filename (e.g., llms-full.txt) |
| includePatterns | RegExp[] | Yes | URL patterns to include in this file |
| description | string | No | Description shown at the top of the file |
| title | string | No | Title shown at the top of the file |
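The pattern options can be thought of as a filter over URL paths. The sketch below shows one plausible reading: the root is dropped, ignored paths are dropped, and whatever matches an include pattern is kept. `selectPages` is a hypothetical helper, and whether llmfood applies ignorePatterns before includePatterns in exactly this way is an assumption.

```javascript
// Hypothetical sketch of pattern-based page selection; not llmfood's internals.
// Avoid the `g` flag on these patterns: RegExp.prototype.test is stateful with it.
function selectPages(urlPaths, { includePatterns, ignorePatterns = [] }) {
  return urlPaths.filter((urlPath) => {
    if (urlPath === "/") return false; // the root page is always excluded
    if (ignorePatterns.some((re) => re.test(urlPath))) return false;
    return includePatterns.some((re) => re.test(urlPath));
  });
}
```

With `includePatterns: [/.*/]`, as in the llms-full.txt examples, every page that is not ignored would be selected.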

htmlToMarkdown(html)

Converts a Docusaurus HTML string to clean Markdown. Expects the content to be wrapped in an <article> tag.

Returns an empty string if no <article> element is found.

Supported Docusaurus Elements

The converter handles these Docusaurus-specific elements:

  • Prism code blocks — preserves language and syntax highlighting structure
  • Admonitions — converts to :::type [title] syntax (tip, warning, info, caution, danger, note, important)
  • Tabs — renders each tab panel with its label as a bold heading
  • Details/Summary — preserves as HTML <details> elements
  • KaTeX math — converts to $$...$$ (block) and $...$ (inline) syntax
  • Images — converts to standard Markdown, skipping data URIs
  • Tables — converts to GFM table syntax with alignment support (:---:, ---:)
  • Strikethrough — converts <del> and <s> to ~~text~~
  • YouTube iframes — converts to markdown links with video title
  • Mermaid code blocks — preserves as fenced mermaid code blocks (when source is available)
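To show what GFM table output with alignment markers looks like, here is a small standalone sketch. It builds a table from plain arrays rather than from HTML, and `toGfmTable` is a hypothetical helper, not part of llmfood's API.

```javascript
// Sketch of GFM table output with per-column alignment markers.
const DELIMITERS = { left: ":---", center: ":---:", right: "---:", none: "---" };

function toGfmTable(headers, aligns, rows) {
  // Pipes inside a cell must be escaped, and cells cannot span lines.
  const escapeCell = (cell) =>
    String(cell).replace(/\|/g, "\\|").replace(/\n/g, " ");
  const line = (cells) => `| ${cells.map(escapeCell).join(" | ")} |`;
  const delimiter = `| ${headers
    .map((_, i) => DELIMITERS[aligns[i]] ?? DELIMITERS.none)
    .join(" | ")} |`;
  return [line(headers), delimiter, ...rows.map(line)].join("\n");
}
```

Columns with no recognized alignment fall back to a plain `---` delimiter in this sketch.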

Pages that can't be converted are tracked and summarized. Set verbose: true to see individual skipped pages with reasons (redirects, empty pages, missing files, errors).

License

MIT