get-llms-txt

v1.0.1

Published

3 months ago

Generate LLM-friendly llms.txt files from markdown and MDX content files

nextjs llms get-llms-txt ai llm-context documentation markdown mdx static-site-generator gatsby astro docusaurus vitepress

🌟 Features

🔍 Markdown/MDX scanning - Recursively finds all .mdx and .md files in your content directory
📄 Generate llms.txt - High-level, markdown-like index of your project with categorized file lists
📚 Markdown conversion - Converts MDX files to plain markdown, removing JSX components and metadata
🎛️ Configurable - Customize content directory, output directory, base URL, and project metadata
🧰 CLI + API - Use as a CLI tool or call programmatically in your own scripts
⚡ Fast and efficient - Processes only content files and skips the node_modules, .next, and out directories

📖 Usage

CLI Usage

Install the package:

npm install --save-dev get-llms-txt

yarn add -D get-llms-txt

pnpm add -D get-llms-txt

Run the CLI:

npx get-llms-txt

By default, this will:

Scan ./content directory for .mdx and .md files
Generate llms.txt in ./out (if it exists) or ./public
Create plain markdown versions in <output-dir>/md/ directory

Basic usage for MD/MDX files:

The tool processes any .md or .mdx files in your content directory:

Extracts metadata (title, description, tags)
Converts MDX to plain markdown (removes JSX components)
Generates categorized llms.txt index file
Creates individual .md files for each content file

CLI Options

npx get-llms-txt [options]

Options:

-c, --content-dir <dir> - Content directory to scan (default: ./content)
-o, --output-dir <dir> - Output directory for llms.txt and md files (default: ./out if it exists, otherwise ./public)
-u, --base-url <url> - Base URL for links in llms.txt (default: empty)
-n, --project-name <name> - Project name for llms.txt (default: "Personal Website & Blog")
-d, --project-description <desc> - Project description (default: auto-generated)
-h, --help - Show help message
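
To make the flag-to-option mapping concrete, here is a minimal sketch of how the options above could be parsed. `CliOptions`, `VALUE_FLAGS`, and `parseArgs` are hypothetical names for illustration, not the package's own parser; only the flags and defaults come from the list above.

```typescript
// Hypothetical shape mirroring the documented flags; not the package's own types.
interface CliOptions {
  contentDir: string;
  outputDir?: string; // left undefined so the caller can apply the ./out-or-./public rule
  baseUrl: string;
  projectName: string;
  projectDescription?: string; // undefined means "auto-generate"
  help: boolean;
}

type ValueKey = 'contentDir' | 'outputDir' | 'baseUrl' | 'projectName' | 'projectDescription';

// Short and long spellings of every flag that takes a value.
const VALUE_FLAGS: Record<string, ValueKey> = {
  '-c': 'contentDir', '--content-dir': 'contentDir',
  '-o': 'outputDir', '--output-dir': 'outputDir',
  '-u': 'baseUrl', '--base-url': 'baseUrl',
  '-n': 'projectName', '--project-name': 'projectName',
  '-d': 'projectDescription', '--project-description': 'projectDescription',
};

function parseArgs(argv: string[]): CliOptions {
  // Defaults as documented in the option list.
  const opts: CliOptions = {
    contentDir: './content',
    baseUrl: '',
    projectName: 'Personal Website & Blog',
    help: false,
  };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '-h' || arg === '--help') {
      opts.help = true;
      continue;
    }
    const key = VALUE_FLAGS[arg];
    if (key === undefined) throw new Error(`Unknown option: ${arg}`);
    const value = argv[i + 1];
    if (value === undefined) throw new Error(`Missing value for ${arg}`);
    opts[key] = value;
    i++; // skip the value that was just consumed
  }
  return opts;
}
```

In a script this would typically be called as `parseArgs(process.argv.slice(2))`.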

Examples

# Use default settings
npx get-llms-txt

# Specify custom directories
npx get-llms-txt --content-dir ./content --output-dir ./out

# With base URL for production
npx get-llms-txt --base-url https://example.com --output-dir ./out

# Custom project name and description
npx get-llms-txt --project-name "My Blog" --project-description "Technical blog about software development"

Programmatic Usage

import {generateLlmsFiles} from 'get-llms-txt';

await generateLlmsFiles({
  contentDir: './content',
  outputDir: './out',
  baseUrl: 'https://example.com',
  projectName: 'My Next.js Project',
  projectDescription: 'A collection of technical content',
});

🛠️ Installation

npm install --save-dev get-llms-txt

API

`generateLlmsFiles(options: GenerateOptions): Promise<void>`

Main function to generate llms.txt files.

Parameters:

options.contentDir (string) - Content directory to scan
options.outputDir (string) - Output directory for llms.txt and md files
options.baseUrl? (string) - Base URL for links in llms.txt
options.projectName? (string) - Project name for llms.txt
options.projectDescription? (string) - Project description

`processMDXFile(filePath: string): ProcessedContent`

Process an MDX file and extract metadata and content.

`processMDFile(filePath: string): ProcessedContent`

Process a Markdown file and extract metadata and content.

`extractTitle(filePath: string, metadata: FileMetadata, content: string): string`

Extract title from file metadata, first H1, or filename.

`extractDescription(metadata: FileMetadata, content: string): string | undefined`

Extract description from metadata or first paragraph.
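
The fallback order of the two extract helpers can be sketched as follows. This is an illustrative re-implementation under assumptions, not the package's source: `Meta`, `pickTitle`, and `pickDescription` are hypothetical names, and the real `FileMetadata` type may carry more fields.

```typescript
// Hypothetical metadata shape; the package's FileMetadata type may differ.
interface Meta {
  title?: string;
  description?: string;
}

// Title fallback order described above: metadata, then first H1, then filename.
function pickTitle(filePath: string, meta: Meta, content: string): string {
  if (meta.title) return meta.title;
  const h1 = /^#[ \t]+(.+)$/m.exec(content)?.[1];
  if (h1) return h1.trim();
  const base = filePath.split('/').pop() ?? filePath;
  return base.replace(/\.(mdx|md)$/, '');
}

// Description fallback: metadata first, then the first non-heading paragraph.
function pickDescription(meta: Meta, content: string): string | undefined {
  if (meta.description) return meta.description;
  const para = content
    .split(/\n\s*\n/) // paragraphs are separated by blank lines
    .map((block) => block.trim())
    .find((block) => block !== '' && !block.startsWith('#'));
  // Join a paragraph that was wrapped over several lines into one line.
  return para?.replace(/\s+/g, ' ');
}
```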

What It Does

Scans Content Directory: Recursively finds all .mdx and .md files in the content directory
Extracts Metadata: Extracts title, description, and other metadata from MDX and MD files
Converts to Markdown:
- Removes MDX metadata exports
- Strips JSX components
- Converts to plain markdown
Generates Individual .md Files: Creates markdown versions in <output-dir>/md/ directory
Generates llms.txt: Creates llms.txt file in the output directory with:
- Project name and description
- Content structure overview
- Categorized file lists with links

Output Structure

<output-dir>/
  ├── llms.txt
  └── md/
      ├── blog/
      │   ├── post1.md
      │   └── ...
      ├── apps/
      │   ├── app1.md
      │   └── ...
      └── research/
          └── ...

File Processing

MDX Files

The script supports two common metadata formats used in Next.js projects:

YAML Frontmatter (most common):

---
title: My Post
description: A great post
tags: [blog, tech]
---

# My Post

Content here...

Export const metadata (Next.js MDX format):

export const metadata = {
  title: 'My Post',
  description: 'A great post',
  tags: ['blog', 'tech'],
};

# My Post

Content here...

The script automatically detects and handles both formats. It also:

Removes React/JSX components
Strips import statements
Converts to plain markdown
Preserves markdown structure
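
A regex-based sketch of this cleanup is shown below. It is not the package's implementation: `mdxToMarkdown` is a hypothetical name, and the sketch assumes single-line imports, a multi-line `metadata` object whose closing brace sits on its own line, and component names that start with an uppercase letter. It also keeps the children of paired components, which may differ from what the package does.

```typescript
function mdxToMarkdown(source: string): string {
  return source
    // Drop single-line ES import statements (lines ending in a quoted module path).
    .replace(/^import\s.*['"];?[ \t]*$/gm, '')
    // Drop a multi-line `export const metadata = { ... };` block.
    .replace(/^export const metadata\s*=\s*\{[\s\S]*?\n\};?/m, '')
    // Drop component tags (<Note>, </Note>, <Chart ... />); children of paired tags stay.
    // Props containing ">" (for example arrow functions) are not handled.
    .replace(/<\/?[A-Z][^>]*>/g, '')
    // Collapse the runs of blank lines left behind by the removals.
    .replace(/\n{3,}/g, '\n\n')
    .trim();
}
```

Lowercase HTML tags and autolinks such as `<https://example.com>` are left alone because the tag pattern only matches names that begin with an uppercase letter. A production converter would more likely work on a parsed MDX syntax tree than on regular expressions.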

MD Files

Extracts YAML frontmatter (if present)
Preserves markdown content as-is
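
Splitting frontmatter from the body can be sketched as below. `Parsed` and `splitFrontmatter` are hypothetical names, and the sketch only reads flat `key: value` lines as raw strings; how the package actually parses YAML is not stated here, and real YAML (lists, nesting, quoting) needs a proper parser.

```typescript
interface Parsed {
  metadata: Record<string, string>;
  body: string;
}

function splitFrontmatter(source: string): Parsed {
  // Frontmatter must start on the first line and be closed by a second `---` line.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { metadata: {}, body: source }; // no frontmatter: content as-is

  const metadata: Record<string, string> = {};
  for (const line of (match[1] ?? '').split(/\r?\n/)) {
    const sep = line.indexOf(':');
    if (sep === -1) continue; // skip lines that are not key: value pairs
    metadata[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return { metadata, body: source.slice(match[0].length) };
}
```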

Integration with Build Process

For Next.js static export (or any static site generator), add to your build script:

{
  "scripts": {
    "build": "next build && npx get-llms-txt --output-dir ./out",
    "deploy:prod": "next build && npx get-llms-txt --output-dir ./out --base-url https://example.com && firebase deploy --only hosting"
  }
}

Compatibility

✅ Optimized for Next.js projects using MDX/MD files, but also works with:

✅ Any static site generator (Gatsby, Astro, etc.)
✅ Documentation sites (Docusaurus, VitePress, etc.)
✅ Any project with markdown/MDX content files

Supported formats:

✅ YAML frontmatter (most common)
✅ export const metadata (Next.js MDX format)
✅ Mixed formats
✅ Files with no metadata (falls back to filename/H1 extraction)
✅ Any directory structure (recursively scans)
✅ Locale suffixes (.en, .ru, etc.) are handled automatically

Requirements:

Node.js 22+
MDX or MD files in a content directory (configurable)
TypeScript support (for programmatic usage)

URL Generation

Locale suffixes (.en, .ru) are removed from filenames
All files are given the .md extension
Directory structure is preserved
URLs are relative to the base URL (if provided)
Index files are handled specially: the directory name is used in place of the filename
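
The rules above can be sketched as a single path-mapping function. Several details are assumptions for illustration: `toMarkdownUrl` is a hypothetical name, locale suffixes are taken to be two lowercase letters, `blog/index.mdx` is assumed to map to `blog.md`, and whether the generated links include the `md/` output segment is not covered.

```typescript
// Maps a path relative to the content directory to a link target.
function toMarkdownUrl(relativePath: string, baseUrl = ''): string {
  const segments = relativePath.split('/');
  const file = segments.pop() ?? '';
  const stem = file
    .replace(/\.(mdx|md)$/, '') // drop the source extension
    .replace(/\.[a-z]{2}$/, ''); // drop a two-letter locale suffix such as .en or .ru

  // Index files take the name of their directory; a root-level index keeps its name.
  const isNestedIndex = stem === 'index' && segments.length > 0;
  if (!isNestedIndex) segments.push(stem);

  const path = `${segments.join('/')}.md`;
  const base = baseUrl.replace(/\/+$/, ''); // avoid a double slash when joining
  return base ? `${base}/${path}` : path;
}
```

With no base URL the function returns a relative path, which matches the empty default of `--base-url`.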

License

MIT