get-llms-txt
v1.0.1
Generate LLM-friendly llms.txt files from markdown and MDX content files
🌟 Features
- 🔍 Markdown/MDX scanning - Recursively finds all .mdx and .md files in your content directory
- 📄 Generate llms.txt - High-level, markdown-like index of your project with categorized file lists
- 📚 Markdown conversion - Converts MDX files to plain markdown, removing JSX components and metadata
- 🎛️ Configurable - Customize content directory, output directory, base URL, and project metadata
- 🧰 CLI + API - Use as a CLI tool or call programmatically in your own scripts
- ⚡ Fast and efficient - Only processes content files, ignores node_modules, .next, and out directories
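The scanning rule described in the features above can be pictured as a simple path filter. The following is a minimal sketch, not the package's actual implementation; `isContentFile`, `IGNORED_DIRS`, and `CONTENT_EXTENSIONS` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper: not part of the get-llms-txt API.
const IGNORED_DIRS = new Set(["node_modules", ".next", "out"]);
const CONTENT_EXTENSIONS = [".md", ".mdx"];

/** Returns true when a relative path looks like a content file worth processing. */
function isContentFile(relativePath: string): boolean {
  // Normalize Windows separators so the check works on any platform.
  const segments = relativePath.replace(/\\/g, "/").split("/");
  const fileName = segments[segments.length - 1] ?? "";
  const dirs = segments.slice(0, -1);

  // Skip anything nested under an ignored directory.
  if (dirs.some((dir) => IGNORED_DIRS.has(dir))) return false;

  // Keep only markdown and MDX files.
  return CONTENT_EXTENSIONS.some((ext) => fileName.endsWith(ext));
}
```

A recursive directory walk would call a filter like this on every path it visits, which is one way to avoid reading dependency or build folders at all.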
📖 Usage
CLI Usage
Install the package:
npm install --save-dev get-llms-txt
or
yarn add -D get-llms-txt
or
pnpm add -D get-llms-txt
Run the CLI:
npx get-llms-txt
By default, this will:
- Scan ./content directory for .mdx and .md files
- Generate llms.txt in ./out (if exists) or ./public
- Create plain markdown versions in <output-dir>/md/ directory
Basic usage for MD/MDX files:
The tool processes any .md or .mdx files in your content directory:
- Extracts metadata (title, description, tags)
- Converts MDX to plain markdown (removes JSX components)
- Generates categorized llms.txt index file
- Creates individual .md files for each content file
CLI Options
npx get-llms-txt [options]
Options:
- -c, --content-dir <dir> - Content directory to scan (default: ./content)
- -o, --output-dir <dir> - Output directory for llms.txt and md files (default: ./out if exists, otherwise ./public)
- -u, --base-url <url> - Base URL for links in llms.txt (default: empty)
- -n, --project-name <name> - Project name for llms.txt (default: "Personal Website & Blog")
- -d, --project-description <desc> - Project description (default: auto-generated)
- -h, --help - Show help message
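The documented default for --output-dir (./out if it exists, otherwise ./public) can be expressed as a small resolution function. This is only a sketch of that rule under the assumption that an explicit value always wins; `resolveOutputDir` and its `exists` parameter are hypothetical and not part of the package.

```typescript
// Hypothetical sketch of the documented default; the real CLI may differ.
function resolveOutputDir(
  explicit: string | undefined,
  exists: (dir: string) => boolean,
): string {
  // An explicitly passed --output-dir takes priority.
  if (explicit) return explicit;
  // Otherwise prefer ./out when it is present, and fall back to ./public.
  return exists("./out") ? "./out" : "./public";
}
```

Passing the existence check as a parameter keeps the rule easy to test; in a Node.js script it could be something like `fs.existsSync`.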
Examples
# Use default settings
npx get-llms-txt
# Specify custom directories
npx get-llms-txt --content-dir ./content --output-dir ./out
# With base URL for production
npx get-llms-txt --base-url https://example.com --output-dir ./out
# Custom project name and description
npx get-llms-txt --project-name "My Blog" --project-description "Technical blog about software development"
Programmatic Usage
import {generateLlmsFiles} from 'get-llms-txt';
await generateLlmsFiles({
  contentDir: './content',
  outputDir: './out',
  baseUrl: 'https://example.com',
  projectName: 'My Next.js Project',
  projectDescription: 'A collection of technical content',
});
🛠️ Installation
npm install --save-dev get-llms-txt
API
generateLlmsFiles(options: GenerateOptions): Promise<void>
Main function to generate llms.txt files.
Parameters:
- options.contentDir (string) - Content directory to scan
- options.outputDir (string) - Output directory for llms.txt and md files
- options.baseUrl? (string) - Base URL for links in llms.txt
- options.projectName? (string) - Project name for llms.txt
- options.projectDescription? (string) - Project description
processMDXFile(filePath: string): ProcessedContent
Process an MDX file and extract metadata and content.
processMDFile(filePath: string): ProcessedContent
Process a Markdown file and extract metadata and content.
extractTitle(filePath: string, metadata: FileMetadata, content: string): string
Extract title from file metadata, first H1, or filename.
extractDescription(metadata: FileMetadata, content: string): string | undefined
Extract description from metadata or first paragraph.
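The fallback order described for extractTitle and extractDescription (metadata first, then the content, then the filename) might look roughly like the sketch below. These are not the package's functions: `SketchMetadata`, `sketchExtractTitle`, and `sketchExtractDescription` are hypothetical stand-ins, and the real FileMetadata type may carry more fields.

```typescript
// Hypothetical types and helpers; not the package's own implementation.
interface SketchMetadata {
  title?: string;
  description?: string;
}

function sketchExtractTitle(filePath: string, metadata: SketchMetadata, content: string): string {
  // 1. Prefer an explicit title from the metadata.
  if (metadata.title) return metadata.title;

  // 2. Otherwise use the first level-one heading, e.g. "# My Post".
  const h1 = content.match(/^#[ \t]+(.+)$/m);
  if (h1 && h1[1]) return h1[1].trim();

  // 3. Fall back to the file name without directories or extension.
  const base = filePath.replace(/\\/g, "/").split("/").pop() ?? filePath;
  return base.replace(/\.(mdx|md)$/, "");
}

function sketchExtractDescription(metadata: SketchMetadata, content: string): string | undefined {
  // Prefer an explicit description from the metadata.
  if (metadata.description) return metadata.description;

  // Otherwise take the first blank-line-separated block that is not a heading.
  return content
    .split(/\n\s*\n/)
    .map((block) => block.trim())
    .find((block) => block.length > 0 && !block.startsWith("#"));
}
```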
What It Does
- Scans Content Directory: Recursively finds all .mdx and .md files in the content directory
- Extracts Metadata: Extracts title, description, and other metadata from MDX files
- Converts to Markdown:
  - Removes MDX metadata exports
  - Strips JSX components
  - Converts to plain markdown
- Generates Individual .md Files: Creates markdown versions in <output-dir>/md/ directory
- Generates llms.txt: Creates llms.txt file in the output directory with:
  - Project name and description
  - Content structure overview
  - Categorized file lists with links
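To make the last step above concrete, here is a minimal sketch of how a categorized index could be assembled. It assumes the common llms.txt layout (a top-level heading, a blockquote summary, then one section of links per category); `IndexEntry` and `renderLlmsTxt` are hypothetical names, and the exact output of get-llms-txt may differ.

```typescript
// Hypothetical shapes for illustration only.
interface IndexEntry {
  title: string;
  url: string; // e.g. "/md/blog/post1.md" (assumed URL shape)
  category: string; // e.g. "blog"
  description?: string;
}

function renderLlmsTxt(projectName: string, projectDescription: string, entries: IndexEntry[]): string {
  // Group entries by category, preserving first-seen order.
  const groups = new Map<string, IndexEntry[]>();
  for (const entry of entries) {
    const bucket = groups.get(entry.category) ?? [];
    bucket.push(entry);
    groups.set(entry.category, bucket);
  }

  // Header: project name and description.
  const lines: string[] = [`# ${projectName}`, "", `> ${projectDescription}`, ""];

  // One section per category, one link per file.
  groups.forEach((items, category) => {
    lines.push(`## ${category}`, "");
    for (const item of items) {
      const suffix = item.description ? `: ${item.description}` : "";
      lines.push(`- [${item.title}](${item.url})${suffix}`);
    }
    lines.push("");
  });

  return lines.join("\n");
}
```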
Output Structure
<output-dir>/
├── llms.txt
└── md/
    ├── blog/
    │   ├── post1.md
    │   └── ...
    ├── apps/
    │   ├── app1.md
    │   └── ...
    └── research/
        └── ...
File Processing
MDX Files
The script supports two common metadata formats used in Next.js projects:
YAML Frontmatter (most common):
---
title: My Post
description: A great post
tags: [blog, tech]
---
# My Post
Content here...
Export const metadata (Next.js MDX format):
export const metadata = {
  title: 'My Post',
  description: 'A great post',
  tags: ['blog', 'tech'],
};
# My Post
Content here...
The script automatically detects and handles both formats. It also:
- Removes React/JSX components
- Strips import statements
- Converts to plain markdown
- Preserves markdown structure
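The MDX cleanup listed above can be approximated with a few regular expressions. This is a deliberately simplified sketch, not how the package necessarily works: `splitFrontmatter` and `stripMdx` are hypothetical helpers, frontmatter values are kept as raw strings, and a production converter would more likely rely on a real MDX parser.

```typescript
// Hypothetical, simplified sketch; regexes will miss edge cases a parser handles.
function splitFrontmatter(source: string): { frontmatter: Record<string, string>; body: string } {
  const match = source.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/);
  if (!match) return { frontmatter: {}, body: source };

  // Parse flat "key: value" lines only; values stay as raw strings.
  const frontmatter: Record<string, string> = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon > 0) frontmatter[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { frontmatter, body: source.slice(match[0].length) };
}

function stripMdx(body: string): string {
  return body
    // Drop an `export const metadata = { ... };` block (flat objects only).
    .replace(/^export const metadata\s*=\s*\{[\s\S]*?\};?[ \t]*$\r?\n?/m, "")
    // Drop single-line import statements that end in a quoted module path.
    .replace(/^import\s.*['"];?[ \t]*$\r?\n?/gm, "")
    // Drop capitalized JSX tags (opening, closing, self-closing), keeping inner text.
    .replace(/<\/?[A-Z][\w.]*[^>]*>/g, "")
    // Collapse the blank lines left behind.
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}
```

Lowercase HTML tags are left untouched here, on the assumption that only capitalized tags are React components.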
MD Files
- Extracts YAML frontmatter (if present)
- Preserves markdown content as-is
Integration with Build Process
For Next.js static export (or any static site generator), add to your build script:
{
  "scripts": {
    "build": "next build && npx get-llms-txt --output-dir ./out",
    "deploy:prod": "next build && npx get-llms-txt --output-dir ./out --base-url https://example.com && firebase deploy --only hosting"
  }
}
Compatibility
✅ Optimized for Next.js projects using MDX/MD files, but also works with:
- ✅ Any static site generator (Gatsby, Astro, etc.)
- ✅ Documentation sites (Docusaurus, VitePress, etc.)
- ✅ Any project with markdown/MDX content files
Supported formats:
- ✅ YAML frontmatter (most common)
- ✅ export const metadata (Next.js MDX format)
- ✅ Mixed formats
- ✅ Files with no metadata (falls back to filename/H1 extraction)
- ✅ Any directory structure (recursively scans)
- ✅ Locale suffixes (.en, .ru, etc.) are handled automatically
Requirements:
- Node.js 22+
- MDX or MD files in a content directory (configurable)
- TypeScript support (for programmatic usage)
URL Generation
- Locale suffixes (.en, .ru) are removed from filenames
- All files are converted to .md extension
- Directory structure is preserved
- URLs are relative to the base URL (if provided)
- Index files are handled specially (uses directory name)
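The rules above could be combined into a single path-to-URL mapping, sketched below. Several details are assumptions rather than documented behavior: the two-letter locale pattern, the /md/ path prefix, and the reading of "uses directory name" as mapping blog/index.mdx to blog.md. `toLlmsUrl` is a hypothetical name.

```typescript
// Hypothetical sketch; the locale pattern, "/md/" prefix, and index handling are assumptions.
function toLlmsUrl(relativePath: string, baseUrl = ""): string {
  const segments = relativePath.replace(/\\/g, "/").split("/");
  const fileName = segments.pop() ?? "";

  // "post1.en.mdx" -> "post1": drop the extension, then a two-letter locale suffix.
  const stem = fileName.replace(/\.(mdx|md)$/, "").replace(/\.[a-z]{2}$/, "");

  if (stem === "index" && segments.length > 0) {
    // One possible reading of the index rule: "blog/index.mdx" becomes "blog.md".
    const dirName = segments.pop() ?? "";
    segments.push(`${dirName}.md`);
  } else {
    // Every other file keeps its directory and gets a .md extension.
    segments.push(`${stem}.md`);
  }

  // Prefix with the base URL (if any), avoiding a doubled slash.
  return baseUrl.replace(/\/+$/, "") + `/md/${segments.join("/")}`;
}
```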
License
MIT
