@ammit/llms-txt
v0.1.0
Generate llms.txt files for any website. Make your site AI-friendly.
llms-txt
llms.txt is a proposed standard that helps AI agents understand your website. This tool converts your pages into clean, token-efficient markdown that LLMs can consume directly, reducing token usage by ~90% compared to raw HTML.
Only 5-15% of websites have llms.txt today. This tool generates one for you automatically.
Install
npm install -g @ammit/llms-txt
Or run directly:
npx @ammit/llms-txt https://example.com
Usage
# Generate llms.txt for a site
llms-txt https://docs.anthropic.com
# Set crawl depth
llms-txt https://example.com --depth 3
# Write output to a file
llms-txt https://example.com -o ./llms.txt
# Include/exclude URL patterns
llms-txt https://example.com --include "/docs/**" --exclude "/blog/**"
# Generate llms-full.txt (all content bundled)
llms-txt https://example.com --full
What it does
- Discovers pages via sitemap.xml and link following
- Extracts clean content using Readability (strips nav, ads, scripts)
- Converts to markdown via Turndown
- Outputs a standard llms.txt index and optional llms-full.txt bundle
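The sitemap half of the discovery step can be sketched roughly as follows. This is an illustrative sketch, not the tool's actual code: `extractSitemapUrls` and `sampleSitemap` are hypothetical names, and a regular expression stands in for a real XML parser, which a production crawler would more likely use.

```typescript
// Hypothetical helper, not part of @ammit/llms-txt: pull page URLs out of sitemap XML.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  // Match the text inside each <loc>...</loc> element.
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    // Sitemaps XML-escape ampersands, so undo that to get usable URLs.
    urls.push(match[1].replace(/&amp;/g, "&"));
  }
  // Drop duplicates while keeping first-seen order.
  return Array.from(new Set(urls));
}

// Assumed sample input for illustration only.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/quickstart</loc></url>
  <url><loc>https://example.com/docs/install</loc></url>
  <url><loc>https://example.com/docs/quickstart</loc></url>
</urlset>`;

console.log(extractSitemapUrls(sampleSitemap));
```

When a site has no sitemap, the list above says the tool falls back to following links; that path is not shown here.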
Output format
The generated llms.txt follows the llms.txt standard:
# Example Docs
> Documentation for the Example platform.
## Getting Started
- [Quick Start](https://example.com/docs/quickstart): Set up your first project in 5 minutes
- [Installation](https://example.com/docs/install): System requirements and install steps
## API Reference
- [Authentication](https://example.com/docs/api/auth): API keys and OAuth setup
- [Endpoints](https://example.com/docs/api/endpoints): Complete REST API reference
Options
| Flag | Description | Default |
|------|-------------|---------|
| --depth, -d | Max crawl depth | 3 |
| --output, -o | Output file path | stdout |
| --full | Also generate llms-full.txt | false |
| --include | URL patterns to include (glob) | all |
| --exclude | URL patterns to exclude (glob) | none |
| --rate | Requests per second | 2 |
| --concurrency, -c | Parallel requests | 5 |
| --json | Output as JSON | false |
| --verbose, -v | Show detailed skip/fetch logging | false |
| --timeout | Fetch timeout in milliseconds | 10000 |
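To make the `--include` and `--exclude` rows more concrete, here is one way glob filtering could behave. It is a minimal sketch under stated assumptions: patterns are matched against the URL path, `*` stays within one path segment, and `**` crosses segments. `globToRegExp` and `shouldCrawl` are hypothetical names, and the tool's real matcher may follow different rules.

```typescript
// Hypothetical glob-to-RegExp conversion; the tool's real matcher may differ.
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      if (glob[i + 1] === "*") {
        source += ".*"; // "**" crosses path separators
        i++;
      } else {
        source += "[^/]*"; // "*" stays within one path segment
      }
    } else {
      // Escape characters that are special in regular expressions.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${source}$`);
}

// Assumption: filters apply to the URL path, and an empty include list means "all".
function shouldCrawl(pathname: string, include: string[], exclude: string[]): boolean {
  const matches = (patterns: string[]): boolean =>
    patterns.some((pattern) => globToRegExp(pattern).test(pathname));
  const included = include.length === 0 || matches(include);
  return included && !matches(exclude);
}

console.log(shouldCrawl("/docs/api/auth", ["/docs/**"], ["/blog/**"])); // true
console.log(shouldCrawl("/blog/2024/launch", ["/docs/**"], ["/blog/**"])); // false
```

Under these assumed rules an exclude match always wins over an include match, which is a common convention but not something the table above confirms.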
How it works
URL --> Sitemap/Link Discovery --> Content Extraction --> Markdown Conversion --> llms.txt
- Discovery: Checks sitemap.xml first, falls back to recursive link following
- Extraction: Mozilla Readability isolates main content, removes chrome
- Conversion: Turndown produces clean GFM markdown
- Assembly: Groups pages by URL path into sections, generates descriptions
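The assembly step could look something like the sketch below, which groups pages by their first URL path segment and renders the index in the output format shown earlier. The `Page` shape, `sectionName`, `renderLlmsTxt`, and the sample URLs are all assumptions made for illustration; how the tool actually names sections and generates descriptions is not specified here.

```typescript
// Assumed shape for an already-extracted page.
interface Page {
  url: string;
  title: string;
  description: string;
}

// Hypothetical rule: the section is the first path segment, capitalized.
function sectionName(url: string): string {
  const segments = new URL(url).pathname.split("/").filter(Boolean);
  const first = segments.length > 0 ? segments[0] : "pages";
  return first.charAt(0).toUpperCase() + first.slice(1);
}

function renderLlmsTxt(siteTitle: string, summary: string, pages: Page[]): string {
  // Group pages by section; a Map keeps sections in first-seen order.
  const sections = new Map<string, Page[]>();
  for (const page of pages) {
    const name = sectionName(page.url);
    const bucket = sections.get(name) ?? [];
    bucket.push(page);
    sections.set(name, bucket);
  }

  // Title, blockquote summary, then one "##" section per group.
  const lines: string[] = [`# ${siteTitle}`, "", `> ${summary}`];
  sections.forEach((bucket, name) => {
    lines.push("", `## ${name}`, "");
    for (const page of bucket) {
      lines.push(`- [${page.title}](${page.url}): ${page.description}`);
    }
  });
  return lines.join("\n") + "\n";
}

// Assumed sample data for illustration only.
const samplePages: Page[] = [
  { url: "https://example.com/guides/quickstart", title: "Quick Start", description: "Set up your first project in 5 minutes" },
  { url: "https://example.com/api/auth", title: "Authentication", description: "API keys and OAuth setup" },
  { url: "https://example.com/guides/install", title: "Installation", description: "System requirements and install steps" },
];

console.log(renderLlmsTxt("Example Docs", "Documentation for the Example platform.", samplePages));
```

Deriving section names mechanically from paths yields headings like "Guides" and "Api" rather than the hand-written "Getting Started" and "API Reference" in the sample output, so a real implementation presumably applies more logic than this.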
Contributing
Contributions welcome. Please open an issue first to discuss what you'd like to change.
