@geosuite/llms-txt-generator

v0.3.2

Generate an llms.txt file from a sitemap.xml — the proposed standard from llmstxt.org for guiding LLMs to the most useful content on a website.

llms-txt-generator

A small Node CLI that turns a sitemap.xml into an llms.txt file — the proposed standard for guiding LLMs to the most useful content on a website.

Created and invented by Matteo Perino (LinkedIn). Built and maintained by GeoSuite (Matteo Perino).


What is llms.txt?

llms.txt is a proposed standard, introduced at llmstxt.org, for sites to publish a curated, LLM-friendly index of their most important content. Think of it as a robots.txt or sitemap.xml, but tuned for the way language models read the web: a markdown file with a clear H1 site name, an optional blockquote summary, and ## Section-delimited lists of links with short descriptions.

The format is intentionally minimal so it can be parsed both by language models and by classical tooling (regex, simple parsers). The full spec lives at llmstxt.org.
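To illustrate the "classical tooling" point, here is a minimal sketch of a line-by-line regex reader for the structure described above. The function name `parseLlmsTxt` and the shape of the returned object are assumptions made for this example; they are not part of this package or of the spec.

```javascript
// Minimal llms.txt reader (illustrative only, not the generator's own code).
// Assumes one construct per line: "# Name", "> Summary", "## Section",
// and "- [Title](url): description" list items.
function parseLlmsTxt(text) {
  const result = { name: null, summary: null, sections: [] };
  let current = null;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    let m;
    if ((m = /^# (.+)$/.exec(line))) {
      result.name = m[1]; // H1: site name
    } else if ((m = /^> (.+)$/.exec(line))) {
      result.summary = m[1]; // blockquote summary
    } else if ((m = /^## (.+)$/.exec(line))) {
      current = { title: m[1], links: [] }; // start a new section
      result.sections.push(current);
    } else if (current && (m = /^- \[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?$/.exec(line))) {
      // The description after the colon is optional.
      current.links.push({ title: m[1], url: m[2], description: m[3] ?? '' });
    }
  }
  return result;
}
```

The sketch ignores anything it does not recognize, so free-form paragraphs between sections are skipped rather than rejected.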

Why GeoSuite cares

GeoSuite helps brands measure and improve how AI engines — ChatGPT, Perplexity, Gemini, Google AI Overviews — describe and recommend them. A well-structured llms.txt is one of the cheapest, most defensible signals a site can ship today: it tells answer engines exactly which pages are canonical, which sections matter, and what each one is about, without forcing them to re-derive that structure from your full sitemap. We open-sourced this generator so any team can produce a clean llms.txt from a sitemap they already maintain.

Install

Requires Node.js 20 or later (uses the native fetch API).

# Run without installing
npx @geosuite/llms-txt-generator https://example.com/sitemap.xml

# Or install globally
npm install -g @geosuite/llms-txt-generator
geosuite-llms-txt --help

# Or as a project dev dependency
npm install --save-dev @geosuite/llms-txt-generator

Usage

The simplest invocation reads a sitemap and prints the llms.txt to stdout:

geosuite-llms-txt https://example.com/sitemap.xml \
  --name="Example" \
  --summary="Example is a demo site for the llms.txt format."

Write the output to a file instead:

geosuite-llms-txt ./public/sitemap.xml \
  --name="Example" \
  --out=./public/llms.txt

Enrich each entry by fetching the page and extracting <title> and <meta name="description">:

geosuite-llms-txt https://example.com/sitemap.xml \
  --name="Example" \
  --enrich \
  --concurrency=10 \
  --max-entries=500 \
  --out=llms.txt

Sitemap-index files are flattened one level automatically — pass them like a flat sitemap and the tool will fetch the child sitemaps for you.
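The one-level flattening described above can be sketched roughly as follows. This is not the package's implementation: `extractLocs`, `flattenSitemap`, and the `loadXml` callback are hypothetical names, and child sitemaps are supplied through a synchronous callback only to keep the sketch network-free (a real pipeline would fetch them asynchronously).

```javascript
// Collect every <loc> value from a sitemap document.
// Regex-based and illustrative: no CDATA handling, no entity decoding.
function extractLocs(xml) {
  return [...xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map((m) => m[1]);
}

// Flatten one level: if the document is a <sitemapindex>, replace each child
// sitemap URL with the page URLs found in that child sitemap.
// `loadXml` is a hypothetical callback returning the XML text for a URL.
function flattenSitemap(xml, loadXml) {
  if (!/<sitemapindex[\s>]/.test(xml)) {
    return extractLocs(xml); // plain <urlset>: nothing to flatten
  }
  return extractLocs(xml).flatMap((childUrl) => extractLocs(loadXml(childUrl)));
}
```

Because only one level is flattened, an index that points at further index files would yield sitemap URLs rather than page URLs in this sketch.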

CLI flags

| Flag | Default | Description |
| ---- | ------- | ----------- |
| <sitemap> | required | First positional argument. URL (https://...) or local path to a sitemap.xml. Both <urlset> and <sitemapindex> formats are accepted. |
| --name=<text> | Website | Site name rendered as the H1 of the output. |
| --summary=<text> | none | Short summary rendered as a blockquote under the H1. |
| --out=<path> | stdout | Write the rendered file to this path instead of stdout. |
| --enrich | off | Fetch each URL once to extract <title> and meta description. Best-effort: failures and non-HTML responses are silently skipped. |
| --concurrency=<n> | 5 | Parallel HTTP requests when --enrich is set. Clamped to [1, 64]. |
| --max-entries=<n> | unlimited | Cap on the number of URLs processed. Useful for huge sitemaps or quick iteration. |
| --help, -h | n/a | Print usage and exit. |

When --enrich is on, requests use User-Agent: geosuite-llms-txt-generator/0.1.0 and a 10-second timeout per URL.
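For readers building their own pipeline, the clamping and timeout behaviour described above might look something like the sketch below. Both function names are hypothetical, and falling back to 5 for non-numeric input is an assumption; the README only states the default and the [1, 64] range.

```javascript
// Clamp a user-supplied concurrency value to the documented range [1, 64].
// Assumption: non-numeric input falls back to the documented default of 5.
function clampConcurrency(value, fallback = 5) {
  const n = Number.parseInt(value, 10);
  if (Number.isNaN(n)) return fallback;
  return Math.min(64, Math.max(1, n));
}

// One request with the documented User-Agent and a 10-second timeout.
// Uses the native fetch API and AbortSignal.timeout available in Node.js 20.
function fetchWithTimeout(url) {
  return fetch(url, {
    headers: { 'User-Agent': 'geosuite-llms-txt-generator/0.1.0' },
    signal: AbortSignal.timeout(10_000),
  });
}
```

With these helpers, `clampConcurrency('500')` yields 64 and `clampConcurrency('0')` yields 1.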

Output format

The generator emits the structure described at llmstxt.org: an H1, an optional blockquote, then one ## Section per top-level path prefix, with - [Title](url): description list items.

Example:

# Example

> A short, human-friendly summary of what the site is and who it's for.

## Main

- [Home](https://example.com/): Welcome page and high-level overview.
- [About](https://example.com/about): Who we are and what we do.

## Blog

- [Intro to llms.txt](https://example.com/blog/intro-to-llms-txt): Why the format exists and how to author one.
- [Why GEO matters](https://example.com/blog/why-geo-matters): A primer on Generative Engine Optimization.

## Docs

- [Getting started](https://example.com/docs/getting-started): Install, configure, and run your first build.
- [API reference](https://example.com/docs/api-reference): Endpoints, parameters, and response shapes.

A longer real-world example lives at examples/sample-output.txt.
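As a rough illustration of how that structure can be produced, the sketch below renders already-grouped entries into the same layout. It is not the package's `renderLlmsTxt`; the name `renderSketch`, the input shape, and the choice to omit the colon when an entry has no description are all assumptions for this example.

```javascript
// Render grouped entries into the llms.txt layout shown above.
// Illustrative only: input shape and missing-description handling are assumed.
function renderSketch({ name = 'Website', summary, sections }) {
  const lines = [`# ${name}`, ''];
  if (summary) lines.push(`> ${summary}`, ''); // optional blockquote
  for (const section of sections) {
    lines.push(`## ${section.title}`, '');
    for (const { title, url, description } of section.entries) {
      lines.push(
        description ? `- [${title}](${url}): ${description}` : `- [${title}](${url})`
      );
    }
    lines.push(''); // blank line after each section
  }
  return lines.join('\n');
}
```

Keeping the renderer a pure function of its input makes the output deterministic, which matches the generator's stated design.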

Grouping rules

URLs are grouped by their first path segment:

  • https://site.com/ → section Main
  • https://site.com/blog/anything → section Blog
  • https://site.com/case-studies/x → section Case Studies (kebab-case is title-cased)

Within a section, entries appear in sitemap order. The Main section is always rendered first; remaining sections are sorted alphabetically.
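The rules above can be approximated with the sketch below. It is not the package's `groupByPrefix`; `sectionTitle` and `groupBySection` are hypothetical names. One assumption goes beyond the stated rules: top-level pages such as /about are placed in Main, because the sample output lists About under Main.

```javascript
// Derive a section title from a URL.
// Assumption: pages with fewer than two path segments (/, /about) go to Main,
// as the sample output suggests; deeper pages use the first segment.
function sectionTitle(url) {
  const segments = new URL(url).pathname.split('/').filter(Boolean);
  if (segments.length < 2) return 'Main';
  return segments[0]
    .split('-')
    .filter(Boolean)
    .map((word) => word[0].toUpperCase() + word.slice(1)) // kebab-case to Title Case
    .join(' ');
}

// Group URLs by section, keeping sitemap order within each section.
// Main is rendered first; the remaining sections are sorted alphabetically.
function groupBySection(urls) {
  const groups = new Map();
  for (const url of urls) {
    const title = sectionTitle(url);
    if (!groups.has(title)) groups.set(title, []);
    groups.get(title).push(url);
  }
  const titles = [...groups.keys()].sort((a, b) => {
    if (a === 'Main') return -1;
    if (b === 'Main') return 1;
    return a.localeCompare(b);
  });
  return titles.map((title) => ({ title, urls: groups.get(title) }));
}
```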

Programmatic API

The package also exports its building blocks for custom pipelines:

import {
  loadSitemap,
  parseSitemap,
  enrichEntry,
  groupByPrefix,
  renderLlmsTxt,
} from '@geosuite/llms-txt-generator';

const { entries } = await loadSitemap('https://example.com/sitemap.xml');
await Promise.all(entries.map((e) => enrichEntry(e)));
const groups = groupByPrefix(entries);
const txt = renderLlmsTxt(groups, { name: 'Example', summary: 'A demo site.' });

Out of scope

  • robots.txt honoring — this tool is run by the site owner against their own sitemap, not as a third-party crawler. If you point it at a domain you don't own, please respect that domain's terms.
  • JavaScript rendering — enrichment uses a single HTTP fetch and regex extraction. If your titles and descriptions are only set client-side, run the tool against a pre-rendered build.
  • Embeddings, summaries, or LLM calls — the generator is deterministic and does not call any model. Authoring the --summary text is up to you.
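To make the "single HTTP fetch and regex extraction" limitation concrete, here is a minimal sketch of regex-based extraction from an HTML string. The function name `extractMeta` and its patterns are assumptions for illustration; the package's actual expressions may differ.

```javascript
// Pull <title> and <meta name="description"> out of raw HTML with regexes.
// Illustrative only: no entity decoding, and markup injected client-side by
// JavaScript is invisible to this approach.
function extractMeta(html) {
  const title = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  // Find the description meta tag first, then read its content attribute,
  // so the attribute order inside the tag does not matter.
  const metaTag = /<meta\b[^>]*\bname=["']description["'][^>]*>/i.exec(html);
  const content = metaTag && /\bcontent=(["'])([\s\S]*?)\1/i.exec(metaTag[0]);
  const tidy = (s) => s.replace(/\s+/g, ' ').trim(); // collapse whitespace
  return {
    title: title ? tidy(title[1]) : null,
    description: content ? tidy(content[2]) : null,
  };
}
```

A page whose head is populated only after hydration returns null for both fields here, which is why running against a pre-rendered build is the suggested workaround.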

Contributing

See CONTRIBUTING.md. Issues and PRs welcome — please open an issue first for non-trivial changes so we can discuss scope.

AI mode (opt-in, 0.2+)

Without any env keys, the tool's enrichment uses regex on <title> and <meta name="description">. With an LLM key configured, the tool can rewrite each description as a tight one-liner suitable for citation:

export OPENAI_API_KEY=sk-…           # or ANTHROPIC_API_KEY=sk-ant-…
geosuite-llms-txt https://example.com/sitemap.xml --ai

--ai implies --enrich. We send only the URL + extracted title + meta description to the provider — never the page body. A typical run on 200 URLs stays under a couple of cents on small models (gpt-5-mini / claude-haiku-4-5).

Privacy: enabling --ai sends content to the corresponding API. Don't turn it on against URLs you wouldn't paste into their UI.

Related: GeoSuite open-source tools

llms-txt-generator is part of a small family of zero-dependency CLIs we maintain to make Generative Engine Optimization (GEO) measurable from the terminal:

  • @geosuite/ai-crawler-bots — curated AI bot user-agent list with a CLI that tells you whether GPTBot, ClaudeBot, PerplexityBot and friends can read your site and where the block came from.
  • @geosuite/schema-templates — copy-paste-ready schema.org JSON-LD templates with a local validator. Use it to ship Organization, Product, FAQPage, BreadcrumbList, etc. without hand-rolling structured data.
  • @geosuite/sitemap-builder — crawl a site and emit a valid sitemap.xml, for sites that ship without one.

The same checks are also surfaced as a hosted product at trygeosuite.it for teams who want history, alerts, and CTAs wired into their content pipeline.

Creator

Created and invented by Matteo Perino · LinkedIn · [email protected].

Ideated, designed and validated by Matteo Perino. Implementation written with AI assistance, maintained under GeoSuite.

License

MIT © 2026 Matteo Perino and GeoSuite