cmdgraph
v0.3.0
CLI documentation introspection tool for AI agents
cmdgraph
Recursive CLI documentation introspection for humans and AI agents.
cmdgraph runs CLI help commands (--help, -h, -H, or help), discovers subcommands recursively, parses the output into structured data, and exports documentation as JSON, Markdown, static HTML, llms.txt, and sitemap.xml.
Why cmdgraph?
Most CLIs are documented in unstructured terminal text. cmdgraph turns that into:
- Machine-friendly JSON for indexing, retrieval, and agent pipelines
- Human-readable Markdown for generated docs and internal references
- Static single-page HTML for hosting CLI docs as a site
- Explicit llms.txt output for LLM-facing discovery
- Explicit sitemap.xml output for search-engine discovery
- A command tree (AST) that preserves hierarchy and relationships
Features
- Recursive command discovery from --help, -h, -H, or help
- Plugin parser system (heuristic, oclif, commander, yargs, cobra, thor, picocli, urfave-cli, system-commandline, commandlineparser, click, typer, clap, argparse)
- Best-effort metadata extraction for arguments, examples, and aliases
- Concurrency control for recursive help crawling
- Automatic in-memory caching of help outputs within a process
- Timeout-safe command execution using execa
- Non-interactive execution defaults (CI=1, NO_COLOR=1)
- JSON, Markdown, static HTML, llms.txt, and sitemap.xml output formats
- Searchable single-page HTML docs with client-side command filtering
- Search-engine and LLM-friendly discovery artifacts without coupling them to HTML output
- Unit + integration + e2e tests with deterministic fixtures
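The in-memory caching feature can be pictured as memoizing help output per command line. The sketch below is illustrative only and is not taken from cmdgraph's source; `createHelpCache` and `runHelp` are hypothetical names, and the choice not to cache failures is an assumption.

```javascript
// Hypothetical sketch: memoize help output per command line within one process.
// `runHelp` stands in for whatever actually executes the help command.
function createHelpCache(runHelp) {
  const cache = new Map() // key: full command line, value: Promise of help text

  return function getHelp(commandLine) {
    if (!cache.has(commandLine)) {
      // Store the promise itself so concurrent callers share one execution.
      const pending = new Promise((resolve) => resolve(runHelp(commandLine))).catch((error) => {
        cache.delete(commandLine) // assumption: failures are not cached
        throw error
      })
      cache.set(commandLine, pending)
    }
    return cache.get(commandLine)
  }
}
```

Because the promise is stored rather than the resolved text, two crawler branches that ask for the same help output at the same time trigger only one process spawn.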
Requirements
- Node.js >=18
Installation
npm install -g cmdgraph
CLI Usage
cmdgraph generate <command> [options]
Examples:
cmdgraph generate git --format=json --format=md --output=./docs
cmdgraph generate git --format=html --output=./site
cmdgraph generate git --format=html --output-html-title="Git CLI Docs" --output-html-project-link=https://github.com/acme/git-cli --output-html-readme=README.md --output=./site
cmdgraph generate git --format=html --format=llms-txt --format=sitemap --output-root-command-name=cmdgraph --output-llms-txt-base-url=https://docs.example.com/git/ --output-sitemap-base-url=https://docs.example.com/git/ --output=./site
cmdgraph generate git --max-depth=3 --concurrency=4 --format=json --output=./docs
cmdgraph generate kubectl --max-depth=3 --timeout=8000 --format=json --output=./docs
cmdgraph generate "node ./tools/my-cli.mjs" --parser=heuristic --format=md --output=./docs
Crawler options
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --max-depth | integer | | Maximum recursion depth for subcommands (when omitted, the crawl continues until leaf commands) |
| --concurrency | integer | 4 | Maximum number of help commands to run in parallel |
| --timeout | integer | 5000 | Per-command timeout in ms |
| --parser | string | | Force a parser plugin by name |
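How --max-depth and --concurrency might interact can be sketched as a level-by-level crawl with a bounded worker pool. This is a minimal illustration under stated assumptions, not cmdgraph's implementation: `crawl`, `mapLimit`, and the `listSubcommands` callback are hypothetical, and treating depth 1 as "the root's direct subcommands" is an assumed interpretation.

```javascript
// Run `fn` over `items` with at most `limit` calls in flight at once.
async function mapLimit(items, limit, fn) {
  const results = new Array(items.length)
  let next = 0
  async function worker() {
    while (next < items.length) {
      const index = next++
      results[index] = await fn(items[index])
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker())
  await Promise.all(workers)
  return results
}

// Hypothetical crawl: `listSubcommands(path)` is an assumed async callback
// that returns the subcommand names found in one help output.
async function crawl(root, listSubcommands, { maxDepth = Infinity, concurrency = 4 } = {}) {
  const tree = { name: root, path: [root], children: [] }
  let level = [tree]
  // With maxDepth omitted, the loop ends only when a level has no commands left.
  for (let depth = 0; depth < maxDepth && level.length > 0; depth++) {
    const childLists = await mapLimit(level, concurrency, async (node) => {
      const names = await listSubcommands(node.path)
      node.children = names.map((name) => ({ name, path: [...node.path, name], children: [] }))
      return node.children
    })
    level = childLists.flat()
  }
  return tree
}
```

Processing one level at a time keeps the concurrency cap easy to reason about, at the cost of waiting for the slowest command in each level before descending further.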
Output options
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --format | repeatable json \| md \| html \| llms-txt \| sitemap | json | Output format; repeat the flag to write multiple outputs |
| --output | string | ./docs | Output directory |
| --output-root-command-name | string | | Override the displayed root command name in generated outputs |
| --output-html-title | string | | Set HTML page title |
| --output-html-project-link | string | | Project URL shown in the HTML footer |
| --output-html-readme | string | | Path to a .md file rendered as a README section in the HTML page |
| --output-llms-txt-base-url | string | | Base URL used to generate llms.txt links |
| --output-sitemap-base-url | string | | Base URL used to generate sitemap.xml links (required for sitemap output) |
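The rules in the table (repeatable --format, JSON as the default, and a base URL required for sitemap output) can be expressed as a small validation step. The helper below is a hypothetical sketch; whether cmdgraph throws, warns, or exits when the base URL is missing is not stated here, so the thrown errors are an assumption.

```javascript
// Formats named in the output options table.
const SUPPORTED_FORMATS = ['json', 'md', 'html', 'llms-txt', 'sitemap']

// Hypothetical helper: normalize and validate the requested formats.
function resolveFormats(options = {}) {
  // Accept a single value or an array; default to JSON when omitted.
  const requested = options.format === undefined ? ['json'] : [].concat(options.format)
  const formats = [...new Set(requested)] // drop repeated flags

  for (const format of formats) {
    if (!SUPPORTED_FORMATS.includes(format)) {
      throw new Error(`Unsupported format: ${format}`)
    }
  }
  // Assumption: a missing sitemap base URL is treated as a hard error.
  if (formats.includes('sitemap') && !options['output-sitemap-base-url']) {
    throw new Error('sitemap output requires output-sitemap-base-url')
  }
  return formats
}
```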
Library Usage
cmdgraph can also be used as a library:
import { generate, introspect } from 'cmdgraph'

// Crawl the help output and get the parsed command tree plus any warnings
const { tree, warnings } = await introspect('git', {
  timeoutMs: 5000,
  concurrency: 4,
})

// Crawl and render the requested formats in one call
const generated = await generate('git', {
  timeout: 5000,
  concurrency: 4,
  parser: 'heuristic',
  'output-root-command-name': 'cmdgraph',
  'output-html-title': 'Git CLI Documentation',
  'output-html-project-link': 'https://github.com/haoliangyu/cmdgraph',
  'output-html-readme': './README.md',
  'output-llms-txt-base-url': 'https://docs.example.com/git/',
  'output-sitemap-base-url': 'https://docs.example.com/git/',
  format: ['json', 'md', 'html', 'llms-txt', 'sitemap'],
})

// Only the formats that were requested are present on the result
console.log(generated.json)
console.log(generated.markdown)
console.log(generated.html)
console.log(generated.llmsTxt)
console.log(generated.sitemap)
Library API notes:
- introspect(command, options) returns { tree, warnings }
- generate(command, options) returns { tree, json?, markdown?, html?, llmsTxt?, sitemap?, warnings }
- omit maxDepth/max-depth to recurse until leaf commands; set it explicitly to cap traversal depth
- options.format supports json, md, html, llms-txt, and sitemap; pass an array for multiple outputs, and omit it to default to JSON
- options.output-root-command-name overrides the displayed root command name in generated outputs
- options.output-html-title customizes the HTML page title
- options.output-html-project-link adds a project URL link to the HTML footer
- options.output-html-readme points to a .md file to render as a README section in HTML output
- options.output-sitemap-base-url is required for sitemap; options.output-llms-txt-base-url controls llms.txt links
- generate options align with CLI flag names: max-depth, timeout, concurrency, parser, format, output-root-command-name, output-html-title, output-html-project-link, output-html-readme, output-llms-txt-base-url, and output-sitemap-base-url
- advanced injection (executor, parserRegistry) is available for tests/custom integration
Supported Parsers
cmdgraph uses a plugin-based parser registry. You can force one with --parser, or let cmdgraph auto-detect.
- heuristic: default and fallback parser; handles common help layouts (Usage, Commands, Options/Flags); recommended for most tools.
- oclif: parser for oclif-style CLIs (supports uppercase section blocks such as USAGE, COMMANDS, FLAGS).
- commander: parser for Commander.js-style output (display help for command, output the version number).
- yargs: parser for yargs-style output (Show help, Show version number, type hints like [boolean]).
- cobra: parser for Cobra-style CLIs (Available Commands, Flags, Global Flags).
- thor: parser for Thor-style CLIs (Usage: ... COMMAND [ARGS], Commands/Tasks headings, e.g. Bundler CLI).
- picocli: parser for picocli-style Java CLIs (Show this help message and exit., Print version information and exit., e.g. Gradle).
- urfave-cli: parser for urfave/cli-style Go CLIs (NAME, USAGE, COMMANDS, GLOBAL OPTIONS).
- system-commandline: parser for .NET System.CommandLine CLIs (Usage: heading blocks and Show help and usage information).
- commandlineparser: parser for C# CommandLineParser CLIs (USAGE:, OPTIONS:, Display this help screen., Display version information.).
- click: parser for Click-style output ([OPTIONS], Show this message and exit).
- typer: parser for Typer-style output (Click-based plus completion flags and boxed sections).
- clap: parser for clap-style output (Print help, Print version).
- argparse: parser for Python argparse-style output (usage:, show this help message and exit).
Parser selection behavior:
- If --parser is provided, that parser is used.
- Otherwise, parser detect() methods are checked.
- If nothing matches, heuristic is used.
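The three selection rules above can be sketched as follows. The plugin shape (`name` plus `detect(helpText)`) and the `selectParser` function are assumptions for illustration; the real registry interface may differ, and the marker strings are simply borrowed from the parser descriptions in this section.

```javascript
// Hypothetical selection logic: forced parser, then detect(), then heuristic.
function selectParser(parsers, helpText, forcedName) {
  if (forcedName) {
    const forced = parsers.find((parser) => parser.name === forcedName)
    if (!forced) throw new Error(`Unknown parser: ${forcedName}`)
    return forced
  }
  // Skip heuristic during detection so it only acts as the fallback.
  const detected = parsers.find((parser) => parser.name !== 'heuristic' && parser.detect(helpText))
  return detected ?? parsers.find((parser) => parser.name === 'heuristic')
}

// Toy registry with simplified detectors (not the real detection rules).
const parsers = [
  { name: 'cobra', detect: (text) => text.includes('Available Commands:') },
  { name: 'argparse', detect: (text) => text.includes('show this help message and exit') },
  { name: 'heuristic', detect: () => true },
]

console.log(selectParser(parsers, 'Available Commands:\n  add', undefined).name)
```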
Output
For cmdgraph generate git --format=json --format=md --format=html --format=llms-txt --format=sitemap --output-llms-txt-base-url=https://docs.example.com/git/ --output-sitemap-base-url=https://docs.example.com/git/ --output=./docs, you get:
- docs/git.json
- docs/git.md
- docs/index.html
- docs/llms.txt
- docs/sitemap.xml
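When using the library instead of the CLI, the same file layout can be reproduced by pairing each key of the generate() result with the file names listed above. This is a hypothetical helper, assuming each output value is a string; `planWrites` is not part of cmdgraph, and how --output-root-command-name affects file names is not covered here.

```javascript
// Hypothetical helper: decide which files to write for a generate() result.
function planWrites(generated, rootCommand) {
  const targets = [
    ['json', `${rootCommand}.json`],
    ['markdown', `${rootCommand}.md`],
    ['html', 'index.html'],
    ['llmsTxt', 'llms.txt'],
    ['sitemap', 'sitemap.xml'],
  ]
  // Keep only the formats that were actually requested and returned.
  return targets
    .filter(([key]) => generated[key] !== undefined)
    .map(([key, fileName]) => ({ fileName, content: generated[key] }))
}

const plan = planWrites({ json: '{}', html: '<!doctype html>' }, 'git')
console.log(plan.map((entry) => entry.fileName))
```

Each entry can then be passed to a file-writing call of your choice, with the output directory prepended.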
Why these formats:
- JSON is agent-ready because it is structured, stable, and easy to index, diff, validate, and consume in automation pipelines.
- Markdown is human-readable because it is hierarchy-first, scannable in docs/reviews, and works well in repos, wikis, and generated documentation sites.
- HTML is hosting-ready because it renders the canonical command tree into a single accessible page with dark mode support and navigation for static-site deployment.
- llms.txt is explicit because it gives LLM crawlers a compact text map of the hosted documentation without embedding that responsibility into the HTML page itself.
- sitemap.xml is explicit because search-engine discovery depends on deployable site URLs, not just local output files.
JSON shape:
{
  "name": "git",
  "description": "The stupid content tracker",
  "usage": "git [options] [command]",
  "aliases": [],
  "arguments": [],
  "examples": [],
  "options": [
    { "flag": "-h, --help", "description": "display help" }
  ],
  "subcommands": ["add", "commit", "push"],
  "path": ["git"],
  "children": []
}
Matching Markdown output:
# Command Documentation
## git
The stupid content tracker
**Usage:** `git [options] [command]`
**Options**
- `-h, --help`: display help
**Subcommands**
- `add`
- `commit`
- `push`
HTML output characteristics:
- generated as a single index.html file for static hosting
- rendered from a React template via server-side rendering
- styled with Tailwind CSS and shadcn/ui-inspired component patterns
- modern light-green theme by default, with an accessible dark mode toggle
- includes client-side command filtering for large documentation pages
- includes crawlable semantic content, metadata, and structured data for search engines and LLM bots
Discovery artifact characteristics:
- llms.txt is generated separately and lists the hosted documentation page plus command-level anchors
- sitemap.xml is generated separately and requires --output-sitemap-base-url so it contains valid deployable URLs
- HTML output does not implicitly generate either file; request them explicitly with --format=llms-txt and --format=sitemap
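To make "the hosted documentation page plus command-level anchors" concrete, here is a rough sketch that derives link lines from a command tree shaped like the JSON output. The anchor scheme (path segments joined with hyphens) and the text layout are assumptions, not the format cmdgraph actually emits.

```javascript
// Collect the path of every command in the tree, parents before children.
function collectPaths(node) {
  return [node.path, ...node.children.flatMap((child) => collectPaths(child))]
}

// Hypothetical llms.txt builder: one link to the page, one per command anchor.
function buildLlmsTxt(tree, baseUrl) {
  const lines = [`# ${tree.name}`, '', `- [Documentation](${baseUrl})`]
  for (const path of collectPaths(tree)) {
    const anchor = path.join('-') // assumed anchor id, e.g. "git-remote"
    lines.push(`- [${path.join(' ')}](${baseUrl}#${anchor})`)
  }
  return lines.join('\n') + '\n'
}

const sampleTree = {
  name: 'git',
  path: ['git'],
  children: [{ name: 'remote', path: ['git', 'remote'], children: [] }],
}
console.log(buildLlmsTxt(sampleTree, 'https://docs.example.com/git/'))
```

The base URL matters because every link must resolve on the deployed site, which is also why the sitemap format refuses to run without one.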
Agent Reference Guide (Packaged JSON)
npm run build:docs:release now generates a JSON reference guide and places it inside the published package payload:
- dist/agent-reference/cmdgraph.json (stable path for agents)
How to use it in an agent/tooling workflow:
- Install the package.
- Read dist/agent-reference/cmdgraph.json from the installed package directory.
- Use the command tree, options, and examples as the source of truth when generating or validating cmdgraph usage.
Notes:
- The file is generated from live introspection of the built CLI.
- It is rebuilt on package release.
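As one way to use the reference guide for validation, an agent could look up a command by path and check proposed flags against its options. The helpers below are hypothetical and assume the packaged file follows the JSON shape shown in the Output section; loading the file from disk is left out, and the inline tree is made-up sample data.

```javascript
// Hypothetical lookup: walk `children` by name, starting at the root command.
function findCommand(tree, path) {
  if (path[0] !== tree.name) return undefined
  let node = tree
  for (const name of path.slice(1)) {
    node = node.children.find((child) => child.name === name)
    if (!node) return undefined
  }
  return node
}

// Split entries such as "-h, --help" or "--timeout <ms>" into bare flag names.
function knownFlags(node) {
  return node.options.flatMap((option) =>
    option.flag.split(',').map((flag) => flag.trim().split(/[\s=]/)[0])
  )
}

// Made-up sample data in the documented JSON shape.
const reference = {
  name: 'cmdgraph',
  options: [{ flag: '-h, --help', description: 'display help' }],
  children: [
    { name: 'generate', options: [{ flag: '--timeout <ms>', description: '' }], children: [] },
  ],
}

const command = findCommand(reference, ['cmdgraph', 'generate'])
console.log(knownFlags(command).includes('--timeout'))
```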
Testing
Run default tests (build + unit/integration):
npm test
Run real CLI e2e tests:
npm run test:e2e
Watch mode:
npm run test:watch
Current test coverage includes:
- Executor behavior (success + timeout)
- Heuristic parser with common and real-world fixtures (git, docker, kubectl, gh styles)
- Framework parser detection and parsing fixtures (oclif, commander, yargs, cobra, thor, picocli, urfave-cli, system-commandline, commandlineparser, click, typer, clap, argparse)
- Metadata extraction for aliases, arguments, and examples
- Library API tests for programmatic introspection and formatted output generation
- HTML formatter rendering and static site generation
- Explicit llms.txt and sitemap.xml generation and validation
- Integration crawling against a real fixture executable
- End-to-end generation through the built CLI, with auto-skip when target CLIs are unavailable (including Bundler/Thor-style, Gradle/picocli-style, urfave/cli-style Go CLIs, and C# System.CommandLine/CommandLineParser CLIs)
Development
npm install
npm run build
npm run lint
npm test
npm run test:e2e
Format code:
npm run format
License
MIT
