lassie-llm
v1.4.0
Generate llms.txt files from GitHub repositories for LLM consumption. Helps developers make sense of existing codebases and documentation.
Lassie
Generate llms.txt files from git repositories for LLM consumption.
A CLI tool by Raggle that helps developers make sense of existing codebases and documentation by generating LLM-friendly markdown indexes.
Like a faithful dog, Lassie fetches your documentation and brings it back in a format LLMs can consume.
Features
- Multi-platform support: Works with GitHub, GitLab, Codeberg, Bitbucket, Gitea, and self-hosted instances
- Recursive scanning: Finds all .md, .markdown, and .mdx files in subdirectories
- Auto-detects remote: Works with both origin and upstream remotes
- Subdirectory support: Run from any folder within a git repo
- URL verification: Optionally verify files exist on the remote before including
- Publish to GitHub Gist: Instant public hosting via gh CLI or API
- Groups by folder: Organizes output by directory structure
- Extracts titles: Pulls h1/h2 headings from markdown content
- Dry run mode: Preview output without writing files
- Quiet/Verbose modes: Control output verbosity for scripting or debugging
- Config file support: Define settings in .lassierc, lassie.config.json, or package.json
- JSON output: Machine-readable output for CI/CD pipelines
Installation
# Install globally with npm
npm install -g @raggle/lassie
# Or with bun
bun install -g @raggle/lassie
Usage
Basic Usage
# Generate llms.txt in current directory
lassie
# Generate for a specific directory
lassie /path/to/repo
# Generate from a subdirectory (paths calculated from git root)
lassie /path/to/repo/packages/docs -r upstream -b main
Options
lassie [directory] [options]
Arguments:
  directory                 Directory to scan (defaults to current directory)
Options:
  -o, --output <file>       Output file path (defaults to <dir>/llms.txt)
  -t, --title <title>       Custom title for the llms.txt file
  -d, --description <d>     Custom description
  -b, --branch <branch>     Override the branch name for URLs
  -r, --remote <name>       Git remote to use (defaults to origin, then upstream)
  -i, --ignore <pattern>    Pattern to ignore (can be used multiple times)
      --full                Also generate llms-full.txt with complete content
      --descriptions        Include file descriptions in output
      --verify              Verify URLs exist before adding (skips unpushed files)
      --dry-run             Preview output without writing files
      --json                Output results as JSON (machine-readable)
  -v, --verbose             Show detailed debug output
  -q, --quiet               Suppress all output except errors
  -V, --version             Show version number
  -h, --help                Show help message
Publish Options:
  -p, --publish             Publish llms.txt to GitHub Gist
      --gist-id <id>        Update an existing Gist instead of creating new
Examples
# Generate with custom title
lassie -t "My Project Docs"
# Use a specific remote and branch
lassie -r upstream -b main
# Verify URLs exist on GitHub (skips unpushed files)
lassie --verify
# Generate and publish to GitHub Gist
lassie --publish
# Update an existing Gist
lassie --publish --gist-id abc123def456
# Ignore certain directories
lassie -i node_modules -i dist -i .github
# Generate full content file as well
lassie --full
# Get JSON output for scripting
lassie --json --dry-run
Configuration Files
Lassie looks for configuration in these locations (in order of priority):
- .lassierc (JSON)
- .lassierc.json
- lassie.config.json
- package.json "lassie" field
Example .lassierc
{
  "title": "My Project",
  "description": "Project documentation for LLMs",
  "ignore": ["node_modules", "dist", "*.test.md"],
  "includeDescriptions": true,
  "branch": "main",
  "full": true
}
Example package.json
{
  "name": "my-project",
  "lassie": {
    "title": "My Project Docs",
    "ignore": ["node_modules"]
  }
}
CLI arguments always take precedence over config file settings. The ignore patterns from CLI and config are merged together.
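The precedence rule above can be illustrated with a small sketch. This is not Lassie's own code: the `LassieOptions` interface and `mergeOptions` function are hypothetical names, the fields simply mirror the keys shown in the config examples, and de-duplicating the merged ignore patterns is an assumption.

```typescript
// Hypothetical option shape; the keys mirror the README's config examples.
interface LassieOptions {
  title?: string;
  description?: string;
  branch?: string;
  full?: boolean;
  includeDescriptions?: boolean;
  ignore?: string[];
}

// A CLI value wins whenever it is set; otherwise the config value is used.
// Ignore patterns are the exception: both lists are combined
// (de-duplicated here, which is an assumption of this sketch).
function mergeOptions(config: LassieOptions, cli: LassieOptions): LassieOptions {
  return {
    title: cli.title ?? config.title,
    description: cli.description ?? config.description,
    branch: cli.branch ?? config.branch,
    full: cli.full ?? config.full,
    includeDescriptions: cli.includeDescriptions ?? config.includeDescriptions,
    ignore: Array.from(new Set([...(config.ignore ?? []), ...(cli.ignore ?? [])])),
  };
}
```

For example, a config with "ignore": ["node_modules", "dist"] combined with `-i dist -i .github` on the command line would yield the three patterns node_modules, dist, and .github under this sketch.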
Supported Platforms
Lassie automatically detects and supports these git hosting platforms:
| Platform | Raw URL Format |
|----------|---------------|
| GitHub | raw.githubusercontent.com/owner/repo/branch/path |
| GitLab | gitlab.com/owner/repo/-/raw/branch/path |
| Codeberg | codeberg.org/owner/repo/raw/branch/path |
| Bitbucket | bitbucket.org/owner/repo/raw/branch/path |
| Gitea | host/owner/repo/raw/branch/branch/path |
Self-hosted instances are detected by hostname patterns (e.g., gitlab.mycompany.com).
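As a rough illustration of the table above, the sketch below maps a hostname to a platform and builds a raw URL in each listed format. The names `guessPlatform` and `rawUrl` are hypothetical and are not the package's `detectPlatform` or `buildRawUrl`; the substring-based hostname heuristic and the `https://` scheme are assumptions, and the Gitea row is read as a literal `branch` path segment followed by the branch name.

```typescript
type Platform = "github" | "gitlab" | "codeberg" | "bitbucket" | "gitea";

interface RepoRef {
  platform: Platform;
  host: string;
  owner: string;
  repo: string;
}

// Assumed heuristic: exact match for the public hosts, substring match
// for self-hosted instances such as gitlab.mycompany.com.
function guessPlatform(host: string): Platform | undefined {
  const h = host.toLowerCase();
  if (h === "github.com") return "github";
  if (h === "codeberg.org") return "codeberg";
  if (h === "bitbucket.org") return "bitbucket";
  if (h.includes("gitlab")) return "gitlab";
  if (h.includes("gitea")) return "gitea";
  return undefined;
}

// Builds a raw-content URL following the formats in the table.
function rawUrl(ref: RepoRef, branch: string, path: string): string {
  const { host, owner, repo } = ref;
  switch (ref.platform) {
    case "github":
      return `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/${path}`;
    case "gitlab":
      return `https://${host}/${owner}/${repo}/-/raw/${branch}/${path}`;
    case "codeberg":
    case "bitbucket":
      return `https://${host}/${owner}/${repo}/raw/${branch}/${path}`;
    case "gitea":
      // Literal "branch" segment, then the branch name.
      return `https://${host}/${owner}/${repo}/raw/branch/${branch}/${path}`;
  }
}
```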
URL Verification
The --verify flag checks each URL exists on the remote before including it:
$ lassie --verify
Scanning directory: /path/to/repo
Verifying 50 URLs...
Skipping 2 files not found on remote:
- docs/draft-feature.md
- docs/unpushed-doc.md
Generated: llms.txt
Done!
This is useful when:
- Some files haven't been pushed yet
- You're working on a branch that differs from the remote
- You want to ensure all links in the output are accessible
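The skip behavior shown in the sample output can be sketched as follows. This is an illustrative outline, not Lassie's implementation: `Entry`, `HeadRequest`, `isFound`, and `filterReachable` are hypothetical names, and treating only 2xx responses as "found" (and any network error as "not found") is an assumption. The request function is injected so the logic can be exercised without network access.

```typescript
interface Entry {
  path: string; // path relative to the git root
  url: string;  // raw URL built for that path
}

// Resolves to the HTTP status code of a HEAD request for the given URL.
type HeadRequest = (url: string) => Promise<number>;

// Assumption: only 2xx responses count as "exists on the remote".
function isFound(status: number): boolean {
  return status >= 200 && status < 300;
}

// Checks every entry concurrently and splits them into kept and skipped.
async function filterReachable(
  entries: Entry[],
  head: HeadRequest,
): Promise<{ kept: Entry[]; skipped: Entry[] }> {
  const results = await Promise.all(
    entries.map(async (entry) => ({
      entry,
      // A failed request is recorded as status 0, i.e. not found.
      ok: isFound(await head(entry.url).catch(() => 0)),
    })),
  );
  return {
    kept: results.filter((r) => r.ok).map((r) => r.entry),
    skipped: results.filter((r) => !r.ok).map((r) => r.entry),
  };
}
```

On Node.js 18+ or Bun, the injected function could be as small as `async (url) => (await fetch(url, { method: "HEAD" })).status`.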
Publishing to GitHub Gist
The --publish flag uploads your llms.txt to a public GitHub Gist.
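For orientation, the sketch below shows what creating or updating a public Gist looks like at the level of GitHub's REST API (POST /gists to create, PATCH /gists/{gist_id} to update). How Lassie itself issues the request is not documented here, so `GistRequest` and `buildGistRequest` are hypothetical names used only for illustration.

```typescript
interface GistRequest {
  method: "POST" | "PATCH";
  url: string;
  body: string; // JSON payload
}

// Builds the request for creating a new public Gist, or for updating an
// existing one when a Gist ID is given (compare --gist-id).
function buildGistRequest(content: string, description: string, gistId?: string): GistRequest {
  const files = { "llms.txt": { content } };
  if (gistId) {
    return {
      method: "PATCH",
      url: `https://api.github.com/gists/${gistId}`,
      body: JSON.stringify({ description, files }),
    };
  }
  return {
    method: "POST",
    url: "https://api.github.com/gists",
    body: JSON.stringify({ description, public: true, files }),
  };
}
```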
Authentication
Lassie tries these methods in order:
- GitHub CLI (gh): If logged in via gh auth login, no token needed
- Environment variable: Set GITHUB_TOKEN or GH_TOKEN
# Option 1: Use gh CLI (recommended)
gh auth login
# Option 2: Set token
export GITHUB_TOKEN=ghp_your_token_here
Publishing
$ lassie --publish
Scanning directory: /path/to/repo
Generated: llms.txt
Using gh CLI for authentication...
Creating new Gist...
Published to GitHub Gist!
Raw URL: https://gist.githubusercontent.com/user/abc123/raw/llms.txt
Gist: https://gist.github.com/user/abc123
Gist ID: abc123def456
To update this Gist in the future, use:
lassie --publish --gist-id abc123def456
Output Format
llms.txt
# Repository Title
> Optional description
Repository: https://github.com/owner/repo
Branch: main
- [Document Title](https://raw.githubusercontent.com/owner/repo/main/README.md)
## Docs
- [Getting Started](https://raw.githubusercontent.com/owner/repo/main/docs/getting-started.md)
- [API Reference](https://raw.githubusercontent.com/owner/repo/main/docs/api.md)
## Docs / Guides
- [Installation Guide](https://raw.githubusercontent.com/owner/repo/main/docs/guides/installation.md)
llms-full.txt
When using --full, generates a file with complete content:
# Repository Title - Full Documentation
Generated: 2025-12-18
<document>
<source>docs/getting-started.md</source>
<url>https://raw.githubusercontent.com/owner/repo/main/docs/getting-started.md</url>
<content>
... complete markdown content ...
</content>
</document>
---
How It Works
- Git Detection: Reads git remote URL and auto-detects the hosting platform
- Recursive Scan: Finds all .md, .markdown, .mdx files in all subdirectories
- Path Calculation: Handles subdirectory repos by calculating path prefix from git root
- URL Generation: Constructs platform-specific raw content URLs
- Verification (optional): HEAD requests to verify URLs exist
- Output: Groups files by directory, extracts titles from content
- Publishing (optional): Uploads to GitHub Gist
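Three of the steps above (reading the remote, extracting titles, grouping by directory) can be sketched in a few small functions. These are illustrative assumptions rather than Lassie's internals: `parseRemote`, `extractTitle`, and `folderHeading` are hypothetical names, only the two most common remote URL forms are handled, the h1-before-h2 preference is assumed, and headings inside code fences are not filtered out.

```typescript
interface Remote {
  host: string;
  owner: string;
  repo: string;
}

// Git Detection: handles https://host/owner/repo(.git) and
// git@host:owner/repo(.git). Other remote forms return undefined.
function parseRemote(url: string): Remote | undefined {
  const m =
    url.match(/^https?:\/\/(?:[^@/]+@)?([^/]+)\/(.+?)\/([^/]+?)(?:\.git)?\/?$/) ??
    url.match(/^git@([^:/]+)[:/](.+?)\/([^/]+?)(?:\.git)?\/?$/);
  return m ? { host: m[1], owner: m[2], repo: m[3] } : undefined;
}

// Title extraction: first h1 if present, otherwise first h2,
// otherwise the caller's fallback (for example the file name).
function extractTitle(markdown: string, fallback: string): string {
  const h1 = markdown.match(/^#[ \t]+(.+)$/m);
  const h2 = markdown.match(/^##[ \t]+(.+)$/m);
  return (h1 ?? h2)?.[1].trim() ?? fallback;
}

// Grouping: turns a file path into a section heading such as
// "Docs / Guides". Files at the repository root get an empty heading.
function folderHeading(path: string): string {
  return path
    .split("/")
    .slice(0, -1)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join(" / ");
}
```

With these helpers, docs/guides/installation.md would be listed under "Docs / Guides", matching the section naming shown in the Output Format example.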
Requirements
- Node.js 18+ or Bun
- Git repository with a remote (GitHub, GitLab, Codeberg, Bitbucket, or Gitea)
- GitHub CLI (gh) or token (only for --publish)
Programmatic Usage
import {
  generate,
  generateLlmsTxt,
  getGitInfo,
  publishToGist,
  createLogger,
  buildRawUrl,
  detectPlatform
} from '@raggle/lassie'
// Generate files
const result = await generate({
  dir: '/path/to/repo',
  title: 'My Docs',
  description: 'Documentation',
  full: true,
  remote: 'upstream',
  verifyUrls: true,
})
// Publish to Gist
const gist = await publishToGist(result.llmsTxt, {
  description: 'My Project - llms.txt'
})
console.log(gist.url)
// Get git info (returns platform info for any supported forge)
const gitInfo = await getGitInfo('/path/to/repo')
// { platform: 'github', host: 'github.com', owner: 'user', repo: 'repo', branch: 'main', pathPrefix: '' }
// Build raw URLs for any platform
const rawUrl = buildRawUrl(gitInfo, 'main', 'docs/README.md')
// Use custom logger for quiet operation
const logger = createLogger({ quiet: true })
await generate({ dir: '/path/to/repo', logger })
Contributing
Contributions are welcome. Please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Run tests (bun test)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
