octogrep
v0.3.1
GitHub code search CLI for AI agents and LLM workflows with token-efficient TOON output.
octogrep is a lightweight CLI for GitHub code search optimized for AI agents.
It uses incur for structured output and emits TOON format by default.
Why this tool
- Token-efficient output for LLM workflows (TOON by default)
- Minimal, normalized search result shape for fast downstream processing
- No auth credential handling in octogrep itself
Internally, octogrep calls the GitHub Search API through the GitHub CLI (gh).
What is TOON?
TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model. It is designed for LLM input as a drop-in, lossless representation of existing JSON while reducing token usage. In octogrep, using TOON by default keeps search output compact and model-friendly.
Learn more: https://github.com/toon-format/toon
Requirements
- Node.js
- GitHub CLI (gh) installed
- GitHub CLI authenticated:
gh auth login
octogrep never stores GitHub tokens and relies on the authenticated gh session.
Installation
Global install:
npm install -g octogrep
pnpm add -g octogrep
bun add -g octogrep
One-shot execution:
npx octogrep --version
pnpm dlx octogrep --version
yarn dlx octogrep --version
bunx octogrep --version
You can confirm the CLI is available with octogrep --version.
Install the octogrep skill
This is separate from installing the octogrep CLI itself. To install the AI agent skill, use vercel-labs/skills via npx skills add.
# Generic install
npx skills add hudrazine/octogrep --skill octogrep
# Install for Codex
npx skills add hudrazine/octogrep --skill octogrep -a codex
# Install for Claude Code
npx skills add hudrazine/octogrep --skill octogrep -a claude-code
Usage
octogrep search <query> [options]
octogrep fetch <contentsUrl>
Use quotes for multi-word queries (for example, octogrep search "root command").
Examples:
octogrep search "root command"
octogrep search "http client" --repo cli/cli --language go --limit 5
octogrep search "panic" --org cli --filename option.go
octogrep search "createServer" --user vercel --language ts
octogrep fetch "https://api.github.com/repositories/212613049/contents/pkg/cmd/root/root.go?ref=59ba50885feeed63a6f31de06ced5a06a5a3930d"
search options
- --repo <owner/repo> (repeatable)
- --org <org> (repeatable)
- --user <user> (repeatable)
- --language <language> (repeatable)
- --path <path>
- --filename <filename>
- --extension <extension>
- --limit <1..100> (default: 20)
- --page <>=1> (default: 1)
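As a rough illustration of how these options combine, the sketch below assembles an argument list for octogrep search from a typed options object and checks the documented ranges for --limit and --page before anything is run. The SearchOptions type and the buildSearchArgs name are hypothetical helpers for a calling script, not part of octogrep.

```typescript
// Hypothetical caller-side helper; not part of octogrep itself.
interface SearchOptions {
  repo?: string[]; // --repo <owner/repo> (repeatable)
  org?: string[]; // --org <org> (repeatable)
  user?: string[]; // --user <user> (repeatable)
  language?: string[]; // --language <language> (repeatable)
  path?: string;
  filename?: string;
  extension?: string;
  limit?: number; // 1..100
  page?: number; // >= 1
}

function buildSearchArgs(query: string, opts: SearchOptions = {}): string[] {
  const args: string[] = ["search", query];

  // Repeatable options are emitted once per value.
  const repeatable: Array<[string, string[] | undefined]> = [
    ["--repo", opts.repo],
    ["--org", opts.org],
    ["--user", opts.user],
    ["--language", opts.language],
  ];
  for (const [flag, values] of repeatable) {
    for (const value of values ?? []) args.push(flag, value);
  }

  // Single-value options are emitted only when set.
  const single: Array<[string, string | undefined]> = [
    ["--path", opts.path],
    ["--filename", opts.filename],
    ["--extension", opts.extension],
  ];
  for (const [flag, value] of single) {
    if (value !== undefined) args.push(flag, value);
  }

  if (opts.limit !== undefined) {
    if (!Number.isInteger(opts.limit) || opts.limit < 1 || opts.limit > 100) {
      throw new RangeError("limit must be an integer in 1..100");
    }
    args.push("--limit", String(opts.limit));
  }
  if (opts.page !== undefined) {
    if (!Number.isInteger(opts.page) || opts.page < 1) {
      throw new RangeError("page must be an integer >= 1");
    }
    args.push("--page", String(opts.page));
  }
  return args;
}
```

Passing the result as an argument array to a process spawner, rather than joining it into a shell string, avoids quoting problems with multi-word queries.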
Query conflict rule
If the raw query already includes a qualifier (for example org:) and the corresponding option is also provided (for example --org), octogrep returns QUERY_CONFLICT.
Use either:
- raw qualifier style:
octogrep search 'term org:my-org'
- option style:
octogrep search term --org my-org
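A calling script can catch the most obvious conflicts before invoking the CLI. The following is a minimal sketch that assumes each option maps to a qualifier of the same name (only the org: and --org pairing is stated above; the others are an assumption), and findQualifierConflicts is a hypothetical name. octogrep's own detection may differ.

```typescript
// Hypothetical pre-flight check; octogrep's real conflict detection may differ.
type QualifierOption =
  | "repo"
  | "org"
  | "user"
  | "language"
  | "path"
  | "filename"
  | "extension";

// Returns the options whose same-named qualifier already appears in the raw query.
function findQualifierConflicts(
  rawQuery: string,
  usedOptions: QualifierOption[],
): QualifierOption[] {
  return usedOptions.filter((name) => {
    // Match `name:` at the start of the query or right after whitespace,
    // so a word like "reorg:" does not count as the org: qualifier.
    const pattern = new RegExp(`(^|\\s)${name}:`, "i");
    return pattern.test(rawQuery);
  });
}
```

If the returned list is non-empty, drop either the raw qualifier or the option before running the search.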
fetch
Use fetch with a contentsUrl returned by octogrep search.
octogrep fetch "$contentsUrl"
fetch is intentionally strict:
- accepts only GitHub Contents API URLs on authenticated GitHub hosts
- requires the https://...?...ref=... URL returned by octogrep search
- rejects browser htmlUrl values
- prints file contents for AI and human reading workflows
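The rules above can be approximated on the caller side so that an agent does not hand a browser URL to fetch by mistake. This is only a sketch: isLikelyContentsUrl is a hypothetical name, the default host list is an assumption, and the check for a /contents/ path segment is inferred from the example URL rather than taken from octogrep's validation logic.

```typescript
// Hypothetical caller-side check; octogrep's own validation may be stricter.
function isLikelyContentsUrl(
  url: string,
  allowedHosts: string[] = ["api.github.com"], // assumed default host list
): boolean {
  // Capture host, path, and query of an https URL that has a query string.
  const match = /^https:\/\/([^\/?#]+)(\/[^?#]*)\?([^#]*)$/.exec(url);
  if (match === null) return false;

  const host = (match[1] ?? "").toLowerCase();
  const path = match[2] ?? "";
  const query = match[3] ?? "";

  if (!allowedHosts.includes(host)) return false;
  // Contents API paths contain a /contents/ segment (inferred from the example URL).
  if (!path.includes("/contents/")) return false;
  // A non-empty ref=... query parameter must be present.
  return query
    .split("&")
    .some((pair) => pair.startsWith("ref=") && pair.length > "ref=".length);
}
```

A browser htmlUrl fails this check because it has neither an allowed API host nor a ref query parameter.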
When a fetched file is too large to pass around in one step, use incur's built-in token controls to read it in chunks:
# Check approximate output size first
octogrep fetch "$contentsUrl" --token-count
# Read the first chunk
octogrep fetch "$contentsUrl" --token-limit 400
# Continue from the next chunk
octogrep fetch "$contentsUrl" --token-offset 400 --token-limit 400
Use --limit and --page to page search results themselves. Use --token-limit and --token-offset when you want to trim already formatted output, especially long fetch output for LLM handoff or staged reading.
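The staged-reading pattern above can be planned up front once --token-count has reported an approximate size. The sketch below uses hypothetical names (TokenWindow, planTokenWindows, toFetchArgs) and mirrors the commands shown above by omitting --token-offset for the first chunk.

```typescript
// Hypothetical planning helpers for chunked reads; not part of octogrep.
interface TokenWindow {
  offset: number;
  limit: number;
}

// Split an approximate token total into fixed-size windows.
function planTokenWindows(totalTokens: number, chunkSize: number): TokenWindow[] {
  if (!Number.isInteger(totalTokens) || totalTokens < 0) {
    throw new RangeError("totalTokens must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const windows: TokenWindow[] = [];
  for (let offset = 0; offset < totalTokens; offset += chunkSize) {
    windows.push({ offset, limit: chunkSize });
  }
  return windows;
}

// Build the fetch arguments for one window.
function toFetchArgs(contentsUrl: string, window: TokenWindow): string[] {
  const args = ["fetch", contentsUrl];
  // The first chunk needs no offset, matching the example commands above.
  if (window.offset > 0) args.push("--token-offset", String(window.offset));
  args.push("--token-limit", String(window.limit));
  return args;
}
```

Because the reported count is approximate, the final window may return less than a full chunk, or nothing at all.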
Error handling
When octogrep fails with --json, use the structured error fields to decide the next step:
- code: stable error category
- retryable: whether retrying is reasonable
- cta: optional recovery hint when octogrep can safely suggest a concrete follow-up command
Prefer code, message, and retryable for recovery decisions. If cta is present, treat it as a convenience hint rather than a guaranteed replay script.
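One way an agent might act on these fields is sketched below. The OctogrepError shape assumes the fields sit together on one object, which is not confirmed here, and the retry cap, NextStep type, and decideNextStep name are hypothetical. QUERY_CONFLICT is the only error code taken from this README.

```typescript
// Assumed shape of the structured error; the real JSON envelope may nest these fields.
interface OctogrepError {
  code: string;
  message: string;
  retryable: boolean;
  cta?: unknown; // shape not specified here, so it is left opaque
}

type NextStep =
  | { action: "retry" }
  | { action: "fix-query"; reason: string }
  | { action: "abort"; reason: string };

// Decide what to do after a failed call; attempt is 1-based.
function decideNextStep(
  err: OctogrepError,
  attempt: number,
  maxAttempts = 3, // hypothetical retry cap
): NextStep {
  // A conflicting qualifier cannot be fixed by retrying; the query must change.
  if (err.code === "QUERY_CONFLICT") {
    return { action: "fix-query", reason: err.message };
  }
  if (err.retryable && attempt < maxAttempts) {
    return { action: "retry" };
  }
  return { action: "abort", reason: `${err.code}: ${err.message}` };
}
```

The sketch deliberately ignores cta, in line with treating it as a convenience hint rather than something to execute blindly.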
Output
Default output is TOON (incur standard behavior).
You can also use --format json|yaml|md or --json.
Returned fields are intentionally minimal:
- query
- compiledQuery
- meta.totalCount
- meta.incompleteResults
- meta.page
- meta.limit
- meta.returnedCount
- items[] with:
  - repository
  - path
  - sha
  - htmlUrl (GitHub browser URL)
  - contentsUrl (GitHub Contents API URL for gh api)
  - fragment (nullable)
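For consumers that parse the --json output, the field list above might be typed roughly as follows. The field names come from the list; the value types (for example, repository as a plain string) are assumptions, and contentsUrlsOf and mayHaveNextPage are hypothetical helpers.

```typescript
// Field names follow the list above; the value types are assumptions.
interface SearchItem {
  repository: string; // assumed to be a string
  path: string;
  sha: string;
  htmlUrl: string; // GitHub browser URL
  contentsUrl: string; // GitHub Contents API URL
  fragment: string | null;
}

interface SearchResult {
  query: string;
  compiledQuery: string;
  meta: {
    totalCount: number;
    incompleteResults: boolean;
    page: number;
    limit: number;
    returnedCount: number;
  };
  items: SearchItem[];
}

// Collect the URLs that can be passed to `octogrep fetch`.
function contentsUrlsOf(result: SearchResult): string[] {
  return result.items.map((item) => item.contentsUrl);
}

// Rough check for whether another --page might yield more items.
// This is an upper bound: the API may return fewer results than totalCount suggests.
function mayHaveNextPage(meta: SearchResult["meta"]): boolean {
  return meta.returnedCount > 0 && meta.page * meta.limit < meta.totalCount;
}
```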
Use htmlUrl for browsing and contentsUrl for fetching file contents as text:
octogrep fetch "$contentsUrl"
For large file reads, keep the default TOON output and add --token-count, --token-limit, or --token-offset as needed. With --verbose, truncated output also includes meta.nextOffset so the next chunk can be requested programmatically.
Token controls operate on the formatted output stream rather than whole result items, so they are useful for chunking fetch output but are not a precise replacement for search --page and search --limit.
If you need byte-exact raw output or stable hashes, use gh api directly:
gh api -H "Accept: application/vnd.github.raw+json" "$contentsUrl"
When no results are found, octogrep returns an empty list and exits with code 0.
Development
pnpm build
pnpm typecheck
pnpm test
Run the real GitHub integration smoke test:
OCTOGREP_E2E=1 pnpm test
Acknowledgements
This project is built with the following tools:
- incur: TypeScript CLI framework used for structured command and output handling.
- TOON: Token-Oriented Object Notation used as the default output format.
- GitHub CLI (gh): Used to execute GitHub code search through authenticated gh sessions.
