auto-geo
v0.8.4
Published
auto-geo
The open-source GEO engine that gets your brand mentioned in ChatGPT, Claude, Gemini, Perplexity, and Grok.
Audit, generate, fix, and track the pages large language models cite — one CLI, file-based, MIT.
When someone asks ChatGPT, Perplexity, Claude, Gemini, or Google AI Overviews a question your business should answer, do those engines cite your domain? auto-geo is the full loop for making that happen and proving it's happening:
auto-geo init # set up the system once
auto-geo doctor # audit any page for citation readiness
auto-geo write # generate publish-ready pages from target queries
auto-geo fix # rewrite an existing page so it passes the audit
auto-geo check # measure: do AI engines actually cite you?
auto-geo history # track citation coverage over time
Everything is file-based and committable: tracked prompts, check history, config. No server, no account, no database. One install away:
npm i -g auto-geo
Built by Shadow, a media research lab building the next generation of AI-powered media intelligence and communications technology, in partnership with the teams that put OpenAI, TikTok, Meta, Amazon, and Lovable on the map. Shadow uses auto-geo to publish to shadow.inc/resources.
Contents
- Quickstart
- What is GEO?
- The workflow
  - auto-geo init: set up the system
  - auto-geo doctor: audit any page for citation readiness
  - auto-geo write: generate pages from queries
  - auto-geo fix: rewrite a page for citation readiness
  - auto-geo prompts: manage your tracked prompts
  - auto-geo check: measure actual citation coverage
  - auto-geo history: citation coverage over time
- Configuration
- The page architecture
- Agent-friendly output
- LLM-friendly
- Contributing
- License
Quickstart
# 0. Install once (or run any command one-shot via `npx auto-geo@latest`)
npm i -g auto-geo
# 1. Set up — config, .env.local key slots, and the .auto-geo workspace
auto-geo init
# 2. Add an API key to .env.local (auto-loaded by every command)
# 3. Audit any page — yours or a competitor's
auto-geo doctor https://example.com/some-page
# 4. Track the prompts you want AI engines to cite you for
auto-geo prompts add "best media monitoring tools" "what is GEO"
# 5. Measure — every run is saved to history automatically
auto-geo check
# 6. Watch coverage move over time
auto-geo history
Node >=18.17 is required. To upgrade later, run npm i -g auto-geo@latest. Other ways to install:
brew install shadowresearch/tap/auto-geo # Homebrew (macOS / Linux)
mise use -g npm:auto-geo # mise
Standalone executables (no Node required) for macOS, Linux, and Windows ship with every release.
What is GEO?
Generative Engine Optimization is the discipline of making your pages the ones AI search engines quote when they answer a question. It is the successor to SEO: instead of ranking in a list of links, you're competing to be cited inside the answer.
The pages that win are not blog posts. Empirical research links citation probability to a specific shape:
- Architecture, not prose. Named, validated blocks — TL;DR, intro, question-format H2 sections, related guides, key takeaways, FAQ, disclosure. AI engines extract structured chunks; rigid structure improves extraction.
- Answer-first. Every section opens with a 40–60 word "answer capsule" that fully answers the section's question before any supporting paragraph.
- Question-format headings. H2s are written as the questions users actually ask AI engines.
- Entity-dense. Named entities (companies, people, products) at high density — linked to ~4.8x higher citation probability.
- Schema-derived. Article + FAQPage JSON-LD emitted from structure, not hand-written.
auto-geo encodes this shape in a strict schema (see docs/sop.md — the full standard operating procedure), audits any URL against it, generates new pages that conform to it, and then closes the loop by measuring whether the engines actually cite you.
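Two of the signals above (the 40–60 word answer capsule and question-format headings) are easy to picture as code. The following is a minimal sketch, not auto-geo's implementation: the function names are hypothetical, and treating "ends with a question mark" as the test for a question-format heading is an assumption.

```typescript
// Illustrative only: these helpers are hypothetical and are not auto-geo's API.

/** Count whitespace-separated words. */
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

/** True when the text fits the 40-60 word answer-capsule window. */
function isAnswerCapsule(text: string): boolean {
  const n = countWords(text);
  return n >= 40 && n <= 60;
}

/** Assumption: a heading counts as question-format if it ends with "?". */
function isQuestionHeading(heading: string): boolean {
  const t = heading.trim();
  return t.length > 0 && t.charAt(t.length - 1) === "?";
}

const sampleHeadings = ["What is GEO?", "Pricing overview", "How does GEO differ from SEO?"];
const questionCount = sampleHeadings.filter(isQuestionHeading).length;
console.log(questionCount + " of " + sampleHeadings.length + " headings are question-format");
```

The real rules live in docs/sop.md; this only shows how mechanical the checks can be.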
The workflow
┌────────────────────────────────────────────────────────┐
│                     auto-geo init                      │
│       config · .env.local · .auto-geo/ workspace       │
└────────────────────────────────────────────────────────┘
               │
    ┌──────────┼──────────────┬─────────────────┐
    ▼          ▼              ▼                 ▼
 doctor      write           fix             prompts
 audit a   generate      rewrite an     track the queries
  page     new pages    existing page  that matter to you
    │          │              │                 │
    └──────────┴──────────────┴────────┬────────┘
                                       ▼
                                     check ──── saves every run ────┐
                                measure actual                      ▼
                                   citations                     history
                                                           coverage over time,
                                                           newly cited / lost
doctor measures readiness (is this page shaped for citation?). check measures outcome (is it actually being cited?). history turns the outcomes into a trend line.
auto-geo init — set up the system
auto-geo init # interactive (a handful of questions)
auto-geo init --yes # non-interactive template
One command scaffolds everything:
| File | What it is |
| ----------------------- | ---------------------------------------------------------------------------------- |
| auto-geo.config.json | Your defaults — domain, provider, model, author. Committable; never holds secrets. |
| .env.local | API key slots. Auto-loaded by every command. Gitignore it. |
| .auto-geo/prompts.txt | Your tracked prompts — one per line, # comments allowed. |
| .auto-geo/checks/ | Every check run, saved as JSON. The data behind history. |
The interactive flow ends by asking for the prompts you want to track, so a fresh project goes from zero to a measurable citation baseline in one sitting. init never overwrites an existing .env.local and refuses to overwrite an existing config without --force.
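The prompts file format in the table (one prompt per line, # comments allowed) can be read with a few lines of code. This is a hedged sketch: parsePrompts is a hypothetical helper, and it assumes a comment is a whole line whose first non-space character is #, which the README does not spell out.

```typescript
// Hypothetical helper: sketches the "one per line, # comments allowed" format.
// Assumption: a comment is a whole line whose first non-space character is "#".
function parsePrompts(fileText: string): string[] {
  return fileText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && line.charAt(0) !== "#");
}

const samplePrompts = [
  "# brand queries",
  "best media monitoring tools",
  "",
  "what is GEO",
].join("\n");

// Logs the two prompt lines; the comment and the blank line are skipped.
console.log(parsePrompts(samplePrompts));
```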
auto-geo doctor — audit any page for citation readiness
Run it on any URL — yours, a competitor's, every page in your sitemap — and get a structured report on the citation signals AI engines look for.
auto-geo doctor https://example.com/some-page
✓ TL;DR present (52 words, in range)
✗ Question-format H2 headings (2 of 6 are question-format; SOP §3 targets all)
✓ Article JSON-LD present
✗ FAQPage JSON-LD present (No FAQPage JSON-LD block detected)
✓ Entity density (12.3/1k words)
✗ Image cadence (0 images for 1247 words)
✓ Answer-first first paragraph
✓ No self-link in related guides
Score: 5 / 8 checks pass — moderate GEO posture
Top 3 fixes (ranked by citation lift):
1. Add a FAQPage JSON-LD block. Each Q is a citable extraction target.
2. Convert 4 statement-form H2 headings to question form.
3. Add 2 images with descriptive alt text (entity + context).
# Whole sitemap: mean score, lowest-scoring pages, most common failures
auto-geo doctor --site https://example.com/sitemap.xml --max-pages 50
# JSON for CI / dashboards
auto-geo doctor https://example.com/page --json
The exit code is 0 if the score is ≥ 75% and 1 otherwise, so you can gate deploys on it. See docs/doctor.md for the full check reference.
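To make the 75% gate concrete, here is a small worked sketch. The function name is hypothetical, and returning 1 for an audit with zero checks is an assumption; the threshold itself comes from the text above.

```typescript
// Sketch of the documented gate: exit 0 when score >= 75%, otherwise 1.
// The function name is hypothetical, not part of auto-geo.
function doctorExitCode(passed: number, total: number): number {
  if (total <= 0) return 1; // assumption: an empty audit does not pass
  return passed / total >= 0.75 ? 0 : 1;
}

console.log(doctorExitCode(5, 8)); // 5/8 = 62.5%, below the gate
console.log(doctorExitCode(6, 8)); // 6/8 = 75%, exactly at the gate
```

With eight checks, a page therefore needs at least six passes before a CI step chained on the exit code would continue.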
auto-geo write — generate pages from queries
Give it your domain and the queries you want to be cited for; get back validated, publish-ready JSON files — one structured page per query, conforming to the full GEO architecture.
auto-geo write \
--query "what is GEO" \
--query "GEO vs SEO" \
--out ./resources
✓ "what is GEO" → ./resources/geo.json (validated, ~$0.06)
✓ "GEO vs SEO" → ./resources/geo-vs-seo.json (validated, ~$0.06)
Total: 2 pages · 2 ok · ~$0.12 spent · 31s elapsed
The system prompt encodes the GEO SOP (TL;DR length, answer-capsule windows, banned superlatives, FAQ structure), and output is constrained to the schema at the type-system level via the Vercel AI SDK's generateObject, with a bounded self-correction loop on validation failure. Defaults: gpt-5.4 (OpenAI) or claude-sonnet-4-6 (Anthropic), auto-detected from whichever API key you have set.
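The "bounded self-correction loop" can be pictured roughly as follows. This is a generic sketch under stated assumptions, not auto-geo's code: the generate and validate callbacks are hypothetical stand-ins for the model call and the schema validation, and the attempt limit of 3 is arbitrary.

```typescript
// Illustrative sketch of a bounded self-correction loop. The callback types are
// hypothetical stand-ins for the model call and the schema validator.
type Generate<T> = (feedback: string[]) => T;
type Validate<T> = (candidate: T) => string[]; // returns validation errors

function generateWithCorrection<T>(
  generate: Generate<T>,
  validate: Validate<T>,
  maxAttempts: number
): { ok: boolean; value: T | null; attempts: number; errors: string[] } {
  let errors: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Feed the previous attempt's errors back so the next draft can fix them.
    const candidate = generate(errors);
    errors = validate(candidate);
    if (errors.length === 0) {
      return { ok: true, value: candidate, attempts: attempt, errors: [] };
    }
  }
  // Bounded: give up after maxAttempts and report the last errors.
  return { ok: false, value: null, attempts: maxAttempts, errors: errors };
}

// Toy generator: the first draft has a short TL;DR, the retry fixes it.
const outcome = generateWithCorrection<{ tldrWords: number }>(
  (feedback) => ({ tldrWords: feedback.length === 0 ? 20 : 50 }),
  (page) => (page.tldrWords >= 40 && page.tldrWords <= 60 ? [] : ["TL;DR must be 40-60 words"]),
  3
);
console.log(outcome.ok, outcome.attempts);
```

The bound matters for cost: every retry is another LLM call, so a failing page stops after a fixed number of attempts instead of looping.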
# Dry-run — plan + cost estimate, no LLM calls
auto-geo write --query "what is X" --dry-run
# Batch from a file, anthropic, 4 pages at a time
auto-geo write --queries-file queries.txt --provider anthropic --concurrency 4
With a config file (created by auto-geo init), --domain, author fields, and provider come from config, so a bare --query is all you need. See docs/write.md.
auto-geo fix — rewrite a page for citation readiness
Where doctor tells you what's wrong, fix produces a GEO-optimized rewrite that passes all 8 checks — fetched, audited, regenerated, and validated against the same schema write uses.
auto-geo fix https://www.example.com/some-blog-post --out ./fixed.json
Score (before): 3 / 8
Generating rewrite via openai gpt-5.4...
Score (projected): 8 / 8 — strong GEO posture
→ ./fixed.json (validated)
auto-geo fix https://example.com/page --provider anthropic # Claude instead
auto-geo fix https://example.com/page --dry-run # audit + cost estimate only
See docs/fix.md.
auto-geo prompts — manage your tracked prompts
Your tracked prompts are the questions you want AI engines to answer by citing your domain. They live in .auto-geo/prompts.txt (plain text, committable) and they're what check runs by default.
auto-geo prompts add "best media monitoring tools" "what is GEO"
auto-geo prompts # numbered list
auto-geo prompts rm 2 # by index, or by exact text
Don't know what to track? Let the engine propose your prompt set. prompts discover fetches your homepage, looks at what you already track, and has the LLM generate the high-intent queries you should compete for:
auto-geo prompts discover --dry-run # preview the proposals
auto-geo prompts discover --count 15 # append 15 (never overwrites, never duplicates)
prompts add (and discover) bootstrap the workspace on first use, so you don't even need init to start tracking.
auto-geo check — measure actual citation coverage
For each prompt, ask a real AI search engine and report whether your domain is among the citations. This is the ground truth doctor predicts.
auto-geo check # tracked prompts, domain from config
using 3 tracked prompts from .auto-geo/prompts.txt
[1/3] ✗ "what is GEO" — not cited (5 sources)
[2/3] ✓ "how do I get cited by ChatGPT" — cited (2 sources)
[3/3] ✓ "open source GEO tools" — cited (1 source)
Coverage: 2/3 queries (67%) · 3 page citations total · ~$0.012 spent
saved → .auto-geo/checks/2026-06-10T13-22-05--perplexity.json (auto-geo history)
Engines: perplexity (default), openai, anthropic, gemini, xai (alias grok), or --engine all, which runs every engine whose API key is set and reports per-engine coverage plus a union roll-up.
# Explicit queries instead of the tracked set
auto-geo check --domain shadow.inc --query "what is GEO"
# Every engine you have keys for, union coverage
auto-geo check --engine all
# CI: block the deploy when none of your critical queries gets you cited
auto-geo check --queries-file geo/critical-queries.txt && deploy
# Streaming JSON for agents / dashboards
auto-geo check --ndjson
Every run is saved to .auto-geo/checks/ automatically (opt out with --no-save). The exit code is 0 if coverage > 0% and 1 if coverage is 0%. See docs/check.md for output shapes, fan-out-query capture, domain-matching rules, and the --format geo-audit interop mode.
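The coverage figure, the exit-code rule, and the union roll-up can be sketched in a few functions. This is an illustration under assumptions: the QueryResult shape and all function names are hypothetical (the real output shapes are in docs/check.md), and "union" is taken to mean that a query counts as cited when any engine cited it.

```typescript
// Hypothetical result shape; the real output shapes are described in docs/check.md.
interface QueryResult {
  query: string;
  cited: boolean;
}

/** Coverage as a whole-number percentage of queries with at least one citation. */
function coveragePercent(results: QueryResult[]): number {
  if (results.length === 0) return 0;
  const citedCount = results.filter((r) => r.cited).length;
  return Math.round((citedCount / results.length) * 100);
}

/** Documented gate: exit 0 if coverage > 0%, 1 if 0%. */
function checkExitCode(results: QueryResult[]): number {
  return results.some((r) => r.cited) ? 0 : 1;
}

/**
 * Assumption: the union roll-up counts a query as cited when any engine cited it.
 * Every engine is assumed to have run the same queries in the same order.
 */
function unionResults(perEngine: QueryResult[][]): QueryResult[] {
  const base = perEngine[0];
  if (!base) return [];
  return base.map((first, i) => ({
    query: first.query,
    cited: perEngine.some((run) => {
      const r = run[i];
      return r !== undefined && r.cited;
    }),
  }));
}

const perplexityRun: QueryResult[] = [
  { query: "what is GEO", cited: false },
  { query: "how do I get cited by ChatGPT", cited: true },
  { query: "open source GEO tools", cited: true },
];
console.log(coveragePercent(perplexityRun) + "%", "exit", checkExitCode(perplexityRun));
```

Note that the gate only fails at 0% coverage, so a CI step chained with && keeps deploying as long as at least one query is cited.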
auto-geo history — citation coverage over time
The payoff for saving every run: a trend line. Run-by-run coverage with per-engine deltas, plus exactly which prompts you started or stopped being cited for.
auto-geo history
2026-06-01 08:30 perplexity 33% · 1/3 cited $0.01
2026-06-08 09:15 perplexity 67% ↑34 2/3 cited $0.01
Since last run (perplexity · 2026-06-01 08:30 ▸ 2026-06-08 09:15)
✓ newly cited open source GEO tools
✗ lost (none)
2 runs · .auto-geo/checks
Trends compare like with like: each run is measured against the previous run of the same engine selector. --engine all filters to multi-engine runs; --limit N controls depth; --json emits rows + delta machine-readably. See docs/history.md.
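The delta and the newly cited / lost lists in the sample output can be reproduced with a simple comparison of two runs. This is a sketch only: the SavedResult shape and function names are hypothetical and do not describe the JSON files in .auto-geo/checks/.

```typescript
// Hypothetical shapes for two saved runs of the same engine selector.
interface SavedResult {
  query: string;
  cited: boolean;
}

function citedQueries(run: SavedResult[]): string[] {
  return run.filter((r) => r.cited).map((r) => r.query);
}

function runPercent(run: SavedResult[]): number {
  return run.length === 0 ? 0 : Math.round((citedQueries(run).length / run.length) * 100);
}

/** Compare the current run against the previous one. */
function diffRuns(previous: SavedResult[], current: SavedResult[]) {
  const before = citedQueries(previous);
  const after = citedQueries(current);
  return {
    deltaPoints: runPercent(current) - runPercent(previous),
    newlyCited: after.filter((q) => before.indexOf(q) === -1),
    lost: before.filter((q) => after.indexOf(q) === -1),
  };
}

const juneFirst: SavedResult[] = [
  { query: "what is GEO", cited: false },
  { query: "how do I get cited by ChatGPT", cited: true },
  { query: "open source GEO tools", cited: false },
];
const juneEighth: SavedResult[] = [
  { query: "what is GEO", cited: false },
  { query: "how do I get cited by ChatGPT", cited: true },
  { query: "open source GEO tools", cited: true },
];
console.log(diffRuns(juneFirst, juneEighth));
```

Going from 1/3 (33%) to 2/3 (67%) gives the 34-point rise shown in the sample output, with one newly cited prompt and none lost.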
Configuration
Set once with auto-geo init, override anywhere. Precedence, highest first:
- CLI flag
- Environment variable (provider auto-detected from which API key is set)
- auto-geo.config.json (walks up from cwd; monorepo-friendly)
- Built-in default
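The precedence order above amounts to "first value that is set wins". A minimal sketch, assuming an empty string counts as not set; resolveSetting is a hypothetical name, and the engine names are used purely as sample values.

```typescript
// Sketch of the documented precedence: CLI flag, then environment variable,
// then auto-geo.config.json, then the built-in default. Names are hypothetical.
function resolveSetting(
  cliFlag: string | undefined,
  envValue: string | undefined,
  configValue: string | undefined,
  builtInDefault: string
): string {
  const candidates = [cliFlag, envValue, configValue];
  for (let i = 0; i < candidates.length; i++) {
    const value = candidates[i];
    // Assumption: an empty string counts as "not set".
    if (value !== undefined && value !== "") return value;
  }
  return builtInDefault;
}

console.log(resolveSetting(undefined, undefined, "gemini", "perplexity")); // config beats the default
console.log(resolveSetting("openai", undefined, "gemini", "perplexity")); // flag beats config
```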
auto-geo.config.json (committable, no secrets):
{
  "domain": "https://www.example.com",
  "basePath": "/resources",
  "provider": "openai",
  "model": "gpt-5.4",
  "engine": "perplexity",
  "concurrency": 4,
  "author": {
    "name": "Jane Doe",
    "jobTitle": "Head of Content",
    "bio": "Jane writes about generative engine optimization…"
  }
}
API keys live in .env.local (or .env), auto-loaded by every command; already-set environment variables always win:
| Engine / provider | Env var |
| ----------------------------- | ------------------------------------ |
| OpenAI (write, fix, check) | OPENAI_API_KEY |
| Anthropic (write, fix, check) | ANTHROPIC_API_KEY |
| Perplexity (check) | PERPLEXITY_API_KEY |
| Gemini (check) | GOOGLE_API_KEY or GEMINI_API_KEY |
| xAI / Grok (check) | XAI_API_KEY |
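As a rough illustration of how the table maps to "every engine whose API key is set", the sketch below filters engines by the env vars listed above. The env var names come from the table; ENGINE_KEYS and enginesWithKeys are hypothetical names, and treating an empty value as unset is an assumption.

```typescript
// Env var names come from the table above; the constant and function names are hypothetical.
const ENGINE_KEYS: { engine: string; vars: string[] }[] = [
  { engine: "perplexity", vars: ["PERPLEXITY_API_KEY"] },
  { engine: "openai", vars: ["OPENAI_API_KEY"] },
  { engine: "anthropic", vars: ["ANTHROPIC_API_KEY"] },
  { engine: "gemini", vars: ["GOOGLE_API_KEY", "GEMINI_API_KEY"] },
  { engine: "xai", vars: ["XAI_API_KEY"] },
];

/** Engines that have at least one non-empty key in the given environment. */
function enginesWithKeys(env: { [key: string]: string | undefined }): string[] {
  return ENGINE_KEYS.filter((entry) =>
    entry.vars.some((v) => {
      const value = env[v];
      return value !== undefined && value !== ""; // assumption: empty means unset
    })
  ).map((entry) => entry.engine);
}

// Placeholder values only; never commit real keys.
console.log(enginesWithKeys({ PERPLEXITY_API_KEY: "example-key", GEMINI_API_KEY: "example-key" }));
```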
The page architecture
Everything write and fix produce — and everything doctor audits for — follows a strict seven-block architecture:
- TL;DR — 40–60 word answer capsule
- Intro — context-setting blocks
- Sections — question-format H2s, each opening with a 40–60 word answer capsule
- Related Guides — 4–8 entries
- Key Takeaways — 4–6 declarative bullets
- FAQ — 3–10 Q&As with 40–60 word answers
- Disclosure — sourcing note, timestamp, publisher line
Structural violations are hard errors (the generated payload is rejected and regenerated); density and cadence heuristics are soft warnings. The full spec: docs/architecture.md, docs/validation.md, and the SOP behind every constraint: docs/sop.md.
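The count and length ranges in the list above lend themselves to a mechanical check. The sketch below is illustrative only: PageDraft and its field names are hypothetical rather than auto-geo's schema, and treating every one of these rules as a hard error is an assumption (docs/validation.md defines which rules are hard and which are soft).

```typescript
// Hypothetical page shape: field names are illustrative, not auto-geo's schema.
interface PageDraft {
  tldr: string;
  relatedGuides: string[];
  keyTakeaways: string[];
  faq: { question: string; answer: string }[];
}

function wordsIn(text: string): number {
  const t = text.trim();
  return t === "" ? 0 : t.split(/\s+/).length;
}

function inRange(n: number, min: number, max: number): boolean {
  return n >= min && n <= max;
}

/** Returns one error per violated range from the architecture list. */
function structuralErrors(page: PageDraft): string[] {
  const errors: string[] = [];
  if (!inRange(wordsIn(page.tldr), 40, 60)) errors.push("TL;DR must be 40-60 words");
  if (!inRange(page.relatedGuides.length, 4, 8)) errors.push("Related Guides needs 4-8 entries");
  if (!inRange(page.keyTakeaways.length, 4, 6)) errors.push("Key Takeaways needs 4-6 bullets");
  if (!inRange(page.faq.length, 3, 10)) errors.push("FAQ needs 3-10 Q&As");
  page.faq.forEach((item, i) => {
    if (!inRange(wordsIn(item.answer), 40, 60)) {
      errors.push("FAQ answer " + (i + 1) + " must be 40-60 words");
    }
  });
  return errors;
}

/** Builds placeholder text with exactly n words. */
function repeatWords(n: number): string {
  return new Array(n + 1).join("word ").trim();
}

const draft: PageDraft = {
  tldr: repeatWords(50),
  relatedGuides: ["a", "b", "c"], // one short of the 4-8 range
  keyTakeaways: ["a", "b", "c", "d"],
  faq: [
    { question: "Q1?", answer: repeatWords(45) },
    { question: "Q2?", answer: repeatWords(45) },
    { question: "Q3?", answer: repeatWords(10) }, // too short
  ],
};
console.log(structuralErrors(draft));
```

A generator that receives these error strings as feedback has concrete targets to fix on the next attempt.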
The output JSON is renderer-agnostic — POST it to your CMS, hydrate a template, or render it with your own components. The structure is the contract.
Agent-friendly output
Every command is built to be driven by an agent as much as by a human:
- --json: one stable, machine-readable object on stdout.
- --ndjson (check): one JSON line per query as results stream in, plus a _summary line.
- Progress goes to stderr, results to stdout, so pipes stay clean.
- Stable exit codes: doctor and check are CI gates out of the box.
- --no-color / NO_COLOR / non-TTY detection for log-friendly output.
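For agents that prefer code over jq, the NDJSON stream can be consumed line by line. This is a hedged sketch: it assumes each result line carries query and cited fields (as the jq example in this section implies) and that the summary line is an object with a _summary key. The contents of that summary object in the sample data are invented for illustration.

```typescript
// Sketch of consuming `auto-geo check --ndjson` output captured as text.
// Assumptions: result lines have `query` and `cited` fields, and the summary
// line is an object with a `_summary` key. See docs/check.md for real shapes.
interface NdjsonResult {
  query: string;
  cited: boolean;
}

function citedFromNdjson(streamText: string): string[] {
  const cited: string[] = [];
  streamText.split(/\r?\n/).forEach((line) => {
    if (line.trim() === "") return; // tolerate blank lines
    const record = JSON.parse(line);
    if (record && typeof record === "object" && "_summary" in record) return; // skip the summary line
    const r = record as NdjsonResult;
    if (r.cited === true) cited.push(r.query);
  });
  return cited;
}

// Sample data; the `_summary` payload here is hypothetical.
const captured = [
  '{"query":"what is GEO","cited":false}',
  '{"query":"open source GEO tools","cited":true}',
  '{"_summary":{"coverage":0.5}}',
].join("\n");

console.log(citedFromNdjson(captured));
```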
auto-geo check --ndjson | jq 'select(.cited) | .query'
LLM-friendly
auto-geo is a tool whose output is content meant to be cited by LLMs — so this repo eats its own dogfood:
- llms.txt: a curated index following the llmstxt.org convention.
- llms-full.txt: README + every substantive doc inlined into a single file for one-fetch ingestion.
- GitHub Pages site at shadowresearch.github.io/auto-geo: advertises both via <link rel="alternate">, emits Article JSON-LD.
- AGENT.md: a compact operating spec for coding agents driving the CLI.
Contributing
See CONTRIBUTING.md. Bug reports, check improvements, new engines, and documentation refinements are all welcome.
License
MIT.
About Shadow
Shadow is a media research lab building the next generation of AI-powered media intelligence and communications technology, in partnership with the teams that put OpenAI, TikTok, Meta, Amazon, and Lovable on the map. Shadow runs auto-geo end-to-end on a schedule for media research, PR, and communications teams.
Learn more at shadow.inc.
