aeorank
v2.3.2
AI Engine Optimization audit - score any website across 28 criteria for AI visibility
AEORank
Score any website for AI engine visibility across 28 criteria. Pure HTTP + regex - zero API keys, under 10 seconds.
Quick Start
CLI
npx aeorank example.com
npx aeorank example.com --json # JSON output
npx aeorank example.com --summary # Human-readable scorecard
npx aeorank example.com --html # Standalone HTML report
npx aeorank example.com --ci --threshold 80 # CI gate
npx aeorank site-a.com site-b.com # Side-by-side comparison
npx aeorank example.com --full-crawl # Crawl all discoverable pages
npx aeorank example.com --full-crawl --max-pages 50 # Limit to 50 pages
Programmatic
import { audit } from 'aeorank';
const result = await audit('example.com');
console.log(result.overallScore); // 0-100
console.log(result.scorecard); // 28 criteria with scores
console.log(result.opportunities); // Prioritized improvements
What It Checks
AEORank evaluates 28 criteria that determine how AI engines (ChatGPT, Claude, Perplexity, Google AI Overviews) discover, parse, and cite your content. Criteria are organized into three tiers by impact on real-world AI citations:
Scoring Tiers (by importance)
Content Substance (~55%) - Why an AI engine would cite you:
| Criterion | Weight | What it measures |
|-----------|--------|------------------|
| Topic Coherence | 14% | Blog content focus on core expertise vs scattered topics |
| Original Data & Expert Analysis | 10% | Proprietary research, case studies, unique data points |
| Content Depth | 7% | Article length, heading structure, deep vs thin pages |
| Fact & Data Density | 6% | Specific numbers, statistics, data points per page |
| Direct Answer Paragraphs | 5% | Concise answer paragraphs after question headings |
| Q&A Content Format | 5% | Question-format headings (What, How, Why) with answers |
| Query-Answer Alignment | 5% | Every question heading followed by a direct answer |
| Comprehensive FAQ Section | 4% | Dedicated FAQ with FAQPage schema markup |
Content Organization (~30%) - How easily AI can extract and trust your content:
| Criterion | Weight | What it measures |
|-----------|--------|------------------|
| Entity Authority & NAP Consistency | 5% | Organization schema, consistent name/address/phone |
| Internal Linking Structure | 4% | Topic clusters, breadcrumbs, reachability from homepage |
| Content Freshness Signals | 4% | dateModified schema, visible dates, recent content |
| Schema.org Structured Data | 3% | JSON-LD blocks (Organization, Article, FAQPage, etc.) |
| Author & Expert Schema | 3% | Person schema with credentials and expertise |
| Table & List Extractability | 3% | HTML tables with headers, ordered/unordered lists |
| Definition Patterns | 2% | Clear "X is defined as..." patterns for key terms |
| Visible Date Signal | 2% | Visible publication dates with <time> elements |
| Semantic HTML5 & Accessibility | 2% | Semantic elements (main, article, nav), ARIA, lang |
| Clean, Crawlable HTML | 2% | HTTPS, meta tags, proper heading hierarchy |
Technical Plumbing (~15%) - Whether AI crawlers can find you (table stakes):
| Criterion | Weight | What it measures |
|-----------|--------|------------------|
| Content Cannibalization | 2% | Overlapping pages competing for the same topic |
| llms.txt File | 2% | /llms.txt with site description and key page URLs |
| robots.txt for AI Crawlers | 2% | GPTBot, ClaudeBot, PerplexityBot access |
| Content Publishing Velocity | 2% | Regular publishing cadence in sitemap |
| Content Licensing & AI Permissions | 2% | /ai.txt file, license schema for AI usage |
| Sitemap Completeness | 1% | sitemap.xml with lastmod dates |
| Canonical URL Strategy | 1% | Self-referencing canonical tags |
| RSS/Atom Feed | 1% | RSS feed linked from homepage |
| Schema Coverage & Depth | 1% | Schema markup on inner pages, not just homepage |
| Speakable Schema | 1% | SpeakableSpecification for voice assistants |
Coherence Gate: Sites with topic coherence below 6/10 are score-capped regardless of technical perfection. A scattered site with perfect robots.txt, llms.txt, and schema will score lower than a focused site with mediocre technical implementation.
| # | Criterion | Weight | Tier |
|---|-----------|--------|------|
| 1 | llms.txt File | 2% | Plumbing |
| 2 | Schema.org Structured Data | 3% | Organization |
| 3 | Q&A Content Format | 5% | Substance |
| 4 | Clean, Crawlable HTML | 2% | Organization |
| 5 | Entity Authority & NAP Consistency | 5% | Organization |
| 6 | robots.txt for AI Crawlers | 2% | Plumbing |
| 7 | Comprehensive FAQ Section | 4% | Substance |
| 8 | Original Data & Expert Analysis | 10% | Substance |
| 9 | Internal Linking Structure | 4% | Organization |
| 10 | Semantic HTML5 & Accessibility | 2% | Organization |
| 11 | Content Freshness Signals | 4% | Organization |
| 12 | Sitemap Completeness | 1% | Plumbing |
| 13 | RSS/Atom Feed | 1% | Plumbing |
| 14 | Table & List Extractability | 3% | Organization |
| 15 | Definition Patterns | 2% | Organization |
| 16 | Direct Answer Paragraphs | 5% | Substance |
| 17 | Content Licensing & AI Permissions | 2% | Plumbing |
| 18 | Author & Expert Schema | 3% | Organization |
| 19 | Fact & Data Density | 6% | Substance |
| 20 | Canonical URL Strategy | 1% | Plumbing |
| 21 | Content Publishing Velocity | 2% | Plumbing |
| 22 | Schema Coverage & Depth | 1% | Plumbing |
| 23 | Speakable Schema | 1% | Plumbing |
| 24 | Query-Answer Alignment | 5% | Substance |
| 25 | Content Cannibalization | 2% | Plumbing |
| 26 | Visible Date Signal | 2% | Organization |
| 27 | Topic Coherence | 14% | Substance |
| 28 | Content Depth | 7% | Substance |
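The coherence gate described above can be pictured as a cap applied after the weighted score is computed. The following is a minimal sketch, not AEORank's actual implementation: the README states the trigger (topic coherence below 6/10) but not the cap value, so `cappedMax` is a hypothetical parameter supplied by the caller, and `applyCoherenceGate` is an illustrative name rather than a package export.

```typescript
// Hypothetical sketch of a coherence gate; not AEORank's actual implementation.
// The trigger (coherence below 6/10) comes from the README; the cap value does not,
// so the caller passes it in.
const COHERENCE_GATE_THRESHOLD = 6;

function applyCoherenceGate(
  weightedScore: number,  // 0-100 score before gating
  topicCoherence: number, // 0-10 Topic Coherence criterion score
  cappedMax: number,      // assumed cap, chosen by the caller
): number {
  if (topicCoherence < COHERENCE_GATE_THRESHOLD) {
    // Scattered site: the score cannot exceed the cap, however good the plumbing is.
    return Math.min(weightedScore, cappedMax);
  }
  return weightedScore;
}

// A scattered site with strong technical scores versus a focused, mediocre one.
const scatteredSite = applyCoherenceGate(88, 4, 60);
const focusedSite = applyCoherenceGate(72, 8, 60);
console.log(scatteredSite, focusedSite);
```

With an assumed cap of 60, the scattered site ends up below the focused one, which is the ordering the gate is meant to enforce.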
CLI Options
aeorank <domain> [options]
aeorank <domain-a> <domain-b> [options] # comparison mode
Options:
--json Output raw JSON to stdout
--summary Print human-readable scorecard
--html Generate standalone HTML report file
--ci CI mode: JSON + exit 1 if score < threshold
--threshold <N> Score threshold for --ci (default: 70)
--no-headless Skip Puppeteer SPA rendering
--no-multi-page Skip extra page discovery (faster)
--full-crawl BFS crawl all discoverable pages
--max-pages <N> Max pages for --full-crawl (default: 200)
--concurrency <N> Parallel fetches for --full-crawl (default: 5)
--version Print version
--help Show help
GitHub Actions
Use the built-in action to gate deployments on AEO score:
- name: AEO Audit
uses: AEO-Content-Inc/aeorank@v2
with:
domain: example.com
threshold: 70
Or use npx directly:
- name: AEO Audit
run: npx aeorank example.com --ci --threshold 70
API
audit(domain, options?)
Run a complete audit. Returns AuditResult with:
- overallScore - 0-100 weighted score
- scorecard - 28 ScoreCardItem entries (criterion, score 0-10, status, key findings)
- detailedFindings - Per-criterion findings with severity
- opportunities - Prioritized improvements with effort/impact
- pitchNumbers - Key metrics (schema types, AI crawler access, etc.)
- verdict - Human-readable summary paragraph
- bottomLine - Actionable recommendation
- pagesReviewed - Per-page analysis with issues, strengths, and AEO score (0-100)
- elapsed - Wall-clock seconds
Options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| noHeadless | boolean | false | Skip Puppeteer SPA rendering |
| noMultiPage | boolean | false | Homepage + blog only |
| timeout | number | 15000 | Fetch timeout in ms |
| fullCrawl | boolean | false | BFS crawl all discoverable pages |
| maxPages | number | 200 | Max pages for full crawl |
| concurrency | number | 5 | Parallel fetches for full crawl |
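As a reading aid for the table above, the defaults can be written out as a typed object that caller-supplied options override. This is only a sketch: `AuditOptionsSketch`, `DEFAULT_AUDIT_OPTIONS`, and `resolveAuditOptions` are hypothetical names, and the type actually exported by aeorank may be shaped differently.

```typescript
// Illustrative shape only; the real option type exported by aeorank may differ.
interface AuditOptionsSketch {
  noHeadless: boolean;  // skip Puppeteer SPA rendering
  noMultiPage: boolean; // homepage + blog only
  timeout: number;      // fetch timeout in ms
  fullCrawl: boolean;   // BFS crawl all discoverable pages
  maxPages: number;     // max pages for full crawl
  concurrency: number;  // parallel fetches for full crawl
}

// Default values copied from the options table.
const DEFAULT_AUDIT_OPTIONS: AuditOptionsSketch = {
  noHeadless: false,
  noMultiPage: false,
  timeout: 15000,
  fullCrawl: false,
  maxPages: 200,
  concurrency: 5,
};

// Merge caller overrides on top of the defaults (later spread wins).
function resolveAuditOptions(
  overrides: Partial<AuditOptionsSketch> = {},
): AuditOptionsSketch {
  return { ...DEFAULT_AUDIT_OPTIONS, ...overrides };
}

const resolvedOptions = resolveAuditOptions({ fullCrawl: true, maxPages: 50 });
console.log(resolvedOptions);
```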
scorePage(html, url?)
Score a single HTML page against 14 per-page AEO criteria. Returns PageScoreResult with:
- aeoScore - 0-100 weighted score
- criterionScores - 14 PageCriterionScore entries (criterion, score 0-10, weight)
scoreAllPages(siteData)
Batch-score all pages (homepage + blogSample) from a SiteData object. Returns PageScoreResult[].
buildLinkGraph(pages, domain, homepageUrl)
Analyze internal linking structure from crawled pages. Returns LinkGraph with:
- nodes - Map of URL to PageNode (in/out degree, depth, pillar/hub/orphan flags)
- edges - Array of LinkEdge (from, to, anchor text)
- stats - LinkGraphStats (total pages, orphans, pillars, hubs, avg depth, clusters)
- clusters - TopicCluster[] (pillar URL, spoke URLs, cohesion score)
import { crawlFullSite, prefetchSiteData, buildLinkGraph } from 'aeorank';
const siteData = await prefetchSiteData('example.com');
const crawl = await crawlFullSite(siteData, { maxPages: 200 });
const graph = buildLinkGraph(crawl.pages, 'example.com', 'https://example.com');
console.log(graph.stats.orphanPages); // Pages with no inbound links
console.log(graph.stats.pillarPages); // High-authority hub pages
console.log(graph.clusters); // Topic clusters detected
generateFixPlan(domain, score, criteria, pages?, linkGraph?)
Generate a phased fix plan from audit results. Returns FixPlan with:
- phases - 4 phases (Foundation, Content, Authority, Architecture) with prioritized FixAction[]
- quickWins - Low-effort, high-impact fixes
- projectedScore - Estimated score after applying all fixes
- summary - Counts by impact level, top opportunity, estimated effort
Each FixAction includes: title, description, impact/effort levels, step-by-step instructions, code examples, affected pages, and dependency ordering.
import { audit, generateFixPlan } from 'aeorank';
const result = await audit('example.com');
const plan = generateFixPlan(
'example.com',
result.overallScore,
result.criterionResults,
result.pagesReviewed,
);
console.log(plan.projectedScore); // e.g. 82
console.log(plan.quickWins[0].title); // e.g. "Add llms.txt file"
console.log(plan.quickWins[0].impactScore); // e.g. 10
console.log(plan.phases[0].fixes.length); // Foundation phase fixes
Advanced API
For custom pipelines, import individual stages:
import {
prefetchSiteData,
auditSiteFromData,
calculateOverallScore,
buildScorecard,
buildDetailedFindings,
generateVerdict,
generateOpportunities,
scorePage,
scoreAllPages,
buildLinkGraph,
generateFixPlan,
isSpaShell,
fetchWithHeadless,
} from 'aeorank';
const siteData = await prefetchSiteData('example.com');
const results = auditSiteFromData(siteData);
const score = calculateOverallScore(results);
Browser Entry Point
For browser environments (Chrome extensions, web apps), import from aeorank/browser to avoid Node.js dependencies (Puppeteer, fs):
import {
prefetchSiteData,
auditSiteFromData,
calculateOverallScore,
buildLinkGraph,
generateFixPlan,
analyzeAllPages,
crawlFullSite,
} from 'aeorank/browser';
The browser entry exports everything except headless-fetch (Puppeteer), html-report (Node fs), audit orchestrator, and CLI.
SPA Support
Sites that use client-side rendering (React, Vue, Angular) return empty HTML shells to regular HTTP requests. AEORank detects these automatically and re-renders them with Puppeteer if available.
Install Puppeteer as an optional dependency:
npm install puppeteer
Use --no-headless to skip SPA rendering (faster but may produce lower scores for SPAs).
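To make "empty HTML shell" concrete, here is one plausible heuristic for spotting a client-rendered page from its raw HTML. It is a sketch under stated assumptions, not the logic behind the package's own `isSpaShell`: the function name `looksLikeSpaShell`, the mount-point ids (`root`, `app`, `__next`), and the 200-character threshold are all chosen for illustration.

```typescript
// Hypothetical heuristic; AEORank's own detection may use different signals.
// A page is treated as a shell when it has a framework mount point and
// almost no visible text once scripts, styles, and tags are removed.
function looksLikeSpaShell(html: string, minVisibleChars: number = 200): boolean {
  const bodyMatch = /<body\b[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  const body = (bodyMatch && bodyMatch[1]) || html;

  const visibleText = body
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, ' ') // drop inline scripts
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, ' ')   // drop inline styles
    .replace(/<[^>]+>/g, ' ')                            // drop remaining tags
    .replace(/\s+/g, ' ')
    .trim();

  // Assumed mount-point ids used by common React/Vue/Next.js setups.
  const hasMountPoint = /<div\b[^>]*\bid=["'](?:root|app|__next)["']/i.test(body);

  return hasMountPoint && visibleText.length < minVisibleChars;
}

const shellHtml =
  '<html><body><div id="root"></div><script src="/app.js"></script></body></html>';
console.log(looksLikeSpaShell(shellHtml));
```

A server-rendered page that happens to use `id="root"` but ships real text would not be flagged, because the visible-text check fails.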
Page Discovery
AEORank automatically discovers and scores pages beyond just the homepage:
- Sitemap blog sample - Up to 50 blog/article pages from sitemap.xml
- Nav link extraction - Internal links from <nav> elements
- Common page variants - /about, /pricing, /services, /contact, /team, /resources, /docs, /case-studies
- Sitemap content pages - 6 non-blog pages from sitemap (service pages, product pages)
- Homepage link fallback (v2.2+) - When no sitemap exists (or fewer than 4 blog pages found), extracts up to 30 internal links from the full homepage HTML to build a page list automatically
This ensures realistic scoring even for sites without a sitemap. Before the fallback existed, such sites had only 1-5 pages checked, which inflated their scores.
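The homepage link fallback can be sketched as a regex pass over the homepage HTML that keeps same-host links and stops at a limit. This is an assumption-laden illustration rather than AEORank's code: `extractInternalLinks` and `tryResolveUrl` are hypothetical helpers, and the href pattern and de-duplication rules are simplified.

```typescript
// Hypothetical sketch of a homepage-link fallback; the regex and filtering
// rules are assumptions, not AEORank's actual implementation.
function tryResolveUrl(href: string, base: URL): URL | null {
  try {
    return new URL(href, base); // resolves relative hrefs against the homepage
  } catch {
    return null; // malformed href
  }
}

function extractInternalLinks(
  html: string,
  homepageUrl: string,
  limit: number = 30, // the README's fallback extracts up to 30 links
): string[] {
  const base = new URL(homepageUrl);
  const hrefPattern = /<a\b[^>]*\bhref=["']([^"']+)["']/gi;
  const seen = new Set<string>();
  const links: string[] = [];
  let match: RegExpExecArray | null;

  while ((match = hrefPattern.exec(html)) !== null && links.length < limit) {
    const resolved = tryResolveUrl(match[1] ?? '', base);
    if (resolved === null) continue;
    if (resolved.protocol !== 'http:' && resolved.protocol !== 'https:') continue;
    if (resolved.hostname !== base.hostname) continue; // internal links only
    resolved.hash = ''; // treat /page#a and /page#b as the same page
    const normalized = resolved.href;
    if (normalized === base.href || seen.has(normalized)) continue;
    seen.add(normalized);
    links.push(normalized);
  }
  return links;
}

const sampleHomepage =
  '<nav><a href="/about">About</a>' +
  '<a href="https://example.com/pricing#plans">Pricing</a>' +
  '<a href="https://other.com/x">External</a>' +
  '<a href="/about">About again</a>' +
  '<a href="mailto:hi@example.com">Mail</a>' +
  '<a href="#top">Top</a></nav>';

const discoveredLinks = extractInternalLinks(sampleHomepage, 'https://example.com');
console.log(discoveredLinks);
```

External, `mailto:`, duplicate, and fragment-only links are dropped, so the sample yields only the `/about` and `/pricing` pages.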
Full-Site Crawl
For even deeper analysis beyond the automatic page discovery, enable --full-crawl to BFS-crawl every discoverable page:
npx aeorank example.com --full-crawl # Up to 200 pages
npx aeorank example.com --full-crawl --max-pages 50 # Limit to 50
npx aeorank example.com --full-crawl --concurrency 10 # 10 parallel fetches
The crawler seeds from sitemap URLs and homepage links, then follows internal links on each fetched page. It respects robots.txt Disallow rules, skips resource files, and tags each page with a category (blog, about, pricing, services, docs, faq, etc.).
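The crawl order described above is a breadth-first search bounded by a page limit. The sketch below shows that shape on an in-memory link map so it runs without network access. It is deliberately simplified and not the package's crawler: it is synchronous and ignores concurrency, and `bfsCrawl`, `fetchLinks`, and `isDisallowed` are hypothetical stand-ins for fetching a page, extracting its internal links, and checking robots.txt Disallow rules.

```typescript
// Hypothetical sketch of a bounded BFS crawl; not AEORank's crawler.
// `fetchLinks` stands in for "fetch a page and return its internal links".
type FetchLinks = (url: string) => string[];

function bfsCrawl(
  seeds: string[],
  fetchLinks: FetchLinks,
  isDisallowed: (url: string) => boolean, // stand-in for robots.txt Disallow rules
  maxPages: number,
): string[] {
  const queue: string[] = [];
  const seen = new Set<string>();
  const visited: string[] = [];

  for (const seed of seeds) {
    if (!seen.has(seed)) {
      seen.add(seed);
      queue.push(seed);
    }
  }

  while (queue.length > 0 && visited.length < maxPages) {
    const url = queue.shift() as string;
    if (isDisallowed(url)) continue; // never fetched, so its links are never followed
    visited.push(url);
    for (const link of fetchLinks(url)) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return visited;
}

// Tiny in-memory site used in place of real HTTP fetches.
const demoSite: Record<string, string[]> = {
  '/': ['/blog', '/about'],
  '/blog': ['/blog/a', '/admin'],
  '/about': ['/'],
  '/blog/a': [],
  '/admin': ['/secret'],
};

const crawledPages = bfsCrawl(
  ['/'],
  (url) => demoSite[url] ?? [],
  (url) => url.startsWith('/admin'),
  200,
);
console.log(crawledPages);
```

Pages are visited level by level from the seed, and `/secret` is never reached because the only path to it runs through a disallowed page.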
Programmatic usage:
import { audit } from 'aeorank';
const result = await audit('example.com', {
fullCrawl: true,
maxPages: 100,
concurrency: 5,
});
Or use the crawler directly:
import { crawlFullSite, prefetchSiteData } from 'aeorank';
const siteData = await prefetchSiteData('example.com');
const crawlResult = await crawlFullSite(siteData, { maxPages: 200 });
console.log(crawlResult.pages.length); // Pages fetched
console.log(crawlResult.discoveredUrls.length); // Total URLs found
Per-Page Scoring
AEORank scores each individual page (0-100) against the 14 criteria that apply at page level. Instead of only seeing "your site scores 62," you get "your /about page scores 45, your /blog/guide scores 78."
The 14 per-page criteria follow the same substance-first weighting as the site-level score:
| Tier | Per-Page Criteria | Weight |
|------|-------------------|--------|
| Substance | Original Data & Expert Content | 10% |
| | Fact & Data Density | 6% |
| | Direct Answer Paragraphs | 5% |
| | Q&A Content Format | 5% |
| | Query-Answer Alignment | 5% |
| | FAQ Section Content | 4% |
| Organization | Content Freshness Signals | 4% |
| | Schema.org Structured Data | 3% |
| | Table & List Extractability | 3% |
| | Definition Patterns | 2% |
| | Visible Date Signal | 2% |
| | Semantic HTML5 & Accessibility | 2% |
| | Clean, Crawlable HTML | 2% |
| Plumbing | Canonical URL Strategy | 1% |
The remaining 14 criteria are site-level only: llms.txt, robots.txt, sitemap, RSS, entity consistency, internal linking, content licensing, author schema, content velocity, schema coverage, speakable schema, content cannibalization, topic coherence, and content depth.
CLI Output
Per-page scores appear in the pages section:
Pages reviewed (47):
Homepage https://example.com 0 issues [AEO: 72]
Blog https://example.com/blog/post 2 issues [AEO: 58]
Average page AEO score: 62/100
Top: /blog/medicare-walkers-guide (92)
Bottom: /thank-you (23)
Programmatic API
import { scorePage, scoreAllPages } from 'aeorank';
import type { PageScoreResult, PageCriterionScore } from 'aeorank';
// Score a single page
const result = scorePage(html, url);
console.log(result.aeoScore); // 0-100
console.log(result.criterionScores); // 14 per-criterion scores
// Score all pages from site data
const allScores = scoreAllPages(siteData);
Link Graph Analysis
Analyze your site's internal linking structure to find orphan pages, identify pillar content, and detect topic clusters:
npx aeorank example.com --full-crawl --json | jq '.linkGraph.stats'
import { crawlFullSite, prefetchSiteData, buildLinkGraph, serializeLinkGraph } from 'aeorank';
const siteData = await prefetchSiteData('example.com');
const crawl = await crawlFullSite(siteData, { maxPages: 200 });
const graph = buildLinkGraph(crawl.pages, 'example.com', 'https://example.com');
// Orphan pages (no inbound links - invisible to crawlers)
const orphans = [...graph.nodes.values()].filter(n => n.isOrphan);
// Pillar pages (high authority, many inbound links)
const pillars = [...graph.nodes.values()].filter(n => n.isPillar);
// Topic clusters (pillar + spoke pages with high cohesion)
graph.clusters.forEach(c => {
console.log(`${c.pillarTitle}: ${c.spokes.length} spokes, cohesion ${c.cohesion}`);
});
// Serialize for storage/transport (Map -> plain object)
const json = serializeLinkGraph(graph);
Fix Plan Engine
Generate actionable, phased fix plans from audit results. Each fix includes step-by-step instructions, code examples, effort/impact ratings, and dependency ordering:
npx aeorank example.com --full-crawl --json | jq '.fixPlan'
import { audit, generateFixPlan } from 'aeorank';
const result = await audit('example.com', { fullCrawl: true });
const plan = generateFixPlan(
'example.com',
result.overallScore,
result.criterionResults,
result.pagesReviewed,
result.linkGraph, // optional - enables link-aware fixes
);
// 4 phases: Foundation -> Content -> Authority -> Architecture
plan.phases.forEach(phase => {
console.log(`${phase.title}: ${phase.fixes.length} fixes`);
});
// Quick wins: low effort + high impact
plan.quickWins.forEach(qw => {
console.log(`${qw.title} (+${qw.impactScore} pts) - ${qw.effort} effort`);
qw.steps.forEach(s => console.log(` - ${s}`));
});
console.log(`Current: ${plan.overallScore} -> Projected: ${plan.projectedScore}`);
Scoring
Each criterion is scored 0-10 by deterministic checks (regex, HTML parsing, HTTP headers). The overall score is a weighted average normalized to 0-100.
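The weighted average can be sketched as follows. The README does not spell out the exact arithmetic, so the details here are assumptions: the sum is divided by the total weight of the criteria supplied (the published weights are approximate percentages), the result is rounded to an integer, and `weightedOverallScore` is a hypothetical name rather than the package's `calculateOverallScore`.

```typescript
// Sketch of a weighted average normalized to 0-100; rounding and
// normalization by total weight are assumptions, not documented behavior.
interface CriterionScoreSketch {
  criterion: string;
  score: number;  // 0-10
  weight: number; // relative weight, e.g. 14 for 14%
}

function weightedOverallScore(items: CriterionScoreSketch[]): number {
  const totalWeight = items.reduce((sum, item) => sum + item.weight, 0);
  if (totalWeight === 0) return 0;
  const weightedSum = items.reduce((sum, item) => sum + item.score * item.weight, 0);
  // Criterion scores are 0-10, so the weighted mean is multiplied by 10.
  return Math.round((weightedSum / totalWeight) * 10);
}

// Three criteria only, to keep the arithmetic easy to follow:
// (8*14 + 6*7 + 0*2) / (14 + 7 + 2) = 154 / 23, about 6.70, so the score is 67.
const sampleOverall = weightedOverallScore([
  { criterion: 'Topic Coherence', score: 8, weight: 14 },
  { criterion: 'Content Depth', score: 6, weight: 7 },
  { criterion: 'llms.txt File', score: 0, weight: 2 },
]);
console.log(sampleOverall);
```

The example shows why substance dominates: a missing llms.txt file (weight 2) costs far less than a one-point drop in Topic Coherence (weight 14).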
Score interpretation:
- 86-100 - Excellent AI visibility
- 71-85 - Strong fundamentals, room for optimization
- 56-70 - Moderate readiness, significant gaps
- 41-55 - Below average, multiple areas need attention
- 0-40 - Critical gaps, largely invisible to AI engines
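For scripts that consume `--json` output, the bands above translate directly into a lookup. `interpretScore` is a hypothetical helper written for this README, not a package export; the thresholds and labels are taken from the list above.

```typescript
// Maps a 0-100 score to the interpretation bands listed above.
// Hypothetical helper; not exported by aeorank.
function interpretScore(score: number): string {
  if (score >= 86) return 'Excellent AI visibility';
  if (score >= 71) return 'Strong fundamentals, room for optimization';
  if (score >= 56) return 'Moderate readiness, significant gaps';
  if (score >= 41) return 'Below average, multiple areas need attention';
  return 'Critical gaps, largely invisible to AI engines';
}

console.log(interpretScore(62));
```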
HTML Reports
Generate a self-contained HTML report with score visualization, scorecard grid, and opportunities table:
npx aeorank example.com --html
# -> aeorank-example-com.html
npx aeorank site-a.com site-b.com --html
# -> aeorank-site-a-com-vs-site-b-com.html
Reports include inline CSS and SVG - no external dependencies. Open directly in any browser or share as a file.
Programmatic usage:
import { audit, generateHtmlReport } from 'aeorank';
const result = await audit('example.com');
const html = generateHtmlReport(result);
Comparison Mode
Compare two sites side-by-side. Both audits run in parallel:
npx aeorank site-a.com site-b.com
npx aeorank site-a.com site-b.com --json
npx aeorank site-a.com site-b.com --html
Programmatic usage:
import { compare } from 'aeorank';
const result = await compare('site-a.com', 'site-b.com');
console.log(result.comparison.scoreDelta); // Overall score difference
console.log(result.comparison.siteAAdvantages); // Criteria where A leads
console.log(result.comparison.siteBAdvantages); // Criteria where B leads
console.log(result.comparison.tied); // Criteria with equal scores
Changelog
v2.3.0 - Coherence Scaling & Script Stripping
- Topic coherence scales with page count: Sites with many pages (50+) are no longer penalized for having more topic clusters. Cluster thresholds scale proportionally (pages/10, pages/5, pages/3). Absolute term presence (10+ pages) boosts the focus score.
- Strip inline JavaScript from scoring: <script> and <style> tags are now removed before text analysis, preventing WP Rocket and similar deferred-loading scripts from corrupting regex-based scoring.
- Regex safety net: checkQueryAnswerAlignment wraps new RegExp() in try-catch to handle residual script content gracefully.
v2.2.0 - Auto Page Discovery
Sites without a sitemap.xml now get up to 30 pages discovered from homepage links instead of 1-5. Prevents inflated scores from insufficient page coverage.
v2.1.0 - Scoring Rebalance with Coherence Gate
Weight distribution redesigned: Content Substance ~55%, Organization ~30%, Plumbing ~15%. Coherence gate caps scores when topic focus is below 6/10.
v2.0.0 - Topic Coherence & Content Depth
Added 2 new criteria (26 -> 28): Topic Coherence (14%) and Content Depth (7%). Blog sampling for coherence analysis.
v1.6.0 - Link Graph & Fix Plan Engine
Internal linking analysis with orphan/pillar/hub detection, topic clusters. Phased fix plan generation with code examples.
v1.5.0 - Per-Page Scoring
Individual page scores (0-100) against 14 page-level criteria. Top/bottom page rankings.
Benchmark Dataset
The data/ directory contains the largest open dataset of AI visibility scores - 13,619 domains scored across 28 criteria, including 4,328 Y Combinator startups across 48 batches (W06-W26):
| File | Contents |
|------|----------|
| data/benchmark.json | 13,619 domains with per-criterion scores and sector/category |
| data/yc.json | 4,328 Y Combinator startups with company name, one-liner, founders, industry tags |
| data/sectors.json | 129 sectors with pre-computed statistics (mean, median, p25, p75) |
Use the dataset for research, benchmarking, or building on top of AEORank:
import yc from './data/yc.json' assert { type: 'json' };
// Find top-scoring YC startups
const top = yc.entries.filter(e => e.score >= 80);
// Filter by batch
const w25 = yc.entries.filter(e => e.batch === 'w25');
// Get sector averages
import sectors from './data/sectors.json' assert { type: 'json' };
console.log(sectors.sectors.healthcare.mean);
Contributing
git clone https://github.com/AEO-Content-Inc/aeorank.git
cd aeorank
npm install
npm test
npm run build
License
MIT - see LICENSE
Built by AEO Content, Inc.
