@agent-ready-web/next
v0.1.2
Published
Next.js App Router adapters for agent-ready-web.
Maintainers
Readme
agent-ready-web
Agent readiness primitives for Next.js and Cloudflare Workers. Generates spec-compliant robots.txt, sitemap.xml, Link: headers (RFC 8288), .well-known/api-catalog (RFC 9727), and .well-known/agent-skills/index.json from a single config file.
One config, three packages:
┌──────────────────────────────┐ ┌────────────────────────┐
│ agent-readiness.config.ts │───▶│ agent-ready-web │ pure generators
└──────────────┬───────────────┘ └────────────────────────┘
│ ▲
├────────────────────────▶│ @agent-ready-web/next Next.js App Router
│ │
└────────────────────────▶│ @agent-ready-web/cloudflare Workers handlerWhy
Cloudflare's Agent Readiness Score checker at isitagentready.com validates a 15-item checklist across Discoverability, Content, Bot Access Control, API/Auth/MCP discovery, and Commerce. As AI agents become a significant traffic source — Cloudflare reports AI-referred traffic up 7x since January 2025 and AI-attributed orders up 11x — agent legibility has become foundational web infrastructure alongside SEO.
This package covers the Foundation + Discovery subset (items 1, 2, 3, 5, 6, 7, 11). Item 4 (markdown content negotiation) is deferred to v0.2 — see § Markdown negotiation. Items 8–10 and 12–15 are out of scope until use cases emerge.
Install
Pick the adapter for your framework:
# Next.js App Router site
pnpm add agent-ready-web @agent-ready-web/next
# Cloudflare Worker
pnpm add agent-ready-web @agent-ready-web/cloudflareBoth adapters pull agent-ready-web transitively; it's listed above so the config file can import { defineAgentReadiness } from 'agent-ready-web' directly.
Quickstart: Next.js
One config file, five one-liners.
// src/agent-readiness.config.ts
import { defineAgentReadiness } from 'agent-ready-web';
export default defineAgentReadiness({
siteUrl: 'https://example.com',
contentSignals: { 'ai-train': 'yes', search: 'yes', 'ai-input': 'yes' },
aiCrawlers: {
GPTBot: 'allow',
'OAI-SearchBot': 'allow',
'Claude-Web': 'allow',
'anthropic-ai': 'allow',
'Google-Extended': 'allow',
PerplexityBot: 'allow',
CCBot: 'allow',
},
userAgents: [{ name: '*', allow: ['/'], disallow: ['/api/', '/admin'] }],
sitemap: {
urls: async () => [
{
url: 'https://example.com/',
lastModified: '2026-04-21',
changeFrequency: 'weekly',
priority: 1.0,
},
{ url: 'https://example.com/about' },
],
},
linkHeaders: {
'/': [
{ href: '/.well-known/api-catalog', rel: 'api-catalog' },
{ href: '/sitemap.xml', rel: 'sitemap' },
],
},
apiCatalog: {
linkset: [
{
anchor: 'https://example.com/api/contact',
'service-desc': [
{ href: '/api/openapi.json', type: 'application/openapi+json;version=3.1' },
],
},
],
},
agentSkills: [
{
id: 'contact',
name: 'Contact',
description: 'Send a message',
manifest: { endpoint: '/api/contact' },
},
],
});// src/app/robots.txt/route.ts
import { createRobotsRoute } from '@agent-ready-web/next';
import config from '@/agent-readiness.config';
export const dynamic = 'force-dynamic';
export const GET = createRobotsRoute(config);// src/app/sitemap.ts
import { createSitemap } from '@agent-ready-web/next';
import config from '@/agent-readiness.config';
export const dynamic = 'force-dynamic';
export default createSitemap(config);// src/middleware.ts
import { createMiddleware } from '@agent-ready-web/next';
import configModule from '@/agent-readiness.config';
export const middleware = createMiddleware(configModule);
export const config = {
matcher: [
'/((?!_next/static|_next/image|favicon.ico|.*\\.(?:png|jpg|jpeg|svg|webp|ico|woff2?)).*)',
],
};// src/app/.well-known/api-catalog/route.ts
import { createApiCatalogRoute } from '@agent-ready-web/next';
import config from '@/agent-readiness.config';
export const dynamic = 'force-dynamic';
export const GET = createApiCatalogRoute(config);// src/app/.well-known/agent-skills/index.json/route.ts
import { createAgentSkillsRoute } from '@agent-ready-web/next';
import config from '@/agent-readiness.config';
export const GET = createAgentSkillsRoute(config);createRobots(config) is also available if you want to use Next.js's built-in MetadataRoute.Robots export — but it cannot emit the Content-Signal: directive because Next's metadata shape has no field for it. Use createRobotsRoute when you want the full robots.txt including Content-Signal.
Quickstart: Cloudflare Workers
// src/index.ts
import { handleAgentRoutes } from '@agent-ready-web/cloudflare';
import config from './agent-readiness.config.js';
export default {
async fetch(request: Request): Promise<Response> {
const handled = await handleAgentRoutes(request, config);
if (handled) return handled;
// ... your normal Worker logic, or:
return new Response('Not found', { status: 404 });
},
} satisfies ExportedHandler;handleAgentRoutes returns a Response for exactly four paths — /robots.txt, /sitemap.xml, /.well-known/api-catalog (only if config.apiCatalog is defined), and /.well-known/agent-skills/index.json (only if config.agentSkills is defined). For any other path it returns null, leaving the Worker's existing logic free to handle it. It never proxies, augments, or buffers responses on other paths.
Full config reference
Every field of AgentReadinessConfig:
| Field | Type | Purpose |
| -------------------------------- | ----------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| siteUrl | string | Absolute origin of the site, e.g. 'https://example.com'. Used to build the Sitemap: line in robots.txt and any absolute URLs downstream. |
| contentSignals | { 'ai-train', 'search', 'ai-input' } | Emitted as a Content-Signal: directive inside the User-agent: * stanza (per draft-romm-aipref-contentsignals). Values are 'yes' or 'no'. See contentsignals.org. |
| aiCrawlers | Record<string, 'allow' \| 'disallow'> | Per-bot dispositions appended to robots.txt as User-agent: stanzas with Allow: / or Disallow: /. Common keys: GPTBot, OAI-SearchBot, Claude-Web, anthropic-ai, Google-Extended, PerplexityBot, CCBot. |
| userAgents | UserAgentRule[] | Human-equivalent User-agent: stanzas with allow and disallow path arrays (plus optional crawlDelay). Typically [{ name: '*', allow: ['/'], disallow: ['/api/', '/admin'] }]. |
| sitemap | { urls: () => Promise<SitemapEntry[]> } | Async URL source for /sitemap.xml. Each entry has url plus optional lastModified, changeFrequency, priority. |
| linkHeaders | Record<pathname, LinkHeaderEntry[]> | Per-path Link: header directives per RFC 8288. Use to advertise api-catalog, sitemap, and other rel links from HTML responses. Applied by createMiddleware. |
| apiCatalog (optional) | { linkset: ApiCatalogLinksetItem[] } | RFC 9727 linkset served at /.well-known/api-catalog. Each item has an anchor URL and arrays of service-desc, service-doc, service-meta links. Omit if the site has no public API. |
| agentSkills (optional) | AgentSkill[] | Skills declared to agents. Each skill has id, name, description, manifest (arbitrary JSON). Served at /.well-known/agent-skills/index.json with auto-computed SHA-256 digests over canonical-JSON manifests. |
| markdownNegotiation (optional) | { enabled: boolean, exclude?, maxBodyBytes? } | Not implemented in v0.1. Setting enabled: true causes createMiddleware to throw Not implemented. See Markdown negotiation. |
Content-Signals classification
ai-train=yesmeans "use my content for AI model training". Setnoto opt out of training.search=yesmeans "allow search-style retrieval for citation and grounding". Most sites wantyes.ai-input=yesmeans "allow use as input to user-facing AI features" (e.g., summaries in chat answers). Related to but distinct from training.
AI crawler classification gotchas
Google-Extendedis Google's training-only opt-out token (Bard/Gemini), not a search crawler. RegularGooglebothandles Google Search and follows standardUser-agent: *rules.Claude-Webis Anthropic's retrieval bot for Claude search/citation.anthropic-aiis Anthropic's training bot. On sites withai-train=no,Claude-Webcan still be allowed (search) whileanthropic-aiis disallowed (training).OAI-SearchBotis OpenAI's search crawler.GPTBotis OpenAI's training bot.
Per-RFC mapping
Which isitagentready.com check each function satisfies:
| isitagentready item | Spec | Function |
| -------------------------------- | ---------------------------------------------------------------------------------------------------------------- | -------------------------------------------- |
| 1. robots.txt present | RFC 9309 | robotsTxt, createRobotsRoute |
| 2. sitemap.xml present | sitemaps.org 0.9 | sitemap, createSitemap |
| 3. Content-Signals in robots.txt | contentsignals.org | robotsTxt, createRobotsRoute |
| 4. Markdown negotiation | Cloudflare: Markdown for Agents | v0.2 |
| 5. Link headers | RFC 8288 | linkHeaders, createMiddleware |
| 6. Agent Skills Discovery | Cloudflare RFC | agentSkillsIndex, createAgentSkillsRoute |
| 7. API Catalog | RFC 9727 | apiCatalog, createApiCatalogRoute |
Markdown negotiation (deferred to v0.2)
Markdown negotiation requires a post-render response-transformation point that Next.js/Vercel Edge middleware does not provide — middleware runs before rendering and cannot inspect or rewrite the rendered HTML body. The contract below will be implemented in v0.2 via a Cloudflare Worker front layer helper (@agent-ready-web/cloudflare/markdown-negotiate).
When v0.2 ships, the adapter will transform HTML → Markdown only when all of the following hold:
- Request method is
GETorHEAD. - Request
Acceptheader preferstext/markdownovertext/htmlby proper q-value parsing (not substring match). - Upstream response is
200 OK. - Upstream response
Content-Typeistext/html(any charset). - Upstream response body size does not exceed configured
maxBodyBytes(default 2 MB). If exceeded, the HTML is returned unchanged. - Request path is not in
markdownNegotiation.exclude(defaults:/api,/admin,/auth,/_next). Vary: Acceptis always emitted on eligible paths regardless of which variant is served. The HTML path preserves streaming vianew Response(body, { ...init, headers })without reading the body stream.
Setting config.markdownNegotiation.enabled = true in v0.1 causes createMiddleware to throw Not implemented so that accidental early opt-in is caught at boot rather than silently returning HTML forever.
Contributing
- Fork, clone,
pnpm install. - Tests:
pnpm run test. TDD is expected — write a failing test before adding behavior. - Format:
pnpm run format. Lint:pnpm run lint. Typecheck:pnpm run typecheck. - Build:
pnpm -r --filter "./packages/*" run build. - Pack smoke:
./scripts/pack-smoke.sh(packs each package, installs tarballs into scratch copies of the examples, runs their test suites). - Before opening a PR, add a changeset:
pnpm changeset. - The CI matrix runs on Node 20.x and 22.x with pnpm 9.x. All checks must be green before merge.
- Releases are driven by changesets. A PR with changesets is version-bumped and published automatically when merged to
mainvia npm OIDC provenance — no long-lived tokens.
License
MIT © lexabu
