@autoview/cli
v0.4.3
Read any OpenAPI document, prove it can be driven as a product, and emit the surfaces that drive it — a human frontend and an agent tool surface (MCP) — each verified.
AutoView
Read any OpenAPI document, prove it can actually be driven as a product, and emit the surfaces that drive it — a human frontend, an agent tool surface (MCP), an RFP QA verdict — each verified to work.
```
              ┌─▶ frontend/          a human admin UI (Next.js + shadcn/ui)
swagger.json ─┼─▶ --emit mcp         an agent tool surface (runnable MCP server)
   (any)      ├─▶ --emit report      consumability verdict (can this API be driven?)
              ├─▶ --emit qa --rfp    RFP QA: does the API implement each requirement?
              └─▶ verify             proof it works (real browser / real agent)
```

AutoView reads an OpenAPI 3.x document into a deterministic semantic model of the API (its resources, their hierarchy, which endpoint produces the id another endpoint consumes, what is a read vs a mutation) and projects that one model into the surfaces that consume the API. Then it verifies them: the frontend's user journeys in a real browser, the agent's tasks against the live backend, and a static consumability report that flags ids no endpoint can supply.
The point is not "a pretty generated screen" (that is a commodity). The point is proof: that your API is consumable as a product, demonstrated rather than asserted.
AutoView is standalone, and the whole READ → MAP → EMIT pipeline is LLM-free and deterministic: same document, same output, every time. The domain map, the screen set, and the page render are all derived structurally, so even a large swagger generates whole, un-sliced (Box's 234 operations → 142 pages, DigitalOcean's 287 → 106, both typecheck-clean). A model you supply is used only for optional steps: a typecheck-recovery pass during frontend generation, structuring a prose RFP for --emit qa, and agent-task verification.
Origin note: AutoView was first written inside the
AutoBE monorepo and is published as
@autoview/cli. All @autobe/* runtime dependencies have been removed.
Table of contents
- The one idea
- What it emits
- From an AutoBE backend to a UI (demo workflow)
- Quick start
- Verification — proof, not assertion
- How it reads the API (the IR)
- CLI
- Install & programmatic API
- Verified on
- Honest limits
- License
The one idea
Everything is one deterministic IR (intermediate representation — the API's semantic model) and several emits off it. Build the IR once; every surface and every check is a projection of it.
```
                     ┌── frontend                 (human view)
swagger ─▶ IR ───────┼── tool surface / MCP       (agent view)
           │         └── consumability report     (verdict)
           │
           └── verification   (proves the emits actually work)
```

The IR carries, per operation: the resource it belongs to and the resource
hierarchy (sales → questions → comments), its role (list / detail /
create / update / delete / action → read vs write), and the producer→consumer
chain (the saleId that questions.list needs is produced by sales.list's
id). That model is what a flat "OpenAPI → tools" or "OpenAPI → CRUD UI" dump
lacks, and it is what makes the surfaces navigable and the report possible.
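To make the per-operation model concrete, here is a minimal sketch of what one IR entry and a producer lookup could look like. The type and function names (`IrOperation`, `findProducers`, `isReadRole`) and the accessor strings are hypothetical and chosen for illustration; AutoView's real types may differ.

```typescript
// Hypothetical IR shapes for illustration only; AutoView's real types may differ.
type OpRole = "list" | "detail" | "create" | "update" | "delete" | "action";

interface IrOperation {
  accessor: string;        // e.g. "questions.list"
  resourceChain: string[]; // e.g. ["sales", "questions"]
  role: OpRole;
  consumes: string[];      // id inputs the call needs, e.g. ["saleId"]
  produces: string[];      // ids a caller can lift from the response
}

// list and detail are reads; every other role mutates state.
function isReadRole(role: OpRole): boolean {
  return role === "list" || role === "detail";
}

// For each id the consumer needs, name the first list operation that
// produces it (null when no list operation does).
function findProducers(
  ops: IrOperation[],
  consumer: IrOperation,
): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const id of consumer.consumes) {
    let producer: string | null = null;
    for (const op of ops) {
      if (op !== consumer && op.role === "list" && op.produces.indexOf(id) !== -1) {
        producer = op.accessor;
        break;
      }
    }
    result[id] = producer;
  }
  return result;
}

const salesList: IrOperation = {
  accessor: "sales.list", resourceChain: ["sales"], role: "list",
  consumes: [], produces: ["saleId"],
};
const questionsList: IrOperation = {
  accessor: "questions.list", resourceChain: ["sales", "questions"], role: "list",
  consumes: ["saleId"], produces: ["questionId"],
};
const commentsList: IrOperation = {
  accessor: "comments.list", resourceChain: ["sales", "questions", "comments"], role: "list",
  consumes: ["saleId", "questionId"], produces: ["commentId"],
};

console.log(findProducers([salesList, questionsList, commentsList], commentsList));
```

The lookup deliberately skips the consumer itself and only accepts list operations, which mirrors the idea that an id must come from somewhere other than the call that needs it.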
What it emits
| Emit | Command | For | LLM |
| --- | --- | --- | --- |
| Frontend | autoview swagger.json --out ./frontend | humans — a Next.js + shadcn/ui admin (sidebar nav, dense tables, detail views with actions, forms) | optional |
| Agent tool surface (MCP) | autoview swagger.json --emit mcp | AI agents — a runnable MCP server; each tool carries read/write annotations and producer hints, and executes the real API over HTTP | none |
| Consumability report | autoview swagger.json --emit report | API authors / CI — a deterministic verdict: every id a tool needs should be obtainable from another tool | none |
| RFP QA report | autoview swagger.json --emit qa --rfp <reqs> | QA / product — does the API implement and drive each requirement? (coverage + missing capabilities) | none¹ |
| Preview | autoview swagger.json --preview | a quick look — the endpoint tree + schemas as one HTML page | none |
All five read the same IR. The frontend and the MCP server are the same resource model rendered for two different consumers; the reports are that model checked for navigability (consumability) and against a spec (QA).
¹ The QA report itself is LLM-free; only turning a prose RFP into the structured requirement list needs a model. A structured JSON RFP runs with no key.
Two different things are called a "mode"; don't conflate them.
- Emit mode: what AutoView produces, chosen on the autoview command line (the rows above: frontend / --emit mcp / --emit report / --emit qa / --preview).
- Data mode: how the generated frontend runs, chosen later at npm run dev time: simulate (typia-mock data, no backend; NEXT_PUBLIC_API_SIMULATE=true) or live (real backend; NEXT_PUBLIC_API_HOST=<host>). It is just an env toggle on the generated app, switchable without regenerating.

For a demo you only touch one emit (frontend) and one data mode at a time (simulate to always have populated screens; live when the backend is up).
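As a rough picture of the data-mode toggle, the sketch below resolves a mode from the two environment variables named above. The function `resolveDataMode`, the precedence (simulate wins when both are set), and the fallback to simulate when neither is set are assumptions made for illustration, not the generated app's confirmed behavior.

```typescript
// Hypothetical sketch; the generated app's actual connection code may differ.
type DataMode = { kind: "simulate" } | { kind: "live"; host: string };

function resolveDataMode(env: Record<string, string | undefined>): DataMode {
  // Assumption: an explicit simulate flag takes precedence over a host.
  if (env["NEXT_PUBLIC_API_SIMULATE"] === "true") {
    return { kind: "simulate" };
  }
  const host = env["NEXT_PUBLIC_API_HOST"];
  if (host !== undefined && host !== "") {
    return { kind: "live", host };
  }
  // Assumption: with nothing configured, fall back to mock data.
  return { kind: "simulate" };
}

console.log(resolveDataMode({ NEXT_PUBLIC_API_SIMULATE: "true" }));
console.log(resolveDataMode({ NEXT_PUBLIC_API_HOST: "https://api.example.com" }));
```

Taking the environment as a parameter keeps the function testable without touching the real process environment.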
From an AutoBE backend to a UI (demo workflow)
AutoBE generates a backend and its OpenAPI document. AutoView turns that document
into a running Next.js admin you can open in a browser — the human-facing proof
that the AutoBE backend actually works. The whole flow is four commands, and with
--no-llm no API key is needed at all — the READ → MAP → render pipeline is
fully deterministic.
TL;DR — swagger → localhost in one paste
Where is the swagger? An AutoBE-generated backend always carries its OpenAPI document at a fixed path inside the project folder:
```
<project>/packages/api/swagger.json
```

That exact file is AutoView's input: the CLI reads the path you give it; it does
not scan a folder, so point it straight at packages/api/swagger.json.
Copy the whole block (no inline comments — paste-safe). Set PROJECT to the
AutoBE backend folder.
```bash
PROJECT=/Users/yongrean/Downloads/AutoBE.interfaceComplete
SWAGGER="$PROJECT/packages/api/swagger.json"
OUT="$PROJECT-frontend"
cd /Users/yongrean/Downloads/AutoView-Legacy
npm install
npm run build
node lib/cli/main.js "$SWAGGER" --out "$OUT" --no-llm
cd "$OUT"
npm install
PORT=3000 NEXT_PUBLIC_API_SIMULATE=true npm run dev
```

Open the exact URL the dev server prints, the Local: http://localhost:PORT
line. Do not assume 3000: if that port is already taken, Next.js moves on
to 3001, 3002, … with only a one-line warning, and you would otherwise be
staring at a different (possibly stale) app on 3000. If you see
⚠ Port 3000 is in use, trying 3001 instead, a previous npm run dev is still
running; either open the new port it picked, or free 3000 first:
```bash
lsof -tiTCP:3000 -sTCP:LISTEN | xargs kill   # kill whatever holds 3000, then rerun
```

NEXT_PUBLIC_API_SIMULATE=true boots with typia-mock data so every screen is
walkable without a backend. To run a second app alongside the first (e.g. a
-2 project), give it its own port: PORT=3001 …. For real data instead of
mock, swap the env for NEXT_PUBLIC_API_HOST=https://your-backend-host.
The numbered steps below break this same flow apart and explain each flag.
0 · One-time setup (in this repo)
```bash
npm install
npm run build   # builds lib/cli/main.js, invoked below as `node lib/cli/main.js`
```

1 · Inspect the swagger first — instant
```bash
node lib/cli/main.js ./erp.swagger.json --preview --out ./out
open ./out/preview.html   # endpoint tree + schemas + a READ-layer report
```

2 · Generate the frontend (no key)
```bash
node lib/cli/main.js ./erp.swagger.json --out ./erp-frontend --no-llm \
  --backend https://your-erp-host
```

--no-llm generates everything deterministically: same swagger, same output,
zero tokens. (Drop --no-llm and pass --model/--api-key only if you want the
optional LLM pass that repairs a page on the rare chance the deterministic render
does not typecheck.)
- --backend is the live ERP host baked into connection.ts. Omit it and AutoView auto-extracts servers[0].url from the swagger.
- Large ERP swagger? Slice it: --include keeps only matching paths and the component schemas they reference, so even a huge spec generates, e.g. --include "erp/admin/**,erp/inventory/**" (or generate the whole thing; Box's 234 operations → 142 pages compiles clean).
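The slicing rule ("keep matching paths and the component schemas they reference") can be sketched as a path filter followed by a transitive walk over `$ref` pointers. This is an illustrative reimplementation under stated assumptions, not AutoView's code: `MiniDoc`, `globToRegExp`, `collectSchemaRefs`, and `sliceDocument` are hypothetical names, the glob dialect is simplified (`**` matches anything, `*` matches within one segment), and leading slashes are ignored when matching.

```typescript
// Illustrative sketch; names and glob semantics are assumptions, not AutoView's API.
interface MiniDoc {
  paths: Record<string, unknown>;
  components: { schemas: Record<string, unknown> };
}

// Simplified glob: "**" matches anything, "*" matches within one path segment.
function globToRegExp(glob: string): RegExp {
  const body = glob
    .replace(/^\/+/, "")
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*\*/g, "\u0000")
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp("^" + body + "$");
}

// Walk any JSON value and record every "#/components/schemas/X" it points at.
function collectSchemaRefs(node: unknown, found: Record<string, true>): void {
  if (Array.isArray(node)) {
    for (const item of node) collectSchemaRefs(item, found);
    return;
  }
  if (node === null || typeof node !== "object") return;
  const obj = node as Record<string, unknown>;
  const prefix = "#/components/schemas/";
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (key === "$ref" && typeof value === "string") {
      if (value.indexOf(prefix) === 0) found[value.slice(prefix.length)] = true;
    } else {
      collectSchemaRefs(value, found);
    }
  }
}

function sliceDocument(doc: MiniDoc, includeGlobs: string[]): MiniDoc {
  const patterns = includeGlobs.map(globToRegExp);
  const paths: Record<string, unknown> = {};
  for (const p of Object.keys(doc.paths)) {
    const normalized = p.replace(/^\/+/, "");
    if (patterns.some((re) => re.test(normalized))) paths[p] = doc.paths[p];
  }
  // Transitive closure: schemas referenced by kept paths, then by those schemas.
  const seed: Record<string, true> = {};
  collectSchemaRefs(paths, seed);
  const queue = Object.keys(seed);
  const visited: Record<string, true> = {};
  while (queue.length > 0) {
    const schemaName = queue.pop() as string;
    if (visited[schemaName]) continue;
    visited[schemaName] = true;
    const nested: Record<string, true> = {};
    collectSchemaRefs(doc.components.schemas[schemaName], nested);
    for (const n of Object.keys(nested)) queue.push(n);
  }
  const schemas: Record<string, unknown> = {};
  for (const n of Object.keys(visited)) {
    if (n in doc.components.schemas) schemas[n] = doc.components.schemas[n];
  }
  return { paths, components: { schemas } };
}

const erpDoc: MiniDoc = {
  paths: {
    "/erp/admin/users": { get: { responses: { "200": { schema: { $ref: "#/components/schemas/User" } } } } },
    "/crm/leads": { get: { responses: { "200": { schema: { $ref: "#/components/schemas/Lead" } } } } },
  },
  components: {
    schemas: {
      User: { type: "object", properties: { role: { $ref: "#/components/schemas/Role" } } },
      Role: { type: "object" },
      Lead: { type: "object" },
    },
  },
};
console.log(Object.keys(sliceDocument(erpDoc, ["erp/admin/**"]).components.schemas).sort());
```

In the sample, `Role` survives the slice even though no kept path names it directly, because `User` references it; `Lead` is dropped along with its path.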
3 · Open it in the browser
```bash
cd erp-frontend && npm install

# A) Guaranteed populated demo: typia-mock data, no backend needed
PORT=3000 NEXT_PUBLIC_API_SIMULATE=true npm run dev

# B) Real data: against the live ERP backend (host must be reachable + allow CORS)
PORT=3000 NEXT_PUBLIC_API_HOST=https://your-erp-host npm run dev
```

Then open the Local: http://localhost:PORT line the dev server prints, not a
hard-coded 3000. If 3000 is busy, Next.js falls back to 3001/3002 with only a
one-line warning, and ⚠ Port 3000 is in use means a previous npm run dev is
still holding it (lsof -tiTCP:3000 -sTCP:LISTEN | xargs kill frees it).
For a demo where the backend might not be reachable, use (A) — every list, detail, and form renders with mock data so the whole UI is walkable. Switch to (B) the moment the real ERP host is up to show live rows.
Tip for an ERP swagger with authentication: in live mode the app may try to bootstrap a session against the auth endpoints on first load. If that stalls (host down, CORS, no seed account), demo in simulate mode (A) — it never calls the backend, so the full UI is always walkable.
Common knobs
| Goal | Flag |
| --- | --- |
| Slice a huge ERP spec | --include "erp/admin/**" (and --exclude to drop noise) |
| Force mock mode at generate time | --backend="" |
| Inspect without generating | --preview (HTML) or --emit report (navigability verdict) |
| Tune render concurrency | --semaphore 8 |
Quick start
```bash
npm install -g @autoview/cli   # installs the `autoview` command (or: npx @autoview/cli ...)
```

A human frontend (deterministic, no key needed with --no-llm):
```bash
autoview ./swagger.json --out ./frontend --no-llm
cd frontend && npm install && npm run dev   # open the Local URL it prints
```

(Drop --no-llm and pass --model/--api-key to add the optional LLM
typecheck-recovery pass.)
An agent tool surface (MCP server):
```bash
autoview ./swagger.json --emit mcp --out ./mcp
cd mcp && npm install
API_HOST=https://api.example.com API_TOKEN="Bearer …" npm start
```

Then wire it into Claude Desktop / Cursor / Claude Code (the generated
README.md has the exact mcpServers JSON).
A consumability verdict:
```bash
autoview ./swagger.json --emit report --out ./out
cat out/consumability-report.md
```

An RFP QA verdict: does the API implement each requirement? (no key for a
structured JSON list; a prose RFP needs --model):
```bash
autoview ./swagger.json --emit qa --rfp ./requirements.json --out ./out
cat out/qa-report.md
```

Verification — proof, not assertion
A typecheck score is not proof a product works. AutoView verifies the emits the way a user (or an agent) actually exercises them.
- Frontend workflows: autoview … --verify boots the generated app in a real headless browser and walks the derived user journeys (open the list → it renders rows or an honest empty state → open a record → its fields render), writing wiki/verification.md with per-step pass/fail evidence.
- Agent tasks: verifyAgentTasks(document, { client, model, baseUrl, tasks }) drives a real agent through named tasks against the tool surface + live backend and grades each. The model and key are injected by you (e.g. from .env); which model your API's agents use is your call, not ours.
- Structural (LLM-free): analyzeConsumability(document) proves the tool graph is navigable, with one model shared by the tool surface so the report never claims navigability the tools don't hint. Each path-param input is one of: resolved (a list/search tool produces the id and the surface says which one), nested (the id appears inside a parent read's response, e.g. order → goods[].id; navigable but not via a dedicated list), orphan (an entity *_id no endpoint can supply; a genuine gap), caller-supplied key (repository_name, scope: a human-known value, not a chained id), or undetermined (the resource is read but no id could be traced, often an inline schema). A detail read is never counted as its own producer (circular); only true id orphans count as defects.
That find-and-fix loop is the product: run it, and you get the surface plus the evidence it works — or exactly which step/endpoint breaks.
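The five-way classification of id inputs described above can be sketched as a small decision function. Everything here is a hypothetical illustration: `IdInputFacts`, `looksLikeEntityId`, `classifyInput`, and the order of the checks are assumptions, and `navigabilityPercent` is an inferred formula that happens to reproduce the figures in the "Verified on" table (for example 146 resolved and 1 orphan giving 99%), not a documented one.

```typescript
// Hypothetical sketch of the classification; not the actual analyzeConsumability logic.
type InputVerdict = "resolved" | "nested" | "orphan" | "caller-supplied" | "undetermined";

interface IdInputFacts {
  param: string;               // e.g. "petId"
  listProducer: string | null; // a list/search tool that yields the id, if any
  nestedIn: string | null;     // e.g. "order → goods[].id", if found in a parent read
  resourceIsRead: boolean;     // some read endpoint exists for the resource
}

// Assumed naming heuristic: chained entity ids end in "_id", "Id" or "ID".
function looksLikeEntityId(param: string): boolean {
  return param === "id" || /(_id|Id|ID)$/.test(param);
}

function classifyInput(facts: IdInputFacts): InputVerdict {
  if (facts.listProducer !== null) return "resolved";
  if (facts.nestedIn !== null) return "nested";
  if (!looksLikeEntityId(facts.param)) return "caller-supplied";
  // An entity id with no producer: a gap only if nothing reads the resource.
  return facts.resourceIsRead ? "undetermined" : "orphan";
}

// Inferred: only resolved, nested and orphan inputs enter the percentage.
function navigabilityPercent(counts: { resolved: number; nested: number; orphan: number }): number {
  const navigable = counts.resolved + counts.nested;
  const total = navigable + counts.orphan;
  return total === 0 ? 100 : Math.round((100 * navigable) / total);
}

console.log(classifyInput({ param: "petId", listProducer: "pet.findByStatus.get", nestedIn: null, resourceIsRead: true }));
console.log(classifyInput({ param: "username", listProducer: null, nestedIn: null, resourceIsRead: true }));
console.log(navigabilityPercent({ resolved: 146, nested: 0, orphan: 1 }));
```

Keeping undetermined and caller-supplied inputs out of the denominator matches the stated rule that only true id orphans count as defects.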
Run it yourself
examples/verify-agent.ts verifies a swagger as an
agent tool surface both ways — structurally (always) and behaviorally (when a
model is configured):
```bash
npm run verify:agent   # bundled petstore → public backend
# or: npm run verify:agent ./your.json https://api.example.com
```

```
── structural (deterministic) ──
18 tools — 8 read, 10 write
3 id inputs resolved · 0 orphan · 100% navigable
── behavioral (openai/gpt-4o-mini → https://petstore3.swagger.io/api/v3) ──
PASS list
tools: pet.findByStatus.get
PASS producer→consumer chain
tools: pet.findByStatus.get → pet.getByPetid
answer: The pet's ID is 105484548 and its name is "modi".
2/2 tasks passed end-to-end (real agent, live backend).
```

(The behavioral transcript was recorded 2026-06-08; petstore3 is a public
demo server and is sometimes down — on a bad day the structural pass still
runs, and the agent honestly reports the failing calls. The 3/2/3 structural
split: petId ×3 resolved from the pet list, orderId ×2 untraceable — the
API has no order list to produce it — and username ×3 caller-supplied.)
The chain task is the point: the agent lists pets, takes an id from the
response, and calls the detail endpoint with it — the producer→consumer
link the tool surface declares, exercised against a live server. The structural
pass runs with no key; the behavioral pass needs AUTOVIEW_API_KEY +
AUTOVIEW_MODEL (the model is your choice).
The measured wedge — where the hints actually matter
We A/B'd the shaped surface against a naive METHOD /path tool dump — same
model, same tools, same deterministic backend, blind arms, temperature 0,
five graded tasks × 5 runs each (suite ·
full results):
| task | naive dump | shaped surface | what it measures |
| -------- | ---------- | -------------- | ------------------------------------------ |
| flat | 5/5 | 5/5 | control — list → read a field |
| nested | 5/5 | 5/5 | read a value buried in units[].stocks[] |
| write | 5/5 | 5/5 | source one id, POST, report the response |
| chained | 0/5 | 5/5 | a discovered id must FEED a follow-up call |
| boundary | 5/5 | 5/5 | the id only appears under a renamed field |
The verdict is narrower and sharper than "agents fail on nested data":
reads take care of themselves — a capable model lifts values (even renamed
ids) straight out of response bodies. What it cannot do is know where an id
comes from the moment that id becomes an input to the next call: the naive
arm fabricated stock ids and 404'd on every single chained run. The producer
chain in the tool descriptions eliminates exactly that failure,
deterministically. And ids don't only travel as path params: a cart's
commodities.create has no path params at all — its sale_id /
stocks[].unit_id / stocks[].stock_id live in the request body. The surface
wires those too (bodyProducers): the producing tool is annotated on the body
schema property itself, in the description, and in the error channel, with the
consumer's own namespace excluded (the cart's index can't bootstrap the cart's
first create). Reproduce it with your own model:
```bash
AUTOVIEW_SELFTEST=1 npm run ab:suite   # LLM-free wiring check first
npm run ab:suite -- 5                  # needs AUTOVIEW_API_KEY + AUTOVIEW_MODEL
```

And it scales: with 110 distractor tools from the real shopping swagger on the
table (AUTOVIEW_DISTRACTORS=examples/shopping.swagger.json), chained stays
0/5 vs 5/5 — and the naive arm additionally collapses on plain tool
selection (1/5 on a task it solved at 6 tools, blind-firing destructive
calls while wandering), while the shaped surface stays 5/5 throughout.
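The difference between a naive dump and a shaped surface comes down to whether each tool says where its id inputs come from. Below is a minimal sketch of rendering such producer hints into a tool description, covering both path and body inputs and skipping producers in the consumer's own namespace. The names `ProducerHint` and `describeTool`, the tool names, and the line format are hypothetical; the real surface's wording and the exact shape of bodyProducers are not shown in this README.

```typescript
// Hypothetical sketch of producer hints; not the real buildToolSurface output format.
interface ProducerHint {
  input: string;            // e.g. "stocks[].stock_id"
  location: "path" | "body";
  producerTool: string;     // hypothetical tool name
  responsePath: string;     // where the id sits in the producer's response
}

function describeTool(summary: string, namespace: string, hints: ProducerHint[]): string {
  const lines: string[] = [summary];
  for (const hint of hints) {
    // A tool in the consumer's own namespace cannot bootstrap its first create.
    if (hint.producerTool.indexOf(namespace + ".") === 0) continue;
    lines.push(`${hint.location} ${hint.input} ← ${hint.producerTool} (${hint.responsePath})`);
  }
  return lines.join("\n");
}

const cartCreateDescription = describeTool(
  "Add a commodity to the cart.",
  "carts.commodities",
  [
    { input: "sale_id", location: "body", producerTool: "sales.index", responsePath: "data[].id" },
    { input: "stocks[].stock_id", location: "body", producerTool: "sales.at", responsePath: "units[].stocks[].id" },
    { input: "sale_id", location: "body", producerTool: "carts.commodities.index", responsePath: "data[].sale_id" },
  ],
);
console.log(cartCreateDescription);
```

The third hint is dropped from the description because its producer lives in the same namespace as the consumer.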
Reproducibility note (2026-06-12). Every number above was re-run a day
after it was recorded. The headline rows reproduce exactly: chained 0/5 vs
5/5 at 6 tools and again at 116 tools, ab:nested 0/5 vs 5/5, and the
naive selection collapse under noise (boundary 0–1/5). The non-headline
116-tool control rows drifted with the provider snapshot (shaped flat
5/5 → 0/5 — identical traces at two different code revisions, so
provider-side, not ours), and live-backend runs of the real-surface tasks
swung between same-day runs (temperature: 0 does not pin OpenRouter
outputs). Two standing conclusions: treat live/small-N behavioral runs as
smoke and defect discovery, never headline metrics — and the drift itself
exposed a real failure mode worth knowing: when the first tool selection
lands in the wrong distractor cluster, hint-following keeps the agent
coherently wrong, while a hint-less arm recovers by name-scanning. The full
reproduction audit is in
the results file.
RFP QA — check the API against a spec
--emit qa re-aims the consumability check from "is this navigable" to "does
this satisfy the requirements". Give it a requirement list and it rules on each:
satisfiable (the endpoints that implement it exist AND every id they need is
obtainable), unreachable (implemented, but an id has no producer),
missing (a named endpoint is absent — an API gap), or unmapped (could
not be tied to any endpoint). It never silently passes a requirement it could
not place.
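The four verdicts above amount to a short decision ladder once each requirement has been mapped to endpoints. The sketch below is illustrative only: `MappedRequirement`, `ApiFacts`, `ruleOnRequirement`, and `satisfiablePercent` are hypothetical names, the accessor "journal-entries.post" is made up, and the order of the checks is an assumption rather than the documented behavior of analyzeRequirementCoverage.

```typescript
// Hypothetical sketch of the verdict ladder; not the actual QA implementation.
type QaVerdict = "satisfiable" | "unreachable" | "missing" | "unmapped";

interface MappedRequirement {
  id: string;
  endpoints: string[]; // accessors the requirement was tied to; empty = not placed
}

interface ApiFacts {
  existing: string[];                     // accessors present in the document
  orphanInputs: Record<string, string[]>; // accessor → id inputs with no producer
}

function ruleOnRequirement(req: MappedRequirement, api: ApiFacts): QaVerdict {
  if (req.endpoints.length === 0) return "unmapped";
  const absent = req.endpoints.some((e) => api.existing.indexOf(e) === -1);
  if (absent) return "missing";
  const blocked = req.endpoints.some((e) => (api.orphanInputs[e] || []).length > 0);
  return blocked ? "unreachable" : "satisfiable";
}

function satisfiablePercent(verdicts: QaVerdict[]): number {
  if (verdicts.length === 0) return 0;
  const ok = verdicts.filter((v) => v === "satisfiable").length;
  return Math.round((100 * ok) / verdicts.length);
}

const erpFacts: ApiFacts = {
  existing: ["invoices.get", "journal-entries.post"],
  orphanInputs: {},
};
const erpVerdicts = [
  ruleOnRequirement({ id: "FR-1", endpoints: ["invoices.get"] }, erpFacts),
  ruleOnRequirement({ id: "FR-2", endpoints: ["journal-entries.post"] }, erpFacts),
  ruleOnRequirement({ id: "FR-3", endpoints: [] }, erpFacts),
];
console.log(erpVerdicts, satisfiablePercent(erpVerdicts) + "% satisfiable");
```

A requirement with no endpoints is reported as unmapped rather than passed, which is the "never silently passes" property in code form.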
A structured JSON requirement list runs with no key:
```bash
cat > rfp.json <<'EOF'
{ "requirements": [
  { "id": "FR-1", "statement": "An employee can browse invoices." },
  { "id": "FR-2", "statement": "An employee can create a journal entry." },
  { "id": "FR-3", "statement": "A manager can export the tax filing as PDF." }
] }
EOF
autoview ./erp.swagger.json --emit qa --rfp ./rfp.json --out ./out
```

```md
- 3 requirements · 67% satisfiable end-to-end.
- ✅ 2 satisfiable — backing endpoints exist and are reachable.
- ❔ 1 unmapped — could not be tied to any endpoint.

## 🔴 Functional capabilities with no endpoint
- FR-3 A manager can export the tax filing as PDF.   ← the API never implemented it
```

A prose RFP (.md / .txt) works too: the model splits it into atomic
requirements and maps each to real endpoints (using the document's own AutoBE
annotations as context), then the verdict above is computed deterministically.
That parse is the only model-gated step; pass --model / --api-key:
```bash
autoview ./erp.swagger.json --emit qa --rfp ./requirements.md \
  --model gpt-4.1-mini --api-key "$OPENAI_API_KEY" --out ./out
```

To go one step further and actually run each satisfiable requirement as a
real agent against a live backend (a live pass/fail beside the static verdict),
see examples/qa-live-shopping.ts.
How it reads the API (the IR)
```
swagger.json
   │
   ▼  READ   fromSwagger() → toEndpoints()          (deterministic, LLM-free)
   │    • upgrade Swagger 2 / OpenAPI 3.0/3.1 → one normalized document
   │    • resolve $ref parameters (incl. cross-path JSON pointers),
   │      flatten allOf composition, drop unroutable (`#`-fragment) paths
   │    • every operation: method · path · accessor · path-params ·
   │      QUERY · requestBody · responseBody
   │
   ▼  MAP    classifyEndpoints → resourcePlan       (deterministic, LLM-free)
   │    • CRUD role per endpoint (read vs write)
   │    • resource hierarchy / chain from the path
   │    • producer→consumer links (which id comes from where)
   │
   ▼  EMIT   frontend page-gen · MCP tools · report (deterministic)
   │    + optional LLM typecheck-recovery pass on the frontend only
   ▼
surfaces + verification
```

The READ and MAP layers (the IR) are 100% deterministic and reused by every
emit. Robustness was hardened against real third-party specs: $ref parameter
pointers (DigitalOcean), allOf composition and entries-style collection
wrappers (Box), numeric ids, and #-fragment paths.
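The MAP step (a CRUD role per endpoint and a resource chain from the path) can be approximated with a few structural rules. This is a deliberately naive sketch, not classifyEndpoints itself: `RawOp`, `classifyRole`, `resourceChainOf`, and `isWriteRole` are hypothetical, and real specs contain cases these rules miss (for instance search endpoints that list through a non-GET method with a request body).

```typescript
// Naive structural heuristic for illustration; the real MAP layer is more thorough.
type CrudRole = "list" | "detail" | "create" | "update" | "delete" | "action";

interface RawOp {
  method: string;
  path: string;
}

function classifyRole(op: RawOp, all: RawOp[]): CrudRole {
  const method = op.method.toUpperCase();
  const endsWithParam = /\}$/.test(op.path);
  // A POST counts as "create" only when the same path is also a listable collection.
  const isCollection = all.some(
    (o) => o.path === op.path && o.method.toUpperCase() === "GET",
  );
  if (method === "GET") return endsWithParam ? "detail" : "list";
  if (method === "POST") return !endsWithParam && isCollection ? "create" : "action";
  if (method === "PUT" || method === "PATCH") return "update";
  if (method === "DELETE") return "delete";
  return "action";
}

// The resource hierarchy is the path with its {param} segments removed.
function resourceChainOf(path: string): string[] {
  return path.split("/").filter((s) => s !== "" && s.charAt(0) !== "{");
}

function isWriteRole(role: CrudRole): boolean {
  return role !== "list" && role !== "detail";
}

const sampleOps: RawOp[] = [
  { method: "get", path: "/sales" },
  { method: "post", path: "/sales" },
  { method: "get", path: "/sales/{saleId}" },
  { method: "put", path: "/sales/{saleId}" },
  { method: "delete", path: "/sales/{saleId}" },
  { method: "post", path: "/sales/{saleId}/publish" },
  { method: "get", path: "/sales/{saleId}/questions/{questionId}/comments" },
];
for (const op of sampleOps) {
  console.log(op.method.toUpperCase(), op.path, "→", classifyRole(op, sampleOps), resourceChainOf(op.path));
}
```

The last sample path yields the chain sales → questions → comments, the same hierarchy used as the running example earlier in this README.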
CLI
```
autoview <swagger.json> [options]

  --out <dir>          Output directory (default: ./frontend)
  --emit mcp           Generate a runnable MCP server (agent tool surface). LLM-free.
  --emit report        Write consumability-report.md (navigable graph + orphans). LLM-free.
  --emit qa --rfp <f>  Write qa-report.md: does the API implement & drive each
                       requirement in <f>? <f> is structured JSON
                       ({ "requirements": [{ id, statement, endpoints? }] }, LLM-free)
                       or a prose RFP (.md/.txt → needs --model to structure it).
  --preview            Write preview.html (endpoint tree + schemas). LLM-free.
  --verify             Verify the frontend in a real headless browser (user workflows).
  --backend <url>      Live backend for the frontend's connection.ts / the report's hints.
                       Auto-extracted from the swagger's servers[] when omitted.
  --include <globs>    Comma-separated path globs to KEEP: slices the whole
                       document (paths AND the component schemas they reference),
                       so a large swagger fits the model and actually generates.
  --exclude <globs>    Comma-separated path globs to DROP (applied after include).
  --model <name>       LLM model (only for the optional frontend RENDER polish).
  --api-key <key>      Vendor key (or AUTOVIEW_API_KEY / OPENAI_API_KEY env).
  --base-url <url>     OpenAI-compatible endpoint (or AUTOVIEW_BASE_URL env).
```

--emit mcp, --emit report, and --preview need no model or key; they are
deterministic (as is --emit qa when given a structured JSON requirement list).
Install & programmatic API
```bash
npm install @autoview/cli
```

```ts
import {
  fromSwagger,
  buildToolSurface,           // IR → agent tools (read/write annotated, producer hints)
  emitMcpServer,              // IR → a runnable MCP server project (files map)
  analyzeConsumability,       // IR → navigable / orphan / undetermined breakdown
  emitConsumabilityReport,    // → markdown report
  analyzeRequirementCoverage, // requirements + IR → per-requirement QA verdict
  emitQaReport,               // → markdown RFP QA report
  parseRfp,                   // prose RFP + IR → structured requirements (LLM)
  runRequirementChecks,       // drive each requirement live against a backend
  verifyAgentTasks,           // run an agent through tasks against the tool surface
  AutoViewAgent,              // the frontend generator (optional LLM polish)
} from "@autoview/cli";

const document = fromSwagger(require("./swagger.json"));

// e.g. RFP QA, no LLM:
const qa = emitQaReport(
  [{ id: "FR-1", statement: "Browse invoices", endpoints: ["invoices.get"] }],
  document,
);

// agent tool surface + structural proof: no LLM, no network
const tools = buildToolSurface(document);
const coverage = analyzeConsumability(document);
console.log(`${tools.length} tools, ${coverage.orphan.length} orphan inputs`);

// a runnable MCP server, written to disk by the caller
const files = emitMcpServer(document, { backend: "https://api.example.com" });
```

Verified on
Real third-party OpenAPI documents, end-to-end:
| Swagger | Domain | Ops | Frontend typecheck | Navigability (report) |
| --- | --- | --- | --- | --- |
| @samchon/shopping | e-commerce | 206 | 0 errors | 100% (203 resolved + 11 nested, 0 orphan) |
| DigitalOcean | cloud infra | 287 | 0 errors (106 pages, whole) | 99% (146 resolved, 1 orphan; 47 undetermined, 52 caller-keys) |
| Box | file platform | 234 | 0 errors (142 pages, whole) | 97% (77 resolved, 2 orphan; 84 undetermined, 19 caller-keys) |
| Petstore | (canonical) | 18 | 0 errors (11 pages) | 100% (3 resolved, 0 orphan) |
Frontend workflow verification (real browser): shopping 13/13 journeys pass, DigitalOcean 7/7. The emitted MCP server was driven end-to-end by a real MCP client; agent tasks pass against the tool surface with a configurable model.
Honest limits
- Tool shaping helps exactly where the chain is hard to guess. On a flat 1-hop chain (list → detail), a capable model reconstructs it from names alone, so a naive surface chains about as well as the shaped one; shaping is not magic there. But on a nested chain, where an id is buried in a parent's response array (a sale's units[].stocks[].id), the model cannot guess where the id lives, and a naive surface fails. A blind A/B (same model, same backend; examples/ab-nested-chain.ts, npm run ab:nested) is decisive: naive 0/5, shaped 5/5. The naive agent blindly calls the deep endpoint with ids it doesn't have; the shaped agent is told "stockId ← sale detail at units[].stocks[].id" and gets it in two calls. So the durable value is the read/write safety annotations, the deterministic navigability proof, AND the nested producer hints that close chains a model can't infer.
- Inline response schemas are partly handled. The producer→consumer analysis now introspects inline object responses and resource-named collection wrappers ({ databases: [...] }), including inline array items, lifting DigitalOcean's navigability from 13% to 86%. Inputs it still cannot trace are reported as undetermined, never as a defect.
- Frontend RENDER is deterministic by default. The pages are generated from the schema (one column per field, typed SDK wiring); no LLM is needed for a working app. An LLM is only used for an optional aesthetic polish pass.
License
See LICENSE.
