@vauban-org/agent-sdk-conformance
v0.1.0
AG-UI normative conformance test suite. Validates any SSE-based agent server against the canonical AG-UI Protocol specification (event shapes, wire format, ordering, headers, auth contracts).
Normative AG-UI conformance test suite. Drop it in your CI, point it at any SSE-based agent server, get a structured pass/fail report against the canonical AG-UI Protocol specification.
Published to the vauban-org GitHub Packages registry. MIT-licensed, so any third party (CopilotKit, AWS Bedrock AgentCore, Google Agent SDK, in-house implementations) can validate its own AG-UI-compatible server.
What this package tests
The suite produces a ConformanceReport with the following checks (each PASS / FAIL / inconclusive):
| Check id | What it verifies |
|---|---|
| headers.content_type | Response header is Content-Type: text/event-stream. |
| headers.cache_control | Response header includes Cache-Control: no-cache. |
| headers.connection | Response header is Connection: keep-alive (or absent under HTTP/2). |
| wire.separator_blank_line | Every emitted frame ends with the blank line dispatch (\n\n). |
| wire.event_field_present | At least one frame carries an explicit event: field. |
| wire.data_field_parses_as_json | Every non-empty data: field parses as JSON. |
| events.shape_conformance | Every recognized event matches the AG-UI shape (zod-validated). |
| events.type_in_vocabulary | Every event type is one of the 19 canonical AG-UI types. |
| ordering.start_before_steps | RUN_STARTED (or run.start) precedes any STEP_* event. |
| ordering.finished_terminates | RUN_FINISHED is the last non-state event of the run. |
| ordering.monotonic_ids | Wire id: fields (or in-payload seq) increase strictly. |
| reconnect.since_gap_free | ?since=N replays events with seq > N, no gap. |
| auth.401_on_missing_bearer | Unauthenticated requests return HTTP 401. |
| auth.403_on_insufficient_scope | Read-only tokens get HTTP 403 on write paths. |
| auth.dpop_required_on_cnf | cnf-bound sub-tokens trigger 401 dpop_required without a DPoP header. |
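The ordering.* and reconnect.since_gap_free rows above can be pictured as plain predicates over already-parsed events. This is a minimal sketch, not the suite's implementation: `SeqEvent` and the function names are hypothetical, "state" events are assumed to be those whose type starts with `STATE_`, and "no gap" is assumed to mean consecutive integer `seq` values.

```typescript
// Hypothetical event shape for this sketch; the real suite validates full AG-UI shapes.
interface SeqEvent {
  type: string; // event type, e.g. "RUN_STARTED"
  seq: number; // in-payload sequence number
}

// ordering.start_before_steps: RUN_STARTED (or run.start) precedes any STEP_* event.
function startBeforeSteps(events: SeqEvent[]): boolean {
  const firstStep = events.findIndex((e) => e.type.startsWith("STEP_"));
  if (firstStep === -1) return true; // no steps, nothing to violate
  const start = events.findIndex(
    (e) => e.type === "RUN_STARTED" || e.type === "run.start",
  );
  return start !== -1 && start < firstStep;
}

// ordering.finished_terminates: RUN_FINISHED is the last non-state event.
// Assumption: state events are those whose type starts with "STATE_".
function finishedTerminates(events: SeqEvent[]): boolean {
  const nonState = events.filter((e) => !e.type.startsWith("STATE_"));
  return nonState.slice(-1)[0]?.type === "RUN_FINISHED";
}

// ordering.monotonic_ids: seq values increase strictly.
function monotonicIds(events: SeqEvent[]): boolean {
  let prev = -Infinity;
  for (const e of events) {
    if (e.seq <= prev) return false;
    prev = e.seq;
  }
  return true;
}

// reconnect.since_gap_free: a replay for ?since=N starts at N + 1 with no holes.
// Assumption: seq values are consecutive integers.
function sinceGapFree(since: number, replayed: SeqEvent[]): boolean {
  return replayed.every((e, i) => e.seq === since + 1 + i);
}
```

Each predicate returns a plain boolean; a real runner would also need an inconclusive outcome, for example when RUN_FINISHED has not arrived within the frame budget.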
The suite reads up to maxFrames (default 32) live frames off your server, then exits. It performs at most three additional probes (reconnect, missing bearer, insufficient scope) and an optional DPoP probe.
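The bounded read can be pictured with the following sketch, which splits an already-buffered SSE body into frames, stops at a `maxFrames` cap, and reports whether a frame's `data:` field parses as JSON. It is a simplified illustration and not the package's sse-parser.ts: `Frame`, `parseFrames`, and `dataParsesAsJson` are hypothetical names, and the sketch assumes LF line endings and ignores `retry:` fields.

```typescript
// Hypothetical frame shape for this sketch.
interface Frame {
  event?: string;
  id?: string;
  data: string;
}

function parseFrames(body: string, maxFrames = 32): Frame[] {
  const frames: Frame[] = [];
  // A frame is dispatched by a blank line, i.e. "\n\n" on the wire.
  const chunks = body.split("\n\n");
  // Whatever follows the last "\n\n" is an unterminated frame and is dropped.
  chunks.pop();
  for (const chunk of chunks) {
    if (frames.length >= maxFrames) break; // stop once the frame budget is used
    const frame: Frame = { data: "" };
    const dataLines: string[] = [];
    for (const line of chunk.split("\n")) {
      const colon = line.indexOf(":");
      if (colon <= 0) continue; // comment line (leading ":") or line without a colon
      const field = line.slice(0, colon);
      // The SSE format strips a single leading space after the colon.
      const value = line.slice(colon + 1).replace(/^ /, "");
      if (field === "event") frame.event = value;
      else if (field === "id") frame.id = value;
      else if (field === "data") dataLines.push(value);
    }
    frame.data = dataLines.join("\n"); // multiple data: lines are joined with "\n"
    frames.push(frame);
  }
  return frames;
}

// wire.data_field_parses_as_json, applied to one frame.
function dataParsesAsJson(frame: Frame): boolean {
  if (frame.data === "") return true; // empty data fields are exempt
  try {
    JSON.parse(frame.data);
    return true;
  } catch {
    return false;
  }
}
```

A streaming parser would consume chunks incrementally instead of a full string; the buffered form is used here only to keep the example short.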
Usage
Programmatic
import { runConformanceSuite } from "@vauban-org/agent-sdk-conformance";

const report = await runConformanceSuite({
  baseUrl: "http://localhost:3113",
  streamPath: "/remote/stream", // default
  writeProbePath: "/remote/inject", // default
  bearer: process.env.AGENT_BEARER,
  insufficientScopeBearer: process.env.READ_ONLY_BEARER,
  cnfBoundBearer: process.env.DPOP_BEARER,
  maxFrames: 32,
  timeoutMs: 5000,
});

// Print every failed check, then exit non-zero so CI marks the job as failed.
if (!report.ok) {
  for (const c of report.checks.filter((x) => !x.ok)) {
    console.error(`[FAIL] ${c.id} ; ${c.detail}`);
  }
  process.exit(1);
}
CLI
npx @vauban-org/agent-sdk-conformance \
  --base-url http://localhost:3113 \
  --bearer "$AGENT_BEARER" \
  --insufficient-scope-bearer "$READ_ONLY_BEARER"
Exit code 0 on full conformance, 1 on any failed check. JSON report on stdout, one-line summary on stderr.
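In CI it can be convenient to post-process the JSON report rather than rely on the exit code alone. The sketch below assumes the CLI's stdout JSON carries the same `ok`, `checks[].id`, `checks[].ok`, and `checks[].detail` fields used in the programmatic example; that equivalence is an assumption, and `summarizeReport` is a hypothetical helper that is not part of the package.

```typescript
// Assumed report shape, taken from the fields used in the programmatic example.
interface CheckResult {
  id: string;
  ok: boolean;
  detail?: string;
}

interface Report {
  ok: boolean;
  checks: CheckResult[];
}

// Turn the JSON report into an exit code and one line per failed check.
function summarizeReport(json: string): { exitCode: 0 | 1; lines: string[] } {
  const report = JSON.parse(json) as Report;
  const failed = report.checks.filter((c) => !c.ok);
  const lines = failed.map((c) => `[FAIL] ${c.id}: ${c.detail ?? "no detail"}`);
  return { exitCode: failed.length === 0 ? 0 : 1, lines };
}
```

A wrapper script could feed the captured stdout to `summarizeReport` and attach the resulting lines to a pull-request comment or build annotation.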
Naming gap (Vauban-flavoured vs canonical AG-UI)
The AG-UI specification defines event type as UPPER_SNAKE_CASE (e.g. RUN_STARTED, TEXT_MESSAGE_CONTENT, TOOL_CALL_START). The Vauban @vauban-org/agent-sdk has shipped dotted-lowercase types since 2.11.0 (run.start, assistant.delta, tool.call.start).
Both vocabularies are semantically aligned (same lifecycle, same shape; the SDK source explicitly cross-references AG-UI for each event), but they are NOT byte-identical on the wire. The conformance suite treats this as a real gap and flags it via the events.type_in_vocabulary check.
Canonical-mode entrypoint (SDK 2.27.0+)
@vauban-org/agent-sdk 2.27.0 ships a dual-emission bridge. A server that boots its hub in canonical mode now PASSES events.type_in_vocabulary against this suite:
import { createRemoteControlHub, createRemoteControlServer } from "@vauban-org/agent-sdk";
const hub = createRemoteControlHub({ eventNaming: "canonical" });
const server = await createRemoteControlServer(hub, { token });
// Pointing this conformance suite at `server.url` now PASSES all checks
// including events.type_in_vocabulary.
The hub normalises every emitted event at the boundary via normalizeEventType(); the bidirectional mapping lives in packages/agent-sdk/src/remote/event-name-map.ts (LEGACY_TO_CANONICAL + CANONICAL_TO_LEGACY).
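To make the boundary normalisation concrete, here is a minimal sketch of a bidirectional name map covering only the three pairs named in this README. It is not the SDK's event-name-map.ts: the pairing of `assistant.delta` with `TEXT_MESSAGE_CONTENT` is inferred from the order of the examples above, `normalizeEventTypeSketch` is a hypothetical stand-in whose signature may differ from the real `normalizeEventType()`, and passing unknown types through unchanged is an assumption.

```typescript
// Only the pairs named in this README; the real map covers the full vocabulary.
const PAIRS: ReadonlyArray<readonly [string, string]> = [
  ["run.start", "RUN_STARTED"],
  ["assistant.delta", "TEXT_MESSAGE_CONTENT"], // pairing inferred, see lead-in
  ["tool.call.start", "TOOL_CALL_START"],
];

// Forward and reverse lookups built from the same list so they cannot drift apart.
const LEGACY_TO_CANONICAL = new Map<string, string>(PAIRS);
const CANONICAL_TO_LEGACY = new Map<string, string>(
  PAIRS.map(([legacy, canonical]) => [canonical, legacy] as const),
);

type EventNaming = "legacy" | "canonical";

// Hypothetical stand-in for the SDK's normalizeEventType().
function normalizeEventTypeSketch(type: string, naming: EventNaming): string {
  const table = naming === "canonical" ? LEGACY_TO_CANONICAL : CANONICAL_TO_LEGACY;
  // Assumption: types already in the target vocabulary, or unknown, pass through.
  return table.get(type) ?? type;
}
```

Deriving both maps from one list of pairs keeps the mapping invertible by construction, which is the property a dual-emission bridge depends on.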
What this means today
If you ship a server emitting legacy Vauban events (the default in 2.27.0):
- Wire format checks PASS (headers + framing + JSON are clean).
- events.type_in_vocabulary FAILS by design; this is the documented canary.
- The legacy canary in tests/reference-server.test.ts still asserts the FAIL; it remains the regression guard for the legacy default until 2.28.0 flips the default to canonical.
If you ship a server emitting canonical Vauban events (eventNaming: "canonical"):
- All checks PASS, including events.type_in_vocabulary.
- The new canonical describe block in tests/reference-server.test.ts asserts the PASS; it was added alongside 2.27.0 to seal the bridge.
If you ship a strictly AG-UI-compatible server (non-Vauban), this entire section is moot; canonical is your only mode and the suite validates you directly.
Migration path
| SDK version | Default hub mode | Conformance result on default |
|---|---|---|
| 2.11.0 through 2.26.x | legacy (no option) | events.type_in_vocabulary FAIL (expected) |
| 2.27.0 | legacy (option canonical opt-in) | FAIL on default ; PASS on opt-in |
| 2.28.0 | canonical (option legacy opt-out) | PASS on default ; opt-out path documented |
| 3.0.0 | canonical only | PASS on default ; legacy mode removed |
AG-UI specification references
Consulted 2026-05:
- AG-UI Protocol homepage: https://docs.ag-ui.com
- Events concept: https://docs.ag-ui.com/concepts/events
- JS SDK event reference: https://docs.ag-ui.com/sdk/js/core/events
- Architecture (transport-agnostic, SSE + WS + webhook): https://docs.ag-ui.com/concepts/architecture
- GitHub: https://github.com/ag-ui-protocol/ag-ui (Linux Foundation, 2026)
AG-UI itself does not normatively mandate HTTP headers or a ?since= reconnect query parameter; the suite enforces the de-facto stack (WHATWG SSE headers + Last-Event-ID semantics) because every production-grade implementation relies on it. Servers that ship WebSocket-only transports will need to signal that to the runner via a transport: "websocket" option, which is not yet implemented (see runner.ts).
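The three headers.* checks can be expressed as a small pure function. The sketch below is illustrative only: `HeaderBag`, `HeaderCheck`, and `checkSseHeaders` are hypothetical names, the `isHttp2` flag stands in for however the runner detects the protocol version (this README does not say), and accepting a `charset` parameter after `text/event-stream` is an assumption.

```typescript
// Structural subset of the Fetch API's Headers, so a Response's headers fit as-is.
interface HeaderBag {
  get(name: string): string | null;
}

interface HeaderCheck {
  id: string;
  ok: boolean;
}

function checkSseHeaders(headers: HeaderBag, isHttp2: boolean): HeaderCheck[] {
  const contentType = (headers.get("content-type") ?? "").toLowerCase();
  const cacheControl = (headers.get("cache-control") ?? "").toLowerCase();
  const connection = headers.get("connection");
  return [
    // Assumption: parameters such as "; charset=utf-8" are tolerated.
    { id: "headers.content_type", ok: contentType.startsWith("text/event-stream") },
    { id: "headers.cache_control", ok: cacheControl.includes("no-cache") },
    {
      id: "headers.connection",
      // Absent is acceptable only under HTTP/2; otherwise it must be keep-alive.
      ok: connection === null ? isHttp2 : connection.toLowerCase() === "keep-alive",
    },
  ];
}
```

Taking a narrow interface instead of the concrete Headers class keeps the function easy to unit-test without opening a network connection.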
Boundaries
- The suite consumes servers via fetch; it has zero runtime dependency on @vauban-org/agent-sdk. The SDK is only listed as a devDependency to support the reference-server.test.ts integration test.
- The suite does NOT mutate any state on the target server unless writeProbePath is configured AND an insufficientScopeBearer is provided. The write probe sends a benign payload ({"text":"conformance-probe"}).
- The suite does NOT attempt to verify Ed25519 signatures on events. Signature conformance is a Vauban-specific extension above AG-UI; a separate suite would be required.
Layout
packages/agent-sdk-conformance/
├── src/
│ ├── index.ts ; package surface
│ ├── spec.ts ; AG-UI canonical zod shapes + validator oracle
│ ├── sse-parser.ts ; strict WHATWG SSE parser
│ ├── runner.ts ; runConformanceSuite() ; talks to your server via fetch
│ └── bin/cli.ts ; agent-sdk-conformance CLI
├── tests/
│ ├── spec.test.ts ; vocabulary + shape validator unit tests
│ ├── sse-parser.test.ts ; SSE framing + incremental parsing tests
│ ├── runner.test.ts ; runner integration against fake conformant + broken servers
│ └── reference-server.test.ts ; integration against the in-repo Vauban SSE reference server
├── package.json
├── tsconfig.json
└── vitest.config.ts
Running the suite locally
pnpm install
pnpm --filter @vauban-org/agent-sdk-conformance test
pnpm --filter @vauban-org/agent-sdk-conformance build
Versioning
Pre-1.0; the conformance check set may evolve as AG-UI itself stabilizes (the 1.0 spec milestone is targeted at the Linux Foundation cadence). Check ids are a stable surface: adding a check is a MINOR bump, removing or renaming an existing id is a MAJOR bump.
License
MIT. See repository root LICENSE.
