
express-agentserver

v0.1.0

Published

Express bindings + host facade for the OpenAI Responses API and Azure AI Foundry hosted-agent protocols (Responses / Invocations / A2A). TypeScript port of azure-ai-agentserver-*, so Node.js teams can ship an agent to Foundry, any OpenAI-compatible runtime, or their own infrastructure.

Downloads

126

Readme

express-agentserver

Express bindings for the OpenAI Responses API + Azure AI Foundry hosted-agent protocols. Mountable Routers and middleware that implement the wire contract (SSE event stream, sequence numbers, lifecycle events, function-call / reasoning / tool-call output items, background mode, cancellation, conversation history), plus a startAgent host facade for the 90% case — so a Node.js team can ship an agent to Foundry, any OpenAI-compatible runtime, or their own infrastructure with one function call or hand-wired middleware, whichever fits.

Status: verified end-to-end against the live Foundry runtime in NCUS (every cookbook example is deployed there as its own hosted agent). ~6,500 LOC TS, 118 Vitest tests, all wire-format gotchas (id partition keys, auto-stamps, agent_session_id, memory-store type: "message", etc.) discovered by A/B testing against Python's azure-ai-agentserver-* and patched. Foundry behaviour rules B2 / B13 / B16 / B38 / B39 / B40, plus background mode (B45), are all enforced in the lib.

Foundry vs. non-Foundry — the lib works for both

The Responses-protocol wire format is OpenAI's published spec, not Foundry-specific. Any OpenAI SDK client speaks to a responsesRouter out of the box. So the lib is two things at once:

  1. A generic OpenAI-Responses-API server. Bring the lib, write a handler, point an OpenAI SDK client at your Express app — done. No Azure, no Foundry, no Microsoft anything required. The cookbook/01-minimal.ts and 12-custom-storage.ts examples don't reference Foundry at all.

  2. A Foundry hosted-agent runtime adapter. When FOUNDRY_AGENT_NAME env is set (Foundry injects it), the lib auto-activates FoundryStorageProvider for cross-replica persistence, FoundryEnrichmentSpanProcessor for App Insights, the agent-MI auth path for sibling-resource calls (prompt agents, memory stores), and honours every Foundry behaviour rule documented above. Without that env, all those pieces are no-ops and you get a clean stand-alone server.

What's reusable for non-Foundry users:

| You need | Use |
| --- | --- |
| HTTP server speaking OpenAI Responses API | `startAgent({ handler })` or raw `responsesRouter` |
| SSE streaming with proper `sequence_number` + lifecycle events | `ResponseEventStream` builder |
| Background mode (202 + queued + poll) | `body.background = true` — handled by the router |
| Mid-stream cancellation | `POST /responses/:id/cancel` + `abortSignal` in handler |
| Per-tenant isolation | `x-agent-user-isolation-key` header → `context.isolation.userKey` |
| Pluggable storage (Postgres, Redis, S3) | Implement `ResponseProvider`, pass via the `storage:` option |
| Multi-agent orchestration (LangGraph / OpenAI Agents / Anthropic / Google ADK / on-device Foundry Local) | `examples/cookbook/frameworks/` — each wires the framework's stream events through to OpenAI Responses output |
| OTel tracing + log export to your own collector | `configureObservability()` with `OTEL_EXPORTER_OTLP_ENDPOINT` env |

What's Foundry-specific (and silently no-ops elsewhere):

| Component | When it activates | Off-Foundry behaviour |
| --- | --- | --- |
| `FoundryStorageProvider` | `FOUNDRY_AGENT_NAME` env set | `maybeCreateFoundryStore()` returns `null` → fallback to `InMemoryResponseProvider` |
| Foundry partition-key id format `caresp_<16hex>00<32entropy>` | always | Still emitted; it's a valid id from the client's POV |
| Auto-stamps (`agent_session_id`, `agent_reference`, `response_id`) | always | Harmless extra fields if the client doesn't read them |
| `express-agentserver/foundry-tools` subpath | only when imported | Doesn't load otherwise |
| Azure Monitor exporter | `APPLICATIONINSIGHTS_CONNECTION_STRING` env set | Skipped; OTLP exporter still works for non-Azure backends |

In other words: start with cookbook/01-minimal.ts regardless of where you intend to deploy. The lib doesn't care; the Foundry-specific features wake up only when their environment flags are present.

Architecture at a glance

flowchart LR
    Client[Foundry runtime / SDK client] -->|HTTPS| App
    subgraph App["express() app"]
        direction TB
        MW[middleware: requestId · platformHeaders · inboundLogging · tracing · readiness]
        RR[/responses router\nB2/B13/B16/B38/B39/B40/B45/]
        IR[/invocations router\nopt-in via invocationsHandler/]
        Errors[agentServerErrorHandler]
        MW --> RR
        MW --> IR
        RR --> Errors
        IR --> Errors
    end
    RR --> Handler[your async generator handler]
    Handler -->|callAgent| Sibling[sibling Foundry agents]
    Handler -->|searchMemory / addMemories| Memory[Foundry memory store]
    RR -->|persist + replay| Storage[FoundryStorageProvider\nor InMemoryResponseProvider]
    App -->|spans + logs| OTel[OTel SDK → Azure Monitor / OTLP]
    App -.->|DEBUG=express-agentserver:*| Debug[debug npm package]

startAgent({ handler }) returns this whole assembly with sane defaults; power users compose the same pieces by hand.

Why this exists

Microsoft publishes azure-ai-agentserver-* for Python and Azure.AI.AgentServer.* for .NET, but not for Node.js. The contract is specific: SSE event shapes, per-event sequence_number, partition-keyed ids (caresp_<16hex>00<32entropy>), auto-stamped agent_session_id / response_id / agent_reference on persisted envelopes, x-platform-server / x-agent-invocation-id headers, /readiness probe, graceful shutdown. This package is a faithful port — byte-compatible output verified by transport-tracing the Python SDK.

Install

npm install express-agentserver express

Optional, for Foundry storage / observability / sibling-resource calls:

npm install @azure/identity \
            @azure/monitor-opentelemetry-exporter \
            @opentelemetry/sdk-trace-node \
            @opentelemetry/api-logs \
            @opentelemetry/sdk-logs

These are listed as optional peer dependencies — install only the ones you use. Without them the lib silently no-ops the corresponding code paths.

Two ways to use it

1. The host facade — startAgent (recommended for new projects)

One function, all defaults sane. Wires every middleware + router for the Foundry contract: OTel + W3C trace context, requestId, platformHeaders, inboundLogging, tracing, readiness, the responses router, optional invocations router, port + lifecycle, span flush on shutdown.

import { startAgent, TextResponse } from "express-agentserver";

await startAgent({
  handler: async function* (request, context) {
    yield* new TextResponse(context, request, {
      text: `Echo: ${context.getInputText() || "(no input)"}`,
    });
  },
});
// → boots OTel, picks Foundry/in-memory store automatically, mounts /responses,
//   honours PORT, attaches SIGTERM/SIGINT, flushes spans on shutdown.

For the version that returns the configured Express app without listening — so you can mount it inside an existing server — use createAgentApp:

import express from "express";
import { createAgentApp } from "express-agentserver";

const agentApp = await createAgentApp({ handler: myHandler });
const outer = express();
outer.use("/api/v1/agents/echo", agentApp); // composes inside your own app
outer.listen(8088);

startAgent returns { app, server, port, close } so you keep every escape hatch — including direct access to the underlying Express app.

2. Raw bindings — middleware + router factories (full control)

For when you need fine-grained control or want to see exactly what startAgent composes:

import express from "express";
import {
  attachLifecycle,
  CORE_SEGMENT,
  inboundLogging,
  platformHeaders,
  readiness,
  requestId,
  responsesRouter,
  RESPONSES_SEGMENT,
  TextResponse,
} from "express-agentserver";

const app = express();
app.use(express.json({ limit: "4mb" }));
app.use(requestId());
app.use(platformHeaders({ segments: [CORE_SEGMENT, RESPONSES_SEGMENT] }));
app.use(inboundLogging());
app.use(readiness());
app.use(
  "/responses",
  responsesRouter({
    handler: async function* (request, context) {
      yield* new TextResponse(context, request, { text: `Echo: ${context.getInputText()}` });
    },
  }),
);

const server = app.listen(Number(process.env.PORT) || 8088);
attachLifecycle(server);

No host class, no framework wrapping the user's app. The routers are plain express.Router instances; mount them anywhere.

Request lifecycle (Responses protocol)

sequenceDiagram
    autonumber
    participant C as Client (Foundry)
    participant R as responsesRouter
    participant H as your handler
    participant S as ResponseProvider
    C->>R: POST /responses { input, stream?, background? }
    R->>R: validate B13/B40 + extract isolation + resolve session (B39)
    R->>R: build ResponseContext + AbortController
    R->>H: handler(request, context, abortSignal)
    loop stream events
        H-->>R: yield response.created / in_progress / output_item.added / ... / completed
        alt stream:true
            R-->>C: SSE event with sequence_number
        end
    end
    R->>S: saveResponse(envelope) + saveEvents(...)
    alt non-streaming
        R-->>C: 200 ResponseObject
    else background
        R-->>C: 202 { status: queued }
    end
    Note over C,R: Cancel → POST /responses/:id/cancel<br/>signals AbortController; router emits response.cancelled

Routes

GET    /readiness                            health probe → {"status":"healthy"}
POST   /responses                            non-streaming → ResponseObject
                                             streaming    → SSE event stream
GET    /responses/:id                        retrieve persisted response
GET    /responses/:id?stream=true            SSE replay (B2 — needs background+stream+store)
                                             with `?starting_after=<seq>` resume cursor
GET    /responses/:id/input_items            paginated; needs FoundryStorageProvider
POST   /responses/:id/cancel                 signal abort to in-flight handler
DELETE /responses/:id                        cancel-then-remove
POST   /invocations                          opt-in via invocationsHandler
GET    /invocations/:id                      opt-in via getHandler
POST   /invocations/:id/cancel               opt-in via cancelHandler

Test it:

curl -s -X POST http://localhost:8088/responses \
  -H 'content-type: application/json' \
  -d '{"input":"hello"}' | jq

curl -N -X POST http://localhost:8088/responses \
  -H 'content-type: application/json' \
  -d '{"input":"stream me","stream":true}'

Examples

examples/cookbook/ — 30+ focused, single-concept agents, every one verified end-to-end through Foundry's hosted-agent gateway. Categorised in the cookbook README:

  • Core protocol (minimal, streaming, multi-turn, cancellation, background, multi-protocol)
  • Output items (tool-calling, reasoning)
  • Foundry integrations (prompt-agent composition, memory store, MCP via connection, KB RAG variants)
  • Composition (mounted-in-existing-app, custom storage, error filter, isolation keys)
  • Observability (DEBUG namespacing)
  • MCP tool-calling (public MCP, Foundry-resolved MCP)
  • Agent-to-agent (A2A server + client over @a2a-js/sdk)
  • Multi-agent workflows (handoff, fan-out, multi-step research, conditional tool loop, human-in-the-loop)
  • Framework integrations (LangGraph, OpenAI Agents, Anthropic, Google ADK, Foundry Local)

There is also a Foundry workflow agent example and a Foundry config reference covering ARM templates, preview headers, reserved env vars, and Postman specs.

examples/foundry-deploy/ — full deployment template that exercises every feature; this is what's currently live on agentserver-ncus/dev3 for end-to-end verification.

Output items

ResponseEventStream builders for every output-item type Foundry's runtime emits:

  • Message + text / refusal / audio: addOutputItemMessage(), then addTextContent() / addRefusalContent() / addAudioContent({ data, format })
  • Function call: addOutputItemFunctionCall({ name }), then emitArgumentsDelta(...) → emitArgumentsDone() → emitDone()
  • Reasoning: addOutputItemReasoning(), then addSummaryPart()
  • Image generation: addOutputItemImageGeneration()
  • Tool calls: addOutputItemWebSearch(), addOutputItemFileSearch(), addOutputItemCodeInterpreter(), addOutputItemMcpCall()
  • Computer-use: addOutputItemComputerCall({ action }), addOutputItemComputerCallOutput({ callId, imageUrl })
  • Generic: addOutputItem(item) for anything not yet ergonomically wrapped (the validator is permissive, so future item types still pass through).

Foundry sibling resources — express-agentserver/foundry-tools

Subpath for calling other Foundry resources from inside a hosted agent: prompt agents, memory stores, MCP / KB connections.

import { DefaultAzureCredential } from "@azure/identity";
import {
  FoundryProjectClient,
  callAgent,
  searchMemory,
  addMemories,
  listConnections,
} from "express-agentserver/foundry-tools";

const tools = new FoundryProjectClient(new DefaultAzureCredential(), {
  endpoint: process.env.FOUNDRY_PROJECT_ENDPOINT!,
});

// Compose with a sibling prompt agent.
const result = await callAgent(tools, { agentName: "summarizer", input: "..." });

// Memory CRUD.
await addMemories(tools, "agent-memory", {
  scope: "user-alice",
  messages: [{ role: "user", content: "I love sushi" }],
});
const hits = await searchMemory(tools, "agent-memory", {
  scope: "user-alice",
  query: "food preferences",
});

The lib auto-stamps type: "message" on memory items (Foundry's extractor silently no-ops without it; verified empirically). Memory paths use the colon-action segments (:update_memories, :search_memories, :delete_scope); the preview header Foundry-Features: MemoryStores=V1Preview is added automatically.

Observability

configureObservability() sets up an OpenTelemetry TracerProvider, extracts incoming W3C trace context, and stamps GenAI semantic-convention attributes on every span (gen_ai.system, gen_ai.agent.name, gen_ai.response.id, microsoft.session.id, microsoft.foundry.project.id).

import { configureObservability, tracing } from "express-agentserver";
await configureObservability(); // reads APPLICATIONINSIGHTS_CONNECTION_STRING
// and OTEL_EXPORTER_OTLP_ENDPOINT from env
app.use(tracing({ operation: "create_response" }));

startAgent calls this for you by default. Pass observability: false to skip.

| Exporter | Package |
| --- | --- |
| Azure Monitor (App Insights) | `@azure/monitor-opentelemetry-exporter` |
| OTLP HTTP | `@opentelemetry/exporter-trace-otlp-http` |

A FoundryEnrichmentSpanProcessor is added automatically — it stamps agent identity + session/conversation IDs on every span (including spans created by frameworks like LangChain), so the Foundry portal Tracing tab shows the full request graph.

The getLogger() wrapper auto-fills severityNumber from severityText on every emit. Without that, Azure Monitor silently drops records — a real footgun we hit during verification.

Debug logging

The lib uses the canonical debug package — turn on internals via env var:

DEBUG=express-agentserver:*                  # everything
DEBUG=express-agentserver:host               # just host wire-up
DEBUG=express-agentserver:foundry-tools      # just sibling-resource calls
DEBUG=express-agentserver:responses:router   # just the responses path
DEBUG=express:*,express-agentserver:*        # plus Express's own

Application code can register its own namespace via the lib's helper:

import { createDebug } from "express-agentserver";
const log = createDebug("my-agent:billing");
log("charged %s for %d cents", customer, cents); // shows when DEBUG=my-agent:*

Foundry storage (cross-replica persistence)

For production, replicas are stateless. FoundryStorageProvider persists responses + events to Foundry's storage tier so any replica can serve GET /responses/:id:

import {
  responsesRouter,
  maybeCreateFoundryStore,
  InMemoryResponseProvider,
} from "express-agentserver";

const store = (await maybeCreateFoundryStore()) ?? new InMemoryResponseProvider();
app.use("/responses", responsesRouter({ store, handler: myHandler }));

startAgent performs this auto-detection by default (storage: "auto"). The provider implements every storage operation ResponseContext uses for history resolution: createResponse, updateResponse, getInputItems, getItems, getHistoryItemIds.

Conversation history + item references

When the request carries previous_response_id or conversation.id, the handler can fetch the prior turn's items:

const history = await context.getHistory(); // OutputItem[]
const resolved = await context.getResolvedInputItems(); // references resolved

Both are no-ops against the in-memory provider and full-fidelity against FoundryStorageProvider — so simple agents stay zero-config locally and become full-stack in production with no code change.

The router also performs eager previous_response_id validation — unknown ids return 404 not_found before the handler runs, with proper isolation-aware lookup.

Cancellation

POST /responses/:id/cancel signals the handler via the AbortSignal it received, and the router emits response.cancelled. DELETE /responses/:id cancels, then removes. Handlers should periodically check abortSignal.aborted (or thread the signal into framework calls that honour it — fetch, the OpenAI SDK, LangChain).
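A minimal sketch of that cooperative-cancellation pattern using Node's built-in AbortController — this is plain Node, not the library's internals; the handler shape (an async generator receiving a signal) mirrors what the router passes in:

```typescript
// Handler-style async generator that stops yielding once the signal fires,
// the same periodic abortSignal.aborted check described above.
async function* countingHandler(signal: AbortSignal): AsyncGenerator<string> {
  for (let i = 0; ; i++) {
    if (signal.aborted) return; // cooperative cancellation check
    yield `chunk-${i}`;
    await new Promise((r) => setTimeout(r, 10));
  }
}

async function run(): Promise<string[]> {
  const controller = new AbortController();
  const out: string[] = [];
  // Simulate POST /responses/:id/cancel arriving mid-stream.
  setTimeout(() => controller.abort(), 25);
  for await (const chunk of countingHandler(controller.signal)) out.push(chunk);
  return out;
}
```

The same signal can be forwarded to `fetch(url, { signal })` so downstream calls abort alongside the handler.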

Foundry behaviour rules implemented

The lib enforces (and tests) the rules from Foundry's hosted-agent API behaviour contract that affect protocol semantics:

| Rule | Behaviour |
| --- | --- |
| B2 | SSE replay requires background+stream+store — late attachers without all three get 400 |
| B13 | background=true requires store=true — contradiction returns 400 unsupported_parameter |
| B16 | Non-background in-flight responses are not findable via GET (404 even when entry exists) |
| B38 | `x-agent-response-id` request header overrides the generated id |
| B39 | Session-id resolution: header → middleware → body → default → conv-derived → prev-derived → fresh UUID |
| B40 | Path-param ids are validated against the canonical format; malformed → 400 invalid_parameters |
| B45 | Background mode → 202 + queued envelope, async drain, late attach via subject |
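The B39 fallback chain reads naturally as a nullish-coalescing cascade. A hedged sketch — the field names and the `conv-`/`prev-` derived-id formats here are illustrative stand-ins, not the library's actual shapes:

```typescript
import { randomUUID } from "node:crypto";

// Illustrative B39-style resolution: first non-empty source wins, in the
// documented order; all property names below are hypothetical.
interface SessionSources {
  headerId?: string;           // session id from a request header
  middlewareId?: string;       // set by upstream middleware
  bodyId?: string;             // session id in the request body
  defaultId?: string;          // server-configured default
  conversationId?: string;     // derive from conversation.id
  previousResponseId?: string; // derive from previous_response_id
}

function resolveSessionId(s: SessionSources): string {
  return (
    s.headerId ??
    s.middlewareId ??
    s.bodyId ??
    s.defaultId ??
    (s.conversationId ? `conv-${s.conversationId}` : undefined) ??
    (s.previousResponseId ? `prev-${s.previousResponseId}` : undefined) ??
    randomUUID() // last resort: fresh UUID
  );
}
```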

Plus a SeekableReplaySubject with ?starting_after=<seq> cursor for resuming SSE after a disconnect.
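The replay-cursor idea is small enough to sketch. This is not the library's SeekableReplaySubject, just a minimal buffer showing the ?starting_after=<seq> semantics — events carry a sequence_number and a reconnecting subscriber receives only those strictly after its cursor:

```typescript
interface SeqEvent {
  sequence_number: number;
  data: string;
}

// Minimal replay buffer: publish stamps monotonically increasing sequence
// numbers; replayAfter serves a late/reconnecting subscriber from a cursor.
class ReplayBuffer {
  private events: SeqEvent[] = [];
  private seq = 0;

  publish(data: string): SeqEvent {
    const ev = { sequence_number: this.seq++, data };
    this.events.push(ev);
    return ev;
  }

  replayAfter(startingAfter: number): SeqEvent[] {
    return this.events.filter((e) => e.sequence_number > startingAfter);
  }
}
```

A production version would additionally fan events out to live subscribers and bound the buffer; this only shows the seek contract.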

Pluggable interfaces

Process-local defaults can be swapped out:

interface StreamProvider {
  create(responseId: string): StreamPublisher;
  get(responseId: string): StreamPublisher | undefined;
  remove(responseId: string): void;
}

interface CancellationSignalProvider {
  register(responseId: string, controller: AbortController, opts?): InFlightEntry;
  cancel(responseId: string): InFlightEntry | undefined;
  // ...status transitions, isFindable, etc.
}

The defaults are in-process Maps; implement these against Redis (or similar) for cross-replica fan-out + cancellation.
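A sketch of a custom StreamProvider against the interface above. StreamPublisher's shape is stood in here with a minimal hypothetical form (the real type lives in the library); replace the Map with Redis pub/sub keyed by responseId to get cross-replica fan-out:

```typescript
// Hypothetical minimal publisher shape, for illustration only.
interface StreamPublisher {
  publish(event: unknown): void;
  close(): void;
}

// In-process implementation of the documented StreamProvider interface.
class MapStreamProvider {
  private streams = new Map<string, StreamPublisher>();

  create(responseId: string): StreamPublisher {
    const events: unknown[] = [];
    const publisher: StreamPublisher = {
      publish: (ev) => events.push(ev),
      close: () => this.streams.delete(responseId),
    };
    this.streams.set(responseId, publisher);
    return publisher;
  }

  get(responseId: string): StreamPublisher | undefined {
    return this.streams.get(responseId);
  }

  remove(responseId: string): void {
    this.streams.delete(responseId);
  }
}
```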

Typed exceptions + error filter

import {
  agentServerErrorHandler,
  NotFoundError,
  InvalidParametersError,
  AuthenticationError,
  InternalError,
} from "express-agentserver";

app.use(myRoutes);
app.use(agentServerErrorHandler()); // last — catches all AgentServerError throws
// and renders { error: { code, message, ... } }

Throw the typed exceptions from anywhere — handlers, providers, middleware. Unknown errors fall through to a 500 internal_error envelope with the message preserved.
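The status + envelope mapping can be sketched without Express. This is not the library's AgentServerError hierarchy — the class and field names here are illustrative — but it shows the fall-through behaviour described above: typed errors keep their status and code, anything else becomes a 500 internal_error with the message preserved:

```typescript
// Hypothetical stand-in for a typed agent error.
class AgentError extends Error {
  constructor(
    public status: number,
    public code: string,
    message: string,
  ) {
    super(message);
  }
}

// Map any thrown value to an HTTP status + { error: { code, message } } body.
function renderError(err: unknown): {
  status: number;
  body: { error: { code: string; message: string } };
} {
  if (err instanceof AgentError) {
    return { status: err.status, body: { error: { code: err.code, message: err.message } } };
  }
  const message = err instanceof Error ? err.message : String(err);
  return { status: 500, body: { error: { code: "internal_error", message } } };
}
```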

Environment variables

| Variable | Purpose |
| --- | --- |
| `PORT` | Listen port (`startAgent` honours this; default 8088) |
| `FOUNDRY_AGENT_NAME` | Set by Foundry runtime; presence triggers storage auto-detect |
| `FOUNDRY_AGENT_VERSION` | Set by Foundry runtime |
| `FOUNDRY_PROJECT_ENDPOINT` | Set by Foundry runtime; consumed by `FoundryProjectClient` |
| `FOUNDRY_PROJECT_ARM_ID` | Set by Foundry runtime |
| `FOUNDRY_AGENT_SESSION_ID` | Default session id for Invocations |
| `APPLICATIONINSIGHTS_CONNECTION_STRING` | Platform-injected (configured via the project's App Insights connection) |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Optional OTLP endpoint |
| `DEBUG` | `express-agentserver:*` for namespaced internal logs |

Docker

FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json tsconfig.json ./
RUN npm ci
COPY src ./src
RUN npm run build && npm prune --omit=dev

FROM node:20-alpine AS runtime
WORKDIR /app
ENV PORT=8088
# Required for Foundry's storage HTTPS — Node 20's bundled CA store is too narrow.
RUN apk add --no-cache ca-certificates && update-ca-certificates
ENV NODE_OPTIONS=--use-openssl-ca
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/package.json ./
EXPOSE 8088
USER node
CMD ["node", "dist/index.js"]

See examples/foundry-deploy/Dockerfile for the complete working setup with the lib bundled as a tarball during build.

Foundry deployment — hard-won gotchas

These all surfaced during real-runtime testing against Foundry. None are documented anywhere on Microsoft's side. Capturing here so the next person doesn't burn a day on each.

Memory store items need type: "message"

Foundry's memory extractor silently no-ops when items in the request body lack type: "message". The API accepts both shapes (200 + update_id), the LRO transitions to completed, but memory_operations returns [] and usage.input_tokens is 0 — the chat extractor never runs. Verified empirically across 3 separate Foundry projects. The lib's addMemories / searchMemory auto-stamp type: "message" on every item.
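The auto-stamp itself is a one-liner; a sketch of the idea (field names beyond type/role/content are this example's, not the library's exact wire types):

```typescript
interface MemoryItem {
  type?: string;
  role: string;
  content: string;
}

// Ensure every item carries type: "message" so Foundry's extractor runs,
// without clobbering an explicitly set type.
function stampMessageType(items: MemoryItem[]): MemoryItem[] {
  return items.map((item) => ({ ...item, type: item.type ?? "message" }));
}
```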

Memory paths use action segments + colons

Memory paths are NOT /memory_stores/<store>/{search,items,...} — they're /memory_stores/<store>:<action>?api-version=v1 with action segments (:update_memories, :search_memories, :delete_scope). Body shape is { items, scope }, not { messages, scope }. Scope regex is [A-Za-z0-9_\-.%+@/]{1,256} (colons rejected).

/openai/v1/... routes reject ?api-version=v1

When the path itself contains a version segment, Foundry rejects the query parameter ("api-version query parameter is not allowed when using /v1 path"). The lib's buildUrl auto-strips the param when the path matches \/(v\d+|openai\/v\d+).
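A sketch of that check — not the library's buildUrl, just the documented regex applied to a URL builder, so a versioned path never carries the query parameter:

```typescript
// Append ?api-version only when the path has no /v<N> (or /openai/v<N>) segment.
function buildUrlSketch(base: string, path: string, apiVersion = "v1"): string {
  const versioned = /\/(v\d+|openai\/v\d+)(\/|$)/.test(path);
  const url = new URL(path, base);
  if (!versioned) url.searchParams.set("api-version", apiVersion);
  return url.toString();
}
```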

severityNumber is required for App Insights log records

Records emitted via the OTel api-logs Logger.emit({...}) API without severityNumber are silently dropped during ingestion. severityText: "INFO" alone isn't enough. The lib's getLogger() auto-fills it.

Hosted-agent invocation URL

POST {project_endpoint}/agents/{agent_name}/endpoint/protocols/openai/responses?api-version=v1

Bearer token scope: https://ai.azure.com/.default. Header: Foundry-Features: HostedAgents=V1Preview. The OpenAI-style <account>.openai.azure.com/openai/v1/responses with agent_reference in body does NOT work for hosted agents — Foundry returns 400 with the canonical URL above.

Foundry generates a fresh agent_session_id per request

Even within a previous_response_id chain, Foundry stamps a fresh session id on each response. For per-user memory scoping use x-agent-user-isolation-key (the canonical "tenant" header — Foundry hashes it consistently) instead of agent_session_id. The lib reads it into context.isolation.userKey.

Required role assignments

| Principal | Role | Scope |
| --- | --- | --- |
| AI Services account system identity | AcrPull | the registry |
| Project system identity | AcrPull | the registry |
| Project system identity | Azure AI User | the account |
| Project system identity | Cognitive Services OpenAI User | the account |
| Agent's `instance_identity.principal_id` | AcrPull | the registry |
| Agent's `instance_identity.principal_id` | Azure AI User | the account |

The Cognitive Services OpenAI User role on the account (for the project's managed identity) is required for memory-store extraction to invoke chat/embedding models. The agent's principal id is in the create_version response body — a missing AcrPull assignment on it causes silent provisioning failures.

Capability hosts can't be updated

Per Microsoft's docs and verified empirically: PUT capabilityHost returns "Update of capability is not currently supported. Please delete and recreate with the new configuration." Plan accordingly — any change to aiServicesConnections / vectorStoreConnections / threadStorageConnections requires deleting and recreating the host (which means deleting all dependent agents / memory stores first). For default Microsoft-managed storage of memories etc., don't create a project-level capability host at all — the runtime falls back to managed defaults.

Dockerfile TLS — Node bundle isn't enough

Foundry's hosted-agent container's outbound HTTPS to <project_endpoint>/storage/* fails with SELF_SIGNED_CERT_IN_CHAIN against Node 20's bundled CA store. Fix in your runtime stage:

RUN apk add --no-cache ca-certificates && update-ca-certificates
ENV NODE_OPTIONS=--use-openssl-ca

Wire-format alignment with Python (storage tier)

  1. POST /storage/responses body must always include input_items: [] and history_item_ids: [], even when empty.
  2. POST /storage/items/batch/retrieve uses key item_ids, not ids.
  3. x-ms-client-request-id: <uuid> is required — Foundry's storage tier uses it for routing; missing → generic 500.
  4. The response object needs null defaults stripped before send — Python's as_dict() does this; explicit nulls cause server-side deserialization to fail.
  5. Ids use partition-key prefixes: caresp_<16hex>00<32entropy>. Output items get <prefix>_<16hex>00<32entropy> co-located on the same shard. The lib's idGenerator does this; if you swap it, mirror the format or storage will misroute.

Our FoundryStorageProvider already does all of this — these notes are for anyone porting to a different language.
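The id shape from point 5 is easy to get subtly wrong when porting. A sketch that mirrors the documented format (the `msg` prefix for the output item is hypothetical; only the `caresp` prefix and the `<16hex>00<32entropy>` layout come from the notes above):

```typescript
import { randomBytes } from "node:crypto";

// Partition-keyed id: <prefix>_<16 hex partition>00<32 hex entropy>.
// Pass an existing partition to co-locate items on the same shard.
function makeId(prefix: string, partitionKey?: string): string {
  const partition = partitionKey ?? randomBytes(8).toString("hex"); // 16 hex chars
  const entropy = randomBytes(16).toString("hex"); // 32 hex chars
  return `${prefix}_${partition}00${entropy}`;
}

const responseId = makeId("caresp");
// Reuse the response's partition for its output items (hypothetical prefix).
const partition = responseId.slice("caresp_".length, "caresp_".length + 16);
const itemId = makeId("msg", partition);
```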

Layout

src/
├── core/
│   ├── middleware.ts   requestId, platformHeaders, inboundLogging, readiness, asyncHandler
│   ├── lifecycle.ts    attachLifecycle
│   ├── sse.ts          SSE framing + keep-alive
│   ├── config.ts       loadAgentConfig
│   ├── errors.ts       sendError + AgentServerError hierarchy + agentServerErrorHandler
│   ├── tracing.ts      configureObservability + getLogger + FoundryEnrichmentSpanProcessor
│   └── debug.ts        createDebug — namespaced logging via the `debug` package
├── invocations/
│   ├── router.ts       invocationsRouter
│   └── ids.ts          sanitizeId
├── responses/
│   ├── router.ts       responsesRouter (B2/B13/B16/B38/B39/B40/B45)
│   ├── context.ts      ResponseContext
│   ├── eventStream.ts  ResponseEventStream + every output-item builder
│   ├── textResponse.ts TextResponse
│   ├── streamSubject.ts StreamSubject + StreamProvider interface
│   ├── inFlight.ts     InFlightRegistry + CancellationSignalProvider interface
│   ├── store.ts        InMemoryResponseProvider + ResponseProvider interface
│   ├── stateMachine.ts EventStreamValidator
│   ├── sessionResolver.ts B39 fallback chain
│   ├── idGenerator.ts  Foundry id format (partition-key co-location)
│   ├── foundryProvider.ts FoundryStorageProvider
│   ├── foundrySettings.ts FoundryStorageSettings
│   ├── foundryErrors.ts Storage-tier error mapping
│   └── models.ts       Wire types
├── foundry-tools/
│   ├── client.ts       FoundryProjectClient (sibling-resource HTTP)
│   ├── agents.ts       callAgent, streamAgent
│   ├── memory.ts       addMemories, searchMemory, deleteMemoryScope
│   └── connections.ts  listConnections, getConnection
├── host/
│   └── index.ts        startAgent + createAgentApp
└── index.ts            Flat re-exports

Each subentry also has its own export path (express-agentserver/core, /invocations, /responses, /foundry-tools) — see package.json exports.

CI

npm run ci       # build + lint + format:check + test
npm run test     # vitest, 118 tests across 15 files
npm run lint     # eslint
npm run format   # prettier --write

Credits

Port of Microsoft's Python azure-ai-agentserver-* packages. All the protocol decisions, event ordering rules, header conventions, and session-resolution behaviour are theirs — the Express shape (routers, middleware, lifecycle helper, host facade) is ours.

License

MIT