@psci-labs/chat-runtime
v0.3.0
Node agent runner for the Claude Agent SDK chat surface — Web-standard Request → Response handler with SSE streaming, MCP plumbing, and a pluggable persistence interface
Node agent runner for the Claude Agent SDK chat surface. Returns a Web-standard Request → Response handler with SSE streaming, MCP plumbing, and a pluggable persistence interface.
pnpm add @psci-labs/chat-runtime @anthropic-ai/claude-agent-sdk

@anthropic-ai/claude-agent-sdk is a peer dependency: pin the version your app
wants and the runtime forwards through to it. @psci-labs/chat-protocol is a
direct dependency (no need to install separately).
Persistence
import { InMemoryPersistence, type PersistenceAdapter } from '@psci-labs/chat-runtime';
const adapter: PersistenceAdapter = new InMemoryPersistence();

The interface (in src/persistence/adapter.ts) is intentionally narrow:
- saveCheckpoint({ threadId, sessionId, sdkVersion, state, metadata }): sdkVersion is required so SDK shape drift fails at the boundary.
- loadCheckpoint(threadId): returns the latest checkpoint for the thread, or null.
- appendMessage(threadId, sessionId, message) / listMessages(threadId, sessionId?): replay history, optionally scoped to a single session.
- archiveSession(threadId, sessionId): lifecycle hook; messages stay readable.
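To make the five methods concrete, here is a minimal Map-backed sketch of an adapter with the same method names. The `Checkpoint` and `StoredMessage` shapes and the `MapPersistence` class are assumptions made for illustration; the real types live in src/persistence/adapter.ts and may differ.

```typescript
// Hypothetical local types: the real definitions live in
// src/persistence/adapter.ts and may differ.
interface Checkpoint {
  threadId: string;
  sessionId: string;
  sdkVersion: string;
  state: unknown;
  metadata?: Record<string, unknown>;
}

interface StoredMessage {
  sessionId: string;
  message: unknown;
}

// Illustrative adapter only; not the package's InMemoryPersistence.
class MapPersistence {
  private checkpoints = new Map<string, Checkpoint>();
  private messages = new Map<string, StoredMessage[]>();
  private archived = new Set<string>();

  async saveCheckpoint(cp: Checkpoint): Promise<void> {
    // Reject a missing sdkVersion at the boundary rather than on replay.
    if (!cp.sdkVersion) throw new Error('sdkVersion is required');
    this.checkpoints.set(cp.threadId, cp);
  }

  async loadCheckpoint(threadId: string): Promise<Checkpoint | null> {
    // Only the latest checkpoint per thread is kept.
    return this.checkpoints.get(threadId) ?? null;
  }

  async appendMessage(threadId: string, sessionId: string, message: unknown): Promise<void> {
    const list = this.messages.get(threadId) ?? [];
    list.push({ sessionId, message });
    this.messages.set(threadId, list);
  }

  async listMessages(threadId: string, sessionId?: string): Promise<StoredMessage[]> {
    const list = this.messages.get(threadId) ?? [];
    return sessionId === undefined ? [...list] : list.filter((m) => m.sessionId === sessionId);
  }

  async archiveSession(threadId: string, sessionId: string): Promise<void> {
    // Archiving only marks the session; messages are left in place.
    this.archived.add(`${threadId}:${sessionId}`);
  }
}
```

A production adapter would replace the maps with database calls while keeping the same observable behavior.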
Reusable contract test
Every adapter is tested against the same suite via
@psci-labs/chat-runtime/testing:
import { describe } from 'vitest';
import { runPersistenceAdapterContract } from '@psci-labs/chat-runtime/testing';
import { PostgresPersistence } from '../src/index.js';
describe('PostgresPersistence', () => {
  runPersistenceAdapterContract(() => new PostgresPersistence({ ...config }));
});

The factory must return a fresh empty adapter per call; the contract invokes it once per test case so suites stay isolated.
Auth (getUserContext)
The runtime is auth-agnostic. Apps inject a callback:
import { resolveUserContext, type GetUserContext } from '@psci-labs/chat-runtime';
const getUserContext: GetUserContext = async (req) => {
  const session = await getServerSession(req);
  return session ? { userId: session.user.id } : null;
};

Returning null is the contract for "unauthenticated": the runtime answers
401 with { error: 'Unauthorized' }. Throwing from the callback is reserved
for genuine bugs in the integration and surfaces as 500.
The shape returned is validated against the UserContext schema from
@psci-labs/chat-protocol, so a misshapen context (empty userId, wrong
attributes type) fails loudly at the boundary.
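The three outcomes described above (null, throw, misshapen context) can be sketched as a small resolver. This is an illustration, not the package's `resolveUserContext`: `resolveAuth`, `AuthOutcome`, and the hand-written `isUserContext` check are hypothetical names, the 500 error bodies are placeholders, and mapping a misshapen context to 500 is an assumption (the text above only says it fails loudly). The sketch returns plain status descriptors rather than Web `Response` objects to stay dependency-free.

```typescript
// Hypothetical shape; the real UserContext schema lives in @psci-labs/chat-protocol.
interface UserContext {
  userId: string;
  attributes?: Record<string, unknown>;
}

type AuthOutcome =
  | { status: 200; userContext: UserContext }
  | { status: 401 | 500; body: { error: string } };

// Hand-rolled stand-in for schema validation.
function isUserContext(value: unknown): value is UserContext {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.userId !== 'string' || v.userId.length === 0) return false;
  const attrs = v.attributes;
  return attrs === undefined || (typeof attrs === 'object' && attrs !== null && !Array.isArray(attrs));
}

async function resolveAuth<Req>(
  req: Req,
  getUserContext: (req: Req) => Promise<unknown>,
): Promise<AuthOutcome> {
  let raw: unknown;
  try {
    raw = await getUserContext(req);
  } catch {
    // A throwing callback is treated as an integration bug, not as "logged out".
    return { status: 500, body: { error: 'Internal Server Error' } };
  }
  // null is the documented signal for "unauthenticated".
  if (raw === null) return { status: 401, body: { error: 'Unauthorized' } };
  // Assumption: a context that fails validation is reported as a 500.
  if (!isUserContext(raw)) return { status: 500, body: { error: 'Invalid user context' } };
  return { status: 200, userContext: raw };
}
```

Keeping "not logged in" (a return value) separate from "integration broke" (an exception) means an outage in the session store never silently looks like a logged-out user.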
System prompt
Three input shapes — passed in via createAgentRunner({ systemPrompt }):
// Plain string — full custom prompt, SDK preset bypassed
systemPrompt: 'You are a billing-workflow specialist.'
// Bare preset — use the SDK's claude_code preset as-is
systemPrompt: { preset: 'claude_code' }
// Preset + per-request append (static or dynamic)
systemPrompt: {
  preset: 'claude_code',
  append: ({ userContext, threadId }) =>
    `User ${userContext.userId} on thread ${threadId}.`,
}

resolveSystemPrompt(input, ctx) translates these into the SDK's expected
shape and is awaited per request, so the dynamic callback can do async work
(e.g. read the user's tenant config from a DB).
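As a rough illustration of that translation step, the sketch below normalizes the three input shapes into one output shape. `resolvePrompt` and all the types are hypothetical stand-ins for `resolveSystemPrompt`, and the output shape (`{ type: 'preset', preset, append? }`) is an assumption about what the SDK accepts; check it against the SDK version you pin.

```typescript
// Hypothetical types for illustration; the package's own types may differ.
interface PromptContext {
  userContext: { userId: string };
  threadId: string;
}

type AppendFn = (ctx: PromptContext) => string | Promise<string>;

type SystemPromptInput =
  | string
  | { preset: 'claude_code'; append?: string | AppendFn };

// Assumed SDK-facing shape: a plain string, or a preset object with optional append.
type SdkSystemPrompt =
  | string
  | { type: 'preset'; preset: 'claude_code'; append?: string };

async function resolvePrompt(
  input: SystemPromptInput,
  ctx: PromptContext,
): Promise<SdkSystemPrompt> {
  // Plain string: passed through untouched, bypassing the preset.
  if (typeof input === 'string') return input;
  // Bare preset: nothing to append.
  if (input.append === undefined) return { type: 'preset', preset: input.preset };
  // Static strings are used as-is; callbacks are awaited so they may do async work.
  const append = typeof input.append === 'function' ? await input.append(ctx) : input.append;
  return { type: 'preset', preset: input.preset, append };
}
```

Because the callback is evaluated per request, two users on the same runner can receive different appended text without rebuilding the runner.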
Built-in MCP servers
Two SDK MCP servers ship with the runtime. Both are pluggable — apps inject the side-effect (notification delivery, user-question UI) via callbacks; the servers own the tool name, description, and Zod schema.
import { createNotificationsServer, createInteractionServer } from '@psci-labs/chat-runtime';
const notifications = createNotificationsServer({
  onNotify: async ({ type, title, body }) => {
    await pushNotificationService.send({
      /* ... */
    });
  },
});
const interaction = createInteractionServer({
  askUser: async (questions) => {
    // session manager wires this up; resolves when the user replies
    return await sessionManager.askUser(threadId, questions);
  },
});

Renderer keys (used by chat-ui to dispatch to a renderer):
- mcp__notifications__notify: type (progress / completed / error / info), title, body
- mcp__interaction__ask_user: array of questions; supports freeform / single-select / multi-select / yes-no
For unit tests, the handlers are also exported as makeNotifyHandler and
makeAskUserHandler, so you can exercise the business logic without spinning
up the SDK.
SSE encoder
import { encodeSSE } from '@psci-labs/chat-runtime';
const stream = encodeSSE(eventIterable, { signal: req.signal });
return new Response(stream, {
  headers: { 'content-type': 'text/event-stream' },
});

encodeSSE returns a pull-based ReadableStream, so backpressure is automatic. A
custom formatter hook lets you set the SSE event: line when you want typed
events. Stream cancellation calls iter.return?.() so the upstream agent can
release resources promptly.
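The pull-based design and the cancellation behavior can be sketched with the standard `ReadableStream` API. This is not the package's `encodeSSE`: `toSSEStream`, `formatFrame`, and the `SSEFrame` type are hypothetical names, and the real formatter hook may have a different signature.

```typescript
interface SSEFrame {
  event?: string; // optional SSE "event:" line for typed events
  data: unknown;  // JSON-serialized into the "data:" line
}

// Serialize one frame in text/event-stream wire format.
function formatFrame(frame: SSEFrame): string {
  const eventLine = frame.event ? `event: ${frame.event}\n` : '';
  return `${eventLine}data: ${JSON.stringify(frame.data)}\n\n`;
}

// Hypothetical stand-in for encodeSSE: wraps an async iterable in a pull-based stream.
function toSSEStream(source: AsyncIterable<SSEFrame>): ReadableStream<Uint8Array> {
  const iter = source[Symbol.asyncIterator]();
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    // pull() is only called while the stream's internal queue has room,
    // so a slow reader throttles how fast the source is advanced.
    async pull(controller) {
      const next = await iter.next();
      if (next.done) {
        controller.close();
      } else {
        controller.enqueue(encoder.encode(formatFrame(next.value)));
      }
    },
    // cancel() runs when the reader goes away; let the upstream iterator clean up.
    async cancel() {
      await iter.return?.();
    },
  });
}
```

When the source is an async generator, `iter.return()` triggers its `finally` blocks, which is where an agent loop would typically release its resources.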
Session manager
The session manager is the runtime's brain. It owns:
- The active-sessions map keyed by threadId
- A MessagePump<UserPrompt> per session (queues follow-up messages while the agent is busy, drains as the SDK pulls)
- Lifecycle: start / continueSession / terminate / clearContext
- Persistence checkpointing at session boundaries (sdkVersion always recorded; state carries previous-session IDs and cost/turn metrics)
- respondToTool(threadId, answer) to resolve the mcp__interaction__ask_user waiter
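The message pump in the list above is essentially a queue exposed as an async iterable: producers push whenever they like, and the consumer pulls at its own pace. A minimal sketch of that idea follows; the class body is an assumption for illustration, and the package's actual `MessagePump` may differ in API and behavior.

```typescript
// Hypothetical sketch of a message pump: a push-side queue exposed as an AsyncIterable.
class MessagePump<T> implements AsyncIterable<T> {
  private queue: T[] = [];
  private waiters: Array<(result: IteratorResult<T>) => void> = [];
  private closed = false;

  // Called by the producer (e.g. the HTTP layer) when a follow-up message arrives.
  push(item: T): void {
    if (this.closed) throw new Error('pump is closed');
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter({ value: item, done: false }); // a consumer is already waiting
    } else {
      this.queue.push(item); // consumer is busy: buffer until it pulls
    }
  }

  // Ends the stream; consumers blocked in next() are released.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.queue.length > 0) {
          return Promise.resolve({ value: this.queue.shift() as T, done: false });
        }
        if (this.closed) {
          return Promise.resolve({ value: undefined, done: true });
        }
        // Nothing buffered yet: park until push() or close() is called.
        return new Promise((resolve) => {
          this.waiters.push(resolve);
        });
      },
    };
  }
}
```

An object like this can be handed to a consumer that accepts an async iterable of prompts, while request handlers keep calling `push()` as new user messages arrive.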
The SDK call is injected via a runAgent: (opts) => AsyncIterable<SDKEvent>
factory, so tests drive the manager with canned event sequences and the
Phase 2E runner factory binds the real query() call (with per-session
MCP wiring).
import { SessionManager, InMemoryPersistence } from '@psci-labs/chat-runtime';
import { query } from '@anthropic-ai/claude-agent-sdk';
const mgr = new SessionManager({
  persistence: new InMemoryPersistence(),
  sdkVersion: '0.1.77',
  runAgent: ({ prompts, signal, resumeSessionId }) =>
    query({ prompt: prompts, options: { signal, resume: resumeSessionId } }),
});
const events = await mgr.start({ threadId, userMessage: { text: 'Hello' } });
// pipe through encodeSSE → return as Response

HTTP handlers
Four standalone handlers, one per route. Each takes (req, ctx) where
ctx carries the resolved userContext, threadId, sessionManager, and
persistence. The router (Phase 2E) is responsible for parsing the URL
and resolving auth before dispatching.
| Handler | Route | Behavior |
| --------------- | -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| handleStream | POST /threads/:id/stream | Body { message: { text } }. Starts a session, returns text/event-stream. 400 invalid body, 409 session already running. Forwards req.signal so client disconnects abort the agent. |
| handleHistory | GET /threads/:id/history | Optional ?sessionId. Returns { messages: StoredMessage[] }. |
| handleCancel | POST /threads/:id/cancel | Idempotent. terminate(threadId) → 200. |
| handleClear | POST /threads/:id/clear | Idempotent. clearContext(threadId) → 200. The current session is archived; its sessionId is queued for the next session's previousSessionIds. |
Status codes are surfaced explicitly so the UI can react: 409 says "the agent is busy, you might want to cancel"; 400 says "your client sent something I can't parse." Auth-derived 401s come from the router, not the handlers.
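The URL-parsing half of the router's job can be sketched as a pure function over the four routes in the table. `matchRoute`, the regular expression, and the 404 / 405 results are assumptions made for illustration; the source does not specify how the real router reports unknown paths or wrong methods.

```typescript
type Action = 'stream' | 'history' | 'cancel' | 'clear';

interface RouteMatch {
  action: Action;
  threadId: string;
}

// Method each route accepts, taken from the table above.
const METHODS: Record<Action, 'GET' | 'POST'> = {
  stream: 'POST',
  history: 'GET',
  cancel: 'POST',
  clear: 'POST',
};

// Returns the matched route, or an HTTP status (404 / 405) when nothing matches.
// The pattern is unanchored at the start so a mount prefix such as /api/chat is tolerated.
function matchRoute(method: string, pathname: string): RouteMatch | 404 | 405 {
  const m = /\/threads\/([^/]+)\/(stream|history|cancel|clear)\/?$/.exec(pathname);
  if (!m) return 404;
  const action = m[2] as Action;
  if (METHODS[action] !== method.toUpperCase()) return 405;
  return { action, threadId: decodeURIComponent(m[1]) };
}
```

A router built this way would resolve auth first, call `matchRoute`, and then dispatch to the matching handler with the assembled ctx.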
Putting it all together: createAgentRunner
// app/api/chat/[...path]/route.ts (Next.js App Router)
import { createAgentRunner } from '@psci-labs/chat-runtime';
import { getServerSession } from 'next-auth';
const runner = createAgentRunner({
  apiKey: process.env.ANTHROPIC_API_KEY,
  model: 'claude-sonnet-4-6',
  getUserContext: async (req) => {
    const session = await getServerSession();
    return session ? { userId: session.user.id } : null;
  },
  systemPrompt: { preset: 'claude_code', append: 'Use only SharePoint MCP tools.' },
  allowedTools: ['Read', 'Grep', 'Glob', 'WebSearch'],
  mcpServers: { sharepoint: sharepointMcpServer },
  onNotify: async ({ type, title, body, userContext, threadId }) => {
    await pushService.send(userContext.userId, { type, title, body, threadId });
  },
  onAskUser: async ({ questions, threadId }) => {
    return await ui.deliverAndAwait(threadId, questions);
  },
});
export const { GET, POST } = runner.toNextHandlers();

The factory:
- Defaults persistence to InMemoryPersistence if omitted
- Pins sdkVersion to the bundled SDK release (overridable via sdkVersion)
- Defaults disallowedTools to ['AskUserQuestion'] so the SDK's built-in AskUser stops shadowing the mcp__interaction__ask_user MCP renderer. Pass an explicit empty array (disallowedTools: []) to opt out and allow the SDK's version.
- Always registers the interaction MCP server. If you don't supply an onAskUser callback, the runtime supplies a default that resolves via the session manager's pendingToolResponse slot, wired to POST /threads/:id/respond. Apps with custom AskUser delivery (Slack-mediated, server-side automation, etc.) override the default by supplying their own onAskUser.
- Builds the per-session notifications MCP server only when onNotify is supplied
- Exposes sessionManager and persistence on the returned AgentRunner so apps can read history or call respondToTool outside the HTTP layer
- Accepts a runAgent override for tests and the future opencode/pi adapter
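The defaulting rules in this list are easy to get subtly wrong, particularly the difference between an omitted disallowedTools and an explicit empty array. The sketch below shows one way to express them; `resolveOptions`, the option and result types, and the placeholder version string are all hypothetical and are not the factory's actual internals.

```typescript
// Hypothetical option shape, limited to the defaults described above.
interface RunnerOptions {
  sdkVersion?: string;
  disallowedTools?: string[];
  onNotify?: (payload: unknown) => Promise<void>;
  onAskUser?: (payload: unknown) => Promise<unknown>;
}

interface ResolvedOptions {
  sdkVersion: string;
  disallowedTools: string[];
  mcpServerNames: string[];
  askUserMode: 'custom' | 'default';
}

// Placeholder value; the real runtime pins the bundled SDK release.
const BUNDLED_SDK_VERSION = '0.0.0-placeholder';

function resolveOptions(opts: RunnerOptions): ResolvedOptions {
  return {
    sdkVersion: opts.sdkVersion ?? BUNDLED_SDK_VERSION,
    // `??` keeps an explicit [] (the opt-out) and only fills in when the option is omitted.
    disallowedTools: opts.disallowedTools ?? ['AskUserQuestion'],
    // interaction is always registered; notifications only with an onNotify callback.
    mcpServerNames: opts.onNotify ? ['interaction', 'notifications'] : ['interaction'],
    askUserMode: opts.onAskUser ? 'custom' : 'default',
  };
}
```

Using `??` rather than `||` or a length check is the design point here: an empty array is a deliberate choice by the caller and has to survive defaulting.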
runner.handle(req) is also a Web-standard handler — works under any framework that exposes Request/Response (Hono, Fastify with @fastify/web-fetch, plain Node 22+ HTTP server, etc.).
