@koda-sl/baker-bridge
v0.39.3
HTTP server wrapping the Claude Agent SDK. Persists chat events to Convex; the dashboard subscribes via Convex realtime
HTTP server wrapping the Claude Agent SDK. Runs inside a sandbox, persists every chat event directly to Convex, and exposes HTTP endpoints for dispatch / abort / answer-question. The dashboard subscribes to Convex realtime — there is no chat WebSocket.
Installation
npm install @koda-sl/baker-bridge
# or
pnpm add @koda-sl/baker-bridge
Quick Start
CLI
npx @koda-sl/baker-bridge hono
npx @koda-sl/baker-bridge --version
Programmatic
import { createServer } from "@koda-sl/baker-bridge/hono";
const server = createServer();
await server.start();
// Server listening on http://0.0.0.0:3000
Environment Variables
| Variable | Description |
|---|---|
| AUTH_TOKEN | Bearer token for /message/* and /answer-question |
| BAKER_CONVEX_SITE_URL | Convex site URL for chat persistence (e.g. https://your-deployment.convex.site) |
| BAKER_API_KEY | API key for Convex callbacks (bk_...) |
| INTERNAL_SECRET_KEY | AES key for encrypting flow _data.json config at rest |
| ANTHROPIC_API_KEY | Anthropic API key (required by the Agent SDK) |
| NOTDIAMOND_ENABLED | "true" to route each prompt through NotDiamond's model selector. Defaults to false — every turn runs on Opus 4.6 (claude-opus-4-6) |
| NOTDIAMOND_API_KEY | NotDiamond bearer token. Required when NOTDIAMOND_ENABLED=true |
All variables are validated at startup via @t3-oss/env-core + zod.
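The one cross-field rule in the table above (NOTDIAMOND_API_KEY is only required when the router is on) can be illustrated with a dependency-free sketch. This is not the bridge's actual @t3-oss/env-core + zod schema; the function name checkBridgeEnv and the error strings are hypothetical, and only the two requirements the table states explicitly are checked.

```typescript
// Dependency-free illustration; the real bridge validates with @t3-oss/env-core + zod.
type Env = Record<string, string | undefined>;

// Returns human-readable problems; an empty array means both stated rules pass.
function checkBridgeEnv(env: Env): string[] {
  const problems: string[] = [];

  // The table states that the Agent SDK requires this key.
  if (!env["ANTHROPIC_API_KEY"]) {
    problems.push("ANTHROPIC_API_KEY is required");
  }

  // The router counts as enabled only when the flag is the string "true".
  const routerEnabled = env["NOTDIAMOND_ENABLED"] === "true";
  if (routerEnabled && !env["NOTDIAMOND_API_KEY"]) {
    problems.push("NOTDIAMOND_API_KEY is required when NOTDIAMOND_ENABLED=true");
  }

  return problems;
}
```

Calling checkBridgeEnv with the process environment at startup and refusing to boot on a non-empty result would mirror the fail-fast behaviour described above.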
Model selection
The bridge picks a Claude model per turn before calling the Agent SDK:
- Default (router disabled): every turn runs on claude-opus-4-6.
- Router enabled (NOTDIAMOND_ENABLED=true): the prompt + repo context are sent to NotDiamond with three candidates (Haiku 4.5, Sonnet 4.6, Opus 4.6). NotDiamond's default selection tradeoff is used, with no latency/cost override. The returned OpenRouter-formatted ID is normalized to the Anthropic-native form (e.g. anthropic/claude-opus-4.6 → claude-opus-4-6) before being passed to query({ options: { model } }). On any router error or empty response, it falls back to Opus 4.6.
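The normalization and fallback step might look roughly like the sketch below. The function name normalizeModelId is hypothetical, and the rule (strip the anthropic/ prefix, turn dots into hyphens) is an assumption generalized from the single example above; the bridge's real mapping may be a lookup table instead.

```typescript
// Fallback model named in the README.
const DEFAULT_MODEL = "claude-opus-4-6";

// Hypothetical helper: converts an OpenRouter-style ID to an Anthropic-native one.
function normalizeModelId(routerId: string | null | undefined): string {
  // An empty or missing router response falls back to the default model.
  if (!routerId || routerId.trim() === "") return DEFAULT_MODEL;

  // Drop the provider prefix, e.g. "anthropic/claude-opus-4.6" -> "claude-opus-4.6".
  const withoutProvider = routerId.trim().replace(/^anthropic\//, "");

  // Assumed rule: the native form uses hyphens where the OpenRouter form uses dots.
  return withoutProvider.replace(/\./g, "-");
}
```

A caller would wrap the router request in a try/catch and return DEFAULT_MODEL from the catch branch, so that router errors and empty responses take the same path.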
Endpoints
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /health | None | Health check — returns { status, projectDir } |
| GET | /commands | None | List available slash commands (cached 5m) |
| GET | /company | None | Return COMPANY.md as { content, title } |
| GET | /brand | None | Return BRAND.md as { content, title } |
| GET | /knowledge/:collection/:entry | None | Return knowledge entry as { content, title } |
| GET | /documents/:slug | None | Serve document binary (pdf, pptx, docx, xlsx) |
| GET | /landings/:slug/definition | None | Return landing _definition.md as { content, title } |
| GET | /flows/:slug | None | Read flow _data.json (decrypted) |
| PUT | /flows/:slug | None | Write flow _data.json (encrypts at rest) |
| POST | /message/async | Bearer token | Dispatch a chat turn — fire-and-forget, 202 immediately |
| POST | /message/abort | Bearer token | Abort a running query — kills the SDK iterator |
| POST | /answer-question | Bearer token | Resolve a pending AskUserQuestion |
Agent scheduling
The bridge disables Claude Code's native cron tools (CronCreate, CronDelete,
CronList, and ScheduleWakeup) in Agent SDK sessions. Scheduled work must use
Baker's product surface instead: baker scheduled-actions create/update/delete/trigger.
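The README does not say how the tools are disabled. One plausible mechanism is a deny-list passed along with the per-turn model in the session options. In the sketch below the four tool names come from the text above, while the disallowedTools field name, the SessionOptionsSketch type, and buildSessionOptions are assumptions made for illustration.

```typescript
// Tool names as listed in the README.
const DISABLED_CRON_TOOLS = ["CronCreate", "CronDelete", "CronList", "ScheduleWakeup"] as const;

// Hypothetical shape; only `model` is confirmed by the README's query({ options: { model } }).
interface SessionOptionsSketch {
  model: string;
  disallowedTools: string[]; // assumed field name for the deny-list
}

// Builds options for one turn with the native scheduling tools denied.
function buildSessionOptions(model: string): SessionOptionsSketch {
  return { model, disallowedTools: [...DISABLED_CRON_TOOLS] };
}
```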
Chat flow
Convex action ──POST /message/async──▶ Bridge AgentSession
                                               │
                                               ▼
                                      SDK query() iterator
                                               │
                   ┌───────────────────────────┼───────────────────────────┐
                   ▼                           ▼                           ▼
           /api/chat/stream            /api/chat/event            /api/chat/complete
           (throttled ~50ms            (assistant events,         (final cost / status)
            text + tool list,           idempotent on uuid)
            replaces preview)
                                               │
                                               ▼
                                      Convex (single source
                                       of truth: UI reads
                                       via realtime query)

- Live preview is throttled to ~50ms: the bridge buffers text deltas and in-flight tool starts and POSTs them to /api/chat/stream. The Convex mutation overwrites thread.streamingState. The UI reactively re-renders.
- Persisted messages go to /api/chat/event with the SDK's event.uuid for idempotent insert. The mutation also clears streamingState inline so the UI snaps from preview to persisted text without a flicker.
- Stop button flips the thread to cancelled via Convex first (UI is freed immediately) and best-effort POSTs /message/abort to kill the SDK.
- AskUserQuestion posts /api/chat/input-request and blocks until the dashboard answers via Convex action → bridge /answer-question.
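The live-preview throttle in the first bullet can be sketched as a small buffer that emits a full snapshot at most once per interval. The class name PreviewThrottle, its method names, and the payload shape are hypothetical; the clock is passed in explicitly so the behaviour is easy to test. Each snapshot carries the whole accumulated text because the Convex mutation overwrites streamingState rather than appending to it.

```typescript
// Hypothetical payload shape for one POST to /api/chat/stream.
interface StreamPayload {
  text: string;
  tools: string[];
}

class PreviewThrottle {
  private readonly intervalMs: number;
  private text = "";
  private tools: string[] = [];
  private dirty = false;
  private lastFlushAt = Number.NEGATIVE_INFINITY;

  constructor(intervalMs: number = 50) {
    this.intervalMs = intervalMs;
  }

  // Buffer a text delta from the SDK iterator.
  addText(delta: string): void {
    this.text += delta;
    this.dirty = true;
  }

  // Record a tool that has started but not yet finished.
  addToolStart(name: string): void {
    this.tools.push(name);
    this.dirty = true;
  }

  // Returns a snapshot when something changed and the interval has elapsed, else null.
  maybeFlush(nowMs: number): StreamPayload | null {
    if (!this.dirty || nowMs - this.lastFlushAt < this.intervalMs) return null;
    this.lastFlushAt = nowMs;
    this.dirty = false;
    return { text: this.text, tools: [...this.tools] };
  }
}
```

A caller would invoke maybeFlush(Date.now()) after each SDK event and POST any non-null result to /api/chat/stream.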
Attachment validation
When the bridge downloads attachments referenced in a chat dispatch, it sniffs each file's magic bytes before handing it to the agent:
- A file whose extension claims to be an image (.jpg, .jpeg, .png, .gif, .webp) but whose bytes are a different image format is renamed on disk to match the real format. Anthropic infers media_type from the extension; mislabeled files cause Could not process image 400s.
- A file claiming to be an image but whose bytes are HTML, plain text, or any unsupported format is dropped with a console.error. This typically means the source URL returned an error page. It is better to ship the run without that one attachment than to abort everything.
- Images larger than 5 MB raw are dropped, because base64 encoding would push them past Anthropic's effective vision endpoint ceiling.
The same magic-byte rules are enforced as a pre-commit hook in
apps/scaffold-template (validate-images). Extension mismatches are
auto-fixed: the hook renames the file to match its real format, updates
all references in source files, and stages the changes. Corrupted or
oversized images still block the commit.
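The magic-byte rules above can be sketched as follows. The signatures for PNG, JPEG, GIF, and WebP are the standard ones; the function names, the Verdict type, and the reading of "5 MB" as 5 × 1024 × 1024 bytes are assumptions, and the order of the size check relative to the format check is a guess.

```typescript
type ImageFormat = "png" | "jpeg" | "gif" | "webp";

// True when `bytes` contains `sig` starting at `offset`.
function hasSignature(bytes: Uint8Array, sig: number[], offset: number = 0): boolean {
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => bytes[offset + i] === b);
}

// Identify the real format from the leading bytes, or null if unsupported.
function sniffImageFormat(bytes: Uint8Array): ImageFormat | null {
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  if (hasSignature(bytes, [0x47, 0x49, 0x46, 0x38])) return "gif"; // "GIF8"
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8.
  if (hasSignature(bytes, [0x52, 0x49, 0x46, 0x46]) && hasSignature(bytes, [0x57, 0x45, 0x42, 0x50], 8)) return "webp";
  return null;
}

const EXT_TO_FORMAT: Record<string, ImageFormat> = {
  ".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png", ".gif": "gif", ".webp": "webp",
};
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // assumed interpretation of "5 MB raw"

type Verdict =
  | { action: "keep" }
  | { action: "rename"; to: ImageFormat }
  | { action: "drop"; reason: string };

function validateAttachment(fileName: string, bytes: Uint8Array): Verdict {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  const claimed = EXT_TO_FORMAT[ext];
  if (claimed === undefined) return { action: "keep" }; // not claiming to be an image

  const actual = sniffImageFormat(bytes);
  if (actual === null) return { action: "drop", reason: "bytes are not a supported image" };
  if (bytes.length > MAX_IMAGE_BYTES) return { action: "drop", reason: "image exceeds 5 MB" };
  return actual === claimed ? { action: "keep" } : { action: "rename", to: actual };
}
```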
Resilience
- All Convex POSTs retry 3× with exponential backoff (1s, 2s, 4s).
- Persistent failures of /api/chat/event and /api/chat/complete are appended to ~/.baker/relay-errors.jsonl for post-mortem.
- The SDK iterator is wrapped in a 30 min watchdog: if no event arrives for that long, the run is aborted and the thread is marked errored.
- A 45s heartbeat keeps the Convex thread alive during long tool executions when no SDK events are flowing. Without it, the 2 min stale-thread cron would falsely time out active threads.
- The session registry evicts entries idle for >24h to bound memory.
- A new chat that arrives while a previous one is still running aborts the previous run with a 30s timeout cap — a hung run cannot block the next message.
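The retry policy in the first bullet can be sketched as below. It assumes "3×" means three retries after the initial attempt (four attempts in total), which is one reading of the 1s, 2s, 4s schedule. The names backoffDelays and withRetry are hypothetical, and the sleep function is injected so the helper stays testable.

```typescript
// Delay schedule: baseMs, 2*baseMs, 4*baseMs, ... one entry per retry.
function backoffDelays(retries: number = 3, baseMs: number = 1000): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * Math.pow(2, i));
}

// Runs `attempt`, retrying after each delay; rethrows the last error when retries run out.
async function withRetry<T>(
  attempt: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  delays: number[] = backoffDelays(),
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= delays.length; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      const delay = delays[i];
      if (delay === undefined) break; // no retries left
      await sleep(delay);
    }
  }
  throw lastError;
}
```

When withRetry finally throws, the caller is the natural place to append a line to ~/.baker/relay-errors.jsonl.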
Sending a dispatch by hand
curl -X POST http://localhost:3000/message/async \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"threadId": "thread_abc123", "prompt": "Analyze the campaign performance"}'
The bridge processes in the background and writes to Convex; nothing is returned in the dispatch response besides 202.
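The same dispatch can be assembled from TypeScript. The sketch below only builds the request, mirroring the curl call above; buildDispatchRequest and the DispatchRequest type are hypothetical names, and the body contains only the two fields shown in the curl example.

```typescript
interface DispatchRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the POST /message/async request without sending it.
function buildDispatchRequest(
  baseUrl: string,
  authToken: string,
  threadId: string,
  prompt: string,
): DispatchRequest {
  return {
    // Trim trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/message/async`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${authToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ threadId, prompt }),
  };
}
```

Passing the result to fetch(req.url, { method: req.method, headers: req.headers, body: req.body }) should yield a 202 status; the actual chat output arrives through Convex, not through this response.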
Architecture
- One session per thread: getOrCreateSession(threadId) ensures every message on the same thread shares an AgentSession. Multi-turn context is preserved across bridge restarts via ~/.baker/sessions/<threadId>.json.
- Run-token cancellation: each call to sendAndStream mints a Symbol; if the symbol changes (abort, replace) the run cleans up and exits without writing further. No globally-mutable abort flag.
- Single-writer to Convex: there is no parallel WebSocket. Convex is the only place the UI reads state from, eliminating dual-write races.
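Run-token cancellation can be sketched in a few lines. The RunTokenGuard class and the drainEvents helper are hypothetical stand-ins for whatever sendAndStream does internally; the point is that each run compares its own Symbol against the current one before every write, so a stale run stops on its own.

```typescript
class RunTokenGuard {
  private current: symbol | null = null;

  // Start a new run; any earlier token becomes stale.
  begin(): symbol {
    const token = Symbol("run");
    this.current = token;
    return token;
  }

  // Abort whatever run is active.
  abort(): void {
    this.current = null;
  }

  isCurrent(token: symbol): boolean {
    return this.current === token;
  }
}

// Writes events until the run's token is no longer current; returns the number written.
function drainEvents(
  guard: RunTokenGuard,
  token: symbol,
  events: string[],
  write: (event: string) => void,
): number {
  let written = 0;
  for (const event of events) {
    if (!guard.isCurrent(token)) break; // replaced or aborted: stop writing
    write(event);
    written++;
  }
  return written;
}
```

Because every Symbol is unique, a replaced run can never mistake the new run's token for its own, which is what removes the need for a shared mutable abort flag.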
Development
# From monorepo root
pnpm --filter @koda-sl/baker-bridge dev # Watch mode with tsx
pnpm --filter @koda-sl/baker-bridge build # Compile to dist/
Local linking
pnpm --filter @koda-sl/baker-bridge link:local # Build + link for local testing
pnpm --filter @koda-sl/baker-bridge unlink:local # Remove link, restore registry version
Publishing
Auto-publish (CI)
Pushing to main with changes in packages/bridge/ triggers the GitHub Actions
workflow, which publishes to npm with the latest tag by default. To publish a
pre-release, use the workflow dispatch with npm_tag: next.
Manual publish
From the monorepo root:
./scripts/publish-package.sh bridge # @latest
./scripts/publish-package.sh bridge next # @next pre-release
Bump version in package.json (and the VERSION const in src/cli.ts) before publishing.
Testing a pre-release in sandboxes
npx convex env set BAKER_BRIDGE_VERSION <version>
# … test …
npx convex env remove BAKER_BRIDGE_VERSION
Debugging relay failures
When postToConvex exhausts all retries, the error is appended to ~/.baker/relay-errors.jsonl on the sandbox filesystem. Each line is a JSON object:
{"timestamp":"2026-04-24T15:19:55.123Z","path":"/api/chat/complete","threadId":"abc123","status":502,"responseBody":"Bad Gateway"}
{"timestamp":"2026-04-24T15:20:01.456Z","path":"/api/chat/event","threadId":"abc123","error":"fetch failed"}
To inspect on a running sandbox:
cat ~/.baker/relay-errors.jsonl
If a thread is stuck in "streaming", this file tells you whether the bridge failed to reach Convex (and why: auth, network, HTTP status).
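For a longer log, a small parser makes the entries easier to scan. The sketch below assumes only the fields visible in the two sample lines above; the RelayError type and both function names are hypothetical, and real entries may carry additional fields.

```typescript
// Fields as they appear in the sample lines; status/responseBody and error are alternatives.
interface RelayError {
  timestamp: string;
  path: string;
  threadId: string;
  status?: number;
  responseBody?: string;
  error?: string;
}

// Parses JSONL text (one JSON object per line), skipping blank lines.
function parseRelayErrors(jsonl: string): RelayError[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .map((line) => JSON.parse(line) as RelayError);
}

// One-line summary: an HTTP status when present, otherwise the network error text.
function describeRelayError(e: RelayError): string {
  const cause = e.status !== undefined ? `HTTP ${e.status}` : (e.error ?? "unknown");
  return `${e.timestamp} ${e.path} thread=${e.threadId}: ${cause}`;
}
```

Reading the file's contents into a string and mapping describeRelayError over parseRelayErrors gives one summary line per failed relay.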
Requirements
- Node.js 18+
- ANTHROPIC_API_KEY with access to the Claude Agent SDK
