cf-browser-fleet
v5.0.0
Cloudflare Browser Rendering fleet for mesh network agentic tasks
Cf Browser Fleet
Browser-fleet runs always-warm headless Chromium sessions and now exposes
speech-capable endpoints plus a real browser-executed image runtime for
`ssd-1b-lcm-int8`.
The public ingress lives in `apps/edge-workers`. The supported image chain is:
client -> edge-workers -> cf-browser-fleet -> session pool / DO -> Cloudflare Browser Rendering
The pool is internal to browser-fleet, not a hop after it. Aeon Shell local is
a separate local/offline lane that shares the same `/runtime/image/models/...`
artifacts through the edge gateway.
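As a rough illustration of the chain above, a client only addresses the edge ingress; every later hop is internal. The sketch below builds such a request. The base URL, the JSON field names (`model`, `prompt`), the helper name `buildImageGenerationRequest`, and the assumption that the edge exposes `/v1/images/generations` at the same path are all hypothetical, not a documented contract.

```typescript
// Hypothetical request descriptor; the real payload schema is not documented here.
interface ImageGenerationRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a request aimed at the public edge ingress. The client never talks to
// the session pool or Cloudflare Browser Rendering directly.
function buildImageGenerationRequest(
  edgeBaseUrl: string,
  prompt: string,
): ImageGenerationRequest {
  // Trim trailing slashes so the path is joined exactly once.
  const url = edgeBaseUrl.replace(/\/+$/, "") + "/v1/images/generations";
  return {
    url,
    method: "POST",
    headers: { "content-type": "application/json" },
    // Field names are assumed; only the model ID comes from this README.
    body: JSON.stringify({ model: "ssd-1b-lcm-int8", prompt }),
  };
}

const imageRequest = buildImageGenerationRequest(
  "https://edge.example.com/",
  "a lighthouse at dusk",
);
// Sending it would look like: await fetch(imageRequest.url, imageRequest)
```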
Runtime Focus
- Capability-aware warm pools for `speech-tts`, `speech-stt`, and `speech-duplex`
- Browser-session execution for `ssd-1b-lcm-int8` via the stable-origin `/runtime/image` page and `/v1/images/generations`
- Stable-origin speech artifact serving via `/runtime/speech/*` so browser and shell runtimes can cache `moss-tts-local` and `whisper-small` with the same shared manifest contract
- Browser-fleet `/v1/speech/models` now mirrors the edge-worker speech catalog so the fleet and app runtime share one canonical speech model policy
- Browser-fleet-owned `/wasm-tts`, `/wasm-stt`, `/v1/audio/speech`, and `/v1/audio/transcriptions`
- Session health aggregation for `cold`, `warming`, `ready`, and `degraded` speech capability states
- A durable session registry so list and health endpoints do not depend solely on KV list consistency
- Session reuse and introspection now reconcile registry/KV snapshots against live durable-object state, pruning stale "ready" sessions before they skew health checks or trigger unnecessary browser launches
- When Browser Rendering launch pressure is high, the fleet can repurpose an already-ready speech browser for the other speech capability instead of forcing a second cold launch
- Warm defaults are now intentionally aggressive: 6 TTS browsers, 3 STT browsers, 2 duplex browsers, and 6 image browsers by default, plus spare total pool headroom so browser-side media stays hot even through ugly cold-start windows
- Image warmups now default to a 10 minute budget so full model artifacts can hydrate into browser storage without fleet-side OPFS size pruning during warmup
- Image inference jobs now default to a 12 minute timeout so cold image sessions can finish full artifact hydration plus generation before the fleet gives up
- Pool maintenance now fans out missing browser launches in parallel, so the cron pass can actually reach those higher warm targets instead of spending its budget on serialized Browser Rendering boots
- Launch fan-out is intentionally bounded to avoid Browser Rendering `429` bursts while still filling the pool faster than one-browser-at-a-time
- Speech, image, session, and health requests now also schedule background pool maintenance so the fleet can self-heal warm deficits between cron ticks
- Optional coordinator warmup (`MOSS_WARM_POOL_MODELS`, `MOSS_WARM_POOL_TIMEOUT_MS`) keeps all configured MOSS coordinators hot via edge `/ai/warmup` during maintenance passes
- Direct browser clients can target fleet-owned speech routes and forward `X-Speech-Upstream-Base` to the edge gateway when needed, while the worker itself resolves speech catalog lookups through the bound edge service
- Browser image requests hard-error when they target disabled synthetic IDs or try to bypass the browser/shell runtime
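The warm-target and bounded fan-out items above can be sketched roughly as follows. This is a minimal illustration, not the fleet's actual maintenance code: the `image` capability key, the helper names `missingLaunches` and `fanOutLaunches`, the `maxParallel` cap, and the injected `launch` callback are assumptions. Only the default targets (6 TTS, 3 STT, 2 duplex, 6 image) come from the list above.

```typescript
type Capability = "speech-tts" | "speech-stt" | "speech-duplex" | "image";

// Default warm targets as stated in this README; the "image" key name is assumed.
const WARM_TARGETS: Record<Capability, number> = {
  "speech-tts": 6,
  "speech-stt": 3,
  "speech-duplex": 2,
  image: 6,
};

// Return one entry per missing browser, e.g. ["speech-tts", "image"].
function missingLaunches(
  ready: Partial<Record<Capability, number>>,
): Capability[] {
  const missing: Capability[] = [];
  for (const cap of Object.keys(WARM_TARGETS) as Capability[]) {
    const deficit = Math.max(0, WARM_TARGETS[cap] - (ready[cap] ?? 0));
    for (let i = 0; i < deficit; i++) missing.push(cap);
  }
  return missing;
}

// Launch in batches of at most `maxParallel`, so boots overlap without
// bursting past the upstream rate limit. `launch` is injected; a real one
// would start a Browser Rendering session. Returns the number that succeeded.
async function fanOutLaunches(
  missing: Capability[],
  maxParallel: number,
  launch: (cap: Capability) => Promise<void>,
): Promise<number> {
  let launched = 0;
  for (let i = 0; i < missing.length; i += maxParallel) {
    const batch = missing.slice(i, i + maxParallel);
    // A failed launch counts as "not launched" instead of aborting the batch.
    const results = await Promise.all(
      batch.map((cap) => launch(cap).then(() => true, () => false)),
    );
    launched += results.filter(Boolean).length;
  }
  return launched;
}
```

Batching is the simplest way to bound concurrency. A sliding-window limiter would keep the pipe fuller, at the cost of more bookkeeping.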
Cloud Run is not part of the default browser-fleet image path.
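The hard-error rule for image requests could look something like the guard below. It is a sketch under stated assumptions: the deny-list entry `example-synthetic-model`, the `ImageRuntime` values, and the function name `assertImageRequestAllowed` are hypothetical, since this README does not name the disabled IDs or the request fields.

```typescript
// "direct" stands in for any path that skips the browser/shell runtime (assumed label).
type ImageRuntime = "browser" | "shell" | "direct";

// Hypothetical deny-list; the actual disabled synthetic IDs are not listed here.
const DISABLED_SYNTHETIC_IDS = new Set<string>(["example-synthetic-model"]);

// Throw instead of silently falling back, mirroring the "hard-error" wording.
function assertImageRequestAllowed(modelId: string, runtime: ImageRuntime): void {
  if (DISABLED_SYNTHETIC_IDS.has(modelId)) {
    throw new Error(`model "${modelId}" is a disabled synthetic ID`);
  }
  if (runtime !== "browser" && runtime !== "shell") {
    throw new Error(`runtime "${runtime}" bypasses the browser/shell runtime`);
  }
}

// A supported request passes without throwing.
assertImageRequestAllowed("ssd-1b-lcm-int8", "browser");
```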
Subdirectories
Last Updated: 2026-03-07
