@flopod/runtime
v0.3.1
Published
The durable workflow execution engine and HTTP server. Workflow authors using the compiler never import it directly — the compiler wires it up. For hand-authored runtime projects, import from `@flopod/runtime`. For the conceptual picture (how this fits th
Readme
@flopod/runtime
The durable workflow execution engine and HTTP server. Workflow authors using the
compiler never import it directly — the compiler wires it up. For hand-authored runtime
projects, import from @flopod/runtime. For the conceptual picture (how this fits the
rest of the system), see ../../docs/architecture.md.
Initialization
import { createRuntime, SqliteBackend } from '@flopod/runtime'
const backend = new SqliteBackend('./workflow.db')
const runId = process.env.FLOPOD_RUN_ID ?? 'main-run'
const wf = await createRuntime({ runId, backend })On createRuntime:
- Loads the full event log for
runIdfrom the backend — one read, once - Builds in-memory state: completed activities, var snapshots, branch history
- All subsequent checks are pure memory lookups
Activity primitives
wf.activity(id, fn, options?)
The atomic unit of work. Leaf activities can do anything. Return value is stored in the event log and returned from cache on replay.
const pokemon = await wf.activity('fetch', () => fetchPokemon('bulbasaur'))On resume: returns stored output without calling fn. Return value must be serializable.
Options:
{ timeoutMs?: number, retry?: { maxAttempts: number, backoff?: 'fixed' | 'exponential', delayMs?: number } }wf.attempt(id, fn, policy)
Activity with retry policy. Returns null if all attempts fail.
const result = await wf.attempt('classify', () => classify(item), { maxAttempts: 3 })Control flow primitives
wf.branch(id, key, cases)
Durable conditional. The branch taken is snapshotted — on resume the same branch is replayed.
await wf.branch('handle', status, {
active: async () => { ... },
suspended: async () => { ... },
deleted: async () => { ... },
})wf.each(id, items, fn, options?)
Durable iteration. Each item's result is stored individually — completed iterations are skipped on replay.
const { results, errors } = await wf.each('process', items, async (item, index) => {
return wf.activity(`enrich-${index}`, () => enrich(item))
}, { concurrency: 3 })wf.repeat(id, fn)
Durable loop. Runs until STOP is returned.
import { STOP } from '@flopod/runtime'
await wf.repeat('paginate', async (iteration) => {
const page = await wf.activity(`fetch-${iteration}`, () => fetchPage(iteration))
if (!page.hasMore) return STOP
})wf.parallel(id, fns)
Fan-out — all functions execute concurrently.
const results = await wf.parallel('fetch-all', names.map(name =>
() => wf.activity(`fetch-${name}`, () => fetchPokemon(name))
))Timing primitives
wf.sleep(id, duration)
Durable sleep. Survives process restarts.
await wf.sleep('cooldown', '30m')
// Duration format: '30s', '5m', '1h', '2d'wf.wait(id, event, options?)
Suspend until an external event is delivered. Returns payload or null on timeout.
const approval = await wf.wait<{ by: string }>('approval', 'researcher-approved', { timeout: '48h' })Deliver externally:
flopod-run deliver <run-id> <path> <event> '{"by":"alice"}'State primitives
All workflow state must go through these primitives — they are snapshotted by the runtime after every mutation and visible to the visual graph renderer.
wf.var(id, initial)
Durable scalar variable.
const total = await wf.var('total', 0)
await total.set(42)
await total.update(v => v + 1)
console.log(total.value) // 43wf.list(id, initial?)
Durable ordered list.
const items = await wf.list<string>('items', [])
await items.push('a', 'b')
await items.filter(x => x !== 'a')
console.log(items.items) // ['b']wf.map(id, initial?)
Durable key-value map.
const counts = await wf.map<string, number>('counts')
await counts.set('fire', 3)
await counts.delete('water')
console.log(counts.map.get('fire')) // 3Path addressing
Every primitive takes an id. The runtime maintains a call stack. The full path of a nested call:
fetch-details
expedition[2]/analyze[5]/classify-bulbasaur
paginate[0]/fetch-pageRules:
idmust be unique within its parent scope- Loop iterations:
id[index] - Don't rename ids while runs are in-flight — renaming changes the path, causing re-execution
Backends
| Backend | When to use |
|---|---|
| MemoryBackend | Tests — in-process, no persistence |
| SqliteBackend('./path.db') | Local development — zero infra (native dep) |
| FileBackend(dataDir?) | Append-only JSONL, one file per run, no native dep. Single-node durable (local, or a single-container Docker deploy with a volume). Human-readable/greppable. |
| PostgresBackend(connectionString) | Production — persistent, multi-process / multi-replica |
Listing runs — a cheap read-model, never a replay
Backend.listRuns() returns the run list without replaying any event log. Each
backend maintains a StoredRunSummary projection (status, started/ended, error,
progress) incrementally via foldEvent (run-summary.ts): MemoryBackend folds
into a map; FileBackend writes a <runId>.summary.json sidecar (LWW overwrite)
on each append and reads sidecars on list; SqliteBackend/PostgresBackend
upsert a run_summaries row and list with a single SELECT. The full event log
is touched only when a specific run is opened — and, once, to migrate a run that
predates the read-model. The fold persists open-wait/sleep counters so live
status survives a restart without a replay.
There are three run targets:
| Command | Process | Control plane | Lifetime | Use |
|---|---|---|---|---|
| flopod dev | TS via node | yes (browser, animated, step) | stays up | local dev/debug |
| flopod build → bundle.js | bundled job | no | one run, exits | run a workflow in a container |
| flopod serve → server.js | bundled, headless | yes | long-lived | deployable control plane |
flopod serve builds a single self-contained dist/server.js (runtime + control
plane inlined) plus dist/Dockerfile.server. It hosts the HTTP control plane
headless, never auto-runs a default run, and waits for runs triggered over
POST /runs. It uses the durable FileBackend and hydrates its run list on boot,
so it survives restarts. Mount a volume at FLOPOD_DATA_DIR (default /data in
the image) to persist. Single-node (one writer); multi-replica needs Postgres +
shared control-plane state (not yet built).
The dev control plane consumes this: when given a durable backend (the dev codegen
now wires a shared FileBackend), DevServerPlugin hydrates its run list from
backend.listRuns() on startup, so runs from before a restart still appear; a run's
frames are replayed from its log lazily, only when it's opened. Live runs always win
over hydrated ones. Because the runtime emits RunStarted but has no run-level
bookend, the control plane's run() wrapper appends the terminal RunCompleted/
RunFailed event so a finished run reads as ok/error (not running) after a restart.
FileBackend writes <dataDir>/events/<runId>.jsonl. The data dir resolves via
resolveDataDir(): FLOPOD_DATA_DIR → (FLOPOD_DATA_SCOPE=user → OS data dir) →
./.flopod. In a compiled artifact, FLOPOD_BACKEND=file selects it and
FLOPOD_DATA_DIR points it at a mounted volume. It is single-writer (per-run
file + cooperative-concurrency seq assignment) — cross to Postgres for multiple
replicas. On DigitalOcean: a Droplet Volume persists it; App Platform has no
persistent disk, so use DATABASE_URL → Managed Postgres there.
Durability model
Snapshotting
The runtime maintains an internal state object. On each significant operation, the full current value of that state is written to the backend — not individual mutation deltas. On resume, the latest snapshot is loaded and execution continues.
What is snapshotted: every wf.activity() completion (output stored at its path), every
wf.var/list/map mutation (full current value), every wf.branch() (which branch was
taken), every wf.each() iteration completion, every wf.sleep() / wf.wait() state.
On resume: load all events → build state (latest value per path/id) → re-execute the
workflow body from the top → wf.activity() returns its stored output (skips the real call)
→ wf.var/list/map restore from the latest snapshot → continue until the first
un-snapshotted operation.
Why wf.* is mandatory for state
Plain variables in the workflow body are invisible to the runtime — not snapshotted, not
visualizable, and lost on crash. The wf.* call is what makes state durable and visible to
the graph renderer. (Authors never write wf.* directly — the compiler emits it; see the
two-tier model in ../../docs/architecture.md.)
Serialization rules
Activity return values and wf.var values are persisted via superjson.
| Type | Survives |
|---|---|
| string, number, boolean, null | ✓ |
| Date | ✓ — restored as Date |
| Map, Set | ✓ |
| Plain object, array | ✓ |
| Class instance, file handle, stream, function, BigInt, Symbol | ✗ — throws SerializableError |
Non-determinism & the naming rule
Date.now(), Math.random(), and any I/O must live inside leaf functions — never in the
workflow body, which re-executes on every resume and would otherwise produce different values
each time.
Don't rename activity ids while runs are in-flight. The event-log key is the activity's
path; renaming changes the path, so in-flight runs re-execute it rather than skipping it. For
long-running workflows, new deployments are new project snapshots — old runs continue on the
old snapshot. (The planned move from positional to semantic paths relaxes this; see
../../docs/design-notes.md.)
