@flopod/runtime

v0.3.1

Published

2 days ago

The durable workflow execution engine and HTTP server. Workflow authors using the compiler never import it directly — the compiler wires it up. For hand-authored runtime projects, import from `@flopod/runtime`. For the conceptual picture (how this fits th

0High
0Medium
0Low

mciureanu

`@flopod/runtime`

The durable workflow execution engine and HTTP server. Workflow authors using the compiler never import it directly — the compiler wires it up. For hand-authored runtime projects, import from @flopod/runtime. For the conceptual picture (how this fits the rest of the system), see ../../docs/architecture.md.

Initialization

import { createRuntime, SqliteBackend } from '@flopod/runtime'

const backend = new SqliteBackend('./workflow.db')
const runId = process.env.FLOPOD_RUN_ID ?? 'main-run'
const wf = await createRuntime({ runId, backend })

On createRuntime:

Loads the full event log for runId from the backend — one read, once
Builds in-memory state: completed activities, var snapshots, branch history
All subsequent checks are pure memory lookups

Activity primitives

`wf.activity(id, fn, options?)`

The atomic unit of work. Leaf activities can do anything. Return value is stored in the event log and returned from cache on replay.

const pokemon = await wf.activity('fetch', () => fetchPokemon('bulbasaur'))

On resume: returns stored output without calling fn. Return value must be serializable.

Options:

{ timeoutMs?: number, retry?: { maxAttempts: number, backoff?: 'fixed' | 'exponential', delayMs?: number } }

`wf.attempt(id, fn, policy)`

Activity with retry policy. Returns null if all attempts fail.

const result = await wf.attempt('classify', () => classify(item), { maxAttempts: 3 })

Control flow primitives

`wf.branch(id, key, cases)`

Durable conditional. The branch taken is snapshotted — on resume the same branch is replayed.

await wf.branch('handle', status, {
  active:    async () => { ... },
  suspended: async () => { ... },
  deleted:   async () => { ... },
})

`wf.each(id, items, fn, options?)`

Durable iteration. Each item's result is stored individually — completed iterations are skipped on replay.

const { results, errors } = await wf.each('process', items, async (item, index) => {
  return wf.activity(`enrich-${index}`, () => enrich(item))
}, { concurrency: 3 })

`wf.repeat(id, fn)`

Durable loop. Runs until STOP is returned.

import { STOP } from '@flopod/runtime'

await wf.repeat('paginate', async (iteration) => {
  const page = await wf.activity(`fetch-${iteration}`, () => fetchPage(iteration))
  if (!page.hasMore) return STOP
})

`wf.parallel(id, fns)`

Fan-out — all functions execute concurrently.

const results = await wf.parallel('fetch-all', names.map(name =>
  () => wf.activity(`fetch-${name}`, () => fetchPokemon(name))
))

Timing primitives

`wf.sleep(id, duration)`

Durable sleep. Survives process restarts.

await wf.sleep('cooldown', '30m')
// Duration format: '30s', '5m', '1h', '2d'

`wf.wait(id, event, options?)`

Suspend until an external event is delivered. Returns payload or null on timeout.

const approval = await wf.wait<{ by: string }>('approval', 'researcher-approved', { timeout: '48h' })

Deliver externally:

flopod-run deliver <run-id> <path> <event> '{"by":"alice"}'

State primitives

All workflow state must go through these primitives — they are snapshotted by the runtime after every mutation and visible to the visual graph renderer.

`wf.var(id, initial)`

Durable scalar variable.

const total = await wf.var('total', 0)
await total.set(42)
await total.update(v => v + 1)
console.log(total.value)  // 43

`wf.list(id, initial?)`

Durable ordered list.

const items = await wf.list<string>('items', [])
await items.push('a', 'b')
await items.filter(x => x !== 'a')
console.log(items.items)  // ['b']

`wf.map(id, initial?)`

Durable key-value map.

const counts = await wf.map<string, number>('counts')
await counts.set('fire', 3)
await counts.delete('water')
console.log(counts.map.get('fire'))  // 3

Path addressing

Every primitive takes an id. The runtime maintains a call stack. The full path of a nested call:

fetch-details
expedition[2]/analyze[5]/classify-bulbasaur
paginate[0]/fetch-page

Rules:

id must be unique within its parent scope
Loop iterations: id[index]
Don't rename ids while runs are in-flight — renaming changes the path, causing re-execution

Backends

| Backend | When to use | |---|---| | MemoryBackend | Tests — in-process, no persistence | | SqliteBackend('./path.db') | Local development — zero infra (native dep) | | FileBackend(dataDir?) | Append-only JSONL, one file per run, no native dep. Single-node durable (local, or a single-container Docker deploy with a volume). Human-readable/greppable. | | PostgresBackend(connectionString) | Production — persistent, multi-process / multi-replica |

Listing runs — a cheap read-model, never a replay

Backend.listRuns() returns the run list without replaying any event log. Each backend maintains a StoredRunSummary projection (status, started/ended, error, progress) incrementally via foldEvent (run-summary.ts): MemoryBackend folds into a map; FileBackend writes a <runId>.summary.json sidecar (LWW overwrite) on each append and reads sidecars on list; SqliteBackend/PostgresBackend upsert a run_summaries row and list with a single SELECT. The full event log is touched only when a specific run is opened — and, once, to migrate a run that predates the read-model. The fold persists open-wait/sleep counters so live status survives a restart without a replay.

There are three run targets:

| Command | Process | Control plane | Lifetime | Use | |---|---|---|---|---| | flopod dev | TS via node | yes (browser, animated, step) | stays up | local dev/debug | | flopod build → bundle.js | bundled job | no | one run, exits | run a workflow in a container | | flopod serve → server.js | bundled, headless | yes | long-lived | deployable control plane |

flopod serve builds a single self-contained dist/server.js (runtime + control plane inlined) plus dist/Dockerfile.server. It hosts the HTTP control plane headless, never auto-runs a default run, and waits for runs triggered over POST /runs. It uses the durable FileBackend and hydrates its run list on boot, so it survives restarts. Mount a volume at FLOPOD_DATA_DIR (default /data in the image) to persist. Single-node (one writer); multi-replica needs Postgres + shared control-plane state (not yet built).

The dev control plane consumes this: when given a durable backend (the dev codegen now wires a shared FileBackend), DevServerPlugin hydrates its run list from backend.listRuns() on startup, so runs from before a restart still appear; a run's frames are replayed from its log lazily, only when it's opened. Live runs always win over hydrated ones. Because the runtime emits RunStarted but has no run-level bookend, the control plane's run() wrapper appends the terminal RunCompleted/ RunFailed event so a finished run reads as ok/error (not running) after a restart.

FileBackend writes <dataDir>/events/<runId>.jsonl. The data dir resolves via resolveDataDir(): FLOPOD_DATA_DIR → (FLOPOD_DATA_SCOPE=user → OS data dir) → ./.flopod. In a compiled artifact, FLOPOD_BACKEND=file selects it and FLOPOD_DATA_DIR points it at a mounted volume. It is single-writer (per-run file + cooperative-concurrency seq assignment) — cross to Postgres for multiple replicas. On DigitalOcean: a Droplet Volume persists it; App Platform has no persistent disk, so use DATABASE_URL → Managed Postgres there.

Durability model

Snapshotting

The runtime maintains an internal state object. On each significant operation, the full current value of that state is written to the backend — not individual mutation deltas. On resume, the latest snapshot is loaded and execution continues.

What is snapshotted: every wf.activity() completion (output stored at its path), every wf.var/list/map mutation (full current value), every wf.branch() (which branch was taken), every wf.each() iteration completion, every wf.sleep() / wf.wait() state.

On resume: load all events → build state (latest value per path/id) → re-execute the workflow body from the top → wf.activity() returns its stored output (skips the real call) → wf.var/list/map restore from the latest snapshot → continue until the first un-snapshotted operation.

Why `wf.*` is mandatory for state

Plain variables in the workflow body are invisible to the runtime — not snapshotted, not visualizable, and lost on crash. The wf.* call is what makes state durable and visible to the graph renderer. (Authors never write wf.* directly — the compiler emits it; see the two-tier model in ../../docs/architecture.md.)

Serialization rules

Activity return values and wf.var values are persisted via superjson.

| Type | Survives | |---|---| | string, number, boolean, null | ✓ | | Date | ✓ — restored as Date | | Map, Set | ✓ | | Plain object, array | ✓ | | Class instance, file handle, stream, function, BigInt, Symbol | ✗ — throws SerializableError |

Non-determinism & the naming rule

Date.now(), Math.random(), and any I/O must live inside leaf functions — never in the workflow body, which re-executes on every resume and would otherwise produce different values each time.

Don't rename activity ids while runs are in-flight. The event-log key is the activity's path; renaming changes the path, so in-flight runs re-execute it rather than skipping it. For long-running workflows, new deployments are new project snapshots — old runs continue on the old snapshot. (The planned move from positional to semantic paths relaxes this; see ../../docs/design-notes.md.)

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@flopod/runtime

Initialization

Activity primitives

wf.activity(id, fn, options?)

wf.attempt(id, fn, policy)

Control flow primitives

wf.branch(id, key, cases)

wf.each(id, items, fn, options?)

wf.repeat(id, fn)

wf.parallel(id, fns)