@choubo/ordosdk
v0.4.0
Published
ordo task scheduler Node.js worker SDK (zero-dep, ESM)
Downloads
340
Readme
ordo — Node.js SDK
Zero-dependency worker SDK for the ordo task scheduler. Node 18+ (uses global fetch).
Quick start
import { Client, Success, Fail, BUSINESS_FAILURE } from './ordo.mjs';
const client = new Client({
baseUrl: 'http://localhost:8080',
token: 'devtoken',
domain: 'demo',
workerId: 'node-1',
});
client.register('greet', async (run) => {
if (run.shouldBail()) return Fail(BUSINESS_FAILURE, 'cancelled by operator');
return Success({ hello: run.payload.who ?? 'anon' });
});
await client.run(); // blocks; Ctrl+C for clean shutdownSee example_worker.mjs for a runnable demo.
Two ways to use
The SDK is split into a high-level loop and a set of low-level primitives.
Pick one — they share the same Client instance, but you should not mix
them on the same client (don't call claim() while run() is running).
1. High-level: Client.run()
The default. You register() handlers and call run(); the SDK owns the
claim loop, heartbeats, the 409 abort signal, and graceful SIGINT/SIGTERM.
Suitable for the common "one worker process, one slot" shape and what the
quick-start example shows.
2. Low-level: claim / heartbeat / report (+ createRun)
For multi-slot pools, custom drain semantics, integration with an external abort bus, or any other case where the high-level loop is too opinionated. Each method maps 1:1 to one HTTP endpoint:
const claimed = await client.claim({ signal: drainSignal });
if (claimed) {
// …handle claimed.run on your own slot. Heartbeat at <= leaseSeconds/4:
const t = setInterval(async () => {
const { alive, cancelRequested } = await client.heartbeat(
claimed.run.id, claimed.lease_token, claimed.fencing_token);
if (!alive) abortYourHandler(); // D4: abort + drop /result
}, (client.leaseSeconds * 1000) / client.heartbeatRatio);
// …on completion:
clearInterval(t);
await client.report(claimed.run.id, claimed.lease_token, claimed.fencing_token,
Success({ ok: true }));
}Caller responsibilities when bypassing run():
- Fencing. Every
heartbeat()andreport()call MUST carry the exactfencing_tokenfromclaim(). Never compute or cache it across claims. - Cadence. Heartbeat at
<= leaseSeconds / heartbeatRatio(default 30s with 120s lease). One missed cadence + a brief network blip and the sweeper reclaims you. - 409 = abort. When
heartbeat()returns{ alive: false }, abort in-flight handler work immediately and DROPreport()— the server would 409 it on fencing anyway, but the side effects you started have already happened. - Drain.
claim({ signal })accepts an externalAbortSignalso a long poll can be interrupted on shutdown without waiting forwait_secondsto elapse.
A handler that wants to fan out follow-up runs from inside its own slot:
await client.createRun({
definition_id: 42,
payload: { parent_run_id: run.id, /* …work-specific input… */ },
});This is just POST /api/v1/runs. The scheduler stays domain-agnostic
(D7/D8) — parent/child linkage lives in your payload, not in the
scheduler's schema.
A handler that wants to leave a breadcrumb in its own audit stream:
client.register('charge', async (run) => {
await run.progress('step1: fetched invoice', { invoice_id: 42 });
// ...do work...
await run.progress('step2: charged');
return Success({ ok: true });
});run.progress(message, attributes?) is fire-and-forget — any HTTP error
is logged at WARN and swallowed; a failed checkpoint write is never a
reason to fail the run. Persisted as a progress row in
task_run_events. Server does NOT throttle, so call at milestones
(≤ ~1Hz typical), not in tight loops.
Contract (mirrors sdk/go/client Go SDK)
- Long-poll claim →
POST /api/v1/claimwithwait_seconds=20. - Heartbeat every
leaseSeconds / 4in the background (defaultleaseSeconds=120). requestTimeoutMs(Config, default10_000) caps the short calls (heartbeat / report / progress / createRun). Per-callopts.timeoutMsoverrides.claim()is independent — its timeout is derived fromwaitSeconds + 10.- Every
/heartbeat+/resultcarriesfencing_token. /heartbeat409 → lease lost. The SDK callsAbortController.abort()onrun.signal; the handler canawaitfetch/timers wired against it for instant cancel./resultis not sent.run.shouldBail()— 0 RTT; true oncancel_requestedOR lease loss (the abort-signal aborted). Userun.cancelRequested()/run.leaseLost()if you need to distinguish.run.signal— a realAbortSignal. Pass tofetch({ signal })or any AbortSignal-aware API for fully interruptible I/O.- Handler throw →
Fail(HANDLER_BUG, msg)reported. - SIGINT / SIGTERM → stop the claim loop after the in-flight run finishes.
shouldBail() checkpoint pattern
client.register('charge', async (run) => {
// run.signal makes I/O interruptible: fetch will reject with AbortError
// the moment the lease is lost, so the irreversible call below is never
// entered after a 409.
const invoice = await fetch(`/invoices/${run.payload.id}`, { signal: run.signal })
.then((r) => r.json());
if (run.shouldBail()) return Fail(BUSINESS_FAILURE, 'cancelled');
await chargeCard(invoice); // irreversible
return Success({ charged: invoice.amount });
});Fencing is the authoritative defense — if cancel arrives between the check
and the charge, /result will 409 and the worker drops the run.
shouldBail() is the "bail early, unwind cleanly" lever; run.signal
is the "kill in-flight I/O the moment the lease is gone" lever.
