oh-my-fable
v0.2.0
The autonomous-agent harness that actually finishes long tasks, because it plans first, self-corrects every step, and survives crashes. The whole run lives in one serializable RunContext, checkpointed after every step, so it resumes exactly where it died.
oh-my-fable
Fable 5's way of working a long task — plan first, self-correct every step, never lose the thread — as a model-agnostic agent harness.
The fable is Fable 5's way of thinking; the oh-my- is because, like oh-my-zsh, you just want the good defaults. The mindset is the model's — the engine is any provider.
npm i oh-my-fable
The demos are magical. Then you point an agent at a real multi-hour task and it loops on the same step, loses the plan somewhere in a 40-message chat history, and, when your process restarts, forgets everything and starts over.
oh-my-fable encodes the way a strong reasoning model works a long task — the mindset, not the model — into a harness: plan first, self-correct every step, keep the thread, and finish. It's built around two mechanisms and one rule:
The whole run lives in a single RunContext: the only source of truth, and always serializable. It's checkpointed after every step.
From that one rule you get the thing nobody else gives you: a crash is a pause.
The name is about the thinking, not a model lock-in — the mindset is Fable 5's, the
engine is whatever Provider you hand it (Anthropic, OpenAI-compatible, local, …).
── run run_mqf… ──
📋 planned 3 steps: outline → draft → edit
▶ outline
→ outlined
💾 checkpoint saved
▶ draft
💥 the process just died (power outage, OOM, deploy, whatever)
── resuming from the last checkpoint ──
▶ draft ← picks up exactly where it died
💾 checkpoint saved
▶ edit
✅ done
steps: outline [done], draft [done], edit [done]
const result = await run(goal, { provider, store }); // crashes at step 2
// ...process restarts...
await resume(result.runId, { provider, store }); // finishes from step 2
That's examples/scripted-run.mjs. Run it with npm run example, no API key needed.
The three things it does that most frameworks don't
1. It survives crashes (resumable by construction)
State doesn't live in memory or in a chat transcript — it lives in RunContext,
saved to disk after every step. Kill the process at step 47 of 60 and resume()
continues from step 47, plan and progress intact. Swap the FileStore for
SQLite/Redis by implementing one interface.
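To make the resume path concrete, here is a minimal, self-contained sketch of checkpoint-after-every-step. It is not oh-my-fable's actual code: the store shape (`save`/`load`), the context fields, and `runSteps` are all assumptions made for illustration, and everything is synchronous and in-memory for brevity.

```javascript
// Hypothetical sketch: not oh-my-fable's real Store or RunContext.
// An in-memory store that keeps each run as a JSON string, the way a
// file-backed store would keep it on disk.
function createMemoryStore() {
  const saved = new Map();
  return {
    save(ctx) { saved.set(ctx.runId, JSON.stringify(ctx)); },
    load(runId) {
      const raw = saved.get(runId);
      return raw === undefined ? null : JSON.parse(raw);
    },
  };
}

// Run every pending step, checkpointing after each one.
// `crashAt` simulates the process dying before that step finishes.
function runSteps(ctx, store, crashAt) {
  for (const step of ctx.steps) {
    if (step.status === "done") continue; // finished work is never redone
    if (step.id === crashAt) throw new Error("process died during " + step.id);
    step.status = "done";
    store.save(ctx); // checkpoint: the saved context is the source of truth
  }
  return ctx;
}

const store = createMemoryStore();
const fresh = {
  runId: "run_demo",
  steps: ["outline", "draft", "edit"].map((id) => ({ id, status: "pending" })),
};
store.save(fresh);

try {
  runSteps(fresh, store, "draft"); // dies at step 2
} catch (err) {
  // After a real crash the in-memory object is gone; only the store survives.
}

// Resume: reload the last checkpoint and continue from the first pending step.
const resumed = runSteps(store.load("run_demo"), store, null);
const summary = resumed.steps.map((s) => s.id + " [" + s.status + "]").join(", ");
// summary === "outline [done], draft [done], edit [done]"
```

Because the context round-trips through `JSON.stringify`, anything that cannot be serialized cannot leak into run state, which is what lets a store be swapped for another backend.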
2. It plans first, then self-corrects (plan ≠ history)
The plan is structured data that lives outside the conversation, so the model never loses track of "where am I" in a wall of text. After every step a reflector checks the result against the goal and routes:
| verdict | meaning | what happens |
| --- | --- | --- |
| on_track | normal progress | next step |
| needs_replan | the result changed the plan's assumptions | replan |
| blocked | same obstacle keeps recurring | replan around it / escalate |
| goal_met | success criteria satisfied | stop (even with steps left — no busywork) |
And replanning accumulates: finished steps are preserved verbatim; only the remaining work is regenerated. Long tasks move forward instead of restarting.
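A rough sketch of the routing table and of an accumulating replan, assuming a plan is a list of steps with a `status` field. The names `route` and `applyReplan`, the action strings, and the step shape are hypothetical, not the library's API.

```javascript
// Hypothetical sketch: the step shape and function names are assumptions.

// Map a reflector verdict to the loop's next action, following the table above.
function route(verdict) {
  switch (verdict) {
    case "on_track": return "next_step";
    case "needs_replan": return "replan";
    case "blocked": return "replan"; // or escalate, depending on policy
    case "goal_met": return "stop";
    default: throw new Error("unknown verdict: " + verdict);
  }
}

// Accumulating replan: keep finished steps exactly as they are and
// replace only the unfinished remainder with the newly planned steps.
function applyReplan(plan, newSteps) {
  const finished = plan.steps.filter((s) => s.status === "done");
  const remaining = newSteps.map((s) => ({ ...s, status: "pending" }));
  return { ...plan, steps: [...finished, ...remaining] };
}

const before = {
  steps: [
    { id: "s1", intent: "collect sources", status: "done" },
    { id: "s2", intent: "draft from sources", status: "pending" },
  ],
};
const after = applyReplan(before, [
  { id: "s3", intent: "find better sources" },
  { id: "s4", intent: "draft from the new sources" },
]);
// after.steps ids: s1, s3, s4 (s1 is the same object as before, untouched)
```

Keeping the finished steps as the same objects, rather than regenerating them, is what makes the plan a record of progress and not just a fresh guess.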
3. It's deterministically testable (genuinely rare for an agent framework)
Because every model call is stateless, you can script the model and assert the loop's behavior — no network, no flakiness:
import { run, ScriptedProvider, reply, MemoryStore } from "oh-my-fable";
const provider = new ScriptedProvider([
reply.plan([{ id: "s1", intent: "do the thing" }]),
reply.text("did it"),
reply.reflection("goal_met"),
]);
const { status } = await run("do the thing", { provider, store: new MemoryStore() });
expect(status).toBe("done"); // fully deterministic
The whole harness is tested this way (crash recovery, replan accumulation, budget halts, the tool loop), all without a single API call.
Quick start
import { run, AnthropicProvider } from "oh-my-fable";
const result = await run(
{
description: "Research the top 3 Rust web frameworks and write a comparison table",
successCriteria: ["a markdown table comparing 3 frameworks exists"],
constraints: ["only use information you can verify"],
},
{ provider: new AnthropicProvider() }, // reads ANTHROPIC_API_KEY
);
console.log(result.status); // "done" | "halted" | "failed"
console.log(result.ctx.plan.steps);
npm i oh-my-fable # zero runtime dependencies
Node ≥ 18. Ships with AnthropicProvider and OpenAICompatProvider (works with
OpenAI, Ollama, LM Studio, OpenRouter, Groq…; ollama("llama3.1") gives you a
local model with no key), both over fetch, no SDK. Or bring any model by
implementing the Provider interface (three methods).
AnthropicProvider works with the current flagship models (claude-opus-4-8,
claude-fable-5) out of the box — it drops the temperature parameter they
reject — and prompt-caches the system+tools prefix by default, so a long
durable run pays ~10× less on the context it replays every step. Opt into
{ thinking: "adaptive", effort: "high" } for harder planning. The claude
provider can return real --output-format json cost/usage and run Claude's own
tools ({ tools: true, permissionMode: "acceptEdits" }).
Or use it from the terminal
Don't want to write code? It ships a CLI (zero extra deps):
npx oh-my-fable demo # watch crash → resume, no API key
# ⭐ already pay for Claude Code? drive it as a DURABLE, TOOL-USING agent — your
# login, no separate API key, $0 per token. Claude edits files & runs commands:
npx oh-my-fable run "refactor utils.ts and run the tests" --provider claude --cli-tools
# pure-reasoning over the same login (no tools):
npx oh-my-fable run "outline a talk on durable agents" --provider claude
# or a LOCAL model (Ollama / LM Studio), also no key:
npx oh-my-fable run "outline a talk on durable agents" --provider ollama --model llama3.1
# or any hosted model:
export ANTHROPIC_API_KEY=sk-...
npx oh-my-fable run "summarize README.md into SUMMARY.md" --tools fs
npx oh-my-fable list # your saved runs
npx oh-my-fable show run_abc123 # the run's plan, steps & budget as a timeline
npx oh-my-fable resume run_abc123 # continue one from its checkpoint
You don't need an Anthropic API key. Pick how it talks to a model:
| --provider | uses | key? | tools? |
| --- | --- | --- | --- |
| claude | your Claude Code login | none | --cli-tools → Claude runs Read/Write/Edit/Bash itself |
| codex | your Codex CLI login | none | --cli-tools → workspace-write |
| ollama | a local Ollama model | none | --tools fs (harness-run) |
| --base-url <url> | LM Studio / OpenRouter / Groq / any OpenAI-compatible | per that server | --tools fs |
| openai | OpenAI | OPENAI_API_KEY | --tools fs |
| (default) | Anthropic | ANTHROPIC_API_KEY | --tools fs |
Two ways to give an agent hands:
- --cli-tools (claude/codex): the CLI runs its own tools (file edits, shell) on your subscription. oh-my-fable stays the durable planner/reflector around it: it plans, checkpoints every step, and reflects, while Claude does the work. Tune with --permission-mode acceptEdits|dontAsk|plan and --allow "Read,Edit,Bash(npm test)".
- --tools fs (API providers): the harness gives the agent a sandboxed read_file/write_file/list_dir, confined to the working directory.
You watch the plan form and each step get reflected on, live. Every run is
checkpointed, so resume <runId> always works — and show <runId> prints the
whole run (plan, steps, budget) from its serialized RunContext.
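For the `--tools fs` sandbox described above, "confined to the working directory" can be pictured as a path check that runs before any file tool touches disk. The sketch below is an assumption about how such a check might look, not the harness's implementation: `resolveInside` is a hypothetical helper, it only handles POSIX-style `/` paths, and it does not resolve symlinks.

```javascript
// Hypothetical sketch of confining tool paths to a root directory.
// Assumes POSIX-style "/" paths; symlinks are not resolved.
function resolveInside(root, requested) {
  const rootParts = root.split("/").filter(Boolean);
  // Absolute requests start from "/", relative ones from the root.
  const parts = requested.startsWith("/") ? [] : [...rootParts];
  for (const segment of requested.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  // Compare segment by segment so "/work/projectile" is not mistaken
  // for something inside "/work/project".
  const inside = rootParts.every((part, i) => parts[i] === part);
  if (!inside) throw new Error("path escapes the working directory: " + requested);
  return "/" + parts.join("/");
}

const ok = resolveInside("/work/project", "notes/../README.md");
// ok === "/work/project/README.md"

let rejected = false;
try {
  resolveInside("/work/project", "../secrets.txt");
} catch (err) {
  rejected = true; // the request climbed out of the root
}
```

A production version would more likely lean on Node's `path` module and resolve real paths on disk; the point here is only that the check happens on the normalized path, not on the raw string the model supplied.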
Tools
import { run, defineTool, AnthropicProvider } from "oh-my-fable";
const search = defineTool(
"web_search",
"Search the web and return results.",
{ type: "object", properties: { query: { type: "string" } }, required: ["query"] },
async ({ query }) => ({ ok: true, output: await fetchResults(query) }),
);
await run(goal, { provider: new AnthropicProvider(), tools: [search] });
A tool that throws becomes an Observation, not a crash: the reflector decides
what to do about it.
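One way to picture that behavior is a thin wrapper around the tool handler. This is a sketch under assumptions: `invokeTool` and the `{ ok: false, error }` shape are hypothetical, modeled on the `{ ok, output }` result used in the example above.

```javascript
// Hypothetical sketch: `invokeTool` and the failure shape are assumptions.
async function invokeTool(tool, args) {
  try {
    return await tool.handler(args);
  } catch (err) {
    // The failure is recorded as data for the reflector instead of propagating.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// A stand-in tool that throws on bad input.
const flaky = {
  name: "web_search",
  handler: async ({ query }) => {
    if (!query) throw new Error("query is required");
    return { ok: true, output: "results for " + query };
  },
};

// invokeTool(flaky, {}) resolves to { ok: false, error: "query is required" }
// invokeTool(flaky, { query: "rust" }) resolves to { ok: true, output: "results for rust" }
```

Because the error comes back as an ordinary value, it can be stored in the run's context and checkpointed like any other step result.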
Watch it work
await run(goal, {
provider,
onEvent: (e) => console.log(e.type, e),
// plan_created · step_start · step_done · reflection · replan · compaction · checkpoint · done · halted
});
It can't run away
Three hard ceilings, checked at the top of every loop turn, plus two recovery caps — exceed any and it halts cleanly, preserving all work:
await run(goal, {
provider,
maxSteps: 50, // total step budget
maxTokens: 2_000_000, // cumulative token budget
maxWallClockMs: 1_800_000, // wall-clock budget (30 minutes)
maxStepAttempts: 3, // a single step retried this many times → blocked
maxReplans: 12, // replan storm → halted
});
How it's built
A planner ↔ executor ↔ reflector loop over a serializable RunContext:
plan → [ budget? → next step → compact? → execute → reflect → checkpoint → route ] → done
- planner: goal → ordered steps; replan accumulates instead of resetting.
- executor: runs one step, including a provider-agnostic tool mini-loop.
- reflector: heuristics first (cheap, certain), then the model, with JSON self-repair and a conservative fallback (a wrong early exit is worse than one more loop).
- contextManager: folds old turns into digests so long runs stay inside the window; the plan is never compacted.
- store / budget: checkpoint after every step; guard against runaways.
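The reflector's "JSON self-repair and a conservative fallback" could look roughly like the following. The verdict names come from the table earlier in this README; the repair strategy, the `parseVerdict` name, and the choice of `on_track` as the fallback are assumptions for illustration.

```javascript
// Hypothetical sketch: the parsing strategy and function name are assumptions.
const VERDICTS = ["on_track", "needs_replan", "blocked", "goal_met"];

function parseVerdict(raw) {
  const candidates = [raw];
  // Self-repair: models often wrap JSON in prose, so also try the span
  // from the first "{" to the last "}".
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start !== -1 && end > start) candidates.push(raw.slice(start, end + 1));

  for (const text of candidates) {
    try {
      const parsed = JSON.parse(text);
      if (parsed && VERDICTS.includes(parsed.verdict)) return parsed.verdict;
    } catch {
      // Not valid JSON; try the next candidate.
    }
  }
  // Conservative fallback: keep going rather than risk a wrong early exit.
  return "on_track";
}

const clean = parseVerdict('{"verdict":"goal_met"}');
const wrapped = parseVerdict('Here is my answer: {"verdict": "blocked"} Hope it helps.');
const garbage = parseVerdict("looks fine to me");
// clean === "goal_met", wrapped === "blocked", garbage === "on_track"
```

Falling back to a verdict that continues the loop costs at most one extra step, whereas falling back to `goal_met` could end a run that was not actually finished.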
Every piece is an interface you can replace without touching the core. The full
architecture writeup is in ARCHITECTURE.md.
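The `budget?` stage at the top of the loop can be pictured as a pure function over the run's counters. In this sketch the limit names mirror the run() options shown earlier, while `checkBudget`, the usage counters, and the return shape are hypothetical.

```javascript
// Hypothetical sketch: the usage counters and return shape are assumptions.
function checkBudget(usage, limits, now) {
  if (usage.steps >= limits.maxSteps) return { ok: false, reason: "maxSteps" };
  if (usage.tokens >= limits.maxTokens) return { ok: false, reason: "maxTokens" };
  if (now - usage.startedAt >= limits.maxWallClockMs) {
    return { ok: false, reason: "maxWallClockMs" };
  }
  return { ok: true };
}

const limits = { maxSteps: 50, maxTokens: 2_000_000, maxWallClockMs: 1_800_000 };
const usage = { steps: 12, tokens: 340_000, startedAt: 0 };

const early = checkBudget(usage, limits, 60_000); // one minute in
const late = checkBudget(usage, limits, 1_800_000); // thirty minutes in
// early.ok === true; late.reason === "maxWallClockMs"
```

Passing `now` in as an argument, instead of reading the clock inside the function, keeps the check deterministic and easy to test in the same scripted style as the rest of the harness.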
Roadmap
- A web dashboard that tails a run's events and lets you resume from any checkpoint (show <runId> is the CLI version of this today).
- More providers in-repo (OpenAI-compatible, local), though it's a 3-method interface.
- Parallel step execution for independent branches of the plan DAG.
- Human-in-the-loop: pause for approval as a first-class step status.
💖 Sponsor
Free, MIT, zero-dependency, built in spare time. If it saved your agent from starting over:
- ⭐ Star the repo — it's how the next person building an agent finds it.
- 🍋 Sponsor via Lemon Squeezy — one-time or recurring.
License
MIT © oh-my-fable contributors
