oliver-agent
v0.2.0
Published
Opinionated agent harness for Next.js + Drizzle + Postgres SaaS. One tool → Next.js server action + Vercel AI SDK tool. HITL, audit, precondition, concurrency built in.
Oliver
🇧🇷 Versão em português (rola pro fim)
Oliver is a TypeScript harness for embedding LLM-powered agents inside multi-tenant SaaS products. You define a tool once; Oliver routes it to your UI buttons (as a Next.js server action) and to your chat agent (as a Vercel AI SDK tool). Approval gates, audit log, domain invariant checks, and per-resource concurrency control come included.
```typescript
defineTool({
  name: "applyDiscount",
  description: "Apply a percentage discount to a quote.",
  input: z.object({ quoteNumber: z.string(), percent: z.number() }),
  requiresApproval: true,
  precondition: async ({ input }) => assertDraft(input.quoteNumber),
  concurrencyKey: ({ input }) => `quote:${input.quoteNumber}`,
  previewChange: async ({ input }) => buildDiff(input),
  execute: async ({ input }) => applyDiscountToDB(input),
  verify: async ({ input, result }) => checkDBMatches(result),
});
```

That single definition ships through both channels, enforces the "published quote is read-only" invariant at every step, blocks races on the same quote, shows a before/after card in the chat, runs only after a human click, and double-checks the DB after.
Install
Two paths to the same end-state. Pick what fits your workflow.
Option A — AI coding agent (~2 min)
If you already work with Claude Code, Codex, or Cursor, tell your agent:
"Install Oliver in this project — follow the instructions at https://raw.githubusercontent.com/caio-overmind-ventures/oliver-harness-saas/main/INSTALL.md"
Your agent reads INSTALL.md, detects your stack (auth lib, ORM setup, project layout), generates the files, runs the migration, and reports back.
Option B — Manual (~15 min)
8 steps you run yourself. See Manual install further down.
Background
Oliver was built after studying a set of agent harnesses across two scopes.
Coding agents (most public references):
- Codex — OpenAI's terminal coding agent (Rust)
- Claude Code — Anthropic's coding CLI; closed source but heavily documented
- OpenClaude — TypeScript Claude Code clone
- walkinglabs / learn-harness-engineering — the verification-first principle
Generalist personal-AI harnesses (Pi → OpenClaw → Hermes lineage):
- Pi — Mario Zechner's minimal coding-agent harness
- OpenClaw — embeds Pi, adds bootstrap files (SOUL/AGENTS/USER/IDENTITY/TOOLS) + multi-channel
- Hermes Agent — Nous Research's successor to OpenClaw
Plus: Manus on prefix-cache economics (~10x cost reduction with stable system-prompt prefix).
What we borrowed (with attribution)
| Pattern | Source | Where it lives in Oliver |
|---|---|---|
| Verification-first ("only passing verify counts as done") | walkinglabs | verify hook with 5s timeout |
| Stable prefix + mutable suffix for KV-cache | Manus | Context system-prompt assembly |
| Layered markdown instructions (SOUL.md, AGENTS.md-style) | OpenClaw / Hermes | Instructions component (SOUL/domain/playbook/lessons) |
| Subagent isolation with own tool allowlist | Claude Code | Roadmap v0.1 candidate |
| Hooks pipeline (Pre/PostToolUse) for extensibility | Claude Code / Pi / OpenClaude | Roadmap v0.1 candidate |
| Tool interface split (validate vs permission gate) | OpenClaude | Implicit in precondition + execute |
| Approval × capability as orthogonal axes | Codex | Roadmap v0.1 candidate (richer than current binary requiresApproval) |
| Tool discoverability tools (tool_search) | Codex / OpenClaude | Roadmap, kicks in when tools >30 |
| Progressive disclosure of skills | Pi / OpenClaude / Codex | Roadmap |
What's Oliver-original
None of the studied harnesses target multi-tenant SaaS with first-class B2B concerns. So:
- Audit log as a primitive: none of them have one. Oliver writes every lifecycle event (`invoked`, `approved`, `succeeded`, `verified`, etc.) to `oliver.audit_log` with non-throwing writes and a pluggable `onAuditFailure` handler.
- Multi-tenancy by design: `orgId`/`userId` flow through every tool call from the start. Cross-tenant calls are impossible by construction.
- Dual-channel Gateway: the same `defineTool` definition routes to a Next.js server action (UI buttons) AND a Vercel AI SDK tool (chat agent). No duplicated business logic.
- DB-backed HITL state machine: `oliver.pending_tools` table with re-invocation guard (an LLM proposing the same action twice in flight gets the existing card, not a duplicate).
- `precondition` hook at every divergence point: runs before preview, before execute, before pending insert, AND again at approve time. Catches the classic "stale state between propose and approve" race.
- `concurrencyKey` mutex: process-level FIFO queue per key. Two LLM calls hitting the same `quote:abc` resource serialize at execute time without the builder writing any locking code.
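The per-key FIFO behaviour described in the `concurrencyKey` item can be sketched with promise chaining. This is a minimal illustration, not Oliver's implementation: `KeyedMutex` and `run` are hypothetical names, and the sketch only serializes calls inside a single process.

```typescript
// Hypothetical sketch: KeyedMutex is an illustrative name, not an Oliver export.
class KeyedMutex {
  // Tail of the promise chain per key; a missing entry means the key is idle.
  private tails = new Map<string, Promise<void>>();

  run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // Start this task only after the previous same-key task has settled.
    const result = previous.then(() => task());
    // The stored tail never rejects, so one failed task cannot block the queue.
    const tail: Promise<void> = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(key, tail);
    // Drop the entry once the queue drains so the map does not grow forever.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}
```

Calls with different keys run concurrently; calls with the same key run in arrival order. A process-level queue like this does not coordinate across multiple server instances.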
Architecture
Oliver is six components, each with a focused job:
| Component | What it does |
|---|---|
| Tools | Atomic operations defined via defineTool() with Zod schemas. |
| Gateway | Routes one tool to multiple channels: Next.js server action, Vercel AI SDK tool, MCP (v0.1). |
| Context | Assembles the system prompt from instructions + tool list. KV-cache-friendly stable prefix. |
| Approval Gates | DB-backed state machine for tools marked requiresApproval: true. |
| Audit | Every lifecycle event written to oliver.audit_log with non-throwing writes. |
| Instructions | Layered markdown (SOUL → domain → playbook → lessons) loaded once at boot. |
How they fit together:
```
defineTool() ─┐
              ├─► Gateway ──► Server Action (UI buttons)
Instructions ─┤           └► Agent Tool (chat LLM)
              │           └► MCP (v0.1)
   ┌──────────┴────────┐
   ▼                   ▼
Context         Approval Gates ──► Audit
(system prompt) (HITL state)       (oliver.audit_log)
```

The builder writes tools and instructions. Oliver does the rest: builds the system prompt, routes each invocation to the right channel, intercepts HITL tools through the approval state machine, and records every step. Reads/writes share the same Postgres database as your app via the dedicated oliver schema, so cross-schema transactions work.
What Oliver is not: a workflow engine, a chat UI library, an auth solution, or a model-routing layer. It's the harness between your domain logic and the LLM.
Manual install
The 8-step path. Same end-state as the AI install above; pick this if you don't have an AI coding agent in your workflow or want to drive each step yourself. ~15 minutes. Assumes you have a Next.js 15+ app with Drizzle + Postgres.
1. Install
```shell
pnpm add oliver-agent ai zod drizzle-orm
```

2. Schema
Add Oliver's tables to drizzle.config.ts:

```typescript
schema: [
  "./src/db/schema.ts",
  "./node_modules/oliver-agent/src/db/schema.ts",
],
```

Then run `pnpm drizzle-kit generate && pnpm drizzle-kit migrate`. This creates `oliver.pending_tools` and `oliver.audit_log` in a separate Postgres schema.
3. Create the agent
```typescript
// lib/oliver.ts
import "server-only";
import { createAgent, loadInstructions } from "oliver-agent";
import { headers } from "next/headers";
import { auth } from "@/lib/auth";
import { database } from "@/db";
import { allTools } from "@/tools";
// `resolveOrgId` is your app's own slug-to-org lookup; import it from wherever it lives.

// Loaded once at module init.
const instructions = await loadInstructions("./instructions");

export const oliver = createAgent({
  tools: allTools,
  instructions,
  db: database,
  // Builds the tool context for UI-triggered server actions.
  resolveServerActionContext: async (override) => {
    const session = await auth.getSession({ headers: await headers() });
    if (!session) throw new Error("Not authenticated");
    return {
      orgId: await resolveOrgId(override?.slug),
      userId: session.user.id,
      source: "ui",
      slug: override?.slug,
    };
  },
});
```

`ctx` is fully typed; extend it with anything your tools need (db, logger, flags, etc.).
4. Define your first tool
```typescript
// tools/createCustomer.ts
import { defineTool } from "oliver-agent";
import { z } from "zod";
import { database } from "@/db";
// `customers` (Drizzle table) and `generateId` are app-level; import them from your own modules.

export const createCustomer = defineTool({
  name: "createCustomer",
  description: "Create a new customer in the current organization.",
  input: z.object({
    name: z.string().min(1).max(256),
    email: z.string().email().optional(),
  }),
  execute: async ({ input, ctx }) => {
    const id = generateId.customer();
    // Scope the row to the caller's organization via ctx.orgId.
    await database.insert(customers).values({
      id, organizationId: ctx.orgId, name: input.name, email: input.email ?? null,
    });
    return { id, name: input.name };
  },
});
```

```typescript
// tools/index.ts
import { createCustomer } from "./createCustomer";

export const allTools = [createCustomer] as const;
export { createCustomer };
```

5. Expose as a server action
```typescript
// actions/customers.ts
"use server";
import { oliver } from "@/lib/oliver";
import { createCustomer } from "@/tools";

export const createCustomerAction = oliver.serverAction(createCustomer);
```

Call from any client component: `await createCustomerAction({ name: "Acme" }, { slug })`.
6. Wire the chat route
```typescript
// app/api/chat/route.ts
import { ToolLoopAgent, createAgentUIStreamResponse, stepCountIs } from "@repo/ai";
import { headers } from "next/headers";
import { auth } from "@/lib/auth";
import { oliver } from "@/lib/oliver";
// `models` and `resolveOrgId` are app-level helpers; import them from wherever your project defines them.

export async function POST(req: Request) {
  const session = await auth.getSession({ headers: await headers() });
  // Reject unauthenticated requests before resolving any tenant data.
  if (!session) return new Response("Not authenticated", { status: 401 });
  const { messages, context } = await req.json();

  const oliverSession = oliver.assembleSession({
    ctx: { orgId: await resolveOrgId(context.orgSlug), userId: session.user.id, source: "agent", slug: context.orgSlug },
    pageContext: { route: context.page },
  });

  const agent = new ToolLoopAgent({
    model: models.chat,
    instructions: oliverSession.systemPrompt,
    tools: oliverSession.tools,
    stopWhen: stepCountIs(25),
  });

  return createAgentUIStreamResponse({ agent, uiMessages: messages });
}
```

7. Chat UI
Oliver doesn't ship a chat UI. Pick one: assistant-ui, Shadcn chat, or roll your own with useChat from @ai-sdk/react.
8. Approval card (when you ship a HITL tool)
When a tool has requiresApproval: true, its result becomes { status: "awaiting_approval", pendingToolId, ... }. Two pieces:
```typescript
// actions/oliver-approvals.ts
"use server";
import { oliver } from "@/lib/oliver";

export async function approvePendingToolAction(input, ctx) {
  return oliver.approvePendingTool(input, ctx);
}

export async function rejectPendingToolAction(input, ctx) {
  return oliver.rejectPendingTool(input, ctx);
}
```

Plus a component that detects the shape and renders Approve/Reject buttons. Reference implementation: ~120 LOC headless React.
That's it. Each new tool ships through both channels, gets audit + HITL + concurrency for free.
What you bring
| Concern | Pick |
|---|---|
| Chat UI | assistant-ui, Shadcn chat, custom |
| Authentication | better-auth, Clerk, NextAuth, Supabase Auth |
| Database + ORM | Drizzle + Postgres (Neon recommended) |
| LLM model config | Vercel AI SDK + provider key |
| Approval card UI | One ~120 LOC component |
A future create-oliver-app template will pre-cable an opinionated stack (see Roadmap).
Component deep-dive
Tools
```typescript
defineTool<typeof inputSchema, OutputType, ContextExt>({
  name, description, input,
  execute: async ({ input, ctx }) => ...,
  // optional:
  precondition, concurrencyKey, requiresApproval, previewChange, verify,
});
```

Sub-features layered on a tool:
- `precondition`: domain invariant check at every divergence point. Throws `ToolError` to block. Catches "stale state between propose and approve" races.
- `concurrencyKey`: `({ input }) => "quote:" + input.id` serializes same-key `execute()` calls process-wide. FIFO queue per key, deadlock-safe.
- `verify`: runs after execute to confirm the DB matches. 5s timeout. Logged to audit (`verified` / `failed_verification` / `verification_skipped`).
- `previewChange`: only invoked for HITL; produces the before/after diff on the approval card.
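A `verify` hook with a time limit can be sketched as a race between the hook and a timer. This is a minimal sketch under stated assumptions: `runVerify` is a hypothetical helper, and mapping a timeout or a thrown error to `failed_verification`, and a missing hook to `verification_skipped`, is a guess at the semantics rather than documented Oliver behaviour.

```typescript
type VerifyOutcome = "verified" | "failed_verification" | "verification_skipped";

// Hypothetical helper: races the verify hook against a timer (default 5s).
async function runVerify(
  verify: (() => Promise<boolean>) | undefined,
  timeoutMs = 5_000,
): Promise<VerifyOutcome> {
  // Assumption: a tool without a verify hook is recorded as skipped.
  if (!verify) return "verification_skipped";

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<boolean>((resolve) => {
    // Assumption: running out of time counts as a failed verification.
    timer = setTimeout(() => resolve(false), timeoutMs);
  });

  try {
    const ok = await Promise.race([verify(), timeout]);
    return ok ? "verified" : "failed_verification";
  } catch {
    // A throwing hook is treated the same as a hook that returned false.
    return "failed_verification";
  } finally {
    clearTimeout(timer);
  }
}
```

The returned string is what an audit row would record after `execute()` completes.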
Gateway
Three entry points on `agent`:
- `agent.serverAction(tool)`: `(input, ctxOverride?) => Promise<ActionResult>`. Wrap in `"use server"`.
- `agent.agentTools({ ctx })`: Vercel AI SDK tool record bound to session context.
- `agent.assembleSession({ ctx, pageContext? })`: `{ systemPrompt, tools }` together. Recommended for chat routes.
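The dual-channel idea behind these entry points can be sketched as two thin adapters over one tool definition. Everything here is illustrative: `ToolDef`, `Ctx`, `toServerAction`, `toAgentTool`, and the `ActionResult` shape are assumptions for the sketch, not Oliver's actual types.

```typescript
// Hypothetical shapes for illustration; Oliver's real types may differ.
interface Ctx {
  orgId: string;
  userId: string;
  source: "ui" | "agent";
}

interface ToolDef<I, O> {
  name: string;
  description: string;
  parse: (raw: unknown) => I; // stands in for a Zod schema's parse()
  execute: (args: { input: I; ctx: Ctx }) => Promise<O>;
}

type ActionResult<O> = { ok: true; data: O } | { ok: false; error: string };

// Channel 1: a function a UI button can await. Errors are returned as values.
function toServerAction<I, O>(
  tool: ToolDef<I, O>,
  resolveCtx: () => Promise<Ctx>,
): (raw: unknown) => Promise<ActionResult<O>> {
  return async (raw) => {
    try {
      const ctx = await resolveCtx();
      const data = await tool.execute({ input: tool.parse(raw), ctx });
      return { ok: true, data };
    } catch (error) {
      return { ok: false, error: error instanceof Error ? error.message : String(error) };
    }
  };
}

// Channel 2: a record an LLM tool loop can call, bound to one session's ctx.
function toAgentTool<I, O>(tool: ToolDef<I, O>, ctx: Ctx) {
  return {
    description: tool.description,
    execute: (raw: unknown) => tool.execute({ input: tool.parse(raw), ctx }),
  };
}
```

Both adapters call the same `execute`, so the business logic exists once and only the context source differs.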
Context
`ToolContext<TContextExt>` is type-parameterized: Oliver provides the base (`orgId`, `userId`, `source`) and the builder extends it. The system prompt is a stable prefix (instructions + tool list + tenant) that providers can cache, followed by a mutable suffix (page context, pending approvals) rebuilt each turn. Manus reports roughly 10x cost reduction from a stable prefix on long sessions.
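The stable-prefix and mutable-suffix split can be sketched as a pure function. This is a minimal sketch, not Oliver's prompt format: `PromptParts`, `assembleSystemPrompt`, and the section headings are hypothetical.

```typescript
// Hypothetical input shape for illustration.
interface PromptParts {
  instructions: string[]; // already-loaded markdown layers
  tools: { name: string; description: string }[];
  orgId: string;
  pageRoute?: string;
  pendingApprovalIds: string[];
}

function assembleSystemPrompt(parts: PromptParts): { prefix: string; suffix: string; prompt: string } {
  // Sort tools by name so registration order cannot change the prefix bytes.
  const toolLines = [...parts.tools]
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((t) => `- ${t.name}: ${t.description}`);

  // Stable prefix: identical on every turn of a tenant's session.
  const prefix = [
    ...parts.instructions,
    `## Tools\n${toolLines.join("\n")}`,
    `## Tenant\norgId: ${parts.orgId}`,
  ].join("\n\n");

  // Mutable suffix: rebuilt each turn and always placed after the prefix.
  const suffix = [
    `## Page\nroute: ${parts.pageRoute ?? "unknown"}`,
    `## Pending approvals\n${parts.pendingApprovalIds.join(", ") || "none"}`,
  ].join("\n\n");

  return { prefix, suffix, prompt: `${prefix}\n\n${suffix}` };
}
```

Because per-turn data never appears before the end of the prefix, two turns of the same session share a byte-identical prompt head, which is what provider-side prefix caching keys on.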
Approval Gates (HITL)
When the agent calls a requiresApproval: true tool:
- Compute preview (best-effort).
- Insert row in `oliver.pending_tools` with status `pending_approval`.
- Return `awaiting_approval` + `pendingToolId` to the LLM.

The UI queries `agent.listPendingTools(orgId)` and renders cards. On approve, Oliver re-checks `precondition`, runs `execute`, updates the pending row, and audits the full lifecycle. Re-invocation guard: proposing the same action while one is pending returns the existing `pendingToolId`.
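The propose and approve flow above can be sketched with an in-memory store standing in for `oliver.pending_tools`. This is an illustration only: `PendingTools`, its method names, the row shape, and the terminal statuses on the pending row are assumptions, and matching proposals by `JSON.stringify` is a simplification that is sensitive to key order.

```typescript
// Hypothetical in-memory stand-in for the oliver.pending_tools table.
type PendingStatus = "pending_approval" | "succeeded" | "failed" | "rejected";

interface PendingRow {
  id: string;
  orgId: string;
  toolName: string;
  inputJson: string;
  status: PendingStatus;
}

class PendingTools {
  private rows: PendingRow[] = [];

  // Re-invocation guard: an identical in-flight proposal returns the existing row.
  propose(orgId: string, toolName: string, input: unknown): PendingRow {
    const inputJson = JSON.stringify(input);
    const existing = this.rows.find(
      (r) =>
        r.status === "pending_approval" &&
        r.orgId === orgId &&
        r.toolName === toolName &&
        r.inputJson === inputJson,
    );
    if (existing) return existing;
    const row: PendingRow = {
      id: `pt_${this.rows.length + 1}`,
      orgId,
      toolName,
      inputJson,
      status: "pending_approval",
    };
    this.rows.push(row);
    return row;
  }

  // Approve path: re-check the invariant first, since state may have changed
  // between the proposal and the human click.
  async approve(
    id: string,
    precondition: () => Promise<void>,
    execute: () => Promise<void>,
  ): Promise<PendingStatus> {
    const row = this.rows.find((r) => r.id === id);
    if (!row || row.status !== "pending_approval") throw new Error(`no pending tool ${id}`);
    try {
      await precondition();
      await execute();
      row.status = "succeeded";
    } catch {
      row.status = "failed";
    }
    return row.status;
  }
}
```

If the precondition throws at approve time, `execute` never runs, which is the "stale state between propose and approve" case the hook is meant to catch.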
Audit
| Status | When |
|---|---|
| invoked | execute() starts |
| pending_approval | HITL card inserted |
| approved / rejected | User clicked |
| succeeded / failed | execute() returned |
| verified / failed_verification / verification_skipped | After verify hook |
Rows grouped by traceId (non-HITL) or linked via pendingToolId (HITL). Writes are non-throwing — failed insert falls through to onAuditFailure (default: console.error).
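A non-throwing audit write can be sketched as a wrapper that routes failures to a handler instead of the caller. This is a minimal sketch: `makeAuditWriter`, `AuditEntry`, and the sink signature are hypothetical; only the `onAuditFailure` name and the `console.error` default come from the description above.

```typescript
// Hypothetical entry shape for illustration.
interface AuditEntry {
  traceId: string;
  toolName: string;
  status: string;
}

type AuditSink = (entry: AuditEntry) => Promise<void>;
type AuditFailureHandler = (error: unknown, entry: AuditEntry) => void;

function makeAuditWriter(
  sink: AuditSink,
  onAuditFailure: AuditFailureHandler = (error, entry) =>
    console.error("audit write failed", entry, error),
): (entry: AuditEntry) => Promise<void> {
  return async (entry) => {
    try {
      await sink(entry);
    } catch (error) {
      try {
        onAuditFailure(error, entry);
      } catch {
        // A broken failure handler must not surface to the tool call either.
      }
    }
  };
}
```

With this shape, a database outage in the audit path degrades logging but never turns a successful tool call into a failed one.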
Instructions
Layered markdown loaded once at module init (SOUL.md naming inspired by OpenClaw/Hermes, structure adapted for SaaS):
- `SOUL.md`: voice, boundaries, non-negotiable rules
- `domain.md`: concepts, entities, vocabulary
- `playbook.md`: workflows (DISCOVER → SUMMARIZE → EXECUTE)
- `lessons.md`: learned corrections across sessions

Edits require a dev server restart. Instructions are not re-read per turn, because a prefix that can change mid-session would invalidate the KV-cache.
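Layering and load-once behaviour can be sketched as a fixed-order concatenation behind a module-level cache. This is a sketch under assumptions: `composeInstructions`, `getInstructions`, the HTML-comment separators, and skipping missing layers are all hypothetical; the real `loadInstructions` may behave differently.

```typescript
// Fixed layer order: SOUL first, lessons last.
const LAYER_ORDER = ["SOUL.md", "domain.md", "playbook.md", "lessons.md"] as const;

// Joins the layers in order; assumption: missing or empty layers are skipped.
function composeInstructions(files: Record<string, string | undefined>): string {
  const sections: string[] = [];
  for (const name of LAYER_ORDER) {
    const body = files[name]?.trim();
    if (body) sections.push(`<!-- ${name} -->\n${body}`);
  }
  return sections.join("\n\n");
}

// Module-level cache: the reader runs once, so later file edits are not seen
// until the process restarts.
let cachedInstructions: string | undefined;

function getInstructions(read: () => Record<string, string | undefined>): string {
  if (cachedInstructions === undefined) {
    cachedInstructions = composeInstructions(read());
  }
  return cachedInstructions;
}
```

The cache is what makes the restart requirement visible: the composed string stays byte-identical for the life of the process.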
Database
Tables in a dedicated oliver Postgres schema (not public):
```
oliver.pending_tools   -- HITL state machine
oliver.audit_log       -- invocation + verification log
```

Same database as your app, so cross-schema transactional atomicity is available.
Slash commands
Chat shortcuts that bypass the LLM. When the user types /pending or /help, the chat route intercepts BEFORE calling the model — cheap, deterministic, no token burn. Useful for harness state inspection, /clear-style operator commands, and surfacing what the agent can do.
Three commands ship built in:
| Command | What it does |
|---|---|
| /help | Lists all available commands. |
| /tools | Lists registered tools with descriptions, marks HITL ones. |
| /pending | Lists active HITL approvals waiting for the user. |
Adopters add custom commands via defineSlashCommand and register them on the agent:
```typescript
import { defineSlashCommand } from "oliver-agent";
import { desc, eq } from "drizzle-orm";
// `db` and the `auditLog` table are app-level; import them from your own modules.

const audit = defineSlashCommand({
  name: "audit",
  description: "Show the last 10 audit entries.",
  handler: async ({ ctx }) => {
    // Scope the query to the caller's organization.
    const rows = await db.select().from(auditLog)
      .where(eq(auditLog.orgId, ctx.orgId))
      .orderBy(desc(auditLog.createdAt)).limit(10);
    return rows.map((r) => `${r.toolName} → ${r.status}`).join("\n");
  },
});

createAgent({
  tools: allTools,
  commands: [audit], // user commands take precedence; can shadow built-ins
  ...
});
```

Wire in your chat route and short-circuit BEFORE `streamText`:

```typescript
const slashText = await oliver.handleSlashCommand(messages, ctx);
if (slashText !== null) return oliver.respondWithText(slashText);
// otherwise: streamText(...)
```

`respondWithText` returns a UI message stream Response, so the chat UI sees the result like any other assistant message.
UI-only commands (/clear, /new) live in the chat UI — assistant-ui exposes aui.thread.reset() for that. They never need to round-trip to the backend.
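The intercept-before-the-model step can be sketched as a registry lookup on the last user message. This is an illustration, not Oliver's parser: `SlashCommand`, `buildRegistry`, `dispatchSlash`, the accepted command syntax, and returning `null` for unknown commands are assumptions.

```typescript
// Hypothetical command shape for illustration.
interface SlashCommand {
  name: string;
  description: string;
  handler: (args: string) => Promise<string> | string;
}

// Custom commands are registered after built-ins, so a same-name custom
// command replaces (shadows) the built-in one.
function buildRegistry(builtIns: SlashCommand[], custom: SlashCommand[]): Map<string, SlashCommand> {
  const registry = new Map<string, SlashCommand>();
  for (const command of [...builtIns, ...custom]) {
    registry.set(command.name.toLowerCase(), command);
  }
  return registry;
}

// Returns the command output, or null when the text should go to the LLM.
async function dispatchSlash(
  text: string,
  registry: Map<string, SlashCommand>,
): Promise<string | null> {
  const match = /^\/([a-z][\w-]*)(?:\s+(.*))?$/i.exec(text.trim());
  if (!match) return null;
  const command = registry.get((match[1] ?? "").toLowerCase());
  // Assumption: unknown commands fall through to the model instead of erroring.
  if (!command) return null;
  return command.handler(match[2] ?? "");
}
```

A `null` result is the signal to continue into the normal streaming path; any string result is returned to the chat without spending tokens.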
Testing
```shell
pnpm test        # 97 tests
pnpm demo:mutex  # CLI proof of concurrencyKey serialization
pnpm typecheck
```

Roadmap
Full detail in ROADMAP.md. Quick view:
v0.1 candidates (likely):
- Subagent primitive: a tool spawns a scoped LLM loop with a subset of tools. Atomic compound flows, isolated context. (Claude Code Task pattern.)
- Hooks pipeline: Pre/PostToolUse extensibility seam. Approval Gates becomes one hook implementation among many. (Claude Code / OpenClaude / Pi pattern.)
- MCP channel: expose tools as Model Context Protocol endpoints for external agents (Claude Desktop, Cursor, etc.).
- DB advisory locks: replace the process-level `concurrencyKey` mutex with `pg_advisory_xact_lock` for multi-instance deployments.
- Approval card React component: headless, ship the ~120 LOC pattern as a reusable primitive.
- `create-oliver-app` template: opinionated Next.js + Drizzle + assistant-ui starter.
v0.2 exploratory:
- Granular permission policy (allow/ask/deny per tool, source-attributed): richer than binary `requiresApproval`. (Codex / OpenClaude pattern.)
- Tool discoverability tools (`tool_search`, `tool_suggest`): kicks in when tools >30. (Codex / OpenClaude pattern.)
- Pending approval expiration cron.
- User modeling (Honcho-style dialectic).
Explicit non-goals: workflow engine (use Temporal/Inngest), chat UI library, auth solution, hosted control plane, generic CRUD generation.
License
MIT. See LICENSE.
🇧🇷 Versão em português
Oliver é um harness em TypeScript para embedar agentes LLM dentro de produtos SaaS multi-tenant. Você define uma tool uma vez; Oliver entrega ela como server action do Next.js (pros botões da UI) e como tool do Vercel AI SDK (pro chat agent). Approval gates, audit log, checagem de invariantes de domínio e controle de concorrência por recurso vêm inclusos.
defineTool({
name: "applyDiscount",
description: "Aplica desconto percentual em uma cotação.",
input: z.object({ quoteNumber: z.string(), percent: z.number() }),
requiresApproval: true,
precondition: async ({ input }) => assertDraft(input.quoteNumber),
concurrencyKey: ({ input }) => `quote:${input.quoteNumber}`,
previewChange: async ({ input }) => buildDiff(input),
execute: async ({ input }) => applyDiscountToDB(input),
verify: async ({ input, result }) => checkDBMatches(result),
});Essa única definição roda nos dois canais, garante a invariante "cotação publicada é read-only" em todo passo, bloqueia race em chamadas paralelas na mesma cotação, mostra um card before/after no chat, só executa depois do clique humano, e confere o DB depois.
Instalação
Dois caminhos pro mesmo end-state. Escolhe o que combina com seu workflow.
Opção A — Coding agent (~2 min)
Se você já trabalha com Claude Code, Codex ou Cursor, fala pro seu agent:
"Instala o Oliver nesse projeto — segue as instruções em https://raw.githubusercontent.com/caio-overmind-ventures/oliver-harness-saas/main/INSTALL.md"
Seu agent lê INSTALL.md, detecta seu stack (lib de auth, setup do ORM, layout do projeto), gera os arquivos, roda a migration, e te reporta.
Opção B — Manual (~15 min)
8 passos que você executa. Veja Install manual mais abaixo.
Background
Oliver foi construído depois de estudar um conjunto de harnesses de agente em dois escopos.
Coding agents (mais referência pública):
- Codex — coding agent terminal da OpenAI (Rust)
- Claude Code — CLI da Anthropic; closed source mas amplamente documentado
- OpenClaude — clone TypeScript do Claude Code
- walkinglabs / learn-harness-engineering — princípio verification-first
Generalist personal-AI harnesses (lineage Pi → OpenClaw → Hermes):
- Pi — harness minimal de coding agent do Mario Zechner
- OpenClaw — embeda Pi, adiciona bootstrap files (SOUL/AGENTS/USER/IDENTITY/TOOLS) + multi-canal
- Hermes Agent — sucessor da Nous Research pro OpenClaw
Mais: Manus sobre economia de prefix-cache (~10x redução de custo com prefixo de system prompt estável).
O que pegamos emprestado (com atribuição)
| Padrão | Fonte | Onde vive no Oliver |
|---|---|---|
| Verification-first ("only passing verify counts as done") | walkinglabs | Hook verify com timeout 5s |
| Prefixo estável + sufixo mutável pro KV-cache | Manus | Montagem do system prompt no Context |
| Layered markdown instructions (SOUL.md, estilo AGENTS.md) | OpenClaw / Hermes | Componente Instructions (SOUL/domain/playbook/lessons) |
| Subagent isolado com tool allowlist próprio | Claude Code | Candidato no Roadmap v0.1 |
| Hooks pipeline (Pre/PostToolUse) pra extensibilidade | Claude Code / Pi / OpenClaude | Candidato no Roadmap v0.1 |
| Tool interface split (validate vs permission gate) | OpenClaude | Implícito em precondition + execute |
| Approval × capability como eixos ortogonais | Codex | Candidato no Roadmap v0.1 (mais rico que o binário requiresApproval) |
| Tool discoverability tools (tool_search) | Codex / OpenClaude | Roadmap, ativa quando tools >30 |
| Progressive disclosure de skills | Pi / OpenClaude / Codex | Roadmap |
O que é Oliver-original
Nenhum dos harnesses estudados mira SaaS multi-tenant com concerns B2B first-class. Então:
- Audit log como primitivo — nenhum tem. Oliver escreve todo evento de lifecycle (
invoked,approved,succeeded,verified, etc.) emoliver.audit_logcom writes non-throwing e handleronAuditFailureplugável. - Multi-tenancy by design —
orgId/userIdfluem por toda tool call desde o início. Chamadas cross-tenant são impossíveis por construção. - Dual-channel Gateway — mesma definição
defineToolroteia pra server action Next.js (botões UI) E tool Vercel AI SDK (chat). Sem lógica duplicada. - State machine HITL no DB — tabela
oliver.pending_toolscom re-invocation guard (LLM propondo a mesma ação duas vezes em voo retorna o card existente, não duplicado). - Hook
preconditionem todo divergence point — roda antes de preview, antes de execute, antes de pending insert, E DE NOVO no approve. Pega o race clássico "estado mudou entre propose e approve". - Mutex
concurrencyKey— FIFO queue por key no nível do processo. Duas chamadas LLM no mesmoquote:abcserializam no execute sem o builder escrever locking.
Arquitetura
Seis componentes:
| Componente | O que faz |
|---|---|
| Tools | Operações atômicas via defineTool() com schemas Zod. |
| Gateway | Roteia uma tool pra múltiplos canais: server action Next.js, tool Vercel AI SDK, MCP (v0.1). |
| Context | Monta o system prompt das instructions + lista de tools. Prefixo estável KV-cache-friendly. |
| Approval Gates | State machine no DB pra tools com requiresApproval: true. |
| Audit | Todo evento de lifecycle escrito em oliver.audit_log com writes non-throwing. |
| Instructions | Markdown em camadas (SOUL → domain → playbook → lessons) carregado uma vez no boot. |
Como se conectam:
defineTool() ─┐
├─► Gateway ──► Server Action (botões UI)
Instructions ─┤ └► Agent Tool (chat LLM)
│ └► MCP (v0.1)
┌──────────┴────────┐
▼ ▼
Context Approval Gates ──► Audit
(system prompt) (state HITL) (oliver.audit_log)O builder escreve tools e instructions. Oliver faz o resto: monta o system prompt, roteia cada invocação pro canal certo, intercepta tools HITL pelo state machine, registra cada passo. Reads/writes compartilham o mesmo Postgres da app pelo schema dedicado oliver, então transações cross-schema funcionam.
O que Oliver não é: workflow engine, biblioteca de chat UI, solução de auth, camada de routing de modelo.
Install manual
O caminho de 8 passos. Mesmo end-state que o install com AI acima; escolhe esse se você não tem coding agent no workflow ou prefere dirigir cada passo. ~15 minutos. Assume Next.js 15+ com Drizzle + Postgres.
1. Instalar
pnpm add oliver-agent ai zod drizzle-orm2. Schema
// drizzle.config.ts
schema: [
"./src/db/schema.ts",
"./node_modules/oliver-agent/src/db/schema.ts",
],pnpm drizzle-kit generate && pnpm drizzle-kit migrate. Cria oliver.pending_tools + oliver.audit_log.
3. Criar o agent
// lib/oliver.ts
import "server-only";
import { createAgent, loadInstructions } from "oliver-agent";
import { headers } from "next/headers";
import { auth } from "@/lib/auth";
import { database } from "@/db";
import { allTools } from "@/tools";
const instructions = await loadInstructions("./instructions");
export const oliver = createAgent({
tools: allTools,
instructions,
db: database,
resolveServerActionContext: async (override) => {
const session = await auth.getSession({ headers: await headers() });
if (!session) throw new Error("Não autenticado");
return {
orgId: await resolveOrgId(override?.slug),
userId: session.user.id,
source: "ui",
slug: override?.slug,
};
},
});4. Definir sua primeira tool
// tools/createCustomer.ts
import { defineTool } from "oliver-agent";
import { z } from "zod";
export const createCustomer = defineTool({
name: "createCustomer",
description: "Cria um novo cliente na organização atual.",
input: z.object({
name: z.string().min(1).max(256),
email: z.string().email().optional(),
}),
execute: async ({ input, ctx }) => {
const id = generateId.customer();
await database.insert(customers).values({
id, organizationId: ctx.orgId, name: input.name, email: input.email ?? null,
});
return { id, name: input.name };
},
});// tools/index.ts
import { createCustomer } from "./createCustomer";
export const allTools = [createCustomer] as const;
export { createCustomer };5. Expor como server action
// actions/customers.ts
"use server";
import { oliver } from "@/lib/oliver";
import { createCustomer } from "@/tools";
export const createCustomerAction = oliver.serverAction(createCustomer);Chama de qualquer client component: await createCustomerAction({ name: "Acme" }, { slug }).
6. Wire da rota de chat
// app/api/chat/route.ts
import { ToolLoopAgent, createAgentUIStreamResponse, stepCountIs } from "@repo/ai";
import { oliver } from "@/lib/oliver";
export async function POST(req: Request) {
const session = await auth.getSession({ headers: await headers() });
const { messages, context } = await req.json();
const oliverSession = oliver.assembleSession({
ctx: { orgId: await resolveOrgId(context.orgSlug), userId: session.user.id, source: "agent", slug: context.orgSlug },
pageContext: { route: context.page },
});
const agent = new ToolLoopAgent({
model: models.chat,
instructions: oliverSession.systemPrompt,
tools: oliverSession.tools,
stopWhen: stepCountIs(25),
});
return createAgentUIStreamResponse({ agent, uiMessages: messages });
}7. Chat UI
Oliver não traz chat UI. Escolha: assistant-ui, Shadcn chat, ou faz o seu com useChat de @ai-sdk/react.
8. Approval card (quando entregar uma tool HITL)
Quando uma tool tem requiresApproval: true, o resultado vira { status: "awaiting_approval", pendingToolId, ... }. Duas peças:
// actions/oliver-approvals.ts
"use server";
import { oliver } from "@/lib/oliver";
export async function approvePendingToolAction(input, ctx) {
return oliver.approvePendingTool(input, ctx);
}
export async function rejectPendingToolAction(input, ctx) {
return oliver.rejectPendingTool(input, ctx);
}E um componente que detecta a shape e renderiza botões Approve/Reject. Implementação de referência: ~120 LOC de React headless.
Pronto. Cada nova tool entrega nos dois canais, ganha audit + HITL + concurrency de graça.
O que você traz
| Concern | Escolha | |---|---| | Chat UI | assistant-ui, Shadcn chat, custom | | Autenticação | better-auth, Clerk, NextAuth, Supabase Auth | | Database + ORM | Drizzle + Postgres (Neon recomendado) | | Config do modelo LLM | Vercel AI SDK + chave do provider | | Approval card UI | Um componente de ~120 LOC |
Um futuro template create-oliver-app vai pré-cabear um stack opinativo (ver Roadmap).
Mergulho nos componentes
Tools
defineTool<typeof inputSchema, OutputType, ContextExt>({
name, description, input,
execute: async ({ input, ctx }) => ...,
// opcionais:
precondition, concurrencyKey, requiresApproval, previewChange, verify,
});Sub-features que se sobrepõem na tool:
precondition— checagem de invariante de domínio em todo divergence point. JogaToolErrorpra bloquear. Pega o race "estado mudou entre propose e approve".concurrencyKey—({ input }) => "quote:" + input.idserializa chamadas com mesma key noexecute(). FIFO queue por key, deadlock-safe.verify— roda depois do execute pra confirmar DB. 5s timeout. Vai pro audit (verified/failed_verification/verification_skipped).previewChange— só invocado em HITL; produz o diff before/after no card.
Gateway
Três pontos de entrada em agent:
agent.serverAction(tool)—(input, ctxOverride?) => Promise<ActionResult>. Embrulha em"use server".agent.agentTools({ ctx })— record de tools Vercel AI SDK ligado ao contexto da sessão.agent.assembleSession({ ctx, pageContext? })—{ systemPrompt, tools }juntos. Recomendado pra rotas de chat.
Context
ToolContext<TContextExt> é parametrizado por tipo. Oliver fornece a base (orgId, userId, source); o builder estende. System prompt = prefixo estável (instructions + tools + tenant) cacheado pelos providers + sufixo mutável (page context, pending approvals) por turn. ~10x redução de custo em sessões longas (padrão Manus).
Approval Gates (HITL)
Quando o agent chama tool com requiresApproval: true:
- Computa preview (best-effort).
- Insere row em
oliver.pending_toolscom statuspending_approval. - Retorna
awaiting_approval+pendingToolIdpro LLM.
UI consulta agent.listPendingTools(orgId), renderiza cards. Approve → re-checa precondition → roda execute → atualiza pending row → loga lifecycle no audit. Re-invocation guard: mesma ação enquanto pending retorna o pendingToolId existente.
Audit
| Status | Quando |
|---|---|
| invoked | execute() começa |
| pending_approval | Card HITL inserido |
| approved / rejected | Usuário clicou |
| succeeded / failed | execute() retornou |
| verified / failed_verification / verification_skipped | Após verify |
Rows agrupados por traceId (não-HITL) ou linkados via pendingToolId (HITL). Writes non-throwing — insert que falha cai em onAuditFailure (default: console.error).
Instructions
Markdown em camadas carregado uma vez no module init (naming SOUL.md inspirado em OpenClaw/Hermes, estrutura adaptada pra SaaS):
SOUL.md— voz, boundaries, regras não-negociáveisdomain.md— conceitos, entidades, vocabulárioplaybook.md— workflows (DISCOVER → SUMMARIZE → EXECUTE)lessons.md— correções aprendidas entre sessões
Edits exigem restart do dev server — leitura por turn invalidaria o KV-cache.
Database
Tabelas em schema Postgres dedicado oliver (não public):
oliver.pending_tools -- state machine HITL
oliver.audit_log -- log de invocação + verificaçãoMesmo banco da app — atomicidade transacional cross-schema disponível.
Slash commands
Atalhos no chat que pulam o LLM. Quando o user digita /pending ou /help, a chat route intercepta ANTES de chamar o modelo — barato, determinístico, sem queimar token. Útil pra inspecionar estado do harness, comandos tipo /clear de operador, e mostrar o que o agent sabe fazer.
Três commands vêm built-in:
| Command | O que faz |
|---|---|
| /help | Lista todos os comandos disponíveis. |
| /tools | Lista tools registradas com descriptions, marca as HITL. |
| /pending | Lista approvals HITL ativos esperando o user. |
Adopters adicionam comandos custom via defineSlashCommand e registram no agent:
import { defineSlashCommand } from "oliver-agent";
const audit = defineSlashCommand({
name: "audit",
description: "Mostra os últimos 10 audit entries.",
handler: async ({ ctx }) => {
const rows = await db.select().from(auditLog)
.where(eq(auditLog.orgId, ctx.orgId))
.orderBy(desc(auditLog.createdAt)).limit(10);
return rows.map((r) => `${r.toolName} → ${r.status}`).join("\n");
},
});
createAgent({
tools: allTools,
commands: [audit], // user commands têm precedência; podem sobrescrever built-ins
...
});Wire na chat route — short-circuit ANTES de streamText:
const slashText = await oliver.handleSlashCommand(messages, ctx);
if (slashText !== null) return oliver.respondWithText(slashText);
// senão: streamText(...)respondWithText retorna um Response de UI message stream, então o chat UI vê o resultado como qualquer outra mensagem do assistant.
Comandos UI-only (/clear, /new) vivem no chat UI — assistant-ui expõe aui.thread.reset() pra isso. Não precisam round-trip pro backend.
Testing
pnpm test # 97 testes
pnpm demo:mutex # prova CLI de serialização do concurrencyKey
pnpm typecheckRoadmap
Detalhe completo em ROADMAP.md. Resumo:
Candidatos v0.1 (prováveis):
- Primitivo de subagent — uma tool spawna um loop LLM com escopo de tools restrito. Compound flows atômicos, contexto isolado. (Padrão Task do Claude Code.)
- Hooks pipeline — seam de extensibilidade Pre/PostToolUse. Approval Gates vira uma implementação de hook entre várias. (Padrão Claude Code / OpenClaude / Pi.)
- Canal MCP — expor tools como endpoints Model Context Protocol pra agentes externos (Claude Desktop, Cursor, etc.).
- DB advisory locks — substituir mutex
concurrencyKeyprocess-level porpg_advisory_xact_lockpra deployments multi-instância. - Componente React de approval card — headless, entregar o pattern de ~120 LOC como primitivo reutilizável.
- Template
create-oliver-app— starter Next.js + Drizzle + assistant-ui pré-cabeado.
v0.2 exploratório:
- Permission policy granular (allow/ask/deny por tool, com source attribution) — mais rico que
requiresApprovalbinário. (Padrão Codex / OpenClaude.) - Tool discoverability tools (
tool_search,tool_suggest) — ativa quando tools >30. (Padrão Codex / OpenClaude.) - Cron de expiração de pending approvals.
- User modeling (Honcho-style dialectic).
Non-goals explícitos: workflow engine (use Temporal/Inngest), biblioteca de chat UI, solução de auth, control plane hosted, geração CRUD genérica.
Licença
MIT. Veja LICENSE.
