oliver-agent

v0.2.0

Opinionated agent harness for Next.js + Drizzle + Postgres SaaS. One tool → Next.js server action + Vercel AI SDK tool. HITL, audit, precondition, concurrency built in.

Downloads

367

Oliver

Oliver is a TypeScript harness for embedding LLM-powered agents inside multi-tenant SaaS products. You define a tool once; Oliver routes it to your UI buttons (as a Next.js server action) and to your chat agent (as a Vercel AI SDK tool). Approval gates, audit log, domain invariant checks, and per-resource concurrency control come included.

defineTool({
  name: "applyDiscount",
  description: "Apply a percentage discount to a quote.",
  input: z.object({ quoteNumber: z.string(), percent: z.number() }),
  requiresApproval: true,
  precondition: async ({ input }) => assertDraft(input.quoteNumber),
  concurrencyKey: ({ input }) => `quote:${input.quoteNumber}`,
  previewChange: async ({ input }) => buildDiff(input),
  execute: async ({ input }) => applyDiscountToDB(input),
  verify: async ({ input, result }) => checkDBMatches(result),
});

That single definition ships through both channels, enforces the "published quote is read-only" invariant at every step, blocks races on the same quote, shows a before/after card in the chat, runs only after a human click, and double-checks the DB after.


Install

Two paths to the same end-state. Pick what fits your workflow.

Option A — AI coding agent (~2 min)

If you already work with Claude Code, Codex, or Cursor, tell your agent:

"Install Oliver in this project — follow the instructions at https://raw.githubusercontent.com/caio-overmind-ventures/oliver-harness-saas/main/INSTALL.md"

Your agent reads INSTALL.md, detects your stack (auth lib, ORM setup, project layout), generates the files, runs the migration, and reports back.

Option B — Manual (~15 min)

8 steps you run yourself. See Manual install further down.


Background

Oliver was built after studying a set of agent harnesses across two scopes.

Coding agents (most public references):

Generalist personal-AI harnesses (Pi → OpenClaw → Hermes lineage):

  • Pi — Mario Zechner's minimal coding-agent harness
  • OpenClaw — embeds Pi, adds bootstrap files (SOUL/AGENTS/USER/IDENTITY/TOOLS) + multi-channel
  • Hermes Agent — Nous Research's successor to OpenClaw

Plus: Manus on prefix-cache economics (~10x cost reduction with stable system-prompt prefix).

What we borrowed (with attribution)

| Pattern | Source | Where it lives in Oliver |
|---|---|---|
| Verification-first ("only passing verify counts as done") | walkinglabs | verify hook with 5s timeout |
| Stable prefix + mutable suffix for KV-cache | Manus | Context system-prompt assembly |
| Layered markdown instructions (SOUL.md, AGENTS.md-style) | OpenClaw / Hermes | Instructions component (SOUL/domain/playbook/lessons) |
| Subagent isolation with own tool allowlist | Claude Code | Roadmap v0.1 candidate |
| Hooks pipeline (Pre/PostToolUse) for extensibility | Claude Code / Pi / OpenClaude | Roadmap v0.1 candidate |
| Tool interface split (validate vs permission gate) | OpenClaude | Implicit in precondition + execute |
| Approval × capability as orthogonal axes | Codex | Roadmap v0.1 candidate (richer than current binary requiresApproval) |
| Tool discoverability tools (tool_search) | Codex / OpenClaude | Roadmap, kicks in when tools >30 |
| Progressive disclosure of skills | Pi / OpenClaude / Codex | Roadmap |

What's Oliver-original

None of the studied harnesses target multi-tenant SaaS with first-class B2B concerns. So:

  • Audit log as a primitive — none of them have one. Oliver writes every lifecycle event (invoked, approved, succeeded, verified, etc.) to oliver.audit_log with non-throwing writes and a pluggable onAuditFailure handler.
  • Multi-tenancy by design: orgId / userId flow through every tool call from the start. Cross-tenant calls are impossible by construction.
  • Dual-channel Gateway — same defineTool definition routes to a Next.js server action (UI buttons) AND a Vercel AI SDK tool (chat agent). No duplicated business logic.
  • DB-backed HITL state machine: the oliver.pending_tools table with a re-invocation guard (an LLM proposing the same action twice while one is in flight gets the existing card back, not a duplicate).
  • precondition hook at every divergence point — runs before preview, before execute, before pending insert, AND again at approve time. Catches the classic "stale state between propose and approve" race.
  • concurrencyKey mutex — process-level FIFO queue per key. Two LLM calls hitting the same quote:abc resource serialize at execute time without the builder writing any locking code.
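
To make the per-key FIFO idea concrete, here is a minimal sketch of one way to serialize tasks per key with promise chaining. It is not Oliver's implementation: `KeyedMutex`, `run`, and `demoSameKey` are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: per-key FIFO serialization via promise chaining.
// `KeyedMutex` and `run` are illustrative names, not Oliver's API.
class KeyedMutex {
  // Tail of the queue for each key; a new task waits on the current tail.
  private tails = new Map<string, Promise<void>>();

  async run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The new tail settles only after every earlier task and this one finish.
    const tail = previous.then(() => current);
    this.tails.set(key, tail);
    await previous;
    try {
      return await task();
    } finally {
      release();
      // Drop the entry once nothing has queued behind this task.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Two tasks on the same key: the second starts only after the first ends.
async function demoSameKey(): Promise<string[]> {
  const mutex = new KeyedMutex();
  const log: string[] = [];
  await Promise.all([
    mutex.run("quote:abc", async () => {
      log.push("A start");
      await sleep(20);
      log.push("A end");
    }),
    mutex.run("quote:abc", async () => {
      log.push("B start");
      log.push("B end");
    }),
  ]);
  return log;
}
```

Because the queue lives in process memory, a sketch like this only serializes calls inside one Node process, which matches the "process-level" wording above.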

Architecture

Oliver is six components, each with a focused job:

| Component | What it does |
|---|---|
| Tools | Atomic operations defined via defineTool() with Zod schemas. |
| Gateway | Routes one tool to multiple channels: Next.js server action, Vercel AI SDK tool, MCP (v0.1). |
| Context | Assembles the system prompt from instructions + tool list. KV-cache-friendly stable prefix. |
| Approval Gates | DB-backed state machine for tools marked requiresApproval: true. |
| Audit | Every lifecycle event written to oliver.audit_log with non-throwing writes. |
| Instructions | Layered markdown (SOUL → domain → playbook → lessons) loaded once at boot. |

How they fit together:

defineTool() ─┐
              ├─► Gateway ──► Server Action  (UI buttons)
Instructions ─┤            └► Agent Tool      (chat LLM)
              │            └► MCP             (v0.1)
   ┌──────────┴────────┐
   ▼                   ▼
Context              Approval Gates ──► Audit
(system prompt)      (HITL state)       (oliver.audit_log)

The builder writes tools and instructions. Oliver does the rest: builds the system prompt, routes each invocation to the right channel, intercepts HITL tools through the approval state machine, and records every step. Reads/writes share the same Postgres database as your app via the dedicated oliver schema, so cross-schema transactions work.
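
The dual-channel routing can be pictured as two thin adapters over one definition. The sketch below is an assumption-heavy illustration, not Oliver's Gateway: `ToolDef`, `toServerAction`, `toAgentTool`, and the `ActionResult` shape are all hypothetical, and a plain `parse` function stands in for the Zod schema.

```typescript
// Hypothetical sketch: one tool definition adapted to two call sites.
// None of these names or shapes are Oliver's API.
type Ctx = { orgId: string; userId: string; source: "ui" | "agent" };

interface ToolDef<I, O> {
  name: string;
  description: string;
  parse: (raw: unknown) => I; // stands in for a Zod schema
  execute: (args: { input: I; ctx: Ctx }) => Promise<O>;
}

type ActionResult<O> = { ok: true; data: O } | { ok: false; error: string };

// Channel 1: a function a UI button can call; errors become values.
function toServerAction<I, O>(
  tool: ToolDef<I, O>,
  resolveCtx: () => Promise<Omit<Ctx, "source">>,
) {
  return async (raw: unknown): Promise<ActionResult<O>> => {
    try {
      const ctx: Ctx = { ...(await resolveCtx()), source: "ui" };
      return { ok: true, data: await tool.execute({ input: tool.parse(raw), ctx }) };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  };
}

// Channel 2: an entry a chat agent loop can call, bound to a session context.
function toAgentTool<I, O>(tool: ToolDef<I, O>, session: Omit<Ctx, "source">) {
  return {
    description: tool.description,
    execute: async (raw: unknown): Promise<O> =>
      tool.execute({ input: tool.parse(raw), ctx: { ...session, source: "agent" as const } }),
  };
}

// Example definition used by both channels.
const echoOrg: ToolDef<{ name: string }, { label: string }> = {
  name: "echoOrg",
  description: "Return the name tagged with the tenant and channel.",
  parse: (raw) => {
    const name = (raw as { name?: unknown } | null)?.name;
    if (typeof name !== "string" || name.length === 0) throw new Error("name is required");
    return { name };
  },
  execute: async ({ input, ctx }) => ({ label: `${ctx.orgId}/${ctx.source}/${input.name}` }),
};
```

The business logic lives only in `execute`; each adapter differs in how it obtains the context and how it reports failure.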

What Oliver is not: a workflow engine, a chat UI library, an auth solution, or a model-routing layer. It's the harness between your domain logic and the LLM.


Manual install

The 8-step path. Same end-state as the AI install above; pick this if you don't have an AI coding agent in your workflow or want to drive each step yourself. ~15 minutes. Assumes you have a Next.js 15+ app with Drizzle + Postgres.

1. Install

pnpm add oliver-agent ai zod drizzle-orm

2. Schema

Add Oliver's tables to drizzle.config.ts:

schema: [
  "./src/db/schema.ts",
  "./node_modules/oliver-agent/src/db/schema.ts",
],

Then run pnpm drizzle-kit generate && pnpm drizzle-kit migrate. This creates oliver.pending_tools and oliver.audit_log in a separate Postgres schema.

3. Create the agent

// lib/oliver.ts
import "server-only";
import { createAgent, loadInstructions } from "oliver-agent";
import { headers } from "next/headers";
import { auth } from "@/lib/auth";
import { database } from "@/db";
import { allTools } from "@/tools";
// resolveOrgId (org slug to org id) is an app-specific helper; import it from your own code.

const instructions = await loadInstructions("./instructions");

export const oliver = createAgent({
  tools: allTools,
  instructions,
  db: database,
  resolveServerActionContext: async (override) => {
    const session = await auth.getSession({ headers: await headers() });
    if (!session) throw new Error("Not authenticated");
    return {
      orgId: await resolveOrgId(override?.slug),
      userId: session.user.id,
      source: "ui",
      slug: override?.slug,
    };
  },
});

ctx is fully typed — extend with anything your tools need (db, logger, flags, etc.).

4. Define your first tool

// tools/createCustomer.ts
import { defineTool } from "oliver-agent";
import { z } from "zod";
import { database } from "@/db";
// `customers` (Drizzle table) and `generateId` (ID helper) are app-specific;
// the two import paths below are assumptions, adjust them to your project.
import { customers } from "@/db/schema";
import { generateId } from "@/lib/ids";

export const createCustomer = defineTool({
  name: "createCustomer",
  description: "Create a new customer in the current organization.",
  input: z.object({
    name: z.string().min(1).max(256),
    email: z.string().email().optional(),
  }),
  execute: async ({ input, ctx }) => {
    const id = generateId.customer();
    // Scope the row to the caller's tenant via ctx.orgId.
    await database.insert(customers).values({
      id, organizationId: ctx.orgId, name: input.name, email: input.email ?? null,
    });
    return { id, name: input.name };
  },
});

// tools/index.ts
import { createCustomer } from "./createCustomer";
export const allTools = [createCustomer] as const;
export { createCustomer };

5. Expose as a server action

// actions/customers.ts
"use server";
import { oliver } from "@/lib/oliver";
import { createCustomer } from "@/tools";
export const createCustomerAction = oliver.serverAction(createCustomer);

Call from any client component: await createCustomerAction({ name: "Acme" }, { slug }).

6. Wire the chat route

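// app/api/chat/route.ts
// "@repo/ai" is a workspace alias in the original project; in a standalone app
// these exports would likely come from the "ai" package (assumption).
import { ToolLoopAgent, createAgentUIStreamResponse, stepCountIs } from "@repo/ai";
import { headers } from "next/headers";
import { auth } from "@/lib/auth";
import { oliver } from "@/lib/oliver";
// `models` and `resolveOrgId` are app-specific; import them from your own code.

export async function POST(req: Request) {
  const session = await auth.getSession({ headers: await headers() });
  // Reject unauthenticated requests before resolving any tenant data.
  if (!session) return new Response("Not authenticated", { status: 401 });
  const { messages, context } = await req.json();

  const oliverSession = oliver.assembleSession({
    ctx: { orgId: await resolveOrgId(context.orgSlug), userId: session.user.id, source: "agent", slug: context.orgSlug },
    pageContext: { route: context.page },
  });

  const agent = new ToolLoopAgent({
    model: models.chat,
    instructions: oliverSession.systemPrompt,
    tools: oliverSession.tools,
    stopWhen: stepCountIs(25),
  });

  return createAgentUIStreamResponse({ agent, uiMessages: messages });
}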

7. Chat UI

Oliver doesn't ship a chat UI. Pick one: assistant-ui, Shadcn chat, or roll your own with useChat from @ai-sdk/react.

8. Approval card (when you ship a HITL tool)

When a tool has requiresApproval: true, its result becomes { status: "awaiting_approval", pendingToolId, ... }. Two pieces:

// actions/oliver-approvals.ts
"use server";
import { oliver } from "@/lib/oliver";

export async function approvePendingToolAction(input, ctx) {
  return oliver.approvePendingTool(input, ctx);
}
export async function rejectPendingToolAction(input, ctx) {
  return oliver.rejectPendingTool(input, ctx);
}

Plus a component that detects the shape and renders Approve/Reject buttons. Reference implementation: ~120 LOC headless React.
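
Detecting the shape can be as small as a type guard. This is a sketch under assumptions: only `status: "awaiting_approval"` and the `pendingToolId` field come from the text above; treating `pendingToolId` as a string and the name `isAwaitingApproval` are illustrative choices.

```typescript
// Hypothetical sketch: narrow an unknown tool result to the HITL shape.
// Assumes pendingToolId is a string; the real type may differ.
interface AwaitingApproval {
  status: "awaiting_approval";
  pendingToolId: string;
}

function isAwaitingApproval(value: unknown): value is AwaitingApproval {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    candidate.status === "awaiting_approval" &&
    typeof candidate.pendingToolId === "string"
  );
}
```

A chat message renderer could call the guard on each tool result and render the Approve/Reject card only when it returns true.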

That's it. Each new tool ships through both channels, gets audit + HITL + concurrency for free.


What you bring

| Concern | Pick |
|---|---|
| Chat UI | assistant-ui, Shadcn chat, custom |
| Authentication | better-auth, Clerk, NextAuth, Supabase Auth |
| Database + ORM | Drizzle + Postgres (Neon recommended) |
| LLM model config | Vercel AI SDK + provider key |
| Approval card UI | One ~120 LOC component |

A future create-oliver-app template will pre-cable an opinionated stack (see Roadmap).


Component deep-dive

Tools

defineTool<typeof inputSchema, OutputType, ContextExt>({
  name, description, input,
  execute: async ({ input, ctx }) => ...,
  // optional:
  precondition, concurrencyKey, requiresApproval, previewChange, verify,
});

Sub-features layered on a tool:

  • precondition — domain invariant check at every divergence point. Throws ToolError to block. Catches "stale state between propose and approve" races.
  • concurrencyKey: ({ input }) => "quote:" + input.id serializes same-key execute() calls process-wide. FIFO queue per key, deadlock-safe.
  • verify — runs after execute to confirm DB matches. 5s timeout. Logged to audit (verified / failed_verification / verification_skipped).
  • previewChange — only invoked for HITL; produces the before/after diff on the approval card.
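
A verify step with a time limit can be sketched with `Promise.race`. The status names come from the list above; everything else is an assumption, including the name `runVerify`, mapping a timeout or a thrown error to `failed_verification`, and returning `verification_skipped` when no hook is defined.

```typescript
// Hypothetical sketch: run an optional verify hook under a timeout.
// How Oliver maps timeouts and missing hooks to statuses is assumed here.
type VerifyStatus = "verified" | "failed_verification" | "verification_skipped";

async function runVerify(
  verify: (() => Promise<boolean>) | undefined,
  timeoutMs = 5000,
): Promise<VerifyStatus> {
  if (!verify) return "verification_skipped";
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  try {
    // Whichever settles first wins; a slow hook loses to the timer.
    const outcome = await Promise.race([verify(), timeout]);
    return outcome === true ? "verified" : "failed_verification";
  } catch {
    // A throwing hook counts as a failed check in this sketch.
    return "failed_verification";
  } finally {
    clearTimeout(timer);
  }
}
```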

Gateway

Three entry points off agent:

  • agent.serverAction(tool): returns (input, ctxOverride?) => Promise<ActionResult>. Wrap in "use server".
  • agent.agentTools({ ctx }) — Vercel AI SDK tool record bound to session context.
  • agent.assembleSession({ ctx, pageContext? }): returns { systemPrompt, tools } together. Recommended for chat routes.

Context

ToolContext<TContextExt> is type-parameterized. Oliver provides the base (orgId, userId, source); the builder extends. System prompt = stable prefix (instructions + tool list + tenant) cached by providers + mutable suffix (page context, pending approvals) per turn. ~10x cost reduction on long sessions (Manus-pattern).
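
The stable-prefix idea can be illustrated with two small functions: one builds the part that should stay byte-identical across turns, the other appends the per-turn part. This is a sketch, not Oliver's Context component; the section headings, `buildStablePrefix`, and `buildSystemPrompt` are hypothetical.

```typescript
// Hypothetical sketch: stable prefix (cacheable) + mutable suffix (per turn).
interface StableParts {
  instructions: string;
  toolList: string[];
  tenant: string;
}

interface TurnParts {
  route?: string;
  pendingApprovals: string[];
}

// Build once per session. Sorting keeps the text identical regardless of
// the order in which tools were registered.
function buildStablePrefix(parts: StableParts): string {
  const tools = [...parts.toolList].sort().map((t) => `- ${t}`).join("\n");
  return [
    `# Instructions\n${parts.instructions}`,
    `# Tools\n${tools}`,
    `# Tenant\n${parts.tenant}`,
  ].join("\n\n");
}

// Build every turn. Only the text after the prefix changes.
function buildSystemPrompt(prefix: string, turn: TurnParts): string {
  const suffix = [
    `# Page\n${turn.route ?? "(unknown)"}`,
    `# Pending approvals\n${turn.pendingApprovals.join("\n") || "(none)"}`,
  ].join("\n\n");
  return `${prefix}\n\n${suffix}`;
}
```

Provider-side prompt caching generally matches on an identical leading span, so anything that varies per turn belongs after the prefix rather than inside it.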

Approval Gates (HITL)

When the agent calls a requiresApproval: true tool:

  1. Compute preview (best-effort).
  2. Insert row in oliver.pending_tools with status pending_approval.
  3. Return awaiting_approval + pendingToolId to the LLM.

UI queries agent.listPendingTools(orgId), renders cards. Approve → re-checks precondition → runs execute → updates pending row → audits full lifecycle. Re-invocation guard: same action while pending returns the existing pendingToolId.
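
The propose/approve flow, including the re-invocation guard and the precondition re-check, can be modeled in memory as follows. This is a sketch only: Oliver stores this state in oliver.pending_tools, and the `PendingStore` class, its method signatures, and the use of serialized input to detect "the same action" are assumptions.

```typescript
// Hypothetical in-memory model of the HITL flow; not Oliver's implementation.
type PendingStatus = "pending_approval" | "approved" | "rejected";

interface PendingTool {
  id: string;
  orgId: string;
  toolName: string;
  inputKey: string; // serialized input, used to detect a repeated proposal
  status: PendingStatus;
}

class PendingStore {
  private rows: PendingTool[] = [];
  private nextId = 1;

  // Re-invocation guard: the same org + tool + input while pending
  // returns the existing row instead of inserting a duplicate.
  propose(orgId: string, toolName: string, input: unknown): PendingTool {
    const inputKey = JSON.stringify(input);
    const existing = this.rows.find(
      (r) =>
        r.status === "pending_approval" &&
        r.orgId === orgId &&
        r.toolName === toolName &&
        r.inputKey === inputKey,
    );
    if (existing) return existing;
    const row: PendingTool = {
      id: `pt_${this.nextId++}`,
      orgId,
      toolName,
      inputKey,
      status: "pending_approval",
    };
    this.rows.push(row);
    return row;
  }

  // Approve re-checks the precondition because state may have changed
  // between the proposal and the human click.
  async approve<O>(
    id: string,
    orgId: string,
    precondition: () => Promise<void>,
    execute: () => Promise<O>,
  ): Promise<O> {
    const row = this.rows.find((r) => r.id === id && r.orgId === orgId);
    if (!row || row.status !== "pending_approval") {
      throw new Error("No pending approval with that id");
    }
    await precondition(); // throws to block a stale approval
    row.status = "approved";
    return execute();
  }

  listPending(orgId: string): PendingTool[] {
    return this.rows.filter((r) => r.orgId === orgId && r.status === "pending_approval");
  }
}
```

Filtering by `orgId` in both `approve` and `listPending` mirrors the tenant scoping described earlier: a pending row from another organization is simply not found.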

Audit

| Status | When |
|---|---|
| invoked | execute() starts |
| pending_approval | HITL card inserted |
| approved / rejected | User clicked |
| succeeded / failed | execute() returned |
| verified / failed_verification / verification_skipped | After verify hook |

Rows grouped by traceId (non-HITL) or linked via pendingToolId (HITL). Writes are non-throwing — failed insert falls through to onAuditFailure (default: console.error).
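
A non-throwing write boils down to catching the sink's error and handing it to a failure handler. In this sketch the `AuditEvent` fields, `createAuditWriter`, and the handler signature are assumptions; only the `onAuditFailure` name and the `console.error` default come from the text above.

```typescript
// Hypothetical sketch: an audit writer that never throws into the tool call.
interface AuditEvent {
  toolName: string;
  status: string;
  orgId: string;
  traceId: string;
}

type AuditSink = (event: AuditEvent) => Promise<void>;
type OnAuditFailure = (error: unknown, event: AuditEvent) => void;

function createAuditWriter(
  sink: AuditSink,
  onAuditFailure: OnAuditFailure = (error) => console.error(error),
) {
  return async (event: AuditEvent): Promise<void> => {
    try {
      await sink(event);
    } catch (error) {
      // Never rethrow: an audit outage must not fail the tool call itself.
      try {
        onAuditFailure(error, event);
      } catch {
        // A failing handler is swallowed too.
      }
    }
  };
}
```

The trade-off is that audit rows can be lost silently unless the handler forwards failures somewhere durable, which is why the handler is pluggable.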

Instructions

Layered markdown loaded once at module init (SOUL.md naming inspired by OpenClaw/Hermes, structure adapted for SaaS):

  • SOUL.md — voice, boundaries, non-negotiable rules
  • domain.md — concepts, entities, vocabulary
  • playbook.md — workflows (DISCOVER → SUMMARIZE → EXECUTE)
  • lessons.md — learned corrections across sessions

Edits require dev server restart — reading per turn would invalidate KV-cache.
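
The layering and the load-once behavior can be sketched without touching the file system. The layer order comes from the list above; skipping missing layers, the HTML comment separators, and the names `composeInstructions` and `once` are assumptions made for illustration.

```typescript
// Hypothetical sketch: compose instruction layers in a fixed order, once.
const LAYER_ORDER = ["SOUL.md", "domain.md", "playbook.md", "lessons.md"] as const;
type LayerName = (typeof LAYER_ORDER)[number];

// Concatenate the layers that exist, always in the same order.
function composeInstructions(files: Partial<Record<LayerName, string>>): string {
  const sections: string[] = [];
  for (const name of LAYER_ORDER) {
    const body = files[name]?.trim();
    if (body) sections.push(`<!-- ${name} -->\n${body}`);
  }
  return sections.join("\n\n");
}

// Memoize a loader so the text is computed a single time per process.
function once<T>(load: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (!cached) cached = { value: load() };
    return cached.value;
  };
}
```

Computing the text once keeps the system prompt prefix stable for the lifetime of the process, which is the reason edits need a restart.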


Database

Tables in a dedicated oliver Postgres schema (not public):

oliver.pending_tools    -- HITL state machine
oliver.audit_log         -- invocation + verification log

Same database as your app — cross-schema transactional atomicity available.


Slash commands

Chat shortcuts that bypass the LLM. When the user types /pending or /help, the chat route intercepts BEFORE calling the model — cheap, deterministic, no token burn. Useful for harness state inspection, /clear-style operator commands, and surfacing what the agent can do.

Three commands ship built in:

| Command | What it does |
|---|---|
| /help | Lists all available commands. |
| /tools | Lists registered tools with descriptions, marks HITL ones. |
| /pending | Lists active HITL approvals waiting for the user. |

Adopters add custom commands via defineSlashCommand and register them on the agent:

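import { defineSlashCommand } from "oliver-agent";
import { desc, eq } from "drizzle-orm";
// `db` is your Drizzle client and `auditLog` the Drizzle table for oliver.audit_log;
// import both from wherever your project exposes them.

const audit = defineSlashCommand({
  name: "audit",
  description: "Show the last 10 audit entries.",
  handler: async ({ ctx }) => {
    // Scope the query to the caller's organization, newest first.
    const rows = await db.select().from(auditLog)
      .where(eq(auditLog.orgId, ctx.orgId))
      .orderBy(desc(auditLog.createdAt)).limit(10);
    return rows.map((r) => `${r.toolName} → ${r.status}`).join("\n");
  },
});

createAgent({
  tools: allTools,
  commands: [audit],   // user commands take precedence; can shadow built-ins
  // ...
});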

Wire in your chat route — short-circuit BEFORE streamText:

const slashText = await oliver.handleSlashCommand(messages, ctx);
if (slashText !== null) return oliver.respondWithText(slashText);
// otherwise: streamText(...)

respondWithText returns a UI message stream Response, so the chat UI sees the result like any other assistant message.

UI-only commands (/clear, /new) live in the chat UI — assistant-ui exposes aui.thread.reset() for that. They never need to round-trip to the backend.
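
The interception step can be sketched as a small dispatcher that returns `null` for ordinary messages, so the route falls through to the model. This is not the library's `handleSlashCommand`: `createDispatcher`, the parsing regex, and the reply for unknown commands are assumptions.

```typescript
// Hypothetical sketch: parse a slash command and dispatch it without the LLM.
interface SlashCommand {
  name: string;
  description: string;
  handler: (args: string) => Promise<string> | string;
}

function createDispatcher(builtIns: SlashCommand[], userCommands: SlashCommand[]) {
  const registry = new Map<string, SlashCommand>();
  for (const command of builtIns) registry.set(command.name, command);
  // Registered later, so a user command with the same name shadows a built-in.
  for (const command of userCommands) registry.set(command.name, command);

  return async (lastUserText: string): Promise<string | null> => {
    const match = /^\/([a-z][\w-]*)(?:\s+(.*))?$/i.exec(lastUserText.trim());
    if (!match) return null; // not a slash command: fall through to the model
    const name = (match[1] ?? "").toLowerCase();
    const command = registry.get(name);
    if (!command) return `Unknown command /${name}. Try /help.`;
    return command.handler(match[2] ?? "");
  };
}
```

Returning `null` rather than throwing keeps the chat route's control flow to a single check before the model call.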


Testing

pnpm test             # 97 tests
pnpm demo:mutex       # CLI proof of concurrencyKey serialization
pnpm typecheck

Roadmap

Full detail in ROADMAP.md. Quick view:

v0.1 candidates (likely):

  • Subagent primitive — a tool spawns a scoped LLM loop with a subset of tools. Atomic compound flows, isolated context. (Claude Code Task pattern.)
  • Hooks pipeline — Pre/PostToolUse extensibility seam. Approval Gates becomes one hook implementation among many. (Claude Code / OpenClaude / Pi pattern.)
  • MCP channel — expose tools as Model Context Protocol endpoints for external agents (Claude Desktop, Cursor, etc.).
  • DB advisory locks — replace process-level concurrencyKey mutex with pg_advisory_xact_lock for multi-instance deployments.
  • Approval card React component — headless, ship the ~120 LOC pattern as a reusable primitive.
  • create-oliver-app template — opinionated Next.js + Drizzle + assistant-ui starter.
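
For the advisory-lock item, one open question is how a string concurrencyKey would map to the integer keys Postgres expects. The sketch below shows one possible mapping for the two-argument form `pg_advisory_xact_lock(int, int)`. It is purely illustrative of the roadmap idea: `fnv1a32` is an FNV-1a-style hash over UTF-16 code units, and `advisoryLockKeys` and the second seed are arbitrary choices.

```typescript
// Hypothetical sketch: derive two int4 lock keys from a string concurrency key.
function fnv1a32(text: string, seed: number): number {
  let hash = seed >>> 0;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // 32-bit multiply by the FNV prime
  }
  return hash | 0; // signed 32-bit, fits a Postgres int4
}

function advisoryLockKeys(concurrencyKey: string): [number, number] {
  return [
    fnv1a32(concurrencyKey, 0x811c9dc5), // standard FNV offset basis
    fnv1a32(concurrencyKey, 0x01234567), // arbitrary second seed (assumption)
  ];
}

// Inside a transaction, the keys would be passed as parameters, for example:
//   SELECT pg_advisory_xact_lock($1, $2)
// The lock is released automatically when the transaction ends.
```

Two different keys can hash to the same pair. That would only cause unrelated calls to serialize unnecessarily; it would not let same-key calls run concurrently.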

v0.2 exploratory:

  • Granular permission policy (allow/ask/deny per tool, source-attributed) — richer than binary requiresApproval. (Codex / OpenClaude pattern.)
  • Tool discoverability tools (tool_search, tool_suggest) — kicks in when tools >30. (Codex / OpenClaude pattern.)
  • Pending approval expiration cron.
  • User modeling (Honcho-style dialectic).

Explicit non-goals: workflow engine (use Temporal/Inngest), chat UI library, auth solution, hosted control plane, generic CRUD generation.


License

MIT. See LICENSE.

