@cool-ai/beach-llm
Owns the LLM-shaped participant runtime — the respond() discipline, the actor taxonomy (orchestrating / interior specialist / composer), scoped tools, and approval interception. LLMs are one participant type inside Beach's opaque interior; the rules in this package (structured respond() calls, no free text, tool-scoped capability) are what makes them composable with deterministic handlers and durable processes.
Home: cool-ai.org · Documentation: cool-ai.org/docs
Install
```bash
npm install @cool-ai/beach-llm @anthropic-ai/sdk
```

What this package provides
- `AnthropicProvider` — `LLMProvider` wrapping `@anthropic-ai/sdk`. Full extended-thinking support.
- `VercelAIProvider` — `LLMProvider` wrapping the Vercel AI SDK (`ai`). 100+ models across OpenAI, Google, Mistral, Meta, Cohere, and others. Does not support Anthropic extended thinking — use `AnthropicProvider` for that.
- `ToolRegistry` — declares every tool available in the system. Actor configs select by name.
- `callActor(options)` — runs an actor through its full tool-use loop until `respond()` is called. Used by `@cool-ai/beach-session`; call directly only when you don't need session lifecycle.
- `ActorConfig` — the configuration shape passed to `runTurn()` / `callActor()`.
- System-prompt snippets — importable text that teaches the LLM the `respond()` tool and turn states.
- `LLMProvider` interface — implement to add other model providers.
Actor taxonomy — three kinds
Beach uses a single mechanism (`callActor()`) for all LLM invocations, but distinguishes three roles an actor can play. The role determines what the actor is allowed to know about — specifically, whether it is allowed to know the channel the user is on. This distinction exists because principle 8 (channel-agnostic interior) is positional, not mechanical: the interior is forbidden channel knowledge; edges are not.
| Kind | Position | Channel-aware? | Examples |
|------|----------|----------------|----------|
| Orchestrating actor | Interior | No | TA Concierge, PO Baxter |
| Interior specialist | Interior | No | TA Researcher, PO email-triage |
| Composer specialist | Edge | Yes | EmailComposer, future WhatsAppComposer |
All three use callActor(). The mechanism is identical. What differs is the position in the architecture and therefore the constraints:
- Orchestrating actors run inside a `SessionTurnManager.runTurn()`. They reason about user intent, drive the tool loop, and emit the final `respond()`. They must not know anything about the channel their output will land on — that knowledge is the outbound edge's job. A channel-aware orchestrator would need a different prompt per channel, a different tool set per channel, and would couple business logic to delivery.
- Interior specialists run inside handlers dispatched by the event router. They perform narrow, focused work (triage, research, classification) and return structured data. They are channel-blind for the same reason orchestrators are — their output may feed into replies on any channel or into no reply at all.
- Composer specialists run at the outbound edge, inside a Channel Formatter (see `@cool-ai/beach-format`). Their one job is to produce channel-appropriate connective prose (salutations, lead-ins, sign-offs) around structured content placeholders. They are allowed to know the channel because they are, by definition, not interior — they are part of the channel's outbound plumbing. A Composer is always paired with one channel.
The Composer operates under a strict constraint: it never sees structured content directly, only placeholder tokens and descriptions of what will replace them. This prevents hallucination on prices, dates, names. Deterministic Content Renderers produce the structured output; the LLM produces only the connective tissue.
Principle 8 remains categorical: no interior component is allowed to read anything channel-shaped. The Composer does not break the principle because the Composer is not interior. It is the exception that proves the rule — the only kind of actor that may know its channel, permitted because the position (edge) is precisely where channel knowledge is meant to live.
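For concreteness, a Composer is just another actor configured through the `ActorConfig` shape described in the next section. A minimal sketch, assuming a hypothetical `{{...}}` placeholder format and prompt wording (neither is defined by this package):

```ts
import type { ActorConfig } from '@cool-ai/beach-llm';
import { respondToolSnippet, turnStatesSnippet } from '@cool-ai/beach-llm';

// Sketch only: the placeholder token format and prompt wording are
// illustrative assumptions, not part of this package's API.
const emailComposerConfig: ActorConfig = {
  id: 'email-composer',
  model: 'claude-haiku-4-5',
  systemPrompt: [
    respondToolSnippet,
    turnStatesSnippet,
    'You write salutations, lead-ins, and sign-offs around placeholder tokens',
    'such as {{TASK_TABLE}}. Never produce concrete prices, dates, or names;',
    'deterministic Content Renderers replace the tokens after you respond.',
  ].join('\n\n'),
  tools: [], // a Composer needs no domain tools; respond() is injected automatically
};
```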
ActorConfig
```ts
import type { ActorConfig } from '@cool-ai/beach-llm';
import { respondToolSnippet, turnStatesSnippet } from '@cool-ai/beach-llm';

const actorConfig: ActorConfig = {
  id: 'baxter',
  model: 'claude-haiku-4-5',
  systemPrompt: [
    respondToolSnippet, // required — teaches respond() shape
    turnStatesSnippet,  // required — teaches valid turnState values
    'You are Baxter, a personal productivity assistant.',
  ].join('\n\n'),
  tools: ['task_list', 'task_create'], // names from ToolRegistry; respond() is injected automatically
  maxTokens: 4096, // optional; defaults apply per provider
  temperature: 0,  // optional
  // Domain-data enforcement (Plan 28 / CR-037):
  domainDataSchema: { // optional — embedded in respond() input_schema; LLM must produce conforming output
    type: 'object',
    properties: {
      tasks: { type: 'array', items: { type: 'object' } },
    },
    required: ['tasks'],
  },
  domainDataMergeStrategy: 'replace', // optional — 'replace' | 'append' | 'deep-merge'; default 'replace'
};
```

`respond()` is injected automatically — do not list it in `tools`.
ToolRegistry
```ts
import { ToolRegistry } from '@cool-ai/beach-llm';

const registry = new ToolRegistry();

registry.register({
  name: 'task_list',
  description: 'List open tasks for the current user.',
  scope: 'generalist',
  inputSchema: {
    type: 'object',
    properties: {
      limit: { type: 'number', description: 'Maximum number of tasks to return.' },
    },
  },
  handler: async (args, context) => {
    // context.sessionId, context.turnId, context.slotKey, context.signal,
    // context.routeEvent are available.
    return { tasks: await db.tasks.list({ limit: (args as { limit: number }).limit }) };
  },
});
```

The handler returns any JSON-serialisable value. The framework owns the dispatch through the event router (audit, gating, capability scoping); the handler owns the result computation. The result is passed back to the actor as a tool result and the loop continues until `respond()` is called.
Registering a name that is already registered throws immediately. Unknown names in `ActorConfig.tools` throw at invocation time.
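For example, a second registration of the same name fails fast (sketch; the exact error message is not specified here):

```ts
// Throws here, at registration time, not at first invocation:
// 'task_list' was already registered above.
registry.register({
  name: 'task_list',
  description: 'List open tasks for the current user.',
  scope: 'generalist',
  inputSchema: { type: 'object', properties: {} },
  handler: async () => ({ tasks: [] }),
});
```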
```ts
registry.unregister('task_list'); // remove a single tool
registry.clear();                 // remove all tools (useful in tests)
```

Tool scope and routing — the two-axis design
`scope` describes ownership; `routing` describes infrastructure. They are independent.
| Scope | Default routing | Bypass allowed |
|---|---|---|
| generalist | routed | No — generalist is the trust-gate scope; bypass would defeat the invariant |
| specialist | routed | Yes, with an articulated bypassRouting.reason |
A specialist tool is one that operates on a private substrate the consumer team owns; the framework still wraps the call for audit and gating unless the tool elects bypass. A specialist tool that wants framework routing (the default) only needs justification. A specialist tool that elects bypass needs both justification (why specialist) and bypassRouting.reason (why bypass).
```ts
// Generalist — the recommended default for any tool that touches shared data.
registry.register({
  name: 'task_list',
  scope: 'generalist',
  description: 'List open tasks',
  inputSchema: { /* ... */ },
  handler: async (args) => db.tasks.list(args),
});

// Specialist with default routing — articulated justification required.
registry.register({
  name: 'imap_fetch',
  scope: 'specialist',
  justification: 'Operates on the researcher\'s private IMAP cache; not part of the user-visible capability surface',
  description: 'Fetch raw IMAP message bytes',
  inputSchema: { /* ... */ },
  handler: async (args) => fetchImapBytes(args),
});

// Specialist electing bypass for inner-loop latency — both reasons required.
registry.register({
  name: 'cache_lookup',
  scope: 'specialist',
  routing: 'bypass',
  justification: 'Operates on a process-local cache substrate not part of the consumer\'s public surface',
  bypassRouting: {
    reason: 'Sub-millisecond latency required for the researcher inner loop; routing overhead exceeds the per-call budget',
  },
  description: 'Look up a value in the in-process cache',
  inputSchema: { /* ... */ },
  handler: async (args) => cache.get((args as { key: string }).key),
});
```

The framework rejects ill-formed declarations at app startup, not at first invocation:
- `scope: 'generalist'` with `routing: 'bypass'` → registration error.
- Specialist without `justification`, or with a placeholder string (`'TODO'`, `'tbd'`, `'fix me'`, etc.) → registration error.
- `routing: 'bypass'` without `bypassRouting.reason` → registration error.
Generalist tools may set `peerExposed: true` to publish on the consumer's Surface Card (CR-109) when that infrastructure ships. Specialist tools cannot set `peerExposed: true` — specialist scope means private substrate, which is by construction not federation-shaped.
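A sketch of a generalist tool opting in (the tool name and handler are hypothetical; `peerExposed` sits alongside the other registration fields shown above):

```ts
registry.register({
  name: 'task_search', // hypothetical example tool
  scope: 'generalist',
  peerExposed: true, // publish on the consumer's Surface Card once CR-109 ships
  description: 'Search tasks by keyword',
  inputSchema: {
    type: 'object',
    properties: { query: { type: 'string' } },
    required: ['query'],
  },
  handler: async (args) => db.tasks.search((args as { query: string }).query),
});
```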
Async tools — ctx.routeEvent
A tool that needs to dispatch async work (research, multi-hop fetches, anything taking seconds-to-minutes) calls ctx.routeEvent and returns an ack. The actor's turn proceeds; the actual result lands later as a routed event triggering a new participant turn.
```ts
import { randomUUID } from 'node:crypto';

registry.register({
  name: 'email_research',
  scope: 'generalist',
  description: 'Search the user\'s email accounts for a topic',
  inputSchema: { /* ... */ },
  handler: async (args, ctx) => {
    const searchId = randomUUID();
    await ctx.routeEvent!({
      source: 'assistant',
      eventType: 'email_research_started',
      data: { ...(args as object), searchId },
    });
    return { searchId };
  },
});
```

The framework does not auto-attach destinations to the eventual result event — that's the routing config's job. See documentation/changePlans/cr-155-framework-enforced-routing.md for the locked design and documentation/migrations/cr-155-framework-enforced-routing.md for the migration walkthrough.
AnthropicProvider
```ts
import Anthropic from '@anthropic-ai/sdk';
import { AnthropicProvider } from '@cool-ai/beach-llm';

const provider = new AnthropicProvider(new Anthropic());
// Pass an already-configured Anthropic client — API key, base URL, etc. are yours to set.
```

Use `AnthropicProvider` for all Anthropic models, including those with extended thinking enabled. It preserves thinking block signatures across multi-turn tool-use loops.
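For example, with an explicitly configured client (`apiKey` and `baseURL` are standard `@anthropic-ai/sdk` client options):

```ts
import Anthropic from '@anthropic-ai/sdk';
import { AnthropicProvider } from '@cool-ai/beach-llm';

const provider = new AnthropicProvider(
  new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY, // the SDK reads this env var by default if omitted
    // baseURL: 'https://gateway.example.com', // hypothetical proxy endpoint
  }),
);
```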
VercelAIProvider
```ts
import { generateText, jsonSchema } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';
import { VercelAIProvider } from '@cool-ai/beach-llm';

const provider = new VercelAIProvider(
  createOpenAI()('gpt-4o'),
  { generateText, jsonSchema },
);
```

```ts
import { generateText, jsonSchema } from 'ai';
import { createGoogleGenerativeAI } from '@ai-sdk/google';
import { VercelAIProvider } from '@cool-ai/beach-llm';

const provider = new VercelAIProvider(
  createGoogleGenerativeAI()('gemini-2.0-flash'),
  { generateText, jsonSchema },
);
```

`VercelAIProvider` takes the model instance and the two Vercel AI SDK functions it needs (`generateText` and `jsonSchema`). Beach does not import `ai` directly — only the consumer does, meaning `ai` is not a required install for users of `AnthropicProvider`.
Install the Vercel AI SDK and the relevant provider package:
```bash
npm install ai @ai-sdk/openai    # OpenAI / Azure
npm install ai @ai-sdk/google    # Gemini
npm install ai @ai-sdk/mistral   # Mistral
# etc.
```

LLMProvider interface
To add other model providers, implement:
```ts
interface LLMProvider {
  complete(options: CompletionOptions): Promise<CompletionResult>;
}
```

`CompletionOptions` carries the model, system prompt, messages, and tool schemas. `CompletionResult` carries stop reason, tool calls, text blocks, reasoning blocks, and token usage. Pass your implementation to `runTurn()` or `callActor()`.
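A minimal sketch of a custom provider, assuming `CompletionOptions` and `CompletionResult` are exported alongside `LLMProvider` and that their property names match the description above (verify against the package's types before relying on this):

```ts
import type {
  LLMProvider,
  CompletionOptions,
  CompletionResult,
} from '@cool-ai/beach-llm'; // assumption: these types are exported

// Stand-in for whatever model SDK you are wrapping.
declare const myModelApi: { generate(req: unknown): Promise<any> };

class MyProvider implements LLMProvider {
  async complete(options: CompletionOptions): Promise<CompletionResult> {
    // Property names on options/result are assumptions based on the prose above.
    const raw = await myModelApi.generate({
      model: (options as any).model,
      system: (options as any).systemPrompt,
      messages: (options as any).messages,
      tools: (options as any).tools,
    });
    return {
      stopReason: raw.finishReason,
      toolCalls: raw.toolCalls ?? [],
      textBlocks: raw.text ? [raw.text] : [],
      reasoningBlocks: [],
      usage: { inputTokens: raw.usage.input, outputTokens: raw.usage.output },
    } as unknown as CompletionResult;
  }
}
```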
System-prompt snippets
```ts
import { respondToolSnippet, turnStatesSnippet } from '@cool-ai/beach-llm';
```

`respondToolSnippet` explains the `respond()` tool structure. `turnStatesSnippet` explains valid `turnState` values. Both belong in every actor's system prompt — without them the LLM does not know it must call `respond()` instead of replying with free text.
callActor()
```ts
import { callActor } from '@cool-ai/beach-llm';
import type { SpecialistExecutionRecord } from '@cool-ai/beach-llm';

const result = await callActor({
  config: actorConfig,
  messages: [{ role: 'user', content: 'Hello' }],
  sessionId: 'my-session',
  slotKey: 'my-slot', // required — the slot this invocation fills; threaded into ToolContext
  registry,
  provider,
  signal: abortController.signal, // optional — passed to tool handlers
  onTextBlock: (text) => { /* ... */ }, // optional — fires for interim text before respond()
  onToolExecution: async (record: SpecialistExecutionRecord) => {
    // Fires after every tool execution — use for audit/replay log entries.
    // record.toolName, record.toolInput, record.toolOutput, record.durationMs,
    // record.actorId, record.sessionId, record.turnId, record.slotKey, record.iteration
    // record.error is set (string) when the handler threw.
    //
    // CR-155 audit fields: record.scope ('generalist' | 'specialist'),
    // record.routing ('routed' | 'bypass'), record.bypass (boolean),
    // record.bypassReason (the articulated reason when bypass is true),
    // record.registrationSite (best-effort 'file:line' from the registration site),
    // record.tags (the tool's declared tags).
    await auditLog.write(record);
  },
});

// result.respond   — the RespondCall from the actor
// result.messages  — full message thread after the tool loop
// result.usage     — { inputTokens, outputTokens }
// result.latencyMs
// result.slotKey   — echoed from options
```

HITL approval
Tools declare requiresApproval to gate execution on human approval before the handler runs.
```ts
registry.register({
  name: 'book_flight',
  // ...
  requiresApproval: true, // always requires user-level approval
});
```

For context-dependent requirements, pass an `ApprovalPolicy` function instead:
```ts
import type { ApprovalPolicy } from '@cool-ai/beach-llm';

const bookingPolicy: ApprovalPolicy = async ({ args }, context) => {
  if ((args as { totalValue: number }).totalValue > 500) return 'user';
  return 'auto';
};

registry.register({
  name: 'book_flight',
  // ...
  requiresApproval: bookingPolicy,
});
```

`ApprovalLevel`:
- `'auto'` — no approval needed; the handler executes immediately
- `'user'` — requires user approval
- `'admin'` — requires admin approval
`true` is shorthand for always `'user'`. Absent or falsy means always `'auto'`.
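A policy can return any of the three levels. A sketch extending the booking example above (the thresholds are illustrative):

```ts
import type { ApprovalPolicy } from '@cool-ai/beach-llm';

const tieredPolicy: ApprovalPolicy = async ({ args }, context) => {
  const value = (args as { totalValue: number }).totalValue;
  if (value > 5000) return 'admin'; // high-value bookings escalate to an admin
  if (value > 500) return 'user';   // mid-value bookings ask the user
  return 'auto';                    // small bookings execute immediately
};
```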
Wire the intercept with withApprovalIntercept(), providing a callback that emits the approval-request part and waits for the user's response:
```ts
import { withApprovalIntercept } from '@cool-ai/beach-llm';

const intercepted = withApprovalIntercept(tool, {
  onApprovalRequired: async (request) => {
    // request.level — 'user' | 'admin'
    // request.toolName, request.toolInput, request.approvalId
    // Emit approval-request part, wait for decision, return:
    return { approvalId: request.approvalId, decision: 'approved' };
  },
});
```

Not in this package
- Session lifecycle (`@cool-ai/beach-session`).
- Event routing (`@cool-ai/beach-core`).
- Envelope assembly (`@cool-ai/beach-protocol`).

Related
Related
- https://cool-ai.org/docs/respond-tool — the `respond()` tool schema.
- https://cool-ai.org/docs/design-principles — principles 2.3, 2.4 (LLMs never emit free text; tools constrain and prompts guide).
