lemon-ai-agent

v1.3.1

Published

20 days ago

Lemon AI Agent — spawn specialist agents from Node.js APIs with tools, streaming, and structured output


gakwaya

lemon-ai-agent ai-agent multi-agent sub-agent tool-calling streaming llm

Lemon AI Agent

Delegate specialist agents from your API. One package, your models, your server.

Lemon AI Agent is a production-focused SDK for tool-calling agents in Node.js: delegate work to specialist agents, stream tokens and tool events, and return structured answers—without a workflow canvas or hosted runtime.

Why Lemon AI Agent

Plant seeds, don’t wire graphs — Register child agents as tools; the parent LLM delegates with LemonGrove. No graph editor, no engine lock-in.
API-first — Built for Next.js routes, Express handlers, and background jobs: invoke, stream, session memory, SSE-friendly chunks.
One import, minified dist — Single ESM bundle. Your node_modules get a compiled artifact, not a sprawling source tree.
Plug-and-play — createLemonAgent({ model: 'gpt-4o-mini', tools: [...] }) with lemonTool and z re-exported. OpenAI adapter included.
Your endpoint — String shorthand or { model, apiKey, baseURL } for OpenAI-compatible gateways (cloud APIs, Unsloth, local proxies).
Production-shaped outputs — Intermediate step traces, Zod-validated finals, fallback models, and streaming callbacks.

Implementation ease

Three steps to an agent: npm install lemon-ai-agent → createLemonAgent({ model, tools }) → invoke({ input }).

Three steps to multi-agent: create specialist agents → LemonGrove + addSpecialist → run().

LangChain when you need it; Lemon when you don’t.

For a full breakdown (architecture, progressive disclosure, examples matrix, friction points), see docs/IMPLEMENTATION-EASE.md.

Install

npm install lemon-ai-agent

Set API keys for your provider (or pass apiKey in model config). OpenAI, Anthropic, and Gemini are bundled — one npm install lemon-ai-agent, no extra provider packages.

| Provider | Env var |
|----------|---------|
| OpenAI (default) | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Google Gemini | GOOGLE_API_KEY |

resolveModel auto-detects provider from the model id (claude-*, gemini-*, gpt-*) or you can set provider explicitly.
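To make the prefix-based detection concrete, here is a minimal sketch of how a model id could be mapped to a provider. The name guessProvider is hypothetical and this is not the package's inferProvider implementation; the fallback to OpenAI for unrecognized ids is an assumption based on OpenAI being the documented default.

```typescript
type Provider = 'openai' | 'anthropic' | 'google';

// Hypothetical helper (not part of lemon-ai-agent): guesses a provider
// from the model id prefix.
function guessProvider(modelId: string): Provider {
  const id = modelId.toLowerCase();
  if (id.startsWith('claude-')) return 'anthropic';
  if (id.startsWith('gemini-')) return 'google';
  // gpt-* and anything unrecognized fall back to the default provider (assumption).
  return 'openai';
}

console.log(guessProvider('claude-3-5-sonnet-latest')); // anthropic
console.log(guessProvider('gemini-2.0-flash'));         // google
console.log(guessProvider('gpt-4o-mini'));              // openai
```

If an id does not follow one of these prefixes, setting provider explicitly in the model config avoids relying on detection at all.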

Quick start

import { createLemonAgent, lemonTool, z } from 'lemon-ai-agent';

const agent = await createLemonAgent({
  model: 'gpt-4o-mini',
  tools: [
    lemonTool({
      name: 'search_docs',
      description: 'Search internal documentation',
      schema: z.object({ query: z.string() }),
      run: async ({ query }) => `Results for: ${query}`,
    }),
  ],
  systemMessage: 'You are a helpful assistant for our product team.',
});

const result = await agent.invoke({
  input: 'Summarize what changed in our last release',
});

console.log(result.output);

What you can build

| Capability | What you get |
|------------|----------------|
| Tool-calling loop | Automatic tool rounds with optional step tracing |
| Specialist seeds | Child agents exposed as tools; LemonGrove coordinates the final answer |
| Streaming | Token, tool-start, tool-end, and done chunks for live UIs |
| Structured output | Zod-validated final JSON via built-in formatting tool |
| Memory & sessions | Conversation history helpers and per-run metadata |
| Fallback model | Secondary model if the primary call fails |
| Delegation trace | See why the orchestrator skipped specialists (delegationTrace) |
| Shared grove memory | Specialists see prior specialist output in the same session |
| Specialist retry | Per-specialist maxRetries and fallbackModel |
| Human-in-the-loop | Approve irreversible tools before they run (humanGate) |
| Async grove jobs | enqueue + getJob for non-blocking runs |
| Conductor mode | Orchestrator delegates to specialists instead of replacing them (conductorMode) |
| Albedo trace | Persist intermediate tool steps per run (albedo, getAlbedo) |
| Output buffer | Dampen low-quality specialist observations (outputBuffer) |
| Chelators | Fast structural filters on tool output (chelators) |
| Dynamic persona | Per-delegation role override on seeds (persona + roleTemplates) |
| Alkaline output | Calm user-facing rewrite of final answer (outputTone) |
| Conditional tools | activateWhen predicate on tools and specialists |

Multi-agent orchestration

import { createLemonAgent, LemonGrove } from 'lemon-ai-agent';

const model = 'gpt-4o-mini';

const researcher = await createLemonAgent({
  model,
  systemMessage: 'You research topics and return concise facts.',
});

const writer = await createLemonAgent({
  model,
  systemMessage: 'You turn facts into polished copy.',
});

const grove = new LemonGrove({
  model,
  systemMessage: 'Coordinate specialists and synthesize a final answer.',
})
  .addSpecialist('research_agent', 'Research and analysis tasks', researcher)
  .addSpecialist('writer_agent', 'Writing and editing tasks', writer);

const result = await grove.run('Draft a short launch note for our API');

LemonGrove advanced features (opt-in)

All advanced features default to off so the quick LemonGrove + run() path stays lean. Enable only what you need.

| Flag | Default | Turn on when |
|------|---------|--------------|
| delegationTrace | false | Debugging why the orchestrator did not delegate |
| sharedMemory | false | Specialists must see each other’s prior output |
| humanGate | unset | Irreversible tools need approval before running |
| enqueue() | not used | Non-blocking jobs with getJob() polling |
| maxRetries / fallbackModel | 0 / unset | Per-specialist resilience |
| streamToolEvents | false | SSE tool events without full step trace |
| conductorMode | false | Orchestrator prefers delegating to specialists |
| albedo | false | Persist intermediate steps; getAlbedo(runId) |
| outputBuffer | false | Heuristic dampening of specialist output |
| chelators | [] | Structural observation filters |
| roleTemplates | unset | Named personas for seed persona field |
| internalTone | unset | Sharp internal orchestrator instructions (acid layer) |
| outputTone | unset | User-facing tone rewrite of final output (alkaline layer) |
| activateWhen | always on | Per lemonTool / addSpecialist — not a grove ctor flag; see below |

Electrolyte orchestration (conductor mode)

The grove orchestrator completes the circuit between specialists — it conducts, it does not replace them:

const grove = new LemonGrove(
  { model, systemMessage: 'Coordinate specialists.' },
  { conductorMode: true, delegationTrace: true },
)
  .addSpecialist('research_agent', 'Research', researcher);

Delegation trace (why no specialist?)

Opt in with delegationTrace: true on LemonGrove (or LemonAgent). Inspect result.delegationTrace after run():

const grove = new LemonGrove({ model }, { delegationTrace: true })
  .addSpecialist('research_agent', 'Research', researcher);

const result = await grove.run('Say hello');
console.log(result.delegationTrace?.directAnswer);
console.log(result.delegationTrace?.skippedSpecialists);

Shared memory across specialists

// sharedMemory is a LemonGroveOptions flag, so it goes in the second argument
const grove = new LemonGrove({ model }, { sharedMemory: true })
  .addSpecialist('research_agent', 'Research', researcher)
  .addSpecialist('writer_agent', 'Writing', writer);

Per-specialist retry and fallback

grove.addSpecialist('research_agent', 'Research', researcher, {
  maxRetries: 2,
  fallbackModel: 'gpt-4o-mini',
});
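The retry-then-fallback behaviour configured above can be pictured with a small standalone sketch. The names runWithRetry and Runner are hypothetical, and the reading that maxRetries: 2 means one initial attempt plus two retries is an assumption; the package's internal logic may differ.

```typescript
// A runner stands in for "call this specialist with this task".
type Runner = (task: string) => Promise<string>;

// Hypothetical sketch: try the primary runner up to 1 + maxRetries times,
// then try the fallback runner once if one was provided.
async function runWithRetry(
  task: string,
  primary: Runner,
  maxRetries: number,
  fallback?: Runner,
): Promise<string> {
  let lastError: unknown = new Error('primary was never attempted');
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await primary(task);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  if (fallback) return fallback(task);
  throw lastError; // no fallback: surface the last primary error
}
```

Keeping the fallback as a single final attempt bounds the total number of model calls per delegation at maxRetries + 2.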

Human-in-the-loop

const grove = new LemonGrove({
  model,
  humanGate: async (action) => {
    // return { approved: true } or { approved: false, reason: '...' }
    return { approved: true };
  },
});

Mark tools with irreversible: true on lemonTool or requiresApproval on specialists.
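One simple gate policy is an allowlist. The sketch below is illustrative only: decideByAllowlist is a hypothetical helper, and the name field on action is an assumption, since the shape of the action argument is not documented here. Only the { approved } / { approved, reason } return shape comes from the snippet above.

```typescript
type GateDecision = { approved: true } | { approved: false; reason: string };

// Hypothetical policy: approve only tools whose name is on the allowlist.
function decideByAllowlist(toolName: string, allowed: ReadonlySet<string>): GateDecision {
  return allowed.has(toolName)
    ? { approved: true }
    : { approved: false, reason: `Tool "${toolName}" is not on the approval allowlist` };
}

// Wraps the policy in the async gate shape. `action.name` is an assumed field;
// check the package's HumanGate type for the real action shape.
const allowlistGate = async (action: { name: string }): Promise<GateDecision> =>
  decideByAllowlist(action.name, new Set(['send_email']));
```

In a real deployment the gate would more likely wait on a person (a Slack button, an admin page) than on a static list; the allowlist just keeps the sketch self-contained.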

Async jobs (fire-and-forget)

const { jobId } = await grove.enqueue('Long task…');
let job;
do {
  await new Promise((r) => setTimeout(r, 500));
  job = await grove.getJob(jobId);
} while (job?.status === 'queued' || job?.status === 'running');

console.log(job?.result?.output);

// Resume when status is awaiting_approval:
// await grove.approve(jobId, { approved: true }, job.approvalId);
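The polling loop above runs until the job leaves the queued or running state. In a server you may want an upper bound on that wait. The helper below is a sketch under stated assumptions: pollJob is a hypothetical name, and it only relies on the job having a status string.

```typescript
interface JobLike {
  status: string;
}

// Hypothetical helper: polls fetchJob until the status leaves the pending set,
// the job is missing, or the deadline passes.
async function pollJob<T extends JobLike>(
  fetchJob: () => Promise<T | undefined>,
  opts: { intervalMs?: number; timeoutMs?: number; pending?: string[] } = {},
): Promise<T | undefined> {
  const { intervalMs = 500, timeoutMs = 60_000, pending = ['queued', 'running'] } = opts;
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const job = await fetchJob();
    // Stop when the job is gone or no longer pending (mirrors the loop above).
    if (!job || !pending.includes(job.status)) return job;
    if (Date.now() >= deadline) throw new Error('Timed out waiting for job');
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With a grove this might be called as pollJob(() => grove.getJob(jobId), { timeoutMs: 30_000 }); a job that comes back with status awaiting_approval is returned to the caller, which can then call approve.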

Albedo (intermediate step persistence)

Persist tool steps and specialist outputs per run for debugging, evals, or learning loops. Setting albedo: true also turns on step collection, the same collection that delegationTrace enables.

const grove = new LemonGrove({ model }, { albedo: true })
  .addSpecialist('research_agent', 'Research', researcher);

const result = await grove.run('Compare two API designs', {
  metadata: { sessionId: 'sess-42' },
});

console.log(result.albedoId);
const record = await grove.getAlbedo(result.albedoId!);
console.log(record?.steps, record?.specialistOutputs);

// When sessionId is on metadata, list all runs in a session:
const history = await grove.listAlbedoBySession('sess-42');

Async jobs store job.albedoId when albedo is enabled. Pass a custom store: { albedo: myAlbedoStore }.

Output buffer (quality dampening)

Dampen low-quality specialist observations before the orchestrator sees them. Use for signal quality, not structural safety (use chelators for that).

const grove = new LemonGrove({ model }, {
  outputBuffer: {
    minObservationLength: 40,
    blockPatterns: [/^I cannot/i],
    onReject: 'placeholder', // 'omit' | 'placeholder' | 'retry'
    // confidenceScorer: async (task, output) => score, // optional 0–1
    // minConfidence: 0.3,
  },
}).addSpecialist('research_agent', 'Research', researcher);

Shorthand: outputBuffer: true uses defaults (minObservationLength: 20, placeholder on reject).
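To illustrate what a length-and-pattern policy does to an observation, here is a standalone sketch. It is not the package's applyOutputBuffer: bufferObservation and BufferPolicySketch are hypothetical names, the placeholder text is made up, and the 'retry' mode and confidence scoring are left out.

```typescript
interface BufferPolicySketch {
  minObservationLength: number;
  blockPatterns: RegExp[];
  onReject: 'omit' | 'placeholder';
}

// Placeholder wording is an assumption made for this example.
const PLACEHOLDER = '[low-signal observation withheld]';

// Hypothetical check: returns the text to forward to the orchestrator,
// or null when the observation is omitted entirely.
function bufferObservation(observation: string, policy: BufferPolicySketch): string | null {
  const text = observation.trim();
  const tooShort = text.length < policy.minObservationLength;
  const blocked = policy.blockPatterns.some((pattern) => pattern.test(text));
  if (!tooShort && !blocked) return observation; // good enough: pass through
  return policy.onReject === 'omit' ? null : PLACEHOLDER;
}
```

A short refusal such as "I cannot help" fails both checks under the policy shown earlier, so the orchestrator would see the placeholder instead of the refusal.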

Chelation (structural filters)

Fast filters on tool observations before they reach the next agent. Use for safety and structure (SQL patterns, thin citations, contradictions).

import {
  LemonGrove,
  sqlInjectionChelator,
  emptyCitationChelator,
  defaultChelators,
} from 'lemon-ai-agent';

const grove = new LemonGrove({ model }, {
  chelators: [sqlInjectionChelator, emptyCitationChelator],
  // or: chelators: defaultChelators,
}).addSpecialist('coder_agent', 'Code', coder);

Custom chelator — return pass, neutralize (replace observation), or block (replace with reason string):

import type { Chelator } from 'lemon-ai-agent';

const myChelator: Chelator = async ({ observation, tool }) => {
  if (observation.includes('SECRET_KEY')) {
    return { action: 'block', reason: 'secrets detected' };
  }
  return { action: 'pass' };
};
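How a list of chelators might be applied in order can be sketched as follows. This is not the package's applyChelators: runChelators is a hypothetical name, the chelators here are synchronous for brevity (the example above is async), and the observation field on a neutralize result is an assumption, since the text only says that neutralize replaces the observation.

```typescript
type ChelatorResultSketch =
  | { action: 'pass' }
  | { action: 'neutralize'; observation: string } // field name is an assumption
  | { action: 'block'; reason: string };

type SyncChelator = (input: { observation: string; tool: string }) => ChelatorResultSketch;

// Hypothetical runner: each chelator sees the output of the previous one;
// the first block wins and replaces the observation with its reason.
function runChelators(observation: string, tool: string, chelators: SyncChelator[]): string {
  let current = observation;
  for (const chelate of chelators) {
    const result = chelate({ observation: current, tool });
    if (result.action === 'block') return result.reason;
    if (result.action === 'neutralize') current = result.observation;
  }
  return current;
}

// Two illustrative chelators: one redacts email addresses, one blocks secrets.
const redactEmails: SyncChelator = ({ observation }) =>
  /\S+@\S+/.test(observation)
    ? { action: 'neutralize', observation: observation.replace(/\S+@\S+/g, '[email]') }
    : { action: 'pass' };

const blockSecrets: SyncChelator = ({ observation }) =>
  observation.includes('SECRET_KEY')
    ? { action: 'block', reason: 'secrets detected' }
    : { action: 'pass' };
```

Ordering matters in this design: a neutralizing filter placed first changes what later filters see.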

Dynamic roles (persona + roleTemplates)

Override a specialist’s role per delegation via the seed tool input persona. Map short names to full prompts with roleTemplates on the grove.

const grove = new LemonGrove(
  { model, systemMessage: 'Pass persona fact-checker when verifying claims.' },
  {
    roleTemplates: {
      'fact-checker': 'You are a rigorous fact-checker. Flag unverified claims.',
      'risk-analyst': 'You are a risk analyst. Focus on downside and mitigations.',
    },
  },
).addSpecialist('analyst_agent', 'Analysis; include persona in delegation', analyst);

// Parent LLM calls analyst_agent with: { task: '...', persona: 'fact-checker' }

Default seed schema includes task, context?, and persona? (see defaultSeedInput export).
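The lookup from a short persona name to a full prompt can be pictured with the sketch below. resolvePersona is a hypothetical helper, and treating an unrecognized persona as a literal prompt is an assumption about behaviour, not something this README states.

```typescript
// Hypothetical resolution order: named template, then the raw persona string,
// then the specialist's default role.
function resolvePersona(
  persona: string | undefined,
  roleTemplates: Record<string, string>,
  defaultRole: string,
): string {
  if (!persona) return defaultRole;
  // Own-property check so names like "constructor" never hit the prototype.
  const hasTemplate = Object.prototype.hasOwnProperty.call(roleTemplates, persona);
  return hasTemplate ? roleTemplates[persona] : persona;
}

const templates = {
  'fact-checker': 'You are a rigorous fact-checker. Flag unverified claims.',
};

console.log(resolvePersona('fact-checker', templates, 'You analyze data.'));
// You are a rigorous fact-checker. Flag unverified claims.
```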

Acid inside, alkaline outside (internalTone + outputTone)

Separate harsh internal coordination from calm user-facing answers.

const grove = new LemonGrove(
  { model, systemMessage: 'Coordinate specialists.' },
  {
    internalTone: 'Push back on vague inputs. Demand evidence.',
    outputTone: 'Clear, calm, and actionable. No jargon.',
    conductorMode: true,
  },
).addSpecialist('research_agent', 'Research', researcher);

outputTone runs one optional LLM rewrite after run() using the orchestrator model.

Conditional tools (activateWhen)

Tools stay visible to the model; if the predicate returns false, execution is skipped with an unavailable message. Set on lemonTool or addSpecialist (not on LemonGrove ctor).

import { lemonTool, z } from 'lemon-ai-agent';

const search = lemonTool({
  name: 'search',
  description: 'Search when session is present',
  schema: z.object({ query: z.string() }),
  run: async ({ query }) => `Results: ${query}`,
  activateWhen: ({ runMetadata }) => !!runMetadata?.sessionId,
});

grove.addSpecialist('helper', 'Help', helper, {
  activateWhen: ({ runMetadata }) => runMetadata?.userId === 'admin',
});

await grove.run('Find docs', { metadata: { sessionId: 'abc' } });
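The skip-on-false behaviour described above can be sketched without the package. The names executeGated, GatedTool, and RunContext are hypothetical, and the wording of the unavailable message is made up for the example.

```typescript
interface RunContext {
  runMetadata?: Record<string, unknown>;
}

interface GatedTool {
  name: string;
  run: (input: string) => string;
  activateWhen?: (ctx: RunContext) => boolean;
}

// Hypothetical executor: the tool is always listed, but it only runs when
// its predicate (if any) returns true for the current run.
function executeGated(tool: GatedTool, input: string, ctx: RunContext): string {
  if (tool.activateWhen && !tool.activateWhen(ctx)) {
    return `Tool "${tool.name}" is unavailable for this run.`;
  }
  return tool.run(input);
}

const searchTool: GatedTool = {
  name: 'search',
  run: (query) => `Results: ${query}`,
  activateWhen: (ctx) => !!ctx.runMetadata?.sessionId,
};

console.log(executeGated(searchTool, 'docs', { runMetadata: { sessionId: 'abc' } }));
// Results: docs
console.log(executeGated(searchTool, 'docs', {}));
// Tool "search" is unavailable for this run.
```

Returning a message instead of throwing lets the model read the refusal and choose another path in the same run.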

Combined production grove (P6)

Enable only what you need; this shows several primitives together:

import {
  createLemonAgent,
  LemonGrove,
  sqlInjectionChelator,
  emptyCitationChelator,
} from 'lemon-ai-agent';

// Shared model id for the orchestrator and the specialist
const model = 'gpt-4o-mini';

const researcher = await createLemonAgent({ model, systemMessage: 'Research facts.' });

const grove = new LemonGrove(
  { model, systemMessage: 'Coordinate specialists and synthesize.' },
  {
    conductorMode: true,
    delegationTrace: true,
    sharedMemory: true,
    albedo: true,
    outputBuffer: { minObservationLength: 50, onReject: 'placeholder' },
    chelators: [sqlInjectionChelator, emptyCitationChelator],
    outputTone: 'Clear, calm, actionable.',
    roleTemplates: { 'fact-checker': 'Rigorous fact-checker.' },
  },
).addSpecialist('research_agent', 'Research tasks', researcher);

const result = await grove.run('Summarize our API launch risks.');

See also examples/emergent-combo.mjs for multi-specialist chains without extra API.

Claude and Gemini (plug-and-play)

Set the env key, then use a string shorthand — same as OpenAI:

// Auto-detects Anthropic from "claude-..."
const claudeAgent = await createLemonAgent({
  model: 'claude-3-5-sonnet-latest',
  tools: [/* lemonTool(...) */],
});

// Auto-detects Google from "gemini-..."
const geminiAgent = await createLemonAgent({
  model: 'gemini-2.0-flash',
});

// Or set provider explicitly
const geminiExplicit = await createLemonAgent({
  model: { provider: 'google', model: 'gemini-2.0-flash' },
});

Custom / compatible endpoints

OpenAI-compatible gateways (Unsloth, vLLM, local proxies) use provider: 'openai' (default) with baseURL:

const agent = await createLemonAgent({
  model: {
    model: 'unsloth/Qwen3.5-0.8B-MTP-GGUF',
    apiKey: process.env.UNSLOTH_API_KEY,
    baseURL: 'http://localhost:8080/v1',
  },
  tools: [/* lemonTool(...) */],
});

Streaming

for await (const chunk of agent.stream({ input: 'Hello' })) {
  if (chunk.type === 'token') process.stdout.write(chunk.text);
  if (chunk.type === 'tool_start') console.log('\nTool:', chunk.tool);
  if (chunk.type === 'done') console.log('\nFinal:', chunk.result.output);
}
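To forward these chunks to a browser over Server-Sent Events, each chunk needs to be serialized as an SSE frame. The helper below is a minimal sketch; toSseFrame is a hypothetical name and is not exported by the package. It only assumes that every chunk has a type string, as in the loop above.

```typescript
// Hypothetical helper: serializes one stream chunk as a Server-Sent Events frame.
// JSON.stringify keeps the payload on a single line, which the SSE format requires
// for a single data field.
function toSseFrame(chunk: { type: string; [key: string]: unknown }): string {
  return `event: ${chunk.type}\ndata: ${JSON.stringify(chunk)}\n\n`;
}

console.log(JSON.stringify(toSseFrame({ type: 'token', text: 'Hi' })));
// "event: token\ndata: {\"type\":\"token\",\"text\":\"Hi\"}\n\n"
```

In an Express or Next.js handler, the response would be sent with Content-Type: text/event-stream and each frame written as the corresponding chunk arrives from agent.stream().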

Structured output

import { createLemonAgent, z } from 'lemon-ai-agent';

const agent = await createLemonAgent({
  model: 'gpt-4o-mini',
  outputSchema: z.object({
    summary: z.string(),
    score: z.number(),
  }),
});

const result = await agent.invoke({ input: 'Review this feature request…' });
console.log(result.structuredOutput);

Examples

Runnable samples: examples/ (see examples/README.md).

| Example | What it shows |
|---------|----------------|
| node-basic.mjs | createLemonAgent + lemonTool |
| multi-agent.mjs | LemonGrove + specialist seeds |
| streaming-node.mjs | Token and tool streaming |
| structured-output.mjs | Zod outputSchema |
| fallback-model.mjs | Primary + fallback model |
| memory-session.mjs | Session memory |
| express-route.mjs | HTTP POST + SSE |
| nextjs-route.ts | Next.js invoke + SSE |
| unsloth-agent-test.mjs | npm run test:agent:unsloth — live agent test |
| delegation-trace.mjs | Why orchestrator skipped specialists |
| grove-shared-memory.mjs | Shared memory across specialists |
| specialist-fallback.mjs | Per-specialist retry / fallback |
| human-gate.mjs | Approve tools before execution |
| grove-jobs.mjs | enqueue + poll getJob |
| albedo-trace.mjs | Persist and query intermediate steps |
| output-buffer.mjs | Dampen low-quality specialist output |
| chelators.mjs | Structural observation filters |
| dynamic-role.mjs | Per-delegation persona override |
| alkaline-output.mjs | outputTone user-facing rewrite |
| emergent-combo.mjs | Multi-specialist chain in one run() |

API reference

`createLemonAgent(config)` (async)

| Option | Type | Default |
|--------|------|---------|
| model | string \| ModelConfig \| BaseChatModel | required |
| fallbackModel | same as model | — |
| tools | lemonTool[] \| Tool[] | [] |
| seeds | lemonSeed[] \| LemonSeedDefinition[] | [] |
| systemMessage | string | "You are a helpful assistant" |
| maxIterations | number | 15 |
| memory | BaseChatMemory | — |
| outputSchema | ZodObject | — |
| returnIntermediateSteps | boolean | false |
| delegationTrace | boolean | false |
| streamToolEvents | boolean | false |
| enableStreaming | boolean | true |
| humanGate | HumanGate | — |
| callbacks | BaseCallbackHandler[] | — |
| agentCallbacks | AgentCallbacks | — |
| sessionId | string | — |

`ModelConfig` (when `model` is an object)

| Field | Type | Notes |
|-------|------|--------|
| model | string | Model id (required) |
| provider | 'openai' \| 'anthropic' \| 'google' (+ aliases claude, gemini, google-genai) | Inferred from model id if omitted |
| apiKey | string | Falls back to OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY |
| baseURL | string | OpenAI-compatible or Anthropic clientOptions.baseURL |
| temperature | number | — |

`LemonGrove` constructor options (`LemonGroveOptions`)

Second argument to new LemonGrove(orchestratorConfig, options?):

| Option | Type | Default |
|--------|------|---------|
| sharedMemory | boolean \| GroveSharedMemory | false |
| maxContextTurns | number | 20 |
| humanGate | HumanGate | — |
| defaultRequiresApproval | boolean | — |
| jobStore | JobStore | InMemoryJobStore |
| delegationTrace | boolean | false |
| conductorMode | boolean | false |
| albedo | boolean \| AlbedoStore | false |
| outputBuffer | boolean \| BufferPolicy | false |
| chelators | Chelator[] | [] |
| roleTemplates | Record<string, string> | — |
| internalTone | string | — |
| outputTone | string | — |

`BufferPolicy` (when `outputBuffer` is an object)

| Field | Type | Default |
|-------|------|---------|
| minObservationLength | number | 20 |
| blockPatterns | RegExp[] | built-in low-signal patterns |
| confidenceScorer | (task, output) => Promise<number> | — |
| minConfidence | number | 0.3 (when scorer set) |
| onReject | 'omit' \| 'placeholder' \| 'retry' | 'placeholder' |

`AlbedoStore` / `AlbedoRecord`

| AlbedoStore method | Description |
|---------------------|-------------|
| append(record) | Store a run’s intermediate layer |
| get(runId) | Fetch by id |
| listBySession?(sessionId) | Optional list for session-scoped history |

AlbedoRecord fields: runId, jobId?, sessionId?, timestamp, input, steps, specialistOutputs?, delegationTrace?, output?.

AgentResult.albedoId and GroveJob.albedoId are set when albedo is enabled.
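A custom store only needs the three methods listed above. The sketch below shows one possible in-memory shape; MapAlbedoStore and AlbedoRecordSketch are hypothetical names, the record type is a subset of the listed fields with guessed types, and the Promise-based signatures are an assumption based on getAlbedo being awaited in the earlier example.

```typescript
// Subset of the AlbedoRecord fields listed above; the field types are assumptions.
interface AlbedoRecordSketch {
  runId: string;
  sessionId?: string;
  timestamp: number;
  input: string;
  steps: unknown[];
  output?: string;
}

// Hypothetical in-memory store keyed by runId.
class MapAlbedoStore {
  private records = new Map<string, AlbedoRecordSketch>();

  async append(record: AlbedoRecordSketch): Promise<void> {
    this.records.set(record.runId, record);
  }

  async get(runId: string): Promise<AlbedoRecordSketch | undefined> {
    return this.records.get(runId);
  }

  async listBySession(sessionId: string): Promise<AlbedoRecordSketch[]> {
    return Array.from(this.records.values()).filter((r) => r.sessionId === sessionId);
  }
}
```

A database-backed store would follow the same outline, with append as an insert and listBySession as an indexed query on sessionId.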

`LemonGrove` methods

run(input, options?) → Promise<AgentResult> (blocking)
enqueue(input, options?) → Promise<{ jobId }> (non-blocking)
getJob(jobId) → Promise<GroveJob | undefined>
approve(jobId, decision, approvalId?) — resume after human gate
addSpecialist(name, description, agent, options?) — register specialist (maxRetries, activateWhen, …)
getSpecialist(name) — lookup registered specialist agent
getAlbedo(runId) — retrieve persisted intermediate record
listAlbedoBySession(sessionId) — list albedo records for a session (when store supports it)
getOrchestrator() / getSharedMemory()

`LemonAgent` methods

invoke(options) → Promise<AgentResult>
stream(options) → AsyncGenerator<StreamChunk>
plantSeed(definition) — add a specialist at runtime
addTool(tool) — attach another tool
asTool(name, description) — expose this agent to a parent

Helpers

lemonTool({ name, description, schema, run, activateWhen? }) — define a tool without LangChain imports
lemonSeed({ name, description, agent, activateWhen? }) — seed definition for seeds config
defaultSeedInput — default Zod schema for seed tools (task, context?, persona?)
z — re-exported Zod for schemas
resolveModel(input) — resolve shorthand to BaseChatModel (OpenAI, Anthropic, Gemini)
inferProvider(modelId) — detect provider from model string
createLemonAgentSync(config) — when you already have a BaseChatModel

P6 — chelators: sqlInjectionChelator, emptyCitationChelator, contradictionChelator, defaultChelators, applyChelators

P6 — buffer: applyOutputBuffer, normalizeBufferPolicy

P6 — albedo: InMemoryAlbedoStore, createAlbedoRunId

P6 — conductor: CONDUCTOR_SYSTEM_APPEND, applyConductorSystemMessage

P6 — tone: formatForUser (used internally when outputTone is set)

Types also exported: BufferPolicy, Chelator, ChelatorResult, ToolCondition, AlbedoRecord, AlbedoStore, LemonGroveOptions

Advanced (LangChain)

For full control, pass a BaseChatModel and LangChain Tool instances:

import { ChatOpenAI } from '@langchain/openai';
import { createLemonAgentSync } from 'lemon-ai-agent';

const agent = createLemonAgentSync({
  model: new ChatOpenAI({ model: 'gpt-4o-mini' }),
});

Requirements

Node.js 20.15+
A chat model that supports tool calling (bindTools on the model instance)

Built with LangChain packages (@langchain/core, @langchain/classic, and bundled OpenAI/Anthropic/Gemini integrations). Lemon AI Agent is the product; LangChain is the engine under the hood.

Troubleshooting

bindTools error — Use a tool-capable model (e.g. gpt-4o-mini, Claude 3.5+, Gemini 1.5+).

Local development — npm run build, then npm link or "lemon-ai-agent": "file:..".

Brand

Logo and voice: brand/BRAND.md

Agent Studio (UI preview)

Multi-agent chat UI powered by lemon-ai-agent and your local Unsloth endpoint. See examples/README.md (Unsloth Studio) to run the agent against Studio.

License

MIT — see LICENSE.