agentic-ai-framework
v1.0.0
Reusable agentic framework with session memory, tool calling, CoT, and multi-agent team support
@insider/agent-framework
Reusable agentic framework for Node.js (ES Modules). Provides session memory, LLM-driven tool-calling, Chain-of-Thought, and multi-agent team coordination with pluggable LLM providers.
Installation
// In another package inside the monorepo
import { createAgent, createTeam, Tool } from '@insider/agent-framework';
Quick Start
import 'dotenv/config';
import { createAgent, Tool } from '@insider/agent-framework';
const agent = createAgent({
name: 'my-agent',
provider: 'grok', // 'grok' | 'claude' | 'openai'
apiKey: process.env.GROK_API_KEY,
systemPromptTemplate: 'You are a helpful assistant for ${userName}.',
chainOfThought: true,
});
agent.registerTool(new Tool({
name: 'get_data',
description: 'Fetch data by ID',
parameters: {
type: 'object',
properties: { id: { type: 'string' } },
required: ['id'],
},
handler: async ({ id }) => `Data for ${id}`,
}));
agent.setContext('userName', 'Alice');
const result = await agent.run('Fetch data for item-42');
console.log(result.text);Core Concepts
Agent
The main class. Each agent owns an LLM provider, tool registry, session memory, and a prompt builder.
One instance per request: create a fresh agent per HTTP request. Instances share no mutable state.
import { createAgent } from '@insider/agent-framework';
const agent = createAgent(options); // shorthand for: new Agent(new AgentConfig(options))
AgentConfig Options
All options passed to createAgent(). Unknown keys throw an error (strict validation).
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| name | string | required | Unique agent name |
| provider | 'grok'│'claude'│'openai' | required | LLM provider |
| apiKey | string | required | Provider API key |
| systemPromptTemplate | string | one of these required | Inline system prompt with ${var} placeholders |
| systemPromptFile | string | one of these required | Path to .md/.txt prompt file |
| model | string | provider default | Model ID |
| temperature | number | 0.1 | Sampling temperature (0–2) |
| maxTokens | number | 4000 | Max output tokens |
| maxToolIterations | number | 5 | Max tool-calling loop iterations |
| loopTimeoutMs | number | 300000 | Tool loop timeout in ms (5 min) |
| requestTimeoutMs | number | provider default | Individual LLM request timeout |
| maxHistoryMessages | number | 50 | Max session history messages |
| persistenceDir | string | null | Directory for session JSON files |
| chainOfThought | boolean | true | Inject CoT reasoning block into prompt |
| cotMode | 'prompt'│'reflect' | 'prompt' | CoT strategy |
| cotStyle | 'step-by-step'│'pros-cons'│'custom' | 'step-by-step' | CoT block style |
| cotCustomInstructions | string | null | Custom CoT text (when cotStyle='custom') |
| description | string | '' | Agent description (used by AgentTeam) |
| outputSchema | object | null | JSON Schema for structured output |
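The strict validation mentioned above ("unknown keys throw") can be pictured with a sketch like this. The option list and error wording are illustrative, not the framework's actual internals:

```javascript
// Sketch of strict option validation: any key outside the known set throws.
// KNOWN_OPTIONS mirrors the table above; the real list lives in AgentConfig.
const KNOWN_OPTIONS = new Set([
  'name', 'provider', 'apiKey', 'systemPromptTemplate', 'systemPromptFile',
  'model', 'temperature', 'maxTokens', 'maxToolIterations', 'loopTimeoutMs',
  'requestTimeoutMs', 'maxHistoryMessages', 'persistenceDir', 'chainOfThought',
  'cotMode', 'cotStyle', 'cotCustomInstructions', 'description', 'outputSchema',
]);

function validateOptions(options) {
  const unknown = Object.keys(options).filter((k) => !KNOWN_OPTIONS.has(k));
  if (unknown.length > 0) {
    throw new Error(`Unknown AgentConfig option(s): ${unknown.join(', ')}`);
  }
  return options;
}
```

The practical benefit: a typo like `temprature` fails fast at construction time instead of being silently ignored at run time.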
Tools
Defining a Tool
import { Tool } from '@insider/agent-framework';
const myTool = new Tool({
name: 'search_docs', // snake_case, unique
description: 'Search documentation', // shown to the LLM — be descriptive
parameters: { // JSON Schema object
type: 'object',
properties: {
query: { type: 'string', description: 'Search query' },
limit: { type: 'number', description: 'Max results' },
strict: { type: 'boolean', description: 'Exact match only' },
},
required: ['query'],
},
handler: async ({ query, limit = 10, strict }) => {
// Do real work here
return `Results for "${query}"`; // return string or JSON-serializable object
},
});
A Zod schema is also accepted for parameters:
import { z } from 'zod';
const myTool = new Tool({
name: 'create_ticket',
description: 'Create a support ticket',
parameters: z.object({
title: z.string(),
priority: z.enum(['low', 'medium', 'high']),
assignee: z.string().optional(),
}),
handler: async ({ title, priority, assignee }) => {
// ...
},
});
Supported Zod types: z.string(), z.number(), z.boolean(), z.array(), z.enum(), z.object(), z.optional(), z.nullable(), z.default(). All others throw — use a plain JSON Schema object for complex types.
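For comparison, the `create_ticket` Zod schema above should translate to roughly this JSON Schema. This is a sketch of the expected conversion; the framework's exact output may differ, for example in how it represents optional fields:

```javascript
// Approximate JSON Schema equivalent of the z.object() in the example above.
// `assignee` is optional in Zod, so it is simply omitted from `required`.
const createTicketParameters = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    priority: { type: 'string', enum: ['low', 'medium', 'high'] },
    assignee: { type: 'string' },
  },
  required: ['title', 'priority'],
};
```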
Registering Tools
agent.registerTool(myTool); // single tool
agent.registerTools([tool1, tool2]); // multiple tools — both return agent (chainable)
// Chainable
agent
.registerTool(searchTool)
.registerTool(createTool);
Handler Contract
- Receives parsed argument object from the LLM
- Returns a string or any JSON-serializable value (objects are auto-stringified)
- Throwing an error causes the runner to report the error to the LLM as the tool result — the LLM can then decide how to recover
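That error-to-result behavior can be sketched as follows. This is illustrative runner logic, not the framework's actual implementation:

```javascript
// Sketch: execute a tool handler and convert both results and thrown errors
// into a string the LLM can read on its next iteration.
async function executeToolSafely(tool, args) {
  try {
    const result = await tool.handler(args);
    // Strings pass through; objects are auto-stringified.
    return typeof result === 'string' ? result : JSON.stringify(result);
  } catch (err) {
    // The error message becomes the tool result, so the LLM can recover
    // (retry with different arguments, pick another tool, or explain the failure).
    return `Tool "${tool.name}" failed: ${err.message}`;
  }
}
```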
Session Memory
Stateful (Session) Pattern
// Start or resume a named session
const sessionId = await agent.startSession('user-123');
// Inject variables into system prompt as ${key}
agent.setContext('userName', 'Alice');
agent.setContext('role', 'admin');
const r1 = await agent.run('What tickets are assigned to me?');
const r2 = await agent.run('Show only the high-priority ones.'); // remembers previous turn
await agent.saveSession(); // persist to disk (requires persistenceDir in config)
await agent.endSession(); // clear in-memory state
// await agent.endSession(true); // also delete the file on disk
Resuming a session restores history and context from disk:
// Next request — session restored automatically
await agent.startSession('user-123'); // loads history from {persistenceDir}/user-123.json
Stateless (Per-Request) Pattern
// No startSession() needed — pass appendToHistory: false
const result = await agent.run(userInput, { appendToHistory: false });
Context Values
setContext(key, value) injects values into system prompt ${key} placeholders. Values must be JSON-serializable (strings, numbers, booleans, arrays, plain objects). Functions and Symbols are rejected.
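The substitution itself can be pictured with a small sketch. This is assumed behavior, not the framework's actual code; in particular, how unknown keys are handled is a guess:

```javascript
// Sketch: replace ${key} placeholders in a prompt template with context values.
function renderTemplate(template, context) {
  return template.replace(/\$\{(\w+)\}/g, (match, key) =>
    key in context ? String(context[key]) : match // unknown keys left as-is
  );
}
```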
agent.setContext('companyName', 'Insider');
agent.setContext('userTier', 'enterprise');
// System prompt: "You are an assistant for ${companyName} users on the ${userTier} plan."
// Rendered as: "You are an assistant for Insider users on the enterprise plan."
AgentResult
Every agent.run() returns an AgentResult:
{
success: true, // false if LLM failed or max iterations reached
text: 'Final answer...', // convenience: best text representation of the answer
content: 'Raw LLM output', // raw text from the last LLM response
parsed: null, // parsed JSON when outputSchema is set
toolCallHistory: [ // all tool calls that happened in this run
{
id: 'call_abc',
name: 'get_data',
arguments: { id: 'item-42' },
result: 'Data for item-42',
iteration: 1,
timestamp: '2026-03-13T10:00:00.000Z',
}
],
usage: {
promptTokens: 120,
completionTokens: 45,
totalTokens: 165,
},
iterations: 1, // number of tool-calling iterations
error: undefined, // error message when success=false
cotTrace: undefined, // present when cotMode='reflect'
}
Chain of Thought (CoT)
Prompt Mode (default)
A reasoning block is appended to the system prompt. Zero extra LLM calls.
createAgent({
// ...
chainOfThought: true,
cotMode: 'prompt', // default
cotStyle: 'step-by-step', // default
});
Available styles:
- 'step-by-step' — numbered reasoning steps before answering
- 'pros-cons' — trade-off analysis before deciding
- 'custom' — provide your own instructions via cotCustomInstructions
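In prompt mode, the style simply selects which reasoning block gets appended to the system prompt. A sketch of that selection (the instruction wording here is hypothetical; the framework's actual text will differ):

```javascript
// Sketch: choose the CoT instruction block appended to the system prompt.
function buildCotBlock(style, customInstructions = null) {
  switch (style) {
    case 'step-by-step':
      return 'Before answering, reason through the problem in numbered steps.';
    case 'pros-cons':
      return 'Before deciding, list the pros and cons of each option.';
    case 'custom':
      if (!customInstructions) throw new Error('cotCustomInstructions is required when cotStyle is "custom"');
      return customInstructions;
    default:
      throw new Error(`Unknown cotStyle: ${style}`);
  }
}
```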
Reflect Mode
After the tool-calling loop produces a final answer, a second LLM call verifies it. Use for high-stakes agents where accuracy matters more than cost.
createAgent({
// ...
chainOfThought: true,
cotMode: 'reflect',
});
// result.cotTrace = { original, reflected, changed: true/false }
Multi-Agent Teams
Router Mode
The coordinator LLM decides which specialist(s) to call via tool-calling. Maps to the existing MasterAgent pattern.
import { createAgent, createTeam } from '@insider/agent-framework';
const sqlAgent = createAgent({
name: 'sql-agent',
provider: 'grok',
apiKey: process.env.GROK_API_KEY,
description: 'Answers questions by generating SQL queries',
systemPromptTemplate: 'You are a SQL specialist. Table: tickets(id, status, priority).',
});
const ragAgent = createAgent({
name: 'rag-agent',
provider: 'grok',
apiKey: process.env.GROK_API_KEY,
description: 'Searches documentation for policy questions',
systemPromptTemplate: 'You are a knowledge base specialist.',
});
const coordinator = createAgent({
name: 'coordinator',
provider: 'grok',
apiKey: process.env.GROK_API_KEY,
systemPromptTemplate: `You orchestrate specialist agents.
Available specialists:
\${teamMembers}
Always delegate to the most appropriate specialist.`,
});
const team = createTeam({
coordinator,
members: [sqlAgent, ragAgent],
mode: 'router', // default
});
const result = await team.run('How many open tickets are there?');
console.log(result.final); // final synthesized answer
console.log(result.memberResults); // per-member results keyed by agent name
The ${teamMembers} placeholder in the coordinator's system prompt is auto-populated with the member list.
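How ${teamMembers} might be built from each member's name and description can be sketched like this. The bullet formatting is an assumption; the framework's actual rendering may differ:

```javascript
// Sketch: build the member list injected into the coordinator's ${teamMembers}.
// Each member contributes its `name` and `description` config options.
function formatTeamMembers(members) {
  return members
    .map((m) => `- ${m.name}: ${m.description || '(no description)'}`)
    .join('\n');
}

const listing = formatTeamMembers([
  { name: 'sql-agent', description: 'Answers questions by generating SQL queries' },
  { name: 'rag-agent', description: 'Searches documentation for policy questions' },
]);
```

This is why the `description` option matters: it is the only information the coordinator LLM has when deciding which specialist to delegate to.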
Parallel Mode
All members run simultaneously. The coordinator synthesizes all responses.
const team = createTeam({
coordinator: synthesizerAgent,
members: [sqlAgent, ragAgent],
mode: 'parallel',
});
const result = await team.run('How many critical tickets are open and what is the SLA?');
// sqlAgent and ragAgent run in parallel, coordinator combines both answers
Use parallel mode when a question genuinely needs multiple specialists. Use router when only one specialist is needed per question.
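Parallel mode amounts to fanning the same input out to every member, collecting results by agent name, and handing them to the coordinator. A sketch with stubbed agents standing in for real createAgent() instances (illustrative, not the framework's implementation):

```javascript
// Sketch: run all members concurrently; a failed member becomes a
// { success: false } entry instead of failing the whole team run.
async function runParallel(members, input) {
  const settled = await Promise.allSettled(members.map((m) => m.run(input)));
  const memberResults = {};
  members.forEach((m, i) => {
    const s = settled[i];
    memberResults[m.name] =
      s.status === 'fulfilled' ? s.value : { success: false, error: s.reason.message };
  });
  return memberResults;
}

// Stub agents (hypothetical answers) just to exercise the flow:
const stubMembers = [
  { name: 'sql-agent', run: async () => ({ success: true, text: '42 open tickets' }) },
  { name: 'rag-agent', run: async () => ({ success: true, text: 'SLA is 4 hours' }) },
];
```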
TeamResult
{
success: true,
final: 'Synthesized answer...',
memberResults: {
'sql-agent': { success: true, text: '...', content: '...', toolCallHistory: [] },
'rag-agent': { success: true, text: '...', content: '...', toolCallHistory: [] },
},
coordinatorResult: { /* AgentResult */ },
mode: 'router',
error: null,
}
// Helper methods
result.getMemberResult('sql-agent'); // get one member's result
result.getSuccessfulMembers(); // ['sql-agent', 'rag-agent']
Dynamic Member Management
team.addMember(analyticsAgent); // add at runtime
team.removeMember('sql-agent'); // remove by name
team.getMembers(); // ['rag-agent', 'analytics-agent']
team.getInfo(); // coordinator + members info + mode
LLM Providers
Built-in Providers
| Name | API | Default Model |
|------|-----|---------------|
| 'grok' | xAI / OpenAI-compatible | grok-code-fast-1 |
| 'claude' | Anthropic Messages API | claude-sonnet-4-20250514 |
| 'openai' | OpenAI Chat Completions | gpt-4o |
// Grok
createAgent({ provider: 'grok', apiKey: process.env.GROK_API_KEY, model: 'grok-2', ... });
// Claude
createAgent({ provider: 'claude', apiKey: process.env.ANTHROPIC_API_KEY, model: 'claude-opus-4-20250514', ... });
// OpenAI
createAgent({ provider: 'openai', apiKey: process.env.OPENAI_API_KEY, model: 'gpt-4o-mini', ... });
Provider instances are cached by (provider, model, apiKey) — the same combination always returns the same instance.
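That caching rule can be pictured as a Map keyed by the (provider, model, apiKey) triple. This is a sketch of the idea, not the actual LLMRouter code:

```javascript
// Sketch: cache provider instances so the same triple reuses one object.
const providerCache = new Map();

function getCachedProvider(provider, model, apiKey, factory) {
  const key = `${provider}::${model}::${apiKey}`;
  if (!providerCache.has(key)) {
    providerCache.set(key, factory()); // only constructed on first use
  }
  return providerCache.get(key);
}
```

One consequence worth noting: because instances are shared across agents, custom providers should avoid per-request mutable state.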
Custom Provider
import { BaseLLMProvider, LLMRouter } from '@insider/agent-framework';
class MyProvider extends BaseLLMProvider {
constructor(apiKey, model = 'my-model') {
super();
this.apiKey = apiKey;
this.model = model;
}
async complete(prompt, options = {}) {
const messages = options.messages ?? [{ role: 'user', content: prompt }];
// call your API...
return {
content: 'response text',
parsed: null,
usage: { promptTokens: 0, completionTokens: 0, totalTokens: 0 },
model: this.model,
finishReason: 'stop',
toolCalls: undefined,
messages: [...messages, { role: 'assistant', content: 'response text' }],
};
}
async completeWithSchema(prompt, schema, options = {}) { /* ... */ }
async testConnection() { return true; }
}
LLMRouter.register('my-provider', MyProvider);
// Now usable in createAgent
createAgent({ provider: 'my-provider', apiKey: '...', ... });
Structured Output
Force the LLM to respond with a specific JSON shape:
const agent = createAgent({
// ...
outputSchema: {
type: 'object',
properties: {
answer: { type: 'string' },
confidence: { type: 'number' },
sources: { type: 'array', items: { type: 'string' } },
},
required: ['answer', 'confidence'],
},
});
const result = await agent.run('What is our refund policy?');
console.log(result.parsed.answer); // string
console.log(result.parsed.confidence); // number
Error Handling
The framework throws typed errors — catch them specifically:
import {
AgentError, ToolError, LLMError, ConfigError, MemoryError
} from '@insider/agent-framework';
try {
const result = await agent.run(userInput);
if (!result.success) {
console.error('Agent failed:', result.error);
}
} catch (err) {
if (err instanceof LLMError) { /* API key bad, network down, etc. */ }
if (err instanceof ToolError) { /* tool registration or execution issue */ }
if (err instanceof MemoryError) { /* session load/save failed */ }
if (err instanceof ConfigError) { /* bad AgentConfig options */ }
if (err instanceof AgentError) { /* other agent-level issue */ }
}
agent.run() itself does not throw on LLM failures — it returns { success: false, error: '...' }. It only throws for programming errors (wrong arguments, missing session, etc.).
Transient LLM errors (rate limits, 5xx, timeouts) are automatically retried up to 2 times with exponential backoff before failing.
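A sketch of that retry policy (up to 2 retries with exponentially growing delays; the base delay here is shortened so the example runs instantly, and the real delay values are assumptions):

```javascript
// Sketch: retry a transiently failing call up to `retries` extra times,
// waiting baseDelayMs, 2*baseDelayMs, 4*baseDelayMs, ... between attempts.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(fn, retries = 2, baseDelayMs = 1) {
  let lastErr;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastErr; // all attempts exhausted
}
```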
Logging
Structured JSON logging via pino. Set log level with the AGENT_LOG_LEVEL environment variable.
AGENT_LOG_LEVEL=debug node app.js # debug | info | warn | error | silent
In development, if pino-pretty is installed, logs are pretty-printed. In production (NODE_ENV=production) plain JSON is used regardless.
Complete Backend API Handler Example
import { createAgent, Tool, AgentError } from '@insider/agent-framework';
import { queryDatabase } from './db.js';
// Agent definition is typically created once at module load
function buildSQLAgent() {
const agent = createAgent({
name: 'sql-agent',
provider: 'grok',
apiKey: process.env.GROK_API_KEY,
systemPromptTemplate: `You are a SQL specialist for \${companyName}.
Table: zendesk_tickets(id, status, assignee, created_at, priority, subject)`,
chainOfThought: true,
maxToolIterations: 3,
});
agent.registerTool(new Tool({
name: 'run_sql',
description: 'Execute a SQL query and return results',
parameters: {
type: 'object',
properties: { sql: { type: 'string', description: 'SQL query to execute' } },
required: ['sql'],
},
handler: async ({ sql }) => {
const rows = await queryDatabase(sql);
return JSON.stringify(rows);
},
}));
return agent;
}
// Express / Fastify handler
export async function handleQuestion(req, res) {
const { question, sessionId, companyName } = req.body;
// Fresh agent per request — no shared state
const agent = buildSQLAgent();
agent.setContext('companyName', companyName);
let activeSessionId;
try {
activeSessionId = await agent.startSession(sessionId); // restores history if exists
const result = await agent.run(question);
if (!result.success) {
return res.status(500).json({ error: result.error });
}
await agent.saveSession();
return res.json({
answer: result.text,
sessionId: activeSessionId,
toolsUsed: result.toolCallHistory.map(t => t.name),
});
} catch (err) {
return res.status(500).json({ error: err.message });
} finally {
await agent.endSession(); // always clear in-memory state
}
}
Public API Reference
createAgent(options) → Agent
createTeam(options) → AgentTeam
Agent
.registerTool(tool) → Agent (chainable)
.registerTools(tools[]) → Agent (chainable)
.getToolRegistry() → ToolRegistry
.startSession(sessionId?) → Promise<string>
.saveSession() → Promise<void>
.endSession(deletePersisted?) → Promise<void>
.setContext(key, value) → Agent (chainable)
.getContext(key) → any
.run(input, opts?) → Promise<AgentResult>
.getInfo() → Object
.testConnection() → Promise<boolean>
AgentTeam
.addMember(agent) → void
.removeMember(name) → void
.getMembers() → string[]
.run(input, opts?) → Promise<TeamResult>
.getInfo() → Object
Tool
new Tool({ name, description, parameters, handler })
.toDefinition() → { name, description, parameters }
.execute(args) → Promise<string | object>
ToolRegistry
.register(tool) → void (throws if duplicate)
.registerOrReplace(tool) → void
.unregister(name) → void
.has(name) → boolean
.listNames() → string[]
.getDefinitions() → Array
.execute(name, args) → Promise<string>
LLMRouter
.get(provider, model, apiKey) → BaseLLMProvider
.register(name, Class) → void
.clearCache() → void
.listProviders() → string[]
SessionMemory
.appendExchange(user, assistant) → void
.getHistory() → Array
.setContext(key, value) → void
.getContext(key) → any
.snapshot() → Object
.restore(snapshot) → void
.clear() → void
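The SessionMemory surface above can be sketched as a small class. This is illustrative only; the snapshot field names and the history-trimming strategy are assumptions:

```javascript
// Sketch of the SessionMemory contract: bounded history, context values,
// and snapshot/restore for persistence via MemoryManager.
class SessionMemorySketch {
  constructor(maxHistoryMessages = 50) {
    this.max = maxHistoryMessages;
    this.history = [];
    this.context = {};
  }
  appendExchange(user, assistant) {
    this.history.push({ role: 'user', content: user });
    this.history.push({ role: 'assistant', content: assistant });
    // Drop the oldest messages once the cap is exceeded.
    if (this.history.length > this.max) this.history = this.history.slice(-this.max);
  }
  getHistory() { return [...this.history]; }
  setContext(key, value) { this.context[key] = value; }
  getContext(key) { return this.context[key]; }
  snapshot() { return { history: this.getHistory(), context: { ...this.context } }; }
  restore(snap) { this.history = [...snap.history]; this.context = { ...snap.context }; }
  clear() { this.history = []; this.context = {}; }
}
```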
MemoryManager
.load(sessionId) → Promise<Object | null>
.save(sessionId, snapshot) → Promise<void>
.delete(sessionId) → Promise<void>
.list() → Promise<string[]>