concevent-ai-agent-sdk
v2.1.1
A framework-agnostic AI Agent SDK for building intelligent conversational agents with tool calling, automatic conversation summarization, and comprehensive event handling. Built on top of OpenAI-compatible APIs (including OpenRouter).
Features
- 🤖 AI Agent Framework - Create intelligent agents with tool/function calling capabilities
- 🔧 Tool Execution - Define and execute custom tools with typed parameters using Zod schemas
- 💬 Conversation Management - Automatic history tracking with built-in summarization
- 📊 Token Usage Tracking - Monitor token consumption across conversations
- 🔄 Auto-Summarization - Automatically summarize long conversations to stay within context limits
- 🎯 Event Callbacks - Comprehensive event system for real-time updates
- ⛔ Abort Support - Cancel ongoing requests with AbortController
- 🔄 Retry Logic - Built-in retry mechanism for resilient operations
- ⏱️ Timeout Control - Configurable timeouts for API requests and tool execution
- 📝 Reasoning Support - Access model reasoning/thinking outputs
- 🔌 Middleware System - Extensible plugin architecture for logging, sanitization, and more
- 📋 Structured Output - JSON schema validation with Zod for type-safe responses
- 🎭 Multi-Agent Orchestration - Route tasks to specialized sub-agents based on their capabilities
Installation
npm install concevent-ai-agent-sdk
# or
yarn add concevent-ai-agent-sdk
# or
pnpm add concevent-ai-agent-sdk
Quick Start
import { createAgent } from "concevent-ai-agent-sdk";
import type {
ToolDefinition,
ToolExecutorContext,
} from "concevent-ai-agent-sdk";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
// Define your tools
const tools: ToolDefinition[] = [
{
declaration: {
name: "getWeather",
description: "Get current weather for a city",
parametersJsonSchema: zodToJsonSchema(
z.object({
city: z.string().describe("City name"),
})
),
},
executor: async (args) => {
const city = args.city as string;
// Your weather API logic here
return { city, temperature: 22, condition: "sunny" };
},
},
];
// Create the agent
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful weather assistant."],
tools,
});
// Chat with the agent
const result = await agent.chat("What's the weather in Tokyo?", {
userId: "user-123",
timezone: "Asia/Tokyo",
});
console.log(result.message);
// "The weather in Tokyo is currently sunny with a temperature of 22°C."
Core Concepts
Agent
An Agent is the main interface for interacting with AI models. It manages conversation history, executes tools, handles token usage tracking, and provides automatic summarization when context limits are reached.
Tools
Tools (also known as functions) extend the agent's capabilities. Each tool has:
- A declaration describing its name, purpose, and parameter schema
- An executor function that performs the actual work
Conversation History
The agent automatically maintains conversation history, including user messages, assistant responses, and tool calls. This history can be retrieved, set, or cleared as needed.
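One practical detail when persisting history: `ChatMessage.timestamp` is a `Date`, which `JSON.stringify` flattens to a string, so it must be revived on load before passing the history back to `setHistory()`. The helpers below are a minimal sketch (they are not part of the SDK) using a trimmed-down local `ChatMessage` type:

```typescript
// Minimal sketch (hypothetical helpers, not part of the SDK) for
// round-tripping conversation history through JSON storage.
interface ChatMessage {
  role: "user" | "assistant" | "system" | "tool-call" | "tool-call-results";
  content: string;
  timestamp: Date;
}

function serializeHistory(history: ChatMessage[]): string {
  return JSON.stringify(history);
}

function deserializeHistory(json: string): ChatMessage[] {
  return (JSON.parse(json) as ChatMessage[]).map((m) => ({
    ...m,
    // Revive the Date that JSON.stringify flattened to an ISO string
    timestamp: new Date(m.timestamp),
  }));
}

const saved = serializeHistory([
  { role: "user", content: "Hello", timestamp: new Date() },
]);
const restored = deserializeHistory(saved);
// restored is now safe to hand to agent.setHistory(restored)
```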
API Reference
createAgent
Creates a new agent instance with the specified configuration.
import { createAgent } from 'concevent-ai-agent-sdk';
const agent = createAgent(config: AgentConfig): Agent;
AgentConfig
| Property | Type | Required | Default | Description |
| ----------------- | ----------------------------- | -------- | ------------------- | -------------------------------------------------------- |
| apiKey | string | ✅ | - | API key for the AI provider |
| model | string | ✅ | - | Model identifier (e.g., 'anthropic/claude-3.5-sonnet') |
| systemPrompts | string[] | ✅ | - | Array of system prompt messages |
| tools | ToolDefinition[] | ✅ | - | Array of tool definitions |
| baseURL | string | ❌ | OpenRouter default | Custom API base URL |
| temperature | number | ❌ | 0.1 | Sampling temperature (0-2) |
| reasoningEffort | 'low' \| 'medium' \| 'high' | ❌ | 'high' | Reasoning effort level for supported models |
| maxIterations | number | ❌ | 20 | Maximum tool execution iterations per chat |
| stream | boolean | ❌ | true | Enable streaming responses with delta callbacks |
| summarization | SummarizationConfig | ❌ | { enabled: true } | Summarization settings |
| parallelToolExecution | ParallelExecutionConfig | ❌ | { maxConcurrency: 5 } | Parallel tool execution settings |
| errorMessages | ErrorMessages | ❌ | Default messages | Custom error messages |
| retry | RetryConfig | ❌ | { maxAttempts: 3 } | Retry configuration for API failures and validation retries |
| timeout | TimeoutConfig | ❌ | See TimeoutConfig | Timeout configuration for API requests and tool execution |
| responseFormat | ResponseFormatConfig | ❌ | undefined (text) | Structured output format (text, json_object, json_schema) |
| middleware | Middleware[] | ❌ | [] | Middleware array for intercepting agent behavior |
SummarizationConfig
interface SummarizationConfig {
enabled: boolean;
prompt?: string; // Custom summarization prompt
contextLimitTokens?: number; // Default: 100,000 tokens
}
ParallelExecutionConfig
interface ParallelExecutionConfig {
maxConcurrency?: number; // Default: 5 - Max parallel tools to run simultaneously
}
RetryConfig
interface RetryConfig {
maxAttempts?: number; // Default: 3 - Total attempts including initial (also used for validation retries)
baseDelayMs?: number; // Default: 1000 - Initial delay before first retry
maxDelayMs?: number; // Default: 30000 - Maximum delay cap
backoffMultiplier?: number; // Default: 2 - Exponential backoff multiplier
}
TimeoutConfig
interface TimeoutConfig {
toolExecutionMs?: number; // Default: 30000 (30 seconds) - Per-tool execution timeout
apiRequestMs?: number; // Default: 120000 (2 minutes) - OpenAI API request timeout
}
ResponseFormatConfig
type ResponseFormatConfig =
| { type: 'text' } // Default free-form text
| { type: 'json_object' } // JSON without validation
| {
type: 'json_schema';
schema: ZodType<unknown>; // Zod schema for validation
name?: string; // Schema name (default: 'response')
strict?: boolean; // Strict mode (default: true)
};
Example with Full Configuration
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
baseURL: "https://openrouter.ai/api/v1",
temperature: 0.7,
reasoningEffort: "high",
maxIterations: 10,
systemPrompts: [
"You are a helpful assistant.",
"Always be concise and accurate.",
],
tools: myTools,
summarization: {
enabled: true,
contextLimitTokens: 50000,
prompt: "Summarize the key points of this conversation...",
},
errorMessages: {
maxIterations: "Too many steps required. Please simplify your request.",
genericError: "Something went wrong. Please try again.",
},
});
Agent Interface
The agent instance returned by createAgent implements the following interface:
interface Agent {
chat(
message: string,
context: ToolExecutorContext,
callbacks?: AgentCallbacks
): Promise<AgentResult>;
abort(): void;
getHistory(): ChatMessage[];
setHistory(history: ChatMessage[]): void;
clearHistory(): void;
getTokenUsage(): UsageMetadata;
}
agent.chat()
Sends a message to the agent and receives a response. The agent may execute multiple tool calls before returning a final response.
const result = await agent.chat(
message: string,
context: ToolExecutorContext,
callbacks?: AgentCallbacks
): Promise<AgentResult>;
Parameters:
| Parameter | Type | Description |
| ----------- | --------------------- | --------------------------------- |
| message | string | The user's message |
| context | ToolExecutorContext | Execution context passed to tools |
| callbacks | AgentCallbacks | Optional event callbacks |
Returns: AgentResult
interface AgentResult<T = unknown> {
message: string; // Final response message
parsedResponse?: T; // Parsed/validated response (when using json_schema)
reasoning?: {
// Model's reasoning (if available)
text?: string;
details?: ReasoningDetail[];
tokenCount?: number;
};
conversationHistory: ChatMessage[]; // Full conversation history
usageMetadata: UsageMetadata; // Token usage statistics
requestId?: string; // API request ID
iterations: number; // Number of iterations taken
summarized: boolean; // Whether summarization occurred
}
Example:
const result = await agent.chat(
"Calculate 25 * 4 and tell me the weather in Paris",
{ userId: "user-123", timezone: "Europe/Paris" },
{
onToolCallStart: (calls) => {
console.log(
"Executing tools:",
calls.map((c) => c.name)
);
},
onMessage: (message) => {
console.log("Response:", message);
},
}
);
agent.abort()
Aborts the current chat request.
agent.abort();
agent.getHistory()
Returns the current conversation history.
const history: ChatMessage[] = agent.getHistory();
agent.setHistory()
Sets the conversation history (useful for restoring sessions).
agent.setHistory(previousHistory);
agent.clearHistory()
Clears all conversation history.
agent.clearHistory();
agent.getTokenUsage()
Returns cumulative token usage statistics.
const usage: UsageMetadata = agent.getTokenUsage();
console.log(`Total tokens used: ${usage.totalTokenCount}`);
Tool Definitions
Tools extend the agent's capabilities by allowing it to perform actions or retrieve information.
interface ToolDefinition {
declaration: FunctionDeclaration;
executor: ToolExecutor;
parallel?: boolean; // Set to true for tools safe to run concurrently (default: false)
}
interface FunctionDeclaration {
name: string;
description: string;
parametersJsonSchema: JsonSchema7Type;
}
type ToolExecutor = (
args: Record<string, unknown>,
context: ToolExecutorContext
) => Promise<unknown>;
interface ToolExecutorContext {
userId: string;
timezone: string;
abortSignal?: AbortSignal;
}
Creating Tools with Zod
Using Zod for parameter validation is recommended:
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import type { ToolDefinition } from 'concevent-ai-agent-sdk';
// Define parameter schema with Zod
const searchSchema = z.object({
query: z.string().describe('Search query'),
limit: z.number().optional().default(10).describe('Maximum results'),
});
// Create the tool
const searchTool: ToolDefinition = {
declaration: {
name: 'search',
description: 'Search for information in the database',
parametersJsonSchema: zodToJsonSchema(searchSchema),
},
executor: async (args, context) => {
const { query, limit } = args as z.infer<typeof searchSchema>;
// Access context information
console.log(`User ${context.userId} searching for: ${query}`);
// Perform search...
return { results: [/* ... */], total: 42 };
},
};
Tool with Abort Support
const longRunningTool: ToolDefinition = {
declaration: {
name: "processData",
description: "Process large dataset",
parametersJsonSchema: zodToJsonSchema(
z.object({
datasetId: z.string(),
})
),
},
executor: async (args, context) => {
// Check for abort signal
if (context.abortSignal?.aborted) {
throw new Error("Operation cancelled");
}
// Pass abort signal to async operations
const response = await fetch(`/api/process/${args.datasetId}`, {
signal: context.abortSignal,
});
return response.json();
},
};
Callbacks & Events
The SDK provides a comprehensive callback system for real-time updates during agent execution.
AgentCallbacks
interface AgentCallbacks {
onThinking?: (thinking: string, details?: ReasoningDetail[]) => void;
onMessage?: (
message: string,
reasoning?: {
text?: string;
details?: ReasoningDetail[];
tokenCount?: number;
}
) => void;
onMessageDelta?: (delta: string) => void;
onReasoningDelta?: (detail: ReasoningDetail) => void;
onToolCallStart?: (calls: ToolCallStartData[]) => void;
onToolResult?: (result: ToolResultData) => void;
onUsageUpdate?: (usage: UsageMetadata) => void;
onSummarizationStart?: (originalMessageCount: number) => void;
onSummarizationEnd?: (summary: string, tokensEstimate: number) => void;
onIterationStart?: (iteration: number, maxIterations: number) => void;
onRetry?: (attempt: number, maxAttempts: number, error: Error, delayMs: number) => void;
onError?: (error: {
code: string;
message: string;
recoverable: boolean;
}) => void;
onComplete?: (result: AgentResult) => void;
onAborted?: () => void;
}
⚠️ Security Note for Browser/Client-Side Usage
When using callbacks in browser environments, be careful not to expose sensitive information. Callbacks like onToolResult, onThinking, onSummarizationEnd, and onComplete may contain internal data, tool execution details, or conversation content that should not be logged to the browser console or sent to client-side analytics in production. The SDK does not restrict this by design, as server-side integrations may safely log such data. It is the consumer's responsibility to sanitize or filter callback data before exposing it in client-side contexts.
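One way to apply that advice is to redact known-sensitive keys from callback payloads before they reach client-side code. The `redact` helper below is a hedged sketch (a hypothetical utility, not part of the SDK) that recursively replaces configured keys:

```typescript
// Hypothetical helper (not part of the SDK): recursively replace the values
// of configured keys with a placeholder before logging callback payloads.
function redact<T>(value: T, keys: Set<string>): T {
  if (Array.isArray(value)) {
    return value.map((v) => redact(v, keys)) as T;
  }
  if (value && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = keys.has(k) ? "[REDACTED]" : redact(v, keys);
    }
    return out as T;
  }
  return value;
}

const sensitive = new Set(["result", "args"]);
const toolResult: Record<string, unknown> = {
  id: "call-1",
  functionName: "getWeather",
  result: { apiKey: "secret", temperature: 22 },
};
// Redacted copy is safe to forward to client-side logging/analytics
const safe = redact(toolResult, sensitive);
```

A callback like `onToolResult` could then log `redact(result, sensitive)` instead of the raw payload.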
Callback Examples
await agent.chat("Help me with my task", context, {
// Called when the model is "thinking" (for reasoning models)
onThinking: (thinking, details) => {
console.log("Model thinking:", thinking);
},
// Called when a final message is received
onMessage: (message, reasoning) => {
console.log("Assistant:", message);
if (reasoning?.text) {
console.log("Reasoning:", reasoning.text);
}
},
// Called for each chunk of streaming message content
onMessageDelta: (delta) => {
process.stdout.write(delta); // Real-time output
},
// Called for each chunk of streaming reasoning (for reasoning models)
onReasoningDelta: (detail) => {
if (detail.text) {
process.stdout.write(detail.text);
}
},
// Called before tool execution starts
onToolCallStart: (calls) => {
calls.forEach((call) => {
console.log(`Calling ${call.name} with:`, call.args);
});
},
// Called after each tool completes
onToolResult: (result) => {
if (result.error) {
console.error(`Tool ${result.functionName} failed:`, result.error);
} else {
console.log(`Tool ${result.functionName} returned:`, result.result);
}
},
// Called when token usage is updated
onUsageUpdate: (usage) => {
console.log(`Tokens used: ${usage.totalTokenCount}`);
},
// Called when summarization begins
onSummarizationStart: (messageCount) => {
console.log(`Summarizing ${messageCount} messages...`);
},
// Called when summarization completes
onSummarizationEnd: (summary, tokens) => {
console.log(`Summarized to ~${tokens} tokens`);
},
// Called at the start of each iteration
onIterationStart: (iteration, max) => {
console.log(`Iteration ${iteration}/${max}`);
},
// Called on errors
onError: (error) => {
console.error(`Error [${error.code}]:`, error.message);
if (!error.recoverable) {
// Handle fatal error
}
},
// Called when processing completes
onComplete: (result) => {
console.log(`Completed in ${result.iterations} iterations`);
},
// Called when request is aborted
onAborted: () => {
console.log("Request was cancelled");
},
});
Event Types
The SDK also exports event types and a createEvent helper for building event-driven systems:
import { createEvent } from "concevent-ai-agent-sdk";
import type { AgentEventType, AgentEvent } from "concevent-ai-agent-sdk";
// Create typed events
const messageEvent = createEvent("message", {
message: "Hello!",
reasoning: undefined,
});
const errorEvent = createEvent("error", {
code: "TOOL_EXECUTION_FAILED",
message: "Tool failed to execute",
recoverable: true,
});
// Event types available:
// 'thinking' | 'message' | 'tool_call_start' | 'tool_result' |
// 'usage_update' | 'summarization_start' | 'summarization_end' |
// 'iteration_start' | 'error' | 'complete' | 'aborted'
Types
ChatMessage
interface ChatMessage {
role: "user" | "assistant" | "system" | "tool-call" | "tool-call-results";
content: string;
timestamp: Date;
toolCalls?: ToolCall[];
toolCallId?: string;
reasoning?: string;
reasoning_details?: ReasoningDetail[];
reasoningTokenCount?: number;
}
UsageMetadata
interface UsageMetadata {
promptTokenCount?: number;
candidatesTokenCount?: number;
totalTokenCount?: number;
cachedContentTokenCount?: number;
reasoningTokenCount?: number;
}
ReasoningDetail
interface ReasoningDetail {
id?: string;
type: string;
text?: string;
data?: string;
format: string;
index: number;
}
ToolCallStartData & ToolResultData
interface ToolCallStartData {
id: string;
name: string;
args: Record<string, unknown>;
}
interface ToolResultData {
id: string;
functionName: string;
result: unknown;
error?: string;
}
Streaming
The SDK supports streaming responses by default, providing real-time updates as the model generates content.
Enabling/Disabling Streaming
Streaming is enabled by default. You can disable it in the agent configuration:
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: [],
stream: false, // Disable streaming (default: true)
});
Streaming Callbacks
When streaming is enabled, you can use delta callbacks to receive real-time updates:
onMessageDelta
Called whenever a new chunk of the message content is received:
await agent.chat("Tell me a story", context, {
onMessageDelta: (delta) => {
// Append each chunk to your UI in real-time
process.stdout.write(delta);
},
onMessage: (fullMessage) => {
// Called when the complete message is ready
console.log("\n\nFull message:", fullMessage);
},
});
onReasoningDelta
Called whenever a new chunk of model reasoning is received (for models that support reasoning):
await agent.chat("Solve this problem step by step", context, {
onReasoningDelta: (detail) => {
if (detail.type === "reasoning.text" && detail.text) {
// Stream the reasoning/thinking output
process.stdout.write(detail.text);
}
},
onThinking: (fullThinking, details) => {
// Called when reasoning is complete
console.log("\n\nFull reasoning:", fullThinking);
},
});
Complete Streaming Example
import { createAgent } from "concevent-ai-agent-sdk";
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: myTools,
stream: true, // Enabled by default
});
let messageBuffer = "";
let reasoningBuffer = "";
const result = await agent.chat(
"Explain quantum computing",
{ userId: "user-123", timezone: "UTC" },
{
// Real-time message chunks
onMessageDelta: (delta) => {
messageBuffer += delta;
updateUI(messageBuffer); // Update your UI with partial content
},
// Real-time reasoning chunks (for reasoning models)
onReasoningDelta: (detail) => {
if (detail.text) {
reasoningBuffer += detail.text;
updateReasoningUI(reasoningBuffer);
}
},
// Tool execution still works with streaming
onToolCallStart: (calls) => {
showToolIndicator(calls.map((c) => c.name));
},
onToolResult: (result) => {
hideToolIndicator(result.functionName);
},
// Final complete message
onMessage: (message, reasoning) => {
console.log("Complete message received");
},
}
);
Streaming Event Types
The SDK exports event types for building event-driven streaming systems:
import { createEvent } from "concevent-ai-agent-sdk";
import type {
MessageDeltaEventData,
ReasoningDeltaEventData,
} from "concevent-ai-agent-sdk";
// Create typed streaming events
const messageDeltaEvent = createEvent("message_delta", {
delta: "Hello, ",
});
const reasoningDeltaEvent = createEvent("reasoning_delta", {
detail: {
type: "reasoning.text",
text: "Let me think about this...",
format: "text",
index: 0,
},
});
Streaming vs Non-Streaming
| Feature | Streaming (stream: true) | Non-Streaming (stream: false) |
| ------------------- | --------------------------------------- | ------------------------------- |
| Message delivery | Real-time chunks via onMessageDelta | Complete message only |
| Reasoning output | Real-time via onReasoningDelta | Complete reasoning only |
| Perceived latency | Lower (immediate feedback) | Higher (wait for completion) |
| Tool calls | Fully supported | Fully supported |
| Token usage | Included in final chunk | Included in response |
| Default | ✅ Enabled | Must explicitly disable |
Advanced Usage
Parallel Tool Execution
Tools can be marked as safe for parallel execution using the parallel flag. When the model requests multiple tool calls, parallel-safe tools run concurrently while sequential tools run one at a time after the parallel batch completes.
import { createAgent } from "concevent-ai-agent-sdk";
import type { ToolDefinition } from "concevent-ai-agent-sdk";
const tools: ToolDefinition[] = [
{
declaration: {
name: "readFile",
description: "Read contents of a file",
parametersJsonSchema: { /* ... */ },
},
executor: async (args) => { /* read file */ },
parallel: true, // ✅ Safe for concurrent execution
},
{
declaration: {
name: "fetchWeather",
description: "Get weather for a city",
parametersJsonSchema: { /* ... */ },
},
executor: async (args) => { /* API call */ },
parallel: true, // ✅ Independent API calls are safe
},
{
declaration: {
name: "writeFile",
description: "Write contents to a file",
parametersJsonSchema: { /* ... */ },
},
executor: async (args) => { /* write file */ },
// parallel defaults to false — runs sequentially
},
];
const agent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools,
parallelToolExecution: {
maxConcurrency: 3, // Optional: limit concurrent tools (default: 5)
},
});
Execution Order:
When the model requests [readFile(a), readFile(b), writeFile(c), readFile(d)]:
- Parallel tools run concurrently: readFile(a), readFile(b), readFile(d)
- Sequential tools run in order: writeFile(c)
- All results are collected and returned to the model
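The batching described above can be sketched with a generic concurrency-limited runner. This is an illustrative model of the behavior, not the SDK's actual implementation:

```typescript
// Sketch: run parallel-safe tasks under a concurrency cap, then run
// sequential tasks one at a time, preserving result order within each group.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  maxConcurrency: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // claim the next task index
      results[i] = await tasks[i]();
    }
  }
  const workers = Array.from(
    { length: Math.min(maxConcurrency, tasks.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}

async function executeBatch(
  parallel: Array<() => Promise<unknown>>,
  sequential: Array<() => Promise<unknown>>,
  maxConcurrency = 5
): Promise<unknown[]> {
  const parallelResults = await runWithConcurrency(parallel, maxConcurrency);
  const sequentialResults: unknown[] = [];
  for (const task of sequential) {
    sequentialResults.push(await task()); // one at a time, in order
  }
  return [...parallelResults, ...sequentialResults];
}

const out = await executeBatch(
  [async () => "a", async () => "b"],
  [async () => "c"]
);
// out holds the parallel results first, then the sequential ones
```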
Conversation Summarization
When conversations become too long, the agent automatically summarizes them to stay within context limits.
const agent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: [],
summarization: {
enabled: true,
contextLimitTokens: 50000, // Summarize when approaching 50k tokens
prompt: `
Summarize this conversation, preserving:
1. Key user requests and questions
2. Important decisions made
3. Any data or numbers mentioned
4. Context needed to continue the conversation
`,
},
});
// Listen for summarization events
await agent.chat("Continue our discussion...", context, {
onSummarizationStart: (messageCount) => {
console.log(`Summarizing ${messageCount} messages to save context...`);
},
onSummarizationEnd: (summary, tokensEstimate) => {
console.log(
`Conversation summarized. Estimated tokens saved: ${tokensEstimate}`
);
},
});
Error Handling
The SDK provides typed errors and customizable error messages.
import { createAgent } from "concevent-ai-agent-sdk";
const agent = createAgent({
// ... config
errorMessages: {
apiKeyRequired: "Please provide an API key",
modelRequired: "Please specify a model",
emptyResponse: "No response received, please try again",
maxIterations: "Request too complex. Please simplify.",
toolExecutionFailed: "A tool failed to execute",
genericError: "Something went wrong",
},
});
// Handle errors via callbacks
await agent.chat("Do something", context, {
onError: (error) => {
switch (error.code) {
case "MAX_ITERATIONS":
console.log("Too many iterations, simplify the request");
break;
case "TOOL_EXECUTION_FAILED":
console.log("Tool failed:", error.message);
break;
case "REQUEST_ABORTED":
console.log("Request was cancelled");
break;
default:
console.log("Error:", error.message);
}
if (!error.recoverable) {
// Handle fatal error
}
},
});
Error Codes
| Code | Description | Recoverable |
| ----------------------- | --------------------------- | ----------- |
| API_KEY_REQUIRED | API key not provided | No |
| MODEL_REQUIRED | Model name not provided | No |
| EMPTY_RESPONSE | API returned empty response | Yes |
| REQUEST_ABORTED | Request was aborted | No |
| NO_RESPONSE | No response from API | No |
| NO_RESPONSE_MESSAGE | Response missing message | No |
| MAX_ITERATIONS | Max iterations reached | Yes |
| TOOL_EXECUTION_FAILED | Tool execution failed | Yes |
| TOOL_EXECUTION_TIMEOUT | Tool execution timed out | Yes |
| SUMMARIZATION_FAILED | Summarization failed | Yes |
| GENERIC_ERROR | Unknown error occurred | No |
Retry Configuration
The SDK includes automatic retry logic with exponential backoff and jitter for handling transient API failures (e.g., rate limits, network errors).
Configuration
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: [],
retry: {
maxAttempts: 5, // Total attempts including initial (default: 3)
baseDelayMs: 1000, // Initial delay before first retry (default: 1000)
maxDelayMs: 30000, // Maximum delay cap (default: 30000)
backoffMultiplier: 2, // Exponential multiplier (default: 2)
},
});
How It Works
- Exponential Backoff: Delay doubles with each retry attempt
- Jitter: Random ±25% variation prevents thundering herd problems
- Retry-After Support: Honors server-provided Retry-After headers
- Abort Integration: Respects abort signals during retry delays
Delay Calculation:
delay = min(baseDelayMs * (backoffMultiplier ^ attempt), maxDelayMs) * jitter
For example, with defaults (base=1000ms, multiplier=2, max=30000ms):
- Retry 1: ~1000ms (±250ms jitter)
- Retry 2: ~2000ms (±500ms jitter)
- Retry 3: ~4000ms (±1000ms jitter)
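The delay formula above can be written out directly. This is a sketch of the assumed behavior (exponential backoff with a cap and ±25% jitter), not the SDK's source:

```typescript
// Sketch of the documented delay formula:
// delay = min(base * multiplier^attempt, max) * jitter, jitter in [0.75, 1.25]
interface RetryTiming {
  baseDelayMs: number;
  maxDelayMs: number;
  backoffMultiplier: number;
}

function computeDelay(
  attempt: number, // 0 for the first retry, 1 for the second, ...
  { baseDelayMs, maxDelayMs, backoffMultiplier }: RetryTiming
): number {
  const exponential = baseDelayMs * Math.pow(backoffMultiplier, attempt);
  const capped = Math.min(exponential, maxDelayMs);
  const jitter = 1 + (Math.random() * 0.5 - 0.25); // random factor 0.75..1.25
  return Math.round(capped * jitter);
}

const timing = { baseDelayMs: 1000, maxDelayMs: 30000, backoffMultiplier: 2 };
const d0 = computeDelay(0, timing); // ~1000ms ±25%
const d1 = computeDelay(1, timing); // ~2000ms ±25%
```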
Retry Callback
Monitor retry attempts using the onRetry callback:
await agent.chat("Hello", context, {
onRetry: (attempt, maxAttempts, error, delayMs) => {
console.log(`Retry ${attempt}/${maxAttempts} after ${delayMs}ms`);
console.log(`Error: ${error.message}`);
},
});
Retryable Errors
The following HTTP status codes trigger automatic retries:
- 408 - Request Timeout
- 429 - Too Many Requests (Rate Limited)
- 500 - Internal Server Error
- 502 - Bad Gateway
- 503 - Service Unavailable
- 504 - Gateway Timeout
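If you need the same classification in your own error handling (for example, deciding whether to surface an error immediately), a minimal sketch mirroring the list above might look like this (a hypothetical helper, not exported by the SDK):

```typescript
// Hypothetical helper mirroring the SDK's documented retryable status codes.
const RETRYABLE_STATUS_CODES = new Set([408, 429, 500, 502, 503, 504]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUS_CODES.has(status);
}

const rateLimited = isRetryableStatus(429); // retryable
const notFound = isRetryableStatus(404); // not retryable
```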
Timeout Configuration
The SDK provides configurable timeouts to prevent hanging on long-running operations. There are two timeout settings:
- Tool Execution Timeout: Limits how long individual tool executions can run
- API Request Timeout: Limits how long OpenAI API requests can take
Default Values
| Setting | Default | Description |
|---------|---------|-------------|
| toolExecutionMs | 30 seconds | Timeout for each tool execution |
| apiRequestMs | 2 minutes | Timeout for OpenAI API requests |
Configuration
import { createAgent } from "concevent-ai-agent-sdk";
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: myTools,
timeout: {
toolExecutionMs: 60000, // 60 seconds per tool
apiRequestMs: 180000, // 3 minutes for API calls
},
});
Disabling Timeouts
Set a timeout value to 0 to disable it entirely (not recommended for tool execution), or omit the setting to use its default:
const agent = createAgent({
// ... config
timeout: {
toolExecutionMs: 0, // No tool timeout (not recommended)
apiRequestMs: 300000, // 5 minutes for API calls
},
});
Behavior
Tool Execution Timeout:
- When a tool execution exceeds the timeout, it returns an error result (does not throw)
- The LLM receives the timeout error and can decide how to proceed
- Other tools in the same batch are not affected
API Request Timeout:
- When an API request times out, it throws an APIConnectionTimeoutError
- The request is automatically retried according to your retry configuration
- Respects the Retry-After header if provided by the server
Error Handling
Tool timeout errors are returned as part of the tool result, allowing the LLM to recover:
// The LLM will receive an error like:
// "Tool execution timed out after 30000ms"
// You can provide a custom error message:
const agent = createAgent({
// ... config
errorMessages: {
toolExecutionTimeout: "The operation took too long. Please try a simpler request.",
},
});
Using with AbortSignal
Timeouts work alongside abort signals. If both are set, whichever triggers first will cancel the operation:
const abortController = new AbortController();
// Set a 10 second manual timeout
setTimeout(() => abortController.abort(), 10000);
await agent.chat("Process this data", {
userId: "user-123",
timezone: "UTC",
abortSignal: abortController.signal, // Manual abort at 10s
});
// Tool timeout at 30s (default) or API timeout at 2min (default)
// Whichever comes first will cancel the operation
Structured Output
The SDK supports structured JSON output with automatic Zod schema validation. This ensures type-safe responses from the AI model.
Response Format Types
| Type | Description | Validation |
|------|-------------|------------|
| text | Default free-form text response | None |
| json_object | Forces JSON response | JSON parsing only |
| json_schema | Forces JSON with schema validation | Zod schema validation |
Basic JSON Mode
Force the model to return valid JSON without schema validation:
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["Return responses as JSON objects."],
tools: [],
responseFormat: { type: "json_object" },
});
const result = await agent.chat("List 3 colors", context);
// result.message: '{"colors": ["red", "blue", "green"]}'
Schema-Validated JSON
Define a Zod schema for type-safe, validated responses:
import { z } from "zod";
// Define the expected response structure
const WeatherSchema = z.object({
location: z.string(),
temperature: z.number(),
unit: z.enum(["celsius", "fahrenheit"]),
conditions: z.array(z.string()),
});
type WeatherResponse = z.infer<typeof WeatherSchema>;
const agent = createAgent({
apiKey: process.env.OPENROUTER_API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You provide weather information."],
tools: [],
responseFormat: {
type: "json_schema",
schema: WeatherSchema,
name: "WeatherResponse", // Optional: schema name for API
strict: true, // Optional: strict mode (default: true)
},
});
const result = await agent.chat("Weather in Tokyo", context);
// Access typed, validated response
const weather = result.parsedResponse as WeatherResponse;
console.log(weather.temperature); // Typed as number
console.log(weather.conditions); // Typed as string[]
Validation Retries
When schema validation fails, the SDK can automatically retry with error feedback. This uses the same retry configuration as API failures:
const agent = createAgent({
// ... config
responseFormat: {
type: "json_schema",
schema: MySchema,
},
retry: { maxAttempts: 3 }, // 3 total attempts (2 validation retries)
});
When validation fails:
- The SDK captures the Zod validation error
- Sends the error details back to the model as feedback
- Model attempts to generate a corrected response
- Process repeats until validation passes or retries are exhausted
Custom Error Messages
Customize validation error messages:
const agent = createAgent({
// ... config
errorMessages: {
jsonParseError: "Failed to parse JSON response",
responseSchemaValidationFailed: "Response didn't match expected format",
},
});
AgentResult with Parsed Response
When using json_schema format, AgentResult includes the validated data:
interface AgentResult<T = unknown> {
message: string; // Raw JSON string
parsedResponse?: T; // Parsed & validated object (only with json_schema)
// ... other fields
}
Abort Requests
Cancel ongoing requests using the abort functionality.
const agent = createAgent(config);
// Start a chat
const chatPromise = agent.chat("Process this large dataset", context, {
onAborted: () => {
console.log("Request cancelled by user");
},
});
// Cancel after 5 seconds
setTimeout(() => {
agent.abort();
}, 5000);
try {
const result = await chatPromise;
} catch (error) {
if (error.name === "AbortError") {
console.log("Chat was aborted");
}
}
With Custom AbortSignal
const abortController = new AbortController();
// Pass abort signal in context
agent.chat("Hello", {
userId: "user-123",
timezone: "UTC",
abortSignal: abortController.signal,
});
// Cancel from external source
abortController.abort();
Serverless / Stateless Deployments
When deploying in serverless environments (e.g., AWS Lambda, Vercel, Cloudflare Workers) or stateless API routes (e.g., Next.js API routes), the agent instance is created fresh for each request. To maintain conversation continuity, the client must store and forward the conversation history with each request.
Pattern
- Client maintains conversationHistory state
- Client sends the history with each chat request
- Server creates a fresh agent, restores history via setHistory(), and processes the message
- Server returns the result, including conversationHistory
- Client updates its local history from the response
Server-Side Example (Next.js API Route)
import { createAgent } from "concevent-ai-agent-sdk";
import type { ChatMessage } from "concevent-ai-agent-sdk";
export async function POST(request: Request) {
const { message, conversationHistory = [] } = await request.json();
const agent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: myTools,
});
// Restore conversation history from the client
if (conversationHistory.length > 0) {
agent.setHistory(conversationHistory);
}
const result = await agent.chat(message, {
userId: "user-123",
timezone: "UTC",
});
// Return result - client should use result.conversationHistory for next request
return Response.json({
message: result.message,
conversationHistory: result.conversationHistory,
});
}
Client-Side Example
const [conversationHistory, setConversationHistory] = useState<ChatMessage[]>(
[]
);
async function sendMessage(message: string) {
const response = await fetch("/api/chat", {
method: "POST",
body: JSON.stringify({ message, conversationHistory }),
});
const result = await response.json();
// Update local history for the next request
setConversationHistory(result.conversationHistory);
return result.message;
}
Note: The SDK handles summarization automatically when context limits are approached. The summarized history is included in result.conversationHistory, so clients always receive the properly managed history state.
Middleware
Middleware allows you to intercept and modify the agent's execution flow at key points. Use middleware for logging, analytics, rate limiting, input sanitization, and output filtering.
Middleware Interface
import type { Middleware } from "concevent-ai-agent-sdk";
interface Middleware {
name: string;
beforeChat?: (ctx: BeforeChatContext) => Promise<string>;
afterChat?: (ctx: AfterChatContext) => Promise<AgentResult>;
beforeToolCall?: (ctx: BeforeToolCallContext) => Promise<ToolCall>;
afterToolCall?: (ctx: AfterToolCallContext) => Promise<FunctionCallResult>;
onError?: (ctx: ErrorContext) => Promise<void>;
}
Middleware Hooks
| Hook | When Called | Can Modify |
|------|-------------|------------|
| beforeChat | Before chat processing begins | User message |
| afterChat | After chat completes successfully | AgentResult |
| beforeToolCall | Before each tool executes | ToolCall args |
| afterToolCall | After each tool executes | FunctionCallResult |
| onError | When an error occurs | Observation only |
Creating Middleware
import { createAgent } from "concevent-ai-agent-sdk";
import type { Middleware } from "concevent-ai-agent-sdk";
// Logging middleware
const loggingMiddleware: Middleware = {
name: "logging",
beforeChat: async (ctx) => {
console.log(`[${new Date().toISOString()}] Chat started:`, ctx.message);
return ctx.message;
},
afterChat: async (ctx) => {
console.log(`[${new Date().toISOString()}] Chat completed:`, ctx.result.message);
return ctx.result;
},
beforeToolCall: async (ctx) => {
console.log(`[${new Date().toISOString()}] Tool call:`, ctx.toolCall.name);
return ctx.toolCall;
},
onError: async (ctx) => {
console.error(`[${new Date().toISOString()}] Error:`, ctx.error.code, ctx.error.message);
},
};
// Input sanitization middleware
const sanitizationMiddleware: Middleware = {
name: "sanitization",
beforeChat: async (ctx) => {
// Remove potentially harmful content from user input
const sanitized = ctx.message.replace(/<script[^>]*>.*?<\/script>/gi, "");
return sanitized;
},
};
// Rate limiting middleware
const rateLimitMiddleware: Middleware = {
name: "rate-limit",
beforeChat: async (ctx) => {
const userId = ctx.toolContext.userId;
if (await isRateLimited(userId)) {
throw new Error("Rate limit exceeded. Please try again later.");
}
return ctx.message;
},
};
// Output filtering middleware
const outputFilterMiddleware: Middleware = {
name: "output-filter",
afterChat: async (ctx) => {
// Remove sensitive data from responses
const filteredMessage = ctx.result.message.replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED]");
return { ...ctx.result, message: filteredMessage };
},
afterToolCall: async (ctx) => {
// Filter sensitive data from tool results
if (typeof ctx.result.result === "object" && ctx.result.result !== null) {
const filtered = JSON.parse(
JSON.stringify(ctx.result.result).replace(/"password":\s*"[^"]*"/g, '"password": "[REDACTED]"')
);
return { ...ctx.result, result: filtered };
}
return ctx.result;
},
};
Registering Middleware
Middleware is registered via the middleware property in AgentConfig. Middleware executes in registration order (first in array = first executed).
const agent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a helpful assistant."],
tools: myTools,
middleware: [
loggingMiddleware, // Executes first
sanitizationMiddleware, // Executes second
rateLimitMiddleware, // Executes third
outputFilterMiddleware, // Executes fourth
],
});
Middleware Context
Each middleware hook receives a context object with relevant information:
// BeforeChatContext & AfterChatContext
interface MiddlewareContext {
message: string; // The user's input message
history: ChatMessage[]; // Current conversation history
toolContext: ToolExecutorContext; // userId, timezone, abortSignal
}
// BeforeToolCallContext (extends MiddlewareContext)
interface BeforeToolCallContext extends MiddlewareContext {
toolCall: ToolCall; // The tool call to be executed
}
// AfterToolCallContext (extends BeforeToolCallContext)
interface AfterToolCallContext extends BeforeToolCallContext {
result: FunctionCallResult; // The tool execution result
}
// AfterChatContext (extends MiddlewareContext)
interface AfterChatContext extends MiddlewareContext {
result: AgentResult; // The chat result
}
// ErrorContext (extends MiddlewareContext)
interface ErrorContext extends MiddlewareContext {
error: AgentError; // The error that occurred
}
Error Handling in Middleware
- If a middleware hook throws an error, the error propagates and stops the chain
- onError hooks are called for observation only and do not affect the error flow
- Errors in onError hooks are logged but do not throw
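These rules can be sketched with a minimal chain runner. This is an illustrative sketch only, not the SDK's internals; `MiniMiddleware` and `runBeforeChat` are hypothetical names:

```typescript
// Sketch of the beforeChat chain: hooks run in registration order, a throw
// stops the chain, onError hooks observe the failure, and failures inside
// onError hooks themselves are swallowed.
type BeforeChatHook = (message: string) => Promise<string>;
type OnErrorHook = (error: Error) => Promise<void>;

interface MiniMiddleware {
  name: string;
  beforeChat?: BeforeChatHook;
  onError?: OnErrorHook;
}

async function runBeforeChat(
  middleware: MiniMiddleware[],
  message: string
): Promise<string> {
  let current = message;
  try {
    for (const mw of middleware) {
      // Each hook may transform the message before the next hook sees it
      if (mw.beforeChat) current = await mw.beforeChat(current);
    }
    return current;
  } catch (err) {
    const error = err instanceof Error ? err : new Error(String(err));
    // onError hooks observe only; their own failures do not throw
    for (const mw of middleware) {
      if (mw.onError) await mw.onError(error).catch(() => {});
    }
    throw error; // the original error still propagates to the caller
  }
}
```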
Multi-Agent Orchestration
The SDK supports multi-agent orchestration, allowing you to create specialized agents and have an orchestrator agent route tasks to them based on their capabilities.
Creating an Orchestrator
import { createAgent, createOrchestratorAgent } from "concevent-ai-agent-sdk";
import type { AgentDefinition } from "concevent-ai-agent-sdk";
// Create specialized sub-agents
const codeAgent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are an expert programmer. Write clean, efficient code."],
tools: codingTools,
});
const researchAgent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a research specialist. Find accurate information."],
tools: researchTools,
});
const dataAgent = createAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["You are a data analyst. Analyze data and provide insights."],
tools: analysisTools,
});
// Define sub-agents with descriptions
const subAgents: AgentDefinition[] = [
{
agent: codeAgent,
name: "code_expert",
description: "Handles coding tasks: writing, reviewing, and debugging code",
parallel: true, // Can run in parallel with other parallel agents
},
{
agent: researchAgent,
name: "researcher",
description: "Researches topics, finds documentation, and gathers information",
parallel: true,
},
{
agent: dataAgent,
name: "data_analyst",
description: "Analyzes data, creates visualizations, and provides statistical insights",
},
];
// Create the orchestrator
const orchestrator = createOrchestratorAgent({
apiKey: process.env.API_KEY!,
model: "anthropic/claude-3.5-sonnet",
systemPrompts: ["Additional instructions for the orchestrator..."],
subAgents,
});
Using the Orchestrator
The orchestrator automatically routes requests to appropriate sub-agents:
// The orchestrator decides which sub-agent(s) to use
const result = await orchestrator.chat(
"Research the best sorting algorithms and implement quicksort in TypeScript",
{ userId: "user-123", timezone: "UTC" },
{
// Track sub-agent delegation
onSubAgentStart: (data) => {
console.log(`Delegating to ${data.agentName}: ${data.message}`);
},
onSubAgentComplete: (data) => {
console.log(`${data.agentName} completed in ${data.result.iterations} iterations`);
},
// Standard callbacks also work
onToolCallStart: (calls) => {
console.log("Tool calls:", calls.map(c => c.name));
},
}
);
OrchestratorConfig
| Property | Type | Required | Default | Description |
|-----------------|---------------------|----------|-------------------|----------------------------------------------|
| apiKey | string | ✅ | - | API key for the orchestrator |
| model | string | ✅ | - | Model for the orchestrator |
| systemPrompts | string[] | ✅ | - | Additional prompts (orchestrator prompt auto-injected) |
| subAgents | AgentDefinition[] | ✅ | - | Array of sub-agent definitions |
| (other) | - | - | Same as AgentConfig | All other AgentConfig options except tools |
AgentDefinition
interface AgentDefinition {
agent: Agent; // The agent instance
name: string; // Unique name (used as tool name)
description: string; // Describes capabilities (helps routing)
parallel?: boolean; // Can run in parallel (default: false)
}
OrchestratorAgent Interface
interface OrchestratorAgent extends Agent {
getSubAgents(): AgentDefinition[]; // Returns registered sub-agents
}
Key Concepts
Stateless Delegation: Sub-agents have their history cleared before each delegation. The orchestrator must include all necessary context in the delegated message.
Parallel Execution: Sub-agents marked with parallel: true can be called concurrently when the orchestrator needs multiple specialists.
No Nested Orchestrators: Sub-agents cannot themselves be orchestrators. This prevents infinite delegation loops.
Automatic Routing: The orchestrator's LLM decides which sub-agent(s) to use based on their descriptions and the user's request.
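The parallel-execution concept can be sketched as follows. This is an illustrative sketch of how parallel-marked agents could be dispatched, not the SDK's actual scheduler; `MiniAgentDef` and `dispatch` are hypothetical names:

```typescript
// Sketch: run parallel-marked sub-agents concurrently via Promise.all,
// and the remaining sub-agents one at a time.
interface MiniAgentDef {
  name: string;
  parallel?: boolean;
  run: (task: string) => Promise<string>;
}

async function dispatch(
  defs: MiniAgentDef[],
  tasks: Map<string, string> // agent name -> delegated task
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  const parallelDefs = defs.filter((d) => d.parallel && tasks.has(d.name));
  const serialDefs = defs.filter((d) => !d.parallel && tasks.has(d.name));
  // Parallel-marked agents run concurrently
  await Promise.all(
    parallelDefs.map(async (d) => {
      results.set(d.name, await d.run(tasks.get(d.name)!));
    })
  );
  // Everything else runs sequentially
  for (const d of serialDefs) {
    results.set(d.name, await d.run(tasks.get(d.name)!));
  }
  return results;
}
```

Because delegation is stateless, each task string must carry all the context the sub-agent needs.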
Exports Summary
Main Entry (concevent-ai-agent-sdk)
// Functions
export { createAgent } from "./core/agent";
export { createOrchestratorAgent } from "./core/orchestrator";
export { createEvent } from "./types/events";
// Constants
export { ENCRYPTED_REASONING_MARKER } from "./core/openrouter-utils";
// Types
export type {
Agent,
ReasoningDetail,
ChatMessage,
FunctionDeclaration,
ToolExecutorContext,
ToolDefinition,
AgentConfig,
ParallelExecutionConfig,
RetryConfig,
TimeoutConfig,
UsageMetadata,
AgentResult,
ToolCallStartData,
ToolResultData,
AgentCallbacks,
AgentEventType,
ThinkingEventData,
UsageUpdateEventData,
RetryEventData,
ErrorEventData,
CompleteEventData,
MessageDeltaEventData,
ReasoningDeltaEventData,
AgentEvent,
ResponseFormatConfig,
// Middleware types
Middleware,
MiddlewareContext,
BeforeChatContext,
AfterChatContext,
BeforeToolCallContext,
AfterToolCallContext,
ErrorContext,
// Orchestrator types
AgentDefinition,
OrchestratorConfig,
OrchestratorAgent,
SubAgentStartData,
SubAgentCompleteData,
OrchestratorCallbacks,
} from "./types";
// Timeout constants
export {
DEFAULT_TOOL_EXECUTION_TIMEOUT_MS,
DEFAULT_API_REQUEST_TIMEOUT_MS,
} from "./types";
// Timeout error class
export { ToolExecutionTimeoutError } from "./core/tool-executor";
License
MIT
Contributing
Contributions are welcome! Please read our contributing guidelines before submitting a pull request.
