Illuma Agents
Enterprise-grade TypeScript library for building and orchestrating LLM-powered agents.
Built on LangChain and LangGraph, Illuma Agents provides multi-agent orchestration, real-time streaming, tool integration, prompt caching, extended thinking, and structured output — supporting 12+ LLM providers out of the box.
Features
- Multi-Agent Orchestration — Handoff, sequential, parallel (fan-out/fan-in), conditional, and hybrid agent flows
- 12+ LLM Providers — OpenAI, Anthropic, AWS Bedrock, Google Gemini, Vertex AI, Azure OpenAI, Mistral, DeepSeek, xAI, OpenRouter, Moonshot
- Streaming-First — Real-time token streaming with split-stream buffering and content aggregation
- Built-in Tools — Code execution (12+ languages), calculator, web search, browser automation, programmatic tool calling
- Prompt Caching — Anthropic and Bedrock cache control for reduced latency and cost
- Extended Thinking — Anthropic/Bedrock thinking blocks with proper tool-call sequencing
- Structured Output — JSON schema-constrained responses via tool calling, provider-native, or auto mode
- Dynamic Tool Discovery — BM25-ranked tool search for large tool registries (MCP servers)
- Context Management — Automatic message pruning, token counting, and context window optimization
- Observability — Langfuse + OpenTelemetry tracing
- Dual Module Output — ESM + CJS with full TypeScript declarations
Installation
```bash
npm install @illuma-ai/agents
```

Peer Dependencies
The library requires @langchain/core as a peer dependency. If not already installed:
```bash
npm install @langchain/core
```

Environment Variables
Set API keys for the providers you plan to use:
```bash
# LLM Providers (add the ones you need)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
GOOGLE_API_KEY=...
AZURE_OPENAI_API_KEY=...
DEEPSEEK_API_KEY=...
XAI_API_KEY=...
MISTRAL_API_KEY=...
OPENROUTER_API_KEY=...
# Code Executor (optional)
CODE_EXECUTOR_BASEURL=http://localhost:8088
CODE_EXECUTOR_API_KEY=your-api-key
# Observability (optional)
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com
```

Quick Start
Single Agent
```ts
import { HumanMessage } from '@langchain/core/messages';
import {
  Run,
  ChatModelStreamHandler,
  createContentAggregator,
  ToolEndHandler,
  ModelEndHandler,
  GraphEvents,
  Providers,
} from '@illuma-ai/agents';
import type * as t from '@illuma-ai/agents';

const { contentParts, aggregateContent } = createContentAggregator();

const run = await Run.create<t.IState>({
  runId: 'my-run-001',
  graphConfig: {
    type: 'standard',
    llmConfig: {
      provider: Providers.ANTHROPIC,
      model: 'claude-sonnet-4-20250514',
      apiKey: process.env.ANTHROPIC_API_KEY,
    },
    instructions: 'You are a helpful AI assistant.',
  },
  returnContent: true,
  customHandlers: {
    [GraphEvents.TOOL_END]: new ToolEndHandler(),
    [GraphEvents.CHAT_MODEL_END]: new ModelEndHandler(),
    [GraphEvents.CHAT_MODEL_STREAM]: new ChatModelStreamHandler(),
    [GraphEvents.ON_RUN_STEP]: {
      handle: (event: string, data: t.RunStep) => aggregateContent({ event, data }),
    },
    [GraphEvents.ON_RUN_STEP_DELTA]: {
      handle: (event: string, data: t.RunStepDeltaEvent) => aggregateContent({ event, data }),
    },
    [GraphEvents.ON_MESSAGE_DELTA]: {
      handle: (event: string, data: t.MessageDeltaEvent) => aggregateContent({ event, data }),
    },
  },
});

const result = await run.processStream(
  { messages: [new HumanMessage('What is the capital of France?')] },
  { version: 'v2', configurable: { user_id: 'user-123', thread_id: 'conv-1' } }
);

console.log('Response:', contentParts);
```

Multi-Agent with Handoffs
```ts
import { Run, Providers } from '@illuma-ai/agents';

const run = await Run.create({
  runId: 'multi-agent-001',
  graphConfig: {
    type: 'multi-agent',
    agents: [
      {
        agentId: 'flight_assistant',
        provider: Providers.ANTHROPIC,
        clientOptions: { modelName: 'claude-haiku-4-5' },
        instructions: 'You are a flight booking assistant.',
      },
      {
        agentId: 'hotel_assistant',
        provider: Providers.ANTHROPIC,
        clientOptions: { modelName: 'claude-haiku-4-5' },
        instructions: 'You are a hotel booking assistant.',
      },
    ],
    edges: [
      {
        from: 'flight_assistant',
        to: 'hotel_assistant',
        description: 'Transfer when user needs hotel help',
      },
      {
        from: 'hotel_assistant',
        to: 'flight_assistant',
        description: 'Transfer when user needs flight help',
      },
    ],
  },
  customHandlers: { /* ...event handlers... */ },
  returnContent: true,
});
```

Parallel Fan-out / Fan-in
```ts
const run = await Run.create({
  runId: 'parallel-001',
  graphConfig: {
    type: 'multi-agent',
    agents: [
      { agentId: 'coordinator', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Coordinate research tasks.' },
      { agentId: 'analyst_a', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Financial analysis.' },
      { agentId: 'analyst_b', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Technical analysis.' },
      { agentId: 'summarizer', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Synthesize all findings.' },
    ],
    edges: [
      { from: 'coordinator', to: ['analyst_a', 'analyst_b'], edgeType: 'direct' }, // Fan-out (parallel)
      { from: ['analyst_a', 'analyst_b'], to: 'summarizer', edgeType: 'direct' }, // Fan-in
    ],
  },
  customHandlers: { /* ... */ },
});
```

Using Tools
```ts
import { createCodeExecutionTool, Calculator, Providers, Run } from '@illuma-ai/agents';
import type * as t from '@illuma-ai/agents';

const run = await Run.create<t.IState>({
  runId: 'tools-001',
  graphConfig: {
    type: 'standard',
    llmConfig: { provider: Providers.OPENAI, model: 'gpt-4o' },
    instructions: 'You can execute code and do math.',
    tools: [createCodeExecutionTool(), new Calculator()],
  },
  customHandlers: { /* ... */ },
});
```

Extended Thinking
```ts
const run = await Run.create<t.IState>({
  runId: 'thinking-001',
  graphConfig: {
    type: 'standard',
    llmConfig: {
      provider: Providers.ANTHROPIC,
      model: 'claude-3-7-sonnet-latest',
      thinking: { type: 'enabled', budget_tokens: 5000 },
    },
    instructions: 'Think through problems carefully.',
  },
  customHandlers: {
    // ...standard handlers...
    [GraphEvents.ON_REASONING_DELTA]: {
      handle: (event: string, data: t.ReasoningDeltaEvent) => {
        // Receive thinking/reasoning tokens as they stream
      },
    },
  },
});
```

Structured Output
```ts
const run = await Run.create<t.IState>({
  runId: 'structured-001',
  graphConfig: {
    type: 'standard',
    agents: [{
      agentId: 'analyzer',
      provider: Providers.OPENAI,
      clientOptions: { model: 'gpt-4o' },
      instructions: 'Analyze sentiment.',
      structuredOutput: {
        schema: {
          type: 'object',
          properties: {
            sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] },
            confidence: { type: 'number' },
          },
          required: ['sentiment', 'confidence'],
        },
        mode: 'auto',
        strict: true,
      },
    }],
  },
  customHandlers: {
    [GraphEvents.ON_STRUCTURED_OUTPUT]: {
      handle: (_event: string, data: unknown) => console.log('Result:', data),
    },
  },
});
```

Providers
| Provider | Enum | Notes |
|----------|------|-------|
| OpenAI | Providers.OPENAI | GPT-4o, o1, o3 |
| Anthropic | Providers.ANTHROPIC | Claude 4, Sonnet, Haiku — thinking, caching, web search |
| AWS Bedrock | Providers.BEDROCK | Claude via Bedrock — caching, reasoning |
| Google Gemini | Providers.GOOGLE | Gemini Pro, Flash |
| Vertex AI | Providers.VERTEXAI | Google models via GCP |
| Azure OpenAI | Providers.AZURE | OpenAI models via Azure |
| Mistral | Providers.MISTRALAI | Large, Medium, Small |
| DeepSeek | Providers.DEEPSEEK | Reasoning models |
| xAI | Providers.XAI | Grok |
| OpenRouter | Providers.OPENROUTER | Multi-model routing |
| Moonshot | Providers.MOONSHOT | Moonshot AI |
Provider config examples:
```ts
// OpenAI
{ provider: Providers.OPENAI, clientOptions: { model: 'gpt-4o', apiKey: '...' } }

// Anthropic
{ provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-sonnet-4-20250514', apiKey: '...' } }

// AWS Bedrock
{ provider: Providers.BEDROCK, clientOptions: { model: 'us.anthropic.claude-sonnet-4-20250514-v1:0', region: 'us-east-1' } }
```

Multi-Agent Patterns
Edge Types
| Type | Behavior |
|------|----------|
| Handoff (default) | LLM decides when to transfer — auto-generates transfer_to_<agent> tools |
| Direct | Fixed routing — agents run in sequence or parallel |
Handoff (Dynamic)
```ts
{ from: 'triage', to: 'billing', description: 'Transfer for billing questions' }
// → triage agent gets a transfer_to_billing tool it can call
```

Sequential Pipeline
```ts
{ from: 'drafter', to: 'reviewer', edgeType: 'direct', prompt: 'Review the draft above.' }
```

Fan-out / Fan-in (Parallel)
```ts
{ from: 'coordinator', to: ['analyst_a', 'analyst_b'], edgeType: 'direct' } // parallel
{ from: ['analyst_a', 'analyst_b'], to: 'summarizer', edgeType: 'direct', prompt: '{results}' } // join
```

Use {results} in prompts to inject collected output from parallel agents.
Conditional Routing
```ts
{
  from: 'router',
  to: ['fast_model', 'powerful_model'],
  condition: (state) =>
    (state.messages.at(-1)?.content.length ?? 0) > 500 ? 'powerful_model' : 'fast_model',
}
```

Hybrid
Agents with both handoff and direct edges use exclusive routing: if a handoff fires, only the handoff destination runs; otherwise the direct edges execute, as sketched below.
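For instance, a triage agent could always feed a logging agent through a direct edge while keeping the option to hand off to an escalation agent. A minimal sketch reusing the edge shapes from the examples above; the agent IDs here are hypothetical:

```ts
edges: [
  // Handoff edge (default): triage gets a transfer_to_escalation tool the LLM may call
  { from: 'triage', to: 'escalation', description: 'Transfer complex or sensitive cases' },
  // Direct edge: logger runs after triage, unless the handoff above fired
  { from: 'triage', to: 'logger', edgeType: 'direct' },
],
```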
Built-in Tools
| Tool | Import | Description |
|------|--------|-------------|
| Code Executor | createCodeExecutionTool() | Sandboxed execution in 12+ languages (Python, JS, TS, C, C++, Java, PHP, Rust, Go, D, Fortran, R) |
| Calculator | new Calculator() | Math expressions via mathjs |
| Browser Tools | createBrowserTools() | 12 browser actions (navigate, click, type, screenshot, etc.) |
| Tool Search | createToolSearchTool() | BM25-ranked discovery for large tool registries |
| Programmatic Tool Calling | createProgrammaticToolCallingTool() | LLM writes Python to call tools as async functions |
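All of these plug into graphConfig.tools the same way as the Using Tools example above. A minimal sketch combining several built-ins; it assumes createBrowserTools() returns an array of tools and that these factories need no required arguments:

```ts
import { createCodeExecutionTool, Calculator, createBrowserTools } from '@illuma-ai/agents';

// Assumption: createBrowserTools() returns the 12 browser-action tools as an array.
const tools = [
  createCodeExecutionTool(), // sandboxed multi-language execution
  new Calculator(),          // mathjs expressions
  ...createBrowserTools(),
];
// Pass into graphConfig as: { ..., tools }
```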
Event System
Register handlers to receive real-time streaming events:
```ts
const customHandlers = {
  [GraphEvents.CHAT_MODEL_STREAM]: new ChatModelStreamHandler(), // Token-by-token streaming
  [GraphEvents.CHAT_MODEL_END]: new ModelEndHandler(usageArray), // Usage metadata
  [GraphEvents.TOOL_END]: new ToolEndHandler(),                  // Tool results
  [GraphEvents.ON_RUN_STEP]: { handle: (e, data) => ... },       // New run step
  [GraphEvents.ON_RUN_STEP_DELTA]: { handle: (e, data) => ... }, // Step delta (tool args)
  [GraphEvents.ON_MESSAGE_DELTA]: { handle: (e, data) => ... },  // Text delta
  [GraphEvents.ON_REASONING_DELTA]: { handle: (e, data) => ... },   // Thinking delta
  [GraphEvents.ON_AGENT_UPDATE]: { handle: (e, data) => ... },      // Agent switch
  [GraphEvents.ON_STRUCTURED_OUTPUT]: { handle: (e, data) => ... }, // Structured JSON
};
```

Use createContentAggregator() to automatically collect deltas into a complete response:
```ts
const { contentParts, aggregateContent } = createContentAggregator();
// Pass aggregateContent into your handlers
// After streaming, contentParts has the full structured response
```

Prompt Caching
Caching is automatic for Anthropic and Bedrock providers:
- System messages get cache control markers
- Last 2 conversation messages get cache breakpoints
- Use dynamicContext for per-request data (keeps the system prompt cacheable):
```ts
{
  instructions: 'You are a helpful assistant.', // Cached
  dynamicContext: `Current time: ${new Date().toISOString()}`, // Not cached
}
```

Structured Output Modes
| Mode | Description |
|------|-------------|
| 'auto' | Auto-selects best strategy per provider (default) |
| 'tool' | Uses tool calling — universal compatibility |
| 'provider' | Provider-native JSON mode |
| 'native' | Constrained decoding — guaranteed schema compliance |
```ts
structuredOutput: {
  schema: { /* JSON Schema */ },
  mode: 'auto',
  strict: true,
  handleErrors: true, // Auto-retry on validation failure
  maxRetries: 2,
}
```

Observability
Set these env vars to enable automatic Langfuse tracing:
```bash
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com
```

Each trace captures userId, sessionId, messageId, and full LangChain callback spans.
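Tracing is automatic once the keys are set, so no code changes are required. The Quick Start's configurable fields are the likely source of the trace identifiers; whether user_id and thread_id map onto Langfuse's userId and sessionId is an assumption here, not documented behavior:

```ts
// Identifiers from the Quick Start example; the mapping onto
// Langfuse trace fields (userId/sessionId) is assumed.
await run.processStream(
  { messages: [new HumanMessage('Hello!')] },
  { version: 'v2', configurable: { user_id: 'user-123', thread_id: 'conv-1' } }
);
```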
Title Generation
Generate conversation titles from the first exchange:
```ts
const { title, language } = await run.generateTitle({
  provider: Providers.ANTHROPIC,
  inputText: userMessage,
  contentParts,
  titleMethod: TitleMethod.COMPLETION,
  clientOptions: { model: 'claude-3-5-haiku-latest' },
});
```

API Exports
```ts
// Core
export { Run } from '@illuma-ai/agents';
export { ChatModelStreamHandler, createContentAggregator, SplitStreamHandler } from '@illuma-ai/agents';
export { HandlerRegistry, ModelEndHandler, ToolEndHandler } from '@illuma-ai/agents';

// Tools
export { createCodeExecutionTool, Calculator, createBrowserTools } from '@illuma-ai/agents';
export { createToolSearchTool, createProgrammaticToolCallingTool } from '@illuma-ai/agents';

// Graphs
export { StandardGraph, MultiAgentGraph } from '@illuma-ai/agents';

// LLM
export { getChatModelClass, llmProviders } from '@illuma-ai/agents';

// Enums & Constants
export { GraphEvents, Providers, ContentTypes, StepTypes, TitleMethod, Constants } from '@illuma-ai/agents';

// Types
export type { IState, RunConfig, AgentInputs, GraphEdge, StructuredOutputConfig } from '@illuma-ai/agents';
```

License
UNLICENSED — Proprietary software. All rights reserved @TeamIlluma.
