
@illuma-ai/agents

v1.0.90 · 912 downloads · Illuma AI Agents Library

Readme

Illuma Agents

Enterprise-grade TypeScript library for building and orchestrating LLM-powered agents.

Built on LangChain and LangGraph, Illuma Agents provides multi-agent orchestration, real-time streaming, tool integration, prompt caching, extended thinking, and structured output — supporting 12+ LLM providers out of the box.


Features

  • Multi-Agent Orchestration — Handoff, sequential, parallel (fan-out/fan-in), conditional, and hybrid agent flows
  • 12+ LLM Providers — OpenAI, Anthropic, AWS Bedrock, Google Gemini, Vertex AI, Azure OpenAI, Mistral, DeepSeek, xAI, OpenRouter, Moonshot
  • Streaming-First — Real-time token streaming with split-stream buffering and content aggregation
  • Built-in Tools — Code execution (12+ languages), calculator, web search, browser automation, programmatic tool calling
  • Prompt Caching — Anthropic and Bedrock cache control for reduced latency and cost
  • Extended Thinking — Anthropic/Bedrock thinking blocks with proper tool-call sequencing
  • Structured Output — JSON schema-constrained responses via tool calling, provider-native, or auto mode
  • Dynamic Tool Discovery — BM25-ranked tool search for large tool registries (MCP servers)
  • Context Management — Automatic message pruning, token counting, and context window optimization
  • Observability — Langfuse + OpenTelemetry tracing
  • Dual Module Output — ESM + CJS with full TypeScript declarations

Installation

npm install @illuma-ai/agents

Peer Dependencies

The library requires @langchain/core as a peer dependency. If not already installed:

npm install @langchain/core

Environment Variables

Set API keys for the providers you plan to use:

# LLM Providers (add the ones you need)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
GOOGLE_API_KEY=...
AZURE_OPENAI_API_KEY=...
DEEPSEEK_API_KEY=...
XAI_API_KEY=...
MISTRAL_API_KEY=...
OPENROUTER_API_KEY=...

# Code Executor (optional)
CODE_EXECUTOR_BASEURL=http://localhost:8088
CODE_EXECUTOR_API_KEY=your-api-key

# Observability (optional)
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com

Quick Start

Single Agent

import { HumanMessage } from '@langchain/core/messages';
import {
  Run,
  ChatModelStreamHandler,
  createContentAggregator,
  ToolEndHandler,
  ModelEndHandler,
  GraphEvents,
  Providers,
} from '@illuma-ai/agents';
import type * as t from '@illuma-ai/agents';

const { contentParts, aggregateContent } = createContentAggregator();

const run = await Run.create<t.IState>({
  runId: 'my-run-001',
  graphConfig: {
    type: 'standard',
    llmConfig: {
      provider: Providers.ANTHROPIC,
      model: 'claude-sonnet-4-20250514',
      apiKey: process.env.ANTHROPIC_API_KEY,
    },
    instructions: 'You are a helpful AI assistant.',
  },
  returnContent: true,
  customHandlers: {
    [GraphEvents.TOOL_END]: new ToolEndHandler(),
    [GraphEvents.CHAT_MODEL_END]: new ModelEndHandler(),
    [GraphEvents.CHAT_MODEL_STREAM]: new ChatModelStreamHandler(),
    [GraphEvents.ON_RUN_STEP]: {
      handle: (event: string, data: t.RunStep) => aggregateContent({ event, data }),
    },
    [GraphEvents.ON_RUN_STEP_DELTA]: {
      handle: (event: string, data: t.RunStepDeltaEvent) => aggregateContent({ event, data }),
    },
    [GraphEvents.ON_MESSAGE_DELTA]: {
      handle: (event: string, data: t.MessageDeltaEvent) => aggregateContent({ event, data }),
    },
  },
});

const result = await run.processStream(
  { messages: [new HumanMessage('What is the capital of France?')] },
  { version: 'v2', configurable: { user_id: 'user-123', thread_id: 'conv-1' } }
);

console.log('Response:', contentParts);

Multi-Agent with Handoffs

import { Run, Providers } from '@illuma-ai/agents';
import type * as t from '@illuma-ai/agents';

const run = await Run.create({
  runId: 'multi-agent-001',
  graphConfig: {
    type: 'multi-agent',
    agents: [
      {
        agentId: 'flight_assistant',
        provider: Providers.ANTHROPIC,
        clientOptions: { modelName: 'claude-haiku-4-5' },
        instructions: 'You are a flight booking assistant.',
      },
      {
        agentId: 'hotel_assistant',
        provider: Providers.ANTHROPIC,
        clientOptions: { modelName: 'claude-haiku-4-5' },
        instructions: 'You are a hotel booking assistant.',
      },
    ],
    edges: [
      {
        from: 'flight_assistant',
        to: 'hotel_assistant',
        description: 'Transfer when user needs hotel help',
      },
      {
        from: 'hotel_assistant',
        to: 'flight_assistant',
        description: 'Transfer when user needs flight help',
      },
    ],
  },
  customHandlers: { /* ...event handlers... */ },
  returnContent: true,
});

Parallel Fan-out / Fan-in

const run = await Run.create({
  runId: 'parallel-001',
  graphConfig: {
    type: 'multi-agent',
    agents: [
      { agentId: 'coordinator', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Coordinate research tasks.' },
      { agentId: 'analyst_a', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Financial analysis.' },
      { agentId: 'analyst_b', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Technical analysis.' },
      { agentId: 'summarizer', provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-haiku-4-5' }, instructions: 'Synthesize all findings.' },
    ],
    edges: [
      { from: 'coordinator', to: ['analyst_a', 'analyst_b'], edgeType: 'direct' },   // Fan-out (parallel)
      { from: ['analyst_a', 'analyst_b'], to: 'summarizer', edgeType: 'direct' },     // Fan-in
    ],
  },
  customHandlers: { /* ... */ },
});

Using Tools

import { createCodeExecutionTool, Calculator } from '@illuma-ai/agents';

const run = await Run.create<t.IState>({
  runId: 'tools-001',
  graphConfig: {
    type: 'standard',
    llmConfig: { provider: Providers.OPENAI, model: 'gpt-4o' },
    instructions: 'You can execute code and do math.',
    tools: [createCodeExecutionTool(), new Calculator()],
  },
  customHandlers: { /* ... */ },
});

Extended Thinking

const run = await Run.create<t.IState>({
  runId: 'thinking-001',
  graphConfig: {
    type: 'standard',
    llmConfig: {
      provider: Providers.ANTHROPIC,
      model: 'claude-3-7-sonnet-latest',
      thinking: { type: 'enabled', budget_tokens: 5000 },
    },
    instructions: 'Think through problems carefully.',
  },
  customHandlers: {
    // ...standard handlers...
    [GraphEvents.ON_REASONING_DELTA]: {
      handle: (event: string, data: t.ReasoningDeltaEvent) => {
        // Receive thinking/reasoning tokens as they stream
      },
    },
  },
});

Structured Output

const run = await Run.create<t.IState>({
  runId: 'structured-001',
  graphConfig: {
    type: 'standard',
    agents: [{
      agentId: 'analyzer',
      provider: Providers.OPENAI,
      clientOptions: { model: 'gpt-4o' },
      instructions: 'Analyze sentiment.',
      structuredOutput: {
        schema: {
          type: 'object',
          properties: {
            sentiment: { type: 'string', enum: ['positive', 'negative', 'neutral'] },
            confidence: { type: 'number' },
          },
          required: ['sentiment', 'confidence'],
        },
        mode: 'auto',
        strict: true,
      },
    }],
  },
  customHandlers: {
    [GraphEvents.ON_STRUCTURED_OUTPUT]: {
      handle: (_event: string, data: unknown) => console.log('Result:', data),
    },
  },
});

Providers

| Provider | Enum | Notes |
|----------|------|-------|
| OpenAI | Providers.OPENAI | GPT-4o, o1, o3 |
| Anthropic | Providers.ANTHROPIC | Claude 4, Sonnet, Haiku — thinking, caching, web search |
| AWS Bedrock | Providers.BEDROCK | Claude via Bedrock — caching, reasoning |
| Google Gemini | Providers.GOOGLE | Gemini Pro, Flash |
| Vertex AI | Providers.VERTEXAI | Google models via GCP |
| Azure OpenAI | Providers.AZURE | OpenAI models via Azure |
| Mistral | Providers.MISTRALAI | Large, Medium, Small |
| DeepSeek | Providers.DEEPSEEK | Reasoning models |
| xAI | Providers.XAI | Grok |
| OpenRouter | Providers.OPENROUTER | Multi-model routing |
| Moonshot | Providers.MOONSHOT | Moonshot AI |

Provider config examples:

// OpenAI
{ provider: Providers.OPENAI, clientOptions: { model: 'gpt-4o', apiKey: '...' } }

// Anthropic
{ provider: Providers.ANTHROPIC, clientOptions: { modelName: 'claude-sonnet-4-20250514', apiKey: '...' } }

// AWS Bedrock
{ provider: Providers.BEDROCK, clientOptions: { model: 'us.anthropic.claude-sonnet-4-20250514-v1:0', region: 'us-east-1' } }

Multi-Agent Patterns

Edge Types

| Type | Behavior |
|------|----------|
| Handoff (default) | LLM decides when to transfer — auto-generates transfer_to_<agent> tools |
| Direct | Fixed routing — agents run in sequence or parallel |

Handoff (Dynamic)

{ from: 'triage', to: 'billing', description: 'Transfer for billing questions' }
// → triage agent gets a transfer_to_billing tool it can call

Sequential Pipeline

{ from: 'drafter', to: 'reviewer', edgeType: 'direct', prompt: 'Review the draft above.' }

Fan-out / Fan-in (Parallel)

{ from: 'coordinator', to: ['analyst_a', 'analyst_b'], edgeType: 'direct' }          // parallel
{ from: ['analyst_a', 'analyst_b'], to: 'summarizer', edgeType: 'direct', prompt: '{results}' }  // join

Use {results} in prompts to inject collected output from parallel agents.
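The `{results}` substitution can be pictured as a simple template replacement. The sketch below is illustrative only, not the library's implementation — the function name `injectResults` and the double-newline join format are invented for the example:

```typescript
// Illustrative sketch of {results} substitution semantics: the collected
// outputs of the parallel agents replace the placeholder in the join prompt.
function injectResults(prompt: string, results: string[]): string {
  return prompt.replace('{results}', results.join('\n\n'));
}

const joined = injectResults('Summarize the findings:\n{results}', [
  'analyst_a: financial analysis complete',
  'analyst_b: technical analysis complete',
]);
// `joined` now contains both analysts' outputs where {results} was
```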

Conditional Routing

{
  from: 'router',
  to: ['fast_model', 'powerful_model'],
  condition: (state) => state.messages.at(-1)?.content.length > 500 ? 'powerful_model' : 'fast_model',
}

Hybrid

Agents with both handoff and direct edges use exclusive routing: if a handoff fires, only the handoff destination runs; otherwise direct edges execute.
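The exclusive-routing rule can be sketched as a small decision function. This is not the library's code — `Edge`, `nextAgents`, and `firedHandoff` are names invented here to illustrate the semantics described above:

```typescript
// Illustrative sketch of exclusive routing for hybrid agents:
// if a handoff fired, only the handoff destination runs;
// otherwise every direct edge executes.
type Edge = { to: string; kind: 'handoff' | 'direct' };

function nextAgents(edges: Edge[], firedHandoff?: string): string[] {
  if (firedHandoff) return [firedHandoff]; // handoff wins exclusively
  return edges.filter((e) => e.kind === 'direct').map((e) => e.to);
}

const edges: Edge[] = [
  { to: 'billing', kind: 'handoff' },
  { to: 'reviewer', kind: 'direct' },
];
```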


Built-in Tools

| Tool | Import | Description |
|------|--------|-------------|
| Code Executor | createCodeExecutionTool() | Sandboxed execution in 12+ languages (Python, JS, TS, C, C++, Java, PHP, Rust, Go, D, Fortran, R) |
| Calculator | new Calculator() | Math expressions via mathjs |
| Browser Tools | createBrowserTools() | 12 browser actions (navigate, click, type, screenshot, etc.) |
| Tool Search | createToolSearchTool() | BM25-ranked discovery for large tool registries |
| Programmatic Tool Calling | createProgrammaticToolCallingTool() | LLM writes Python to call tools as async functions |
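Several built-in tools can be combined on one graph, following the Using Tools pattern above. A sketch — note that spreading createBrowserTools() assumes it returns an array of tools, which you should verify against your installed version:

```typescript
// Sketch: registering multiple built-in tools together.
// Assumption: createBrowserTools() returns an array (it provides 12 actions).
import {
  createCodeExecutionTool,
  Calculator,
  createBrowserTools,
} from '@illuma-ai/agents';

const tools = [
  createCodeExecutionTool(),
  new Calculator(),
  ...createBrowserTools(),
];
// Pass `tools` into graphConfig.tools as in the Using Tools example.
```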


Event System

Register handlers to receive real-time streaming events:

const customHandlers = {
  [GraphEvents.CHAT_MODEL_STREAM]: new ChatModelStreamHandler(),   // Token-by-token streaming
  [GraphEvents.CHAT_MODEL_END]:    new ModelEndHandler(usageArray), // Usage metadata
  [GraphEvents.TOOL_END]:          new ToolEndHandler(),            // Tool results
  [GraphEvents.ON_RUN_STEP]:       { handle: (e, data) => ... },   // New run step
  [GraphEvents.ON_RUN_STEP_DELTA]: { handle: (e, data) => ... },   // Step delta (tool args)
  [GraphEvents.ON_MESSAGE_DELTA]:  { handle: (e, data) => ... },   // Text delta
  [GraphEvents.ON_REASONING_DELTA]:{ handle: (e, data) => ... },   // Thinking delta
  [GraphEvents.ON_AGENT_UPDATE]:   { handle: (e, data) => ... },   // Agent switch
  [GraphEvents.ON_STRUCTURED_OUTPUT]: { handle: (e, data) => ... },// Structured JSON
};

Use createContentAggregator() to automatically collect deltas into a complete response:

const { contentParts, aggregateContent } = createContentAggregator();
// Pass aggregateContent into your handlers
// After streaming, contentParts has the full structured response

Prompt Caching

Caching is automatic for Anthropic and Bedrock providers:

  • System messages get cache control markers
  • Last 2 conversation messages get cache breakpoints
  • Use dynamicContext for per-request data (keeps system prompt cacheable):
{
  instructions: 'You are a helpful assistant.',           // Cached
  dynamicContext: `Current time: ${new Date().toISOString()}`,  // Not cached
}

Structured Output Modes

| Mode | Description |
|------|-------------|
| 'auto' | Auto-selects best strategy per provider (default) |
| 'tool' | Uses tool calling — universal compatibility |
| 'provider' | Provider-native JSON mode |
| 'native' | Constrained decoding — guaranteed schema compliance |

structuredOutput: {
  schema: { /* JSON Schema */ },
  mode: 'auto',
  strict: true,
  handleErrors: true,  // Auto-retry on validation failure
  maxRetries: 2,
}

Observability

Set these env vars to enable automatic Langfuse tracing:

LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com

Each trace captures userId, sessionId, messageId, and full LangChain callback spans.


Title Generation

Generate conversation titles from the first exchange:

const { title, language } = await run.generateTitle({
  provider: Providers.ANTHROPIC,
  inputText: userMessage,
  contentParts,
  titleMethod: TitleMethod.COMPLETION,
  clientOptions: { model: 'claude-3-5-haiku-latest' },
});

API Exports

// Core
export { Run } from '@illuma-ai/agents';
export { ChatModelStreamHandler, createContentAggregator, SplitStreamHandler } from '@illuma-ai/agents';
export { HandlerRegistry, ModelEndHandler, ToolEndHandler } from '@illuma-ai/agents';

// Tools
export { createCodeExecutionTool, Calculator, createBrowserTools } from '@illuma-ai/agents';
export { createToolSearchTool, createProgrammaticToolCallingTool } from '@illuma-ai/agents';

// Graphs
export { StandardGraph, MultiAgentGraph } from '@illuma-ai/agents';

// LLM
export { getChatModelClass, llmProviders } from '@illuma-ai/agents';

// Enums & Constants
export { GraphEvents, Providers, ContentTypes, StepTypes, TitleMethod, Constants } from '@illuma-ai/agents';

// Types
export type { IState, RunConfig, AgentInputs, GraphEdge, StructuredOutputConfig } from '@illuma-ai/agents';

License

UNLICENSED — Proprietary software. All rights reserved @TeamIlluma.