@alldaytech/multiverse-sdk

v0.0.29

Published

2 months ago

Simulation testing SDK for AI agents


jvgaeta93

ai agents testing simulation llm

Simulation testing for AI agents. Test your agent against realistic scenarios with simulated users and automated quality evaluation — no real APIs needed.

Install

npm install @alldaytech/multiverse-sdk zod

For LangChain agents:

npm install @alldaytech/multiverse-sdk zod @langchain/core @langchain/anthropic

Quick Start

Autonomous agent (default — pipelines, document processing, background jobs):

import { multiverse } from '@alldaytech/multiverse-sdk';
import { z } from 'zod';

multiverse.configure({
  baseUrl: process.env.MULTIVERSE_URL,
  apiKey: process.env.MULTIVERSE_API_KEY,
});

const test = multiverse.describe({
  name: 'submission-intake-agent',
  task: 'Process insurance submission: extract documents, validate coverage, produce summary',
  agent: runAgent, // Your agent function (see "Agent function signature" under multiverse.describe)
  // Mirror the real-world event your agent receives (optional but recommended)
  triggerSchema: z.object({
    submissionId: z.string(),
    priority: z.enum(['standard', 'urgent']),
  }),
});

const scenarios = await test.generateScenarios({ count: 5 });
const results = await test.run({
  scenarios,
  success: (world) => world.getCollection('intake_summaries').size > 0,
});

console.log(`${results.passRate}% pass`);

Conversational agent (opt in when there's a human user):

const test = multiverse.describe({
  name: 'flight-booking-agent',
  task: 'Help the user book a flight',
  agent: runAgent,
  conversational: true,  // Enables simulated user — mutually exclusive with triggerSchema
  variables: z.object({
    expectedBookings: z.number().describe('Total bookings to create'),
  }),
});

const scenarios = await test.generateScenarios({ count: 5 });

const results = await test.run({
  scenarios,
  success: (world, trace, scenario) =>
    world.getCollection('bookings').size === scenario.variables.expectedBookings,
});

console.log(`${results.passRate}% pass`);

API

`multiverse.configure(config)`

Initialize the SDK. Call once at startup.

multiverse.configure({
  baseUrl: process.env.MULTIVERSE_URL,
  apiKey: process.env.MULTIVERSE_API_KEY,
});

`multiverse.tool(def)`

Define a tool whose calls can be simulated during tests.

const searchFlights = multiverse.tool({
  name: 'searchFlights',
  description: 'Search for available flights',
  input: z.object({
    from: z.string().describe('Departure airport code'),
    to: z.string().describe('Arrival airport code'),
    date: z.string().describe('Departure date (YYYY-MM-DD)'),
  }),
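  output: SearchResultSchema,                          // Zod schema defined elsewhere; effects below reads flights[].id
  execute: async (input) => realSearchFlights(input),  // Your real implementation, used outside tests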
  effects: (output, world) =>
    output.flights.map((f) => ({
      operation: 'create' as const,
      collection: 'flights',
      id: f.id,
      data: f,
    })),
});

Returns a callable function (input) => Promise<output>. During tests, calls are intercepted and simulated. Outside tests, execute is called directly.
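
The effects callback returns plain descriptors of state changes. As a rough illustration of how such descriptors could be folded into collections, here is a minimal sketch. The `Effect` type, the `World` map, and `applyEffects` are hypothetical names chosen for this example and are not the SDK's internals; only the `create` operation shown in this README is modeled.

```typescript
// Hypothetical shapes, modeled on the `effects` examples in this README.
type Effect = {
  operation: 'create';
  collection: string;
  id: string;
  data: unknown;
};

// A "world" here is just named collections of records keyed by id (assumption).
type World = Map<string, Map<string, unknown>>;

function applyEffects(world: World, effects: Effect[]): World {
  for (const effect of effects) {
    // Create the collection lazily the first time it is referenced.
    let collection = world.get(effect.collection);
    if (!collection) {
      collection = new Map<string, unknown>();
      world.set(effect.collection, collection);
    }
    collection.set(effect.id, effect.data);
  }
  return world;
}

// Same mapping as the searchFlights example: one `create` per returned flight.
const output = { flights: [{ id: 'F1', price: 120 }, { id: 'F2', price: 95 }] };
const effects: Effect[] = output.flights.map((f) => ({
  operation: 'create' as const,
  collection: 'flights',
  id: f.id,
  data: f,
}));

const world = applyEffects(new Map(), effects);
const flightCount = world.get('flights')?.size ?? 0;
console.log(flightCount);
```

Keeping effects as data rather than direct mutations means a success check only has to inspect the resulting collections, whichever tool produced them.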

| Option | Type | Description |
|--------|------|-------------|
| name | string | Tool name |
| description | string | Tool description |
| input | ZodSchema | Input schema |
| output | ZodSchema | Output schema |
| execute | (input) => Promise<output> | Real implementation |
| effects | (output, world) => Effect[] | Declare state changes from output |

`wrap(tool, config)`

Wrap a LangChain tool for simulation. Extracts name, description, and schema automatically.

import { wrap } from '@alldaytech/multiverse-sdk';

const myTool = wrap(langchainTool, {
  output: OutputSchema,
  effects: (output, world) => [
    { operation: 'create', collection: 'orders', id: output.id, data: output },
  ],
});

| Option | Type | Description |
|--------|------|-------------|
| output | ZodSchema | Output schema for responses |
| effects | (output, world) => Effect[] | Declare state changes from output |
| input | ZodSchema | Input schema (auto-extracted from LangChain tools) |
| name | string | Tool name (auto-extracted from LangChain tools) |
| description | string | Tool description (auto-extracted from LangChain tools) |

`multiverse.describe(options)`

Define a test. Returns an object with generateScenarios() and run() methods.

// Autonomous agent (default)
const test = multiverse.describe({
  name: 'my-agent',
  task: 'Process the task',
  agent: runAgent,
  // triggerSchema mirrors the real-world event schema that triggers your agent (optional)
  triggerSchema: z.object({
    jobId: z.string(),
    type: z.enum(['ingest', 'process', 'export']),
  }),
});

// Conversational agent (opt in)
const test = multiverse.describe({
  name: 'my-chatbot',
  task: 'Help users complete the task',
  agent: runAgent,
  conversational: true,   // Enables simulated user — mutually exclusive with triggerSchema
  variables: z.object({   // Optional: typed variables for assertions in success()
    expectedBookings: z.number(),
  }),
});

| Option | Type | Description |
|--------|------|-------------|
| name | string | Agent name for grouping in the dashboard |
| task | string | What the agent is being tested on |
| agent | AgentFn | Agent function to test |
| conversational | boolean | Enable simulated user (chatbots, assistants). Mutually exclusive with triggerSchema |
| triggerSchema | ZodSchema | Constrains the generated event payload (autonomous agents only) |
| variables | ZodSchema | Typed scenario variables accessible in success() via scenario.variables |

conversational and triggerSchema are mutually exclusive at the type level: passing both to multiverse.describe() is a TypeScript compile error.
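
One common way to express this kind of exclusivity in TypeScript is a union whose branches mark the forbidden property as `never`. The sketch below shows that pattern only; `BaseOptions`, `AutonomousOptions`, `ConversationalOptions`, `DescribeOptions`, and `modeOf` are hypothetical names, and the SDK's real type definitions may differ.

```typescript
// Hypothetical option types; the SDK's real definitions may differ.
type BaseOptions = { name: string; task: string };

type AutonomousOptions = BaseOptions & {
  conversational?: false;
  triggerSchema?: unknown; // stands in for a Zod schema
};

type ConversationalOptions = BaseOptions & {
  conversational: true;
  triggerSchema?: never; // forbids the property in this branch
};

type DescribeOptions = AutonomousOptions | ConversationalOptions;

function modeOf(options: DescribeOptions): 'autonomous' | 'conversational' {
  return options.conversational === true ? 'conversational' : 'autonomous';
}

const chatbot: DescribeOptions = { name: 'my-chatbot', task: 'Help users', conversational: true };
const pipeline: DescribeOptions = { name: 'my-agent', task: 'Process', triggerSchema: {} };

// Uncommenting the next line fails to compile: `conversational: true` together
// with `triggerSchema` matches neither branch of the union.
// const invalid: DescribeOptions = { name: 'x', task: 'y', conversational: true, triggerSchema: {} };

console.log(modeOf(chatbot), modeOf(pipeline));
```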

Agent function signature:

type AgentFn = (ctx: {
  userMessage: string;  // Generated event payload (autonomous) or latest user message (conversational)
  runId: string;        // Stable across turns; use for memory/thread scoping
}) => Promise<unknown>;
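
Because runId stays the same across turns, it can serve as the key for per-run memory. The stub below is a minimal sketch of that idea and contains no model or tool calls; the `AgentContext` alias, the `threads` map, and the shape of the returned object are assumptions made for illustration.

```typescript
// Context shape copied from the signature above.
type AgentContext = { userMessage: string; runId: string };

// Hypothetical in-memory thread store, keyed by runId.
const threads = new Map<string, string[]>();

async function runAgent(ctx: AgentContext): Promise<unknown> {
  // runId is stable across turns, so it can key per-run history.
  const history = threads.get(ctx.runId) ?? [];
  history.push(ctx.userMessage);
  threads.set(ctx.runId, history);

  // A real agent would call a model and tools here; this stub only reports
  // how many turns it has seen for the run.
  return { reply: `Received turn ${history.length}`, turns: history.length };
}
```

Separate runs get separate histories, so concurrent trials do not leak state into each other.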

`test.generateScenarios(options)`

Generate test scenarios upfront for inspection or reuse.

const scenarios = await test.generateScenarios({ count: 10 });

Variables are typed on multiverse.describe() via the variables option, not here.

`test.saveScenarios(scenarios)`

Save generated scenarios for reuse across runs.

await test.saveScenarios(scenarios);

Appends to any previously saved scenarios. Each scenario has a stable id (nanoid).

`test.getScenarios()`

Load previously saved scenarios.

const { scenarios, scenarioCount } = await test.getScenarios();

`test.clearScenarios()`

Remove all saved scenarios.

await test.clearScenarios();
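
The scenario methods above can be combined into a "generate once, reuse afterwards" workflow. The helper below is a sketch under assumptions: `ScenarioSource` is a structural stand-in for the test object, `Scenario` is reduced to its documented id field, the return type of saveScenarios() is not documented here and is treated as opaque, and `loadOrGenerate` is a hypothetical name.

```typescript
// Structural stand-ins for the test object's scenario methods, with call
// shapes taken from this README. `Scenario` keeps only the documented id.
type Scenario = { id: string };

interface ScenarioSource {
  generateScenarios(options: { count: number }): Promise<Scenario[]>;
  saveScenarios(scenarios: Scenario[]): Promise<unknown>; // return type is an assumption
  getScenarios(): Promise<{ scenarios: Scenario[]; scenarioCount: number }>;
}

// Hypothetical helper: reuse saved scenarios and generate only the shortfall.
async function loadOrGenerate(test: ScenarioSource, count: number): Promise<Scenario[]> {
  const saved = await test.getScenarios();
  if (saved.scenarioCount >= count) {
    return saved.scenarios.slice(0, count);
  }
  const fresh = await test.generateScenarios({ count: count - saved.scenarioCount });
  await test.saveScenarios(fresh); // appends to the previously saved set
  return [...saved.scenarios, ...fresh];
}
```

Reusing a saved set keeps pass rates comparable between runs, since every run is scored against the same scenarios.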

`test.run(options)`

Run tests against the agent.

const results = await test.run({
  scenarios,              // From generateScenarios()
  success: (world, trace, scenario) => {
    return world.getCollection('bookings').size === scenario.variables.expectedBookings;
  },
  trialsPerScenario: 4,
  maxTurns: 20,           // Max turns per run (conversational agents)
  qualityThreshold: 70,   // Default: 70
  criteria: [             // Custom quality criteria (default: communication, error_handling, efficiency, accuracy)
    { name: 'politeness', description: 'Responds politely at all times' },
  ],
  skipReport: true,       // Skip LLM report generation
  concurrency: 8,
  onProgress: (p) => console.log(`${p.completed}/${p.total}`),
  ci: {
    postToPR: true,       // Install the Multiverse GitHub App to enable
    printReport: true,
  },
});

Results:

interface TestResults {
  passRate: number;
  runs: RunResult[];
  duration: number;
  url?: string;
  markdown?: string;
}
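
A results object like this can be used to gate a CI job. The following sketch assumes passRate is a percentage between 0 and 100, as the Quick Start log line suggests; `TestResultsLike` and `gate` are hypothetical names, and RunResult is left opaque because its fields are not documented here.

```typescript
// Subset of TestResults from this README; RunResult is left opaque.
interface TestResultsLike {
  passRate: number;
  runs: unknown[];
  duration: number;
  url?: string;
}

// Hypothetical helper: summarize a run and decide whether CI should fail.
// Assumes passRate is a percentage (0-100).
function gate(results: TestResultsLike, minPassRate: number): { ok: boolean; summary: string } {
  const ok = results.passRate >= minPassRate;
  const link = results.url ? ` (${results.url})` : '';
  const summary = `${results.passRate}% pass across ${results.runs.length} runs${link}`;
  return { ok, summary };
}

const verdict = gate({ passRate: 80, runs: new Array(20).fill(null), duration: 0 }, 90);
console.log(verdict.summary);
// In CI, a failing gate could set a non-zero exit code, e.g. process.exitCode = 1.
```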

LangChain Integration

wrap() works with any LangChain tool. It extracts name, description, and schema automatically:

import { ChatAnthropic } from '@langchain/anthropic';
import { tool } from '@langchain/core/tools';
// createReactAgent ships in a separate package: npm install @langchain/langgraph
import { createReactAgent } from '@langchain/langgraph/prebuilt';
import { z } from 'zod';
import { multiverse, wrap } from '@alldaytech/multiverse-sdk';

// Your LangChain tools
const searchFlightsTool = tool(
  async ({ from, to, date }) => { /* real implementation */ },
  {
    name: 'searchFlights',
    description: 'Search for available flights',
    schema: z.object({
      from: z.string().describe('Departure airport code'),
      to: z.string().describe('Arrival airport code'),
      date: z.string().describe('Departure date (YYYY-MM-DD)'),
    }),
  }
);

// Wrap for simulation
const searchFlights = wrap(searchFlightsTool, {
  output: SearchResultSchema,
  effects: (output, world) =>
    output.flights.map((f) => ({
      operation: 'create' as const,
      collection: 'flights',
      id: f.id,
      data: f,
    })),
});

// Use wrapped tools directly in your agent.
// bookFlight is assumed to be a second tool wrapped the same way (not shown).
const agent = createReactAgent({
  llm: new ChatAnthropic({ model: 'claude-sonnet-4-20250514' }),
  tools: [searchFlights, bookFlight],
});

Wrapped tools are drop-in replacements — they preserve the original tool's type, name, and schema.

License

MIT
