@alldaytech/multiverse-sdk
v0.0.29
Simulation testing SDK for AI agents
Simulation testing for AI agents. Test your agent against realistic scenarios with simulated users and automated quality evaluation — no real APIs needed.
Install
npm install @alldaytech/multiverse-sdk zod
For LangChain agents:
npm install @alldaytech/multiverse-sdk zod @langchain/core @langchain/anthropic
Quick Start
Autonomous agent (default — pipelines, document processing, background jobs):
import { multiverse } from '@alldaytech/multiverse-sdk';
import { z } from 'zod';
multiverse.configure({
baseUrl: process.env.MULTIVERSE_URL,
apiKey: process.env.MULTIVERSE_API_KEY,
});
const test = multiverse.describe({
name: 'submission-intake-agent',
task: 'Process insurance submission: extract documents, validate coverage, produce summary',
agent: runAgent,
// Mirror the real-world event your agent receives (optional but recommended)
triggerSchema: z.object({
submissionId: z.string(),
priority: z.enum(['standard', 'urgent']),
}),
});
const scenarios = await test.generateScenarios({ count: 5 });
const results = await test.run({
scenarios,
success: (world) => world.getCollection('intake_summaries').size > 0,
});
console.log(`${results.passRate}% pass`);
Conversational agent (opt in when there's a human user):
const test = multiverse.describe({
name: 'flight-booking-agent',
task: 'Help the user book a flight',
agent: runAgent,
conversational: true, // Enables simulated user — mutually exclusive with triggerSchema
variables: z.object({
expectedBookings: z.number().describe('Total bookings to create'),
}),
});
const scenarios = await test.generateScenarios({ count: 5 });
const results = await test.run({
scenarios,
success: (world, trace, scenario) =>
world.getCollection('bookings').size === scenario.variables.expectedBookings,
});
console.log(`${results.passRate}% pass`);
API
multiverse.configure(config)
Initialize the SDK. Call once at startup.
multiverse.configure({
baseUrl: process.env.MULTIVERSE_URL,
apiKey: process.env.MULTIVERSE_API_KEY,
});
multiverse.tool(def)
Register a tool for simulation. Works with any plain function — no framework required.
const searchFlights = multiverse.tool({
name: 'searchFlights',
description: 'Search for available flights',
input: z.object({
from: z.string().describe('Departure airport code'),
to: z.string().describe('Arrival airport code'),
date: z.string().describe('Departure date (YYYY-MM-DD)'),
}),
output: SearchResultSchema,
execute: async (input) => realSearchFlights(input),
effects: (output, world) =>
output.flights.map((f) => ({
operation: 'create' as const,
collection: 'flights',
id: f.id,
data: f,
})),
});
Returns a callable function (input) => Promise<output>. During tests, calls are intercepted and simulated. Outside tests, execute is called directly.
| Option | Type | Description |
|--------|------|-------------|
| name | string | Tool name |
| description | string | Tool description |
| input | ZodSchema | Input schema |
| output | ZodSchema | Output schema |
| execute | (input) => Promise<output> | Real implementation |
| effects | (output, world) => Effect[] | Declare state changes from output |
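The "intercepted during tests, executed directly otherwise" behaviour can be pictured with a small stand-alone sketch. This is a conceptual model only, not the SDK's implementation: `makeTool`, `activeSimulator`, and `getWeather` are hypothetical names invented for illustration, and the module-level switch merely stands in for "a test run is active".

```typescript
// Conceptual sketch only: NOT how the SDK is implemented.
type ToolFn<I, O> = (input: I) => Promise<O>;

interface ToolDef<I, O> {
  name: string;
  execute: ToolFn<I, O>;
}

// Hypothetical switch standing in for "a test run is active".
let activeSimulator: ((toolName: string, input: unknown) => unknown) | null = null;

function makeTool<I, O>(def: ToolDef<I, O>): ToolFn<I, O> {
  return async (input: I): Promise<O> => {
    if (activeSimulator) {
      // Inside a test: the call is intercepted and never reaches execute.
      return activeSimulator(def.name, input) as O;
    }
    // Outside a test: fall through to the real implementation.
    return def.execute(input);
  };
}

// Records every time the real implementation runs.
const calls: string[] = [];

const getWeather = makeTool({
  name: 'getWeather',
  execute: async (input: { city: string }) => {
    calls.push(`real:${input.city}`);
    return { tempC: 21 };
  },
});

async function demo(): Promise<{ real: { tempC: number }; simulated: { tempC: number } }> {
  const real = await getWeather({ city: 'Oslo' }); // runs execute
  activeSimulator = () => ({ tempC: -5 });
  const simulated = await getWeather({ city: 'Oslo' }); // intercepted
  activeSimulator = null;
  return { real, simulated };
}
```

The point of the pattern is that agent code calls the same function in both modes, so nothing in the agent has to change between production and a test run.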
wrap(tool, config)
Wrap a LangChain tool for simulation. Extracts name, description, and schema automatically.
import { wrap } from '@alldaytech/multiverse-sdk';
const myTool = wrap(langchainTool, {
output: OutputSchema,
effects: (output, world) => [
{ operation: 'create', collection: 'orders', id: output.id, data: output },
],
});
| Option | Type | Description |
|--------|------|-------------|
| output | ZodSchema | Output schema for responses |
| effects | (output, world) => Effect[] | Declare state changes from output |
| input | ZodSchema | Input schema (auto-extracted from LangChain tools) |
| name | string | Tool name (auto-extracted from LangChain tools) |
| description | string | Tool description (auto-extracted from LangChain tools) |
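To see how declared effects relate to later checks such as `world.getCollection('orders').size`, the sketch below applies effect objects to a plain in-memory store. It assumes the effect shape used in the examples above (only `'create'` appears there, so other operations are left out); `CreateEffect`, `orderEffects`, and `applyEffects` are hypothetical helpers, and the `Map`-based store is an illustration rather than the SDK's world model.

```typescript
// Local stand-ins for illustration; the SDK ships its own types.
interface Order {
  id: string;
  sku: string;
  quantity: number;
}

// Mirrors the effect objects shown in this README ('create' only).
interface CreateEffect<T> {
  operation: 'create';
  collection: string;
  id: string;
  data: T;
}

// Pure mapping from a tool's output to the state changes it implies.
function orderEffects(output: Order): CreateEffect<Order>[] {
  return [{ operation: 'create', collection: 'orders', id: output.id, data: output }];
}

// Hypothetical in-memory world: collection name -> (record id -> record).
type Collections = Map<string, Map<string, unknown>>;

function applyEffects(world: Collections, effects: CreateEffect<unknown>[]): void {
  for (const effect of effects) {
    const collection = world.get(effect.collection) ?? new Map<string, unknown>();
    collection.set(effect.id, effect.data);
    world.set(effect.collection, collection);
  }
}

const world: Collections = new Map();
applyEffects(world, orderEffects({ id: 'ord-1', sku: 'A-100', quantity: 2 }));
applyEffects(world, orderEffects({ id: 'ord-2', sku: 'B-200', quantity: 1 }));

const orderCount = world.get('orders')?.size ?? 0;
```

Keeping the effects mapping as a pure function like `orderEffects` makes it easy to unit test before passing it as the `effects` option.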
multiverse.describe(options)
Define a test. Returns an object with generateScenarios() and run() methods.
// Autonomous agent (default)
const test = multiverse.describe({
name: 'my-agent',
task: 'Process the task',
agent: runAgent,
// triggerSchema mirrors the real-world event schema that triggers your agent (optional)
triggerSchema: z.object({
jobId: z.string(),
type: z.enum(['ingest', 'process', 'export']),
}),
});
// Conversational agent (opt in)
const test = multiverse.describe({
name: 'my-chatbot',
task: 'Help users complete the task',
agent: runAgent,
conversational: true, // Enables simulated user — mutually exclusive with triggerSchema
variables: z.object({ // Optional: typed variables for assertions in success()
expectedBookings: z.number(),
}),
});
| Option | Type | Description |
|--------|------|-------------|
| name | string | Agent name for grouping in the dashboard |
| task | string | What the agent is being tested on |
| agent | AgentFn | Agent function to test |
| conversational | boolean | Enable simulated user (chatbots, assistants). Mutually exclusive with triggerSchema |
| triggerSchema | ZodSchema | Constrains the generated event payload (autonomous agents only) |
| variables | ZodSchema | Typed scenario variables accessible in success() via scenario.variables |
conversational and triggerSchema are mutually exclusive: the SDK's TypeScript types reject an options object that sets both.
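One common way to express this kind of exclusivity in TypeScript is a union of two option shapes, where the conversational shape types `triggerSchema` as `never`. The sketch below shows that technique under assumed names (`BaseOptions`, `AutonomousOptions`, `ConversationalOptions`, `mode`); the SDK's actual type definitions may be written differently.

```typescript
// Hypothetical option types; the SDK's real definitions may differ.
interface BaseOptions {
  name: string;
  task: string;
}

interface AutonomousOptions extends BaseOptions {
  conversational?: false;
  triggerSchema?: unknown; // stands in for a Zod schema
}

interface ConversationalOptions extends BaseOptions {
  conversational: true;
  triggerSchema?: never; // any triggerSchema value is a type error here
}

type DescribeOptions = AutonomousOptions | ConversationalOptions;

function mode(options: DescribeOptions): 'conversational' | 'autonomous' {
  return options.conversational === true ? 'conversational' : 'autonomous';
}

const chat = mode({ name: 'my-chatbot', task: 'Help users complete the task', conversational: true });
const job = mode({ name: 'my-agent', task: 'Process the task', triggerSchema: {} });

// @ts-expect-error conversational: true cannot be combined with triggerSchema
mode({ name: 'invalid', task: 'Both set', conversational: true, triggerSchema: {} });
```

The `@ts-expect-error` line documents the rejected combination: the compiler flags the call, so the mistake surfaces at build time rather than during a test run.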
Agent function signature:
async function runAgent(ctx: {
userMessage: string; // Generated event payload (autonomous) or latest user message (conversational)
runId: string; // Stable across turns, use for memory/thread scoping
}): Promise<unknown>
test.generateScenarios(options)
Generate test scenarios upfront for inspection or reuse.
const scenarios = await test.generateScenarios({ count: 10 });
Variables are typed on multiverse.describe() via the variables option, not here.
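Because the variable types come from `describe()`, a success check can be written as an ordinary typed function and exercised on its own. The sketch below assumes only what the examples in this README use (`world.getCollection(name).size` and `scenario.variables`); `WorldLike`, `ScenarioLike`, `BookingVars`, and `fakeWorld` are hypothetical stand-ins, not SDK exports.

```typescript
// Structural stand-ins for the arguments passed to success(); assumptions, not SDK types.
interface WorldLike {
  getCollection(name: string): { size: number };
}

interface ScenarioLike<V> {
  variables: V;
}

// Matches the shape declared via the variables option in the examples above.
interface BookingVars {
  expectedBookings: number;
}

// Same check as the README examples, extracted so it can be tested in isolation.
function bookingsMatch(
  world: WorldLike,
  _trace: unknown,
  scenario: ScenarioLike<BookingVars>,
): boolean {
  return world.getCollection('bookings').size === scenario.variables.expectedBookings;
}

// Minimal fake world backed by Sets of record ids.
function fakeWorld(collections: Record<string, string[]>): WorldLike {
  return {
    getCollection: (name) => new Set(collections[name] ?? []),
  };
}

const passes = bookingsMatch(
  fakeWorld({ bookings: ['b1', 'b2'] }),
  null,
  { variables: { expectedBookings: 2 } },
);
const fails = bookingsMatch(
  fakeWorld({ bookings: ['b1'] }),
  null,
  { variables: { expectedBookings: 2 } },
);
```

A function like `bookingsMatch` can then be passed as the `success` option, assuming the SDK's world and scenario objects are structurally compatible with these stand-ins.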
test.saveScenarios(scenarios)
Save generated scenarios for reuse across runs.
await test.saveScenarios(scenarios);
Appends to any previously saved scenarios. Each scenario has a stable id (nanoid).
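Since saving appends, and this section does not say whether duplicates are filtered, it may be worth dropping scenarios whose id is already stored before saving again. A minimal sketch, assuming only that each scenario carries a string `id`; `ScenarioRef` and `onlyNew` are hypothetical helpers, not part of the SDK.

```typescript
// Only the id matters for this helper; real scenarios carry more fields.
interface ScenarioRef {
  id: string;
}

// Returns the candidates whose id is not already present in `saved`.
function onlyNew<T extends ScenarioRef>(saved: ScenarioRef[], candidates: T[]): T[] {
  const seen = new Set(saved.map((s) => s.id));
  return candidates.filter((candidate) => {
    if (seen.has(candidate.id)) return false;
    seen.add(candidate.id); // also drops repeats within `candidates`
    return true;
  });
}

const alreadySaved = [{ id: 'a1' }, { id: 'b2' }];
const generated = [
  { id: 'b2', title: 'duplicate of a saved scenario' },
  { id: 'c3', title: 'new scenario' },
  { id: 'c3', title: 'repeat within the same batch' },
];

const toSave = onlyNew(alreadySaved, generated);
```

In practice the `saved` argument could come from `test.getScenarios()` and the filtered result could be passed to `test.saveScenarios()`.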
test.getScenarios()
Load previously saved scenarios.
const { scenarios, scenarioCount } = await test.getScenarios();
test.clearScenarios()
Remove all saved scenarios.
await test.clearScenarios();
test.run(options)
Run tests against the agent.
const results = await test.run({
scenarios, // From generateScenarios()
success: (world, trace, scenario) => {
return world.getCollection('bookings').size === scenario.variables.expectedBookings;
},
trialsPerScenario: 4,
maxTurns: 20, // Max turns per run (conversational agents)
qualityThreshold: 70, // Default: 70
criteria: [ // Custom quality criteria (default: communication, error_handling, efficiency, accuracy)
{ name: 'politeness', description: 'Responds politely at all times' },
],
skipReport: true, // Skip LLM report generation
concurrency: 8,
onProgress: (p) => console.log(`${p.completed}/${p.total}`),
ci: {
postToPR: true, // Install the Multiverse GitHub App to enable
printReport: true,
},
});
Results:
interface TestResults {
passRate: number;
runs: RunResult[];
duration: number;
url?: string;
markdown?: string;
}
LangChain Integration
wrap() works with any LangChain tool. It extracts name, description, and schema automatically:
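import { ChatAnthropic } from '@langchain/anthropic';
import { tool } from '@langchain/core/tools';
// createReactAgent comes from @langchain/langgraph, which the install command above does not include
import { createReactAgent } from '@langchain/langgraph/prebuilt';
import { z } from 'zod';
import { multiverse, wrap } from '@alldaytech/multiverse-sdk';
// Your LangChain tools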
const searchFlightsTool = tool(
async ({ from, to, date }) => { /* real implementation */ },
{
name: 'searchFlights',
description: 'Search for available flights',
schema: z.object({
from: z.string().describe('Departure airport code'),
to: z.string().describe('Arrival airport code'),
date: z.string().describe('Departure date (YYYY-MM-DD)'),
}),
}
);
// Wrap for simulation
const searchFlights = wrap(searchFlightsTool, {
output: SearchResultSchema,
effects: (output, world) =>
output.flights.map((f) => ({
operation: 'create' as const,
collection: 'flights',
id: f.id,
data: f,
})),
});
// Use wrapped tools directly in your agent
const agent = createReactAgent({
llm: new ChatAnthropic({ model: 'claude-sonnet-4-20250514' }),
tools: [searchFlights, bookFlight], // bookFlight: a second tool wrapped the same way (not shown)
});
Wrapped tools are drop-in replacements: they preserve the original tool's type, name, and schema.
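To hand an agent like this to `multiverse.describe()`, it needs to be exposed as an agent function that takes `{ userMessage, runId }`. The sketch below shows one possible adapter. `InvokableAgent` is a simplified, assumed stand-in for an agent with an `invoke()` method (the real LangGraph types are richer), `toAgentFn` and `stubAgent` are hypothetical, and the stub simply echoes its input so the example runs without any model or network access.

```typescript
// Context shape taken from the agent function signature in this README.
interface AgentCtx {
  userMessage: string;
  runId: string;
}

// Simplified stand-in for an agent exposing invoke(); an assumption for illustration.
interface InvokableAgent {
  invoke(
    input: { messages: { role: string; content: string }[] },
    config: { configurable: { thread_id: string } },
  ): Promise<unknown>;
}

// Adapts an invoke()-style agent to the (ctx) => Promise<unknown> shape.
function toAgentFn(agent: InvokableAgent): (ctx: AgentCtx) => Promise<unknown> {
  return (ctx) =>
    agent.invoke(
      { messages: [{ role: 'user', content: ctx.userMessage }] },
      // runId is stable across turns, so it is reused as the thread id.
      { configurable: { thread_id: ctx.runId } },
    );
}

// Stub agent: records the thread ids it sees and echoes the message text.
const seenThreads: string[] = [];
const stubAgent: InvokableAgent = {
  async invoke(input, config) {
    seenThreads.push(config.configurable.thread_id);
    return `echo: ${input.messages.map((m) => m.content).join(' ')}`;
  },
};

const runAgent = toAgentFn(stubAgent);
```

With a real LangGraph agent, a thread id typically only affects memory when the agent is configured with a checkpointer, so treat the thread scoping here as something to verify against your own setup.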
License
MIT
