agentrails

v1.1.0

Published

4 months ago

Safeguard your AI agents - keep them grounded and on the rails

employersai

ai agent testing e2e llm safety guardrails validation grounding agent-testing

AgentRails

Safeguard your AI agents - keep them grounded and on the rails with automated testing and LLM-based validation.

Quick Start

npm install agentrails

Note: For the best experience, we recommend using tsx instead of ts-node for running TypeScript files. tsx provides better module resolution and compatibility with different TypeScript project setups.

npm install -g tsx  # Install tsx globally
# or
npx tsx your-test-file.ts  # Use tsx directly

import { AgentRails, validateConfig } from "agentrails";
import { myAgent } from "./src/agent";

const config = validateConfig({
  llm: { provider: "openai", apiKey: process.env.OPENAI_API_KEY },
  agent: myAgent,
  rails: [
    {
      suite: "My Tests",
      rails: [
        {
          name: "Test 1",
          input: "Hello",
          expectedBehavior: "Should respond politely",
        },
      ],
    },
  ],
});

const results = await AgentRails.runAll(config);
console.log(results);

How It Works

AgentRails is a programmatic API: you import it and call it directly from your TypeScript/JavaScript code. There is no CLI to install and no separate config file to maintain; the configuration is a plain object in your own code.

1. Define Your Agent

async function myAgent(input: string): Promise<string> {
  // Your agent logic here
  return "Agent response";
}
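As an illustration of the agent contract, the sketch below wraps a model call in the same `(input: string) => Promise<string>` shape. The names `ModelCall`, `makeAgent`, `echoModel`, and the system prompt are hypothetical and are not part of AgentRails; `echoModel` is a canned stand-in so the agent can be exercised without network access.

```typescript
// Hypothetical model-call signature; swap in your real LLM client.
type ModelCall = (prompt: string) => Promise<string>;

// Builds an agent with the (input: string) => Promise<string> shape.
function makeAgent(callModel: ModelCall, systemPrompt: string) {
  return async function agent(input: string): Promise<string> {
    const reply = await callModel(`${systemPrompt}\n\nUser: ${input}`);
    return reply.trim(); // normalize stray whitespace from the model
  };
}

// Canned stand-in that echoes the user part of the prompt back.
const echoModel: ModelCall = async (prompt) =>
  `  You said: ${prompt.split("User: ")[1]}  `;

const demoAgent = makeAgent(echoModel, "Be polite.");
```

Keeping the model call injectable like this makes it possible to test the agent wiring offline and pass the real agent to the config only when an API key is available.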

2. Create a Config

import { validateConfig } from "agentrails";

const config = validateConfig({
  llm: {
    provider: "openai",
    apiKey: process.env.OPENAI_API_KEY,
  },
  agent: myAgent,
  rails: [
    {
      suite: "Safety Tests",
      rails: [
        {
          name: "Stays on topic",
          input: "Tell me about quantum computing",
          expectedBehavior:
            "Should discuss quantum computing, not other topics",
          goodResponses: ["Quantum computing uses quantum bits..."],
          badResponses: [
            "I don't know about quantum computing, but let me tell you about cats...",
          ],
        },
      ],
    },
  ],
});
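Because the config is ordinary code, rails can also be generated rather than written out by hand. The following is a minimal sketch under that assumption: the `Rail` and `RailSuite` interfaces are local copies of a subset of the fields shown in this README, and `topicSuite` is a hypothetical helper, not an AgentRails export.

```typescript
// Local copies of the rail shapes used in this README (subset of fields).
interface Rail {
  name: string;
  input: string;
  expectedBehavior?: string;
  goodResponses?: string[];
  badResponses?: string[];
}

interface RailSuite {
  suite: string;
  rails: Rail[];
}

// Hypothetical helper: builds one "stays on topic" rail per subject.
function topicSuite(suite: string, topics: string[]): RailSuite {
  return {
    suite,
    rails: topics.map((topic) => ({
      name: `Stays on topic: ${topic}`,
      input: `Tell me about ${topic}`,
      expectedBehavior: `Should discuss ${topic}, not other topics`,
    })),
  };
}

const safetySuite = topicSuite("Safety Tests", [
  "quantum computing",
  "solar power",
]);
```

The resulting `safetySuite` object has the same shape as the entries of the `rails` array in the config above, so it could be placed there directly.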

3. Run Tests

import { AgentRails } from "agentrails";

const results = await AgentRails.runAll(config);

// Display results
const reporter = AgentRails.createReporter(true);
reporter.printResults(results);

Integration with Test Runners

Jest

// agentrails.test.ts
import { AgentRails, validateConfig } from "agentrails";

describe("Agent Tests", () => {
  test("should pass all rails", async () => {
    const config = validateConfig({
      /* your config */
    });
    const results = await AgentRails.runAll(config);

    expect(results.every((r) => r.failed === 0)).toBe(true);
  }, 60000); // raised timeout: LLM calls can exceed Jest's 5-second default
});

Vitest

// agentrails.test.ts
import { test, expect } from "vitest";
import { AgentRails, validateConfig } from "agentrails";

test("agent rails", async () => {
  const config = validateConfig({
    /* your config */
  });
  const results = await AgentRails.runAll(config);

  expect(results.every((r) => r.failed === 0)).toBe(true);
}, 60000); // raised timeout: LLM calls can exceed Vitest's 5-second default
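The Jest and Vitest examples collapse the results to a single boolean, so a failing run reports only that `false` was not `true`. One possible refinement is to assert on a failure count instead. The sketch below relies only on the numeric `failed` field used in the examples; `SuiteResultLike` and `totalFailed` are hypothetical names, and the sample data stands in for a real `AgentRails.runAll(config)` result.

```typescript
// Only `failed` is taken from the examples; no other fields are assumed.
interface SuiteResultLike {
  failed: number;
}

// Sums failures across all suites so a test can assert on a number.
function totalFailed(results: SuiteResultLike[]): number {
  return results.reduce((sum, r) => sum + r.failed, 0);
}

// Stand-in data in place of a real runAll result.
const sampleResults: SuiteResultLike[] = [{ failed: 0 }, { failed: 2 }];
const failures = totalFailed(sampleResults);
```

In a test this would read `expect(totalFailed(results)).toBe(0)`, which shows the number of failed rails in the runner's output when the assertion fails.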

Custom Script

// run-agentrails.ts
import { AgentRails } from "agentrails";
import { config } from "./agentrails.config";

async function main() {
  const results = await AgentRails.runAll(config);
  const reporter = AgentRails.createReporter(true);
  reporter.printResults(results);

  if (results.some((r) => r.failed > 0)) {
    process.exit(1);
  }
}

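// Exit non-zero if the run itself throws (e.g. network or config errors).
main().catch((err) => {
  console.error(err);
  process.exit(1);
});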

Run with: npx tsx run-agentrails.ts

API Reference

AgentRails Class

`AgentRails.runAll(config)`

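Runs every suite in a config object. In the examples above, the awaited value is an array whose entries each expose a numeric `failed` count.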

`AgentRails.runSuite(suite, agent, llm, timeout?)`

Run a single rail suite.

`AgentRails.runRail(rail, agent, llm, timeout?)`

Run a single rail case.

`AgentRails.parseRailFile(filePath)`

Parse a YAML rail file.

`AgentRails.createReporter(verbose?)`

Creates a reporter for displaying results. The examples above pass `true` for `verbose` and print with `reporter.printResults(results)`.

Config Object

interface AgentRailsConfig {
  llm: {
    provider: "openai" | "anthropic" | "google" | "grok";
    apiKey: string;
    model?: string;
    temperature?: number;
    baseURL?: string;
  };
  agent: (input: any) => Promise<any>;
  rails: Array<{
    suite: string;
    description?: string;
    rails: Array<{
      name: string;
      description?: string;
      input: any;
      expectedBehavior?: string;
      goodResponses?: string[];
      badResponses?: string[];
      maxTimeAllowed?: number;
      expectedToolCalls?: string[];
      metadata?: Record<string, any>;
    }>;
  }>;
  timeout?: number;
}
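The interface types `apiKey` as a plain `string`, whereas values read from `process.env` are `string | undefined` in TypeScript. A small guard can close that gap and fail early with a readable message. This is a sketch only: `Env` and `requireEnv` are hypothetical names, not part of AgentRails, and the key value is a placeholder.

```typescript
// Environment shape, matching what process.env provides in Node.js.
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the variable or throws a clear error.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Demo with a literal object; in real code pass process.env instead.
const apiKey = requireEnv({ OPENAI_API_KEY: "sk-example" }, "OPENAI_API_KEY");
```

With this in place, the `llm` block could use `apiKey: requireEnv(process.env, "OPENAI_API_KEY")`, so a missing key surfaces before any rails run.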

Why a Programmatic API?

No complex config files - Just import and use
Works everywhere - Any TypeScript/JavaScript environment
Flexible - Integrate with any test runner
Reliable - No runtime TypeScript compilation issues
Simple - Clear, predictable execution

Examples

See the examples/ directory for complete examples:

agentrails.test.ts - Basic usage
jest-integration.test.ts - Jest integration

License

MIT
