@altairalabs/promptarena
PromptKit Arena - Multi-turn conversation simulation and testing tool for LLM applications
Installation
npx (No Installation Required)
npx @altairalabs/promptarena run -c ./examples/customer-support
Global Installation
npm install -g @altairalabs/promptarena
# Use directly
promptarena --version
promptarena run -c ./config
Project Dev Dependency
npm install --save-dev @altairalabs/promptarena
# Use via npm scripts
# Add to package.json:
{
"scripts": {
"test:prompts": "promptarena run -c ./tests/arena-config"
}
}
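With the script in place, a typical local or CI invocation looks like the sketch below. It assumes the test:prompts script shown above and an OpenAI key exported as OPENAI_API_KEY (as in the Quick Start); substitute whichever provider key your configuration uses.
# Install dependencies, then run the arena suite via the npm script defined above
npm install
export OPENAI_API_KEY=your-key-here
npm run test:prompts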
What is PromptKit Arena?
PromptKit Arena is a comprehensive testing framework for LLM-based applications. It allows you to:
- 🎯 Test conversations across multiple LLM providers (OpenAI, Anthropic, Google, Azure)
- 🔄 Run multi-turn simulations with automated agent interactions
- ✅ Validate outputs using assertions and quality metrics
- 📊 Generate reports with detailed analysis and comparisons
- 🛡️ Test guardrails and safety measures
- 🔧 Validate tool usage and function calling
Quick Start
Get started in under 2 minutes:
# Create a new project from a template
npx @altairalabs/promptarena init my-test --quick
# Navigate to your project
cd my-test
# Set your API key (or use mock provider for testing)
export OPENAI_API_KEY=your-key-here
# Run your first test
npx @altairalabs/promptarena run
# View the HTML report
open out/report.html
That's it! The template includes pre-configured scenarios, assertions, and examples to get you started.
Browse Available Templates
# List all available templates
npx @altairalabs/promptarena templates list
# Create from a specific template
npx @altairalabs/promptarena init my-project --template community/iot-maintenance-demo
# Interactive mode (choose template, provider, etc.)
npx @altairalabs/promptarena init
Key Features
- 🎯 Multi-Provider Testing - Compare OpenAI, Anthropic, Google, and Azure side-by-side
- 🔄 Self-Play Mode - AI agents simulate realistic user conversations with personas
- ✅ Turn-Level Assertions - Validate individual responses (content, tone, length, JSON)
- 📊 Conversation Assertions - Check patterns across entire conversations
- 🎭 Template & Persona System - Dynamic prompts with variables and reusable personas
- 🛡️ Guardrail Testing - Ensure tools and responses follow safety constraints
- 📈 HTML Reports - Beautiful, detailed reports with cost tracking and metrics
Learn More
Assertion Types
- Turn-Level: content_includes, content_matches, json_schema, jsonpath, llm_judge, tone, length
- Conversation-Level: llm_judge_conversation, tools_not_called_with_args, max_tool_calls
See the Assertions Guide for examples and best practices.
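For orientation, the sketch below shows one hypothetical way these assertion types could be attached to a scenario. Only the assertion type names are taken from the list above; the YAML layout and every key (turns, assertions, type, value, rubric) are illustrative assumptions rather than the actual promptarena schema, so consult the Assertions Guide and the Configuration Reference for the real format.
# Hypothetical illustration only - not the actual promptarena config schema.
# Only the assertion type names come from the list above.
turns:
  - user: "Where is my order?"
    assertions:
      - type: content_includes        # the reply must mention shipping status
        value: "shipping"
      - type: tone                    # expect a polite, helpful tone
        value: "polite"
conversation:
  assertions:
    - type: max_tool_calls            # cap tool invocations across the dialog
      value: 3
    - type: llm_judge_conversation    # AI-graded check over the whole conversation
      rubric: "The agent stays polite and resolves the request"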
Documentation
- Full Documentation - Comprehensive guides and tutorials
- Configuration Reference - Complete schema documentation
- Examples - Working examples:
- Assertions Test - Turn and conversation-level assertions
- Customer Support - Self-play with personas
- Variables Demo - Template rendering
- LLM Judge - AI-powered evaluation
- Multi-Turn Tutorial - Self-play patterns
License
Apache-2.0 - see LICENSE
Contributing
Contributions welcome! See CONTRIBUTING.md
