@akhilan-fluxon/llm-testrunner-components
v1.1.4
A Stencil web component library for LLM test runner functionality
LLM TestRunner Web Components
A Stencil web component library for testing LLM responses, with a built-in automated evaluation engine.
Overview
The LLM TestRunner is a tool for testing Large Language Model (LLM) responses against expected criteria. It provides a complete interface for:
- Question Management: Add, edit, and organize test questions
- AI Integration: Direct integration with Google's Gemini AI API
- Automated Evaluation: Built-in evaluation engine that checks responses against expected keywords and source links
- Batch Testing: Run multiple tests sequentially
- Real-time Results: Live evaluation results with pass/fail indicators
Components
<llm-test-runner>
The main component that provides a complete LLM testing interface.
Features:
- Question input with expected keywords and source links
- Real-time AI response generation via Gemini API
- Test case management (add, delete, run individual or all tests)
- Built-in evaluation engine with keyword and source link matching
- Error handling and loading states
- Rate limiting for batch operations
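The sequential batch run with rate limiting listed above can be pictured roughly as follows. This is only a sketch and not the component's actual code: `runSequentially`, `AskFn`, and the `delayMs` default are hypothetical names and values chosen for illustration, and the real pause between requests is not documented here.

```typescript
// Hypothetical signature for whatever function sends one question to the model.
type AskFn = (question: string) => Promise<string>;

interface BatchEntry {
  question: string;
  output?: string;
  error?: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

// Runs the questions one at a time and pauses between consecutive requests.
async function runSequentially(
  questions: string[],
  ask: AskFn,
  delayMs = 1000, // assumed pause; the library's real interval may differ
): Promise<BatchEntry[]> {
  const results: BatchEntry[] = [];
  for (let i = 0; i < questions.length; i++) {
    const question = questions[i];
    try {
      results.push({ question, output: await ask(question) });
    } catch (err) {
      // Record the failure and continue so one bad request does not abort the batch.
      results.push({
        question,
        error: err instanceof Error ? err.message : String(err),
      });
    }
    // No pause is needed after the last question.
    if (i < questions.length - 1) await sleep(delayMs);
  }
  return results;
}
```

Awaiting each request before starting the next keeps at most one call in flight, which is the simplest way to stay under an API quota at the cost of a longer total run time.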
Usage:
<llm-test-runner api-key="your-gemini-api-key-here"></llm-test-runner>
🎯 Usage Modes
1. Direct HTML Usage
Simply include the component in your HTML:
<!DOCTYPE html>
<html>
  <head>
    <script type="module" src="/build/llm-testrunner.esm.js"></script>
    <script nomodule src="/build/llm-testrunner.js"></script>
  </head>
  <body>
    <llm-test-runner api-key="your-gemini-api-key-here"></llm-test-runner>
  </body>
</html>
2. Library Integration
Import as a module in your application:
import { LLMTestRunner } from '@akhilan-fluxon/llm-testrunner-components';
// The component is automatically registered and ready to use
Configuration
API Key Prop
The component accepts the Gemini API key as a prop, set through the api-key attribute in HTML or the apiKey property in JavaScript.
<llm-test-runner api-key="your-gemini-api-key-here"></llm-test-runner>
React/JSX Usage
function App() {
  return (
    <div>
      <llm-test-runner api-key="your-gemini-api-key-here" />
    </div>
  );
}
Evaluation Engine
The built-in evaluation engine provides:
- Keyword Matching: Case-insensitive matching of expected keywords in AI responses
- Source Link Validation: Checks for presence of expected URLs in responses
- Pass/Fail Logic: Tests pass only when ALL expected items are found
- Detailed Results: Shows which keywords and links were found/missing
Evaluation Criteria
- Keywords: Must be present in the AI response (case-insensitive)
- Source Links: Must be present as exact URL matches
- Pass Condition: ALL expected keywords AND source links must be found
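The criteria above can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the library's evaluation engine: `evaluateResponse` is a hypothetical helper, the shapes of `KeywordMatch` and `SourceLinkMatch` are guessed because this README does not define them, and "exact URL match" is interpreted here as a case-sensitive substring check.

```typescript
// Assumed shapes: the README does not define these two types.
interface KeywordMatch { keyword: string; found: boolean; }
interface SourceLinkMatch { link: string; found: boolean; }

interface EvaluationResult {
  testCaseId: string;
  passed: boolean;
  keywordMatches: KeywordMatch[];
  sourceLinkMatches: SourceLinkMatch[];
  timestamp?: string;
}

// Hypothetical helper that applies the documented pass criteria to one response.
function evaluateResponse(
  testCaseId: string,
  response: string,
  expectedKeywords: string[],
  expectedSourceLinks: string[],
): EvaluationResult {
  const lowered = response.toLowerCase();

  // Keywords: case-insensitive substring match.
  const keywordMatches = expectedKeywords.map(keyword => ({
    keyword,
    found: lowered.includes(keyword.toLowerCase()),
  }));

  // Source links: the URL must appear in the response exactly as written.
  const sourceLinkMatches = expectedSourceLinks.map(link => ({
    link,
    found: response.includes(link),
  }));

  // Pass only when every expected keyword and every expected link was found.
  const passed =
    keywordMatches.every(m => m.found) &&
    sourceLinkMatches.every(m => m.found);

  return {
    testCaseId,
    passed,
    keywordMatches,
    sourceLinkMatches,
    timestamp: new Date().toISOString(),
  };
}

const result = evaluateResponse(
  'tc-1',
  'Stencil compiles Web Components. See https://stenciljs.com/docs for details.',
  ['stencil', 'web components'],
  ['https://stenciljs.com/docs'],
);
console.log(result.passed); // true
```

Note that in this sketch a test case with no expected keywords and no expected links passes trivially, because `every` returns true for an empty array. Whether the real engine behaves the same way is not stated here.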
Development
Building
npm run build
Testing
npm test
Development Server
npm start
Project Structure
src/
├── components/
│   └── llm-test-runner/            # Main component
│       ├── llm-test-runner.tsx     # Component logic
│       ├── llm-test-runner.css     # Styling
│       └── readme.md               # Component documentation
├── lib/
│   └── evaluation/                 # Evaluation engine
│       ├── evaluation-engine.ts    # Core evaluation logic
│       ├── types.ts                # TypeScript interfaces
│       └── index.ts                # Exports
└── index.ts                        # Main library exports
Using in React Applications
Installation
npm install @akhilan-fluxon/llm-testrunner-components
Integration
import React, { useEffect } from 'react';
import { defineCustomElements } from '@akhilan-fluxon/llm-testrunner-components/loader';

function App() {
  // Register the custom elements once, after the first render.
  useEffect(() => {
    defineCustomElements();
  }, []);

  return (
    <div>
      <h1>LLM Test Runner</h1>
      <llm-test-runner api-key="your-gemini-api-key-here"></llm-test-runner>
    </div>
  );
}
TypeScript Support
declare global {
  namespace JSX {
    interface IntrinsicElements {
      'llm-test-runner': any;
    }
  }
}
API Reference
Component Props
interface LLMTestRunnerProps {
  apiKey: string; // Required: Your Gemini API key
}
TestCase Interface
interface TestCase {
  id: string;
  question: string;
  expectedKeywords: string[];
  expectedSourceLinks: string[];
  output?: string;
  isRunning?: boolean;
  error?: string;
  evaluationResult?: EvaluationResult;
}
EvaluationResult Interface
interface EvaluationResult {
  testCaseId: string;
  passed: boolean;
  keywordMatches: KeywordMatch[];
  sourceLinkMatches: SourceLinkMatch[];
  timestamp?: string;
}
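As a rough illustration of how these interfaces fit together, the sketch below tallies a list of test cases and lists what a failed evaluation was missing. `summarizeRun`, `missingItems`, and `RunSummary` are hypothetical helpers that are not part of the library, and the `KeywordMatch` and `SourceLinkMatch` shapes are assumptions, since they are not defined in this README.

```typescript
// Assumed shapes: the README does not define these two types.
interface KeywordMatch { keyword: string; found: boolean; }
interface SourceLinkMatch { link: string; found: boolean; }

interface EvaluationResult {
  testCaseId: string;
  passed: boolean;
  keywordMatches: KeywordMatch[];
  sourceLinkMatches: SourceLinkMatch[];
  timestamp?: string;
}

interface TestCase {
  id: string;
  question: string;
  expectedKeywords: string[];
  expectedSourceLinks: string[];
  output?: string;
  isRunning?: boolean;
  error?: string;
  evaluationResult?: EvaluationResult;
}

// Hypothetical summary type for a batch of test cases.
interface RunSummary {
  total: number;
  passed: number;
  failed: number;
  errored: number;
  pending: number; // not yet evaluated (never run, or still running)
}

function summarizeRun(testCases: TestCase[]): RunSummary {
  const summary: RunSummary = {
    total: testCases.length, passed: 0, failed: 0, errored: 0, pending: 0,
  };
  for (const tc of testCases) {
    if (tc.error) summary.errored++;
    else if (!tc.evaluationResult) summary.pending++;
    else if (tc.evaluationResult.passed) summary.passed++;
    else summary.failed++;
  }
  return summary;
}

// Lists the expected keywords and links that were not found in one result.
function missingItems(result: EvaluationResult): string[] {
  return [
    ...result.keywordMatches.filter(m => !m.found).map(m => m.keyword),
    ...result.sourceLinkMatches.filter(m => !m.found).map(m => m.link),
  ];
}
```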