# Relia Prompt

Test and benchmark prompts across LLM providers and models.
This tool is aimed at agentic use cases in large production applications that require fast and reliable LLM calls — for example, extracting sentiment from social media posts or converting a sentence into structured JSON.
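As an illustration of the structured-JSON case, the snippet below sketches a strict parser for a sentiment payload a prompt might be asked to emit. The `Sentiment` shape and `parseSentiment` helper are hypothetical examples, not part of relia-prompt:

```typescript
// Hypothetical shape a sentiment-extraction prompt might be required to
// return, plus a strict parser that rejects anything off-schema.
type Sentiment = { label: "positive" | "negative" | "neutral"; confidence: number };

function parseSentiment(raw: string): Sentiment {
  const data = JSON.parse(raw);
  const labels = ["positive", "negative", "neutral"];
  if (!labels.includes(data.label) || typeof data.confidence !== "number") {
    throw new Error(`Unexpected model output: ${raw}`);
  }
  return data as Sentiment;
}

console.log(parseSentiment('{"label":"positive","confidence":0.92}').label); // "positive"
```

Failing fast on off-schema output is what makes such prompts safe to benchmark: a run either produces parseable JSON or it counts as a miss.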
## Features
- Multi-Provider Testing – OpenAI, Bedrock, DeepSeek, Gemini, Groq, OpenRouter
- Parallel Execution – Run tests concurrently across all configured LLMs
- Repeatability – Each test runs N times per model to measure consistency
- Code-first – Define prompts and tests in code
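The repeatability feature can be pictured as a simple consistency metric: run the same test N times and report the share of runs that agree with the most common output. The helper below is a hypothetical sketch of that idea, not the library's actual scoring:

```typescript
// Hypothetical consistency metric: fraction of runs matching the modal
// output. relia-prompt's real scoring may differ.
function consistency(outputs: string[]): number {
  const counts = new Map<string, number>();
  for (const o of outputs) counts.set(o, (counts.get(o) ?? 0) + 1);
  const modal = Math.max(...Array.from(counts.values()));
  return modal / outputs.length;
}

console.log(consistency(["a", "a", "b", "a"])); // 0.75
```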
## Quick Start

Prompts and tests live in your code. Use the example project pattern:

```bash
# From a project that has reliaprompt.definitions.ts (see example)
cd example
bun install
bun run reliaprompt:ui  # or: from your app, add a "reliaprompt:ui" script and run from the project root
# Open http://localhost:3000
```

Set credentials via the `RELIA_PROMPT_LLM_CONFIG_JSON` environment variable (see Configuration). At least one provider is required.
## Usage

### Code-first (only mode)

Use ReliaPrompt inside your service for LLM benchmarking and testing from unit tests.
1. **Install** – Add `relia-prompt` as a dependency.

2. **Initialize** – Pass credentials at startup (or load from `RELIA_PROMPT_LLM_CONFIG_JSON` when using the UI):

   ```ts
   import {
     initializeReliaPrompt,
     runPromptTestsFromSuite,
     definePrompt,
     defineTestCase,
     defineSuite,
   } from "relia-prompt";

   initializeReliaPrompt({
     providers: {
       // Canonical keys can be provided directly in library mode.
       // For UI/server mode prefer RELIA_PROMPT_LLM_CONFIG_JSON in .env.
     },
   });
   ```

3. **Define prompts and tests in code** – Use the builder API and export `suites` for the UI:

   ```ts
   const prompt = definePrompt({ name: "my-prompt", content: "..." });
   const testCases = [
     defineTestCase({ input: "...", expectedOutput: "[...]", expectedOutputType: "array" }),
   ];
   export const suites = [defineSuite({ prompt, testCases })];
   ```

4. **Run tests** – Provide `testModels` (and `evaluationModel` when using LLM evaluation) per run:

   ```ts
   const { score, results } = await runPromptTestsFromSuite(suite, {
     testModels: [{ provider: "provider-id", modelId: "model-id" }],
     evaluationModel: ..., // required when prompt.evaluationMode === "llm"
     runsPerTest: 1,
   });
   ```

5. **Optional UI** – From your project root (where your definitions live), run:

   ```bash
   yarn reliaprompt:ui
   ```

   The UI shows prompts and tests from your code (tests are read-only; prompt edits in the browser are drafts only). Configure `RELIA_PROMPT_LLM_CONFIG_JSON` in `.env` and choose test/evaluation models on each run.
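To make the `expectedOutputType: "array"` idea concrete, a matcher for such type checks could look like the sketch below. This is a hypothetical helper; the library's own evaluator is authoritative:

```typescript
// Hypothetical matcher for expectedOutputType-style checks: parse the raw
// model output as JSON and verify it has the expected runtime type.
function matchesType(raw: string, expected: "array" | "object" | "string" | "number"): boolean {
  try {
    const v = JSON.parse(raw);
    if (expected === "array") return Array.isArray(v);
    if (expected === "object") return typeof v === "object" && v !== null && !Array.isArray(v);
    return typeof v === expected;
  } catch {
    return false; // unparseable output never matches
  }
}

console.log(matchesType("[1,2,3]", "array"));   // true
console.log(matchesType('{"a":1}', "array"));   // false
```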
## Configuration

Configuration is JSON-only via `RELIA_PROMPT_LLM_CONFIG_JSON`. Use `.env.example` as the canonical template for the full JSON object. See the example project for a full example and smoke test.
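For orientation, the environment variable holds a single JSON object; a shape along these lines is plausible, but the provider key names below are purely illustrative and `.env.example` remains the canonical reference:

```json
{
  "providers": {
    "openai": { "apiKey": "sk-..." },
    "groq": { "apiKey": "gsk-..." }
  }
}
```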
## Development

```bash
bun dev                # Backend + dashboard with hot reload
bun run dev:backend    # Backend only with hot reload
bun dev:dashboard      # Dashboard dev server
bun run build          # Build dashboard + backend
bun run lint           # Lint backend
bun run test           # Unit tests
bun run test:e2e       # E2E tests (Playwright)
bun run format         # Format code
```

## Project Structure
```
├── src/              # Backend (Express + Bun)
│   ├── server.ts     # API routes
│   ├── llm-clients/  # Provider clients
│   └── services/     # Test runner
├── dashboard/        # SvelteKit app
│   └── src/
│       ├── lib/      # Components & stores
│       └── routes/   # Pages
└── example/          # Example project
```

## License
MIT
