relia-prompt
v1.2.5
Test and benchmark prompts across LLM providers and models.
Relia Prompt
This tool is aimed at agentic use cases in large production applications that require fast and reliable LLM calls, for example extracting sentiment from social media posts or converting a sentence into structured JSON.
Features
- Multi-Provider Testing – OpenAI, Bedrock, DeepSeek, Gemini, Groq, OpenRouter
- Parallel Execution – Run tests concurrently across all configured LLMs
- Repeatability – Each test runs N times per model to measure consistency
- Code-first – Define prompts and tests in code
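To make the parallel-execution and repeatability features concrete, here is a minimal sketch of the general idea: run the same input N times against each model concurrently and report the fraction of runs that match the expected output. It is not relia-prompt's implementation. `ModelCall`, `ModelReport`, `runRepeated`, and `benchmark` are hypothetical names, and exact string matching is an assumed scoring rule chosen for simplicity.

```typescript
// Hypothetical stand-in for one LLM call; not part of relia-prompt.
type ModelCall = (input: string) => Promise<string>;

interface ModelReport {
  model: string;
  outputs: string[];
  passRate: number; // fraction of runs whose output equals the expected string
}

// Start `runs` calls at once and score the collected outputs against `expected`.
async function runRepeated(
  model: string,
  call: ModelCall,
  input: string,
  expected: string,
  runs: number,
): Promise<ModelReport> {
  const outputs = await Promise.all(
    Array.from({ length: runs }, () => call(input)),
  );
  const passed = outputs.filter((o) => o.trim() === expected.trim()).length;
  return { model, outputs, passRate: runs > 0 ? passed / runs : 0 };
}

// Fan out across every model concurrently; one report per model, in key order.
async function benchmark(
  models: Record<string, ModelCall>,
  input: string,
  expected: string,
  runs: number,
): Promise<ModelReport[]> {
  return Promise.all(
    Object.entries(models).map(([name, call]) =>
      runRepeated(name, call, input, expected, runs),
    ),
  );
}
```

Passing fake `ModelCall` functions (for instance one that always returns the same label and one that alternates) is enough to try this without any provider credentials. A low `passRate` then points at an inconsistent model or prompt.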
Quick Start
Prompts and tests live in your code. Use the example project pattern:
# From a project that has reliaprompt.definitions.ts (see example)
cd example
bun install
bun run reliaprompt:ui # or: from your app, add "reliaprompt:ui" and run from project root
# Open http://localhost:3000

Set credentials via the RELIA_PROMPT_LLM_CONFIG_JSON environment variable (see Configuration). At least one provider is required.
Usage
Code-first (only mode)
Use ReliaPrompt inside your service for LLM benchmarking and testing from unit tests.
Install – Add relia-prompt as a dependency.

Initialize – Pass credentials at startup (or load from RELIA_PROMPT_LLM_CONFIG_JSON when using the UI):

    import {
      initializeReliaPrompt,
      runPromptTestsFromSuite,
      definePrompt,
      defineTestCase,
      defineSuite,
    } from "relia-prompt";

    initializeReliaPrompt({
      providers: {
        // Canonical keys can be provided directly in library mode.
        // For UI/server mode prefer RELIA_PROMPT_LLM_CONFIG_JSON in .env.
      },
    });

Define prompts and tests in code – Use the builder API and export suites for the UI:

    const prompt = definePrompt({ name: "my-prompt", content: "..." });
    const testCases = [
      defineTestCase({ input: "...", expectedOutput: "[...]", expectedOutputType: "array" }),
    ];
    export const suites = [defineSuite({ prompt, testCases })];

Run tests – Require testModels (and evaluationModel when using LLM evaluation) per run:

    const { score, results } = await runPromptTestsFromSuite(suite, {
      testModels: [{ provider: "provider-id", modelId: "model-id" }],
      evaluationModel: ..., // required when prompt.evaluationMode === "llm"
      runsPerTest: 1,
    });

Optional UI – From your project root (where your definitions live), run:

    yarn reliaprompt:ui

The UI shows prompts and tests from your code (read-only tests; prompt edits in the browser are drafts only). Configure RELIA_PROMPT_LLM_CONFIG_JSON in .env and choose test/evaluation models on each run.
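Since the library is meant to be used from unit tests, one possible pattern is to fail a test when a suite run scores below a threshold. The sketch below assumes that `score` is a number where higher is better; the README does not state its type or range. `SuiteRunResult`, `SuiteRunner`, and `assertScoreAtLeast` are hypothetical helpers and not part of relia-prompt. In a real project the runner would wrap a call to `runPromptTestsFromSuite(suite, options)`.

```typescript
// Assumed result shape: the usage example destructures `score` and `results`.
// Treating `score` as a number is an assumption made for this sketch.
interface SuiteRunResult {
  score: number;
  results: unknown;
}

// Any function that performs one suite run and resolves with its result.
type SuiteRunner = () => Promise<SuiteRunResult>;

// Hypothetical helper: resolve with the score, or reject when it is below `minScore`.
async function assertScoreAtLeast(
  run: SuiteRunner,
  minScore: number,
): Promise<number> {
  const { score } = await run();
  if (!Number.isFinite(score)) {
    throw new Error(`Expected a numeric score, got ${String(score)}`);
  }
  if (score < minScore) {
    throw new Error(`Prompt score ${score} is below the required ${minScore}`);
  }
  return score;
}
```

Because the runner is injected, the same helper works with a stubbed runner in fast tests and with a real provider-backed run in slower integration tests.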
Configuration
Configuration is JSON-only via RELIA_PROMPT_LLM_CONFIG_JSON.
Use .env.example as the canonical template for the full JSON object.
See example for a full example and smoke test.
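Because the whole configuration travels in one environment variable, a malformed value can be hard to spot. The sketch below shows one way to check early that the value parses as a JSON object. `parseLlmConfig` is a hypothetical helper, not a relia-prompt export, and it deliberately does not validate any keys, since the actual schema is defined by .env.example.

```typescript
// Parse a raw RELIA_PROMPT_LLM_CONFIG_JSON value and confirm it is a JSON object.
// The keys inside the object are not checked here.
function parseLlmConfig(raw: string | undefined): Record<string, unknown> {
  if (raw === undefined || raw.trim() === "") {
    throw new Error("RELIA_PROMPT_LLM_CONFIG_JSON is not set");
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`RELIA_PROMPT_LLM_CONFIG_JSON is not valid JSON: ${reason}`);
  }
  // Arrays and primitives are valid JSON but cannot serve as a config object.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("RELIA_PROMPT_LLM_CONFIG_JSON must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}

// Usage (in a Node or Bun process):
//   const config = parseLlmConfig(process.env.RELIA_PROMPT_LLM_CONFIG_JSON);
```

Running a check like this at startup turns a quoting mistake in .env into an immediate, readable error instead of a failed provider call later.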
Development
bun dev # Backend + dashboard with hot reload
bun run dev:backend # Backend only with hot reload
bun dev:dashboard # Dashboard dev server
bun run build # Build dashboard + backend
bun run lint # Lint backend
bun run test # Unit tests
bun run test:e2e # E2E tests (Playwright)
bun run format # Format code
Project Structure
├── src/                  # Backend (Express + Bun)
│   ├── server.ts         # API routes
│   ├── llm-clients/      # Provider clients
│   └── services/         # Test runner
├── dashboard/            # SvelteKit app
│   └── src/
│       ├── lib/          # Components & stores
│       └── routes/       # Pages
└── example/              # Example project
License
MIT
