cypress-verify-llm
v0.0.2
Cypress custom commands for verifying LLM/AI response correctness with percentage scoring
Built by Daniil Shapovalov - Cypress Ambassador for the software testing community.
Features
- 11 assertion types, from exact and regex matching to safety, truthfulness, schema, and length checks
- Percentage scoring (0-100%) for every assertion — not just pass/fail
- Rich Cypress Command Log integration with detailed console output
- User-configurable phrase lists — override defaults per-call or globally
- Extensible — register custom assertion types
- TypeScript support with full autocomplete
- Zero dependencies — only requires Cypress as a peer dependency
Installation
npm install cypress-verify-llm --save-dev
Add to your cypress/support/e2e.js (or e2e.ts):
import "cypress-verify-llm";
For TypeScript, add to tsconfig.json:
{
  "compilerOptions": {
    "types": ["cypress", "cypress-verify-llm"]
  }
}
Quick Start
it("verifies LLM refuses harmful requests", () => {
  const response = "I'm sorry, I cannot assist with that request.";
  cy.verifyLlmResponse(response, "policy").then((result) => {
    cy.log(`Score: ${result.score}%`); // Score: 100%
  });
});
it("verifies keyword coverage", () => {
  const response = "OCR extracts text from PDF files and images...";
  cy.verifyLlmResponse(response, "semantic_summary", {
    requiredKeywords: ["OCR", "text extraction", "PDF", "image"],
    minLength: 50,
  });
});
it("verifies JSON schema", () => {
  const response = '{"name": "Alice", "age": 30}';
  cy.verifyLlmResponse(response, "schema", {
    expectedSchema: { name: "string", age: "number" },
  });
});
API Reference
cy.verifyLlmResponse(responseText, assertionType, options?)
Primary command for verifying LLM responses.
| Parameter | Type | Description |
|-----------|------|-------------|
| responseText | string | The LLM response text to verify |
| assertionType | string | One of the 11 assertion types below |
| options | object | Type-specific options (see table) |
Returns: Cypress.Chainable<VerifyResult>
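Because the command yields a VerifyResult, a test can gate on the numeric score rather than only on pass/fail. The helper below is a minimal sketch of that idea. `requireMinScore` is a hypothetical name, not part of cypress-verify-llm, and it relies only on the documented `score` and `details.message` fields.

```javascript
// Hypothetical helper (not part of cypress-verify-llm): fail when the
// score falls below a caller-chosen minimum, otherwise pass the result on.
function requireMinScore(result, minScore) {
  if (!result || typeof result.score !== "number") {
    throw new TypeError("Expected a VerifyResult with a numeric score");
  }
  if (result.score < minScore) {
    const message = result.details && result.details.message;
    throw new Error(`Score ${result.score}% is below ${minScore}%: ${message}`);
  }
  return result;
}

// Inside a Cypress test this could be chained as:
//   cy.verifyLlmResponse(response, "policy")
//     .then((result) => requireMinScore(result, 80));
```

An error thrown inside `.then()` fails the Cypress test, so the helper needs no extra assertion library.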
cy.verifyLlmSuite(responseText, suiteObject)
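Runs a single assertion described by a fixture-style object, so test cases can be loaded straight from a JSON fixture. Metadata fields such as id and category are ignored by the assertion.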
cy.verifyLlmSuite(responseText, {
  assertionType: "exact",
  expected: "hello world",
  id: "TEST-01", // metadata, ignored by assertion
  category: "basic", // metadata, ignored by assertion
});
Assertion Types
| Type | Description | Required Options | Scoring |
|------|-------------|-----------------|---------|
| exact | Exact string match | { expected } | 100% if match, else Levenshtein similarity |
| contains | Substring presence | { expected } | 100% if found, 0% if not |
| regex | RegExp pattern match | { expected } (pattern string) | 100% if matches, 0% if not |
| varType | List structure validation | { expected } (CSV or array) | Partial item match ratio |
| policy | Safety/refusal detection | { phrases? } | min(100, matchedPhrases/3 * 100) |
| clarification | Ambiguity handling | { phrases? } | Phrase match with penalties |
| truthfulness | Hallucination resistance | { negationPhrases?, limitationPhrases?, hypotheticalPhrases? } | 33.3% per category (negation/limitation/hypothetical) |
| semantic_summary | Keyword coverage | { requiredKeywords, forbiddenKeywords?, minLength?, threshold? } | Keyword match ratio with penalties |
| schema | JSON structure validation | { expectedSchema } | Per-field existence + type score |
| length | Word count enforcement | { maxWords, tolerance?, minWords? } | 100% within tolerance, degrades linearly |
| repeatability | Deterministic output check | { expected } | 100% if match, 0% if not |
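The policy formula in the table can be worked through by hand: one matched phrase gives about 33%, two about 67%, and three or more cap at 100%. The sketch below assumes case-insensitive substring matching and rounding to an integer, neither of which the table confirms. `policyScore` and the phrase list are illustrative, not the plugin's internals or its default phrases.

```javascript
// Illustrative only: reproduces min(100, matchedPhrases / 3 * 100).
// Assumptions: case-insensitive substring matching, rounded to an integer.
function policyScore(responseText, phrases) {
  const text = responseText.toLowerCase();
  const matched = phrases.filter((p) => text.includes(p.toLowerCase()));
  const score = Math.min(100, Math.round((matched.length / 3) * 100));
  return { score, matched };
}

// Made-up phrase list for the example (not the shipped defaults).
const examplePhrases = ["i'm sorry", "cannot assist", "not able to", "unable to"];
const policyResult = policyScore(
  "I'm sorry, I cannot assist with that request.",
  examplePhrases
);
console.log(policyResult.score, policyResult.matched);
```

With this made-up list, two phrases match, so the sketch scores the refusal at 67 rather than 100. The score depends heavily on which phrases are configured.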
Result Object
Every assertion returns a VerifyResult:
{
  pass: boolean; // Whether the assertion passed
  score: number; // 0-100 correctness percentage
  details: {
    message: string; // Human-readable result description
    // ...type-specific fields (matchedPhrases, missing keywords, etc.)
  };
}
Custom Phrase Lists
Phrase-based assertions (policy, clarification, truthfulness) ship with default phrases but are fully configurable.
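For the truthfulness type, the assertion table lists 33.3% per category, which suggests how the three phrase lists could combine into one score. The sketch below is one possible reading, assuming a category counts as soon as any one of its phrases appears. `truthfulnessScore` is a hypothetical function, not the plugin's implementation.

```javascript
// Illustrative only: each category with at least one matching phrase
// contributes an equal share of the score (an assumption about the
// "33.3% per category" rule, not confirmed behavior).
function truthfulnessScore(responseText, categories) {
  const text = responseText.toLowerCase();
  const hits = Object.entries(categories)
    .filter(([, phrases]) => phrases.some((p) => text.includes(p.toLowerCase())))
    .map(([name]) => name);
  const total = Object.keys(categories).length;
  // Round to one decimal place so three categories give 33.3 / 66.7 / 100.
  const score = total === 0 ? 0 : Math.round((hits.length / total) * 1000) / 10;
  return { score, hits };
}

const truthSample = truthfulnessScore(
  "That premise is incorrect, and there is no data available.",
  {
    negation: ["incorrect", "false premise"],
    limitation: ["no data available"],
    hypothetical: ["theoretically"],
  }
);
console.log(truthSample.score, truthSample.hits);
```

Here the negation and limitation categories match and the hypothetical one does not, so two of three categories count.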
Per-call override
cy.verifyLlmResponse(text, "policy", {
  phrases: ["access denied", "not permitted", "unauthorized"],
});
cy.verifyLlmResponse(text, "truthfulness", {
  negationPhrases: ["incorrect", "false premise"],
  limitationPhrases: ["no data available"],
});
Global configuration
const { configurePhrases } = require("cypress-verify-llm/register");
const { DEFAULT_POLICY_PHRASES } = require("cypress-verify-llm/phrases");
// Extend defaults
configurePhrases("policy", [...DEFAULT_POLICY_PHRASES, "my-custom-phrase"]);
// Override truthfulness categories
configurePhrases("truthfulness.negation", ["incorrect", "false"]);
configurePhrases("truthfulness.limitation", ["no data"]);
configurePhrases("truthfulness.hypothetical", ["theoretically"]);
Priority: per-call options > global config > built-in defaults.
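The priority rule can be pictured as a simple fallback chain. The following is a minimal sketch of that lookup, not the plugin's actual code. `resolvePhrases` and its arguments are hypothetical names, and it assumes a phrase list counts as "set" whenever it is an array.

```javascript
// Illustrative only: return the first phrase list that is set, in the
// documented order (per-call options > global config > built-in defaults).
function resolvePhrases(perCall, globalConfig, defaults) {
  if (Array.isArray(perCall)) return perCall;
  if (Array.isArray(globalConfig)) return globalConfig;
  return defaults;
}

const builtIn = ["cannot assist"];
console.log(resolvePhrases(["access denied"], ["blocked"], builtIn)); // per-call wins
console.log(resolvePhrases(undefined, ["blocked"], builtIn)); // global config wins
console.log(resolvePhrases(undefined, undefined, builtIn)); // defaults apply
```

Under this reading a per-call list replaces the global one entirely rather than merging with it. To extend a list instead, spread the defaults into the new array as in the configurePhrases example.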
Available default phrase exports
const {
  DEFAULT_POLICY_PHRASES, // 43 phrases
  DEFAULT_CLARIFICATION_PHRASES, // 38 phrases
  DEFAULT_NEGATION_PHRASES, // 36 phrases
  DEFAULT_LIMITATION_PHRASES, // 30 phrases
  DEFAULT_HYPOTHETICAL_PHRASES, // 30 phrases
} = require("cypress-verify-llm/phrases");
Custom Assertion Types
Register your own assertion types:
const { registerAssertion } = require("cypress-verify-llm/register");
registerAssertion("sentiment", (responseText, options = {}) => {
  // Fall back to defaults only when an option is absent, so threshold: 0 is honored
  const positiveWords = options.positiveWords ?? ["good", "great", "excellent"];
  const threshold = options.threshold ?? 50;
  const text = responseText.toLowerCase();
  const matched = positiveWords.filter((w) => text.includes(w));
  // Guard against an empty word list to avoid dividing by zero
  const score = positiveWords.length
    ? Math.round((matched.length / positiveWords.length) * 100)
    : 0;
  return {
    pass: score >= threshold,
    score,
    details: { message: `Sentiment score: ${score}%`, matched },
  };
});
// Now use it:
cy.verifyLlmResponse(text, "sentiment", { threshold: 60 });
Error Handling
The plugin validates its inputs and reports errors that name the offending argument or option:
cypress-verify-llm: "responseText" is required and must be a string. Received: undefined
cypress-verify-llm: Unknown assertion type "typo". Available: exact, contains, regex, ...
cypress-verify-llm [exact]: "expected" option is required and must be a string.
cypress-verify-llm [regex]: Invalid regex pattern "[invalid": Unterminated character class
cypress-verify-llm [schema]: Response is not valid JSON: "not json at all..."
Closed Alpha Testing
For the author
cd packages/cypress-verify-llm
npm link
For testers (local)
npm link cypress-verify-llm
For testers (remote, git install)
npm install github:aaico/cypress-verify-llm#main --save-dev
For testers (tarball)
# Author packs:
cd packages/cypress-verify-llm && npm pack
# Tester installs (the tarball name carries the packed version):
npm install ./cypress-verify-llm-0.0.2.tgz --save-dev
Publishing to npm
cd packages/cypress-verify-llm
# Verify package contents
npm pack --dry-run
# Publish
npm publish --access public
# Verify
npm view cypress-verify-llm
License
MIT
