@kognitivedev/evals
v0.2.8
Evaluation framework with model-graded, rule-based, and composite scorers.
Installation
bun add @kognitivedev/evals ai zod

Quick Start
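import { generateText } from "ai";
// The OpenAI provider is a separate package, @ai-sdk/openai.
import { openai } from "@ai-sdk/openai";
import { runEvals, ruleBased, modelGradedScorer, EvalDatasetBuilder } from "@kognitivedev/evals";

const report = await runEvals({
  // Produces the response under evaluation for each dataset entry.
  generateResponse: async (messages) => {
    const result = await generateText({ model: openai("gpt-4o-mini"), messages });
    return result.text;
  },
  // Each entry supplies the input messages passed to generateResponse.
  dataset: EvalDatasetBuilder.fromArray([
    { input: [{ role: "user", content: "Hello" }] },
  ]),
  // Every scorer is applied to each generated response.
  scorers: [
    ruleBased.containsKeyword("hello"),
    modelGradedScorer({ model: openai("gpt-4o-mini"), criteria: "Is this helpful?" }),
  ],
});

console.log(report.summary.avgScore);

Scorers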
modelGradedScorer: LLM-as-judge via AI SDK generateObject
ruleBased.containsKeyword, .matchesRegex, .exactMatch, .lengthRange
compositeScorer: weighted average of multiple scorers
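
To make the "weighted average of multiple scorers" idea concrete, here is a minimal, self-contained sketch. It does not use the package's API: the `Scorer` type, the local `containsKeyword` and `lengthRange` helpers, and `weightedComposite` are all hypothetical stand-ins written for illustration, and the real signatures of `ruleBased.*` and `compositeScorer` may differ. The sketch assumes scores lie in the range 0 to 1, that keyword matching is case-insensitive, and that scorers are synchronous (a model-graded scorer would need to be async).

```typescript
// Hypothetical shape for illustration; not the type exported by @kognitivedev/evals.
type Scorer = {
  name: string;
  score: (output: string) => number; // assumed to return a value between 0 and 1
};

// Rule-based check: 1 if the output contains the keyword, otherwise 0.
// Case-insensitive matching is an assumption of this sketch.
const containsKeyword = (keyword: string): Scorer => ({
  name: `containsKeyword(${keyword})`,
  score: (output) =>
    output.toLowerCase().includes(keyword.toLowerCase()) ? 1 : 0,
});

// Rule-based check: 1 if the output length is within [min, max], otherwise 0.
const lengthRange = (min: number, max: number): Scorer => ({
  name: `lengthRange(${min}, ${max})`,
  score: (output) => (output.length >= min && output.length <= max ? 1 : 0),
});

// Weighted average: sum(weight * score) / sum(weight).
const weightedComposite = (
  parts: { scorer: Scorer; weight: number }[],
): Scorer => ({
  name: "weightedComposite",
  score: (output) => {
    const totalWeight = parts.reduce((sum, p) => sum + p.weight, 0);
    if (totalWeight <= 0) {
      throw new Error("weights must sum to a positive number");
    }
    const weightedSum = parts.reduce(
      (sum, p) => sum + p.weight * p.scorer.score(output),
      0,
    );
    return weightedSum / totalWeight;
  },
});

// The keyword check counts three times as much as the length check.
const combined = weightedComposite([
  { scorer: containsKeyword("hello"), weight: 3 },
  { scorer: lengthRange(1, 10), weight: 1 },
]);

// The keyword matches (1) but the text is longer than 10 characters (0),
// so the result is (3 * 1 + 1 * 0) / 4.
const compositeScore = combined.score("Hello there, how can I help?");
console.log(compositeScore); // prints 0.75
```

Normalizing by the total weight keeps the composite in the same 0 to 1 range as its parts, so weights can be given in any convenient units rather than having to sum to 1.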
