@refract-org/eval
v0.2.1
Published
Evaluation harness for validating Refract analyzers against public benchmark pages and labeled claim-history examples
Evaluation harness for ground truth validation and benchmark pages.
bun add @refract-org/eval
Exports
Harness
createEvalHarness(): returns an EvalHarness with evaluate(), benchmarkPages(), and computeScores()
EvalHarness: interface for running test cases against evidence events
EvalTestCase: a single benchmark case (page, revision range, expected events)
EvalResult: per-test result with precision, matches, misses, false positives
EvalScoreSummary: aggregate scores across all tests
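As a rough illustration of the per-test fields listed above, the sketch below shows one common way precision relates to matches and false positives. It does not call the package: `SketchResult`, `precision`, and `microPrecision` are hypothetical names invented for this example, the field shapes are assumptions rather than the actual `EvalResult` type, and returning 0 when nothing was reported is a convention chosen here, not confirmed behavior of the harness.

```typescript
// Hypothetical shape for illustration only; the real EvalResult may differ.
interface SketchResult {
  matches: number;        // expected events that were found
  misses: number;         // expected events that were not found
  falsePositives: number; // reported events that were not expected
}

// Precision: the share of reported events that were actually expected.
// Assumption: a test that reports nothing scores 0 rather than NaN.
function precision(r: SketchResult): number {
  const reported = r.matches + r.falsePositives;
  return reported === 0 ? 0 : r.matches / reported;
}

// Micro-averaged precision: pool the counts across tests, then divide once.
function microPrecision(results: SketchResult[]): number {
  const matches = results.reduce((sum, r) => sum + r.matches, 0);
  const falsePositives = results.reduce((sum, r) => sum + r.falsePositives, 0);
  return precision({ matches, misses: 0, falsePositives });
}

const sample: SketchResult[] = [
  { matches: 3, misses: 1, falsePositives: 1 },
  { matches: 1, misses: 0, falsePositives: 1 },
];

console.log(precision(sample[0])); // 0.75
console.log(microPrecision(sample));
```

Pooling counts before dividing weights each reported event equally; averaging the per-test precisions instead would weight each test equally. Which of the two the package's aggregate summary uses is not stated here.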
Ground Truth
validateAgainstGroundTruth(): validate events against outcome labels
GROUND_TRUTH_LABELS: built-in ground truth labels
getGroundTruthById() / getGroundTruthForPage(): lookup helpers
OutcomeLabel: ground truth label type
L3ValidationResult / L3ValidationSummary: validation result types
import { createEvalHarness, validateAgainstGroundTruth } from "@refract-org/eval";