@agentic-kernel/evaluator
v0.6.0
Evaluation harness — scenario × model matrix, deterministic / LLM judges, `pass^k` reliability summaries, and JSON/Markdown reports. Includes the `agent-engine-eval` CLI. Runtime-optional; for release gates and model selection.
Part of Agentic Kernel — a microkernel agent runtime, feature-paired across TypeScript and Python.
Install
npm install @agentic-kernel/evaluator
Usage
import { runEvalSuite, renderMarkdownReport } from "@agentic-kernel/evaluator";
const report = await runEvalSuite(suite, { trials: 5 });
See the documentation site for guides and the full API.
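This README does not define the `pass^k` reliability summary. As a rough illustration only, the sketch below assumes the definition commonly used in agent benchmarks: the probability that k independent trials of the same scenario all pass, estimated from c passes in n trials as C(c, k) / C(n, k). The helper names `binomial` and `passHatK` are hypothetical and are not part of this package's documented API; the package may compute its summaries differently.

```typescript
// Hypothetical helpers for illustration; not part of @agentic-kernel/evaluator.

// Binomial coefficient C(n, k), computed iteratively to avoid large factorials.
function binomial(n: number, k: number): number {
  if (k < 0 || k > n) return 0;
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i;
  }
  return result;
}

// Assumed estimator of pass^k: C(c, k) / C(n, k), where c is the number of
// passing trials out of n recorded trials for a single scenario.
function passHatK(outcomes: boolean[], k: number): number {
  const n = outcomes.length;
  if (k < 1 || k > n) {
    throw new RangeError("k must be between 1 and the number of trials");
  }
  const c = outcomes.filter(Boolean).length;
  return binomial(c, k) / binomial(n, k);
}

// Five trials of one scenario, four of which passed.
const trials = [true, true, true, true, false];
console.log(passHatK(trials, 1)); // 0.8
console.log(passHatK(trials, 2)); // 0.6
```

Under this assumed definition, `pass^1` is the plain pass rate, and larger k penalizes flaky scenarios more heavily, which is why such a metric suits release gates.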
License
Apache-2.0
