@benchlocal/sdk
v0.2.0
Published
BenchLocal SDK for authoring Bench Packs on top of the BenchLocal runtime contract.
Maintainers
Readme
@benchlocal/sdk
SDK for authoring Bench Packs that run inside BenchLocal.
This package sits on top of @benchlocal/core and provides the thin authoring layer used by official Bench Packs:
- typed manifest loading
- Bench Pack runtime definition helpers
- host-context lookup helpers
- result helpers for scoring
Install
npm install @benchlocal/sdkTypical usage
Keep your benchmark logic in your own repo code, and use the SDK only inside benchlocal/index.ts.
import {
createHostHelpers,
defineBenchPack,
loadBenchPackManifest,
requireScoredResults
} from "@benchlocal/sdk";
const manifest = loadBenchPackManifest(__dirname);
export { manifest };
export default defineBenchPack({
manifest,
async listScenarios() {
return [];
},
async prepare(context) {
const helpers = createHostHelpers(context);
return {
async runScenario(input) {
const provider = helpers.getRequiredProvider(input.model.provider, { enabledOnly: true });
const inference = helpers.getInferenceEndpoint(input.model.id);
return {
scenarioId: input.scenario.id,
status: "pass",
summary: inference?.status === "running" ? inference.baseUrl : provider.baseUrl
};
},
async dispose() {}
};
},
scoreModelResults(results) {
const scored = requireScoredResults(results);
return {
totalScore: scored.reduce((sum, result) => sum + result.score, 0),
categories: []
};
}
});Main helpers
loadBenchPackManifest(__dirname)defineBenchPack(...)defineBenchPackManifest(...)createHostHelpers(context)requireScoredResults(results)
Useful host lookups:
getRequiredProvider(providerId, { enabledOnly: true })getRequiredInferenceEndpoint(modelId)getRequiredVerifier(verifierId)
When a pack uses a Docker verifier, the returned inference endpoint may also include dockerBaseUrl. Forward dockerBaseUrl ?? baseUrl to the verifier-side runtime.
@benchlocal/sdk also re-exports the main public types from @benchlocal/core.
Authoring model
- keep canonical Bench Pack metadata in
benchlocal.pack.json - load and export that metadata from
benchlocal/index.ts - keep benchmark logic in your own repo modules such as
lib/ - use the SDK only for the BenchLocal adapter layer
Repository
- BenchLocal monorepo: https://github.com/stevibe/BenchLocal
- Issues: https://github.com/stevibe/BenchLocal/issues
