@methodts/pacta-testkit
v0.3.0
Published
Pacta verification affordances — RecordingProvider, builders, assertions, and the Cortex agent conformance testkit (`./conformance` subpath, S8/PRD-065).
Downloads
1,955
Readme
title: "@methodts/pacta-testkit" scope: package layer: L3 contents:
- src/recording-provider.ts
- src/mock-tool-provider.ts
- src/builders.ts
- src/assertions.ts
@methodts/pacta-testkit
Verification affordances for Pacta agents -- recording providers, fluent builders, and assertion helpers.
Overview
The testkit provides everything needed to test Pacta agents without invoking real LLMs or spawning real processes. It includes:
- RecordingProvider -- an AgentProvider that records all interactions and replays scripted responses
- MockToolProvider -- a ToolProvider with scripted tool results
- Fluent builders --
pactBuilder()andagentRequestBuilder()for constructing test objects - Assertion helpers -- verify tool call sequences, budget consumption, and output schemas
All assertions throw descriptive errors compatible with any test runner (node:test, vitest, jest).
Install
npm install @methodts/pacta-testkitUsage
RecordingProvider
Replays scripted responses and records every interaction for later inspection.
import { RecordingProvider } from '@methodts/pacta-testkit';
import { createAgent } from '@methodts/pacta';
import type { AgentResult, TokenUsage, CostReport } from '@methodts/pacta';
const provider = new RecordingProvider();
// Script the response the provider will return
provider.setDefaultResult({
output: 'Fixed the bug in parser.ts',
sessionId: 'test-001',
completed: true,
stopReason: 'complete',
usage: { inputTokens: 100, outputTokens: 50, cacheReadTokens: 0, cacheWriteTokens: 0, totalTokens: 150 },
cost: { totalUsd: 0.001, perModel: {} },
durationMs: 500,
turns: 1,
});
const agent = createAgent({
pact: { mode: { type: 'oneshot' } },
provider,
});
const result = await agent.invoke({ prompt: 'Fix the bug' });
// Inspect what happened
const recording = provider.lastRecording!;
console.log(recording.toolCalls.length); // 0
console.log(recording.events.length); // 0
console.log(recording.result?.output); // 'Fixed the bug in parser.ts'For multi-turn tests, queue multiple responses with addResponse():
provider.addResponse({
events: [
{ type: 'tool_use', tool: 'Read', input: { file_path: '/src/parser.ts' }, toolUseId: 'tu-1' },
{ type: 'tool_result', tool: 'Read', output: 'file contents...', toolUseId: 'tu-1', durationMs: 10 },
{ type: 'turn_complete', turnNumber: 1, usage: { inputTokens: 100, outputTokens: 50, cacheReadTokens: 0, cacheWriteTokens: 0, totalTokens: 150 } },
],
result: { output: 'Done', sessionId: 's1', completed: true, stopReason: 'complete', usage: { inputTokens: 100, outputTokens: 50, cacheReadTokens: 0, cacheWriteTokens: 0, totalTokens: 150 }, cost: { totalUsd: 0.001, perModel: {} }, durationMs: 500, turns: 1 },
});After invocation, the recording captures:
recording.events-- all events in orderrecording.turns-- tool calls grouped by turnrecording.toolCalls-- all tool calls flattenedrecording.thinkingTraces-- all thinking contentrecording.result-- the final AgentResult
MockToolProvider
Implements ToolProvider with scripted results per tool name.
import { MockToolProvider } from '@methodts/pacta-testkit';
const tools = new MockToolProvider();
tools
.addTool(
{ name: 'Read', description: 'Read a file' },
{ output: 'file contents here' },
{ output: 'second file contents' },
)
.addTool(
{ name: 'Grep', description: 'Search files' },
{ output: '/src/parser.ts\n/src/lexer.ts' },
);
// Each call to execute() consumes the next scripted response for that tool
const result1 = await tools.execute('Read', { file_path: '/src/parser.ts' });
const result2 = await tools.execute('Read', { file_path: '/src/lexer.ts' });
const result3 = await tools.execute('Grep', { pattern: 'TODO' });
// Inspect the call log
console.log(tools.callLog.length); // 3
console.log(tools.callLog[0].name); // 'Read'Fluent Builders
Construct Pact and AgentRequest objects with sensible defaults. Tests only specify the fields they care about.
import { pactBuilder, agentRequestBuilder } from '@methodts/pacta-testkit';
const pact = pactBuilder()
.withMode({ type: 'oneshot' })
.withBudget({ maxTurns: 5, maxCostUsd: 0.10 })
.withScope({ allowedTools: ['Read', 'Grep'] })
.withReasoning({ thinkTool: true })
.build();
const request = agentRequestBuilder()
.withPrompt('Analyze the codebase')
.withWorkdir('/project')
.withSystemPrompt('You are a code reviewer.')
.build();PactBuilder methods: withMode, withStreaming, withBudget, withOutput, withScope, withContext, withReasoning, build
AgentRequestBuilder methods: withPrompt, withWorkdir, withSystemPrompt, withResumeSessionId, withMetadata, build
Assertion Helpers
import {
assertToolsCalled,
assertToolsCalledUnordered,
assertBudgetUnder,
assertOutputMatches,
} from '@methodts/pacta-testkit';
// Assert exact tool call sequence (ordered)
assertToolsCalled(recording, ['Grep', 'Read', 'Edit']);
// Assert tool calls regardless of order
assertToolsCalledUnordered(recording, ['Edit', 'Grep', 'Read']);
// Assert budget consumption stayed within limits
assertBudgetUnder(result, {
maxTokens: 5000,
maxCostUsd: 0.10,
maxTurns: 10,
maxDurationMs: 30_000,
});
// Assert output matches a schema (returns the parsed data)
const parsed = assertOutputMatches(result, {
parse: (raw) => {
if (typeof raw === 'string' && raw.length > 0) {
return { success: true, data: raw };
}
return { success: false, errors: ['Expected non-empty string'] };
},
});API Surface
RecordingProvider
| Method / Property | Description |
|---|---|
| addResponse(response) | Queue a scripted response for the next invoke() call |
| setDefaultResult(result) | Set the fallback result when no scripted responses remain |
| recordings | All recordings (readonly array) |
| lastRecording | Most recent recording |
| reset() | Clear all recordings and scripted responses |
MockToolProvider
| Method / Property | Description |
|---|---|
| addTool(definition, ...responses) | Register a tool with scripted responses |
| callLog | Readonly array of all calls made |
| reset() | Clear all tools and call history |
Builders
| Function | Returns |
|---|---|
| pactBuilder<T>() | PactBuilder<T> with sensible defaults (oneshot mode) |
| agentRequestBuilder() | AgentRequestBuilder with default prompt "test prompt" |
Assertions
| Function | Throws on |
|---|---|
| assertToolsCalled(recording, expectedTools) | Wrong tool sequence |
| assertToolsCalledUnordered(recording, expectedTools) | Missing or extra tools |
| assertBudgetUnder(result, limits) | Any budget limit exceeded |
| assertOutputMatches(result, schema) | Schema validation failure |
Architecture
src/
recording-provider.ts RecordingProvider — captures events, tool calls, thinking traces
mock-tool-provider.ts MockToolProvider — scripted ToolResult sequences per tool name
builders.ts PactBuilder, AgentRequestBuilder — fluent test object construction
assertions.ts assertToolsCalled, assertBudgetUnder, assertOutputMatches
conformance/ Cortex agent conformance testkit (S8 / PRD-065)
conformance-runner.ts runCortexAgentConformance entry point
mock-cortex-ctx.ts MockCortexCtx + CallRecorder
compliance-report.ts ComplianceReport schema + JCS-lite canonicalization + Ed25519 signer
cortex-types.ts Structural mirrors of CortexCtx / MethodAgentResult (sync'd to @methodts/agent-runtime)
plugin.ts ConformancePlugin interface + DEFAULT_REQUIRED_PLUGIN_IDS
plugins/ Built-in s1-method-agent-port + s3-service-adapters plugins
fixtures/ Three canonical v1 fixturesCortex Agent Conformance (./conformance subpath — PRD-065)
Cortex tenant apps of category: agent run the conformance suite from their
own CI to self-certify compliance with MethodAgentPort (S1) and the Cortex
service adapters (S3). The produced ComplianceReport.json is uploaded to
Cortex, which verifies the detached Ed25519 signature and flips
certified: true.
import { runCortexAgentConformance } from '@methodts/pacta-testkit/conformance';
import app from '../src/agent.js'; // default export: (ctx) => unknown
const report = await runCortexAgentConformance({
app,
appId: 'incident-triage',
outputPath: './compliance-report.json',
signer: await loadSignerFromCi(), // optional, required for prod
keyId: process.env.METHOD_CONFORMANCE_KEY_ID,
});
if (!report.passed) throw new Error(report.summary);What runs: every canonical fixture (incident-triage,
feature-dev-commission, daily-report) against a fresh MockCortexCtx,
followed by the two required plugins checking six S1 invariants
(C1–C6) and three S3 adapter invariants (A1–A3). Future S4/S5/S6/S7 plugins
extend the set via opts.plugins: [...DEFAULT_PLUGINS, myPlugin].
Signing recipe — load a PEM key via 1Password and pass
createEd25519Signer as the signer:
import { createEd25519Signer } from '@methodts/pacta-testkit/conformance';
const pem = process.env.METHOD_CONFORMANCE_SIGNING_KEY_PEM!;
const signer = createEd25519Signer(pem);
// …pass signer to runCortexAgentConformance optsPeer dependency on @methodts/agent-runtime is declared optional; the
conformance subpath uses structural type mirrors (cortex-types.ts) so
non-conformance consumers of the testkit do not need to install it.
Provider Conformance (./provider-conformance subpath)
Reusable row runner for pacta AgentProvider implementations. Each
provider package registers its own conformance row in a test file and
asserts the row passes — capabilities, oneshot mode, and output
validation are covered by default.
import {
runProviderConformanceRow,
type ProviderConformanceRow,
} from '@methodts/pacta-testkit/provider-conformance';
const row: ProviderConformanceRow<string> = {
id: 'pacta-provider-foo',
expectedCapabilities: {
modes: ['oneshot'],
streaming: false,
resumable: false,
budgetEnforcement: 'client',
outputValidation: 'client',
toolModel: 'none',
},
makeProvider: () => fooProvider({ transport: mockTransport }),
runOneshot: async () => mockOneshotResult,
outputSchema: { parse: (raw) => /* … */ },
};
const report = await runProviderConformanceRow(row);
if (!report.passed) throw new Error(JSON.stringify(report, null, 2));Registered rows (ship with their provider package):
| Row id | Provider | Test file |
|---|---|---|
| pacta-provider-claude-agent-sdk | @methodts/pacta-provider-claude-agent-sdk | src/conformance.test.ts |
New provider packages should add a row in the same shape so the full set is easy to audit from a single table.
Development
npm run build # TypeScript build
npm test # Run all tests (core + conformance)
npm run test-conformance # Run only the conformance suite