@reaatech/agent-eval-harness-cost

v0.1.0

Published

13 days ago

Cost tracking, budget management, and reporting for agent-eval-harness

Downloads

150

0High
0Medium
0Low

reaatech

@reaatech/agent-eval-harness-cost

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Per-task cost calculation, budget enforcement, and cost reporting for AI agent trajectories. Tracks LLM token usage and tool invocation costs across 8 supported models with configurable pricing and 3-tier budget alerting.

Installation

npm install @reaatech/agent-eval-harness-cost

Feature Overview

8 built-in model pricings — Claude Opus/Sonnet/Haiku, GPT-4/Turbo/Mini, Gemini Pro/Flash
Per-turn and per-trajectory costing — granular breakdown with LLM vs tool cost separation
Budget enforcement — 3-tier alert system (50% log, 75% notify, 90% block) with daily cumulative tracking
Three budget presets — strict ($0.01/task), moderate ($0.05/task), lenient ($0.10/task)
Cost reporting — JSON, CSV, and formatted human-readable output
Optimization recommendations — identifies cost reduction opportunities per trajectory

Quick Start

import { calculateTrajectoryCost, checkBudget, createBudget, generateCostReport } from '@reaatech/agent-eval-harness-cost';
import type { Trajectory } from '@reaatech/agent-eval-harness-types';

const cost = calculateTrajectoryCost(trajectory, 'claude-opus');
console.log(`Total: $${cost.total_cost.toFixed(4)}`);

const budget = createBudget('moderate');
const result = checkBudget(cost, budget);
console.log(`Within budget: ${result.within_budget}, Usage: ${result.usage_percentage}%`);

API Reference

Cost Calculation Functions

| Export | Signature | Description | |--------|-----------|-------------| | calculateTurnCost | (turn: Turn, provider: string, options?: CostOptions) => TurnCost | Calculates cost for a single turn. Uses actual token counts from turn.cost if present, otherwise estimates from content length. Separates LLM cost (input + output tokens) from tool invocation cost. | | calculateTrajectoryCost | (trajectory: Trajectory, provider: string, options?: CostOptions) => CostBreakdown | Calculates cost for an entire trajectory. Aggregates all agent turns, returning a CostBreakdown with total, LLM, and tool costs plus per_turn breakdown. | | compareCosts | (baseline: CostBreakdown, candidate: CostBreakdown) => { costDiff, percentageChange, cheaper } | Compares two cost objects. Returns absolute difference, percentage change, and a cheaper boolean indicating if the candidate is less expensive. | | getCostPerMetric | (cost: CostBreakdown, metric: 'turn' \| 'tool_call' \| 'trajectory', trajectory: Trajectory) => number | Normalizes cost by the chosen metric. 'turn' divides by agent turn count, 'tool_call' divides by total tool calls, 'trajectory' returns total cost unchanged. |

Budget Functions

| Export | Signature | Description | |--------|-----------|-------------| | checkBudget | (cost: CostBreakdown, budget: BudgetConfig, thresholds?: AlertThreshold[]) => BudgetCheckResult | Checks cost against budget limits. Evaluates perTrajectory or perTask constraints and triggers alerts at configured thresholds. Default 3-tier thresholds: 50% log, 75% warn, 90% block. | | getOptimizationRecommendations | (cost: CostBreakdown, trajectory: Trajectory) => string[] | Analyzes cost breakdown and trajectory structure to suggest optimizations. Checks output/input token ratio, tool cost ratio, expensive turns, and conversation length. | | createBudget | (preset: 'strict' \| 'moderate' \| 'lenient') => BudgetConfig | Creates a pre-configured budget. Returns perTask, perTrajectory, daily, and perToolCall limits for the requested preset. |

CostTracker Class

| Method | Signature | Description | |--------|-----------|-------------| | constructor | (dailyBudget?: number) => CostTracker | Creates a tracker with an optional daily budget for cumulative monitoring. | | addTrajectory | (cost: CostBreakdown) => BudgetCheckResult | Adds a trajectory cost to the cumulative total. Triggers alerts at 75% (warning) and 90% (error) of daily budget. | | getTotalCost | () => number | Returns the cumulative total cost across all tracked trajectories. | | getTrajectoryCount | () => number | Returns the number of trajectories tracked. | | reset | () => void | Resets cumulative cost, trajectory count, and alerts. |

Reporting Functions

| Export | Signature | Description | |--------|-----------|-------------| | generateCostReport | (trajectories: Array<{ trajectory, cost }>, options?: CostReportOptions) => CostReport | Generates a full cost report with totals, per-trajectory breakdowns, hourly trends, and top expensive operations. | | formatCost | (cost: number, currency?: string) => string | Formats a numeric cost as a currency string (e.g., "$0.0023"). Defaults to USD with 4–6 decimal places. | | exportToCsv | (report: CostReport) => string | Exports a cost report to CSV format with per-trajectory details. | | exportToJson | (report: CostReport) => string | Exports a cost report to pretty-printed JSON. | | generateSummary | (report: CostReport) => string | Generates a human-readable summary with total cost, breakdown, and top expensive operations. |

Constants

DEFAULT_PRICING

Per-million-token pricing for all 8 built-in models:

| Model | Input ($/M tokens) | Output ($/M tokens) | |-------|--------------------|----------------------| | claude-opus | $15.00 | $75.00 | | claude-sonnet | $3.00 | $15.00 | | claude-haiku | $0.25 | $1.25 | | gpt-4-turbo | $10.00 | $30.00 | | gpt-4 | $30.00 | $60.00 | | gpt-4-mini | $3.00 | $10.00 | | gemini-pro | $2.50 | $7.50 | | gemini-flash | $0.075 | $0.30 |

Types

ProviderPricing

interface ProviderPricing {
  input: number;   // cost per million input tokens
  output: number;  // cost per million output tokens
}

CostOptions

interface CostOptions {
  customPricing?: Record<string, ProviderPricing>;  // override or extend DEFAULT_PRICING
  includeToolCosts?: boolean;                        // default: true — include tool invocation costs
  toolInvocationCost?: number;                       // default: 0.0001 — cost per tool call
}

TrackerTurnCost

interface TurnCost {
  turn_id: number;
  cost: number;          // same as total_cost
  llm_cost: number;      // LLM token cost for this turn
  tool_cost: number;     // tool invocation cost for this turn
  total_cost: number;    // llm_cost + tool_cost
  input_tokens?: number;
  output_tokens?: number;
}

BudgetConfig

interface BudgetConfig {
  perTask?: number;        // max cost per individual turn
  perTrajectory?: number;  // max cost per trajectory
  daily?: number;          // max cumulative cost per day
  perToolCall?: number;    // max cost per tool call
}

BudgetCheckResult

interface BudgetCheckResult {
  withinBudget: boolean;      // true if all checks pass
  currentCost: number;        // the cost being checked
  budgetLimit: number;        // the applicable budget limit
  usagePercentage: number;    // percentage of budget consumed (0–100+)
  alerts: BudgetAlert[];      // triggered alerts
  recommendations: string[];  // cost-reduction suggestions
}

BudgetAlert

interface BudgetAlert {
  level: 'info' | 'warning' | 'error';
  message: string;
  threshold: number;
  current: number;
  action: 'log' | 'warn' | 'block';
}

Related Packages

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

@reaatech/agent-eval-harness-cost

Installation

Feature Overview

Quick Start

API Reference

Cost Calculation Functions

Budget Functions

CostTracker Class

Reporting Functions

Constants

DEFAULT_PRICING

Types

ProviderPricing

CostOptions

TrackerTurnCost

BudgetConfig

BudgetCheckResult

BudgetAlert

Related Packages

License