# verifiers-ts

v0.0.1-alpha.18
TypeScript implementation of the verifiers framework for building RL environments and evaluations with AI SDK integration.
## Overview

`verifiers-ts` provides the same core functionality as the Python verifiers library, enabling you to:
- Define custom interaction protocols between models and environments
- Build agents, multi-turn conversations, tool-augmented reasoning, and interactive games
- Create reusable evaluation environments with multi-criteria reward functions
- Integrate with AI SDK for model inference and native tool calling
## Installation

```bash
npm install verifiers-ts
```

Or if developing locally:

```bash
cd verifiers-ts
npm install
npm run build
```

## Quick Start
### Scaffold a Minimal RL Environment

```bash
pnpm dlx verifiers-ts vf-init weather-bot --minimal-rl
cd weather-bot
pnpm install
pnpm build
pnpm vf-eval -n 1 -r 1
```

This template scaffolds a complete minimal environment: a tool-enabled agent, a tiny dataset, and a reward built with `structuredOutputReward`. Replace the prompt, tweak the agent defaults, and you're ready to evaluate. Remember to export `OPENAI_API_KEY` (or pass `--api-key` to `vf-eval`).
### Scaffold an Environment

```bash
pnpm dlx verifiers-ts vf-init my-environment
cd my-environment
pnpm install
pnpm build
pnpm vf-eval -n 1 -r 1
```

Customize the generated `src/index.ts`, dataset, and reward functions to match your task.

`vf-eval` automatically compiles your TypeScript, provisions a local `.vf-eval/` virtualenv, and exposes the environment to Python tooling; no manual `uv sync` is required. Provide `OPENAI_API_KEY` (or another provider key) so the default agent can make model calls.
## Minimal RL Environment

```ts
import { generateText, tool } from "ai";
import { z } from "zod";
import { openai } from "@ai-sdk/openai";
import { createRLEnvironment } from "verifiers-ts";

const getCurrentWeather = tool({
  description: "Get the current weather for a specific location.",
  parameters: z.object({
    location: z
      .string()
      .describe("City and state, for example: Seattle, WA"),
    unit: z
      .enum(["celsius", "fahrenheit"])
      .describe("Temperature unit to return.")
      .optional(),
  }),
  execute: async ({ location, unit }) => {
    const preferredUnit = unit ?? "celsius";
    const temperature = preferredUnit === "celsius" ? 18 : 64;
    return `It is ${temperature}°${preferredUnit === "celsius" ? "C" : "F"} and sunny in ${location}.`;
  },
});

const weatherAgent = {
  generateText: (messages: any, options: Record<string, unknown> = {}) => {
    const { tools = {}, ...rest } = options as {
      tools?: Record<string, ReturnType<typeof tool>>;
    };
    return generateText({
      model: openai("gpt-4o-mini") as any,
      system:
        "You are WeatherBot. When a user asks about the weather, call the getCurrentWeather tool and report the results clearly.",
      temperature: 0,
      tools: { getCurrentWeather, ...tools },
      messages,
      ...rest,
    });
  },
  tools: { getCurrentWeather },
};

const env = await createRLEnvironment({
  agent: weatherAgent,
  dataset: [
    {
      prompt: [
        {
          role: "user",
          content: "What's the weather like in Seattle right now?",
        },
      ],
      answer: "seattle",
    },
  ],
  rewardFunction: (completion, answer) => {
    // Collect assistant text, whether the completion is a message
    // list or a plain string.
    const text = Array.isArray(completion)
      ? completion
          .filter(
            (msg) =>
              typeof msg === "object" &&
              msg !== null &&
              "role" in msg &&
              msg.role === "assistant"
          )
          .map((msg) => (msg as { content?: string }).content ?? "")
          .join(" ")
      : typeof completion === "string"
        ? completion
        : "";
    const normalized = text.toLowerCase();
    return normalized.includes(answer) && normalized.includes("weather") ? 1 : 0;
  },
});
```

## Single-Turn Environment
```ts
import { SingleTurnEnv, Rubric, Parser } from "verifiers-ts";

// Minimal helper (not part of verifiers-ts): pull assistant text
// out of a completion that may be a plain string or a message list.
function extractText(completion: any): string {
  if (typeof completion === "string") return completion;
  if (Array.isArray(completion)) {
    return completion
      .filter((msg) => msg?.role === "assistant")
      .map((msg) => msg?.content ?? "")
      .join(" ");
  }
  return "";
}

function correctAnswer(params: {
  completion: any;
  answer: string;
}): number {
  const text = extractText(params.completion);
  return text.trim() === params.answer.trim() ? 1.0 : 0.0;
}

const rubric = new Rubric({
  funcs: [correctAnswer],
  weights: [1.0],
});

const env = new SingleTurnEnv({
  dataset: myDataset,
  systemPrompt: "Solve step by step",
  rubric,
});

const results = await env.evaluate(
  "gpt-4",
  {},
  10, // numExamples
  1, // rolloutsPerExample
  true, // scoreRollouts
  32, // maxConcurrent
  undefined, // maxConcurrentGeneration
  undefined, // maxConcurrentScoring
  process.env.OPENAI_API_KEY
);
```

## Tool-Using Environment
```ts
import { ToolEnv, defineTool } from "verifiers-ts";
import { z } from "zod";

const calculator = defineTool(
  "calculate",
  "Perform arithmetic",
  z.object({
    expression: z.string(),
  }),
  async (args) => {
    return eval(args.expression); // Use a proper expression parser in production
  }
);

const env = new ToolEnv({
  tools: [calculator],
  maxTurns: 10,
});

// The AI SDK automatically handles the tool-calling loop
const results = await env.evaluate("gpt-4", {}, 10);
```

## Architecture
The library mirrors the Python verifiers structure:
- Environments: Base `Environment` class with `MultiTurnEnv`, `SingleTurnEnv`, `ToolEnv`, `StatefulToolEnv`, and `SandboxEnv` variants
- Rubrics: Weighted reward functions for evaluation
- Parsers: Extract structured information (`Parser`, `ThinkParser`, `XMLParser`)
- Tools: Native AI SDK tool integration using the `tool()` function from the `ai` package
- AI SDK Integration: Uses `generateText` for model calls and automatic tool calling
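To make the parser layer concrete, here is a dependency-free sketch of `ThinkParser`-style extraction. The `parseThink` helper below is a hypothetical stand-in, not the real `verifiers-ts` API: it strips a `<think>…</think>` reasoning block and returns only the final answer text.

```typescript
// Hypothetical stand-in for ThinkParser-style extraction:
// everything after the closing </think> tag is treated as the
// model's final answer.
function parseThink(completion: string): string {
  const closeTag = "</think>";
  const idx = completion.lastIndexOf(closeTag);
  // No reasoning block: the whole completion is the answer.
  if (idx === -1) return completion.trim();
  return completion.slice(idx + closeTag.length).trim();
}
```

A reward function can then compare `parseThink(completion)` against the reference answer instead of matching against the raw reasoning text.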
## Key Features

### AI SDK Integration

- Native Tool Calling: Tools use AI SDK's `tool()` function with Zod schemas
- Automatic Loop Handling: AI SDK manages tool execution loops with `stopWhen` conditions
- Type-Safe Tools: Zod schemas provide runtime validation and TypeScript types
- Structured Outputs: Support for `generateObject` when needed
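The loop that `stopWhen` bounds can be sketched without the AI SDK. Everything below (`FakeModel`, `runToolLoop`, `maxSteps`) is illustrative only and mimics the shape of the real behavior: the model is called repeatedly, requested tools are executed and their results fed back, and the loop stops when the model answers in plain text or the step budget runs out.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text?: string; toolCalls?: ToolCall[] };

// Hypothetical model: inspects the tool-result history and returns
// either a final text answer or more tool calls.
type FakeModel = (history: string[]) => ModelTurn;

// Sketch of the loop the AI SDK runs internally; maxSteps plays the
// role of a stopWhen step budget.
function runToolLoop(
  model: FakeModel,
  tools: Record<string, (args: Record<string, unknown>) => string>,
  maxSteps: number
): string {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history);
    if (turn.text !== undefined) return turn.text; // model produced a final answer
    for (const call of turn.toolCalls ?? []) {
      const result = tools[call.name](call.args); // execute the requested tool
      history.push(`${call.name} -> ${result}`); // feed the result back
    }
  }
  return "[stopped: step budget exhausted]";
}
```

In the real SDK this loop is hidden behind `generateText`; the sketch only shows why a step budget is needed to keep a tool-happy model from looping forever.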
### Compatibility

- Results Format: Saves results in a JSONL format compatible with the Python `vf-tui`
- Native TypeScript Evaluation: TypeScript projects use the native `vf-eval` CLI (no Python bridge needed)
- Native Sandbox Client: Direct HTTP API integration with Prime Intellect sandboxes (no Python dependencies)
- State Management: Same state structure as the Python verifiers
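JSONL compatibility comes down to emitting one JSON object per line. A minimal sketch (the record fields shown here, `prompt` and `reward`, are illustrative, not the exact schema `vf-tui` expects):

```typescript
// Serialize rollout records as JSONL: one JSON object per line.
function toJsonl(records: Array<Record<string, unknown>>): string {
  return records.map((r) => JSON.stringify(r)).join("\n");
}

// Parse JSONL back into records, skipping blank lines.
function fromJsonl(text: string): Array<Record<string, unknown>> {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}
```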
## Environment Types

### SingleTurnEnv

For Q&A tasks requiring a single model response.

### MultiTurnEnv

Base class for custom interaction protocols. Override `is_completed` and `env_response`.

### ToolEnv

Uses AI SDK's native tool calling. Tools are defined with `defineTool()` and handled automatically by the AI SDK.

### StatefulToolEnv

Extends `ToolEnv` for tools requiring dynamic state (e.g., sandbox IDs).

### SandboxEnv

Abstract base for Prime Intellect sandbox integration.
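The multi-turn contract can be sketched abstractly. The class below is a dependency-free illustration, not the real base class (the real API overrides `is_completed` and `env_response`): a rollout alternates model and environment turns until the subclass says the episode is over or the turn budget is hit.

```typescript
type Message = { role: "user" | "assistant" | "env"; content: string };

// Dependency-free sketch of the MultiTurnEnv protocol.
abstract class SketchMultiTurnEnv {
  constructor(private maxTurns: number) {}

  // Subclasses decide when the episode ends and how the env replies.
  abstract isCompleted(messages: Message[]): boolean;
  abstract envResponse(messages: Message[]): Message;

  rollout(model: (messages: Message[]) => Message, prompt: Message): Message[] {
    const messages: Message[] = [prompt];
    for (let turn = 0; turn < this.maxTurns; turn++) {
      messages.push(model(messages)); // model's move
      if (this.isCompleted(messages)) break;
      messages.push(this.envResponse(messages)); // environment's reply
    }
    return messages;
  }
}
```

A hangman-style environment, for example, would end the episode in `isCompleted` when the word is fully guessed and report revealed letters from `envResponse`.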
## Evaluation

TypeScript environments are evaluated natively using the TypeScript `vf-eval` CLI:

```bash
npx vf-eval hangman -n 5 -r 1
```

The CLI automatically:

- Detects TypeScript projects (those with a `package.json` containing `verifiers.envId` but no `pyproject.toml`)
- Uses the native TypeScript evaluation implementation
- Saves results in a JSONL format compatible with `vf-tui`

For Python projects, `vf-eval` delegates to the Python verifiers CLI.
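The detection rule above can be expressed as a small predicate. This is an illustrative reimplementation, not the CLI's actual code; it takes the parsed `package.json` and a flag for whether a `pyproject.toml` exists alongside it.

```typescript
// Illustrative version of the vf-eval project-detection rule:
// a TypeScript environment has verifiers.envId in package.json
// and no pyproject.toml next to it.
function isTypeScriptEnvironment(
  packageJson: { verifiers?: { envId?: string } } | null,
  hasPyprojectToml: boolean
): boolean {
  return Boolean(packageJson?.verifiers?.envId) && !hasPyprojectToml;
}
```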
## Sandbox Support

Sandbox environments (using `SandboxEnv`) use a native TypeScript HTTP client to interact with Prime Intellect sandboxes. No Python dependencies are required.

Configuration:

- Set the `PRIME_INTELLECT_API_KEY` or `PRIME_API_KEY` environment variable
- Optional: Set `PRIME_INTELLECT_API_URL` (default: `https://api.primeintellect.ai`)
- Optional: Set `PRIME_INTELLECT_TEAM_ID` for team-scoped sandboxes
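Resolution of these variables might look like the sketch below. The precedence shown (primary key before fallback, URL defaulting) is an assumption based on the list above, not the library's actual resolver.

```typescript
interface SandboxConfig {
  apiKey: string;
  apiUrl: string;
  teamId?: string;
}

// Sketch: resolve sandbox configuration from environment variables,
// preferring PRIME_INTELLECT_API_KEY over PRIME_API_KEY.
function resolveSandboxConfig(
  env: Record<string, string | undefined>
): SandboxConfig {
  const apiKey = env.PRIME_INTELLECT_API_KEY ?? env.PRIME_API_KEY;
  if (!apiKey) {
    throw new Error("Set PRIME_INTELLECT_API_KEY or PRIME_API_KEY");
  }
  return {
    apiKey,
    apiUrl: env.PRIME_INTELLECT_API_URL ?? "https://api.primeintellect.ai",
    teamId: env.PRIME_INTELLECT_TEAM_ID,
  };
}
```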
## Examples

See the `environments/` directory for example implementations:

- `example-single-turn`: Basic Q&A environment
- `example-tool-use`: Tool calling with the AI SDK
## Development

This workspace uses Turborepo for task orchestration and caching. Use `turbo run` commands to build all packages with automatic dependency resolution and caching.

```bash
# Install dependencies
pnpm install

# Build all packages (core + environments)
pnpm turbo run build

# Build a specific environment
pnpm turbo run build --filter hangman

# Run tests
pnpm turbo run test

# Lint all packages
pnpm turbo run lint

# Format code
pnpm turbo run format

# Watch mode (runs all dev tasks in parallel)
pnpm turbo run dev --parallel

# Watch a specific environment
pnpm turbo run dev --parallel --filter hangman
```

### Turbo Features

- Task Dependencies: Builds automatically respect workspace dependencies (`dependsOn: ["^build"]`)
- Local Caching: Build outputs are cached locally for faster rebuilds
- Parallel Execution: Dev tasks run in parallel across packages
- Filtering: Use `--filter <package-name>` to target specific packages

For remote caching (CI/CD), set the `TURBO_TEAM` and `TURBO_TOKEN` environment variables.
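A `turbo.json` wired this way might look like the fragment below. This is an illustrative sketch, not the repo's actual configuration; task names and `outputs` globs are assumptions, and older Turborepo versions use a `pipeline` key instead of `tasks`.

```json
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "dev": {
      "cache": false,
      "persistent": true
    },
    "test": {
      "dependsOn": ["build"]
    }
  }
}
```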
## Status

- ✅ Core Complete: All base classes and AI SDK integration implemented
- 🔄 In Progress: Python bridge refinement
- 📝 Pending: Comprehensive tests and examples

## License

MIT
