ai-evaluate
v2.2.0
Secure code execution in sandboxed environments
ai-evaluate
You need to run user code. But untrusted code is terrifying.
One malicious snippet could crash your server, access your file system, or make unauthorized network requests. You've seen the horror stories. You know the risks.
What if you could run any code with confidence?
The Solution
ai-evaluate runs untrusted code in V8 isolates with no access to your system: no file system, and network access that you can allow, block, or restrict to an allowlist.
// Before: Dangerous eval
const result = eval(userCode) // Could do ANYTHING
// After: Sandboxed execution
import { evaluate } from 'ai-evaluate'
const result = await evaluate({ script: userCode }, env)
// Runs in isolated V8 context - your system is protected
Quick Start
REST API (eval.workers.do)
Try it now with curl:
# Simple script execution
curl -X POST https://eval.workers.do \
-H "Content-Type: application/json" \
-d '{"script": "return 1 + 1"}'
# {"success":true,"value":2,"logs":[],"duration":2}
# With module exports
curl -X POST https://eval.workers.do \
-H "Content-Type: application/json" \
-d '{"module": "export const add = (a, b) => a + b", "script": "return add(2, 3)"}'
# {"success":true,"value":5,"logs":[],"duration":2}
# With console output
curl -X POST https://eval.workers.do \
-H "Content-Type: application/json" \
-d '{"script": "console.log(42); return 42"}'
# {"success":true,"value":42,"logs":[{"level":"log","message":"42",...}],"duration":2}
# With external imports (npm packages)
curl -X POST https://eval.workers.do \
-H "Content-Type: application/json" \
-d '{"script": "return _.chunk([1, 2, 3, 4, 5, 6], 2)", "imports": ["lodash"]}'
# {"success":true,"value":[[1,2],[3,4],[5,6]],"logs":[],"duration":42}
# With versioned imports
curl -X POST https://eval.workers.do \
-H "Content-Type: application/json" \
-d '{"script": "return dayjs().format(\"YYYY-MM-DD\")", "imports": ["[email protected]"]}'
# {"success":true,"value":"2026-01-25","logs":[],"duration":35}
Deploy Your Own
cd example
pnpm install
pnpm deploy
See example/ for a complete working Worker.
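For callers who prefer TypeScript over curl, the REST requests shown in the curl examples can be assembled as in the sketch below. `buildEvalRequest`, `EvalPayload`, and `EvalRequest` are hypothetical names introduced for illustration and are not part of ai-evaluate; the payload fields are limited to the ones that appear in the curl calls.

```typescript
// Request body fields seen in the curl examples (script, module, imports).
interface EvalPayload {
  script?: string
  module?: string
  imports?: string[]
}

// Plain description of an HTTP request, shaped so it can be handed to fetch(url, init).
interface EvalRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Endpoint taken from the curl examples.
const EVAL_ENDPOINT = 'https://eval.workers.do'

// Hypothetical helper (not an ai-evaluate export): builds the POST request.
function buildEvalRequest(payload: EvalPayload, endpoint: string = EVAL_ENDPOINT): EvalRequest {
  return {
    url: endpoint,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  }
}

const req = buildEvalRequest({ script: 'return 1 + 1' })
console.log(req.init.body) // {"script":"return 1 + 1"}
```

Passing `req.url` and `req.init` to `fetch` and reading the JSON body should give an object of the same shape as the responses shown after each curl call.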
Cloudflare Workers (Production)
1. Install
pnpm add ai-evaluate
2. Configure wrangler.jsonc
Important: Requires wrangler v4+ (pnpm add -D wrangler@4)
{
"name": "my-worker",
"main": "src/index.ts",
"compatibility_date": "2026-01-01",
"worker_loaders": [
{ "binding": "loader" }
]
}
3. Use in your Worker
import { evaluate } from 'ai-evaluate'
export default {
async fetch(request: Request, env: Env) {
const result = await evaluate({ script: '1 + 1' }, env)
return Response.json(result)
// { success: true, value: 2, logs: [], duration: 5 }
}
}
interface Env {
loader: unknown
}
Node.js / Local Development
For local development, import from the /node subpath which uses Miniflare:
pnpm add ai-evaluate miniflare
import { evaluate } from 'ai-evaluate/node'
const result = await evaluate({ script: '1 + 1' })
// { success: true, value: 2, logs: [], duration: 50 }
API Reference
evaluate(options, env?)
interface EvaluateOptions {
module?: string // Module code with exports
tests?: string // Vitest-style test code
script?: string // Script to execute
timeout?: number // Default: 5000ms, max: 60000ms
env?: Record<string, string> // Environment variables
sdk?: SDKConfig | boolean // Enable $, db, ai globals
imports?: string[] // External npm packages (see below)
}
External Imports
The imports option lets you use npm packages in your sandboxed code. Supports:
// Bare package names (auto-resolved via esm.sh)
imports: ['lodash', 'dayjs', 'uuid']
// Versioned packages
imports: ['[email protected]', '[email protected]']
// Scoped packages
imports: ['@faker-js/faker']
// Full URLs (for custom CDNs)
imports: ['https://esm.sh/[email protected]']
Imported packages are available as globals matching their package name:
- lodash → _ (special case) and lodash
- dayjs → dayjs
- uuid → uuid
const result = await evaluate({
imports: ['lodash', 'dayjs'],
script: `
const chunks = _.chunk([1,2,3,4,5,6], 2)
const today = dayjs().format('YYYY-MM-DD')
return { chunks, today }
`
}, env)
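To make the "auto-resolved via esm.sh" note concrete, the sketch below shows one way an entry of the `imports` array could be turned into a URL. It is an illustration only: `toImportUrl` is a hypothetical helper, and the rule that bare specifiers map to `https://esm.sh/<specifier>` is an assumption based on how esm.sh URLs are formed, not a description of ai-evaluate's internals. The CDN URL in the sample is a placeholder.

```typescript
// Hypothetical helper (not an ai-evaluate export): maps one `imports` entry to a URL.
function toImportUrl(specifier: string): string {
  // Full URLs (for custom CDNs) pass through unchanged.
  if (/^https?:\/\//.test(specifier)) return specifier
  // Assumption: bare, versioned, and scoped names are appended to the esm.sh origin.
  return `https://esm.sh/${specifier}`
}

const urls = ['lodash', '@faker-js/faker', 'https://cdn.example.com/pkg.js'].map(toImportUrl)
console.log(urls)
// [ 'https://esm.sh/lodash', 'https://esm.sh/@faker-js/faker', 'https://cdn.example.com/pkg.js' ]
```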
interface EvaluateResult {
success: boolean // Execution succeeded
value?: unknown // Script return value
logs: LogEntry[] // Console output
testResults?: TestResults // Test results if tests provided
error?: string // Error message if failed
duration: number // Execution time in ms
}
createEvaluator(env)
Bind to a Cloudflare Workers environment for cleaner syntax:
import { createEvaluator } from 'ai-evaluate'
export default {
async fetch(request, env) {
const sandbox = createEvaluator(env)
const result = await sandbox({ script: '1 + 1' })
return Response.json(result)
}
}
Usage Examples
Simple Script
const result = await evaluate({
script: `
const x = 10
const y = 20
return x + y
`
}, env)
// result.value === 30
Module with Exports
const result = await evaluate({
module: `
export const greet = (name) => \`Hello, \${name}!\`
export const sum = (...nums) => nums.reduce((a, b) => a + b, 0)
`,
script: `
console.log(greet('World'))
return sum(1, 2, 3, 4, 5)
`
}, env)
// result.value === 15
// result.logs[0].message === 'Hello, World!'
Testing User Code
const result = await evaluate({
module: `
export const isPrime = (n) => {
if (n < 2) return false
for (let i = 2; i <= Math.sqrt(n); i++) {
if (n % i === 0) return false
}
return true
}
`,
tests: `
describe('isPrime', () => {
it('returns false for numbers less than 2', () => {
expect(isPrime(0)).toBe(false)
expect(isPrime(1)).toBe(false)
})
it('returns true for prime numbers', () => {
expect(isPrime(2)).toBe(true)
expect(isPrime(17)).toBe(true)
})
it('returns false for composite numbers', () => {
expect(isPrime(4)).toBe(false)
expect(isPrime(100)).toBe(false)
})
})
`
}, env)
// result.testResults = { total: 3, passed: 3, failed: 0, ... }
Test Framework
Full vitest-compatible API with async support.
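Once tests have run, the `testResults` object (its shape is listed under Types) can be condensed into a short report. The sketch below is one possible way to do that; `summarizeTests` is a hypothetical helper rather than part of ai-evaluate, the sample data is made up for illustration, and it assumes every entry with `passed === false` should be reported as a failure.

```typescript
// Shapes copied from the Types section of this README.
interface TestResult {
  name: string
  passed: boolean
  error?: string
  duration: number
}
interface TestResults {
  total: number
  passed: number
  failed: number
  skipped: number
  tests: TestResult[]
  duration: number
}

// Hypothetical helper: one header line plus one line per non-passing test.
function summarizeTests(results: TestResults): string {
  const header = `${results.passed}/${results.total} passed, ${results.failed} failed, ${results.skipped} skipped`
  const failures = results.tests
    .filter((t) => !t.passed)
    .map((t) => `  FAIL ${t.name}: ${t.error ?? 'no error message'}`)
  return [header, ...failures].join('\n')
}

// Made-up sample data, only to show the output format.
const sample: TestResults = {
  total: 2,
  passed: 1,
  failed: 1,
  skipped: 0,
  duration: 3,
  tests: [
    { name: 'returns true for prime numbers', passed: true, duration: 1 },
    { name: 'returns false for composite numbers', passed: false, error: 'expected true to be false', duration: 2 },
  ],
}
console.log(summarizeTests(sample))
```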
Test Structure
describe('group', () => {
it('test name', () => { /* ... */ })
test('another test', () => { /* ... */ })
it.skip('skipped', () => { /* ... */ })
it.only('focused', () => { /* ... */ })
})
Async Tests
it('async/await', async () => {
const result = await someAsyncFunction()
expect(result).toBe('expected')
})
Hooks
describe('with setup', () => {
let data
beforeEach(() => { data = { count: 0 } })
afterEach(() => { data = null })
it('uses setup', () => {
data.count++
expect(data.count).toBe(1)
})
})
Matchers
// Equality
expect(value).toBe(expected)
expect(value).toEqual(expected)
expect(value).toStrictEqual(expected)
// Truthiness
expect(value).toBeTruthy()
expect(value).toBeFalsy()
expect(value).toBeNull()
expect(value).toBeUndefined()
expect(value).toBeDefined()
// Numbers
expect(value).toBeGreaterThan(n)
expect(value).toBeLessThan(n)
expect(value).toBeCloseTo(n, digits)
// Strings & Arrays
expect(value).toMatch(/pattern/)
expect(value).toContain(item)
expect(value).toHaveLength(n)
// Objects
expect(value).toHaveProperty('path')
expect(value).toMatchObject(partial)
// Errors
expect(fn).toThrow()
expect(fn).toThrow('message')
// Negation
expect(value).not.toBe(expected)
// Promises
await expect(promise).resolves.toBe(value)
await expect(promise).rejects.toThrow('error')
REPL Sessions
For interactive or multi-step evaluations, use the /repl export:
import { createReplSession } from 'ai-evaluate/repl'
// Create a persistent session
const session = await createReplSession({ local: true })
// Evaluate multiple expressions with shared context
await session.eval('const sum = (a, b) => a + b')
const result = await session.eval('sum(1, 2)')
console.log(result.value) // 3
// Context persists across evaluations
await session.eval('const x = 10')
const result2 = await session.eval('sum(x, 5)')
console.log(result2.value) // 15
// Clean up
await session.close()
REPL Configuration
interface ReplSessionConfig {
local?: boolean // Use Miniflare (default: false, uses remote)
auth?: string // Auth token for remote execution
sdk?: SDKConfig | boolean // Enable platform primitives ($, db, ai)
prelude?: string // Code to run at session start
timeout?: number // Eval timeout in ms (default: 5000)
allowNetwork?: boolean // Allow fetch (default: true)
}
Quick Eval
For one-off evaluations without session management:
import { quickEval } from 'ai-evaluate/repl'
const result = await quickEval('1 + 2 * 3')
console.log(result.value) // 7
Requirements
| Environment | Requirement |
|-------------|-------------|
| Cloudflare Workers | wrangler v4+, worker_loaders binding |
| Node.js | miniflare (peer dependency) |
Security Model
| Protection | Description |
|------------|-------------|
| V8 Isolate | Code runs in isolated V8 context |
| Network Control | Configurable: allow, block, or allowlist |
| No File System | Zero filesystem access |
| Memory Limits | Standard Worker limits apply |
| CPU Limits | Execution time bounded |
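The three network modes in the Network Control row (allow, block, allowlist) can be pictured as a host check like the one below. This is a sketch of the idea, not ai-evaluate's implementation: `isHostAllowed` and `NetworkPolicy` are hypothetical names, and the wildcard rule (a leading `*.` matches subdomains but not the bare domain) is an assumption.

```typescript
// Mirrors the three shapes of the `fetch` option documented under Network Access Control:
// true (allow all), false (block all), or a list of host patterns.
type NetworkPolicy = boolean | string[]

// Hypothetical helper: decides whether a hostname passes the policy.
function isHostAllowed(host: string, policy: NetworkPolicy): boolean {
  if (typeof policy === 'boolean') return policy
  const name = host.toLowerCase()
  return policy.some((pattern) => {
    const p = pattern.toLowerCase()
    if (p.startsWith('*.')) {
      // Assumption: "*.trusted.com" covers subdomains only, not "trusted.com" itself.
      const suffix = p.slice(1) // ".trusted.com"
      return name.endsWith(suffix) && name.length > suffix.length
    }
    // Patterns without a wildcard must match the hostname exactly.
    return name === p
  })
}

const allowlist = ['api.example.com', '*.trusted.com']
console.log(isHostAllowed('api.example.com', allowlist)) // true
console.log(isHostAllowed('cdn.trusted.com', allowlist)) // true
console.log(isHostAllowed('evil.example.org', allowlist)) // false
```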
Network Access Control
// Allow all network (default)
await evaluate({ script: '...', fetch: true })
// Block all network
await evaluate({ script: '...', fetch: false })
// Allowlist specific domains (wildcards supported)
await evaluate({
script: '...',
fetch: ['api.example.com', '*.trusted.com']
})
Troubleshooting
"Unexpected fields found in top-level field: worker_loaders"
Upgrade wrangler to v4+:
pnpm add -D wrangler@4
"Code generation from strings disallowed"
User code must be embedded at build time, not evaluated with new Function() or eval(). This is handled automatically by ai-evaluate - just pass your code as strings to evaluate().
"No loader binding"
Ensure your wrangler.jsonc has the worker_loaders config and you're passing env to evaluate():
{
"worker_loaders": [{ "binding": "loader" }]
}
await evaluate({ script: code }, env) // Don't forget env!
Types
interface LogEntry {
level: 'log' | 'warn' | 'error' | 'info' | 'debug'
message: string
timestamp: number
}
interface TestResults {
total: number
passed: number
failed: number
skipped: number
tests: TestResult[]
duration: number
}
interface TestResult {
name: string
passed: boolean
error?: string
duration: number
}
Stop worrying about untrusted code. Start building.
pnpm add ai-evaluate