tool-sandbox

Library for executing code and calling tools in a sandbox. Particularly useful for letting AI agents write and execute code, following the code execution pattern for AI agents.

Why?

When agents call tools directly, every tool definition and intermediate result flows through the context window. This gets expensive fast.

Code execution solves this. The agent writes code that calls tools, which:

  • Saves tokens: Load tool definitions on-demand, filter data before returning to the model.
  • Enables complex logic: Loops, conditionals, error handling in one execution instead of many tool calls.
  • Keeps data private: Intermediate results stay in the execution environment.
  • Runs safely: Unlike eval(), code runs in a WASM sandbox with no filesystem, network, or Node.js access.
  • Runs anywhere: Works in Node.js, browsers, Deno, Bun, and Cloudflare Workers.

Quick Start

npm install tool-sandbox

import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
  {
    name: 'listUsers',
    description: 'List all users',
    inputSchema: {type: 'object'},
    handler: async () => [
      // In real life, you'd fetch this from an API
      // or plug in an MCP server (see below!)
      {email: '[email protected]', active: false},
      {email: '[email protected]', active: true},
    ],
  },
  {
    name: 'sendReactivationEmail',
    description: 'Send a reactivation email to a user',
    inputSchema: {type: 'object', properties: {to: {type: 'string'}}},
    handler: async () => ({sent: true}),
  },
];

const sandbox = await createSandbox({tools});

// Code can be generated by an LLM - see "Using with LLMs" below
await sandbox.execute.handler({
  code: `
    const users = await tool('listUsers', {});
    const inactiveUsers = users.filter(u => !u.active);

    for (const user of inactiveUsers) {
      await tool('sendReactivationEmail', {to: user.email});
    }

    return {notified: inactiveUsers.length};
  `,
});

The agent fetches users, filters to the inactive ones, and sends reactivation emails, all in a single execution instead of many separate tool calls.

Using with LLMs

sandbox.execute is a Tool object you can pass to any LLM:

Anthropic SDK:

import Anthropic from '@anthropic-ai/sdk';
import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
	{
		name: 'getWeather',
		description: 'Get weather for a city',
		inputSchema: {type: 'object', properties: {city: {type: 'string'}}, required: ['city']},
		handler: async (args) => ({temp: 72, conditions: 'sunny'}),
	},
];

const anthropic = new Anthropic();
const sandbox = await createSandbox({tools});

const messages: Anthropic.MessageParam[] = [
	{role: 'user', content: 'What is the weather in Tokyo and Paris?'},
];

// Agent loop - continue until model stops calling tools
while (true) {
	const response = await anthropic.messages.create({
		model: 'claude-haiku-4-5',
		tools: [{
			name: sandbox.execute.name,
			description: sandbox.execute.description,
			input_schema: sandbox.execute.inputSchema,
		}],
		messages,
	});

	messages.push({role: 'assistant', content: response.content});

	if (response.stop_reason === 'end_turn') {
		const text = response.content.filter((b): b is Anthropic.TextBlock => b.type === 'text');
		console.log('Response:', text.map((b) => b.text).join(''));
		break;
	}

	const toolResults: Anthropic.ToolResultBlockParam[] = [];
	for (const block of response.content) {
		if (block.type === 'tool_use' && block.name === 'execute') {
			const result = await sandbox.execute.handler(block.input);
			toolResults.push({type: 'tool_result', tool_use_id: block.id, content: JSON.stringify(result)});
		}
	}
	messages.push({role: 'user', content: toolResults});
}

Anthropic via fetch (no SDK):

import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
	{
		name: 'getWeather',
		description: 'Get weather for a city',
		inputSchema: {type: 'object', properties: {city: {type: 'string'}}, required: ['city']},
		handler: async (args) => ({temp: 72, conditions: 'sunny'}),
	},
];

const sandbox = await createSandbox({tools});

const messages = [
	{role: 'user', content: 'What is the weather in Tokyo?'},
];

while (true) {
	const response = await fetch('https://api.anthropic.com/v1/messages', {
		method: 'POST',
		headers: {
			'Content-Type': 'application/json',
			'x-api-key': process.env.ANTHROPIC_API_KEY,
			'anthropic-version': '2023-06-01',
		},
		body: JSON.stringify({
			model: 'claude-haiku-4-5',
			tools: [{
				name: sandbox.execute.name,
				description: sandbox.execute.description,
				input_schema: sandbox.execute.inputSchema,
			}],
			messages,
		}),
	});

	const data = await response.json();
	messages.push({role: 'assistant', content: data.content});

	if (data.stop_reason === 'end_turn') {
		console.log('Response:', data.content.filter((b) => b.type === 'text').map((b) => b.text).join(''));
		break;
	}

	const toolResults = [];
	for (const block of data.content) {
		if (block.type === 'tool_use' && block.name === 'execute') {
			const result = await sandbox.execute.handler(block.input);
			toolResults.push({type: 'tool_result', tool_use_id: block.id, content: JSON.stringify(result)});
		}
	}
	messages.push({role: 'user', content: toolResults});
}

OpenAI SDK:

import OpenAI from 'openai';
import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
	{
		name: 'getWeather',
		description: 'Get weather for a city',
		inputSchema: {type: 'object', properties: {city: {type: 'string'}}, required: ['city']},
		handler: async (args) => ({temp: 72, conditions: 'sunny'}),
	},
];

const openai = new OpenAI();
const sandbox = await createSandbox({tools});

const messages: OpenAI.ChatCompletionMessageParam[] = [
	{role: 'user', content: 'What is the weather in Tokyo and Paris?'},
];

// Agent loop - continue until model stops calling tools
while (true) {
	const response = await openai.chat.completions.create({
		model: 'gpt-5-mini-2025-08-07',
		tools: [{
			type: 'function',
			function: {
				name: sandbox.execute.name,
				description: sandbox.execute.description,
				parameters: sandbox.execute.inputSchema,
			},
		}],
		messages,
	});

	const choice = response.choices[0];
	messages.push(choice.message);

	if (!choice.message.tool_calls) {
		console.log('Response:', choice.message.content);
		break;
	}

	for (const toolCall of choice.message.tool_calls) {
		if (toolCall.function.name === 'execute') {
			const args = JSON.parse(toolCall.function.arguments);
			const result = await sandbox.execute.handler(args);
			messages.push({role: 'tool', tool_call_id: toolCall.id, content: JSON.stringify(result)});
		}
	}
}

OpenAI via fetch (no SDK):

import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
	{
		name: 'getWeather',
		description: 'Get weather for a city',
		inputSchema: {type: 'object', properties: {city: {type: 'string'}}, required: ['city']},
		handler: async (args) => ({temp: 72, conditions: 'sunny'}),
	},
];

const sandbox = await createSandbox({tools});

const messages = [
	{role: 'user', content: 'What is the weather in Tokyo?'},
];

while (true) {
	const response = await fetch('https://api.openai.com/v1/chat/completions', {
		method: 'POST',
		headers: {
			'Content-Type': 'application/json',
			'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
		},
		body: JSON.stringify({
			model: 'gpt-5-mini-2025-08-07',
			tools: [{
				type: 'function',
				function: {
					name: sandbox.execute.name,
					description: sandbox.execute.description,
					parameters: sandbox.execute.inputSchema,
				},
			}],
			messages,
		}),
	});

	const data = await response.json();
	const choice = data.choices[0];
	messages.push(choice.message);

	if (!choice.message.tool_calls) {
		console.log('Response:', choice.message.content);
		break;
	}

	for (const toolCall of choice.message.tool_calls) {
		if (toolCall.function.name === 'execute') {
			const args = JSON.parse(toolCall.function.arguments);
			const result = await sandbox.execute.handler(args);
			messages.push({role: 'tool', tool_call_id: toolCall.id, content: JSON.stringify(result)});
		}
	}
}

Google Gemini SDK:

import {GoogleGenAI, type Content} from '@google/genai';
import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
	{
		name: 'getWeather',
		description: 'Get weather for a city',
		inputSchema: {type: 'object', properties: {city: {type: 'string'}}, required: ['city']},
		handler: async (args) => ({temp: 72, conditions: 'sunny'}),
	},
];

const genai = new GoogleGenAI({apiKey: process.env.GEMINI_API_KEY});
const sandbox = await createSandbox({tools});

const contents: Content[] = [
	{role: 'user', parts: [{text: 'What is the weather in Tokyo and Paris?'}]},
];

// Agent loop - continue until model stops calling tools
while (true) {
	const response = await genai.models.generateContent({
		model: 'gemini-3-flash-preview',
		contents,
		config: {
			tools: [{
				functionDeclarations: [{
					name: sandbox.execute.name,
					description: sandbox.execute.description,
					parameters: sandbox.execute.inputSchema,
				}],
			}],
		},
	});

	const calls = response.functionCalls;
	if (!calls?.length) {
		console.log('Response:', response.text);
		break;
	}

	contents.push({role: 'model', parts: response.candidates?.[0]?.content?.parts ?? []});

	const functionResponses = [];
	for (const call of calls) {
		if (call.name === 'execute') {
			const result = await sandbox.execute.handler(call.args);
			functionResponses.push({name: call.name, response: result});
		}
	}
	contents.push({role: 'user', parts: functionResponses.map((r) => ({functionResponse: r}))});
}

Using with MCP

Convert MCP clients to sandbox tools:

npm install @modelcontextprotocol/sdk

import {createSandbox, fromMcpClients} from 'tool-sandbox';
import {Client} from '@modelcontextprotocol/sdk/client/index.js';
import {StreamableHTTPClientTransport} from '@modelcontextprotocol/sdk/client/streamableHttp.js';

// Connect to an MCP server (e.g. https://github.com/domdomegg/gmail-mcp)
const transport = new StreamableHTTPClientTransport(new URL('http://localhost:3000/mcp'));
const client = new Client({name: 'my-app', version: '1.0.0'});
await client.connect(transport);

const tools = await fromMcpClients({gmail: client});
const sandbox = await createSandbox({tools});

fromMcpClients fetches and wraps:

  • Tools → gmail__send, gmail__search, etc.
  • Prompts (with arguments) → gmail__prompt__compose
  • Resources → gmail__resource__inbox
  • Resource templates → parameterized resources like files__resource__file with {path}

Not supported: sampling, elicitation, roots, notifications, or other advanced MCP features.
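
Sandboxed code then calls the wrapped tools by these prefixed names. A minimal sketch, assuming the connected server exposes the gmail__search tool mentioned above and that it returns an array (both are assumptions about that particular server):

await sandbox.execute.handler({
  code: `
    // Hypothetical: tool names and result shapes depend on the connected MCP server
    const results = await tool('gmail__search', {query: 'is:unread'});
    return {unread: results.length};
  `,
});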

Sandbox Environment

Code in the sandbox has access to:

| API | Description |
|-----|-------------|
| tool(name, args) | Call a tool and await its result |
| console.log(...) | Debug output (visible to host) |
| store | Persistent object across executions |
| store._prev | Result from previous execution (read-only) |

Examples:

const _examples = [
  // Get a tool's schema on-demand
  `const schema = await tool('describe_tool', {name: 'gmail__send'});`,

  // Use store._prev to continue from last result
  `const updated = store._prev.map(x => x * 2); return updated;`,

  // Use store to accumulate across executions
  `store.total = (store.total || 0) + 1; return store.total;`,
];
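
From the host side, persistence just means calling execute more than once on the same sandbox. A minimal sketch, reusing the sandbox from the Quick Start (the exact return shape of execute.handler isn't shown here, so the results are left unused):

// First execution writes to the store; the store survives between executions
await sandbox.execute.handler({code: `store.total = 10; return store.total;`});

// A later execution can read what the previous one wrote,
// and store._prev holds the previous execution's return value
await sandbox.execute.handler({code: `return {total: store.total, prev: store._prev};`});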

Permissions

There are a couple of ways to control what the sandbox can do:

Per-tool-call checks

Use onBeforeToolCall to inspect each tool call and block dangerous ones:

import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
  {
    name: 'readFile',
    description: 'Read a file',
    inputSchema: {type: 'object', properties: {path: {type: 'string'}}},
    annotations: {readOnlyHint: true},
    handler: async (args) => 'file contents...',
  },
  {
    name: 'deleteFile',
    description: 'Delete a file',
    inputSchema: {type: 'object', properties: {path: {type: 'string'}}},
    handler: async (args) => ({deleted: true}),
  },
];

const readonlySandbox = await createSandbox({
  tools,
  onBeforeToolCall(event) {
    const tool = tools.find((t) => t.name === event.toolName);
    // Only allow tools marked as read-only
    if (!tool?.annotations?.readOnlyHint) {
      throw new Error(`Tool ${event.toolName} is not allowed`);
    }
  },
});

Pre-execution review

Review the code before executing it, using another model, a SAST tool, or other logic:

import Anthropic from '@anthropic-ai/sdk';
import {createSandbox, type Tool} from 'tool-sandbox';

const anthropic = new Anthropic();
const sandbox = await createSandbox({tools});

// Get code from the model
const response = await anthropic.messages.create({
  model: 'claude-haiku-4-5',
  tools: [{
    name: sandbox.execute.name,
    description: sandbox.execute.description,
    input_schema: sandbox.execute.inputSchema,
  }],
  messages: [{role: 'user', content: userPrompt}],
});

const block = response.content.find((b): b is Anthropic.ToolUseBlock => b.type === 'tool_use');
if (block) {
  const {code} = block.input as {code: string};

  // Review code somehow e.g. with SAST tool or another AI model
  const review = await anthropic.messages.create({
    model: 'claude-haiku-4-5',
    system: 'Review this code against <policy>. Respond COMPLIANT or NON_COMPLIANT with a brief reason.',
    messages: [{role: 'user', content: code}],
  });

  const verdict = (review.content[0] as Anthropic.TextBlock).text;
  if (!verdict.startsWith('COMPLIANT')) {
    throw new Error(`Code rejected: ${verdict}`);
  }

  await sandbox.execute.handler({code});
}

Other approaches: allowlists (only include safe tools), SAST tools, capability tokens.
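
For example, an allowlist can be as simple as filtering the tool list before creating the sandbox. A sketch, where the allowed set is hypothetical:

import {createSandbox, type Tool} from 'tool-sandbox';

// Hypothetical allowlist of tool names considered safe for this agent
const allowed = new Set(['readFile']);

// allTools would be your full tool set, e.g. readFile and deleteFile from above
const allTools: Tool[] = [];

const sandbox = await createSandbox({
  tools: allTools.filter((t) => allowed.has(t.name)),
});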

Other Use Cases

Audit logging:

import {createSandbox, type Tool} from 'tool-sandbox';

const auditLog: Array<{tool: string; args: unknown; result: unknown}> = [];

const tools: Tool[] = [
  {
    name: 'getUser',
    description: 'Get user by ID',
    inputSchema: {type: 'object', properties: {id: {type: 'string'}}},
    handler: async (args) => ({id: '1', name: 'Alice'}),
  },
];

const sandbox = await createSandbox({
  tools,
  onToolCallSuccess(event) {
    // Could also add onToolCallError to log failures,
    // or onBeforeToolCall to log before tools are called
    auditLog.push({
      tool: event.toolName,
      args: event.args,
      result: event.result,
    });
  },
});

Caching tool results:

import {createSandbox, type Tool} from 'tool-sandbox';

const cache = new Map<string, unknown>();

const tools: Tool[] = [
  {
    name: 'expensiveQuery',
    description: 'Run an expensive database query',
    inputSchema: {type: 'object', properties: {query: {type: 'string'}}},
    handler: async (args) => ({rows: []}),
  },
];

const sandbox = await createSandbox({
  tools,
  onBeforeToolCall(event) {
    const key = JSON.stringify([event.toolName, event.args]);
    if (cache.has(key)) {
      event.returnValue = cache.get(key);
    }
  },
  onToolCallSuccess(event) {
    const key = JSON.stringify([event.toolName, event.args]);
    cache.set(key, event.result);
  },
});

Replace PII with tokens before results reach the model, then restore real values when the model uses them in tool calls:

import {createSandbox, type Tool} from 'tool-sandbox';

// Bidirectional token store
const realToToken = new Map<string, string>();
const tokenToReal = new Map<string, string>();
let counter = 0;

function tokenize(value: string): string {
  if (!realToToken.has(value)) {
    const token = `[EMAIL_${++counter}]`;
    realToToken.set(value, token);
    tokenToReal.set(token, value);
  }
  return realToToken.get(value)!;
}

function redact(obj: unknown): unknown {
  if (Array.isArray(obj)) return obj.map(redact); // Preserve arrays (e.g. customer lists)
  if (typeof obj !== 'object' || obj === null) return obj;
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    if (key.toLowerCase().includes('email') && typeof value === 'string') {
      result[key] = tokenize(value);
    } else {
      result[key] = redact(value);
    }
  }
  return result;
}

function restore(obj: unknown): unknown {
  if (typeof obj === 'string') {
    return tokenToReal.get(obj) ?? obj; // Swap tokens back to real values
  }
  if (Array.isArray(obj)) return obj.map(restore);
  if (typeof obj !== 'object' || obj === null) return obj;
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    result[key] = restore(value);
  }
  return result;
}

const tools: Tool[] = [
  {
    name: 'getCustomers',
    description: 'Get customer list',
    inputSchema: {type: 'object'},
    handler: async () => [{name: 'Alice', email: '[email protected]'}],
  },
  {
    name: 'sendEmail',
    description: 'Send an email',
    inputSchema: {type: 'object', properties: {to: {type: 'string'}}},
    handler: async (args) => ({sent: true}),
  },
];

const sandbox = await createSandbox({
  tools,
  onBeforeToolCall(event) {
    // Restore real emails before calling tools
    event.args = restore(event.args);
  },
  onToolCallSuccess(event) {
    // Tokenize emails before returning to model
    event.result = redact(event.result);
  },
});

// Model sees: [{name: 'Alice', email: '[EMAIL_1]'}]
// Model calls: sendEmail({to: '[EMAIL_1]'})
// Actual call: sendEmail({to: '[email protected]'})

API Reference

createSandbox(options)

| Option | Description |
|--------|-------------|
| tools | Tool[] — Tools available in the sandbox |
| onBeforeToolCall | Called before each tool call |
| onToolCallSuccess | Called after a successful tool call |
| onToolCallError | Called after a failed tool call |
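
The error hook isn't demonstrated elsewhere in this readme. A rough sketch, assuming its event exposes toolName like the other hooks (check the package's types for the exact shape):

import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
  {
    name: 'getUser',
    description: 'Get user by ID',
    inputSchema: {type: 'object', properties: {id: {type: 'string'}}},
    handler: async () => ({id: '1', name: 'Alice'}),
  },
];

const sandbox = await createSandbox({
  tools,
  onToolCallError(event) {
    // Assumption: the event carries toolName like onBeforeToolCall/onToolCallSuccess;
    // how the error itself is surfaced isn't documented here
    console.error('Tool call failed:', event.toolName);
  },
});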

Sandbox

| Property/Method | Description |
|-----------------|-------------|
| execute | Tool object for code execution. Pass to LLM, call .handler({code}) |
| tools | Current tools (read-only) |
| store | Persistent store, shared with sandbox code |
| addTool(tool) | Add a tool at runtime |
| removeTool(name) | Remove a tool by name |
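
For instance, tools can be added and removed after creation. A sketch using the methods from the table above (the ping and getTime tools are made up for illustration):

import {createSandbox, type Tool} from 'tool-sandbox';

const tools: Tool[] = [
  {
    name: 'ping',
    description: 'Health check',
    inputSchema: {type: 'object'},
    handler: async () => ({ok: true}),
  },
];

const sandbox = await createSandbox({tools});

// Add a tool at runtime
sandbox.addTool({
  name: 'getTime',
  description: 'Get the current time',
  inputSchema: {type: 'object'},
  handler: async () => ({now: new Date().toISOString()}),
});

// Sandboxed code can call it like any other tool
await sandbox.execute.handler({code: `return await tool('getTime', {});`});

// Remove it by name when no longer needed
sandbox.removeTool('getTime');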

Tool

type Tool = {
	name: string;
	description?: string;
	inputSchema: {type: 'object'; properties?: Record<string, unknown>; required?: string[]};
	handler: (args: unknown) => Promise<unknown>;
};

Contributing

Pull requests are welcome on GitHub! To get started:

  1. Install Git and Node.js
  2. Clone the repository
  3. Install dependencies with npm install
  4. Run npm run test to run tests
  5. Build with npm run build

Releases

Versions follow the semantic versioning spec.

To release:

  1. Use npm version <major | minor | patch> to bump the version
  2. Run git push --follow-tags to push with tags
  3. Wait for GitHub Actions to publish the package to the npm registry