@dillion-ai/open-agent

v0.2.3

Published

3 days ago

Provider-agnostic agent SDK with sandboxed code execution

Downloads

1,305

0High
0Medium
0Low

enjidotin

agent ai sdk llm sandbox code-execution provider-agnostic tool-use open-agent dillion

open-agent

A thin, vendor-agnostic wrapper around the Vercel AI SDK that gives any LLM Claude-style coding agent capabilities — sandboxed bash, read, write, edit, glob, and grep — in a few lines of code.

No framework lock-in. No opinions on prompts or orchestration. Just plug in your model, point it at a Daytona sandbox, and let it work.

How it relates to the Claude Agent SDK

Anthropic's Claude Agent SDK gives you the full Claude Code experience — built-in tools, agent loop, context management, hooks, subagents, MCP, and sessions — but it's tied to Claude models via the Anthropic API.

open-agent takes the same core idea (give an LLM coding tools that run in a sandbox) and makes it work with any model through the Vercel AI SDK. It's a much thinner layer: no hooks, no subagents, no sessions — just tools, a sandbox, and a generate/stream loop. If you want the full-featured Claude agent experience, use the Claude Agent SDK. If you want the same tool pattern with any model and minimal abstraction, use this.

Install

npm install @dillion-ai/open-agent ai

ai is a peer dependency (>= 6.0.0).

Quick start

import { createOpenRouter } from "@openrouter/ai-sdk-provider";
import { createAgent, Image } from "@dillion-ai/open-agent";

const openrouter = createOpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY!,
});

const agent = createAgent({
  model: openrouter("minimax/minimax-m2.5:exacto"),
  instructions: "You are a coding assistant.",
  tools: ["bash", "read", "write", "edit", "glob", "grep"],
  sandbox: { apiKey: process.env.DAYTONA_API_KEY },
});

const result = await agent.generate({
  prompt: "Create a hello world TypeScript file and run it",
});

console.log(result.text);

await agent.destroy();

Swap the model string for any provider — OpenAI, Google, Mistral, open-source via OpenRouter — everything else stays the same.

What this gives you

This SDK is intentionally small. It provides the wiring between the AI SDK's ToolLoopAgent and a cloud sandbox, and nothing more:

Any model — anything the Vercel AI SDK supports works out of the box
Claude-style tools — the same bash, read, write, edit, glob, grep pattern, running inside a Daytona sandbox
Sandboxed execution — code runs in an isolated cloud environment, not on your machine
File I/O — upload files in, download results out
Streaming — first-class streaming support via the AI SDK
Skills — optional markdown instruction files for domain-specific workflows (e.g. Excel generation)

The default Daytona snapshots already include many common Python packages, including pandas and matplotlib, but not openpyxl. The python tool bootstraps openpyxl on first use when needed so spreadsheet workflows still work without a manual pip install. If you want openpyxl baked in before the sandbox starts, create the sandbox from a custom Daytona snapshot or image instead.

The SDK doesn't impose prompt templates, memory systems, RAG pipelines, or agent architectures. You bring those.

Tools

| Tool | Description | | ------ | ---------------------------------- | | bash | Execute shell commands | | read | Read file contents | | write| Write/create files | | edit | Make targeted edits to files | | glob | Find files by pattern | | grep | Search file contents with regex | | python | Execute Python code with openpyxl bootstrapped on first use when needed |

Skills

Skills are markdown files that get injected into the system prompt. They teach the agent domain-specific workflows without requiring code changes. Skills are opt-in: unless you pass skills, no skill prompt or use_skill tool is added. The SDK ships a built-in xlsx skill; you can add your own.

const agent = createAgent({
  model,
  tools: ["bash", "read", "write", "edit"],
  skills: {
    builtins: ["xlsx"],
    custom: ["./my-skills/"],
  },
  sandbox: { apiKey: process.env.DAYTONA_API_KEY },
});

File upload & download

await agent.uploadFiles([
  { path: "/home/daytona/data.csv", content: csvString },
]);

await agent.generate({ prompt: "Analyze data.csv and save results to output.xlsx" });

const file = await agent.downloadFile("/home/daytona/output.xlsx");
fs.writeFileSync("output.xlsx", file);

Streaming

const stream = await agent.stream({
  prompt: "Write a Python fibonacci generator and test it",
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}

Configuration

createAgent({
  model,                    // Any Vercel AI SDK LanguageModel
  instructions: "",         // System prompt — yours to define
  tools: [],                // Which built-in tools to enable
  sandbox: {                // Daytona sandbox config
    apiKey: "",
    apiUrl: "",             //   Custom API URL (optional)
    target: "",             //   Target region (optional)
    language: "typescript", //   Sandbox language/runtime (optional)
    snapshot: "python-tools", // Named Daytona snapshot to use (optional)
    image: Image.debianSlim("3.12").pipInstall(["pandas", "openpyxl", "matplotlib"]), // Public/custom image (optional)
    onSnapshotCreateLogs: console.log, // Image build logs callback (optional)
    instance: sandbox,      //   Reuse an existing Sandbox (optional)
  },
  skills: {                 // Optional
    builtins: false,        //   true | false | string[]
    custom: [],             //   Paths to .md files or directories
  },
  maxSteps: 30,             // Max tool-use loop iterations
  cwd: "/home/daytona",    // Working directory in the sandbox
});

Use a prebuilt Daytona snapshot when you already have one registered:

const agent = createAgent({
  model,
  tools: ["python"],
  sandbox: {
    apiKey: process.env.DAYTONA_API_KEY,
    snapshot: "my-awesome-snapshot",
    language: "python",
  },
});

Use a public image or declarative Daytona image when you want packages baked in up front:

const agent = createAgent({
  model,
  tools: ["python"],
  sandbox: {
    apiKey: process.env.DAYTONA_API_KEY,
    language: "python",
    image: Image.debianSlim("3.12").pipInstall(["pandas", "openpyxl"]),
    onSnapshotCreateLogs: console.log,
  },
});

Examples

See examples/:

basic.ts — minimal usage
streaming.ts — real-time streaming
xlsx-report.ts — Excel financial report from CSV
chart-generation.ts — matplotlib charts to PNG
upload-and-analyze.ts — upload, analyze, download spreadsheets
custom-skills.ts — custom skill files

Benchmarking models

The repository includes a benchmark harness for comparing multiple models on harder financial tasks:

chart creation from a complex Excel workbook via matplotlib
chart creation from the same workbook via Recharts
multi-step Excel financial modeling with scenario analysis, debt logic, and covenant checks

Each run:

executes every task against every model in BENCHMARK_MODELS
stores per-run, per-task metrics in a SQLite database
downloads generated artifacts to a local output directory
validates chart data and financial outputs against deterministic expected values

The benchmark lives at benchmarks/financial-benchmark.ts.

Build the package first so the benchmark can import dist/:

npm run build
BENCHMARK_MODELS="provider/model-a,provider/model-b" \
OPENROUTER_API_KEY=... \
DAYTONA_API_KEY=... \
node benchmarks/financial-benchmark.ts

Optional environment variables:

BENCHMARK_RUN_DIR — root directory for a single run (default: ./benchmarks/results/<timestamp>)
BENCHMARK_DB_PATH — path to the SQLite database file (default: <run dir>/benchmark-results.sqlite)
BENCHMARK_ARTIFACT_DIR — directory for downloaded artifacts (default: <run dir>/artifacts)

License

MIT